In response to an exciting challenge on virtuallyfun.
I started by writing a tool to extract files from the filesystem images (both diskettes and HDD) for easier access on my regular computer. This was posted on the challenge page and a download link is available in the files section at the end of the story (along with all the other related files and tools). It should work on most Linux/Unix systems. I plan to add write support to it later and will update it once that's done. While testing this, I noticed that QNX uses yet another line-termination convention (see note 1).
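Text files extracted from the images can be converted for use on a Unix host by swapping the terminator byte. This is a minimal sketch and not part of the extraction tool described above; the function names are made up for illustration, and the only fact taken from the story is that QNX ends lines with 0x1e (note 1).

```python
QNX_NEWLINE = b"\x1e"   # ASCII RS, the QNX line terminator (see note 1)
UNIX_NEWLINE = b"\n"


def qnx_to_unix(data: bytes) -> bytes:
    """Replace QNX record separators with Unix line feeds."""
    return data.replace(QNX_NEWLINE, UNIX_NEWLINE)


def unix_to_qnx(data: bytes) -> bytes:
    """Replace Unix line feeds with QNX record separators."""
    return data.replace(UNIX_NEWLINE, QNX_NEWLINE)
```

This only makes sense for text files; running it over a binary would corrupt any byte that happens to be 0x1e or 0x0a.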
Using dd(1) I extracted the boot sector and disassembled it using objdump(1):
$ dd if=qnx12_boot_patched.img of=boot.bin count=1
$ objdump -D -Mintel,i8086 -b binary -m i386 boot.bin --start-address 0x80 >boot.dis
I first disassembled the entire boot sector, but since it starts with a jump to address 0x80 (to leave room for the QNX filesystem superblock) and doesn't seem to use anything below 0x80, I reran the command with a start address in order to obtain a cleaner file. With this file in hand, I started the first of many analyses. The loader actions I identified were:
$ dd if=qnx12_boot_patched.img skip=522 of=kernel.bin
yields the entire kernel image. I noticed that this begins with some kind of header:
00000000  70 00 01 00 00 40 74 05  91 1c 6d 00 00 00 00 00  |p....@t...m.....|
00000010  00 00 00 00 74 61 73 6b  00 50 43 00 00 00 00 00  |....task.PC.....|
00000020  e4 05 01 00 00 40 97 05  97 15 6d 01 00 00 00 00  |.....@....m.....|
00000030  00 00 00 00 66 73 79 73  00 00 00 00 00 00 00 00  |....fsys........|
00000040  7b 0b 01 00 00 40 5f 06  fe 29 eb 03 00 00 00 00  |{....@_..)......|
00000050  00 00 00 00 64 65 76 00  00 00 00 00 00 00 00 00  |....dev.........|
00000060  da 11 01 00 00 40 3d 00  01 02 09 00 00 00 00 00  |.....@=.........|
00000070  00 00 00 00 69 64 6c 65  00 00 00 00 00 00 00 00  |....idle........|
00000080  17 12 01 00 00 40 a3 01  d5 01 d5 01 00 00 00 00  |.....@..........|
00000090  00 00 00 00 73 68 61 72  65 64 00 00 00 00 00 00  |....shared......|
The only important info so far is the entry point segment (0x0070). In order to have proper addresses in the disassembly, I first made another copy of the kernel, skipping the first 256 bytes, and ran objdump on that copy (with a start address of 0x80). The code there makes sense as a kernel start: it disables interrupts both in the CPU and in the PIC (Programmable Interrupt Controller), then sets the Interrupt Vector Table (IVT) addresses. The vector address for int 0x72 seems special - it is based on some data from the kernel header - later I realized it's the segment of the "shared" section. The rest was quite difficult to comprehend at the time.
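The header records in the hex dump above can be decoded with a few lines of Python. This is a sketch under stated assumptions: the 32-byte record size and the leading 16-bit segment come from note 3, while the name field at offset 0x14 is simply read off the dump. The meaning of the remaining fields is unknown, so they are ignored. The function names are hypothetical.

```python
import struct

RECORD_SIZE = 32     # five 32-byte records (see note 3)
NAME_OFFSET = 0x14   # assumption: position of the name, read off the hex dump

# The 160 header bytes exactly as they appear in the hex dump.
HEADER = bytes.fromhex(
    "70 00 01 00 00 40 74 05 91 1c 6d 00 00 00 00 00"
    "00 00 00 00 74 61 73 6b 00 50 43 00 00 00 00 00"
    "e4 05 01 00 00 40 97 05 97 15 6d 01 00 00 00 00"
    "00 00 00 00 66 73 79 73 00 00 00 00 00 00 00 00"
    "7b 0b 01 00 00 40 5f 06 fe 29 eb 03 00 00 00 00"
    "00 00 00 00 64 65 76 00 00 00 00 00 00 00 00 00"
    "da 11 01 00 00 40 3d 00 01 02 09 00 00 00 00 00"
    "00 00 00 00 69 64 6c 65 00 00 00 00 00 00 00 00"
    "17 12 01 00 00 40 a3 01 d5 01 d5 01 00 00 00 00"
    "00 00 00 00 73 68 61 72 65 64 00 00 00 00 00 00"
)


def parse_kernel_header(image: bytes, count: int = 5):
    """Return a list of (name, code_segment) pairs, one per header record."""
    sections = []
    for i in range(count):
        rec = image[i * RECORD_SIZE:(i + 1) * RECORD_SIZE]
        (segment,) = struct.unpack_from("<H", rec, 0)   # little-endian word
        name = rec[NAME_OFFSET:].split(b"\x00", 1)[0].decode("ascii")
        sections.append((name, segment))
    return sections


def linear(segment: int, offset: int = 0) -> int:
    """Real-mode segment:offset to linear address."""
    return (segment << 4) + offset


if __name__ == "__main__":
    for name, seg in parse_kernel_header(HEADER):
        print(f"{name:8s} segment {seg:04x}  linear {linear(seg):05x}")
```

With this decoding the "shared" record yields segment 0x1217, which is the value to compare against the int 0x72 vector set up by the kernel start code.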
After that first look at the kernel, I thought the most promising approach would be to see what mount does; at least that would give me an idea of where to look (disassembling the entire kernel didn't seem like a good idea at that point).
To find out more about QNX executables (a mandatory prerequisite) I started by compiling a test program (a good thing the OS includes a C compiler as well as a suitable editor), keeping the intermediate assembly file and also using the "generate map" linker option. Looking at the resulting files gave me a rough idea about QNX executables. As an aside, the resulting map file does not use file offsets but segment offsets. It was rather obvious that the code is in segment 1, and the prologue and epilogue routines are easily identified, which helped me locate the actual code despite not knowing anything about the file format.
At this level, QNX uses interrupts as a syscall mechanism. Later I realized that many low-level functions are based on message passing between tasks. The first issue was finding out what each interrupt call does. By correlating the function calls I used, the map file produced by the linker and the disassembly of the actual executable, I was able to determine what interrupt service each call used (see note 2). E.g. int 72,1 is fopen, int 72,b is fput and so on.
While helpful, the simple executables I compiled looked rather different from mount. The most obvious difference was that my test programs have the code section near the beginning (with a call to main always at offset 0x27 into the file), while mount has lots of data at the beginning and the code section further into the file. Looking for the familiar prologue/epilogue routines helped with finding the actual code in mount as well. But when I started analyzing it, at some point I hit garbage (a jump into nonsense-code).
Around this time I had the idea to see what differences there are between the "cold" (on-disk) kernel, the running kernel and the running kernel with the HDD driver loaded. Since I don't know how to dump memory from either pcem or 86box, my solution was to write (under QNX) another tool for dumping memory to a file. When all you have is a hammer...
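Once two such dumps are on the host, comparing them comes down to listing the byte ranges that differ. The following is a minimal sketch of that comparison, not the tool used here; the function name is hypothetical.

```python
def diff_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) offset ranges where a and b differ.

    Only the common prefix length is compared; a trailing size difference
    between the two dumps is not reported.
    """
    ranges = []
    start = None
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new run of differences begins
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))  # the run extends to the end
    return ranges
```

Printing the ranges in hexadecimal, for example with `f"{start:04x}-{end:04x}"`, makes them easy to match against offsets in a disassembly listing.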
There were many differences, but a few were telling (offsets are hexadecimal, relative to the start of the kernel):
The process of using some system functions in a program, then disassembling it to see what interrupts they used, was rather laborious, and I was hoping to find a reference to all syscalls provided by QNX via interrupt services. I asked about this on the challenge page, but to no avail. However, while checking the FTP archive for any clues, I found something else I had been looking for: the qnx_load technote, detailing the executable file format. This proved very helpful indeed! It allowed me to write (yet another) small tool to extract the code segment and data segment of executable files, just as they would have been loaded into memory. When I first hit that nonsense-code gap in mount, I had thought of using the QNX debug utility as a last resort, to at least bridge the gaps where the nonsense-code was. Fortunately this desperate approach was no longer needed.
Despite not being a direct answer to my question, Mitchell Schoenbrun's advice proved informative and helped me understand the underlying system philosophy. One of his remarks also saved me some time at a later point, when disassembling fopen (the fact about device names beginning with "$").
Having the correct contents of both code and data segments for mount, I went back to disassembly. Mount does more than simple mounting of disks, but eventually the following sequence emerged (excerpt from notes taken during analysis):
Following the messages sent by mount, I saw that DEFINE_DRIVER fills what appears to be a driver_entry struct (this can be done beforehand in the kernel image).
SET_ATTR is a little bit longer:
Now everything was ready for patching. Most structures can simply be copied into the kernel binary, but a call to disk_init is needed to initialize the hardware. The copy-protection routine takes 77 bytes in fsys and is positioned at just the right spot - after the floppy driver initialization but before any read takes place. I wrote a small program (entry.asm) to take care of that, and patched it in the place of the protection routine. The program was assembled using fasm and I patched the kernel using dd, then wrote the new kernel to the boot diskette (see qnx_dis.tgz at the end of the story).
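The patching step, overwriting a fixed-size routine with a shorter replacement, can be sketched as follows. This is a Python stand-in for the dd-based patching described above, not the actual procedure. The 77-byte slot size comes from the text; the offset passed by a caller would be hypothetical, and padding the leftover bytes with NOP (0x90) is an assumption that only suits code which falls through to the end of the slot.

```python
PROTECTION_SLOT_SIZE = 77   # size of the copy-protection routine, per the text
NOP = 0x90                  # assumption: pad the unused tail with x86 NOPs


def patch_in_place(image: bytearray, offset: int, payload: bytes,
                   slot_size: int = PROTECTION_SLOT_SIZE,
                   fill: int = NOP) -> None:
    """Overwrite slot_size bytes at offset with payload, padded with fill.

    The image length never changes, so everything after the slot keeps
    its original offset.
    """
    if len(payload) > slot_size:
        raise ValueError(f"payload is {len(payload)} bytes, slot is {slot_size}")
    if offset < 0 or offset + slot_size > len(image):
        raise ValueError("slot lies outside the image")
    padding = bytes([fill]) * (slot_size - len(payload))
    image[offset:offset + slot_size] = payload + padding
```

Refusing an oversized payload is the important part: spilling past the slot would silently overwrite whatever code follows the protection routine in fsys.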
Booting from this diskette made the hard disk available from the start, without needing to run mount. It was still booting from diskette, though, so the next step was writing a small boot loader. To keep things simple, I decided to use the same approach as QNX used for the boot floppy: keep the kernel outside the filesystem. I created a slightly smaller partition (reserving cylinders above 300 for the kernel) and copied the QNX files to it. I made the partition active (the BIOS might not consider the HDD bootable without an active partition) and I also copied the kernel, starting at absolute sector 20468 (CHS 301,0,1) - just after this partition.
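The relation between the CHS address and the absolute sector number can be checked with the usual conversion formula. The disk geometry is not stated in the text; 4 heads and 17 sectors per track is an assumption, chosen because it is a common geometry for XT-era drives and it reproduces the quoted sector number.

```python
HEADS = 4               # assumption: not stated in the text
SECTORS_PER_TRACK = 17  # assumption: not stated in the text


def chs_to_lba(cylinder: int, head: int, sector: int,
               heads: int = HEADS, spt: int = SECTORS_PER_TRACK) -> int:
    """Convert a CHS address (sector numbers start at 1) to an absolute sector."""
    if not 1 <= sector <= spt:
        raise ValueError("sector out of range")
    if not 0 <= head < heads:
        raise ValueError("head out of range")
    return (cylinder * heads + head) * spt + (sector - 1)


if __name__ == "__main__":
    print(chs_to_lba(301, 0, 1))   # 301 * 4 * 17 = 20468
```

A boot loader that reads through BIOS int 13h has to go the other way and split an absolute sector back into cylinder, head and sector, which is the same arithmetic using division and remainder.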
Writing the boot loader should have been a relatively simple matter, had I remembered that the 8088 lacks certain real-mode instructions that are taken for granted now. Eventually I got my loader (and QNX) to run on an emulated 286-type computer (with AMI BIOS), and printed debug messages from the loader to check where it was going wrong. Only after doing that did I remember forty's remark about the debugger in PCjs and realize I could use it to debug my loader. The problem was that the 8088 doesn't support shift instructions with an immediate count (shifting the register contents a given number of times): either the count must be 1, or it has to be loaded into CL. After a few changes my boot loader was working on the 8088 XT as well.
This mostly solved the problem, but the kernel would still try to load /cmds/sh from floppy. It was happy with a diskette in drive B (thus clearly booting from HDD), but this wasn't enough. Feeling rather pressed for time, I then decided to use a less elegant hack: since the TA_CREATE message was pre-filled in the kernel, why not change it? Namely, use "3:/cmds/sh" in order to force the use of the HDD? This seemed possible, except that everything needed to fit the existing size (right after this message is the jump table for various int 70 functions). I created a different directory on the HDD ("/xi") and copied sh to it. I also created a small script file that sets SEARCH to 3, changes directory to 3:/ and loads the regular /config/sys.init script. This proved enough to have QNX 1.2 boot from hard disk without any need to access the floppy. I posted this version of the HDD image to the challenge page.
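The size constraint explains the short directory name. Assuming the pre-filled message holds the path as the literal string "/cmds/sh" (8 bytes), "3:/cmds/sh" is two bytes too long, while "3:/xi/sh" is exactly 8 bytes. Both the stored form of the string and the exact replacement path are inferences from the text, not confirmed details. A sketch of such a same-size replacement, with a hypothetical function name:

```python
def replace_same_length(image: bytes, old: bytes, new: bytes) -> bytes:
    """Replace the single occurrence of old with new without changing size.

    A shorter replacement is padded with NUL bytes; a longer one is rejected,
    since it would overwrite whatever follows the string.
    """
    if len(new) > len(old):
        raise ValueError(f"{new!r} does not fit in {len(old)} bytes")
    idx = image.find(old)
    if idx < 0:
        raise ValueError("original string not found")
    if image.find(old, idx + 1) >= 0:
        raise ValueError("original string is ambiguous (found more than once)")
    padded = new.ljust(len(old), b"\x00")
    return image[:idx] + padded + image[idx + len(old):]
```

Rejecting the longer string rather than truncating it mirrors the situation in the kernel, where the bytes right after the message belong to a jump table.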
This was followed by Dan's request for a PCjs machine image, which I posted on my site, as well as a flurry of small updates for things I forgot or that weren't properly copied; the most embarrassing being that I forgot the variable at 0x464f in fsys (the one that holds the maximum devno and that was causing some issues with mount and chkfsys). Upon running chkfsys, another slip became visible: I had forgotten to update the partition size in the device entry structure.
Besides the small slips, one issue that was bothering me was that ugly "/xi" hack. Having traced most of what was happening in task up to fopen, I decided to disassemble fopen as well. Int 72 points to offset 0 in shared, and shared is not a task per se. I think it would be fair to say it's a sort of libc for QNX. Naming aside, the code for int 72 uses a jump table at offset 0x5a (in shared), and following that I started to trace what fopen does. This proved to be quite a complex function, but eventually I got to a point where, if the path argument starts with a / character, it sends a message to fsys: GET_SEARCH_ORDER. Of course! Back in fsys, I followed the message path through two more jump tables (already identified from checking mount messages), and finally found the search list at DS:4650. One more change to entry.asm (my procedure that runs instead of the copy protection) and /cmds/sh is properly loaded without issue.
Just to be on the safe side I added a "cd 3:/" and "search 3" to /config/sys.init. The search command is needed because floppy drive 2: is automatically added to search (I assume during floppy driver initialization). I think having only hdd is better, otherwise any command that is not found would cause a floppy access (and with no floppy inserted this results in an annoying delay).
Thanks to Tenox and Dan Dodge for the challenge!
1 QNX uses 0x1e - ASCII character RS (Record Separator) - as newline character
2 when referring to an interrupt service, I mean the interrupt number as well as a function number (usually) loaded in the AX register before the interrupt call. Also, as with the segment:offset notation of real-mode x86, hexadecimal notation is implied, i.e. int 72 really means int 0x72. Somewhat unusual (different from the typical BIOS / DOS interfaces) is the use of the stack, instead of registers, for syscall arguments. I assume this approach was chosen as it allows a larger part of the OS to be written in C (or another high-level language), instead of ASM.
3 the kernel "header" consists of 5 32-byte records that all seem to start with the code segment for the corresponding entry. This is suggested by the first jump from boot loader (at 0070:0080) followed by checking the output of "task +code" against this assumption. Knowing this (and knowing, from boot sector analysis, that the entire kernel is loaded in memory as one big block) made it possible to separate the kernel into "task", "fsys", "dev", "idle" and "shared" sections.
4 this means filling that entire portion with 0xc3 (ret), I assume in order to deter casual reverse engineering.