“Hello world” with DEBUG
Coding “Hello world” with DEBUG is a blunt exercise in programming futility. Or an exercise in retro, old-school coding. More than two decades ago I used to code in x86 (Intel) assembly almost daily. I remember the masochistic approach to learning the opcodes and the hardware architecture. The famous RBIL (Ralf Brown’s Interrupt List) was, back then, my favorite “reference”. The first painful steps were taken, and the first crashes happily followed. I remember trying to code, as expected, the traditional “hello, world!” using a strange tool included in DOS: DEBUG.COM.
I wrote a post about this “hello, world” with DEBUG.COM elsewhere, and yesterday I found the time to reread it. I verified, first with awe, then with horror, and finally with relief, that I had almost completely forgotten how to code in assembly. So I’ll revisit it here, mostly as a self-imposed disciplinary measure. Heck, DEBUG isn’t even available on the Windows 10 machine I’m typing this on. However, DEBUG looked pretty cool back then: it could assemble, disassemble and dump memory in hexadecimal. You could create little programs, inspect existing ones and peek at memory areas.
Specifically, what I want is to build a minimal “hello, world!” program using DEBUG.COM. I don’t have any use for this, but it comes as a “relaxing” post after several weeks focused on the release of “DragonScales 3: Eternal Prophecy of Darkness” on Steam and the localization of “DragonScales 5: The Frozen Tomb”.
After we execute DEBUG.COM we’ll be greeted by a “-” prompt. Now we can input our commands. I want to assemble, i.e., I want to type assembly language instructions. The command for that is “a”, which may optionally be followed by a memory address. By default, instructions are placed starting at CS:0100, and that is the address I want, so a plain “a” is enough. Equivalently, I could type “a 0100” or “a 100” to achieve the same result.
Now we have to place the data in memory. For this little program I only need the characters for “hello, world!!”. Notice I want two exclamation marks at the end. That’s because I want the final program to occupy exactly 32 bytes; we’ll see the reason for this later on. I’ll use the pseudo-instruction “DB” to define our string. With DB I can neatly provide the string as quoted ASCII text, like this:
db "hello, world!!"
Those are 14 bytes. However, I want a prettier output, with a newline character before and after our string. A newline is in fact two characters: a carriage return (CR is ASCII 13) and a line feed (LF is ASCII 10). In hexadecimal, CR is 0Dh, and LF is 0Ah. OK. Now our DB would be modified to look like this:
db 0d,0a,"hello, world!!",0d,0a
Those are 18 bytes. We are not done with our data yet. To actually print the message to the standard output I’ll resort to function 09h of INT 21h. Check RBIL D-2109. In short, I have to place the value 09h in register AH, and DS:DX should point to the beginning of our string. The function will print every character until it finds a “$” character (i.e., “$” plays the role of the terminating zero in null-terminated C strings). The ASCII value of “$” is 36, or 24 in hexadecimal. Therefore, we modify our DB instruction again:
db 0d,0a,"hello, world!!",0d,0a,"$"
Our final string comprises 19 bytes. Good. Notice that the full hexadecimal representation of this string is:
0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 21 21 0D 0A 24
We begin with the newline (0D 0A) and end with the “$” (24).
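As a quick sanity check on the byte count, here is a minimal Python sketch (not part of the original DEBUG session; the variable names are mine) that builds the same byte sequence as the DB line and prints its length and hexadecimal form. It also shows what function 09h would output, assuming it stops right before the first “$”.

```python
# Build the string exactly as the DB line does: CR LF, the text, CR LF, "$".
message = b"\r\n" + b"hello, world!!" + b"\r\n" + b"$"

# Number of bytes DB will place in memory.
length = len(message)

# Space-separated uppercase hex, in the style of a DEBUG dump.
hex_dump = message.hex(" ").upper()

# Function 09h prints everything up to, but not including, the "$".
printed = message[:message.index(b"$")]

print(length)
print(hex_dump)
print(repr(printed))
```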
Now, we want our program execution to begin at CS:0100. We want executable instructions there, so we can’t place the data at that address. What we will place at CS:0100 is a jump instruction that skips the data (the “hello, world!!” string) and lands on our actual code. The data will come immediately after the jump. Our jump instruction takes an operand (the destination), which makes it a 2-byte instruction. Therefore, our data will be located at CS:0102. And, as our data takes 19 bytes (13 in hexadecimal), we have to skip those 19 bytes, i.e., jump to CS:0115. And that’s how our program begins:
jmp 115
db 0d,0a,"hello, world!!",0d,0a,"$"
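The address arithmetic above is easy to get wrong in hexadecimal, so here is a small Python sketch of it. The constant names are mine, and it assumes the assembler picks the 2-byte short form of the jump (opcode EB followed by a one-byte displacement), where the displacement is counted from the address of the next instruction.

```python
ORIGIN = 0x100    # offset where the program starts (CS:0100)
JMP_SIZE = 2      # short jump: opcode EB plus a one-byte displacement
DATA_SIZE = 19    # length of the "$"-terminated string

data_start = ORIGIN + JMP_SIZE        # the string follows the jump
code_start = data_start + DATA_SIZE   # the code follows the string

# A short jump stores the distance from the NEXT instruction to the target.
displacement = code_start - (ORIGIN + JMP_SIZE)

print(f"{data_start:04X} {code_start:04X} {displacement:02X}")
```

The displacement comes out equal to the length of the string, which makes sense: the jump only has to hop over the data.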
With these characters in memory we can set up our call to function 09h of INT 21h. First, place the value 09h in AH:
mov ah, 9
And then, as mentioned above, place the address of our string in DX. The string will be located at address 102, so:
mov dx, 102
And, at last, call the interrupt:
int 21
To terminate the program, use function 0 of INT 21h (check RBIL D-2100):
mov ah, 0
int 21
Finally, our little program is as follows:
-a
CS:0100 jmp 115 ; Skip 19 bytes of the string
CS:0102 db 0d,0a,"hello, world!!",0d,0a,"$" ; "$" is the end of the string
CS:0115 mov ah, 9 ; Print function of INT 21h
CS:0117 mov dx, 102 ; Our string begins at 102
CS:011A int 21
CS:011C mov ah, 0 ; Terminate our program
CS:011E int 21
CS:0120
At 0120 we enter nothing, just press the ENTER or RETURN key. The Assemble command stops after an empty line. Done. Now we can input the Go command followed by the address we want our execution to start at:
-g=0100
You should see “hello, world!!” with newlines above and below. Then we can generate an executable file. First use the Name command to tell DEBUG what we want our program to be called:
-n hello.com
Then we will use the Write command to produce the binary file hello.com. However, the Write command requires register CX to contain the number of bytes to be written to disk. Just subtract the first byte’s location from the last one’s and add one (011F - 0100 + 1), or equivalently, 0120 - 0100 = 20 in hexadecimal, i.e., 32 bytes. So use the Register command to set the proper value in CX:
-rcx
CX 0000
:20
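For anyone who prefers not to subtract in hexadecimal by hand, the same computation can be sketched in Python. The variable names are mine; the addresses are the ones from the assembly listing.

```python
first = 0x0100      # first byte of the program (the jump)
last = 0x011F       # last byte of the program (end of the final int 21)
next_free = 0x0120  # address DEBUG offered after the last instruction

size = last - first + 1          # inclusive range, hence the "+ 1"
same_size = next_free - first    # the equivalent shortcut

print(f"{size:X}h = {size} bytes")
```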
And then write with:
-w
Done. You’ll now have a hello.com file. It should work perfectly. Before quitting DEBUG (Quit command, just “q”), let’s use the Dump command to see the content of our program in hexadecimal:
-d 100 11f
Alternatively, just open hello.com with some hexadecimal viewer. We’ll get this:
CS:0100 EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
CS:0110 21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21
Our program comprises exactly 32 bytes (remember the “20” we wrote to CX?), ending with that last “21” byte.
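To see where each of those 32 bytes comes from, here is a hedged Python sketch that assembles the program “by hand” from the opcodes visible in the dump (EB for the short jump, B4 for mov ah, BA for mov dx, CD for int). The helper names build_image and write_com are hypothetical, introduced only for this illustration.

```python
from pathlib import Path

MESSAGE = b"\r\nhello, world!!\r\n$"

def build_image() -> bytes:
    """Return the bytes of the program, instruction by instruction."""
    return b"".join([
        bytes([0xEB, len(MESSAGE)]),                 # jmp 115 (skip the string)
        MESSAGE,                                     # db 0d,0a,"hello, world!!",0d,0a,"$"
        bytes([0xB4, 0x09]),                         # mov ah, 9
        bytes([0xBA]) + (0x0102).to_bytes(2, "little"),  # mov dx, 102
        bytes([0xCD, 0x21]),                         # int 21
        bytes([0xB4, 0x00]),                         # mov ah, 0
        bytes([0xCD, 0x21]),                         # int 21
    ])

def write_com(path: str) -> None:
    """Write the image to disk, much like DEBUG's Write command does."""
    Path(path).write_bytes(build_image())

image = build_image()
print(len(image))
print(image[:16].hex(" ").upper())
print(image[16:].hex(" ").upper())
```

Calling write_com("hello.com") would produce the file itself. Note how the address 0102 is stored low byte first (02 01), as usual on x86.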
We can go a step further. DEBUG has an Enter command which allows us to input data or code directly into specific memory locations. Let’s fire up DEBUG again, and then input our hello world program directly like this:
-e 100 EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
-e 110 21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21
-g=0100
There is a nice project, copy/v86, which emulates several operating systems online, including DOS with DEBUG and even Vim! Boot DOS and try all the previous commands. For the sake of exaggeration, open Vim and type this:
e 100 EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
e 110 21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21
g=0100
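Save the file as hello.hex. Notice we’re not adding the “-” prompt. If DEBUG seems to hang once the script ends, add a final line with just “q” (the Quit command). Now, in your DOS command line: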
debug < hello.hex
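If typing hexadecimal into Vim feels too civilized, the script can also be generated. The following Python sketch turns a byte string into Enter commands of 16 bytes each, followed by the Go command. The function name debug_script and its parameters are my own invention, and the optional trailing “q” is an assumption about what you may want, not something the script above contains.

```python
def debug_script(image: bytes, origin: int = 0x100,
                 per_line: int = 16, append_quit: bool = False) -> str:
    """Render a byte string as DEBUG Enter commands plus a Go command."""
    lines = []
    for offset in range(0, len(image), per_line):
        chunk = image[offset:offset + per_line]
        hex_bytes = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"e {origin + offset:X} {hex_bytes}")
    lines.append(f"g={origin:04X}")
    if append_quit:
        lines.append("q")  # optional: DEBUG's Quit command
    # DOS text files use CR LF line endings.
    return "\r\n".join(lines) + "\r\n"

image = bytes.fromhex(
    "EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 "
    "21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21"
)
script = debug_script(image)
print(script, end="")
```

Writing the returned text to hello.hex (in binary mode, to keep the CR LF endings) should give a file equivalent to the one typed by hand.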