16 Jan

16 bits COM Oddity

I can’t even pinpoint what a “16 bits COM Oddity” really means, but I think the idea is in there somewhere. Previously, I explained how to code a simple “hello, world” program using the DEBUG tool that shipped with DOS. Revisiting this obsolete knowledge was unexpectedly fun. Let’s retrieve the hexadecimal version of “hello, world” (well, “hello, world!!”) from that post:

EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21

That’s all we need for our “hello, world!!” binary: exactly 32 bytes. We could create that file bit by bit, but that would be excessive, I think. Let’s use the echo command instead. This is the full command I entered at my Windows 10 cmd.exe prompt:

echo|set /p="Ù‼♪◙hello, world!!♪◙$┤○║☻☺═!1└═!">hello.com

After that you’ll have a 16-bit COM file, hello.com, that displays the “hello, world!!” message. Funny 🙂
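For comparison, the same 32 bytes can also be written from a script, which sidesteps the code page entirely. This is only a minimal sketch in Python, not part of the original recipe; the names HEX_DUMP, build_com and write_com are made up for illustration.

```python
from pathlib import Path

# Hex listing copied verbatim from the post (32 bytes).
HEX_DUMP = """
EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21
"""


def build_com(hex_dump: str) -> bytes:
    """Turn a whitespace-separated hex listing into raw bytes."""
    # Strip all whitespace first so the result does not depend on how
    # bytes.fromhex treats newlines in older Python versions.
    return bytes.fromhex("".join(hex_dump.split()))


def write_com(path: str = "hello.com") -> int:
    """Write the program to disk and return the number of bytes written."""
    return Path(path).write_bytes(build_com(HEX_DUMP))


if __name__ == "__main__":
    print(write_com(), "bytes written")
```

Keep in mind that this writes the listing exactly as shown. The last four characters of the echo command, 1└═!, are 31 C0 CD 21 in code page 850 rather than B4 00 CD 21, so the two files may differ in those two bytes.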

What are those weird characters?

First, a little explanation. We want our hello.com file to be, byte for byte, an exact representation of the hexadecimal sequence presented above. We’ll use cmd.exe commands to dump characters into the file and, if we choose those characters carefully so that they match the target hexadecimal values, we’ll end up with exactly the file we’re looking for. For instance, the first two-byte block, EB 13, is the “jmp 115” instruction. Then comes the newline (0D 0A), and so on. If we convert our hexadecimal values to decimal, we get:

235  19  13  10 104 101 108 108 111  44  32 119 111 114 108 100
 33  33  13  10  36 180   9 186   2   1 205  33 180   0 205  33
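The conversion is easy to double-check with a few lines of Python. The sketch below also verifies the “jmp 115” reading of EB 13, assuming the usual COM convention that the program is loaded at offset 100h and that a short jump is relative to the end of its own two bytes. The function names are hypothetical, chosen only for this example.

```python
HEX_DUMP = (
    "EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 "
    "21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21"
)


def hex_to_decimal(hex_dump: str) -> list[int]:
    """Convert a space-separated hex listing to a list of decimal values."""
    return [int(token, 16) for token in hex_dump.split()]


def short_jump_target(load_offset: int, displacement: int) -> int:
    """Target of a two-byte short jmp located at load_offset."""
    # The displacement is a signed 8-bit value counted from the byte
    # that follows the two-byte instruction.
    if displacement >= 0x80:
        displacement -= 0x100
    return load_offset + 2 + displacement


values = hex_to_decimal(HEX_DUMP)
print(" ".join(f"{v:3d}" for v in values[:16]))
print(" ".join(f"{v:3d}" for v in values[16:]))
print(f"jmp target: {short_jump_target(0x100, values[1]):X}")  # prints "jmp target: 115"
```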

The first byte in hello.com must be EB, or 235 in decimal. To dump our characters from the command line, we need to convert that decimal value to a character. I’m trying this on a Windows 10 (64-bit) machine, with cmd.exe using code page 850 (Multilingual Latin 1). In that code page, character 235 is Ù, and 19 is ‼. Luckily, 13 is ♪ and 10 is ◙. Those two characters are especially important because they represent the carriage return and the line feed, respectively, and some shells won’t convert them to characters. Happily, cmd.exe with my default code page handles them as we need. To input those characters, you can type the usual ALT + decimal value combination.
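To see which character goes with which byte, the lookup can be sketched in Python. Its built-in cp850 codec covers the upper half of the table (Ù, ┤, ║, ═), but it decodes values below 32 as plain control characters, so the sketch adds a small hand-written table for the glyphs the console draws instead. Both CONTROL_GLYPHS and cp850_glyph are names invented for this example, and the table is deliberately partial: it lists only the control codes that appear in our program.

```python
# Glyphs the console shows for the control codes used in hello.com.
# Partial table, written by hand; it is an assumption of this sketch,
# not something the cp850 codec provides.
CONTROL_GLYPHS = {1: "☺", 2: "☻", 9: "○", 10: "◙", 13: "♪", 19: "‼"}


def cp850_glyph(value: int) -> str:
    """Return the code page 850 character displayed for a byte value."""
    if value in CONTROL_GLYPHS:
        return CONTROL_GLYPHS[value]
    # Everything else is delegated to Python's cp850 codec.
    return bytes([value]).decode("cp850")


first_four = [235, 19, 13, 10]
print("".join(cp850_glyph(v) for v in first_four))  # prints "Ù‼♪◙"
```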

There are a few important things to notice:
