
Before someone marks this question as a duplicate, let me make clear that this is a debugging problem rather than a logic problem. As far as I know the logic is correct: if I print the value in the bx register after each operation, I get the correct output. The problem is that the stores through bx should change the memory it points to, and that is not happening.


I have been learning assembly language with NASM. I am following a PDF document that asks you to print a hexadecimal number (convert the number to a hex string and then print it).

I've written the code, but it doesn't print the correct hex number. On the other hand, if I just print the variable FINAL_ST in the following snippet without jumping to INIT (the start of the number-to-string conversion), it works fine and prints 0x0000.

I've searched multiple times but to no avail.

I found out that gdb can be used to debug nasm programs but I could not understand how to use it when the output is a .bin file.

And I also tried constructing a Control Flow Graph for this code to understand execution flow but could not find an appropriate tool for it. :(

Code:

[org 0x7c00]

    mov ax, 0x19d4
    mov bx, FINAL_ST + 5

    ; jmp PRINTER ; works :/
    jmp INIT

NUM:
    add dx, 0x0030
    mov [bx], dx
    jmp CONT

ALPHA:
    add dx, 0x0037
    mov [bx], dx
    jmp CONT

CONT:
    dec bx
    shr ax, 4
    cmp ax, 0x0000
    jne INIT
    je PRINTER

INIT:
    mov dx, 0x000f
    and dx, ax
    cmp dx, 0x000a
    jl NUM
    jge ALPHA

;STRING PRINTER
PRINTER:
    mov bx, FINAL_ST
    mov ah, 0x0e
    jmp PRINT ; this doesn't work

PRINT:
    mov al, [bx]
    int 0x10
    inc bx
    cmp byte[bx], 0x00
    jne PRINT

FINAL_ST:
    db "0x0000", 0x00

END:
    times 510 - ($ - $$) db 0
    dw 0xaa55

Commands used:

nasm boot_hex1.asm -f bin -o boot_hex1.bin

qemu-system-x86_64 boot_hex1.bin

I get the output as 0x1 while the expected output is 0x19D4.

  • Any particular reason why you are learning x86 assembly basics on a bootloader binary? (It would make more sense to me to first learn basic x86 assembly in 32-bit Linux (you can build, run and debug elf32 binaries in 64-bit Linux too), and then learn about 16-bit specialities, limits and bootloaders.) And you need a debugger for QEMU. Here is a question about that, maybe it will help: stackoverflow.com/q/14242958/4271923 ... About your task: you are converting a binary value, not a hexadecimal one. mov ax, 0x19d4 will load ax with the value 6612, encoded in binary into the 16 bits of register ax. Commented Dec 1, 2017 at 9:12
  • Everything "hexadecimal" about that value is only your formatting in the source code, after it is being assembled into machine code, that information is lost and irrelevant. CPU operates with bits, which are two levels of electrical current, often interpreted as 0 or 1 from programmer point of view. And ax has 16 of those "bits". There's nothing about format, just 16x zero or one. Commented Dec 1, 2017 at 9:15
  • @Ped7g . No, there's no specific reason for learning basics on bootloader. Actually I just googled OS development and started following this. I get your point that it is basically binary representation that we are converting to string(stored as hex representation). I guess it is a mistake on my part. What edits would you like me to make to the question? Commented Dec 1, 2017 at 9:21
  • And I tried executing those commands in the question you linked it. It just opened another window with title as QEMU(Stopped). Commented Dec 1, 2017 at 9:23
  • I have some quick and dirty code that displays bytes and words in hex from within a bootloader using NASM. It was part of some test code in this Stackoverflow answer under the section Test Code to See if Your BIOS is Overwriting the BPB. There is a function print_byte_hex and print_byte_word that you might be able to draw inspiration from. It was designed to print out the address and bytes of the bootloader itself. Commented Dec 1, 2017 at 9:26

2 Answers

2

Your issue is on the two lines that look like this:

mov [bx], dx 

This moves the 16-bit value in DX to the address specified in BX. Since x86 is little endian this has the effect of moving DL to [BX] and DH to [BX+1] on each iteration of your loop. Since DH is always zero in your code this has the effect of NUL terminating the string after each character is written to the FINAL_ST buffer.

What you actually want is to update the single byte pointed to by BX with the value in DL. Change both lines to:

mov [bx], dl 
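To show the byte-sized store in context, here is a minimal sketch of the whole boot sector with that change applied. It is one possible arrangement rather than the asker's exact code: the labels CONVERT and DIGIT, the use of SI as the string pointer, and the final jmp $ are my own additions. It also assumes DS is already zero, as it happens to be in QEMU.

```nasm
[org 0x7c00]

    mov ax, 0x19d4          ; value to print
    mov bx, FINAL_ST + 5    ; address of the last digit in "0x0000"

CONVERT:
    mov dx, ax
    and dx, 0x000f          ; DL = lowest nibble, DH = 0
    cmp dl, 10
    jb  DIGIT               ; 0..9 need only '0' added
    add dl, 'A' - '0' - 10  ; 10..15: extra offset so the sum lands on 'A'..'F'
DIGIT:
    add dl, '0'
    mov [bx], dl            ; byte store: only one character cell is written
    dec bx                  ; move to the next more significant digit
    shr ax, 4               ; SHR sets ZF when the result is zero
    jnz CONVERT

    mov si, FINAL_ST        ; SI as string pointer (assumption, not in the original)
    xor bx, bx              ; BH = display page 0 for the BIOS call
PRINT:
    mov al, [si]
    mov ah, 0x0e            ; BIOS teletype output
    int 0x10
    inc si
    cmp byte [si], 0
    jne PRINT

    jmp $                   ; stop here instead of running into the data

FINAL_ST:
    db "0x0000", 0

    times 510 - ($ - $$) db 0
    dw 0xaa55
```

It can be assembled and run with the same nasm and qemu-system-x86_64 commands used in the question. Because each store now touches a single byte, the NUL terminator after the last digit is left alone and the earlier digits are not cut off.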

I have a Stackoverflow answer with bootloader tips. Tip #1 is:

When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x00007c00 and that the boot drive number is loaded into the DL register.

At a minimum you should set DS to zero since you are using an ORG (origin point) of 0x7c00. You can't assume the BIOS will set DS to zero before transferring control to your bootloader. It works in QEMU since its BIOS happens to have the value 0x0000 in DS already. Not all hardware and emulators will guarantee this.
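As a rough illustration of that tip, a boot sector could begin with a prologue along these lines. This is a sketch under assumptions: the START label and the choice of 0x7c00 as the initial stack pointer are mine, and other stack locations are equally valid.

```nasm
[org 0x7c00]

START:
    xor ax, ax
    mov ds, ax              ; DS = 0, so label offsets match the org of 0x7c00
    mov es, ax
    mov ss, ax              ; interrupts are held off for the next instruction
    mov sp, 0x7c00          ; assumed stack location: grows down below the boot sector
    cld                     ; string instructions count upward

    ; the conversion and print code would follow here

    jmp $                   ; placeholder so execution never reaches the padding

    times 510 - ($ - $$) db 0
    dw 0xaa55
```

Segment registers cannot be loaded with an immediate value, which is why zero goes through AX first.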


2 Comments

Also, this statement mov [bx], dx, does it mean that contents of dh (msb half of dx) were being moved to [bx], that is why a blank(one of the control characters actually) character was being printed?
@vishal-wadhwa Yes, that's correct. Since x86 processors are little Endian it was writing the byte in DL to [BX] and the byte in DH into [bx+1] on each loop. DH was always zero so it was like writing the character in DL followed by a NUL terminating character. This is why your number appears truncated in your version.
1

Here is a proc that has a working solution if someone needs it...

; Use: convert a hex value into a string
; Input: hex value(+10), string pointer(+12)
; Output: None
HEX_BINARY_LEN equ 4
ALPHA_MIN equ 000ah
ALPHA_ASCII equ 55
DECIMAL_ASCII equ 48
HEX_VALUE equ 10
STRING_PTR equ 12
;----------------------------------------------------------------
proc hexToString
    push bp
    push bx
    push ax
    push dx
    mov bp, sp

    mov bx, [bp + STRING_PTR]
    add bx, 3 ; because we start from the end
    mov ax, [bp + HEX_VALUE]

digitLoop:
    mov dx, 000fh
    and dx, ax
    cmp dx, ALPHA_MIN
    ;----------------------------
    jge alphaDigit
    add dx, DECIMAL_ASCII
    mov [bx], dl
    jmp wasDecimalDigit
    ;--------------------
alphaDigit:
    add dx, ALPHA_ASCII
    mov [bx], dl
wasDecimalDigit:
    ;----------------------------
    dec bx
    shr ax, HEX_BINARY_LEN
    cmp ax, 0000h
    jne digitLoop

    mov sp, bp
    pop dx
    pop ax
    pop bx
    pop bp
    retn 6
endp hexToString
;----------------------------------------------------------------

6 Comments

How to convert a binary integer number to a hex string? has links to a couple 16-bit versions, and has a couple 32-bit scalar versions.
The standard frame-pointer setup for BP is push bp / mov bp,sp before any more pushes, so stack args are at a fixed offset from BP regardless of what else you do with SP in the prologue. Your first comment on how to call this says (+10) and (+12) but those offsets aren't relative to the return address, they're only meaningful with this specific BP setup. And BTW, in most calling conventions it's normal to let AX and DX be clobbered by functions, instead of spending extra instructions saving/restoring them. Also, you retn 6, but your function only takes 4 bytes of args.
Your constants could be defined in more meaningful ways, like ALPHA_ASCII equ 'A' - 10 for example, and '0' for decimal. Naming is hard, but for HEX_BINARY_LEN I would have called it BITS_PER_DIGIT or something. And instead of and dx, 0Fh, use and dx, BITS_PER_DIGIT - 1, otherwise it's pointless to have this factored out as a named constant instead of hard-coded.
You don't need mov [bx], dl twice; put that instruction in the part at the end of the loop that runs, like right before dec bx. Also, instead of a whole instruction for add bx, 3, you could use mov [bx+3], dl. Also, shr sets FLAGS, so it's redundant to cmp with zero right after.
Also, if you swap your usage of AX and DX, you can save machine-code size: add al, '0' is only a 2-byte instruction, vs. 3 for add dl, '0' or add dx, '0'. The unconditional jmp could also be removed if you change the constants to ALPHA_ASCII - DECIMAL_ASCII or similar, so 10..15 go through two add instructions with no taken branch, but 0..9 have one taken branch and one add. (Or to keep your same cmp/jge, reverse it so it's two adds with the first one using a negative value, vs. a taken branch and one add ALPHA_ASCII.)