
I'm trying to learn NASM assembly, but I seem to be struggling with what is so simple in high-level languages.

All of the textbooks which I am using discuss using strings -- in fact, that seems to be one of their favorite things. Printing hello world, changing from uppercase to lowercase, etc.

However, I'm trying to understand how to increment and print hexadecimal digits in NASM assembly and don't know how to proceed. For instance, if I want to print the numbers 1 through n in hex, how would I do so without the use of C libraries (which all the references I have been able to find use)?

My main idea would be to have a variable in the .data section which I would continue to increment. But how do I extract the hexadecimal value from this location? I seem to need to convert it to a string first...?

Any advice or sample code would be appreciated.


4 Answers

10

First write a simple routine which takes a nybble value (0..15) as input and outputs a hex character ('0'..'9','A'..'F').

Next write a routine which takes a byte value as input and then calls the above routine twice to output 2 hex characters, i.e. one for each nybble.

Finally, for an N byte integer you need a routine which calls this second routine N times, once for each byte.

You might find it helpful to express this in pseudo code or an HLL such as C first, then think about how to translate this into asm, e.g.

    #include <stdint.h>
    #include <stdio.h>

    void print_nybble(uint8_t n)
    {
        if (n < 10)                 // handle '0' .. '9'
            putchar(n + '0');
        else                        // handle 'A'..'F'
            putchar(n - 10 + 'A');
    }

    void print_byte(uint8_t n)
    {
        print_nybble(n >> 4);       // print hi nybble
        print_nybble(n & 15);       // print lo nybble
    }

    void print_int16(uint16_t n)
    {
        print_byte(n >> 8);         // print hi byte
        print_byte(n & 255);        // print lo byte
    }
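The same layering extends to an integer of any width by looping over its bytes, most significant first. The following is only a sketch under some assumptions: it writes the digits into a caller-supplied buffer instead of printing them, and the names `nybble_char` and `hex_from_u32` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Map a nybble (0..15) to an uppercase hex character. */
char nybble_char(uint8_t n)
{
    return (char)(n < 10 ? n + '0' : n - 10 + 'A');
}

/* Write the low `nbytes` bytes of `value` as hex, most significant byte
   first. `out` must have room for 2 * nbytes + 1 characters. */
void hex_from_u32(uint32_t value, unsigned nbytes, char *out)
{
    assert(nbytes >= 1 && nbytes <= 4);
    for (unsigned i = nbytes; i-- > 0; ) {
        uint8_t byte = (uint8_t)(value >> (8 * i));
        *out++ = nybble_char(byte >> 4);   /* hi nybble */
        *out++ = nybble_char(byte & 15);   /* lo nybble */
    }
    *out = '\0';
}
```

For example, `hex_from_u32(0xBEEF, 2, buf)` fills `buf` with the four digits of the 16-bit value. In assembly, the loop body is the same shift, mask, and add sequence repeated once per byte.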

5 Comments

The C code given above produces "(" when given any value above 0x80 (e.g. if given 0x8F it gives "(F").
@AdrianZhang: It works for me - did you change something ? Can you provide a minimal reproducible example that shows the problem ?
Never mind, it seems that changing from a uint8_t to a char produces the wrong result. Funny how that happened.
Scratch that, it was not an unsigned char. Silly me.
Yes, it’s implementation-defined whether plain char is treated as signed or unsigned, so it is best avoided when you need an unsigned value.
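To make the pitfall in these comments concrete: with a signed char, 0x8F is the value -113, and on typical implementations shifting it right by 4 gives -8, so `-8 + '0'` is 40, which is "(" in ASCII. A minimal sketch of one way to guard against that, assuming the input may arrive as a signed char; `hex_digit` and `char_to_hex` are hypothetical names, not part of the answer above.

```c
#include <assert.h>
#include <stdint.h>

/* Map 0..15 to '0'..'9','A'..'F'. */
char hex_digit(unsigned n)
{
    assert(n < 16);
    return (char)(n < 10 ? '0' + n : 'A' + (n - 10));
}

/* Convert a possibly negative char to two hex digits plus a terminator. */
void char_to_hex(signed char c, char out[3])
{
    /* Conversion to an unsigned type is modulo 256, so -113 becomes 0x8F. */
    uint8_t b = (uint8_t)c;
    out[0] = hex_digit(b >> 4);     /* hi nybble, now always 0..15 */
    out[1] = hex_digit(b & 0x0F);   /* lo nybble */
    out[2] = '\0';
}
```

Converting to `uint8_t` before any shifting keeps the sign bit from being smeared into the high nybble.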
1

Is this a homework assignment?

Bits is bits. Bit, byte, word, double word: these are hardware terms, something instruction sets and assemblers are going to reference. Hex, decimal, octal, unsigned, signed, string, character, etc. are manifestations of programming languages. Likewise .text, .bss, .data, etc. are manifestations of software tools; the instruction set doesn't care about one address being .data and one being .text, it is the same instruction either way. There are reasons why all of these programming language things exist, very good reasons sometimes, but don't let them confuse you when trying to solve this problem.

To convert from bits to human readable ascii, you first need to know your ascii table, and bitwise operators, and, or, logical shift, arithmetic shift, etc. Plus load and store and other things.

Think mathematically about what it takes to get from some number in a register/memory into ASCII hex. Say 0x1234, which is 0b0001001000110100. For a human to read it, yes, you need to get it into a string for lack of a better term, but you don't necessarily need to store four characters plus a null in adjacent memory locations in order to do something with it. It depends on your output function. Normally character-based output entities boil down to a single output_char() of some sort called many times.

You could convert to a string, but that is more work. Instead, for each ASCII character you compute, call some sort of single-character output function right then. putchar() is an example of such a byte-at-a-time output function.

So for binary you want to examine one bit at a time and create a 0x30 or 0x31. For octal, 3 bits at a time and create 0x30 to 0x37. Hex is based on 4 bits at a time.
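The binary and octal cases can share one routine, since each digit is just a group of bits plus 0x30. A rough sketch, assuming the digits go into a caller-supplied buffer; `format_small_base` and its parameters are invented for illustration, and it deliberately stops at 3 bits per digit because hex needs an extra adjustment.

```c
#include <assert.h>
#include <stdint.h>

/* Write `ndigits` digits of `value` in base 2^bits, most significant digit
   first. Only bits = 1..3 is handled, because those digits map directly
   onto 0x30..0x37. `out` needs room for ndigits + 1 characters. */
void format_small_base(uint32_t value, unsigned bits, unsigned ndigits, char *out)
{
    assert(bits >= 1 && bits <= 3);
    assert(ndigits >= 1 && bits * ndigits <= 32);
    uint32_t mask = (1u << bits) - 1;            /* 0x1, 0x3 or 0x7 */
    for (unsigned i = ndigits; i-- > 0; ) {
        uint32_t digit = (value >> (bits * i)) & mask;
        *out++ = (char)(0x30 + digit);           /* digit -> ASCII */
    }
    *out = '\0';
}
```

Calling it with `bits = 1` examines one bit at a time, and with `bits = 3` it takes three bits at a time for octal.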

Hex has the problem that the 16 characters we want to use are not found adjacent to each other in the ascii table. So you use 0x30 to 0x39 for 0 to 9 but 0x41 to 0x46 or 0x61 to 0x66 for A to F depending on your preference or requirements. So for each nybble you might AND with 0xF, compare with 9 and ADD 0x30 or 0x37 (10+0x37 = 0x41, 11+0x37 = 0x42, etc).
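The AND, compare, and ADD steps described above can be written out directly. This is a sketch only; the function name `nybble_to_ascii` and its `lowercase` flag are assumptions added for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert the low nybble of `value` to an ASCII hex character.
   A nonzero `lowercase` selects 'a'..'f' instead of 'A'..'F'. */
char nybble_to_ascii(uint8_t value, int lowercase)
{
    uint8_t n = value & 0x0F;               /* AND with 0xF */
    assert(n <= 15);
    if (n <= 9)                             /* compare with 9 */
        return (char)(n + 0x30);            /* 0..9   -> 0x30..0x39 */
    /* 10..15 -> 0x41..0x46 (add 0x37) or 0x61..0x66 (add 0x57) */
    return (char)(n + (lowercase ? 0x57 : 0x37));
}
```

The two constants fall out of the ASCII table: 0x41 - 10 = 0x37 for uppercase and 0x61 - 10 = 0x57 for lowercase.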

Converting from bits in a register to an ASCII representation of binary: if the bit in memory was a 1, show a 1 (0x31 in ASCII); if the bit was a 0, show a 0 (0x30 in ASCII).

    void showbin ( unsigned char x )
    {
        unsigned char ra;

        for(ra=0x80;ra;ra>>=1)
        {
            if(ra&x) output_char(0x31); else output_char(0x30);
        }
    }

It may seem logical to use unsigned char above, but unsigned int, depending on the target processor, could produce much better (cleaner/faster) code. But that is another topic.

The above could look something like this in assembler (intentionally NOT using x86):

    ...
      mov r4,r0
      mov r5,#0x80
    top:
      tst r4,r5
      moveq r0,#0x30
      movne r0,#0x31
      bl output_char
      mov r5,r5, lsr #1
      cmp r5,#0
      bne top
    ...

Unrolled is easier to write and is going to be a bit faster; the tradeoff is more memory used:

    ...
      tst r4, #0x80
      moveq r0, #0x30
      movne r0, #0x31
      bl output_char
      tst r4, #0x40
      moveq r0, #0x30
      movne r0, #0x31
      bl output_char
      tst r4, #0x20
      moveq r0, #0x30
      movne r0, #0x31
      bl output_char
    ...

Say you had 9 bit numbers and wanted to convert to octal. Take three bits at a time (remember humans read left to right so start with the upper bits) and add 0x30 to get 0x30 to 0x37.

    ...
      mov r4,r0
      mov r0,r4,lsr #6
      and r0,r0,#0x7
      add r0,r0,#0x30
      bl output_char
      mov r0,r4,lsr #3
      and r0,r0,#0x7
      add r0,r0,#0x30
      bl output_char
      and r0,r4,#0x7
      add r0,r0,#0x30
      bl output_char
    ...

A single (8 bit) byte in hex might look like:

    ...
      mov r4,r0
      mov r0,r4,lsr #4
      and r0,r0,#0xF
      cmp r0,#9
      addhi r0,r0,#0x37
      addls r0,r0,#0x30
      bl output_character
      and r0,r4,#0xF
      cmp r0,#9
      addhi r0,r0,#0x37
      addls r0,r0,#0x30
      bl output_character
    ...

Making a loop from 1 to N storing that value in memory and reading it from memory (.data), output in hex:

    ...
      mov r4,#1
      str r4,my_variable
    ...
    top:
      ldr r4,my_variable
      mov r0,r4,lsr #4
      and r0,r0,#0xF
      cmp r0,#9
      addhi r0,r0,#0x37
      addls r0,r0,#0x30
      bl output_character
      and r0,r4,#0xF
      cmp r0,#9
      addhi r0,r0,#0x37
      addls r0,r0,#0x30
      bl output_character
    ...
      ldr r4,my_variable
      add r4,r4,#1
      str r4,my_variable
      cmp r4,#8 ;say N is 7: stop once the counter has passed N
      bne top
    ...
    my_variable .word 0

Saving to RAM is a bit of a waste if you have enough registers, although with x86 you can operate directly on memory and don't have to go through registers.

x86 isn't the same as the above (ARM) assembler, so it is left as an exercise for the reader to work out the equivalent. The point is that it is the shifting, ANDing, and adding that matter; break the problem down into elementary steps and the instructions fall out naturally from there.
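As a bridge towards that exercise, here is the 1 to N loop written in C using only shifts, ANDs, adds, and a single character output routine. It is a sketch under assumptions: `output_char` here is a stand-in that appends to a buffer (on real hardware it would be a BIOS call, a UART write, or a system call), and `output_hex_byte` and `print_1_to_n_hex` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in output device: collects characters so they can be inspected. */
char out_buf[1024];
size_t out_len = 0;

void output_char(char c)
{
    assert(out_len < sizeof out_buf - 1);
    out_buf[out_len++] = c;
    out_buf[out_len] = '\0';
}

/* Emit one byte as two hex characters, high nybble first. */
void output_hex_byte(uint8_t b)
{
    uint8_t hi = b >> 4;
    uint8_t lo = b & 0x0F;
    output_char((char)(hi > 9 ? hi + 0x37 : hi + 0x30));
    output_char((char)(lo > 9 ? lo + 0x37 : lo + 0x30));
}

/* Emit 1..n inclusive, each as two hex digits followed by a space. */
void print_1_to_n_hex(uint8_t n)
{
    for (unsigned i = 1; i <= n; i++) {
        output_hex_byte((uint8_t)i);
        output_char(' ');
    }
}
```

Each statement in `output_hex_byte` corresponds to one or two instructions in the ARM listing above, so translating it to another instruction set is mostly a matter of finding that set's shift, AND, compare, and add.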


1

Quick and dirty GAS macro

    .altmacro

    /*
    Convert a byte to its hex ASCII value.
    c: r/m8 byte to be converted
    Output: two ASCII characters, stored in `al:bl`
    */
    .macro HEX c
        mov \c, %al
        mov \c, %bl
        shr $4, %al
        HEX_NIBBLE al
        and $0x0F, %bl
        HEX_NIBBLE bl
    .endm

    /*
    Convert the low nibble of an r8 register to 8-bit ASCII, in place.
    reg: r8 to be converted
    Output: stored in reg itself.
    */
    .macro HEX_NIBBLE reg
        LOCAL letter, end
        cmp $10, %\reg
        jae letter
        /* 0x30 == '0' */
        add $0x30, %\reg
        jmp end
    letter:
        /* 0x37 == 'A' - 10 */
        add $0x37, %\reg
    end:
    .endm

Usage:

    mov $0x1A, %al
    HEX <%al>

<> are used because of .altmacro: Gas altmacro macro with a percent sign in a default parameter fails with "% operator needs absolute expression"

Outcome:

  • %al contains 0x31 , which is '1' in ASCII
  • %bl contains 0x41 , which is 'A' in ASCII

Now you can do whatever you want with %al and %bl, e.g.:

  • loop over multiple bytes and copy them to memory (make sure to allocate twice as much memory as there are bytes)
  • print them with system or BIOS calls


-1

Intel Syntax. This is from my bootloader but you should be able to get the idea.

    print_value_of_CX:

    print_value_of_C_high:

    print_value_of_C_high_high_part:
        MOV AH, CH
        SHR AH, 0x4
        CALL byte_hex_printer

    print_value_of_C_high_low_part:
        MOV AH, CH
        SHL AH, 0x4
        SHR AH, 0x4
        CALL byte_hex_printer

    print_value_of_C_low:

    print_value_of_C_low_high_part:
        MOV AH, CL
        SHR AH, 0x4
        CALL byte_hex_printer

    print_value_of_C_low_low_part:
        MOV AH, CL
        SHL AH, 0x4
        SHR AH, 0x4
        CALL byte_hex_printer
        RET                     ; return here instead of falling through into byte_hex_printer

    byte_hex_printer:
        CMP AH, 0x00
        JE move_char_for_zero_into_AL_to_print
        CMP AH, 0x01
        JE move_char_for_one_into_AL_to_print
        CMP AH, 0x02
        JE move_char_for_two_into_AL_to_print
        CMP AH, 0x03
        JE move_char_for_three_into_AL_to_print
        CMP AH, 0x04
        JE move_char_for_four_into_AL_to_print
        CMP AH, 0x05
        JE move_char_for_five_into_AL_to_print
        CMP AH, 0x06
        JE move_char_for_six_into_AL_to_print
        CMP AH, 0x07
        JE move_char_for_seven_into_AL_to_print
        CMP AH, 0x08
        JE move_char_for_eight_into_AL_to_print
        CMP AH, 0x09
        JE move_char_for_nine_into_AL_to_print
        CMP AH, 0x0A
        JE move_char_for_A_into_AL_to_print
        CMP AH, 0x0B
        JE move_char_for_B_into_AL_to_print
        CMP AH, 0x0C
        JE move_char_for_C_into_AL_to_print
        CMP AH, 0x0D
        JE move_char_for_D_into_AL_to_print
        CMP AH, 0x0E
        JE move_char_for_E_into_AL_to_print
        CMP AH, 0x0F
        JE move_char_for_F_into_AL_to_print

    move_char_for_zero_into_AL_to_print:
        MOV AL, 0x30
        CALL print_teletype_stringB
        RET
    move_char_for_one_into_AL_to_print:
        MOV AL, 0x31
        CALL print_teletype_stringB
        RET
    move_char_for_two_into_AL_to_print:
        MOV AL, 0x32
        CALL print_teletype_stringB
        RET
    move_char_for_three_into_AL_to_print:
        MOV AL, 0x33
        CALL print_teletype_stringB
        RET
    move_char_for_four_into_AL_to_print:
        MOV AL, 0x34
        CALL print_teletype_stringB
        RET
    move_char_for_five_into_AL_to_print:
        MOV AL, 0x35
        CALL print_teletype_stringB
        RET
    move_char_for_six_into_AL_to_print:
        MOV AL, 0x36
        CALL print_teletype_stringB
        RET
    move_char_for_seven_into_AL_to_print:
        MOV AL, 0x37
        CALL print_teletype_stringB
        RET
    move_char_for_eight_into_AL_to_print:
        MOV AL, 0x38
        CALL print_teletype_stringB
        RET
    move_char_for_nine_into_AL_to_print:
        MOV AL, 0x39
        CALL print_teletype_stringB
        RET
    move_char_for_A_into_AL_to_print:
        MOV AL, 0x41
        CALL print_teletype_stringB
        RET
    move_char_for_B_into_AL_to_print:
        MOV AL, 0x42
        CALL print_teletype_stringB
        RET
    move_char_for_C_into_AL_to_print:
        MOV AL, 0x43
        CALL print_teletype_stringB
        RET
    move_char_for_D_into_AL_to_print:
        MOV AL, 0x44
        CALL print_teletype_stringB
        RET
    move_char_for_E_into_AL_to_print:
        MOV AL, 0x45
        CALL print_teletype_stringB
        RET
    move_char_for_F_into_AL_to_print:
        MOV AL, 0x46
        CALL print_teletype_stringB
        RET

1 Comment

Use a lookup table like a normal person for mapping a contiguous input range to various outputs, not a chain of branches!
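The lookup-table approach suggested in this comment replaces the whole chain of compares with one indexed load per nibble. A minimal sketch of the idea, with `HEX_DIGITS` and `byte_to_hex_lut` as hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* The nibble value 0..15 is used directly as an index into this table. */
static const char HEX_DIGITS[] = "0123456789ABCDEF";
static_assert(sizeof HEX_DIGITS == 17, "16 digits plus the terminator");

/* Convert one byte to two hex characters plus a terminator. */
void byte_to_hex_lut(uint8_t b, char out[3])
{
    out[0] = HEX_DIGITS[b >> 4];      /* high nibble */
    out[1] = HEX_DIGITS[b & 0x0F];    /* low nibble */
    out[2] = '\0';
}
```

In assembly the table would be a 16-byte string in the data section, and each lookup becomes a load from the table's address plus the nibble value.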
