
Just for fun, I have been implementing a simple operating system for the x86 architecture from scratch. I wrote the assembly code for the bootloader, which loads the kernel from disk and enters 32-bit mode. The kernel that it loads is written in C, so to execute it the idea is to generate a raw binary from the C code.

First, I used these commands:

$ gcc -ffreestanding -c kernel.c -o kernel.o -m32
$ ld -o kernel.bin -Ttext 0x1000 kernel.o --oformat binary -m elf_i386

However, this did not generate any binary and failed with this error:

kernel.o: In function 'main':
kernel.c:(.text+0xc): undefined reference to '_GLOBAL_OFFSET_TABLE_'

For clarity's sake, the kernel.c code is:

/* kernel.c */
void main ()
{
    char *video_memory = (char *) 0xb8000;
    *video_memory = 'X';
}

Then I followed this tutorial: http://wiki.osdev.org/GCC_Cross-Compiler to build my own cross-compiler for my target. It worked for my purpose; however, when I disassembled the result with ndisasm I obtained this code:

00000000  55                push ebp
00000001  89E5              mov ebp,esp
00000003  83EC10            sub esp,byte +0x10
00000006  C745FC00800B00    mov dword [ebp-0x4],0xb8000
0000000D  8B45FC            mov eax,[ebp-0x4]
00000010  C60058            mov byte [eax],0x58
00000013  90                nop
00000014  C9                leave
00000015  C3                ret
00000016  0000              add [eax],al
00000018  1400              adc al,0x0
0000001A  0000              add [eax],al
0000001C  0000              add [eax],al
0000001E  0000              add [eax],al
00000020  017A52            add [edx+0x52],edi
00000023  0001              add [ecx],al
00000025  7C08              jl 0x2f
00000027  011B              add [ebx],ebx
00000029  0C04              or al,0x4
0000002B  0488              add al,0x88
0000002D  0100              add [eax],eax
0000002F  001C00            add [eax+eax],bl
00000032  0000              add [eax],al
00000034  1C00              sbb al,0x0
00000036  0000              add [eax],al
00000038  C8FFFFFF          enter 0xffff,0xff
0000003C  16                push ss
0000003D  0000              add [eax],al
0000003F  0000              add [eax],al
00000041  41                inc ecx
00000042  0E                push cs
00000043  088502420D05      or [ebp+0x50d4202],al
00000049  52                push edx
0000004A  C50C04            lds ecx,[esp+eax]
0000004D  0400              add al,0x0
0000004F  00                db 0x00

As you can see, the first 9 rows (apart from the NOP, whose purpose I don't understand) are the assembly translation of my main function. From row 10 to the end, there is a lot of code whose origin I don't understand.

In the end, I have two questions:

1) Why is that code produced?

2) Is there a way to produce the raw machine code from C without that useless stuff?

  • You are looking at inefficient code because optimizations are not enabled. When compiling C you could try passing -O3. The first part of the generated code is a typical stack-frame prologue, followed by an allocation of stack space for local variables. Commented Feb 28, 2017 at 12:12
  • With an optimization option it no longer generates the stack-frame prologue, of course, but it still generates the code after the RET, which does not correspond to anything in the main function. Commented Feb 28, 2017 at 12:19
  • The stuff after the function is likely exception-handling information. I didn't look at it closely. It isn't really code but data. You could try building with GCC using -fno-exceptions and see what happens. Commented Feb 28, 2017 at 12:20
  • @Olaf, although C doesn't, GCC will often still create an .eh_frame section in the object. I usually use a linker script to discard the .eh_frame section and the comment section (as well as the build notes). Commented Feb 28, 2017 at 12:41
  • Which GCC are you using? I tried your code and it works fine: the binary is 21 bytes. Also, rename main() to _start() to eliminate the warning. Commented Feb 28, 2017 at 12:44

1 Answer

A few hints first:

  • avoid naming your starting routine main. It is confusing, both for the reader and perhaps for the compiler (when you don't pass -ffreestanding to gcc, it treats main specially). Use something else like start or begin_of_my_kernel ...

  • compile with gcc -v to understand what your particular compiler is doing.

  • you should probably ask your compiler for some optimizations and all warnings, so pass at least -O -Wall to gcc

  • you may want to look at the generated assembler code, so use gcc -S -O -Wall -fverbose-asm kernel.c to get the kernel.s assembler file and glance through it

  • as commented by Michael Petch, you might want to pass -fno-exceptions

  • you probably need some linker script and/or some hand-written assembler for crt0

  • you should read something about linkers & loaders


 kernel.c:(.text+0xc): undefined reference to '_GLOBAL_OFFSET_TABLE_' 

This smells like something related to position-independent code. My guess: try compiling with an explicit -fno-pic or -fno-pie

(on some Linux distributions, gcc might be configured with some -fpic enabled by default)

PS. Don't forget to pass -m32 to gcc if you want 32-bit x86 binaries.
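Putting several of the hints above together, here is a minimal sketch of what kernel.c might look like with a renamed entry point. The names kernel_start and put_char are hypothetical, the volatile qualifier is an addition that is commonly used for memory-mapped I/O, and the build commands in the comment are an assumption about how the flags could be combined, not a tested recipe for every toolchain:

```c
/* kernel.c: a sketch of a freestanding entry point.
 *
 * Assumed build steps (adjust to your toolchain or cross-compiler):
 *   gcc -m32 -ffreestanding -fno-pic -O -Wall -c kernel.c -o kernel.o
 *   ld -m elf_i386 -Ttext 0x1000 -e kernel_start -o kernel.elf kernel.o
 *   objcopy -O binary -j .text kernel.elf kernel.bin
 */
#include <stdint.h>   /* available even in freestanding mode */

/* Address of the first text-mode cell, as used in the question. */
#define VGA_TEXT_BASE 0xb8000

/* Hypothetical helper, defined below. */
static void put_char(volatile char *cell, char c);

/* Entry routine, deliberately not called main. */
void kernel_start(void)
{
    put_char((volatile char *)(uintptr_t)VGA_TEXT_BASE, 'X');
}

/* Store one character; volatile asks the compiler to keep the store. */
static void put_char(volatile char *cell, char c)
{
    *cell = c;
}
```

A raw binary carries no entry-point information, so the bootloader simply jumps to its first byte. The order of functions inside .text is not something the C source guarantees, which is one reason a linker script or a small assembler stub is suggested above.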


4 Comments

I'm your upvote. I also provided an example in my last comment using OBJCOPY instead of a linker script. Linker script is my preference of course but there is always more than one way to skin the cat.
Thanks for the advice. With the -fno-pic option I can compile directly with my gcc, without using the cross-compiler gcc I built. However, even when passing -fno-exceptions, disassembling the binary still showed the same useless code after the RET. With the procedure proposed by @Michael Petch it worked fine! Thanks to you as well.
@gyro91: If you are going to work on a toy OS over the long term, I highly recommend you stick with a cross compiler. It will save you hassles and grief in the long run. The useless code is actually data being interpreted as instructions by NDISASM: in a binary file it can't properly distinguish between code and data that have been lumped together.
@Michael Petch: Actually I am working with a cross compiler. I used OBJCOPY as you suggested and it worked fine!
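
The comments above attribute the bytes after the RET to .eh_frame data rather than code. As a rough check, the sketch below reads the bytes at offsets 0x18 to 0x2f of the listing in the question as the start of an .eh_frame record. It assumes the usual CIE layout (a 4-byte little-endian length, a 4-byte id of 0, a version byte, then a NUL-terminated augmentation string); the names trailing, cie_header, read_le32 and parse_cie are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes 0x18..0x2f of kernel.bin, copied from the ndisasm listing. */
static const uint8_t trailing[] = {
    0x14, 0x00, 0x00, 0x00,  /* length of the rest of the record: 20 bytes */
    0x00, 0x00, 0x00, 0x00,  /* id field: 0 is assumed to mark a CIE */
    0x01,                    /* version */
    0x7a, 0x52, 0x00,        /* augmentation string "zR" */
    0x01, 0x7c, 0x08, 0x01,  /* remaining CIE fields and instructions */
    0x1b, 0x0c, 0x04, 0x04,
    0x88, 0x01, 0x00, 0x00
};

struct cie_header {
    uint32_t length;
    uint32_t id;
    uint8_t  version;
    char     augmentation[8];
};

/* Read a 32-bit little-endian value. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 0 if buf plausibly starts with a CIE, -1 otherwise. */
static int parse_cie(const uint8_t *buf, size_t len, struct cie_header *out)
{
    size_t i = 0;

    if (len < 10)
        return -1;
    out->length = read_le32(buf);
    out->id = read_le32(buf + 4);
    if (out->id != 0 || out->length > len - 4)
        return -1;
    out->version = buf[8];
    /* Copy the NUL-terminated augmentation string, truncating if needed. */
    while (9 + i < len && buf[9 + i] != 0 &&
           i < sizeof out->augmentation - 1) {
        out->augmentation[i] = (char)buf[9 + i];
        i++;
    }
    out->augmentation[i] = '\0';
    return 0;
}
```

Calling parse_cie(trailing, sizeof trailing, &h) from a small hosted test program yields a length of 0x14, version 1 and the augmentation string "zR", which is consistent with unwind data and not with instructions belonging to main.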
