Wednesday, 31 January 2018

Investigating the relationship between basic C source code and the output of the C compiler

This post explores compiled C code on the AArch64 architecture to see how assembly language helps us understand what C code turns into. Assembly language is a symbolic representation of machine language and is therefore architecture-specific. We start with this small program:
#include <stdio.h>

int main() {
    printf("Hello World!\n");
}

We compile this code with the following compiler options:

-g               # enable debugging information
-O0              # do not optimize (that's a capital letter and then the digit zero)
-fno-builtin     # do not use builtin function optimizations

Running gcc -g -O0 -fno-builtin -o hello helloworld.c creates a binary ELF (Executable and Linkable Format) file, which contains multiple sections. These sections may contain object code, link tables, debugging symbols, program data (such as constants and initial values of variables), metadata about the program and the ELF sections, and comments.
We can now examine the binary produced above with objdump. The objdump flags that I will use are listed and described below:
-f # display header information for the entire file
-s # display the full contents of each section
-d # disassemble sections containing code
--source # (implies -d) show source code, if available, along with disassembly

1) Running objdump -f -s -d --source hello: the size of the generated file is 18 KB. If we look closely at the output, there is a <main> section where we can see the code that we wrote, and in the same section we can see the string to be printed.

2) Adding -static to the compile command: gcc -g -O0 -fno-builtin -static -o hello helloworld.c. We then run the output through objdump with the same flags as before. The standard C library is now linked into the binary itself, so its code shows up in the disassembly, and the call to <printf@plt> is replaced with a call to <_IO_printf>. The size of the objdump output would be around 108 KB.

3) Removing the compiler option -fno-builtin: the call to <printf@plt> gets replaced by a call to <puts@plt>. GCC optimizes the call depending on the source code. Here printf is given only the Hello World string, which ends in a newline and contains no format conversions, so GCC can convert the call to puts, which is also used to display strings. If the string had been a single character, GCC would have optimized the call to putchar instead.

4) Removing the compiler option -g: this says we no longer want debugging information in the binary, which can save around 2 to 3 KB, or perhaps even less.

5) Adding additional arguments to the printf() function in our program: here we display multiple integers using printf. We can notice that each of those integers is placed in a different location: 1, 2, 3, 4 and 5 are assigned to registers using mov, while 6, 7, 8 and 9 are pushed onto the stack using pushq. (pushq is an x86_64 instruction, so this particular listing comes from an x86_64 build.)

6) Adding -O3 to the gcc options and removing -O0: this turns on aggressive optimization, which should generally improve how fast the program executes.

From the above, we can see that assembly language really helps in understanding compiled C code, because it shows the instructions that go directly to the hardware. It is very low level: it shows exactly where each piece of data is loaded or stored, whether in the processor's registers or in memory.

