x86 Assembly Notes

Compiling

NASM is the assembler I use. To build native Linux executables (i.e. ELF binaries) on 64-bit Linux, assemble with the elf64 output format and then link the resulting object file:
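
As a minimal sketch of that workflow, the program below does nothing except exit cleanly. The file name exit64.asm and the output names are hypothetical, and the commands in the comments assume that nasm and GNU ld are installed.

```nasm
; exit64.asm (hypothetical file name)
;
; Assemble: nasm -f elf64 exit64.asm -o exit64.o
; Link:     ld exit64.o -o exit64
; Run:      ./exit64

global _start               ; make the entry point visible to the linker

section .text
_start:
    mov rax, 60             ; sys_exit when using the syscall instruction
    xor rdi, rdi            ; first parameter: exit status 0
    syscall
```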

If, however, you want to produce 32-bit code, replace elf64 with elf32. If you want to run 32-bit code on 64-bit Linux, the linker must also be told to produce a 32-bit executable:
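
A comparable 32-bit sketch follows. Again the file names are hypothetical. It assumes GNU ld, whose elf_i386 emulation produces a 32-bit executable, and a 64-bit kernel built with 32-bit compatibility support so that the result can actually run.

```nasm
; exit32.asm (hypothetical file name)
;
; Assemble: nasm -f elf32 exit32.asm -o exit32.o
; Link:     ld -m elf_i386 exit32.o -o exit32
; Run:      ./exit32

global _start

section .text
_start:
    mov eax, 1              ; sys_exit when using int 0x80
    xor ebx, ebx            ; first parameter: exit status 0
    int 0x80
```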

To compile to a flat binary that runs on bare metal, use the bin output format:

To run the image on the 64-bit x86 QEMU emulator (qemu-system-x86_64):

To run the image on the 32-bit x86 QEMU emulator (qemu-system-i386):
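
To make the bare-metal steps concrete, here is a sketch of a 512-byte boot sector that simply halts. It is an illustration rather than the image these notes were written for: the file names are hypothetical, and it assumes a legacy BIOS boot, where the firmware loads the first sector at address 0x7c00 and expects the signature 0xaa55 in its last two bytes.

```nasm
; boot.asm (hypothetical file name)
;
; Assemble:       nasm -f bin boot.asm -o boot.img
; Run (64-bit):   qemu-system-x86_64 -drive format=raw,file=boot.img
; Run (32-bit):   qemu-system-i386 -drive format=raw,file=boot.img

bits 16                     ; the BIOS starts the CPU in 16-bit real mode
org 0x7c00                  ; address at which the BIOS loads the boot sector

start:
    cli                     ; disable maskable interrupts
.hang:
    hlt                     ; stop the CPU until the next interrupt
    jmp .hang               ; halt again if execution ever resumes

times 510 - ($ - $$) db 0   ; pad with zeros up to byte 510
dw 0xaa55                   ; boot signature in the last two bytes
```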

Registers

Register            64-bit    32-bit
Accumulator         RAX/R0    EAX
Counter             RCX/R1    ECX
Data                RDX/R2    EDX
Base                RBX/R3    EBX
Stack Pointer       RSP/R4    ESP
Stack Base Pointer  RBP/R5    EBP
Source              RSI/R6    ESI
Destination         RDI/R7    EDI

x86-64 also includes the registers R8 to R15.

System Calls

A system call is how user-space programs interface with the kernel. On both 32-bit and 64-bit Linux, it is possible to perform a system call by raising software interrupt 0x80 with the int instruction:

Here eax holds the system call number, and ebx, ecx, edx, esi, edi and ebp hold the first to sixth parameters, in that order. The return value is placed in eax.
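
The register convention above can be sketched with a small 32-bit program. It is an illustration only: the label names and the message are made up, and it relies on the system call numbers 4 (sys_write) and 1 (sys_exit) that apply to int 0x80. The value returned in eax is passed on as the exit status so that it can be inspected from the shell.

```nasm
; write32.asm (hypothetical file name); build as 32-bit code (elf32)

global _start

section .data
msg:    db "Hi", 10         ; the bytes 'H', 'i' and a newline
msg_len equ $ - msg         ; length of the message in bytes (3)

section .text
_start:
    mov eax, 4              ; system call number: sys_write
    mov ebx, 1              ; first parameter: file descriptor (stdout)
    mov ecx, msg            ; second parameter: address of the buffer
    mov edx, msg_len        ; third parameter: number of bytes
    int 0x80                ; eax now holds the return value

    mov ebx, eax            ; use the return value as the exit status
    mov eax, 1              ; system call number: sys_exit
    int 0x80
```

If the write succeeds, the return value is the number of bytes written, so running echo $? afterwards should show 3.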

The x86-64 architecture has a dedicated instruction for performing system calls, syscall:

Here rax holds the system call number, and rdi, rsi, rdx, r10, r8 and r9 hold the first to sixth parameters, in that order. The return value is placed in rax.
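
The 64-bit convention can be sketched in the same way. The label names and the message are again made up, and the system call numbers 1 (sys_write) and 60 (sys_exit) are the ones that apply to syscall. Note that the syscall instruction itself overwrites rcx and r11, which is why the fourth parameter travels in r10 rather than rcx.

```nasm
; write64.asm (hypothetical file name); build as 64-bit code (elf64)

global _start

section .data
msg:    db "Hi", 10         ; the bytes 'H', 'i' and a newline
msg_len equ $ - msg         ; length of the message in bytes (3)

section .text
_start:
    mov rax, 1              ; system call number: sys_write
    mov rdi, 1              ; first parameter: file descriptor (stdout)
    lea rsi, [rel msg]      ; second parameter: address of the buffer
    mov rdx, msg_len        ; third parameter: number of bytes
    syscall                 ; rax now holds the return value

    mov rdi, rax            ; use the return value as the exit status
    mov rax, 60             ; system call number: sys_exit
    syscall
```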

One important system call, used to print characters, is sys_write. It takes a file descriptor (fd), a pointer to a buffer and the number of bytes to write:

This is system call 1 with syscall and 4 with int 0x80. fd refers to the file descriptor: 0 for stdin, 1 for stdout and 2 for stderr. For instance, to print the character ‘H’ in 64-bit x86 assembly:
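
One way this could look, as a sketch rather than a definitive listing, is to push the character onto the stack and pass the stack pointer as the buffer. The count of 8 is an assumption chosen to match one full 64-bit stack slot.

```nasm
; print_h.asm (hypothetical file name); build as 64-bit code (elf64)

global _start

section .text
_start:
    push 72                 ; one 8-byte stack slot: 72 ('H'), then seven zero bytes

    mov rax, 1              ; system call number: sys_write
    mov rdi, 1              ; fd: stdout
    mov rsi, rsp            ; buffer: the slot that was just pushed
    mov rdx, 8              ; count: the whole 8-byte slot
    syscall

    mov rax, 60             ; system call number: sys_exit
    xor rdi, rdi            ; exit status 0
    syscall
```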

Alternatively, to pop the pushed value off the stack afterwards:
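
A sketch of that variant, under the same assumptions as before, removes the pushed slot again once the system call has returned. The destination register of the pop is arbitrary here, since the value is simply discarded.

```nasm
; print_h_pop.asm (hypothetical file name); build as 64-bit code (elf64)

global _start

section .text
_start:
    push 72                 ; 'H' followed by seven zero bytes

    mov rax, 1              ; system call number: sys_write
    mov rdi, 1              ; fd: stdout
    mov rsi, rsp            ; buffer: the slot that was just pushed
    mov rdx, 8              ; count: the whole 8-byte slot
    syscall

    pop rcx                 ; discard the slot, restoring rsp to its old value

    mov rax, 60             ; system call number: sys_exit
    xor rdi, rdi            ; exit status 0
    syscall
```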

It is important to note that write() expects a char[], i.e. a buffer of 8-bit characters. In the above example the pushed value occupies 8 bytes and, because x86 is little-endian, the buffer holds ASCII 72 followed by seven zero bytes, which display as nothing. You could replace the third parameter with 1 and the visible output would not change. However, suppose you took 72 as an 8-bit binary value concatenated with itself (which is 18504, or 0x4848), pushed that onto the stack and printed it with the third parameter set to at least 2: you would then be printing two Hs.
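
The arithmetic is 72 * 256 + 72 = 18504. A sketch of the two-character case, with a hypothetical file name and a count of exactly 2:

```nasm
; print_hh.asm (hypothetical file name); build as 64-bit code (elf64)

global _start

section .text
_start:
    push 18504              ; 0x4848: the bytes 'H', 'H', then six zero bytes

    mov rax, 1              ; system call number: sys_write
    mov rdi, 1              ; fd: stdout
    mov rsi, rsp            ; buffer: the slot that was just pushed
    mov rdx, 2              ; count: only the two 'H' bytes
    syscall

    mov rax, 60             ; system call number: sys_exit
    xor rdi, rdi            ; exit status 0
    syscall
```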

Another important system call is sys_exit, which terminates the process with a given exit status:

This is system call 60 with syscall and 1 with int 0x80.
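
As a small sketch, the 64-bit program below exits with a non-zero status. The value 42 and the file name are arbitrary choices for illustration.

```nasm
; exit42.asm (hypothetical file name); build as 64-bit code (elf64)

global _start

section .text
_start:
    mov rax, 60             ; system call number: sys_exit
    mov rdi, 42             ; first parameter: the exit status
    syscall
```

Running echo $? in the shell immediately afterwards shows the status that was passed, here 42.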

Calling and returning

The x86 instruction set includes the instructions call and ret. call pushes the address of the instruction that follows it (the return address) onto the stack and then jumps to the specified address. ret pops that return address off the stack and jumps to it. To create a function that prints H:
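
One possible shape for such a function is sketched below. The name print_h is made up for illustration. The point to watch is that anything the function pushes must be removed before ret, so that the return address is back on top of the stack.

```nasm
; call_ret.asm (hypothetical file name); build as 64-bit code (elf64)

global _start

section .text

print_h:                    ; hypothetical function: writes 'H' to stdout
    push 72                 ; 'H' followed by seven zero bytes
    mov rax, 1              ; system call number: sys_write
    mov rdi, 1              ; fd: stdout
    mov rsi, rsp            ; buffer: the slot that was just pushed
    mov rdx, 1              ; count: a single byte
    syscall
    add rsp, 8              ; drop the slot so the return address is on top again
    ret                     ; pop the return address and jump to it

_start:
    call print_h            ; pushes the address of the next instruction, then jumps
    call print_h            ; a second call, so the program prints HH

    mov rax, 60             ; system call number: sys_exit
    xor rdi, rdi            ; exit status 0
    syscall
```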