Your assignment is to write two "hello world" programs in Linux assembly language using the GAS assembler (on the linprog machines, GAS is installed as the "as" command, with a small caveat mentioned later). Also, please create a "Makefile" that will assemble and link these two source files into two executables.
Your first program should be called hello_world.s.
Your second program should be called hello_world64.s.
Your make file should be called Makefile. It should by default produce two static (NOT dynamic) binaries: hello_world and hello_world64.
Your output for hello_world should look like:
langley@sophie ~/assembly $ ./hello_world
Hello World --- this is John Smith! CIS 4385 Spring 2013
where "John Smith" is to be replaced with your name.
Your output for hello_world64 should look like:
langley@sophie ~/assembly $ ./hello_world64
Hello World (64 bit version) --- this is John Smith! CIS 4385 Spring 2013
When you run strace on hello_world and on hello_world64, the output should look like this:
strace ./hello_world
execve("./hello_world", ["./hello_world"], [/* 40 vars */]) = 0
[ Process PID=12870 runs in 32 bit mode. ]
write(0, "Hello World --- this is John Smi"..., 59Hello World --- this is John Smith! CIS 4385 Spring 2013
) = 59
_exit(0)

strace ./hello_world64
execve("./hello_world64", ["./hello_world64"], [/* 40 vars */]) = 0
write(1, "Hello World (64 bit version) ---"..., 76Hello World (64 bit version) --- this is John Smith! CIS 4385 Spring 2013
) = 76
_exit(0) = ?
Also, please make sure that both programs use an entry point named hello_world, as shown below in the ld examples.
You must use GAS, and not NASM, FASM, or any of the many other assemblers that are out there. GAS is far better documented than most of the others (see the AS manual), it is installed on most machines by default, and it handles multiple versions very gracefully.
Assembling with GAS is very simple:
as --32 -g -o hello_world.o hello_world.s       # assembles 32 bit
as --64 -g -o hello_world64.o hello_world64.s   # assembles 64 bit
Next, you need to use the ld linker to create your executables:
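ld -m elf_i386 -e hello_world -g -static -o hello_world hello_world.o
ld -m elf_x86_64 -e hello_world -g -static -o hello_world64 hello_world64.o
The above invocations of ld use -m to select the output format: elf_i386 produces a 32 bit binary and elf_x86_64 a 64 bit one. The -e names the entry point. The debug information itself comes from the -g given to as; GNU ld accepts -g only for compatibility with other tools, and it keeps the debug sections unless it is told to strip them. The -static tells the linker that this is a static binary which does not depend on any dynamic libraries.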
Using gdb is very simple:
% gdb ./hello_world    # start up the debugger
% break hello_world    # set a breakpoint at your entry point
% run                  # start the program, which will stop immediately
% info reg             # show your registers, the most important command here (can be abbreviated "i r")
% step                 # step one instruction (can be abbreviated "s")
% help                 # everything you could want to know about GDB ;-)
% help all             # all the possible commands...
There are two big differences between 32bit and 64bit Linux assembly, and both center on how system calls are made.
1) In 32bit assembly, you use the instruction
int $0x80
to make a system call.
2) In 32bit assembly, arguments to system calls are loaded into these registers:
EAX -- the system call that you want to execute
EBX -- argument 1 for the system call
ECX -- argument 2 for the system call
EDX -- argument 3 for the system call
ESI -- argument 4 for the system call
EDI -- argument 5 for the system call
EBP -- argument 6 for the system call
1) In 64bit assembly, you use the instruction
syscall
to make a system call.
2) In 64bit assembly, arguments to system calls are loaded into these registers:
RAX -- the system call that you want to make
RDI -- argument 1 for the system call
RSI -- argument 2 for the system call
RDX -- argument 3 for the system call
R10 -- argument 4 for the system call (not RCX, which the syscall instruction itself overwrites)
R8 -- argument 5 for the system call
R9 -- argument 6 for the system call
I happen to like using cpp to rewrite my logical system call names using "unistd_32.h" and "unistd_64.h", like this:
mov $__NR_exit, %eax    # cpp turns "__NR_exit" into "1"
int $0x80
However, for this assignment, it's probably easier to note that the relevant system calls and their numbers are:
32bit:
write(2) is 4
exit(2) is 1
64bit:
write(2) is 1
exit(2) is 60
Thus, the above lines of assembly could simply be written as
mov $1, %eax    # just put "1" in by hand rather than use cpp to find it
int $0x80
Please create a tar file like this:
% tar cf assign2.tar Makefile hello_world.s hello_world64.s
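Before you submit, it is worth listing the archive to make sure all three files made it in. The helper below is only a sketch; check_submission is a hypothetical name, and it assumes the archive was built from the current directory exactly as shown above, so that the member names carry no directory prefix.

```shell
# check_submission ARCHIVE: succeed only if ARCHIVE holds exactly the three required files.
check_submission() {
    expected='Makefile
hello_world.s
hello_world64.s'
    # tar tf lists member names; sort in the C locale so the order is predictable.
    actual=$(tar tf "$1" | LC_ALL=C sort)
    [ "$actual" = "$expected" ]
}
```

Usage would look like: check_submission assign2.tar && echo "archive looks complete"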