Reverse-engineering: Using GDB on Linux

At Holberton School, we have had a couple of rounds of a ‘#forfun’ project called crackme. For these projects, we are given an executable that accepts a password. Our assignment is to crack the program through reverse engineering. For round one we were given four Linux tools to use, and we had to demonstrate how to find the answer with each tool. It was quickly apparent that using a standard library string comparison is a bad idea, as is hardcoding passwords into the executable in plain text. Another round demonstrated that the ltrace tool could reveal not only the password from string comparisons, but also the hashing method (MD5 in that case) used to disguise it.

This week we were given another crack at hacking. I went to my go-to tool for reverse-engineering, the GNU Project Debugger (aka GDB), to find the password. If you would like to take a shot at cracking the executable, you can find it at Holberton School’s GitHub. The file relevant to this post is crackme3.

Program Checks

Before I dig too deep into the executable, I check what information I can get from it. First, I do a test run of the file to see what error information is provided.

$ ./crackme3

Usage: ./crackme3 password

For this executable the password is expected to be provided on the command line.

The next check I run is ltrace, just to see if the password will appear. In addition, it can provide some other useful information about how the program works.

$ ltrace ./crackme3 password
__libc_start_main(0x40068c, 2, 0x7ffcdd754bd8, 0x400710 <unfinished ...>
strlen("password") = 8
puts("ko"ko
) = 3
+++ exited (status 1) +++

The return shows that the program is checking the length of the string, but there is no clear indication that this is a roadblock. Time to break down the program.

The GNU Project Debugger

GDB is a debugger from the GNU Project that runs on Linux and many other systems, with the goal of helping developers identify the sources of bugs in their programs. In their own words, from the gnu.org website:

GDB, the GNU Project debugger, allows you to see what is going on `inside' another program while it executes -- or what another program was doing at the moment it crashed.

When reverse engineering a program, I use the tool to review the disassembled code in either AT&T or Intel syntax and see step by step what is happening. Breakpoints are added to stop the program midstream so that the data in registers and memory can be reviewed to identify how it is being manipulated. I will cover these steps in more detail below.

The Anatomy of Assembly

To get started, I entered the command to launch the crackme3 file with GDB, followed by the disass command and the function name. The output is a list of Assembly instructions that direct each action of the executable.

$ gdb ./crackme3

(gdb) disass main

[Slide image: disass main output shown side by side in AT&T and Intel syntax]

In the previous slide, the AT&T and Intel syntaxes are displayed side by side. However, the output will actually display only one of the two. I prefer to use the AT&T format because the flow makes more sense to me. The first column provides the address of the instruction. The next column is the instruction itself, followed by the data source and then the destination. Jumps and function calls are followed by the jump location or function name. Intel syntax reverses the order of the data source and destination in its display. There are additional differences in the instruction names and operand syntax, but this is common when comparing two notations that describe the same operations. If I were writing Assembly, my syntax preference might be different and would be based on more than just the flow of information.

The Logic Flow

Every program depends heavily on its logic flow. Depending on the compiler and the options selected at compile time, the flow of the Assembly code can be straightforward or very complex. Some options intentionally obfuscate the flow to disrupt attempts to reverse engineer the executable. Below is the output of the disass main command in AT&T syntax.

[Slide image: disass main output in AT&T syntax, with the key instructions highlighted]

The portions of the output not highlighted are jumps and closing processes before exit. There are four types of jumps in the output of main and check_password: je, jmp, jne, and jbe. The jmp instruction performs the described jump regardless of condition. The other three are conditional jumps. The first two, je and jne, are straightforward. They mean jump if equal and jump if not equal. The last one, jbe, is a jump used in a loop that means jump if below or equal, which is the unsigned form of less than or equal.
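
To make the four jumps concrete, here is a small Python sketch of how they depend on the result of a compare. It is only an illustration: the helper names cmp_flags and jump_taken are made up for this example, the operands are treated as 8-bit values, and it assumes the usual x86 behavior where a compare computes left minus right, setting the zero flag (ZF) when the operands are equal and the carry flag (CF) when the left operand is below the right one as an unsigned number.

```python
def cmp_flags(left, right, bits=8):
    """Model an x86 compare: evaluate left - right and return (ZF, CF)."""
    mask = (1 << bits) - 1
    left, right = left & mask, right & mask
    zf = left == right   # zero flag: the operands are equal
    cf = left < right    # carry flag: unsigned borrow, left is "below" right
    return zf, cf


def jump_taken(mnemonic, zf, cf):
    """Return True if the given jump would be taken with these flags."""
    rules = {
        "jmp": True,       # unconditional
        "je": zf,          # jump if equal
        "jne": not zf,     # jump if not equal
        "jbe": cf or zf,   # jump if below or equal (unsigned)
    }
    return rules[mnemonic]


# 3 is below 5, so jbe is taken and je is not.
zf, cf = cmp_flags(3, 5)
print(jump_taken("jbe", zf, cf), jump_taken("je", zf, cf))

# 0xFF would be -1 as a signed byte, but jbe is unsigned: 255 is not <= 1.
zf, cf = cmp_flags(0xFF, 0x01)
print(jump_taken("jbe", zf, cf))
```

The second case is the reason jbe is described as "below or equal" and not "less than or equal": the comparison ignores the sign bit.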

The Heart of the Question

Ultimately we are looking for the password. Based on the information from the main output, the program depends primarily on the check_password function to determine whether to exit or provide access. To analyze the process happening in that function, I entered disass check_password.

[Slide image: disass check_password output]

The first thing I confirm is that the length of the password entered is important. The program looks for a password that is four characters long. The instruction at 0x400632 actually shows the password in integer form, but I did not recognize it immediately. That value is stored in memory four bytes before the address held in the RBP register. I use x/xw $rbp - 0x04 to print the value. The ‘x’ selects hexadecimal output, which is the easiest format for seeing how the password is stored, and the ‘w’ selects a four-byte word (in GDB, ‘h’ is the two-byte halfword size, not hexadecimal). From the instruction set, a comparison of two registers, rax and rdx, occurs at 0x40066a. This is where the next step of my investigation leads.
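
To see why four characters can show up as a single integer, here is a short Python sketch. It uses the test input "test" as sample data, not the real password, and assumes the little-endian byte order of x86-64. The word and halfword views correspond to the w (four-byte) and h (two-byte) size letters of GDB's x command.

```python
import struct

raw = b"test"  # four ASCII bytes as they would sit in memory

# One 32-bit little-endian word: the first byte in memory is the least significant.
(word,) = struct.unpack("<I", raw)
print(hex(word))  # 0x74736574

# The same four bytes viewed as two 16-bit halfwords.
halfwords = struct.unpack("<2H", raw)
print([hex(h) for h in halfwords])  # ['0x6574', '0x7473']
```

Read as a word, the characters appear in reverse order (0x74 0x73 0x65 0x74 is "tset"), which is why the integer is easy to overlook at first glance.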

Registered and Certified

(gdb) b *0x40066a
(gdb) run test
Starting program: /home/vagrant/reverse_engineering/crackme3/crackme3 test
Breakpoint 1, 0x000000000040066a in check_password ()
(gdb) info registers

I set a breakpoint to analyze the data in process. Breakpoints do exactly what their name says: they interrupt the process at the given instruction address. Once the breakpoint is set, I start the executable with the command run test. The value ‘test’ is the four-character password I used to get past the length test and into the password comparison. Once the breakpoint is triggered, I enter info registers to view the data in the registers at the point the program was interrupted.

Register Data on Each Loop

Loop 1: RDX = 0x41 (A)
Loop 2: RDX = 0x42 (B)
Loop 3: RDX = 0x43 (C)
Loop 4: RDX = 0x04 (^D, the ASCII end-of-transmission character)

On the slide, blue marks the user input and green marks the stored password.

From the register information, I find the two values being compared: 0x74 and 0x41. The ASCII value of the letter ‘t’ is 0x74. The printable letter for 0x41 is ‘A’. I also noticed that RCX holds an integer value of 0x4434241. Read one byte at a time starting from the low end (x86 stores integers in little-endian order), it is 0x41, 0x42, 0x43 and 0x04. Converted to character values, that is A, B, C and ^D, the ASCII end-of-transmission (EOT) character that terminals treat as end of file.
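
The reading of RCX can be double-checked with a few lines of Python. This is just a quick sketch that assumes the value is a 32-bit little-endian integer, as on x86-64; the variable names are only illustrative.

```python
rcx = 0x4434241  # value seen in RCX at the breakpoint

# Lay the integer out as four bytes, least significant byte first.
stored = rcx.to_bytes(4, byteorder="little")
print(stored)                    # b'ABC\x04'
print([hex(b) for b in stored])  # ['0x41', '0x42', '0x43', '0x4']
```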

Inputting the password is tricky. Typing Ctrl + D at the Bash prompt is treated as an end-of-input signal, so the character is never passed to the executable. In fact, it is commonly used to exit interactive programs. I tried to store it in a file, but Emacs binds ^D (Ctrl + D) to one of its own editing commands (delete-char by default) and does not insert the character. My workaround is to use an online ASCII to text converter and paste the result into a file through Atom. Atom adds a newline character, which I remove with Emacs.

To get the password past Bash and into the executable, I use the command line below. The $(< file) substitution makes Bash read the file and pass its contents as the argument, so the 0x04 byte arrives as ordinary data and is not interpreted as a Ctrl + D keystroke.

$ ./crackme3 $(< 0-password)

Congratulations!
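
The password file can also be produced without an online converter or a text editor. The sketch below is one possible approach in Python, not the method I used above: it writes the four bytes to a file with no trailing newline, and can also hand them straight to the program as an argument. The helper names and the ./crackme3 path are assumptions for illustration; the 0-password file name is the one from the command above.

```python
import os
import subprocess

PASSWORD = b"ABC\x04"  # 'A', 'B', 'C' and the 0x04 control byte


def write_password_file(path="0-password"):
    """Write the raw password bytes with no trailing newline."""
    with open(path, "wb") as handle:
        handle.write(PASSWORD)
    return path


def run_crackme(binary="./crackme3"):
    """Run the binary with the password as its only argument, if it exists."""
    if not os.path.exists(binary):
        return None  # nothing to run in this directory
    # No shell or terminal input is involved, so 0x04 is passed as plain data.
    result = subprocess.run([binary, PASSWORD.decode("ascii")],
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    write_password_file()
    print(run_crackme())
```

Because subprocess.run receives the argument list directly, nothing in between gets a chance to interpret the control character.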

Conclusion

GDB is a powerful tool that is useful for much more than I have covered in this post. Take the time to read the documentation from GNU to learn more. I am confident there are many other tools that can be used as well. Share your go-to tool for reverse-engineering or debugging in the comments below.

About Rick Harris

Student at Holberton School, a project-based, peer-learning school developing full-stack software engineers in San Francisco.

Involved in the IT industry for 6+ years, most recently as part of the Office of Information Technologies at the University of Notre Dame.

Keep in Touch

Twitter: @rickharris_dev

LinkedIn: rickharrisdev