On the Effectiveness of Address-Space Randomization CS6V81 - 005 Brian Ricks and Vasundhara Chimmad.

On the Effectiveness of Address-Space Randomization

CS6V81 - 005

Brian Ricks and Vasundhara Chimmad

Overview

● ASLR: Address Space Layout Randomization
  – Certain brute-force attacks can be thwarted by re-randomizing the address-space layout each time the program is restarted.
  – The attacker must either craft a specific exploit for each instance of a randomized program or perform a brute-force attack to guess the address-space layout.

Overview – PaX ASLR

● PaX applies ASLR to binaries and dynamic libraries.
  – For the purposes of ASLR, a process's user address space consists of three areas, called the executable, mapped, and stack areas.
  – ASLR randomizes these three areas separately, adding to the base address of each one an offset variable randomly chosen when the process is created.
● We will focus on the mapped area, which includes the heap and dynamic libraries.

Overview – PaX ASLR

● PaX ASLR provides the following randomness:
  – 16 bits for addresses in the executable area
  – 16 bits for addresses in the mapped area
  – 24 bits for addresses in the stack area
● We will focus on the mapped data offset, which we call delta_mmap
  – Limited to 16 bits of randomness:
    ● Altering bits 28-31 would limit the ability of mmap() to handle large memory mappings
    ● Altering bits 0-11 would cause memory-mapped pages not to be aligned on page boundaries
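The bit positions above determine which part of a 32-bit address delta_mmap can change. A minimal Python sketch, assuming delta_mmap is a 16-bit value applied to bits 12-27; the names PAGE_SHIFT, RANDOM_BITS, and randomized_mask are illustrative and do not come from PaX:

```python
PAGE_SHIFT = 12    # bits 0-11 stay fixed so mappings remain page-aligned
RANDOM_BITS = 16   # bits 12-27 are randomized; bits 28-31 stay fixed


def randomized_mask() -> int:
    """Return a mask with a 1 in every address bit that delta_mmap may alter."""
    return ((1 << RANDOM_BITS) - 1) << PAGE_SHIFT


if __name__ == "__main__":
    print(hex(randomized_mask()))  # 0xffff000
```

The mask leaves the low 12 bits and the top 4 bits untouched, which is exactly the constraint the two sub-bullets describe.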

Breaking PaX ASLR

● Overview
  – Target: the Apache web server
    ● No known buffer overflows, so one is introduced into the Apache source
  – Exploit using the return-to-libc technique
    ● The stack addresses are randomized using 24 bits
      – Makes guessing them by brute force infeasible
    ● Instead, we use knowledge of the stack layout, as the layout does not change.

Breaking PaX ASLR

● Overview
  – Determine the value of delta_mmap
    ● Brute-force attack that pinpoints an address in libc
  – Once delta_mmap is obtained, mount a return-to-libc attack to spawn a shell
    ● We assume that the stack is non-executable, so we cannot execute shellcode directly on the stack
    ● We instead call a predefined function from the libc library, which is linked by default.
      – One such function, system(), can execute programs, such as a shell.

Precomputing libc Addresses

● In the libc library, determine the address offsets of the functions system(), usleep(), and a return (ret) instruction.
  – We can obtain these offsets by using the standard objdump tool, which displays information from object files (such as the libc library).

Precomputing libc Addresses

● Once these offsets are obtained, we can calculate the correct virtual addresses of system() and ret as follows:

  address = 0x40000000 + offset + delta_mmap

  – 0x40000000: the standard base address for memory obtained using the mmap() function
    ● Already known
  – offset: the offset from the standard base address
    ● Obtained from objdump
  – delta_mmap: the PaX ASLR offset
    ● We need to figure this out!!
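The address formula can be written out as a small Python sketch. The function name libc_address and both sample inputs are hypothetical placeholders (they are not real objdump output), and delta_mmap is assumed to be passed in already page-aligned:

```python
MMAP_BASE = 0x40000000  # standard mmap() base address, as given on the slide


def libc_address(offset: int, delta_mmap: int) -> int:
    """address = 0x40000000 + offset + delta_mmap, kept within 32 bits."""
    return (MMAP_BASE + offset + delta_mmap) & 0xFFFFFFFF


if __name__ == "__main__":
    hypothetical_offset = 0x00012340    # placeholder, NOT a real objdump value
    hypothetical_delta = 0x0ABC << 12   # placeholder page-aligned delta_mmap
    print(hex(libc_address(hypothetical_offset, hypothetical_delta)))
```

Only delta_mmap is unknown to the attacker; the other two terms are constants for a given libc build.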

Obtaining the value of delta_mmap

● Obtain the value of delta_mmap
● What about usleep()?
  – We use this function to help us determine delta_mmap.
  – delta_mmap comprises the 'missing' 16 bits in the address for usleep() (we already know the others: they comprise the base address and the offset)
  – We try to guess the address for usleep() by guessing the value of delta_mmap:
    ● Only 2^16 = 65536 possible values
    ● Possible by brute force
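To put a number on the size of the search, here is a short Python sketch. It assumes delta_mmap is uniformly distributed and that each candidate is tried at most once; the function names are illustrative:

```python
def search_space(bits: int) -> int:
    """Number of distinct values for a randomized field of the given width."""
    return 2 ** bits


def expected_probes(bits: int) -> float:
    """Average probes needed when every candidate is tried once, in a fixed order."""
    return (search_space(bits) + 1) / 2


if __name__ == "__main__":
    print(search_space(16))     # 65536
    print(expected_probes(16))  # 32768.5
```

On average, then, about half of the 65536 candidates have to be tried before the right one is hit.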

Obtaining the value of delta_mmap

● Why usleep()?
  – Gives deterministic behavior in Apache for a successful guess vs a failed guess
    ● Failed guess: child process crashes, Apache spawns a new process (forks)
      – Connection closes immediately
      – The new child process uses the same delta_mmap value as the crashed one!
      – Can keep guessing, knowing that delta_mmap will not change
    ● Successful guess: child process hangs for 16 seconds
      – We can infer from this 16-second delay that we found the correct address for usleep()
      – The guess for delta_mmap is the correct value

Obtaining the value of delta_mmap

● How do we do this?
  – Iterate over all possible values for delta_mmap, starting from 0 and ending at 65535.
  – For each value of delta_mmap, compute the guess for the randomized virtual address of usleep() from its offset and base address.
  – Create the attack buffer and send it to the Apache web server (buffer overflow exploit).
  – If the connection closes immediately, continue with the next value of delta_mmap. If the connection hangs for 16 seconds, then the current guess for delta_mmap is correct.
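The control flow of the search can be illustrated with a purely local simulation. Nothing here touches a network or a server: simulated_probe is a hypothetical stand-in that only compares a guess against a secret held in the same process, and returns "hang" or "close" to mirror the two outcomes listed above.

```python
import secrets


def simulated_probe(guess: int, secret: int) -> str:
    """Stand-in for one probe: 'hang' if the guess is right, otherwise 'close'."""
    return "hang" if guess == secret else "close"


def find_delta(secret: int, bits: int = 16) -> tuple[int, int]:
    """Try every candidate in order; return (recovered value, probes used)."""
    for probes, guess in enumerate(range(2 ** bits), start=1):
        if simulated_probe(guess, secret) == "hang":
            return guess, probes
    raise ValueError("secret lies outside the search space")


if __name__ == "__main__":
    secret = secrets.randbelow(2 ** 16)  # simulated delta_mmap
    value, probes = find_delta(secret)
    print(value == secret, probes)
```

The loop terminates after at most 65536 iterations because the simulated secret, like delta_mmap across Apache child processes, never changes between probes.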

Obtaining the value of delta_mmap

● Why does the child process hang for 16 seconds on a successful guess?
  – We pass usleep() an argument of 16,843,009 microseconds, which makes the process sleep for roughly 16 seconds.
  – This value is represented in the attack buffer as 0x01010101
    ● Notice that any number lower than this has a '00' byte somewhere in its 32-bit hex representation.
    ● A '00' will be interpreted by strcpy() as a null terminator, so strcpy() will stop copying before the entire attack buffer has been written.
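The arithmetic behind the 0x01010101 argument can be checked with a few lines of Python. The helper has_null_byte is an illustrative name; it assumes a 32-bit little-endian word, as on the x86 target discussed here:

```python
def has_null_byte(value: int) -> bool:
    """True if the 32-bit representation of value contains a 0x00 byte."""
    return 0 in value.to_bytes(4, "little")


SLEEP_ARG = 0x01010101  # usleep() argument from the slide, in microseconds

if __name__ == "__main__":
    print(SLEEP_ARG)                  # 16843009
    print(SLEEP_ARG / 1_000_000)      # 16.843009 (seconds)
    print(has_null_byte(SLEEP_ARG))   # False
    print(has_null_byte(10_000_000))  # True: a 10-second value contains 0x00
```

Every byte of the word must be at least 0x01, so 0x01010101 is the smallest value strcpy() will copy in full.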

Obtaining the value of delta_mmap

● What does the attack buffer look like?
  – Top figure: the stack before probing
  – Bottom figure: the stack after one probe
    ● The buffer is toward the bottom in the figure, and the overflow spreads upward, as denoted by the arrow

Obtaining the value of delta_mmap

● Iteration of one probe
  – Enter ap_getline()
    ● This function is modified to include a 64-char buffer (which the attack buffer is written to) and a strcpy() call that causes the overflow
  – The return address (saved EIP) in the stack frame is overwritten with the guessed address for usleep()
    ● When ap_getline() returns, control is redirected to the guessed address
  – The saved frame pointer (EBP) is overwritten with 0xDEADBEEF (it must be overwritten to reach EIP)

Obtaining the value of delta_mmap

● Iteration of one probe
● When ap_getline() returns:
  – If the guess is correct, the value 0xDEADBEEF (above EIP) will be interpreted as the return address for usleep()
    ● This will cause a crash on return from usleep(), but the purpose here is to enter the function
    ● This address will make you a 1337 h4x0r
  – If the guess is correct, the value 0x01010101 will be interpreted as the argument for usleep()
    ● Hex for 16,843,009 decimal, or about 16 seconds.

Obtaining the value of delta_mmap

● Iteration of one probe
● When ap_getline() returns:
  – If the guess is incorrect, the child process will segfault.
    ● This will cause Apache to fork() a new child process. However, this new process will have the same randomization as the old one (PaX randomization occurs when the parent process starts).
    ● Thus, we just guess again

After obtaining delta_mmap

● We can now compute the addresses in libc of all other functions with certainty
● Use the same buffer overflow exploit (the one used to obtain delta_mmap) to conduct a return-to-libc attack.
● We initially start in the stack frame for the ap_getline() function.
● The overflow causes the ap_getline() function to return to a sequence of ret instructions, the address of which can be any ret instruction found in libc

After obtaining delta_mmap

● Sequence of events:
  – The 64-byte buffer in ap_getline() is overflowed by using strcpy() to copy the attack buffer into it.
  – The saved EIP in ap_getline()'s (current) stack frame is overwritten (due to the overflow) with the address of a ret instruction from libc.
  – When ap_getline() returns, the address in EIP is a pointer to a ret instruction!
    ● Remember, when ret is executed for the ap_getline() function, the saved EIP is popped off the stack and into the EIP register
    ● This results in a 32-bit word (address) being popped off the stack (from the EIP location)

After obtaining delta_mmap

● Sequence of events:
  – When EIP is popped off the stack, execution jumps to the address contained in the EIP register (what was popped)
    ● In our case though, this address is a pointer to a ret instruction in libc!
  – Thus, the ret instruction pops EIP off the stack again, and again, this address is a pointer to a ret instruction!
  – What we are doing is essentially moving through the stack one word at a time until we hit the address of system() (part of the attack buffer)

After obtaining delta_mmap

● Sequence of events:
  – When we have popped enough of the stack to reach system(), we know that the pointer to the 64-byte buffer will be in position to serve as the argument to system().
    ● Why? Because we know the stack layout doesn't change, and thus we can figure out exactly how many ret addresses to put in the attack buffer so that system()'s address will be exactly two words down in the stack from the 64-byte buffer pointer
  – The pointer to the 64-byte buffer can be found in the stack frame for ap_getline()'s calling function, and thus we overflow all stack frames with ret addresses until we hit the stack frame for ap_getline()'s calling function.
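The popping behaviour described above can be modelled symbolically. The Python sketch below is a toy: the stack holds text labels instead of addresses, the function name run_ret_chain is hypothetical, and the count of five "ret" words is arbitrary (the real count depends on the stack layout, which this model does not capture).

```python
def run_ret_chain(stack: list[str]) -> tuple[str, str]:
    """Consume symbolic stack words the way successive ret instructions would.

    `stack` starts at the overwritten return-address slot.
    Returns (function entered, word read as its first argument).
    """
    words = list(stack)
    target = words.pop(0)        # ap_getline()'s ret pops the first word
    while target == "ret":       # each ret in libc pops the next word
        target = words.pop(0)
    _ = words.pop(0)             # word the entered function sees as its return address
    argument = words.pop(0)      # the following word is read as its first argument
    return target, argument


if __name__ == "__main__":
    stack = ["ret"] * 5 + ["system", "0xDEADBEEF", "pointer to 64-byte buffer"]
    print(run_ret_chain(stack))  # ('system', 'pointer to 64-byte buffer')
```

The model shows why the buffer pointer must sit exactly two words past the system() entry: one word for the return address, the next for the argument.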

After obtaining delta_mmap

● What does the attack buffer look like?
  – First 64 bytes: the shell command that we want system() to execute
  – This is followed by a series of ret addresses
    ● These are pointers to any ret instruction found in libc
    ● We already know the addresses of ret instructions in libc

After obtaining delta_mmap

● What does the attack buffer look like?
  – Above the last ret address is the address of system()
  – We have just enough ret addresses to 'eat up' the stack such that the pointer to the 64-byte buffer is in position to be the argument for system()

After obtaining delta_mmap

● What does the attack buffer look like?
  – Again we use 0xDEADBEEF to overwrite EBP for the current stack frame and for the return address of system()
  – The pointer into the 64-byte buffer is not overwritten!!
    ● We need this for our arg to system()!!

After obtaining delta_mmap

● Sequence of events:
  – Thus, when system() is called, the pointer to the 64-byte buffer (which contains, say, "/bin/sh") is passed as an argument to system()

After obtaining delta_mmap

● Why do we need to use pointers to ret instructions? Couldn't we replace the ret addresses with 0xDEADBEEF (or some other 1337 address) and simply overwrite the EIP of ap_getline()'s stack frame with the stack address where the overflow placed the system() address? Wouldn't this allow us to jump directly to the correct place in the stack without having to pop words to get there?
  – Sure, but how are we going to get that stack address?
  – PaX ASLR randomizes 24 bits for stack addresses
  – That would require 2^24 = 16,777,216 guesses of the offset alone in the worst case to figure out this stack address!! Not feasible to simply jump to this address.
  – But the stack layout is not randomized (as mentioned)

Experimental Platform

● 2.4 GHz Pentium 4 client attacking a 1.8 GHz Athlon server.
  – Connected over a 100 Mbps network
● Each probe resulted in about 200 bytes of network traffic
  – Total of 12.8 MB in the worst case
  – Total of 6.4 MB in the average case
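The traffic totals follow from the per-probe cost. A rough Python check, assuming the worst case is all 2^16 probes and the average case is half of that (the constant and function names are illustrative):

```python
BYTES_PER_PROBE = 200  # approximate per-probe traffic, from the slide


def traffic_bytes(probes: int) -> int:
    """Total network traffic generated by the given number of probes."""
    return probes * BYTES_PER_PROBE


if __name__ == "__main__":
    worst = traffic_bytes(2 ** 16)
    average = traffic_bytes(2 ** 15)
    print(worst, worst / 1024)      # 13107200 12800.0
    print(average, average / 1024)  # 6553600 6400.0
```

The results, 12,800 KiB and 6,400 KiB, appear to line up with the 12.8 MB and 6.4 MB figures if those are read as thousands of KiB.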

Experimental Results

● 10 trials
● Total # of Apache child processes spawned concurrently: 150
● Results (of 10 trials):
  – Slowest time: 810 seconds
  – Average time: 216 seconds
  – Quickest time: 29 seconds

Improvements

• Attacks exploited the low entropy of 16 bits
• Address space layouts are randomized only at program loading

64-bit architectures

• 16 bits of address space randomization can be defeated by brute force
• 64-bit architectures help, as 40 address bits are available for randomization
• An online brute-force attack against that much entropy won't go unnoticed
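To see how much the extra bits matter, the search spaces can be compared directly. This Python sketch assumes the cost per probe stays constant; the function name slowdown is illustrative:

```python
def slowdown(bits_from: int, bits_to: int) -> int:
    """Factor by which the brute-force search space grows between two entropy levels."""
    return 2 ** (bits_to - bits_from)


if __name__ == "__main__":
    print(slowdown(16, 24))  # 256       (mapped area vs. stack area under PaX)
    print(slowdown(16, 40))  # 16777216  (mapped area vs. 40 randomized bits)
```

Under that assumption, a search over 40 bits is about 16.8 million times larger than the 16-bit search measured in the experiment.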

Randomization frequency

• Proposal: increase the frequency of randomization
• But re-randomizing adds no more than 1 bit of security against brute force
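The "no more than 1 bit" claim can be checked numerically. The Python sketch below assumes two idealized attackers: one facing a fixed layout, who tries each candidate once, and one facing a layout that is re-randomized before every probe, whose guesses therefore succeed independently with probability 1/2^n. All function names are illustrative.

```python
import math


def expected_probes_fixed(bits: int) -> float:
    """Fixed layout: candidates are eliminated one by one, so the mean is (N + 1) / 2."""
    return (2 ** bits + 1) / 2


def expected_probes_rerandomized(bits: int) -> float:
    """Layout changes before every probe: geometric distribution with mean N."""
    return float(2 ** bits)


def extra_bits(bits: int) -> float:
    """Security gained by re-randomizing, expressed in bits."""
    return math.log2(expected_probes_rerandomized(bits) / expected_probes_fixed(bits))


if __name__ == "__main__":
    print(expected_probes_fixed(16))         # 32768.5
    print(expected_probes_rerandomized(16))  # 65536.0
    print(extra_bits(16) < 1)                # True
```

Re-randomization at most doubles the expected work, and doubling corresponds to one extra bit.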

Randomization granularity

• Finer granularity increases randomness
• Achieved by randomizing function and variable addresses within memory segments
• In addition to randomizing base addresses

Randomizing at compile time

• Compiler and linker can be modified to randomize variable and function addresses within their segments
• Introduction of random padding
• Placing entry points in random order within a library adds an additional 10-12 bits of entropy

Randomizing at runtime

• Randomize more than 16 bits while preventing fragmentation of the virtual address space
• Function re-ordering within shared libraries
• Effective against return-to-libc attacks

• By modifying the compiler and linker, relative jumps can be eliminated at compile time
• Resolution of offsets is deferred until runtime dynamic linking
• This allows functions to be ordered arbitrarily, with functions from different libraries loaded into non-contiguous portions of virtual memory

• Library pages would differ from process to process
• Clustering functions into page-size groups and shuffling the groups instead of individual functions
• Code that needs to call these functions must be able to locate them efficiently
• The Global Offset Table (GOT), an array of pointers initialized by the runtime dynamic linker, needs to be fixed

• Difficult with all the constraints
• A linking architecture that facilitates function shuffling in shared code pages effectively and securely is needed; this calls for further research.

Monitoring and catching errors

• Crash detection and reaction mechanism called Watcher
• An attacker's incorrect guesses will trigger segmentation violations
• But the actions available to the crash watcher are limited