Post-Attack Analysis of Unknown Vulnerabilities
Computer Science
Post-Attack Analysis of Unknown Vulnerabilities
Peng Ning
With Emre C. Sezer, Chongkyung Kil, and Jun Xu
Nov 14, 2007, 2007 GMU-CSA Workshop, Computer Science
Motivation
• Vulnerability analysis
  – Essential for
    • Patching
    • Vulnerability based signature generation
  – Painstakingly slow
    • Depends on human efforts
• Existing approaches
  – Static analysis (e.g., [Chen et al. 04], [Feng et al. 04], [Larochelle & Evans 01])
    • False positives
  – Dynamic analysis (e.g., Minos [Crandall et al. 04], TaintCheck [Newsome & Song 05], DIRA [Smirnov & Chiueh 05])
    • Used for detection; inadequate vulnerability information
  – Symbolic execution (e.g., Exe [Cadar et al. 06], DACODA [Crandall et al. 05])
    • Scalability issues
  – Recovery (e.g., STEM [Sidiroglou et al. 05], SEAD [Lacosto et al. 07])
    • Change of application semantics
MemSherlock
• MemSherlock is an automated debugger
  – Automated analysis of unknown memory corruption vulnerabilities
  – Appeared in ACM CCS ’07
• MemSherlock provides
  – Statement that causes the memory corruption
  – Dynamic program slice leading to the corruption
  – Program variables involved in the vulnerability
  – All presented at programming language level
• Implications
  – Generating vulnerability conditions
  – Improves signature or patch generation speed
General Framework: Web Application Example
[Figure: framework diagram. Labels: Traffic, Light-weight IDS, Program, Logger, Replayer, Instrumented Program, MemSherlock, Trigger]
MemSherlock Overview
• Goal is to provide vulnerability information
  – Intuitive, easy to understand for the programmer
• Not only the corruption point
  – Slice of program involved in the vulnerability
  – Effects of user inputs
  – Program variables involved
  – Variable relationships (e.g., pointer aliasing)
  – Type of vulnerability (e.g., stack buffer overflow)
• MemSherlock performs two important tasks
  – Finding the corruption point
  – Tracking program state
MemSherlock: Finding Corruption Point
• Observation: A memory object is modified by a small set of statements (inspired by AccMon)
• For a memory object m, the write set WS(m) is the set of statements that legitimately modify m
• Security Condition: Memory object m should only be updated by statements in WS(m)
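The security condition can be pictured with a very small model. The sketch below is written in Python purely for illustration and is not MemSherlock's code; the `WRITE_SETS` table, the object names, the file name `demo.c`, and the function `violates_security_condition` are all hypothetical.

```python
# Hypothetical write sets: memory object -> set of <file, line> statements
# that are allowed to modify it (all names here are made up for illustration).
WRITE_SETS = {
    "buf": {("demo.c", 11)},
    "len": {("demo.c", 7)},
}

def violates_security_condition(obj, statement, write_sets=WRITE_SETS):
    """Return True when `statement` updates `obj` but is not in WS(obj)."""
    return statement not in write_sets.get(obj, set())
```

A write to `len` performed by the statement at `demo.c:11` would be flagged, because that statement only belongs to WS(buf).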
MemSherlock: Assembly Line
• Pre-Debugging Phase
  – Instruments the program for debugging phase
  – Extracts program information via static analysis
  – Needs to be performed once
• Debugging Phase
  – Tracks program state
  – Monitors memory writes and checks for violation of security condition
  – Tracks tainted data and its propagation
MemSherlock Architecture
[Figure: MemSherlock architecture with the pre-debugging phase highlighted. Labels: Original source files, Source Code Rewriting, Static Analyzer, Compiler, Library specification, Program executable, Debugging information, Malicious input, Debugging Agent, Vulnerability information]
Pre-debugging: Generating Write Sets
• MemSherlock analyzes source code to determine write sets
• For a program variable v, WS(v) includes
  – Assignment statements (e.g., v=expr)
  – Library function calls where v is passed as an argument that can be modified (e.g., memcpy(&v, src, n))
• MemSherlock treats DLLs as black boxes
  – Assumption: A DLL is internally secure, but externally insecure
    • e.g., no stack overflows in the library functions
    • Sound for common, well tested libraries (e.g., the C library)
  – Requires library specifications
  – For each DLL, a list of functions and the arguments they might modify
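As a rough illustration of how assignments and a library specification could combine into write sets, here is a minimal Python sketch. It is not the CodeSurfer-based analysis; the statement tuples, `LIBRARY_SPEC`, and `build_write_sets` are hypothetical, and a real analysis works on parsed source rather than hand-written tuples.

```python
from collections import defaultdict

# Hypothetical library specification: function name -> indices of the
# arguments that the function may modify.
LIBRARY_SPEC = {"memcpy": [0], "strcpy": [0], "memset": [0]}

def build_write_sets(statements, library_spec=LIBRARY_SPEC):
    """Build WS(v) from a toy statement list.

    Each statement is (line, kind, payload):
      kind == "assign": payload is the assigned variable name.
      kind == "call":   payload is (function_name, [argument variable names]).
    """
    write_sets = defaultdict(set)
    for line, kind, payload in statements:
        if kind == "assign":
            write_sets[payload].add(line)
        elif kind == "call":
            func, args = payload
            # Only arguments the specification marks as modifiable count.
            for index in library_spec.get(func, []):
                if index < len(args):
                    write_sets[args[index]].add(line)
    return dict(write_sets)

program = [
    (1, "assign", "v"),                          # v = expr
    (2, "call", ("memcpy", ["v", "src", "n"])),  # memcpy(&v, src, n)
    (3, "call", ("strlen", ["v"])),              # not in the spec: no write
]
ws = build_write_sets(program)
```

Functions missing from the specification contribute nothing, which mirrors why an incomplete specification leads to false alarms later on.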
Dealing with Pointers
• For a pointer variable p two write sets are kept
  – WS(p) – Statements that modify p
  – WS(ref(p)) – Statements that modify the referent (e.g., *p=5)
• ref(p) is resolved during runtime (debugging)
• Perform the same analysis for pointer-type function arguments at function calls
  – Removes the requirement for inter-procedural static analysis
(a) Code example
1 int i = 0;
2 int *p = &i;
3 *p = 1;
4 p = NULL;

(b) Write sets after static analysis
WS(i) = {1}
WS(p) = {2,4}
WS(ref(p)) = {3}

(c) ref(p) and WS(i) during monitoring
Line | ref(p) | WS(i)
1 | N/A | {1}
2 | i | {1,3}
3 | i | {1,3}
4 | NULL | {1}
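The run-time resolution of ref(p) in the example can be replayed with a short Python model. This is only a sketch of the bookkeeping, under the assumption that WS(ref(p)) is merged into the referent's write set while p points at it; `effective_write_set` and the event table are hypothetical names, and `None` stands in for both "N/A" and NULL.

```python
# Write sets produced by static analysis for the four-line example.
STATIC_WS = {"i": {1}, "p": {2, 4}}
REF_WS = {"p": {3}}  # WS(ref(p))

def effective_write_set(var, referents):
    """Static WS(var) plus WS(ref(q)) for every pointer q pointing at var."""
    ws = set(STATIC_WS.get(var, set()))
    for pointer, target in referents.items():
        if target == var:
            ws |= REF_WS.get(pointer, set())
    return ws

# Pointer updates observed while monitoring: line -> (pointer, new referent).
pointer_updates = {2: ("p", "i"), 4: ("p", None)}

referents = {}
trace = []
for line in (1, 2, 3, 4):
    if line in pointer_updates:
        pointer, target = pointer_updates[line]
        referents[pointer] = target
    trace.append((line, referents.get("p"), effective_write_set("i", referents)))
```

The `trace` list reproduces the WS(i) column: statement 3 is accepted as a writer of i only while p refers to i.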
Chained Dereferences
• Earlier technique can only handle simple dereferences
• Source code rewriting is used to convert all chained dereferences to simple dereferences
• Any other dereference that is not simple is converted in the same manner
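Original:
int z;
int *y = &z;
int **x = &y;
**x = 10;        /* chained dereference */

After rewriting:
int z;
int *y = &z;
int **x = &y;
int *temp = *x;  /* temporary holds the intermediate pointer */
*temp = 10;      /* simple dereference */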
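The rewriting step can be mimicked on single statements with a toy Python function. This is a regex-based sketch for illustration only; MemSherlock performs the transformation with CIL on real C code, and `simplify_chained_deref`, the `temp1`, `temp2` naming, and the fixed `base_type` argument are assumptions of this example.

```python
import re

def simplify_chained_deref(stmt, base_type="int"):
    """Split `**...*name = expr;` into simple dereferences via temporaries."""
    match = re.fullmatch(r"\s*(\*+)(\w+)\s*=\s*(.+);\s*", stmt)
    if not match or len(match.group(1)) < 2:
        return [stmt.strip()]  # already a simple dereference (or not a store)
    stars, name, expr = match.groups()
    depth = len(stars)
    rewritten = []
    current = name
    for k in range(1, depth):
        temp = f"temp{k}"
        pointer_level = "*" * (depth - k)  # remaining levels of indirection
        rewritten.append(f"{base_type} {pointer_level}{temp} = *{current};")
        current = temp
    rewritten.append(f"*{current} = {expr};")
    return rewritten
```

For the statement `**x = 10;` the function returns `int *temp1 = *x;` followed by `*temp1 = 10;`, matching the shape of the rewritten listing.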
Output of Pre-debugging Phase
• Simplified program
  – Simplified pointer dereferences
  – Compiled with debugging options
• Input file for the debugger
  – Program variables and their write sets
  – Addresses of global symbols
  – Frame pointer offsets of local variables
  – Other flags that help the debugger
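The slides do not give the format of the debugger's input file. The Python sketch below is only one way to picture the listed information as a record per variable; the `VariableInfo` class, its field names, the `size` field, and all addresses and offsets are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

@dataclass
class VariableInfo:
    """Per-variable record a monitor could load before the debugging phase."""
    name: str
    size: int  # assumed field; not listed on the slide
    write_set: Set[Tuple[str, int]] = field(default_factory=set)  # <file, line>
    global_address: Optional[int] = None  # set for global symbols
    frame_offset: Optional[int] = None    # set for locals, relative to frame pointer

    def address(self, frame_pointer: Optional[int] = None) -> int:
        """Resolve the run-time address of the variable."""
        if self.global_address is not None:
            return self.global_address
        if frame_pointer is None or self.frame_offset is None:
            raise ValueError("a local variable needs a frame pointer and an offset")
        return frame_pointer + self.frame_offset

# Made-up entries: one global and one local.
counter = VariableInfo("counter", 4, {("demo.c", 10)}, global_address=0x0804A010)
line_buf = VariableInfo("line", 2048, {("demo.c", 42)}, frame_offset=-2056)
```

Globals resolve to a fixed address, while locals can only be placed once the frame pointer of the active call is known.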
MemSherlock Architecture: Debugging
[Figure: MemSherlock architecture with the debugging phase highlighted. Labels: Original source files, Source Code Rewriting, Static Analyzer, Compiler, Library specification, Program executable, Debugging information, Malicious input, Debugging Agent, Vulnerability information]
Debugging: Dynamic Monitoring
• Runtime monitoring
  – State Maintenance
  – Incorporates taint analysis from TaintCheck
    • Produces a dynamic slice of the program leading to the vulnerability
• Write Checking
  – Monitors and validates memory writes
  – Write sets are file name and line number pairs <f,l>
    • Instruction pointer IP is translated into <f,l>
  – Write sets are associated with program variables
    • A destination address is translated into a program variable
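Translating an instruction pointer into a <f,l> pair amounts to a range lookup in a line table. The Python sketch below assumes such a table has already been extracted from debugging information; `LINE_TABLE`, its addresses, and `ip_to_file_line` are hypothetical and do not reflect how Valgrind exposes this data.

```python
import bisect

# Hypothetical line table: sorted (start_address, file, line) entries.
# Each entry covers addresses up to the next entry's start.
LINE_TABLE = [
    (0x1000, "demo.c", 10),
    (0x1010, "demo.c", 11),
    (0x1024, "demo.c", 14),
]
_STARTS = [entry[0] for entry in LINE_TABLE]

def ip_to_file_line(ip):
    """Map an instruction pointer to the <f, l> pair whose range contains it."""
    index = bisect.bisect_right(_STARTS, ip) - 1
    if index < 0:
        return None  # address lies before the first known entry
    _, file_name, line = LINE_TABLE[index]
    return (file_name, line)
```

The destination-address lookup works the same way, except that the table maps address ranges to program variables and changes as the program runs.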
Keeping Program State
• A given memory region may correspond to different program variables depending on program state
• Dynamic monitor keeps track of memory mapping
[Figure: virtual address space in two program states, each starting at the stack base. Program State 1 stack: main, fnc A, fnc B. Program State 2 stack: main, fnc A, fnc C. A memory write to 0xABABABAB is shown in both states]
Debugging: Key Data Structures
• Keeps two lists of memory regions
  – ActiveMemoryRegions
    • Memory corresponding to program variables or their referent memory regions
  – NonWritableRegions
    • Saved registers, return addresses, metadata encapsulating dynamically allocated memory regions
Debugging: State Maintenance
• Function calls/returns (memory)
  – Local variable addresses are calculated and added to ActiveMemoryRegions
  – Locations of the return address and saved registers are added to the NonWritableRegions list
• Heap memory (memory)
  – malloc/free calls are intercepted
  – Allocated memory is added to ActiveMemoryRegions
  – The metadata encapsulating the buffer is added to NonWritableRegions
• Pointer value updates (write sets)
  – Searches ActiveMemoryRegions to find the referent and updates its WS
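The call/return and heap bookkeeping can be modelled with two dictionaries. The Python sketch below is an illustration, not the Valgrind implementation; the `ProgramState` class, 4-byte words, an 8-byte allocator header placed directly before each buffer, and the removal of regions on return and free are all assumptions of this example.

```python
class ProgramState:
    """Toy model of ActiveMemoryRegions and NonWritableRegions."""

    def __init__(self):
        self.active = {}        # (start, size) -> variable name
        self.non_writable = {}  # (start, size) -> description
        self.frames = []        # regions to drop when each call returns

    def on_call(self, frame_pointer, locals_info, return_addr_slot, word=4):
        """locals_info: list of (name, frame_offset, size) for the callee."""
        added_active, added_nw = [], []
        for name, offset, size in locals_info:
            key = (frame_pointer + offset, size)
            self.active[key] = name
            added_active.append(key)
        for start, what in ((frame_pointer, "saved frame pointer"),
                            (return_addr_slot, "return address")):
            key = (start, word)
            self.non_writable[key] = what
            added_nw.append(key)
        self.frames.append((added_active, added_nw))

    def on_return(self):
        added_active, added_nw = self.frames.pop()
        for key in added_active:
            del self.active[key]
        for key in added_nw:
            del self.non_writable[key]

    def on_malloc(self, start, size, header_size=8):
        self.active[(start, size)] = "heap var"
        self.non_writable[(start - header_size, header_size)] = "allocator metadata"

    def on_free(self, start, size, header_size=8):
        del self.active[(start, size)]
        del self.non_writable[(start - header_size, header_size)]
```

Because regions are keyed by address, the same address can belong to different variables at different times, which is the situation the program-state slide describes.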
Debugging: Write Checking
• When instruction IP modifies memory m
  – if m is in ActiveMemoryRegions
    • determines the variable v it belongs to
    • converts IP into <f,l>
    • checks if <f,l> is in WS(v)
• If the memory write check fails or m is in NonWritableRegions
  – Marks the operation as a memory corruption
  – Displays the vulnerability information
Generating Vulnerability Information
• The slice of program contributing to the vulnerability
  – Statements that have propagated tainted values
  – Statements that have modified related memory regions
• Dependency between memory objects involved in the vulnerability
  – Points-to analysis shows memory regions and how they were accessed
• Program state
  – Call stack information
  – Write set information
Example Test Case: Null HTTP
http.c
 91: void ReadPOSTData(int sid) {
 …
100: conn[sid].PostData=calloc(conn[sid].dat->in_ContentLength+1024, sizeof(char));
101: if (conn[sid].PostData==NULL) { ...
107: do {
108: rc=recv(conn[sid].socket, pPostData, 1024, 0);
109: …

Error Report:
--20361-- Error type: Heap Buffer Overflow
--20361-- Dest Addr: 3AB3E360
--20361-- IP: 0x804E5C7: ReadPOSTData (http.c:108)
--20361-- Dest address resolved to:
--20361-- Global variable "heap var" @ 3AB3E280 (size: 224)
--20361--
--20361-- Memory allocated by 0x804E531: ReadPOSTData (http.c:100)
--20361-- TAINTED destination 3AB3E360
--20361-- Fully tainted from:
--20361-- 0x804E5C7: ReadPOSTData (http.c:108)
--20361--
--20361-- TAINTED size used during allocation
--20361-- Tainted from:
--20361-- 0x804E456: ReadPOSTData (http.c:100)
--20361-- 0x804FBB5: read_header (http.c:153)
--20361-- 0x805121B: sgets (server.c:211)
Vulnerability Analysis Example
http.c
 91: void ReadPOSTData(int sid) {
 92: char *pPostData;
...
100: conn[sid].PostData=calloc(conn[sid].dat->in_ContentLength+1024, sizeof(char));
...
107: do {
108: rc=recv(conn[sid].socket, pPostData, 1024, 0);
...
[Diagram annotations: Heap Object, Create]
Vulnerability Analysis Example
http.c:
119: int read_header(int sid) {
121: char line[2048];
...
127: do {
128: memset(line, 0, sizeof(line));
129: sgets(line, sizeof(line)-1, conn[sid].socket);
...
153: conn[sid].dat->in_ContentLength=atoi((char *)&line+16);
...
169: if (conn[sid].dat->in_ContentLength<MAX_POSTSIZE) {
170: ReadPOSTData(sid);

http.c
 91: void ReadPOSTData(int sid) {
 92: char *pPostData;
...
100: conn[sid].PostData=calloc(conn[sid].dat->in_ContentLength+1024, sizeof(char));
...
107: do {
108: rc=recv(conn[sid].socket, pPostData, 1024, 0);
...
[Diagram annotations: Object Use, Object Taint]
Vulnerability Analysis Example
http.c:
119: int read_header(int sid) {
121: char line[2048];
...
127: do {
128: memset(line, 0, sizeof(line));
129: sgets(line, sizeof(line)-1, conn[sid].socket);
...
153: conn[sid].dat->in_ContentLength=atoi((char *)&line+16);
...
169: if (conn[sid].dat->in_ContentLength<MAX_POSTSIZE) {
170: ReadPOSTData(sid);

server.c
202: int sgets(char *buffer, int max, int fd)
203: {
...
209: conn[sid].atime=time((time_t*)0);
210: while (n<max) {
211: if ((rc=recv(conn[sid].socket, buffer, 1, 0))<0) {
...
[Diagram annotations: Object, Object Taint, Taint, Create]
Implementation
• Source code is rewritten using CIL (C Intermediate Language)
• CodeSurfer was used to extract program variables and their write sets
  – A commercial static analysis tool
• objdump and dwarfdump were used to extract global symbol information
• Dynamic Monitoring is implemented in Valgrind
  – An open source emulator
Evaluation
• Tested 11 real-world applications with known memory corruption vulnerabilities
• Test cases included
  – Stack/Heap buffer overflow, Format string
  – Both control flow and non-control data attacks
• Testing methodology
  – Programs were run under MemSherlock
  – Exploit programs were used to attack the applications
  – Log and replay was not used
Evaluation Results
Application Name | Vuln. Type | Description | Captured? | #FP
GHTTP | S | A small HTTP server | Yes | 7
Icecast | S | An mp3 broadcast server | Yes | 0
Sumus | S | A game server for ‘mus’ | Yes | 0
Monit | S | Multi-purpose anomaly detector | Yes | 0
Newspost | S | Automatic news posting | Yes | 2
Prozilla | S | A download accelerator for Linux | No | 0
NullHTTP | H | An HTTP server | Yes | 0
Xtelnet | H | A telnet server | Yes | 4
Wsmp3 | H | Web server with mp3 broadcasting | Yes | 0
OpenVMPS | F | Open source VLan management policy server | Yes | 2
Power | F | UPS monitoring utility | Yes | 10
Type abbreviations: (S)tack overflow, (H)eap overflow and (F)ormat string
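The table's totals can be tallied directly. The short Python snippet below copies the Captured? and #FP columns from the table; the variable names are arbitrary and the snippet adds no data beyond what the table lists.

```python
# (application, vulnerability type, captured, false positives), from the table.
RESULTS = [
    ("GHTTP", "S", True, 7),
    ("Icecast", "S", True, 0),
    ("Sumus", "S", True, 0),
    ("Monit", "S", True, 0),
    ("Newspost", "S", True, 2),
    ("Prozilla", "S", False, 0),
    ("NullHTTP", "H", True, 0),
    ("Xtelnet", "H", True, 4),
    ("Wsmp3", "H", True, 0),
    ("OpenVMPS", "F", True, 2),
    ("Power", "F", True, 10),
]

captured = sum(1 for _, _, ok, _ in RESULTS if ok)       # vulnerabilities captured
total_fp = sum(fp for *_, fp in RESULTS)                 # false positives overall
missed = [name for name, _, ok, _ in RESULTS if not ok]  # false negatives
```

With these rows, 10 of the 11 vulnerabilities are captured, the only miss is Prozilla, and the false positives sum to 25 across five applications.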
False Negatives
• Prozilla:
  – memcpy uses a kernel function to manipulate page tables when copying entire pages
  – Valgrind cannot trace into kernel
  – Can be prevented by function wrappers
• Other false negatives are theoretically possible
  – structs within unions or arrays
    • Current implementation does not support unions
    • Currently do not differentiate between elements of an array
  – Memory corruption errors inside DLLs
False Positives
• Embedded assembly
• Incomplete library specification
  – library functions keeping internal state (e.g., strtok(NULL, delim))
  – library functions that modify global variables as side effects (e.g., optarg, errno)
  – pointers that point to hidden global structures (e.g., getdatetime() in time.h)
• struct pointers
  – void pointers that are type-cast to modify struct variables
  – since the pointer is not of type struct, MemSherlock fails to update accordingly
Conclusion
• Fully automated vulnerability analysis
• The analysis output is intuitive and human readable
• Future Challenges
  – Automated, long-term fix of vulnerabilities
    • Semantic consistency is a great challenge
  – Automated, temporary fix of vulnerabilities
    • Generating vulnerability condition
    • Improving signature generation
Thank You