Auditing Closed-Source Applications
Using reverse engineering in a security context

Speech Outline:

1. Different approaches to auditing binaries
2. How to spot common programming mistakes in the binary
3. Writing a small script that automates the task of searching for suspicious coding constructs
4. Example of using this script to find a buffer overflow in a major web server application

© HalVar Flake

White Hat vs Black Hat auditing

White Hat Auditing:

Trying to ensure application security by auditing every line of code in a given application, hopefully fixing all problems and leaving the program in a secure and stable condition.

• All code has to be audited
• Continues after a vulnerability has been found
• Has to be repeated upon every upgrade

Black Hat Auditing:

Trying to find a single vulnerable condition through which the security of an application can be compromised.

• Only audits suspicious parts of the code
• Only one vulnerable condition is needed
• Will only be repeated if all old problems have been fixed

Closed-Source Auditing Approaches
1. Stress Testing with Junk Input

Long strings of data are more or less randomly generated and sent to the application, usually trying to overflow every single string that gets parsed by a certain protocol.

Pros:

• Stress testing tools are re-usable for a given protocol
• Will work automatically with little to no supervision
• Do not require specialized personnel to use

Cons:

• The analyzed protocol needs to be known in advance
• Complex problems involving several conditions at once will be missed
• Undocumented options and backdoors will be missed
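As a rough illustration of junk-input generation, the sketch below builds one oversized request. The HTTP-style request line, the helper names make_junk and make_request, and the choice of printable bytes are all assumptions made for the example; the slides do not name a protocol or a tool.

```c
#include <stdlib.h>
#include <string.h>

/* Fill `out` with `len` pseudo-random printable bytes and NUL-terminate it.
 * `out` must have room for len + 1 bytes. Hypothetical helper. */
static void make_junk(char *out, size_t len, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < len; i++)
        out[i] = (char)('!' + rand() % 94);   /* printable ASCII '!'..'~' */
    out[len] = '\0';
}

/* Build "GET /<junk> HTTP/1.0\r\n\r\n" into `req` (assumed protocol).
 * Returns the number of bytes written, or 0 if `req` is too small. */
static size_t make_request(char *req, size_t req_size,
                           size_t junk_len, unsigned seed)
{
    const char *head = "GET /";
    const char *tail = " HTTP/1.0\r\n\r\n";
    size_t need = strlen(head) + junk_len + strlen(tail) + 1;

    if (need > req_size)
        return 0;
    strcpy(req, head);                        /* safe: size checked above */
    make_junk(req + strlen(head), junk_len, seed);
    strcat(req, tail);
    return need - 1;
}
```

A real stress tester would send many such requests with varying lengths and watch the target for crashes; this sketch covers only the generation step.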

Closed-Source Auditing Approaches
2. Manual Reverse Engineering

A reverse engineer carefully reads the disassembly of the program, tediously reconstructing the program flow and spotting programming errors. This was the approach Joey__ demonstrated at BlackHat Singapore.

Pros:

• Even the most complex issues can be spotted

Cons:

• The process involved is incredibly time-consuming
• A highly skilled and specialized auditor is needed
• There is an inherent danger that the auditor will burn out and thus miss obvious problems

Closed-Source Auditing Approaches
3. Looking at suspicious code constructs

A reverse engineer audits calls to functions which are known to be the source of common programming errors. He looks through these calls and decides which ones to read more closely.

Pros:

• Reasonable depth: Even relatively complex issues can be uncovered
• In comparison to a complete manual audit, this approach saves quite a bit of time
• The process of looking for suspicious constructs can be automated to a certain degree

Cons:

• Not all problems will be uncovered
• Needs a highly specialized auditor
• If nothing is found, the auditor is back to approach 2

The right tool for the task:

IDA Pro by Ilfak Guilfanov www.datarescue.com

• Can disassemble x86, SPARC, MIPS and much more ...
• Includes a powerful scripting language
• Can recognize statically linked library calls
• Features a powerful plug-in interface

Dynamically linked library calls: Diagram of program flow

[Diagram: inside the executable image, the application code calls through the dynamic linkage table, which leads to the code of strcpy( ), sprintf( ), strcat( ), ... inside the dynamic library.]

Statically linked library calls: Diagram of program flow

[Diagram: the executable image contains the application code together with the code of strcpy( ), strcat( ), ...]

Assembly recap: Passing arguments
A simple example

    void *memcpy(void *dest, void *src, size_t n);

Assembly representation:

    push 4
    mov eax, unkn_40D278
    push eax
    lea eax, [ebp+var_458]
    push eax
    call _memcpy
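In the listing above the arguments are pushed right to left: first n (4), then src (the value loaded from unkn_40D278), then dest (the address of the stack variable var_458). A C call that could compile to such a sequence might look like the sketch below. The names g_src and copy_four, the buffer size, and the data are assumptions chosen for illustration; only the shape of the call comes from the slide.

```c
#include <string.h>

/* Hypothetical stand-in for the global shown as unkn_40D278: a pointer
 * variable whose value is loaded into eax and pushed as src. */
static const char *g_src = "ABCD";

/* Copies 4 bytes from g_src into a local stack buffer, mirroring the call
 * in the listing. Returns the first copied byte so the effect is visible. */
static char copy_four(void)
{
    char local[0x458];           /* plays the role of [ebp+var_458]; size assumed */

    memcpy(local, g_src, 4);     /* pushed in reverse order: 4, g_src, local */
    return local[0];
}
```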

Dangerous Programming Constructs
The classical strcpy/strcat

[Screenshot: disassembly of a strcpy call, with callouts:]

• This call targets a stack buffer
• The source is variable, not a static string

Criteria for suspicious strcpy/strcat calls:

• Does the call target a stack or heap buffer of fixed size?
• Is the source buffer dynamic and not a fixed string?
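At source level, a call that meets both criteria could look like copy_name_unsafe below; copy_name_checked shows one possible guard. Both function names and the 32-byte buffer are hypothetical and serve only to make the pattern concrete.

```c
#include <string.h>

/* Matches both criteria: fixed-size stack target, caller-controlled source.
 * Overflows `name` whenever strlen(user) >= sizeof name. */
static int copy_name_unsafe(const char *user)
{
    char name[32];

    strcpy(name, user);                  /* no length check */
    return (int)strlen(name);
}

/* Same copy, but the source length is verified first.
 * Returns the copied length, or -1 if `user` does not fit. */
static int copy_name_checked(const char *user)
{
    char name[32];

    if (strlen(user) >= sizeof name)
        return -1;
    strcpy(name, user);                  /* safe: length verified above */
    return (int)strlen(name);
}
```

In a disassembly the two differ by the strlen call and comparison that precede the strcpy, which is what an auditor would look for before dismissing a hit.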

Dangerous Programming Constructs
sprintf( ) targeting fixed buffers

[Screenshot: disassembly of a sprintf call, with callouts:]

• Target buffer is a stack buffer
• Format string containing "%s"
• Expanded strings are not static and not fixed in length

Criteria for suspicious sprintf( ) calls:

• Does the call target a stack or heap buffer of fixed size?
• Does the format string contain a "%s"?
• Is the expanded string of non-fixed length?
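A minimal source-level sketch of the flagged pattern and of a bounded alternative follows. The path prefix, buffer size, and function names are assumptions for illustration; snprintf is the standard C99 call.

```c
#include <stdio.h>

/* Meets all three criteria: stack target, "%s" in the format string,
 * and an expansion whose length depends on the caller. */
static int build_path_unsafe(const char *file)
{
    char path[64];

    return sprintf(path, "/var/www/%s", file);   /* overflows for long `file` */
}

/* Bounded variant: returns 0 on success, -1 if the result was truncated
 * or an output error occurred. */
static int build_path(char *out, size_t out_size, const char *file)
{
    int n = snprintf(out, out_size, "/var/www/%s", file);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```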


*scanf( ) parsing untrusted input

[Screenshot: disassembly of a *scanf call, with callouts:]

• Format string contains "%s"
• Data is parsed into stack buffers

Criteria for suspicious *scanf( ) calls:

• Does the format string contain a "%s"?
• Does the call parse a string into a fixed buffer?
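For comparison, a "%s" conversion with an explicit field width cannot write past the buffer. The sketch below assumes a 16-byte token buffer; parse_token is a hypothetical name.

```c
#include <stdio.h>

/* Parses the first whitespace-delimited token of `line` into a 16-byte
 * buffer. The width 15 leaves room for the terminating NUL; a plain "%s"
 * would copy the whole token regardless of the buffer size. */
static int parse_token(const char *line, char token[16])
{
    return sscanf(line, "%15s", token);
}
```

A format string with a width such as "%15s" is therefore a much weaker hit than a bare "%s" when triaging *scanf calls.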


strncat/strncpy failing to null-terminate

[Screenshot: disassembly of the call, with callouts:]

• Copying data into a stack buffer again ...
• The target buffer is only n bytes long
• If the source is at least n (4000) bytes long, no terminating NUL will be appended

Criteria for suspicious strncat/strncpy( ) calls:

• Is the length n the same size as or bigger than the targeted buffer?
• Is the source buffer dynamic and not a fixed string?
• Does the call target a stack or heap buffer of fixed size?
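The missing terminator can be observed directly with strncpy. The sketch below uses a deliberately oversized scratch buffer so that the demonstration itself stays within bounds; the function name and sizes are assumptions.

```c
#include <string.h>

/* Returns 1 if strncpy(dst, src, n) left no NUL within the first n bytes
 * of dst, 0 if a NUL was written, and -1 if n exceeds this demo's limit. */
static int strncpy_unterminated(const char *src, size_t n)
{
    char dst[17];

    if (n > 16)
        return -1;
    memset(dst, 'X', sizeof dst);        /* sentinel fill, no NULs */
    dst[16] = '\0';                      /* keep the scratch buffer a string */
    strncpy(dst, src, n);                /* pads with NULs only if src is short */
    return memchr(dst, '\0', n) == NULL;
}
```

strncat behaves differently: it always appends a NUL, so with n equal to the buffer size the terminator lands one byte past the end instead of going missing.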


format-string vulnerabilities

[Screenshot: disassembly of a *printf call, with callouts:]

• Argument deficiency
• Format string is a dynamic variable

Criteria for suspicious *printf calls:

• Does the call suffer from an argument deficiency?
• Is the format string dynamic instead of a static string?
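The difference between a dynamic and a static format string shows up even with harmless input. In the sketch below (function names hypothetical), log_unsafe passes caller data as the format, which is exactly the argument-deficient pattern; log_safe passes it as an argument to a fixed "%s".

```c
#include <stdio.h>

/* Dynamic format string and no further arguments: any conversion
 * specifier in `user` is interpreted by snprintf. */
static int log_unsafe(char *out, size_t n, const char *user)
{
    return snprintf(out, n, user);
}

/* Static format string: `user` is copied verbatim. */
static int log_safe(char *out, size_t n, const char *user)
{
    return snprintf(out, n, "%s", user);
}
```

With the input `100%%`, log_unsafe interprets "%%" and produces `100%`, while log_safe reproduces all five characters. Specifiers such as "%x" or "%n" in the unsafe variant would read or write memory the caller never supplied.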

Dangerous Programming Constructs
Cast-screwups

    void func(char *dnslabel)
    {
        char buffer[256];
        char *indx = dnslabel;
        int count;

        count = *indx;              /* a signed char is sign-extended into the int */
        buffer[0] = '\x00';

        while (count != 0 && (count + strlen (buffer)) < sizeof (buffer) - 1) {
            strncat (buffer, indx, count);   /* count is converted to size_t here */
            indx += count;
            count = *indx;
        }
    }

Criteria for suspicious size_t utilization:

• Does the function copy memory with a size_t as length?
• Is the size_t a dynamic value instead of a hardwired one?
• Is the size_t subtracted from immediately before the call?
• Is the size_t at any point written after it has been sign-extended with the movsx mnemonic?
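The effect of the sign extension can be reproduced in isolation. The sketch below assumes a platform where a length byte of 0x80 read through a signed char becomes -128 (the usual behaviour on x86); the function names are hypothetical, and check_passes mirrors the loop condition of func above.

```c
#include <stddef.h>

/* Length a naive parser would hand to a copy routine after reading the
 * length byte through a signed char. */
static size_t length_from_label(const unsigned char *label)
{
    signed char raw = (signed char)label[0];
    int count = raw;                 /* sign-extended (movsx on x86) */

    return (size_t)count;            /* a negative count becomes huge */
}

/* The bounds check from func(): `count` is converted to size_t before
 * the addition, so a negative count can wrap around and pass. */
static int check_passes(int count, size_t used, size_t cap)
{
    return (count + used) < cap - 1;
}
```

With 200 bytes already in a 256-byte buffer, a count of -128 wraps to 72 in the addition and passes the check, after which strncat receives an enormous length.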

Automating the boring parts:
Hands on: A simple sprintf( ) analyzing script

Things to check for when analyzing a sprintf( ) call:

• Does the sprintf( ) target a static buffer?
• Does the format string contain a "%s"?
• Does the call suffer from an argument deficiency?
• If so, is the format string static or dynamic?


    static GetStackCorr(lpCall)
    {
        // Trace the code further until an "add esp, somevalue" is found
        while((GetMnem(lpCall) != "add") || (GetOpnd(lpCall, 0) != "esp"))
            lpCall = Rfirst(lpCall);

        // Convert the somevalue to a number and return it
        return(xtol(GetOpnd(lpCall, 1)));
    }


    static GetBinString(eaString)
    {
        auto strTemp, chr;

        strTemp = "";                   // Zero the string
        chr = Byte(eaString);           // Get a byte

        // Until either a NULL or a 0xFF is found, append one byte at
        // a time to the string, then return the string.
        while((chr != 0) && (chr != 0xFF))
        {
            strTemp = form("%s%c", strTemp, chr);
            eaString = eaString + 1;
            chr = Byte(eaString);
        }
        return(strTemp);
    }


Steps to take to retrieve argument n of a call:

1. Locate the n-th push before the function call
2. If an immediate offset is pushed, return this value
3. If a register was pushed, trace back until the instruction is found which loaded the register and return the value it was loaded with

    static GetArg(lpCall, n)
    {
        auto TempReg;

        // Trace back until the n-th push is found
        while(n > 0)
        {
            lpCall = RfirstB(lpCall);
            if(GetMnem(lpCall) == "push")
                n = n-1;
        }

        // Is the pushed operand a register ? (operand type 1)
        if(GetOpType(lpCall, 0) == 1)
        {
            // Find where the register was last accessed ...
            TempReg = GetOpnd(lpCall, 0);
            lpCall = RfirstB(lpCall);
            while(GetOpnd(lpCall, 0) != TempReg)
                lpCall = RfirstB(lpCall);
            // ... and return the value which was pushed
            return(GetOpnd(lpCall, 1));
        }
        else
            return(GetOpnd(lpCall, 0));
    }
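The backward trace can be modelled outside IDA on a plain array of instructions. The following C sketch is an illustration only: struct insn, get_arg, and the sample listing (taken from the memcpy example earlier in the talk) are assumptions, and the model walks a straight-line listing instead of following cross-references.

```c
#include <string.h>

/* One disassembled instruction; op1 is NULL when there is no second operand. */
struct insn {
    const char *mnem;
    const char *op0;
    const char *op1;
    int op0_is_reg;              /* 1 if op0 is a register */
};

static const struct insn sample[] = {
    { "push", "4",        NULL,            0 },
    { "mov",  "eax",      "unkn_40D278",   1 },
    { "push", "eax",      NULL,            1 },
    { "lea",  "eax",      "[ebp+var_458]", 1 },
    { "push", "eax",      NULL,            1 },
    { "call", "_memcpy",  NULL,            0 },
};

/* Walk back from code[call_idx] to the n-th push (n >= 1). If a register
 * was pushed, keep walking back to the nearest instruction whose first
 * operand is that register and return its second operand.
 * Returns NULL if the listing runs out first. */
static const char *get_arg(const struct insn *code, int call_idx, int n)
{
    int i = call_idx;

    while (n > 0) {
        if (--i < 0)
            return NULL;
        if (strcmp(code[i].mnem, "push") == 0)
            n--;
    }
    if (!code[i].op0_is_reg)
        return code[i].op0;      /* immediate was pushed */

    const char *reg = code[i].op0;
    while (--i >= 0)
        if (strcmp(code[i].op0, reg) == 0)
            return code[i].op1;  /* value the register was loaded with */
    return NULL;
}
```

For the sample listing, argument 1 resolves to `[ebp+var_458]`, argument 2 to `unkn_40D278`, and argument 3 to `4`. Like the IDC version, the model is fooled by any earlier instruction that merely uses the register as its first operand.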

    static AuditSprintf(lpCall)
    {
        auto fString, fStrAddr, buffTarget;

        buffTarget = GetArg(lpCall, 1);
        fString = GetArg(lpCall, 2);

        // Clean up the arguments
        if(strstr(fString, "offset") != -1)
            fString = substr(fString, 7, -1);
        fStrAddr = LocByName(fString);
        fString = GetBinString(fStrAddr);

        // Check for argument deficiency ...
        if(GetStackCorr(lpCall) < 12)
            // ... and for a dynamic format string
            if(strlen(fString) < 2)
                Message("%lx --> Format String Problem ?\n", lpCall);

        // Scan the format string for "%s"
        if(strstr(fString, "%s") != -1)
            // Check if the target is a stack variable
            if(strstr(buffTarget, "var_") != -1)
                Message("%lx --> Overflow problem ? \"%s\"\n", lpCall, fString);
    }
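The threshold of 12 in the argument-deficiency check follows from the calling convention, assuming 32-bit cdecl code in which the caller removes its arguments with "add esp, 4*nargs". A sprintf call that passes only a buffer and a format string is cleaned up with "add esp, 8", so a correction below 12 means nothing was supplied for the format string to expand. The C sketch below illustrates the arithmetic; the helper names and the handling of IDA-style operands such as "0Ch" are assumptions.

```c
#include <stdlib.h>

/* Parse an operand as IDA typically prints it: plain decimal for small
 * values ("8"), hexadecimal with an 'h' suffix otherwise ("0Ch"). */
static long parse_imm(const char *text)
{
    char *end;
    long dec = strtol(text, &end, 10);

    if (*end == '\0')
        return dec;                      /* all digits: decimal */
    return strtol(text, NULL, 16);       /* stops at the trailing 'h' */
}

/* Number of 4-byte arguments the caller pushed. */
static int pushed_args(long stack_corr)
{
    return (int)(stack_corr / 4);
}

/* sprintf(buf, fmt) with no further arguments: fewer than 3 pushes. */
static int has_arg_deficiency(long stack_corr)
{
    return pushed_args(stack_corr) < 3;
}
```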

    static main()
    {
        auto FuncAddr, xref;

        // Ask auditor to enter the address of the sprintf( )
        FuncAddr = AskAddr(-1, "Enter address:");

        // Call the auditing function once for each call to sprintf( )
        xref = RfirstB(FuncAddr);
        while(xref != -1)
        {
            if(GetMnem(xref) == "call")
                AuditSprintf(xref);
            xref = RnextB(FuncAddr, xref);
        }

        // Repeat for all indirect calls
        xref = DfirstB(FuncAddr);
        while(xref != -1)
        {
            if(GetMnem(xref) == "call")
                AuditSprintf(xref);
            xref = DnextB(FuncAddr, xref);
        }
    }

Hands on: Seeing the script in action

Running it against iWS 4.1 SHTML.DLL

We feed our script the address 0x10007068. The result:

[Screenshot: output of the script]

This looks as if we can supply a very long string here ...
