Detecting Manipulated Remote Call Streams

Abstract

In the Internet, mobile code is ubiquitous and includes such examples as browser plug-ins, Java applets, and document macros. In this paper, we address an important vulnerability in mobile code security that exists in remote execution systems such as Condor, Globus, and SETI@Home. These systems schedule user jobs for execution on remote idle machines. However, they send most of their important system calls back to the local machine for execution. Hence, a malicious process on the remote machine can manipulate a user's job into sending destructive system calls back to the local machine. We have developed techniques to remotely detect such manipulation.

Before the job is submitted for remote execution, we construct a model of the user's binary program using static analysis. This binary analysis is applicable to commodity remote execution systems and applications. During remote job execution, the model checks all system calls arriving at the local machine. Execution is allowed to continue only while the model remains valid. We begin with a finite-state machine model that accepts sequences of system calls and then build optimizations into the model to improve its precision and efficiency. We also propose two program transformations, renaming and null call insertion, that have a significant impact on precision and efficiency. As a desirable side effect, these techniques also obfuscate the program, making it harder for an adversary to reverse engineer the code. We have implemented a simulated remote execution environment to demonstrate how optimizations and transformations of the binary program increase precision and efficiency. In our test programs, unoptimized models increase run time by 0.5% or less. At moderate levels of optimization, run time increases by less than 13%, with precision gains reaching 74%.
1 Introduction

Code moves around the Internet in many forms, including browser plug-ins, Java applets, document macros, operating system updates, new device drivers, and remote execution systems such as Condor [26], Globus [13,14], SETI@Home [32], and others [1,11,35]. Mobile code traditionally raises two basic trust issues: will the code imported to my machine perform malicious actions, and will my remotely running code execute without malicious modification? We are addressing an important variant of the second case: the safety of my code that executes remotely and makes frequent service requests back to my local machine (Figure 1). In this case, we are concerned that a remotely executing process can be subverted to make malicious requests to the local machine.

The popular Condor remote scheduling system [26] is an example of a remote execution environment. Condor allows a user to submit a job (program), or possibly many jobs, to Condor to run on idle machines in their local environment and on machines scattered worldwide. Condor jobs can execute on any compatible machine with no special privilege, since the jobs send their file-access and other critical system calls to execute on their home machines. The home or local machine acts as a remote procedure call (RPC) server for the remote job, accepting remote call requests and processing each call in the context of the user of the local system. This type of remote execution, with frequent interactions between machines, differs from the execution of "mobile agents" [17,30], where the remote job executes to completion before attempting to contact and report back to the local machine. If the remote job is subverted, it can request the local machine to perform dangerous or destructive actions via these system calls. Subverting a remote job is not a new idea and can be done quickly and easily with the right tools [16,27].
Jonathon T. Giffin, Somesh Jha, Barton P. Miller
Computer Sciences Department, University of Wisconsin, Madison
{giffin,jha,bart}@cs.wisc.edu

This work is supported in part by Office of Naval Research grant N00014-01-1-0708, Department of Energy grants DE-FG02-93ER25176 and DE-FG02-01ER25510, Lawrence Livermore National Lab grant B504964, and NSF grant EIA-9870684. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes, notwithstanding any copyright notices affixed thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the above government agencies or the U.S. Government.

In this paper, we describe techniques to detect when the remote job is making requests that differ from its intended behavior. We are

addressing the issue of the local host's safety; we are not protecting the remote job from inappropriate access to its data, nor are we detecting modification of its calculated result (beyond those modifications that would appear as inappropriate remote system calls).

A local machine that accepts calls as valid without first verifying that the remote job generated the calls during correct execution is vulnerable to maliciously generated calls. Conventional authentication methods using secret data fail in this inherently risky environment. An attacker knows everything present in the remote code, including any authentication mechanism or key, and can manipulate this code at will. Thus, although the local machine must distrust calls from remotely executing code, it has little ability to validate these requests. This vulnerability currently exists in the thousands of machines worldwide running Condor, Globus, Java applets, and similar systems. Our techniques address this deficiency.

Our basic approach to detecting malicious system call streams is to perform a pre-execution static analysis of the binary program and construct a model representing all possible remote call streams the process could generate. As the process executes remotely, the local agent operates the model incrementally, ensuring that every call received remains within the model. Should a call fall outside the set of expected next calls determined by the model, we consider the remote process manipulated. Ideally, a precise model closely mirrors the execution behavior of the application.
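As a concrete illustration of incremental model operation, the checking logic might look like the following sketch. This is illustrative Python, not the paper's implementation; the data layout and names are our own, and the model is assumed to be already ε-reduced.

```python
# Sketch of incremental model operation by the local agent (names illustrative).
# The model is an NFA: transitions[state][call_name] -> set of next states.
# The agent advances the current state set on each remote call it receives and
# flags manipulation as soon as no run of the automaton can explain the stream.

def advance(transitions, current_states, call):
    """Return the set of states reachable by consuming one call."""
    next_states = set()
    for state in current_states:
        next_states |= transitions.get(state, {}).get(call, set())
    return next_states

def check_stream(transitions, start, calls):
    """Accept calls one at a time; return False at the first invalid call."""
    current = {start}
    for call in calls:
        current = advance(transitions, current, call)
        if not current:        # no state can explain this call: manipulated
            return False
    return True

# Toy model for a job that may write repeatedly, then must close:
transitions = {
    0: {"write": {0, 1}},      # any number of writes, then
    1: {"close": {2}},         # exactly one close
}
assert check_stream(transitions, 0, ["write", "write", "close"])
assert not check_stream(transitions, 0, ["close", "unlink"])
```

Execution continues only while the state set remains non-empty; an empty set corresponds to the model entering an invalid state.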

As others have noticed [23,36,37], a specification of a program's intended behavior can be used for host-based intrusion detection. Our approach brings four benefits to these intrusion detection systems:

• Direct operation on binary code.
• Automated construction of specifications.
• Elimination of false alarms.
• Protection against new types of attacks.

We further address an important new source of vulnerabilities: request verification when even cryptographic authentication mechanisms cannot be trusted.

Any program model representing sequences of remote system calls is valid. Previous model construction techniques include human specification [22] and dynamic analysis. A dynamic analyzer completes training runs over multiple execution traces to build probability distributions indicating the likelihood of each call sequence [12,15,39]. False alarms occur if the training runs do not exercise all possible program control flows. Static analysis produces non-probabilistic models representing all control flow paths through an executable. These models are conservative, producing no false alarms [36,37] but potentially accepting an attack sequence as valid.

Our models are finite-state machines. We use control flow graphs generated from the binary code under analysis to construct either a non-deterministic finite-state automaton or a push-down automaton to mirror the flow of control in the executable. Automata are natural structures to represent sequences of remote call names, with push-down automata being more precise. We develop several optimizations to further increase precision while maintaining run-time efficiency.

We evaluate our program models using two metrics: precision and efficiency. Precision measures how tightly the model fits the application it represents. Improving precision reduces the opportunity for an attack to be accepted as valid by the model. Efficiency rates the run-time impact of model operation. To evaluate our techniques and models, we built a prototype static analyzer and model builder for a simulated remote execution environment. We read binary programs on SPARC Solaris machines and produce a model for operation in a simulated local agent. The agent receives notification from the application when system calls are encountered during execution and operates the model accordingly.

Our models are efficient. Non-deterministic finite-state automaton (NFA) models add 0.5% or less to the run times of our test applications. In the less precise NFA models, optimizations become invaluable. Moderate optimization levels improve precision up to 74% while keeping run-time overheads below 13%. Optimized push-down automaton models are more precise but keep overheads as low as 1%. The precision values of these optimized models approach zero, indicating little opportunity for an adversary to begin an attack.

Other strategies have been used to counter mobile code manipulation exploits. These strategies are generally orthogonal; the greatest security comes from incorporating components of all three areas into a solution.

Replication. In a form of Byzantine agreement [24], a remote call is accepted as genuine only if a majority of replicated processes executing on different machines generate the identical call. Sometimes used to

Figure 1: Remote execution with system calls being executed on the home (local) machine. [Figure: the application process on the remote host sends remote system calls to the local agent on the local host.]


verify the results returned by mobile agents [31], such techniques appear limited in an environment with frequent system call interactions over a wide network.

Obfuscation. A program can be transformed into one that is operationally equivalent but more difficult to analyze [7,8,30,38]. We are applying a variant of such techniques to improve our ability to construct precise state machines and to hamper an adversary's ability to understand the semantics of the program. Even though it has been popular in recent years to discount obfuscation based upon Barak et al. [5], in Section 3.4.2 we discuss why their theoretical results do not directly apply in our context.

Sandboxing. Running a program in an environment where it can do no harm dates back to the early days of the Multics operating system project [29]. CRISIS, for example, maintains per-process permissions that limit system access in WebOS [6]. Our techniques could be considered a variety of sandboxing, based on strong analysis of the binary program and construction of a verifying model to support that analysis.

This paper makes contributions in several areas:

Binary analysis. We target commodity computational Grid environments where the availability of source code for analysis cannot be assumed. Further, our analysis is not restricted to a particular source language, so our techniques have wide applicability.

Model optimizations. We develop and use techniques to increase the precision of the finite-state machines we generate, limiting the opportunities for an attacker to exploit a weakness of the model. In particular, we reduce the number of spurious control flows in the generated models with dead automata removal, automata inlining, the bounded stack model, and the hybrid model. Argument recovery reduces opportunities for exploit. We also present a linear-time ε-reduction algorithm to simplify our non-deterministic state machines.

Reduced model non-determinism with obfuscatory benefits. Many different call sites in a program generate requests with the same name (all opens, for example). Our technique of call site renaming gives us a great ability to reduce the non-determinism of our models by uniquely naming every call site in the program and rewriting the executable. We further insert null calls, dummy remote system calls, at points of high non-determinism to provide a degree of context sensitivity to the model. Call site renaming and null call insertion additionally obfuscate the code and the remote call stream. With binary rewriting, other obfuscation techniques are likewise possible.
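To see why renaming reduces non-determinism, consider this small hypothetical sketch (illustrative Python; the data layout and the "#1"/"#2" naming scheme are our own, not the paper's rewriting mechanism). Two call sites that both transmit write are indistinguishable to the model, while uniquely renamed sites are not.

```python
# Illustrative sketch: renaming each call site to a unique symbol removes
# nondeterminism in the model.  Before renaming, observing "write" cannot
# distinguish two call sites; after renaming, each observation is exact.

def reachable(transitions, start, calls):
    """States an NFA could occupy after consuming the call sequence."""
    current = {start}
    for call in calls:
        current = {t for s in current
                     for t in transitions.get(s, {}).get(call, set())}
    return current

# Two call sites that both issue write(); the model cannot tell them apart.
generic = {0: {"write": {1, 2}}, 1: {"close": {3}}, 2: {"unlink": {3}}}
# After call site renaming, each site transmits a unique name.
renamed = {0: {"write#1": {1}, "write#2": {2}},
           1: {"close": {3}}, 2: {"unlink": {3}}}

assert reachable(generic, 0, ["write"]) == {1, 2}   # ambiguous state set
assert reachable(renamed, 0, ["write#2"]) == {2}    # exact site known
# The generic model accepts a close even when the program took site 2:
assert reachable(generic, 0, ["write", "close"]) == {3}
# The renamed model rejects it (empty state set):
assert reachable(renamed, 0, ["write#2", "close"]) == set()
```

The smaller the state set the monitor must track, the fewer call sequences an adversary can slip past the model.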

Context-free language approximations. In general, the language generated by the execution trace of a program is context-free. A push-down automaton, a finite-state machine that includes a run-time stack, defines a context-free language. However, such automata are prohibitively expensive to operate incrementally [36,37], and stack growth potentially consumes all system resources. We use stack abstractions that over-approximate a context-free language with a regular language. Our push-down automata with bounded run-time stacks are less expensive to operate and require finite resources.

We provide background on the Condor system, remote execution in the computational Grid environment, and security exploits in Section 2. Section 3 presents our analysis techniques in an algorithmic fashion. Experimental results are found in Section 4 and a comparison to previous work in Section 5. Related work can be found in Section 6. We conclude in Section 7 with descriptions of several areas of continuing work.

2 Threats

Remote execution is becoming a common scenario. An important class of remotely executing jobs requires a communication path back to the local machine that originated the job; the job sends its critical system calls, such as those for file access or network communication, back to the local machine to execute in the context of the submitting user. This type of remote execution occurs in the Condor distributed scheduling system [26], the Globus computational Grid infrastructure [13,14], and Java applets.

The implementation associated with our research takes place in the context of Condor. Condor schedules jobs on hosts both within the originator's organization and on machines belonging to other organizations. In addition to scheduling these remote jobs, Condor checkpoints and migrates the jobs as necessary for reliability and performance reasons. It is possible for a given job to

Figure 2: Grid environment exploit. A lurker process attaches to the remote job, inserting code that takes control of the network link. [Figure: the lurker process on the remote host sends malicious remote calls to the local agent on the local host.]


execute, at different times, on several hosts in several different administrative domains.

Condor is a prevalent execution environment, particularly for scientific research. For example, in the year 2000, researchers used Condor to solve nug30, a combinatorial optimization problem that had gone unsolved for 32 years [2]. Remote jobs ran on 2510 processors across the United States and Italy, outside the administrative control of the program's authors. Furthermore, the network path between each remote process and its originating host included public Internet links. A malicious third party with access to either the execution machines or the network links could have manipulated the originating machine, as we now detail.

Remote system calls in Condor are simply a variant of a remote procedure call (RPC). A client stub library is linked with the application program instead of the standard system call library. The stub functions within this library package the parameters to the call into a message, send the message over the network to the submitting machine, and await any result. A local agent on the submitting machine services such calls, unpacking the request, executing the call, and packaging and sending any result back to the remote machine.
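The stub/agent round trip described above can be sketched as follows. This is a hypothetical Python sketch with a JSON wire format; the real Condor stub library is C code with its own protocol, so every name here is our own assumption.

```python
# Illustrative marshal/serve pair for the remote-system-call pattern above.
import json

def marshal(name, *args):
    """Client-stub side: package a remote system call into a message."""
    return json.dumps({"call": name, "args": list(args)}).encode()

def serve(message, handlers):
    """Local-agent side: unpack the request, execute it, package the result."""
    request = json.loads(message)
    result = handlers[request["call"]](*request["args"])
    return json.dumps({"result": result}).encode()

# Simulated round trip: the remote job "writes" through the local agent.
log = []
handlers = {"write": lambda fd, data, n: log.append((fd, data[:n])) or n}
reply = serve(marshal("write", 1, "hello\n", 6), handlers)
assert json.loads(reply)["result"] == 6
assert log == [(1, "hello\n")]
```

The vulnerability discussed next arises because the agent, as sketched, executes whatever request arrives; nothing ties the message to the job's correct execution.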

This RPC model exposes the submitting machine to several vulnerabilities. These vulnerabilities share the common characteristic that a malicious entity on the remote machine can control the job, and therefore control its remote system call stream. This malicious system call stream could cause a variety of harmful actions to be performed as the submitting user. The simplest case of a malicious remote host is when the host's owner (with administrative privileges) takes control of the remote job. More complex and harder-to-track cases might be caused by previous malicious remote jobs. A previously discovered vulnerability in Condor had this characteristic [27]. When a remote job executes, it is typically run as a common, low-privilege user such as "nobody." A malicious user could submit a job that forks (creates a new process) and then terminates. The child process remains running, but it appears to Condor as if the job has terminated. When a new job is scheduled to run on that host, the lurking process detects the newly arrived job, dynamically attaches to it, and takes control of it. The lurker can then generate malicious remote calls that will be executed to the detriment of the machine that originated the innocent job (see Figure 2).

Similar results are possible with less unusual attacks. If the call stream crosses any network that is not secure, a machine on the network may impersonate the application process, generating spoofed calls that may be treated by the local host as genuine. Imposter applets have successfully used impersonation attacks against the servers with whom the original applets communicate [16].

3 Generating Models Using Static Analysis

We start with the binary program that is submitted for execution. Before execution, we analyze the program to produce two components: a checking agent and a modified application (see Figure 3). The checking agent is a local agent that incorporates the model of the application. As the agent receives remote system calls for execution, it first verifies the authenticity of each call by operating the model. Execution continues only while the model remains in a valid state. The modified application is the original program with its binary code rewritten to improve model precision while also offering a modicum of obfuscation. The modified application executes remotely, transmitting its remote system calls to the checking agent.

Our various models are finite-state machines: non-deterministic finite automata (NFA) and push-down automata (PDA). Each edge of an automaton is labeled with an alphabet symbol, here the identity of a remote system call. The automaton has final states, or states where operation of the automaton may successfully cease. The ordered sequences of symbols on all connected sequences of edges from the entry state to a final state define the language accepted by the automaton. For a given application, the language defined by a perfect model of the application is precisely all possible sequences of remote system calls that could be generated by the program in correct execution with an arbitrary input.

Construction of the automaton modeling the application progresses in three stages:

1. A control flow graph (CFG) is built for each procedure in the binary. Each CFG represents all possible execution paths in a procedure.

Figure 3: Our static analyzer reads a binary program and produces a local checking agent and a modified application that executes remotely. The checking agent incorporates a model of the application. [Figure: the analyzer transforms the binary program into a checking agent and a modified application.]


2. We convert the collection of CFGs into a collection of local automata. Each local automaton models the possible streams of remote system calls generated in a single procedure.

3. We compose these automata at points of function calls internal to the application, producing an interprocedural automaton modeling the application as a whole.

The interprocedural automaton is the model incorporated into the checking agent.

Figure 4(a) shows an example C language program that writes a string to the standard output. The main function translates to the SPARC code in Figure 4(b) when compiled. We include the C code solely for the reader's ease; the remainder of this section demonstrates analysis of the binary code that a compiler and assembler produce from this source program.

3.1 From Binary Code to CFGs

We use a standard tool to read binary code and generate CFGs. The Executable Editing Library (EEL) provides an abstract interface to parse and edit (rewrite) SPARC binary executables [25]. EEL builds objects representing the binary under analysis, including the CFG for each procedure and a call graph representing the interprocedural calling structure of the program. Nodes of the CFG, or basic blocks, contain linear sequences of instructions, and edges between blocks represent control flow; i.e., the possible paths followed at branches. Figure 5 shows the CFG for main from Figure 4.

Figure 4: Code Example. (a) This C code writes to stdout a command line argument as text, or the string "none\n" if no argument is provided. (b) The SPARC assembly code for main. We do not show the assembly code for line or end.

(a)
    main (int argc, char **argv) {
        if (argc > 1) {
            write(1, argv[1], 10);
            line(1);
            end(1);
        } else {
            write(1, "none\n", 6);
            close(1);
        }
    }

    line (int fd) {
        write(fd, "\n", 1);
    }

    end (int fd) {
        line(fd);
        close(fd);
    }

(b)
    main:    save
             cmp   %i0, 1
             ble   L1main
             mov   1, %o0
             ld    [%i1+4], %o1
             call  write
             mov   10, %o2
             call  line
             mov   1, %o0
             call  end
             mov   1, %o0
             b     L2main
             nop
    L1main:  sethi %hi(Dnone), %o1
             or    %o1, %lo(Dnone), %o1
             call  write
             mov   6, %o2
             call  close
             mov   1, %o0
    L2main:  ret
             restore

Figure 5: Control Flow Graph for main. Control transfers in SPARC code have one delay slot. Outgoing edges of each basic block are labeled with the name of the call in the block. [Figure: the basic blocks of the assembly code above, with outgoing edges labeled write, line, end, close, or ε.]

3.2 From CFGs to Local Automata

For each procedure, we use its CFG to construct an NFA representing all possible sequences of calls the procedure can generate. This is a natural translation of the CFG into an NFA that retains the structure of the CFG and labels the outgoing edges of each basic block with the name of the function call in that block, if such a call exists. Outgoing edges of blocks without a function call are labeled ε. The automaton mirrors the points of control flow divergence and convergence in the CFG and the possible streams of calls that may arise when traversing such flow paths.
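This translation can be sketched as follows (illustrative Python; the block and edge encodings are our own assumptions, not EEL's interfaces). Each basic block becomes a state, and every outgoing CFG edge is labeled with the block's call name, or ε if the block contains no call.

```python
# Sketch of the CFG-to-NFA translation described above (data layout illustrative).

EPS = "eps"

def cfg_to_nfa(blocks, edges, entry, exits):
    """blocks: {block: call_name or None}; edges: set of (src, dst) pairs."""
    delta = {}
    for src, dst in edges:
        label = blocks[src] if blocks[src] is not None else EPS
        delta.setdefault(src, {}).setdefault(label, set()).add(dst)
    return {"states": set(blocks), "delta": delta,
            "start": entry, "final": set(exits)}

# The two paths of main from Figure 4: write;line;end or write;close.
blocks = {"entry": None, "b1": "write", "b2": "line", "b3": "end",
          "b4": "write", "b5": "close", "exit": None}
edges = {("entry", "b1"), ("entry", "b4"), ("b1", "b2"), ("b2", "b3"),
         ("b3", "exit"), ("b4", "b5"), ("b5", "exit")}
nfa = cfg_to_nfa(blocks, edges, "entry", {"exit"})
assert nfa["delta"]["entry"][EPS] == {"b1", "b4"}   # branch: no call in entry
assert nfa["delta"]["b1"]["write"] == {"b2"}        # call edge labeled write
```

The sketch corresponds to the formal construction below: states are the CFG nodes, the alphabet is the set of call names, and edges carry either a call label or ε.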

Formally, we convert each control flow graph G = ⟨V, E⟩ into an NFA A = (Q, Σ, δ, q0, F), Q being the set of states, Σ the input alphabet, δ the transition relation, q0 the unique entry state, and F the set of accepting states, where:

    Q = V
    Σ = { ID | ∃ v ∈ V such that v contains a call labeled ID }
    δ = ∪ over edges (s → t) ∈ E of { s →ID t if a call labeled ID is at s; s →ε t if no call is at s }
    q0 = v0, where v0 is the unique CFG entry
    F = { v | v is a CFG exit }

To reduce space requirements, each NFA is ε-reduced and minimized. The classical ε-reduction algorithm simultaneously determinizes the automaton, an exponential process [19]. We develop a linear-time ε-reduction algorithm, shown below, that does not determinize the automaton. The algorithm recognizes that the states in a strongly connected component made of ε-edges are reachable from one another without consuming an input symbol, and it collapses them to a single state:

1. Abstract the automaton to a directed graph.
2. Using only ε-edges, calculate the strongly connected components of the graph.
3. All states in the same strongly connected component may reach any other by a sequence of ε-transitions, so the states are collapsed together. We now have a directed acyclic graph (DAG) over the collapsed states, with the remaining ε-edges those that connect strongly connected components.
4. For all non-ε-edges e originating at a state n in the DAG, add copies of e originating from all states m such that m reaches n by a sequence of ε-edges.
5. Remove the ε-edges that connect strongly connected components.
6. Remove unreachable states and edges from the graph.

The resultant graph is the reduced automaton (Figure 6). Using standard algorithms and data structures, our ε-reduction procedure runs in linear time.

Automaton minimization recognizes equivalent states, where equivalence indicates that all sequences of symbols accepted following the states are identical. Such states are collapsed together, reducing the overall size and complexity of the automaton. An O(n log n) algorithm exists to minimize deterministic automata [18], but it is not easily abstracted to an NFA. Our prototype uses an O(n^2) version of the algorithm suitable for an NFA.

Figure 6: Local Automata. The local automata for each of the three functions given in Figure 4 after ε-reduction. [Figure: automata for main, line, and end, with edges labeled write, line, end, and close.]

Figure 7: Final NFA Model. The automaton produced following call site replacement. ε-reduction has not been performed. The dotted line represents a path not present in the original program but accepted by the model.
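The six ε-reduction steps can be sketched in Python as follows. This is an illustrative implementation, not the authors' code: for clarity it computes ε-reachability by simple search (quadratic) rather than the linear-time SCC formulation, and it also propagates final-state status along ε-edges so the reduced automaton accepts the same language, a detail the numbered steps leave implicit.

```python
# Illustrative epsilon-reduction: collapse epsilon-SCCs, copy non-epsilon
# edges back along remaining epsilon paths, drop epsilon edges, prune.

EPS = "eps"

def eps_closure(eps_succ, v):
    """All states reachable from v using only epsilon edges."""
    seen, stack = {v}, [v]
    while stack:
        for w in eps_succ[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def eps_reduce(states, edges, start, finals):
    eps_succ = {s: set() for s in states}
    for s, label, t in edges:
        if label == EPS:
            eps_succ[s].add(t)
    reach = {v: eps_closure(eps_succ, v) for v in states}
    # Steps 2-3: mutually epsilon-reachable states form one collapsed state.
    rep = {v: min(w for w in reach[v] if v in reach[w]) for v in states}
    # Step 4: copy each non-epsilon edge back to every state that
    # epsilon-reaches its source; step 5 drops all epsilon edges.
    new_edges = set()
    for s, label, t in edges:
        if label != EPS:
            for m in states:
                if any(rep[w] == rep[s] for w in reach[m]):
                    new_edges.add((rep[m], label, rep[t]))
    # A state that epsilon-reaches a final state may itself accept.
    new_finals = {rep[m] for m in states if reach[m] & set(finals)}
    # Step 6: keep only states reachable from the entry state.
    succ = {}
    for s, label, t in new_edges:
        succ.setdefault(s, set()).add(t)
    live, stack = {rep[start]}, [rep[start]]
    while stack:
        for w in succ.get(stack.pop(), ()):
            if w not in live:
                live.add(w)
                stack.append(w)
    new_edges = {(s, l, t) for s, l, t in new_edges if s in live and t in live}
    return live, new_edges, rep[start], new_finals & live

# Synthetic example: an epsilon-SCC {a, b} plus a bridging epsilon edge c->d.
states = {"a", "b", "c", "d"}
edges = {("a", EPS, "b"), ("b", EPS, "a"),
         ("b", "write", "c"), ("c", EPS, "d"), ("d", "close", "a")}
live, new_edges, start, finals = eps_reduce(states, edges, "a", {"d"})
assert ("a", "write", "c") in new_edges   # a and b collapsed into one state
assert ("c", "close", "a") in new_edges   # copied back along c --eps--> d
assert all(label != EPS for _, label, _ in new_edges)
```

The reduced automaton accepts the same call language as the original, here (write close)* write, with no ε-edges remaining.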


3.3 From Local Automata to an Interprocedural Automaton

Constructing an Interprocedural NFA. We extend the notion of a single-procedure NFA model to a model of the entire application. The local automata are composed to form one global NFA by call site replacement. We replace every edge representing a procedure call with control flow through the automaton modeling the callee, a common technique used elsewhere to construct system dependence graphs [20] and also used by Wagner and Dean in their work [36,37].

Where there was an edge representing a called function, control now flows through the model of that function. Recursion is handled just as any other function call. Call site replacement reintroduces ε-edges, so the automaton is reduced as before. Figure 7 presents the final automaton, without ε-reduction for clarity.

There is no replication of automata. Call site replacement links multiple call sites to the same procedure to the same local automaton. Every final state of the called automaton has ε-edges returning to all call sites. Impossible paths exist: control flow may enter the automaton from one call site but return on an ε-edge to another (Figure 7). Such behavior is impossible in actual program execution, but a malicious user manipulating the executing program may use such edges in the model as an exploit. In applications with thousands of procedures and thousands more call sites, such imprecision must be addressed.
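Call site replacement and the impossible-path imprecision can be sketched as follows (illustrative Python; the data layout is our own). In the example, main calls line twice and line issues one write, so every correct execution transmits exactly two writes; the merged NFA nonetheless accepts a single write by entering at the first call site and returning on the second site's ε-edge.

```python
# Sketch of NFA call site replacement, demonstrating an impossible path.

EPS = "eps"

def call_site_replacement(automata, procedures):
    """Merge local automata (entry, finals, edges) into one NFA edge set."""
    edges = set()
    for entry, finals, local_edges in automata.values():
        for s, label, t in local_edges:
            if label in procedures:
                callee_entry, callee_finals, _ = automata[label]
                edges.add((s, EPS, callee_entry))       # edge into the callee
                for f in callee_finals:
                    edges.add((f, EPS, t))              # return to the call site
            else:
                edges.add((s, label, t))
    return edges

def eps_close(edges, states):
    out, stack = set(states), list(states)
    while stack:
        v = stack.pop()
        for s, label, t in edges:
            if s == v and label == EPS and t not in out:
                out.add(t)
                stack.append(t)
    return out

def accepts(edges, start, finals, calls):
    current = eps_close(edges, {start})
    for call in calls:
        current = eps_close(edges, {t for s, label, t in edges
                                    if s in current and label == call})
    return bool(current & set(finals))

# main calls line twice; line issues a single write.
automata = {
    "main": ("m0", {"m2"}, {("m0", "line", "m1"), ("m1", "line", "m2")}),
    "line": ("l0", {"l1"}, {("l0", "write", "l1")}),
}
merged = call_site_replacement(automata, {"line"})
assert accepts(merged, "m0", {"m2"}, ["write", "write"])  # real behavior
assert accepts(merged, "m0", {"m2"}, ["write"])           # impossible path!
```

The second assertion is exactly the imprecision described above: the ε-return edge belonging to the second call site lets one write reach a final state.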

Constructing an Interprocedural PDA. Introduction of impossible paths is a classical program analysis problem arising from context-insensitive analysis (see, e.g., [28]). A push-down automaton eliminates impossible paths by additionally modeling the state of the application's run-time stack. An executing application cannot follow an impossible path because the return site location is stored on its run-time stack. A PDA is context sensitive, including a model of the stack to precisely mirror the state of the running application.

This is an interprocedural change. We construct local automata as before. The ε-edges added during call site replacement, though, now contain an identifier uniquely specifying each call edge's return state (Figure 8). Each ε-edge linking the source of a function call edge to the entry state of the called automaton pushes the return state identifier onto the PDA stack, just as the executing program pushes the return address onto the run-time stack. The ε-edges returning control flow from the callee pop the identifier from the PDA stack, mirroring the application's pop of the return address from its run-time stack. Such a pop edge is traversed only when the identifier on the edge matches the symbol at the top of the stack. The identifiers on the ε-edges define matched sets of edges. Only return edges that correspond to a particular entry edge may be traversed when exiting the called automaton. Since a PDA tracks this calling context, impossible paths cannot exist.
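The matched-edge discipline can be made concrete with a small sketch. All state names, edge labels, and identifiers here are invented for illustration, and a single deterministic walk is shown for simplicity; the real monitor tracks a nondeterministic frontier of states.

```python
# Sketch: run-time matching of PDA call/return edges. Each eps-edge into a
# callee pushes its return-state identifier; the paired return edge pops only
# when that identifier is on top of the stack, so control cannot return to a
# different call site (no impossible paths).

def traverse(edges, start, moves):
    """edges: {state: [(label, action, target)]} where action is
    ('push', id), ('pop', id), or None. moves: sequence of edge labels."""
    state, stack = start, []
    for label in moves:
        for lab, action, tgt in edges[state]:
            if lab != label:
                continue
            if action and action[0] == "pop":
                if not stack or stack[-1] != action[1]:
                    continue            # pop edge blocked: wrong calling context
                stack.pop()
            elif action and action[0] == "push":
                stack.append(action[1])
            state = tgt
            break
        else:
            return None                 # no legal edge: manipulation detected
    return state

# two call sites (A and B) into the same callee f
edges = {
    "m0": [("eps", ("push", "A"), "f0")],
    "m1": [("eps", ("push", "B"), "f0")],
    "f0": [("read", None, "f1")],
    "f1": [("eps", ("pop", "A"), "a_ret"), ("eps", ("pop", "B"), "b_ret")],
}
```

Entering from m0 can only return to a_ret, and entering from m1 only to b_ret, because the mismatched pop edge is blocked by the stack check.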

We link local automata by call site replacement:

1. Add an ε-edge from the source state of the call edge to the entry state of the called automaton.

2. Add ε-edges from every final state of the called automaton back to the destination state of the call edge.

3. Remove the original call edge.

For the PDA, we instead use modified call site replacement:

1. Uniquely mark each local automaton state that is the target of a non-system call edge.

For each non-system call edge, do steps 2, 3, and 4:

2. Add an ε-edge from the source state of the edge to the entry state of the destination automaton. Label the ε-edge with push X, where X is the identifier at the target of the call edge.

3. Add an ε-edge from each final state of the destination automaton to the target of the call edge. Label each ε-edge with pop X, where X is the identifier from step 2.

4. Delete the original call edge.

Figure 8: PDA Model. The ε-edges into and out of a called automaton are paired so that only a return edge corresponding to the edge traversed at call initiation can be followed.

Formally, let the interprocedural PDA be P = (Q, Σ, Γ, δ, q0, Z0, F), where Q is the set of states, Σ is the input alphabet, Γ is the stack alphabet, δ is the transition relation, q0 is the unique entry state, Z0 is the initial stack configuration, and F is the set of accepting states. Given local NFA models Ai = (Qi, Σi, δi, q0,i, Fi)


P Q Σ Γ δ q0 Z0 F, , , , , ,( )=

Ai Qi Σi δi q0 i, Fi, , , ,( )=


for the procedures, the PDA P for the program is given by:

Q = ∪i Qi
Σ = ∪i Σi
Γ = { ID | ID is the destination identifier of a call edge }
q0 = v0, the entry state of the initially executed automaton
Z0 = ∅
F = F0, the final states of the initially executed automaton
δ(q, a, ε) → (p, ε) if ∃ i such that q →a p ∈ δi, for a a remote call
δ(q, ε, ε) → (p, ID) if ∃ i, r such that q →a r ∈ δi, where a is a procedure call with p = q0,a and r is identified by ID
δ(q, ε, ID) → (p, ε) if ∃ i, r such that r →a p ∈ δi, where a is a procedure call with q ∈ Fa and p is identified by ID

The initially executed automaton, here denoted by A0, is that modeling the function to which the operating system first transfers control, e.g., _start or main.

Unfortunately, a PDA is not a viable model in an operational setting. In a straightforward operation of the automaton, the run-time stack may grow until it consumes all system resources. In particular, the stack size is infinite in the presence of left recursion. To counter left recursion challenges, Wagner and Dean operate the PDA with an algorithm similar to the post* algorithm used in the context of model checking of push-down systems [10]. They demonstrate the algorithm to be prohibitively expensive [36,37]. Addressing imprecision requires a more reasonable approach.

3.4 Optimizations to Address Sources of Imprecision

Imprecisions in the models arise from impossible paths, context insensitive analysis, and malicious argument manipulation. We develop several optimizations that target these particular sources of imprecision while maintaining efficiency.

3.4.1 Impossible Paths

Discarding push-down automata as not viable requires impossible paths to be readdressed. Impossible paths arise at the final states of automata that are spliced into multiple call sites. The ε-return edges introduce divergent control flow where no such divergence exists in the application. We have developed several NFA model optimizations to reduce the effect of return edges upon the paths in the model.

Dead Automata Removal. A leaf automaton is a local automaton that contains no function call edges. Any leaf automaton that contains no remote system call edges is dead: it models no control flow of interest. Any other local automaton that contains a call edge to the dead leaf may replace that call edge with an ε-edge. This continues, recursively, backward up the call chain. To eliminate impossible paths introduced by linking to a dead automaton, we insert this dependency calculation step prior to call site replacement.

Automata Inlining. Recall that in call site replacement, all calls to the same function are linked to the same local automaton. Borrowing a suitable phrase from compilers, we use automata inlining to replace each call site with a splice to a unique copy of the called automaton. Impossible paths are removed from this call site at the expense of a larger global automaton. In theory, the global automaton may actually be smaller and less dense because false edges introduced by impossible paths will not be present; however, we have generally found that the state space of the automaton does increase significantly in practice.

Single-Edge Replacement. An inlining special case, single-edge replacement is a lightweight inlining technique used when the called automaton has exactly one edge. The function call edge is simply replaced with a copy of the edge in the callee. This is inexpensive inlining, for no states nor ε-edges are added, yet the model realizes inlining gains.

Bounded Stack Model. Revisiting the idea of a PDA model, we find that both the problems of infinite left recursion and, more generally, unbounded stacks may be solved simply by limiting the maximum size of the run-time stack. For some N, we model only the top N elements of the stack; all pop edges are traversed when the stack becomes empty. The state space of the run-time automaton is now finite, requiring only finite memory resources. Correspondingly, the language accepted by the bounded-stack PDA is regular, but more closely approximates a context-free language than a regular NFA.
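A bounded stack of this kind is straightforward to sketch. The bound and the symbols below are illustrative; the key behaviors are that a push beyond the bound discards the oldest context, and that any pop edge is traversable once the modeled context is exhausted, so the model over-approximates rather than rejects valid runs.

```python
# Sketch of the bounded-stack idea: keep only the top N stack symbols.

class BoundedStack:
    def __init__(self, n):
        self.n, self.items = n, []

    def push(self, sym):
        self.items.append(sym)
        if len(self.items) > self.n:
            self.items.pop(0)          # oldest context falls off the bottom

    def can_pop(self, sym):
        # a pop edge matches its own symbol, or anything once the stack is empty
        return not self.items or self.items[-1] == sym

    def pop(self, sym):
        if not self.can_pop(sym):
            return False
        if self.items:
            self.items.pop()
        return True
```

Pushing A, B, C with N = 2 retains only B and C; after popping both, the empty stack permits any pop edge, which is the source of the over-approximation.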

Unfortunately, a bounded stack introduces a new problem at points of left recursion. Any recursion deeper than the maximum height of the stack destroys all context sensitivity: the stack first fills with only the recursive symbol; then, unwinding recursion clears the stack. All stack symbols prior to entering recursion are lost.

Hybrid Model. This recursion effect seems to be the opposite of what is desired. For many programs, recursion typically involves a minority of its functions. We



consider that it may be more precise to discard recursive symbols rather than symbols prior to entering recursion. Our hybrid model uses both NFA and PDA edges during interprocedural construction to accomplish this. Call site replacement uses simple ε-edges when the procedure call is on a recursive cycle. A stack symbol is used only when a call is not recursive. Recursion then adds no symbols to the PDA stack, leaving the previous context sensitivity intact. As in the bounded-stack PDA, the hybrid automaton defines a regular language that over-approximates the context-free grammar accepted by a true PDA.

3.4.2 Context Insensitivity

Regardless of the technique used to construct the interprocedural model, the analysis basis for all local models is context insensitive. We take all control flow paths as equally likely irrespective of the previous execution flow and do not evaluate predicates at points of divergence. This straightforward analysis leads to a degree of non-determinism in the local automata that we seek to reduce. Reducing non-determinism decreases the size of the frontier of possible current states in the automaton at run-time. There are, in turn, fewer outgoing edges from the frontier, improving efficiency and precision.

Renaming. During program analysis, every remote call site is assigned a randomly generated name. We produce a stub function with this random name that behaves as the original call and rewrite the binary program so that the randomly named function is called. That is, rather than calling a remote system call stub named, say, write, the call is to a stub named _3998. We are essentially passing all call site names through a one-time encryption function. The key is stored at the checking agent (on the submitting machine), which translates the random name back to the original call name before execution.
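The bookkeeping behind renaming amounts to two maps, one applied to the rewritten binary and one held by the checking agent. The sketch below is illustrative only: the site addresses, stub-name format, and seeded generator are invented, not taken from the paper's implementation.

```python
# Sketch of call site renaming: each remote call *site* gets a fresh random
# stub name; the checking agent keeps the inverse map as the decryption key.
import random

def rename_call_sites(call_sites, seed=0):
    """call_sites: list of (site_id, real_call). Returns the rewritten
    site -> stub-name map and the agent's stub-name -> real-call key."""
    rng = random.Random(seed)
    rewritten, key = {}, {}
    for site, real in call_sites:
        stub = "_%d" % rng.randrange(1000, 10000)
        while stub in key:                 # stub names must be unique
            stub = "_%d" % rng.randrange(1000, 10000)
        rewritten[site] = stub
        key[stub] = real                   # stored only on the local machine
    return rewritten, key

# hypothetical call sites: two distinct sites calling write, one calling close
sites = [("main+0x40", "write"), ("main+0x88", "write"), ("line+0x14", "close")]
rewritten, key = rename_call_sites(sites)
```

Note that the two write sites receive different stub names, which is what differentiates call sites and reduces non-determinism in the model.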

All call sites are thus differentiated. Two separate calls to the same function now appear as calls to different functions. The random names label edges in the automaton and serve as the input symbol at model run-time. Renaming reduces non-determinism, for the model knows precisely where the program is in execution after every received call. Comparing Figure 9 with Figure 6, we see that the automaton for main becomes fully deterministic with renamed call sites.

This is an alphabet change, moving from symbols indicating call names to the potentially larger set of symbols defining individual call sites. An attacker must specify attacks given this randomly generated alphabet, thus requiring analysis to recover the transformations. Further, only remote calls that are actually used in the program may be used in an attack. Renamed calls are generated from call sites, blocking from use any unused remote call stub still linked into the application.

Call site renaming produces equivalent but less human-readable program text, acting as a simplistic obfuscation technique [8]. The checking agent maintains the transformations; recovery by a malicious individual requires program analysis to indicate likely remote call names given the context of the call in the program. Since we can rewrite the binary code, further obfuscation techniques are applicable: arguments may be reordered and mixed with dummy arguments on a per-call-site basis, for example. More general methods to obscure control flow are similarly possible, although we have not pursued such techniques.

A recent paper by Barak et al. presents a complexity-theoretic argument that proves the impossibility of a specific class of obfuscating transformations [5]. They define an obfuscated program as a "virtual black box," i.e., any property that can be computed by analyzing the obfuscated program can also be computed from the input-output behavior of the program. In contrast to their work, we require that it is computationally hard for an adversary to recover the original system calls corresponding to the renamed calls, i.e., it is computationally hard for the adversary to invert the renaming function. Hence, our obfuscation requirement is much weaker than the "virtual black box" requirement imposed by Barak et al. However, we are not claiming theoretical guarantees of the strength of our obfuscation transformation but merely observing that the theoretical results presented by Barak et al. do not directly apply in our context.

Null calls. Insertion of null calls, dummy remote system calls that translate to null operations at the checking agent, provides similar effects. We place the calls within the application so that each provides execution context to the checking agent, again reducing non-determinism.

Figure 9: The automaton for main after call site renaming. Edges labeled with function calls internal to the application are not renamed, as these edges are splice points for call site replacement.

For example, null calls may be placed immediately following each call site of a frequently called function. Recall that we introduce impossible paths during call site replacement, and specifically where we link the final states of a local automaton to the function call return states. Inserting the null calls at the function call return sites distinguishes the return locations. Only the true return path will be followed because only the symbol corresponding to the null call at the true return site will be transmitted. The other impossible paths exiting from the called automaton are broken.

There is a run-time element to renaming and null call insertion. While reducing non-determinism, the possible paths through the automaton remain unchanged (although they are labeled differently). To an attacker with knowledge of the transformations, the available attacks in a transformed automaton are equivalent to those in the original, provided the attacker takes control of the call stream before any remote calls occur. An attacker who assumes control after one or more remote calls will be restricted because operation of the model to that point will have been more precise.

3.4.3 Argument Manipulation

A remote system call exists within a calling context that influences the degree of manipulation available to a malicious process. For example, at a call site to open, a malicious process could alter the name of the file passed as the first argument to the call. A model that checks only the names of calls in the call stream would accept the open call as valid even though it has been maliciously altered. The context of the open call, however, may present additional evidence to the checking agent that enables such argument modifications to be detected or prevented.

Argument Recovery. As local automata are constructed, we recover all statically determined arguments by backward slicing on the SPARC argument registers. In backward register slicing, we iterate through the previous instructions that affect a given register value [34]. Essentially, we are finding the instructions that comprise an expression tree. We simulate the instructions in software to recover the result, used here as an argument to a call. We successfully recover numeric arguments known statically and strings resident in the data space of the application. The checking agent stores all recovered arguments so that they are unavailable for manipulation.

In Figure 10, the backward slice of register %o1 at the point of the second call to write in function main iterates through the two instructions that affect the value of %o1. Only the emphasized instructions are inspected; instructions that do not affect the value of %o1 are ignored. In this case, Dnone is a static memory location indicating where in the data space the string for "none\n" resides. We recover the string by first simulating the instructions sethi and or in software to compute the memory address and then reading the string from the data space.
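The two-instruction simulation can be sketched as follows. The address chosen for Dnone, the pretend data space, and the decoded instruction tuples are all invented for illustration; real slicing operates on decoded SPARC instructions.

```python
# Sketch: recover a string argument by simulating the backward slice on %o1.

DATA = {0x20A34: b"none\n\x00"}   # pretend data space: the string at Dnone

def recover_o1(slice_insns):
    """slice_insns: the backward slice on %o1, oldest instruction first."""
    o1 = 0
    for op, imm in slice_insns:
        if op == "sethi":          # sethi %hi(Dnone), %o1 : sets bits 31..10
            o1 = imm & ~0x3FF
        elif op == "or":           # or %o1, %lo(Dnone), %o1 : fills bits 9..0
            o1 |= imm & 0x3FF
    return o1

addr = recover_o1([("sethi", 0x20A34), ("or", 0x20A34)])
raw = DATA[addr]
string = raw[: raw.index(0)].decode()   # read the NUL-terminated string
```

The sethi/or pair is the standard SPARC idiom for materializing a 32-bit constant, which is why simulating just these two instructions reconstructs the full address.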

A similar analysis is used to determine possible targets of indirect calls. Every indirect call site is linked to every function in the program that has its address taken. We identify such functions by slicing backward on the register written at every program point to determine if the value written is an entry address. Our register slicing is intraprocedural, making this a reasonable computation.

3.5 Unresolved Issues

Dynamic Linking. A dynamically linked application loads shared object code available on the remote machine into its own address space. Although this code is non-local, we can fairly assume that the remote machine provides standard libraries to ensure correct execution of remote jobs. Analysis of the local standard libraries would then provide accurate models of dynamically linked functions.

Although straightforward, we have not yet implemented support for dynamically linked applications. Some libraries on Solaris 8, such as libnsl.so, use indirect calls extensively. As we improve our handling of indirect calls, we expect to handle these applications.

Signal Handling. During execution, receipt of a signal will cause control flow to jump in and out of a signal handler regardless of the previous execution state. The entry and exit is undetectable to the checking agent save the alarms it may generate. As we already instrument the binary, we expect to insert null calls at the entry and exit points of all signal handlers to act as out-of-band notifications of signal handler activity. These instrumentations have not yet been implemented.

Multithreading. Both kernel and user level thread swaps are invisible to the checking agent; thread swaps will likely cause the run-time model to fail, and this remains an area for future research. User level thread scheduling would allow instrumentation of the scheduling routines so that the checking agent could swap to the corresponding model for the thread. A kernel scheduling

Figure 10: Register Slicing. We iterate backwards through the instructions that modify register %o1 prior to the call site.

sethi %hi(Dnone), %o1

call write

or %o1, %lo(Dnone), %o1


monitor would require kernel modifications and is currently not under consideration.

Interpreted Languages. Programs written in languages such as SML [3] and Java are compiled into an intermediate form rather than to native binary code. To execute the program, a native-code run-time interpreter reads this intermediate representation as data and executes specific binary code segments based upon this input. Binary code analysis will build a model of the interpreter that accepts all sequences of remote calls that could be generated by any compiled program. A precise model for a specific application can be built either with knowledge of the intermediate representation and the way it is interpreted by the run-time component or by partial evaluation of the interpreter [21]. However, if the program is compiled into native code before execution, as is common in many Java virtual machine implementations [33], our techniques could again be used to construct program-specific models of execution.

4 Experimental Results

We evaluate our techniques using two criteria: precision and efficiency. A precise model is one that incorporates all sequences of calls that may be generated by an application but few or no sequences that cannot. An efficient model is one that adds only a small run-time overhead. Only efficient models will be deployed, and only precise models are of security interest.

This section looks first at a prototype tool we used to evaluate our techniques and models. We examine metrics that measure precision and propose a method to identify unsafe states in an automaton. Our tests show that although null call insertion markedly improves the precision of our models, care must be used so that the additional calls do not overwhelm the network. We finally examine optimizations, including renaming, argument recovery, and stack abstractions, that improve the quality of our models.

4.1 Experimental Setup

We implemented an analyzer and a run-time monitor for a simulated remote execution environment to test the precision and efficiency of our automaton models. The analyzer examines the submitted binary program and outputs an automaton and a modified binary. The automaton is read and operated by a stand-alone process, the monitor, that acts as the checking local agent, communicating with the modified program using message-passing inter-process communication. The monitor is not an RPC server and only verifies that the system call encountered by the program is accepted by the model. If the monitor successfully updates the automaton, the

original system call proceeds in the rewritten application.

Our analyzer and simulated execution environment run on a Sun Ultra 10 440 MHz workstation with 640 Mb of RAM running Solaris 8. To simulate a wide-area network, we add a delay per received remote system call equivalent to the round trip time between a computer in Madison, Wisconsin and a computer in Bologna, Italy (127 ms). We do not include a delay for data transfer, for we do not statically know what volume of data will be transferred. Null calls require no reply, so the delay added per null call is the average time to call send with a 20 byte buffer argument (13 µs). During evaluation, the collection of Solaris libc kernel trap wrapper functions defines our set of remote system calls.

We present the analysis results for six test programs (see Table 1 for program descriptions and workloads and Table 2 for statistics). All workloads used default program options; we specified no command line switches.

As we have not implemented general support for dynamically linked functions, we statically link all programs. However, several network libraries, such as libresolv.so, can only be dynamically linked on Solaris machines. We analyze these libraries using the same techniques as for an application program, but store the generated automata for later use. When our analysis of a program such as procmail or finger reveals a call to a dynamically linked function, we read in the stored local automaton and continue. We currently ignore the indirect calls in dynamically linked library functions unless the monitor generates an error at run-time at the indirect call location.

4.2 Metrics to Measure Precision and Efficiency

We wish to analyze both the precision of our models and the efficiency with which the monitor may operate them. Precision dictates the degree to which an adversary is limited in their attacks, and thus the usefulness of the model as a counter-measure. Efficient operation is a requirement for deployment in real remote execution environments.

For comparison, we measure automaton precision using Wagner and Dean's dynamic average branching factor metric [36,37]. This metric first partitions the system calls into two sets, dangerous and safe. Then, during application execution and model operation, the number of dangerous calls that would next be accepted by the model is counted following each operation. The total count is averaged over the number of operations on the model. Smaller numbers are favorable and indicate that an adversary has a small opportunity for exploit.
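The metric just described can be sketched over a recorded run. The automaton, the call trace, and the choice of dangerous calls below are invented for illustration, and the model is assumed ε-free for simplicity.

```python
# Sketch of the dynamic average branching factor: after each model operation,
# count the dangerous calls the model would accept next, then average.

DANGEROUS = {"unlink", "exec", "chmod"}   # illustrative partition

def dynamic_abf(edges, start, trace):
    """edges: {state: [(call, next_state)]}. trace: calls actually made."""
    frontier = {start}
    total = operations = 0
    for call in trace:
        frontier = {t for s in frontier for c, t in edges[s] if c == call}
        assert frontier, "call rejected: model violation"
        accepted_next = {c for s in frontier for c, _ in edges[s]}
        total += len(accepted_next & DANGEROUS)
        operations += 1
    return total / operations
```

For a model that accepts read then either write or unlink, a run of read followed by write scores 1 after the read (unlink is acceptable next) and 0 after the write, for an average of 0.5.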


Our efficiency measurements are straightforward. Using the UNIX utility time, we measure each application's execution time in the simulated environment without operating any model. This is a baseline measure indicating delay due to simulated network transit overhead, equivalent to a remote execution environment's run-time conditions. We then turn on checking and various optimizations to measure the overhead introduced by our checking agent. We find the NFA model efficient to operate but the bounded PDA disappointingly slow. However, the extra precision gained from inclusion of null calls into the bounded PDA model dramatically improves efficiency.

4.3 The NFA Model

We evaluate the models of the six test programs with respect to precision and efficiency. Our baseline analyzer includes renaming, argument recovery, dead automaton removal, and single-edge replacement. Using the NFA model, we compare the results of several null call placement strategies against this baseline and consider the trade-off between performance and efficiency due to the null call insertion.

We use four different null call placement strategies. First, no calls are inserted. Second, calls are inserted at the entry point of every function with a fan-in of 10 or more, that is, the functions called by 10 or more other functions in the application. Third, we insert calls at the entry point of every function with a fan-in of 5 or greater. Fourth, we instrument functions with a fan-in of 2 or more. We have tried three other placement strategies but found they occasionally introduced a glut of null calls that would overwhelm the network: adding calls to all functions on recursive cycles; to all functions

entropy: Calculates the conditional probabilities of packet header fields from tcpdump data. Workload: compute one conditional probability from 100,000 data records.

random1: Generates a randomized sequence of numbers from three seed values. Workload: randomize the numbers 1-999.

gzip: Compresses and decompresses files. Workload: compress a single 13 Mb text file.

GNU finger: Displays information about the users of a computer. Workload: display information for three users, "bart," "jha," and "giffin."

finger: Displays information about the users of a computer. Workload: display information for three users, "bart," "jha," and "giffin."

procmail: Processes incoming mail messages. Workload: process a single incoming message.

Table 1: Test program descriptions and test workloads.

Program      Source Language   Lines of Code (Source)   Compiler   Functions (Binary)   Instructions (Binary)
entropy      C                 1,047                    gcc        868                  58,141
random1      Fortran           172                      f90        1,232                133,632
gzip         C                 8,163                    gcc        883                  56,686
GNU finger   C                 9,504                    cc         1,469                95,534
finger       C                 2,456                    gcc        1,370                90,486
procmail     C                 10,717                   cc         1,551                107,167

Table 2: Test program statistics. Source code line counts do not include library code. Statistics for the binary programs include code in statically linked libraries.

Program      No model   No null calls    Null calls, fan-in 10   Null calls, fan-in 5   Null calls, fan-in 2
entropy      208.33     208.48 (0.1%)    208.50 (0.1%)           208.41 (0.0%)          287.27 (37.9%)
gzip         81.49      81.61 (0.1%)     82.16 (0.8%)            82.26 (0.9%)           675.47 (728.9%)
random1      9.68       9.69 (0.1%)      10.80 (11.5%)           10.92 (12.8%)          10.68 (10.4%)
GNU finger   55.22      55.30 (0.1%)     55.46 (0.4%)            56.23 (1.8%)           55.50 (0.5%)
finger       30.23      30.25 (0.1%)     30.28 (0.2%)            32.59 (7.8%)           33.72 (11.5%)
procmail     20.90      21.00 (0.5%)     21.04 (0.7%)            21.08 (0.9%)           21.00 (0.5%)

Table 3: NFA run-time overheads. Absolute overheads indicate execution time in seconds; parenthesized values are the percentage increase over the no-model baseline.


whose modeling automaton's entry state is also a final state, ensuring every call chain generates at least one symbol; and to functions with fan-in of 1 or more. As we expected, this sequence of greater instrumentation increases the precision and quality of the automata while negatively impacting performance as the extra calls require additional network activity. More generally, the problem of selecting good null call insertion points is similar to that of selecting optimal locations to insert functions for program tracing, a topic of previous research [4]. We will investigate the use of such selection algorithms for our future implementations.
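The fan-in computation behind the placement strategies above is simple to sketch; the call graph here is invented for illustration.

```python
# Sketch: select null call insertion points by a fan-in threshold.

def fanin_points(call_graph, threshold):
    """call_graph: {caller: set of callees}. Returns the functions whose
    fan-in (number of distinct calling functions) is >= threshold."""
    fan_in = {}
    for callees in call_graph.values():
        for callee in callees:
            fan_in[callee] = fan_in.get(callee, 0) + 1
    return {f for f, n in fan_in.items() if n >= threshold}

# hypothetical call graph: log is called from three functions, parse from two
cg = {"main": {"parse", "log"}, "parse": {"log"}, "emit": {"log", "parse"}}
```

Raising the threshold shrinks the instrumented set, trading precision for bandwidth, which is the balance examined in Tables 3 and 4.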

We found that null call insertion dramatically improved precision. Figure 11 shows the dynamic average branching factor for the six test programs at each of the null call placement strategies. Instrumenting at the maximum level improves precision over non-instrumented models by an order of magnitude or more. Even though null call insertion adds edges to the local automata, we observe that the number of edges in the final automaton is usually significantly lower, indicating that call site replacement introduces fewer impossible paths. The edge count in procmail's model drops by an order of magnitude even though the state count increases modestly. We believe these results demonstrate the great potential of introducing null calls.

Although unfortunate, null call insertion has the expected detrimental effect on application run-times. Each null call encountered during execution drops another call onto the network for relay to the checking agent. The application need not wait for a response, but each call is still an expensive kernel trap and adds to network traffic. Table 3 shows the additional execution

time resulting from operation of models with null calls. Table 4 lists the bandwidth requirements of each insertion level for null calls that each consume 100 bytes of bandwidth.

We make two primary observations from these results. First, our NFA model is incredibly efficient to operate at run-time when no null calls have been inserted. Second, inserting null calls in functions with fan-in 5 or greater is a good balance between precision gain and additional overhead in our six test programs. Unfortunately, two programs require moderate bandwidth at this instrumentation level. We believe the varying bandwidth needs among our test programs are due in part to our naive null call insertion strategies. We expect that an algorithm such as that developed by Ball and Larus [4] will reduce bandwidth requirements and improve consistency among a collection of programs.

4.4 Effects of Optimizations

We analyzed procmail further to evaluate renaming, argument recovery, and our stack abstractions. We did not analyze automaton inlining here, for it surprisingly proved to be an inefficient optimization. Inlining added significant overhead to model construction but delivered little gain in precision. Similarly, we found the run-time characteristics of the hybrid model to be nearly identical to those of the bounded PDA. We will not examine inlining or the hybrid model in any greater detail.

To see the effects of renaming and argument recovery, we selectively turned off these optimizations. The graph in Figure 12 measures average branching factor dependent on use of call site renaming and of argument recovery in the program procmail. As we expected, both renaming and argument recovery reduced the imprecision in the model. The reduction produced by renaming is solely due to the reduction in non-determinism. Argument recovery reduces imprecision by removing arguments from manipulation by an adversary. Renaming and argument recovery together reduce imprecision more than either optimization alone.

Figure 11: NFA precision. Models included all baseline optimizations.

[Bar chart: dynamic average branching factor for entropy, gzip, random1, GNU finger, finger, and procmail under the four null call placement strategies (no null calls; fan-in >= 10; fan-in >= 5; fan-in >= 2). Values below 0.001 are marked with an asterisk.]

Program      Null calls, fan-in 10   Null calls, fan-in 5   Null calls, fan-in 2
entropy      0.0                     0.0                    1198.3
gzip         3.9                     9.3                    4350.5
random1      223.9                   296.6                  314.8
GNU finger   0.9                     8.3                    12.9
finger       0.8                     144.0                  270.9
procmail     4.1                     12.6                   17.7

Table 4: Null call bandwidth requirements, in Kbps. The programs used NFA models with baseline optimizations.


At first glance, it may seem counter-intuitive that argument recovery should reduce imprecision to a greater degree than renaming. Argument recovery is, after all, a subset of renaming; static arguments distinguish the call site. However, an attacker cannot manipulate a recovered argument, so system calls that were dangerous with unknown arguments become of no threat with argument recovery.

We analyzed the bounded PDA model for procmail with stack bounds from 0 to 10. Figure 13 shows the average branching factors of our PDA at varying levels of null call instrumentation and bounded stack depth. Figures 14 and 15 show the run-time overheads of these models at two different time scales.

Null call insertion has a surprising effect on operation of the bounded stack models. The added precision of the null calls actually decreases run-time overheads. We were surprised to discover cases where the bounded-stack PDA with null call instrumentation was nearly as efficient to operate as an NFA model, but at a higher level of precision. Observe that higher levels of null call instrumentation actually reduce the execution times, as operation of the models becomes more precise.

Increasing the stack size produces a similar effect. The plots for instrumentation in functions with fan-in of 5 in Figure 14 and in functions with fan-in of 10 in Figure 15 show a common pattern. Up until a stack bound of size 6, the model's efficiency improves. More execution context is retained, so fewer paths in the model are possible. As the state grows past a bound of 6, the cost of increased state begins to dominate. Finding

this transition value is an important area for future research.

4.5 Discussion on Metrics

Measuring precision with the dynamic average branching factor metric ignores several important considerations:

1. An attack likely consists of a sequence of system calls and is not a single call in isolation. A call may be dangerous only when combined with other calls.

2. The attacker could steer execution through one or more "safe" system calls to reach a portion of the model that accepts an attack sequence. Perhaps a typical run of the program does not reach this area of the model, so the dangerous edges do not appear in the dynamic average branching factor. Such safe edges should not cover the potential for an attack downstream in the remote call sequence.

We do not see any obvious dynamic metric that easily overcomes these objections. The straightforward static analogue to dynamic average branching factor is static average branching factor, the same count averaged over the entire automaton with all states weighted equally. The prior complaints remain unsatisfied.
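For concreteness, the two branching-factor metrics can be sketched over a toy model. The dictionary encoding and deterministic edges are assumptions for illustration, not the paper's data structures:

```python
# Illustrative computation of static vs. dynamic average branching factor.
# model: {state: {call: next_state}}; states with no outgoing calls should
# still appear with an empty dict so the static average counts them.

def branching(model, state):
    """Number of distinct system calls enabled in `state`."""
    return len(model.get(state, {}))

def static_average(model):
    """Branching factor averaged over all states, weighted equally."""
    return sum(branching(model, s) for s in model) / len(model)

def dynamic_average(model, trace, start):
    """Branching factor averaged over the states actually visited
    while the monitor processes a run-time call trace."""
    state, total = start, 0
    for call in trace:
        total += branching(model, state)
        state = model[state][call]   # follow the edge for this call
    return total / len(trace)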

We propose a metric that combines static and dynamic measurements. Our average adversarial opportunity metric requires two stages of computation: first, the automaton modeling the application is composed with a set of attack automata to identify all model states with attack potential; then, the monitor maintains a count of the dangerous states encountered during run-time. "Attack potential" indicates a known attack is possible beginning at the current state or at a state reachable from it. We are locating those positions in the model where an adversary could successfully insert an attack and counting visits to those states at run-time.
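A minimal sketch of the two stages follows. As a stand-in for the paper's composition with attack automata, the states that directly accept an attack are given as a set; the static stage then marks every state from which one of them is reachable, and the dynamic stage counts visits to marked states:

```python
# Hedged sketch of the average adversarial opportunity metric. The
# attack-automaton composition is replaced by a given `attack_states`
# set; everything else is illustrative, not the paper's implementation.
from collections import deque

def mark_attack_potential(edges, attack_states):
    """edges: {state: {call: next_state}}. Returns every state from
    which an attack state is reachable (attack states included)."""
    rev = {}                                  # reverse adjacency
    for s, out in edges.items():
        for nxt in out.values():
            rev.setdefault(nxt, set()).add(s)
    marked, work = set(attack_states), deque(attack_states)
    while work:                               # backward BFS
        s = work.popleft()
        for pred in rev.get(s, ()):
            if pred not in marked:
                marked.add(pred)
                work.append(pred)
    return marked

def adversarial_opportunity(edges, start, trace, attack_states):
    """Fraction of run-time state visits that carry attack potential."""
    marked = mark_attack_potential(edges, attack_states)
    state, hits = start, 0
    for call in trace:
        hits += state in marked
        state = edges[state][call]
    return hits / len(trace)
```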

5 Comparison with Existing Work

We measured dynamic average branching factor and execution overhead for comparison with the earlier work of Wagner and Dean. We compare only the NFA model, as it is the only model our work has in common with their own. They analyzed four programs; two of them, procmail and finger, intersect our own experimental set. Although we do not know what version of finger Wagner and Dean used, we compared their numbers against our analysis of GNU finger. We used call site renaming, argument recovery, and single-edge replacement. The results for Wagner and Dean include argument recovery (they have no analogue to renaming or edge replacement). On the two programs, we observed a significant discrepancy between their reported precision values and those we could generate. Upon investigation, it appears

Figure 12: Precision improvements with renamed call sites and argument recovery.

[Figure 12 chart: NFA Precision - Effects of Optimizations (procmail). Average branching factor (0-14) for call-site treatments None, Rename, Arg Capture, and Both; series: no null calls, and null calls in functions with fan-in >= 10, >= 5, and >= 2.]

Figure 13: Effect of stack depth and null call insertion upon PDA precision. Baseline optimizations were used.

Figure 14: Effect of stack depth and null call insertion upon PDA run-time overhead, 7 second time scale. Baseline optimizations were used. This time scale shows trends for null call insertion for fan-ins of 5 and 2.

Figure 15: Effect of stack depth and null call insertion upon PDA run-time overhead, 700 second time scale. The source data is identical to that of Figure 14. This time scale shows trends for no null call insertion and insertion for fan-in of 10.

[Figure 13 chart: PDA Precision - Effect of Stack Depth (procmail). Average branching factor (0-12) vs. stack bound (0-10), for no null calls and null calls in functions with fan-in >= 10, >= 5, and >= 2.]

[Figure 14 chart: PDA Overhead - Effect of Stack Depth - 7 sec Time Scale (procmail). Overhead in seconds (0-7) vs. stack bound (0-10), same four series; some values exceed 12 sec.]

[Figure 15 chart: PDA Overhead - Effect of Stack Depth - 700 sec Time Scale (procmail). Overhead in seconds (0-700) vs. stack bound (0-10), same four series; maximum 756 sec.]

to be caused by the differences in library code between our respective test platforms. Wagner and Dean analyzed programs compiled on Red Hat Linux, but we use Solaris 8. Solaris is an older operating system and includes more extensive library code in its standard libraries. Solaris libc, for example, is structured differently than glibc on Linux and includes functionality not found in glibc. To see the differences, compare Figure 17, the automaton for the socket system call in glibc, with Figure 16, the automaton for the same function in Solaris libc. In this case, the Solaris socket function includes code maintaining backwards compatibility with an earlier method of resolving the device path for a networking protocol. While socket has the greatest difference of the functions we have inspected, we have found numerous other library functions with a similar characteristic. Simply, Linux and Solaris have different library code and we have found the Solaris code to be the more complex.

To better understand the influence of this different library code base, we identified several functions in Solaris libc that differed significantly from the equivalent function in glibc. We instrumented the code of the identified functions so that each generates a remote system call event in a manner similar to glibc. As we expected, the average branching factor of each model dropped significantly (Figure 18). Because we intentionally instrument the library functions incorrectly, the model generated is semantically invalid. However, we believe the change in precision values reinforces our hypothesis.

Figure 16: The socket model in Solaris libc.

Figure 17: The socket model in Linux glibc.

[Figure 17 diagram: Entry -- socket --> Exit.]

Figure 18: Comparison of our baseline NFA models with the prior results of Wagner and Dean.

[Figure 18 charts: Precision (average branching factor, 0-12) for finger and procmail under Full Solaris libc, glibc Emulation, and Wagner and Dean; Efficiency (overhead in seconds, 0-12) for Our NFA Model and Wagner and Dean. Values marked * are <= 0.01.]


Our model operation improves significantly over the work of Wagner and Dean. Figure 18 also shows overheads in each of the two programs attributed to model operation. Our gain is partly due to implementation: Wagner and Dean wrote their monitor in Java. Our code runs natively and is highly efficient, introducing only negligible delay.

6 Related Work

There are three areas with techniques and goals similar to those considered in this paper: applications of static analysis to intrusion detection, statistical anomaly-detection-based intrusion detection, and secure agentry. We compare the techniques presented in this paper with the existing research in the three areas.

Our work applies and extends the techniques described by Wagner and Dean [36,37]. To our knowledge, they were the first to propose the use of static analysis for intrusion detection. However, they analyzed C source code by modifying a compiler and linker to construct application models. Our analysis is performed on binaries, independent of any source language or compiler, removing the user's burden to supply their source code. We also propose several optimizations and program transformations that improve model precision and efficiency. We believe the optimizations proposed in this paper are important contributions and can be used by other researchers working in this area.

There is a vast body of work applying dynamic analysis to intrusion detection. In statistical anomaly-detection-based intrusion detection systems such as IDES [9], a statistical model of normal behavior is constructed from a collection of dynamic traces of the program. For example, a sequence of system calls, such as that produced by the utilities strace and truss, can be used to generate a statistical model of the program (see Forrest et al. [12]). Behaviors that deviate from the statistical model are flagged as anomalous but are not a guarantee of manipulation. Theoretically, we could use a statistical program model in our checking agent. Practically, however, these models suffer from high false alarm rates; i.e., they reject sequences of system calls that represent acceptable but infrequent program behavior. Human inspection of jobs flagged as anomalous is inappropriate in our setting, so we did not pursue this approach.
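The Forrest et al. style of model referenced here can be sketched as a sliding-window profile; the window length and call names below are illustrative, not taken from their work:

```python
# Sketch of a sliding-window system-call profile: "normal" behavior is
# the set of length-k subsequences observed in training traces, and a
# monitored trace is anomalous if it contains an unseen window.
K = 3

def train(traces, k=K):
    """Collect every length-k window from the training traces."""
    normal = set()
    for trace in traces:
        for i in range(len(trace) - k + 1):
            normal.add(tuple(trace[i:i + k]))
    return normal

def anomalies(trace, normal, k=K):
    """Return the windows of `trace` absent from the normal profile."""
    return [tuple(trace[i:i + k])
            for i in range(len(trace) - k + 1)
            if tuple(trace[i:i + k]) not in normal]
```

A legal but never-trained window is flagged just the same as a manipulated one, which is exactly the false-alarm behavior that makes this approach unattractive in an unattended remote-execution setting.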

The literature on safe execution of mobile agents on malicious hosts (also known as secure agentry) is vast. The reader is referred to the excellent summary of various techniques in the area of secure agentry by Schneider [31]. We are currently exploring whether techniques from this area, such as replication, are useful in our setting.

7 Future Work

We continue progressing on a number of fronts. Foremost, we are working to expand our infrastructure base of static analysis techniques to include points-to analysis for binaries and regular expression construction for arguments. Standard points-to analysis algorithms are designed for a higher-level source language and often rely on datatype properties evident from the syntax of the code. We will adapt the algorithms to the weakly-typed SPARC code. For arguments, we envision using stronger slicing techniques to build regular expressions for arguments not statically determined. Better code analyses will produce more precise models.

We have two research areas targeting run-time overhead reductions in our complex models. To reduce the impact of null call insertions, we will investigate adaptations of the Ball and Larus algorithm to identify optimal code instrumentation points for minimum-cost code profiling [4]. To reduce the overhead of our PDA models, we will collapse all run-time values at the same automaton state into a single value with a DAG representing all stack configurations. When traversing outgoing edges, a single update to the DAG is equivalent to an individual update to each previous stack. Our hope is to make our complex and precise models attractive for real environments.
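As a speculative sketch of that DAG representation (our reading of the idea, not the authors' implementation): each stack is a node pointing at the rest of the stack, so a push that applies to every stack alive at a state becomes one new node whose parents are the previous stack heads, rather than a copy of each stack.

```python
# Speculative sketch: stacks shared as a DAG. Node and function names
# are hypothetical; the point is that one update stands for a push or
# pop on every stack at an automaton state.
class StackNode:
    __slots__ = ("symbol", "parents")
    def __init__(self, symbol, parents):
        self.symbol = symbol      # top-of-stack return address
        self.parents = parents    # list of StackNode, or [None] = empty stack

def push_all(heads, symbol):
    """One DAG update standing for a push on every stack in `heads`."""
    return [StackNode(symbol, heads)]

def pop_all(heads, symbol):
    """Pop `symbol`; surviving heads are the parents of matching tops."""
    out = []
    for h in heads:
        if h is not None and h.symbol == symbol:
            out.extend(h.parents)
    return out
```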

We will add general support for dynamically linked applications and signal handlers to our analysis engine, enabling analysis of larger test programs.

To better measure the attack opportunities afforded by our models, we will implement the average adversarial opportunity metric and create a collection of attack automata. Having an accurate measure of the danger inherent in an automaton better enables us to develop strategies to mitigate the possible harm.

Acknowledgments

We thank David Wagner for patiently answering questions about his work and for providing his specifications of dangerous system calls. David Melski pointed out the relevance of the Ball and Larus research [4]. We had many insightful discussions with Tom Reps regarding static analysis. Hong Lin initially researched solutions to the remote code manipulation vulnerability. Glenn Ammons provided helpful support for EEL. We thank the other members of the WiSA security group at Wisconsin for their valuable feedback and suggestions. Lastly, we thank the anonymous referees for their useful comments.


Availability

Our research tool remains in development and we are not distributing it at this time. Contact Jonathon Giffin, [email protected], for updates to this status.

References

[1] A.D. Alexandrov, M. Ibel, K.E. Schauser, and C.J. Scheiman, "SuperWeb: Towards a Global Web-Based Parallel Computing Infrastructure", 11th IEEE International Parallel Processing Symposium, Geneva, Switzerland, April 1997.

[2] K. Anstreicher, N. Brixius, J.-P. Goux, and J. Linderoth, "Solving Large Quadratic Assignment Problems on Computational Grids", 17th International Symposium on Mathematical Programming, Atlanta, Georgia, August 2000.

[3] A.W. Appel and D.B. MacQueen, "Standard ML of New Jersey", Third International Symposium on Programming Language Implementation and Logic Programming, Passau, Germany, August 1991. Also appears in J. Maluszynski and M. Wirsing, eds., Programming Language Implementation and Logic Programming, Lecture Notes in Computer Science #528, pp. 1-13, Springer-Verlag, New York (1991).

[4] T. Ball and J.R. Larus, "Optimally Profiling and Tracing Programs", ACM Transactions on Programming Languages and Systems 16, 3, pp. 1319-1360, July 1994.

[5] B. Barak, O. Goldreich, R. Impagliazzo, S. Rudich, A. Sahai, S. Vadhan, and K. Yang, "On the (Im)possibility of Obfuscating Programs", 21st Annual International Cryptology Conference, Santa Barbara, California, August 2001. Also appears in J. Kilian, ed., Advances in Cryptology - CRYPTO 2001, Lecture Notes in Computer Science #2139, pp. 1-18, Springer-Verlag, New York (2001).

[6] E. Belani, A. Vahdat, T. Anderson, and M. Dahlin, "The CRISIS Wide Area Security Architecture", Seventh USENIX Security Symposium, San Antonio, Texas, January 1998.

[7] S. Chow, Y. Gu, H. Johnson, and V.A. Zakharov, "An Approach to the Obfuscation of Control-Flow of Sequential Computer Programs", Information Security Conference '01, Malaga, Spain, October 2001.

[8] C. Collberg, C. Thomborson, and D. Low, "Breaking Abstractions and Unstructuring Data Structures", IEEE International Conference on Computer Languages, Chicago, Illinois, May 1998.

[9] D.E. Denning and P.G. Neumann, Requirements and Model for IDES - A Real-Time Intrusion Detection System, Technical Report, SRI International, August 1985.

[10] J. Esparza, D. Hansel, P. Rossmanith, and S. Schwoon, "Efficient Algorithms for Model Checking Pushdown Systems", 12th Conference on Computer Aided Verification, Chicago, Illinois, July 2000. Also appears in E.A. Emerson and A.P. Sistla, eds., Computer Aided Verification, Lecture Notes in Computer Science #1855, pp. 232-247, Springer-Verlag, New York (2000).

[11] G.E. Fagg, K. Moore, and J.J. Dongarra, "Scalable Networked Information Processing Environment (SNIPE)", Supercomputing '97, San Jose, California, November 1997.

[12] S. Forrest, S.A. Hofmeyr, A. Somayaji, and T.A. Longstaff, "A Sense of Self for Unix Processes", 1996 IEEE Symposium on Research in Security and Privacy, Oakland, California, May 1996.

[13] I. Foster and C. Kesselman, "Globus: A Metacomputing Infrastructure Toolkit", The International Journal of Supercomputer Applications and High Performance Computing 11, 2, pp. 115-129, Summer 1997.

[14] I. Foster and C. Kesselman, eds., The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, San Francisco (1998).

[15] A.K. Ghosh, A. Schwartzbard, and M. Schatz, "Learning Program Behavior Profiles for Intrusion Detection", 1st USENIX Workshop on Intrusion Detection and Network Monitoring, Santa Clara, California, April 1999.

[16] J.T. Giffin and H. Lin, "Exploiting Trusted Applet-Server Communication", Unpublished Manuscript, 2001. Available at http://www.cs.wisc.edu/~giffin/.

[17] F. Hohl, "A Model of Attacks of Malicious Hosts Against Mobile Agents", 4th ECOOP Workshop on Mobile Object Systems: Secure Internet Computation, Brussels, Belgium, July 1998.

[18] J. Hopcroft, "An n log n Algorithm for Minimizing States in a Finite Automaton", Theory of Machines and Computations, pp. 189-196, Academic Press, New York (1971).

[19] J.E. Hopcroft, R. Motwani, and J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison Wesley, Boston (2001).

[20] S. Horwitz and T. Reps, "The Use of Program Dependence Graphs in Software Engineering", 14th International Conference on Software Engineering, Melbourne, Australia, May 1992.

[21] N.D. Jones, C.K. Gomard, and P. Sestoft, Partial Evaluation and Automatic Program Generation, Prentice Hall International Series in Computer Science, Prentice Hall, Englewood Cliffs, New Jersey (1993).

[22] C. Ko, G. Fink, and K. Levitt, "Automated Detection of Vulnerabilities in Privileged Programs by Execution Monitoring", 10th Annual Computer Security Applications Conference, Orlando, Florida, 1994.

[23] C. Ko, "Logic Induction of Valid Behavior Specifications for Intrusion Detection", 2000 IEEE Symposium on Security and Privacy, Oakland, California, 2000.


[24] L. Lamport, R. Shostak, and M. Pease, "The Byzantine Generals Problem", ACM Transactions on Programming Languages and Systems 4, 3, pp. 382-401, July 1982.

[25] J.R. Larus and E. Schnarr, "EEL: Machine-Independent Executable Editing", SIGPLAN '95 Conference on Programming Language Design and Implementation, La Jolla, California, June 1995.

[26] M. Litzkow, M. Livny, and M. Mutka, "Condor - A Hunter of Idle Workstations", 8th International Conference on Distributed Computer Systems, San Jose, California, June 1988.

[27] B.P. Miller, M. Christodorescu, R. Iverson, T. Kosar, A. Mirgorodskii, and F. Popovici, "Playing Inside the Black Box: Using Dynamic Instrumentation to Create Security Holes", Parallel Processing Letters 11, 2/3, pp. 267-280, June/September 2001. Also appears in the Second Los Alamos Computer Science Institute Symposium, Santa Fe, NM (October 2001).

[28] T. Reps, "Program Analysis via Graph Reachability", Information and Software Technology 40, 11/12, pp. 701-726, November/December 1998.

[29] J.H. Saltzer, "Protection and the Control of Information Sharing in Multics", Communications of the ACM 17, 7, pp. 388-402, July 1974.

[30] T. Sander and C.F. Tschudin, "Protecting Mobile Agents Against Malicious Hosts", in G. Vigna, ed., Mobile Agents and Security, Lecture Notes in Computer Science #1419, pp. 44-60, Springer-Verlag, New York (1998).

[31] F.B. Schneider, "Towards Fault-tolerant and Secure Agentry", 11th International Workshop on Distributed Algorithms, Saarbrucken, Germany, September 1997.

[32] SETI@home: Search for Extraterrestrial Intelligence at Home, 23 January 2002, http://setiathome.ssl.berkeley.edu/.

[33] Sun Microsystems, Java Virtual Machines, 11 May 2002, http://java.sun.com/j2se/1.4/docs/guide/vm/.

[34] F. Tip, "A Survey of Program Slicing Techniques", Journal of Programming Languages 3, 3, pp. 121-189, September 1995.

[35] A. Vahdat, T. Anderson, M. Dahlin, E. Belani, D. Culler, P. Eastham, and C. Yoshikawa, "WebOS: Operating System Services for Wide Area Applications", Seventh International Symposium on High Performance Distributed Computing, Chicago, Illinois, July 1998.

[36] D.A. Wagner, Static Analysis and Computer Security: New Techniques for Software Assurance, Ph.D. Dissertation, University of California at Berkeley, Fall 2000.

[37] D. Wagner and D. Dean, "Intrusion Detection via Static Analysis", 2001 IEEE Symposium on Security and Privacy, Oakland, California, May 2001.

[38] C. Wang, J. Davidson, J. Hill, and J. Knight, "Protection of Software-based Survivability Mechanisms", International Conference on Dependable Systems and Networks, Goteborg, Sweden, July 2001.

[39] C. Warrender, S. Forrest, and B. Pearlmutter, "Detecting Intrusions Using System Calls: Alternative Data Models", 1999 IEEE Symposium on Security and Privacy, Oakland, California, May 1999.