Weighted Pushdown Systems and their Application to
Interprocedural Dataflow Analysis
Thomas Reps
University of Wisconsin
GrammaTech, Inc.
Joint work with S. Schwoon, S. Jha, and D. Melski
Interprocedural Dataflow Analysis
Application
Weighted Pushdown Systems
Dataflow Analysis
Pushdown Systems
Intraprocedural Analysis
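[Figure: a path p from enter to node n whose edges carry the transfer functions f1, f2, …, fk-1, fk; V0 is the dataflow value at enter.]
$pf_p = f_k \circ f_{k-1} \circ \cdots \circ f_2 \circ f_1$
$MOP(n) = \sqcap_{p \in PathsTo[n]} pf_p(V_0)$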
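The meet-over-all-paths definition can be illustrated on a tiny acyclic graph. The sketch below is only an illustration under assumptions that are not in the slides: the graph, the node names `a` and `b`, and the choice of lattice (sets of definitely-assigned variables, with set intersection as meet) are all hypothetical.

```python
from functools import reduce

# Hypothetical acyclic control-flow graph: each edge maps to a transfer
# function on sets of definitely-assigned variables (meet = intersection).
EDGES = {
    ("enter", "a"): lambda v: v | {"x"},
    ("enter", "b"): lambda v: v | {"x", "y"},
    ("a", "n"): lambda v: v | {"z"},
    ("b", "n"): lambda v: v | {"z"},
}

def paths_to(node, start="enter"):
    """Enumerate every path (a list of edges) from start to node."""
    if node == start:
        return [[]]
    result = []
    for (src, dst) in EDGES:
        if dst == node:
            for prefix in paths_to(src, start):
                result.append(prefix + [(src, dst)])
    return result

def path_function(path, v0):
    """Apply f1 first and fk last, i.e. pf_p = fk o ... o f1."""
    value = v0
    for edge in path:
        value = EDGES[edge](value)
    return value

def mop(node, v0):
    """Meet (here: intersection) of pf_p(v0) over all paths p to node."""
    values = [path_function(p, v0) for p in paths_to(node)]
    return reduce(lambda a, b: a & b, values)

result = mop("n", frozenset())
```

Enumerating paths only works because this example graph is acyclic; with loops or recursion the path set is infinite, which is the situation the rest of the talk addresses.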
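[Figure: supergraph for two procedures. main: start main, x = 3, p(x,y), return from p, printf(y), exit main. p: start p(a,b), if . . ., b = a, p(a,b), return from p, printf(b), exit p. Call and return edges are labeled with brackets such as ( ) and ].]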
Context-Sensitive Interprocedural Analysis
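[Figure: a path from start to n that passes through call_q, enter_q, exit_q and ret; the call and return edges are labeled with matching parentheses, and the edges carry the transfer functions f1, f2, f3, f4, f5, …, fk-3, fk-2, fk-1, fk. V0 is the dataflow value at start.]
$MOVP(n) = \sqcap_{p \in MatchedPathsTo[n]} pf_p(V_0)$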
An Expanded Set of Queries
int x;
void main() {
    x = 5;
    p();  // main_calls_p
    return;
}
void p() {
    if (...) {
        x = x + 1;
        p();  // p_calls_p1
        x = x - 1;
    }
    if (...) {
        x = x - 1;
        p();  // p_calls_p2
        x = x + 1;
    }
    return;
}
Query: <x, enterp p_calls_p2 p_calls_p1 main_calls_p>
Answer: 5
An Expanded Set of Queries
[Figure: automata over the stack symbols enterp, p_calls_p1, p_calls_p2 and main_calls_p, with edges annotated by the effects x = 5, x = x + 1 (+1) and x = x - 1 (-1).]
An Expanded Set of Queries
(Same program as above.)
Query: <x, enterp (p_calls_p2 p_calls_p1)* main_calls_p>
Answer: 5
An Expanded Set of Queries
(Same program as above.)
Query: <x, enterp (p_calls_p2 + p_calls_p1)* main_calls_p>
Answer: 5 ⊓ 4 = ⊥
An Expanded Set of Queries
(Same program as above.)
Query: <x, enterp Σ*>
Answer: 5 ⊓ 4 = ⊥
MOVP = any stack configuration
An Expanded Set of Queries
L1 = <x, enterp p_calls_p2 p_calls_p1 main_calls_p>
L2 = <x, enterp (p_calls_p2 p_calls_p1)* main_calls_p>
L3 = <x, enterp (p_calls_p2 + p_calls_p1)* main_calls_p>
L4 = <x, enterp Σ*>
$MOVP'(L) = \sqcap_{c \in L,\ p \in MatchedPathsTo[c]} pf_p(V_0)$
$MOVP(n) = \sqcap_{p \in MatchedPathsTo[n]} pf_p(V_0)$
MOVP’(L3) = MOVP’(L4) = MOVP(enterp)
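The four queries can be approximated by brute force for the example program. The following is a minimal sketch, not the WPDS algorithm: it enumerates stacks up to a hypothetical depth bound (`max_depth`), encodes p_calls_p1 as `1` and p_calls_p2 as `2`, leaves the fixed bottom symbol main_calls_p implicit, and relies on the observation that completed calls to p leave x unchanged.

```python
import re
from itertools import product

BOTTOM = "⊥"  # "not a constant" in the constant-propagation lattice

def meet(a, b):
    """Meet of two constant-propagation values (None = no path seen yet)."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a == b else BOTTOM

def x_at_enter_p(pending):
    """Value of x at enterp for a stack of pending recursive call sites.

    '1' stands for p_calls_p1 (x was incremented before the call) and
    '2' for p_calls_p2 (x was decremented before the call). Completed
    calls to p have no net effect on x, so only pending calls matter.
    """
    return 5 + pending.count("1") - pending.count("2")

def movp_prime(pattern, max_depth=6):
    """Meet over all stacks (top first) of bounded depth that match pattern."""
    result = None
    for depth in range(max_depth + 1):
        for letters in product("12", repeat=depth):
            stack = "".join(letters)
            if re.fullmatch(pattern, stack):
                result = meet(result, x_at_enter_p(stack))
    return result

answer_L1 = movp_prime("21")       # p_calls_p2 p_calls_p1
answer_L2 = movp_prime("(21)*")    # (p_calls_p2 p_calls_p1)*
answer_L3 = movp_prime("(2|1)*")   # (p_calls_p2 + p_calls_p1)*
```

Under these assumptions the first two queries yield the constant 5 and the third yields ⊥, which agrees with the answers shown on the slides.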
So What? Who Cares? [Yawn]
• Virtual inline expansion
  – Value for x in configurations with an even # of calls to p: MOVP’(<x, n (p_calls_p p_calls_p)* main_calls_p>)
  – Value for x in configurations with an odd # of calls to p: MOVP’(<x, n p_calls_p (p_calls_p p_calls_p)* main_calls_p>)
• Stack-constrained queries
  – at breakpoint at n, fetch stack from debugger (say S)
  – stack-constrained slicing: “What are the program elements that could have affected the values used at n, given that we have reached n with stack S?”
So What? Who Cares? [Yawn]
• Software model checking: Check properties by model checking a CFG encoded as a PDS
  – SLAM [Ball & Rajamani]
  – MOPS [Chen & Wagner]
  – “Meta-Compilation” [Engler et al.]
• PDS → WPDS
  – GrammaTech software model checker implemented on top of the WPDS++ library
So What? Who Cares? [Yawn]
• Convenient framework for implementing interprocedural dataflow analyses
  – Create weighted PDS from interprocedural CFG
  – Either exhaustive or demand-driven dataflow analysis
  – Used to solve subproblem needed for recovering the organization of stack-frames in x86 executables [Balakrishnan & Reps CC 04]
Unrolled Program = Transition System
[Figure: a supergraph with nodes a through j in procedures p and q, followed by its unrolling into repeated copies of a procedure p with nodes a through f; the unrolling continues without bound.]
Unrolled Program = ∞ Transition System
Pushdown System (PDS)
States: { σ1, σ2, σ3, σ4 }
Stack symbols: { A, B, C, D }
Transition rules:
<σ1, A> → <σ2, ε>: if the state is σ1 and the top of the stack is A, then pop A and transition to state σ2
<σ1, A> → <σ2, B>: if the state is σ1 and the top of the stack is A, then pop A, push B, and transition to state σ2
<σ1, A> → <σ2, B C>: if the state is σ1 and the top of the stack is A, then pop A, push C, then push B, and transition to state σ2
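Rules Define a Transition Relation
<σ, A> → <σ’, ε>: the stack top A is removed and the state becomes σ’
<σ, A> → <σ’, B>: the stack top A is replaced by B and the state becomes σ’
<σ, A> → <σ’, B C>: the stack top A is replaced by B C (B on top) and the state becomes σ’
[Figure: the stack before and after each kind of rule.]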
Pushdown System (PDS)
• PDS = Pushdown automaton without an input tape
• Mechanism for defining a class of infinite-state transition systems
Example: the rule <σ, A> → <σ, A A> generates
<σ, A> ⇒ <σ, AA> ⇒ <σ, AAA> ⇒ <σ, AAAA> ⇒ …
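The way a finite set of rules induces a transition relation over configurations can be sketched in a few lines. This is an illustrative encoding only: the tuple representation, the function name `successors`, and the ASCII state names `s`, `s1`, `s2` (standing for σ, σ1, σ2) are assumptions made for the example.

```python
def successors(config, rules):
    """One-step successors of config = (state, stack); index 0 is the stack top.

    Each rule is ((state, top_symbol), (new_state, pushed_symbols)), where
    pushed_symbols replaces the top symbol: () pops, (B,) rewrites, and
    (B, C) leaves B on top of C.
    """
    state, stack = config
    if not stack:
        return []
    top, rest = stack[0], stack[1:]
    return [
        (new_state, pushed + rest)
        for (old_state, symbol), (new_state, pushed) in rules
        if (old_state, symbol) == (state, top)
    ]

# The three rule shapes from the slides.
SHAPES = [
    (("s1", "A"), ("s2", ())),          # pop A
    (("s1", "A"), ("s2", ("B",))),      # pop A, push B
    (("s1", "A"), ("s2", ("B", "C"))),  # pop A, push C, then push B
]
shape_successors = successors(("s1", ("A", "D")), SHAPES)

# The single rule <s, A> -> <s, A A> yields an unbounded run.
GROW = [(("s", "A"), ("s", ("A", "A")))]
config = ("s", ("A",))
run = [config]
for _ in range(3):
    config = successors(config, GROW)[0]
    run.append(config)
```

The second example shows why a PDS describes an infinite-state system: every step produces a configuration with a strictly taller stack.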
Supergraph as a PDS
[Figure: supergraph with nodes a through j in procedures p and q; one edge and its rule are highlighted per step.]
<σ, a> → <σ, b>
<σ, b> → <σ, c>
<σ, c> → <σ, d f>    (save return site on stack)
<σ, d> → <σ, e>
<σ, e> → <σ, ε>    (uncovers most recent call site)
<σ, f> → <σ, g>
<σ, g> → <σ, h>
<σ, h> → <σ, d i>    (save return site on stack)
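The encoding shown on these slides is mechanical: intraprocedural edges become rewrite rules, calls push the return site, and procedure exits pop. A minimal sketch follows, assuming a hypothetical input format (lists of edges, call triples and exit nodes) and a single control state named `sigma`; only the edges that appear in the rules above are encoded.

```python
def supergraph_to_pds(intra_edges, calls, exits, state="sigma"):
    """Build PDS rules ((state, node), (state, pushed)) from a supergraph.

    intra_edges: (n1, n2) pairs inside one procedure
    calls:       (call_node, callee_entry, return_site) triples
    exits:       exit nodes of procedures
    """
    rules = []
    for n1, n2 in intra_edges:
        # Ordinary edge: replace the current node by its successor.
        rules.append(((state, n1), (state, (n2,))))
    for call_node, entry, return_site in calls:
        # Call: enter the callee and save the return site below it.
        rules.append(((state, call_node), (state, (entry, return_site))))
    for exit_node in exits:
        # Return: pop, which exposes the saved return site.
        rules.append(((state, exit_node), (state, ())))
    return rules

rules = supergraph_to_pds(
    intra_edges=[("a", "b"), ("b", "c"), ("d", "e"), ("f", "g"), ("g", "h")],
    calls=[("c", "d", "f"), ("h", "d", "i")],
    exits=["e"],
)
```

With this encoding the PDS stack plays the role of the run-time call stack: its top symbol is the current program point and the symbols below it are pending return sites.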
Unrolled Program = ∞ Transition System
[Figure: the unbounded unrolling of procedure p (nodes a through f); two points in it are labeled with the configurations <σ, f e c> and <σ, b c c>.]
PDS Terminology
Configuration: <σ, f e c>
c ⇒ c’ (transition relation): c’ follows from c by a transition rule; c is a predecessor of c’, and c’ is a successor of c
c0 ⇒ c1 ⇒ . . . ⇒ cn (a run)
c ⇒* c’: reflexive transitive closure of ⇒
A Run
[Figure: the unrolled transition system for procedure p (nodes a through f) with one run highlighted.]
<σ,a> ⇒ <σ,b> ⇒ <σ,ac> ⇒ <σ,bc> ⇒ <σ,acc> ⇒ <σ,fcc> ⇒ <σ,cc> ⇒ <σ,dc> ⇒ <σ,aec> ⇒ <σ,fec>
Representing Distributive Functions [POPL 95]
[Figure: each function is drawn as a graph over the elements a, b, c.]
Identity Function: f = λV.V, so f({a,b}) = {a,b}
Constant Function: f = λV.{b}, so f({a,b}) = {b}
“Gen/Kill” Function: f = λV.(V − {b}) ∪ {c}, so f({a,b}) = {a,c}
Non-“Gen/Kill” Function: f = λV. if a ∈ V then V ∪ {b} else V − {b}, so f({a,b}) = {a,b}
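The four functions can be checked for distributivity over union by brute force on the three-element universe. This is a small sketch for illustration; the helper names `subsets` and `distributes_over_union`, and the final counterexample function `needs_both`, are hypothetical and not part of the slides.

```python
from itertools import chain, combinations

UNIVERSE = ("a", "b", "c")

# The four example functions, on frozensets over {a, b, c}.
identity = lambda v: v
constant = lambda v: frozenset({"b"})
gen_kill = lambda v: (v - {"b"}) | {"c"}
non_gen_kill = lambda v: (v | {"b"}) if "a" in v else (v - {"b"})

def subsets(items):
    """All subsets of items, as frozensets."""
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]

def distributes_over_union(f):
    """Check f(x | y) == f(x) | f(y) for every pair of subsets."""
    all_sets = subsets(UNIVERSE)
    return all(f(x | y) == f(x) | f(y) for x in all_sets for y in all_sets)

# Hypothetical non-distributive function: it produces c only when both
# a and b are present, so it cannot be decomposed element by element.
needs_both = lambda v: frozenset({"c"}) if {"a", "b"} <= v else frozenset()

results = {
    "identity": distributes_over_union(identity),
    "constant": distributes_over_union(constant),
    "gen_kill": distributes_over_union(gen_kill),
    "non_gen_kill": distributes_over_union(non_gen_kill),
    "needs_both": distributes_over_union(needs_both),
}
```

The point of the fourth slide example is visible here: a function need not have gen/kill form to be distributive.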
[Figure: the supergraph for main and p from the earlier example (x = 3, p(x,y), printf(y); if . . ., b = a, p(a,b), printf(b)), annotated with the variables x, y, a, b, together with a listing of configurations whose stacks contain the symbols [start], [x = 3] and [p(x,y)].]
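[Figure: a set of configurations M and pre*(M), the configurations from which M can be reached.]
[Figure: a set of configurations M and post*(M), the configurations that can be reached from M.]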
Representation Issue
• The set of configurations pre*(S) can be infinite
• Example
  – <σ, A> → <σ, ε>
  – pre*({<σ, A>}) = { <σ, A^i> | i ≥ 1 }
• Solution in the PDS literature: Represent a set of configurations with an automaton
[Figure: the chain . . . ⇒ <σ, AAA> ⇒ <σ, AA> ⇒ <σ, A> ⇒ <σ, ε>.]
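From M to Pre*(M)
Saturation rule: if <σ, A> → <σ1, A1 . . . Am> is a rule and the automaton has a path from σ1 labeled A1 . . . Am to some state, then add a transition labeled A from σ to that state.
[Figure: the path from σ1 labeled A1 . . . Am and the added transition from σ labeled A.]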
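The saturation rule can be turned into a simple fixpoint loop. The sketch below is a naive, unoptimized version written for illustration, not the algorithm from the cited papers or from any library; the data layout (rules as nested tuples, automaton transitions as triples) and the state names `s` and `f` are assumptions.

```python
def pre_star(rules, transitions):
    """Saturate an automaton so that it accepts pre* of its original language.

    rules:       list of ((p, a), (p1, word)) meaning <p, a> -> <p1, word>
    transitions: set of (source, symbol, target) automaton transitions
    """
    trans = set(transitions)
    changed = True
    while changed:
        changed = False
        for (p, a), (p1, word) in rules:
            # States reachable from p1 by reading word in the current automaton.
            frontier = {p1}
            for sym in word:
                frontier = {t for (s, x, t) in trans if s in frontier and x == sym}
            for q in frontier:
                if (p, a, q) not in trans:
                    trans.add((p, a, q))
                    changed = True
    return trans

def accepts(trans, finals, config):
    """Does the automaton accept config = (state, stack), reading top first?"""
    state, stack = config
    current = {state}
    for sym in stack:
        current = {t for (s, x, t) in trans if s in current and x == sym}
    return bool(current & finals)

# Example from the "Representation Issue" slide: rule <s, A> -> <s, eps>,
# and an initial automaton that accepts exactly {<s, A>}.
RULES = [(("s", "A"), ("s", ()))]
INITIAL = {("s", "A", "f")}
saturated = pre_star(RULES, INITIAL)
```

After saturation the automaton has a self-loop on `s` labeled A, so a finite automaton represents the infinite set { <s, A^i> | i ≥ 1 }.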
Observation
• For IFDS problems (Reps, Horwitz, & Sagiv [POPL 95]), the PDS literature provides a solution to the MOVP’ problem
  – Bouajjani, Esparza, & Maler [Concur 97]
  – Esparza et al. [CAV 00]
• But . . . some problems are not IFDS
  – linear constants [Sagiv, Reps, & Horwitz 96]
  – affine relations [Müller-Olm & Seidl 03]
Interprocedural Dataflow Analysis
Application
Weighted Pushdown Systems
Dataflow Analysis
Pushdown Systems
Weighted Pushdown System (WPDS)
States: { σ1, σ2, σ3, σ4 }
Stack symbols: { A, B, C, D }
Transition rules, each with a weight:
<σ1, A> → <σ2, ε>    weight w1
<σ1, A> → <σ2, B>    weight w2
<σ1, A> → <σ2, B C>    weight w3
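Idempotent Semiring $(D, \oplus, \otimes, 0, 1)$
[= Meet Semilattice $(D, \sqcap, ..., \top, ...)$]
$a \oplus 0 = a$
$a \oplus b = b \oplus a$
$a \oplus (b \oplus c) = (a \oplus b) \oplus c$
$a \oplus a = a$
$a \otimes 1 = a$
$a \otimes (b \otimes c) = (a \otimes b) \otimes c$
$a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)$
$(a \oplus b) \otimes c = (a \otimes c) \oplus (b \otimes c)$
$a \otimes 0 = 0 \otimes a = 0$
$a \sqsubseteq b$ iff $a \oplus b = a$
$\oplus = \sqcap$, $\otimes = \circ^R$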
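One concrete instance helps make the axioms tangible. The sketch below uses the min-plus semiring over non-negative numbers (combine = min, extend = +, 0 = infinity, 1 = 0) and checks the axioms on a handful of sample values. The choice of this semiring is an assumption made for the example; the talk itself uses dataflow transformers as weights.

```python
import math
from itertools import product

ZERO = math.inf  # identity for combine, annihilator for extend
ONE = 0          # identity for extend

def combine(a, b):
    """The semiring's (+) operation: keep the cheaper of two alternatives."""
    return min(a, b)

def extend(a, b):
    """The semiring's (x) operation: sequence two costs."""
    return a + b

def check_idempotent_semiring(samples):
    """Assert every axiom from the slide on all triples of sample values."""
    for a, b, c in product(samples, repeat=3):
        assert combine(a, ZERO) == a
        assert combine(a, b) == combine(b, a)
        assert combine(a, combine(b, c)) == combine(combine(a, b), c)
        assert combine(a, a) == a
        assert extend(a, ONE) == a
        assert extend(a, extend(b, c)) == extend(extend(a, b), c)
        assert extend(a, combine(b, c)) == combine(extend(a, b), extend(a, c))
        assert extend(combine(a, b), c) == combine(extend(a, c), extend(b, c))
        assert extend(a, ZERO) == ZERO and extend(ZERO, a) == ZERO
    return True

def leq(a, b):
    """The induced order: a is below b iff a (+) b = a."""
    return combine(a, b) == a

ok = check_idempotent_semiring([0, 1, 2, 5, math.inf])
```

In this instance the induced order is the usual numeric order, and the weight of a set of paths is the cost of its cheapest member.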
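[Figure: a root set of configurations Mroot and the set post*(Mroot) of configurations reachable from it.]
[Figure: a configuration n inside post*(Mroot) is assigned a weight W.]
[Figure: a configuration n outside post*(Mroot) is assigned the weight 0.]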
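From M to Pre*(M)
Weighted saturation rule: if <σ, A> → <σ1, A1 . . . Am> is a rule with weight w, and the automaton has a path from σ1 to σk labeled A1 . . . Am with weight X, then the transition from σ to σk labeled A gets weight w ⊗ X if it is new, or V ⊕ (w ⊗ X) if it already exists with weight V.
[Figure: the path from σ1 to σk with weight X, and the transition from σ to σk labeled A with weight V ⊕ (w ⊗ X).]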
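The weighted update can be sketched as a fixpoint loop over a dictionary of transition weights. This is a naive illustration, not the algorithm of the WPDS libraries: it assumes the min-plus semiring (combine = min, extend = +), for which extend is commutative, and it uses hypothetical state names `s` and `f`.

```python
import math

ZERO, ONE = math.inf, 0   # min-plus semiring constants (assumed for this example)
combine = min
extend = lambda a, b: a + b

def read_word(trans, start_weights, word):
    """Weights of states reachable by reading word, combined over all paths."""
    reach = dict(start_weights)
    for sym in word:
        nxt = {}
        for (s, x, t), v in trans.items():
            if x == sym and s in reach:
                nxt[t] = combine(nxt.get(t, ZERO), extend(reach[s], v))
        reach = nxt
    return reach

def weighted_pre_star(rules, transitions):
    """rules: ((p, a), (p1, word), w); transitions: {(src, sym, dst): weight}."""
    trans = dict(transitions)
    changed = True
    while changed:
        changed = False
        for (p, a), (p1, word), w in rules:
            for q, x_weight in read_word(trans, {p1: ONE}, word).items():
                old = trans.get((p, a, q), ZERO)        # V (ZERO if absent)
                new = combine(old, extend(w, x_weight))  # V (+) (w (x) X)
                if new != old:
                    trans[(p, a, q)] = new
                    changed = True
    return trans

def config_weight(trans, finals, config):
    """Combined weight of all accepting paths for config = (state, stack)."""
    state, stack = config
    reach = read_word(trans, {state: ONE}, stack)
    weights = [v for q, v in reach.items() if q in finals]
    return min(weights) if weights else ZERO

# Rule <s, A> -> <s, eps> with cost 1; target set {<s, A>} with weight 0.
RULES = [(("s", "A"), ("s", ()), 1)]
saturated = weighted_pre_star(RULES, {("s", "A", "f"): 0})
cost = config_weight(saturated, {"f"}, ("s", ("A", "A", "A")))
```

Here `cost` is the cheapest way to reach <s, A> from <s, AAA>, which takes two pops of cost 1 each.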
An Expanded Set of Queries
(Same program as above.)
Demo
An Application
• Analysis of x86 code
  – no use of debugging information
  – Subgoal: discover affine relations on registers
int main() {
    int i, j, a[10];
    j = 0;
    for (i = 0; i < 10; ++i) {
        a[i] = i;
    }
    return a[2];
}
Difficulties with Object Code
; ebx corresponds to variable i
        sub esp, 44
        mov [esp+40], 0     ; j = 0
        xor ebx, ebx        ; i = 0
        lea ecx, [esp]
loc_9:
        mov [ecx], ebx      ; a[i] = i
        inc ebx             ; i++
        add ecx, 4
        cmp ebx, 10         ; i < 10?
        jl short loc_9
        mov eax, [esp+8]    ; return a[2]
        add esp, 44
        retn
Identifying the layout of stack-frames
[Figure: stack layout between addresses 0h and 0ffffh. int a[10] (40 bytes) starts at esp, int j (4 bytes) is at esp + 40, and the rest of the stack lies above; ebp is also marked. The accompanying listing is the same assembly code as on the previous slide.]
Problems with Widening
[Figure: the same stack layout, with the addresses esp + 4, esp + 8 and esp + 36 inside a[10] marked, and the annotation ecx = esp + 4*ebx + 4. The accompanying listing is the same assembly code as above.]
Problems with Widening
ecx ∈ [esp, esp + 36] ?
Widening: ecx ∈ [esp, ∞]
Limited widening: ecx ∈ [esp, esp + 36]
Affine-Relation Analysis
• Affine relation
  – r1, r2, …, rn: variables; a0, a1, …, an: integer constants
  – $a_0 + \sum_{i=1}^{n} a_i r_i = 0$
• Interprocedural affine-relation analysis [Müller-Olm & Seidl 04]
  – Flow-sensitive and context-sensitive problem
  – Algorithm: Solve a constraint system
• Our application:
  – r1, r2, …, r8: x86 registers
  – Constraint system → WPDS
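To see what an affine relation on registers looks like in the array-initialization loop shown earlier, the loop can be simulated directly. This is a hand-written simulation under assumptions, not the analysis itself: the starting value of esp is arbitrary, and the relation checked at the loop head (ecx − esp − 4·ebx = 0) is derived here from the listing rather than quoted from the slides.

```python
def simulate_loop(esp=1000):
    """Step through the loop at loc_9 and record facts at each iteration."""
    ebx = 0      # xor ebx, ebx   ; i = 0
    ecx = esp    # lea ecx, [esp]
    relation_at_head = []   # value of ecx - esp - 4*ebx at loc_9
    store_addresses = []    # addresses written by mov [ecx], ebx
    while True:
        relation_at_head.append(ecx - esp - 4 * ebx)
        store_addresses.append(ecx)   # mov [ecx], ebx ; a[i] = i
        ebx += 1                      # inc ebx
        ecx += 4                      # add ecx, 4
        if not ebx < 10:              # cmp ebx, 10 ; jl short loc_9
            break
    return relation_at_head, store_addresses

relation, stores = simulate_loop()
lowest, highest = min(stores) - 1000, max(stores) - 1000
```

Every recorded value of the relation is zero, and the stores stay within [esp, esp + 36], which is the bound that interval analysis alone loses under plain widening.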
Performance
• Running time: linear in program size
• Constant of proportionality: k^8
  – Only 8 registers ⇒ operations on 9 x 9 matrices

Program     nInsts   nProcs   Time (s)
print        75539      697       0.77
finger       96123      893       7.13
winhlp32    157634     6491      17.32
regsvr32    225857     9625      37.15
cmd         230481     2317      52.38
notepad     239408     2911      41.85
Contributions
• Algorithms for the “generalized pushdown reachability problem”:
  $MOVP'(L) = \sqcap_{c \in L,\ p \in MatchedPathsTo[c]} pf_p(V_0)$
  – Running time: O(|Q|^2 × |PDS| × H)
  – [Sound solutions for non-distributive dataflow problems]
  – Differential propagation algorithms, too
• Construction of witness trees (optional)
• Program analysis
  – T. Reps, S. Schwoon, and S. Jha, Weighted pushdown systems and their application to interprocedural dataflow analysis, SAS 03
• Authorization problems
  – S. Jha and T. Reps, Analysis of SPKI/SDSI certificates using model checking, CSFW 02
  – S. Schwoon, S. Jha, T. Reps, and S. Stubblebine, On generalized authorization problems, CSFW 03
  – S. Jha and T. Reps, Model checking SPKI/SDSI. To appear in J. Comp. Security
Second Topic
Authorization Problems
• Traditionally, authorization restrictions are specified using access control lists (ACLs)
  – Associate permissions with objects
  – E.g., AFS permissions for directory D:
      reps rlidwka
      jha rlidwk
      reps:students rl
• SPKI/SDSI
  – Local name spaces
      reps student
      reps student spouse
  – Delegation
SPKI/SDSI
Principals (Public Keys)
  KBob, KAlice: Individuals
  KCS: CS Department
  KOwner[R]: Owner of resource R
Local Names: KCS faculty, KBob myStudents
Extended Names: KBob myStudents Spouse
Name Certs
  Bob is a CS faculty member: KCS faculty → KBob
  Alice is a student of Bob’s: KBob myStudents → KAlice
  Alice’s friends . . .:
    KAlice myFriends → KJoe
    KAlice myFriends → KMary enemies
    KAlice myFriends → KMary enemies spouse
Auth Certs
  A CS faculty member can use host H: KOwner[H] → KCS faculty
  Bob allows access to his students: KBob → KBob myStudents
  Alice allows access to her friends: KAlice → KAlice myFriends
  Delegation markers: □ = can delegate, ■ = cannot delegate
Certificate Chain
KOwner[H] ⇒ KCS faculty ⇒ KBob ⇒ KBob myStudents ⇒ KAlice
Certificates:
  KOwner[H] → KCS faculty
  KCS faculty → KBob
  KBob → KBob myStudents
  KBob myStudents → KAlice
  KAlice → KAlice myFriends    (Does not apply!)
A Certificate Chain is a Run
<KOwner[H], □> ⇒ <KCS, faculty □> ⇒ <KBob, □> ⇒ <KBob, myStudents ■> ⇒ <KAlice, ■>
Rules used:
  <KOwner[H], □> → <KCS, faculty □>
  <KCS, faculty> → <KBob, ε>
  <KBob, □> → <KBob, myStudents ■>
  <KBob, myStudents> → <KAlice, ε>
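The correspondence between a certificate chain and a PDS run can be replayed step by step. The sketch below is illustrative only: keys are modeled as PDS states and names as stack symbols, the dictionary encoding is hypothetical, and the delegation marker on the right-hand side of Alice's certificate (■) is an assumption, since the slides do not show it legibly.

```python
# Each certificate becomes a rule (state, top_symbol) -> (new_state, pushed).
RULES = {
    ("KOwner[H]", "□"): ("KCS", ("faculty", "□")),   # auth cert, can delegate
    ("KCS", "faculty"): ("KBob", ()),                # name cert
    ("KBob", "□"): ("KBob", ("myStudents", "■")),    # auth cert, cannot delegate
    ("KBob", "myStudents"): ("KAlice", ()),          # name cert
    ("KAlice", "□"): ("KAlice", ("myFriends", "■")), # marker assumed
}

def run_from(config, rules):
    """Apply rules until none matches; the rule set here is deterministic."""
    run = [config]
    while True:
        state, stack = config
        if not stack or (state, stack[0]) not in rules:
            return run
        new_state, pushed = rules[(state, stack[0])]
        config = (new_state, pushed + stack[1:])
        run.append(config)

chain = run_from(("KOwner[H]", ("□",)), RULES)
final_config = chain[-1]
```

The run stops at <KAlice, ■>: Alice's own auth cert requires <KAlice, □>, which is why the slide marks it as not applying.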
Pre*(S)
S = {<KAlice, □>, <KAlice, ■>}
Basic Authorization Query: <KOwner[H], □> ∈ Pre*({<KAlice, □>, <KAlice, ■>})?
[Figure: automaton with states KOwner[H], KCS, KBob and KAlice; the transition out of KAlice is labeled { □, ■ }.]
What Does the Automaton Represent?
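• A set of configurations: <K, a1 … am> is in the set if there is a path labeled a1 a2 . . . am from state K to a final state
• Initial automaton represents {<KAlice, □>, <KAlice, ■>}
[Figure: automaton with states KOwner[H], KCS, KBob and KAlice; the transition out of KAlice is labeled { □, ■ }.]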
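From M to Pre*(M)
[Figure: as before, a rule <σ, A> → <σ1, A1 . . . Am> together with a path from σ1 labeled A1 . . . Am justifies adding a transition labeled A from σ to the end of that path.]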
Pre*({<KAlice, □>, <KAlice, ■>})
[Figure: the automaton over states KOwner[H], KCS, KBob and KAlice, built up in steps; transitions labeled faculty and myStudents are added to the initial { □, ■ } transition.]
Rules applied first:
  <KCS, faculty> → <KBob, ε>
  <KBob, myStudents> → <KAlice, ε>
Rules applied next:
  <KBob, □> → <KBob, myStudents ■>
  <KOwner[H], □> → <KCS, faculty □>
Result: <KOwner[H], □> ∈ Pre*({<KAlice, □>, <KAlice, ■>})
Other Certificate-Set-Analysis Problems
• Shared access?
  – Given two resources R1 and R2, what principals can access both R1 and R2?
• Expiration vulnerability?
  – What resources will principal K be prevented from accessing if certificate set C’ expires?
• Many more . . .
• Main message
  – Several certificate-set-analysis problems can be solved by model checking a PDS
Weighted Pushdown System (WPDS)
States: { σ1, σ2, σ3, σ4 }
Stack symbols: { A, B, C, D }
Transition rules, each with a weight:
<σ1, A> → <σ2, ε>    weight w1
<σ1, A> → <σ2, B>    weight w2
<σ1, A> → <σ2, B C>    weight w3
Privacy using a Weighted PDS
<KInsurer, □> → <KH, patient ■>
<KH, patient> → <KAIDS, patient>
<KH, patient> → <KIM, patient>
<KAIDS, patient> → <KAlice, ε>
<KIM, patient> → <KAlice, ε>
[Each rule carries a weight, either S or I.]
Privacy using a Weighted PDS
[Figure: derivations from <KInsurer, □> through <KH, patient ■> to <KAlice, ■>, one via <KIM, patient ■> and one via <KAIDS, patient ■>, with each step labeled by its weight (I or S).]
SS I = I
S I
I S S = S I I I = I
What to Take Away . . .
• Observation: Can perform interprocedural dataflow analysis using WPDSs
  – supports a broader set of dataflow-analysis queries than past work (30 years’ worth . . .)
• WPDSs have another application
  – Certificate-set-analysis problems
• Libraries for WPDSs
  – WPDS Library: C [Schwoon, Reps, & Jha]
  – WPDS++ (soon): C++ [Kidd & Reps]
Related Work
• Pushdown systems
  – Bouajjani, Esparza, & Maler [Concur 97]
  – Esparza et al. [CAV 00]
  – Bouajjani, Esparza, & Touili [POPL 03]
• Dataflow analysis
  – Sharir & Pnueli 81
  – IDE framework: Sagiv, Reps, & Horwitz [TCS 96]
• Weighted-hypergraph problems
  – Knuth [IPL 77]
  – Grammar flow analysis: Möncke & Wilhelm [WAGA 91]
  – Ramalingam thesis [LNCS #1089]
  – Ramalingam & Reps [J. Alg 96]