APLAS05
A Path Sensitive Type System for Resource Usage Verification of C like languages
Korea Advanced Institute of Science and Technology
Hyun-Goo Kang, Youil Kim, Taisook Han, Hwansoo Han
Resource Usage Protocol
A program should use resources in a valid way.
Such a protocol is usually specified as a correct sequence of actions on the resource, which is recognizable by a finite state machine.
Examples
– A file should be opened before being written.
– A memory cell should not be accessed after deallocation.
– An acquired lock should be released eventually.
– …
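Example
[ Program 1 ]
    #include <stdio.h>

    int main(void) {
        FILE *fp = fopen("f", "w");
        fprintf(fp, "x");       /* bug: fp is NULL if fopen failed */
        fclose(fp);
        return 0;
    }
[ Program 2 ]
    #include <stdio.h>

    int main(void) {
        FILE *fp = fopen("f", "w");
        if (fp) {               /* the file is used only if fopen succeeded */
            fprintf(fp, "x");
            fclose(fp);
        }
        return 0;
    }
– Miss the bug: an analyzer that assumes fopen always opens the specified file accepts Program 1, although fp may be NULL.
– False alarm: an analyzer that lets fopen fail but does not relate the test on fp to the state of the file rejects Program 2, although it is correct.
Path Sensitivity is Essential!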
A Path Sensitive Specification in FA
[Figure: finite automaton with states Closed, Opened and Error. Transition labels: fopen {ret>0}; fopen {ret<=0}; read/write; close; read/write/close; fopen.]
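The automaton can be sketched in C as a transition function. The edge layout is an assumption, since the arrows of the figure are not part of this transcript: fopen with ret>0 is taken to lead from Closed to Opened, fopen with ret<=0 to stay in Closed, close to lead from Opened back to Closed, and the remaining labels to lead to Error. The names State, Action, step and run are hypothetical.

```c
/* States of the file protocol automaton named on the slide. */
typedef enum { ST_CLOSED, ST_OPENED, ST_ERROR } State;

/* Actions on the resource. */
typedef enum { ACT_FOPEN, ACT_READ, ACT_WRITE, ACT_CLOSE } Action;

/* One transition. `ret` is the value returned by the action; it is only
 * consulted for fopen, which is what makes the specification path
 * sensitive. The edges are an assumed reading of the figure. */
State step(State s, Action a, int ret)
{
    switch (s) {
    case ST_CLOSED:
        if (a == ACT_FOPEN)
            return ret > 0 ? ST_OPENED : ST_CLOSED; /* fopen {ret>0} / {ret<=0} */
        return ST_ERROR;                            /* read/write/close */
    case ST_OPENED:
        if (a == ACT_READ || a == ACT_WRITE)
            return ST_OPENED;                       /* read/write */
        if (a == ACT_CLOSE)
            return ST_CLOSED;                       /* close */
        return ST_ERROR;                            /* fopen on an open file */
    default:
        return ST_ERROR;                            /* Error is absorbing */
    }
}

/* Run a trace of n (action, ret) pairs starting from Closed. */
State run(const Action *acts, const int *rets, int n)
{
    State s = ST_CLOSED;
    for (int i = 0; i < n; i++)
        s = step(s, acts[i], rets[i]);
    return s;
}
```

Under this reading, the trace fopen, write, close ends in Closed when fopen returns a positive value and in Error when it does not, which is the distinction a path insensitive specification cannot express.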
Related Works
Path insensitive verification: actions in the finite automata specification are limited to syntactically identifiable sets
– Resource Usage Analysis (Igarashi & Kobayashi)
– Vault (DeLine & Fahndrich)
Path sensitive but whole program analysis
– SLAM (Ball & Rajamani @ MSR), BLAST (Henzinger et al. @ UCB)
– ESP (Das et al. @ MSR)
Path sensitive and modular, but unsound
– Saturn (Yichen Xie, Alex Aiken @ Stanford)
Our Goal is
To design a path sensitive resource usage analysis
To design it as a modular analysis for modular specification/verification and scalability
To design it as an automatic and sound analysis
Observations
Path sensitivity is essential.
Values that identify paths are mainly constants, limited to some simple integer values.
A pointer to a file-like resource is normally used just as a reference.
Intraprocedural aliasing of resources is frequent, but interprocedural aliasing of resources is not.
Resource allocation rarely appears within loops. Even if it does, every resource allocated in the loop should be deallocated or should have the same specification.
Selected Abstraction
Domain abstraction
– Resource states are traced at the concrete level (no abstraction; the set of states is finite).
– Values that identify paths are traced with a constant propagation lattice.
Join at merge point
– If resource contexts from different paths are different, then we collect (union) them as a set.
– Otherwise we do a normal join (⊔) over our lattice type.
Resource identification
– Resources are identified by allocation points. All resources allocated at the same program point should satisfy the same resource usage specification.
Tracing resources
– Alias information is traced in a path sensitive way within the function body, under the assumption of no interprocedural alias.
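A minimal sketch of the join at a merge point, under simplifying assumptions that are not from the slides: a path context is reduced to one sign value (a bit set over minus, zero, plus) and the state of a single resource. Contexts whose resource states differ are collected side by side; contexts with the same resource state are joined on the sign component. Ctx, add_ctx and merge_paths are hypothetical names.

```c
/* Resource state of the single tracked resource. */
typedef enum { RES_CLOSED, RES_OPENED } ResState;

/* A simplified path context: sign bit set (1 = minus, 2 = zero, 4 = plus)
 * of the value that identifies the path, plus the resource state. */
typedef struct {
    unsigned sign;
    ResState res;
} Ctx;

/* Add context c to the set out[0..n). If a context with the same resource
 * state is already present, join the signs (set union); otherwise append c
 * as a separate path. Returns the new size of the set. */
int add_ctx(Ctx *out, int n, Ctx c)
{
    for (int i = 0; i < n; i++) {
        if (out[i].res == c.res) {
            out[i].sign |= c.sign;   /* normal lattice join */
            return n;
        }
    }
    out[n] = c;                      /* different resource context: collect */
    return n + 1;
}

/* Merge the contexts of two incoming paths. `out` must have room for
 * na + nb entries. Returns the number of contexts after merging. */
int merge_paths(const Ctx *a, int na, const Ctx *b, int nb, Ctx *out)
{
    int n = 0;
    for (int i = 0; i < na; i++)
        n = add_ctx(out, n, a[i]);
    for (int i = 0; i < nb; i++)
        n = add_ctx(out, n, b[i]);
    return n;
}
```

Merging a path that left the file opened with one that left it closed keeps two contexts, while merging two paths that both left it closed yields one context whose sign is the join of the two.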
Our Type System
Type ≈ lattice element instrumented with type variables
Basically a subtype system (bounded polymorphism)
We add flow and path sensitivity.
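Domain Design (Basic Types)
[Figure: four basic lattices, each with top element ⊤.]
– sign: M, Z, P; MZ, MP, ZP; ⊤
– resource id: r1, …, rn; NR, RC; ⊤
– allocation state: AL, NA; ⊤
– resource state: C, O; ⊤
value = sign × resource id
state of a resource = allocation state × resource state
A ⊢ X1 ⊑ X2 if X1 ⊑ X2 ∈ Bas or X1 ⊑ X2 ∈ A
Examples: ∅ ⊢ P ⊑ MP and { α ⊑ P } ⊢ α ⊑ MP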
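The sign lattice can be sketched as bit sets, with the ordering as set inclusion and the join as set union. This encoding is an illustrative assumption rather than the representation used in the paper; the names Sign, sign_leq, sign_join and sign_of are hypothetical, and the empty set (0) acts as a bottom element that the slide does not show.

```c
/* Sign lattice as bit sets over M (minus), Z (zero) and P (plus). */
typedef unsigned Sign;

enum {
    SIGN_M   = 1,
    SIGN_Z   = 2,
    SIGN_P   = 4,
    SIGN_MZ  = SIGN_M | SIGN_Z,
    SIGN_MP  = SIGN_M | SIGN_P,
    SIGN_ZP  = SIGN_Z | SIGN_P,
    SIGN_TOP = SIGN_M | SIGN_Z | SIGN_P
};

/* a ⊑ b holds when every sign allowed by a is also allowed by b. */
int sign_leq(Sign a, Sign b)
{
    return (a & ~b) == 0;
}

/* Least upper bound: the union of the allowed signs. */
Sign sign_join(Sign a, Sign b)
{
    return a | b;
}

/* Abstraction of a concrete integer. */
Sign sign_of(int n)
{
    return n < 0 ? SIGN_M : (n == 0 ? SIGN_Z : SIGN_P);
}
```

With this encoding, sign_leq(SIGN_P, SIGN_MP) holds, matching the slide's example P ⊑ MP, and joining MZ with P gives ⊤.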
Domain Design (Resource Heap)
The natural definition of a resource heap would be
– resource id → (allocation state, resource state)
But we are interested only in the resources related to the function being inferred.
– constrained heap
– heap update history
[Figure: examples of the heap ordering (⊒) over the constrained heaps {}, {(r1,AL,O)} and {(r1,AL,O),(r2,NA,C)} and the update histories H·[r1 ↦ (AL,O)] and H·[r1 ↦ (NA,C)]·[r1 ↦ (AL,O)]; concretize({(r1,AL,O)}) = {h | h(r1) = open}.]
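Domain Design (Set of Paths)
(The Greek symbols are missing from this transcript; α, ρ and η stand in for sign, resource id and heap type variables, Γ for the environment and Π for output paths.)
Input Path (A)
– a set of constraints over all type variables (input partition), e.g. { α1 ⊑ P, ρ1 ⊑ RC, η ⊑ {(ρ1,AL,O)} }
– order is defined as: ⊢ A1 ⊑ A2 if A1 ⊢ A2
Output Paths (Π)
– set of outputs: { (v1,Γ1,H1), …, (vn,Γn,Hn) }
– order is defined as: A ⊢ Π1 ⊑ Π2 if ∀(v,Γ,H) ∈ Π1. ∃(v',Γ',H') ∈ Π2. A ⊢ v ⊑ v' ∧ A ⊢ Γ ⊑ Γ' ∧ A ⊢ H ⊑ H'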
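Input Path Partitioning / Merging
[Figure: typing of x > 0 where the type of x is bound only by ⊤ and the heap by {}. The input path is partitioned into one path where the sign of x is ⊑ P and one where it is ⊑ MZ; each partition has its own output paths, and the outputs for the unpartitioned input are merged with ⊔.]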
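The partitioning on x > 0 can be illustrated on sign sets alone. This is a sketch under the assumption that signs are encoded as bit sets (1 = minus, 2 = zero, 4 = plus); it leaves out the resource id and heap constraints that the type system partitions as well. SignSet, Partition, split_gt_zero and merge are hypothetical names.

```c
/* Sign sets: bit 1 = minus (M), bit 2 = zero (Z), bit 4 = plus (P). */
typedef unsigned SignSet;

enum { S_M = 1, S_Z = 2, S_P = 4, S_MZ = S_M | S_Z, S_TOP = 7 };

/* The two input partitions produced by the test x > 0. */
typedef struct {
    SignSet then_sign;  /* sign of x on the path where x > 0 holds */
    SignSet else_sign;  /* sign of x on the path where x > 0 fails */
} Partition;

/* Partition the sign of x: the true path keeps only P, the false path
 * keeps only M and Z. An empty set (0) marks an infeasible path. */
Partition split_gt_zero(SignSet x)
{
    Partition p;
    p.then_sign = x & S_P;
    p.else_sign = x & S_MZ;
    return p;
}

/* Merging the two partitions recovers the sign of the unpartitioned input. */
SignSet merge(Partition p)
{
    return p.then_sign | p.else_sign;
}
```

Splitting ⊤ gives P on the true path and MZ on the false path, the two bounds that appear on the slide; splitting Z leaves the true path empty, so only the false path needs to be analyzed.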
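close x
Γ(x) = (ρ, α), input path { α ⊑ ⊤, ρ ⊑ ⊤, η ⊑ {} }
Input partitions and their results:
– α ⊑ ⊤, ρ ⊑ RC, η ⊑ {(ρ,AL,O)} : { (Z, Γ, η·[ρ ↦ (AL,C)]) }
– α ⊑ ⊤, ρ ⊑ RC, η ⊑ {(ρ,AL,C)} : error "not opened"
– α ⊑ ⊤, ρ ⊑ RC, η ⊑ {(ρ,NA,⊤)} : error "not allocated"
– α ⊑ ⊤, ρ ⊑ NR, η ⊑ {} : error "not resource"
Typing rule: A, Γ, H ⊢ close x : { (Z, Γ, H·[R ↦ (AL,C)]) } if Γ(x) = (R, D), A ⊢ R ⊑ RC and A ⊢ H ⊑ {(R,AL,O)}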
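open x
Γ(x) = (ρ, α), input path { α ⊑ ⊤, ρ ⊑ ⊤, η ⊑ {} }
Input partitions and their results:
– α ⊑ ⊤, ρ ⊑ RC, η ⊑ {(ρ,AL,C)} : { (P, Γ, η·[ρ ↦ (AL,O)]), (Z, Γ, η) }
– α ⊑ ⊤, ρ ⊑ RC, η ⊑ {(ρ,AL,O)} : error "not closed"
– α ⊑ ⊤, ρ ⊑ RC, η ⊑ {(ρ,NA,⊤)} : error "not allocated"
– α ⊑ ⊤, ρ ⊑ NR, η ⊑ {} : error "not resource"
Typing rule: A, Γ, H ⊢ open x : { (P, Γ, H·[R ↦ (AL,O)]), (Z, Γ, H) } if Γ(x) = (R, D), A ⊢ R ⊑ RC and A ⊢ H ⊑ {(R,AL,C)}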
Domain Design (Function Type)
A set of input path (A) / output paths (Π) pairs:
– ∀α,ρ,η. { (A1,Π1), …, (An,Πn) }
– order is defined as: A ⊢ ts1 ⊑ ts2 if
  ∀(A2,Π2) ∈ ts2. ∃(A1,Π1) ∈ ts1. ⊢ A2 ⊑ A1 ∧ A2 ⊢ Π1 ⊑ Π2, and
  ∀(A1,Π1) ∈ ts1. ∃(A2,Π2) ∈ ts2. ⊢ A1 ⊑ A2 ∧ A1 ⊢ Π1 ⊑ Π2
Typing Example
Types of the primitives:
open x : ∀α,ρ,η. { ρ ⊑ RC, η ⊑ {(ρ,AL,C)} } → { ((ρ,P), η·[ρ ↦ (AL,O)]), ((ρ,Z), η) }
close x : ∀α,ρ,η. { ρ ⊑ RC, η ⊑ {(ρ,AL,O)} } → { ((NR,Z), η·[ρ ↦ (AL,C)]) }
use x : ∀α,ρ,η. { ρ ⊑ RC, η ⊑ {(ρ,AL,O)} } → { ((NR,Z), η) }
[Figure: step-by-step typing of a function f(x) whose body contains x = open x, the test x > 0, use x, close x and a call f(x). The input path is partitioned at x > 0 (x : (ρ,P) on one path, x : (ρ,Z) on the other), the heap update histories ·[ρ ↦ (AL,O)] and ·[ρ ↦ (AL,C)] are accumulated along each path, and the inference is iterated until it reaches a fixpoint.]
Soundness
Theorem 1 [Correctness of Type System] If a configuration C is typed, then either C is finished or it proceeds without a type error.
– Two main lemmas: subject reduction & progress
Theorem 2 [Correctness of Algorithm] If I(A,Γ,H,e) = { (A1,Π1), …, (An,Πn) }, then Ai,Γ,H ⊢ e : Πi.
Implementation
We have implemented a prototype and experimented with it on some C programs.
The prototype extends the algorithm in the paper:
– Partitions input constraints more lazily.
– Handles global variables and heap storage.
– Detects resource leaks.
Ongoing and future work
Type based dynamic allocation
Multiple error message
Resource type based slicing
Modular pointer analysis specialized for this problem
Specification language
Conclusion
We formalized a sound path-sensitive analysis for resource usage protocols.
Our analysis is modular; the analysis summarizes each function as a type scheme, without using any user annotations.
In the paper, we also showed how to handle dynamic resource allocation and aliases.
Related Works
Path insensitive verification: actions in the finite automata specification are limited to syntactically identifiable sets
– Resource Usage Analysis (Igarashi & Kobayashi)
– Vault (DeLine & Fahndrich)
Path sensitive but whole program analysis
– SLAM (Ball & Rajamani @ MSR), BLAST (Henzinger et al. @ UCB)
  – C2BP, then model checking
– ESP (Das et al. @ MSR)
  – Ideas of selective join
  – Lighter weight than SLAM/BLAST, but still whole program analysis
Path sensitive and modular, but unsound
– Saturn (Yichen Xie, Alex Aiken @ Stanford)
  – Program constructs → bit level boolean constraints (equations)
  – Inference → SAT solving
  – Unsound: assumes no alias between arguments, finite loop unrolling
  – Blind summary: not symbolic (their optimization: slicing the query dependent part after whole equation generation)
Ongoing and future work
Type based dynamic allocation
– η ⊑ {(ri,NA,X)} → η·[][ri ↦ (AL,Y)]
– η ⊑ {(ρ,NA,X)} ―alloc({ρ})→ η·[][ρ ↦ (AL,Y)]
– ri is the program point of alloci
– now is the program point of the allocator function (instantiated)
Multiple error message
– Better error recovery algorithm to remove multiple false alarms caused by one bug
Resource type based slicing
– In the GCC package of the SPEC95 benchmark, there is a function that opens 15 files concurrently (2^15 = 32768 paths), but if we slice it based on the FILE* type, then we can safely reduce the complexity of inference to 2×15 = 30
Pointer / structure / array
– Modular pointer analysis specialized for this problem
Specification language
Alias
[Figure: the x > 0 partitioning example extended with a second variable fp. One partition carries fp:(ρ1,α1) and the other fp:(ρ2,α2); the two partitions cannot be combined.]
ρ1 # ρ2 by the no interprocedural alias assumption