WALA Tutorial at PLDI 2010

93

description

This is the tutorial that we delivered at PLDI 2010

Transcript of WALA Tutorial at PLDI 2010

Page 1: WALA Tutorial at PLDI 2010
Page 2: WALA Tutorial at PLDI 2010

What is WALA? ! Java libraries for static and dynamic program

analysis !  Initially developed at IBM T.J. Watson Research

Center ! Open source release in 2006 under Eclipse

Public License ! Key design goals

• Robustness • Efficiency • Extensibility

Page 3: WALA Tutorial at PLDI 2010

(Some) Previous Uses of WALA

!  Research •  over 40 publications from 2003-present •  Including one at PLDI’10 (MemSAT) •  http://wala.sf.net/wiki/index.php/Publications.php

!  Products •  Rational Software Analyzer: NPEs (Loginov et al.

ISSTA’08), resource leak detection (Torlak and Chandra, ICSE’10)

•  Rational AppScan: taint analysis (Tripp et al., PLDI’09), string analysis (Geay et al., ICSE’09)

•  Tivoli Storage Manager: Javascript analysis •  WebSphere: analysis of J2EE apps

Page 4: WALA Tutorial at PLDI 2010

WALA Features: Static Analysis

!  Pointer analysis / call graph construction •  Several algorithms provided (RTA, variants of

Andersen’s analysis) •  Highly customizable (e.g., context sensitivity policy) •  Tuned for performance (time and space)

!  Interprocedural dataflow analysis framework •  Tabulation solver (Reps-Horwitz-Sagiv POPL’95) with

extensions •  Also tuned for performance

!  Context-sensitive slicing framework •  With customizable dependency tracking

Page 5: WALA Tutorial at PLDI 2010

Other Key WALA Features !  Multiple language front-ends

•  Bytecode: Java, .NET (internal) •  Source (CAst): Java, Javascript, X10, PHP (partial,

internal), ABAP (internal) •  Add your own!

!  Generic analysis utilities / data structures •  Graphs, sets, maps, constraint solvers, …

!  Limited code transformation •  Java bytecode instrumentation via Shrike •  But, main WALA IR is immutable, and no code gen

provided "  designed primarily for computing analysis info

Page 6: WALA Tutorial at PLDI 2010

What We’ll Cover ! Overviews of main modules

•  Important features • Key class names • How things fit together • How to customize

! “Deep dives” into real code •  Interprocedural dataflow analysis example • CAst Javascript front-end

Page 7: WALA Tutorial at PLDI 2010

How to get WALA !  Walkthrough on “Getting Started” page at wala.sf.net

!  Code available in SVN repository •  Trunk or previous tagged releases •  Split into several Eclipse projects, e.g., com.ibm.wala.core, com.ibm.wala.core.tests

!  Dependence on Eclipse •  Easiest to build / run from Eclipse, but command line

also supported •  Runtime dependence on some Eclipse plugins

(progress monitors, GUI functionality, etc.); must be in classpath

Page 8: WALA Tutorial at PLDI 2010

GENERAL UTILITIES

Page 9: WALA Tutorial at PLDI 2010

!"#"$%&'&$(')*+'*),-

!"#$%&#'

()*+,'$*-.$/012)"#,3'

4"5+2"-#$6*#*7028$%209&)'

Page 10: WALA Tutorial at PLDI 2010

!"#$%&!$'&()'&*)+,)#)-'"'$.-#

!"#$%#

$&'()%!"#$%#

*+#',-%!"#$%#

*+#',-%$&'()%!"#$%# ./#0%1#2(

!!!!!!

34 5 "!!!!!!

,64, ,6"

,')%78,9

:;;)%#./#0%1#2(

!!!!!!5 "<=3

./#0%1#2(.')%

Page 11: WALA Tutorial at PLDI 2010

!"#$%&'(%)'*+)',+-$+.+/)0)%1/

!"#$%&'()$*+',-#('# .//+'#01#2'3#4*

('51()$*+'!"#$%&',-#('#

*-2%)'#%).+)')1'.03+'.-04+'5.%/6'&+/.+'0/&'.-0$.+'-0$).

! 7+/.+'81$&.9':;0<'=';%/>'?'#%).'-+$'81$&

! *-0$.+'81$&.9'/5;#+$'1@'.+)'#%).

! A024520)+'#+.)'5.+'1@'0'.%/62+'&+/.+'-1$)%1/

! ,+#020/4+'1/';5)0)%1/B'0;1$)%C%/6')1'.03+'41.)

!"#$!%&#$' (%)!%&#$'

Page 12: WALA Tutorial at PLDI 2010

!"#$%&'()*'!%*'+%,$%-%.*#*)/.

!"#$%&'()$*+',-#('#./#0'1#2*

!"#$%&'(3$*'4./#0'1#2*,-#('#

./#0'1#2*5')2+/#2*6

!"#$%&'&()*(+,)&($+-./(*01

!#0%'-,#1%'23'-"#$).4'1/55/.',/$*)/.-'/6'2)*'-%*-

! !*#*%'-,7)*').*/'1/55/.'#.&',$)0#*%',/$*)/.-

! +%,/-)*/$3'5#.#4%-'-%*'/6'1/55/.',/$*)/.-

! 8/55/.',/$*)/.-'1/5%'#.&'4/'/.'&%5#.&

Page 13: WALA Tutorial at PLDI 2010

!"#$%%&'%()&*(+"*,

!"#$%%&'%(')*")

+",) -&%.)

! -".'*(*/(0"12(3"#$%('"4(#&*,"*(&521"5"'*.*&/',

" 6.7.5"*"7&8"3(#9(*4/(/*0"7(&521"5"'*.*&/',

" :,,"7*(*4/(&521"5"'*.*&/',(%&;"(,.5"(7",$1*,

" <.=*/79(&'*"7>.="(.11/4,($,"(.,(,*.'3.73(#&*,"*,

" </7(3";"1/25"'*(/'19?(5.@/7(*&5"(.'3(,2.="(=/,*,

/$)0#+"(')*")

/$)0#+"(')*")

!!"#$%&'!$((%)*+!*,,-&+.!&/!0!!!!$((%)*+!"1!2!%)3.4*,,-&/5!!!!$((%)*+!61!2!1&78.4*,,-&/5

!!!!&3!-"1!92!61/!0!!!!!!*66)1.!"1!22!615!!!!:

!!!!1).#1+!"15!!:

Page 14: WALA Tutorial at PLDI 2010

!"#$%&'(")*&+,)(,#,-.".$/-

!"#$%&'&($)*+, -#($%&'&($)*+,

.)&/0*+,

!"#$%"&$'()*+"#$%"&$,-

+."*/#"0123#$450&6#7,-

8&+6*%660&6#,(*.-

3&&9#%.*:&."%+.70&6#,(*.-

!"#$%"&$'()*/#";$#60&6#7,(*.-

+."*/#";$#60&6#<&1.",(*.-

!"#$%"&$'()*/#"=1::0&6#7,(*.-

+."*/#"=1::0&6#<&1.",(*0-

8&+6*%66>6/#,(*7$:?*(*67"-

3&&9#%.*@%7>6/#,(*7$:?*(*67"-

Page 15: WALA Tutorial at PLDI 2010

!"#$%&%'()&*+,(

-%+&%.%/0*012/

!"#$%&%'!('%)*+*,%&-./

!"#$%&%'0',%)*+*,%&-./

!"#$%&%'1&*23-./

!"#$%&#'()*&+,-$'.

-$%&#'/0&,!"#$"()*&+.

!"#$%&#123'()*&+,.

4#&+2#/+5-6$!#&+2#&'/0&7,4"#8&#$7.

4"#8&#$%&#8(99'/0&'()*&+7,-$"/0&.

4"#8&#$%&#:+&0'/0&'()*&+7,-$"/0&.

Page 16: WALA Tutorial at PLDI 2010

!"#$%$&'()"*+',$*)$-$./"/01.

!"#$%$&'&($)"*"($+,-./01

234#$+$&!"#$%$&'&($)"*"($+,-.01!"#$%$&5+"67,-.01

!"#$%&$'()*%+(,$*-./%$0(%10234"#$%50$6716$8-3"79"!"*(,$*./%$0(%102:"$;%$<68"!4"#$%50$6+(,$*8-3"7.=<%"#$%50$6716$>1)<%-3"79"!"*(,$*./%$0(%102:"$;%$<68"34"#$%?)@@716$8-3"79"!"*(,$*./%$0(%102:"$;%$<68"!4"#$%?)@@+(,$*8-3"7.=<%"#$%?)@@716$>1)<%-3"79"!"*(,$*.A1=6"(66B6#$-3"80@9"3"68%9"!"*(,$*.,11*$(<"C(8B6#$-3"80@9"3"68%9"!"*(,$*.?$%2:"$;%$<68"!4"#$%B6#$+(,$*8-3"80@9"3"68%.

Page 17: WALA Tutorial at PLDI 2010

!"#"$%&'!$()*'+)"$(,%-#./'

0$"(1,*'2%$.,'3"($&*

!"#$%&'(%)'

!)*+,&,!"#$%&'(%)'

(

- .

/

0

1

2

( - /

( - .

( - /

/ 0

( - . / 0 1 2

!"#3(%4"5+,&'

Page 18: WALA Tutorial at PLDI 2010

!"#"$%&'!$()*'+)"$(,%-#./

0"),*'1%$.,'2"($&*

!"#$%&'"()*+,

!"#!(-./0+,1(2+3&+,%&/,

!"#"()(-'1(2+3&+,%&/,

4526+,+*!"#!(-./0+,1(2+3&+,%&/,

4526+,+*!"#"()(-'1(2+3&+,%&/,

%

7 8

6

9

:

;

% 7 6% 7 9 % ; 6: 6

% 7 9 : ; 6 8

6 ; : 9 7 8 %

Page 19: WALA Tutorial at PLDI 2010

!"#"$%&'!$()*'+)"$(,%-#./

011.

!""#$%&'$(&

Page 20: WALA Tutorial at PLDI 2010

!"#$%&'()*"+,%-.

!"#$%&'"()

*+#,-(-.!"#$%&'"()

!"#$%&%/-0("%'$-()

Page 21: WALA Tutorial at PLDI 2010

!"#"$%&'()*+#,-+

!"#$$%&$$'(&)*+,(-

.*/'$,+0(&1234 .*/5(&678*('96:/#,6;(,<#%*(34

83<!"4=!=<#$%

83<!"&'''4=!=<#$%

!5(&678*('96:/#,6;(,<#%*(

83<!"4=!=<#$%

.*/>**/?1*(&/,(34

.*/@,%*5(&678*('96:/#,634

.*/A%.*5(&678*('96:/#,634

Page 22: WALA Tutorial at PLDI 2010

!"#"$%&'()*"+,%-

!

"

# $

%

&

' (

)

*

))

+ ,

)

-

.

/ 0

12345678)

12345678*

9:77578:338;:3<57

!485:=>8?@A5B846:?715681<?=42@?8:AA782478?<CD56

!"#$%&'()*+$(,*#%-.#/#)&+01'2/

Page 23: WALA Tutorial at PLDI 2010

!"#"$%&'()*"+,%-!"#$%&'()*+,-!.+*/0+12%/3.&/0+3*+%*#$.4'5./607.8+*)9:.+;")&/%0)<=/+%)5')07.>'?''+./"+)').@'2%/3.&/0+()%0)A0)9/*)/<''''B*$".9C5./D*!!.7E)7.F<)07.>>G'H

!"#$%&'()*+,-!.+*/0+12%/3.&/0+3*+%*#$.4'5./I75.8+*)9:.+;")&/%0)<=/+%)5':+0JK'=/+%)5'/0>'?''%:'<:+0J'LL')07.9MNO'PP'/0'LL')07.9MQO>''''+./"+)').@'2%/3.&/0+;%$/.+<R.+0<>>G''.$9.'%:'<:+0J'LL')07.9MNO'PP'/0'LL')07.9MSO>''''+./"+)').@'2%/3.&/0+;%$/.+<0).<>>G''.$9.'?''''+./"+)'2%/3.&/0+E7.)/%/,C%)9/*)&.<>GHH

!"#$%&'T#9/+*&/D../-!.+*/0+12%/3.&/0+3*+%*#$.4'5./D../-!.+*/0+<>'?''+./"+)'2%/3.&/0+()%0)C%)9/*)&.<>G'H

Page 24: WALA Tutorial at PLDI 2010

!"#"$%&'()*"+,%-

!

"

# $

%

&

' (

)

*

))

+

)

,

-

.

/

0

!"#$%&'()%*%+%(%,!"#$%-'.)%*%+%(%.%,!"#$%/'0)%*%+%(%0%,!"#$%1'2)%*%+%.%2%,!"#$%3'4)%*%+%(%.%0%2%4%,!"#$%5'6)%*%+%(%.%0%2%4%6%,!"#$%7'8)%*%+%8%,!"#$%9':)%*%+%:%,

Page 25: WALA Tutorial at PLDI 2010

INTERMEDIATE REPRESENTATION

Page 26: WALA Tutorial at PLDI 2010

!"#$%&'()*+,

!"#$%&'()

!"

!"#$%&'!$()*+,-.*/($0(,-11"2*3(4*/15

*+(,-.!"#$%&'()

/0&!"#$%&'()

Page 27: WALA Tutorial at PLDI 2010

!"#$%&'(%'&)

!"

#$%&%'#(%&&%#)*

!+%#$

,-%&%.'#$/#0*

,0%&%12#345%56789:%#(/#)

!+%#$!+%#$

;5<7;2%#-

=>)

?>@

$>-

=

(

)

$

0

-

?

A

@

BCD

Page 28: WALA Tutorial at PLDI 2010

!"#$%&'$()"*+,-.#

!!"#$%&'()&*+$

!!",+&+#$%&'()&*+$

!!"-+./0'1#$%&'()&*+$

!!"-+$21'%*+$#$%&'()&*+$

!!"3*145"))1%%#$%&'()&*+$

!!",1&#$%&'()&*+$ !!"6(&#$%&'()&*+$

!!""7%&'0)&#$2+81#$%&'()&*+$

!!"#$2+81#$%&'()&*+$

Page 29: WALA Tutorial at PLDI 2010

!"#$%&'$()"*+,-.#

!!"#$%&'()&*+$

!!"",%&'-)&#$.+/0#$%&'()&*+$

!!"#$.+/0#$%&'()&*+$

1-.-!)'*2&#$.+/0

",%&'-)&304*)-5#$.+/0

6(5&*70&('$8-5(0#$.+/0#$%&'()&*+$

Page 30: WALA Tutorial at PLDI 2010

!"#$%&'(%'&)*#+,#-./)0

! +,#1./)0#2&.3,/)#/,0%,1(%#345')#1'67)&0#,1#(.1%)8%

" 9#:(.2;<#.=#4#345')#,1#4#/,0%,1(%#(.1%)8%

" )>?>#,10,/)#4#(.1/,%,.145#.&#5..2

! @0)/#%.#/)1.%)#2&)(,0)#,1=.&64%,.1

" )>?>#45,40,1?#A,%B#.%B)&#345')0

" )>?>#2&)(,0)#%;2)

! $2)(,=,)/#7;#2.5,(;#/'&,1?#!"#(&)4%,.1

!!"#$%&'(#&)$*+

,-).( $/012.*1$&/

3(1#$4$/012.*1$&/5

Page 31: WALA Tutorial at PLDI 2010

!"#$%&'(%'&)*#+,#-./)0!"#$#%!&#$$#!'(

)*#!"

+,#$#-%!".!/(

+/#$#01!234#456789#!&.!:

!:#$#;%!'(#

)*#!"

<4=6<1#!,

>?'

:?@

"?,

>

&

'

"

/

,

:

A

@

BCD

!A#$#;%!'(#

=<64#E<71FG

*7894#E<71FG

Page 32: WALA Tutorial at PLDI 2010

!"#$%&'&%&()*#+(,$)(

!"

#$%&%'#(%&&%#)*

!+%#$

,-%&%.'#$/#0*

,0%&%12#345%56789:%#(/#)

!+%#$!+%#$

;5<7;2%#-

=

(

)

$

0

-

>

?

@

A5+B:5

C5<A5+'#*

C5<B:5:'#*

D5<A5+'$*%E

D5<B:5:'$*%E

Page 33: WALA Tutorial at PLDI 2010

!"#$%&'&%&()*#+,-

!"#

$"%

&"'

()* +,-./

$"%

$"%

!"#!"#

&"'

0

Page 34: WALA Tutorial at PLDI 2010

!"#$%&'&%&()*#+,-(#!./(0(.1(

234-5%(#43)%#-0(1&)(#643)%#7(.(08'#%,-(6

! $)()#9(1'80(9#%,-()#8.9#3%:(0#;.3<.#%,-()

! (=7=#13.10(%(#%,-()#/034#13.)%8.%)

! (=7=#13.10(%(#%,-()#/034#8''318%&3.

!.%(0/81(#8''3<)#5)(#8103))#'8.7587()

new TypeInference(ir, doPrimitives); getType(vn)

!"#$%&#''#()

%*#'#+$%,-%.)

%.#'#/01#2345

%,#'#/01#2677

(

&

,

.

*

8

677 345

9:;05

%,'677-#%.'345-#%*'9:;05

Page 35: WALA Tutorial at PLDI 2010

!"#$%&'()*#+,-.#$%&'()#/01

!" !#$%&'(!"

)*%#$%&'(

+'*,%,'-

.$%#$%&'(/0

,-*%123%,'-+'*,%,'-/,-($40

5$%6,1*%7,-$/0

5$%6,1*%8'9/0

5$%6,1*%:;;*$%/0

5$%7<*%7,-$/0

5$%7<*%8'9/0

5$%7<*%:;;*$%/0

5$%="7/0

,-*%123%,'-+'*,%,'->%<?$*><->';;*$%>,->%&$>,-*%123%,'-><11<@

Page 36: WALA Tutorial at PLDI 2010

!"#$%&'()*#+,-)(%.)#/01

!" !#$%&'(!"

!)*%$+'($#$%&'(

,$%#$%&'(-.

/$%)*%$+'($!0($1-20%./$%320$4567$8-20%.

,$%)*%$+'($!0($19%:;$<9:09'==<$%9209%&$920<%85+%2'09:88:*

Page 37: WALA Tutorial at PLDI 2010

!"#$%&'()*#+%(,-#.,/)0

!"

#$%&'()*+),$-./0%12/0%3

!0-%45(%/'02/06$7 &'()*28)*5$205,9$4

:44);2'<20),$-

Page 38: WALA Tutorial at PLDI 2010

SCOPES AND CLASS HIERARCHIES

Page 39: WALA Tutorial at PLDI 2010

Building a Call Graph buildCG(String jarFile) { // represents code to be analyzed AnalyisScope scope = AnalysisScopeReader (1) .makeJavaBinaryAnalysisScope(jarFile, null); // a class hierarchy for name resolution, etc. IClassHierarchy cha = ClassHierarchy.make(scope); (2) // what are the call graph entrypoints? Iterable<Entrypoint> e = (3) Util.makeMainEntrypoints(scope, cha); // encapsulates various analysis options AnalysisOptions o = new AnalysisOptions(scope, e); (4) // builds call graph via pointer analysis CallGraphBuilder builder = (5) Util.makeZeroCFABuilder(o, new AnalysisCache(), cha, scope); CallGraph cg = builder.makeCallGraph(o, null); (6) }

Page 40: WALA Tutorial at PLDI 2010

AnalysisScope !  Represents a set of files to analyze

!  To construct from classpath: AnalysisScopeReader.makeJavaBinaryAnalysisScope()

!  To read info from scope text file: AnalysisScopeReader.readJavaScope() •  Each line of scope file gives loader, lang, type, val

"  E.g., “Application,Java,jarFile,bcel-5.2.jar” "  Common types: classFile, sourceFile, binaryDir, jarFile "  Examples in com.ibm.wala.core.tests/dat

!  Exclusions: exclude some classes from consideration •  Used to improve scalability of pointer analysis, etc. •  Also specified in text file; see, e.g., com.ibm.wala.core.tests/dat/GUIExclusions.txt

Page 41: WALA Tutorial at PLDI 2010

Background: Class Loaders !  In Java, a class is identified by name and class loader

•  E.g., < Primordial, java.lang.Object >

!  Class loaders form a tree, rooted at Primordial

!  Name lookup first delegates to parent class loader •  So, can’t write an app class java.lang.Object

!  User-defined class loaders can provide isolation •  Used by Eclipse for plugins, J2EE app servers

!  WALA naming models class loaders

Page 42: WALA Tutorial at PLDI 2010

Multiple Names in Bytecode // this is java.lang.Object class Object { public String toString() { … } } // no overriding of toString() class B extends Object {} class A extends B {}

Legal names in bytecode: <Application, A, toString()>, <Application, B, toString()>, <Application, java.lang.Object, toString()>, <Primordial, java.lang.Object, toString()> Resolved entity: <Primordial, java.lang.Object, toString()>

Page 43: WALA Tutorial at PLDI 2010

WALA Name Resolution

Entity Reference Type Resolved Type

Resolver Method

class TypeReference IClass lookupClass()

method MethodReference IMethod resolveMethod()

Field FieldReference IField resolveField()

Entity references resolved via IClassHierarchy

Page 44: WALA Tutorial at PLDI 2010

More on class hierarchies !  For Java class hierarchy: ClassHierarchy.make(scope)

!  Supports Java-style subtyping (single inheritance, multiple interfaces)

!  Necessary for constructing method IRs, since only resolved IMethods have bytecode info

!  Watch out for memory leaks! •  Resolved entities (IClass, IMethod, etc.) keep

pointers back to class hierarchy •  In general, use entity references in analysis results

Page 45: WALA Tutorial at PLDI 2010

INTERPROCEDURAL DATAFLOW ANALYSIS

Page 46: WALA Tutorial at PLDI 2010

Tabulation-Based Analysis

!  “Functional approach” to context-sensitive analysis (Sharir and Pnueli, 1981)

!  Tabulates partial function summaries on demand

!  Some enhancements in WALA’s implementation •  Multiple return sites for calls (for exceptions) •  Optional merge operators •  Handles partially balanced problems •  Customizable worklist orderings

(Reps, Horwitz, Sagiv, POPL95)

Page 47: WALA Tutorial at PLDI 2010

T

Tabulation Overview

TabulationSolver

IFlowFunctionMap Edge flow functions for

supergraph

TabulationDomain Provides numbering of

domain facts

TabulationResult

ISupergraph Supergraph over which analysis is

computed

Seeds Initial path edges for

analysis TabulationProblem

Page 48: WALA Tutorial at PLDI 2010

Supergraph !  Collection of “procedure graphs,” connected by calls

•  ICFGSupergraph: procedure graphs are CFGs •  SDGSupergraph: procedure graphs are PDGs

!  Example call representation for ICFGSupergraph •  (For general supergraph , possibly many calls, returns, entries, exits)

EXC. HANDLER

ENTRY CALL

EXIT RETURN

CALLER CALLEE

EXIT

Normal call-to-ret

Exc. call-to-ret

Normal call-to-entry

Normal ret-to-exit

Exc. ret-to-exit

Page 49: WALA Tutorial at PLDI 2010

Domain / Flow Functions / Seeds !  TabulationDomain

•  Maintains mapping from facts to integers •  Controls worklist priorities (optional)

!  IFlowFunctionMap •  Flow functions on supergraph edges •  All functions map int (or two ints) to IntSet

(via TabulationDomain) •  Function for each type of edge (normal, call->return,

call->entry, exit->return) •  Also, call->return function for “missing” calls

"  For handling missing code, CG expansion (Snugglebug)

!  Seeds (TabulationProblem.initialSeeds()) •  Depends on problem / domain representation

Page 50: WALA Tutorial at PLDI 2010

Partially Balanced Problems !  For when flows can start / end in a non-entrypoint

•  E.g., slice from non-entrypoint statement

!  PartiallyBalancedTabulationProblem •  Additional “unbalanced” return flow function for

return without a call •  getFakeEntry(): source node for path edges of

partially balanced flows

!  Compute with PartiallyBalancedTabulationSolver

!  Examples: ContextSensitiveReachingDefs, Slicer

Page 51: WALA Tutorial at PLDI 2010

Debugging Your Analysis !  IFDSExplorer

•  Gives GUI view of analysis result •  Needs paths to GraphViz dot executable and PDF

viewer

!  Set VM property com.ibm.wala.fixedpoint.impl.verbose to true for occasional lightweight info

!  Increase TabulationSolver.DEBUG_LEVEL for detailed info

Page 52: WALA Tutorial at PLDI 2010

Deep Dive: Reaching Defs !  The classic dataflow analysis, for Java static fields

!  Three implementations available in com.ibm.wala.core.tests •  IntraprocReachingDefs

"  Uses BitVectorSolver, a specialized DataflowSolver •  ContextInsensitiveReachingDefs

"  Uses BitVectorSolver over interprocedural CFG •  ContextSensitiveReachingDefs

"  Uses TabulationSolver

!  We’ll focus on ContextSensitiveReachingDefs

Page 53: WALA Tutorial at PLDI 2010

Example

class StaticDataflow { static int f, g; static void m() { f = 2; } static void testInterproc() { f = 3; m(); (1) g = 4; m(); (2) } }

Context-sensitive analysis should give different result after (1) and (2)

Page 54: WALA Tutorial at PLDI 2010

The Domain and Supergraph

!  Supergraph: ICFGSupergraph •  Procedures: call graph nodes (CGNode)

"  In context-sensitive call graph, possibly many CGNodes for one Imethod

•  Nodes: BasicBlockInContext<IExplodedBasicBlock> "  “exploded” basic block has at most one instruction (eases

writing transfer functions) "  BasicBlockInContext pairs BB with enclosing CGNode

!  Domain (static field writes): Pair<CGNode,Integer> •  Integer is index in IR instruction array; only valid way to

uniquely identify IR instruction •  ReachingDefsDomain extends MutableMapping to

maintain fact numbering

Page 55: WALA Tutorial at PLDI 2010

Flow Functions (1) !  Normal flow for non-putstatics is IdentityFlowFunction.identity()

!  Most call-related flow functions are also identity •  Since static fields are in global scope

!  Call-to-return function is KillEverything.singleton() •  Defs must survive callee to reach return

Page 56: WALA Tutorial at PLDI 2010

Flow Functions (2)

public IntSet getTargets(int d1) { IntSet result = MutableSparseIntSet.makeEmpty(); // first, gen this statement int factNum = domain.getMappedIndex(Pair.make(node, index)); result.add(factNum); // if incoming statement defs the same static field, kill it; // otherwise, keep it if (d1 != factNum) { // must be different statement IField sf = cha.resolveField(putInstr.getField()); Pair<CGNode, Integer> other = domain.getMappedObject(d1); SSAPutInstruction otherPut = getPutInstr(other); IField otherSF = cha.resolveField(otherPut.getField()); if (!sf.equals(otherSF)) { result.add(d1); } } return result; }

Normal flow function for putstatic (modified for formatting / clarity)

Page 57: WALA Tutorial at PLDI 2010

Seeds !  Standard tabulation approach: special ‘0’ fact

•  Add ‘0 -> 0’ edge to all flow functions •  Seed with (main_entry, 0) -> (main_entry, 0)

!  Our approach: partially balanced tabulation •  For field write numbered n in basic block b of method

m, add seed (m_entry, n) -> (b, n) "  (source fact doesn’t matter)

•  Unbalanced flow function is just identity •  Advantage: keeps other flow functions cleaner •  See ReachingDefsProblem.collectInitialSeeds()

Page 58: WALA Tutorial at PLDI 2010

Putting it all Together !  ReachingDefsProblem collects domain, supergraph,

flow functions, seeds

!  Running analysis (simplified): PartiallyBalancedTabulationSolver solver = new PartiallyBalancedTabulationSolver( new ReachingDefsProblem()); TabulationResult result = solver.solve();

!  In the real code: •  Lots of long generic type instantiations (sigh) •  Handling CancelException (enables cancelling

running analysis from GUI)

Page 59: WALA Tutorial at PLDI 2010

CALL GRAPHS / POINTER ANALYSIS

Page 60: WALA Tutorial at PLDI 2010

Call Graph Builder Overview

CallGraphBuilder

Context Selector What context to use

when analyzing call to some method?

Heap Model How should objects and pointers be abstracted?

CallGraph

AnalysisOptions Specifies entrypoints,

how to handle reflection, etc.

PointerAnalysis

Page 61: WALA Tutorial at PLDI 2010

Entrypoints !  What are entrypoint methods?

•  main() method "  Util.makeMainEntrypoints()

•  All application methods "  AllApplicationEntrypoints

•  JavaEE Servlet methods, Eclipse plugin entries, …

!  What types are passed to entrypoint arguments? •  Just declared parameter types (DefaultEntrypoint) •  Some concrete subtype (ArgumentTypeEntrypoint) •  All subtypes (SubtypesEntrypoint)

Page 62: WALA Tutorial at PLDI 2010

Heap Model !  Controls abstraction of pointers and object instances

!  InstanceKey: abstraction of an object •  All objects of some type (ConcreteTypeKey) •  Objects allocated by some statement in some calling

context (AllocationSiteInNode) •  ZeroXInstanceKeys: customizable factory

!  PointerKey: abstraction of a pointer •  Local variable in some calling context

(LocalPointerKey) •  Several merged vars (offline substitution), etc.

Page 63: WALA Tutorial at PLDI 2010

Context Selector !  Gives context to use for callee method at some call site

!  Context examples •  The default context (Everywhere) •  A call string (CallStringContext) •  Receiver object (ReceiverInstanceContext)

!  ContextSelector examples •  nCFAContextSelector: n-level call strings •  ContainerContextSelector: object sensitivity for

containers

Page 64: WALA Tutorial at PLDI 2010

Built-In Algorithms

Rapid Type Analysis (RTA)

0-CFA context-insensitive, class-based heap

0-1-CFA context-insensitive,

allocation-site-based heap

0-1-Container-CFA 0-1-CFA with object-sensitive containers

Increasing precision

(Grove and Chambers, TOPLAS 2001)

For builders, see com.ibm.wala.ipa.callgraph.impl.Util

Page 65: WALA Tutorial at PLDI 2010

Performance Tips !  Use AnalyisScope exclusions

•  Often, much of standard library (e.g., GUI libraries) is irrelevant

!  Analyze older libraries •  Java 1.4 libraries much smaller than Java 6

!  Tune context-sensitivity policy •  E.g., more sensitivity just for containers

Page 66: WALA Tutorial at PLDI 2010

Code Modelling (Advanced) !  SSAContextInterpreter: builds SSA for a method

•  Normally, based on bytecode •  Customized for reflection, Object.clone(), etc.

!  MethodTargetSelector: determines method dispatch •  Normally, based on types / class hierarchy •  Customized for native methods, JavaEE, etc.

!  ClassTargetSelector: determines types of allocated objects •  Normally, type referenced in new expression •  Customized for adding synthetic fields, JavaEE, etc.

Page 67: WALA Tutorial at PLDI 2010

Refinement-Based Points-To Analysis

!  Refines analysis precision as requested by client •  Computes results on demand •  See Sridharan and Bodik, PLDI 2006

!  Implemented in DemandRefinementPointsTo •  Baseline analysis is context insensitive

"  Field sensitivity, on-the-fly call graph via refinement •  Context sensitivity (modulo recursion), other regular

properties via additional state machines •  Refinement policy can be easily customized •  Can also compute “flows to” on demand

!  Usage fairly well documented on wiki

!  See DemandCastChecker for an example client

Page 68: WALA Tutorial at PLDI 2010

SLICING

Page 69: WALA Tutorial at PLDI 2010

Slicer Overview

Slicer.compute{Forward|Backward}Slice()

CallGraph

Collection<Statement>

Statement Seed for the slicer

PointerAnalysis

DataDependenceOptions Which data deps to consider

ControlDependenceOptions Which control deps to consider

Page 70: WALA Tutorial at PLDI 2010

Statement !  Identifies a node in the System Dependence Graph (SDG)

[Horwitz,Reps,Binkley,TOPLAS’90] !  Key statement types

•  NormalStatement "  Normal SSA IR instruction "  Represented by CGNode and instruction index

•  ParamCaller, ParamCallee "  Extra nodes for modeling parameter passing "  SDG edges from def -> ParamCaller -> ParamCallee -> use

•  NormalReturnCaller, NormalReturnCallee "  Analogous to ParamCaller, ParamCallee "  Also ExceptionalReturnCaller/Callee

•  HeapParamCaller, HeapParamCallee, etc. "  For modeling interprocedural heap-based data deps "  Edges via interprocedural mod-ref analysis

Page 71: WALA Tutorial at PLDI 2010

Dependence Options !  DataDependenceOptions

•  FULL (all deps) and NONE (no deps) •  NO_BASE_PTRS: ignore dependencies for memory

access base pointers "  E.g., exclude forward deps from defs of x to y=x.f

•  NO_HEAP: ignore dependencies to/from heap locs •  NO_EXCEPTIONS: ignore deps from throw to catch •  Various combinations (e.g., NO_BASE_NO_HEAP)

!  ControlDependenceOptions •  FULL (all deps) and NONE (no deps) •  NO_EXCEPTIONAL_EDGES: ignore exceptional control

flow

Page 72: WALA Tutorial at PLDI 2010

Thin Slicing !  Just “top-level” data dependencies

(see [Sridharan-Fink-Bodik PLDI’07])

!  For context-sensitive thin slicing, use Slicer with DataDependenceOptions.NO_BASE_PTRS and ControlDependenceOptions.NONE

!  For efficient context-insensitive thin slicing, use the CISlicer class

Page 73: WALA Tutorial at PLDI 2010

Performance Tips !  Some configs do not scale to large programs

•  E.g., context-sensitive slicing with heap deps •  Discussion in [SFB07]

!  Run with minimum dependencies needed

!  Apply pointer analysis scalability tips •  Exclusions, earlier Java libraries

Page 74: WALA Tutorial at PLDI 2010

INSTRUMENTING BYTECODES WITH SHRIKE

Page 75: WALA Tutorial at PLDI 2010

Key Shrike Features !  Patch-based instrumentation API

•  Each instrumentation pass implemented as a patch •  Several patches can be applied simultaneously to original

bytecode "  No worries about instrumenting the instrumentation

•  Branch targets / exc. handlers automatically updated

!  Efficient •  Unmodified class methods copied without parsing •  Efficient bytecode representation / parsing

"  Array of immutable instruction objects "  Constant instrs represented with single shared object

!  Some ugliness hidden •  JSRs, exception handler ranges, 64k method limit

Page 76: WALA Tutorial at PLDI 2010

Key Shrike Classes

MethodEditor Core class for transforming

bytecodes via patches

ClassReader Immutable view of .class file info; reads data lazily

ShrikeCT: reading / writing

.class files

ShrikeBT: instrumenting

bytecodes

MethodData Mutable view of

method info

ClassInstrumenter Utility for instrumenting an

existing class (mutable)

ClassWriter Generates JVM

representation of a class

CTCompiler Compiles ShrikeBT method

into JVM bytecodes

Page 77: WALA Tutorial at PLDI 2010

Instrumenting A Method (1)

instrument(byte[] orig, int i) { // mutable helper class ClassInstrumenter ci = (1) new ClassInstrumenter(orig); // mutable representation of method data MethodData md = ci.visitMethod(i); (2) // see next slide; mutates md, ci doInstrumentation(md); (3) // output instrumented class in JVM format ClassWriter w = ci.emitClass(); (4) byte[] modified = w.makeBytes(); (5) }

Page 78: WALA Tutorial at PLDI 2010

Instrumenting A Method (2)

doInstrumentation(MethodData md) { // manages the patching process MethodEditor me = new MethodEditor(md); (1) me.beginPass(); (2) // add patches me.insertAtStart(new Patch() { … }); (3) me.insertBefore(j, new Patch() { … }); (4) … // apply patches (simultaneously) me.applyPatches(); me.endPass(); (5) }

Page 79: WALA Tutorial at PLDI 2010

Shrike Clients !  Small example: see com.ibm.wala.shrike.bench.Bench

!  Dila (com.ibm.wala.dila in incubator) •  Dynamic call graph construction

(CallGraphInstrumentation) •  Utilities for runtime instrumentation

"  Instrumenting class loader "  Mechanisms for controlling what gets instrumented

•  Work continues on better WALA integration / docs

Page 80: WALA Tutorial at PLDI 2010

Java Annotation Support !  Supported features

•  Reading .class file attributes •  Parsing some annotation info from attributes

"  E.g., generics (com.ibm.wala.types.generics) •  Manipulating JVM class attributes directly

"  See ClassWriter.addClassAttribute()

!  Missing features •  Higher-level APIs for modifying known annotations •  Automatic fixing of StackMapTable attribute after

instrumentation "  Speeds bytecode verification in Java 6

Page 81: WALA Tutorial at PLDI 2010

Eclipse Support !  WALA projects are Eclipse plug-ins

•  Easy to invoke from other plug-in

!  Various utilities in com.ibm.wala.ide project •  EclipseProjectPath: creates AnalysisScope for

Eclipse project •  JdtUtil: find all Java projects, code within projects,

etc.

!  JDT CAst frontend for Java source analysis

!  Prototype utils in com.ibm.wala.eclipse project •  E.g., display call graph for selected project

Page 82: WALA Tutorial at PLDI 2010

FRONT ENDS / CAST

Page 83: WALA Tutorial at PLDI 2010

!"#"$%&'()$*(+

!"#"$%&

'())(*$"+,$

-($%&$-./*+0/,(.

12.345$67,58(95

-($%&$-./*+0/,(.

#/*:;/:5

-./*+0/,(.

67,58(95

<(.)/,

=/.+5.$>

=/.+5.$?

%*,5.*/0$12.345

&;*,3)5$03@./.7

Page 84: WALA Tutorial at PLDI 2010

!"#"$%&'()*+($,-*.'$/.+

!"#"$%&

'()*+,$

-./,012,

31$%&$

3)45674/1)

89:#

;%#

"-;

!"#"

<=>3$)?5/*@,

21/<A5?

34@4)*5

Page 85: WALA Tutorial at PLDI 2010

!"#$%&'()'*+,-.#/0.$+,

!"#$%&'()*+*

,-./&#0&#

!"#"$%&'()*+,$%&

!"#$%&

1&23&#4

'50&6-3&4

Page 86: WALA Tutorial at PLDI 2010

!"#$%&'(&)*&#+

!"#$%&

'&()&#*

+,-&./)&*

!"#$%&,-

! ./01'23)4)'567&89*&:

;<='*97>?@A

! ,B1

! 2$@7&#@)C'BD0'9@C6:

EB<FG

! ,B1

! 2BD0H'E$@*9I+'9@C6:

09J$CC)'-)K)#$@

! LD,'2L87$9@!8#$M7:

! 2A@*&#'*&4&C9MK&@7:

Page 87: WALA Tutorial at PLDI 2010

!"#"$%&'()*$+&,*$-(&./$0.,

!"#"$%&

'())(*$"+,$

-($%&$

-./*+0/,(.

1/2/34.56,

-./*+0/,5(*

1/2/

-./*+0/,5(*

"7"8

-./*+0/,5(*

%9)$6/.+:.

;:*<

=4056+:$1>-

8(0?@0(,

'A+,()$"B-#&

&C5*(

8D8

-./*+0/,5(*

EFG

-./*+0/,5(*

Page 88: WALA Tutorial at PLDI 2010

!"#$%&'%()*)+,$-.*

!"#$%&'$()"

*+"+%,$()"

-)"$%)./0.)1

*%,23/-%+,$()"

4)&%'+/5,22("6

7+')%8("6

449/-)":+%#()"

94;#

<%)=

#)&%'+

>%+

!7!7

>,%#+%

#)&%'+

<(.+#

Page 89: WALA Tutorial at PLDI 2010

!"#"$%&'()*+,-)&.%)'/,*01,1&")'/,

2&",-3")1*45',/*-)&.%).&1-*)/*6787*9/::/,*7$2*;97-)<

! 9/:=',")'/,*/>*?1,1&'%*",@*!"#"$%&'()*7$2*,/@1-

! A,3B*('1%1*/>*%/@1*)5")*.,@1&-)",@-*45',/

2&",-3")1*97-)*7$2*)/*6787*C&1*+4*>/&:

! $5"&1@*=B*",/)51&*',)1&,"3*!"#"$%&'()*)&",-3")/&

! DE)1,@-*?1,1&'%*)&",-3")'/,*:"%5',1&B

!"#$%&'$()"#*

+,-#$.%/"#0/$)%

-,.#

1%)2

#)&%'3

4/%#3%*

56(").)-#$.%/"#0/$)%

#)&%'3

1(03#

4%3

!5

Page 90: WALA Tutorial at PLDI 2010

!"#$%&'$()"*+,",%-$()"

!"#$#%!&#$$#!'(

)*#!"

+"#$#,-!./0#012345#!&6!'

)*#!")*#!"

70827-#!"

*..%96#:(#;

####70827-#9$$:#<<#9=012345%:(>

?

<<

9#$$#:,-!./0#

9=012345%:(

70827-

@58A73-5438.7#,BC40B0-85#70D275,!0#@EA#8700#F34/

Page 91: WALA Tutorial at PLDI 2010

!"#$%"&'(&")'*%+,-'!%.+$/"#

!""#$%&'(&)

&&&&*+,-*.&$//'&00&$1+2-345#'(6

7

00

$&//&'8.9":+&

$1+2-345#'(

*+,-*.

;<=

><?

@<A

BCD

E+,F4"G:#$&H&00(&)

&&IJ/&E+,F4"G:#$14+!,(6

&&I=&/&E+,F4"G:#$1*8EK,(6

&&I@&/&.+L&F4"G:#(

&&IJ15-GG+55"*&/&I@6

&&I@13MM#NIJ19&//&,*-+O(

&&I@1!345+&/&I=6

7

&&

P5,Q*3.543,"*&I-84M5&BCD&*+G-*589+4'&"9+*&,K+&PRQ

Page 92: WALA Tutorial at PLDI 2010

!"#$%&'(")*+*",'-.//*,0

!"#$#%!&#$$#!'(

)*#!"

+"#$#,-!./0#012345#!&6!'

)*#!")*#!"

70827-#!"

*..%96#:(#;

####70827-#9$$:#<<#9=012345%:(>

?

<<

9#$$#:,-!./0#

9=012345%:(

70827-

@58A73-5438.7#B.C,05#5.27B0#C.5,8,.-5#*7.D#@EA

F&6&GH#I#F&6'JH

F&6&GH#I#F&6&KH

F&6&JH#I#F&6'JH

F&6'H#I#F&6'JH

Page 93: WALA Tutorial at PLDI 2010

!!"#$%&'()*+%&

!"#$#%!&#$$#!'(

)*#!"

!"#$#+,!-./#/01234#!&5!'

)*#!")*#!"

6/716,#!"

!"#$#%!&#$$#!'(

)*#!"

!8#$#+,!-./#/01234#

!&5!'

)*#!")*#!"

6/716,#!9

!9#$#:%!"5!8(

;<=<#>-/4#?-@A#@6-@2B27+-,#+,#CC<#?-,!/64+-,

;<=<#+D@3/D/,74#*133AE@61,/>#CC<#F-,!/64+-,! +G/G#@H+4#-,3A#+,4/67/>#*-6#3+!/#!231/4