Dataflow Frequency Analysis based on Whole Program Paths
Eduard Mehofer
Institute for Software Science
University of Vienna
www.par.univie.ac.at/~mehofer
Bernhard Scholz
Institute of Computer Languages
Vienna University of Technology
www.complang.tuwien.ac.at/scholz
Page 2
Dataflow Frequency Analysis
Goal
– accurately computing frequencies of data flow facts
Problem
– high costs for computing accurate frequencies
  • requires whole program path
  • efficient data structures and algorithm?
Approach
– exploiting algebraic properties of bi-distributive DFA problems
– employing WPPs to capture control flow
– computing frequencies in a bottom-up style on the WPP graph
Page 3
Outline
Motivation
WPP profiling
Properties of bi-distributive DFAs
Algorithm
Experiments
Conclusion
Page 4
Classical Approach
Classical Program Optimization:
– Program → Optimizer (data flow analysis, transformation) → Optimized program
Drawback:
– binary information
[Figure: pipeline from Program through the Optimizer to the Optimized program; code regions marked as executed heavily, rarely, or never]
Page 5
Profiling Approach
Probabilistic Program Optimization:
– Program + Profile → Optimizer based on profiling (dataflow freq. analysis, transformation) → Optimized program
Advantage:
– frequency information
[Figure: pipeline from Program and Profile through the Optimizer to the Optimized program; code regions marked as executed heavily, rarely, or never]
Page 6
Running Example
CFG Example
– simple code fragment
– 8 times left branch
– terminates via right branch
Reaching definitions problem
– two definitions: d1, d2
– d1 kills d2 and vice versa
– use of x at the end of loop
Questions
– How often does d1 hold at node 5?
– How often does d2 hold at node 5?
[Figure: CFG with nodes s, 1, 2, 3, 4, 5; annotations d1: x:=..., d2: x:=..., and the use ...x...]
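The two questions can be answered by brute force: walk the executed node sequence and count which definitions reach each node. The sketch below is only an illustration under stated assumptions. It takes the node sequence from the acyclic paths [s,1,2,4], [1,2,4] and [1,3,4,5] of the WPP example, places d1 at node 2 (as in the later edge 2→4 example), and assumes that d2 sits at node 3. The names `GEN`, `KILL`, `build_trace` and `count_facts` are hypothetical.

```python
# Assumption: d1 is defined at node 2 and d2 at node 3; each kills the other.
GEN = {2: "d1", 3: "d2"}
KILL = {2: "d2", 3: "d1"}


def build_trace(left_iterations=8):
    """Node sequence of the run: left branch 8 times, then the right branch."""
    trace = ["s"]
    for _ in range(left_iterations):
        trace += [1, 2, 4]
    trace += [1, 3, 4, 5]
    return trace


def count_facts(trace):
    """For every node, count how often each definition holds on entry."""
    holds = set()   # definitions reaching the current node
    counts = {}     # node -> {definition: number of visits on which it holds}
    for node in trace:
        per_node = counts.setdefault(node, {"d1": 0, "d2": 0})
        for fact in holds:
            per_node[fact] += 1
        if node in GEN:
            # Apply the effect of the node after counting its entry state.
            holds.discard(KILL[node])
            holds.add(GEN[node])
    return counts


counts = count_facts(build_trace())
print("node 4:", counts[4])
print("node 5:", counts[5])
```

This walk costs time proportional to the length of the whole program path, which is the cost the grammar-based computation on the following slides avoids.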
Page 7
WPP Profiling
Captures the whole program path
– Larus at PLDI'99
Path profiling techniques for acyclic paths
– minimal insertion of instrumentation code
– keeps executable fast
Sequitur for compression
– builds a grammar
– terminals are acyclic paths
– nonterminals have only one production
– graph representation of grammar
– grammar generates only one sentence
– best case: logarithmic size reduction
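The best case can be illustrated with a hand-built grammar in which every rule doubles the previous one, so that k rules describe a sentence of 2^k terminals. This is not an implementation of Sequitur; `doubling_grammar` and `expanded_length` are hypothetical helpers that only show why the grammar size can be logarithmic in the trace length.

```python
def doubling_grammar(levels):
    """Rules X1 -> b b and Xi -> X(i-1) X(i-1) for i = 2..levels."""
    rules = {"X1": ["b", "b"]}
    for i in range(2, levels + 1):
        rules[f"X{i}"] = [f"X{i - 1}", f"X{i - 1}"]
    return rules


def expanded_length(rules, symbol, cache=None):
    """Number of terminals in the single sentence derived from `symbol`."""
    if cache is None:
        cache = {}
    if symbol not in rules:
        return 1                      # a terminal stands for itself
    if symbol not in cache:
        cache[symbol] = sum(expanded_length(rules, s, cache)
                            for s in rules[symbol])
    return cache[symbol]


rules = doubling_grammar(10)
grammar_size = sum(len(rhs) for rhs in rules.values())
sentence_length = expanded_length(rules, "X10")
print(grammar_size, sentence_length)
```

With 10 rules the right-hand sides hold 20 symbols in total, while the derived sentence has 1024 terminals.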
Page 8
WPP Example
CFG Example
[Figure: CFG with nodes s, 1, 2, 3, 4, 5]
Program Run
– 8x left branch
– 1x right branch
WPP Graph & Grammar
[Figure: WPP graph with nonterminals S and A above terminals a, b, c]
S → a A A A b c
A → b b
Terminals:
a: [s,1,2,4]
b: [1,2,4]
c: [1,3,4,5]
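As a sanity check, the grammar can be expanded back into the node sequence of the run. The sketch below uses the two productions and three terminals from the slide; the function name `expand` and the dictionary layout are assumptions made for illustration.

```python
RULES = {"S": ["a", "A", "A", "A", "b", "c"], "A": ["b", "b"]}
PATHS = {"a": ["s", 1, 2, 4], "b": [1, 2, 4], "c": [1, 3, 4, 5]}


def expand(symbol):
    """Return the list of terminals derived from a grammar symbol."""
    if symbol in PATHS:
        return [symbol]
    terminals = []
    for child in RULES[symbol]:
        terminals.extend(expand(child))
    return terminals


terminals = expand("S")
trace = [node for t in terminals for node in PATHS[t]]
left = trace.count(2)    # node 2 lies on the left branch
right = trace.count(3)   # node 3 lies on the right branch
print("".join(terminals), left, right)
```

The expansion yields one a, seven b and one c, that is, eight passes through node 2 and one through node 3, matching the program run.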
Page 9
Bi-Distributive Dataflow Problems
Properties
– finite lattice $2^D$ (power set of dataflow facts)
– transition functions are monotone
– transition functions distribute over union and intersection:
$$f(X \cup Y) = f(X) \cup f(Y)$$
$$f(X \cap Y) = f(X) \cap f(Y)$$
– representation relation
– covers bit-vector problems
Due to these properties
– transition functions are represented as 0/1-matrices
– states are represented as 0/1-vectors
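The two distributivity laws can be checked exhaustively on a small fact set. The sketch below assumes the gen/kill function of the running example for an edge leaving the definition d1 (generate d1, kill d2); `needs_both` is a made-up counterexample that is not distributive. All function names are hypothetical.

```python
from itertools import combinations

FACTS = ("d1", "d2")


def powerset(items):
    """All subsets of `items` as frozensets."""
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def transfer(x, gen=frozenset({"d1"}), kill=frozenset({"d2"})):
    """Gen/kill transition function: f(X) = (X - kill) | gen."""
    return (x - kill) | gen


def needs_both(x):
    """Counterexample: yields d1 only if both d1 and d2 hold on entry."""
    return frozenset({"d1"}) if {"d1", "d2"} <= x else frozenset()


def is_bi_distributive(f, facts):
    """Check f(X|Y) == f(X)|f(Y) and f(X&Y) == f(X)&f(Y) for all X, Y."""
    subsets = powerset(facts)
    return all(
        f(x | y) == f(x) | f(y) and f(x & y) == f(x) & f(y)
        for x in subsets
        for y in subsets
    )


print(is_bi_distributive(transfer, FACTS))
print(is_bi_distributive(needs_both, FACTS))
```

The gen/kill function passes both checks, which is why bit-vector problems such as reaching definitions are covered; the counterexample fails the union law for X = {d1}, Y = {d2}.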
Page 10
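Representation Relation
Transition function $f: 2^D \to 2^D$
– represented by $f^r: D \cup \{\Lambda\} \to 2^{D \cup \{\Lambda\}}$
– $\Lambda$: artificial data fact
$$f^r(\Lambda) = f(\emptyset) \cup \{\Lambda\}$$
$$f^r(d) = f(\{d\}) \setminus f(\emptyset)$$
Example: edge 2→4, where node 2 contains d1: x:=...
– $f^r(\Lambda) = \{d_1, \Lambda\}$
– $f^r(d_1) = \{\}$
– $f^r(d_2) = \{\}$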
Page 11
Matrix Representation
Matrix representation of function f
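$$a_{ij} = \begin{cases} 1, & \text{if } d_i \in f^r(d_j) \\ 0, & \text{otherwise} \end{cases}$$
Example (fact order $d_1, d_2, \Lambda$)
$$M^{\Lambda}(2 \to 4) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
$$M(2 \to 4)(\{d_1, d_2\}) = \{d_1\}: \quad M^{\Lambda}(2 \to 4) \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$$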
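The construction of the matrix can be sketched in a few lines. The code assumes the fact order (d1, d2, Λ), writes the artificial fact as the string `"LAMBDA"`, and uses the column convention in which entry (i, j) is 1 when fact i is in the representation of fact j. The helper names `representation`, `to_matrix` and `apply` are hypothetical.

```python
FACTS = ["d1", "d2", "LAMBDA"]   # LAMBDA stands for the artificial fact


def transfer(x):
    """Edge 2->4 of the running example: d1 is generated, d2 is killed."""
    return (x - {"d2"}) | {"d1"}


def representation(f, fact):
    """Facts produced by a single input fact under transition function f."""
    from_empty = f(set())
    if fact == "LAMBDA":
        return from_empty | {"LAMBDA"}
    return f({fact}) - from_empty


def to_matrix(f):
    """a[i][j] = 1 iff FACTS[i] is in the representation of FACTS[j]."""
    size = len(FACTS)
    return [[1 if FACTS[i] in representation(f, FACTS[j]) else 0
             for j in range(size)]
            for i in range(size)]


def apply(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


M = to_matrix(transfer)
result = apply(M, [1, 1, 1])     # input state {d1, d2} plus LAMBDA
print(M, result)
```

Only the Λ column is non-zero here, because d1 is generated regardless of the input state and d2 never survives the edge.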
Page 12
Dataflow Frequencies
Definition of dataflow frequencies for node v
$$y(v) = \sum_{\pi \in \mathrm{Prefix}(\pi_r, v)} \operatorname{vec}(\mathrm{state}(\pi))$$
– $\pi_r$: whole program path
– $\mathrm{Prefix}(\pi_r, v)$: set of all sub-paths of $\pi_r$ from the start node to node v
– vec: converts data flow facts to a 0/1-vector
– $\mathrm{state}(\pi)$: data flow facts which hold along path $\pi$
– the sum counts the occurrences of data flow facts which hold in v
Approach for fast computation
– adapt the definition to the grammar symbols of SEQUITUR
[Figure: path from start node s to node v]
Page 13
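Frequency Matrix
Definition of frequency matrices
$$F_{\pi_r}(v) = \sum_{[u_1, u_2, \ldots, u_k] \in \mathrm{Prefix}(\pi_r, v)} M^{\Lambda}([u_1, u_2, \ldots, u_k])$$
– sum computation via matrix calculus
Frequency matrices for eliminating the sum
$$y(v) = \sum_{\pi \in \mathrm{Prefix}(\pi_r, v)} \operatorname{vec}(\mathrm{state}(\pi)) = F_{\pi_r}(v) \cdot c$$
Computation of frequency matrices for grammar symbols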
Page 14
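Terminals
Transition function
– compose function for acyclic path $t: [u_1, u_2, \ldots, u_k]$
– represent transition function as matrix
$$M^{\Lambda}(t) = M^{\Lambda}([u_1, u_2, \ldots, u_k]) = M^{\Lambda}(u_{k-1} \to u_k) \cdots M^{\Lambda}(u_1 \to u_2)$$
Frequency matrix
$$F_t(v) = \begin{cases} M^{\Lambda}([u_1, u_2, \ldots, v]), & \text{if } v \in \{u_1, u_2, \ldots, u_k\} \\ 0, & \text{otherwise} \end{cases}$$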
Page 15
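Nonterminals
Transition function
– compose transition function for $nt \to X_1, X_2, \ldots, X_k$
– represent transition function as matrix
$$M^{\Lambda}(nt) = M^{\Lambda}(X_1 X_2 \ldots X_k) = M^{\Lambda}(X_k) \cdots M^{\Lambda}(X_2) \cdot M^{\Lambda}(X_1)$$
Frequency matrix
$$F_{nt}(v) = F_{X_1}(v) + F_{X_2}(v) \cdot M^{\Lambda}(X_1) + \cdots + F_{X_k}(v) \cdot M^{\Lambda}(X_1 X_2 \ldots X_{k-1})$$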
Page 16
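Example
Terminal b: [1,2,4] (node 2 contains d1: x:=...)
$$F_b(4) = M^{\Lambda}([1,2,4]) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Nonterminal A → b b
$$M^{\Lambda}(A) = M^{\Lambda}(bb) = M^{\Lambda}(b) \cdot M^{\Lambda}(b) = M^{\Lambda}(b)$$
$$F_A(4) = F_b(4) + F_b(4) \cdot M^{\Lambda}(b) = \begin{pmatrix} 0 & 0 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$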
Page 17
Algorithm
Pseudo-Code
forall v ∈ N do
  forall t ∈ T do
    compute frequency matrix of terminal t for node v
  endfor
  forall nt ∈ NT in reverse topological order do
    compute frequency matrix of nonterminal nt for node v
  endfor
endfor
Dataflow frequencies of node v:
$$y(v) = F_S(v) \cdot c$$
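The bottom-up computation can be sketched for the running example as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the fact order is (d1, d2, Λ), d2 is assumed to be defined at node 3, all other nodes have identity effect (so the back edge 4→1 can be ignored), and the vector c is assumed to be the start state in which only Λ holds. All identifiers are hypothetical.

```python
RULES = {"S": ["a", "A", "A", "A", "b", "c"], "A": ["b", "b"]}
PATHS = {"a": ["s", 1, 2, 4], "b": [1, 2, 4], "c": [1, 3, 4, 5]}

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
# Matrices of edges leaving node 2 (gen d1, kill d2) and, by assumption,
# node 3 (gen d2, kill d1). Edges leaving other nodes are the identity.
NODE_EFFECT = {2: [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
               3: [[0, 0, 0], [0, 0, 1], [0, 0, 1]]}
C = [0, 0, 1]   # assumed start vector: only the artificial fact holds


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]


def terminal(path, v):
    """Return (M(t), F_t(v)) for the acyclic path t."""
    m, f = IDENTITY, ZERO
    for i, node in enumerate(path):
        if node == v:
            f = m                       # matrix of the sub-path ending in v
        if i + 1 < len(path):           # edge node -> path[i + 1]
            m = mul(NODE_EFFECT.get(node, IDENTITY), m)
    return m, f


def analyze(v):
    """Map every grammar symbol X to the pair (M(X), F_X(v))."""
    table = {t: terminal(p, v) for t, p in PATHS.items()}
    for nt in ["A", "S"]:               # reverse topological order
        m, f = IDENTITY, ZERO
        for x in RULES[nt]:
            mx, fx = table[x]
            f = add(f, mul(fx, m))      # F_X(v) * M(symbols before X)
            m = mul(mx, m)              # extend the prefix by X
        table[nt] = (m, f)
    return table


def frequencies(v):
    """y(v) = F_S(v) * c."""
    f_s = analyze(v)["S"][1]
    return [sum(a * c for a, c in zip(row, C)) for row in f_s]


print(frequencies(4), frequencies(5))
```

Each grammar symbol is processed once per node, so the nonterminal A is evaluated a single time although it occurs three times in the production of S.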
Page 18
Example
Transition matrices and frequency matrices for terminals
[Figure: WPP graph (nonterminals S, A; terminals a, b, c), repeated with the matrices for terminals a, b, and c]
Page 19
Example
Transition matrices and frequency matrices for nonterminals
[Figure: WPP graph (nonterminals S, A; terminals a, b, c), repeated with the matrices for nonterminals A and S]
Frequency matrix of start symbol S contains the dataflow frequency information!
Page 20
Experiments
Gcc-Compiler 2.95.2
– data flow frequency analysis written in C++/C
– implementation of WPP (runtime & compile time)
Benchmark
– some programs of SpecInt95
– reaching definitions problem
Environment
– Sun Ultra Enterprise 450 (4 x 296 MHz) with 2.5 GB
Page 21
Node Statistics
[Figure: bar chart, y-axis "Nodes" (0 to 12000); categories: not executed, executed w/o DFA, analyzed]
– about 40% of nodes are executed
– no computations required for 60% of nodes
Page 22
WPP Size & Overhead
[Figure: two bar charts over the benchmarks 099.go, 124.m88ksim, 129.compress, 130.li, 132.ijpeg, 134.perl: "WPP Size in Kbytes" (axis 0 to 20000) and "Compile Overhead in %" (axis 0 to 35)]
– Compile time overhead almost proportional to WPP size
Page 23
Conclusion
Novel dataflow frequency analysis
– designed for bi-distributive dataflow analysis problems
– matrix representation of transition functions
– employs SEQUITUR grammars
Accurate and efficient algorithm
Experiments
– platform: gcc for Ultra 450
– benchmark: reaching definitions problem for SpecInt95
– overhead is almost proportional to the size of the WPP
Page 24
Stop!