Iterative Data-flow Analysis
COMP 512, Rice University, Houston, Texas
Fall 2003
Copyright 2003, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.
New lecture, 2003
COMP 512, Fall 2003 2
Review
Last class:
• Looked at Global Common Subexpression Elimination (Cocke 70)
• Defined the available expressions problem as the key to finding opportunities and proving safety
    $\mathrm{AVAIL}(n_0) = \emptyset$
    $\mathrm{AVAIL}(b) = \bigcap_{x \in pred(b)} \left( \mathrm{DEEXPR}(x) \cup \left( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} \right) \right)$
• Looked at an algorithm to solve these equations
    - Compute initial information: DEEXPR & EXPRKILL
    - Apply an iterative solver to find a fixed-point solution

Today:
• Why does the iterative solver work?
Data-flow Analysis
Definition
Data-flow analysis is a collection of techniques for compile-time reasoning about the run-time flow of values.
• Almost always involves building a graph
    - Problems are trivial on a basic block
    - Global problems ⇒ control-flow graph (or derivative)
    - Whole program problems ⇒ call graph (or derivative)
• Usually formulated as a set of simultaneous equations
    - Sets attached to nodes and edges
    - Semilattice to describe values
    - We solved AVAIL with an iterative fixed-point algorithm
• Desired result is usually the meet-over-all-paths solution
    - “What is true on every path from the entry?”
    - “Can this happen on any path from the entry?”
    - Related to the safety of optimization (how we use the results)
Data-flow Analysis
Limitations
1. Precision – “up to symbolic execution”
    - Assume all paths are taken
2. Solution – cannot afford to compute MOP solution
    - Large class of problems where MOP = MFP = LFP
    - Not all problems of interest are in this class
3. Arrays – treated naively in classical analysis
    - Represent whole array with a single fact
4. Pointers – difficult (and expensive) to analyze
    - Imprecision rapidly adds up
    - Need to ask the right questions

Summary
For scalar values, we can quickly solve simple problems

Good news:
Simple problems can carry us pretty far
Data-flow Analysis
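Semilattice
A semilattice is a set $L$ and a meet operation $\wedge$ such that, $\forall a, b, c \in L$:
1. $a \wedge a = a$
2. $a \wedge b = b \wedge a$
3. $a \wedge (b \wedge c) = (a \wedge b) \wedge c$

$\wedge$ imposes an order on $L$, $\forall a, b \in L$:
1. $a \ge b \Leftrightarrow a \wedge b = b$
2. $a > b \Leftrightarrow a \ge b$ and $a \ne b$

A semilattice has a bottom element, denoted $\bot$:
1. $\forall a \in L,\ \bot \wedge a = \bot$
2. $\forall a \in L,\ a \ge \bot$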
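The axioms above can be checked mechanically on a small example. The sketch below assumes the semilattice is the powerset of a three-element set with set intersection as the meet and the empty set as the bottom element; the helper names (`powerset`, `meet`, `geq`) are illustrative and do not come from the lecture.

```python
from itertools import combinations


def powerset(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def meet(a, b):
    """Meet for this example: set intersection (an assumption of the sketch)."""
    return a & b


def geq(a, b):
    """Order induced by the meet: a >= b  iff  meet(a, b) == b."""
    return meet(a, b) == b


L = powerset({"p", "q", "r"})
bottom = frozenset()

# The three semilattice axioms, checked exhaustively over L.
idempotent = all(meet(a, a) == a for a in L)
commutative = all(meet(a, b) == meet(b, a) for a in L for b in L)
associative = all(meet(a, meet(b, c)) == meet(meet(a, b), c)
                  for a in L for b in L for c in L)

# The two bottom-element properties.
has_bottom = all(meet(bottom, a) == bottom and geq(a, bottom) for a in L)
```

With intersection as the meet, `geq(a, b)` holds exactly when `b` is a subset of `a`, so larger sets sit higher in the order.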
Data-flow Analysis
How does this relate to data-flow analysis?
• Choose a semilattice to represent the facts
• Attach a meaning to each $a \in L$
    - Each $a \in L$ is a distinct set of known facts
• With each node $n$, associate a function $f_n : L \rightarrow L$
    - $f_n$ models behavior of code in block corresponding to $n$
• Let $F$ be the set of all functions that the code might generate

Example — AVAIL
• Semilattice is $(2^E, \wedge)$, where $E$ is the set of all expressions & $\wedge$ is $\cap$
    - Sets are bigger than |variables|; $\bot$ is $\emptyset$
• For a node $n$, $f_n$ has the form $f_n(x) = a_n \cup (x \cap b_n)$
    - Where $a_n$ is DEEXPR(n) and $b_n$ is $\overline{\mathrm{EXPRKILL}(n)}$
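As a rough illustration of the transfer-function form, the sketch below builds $f_n$ as a Python closure. It assumes $b_n$ is the complement of EXPRKILL(n) with respect to the universe of expressions, which is what the AVAIL equation requires; the function name `make_avail_transfer` is hypothetical. The sample values are the DEEXPR and EXPRKILL sets listed for block D in the lecture's example.

```python
def make_avail_transfer(de_expr, expr_kill, all_exprs):
    """Build f_n(x) = a_n | (x & b_n) for one block.

    a_n is DEEXPR(n); b_n is taken to be the complement of EXPRKILL(n)
    within `all_exprs` (an assumption of this sketch).
    """
    a_n = frozenset(de_expr)
    b_n = frozenset(all_exprs) - frozenset(expr_kill)

    def f_n(x):
        return a_n | (frozenset(x) & b_n)

    return f_n


E = frozenset({"a+b", "c+d", "e+f", "a+17", "b+18"})

# Block D: DEEXPR = {b+18, a+b, e+f}, EXPRKILL = {e+f}.
f_D = make_avail_transfer({"b+18", "a+b", "e+f"}, {"e+f"}, E)

# Facts available on entry to D, pushed through the block.
out_D = f_D({"a+b", "c+d"})
```

Representing each block by a closure keeps the solver independent of how DEEXPR and EXPRKILL were computed.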
Concrete Example: Available Expressions
[Figure: example control-flow graph with blocks A through G. Block contents:]
A:  m ← a + b;  n ← a + b
B:  p ← c + d;  r ← c + d
C:  q ← a + b;  r ← c + d
D:  e ← b + 18;  s ← a + b;  u ← e + f
E:  e ← a + 17;  t ← c + d;  u ← e + f
F:  v ← a + b;  w ← c + d;  x ← e + f
G:  y ← a + b;  z ← c + d

E = {a+b, c+d, e+f, a+17, b+18}
$2^E$ is the set of all subsets of E

$2^E$ = [ {a+b,c+d,e+f,a+17,b+18},
  {a+b,c+d,e+f,a+17}, {a+b,c+d,e+f,b+18}, {a+b,c+d,a+17,b+18}, {a+b,e+f,a+17,b+18}, {c+d,e+f,a+17,b+18},
  {a+b,c+d,e+f}, {a+b,c+d,b+18}, {a+b,c+d,a+17}, {a+b,e+f,a+17}, {a+b,e+f,b+18}, {a+b,a+17,b+18}, {c+d,e+f,a+17}, {c+d,e+f,b+18}, {c+d,a+17,b+18}, {e+f,a+17,b+18},
  {a+b,c+d}, {a+b,e+f}, {a+b,a+17}, {a+b,b+18}, {c+d,e+f}, {c+d,a+17}, {c+d,b+18}, {e+f,a+17}, {e+f,b+18}, {a+17,b+18},
  {a+b}, {c+d}, {e+f}, {a+17}, {b+18},
  {} ]
Concrete Example: Available Expressions
The Lattice
[Figure: the subsets of E drawn as a lattice, one row per set size. Lines between rows show comparability (transitive).]
{ }
{a+b} {c+d} {e+f} {a+17} {b+18}
{a+b,c+d} {a+b,a+17} {c+d,e+f} {c+d,b+18} {e+f,b+18} {a+b,e+f} {a+b,b+18} {c+d,a+17} {e+f,a+17} {a+17,b+18}
{a+b,c+d,e+f} {a+b,c+d,b+18} {a+b,c+d,a+17} {a+b,e+f,a+17} {a+b,e+f,b+18} {a+b,a+17,b+18} {c+d,e+f,a+17} {c+d,e+f,b+18} {c+d,a+17,b+18} {e+f,a+17,b+18}
{a+b,c+d,e+f,a+17} {a+b,c+d,e+f,b+18} {a+b,c+d,a+17,b+18} {a+b,e+f,a+17,b+18} {c+d,e+f,a+17,b+18}
{a+b,c+d,e+f,a+17,b+18}
Concrete Example: Available Expressions
The Lattice
[Figure: the same lattice of subsets of E as on the preceding slide, here annotated to illustrate the meet.]
Round-robin Iterative Algorithm
• Termination: does it halt?
• Correctness: what answer does it produce?
• Speed: how quickly does it find that answer?

    AVAIL(b0) ← Ø
    for i ← 1 to N
        AVAIL(bi) ← { all expressions }
    change ← true
    while (change)
        change ← false
        for i ← 0 to N
            TEMP ← ∩ x∈pred(bi) (DEF(x) ∪ (AVAIL(x) ∩ NKILL(x)))
            if AVAIL(bi) ≠ TEMP then
                change ← true
                AVAIL(bi) ← TEMP
The round-robin solver is easier to analyze than the worklist solver.
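The round-robin solver can be sketched in Python and run on the lecture's seven-block example. The DEF and KILL sets below are the DEEXPR and EXPRKILL values from the example's table, and the predecessor lists are inferred from the worked AVAIL computations in the extra slides, so treat them as assumptions. The function name `solve_avail` is hypothetical, and the sketch skips the entry block inside the sweep so that its AVAIL set stays empty, which is a choice of this sketch.

```python
ALL = frozenset({"a+b", "c+d", "e+f", "a+17", "b+18"})

# Predecessors of each block (inferred from the worked example; an assumption).
PREDS = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"], "E": ["C"],
         "F": ["D", "E"], "G": ["B", "F"]}

# DEF(x): expressions computed in x and not killed afterwards (DEEXPR).
DEF = {"A": frozenset({"a+b"}),
       "B": frozenset({"c+d"}),
       "C": frozenset({"a+b", "c+d"}),
       "D": frozenset({"b+18", "a+b", "e+f"}),
       "E": frozenset({"a+17", "c+d", "e+f"}),
       "F": frozenset({"a+b", "c+d", "e+f"}),
       "G": frozenset({"a+b", "c+d"})}

# KILL(x): expressions killed in x (EXPRKILL); NKILL(x) is its complement.
KILL = {b: frozenset() for b in PREDS}
KILL["D"] = frozenset({"e+f"})
KILL["E"] = frozenset({"e+f"})


def solve_avail(order, preds, de, kill, universe, entry):
    """Round-robin iterative solver; returns (AVAIL sets, number of sweeps)."""
    avail = {b: universe for b in order}   # optimistic initial value
    avail[entry] = frozenset()             # nothing is available at entry
    sweeps = 0
    change = True
    while change:
        change = False
        sweeps += 1
        for b in order:
            if b == entry:
                continue                   # entry keeps its initial value
            temp = universe
            for x in preds[b]:
                nkill = universe - kill[x]
                temp = temp & (de[x] | (avail[x] & nkill))
            if avail[b] != temp:
                change = True
                avail[b] = temp
    return avail, sweeps


AVAIL, SWEEPS = solve_avail(list("ABCDEFG"), PREDS, DEF, KILL, ALL, "A")
```

Because this graph has no cycles and the blocks are visited with predecessors first, the first sweep already reaches the fixed point and the second sweep only confirms that nothing changes.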
Round-robin Iterative Algorithm
Termination
• Makes sweeps over the nodes
• Halts when some sweep produces no change
Iterative Data-flow Analysis
Termination
• If every $f_n \in F$ is monotone, i.e., $x \le y \Rightarrow f(x) \le f(y)$, and
• If the lattice is bounded, i.e., every descending chain is finite
    - A chain is a sequence $x_1, x_2, \ldots, x_n$ where $x_i \in L$, $1 \le i \le n$
    - $x_i > x_{i+1}$, $1 \le i < n$ ⇒ chain is descending
Then
• The set at each node can only change a finite number of times
• The iterative algorithm must halt on an instance of the problem

Notes
• Any finite semilattice is bounded
• Some infinite semilattices are bounded
    - Example: real constants (… -.002 … -.001 … 0 … .001 … .002 …)
Iterative Data-flow Analysis
Correctness (What does it compute?)
• If every $f_n \in F$ is monotone, i.e., $x \le y \Rightarrow f(x) \le f(y)$, and
• If the semilattice is bounded, i.e., every descending chain is finite
    - A chain is a sequence $x_1, x_2, \ldots, x_n$ where $x_i \in L$, $1 \le i \le n$
    - $x_i > x_{i+1}$, $1 \le i < n$ ⇒ chain is descending

Given a bounded semilattice $S$ and a monotone function space $F$
• $\exists k$ such that $f^k(\bot) = f^j(\bot)\ \forall j > k$
• $f^k(\bot)$ is called the least fixed-point of $f$ over $S$
• If $L$ has a $\top$, then $\exists k$ such that $f^k(\top) = f^j(\top)\ \forall j > k$, and $f^k(\top)$ is called the maximal fixed-point of $f$ over $S$ (optimism)
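To make the two fixed points concrete, the sketch below iterates a single monotone function from the bottom element and from the top element of a small powerset lattice. The function `f` and the three-expression universe are illustrative choices, not taken from the lecture; `f` has the AVAIL shape $a \cup (x \cap b)$.

```python
TOP = frozenset({"a+b", "c+d", "e+f"})
BOTTOM = frozenset()


def f(x):
    """A monotone function of the form a | (x & b); chosen for illustration."""
    return frozenset({"a+b"}) | (x & frozenset({"a+b", "c+d"}))


def iterate_to_fixed_point(func, start, limit=100):
    """Apply func repeatedly from `start`; return (fixed point, k).

    k is the number of applications needed before the value stops changing.
    """
    x = start
    for k in range(limit):
        nxt = func(x)
        if nxt == x:
            return x, k
        x = nxt
    raise RuntimeError("no fixed point reached within limit")


from_bottom, k_bottom = iterate_to_fixed_point(f, BOTTOM)
from_top, k_top = iterate_to_fixed_point(f, TOP)
```

For this particular `f` the two starting points stabilize at different values, which is why it matters which one the solver starts from.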
Iterative Data-flow Analysis
Correctness
• If every $f_n \in F$ is monotone, i.e., $f(x \wedge y) \le f(x) \wedge f(y)$, and
• If the lattice is bounded, i.e., every descending chain is finite
    - A chain is a sequence $x_1, x_2, \ldots, x_n$ where $x_i \in L$, $1 \le i \le n$
    - $x_i > x_{i+1}$, $1 \le i < n$ ⇒ chain is descending
Then
• The round-robin algorithm computes a least fixed-point (LFP)
• The uniqueness of the solution depends on other properties of $F$
    - Unique solution ⇒ it finds the one we want
    - Multiple solutions ⇒ we need to know which one it finds
Iterative Data-flow Analysis
Correctness
• Does the iterative algorithm compute the desired answer?

Admissible Function Spaces
1. $\forall f \in F,\ \forall x, y \in L,\ f(x \wedge y) = f(x) \wedge f(y)$
2. $\exists f_i \in F$ such that $\forall x \in L,\ f_i(x) = x$
3. $f, g \in F \Rightarrow \exists h \in F$ such that $h(x) = f(g(x))$
4. $\forall x \in L,\ \exists$ a finite subset $H \subseteq F$ such that $x = \bigwedge_{f \in H} f(\bot)$

If $F$ meets these four conditions, then an instance of the problem will have a unique fixed point solution (instance ⇒ graph + initial values)
    ⇒ LFP = MFP = MOP
    ⇒ order of evaluation does not matter

Not distributive ⇒ fixed point solution may not be unique
Iterative Data-flow Analysis
If a data-flow framework meets those admissibility conditions, then it has a unique fixed-point solution
• The iterative algorithm finds the (best) answer
• The solution does not depend on order of computation
• Algorithm can choose an order that converges quickly

Intuition
• Choose an order so that changes propagate as far as possible on each “sweep”
    - Process a node’s predecessors before the node
• Cycles pose problems, of course
    - Ignore back edges when computing the order?
Ordering the Nodes to Maximize Propagation
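[Figure: a four-node example graph shown twice. Postorder labels: 4 on the first node, 2 and 3 on the two middle nodes, 1 on the last node. Reverse postorder labels: 1 on the first node, 3 and 2 on the two middle nodes, 4 on the last node.]

Reverse postorder number = N + 1 - postorder number

• Reverse postorder visits predecessors before visiting a node
• Use reverse preorder for backward problems
    - Reverse postorder on reverse CFG is reverse preorder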
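The numbering can be reproduced with a short depth-first traversal. This sketch assumes a four-node diamond-shaped graph with hypothetical node names (`entry`, `left`, `right`, `exit`); the function names are also illustrative.

```python
def postorder(succs, entry):
    """Return the nodes reachable from `entry` in depth-first postorder."""
    seen = set()
    order = []

    def visit(n):
        seen.add(n)
        for s in succs.get(n, []):
            if s not in seen:
                visit(s)
        order.append(n)          # a node is numbered after all its children

    visit(entry)
    return order


def reverse_postorder(succs, entry):
    """Reverse postorder: the postorder sequence, reversed."""
    return list(reversed(postorder(succs, entry)))


# Hypothetical diamond CFG: entry branches to left and right, which join at exit.
SUCCS = {"entry": ["left", "right"],
         "left": ["exit"],
         "right": ["exit"],
         "exit": []}

po = postorder(SUCCS, "entry")
rpo = reverse_postorder(SUCCS, "entry")

# 1-based numbers, to compare against the "N + 1 - postorder number" rule.
po_num = {n: i + 1 for i, n in enumerate(po)}
rpo_num = {n: i + 1 for i, n in enumerate(rpo)}
```

On this acyclic graph every node's predecessors receive smaller reverse-postorder numbers than the node itself, so one sweep in that order propagates facts from the entry to the exit.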
Iterative Data-flow Analysis
Speed
• For a problem with an admissible function space & a bounded semilattice,
• If the functions all meet the rapid condition, i.e.,
    $\forall f, g \in F,\ \forall x \in L,\ f(g(\bot)) \ge g(\bot) \wedge f(x) \wedge x$
then a round-robin, reverse-postorder iterative algorithm will halt in d(G)+3 passes over a graph G
• d(G) is the loop-connectedness of the graph w.r.t. a DFST
    - Maximal number of back edges in an acyclic path
    - Several studies suggest that, in practice, d(G) is small (<3)
    - For most CFGs, d(G) is independent of the specific DFST
• Sets stabilize in two passes around a loop
• Each pass does O(E) meets & O(N) other operations
Iterative Data-flow analysis
What does this mean?
• Reverse postorder
    - Easily computed order that increases propagation per pass
• Round-robin iterative algorithm
    - Visit all the nodes in a consistent order (RPO)
    - Do it again until the sets stop changing
• Rapid condition
    - Most classic global data-flow problems meet this condition

These conditions are easily met
• Admissible framework, rapid function space
• Round-robin, reverse-postorder, iterative algorithm
⇒ The analysis runs in (effectively) linear time
Some problems are not admissible
Global constant propagation
• First condition in admissibility
    $\forall f \in F,\ \forall x, y \in L,\ f(x \wedge y) = f(x) \wedge f(y)$
• Constant propagation is not admissible
    - Kam & Ullman time bound does not hold
    - There are tight time bounds, however, based on lattice height
    - Require a variable-by-variable formulation …

Example: a block containing a ← b + c, reached with the facts S1: {b=3,c=4} and S2: {b=1,c=6}
• Function “f” models block’s effects
• f(S1) = {a=7,b=3,c=4}
• f(S2) = {a=7,b=1,c=6}
• $f(S_1 \wedge S_2) = \emptyset$
• But $f(S_1) \wedge f(S_2)$ = {a=7}, so f does not distribute over the meet
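The failure of the first admissibility condition can be checked directly. In this sketch a set of constant facts is modeled as a Python dict from variable name to value, and the meet keeps only the bindings on which both sides agree; that representation and the function names are assumptions made for illustration.

```python
def meet(s1, s2):
    """Keep only the variable=constant facts that hold in both sets."""
    return {v: c for v, c in s1.items() if v in s2 and s2[v] == c}


def f(s):
    """Model the block  a <- b + c  on a set of constant facts."""
    out = dict(s)
    if "b" in s and "c" in s:
        out["a"] = s["b"] + s["c"]   # both operands known: a is a constant
    else:
        out.pop("a", None)           # otherwise nothing is known about a
    return out


S1 = {"b": 3, "c": 4}
S2 = {"b": 1, "c": 6}

meet_then_apply = f(meet(S1, S2))    # f(S1 ^ S2)
apply_then_meet = meet(f(S1), f(S2)) # f(S1) ^ f(S2)
```

Applying `f` along each path and then meeting retains the fact that `a` is 7, while meeting first discards `b` and `c` and loses it.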
Some admissible problems are not rapid
Interprocedural May Modify sets
• Iterations proportional to number of parameters
    - Not a function of the call graph
    - Can make example arbitrarily bad
• Proportional to length of chain of bindings…

    shift(a,b,c,d,e,f)
    {
        local t;
        …
        call shift(t,a,b,c,d,e);
        f = 1;
        …
    }

• Assume call-by-reference
• Compute the set of variables (in shift) that can be modified by a call to shift
• How long does it take?

[Figure: call graph with the single node shift, and the parameters a b c d e f shown as a chain of bindings.]
Nothing to do with d(G)
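A small simulation shows how the number of rounds tracks the length of the binding chain. The sketch assumes call-by-reference, treats the local `t` as one of the tracked variables, and updates the set once per round from the previous round's value so that the count does not depend on visiting order; `solve_mod` and `shift_bindings` are hypothetical names.

```python
def solve_mod(formals, actuals, directly_modified):
    """Iterate MOD for one recursive call site; return (MOD set, rounds).

    A variable passed by reference in the position of a formal that is in
    MOD may itself be modified by the call, so it joins MOD next round.
    """
    mod = set(directly_modified)
    rounds = 0
    while True:
        rounds += 1
        new = mod | {act for form, act in zip(formals, actuals) if form in mod}
        if new == mod:
            return mod, rounds       # a round with no change: fixed point
        mod = new


def shift_bindings(n):
    """Formals and actuals for a shift-style procedure with n parameters."""
    formals = [chr(ord("a") + i) for i in range(n)]
    actuals = ["t"] + formals[:-1]   # call shift(t, a, b, ...)
    return formals, actuals


# The six-parameter example: shift(a,b,c,d,e,f) calls shift(t,a,b,c,d,e); f = 1.
formals6, actuals6 = shift_bindings(6)
mod6, rounds6 = solve_mod(formals6, actuals6, {"f"})

# A three-parameter variant, to show the count shrinking with the chain.
formals3, actuals3 = shift_bindings(3)
mod3, rounds3 = solve_mod(formals3, actuals3, {"c"})
```

Each round adds exactly one more variable along the chain, so the number of rounds grows with the parameter count even though the call graph has a single node.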
Extra Slides Start Here
Computing Available Expressions
$\mathrm{AVAIL}(b) = \bigcap_{x \in pred(b)} \left( \mathrm{DEEXPR}(x) \cup \left( \mathrm{AVAIL}(x) \cap \overline{\mathrm{EXPRKILL}(x)} \right) \right)$

where
• EXPRKILL(b) is the set of expressions killed in b, and
• DEEXPR(b) is the set of expressions defined in b and not subsequently killed in b

Initial condition
$\mathrm{AVAIL}(n_0) = \emptyset$, because nothing is computed before $n_0$

The other nodes’ AVAIL sets will be computed over their preds.
$n_0$ has no predecessor.
Making Theory Concrete
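Computing AVAIL for the example
$\mathrm{AVAIL}(A) = \emptyset$
$\mathrm{AVAIL}(B) = \{a+b\} \cup (\emptyset \cap \mathit{all}) = \{a+b\}$
$\mathrm{AVAIL}(C) = \{a+b\}$
$\mathrm{AVAIL}(D) = \{a+b, c+d\} \cup (\{a+b\} \cap \mathit{all}) = \{a+b, c+d\}$
$\mathrm{AVAIL}(E) = \{a+b, c+d\}$
$\mathrm{AVAIL}(F) = [\{b+18, a+b, e+f\} \cup (\{a+b, c+d\} \cap \{\mathit{all} - e+f\})] \cap [\{a+17, c+d, e+f\} \cup (\{a+b, c+d\} \cap \{\mathit{all} - e+f\})] = \{a+b, c+d, e+f\}$
$\mathrm{AVAIL}(G) = [\{c+d\} \cup (\{a+b\} \cap \mathit{all})] \cap [\{a+b, c+d, e+f\} \cup (\{a+b, c+d, e+f\} \cap \mathit{all})] = \{a+b, c+d\}$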
[Figure: the example control-flow graph with blocks A through G, as in the concrete example earlier.]

Block   DEEXPR            EXPRKILL
A       a+b               { }
B       c+d               { }
C       a+b, c+d          { }
D       b+18, a+b, e+f    e+f
E       a+17, c+d, e+f    e+f
F       a+b, c+d, e+f     { }
G       a+b, c+d          { }
Redundancy Elimination Wrap-up
Algorithm                      Acronym       Credits
Local Value Numbering          LVN           Balke, 1967
Superlocal Value Numbering     SVN           Many
Dominator-based Value Num’g    DVNT          Simpson, 1996
Global CSE (with AVAIL)        GCSE          Cocke, 1970
SCC-based Value Numbering†     SCCVN/VDCM    Simpson, 1996
Partitioning Algorithm†        AWZ           Alpern et al, 1988

… and there are many others …

†We have not seen these ones (yet).

Three general approaches
• Hash-based, bottom-up techniques
• Data-flow techniques
• Partitioning
Each has strengths & weaknesses
Making Theory Concrete
Comparing the techniques
[Figure: the example control-flow graph with blocks A through G; each redundant expression is labeled with the techniques that find it (LVN, SVN, DVN, GRE).]

The VN methods are ordered
• LVN ≤ SVN ≤ DVN (≤ SCCVN)
• GRE is different
    - Based on names, not value
    - Two-phase algorithm
        · Analysis
        · Replacement
Redundancy Elimination Wrap-up
Comparisons

Name    Scope        On/Off Line    Operates On    Basis of Identity
LVN     local        online         blocks         value
SVN     superlocal   online         EBBs           value
DVNT    regional     online         dom. tree      value
GCSE    global       offline        CFG            lexical

Name    Visits per Node    Commutes    Algebraic Identities    Constants    Optimistic
LVN     1                  yes         yes                     yes          n/a
SVN     1                  yes         yes                     yes          n/a
DVNT    1                  yes         yes                     yes          n/a
GCSE    D(CFG) + 3         no          no                      no           no

Better results in loops
Redundancy Elimination Wrap-up
Generalizations
• Hash-based methods are fastest
• AWZ (& SCCVN) find the most cases
• Expect better results with larger scope

Experimental data
• Ran LVN, SVN, DVNT, AWZ
• Used global name space for DVNT
    - Requires offline replacement
    - Exposes more opportunities
• Code was compiled with lots of optimization

How did they do?
• DVNT beat AWZ
• Improvements grew with scope
• DVNT vs. SCCVN was ± 1%
• DVNT 6x faster than SCCVN
• SCCVN 2.5x faster than AWZ

AWZ: the partitioning method based on DFA minimization
Redundancy Elimination Wrap-up
Conclusions
• Redundancy elimination has some depth & subtlety
• Variations on names, algorithms & analysis matter
• Compile-time speed does not have to sacrifice code quality

DVNT is probably the method of choice
• Results quite close to the global methods (± 1%)
• Much lower costs than SCCVN or AWZ
Lattice Theory
This stuff is somewhat dry
Everybody stand up and stretch