Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

28
UNIVERSITY NIVERSITY OF OF D DELAWARE ELAWARE C COMPUTER & OMPUTER & INFORMATION NFORMATION SCIENCES CIENCES DEPARTMENT EPARTMENT Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II John Cavazos University of Delaware

description

Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II. John Cavazos University of Delaware. What is SSA?. Many data-flow problems have been formulated To limit number of analyses, use single analysis to perform multiple transformations  SSA - PowerPoint PPT Presentation

Transcript of Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

Page 1: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

Optimizing CompilersCISC 673

Spring 2011Static Single Assignment II

John Cavazos

University of Delaware

Page 2: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

2

What is SSA?

• Many data-flow problems have been formulated

• To limit number of analyses, use single analysis to perform multiple transformations SSA

• Compiler optimization algorithms enhanced or enabled by SSA: Constant propagation Dead code elimination Global value numbering Partial redundancy elimination Register allocation

Page 3: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

3

SSA Construction Algorithm (High-level sketch)

1. Insert -functions

2. Rename values

Page 4: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

4

SSA Construction Algorithm (Less high-level sketch)

• Insert -functions at every join for every name

• Solve reaching definitions

• Rename each use to def that reaches it (will be unique)

What’s wrong with this approach

• Too many -functions!

To do better, we need a more complex approach

Builds maximal SSA

Page 5: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

5

1. Insert -functions

a.) calculate dominance frontiers

b.) find global namesfor each name, build list of blocks that

define it

c.) insert -functions

SSA Construction Algorithm (Less high-level sketch)

Page 6: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

6

Insert -functions

global name n

worklist ← Block(n) // blocks in which n is assigned block b ∈ worklist

block d in b’s dominance frontierinsert a -function for n in dadd d to worklist

Page 7: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

7

Computing Dominance Frontiers

•Only join points are in DF(n) for some n

•Simple algorithm for computing dominance frontiers

For each join point x (i.e., |preds(x)| > 1) For each CFG predecessor of xRun up to IDOM(x ) in dominator tree,

add x to DF(n) for each n between x and IDOM(x )

Page 8: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

8

Dominance Frontiers

n B0 B1 B2 B3 B4 B5 B6 B7

DOM(n) 0 0,1 0,1,2 0,1,3 0,1,3,4

0,1,3,5

0,1,3,6

0,1,7

IDOM(n) -- 0 1 1 3 3 3 1

DF(n) -- -- 7 7 6 6 7 1

For each join point x

For each CFG pred of x

Run to IDOM(x ) in dom tree,

add x to DF(n) for each n

between x and IDOM(x )

B1

B2 B3

B4 B5

B6

B7

B0

B1

B2 B3

B4 B5

B6

B7

B0

Flow Graph Dominance Tree

Page 9: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

Dominance Frontiers & -Function Insertion

Dominance Frontiers

• A definition at n forces a -function at m iff n DOM(m) but n DOM(p) for some p preds(m)

• DF(n ) is fringe just beyond region n dominates

B1

B2 B3

B4 B5

B6

B7

B0

0 1 2 3 4 5 6 7DOM 0 0,1 0,1,2 0,1,3 0,1,3,4 0,1,3,5 0,1,3,6 0,1,7DF – – 7 7 6 6 7 1

in 1 forces -function in DF(1) = Ø (halt )

x ...

x (...)• DF(4) is {6}, so in 4 forces -function

in 6x (...) in 6 forces -function in DF(6) = {7}

x (...)

in 7 forces -function in DF(7) = {1}

For each assignment, we insert the -functions

Page 10: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a,a)b (b,b)c (c,c)d (d,d)y a+bz c+di i+1

B7

i > 100i ...B0

b ...c ...d ...

B2

a (a,a)b (b,b)c (c,c)d (d,d)i (i,i) a ...c ...

B1

a ...d ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

With all the -functions

• Lots of new ops

• Renaming is next

Assume a, b, c, & d defined before B0

Excluding local names avoids ’s for y & z

Page 11: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

11

2. Rename variables in a pre-order walk over dominator tree(uses counter and a stack per global name)

Staring with the root block, b

a.) generate unique names for each -functionand push them on the appropriate stacks

SSA Construction Algorithm (Less high-level sketch)

Page 12: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

12

2. Rename variables (cont’d)

b.) rewrite each operation in the blocki. Rewrite uses of global names with the current version (from the stack)ii. Rewrite definition by creating & pushing new name

c.) fill in -function parameters of successor blocks

d.) recurse on b’s children in the dominator tree

e.) <on exit from block b > pop names generated in b from stacks

SSA Construction Algorithm (Less high-level sketch)

Page 13: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

13

SSA Construction Algorithm

for each global name icounter[i] 0stack[i]

call Rename(n0)

NewName(n)i counter[n]counter[n] counter[n] + 1push ni onto stack[n]

return ni

Rename(b)for each -function in b, x (…)

rename x as NewName(x)

for each operation “x y op z” in brewrite y as top(stack[y])rewrite z as top(stack[z])rewrite x as NewName(x)

for each successor of b in the CFGrewrite appropriate parameters

for each successor s of b in dom. treeRename(s)

for each operation “x y op z” in bpop(stack[x])

Adding all the details ...

Page 14: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a,a)b (b,b)c (c,c)d (d,d)y a+bz c+di i+1

B7

i > 100i ...B0

b ...c ...d ...

B2

a (a,a)b (b,b)c (c,c)d (d,d)i (i,i) a ...c ...

B1

a ...d ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

CountersStacks

1 1 1 1 0

a

a0 b0 c0 d0

Before processing B0

b c d i

Assume a, b, c, & d defined before B0

i has not been defined

20

Page 15: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a,a)b (b,b)c (c,c)d (d,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b ...c ...d ...

B2

a (a0,a)b (b0,b)c (c0,c)d (d0,d)i (i0,i)

a ...c ...

B1

a ...d ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

CountersStacks

1 1 1 1 1

a b c d i

a0 b0 c0 d0

End of B0

i0

21

Page 16: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a,a)b (b,b)c (c,c)d (d,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b ...c ...d ...

B2

a1 (a0,a)b1 (b0,b)c1 (c0,c)d1 (d0,d)i1 (i0,i)

a2 ...c2 ...

B1

a ...d ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

CountersStacks

3 2 3 2 2

a b c d i

a0 b0 c0 d0

End of B1

i0

a1 b1 c1 d1 i1

a2 c2

22

Page 17: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a)

b (b2,b)c (c3,c)

d (d2,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a ...d ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

CountersStacks

3 3 4 3 2

a b c d i

a0 b0 c0 d0

End of B2

i0

a1 b1 c1 d1 i1

a2 c2b2 d2

c3

23

Page 18: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a)

b (b2,b)c (c3,c)

d (d2,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a ...d ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

i ≤ 100

CountersStacks

3 3 4 3 2

a b c d i

a0 b0 c0 d0

Before starting B3

i0

a1 b1 c1 d1 i1

a2 c2

24

Page 19: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a)

b (b2,b)c (c3,c)

d (d2,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d ...B4 c ...

B5

d (d,d)c (c,c)b ...B6

i > 100

CountersStacks

4 3 4 4 2

a b c d i

a0 b0 c0 d0

End of B3

i0

a1 b1 c1 d1 i1

a2 c2

a3

d3

25

Page 20: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a)

b (b2,b)c (c3,c)

d (d2,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d4 ...B4 c ...

B5

d (d4,d)c (c2,c)b ...B6

i > 100

CountersStacks

4 3 4 5 2

a b c d i

a0 b0 c0 d0

End of B4

i0

a1 b1 c1 d1 i1

a2 c2

a3

d3

d4

26

Page 21: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a)

b (b2,b)c (c3,c)

d (d2,d)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d4 ...B4 c4 ...

B5

d (d4,d3)c (c2,c4)

b ...B6

i > 100

CountersStacks

4 3 5 5 2

a b c d i

a0 b0 c0 d0

End of B5

i0

a1 b1 c1 d1 i1

a2 c2

a3

d3

c4

27

Page 22: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a3)

b (b2,b3)c (c3,c5)

d (d2,d5)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d4 ...B4 c4 ...

B5

d5 (d4,d3)c5 (c2,c4)

b3 ...B6

i > 100

CountersStacks

4 4 6 6 2

a b c d i

a0 b0 c0 d0

End of B6

i0

a1 b1 c1 d1 i1

a2 c2

a3

d3

c5 d5

b3

28

Page 23: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a (a2,a3)

b (b2,b3)c (c3,c5)

d (d2,d5)y a+bz c+di i+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a)

b1 (b0,b)c1 (c0,c)

d1 (d0,d)

i1 (i0,i) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d4 ...B4 c4 ...

B5

d5 (d4,d3)c5 (c2,c4)

b3 ...B6

i > 100

CountersStacks

4 4 6 6 2

a b c d i

a0 b0 c0 d0

Before B7

i0

a1 b1 c1 d1 i1

a2 c2

29

Page 24: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a4 (a2,a3)

b4 (b2,b3)c6 (c3,c5)

d6 (d2,d5)y a4+b4

z c6+d6

i2 i1+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a4)

b1 (b0,b4)c1 (c0,c6)

d1 (d0,d6)

i1 (i0,i2) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d4 ...B4 c4 ...

B5

d5 (d4,d3)c5 (c2,c4)

b3 ...B6

i > 100

CountersStacks

5 5 7 7 3

a b c d i

a0 b0 c0 d0

End of B7

i0

a1 b1 c1 d1 i1

a2 c2

a4

b4

c6

d6 i2

30

Page 25: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

a4 (a2,a3)

b4 (b2,b3)c6 (c3,c5)

d6 (d2,d5)y a4+b4

z c6+d6

i2 i1+1

B7

i > 100i0 ...B0

b2 ...c3 ...d2 ...

B2

a1 (a0,a4)

b1 (b0,b4)c1 (c0,c6)

d1 (d0,d6)

i1 (i0,i2) a2 ...c2 ...

B1

a3 ...d3 ...

B3

d4 ...B4 c4 ...

B5

d5 (d4,d3)c5 (c2,c4)

b3 ...B6

i > 100

After renaming

• Semi-pruned SSA form

• We’re done …

Semi-pruned only names live in 2 or more blocks are “global names”.

31

Page 26: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

26

SSA Construction Algorithm (Pruned SSA)

What’s this “pruned SSA” stuff?

• Minimal SSA still contains extraneous -functions

• Inserts some -functions where they are dead

• Would like to avoid inserting them

Page 27: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

27

SSA Construction Algorithm (Two Ideas)

• Semi-pruned SSA: discard names used in only one block Significant reduction in total number of -functions Needs only local Live information (cheap to

compute)

• Pruned SSA: only insert -functions where their value is live Inserts even fewer -functions, but costs more to do Requires global Live variable analysis (more expensive)

In practice, both are simple modifications to step 1.

Page 28: Optimizing Compilers CISC 673 Spring 2011 Static Single Assignment II

UUNIVERSITYNIVERSITY OFOF D DELAWARE ELAWARE • • C COMPUTER & OMPUTER & IINFORMATION NFORMATION SSCIENCES CIENCES DDEPARTMENTEPARTMENT

28

SSA Deconstruction

At some point, we need executable code

• Few machines implement operations

• Need to fix up the flow of values

Basic idea

• Insert copies -function pred’s

• Simple algorithm Works in most cases

• Adds lots of copies Many of them coalesce away

X17 (x10,x11)

... x17

... ...

... x17

X17 x10 X17 x11