Register Allocation via Coloring of Chordal Graphs
Fernando M Q Pereira
Jens Palsberg
UCLA
95% of the interference graphs produced from Java methods by the JoeQ compiler are chordal.
• Java 1.5 library: 95.5% (23,681 methods).
• Public ML graphs: 94.1% (27,921 IGs).
– Not considering pre-colored registers.
• But what is a chordal graph?
Why is this good news?
• Many problems that are NP-complete for general graphs can be solved in linear time on chordal graphs. Example: graph coloring.
• Simpler register allocation algorithms, but still competitive!
Register Allocation is Complicated…
• Iterated Register Coalescing [George and Appel 96]
[Diagram: phases of Iterated Register Coalescing: build, simplify, coalesce, freeze, potential spill, select, actual spill.]
The Proposed Algorithm.
[Diagram: phases of the algorithm: SEO, pre-spilling phase, coloring phase, coalescing phase.]
• Simple
• Modular
• Efficient
• Works with non-chordal interference graphs.
Some terminology:
– Induced subgraphs: H = G[VH]
– Induced cycles: H is a cycle.
– Clique: H is a complete graph.
[Figures: example graphs over the nodes A, B, C, D, E illustrating induced subgraphs, induced cycles, and cliques.]
Chordal Graphs.
• A graph G is chordal iff it has no induced cycle with more than 3 nodes (every induced cycle is a triangle).
[Figure: two graphs over the nodes A, B, D, E, one non-chordal and one chordal.]
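The definition can be checked mechanically. The sketch below is one simple way to do it, not necessarily the test used in the talk: it repeatedly deletes a node whose neighbors form a clique (a simplicial node), and a graph is chordal exactly when every node can be deleted this way. The edges of the two four-node graphs are assumptions, since the slide figure did not survive in the transcript.

```python
from itertools import combinations

def make_graph(edges):
    """Build an adjacency map (node -> set of neighbors) from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def is_clique(graph, nodes):
    """True if every pair of the given nodes is adjacent."""
    return all(v in graph[u] for u, v in combinations(nodes, 2))

def is_chordal(graph):
    """Test chordality by repeatedly deleting a simplicial node."""
    remaining = {u: set(vs) for u, vs in graph.items()}  # work on a copy
    while remaining:
        simplicial = next(
            (u for u, vs in remaining.items() if is_clique(remaining, vs)),
            None,
        )
        if simplicial is None:
            return False  # no simplicial node left, so the graph is not chordal
        for v in remaining.pop(simplicial):
            remaining[v].discard(simplicial)
    return True

# Assumed example: a 4-cycle, with and without a chord.
cycle_edges = [("A", "B"), ("B", "E"), ("E", "D"), ("D", "A")]
square = make_graph(cycle_edges)
square_with_chord = make_graph(cycle_edges + [("A", "E")])

print(is_chordal(square))             # False
print(is_chordal(square_with_chord))  # True
```

This version favors clarity over speed; the linear-time algorithms mentioned later in the talk avoid the repeated clique tests.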
Why are Interference graphs Chordal?
• Chordal graphs are the intersection graphs of subtrees of a tree:
[Figure: a tree with the nodes A, B, C, D, E, F.]
But CFG’s are not trees…
int m(int a, int d) {
  int b, c;
  if (a > 0) {
    b = 1;
    c = a;
  } else {
    c = 2;
    b = d;
  }
  return b + c;
}
[Figure: interference graph of m over the nodes a, b, c, d.]
Interference graphs of programs in SSA form are chordal.
• Independently proved by Brisk[2005], Hack[2005], Bouchez[2005].
• Intuition:
– The chordal graphs are the intersection graphs of subtrees of a tree.
– Live ranges in SSA are subtrees of the dominance tree.
Why are only 95% of the graphs chordal?
• Executable code is not in SSA form.
• SSA elimination.
– Phi-functions are abstract constructions.
– In executable code, phi-functions are replaced by copy instructions.
• We call programs after SSA elimination Post-SSA programs.
• Some Post-SSA programs are non-chordal :(
The Proposed Algorithm.
The pre-spilling version:
[Diagram: phases SEO, pre-spilling phase, coloring phase, coalescing phase.]
The post-spilling version:
[Diagram: phases SEO, coloring phase, post-spilling phase, coalescing phase.]
The Example.
int B = R1;
int A = R2;
int F = 1;
int E = A + F;
int D = 0;
int C = D;
R2 = C + E;
R1 = B;
Two registers available for allocation: R1 and R2
[Figure: interference graph of the example over the nodes A, B, C, D, E, F, R1, R2.]
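Since the figure is lost, the interference graph of the example can be rebuilt from the code. The sketch below runs a standard backward liveness pass over the eight straight-line instructions. It assumes that R1 and R2 are live at the end of the program, which the slides do not state, and the function and variable names are illustrative.

```python
def interference_graph(program, live_at_exit):
    """Build an interference graph for straight-line code.

    program is a list of (defined_name, used_names) pairs, one per
    instruction; live_at_exit holds the names live after the last one.
    """
    graph = {name: set() for name in live_at_exit}
    for target, sources in program:
        for name in (target, *sources):
            graph.setdefault(name, set())
    live = set(live_at_exit)
    for target, sources in reversed(program):
        # The defined name interferes with everything else live after it.
        for other in live - {target}:
            graph[target].add(other)
            graph[other].add(target)
        live = (live - {target}) | set(sources)
    return graph

program = [
    ("B", ["R1"]),        # B = R1
    ("A", ["R2"]),        # A = R2
    ("F", []),            # F = 1
    ("E", ["A", "F"]),    # E = A + F
    ("D", []),            # D = 0
    ("C", ["D"]),         # C = D
    ("R2", ["C", "E"]),   # R2 = C + E
    ("R1", ["B"]),        # R1 = B
]
# Assumption: both machine registers are live when the code ends.
graph = interference_graph(program, live_at_exit={"R1", "R2"})
for name in sorted(graph):
    print(name, sorted(graph[name]))
```

Under these assumptions B interferes with A, C, D, E, F and R2, which is consistent with the later slide saying that B is present in most of the cliques.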
The Simplicial Elimination Ordering.
Simplicial Elimination Ordering
Neighbors of N that precede N constitute a clique:
S1 = (A, F, B, E, D, C, R2, R1) and S2 = (R2, B, E, F, A, D, C, R1) are SEOs.
But S3 = (R2, R1, D, F, E, C, A, B) is not an SEO. Why?
[Figure: interference graph of the example over the nodes A, B, C, D, E, F, R1, R2.]
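The three orderings can be checked directly against the definition. In the sketch below the edge list is an assumption: it is the interference graph obtained from the example program by ordinary liveness analysis, because the figure itself is not in the transcript.

```python
from itertools import combinations

# Assumed edges of the example interference graph.
EDGES = [
    ("A", "B"), ("A", "F"), ("B", "F"), ("B", "E"), ("B", "D"),
    ("B", "C"), ("B", "R2"), ("E", "D"), ("E", "C"), ("R1", "R2"),
]

def make_graph(edges):
    """Build an adjacency map (node -> set of neighbors) from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def is_seo(graph, order):
    """True if, for every node, its neighbors that precede it form a clique."""
    seen = set()
    for node in order:
        earlier = graph[node] & seen
        if any(v not in graph[u] for u, v in combinations(earlier, 2)):
            return False
        seen.add(node)
    return True

graph = make_graph(EDGES)
S1 = ["A", "F", "B", "E", "D", "C", "R2", "R1"]
S2 = ["R2", "B", "E", "F", "A", "D", "C", "R1"]
S3 = ["R2", "R1", "D", "F", "E", "C", "A", "B"]

print(is_seo(graph, S1))  # True
print(is_seo(graph, S2))  # True
print(is_seo(graph, S3))  # False
```

With these edges S3 fails at its last node: all six neighbors of B precede it, and they are not pairwise adjacent (R2 and D, for instance).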
A third definition of chordal graph.
• A graph G = (V, E) is chordal if, and only if, it has a simplicial elimination ordering [Dirac 61].
• There exist O(|V| + |E|) algorithms to find a simplicial elimination ordering:
– Maximum Cardinality Search,
– Lexicographical Breadth First Search.
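As an illustration of Maximum Cardinality Search, here is a minimal sketch. It picks the next node with a plain `max`, so it runs in quadratic time; reaching O(|V| + |E|) needs a bucket structure that is left out here. The example graph is again the assumed interference graph of the running example.

```python
def maximum_cardinality_search(graph):
    """Return the nodes in the order maximum cardinality search visits them.

    At each step the unvisited node with the most visited neighbors is
    chosen. On a chordal graph, the neighbors of each node that precede it
    in the returned order form a clique.
    """
    weight = {node: 0 for node in graph}  # visited neighbors per unvisited node
    order = []
    while weight:
        node = max(weight, key=weight.get)
        order.append(node)
        del weight[node]
        for neighbor in graph[node]:
            if neighbor in weight:
                weight[neighbor] += 1
    return order

# Assumed interference graph of the running example.
graph = {
    "A": {"B", "F"},
    "B": {"A", "C", "D", "E", "F", "R2"},
    "C": {"B", "E"},
    "D": {"B", "E"},
    "E": {"B", "C", "D"},
    "F": {"A", "B"},
    "R1": {"R2"},
    "R2": {"B", "R1"},
}
print(maximum_cardinality_search(graph))
```

Ties are broken arbitrarily, so the ordering returned may differ from S1 and S2 while still being a valid SEO.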
The Pre-Spilling Phase.
• Chromatic number = size of largest clique (for chordal graphs).
1 - List all the maximal cliques in the graph.
2 - Remove nodes until all maximal cliques have K or fewer nodes.
2.1 - Which registers to remove?
– For each register r:
  • n = number of big cliques that contain r.
  • f = frequency of use.
  • s = size of r’s live range.
– Spill factor = n * s / f
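The steps above can be sketched as follows. This is one reading of the slide, not the authors' implementation: it assumes the node with the largest spill factor is removed first, it finds the maximal cliques by walking an SEO, and it uses uniform live-range sizes and use frequencies because the slides give no numbers. The graph is the assumed interference graph of the running example.

```python
def pre_spill(graph, order, k, live_range_size, use_frequency):
    """Remove nodes until no maximal clique has more than k nodes.

    order must be a simplicial elimination ordering of graph. Returns the
    removed (spilled) nodes. Use frequencies are assumed to be positive.
    """
    alive = list(order)
    spilled = []
    while True:
        # Each node plus its neighbors that precede it forms a clique.
        seen, cliques = set(), []
        for node in alive:
            cliques.append({node} | (graph[node] & seen))
            seen.add(node)
        # Keep only the maximal cliques that are too big for k registers.
        big = [c for c in cliques
               if len(c) > k and not any(c < d for d in cliques)]
        if not big:
            return spilled

        def spill_factor(r):
            n = sum(r in clique for clique in big)
            return n * live_range_size[r] / use_frequency[r]

        victim = max(alive, key=spill_factor)
        alive.remove(victim)
        spilled.append(victim)

# Assumed interference graph of the running example.
graph = {
    "A": {"B", "F"},
    "B": {"A", "C", "D", "E", "F", "R2"},
    "C": {"B", "E"},
    "D": {"B", "E"},
    "E": {"B", "C", "D"},
    "F": {"A", "B"},
    "R1": {"R2"},
    "R2": {"B", "R1"},
}
S1 = ["A", "F", "B", "E", "D", "C", "R2", "R1"]
uniform = {node: 1 for node in graph}  # hypothetical sizes and frequencies
print(pre_spill(graph, S1, 2, uniform, uniform))  # ['B']
```

Removing a node from an SEO leaves an SEO of the remaining graph, so the same ordering can be reused on every pass.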
Pre-spilling the example, with S1 = (A, F, B, E, D, C, R2, R1):
[Figure: the cliques found along S1 in the interference graph of the example.]
• Only look into cliques greater than K = 2.
• Node B is present in most of the cliques, and must be removed.
Resulting graph, with S1 = (A, F, E, D, C, R2, R1):
[Figure: the interference graph without node B, over the nodes A, C, D, E, F, R1, R2.]
The Coloring Phase.
Coloring Chordal Graphs.
• Feed the greedy coloring with a simplicial elimination ordering.
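A minimal sketch of greedy coloring driven by an SEO follows. How the talk handles the pre-colored registers R1 and R2 is not recoverable from the transcript; here they are simply fixed to colors 0 and 1 beforehand, which is an assumption. The graph is the assumed example graph after node B has been spilled.

```python
def greedy_color(graph, order, precolored=None):
    """Color nodes in the given order with the lowest color not yet taken
    by an already-colored neighbor. Pre-colored nodes keep their color."""
    color = dict(precolored or {})
    for node in order:
        if node in color:
            continue  # pre-colored register
        taken = {color[n] for n in graph[node] if n in color}
        c = 0
        while c in taken:
            c += 1
        color[node] = c
    return color

# Assumed example graph after pre-spilling (node B removed).
graph = {
    "A": {"F"},
    "F": {"A"},
    "E": {"C", "D"},
    "D": {"E"},
    "C": {"E"},
    "R1": {"R2"},
    "R2": {"R1"},
}
S1 = ["A", "F", "E", "D", "C", "R2", "R1"]
color = greedy_color(graph, S1, precolored={"R1": 0, "R2": 1})
print(color)
```

On this graph the sketch needs only the two colors already held by R1 and R2, matching the two registers available in the example.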
[Animation: greedy coloring of the example graph, visiting the nodes in the order S1 = (A, F, E, D, C, R2, R1).]
Register Coalescing.
• Greedy coalescing after register allocation.
– Why not before graph coloring?
Algorithm: Register Coalescing
Input: (G, color(G))
Output: (G, color’(G))
begin
  for every non-interfering move instruction (x := y) do
    let color(x) = color(y) = unused color(N(x) U N(y));
end
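The algorithm above might look like this in Python. It is a simplified sketch: it only recolors x and y and does not merge them into one node, so a later move may undo an earlier coalescing, and it ignores pre-colored registers. The four-node graph, its coloring, and the choice of K = 3 are hypothetical.

```python
def coalesce(graph, color, moves, k):
    """Greedy coalescing after coloring.

    For each move x := y whose operands do not interfere, give x and y a
    common color that no neighbor of either one uses, if such a color
    exists among the k available colors. Returns the new coloring.
    """
    color = dict(color)  # leave the input coloring untouched
    for x, y in moves:
        if y in graph[x]:
            continue  # x and y interfere, the move cannot be removed
        used = {color[n] for n in graph[x] | graph[y]}
        free = [c for c in range(k) if c not in used]
        if free:
            color[x] = color[y] = free[0]
    return color

# Hypothetical example: x interferes with a, y interferes with b.
graph = {"x": {"a"}, "a": {"x"}, "y": {"b"}, "b": {"y"}}
before = {"a": 0, "x": 1, "b": 1, "y": 0}
after = coalesce(graph, before, moves=[("x", "y")], k=3)
print(after)  # x and y now share color 2
```

Because the shared color is unused in N(x) U N(y), the coloring stays valid after every step.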
[Figure: the colored example graph before and after coalescing, over the nodes A, C, D, E, F, R1, R2.]
The Post-Spilling Phase.
• Remove all nodes assigned the same color. E.g.:
– Remove least used color.
– Remove greatest color.
• Faster implementation, but generates worse code.
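Both heuristics fit in a few lines. The sketch below assumes a coloring that used more than K colors and drops whole color classes until K remain. The function name and the input coloring are illustrative; the coloring is what greedy coloring along S1 would give on the assumed example graph with R1 and R2 fixed to colors 0 and 1.

```python
from collections import Counter

def post_spill(color, k, strategy="least_used"):
    """Spill every node of the dropped colors until only k colors remain.

    Returns (spilled_nodes, remaining_coloring). The remaining colors are
    not renumbered.
    """
    usage = Counter(color.values())
    if strategy == "least_used":
        ranked = sorted(usage, key=lambda c: (usage[c], c))
    elif strategy == "greatest":
        ranked = sorted(usage, reverse=True)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    dropped = set(ranked[:max(0, len(usage) - k)])
    spilled = sorted(n for n, c in color.items() if c in dropped)
    kept = {n: c for n, c in color.items() if c not in dropped}
    return spilled, kept

# Assumed three-color coloring of the full example graph.
color = {"A": 0, "F": 1, "B": 2, "E": 0, "D": 1, "C": 1, "R1": 0, "R2": 1}
print(post_spill(color, 2, "least_used")[0])  # ['B']
print(post_spill(color, 2, "greatest")[0])    # ['B']
```

In this example both heuristics pick the color held only by B. A fuller version would also have to keep the colors of pre-colored registers from being dropped.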
What about a Non-Chordal Graph?
• Coloring is no longer optimal.
• For almost every possible graph, the number of colors will be between the optimal and twice the optimal [Bollobas 1988].
Benchmark
• The Java 1.5 standard libraries.
– 23,681 methods.
• Algorithms implemented in the JoeQ framework.
• Two test cases:
– Code without any transformation: 90% chordal.
– Programs in Post-SSA form: 95% chordal.
Non-transformed Programs
                        Chordal coloring   Chordal coloring   IRC
                        16 registers       18 registers       18 registers
# registers / method    4.13               4.20               4.25
# spills / method       0.0055             0.0044             0.0050
Total # spill           131                105                115
Maximum # spill         17                 15                 16
Coalescing / moves      0.29               0.34               0.31
Post-SSA Programs
                        Chordal coloring   Chordal coloring   IRC
                        16 registers       18 registers       18 registers
# registers / method    4.12               4.13               4.17
# spills / method       0.0053             0.0040             0.0049
Total # spill           125                94                 118
Maximum # spill         16                 17                 27
Coalescing / moves      0.68               0.72               0.70
[Chart: Registers per Method - Post-SSA Program. x-axis: number of registers (1 to 18); y-axis: number of methods (0 to 6000); series: MCS, Iterated Coalescing.]
Methods in Java 1.5
• 23,681 methods; 22,544 chordal methods.
• 85% of the methods could be colored with 6 regs.
• 99.8% could be colored with 16 regs.
• 28 methods demanded more than 16 regs.
Time and Complexity: G = (V, E)
• SEO: O(|V| + |E|);
• Pre-spilling: O(|V| + |E|);
• Coloring: O(|E|);
• Coalescing: O(|V|^3);
[Chart labels: coloring, spilling, coalescing; pre-spilling, largest color, least used color.]
Related Work.
• All the 27,921 public ML interference graphs are 1-perfect [Andersson 2003].
– Do structured programs have 1-perfect IGs?
• Polynomial register allocation [Brisk 2005], [Hack 2005].
– SSA-Interference graphs are chordal.
Conclusions
• Many interference graphs of structured programs are chordal;
• New algorithm:
– Modular;
– Efficient;
– Competitive;
• We have an extended version of the algorithm implemented on top of GCC:
– http://compilers.cs.ucla.edu/fernando/projects/
Are Java Interference Graphs 1-Perfect?
• 1-Perfect graph: the size of a minimum coloring equals the size of the largest clique.
– This is different from perfect graphs.
• All the 27,921 IGs of the ML compiler compiling itself are 1-perfect [Andersson, 2003].
– Not considering pre-colored registers: 94.5% of the graphs are chordal.
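To make the definition concrete, the brute-force sketch below compares the chromatic number with the size of the largest clique. It is exponential and only meant for tiny graphs. The two test graphs are standard textbook cases rather than graphs from the talk: a cycle on five nodes is not 1-perfect, while a triangle is.

```python
from itertools import combinations, product

def clique_number(graph):
    """Size of the largest clique, found by exhaustive search."""
    nodes = list(graph)
    for size in range(len(nodes), 0, -1):
        for group in combinations(nodes, size):
            if all(v in graph[u] for u, v in combinations(group, 2)):
                return size
    return 0

def chromatic_number(graph):
    """Smallest number of colors in a valid coloring, by exhaustive search."""
    nodes = list(graph)
    for k in range(1, len(nodes) + 1):
        for assignment in product(range(k), repeat=len(nodes)):
            color = dict(zip(nodes, assignment))
            if all(color[u] != color[v] for u in graph for v in graph[u]):
                return k
    return 0

def is_one_perfect(graph):
    return chromatic_number(graph) == clique_number(graph)

def cycle(n):
    """Cycle graph on the nodes 0 .. n-1."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

print(clique_number(cycle(5)), chromatic_number(cycle(5)))  # 2 3
print(is_one_perfect(cycle(5)))  # False
print(is_one_perfect(cycle(3)))  # True
```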
SSA and Post-SSA Graphs.
• SSA interference graphs:
– Chordal
– Perfect
– 1-Perfect
• Post SSA graphs:
– If phi functions are replaced by copy instructions, then register allocation is NP-complete.
Non-1-Perfect Example
int m(int a, int d) {
  int b, e, c;
  if (in() > 0) {
    e = 0;
    c = d;
  } else {
    b = 0;
    c = a;
    e = b;
  }
  return c + e;
}
[Figure: interference graph over the nodes a, d, e, b, c.]
The Post SSA Interference Graph.
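[Figure: post-SSA interference graph over the nodes a, d, b, c, c1, c2, e, e1, e2.]
[Figure: control-flow graph with the following blocks:]
e2 = 0; c2 = d;
b = 0; c1 = d; e1 = b;
e = e2; c = c2;
c = c1; e = e1;
return c + e;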
References
• [Andersson 2003] Christian Andersson, Register Allocation by Optimal Graph Coloring, 12th Conference on Compiler Construction.
• [Brisk 2005] Philip Brisk, Foad Dabiri, Jamie Macbeth, and Majid Sarrafzadeh, Polynomial-Time Graph Coloring Register Allocation, 14th International Workshop on Logic and Synthesis.
• [Hack 2005] Sebastian Hack, Daniel Grund, and Gerhard Goos, Towards Register Allocation for Programs in SSA-form.