Code Generation and Optimisation

Post on 28-Apr-2022

Register Allocation via Graph Colouring

Broad problem

Source program:

    y := 10; x := y*y*y; print x

Code produced by the current compiler:

    li  y, 10
    mul t, y, y
    mul x, t, y
    print x

Problem: to automatically convert unlimited-register code into limited-register code.

Broad solution

The number of registers used in a piece of code can be reduced by:

register reuse: when two registers are never simultaneously live they can be remapped to a single register;

register spilling: register values can be stored in memory and loaded into registers only when needed.

Example: register reuse

Since x and t are never live at the same time they can be remapped to a single register r.

Before:

    li  y, 10
    mul t, y, y
    mul x, t, y
    print x

After:

    li  y, 10
    mul r, y, y
    mul r, r, y
    print r

Spill memory

Suppose registers a, b, and y are spilt to memory. They might be arranged in memory as follows:

(Figure: a, b, and y stored in consecutive memory locations, with spill pointing at the first.)

The spill register points to the beginning of the spilt registers in memory.

Loads and Stores

To access spilt registers, we need to extend our target language with instructions for accessing memory.

Concrete syntax of MIPS3:

    instr ::= … (MIPS2)
            | load r0, r1, i      (r0 ⟵ mem[r1+i])
            | store r0, r1, i     (mem[r1+i] ⟵ r0)

Abstract syntax of MIPS3:

    data Instr = …
               | LOAD(Reg, Reg, Int)
               | STORE(Reg, Reg, Int)

Example: register spilling

Suppose y has been spilt to memory. We must wrap accesses to y with load and/or store instructions.

Before:

    mul y, x, x
    ...
    print y

After:

    mul t, x, x
    store t, spill, 2
    ...
    load t, spill, 2
    print t

Register allocation

Aims:

Introduce as much register reuse as possible. (The number of registers used will fall significantly, but may still be beyond the limit of the target machine.)

Spill registers to memory only if necessary. Memory access is expensive!

INTERFERENCE GRAPHS

When are registers live at the same time?

Interference graphs

Using the results of liveness analysis, we can construct an interference graph:

each register r is represented by a node labelled r in the interference graph;

there is an edge between r1 and r2 if liveness analysis states that, at some point in the code, r1 and r2 are live at the same time.
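As a rough sketch of this construction, the Haskell below derives the edges from a list of live sets. The Liveness representation (one list of live registers per program point), the Edge type, and the sample data are assumptions made for illustration; they are not taken from the slides.

```haskell
import Data.List (nub, sort)

type Reg = String

-- Assumed liveness result: one list of live registers per program point.
type Liveness = [[Reg]]

-- An undirected edge, stored with its endpoints in alphabetical order.
type Edge = (Reg, Reg)

-- Every pair of distinct registers that appear in the same live set interferes.
interferenceEdges :: Liveness -> [Edge]
interferenceEdges live =
  sort (nub [ (r1, r2) | rs <- live, r1 <- rs, r2 <- rs, r1 < r2 ])

-- Illustrative live sets (hypothetical, not one of the exercises).
exampleLive :: Liveness
exampleLive = [["x"], ["x", "y"], ["y", "z"], ["z"]]
```

Here interferenceEdges exampleLive contains the edges x-y and y-z only; x and z are never live together, so they could share a register.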

Exercise 1

Draw the interference graph for the following code.

    L1:    li    a, 0
    loop:  addi  b, a, 1
    L2:    add   c, c, b
    L3:    muli  a, b, 2
    L4:    blt   a, n, loop
    L5:    print c

    Label   Live-in
    L1      c, n
    loop    a, c, n
    L2      b, c, n
    L3      b, c, n
    L4      a, c, n
    L5      c

GRAPH COLOURING

Register allocation

Let colours 1…N represent the N registers available in the target machine.

Register allocation is similar to graph colouring:

colour the interference graph with at most N colours, such that no two nodes connected by an edge have the same colour.
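The colouring condition can be written down as a small checker. This is only a sketch: the adjacency-list Graph, the association-list Colouring, and the example graph are assumed representations chosen for illustration.

```haskell
type Reg = String
type Colour = Int
type Graph = [(Reg, [Reg])]       -- adjacency lists (assumed representation)
type Colouring = [(Reg, Colour)]  -- registers missing from the list are spilt

-- True when all colours lie in 1..n and no edge joins two equal colours.
valid :: Int -> Graph -> Colouring -> Bool
valid n g c =
  all (\k -> k >= 1 && k <= n) (map snd c)
    && and [ cr /= cs | (r, ns) <- g, s <- ns
                      , Just cr <- [lookup r c], Just cs <- [lookup s c] ]

-- Registers left without a colour.
spills :: Graph -> Colouring -> [Reg]
spills g c = [ r | (r, _) <- g, lookup r c == Nothing ]

-- Illustrative graph: a triangle a, b, c with d attached to c.
example :: Graph
example = [("a", ["b", "c"]), ("b", ["a", "c"]), ("c", ["a", "b", "d"]), ("d", ["c"])]
```

With three colours, the colouring a=1, b=2, c=3, d=1 is valid and spills nothing, whereas a=1, b=2, c=3 with d left uncoloured is also valid but spills d.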

But there is one big difference: if the graph is not N-colourable then registers may be spilt (assigned no colour).

The number of spills should be kept to a minimum!

Example 1

Suppose the target machine has 3 registers. Consider register allocation on the following interference graph.

(Figure: interference graph with nodes a, b, c, d.)

Good answer:

    Register  Colour
    a         1
    b         2
    c         2
    d         3

Bad answer:

    Register  Colour
    a         1
    b         2
    c         3
    d         Spilt

Algorithm

Register allocation, like graph colouring, is NP-complete. That is, no polynomial-time algorithm that always finds an optimal allocation is known.

We use a mild variant of Chaitin's algorithm. It is an approximate polynomial-time algorithm that typically gives good results.

Algorithm

Has two components:

The basic colouring algorithm that is parameterised by the order in which nodes should be coloured.

The simplification algorithm which decides a good order to colour the nodes.

Basic colouring algorithm

Inputs:

an interference graph g;

a list of registers rs (the order).

Outputs:

a colouring c that partially-maps registers to colours, initially empty. Uncoloured registers are to be spilt.

Let neighbours(g, c, r) denote the colours (if any) of the registers connected to r.

    foreach r in rs
        possible := {1..N} – neighbours(g, c, r) ;
        if possible ≠ {} then c[r] := minimum(possible)
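The loop above might be sketched in Haskell as follows. The curried signatures, the association-list types, and the path graph are assumptions made for this illustration.

```haskell
import Data.List ((\\))
import Data.Maybe (fromMaybe, mapMaybe)

type Reg = String
type Colour = Int
type Graph = [(Reg, [Reg])]       -- adjacency lists (assumed representation)
type Colouring = [(Reg, Colour)]  -- uncoloured registers are spilt

-- Colours already given to the registers connected to r.
neighbours :: Graph -> Colouring -> Reg -> [Colour]
neighbours g c r = mapMaybe (`lookup` c) (fromMaybe [] (lookup r g))

-- Colour the registers in the given order using colours 1..n.
basic :: Int -> [Reg] -> Graph -> Colouring
basic n order g = foldl step [] order
  where
    step c r = case [1 .. n] \\ neighbours g c r of
      []      -> c            -- no colour free: r stays uncoloured (spilt)
      (x : _) -> (r, x) : c   -- take the smallest free colour

-- Illustrative graph: the path a - b - c - d.
path :: Graph
path = [("a", ["b"]), ("b", ["a", "c"]), ("c", ["b", "d"]), ("d", ["c"])]
```

With two colours, basic 2 ["a", "b", "c", "d"] path colours all four registers, while basic 2 ["a", "d", "b", "c"] path leaves c uncoloured, because b and d have already taken both colours by the time c is reached.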

Exercise 2

Assuming the target machine has just two registers, apply the basic colouring algorithm twice to the following interference graph.

First use the node ordering b, e, f, h, g, c, d, a, and then use the node ordering h, d, g, a, c, f, e, b.

(Figure: interference graph with nodes a, b, c, d, e, f, g, h.)

Order matters!

When using the basic colouring algorithm, some node orderings lead to fewer spills than others.

The simplification algorithm aims to find a good ordering for the basic colouring algorithm.

Chaitin’s observation

Suppose that

N is the number of available registers;

graph g contains a node r with fewer than N neighbours;

g-r is the graph obtained by removing r and its edges.

If g-r is N-colourable then so is g: when r and its edges are added back to g-r, the neighbours of r use fewer than N colours, so at least one colour remains free for r.

Node ordering

Chaitin's observation suggests the following recursively-defined node ordering.

To colour the nodes in graph g, first colour the nodes in g-r, where r has fewer than N neighbours, then colour node r.

(Note, this does not help us when there exists no node with fewer than N neighbours.)

Deterministic simplification algorithm

Inputs:

an interference graph g.

Outputs:

A list of registers rs representing an order in which to colour the nodes.

Let pick(g) return the register in g with the fewest neighbours; if more than one exists, return the first alphabetically.

Let del(r, g) return graph g with register r, and its edges, removed.

    rs := []
    while non-empty(g)
        r := pick(g) ;
        rs := [r] ++ rs ;
        g := del(r, g)
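A possible Haskell rendering of pick, del, and the loop is sketched below. The adjacency-list Graph type and the example graph are assumptions for illustration, and adjacency lists are assumed to contain no duplicates.

```haskell
import Data.List (delete)

type Reg = String
type Graph = [(Reg, [Reg])]  -- adjacency lists (assumed representation)

-- Register with the fewest neighbours; ties go to the alphabetically first.
-- Only meaningful on a non-empty graph.
pick :: Graph -> Reg
pick g = snd (minimum [ (length ns, r) | (r, ns) <- g ])

-- Remove register r and every edge that touches it.
del :: Reg -> Graph -> Graph
del r g = [ (x, delete r ns) | (x, ns) <- g, x /= r ]

-- Registers picked earlier end up later in the colouring order.
simplify :: Graph -> [Reg]
simplify = go []
  where
    go rs [] = rs
    go rs g  = let r = pick g in go (r : rs) (del r g)

-- Illustrative graph: a triangle a, b, c with d attached to c.
example :: Graph
example = [("a", ["b", "c"]), ("b", ["a", "c"]), ("c", ["a", "b", "d"]), ("d", ["c"])]
```

On this graph the registers are picked in the order d, a, b, c, so simplify example yields the colouring order c, b, a, d. Each register had fewer than three neighbours when it was picked, so by Chaitin's observation three colours suffice when colouring in that order.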

Exercise 3

Give the output of the simplification algorithm on the following interference graph.

(Figure: interference graph with nodes a, b, c, d, e, f, g, h.)

REGISTER ALLOCATION

register: a device for storing small amounts of data

allocate: to apportion for a specific purpose

Webster’s Dictionary

Substitution

If register x is coloured c, then the register allocator replaces all occurrences of x in the code with rc, the machine register numbered c.

For example, if registers x, y, and t are coloured 1, 2, and 1 then:

    li  y, 10
    mul t, y, y
    mul x, t, y
    print x

becomes

    li  r2, 10
    mul r1, r2, r2
    mul r1, r1, r2
    print r1
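The substitution step might be sketched as below. The Operand and Instr types are hypothetical stand-ins for the course's instruction type, introduced only so that the example is self-contained.

```haskell
type Reg = String
type Colour = Int

-- Hypothetical operand and instruction types, for illustration only.
data Operand = R Reg | I Int deriving (Eq, Show)
data Instr = Instr String [Operand] deriving (Eq, Show)

-- Replace each coloured register by the machine register "r" ++ its colour.
-- Registers without a colour and immediates are left unchanged.
substitute :: [(Reg, Colour)] -> Instr -> Instr
substitute c (Instr op args) = Instr op (map rename args)
  where
    rename (R x) = maybe (R x) (\k -> R ("r" ++ show k)) (lookup x c)
    rename imm   = imm

-- The running example from the text.
code :: [Instr]
code =
  [ Instr "li"    [R "y", I 10]
  , Instr "mul"   [R "t", R "y", R "y"]
  , Instr "mul"   [R "x", R "t", R "y"]
  , Instr "print" [R "x"]
  ]
```

Mapping substitute [("x", 1), ("y", 2), ("t", 1)] over code reproduces the r1/r2 version of the example.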

Spilling

If register r is not coloured, then spill-code must be inserted to transfer the value of r to and from memory as required.

For example, if registers x and y are to be spilt to memory locations spill[offsetx] and spill[offsety] then

    add x, x, y

becomes

    load t1, spill, offsetx
    load t2, spill, offsety
    add t1, t1, t2
    store t1, spill, offsetx
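One way to generate such spill code for a single instruction is sketched below. The Instr type, the numeric offsets, and the temporaries t1, t2, … are assumptions for illustration; in particular the sketch assumes those temporary names are not already used in the code.

```haskell
import Data.List (nub)
import Data.Maybe (fromMaybe)

type Reg = String

-- Hypothetical instruction type, for illustration only.
data Instr
  = Op String Reg [Reg]   -- opcode, destination, sources
  | Load Reg Reg Int      -- load r0, r1, i
  | Store Reg Reg Int     -- store r0, r1, i
  deriving (Eq, Show)

-- Wrap one instruction with loads and stores for its spilt registers,
-- given the spill offset of each spilt register.
spillInstr :: [(Reg, Int)] -> Instr -> [Instr]
spillInstr offsets (Op name d ss) =
  loads ++ [Op name (temp d) (map temp ss)] ++ stores
  where
    -- Spilt registers used by this instruction, with their offsets.
    spilt  = [ (r, o) | r <- nub (d : ss), Just o <- [lookup r offsets] ]
    -- Each spilt register gets its own temporary (assumed to be fresh).
    temps  = zip (map fst spilt) [ "t" ++ show k | k <- [(1 :: Int) ..] ]
    temp r = fromMaybe r (lookup r temps)
    loads  = [ Load (temp r) "spill" o | (r, o) <- spilt, r `elem` ss ]
    stores = [ Store (temp r) "spill" o | (r, o) <- spilt, r == d ]
spillInstr _ i = [i]
```

For instance, spillInstr [("x", 0), ("y", 1)] (Op "add" "x" ["x", "y"]) produces two loads, the rewritten add, and one store, mirroring the example above with offsets 0 and 1 standing in for offsetx and offsety.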

COALESCING

Reducing unnecessary move instructions

Coalescing

If we have an instruction

move x, y

and there is no edge between x and y in the interference graph then the instruction can be eliminated.

Registers x and y can be coalesced into a new register xy whose neighbours are the union of those of x and of y.

Snag

Since a coalesced register xy may have more neighbours than x or y alone, a graph that is N-colourable before coalescing may not be after.

Coalescing can be done selectively during register allocation without compromising N-colourability: see Section 11.2 of Appel [1].

[1] Andrew Appel, Modern Compiler Implementation in ML.
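Merging two nodes of the interference graph could look roughly like this. The adjacency-list Graph type, the naming of the merged node by concatenation, and the example graph are assumptions for illustration; the sketch does not check that x and y are non-adjacent, and it does not implement the selective strategy from Appel.

```haskell
import Data.List (nub, sort)

type Reg = String
type Graph = [(Reg, [Reg])]  -- adjacency lists (assumed representation)

-- Merge registers x and y into one node named x ++ y whose neighbours
-- are the union of the neighbours of x and of y.
coalesce :: Reg -> Reg -> Graph -> Graph
coalesce x y g = (xy, merged) : [ (r, rename ns) | (r, ns) <- g, r /= x, r /= y ]
  where
    xy        = x ++ y
    merged    = sort (nub (concat [ ns | (r, ns) <- g, r == x || r == y ]))
    rename ns = nub [ if n == x || n == y then xy else n | n <- ns ]

-- Illustrative graph: the path x - a - b - y.
example :: Graph
example = [("x", ["a"]), ("y", ["b"]), ("a", ["x", "b"]), ("b", ["y", "a"])]
```

In this example x and y each have one neighbour, but the merged node xy has two, and a, b, and xy now form a triangle. The path was 2-colourable before coalescing and the result is not, which is exactly the snag described above.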

SUMMARY

What have we learnt?

Summary

Register allocation transforms unlimited-register code into limited-register code.

Aim to introduce as few spills as possible by reusing registers.

Chaitin's algorithm is not optimal, but it is efficient and typically gives good results.

Coalescing allows unnecessary move instructions to be removed.

Limitations

A long-living but infrequently-used variable may get mapped to a register r (and not spilt).

This seems wasteful of a valuable resource: it would be better to map a frequently-used variable to register r.

Acknowledgements

Many ideas in this chapter were taken from:

Gregory Chaitin of IBM: see “Register Allocation and Spilling via Graph Colouring”.

Andrew Appel: see “Modern Compiler Implementation in ML”.

IMPLEMENTATION

Interference graph (1)

Interference graph: each register r is mapped to a list of registers that are live at the same time as r.

    type IG = Map Reg [Reg]

Compute all registers live at the same time as r:

    liveWith :: (Liveness, Reg) -> [Reg]
    liveWith(live, r) = bigUnion([rs | (l, rs) <- live, member(r, rs)])

Interference graph (2)

Construct the interference graph for a given code sequence:

    interferenceGraph :: Code -> IG
    interferenceGraph(code) = [(r, liveWith(live, r)) | r <- regs]
      where
        root = fresh()
        g    = cfg(code, root)
        live = liveness(g)
        regs = bigUnion([use(i) ++ def(i) | i <- code])

Basic colouring

A colour is an integer between 1 and numRegs and a colouring maps registers to colours:

type Colour = Int

type Colouring = Map Reg Colour

Colour graph g in the order specified by rs, given an initial colouring c:

    basic :: ([Reg], IG, Colouring) -> Colouring
    basic([], g, c)   = c
    basic(r:rs, g, c) =
      case possible of
        []   -> basic(rs, g, c)                  -- no free colour: r is spilt
        x:xs -> basic(rs, g, insert(c, r, x))    -- use the smallest free colour
      where
        possible = [1..numRegs] \\ neighbours(g, c, r)

Simplification algorithm

Return registers in a good order for colouring:

    simplify :: IG -> [Reg]
    simplify([]) = []
    simplify(g)  = simplify(del(r, g)) ++ [r]
      where r = pick(g)

Colour the registers in the order given by the simplification algorithm.

    colour :: IG -> Colouring
    colour(g) = basic(simplify(g), g, [])