Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper...

20
Order from Chaos — the big picture — 1 COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. Comp 512 Spring 2011

Transcript of Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper...

Page 1: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

Order from Chaos

— the big picture —

1COMP 512, Rice University

Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved.

Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use.

Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.

Comp 512Spring 2011

Page 2: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

2

Optimization

The subject is confusing

• Whole notion of optimality

• Incredible number of transformations

• Odd, inconsistent terminology

Maybe this stuff is inherently hard

• Many intractable problems

• Many NP-complete problems

• Much overlap between problems and between solutions

• If optimization wasn’t confusing, why take COMP 512 ?

{ Value numberingRedundancy eliminationCommon subexpressions

Cooper McKinley, & Torczon cite 237 distinct papers in the survey!

Page 3: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

3

Optimization

A brief catalog (circa 1982 )

And this was before the literature exploded

Page 4: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

4

Optimization

The literature throws fuel on the fire

• Terminology is non-standard & non-intuitive

• Explanations are terse and incomplete

• Little comparative data that is believable

• No sense of perspective

• Papers give conflicting advice

An example – Is inline substitution profitable?

• Holler’s thesis: it almost always helps

• Hall’s thesis: it occasionally helps, but has lots of problems

• MacFarland’s thesis: it causes instruction cache misses

Reality lies somewhere in the middle

Waterman showed that program-specific heuristics win

{ RevivalPartially-dead codeForward propagation

} Not all those papers can be the best !

Page 5: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

5

Optimization

Improvement should be objective

• Easy to quantify

• Produce concrete improvements

• Taking measurements seems easy

Code either gets better or it gets worse

But, …

• Linear-time heuristics for hard problems

• Unforeseen consequences & poorly understood interactions

• “Obvious wins” have non-obvious downsides

• Multiple ways to achieve the same end

Experimental computer science takes a lot of work

Page 6: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

6

The Role of Comp 512

Bringing order out of chaos

• Provide a framework for thinking about optimization

• Differentiate analysis from transformation†

• Think about how things help, not what they do

Goal: a rational approach to the subject matter

• Objective criteria for evaluating ideas & papers

• Bring high school level science back into the game

†The Comp 512 Motto:Knowledge alone does not make code run faster. You have to change the code to make it run faster.

Page 7: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

7

Classic Taxonomy

Machine independent transformations

• Applicable across a broad range of machines

• Decrease ratio of overhead to real work

• Reduce running time or space

• Examples: dead code elimination

Machine dependent transformations

• Capitalize on specific machine properties

• Improve the mapping from IR to this machine

• Might use an exotic instruction (shift the reg. window for a loop)

• Example: instruction scheduling

Page 8: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

8

Classic Taxonomy

Distinction is not always clear

• Replacing multiply with shifts and adds

• Eliminating a redundant expression

The truth is somewhat muddled

• Machine independent means that we deliberately & knowingly ignore target-specific constraints

• Machine dependent means that we explicitly consider target-specific constraints

Redundancy elimination might fit in either category Versions that consider register pressure

Page 9: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

9

The Comp 512 Taxonomy

An effects-based classification (for speed)

• Five machine-independent ways to speed up code Eliminate a redundant computation Move code to a place where it executes less often Eliminate dead code Specialize a computation based on context Enable another transformation

• Three machine-dependent ways to speed up the code Manage or hide latency Take advantage of special hardware features Manage finite resources

For scalar optimization, this covers most of them

Page 10: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

10

Dead code

Dead code elim.

Partial d.c.e.

Constant propagation

Algebraic identities

The Comp 512 Taxonomy

Machine Independent

From §6 of Cooper, McKinley, & Torczon

Redundancy

Redundancy elim.

Partial red. elim.

Consolidation

Code motion

Loop-invariant c.m.

Consolidation

Global Scheduling [Click]

Constant propagation

Create opportunities

Reassociation

Replication

Specialization

Replication

Strength Reduction

Method caching

Heapstack allocation

Tail recursion elimination

Page 11: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

11

The Comp 512 Taxonomy

Machine Dependent

From §6 of Cooper, McKinley, & Torczon

Hide latency

Scheduling

Blocking references

Prefetching

Code layout

Data packing

Manage resources

Allocate (registers, tlb slots)

Schedule

Data packing

Coloring memory locations

Special features

Instruction selection

Peephole optimization

Page 12: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

12

What have we seen so far?

• Redundancy elimination LVN, SVN, DVN It is a category in taxonomy by itself

• Loop Unrolling Form of specialization

• Dead store elimination Form of dead code elimination

• Block Placement Form of latency hiding & resource management

• Inline substitution Form of specialization

Page 13: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

13

What about Fortran H? (Lecture 2)

• Eliminate a redundant computation Commoning

• Move code to a place where it executes less often Backward motion

• Eliminate dead code (must have done it, but don’t talk about it)

• Specialize a computation based on context Strength reduction

• Enable another transformation Reassociation

• Manage or hide latency

• Take advantage of special hardware features

• Manage finite resources Register allocation

}Didn’t really talk about these, if they did them

*

Page 14: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

14

Scope of Optimization (another axis)

Local

• Handles individual basic blocks Maximal length sequence of straight line code Each of the bi’s is a basic block

• Basic blocks are easy to analyze

• Can prove strongest results

• Code quality suffers at block boundaries

Local methods

• Value numbering, instruction scheduling

b4 b5

b6

b1

b3b2

*

Page 15: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

15

Superlocal

• Handles extended basic blocks

Sequence of blocks where each has a unique predecessor

Use results for bi to help with bj

• Analysis & transformation over larger region

• Fewer rough edges

• Can make it efficient by reusing results

Superlocal methods

• Value numbering, instruction scheduling

Scope of Optimization

b4 b5

b6

b1

b3b2

*

Page 16: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

16

Regional

• Arbitrary subset of blocks (loop nests, dominator subtrees)

• Use results from one block to improve others

• Limiting scope can increase focus on performance critical regions

• Can eliminate some global impediments

Regional methods

• Loop xforms (unroll, fuse, interchange,strip mining, blocking), DVNT, OSR,register promotion, prefetch insertion,software pipelining, trace scheduling ...

Scope of Optimization

b1

b2

b3

b4

b5

Remember Fortran H

*

Page 17: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

17

Whole procedure (global or intraprocedural )

• Handles entire procedure

• Make decisions based on global knowledge & global benefit

• No rough edges inside procedure (tied to compilation unit)

• Classic data-flow analysis

Global methods

• CSE (AVAIL, PRE, LCM), constant propagation, GCRA, dead code elim.,hoisting, copy coalescing, ...

Scope of Optimization

b4 b5

b6

b1

b3b2

Page 18: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

18

Whole program (interprocedural )

• Handles more than one procedure, up to entire program

• Create even larger scopes for optimization

• Limited interactions between procedures Parameters + global variables

• Analysis problems are harder

• Opportunities are different

• Issues for compiler structure

Interprocedural methods

• Inline substitution, procedure cloning,

constant propagation, using whole

program analysis to support global xforms

Scope of Optimization

b4 b5

b6

b1

b3b2

b4 b5

b6

b1

b3b2

b4 b5

b6

b1

b3b2

Page 19: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

19

Decision Complexity

Yet another axis on which to categorize optimizations

Constant timeLow-order

polynomial timeHard Problems

LVN, SVN, DVNT

Block placement

Tree balancing (ILP)

Loop unrolling Reassociation

Inline substitution

Spill Choice in RA

Copy Coalescing

Page 20: Order from Chaos — the big picture — 1COMP 512, Rice University Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

COMP 512, Rice University

20

Near-term Roadmap

Data-flow analysis

• Deeper treatment of iterative data-flow theory

• Quick look at other solvers

Static single assignment form

• Construction and destruction of SSA

• Example algorithms that use SSA

Populating the Taxonomy

• More transformations, more transformations, more …

• Try to fill in the odd corners of the taxonomy