Optimizing Compilers, CISC 673, Spring 2011: Dynamic Compilation

John Cavazos
University of Delaware, Computer & Information Sciences Department

High Level View of JVM

JVM Interpreter
• Reads a bytecode from a method
• “Interprets” the bytecode
  • Decodes opcode and operands
  • Based on the opcode, jumps to some C code
  • Passes operands
• Continues reading bytecodes from the method until:
  • Call
  • Return
  • Exception
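
The decode-and-dispatch loop described above can be sketched in a few lines. This is only an illustration: the four opcodes and their encoding are hypothetical (real JVM bytecodes differ), and an `if`/`elif` chain stands in for the jump to per-opcode C code.

```python
# Hypothetical toy bytecode set; real JVM opcodes and operand encodings differ.
PUSH, ADD, MUL, RETURN = range(4)

def interpret(code):
    """Run one method's bytecode until it returns; the result is the top of stack."""
    stack = []
    pc = 0
    while True:
        opcode = code[pc]          # read and decode the opcode
        pc += 1
        if opcode == PUSH:
            operand = code[pc]     # decode the inline operand
            pc += 1
            stack.append(operand)
        elif opcode == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == RETURN:     # a return ends interpretation of this method
            return stack.pop()
        else:                      # stand-in for the exception case
            raise ValueError(f"unknown opcode {opcode}")

# Bytecode for (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, RETURN]
result = interpret(program)
```

Every bytecode pays the decode and dispatch cost each time it executes, which is where the interpreter's slowdown relative to native code comes from.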

Interpretation
• Popular approach for high-level languages
  • e.g., Python, APL, SNOBOL, BCPL, Perl, MATLAB
• Useful for memory-challenged environments
• Low startup time & space overhead, but much slower than native code execution
• MMI (Mixed Mode Interpreter) [Suganuma’01]
  • Fast interpreter implemented in assembler

Dynamic Compilation Techniques
• Baseline compiler
  • Translates bytecodes one by one to machine code
• Quick compilation
  • Reduced set of optimizations for fast compilation
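
As a rough sketch of one-by-one translation, a baseline compiler can be thought of as a table of fixed code templates, one per bytecode. The bytecode names and the target "assembly" strings below are made up for illustration and do not correspond to any real compiler's output.

```python
# Hypothetical toy bytecodes and made-up target "assembly" templates.
TEMPLATES = {
    "iload":   ["push [fp+{0}]"],
    "iconst":  ["push {0}"],
    "iadd":    ["pop r1", "pop r0", "add r0, r1", "push r0"],
    "ireturn": ["pop r0", "ret"],
}

def baseline_compile(bytecodes):
    """Translate each bytecode independently; no cross-instruction optimization."""
    machine_code = []
    for op, *args in bytecodes:
        for line in TEMPLATES[op]:
            machine_code.append(line.format(*args))
    return machine_code

code = baseline_compile([("iload", 0), ("iconst", 1), ("iadd",), ("ireturn",)])
```

The output is clearly suboptimal (values bounce through the stack), but compilation is a single linear pass, which is the trade-off the slide describes.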

Dynamic Compilation Techniques
• Full compilation
  • Full optimizations only for selected hot methods
• Classic just-in-time compilation
  • Compile methods to native code on first invocation
  • e.g., ParcPlace Smalltalk-80, Self-91
  • Initial high (time & space) overhead for each compilation
  • Precludes use of sophisticated optimizations (e.g., SSA)
  • Responsible for many of today’s myths

Interpretation vs JIT

[Figure: two bar charts comparing Interpreter and Compiler, each bar split into Initial Overhead and Execution. Left panel: execution of 20 time units. Right panel: execution of 2000 time units.]

Selective Optimization
• Hypothesis: most execution is spent in a small percentage of methods (90/10 rule)
• Idea: use two execution strategies
  1. Interpreter or non-optimizing compiler
  2. Full-fledged optimizing compiler
• Strategy:
  • Use option 1 for initial execution of all methods
  • Profile to find “hot” subset of methods
  • Use option 2 on this subset
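
A minimal sketch of this strategy, assuming an invocation count is the only profile and using a hypothetical threshold (`HOT_THRESHOLD`); actually running a method is reduced to reporting which strategy handled the call.

```python
from collections import Counter

HOT_THRESHOLD = 3   # hypothetical; real systems tune this or use a cost model

class SelectiveRuntime:
    def __init__(self):
        self.invocations = Counter()   # profile: calls per method
        self.optimized = set()         # methods promoted to option 2

    def call(self, name):
        """Return which execution strategy handled this call."""
        if name in self.optimized:
            return "optimized"
        self.invocations[name] += 1
        if self.invocations[name] >= HOT_THRESHOLD:
            self.optimized.add(name)   # stand-in for running the optimizing compiler
        return "baseline"

rt = SelectiveRuntime()
trace = [rt.call("hot") for _ in range(5)] + [rt.call("cold")]
```

Here the first three calls to `hot` run under option 1, later calls run optimized code, and `cold` never pays the optimizing compiler's cost.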

Selective Optimization

[Figure: two bar charts comparing Interpreter, Compiler, and Selective, each bar split into Initial Overhead and Execution. Left panel: execution of 20 time units. Right panel: execution of 2000 time units.]

Selective opt: compiles 10-20% of methods, representing 90-99% of execution time

Designing a Selective Optimizer (AKA: Adaptive Optimization System)
• What is the system architecture?
• What are the profiling mechanisms and policies for driving recompilation?
• How effective are these systems?

Basic Structure of a Dynamic Compiler

[Figure: compiler pipeline from Program to Machine code, with stages labeled Structural (inlining, unrolling, loop perm), Scalar (cse, constants, expressions), Memory (scalar repl, ptrs), Reg. Alloc, and Scheduling (peephole).]

Still needs good core compiler - but more

Basic Structure of a Dynamic Compiler

[Figure: block diagram with components Program; Interpreter or Simple Translation; Executing Program; Instrumented code; Raw Profile Data; Profile Processor; Processed Profile; History (prior decisions, compile time); Controller (compilation decisions); Compiler subsystem (Optimizations).]

Method Profiling
• Counters
• Call Stack Sampling
• Combinations

Method Profiling: Counters
• Insert method-specific counter on method entry and loop back edges
• Counts how often a method is called and approximates how much time is spent in a method
• Very popular approach: Self, HotSpot
• Issues: overhead for incrementing counter can be significant
  • Not present in optimized code

Method Profiling: Counters

foo ( … ) {
  fooCounter++;
  if (fooCounter > Threshold) {
    recompile( … );
  }
  . . .
}
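
A runnable sketch of the same idea, extended with a back-edge counter so that a single long-running call can also become hot. The threshold value and the `bump` helper are assumptions for illustration, and unlike the pseudocode above this version triggers recompilation only once, at the moment the threshold is crossed.

```python
THRESHOLD = 1000        # hypothetical value; the slide leaves Threshold unspecified
counters = {}           # one counter per method
recompiled = []         # records which methods were handed to the optimizer

def bump(method, amount=1):
    """Increment the method's counter; trigger recompilation once it crosses the threshold."""
    before = counters.get(method, 0)
    counters[method] = before + amount
    if before <= THRESHOLD < counters[method]:
        recompiled.append(method)   # stand-in for recompile(...)

def sum_to(n):
    bump("sum_to")                  # counter on method entry
    total = 0
    for i in range(n):
        total += i
        bump("sum_to")              # counter on the loop back edge
    return total

sum_to(10)     # short call: 11 increments, stays cold
sum_to(2000)   # one long-running call is enough to cross the threshold
```

Note that every increment runs on the hot path of the profiled code, which is the overhead issue raised on the previous slide.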

Method Profiling: Call Stack Sampling
• Periodically record which method(s) are on the call stack
• Approximates amount of time spent in each method
• Can be compiled into the code
  • Jikes RVM, JRockit
• or use hardware sampling
• Issues: timer-based sampling is not deterministic

Method Profiling: Call Stack Sampling

[Figure: call stacks observed at successive sample points: ABC, AB, A, AB, ABC, ABC, ...]
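
To make the approximation concrete, the sketch below tallies the six stacks shown on the slide (ABC, AB, A, AB, ABC, ABC). It assumes each string lists frames from the bottom of the stack upward, so the last letter is the method that was running when the sample was taken; that reading of the figure is an assumption.

```python
from collections import Counter

# Stacks as drawn on the slide, bottom frame first; the last letter is the running method.
samples = ["ABC", "AB", "A", "AB", "ABC", "ABC"]

# How often each method was executing when the timer fired
top_of_stack = Counter(stack[-1] for stack in samples)

# How often each method appeared anywhere on the stack
on_stack = Counter(m for stack in samples for m in set(stack))

# Estimated share of execution time per method
share = {m: n / len(samples) for m, n in top_of_stack.items()}
```

With these samples, C is on top in half of them, so it is estimated to account for about half of the execution time, even though A is present on every stack.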

Method Profiling: Mixed Combinations
• Use counters initially and sampling later on
• IBM DK for Java

foo ( … ) {
  fooCounter++;
  if (fooCounter > Threshold) {
    recompile( … );
  }
  . . .
}

[Figure: call stack ABC]

Recompilation Policies
• Problem: given optimization candidates, which should be optimized?
• Counters: Optimize methods that surpass a threshold
  • Simple, but hard to tune, doesn’t consider context
• Sampling: Optimize method on call stack top
  • Addresses context issue

Recompilation Policies
• Problem: given optimization candidates, which should be optimized?
• Call Stack Sampling: Optimize all methods that are sampled
  • Simple to implement
• Use cost/benefit model
  • Seemingly complicated, but easy to engineer
  • Maintenance free
  • Naturally supports multiple optimization levels

Jikes RVM: Recompilation Policy – Cost/Benefit Model
• Define
  • cur, current opt level for method m
  • Exe(j), expected future execution time at level j
  • Comp(j), compilation cost at opt level j
• Choose j > cur that minimizes Exe(j) + Comp(j)
• If Exe(j) + Comp(j) < Exe(cur), recompile at level j
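
The decision rule above translates almost directly into code. The function name `choose_level` and the numbers in the example tables are hypothetical, chosen only to exercise the rule; they are not Jikes RVM measurements.

```python
def choose_level(cur, exe, comp):
    """Return the level to recompile at, or None to keep the current code.

    exe[j]  -- expected future execution time at level j
    comp[j] -- compilation cost at level j
    """
    candidates = [j for j in exe if j > cur]
    if not candidates:
        return None
    # Choose j > cur that minimizes Exe(j) + Comp(j)
    best = min(candidates, key=lambda j: exe[j] + comp[j])
    # Recompile only if that beats staying at the current level
    if exe[best] + comp[best] < exe[cur]:
        return best
    return None

# Hypothetical numbers in arbitrary time units, not measurements.
exe = {0: 100.0, 1: 60.0, 2: 50.0}
comp = {0: 0.0, 1: 10.0, 2: 45.0}
decision_from_0 = choose_level(0, exe, comp)
decision_from_1 = choose_level(1, exe, comp)
```

From level 0 the model picks level 1 (60 + 10 < 100); from level 1 it declines to recompile, because level 2's extra compile cost outweighs its remaining benefit.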

Jikes RVM: Recompilation Policy – Cost/Benefit Model
• Assumptions
  • Sample data determines how long a method has executed
  • Method will execute as much in the future as it has in the past
  • Compilation cost and speedup are offline averages
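
One way these three assumptions could be turned into estimates of Exe(j) and Comp(j) is sketched below. The sampling interval, the speedup table, the compile-rate table, and the use of method size as the cost driver are all hypothetical placeholders, not values or formulas taken from Jikes RVM.

```python
SAMPLE_INTERVAL_MS = 10.0            # hypothetical timer period

# Hypothetical offline averages per optimization level:
# speedup relative to level 0, and compile rate in bytecodes per millisecond.
SPEEDUP = {0: 1.0, 1: 2.0, 2: 2.5}
COMPILE_RATE = {0: 100.0, 1: 20.0, 2: 5.0}

def estimate(samples, cur, method_size):
    """Return (exe, comp) dictionaries keyed by optimization level."""
    # Assumption 1: sample count determines how long the method has executed.
    time_so_far = samples * SAMPLE_INTERVAL_MS
    # Assumption 2: it will run as long in the future as it has in the past.
    future_at_cur = time_so_far
    # Assumption 3: speedups and compile costs are offline averages.
    exe = {j: future_at_cur * SPEEDUP[cur] / SPEEDUP[j] for j in SPEEDUP}
    comp = {j: method_size / COMPILE_RATE[j] for j in COMPILE_RATE}
    return exe, comp

exe, comp = estimate(samples=20, cur=0, method_size=200)
```

Because the future-time estimate grows with each new sample, a method that stays hot eventually justifies the higher compile cost of the more aggressive levels.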

Optimization Levels

Optimization Level: Optimizations Controlled
• Opt Level O0: Branch Opts Low, Constant Prop / Local CSE, Reorder Code, Copy Prop / Tail Recursion
• Opt Level O1: Static Splitting / Branch Opt Med, Simple Opts Low
• Opt Level O2: While into Untils / Loop Unroll, Branch Opt High / Redundant BR, Simple Opts Med / Load Elim, Expression Fold / Coalesce, Global Copy Prop / Global CSE, SSA

Short Running Programs

No FDO, Mar’04, AIX/PPC

Steady State

No FDO, Mar’04, AIX/PPC

Profiling for What to Do

Myth: Sophisticated profiling is too expensive to perform online

Reality: Well-known technology can collect sophisticated profiles with sampling and minimal overhead

Suggested Reading: Dynamic Compilation

Adaptive Optimization in the Jalapeño JVM, M. Arnold, S. Fink, D. Grove, M. Hind, and P. Sweeney, Proceedings of the 2000 ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages & Applications (OOPSLA '00), pages 47-65, Oct. 2000.

Spare Slides

Method Profiling: Timer Based

class Thread
scheduler (...) {
  ...
  flag = 1;
}
void handler(...) {
  // sample stack, perform GC, swap threads, etc.
  ....
  flag = 0;
}

foo ( … ) {
  // on method entry, exit, & all loop backedges
  if (flag) {
    handler( … );
  }
  . . .
}

[Figure: call stack ABC, each frame annotated with if (flag) handler();]

• Useful for more than profiling
• Jikes RVM
  • Schedule garbage collection
  • Thread scheduling policies, etc.
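
The flag-and-handler pattern can be simulated without a real timer. In this sketch the `Runtime` class and its method names are hypothetical, and the timer tick is triggered by hand in the middle of a loop so the behavior is reproducible.

```python
class Runtime:
    def __init__(self):
        self.flag = False
        self.samples = []

    def timer_tick(self):
        """Called by the scheduler's timer (invoked manually here): just set the flag."""
        self.flag = True

    def yieldpoint(self, method):
        """The cheap check compiled into entries, exits, and loop back edges."""
        if self.flag:
            self.handler(method)

    def handler(self, method):
        self.samples.append(method)   # stand-in for sampling the stack, GC, thread swap
        self.flag = False

rt = Runtime()

def foo(n):
    rt.yieldpoint("foo")              # method entry
    for i in range(n):
        if i == 2:
            rt.timer_tick()           # simulate the timer firing mid-loop
        rt.yieldpoint("foo")          # loop back edge
    rt.yieldpoint("foo")              # method exit

foo(5)
```

The running code only ever tests a flag; the expensive work happens in the handler, and only at the next check after the timer has fired.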

Arnold-Ryder [PLDI 01]: Full Duplication Profiling

[Figure: Full-Duplication Framework, showing checking code and duplicated code, with check placement at method entry and backedges]

• Generate two copies of a method
• Execute “fast path” most of the time
• Execute “slow path” with detailed profiling occasionally
• Adapted by J9 due to proven accuracy and low overhead
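
A small sketch of the two-copy idea, assuming a simple countdown decides when to take the slow path. The class name, the sampling period, and the edge profile collected are illustrative choices, and the check is placed only at method entry for brevity (the framework also checks on backedges).

```python
SAMPLE_EVERY = 100   # hypothetical sampling period

class DuplicatedMethod:
    def __init__(self):
        self.countdown = SAMPLE_EVERY
        self.edge_profile = {"then": 0, "else": 0}

    def run(self, x):
        # Check at method entry: occasionally divert to the instrumented copy.
        self.countdown -= 1
        if self.countdown == 0:
            self.countdown = SAMPLE_EVERY
            return self._slow_path(x)
        return self._fast_path(x)

    def _fast_path(self, x):
        # Checking code: the original logic with no instrumentation
        return x * 2 if x % 2 == 0 else x + 1

    def _slow_path(self, x):
        # Duplicated code: same logic, plus detailed profiling
        if x % 2 == 0:
            self.edge_profile["then"] += 1
            return x * 2
        self.edge_profile["else"] += 1
        return x + 1

m = DuplicatedMethod()
results = [m.run(x) for x in range(1000)]
```

Out of 1000 calls only 10 run the instrumented copy, so the profiling cost is paid on about 1% of executions while both copies compute the same results.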