Post on 13-Dec-2015
CISC 879 - Machine Learning for Solving Systems Problems
John Cavazos
Dept of Computer & Information Sciences
University of Delaware
www.cis.udel.edu/~cavazos/cisc879
Machine Learning applied to Static Compilation
Lecture 2
Hardware constantly changing
Heterogeneous Processors in Gaming Devices
Massively Parallel Graphics Processing Units
Heterogeneous Processors In Supercomputers
Powerful Embedded Devices
Compilers changing slower
► In the early days of compilers …
1957: The FORTRAN Automatic Coding System
[Diagram: Front End, Index Optimiz’n, Code Merge, Flow Analysis, Register Allocation, Final Assembly; grouped as Front End, Middle End, Back End]
Compilers changing slower
► And 50 years later…
  ► Compilers have not changed much
  ► Inadequate support for modern architectures
2007: Typical Compiler
[Diagram: Front End, High-Level Optimiz’n, Mid-Level Optimiz’n, Flow Analysis, Register Allocation, Final Assembly; grouped as Front End, Middle End, Back End]
Proposed Solution
► Intelligent Compilers
  ► Using AI (i.e., machine learning) techniques
    ► Learn to optimize
    ► Specialize to architecture
[Diagram labels: Applications, Architecture, Intelligent Compiler (Ex: Neural Networks, Decision Trees, Reinforcement Learning), Feedback]
Intelligent Compilers?
► Compiler improves itself
  ► Showing it examples of behaviour we want.
[Diagram labels: Unroll, Tiling, Fusion, Fission]
Applying Machine Learning
► Inputs
  ► Program characterization
► Outputs
  ► Set of optimizations to apply
Case Study
► Whole Program Optimization
  ► Paper: Rapidly Selecting Good Compiler Optimizations using Performance Counters, Cavazos et al., CGO 2007
Whole Program Optimization
► Automatically construct “model”
  ► Map performance counters to good opts
► Model predicts optimizations to apply
  ► Use performance counter characterization
Inputs: Performance Counters
Mnemonic   Description            Avg Value
FPU_IDL    Floating Unit Idle     0.473
VEC_INS    Vector Instructions    0.017
BR_INS     Branch Instructions    0.047
L1_ICH     L1 Icache Hits         0.0006
Application
Outputs: Optimizations
Optimization Levels: O0, O1, O2
Optimizations Controlled:
► Branch Opts Low
► Constant Prop / Local CSE
► Reorder Code
► Copy Prop / Tail Recursion
► Static Splitting / Branch Opt Med
► Simple Opts Low
► While into Untils / Loop Unroll
► Branch Opt High / Redundant BR
► Simple Opts Med / Load Elim
► Expression Fold / Coalesce
► Global Copy Prop / Global CSE
► SSA
Training Compiler
► Present a training database of
  ► Characteristics of application
  ► “Right” optimizations to use
[Diagram labels: (.91,.32,.40,51) and (.61,.12,.50,81), each with a Model; Unroll, Tiling, Fusion, Fission]
Using Trained Compiler
► Present characteristics of “new” application
► Compiler predicts how to optimize it
(.81,.35,.40,69)
Model
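A minimal sketch of the train-then-predict idea, assuming a training database of (characteristics, "right" optimizations) pairs. The feature values, the optimization names as strings, and the use of a 1-nearest-neighbour lookup as the "model" are all assumptions made for illustration; the slides do not say which learner sits inside the Model box here.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class TrainingExample:
    features: tuple           # characterization of one application
    optimizations: frozenset  # the "right" optimizations found for it


# Hypothetical training database; values are illustrative only.
DATABASE = [
    TrainingExample((0.9, 0.3, 0.4, 0.5), frozenset({"unroll", "tiling"})),
    TrainingExample((0.1, 0.8, 0.2, 0.7), frozenset({"fusion"})),
    TrainingExample((0.5, 0.5, 0.9, 0.1), frozenset({"fission", "unroll"})),
]


def predict(features, database=DATABASE):
    """Return the optimizations of the closest training example (1-NN)."""
    nearest = min(database, key=lambda ex: dist(ex.features, features))
    return nearest.optimizations


# Characteristics of a "new" application (hypothetical numbers).
new_application = (0.8, 0.35, 0.4, 0.6)
print(sorted(predict(new_application)))
```

Nearest-neighbour lookup is used only because it is the shortest model that maps a feature vector to a set of optimizations; any classifier with the same interface could stand in its place.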
Performance Counters
Characterization of 181.mcf
► Perf cntrs relative to several benchmarks
Problem: Greater number of memory accesses per instruction than average
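One way to picture "perf cntrs relative to several benchmarks" is to divide each counter of the program by the average of that counter over a benchmark suite, then flag the ratios that stand out. The sketch below does that; the benchmark names, counter names, numbers, and the 1.5 threshold are invented for illustration and are not measurements of 181.mcf.

```python
from statistics import mean

# Hypothetical per-instruction counter rates for a small benchmark suite.
suite = {
    "bench_a": {"mem_accesses": 0.30, "branches": 0.05},
    "bench_b": {"mem_accesses": 0.35, "branches": 0.04},
    "bench_c": {"mem_accesses": 0.25, "branches": 0.06},
}
# Hypothetical program being characterized.
target = {"mem_accesses": 0.60, "branches": 0.05}


def relative_profile(program, benchmarks):
    """Return each counter of `program` divided by the suite average."""
    profile = {}
    for name, value in program.items():
        average = mean(b[name] for b in benchmarks.values())
        profile[name] = value / average
    return profile


def above_average(profile, threshold=1.5):
    """List the counters whose ratio to the average exceeds `threshold`."""
    return sorted(name for name, ratio in profile.items() if ratio > threshold)


profile = relative_profile(target, suite)
print(profile)
print(above_average(profile))
```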
Training Perf Cntr Model
► Programs to train model (different from test program).
► Baseline runs to capture performance counter values.
► Obtain performance counter values for a benchmark.
► Best-optimization runs to get speedup values.
► Perform training on a large set of programs.
Using Perf Cntr Model
► New program for which we want good performance.
► Baseline run to capture performance counter values.
► Input performance counter values to model.
► Model predicts optimization sequences to apply.
► Model can predict multiple optimization sequences to try.
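The final step, a model that proposes several optimization sequences to try, could be sketched as sampling from per-optimization probabilities. This assumes the model outputs one probability per optimization and that optimizations are switched on independently; both are assumptions of this sketch, and the function name `sample_sequences` and the flag names are hypothetical.

```python
import random


def sample_sequences(probabilities, count, seed=0):
    """Draw up to `count` distinct on/off settings.

    `probabilities` maps an optimization name to the estimated probability
    that enabling it is beneficial. Each optimization is switched on
    independently with that probability (an assumption of this sketch).
    """
    rng = random.Random(seed)
    names = sorted(probabilities)
    seen = set()
    sequences = []
    attempts = 0
    # Cap the attempts so the loop ends even if few distinct settings exist.
    while len(sequences) < count and attempts < 100 * count:
        attempts += 1
        setting = tuple(rng.random() < probabilities[n] for n in names)
        if setting not in seen:
            seen.add(setting)
            sequences.append(dict(zip(names, setting)))
    return sequences


# Hypothetical model output for one new program.
predicted = {"unroll": 0.9, "tiling": 0.2, "fusion": 0.6, "fission": 0.0}
for candidate in sample_sequences(predicted, count=3):
    print(candidate)
```

Each candidate would then be compiled and timed, and the fastest one kept.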
Logistic Regression
► Variation of ordinary regression
► Inputs
  ► Continuous, discrete, or a mix
  ► 60 performance counters
    ► All normalized to cycles executed
► Outputs
  ► Number between 0 and 1
  ► Probability an optimization is beneficial
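A minimal pure-Python sketch of logistic regression for a single optimization: counters are normalized to cycles executed, and the output is a number between 0 and 1 read as the probability that the optimization is beneficial. The two counters, the raw counts, the labels, and the training settings (learning rate, epochs) are hypothetical; the real model uses 60 counters.

```python
import math


def normalize(raw_counts, cycles):
    """Divide raw counter totals by cycles executed."""
    return [count / cycles for count in raw_counts]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability (between 0 and 1) that the optimization is beneficial."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(examples, labels, rate=0.5, epochs=2000):
    """Fit weights by batch gradient descent on the log loss."""
    n_features = len(examples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(examples, labels):
            error = predict(weights, bias, x) - y
            grad_b += error
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
        bias -= rate * grad_b / len(examples)
        weights = [w - rate * g / len(examples) for w, g in zip(weights, grad_w)]
    return weights, bias


# Hypothetical raw counts for two counters, each measured over 1000 cycles.
raw = [[900, 100], [800, 200], [100, 700], [200, 900]]
examples = [normalize(r, 1000) for r in raw]
labels = [1, 1, 0, 0]  # 1 = the optimization helped on that program

weights, bias = train(examples, labels)
print(predict(weights, bias, normalize([850, 150], 1000)))
print(predict(weights, bias, normalize([150, 800], 1000)))
```

With one such classifier per optimization, the predicted probabilities can be thresholded or sampled to choose which optimizations to enable.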
Experimental Methodology
► PathScale industrial-strength compiler
  ► Compare to highest opt level (-Ofast)
  ► Orchestrate 121 compiler optimizations
► AMD Athlon processor
  ► Real machine; not simulation
► 57 benchmarks
  ► SPEC (95, 2000), MiBench, Polyhedral
Evaluated Search Strategies
► RAND
  ► Randomly select 500 optimization seqs
► Combined Elimination (CE)
  ► State-of-the-art search technique [CGO ’06]
► Performance Counter (PC) Model
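The RAND strategy can be sketched as: draw a fixed number of random settings, evaluate each, keep the fastest. The four flag names and the `toy_runtime` cost function below are hypothetical stand-ins for compiling and timing a real program, and treating a sequence as a set of on/off flags is an assumption of this sketch.

```python
import random

OPTIMIZATIONS = ["unroll", "tiling", "fusion", "fission"]  # hypothetical flags


def toy_runtime(setting):
    """Stand-in for compiling and timing a program (lower is better)."""
    runtime = 10.0
    if setting["unroll"]:
        runtime -= 2.0
    if setting["tiling"] and setting["fusion"]:
        runtime -= 3.0
    if setting["fission"]:
        runtime += 1.0
    return runtime


def random_search(evaluate, names, samples, seed=0):
    """Evaluate `samples` uniformly random on/off settings; keep the fastest."""
    rng = random.Random(seed)
    best_setting, best_runtime = None, float("inf")
    for _ in range(samples):
        setting = {name: rng.random() < 0.5 for name in names}
        runtime = evaluate(setting)
        if runtime < best_runtime:
            best_setting, best_runtime = setting, runtime
    return best_setting, best_runtime


best, runtime = random_search(toy_runtime, OPTIMIZATIONS, samples=500)
print(best, runtime)
```

The PC Model differs only in where the candidates come from: instead of uniform random draws, they are drawn according to the model's predictions.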
PC Model vs CE
► 9 benchmarks over 20% improvement and 17% on average!
► Obtained over 25% improvement on 6 benchmarks!
► On average, CE obtains 9% and PC Model 17% over -Ofast
Performance vs Evaluations
► PC Model (17%)
► Random (17%)
► Combined Elim (12%)
CE worse than RAND?
► Combined Elimination
  ► Easily stuck in local minima
► RAND and PC Model
  ► Probabilistic techniques
  ► Depends on distribution of good points
  ► Not susceptible to local minima
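The local-minimum argument can be illustrated with a toy example. The greedy routine below is a simplified one-flag-at-a-time elimination written for this illustration; it is not the published Combined Elimination algorithm. The two flags and the runtime table are invented so that switching off either flag alone hurts while switching off both helps.

```python
import random

FLAGS = ["a", "b"]  # hypothetical optimization flags

# Invented runtimes keyed by (a_on, b_on); lower is better.
RUNTIMES = {
    (True, True): 10.0,
    (False, True): 11.0,
    (True, False): 11.0,
    (False, False): 5.0,
}


def toy_runtime(setting):
    return RUNTIMES[tuple(setting[f] for f in FLAGS)]


def greedy_elimination(evaluate, flags):
    """Start with every flag on; repeatedly switch off the single flag whose
    removal helps most, and stop when no single removal helps."""
    setting = {f: True for f in flags}
    best = evaluate(setting)
    while True:
        candidates = []
        for f in flags:
            if setting[f]:
                trial = dict(setting, **{f: False})
                candidates.append((evaluate(trial), f))
        if not candidates or min(candidates)[0] >= best:
            return setting, best
        best, flag = min(candidates)
        setting[flag] = False


def random_sampling(evaluate, flags, samples, seed=0):
    """Evaluate random on/off settings and return the fastest one found."""
    rng = random.Random(seed)
    trials = ({f: rng.random() < 0.5 for f in flags} for _ in range(samples))
    best = min(trials, key=evaluate)
    return best, evaluate(best)


print(greedy_elimination(toy_runtime, FLAGS))
print(random_sampling(toy_runtime, FLAGS, samples=50))
```

The greedy search stops at its starting point because every single-flag change looks worse, while random sampling reaches the better setting as long as good points are not too rare.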
Static vs Dynamic Features
Conclusions
► Using machine learning successful
  ► Out-performs production-quality compiler
► Using performance counters
  ► Automatically determines important characteristics
  ► Optimizations applied only when beneficial
Example Projects
► Use performance counters to predict “how” and “when” to apply an optimization
  ► Individual Opts: E.g., how many times to unroll a loop?
  ► Optimization sequences: Which opts to apply?
► Malware identification
  ► Can malware be identified by performance counter characteristics?