CISC 879 - Machine Learning for Solving Systems Problems John Cavazos Dept of Computer & Information...

Post on 13-Dec-2015

CISC 879 - Machine Learning for Solving Systems Problems

John Cavazos

Dept of Computer & Information Sciences

University of Delaware

www.cis.udel.edu/~cavazos/cisc879

Machine Learning applied to Static Compilation

Lecture 2

Hardware constantly changing

Heterogeneous Processors in Gaming Devices

Massively Parallel Graphics Processing Units

Heterogeneous Processors In Supercomputers

Powerful Embedded Devices

Compilers changing slower

► In the early days of compilers …

1957: The FORTRAN Automatic Coding System

[Diagram: compiler divided into Front End, Middle End, and Back End; stages shown: Front End, Index Optimization, Code Merge, Flow Analysis, Register Allocation, Final Assembly]

Compilers changing slower

► And 50 years later…
► Compilers have not changed much
► Inadequate support for modern architectures

2007: Typical Compiler

[Diagram: compiler divided into Front End, Middle End, and Back End; stages shown: Front End, High-Level Optimization, Mid-Level Optimization, Flow Analysis, Register Allocation, Final Assembly]

Proposed Solution

► Intelligent Compilers
► Using AI (i.e., machine learning) techniques
► Learn to optimize
► Specialize to architecture

[Diagram: Intelligent Compiler (Ex: Neural Networks, Decision Trees, Reinforcement Learning) connected to Applications and Architecture, with Feedback]

Intelligent Compilers?

► Compiler improves itself
► Showing it examples of behaviour we want.

[Figure labels: Unroll, Tiling, Fusion, Fission]

Applying Machine Learning

► Inputs
  ► Program characterization
► Outputs
  ► Set of optimizations to apply

Case Study

► Whole Program Optimization
► Paper: Rapidly Selecting Good Compiler Optimizations using Performance Counters, Cavazos et al., CGO 2007

Whole Program Optimization

► Automatically construct “model”
  ► Map performance counters to good opts
► Model predicts optimizations to apply
  ► Use performance counter characterization

Inputs: Performance Counters

Mnemonic   Description           Avg Value
FPU_IDL    Floating Unit Idle    0.473
VEC_INS    Vector Instructions   0.017
BR_INS     Branch Instructions   0.047
L1_ICH     L1 Icache Hits        0.0006

Outputs: Optimizations

Optimization Levels: O0, O1, O2

Optimizations Controlled:

► Branch Opts Low
► Constant Prop / Local CSE
► Reorder Code
► Copy Prop / Tail Recursion
► Static Splitting / Branch Opt Med
► Simple Opts Low
► While into Untils / Loop Unroll
► Branch Opt High / Redundant BR
► Simple Opts Med / Load Elim
► Expression Fold / Coalesce
► Global Copy Prop / Global CSE
► SSA

Training Compiler

► Present a training database of
  ► Characteristics of application
  ► “Right” optimizations to use

[Figure: two training examples, with feature vectors (.91,.32,.40,51) and (.61,.12,.50,81), each presented to the Model together with choices among Unroll, Tiling, Fusion, Fission]

Using Trained Compiler

► Present characteristics of “new” application

► Compiler predicts how to optimize it

(.81,.35,.40,69)

Model
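The train-then-predict workflow can be sketched in a few lines. This is only an illustration: it uses a 1-nearest-neighbour lookup as a stand-in for the model (the case study later in the lecture uses logistic regression), and the feature values, the optimization labels, and the name `predict_optimizations` are all made up.

```python
import math

# Hypothetical training database: (characteristics, "right" optimizations).
# All numbers and labels are invented for illustration.
TRAINING_DB = [
    ((0.9, 0.30, 0.4, 0.5), {"unroll", "tiling"}),
    ((0.6, 0.10, 0.5, 0.8), {"fusion"}),
]

def predict_optimizations(features, database=TRAINING_DB):
    """Return the optimizations of the closest training example."""
    _, opts = min(database, key=lambda entry: math.dist(entry[0], features))
    return opts

# Characteristics of a "new" application (also invented).
new_app = (0.8, 0.35, 0.4, 0.7)
print(sorted(predict_optimizations(new_app)))
```

A nearest-neighbour lookup is chosen here only because it is the shortest way to show "similar characteristics lead to similar optimizations"; any classifier could take its place.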

Performance Counters

Characterization of 181.mcf

► Perf cntrs relative to several benchmarks

Problem: Greater number of memory accesses per instruction than average

Training Perf Cntr Model

► Programs to train model (different from test program).
► Baseline runs to capture performance counter values.
► Obtain performance counter values for a benchmark.
► Runs with the best optimizations to get speedup values.
► Perform training on a large set of programs.
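One way to picture the data gathered by the training steps above is a record per program that pairs its baseline counters with the optimization sequences that ran fastest. The sketch below is an assumption about how such a record could be assembled, not the paper's procedure: `TrainingExample`, `build_example`, and the `keep_within` threshold are hypothetical, and the measurements are invented.

```python
from dataclasses import dataclass

@dataclass
class TrainingExample:
    program: str
    counters: dict        # counter name -> value from the baseline run
    good_sequences: list  # optimization sequences that performed best

def build_example(program, baseline_counters, runs, keep_within=0.05):
    """Build one training example from measured runs of one program.

    runs: list of (sequence, speedup) pairs, speedup relative to the baseline.
    keep_within: hypothetical threshold; sequences within this fraction of
    the best speedup are treated as "good".
    """
    best = max(speedup for _, speedup in runs)
    good = [seq for seq, speedup in runs if speedup >= best * (1 - keep_within)]
    return TrainingExample(program, baseline_counters, good)

# Made-up measurements for a made-up program.
runs = [(("unroll",), 1.20), (("unroll", "gcse"), 1.18), ((), 1.00)]
example = build_example("toy_benchmark", {"BR_INS": 0.047}, runs)
print(example.good_sequences)
```

Repeating this for every training program would give the database that the model is trained on.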

Using Perf Cntr Model

► New program for which we want good performance.
► Baseline run to capture performance counter values.
► Input performance counter values to model.
► Model predicts optimization sequences to apply.
► Model can predict multiple optimization sequences to try.
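If the model yields, for each optimization, a probability that it is beneficial, several candidate sequences can be drawn by sampling each optimization independently. The sketch below assumes that setup; `sample_sequences` and the probability values are hypothetical, and a "sequence" here is simply the set of enabled optimizations in a fixed order.

```python
import random

def sample_sequences(probabilities, n_sequences, seed=0):
    """Draw candidate sequences from per-optimization probabilities.

    probabilities: dict mapping optimization name -> predicted probability
    that enabling it is beneficial (values are assumed to come from a model).
    """
    rng = random.Random(seed)
    sequences = []
    for _ in range(n_sequences):
        # Enable each optimization with its predicted probability.
        seq = tuple(opt for opt, p in probabilities.items() if rng.random() < p)
        sequences.append(seq)
    return sequences

# Invented model output for a new program.
predicted = {"unroll": 0.9, "gcse": 0.5, "tail_recursion": 0.1}
for seq in sample_sequences(predicted, n_sequences=3):
    print(seq)
```

Each sampled sequence would then be compiled and timed, and the fastest one kept.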

Logistic Regression

► Variation of ordinary regression
► Inputs
  ► Continuous, discrete, or a mix
  ► 60 performance counters
  ► All normalized to cycles executed
► Outputs
  ► Number between 0 and 1
  ► Probability an optimization is beneficial
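A minimal sketch of logistic regression for a single optimization, using only the standard library. It is not the implementation from the case study: the training loop is plain stochastic gradient descent, the learning rate and epoch count are arbitrary, and the data (one normalized counter per program, with a 0/1 label for whether a hypothetical optimization helped) is made up.

```python
import math

def normalize(raw_counts, cycles):
    """Divide every raw counter value by the number of cycles executed."""
    return {name: count / cycles for name, count in raw_counts.items()}

def sigmoid(z):
    """Numerically stable logistic function; the result lies in [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, bias, features):
    """Probability that the optimization is beneficial for these features."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(examples, labels, lr=0.5, epochs=2000):
    """Fit weights with stochastic gradient descent on the log loss."""
    weights, bias = [0.0] * len(examples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            err = predict(weights, bias, x) - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Made-up data: one normalized counter per program, label 1 if the
# hypothetical optimization helped and 0 otherwise.
examples = [(0.1,), (0.2,), (0.8,), (0.9,)]
labels = [0, 0, 1, 1]
weights, bias = train(examples, labels)
print(predict(weights, bias, (0.85,)))
```

With 60 counters the feature tuples would simply be longer; one such model per optimization gives one probability per optimization.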

Experimental Methodology

► PathScale industrial-strength compiler
  ► Compare to highest opt level (-Ofast)
  ► Orchestrate 121 compiler optimizations
► AMD Athlon processor
  ► Real machine; not simulation
► 57 benchmarks
  ► SPEC (95, 2000), MiBench, Polyhedral

Evaluated Search Strategies

► RAND
  ► Randomly select 500 optimization seqs
► Combined Elimination (CE)
  ► State-of-the-art search technique [CGO ‘06]
► Performance Counter (PC) Model
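The RAND strategy is the simplest of the three to sketch: draw a fixed number of random optimization settings, evaluate each, and keep the fastest. In the sketch below, `evaluate` is a hypothetical callback that would compile and run the program and return its runtime; the toy cost function and the flag names are invented, and each flag is switched on with probability 0.5 as an assumption.

```python
import random

def random_search(optimizations, evaluate, n_sequences=500, seed=0):
    """Evaluate randomly chosen optimization settings and keep the fastest."""
    rng = random.Random(seed)
    best_seq, best_time = None, float("inf")
    for _ in range(n_sequences):
        # Each optimization is switched on with probability 0.5.
        seq = tuple(opt for opt in optimizations if rng.random() < 0.5)
        runtime = evaluate(seq)
        if runtime < best_time:
            best_seq, best_time = seq, runtime
    return best_seq, best_time

# Toy stand-in for "compile and time the program": every enabled
# optimization saves one time unit.
def toy_evaluate(seq):
    return 10 - len(seq)

best_seq, best_time = random_search(["a", "b", "c"], toy_evaluate)
print(best_seq, best_time)
```

With 121 optimizations the space is far too large to enumerate, which is why the number of evaluations a strategy needs matters.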

PCModel vs CE

9 benchmarks over 20% improvement and 17% on average!

Obtained over 25% improvement on 6 benchmarks!

On average, CE obtains 9% and PC Model 17% over -Ofast

Performance vs Evaluations

PC Model (17%)

Random (17%)

Combined Elim (12%)

CE worse than RAND?

► Combined Elimination

► Easily stuck in local minima

► RAND and PC Model

► Probabilistic techniques

► Depends on distribution of good points

► Not susceptible to local minima
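The local-minimum effect can be shown on a toy example. The sketch below is not Combined Elimination itself; `greedy_elimination` is a simplified stand-in that starts with every flag on and removes one flag at a time only while that helps. The two flags and their runtimes are invented so that turning off either flag alone is slower, while turning off both is fastest.

```python
import random

# Invented runtimes for every on/off combination of two flags.
TOY_RUNTIME = {
    frozenset({"a", "b"}): 10.0,
    frozenset({"a"}): 11.0,
    frozenset({"b"}): 11.0,
    frozenset(): 6.0,
}

def greedy_elimination(flags, runtime):
    """Start with all flags on; drop one flag at a time while it helps."""
    enabled = frozenset(flags)
    while True:
        candidates = [enabled - {f} for f in enabled]
        better = [c for c in candidates if runtime(c) < runtime(enabled)]
        if not better:
            return enabled  # no single removal helps: a local minimum
        enabled = min(better, key=runtime)

def random_sampling(flags, runtime, n=200, seed=0):
    """Evaluate n random settings and return the fastest one."""
    rng = random.Random(seed)
    samples = [frozenset(f for f in flags if rng.random() < 0.5) for _ in range(n)]
    return min(samples, key=runtime)

runtime = TOY_RUNTIME.__getitem__
stuck = greedy_elimination(("a", "b"), runtime)
sampled = random_sampling(("a", "b"), runtime)
print(sorted(stuck), runtime(stuck))
print(sorted(sampled), runtime(sampled))
```

The greedy search stops at its starting point because every single-flag change looks worse, whereas sampling only depends on how often good points occur in the space.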

Static vs Dynamic Features

Conclusions

► Using machine learning successful
  ► Out-performs production-quality compiler
► Using performance counters
  ► Automatically determines important characteristics
  ► Optimizations applied only when beneficial

Example Projects

► Use performance counters to predict “how” and “when” to apply an optimization
  ► Individual Opts: E.g., how many times to unroll a loop?
  ► Optimization sequences: Which opts to apply?
► Malware identification
  ► Can malware be identified by performance counter characteristics?