The Impact of Machine Learning on Branch Prediction...
Background
● Branch Prediction
  ○ Long Pipelines, BIT → :(
● Supervised Machine Learning
  ○ Input Features
  ○ Output Targets
  ○ Cost
● Dynamic Prediction
  ○ First Proposed by L. Vintan Using LVQ
  ○ Implemented by D. A. Jiménez and C. Lin
https://www.eetimes.com/document.asp?doc_id=1322696
ESP: Framework Introduction
● Motivation: ISAs w/ Branch Hints (“Likely” Bits)
● Poses Static Branch Prediction as ML Problem
● Two Phases:
  1. Profile Set of Programs
  2. Use Profile to Predict Branches in New Programs
● Robust to Different Environments
● Doesn’t Rely on Expert Heuristics
ESP: ML Problem Formalization
● Training Data - Static Features
  ○ Opcode
  ○ Branch Direction
  ○ Branch Operand Type
  ○ Basic Block Graph Metadata
  ○ More...
● Target - Branch Taken/Not Taken
[7]
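As a sketch, the profiled training data can be pictured as one feature vector per static branch with a taken/not-taken label. The feature names below (`backward`, `cmp_with_zero`, `loop_header`) are illustrative stand-ins, not Calder et al.’s exact encoding:

```python
# Hypothetical ESP-style training set: each profiled static branch becomes
# a feature vector plus a taken/not-taken target (feature names invented
# for illustration).
branches = [
    {"opcode": "bne", "backward": 1, "cmp_with_zero": 1, "loop_header": 1, "taken": 1},
    {"opcode": "beq", "backward": 0, "cmp_with_zero": 1, "loop_header": 0, "taken": 0},
]

def to_xy(branch):
    """Split one profiled branch into (features, target)."""
    x = [branch["backward"], branch["cmp_with_zero"], branch["loop_header"]]
    return x, branch["taken"]

# X holds the static feature vectors, Y the profiled branch outcomes.
X, Y = zip(*(to_xy(b) for b in branches))
```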
ESP: Neural Network (NN) Approach
● Predict Taken Probability
● ‘tanh’ Activations
● Normalized Features
● Batch Training
● Early Stopping
[7]
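A minimal forward pass in this spirit (one hidden layer of tanh units over normalized features, output squashed to a taken probability) might look like the following; the toy weights and topology are assumptions, not the paper’s exact network:

```python
import math

# Sketch of a one-hidden-layer tanh network mapping normalized static
# features to a taken probability (toy weights; not Calder et al.'s
# exact topology).
def tanh_layer(x, weights, biases):
    """One dense layer with tanh activation."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(weights, biases)]

def predict_taken_prob(x, w_hidden, b_hidden, w_out, b_out):
    h = tanh_layer(x, w_hidden, b_hidden)
    z = sum(wi * hi for wi, hi in zip(w_out, h)) + b_out
    return 0.5 * (math.tanh(z) + 1.0)  # squash output into [0, 1]

# Two normalized features in, one taken probability out.
p = predict_taken_prob([0.2, -1.0],
                       [[0.5, -0.3], [0.1, 0.8]], [0.0, 0.1],
                       [1.0, -0.5], 0.2)
```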
ESP: Decision Tree (DT) Approach
● Classify Taken or Not
● Split on Features by Max. Information Gain
● C4.5 Allows for Discrete & Continuous Features
● Pruning Employed
http://www.cnblogs.com/superhuake/archive/2012/07/25/2609124.html
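The information-gain criterion C4.5 splits on can be computed directly; the feature and labels below are a toy example invented for illustration:

```python
import math
from collections import Counter

# Information gain = entropy before the split minus the weighted entropy
# of the partitions after splitting on one feature (as C4.5 uses to pick
# split features).
def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, feature, labels):
    base = entropy(labels)
    partitions = {}
    for row, y in zip(rows, labels):
        partitions.setdefault(row[feature], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in partitions.values())
    return base - remainder

# Toy data: does "branch goes backward" predict "taken"?
rows = [{"backward": 1}, {"backward": 1}, {"backward": 0}, {"backward": 0}]
labels = [1, 1, 0, 1]
g = information_gain(rows, "backward", labels)  # about 0.311 bits
```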
ESP: Evaluation
● Both Models Beat Previous State-of-the-Art Methods
  ○ C/Fortran Benchmarks
  ○ SPEC (tomcatv, nasa7, etc.)
  ○ Others (perl, gzip, tex, etc.)
● NNs Outperformed DTs by 1%
● NNs Difficult to Interpret (DTs Are Not)
● DT: Sacrifice Performance for Explainability?
Dynamic Methods
● Perceptrons
  ○ Main Advantage Comes From Linear History Growth Rate
  ○ Also More Accurate Than Any Other Dynamic Predictors
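The linear growth advantage is easy to quantify: a perceptron stores one weight per history bit (plus a bias), while a pattern-history-table scheme needs a counter for every possible history. A rough sketch, assuming 8-bit weights and 2-bit counters:

```python
# Storage per branch as history length h grows: a perceptron keeps h + 1
# signed weights (assumed 8 bits each here), while a PHT-style scheme
# needs 2**h two-bit counters -- linear versus exponential growth.
def perceptron_bits(h, weight_bits=8):
    return (h + 1) * weight_bits

def pht_bits(h, counter_bits=2):
    return (2 ** h) * counter_bits

linear = perceptron_bits(16)   # 17 weights * 8 bits = 136 bits
exponential = pht_bits(16)     # 65536 counters * 2 bits = 131072 bits
```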
Standard Dynamic Predictors
● Saturating Counters
  ○ Updated on Each Encounter
● BHT (Branch History Table)
  ○ Each Branch Has Independent History Entries
  ○ Requires 2^n PHT Entries Per Branch (n History Bits)
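A 2-bit saturating counter can be sketched in a few lines: four states, predict taken when the counter is 2 or above, and saturate at the ends so one atypical outcome does not flip a strongly biased branch:

```python
# 2-bit saturating counter: states 0..3 (0 = strongly not-taken,
# 3 = strongly taken). Increment on taken, decrement on not-taken,
# saturating at both ends.
def update(counter, taken):
    if taken:
        return min(counter + 1, 3)
    return max(counter - 1, 0)

def predict(counter):
    """Predict taken in the upper half of the state space."""
    return counter >= 2

c = 2                    # start weakly taken
c = update(c, True)      # strongly taken (3)
c = update(c, False)     # back to weakly taken (2), still predicts taken
```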
What is a Perceptron?
● Supervised, Binary Classifier
● Most Basic Neural Network
● Makes Linear Predictions
● Comparable to Linear Regression
https://training.seer.cancer.gov/anatomy/nervous/tissue.html
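As a sketch of how such a classifier becomes a branch predictor (following the Jiménez-Lin scheme, with illustrative values): history bits are ±1 inputs, the output is a signed dot product plus a bias, and the weights train only on a misprediction or when the output magnitude falls below a threshold θ:

```python
# Sketch of a perceptron branch predictor in the Jiménez-Lin style.
# History bits are +1 (taken) / -1 (not taken); weights[0] is the bias,
# paired with an implicit constant +1 input.
def predict(weights, history):
    return weights[0] + sum(w * h for w, h in zip(weights[1:], history))

def train(weights, history, taken, theta):
    """Update weights if mispredicted or low-confidence (|y| <= theta)."""
    y = predict(weights, history)
    t = 1 if taken else -1
    if (y >= 0) != taken or abs(y) <= theta:
        weights[0] += t
        for i, h in enumerate(history):
            weights[i + 1] += t * h
    return weights

w = [0, 0, 0]                         # bias + 2 history weights
w = train(w, [1, -1], True, theta=2)  # low confidence -> weights update
```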
Simple Perceptron Prediction Example (n = 4)

Branch History: -1 (NT), 1 (T), 1 (T), -1 (NT), 1 (BIAS)
Weights: 1, 30, -2, -20, 10

Prediction: -1*1 + 1*30 + 1*-2 + -1*-20 + 1*10 = 57 > 0
Result = Taken
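The slide’s arithmetic can be checked directly, treating the bias as a constant +1 input paired with its own weight:

```python
# Verify the worked example: the dot product of the history inputs
# (including the constant bias input) with the weights is 57, which is
# non-negative, so the branch is predicted taken.
inputs = [-1, 1, 1, -1, 1]        # NT, T, T, NT, bias
weights = [1, 30, -2, -20, 10]
y = sum(i * w for i, w in zip(inputs, weights))
prediction = "Taken" if y >= 0 else "Not Taken"
```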
Hardware Implementation
● Compared Against Well-Known Techniques: gshare and Bimodal
● Compared Against Combined Global/Local Predictors
● Perceptron Evaluated with Global Prediction, Global/Local Prediction, and Finally a Dual Predictor with Override Agreement
Hardware Budgeting

Perceptron predictors benefit from larger hardware budgets: bigger history tables and longer histories allow for better prediction accuracy.
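One way to see the budget trade-off: with a fixed bit budget, longer histories mean fewer table entries, since each entry holds h + 1 weights. The 4 KB budget and 8-bit weight size below are assumptions for illustration:

```python
# Under a fixed hardware budget, perceptron table size trades off against
# history length h: each entry stores h + 1 weights (assumed 8 bits each).
def table_entries(budget_bits, h, weight_bits=8):
    return budget_bits // ((h + 1) * weight_bits)

budget = 4 * 1024 * 8                        # assumed 4 KB budget, in bits
entries_short = table_entries(budget, 12)    # short history -> more entries
entries_long = table_entries(budget, 36)     # long history  -> fewer entries
```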
[7]
Results (Misprediction Rates)

4K Budget, Global:
● Perceptron: 4.6%
● Gshare: 6.2%
● Bimode: 5.4%

4K Budget, Local/Global:
● Perceptron: 4.5%
● Hybrid: 5.2%
[7]
AMD Zen Architecture
● AMD Ryzen CPU
● Perceptron Branch Prediction
https://www.anandtech.com/show/10907/
Conclusion
● Both Proven to Boost Performance
● Static Pitfalls:
  ○ It’s Static
  ○ Static Prediction Irrelevant w/ Modern CPUs
● Dynamic Pitfalls:
  ○ Implementation Complexity
  ○ Prediction Latency
References

[1] L. N. Vințan, “Dynamic Neural Branch Prediction Fundamentals,” 2016. [Online]. Available: http://www.agir.ro/buletine/2501.pdf. [Accessed: 18-Nov-2017].
[2] L. N. Vintan and C. Egan, “Extending correlation in branch prediction schemes,” in Proceedings of the 25th EUROMICRO Conference, 1999, vol. 1, pp. 441–448.
[3] D. A. Jiménez, “Fast path-based neural branch prediction,” in Proceedings of the 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003, p. 243.
[4] R. Parihar, “Branch Prediction Techniques and Optimizations,” 18-Jul-2016. [Online]. Available: http://www.cse.iitd.ac.in/~srsarangi/col718_2016/papers/branchpred/branch-pred-many.pdf. [Accessed: 18-Nov-2017].
[5] D. A. Jiménez and C. Lin, “Neural methods for dynamic branch prediction,” ACM Transactions on Computer Systems (TOCS), vol. 20, no. 4, pp. 369–397, 2002.
References

[6] D. A. Jiménez and C. Lin, “Dynamic branch prediction with perceptrons,” in Proceedings of the Seventh International Symposium on High-Performance Computer Architecture (HPCA), 2001, pp. 197–206.
[7] B. Calder et al., “Evidence-based static branch prediction using machine learning,” ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 19, no. 1, pp. 188–222, 1997.
[8] G. H. Loh, “A simple divide-and-conquer approach for neural-class branch prediction,” in 14th International Conference on Parallel Architectures and Compilation Techniques (PACT’05), 2005, pp. 243–254.
[9] C. Williams, “‘Neural network’ spotted deep inside Samsung’s Galaxy S7 silicon brain,” The Register, 22 Aug 2016.
[10] F. Rosenblatt, Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Spartan Books, 1962.
[11] P. Bright, “AMD’s moment of Zen: Finally, an architecture that can compete,” Ars Technica, 3 Feb 2017.