Computer Architecture: A Constructive Approach
Branch Direction Prediction: Pipeline Integration
Joel Emer
Computer Science & Artificial Intelligence Lab.
Massachusetts Institute of Technology
April 23, 2012 L20-1 http://csg.csail.mit.edu/6.S078
NA pred with decode feedback
[Figure: pipeline stages Fetch (F), Decode (D), RegRead (R), Execute (X), Memory (M), Write-back (W), connected by the FIFOs fr, dr, rr, xr, mr. Next-address prediction and direction prediction are attached to Fetch; the paths labeled xf and df lead back to Fetch.]
Direction prediction recipe
Execute
- Send redirects on mispredicts (unchanged)
- Send direction prediction training
Decode
- Check if next address matches direction pred
- Send redirect if different (update naPred)
Fetch
- Generate prediction
- Learn from feedback
- Accept redirects from later stages
Epoch management recipe
Execute
- On exec epoch mismatch: poison instruction
- Otherwise, on mispredict: change exec epoch and redirect
Decode
- On new exec epoch: update local exec/decode epochs
- Otherwise, on decode epoch mismatch: drop instruction
- If not dropped, on next addr mispredict: change decode epoch and redirect
Fetch
- On exec redirect: update local exec epoch
- On decode redirect: if for current exec epoch then update local decode epoch
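The decode part of the recipe above can be sketched as a small behavioral model. This is only an illustration in Python, not the lab's Bluespec: the names `DecodeEpochs` and `decode_step` are hypothetical, and epochs are modeled as unbounded integers rather than fixed-width counters.

```python
from dataclasses import dataclass


@dataclass
class DecodeEpochs:
    """Decode's local view of the epochs (hypothetical model)."""
    exec_epoch: int = 0  # local copy of the execute epoch
    dec_epoch: int = 0   # decode's own epoch


def decode_step(state, inst_exec_epoch, inst_dec_epoch, next_addr_mispredict):
    """Apply the decode epoch rules to one instruction.

    Returns (kept, redirect): whether the instruction is passed on and
    whether decode sends a redirect to fetch.
    """
    if inst_exec_epoch != state.exec_epoch:
        # New exec epoch: adopt the epochs carried by the instruction.
        state.exec_epoch = inst_exec_epoch
        state.dec_epoch = inst_dec_epoch
    elif inst_dec_epoch != state.dec_epoch:
        # Decode epoch mismatch: wrong-path instruction, drop it.
        return (False, False)
    if next_addr_mispredict:
        # Next-address mispredict: start a new decode epoch and redirect.
        state.dec_epoch += 1
        return (True, True)
    return (True, False)
```

After a decode redirect, instructions still tagged with the old decode epoch are dropped until fetch starts tagging with the new one.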
Add direction feedback

    typedef struct {
        Bool    correct;
        NaInfo  naPredInfo;
        Addr    nextAddr;
        DirInfo dirPredInfo;
        Bool    taken;
    } Feedback deriving (Bits, Eq);

    // (execute epoch, decode epoch, feedback)
    FIFOF#(Tuple3#(Epoch,Epoch,Feedback)) decFeedback <- mkFIFOF;
    // (execute epoch, feedback)
    FIFOF#(Tuple2#(Epoch,Feedback)) execFeedback <- mkFIFOF;

Feedback needs information for training the direction predictor.
Execute (branch analysis)

    // after executing instruction...
    let nextEeEpoch = eeEpoch;
    let cond = execData.execInst.cond;
    let nextPc = cond ? execData.execInst.addr : execData.pc + 4;
    // Note: nextAddrPred may have been reset in decode
    let correctPred = (nextPc == execData.nextAddrPred);
    if (!correctPred) nextEeEpoch += 1;
    eeEpoch <= nextEeEpoch;
    // Always send feedback
    execFeedback.enq(tuple2(nextEeEpoch,
        Feedback{correct: correctPred, taken: cond,
                 dirPredInfo: execData.dirPredInfo,
                 naPredInfo: execData.naPredInfo,
                 nextAddr: nextPc}));
    // enqueue instruction to next stage
Decode with mispredict detect

    rule doDecode;
        let decData = newDecData(fr.first);
        // Determine if the epoch of the incoming instruction is on the good path:
        // new exec epoch, or same dec epoch
        let correctPath = (decData.execEpoch != deEpoch) ||
                          (decData.decEpoch == ddEpoch);
        let instResp = decData.fInst.instResp;
        let pcPlus4 = decData.pc + 4;
        if (correctPath) begin
            decData.decInst = decode(instResp, pcPlus4);
            let target = knownTargetAddr(decData.decInst);
            let brClass = getBrClass(decData.decInst);
            let predTarget = decData.nextAddrPred;
            let predDir = decData.dirPred;
Decode with mispredict detect (continued)

            // Calculate target as best as decode can
            let decodedTarget = case (brClass)
                NonBranch:   pcPlus4;
                UncondKnown: target;
                CondBranch:  (predDir ? target : pcPlus4);
                default:     decData.nextAddrPred;
            endcase;
            if (decodedTarget != predTarget) begin            // wrong next addr?
                decData.decEpoch = decData.decEpoch + 1;      // new dec epoch
                decData.nextAddrPred = decodedTarget;         // tell exec addr of next instruction!
                // send feedback
                decFeedback.enq(tuple3(decData.execEpoch, decData.decEpoch,
                    Feedback{correct: False,
                             naPredInfo: decData.naPredInfo,
                             nextAddr: decodedTarget,
                             dirPredInfo: decData.dirPredInfo,
                             taken: decData.takenPred}));
            end
            dr.enq(decData);   // enqueue to next stage on correct path
        end // of correct path
Decode with mispredict detect (continued)

        else begin // incorrect path
            // Preserve current epoch if instruction is on the incorrect path
            decData.decEpoch = ddEpoch;
            decData.execEpoch = deEpoch;
        end
        // decData.*Epoch have been set properly, so we always save them
        ddEpoch <= decData.decEpoch;
        deEpoch <= decData.execEpoch;
        fr.deq;
    endrule
Integration into Fetch

    rule doFetch();
        function Action enqInst();
            action
                let d <- mem.side(MemReq{op: Ld, addr: fetchPc, data: ?});
                match {.nAddrPred, .naPredInfo} <- naPred.predict(fetchPc);
                match {.dirPred, .dirPredInfo} <- dirPred.predict(fetchPc);
                FBundle fInst = FBundle{instResp: d};
                FData fData = FData{pc: fetchPc, fInst: fInst, inum: iNum,
                                    execEpoch: feEpoch,
                                    naPredInfo: naPredInfo, nextAddrPred: nAddrPred,
                                    dirPredInfo: dirPredInfo, dirPred: dirPred};
                iNum <= iNum + 1;
                fetchPc <= nAddrPred;
                fr.enq(fData);
            endaction
        endfunction
Handling redirect from execute

        if (execFeedback.notEmpty) begin
            match {.execEpoch, .fb} = execFeedback.first;
            execFeedback.deq;
            if (!fb.correct) begin
                // Train and repair on redirect
                dirPred.repair(fb.dirPredInfo, fb.taken);
                dirPred.train(fb.dirPredInfo, fb.taken);
                naPred.repair(fb.naPredInfo, fb.nextAddr);
                naPred.train(fb.naPredInfo, fb.nextAddr);
                feEpoch <= execEpoch;
                fetchPc <= fb.nextAddr;
            end
            else begin
                // Just train on correct prediction
                dirPred.train(fb.dirPredInfo, fb.taken);
                naPred.train(fb.naPredInfo, fb.nextAddr);
                enqInst;
            end
        end
Handling redirect from decode

        else if (decFeedback.notEmpty) begin
            decFeedback.deq;
            match {.execEpoch, .decEpoch, .fb} = decFeedback.first;
            if (execEpoch == feEpoch) begin
                if (!fb.correct) begin // epoch unchanged
                    // Just repair, never train, on feedback from decode
                    fdEpoch <= decEpoch;
                    dirPred.repair(fb.dirPredInfo, fb.taken);
                    naPred.repair(fb.naPredInfo, fb.nextAddr);
                    fetchPc <= fb.nextAddr;
                end
                else // dec feedback on correct prediction
                    enqInst;
            end
            else // dec feedback, but fetch is in new exec epoch
                enqInst;
        end
        else // no feedback
            enqInst;
Immediate update issues
If the direction predictor does not update its state immediately when it makes a prediction, things are easy. But if the predictor does update on prediction, it will predict and update on non-branches as well.
Possible solutions:
- Move direction prediction to decode, so we know not to update on non-branches. But this makes timing more critical.
- Simply use the direction predictor even on non-branch instructions.
Note: for superscalar issue designs this is a less significant problem.
Note: In the lab code we communicate the branch type of each instruction so that training and repair can decide, based on instruction type, whether to perform updates.
Predictor Primitive
Indexed table holding values
Operations
- Predict
- Update
Algebraic notation
Prediction = P[Width, Depth](Index; Update)
[Figure: table P, Width bits wide and Depth entries deep, with inputs Index (I) and Update (U) and output Prediction.]
October 24, 2011 L20-14 http://csg.csail.mit.edu/6.s078
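The primitive P[Width, Depth](Index; Update) can be modeled in a few lines. The sketch below is a hypothetical Python model for illustration: the class name `PredictorTable`, the zero initial value, and wrapping the index modulo the depth are all assumptions, not something the slides specify.

```python
class PredictorTable:
    """P[width, depth]: `depth` entries, each holding a `width`-bit value."""

    def __init__(self, width, depth, init=0):
        self.width = width
        self.depth = depth
        self.mask = (1 << width) - 1
        self.entries = [init & self.mask] * depth

    def predict(self, index):
        # Read the entry selected by the index (wrapped to the table depth).
        return self.entries[index % self.depth]

    def update(self, index, value):
        # Write a new value, truncated to the entry width.
        self.entries[index % self.depth] = value & self.mask
```

For example, `PredictorTable(1, 2048)` indexed by PC and updated with the taken bit corresponds to the shape P[1, 2K](PC; T).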
One-bit Predictor
Simple temporal prediction
A21064(PC; T) = P[1, 2K](PC; T)
[Figure: 1-bit table P, indexed (I) by PC and updated (U) with Taken, producing the Prediction.]
What happens on loop branches?
At best, mispredicts twice for every use of loop.
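The loop-branch behavior is easy to check with a tiny model of a single one-bit entry, which always predicts the last outcome seen. This is an illustrative Python sketch: the function name `one_bit_mispredicts` and the four-iteration loop are made up for the example.

```python
def one_bit_mispredicts(outcomes, initial=False):
    """Count mispredictions of a one-bit (last-outcome) predictor entry."""
    last = initial
    misses = 0
    for taken in outcomes:
        if last != taken:
            misses += 1
        last = taken  # the entry simply remembers the most recent outcome
    return misses


# A loop-closing branch: taken three times, then not taken at loop exit.
loop = [True, True, True, False]

# Each use of the loop costs two mispredictions: one at the exit and one
# on re-entry, because the entry still remembers "not taken".
print(one_bit_mispredicts(loop * 3))
```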
Two-bit Predictor
Counter[W,D](I; T) = P[W, D](I; if T then P+1 else P-1)
A21164(PC; T) = MSB(Counter[2, 2K](PC; T))
[Figure: 2-bit table P indexed (I) by PC; a +/- adder driven by Taken produces the update (U); the output is the Prediction.]
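A short model shows why the counter helps on loop branches. This is a Python sketch of a single counter entry. The slide's formula only says P+1 or P-1, so saturating at 0 and 3 is an assumption here (the usual choice), and the name `two_bit_mispredicts` is made up for illustration.

```python
def two_bit_mispredicts(outcomes, counter=0):
    """Count mispredictions of one 2-bit counter entry (assumed saturating)."""
    misses = 0
    for taken in outcomes:
        prediction = counter >= 2  # MSB of the 2-bit counter
        if prediction != taken:
            misses += 1
        # if T then P+1 else P-1, clamped to the range 0..3 (assumption)
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return misses


# A loop-closing branch: taken three times, then not taken at loop exit.
loop = [True, True, True, False]

# Once warmed up, only the loop exit is mispredicted: one miss per use.
print(two_bit_mispredicts(loop * 3, counter=3))
```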
History Register
History(PC; T) = P(PC; P || T)
[Figure: table P indexed (I) by PC; the update (U) concatenates Taken onto the stored value; the output is the History.]
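The update P || T is a shift register per table entry: the newest outcome is appended and the oldest bit falls off. Below is a minimal Python sketch, assuming a fixed history width and a table indexed by PC modulo its depth. The name `HistoryTable` and both parameters are hypothetical.

```python
class HistoryTable:
    """Table of branch-history shift registers, indexed by PC."""

    def __init__(self, width, depth):
        self.mask = (1 << width) - 1
        self.depth = depth
        self.entries = [0] * depth

    def read(self, pc):
        return self.entries[pc % self.depth]

    def update(self, pc, taken):
        slot = pc % self.depth
        # P || T: shift the old history left, append the new outcome,
        # and keep only the most recent `width` bits.
        self.entries[slot] = ((self.entries[slot] << 1) | int(taken)) & self.mask
```

With a depth of 1 (or a constant index such as 0) the same structure acts as a single global history register.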
Global History
GHist(;T) = MSB(Counter(History(0; T); T))
Ind-Ghist(PC;T) = MSB(Counter(PC || Hist(GHist(;T);T)))
[Figure: a single global history register (History entry 0), updated by concatenating Taken, indexes a table of +/- counters that produces the Prediction.]
Can we take advantage of a pattern at a particular PC?
Local History
LHist(PC; T) = MSB(Counter(History(PC; T); T))
[Figure: PC indexes a table of local histories, each updated by concatenating Taken; the selected local history indexes a table of +/- counters that produces the Prediction.]
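A local-history predictor can learn a repeating per-branch pattern that a plain counter cannot, such as a branch that alternates taken and not taken. The Python sketch below illustrates this under stated assumptions: 4-bit histories, 16 history entries, saturating 2-bit counters initialized to 1, and the hypothetical names `LocalHistoryPredictor` and `count_mispredicts`.

```python
class LocalHistoryPredictor:
    """LHist: a per-PC history selects a 2-bit counter (illustrative sizes)."""

    def __init__(self, hist_bits=4, hist_entries=16):
        self.hist_mask = (1 << hist_bits) - 1
        self.hist_entries = hist_entries
        self.histories = [0] * hist_entries
        self.counters = [1] * (1 << hist_bits)  # start weakly not-taken

    def predict(self, pc):
        history = self.histories[pc % self.hist_entries]
        return self.counters[history] >= 2  # MSB of the counter

    def update(self, pc, taken):
        slot = pc % self.hist_entries
        history = self.histories[slot]
        c = self.counters[history]
        self.counters[history] = min(c + 1, 3) if taken else max(c - 1, 0)
        self.histories[slot] = ((history << 1) | int(taken)) & self.hist_mask


def count_mispredicts(predictor, pc, outcomes):
    """Predict then train on each outcome; return the number of misses."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

On an alternating branch, the two histories that recur (0101 and 1010) each end up with their own counter, so after a short warm-up the predictor stops mispredicting.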
Can we take advantage of the global pattern at a particular PC?
Two-level Predictor
2Level(PC; T) = MSB(Counter(History(0; T) || PC; T))
[Figure: the global history register (History entry 0, updated by concatenating Taken) is concatenated with the PC to index a table of +/- counters that produces the Prediction.]
Two-Level Branch Predictor
Pentium Pro uses the result from the last two branches to select one of the four sets of BHT bits (~95% correct)
[Figure: the Fetch PC indexes the BHT; a 2-bit global branch history shift register, which shifts in the Taken/¬Taken result of each branch, selects among the four sets to produce Taken/¬Taken?]
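Selecting one of four sets with a 2-bit history amounts to concatenating the history with the PC bits to form the counter-table index. A minimal Python sketch of that index computation follows. The function name `two_level_index` and the bit widths in the example are assumptions for illustration, not the Pentium Pro's actual parameters.

```python
def two_level_index(global_history, pc, hist_bits, pc_bits):
    """Form a counter-table index as History || PC (history in the high bits)."""
    h = global_history & ((1 << hist_bits) - 1)
    p = pc & ((1 << pc_bits) - 1)
    return (h << pc_bits) | p


# With 2 history bits, the same PC maps to four different counters,
# one per combination of the last two branch outcomes.
indices = [two_level_index(h, 0b0111, hist_bits=2, pc_bits=4) for h in range(4)]
print(indices)
```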
Gshare Predictor
Gshare(PC; T) = MSB(Counter(History(0; T) xor PC; T))
[Figure: the global history register (History entry 0, updated by concatenating Taken) is XORed with the PC to index a table of +/- counters that produces the Prediction.]
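Compared with concatenation, XORing the global history with the PC lets both use the full index width of the counter table. The following Python model is a sketch under assumptions: a 4-bit index, saturating 2-bit counters initialized to 1, and the hypothetical class name `GsharePredictor`.

```python
class GsharePredictor:
    """Gshare: (global history XOR PC) indexes 2-bit counters (illustrative)."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        self.global_history = 0
        self.counters = [1] * (1 << index_bits)  # start weakly not-taken

    def _index(self, pc):
        return (self.global_history ^ pc) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2  # MSB of the counter

    def update(self, pc, taken):
        i = self._index(pc)
        c = self.counters[i]
        self.counters[i] = min(c + 1, 3) if taken else max(c - 1, 0)
        # Shift the outcome into the global history register.
        self.global_history = ((self.global_history << 1) | int(taken)) & self.mask
```

Note that the same PC touches a different counter whenever the history changes, so an always-taken branch needs a few updates before the history settles and its predictions become stable.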
Choosing Predictors
Chooser = MSB(P(PC; P + (A==T) - (B==T)))
or
Chooser = MSB(P(GHist(PC; T); P + (A==T) - (B==T)))
[Figure: LHist and GHist feed a Chooser that selects the Prediction.]
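The chooser update P + (A==T) - (B==T) moves a counter toward whichever component predictor was right when the two disagree, and leaves it alone when both are right or both are wrong. The Python sketch below uses the PC-indexed variant. It assumes A and B are the two component predictions, that the counters are 2 bits and saturating, and that a set MSB means "use A". The class name `Chooser` is hypothetical.

```python
class Chooser:
    """PC-indexed chooser between two component predictions A and B."""

    def __init__(self, entries=16):
        self.entries = entries
        self.counters = [2] * entries  # start weakly preferring A

    def prefers_a(self, pc):
        return self.counters[pc % self.entries] >= 2  # MSB of the counter

    def choose(self, pc, pred_a, pred_b):
        return pred_a if self.prefers_a(pc) else pred_b

    def update(self, pc, pred_a, pred_b, taken):
        i = pc % self.entries
        # P + (A==T) - (B==T), clamped to the range 0..3 (assumption)
        delta = int(pred_a == taken) - int(pred_b == taken)
        self.counters[i] = max(0, min(3, self.counters[i] + delta))
```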
Tournament Branch Predictor (Alpha 21264)
- Choice predictor learns whether best to use local or global branch history in predicting next branch
- Global history is speculatively updated but restored on mispredict
- Claim 90-100% success on range of applications
[Figure: PC indexes the local history table (1,024 x 10b), which feeds the local prediction table (1,024 x 3b); the global history (12b) feeds the global prediction table (4,096 x 2b) and the choice prediction table (4,096 x 2b); the choice selects the final Prediction.]