Constructive Computer Architecture Tutorial 7: SMIPS Labs and Epochs Andy Wright 6.S195 TA
November 1, 2013 http://csg.csail.mit.edu/6.s195 T07-1
Introduction
Lab 6
  6 Stage SMIPS Processor
  Due today
Lab 7
  Complex Branch Predictors
  Posted online
  Due next Friday
Lab 6
Does low IPC => low grade?
  Only if low IPC is from a ‘mistake’ in your processor.
What is a ‘mistake’?
  Not updating your BTB with redirect information
  Using too small of a scoreboard
  Having schedule conflicts between pipeline stages
Lab 6
What questions do you have?
Lab 7
Adding history bits to BTB
  Combines target and direction prediction
Implement BHT
  Separates direction prediction from target prediction
Synthesize for FPGA
  Used to calculate IPS
Fixed slides from L16
Epoch Management
Multiple predictors in a pipeline
At each stage we need to make two decisions:
  Whether the current instruction is a wrong path instruction. Requires looking at epochs.
  Whether the prediction (ppc) following the current instruction is good or not. Requires consulting the prediction data structure (BTB, BHT, …).
Fetch stage must correct the pc unless the redirection comes from a known wrong path instruction.
Redirections from Execute stage are always correct, i.e., cannot come from wrong path instructions.
October 28, 2013 L16-7http://csg.csail.mit.edu/6.S195
Dropping or poisoning an instruction
Once an instruction is determined to be on the wrong path, the instruction is either dropped or poisoned.
Drop: If the wrong path instruction has not modified any bookkeeping structures (e.g., Scoreboard) then it is simply removed.
Poison: If the wrong path instruction has modified bookkeeping structures then it is poisoned and passed down for bookkeeping reasons (say, to remove it from the scoreboard).
  Subsequent stages know not to update any architectural state for a poisoned instruction.
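The drop-versus-poison choice can be sketched as a small behavioral model in Python. This is only an illustration, not the lab's implementation: the `Inst` record, the `in_scoreboard` flag, and the list-based scoreboard are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inst:
    pc: int
    dst: Optional[int] = None      # destination register, if any (hypothetical field)
    in_scoreboard: bool = False    # True once dst has been entered in the scoreboard
    poisoned: bool = False


def kill_wrong_path(inst):
    """Handle an instruction already known to be on the wrong path.

    Returns None when the instruction can be dropped, or the same
    instruction marked as poisoned when it must keep flowing down the pipe.
    """
    if not inst.in_scoreboard:
        return None                # drop: no bookkeeping to undo
    inst.poisoned = True           # poison: later stages must clean up after it
    return inst


def write_back(inst, regfile, scoreboard, value):
    """Final stage: always undo bookkeeping, update state only if not poisoned."""
    if inst.in_scoreboard:
        scoreboard.remove(inst.dst)
    if not inst.poisoned and inst.dst is not None:
        regfile[inst.dst] = value
```

A poisoned instruction still reaches `write_back`, which removes its scoreboard entry but leaves the register file untouched.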
N-Stage pipeline - BTB only
[Diagram: PC, Fetch, f2d, Decode, d2e, Execute, ... Labels: fEpoch and BTB at Fetch; eEpoch and "miss pred?" at Execute; {pc, ppc, epoch} attached to every fetched instruction; redirect message {pc, newPc, taken, mispredict, ...} from Execute to Fetch.]
At Execute:
  (pc) if (epoch != eEpoch) then mark instruction as poisoned
  (ppc) if (no poisoning) & mispred then change eEpoch; send <pc, newPc, ...> to Fetch
At Fetch:
  msg from execute: train BTB with <pc, newPc, taken, mispredict>
  if msg from execute indicates misprediction then set pc, change fEpoch
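A minimal Python sketch of the single-epoch protocol described on this slide. It is a behavioral model under stated assumptions: epochs are one bit, the BTB is a plain dictionary, the default prediction is pc + 4, and Execute sends a message for every non-poisoned instruction it is handed. The class and field names are invented for the example.

```python
class Fetch:
    def __init__(self):
        self.pc = 0
        self.f_epoch = 0
        self.btb = {}                          # pc -> predicted target (assumed structure)

    def fetch(self):
        ppc = self.btb.get(self.pc, self.pc + 4)
        inst = {"pc": self.pc, "ppc": ppc, "epoch": self.f_epoch}
        self.pc = ppc                          # follow the prediction
        return inst

    def on_execute_msg(self, msg):
        # Train the BTB with every message from Execute.
        if msg["taken"]:
            self.btb[msg["pc"]] = msg["new_pc"]
        else:
            self.btb.pop(msg["pc"], None)
        # Redirect only on a misprediction.
        if msg["mispredict"]:
            self.pc = msg["new_pc"]
            self.f_epoch ^= 1


class Execute:
    def __init__(self):
        self.e_epoch = 0

    def execute(self, inst, actual_next_pc, taken):
        if inst["epoch"] != self.e_epoch:
            return "poisoned", None            # wrong path: no redirect, no training
        mispredict = inst["ppc"] != actual_next_pc
        if mispredict:
            self.e_epoch ^= 1
        msg = {"pc": inst["pc"], "new_pc": actual_next_pc,
               "taken": taken, "mispredict": mispredict}
        return "executed", msg
```

After a misprediction, instructions fetched under the old epoch reach Execute with a stale epoch and are poisoned, while instructions fetched after the redirect carry the new epoch and execute normally.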
N-Stage pipeline: Two predictors
Suppose both Decode and Execute can redirect the PC; Execute redirect should have priority, i.e., Execute redirect should never be overruled.
We will use separate epochs for each redirecting stage:
  feEpoch and deEpoch are estimates of eEpoch at Fetch and Decode, respectively
  fdEpoch is Fetch’s estimate of dEpoch
  Initially set all epochs to 0
[Diagram: PC, Fetch, f2d, Decode, d2e, Execute, ... Labels: feEpoch and fdEpoch at Fetch; deEpoch, dEpoch and "miss pred?" at Decode; eEpoch and "miss pred?" at Execute; "redirect PC" paths dRedirect (from Decode) and eRedirect (from Execute) back to Fetch.]
N-Stage pipeline: Two predictors
Redirection logic
[Diagram: PC, Fetch, f2d, Decode, d2e, Execute, ... Labels: feEpoch, fdEpoch, deEpoch, dEpoch, eEpoch, eRedirect, dRedirect, "miss pred?" at Decode and Execute. Message formats: {pc, ppc, ieEp, idEp} on f2d; {..., ieEp} on d2e; {pc, newPc, taken, mispredict, ...} on eRedirect; {pc, newPc, idEp, ideEp...} on dRedirect.]
At execute:
  (pc) if (ieEp != eEp) then poison the instruction
  (ppc) if (no poisoning) & mispred then change eEp
  (ppc) for every control instruction send <pc, target pc, taken, mispred…> to fetch
At fetch:
  msg from execute: if (mispred) set pc, change feEp
  msg from decode: if (no redirect message from Execute) and (ideEp = feEp) then set pc, change fdEp to idEp
    The ideEp = feEp check makes sure that the msg from Decode is not from a wrong path instruction.
At decode: …
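The Fetch-side priority between the two redirect sources can be sketched in Python as follows. This is a behavioral illustration with assumptions: epochs are single bits, messages are dictionaries with invented key names, and a "redirect message from Execute" is read as an Execute message whose mispredict flag is set.

```python
class FetchState:
    def __init__(self):
        self.pc = 0
        self.fe_ep = 0    # Fetch's estimate of eEpoch
        self.fd_ep = 0    # Fetch's estimate of dEpoch


def fetch_redirect(state, exec_msg=None, decode_msg=None):
    """Apply at most one redirect per cycle; Execute always wins.

    Returns "execute", "decode", or None to say which redirect was taken.
    """
    if exec_msg is not None and exec_msg["mispredict"]:
        state.pc = exec_msg["new_pc"]
        state.fe_ep ^= 1
        return "execute"
    # Decode's redirect counts only if it was computed under the eEpoch
    # that Fetch currently believes in; otherwise it is from the wrong path.
    if decode_msg is not None and decode_msg["ide_ep"] == state.fe_ep:
        state.pc = decode_msg["new_pc"]
        state.fd_ep = decode_msg["id_ep"]
        return "decode"
    return None
```

A Decode message tagged with an old eEpoch is ignored, which is what keeps a wrong path instruction in Decode from overruling Execute.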
Decode stage
Redirection logic
[Diagram: PC, Fetch, f2d, Decode, d2e, Execute, ... Labels: feEpoch, fdEpoch, deEpoch, dEpoch, eEpoch, eRedirect, dRedirect, "miss pred?" at Decode and Execute. Message formats: {pc, ppc, ieEp, idEp} on f2d; {..., ieEp} on d2e; {pc, newPc, taken, mispredict, ...} on eRedirect; {pc, newPc, idEp, ideEp...} on dRedirect.]
Is ieEp = deEp ?
  yes: Is idEp = dEp ?
    yes: Current instruction is OK; check the ppc prediction via BHT, switch dEp if misprediction
    no: Wrong path instruction; drop it
  no: Current instruction is OK but Execute has redirected the pc; set <deEp, dEp> to <ieEp, idEp>, check the ppc prediction via BHT, switch dEp if misprediction
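The decision tree above can be written as one Python function. This is a sketch under assumptions: epochs are single bits, `bht_predict` is a hypothetical callback that returns the next pc predicted at Decode, and the redirect message carries the already switched dEp.

```python
class DecodeState:
    def __init__(self):
        self.de_ep = 0    # Decode's estimate of eEpoch
        self.d_ep = 0     # Decode's own epoch


def decode_check(state, inst, bht_predict):
    """inst is a dict with pc, ppc, ie_ep, id_ep.

    Returns (action, redirect): action is "drop" or "keep"; redirect is a
    message for Fetch when Decode disagrees with ppc, otherwise None.
    """
    if inst["ie_ep"] == state.de_ep:
        if inst["id_ep"] != state.d_ep:
            return "drop", None              # wrong path instruction
    else:
        # Execute has redirected the pc: adopt the epochs the instruction carries.
        state.de_ep = inst["ie_ep"]
        state.d_ep = inst["id_ep"]

    new_pc = bht_predict(inst)
    if new_pc != inst["ppc"]:
        state.d_ep ^= 1                      # switch dEp on a Decode-level misprediction
        return "keep", {"pc": inst["pc"], "new_pc": new_pc,
                        "id_ep": state.d_ep, "ide_ep": state.de_ep}
    return "keep", None
```

Both "current instruction is OK" branches share the same BHT check, so they fall through to the same code after the epoch comparison.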
Another way to manage epochs
Write the rules as simply as possible (guarded atomic actions), then add EHRs if necessary
Fetch Rule
    fInst.pc = pc;
    fInst.ppc = prediction( pc );
    fInst.eEpoch = eEpoch;
    fInst.dEpoch = dEpoch;
    …
    pc <= fInst.ppc;
    f2dFifo.enq( fInst );
Decode Rule
    if( dInst.eEpoch != eEpoch )
        kill dInst
    else if( dInst.dEpoch != dEpoch )
        kill dInst
    else begin
        let newpc = prediction( dInst );
        if( newpc != dInst.ppc ) begin
            pc <= newpc;
            dEpoch <= !dEpoch;
        end
        …
    end
Execute Rule
    if( eInst.eEpoch != eEpoch )
        poison eInst
    else begin
        if( mispredict ) begin
            pc <= newpc;
            eEpoch <= !eEpoch;
            train branch predictors
        end
        …
    end
Conflicts
PC read < PC write:          fetch < {decode, execute}
dEpoch read < dEpoch write:  fetch < decode
eEpoch read < eEpoch write:  {fetch, decode} < execute
PC write C PC write:         fetch C decode C execute C fetch
None of these stages can execute in the same clock cycle!
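The conflict table can be reproduced mechanically from each rule's read and write sets. The Python sketch below assumes the register scheduling convention used on the slide (read < write, write C write) and takes the read/write sets from the three rules as written; the `relation` and `table` helpers are invented for the illustration.

```python
from itertools import combinations

# Registers each rule reads and writes, taken from the rules as written.
RULES = {
    "fetch":   {"reads": {"pc", "eEpoch", "dEpoch"}, "writes": {"pc"}},
    "decode":  {"reads": {"eEpoch", "dEpoch"},       "writes": {"pc", "dEpoch"}},
    "execute": {"reads": {"eEpoch"},                 "writes": {"pc", "eEpoch"}},
}


def relation(a, b, ignore=frozenset()):
    """Scheduling relation between rules a and b for ordinary registers.

    "C" means conflict, "<" means a must be scheduled before b,
    ">" means b before a, and "CF" means conflict free.
    Registers in `ignore` are treated as if they caused no constraint.
    """
    ra, wa = RULES[a]["reads"] - ignore, RULES[a]["writes"] - ignore
    rb, wb = RULES[b]["reads"] - ignore, RULES[b]["writes"] - ignore
    if wa & wb:
        return "C"                       # write C write
    a_first = bool(ra & wb)              # a reads what b writes: read < write
    b_first = bool(rb & wa)
    if a_first and b_first:
        return "C"
    if a_first:
        return "<"
    if b_first:
        return ">"
    return "CF"


def table(ignore=frozenset()):
    return {(a, b): relation(a, b, ignore) for a, b in combinations(RULES, 2)}
```

With every register considered, all three pairs conflict because each rule writes pc. Ignoring pc leaves only fetch < decode < execute, which matches the choice of turning just pc into an EHR for that ordering.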
Now add EHRs
1) Choose an ordering between the rules and assign the corresponding EHR ports
   (fetch, decode, execute)
2) Change conflicting registers into EHRs
   (pc)
    Ehr#(3, Addr) pc <- mkEhr(?);
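For readers who want to see what the ports do, here is a small Python model of an EHR. It is a behavioral sketch, not the course's Ehr module: a read on port i sees the latest write made this cycle on a lower-numbered port, and the write on the highest port is what the register holds in the next cycle. The `tick` method is an invented stand-in for the clock edge.

```python
class Ehr:
    """Behavioral model of an n-port Ephemeral History Register."""

    def __init__(self, n_ports, init):
        self.n_ports = n_ports
        self.value = init        # value held at the start of the cycle
        self.writes = {}         # port -> value written during this cycle

    def read(self, port):
        # Port i sees writes from ports 0..i-1 made earlier in the same cycle.
        for p in range(port - 1, -1, -1):
            if p in self.writes:
                return self.writes[p]
        return self.value

    def write(self, port, v):
        assert 0 <= port < self.n_ports
        self.writes[port] = v

    def tick(self):
        # At the clock edge, the write on the highest port wins.
        if self.writes:
            self.value = self.writes[max(self.writes)]
        self.writes = {}
```

With the ordering (fetch, decode, execute), a redirect written by Execute on port 2 overrides the ppc written by Fetch on port 0 in the same cycle, so the next cycle starts from the corrected pc.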
Fetch Rule - port 0
    fInst.pc = pc[0];
    fInst.ppc = prediction( pc[0] );
    fInst.eEpoch = eEpoch;
    fInst.dEpoch = dEpoch;
    …
    pc[0] <= fInst.ppc;
    f2dFifo.enq( fInst );
Decode Rule - port 1
    if( dInst.eEpoch != eEpoch )
        kill dInst
    else if( dInst.dEpoch != dEpoch )
        kill dInst
    else begin
        let newpc = prediction( dInst );
        if( newpc != dInst.ppc ) begin
            pc[1] <= newpc;
            dEpoch <= !dEpoch;
        end
        …
    end
Execute Rule - port 2
    if( eInst.eEpoch != eEpoch )
        poison eInst
    else begin
        if( mispredict ) begin
            pc[2] <= newpc;
            eEpoch <= !eEpoch;
            train branch predictors
        end
        …
    end
Another Ordering
1) Choose an ordering between the rules and assign the corresponding EHR ports
   (execute, decode, fetch)
2) Change conflicting registers into EHRs
   (pc, dEpoch, eEpoch)
    Ehr#(3, Addr) pc <- mkEhr(?);
    Ehr#(3, Bool) dEpoch <- mkEhr(False);
    Ehr#(3, Bool) eEpoch <- mkEhr(False);
Fetch Rule - port 2
    fInst.pc = pc[2];
    fInst.ppc = prediction( pc[2] );
    fInst.eEpoch = eEpoch[2];
    fInst.dEpoch = dEpoch[2];
    …
    pc[2] <= fInst.ppc;
    f2dFifo.enq( fInst );
Decode Rule - port 1
    if( dInst.eEpoch != eEpoch[1] )
        kill dInst
    else if( dInst.dEpoch != dEpoch[1] )
        kill dInst
    else begin
        let newpc = prediction( dInst );
        if( newpc != dInst.ppc ) begin
            pc[1] <= newpc;
            dEpoch[1] <= !dEpoch[1];
        end
        …
    end
Execute Rule - port 0
    if( eInst.eEpoch != eEpoch[0] )
        poison eInst
    else begin
        if( mispredict ) begin
            pc[0] <= newpc;
            eEpoch[0] <= !eEpoch[0];
            train branch predictors
        end
        …
    end
Different View of EHR
This transformation makes more sense when you think of an EHR as a sub-cycle register.
This is explained more in the paper “The Ephemeral History Register: Flexible Scheduling for Rule-Based Designs” by Daniel L. Rosenband.
Questions?
6 stage SMIPS pipeline
[Diagram: pipeline stages IFetch, Decode, RFetch, Exec, Memory, WB. Other labels: PC, fEpoch, eEpoch, Redirect, IMem, DMem, Register File, Scoreboard.]
Poisoning Pipeline
[Diagram: pipeline stages IFetch, Decode, RFetch, Exec, Memory, WB. Other labels: PC, fEpoch, eEpoch, Redirect, Poison, Kill, IMem, DMem, Register File, Scoreboard.]
Correcting PC in Decode and Execute
[Diagram: pipeline stages IFetch, Decode, RFetch, Exec, Memory, WB. Other labels: PC, feEpoch, fdEpoch, dEpoch, eEpoch, Redirect, IMem, DMem, Register File, Scoreboard. Animation steps 1 to 6 with captions "Decoding", "Executing", "Write Back".]
Fetch has local estimates of eEpoch and dEpoch
Decode has a local estimate of eEpoch
fEpoch and PC feedback
[Diagram: pipeline stages IFetch, Decode, RFetch, Exec, Memory, WB. Other labels: Epoch [0], Epoch [1], PC [0], PC [1], IMem, DMem, Register File, Scoreboard.]
Make the PC an EHR too!
Whenever Execute sees a misprediction, IFetch reads the correct next instruction in the same cycle!
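The same-cycle feedback can be seen in a small Python simulation. It is a sketch under assumptions: Execute uses port 0 and IFetch uses port 1 of both the PC and the epoch, the epoch is one bit, and sequential fetch is pc + 4. The compact `Ehr` class is a behavioral stand-in for the hardware module, and `cycle` is an invented helper.

```python
class Ehr:
    """Minimal behavioral EHR: port i sees same-cycle writes from ports below i."""

    def __init__(self, n_ports, init):
        self.n_ports, self.value, self.writes = n_ports, init, {}

    def read(self, port):
        lower = [p for p in self.writes if p < port]
        return self.writes[max(lower)] if lower else self.value

    def write(self, port, v):
        self.writes[port] = v

    def tick(self):
        if self.writes:
            self.value = self.writes[max(self.writes)]
        self.writes = {}


def cycle(pc, epoch, mispredict=False, new_pc=None):
    """One clock cycle with Execute on port 0 and IFetch on port 1."""
    # Execute: redirect and flip the epoch on a misprediction.
    if mispredict:
        pc.write(0, new_pc)
        epoch.write(0, 1 - epoch.read(0))
    # IFetch: reads port 1, so it already sees Execute's writes from this cycle.
    fetch_pc = pc.read(1)
    fetched = {"pc": fetch_pc, "epoch": epoch.read(1)}
    pc.write(1, fetch_pc + 4)
    pc.tick()
    epoch.tick()
    return fetched
```

In the cycle where Execute reports a misprediction, the instruction fetched in that same cycle already comes from the corrected pc and carries the new epoch, so no cycle is spent fetching from the stale pc.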
RFile and SB feedback
[Diagram: pipeline stages IFetch, Decode, RFetch, Exec, Memory, WB. Other labels: Bypass Register File, Pipeline Scoreboard, PC, fEpoch, eEpoch, Redirect, IMem, DMem.]
You can use a scoreboard that removes before searching (called a pipeline scoreboard because it is similar to a pipeline FIFO’s deq < enq behavior)
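The effect of removing before searching can be shown with a tiny Python model. It is an illustration only: the scoreboard is a list of pending destination registers in program order, and `stalls` is a hypothetical helper that answers whether register fetch would stall in a given cycle. It assumes, as the "Bypass Register File" label suggests, that a value written in the same cycle is visible to the reader.

```python
def stalls(pending, src_regs, retiring, remove_first):
    """Return True if register fetch must stall this cycle.

    pending:      destination registers currently in the scoreboard, oldest first
    src_regs:     source registers of the instruction in register fetch
    retiring:     True if write back removes the oldest entry this cycle
    remove_first: True models a pipeline scoreboard (remove < search),
                  False models a scoreboard where search happens before remove
    """
    visible = list(pending)
    if remove_first and retiring and visible:
        visible = visible[1:]          # the retiring entry is already gone
    return any(r in visible for r in src_regs)
```

With the remove-first ordering, an instruction that depends on the register being written back this cycle does not stall; with the other ordering it waits one more cycle.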