Super Scalar Issue Despatch
-
CSL718: Superscalar Processors
Issue and Despatch
23rd Jan, 2006
-
Early proposals/prototypes
[Timeline figure, 1982 to 1989]
  Institutions: IBM, DEC, Stanford U, Kyushu U
  Projects: Cheetah, America project (4), Multititan project (2), Match (2), Torch (4), SIMP (4), DSNS (4)
  Also marked on the timeline: the term "superscalar"
-
Commercial superscalars: RISCs
  Intel      960KA/KB, 960CA (3)      1989
  IBM        Power 1 RS/6000 (4)      1990
  HP         PA7000, PA7100 (2)       1992
  SUN        SPARC, SuperSparc (3)    1992
  DEC        Alpha 21064 (2)          1992
  Motorola   MC88100, MC88110 (2)     1993
  Motorola   PowerPC 601/603 (3)      1993
  MIPS       R4000, R8000 (4)         1994
-
Commercial superscalars: CISCs
  Intel      80486, Pentium (2)            1993
  Motorola   MC68040, MC68060 (2)          1993
  Gmicro     Gmicro/100p, Gmicro 500 (2)   1993
  AMD        K5 (2), 4 RISC instr          1995
  CYRIX      M1 (2)                        1995
-
Tasks of superscalar processing
  Parallel decoding and issue
  Parallel instruction execution
  Preserving the sequential consistency of instruction execution and exception processing
-
Superscalar decode and issue
[Figure: two pipelines compared]
  Scalar issue: I-cache -> instruction buffer -> decode & issue (stages IF, D/I)
  Superscalar issue: I-cache -> instruction buffer -> decode & issue (stages IF, D, I)
-
Parallel Decoding
  Fetch multiple instructions into the instruction buffer
  Decode multiple instructions in parallel (instruction window)
  Possibly check dependencies among these, as well as with the instructions already under execution
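The dependency check among the instructions of one window can be sketched as below. This is only an illustration: the `Instr` record, the register names and the RAW/WAR/WAW labels (the standard names for true, anti and output dependencies) are assumptions that do not appear on the slide, and the check against instructions already under execution is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    # Hypothetical decoded instruction: one destination, any number of sources.
    name: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


def dependencies(window):
    """Return (earlier, later, kind) for every dependent pair, in program order."""
    found = []
    for j, later in enumerate(window):
        for earlier in window[:j]:
            if earlier.dest is not None and earlier.dest in later.srcs:
                found.append((earlier.name, later.name, "RAW"))  # true dependency
            if later.dest is not None and later.dest in earlier.srcs:
                found.append((earlier.name, later.name, "WAR"))  # false (anti)
            if later.dest is not None and later.dest == earlier.dest:
                found.append((earlier.name, later.name, "WAW"))  # false (output)
    return found


# Assumed example window of four instructions.
window = [
    Instr("a", "r1", ("r2", "r3")),
    Instr("b", "r4", ("r1", "r5")),  # reads r1, which a writes
    Instr("c", "r2", ("r6", "r7")),  # writes r2, which a reads
    Instr("d", "r1", ("r8", "r9")),  # writes r1 again
]

for dep in dependencies(window):
    print(dep)
```

Hardware performs all these comparisons in parallel with comparators; the nested loop only shows which pairs have to be compared.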
-
Pre-decoding
  Do partial decoding while instructions are being loaded into the I-cache
  Decoded information is appended to the instruction
  This includes instruction class, resources required, etc.
[Figure: second-level cache or main memory -> pre-decode unit -> I-cache; N bits/cycle enter the pre-decode unit, N + n bits/cycle leave it]
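A toy model of the N -> N + n widening, assuming a made-up 32-bit instruction format whose top 8 bits are the opcode and a 2-bit class tag. The opcode map, the tag encoding and the function names are all hypothetical; real processors use the predecode bit counts listed on the next slide.

```python
WORD_BITS = 32  # N: instruction width as fetched from memory (assumed)
TAG_BITS = 2    # n: predecode bits appended per instruction (assumed)

# Hypothetical opcode map: the top 8 bits of the word select the class.
OPCODE_CLASS = {0x01: "int", 0x02: "fp", 0x03: "mem", 0x04: "branch"}
CLASS_CODE = {"int": 0b00, "fp": 0b01, "mem": 0b10, "branch": 0b11}
CODE_CLASS = {code: name for name, code in CLASS_CODE.items()}


def predecode(word):
    """Done once, while the line is loaded into the I-cache."""
    opcode = (word >> (WORD_BITS - 8)) & 0xFF
    tag = CLASS_CODE[OPCODE_CLASS[opcode]]
    return (word << TAG_BITS) | tag  # N + n bits are stored in the cache


def cached_class(entry):
    """Done at decode time: read the class from the tag, no opcode lookup."""
    return CODE_CLASS[entry & ((1 << TAG_BITS) - 1)]


def cached_word(entry):
    """Recover the original N-bit instruction word."""
    return entry >> TAG_BITS


entry = predecode(0x02ABCDEF)
print(cached_class(entry), hex(cached_word(entry)))
```

The point of the sketch is that the opcode lookup moves off the decode path: the decoder only inspects the appended bits.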
-
Number of Pre-decode bits
  Processor             No. of predecode bits
  PA 7200 (1995)        5
  PA 8000 (1996)        5
  PowerPC 620 (1996)    7
  UltraSparc (1995)     4
  HAL PM1 (1995)        4
  AMD K5 (1995)         5 (per byte)
  R 10000 (1996)        4
-
Issue vs Dispatch
  Blocking issue
    Decode and issue to EU (execution unit)
    Instructions may be blocked due to data dependency
  Non-blocking issue
    Decode and issue to buffer
    From buffer, dispatch to EU
    Instructions are not blocked due to data dependency
-
Blocking Issue
[Figure: instruction buffer with issue window -> decode, check & issue -> three EUs]
-
Non-blocking (shelved) Issue
[Figure: instruction buffer -> decode & issue -> three reservation stations, each followed by dependency checking/dispatch and its own EU]
-
Handling of Issue Blockages
  Preserving issue order: in-order / out of order
  Alignment of instruction issue: aligned / unaligned
-
Issue Order
[Figure: the same issue window (instructions a to e) under two policies, showing the instructions to be issued and the instructions issued; legend: independent instruction, dependent instruction, issued instruction]
  Issue in strict program order
  Out of order issue (example: MC 88110, PowerPC 601)
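The two issue orders can be contrasted with a small sketch. Which instructions of the window are dependent cannot be recovered from the figure, so the `blocked` set below is an assumption, as is the function name; the issue rate is taken to equal the window size.

```python
def issue_from_window(window, blocked, in_order=True):
    """Return the instructions issued in one cycle.

    window:  instruction names in program order
    blocked: names of dependent instructions that cannot issue this cycle
             (assumed to be known from the dependency check)
    """
    issued = []
    for name in window:
        if name in blocked:
            if in_order:
                break     # strict program order: everything behind it waits
            continue      # out of order: skip it and keep looking
        issued.append(name)
    return issued


window = ["a", "b", "c", "d", "e"]
blocked = {"c"}  # assumed: only c is dependent

print(issue_from_window(window, blocked, in_order=True))   # ['a', 'b']
print(issue_from_window(window, blocked, in_order=False))  # ['a', 'b', 'd', 'e']
```

With in-order issue a single dependent instruction holds back all independent instructions behind it, which is the blockage that out of order issue avoids.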
-
Alignment
[Figure: instructions a to h issued over three cycles, showing what is checked and what is issued in each cycle]
  Aligned issue: fixed window, followed by the next window
  Unaligned issue: gliding window
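A small simulation may make the difference concrete. It assumes in-order issue, an issue rate equal to the window width, and a given `ready_at` cycle for each instruction; these inputs and the function name are illustrative and are not taken from the figure. The fixed window is refilled only once it has been emptied, while the gliding window is topped up every cycle.

```python
def issue_schedule(ready_at, width, aligned):
    """Return, per cycle, the list of instruction indices issued.

    ready_at[i]: first cycle in which instruction i is free of dependencies
    width:       window size (also the assumed issue rate)
    aligned:     True for a fixed window, False for a gliding window
    """
    n = len(ready_at)
    next_fetch = 0   # next instruction not yet placed in the window
    window = []
    schedule = []
    cycle = 1
    while next_fetch < n or window:
        if aligned:
            if not window:  # fixed window: refill only when fully drained
                window = list(range(next_fetch, min(next_fetch + width, n)))
                next_fetch += len(window)
        else:
            while len(window) < width and next_fetch < n:  # gliding window
                window.append(next_fetch)
                next_fetch += 1
        issued = []
        for i in window:  # in-order issue within the window
            if ready_at[i] > cycle:
                break
            issued.append(i)
        window = window[len(issued):]
        schedule.append(issued)
        cycle += 1
    return schedule


ready_at = [1, 1, 2, 1, 1, 1]  # assumed: instruction 2 must wait until cycle 2
print(issue_schedule(ready_at, 4, aligned=True))   # [[0, 1], [2, 3], [4, 5]]
print(issue_schedule(ready_at, 4, aligned=False))  # [[0, 1], [2, 3, 4, 5]]
```

In this example the gliding window finishes a cycle earlier because the slots freed in cycle 1 are refilled at once instead of staying empty until the window drains.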
-
Design choices in instruction issue
  Coping with false data dependencies: no / register renaming
  Coping with unresolved control dependencies: wait / speculative
  Use of shelving: blocking / shelved
  Handling of issue blockages
  Issue rate (2-6)
-
Frequently used issue policies in scalar processors
  Traditional scalar issue: i386, MC68030, R3000, Sparc
  Traditional scalar issue with shelving: CDC 6600
  Traditional scalar issue with shelving and renaming: IBM 360/91
  Traditional scalar issue with spec. execution: i486, MC68040, R4000, MicroSparc
-
Frequently used issue policies in superscalar processors
(speculative execution in all)
  Straightforward superscalar issue (aligned / unaligned)
  Straightforward superscalar issue with shelving
  Straightforward superscalar issue with renaming
  Advanced superscalar issue (renaming + shelving)
  Processors listed: Pentium, PowerPC 601, PA7100, SuperSparc, Alpha 21164, MC68060, PA7200, UltraSparc, MC88110, R8000, PowerPC 602, R10000, PentiumPro, PA8000, Sparc64, Am29000, K5
-
Frequently used issue policies
  Traditional scalar issue
  Traditional scalar issue with spec. execution
  Straightforward superscalar issue (aligned / unaligned)
  Advanced superscalar issue
-
Design Space of Shelving
  Scope of shelving: partial / full
  Layout of shelving buffers
  Operand fetch policy
  Instruction dispatch scheme
-
Layout of Shelving Buffers
  Type of the shelving buffers
    Stand-alone (RS)
    Combined with renaming and reordering
  Number of shelving buffer entries
    individual: 2-4
    group: 6-16
    central: 20
    total: 15-40
  Number of read and write ports
    depends on no. of EUs connected
-
Reservation Stations (RS)
[Figure: EUs fed by individual RSs, group RSs, or a central RS]
-
Combined Buffer (for Shelving, Renaming, Reordering)
  DRIS: Deferred scheduling, Register renaming and Instruction Shelving
[Figure: from decode/issue -> DRIS -> EUs]
-
Operand Fetch Policies
  Issue bound fetch
  Dispatch bound fetch
-
Issue bound operand fetch (with single register file)
[Figure: decode/issue, RF, and four EUs; legend: instruction, data]
-
Dispatch bound operand fetch (with single register file)
[Figure: decode/issue and four EUs]
-
Issue bound operand fetch (with multiple register files)
[Figure: decode/issue, two RFs, and four EUs; legend: instruction, data]
-
Dispatch bound operand fetch (with multiple register files)
[Figure: decode/issue and four EUs]
-
Updating RFs and RSs
[Figure: decode/issue, two RFs, and four EUs; legend: instruction, data]
-
Instruction dispatch scheme
  Dispatch policy
  Dispatch rate
    single instr/cycle: individual RS
    multiple instr/cycle: group or central RS
  Checking operand availability
  Treatment of empty RS
-
Dispatch policy
  Selection rule: rule for identifying instructions which are ready for execution (data dependency check)
  Arbitration rule: rule for choosing one out of several ready instructions (earlier instruction has priority)
  Dispatch order
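The selection and arbitration rules can be sketched for a single reservation station dispatching one instruction per cycle. The `Entry` layout, the sequence numbers used to decide which instruction is "earlier", and the function names are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Entry:
    seq: int                      # program-order number, smaller = earlier (assumed)
    op: str
    src_valid: Tuple[bool, ...]   # one valid bit per source operand


def select_ready(rs):
    """Selection rule: an entry is ready when all its source operands are valid."""
    return [e for e in rs if all(e.src_valid)]


def arbitrate(ready):
    """Arbitration rule: among ready entries, the earlier instruction has priority."""
    return min(ready, key=lambda e: e.seq) if ready else None


def dispatch(rs):
    """Dispatch at most one instruction per cycle and remove it from the RS."""
    chosen = arbitrate(select_ready(rs))
    if chosen is not None:
        rs.remove(chosen)
    return chosen


rs = [
    Entry(3, "add", (True, True)),
    Entry(1, "mul", (True, False)),  # earliest, but still waiting for an operand
    Entry(2, "sub", (True, True)),
]
print(dispatch(rs).op)  # sub
print(dispatch(rs).op)  # add
print(dispatch(rs))     # None
```

Because `mul` is skipped while it waits, this sketch behaves as out of order dispatch; an in-order variant would return nothing as long as the earliest entry is not ready.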
-
Dispatch order
  in-order / partially out of order / out of order
[Figure: entries checked for dispatch]
-
Checking availability of operands
  Direct check of score-board bits (usual for dispatch bound operand fetch): control flow approach
  Check of explicit status bits in RS (usual for issue bound operand fetch): data flow approach
  (control flow / data flow: Flynn's terminology)
-
Score-board
  Introduced with CDC 6600
[Figure: register file with a data status bit per register]
-
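Checking in dispatch bound fetch
[Figure: decoded instruction -> reservation station -> EU, with the register file alongside]
  Reservation station entry: OC (opcode), Rs1, Rs2, Rd
  Rs1, Rs2, Rd to the register file; reset V bit of Rd
  Check V bits of sources
  OC, Os1, Os2 (operand values) to the EU
  Result, Rd from the EU; update Rd, set V bit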
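The steps on this slide can be sketched as follows. The class and function names are hypothetical, and the model is deliberately simplified: one V bit per register, no renaming, so an instruction that reads its own destination register or two in-flight writers of the same register are not handled.

```python
class RegisterFile:
    def __init__(self, n):
        self.value = [0] * n
        self.valid = [True] * n  # V bit (score-board bit) per register


def issue(rf, rs, oc, rs1, rs2, rd):
    """Issue: shelve register identifiers only, and reset the V bit of Rd."""
    rf.valid[rd] = False
    rs.append((oc, rs1, rs2, rd))


def try_dispatch(rf, rs):
    """Dispatch: check V bits of the sources, then fetch operand values."""
    for entry in rs:  # oldest entry first
        oc, rs1, rs2, rd = entry
        if rf.valid[rs1] and rf.valid[rs2]:
            rs.remove(entry)
            return oc, rf.value[rs1], rf.value[rs2], rd  # OC, Os1, Os2, Rd
    return None


def write_back(rf, rd, result):
    """Result arrives from the EU: update Rd and set its V bit."""
    rf.value[rd] = result
    rf.valid[rd] = True


rf = RegisterFile(8)
rf.value[2], rf.value[3] = 5, 7
rs = []
issue(rf, rs, "add", 2, 3, 1)  # r1 = r2 + r3
issue(rf, rs, "mul", 1, 2, 4)  # r4 = r1 * r2, must wait for r1
print(try_dispatch(rf, rs))    # ('add', 5, 7, 1)
print(try_dispatch(rf, rs))    # None
write_back(rf, 1, 12)
print(try_dispatch(rf, rs))    # ('mul', 12, 5, 4)
```

The reservation station never holds operand values here; availability is decided entirely by the V bits in the register file, which is why the slide calls this a direct check of score-board bits.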
-
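Checking in issue bound fetch
[Figure: decoded instruction -> reservation station -> EU, with the register file alongside]
  Reservation station entry: OC, Os1/Is1, Vs1, Os2/Is2, Vs2, Rd
  Rs1, Rs2, Rd to the register file; reset V bit of Rd
  Os1, Os2 (operand values) from the register file to the reservation station
  Reservation station: check Vs1, Vs2
  OC, Os1, Os2, Rd to the EU
  Result, Rd from the EU
    Register file: update Rd, set V bit
    Reservation station: associative update of Is1, Is2 with Rd, set Vs bits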
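A matching sketch for issue bound fetch, again with hypothetical names. Each source field of a reservation station entry holds either the operand value (Os, with Vs set) or the identifier of the register still awaited (Is, with Vs clear). As on the slide, the identifier is the plain register number, so with no renaming the sketch does not cope with two in-flight writers of the same register.

```python
from dataclasses import dataclass


@dataclass
class Operand:
    valid: bool      # Vs bit
    value: int = 0   # Os, meaningful only when valid
    tag: int = -1    # Is, the awaited register number when not valid


@dataclass
class Entry:
    oc: str
    s1: Operand
    s2: Operand
    rd: int


class Machine:
    def __init__(self, n):
        self.value = [0] * n
        self.valid = [True] * n  # V bits of the register file
        self.rs = []

    def _fetch(self, r):
        if self.valid[r]:
            return Operand(True, value=self.value[r])
        return Operand(False, tag=r)

    def issue(self, oc, rs1, rs2, rd):
        """Issue: fetch operands (or their identifiers), then reset V bit of Rd."""
        s1, s2 = self._fetch(rs1), self._fetch(rs2)
        self.valid[rd] = False
        self.rs.append(Entry(oc, s1, s2, rd))

    def dispatch(self):
        """Dispatch: check only Vs1 and Vs2 inside the reservation station."""
        for e in self.rs:
            if e.s1.valid and e.s2.valid:
                self.rs.remove(e)
                return e.oc, e.s1.value, e.s2.value, e.rd
        return None

    def write_back(self, rd, result):
        """Update Rd in the register file and every waiting RS field tagged Rd."""
        self.value[rd] = result
        self.valid[rd] = True
        for e in self.rs:
            for s in (e.s1, e.s2):
                if not s.valid and s.tag == rd:  # associative match on Is
                    s.value, s.valid = result, True


m = Machine(8)
m.value[2], m.value[3] = 5, 7
m.issue("add", 2, 3, 1)  # r1 = r2 + r3
m.issue("mul", 1, 2, 4)  # r4 = r1 * r2, shelved with Is1 = 1
print(m.dispatch())      # ('add', 5, 7, 1)
print(m.dispatch())      # None
m.write_back(1, 12)
print(m.dispatch())      # ('mul', 12, 5, 4)
```

Compared with the dispatch bound case, the register file is read at issue time and the result must be broadcast to the reservation stations, which is the associative update named on the slide.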
-
Treatment of an empty RS
  Straightforward approach: at least one cycle stay in RS
  Bypassing RS if empty
  Examples: Nx586, Sparc64, PowerPC 604
[Figure: the two paths into the EU]
-
Approaches in dispatching
  Straightforward: in order, single instr/cycle, individual RSs
    Power1, PPC603, Nx586, Am29000
  Enhanced: partially out of order, single instr/cycle, individual RSs
    Power2, PPC604, 620
  Advanced: out of order, multiple instr/cycle, group/central RSs
    PM1, PentiumPro, PA8000, R10000