Better answers Relaxing Constraints: Thoughts on the Evolution of Computer Architecture Joel Emer...
Relaxing Constraints: Thoughts on the Evolution of Computer Architecture
Joel Emer
Alpha Development Group
Compaq Computer Corporation
Moore’s Law Alpha-style
[Figure: SPECint95 (log scale, 1 to 100) by date of introduction for the EV4-200, EV45-275, EV5-300, EV56-400, EV56-500, EV56-600, EV6-575, and EV67-730.]
Iron Law of Performance
Performance = Frequency / (CPI * Instructions)
• Frequency – largely circuit design/technology
• CPI – largely organization
• Instructions – largely architecture/compiler
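A minimal sketch of the Iron Law as arithmetic, assuming performance is counted as program runs per second. The function name and the sample numbers are illustrative and do not come from the talk.

```python
def performance(frequency_hz, cpi, instructions):
    """Program runs per second: frequency / (CPI * instruction count)."""
    return frequency_hz / (cpi * instructions)

# Hypothetical workload: 1e9 instructions at CPI 2.0 on a 500 MHz clock.
base = performance(500e6, 2.0, 1e9)

# Each factor can be improved on its own:
half_cpi = performance(500e6, 1.0, 1e9)       # better organization
double_freq = performance(1000e6, 2.0, 1e9)   # faster circuits
fewer_instr = performance(500e6, 2.0, 0.5e9)  # better architecture/compiler

print(base, half_cpi, double_freq, fewer_instr)
```

Each of the three changes doubles performance on its own, which is why the slide assigns one term to each discipline.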
Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation
Power Dissipation Trends
[Figure: Power Dissipation. Power (W, 0 to 120) and voltage (V, 0 to 3.5) for the 21064, 21164, 21264, and 21364.]
[Figure: Supply Current. Current (A, 0 to 80) and voltage (V, 0 to 3.5) for the 21064, 21164, 21264, and 21364.]
• Power consumption is increasing
• Supply current is increasing faster!
Coping With Power Growth
• Technology techniques
  – Better cooling technology needed
  – Accelerate Vdd scaling
  – SOI
  – Clock distribution
• Architectural possibilities
  – Use less power-hungry structures
  – Reduce useless speculation
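One way to see why Vdd scaling is on the list is the usual first-order CMOS model, in which switching power is roughly activity * C * Vdd^2 * f. That model is an assumption brought in here, since the slides do not state it, and the capacitance, voltages, and frequency below are hypothetical.

```python
def dynamic_power(capacitance_f, vdd_v, frequency_hz, activity=1.0):
    """First-order CMOS switching power (assumed model): a * C * V^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * frequency_hz

# Hypothetical chip: 1 nF of switched capacitance at 1 GHz.
before = dynamic_power(1e-9, 2.0, 1e9)  # Vdd = 2.0 V
after = dynamic_power(1e-9, 1.5, 1e9)   # Vdd = 1.5 V, same clock

ratio = after / before
print(f"power ratio after lowering Vdd: {ratio:.4f}")
```

Under this model, power falls with the square of the supply voltage. Reducing useless speculation would instead show up as a lower `activity` factor.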
Clock Distribution Trends
21264 Power (Peak)
  Global Clock Networks       32%
  Instruction Issue Units     18%
  Caches                      15%
  Floating Execution Units    10%
  Integer Execution Units     10%
  Memory Management Unit       8%
  I/O                          5%
  Miscellaneous Logic          2%
• Frequencies will continue to scale
• Clock edge rates are not scaling
Coping With Clock Distribution
• Technology solution
  – Low swing differential clocks
  – Adiabatic clocking
• Architectural possibilities
  – Multiple clock zones
  – Asynchronous design
Communication Delay
[Figure: microprocessor chip, not drawn to scale.]
  21064 ~ 1 cycle
  21164 ~ 1.5 cycles
  21264 ~ 3 cycles
  21464 ~ 6 cycles
Coping With Communication Delay
• Technology solutions
  – Low K dielectrics
  – Thinner (Cu) interconnect
• Architectural possibilities
  – Deeper pipelining
  – Replication/clustering of structures
  – More autonomous computation
SIA Roadmap
                                           1997     1999     2002     2005     2008     2012*
Technology Node (nm)                       250      180      130      100      70       50
Memory (bit/chip)                          256M     1G       4G       16G      64G      256G
Transistors/chip (MPU)                     11M      21M      76M      200M     520M     1.4G
Chip Frequency (MHz)                       750      1250     2100     3500     6000     10,000
Wiring Levels (max)                        6        6 to 7   7        7 to 8   8 to 9   9
Power Supply Voltage, Vdd (V)              1.8-2.5  1.5-1.8  1.2-1.5  0.9-1.2  0.6-0.9  0.5-0.6
Power - High Performance (W), w/Heat sink  70       90       130      160      170      175
Power - Hand-held (W)                      1.2      1.4      2        2.4      2.8      3.2
* The 2012 column is taken directly from the SIA 1997 National Technology Roadmap
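The roadmap's high-performance power and Vdd rows imply a supply current, since I = P / V. The sketch below computes that implied range per year. The row values are copied from the table; treating the whole power budget as drawn at Vdd is a simplifying assumption, and the variable names are illustrative.

```python
# (year, high-performance power in W, Vdd low in V, Vdd high in V), from the roadmap table.
roadmap = [
    (1997, 70, 1.8, 2.5),
    (1999, 90, 1.5, 1.8),
    (2002, 130, 1.2, 1.5),
    (2005, 160, 0.9, 1.2),
    (2008, 170, 0.6, 0.9),
    (2012, 175, 0.5, 0.6),
]

# Implied current range: lowest at the highest Vdd, highest at the lowest Vdd.
implied = {
    year: (power / v_high, power / v_low)
    for year, power, v_low, v_high in roadmap
}

for year, (i_min, i_max) in implied.items():
    print(f"{year}: {i_min:6.1f} A to {i_max:6.1f} A")

power_growth = roadmap[-1][1] / roadmap[0][1]
current_growth = implied[2012][0] / implied[1997][0]
print(f"power grows {power_growth:.1f}x, implied current at least {current_growth:.1f}x")
```

Under that assumption the implied current goes from roughly 28 to 39 A in 1997 to roughly 290 to 350 A in 2012, while power grows only 2.5x, consistent with the earlier observation that supply current rises faster than power.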
Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation
Disclaimer
The names used and events depicted in this talk are meant to be real. The events are, however, not an exhaustive enumeration of significant milestones.
The misrepresentations of fact and omission of contributors are unintentional and solely the responsibility of the presenter. Finally, the interpretations are just that and are mine as well.
uPC Histogram Chart – 1981-5
            Compute  Read    R-Stall  Write   W-Stall  IB-Stall  Total
Decode      1.000    -       -        -       -        0.613     1.613
Spec1       0.895    0.306   0.364    -       -        -         1.565
Spec2-6     1.052    0.148   0.116    0.161   0.192    0.102     1.771
B-Disp      0.221    -       -        -       -        0.005     0.226
Simple      0.870    0.029   0.017    0.033   0.027    -         0.977
Field       0.482    0.049   0.058    0.007   0.002    -         0.600
Float       0.292    0.000   0.000    0.008   0.001    -         0.302
Call/Ret    0.937    0.133   0.074    0.130   0.184    -         1.458
System      0.434    0.015   0.031    0.014   0.028    -         0.522
Character   0.318    0.039   0.099    0.046   0.004    -         0.506
Decimal     0.026    0.002   0.000    0.001   0.002    -         0.031
Int/Except  0.055    0.002   0.005    0.004   0.006    -         0.071
Mem Mngmt   0.555    0.061   0.200    0.004   0.003    -         0.824
Abort       0.127    -       -        -       -        -         0.127
TOTAL       7.267    0.783   0.964    0.409   0.450    0.720     10.593
TABLE 8: Average VAX Instruction Timing (Cycles per Instruction)
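The TOTAL row of Table 8 can be read as a CPI breakdown. The sketch below sums the six column totals and reports how much of the average instruction time goes to stalls. The numbers are copied from the table; the last stall column is taken here to mean instruction-buffer stall, and the variable names are illustrative.

```python
# Column totals from the TOTAL row of Table 8 (cycles per average VAX instruction).
totals = {
    "Compute": 7.267,
    "Read": 0.783,
    "R-Stall": 0.964,
    "Write": 0.409,
    "W-Stall": 0.450,
    "IB-Stall": 0.720,
}

cpi = sum(totals.values())
stall_cycles = sum(v for name, v in totals.items() if name.endswith("Stall"))
stall_fraction = stall_cycles / cpi

for name, cycles in totals.items():
    print(f"{name:9s} {cycles:6.3f} cycles  {cycles / cpi:6.1%}")
print(f"CPI = {cpi:.3f}, stalls = {stall_cycles:.3f} cycles ({stall_fraction:.1%})")
```

The column totals add up to the 10.593 cycles per instruction shown in the table, of which about a fifth is stall time.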
Paper counts
                 ISCA 1   ISCA 24
No model         22       1
Analytic model   5        ½
Simulation       1        21½
Measurement      0        7
Scientific Method
• Make hypothesis about behavior
• Design experiment
• Run experiment and quantify
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run experiment and quantify
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run simulation model or measure hardware
• Interpret results
• New hypothesis

Scientific Method
• Make hypothesis about behavior
• Pick baseline design and workload
• Run simulation model or measure hardware
• Interpret results
• Propose new design
Making and Testing Hypothesis
• Cache experiment (Schlansker)
  – 64K word cache
  – 32-way set associative cache/LRU replacement
  – 200x200 matrix subblock of an N x N matrix
  – Read twice
• Sizes
  – N=2727: 0 misses
  – N=2729: 24160 misses
  – N=2731: 36382 misses
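An experiment of this kind can be sketched as a small simulation. The slide does not give the line size, the index function, the storage order, or which misses are counted. The version below therefore assumes one-word lines, a modulo set index, column-major storage with leading dimension N, and counts only misses on the second read. It will not necessarily reproduce the slide's exact counts.

```python
from collections import OrderedDict

def second_pass_misses(n, block=200, cache_words=64 * 1024, ways=32):
    """Read a block x block corner of an n x n matrix twice; count second-read misses.

    Assumptions (not from the slide): one-word lines, set = address mod num_sets,
    column-major layout, true LRU within each set.
    """
    num_sets = cache_words // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for sweep in range(2):
        for col in range(block):
            for row in range(block):
                addr = row + col * n
                lines = sets[addr % num_sets]
                if addr in lines:
                    lines.move_to_end(addr)        # mark as most recently used
                else:
                    if sweep == 1:
                        misses += 1
                    if len(lines) >= ways:
                        lines.popitem(last=False)  # evict least recently used
                    lines[addr] = True
    return misses

for n in (2727, 2729, 2731):
    print(n, second_pass_misses(n))
```

Two cases can be checked by hand under these assumptions. With N=200 the block is contiguous and fits, so the second read never misses. With N=2048 every column maps onto the same 200 sets, so each set sees 200 lines cycling through 32 ways and every second-read access misses. This sensitivity to N is what the slide's three sizes illustrate.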
Propose new design
• Skewed associative (Seznec)
[Figure: direct mapped, 4-way associative, and 4-way skewed cache organizations.]
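The idea behind a skewed-associative cache is that each way uses a different index function, so addresses that collide in one way usually do not collide in the others. The sketch below shows only that indexing idea. The XOR-and-rotate hash is an illustrative choice made here, not Seznec's published functions, and the 8-bit index width is arbitrary.

```python
INDEX_BITS = 8                      # 256 sets per way (arbitrary for this sketch)
MASK = (1 << INDEX_BITS) - 1

def rotl(value, amount):
    """Rotate an INDEX_BITS-wide value left by amount bits."""
    amount %= INDEX_BITS
    return ((value << amount) | (value >> (INDEX_BITS - amount))) & MASK

def conventional_index(addr, way):
    """Set-associative: every way uses the same low-order address bits."""
    return addr & MASK

def skewed_index(addr, way):
    """Illustrative skewing hash: low bits XOR the next field rotated by the way number."""
    low = addr & MASK
    high = (addr >> INDEX_BITS) & MASK
    return low ^ rotl(high, way)

# Eight addresses with a stride of 256 words all land in conventional set 0.
strided = [k << INDEX_BITS for k in range(8)]
print({conventional_index(a, 0) for a in strided})
print([sorted({skewed_index(a, w) for a in strided}) for w in range(4)])

# Two addresses that collide in skewed way 0 are separated in way 1.
a, b = 0x0100, 0x0001
print(skewed_index(a, 0), skewed_index(b, 0), skewed_index(a, 1), skewed_index(b, 1))
```

With conventional indexing the eight strided lines compete for the four ways of a single set. With the skewed functions they map to eight different locations in each way.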
Quantitative Approach Problems
• Too much abstraction
  – Intra-chip latencies
  – Memory subsystem
• Poor workloads
• Too incremental…
Quantitative -> Incremental
[Figure: chart with a vertical axis from 0 to 4 and categories labeled a through l.]
Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation
Relaxing Constraints
• Select a constraint to relax
• Generate design
• Employ quantitative method
• Evaluate results
Important Steps…
• Before
  – Carefully pick a constraint to relax
• After
  – Find contributions without constraint
  – Preserving results after reinstating the constraint
Extrapolate From Current Trends
• Personal Workstation – Xerox PARC – late 70’s
• Results
  – Accelerate innovation
  VAX 11/780       Dorado
  5 MHz            15 MHz
  512 Kilobytes    8 Megabytes
  40+ Users        1 User
Throw Out Standards
• Distributed file system - 1985
Use a Simpler Starting Point
• RISC out-of-order (Johnson, Torng)
[Figure: out-of-order pipeline with stages Fetch, Decode/Map, Queue, Reg Read, Execute, Dcache/Store Buffer, Reg Write, and Retire, and structures PC, Icache, Register Map, Regs, and Dcache.]
CISC-based O-O-O
• K6 (Johnson)
• Pentium Pro (Colwell, Papworth…)
[Figure: PC and Icache feeding a Convert CISC to RISC stage, followed by a RISC O-O-O core.]
Abandon conventions
• VLIW (Fisher)
  – Relieve hardware of all dependency responsibility
  – Give that responsibility to compiler
• Expected consequences
  – Much simpler implementation
  – Faster cycle time
Sometimes not what you expect
• Compiler scheduling for hardware is a great idea
  – For 21064 – narrow in-order
  – For 21164 – wider in-order
  – For 21264 – wider out-of-order
Issue Logic Critical Loop
[Figure: Instruction Slot and Instruction Issue stages (labeled S2 and S3) with an Issue Conflict Checker, issuing to the floating point multiply pipeline, the floating point add pipeline, integer pipeline 0, and integer pipeline 1.]
Make a Radical Departure
• Multiscalar research (Sohi, Smith…)
New Mechanism Required
• Dependence prediction (Moshovos)
[Figure: a sequence of stores and loads shown in program order and in execution order; the reordered execution ends in a trap.]
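The general shape of memory dependence prediction can be sketched with a very small predictor: a load that has trapped once for running ahead of a conflicting older store is made to wait the next time. This is a deliberately simplified per-load-PC table written for illustration, not Moshovos's mechanism, and all names below are hypothetical.

```python
class StoreWaitPredictor:
    """Remembers load PCs that previously violated a store-to-load dependence."""

    def __init__(self):
        self.wait_pcs = set()

    def may_issue_early(self, load_pc):
        return load_pc not in self.wait_pcs

    def record_violation(self, load_pc):
        self.wait_pcs.add(load_pc)


def run_load(predictor, load_pc, conflicts_with_older_store):
    """Return 'early', 'trap', or 'waited' for one dynamic instance of a load."""
    if predictor.may_issue_early(load_pc):
        if conflicts_with_older_store:
            # The load ran ahead of a store to the same address: trap and learn.
            predictor.record_violation(load_pc)
            return "trap"
        return "early"
    # Predicted dependent: hold the load until older stores have executed.
    return "waited"


predictor = StoreWaitPredictor()
outcomes = [
    run_load(predictor, 0x40, True),   # first encounter: speculates, traps
    run_load(predictor, 0x40, True),   # now predicted dependent: waits
    run_load(predictor, 0x80, False),  # independent load: still issues early
]
print(outcomes)
```

The trade-off in this sketch is that a load marked once waits forever, even when later instances are independent. Real designs add some way to forget or to track specific store-load pairs.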
What Was Really Important
• Full hardware management (Sohi)
  – Sequencing
  – Register dependencies
  – Memory dependencies
• Refinement (Mowry and Olukotun)
  – Compiler managed – registers, sequencing
  – Hardware managed memory dependence only
Ignoring Implementation Realities
• SMT - in-order (Tullsen, Eggers, Levy)
[Figure: in-order pipeline with stages Fetch, Issue, Reg Read, Execute, Dcache/Store Buffer, and Reg Write, and structures PC, Icache, Regs, and Dcache.]
Solution Already Available
• SMT out-of-order
[Figure: out-of-order pipeline with stages Fetch, Decode/Map, Queue, Reg Read, Execute, Dcache/Store Buffer, Reg Write, and Retire, and structures PC, Icache, Register Map, Regs, and Dcache.]
Outline
• Review of technology factors
• Retrospective on the quantitative method
• Augmenting the quantitative method
• Recommendation
Pay Attention to Reality
• Look at technology trends
  – Power
  – Latency
• Use more realistic models
  – More organizational details
  – Better workloads
Ignore Reality
• Look for revolutionary contributions
  – Decide on a constraint to relax
  – Apply the scientific method
  – Revolutionary contributions may arise because
    – Constraint will be relaxed in time
    – Constraint wasn’t fundamental
    – New avenues of exploration will be opened