8/16/2015\course\cpeg323-08F\Topics1b.ppt1 A Review of Processor Design Flow.
04/19/23 \course\cpeg323-08F\Topics1b.ppt 1
A Review of Processor Design Flow
How to design a CPU?
• Instruction-set architecture (ISA) design
• Function-level (RTL) design
• Component-level design
• Gate-level/switch-level design
• Circuit-level design
Design Method
Gate Level/circuit level: toward full CAD
Register Level: CAD + heuristics/intuition
ISA Level: mainly heuristic process with simulation validation
[Figure: Processor Architecture Design Flow Diagram]
Design stages: Instruction Set Architecture Design (Microarchitecture Design I); System-Level Design; RTL-Level Design (Microarchitecture Design II); Hardware Design; Switch-Level Design; Circuit-Level Design
Simulators: ISA Simulator; System-Level Simulator; RTL-Level Simulator; Switch-Level Simulator; Circuit-Level Simulator
Arch./Compiler Design Toolset: Compiler Design; Code Optimizer; Code Generator
Hardware description: HDL (VHDL or Verilog)
Design Levels of Abstraction
[Figure: design levels, from abstract to concrete]
Levels: Architecture, Microarch, Logic, Circuit, Layout
Examples shown in the figure:
- Instruction sequence: mov eax, [edi]; cmp eax, 4; jne label10 (registers eax, ebx, ecx, edx)
- CPU block diagram: Branch Unit, I-Cache, D-Cache, Switch, Instruction Decode / Register Mapping, Int Regs, FP Regs, ALU, FPU, Address Calculation
- Logic equation: RenIfsSetWb2H := vOR3(RenCoverUpdtIFMWb2H, vAND2(RenCrab_Data_Hi_Cx5B[31], RenCrabIfsWrEnCx5H), vAND2(RenIfsValidWb3H, vNOT(RenCrabIfsWrEnCx5H)))
Design Levels and Component Types
Levels   Components                    Information Units   Output
ISA      Instruction set                                   ISA (microarchitecture-functional)
RTL      Registers, counters,          word                Register-level implementation of
         combinational and                                 the chosen microarchitecture
         sequential circuits, etc.
Gate     AND, OR, …                    bits                Gate-level implementation of
                                                           register-level components
Classical ISA Level Design Method
• Select a prototype structure A
• Modify A to accommodate:
- new performance demands and new technology
• Evaluation (ISA simulation)
• Repeat until satisfied
Overall Simulation Strategy
1. Instruction level simulator: this is used for performance evaluation at the instruction set level as well as for more detailed modeling, e.g. the pipeline and memory system. This level is also used to generate test vectors employed in lower-level simulators.
2. System level simulation: this simulator models the details of the system environment including such things as interrupts and memory management.
(Virtual machine level ..)
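As a rough illustration of item 1, an instruction-level simulator can be sketched in a few lines. The three-instruction ISA below (`li`, `add`, `bnez`), the register names, and the `run` function are all hypothetical, invented for this sketch. It also records a branch trace, the kind of by-product that other simulators might consume.

```python
from collections import Counter

def run(program, max_steps=10_000):
    """Interpret a toy program; return (registers, instruction counts, branch trace)."""
    regs = {f"r{i}": 0 for i in range(4)}    # hypothetical 4-register machine
    counts = Counter()                        # dynamic instruction frequencies
    branch_trace = []                         # (pc, taken) pairs
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]
        counts[op] += 1
        steps += 1
        if op == "li":                        # li rd, imm
            rd, imm = args
            regs[rd] = imm
            pc += 1
        elif op == "add":                     # add rd, rs, rt
            rd, rs, rt = args
            regs[rd] = regs[rs] + regs[rt]
            pc += 1
        elif op == "bnez":                    # bnez rs, target
            rs, target = args
            taken = regs[rs] != 0
            branch_trace.append((pc, taken))
            pc = target if taken else pc + 1
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return regs, counts, branch_trace

# Sum 3 + 2 + 1 into r0 using a countdown loop.
program = [
    ("li", "r0", 0),             # 0: accumulator
    ("li", "r1", 3),             # 1: loop counter
    ("li", "r2", -1),            # 2: decrement constant
    ("add", "r0", "r0", "r1"),   # 3: r0 += r1
    ("add", "r1", "r1", "r2"),   # 4: r1 -= 1
    ("bnez", "r1", 3),           # 5: loop while r1 != 0
]
regs, counts, branch_trace = run(program)
print(regs["r0"], dict(counts), branch_trace)
```

The instruction counts correspond to the runtime statistics used for performance evaluation, and the recorded branch outcomes are one example of data that could be passed on to more detailed models.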
Overall Simulation Strategy (Cont'd)
3. RTL level: this simulator models an RTL description of the design.
4. Switch level with delays: used to simulate the design mostly in components; test vectors are generated from the RTL level.
5. Circuit simulation: used for detailed modeling of the critical paths as well as for verification of circuits under variations in temperature, power supply, etc.
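The hand-off of test vectors from the RTL level to a lower level can be illustrated with a small sketch. Here a 4-bit adder is modeled twice: once at the register level (word arithmetic) and once at the gate level (AND, OR, XOR on single bits). The adder, its width, and all function names are assumptions chosen for illustration; the slides do not prescribe this example.

```python
from itertools import product

WIDTH = 4  # assumed word width for this sketch

def rtl_add(a, b):
    """Register-level model: operate on whole words."""
    total = a + b
    return total & ((1 << WIDTH) - 1), total >> WIDTH  # (sum, carry-out)

def full_adder(a, b, cin):
    """Gate-level full adder built only from XOR, AND, OR on single bits."""
    p = a ^ b
    s = p ^ cin
    cout = (a & b) | (p & cin)
    return s, cout

def gate_add(a, b):
    """Gate-level model: ripple-carry chain of full adders."""
    carry, result = 0, 0
    for i in range(WIDTH):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry

# Test vectors: inputs plus the expected outputs produced by the RTL model.
vectors = [(a, b, rtl_add(a, b)) for a, b in product(range(1 << WIDTH), repeat=2)]

# Replay the vectors against the lower-level model and collect mismatches.
mismatches = [(a, b) for a, b, expected in vectors if gate_add(a, b) != expected]
print(f"{len(vectors)} vectors, {len(mismatches)} mismatches")
```

An exhaustive sweep is feasible only because the example is tiny; for a real design the vectors would be a selected subset.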
Performance of Simulators
Simulator   Level of Accuracy                               Simulation Rate
ISA         Instruction set                                 ~10^6 cycles/second
System      System level (OS instructions + interrupts)     ~10^4 cycles/second
RTL         Synchronous register transfer                   ~10^3 cycles/second
Gate        Gate/switch level                               ~1 cycle/second
Simulation rate = number of cycles simulated per second on a host machine
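To get a feel for these rates, the arithmetic below estimates how long each simulator would need to cover a fixed workload. The workload size of 10^9 cycles is an assumption picked for illustration; the rates are the order-of-magnitude figures listed above.

```python
# Order-of-magnitude simulation rates from the slide (cycles per second).
RATES = {"ISA": 1e6, "System": 1e4, "RTL": 1e3, "Gate": 1.0}

def wall_clock_seconds(cycles, rate):
    """Host time needed to simulate `cycles` target cycles at `rate` cycles/second."""
    return cycles / rate

WORKLOAD = 1e9  # assumed workload size in target cycles

for level, rate in RATES.items():
    secs = wall_clock_seconds(WORKLOAD, rate)
    print(f"{level:6s} {secs:12.0f} s  (~{secs / 86400:.1f} days)")
```

Under these assumptions the same workload takes about 17 minutes at the ISA level and roughly three decades at the gate level, which suggests why the more detailed simulators are typically driven with short, targeted test vectors.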
Instruction Set Architecture Simulation
[Figure: instruction set architecture simulation flow]
Boxes in the diagram:
- Object file
- Execution-driven simulator
- Runtime statistics (frequencies, cycle counts, etc.)
- Profile information
- Traces (e.g. memory accesses, branch trace, etc.)
- Architecture models
- Trace-driven simulator (cache simulator, branch prediction simulator, etc.)
- Statistics (e.g. cache behavior, branch behavior, etc.)
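A trace-driven simulator, such as the cache simulator mentioned above, can be sketched as a small cache model that consumes a memory-access trace and reports hit/miss statistics. The direct-mapped organization, the parameters `num_lines` and `line_size`, and the trace itself are assumptions for illustration.

```python
def simulate_cache(trace, num_lines=4, line_size=16):
    """Direct-mapped cache model driven by a list of byte addresses."""
    tags = [None] * num_lines           # one stored tag per cache line
    hits = misses = 0
    for addr in trace:
        block = addr // line_size       # memory block number
        index = block % num_lines       # cache line this block maps to
        tag = block // num_lines        # identifies which block occupies the line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag           # fill the line on a miss
    miss_rate = misses / len(trace) if trace else 0.0
    return {"hits": hits, "misses": misses, "miss_rate": miss_rate}

# Hypothetical trace: two sweeps over 64 bytes, then a pair of conflicting addresses.
trace = list(range(0, 64, 4)) * 2 + [256, 0]
stats = simulate_cache(trace)
print(stats)
```

Because the model only needs the address stream, the same trace can be replayed against different cache configurations without re-executing the program.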
Performance Study by Simulation
• Develop a performance model that is:
- Flexible
- Parameterized (via knobs)
- 95% clock accurate compared to RTL
- Significantly smaller than RTL
• Models consist of two parts:
- Instruction-set simulator -> executes benchmark
- Pipeline simulator -> “accountant” for clock cycles
• Run benchmarks, update microarchitecture accordingly
• Cycle of: code -> simulate -> characterize -> tune
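The two-part model can be sketched as follows: a stand-in instruction-set simulator yields the executed instruction stream, and a separate "accountant" charges clock cycles according to tunable knobs. The knob names (`base_cpi`, `load_latency`, `branch_penalty`), their default values, and the instruction stream format are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class Knobs:
    """Tunable microarchitecture parameters (all values are assumptions)."""
    base_cpi: int = 1          # cycles charged to every instruction
    load_latency: int = 2      # extra stall cycles per load
    branch_penalty: int = 3    # extra cycles per taken branch

def executed_stream():
    """Stand-in for the instruction-set simulator: yields (opcode, taken) pairs."""
    loop_body = [("load", False), ("add", False), ("branch", True)]
    for _ in range(10):
        yield from loop_body
    yield ("branch", False)    # one final not-taken branch (illustrative)

def account_cycles(stream, knobs):
    """Pipeline 'accountant': charges cycles without modeling the datapath."""
    cycles = instructions = 0
    for opcode, taken in stream:
        instructions += 1
        cycles += knobs.base_cpi
        if opcode == "load":
            cycles += knobs.load_latency
        elif opcode == "branch" and taken:
            cycles += knobs.branch_penalty
    return cycles, instructions

# Simulate -> characterize -> tune: compare two knob settings on the same stream.
for knobs in (Knobs(), Knobs(branch_penalty=1)):
    cycles, n = account_cycles(executed_stream(), knobs)
    print(knobs, f"cycles={cycles} CPI={cycles / n:.2f}")
```

Keeping the accountant separate from instruction execution is what lets the knobs change without touching the functional simulator, which is one way to get the flexibility the slide asks for.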
Revisit: How to design a CPU?
• Instruction-set architecture (ISA) design
• Function-level (RTL) design
• Component-level design
• Gate-level/switch-level design
• Circuit-level design
Monty Denneau (August 28, 2007): I work on everything down to and including 4. Cyclops skips (2) and goes directly to 3/4. A lot of time was spent restructuring the design to make 4 meet timing. I probably spent thousands of hours on 4. We have no 5 - ASICS provides a library of gates, latches, and memory, etc.