CS152 / Kubiatowicz Lec10.1 3/3/03©UCB Spring 2003 CS152 Computer Architecture and Engineering...
3/3/03 ©UCB Spring 2003 CS152 / Kubiatowicz
Lec10.1
CS152 Computer Architecture and Engineering
Lecture 10
High-Level Design/Microcode programming
March 3, 2003
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
lecture slides: http://www-inst.eecs.berkeley.edu/~cs152/
Recap: What’s wrong with our CPI=1 processor?
° Long Cycle Time
° All instructions take as much time as the slowest
° Real memory is not as nice as our idealized memory
• cannot always get the job done in one (short) cycle
[Figure: path through the datapath for each instruction class (critical path: Load)]
Arithmetic & Logical: PC -> Inst Memory -> Reg File -> mux -> ALU -> mux -> setup
Load: PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem -> mux -> setup
Store: PC -> Inst Memory -> Reg File -> mux -> ALU -> Data Mem
Branch: PC -> Inst Memory -> Reg File -> cmp -> mux
Recap: Partitioning the CPI=1 Datapath
° Add registers between smallest steps
° Place enables on all registers
[Figure: CPI=1 datapath partitioned into steps: PC / Next PC, Instruction Fetch, Operand Fetch, Exec, Mem Access (Data Mem), Result Store (Reg. File). Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, RegDst, RegWr. Status signal: Equal]
Recap: Example Multicycle Datapath
° Critical Path ?
[Figure: multicycle datapath with registers IR, A, B, E, S, and M between the steps: PC / Next PC, Instruction Fetch (IR), Operand Fetch (Reg File, Ext), ALU, Mem Access (Data Mem), Result Store (Reg. File). Control signals: nPC_sel, ExtOp, ALUSrc, ALUctr, MemRd, MemWr, MemToReg, RegDst, RegWr. Status signal: Equal]
Recap: FSM specification
State  Step                  Register transfers
0000   “instruction fetch”   IR <= MEM[PC]
0001   “decode”              A <= R[rs]; B <= R[rt]
R-type:
0100   Execute               S <= A fun B
0101   Write-back            R[rd] <= S; PC <= PC + 4
ORi:
0110   Execute               S <= A or ZX
0111   Write-back            R[rt] <= S; PC <= PC + 4
LW:
1000   Execute               S <= A + SX
1001   Memory                M <= MEM[S]
1010   Write-back            R[rt] <= M; PC <= PC + 4
SW:
1011   Execute               S <= A + SX
1100   Memory                MEM[S] <= B; PC <= PC + 4
BEQ:
0011   Execute               PC <= Next(PC)
Recap: Micro-controller Design
° The state diagrams that define the controller for an instruction set processor are highly structured
° Use this structure to construct a simple “microsequencer”
• Each state in previous diagram becomes a “microinstruction”
• Microinstructions often taken sequentially
° Control reduces to programming this device
[Figure: a sequencer holds the micro-PC and selects the next microinstruction; each microinstruction supplies sequencer control and datapath control]
Recap: Specific Sequencer from last lecture
° Sequencer-based control unit from last lecture
• Called “microPC” or “µPC” vs. state register
Control Value   Effect
00              Next µaddress = 0
01              Next µaddress = dispatch ROM
10              Next µaddress = µaddress + 1
[Figure: the microPC feeds an adder (+1); a mux driven by the µAddress select logic chooses among 0, the opcode-indexed dispatch ROM, and the adder output]
ROM:
Instruction   Opcode   µAddress
R-type        000000   0100
BEQ           000100   0011
ori           001101   0110
LW            100011   1000
SW            101011   1011
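The sequencer described above can be sketched in Verilog. This is a minimal illustration, not the lecture's implementation: the module name useq, the port names, the reset input, and the handling of unlisted opcodes and of control value 11 are all assumptions.

```verilog
// Hypothetical sketch of the microsequencer; names are assumptions.
module useq (clk, reset, opcode, next_sel, upc);
  input        clk, reset;
  input  [5:0] opcode;    // opcode field of the instruction register
  input  [1:0] next_sel;  // control value from the current microinstruction
  output [3:0] upc;       // current micro-address (the microPC)
  reg    [3:0] upc;
  reg    [3:0] dispatch;  // output of the dispatch ROM

  // Dispatch ROM: opcode -> first micro-address of the instruction
  always @ (opcode)
    case (opcode)
      6'b000000: dispatch = 4'b0100;  // R-type
      6'b000100: dispatch = 4'b0011;  // BEQ
      6'b001101: dispatch = 4'b0110;  // ori
      6'b100011: dispatch = 4'b1000;  // LW
      6'b101011: dispatch = 4'b1011;  // SW
      default:   dispatch = 4'b0000;  // assumption: unknown opcodes go back to fetch
    endcase

  // Micro-address select: 00 = zero, 01 = dispatch ROM, 10 = increment
  always @ (posedge clk)
    if (reset) upc <= 4'b0000;
    else
      case (next_sel)
        2'b00:   upc <= 4'b0000;
        2'b01:   upc <= dispatch;
        2'b10:   upc <= upc + 1;
        default: upc <= 4'b0000;      // assumption: control value 11 is unused
      endcase
endmodule
```

The microinstruction fetched at address upc would supply next_sel along with the datapath control fields.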
Recap: Microprogram Control Specification
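Columns: µPC | Taken | Next | IR en | PC en, sel | Ops A, B | Exec Ex, Sr, ALU, S | Mem R, W, M | Write-Back M-R, Wr, Dst
(only the fields given for each microinstruction are listed)
       µPC   Taken  Next   Fields
       0000  ?      inc    IR en=1
       0001  x      load   A=1 B=1
BEQ:   0011  0      zero   PC en=1 sel=0
       0011  1      zero   PC en=1 sel=1
R:     0100  x      inc    Ex=0 Sr=1 ALU=fun S=1
       0101  x      zero   PC en=1 sel=0; M-R=0 Wr=1 Dst=1
ORi:   0110  x      inc    Ex=0 Sr=0 ALU=or S=1
       0111  x      zero   PC en=1 sel=0; M-R=0 Wr=1 Dst=0
LW:    1000  x      inc    Ex=1 Sr=0 ALU=add S=1
       1001  x      inc    R=1 W=0 M=1
       1010  x      zero   PC en=1 sel=0; M-R=1 Wr=1 Dst=0
SW:    1011  x      inc    Ex=1 Sr=0 ALU=add S=1
       1100  x      zero   PC en=1 sel=0; R=0 W=1 M=0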
Representation Languages
Hardware Representation Languages:
Block Diagrams: FUs, Registers, & Dataflows
Register Transfer Diagrams: Choice of busses to connect FUs, Regs
Flowcharts
State Diagrams
(Flowcharts and State Diagrams: two different ways to describe sequencing & microoperations)
Fifth Representation "Language": Hardware Description Languages
E.g., ISP', VHDL, Verilog
hw modules described like programs with i/o ports, internal state, & parallel execution of assignment statements
Descriptions in these languages can be used as input to
simulation systems: "software breadboard"
synthesis systems: generate hw from high level description
"To Design is to Represent"
Simulation Before Construction
"Physical Breadboarding"
discrete components/lower scale integration precedes actual construction of prototype
verify initial design concept
No longer possible as designs reach higher levels of integration!
Simulation Before Construction
high-level constructs make models faster to construct
play "what if" more easily
limited performance accuracy, however
Levels of Description
Architectural Simulation: models programmer's view at a high level; written in your favorite programming language
Functional/Behavioral/Dataflow: more detailed model, like the block diagram view
Register Transfer: commitment to datapath FUs, registers, busses; register xfer operations are clock phase accurate
Logic: model is in terms of logic gates; higher level MSI functions described in terms of these
Circuit: electrical behavior; accurate waveforms
Tools: schematic capture + logic simulation package like Xilinx ISE; special languages + simulation systems for describing the inherent parallel activity in hardware
Moving down the list: Less Abstract, More Accurate, Slower Simulation
Netlist
° A key data structure (or representation) in the design process is the “netlist”:
• Network List
° A netlist lists components and connects them with nodes:
ex:
g1 "and" n1 n2 n5
g2 "and" n3 n4 n6
g3 "or" n5 n6 n7
Alternative format:
n1 g1.in1
n2 g1.in2
n3 g2.in1
n4 g2.in2
n5 g1.out g3.in1
n6 g2.out g3.in2
n7 g3.out
g1 "and"
g2 "and"
g3 "or"
° Netlist is what is needed for simulation and implementation.
° Could be at the transistor level, gate level, ...
° Could be hierarchical or flat.
° How do we generate a netlist?
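One textual way to produce such a netlist is a structural HDL description. A minimal sketch of the g1/g2/g3 example in Verilog follows; the module name, and the choice of n1..n4 as inputs and n7 as the output, are assumptions.

```verilog
// The example netlist as structural Verilog (hypothetical module name).
module netlist_example (n1, n2, n3, n4, n7);
  input  n1, n2, n3, n4;
  output n7;
  wire   n5, n6;           // internal nodes

  // Primitive gates list the output first, then the inputs.
  and g1 (n5, n1, n2);     // g1 "and" n1 n2 n5
  and g2 (n6, n3, n4);     // g2 "and" n3 n4 n6
  or  g3 (n7, n5, n6);     // g3 "or"  n5 n6 n7
endmodule
```

Each gate instance corresponds to one component line of the netlist, and each wire to one node.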
Design Flow
[Figure: design flow: Design Entry -> High-level Analysis -> Technology Mapping -> Low-level Analysis]
Design entry example:
Decoder(output x0,x1,x2,x3; inputs a,b)
{
  wire abar, bbar;
  inv(bbar, b);
  inv(abar, a);
  nand(x0, abar, bbar);
  nand(x1, abar, b );
  nand(x2, a, bbar);
  nand(x3, a, b );
}
Technology mapping target (Xilinx): a logic block containing a 4-input "look up table" (4-LUT) and an FF/latch; the logic block is set by the configuration bit-stream
Design Flow
° Circuit is described and represented:
• Graphically (Schematics)
• Textually (HDL)
• Other (Special Compilers)
- Memories
- Error Correcting Circuits
° Result of circuit specification (and compilation) is a netlist of:
• generic primitives - logic gates, flip-flops, or
• technology specific primitives - LUTs/CLBs, transistors, discrete gates, or
• higher level library elements - adders, ALUs, register files, decoders, etc.
Design Flow
° High-level Analysis is used to verify:
• correct function
• rough:
- timing
- power
- cost
° Common tools used are:
• simulator - check functional correctness, and
• static timing analyzer
- estimates circuit delays based on timing model and delay parameters for library elements (or primitives).
Design Flow
° Technology Mapping:
• Converts netlist to implementation technology dependent details
- Expands library elements,
- Performs:
– partitioning,
– placement,
– routing
° Low-level Analysis
• Simulation and Analysis Tools perform low-level checks with:
- accurate timing models,
- wire delay
• For FPGAs this step could also use the actual device.
Design Flow
Netlist: used between and internally for all steps.
Design Entry
Schematics are intuitive. They match our use of gate-level or block diagrams.
Somewhat physical. They imply a physical implementation.
• This is why we use them for datapaths
Require a special tool (editor).
Unless hierarchy is carefully designed, schematics can be confusing and difficult to follow.
Hardware Description Languages (HDLs)
° Basic Idea:
• Language constructs describe circuits with two basic forms:
• Structural descriptions similar to hierarchical netlist.
• Behavioral descriptions use higher-level constructs (similar to conventional programming).
° Originally designed to help in abstraction and simulation.
• Now “logic synthesis” tools exist to automatically convert from behavioral descriptions to gate netlist.
• Greatly improves designer productivity.
• However, this may lead you to falsely believe that hardware design can be reduced to writing programs!
° “Structural” example:
Decoder(output x0,x1,x2,x3; inputs a,b)
{
  wire abar, bbar;
  inv(bbar, b);
  inv(abar, a);
  nand(x0, abar, bbar);
  nand(x1, abar, b );
  nand(x2, a, bbar);
  nand(x3, a, b );
}
° “Behavioral” example:
Decoder(output x0,x1,x2,x3; inputs a,b)
{
  case [a b]
    00: [x0 x1 x2 x3] = 0x0;
    01: [x0 x1 x2 x3] = 0x2;
    10: [x0 x1 x2 x3] = 0x4;
    11: [x0 x1 x2 x3] = 0x8;
  endcase;
}
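The two examples above are pseudocode rather than Verilog. As a rough sketch of the same idea in real Verilog, here is a 2-to-4 decoder written both ways. It assumes active-low outputs, as the NAND gates of the structural form imply, and picks output values for the behavioral form to match; the module names decoder_s and decoder_b are hypothetical.

```verilog
// Structural form: instantiation and interconnection of primitive gates.
module decoder_s (x0, x1, x2, x3, a, b);
  output x0, x1, x2, x3;
  input  a, b;
  wire   abar, bbar;
  not  (abar, a);
  not  (bbar, b);
  nand (x0, abar, bbar);   // low only when a=0, b=0
  nand (x1, abar, b);      // low only when a=0, b=1
  nand (x2, a, bbar);      // low only when a=1, b=0
  nand (x3, a, b);         // low only when a=1, b=1
endmodule

// Behavioral form: the same function as a case statement.
module decoder_b (x0, x1, x2, x3, a, b);
  output x0, x1, x2, x3;
  input  a, b;
  reg    x0, x1, x2, x3;   // assigned procedurally, so declared reg
  always @ (a or b)
    case ({a, b})
      2'b00: {x0, x1, x2, x3} = 4'b0111;
      2'b01: {x0, x1, x2, x3} = 4'b1011;
      2'b10: {x0, x1, x2, x3} = 4'b1101;
      2'b11: {x0, x1, x2, x3} = 4'b1110;
    endcase
endmodule
```

A synthesis tool would be expected to reduce the behavioral form to a gate netlist much like the structural one.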
Administration
° Midterm on Wednesday (3/12) from 5:30 - 8:30
• No class on that day
° Pizza and Refreshments afterwards at LaVal’s on Euclid
• I’ll Buy the pizza
• LaVal’s has an interesting history
° Review Session:
• Sunday (3/9), 7:00 PM in 306 Soda????
° Lab 3 due this Thursday
• Make sure to come to section to talk with TAs
° Start forming groups
• 4 or 5 per group.
• Probably only 4 person groups unless there are problems
• Must come to section this Thursday to finalize groups
Verilog History
° Originated at Automated Integrated Design Systems (renamed Gateway) in 1985. Acquired by Cadence in 1989.
• Invented as simulation language.
• Synthesis was an afterthought. Many techniques for synthesis developed at Berkeley in 80’s and applied commercially in the 90’s.
° Around the same time as the origin of Verilog, the US Department of Defense developed VHDL.
• Because it was in the public domain it began to grow in popularity.
• VHDL is still popular within the government, in Europe and Japan, and some Universities.
° Standardization
• Afraid of losing market share, Cadence opened Verilog to the public in 1990.
• An IEEE working group was established in 1993, and ratified IEEE Standard 1364 (Verilog) in 1995.
• Verilog is language of choice of Silicon Valley companies, initially because of high-quality tool support and its similarity to C-language syntax.
° Most major CAD frameworks now support both VHDL and Verilog.
Basic Example: 2-to-1 mux in Structural Form
//2-input multiplexor in gates
module mux2 (in0, in1, select, out);
  input in0, in1, select;
  output out;
  wire s0, w0, w1;

  not (s0, select);
  and (w0, s0, in0),
      (w1, select, in1);
  or (out, w0, w1);
endmodule // mux2
° Notes:
• Comments start with //
• Input/output “wires” by default
• “module”
• port list
• declarations
• wire type
• primitive gates
[Figure: gate-level 2-to-1 mux with inputs in0, in1, select and output out]
2-1 Mux in Dataflow Form
//Dataflow description of mux
module mux2 (in0, in1, select, out);
  input in0,in1,select;
  output out;
  assign out = (~select & in0) | (select & in1);
endmodule // mux2
Alternative:
assign out = select ? in1 : in0;
° Notes:
• provides a way to describe combinational logic by its function rather than gate structure (similar to Boolean expressions).
• The assign keyword is used to indicate a continuous assignment. Whenever anything on the RHS changes the LHS is updated.
2-to-1 mux Behavioral description
// Behavioral model of 2-to-1
// multiplexor.
module mux2 (in0,in1,select,out);
  input in0,in1,select;
  output out;
  reg out;
  always @ (in0 or in1 or select)
    if (select) out=in1;
    else out=in0;
endmodule // mux2
• Behavioral: use keyword always followed by one procedural statement
– Use begin/end to place more statements after always
– @() specifier: wait until an event (here, change on one of 3 sigs)
• Output of procedural assignments must be of type reg
– a reg type retains its value until a new value is assigned
– Not necessarily a real register: only for @(posedge signal)
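The last two notes can be illustrated side by side. In this sketch (module and signal names are hypothetical), both outputs are declared reg, but only the one assigned under @(posedge clk) would become a real flip-flop.

```verilog
// Hypothetical module: names are illustrative, not from the lecture.
module reg_demo (clk, a, b, comb_out, ff_out);
  input  clk, a, b;
  output comb_out, ff_out;
  reg    comb_out, ff_out;  // both assigned in always blocks, so both are reg
  reg    t;                 // temporary, also assigned procedurally

  // Level-sensitive event list: combinational logic, no storage element.
  always @ (a or b)
    begin                   // begin/end groups several statements
      t        = a & b;
      comb_out = ~t;
    end

  // Edge-sensitive event: this reg corresponds to a real register.
  always @ (posedge clk)
    ff_out <= a & b;
endmodule
```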
[Figure: MUX symbol with inputs in0, in1, select and output out]
Combining modules: Hierarchy & Bit Vectors
//Assuming we have already defined a 2-input mux (either
// structurally or behaviorally),
//4-input mux built from 3 2-input muxes
module mux4 (in0, in1, in2, in3, select, out);
  input in0,in1,in2,in3;
  input [1:0] select;
  output out;
  wire w0,w1;

  mux2
    m0 (.select(select[0]), .in0(in0), .in1(in1), .out(w0)),
    m1 (.select(select[0]), .in0(in2), .in1(in3), .out(w1)),
    m2 (.select(select[1]), .in0(w0), .in1(w1), .out(out));
endmodule // mux4
• Notes:
– instantiation similar to primitives
– select is 2-bits wide
– named port assignment
[Figure: m0 selects between in0 and in1 and m1 between in2 and in3, both on select[0]; m2 selects between their outputs on select[1] and drives out. Instance Names: m0, m1, m2]
Behavioral 4-to-1 mux
//4-input mux behavioral description
module mux4 (in0, in1, in2, in3, select, out);
  input in0,in1,in2,in3;
  input [1:0] select;
  output out;
  reg out;
  always @ (in0 or in1 or in2 or in3 or select)
    case (select)
      2'b00: out=in0;
      2'b01: out=in1;
      2'b10: out=in2;
      2'b11: out=in3;
    endcase
endmodule // mux4
° Notes:
• Case construct equivalent to nested if constructs.
• Definition: A structural description is one where the function of the module is defined by the instantiation and interconnection of sub-modules.
• A behavioral description uses higher level language constructs and operators.
• Verilog allows modules to mix both behavioral constructs and sub-module instantiation.
[Figure: MUX symbol with inputs in0, in1, in2, in3, a 2-bit Select, and output out]
Behavioral with Bit Vectors
//Behavioral model of 32-bit
// wide 2-to-1 multiplexor.
module mux32 (in0,in1,select,out);
  input [31:0] in0,in1;
  input select;
  output [31:0] out;
  reg [31:0] out;
  always @ (in0 or in1 or select)
    if (select) out=in1;
    else out=in0;
endmodule // Mux

//Behavioral model of 32-bit adder.
module add32 (C,S,A,B);
  input [31:0] A,B;
  output [31:0] S;
  output C;
  reg [31:0] S;
  reg C;
  always @ (A or B)
    {C,S} = A + B;
endmodule // Add
[Figure: 32-bit MUX (in0, in1, Select, out) and 32-bit Adder (A, B, S, carry C)]
Concatenation Operation: {}
Bit Vector Sizing and Ordering (32 bits, bit 31 MSB)
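A small simulation-only sketch of the concatenation in the adder: because the left-hand side {C, S} is 33 bits wide, the sum is evaluated at 33 bits and the carry out lands in C. The module name concat_demo and the chosen operands are assumptions.

```verilog
// Hypothetical stand-alone demo of {C,S} = A + B.
module concat_demo;
  reg  [31:0] A, B;
  reg  [31:0] S;
  reg         C;
  initial begin
    A = 32'hFFFFFFFF;
    B = 32'h00000001;
    {C, S} = A + B;               // 33-bit result: carry out goes to C
    $display("C=%b S=%h", C, S);  // prints C=1 S=00000000
  end
endmodule
```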
Delay Specifications
`timescale 1ns/1ps
//Dataflow description of mux
module mux2 (in0, in1, select, out);
  input in0,in1,select;
  output out;
  assign #(5,10) out = select ? in1 : in0;
endmodule // mux2
° Notes:
• Delay specifications relative to timescale specification
• May be placed in many different syntactical positions
• #singlenumber
- Delay specification for both edges
• #(rising,falling)
- Delay specification for rising and falling edges
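The two delay forms can be compared in a small testbench. This is a sketch with hypothetical signal names; the comments give the times at which each output is expected to change under the 1 ns timescale.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench comparing #singlenumber with #(rising,falling).
module delay_demo;
  reg  in;
  wire both, split;

  assign #5      both  = in;   // 5 ns on either edge
  assign #(5,10) split = in;   // 5 ns rising, 10 ns falling

  initial begin
    $monitor("t=%0d in=%b both=%b split=%b", $time, in, both, split);
    in = 0;
    #20 in = 1;   // both and split rise 5 ns later, at t=25
    #20 in = 0;   // both falls at t=45, split falls at t=50
    #20 $finish;
  end
endmodule
```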
Sequential Logic
° Notes:
• “always @ (posedge CLK)” forces Q register to be rewritten on every rising edge of CLK.
• “>>” operator does right shift (shifts in a zero on the left).
• Shifts on non-reg variables can be done with concatenation:
wire [3:0] A, B;
assign B = {1'b0, A[3:1]};
// Sequential Logic – involves an edge
module FF (CLK,Q,D);
  input D, CLK;
  output Q;
  reg Q;
  always @ (posedge CLK) Q <= D;  // non-blocking assignment for clocked logic
endmodule // FF

//Parallel to Serial converter
module ParToSer(LD, X, out, CLK);
  input [3:0] X;
  input LD, CLK;
  output out;      // driven by the continuous assignment, so not a reg
  reg [3:0] Q;
  assign out = Q[0];
  always @ (posedge CLK)
    if (LD) Q <= X;
    else Q <= Q>>1;
endmodule // ParToSer
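To see the shift behavior, the converter can be driven from a small testbench. The sketch below repeats the converter so that it is self-contained; the testbench name, clock period, and test value are assumptions.

```verilog
// Parallel-to-serial converter with a hypothetical testbench.
module ParToSer (LD, X, out, CLK);
  input  [3:0] X;
  input        LD, CLK;
  output       out;
  reg    [3:0] Q;
  assign out = Q[0];
  always @ (posedge CLK)
    if (LD) Q <= X;        // load the parallel word
    else    Q <= Q >> 1;   // shift right, zero enters on the left
endmodule

module test_ParToSer;
  reg        clk, ld;
  reg  [3:0] x;
  wire       out;

  ParToSer dut (.LD(ld), .X(x), .out(out), .CLK(clk));

  always #5 clk = ~clk;    // 10-unit clock period, rising edges at 5, 15, ...

  initial begin
    clk = 0; ld = 1; x = 4'b1011;
    #10 ld = 0;            // X was loaded at the rising edge at t=5
    #40 $finish;           // later rising edges shift the bits out
  end

  // Sample on falling edges, when out is stable.
  always @ (negedge clk)
    $display("time=%0d out=%b", $time, out);
endmodule
```

With X = 1011 the bits leave least-significant first: 1, 1, 0, 1.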
Testing: Make sure that things work
° Testing methodologies
• Understand what correct behavior is when you design things
- Collect vectors for later use
• Build monitor modules to check assertions of correct values
• Produce a regression test
- Set of tests to run each time something changes
° Types of test (Doug Clark):
• Directed Vectors – test explicit behavior
• Random Vectors – apply random values or orderings to device
• Daemons – continuous error insertion
° Monitor modules:
• Check to see if invariants are maintained during long running simulations
Alewife Numbers
Monitor Modules: Passthrough testing
module monitorsum32 (carry, sum, A, B);
  input [31:0] A,B;
  output [31:0] sum;
  output carry;
  reg [31:0] predsum;
  reg predcarry;

  // The "real" adder
  sum32 mysum (carry,sum,A,B);

`ifndef synthesis // This checker code only for simulation
  always @(A or B)
    begin
      #100 //wait for output to settle (don't make too long!)
      {predcarry,predsum} = A + B;
      if ((carry != predcarry) || (sum != predsum))
        $display(">>> Mismatch: 0x%x+0x%x->0x%x carry %x",
                 A,B,sum,carry);
    end
`endif
endmodule
Testbench: Applying Directed Vectors
module testmux;
  reg a, b, s;
  wire f;
  reg expected;
  // Unit under test.
  mux2 myMux (.select(s), .in0(a), .in1(b), .out(f));

  initial
    begin
      s=0; a=0; b=1; expected=0;
      #10 a=1; b=0; expected=1;
      #10 s=1; a=0; b=1; expected=1;
    end
  initial
    $monitor(
      "select=%b in0=%b in1=%b out=%b, expected out=%b time=%d",
      s, a, b, f, expected, $time);
endmodule // testmux
° Top-level modules written specifically to test sub-modules.
° Notes:
• initial block similar to always except only executes once (at beginning of simulation)
• #n’s needed to advance time
• $monitor - prints output
• A variety of other “system functions”, similar to $monitor, exist for displaying output and controlling the simulation.
Testbench: Randomized Vector Testing
module testbench( );
  reg [31:0] A,B;
  wire [31:0] sum;
  wire carry;
  reg [31:0] predsum;
  reg predcarry;

  // Device under test
  sum32 mysum (carry,sum,A,B);

  always
    begin
      A = $random; B = $random;
      #100 //wait for output to settle
      {predcarry,predsum} = A + B;
      if ((carry != predcarry) || (sum != predsum))
        $display(">>> Mismatch: 0x%x+0x%x->0x%x carry %x",
                 A,B,sum,carry);
      else
        $display("Successful: 0x%x+0x%x=0x%x carry %x",
                 A,B,sum,carry);
    end
endmodule
° Source of Vectors:
• With $random->predicted result
• Actual vectors
° Check actual results against predicted
More Verilog Help
° The lecture notes only cover the very basics of Verilog and mostly just the conceptual issues.
° The Mano textbook covers Verilog with many examples.
° The Bhasker book is a good tutorial.
On reserve in the Engineering
° Complete language spec from the IEEE available on handouts page
° Synplify manual (for when we start using synthesis)
The Big Picture: Where are We Now?
° The Five Classic Components of a Computer
° Today’s Topics:
• Microprogrammed control
• Administrivia
• Microprogram it yourself
• Exceptions
[Figure: the five classic components: Processor (Control and Datapath), Memory, Input, Output]
Microprogramming (Maurice Wilkes)
° Control is the hard part of processor design
° Datapath is fairly regular and well-organized
° Memory is highly regular
° Control is irregular and global
Microprogramming:
-- A Particular Strategy for Implementing the Control Unit of a processor by "programming" at the level of register transfer operations
Microarchitecture:
-- Logical structure and functional capabilities of the hardware as seen by the microprogrammer
Historical Note:
IBM 360 Series first to distinguish between architecture & organization
Same instruction set across wide range of implementations, each with different cost/performance
Instruction Set Architecture (subset of Computer Arch.)
... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation. – Amdahl, Blaauw, and Brooks, 1964
SOFTWARE
-- Organization of Programmable Storage
-- Data Types & Data Structures: Encodings & Representations
-- Instruction Set
-- Instruction Formats
-- Modes of Addressing and Accessing Data Items and Instructions
-- Exceptional Conditions
“Macroinstruction” Interpretation
[Figure: Main Memory holds the user program plus data (ADD, SUB, AND, ..., DATA); this can change! Inside the CPU, the control memory holds one microsequence per macroinstruction (e.g., the AND microsequence) and drives the execution unit. One of these (a macroinstruction) is mapped into one of these (a microsequence).]
Microsequence, e.g.: Fetch, Calc Operand Addr, Fetch Operand(s), Calculate, Save Answer(s)
Variations on Microprogramming
° “Horizontal” Microcode
– control field for each control point in the machine
µseq µaddr A-mux B-mux bus enables register enables
° “Vertical” Microcode
– compact microinstruction format for each class of microoperation
– local decode to generate all control points (remember ALU?)
branch: µseq-op µadd
execute: ALU-op A,B,R
memory: mem-op S, D
Extreme Horizontal
[Figure: extreme horizontal microinstruction: input select; next-address bits N3 N2 N1 N0; Incr PC; ALU control; 1 bit for each loadable register (enbMAR, enbAC, ...)]
Depending on bus organization, many potential control combinations simply wrong, i.e., implies transfers that can never happen at the same time.
Makes sense to encode fields to save ROM space
Example: mem_to_reg and ALU_to_reg should never happen simultaneously; => encode in single bit which is decoded rather than two separate bits
NOTE: the encoding should be only wide enough so that parallel actions that the datapath supports should still be specifiable in a single microinstruction
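The mem_to_reg / ALU_to_reg example can be sketched as a one-bit encoded field with a local decoder. The module name and the signals wb_sel and reg_wr are hypothetical.

```verilog
// Hypothetical sketch: one encoded bit replaces two mutually exclusive
// control bits in the microinstruction.
module wb_decode (wb_sel, reg_wr, mem_to_reg, alu_to_reg);
  input  wb_sel;       // single microcode bit: 1 = memory, 0 = ALU
  input  reg_wr;       // register write enable from the microinstruction
  output mem_to_reg, alu_to_reg;

  // Local decode: at most one source is selected for write-back.
  assign mem_to_reg = reg_wr &  wb_sel;
  assign alu_to_reg = reg_wr & ~wb_sel;
endmodule
```

The two outputs can never be asserted together, so the illegal combination is no longer expressible in the control store.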
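More Vertical Format
[Figure: src and dst fields each feed a decoder (DEC); other control fields and the next-state inputs (through a MUX) complete the microinstruction. Some of these may have nothing to do with registers!]
Multiformat Microcode:
Field widths 1 | 3 | 6:      0 | cond | next address    (Branch Jump)
Field widths 1 | 3 | 3 | 3:  1 | dst | src | alu        (Register Xfer Operation)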
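Splitting the two 10-bit formats into their fields might look like the sketch below. The bit positions are an assumption (format bit in the most significant position, fields in the order drawn), as are the module and port names.

```verilog
// Hypothetical decoder for the two multiformat microinstruction layouts.
module uinst_decode (uinst, is_xfer, cond, next_addr, dst, src, alu);
  input  [9:0] uinst;
  output       is_xfer;
  output [2:0] cond, dst, src, alu;
  output [5:0] next_addr;

  assign is_xfer   = uinst[9];    // 0 = branch/jump, 1 = register transfer

  // Branch/jump format: 0 | cond | next address
  assign cond      = uinst[8:6];
  assign next_addr = uinst[5:0];

  // Register-transfer format: 1 | dst | src | alu
  assign dst       = uinst[8:6];
  assign src       = uinst[5:3];
  assign alu       = uinst[2:0];
endmodule
```

Downstream logic would qualify each group of fields with is_xfer, since the same bits mean different things in the two formats.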
Hybrid Control
Not all critical control information is derived from control logic
E.g., Instruction Register (IR) contains useful control information, such as register sources, destinations, opcodes, etc.
[Figure: IR fields op, rs1, rs2, rd. The op field goes to control; rs1, rs2, and rd each drive a decoder (DEC) that selects RS1, RS2, and RD in the Register File, gated by enable signals from control]
Vax Microinstructions
VAX Microarchitecture:
96 bit control store, 30 fields, 4096 µinstructions for VAX ISA
encodes concurrently executable "microoperations"
Example fields (bit positions marked in the figure: 95, 87, 84, 68, 65, 63, 11, 0):
USHF  ALU Shifter Control:  001 = left, 010 = right, ..., 101 = left3
UALU  ALU Control:          010 = A-B-1, 100 = A+B+1
USUB  Subroutine Control:   00 = Nop, 01 = CALL, 10 = RTN
UJMP  Jump Address
Current Intel architecture: 80-bit microcode, 8192 instructions
Horizontal vs. Vertical Microprogramming
NOTE: previous organization is not TRUE horizontal microprogramming; register decoders give flavor of encoded microoperations
Most microprogramming-based controllers vary between:
horizontal organization (1 control bit per control point)
vertical organization (fields encoded in the control memory and must be decoded to control something)
Horizontal
+ more control over the potential parallelism of operations in the datapath
- uses up lots of control store
Vertical
+ easier to program, not very different from programming a RISC machine in assembly language
- extra level of decoding may slow the machine down
How Effectively are we utilizing our hardware?
° Example: memory is used twice, at different times
• Ave mem access per inst = 1 + Flw + Fsw ~ 1.3
• if CPI is 4.8, imem utilization = 1/4.8, dmem =0.3/4.8
° We could reduce HW without hurting performance
• extra control
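Working the numbers through, taking the estimate above to mean that loads and stores together make up roughly 30% of instructions, and adding the case where a single memory serves both instruction and data accesses:

```latex
\begin{align*}
\text{memory accesses per instruction} &= 1 + F_{lw} + F_{sw} \approx 1.3 \\
\text{imem utilization} &= \frac{1}{4.8} \approx 0.21 \\
\text{dmem utilization} &= \frac{0.3}{4.8} \approx 0.06 \\
\text{single shared memory utilization} &= \frac{1.3}{4.8} \approx 0.27
\end{align*}
```

Even a single shared memory would be busy only about a quarter of the cycles, which is why the hardware can be reduced without hurting performance.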
[State diagram]
IR <- Mem[PC]
A <- R[rs]; B <- R[rt]
R-type: S <- A + B; then R[rd] <- S; PC <- PC+4
LW: S <- A + SX; then M <- Mem[S]; then R[rt] <- M; PC <- PC+4
ORi: S <- A or ZX; then R[rt] <- S; PC <- PC+4
SW: S <- A + SX; then Mem[S] <- B; PC <- PC+4
BEQ: PC <- PC+4 (not taken) or PC <- PC+SX (taken)
“Princeton” Organization
° Single memory for instruction and data access
• memory utilization -> 1.3/4.8
° Sometimes, muxes replaced with tri-state buses
• Difference often depends on whether buses are internal to chip (muxes) or external (tri-state)
° In this case our state diagram does not change
• several additional control signals
• must ensure each bus is only driven by one source on each cycle
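A shared bus with tri-state drivers might be sketched as follows. The choice of three sources (S, M, PC) and the enable names are assumptions; the control logic is responsible for asserting at most one enable per cycle.

```verilog
// Hypothetical sketch of a tri-state bus with three possible drivers.
module wbus (s, m, pc, en_s, en_m, en_pc, bus);
  input  [31:0] s, m, pc;
  input         en_s, en_m, en_pc;   // control must assert at most one
  output [31:0] bus;

  // Each driver releases the bus (high impedance) when not enabled.
  assign bus = en_s  ? s  : 32'bz;
  assign bus = en_m  ? m  : 32'bz;
  assign bus = en_pc ? pc : 32'bz;
endmodule
```

If two enables were asserted with different data, simulation would show x on the conflicting bits, which is the situation the control signals have to rule out.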
[Figure: RegFile with A and B registers driving the A-Bus and B-Bus; IR, S, PC, nextPC, ZX, SX, and the single Mem attach to the buses; results return on the W-Bus]
Alternative datapath (book)
° Minimizes Hardware: 1 memory, 1 adder
[Figure: single-memory, single-ALU multicycle datapath. Blocks: PC, Ideal Memory (RAdr, WrAdr, Din, Dout), Instruction Reg, Mem Data Reg, Reg File (Ra, Rb, Rw, busA, busB, busW), A, B, Extend, << 2, ALU, ALU Control, ALU Out, Zero. Control signals: PCWr, PCWrCond, PCSrc, IorD, MemWr, IRWr, RegDst, RegWr, MemtoReg, ExtOp, ALUSelA, ALUSelB, ALUOp]
Summary I
° Design Process
• Design Entry: Schematics, HDL, Compilers
• High Level Analysis: Simulation, Testing, Assertions
• Technology Mapping: Turn design into physical implementation
• Low Level Analysis: Check out Timing, Setup/Hold, etc
° Verilog – Three programming styles
• Structural: Like a Netlist
- Instantiation of modules + wires between them
• Dataflow: Higher Level
- Expressions instead of gates
• Behavioral: Hardware programming
- Full flow-control mechanisms
- Registers, variables
- File I/O, console display, etc
Summary II
° Specialized state diagrams are easily captured by a microsequencer
• simple increment & “branch” fields
• datapath control fields
° Most microprogramming-based controllers vary between:
• horizontal organization (1 control bit per control point)
• vertical organization (fields encoded in the control memory and must be decoded to control something)