Control Unit : Hardwired vs. Microprogrammed Approach.
Two Major Blocks in a CPU
  Datapath
    Adders, multipliers, dividers
    Shifters, registers
    Anything that changes or stores data
  Control Unit
    Controls the data
    How is data stored?
    Where is it stored?
    When should data be available?
Control Unit
  Correct sequencing of control signals
  Much like the human brain controlling various parts of the body
  Sequence and timing are the key
  Any aberration will result in wrong operation
A Simplified Control Unit
[Figure: a Control Unit made up of a Fetch Unit, Decode Unit, Execution Unit, and Write Back Unit, producing the signals Fetch, Decode, Execute, and Write Back.]
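A Possible Implementation
[Figure: CLK drives a Mod-4 Counter whose two output bits feed a 2-to-4 Decoder; the four decoder outputs are the Fetch, Decode, Execute, and Write Back signals.]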
Timing Diagram
[Figure: waveforms for CLK, Fetch, Decode, Execute, and Write Back; each signal is high for one clock period in turn.]
Let’s Sample The Signals
Fetch  Decode  Execute  Write Back
1      0       0        0
0      1       0        0
0      0       1        0
0      0       0        1
Another Way to Generate Signals
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
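The two ways of producing this one-hot pattern can be compared with a small Python sketch. It assumes the first scheme is a 2-bit counter that wraps every four clocks and feeds a 2-to-4 decoder, and that the "other way" is a 4-stage ring counter (the hardwired unit later in the deck uses one). The function names are made up for illustration.

```python
def counter_decoder_phases(cycles):
    """2-bit counter feeding a 2-to-4 decoder: one output line is high per clock."""
    rows = []
    count = 0
    for _ in range(cycles):
        # Decoder output i is 1 exactly when the counter holds the value i.
        rows.append([1 if count == i else 0 for i in range(4)])
        count = (count + 1) % 4  # the counter wraps after four clocks
    return rows


def ring_counter_phases(cycles):
    """4-stage ring counter: a single 1 circulates, so no decoder is needed."""
    rows = []
    state = [1, 0, 0, 0]
    for _ in range(cycles):
        rows.append(list(state))
        state = [state[-1]] + state[:-1]  # shift the 1 to the next stage
    return rows


PHASES = ["Fetch", "Decode", "Execute", "Write Back"]
for row in ring_counter_phases(4):
    print(PHASES[row.index(1)], row)
```

Both functions return the same rows; the ring counter spends four flip-flops instead of two but saves the decoder.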
Hardwired vs Microprogrammed
  Hardwired
    Use gates to generate signals
    Squeeze out the juice for performance
    Different logic styles possible
  Microprogrammed
    Store the control signals in sequence
    Just read from the memory every clock cycle
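A Model Computer (Richard Eckert, SIGCSE Bulletin, Vol. 20, No. 3, September 1988)
[Figure: block diagram of the model computer. Accumulator, ALU, Register B, PC, MAR, MDR, RAM, IR, and Control are connected through a common bus; the paths are labelled 4, 8, or 12 bits wide. Control signals shown: R, W, LM, IP, LP, EP, LD, ED, LA, EA, S, A, EU, LB, LI, EI.]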
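More Details
  L = Load
  E = Copy to bus
  A, S = Add and Subtract
  Sign bit goes to the control unit
  IP = Increment PC
[Figure: the same block diagram (ACC, ALU, B, PC, MAR, MDR, RAM, IR, Control, Bus) annotated with the control signals R, W, LM, IP, LP, EP, LD, ED, LA, EA, S, A, EU, LB, LI, EI.]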
Mnemonic                 Opcode  Action                   Register transfers           Active controls
LDA (Load Accumulator)   1       A ← (Mem)                1. MAR ← IR                  EI, LM
                                                          2. MDR ← M(MAR)              R
                                                          3. A ← MDR                   ED, LA
STA (Store Accumulator)  2       (Mem) ← A                1. MAR ← IR                  EI, LM
                                                          2. MDR ← A                   EA, LD
                                                          3. M(MAR) ← MDR              W
ADD                      3       A ← A+B                  1. A ← ALU(Add)              A, EU, LA
SUB                      4       A ← A-B                  1. A ← ALU(Sub)              S, EU, LA
MBA                      5       B ← A                    1. B ← A                     EA, LB
JMP                      6       PC ← Mem                 1. PC ← IR                   EI, LP
JN                       7       PC ← Mem                 1. PC ← IR if NF is set      NF : EI, LP
                                 if negative flag is set
HLT                      8-15    Stop clock
"Fetch"                          IR ← next instruction    1. MAR ← PC                  EP, LM
                                                          2. MDR ← M(MAR)              R
                                                          3. IR ← MDR                  ED, LI, IP
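The register transfers in the table can be tried out with a minimal Python sketch of the datapath. It assumes 12-bit words split into a 4-bit opcode and an 8-bit address, a 256-word RAM, and that EI places only the address field of IR on the bus; none of this is stated in the table itself. The names `new_machine`, `apply_controls`, `FETCH`, and `LDA` are made up for illustration.

```python
WORD_MASK = 0xFFF  # assumption: 12-bit words
ADDR_MASK = 0x0FF  # assumption: 8-bit address field, PC, and MAR


def new_machine(contents):
    """Build the machine state; contents maps a RAM address to a 12-bit word."""
    ram = [0] * 256
    for addr, word in contents.items():
        ram[addr] = word & WORD_MASK
    return {"A": 0, "B": 0, "PC": 0, "MAR": 0, "MDR": 0, "IR": 0, "RAM": ram}


# Load signal -> (destination register, mask applied to the bus value)
LOADS = {"LM": ("MAR", ADDR_MASK), "LP": ("PC", ADDR_MASK), "LD": ("MDR", WORD_MASK),
         "LA": ("A", WORD_MASK), "LB": ("B", WORD_MASK), "LI": ("IR", WORD_MASK)}


def apply_controls(m, controls):
    """Apply the control signals that are active during one clock period."""
    active = set(controls)
    alu = None
    if "A" in active:                      # A selects addition in the ALU
        alu = (m["A"] + m["B"]) & WORD_MASK
    elif "S" in active:                    # S selects subtraction
        alu = (m["A"] - m["B"]) & WORD_MASK
    # E* signals: at most one source may drive the bus.
    sources = {"EP": m["PC"], "EI": m["IR"] & ADDR_MASK, "EA": m["A"],
               "ED": m["MDR"], "EU": alu}
    driving = [name for name in sources if name in active]
    if len(driving) > 1:
        raise ValueError("bus conflict: " + ", ".join(driving))
    bus = sources[driving[0]] if driving else None
    if "R" in active:                      # memory read into MDR
        m["MDR"] = m["RAM"][m["MAR"]]
    if "W" in active:                      # memory write from MDR
        m["RAM"][m["MAR"]] = m["MDR"]
    for signal, (reg, mask) in LOADS.items():   # L* signals latch the bus
        if signal in active:
            if bus is None:
                raise ValueError(signal + " asserted with nothing on the bus")
            m[reg] = bus & mask
    if "IP" in active:                     # increment the program counter
        m["PC"] = (m["PC"] + 1) & ADDR_MASK


FETCH = [["EP", "LM"], ["R"], ["ED", "LI", "IP"]]
LDA = [["EI", "LM"], ["R"], ["ED", "LA"]]

m = new_machine({0x00: 0x105, 0x05: 0x02A})   # "LDA 5" at address 0, data at 5
for step in FETCH + LDA:
    apply_controls(m, step)
print(hex(m["A"]), m["PC"])
```

Each inner list is one row of the "Active controls" column, so the other instructions can be run the same way by listing their rows.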
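Hardwired Unit
[Figure: hardwired control unit. The Opcode from IR goes to a Decoder with one output line per instruction (LDA, STA, ADD, SUB, MBA, JMP, JN) plus Halt. The decoder lines, the Ring Counter timing outputs (T5 ... T1 as labelled, driven by CLK), and NF feed the Control Matrix, which produces the Control Signals.]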
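Table with Sequencing
(- = not asserted)
       IP  LP     EP  LM  R   W   LD  ED  LI  EI     LA  EA  A   S   EU  LB
Fetch  T2  -      T0  T0  T1  -   -   T2  T2  -      -   -   -   -   -   -
LDA    -   -      -   T3  T4  -   -   T5  -   T3     T5  -   -   -   -   -
STA    -   -      -   T3  -   T5  T4  -   -   T3     -   T4  -   -   -   -
MBA    -   -      -   -   -   -   -   -   -   -      -   T3  -   -   -   T3
ADD    -   -      -   -   -   -   -   -   -   -      T3  -   T3  -   T3  -
SUB    -   -      -   -   -   -   -   -   -   -      T3  -   -   T3  T3  -
JMP    -   T3     -   -   -   -   -   -   -   T3     -   -   -   -   -   -
JN     -   T3*NF  -   -   -   -   -   -   -   T3*NF  -   -   -   -   -   -

Reading down each column gives the logic equation for that signal:
  IP = T2
  LP = T3*JMP + T3*JN*NF
  EP = T0
  LM = T0 + T3*LDA + T3*STA
  R  = T1 + T4*LDA
  W  = T5*STA
  LD = T4*STA
  ED = T2 + T5*LDA
  LI = T2
  A  = T3*ADD
  S  = T3*SUB
  …..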
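The control matrix is just these sum-of-products equations evaluated every clock. A hedged Python sketch follows: the equations for IP, LP, EP, LM, R, W, LD, ED, LI, A, and S are the ones listed on the slide, while EI, LA, EA, EU, and LB are filled in here from the rows of the sequencing table. The function name and its arguments are hypothetical.

```python
def control_matrix(t, opcode, nf):
    """Return the control signals asserted in timing state t (0..5).

    opcode is the decoded mnemonic (one decoder line high); nf is the negative flag.
    """
    T = [t == i for i in range(6)]  # ring counter: exactly one T[i] is true
    op = {name: opcode == name
          for name in ("LDA", "STA", "ADD", "SUB", "MBA", "JMP", "JN")}
    jump = T[3] and (op["JMP"] or (op["JN"] and nf))
    signals = {
        "IP": T[2],
        "LP": jump,
        "EP": T[0],
        "LM": T[0] or (T[3] and (op["LDA"] or op["STA"])),
        "R": T[1] or (T[4] and op["LDA"]),
        "W": T[5] and op["STA"],
        "LD": T[4] and op["STA"],
        "ED": T[2] or (T[5] and op["LDA"]),
        "LI": T[2],
        # The remaining equations are derived from the table rows.
        "EI": (T[3] and (op["LDA"] or op["STA"])) or jump,
        "LA": (T[5] and op["LDA"]) or (T[3] and (op["ADD"] or op["SUB"])),
        "EA": (T[4] and op["STA"]) or (T[3] and op["MBA"]),
        "A": T[3] and op["ADD"],
        "S": T[3] and op["SUB"],
        "EU": T[3] and (op["ADD"] or op["SUB"]),
        "LB": T[3] and op["MBA"],
    }
    return sorted(name for name, on in signals.items() if on)


for t in range(6):
    print("T%d" % t, control_matrix(t, "LDA", nf=False))
```

In hardware each dictionary entry would be a handful of AND and OR gates; the sketch only checks that the equations reproduce the table.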
Control Matrix
  Implement using discrete gates
  Usually done using PLAs
  Large control matrices are implemented hierarchically
    For speed
  A well-known process; design flows are widespread
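An Alternate Implementation
[Figure: microprogrammed control unit. Blocks shown: IR (4-bit opcode), Starting Address Generator, uPC with a +1 incrementer, Control Store (Control ROM, 32 x 24) clocked by CLK, and Microinstruction Register holding the control signals, CD, MAP, HLT, and the jump address. NF and CD together select between increment and branch.]

MAP  CD  Meaning
1    *   From IR
0    0   Unconditional branch within microprogram
0    1   Conditional branch: NF=0 => increment, NF=1 => branch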
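The MAP/CD table amounts to a three-way choice of the next uPC value. A minimal Python sketch, assuming a 5-bit uPC (32 control words) and using the starting addresses that appear in the control store listing; `START_ADDRESS` and `next_micro_address` are names invented for this illustration.

```python
# Opcode -> first microinstruction of its routine (from the control store listing).
START_ADDRESS = {0x1: 0x03, 0x2: 0x06, 0x3: 0x09, 0x4: 0x0A,
                 0x5: 0x0B, 0x6: 0x0C, 0x7: 0x0D, 0xF: 0x1F}


def next_micro_address(upc, map_bit, cd_bit, branch_addr, opcode, nf):
    """Pick the next uPC value from the MAP and CD bits of the current word."""
    if map_bit:
        # MAP = 1, CD = *: start of the routine for the opcode held in IR.
        return START_ADDRESS[opcode]
    if cd_bit and not nf:
        # MAP = 0, CD = 1, NF = 0: condition not met, fall through to uPC + 1.
        return (upc + 1) & 0x1F
    # MAP = 0, CD = 0 (unconditional), or CD = 1 with NF = 1: take the branch.
    return branch_addr


print(hex(next_micro_address(0x02, 1, 0, None, opcode=0x5, nf=False)))
```

The call above models the end of the fetch sequence with MBA (opcode 5) in IR.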
Control Store
Instruction  Op-Code  uInstruction  Control Signals   CD  MAP  HLT  Addr. of Next
                      Address                                       Control Word
Fetch        0        00            0011000000000000  0   0    0    01
                      01            0000100000000000  0   0    0    02
                      02            1000000110000000  0   1    0    XX
LDA          1        03            0001000001000000  0   0    0    04
                      04            0000100000000000  0   0    0    05
                      05            0000000100100000  0   0    0    00
STA          2        06            0001000001000000  0   0    0    07
                      07            0000001000010000  0   0    0    08
                      08            0000010000000000  0   0    0    00
ADD          3        09            0000000000101010  0   0    0    00
SUB          4        0A            0000000000100110  0   0    0    00
MBA          5        0B            0000000000010001  0   0    0    00
JMP          6        0C            0100000001000000  0   0    0    00
JN           7        0D            0000000000000000  1   0    0    0F
                      0E            0000000000000000  0   0    0    00
                      0F            0100000001000000  0   0    0    00
Expansion    8-E      10-1E
HLT          F        1F            0000000000000000  0   0    1    XX
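To read a control word, map each bit position to its signal. The sketch below assumes the bits are ordered left to right as IP LP EP LM R W LD ED LI EI LA EA A S EU LB, the column order of the hardwired sequencing table; that ordering is consistent with every word in the listing, but the helper name `active_signals` is made up.

```python
# Assumed bit order, leftmost bit first.
SIGNALS = "IP LP EP LM R W LD ED LI EI LA EA A S EU LB".split()


def active_signals(word):
    """Return the names of the signals whose bit is 1 in a 16-bit control word."""
    if len(word) != len(SIGNALS):
        raise ValueError("expected a 16-character bit string")
    return [name for name, bit in zip(SIGNALS, word) if bit == "1"]


print(active_signals("0011000000000000"))  # first fetch microinstruction
print(active_signals("0000000000101010"))  # ADD
```

Decoding the fetch words this way gives EP and LM, then R, then IP, ED, and LI, which matches the "Fetch" rows of the instruction table.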
Example 1 – MBA followed by ADD
(Control store as listed above; the microinstructions at 0B for MBA and 09 for ADD are highlighted.)
Control word bit order, left to right: IP LP EP LM R W LD ED LI EI LA EA A S EU LB

Sequence for MBA, ADD
Instruction      Register transfer   Control word
MBA (MOV B,A)    1. MAR ← PC         0011000000000000
                 2. MDR ← M(MAR)     0000100000000000
                 3. IR ← MDR         1000000110000000
                 B ← A               0000000000010001
ADD              1. MAR ← PC         0011000000000000
                 2. MDR ← M(MAR)     0000100000000000
                 3. IR ← MDR         1000000110000000
                 A ← ALU(Add)        0000000000101010
Example 2 – JN with Flag Set
(Control store as listed above; the microinstruction at 0D for JN is highlighted, with its CD bit set.)
If the negative flag is set, jump to a new location by skipping to the microinstruction at 0F.
Example 3 – JN with Flag Not Set
(Control store as listed above; the microinstruction at 0D for JN is highlighted, with its CD bit set.)
If the negative flag is not set, the uPC simply increments to 0E, which asserts no control signals and branches back to fetch at 00.
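The three examples can be replayed with a small microsequencer sketch in Python. The control store contents are copied from the listing; the expansion addresses 10-1E are left out, the XX next-address fields are stored as None, and the function `trace` with its `opcodes` and `nf` parameters is hypothetical. The sketch only follows the uPC and does not model the datapath.

```python
# address: (control signals, CD, MAP, HLT, address of next control word)
CONTROL_STORE = {
    0x00: ("0011000000000000", 0, 0, 0, 0x01),
    0x01: ("0000100000000000", 0, 0, 0, 0x02),
    0x02: ("1000000110000000", 0, 1, 0, None),
    0x03: ("0001000001000000", 0, 0, 0, 0x04),
    0x04: ("0000100000000000", 0, 0, 0, 0x05),
    0x05: ("0000000100100000", 0, 0, 0, 0x00),
    0x06: ("0001000001000000", 0, 0, 0, 0x07),
    0x07: ("0000001000010000", 0, 0, 0, 0x08),
    0x08: ("0000010000000000", 0, 0, 0, 0x00),
    0x09: ("0000000000101010", 0, 0, 0, 0x00),
    0x0A: ("0000000000100110", 0, 0, 0, 0x00),
    0x0B: ("0000000000010001", 0, 0, 0, 0x00),
    0x0C: ("0100000001000000", 0, 0, 0, 0x00),
    0x0D: ("0000000000000000", 1, 0, 0, 0x0F),
    0x0E: ("0000000000000000", 0, 0, 0, 0x00),
    0x0F: ("0100000001000000", 0, 0, 0, 0x00),
    0x1F: ("0000000000000000", 0, 0, 1, None),
}
START_ADDRESS = {0x1: 0x03, 0x2: 0x06, 0x3: 0x09, 0x4: 0x0A,
                 0x5: 0x0B, 0x6: 0x0C, 0x7: 0x0D, 0xF: 0x1F}


def trace(opcodes, nf=False):
    """Return the control-store addresses visited while running the opcodes."""
    pending = list(opcodes)
    visited = []
    upc = 0x00
    while True:
        if upc == 0x00 and not pending:
            break                      # nothing left to fetch
        _signals, cd, map_bit, hlt, nxt = CONTROL_STORE[upc]
        visited.append(upc)
        if hlt:
            break                      # HLT stops the clock
        if map_bit:
            upc = START_ADDRESS[pending.pop(0)]  # jump to the opcode's routine
        elif cd and not nf:
            upc = upc + 1              # condition not met: increment
        else:
            upc = nxt                  # branch to the stored next address
    return visited


print([format(a, "02X") for a in trace([0x5, 0x3])])           # MBA, ADD
print([format(a, "02X") for a in trace([0x7], nf=True)])       # JN, flag set
print([format(a, "02X") for a in trace([0x7], nf=False)])      # JN, flag not set
```

For MBA followed by ADD the visited addresses are 00, 01, 02, 0B, 00, 01, 02, 09, the same order as the control words in Example 1.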
Let’s Review the Microprogramming Model
  Store the microprogram in the control store
  Fetch the instruction
  Get the set of control signals from the control word
  Move the microinstruction address
  Lather, rinse, repeat
What is Microcode?
Michael Slater's "Microprocessor Based Design" (pg. 42):
"Microcode tells the processor every detailed step required to execute each machine language instruction. Microcode is thus at an even more detailed level than machine language, and in fact defines the machine language. In a standard microprocessor, the microcode is stored in a ROM or a programmable logic array (PLA) that is part of the microprocessor chip and cannot be modified by the user."
Thought Experiment
Why is the design a little clumsy? What can we do about it?
Reason for Clumsiness
JN – Conditional Flag check Without any condition check, the whole
process is very smooth Solution – Avoid all conditional checks
Real Life
  A little American football story
  Theory vs. practice
    In theory, there is no difference between theory and practice
    In practice, theory and practice are two different things altogether
  Live with condition checks
  Keep designs as clean as possible
A General Approach
[Figure: IR, External Inputs, and Conditional Codes feed a Starting and Branch Address Generator, which sets the uPC; the uPC addresses the Control Store, which outputs the Control Word.]
Format of Microinstructions
  Pick yours
    Your choice is as good as your neighbor’s
  What we did: one bit position per control signal
    Order of the bits? Doesn’t matter
    Can result in long microinstructions
      Not the number of microinstructions, but the width
A Note About Density
  Observe that only a few bits are set to 1
    Poor usage of bit space
  This scheme is called a horizontal microprogram
  Alternate version: encode the bits
    Vertical microprogram
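How sparse is the horizontal format? A quick Python count over the 16-bit control-signal fields listed in the control store (addresses 00-0F and 1F; the expansion addresses are not given, so they are left out). The helper name `bit_density` is made up for this illustration.

```python
CONTROL_WORDS = [
    "0011000000000000", "0000100000000000", "1000000110000000",  # fetch
    "0001000001000000", "0000100000000000", "0000000100100000",  # LDA
    "0001000001000000", "0000001000010000", "0000010000000000",  # STA
    "0000000000101010",                                          # ADD
    "0000000000100110",                                          # SUB
    "0000000000010001",                                          # MBA
    "0100000001000000",                                          # JMP
    "0000000000000000", "0000000000000000", "0100000001000000",  # JN
    "0000000000000000",                                          # HLT
]


def bit_density(words):
    """Return (number of 1 bits, total number of bits) over all control words."""
    ones = sum(word.count("1") for word in words)
    total = sum(len(word) for word in words)
    return ones, total


ones, total = bit_density(CONTROL_WORDS)
print(f"{ones} of {total} control bits are 1 ({ones / total:.1%})")
```

This prints "28 of 272 control bits are 1 (10.3%)", so roughly nine bits in ten of the control-signal field are zero in this microprogram.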
Vertical Microprogram
  Encode the bits by grouping similar elements together
  General idea: group similar resources together
    There can be only one source or destination register
    Some operations are mutually exclusive
      Read vs. write of memory
Design Issues
  Encoding reduces the bit-space
    But requires decoders
  Cost of decoder vs. bit-space
    Usually decoder cost is very low
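A rough Python sketch of the trade-off, applied to the 16 control signals of the model computer. The grouping into five fields (bus source, bus destination, memory operation, ALU operation, PC increment) is an assumption made for this example, not something the slides specify; code 0 in each field means "none".

```python
# Hypothetical field grouping: the signals within a field are mutually exclusive.
FIELDS = {
    "source": [None, "EP", "EI", "EA", "ED", "EU"],        # who drives the bus
    "dest":   [None, "LM", "LP", "LD", "LA", "LB", "LI"],  # who latches the bus
    "memory": [None, "R", "W"],
    "alu":    [None, "A", "S"],
    "pc":     [None, "IP"],
}


def field_width(options):
    """Bits needed to give each option (including 'none') its own code."""
    return max(1, (len(options) - 1).bit_length())


def encode(signals):
    """Turn a set of active signals into one small code per field."""
    remaining = set(signals)
    codes = {}
    for field, options in FIELDS.items():
        chosen = [s for s in options if s in remaining]
        if len(chosen) > 1:
            raise ValueError("mutually exclusive signals in field " + field)
        codes[field] = options.index(chosen[0]) if chosen else 0
        remaining -= set(chosen)
    if remaining:
        raise ValueError("unknown signals: " + ", ".join(sorted(remaining)))
    return codes


def decode(codes):
    """The decoders' job: expand the field codes back into control signals."""
    return [FIELDS[field][code] for field, code in codes.items() if code != 0]


width = sum(field_width(options) for options in FIELDS.values())
print("encoded width:", width, "bits instead of 16")
print(encode(["ED", "LI", "IP"]))
```

Under this grouping the control-signal field shrinks from 16 bits to 11, at the price of one small decoder per field.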
Another Idea
  Group concurrently active signals
  Every meaningful combination gets a code
  Complex decoder to interpret every code
Vertical vs Horizontal
  Horizontal
    Faster
    More area
    More common currently
      Cheap transistors
  Vertical
    Slower
    More microinstructions
Microsequencing
  Other ways to save on hardware
  Every instruction had its own microprogram sequence
  Also, instructions have several addressing modes
    Only the first few microinstructions differ
  Can we share microcode?
A Powerful Technique in Sharing
  Bit-ORing
  Example
    Two instructions share some microcode
    Eventually, they must branch
    The default branch (one instruction’s) is X0
    The other branch is stored at X1
    Change the least significant bit(s?) to get a new address
  Compare that with:
    Having two conditional branches
    Storing two fields, one for each branch
    Both very unclean
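Bit-ORing in a few lines of Python: the shared microcode ends with a branch to an address whose low bit(s) are zero, and a select value is ORed in to pick the actual target. The function name, the `width` parameter, and the example addresses 0x10 and 0x14 are hypothetical.

```python
def bit_or_branch(base_addr, select, width=1):
    """OR 'select' into the low 'width' bits of a branch address ending in zeros."""
    mask = (1 << width) - 1
    if base_addr & mask:
        raise ValueError("base address must have zeros in the ORed bits")
    if select & ~mask:
        raise ValueError("select does not fit in the ORed bits")
    return base_addr | select


# Two instructions share microcode, then diverge to X0 and X1 (here X = 0x10).
print(hex(bit_or_branch(0x10, 0)))  # default branch, X0
print(hex(bit_or_branch(0x10, 1)))  # the other instruction, X1
```

With width=2 the same branch field fans out to four targets, which is how the "bit(s?)" on the slide would generalize.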
Thought Experiment:
  What if we provided an explicit branch instead of storing a next-address field in our microprogram?
    A typical instruction set will need a lot of branches
    A lot of time will be wasted on branching
A Pat on Our Back
  We provided an explicit field for the address
    Branch location is now data
    It is already saved
  Caution: microinstructions can get very wide
  Solution: there is no free lunch
Can we pipeline microfetch?
  A neat idea:
    Why wait till the current micro-op is over?
    The branch field gives the next operation
    Get the next op
  Caveat:
    External inputs and status flags may change the order
    What about interrupts?
      They are going to follow you everywhere
    Should have a mechanism that can invalidate the microcode prefetch
      Similar to a pipeline flush for instructions
  Commonly used
Historical Perspectives
  Hardwired logic
    Popular before the 60’s
      Only way people did it
    Popular now
      Speed benefits
  Microprogram
    Popular in the 70’s
      Memory was slower than CPU
      No on-chip cache
      Best way is to store the microcode
  Now: depends on who you ask
    Shades of gray: extremes of the spectrum are harder to find nowadays
Tools for Design
  Hardwired
    Any state machine optimizer
    Assigning states, minimizing transitions, races, hazards, ...
  Microcoding
    Small ones can be in binary
    Large ones: use a microassembler
      Very useful debug tool
      Can use the microassembler simultaneously with actual hardware development
Hardwired vs Microcoding
  Hardwired units are faster and smaller
  Emulation is easy with microcoding
  Hardwired design is complex if large
  Bugs in hardwired design cannot be fixed in the field
  Hardwired control is not suited for loops
    Looping with microcode can be made as fast
Hardwired vs Microcode vs RISC
  RISC
    Simpler instruction set
    Hardwired implementation
  RISC instructions are like microcode
    Instructions come from the I-Cache instead of the control store
    Difference: contents are not fixed
    Advantage: only load what you want into the I-Cache
      Keeps size smaller as compared to control stores
Microprogram vs Software
  Imagine floating point division
  Solution 1: write in software
    Long process
    Error prone
    Many repeated fetches from memory for the given sequence of operations
  Solution 2: microcode
    Long process too, but for designers, not programmers
    Relatively error free: more thorough design
    Requires many cycles, but fetched and used locally
Emulation
  A very common use of microcoding
  IBM System/360
    32-bit architecture
    16 general-purpose registers
    Secret: most implementations were 8-bit
      Keep cost low
      Heavy microcoding
      Programmers oblivious
  In 1992, International Meta Systems (IMS) announced the 3250
    Designed to emulate the x86, 68K, and 6502 architectures
    Uses customizable microcode, among other techniques
    Went bust, never released
Another Interesting Note
  Writable control store
    What if you, a programmer, could write your own control store?
    Not a mad scientist thought
  Implemented in
    VAX 8800
    PDP-11/60
    IBM System/370
Current Trends
  Microcode update
    Linux utility: microcode_ctl
      Companion to the IA32 microcode driver
      It decodes and sends new microcode to the kernel driver to be uploaded to Intel IA32 processors
    Update is volatile: lost on reboot
    Microcode updates are also typically rolled into BIOS updates
      Ready even before an OS is loaded
Intel Said…
"The Pentium(R) Pro processor and Pentium(R) II processor may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Many times, the effects of the errata can be avoided by implementing hardware or software work-arounds, which are documented in the Pentium Pro Processor Specification Update and the Pentium II Processor Specification Update. Pentium Pro and Pentium II processors include a feature called "reprogrammable microcode", which allows certain types of errata to be worked around via microcode updates. The microcode updates reside in the system BIOS and are loaded into the processor by the system BIOS during the Power-On Self Test, or POST."
Current Trends
  Hyperthreading in P4
    A second logical CPU
    Complete state of the system in both CPUs
  Microcoding in P4
    Two pointers control flow independently
    Both processors share the ROM entries
    Access is alternated between the CPUs
Thank You