Simon Winberg - UCT EE OCW
Presented by
Simon Winberg
Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Planned to be double period lecture
Terms: Temporal vs spatial, data path
General processor types (JvN vs Harvard)
Datapath design
Instruction Set Architecture
Instruction Path
Micro instructions
Processor aspects
CISC vs RISC
Sources Acknowledgement: Lecture based on the book Logic and Computer Design Fundamentals, Fifth Edition, by Mano, Kime & Martin. Pearson. 2016.
next lecture
Temporal Computation
- The traditional paradigm
- Typical of programmers
- Things done over time steps
Spatial Computation
- Suited to hardware
- Possibly more intuitive?
- Things related in a space
A = input("A = ? ");
B = input("B = ? ");
C = input("B multiplier ? ");
X = A + B * C
Y = A - B * C
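[Figure: dataflow diagram of the same calculation. Inputs A, B and C; B and C feed a multiplier (*); A and the product feed an adder (+) giving output X and a subtractor (-) giving output Y]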
Which do you think is easier to make sense of?
The spatial form can provide a clearer indication of relative dependencies.
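To make the contrast concrete, here is a minimal Python sketch (not from the lecture) of the same calculation written both ways. The temporal version is a sequence of steps; the spatial version lists each node together with the nodes it depends on, which is essentially the diagram written as data. The names `temporal`, `SPATIAL` and `evaluate` are made up for illustration, and fixed values stand in for the interactive inputs.

```python
import operator

def temporal(a, b, c):
    """Temporal view: statements executed one after another."""
    t = b * c          # step 1
    x = a + t          # step 2
    y = a - t          # step 3
    return x, y

# Spatial view: each node names its operation and the nodes it depends on.
SPATIAL = {
    "T": (operator.mul, ("B", "C")),
    "X": (operator.add, ("A", "T")),
    "Y": (operator.sub, ("A", "T")),
}

def evaluate(node, inputs):
    """Evaluate a node by first evaluating whatever it depends on."""
    if node in inputs:
        return inputs[node]
    op, deps = SPATIAL[node]
    return op(*(evaluate(d, inputs) for d in deps))

inputs = {"A": 10, "B": 2, "C": 3}
print(temporal(10, 2, 3))                              # (16, 4)
print(evaluate("X", inputs), evaluate("Y", inputs))    # 16 4
```

In the dictionary it is immediately visible that X and Y both depend on the same product T, and that X and Y do not depend on each other, so hardware could compute them side by side.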
Definition: data path or datapath
This is the set of functional units that carry out data processing operations for a computer system. The datapaths, together with a control unit and ALU, make up the CPU of a computer. Larger datapaths (or composite datapaths) can be created by joining more than one together using (e.g.) multiplexers.
Reconfigurable datapaths*: These are datapaths that can be re-purposed at run-time using a programmable fabric. This may allow for more efficient processing and substantial power savings for particular types of application.
* Source: https://en.wikipedia.org/wiki/Datapath
[Figure: data moving from a data source to a data destination]
Types of Processor Architecture
EEE4120F
[Figure: processor types labelled Type A to Type J]
(reminders)
Named after John von Neumann, a Hungarian mathematician. He was the first to write about requirements for an electronic computer (done in 1945).
The ‘von Neumann computer’ differed from earlier computers that were programmed by hard wiring.
Most computers since then have been inspired by this design.
The von Neumann computer comprises the following four components:
Memory
Control Unit
Arithmetic Logic Unit (ALU)
Input/Output
Figure 1: The Von Neumann architecture*
* image adapted from http://en.wikipedia.org/w/index.php?title=Von_Neumann_architecture
The ‘JvN Machine’
We will generally be using this model
for a CPU design…
Random access, read/write memory stores both programs and data
A program comprises instructions (which von Neumann termed ‘machine instructions’) that tell the computer what to do.
Data is simply information to be used by the program
The control unit fetches an instruction or data from memory, decodes and executes the instruction, and sequentially completes the sub-operations for the instruction
Arithmetic Logic Unit does basic arithmetic operations (earlier CPUs didn’t have multiply or divide; had few instructions, e.g. LOAD, STORE, ADD, IN, OUT and JUMP on flags) and logic operations (e.g. AND, OR, SHIFT etc.)
Input/Output interfaces to other systems and the human operator
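The components above can be sketched as a toy simulator. This is a minimal illustration under stated assumptions, not a model of any real machine: it assumes a single accumulator, instructions stored as `(opcode, address)` pairs, and a `HALT` opcode added so the loop can stop. The key point it shows is that one memory holds both the program and the data.

```python
def run(memory, max_steps=100):
    """Fetch-decode-execute loop over one shared memory (toy accumulator machine)."""
    pc, acc = 0, 0
    for _ in range(max_steps):
        opcode, operand = memory[pc]    # fetch: same memory that holds the data
        pc += 1
        if opcode == "LOAD":            # decode and execute
            acc = memory[operand]
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc
        elif opcode == "JUMP":
            pc = operand
        elif opcode == "HALT":          # assumed opcode, not in the lecture's list
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory

# Addresses 0-3 hold the program, addresses 4-6 hold the data.
memory = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", 0),
    7, 5, 0,
]
run(memory)
print(memory[6])   # 12
```

Because program and data share one memory, a `STORE` to addresses 0-3 would overwrite the program itself, which is exactly the property the Harvard architecture removes.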
Simple recap of Von Neumann Arch: http://www.youtube.com/watch?v=DMiEgKZ-qCw
Some history of Von Neumann leading towards his machine: (not examined!)
“The Greatest Computer Programmer Was Its First” http://www.youtube.com/watch?v=Po3vwMq_2xA
The Harvard architecture physically separates storage and signal lines for instructions and data.
The term originated from the “Harvard Mark I” relay-based computer that stored instructions on (24-bits wide) punched tape and data in electro-mechanical counters.
Data storage was entirely contained within the central processing unit, which provided no access to the instruction storage as data.
For the original Mark I, programs needed to be loaded by an operator as the processor could not initialize itself.
[Figure: Harvard architecture, showing the ALU, Control Unit, Instruction memory, Data memory and I/O]
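As a rough sketch of the separation (an illustration only, with an assumed accumulator and an assumed `HALT` opcode), the toy machine below fetches instructions from one store and reads or writes data in another. No instruction can address the program store as data.

```python
def run_harvard(program, data, max_steps=100):
    """Toy Harvard machine: separate instruction and data memories."""
    pc, acc = 0, 0
    for _ in range(max_steps):
        opcode, operand = program[pc]   # instruction fetch uses program memory only
        pc += 1
        if opcode == "LOAD":
            acc = data[operand]         # data accesses use data memory only
        elif opcode == "ADD":
            acc += data[operand]
        elif opcode == "STORE":
            data[operand] = acc
        elif opcode == "HALT":          # assumed opcode so the loop can stop
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return data

# A tuple is used for the program to stress that it cannot be written at run time.
program = (("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", 0))
data = [7, 5, 0]
print(run_harvard(program, data))   # [7, 5, 12]
```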
Nowadays this general architecture (albeit greatly enhanced) is still relevant! Such designs are technically referred to as “modified Harvard architecture”. Many processors today (especially embedded ones) still implement this separation of instruction and data storage for performance and reliability reasons.
Harvard Architecture
“Harvard Mark I”
https://www.youtube.com/watch?v=4ObouwCHk8w
…
The Modified Harvard Architecture (MHA) is about the most ubiquitous architecture for low-cost microcontrollers
Main feature of the MHA:
Allows program memory to be read… or
May even allow program memory to be written…
But reprogramming flash is not necessarily just a matter of writing to an address (which would leave the potential for a simple mistake like M[0x0] = 0).
It usually needs a process to be followed, e.g.: erase the block, put the data in a write buffer, then call the routine that talks to flash to write the block of memory.
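The three steps can be sketched with a toy model. `ToyFlash` and its methods are hypothetical names for illustration, not the API of any real microcontroller; the sketch assumes erased cells read 0xFF and that a block must be erased before it is written.

```python
class ToyFlash:
    """Toy model of flash program memory (hypothetical, not a real device API)."""
    ERASED = 0xFF

    def __init__(self, blocks=4, block_size=8):
        self.block_size = block_size
        self.cells = [self.ERASED] * (blocks * block_size)
        self.buffer = None

    def erase_block(self, block):
        start = block * self.block_size
        self.cells[start:start + self.block_size] = [self.ERASED] * self.block_size

    def load_buffer(self, data):
        if len(data) != self.block_size:
            raise ValueError("write buffer must hold exactly one block")
        self.buffer = list(data)

    def commit(self, block):
        """Write the buffer to a block; refuses if the block was not erased first."""
        if self.buffer is None:
            raise RuntimeError("write buffer is empty")
        start = block * self.block_size
        if any(c != self.ERASED for c in self.cells[start:start + self.block_size]):
            raise RuntimeError("block must be erased before it is written")
        self.cells[start:start + self.block_size] = self.buffer
        self.buffer = None

flash = ToyFlash()
flash.erase_block(1)                            # 1) erase the block
flash.load_buffer([1, 2, 3, 4, 5, 6, 7, 8])     # 2) put data in the write buffer
flash.commit(1)                                 # 3) call the routine that writes the block
print(flash.cells[8:16])                        # [1, 2, 3, 4, 5, 6, 7, 8]
```

There is deliberately no method that writes a single address directly, so a stray assignment like M[0x0] = 0 has no equivalent here.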
You’re probably thinking: “Eeek! No way?!” That is surely just JvN anyway?
You already know about the ARM architecture, e.g. flavours by STM, Texas Instruments, NXP/Freescale, Intel…
For which TSMC (Taiwan Semiconductor Manufacturing Company) and GlobalFoundries manufacture most of the actual chips.
But you should also know about these microcontrollers:
- PIC (Peripheral Interface Controller)
- AVR
PIC and AVR are essential for designing low-power, low-cost systems!
PIC
- Developed by Microchip Technology (derived from the PIC1650 originally designed by General Instrument's Microelectronics Division)
- Available since 1976*
- Some use pure Harvard architecture, i.e. program memory is protected; others use Modified Harvard, where the program memory can be read (and sometimes even written)
- All current models use flash memory for program storage
- Low cost, low power, ease of reprogramming with built-in EEPROM
- Abundant development tools and application notes
- Often marketed as “PICmicro”
- A long history of use, probably the most used microcontroller of all time!
* Information source: https://en.wikipedia.org/wiki/PIC_microcontrollers
AVR – “Alf and Vegard's RISC processor”
- Modified Harvard architecture 8-bit RISC
- Even lower power options, generally low cost (maybe the cheapest AVR is not as cheap as the cheapest PIC)
- ATmega328P is one of the most popular: a very low power but fairly powerful micro
- Major claim to fame: one of the first micros to have on-chip flash, instead of one-time programmable ROM or the hassles of using EEPROM
Inspiring historical fact: The AVR architecture was conceived by two students from the Norwegian Institute of Technology (NTH), Alf-Egil Bogen and Vegard Wollan*
The name “AVR” is commonly thought to stand for “Alf and Vegard's RISC processor”
* Information source: https://en.wikipedia.org/wiki/AVR_microcontrollers
- Coffee percolator
- ABS brakes
- Radio control of a model airplane
- Car radio (e.g., auto tuning, stored stations)
- Washing machine
- Portable battery-operated game console
- Weather sensor
- Garage door opener
- iPod
- Burglar alarm
- Radar car speed sensor
- Wireless network switch
Which of these applications do you think an ARM would be suited to?
For which of these applications might you instead use an Intel 8051, AVR or PIC? Or even some other option?
Suggested processor choices
I’d probably make the following choices…
- Coffee percolator - PIC
- ABS brakes - ARM or AVR
- Radio control of a model airplane – PIC or AVR
- Car radio – PIC for auto tuning feature at least
- Washing machine – Probably a PIC
- Portable battery-operated game console - ARM
- Weather sensor – PIC (lowest power options, usually doesn’t need speed)
- Garage door opener – PIC or AVR, something cheap
- iPod – ARM
- Burglar alarm - probably a PIC, if there’s not much signal processing
- Radar car speed sensor – probably a DSP and an ARM or PIC for comms
- Wireless network switch – perhaps an ARM or something super fast, like a DSP
Class activity – to ARM or not to ARM: sample solution & discussion
In a wide range of embedded computers and control systems, e.g.:
- Highly reliable: ABS, hard drives
- Consumer: cameras, PDAs
- Cell phones (e.g. Nokia N series, Sony K series)
- Game systems (Nintendo DS, etc.)
- Network systems (routers, switches, firewalls, etc.)
[Images: Nintendo DS, cellphone, iPod, network switch, ABS braking]
A processor design can be considered as comprising
Datapath: moving data/signals around; and
Control: making decisions (e.g. whether or not to do an operation) and doing operations (e.g. adding two registers)
The specification of a computer is provided by defining its appearance to the programmer at its lowest level, its Instruction Set Architecture (ISA) level
From the ISA the computer architecture is developed…
Computer architecture development essentially involves deciding its datapath and control.
This is also an effective approach for designing a processor/CPU, or for designing a special-purpose application accelerator or co-processor
Datapaths
Most generally this refers to the registers, processing units, and interconnections (busses) that are used to process and transfer data in a computer system
A datapath comprises:
- A set of registers (that store data)
- Microoperations to perform operations on data stored in the registers
- Control interfaces (for sequencing and arbitrating operations)
Recommended video: “How a datapath works inside a computer system”. Available at: https://youtu.be/ibYYqvp9FmU
Keep that in the back of your mind while we look into the digital circuit mechanisms for how an instruction is decoded, data gathered, operations invoked, and results stored…
Hopefully this will give you a detailed understanding of the CPU workings so that you can develop your own ☺
You presumably remember the cycle:
1) Fetch
2) Decode
3) Execute
Covered in EEE3095/6S
Datapath design is
largely about the
circuitry to store
data and to select
and connect data to
be transferred
between processing
units in a processor
Will briefly explain the parts…
[Figure: Example of a Datapath, with annotations marking the store, select and connect circuitry around the processing unit]
Based on CH8 Fig. 8-1 of Logic and Computer Design Fundamentals, Fifth Edition, by Mano, Kime & Martin. Pearson. 2016.
Onwards to….
The Decoder
‘Fans out bits’, i.e. converts an m-bit input into activating one of 2^m output signals. This is used for decoding an instruction, to chip select one of multiple operations.
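A behavioural sketch of the decoder in Python (the function name `decode` is just for illustration): an m-bit value goes in and a list of 2^m signals comes out with exactly one of them active.

```python
def decode(value, m):
    """m-bit input -> list of 2**m outputs with exactly one active (1)."""
    if not 0 <= value < 2 ** m:
        raise ValueError("value does not fit in m bits")
    return [1 if i == value else 0 for i in range(2 ** m)]

print(decode(2, 2))   # [0, 0, 1, 0]  (output 2 of outputs 0..3 is active)
```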
The Register
Stores values. Basically a D-type flip flop that can be enabled (load) or disabled (read).
[Figure: register with an n-bit Data input, a Load input (otherwise reads) and an n-bit output Q (data out)]
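A behavioural sketch of such a register, assuming an n-bit width and one call to `clock` per clock edge (both are modelling choices for illustration, not taken from the figure):

```python
class Register:
    """n-bit register: captures Data on a clock edge only when Load is set."""

    def __init__(self, n):
        self.mask = (1 << n) - 1
        self.q = 0                      # Q (data out)

    def clock(self, data, load):
        """One clock edge: capture data if load is set, otherwise hold."""
        if load:
            self.q = data & self.mask   # keep only the low n bits
        return self.q

r = Register(8)
r.clock(0x1AB, load=True)    # stores the low 8 bits, 0xAB
r.clock(0x55, load=False)    # load disabled, value is held
print(hex(r.q))              # 0xab
```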
The Multiplexer
An m-bit sel input selects 1 of 2^m input lines (I) to connect through to the output line (Y).
[Figure: multiplexer with inputs I0 to I3, sel lines and output Y]
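A behavioural sketch of the multiplexer (the function name `mux` is illustrative): `sel` picks which input is passed through to Y.

```python
def mux(inputs, sel):
    """Connect one of the 2**m inputs to the output, chosen by the sel value."""
    n = len(inputs)
    if n == 0 or n & (n - 1):
        raise ValueError("number of inputs must be a power of two")
    if not 0 <= sel < n:
        raise ValueError("sel out of range")
    return inputs[sel]

# I0..I3 carry 10, 20, 30, 40; sel = 2 connects I2 to Y.
print(mux([10, 20, 30, 40], sel=2))   # 30
```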
General Computer Architecture comprises:
• Instruction Set Architecture*
• Storage Resources:
  • Instruction memory,
  • Data memory,
  • Register file,
  • Program counter
• Datapath:
  • Runs instructions and activates the processes
…
Based on CH8 of Logic and Computer Design Fundamentals, Fifth Edition, by Mano, Kime & Martin. Pearson. 2016.
Instruction Set Architecture (ISA) =
• Instruction format: how the instructions are structured, how the bits of the instruction link to operations/data
• Instruction specification: describes each of the available instructions that the system can execute.
* we delve into detail later in the course
An example instruction format that is likely used for arithmetic/logic operations on a RISC processor
Based on CH8 Fig. 8-15 of Logic and Computer Design Fundamentals, by Mano, Kime & Martin. Pearson. 2016.
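Since the figure itself is not reproduced in this transcript, here is a hedged sketch of what such a format could look like. It assumes a 16-bit instruction word split into a 7-bit opcode, a 3-bit destination register (DR) and two 3-bit source registers (SA, SB); these widths are an assumption for illustration and should be checked against the figure. The opcode value used is arbitrary.

```python
from collections import namedtuple

Fields = namedtuple("Fields", "opcode dr sa sb")

def encode(opcode, dr, sa, sb):
    """Pack the fields into one 16-bit word (assumed layout: 7 | 3 | 3 | 3 bits)."""
    return ((opcode & 0x7F) << 9) | ((dr & 0x7) << 6) | ((sa & 0x7) << 3) | (sb & 0x7)

def decode_fields(word):
    """Unpack a 16-bit word back into its fields."""
    return Fields((word >> 9) & 0x7F, (word >> 6) & 0x7, (word >> 3) & 0x7, word & 0x7)

word = encode(opcode=2, dr=1, sa=2, sb=3)   # opcode 2 is an arbitrary example value
print(f"{word:016b}")                       # 0000010001010011
print(decode_fields(word))                  # Fields(opcode=2, dr=1, sa=2, sb=3)
```

In hardware no shifting is needed: each field is simply a fixed group of wires from the instruction word routed to the decoder or multiplexer that uses it.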
Function Unit (FU)
(or ‘execution unit’)
• Carries out the operations
• Could be structured in
various ways.
• Typically…
FU = ALU + CU
and possibly
FUmain = Σ (FUsubunits )
for a multi-processor system
Block diagram of a single cycle computer
Based on CH8 Fig. 8-15 of Logic and Computer Design Fundamentals, by Mano, Kime & Martin. Pearson. 2016.
As shown in this diagram, the FU has two input registers, A and B, that it operates on. There is a Function selector (FS) input used to choose the operation to perform. When this operation is performed, the result output is latched to F and updates status flags V, C, N, Z. The MD (memory data) select can bypass the FU if the instruction is loading from memory.
This is used for a store instruction, i.e. using A as address and B as data to write to the address
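A behavioural sketch of this description, under stated assumptions: the FS values are strings chosen for readability (a real FS is a bit pattern), only three operations are modelled, the width defaults to 8 bits, and the MD convention (0 passes the FU result, 1 passes memory data) is assumed rather than read from the figure.

```python
def function_unit(a, b, fs, n=8):
    """Return (F, flags) for the operation chosen by FS on n-bit inputs A and B."""
    mask = (1 << n) - 1
    sign = 1 << (n - 1)
    a &= mask
    b &= mask
    c = v = 0
    if fs == "ADD":
        total = a + b
        f = total & mask
        c = total >> n                              # carry out of the top bit
        v = int(bool((a ^ f) & (b ^ f) & sign))     # signed overflow
    elif fs == "AND":
        f = a & b
    elif fs == "OR":
        f = a | b
    else:
        raise ValueError(f"unknown FS {fs!r}")
    flags = {"V": v, "C": c, "N": int(bool(f & sign)), "Z": int(f == 0)}
    return f, flags

def md_select(f, memory_data, md):
    """MD select: 0 passes the FU result, 1 bypasses the FU with data from memory."""
    return memory_data if md else f

print(function_unit(0x7F, 0x01, "ADD"))   # (128, {'V': 1, 'C': 0, 'N': 1, 'Z': 0})
print(function_unit(0xFF, 0x01, "ADD"))   # (0, {'V': 0, 'C': 1, 'N': 0, 'Z': 1})
print(md_select(128, 42, md=1))           # 42
```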
The Arithmetic/Logic Unit (ALU)
Arithmetic operations: + - * /
Logic operations: and, or, xor, not
The ALU has
• Inputs to provide data to be processed, as well as
• An input for the type of operation to be performed (e.g. if it needs to do an ADD or AND, etc.)
• Outputs that provide the results of the operation (G output in figure).
• Output flags, e.g. the carry (C) output flag, that store status information of the operation (e.g. carry flag set) or error states (divide by zero).
[Figure: An n-Bit ALU]
Based on CH8 Fig. 8-2 of Logic and Computer Design Fundamentals, by Mano, Kime & Martin. Pearson. 2016.
Control Unit (CU)
The CU is the part of the CPU that directs the working of the processor.
The CU coordinates the computer's memory, ALU and its input and output devices, making these parts respond to program instructions.
Control Unit: can be abstracted as the circuitry used to sequence and control the system (this simplification obviously hides many aspects such as memory and IO)
Sometimes the control unit is not shown in the design of a computer (or cannot be shown as a distinct subsystem) as it may be distributed through the system, closely integrated within other components; this is particularly the case for highly complex pipelined architectures where it is difficult to have a separate component to arbitrate the system.
Based on CH8 Fig. 8-1 of Logic and Computer Design Fundamentals, Fifth Edition, by Mano, Kime & Martin. Pearson. 2016.
We’ve looked at the
datapath to connect
the pieces, the
register file, the ALU,
the FU, the CU.
Now you hopefully have an
inkling of how the pieces
work together to 1) get data
to the FU inputs, 2) grab
results from FU, 3) move
FU results/memory to
register or memory address
Don’t be too worried yet: we are going to map out an instruction to the datapath…
[Figure: Example of a Datapath, with annotations marking the store, select and connect circuitry around the processing unit]
How an instruction is executed
[Figure: Arithmetic Logic Unit (ALU) with the following signals]
- A: A data in (n bits)
- B: B data in (n bits)
- Cin: carry input
- S: select function
- Y: result output Y
- Cout: carry out
- Z: zero out
- Selecting where Y will be sent
- Selecting where A comes from
- Selecting where B comes from
- Selecting operation to perform; this may or may not activate the ALU (e.g. for LOAD the ALU might do nothing)
SA circuitry will select the right source register and connect it to A.
SB circuitry will select the right source register and connect it to B.
Cin for a simple processor will just connect to a D FF that remembers the last carry state.
The Opcode part of the instruction will link to a decoder (and some other logic) and select the correct ALU option; for some opcodes the ALU just does a NOP or passthrough.
DR circuitry will select the right destination to send the result (Y) of the ALU (this may be to memory).
Carry output might be sent somewhere else, but usually to a D FF which loops back as input for the next instruction.
The zero out flag might activate further things but probably also loops back to the input for the next instruction.
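Putting these notes together, the sketch below steps one instruction through a toy datapath. Everything specific here is an assumption for illustration: a four-entry register file, 8-bit width, instructions written as `(opcode, DR, SA, SB)` tuples, an add that always includes the carry input, and only two ALU functions (`ADD` and `PASS_A`).

```python
def alu(a, b, cin, s, n=8):
    """Return (Y, Cout, Z) for the selected function s."""
    mask = (1 << n) - 1
    if s == "ADD":
        total = a + b + cin
    elif s == "PASS_A":              # ALU does no arithmetic, A goes straight through
        total = a
    else:
        raise ValueError(f"unknown function {s!r}")
    y = total & mask
    return y, total >> n, int(y == 0)

def execute(regs, flags, instr):
    """Run one instruction: select sources, run the ALU, store result and flags."""
    opcode, dr, sa, sb = instr
    a = regs[sa]                     # SA circuitry: pick the source register for A
    b = regs[sb]                     # SB circuitry: pick the source register for B
    y, cout, z = alu(a, b, flags["C"], opcode)
    regs[dr] = y                     # DR circuitry: pick where Y is stored
    flags["C"], flags["Z"] = cout, z # flip-flops remember flags for the next instruction

regs = [0, 200, 100, 0]
flags = {"C": 0, "Z": 0}
execute(regs, flags, ("ADD", 3, 1, 2))   # R3 = R1 + R2 + C; 300 does not fit in 8 bits
print(regs[3], flags)                    # 44 {'C': 1, 'Z': 0}
execute(regs, flags, ("ADD", 0, 0, 0))   # R0 = 0 + 0 + carry-in of 1
print(regs[0], flags)                    # 1 {'C': 0, 'Z': 0}
```

The second instruction shows the carry flip-flop at work: the carry produced by the first add is fed back in as Cin for the next one.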
But before we get into this…
Micro instructions
As a means to plan and design an instruction set
Micro instructions
Major instruction
Ahmadal’s test
FREE Creative Commons License
COUNTRY BOY
Music: https://www.bensound.com
Next up: instruction path
micro-instructions
Cache hierarchy, shared memory
I suggest taking a break and stretching a bit if you are planning
to directly proceed with the next presentation.
Image sources:
Clipart sources – public domain CC0 (http://pixabay.com/)
datacenter image – pixabay
ant – needpix.com
commons.wikimedia.org
images from flickr
Disclaimers and copyright/licensing details
I have tried to follow the correct practices concerning copyright and licensing of material,
particularly image sources that have been used in this presentation. I have put much
effort into trying to make this material open access so that it can be of benefit to others in
their teaching and learning practice. Any mistakes or omissions with regards to these
issues I will correct when notified. To the best of my understanding the material in these
slides can be shared according to the Creative Commons “Attribution-ShareAlike 4.0
International (CC BY-SA 4.0)” license, and that is why I selected that license to apply to
this presentation (it’s not because I particularly want my slides referenced but more to
acknowledge the sources and generosity of others who have provided free material such
as the images I have used).