Post on 14-Dec-2015
CSCI1412
Lecture 3
Hardware 3
More Architecture
Dr John Cowell
phones off (please)
Overview
How it works!
  the fetch / execute cycle in detail
Measuring speed
  system clock, GHz, MIPS and FLOPS
Advanced concepts
  cache, pipelining, parallelism
memory issues
  dynamic and static RAM, SIMMS, DIMMS, and specialist memory
motherboards
  component layout
© De Montfort University, 2007 CSCI1412-HW-3 2
The Fetch / Execute Cycle
[Figure: the fetch / execute cycle. Components: control unit, RAM, arithmetic / logic unit. Stages: fetch, decode, execute, (store)]
Buses
Computer memory is made up of a set of locations. Each has a unique address.
The address bus specifies the location.
The data bus transfers the data.
The control bus determines e.g. read or write.
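As a rough illustration of how the three buses cooperate, the sketch below models RAM as a list of locations. The class name `ToyRAM` and the `READ` / `WRITE` control values are invented for this example and do not correspond to any real hardware interface.

```python
# Toy model of memory access over the three buses (illustrative only).
READ, WRITE = "read", "write"   # hypothetical control bus signals


class ToyRAM:
    def __init__(self, size):
        # Each location has a unique address: its index in the list.
        self.cells = [0] * size

    def access(self, address_bus, control_bus, data_bus=None):
        """address_bus selects the location, control_bus says read or write,
        data_bus carries the value to write; the value read is returned."""
        if control_bus == WRITE:
            self.cells[address_bus] = data_bus
            return data_bus
        if control_bus == READ:
            return self.cells[address_bus]
        raise ValueError("unknown control signal")


ram = ToyRAM(0x2000)
ram.access(0x1234, WRITE, 0xC6)     # write C6 to location 1234
value = ram.access(0x1234, READ)    # read it back
print(hex(value))
```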
Registers
A CPU contains special purpose registers (typically 32)
  Very high speed memory within the processor chip
  each register contains a fixed number of bits
    e.g. each register in a 32-bit processor has 32 bits
  Contain instructions to be executed, data being operated on, etc.
Typically there are several named registers
  SCR sequence control register
    holds location of the next piece of information to be fetched
    controls the sequence of instructions
    each time it is accessed, it is automatically incremented (increased) by one
  CIR current instruction register
    holds the instruction about to be processed
More Registers
Registers, continued ...
  MAR memory address register
    holds the location (the address) of information about to be read from or written to RAM
  MDR memory data register
    holds the value of information just read from or about to be written to RAM
  ACC accumulator(s)
    hold result(s) of processing
Sometimes a processor also has one or more
  STO general purpose store(s)
    hold temporary data value(s) for processing
Machine Code
Very simple low level instructions.
A single high level language instruction (e.g. VB) may require many machine code instructions.
An integral part of the processor.
An instruction has an operation code (opcode), followed by zero or more items of data (operands)
Machine Code
For example
  in Zilog Z80 machine code (8-bit processor) instruction C6₁₆ (hexadecimal) means add the data held at the following location to the current accumulator
  suppose that the SCR currently holds 1234₁₆, ACC holds 5₁₆ and the contents of memory are as shown below.
What is the sequence the registers are used in?
location   value
1234₁₆     C6₁₆     operation code
1235₁₆     10₁₆     operand
Adding data to the Acc.
SCR        MAR        MDR     CIR            Acc
(address)  (address)  (data)  (instruction)  (data)
1234       -          -       -              5
1234       1234       -       -              5
1235       1234       -       -              5
1235       1234       C6      -              5
1235       1234       C6      C6             5
1235       1235       C6      C6             5
1236       1235       C6      C6             5
1236       1235       10      C6             5
1236       1235       10      C6             15
Sequence of Actions
Fetch
  SCR → MAR, put address of next instruction into the MAR
  SCR+1 → SCR, point to the next memory location
  MAR → RAM → MDR → CIR, read from RAM address (MAR), into the MDR, into the CIR
Decode
  Contents of CIR - instruction number C6₁₆ means ... data required ...
Execute
  SCR → MAR, put address of data into the MAR
  SCR+1 → SCR, point to the next instruction
  MAR → RAM → MDR, read from RAM address (MAR), into the MDR
Store
  MDR + ACC → ACC, add the MDR and ACC contents
  in this case, the result is stored in the accumulator
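The sequence of actions can be traced with a short simulation. This is only a sketch: it models the single opcode from the example (C6), uses a dictionary as a stand-in for RAM, and assumes an 8-bit accumulator that wraps on overflow. The function and variable names are invented for illustration.

```python
# Minimal simulation of one fetch / decode / execute / store cycle.
# Register names follow the slides; everything else is illustrative.
ADD_FOLLOWING_BYTE = 0xC6   # opcode from the example


def run_one_instruction(ram, scr, acc):
    reg = {"SCR": scr, "MAR": None, "MDR": None, "CIR": None, "ACC": acc}
    trace = [dict(reg)]     # snapshot of all registers after every step

    def step(name, value):
        reg[name] = value
        trace.append(dict(reg))

    # Fetch
    step("MAR", reg["SCR"])           # SCR -> MAR
    step("SCR", reg["SCR"] + 1)       # SCR + 1 -> SCR
    step("MDR", ram[reg["MAR"]])      # MAR -> RAM -> MDR
    step("CIR", reg["MDR"])           # MDR -> CIR

    # Decode
    if reg["CIR"] != ADD_FOLLOWING_BYTE:
        raise NotImplementedError("only opcode C6 is modelled here")

    # Execute: the opcode needs one operand, so read the next location
    step("MAR", reg["SCR"])           # SCR -> MAR
    step("SCR", reg["SCR"] + 1)       # SCR + 1 -> SCR
    step("MDR", ram[reg["MAR"]])      # MAR -> RAM -> MDR

    # Store (assumption: 8-bit accumulator, so keep only the low 8 bits)
    step("ACC", (reg["ACC"] + reg["MDR"]) & 0xFF)
    return reg, trace


ram = {0x1234: 0xC6, 0x1235: 0x10}
final, trace = run_one_instruction(ram, scr=0x1234, acc=0x05)
for row in trace:
    print(" ".join("-" if v is None else format(v, "X") for v in row.values()))
```

Each printed row lists SCR, MAR, MDR, CIR and ACC in hexadecimal, one row per register transfer.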
Measuring Speeds
The System Clock
What controls the fetch / execute cycle?
  the system clock
  this is a quartz chip that provides pulses at a regular, rapid rate, like a metronome
  n.b. not the same as the real date / time clock
The first microprocessor originally ran at 100 kHz, the Pentium IV is now at 1.2 – 4.0 GHz
A clock tick starts the fetch / execute cycle
  it may take several (perhaps tens of) clock ticks to complete one complex instruction
Gigahertz
The ‘simplest’ measure of speed is just the rate at which the system clock ticks
  usually quoted in Gigahertz (GHz)
  1 Hertz = 1 cycle per second
  1 Megahertz = 1 million cycles per second
  1 Gigahertz = 1 billion cycles per second
This is meaningful within one type of processor
  e.g. a 2.4 GHz Pentium is twice as quick as a 1.2 GHz Pentium
But it is not meaningful for comparing different processor types
  different processors may take different numbers of cycles to fetch / execute the ‘same’ instruction
  e.g. a Pentium takes X cycles to load a number into the accumulator, whereas a 68040 takes Y cycles
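A little arithmetic shows why clock rate alone is not enough. In the sketch below the cycle counts (4 and 2) are made-up placeholders for the X and Y above, not measured figures for any real processor.

```python
def instruction_time_ns(clock_hz, cycles):
    """Time for one instruction = number of cycles x length of one cycle."""
    cycle_time_s = 1.0 / clock_hz
    return cycles * cycle_time_s * 1e9   # seconds to nanoseconds


GHZ = 1_000_000_000

# Same processor type, same cycle count: doubling the clock halves the time.
slow = instruction_time_ns(1.2 * GHZ, cycles=4)
fast = instruction_time_ns(2.4 * GHZ, cycles=4)

# A different processor type that needs fewer cycles for the 'same'
# instruction can keep up while running at only 1.2 GHz.
other = instruction_time_ns(1.2 * GHZ, cycles=2)

print(round(slow, 3), round(fast, 3), round(other, 3))
```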
MIPS
In order to overcome the limitations of GHz, some manufacturers prefer to use MIPS
  millions of instructions per second
  found by counting the number of cycles (on average) that a processor takes to execute an instruction
However, this is still not very helpful
  which instructions !?
  some instructions may be very short: LOAD ACC,0
  some instructions may be very long
    store value zero into RAM from location 0x1000 to 0x1FFF
Can be found by standard benchmarks
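The "which instructions?" problem can be seen by computing MIPS as clock rate divided by average cycles per instruction. The instruction mixes below are hypothetical, chosen only to show how much the figure moves on the same processor.

```python
def mips(clock_hz, average_cycles_per_instruction):
    """Millions of instructions per second for a given clock rate."""
    return clock_hz / (average_cycles_per_instruction * 1_000_000)


def average_cycles(mix):
    """mix: list of (fraction_of_program, cycles) pairs; fractions sum to 1."""
    return sum(fraction * cycles for fraction, cycles in mix)


# Hypothetical workloads on the same 2.4 GHz processor.
mostly_short = [(0.9, 1), (0.1, 20)]   # 90% one-cycle instructions
mostly_long = [(0.5, 1), (0.5, 20)]    # half are 20-cycle instructions

print(mips(2.4e9, average_cycles(mostly_short)))
print(mips(2.4e9, average_cycles(mostly_long)))
```

The same chip reports a much higher MIPS figure on the first mix than on the second, which is why a quoted MIPS value says little without knowing the benchmark behind it.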
FLOPS
Perhaps, as computers are often used for mathematical calculations, a better measure would be the number of floating point operations that can be carried out per second
  FLOPS: floating point operations per second
  found by running standard mathematical benchmarks
However, what use are FLOPS to
  a business person using a spreadsheet?
  a secretary writing letters on a word processor?
  a computer scientist compiling programs in C++?
Benchmarking
There is no satisfactorily agreed single method of measuring the speed of computers
  actual system speed also depends on RAM speed, bus speeds, video performance, hard disk speeds, etc.
Many magazines set up standard tasks simulating general office / scientific use
  e.g. Excel / Word running under Windows Vista
  these may provide a good comparison of systems, but may only be applicable to one type of computer (Windows PC) for a short amount of time
  what happens when Windows Vista becomes obsolete!?
Other Architectural Aspects
Caching
Intermediate storage - uses high-speed SRAM
Holds recently accessed instructions/data
  high probability that these will be re-used
Different types of cache:
  primary cache (Level 1) - in the processor
    8 KB - 32 KB
    fastest type of cache
  secondary (Level 2) - also now in the processor
    512 KB - 1 MB
    (used to be called cache-on-a-stick - COAST)
  disk cache (Level 3) - section of RAM
    specified by the user (or automatically by operating system)
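The benefit of holding recently accessed data can be sketched with a small simulation. The slides do not name a replacement policy, so least-recently-used eviction is assumed here, and the addresses are arbitrary.

```python
from collections import OrderedDict


def hit_rate(addresses, cache_size):
    """Fraction of accesses served from a small least-recently-used cache."""
    cache = OrderedDict()   # keys are cached addresses, oldest first
    hits = 0
    for address in addresses:
        if address in cache:
            hits += 1
            cache.move_to_end(address)      # mark as most recently used
        else:
            cache[address] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)   # evict the least recently used
    return hits / len(addresses)


# A loop that keeps re-reading the same 4 locations, 100 times over.
loop_accesses = [0x1000, 0x1001, 0x1002, 0x1003] * 100
print(hit_rate(loop_accesses, cache_size=8))   # 0.99
print(hit_rate(loop_accesses, cache_size=2))   # 0.0
```

With room for all four locations only the first pass misses. With room for just two, each location is evicted before it is needed again, so the cache never helps.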
Pipelining
Technique used to increase processing speed
Processor begins to execute a second instruction before first has been completed
Therefore several instructions are in the pipeline
  up to six instructions in the Pentium
The pipeline is divided into segments
  segments are processed concurrently
Also used in RAM to preload the next requested memory content
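The gain from overlapping instructions can be estimated with a simple cycle count. This sketch assumes an idealised pipeline in which every segment takes one clock tick and nothing ever stalls; the six segments are used purely as an example figure.

```python
def cycles_without_pipeline(instructions, stages):
    # Each instruction passes through every stage before the next one starts.
    return instructions * stages


def cycles_with_pipeline(instructions, stages):
    # The first instruction takes `stages` ticks to fill the pipeline;
    # after that, one instruction completes on every tick.
    return stages + (instructions - 1)


n, k = 100, 6   # 100 instructions, hypothetical 6-segment pipeline
print(cycles_without_pipeline(n, k))   # 600
print(cycles_with_pipeline(n, k))      # 105
```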
Parallelism
Intel Pentium processors have a form of parallelism called:
  single instruction multiple data (SIMD)
The same instruction is run on multiple data at the same time
  improves the speed at which sets of data requiring the same operation can be processed
  most of these extensions are for floating-point ops.
Typically used for complex co-ordinate transforms
  found in e.g. 3-D games graphics when a picture is being updated to form the next frame in a motion
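The idea of one instruction acting on several data items can be modelled conceptually as below. This is not real SIMD: Python processes the lanes one after another, whereas the hardware would handle them in a single step. The function name and the four-lane width are assumptions made for illustration.

```python
def simd_multiply_add(values, scale, offset):
    """Conceptual model only: one 'instruction' (multiply then add)
    applied identically to every lane of packed data."""
    return [v * scale + offset for v in values]


# Four x co-ordinates packed together, as a 4-lane register might hold them.
xs = [1.0, 2.0, 3.0, 4.0]
moved = simd_multiply_add(xs, scale=2.0, offset=0.5)
print(moved)   # [2.5, 4.5, 6.5, 8.5]
```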
RAM
Random Access Memory
Volatile memory which loses its data when the power is switched off.
Two main types:
  SRAM. Static RAM
  DRAM. Dynamic RAM
SRAM and DRAM
Differences between static and dynamic RAM:
  Dynamic RAM must be refreshed or it will lose its data
  Static RAM only needs current to be applied; bits do not need to be refreshed.
Both SRAM and DRAM are volatile.
Most modern computers use some form of DRAM for the main memory.
SRAM
Used in small amounts in computers where very fast RAM is required, such as in the cache of many CPUs.
DRAM is much less expensive than SRAM, but is usually slower and must constantly be refreshed in order to preserve its contents.
Types of SRAM include:
  Asynchronous Static RAM
  Synchronous Burst Static RAM
  Pipeline Burst Static RAM
DRAM
DRAM: each data bit is stored in a separate capacitor.
Dynamic because the stored charge leaks away, so the data must be refreshed periodically to maintain its integrity. The benefit of refreshing is the avoidance of corruption.
Types of DRAM include:
  SDRAM Synchronous Dynamic Random Access Memory
  DDR SDRAM Double Data Rate SDRAM
SDRAM
SDRAM - Synchronous Dynamic Random Access Memory.
Dynamic because it requires refreshing to maintain data integrity.
Synchronous because it lines itself up with the computer system bus and processor. The computer's internal clock drives the entire mechanism.
Can accept more than one write command at a time (pipelining).
DDR SDRAM
DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory)
Achieves nearly twice the bandwidth of single data rate SDRAM by double pumping (transferring data on the rising and falling edges of the clock signal) without increasing the clock frequency.
DDR2 and DDR3
An evolution of DDR, with higher internal bus speeds.
  DDR2 bus runs at twice the speed of DDR memory.
  DDR3 runs at even higher speeds.
Most modern computers use DDR, DDR2 or DDR3 packaged in DIMMs (Dual In-line Memory Modules), whose electrical contacts plug directly into the main board.
DIMMs have a 64 bit data bus (as do Pentium processors)
SIMMs (now obsolete) have a 32 bit bus
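The effect of double pumping on a 64 bit bus can be worked through numerically. The 100 MHz memory clock is a hypothetical value, and the result is a theoretical peak (in millions of bytes per second), which real memory only approaches.

```python
def peak_bandwidth_mb_per_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak rate = clock rate x transfers per clock tick x bytes per transfer."""
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * transfers_per_clock * bytes_per_transfer


BUS_WIDTH = 64   # bits, the DIMM data bus width given above

# Same hypothetical 100 MHz memory clock in both cases.
sdr = peak_bandwidth_mb_per_s(100, transfers_per_clock=1, bus_width_bits=BUS_WIDTH)
ddr = peak_bandwidth_mb_per_s(100, transfers_per_clock=2, bus_width_bits=BUS_WIDTH)
print(sdr, ddr)   # 800 1600
```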
Mainboard Layout
Intel D945GNT
• Dual-channel DDR2 667 / 533 / 400 memory support
• PCI Express* x16 graphics connector
• Two PCI Express* x1 connectors
• Four Serial ATA ports (3.0 Gb/s)
• Integrated Intel® PRO 10/100 Network Connection
• Intel® High Definition Audio with 5.1 Surround Sound
• Eight Hi-Speed USB 2.0 ports
• Intel® Precision Cooling Technology
Mainboard Layout
A Auxiliary fan connector (optional)
B Speaker
C PCI Express x1 bus add-in card connectors [2]
D Audio codec
E Front panel audio connector
F Ethernet device
G PCI Conventional bus add-in card connectors [2]
H PCI Express x16 bus add-in card connector
I Back panel connectors
J +12V power connector (ATX12V)
K Rear chassis fan connector
L LGA775 processor socket
M Intel 82945G GMCH
N Processor fan connector
O DIMM Channel A sockets [2]
P DIMM Channel B sockets [2]
Q SCSI LED connector (optional)
R Legacy I/O controller
S Power connector
T Diskette drive connector
U Parallel ATA IDE connector
V Battery
W Front chassis fan connector
X BIOS Setup configuration jumper block
Y Serial ATA connectors [4]
Z Auxiliary front panel power LED connector
AA Front panel connector
BB Front panel USB connectors [2]
CC Chassis intrusion connector
DD Intel 82801G I/O Controller Hub (ICH7)
EE SPI flash device
FF IEEE-1394a controller (optional)
GG Front panel IEEE-1394a connectors (optional) [2]
HH PCI Conventional bus add-in card connectors
Motherboard in Situ
Cooling can be a problem....
Summary
How it works!
  the fetch / execute cycle in detail
Measuring speed
  system clock, GHz, MIPS and FLOPS
Advanced concepts
  cache, pipelining, parallelism
memory issues
  dynamic and static RAM, SIMMS and DIMMS
motherboards
  component layout