Hardware 3 More Architecture Dr John Cowell phones off (please)


CSCI1412 Lecture 3

Hardware 3: More Architecture

Dr John Cowell

phones off (please)

Overview

• How it works!
  • the fetch / execute cycle in detail
• Measuring speed
  • system clock, GHz, MIPS and FLOPS
• Advanced concepts
  • cache, pipelining, parallelism
• Memory issues
  • dynamic and static RAM, SIMMs, DIMMs, and specialist memory
• Motherboards
  • component layout

© De Montfort University, 2007 CSCI1412-HW-3


The Fetch / Execute Cycle

[Figure: the fetch / execute cycle, a loop of fetch, decode, execute and (store) steps linking RAM, the control unit and the arithmetic / logic unit]

Buses

• Computer memory is made up of a set of locations.
• Each has a unique address.
• The address bus specifies the location.
• The data bus transfers the data.
• The control bus determines the operation, e.g. read or write.
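The roles of the three buses can be sketched in a few lines of Python. This is an illustrative model only, not part of the original slides: the `Memory` class, its `cycle` method and the string values `"read"` and `"write"` are assumptions made for the example.

```python
class Memory:
    """Toy RAM: a set of locations, each with a unique address (hypothetical model)."""

    def __init__(self, size):
        self.cells = [0] * size  # every location starts at zero

    def cycle(self, address_bus, control_bus, data_bus=None):
        """One memory access: the address bus selects the location,
        the control bus chooses read or write, the data bus carries the value."""
        if control_bus == "write":
            self.cells[address_bus] = data_bus
            return data_bus
        if control_bus == "read":
            return self.cells[address_bus]
        raise ValueError("control bus must signal 'read' or 'write'")


ram = Memory(16)
ram.cycle(address_bus=3, control_bus="write", data_bus=0x2A)  # store 2A (hex) at location 3
value = ram.cycle(address_bus=3, control_bus="read")          # read it back
```

The point of the sketch is that one access always needs all three pieces of information: where (address), what (data) and which direction (control).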

Registers

• A CPU contains special purpose registers (typically 32)
  • very high speed memory within the processor chip
  • each register contains a fixed number of bits
    • e.g. each register in a 32-bit processor has 32 bits
  • contain instructions to be executed, data being operated on, etc.
• Typically there are several named registers
  • SCR: sequence control register
    • holds location of the next piece of information to be fetched
    • controls the sequence of instructions
    • each time it is accessed, it is automatically incremented (increased) by one
  • CIR: current instruction register
    • holds the instruction about to be processed
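Because each register holds a fixed number of bits, any value that does not fit is cut down to that width. The following Python sketch models this; the `Register` class and its method names are hypothetical, introduced only to illustrate the idea.

```python
class Register:
    """A register with a fixed number of bits (illustrative model, not a real CPU API)."""

    def __init__(self, bits, value=0):
        self.bits = bits
        self.mask = (1 << bits) - 1      # e.g. 32 bits -> FFFFFFFF (hex)
        self.value = value & self.mask

    def load(self, value):
        """Store a value, keeping only the bits that fit in the register."""
        self.value = value & self.mask

    def increment(self):
        """Add one, as the SCR does after each access; wraps round at the top."""
        self.load(self.value + 1)


scr = Register(bits=32, value=0xFFFFFFFF)
scr.increment()          # the largest 32-bit value wraps round to 0

acc = Register(bits=8)
acc.load(0x1FF)          # only the low 8 bits (FF hex) are kept
```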

More Registers

Registers, continued ...

• MAR: memory address register
  • holds the location (the address) of information about to be read from or written to RAM
• MDR: memory data register
  • holds the value of information just read from or about to be written to RAM
• ACC: accumulator(s)
  • hold result(s) of processing
• Sometimes a processor also has one or more
  • STO: general purpose store(s)
    • hold temporary data value(s) for processing

Machine Code

• Very simple low level instructions.
• A single high level language instruction (e.g. VB) may require many machine code instructions.
• An integral part of the processor.
• An instruction has an operation code (opcode), followed by zero or more items of data (operands).
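The opcode / operand structure can be illustrated by splitting a stream of bytes into instructions. The sketch below is in Python and uses a small subset of Z80 opcodes; the table `OPERAND_BYTES`, the function `split_instructions` and the sample program are assumptions chosen for the example, not material from the lecture.

```python
# Number of operand bytes that follow each opcode (small Z80 subset).
OPERAND_BYTES = {
    0x00: 0,  # NOP
    0x76: 0,  # HALT
    0x3E: 1,  # LD A,n   (load the next byte into the accumulator)
    0xC6: 1,  # ADD A,n  (add the next byte to the accumulator)
    0xC3: 2,  # JP nn    (jump to a 16-bit address)
}


def split_instructions(code):
    """Split raw machine code into (opcode, operands) pairs."""
    instructions = []
    position = 0
    while position < len(code):
        opcode = code[position]
        count = OPERAND_BYTES[opcode]  # KeyError for opcodes outside this subset
        operands = list(code[position + 1 : position + 1 + count])
        instructions.append((opcode, operands))
        position += 1 + count          # step over the opcode and its operands
    return instructions


program = bytes([0x3E, 0x05, 0xC6, 0x10, 0x76])  # LD A,5 / ADD A,10 (hex) / HALT
decoded = split_instructions(program)
```

Note that the processor can only tell how many operand bytes to read once it has decoded the opcode, which is why decoding sits between fetch and execute.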

Machine Code

• For example
  • in Zilog Z80 machine code (8-bit processor), instruction C6₁₆ (hexadecimal) means add the data held at the following location to the current accumulator
  • suppose that the SCR currently holds 1234₁₆, ACC holds 5₁₆ and the contents of memory are as shown below
• In what sequence are the registers used?

location    value
1234₁₆      C6₁₆    (operation code)
1235₁₆      10₁₆    (operand)

Adding data to the Acc.

All values are in hexadecimal.

SCR        MAR        MDR     CIR            Acc
(address)  (address)  (data)  (instruction)  (data)
1234       -          -       -              5
1234       1234       -       -              5
1235       1234       -       -              5
1235       1234       C6      -              5
1235       1234       C6      C6             5
1235       1235       C6      C6             5
1236       1235       C6      C6             5
1236       1235       10      C6             5
1236       1235       10      C6             15

Sequence of Actions

• Fetch
  • SCR → MAR: put address of next instruction into the MAR
  • SCR + 1 → SCR: point to the next memory location
  • MAR → RAM → MDR → CIR: read from RAM address (MAR), into the MDR, into the CIR
• Decode
  • contents of CIR: instruction number C6₁₆ means ... data required ...
• Execute
  • SCR → MAR: put address of data into the MAR
  • SCR + 1 → SCR: point to the next instruction
  • MAR → RAM → MDR: read from RAM address (MAR), into the MDR
• Store
  • MDR + ACC → ACC: add the MDR and ACC contents
  • in this case, the result is stored in the accumulator
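The fetch, decode, execute and store steps can be traced in code. The following is a minimal Python sketch under stated assumptions: the `Registers` class, the `run_one_instruction` function and the wrap of the accumulator to 8 bits are illustrative choices, and only the single opcode C6 (hex) from the example is handled. A snapshot of the registers is recorded after every step.

```python
from dataclasses import dataclass
from typing import Optional

ADD_IMMEDIATE = 0xC6  # add the byte at the following location to the accumulator


@dataclass
class Registers:
    scr: int                   # sequence control register
    acc: int                   # accumulator
    mar: Optional[int] = None  # memory address register (empty at the start)
    mdr: Optional[int] = None  # memory data register
    cir: Optional[int] = None  # current instruction register


def run_one_instruction(ram, regs):
    """Run one fetch / decode / execute / store cycle and return the register trace."""
    trace = []

    def snapshot():
        trace.append((regs.scr, regs.mar, regs.mdr, regs.cir, regs.acc))

    snapshot()                              # starting state
    # Fetch
    regs.mar = regs.scr; snapshot()         # SCR -> MAR
    regs.scr += 1; snapshot()               # SCR + 1 -> SCR
    regs.mdr = ram[regs.mar]; snapshot()    # RAM[MAR] -> MDR
    regs.cir = regs.mdr; snapshot()         # MDR -> CIR
    # Decode
    if regs.cir != ADD_IMMEDIATE:
        raise ValueError("this sketch only understands opcode C6")
    # Execute
    regs.mar = regs.scr; snapshot()         # SCR -> MAR (address of the operand)
    regs.scr += 1; snapshot()               # SCR + 1 -> SCR
    regs.mdr = ram[regs.mar]; snapshot()    # RAM[MAR] -> MDR
    # Store
    regs.acc = (regs.acc + regs.mdr) & 0xFF; snapshot()  # MDR + ACC -> ACC (8-bit)
    return trace


def show(value):
    """Format a register value in hexadecimal, or '-' when it is still empty."""
    return "-" if value is None else format(value, "X")


ram = {0x1234: 0xC6, 0x1235: 0x10}
trace = run_one_instruction(ram, Registers(scr=0x1234, acc=0x05))
for row in trace:
    print("  ".join(show(v).rjust(4) for v in row))  # columns: SCR MAR MDR CIR ACC
```

The trace has nine rows, one for the starting state and one per register transfer, and ends with the accumulator holding 15 (hex).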


Measuring Speeds

The System Clock

• What controls the fetch / execute cycle?
  • the system clock
  • this is a quartz chip that provides pulses at a regular, rapid rate, like a metronome
  • n.b. not the same as the real date / time clock
• The first microprocessor originally ran at 100 kHz, the Pentium IV is now at 1.2 – 4.0 GHz
• A clock tick starts the fetch / execute cycle
  • it may take several (perhaps tens of) clock ticks to complete one complex instruction

Gigahertz

• The 'simplest' measure of speed is just the rate at which the system clock ticks
  • usually quoted in Gigahertz (GHz)
    • 1 Hertz = 1 cycle per second
    • 1 Megahertz = 1 million cycles per second
    • 1 Gigahertz = 1 billion cycles per second
• This is meaningful within one type of processor
  • e.g. a 2.4 GHz Pentium is twice as quick as a 1.2 GHz Pentium
• But it is not meaningful for comparing different processor types
  • different processors may take different numbers of cycles to fetch / execute the 'same' instruction
  • e.g. a Pentium takes X cycles to load a number into the accumulator, whereas a 68040 takes Y cycles
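A short calculation shows why clock rate alone can mislead. In the Python sketch below the cycle counts are hypothetical (the slides leave X and Y unspecified), and `instruction_time_ns`, `processor_a` and `processor_b` are names invented for the example.

```python
def instruction_time_ns(clock_hz, cycles_per_instruction):
    """Time for one instruction in nanoseconds: cycles divided by clock rate."""
    return cycles_per_instruction * 1e9 / clock_hz


# Same processor type (same cycle count): doubling the clock halves the time.
slow_same_type = instruction_time_ns(clock_hz=1.2e9, cycles_per_instruction=6)
fast_same_type = instruction_time_ns(clock_hz=2.4e9, cycles_per_instruction=6)

# Different processor types (hypothetical cycle counts): the lower clock rate wins.
processor_a = instruction_time_ns(clock_hz=2.4e9, cycles_per_instruction=6)
processor_b = instruction_time_ns(clock_hz=1.2e9, cycles_per_instruction=2)

print(f"A: {processor_a:.2f} ns per instruction, B: {processor_b:.2f} ns per instruction")
```

With these assumed figures, processor B finishes the instruction sooner despite running at half the clock rate.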

MIPS

• In order to overcome the limitations of GHz, some manufacturers prefer to use MIPS
  • millions of instructions per second
  • found by counting the number of cycles (on average) that a processor takes to execute an instruction
• However, this is still not very helpful
  • which instructions!?
  • some instructions may be very short: LOAD ACC,0
  • some instructions may be very long
    • store value zero into RAM from location 0x1000 to 0x1FFF
• Can be found by standard benchmarks
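The dependence of MIPS on the instruction mix can be made concrete. The Python sketch below assumes the usual relation MIPS = clock rate / (average cycles per instruction × 1,000,000); the two instruction mixes and their cycle counts are hypothetical, and the function names are invented for the example.

```python
def average_cycles(mix):
    """Average cycles per instruction for a mix of (fraction, cycles) pairs."""
    return sum(fraction * cycles for fraction, cycles in mix)


def mips(clock_hz, cycles_per_instruction):
    """Millions of instructions per second for a given clock and average cycle count."""
    return clock_hz / (cycles_per_instruction * 1e6)


# Hypothetical mixes: mostly short instructions versus an even split with long ones.
mostly_short = [(0.75, 1), (0.25, 5)]   # averages 2.0 cycles
half_long = [(0.5, 1), (0.5, 5)]        # averages 3.0 cycles

rating_short = mips(2.0e9, average_cycles(mostly_short))
rating_long = mips(2.0e9, average_cycles(half_long))
```

The same 2 GHz processor earns a different MIPS rating for each mix, which is the "which instructions!?" problem in numbers.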

FLOPS

• Perhaps, as computers are often used for mathematical calculations, a better measure would be the number of floating point operations that can be carried out per second
  • FLOPS: floating point operations per second
  • found by running standard mathematical benchmarks
• However, what use are FLOPS to
  • a business person using a spreadsheet?
  • a secretary writing letters on a word processor?
  • a computer scientist compiling programs in C++?
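To show what "running a benchmark" means in practice, here is a deliberately crude FLOPS measurement in Python. It is a sketch only: `measure_flops` is a made-up function, and because Python is interpreted the figure mostly reflects interpreter overhead rather than what the hardware can do. Standard benchmarks are far more careful.

```python
import time


def measure_flops(operations=200_000):
    """Time a loop of floating point work and return operations per second."""
    x = 1.0
    start = time.perf_counter()
    for _ in range(operations):
        x = x * 1.0000001 + 0.0000001   # one multiply and one add per pass
    elapsed = time.perf_counter() - start
    return (2 * operations) / elapsed    # two floating point operations per pass


print(f"roughly {measure_flops():,.0f} floating point operations per second")
```

The result changes from run to run and from machine to machine, which is itself a reminder of how hard it is to quote a single speed figure.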

Benchmarking

• There is no satisfactorily agreed single method of measuring the speed of computers
  • actual system speed also depends on RAM speed, bus speeds, video performance, hard disk speeds, etc.
• Many magazines set up standard tasks simulating general office / scientific use
  • e.g. Excel / Word running under Windows Vista
  • these may provide a good comparison of systems, but may only be applicable to one type of computer (Windows PC) for a short amount of time
  • what happens when Windows Vista becomes obsolete!?


Other Architectural Aspects

Caching

• Intermediate storage - uses high-speed SRAM
• Holds recently accessed instructions/data
  • high probability that these will be re-used
• Different types of cache:
  • primary cache (Level 1) - in the processor
    • 8 KB - 32 KB
    • fastest type of cache
  • secondary (Level 2) - also now in the processor
    • 512 KB - 1 MB
    • (used to be called cache-on-a-stick - COAST)
  • disk cache (Level 3) - section of RAM
    • specified by the user (or automatically by operating system)
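The benefit of holding recently accessed data can be seen by counting hits and misses. The Python sketch below is a toy model: the `TinyCache` class is hypothetical, and the choice to evict the least recently used entry when the cache is full is an assumption of this example, not something the slides specify.

```python
from collections import OrderedDict


class TinyCache:
    """A small cache in front of a slower backing store (illustrative model)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store
        self.lines = OrderedDict()   # address -> value, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            self.hits += 1
            self.lines.move_to_end(address)      # mark as most recently used
            return self.lines[address]
        self.misses += 1
        value = self.backing_store[address]      # slow access to main memory
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)       # evict the least recently used entry
        return value


ram = {address: address * 2 for address in range(16)}
cache = TinyCache(capacity=4, backing_store=ram)
for _ in range(10):                  # a loop that keeps re-using the same 4 locations
    for address in (0, 1, 2, 3):
        cache.read(address)

print(cache.hits, cache.misses)      # 36 hits, 4 misses
```

Only the first pass through the loop goes to main memory; every later access is served from the cache.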

Pipelining

• Technique used to increase processing speed
• Processor begins to execute a second instruction before the first has been completed
• Therefore several instructions are in the pipeline
  • up to six instructions in the Pentium
• The pipeline is divided into segments
  • segments are processed concurrently
• Also used in RAM to preload the next requested memory content
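The saving from pipelining can be estimated with simple counting. The Python sketch below assumes an idealised pipeline in which every segment takes exactly one clock cycle and nothing ever stalls; the function names and the three-segment example are illustrative assumptions.

```python
def sequential_cycles(instructions, stages):
    """Cycles needed when each instruction finishes before the next one starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages):
    """Cycles needed when a new instruction enters the pipeline every cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)   # fill the pipeline once, then one per cycle


def pipeline_schedule(instructions, stage_names):
    """One row per clock cycle: which instruction (numbered from 0) is in each segment."""
    rows = []
    for cycle in range(pipelined_cycles(instructions, len(stage_names))):
        row = {}
        for position, name in enumerate(stage_names):
            number = cycle - position
            row[name] = number if 0 <= number < instructions else None
        rows.append(row)
    return rows


segments = ["fetch", "decode", "execute"]
print(sequential_cycles(6, len(segments)))   # 18
print(pipelined_cycles(6, len(segments)))    # 8
for row in pipeline_schedule(3, segments):
    print(row)
```

In the third cycle of the schedule all three segments are busy at once, each working on a different instruction.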

Parallelism

• Intel Pentium processors have a form of parallelism called single instruction multiple data (SIMD)
• The same instruction is run on multiple data at the same time
  • improves the speed at which sets of data requiring the same operation can be processed
  • most of these extensions are for floating-point ops.
• Typically used for complex co-ordinate transforms
  • found in e.g. 3-D games graphics when a picture is being updated to form the next frame in a motion
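The idea of one instruction acting on several data items can be modelled in Python. This is a conceptual sketch only: `simd_apply` is a hypothetical helper, and a Python list comprehension still processes the lanes one after another, whereas real SIMD hardware handles them in a single instruction.

```python
import operator


def simd_apply(operation, lanes_a, lanes_b):
    """Apply the same operation to every pair of lanes (models one SIMD instruction)."""
    if len(lanes_a) != len(lanes_b):
        raise ValueError("both operands must have the same number of lanes")
    return [operation(a, b) for a, b in zip(lanes_a, lanes_b)]


# Shift four x co-ordinates by the same amount, as in a simple graphics transform.
xs = [1.0, 2.0, 3.0, 4.0]
shift = [0.5, 0.5, 0.5, 0.5]
moved = simd_apply(operator.add, xs, shift)
print(moved)   # [1.5, 2.5, 3.5, 4.5]
```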

RAM

• Random Access Memory
• Volatile memory which loses its data when the power is switched off.
• Two main types:
  • SRAM: Static RAM
  • DRAM: Dynamic RAM

SRAM and DRAM

• Differences between static and dynamic RAM:
  • Dynamic RAM must be refreshed or it will lose its data
  • Static RAM only needs current to be applied - bits do not need to be refreshed.
• Both SRAM and DRAM are volatile.
• Most modern computers use some form of DRAM for the main memory.

SRAM

• Used in small amounts in computers where very fast RAM is required, such as in the cache of many CPUs.
• DRAM is much less expensive than SRAM, but is usually slower and must constantly be refreshed in order to preserve its contents.
• Types of SRAM include:
  • Asynchronous Static RAM
  • Synchronous Burst Static RAM
  • Pipeline Burst Static RAM

DRAM

• DRAM - each data bit is stored in a separate capacitor. The benefit of this is the avoidance of corruption.
• Dynamic because it requires refreshing to maintain data integrity.
• Types of DRAM include:
  • SDRAM: Synchronous Dynamic Random Access Memory
  • DDR SDRAM: Double Data Rate SDRAM
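Why a capacitor-based bit needs refreshing can be shown with a toy model. In the Python sketch below the charge levels, leak rate and refresh interval are invented numbers with no physical meaning, and `DramCell` and `bit_after` are hypothetical names; the sketch only illustrates the principle that charge leaks away unless it is topped up.

```python
FULL_CHARGE = 100      # arbitrary units (assumed for the example)
LEAK_PER_TICK = 10     # charge lost on every tick (assumed)
READ_THRESHOLD = 50    # above this the cell reads as 1 (assumed)


class DramCell:
    """One bit stored as charge on a leaky capacitor (toy model)."""

    def __init__(self, bit):
        self.charge = FULL_CHARGE if bit else 0

    def tick(self):
        self.charge = max(0, self.charge - LEAK_PER_TICK)   # charge leaks away

    def read(self):
        return 1 if self.charge > READ_THRESHOLD else 0

    def refresh(self):
        self.charge = FULL_CHARGE if self.read() else 0     # read and rewrite the bit


def bit_after(ticks, refresh_every=None):
    """Store a 1, let time pass, and report what the cell reads afterwards."""
    cell = DramCell(1)
    for tick in range(1, ticks + 1):
        cell.tick()
        if refresh_every and tick % refresh_every == 0:
            cell.refresh()
    return cell.read()


print(bit_after(10))                    # 0: the stored 1 has leaked away
print(bit_after(10, refresh_every=4))   # 1: regular refreshing preserves the bit
```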

SDRAM

• SDRAM - Synchronous Dynamic Random Access Memory.
• Dynamic because it requires refreshing to maintain data integrity.
• Synchronous because it lines itself up with the computer system bus and processor. The computer's internal clock drives the entire mechanism.
• Can accept more than one write command at a time (pipelining).

DDR SDRAM

• DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory)
• Achieves nearly twice the bandwidth of single data rate SDRAM by double pumping (transferring data on the rising and falling edges of the clock signal) without increasing the clock frequency.
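The effect of double pumping on peak bandwidth is simple arithmetic. The Python sketch below assumes a 100 MHz memory clock and a 64-bit data bus as example figures; `peak_bandwidth` is a hypothetical helper and the result is a theoretical peak, which real memory does not sustain.

```python
def peak_bandwidth(clock_hz, transfers_per_clock, bus_width_bits=64):
    """Theoretical peak transfer rate in bytes per second."""
    return clock_hz * transfers_per_clock * bus_width_bits // 8


clock = 100_000_000                      # assumed 100 MHz memory clock
single_rate = peak_bandwidth(clock, 1)   # one transfer per clock (rising edge only)
double_rate = peak_bandwidth(clock, 2)   # transfers on rising and falling edges

print(single_rate // 1_000_000, "MB/s")  # 800 MB/s
print(double_rate // 1_000_000, "MB/s")  # 1600 MB/s
```

The clock frequency is the same in both calls; only the number of transfers per clock changes.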

DDR2 and DDR3

• An evolution of DDR, with higher internal bus speeds.
  • DDR2 bus runs at twice the speed of DDR memory.
  • DDR3 at even higher speeds.
• Most modern computers use DDR, DDR2 or DDR3 packaged in DIMMs (Dual In-line Memory Modules) - electrical contacts plug directly into the main board.
• DIMMs have a 64 bit data bus (as do Pentium processors)
• SIMMs (now obsolete) have a 32 bit bus

Mainboard Layout

Intel D945GNT

• Dual-channel DDR2 667 / 533 / 400 memory support
• PCI Express x16 graphics connector
• Two PCI Express x1 connectors
• Four Serial ATA ports (3.0 Gb/s)
• Integrated Intel® PRO 10/100 Network Connection
• Intel® High Definition Audio with 5.1 Surround Sound
• Eight Hi-Speed USB 2.0 ports
• Intel® Precision Cooling Technology

Mainboard Layout

Key to the labelled board diagram:

A   Auxiliary fan connector (optional)
B   Speaker
C   PCI Express x1 bus add-in card connectors [2]
D   Audio codec
E   Front panel audio connector
F   Ethernet device
G   PCI Conventional bus add-in card connectors [2]
H   PCI Express x16 bus add-in card connector
I   Back panel connectors
J   +12V power connector (ATX12V)
K   Rear chassis fan connector
L   LGA775 processor socket
M   Intel 82945G GMCH
N   Processor fan connector
O   DIMM Channel A sockets [2]
P   DIMM Channel B sockets [2]
Q   SCSI LED connector (optional)
R   Legacy I/O controller
S   Power connector
T   Diskette drive connector
U   Parallel ATA IDE connector
V   Battery
W   Front chassis fan connector
X   BIOS Setup configuration jumper block
Y   Serial ATA connectors [4]
Z   Auxiliary front panel power LED connector
AA  Front panel connector
BB  Front panel USB connectors [2]
CC  Chassis intrusion connector
DD  Intel 82801G I/O Controller Hub (ICH7)
EE  SPI flash device
FF  IEEE-1394a controller (optional)
GG  Front panel IEEE-1394a connectors (optional) [2]
HH  PCI Conventional bus add-in card connectors

Motherboard in Situ

Cooling can be a problem...

Summary

• How it works!
  • the fetch / execute cycle in detail
• Measuring speed
  • system clock, GHz, MIPS and FLOPS
• Advanced concepts
  • cache, pipelining, parallelism
• Memory issues
  • dynamic and static RAM, SIMMs and DIMMs
• Motherboards
  • component layout