Lecture 2: Fundamentals of Computer Design

Kai Bu
kaibu@zju.edu.cn
http://list.zju.edu.cn/kaibu/comparch


Chapter 1

• Transition from single processor to multiple processors

• Quantitative approach: empirical observations of programs, experiments, and simulation as its tools

Outline

• Classes of computers
• Parallelism
• Instruction Set Architecture
• Trends
• Dependability
• Performance Measurement


5 Classes of Computers

PMD: Personal Mobile Device

• Wireless devices with multimedia user interfaces
• Cell phones, tablet computers, etc.
• A few hundred dollars

PMD Characteristics

• Cost effectiveness
  less expensive packaging; absence of a fan for cooling
• Responsiveness & Predictability
  real-time performance: a maximum execution time for each app segment;
  soft real-time: an average time constraint that tolerates occasionally missing the time constraint on an event
• Memory efficiency
  optimize code size
• Energy efficiency
  battery power, heat dissipation

Desktop Computing

• Largest market share
• Low-end netbooks: $x00 (hundreds of dollars)
• …
• High-end workstations: $x000 (thousands of dollars)

Desktop Characteristics

• Price-performance
  the combination of performance (compute performance and graphics performance) and price
• This is what matters most to customers, and hence to computer designers


Servers

• Provide large-scale and reliable file and computing services (to desktops)

• Constitute the backbone of large-scale enterprise computing

Server Characteristics

• Availability
  against server failure
• Scalability
  scale up computing capacity, memory, storage, and I/O bandwidth in response to increasing demand
• Efficient throughput
  handle more requests per unit time


Why Server Availability

Clusters/WSCs

• Warehouse-Scale Computers
  collections of desktop computers or servers connected by local area networks to act as a single larger computer
• Characteristics
  price-performance, power, availability


Embedded Computers

hide everywhere

Embedded vs Non-embedded

• Dividing line
  the ability to run third-party software
• Embedded computers' primary goal
  meet the performance need at a minimum price, rather than achieve higher performance at a higher price


Application Parallelism

• DLP: Data-Level Parallelism
  many data items are operated on at the same time
• TLP: Task-Level Parallelism
  tasks of work are created that can operate independently and largely in parallel

Hardware Parallelism

• Computer hardware exploits the two kinds of application parallelism in four major ways:
  Instruction-Level Parallelism
  Vector Architectures and GPUs
  Thread-Level Parallelism
  Request-Level Parallelism

• Instruction-Level Parallelism
  exploits data-level parallelism at modest levels with pipelining and at medium levels with speculative execution
• Vector Architectures and GPUs (Graphics Processing Units)
  exploit data-level parallelism by applying a single instruction to a collection of data in parallel
• Thread-Level Parallelism
  exploits either DLP or TLP in a tightly coupled hardware model that allows for interaction among parallel threads
• Request-Level Parallelism
  exploits parallelism among largely decoupled tasks specified by the programmer or the OS

Classes of Parallel Architectures

• Proposed by Michael Flynn
• Classified according to the parallelism in the instruction and data streams called for by the instructions at the most constrained component of the multiprocessor
• Four categories: SISD, SIMD, MISD, MIMD


SISD

• Single instruction stream, single data stream – uniprocessor

• Can exploit instruction-level parallelism

SIMD

• Single instruction stream, multiple data streams
• The same instruction is executed by multiple processors using different data streams.
• Exploits data-level parallelism
• Each processor has its own data memory, whereas there is a single instruction memory and control processor.


MISD

• Multiple instruction streams, single data stream

• No commercial multiprocessor of this type yet


MIMD

• Multiple instruction streams, multiple data streams

• Each processor fetches its own instructions and operates on its own data.

• Exploits task-level parallelism


Instruction Set Architecture

• ISA: the actual programmer-visible instruction set
• The boundary between software and hardware
• 7 major dimensions

ISA: Class

• Most are general-purpose register architectures with operands of either registers or memory locations
• Two popular versions
  register-memory ISA (e.g., 80x86): many instructions can access memory
  load-store ISA (e.g., ARM, MIPS): only load or store instructions can access memory

ISA: Memory Addressing

• Byte addressing
• Aligned address
  an object of width s bytes at address A is aligned if A mod s = 0
• Each misaligned object requires two memory accesses
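The alignment rule can be checked mechanically. The sketch below is illustrative only: `is_aligned` and `accesses_needed` are hypothetical helper names, and the second function assumes a simplified memory that is read in aligned chunks of exactly `size` bytes.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if an object of `size` bytes at `address` is aligned (A mod s == 0)."""
    return address % size == 0


def accesses_needed(address: int, size: int) -> int:
    """Count the aligned `size`-byte chunks touched by an object of `size` bytes.

    Assumption (not from the slides): memory is accessed in aligned chunks of
    exactly `size` bytes, so a misaligned object straddles two chunks.
    """
    first_chunk = address // size
    last_chunk = (address + size - 1) // size
    return last_chunk - first_chunk + 1


if __name__ == "__main__":
    # Try a 4-byte object at several byte addresses.
    for addr in (0, 4, 6, 8):
        print(addr, is_aligned(addr, 4), accesses_needed(addr, 4))
```

Under this model an aligned 4-byte object needs one access, while the same object at address 6 spans two chunks and needs two.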


ISA: Addressing Modes

• Specify the address of a memory object

• Register, Immediate, Displacement

ISA: Types and Sizes of Operands

Type                                           Size in bits
ASCII character                                8
Unicode character, half word                   16
Integer, word                                  32
Double word, long integer                      64
IEEE 754 floating point, single precision      32
IEEE 754 floating point, double precision      64
Floating point, extended double precision      80

MIPS64 Operations

• Data transfer
• Arithmetic/logical
• Control
• Floating point

ISA: Control Flow Instructions

• Types
  conditional branches
  unconditional jumps
  procedure calls
  returns
• Branch address: add an address field to the PC (program counter)

ISA: Encoding an ISA

• Fixed length: ARM, MIPS – 32 bits
• Variable length: 80x86 – 1 to 18 bytes

MIPS instruction formats (http://en.wikipedia.org/wiki/MIPS_architecture) all start with a 6-bit opcode.
R-type: three registers, a shift amount field, and a function field;
I-type: two registers and a 16-bit immediate value;
J-type: a 26-bit jump target.
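To make the three formats concrete, here is a minimal sketch that packs the fields into a 32-bit word. The function names are hypothetical, and the 5-bit widths for the register and shift-amount fields and the 6-bit function field follow the standard MIPS32 layout rather than anything stated on the slide.

```python
def encode_r_type(rs: int, rt: int, rd: int, shamt: int, funct: int, opcode: int = 0) -> int:
    """Pack an R-type word: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i_type(opcode: int, rs: int, rt: int, immediate: int) -> int:
    """Pack an I-type word: opcode(6) rs(5) rt(5) immediate(16)."""
    # Mask the immediate so negative values become 16-bit two's complement.
    return (opcode << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j_type(opcode: int, target: int) -> int:
    """Pack a J-type word: opcode(6) target(26)."""
    return (opcode << 26) | (target & 0x3FFFFFF)


if __name__ == "__main__":
    # add $t0, $t1, $t2  ->  rd = 8, rs = 9, rt = 10, funct = 0x20
    word = encode_r_type(rs=9, rt=10, rd=8, shamt=0, funct=0x20)
    print(f"{word:#010x}")
```

Every format produces exactly one 32-bit word, which is what makes fixed-length decoding simple compared with a variable-length encoding.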

Computer Architecture

• ISA
  the actual programmer-visible instruction set; the boundary between software and hardware
• Organization
  high-level aspects of computer design: memory system, memory interconnect, design of the internal processor or CPU
• Hardware
  computer specifics: logic design, packaging technology


Five Critical Implementation Technologies

• Integrated circuit logic technology
• Semiconductor DRAM
• Semiconductor flash
• Magnetic disk technology
• Network technology

Integrated circuit logic technology

• Moore's Law: the transistor count on a chip grows by about 40% to 55% per year, that is, it doubles every 18 to 24 months
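The link between an annual growth rate and a doubling time is plain compound growth. The sketch below, with a hypothetical helper name, shows that 40% and 55% per year correspond to doubling times of roughly 25 and 19 months, close to the rounded 18 to 24 months quoted.

```python
import math


def doubling_time_months(annual_growth_rate: float) -> float:
    """Months needed to double at a compound annual growth rate (0.40 means 40%)."""
    years = math.log(2) / math.log(1 + annual_growth_rate)
    return 12 * years


if __name__ == "__main__":
    for rate in (0.40, 0.55):
        months = doubling_time_months(rate)
        print(f"{rate:.0%} per year -> doubles about every {months:.1f} months")
```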


Semiconductor DRAM

• Capacity per DRAM chip doubles roughly every 2 or 3 years

Semiconductor Flash

• Electrically erasable programmable read-only memory (EEPROM)

• Capacity per Flash chip doubles roughly every two years

• In 2011, 15 to 20 times cheaper per bit than DRAM


Magnetic Disk Technology

• Since 2004, density doubles every three years

• 15 to 20 times cheaper per bit than Flash

• 300 to 500 times cheaper per bit than DRAM

• For server and warehouse scale storage

Network Technology

• Switches
• Transmission systems

Performance Trends

• Bandwidth/Throughput
  the total amount of work done in a given time
• Latency/Response Time
  the time between the start and the completion of an event


Bandwidth over Latency

Trends in Power and Energy

• Power = energy per unit time
  1 watt = 1 joule per second
  energy to execute a workload = avg power x execution time
• Three primary concerns
  the maximum power for a processor
  sustained power consumption
  energy and energy efficiency

Trends in Power and Energy

• Sustained power consumption
• Metric: TDP (Thermal Design Power), which determines the cooling requirement
• Heat management
  1. reduce the clock rate, and hence power, as the temperature approaches the junction temperature limit;
  2. if step 1 does not work, power down the chip.

Trends in Power and Energy

• Energy and energy efficiency
• Energy to execute a workload = avg power x execution time
• Example
  Processor A has 20% higher average power consumption than processor B, but A executes the task in 70% of the time needed by B. Which is more energy efficient, A or B?
• Answer
  EnergyConsumptionA = 1.2 x 0.7 x EnergyConsumptionB = 0.84 x EnergyConsumptionB, so A is more energy efficient.
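The same comparison can be written as a one-line calculation. This is a small sketch with a hypothetical function name; the ratios 1.2 and 0.7 are the ones from the example.

```python
def relative_energy(power_ratio: float, time_ratio: float) -> float:
    """Energy of A relative to B, given A's average power and execution time relative to B."""
    # energy = average power x execution time, so the ratios simply multiply
    return power_ratio * time_ratio


if __name__ == "__main__":
    ratio = relative_energy(power_ratio=1.2, time_ratio=0.7)
    verdict = "A" if ratio < 1 else "B"
    print(f"A uses {ratio:.2f}x the energy of B, so {verdict} is more energy efficient")
```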

Trends in Power and Energy

• The primary energy consumption within a microprocessor is for switching transistors: dynamic energy
  logic transition: 0->1->0 or 1->0->1
• The energy of a single transition
• The power required per transistor
• For a fixed task, slowing the clock rate (frequency) reduces power, but not energy.
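For reference, these quantities are usually written as follows in the standard first-order CMOS model, as presented in Hennessy and Patterson's Computer Architecture: A Quantitative Approach. The symbols are the conventional ones and are an assumption here, not taken from the slide.

```latex
\begin{align*}
E_{\text{dynamic}} &\propto C_{\text{load}} \times V^{2}
  && \text{(full pulse: } 0 \to 1 \to 0 \text{ or } 1 \to 0 \to 1\text{)} \\
E_{\text{dynamic}} &\propto \tfrac{1}{2} \times C_{\text{load}} \times V^{2}
  && \text{(single transition)} \\
P_{\text{dynamic}} &\propto \tfrac{1}{2} \times C_{\text{load}} \times V^{2} \times f_{\text{switched}}
  && \text{(power per transistor)}
\end{align*}
```

Because frequency appears only in the power relation, lowering the clock rate reduces power while leaving the energy for a fixed task unchanged.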

Trends in Power and Energy

• Example
  Some microprocessors have adjustable voltage. Suppose a 15% reduction in voltage leads to a 15% reduction in frequency. What is the impact on dynamic energy and dynamic power?
• Answer
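One way to work the answer, assuming the first-order model in which dynamic energy scales with voltage squared and dynamic power scales with voltage squared times frequency, with the capacitive load unchanged. The function name is hypothetical.

```python
from typing import Tuple


def dynamic_scaling(voltage_ratio: float, frequency_ratio: float) -> Tuple[float, float]:
    """Return (energy_ratio, power_ratio) after scaling voltage and frequency.

    Assumes dynamic energy ~ C * V^2 and dynamic power ~ C * V^2 * f,
    with the capacitive load C held constant.
    """
    energy_ratio = voltage_ratio ** 2
    power_ratio = energy_ratio * frequency_ratio
    return energy_ratio, power_ratio


if __name__ == "__main__":
    # A 15% reduction means the new value is 0.85 of the old one.
    energy, power = dynamic_scaling(voltage_ratio=0.85, frequency_ratio=0.85)
    print(f"dynamic energy: {energy:.2f} of original")
    print(f"dynamic power:  {power:.2f} of original")
```

Under these assumptions, dynamic energy falls to about 72% of the original and dynamic power to about 61%.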

Trends in Power and Energy

• Challenges (potential research topics)
  distributing the power
  removing the heat
  preventing hot spots

Trends in Power and Energy

• Energy-efficiency improvement techniques
  1. Do nothing well: turn off the clock of inactive modules
  2. DVFS (dynamic voltage-frequency scaling): scale down clock frequency and voltage during periods of low activity
  3. Design for the typical case: PMDs and laptops are often idle, so memory and storage offer low-power modes to save energy
  4. Overclocking: the chip runs at a higher clock rate for a short time, until the temperature rises

Trends in Cost

• Cost of an Integrated Circuit
  a wafer is tested and chopped into dies, which are then packaged
  yield: the percentage of manufactured devices that survive the testing procedure


Intel Core i7 Die

Trends in Cost

• Example

Trends in Cost

• Cost of an Integrated Circuit
• N: process-complexity factor, a measure of manufacturing difficulty
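The role of N is easiest to see in a calculation. The sketch below assumes the usual textbook cost model (dies per wafer with an edge-loss correction, and a die yield of wafer yield / (1 + defect density x die area)^N); those formulas, the function names, and every input value in the usage example are assumptions for illustration, not figures recovered from the slides.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Wafer area over die area, minus a correction for partial dies along the edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def die_yield(defects_per_cm2: float, die_area_cm2: float, n: float,
              wafer_yield: float = 1.0) -> float:
    """Fraction of good dies; n is the process-complexity factor."""
    return wafer_yield / (1 + defects_per_cm2 * die_area_cm2) ** n


def die_cost(wafer_cost: float, wafer_diameter_cm: float, die_area_cm2: float,
             defects_per_cm2: float, n: float) -> float:
    """Wafer cost divided by the expected number of good dies."""
    good_dies = (dies_per_wafer(wafer_diameter_cm, die_area_cm2)
                 * die_yield(defects_per_cm2, die_area_cm2, n))
    return wafer_cost / good_dies


if __name__ == "__main__":
    # Hypothetical inputs: 30 cm wafer, 1.5 cm x 1.5 cm die,
    # 0.031 defects per cm^2, N = 13.5, wafer cost of 5000 (arbitrary units).
    print(f"dies per wafer: {dies_per_wafer(30, 2.25):.0f}")
    print(f"die yield:      {die_yield(0.031, 2.25, 13.5):.2f}")
    print(f"cost per die:   {die_cost(5000, 30, 2.25, 0.031, 13.5):.2f}")
```

A larger N or a larger die area lowers the yield, so the cost per good die grows faster than the die area itself.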


Dependability

• SLA: service level agreements
• System states: up or down
• Service states
  service accomplishment
  service interruption
• Transitions between the service states
  failure (accomplishment to interruption)
  restoration (interruption to accomplishment)

Dependability

• Two measures of dependability
  Module reliability
  Module availability

• Module reliability
  continuous service accomplishment from a reference initial instant
  MTTF: mean time to failure
  MTTR: mean time to repair
  MTBF: mean time between failures
  MTBF = MTTF + MTTR
  FIT: failures in time, i.e., failures per billion hours
  MTTF of 1,000,000 hours = 10^9/10^6 = 1000 FIT

• Module availability
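These measures are simple to compute. In the sketch below the function names are hypothetical, and the availability formula MTTF / (MTTF + MTTR) is the standard definition, assumed here rather than recovered from the slide; the 24-hour repair time is an illustrative value.

```python
def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """Mean time between failures: MTBF = MTTF + MTTR."""
    return mttf_hours + mttr_hours


def fit(mttf_hours: float) -> float:
    """Failures in time: failures per billion (10^9) hours of operation."""
    return 1e9 / mttf_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the module delivers service (assumed standard definition)."""
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    mttf_hours = 1_000_000  # value from the FIT example
    mttr_hours = 24         # hypothetical repair time
    print(f"FIT:          {fit(mttf_hours):.0f}")
    print(f"MTBF (hours): {mtbf(mttf_hours, mttr_hours):.0f}")
    print(f"availability: {availability(mttf_hours, mttr_hours):.6f}")
```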

Dependability

• Example

• Answer


Measuring Performance

• Execution time
  the time between the start and the completion of an event
• Throughput
  the total amount of work done in a given time

Measuring Performance

• Computer X and Computer Y
• X is n times faster than Y
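By the usual convention, "X is n times faster than Y" means n = execution time of Y / execution time of X, which equals performance of X / performance of Y because performance is the reciprocal of execution time. A minimal sketch follows; the function name and the timings are hypothetical.

```python
def times_faster(exec_time_x: float, exec_time_y: float) -> float:
    """Return n such that X is n times faster than Y on the same workload."""
    return exec_time_y / exec_time_x


if __name__ == "__main__":
    # Hypothetical timings: X finishes in 10 s, Y in 15 s.
    n = times_faster(exec_time_x=10.0, exec_time_y=15.0)
    print(f"X is {n:.1f} times faster than Y")
```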

Quantitative Principles

• Parallelism
• Locality
  temporal locality: recently accessed items are likely to be accessed in the near future;
  spatial locality: items whose addresses are near one another tend to be referenced close together in time

Quantitative Principles

• Amdahl's Law: two factors
  1. Fraction_enhanced: the fraction of the original execution time that can use the enhancement, e.g., 20/60 if 20 seconds of a 60-second program can be enhanced
  2. Speedup_enhanced: the improvement gained in the enhanced portion, e.g., 5/2 if the enhanced portion takes 2 seconds where it originally took 5 seconds
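The two factors combine in the standard form of Amdahl's Law, Speedup_overall = 1 / ((1 - Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced). A short sketch follows; the function name is hypothetical, and pairing the two example values 20/60 and 5/2 in a single calculation is an illustration, not something the slide does.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is enhanced."""
    unenhanced_share = 1.0 - fraction_enhanced
    enhanced_share = fraction_enhanced / speedup_enhanced
    return 1.0 / (unenhanced_share + enhanced_share)


if __name__ == "__main__":
    overall = amdahl_speedup(fraction_enhanced=20 / 60, speedup_enhanced=5 / 2)
    print(f"overall speedup: {overall:.2f}")
    # Even an enormous local speedup is capped by the unenhanced share.
    print(f"upper bound:     {1 / (1 - 20 / 60):.2f}")
```

The unenhanced share sets a hard ceiling: with one third of the time enhanced, the overall speedup can never reach 1.5.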

Quantitative Principles

• Example

Quantitative Principles

• The Processor Performance Equation
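In its usual form the equation reads CPU time = instruction count x cycles per instruction (CPI) x clock cycle time, or equivalently instruction count x CPI / clock rate. The sketch below uses a hypothetical function name and made-up workload numbers.

```python
def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time (= 1 / clock rate)."""
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical workload: 1 billion instructions, CPI of 1.5, 2 GHz clock.
    seconds = cpu_time_seconds(instruction_count=1e9, cpi=1.5, clock_rate_hz=2e9)
    print(f"CPU time: {seconds:.2f} s")
```

The three terms are tied to different layers of the design (instruction count to the ISA and compiler, CPI to the organization and ISA, clock rate to the hardware technology and organization), so a change in one often affects the others.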

Quantitative Principles

• Example


?


Reading

• Chapter 1.8, 1.10 – 1.13