CSCE 212 Introduction to Computer Architecture

Post on 13-Jan-2016

Instructor: Jason D. Bakos

What is Computer Architecture?

• The design of computer systems to…
  – Improve “performance”:
    • Run programs faster
    • Use less power, last longer on battery power
    • Generate less heat, or distribute heat more uniformly
    • Improve video, 3D rendering, encoding, or decoding frame rate
    • Handle more secure encryption standards with reasonable latency
    • Achieve routing or network intrusion detection at higher line speeds
    • Be more scalable
    • Be less expensive (e.g. higher integration)
  – This can be achieved via:
    • Software (better OS, more optimized application code), or
    • Hardware (processor)

• Designing any complex system requires abstraction

Abstraction

• Abstraction is used to manage the complexity of a design
  – Hide details that are not important

Layers of abstraction, from highest to lowest (examples in parentheses):

• Application Software (programs)
• Compiler
• Operating Systems (device drivers)
• Architecture (instructions, registers)
• Micro-architecture (datapaths, controllers)
• Logic (adders, memories)
• Digital circuits (AND gates, NOT gates)
• Analog circuits (amplifiers, filters)
• Devices (transistors, diodes)
• Physics (electrons)

Course numbers shown beside the layers on the slide: 145/146/240/245, 311, 212, 211, 211/611, ELCT 371, 330

Domains and Levels of Modeling

“Y-chart” from Gajski & Kuhn: three domains (Functional, Structural, Geometric), each modeled at levels that range from a high level of abstraction to a low level of abstraction.

• Functional domain (high to low):
  – Algorithm (behavioral)
  – Register-Transfer Language
  – Boolean Equation
  – Differential Equation
• Structural domain (high to low):
  – Processor-Memory-Switch
  – Register-Transfer
  – Gate
  – Transistor
• Geometric domain (high to low):
  – Floor Plan
  – Standard Cells
  – Sticks
  – Polygons

Structure

MIPS Microarchitecture

RTL (datapath), fetch instruction:

1. Address <= PC
2. MemRead
3. PC <= PC + 1
4. IR <= MemData

Control, fetch instruction:

1. IorD = 0
2. MemRead = 1
3. PCEn = 1, ALUSrcA = 0, ALUSrcB = 01, ALUOp = ADD, PCSource = 01
4. IRWrite = 1
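The four RTL fetch steps above can be sketched in software. This is only an illustrative model, not the course's implementation: the `MachineState` class, its field names, and the sample memory words are hypothetical, and memory is assumed to be word-addressed so that `PC + 1` selects the next instruction, as in the RTL.

```python
from dataclasses import dataclass, field


@dataclass
class MachineState:
    """Hypothetical container for the registers named in the RTL."""
    pc: int = 0
    ir: int = 0
    address: int = 0
    mem_data: int = 0
    memory: list = field(default_factory=list)  # assumed word-addressed


def fetch(state: MachineState) -> MachineState:
    """Apply the four fetch steps in order."""
    state.address = state.pc                       # 1. Address <= PC
    state.mem_data = state.memory[state.address]   # 2. MemRead
    state.pc = state.pc + 1                        # 3. PC <= PC + 1
    state.ir = state.mem_data                      # 4. IR <= MemData
    return state


# Arbitrary placeholder instruction words, chosen only for the demo.
state = MachineState(memory=[0x20080005, 0x20090007, 0x01095020])
fetch(state)
print(hex(state.ir), state.pc)
```

After one call, `ir` holds the first memory word and `pc` points at the next one. In hardware the control signals listed above select these transfers; here plain assignments stand in for them.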

Structure

Logic Synthesis

• Behavior:
  – C = A + B
  – Assume A is 2 bits, B is 2 bits, C is 3 bits

A        B        C
00 (0)   00 (0)   000 (0)
00 (0)   01 (1)   001 (1)
00 (0)   10 (2)   010 (2)
00 (0)   11 (3)   011 (3)
01 (1)   00 (0)   001 (1)
01 (1)   01 (1)   010 (2)
01 (1)   10 (2)   011 (3)
01 (1)   11 (3)   100 (4)
10 (2)   00 (0)   010 (2)
10 (2)   01 (1)   011 (3)
10 (2)   10 (2)   100 (4)
10 (2)   11 (3)   101 (5)
11 (3)   00 (0)   011 (3)
11 (3)   01 (1)   100 (4)
11 (3)   10 (2)   101 (5)
11 (3)   11 (3)   110 (6)

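With bits $A = A_1A_0$, $B = B_1B_0$, and $C = C_2C_1C_0$, the truth table gives a sum of products for $C_2$, which is then factored step by step:

$$
\begin{aligned}
C_2 &= \overline{A_1}A_0B_1B_0 + A_1\overline{A_0}B_1\overline{B_0} + A_1\overline{A_0}B_1B_0 \\
    &\quad + A_1A_0\overline{B_1}B_0 + A_1A_0B_1\overline{B_0} + A_1A_0B_1B_0 \\
C_2 &= B_1B_0(\overline{A_1}A_0 + A_1\overline{A_0} + A_1A_0) + A_1B_1\overline{B_0}(\overline{A_0} + A_0) + A_1A_0\overline{B_1}B_0 \\
C_2 &= B_1B_0(\overline{A_1}A_0 + A_1(\overline{A_0} + A_0)) + A_1B_1\overline{B_0} + A_1A_0\overline{B_1}B_0 \\
C_2 &= B_1B_0(\overline{A_1}A_0 + A_1) + A_1(B_1\overline{B_0} + A_0\overline{B_1}B_0)
\end{aligned}
$$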
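A factored expression for the high output bit can be checked exhaustively against the behavior C = A + B. The sketch below assumes the bit naming A = A1A0, B = B1B0, C = C2C1C0 and tests the form C2 = B1·B0·(A1'·A0 + A1) + A1·(B1·B0' + A0·B1'·B0), where ' denotes complement. The function names are hypothetical and chosen only for this illustration.

```python
from itertools import product


def c2_reference(a: int, b: int) -> int:
    """Bit 2 of the 3-bit sum, taken directly from the behavior C = A + B."""
    return ((a + b) >> 2) & 1


def c2_factored(a1: int, a0: int, b1: int, b0: int) -> int:
    """C2 = B1*B0*(A1'*A0 + A1) + A1*(B1*B0' + A0*B1'*B0)."""
    na1, nb1, nb0 = 1 - a1, 1 - b1, 1 - b0
    left = b1 & b0 & ((na1 & a0) | a1)
    right = a1 & ((b1 & nb0) | (a0 & nb1 & b0))
    return left | right


def matches_truth_table() -> bool:
    """Compare the factored form with the reference on all 16 input rows."""
    for a1, a0, b1, b0 in product((0, 1), repeat=4):
        a, b = 2 * a1 + a0, 2 * b1 + b0
        if c2_factored(a1, a0, b1, b0) != c2_reference(a, b):
            return False
    return True


print(matches_truth_table())
```

With only four input bits, checking all 16 rows is cheap, which is why exhaustive comparison is a reasonable way to validate a hand simplification of this size.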

Logic Gates

[Figure: gate symbols for inv, NAND2, NAND3, and NOR2, each labeled with the equation for its output Y]
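As a rough software model of the gates named on this slide, each one can be written as a function on 0/1 values. The definitions are the standard Boolean functions for these gates; the function names simply mirror the slide labels and are not taken from any library.

```python
def inv(a: int) -> int:
    """Inverter: Y = NOT A."""
    return 1 - a


def nand2(a: int, b: int) -> int:
    """2-input NAND: Y = NOT (A AND B)."""
    return 1 - (a & b)


def nand3(a: int, b: int, c: int) -> int:
    """3-input NAND: Y = NOT (A AND B AND C)."""
    return 1 - (a & b & c)


def nor2(a: int, b: int) -> int:
    """2-input NOR: Y = NOT (A OR B)."""
    return 1 - (a | b)


# Print the two-input truth tables side by side.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "NAND2 =", nand2(a, b), "NOR2 =", nor2(a, b))
```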

Latches

Positive edge-sensitive latch

Elements

Semiconductors

• Silicon is a group IV element (4 valence electrons; shell capacities: 2, 8, 18, 32, …)
  – Forms covalent bonds with four neighboring atoms (3D cubic crystal lattice)
  – Si is a poor conductor, but its conduction characteristics may be altered
  – Add impurities/dopants (each replaces a silicon atom in the lattice):
    • Makes a better conductor
    • Group V element (phosphorus/arsenic) => 5 valence electrons
      – Leaves an electron free => n-type semiconductor (electrons, negative carriers)
    • Group III element (boron) => 3 valence electrons
      – Borrows an electron from a neighbor => p-type semiconductor (holes, positive carriers)

[Figure: P-N junction under forward bias and under reverse bias, showing the positive (+) and negative (-) carriers]

MOSFETs

[Figure: cross-sections of an NMOS/NFET and a PMOS/PFET, with current flowing through each channel]

• NMOS/NFET: body/bulk at GROUND
• PMOS/PFET: body/bulk at HIGH
• Applied voltage labels: positive voltage (Vdd); negative voltage relative to body (GND)
• S/D to body is reverse-biased
• Channel: a shorter length (distance for electrons) gives a faster transistor

• Metal-poly-Oxide-Semiconductor structures are built onto the substrate
  – Diffusion: inject dopants into the substrate
  – Oxidation: form a layer of SiO2 (glass)
  – Deposition and etching: add aluminum/copper wires

IC Fabrication

• Chips are fabricated using a set of masks
  – Photolithography

• Basic steps
  – oxidize
  – apply photoresist
  – remove photoresist with mask
  – HF acid eats oxide but not photoresist
  – piranha acid eats photoresist
  – ion implantation (diffusion, wells)
  – vapor deposition (poly)
  – plasma etching (metal)

Layout

3-input NAND

Cell Library (Snap Together)

Layout

Layout

Synthesized and P&R’ed MIPS Architecture

IC Fabrication

8” Wafer

• 8 inch (200 mm) wafer containing Pentium 4 processors
  – 165 dies, die area = 250 mm², 55 million transistors, 0.18 µm process

Another 8” Wafer

Feature Size

• Shrink minimum feature size…
  – Smaller L decreases carrier time and increases current
  – Therefore, W may also be reduced for fixed current
  – Cg, Cs, and Cd are reduced
  – Transistor switches faster (~linear relationship)

Minimum Feature Size

Year  Processor       Speed            Transistors    Process
1982  i286            6 - 25 MHz       ~134,000       1.5 µm
1986  i386            16 - 40 MHz      ~270,000       1 µm
1989  i486            16 - 133 MHz     ~1 million     0.8 µm
1993  Pentium         60 - 300 MHz     ~3 million     0.6 µm
1995  Pentium Pro     150 - 200 MHz    ~4 million     0.5 µm
1997  Pentium II      233 - 450 MHz    ~5 million     0.35 µm
1999  Pentium III     450 - 1400 MHz   ~10 million    0.25 µm
2000  Pentium 4       1.3 - 3.8 GHz    ~50 million    0.18 µm
2005  Pentium D       2 cores/package  ~200 million   0.09 µm
2006  Core 2          2 cores/die      ~300 million   0.065 µm
2008  Core i7         4 cores/die      ~800 million   0.040 µm
2010  “Sandy Bridge”  8 cores/die      ??             0.032 µm
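One way to read the table is to ask how quickly the transistor count doubles. The sketch below is an illustrative calculation only, assuming steady exponential growth between two rows; it uses the approximate i286 (1982, ~134,000) and Core i7 (2008, ~800 million) figures from the table, and the function name is hypothetical.

```python
import math


def doubling_time_years(year_start: int, count_start: float,
                        year_end: int, count_end: float) -> float:
    """Average years per doubling, assuming steady exponential growth."""
    doublings = math.log2(count_end / count_start)
    return (year_end - year_start) / doublings


# Approximate endpoints taken from the table: i286 (1982) and Core i7 (2008).
years = doubling_time_years(1982, 134_000, 2008, 800_000_000)
print(f"{years:.2f} years per doubling")
```

With these endpoints the result comes out at roughly two years per doubling. Because the counts in the table are rounded, the figure is only a ballpark estimate.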

Clock Speed

• Clock speed is affected by:
  – Fabrication technology
  – Architecture: how much work is performed in a single cycle

• Execution time =
  – instructions per program * cycles per instruction * seconds per cycle

• Now we must add to the product:
  – (number of program threads / number of processor cores)
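The execution-time product above can be written as a small function. This is a literal reading of the slide's formula, including the threads/cores factor, and not a validated performance model; the function name, parameter names, and sample numbers are hypothetical.

```python
def execution_time(instructions: float, cycles_per_instruction: float,
                   clock_hz: float, threads: int = 1, cores: int = 1) -> float:
    """instructions * CPI * seconds per cycle * (threads / cores)."""
    seconds_per_cycle = 1.0 / clock_hz
    return (instructions * cycles_per_instruction * seconds_per_cycle
            * (threads / cores))


# Made-up workload: 2 billion instructions, CPI of 1.5, 3 GHz clock.
single = execution_time(2e9, 1.5, 3e9)
# Same per-thread workload, 4 threads sharing 2 cores.
shared = execution_time(2e9, 1.5, 3e9, threads=4, cores=2)
print(single, shared)
```

In this example the single-thread case takes about one second, and four threads on two cores take about twice as long because each core runs two threads' worth of work.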

Integration Density

Core 2 Duo (2007) has ~300M transistors

Integration Density

Microprocessor Technology

• Advances in fabrication (lithography, photoresist, metal layers)

• …faster transistor switching (faster processor)

• …smaller transistors/wires

• …higher integration density

• …more “real estate”

• …architectural improvements!

Microarchitectural Parallelism

• Parallelism => perform multiple operations simultaneously
  – Instruction-level parallelism
    • Execute multiple instructions at the same time
    • Multiple issue
    • Out-of-order execution
    • Speculation
    • Branch prediction
  – Thread-level parallelism (hyper-threading)
    • Execute multiple threads at the same time on one CPU
    • Threads share memory space and pool of functional units
  – Chip multiprocessing
    • Execute multiple processes/threads at the same time on multiple CPUs
    • Cores are symmetrical and completely independent but share a common level-2 cache