02-Computer System Technology


  • 8/13/2019 02-Computer System Technology


    Computer System Technology

    1. From Components to Applications

    2. Computer Systems and Their Parts

    3. Generations of Progress

    4. Processor and Memory Technologies

    5. Peripherals, I/O, and Communications

    6. Software Systems and Applications

    Levels of Representation (abstractions)

    High-Level Language Program (e.g., C)

        temp = v[k];
        v[k] = v[k+1];
        v[k+1] = temp;

    Compiler

    Assembly Language Program (e.g., MIPS)

        lw $t0, 0($2)
        lw $t1, 4($2)
        sw $t1, 0($2)
        sw $t0, 4($2)

    Assembler

    Machine Language Program (MIPS)

        0000 1001 1100 0110 1010 1111 0101 1000
        1010 1111 0101 1000 0000 1001 1100 0110
        1100 0110 1010 1111 0101 1000 0000 1001
        0101 1000 0000 1001 1100 0110 1010 1111

    Machine Interpretation

    Hardware Architecture Description (e.g., block diagrams): Register File, ALU

    Architecture Implementation

    Logic Circuit Description (Circuit Schematic Diagrams)


    Computer Architecture, Background and Motivation

    Figure 3.1 Subfields or views in computer system engineering.

    [Diagram labels: high-level view (application domains, software); application designer, system designer, computer designer, logic designer, circuit designer; low-level view (electronic components, hardware); computer architecture; computer organization.]

    From Components to Applications


    Figure 3.2 Like a building architect, whose place at the engineering/arts and goals/means interfaces is seen in this diagram, a computer architect reconciles many conflicting or competing demands.

    [Diagram labels: Architect; Interface (goals/means); Interface (arts/engineering). Client's taste: mood, style, . . . Client's requirements: function, cost, . . . The world of arts: aesthetics, trends, . . . Construction technology: material, codes, . . .]

    What Is (Computer) Architecture?


    Figure 3.3 The space of computer systems, with what we normally mean by the word "computer" highlighted.

    Computer
        Analog | Digital
            Fixed-function | Stored-program
                Electronic | Nonelectronic
                    General-purpose | Special-purpose
                        Number cruncher | Data manipulator

    Computer Systems and Their Parts


    Figure 3.4 Classifying computers by computational power and price range.

    Class          Price range
    Super          $Millions
    Mainframe      $100s Ks
    Server         $10s Ks
    Workstation    $1000s
    Personal       $100s
    Embedded       $10s

    Differences in scale, not in substance

    Price/Performance Pyramid


    Figure 3.5 Embedded computers are ubiquitous, yet invisible. They are found in our automobiles, appliances, and many other places.

    [Diagram labels: engine, impact sensors, navigation & entertainment, central controller, brakes, airbags.]

    Automotive Embedded Computers


    Figure 3.6 Notebooks, a common class of portable computers, are much smaller than desktops but offer substantially the same capabilities. What are the main reasons for the size difference?

    Personal Computers and Workstations


    Figure 3.7 The (three, four, five, or) six main units of a digital computer. Usually, the link unit (a simple bus or a more elaborate network) is not explicitly included in such diagrams.

    Digital Computer Subsystems


    Table 3.2 The 5 generations of digital computers, and their ancestors.

    Generation (begun)  Processor technology   Memory innovations  I/O devices introduced         Dominant look & feel
    0 (1600s)           (Electro-)mechanical   Wheel, card         Lever, dial, punched card      Factory equipment
    1 (1950s)           Vacuum tube            Magnetic drum       Paper tape, magnetic tape      Hall-size cabinet
    2 (1960s)           Transistor             Magnetic core       Drum, printer, text terminal   Room-size mainframe
    3 (1970s)           SSI/MSI                RAM/ROM chip        Disk, keyboard, video monitor  Desk-size mini
    4 (1980s)           LSI/VLSI               SRAM/DRAM           Network, CD, mouse, sound      Desktop/laptop micro
    5 (1990s)           ULSI/GSI/WSI, SOC      SDRAM, flash        Sensor/actuator, point/click   Invisible, embedded

    Generations of Progress


    Figure 3.8 The manufacturing process for an IC part.

    [Flow diagram: silicon crystal ingot (dimension labels 15-30 cm and 30-60 cm) -> slicer -> blank wafer with defects (0.2 cm) -> processing: 20-30 steps -> patterned wafer (100s of simple or scores of complex processors) -> dicer -> die (~1 cm) -> die tester -> good die (~1 cm) -> mounting -> microchip or other part -> part tester -> usable part to ship.]

    IC Production and Yield


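    Figure 3.9 Visualizing the dramatic decrease in yield with larger dies.

    Wafer with smaller dies: 120 dies, 109 good. Wafer with larger dies: 26 dies, 15 good.

    \[ \text{Die yield} \overset{\text{def}}{=} \frac{\text{number of good dies}}{\text{total number of dies}} \]

    \[ \text{Die yield} = \text{Wafer yield} \times \left[ 1 + \frac{\text{Defect density} \times \text{Die area}}{a} \right]^{-a} \]

    \[ \text{Die cost} = \frac{\text{cost of wafer}}{\text{total number of dies} \times \text{die yield}} = \frac{\text{cost of wafer} \times (\text{die area} / \text{wafer area})}{\text{die yield}} \]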

    Effect of Die Size on Yield
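The yield and cost relations on this slide can be sketched in Python as follows. This is a minimal illustration, not the author's code: the function names and the wafer cost are hypothetical, and the yield model assumes the usual form in which the bracketed term is raised to the power -a. Only the die counts (120 dies with 109 good, 26 dies with 15 good) come from the figure.

```python
def die_yield_from_counts(good_dies: int, total_dies: int) -> float:
    """Die yield by definition: good dies / total dies."""
    return good_dies / total_dies


def die_yield_model(wafer_yield: float, defect_density: float,
                    die_area: float, a: float) -> float:
    """Yield model: wafer_yield * [1 + defect_density * die_area / a] ** (-a)."""
    return wafer_yield * (1.0 + defect_density * die_area / a) ** (-a)


def die_cost(wafer_cost: float, total_dies: int, die_yield: float) -> float:
    """Cost per good die: the wafer cost is spread over the good dies only."""
    return wafer_cost / (total_dies * die_yield)


# Die counts from Figure 3.9; the wafer cost is a made-up value.
WAFER_COST = 1000.0
small = die_yield_from_counts(109, 120)
large = die_yield_from_counts(15, 26)
print(f"small dies: yield {small:.3f}, cost per good die {die_cost(WAFER_COST, 120, small):.2f}")
print(f"large dies: yield {large:.3f}, cost per good die {die_cost(WAFER_COST, 26, large):.2f}")
```

With the same hypothetical wafer cost, the larger die is penalized twice: fewer dies fit on the wafer, and a smaller share of them is good.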


    Figure 3.11 Packaging of processor, memory, and other components.

    (a) 2D or 2.5D packaging now common. Labels: PC board, backplane, memory, CPU, bus, connector.

    (b) 3D packaging of the future. Labels: die; stacked layers glued together; interlayer connections deposited on the outside of the stack.

    Processor and Memory Technologies


    Trends in processor performance and DRAM memory chip capacity

    [Plot residue. Horizontal axis: calendar year, 1980 to 2010. Left axis: processor performance (kIPS, MIPS, GIPS, TIPS). Right axis: memory chip capacity (kb, Mb, Gb, Tb). Processor points: 68000, 80286, 80386, 80486, 68040, Pentium, Pentium II, R10000; growth labels: ×1.6 / yr, ×10 / 5 yrs, ×2 / 18 mos. Memory points: 64 kb, 256 kb, 1 Mb, 4 Mb, 16 Mb, 64 Mb, 256 Mb, 1 Gb; growth label: ×4 / 3 yrs.]

    Moore's Law
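The growth labels on the trend plot (a factor of 1.6 per year, 10 per 5 years, and 2 per 18 months for processors; 4 per 3 years for DRAM chips) are different ways of stating nearly the same yearly rate. The short sketch below checks this; the helper name `per_year` is only illustrative.

```python
def per_year(factor: float, years: float) -> float:
    """Convert 'a factor of `factor` every `years` years' to a yearly factor."""
    return factor ** (1.0 / years)


processor_rates = {
    "1.6x per year": per_year(1.6, 1.0),
    "10x per 5 years": per_year(10.0, 5.0),
    "2x per 18 months": per_year(2.0, 1.5),
}
memory_rate = per_year(4.0, 3.0)  # DRAM chip capacity: 4x every 3 years

for label, rate in processor_rates.items():
    print(f"{label}: about {rate:.2f}x per year")
print(f"4x per 3 years: about {memory_rate:.2f}x per year")
```

All four figures come out close to 1.6 per year, which is why the slide can quote them interchangeably.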


    "DOS addresses only 1 MB of RAM because we cannot imagine any applications needing more." Microsoft, 1980

    "640K ought to be enough for anybody." Bill Gates, 1981

    "Computers in the future may weigh no more than 1.5 tons." Popular Mechanics

    "I think there is a world market for maybe five computers." Thomas Watson, IBM Chairman, 1943

    "There is no reason anyone would want a computer in their home." Ken Olsen, DEC founder, 1977

    "The 32-bit machine would be an overkill for a personal computer." Sol Libes, ByteLines

    Pitfalls of Computer Technology Forecasting


    Figure 3.12 Magnetic and optical disk memory units.

    (a) Cutaway view of a hard disk drive (typically 2-9 cm)

    (b) Some removable storage media: floppy disk, CD-ROM, magnetic tape cartridge

    Input/Output and Communications


    Figure 3.13 Latency and bandwidth characteristics of different classes of communication links.

    [Plot residue. Vertical axis: bandwidth (b/s), \(10^{3}\) to \(10^{12}\). Horizontal axis: latency (s), \(10^{-9}\) to \(10^{3}\), with markers ns, µs, ms, min, h. Link classes: processor bus, I/O network, system-area network (SAN), local-area network (LAN), metro-area network (MAN), wide-area network (WAN). Region labels: same geographic location; geographically distributed.]

    Communication Technologies


    Figure 3.15 Categorization of software, with examples in each class.

    Software
        Application: word processor, spreadsheet, circuit simulator, . . .
        System
            Operating system
                Manager: virtual memory, security, file system, . . .
                Coordinator: scheduling, load balancing, diagnostics, . . .
                Enabler: disk driver, display driver, printing, . . .
            Translator: MIPS assembler, C compiler, . . .

    Software Systems and Applications


    Figure 3.14 Models and abstractions in programming.

    Very high-level language objectives or tasks:

        Swap v[i] and v[i+1]

    Interpreter (one task = many statements)

    High-level language statements:

        temp=v[i]
        v[i]=v[i+1]
        v[i+1]=temp

    Compiler (one statement = several instructions)

    Assembly language instructions, mnemonic:

        add $2,$5,$5
        add $2,$2,$2
        add $2,$4,$2
        lw $15,0($2)
        lw $16,4($2)
        sw $16,0($2)
        sw $15,4($2)
        jr $31

    Assembler (mostly one-to-one)

    Machine language instructions, binary (hex):

        00a51020
        00421020
        00821020
        8c620000
        8cf20004
        acf20000
        ac620004
        03e00008

    Toward the high-level end: more abstract, machine-independent; easier to write, read, debug, or maintain.

    Toward the low-level end: more concrete, machine-specific, error-prone; harder to write, read, debug, or maintain.

    High- vs Low-Level Programming
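To make the "mostly one-to-one" step from assembly to machine language concrete, here is a small sketch that decodes the four register-format words from the figure (the three `add` instructions and the final `jr`). It assumes the standard MIPS R-type field layout (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code); the function name and the tiny function-code table are illustrative and cover only these two operations.

```python
FUNCT_NAMES = {0x20: "add", 0x08: "jr"}  # only the two functions used here


def decode_r_type(word: int) -> str:
    """Decode a 32-bit MIPS R-type word (opcode 0) into assembly text."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    rd = (word >> 11) & 0x1F
    funct = word & 0x3F
    if opcode != 0 or funct not in FUNCT_NAMES:
        raise ValueError(f"not a supported R-type word: {word:08x}")
    name = FUNCT_NAMES[funct]
    if name == "jr":
        return f"jr ${rs}"           # jump register uses only the rs field
    return f"{name} ${rd},${rs},${rt}"


for word in (0x00A51020, 0x00421020, 0x00821020, 0x03E00008):
    print(f"{word:08x}  {decode_r_type(word)}")
```

Each hex word maps back to exactly one mnemonic line, which is the sense in which the assembler's translation is one-to-one.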

    Computer Performance

    1. Cost, Performance, and Cost/Performance

    2. Defining Computer Performance

    3. Performance Enhancement and Amdahl's Law

    4. Performance Measurement vs Modeling

    5. Reporting Computer Performance

    6. The Quest for Higher Performance


    [Plot residue. Computer cost ($1, $1 K, $1 M, $1 G) versus calendar year (1960 to 2020).]

    Cost, Performance, and Cost/Performance


    Figure 4.1 Performance improvement as a function of cost.

    [Plot of performance versus cost with three curves: superlinear (economy of scale), linear (ideal?), and sublinear (diminishing returns).]

    Cost/Performance


    Figure 4.2 Pipeline analogy shows that imbalance between processing power and I/O capabilities leads to a performance bottleneck.

    [Diagram labels: input, processing, output; CPU-bound task; I/O-bound task.]

    Defining Computer Performance


    B 747

    DC-8-50

    Six Passenger Aircraft to Be Compared


    Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft or are averages of cited range of values.

    Aircraft       Passengers   Range (km)   Speed (km/h)   Price ($M)
    Airbus A310    250           8 300         895          120
    Boeing 747     470           6 700         980          200
    Boeing 767     250          12 300         885          120
    Boeing 777     375           7 450         980          180
    Concorde       130           6 400       2 200          350
    DC-8-50        145          14 000         875           80

    Speed of sound ≈ 1220 km/h

    Performance of Aircraft: An Analogy


    Performance from the viewpoint of a passenger: Speed

    Note, however, that flight time is but one part of total travel time. Also, if the travel distance exceeds the range of a faster plane, a slower plane may be better due to not needing a refueling stop.

    Performance from the viewpoint of an airline: Throughput

    Measured in passenger-km per hour (relevant if ticket price were proportional to distance traveled, which in reality it is not)

    Airbus A310    250 × 895  = 0.224 M passenger-km/hr
    Boeing 747     470 × 980  = 0.461 M passenger-km/hr
    Boeing 767     250 × 885  = 0.221 M passenger-km/hr
    Boeing 777     375 × 980  = 0.368 M passenger-km/hr
    Concorde       130 × 2200 = 0.286 M passenger-km/hr
    DC-8-50        145 × 875  = 0.127 M passenger-km/hr

    Performance from the viewpoint of FAA: Safety

    Different Views of Performance
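The two viewpoints can be compared with a few lines of Python. This is an illustrative sketch: the passenger counts and speeds are the Table 4.1 figures, while the function and variable names are my own.

```python
# (passengers, speed in km/h) from Table 4.1
AIRCRAFT = {
    "Airbus A310": (250, 895),
    "Boeing 747": (470, 980),
    "Boeing 767": (250, 885),
    "Boeing 777": (375, 980),
    "Concorde": (130, 2200),
    "DC-8-50": (145, 875),
}


def throughput(passengers: int, speed_kmh: int) -> float:
    """Throughput in millions of passenger-km per hour."""
    return passengers * speed_kmh / 1e6


results = {name: throughput(*row) for name, row in AIRCRAFT.items()}
fastest = max(AIRCRAFT, key=lambda name: AIRCRAFT[name][1])   # passenger's view
highest_throughput = max(results, key=results.get)            # airline's view

for name, value in results.items():
    print(f"{name:12s} {value:.3f} M passenger-km/hr")
print("fastest:", fastest, "| highest throughput:", highest_throughput)
```

The winner depends on the metric: the fastest aircraft is not the one with the highest throughput.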


    Table 4.1 Key characteristics of six passenger aircraft: all figures are approximate; some relate to a specific model/configuration of the aircraft or are averages of cited range of values.

    Aircraft    Passengers   Range (km)   Speed (km/h)   Price ($M)   Throughput (M P km/hr)   Cost / Performance
    A310        250           8 300         895          120          0.224                     536
    B 747       470           6 700         980          200          0.461                     434
    B 767       250          12 300         885          120          0.221                     543
    B 777       375           7 450         980          180          0.368                     489
    Concorde    130           6 400       2 200          350          0.286                    1224
    DC-8-50     145          14 000         875           80          0.127                     630

    Throughput: larger values better. Cost / Performance: smaller values better.

    Cost Effectiveness: Cost/Performance
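The cost/performance column is simply price divided by throughput. The sketch below recomputes it from the table's own price and (rounded) throughput columns; the dictionary and function names are illustrative.

```python
# (price in $M, throughput in M passenger-km/hr), copied from the table
TABLE = {
    "A310": (120, 0.224),
    "B 747": (200, 0.461),
    "B 767": (120, 0.221),
    "B 777": (180, 0.368),
    "Concorde": (350, 0.286),
    "DC-8-50": (80, 0.127),
}


def cost_per_performance(price_musd: float, throughput: float) -> float:
    """Price divided by throughput; smaller is better."""
    return price_musd / throughput


ratios = {name: cost_per_performance(*row) for name, row in TABLE.items()}
most_cost_effective = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name:9s} {ratio:7.0f}")
print("most cost-effective:", most_cost_effective)
```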


    Performance = 1 / Execution time is simplified to

    Performance = 1 / CPU execution time

    (Performance of M1) / (Performance of M2) = Speedup of M1 over M2 = (Execution time of M2) / (Execution time of M1)

    Terminology: M1 is x times as fast as M2 (e.g., 1.5 times as fast)

    M1 is 100(x − 1)% faster than M2 (e.g., 50% faster)

    \[ \text{CPU time} = \text{Instructions} \times (\text{Cycles per instruction}) \times (\text{Secs per cycle}) = \text{Instructions} \times \text{CPI} \,/\, (\text{Clock rate}) \]

    Instruction count, CPI, and clock rate are not completely independent, so improving one by a given factor may not lead to overall execution time improvement by the same factor.

    Concepts of Performance and Speedup
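The speedup terminology translates directly into code. In this sketch the two execution times (15 s and 10 s) are hypothetical values chosen to reproduce the slide's "1.5 times as fast, 50% faster" example; the function names are illustrative.

```python
def speedup(time_m2: float, time_m1: float) -> float:
    """Speedup of M1 over M2 = (execution time of M2) / (execution time of M1)."""
    return time_m2 / time_m1


def percent_faster(x: float) -> float:
    """'x times as fast' expressed as '100(x - 1)% faster'."""
    return 100.0 * (x - 1.0)


# Hypothetical running times of one program on two machines, in seconds.
x = speedup(time_m2=15.0, time_m1=10.0)
print(f"M1 is {x} times as fast as M2, i.e. {percent_faster(x):.0f}% faster")
```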


    \[ \text{CPU time} = \text{Instructions} \times (\text{Cycles per instruction}) \times (\text{Secs per cycle}) = \text{Instructions} \times \text{Average CPI} \,/\, (\text{Clock rate}) \]

    Clock period: the "Secs per cycle" factor, equal to 1 / (Clock rate)

    Clock rate: 1 GHz = \(10^{9}\) cycles/s (cycle time = \(10^{-9}\) s = 1 ns); 200 MHz = \(200 \times 10^{6}\) cycles/s (cycle time = 5 ns)

    Average CPI: Is calculated based on the dynamic instruction mix and knowledge of how many clock cycles are needed to execute various instructions (or instruction classes)

    Instructions: Number of instructions executed, not number of instructions in our program (dynamic count)

    Elaboration on the CPU Time Formula
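A minimal sketch of the CPU time formula and the clock-rate conversions follows. The two clock rates (1 GHz and 200 MHz) are the slide's examples; the workload of 2 million executed instructions at an average CPI of 2.5 is a hypothetical input, and the function names are illustrative.

```python
def cycle_time_s(clock_rate_hz: float) -> float:
    """Clock period in seconds: 1 / clock rate."""
    return 1.0 / clock_rate_hz


def cpu_time_s(dynamic_instructions: float, average_cpi: float,
               clock_rate_hz: float) -> float:
    """CPU time = instructions executed * average CPI / clock rate."""
    return dynamic_instructions * average_cpi / clock_rate_hz


print(f"1 GHz   -> {cycle_time_s(1e9) * 1e9:.1f} ns per cycle")
print(f"200 MHz -> {cycle_time_s(200e6) * 1e9:.1f} ns per cycle")

# Hypothetical workload: 2 million executed instructions, average CPI 2.5.
print(f"CPU time at 200 MHz: {cpu_time_s(2_000_000, 2.5, 200e6) * 1e3:.1f} ms")
```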


    How many instructions are executed in this program fragment? Each "for" consists of two instructions: increment index, check exit condition.

        250 instructions
        for i = 1, 100 do
            20 instructions
            for j = 1, 100 do
                40 instructions
                for k = 1, 100 do
                    10 instructions
                endfor
            endfor
        endfor

    k loop: (2 + 10) instructions × 100 iterations = 1200 instructions in all

    j loop: (2 + 40 + 1200) instructions × 100 iterations = 124,200 instructions in all

    i loop: (2 + 20 + 124,200) instructions × 100 iterations = 12,422,200 instructions in all

    Dynamic count = 250 + 12,422,200 = 12,422,450 instructions

    Static count = 326

    Other loop forms shown on the slide: for i = 1, n; while x > 0

    Dynamic Instruction Count
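The count can be reproduced by working from the innermost loop outward. This sketch follows the slide's rule of two overhead instructions per loop iteration; the helper name `loop_count` is illustrative.

```python
LOOP_OVERHEAD = 2  # per iteration: increment index, check exit condition


def loop_count(body_instructions: int, iterations: int) -> int:
    """Instructions executed by a loop whose body executes `body_instructions`."""
    return (LOOP_OVERHEAD + body_instructions) * iterations


k_loop = loop_count(10, 100)             # innermost loop
j_loop = loop_count(40 + k_loop, 100)    # middle loop, including the k loop
i_loop = loop_count(20 + j_loop, 100)    # outer loop, including the j loop
dynamic_count = 250 + i_loop             # plus the straight-line prologue

# Static count: every instruction counted once, as written in the program.
static_count = 250 + (2 + 20) + (2 + 40) + (2 + 10)

print(f"dynamic: {dynamic_count:,}  static: {static_count}")
```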


    Figure 4.3 Faster steps do not necessarily mean shorter travel time.

    [Illustration labels: 1 GHz, 2 GHz, 4 steps, 20 steps, Solution.]

    Suppose addition takes 1 ns.

    Clock period = 1 ns; 1 cycle

    Clock period = 0.5 ns; 2 cycles

    In this example, addition time does not improve in going from a 1 GHz to a 2 GHz clock.

    Faster Clock ≠ Shorter Running Time


    Figure 4.4 Amdahl's law: speedup achieved if a fraction f of a task is unaffected and the remaining 1 − f part runs p times as fast.

    [Plot of speedup (s), 0 to 50, versus enhancement factor (p), 0 to 50, with one curve for each of f = 0, 0.01, 0.02, 0.05, and 0.1.]

    \[ s = \frac{1}{f + (1 - f)/p} \le \min(p,\, 1/f) \]

    f = fraction unaffected

    p = speedup of the rest

    Performance Enhancement: Amdahl's Law
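Amdahl's formula, s = 1 / [f + (1 − f)/p], and its upper bound min(p, 1/f) can be checked numerically. The sketch below evaluates the five values of f plotted in the figure at p = 50; the function name is illustrative.

```python
def amdahl_speedup(f: float, p: float) -> float:
    """Speedup when a fraction f is unaffected and the rest runs p times as fast."""
    return 1.0 / (f + (1.0 - f) / p)


P = 50.0
for f in (0.0, 0.01, 0.02, 0.05, 0.1):
    s = amdahl_speedup(f, P)
    bound = P if f == 0 else min(P, 1.0 / f)
    print(f"f = {f:<4}  speedup = {s:6.2f}  bound min(p, 1/f) = {bound:5.1f}")
```

Even a 10% unaffected fraction caps the speedup below 10, no matter how large p becomes.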


    Figure 4.5 Running times of six programs on three machines.

    [Chart of execution time for programs A, B, C, D, E, and F on Machine 1, Machine 2, and Machine 3.]

    Performance Measurement vs Modeling


    Original running time of a program:

    \[ 1 = f_1 + f_2 + \cdots + f_k \]

    New running time after the fraction \(f_i\) is speeded up by a factor \(p_i\):

    \[ \frac{f_1}{p_1} + \frac{f_2}{p_2} + \cdots + \frac{f_k}{p_k} \]

    Speedup formula:

    \[ S = \frac{1}{\dfrac{f_1}{p_1} + \dfrac{f_2}{p_2} + \cdots + \dfrac{f_k}{p_k}} \]

    If a particular fraction is slowed down rather than speeded up, use \(s_j f_j\) instead of \(f_j / p_j\), where \(s_j > 1\) is the slowdown factor.

    Generalized Amdahl's Law
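The generalized formula can be written so that speedups and slowdowns are handled uniformly: each fraction is paired with a multiplier on its running time, 1/p for a part sped up by p and s for a part slowed down by s. This is a sketch under that convention; the function name and the three-part example program are hypothetical.

```python
def generalized_speedup(parts):
    """Overall speedup for a list of (fraction, time_multiplier) pairs.

    time_multiplier is 1/p for a part sped up by a factor p, s for a part
    slowed down by a factor s > 1, and 1 for an unaffected part.
    """
    total = sum(fraction for fraction, _ in parts)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    new_time = sum(fraction * multiplier for fraction, multiplier in parts)
    return 1.0 / new_time


# Hypothetical program: 50% sped up 2x, 30% unchanged, 20% slowed down 1.5x.
example = [(0.5, 1 / 2.0), (0.3, 1.0), (0.2, 1.5)]
print(f"overall speedup: {generalized_speedup(example):.3f}")
```

With a single enhanced part and the rest unaffected, the function reduces to the basic form of Amdahl's law.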


    Example 4.3

    You are an engineer at Outtel, a start-up aspiring to compete with Intel via its new processor design that outperforms the latest Intel processor by a factor of 2.5 on floating-point instructions. This level of performance was achieved by design compromises that led to a 20% increase in the execution time of all other instructions. You are in charge of choosing benchmarks that would showcase Outtel's performance edge.

    a. What is the minimum required fraction f of time spent on floating-point instructions in a program on the Intel processor to show a speedup of 2 or better for Outtel?

    Solution

    a. We use a generalized form of Amdahl's formula in which a fraction f is speeded up by a given factor (2.5) and the rest is slowed down by another factor (1.2):

    \[ \frac{1}{1.2(1 - f) + f/2.5} \ge 2 \;\Rightarrow\; f \ge 0.875 \]

    Performance Benchmarks
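The threshold in Example 4.3 follows from setting the new running time, 1.2(1 − f) + f/2.5, equal to 1/2 and solving for f. A short numerical check, with illustrative names:

```python
def outtel_speedup(f: float) -> float:
    """Speedup when fraction f runs 2.5x faster and the rest takes 1.2x as long."""
    return 1.0 / (1.2 * (1.0 - f) + f / 2.5)


# Solve 1.2(1 - f) + f/2.5 = 1/2, i.e. 1.2 - (1.2 - 0.4) f = 0.5, for f.
f_min = (1.2 - 0.5) / (1.2 - 1 / 2.5)

print(f"minimum floating-point fraction: {f_min:.3f}")
for f in (0.80, f_min, 0.95):
    print(f"f = {f:.3f} -> speedup {outtel_speedup(f):.2f}")
```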


    \[ \text{Average CPI} = \sum_{\text{all instruction classes } i} (\text{Class-}i\text{ fraction}) \times (\text{Class-}i\text{ CPI}) \]

    Machine cycle time = 1 / Clock rate

    CPU execution time = Instructions × (Average CPI) / (Clock rate)

    Table 4.3 Usage frequency, in percentage, for various instruction classes in four representative applications.

    Instrn class     Data compression   C language compiler   Reactor simulation   Atomic motion modeling
    A: Load/Store    25                 37                    32                   37
    B: Integer       32                 28                    17                    5
    C: Shift/Logic   16                 13                     2                    1
    D: Float          0                  0                    34                   42
    E: Branch        19                 13                     9                   10
    F: All others     8                  9                     6                    4

    Performance Estimation


    Example 4.4 (2 of 5 parts)

    Consider two implementations M1 (600 MHz) and M2 (500 MHz) of an instruction set containing three classes of instructions:

    Class   CPI for M1   CPI for M2   Comments
    F       5.0          4.0          Floating-point
    I       2.0          3.8          Integer arithmetic
    N       2.4          2.0          Nonarithmetic

    a. What are the peak performances of M1 and M2 in MIPS?

    b. If 50% of instructions executed are class-N, with the rest divided equally among F and I, which machine is faster? By what factor?

    Solution

    a. Peak MIPS for M1 = 600 / 2.0 = 300; for M2 = 500 / 2.0 = 250

    b. Average CPI for M1 = 5.0 / 4 + 2.0 / 4 + 2.4 / 2 = 2.95; for M2 = 4.0 / 4 + 3.8 / 4 + 2.0 / 2 = 2.95

    Because the average CPIs are equal, the clock rates decide: M1 is faster, by a factor of 600 / 500 = 1.2

    CPI and IPS Calculations
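Both parts of Example 4.4 can be recomputed from the clock rates and the per-class CPI values. In this sketch the data are the example's own; the dictionary and function names are illustrative.

```python
CLOCK_MHZ = {"M1": 600.0, "M2": 500.0}
CPI = {
    "M1": {"F": 5.0, "I": 2.0, "N": 2.4},
    "M2": {"F": 4.0, "I": 3.8, "N": 2.0},
}
MIX = {"F": 0.25, "I": 0.25, "N": 0.50}  # part b: half class-N, rest split equally


def peak_mips(machine: str) -> float:
    """Peak rate: every instruction belongs to the class with the lowest CPI."""
    return CLOCK_MHZ[machine] / min(CPI[machine].values())


def average_cpi(machine: str) -> float:
    """Mix-weighted average CPI."""
    return sum(MIX[c] * CPI[machine][c] for c in MIX)


def mips(machine: str) -> float:
    """Achieved MIPS for the given mix: clock rate (MHz) / average CPI."""
    return CLOCK_MHZ[machine] / average_cpi(machine)


for m in ("M1", "M2"):
    print(f"{m}: peak {peak_mips(m):.0f} MIPS, average CPI {average_cpi(m):.2f}")
print(f"M1 over M2: {mips('M1') / mips('M2'):.2f}")
```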


    Example 4.5

    Two compilers produce machine code for a program on a machine with two classes of instructions. Here are the number of instructions:

    Class   CPI   Compiler 1   Compiler 2
    A       1     600M         400M
    B       2     400M         400M

    a. What are run times of the two programs with a 1 GHz clock?

    b. Which compiler produces faster code and by what factor?

    c. Which compiler's output runs at a higher MIPS rate?

    Solution

    a. Running time 1 (2) = (600M × 1 + 400M × 2) / \(10^{9}\) = 1.4 s (1.2 s)

    b. Compiler 2's output runs 1.4 / 1.2 = 1.17 times as fast

    c. MIPS rating 1, CPI = 1.4 (2, CPI = 1.5) = 1000 / 1.4 = 714 (667)

    MIPS Rating Can Be Misleading
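Example 4.5 is easy to verify in code, and doing so makes the point of the slide visible: the output with the higher MIPS rating is the slower one. The instruction counts, CPIs, and clock rate are from the example; the names are illustrative.

```python
CLOCK_HZ = 1e9  # 1 GHz
CPI = {"A": 1, "B": 2}
COUNTS = {
    "Compiler 1": {"A": 600e6, "B": 400e6},
    "Compiler 2": {"A": 400e6, "B": 400e6},
}


def run_time_s(counts: dict) -> float:
    """Total cycles divided by the clock rate."""
    cycles = sum(n * CPI[cls] for cls, n in counts.items())
    return cycles / CLOCK_HZ


def mips_rating(counts: dict) -> float:
    """Millions of instructions executed per second of run time."""
    return sum(counts.values()) / run_time_s(counts) / 1e6


for name, counts in COUNTS.items():
    print(f"{name}: {run_time_s(counts):.1f} s, {mips_rating(counts):.0f} MIPS")
```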


    Table 4.4 Measured or estimated execution times for three programs.

                   Time on machine X   Time on machine Y   Speedup of Y over X
    Program A        20                 200                  0.1
    Program B      1000                 100                 10.0
    Program C      1500                 150                 10.0
    All 3 progs    2520                 450                  5.6

    Analogy: If a car is driven to a city 100 km away at 100 km/hr and returns at 50 km/hr, the average speed is not (100 + 50) / 2 but is obtained from the fact that it travels 200 km in 3 hours (about 66.7 km/hr).

    Reporting Computer Performance


    Table 4.4 Measured or estimated execution times for three programs.

                      Time on machine X   Time on machine Y   Speedup of Y over X   Speedup of X over Y
    Program A           20                 200                  0.1                  10
    Program B         1000                 100                 10.0                   0.1
    Program C         1500                 150                 10.0                   0.1
    Arithmetic mean                                             6.7                   3.4
    Geometric mean                                              2.15                  0.46

    Geometric mean does not yield a measure of overall speedup, but provides an indicator that at least moves in the right direction.

    Comparing the Overall Performance
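The means in the table can be recomputed from the three pairs of execution times. The sketch below also shows why the arithmetic mean is unsatisfying here: it rates each machine as faster than the other, whereas the two geometric means are reciprocals of each other. The function names are illustrative.

```python
import math

# (time on machine X, time on machine Y) from Table 4.4
TIMES = {"A": (20, 200), "B": (1000, 100), "C": (1500, 150)}

y_over_x = [tx / ty for tx, ty in TIMES.values()]  # speedup of Y over X
x_over_y = [ty / tx for tx, ty in TIMES.values()]  # speedup of X over Y


def arithmetic_mean(values):
    return sum(values) / len(values)


def geometric_mean(values):
    return math.prod(values) ** (1.0 / len(values))


# Overall speedup of Y over X when all three programs are run once each.
overall = sum(tx for tx, _ in TIMES.values()) / sum(ty for _, ty in TIMES.values())

print(f"arithmetic means: {arithmetic_mean(y_over_x):.1f} and {arithmetic_mean(x_over_y):.1f}")
print(f"geometric means:  {geometric_mean(y_over_x):.2f} and {geometric_mean(x_over_y):.2f}")
print(f"overall speedup of Y over X: {overall:.1f}")
```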


    Example 4.6 (1 of 3 parts)

    Consider two applications DC and RS and two machines M1 and M2:

    Class         Data Comp.   Reactor Sim.   M1's CPI   M2's CPI
    A: Ld/Str     25%          32%            4.0        3.8
    B: Integer    32%          17%            1.5        2.5
    C: Sh/Logic   16%           2%            1.2        1.2
    D: Float       0%          34%            6.0        2.6
    E: Branch     19%           9%            2.5        2.2
    F: Other       8%           6%            2.0        2.3

    a. Find the effective CPI for the two applications on both machines.

    Solution

    a. CPI of DC on M1: 0.25 × 4.0 + 0.32 × 1.5 + 0.16 × 1.2 + 0 × 6.0 + 0.19 × 2.5 + 0.08 × 2.0 = 2.31

    DC on M2: 2.54; RS on M1: 3.94; RS on M2: 2.89

    Effect of Instruction Mix on Performance
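The four effective CPI values in Example 4.6 are mix-weighted sums, which a few lines of Python can reproduce. The fractions and CPIs are the example's own, listed in class order A to F; the names are illustrative.

```python
# Instruction-class fractions (classes A..F) for each application
MIX = {
    "DC": [0.25, 0.32, 0.16, 0.00, 0.19, 0.08],
    "RS": [0.32, 0.17, 0.02, 0.34, 0.09, 0.06],
}
# CPI per class (A..F) on each machine
CPI = {
    "M1": [4.0, 1.5, 1.2, 6.0, 2.5, 2.0],
    "M2": [3.8, 2.5, 1.2, 2.6, 2.2, 2.3],
}


def effective_cpi(app: str, machine: str) -> float:
    """Sum over classes of (class fraction) * (class CPI)."""
    return sum(f * c for f, c in zip(MIX[app], CPI[machine]))


for app in MIX:
    for machine in CPI:
        print(f"{app} on {machine}: {effective_cpi(app, machine):.3f}")
```

The faster machine depends on the application: M1 has the lower CPI for data compression, M2 for the floating-point-heavy reactor simulation.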


    State of available computing power ca. the early 2000s:

    Gigaflops on the desktop

    Teraflops in the supercomputer center

    Petaflops on the drawing board

    Note on terminology (see Table 3.1)

    Prefixes for large units: Kilo = \(10^{3}\), Mega = \(10^{6}\), Giga = \(10^{9}\), Tera = \(10^{12}\), Peta = \(10^{15}\)

    For memory: K = \(2^{10}\) = 1024, M = \(2^{20}\), G = \(2^{30}\), T = \(2^{40}\), P = \(2^{50}\)

    Prefixes for small units: micro = \(10^{-6}\), nano = \(10^{-9}\), pico = \(10^{-12}\), femto = \(10^{-15}\)

    The Quest for Higher Performance


    Figure 3.10 Trends in processor performance and DRAM memory chip capacity (Moore's law).

    [Plot residue. Horizontal axis: calendar year, 1980 to 2010. Left axis: processor performance (kIPS, MIPS, GIPS, TIPS). Right axis: memory chip capacity (kb, Mb, Gb, Tb). Processor points: 68000, 80286, 80386, 80486, 68040, Pentium, Pentium II, R10000; growth labels: ×1.6 / yr, ×10 / 5 yrs, ×2 / 18 mos. Memory points: 64 kb, 256 kb, 1 Mb, 4 Mb, 16 Mb, 64 Mb, 256 Mb, 1 Gb; growth label: ×4 / 3 yrs.]

    "Can I call you back? We just bought a new computer and we're trying to set it up before it's obsolete."

    Performance Trends and Obsolescence


    Figure 4.7 Exponential growth of supercomputer performance.

    [Plot residue. Horizontal axis: calendar year, 1980 to 2010. Vertical axis: supercomputer performance (MFLOPS, GFLOPS, TFLOPS, PFLOPS). Vector supercomputers: Cray X-MP, Y-MP. Massively parallel processors: CM-2, CM-5, $30M MPPs, $240M MPPs.]

    Supercomputers


    Figure 4.8 Milestones in the DOE's Accelerated Strategic Computing Initiative (ASCI) program with extrapolation up to the PFLOPS level.

    [Plot residue. Horizontal axis: calendar year, 1995 to 2010. Vertical axis: performance (TFLOPS), 1 to 1000. Phases: plan, develop, use. Machines: ASCI Red, ASCI Blue, ASCI White, ASCI Q, ASCI Purple. Milestones: 1+ TFLOPS, 0.5 TB; 3+ TFLOPS, 1.5 TB; 10+ TFLOPS, 5 TB; 30+ TFLOPS, 10 TB; 100+ TFLOPS, 20 TB.]

    The Most Powerful Computers


    Figure 25.1 Trend in computational performance per watt of power used in general-purpose processors and DSPs.

    [Plot residue. Horizontal axis: calendar year, 1980 to 2010. Vertical axis: performance (kIPS, MIPS, GIPS, TIPS). Curves: absolute processor performance, GP processor performance per watt, DSP performance per watt.]

    Performance is Important, But It Isn't Everything


    Ch. 5-8: A simple ISA, variations in ISA

    Ch. 9-12: ALU design

    Ch. 13-14: Data path and control unit design

    Ch. 15-16: Pipelining and its limits

    Ch. 17-20: Memory (main, mass, cache, virtual)

    Ch. 21-24: I/O, buses, interrupts, interfacing

    Ch. 25-28: Vector and parallel processing

    Fasten your seatbelts as we begin our ride!

    Roadmap for the Rest of the Book
