4304-1-history
EE/CE 4304: Computer Architecture
Bill Swartz, Dept. of EE
Univ. of Texas at Dallas
EEDG/CE4304 – B. Swartz
Dedicated to Cy Cantrell
1940-2013
Introduction / History
Session 01
Computer
• Programmable machine
• Receives input
• Manipulates and stores information known as data
• Provides output
Five Classic Components of a Computer
Control
Datapath
Memory
Processor
Input
Output
Architecture
• Implementation of the machine
Representation
• Analog
  • Voltage, current, frequency, phase, pressure, position, etc.
• Digital
  • Symbolic representation, almost universally binary
Implementation
• Technology
  • Mechanical
    • Antikythera mechanism (150-100 BC), Archimedes???
    • Difference Engine (1822) and Analytical Engine (1837), Charles Babbage
  • Electromechanical
    • Z3 machine, Konrad Zuse (1941)
  • Electronic
    • ENIAC, Mauchly & Eckert (1946)
  • MEMS (Micro-Electro-Mechanical Systems) / nano self-assembly ???
PIC12F629 Block Diagram
Microcontroller: computer on a chip, limited resources
The University of Texas at Dallas, Erik Jonsson School of Engineering & Computer Science
HOW DID WE GET WHERE WE ARE IN COMPUTATION?
Patterson & Hennessy, 4th edition, revised
FIGURE 1.10.7 Characteristics of key commercial computers since 1950, in actual dollars and in 2007 dollars
adjusted for inflation. The last row assumes we can fully utilize the potential performance of the four cores in Barcelona. In contrast to
Figure 1.10.3, here the price of the IBM S/360 model 50 includes I/O devices. (Source: The Computer History Museum and Producer Price
Index for Industrial Commodities.)
| Year | Name | Size (cu. ft.) | Power (watts) | Performance (adds/sec) | Memory (KB) | Price | Price/performance vs. UNIVAC | Adjusted price (2007 $) | Adjusted price/performance vs. UNIVAC |
|---|---|---|---|---|---|---|---|---|---|
| 1951 | UNIVAC I | 1,000 | 125,000 | 2,000 | 48 | $1,000,000 | 1 | $7,670,724 | 1 |
| 1964 | IBM S/360 model 50 | 60 | 10,000 | 500,000 | 64 | $1,000,000 | 263 | $6,018,798 | 319 |
| 1965 | PDP-8 | 8 | 500 | 330,000 | 4 | $16,000 | 10,855 | $94,685 | 13,367 |
| 1976 | Cray-1 | 58 | 60,000 | 166,000,000 | 32,000 | $4,000,000 | 21,842 | $13,509,798 | 47,127 |
| 1981 | IBM PC | 1 | 150 | 240,000 | 256 | $3,000 | 42,105 | $6,859 | 134,208 |
| 1991 | HP 9000/model 750 | 2 | 500 | 50,000,000 | 16,384 | $7,400 | 3,556,188 | $11,807 | 16,241,889 |
| 1996 | Intel PPro PC (200 MHz) | 2 | 500 | 400,000,000 | 16,384 | $4,400 | 47,846,890 | $6,211 | 247,021,234 |
| 2003 | Intel Pentium 4 PC (3.0 GHz) | 2 | 500 | 6,000,000,000 | 262,144 | $1,600 | 1,875,000,000 | $2,009 | 11,451,750,000 |
| 2007 | AMD Barcelona PC (2.5 GHz) | 2 | 250 | 20,000,000,000 | 2,097,152 | $800 | 12,500,000,000 | $800 | 95,884,051,042 |
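The price/performance column can be reproduced, at least approximately, from the performance and price columns. The sketch below is illustrative only: the function name and the choice of rows are assumptions, and it takes the UNIVAC I row exactly as printed (2,000 adds/sec, $1,000,000) as the baseline. Some printed entries may differ slightly from what this computes, presumably because of rounding in the published figures.

```python
from fractions import Fraction

# (year, name, performance in adds/sec, price in dollars), copied from the table
MACHINES = [
    (1951, "UNIVAC I", 2_000, 1_000_000),
    (1965, "PDP-8", 330_000, 16_000),
    (1981, "IBM PC", 240_000, 3_000),
    (2007, "AMD Barcelona PC", 20_000_000_000, 800),
]

def price_performance_vs_baseline(perf, price, base_perf, base_price):
    """Return (perf/price) divided by (base_perf/base_price) as an exact ratio."""
    return Fraction(perf, price) / Fraction(base_perf, base_price)

# Hypothetical baseline choice: the first row (UNIVAC I) as printed.
_, _, BASE_PERF, BASE_PRICE = MACHINES[0]
for year, name, perf, price in MACHINES:
    ratio = price_performance_vs_baseline(perf, price, BASE_PERF, BASE_PRICE)
    print(f"{year} {name:18s} {float(ratio):,.0f}x UNIVAC I")
```

Using exact fractions avoids floating-point noise when the ratios span ten orders of magnitude.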
MARKETS FOR EARLY COMPUTERS
• Scientific market
. Math tables
• Military market
. Artillery tables
. Decryption
. Thermonuclear weapon design
• Business market
. Point of sale
. Payroll
• Government market
. Census tabulation
© C. D. Cantrell (08/2012)
MATHEMATICAL TABLES
© C. D. Cantrell (08/2010)
A 17th-CENTURY ARTILLERY TABLE
(http://info.ox.ac.uk/departments/hooke/geometry/fig16n.gif)
THE “MIKE” THERMONUCLEAR DEVICE
Eniwetok Atoll, November 1, 1952
http://www.windows.umich.edu/sun/Solar interior/Nuclear Reactions/Fusion/h-bomb.jpg
THE MAIN COMPONENTS OF A COMPUTER
Computer
• Processor
. Control
. Datapath
• Memory
• Peripheral devices
. Input
. Output
© C. D. Cantrell (08/2010)
MECHANICAL CALCULATORS AND COMPUTERS
• Pre-20th century technology: Gears and levers
. Mechanical inertia → slow processing speed
. Unreliable in large systems
• Non-programmable systems
. The Antikythera Mechanism
. Hand-operated numerical calculators
. Cash registers
o Large market; National Cash Register Co.
• Programmable systems
. Charles Babbage’s Difference Engine and Analytical Engine (decimal representation)
© C. D. Cantrell (08/2012)
THE ANTIKYTHERA MECHANISM (1)
• Recovered by sponge divers from an ancient shipwreck (~60-80 BCE) off
the Aegean island of Antikythera in 1900 CE
. May have been constructed in Syracuse (Sicily), possibly using mechanical
and mathematical techniques invented by Archimedes
• Identified as an astronomical analog computer
. 30 hand-cut bronze gears in a complex arrangement
. Nothing equally complex produced in Europe until the 1300’s
• Features:
. Input (a selectable date) by hand cranking the mechanism
. Output on engraved dials:
o Position of the Sun and position and phase of the Moon
o The Saros and Exeligmos (eclipse) cycles
o The Metonic and Callippic cycles (solar and lunar months)
o The Olympiad cycle
© C. D. Cantrell (08/2012)
THE ANTIKYTHERA MECHANISM (2)
[Figure: schematic gear diagram of the Antikythera mechanism. Front dials: zodiac, Egyptian calendar, parapegma, date, Sun position, Moon position, Moon phase. Back dials: Metonic, Callippic, and Olympiad (calendars); Saros and Exeligmos (eclipse prediction). Labeled subassemblies: input, pin-and-slot epicyclic lunar mechanism, lost epicyclic gearing (possibly epicyclic solar and planetary mechanisms).]
Figure 4. Schematic gear diagram. Elements in black are those for which there is evidence; those in red are conjectural. Note that
two gears have 38 teeth (= 2 × 19); one gear has 127; and one gear (or possibly two) has 223. Three gears have 53 teeth, the first of
which contributes to the correct rotation for the lunar anomaly; the other two cancel out the effect of the first, where this prime
factor isn’t wanted. Copyright 2011 Images First Ltd.
Edmunds & Freeth, IEEE Computer, July 2011
THE ANTIKYTHERA MECHANISM (3)
Wikimedia Commons
EARLY CASH REGISTERS
http://www3.ncr.com/history/
BABBAGE’S DIFFERENCE ENGINE
© Prentice-Hall 1995
A SLIDE RULE
The calculation shown on the C and D scales is 4.22 × 1.66 = 7.01.
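A slide rule multiplies by adding lengths that are proportional to logarithms. The sketch below mimics that with base-10 logarithms for the calculation on the slide; the function name is an assumption made for illustration, and a real slide rule is limited to roughly three significant digits by how finely the scales can be read.

```python
import math

def slide_rule_multiply(a, b):
    """Multiply by adding base-10 logarithms, as the C and D scales do with lengths."""
    distance = math.log10(a) + math.log10(b)   # slide one scale along the other
    return 10 ** distance                      # read the product off the D scale

product = slide_rule_multiply(4.22, 1.66)
print(f"{product:.2f}")   # rounded to the precision a slide rule can be read to
```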
© C. D. Cantrell (08/2012)
ELECTROMECHANICAL CALCULATORS AND COMPUTERS
• Early 20th century technology
• Special-purpose systems
. Motor-driven mechanical calculators
. Cash registers
. Enigma cipher machine
. Colossus (used to decrypt the German Lorenz cipher)
• Programmable systems
. Punched-card sorters
o Punched cards: Herman Hollerith, 1890’s
o IBM
. Machines using relays to construct logic gates
o Konrad Zuse’s Z3
o Howard Aiken’s Mark-II (Harvard)
o Originated the term “bug” for a hardware malfunction
Museum Victoria
HERMAN HOLLERITH’S CARD PUNCH MACHINE
• First big application: The 1890 U. S. Census
ELECTROMECHANICAL RELAY SWITCH
THE FIRST COMPUTER BUG
• The first bug was found by Grace Hopper in an electromechanical relay of
the Mark-II computer at Harvard
© C. D. Cantrell (08/2010)
THE ENIGMA CIPHER MACHINE
• The Enigma was a rotor-based cipher machine used in World War II by
German forces, who thought the Enigma’s code was secure.
© C. D. Cantrell (08/2002)
© Smithsonian Institution 1991
THE “PURPLE” CIPHER MACHINE
• The “purple” cipher machine, which used telephone stepping switches instead
of rotors, was used in World War II by the Japanese diplomatic corps
and military. The machine shown was constructed by U.S. cryptanalysts
purely from analysis of coded messages.
© C. D. Cantrell (08/2012)
ELECTRONIC COMPUTERS — GENERATION 1
• Dominant technologies:
. Processors: Vacuum tubes (thermionic cathodes were unreliable)
. Memory: Acoustic delay line, electrostatic storage, magnetic drum
• Special-purpose systems
. John Atanasoff’s experimental machine
. Hoffman’s computer used to decrypt the Japanese Purple cipher
• Programmable systems
. ENIAC I–IV (J. Presper Eckert, John Mauchly)
• Stored-program systems (John von Neumann)
. Experimental projects: EDVAC (von Neumann), EDSAC (Maurice Wilkes)
. Commercial products: UNIVAC I, IBM 701, IBM 704 (fp hardware)
© C. D. Cantrell (08/2012)
VACUUM TUBES
Left: A Western Electric triode in its glass envelope
Right: A cutaway view of a pentode, showing the flow of electrons
from the cathode (inside) to the plate (outside), under the control of wire grids
© C. D. Cantrell (04/2011)
THE ENIAC I
HARVARD ARCHITECTURES
• Instructions and data stored in different memories, possibly in different forms
. Common in early digital architectures (e.g., ENIAC)
. Currently common in L1 cache design
o Separate instruction and data caches
. In some designs, programs were wired on plugboards
. Only one program can execute at a time
. Formulation of concept, design of Mark-II: Howard Aiken, Harvard faculty
member
© C. D. Cantrell (08/2012)
STORED-PROGRAM COMPUTERS
• Instructions and data stored as numbers in a common memory
. Same hardware can be used to load/store instructions and data
. An instruction must be referenced by its address in memory
. Programs are software, because they are not hardwired
. Same hardware can execute many different programs yielding
general-purpose computers
. Same memory can hold several different programs, and the processor can
switch execution from one to another
. In principle, permits self-organizing structures
o This feature rarely used intentionally
o Self-modification by a program usually ⇒ trouble!
. Formulation of concept, design of EDVAC: John von Neumann, Princeton
faculty member and famous mathematician
© C. D. Cantrell (08/2012)
THE VON NEUMANN ARCHITECTURE (1)
• General-purpose, stored-program computer
. Instructions and data are stored in the same format in memory:
Instructions are data
. In principle, an executing program can modify itself
o In practice, this usually means that an error has occurred!
• Strictly sequential execution
. Instruction fetched from memory, then
. Instruction decoded, then
. Data fetched from memory, then
. Instruction executed, then
. Result stored to memory
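The sequential fetch, decode, execute cycle above can be sketched as a toy interpreter. Everything here is hypothetical and chosen only for illustration: the four opcodes, the `opcode * 100 + address` encoding, and the single accumulator do not correspond to any machine in these slides. The point is that instructions and data sit in the same memory and are handled one step at a time.

```python
# Hypothetical four-opcode accumulator machine; instructions and data share one memory.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def run(memory):
    """Execute until HALT, one instruction at a time, and return the final memory."""
    pc = 0      # program counter
    acc = 0     # accumulator
    while True:
        word = memory[pc]                      # 1. fetch the instruction from memory
        opcode, address = divmod(word, 100)    # 2. decode: opcode * 100 + address
        pc += 1
        if opcode == HALT:
            return memory
        if opcode == STORE:
            memory[address] = acc              # 5. store the result to memory
            continue
        operand = memory[address]              # 3. fetch the data from memory
        if opcode == LOAD:                     # 4. execute
            acc = operand
        elif opcode == ADD:
            acc += operand
        else:
            raise ValueError(f"unknown opcode {opcode}")

program = [
    LOAD * 100 + 5,   # address 0: acc = memory[5]
    ADD * 100 + 6,    # address 1: acc += memory[6]
    STORE * 100 + 7,  # address 2: memory[7] = acc
    HALT * 100,       # address 3: stop
    0,                # address 4: unused
    20,               # address 5: data
    22,               # address 6: data
    0,                # address 7: result goes here
]
print(run(program)[7])
```

Because the program is just a list of numbers in `memory`, a `STORE` aimed at addresses 0 to 3 would overwrite an instruction, which is the self-modification hazard noted on the slide.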
© C. D. Cantrell (08/2010)
© C. D. Cantrell (03/2011)
THE FIRST IBM MAINFRAME COMPUTERS
• Mostly scientific, first-generation stored-program computers
. The 700 family: Based on vacuum-tube technology
o 701: 18- and 36-bit integer and fixed-point computation only
  - Known as the “Defense Calculator” during development
  - Produced from 1952 to 1954
o 704: Added 36-bit floating-point computation
. Superseded by the second-generation, transistor-based 7000 series
• Word length: 36 bits
. Allowed storage of 6 characters coded in 6-bit BCD
• Address length: 18 bits
• 18-bit, fixed-length instructions
. Sign bit, used to indicate word or half-word operand addresses
. 5-bit opcode
. 12-bit operand address
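The 18-bit instruction format above (1 sign bit + 5-bit opcode + 12-bit address) can be unpacked with shifts and masks. This is a sketch under a stated assumption: the fields are taken to be packed in the order listed, with the sign bit most significant. The function names and the sample opcode value are hypothetical and are not taken from IBM documentation.

```python
def decode_instruction(word):
    """Split an 18-bit word into (sign, opcode, address), most significant field first."""
    if not 0 <= word < (1 << 18):
        raise ValueError("instruction must fit in 18 bits")
    sign = (word >> 17) & 0x1        # 1 bit
    opcode = (word >> 12) & 0x1F     # 5 bits -> 32 possible operations
    address = word & 0xFFF           # 12 bits -> 4096 addressable locations
    return sign, opcode, address

def encode_instruction(sign, opcode, address):
    """Inverse of decode_instruction (assumed field order: sign, opcode, address)."""
    return (sign << 17) | (opcode << 12) | address

word = encode_instruction(1, 0b01001, 0o1234)   # hypothetical opcode and address
print(decode_instruction(word))
```

Two such instructions fit in one 36-bit word, which is consistent with the word and instruction lengths listed above.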
IBM MODEL 701 DATA PROCESSING SYSTEM
• This figure shows only the electronic analytical control unit and operator
control panel
© IBM
IBM 704 “PLUGGABLE UNIT” (CIRCUIT MODULE)
• Note the vacuum tubes and discrete components
© IBM
IBM 701 CONTROL PANEL
© IBM
IBM MODEL 737 MAGNETIC CORE STORAGE UNIT
© IBM
IBM MODEL 740 CRT OUTPUT RECORDER
© IBM
AN 80-COLUMN PUNCHED CARD
• The basis for the cards-in, paper-out user interface of the 1960’s
. 80 columns, 12 rows
. Coded in EBCDIC, not ASCII
Wikimedia Commons
IBM MODEL 29 CARD PUNCH
© IBM
IBM MODEL 711 CARD READER
© IBM
IBM MODEL 716 LINE PRINTER
© IBM
© C. D. Cantrell (08/2012)
ELECTRONIC COMPUTERS — GENERATION 2
• Dominant technologies:
. Processors: Discrete transistors
. Memory: Magnetic “cores”
• Scientific computers
. IBM 7090, 7094
• General-purpose computers
. IBM System/360
o Same architecture for computers with different price and performance
o Dominated the market for large business computers
. Minicomputers: Digital Equipment PDP-8
• Supercomputers
. Control Data 6600
o Major innovations (load-store architecture, pipeline, multithreading)
THE CONTROL DATA 6600
Wikimedia Commons
CONTROL DATA 6600 1 kb CORE MEMORY PLANE
Wikimedia Commons
MEMORY BASED ON FERRITE CORES
http://www.wins.uva.nl/faculteit/museum/core detail.gif
CONTROL DATA 6600 CORDWOOD MODULE
• Hearty thanks and a tip of the hat to Gary Smith, U.T. Austin Computation
Center (retired)
© C. D. Cantrell (08/2012)
IBM’S LAMENT ABOUT AN UPSTART
• Thomas Watson, IBM CEO, August 1963:
“Last week Control Data ... announced the 6600 system. I understand
that in the laboratory developing the system there are only 34 people
including the janitor. Of these, 14 are engineers and 4 are programmers
... Contrasting this modest effort with our vast development activities,
I fail to understand why we have lost our industry leadership position
by letting someone else offer the world’s most powerful computer.”
Thanks to Gordon Bell, University of Minnesota, 1987
CRAY’S RESPONSE
“It seems like Mr. Watson has answered his own question.”
Thanks to Gordon Bell, University of Minnesota, 1987
ELECTRONIC COMPUTERS — GENERATION 3
• Dominant technologies:
. Processors: Custom designs using integrated circuits on PC boards
. Memory:
oEarly: Magnetic “cores”
oLater: Integrated circuits (SRAM, DRAM)
• General-purpose computers
. IBM System/370
. Minicomputers: Digital Equipment PDP-11, VAX 11-780
• Supercomputers
. Control Data 7600
. Cray-1
© C. D. Cantrell (03/2011)
THE CONTROL DATA 7600
© Cray Research, Inc.
SEYMOUR CRAY AND THE CRAY-1
© Charles Babbage Institute, University of Minnesota
© C. D. Cantrell (08/2012)
ELECTRONIC COMPUTERS — GENERATION 4
• Dominant technologies:
. Custom processors: LSI or VLSI ASICs or gate arrays on PC board
. Microprocessors: Motorola 680xx family, Intel 808x family, 6502
. Memory: SRAM, DRAM
• General-purpose computers
. IBM 3xxx, 43xx series
. Minicomputers: Digital Equipment VAX 8400
• Microprocessor-based desktop computers (Apple IIe, Atari, Commodore 64)
• Microprocessor-based engineering workstations running a UNIX OS
• Minisupercomputers: Convex C-1, C-2, C-3 series
• Vector supercomputers
. Cray-2, Cray X-MP, Cray Y-MP, Cray-3
. Fujitsu, Hitachi, NEC
THE CRAY-2
Wikimedia Commons
THE CRAY X-MP
• Up to 4 processors
• Up to 8M 64-bit words of static RAM memory
• Designed by Steve Chen
© Cray Research, Inc.
CRAY-3 PC BOARD AND DIES
© C. D. Cantrell (08/2012)
SUN WORKSTATION SCREENDUMP (1989)
© C. D. Cantrell (03/2011)
© C. D. Cantrell (08/2010)
ELECTRONIC COMPUTERS — GENERATION 5
• Dominant technologies:
. Processors: Mass-produced microprocessors
. Memory: SRAM, DRAM
. Compilers
. Instruction-level parallelism
o Pipelining
o Concurrent issuing of instructions
o Speculative execution of branches
o Very-long-instruction-word (VLIW) architectures
• RISC processors
. MIPS (Silicon Graphics)
. SPARC (Sun Microsystems)
. Alpha (Digital Equipment)
• CISC processors
. 80x86, Pentium, Pentium Pro, Pentium II, III, IV (Intel)
THE RISC REVOLUTION
[Figure: SPECint rating versus year for SUN4, MIPS R2000, MIPS R3000, IBM Power1, HP 9000, IBM Power2, and three DEC Alpha processors, with trend lines of 1.35x per year and 1.58x per year.]
John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach
THE INTEL 4004 MICROPROCESSOR
THE MIPS R2000 MICROPROCESSOR
R2000 microprocessor
A PENTIUM I MICROPROCESSOR
VECTOR CRT DISPLAYS
© C. D. Cantrell (08/2010)
What is CRT?
Vector implies drawing of lines or curves
TEST IMAGE FOR VECTOR CRT DISPLAYS (1)
(Produced from the original C code and modern graphics)
© C. D. Cantrell (08/2010)
TEST IMAGE FOR VECTOR CRT DISPLAYS (2)
(As it was displayed on a Tektronix 4014 vector graphics display)
© C. D. Cantrell (08/2010)
RASTER CRT DISPLAYS
© C. D. Cantrell (08/2010)
Raster implies pixels
FRAMEBUFFER FOR A RASTER DISPLAY
[Figure: a frame buffer holding one bit (0 or 1) per pixel over the region X0..X1, Y0..Y1, shown next to the same region on a raster-scan CRT display.]
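The idea in the figure can be sketched in a few lines: a frame buffer is a block of memory with one entry per pixel, and the display hardware reads it out in raster order. The resolution, the row-major layout, and the function names below are assumptions chosen for illustration, not details from the slide.

```python
WIDTH, HEIGHT = 8, 4   # assumed toy resolution, one bit per pixel

def make_framebuffer(width, height):
    """Row-major frame buffer: one list entry per pixel, all pixels off."""
    return [0] * (width * height)

def set_pixel(fb, width, x, y, value=1):
    fb[y * width + x] = value   # pixel (x, y) lives at offset y * width + x

def scan_out(fb, width):
    """Read the buffer in raster order (left to right, top to bottom) as text rows."""
    rows = [fb[i:i + width] for i in range(0, len(fb), width)]
    return ["".join("#" if bit else "." for bit in row) for row in rows]

fb = make_framebuffer(WIDTH, HEIGHT)
for x in range(WIDTH):
    set_pixel(fb, WIDTH, x, 1)      # draw a horizontal line on row 1
print("\n".join(scan_out(fb, WIDTH)))
```

Unlike a vector display, which draws the line directly, the raster display sees only the pixels that the program has set in memory.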
© C. D. Cantrell (08/2010)
© C. D. Cantrell (08/2012)
ELECTRONIC COMPUTERS — GENERATION 6
• Dominant technologies:
. Processors: Mass-produced microprocessors & embedded processors
. Memory: SRAM, DRAM
. Compilers
. Interpreters and just-in-time compilers
. Reconfigurable architectures
• Concurrent execution
. Multiple cores on one die
. Concurrent execution of multiple processes
. Thread-level parallelism: Multithreading in individual processes
• Challenges going forward:
. Growing disparity between memory and processor speeds
. The power wall
. The scaling wall
© Intel
PENTIUM 4 “WILLAMETTE” CHIP LAYOUT
[Figure: annotated die layout. Labeled blocks: 400 MHz system bus, Advanced Dynamic Execution (A.D.E.), Execution Trace Cache, Advanced Transfer Cache, Enhanced Floating Point & Multimedia, Data Cache, Rapid Execution, Hyperpipeline (20 stages).]
THE AMD “BARCELONA” PROCESSOR
FIGURE 1.9 Inside the AMD Barcelona microprocessor. The left-hand side is a microphotograph of the AMD Barcelona processor
chip, and the right-hand side shows the major blocks in the processor. This chip has four processors or “cores”. The microprocessor in the
laptop in Figure 1.7 has two cores per chip, called an Intel Core 2 Duo. Copyright © 2009 Elsevier, Inc. All rights reserved.
[Figure labels: Core 2, Core 3, Core 4; 2MB shared L3 cache; Northbridge; 512kB L2 cache; L2 cache control; 128-bit FPU; L1 data cache; L1 instruction cache; execution; load/store; fetch/decode/branch; HT PHY links 1-4; DDR PHY; slow I/O; fuses.]
© Intel, Inc.
THE INTEL CORE i7-3960X (Sandy Bridge), 2011
FUNDAMENTAL EQUATION FOR CPU TIME
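• CPU time required to run a program:
$$\text{CPU execution time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock periods}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock period}}$$
© C. D. Cantrell (09/1998)
• $\frac{\text{Instructions}}{\text{Program}}$ is determined by: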
. The instruction set architecture
. The compiler
. The program
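As a quick numerical check of the CPU time equation, the three factors can be multiplied directly. The workload numbers below (instruction count, CPI, clock rate) are invented for illustration and do not describe any machine in these slides.

```python
def cpu_time_seconds(instruction_count, cpi, clock_hz):
    """CPU time = instructions/program * clock periods/instruction * seconds/clock period."""
    seconds_per_clock = 1.0 / clock_hz
    return instruction_count * cpi * seconds_per_clock

# Hypothetical program: 2 billion instructions, average CPI of 1.5, 2.5 GHz clock.
print(cpu_time_seconds(2_000_000_000, 1.5, 2.5e9))
```

Halving any one factor halves the CPU time only if the other two stay fixed, which is why a faster clock that raises CPI may not help.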
TAKE-AWAYS FOR THIS COURSE
• Customers: Measure to buy
• Architects: Measure to design
• Performance is specific to an individual program or program suite
. Total execution time and total energy consumed summarize performance
• For any given architecture, performance improvements come from:
. Improvements in memory-to-processor bandwidth
. Increases in clock frequency (without adverse CPI effects)
. Improvements in processor organization that lower CPI
. Compiler enhancements that lower CPI and/or instruction count
• Pitfall: Expecting improvement in one aspect of a machine’s performance to
affect the total performance proportionately to the single improvement
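The pitfall above is commonly formalized as Amdahl's law: if only a fraction of the execution time benefits from an improvement, the overall speedup is bounded by the part that does not. The sketch below uses made-up numbers, and the function name is an assumption for illustration.

```python
def overall_speedup(fraction_improved, local_speedup):
    """Speedup of the whole run when only part of the time is sped up."""
    return 1.0 / ((1.0 - fraction_improved) + fraction_improved / local_speedup)

# Hypothetical case: 40% of execution time is made 10x faster.
print(round(overall_speedup(0.4, 10.0), 2))
```

Even an unboundedly large local speedup on that 40% could not push the overall speedup past 1 / 0.6, about 1.67.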
© C. D. Cantrell (09/2098)