Computer Organization & Assembly Language...
CSE 2312 Computer Organization & Assembly Language Programming, Spring 2015
CSE 2312
Lecture 4 Quantifying Computer Components
Junzhou Huang, Ph.D.
Department of Computer Science and Engineering
Computer Organization &
Assembly Language Programming
Quantifying Computer Components
• CPU speed
– MHz or GHz clock speed, MIPS, MFLOPS, …
– e.g., 1.33 GHz Intel Atom processor
• Bus speed
– Front Side Bus (FSB), e.g., 533 MHz on the Intel Atom
– Number of channels, number of data paths
• Memory capacity and speed
– Gigabytes; MHz × data rate
– e.g., 166 MHz DDR memory, quad pump
• Disk capacity and bandwidth
– GB, TB, MB/sec
• Power consumption
– Watts, mWatts
– Battery lifetime (standby vs. active), watt-hours
Metric Units
The principal metric prefixes.
Intel Atom
More
• iPhone
– 620 MHz ARM chip
– SIMD, high performance integer CPU (8-stage pipeline, 675 Dhrystone, 2.1 MIPS)
– 16 K/16 K cache
– 0.45 mW/MHz power draw (with cache)
• Wii
– CPU: PowerPC-based "Broadway" processor, 729 MHz
– GPU: ATI "Hollywood" GPU, 243 MHz
More
• iPad
– 1GHz Apple A4
– 16 GB to 64 GB flash storage
– Up to 10 hours of battery life
• IBM ThinkPad T42
– Pentium M Processor 735
– 1.7 GHz
– 512 MB RAM
– Intel® Core™ 2 Duo P8600
– 2.4 GHz / 1066 MHz FSB / 3 MB cache
– 4 GB memory
– 100 GB disk
What is Performance?
• Which airplane has the best performance?
[Figure: four bar charts comparing the Douglas DC-8-50, BAC/Sud Concorde, Boeing 747, and Boeing 777 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph.]
Choosing a Metric
• Which of these are good metrics?
– Energy consumption
– Instructions per second
– CPU utilization
– Execution time
– Cycles per instruction
– Clock rate
• Why?
What is Time?
• Response time
– How long it takes to do a task?
• Throughput
– Total work done per unit time,
– Such as tasks, transactions, … per hour
• How are they affected by
– Replacing the processor with a faster version?
– Adding more processors?
What is Execution Time?
• Elapsed time
– Total response time, including all aspects, such as processing, I/O, OS overhead, idle time
– Determines system performance
• CPU time
– Time spent processing a given job
– Discounts I/O time, other jobs' shares
– User CPU time + system CPU time
– Different programs are affected differently by CPU and system performance
(Slide callout: time spent executing the program's instructions)
CPU Clock
• Every action is driven by a clock in the CPU
• Clock time = 1 / Frequency
– 1 MHz clock: one cycle takes $10^{-6}$ seconds
– 1 GHz clock: one cycle takes $10^{-9}$ seconds
• From the CPU clock speed, you know the time for one clock cycle
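The relationship between clock frequency and cycle time can be checked with a few lines of Python. This is only a minimal sketch; the function name `clock_period_seconds` is made up for illustration, and the 1.33 GHz value is the Atom clock speed quoted earlier in the slides.

```python
def clock_period_seconds(frequency_hz: float) -> float:
    """Return the duration of one clock cycle: period = 1 / frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


# Print the cycle time for a few clock speeds mentioned in the lecture.
for label, hz in [("1 MHz", 1e6), ("1 GHz", 1e9), ("1.33 GHz", 1.33e9)]:
    print(f"{label}: {clock_period_seconds(hz):.3e} s per cycle")
```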
How Long Does An Instruction Take?
• Digital logic is controlled by a clock
• Clock period: duration of a clock cycle
– e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s
• Clock frequency (rate): cycles per second
– e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz
[Figure: clock timing diagram showing clock cycles, with data transfer and computation followed by a state update within each clock period.]
Predicting CPU Time
• Ideal: Only need to know number of instructions
• Reality: Some instructions take longer than others
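Ideal:
\[
\text{CPU Time} = \text{Instructions} \times \text{Clock Cycle Time} = \frac{\text{Instructions}}{\text{Clock Rate}}
\]
Reality:
\[
\text{CPU Time} = \underbrace{\frac{\text{Instructions}}{\text{Program}}}_{\text{Instruction Count}} \times \underbrace{\frac{\text{Clock cycles}}{\text{Instruction}}}_{\text{Cycles per instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}
\]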
Instruction Count and Cycles Per Instruction
• IC determined by program, ISA, and compiler
• CPI determined by CPU and other factors
– Different instructions have different CPI
– Average CPI affected by instruction mix
\[
\text{Clock Cycles} = \text{Instruction Count (IC)} \times \text{Cycles per Instruction (CPI)}
\]
\[
\text{CPU Time} = \text{IC} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{IC} \times \text{CPI}}{\text{Clock Rate}}
\]
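The two forms of the CPU time equation can be sketched in Python as follows. The function names and the sample workload (one billion instructions, CPI of 2.0, 4 GHz clock) are assumptions chosen for illustration, not figures from the lecture.

```python
def clock_cycles(instruction_count: float, cpi: float) -> float:
    """Clock Cycles = IC x CPI."""
    return instruction_count * cpi


def cpu_time_from_cycle_time(instruction_count: float, cpi: float,
                             cycle_time_s: float) -> float:
    """CPU Time = IC x CPI x Clock Cycle Time."""
    return clock_cycles(instruction_count, cpi) * cycle_time_s


def cpu_time_from_clock_rate(instruction_count: float, cpi: float,
                             clock_rate_hz: float) -> float:
    """CPU Time = IC x CPI / Clock Rate."""
    return clock_cycles(instruction_count, cpi) / clock_rate_hz


# Hypothetical workload: 1e9 instructions, CPI 2.0, 4 GHz clock (250 ps cycle).
ic, cpi = 1e9, 2.0
print(cpu_time_from_clock_rate(ic, cpi, 4e9))
print(cpu_time_from_cycle_time(ic, cpi, 250e-12))
```

Both calls describe the same machine, so they should agree up to floating-point rounding.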
Improving CPU Time
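\[
\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}
\]
\[
\text{Clock Cycles} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \text{Instruction Count}_i \right)
\]
\[
\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \underbrace{\frac{\text{Instruction Count}_i}{\text{Instruction Count}}}_{\text{Relative frequency}} \right)
\]
(Slide callout: improving CPU time is usually a tradeoff)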
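The weighted-average CPI can be computed directly from an instruction mix. The sketch below assumes a hypothetical mix of three instruction classes; the CPI values and counts are invented for illustration.

```python
def total_clock_cycles(mix):
    """Sum CPI_i x Instruction Count_i over (cpi, count) pairs."""
    return sum(cpi * count for cpi, count in mix)


def average_cpi(mix):
    """Weight each class CPI by its relative frequency in the mix."""
    total_instructions = sum(count for _, count in mix)
    return sum(cpi * (count / total_instructions) for cpi, count in mix)


# Hypothetical mix: (CPI of the class, number of instructions in that class).
mix = [(1, 60), (4, 30), (2, 10)]
print(total_clock_cycles(mix))
print(average_cpi(mix))
```

Dividing the total clock cycles by the total instruction count gives the same average, which is the first equality in the CPI formula.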
Compiler Matters!
• Suppose the compiler has two choices: a sequence of 5 instructions or a sequence of 6, as described below
• Which is better?
Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    4   1   1
• Sequence 1: IC = 5
– Clock Cycles = 2×1 + 1×2 + 2×3 = 10
– Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 6
– Clock Cycles = 4×1 + 1×2 + 1×3 = 9
– Avg. CPI = 9/6 = 1.5
Sequence 2 takes fewer clock cycles (9 vs. 10) despite its higher instruction count, so it is better.
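The comparison of the two sequences can be reproduced with a short script. The dictionary layout and the name `summarize` are illustrative choices; the CPI values and instruction counts are those from the table on this slide.

```python
# CPI per instruction class and instruction counts per sequence, from the slide.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}
SEQUENCES = {
    "Sequence 1": {"A": 2, "B": 1, "C": 2},
    "Sequence 2": {"A": 4, "B": 1, "C": 1},
}


def summarize(counts):
    """Return (instruction count, clock cycles, average CPI) for one sequence."""
    ic = sum(counts.values())
    cycles = sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())
    return ic, cycles, cycles / ic


for name, counts in SEQUENCES.items():
    ic, cycles, avg_cpi = summarize(counts)
    print(f"{name}: IC={ic}, cycles={cycles}, avg CPI={avg_cpi}")
```

With the same clock on both, the sequence with fewer total clock cycles finishes first, so the cycle count is the number to compare.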
Comparing Performance
• Performance = 1 / Execution Time
• “X is n times faster than Y” means
\[
n = \frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution Time}_Y}{\text{Execution Time}_X}
\]
• Example: time taken to run a program
– 10 s on A, 15 s on B: how much faster is A?
– Execution Time_B / Execution Time_A = 15 s / 10 s = 1.5
– So A is 1.5 times faster than B
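As a quick check of the definition, the ratio can be written as a one-line function. The name `times_faster` is hypothetical; the 10 s and 15 s figures are the ones from the example on this slide.

```python
def times_faster(time_x_s: float, time_y_s: float) -> float:
    """Return n such that X is n times faster than Y.

    n = Performance_X / Performance_Y = Execution Time_Y / Execution Time_X
    """
    return time_y_s / time_x_s


# Slide example: 10 s on A, 15 s on B.
print(times_faster(time_x_s=10, time_y_s=15))
```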
CPI Example
• Computer A: Cycle Time = 250ps, CPI = 2.0
• Computer B: Cycle Time = 500ps, CPI = 1.2
• Same ISA
• Which is faster, and by how much?
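\[
\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}
\]
\[
\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}
\]
A is faster, by this much:
\[
\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2
\]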
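Because both computers run the same ISA, the instruction count I is the same on each and cancels in the ratio, so it is enough to compare the average time per instruction. A minimal sketch, working in picoseconds and using a made-up helper name:

```python
def time_per_instruction_ps(cpi: float, cycle_time_ps: float) -> float:
    """Average time per instruction = CPI x Clock Cycle Time (in picoseconds)."""
    return cpi * cycle_time_ps


# Values from the slide: A has a 250 ps cycle and CPI 2.0; B has 500 ps and CPI 1.2.
per_instr_a = time_per_instruction_ps(2.0, 250)
per_instr_b = time_per_instruction_ps(1.2, 500)

# The instruction count is identical on both machines, so it cancels here.
ratio = per_instr_b / per_instr_a
print(per_instr_a, per_instr_b, ratio)
```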
CPU Example
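• Computer A
– 2 GHz clock, 10 s CPU time
• Let’s design Computer B
– Aim for 6 s CPU time
– A faster clock is possible, but it requires 1.2× as many clock cycles
• How fast must the new clock be?
\[
\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}
\]
\[
\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}
\]
\[
\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}
\]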
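The same derivation can be followed step by step in code. This is a sketch under the slide's stated numbers; the function and parameter names are invented for illustration.

```python
def required_clock_rate_hz(cpu_time_a_s: float, clock_rate_a_hz: float,
                           cycle_factor: float, target_time_b_s: float) -> float:
    """Clock rate B needs in order to hit its target CPU time.

    cycle_factor is how many times more clock cycles B needs than A.
    """
    cycles_a = cpu_time_a_s * clock_rate_a_hz   # Clock Cycles_A = CPU Time_A x Clock Rate_A
    cycles_b = cycle_factor * cycles_a          # B needs cycle_factor times as many cycles
    return cycles_b / target_time_b_s           # Clock Rate_B = Clock Cycles_B / CPU Time_B


# Slide example: A runs at 2 GHz for 10 s; B needs 1.2x the cycles and must finish in 6 s.
rate_b = required_clock_rate_hz(10, 2e9, 1.2, 6)
print(f"{rate_b / 1e9:.1f} GHz")
```

Changing `cycle_factor` and `target_time_b_s` shows how quickly the required clock rate grows when both the cycle count rises and the time budget shrinks.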
Time for a Program
• CPU executes various instructions
• A program has several instructions. How many?
– Depends on the program and the compiler
• Each instruction can take several CPU cycles. How many?
– Depends on the Instruction Set Architecture (ISA)
– ISA: learned in this course
• Each cycle has a fixed time based on the CPU and bus speed. What are the clock time, memory speed, etc.?
– Depends on the hardware and its organization
– Computer architecture: learned in this course
CPU Performance Equation
Performance Summary
• Performance depends on
– Algorithm: affects IC, possibly CPI
– Programming language: affects IC, CPI
– Compiler: affects IC, CPI
– Instruction set architecture: affects IC, CPI, $T_c$ (clock cycle time)
\[
\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}
\]
How to Improve Performance?
We must lower execution time!
• Algorithm
– Determines number of operations executed
• Programming language, compiler, architecture
– Determine number of machine instructions executed per operation (IC)
• Processor and memory system
– Determine how fast instructions are executed (CPI)
• I/O system (including OS)
– Determines how fast I/O operations are executed
Amdahl’s Law
• Improving an aspect of a computer won’t give a proportional improvement in overall performance
• Especially true of multicore computers
• So make the common case fast!
\[
T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}
\]
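A small numeric experiment makes the point about non-proportional improvement concrete. The workload below (100 s total, of which 80 s is affected by the improvement) is a hypothetical example, not one from the lecture.

```python
def improved_time(t_affected_s: float, t_unaffected_s: float,
                  improvement_factor: float) -> float:
    """Amdahl's Law: T_improved = T_affected / improvement factor + T_unaffected."""
    return t_affected_s / improvement_factor + t_unaffected_s


# Hypothetical program: 80 s can be sped up, 20 s cannot.
t_affected, t_unaffected = 80.0, 20.0
t_original = t_affected + t_unaffected

for factor in (2, 4, 16, 1000):
    t_new = improved_time(t_affected, t_unaffected, factor)
    print(f"factor {factor:>4}: {t_new:7.2f} s, overall speedup {t_original / t_new:.2f}x")
```

However large the improvement factor becomes, the total time in this example cannot drop below the 20 s unaffected part, so the overall speedup stays under 5×.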
Exercise 1
• Problem
– There are 3 classes of instructions: A, B, and C. Suppose the compiler has two choices, Sequence 1 and Sequence 2, as described below
• Which one is better? Why?
Class               A   B   C
CPI for class       1   2   3
IC in sequence 1    2   1   2
IC in sequence 2    3   1   1
• Sequence 1: IC = 5
– Clock Cycles = 2×1 + 1×2 + 2×3 = 10
– Avg. CPI = 10/5 = 2.0
• Sequence 2: IC = 5
– Clock Cycles = 3×1 + 1×2 + 1×3 = 8
– Avg. CPI = 8/5 = 1.6
Sequence 2 needs fewer clock cycles (8 vs. 10) for the same instruction count, so it is better.
Exercise 2
• Problem:
– There are two computers: A and B.
– Computer A: Cycle Time = 250 ps, CPI = 2.0
– Computer B: Cycle Time = 400 ps, CPI = 1.5
– If they have the same ISA, which computer is faster?
– How many times faster is it than the other?
• Answer:
– We know that CPU Time = IC × CPI × Cycle Time
– Therefore, CPU Time(A) = IC × 2.0 × 250 ps = 500 ps × IC
– CPU Time(B) = IC × 1.5 × 400 ps = 600 ps × IC
– So A is 600/500 = 1.2 times faster than B.
Exercise 3
• Problem:
– Computer A has a 2 GHz clock. It takes 10 s of CPU time to finish a given task.
– We want to design Computer B to finish the same task within 5 s of CPU time.
– Computer B needs 2 times as many clock cycles as Computer A.
– What clock rate should Computer B be designed for?
• Answer:
\[
\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{2 \times \text{Clock Cycles}_A}{5\,\text{s}}
\]
\[
\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}
\]
\[
\text{Clock Rate}_B = \frac{2 \times 20 \times 10^{9}}{5\,\text{s}} = \frac{40 \times 10^{9}}{5\,\text{s}} = 8\,\text{GHz}
\]
Homework 1
• See webpage
– Chapter 1 in Tanenbaum’s textbook
• Due in class