ECE337 Fall 2009
Chapter 1
Introduction to Computer Architecture and Organization
Architecture & Organization 1
• Architecture is those attributes visible to the programmer (equivalently, those attributes that have a direct impact on the logical execution of a program)
– Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques
– e.g. Is there a multiply instruction?
• Organization is how features are implemented
– Control signals, interfaces, memory technology
– e.g. Is there a hardware multiply unit or is it done by repeated addition?
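The multiply example can be made concrete with a small sketch. The function name below is hypothetical and the code only illustrates the idea: the programmer sees the same result from a multiply (the architecture), whether the machine uses a dedicated multiplier or a loop of additions (the organization).

```python
def multiply_by_repeated_addition(a, b):
    """Multiply two non-negative integers using only addition.

    A stand-in for a machine that has a multiply instruction but no
    hardware multiplier: the result is the same, only slower to produce.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative integers only")
    result = 0
    for _ in range(b):  # add `a` to the running total `b` times
        result += a
    return result


# Same visible result as the built-in multiply, different implementation.
print(multiply_by_repeated_addition(6, 7) == 6 * 7)
```

The number of additions grows with the operand value, which is why the choice of organization affects performance but not program correctness.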
Architecture & Organization 2
• The whole Intel x86 family shares the same basic architecture
• This gives code compatibility
– At least backwards
• Organization differs between versions: same architecture, different organization
Architecture & Organization 3
• Also the Intel Pentium and the AMD Athlon have nearly identical versions of the x86 instruction set architecture, but have totally different internal designs to implement the same architecture.
Why study computer architecture and organization?
• Acquire understanding and appreciation of a computer system’s functional components, their characteristics, their performance and their interactions
• Structure a program that runs more efficiently on a real machine
• Understand the trade-offs among various components in order to select the most cost-effective computer system
(IEEE/ACM Computer Curricula 2001)
Languages, Levels, Virtual (hypothetical) Machines
[Figure: A multilevel model, with languages ranging from primitive, machine-oriented to complex, people-oriented]
Translator and interpreter
• Translator: translates instructions before execution
– e.g. compiler, assembler
• Interpreter: interprets instructions during execution
– e.g. MATLAB software, OS software
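The difference between the two approaches can be sketched on a toy stack language. The instruction names (`push`, `add`, `mul`) and both functions are hypothetical, invented only for this illustration: the interpreter executes each instruction as it reads it, while the translator first converts the whole program into another language (here a Python expression) that is run afterwards.

```python
PROGRAM = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]


def interpret(program):
    """Interpreter: decode and execute each instruction as it is read."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "add" else a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


def translate(program):
    """Translator: turn the whole program into a Python expression first."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(str(arg))
        elif op in ("add", "mul"):
            b, a = stack.pop(), stack.pop()
            symbol = "+" if op == "add" else "*"
            stack.append(f"({a} {symbol} {b})")
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


translated = translate(PROGRAM)   # translation happens once, before execution
print(translated)                 # the equivalent program in the target language
print(eval(translated) == interpret(PROGRAM))
```

Once translated, the program can be run many times without decoding the original instructions again, which is the usual argument for translation over interpretation.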
Contemporary Multilevel Machines
A six-level computer. The support method for each level is indicated below it (along with the name of supporting program).
Annotations on the figure, lowest level first:
• Gates, registers
• Collection of registers, ALU (data path)
• Machine language instruction sets (opcodes)
• Hybrid level: new instruction sets (system calls), memory organization, multiple-process control, etc.
• System programmer
• Application programmer
Computer Generations
• First Generation: Vacuum Tubes (1945 – 1955)
• Second Generation: Transistors (1955 – 1965)
• Third Generation: Integrated Circuits (1965 – 1980)
• Fourth Generation: Very Large Scale Integration (1980 – ?)
Computer History - An Overview
http://library.thinkquest.org/18268/History/hist_m.htm
Vacuum Tubes ENIAC - background
• Electronic Numerical Integrator And Computer
• Eckert and Mauchly
• University of Pennsylvania
• Trajectory tables for weapons
• Started 1943
• Finished 1946
– Too late for war effort
• Used until 1955
ENIAC - details
• Decimal (not binary)
• 20 accumulators of 10 digits
• Programmed manually by switches
• 18,000 vacuum tubes
• 30 tons
• 15,000 square feet
• 140 kW power consumption
• 5,000 additions per second
von Neumann machine
• Stored Program concept
• Main memory storing programs and data
• ALU operating on binary data
• Control unit interpreting instructions from memory and executing them
• Input and output equipment operated by control unit
• Institute for Advanced Study, Princeton
– IAS machine
• Completed 1952
Structure of von Neumann machine
EDSAC
• often described as the first practical stored-program computer
• built in 1949
• could complete 714 operations per second.
IAS machine- details
• Stored Program
• 4096 x 40-bit word memory
– Binary number
– 2 x 20-bit instructions
– A 40-bit signed integer
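A minimal sketch of how one 40-bit word can carry two 20-bit instructions. The split of each instruction into an 8-bit opcode and a 12-bit address follows standard descriptions of the IAS machine and is not stated on the slide, so it is an assumption here; it is at least consistent with the memory size, since 12 address bits select one of 4096 words. The opcode values in the example are made up.

```python
INSTR_BITS = 20
OPCODE_BITS = 8   # assumption: taken from standard IAS descriptions, not the slide
ADDR_BITS = 12    # 2**12 = 4096, matching the 4096-word memory

INSTR_MASK = (1 << INSTR_BITS) - 1
ADDR_MASK = (1 << ADDR_BITS) - 1


def pack_instruction(opcode, address):
    """Build one 20-bit instruction from an opcode and an address."""
    return (opcode << ADDR_BITS) | (address & ADDR_MASK)


def split_word(word):
    """Split a 40-bit word into its left and right 20-bit instructions."""
    left = (word >> INSTR_BITS) & INSTR_MASK
    right = word & INSTR_MASK
    return left, right


def decode(instruction):
    """Return (opcode, address) for a 20-bit instruction."""
    return instruction >> ADDR_BITS, instruction & ADDR_MASK


# Hypothetical opcodes 0x01 and 0x05, used only to exercise the bit layout.
word = (pack_instruction(0x01, 0x0FA) << INSTR_BITS) | pack_instruction(0x05, 0x0FB)
left, right = split_word(word)
print(decode(left), decode(right))
```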
Transistors
• Replaced vacuum tubes
• Smaller
• Cheaper
• Less heat dissipation
• Solid State device
• Made from Silicon
• Invented 1947 at Bell Labs
• William Shockley et al.
TX-0 (1956)
• the first general-purpose, programmable computer built with transistors.
• created by MIT researchers.
PDP-1(1960)
• The PDP-1 sold for $120,000.
• MIT students wrote Spacewar!, one of the first video games, for it.
• A total of 50 were built.
Integrated Circuit - Microelectronics
• Literally - “small electronics”
• A computer is made up of gates, memory cells and interconnections among them
• These can be manufactured on a semiconductor
– e.g. silicon wafer
IBM 360 (1964)
• IBM 360 with different models
• a great variety of combinations of speed, memory, and power.
• all the models were compatible
• within two years, IBM was receiving 1,000 orders each month.
Very Large Scale Integration
• 1981 - IBM PC
IBM's first PC ran on Intel's 4.77 MHz 8088 microprocessor. It came with Microsoft's MS-DOS operating system.
IBM PS/2 (1987)
• IBM's PS/2 was based on Intel's 80386 chip. At the same time, IBM introduced OS/2
• More than 1 million machines were sold by the end of the year.
Computer Generation
• Vacuum tube - 1946-1957
• Transistor - 1958-1964
• Small scale integration - 1965-1971
– Up to 100 devices on a chip
• Medium scale integration - 1965-1971
– 100-3,000 devices on a chip
• Large scale integration - 1971-1979
– 3,000-100,000 devices on a chip
• Very large scale integration - 1980-1991
– 100,000-100,000,000 devices on a chip
• Ultra large scale integration - 1991 onward
– Over 100,000,000 devices on a chip
Moore’s Law
• Gordon Moore, co-founder of Intel
• Number of transistors on a chip was doubling every year
• Since the 1970s development has slowed a little
– Number of transistors doubles every 18 months
• Cost of a chip has remained almost unchanged
• Higher packing density means shorter electrical paths, giving higher speed
• Smaller size gives increased flexibility for usage environment
• Reduced power and cooling requirements
• Fewer inter-chip connections increase reliability (on-chip interconnections are more reliable than solder connections)
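The doubling rule can be turned into simple arithmetic: with a doubling period of d years, the count grows by a factor of 2^(t/d) after t years. The sketch below assumes ideal, uninterrupted doubling; the function names and the starting count of 1,000 transistors are hypothetical.

```python
def growth_factor(years, doubling_period_years):
    """Factor by which the transistor count grows after `years`."""
    return 2 ** (years / doubling_period_years)


def projected_count(initial_count, years, doubling_period_years):
    """Projected transistor count, assuming ideal exponential growth."""
    return initial_count * growth_factor(years, doubling_period_years)


# Doubling every 18 months (1.5 years): 15 years is 10 doublings.
print(growth_factor(15, 1.5))
print(projected_count(1_000, 15, 1.5))
```

Small changes in the doubling period compound quickly, which is why the difference between 12, 18, and 24 months matters over a decade.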
Growth in CPU Transistor Count
The Computer Spectrum
A rough categorization of current computers
Better Performance Design: Speeding Microprocessor Up
• Pipelining
• On board cache
• Branch prediction
• Data flow analysis
• Speculative execution
Computer Architecture Challenge: Performance Balance
• Processor speed increased
• Memory capacity increased
• Memory speed lags behind processor speed
Interface mismatch between processor and main memory
Logic(CPU) and Memory Performance Gap
Solutions
• Increase number of bits retrieved at one time
– Make DRAM “wider”
• Reduce frequency of memory access
– More complex cache and cache on chip
• Increase interconnection bandwidth
– High speed buses
– Hierarchy of buses: the processor bus, the cache bus, the memory bus, the local I/O bus (VLB, PCI), the standard I/O bus (ISA)
Typical I/O Device Data Rates
Improvements in Processor Chip
• Increase hardware speed of processor
– Fundamentally due to shrinking logic gate size: more gates, packed more tightly, increasing clock rate; propagation time for signals reduced
• Increase size and speed of caches
– Dedicating part of processor chip to cache, so cache access times drop significantly
• Change processor organization and architecture
– Increase effective speed of execution
– Parallelism (instruction-level)
Problems with Clock Speed and Logic Density
• Power
– Power density increases with density of logic and clock speed
– Dissipating the resulting heat becomes difficult
• RC delay (the signal delay of a wire)
– Speed at which electrons flow is limited by the resistance and capacitance of the metal wires connecting the transistors
– Delay increases as RC product increases
– Wire interconnects thinner, increasing resistance
– Wires closer together, increasing capacitance
• Solution:
– More emphasis on organizational and architectural approaches
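A rough sketch of why thinner wires hurt. It uses the textbook relation R = resistivity x length / cross-sectional area and treats delay as simply proportional to R x C; real interconnect models are more detailed. The function names and all numeric values are illustrative assumptions, not data from the slides.

```python
def wire_resistance(resistivity, length, width, thickness):
    """Resistance of a rectangular wire: R = resistivity * length / area."""
    return resistivity * length / (width * thickness)


def rc_delay(resistance, capacitance):
    """Delay figure taken as the RC product (a first-order approximation)."""
    return resistance * capacitance


# Illustrative unit-free numbers: same wire, then half the width.
wide = wire_resistance(resistivity=2.0, length=10.0, width=4.0, thickness=1.0)
narrow = wire_resistance(resistivity=2.0, length=10.0, width=2.0, thickness=1.0)

capacitance = 3.0
print(rc_delay(wide, capacitance), rc_delay(narrow, capacitance))
```

Halving the width doubles the resistance and therefore the RC product, and any increase in capacitance from tighter wire spacing multiplies on top of that.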
Solution 1: Increased Cache Capacity
• Typically two or three levels of cache between processor and main memory
• Chip density increased
– More cache memory on chip
• Faster cache access
• Pentium chip devoted about 10% of chip area to cache
• Pentium 4 devotes about 50%
Solution 2: More Complex Execution Logic
• Enable parallel execution of instructions
• Pipeline works like assembly line
– Different stages of execution of different instructions at same time along pipeline
• Superscalar allows multiple pipelines within single processor
– Instructions that do not depend on one another can be executed in parallel
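The assembly-line idea can be quantified with an idealized cycle count. The sketch assumes every stage takes one cycle, there are no stalls or hazards, and (for the superscalar case) all instructions are independent; the function names are hypothetical.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """Fill the pipeline once, then finish one instruction per cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1


def cycles_superscalar(n_instructions, n_stages, n_pipelines):
    """Ideal case: independent instructions issue n_pipelines at a time."""
    if n_instructions == 0:
        return 0
    groups = -(-n_instructions // n_pipelines)  # ceiling division
    return n_stages + groups - 1


n, k = 100, 5
print(cycles_unpipelined(n, k), cycles_pipelined(n, k), cycles_superscalar(n, k, 2))
```

For long instruction streams the pipelined speedup approaches the number of stages; dependencies between instructions reduce it in practice.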
Solution 3: Processor-level Parallelism – Multiple Cores
• Multiple processors on single chip
– Large shared cache
• If software can use multiple processors, doubling number of processors almost doubles performance
• Use two simpler processors on the chip rather than one more complex processor
• With two processors, larger caches are justified
– Power consumption of memory logic is less than that of processing logic
• Example: IBM POWER4
– Two cores based on PowerPC
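The "almost doubles" qualifier can be illustrated with Amdahl's law, which the slides do not name; it is brought in here as an assumption about how to model the claim. If a fraction p of the work can run in parallel on n processors, the speedup is 1 / ((1 - p) + p / n). The function name and the sample fractions are hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speedup from n processors when only part of the work is parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Two cores: the speedup approaches 2 only as the parallel fraction nears 1.
for p in (0.5, 0.9, 1.0):
    print(p, amdahl_speedup(p, 2))
```

This is why the bullet is conditional on software being able to use multiple processors: the serial part of a program caps the benefit.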
Pentium Evolution (CISC) (1)
• 8080
– first general purpose microprocessor
– 8 bit data path
– Used in first personal computer
• 8086
– much more powerful
– 16 bit
– instruction queue, prefetch a few instructions
– 8088 (8 bit external bus) used in first IBM PC
• 80286
– 16 Mbyte memory addressable instead of just 1 Mbyte
• 80386
– 32 bit (IA-32 architecture, also called x86 or x86-32)
– Support for multitasking
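The jump from 1 Mbyte to 16 Mbyte on the 80286 follows from the number of address lines: n address lines select 2^n bytes. The line counts used below (20 for the 8086/8088, 24 for the 80286, 32 for the 80386) are well-documented figures but are not on the slide, so treat them as added context; the function name is hypothetical.

```python
def addressable_bytes(address_lines):
    """Number of distinct byte addresses selectable with n address lines."""
    return 2 ** address_lines


MBYTE = 2 ** 20

# Address-line counts are added context, not taken from the slide.
for name, lines in (("8086/8088", 20), ("80286", 24), ("80386", 32)):
    print(name, lines, addressable_bytes(lines) // MBYTE, "Mbyte")
```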
Pentium Evolution (2)
• 80486
– sophisticated powerful cache and instruction pipelining
– built in math co-processor, offloading complex math operations from the main CPU
• Pentium
– Superscalar
– Multiple instructions executed in parallel
• Pentium Pro
– Increased superscalar organization
– Aggressive register renaming
– branch prediction
– data flow analysis
– speculative execution
Pentium Evolution (3)
• Pentium II
– MMX technology (graphics, video & audio processing)
• Pentium III
– Additional floating point instructions for 3D graphics
• Pentium 4
– Further floating point and multimedia enhancements
• Itanium
– 64 bit
– IA-64 Architecture
• Itanium 2
– Hardware enhancements to increase speed
• See Intel web pages for detailed information on processors
Intel Computer Family
Moore’s law for (Intel) CPU chips.
PowerPC (RISC)
• 1975: IBM 801 minicomputer project (RISC)
• 1986: RT PC, IBM's first commercial RISC workstation product (2 MIPS)
• 1990: IBM RISC System/6000
– RISC-like superscalar machine
– referred to as POWER architecture
• IBM alliance with Motorola (68000 microprocessors) and Apple (used 68000 in Macintosh)
• Result is PowerPC architecture
– Derived from the POWER architecture
– Superscalar RISC
– Used in Apple Macintosh
– Embedded chip applications
PowerPC Family (general-purpose) (1)
• 601 (G1):
– Designed to bring the PowerPC architecture to market quickly
– 32-bit machine
• 603 (G2):
– Low-end desktop and portable computers
– 32-bit
– Comparable performance with 601
– Lower cost and more efficient implementation
• 604 (G2):
– Desktop and low-end servers
– 32-bit machine
– Much more advanced superscalar design and greater performance
• 620 (G2):
– High-end servers
– 64-bit architecture
PowerPC Family (2)
• 740/750 (also known as the G3 processor):
– Two levels of cache on chip
• G4:
– Increases parallelism and internal speed
• G5:
– Improvements in parallelism and internal speed
– 64-bit organization
Internet Resources
• http://www.intel.com/
– Search for the Intel Museum, click on online exhibit
• http://www.ibm.com
• PowerPC (IBM, Motorola)
• Intel Developer Home, http://developer.intel.com/design/index.htm
Example Computer Families
• Pentium 4 by Intel (CISC): von Neumann architecture
• UltraSPARC III by Sun Microsystems (RISC): von Neumann architecture
• The 8051 chip by Intel, used for embedded systems: Harvard architecture
Harvard architecture vs. von Neumann architecture
• Harvard architecture
– Separate instruction and data memory
– Can read an instruction and access data memory at the same time
• Von Neumann architecture
– One memory stores both instructions and data
– Instruction fetch and data access cannot happen at the same time because they share the same bus system
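A toy cycle count makes the contrast concrete. The model is an assumption made purely for illustration: every instruction fetch and every data access takes one memory cycle, with no caches or pipelining. A program is represented as a list of booleans, `True` meaning the instruction also accesses data memory. Function names are hypothetical.

```python
def von_neumann_cycles(program):
    """Shared bus: fetches and data accesses must take turns."""
    fetches = len(program)
    data_accesses = sum(program)  # True counts as 1
    return fetches + data_accesses


def harvard_cycles(program):
    """Separate buses: a data access overlaps with the next instruction fetch."""
    if not program:
        return 0
    # Only the final instruction's data access has no following fetch to hide behind.
    return len(program) + (1 if program[-1] else 0)


# Four instructions; the first, third and fourth access data memory.
program = [True, False, True, True]
print(von_neumann_cycles(program), harvard_cycles(program))
```

Under these assumptions the gap grows with the share of instructions that touch data memory.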
Reading Assignment
Read Chapter 1 and review Appendix A & B