Computing for the Near and Long Term Haldun Hadimioglu Spring 2010 CS/EE 1012.
Computing for the Near and Long Term
Haldun Hadimioglu
Spring 2010
CS/EE 1012
Spring 2010
CS/EE1012 Introduction to Computer Engineering
Page 2
Outline
What has happened ?
Designing chips
Near future directions
Long term directions
Conclusions
Intel Eight-Core Xeon die with 2.3 billion transistors
Cray Jaguar Supercomputer, the fastest computer in the world
What has Happened ?
Moore’s Law has been holding since the 1960s
It will continue to hold
Perhaps at a slower rate of doubling every three years
Figure from : www.ieee.org
Smaller transistors are susceptible to alpha particles !
We will have very small transistors !
More transistors will be defective !
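The difference between doubling rates is easier to see with numbers. The sketch below is only an illustration: it starts from the 2.3 billion transistors quoted for the 2010 eight-core Xeon, and it assumes a two-year period as the "usual" rate to compare against the slower three-year rate mentioned above. The function name and the chosen years are hypothetical.

```python
def projected_transistors(start_count, start_year, target_year, doubling_years):
    """Project a transistor count, assuming one doubling every `doubling_years` years."""
    elapsed = target_year - start_year
    return start_count * 2 ** (elapsed / doubling_years)


# Starting point taken from the slides: 2.3 billion transistors in 2010.
START_COUNT = 2.3e9
START_YEAR = 2010

for year in (2012, 2016, 2020):
    # Assumption: two years is the "usual" doubling period, three years the slower one.
    fast = projected_transistors(START_COUNT, START_YEAR, year, 2)
    slow = projected_transistors(START_COUNT, START_YEAR, year, 3)
    print(f"{year}: {fast / 1e9:6.1f} billion (2-year doubling), "
          f"{slow / 1e9:6.1f} billion (3-year doubling)")
```

After ten years the two assumptions differ by roughly a factor of three, which is why the doubling period matters so much for long-range predictions.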
Intel’s Past Microprocessor Roadmap
Intel eight-core Xeon processor (>26MB cache), 2010 : 2,300,000,000 transistors
Intel 1.01 TFLOP, 100 million transistor, 62-Watt, 80-core die, each core at 3.16GHz
Intel Eight-Core Xeon 7500 die with 2.3 billion transistors
Power Density was Increasing Exponentially!
Figure: power density in Watts/cm2 (log scale from 1 to 1000) for i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III and Pentium® 4, compared with a hot plate, a nuclear reactor and a rocket nozzle
Courtesy : “New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies” – Fred Pollack, Intel Corp. Micro32 conference keynote - 1999. Courtesy Avi Mendelson, Intel.
Power was doubling every 4 years
Microprocessor speed
Every two years the speed of microprocessors doubles
The processor speed increases 50% a year !
But, memory speed increases 10% a year !
Microprocessor speed for an application depends on
Number of operations in the application (lower better)
The quality of the code
Number of parallel operations performed (higher better)
Do more operations in parallel
How fast each operation is performed (higher better)
Because of Moore’s Law : transistors are smaller and wires are shorter
Clock frequency is increased
Until 2005 increasing the clock frequency was the main way to increase the speed
Power consumption (heat generation) increases with the frequency
The chip has to be cooled by using a heat sink or a fan or a liquid
Since 2005 power consumption has changed the way to increase speed
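The three factors listed on this slide can be combined into a very rough timing model. This is a sketch under strong assumptions, not a real performance model: it assumes every clock cycle completes a fixed number of operations, and the function name and the sample numbers are made up for illustration.

```python
def execution_time(num_operations, parallel_ops_per_cycle, clock_hz):
    """Seconds needed, assuming each cycle completes `parallel_ops_per_cycle` operations."""
    cycles = num_operations / parallel_ops_per_cycle
    return cycles / clock_hz


# Hypothetical baseline: one billion operations, one per cycle, at 2 GHz.
baseline = execution_time(1e9, 1, 2e9)

# Three independent ways to halve the run time:
better_code = execution_time(0.5e9, 1, 2e9)   # fewer operations (better code)
more_parallel = execution_time(1e9, 2, 2e9)   # more operations in parallel
faster_clock = execution_time(1e9, 1, 4e9)    # higher clock frequency

print(baseline, better_code, more_parallel, faster_clock)
```

In this simplified model each change gives the same speedup, but only the last one raises the clock frequency and therefore the power consumption.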
Multi-Core Microprocessors
Since 2005 microprocessor speed increase depends on
Number of operations in the code (the quality of the code)
Number of parallel operations performed
Dual-core microprocessors with reduced frequency consume less power (generate less heat)
Two/Four/Eight cores perform more operations in parallel
The speed increase continues into the future with more cores on chip
Clock frequency
Number of cores per chip doubles every two years
The memory can become a bottleneck
The memory speed increases 10% a year
More cores increase the demand on the memory
The memory wall problem
Parallel Programming has to be improved dramatically
Parallel programming wall
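The memory wall follows directly from the two growth rates quoted in these slides: 50% a year for processors and 10% a year for memory. The sketch below compounds both rates from the same starting point. The function name is hypothetical and the model ignores caches and everything else that real systems use to hide the gap.

```python
def speed_gap(years, processor_growth=0.50, memory_growth=0.10):
    """Ratio of processor speed to memory speed after `years`, both starting at 1."""
    processor = (1 + processor_growth) ** years
    memory = (1 + memory_growth) ** years
    return processor / memory


for years in (1, 5, 10):
    print(f"after {years:2d} years the processor/memory speed ratio is {speed_gap(years):.1f}")
```

With these rates the gap grows to a bit more than 20 times in ten years, before counting the extra demand from additional cores.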
Designing Chips
We have been using hardware description languages (HDLs) to design chips
We write an HDL program to design a chip !
Just like we draw a schematic to design a chip
Why an HDL program, why not schematics ?
Real life circuits are too complex to be designed by schematics
There are two popular HDLs today
VHDL
Verilog HDL
Knowing one HDL helps one learn another HDL faster
Why HDLs ?
Software : Statements are executed sequentially
The sequence of statements is significant, since they are executed in that order
Java, C++, C, Ada, Pascal, Fortran,…
Hardware : Events happen concurrently
A software language cannot be used for describing and simulating hardware
Concurrent software languages cannot be used either
Because we do not have powerful tools
Programs in C/C++ etc. will be used to design chips in the future
It is already done for C and C++ programs in limited cases
First they are converted to HDL programs and then to hardware
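A small example may help show why statement order matters in software but not in hardware. The sketch below is a Python imitation, not HDL: `run_concurrent` mimics concurrent hardware updates by reading all current values first and only then updating, which is a simplification of how HDL simulators treat signals. Both function names are hypothetical.

```python
def run_sequential(a, b):
    """Software style: each statement sees the result of the previous one."""
    a = b
    b = a  # a was already overwritten, so b keeps its own value
    return a, b


def run_concurrent(a, b):
    """Hardware style: all right-hand sides are sampled first, then all targets update together."""
    next_a = b
    next_b = a  # still the old value of a
    return next_a, next_b


print(run_sequential(0, 1))  # (1, 1): the original value of a is lost
print(run_concurrent(0, 1))  # (1, 0): the two values are swapped
```

The same two assignments give different results, which is one reason a hardware description needs a language with concurrent semantics.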
Full Adder VHDL Program
Data-flow description of the Full Adder circuit :
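Figure: Full Adder block with inputs ki, mi, ci and outputs si, co
$s_i = \bar{k}_i \bar{m}_i c_i + \bar{k}_i m_i \bar{c}_i + k_i \bar{m}_i \bar{c}_i + k_i m_i c_i$
$c_o = k_i m_i + k_i c_i + m_i c_i$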
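The data-flow equations can be checked exhaustively, since a full adder has only eight input combinations. The sketch below assumes the sum si is the usual four-term sum of products (true when an odd number of inputs is 1) and takes the carry co as written above. It is a Python check of the logic, not the VHDL program from the slide, and the function name is hypothetical.

```python
from itertools import product


def full_adder(k, m, c_in):
    """Evaluate the sum-of-products full adder equations on 0/1 inputs."""
    nk, nm, nc = 1 - k, 1 - m, 1 - c_in  # complemented inputs
    s = (nk & nm & c_in) | (nk & m & nc) | (k & nm & nc) | (k & m & c_in)
    c_out = (k & m) | (k & c_in) | (m & c_in)
    return s, c_out


for k, m, c_in in product((0, 1), repeat=3):
    s, c_out = full_adder(k, m, c_in)
    # The two output bits together must equal the arithmetic sum of the inputs.
    assert 2 * c_out + s == k + m + c_in
    print(k, m, c_in, "->", s, c_out)
```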
IBM dual-core BlueGene/L microprocessor die & its chip (© IBM)
VHDL Details : 3-to-8 Decoder
3-to-8 Decoder VHDL Program
Entity Part :
Figure: 3-to-8 decoder (3-to-8DCD) symbol for V74x138 with inputs A0, A1, A2, G1, G2A_L, G2B_L and outputs Y_L0 to Y_L7
3-to-8 Decoder VHDL Program
All statements happen concurrently
Architecture Part :
Figure: 3-to-8 decoder (3-to-8DCD) symbol for V74x138 with inputs A0, A1, A2, G1, G2A_L, G2B_L and outputs Y_L0 to Y_L7
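The VHDL listing itself did not survive in this transcript, so the sketch below only models the behavior the port names suggest. It assumes the standard 74x138 convention: the decoder is enabled when G1 is 1 and both G2A_L and G2B_L are 0, and the selected output is driven low while all others stay high. The function name and argument order are hypothetical.

```python
def decoder_74x138(g1, g2a_l, g2b_l, a2, a1, a0):
    """Return the eight active-low outputs [Y_L0, ..., Y_L7] as 0/1 values."""
    # Assumption: standard 74x138 enable condition (G1 high, G2A_L and G2B_L low).
    enabled = g1 == 1 and g2a_l == 0 and g2b_l == 0
    selected = 4 * a2 + 2 * a1 + a0
    # Outputs are active low: only the selected output goes to 0, and only when enabled.
    return [0 if enabled and i == selected else 1 for i in range(8)]


print(decoder_74x138(1, 0, 0, 1, 0, 1))  # enabled, address 5: only Y_L5 is 0
print(decoder_74x138(0, 0, 0, 1, 0, 1))  # disabled: all outputs stay 1
```

Every output depends only on the current inputs, which matches the slide's point that all statements in the architecture happen concurrently.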
Near Future Directions
Double number of cores every two years
Make sure to handle errors due to
Alpha particles
Defective transistors
Make sure to improve
Parallel Programming
Make sure to handle
Memory Wall
Power Wall
Near Future Directions
HPC Wire, December 4, 2009
September 1, 2009, http://www.arstechnica.com
The IBM Power7 chips are implemented in a 45 nanometer copper/SOI process and have 1.2 billion transistors with eight cores on a single die. The Power7 core has 32KB of L1 instruction cache and 32KB of L1 data cache. Each core sports simultaneous multithreading that delivers four virtual threads per core, and has 256KB of L2 cache tightly coupled to it. The chip also has 32MB of embedded DRAM that acts as a shared L3 cache, with 4 MB segments affiliated with each of the eight cores. The Power7 chip has two dual-channel DDR3 memory controllers implemented on the chip, which deliver 100 GB/sec of sustained bandwidth per chip.
http://www.theregister.co.uk, November 27, 2009
Intel Unveils 48-Core Research Chip
On Wednesday Intel shifted its Tera-scale Computing Research Program into second gear by demonstrating a 48-core x86 processor. The company is intending to use the new chip as a research platform for the purpose of lighting a fire under many-core computing.
According to Intel, the new chip boasts 1.3 billion transistors and is built on 45nm CMOS technology. Its distinction is that it contains the largest number of Intel Architecture (IA) cores ever assembled on a single microprocessor. As such, it represents the sequel to Intel's 2007 "Polaris" 80-core prototype that was based on simple floating point units. While the latter chip was said to reach 2 teraflops, the company is not talking about performance for the 48-core version.
Intel & IBM Vision for Next 5-8 Years
Figures from :
From Intel
www.anandtech.com
Intel Technology Journal, November 2005
Scalable High Performance Main Memory System Using PCM Technology, Moinuddin K. Qureshi, et al., ISCA 2009
IBM
Intel
Near Future Directions : Next 5-8 Years
Applications
Intel : Recognition, Mining, Synthesis as platform 2015 Workload Model (on massively parallel core chips)
IBM : Presence information, knowing where people and things are and how to best match them, people are sensorized
Microsoft : Intention machine, computer predicts user intentions and delivers useful information
CMU : Computational thinking, computer science based approach to solving problems, designing systems, understanding human behavior
Traditional computing will continue
A C/C++/Java program for an application becomes Software
A compiler generates the machine language program file
A new type of computing
A C/C++/Java program for an application becomes Hardware
A hardware compiler generates the transistor circuit
The result is a custom chip
Near Future Directions : New Computing Types ?
Any other new possibility ?
A C/C++/Java program for an application becomes Hardware
A CAD tool generates the bit file to reconfigure the FPGA
An FPGA chip is a hardware programmable chip
The chip emulates the circuit designed
The bit file configures the chip
The CS 2204 Digital Logic Lab uses FPGAs !
There can be more opportunities with FPGA chips !
FPGAs are increasingly used in commercial products !
FPGAs are becoming cost competitive with microprocessors
FPGAs are becoming speed competitive with custom chips
FPGAs are used for applications where speed and programmability matter
Latest FPGAs also have microprocessor cores
They can run software as well
The application can be divided into software and hardware
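The idea that a bit file makes one chip emulate different circuits can be illustrated with a look-up table, the basic logic element in most FPGAs. The sketch below is a toy model, assuming a 3-input table whose configuration bits are simply a truth table. The class name, the constants and the bit ordering are hypothetical, and real bit files also configure routing and much more.

```python
class LookUpTable:
    """A k-input look-up table: its configuration bits are a truth table."""

    def __init__(self, num_inputs):
        self.num_inputs = num_inputs
        self.bits = [0] * (2 ** num_inputs)

    def configure(self, bits):
        """Load new configuration bits, one per input combination."""
        if len(bits) != 2 ** self.num_inputs:
            raise ValueError("expected one configuration bit per input combination")
        self.bits = list(bits)

    def evaluate(self, *inputs):
        """Use the 0/1 inputs as an address (first input is the most significant bit)."""
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        index = 0
        for value in inputs:
            index = (index << 1) | value
        return self.bits[index]


# Two hypothetical "bit files" for the same 3-input table.
MAJORITY = [0, 0, 0, 1, 0, 1, 1, 1]  # carry output of a full adder
PARITY = [0, 1, 1, 0, 1, 0, 0, 1]    # sum output of a full adder

lut = LookUpTable(3)
lut.configure(MAJORITY)
carry = lut.evaluate(1, 1, 0)
lut.configure(PARITY)  # same hardware, reconfigured for a different function
total = lut.evaluate(1, 1, 0)
print(carry, total)  # 1 0
```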
Near Future Directions : New Computing Types
A C/C++/Java program becomes part software and part hardware
FPGA with cores and reconfigurable areas runs applications
Software is run by processor cores and hardware is in the reconfigurable area
When such an FPGA runs an application, some operations are in hardware and simultaneously some operations are in software
Software tools (compilers) and CAD tools must merge
Reconfigurable areas & cores allow recovering from errors due to
Alpha particles
Defective transistors
Processor core to run software
Reconfigurable area to do operations in hardware
These FPGAs are available now but we need much better tools
Near Future Directions : Hybrid Switching Elements
CMOL : A circuitry composed of CMOS and nanodevices
A closer look at FPGA-like reconfigurable logic circuits
Interface between CMOS and nanodevices
Two CMOS cells and a nanodevice
A larger view of FPGA-like reconfigurable logic circuits
Figures from : Konstantin K. Likharev
Near Future Directions : Possible New Structures
Microelectromechanical systems, MEMS, with computing elements
Microembedded systems
Smart Dust at UC Berkeley
Microbiolab on a chip
Sometimes referred to as a biochip !
Other structures that can be used for a number of different applications with or without computing elements
Microcameras
Microsensors
Micromirrors
Micromotors
Microlenses
An all-optical computing chip with
Micromirrors
Microlenses
Bio MEMS
The Biochip Group at Mesa+, University of Twente, Holland
Near Future Directions : Year 2020
SEMATECH : consortium of semiconductor manufacturers from America, Asia and Europe.
SEMATECH predictions for year 2020 (from its 2009 Update of International Technology Roadmap for Semiconductors, ITRS, study) :
Clock speed : 12 GHz
Number of transistors on a microprocessor chip : 35 Billion
32Gbit DRAM chips
Process length : 14 nm
http://www.sematech.org
Make sure to handle errors due to
Alpha particles
Defective transistors
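The 2020 transistor prediction can be turned into an implied doubling period. The sketch below assumes smooth exponential growth from the 2.3 billion transistors of the 2010 eight-core Xeon to the predicted 35 billion in 2020. The function name is hypothetical, and comparing a shipping processor against a roadmap figure is itself only a rough check.

```python
import math


def implied_doubling_years(start_count, end_count, years):
    """Doubling period that turns `start_count` into `end_count` over `years` years."""
    doublings = math.log2(end_count / start_count)
    return years / doublings


# 2.3 billion transistors in 2010, 35 billion predicted for 2020.
period = implied_doubling_years(2.3e9, 35e9, 2020 - 2010)
print(f"{period:.1f} years per doubling")
```

Under these assumptions the prediction corresponds to a doubling roughly every two and a half years.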
Long Term Directions : Possible New Structures
Nanotechnology
Programmable materials
NEMS
Bio NEMS
Nano medicine
Drug delivery
Smart diagnosis
Nanocomputing
1 Watt supercomputer
Quantum computing
Molecular computing
Molecular self assembly
Testing of molecular structures
Adaptive molecular structures
Merger of bio and non-bio structures
Synthetic biology
Figure from : www.ibm.com
IBM Blue Gene/L molecular dynamics demo
Long Term Directions : 2020 and Beyond
Many interconnected varying-size computing elements using each other’s results autonomously
Ubiquitous computing with little human intervention
Cloud computing to nano computing
Personal agents
Intelligent spaces
Nano medicine
Targeted drug delivery
We need
Self-healing, adaptive, self managing, trustworthy, dependable hardware and software
Efficient parallel processing
New computational models
New programming languages
Hardware and software reliability
Figure from : www.uky.edu
Long Term Directions : 2020 and Beyond
Will hardware and software be developed separately like today ?
How will software be developed for nano systems ?
Quantum software ?
Molecular software ?
Biosoftware ?
How will hardware be developed for nano systems ?
VHDL or Verilog HDL or C or C++ or ?
Iron atoms on copper with electron movement
Developing tools is critical
Simulation of protein molecules folding on a supercomputer
Long Term Directions : 2020 and Beyond
By 2019 a $1000 computer will match the processing power of the human brain
Raymond Kurzweil, KurzweilAI.net, 9/1/1999
His keynote speech at the Supercomputing Conference (SC06) in November 2006
The title of his talk is “The Coming Merger of Biological and Non-Biological Intelligence”
Singularity point ?
Brain downloads possible by 2050
Ian Pearson, Head of British Telecom’s futurology unit, CNN.com, 5/23/2005
Computers will be used as virtual brain extensions ?
Direct brain - Internet link ?
Long Term Directions
Hans Moravec, 1998
Many ethical issues will be facing you ! Being prepared will help !
Conclusions
Digital Logic evolution will continue :
Faster, cheaper, smaller, lighter, less power consuming, higher reliability digital products
Due to converging research in various areas :
Mathematics
Computer Science
Computer Engineering
Electrical Engineering
Mechanical Engineering
Physics
Chemistry
Material Science
Biology ?
There will be many ethical issues
Try to prepare ! Try to be informed !