EE141
1
EE141 EECS141 1 Lecture #30
EE141 EECS141 2 Lecture #30
Great job on projects
Final next week Friday 5-8m in 166 Burrows Covers all material
Homework 10 (arithmetic) and it solutions posted this weekend
Intended for practice only
Not graded
EE141
2
EE141 EECS141 3 Lecture #30
EE141 EECS141 4 Lecture #30
EE141
3
EE141 EECS141 5 Lecture #30
EE141 EECS141 6 Lecture #30
EE141
4
EE141 EECS141 7 Lecture #30
EE141 EECS141 8 Lecture #30
Optimization constraints different than in
binary adder
Once again:
– Need to identify critical path
– And find ways to use parallelism to reduce it
Other possible techniques
Logarithmic versus linear (Wallace Tree Mult)
Data encoding (Booth)
Pipelining
First glimpse at system level optimization
EE141
5
EE141 EECS141 9 Lecture #30
EE141 EECS141 10 Lecture #30
Technology
Node (nm)
45 32 22 16 11 8
Integration
Capacity (BT)
8 16 32 64 128 256 512 1024
Delay Scaling >0.7 ~1?
Energy Scaling ~0.5 >0.5
Transistors Planar 3G, FinFET
Variability High Extreme
ILD ~3 towards 2
RC Delay 1 1 1 1 1 1
Metal Layers 8-9 0.5 to 1 Layer per generation
THE OPTIMISTIC PERSPECTIVE
EE141
6
EE141 EECS141 11 Lecture #30
Energy!
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
0.11
0.12
20 30 40 50 60 70 80 90
Technology node (nm)
EO
P (
fJ)
Silicon Process Technology
Co
st
cost per
transistor
product
cost
Further scaling
is not profitable
Cost
Size
EE141 EECS141 12 Lecture #30
EE141
7
Infrastructural
core
Sensory
swarm
Mobile
access
THE IT PLATFORM OF THE NEXT DECADE
TRILLIONS OF
CONNECTED DEVICES [J. Rabaey, ASPDAC’08]
It Is All About Energy …
EE141
8
Data and Compute Centers “The IT workhorses”
Doing Nothing
Well!
Major Opportunity is in Power Management
Requires Top-Down System Level Solution
[Barroso, Holzle, 20
Mobiles “The home of the user interface” Most “tricks” already in use! (multi-core,
heterogeneity, accelerators, SoC, …)
Mobile μProc Anno 2015 [Courtesy A. Peleg, Intel]
Opportunity: system and
application considerations
Always-connected
Perceptual processing
EE141
9
The Sensory Swarm “Adding senses to the Internet”
Philips Sand module
UCB mm3 radio
UCB PicoCube
[Ref: Ambient Intelligence, W. Weber Ed., 2005]
IMEC e-Cube
Telos Mote
The driver for
Ultra-Low Energy
design for past decade
Yet … True Immersion Still Out of Reach Microscopic Wireless
Artificial Skin
“Microscopic” Health Monitoring
Interactive Surfaces
Another leap in size, cost and energy reduction
Smart Objects
EE141
10
Example: Microscopic Wireless to Power Brain-Machine Interfaces (BMI) The Age of Neuroscience BMI – The Instrumentation of Neuroscience
• Learning about operation of the brain
• Enabling advanced prosthetics
• Enabling innovative human-machine
interfaces
mm3 nodes
remotely powere
uWs to 1 mW power budget
The Dream: Observing Living Cells
Combines sensing,
processing, communication
and energy harvesting and storage in volumes far less
than 1 mm3
10-100
μm
[Courtesy, Hillenius et al, 07]
EE141
11
The Holy Grail: Reducing the Energy/Operation
CMOS digital
Non-CMOS
digital
Interfaces and periphery
Long term musings
Energy Limits in Digital
Claude Shannon John Von Neumann
More than 4 orders of magnitude
below current practice (65 nm at 1V)
Shannon-Von Neumann-
Landauer Bound:
Minimum energy/
operation = kTln(2)
= 4.10-21J/bit at room
temperature
EE141
12
Technology Scaling Not the Solution
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
0.11
0.12
20 30 40 50 60 70 80 90
Technology node (nm)
EO
P (
fJ)
[Based on actual and predictive models]
Sub-Threshold Operation Leads to Minimum Energy/Operation
Op
timal (V
dd , V
th )
Threshold Voltage (Vth)
Supp
ly V
olta
ge
(VD
D)
Energy-Aware FFT Processor
[Chang, Chandrakasan, 2004]
Energy
self-contained processors
Subliminal μprocessor for
retinal implants
3 pJ/inst @ 350 mV
[Blaauw, VLSI’07]
EE141
13
How About Mechanical Computing?
NEMS Relay
[Courtesy: TJ King, E. Alon, UCB]
ON: VGS > VPI – Low on resista
OFF: VGS < VPI – Zero Leakag
NEMS Relays Versus CMOS
32-bit adder CMOS Relay
Supply
Voltage
0.5 V 0.32 - 0.9 V
Load Cap
per Output
25 fF 25 fF
Total Gate
Cap
4.0 pF 125 fF
Area 600 μm2 480 μm2
[CMOS Adder: D. Patil, ARITH'07]
9x
10x
Vdd: 1V 0.5V
Energy/op vs. Delay/op across Vdd
Enables the parallelism concept anew!
Top Related