CMPEN 411 VLSI Digital Circuits Spring 2012 Lecture 16 ...
Sp12 CMPEN 411 L16 S.1
CMPEN 411 VLSI Digital Circuits
Spring 2012
Lecture 16:
Introduction to Soft Errors
[Adapted from Rabaey’s Digital Integrated Circuits, Second Edition, ©2003 J. Rabaey, A. Chandrakasan, B. Nikolic]
Sp12 CMPEN 411 L16 S.2
What Is a Soft Error?
Soft errors are circuit errors caused by excess charge carriers, induced primarily by external radiation.
These errors cause an upset event, but the circuit itself is not damaged.
Same as an SEU (single event upset).
Sp12 CMPEN 411 L16 S.3
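Soft Errors
The Phenomena
[Figure: cross-section of an n-channel MOS transistor (gate G, body B, two n+ regions, n channel, p substrate). A particle strike creates + and - charge carriers along its track in the substrate, producing a current.]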
Sp12 CMPEN 411 L16 S.4
Soft Errors
The Phenomena
[Figure: a particle strike on a CMOS gate (labels Vin, Vout, VDD, CL) and a particle strike on an SRAM cell (labels BL, !BL, WL; node values 0->1, 1->0, 0). Result: bit flip!]
Sp12 CMPEN 411 L16 S.5
What Causes Soft Errors?
At ground level, there are three major contributors to soft errors:
1. Cosmic-ray-induced neutrons.
2. Alpha particles emitted by decaying radioactive impurities in packaging or interconnect materials.
3. Neutron-induced fission of boron-10 (10B), which releases an alpha particle and lithium-7 (7Li).
Sp12 CMPEN 411 L16 S.6
Evidence of Cosmic Ray Strikes
Documented strikes in large servers found in error logs
Normand, “Single Event Upset at Ground Level,” IEEE Transactions on Nuclear Science, Vol. 43, No. 6, December 1996.
Sun Microsystems, 2000
Cosmic ray strikes on L2 cache with no error detection or correction
- caused Sun’s flagship servers to suddenly and mysteriously crash!
Companies affected
- Baby Bell (Atlanta), America Online, eBay, and dozens of other corporations
- Verisign moved to IBM Unix servers (for the most part)
Sp12 CMPEN 411 L16 S.7
Reactions from Companies
Fujitsu SPARC in 130 nm technology
80% of 200k latches protected with parity
compared with very few latches protected in McKinley
ISSCC, 2003
IBM declared a 1000-year system MTBF as a product goal
very hard to achieve this goal in a cost-effective way
Sp12 CMPEN 411 L16 S.8
Soft Error Rate (SER)
$$\mathrm{SER} \propto N_{flux} \cdot CS \cdot \exp\left(-\frac{Q_{critical}}{Q_s}\right)$$
Nflux : intensity of the neutron flux.
CS : the area of the cross section of the node.
Qcritical : critical charge necessary for a bit flip.
Qs : the charge collection efficiency.
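The SER relation on this slide can be explored numerically. The sketch below is a minimal illustration, assuming the proportionality SER ∝ Nflux * CS * exp(-Qcritical / Qs) with an unknown constant, so only ratios between two design points are meaningful. The function name `relative_ser` and all numeric values are hypothetical.

```python
import math

def relative_ser(n_flux, cross_section, q_critical, q_s):
    """Relative SER from SER ∝ N_flux * CS * exp(-Q_critical / Q_s).

    The constant of proportionality is not given on the slide, so the
    result is only useful when compared against another design point.
    """
    return n_flux * cross_section * math.exp(-q_critical / q_s)

# Hypothetical design points (illustrative numbers, arbitrary units).
baseline = relative_ser(n_flux=1.0, cross_section=1.0, q_critical=10.0, q_s=5.0)
hardened = relative_ser(n_flux=1.0, cross_section=1.0, q_critical=20.0, q_s=5.0)

# Raising Q_critical by 2 * Q_s lowers SER by a factor of e^2.
improvement = baseline / hardened
print(f"SER improvement: {improvement:.2f}x")
```

Because Qcritical sits inside the exponential, a modest increase in Qcritical relative to Qs changes SER far more than a comparable change in flux or cross section.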
Sp12 CMPEN 411 L16 S.9
Soft Errors
For a soft error to occur at a specific node in a circuit, the charge Q collected at that node must exceed Qcritical.
Qcritical is proportional to the node capacitance and the supply voltage.
Qs depends on doping. As CMOS device sizes decrease, the charge stored at each node decreases (due to lower nodal capacitance and lower supply voltage).
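A short sketch of why scaling lowers Qcritical. It assumes the first-order estimate Qcritical ≈ Cnode * VDD, which takes the proportionality constant as 1 (an assumption, the slide only states proportionality). The helper name and the capacitance and voltage values are hypothetical, chosen only to show the trend.

```python
def q_critical_fc(node_capacitance_ff, vdd_volts):
    """First-order estimate Q_critical ≈ C_node * V_DD.

    Capacitance in femtofarads times voltage in volts gives femtocoulombs.
    The unity proportionality constant is an assumption.
    """
    return node_capacitance_ff * vdd_volts

# Hypothetical nodes in an older and a scaled process (illustrative only).
older = q_critical_fc(node_capacitance_ff=10.0, vdd_volts=2.5)
scaled = q_critical_fc(node_capacitance_ff=2.0, vdd_volts=1.0)

print(f"older node:  {older:.1f} fC")
print(f"scaled node: {scaled:.1f} fC")
print(f"reduction:   {older / scaled:.1f}x")
```

With both capacitance and supply voltage shrinking, the charge needed to flip the node falls by the product of the two scaling factors.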
Sp12 CMPEN 411 L16 S.10
Modeling of a particle strike
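The figure for this slide is not recoverable from the transcript. As a stand-in, the sketch below uses a double-exponential current pulse, a model often used for particle strikes in circuit simulation; it is an assumption here, not necessarily the model the slide showed. The names `tau_fall` (collection time constant) and `tau_rise` (track-establishment time constant) and all numbers are hypothetical.

```python
import math

def strike_current(t, q_total, tau_fall, tau_rise):
    """Double-exponential current pulse carrying total charge q_total.

    I(t) = q_total / (tau_fall - tau_rise)
           * (exp(-t / tau_fall) - exp(-t / tau_rise))   for t >= 0
    """
    if t < 0:
        return 0.0
    scale = q_total / (tau_fall - tau_rise)
    return scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))

def collected_charge(q_total, tau_fall, tau_rise, t_end, steps=100_000):
    """Integrate the pulse from 0 to t_end with the trapezoidal rule."""
    dt = t_end / steps
    total = 0.0
    prev = strike_current(0.0, q_total, tau_fall, tau_rise)
    for i in range(1, steps + 1):
        cur = strike_current(i * dt, q_total, tau_fall, tau_rise)
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total

# Hypothetical strike: 20 fC deposited, 200 ps and 50 ps time constants.
q = collected_charge(q_total=20e-15, tau_fall=200e-12, tau_rise=50e-12, t_end=5e-9)
print(f"collected charge: {q * 1e15:.3f} fC")
```

In a SPICE-style experiment, a current source with this shape would be attached to the struck node, and the deposited charge swept until the stored value flips; that threshold is Qcritical.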
Sp12 CMPEN 411 L16 S.11
A SPICE simulation for SRAM
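[Figure: SRAM cell schematic with bit lines BL and !BL and word line WL; a particle strike flips the storage nodes (labels 0->1, 1->0, 0).]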
Sp12 CMPEN 411 L16 S.12
Qcrit as a function of Vdd and Litho
Wissel, IBM CICC 2003
Sp12 CMPEN 411 L16 S.13
Problems caused by SEU
Soft errors can cause problems in several ways:
Change the data values in caches and memory.
Corrupt instruction execution through bit flips in the pipeline registers.
Change the behavior of an SRAM-based FPGA circuit (firm error).
Datapath logic SET (Single Event Transient) caught by registers/memory
Sp12 CMPEN 411 L16 S.14
SEU in memory
[Figure: SRAM cell with bit lines BL and !BL and word line WL; a particle strike flips the storage nodes (labels 0->1, 1->0, 0).]
When Memories Forget!
Sp12 CMPEN 411 L16 S.15
On-chip Memory: ITRS roadmap
[Figure: chart of % die utilization (0 to 100) by technology node and year: 180nm / '99, 130nm / '02, 100nm / '05, 70nm / '08, 50nm / '11, 35nm / '14. Series: Area Reused Logic, Area New Logic, Area Memory.]
Sp12 CMPEN 411 L16 S.16
SEU in FPGA
Source: Actel
Sp12 CMPEN 411 L16 S.17
SEU in FPGA: routing
GRM: General Routing Matrix
Sp12 CMPEN 411 L16 S.18
SEU in FPGA: function
Source: Semico 2002
Sp12 CMPEN 411 L16 S.19
SEU in logic: Bit flips caught by FF or memory
Sp12 CMPEN 411 L16 S.20
Error Masking in logic
Logical masking : A particle strikes a portion of the combinational logic that does not affect the output for the current inputs.
Electrical masking : The pulse resulting from a particle strike is attenuated by subsequent logic gates.
Latching-window masking : The pulse resulting from a particle strike reaches a latch, but not at the clock transition.
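Logical masking can be illustrated with a tiny gate-level sketch. The circuit below, out = (a AND b) OR c, is a hypothetical example, not one from the lecture; the upset is modeled as an inversion of the internal AND output.

```python
def circuit(a, b, c, flip_internal=False):
    """out = (a AND b) OR c, with an optional upset on the AND output."""
    n1 = a & b
    if flip_internal:
        n1 ^= 1  # a particle strike inverts the internal node
    return n1 | c

def is_logically_masked(a, b, c):
    """True if an upset on the internal node leaves the output unchanged."""
    return circuit(a, b, c) == circuit(a, b, c, flip_internal=True)

# Enumerate all input combinations and keep those that mask the upset.
masked_inputs = [
    (a, b, c)
    for a in (0, 1)
    for b in (0, 1)
    for c in (0, 1)
    if is_logically_masked(a, b, c)
]
print(masked_inputs)
```

Whenever c is 1 the OR gate output is forced to 1, so the struck node cannot influence the result; with c at 0 the upset propagates to the output.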
Sp12 CMPEN 411 L16 S.21
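Logic attenuation: ‘Hazard Bubble’
[Figure: a transient propagating through a chain of gates 1 to 6 into a flip-flop/latch. Waveforms at Gate 4, Gate 5 and Gate 6 are shown against the clock, with the setup and hold times marking the window of vulnerability.]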
Sp12 CMPEN 411 L16 S.22
SEU in logic: Errors Due to Data SETs
Source: K. Bernstein, IBM
Sp12 CMPEN 411 L16 S.23
SEU in logic: Errors Due to Data SETs
[Figure: timing diagram of the clock and four data waveforms around the setup time and hold time. Transients that fall outside the window are non-latching SEUs; the earliest-latching and latest-latching SEUs bound the window of vulnerability.]
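A rough way to reason about the window of vulnerability is to estimate how often a transient lands on it. The sketch below assumes the pulse start time is uniformly distributed over one clock period and that any overlap with the setup/hold window latches a wrong value; both are simplifying assumptions, and the function name and numbers are hypothetical.

```python
def latch_probability(pulse_width, t_setup, t_hold, clock_period):
    """Probability that a transient overlaps the setup/hold window.

    A pulse starting at time s overlaps the window
    [edge - t_setup, edge + t_hold] when
    edge - t_setup - pulse_width < s < edge + t_hold,
    an interval of length pulse_width + t_setup + t_hold.
    """
    window = t_setup + t_hold
    return min(1.0, (pulse_width + window) / clock_period)

# Hypothetical numbers, all in picoseconds.
p_slow = latch_probability(pulse_width=100, t_setup=30, t_hold=20, clock_period=1000)
p_fast = latch_probability(pulse_width=100, t_setup=30, t_hold=20, clock_period=500)
print(p_slow, p_fast)
```

Under these assumptions, halving the clock period doubles the chance that a given transient is captured, which is one reason latching-window masking weakens as clock frequencies rise.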
Sp12 CMPEN 411 L16 S.24
Impact of technology scaling on SER
Source: H. Stork, CTO of TI, IRPS 2004
Sp12 CMPEN 411 L16 S.25
Sp12 CMPEN 411 L16 S.26
Physical Solutions are hard
Shielding?
No practical absorber (e.g., more than roughly 10 ft of concrete would be needed)
Radiation-hardened cells?
10x improvement possible with significant penalty in performance, area, cost
2-4x improvement may be possible with less penalty
Sp12 CMPEN 411 L16 S.27
Circuit techniques
Layout & circuit techniques
Spatial Redundancy
Time Redundancy
Sp12 CMPEN 411 L16 S.28
Increase the capacitance
Tarnik et al. Intel 2002
[Figure: latch schematic with inputs D, CK and CK#, output Q, rails VDD and GND, and an added capacitor.]
Sp12 CMPEN 411 L16 S.29
ST tames soft errors in SRAM by adding capacitors
By Ron Wilson, EE Times, January 13, 2004 (4:42 a.m. EST)
URL: http://www.eetimes.com/story/OEG20040112S0069
ST increased the node capacitance of an SRAM cell substantially
with only about a 5 percent area increase
and a 250x improvement in SER
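The size of this improvement can be related to the SER model from the earlier slide, SER ∝ exp(-Qcritical / Qs). The sketch below assumes that model applies with flux and cross section unchanged, which is an assumption and not something the article states; the function name is hypothetical.

```python
import math

def delta_q_over_qs(ser_improvement):
    """Increase in Q_critical, in units of Q_s, for a given SER reduction.

    From SER ∝ exp(-Q_critical / Q_s):
    SER_old / SER_new = exp(delta_Q / Q_s), so delta_Q / Q_s = ln(ratio).
    """
    return math.log(ser_improvement)

needed = delta_q_over_qs(250.0)
print(f"delta Q_critical = {needed:.2f} * Q_s")  # about 5.52 * Q_s
```

Under this assumption, a 250x reduction corresponds to raising Qcritical by roughly 5.5 times Qs, which shows how strongly the exponential rewards added node capacitance.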
Sp12 CMPEN 411 L16 S.30
Space redundancy: Redundant Logic
[Figure: three copies of the logic (Logic 1, Logic 2, Logic 3) feed a voter. The voter itself is a point of failure!]
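A minimal software sketch of the redundant-logic scheme: three copies of a block feed a bitwise 2-of-3 majority voter. The `adder` block and the fault-injection parameters are hypothetical stand-ins for illustration.

```python
def majority(a, b, c):
    """Bitwise 2-of-3 majority vote."""
    return (a & b) | (a & c) | (b & c)

def tmr(logic, x, upset_copy=None, upset_mask=0):
    """Run three copies of `logic` on x, optionally corrupt one, then vote.

    upset_copy selects which copy (0, 1 or 2) is hit; upset_mask gives
    the output bits that the strike inverts.
    """
    outputs = [logic(x) for _ in range(3)]
    if upset_copy is not None:
        outputs[upset_copy] ^= upset_mask
    return majority(*outputs)

def adder(x):
    """Hypothetical 8-bit logic block used as the replicated function."""
    return (x + 1) & 0xFF

clean = tmr(adder, 41)
one_upset = tmr(adder, 41, upset_copy=1, upset_mask=0b100)
print(clean, one_upset)
```

A single corrupted copy is outvoted by the other two. An upset in the voter itself is not covered by this scheme, which is why the slide flags it as the point of failure.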
Sp12 CMPEN 411 L16 S.31
Temporal Sampling Latch with Internal Clock Delays
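[Figure: temporal sampling latch. IN feeds the D inputs of three D flip-flops, clocked by CLOCK, CLOCK delayed by ΔT, and CLOCK delayed by 2ΔT. Their Q outputs feed a majority gate (MAJ) that drives OUT. Stages: temporal sampling, then asynchronous voting.]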
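The temporal sampling idea can be sketched behaviorally: the same input is sampled at the clock edge, ΔT later, and 2ΔT later, and the three samples are voted. The signal model and all timing values below are hypothetical; real flip-flop timing is ignored.

```python
def signal_with_transient(t, value, glitch_start, glitch_width):
    """Input that holds `value`, inverted while a transient is present."""
    if glitch_start <= t < glitch_start + glitch_width:
        return value ^ 1
    return value

def temporal_sample(value, t_clock, delta_t, glitch_start, glitch_width):
    """Sample at t_clock, t_clock + delta_t, t_clock + 2*delta_t, then vote."""
    a, b, c = (
        signal_with_transient(t_clock + k * delta_t, value, glitch_start, glitch_width)
        for k in range(3)
    )
    return (a & b) | (a & c) | (b & c)  # 2-of-3 majority

# Hypothetical timing in picoseconds: samples taken at 0, 100 and 200.
short_glitch = temporal_sample(1, t_clock=0, delta_t=100, glitch_start=90, glitch_width=50)
long_glitch = temporal_sample(1, t_clock=0, delta_t=100, glitch_start=90, glitch_width=150)
print(short_glitch, long_glitch)
```

A transient narrower than ΔT can corrupt at most one of the three samples and is outvoted; a transient that spans two sample points defeats the scheme, so ΔT has to be chosen longer than the widest expected transient.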
Sp12 CMPEN 411 L16 S.32
A fast way to check your FIT rate
IROC has just released a web-based SER estimation tool
http://www.iroctech.com/pages/ser_web_guide_new.php
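For quick hand checks, FIT and MTBF convert into each other directly. The sketch below uses the standard definition of 1 FIT as one failure per 10^9 device-hours, which is not stated on the slide, and applies it to the 1000-year MTBF goal mentioned earlier in the lecture. The function names are hypothetical.

```python
HOURS_PER_YEAR = 8760  # 365 days, leap years ignored

def mtbf_years_to_fit(mtbf_years):
    """Convert an MTBF in years to a FIT rate (failures per 1e9 hours)."""
    return 1e9 / (mtbf_years * HOURS_PER_YEAR)

def fit_to_mtbf_years(fit):
    """Convert a FIT rate back to an MTBF in years."""
    return 1e9 / (fit * HOURS_PER_YEAR)

goal_fit = mtbf_years_to_fit(1000)
print(f"1000-year MTBF = {goal_fit:.1f} FIT")
```

The budget applies to the whole system, so it has to be shared across every memory bit and latch, which helps explain why the goal is hard to reach cost-effectively.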
Sp12 CMPEN 411 L16 S.33
Next Lecture
Dynamic logic
- Reading assignment: Rabaey et al., Ch. 7