Post on 21-Apr-2018
Department of Electrical and Computer Engineering, University of Virginia
Xinfei Guo, Mircea R. Stan
Enabling Wearout-Immune BEOL and FEOL with Active Rejuvenation
{xg2dt, mircea}@virginia.edu
BEOL & FEOL Wearout
Wearout Issues
Previous SolutionsTolerate - Design for the worst case (margins, upsizing)
Compensate - Dynamically adapt to wearout
Slow down the wearout - Reduce the stress
Passive Recovery - Remove the stress, shut off the core
Consider FEOL and BEOL separately!
This WorkRepair Both wearout completely by introducing the notion
of Accelerated & Active Recovery
Circadian Rhythms for FULL recovery
Demonstrate both solutions experimentally
Introduce Accelerated & Active Recovery as a new design
knob for cross-layer resilience
Wearout
Accelerated & Active Recovery
Interface Board ChipData Sampling
16-b
frefclk
in
Cout
16
En En
75 LUTs
Circuit Under Test (CUT)
rst
(a) (b)
Thermal Chamber
Counter
Accelerated & Active
Recovery for 12 hours24.5
25
25.5
26
Fre
qu
ency
(MH
z)
Accelerated Stress for 48 hours
72.4%
An example where about 72.4% of wearout is
recovered by accelerated self-healing techniques in
only ¼ of stress time (measured).Accelerated self-healing space exploration (model prediction)
Related Publications• [Submitted] Enabling Wearout-Immune BEOL and FEOL with Active Recovery
• [INTEGRATION, the VLSI Journal] “Implications of Accelerated Self-Healing as a Key Design Knob
for Cross-Layer Resilience”• [ASPDAC ‘16] “Work hard, sleep well - Avoid irreversible IC wearout with proactive rejuvenation”
• [SELSE ‘15] “MCPENS: Multiple-Critical-Path Embeddable NBTI Sensors for Dynamic Wearout
Management”
• [DAC ‘14] “Modeling and Experimental Demonstration of Accelerated Self-Healing Techniques”
Sleep-when-getting-tired for FULL recovery
Main IdeaSchedule accelerated and active recovery for both wearout before
the irreversible wearout kicks in and do this periodically
Cross-Layer Implementations
- VActive Recovery
EN
CircuitWearout Sensors
Architecture
Dark SiliconProgramCounters
+1
SystemVirtualSensorsLoad Balancer
Heating Elements
AcceleratedRecovery
EN
Negative Voltage Generator
Proactive Scheduler
RedundantResources
Cross-layer Accelerated Self-Healing
Core
sensors
Sensor
Sensor
Proactive
Accelerated
From
To cores
Apply to
Sleep Cores Active Cores
Accelerated
rou
ter
Scheduler
& Active Recovery
outputs
sleep cores
outputs
Applications
Scheduler
& Active Recovery
Blocks
Core
Allocation
LoadBalancer
Heat for accelerated
recovery
A potential implementation of Cross-layer Accelerated
Self-Healing in a NoC system
2016 IEEE/ACM Workshop on Variability Modeling and Characterization (VMC), Austin, TX
BEOL
FEOL
Electromigration (EM)
Negative/Positive-bias temperature
instability (N/PBTI)
?
System
Circuit
Device BTI → |Vth| increase
EM → R increase
Faults, Errors,
Reduced MTTF
Delay increase (e.g. 10% in 3 years),
Timing violations
Resistance Recording
Thermal Chamber
Constant Current Supply Device (Wire) under test
BEOL Wearout (EM) Test Setup
EM Test Chip Die Photo
Metal wire under test
Probe
Pads
FEOL Wearout (BTI) Test Setup
On-chip Wire Information
BTI Test Structure
Vsg = 0, room
Passive Recovery
Vsg = 0, high
Accelerated RecoveryActivate the recovery
Vsg = negative room temperature
Active Recovery:
Vsg = negativehigh temperature
Accelerated & Active 1 2 3 4
Recovery
temperature temperature
I = 0, room
Passive Recovery
I = 0, high
Accelerated RecoveryActivate the recovery
I = negative room temperature
Active Recovery:
I = negativehigh temperature
Accelerated & Active 1 2 3 4
Recovery
temperature temperature
BTI (FEOL Wearout) active and accelerated recovery
EM (BEOL Wearout) active and accelerated recovery
Experimental Results
FEOL Measurement Results
BEOL Measurement Results
72.8
73
73.2
73.4
73.6
73.8
74
74.2
74.4
74.6
74.8
0 200 400 600 800 1000 1200
R (
)
Time (min)
Stress (230C, +100mA)
Recovery (230C, -100mA)
There is still a permanent part
Summary• BTI - 72.4% of the wearout is recovered within only 1/4 of the stress
time through both high temperature and negative voltage
• EM - >75% of the wearout is recovered within 1/6 of the stress time
with reverse current and high temperature
• In both cases, there is still an irreversible (permanent) component
Fresh Turbo-boost
Fre
qu
en
cy
Time
Desig
n M
arg
in
Average
Frequency
Negative
Days
Hours
End of life
Years
Wearout
No recoveryActive
Sleep
FEOL recovery under different “circadian rhythms”
Recovery under Different “circadian rhythms”
2X
1X
Reduction of Design Margin >60X
BEOL recovery right after void growth phase starts
76
77
78
79
80
0 200 400 600 800
R (
)
Time (min)
EM Stress (230C, +100mA)
EM Accelerated & Active
Recovery (230C, -100mA)
Fully Recovered!
Start Recovery
EM induced
by reverse
current
operation