JAA 6/28/2007
Dependability and Security Challenges in Emerging Technologies
Jacob A. Abraham
University of Texas at Austin
Workshop Panel
June 28, 2007
Effects on Circuits and Systems
Experiments on chips today show that running some chips at rated speeds produces errors
- Correct operation when running at normal speeds
- Resistive opens (possible in copper interconnect) cause delay defects
- Crosstalk effects could also cause errors
As technology scales down, future chips will be prone to erroneous operation due to:
- Process variations (soft errors)
- Increasing defects (today's memories are an example)
New Nano Systems
Quantum Devices
[Figure: Quantum Cellular Automata. A QCA cell with inter-dot barriers and outer barriers; each dot is either occupied by an electron or unoccupied. Panels: cell states “1” and “0”; QCA wire; QCA inverter (1 in, 0 out); stable and unstable states.]
Dealing with Errors
Redundancy
- Hardware (duplication/retry, triplication)
- Time (recomputation)
Basic ideas in dealing with errors are not new
- Moore and Shannon, 1956: “Reliable Circuits Using Less Reliable Relays”
- von Neumann, 1956: “Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components”
- Self-checking circuits (1970s)
- Algorithm-based fault tolerance (1980s)
- Software-based checks (control flow checks, etc.) (1980s)
Redundancy at Different Levels
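Switch (Circuit) level
Module level
[Figure: module-level redundancy with replicated modules feeding a voter; self-checking system in which the inputs drive a functional module and a checker monitors its outputs and raises an error signal]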
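The module-level schemes named on this slide can be sketched in a few lines of Python. This is an illustrative model only: `majority_vote`, `run_tmr`, `duplicate_and_compare`, and the two adder functions are hypothetical names that do not appear in the slides, and the "fault" is a bit flip injected by hand.

```python
from collections import Counter

def majority_vote(outputs):
    """Voter: return (value, error_flag) for a list of redundant outputs.

    value is the strict-majority output, or None when no majority exists.
    error_flag is True whenever the copies disagree.
    """
    value, count = Counter(outputs).most_common(1)[0]
    if count * 2 <= len(outputs):
        return None, True          # no strict majority: cannot mask the error
    return value, count != len(outputs)

def run_tmr(module_copies, x):
    """Triplication: run every copy on the same input and vote."""
    return majority_vote([module(x) for module in module_copies])

def duplicate_and_compare(module_a, module_b, x):
    """Self-checking by duplication: the checker only detects, it cannot correct."""
    a, b = module_a(x), module_b(x)
    return a, a != b

def good_adder(x):
    return x + 1

def faulty_adder(x):
    return (x + 1) ^ 0b100         # hypothetical fault: one output bit flipped

value, error = run_tmr([good_adder, faulty_adder, good_adder], 5)
print(value, error)                # the voter masks the fault and flags it
```

Triplication masks a single faulty copy, while duplication with a checker can only signal that something went wrong, which is why the slide pairs duplication with retry.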
Protecting Computations at the System Level
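Algorithm-Based Fault Tolerance
[Figure: checksum-protected matrix computation, indexed 1 to 4 along each axis, with a column checksum attached to A (elements A_ij) and a row checksum attached to B (elements B_jk)]
Overhead decreases as system gets larger!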
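The checksum idea behind algorithm-based fault tolerance can be illustrated with a small matrix multiplication. The sketch below assumes integer matrices (so the checksum comparison is exact) and a single erroneous element; all function names are hypothetical and not taken from the slides.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def with_column_checksum(a):
    """Append one row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append to every row of b the sum of that row."""
    return [row + [sum(row)] for row in b]

def check(c_full):
    """Return (bad_rows, bad_cols) whose checksums disagree with the data."""
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    return bad_rows, bad_cols

def correct_single_error(c_full):
    """Locate and repair one wrong data element; return True if repaired."""
    bad_rows, bad_cols = check(c_full)
    if len(bad_rows) != 1 or len(bad_cols) != 1:
        return False
    i, j = bad_rows[0], bad_cols[0]
    m = len(c_full[0]) - 1
    others = sum(c_full[i][:m]) - c_full[i][j]
    c_full[i][j] = c_full[i][m] - others   # rebuild from the row checksum
    return True

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
c_full[1][0] += 7                          # inject a hypothetical error
print(check(c_full))                       # the failing row and column intersect at the error
print(correct_single_error(c_full))
```

For n x n operands the checksums add one row and one column, so the extra arithmetic is on the order of 2/n of the original work, which matches the slide's remark that overhead decreases as the system gets larger. With floating-point data the equality tests would need a tolerance.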
CEDA: Control-flow Error Detection through Assertions (Integrated with GCC)
S: global runtime signature register
- updated at the beginning and end of each node
- each update either an XOR or an AND operation
- op. performed based on the program graph properties
Se: expected value of S at each point in the program
- calculated at compile time
Check point: S is checked against its expected value
- detects a control-flow error (CFE) if one occurred
- not required inside every node
Node signature: expected value of S inside a node
Node exit signature: expected value of S immediately after exiting a node
[Figure: example node annotated with expected signatures]
Se = 1000 (on entry)
S = S XOR 1011
Se = 0011 (node signature)
br S != 0011 err
S = S XOR 0110
Se = 0101 (node exit signature)
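The signature mechanism can be simulated with a toy model. The sketch below is a simplification and not the CEDA algorithm itself: it handles only a straight-line chain of nodes and only XOR updates (the AND case, which depends on program graph properties, is left out), and it places a check point in every node. The first node reuses the slide's values; the other signatures and all names are made up for illustration.

```python
class SignatureError(Exception):
    """Raised when the runtime signature differs from its expected value."""

def compile_updates(start_sig, nodes):
    """'Compile time' step: derive the XOR constants for each node.

    nodes is a list of (name, node_sig, exit_sig) in program order.
    Returns {name: (entry_xor, node_sig, exit_xor)}.
    """
    table = {}
    prev_exit = start_sig
    for name, node_sig, exit_sig in nodes:
        table[name] = (prev_exit ^ node_sig, node_sig, node_sig ^ exit_sig)
        prev_exit = exit_sig
    return table

def run(start_sig, table, path):
    """Execute nodes in the order given by path, updating and checking S."""
    s = start_sig
    for name in path:
        entry_xor, node_sig, exit_xor = table[name]
        s ^= entry_xor                     # update at the beginning of the node
        if s != node_sig:                  # check point
            raise SignatureError(f"control-flow error detected in node {name}")
        s ^= exit_xor                      # update at the end of the node
    return s

# Node A uses the slide's numbers: Se = 1000 on entry, 0011 inside, 0101 on exit.
nodes = [("A", 0b0011, 0b0101), ("B", 0b1100, 0b1001), ("C", 0b0110, 0b0010)]
table = compile_updates(0b1000, nodes)

print(format(run(0b1000, table, ["A", "B", "C"]), "04b"))   # legal path
try:
    run(0b1000, table, ["A", "C"])         # illegal jump that skips node B
except SignatureError as exc:
    print(exc)
```

For node A the derived constants are 1011 on entry and 0110 on exit, the same XOR operands shown in the slide's example.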
Dealing with Faults
Increasing possibility of defects
Defect tolerance key for yield
Want system to start in a good state
Cannot produce cost-effective DRAMs without replacing faulty cells with spares
However, sparing cells is much more difficult for logic (very high cost for multiplexers, routing)
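Sparing in a memory can be pictured as an address remap fixed at test time. The class below is a hypothetical software model, not a description of any real DRAM repair circuit: it assumes whole-row replacement and a known list of faulty rows.

```python
class SparedMemory:
    """Toy memory whose faulty rows are redirected to spare rows."""

    def __init__(self, rows, spares, faulty_rows):
        if len(faulty_rows) > spares:
            raise ValueError("not enough spare rows to repair this part")
        self.rows = rows
        # Spare rows live at physical indices rows, rows + 1, ...
        self.remap = {bad: rows + i for i, bad in enumerate(sorted(faulty_rows))}
        self.cells = [0] * (rows + spares)

    def _physical(self, row):
        if not 0 <= row < self.rows:
            raise IndexError("row out of range")
        return self.remap.get(row, row)    # faulty rows go to their spare

    def write(self, row, value):
        self.cells[self._physical(row)] = value

    def read(self, row):
        return self.cells[self._physical(row)]

mem = SparedMemory(rows=8, spares=2, faulty_rows={3})
mem.write(3, 42)                           # lands in the first spare row
print(mem.read(3))
```

The remap is cheap here because memory is regular: one lookup per access. Logic has no such regularity, which is the cost the slide attributes to multiplexers and routing.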
What are some of the characteristics of future products?
Low-cost consumer products
High-frequency, high-resolution signals
System service (what matters is what customer sees)
Regulations
[Figure: Market size (B$), 0 to 1400, by year from 1991 to 2011. Series: World Wide Semiconductor Market Size; SoC Market Size; MS-SoC Contribution to the SoC Market Size]
Objective of dependability
Guarantee that the system meets customer or regulatory specifications
Need not check directly for the specifications
- This is becoming impossible to do at low cost
Only solution for systems of the future is indirect checks from which the specifications can be inferred accurately
[Figure: detector output sampled at 10M samples per second]
[Figure: LNA IIP3, measured IIP3 [dBm] plotted against predicted IIP3 [dBm], with a y = x reference line]
Third harmonic of 940 MHz RF system deduced from 10 MS/sec detector output
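One way to picture an indirect check is a calibration model that maps a cheap measurement to the specification of interest. The slides do not say which model was used; the sketch below swaps in a one-feature least-squares line as the simplest stand-in. The training numbers are synthetic placeholders, not measurements, and the function names and the limit are hypothetical.

```python
def fit_line(features, specs):
    """Ordinary least-squares fit spec = slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(specs) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, specs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def infer_spec(model, feature):
    """Predict the specification value from the indirect measurement."""
    slope, intercept = model
    return slope * feature + intercept

def passes(model, feature, lower_limit, guard_band=0.0):
    """Accept a part when the inferred spec clears the limit plus a guard band."""
    return infer_spec(model, feature) >= lower_limit + guard_band

# Synthetic calibration set: a low-cost feature paired with a directly measured spec.
training_features = [0.0, 1.0, 2.0, 3.0]
training_specs = [1.0, 3.0, 5.0, 7.0]
model = fit_line(training_features, training_specs)

print(infer_spec(model, 1.5))
print(passes(model, 1.5, lower_limit=3.5))
```

The guard band is there because an inferred value carries prediction error; how tightly predicted values track measured ones, as in the scatter plot against the y = x line, determines how small it can be.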
Conclusions
Need to deal with soft errors (due to variations, etc.)
- Detection and correction techniques
Tolerate defects in manufacture
- Lots of devices, but efficiently using them is key
Level to apply solutions?
- Usually higher levels are better
- May find good solutions at low levels, too
Can utilize many “old” techniques
Need to look for “new” techniques