ACOUSTIC WAVE DETECTORS TO PREVENT SEUS IN LLC
Gaurang Upasani, 28th June, 2011
Source: personals.ac.upc.edu/gaurang/Acoustic_TRAMS.pdf



Outline

• Motivation
• Overview
• Acoustic wave detectors: Introduction
• Application of acoustic wave detectors
    • Nehalem Quad core (i7) LLC


Motivation

• Core count is already TDP/SER limited.
• Decreasing the supply voltage increases sensitivity to soft errors (SER).
• Recent trends in soft error protection:
    • Physical level: device resizing, restructuring, hardening.
    • Adding redundancy at the microarchitecture and architecture level:
        • Error detecting and correcting codes.
        • Redundant execution (DMR, TMR, etc.).


Protecting Memories: Parity

• Simplest scheme.
• Cannot detect an even number of faults.
• Does not identify which bit has been flipped, so the word cannot be corrected.
• Not optimal in terms of the number of code bits used.
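A minimal sketch of the limitations listed above, assuming even parity over a word held as a Python integer. The helper names `parity_bit` and `check` are illustrative and do not come from the slides. It shows that a single flipped bit is detected (but not located), while two flipped bits go unnoticed.

```python
def parity_bit(word: int) -> int:
    """Even-parity bit: 1 if the word holds an odd number of 1s, else 0."""
    return bin(word).count("1") % 2


def check(word: int, stored_parity: int) -> bool:
    """Return True if the word still agrees with its stored parity bit."""
    return parity_bit(word) == stored_parity


data = 0b10110010
p = parity_bit(data)

single_flip = data ^ (1 << 3)             # one upset bit
double_flip = data ^ (1 << 3) ^ (1 << 4)  # two upset bits

print(check(data, p))         # clean word passes the check
print(check(single_flip, p))  # single flip fails the check (detected)
print(check(double_flip, p))  # double flip passes the check (missed)
```

Even when the check fails, the parity bit carries no information about which position flipped, which is why the word cannot be corrected.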


Error Correcting Codes

• k data bits, r code bits (n = k + r).
• The r code bits must be able to determine the exact position or positions of the errors.
• Correcting m bits requires $2^r \ge \sum_{i=0}^{m} \binom{k+r}{i}$.
• If k = 64 (64-bit data word):
    • Single-bit correction (m = 1): the minimum r = 7.
    • Double-bit correction (m = 2): r = 13.
    • Triple-bit correction (m = 3): r = 20.
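The inequality above can be evaluated directly. The sketch below is an illustration only, and the function name `min_check_bits` is hypothetical: it searches for the smallest r that satisfies the bound for a given k and m. The bound is a necessary condition, not a construction, so for m = 1 and k = 64 it reproduces r = 7, while for larger m it yields only a lower limit that a practical code may exceed.

```python
from math import comb


def min_check_bits(k: int, m: int) -> int:
    """Smallest r with 2**r >= sum_{i=0..m} C(k + r, i).

    This is the counting bound only: every pattern of up to m errors in
    the n = k + r stored bits needs its own syndrome value.
    """
    r = 1
    while 2 ** r < sum(comb(k + r, i) for i in range(m + 1)):
        r += 1
    return r


for m in (1, 2, 3):
    print(m, min_check_bits(64, m))
```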


ECC Overhead

• Multi-threaded Xeon® processor: DEC-TED
• Itanium processor: SEC-DED

SEC-DED = single-bit error correct, double-bit error detect
DEC-TED = double-bit error correct, triple-bit error detect


Evaluation of EDACs

• Each structure must be protected separately.
• Logic to encode/decode is not cheap.
    • Going from SEC-DED to DEC-TED, the delay penalty almost doubles, the encoder area doubles, and the decoder area increases by 16x.
• Special care is required for multi-bit errors.
    • Spatial: adjacent bits, mostly from the same strike.
        • Interleaving: two errors in consecutive bits are caught in two different code words.
        • Gets more and more expensive as the interleaving factor goes up.
    • Temporal: non-adjacent bits, two different strikes.
        • Scrubbing: costs energy and may impact performance.
• For low voltage, we need very strong ECCs.
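To make the interleaving point concrete, here is a small sketch assuming the simplest mapping, in which physical column `col` belongs to code word `col % factor`. The mapping and the helper names are assumptions for illustration, not the layout of any particular cache. An upset spanning `width` adjacent columns touches each code word at most once only while `width <= factor`.

```python
def word_of_column(col: int, factor: int) -> int:
    """Code word that physical column `col` belongs to under interleaving."""
    return col % factor


def errors_per_word(first_col: int, width: int, factor: int) -> dict:
    """Count how many bits of each code word an upset of `width`
    adjacent columns hits, starting at `first_col`."""
    hits = {}
    for col in range(first_col, first_col + width):
        w = word_of_column(col, factor)
        hits[w] = hits.get(w, 0) + 1
    return hits


two_wide = errors_per_word(first_col=10, width=2, factor=2)
three_wide = errors_per_word(first_col=10, width=3, factor=2)
print(two_wide)    # each word sees a single error
print(three_wide)  # one word now sees two errors
```

With a factor of 2, a 3-bit-wide upset already puts two errors in one word, which is why wider upsets push the interleaving factor (and its cost) up.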


Motivation summary

• Error detecting and correcting codes are expensive in low-voltage operation:
    • They consume a larger die area.
    • Slow encode/decode costs performance.
    • The added components increase the power budget.
• The challenge is to find a method that detects the vulnerable area needing protection while adding minimal hardware overhead.



Acoustic wave detectors : Introduction

Our choice:
• Dimensions: L = 1 µm, W = 1 µm, H = 0.05 µm.
• Area ≈ 1 µm² (~1 bit).
• Can detect a peak power density of 0.3 mW/cm² at a distance of 5 mm from the source of the sound.
• Area covered by a single detector: π·R² ≈ 78.54 mm² with R = 5 mm.
• This is equivalent to the area of the LLC in the Nehalem Quad core (i7), 45 nm.

[Figure: cantilever beam detector and its coverage circle of radius R = 5 mm]


Goal

• An acoustic sensor occupies roughly the area of 1 parity bit.
• We want to use the sensors to locate particle strikes.
• The idea is to use fewer sensors than the total number of ECC bits, to save area.
• Once the strike location is identified:
    • Identify potential bit flips.
    • Apply corrective mechanisms if required.

MICROARCHITECTURE


Nehalem Quad core (i7), 45nm

• Die area: 245.7 mm²
• Approximate dimensions: L = 13 mm, W = 18.9 mm

[Figure: die outline annotated with length L and width W]


Last level cache

• Size: 8 MB; line size: 64 B.
• Approximate area: 78 mm²
• Approximate dimensions: L = 4.41 mm, W = 17.64 mm
• Approximate area of a bit: 1.0768 µm²

[Figure: LLC outline annotated with length L and width W]


What is it all about?

[Figure: the LLC with four sensors S1, S2, S3, S4]

• Using 4 such sensors, it is possible to detect a particle strike in the LLC.
• But we also want to be precise about the location and the time of the strike.


Latency

• The speed of phonons (vibration energy) in the Si lattice is 10 km/s (~23 bits/ns).
• What does this mean?
    • Assume a processor speed of 2 GHz (1 cycle = 0.5 ns).
    • In the worst case (a particle strike 5 mm away), a single detector needs 5 mm / 10 km/s = 500 ns = 1000 cycles to detect the strike.
    • A method to roll back 1k cycles would be needed (checkpointing, logs, etc.).
• Of course, we can add as many sensors as required to decrease the worst-case latency.
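The worst-case figure follows from a one-line calculation. The sketch below uses the numbers from this slide (10 km/s phonon speed, 2 GHz clock); the function name and parameter names are illustrative.

```python
def detection_latency_cycles(distance_m: float,
                             phonon_speed_m_s: float = 10_000.0,
                             clock_hz: float = 2e9) -> float:
    """Clock cycles that elapse while the acoustic wave travels
    `distance_m` from the strike to the detector."""
    travel_time_s = distance_m / phonon_speed_m_s
    return travel_time_s * clock_hz


worst_case = detection_latency_cycles(5e-3)    # strike 5 mm from the sensor
half_range = detection_latency_cycles(2.5e-3)  # strike 2.5 mm from the sensor
print(round(worst_case), round(half_range))
```

Halving the maximum distance to the nearest sensor halves the latency, which is the reason adding sensors reduces the worst case.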


Detecting with acoustic sensors

Problem statement: find the unknown location (Xa,Ya) of the particle strike.

[Figure: sensors S1 (X1,Y1), S2 (X2,Y2), ..., S10 (X10,Y10), ..., Sn (Xn,Yn) placed around the unknown strike location (Xa,Ya)]


Questions to be answered.

• How many sensors do I need? At least 3.
• Where should I put them?
• What is the accuracy?
    • Granularity of the location: cache line, cache set, or bit?
• Latency: when do I detect the error?
    • With multiple cantilevers, the error is detected when the first one raises the flag.
    • Some sensors for detection latency, some for localization?


[Figure: Traversal of acoustic wave. Strike at (Xa,Ya,T); sensors S1 (X1,Y1,t1), S2 (X2,Y2,t2), S3 (X3,Y3,t3), and further sensors S4 (X4,Y4) to Sn (Xn,Yn).]


[Figure: Determination of strike position. Sensors S1 (X1,Y1,t1), S2 (X2,Y2,t2), S3 (X3,Y3,t3) at distances d1, d2, d3 from the strike (Xa,Ya,T). Timeline: T, t1, t2, t3, with intervals DeltaT12 and DeltaT23.]


Equations

• If a range-difference measurement is observed between two stations, the estimated locus of position is a hyperbola.
• We generate a simultaneous set of algebraic position equations (generally non-linear).
• We linearize them by Taylor-series estimation.
• We solve them using Gauss-Newton interpolation.
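The steps above can be written out as follows. This is a sketch in the notation used elsewhere in the deck (strike at (Xa,Ya), sensor i at (Xi,Yi), phonon speed Cp, initial guess (Xv,Yv)); the symbols $f_{12}$, $f_{23}$, $J$, $\mathbf{r}$ and $\boldsymbol{\delta}$ are introduced here for illustration and do not appear in the slides.

```latex
\begin{align}
d_i(X, Y) &= \sqrt{(X - X_i)^2 + (Y - Y_i)^2}, \qquad d_i \equiv d_i(X_a, Y_a), \quad i = 1, 2, 3 \\
\Delta D_{12} &= d_2 - d_1 = C_p\,(t_2 - t_1), \qquad
\Delta D_{23} = d_3 - d_2 = C_p\,(t_3 - t_2) \\
f_{12}(X, Y) &= d_2(X, Y) - d_1(X, Y), \qquad
f_{23}(X, Y) = d_3(X, Y) - d_2(X, Y) \\
f_{ij}(X_v + \delta_x,\, Y_v + \delta_y) &\approx f_{ij}(X_v, Y_v)
  + \frac{\partial f_{ij}}{\partial X}\,\delta_x
  + \frac{\partial f_{ij}}{\partial Y}\,\delta_y \\
\frac{\partial d_i}{\partial X} &= \frac{X - X_i}{d_i(X, Y)}, \qquad
\frac{\partial d_i}{\partial Y} = \frac{Y - Y_i}{d_i(X, Y)} \\
\boldsymbol{\delta} &= (J^{\mathsf T} J)^{-1} J^{\mathsf T}\,\mathbf{r}, \qquad
\mathbf{r} = \begin{pmatrix} \Delta D_{12} - f_{12}(X_v, Y_v) \\ \Delta D_{23} - f_{23}(X_v, Y_v) \end{pmatrix}
\end{align}
```

Each equation $f_{ij}(X, Y) = \Delta D_{ij}$ with a fixed right-hand side describes one branch of a hyperbola whose foci are sensors i and j; the strike lies where the hyperbolas intersect. $J$ is the matrix of partial derivatives of $f_{12}$ and $f_{23}$ evaluated at the current guess, and the guess is updated by $\boldsymbol{\delta} = (\delta_x, \delta_y)$ until the update becomes negligible.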


Detection Hardware

[Figure: microcontroller with inputs t1, t2, t3 and a sampling-frequency clock. Timeline: T, t1, t2, t3, with intervals DeltaT12 and DeltaT23.]

Flow:
• On event t1, get the "timer0" value.
• Is event t(i+1) high (i = 1, 2)? If yes, get the "timer0" value; if no, check again.
• DeltaT12 = t2 - t1, DeltaT23 = t3 - t2
• DeltaD12 = Cp * DeltaT12, DeltaD23 = Cp * DeltaT23

Here Cp is the speed of phonons in the Si lattice, DeltaD12 = d2 - d1 and DeltaD23 = d3 - d2.
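The last two steps of the flow are simple arithmetic on the captured timer values. A minimal sketch, assuming the timestamps are already expressed in nanoseconds and taking Cp = 10 km/s = 10 µm/ns from the latency slide; the function name `range_differences` is illustrative.

```python
CP_UM_PER_NS = 10.0  # phonon speed: 10 km/s expressed as 10 um/ns


def range_differences(timestamps_ns, cp=CP_UM_PER_NS):
    """Convert arrival times [t1, t2, t3, ...] into range differences
    [DeltaD12, DeltaD23, ...] in micrometres, using DeltaD = Cp * DeltaT."""
    return [cp * (t_next - t)
            for t, t_next in zip(timestamps_ns, timestamps_ns[1:])]


# Example timer captures for S1, S2, S3 (made-up values for illustration).
delta_d = range_differences([100.0, 120.5, 150.0])
print(delta_d)  # DeltaD12 and DeltaD23 in um
```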


• Use the iterative Gauss-Newton interpolation.
• Initial guessed location: (Xv,Yv).
• Measured differences in the distances from two sensors: DeltaD12 and DeltaD23.
• Another problem: it may not converge…

[Figure: Determination of strike position. Sensors S1 (X1,Y1,t1), S2 (X2,Y2,t2), S3 (X3,Y3,t3), the actual strike (Xa,Ya,T) and the initial guess (Xv,Yv).]
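The iteration can be sketched in a few lines of plain Python. This is an illustrative implementation under stated assumptions, not the code used for the results in this deck: it takes range differences between consecutive sensors, uses error-free measurements, and solves the 2x2 normal equations by hand. The function name, sensor layout and strike position are hypothetical. It returns a `converged` flag because, as noted above, the iteration may fail to converge.

```python
import math


def gauss_newton_tdoa(sensors, delta_d, guess, max_iter=50, tol=1e-9):
    """Estimate the strike position from range differences.

    sensors : list of (x, y) sensor positions
    delta_d : delta_d[i] = d[i+1] - d[i] for consecutive sensors
    guess   : initial guessed location (Xv, Yv)
    Returns (x, y, converged).
    """
    x, y = guess
    for _ in range(max_iter):
        d = [math.hypot(x - sx, y - sy) for sx, sy in sensors]
        if min(d) == 0.0:
            return x, y, False  # derivative undefined on top of a sensor
        # Accumulate the normal equations (J^T J) delta = J^T r.
        a11 = a12 = a22 = b1 = b2 = 0.0
        for i, measured in enumerate(delta_d):
            (xi, yi), (xj, yj) = sensors[i], sensors[i + 1]
            jx = (x - xj) / d[i + 1] - (x - xi) / d[i]
            jy = (y - yj) / d[i + 1] - (y - yi) / d[i]
            r = measured - (d[i + 1] - d[i])
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * r
            b2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-18:
            return x, y, False  # singular system, cannot take a step
        dx = (a22 * b1 - a12 * b2) / det
        dy = (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < tol:
            return x, y, True
    return x, y, False


# Hypothetical layout in mm: four sensors on the corners of a square.
sensors = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
strike = (1.0, 2.5)
dist = [math.hypot(strike[0] - sx, strike[1] - sy) for sx, sy in sensors]
delta_d = [dist[i + 1] - dist[i] for i in range(len(dist) - 1)]

est_x, est_y, converged = gauss_newton_tdoa(sensors, delta_d, guess=(2.0, 2.0))
print(est_x, est_y, converged)
```

Four sensors are used here rather than the minimum of three so that the system is over-determined, which tends to make the iteration better behaved; the termination test on the step size is one possible stopping criterion among several.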


Real World: Errors!!!

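[Figure: timeline showing the strike time T and the detection times t1, t2, t3 at sensors S1, S2, S3, with sampling period Tp and sampling errors e1, e2, e3.]

• Tp = 0.5 ns (sampling period).
• ei ∈ [0, 0.5] ns implies ei - ej ∈ [-0.5, 0.5] ns.
• Example: suppose the wave reaches S1 at 3.6 ns. It is detected only at the next sampling instant, 4.0 ns, so e1 = 0.4 ns.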
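The example can be reproduced with a short sketch, assuming each sensor is sampled on a fixed grid of period Tp so that an event is registered at the next sampling instant. The function name is illustrative. The last line converts the worst-case timing difference into a range-difference error using Cp = 10 µm/ns from the latency slide.

```python
import math

TP_NS = 0.5          # sampling period from this slide
CP_UM_PER_NS = 10.0  # phonon speed: 10 km/s = 10 um/ns


def sampled_time(t_ns: float, tp_ns: float = TP_NS) -> float:
    """Time at which an event occurring at t_ns is actually registered:
    the first sampling instant at or after t_ns."""
    return math.ceil(t_ns / tp_ns) * tp_ns


t_detect = sampled_time(3.6)
e1 = t_detect - 3.6
print(t_detect, round(e1, 3))

# |ei - ej| is at most Tp, so a measured range difference can be off by
# up to Cp * Tp.
max_range_error_um = CP_UM_PER_NS * TP_NS
print(max_range_error_um)
```

At this sampling period the bound on the range-difference error is a few micrometres, which is several bit cells wide given the ~1 µm² bit area quoted earlier.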


Impact of Errors

• The distribution of the error depends on the number of sensors and the sampling frequency.
• If we use more than 3 sensors, we have an over-determined system of equations. This may help reduce the measurement error.


High-level Algorithm

System inputs
• Number of sensors
• Location of the sensors
• Initial (guessed) location of the strike
• Differences in the measured distances of the actual strike from the sensors

System outputs
• Location of the strike
• Error estimation

Iterative triangulation method: "Gauss-Newton interpolation"


• We calculate the circular error probability (CEP):
    • 50% probability that the strike is within CEP.
    • 93% probability that it is within 2*CEP.
    • ~99.8% probability that it is within 3*CEP.
    • The probability that it is further than 3*CEP is 0.2%.

[Figure: Area of error distribution. Sensors S1 (X1,Y1,t1), S2 (X2,Y2,t2), S3 (X3,Y3,t3), the estimated location (Xnew,Ynew) and the actual strike (Xa,Ya,T).]
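The percentages above are what one obtains if the location error is assumed to follow a circular bivariate normal distribution (equal variance in X and Y, no correlation). That distribution is an assumption made here for illustration; the slides do not state it. Under it, the probability of falling within k*CEP is 1 - 2^(-k²), as the sketch below evaluates.

```python
def prob_within(k: float) -> float:
    """Probability that the true strike lies within k * CEP of the
    estimate, for a circular bivariate normal error distribution."""
    return 1.0 - 2.0 ** (-(k * k))


for k in (1, 2, 3):
    print(k, prob_within(k))
```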


Summary of issues

• Latency of detection
• Reduced error area
• 100% coverage (convergence)


Results

• Experiments were performed for different numbers of sensors at various locations on the cache, with different initial guessed locations and with the sampling frequency varying from 2 GHz to 4 GHz.
• Convergence varies with the number of sensors, their locations and the sampling frequency.
• We managed to achieve 100% coverage.


Results

• Stabilization of the algorithm was achieved by:
    • Proper termination criteria.
    • Grid formation of the sensors.
• Current results do not include the sampling errors.
• Goal: obtain the localization with an accuracy of ~1 bit
    • for any number of sensors distributed in grid formation,
    • with dynamic selection of "n" sensors,
    • and for any initial guess location.


Results

Achieved localization granularity: ~1 bit.

#Sensors in the grid | #Sensors triggered (min.) | #Sensors triggered (max.) | #Sensors selected dynamically | Grid formation (vertical) | Grid formation (horizontal)
15 | 5 | 8 | 4 | 5 | 3
18 | 5 | 9 | 5 | 6 | 3
20 | 6 | 11 | 6 | 5 | 4
24 | 7 | 12 | 7 | 6 | 4
25 | 8 | 14 | 8 | 5 | 5
30 | 9 | 15 | 9 | 6 | 5
30 | 10 | 17 | 10 | 5 | 6
36 | 11 | 18 | 11 | 6 | 6


Results

Latency can be reduced by adding more dummy sensors.


Future work

• Observe different trade-offs by varying the number of sensors, the locations of the sensors and the sampling frequency.
• Change the sensitivity of the sensors.
• Map multi-bit upsets.
• Add micro-architecture features to contain the error due to the latency.
• Improve the runtime FIT rate budget, chip error detection, protecting logic, etc.
• Extend the idea to the whole core.

