Spring 2008 CSE 591 Compilers for Embedded Systems

Spring 2008 CSE 591 Compilers for Embedded Systems Aviral Shrivastava Department of Computer Science and Engineering Arizona State University


Page 1: Spring 2008 CSE 591 Compilers for Embedded Systems

Spring 2008 CSE 591
Compilers for Embedded Systems

Aviral Shrivastava
Department of Computer Science and Engineering
Arizona State University

Page 2: Spring 2008 CSE 591 Compilers for Embedded Systems

Lecture 4: Soft Errors

Software Techniques

Page 3: Spring 2008 CSE 591 Compilers for Embedded Systems

Outline
□ Soft Errors Recap
□ Process Technology and Packaging Solutions
□ Gate-level and Circuit-level Solutions
□ Microarchitectural Solutions
  □ Single-core
  □ Multi-threaded
□ Software Solutions
□ Multi Bit Upsets (MBUs)
□ Single Event Latchup

Page 4: Spring 2008 CSE 591 Compilers for Embedded Systems

Razor

□ Originally proposed to tolerate process variations and achieve power reduction
□ Shadow latch clocked with a delayed clock
  □ If the values in the main latch and the shadow latch differ, raise an error
□ How to use it to detect soft errors?

Page 5: Spring 2008 CSE 591 Compilers for Embedded Systems

Multi-issue Processors

□ Superscalar
  □ Execute multiple instructions from the same thread in each cycle
□ Multi-threading
  □ Execute instructions from the same thread in one cycle, but can switch between threads on different cycles
□ Simultaneous Multithreading
  □ Issue instructions from different threads in the same cycle

[Figure: Superscalar, Multithreading, Simultaneous Multithreading]

Page 6: Spring 2008 CSE 591 Compilers for Embedded Systems

SMT Solutions
□ SRT: Simultaneous Redundant Threading
  □ Duplicate a thread, and run the two copies on the same core as a leading thread and a trailing thread
  □ Threads maintain their contexts, including the register file
  □ Threads should not diverge when there are no faults
  □ Memory interface
    □ Only the leading thread can read from memory
    □ It puts a copy of each loaded value in the LVQ (load value queue); the trailing thread reads from there
    □ Leading thread writes its store values to the STB (store buffer)
    □ Only the trailing thread can write to memory, after checking its value against the one in the STB
  □ Branch interface
    □ Leading thread writes branch outcomes in the BOQ (branch outcome queue)
    □ Trailing thread has perfect branch prediction
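The memory interface above can be sketched in a few lines. This is only an illustrative model, not the SRT hardware: the tiny accumulator "ISA", the function name `run_srt`, and the `flip_trailing_at` fault-injection parameter are all assumptions made for the example, and the leading thread is simply run to completion before the trailing thread starts (no store-to-load forwarding is modeled).

```python
from collections import deque


class FaultDetected(Exception):
    """Raised when the trailing thread disagrees with a buffered store."""


def run_srt(program, memory, flip_trailing_at=None):
    """program is a list of ('load', addr), ('add', const) or ('store', addr)."""
    lvq = deque()  # load value queue: values loaded by the leading thread
    stb = deque()  # store buffer: (addr, value) pairs awaiting verification

    # Leading thread: the only thread that reads memory.
    acc = 0
    for op, arg in program:
        if op == "load":
            acc = memory[arg]
            lvq.append(acc)
        elif op == "add":
            acc += arg
        elif op == "store":
            stb.append((arg, acc))

    # Trailing thread: loads come from the LVQ, stores are checked, then committed.
    acc = 0
    for step, (op, arg) in enumerate(program):
        if op == "load":
            acc = lvq.popleft()
        elif op == "add":
            acc += arg
        elif op == "store":
            addr, value = stb.popleft()
            if (addr, value) != (arg, acc):
                raise FaultDetected(f"store mismatch at step {step}")
            memory[addr] = value  # only the trailing thread writes memory
        if flip_trailing_at == step:
            acc ^= 1  # hypothetical single-bit upset after this step
    return memory
```

With a fault-free run the store commits; flipping a bit in the trailing thread before the store makes the comparison against the STB entry fail.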

Page 7: Spring 2008 CSE 591 Compilers for Embedded Systems

SMT Solutions: PER
□ Trailing thread competes for resources during high-ILP phases
□ STB fills up, causing leading thread stalls
□ PER: Partial Explicit Redundancy
  □ Leading thread uses all resources during high-ILP phases
    □ SEM: Single Execution Mode
  □ Trailing thread executes during low-ILP phases
    □ REM: Redundant Execution Mode
  □ In REM state, check all instructions
  □ Need resume point for trailing thread
    □ Maintain state (LVQ, STB, RF, etc.)
    □ Proportional to slack size

Page 8: Spring 2008 CSE 591 Compilers for Embedded Systems

SMT Solutions: IRTR
□ IR: Instruction Reuse
  □ Do not execute an instruction if it has already executed with the same inputs
  □ Keep a reuse buffer
□ IRTR: Implicit Redundancy Through Reuse
  □ Check with previous value for soft errors
    □ If it matches, continue and overwrite the value in the buffer
    □ If it mismatches, raise a flag
  □ During high-ILP regions
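A minimal sketch of the reuse-buffer check, assuming the buffer is keyed by a program counter and the operand values. The class name `ReuseBuffer` and its fields are hypothetical, chosen only for illustration.

```python
class ReuseBuffer:
    """Toy reuse buffer used as an implicit redundancy check (IRTR-style)."""

    def __init__(self):
        self.entries = {}    # (pc, operands) -> result of a previous execution
        self.checked = 0     # executions that were compared against the buffer
        self.mismatches = 0  # comparisons that disagreed (possible soft error)

    def execute(self, pc, operands, compute):
        """Run compute(*operands); return (result, ok)."""
        result = compute(*operands)
        key = (pc, operands)
        if key in self.entries:
            self.checked += 1
            if self.entries[key] != result:
                self.mismatches += 1  # raise the flag, keep the old entry
                return result, False
        self.entries[key] = result  # first execution or match: (over)write
        return result, True
```

The first execution of an instruction only fills the buffer; the check happens when the same instruction is seen again with the same inputs.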

Page 9: Spring 2008 CSE 591 Compilers for Embedded Systems

Outline
□ Soft Errors Recap
□ Process Technology and Packaging Solutions
□ Gate-level and Circuit-level Solutions
□ Microarchitectural Solutions
  □ Single-core
  □ Multi-threaded
□ Software Solutions
□ Multi Bit Upsets (MBUs)
□ Single Event Latchup

Page 10: Spring 2008 CSE 591 Compilers for Embedded Systems

Watchdog Processor & Control Flow Checking

□ Watchdog processor
  □ Simple processor, receives signals from the main processor
  □ Checks to see if the signals are coming in the correct order
    □ S3 should not come after S1
  □ Watchdog program can be automatically generated
  □ Formal techniques for correctness
  □ Asynchronous communication of main processor with watchdog processor

[Figure: processor, memory, and watchdog processor; basic blocks BB1, BB2, and BB3 send signals S1, S2, and S3]
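As a rough illustration of the order check, the sketch below assumes a straight-line control flow BB1, BB2, BB3, so the only legal signal sequence is S1, S2, S3. The table `ALLOWED_NEXT` and the function `watchdog_check` are hypothetical names; a real watchdog program would be generated from the actual control-flow graph.

```python
# Legal successors of each signal, assuming straight-line flow BB1 -> BB2 -> BB3.
# None stands for "no signal received yet".
ALLOWED_NEXT = {
    None: {"S1"},
    "S1": {"S2"},
    "S2": {"S3"},
    "S3": set(),
}


def watchdog_check(signals):
    """Return the index of the first out-of-order signal, or None if all are legal."""
    previous = None
    for index, signal in enumerate(signals):
        if signal not in ALLOWED_NEXT.get(previous, set()):
            return index
        previous = signal
    return None
```

A sequence in which S3 directly follows S1 is flagged at the position of S3.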

Page 11: Spring 2008 CSE 591 Compilers for Embedded Systems

EDDI (Error Detection by Duplicated Instructions)

□ Duplicate instructions
□ Validation instructions
  □ Store and branch are sync points
  □ Check store and branch operands
□ Memory penalty
  □ Load/store from duplicated locations
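The duplication idea can be sketched at source level as follows. This is an assumption-laden illustration rather than compiler output: the function `eddi_sum`, the `shadow` memory, and the `inject_fault` flag are invented for the example, and real EDDI operates on machine instructions and registers.

```python
class SoftErrorDetected(Exception):
    """Raised by the validation step when master and shadow values differ."""


def eddi_sum(values, memory, shadow, addr, inject_fault=False):
    total = 0      # master computation
    total_dup = 0  # duplicated computation on a shadow variable
    for v in values:
        total += v
        total_dup += v  # duplicated instruction
    if inject_fault:
        total ^= 1 << 3  # hypothetical bit flip in the master copy
    # The store is a sync point: validate its operand before it leaves the core.
    if total != total_dup:
        raise SoftErrorDetected("store operand mismatch")
    memory[addr] = total
    shadow[addr] = total_dup  # duplicated memory location
    return total
```

The extra dictionary `shadow` stands in for the duplicated memory locations, which is where the memory penalty comes from.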

Page 12: Spring 2008 CSE 591 Compilers for Embedded Systems

EDDI+CFCSS (Control Flow Checking by Software Signatures)

□ At the beginning of each node, perform G = G xor d (G: run-time signature, si: signature of node i, di: signature difference of node i)
  □ d2 = s1 xor s2; then G = s1 xor (s1 xor s2) = s2
□ If two source nodes jump to the same destination node, then the two source nodes should have the same signature
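A small sketch of the signature update, assuming a three-node chain B1, B2, B3 with arbitrarily chosen signatures. The names `SIGNATURES`, `PREDECESSOR`, `DIFF`, and `first_illegal_branch` are made up for the example.

```python
# Arbitrary, distinct signatures for a hypothetical chain B1 -> B2 -> B3.
SIGNATURES = {"B1": 0b0001, "B2": 0b0010, "B3": 0b0100}
PREDECESSOR = {"B2": "B1", "B3": "B2"}

# d_j = s_pred xor s_j, fixed at compile time for every non-entry node.
DIFF = {node: SIGNATURES[pred] ^ SIGNATURES[node]
        for node, pred in PREDECESSOR.items()}


def first_illegal_branch(path):
    """Walk a path starting at the entry node; return the index of the first
    node whose signature check fails, or None if every check passes."""
    g = SIGNATURES[path[0]]
    for index, node in enumerate(path[1:], start=1):
        g ^= DIFF[node]             # G = G xor d at the top of the node
        if g != SIGNATURES[node]:   # compare with the node's own signature
            return index
    return None
```

Jumping from B1 straight to B3 leaves G = s1 xor (s2 xor s3), which is not s3, so the check in B3 fails.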

Page 13: Spring 2008 CSE 591 Compilers for Embedded Systems

CFCSS + SWIFT (Software Implemented Fault Tolerance)

□ If two source nodes jump to the same destination node, then the two source nodes should have the same signature
  □ Need another path-dependent D
  □ B1 -> B5, D = 0; then G = s1 xor d5 xor 0 = s5, where d5 = s1 xor s5
  □ B3 -> B5, D = s1 xor s3; then G = s3 xor (s1 xor s5) xor (s1 xor s3) = s5
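The two cases above can be checked numerically. The signature values below are arbitrary assumptions, B4 is a hypothetical node that is not a legal predecessor of B5, and `enter_b5` is an illustrative helper, not part of any tool.

```python
# Arbitrary signatures; B1 and B3 both legally branch to B5, B4 does not.
S = {"B1": 0b0001, "B3": 0b0011, "B4": 0b0100, "B5": 0b0101}

D5 = S["B1"] ^ S["B5"]                         # d5 is defined relative to B1
D_ON_EXIT = {"B1": 0, "B3": S["B1"] ^ S["B3"]} # path-dependent adjustment D


def enter_b5(source):
    """Return True if the signature check in B5 passes when coming from source."""
    g = S[source]                  # G holds the source node's signature
    d = D_ON_EXIT.get(source, 0)   # nodes without an entry leave D at 0
    g = g ^ D5 ^ d                 # G = G xor d5 xor D at the top of B5
    return g == S["B5"]
```

Both legal predecessors reach s5 even though their signatures differ, while the illegal jump from B4 does not.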

Page 14: Spring 2008 CSE 591 Compilers for Embedded Systems

ED4I: Error Detection by Diverse Data and Duplicated Instructions

• The simplest way to detect Byzantine faults is to run the same program on multiple processors and compare results.
• ED4I is Byzantine fault detection for uniprocessors.
• Must take into account both temporary and permanent faults.
  • Re-executing with the same inputs does not guard against permanent faults
• Overhead = 100%

Page 15: Spring 2008 CSE 591 Compilers for Embedded Systems

Key Idea
• Let's feed into the program two different sets of data and then compare the results.
• Key Insight:
  • If the program only uses arithmetic operations, we can alter the input by multiplying all input numbers by a constant.
  • Then the modified output will be (real output) * (the constant).
  • Thus, you can verify that the two computations succeeded AND the two computations will be affected by errors differently.
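To make the insight concrete, the sketch below models a permanent fault as one output bit of an adder stuck at 0. The fault model, the function names, and the choice k = -2 are assumptions for illustration only.

```python
def faulty_add(a, b, stuck_bit=None):
    """Add two integers; optionally force one result bit to 0 (stuck-at-0)."""
    result = a + b
    if stuck_bit is not None:
        result &= ~(1 << stuck_bit)
    return result


def program(inputs, stuck_bit=None):
    """A program that uses only addition: the sum of its inputs."""
    total = 0
    for x in inputs:
        total = faulty_add(total, x, stuck_bit)
    return total


def ed4i_check(inputs, k, stuck_bit=None):
    """Run the program on the original and the k-scaled inputs and compare."""
    original = program(inputs, stuck_bit)
    diverse = program([k * x for x in inputs], stuck_bit)
    return diverse == k * original
```

Running the same inputs twice on the faulty adder gives the same wrong answer both times, so plain re-execution would not notice; with scaled inputs the fault corrupts the two runs differently and the comparison fails.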

Page 16: Spring 2008 CSE 591 Compilers for Embedded Systems

New Program

• If we alter the input to the program, we must alter the program to work with this modified input.

• The transformation is given the constant k (called the “diversity factor”) and it creates the “k-factor diverse program”.

• The new program will have the same control flow graph as the old program, but all the variables will be k-multiples of the original ones.

Page 17: Spring 2008 CSE 591 Compilers for Embedded Systems

Transformations
• If k<0, branches flip directions (> ↔ <, ≥ ↔ ≤)
• All constants in code get multiplied by k.
• Addition and subtraction of variables unchanged.
• Multiplication: v1*v2*...*vn → (v1*v2*...*vn)/k^(n-1)
• Division: v1/v2 → (v1/v2)*k
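A hand-transformed example may help. The original function, its constant 10, and the names `original` and `k_diverse` are hypothetical; `Fraction` is used only to keep the division exact in this sketch.

```python
from fractions import Fraction


def original(a, b, c):
    y = a * b + c
    q = Fraction(a, b)
    return y, q, y > 10


def k_diverse(a, b, c, k):
    """Same computation on inputs that have already been multiplied by k."""
    y = (a * b) // k + c     # product of n = 2 variables, divided by k^(n-1)
    q = Fraction(a, b) * k   # division result is multiplied by k
    limit = 10 * k           # constants are multiplied by k
    # For negative k the comparison direction flips.
    taken = y < limit if k < 0 else y > limit
    return y, q, taken
```

With k = -2, every value in the diverse run is -2 times its counterpart in the original run, and the branch decision is the same in both.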

Page 18: Spring 2008 CSE 591 Compilers for Embedded Systems

Fault Detection & Data Integrity

• For functional unit h_i (such as the adder), fault f and diversity factor k:
  • X_i = the set of inputs to h_i
  • E_i = subset of X_i containing the inputs that will result in erroneous output due to the fault
  • E'_i = subset of E_i that will escape detection
• C_i(k) = probability of catching an error in h_i
• D_i(k) = probability of missing no errors in h_i

$$C_i(k) = \sum_f P(f)\,\frac{|E_i| - |E'_i|}{|X_i|}$$

$$D_i(k) = \sum_f P(f)\left(1 - \frac{|E'_i|}{|X_i|}\right)$$

Page 19: Spring 2008 CSE 591 Compilers for Embedded Systems

Choosing the value of k
• For some functional units we can derive Ci(k) and Di(k) analytically for each k.
• This is too hard in general, so try out a range of k's empirically to determine Ci(k) and Di(k).
• Bus Signal (12-bit)
• 12-bit carry look-ahead adder
• 12-bit Multipliers and Dividers

Page 20: Spring 2008 CSE 591 Compilers for Embedded Systems

Analytical Computation of AVF

□ Iteration Space (IS)
  □ L-dimensional integer vector space
    □ L: levels of loop
  □ Each point in IS represents an iteration
  □ Data dependences exist
  □ Fully ordered in time
□ Array Space (AS)
  □ M-dimensional integer vector space
    □ M: array dimension
  □ Every point represents an element of the array

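$$IS = \{(x_1, x_2, \ldots, x_L) \in \mathbb{Z}^L \mid \forall i,\ 1 \le i \le L,\ 0 \le x_i < N_i\}$$

for (i=0; i<N1; i++)
  for (j=0; j<N2; j++)
    a[i][j] = a[i][j-1] + a[i-1][j] + a[i][j+1];

$$AS = \{(y_1, y_2, \ldots, y_M) \in \mathbb{Z}^M \mid \forall i,\ 1 \le i \le M,\ 0 \le y_i < D_i\}$$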

Page 21: Spring 2008 CSE 591 Compilers for Embedded Systems

Analytical Computation of AVF

□ Access Function (AF) of a reference
  □ Mapping from IS to AS
  □ When are the elements of the array accessed by a reference
□ References will access different parts of Array Space
  □ Divide the Array Space into regions, in which every element is accessed by a subset of references
□ Array Interval (AI): subset of AS that the reference accesses
  □ Every element is accessed by the same set of references

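$$AF_{a[i][j]} = \{(y_1, y_2) \in \mathbb{Z}^2 \mid y_1 = x_1,\ y_2 = x_2,\ 0 \le x_1 < N_1,\ 0 \le x_2 < N_2\}$$

$$AF_{a[i][j+10]} = \{(y_1, y_2) \in \mathbb{Z}^2 \mid y_1 = x_1,\ y_2 = x_2 + 10,\ 0 \le x_1 < N_1,\ 0 \le x_2 < N_2\}$$

$$AF_{a[2*i][j+1]} = \{(y_1, y_2) \in \mathbb{Z}^2 \mid y_1 = 2 x_1,\ y_2 = x_2 + 1,\ 0 \le x_1 < N_1,\ 0 \le x_2 < N_2\}$$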

Page 22: Spring 2008 CSE 591 Compilers for Embedded Systems

Analytical Computation of AVF

Iteration Intervals for an Array Interval

□ Each reference will access the elements of an array interval at iterations given by AF (Access Function)
  □ Iteration Interval (II) is AF in Array Interval
  □ Formula of access time of each element in II
□ Vulnerability can be computed as a formula on II
  □ Time from a read or write (r/w) to the next read (r)
  □ A reference either reads or writes (not both)
□ Need to time-order points in II
  □ Break into Iteration Segments, which can be ordered
    □ Strict order, or point-wise ordered
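For a single array element, the vulnerable time can be computed directly from a time-ordered access trace. The sketch below assumes the slide's "r/w r" means the interval from any access to the next read counts as vulnerable, while an interval ending in a write does not; the function name and trace format are made up for the example.

```python
def vulnerable_time(accesses):
    """accesses: iterable of (time, kind) with kind 'r' or 'w' for one element.

    An interval is counted only if it ends in a read, because a value that is
    overwritten before being read cannot propagate an error.
    """
    total = 0
    previous_time = None
    for time, kind in sorted(accesses):
        if kind == "r" and previous_time is not None:
            total += time - previous_time
        previous_time = time
    return total
```

For the trace write@0, read@4, read@9, write@12, read@15 the counted intervals are 0 to 4, 4 to 9, and 12 to 15; the interval 9 to 12 ends in a write and is skipped.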

Page 23: Spring 2008 CSE 591 Compilers for Embedded Systems

Outline
□ Soft Errors Recap
□ Process Technology and Packaging Solutions
□ Gate-level and Circuit-level Solutions
□ Microarchitectural Solutions
  □ Single-core
  □ Multi-threaded
□ Software Solutions
□ Multi Bit Upsets (MBUs)
□ Single Event Latchup

Page 24: Spring 2008 CSE 591 Compilers for Embedded Systems

Multiple-bit Upsets (MBUs)

□ Error rate ~ 1/100th of SEU
□ Hamming Code
  □ 1-bit error correction, 2-bit error detection
□ Reed Solomon Codes
  □ RS(n,k) with s-bit symbols
    □ s: each symbol is s bits
    □ n: total number of symbols per codeword, n = 2^s - 1
    □ k: number of data symbols
    □ Number of parity symbols = 2t = n-k
  □ Can correct errors in 't' symbols, where t = (n-k)/2
  □ RS(255, 223) with 8-bit symbols
    □ Can correct 16 symbol errors in each codeword (255 symbols)
□ Other multi-bit error detection and correction schemes
  □ LDPC
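The Reed-Solomon parameter arithmetic can be checked with a short helper. The function name and the returned dictionary keys are illustrative assumptions; n and k are counted in symbols.

```python
def rs_parameters(n, k, symbol_bits):
    """Derived parameters of an RS(n, k) code with symbol_bits-bit symbols."""
    max_n = 2 ** symbol_bits - 1
    if n > max_n:
        raise ValueError(f"n must be at most {max_n} for {symbol_bits}-bit symbols")
    parity_symbols = n - k
    return {
        "parity_symbols": parity_symbols,            # 2t
        "correctable_symbols": parity_symbols // 2,  # t
        "codeword_bits": n * symbol_bits,
    }
```

For RS(255, 223) with 8-bit symbols this gives 32 parity symbols, t = 16, and a 2040-bit codeword.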

Page 25: Spring 2008 CSE 591 Compilers for Embedded Systems

Copyright 2005, M. Tahoori

[Figure: outcome of a strike on a state bit (e.g., in register file)]
□ Bit read?
  □ no: benign fault, no error
  □ yes: Bit has error protection?
    □ Error can be corrected (e.g., ECC): no error
    □ Error is only detected (e.g., parity + no recovery): Detected, but unrecoverable error (DUE)
    □ no: Does bit matter?
      □ yes: Silent Data Corruption (SDC)
      □ no: benign fault, no error

Page 26: Spring 2008 CSE 591 Compilers for Embedded Systems

Interleaving bits

□ Interleaving converts
  □ a spatial multi-bit error into multiple single bit errors

[Figure: row of adjacent bits; X = covered with single ECC code, + = covered with different ECC code]
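A minimal sketch of why interleaving helps, assuming physical bit position p belongs to ECC codeword p mod ways. The mapping and the function names are assumptions chosen for illustration.

```python
def codeword_of(bit_position, ways):
    """Index of the ECC codeword that covers a physical bit position."""
    return bit_position % ways


def errors_per_codeword(first_bit, burst_length, ways):
    """Count how many bits of a burst of adjacent upsets land in each codeword."""
    counts = [0] * ways
    for position in range(first_bit, first_bit + burst_length):
        counts[codeword_of(position, ways)] += 1
    return counts
```

With 4-way interleaving a burst of 3 adjacent upsets puts at most one error in any codeword, so a single-error-correcting code per codeword is enough; without interleaving all 3 land in the same codeword.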

Page 27: Spring 2008 CSE 591 Compilers for Embedded Systems

Temporal Double Bit Errors

[Figure: two separate strikes on different bits, at cycle 100 and cycle 1,000,000]

□ SECDED ECC (single error correction, double error detection)
  □ can detect a double bit error, but cannot correct it
□ If errors accumulate
  □ a single bit correctable error becomes a double bit detectable error

Page 28: Spring 2008 CSE 591 Compilers for Embedded Systems

Solutions for Temporal Double Bit Errors

□ Natural Effects
  □ whenever a processor reads a cache block, we can correct the single bit error
  □ check for errors when cache blocks are evicted from the cache
□ More Powerful ECC
  □ SECDED ECC requires 8 bits per 64 bits
    □ 7 bits for single bit correction
    □ 8th bit for double bit detection
    □ Overhead = 13%
  □ ECC with two bit correction requires 12 bits per 64 bits
    □ Overhead = 19%
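The 7 + 1 check bits for 64 data bits follow from the Hamming condition 2^r >= m + r + 1, where m is the number of data bits and r the number of check bits. The helper names below are illustrative.

```python
def sec_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single error correction)."""
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """SEC check bits plus one overall parity bit for double error detection."""
    return sec_check_bits(data_bits) + 1
```

For 64 data bits this gives 7 and 8 check bits, so the SECDED overhead is 8/64 = 12.5%, which the slide rounds to 13%; the slide's 12 bits per 64 gives 18.75%, rounded to 19%.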

Page 29: Spring 2008 CSE 591 Compilers for Embedded Systems

Scrubbing
□ Periodically read memory and correct all single bit errors
□ Prevents accumulation of temporal double bit errors
□ Standard technique in main memories (DRAMs)
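A toy model of the effect, under several stated assumptions: every strike flips a distinct bit, nothing other than the scrubber corrects errors, and a scrub at every multiple of `scrub_interval` cycles repairs all single bit errors. The function name and the trace format are invented for the example.

```python
def double_bit_words(strikes, scrub_interval=None):
    """strikes: list of (cycle, word) single-bit upsets.

    Returns the set of words that accumulate two upsets between scrubs.
    With scrub_interval=None, memory is never scrubbed.
    """
    seen = set()     # (word, scrub window) pairs that already hold one error
    doubled = set()  # words that ended up with a double bit error
    for cycle, word in strikes:
        window = 0 if scrub_interval is None else cycle // scrub_interval
        key = (word, window)
        if key in seen:
            doubled.add(word)
        seen.add(key)
    return doubled
```

Two strikes on the same word at cycle 100 and cycle 1,000,000 accumulate into a double bit error without scrubbing, but not when the memory is scrubbed every 10,000 cycles.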

Page 30: Spring 2008 CSE 591 Compilers for Embedded Systems

Outline
□ Soft Errors Recap
□ Process Technology and Packaging Solutions
□ Gate-level and Circuit-level Solutions
□ Microarchitectural Solutions
  □ Single-core
  □ Multi-threaded
□ Software Solutions
□ Multi Bit Upsets (MBUs)
□ Single Event Latchup

Page 31: Spring 2008 CSE 591 Compilers for Embedded Systems

Single Event Latchup

□ SEL: Single Event Latchup
  □ Parasitic circuit elements forming a silicon controlled rectifier (SCR)
  □ Potentially destructive
    □ The device current may destroy the device if not current limited and removed "in time"
  □ Removal of power to the device is required in all non-catastrophic SEL conditions in order to recover device operations
  □ SEL probability increases with temperature!