SEU Mitigation Techniques for Xilinx Virtex-II Pro FPGA

Mandy M. Wang

JPL R&TD Mobility Avionics

Wang-110 D/MAPLD 2004


Agenda

• Project Background
• SEU Sensitive Areas and Mitigation Approaches
• Design Details
• Conclusion


Project Objective

The Mobility Avionics project aims to develop an embedded platform for space flight instruments and systems that is scalable, configurable, and capable of withstanding low- to medium-radiation environments.


Multi-Tiered Strategy

[Figure: applications placed on two axes, mission criticality (Not Mission Critical to Mission Critical) and time criticality (Not Time Critical to Time Critical), and grouped under three strategies: Simple Strategy, Robust Strategy, and Always Available Strategy.]

Applications shown: EDL Controller, Micro-Mobility Controller, Science Data Processor (appears twice), Image Processor, Orbiter Command Data Handler, Motor Control, Ground Support Equipment.

Low to medium radiation tolerance is assumed.


Strategies

Simple Strategy: A quick-and-dirty approach. It uses less-than-desirable techniques such as device reset and reconfiguration as the means of error correction. It may require an external computer for the configuration check.

Robust Strategy: A refinement of the simple strategy. It uses an SEU-immune FPGA as a monitoring device for the system board, which is based on a Xilinx FPGA device. As a result, no external computer is needed.


SEU Sensitive Areas

Normalized data, based on predicted upset rates (XC2VP20):

• PPC L1 Cache: 10.8 (64%)
• Registers: 0.46 (3%)
• Configuration MEM: 3.61 (22%)
• Block SelectRAM: 1.78 (11%)

Xilinx Virtex-II Pro SEU sensitive areas include:

• PPC405 Core registers
• Configuration Memory (LUT equation and Routing)
• Data path Registers
• User Memory (Block or Distributed RAMs)
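As a quick arithmetic check, the percentages quoted for the XC2VP20 can be recomputed from the four normalized upset figures. This is only a sketch: the dictionary keys and the `shares` helper are illustrative names, and the input numbers are simply those read from the slide.

```python
# Normalized upset figures as read from the slide (XC2VP20).
rates = {
    "PPC L1 Cache": 10.8,
    "Registers": 0.46,
    "Configuration MEM": 3.61,
    "Block SelectRAM": 1.78,
}


def shares(values):
    """Return each entry's percentage of the total."""
    total = sum(values.values())
    return {name: 100.0 * v / total for name, v in values.items()}


pct = shares(rates)
for name, p in sorted(pct.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {p:5.1f}%")
```

The recomputed shares agree with the slide to within one percentage point. The L1 cache comes out just under 65%, so the slide's 64% may have been truncated rather than rounded.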


Mitigation Approaches

SEU-Sensitive Area | Detection | Indicator | Mitigation | Fault Injection
Processor Registers | Processor comparison at the CoreConnect bus | Internal FF | Processor reset | Serial port
User Memory | EDAC | Internal FF | EDAC | Serial port
Configuration Memory | CRC | (None) | FPGA reconfiguration | Serial port
Data Registers | TMR | (None) | TMR | (None)
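The TMR entry for data registers can be illustrated with a small software model. This is a minimal sketch of generic triple modular redundancy, not the design's HDL: the `TmrRegister` class, its 32-bit width, and the `flip_bit` fault-injection helper are all assumptions made for the example.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote across three register copies."""
    return (a & b) | (b & c) | (a & c)


class TmrRegister:
    """Software model of a triplicated register with a voter on its output."""

    def __init__(self, value: int = 0, width: int = 32):
        self.mask = (1 << width) - 1
        self.copies = [value & self.mask] * 3

    def write(self, value: int) -> None:
        self.copies = [value & self.mask] * 3

    def read(self) -> int:
        voted = majority(*self.copies)
        self.copies = [voted] * 3  # write the voted value back to all copies
        return voted

    def flip_bit(self, copy: int, bit: int) -> None:
        """Emulate an SEU by flipping one bit in one of the three copies."""
        self.copies[copy] ^= 1 << bit


reg = TmrRegister(0xDEADBEEF)
reg.flip_bit(copy=1, bit=7)      # upset in one copy
print(hex(reg.read()))           # the voter masks the upset
```

Because the vote is taken per bit, upsets in different bit positions of different copies are also masked; two copies upset in the same bit position are not.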



System Design - Overview

[Figure: system block diagram. Blocks shown: PPC405 1, PPC405 2, C, PLB ARB, PLB2OPB Bridge, OPB ARB, Crit. INTC, Non-Crit INTC, DDR SDRAM Cntl, UARTs (External Devices), EXT MEM (128MB), OCM BRAM (8K), PLB BRAMs (Firmware) (32K), Status BRAMs (4K), EDC, EDC Controller, Serial Port Decoder (injects fault signals). FI markers label the fault insertion points.]


Dual-processor Comparator

[Figure: block diagram. Blocks shown: PPC 405 Block 1 and PPC 405 Block 2 (each with Cache Units, MMU, CPU, Timers and Debug), C, PLB Bus, Arbiter, DDR SDRAM Controller, two PLB IPIF blocks, and External SDRAM in the Off Chip Area. FI and PC markers are placed along the bus connections.]

Note: Yellow lines: PLB master read / write signals for D-Cache. Green lines: PLB master read signals for I-Cache.

FI: Fault insertion point

PC: Parity Check
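The comparison idea can be sketched in software: two processors run the same code, their bus requests are compared cycle by cycle, and any difference sets a sticky mismatch flag. The `PlbRequest` fields and the class and function names are hypothetical; the slides say only that PLB master read / write signals are involved, not exactly which ones are compared.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PlbRequest:
    """Hypothetical subset of one processor's PLB master signals in one cycle."""
    address: int
    write_data: int
    is_write: bool


class DualProcessorComparator:
    def __init__(self) -> None:
        self.mismatch = False  # sticky flag, standing in for the indicator flip-flop

    def check(self, cpu1: PlbRequest, cpu2: PlbRequest) -> bool:
        if cpu1 != cpu2:
            self.mismatch = True
        return self.mismatch

    def clear(self) -> None:
        """Called once the processors have been reset."""
        self.mismatch = False


def first_mismatch(trace1: Iterable[PlbRequest],
                   trace2: Iterable[PlbRequest]) -> Optional[int]:
    """Return the first cycle at which the two traces differ, or None."""
    comparator = DualProcessorComparator()
    for cycle, (a, b) in enumerate(zip(trace1, trace2)):
        if comparator.check(a, b):
            return cycle
    return None


good = [PlbRequest(0x1000 + 4 * i, i, True) for i in range(4)]
bad = list(good)
bad[2] = PlbRequest(good[2].address ^ 0x10, good[2].write_data, True)  # injected fault
print(first_mismatch(good, bad))
```

With only two processors the comparator can detect a disagreement but cannot tell which processor is wrong, which is consistent with the deck pairing this detection method with a processor reset.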


Dual-Processor Voting Simulation


EDAC OCM BRAMs (Read/Write)

• Hamming code [32,39]: 32 data bits plus 7 parity bits
• Read-modify-write to support the byte-enable feature
• Error information is stored in a separate memory space
• Single-bit error triggers a CPU interrupt
• Double-bit error triggers a CPU reset

Reference: Xilinx XAPP645

[Figure: EDAC data path. Blocks shown: PPC405 #1, PPC405 #2, Glue Logic, Parity Encoder, Error Detection / Correction, BRAMs (8KB). Signals shown: ADDR, EN, W_EN[3:0], CLK, ENCIN, ENOUT, PARITY_OUT, PARITY_IN, DECIN, DECOUT, ERROR, FORCE ERROR, Data Out (discard parity bits). Bus widths shown: 32 and 7.]
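To make the Hamming code [32,39] concrete, here is a minimal sketch of a textbook extended Hamming code that corrects single-bit errors and detects double-bit errors. It is not the XAPP645 logic or the design's actual bit layout; the function names and the placement of the overall parity bit at bit 0 are assumptions made for the example.

```python
def data_positions(k: int) -> list:
    """Codeword positions (1-based) that carry data: every non-power-of-two."""
    out, pos = [], 1
    while len(out) < k:
        if pos & (pos - 1):  # non-zero only when pos is not a power of two
            out.append(pos)
        pos += 1
    return out


def encode(data: int, k: int = 32) -> int:
    """Return the codeword as an int; bit 0 holds the overall parity."""
    positions = data_positions(k)
    code = 0
    for i, pos in enumerate(positions):
        if (data >> i) & 1:
            code |= 1 << pos
    syndrome = 0
    for pos in positions:
        if (code >> pos) & 1:
            syndrome ^= pos
    p = 1
    while p <= positions[-1]:  # set parity bits so the syndrome becomes zero
        if syndrome & p:
            code |= 1 << p
        p <<= 1
    code |= bin(code).count("1") & 1  # make the total number of ones even
    return code


def decode(code: int, k: int = 32):
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    positions = data_positions(k)
    n = positions[-1]
    syndrome = 0
    for pos in range(1, n + 1):
        if (code >> pos) & 1:
            syndrome ^= pos
    odd = bin(code).count("1") & 1
    if syndrome == 0 and not odd:
        status = "ok"
    elif odd and syndrome <= n:
        code ^= 1 << syndrome  # syndrome 0 means the overall parity bit flipped
        status = "corrected"
    else:
        status = "uncorrectable"
    data = 0
    for i, pos in enumerate(positions):
        data |= ((code >> pos) & 1) << i
    return data, status


word = 0xCAFEBABE
codeword = encode(word)
print(decode(codeword ^ (1 << 17)))  # single-bit upset: corrected
print(decode(codeword ^ 0b110)[1])   # double-bit upset: detected only
```

Calling the same functions with `k=64` yields a 72-bit codeword, so the sketch scales from 32 to 64 data bits without changes.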


EDAC PLB BRAMs (Read Only)

• Hamming code [64,72]: 64 data bits plus 8 parity bits
• Read-modify-write to support the byte-enable feature
• Single-bit error information is stored in a separate memory space
• Single-bit error triggers a CPU interrupt
• Double-bit error triggers a device reconfiguration

Reference: Xilinx XAPP645

[Figure: EDAC data path. Blocks shown: Processor Local Bus, PLB Interface, PLB BRAM Controller, Glue Logic, Parity Encoder, Error Detection / Correction, BRAMs (32KB + 8KB). Signals shown: ADDR, EN, W_EN, CLK, ENCIN, ENOUT, PARITY_OUT, PARITY_IN, DECIN, DECOUT, ERROR, FORCE ERROR, Data Out (discard parity bits). Bus widths shown: 64, 8, and 2.]


EDAC DDR SDRAM

• Hamming code [64,72]: 64 data bits plus 8 parity bits
• Read-modify-write to support the byte-enable and burst-of-2-words features
• Single-error information is stored in a separate memory space
• Single error triggers a CPU interrupt
• Double error triggers a device reconfiguration

Reference: Xilinx XAPP645

[Figure: EDAC data path. Blocks shown: Processor Local Bus, PLB interface modules, DDR SDRAM Controller, Glue Logic, Mux, Demux, Parity Encoder, Error Detection / Correction, DDR SDRAM (128MB + 32MB). Signals shown: ADDR, CLK, CLKn, ENCIN, ENOUT, PARITY_OUT, PARITY_IN, DECIN, DECOUT, ERROR, FORCE ERROR, Data Out (discard parity bits). Bus widths shown: 64, 32, 8, 4, and 2.]
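The read-modify-write step is needed because the check bits cover the whole protected word, so a write to only some bytes must first fetch the rest of the word before new check bits can be computed. The sketch below models that sequence for a 64-bit word. `EccMemory` and its methods are hypothetical names, and `check_bits` is a deliberately simple stand-in (an XOR of the bytes), not the design's Hamming code.

```python
def check_bits(word: int) -> int:
    """Stand-in check code: XOR of the eight bytes. NOT the design's Hamming code."""
    c = 0
    for i in range(8):
        c ^= (word >> (8 * i)) & 0xFF
    return c


class EccMemory:
    """Toy model of a 64-bit-wide memory that stores check bits beside each word."""

    def __init__(self) -> None:
        self.cells = {}  # address -> (data, check)

    def read(self, addr: int) -> int:
        data, check = self.cells.get(addr, (0, 0))
        if check_bits(data) != check:
            raise ValueError(f"check-bit mismatch at address {addr:#x}")
        return data

    def write(self, addr: int, data: int, byte_enable: int = 0xFF) -> None:
        if byte_enable != 0xFF:
            old = self.read(addr)                 # 1) read the full stored word
            mask = 0
            for i in range(8):
                if (byte_enable >> i) & 1:
                    mask |= 0xFF << (8 * i)
            data = (old & ~mask) | (data & mask)  # 2) merge in the enabled bytes
        data &= (1 << 64) - 1
        self.cells[addr] = (data, check_bits(data))  # 3) write data and new check bits


mem = EccMemory()
mem.write(0, 0x1122334455667788)
mem.write(0, 0xAA00, byte_enable=0b00000010)  # update byte 1 only
print(hex(mem.read(0)))
```

A full-word write skips the read, which is why only partial writes pay the extra memory access.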


Self Configuration Checker

[Figure: block diagram. Inside the Virtex-II Pro: ICAP Controller, ICAP, CRC Checker, and Frame Address Memory (BRAMs). Labels on the connections: 4 Bytes; Read Back Commands (44 Bytes). Tool flow shown: Digital Design, Implementation, top.bit, top.ll (contains the frame addresses used for the design), C script, frame address data formatted for the BRAMs.]

This portion can be ported to a radiation-hardened FPGA in the case of the robust strategy.


Self Configuration Checker Design Highlights

• No external I/O access required
• Frame-by-frame readback required
• 32-bit CRC algorithm implemented (a CRC signature is generated after device power-up)
• No SRL16s or distributed SelectRAMs used in the design
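The checker's loop can be sketched as follows: read back each frame, fold it into a running CRC, and compare the result with the signature captured after power-up. This is a software model under stated assumptions. `ConfigurationChecker`, `crc_signature`, and the `read_frame` callback are hypothetical names, the frame contents are made up, and zlib's CRC-32 stands in for whatever 32-bit CRC the design actually implements.

```python
import zlib


def crc_signature(frames) -> int:
    """Fold every readback frame into one running CRC-32."""
    crc = 0
    for frame in frames:
        crc = zlib.crc32(frame, crc)
    return crc


class ConfigurationChecker:
    """Compares a frame-by-frame readback CRC with a signature taken at start-up."""

    def __init__(self, read_frame, frame_addresses):
        self.read_frame = read_frame  # callback standing in for an ICAP readback
        self.frame_addresses = list(frame_addresses)
        self.golden = self._scan()    # signature generated after power-up

    def _scan(self) -> int:
        return crc_signature(self.read_frame(a) for a in self.frame_addresses)

    def check(self) -> bool:
        """Return True while the configuration still matches the signature."""
        return self._scan() == self.golden


# Made-up configuration memory: three frames of 16 bytes each.
config = {addr: bytearray(16) for addr in (0x0000, 0x0200, 0x0400)}
checker = ConfigurationChecker(lambda a: bytes(config[a]), config.keys())
print(checker.check())    # True: nothing has changed
config[0x0200][5] ^= 0x10  # emulate an upset in one configuration bit
print(checker.check())    # False: the signature no longer matches
```

A single signature covers the whole frame list here, so a failed check indicates that some frame changed but not which one.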


LabVIEW Fault Injection Panel

Screenshot of the fault injection emulator that interfaces with the prototype board.

[Figure: panel screenshot. Callouts: Fault Injection Error Counters; Process Bus Fault Injection Buttons; Processors Mismatch LED Indicator; Fault location map; ASCII Command Input window.]

Program counter resets to zero when a CPU reset occurs.


XC2VP20 Device Utilization (without TMR)

• Number of External IOBs: 57 out of 564 (10%)
• Number of PPC405s: 2 out of 2 (100%)
• Number of RAMB16s: 30 out of 88 (34%)
• Number of SLICEs: 4334 out of 9280 (46%)
• Number of BUFGMUXs: 6 out of 16 (37%)
• Number of DCMs: 2 out of 8 (25%)
• Number of ICAPs: 1 out of 1 (100%)
• Number of JTAGPPCs: 1 out of 1 (100%)


Slice Utilization (without TMR)

[Figure: bar chart with categories VP20, VP40, and VP100 and an axis running from 0 to 45000.]

Slices per module, first group:

• OPB Arbiter: 40 (2.0%)
• OPB2DCR Bridge: 90 (4.5%)
• PLB BRAM Controller: 163 (8.2%)
• OCM BRAM Controller: 278 (14.0%)
• Interrupt Controller: 341 (17.2%)
• Uart Transceiver: 368 (18.6%)
• PLB2OPB Bridge: 700 (35.4%)
• PLB Arbiter: 1005 (50.8%)

Slices per module, second group:

• Fault Injection Module: 93 (5.4%)
• Configuration Checker: 156 (9.0%)
• Dual-processor comparator: 178 (10.3%)
• OCB EDAC 32-bit Module: 233 (13.4%)
• PLB EDAC 64-bit Module: 467 (26.9%)
• Hardware Status Memory Controller: 606 (35.0%)

Note: The shaded modules can be replaced by other approaches.


Mitigation State Machine

[Figure: state diagram ordered by mitigation severity. States shown: Normal, CPU Interrupt, CPU Reset, System Reset, FPGA Reconfiguration.]

Triggers shown for each state:

• CPU Interrupt: 1) OCM BRAM single-bit error 2) PLB BRAM single-bit error 3) DDR SDRAM single-bit error
• CPU Reset: 1) CPU mismatch 2) CPU watchdog timer 3) OCM EDC double-bit error
• System Reset: 1) OPB Bus error 2) PLB Bus error
• FPGA Reconfiguration: 1) Configuration check fail 2) PLB EDC double-bit error 3) DDR SDRAM double-bit error

Transition labels shown: CPU reset counter == full; System reset counter == full
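The state machine can be read as a severity ladder with two escalation counters. The sketch below encodes the trigger lists from this slide; everything else is an assumption: the event names, the counter limit of 3, the reading that a full CPU reset counter escalates to a system reset and a full system reset counter escalates to FPGA reconfiguration, and the clearing of both counters on reconfiguration.

```python
from enum import IntEnum


class Action(IntEnum):
    """Mitigation actions in increasing order of severity."""
    NORMAL = 0
    CPU_INTERRUPT = 1
    CPU_RESET = 2
    SYSTEM_RESET = 3
    FPGA_RECONFIGURATION = 4


# Event names are hypothetical labels for the triggers listed on the slide.
TRIGGERS = {
    "ocm_bram_single_bit": Action.CPU_INTERRUPT,
    "plb_bram_single_bit": Action.CPU_INTERRUPT,
    "ddr_single_bit": Action.CPU_INTERRUPT,
    "cpu_mismatch": Action.CPU_RESET,
    "cpu_watchdog": Action.CPU_RESET,
    "ocm_double_bit": Action.CPU_RESET,
    "opb_bus_error": Action.SYSTEM_RESET,
    "plb_bus_error": Action.SYSTEM_RESET,
    "config_check_fail": Action.FPGA_RECONFIGURATION,
    "plb_double_bit": Action.FPGA_RECONFIGURATION,
    "ddr_double_bit": Action.FPGA_RECONFIGURATION,
}


class MitigationStateMachine:
    def __init__(self, counter_limit: int = 3):  # the limit is an assumed value
        self.counter_limit = counter_limit
        self.cpu_resets = 0
        self.system_resets = 0

    def handle(self, event: str) -> Action:
        action = TRIGGERS[event]
        if action == Action.CPU_RESET:
            self.cpu_resets += 1
            if self.cpu_resets >= self.counter_limit:     # CPU reset counter full
                self.cpu_resets = 0
                action = Action.SYSTEM_RESET
        if action == Action.SYSTEM_RESET:
            self.system_resets += 1
            if self.system_resets >= self.counter_limit:  # system reset counter full
                self.system_resets = 0
                action = Action.FPGA_RECONFIGURATION
        if action == Action.FPGA_RECONFIGURATION:
            self.cpu_resets = self.system_resets = 0      # assumed: start from a clean slate
        return action


sm = MitigationStateMachine(counter_limit=2)
for event in ("cpu_mismatch", "cpu_mismatch", "opb_bus_error", "ddr_single_bit"):
    print(event, "->", sm.handle(event).name)
```

Under these assumptions, repeated low-severity recoveries that keep failing eventually force the most severe one, which matches the ordering by mitigation severity in the diagram.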


Conclusion

• Identified the error-prone regions of the Virtex-II Pro and categorized them into four types.
• Developed mitigation strategies for each region.
• Radiation testing of the overall system is in progress.


Acronyms

• SEU: Single Event Upset

• FPGA: Field Programmable Gate Array

• LUT: Look Up Table

• PLB: Processor Local Bus

• OPB: On-Chip Peripheral Bus

• OCM: On-Chip Memory

• EDAC: Error Detection and Correction

• ICAP: Internal Configuration Access Point