LaRCJones Slide 1 p174/ MAPLD 2004

Experiences in the Development of an FPGA Based Radiation Tolerant Design

Mark Jones

[email protected]

757-864-7878

Dr. Robert Klenke

[email protected]

(804) 827-7007

LaRCJones Slide 2 p174/ MAPLD 2004

GIFTS

• Geosynchronous Imaging Fourier Transform Spectrometer

• Was not funded to completion

LaRCJones Slide 3 p174/ MAPLD 2004

GIFTS Modules

• Sensor Module
• Control Module
• Modulators (Downlink)

LaRCJones Slide 4 p174/ MAPLD 2004

GIFTS Control Module

[Diagram: 6U CPCI, 33 MHz, 32-bit, 3.3 V. Labeled blocks: IC (Instrument Controller, BAE 750), IO, DL, EDS, and MEM (2 MB SRAM).]

LaRCJones Slide 5 p174/ MAPLD 2004

GIFTS Control Module Data Flow

[Diagram: data flow among the MEM, DLINK, IC, IO, and EDS blocks over the CPCI bus. The label "Actel PCI Core" appears on several blocks. Labeled interfaces:
• SM Data I/F: serialized LVDS, 21-bit data, 16 MHz
• SM Command I/F: 422 differential
• Spacecraft I/F: 1553
• Downlink: X-Band, 80 Mbps, SMQ-11
Note on diagram: Requires sequential "block" readout to downlink.]

LaRCJones Slide 6 p174/ MAPLD 2004

MEM overall function and purpose

• The CM must accept SM LVDS data at up to 256 Mbit/s (160 Mbit/s average) and transfer that data to the communications payload at 80 Mbit/s.

• Meeting this throughput requires that the data be compressed (by the DWNLNK), which results in a non-continuous, variable throughput.
– The compression ratio achieved by the DWNLNK varies with the entropy of the current image.
– In the worst case, the DWNLNK cannot keep up with the bandwidth of the incoming data.

[Diagram: SM data → MEM → DWNLNK → S/C.
• 160 Mbit/s data from the SM, no flow control
• MEM: 2 Mbyte SRAM-based FIFO buffers incoming data during low-compression periods on the DWNLNK
• MEM-to-DWNLNK data transfer via the cPCI bus
• DWNLNK: Rice compression (variable compression rates), CCSDS
• Fixed-rate data to the S/C, 80 Mbit/s]

LaRCJones Slide 7 p174/ MAPLD 2004

MEM overall function and purpose, continued

• During periods of low image compression, the SM data rate exceeds what the DWNLNK can accept, so the MEM must store incoming data from the LVDS interface until the DWNLNK is ready to process and send it.

• System engineering studies, using typical values of data entropy across an image, indicate that the MEM needs at least 1 Mbyte of storage to avoid overflow and the resulting loss of image data. A 2 Mbyte buffer is used to provide a 2X margin.

• The MEM's 2 Mbytes of SRAM are organized as a First In First Out (FIFO) buffer.
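The buffering argument above can be illustrated with a rough occupancy model. This is a sketch under stated assumptions, not the system engineering study the slide refers to: the function name, the time step, and the compression-ratio profile are all hypothetical, and the DWNLNK is assumed to consume uncompressed data at 80 Mbit/s times its current compression ratio.

```python
DOWNLINK_MBPS = 80                 # fixed output rate to the S/C (from the slides)
FIFO_BYTES = 2 * 1024 * 1024       # 2 Mbyte MEM buffer (from the slides)


def peak_fifo_occupancy(in_rates_mbps, compression_ratios, step_us):
    """Return the peak FIFO occupancy in bytes.

    Assumption: in each step the DWNLNK drains DOWNLINK_MBPS * ratio Mbit/s
    of uncompressed data. Mbit/s multiplied by microseconds gives bits.
    """
    occupancy_bits = 0
    peak_bits = 0
    for rate_in, ratio in zip(in_rates_mbps, compression_ratios):
        drain = DOWNLINK_MBPS * ratio
        occupancy_bits += (rate_in - drain) * step_us
        occupancy_bits = max(occupancy_bits, 0)   # an empty FIFO cannot go negative
        peak_bits = max(peak_bits, occupancy_bits)
    return peak_bits // 8


# Hypothetical profile: three 10 ms steps of incompressible data (ratio 1),
# then three steps where compression reaches 3:1 and the FIFO drains.
peak = peak_fifo_occupancy([160] * 6, [1, 1, 1, 3, 3, 3], step_us=10_000)
print(peak, peak < FIFO_BYTES)
```

With this made-up profile the FIFO peaks well below 2 Mbytes; a realistic check would use measured per-image entropy data instead.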

LaRCJones Slide 8 p174/ MAPLD 2004

Memory Board Design Flow and Tools

• Actel Libero Platinum Toolset

– VHDL design entry using a text editor

– Pre-synthesis functional simulation using ModelSim and a highly modified Actel-supplied PCI test bench

– Synthesis using the Synplicity synthesis tool

– Post-synthesis simulation using ModelSim

• A fully licensed copy of ModelSim (as opposed to the reduced-performance version supplied with Libero) was necessary to complete post-synthesis and post-place & route simulations in reasonable time

– Place & route with Actel’s Designer tools

• Timer static timing analysis tool used to check rough timing and apply timing constraints

– Post-place & route timing simulation with ModelSim

LaRCJones Slide 9 p174/ MAPLD 2004

Functional Block Diagram – Clock Domains

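[Diagram: labeled blocks are the Deserializer (LVDS data in), LVDS FIFO, SRAM Controller, DMA Engine, DMA FIFO, Actel PCI Core (to the cPCI bus), and two groups of BAE 238A792 SRAM (2). Three FPGA labels appear: Actel RT54SX32S FPGA, Actel RT54SX32S FPGA (2 – bit sliced), and Actel RT54SX32S FPGA. Data-path widths are labeled 16, 32, and 64 bits. Clock domains: LVDS Clock Domain, Core Clock Domain (delayed cPCI clock), and cPCI Clock Domain.]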

LaRCJones Slide 10 p174/ MAPLD 2004

GIFTS Memory Board Architecture

– 100 ohm differential LVDS impedance requirement

– CPCI protocol includes:

• Bus terminating resistors within 0.5 inches of the cPCI connector
• 65 ohm board impedance for CPCI signals
• Specific trace lengths, which affect FPGA placement relative to the CPCI connector

– The data at the LVDS data interface arrives as a serial bit stream. The serialization, however, is transparent to the MEM because the output of the Deserializer is parallel.

– There are 3 clock domains: the incoming LVDS clock, the PCI clock, and a third clock delayed from the PCI clock. The third domain exists because the CPCI spec allows only 1 load on the CPCI clock and no rad-hard (low SEU), low-skew buffer or PLL was found, so the DMA Engine and SRAM controllers run from the delayed clock.

– To make the design independent of the amount of clock skew, and not reliant on place and route for proper function, this resulted in

• Chip-to-chip asynchronous data transfer using control signals that employ 4-state handshaking

– The “initiator” asserts the “Ready” signal
– The “receiver” asserts the “Acknowledge” signal
– The “initiator” de-asserts the “Ready” signal when done with the transfer
– The “receiver” de-asserts the “Acknowledge” signal, indicating it is done with the transfer and ready for the next
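The four handshake steps listed above can be sketched as a small behavioral model. This is an illustrative Python model, not the VHDL used on the MEM; the class and method names are hypothetical, and synchronizer delays between clock domains are deliberately left out.

```python
class FourPhaseLink:
    """Behavioral model of one Ready/Acknowledge link between two chips."""

    def __init__(self):
        self.ready = False      # driven by the initiator
        self.ack = False        # driven by the receiver
        self.data = None        # data lines, held stable while Ready is high
        self.received = []      # words captured by the receiver
        self.trace = []         # (ready, ack) after each phase

    def _log(self):
        self.trace.append((int(self.ready), int(self.ack)))

    def transfer(self, word):
        if self.ready or self.ack:
            raise RuntimeError("link must be idle before a new transfer")
        # Phase 1: initiator drives data and asserts Ready.
        self.data = word
        self.ready = True
        self._log()
        # Phase 2: receiver captures the data and asserts Acknowledge.
        self.received.append(self.data)
        self.ack = True
        self._log()
        # Phase 3: initiator de-asserts Ready when done with the transfer.
        self.ready = False
        self._log()
        # Phase 4: receiver de-asserts Acknowledge; the link is idle again.
        self.ack = False
        self._log()


link = FourPhaseLink()
link.transfer(0xCAFE)
print(link.trace)   # sequence of (ready, ack) pairs for one transfer
```

Because each side waits for a level on the other side's signal rather than for a clock edge, the sequence does not depend on the skew between the two clocks, which is the property the slide is after.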

LaRCJones Slide 11 p174/ MAPLD 2004

GIFTS Memory Board Architecture

– The cPCI bus can transmit one 32-bit double word (Dword) on each cycle of the 33 MHz cPCI clock. To maintain 100% bandwidth on the cPCI bus, the PCI core must therefore be supplied with a 32-bit Dword on its input queue every 30.3 ns clock cycle. However, the SRAM devices originally available for the MEM design had a fastest access time of 35 ns. To achieve improved bus bandwidth despite this restriction, two Dwords must be retrieved from the SRAM and passed to the PCI core during a single memory read operation.

– Furthermore, because data arrives across the LVDS interface during a DMA operation, the SRAM must support a read (to retrieve data for the current DMA operation) and a write (to store the just-received LVDS data) concurrently.

– These two considerations required that the SRAM in the MEM be configured as two 64-bit banks, each 128K words deep (128K × 64), so that simultaneous read and write operations can be performed on opposite banks in a ping-pong fashion.

– The MEM design requires that all DMA data transfers be performed on a 64-bit boundary (i.e., an even number of 32-bit Dwords).
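The timing argument can be checked with a little arithmetic. The sketch below assumes, as a simplification that is not stated on the slide, that each SRAM read occupies a whole number of cPCI clock cycles; `bus_utilization` is a hypothetical helper name.

```python
import math

PCI_CLOCK_HZ = 33_000_000    # 33 MHz cPCI clock (from the slide)
DWORD_BITS = 32              # one Dword per clock at full bandwidth
SRAM_ACCESS_NS = 35.0        # fastest SRAM access time originally available

clock_period_ns = 1e9 / PCI_CLOCK_HZ               # about 30.3 ns
peak_bus_mbps = PCI_CLOCK_HZ * DWORD_BITS / 1e6    # peak cPCI data rate


def bus_utilization(memory_width_bits, access_ns=SRAM_ACCESS_NS):
    """Fraction of cPCI clocks that can carry a Dword for a given SRAM width.

    Assumption: a read takes ceil(access time / clock period) whole clocks.
    """
    clocks_per_read = math.ceil(access_ns / clock_period_ns)
    dwords_per_read = memory_width_bits // DWORD_BITS
    return min(1.0, dwords_per_read / clocks_per_read)


print(round(clock_period_ns, 1), peak_bus_mbps)
print(bus_utilization(32), bus_utilization(64))
```

Under this assumption a 32-bit memory can feed the bus on only half of the clocks, while a 64-bit read every two clocks keeps pace, which matches the slide's reasoning for the 64-bit banks.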

LaRCJones Slide 12 p174/ MAPLD 2004

Functional Block Diagram – Data Flow

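[Diagram: LVDS data enters the LVDS FIFO, which feeds the SRAM Controller. The SRAM Controller connects to the Left SRAM (128K × 64) and the Right SRAM (128K × 64) and to the PCI Core, which drives the cPCI bus. Data paths are labeled 64 bits wide.]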

LaRCJones Slide 13 p174/ MAPLD 2004

Functional Block Diagram – Data Flow

[Same data-flow diagram as Slide 12.]

LaRCJones Slide 14 p174/ MAPLD 2004

Functional Block Diagram – Data Flow

[Same data-flow diagram as Slide 12.]

LaRCJones Slide 15 p174/ MAPLD 2004

Functional Block Diagram – Data Flow

[Diagram: same blocks as Slide 12 (LVDS FIFO, SRAM Controller, Left SRAM 128K × 64, Right SRAM 128K × 64, PCI Core, 64-bit data paths, LVDS data in, cPCI bus out), annotated with word ordering:
• LVDS data arriving: … w7, w6, w5, w4, w3, w2, w1, w0
• LVDS FIFO: w0 w1 w2 w3, then w4 w5 w6 w7, then w8 w9 w10 w11
• Right SRAM: w0 w1 w2 w3
• Left SRAM: w4 w5 w6 w7
• To PCI Core: [w7 w6], [w5 w4]; [w3 w2], [w1 w0]]
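The word ordering in this diagram can be mimicked with a short model. It rests on assumptions read off the diagram rather than stated in the text: each group of four words fills one 64-bit SRAM entry, groups alternate between the Right and Left banks starting with the Right, and each entry is read out as two word pairs. The function names are hypothetical.

```python
def distribute(words, words_per_entry=4):
    """Split a word stream into 64-bit entries, alternating Right/Left banks."""
    right, left = [], []
    for start in range(0, len(words), words_per_entry):
        entry = tuple(words[start:start + words_per_entry])
        entry_index = start // words_per_entry
        (right if entry_index % 2 == 0 else left).append(entry)
    return right, left


def read_out(right, left):
    """Read entries back in arrival order and split each into word pairs."""
    pairs = []
    for index in range(max(len(right), len(left))):
        for bank in (right, left):          # Right entry first, then Left
            if index < len(bank):
                entry = bank[index]
                for j in range(0, len(entry), 2):
                    pairs.append(entry[j:j + 2])   # (low word, high word)
    return pairs


stream = [f"w{i}" for i in range(8)]
right_bank, left_bank = distribute(stream)
print(right_bank, left_bank)
print(read_out(right_bank, left_bank))
```

Pairs are listed here low word first, whereas the diagram writes them high word first, as in [w1 w0].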

LaRCJones Slide 16 p174/ MAPLD 2004

GIFTS Memory Board Architecture

– The Actel PCI core IP was designed for a standard memory module, in which all of the data to be DMAed is available before the transfer starts (otherwise the core is starved for data). That is not the case in this design, because at any given time the IC does not know exactly how much data is available for DMA transfer. It was therefore necessary to design a “DMA engine” into the MEM that keeps track of how much data has been received by the MEM and stored in the SRAM, and that configures and initiates the individual DMA operations in the cPCI core until the requested total amount of data has been transferred to the destination across the cPCI bus.
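The bookkeeping described above might look roughly like the following loop. This is a behavioral sketch, not the MEM's actual DMA engine: `run_dma`, the polling model, and `max_burst_dwords` are hypothetical, and the requested total is assumed to be an even number of Dwords, in line with the 64-bit boundary rule from Slide 11.

```python
def run_dma(requested_dwords, arrivals, max_burst_dwords=64):
    """Issue individual DMA bursts until the requested total is transferred.

    arrivals: number of Dwords newly stored in SRAM at each polling step.
    Returns (list of burst sizes issued, Dwords still outstanding).
    """
    available = 0                   # Dwords in SRAM not yet transferred
    remaining = requested_dwords    # Dwords still owed to the requester
    bursts = []
    for newly_stored in arrivals:
        available += newly_stored
        burst = min(available, remaining, max_burst_dwords)
        burst -= burst % 2          # keep every burst on a 64-bit boundary
        if burst > 0:
            bursts.append(burst)    # configure and start one DMA in the PCI core
            available -= burst
            remaining -= burst
        if remaining == 0:
            break
    return bursts, remaining


print(run_dma(10, [3, 3, 6, 4], max_burst_dwords=4))
```

The point of the sketch is that the burst size is chosen from what is actually in the SRAM, so the PCI core is never started on data that has not arrived yet.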

LaRCJones Slide 17 p174/ MAPLD 2004

GIFTS Memory Board Architecture

• Bit-slicing SRAM Controller functionality
– 64-bit data paths result from memory speed and throughput requirements
– However, these cannot be implemented in one RT54SXS due to I/O limitations
– As a result, the SRAM Controller core is bit-sliced across 32-bit boundaries
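Bit-slicing on a 32-bit boundary amounts to each FPGA handling half of every 64-bit word. A minimal sketch of the split and rejoin follows; which physical controller carries the upper or lower half is not stated on the slide, so the function names and the ordering here are assumptions.

```python
MASK32 = 0xFFFF_FFFF


def split_64(word):
    """Split a 64-bit word into (upper 32 bits, lower 32 bits)."""
    return (word >> 32) & MASK32, word & MASK32


def join_64(upper, lower):
    """Recombine two 32-bit slices into the original 64-bit word."""
    return ((upper & MASK32) << 32) | (lower & MASK32)


upper, lower = split_64(0x1122334455667788)
print(hex(upper), hex(lower))
print(hex(join_64(upper, lower)))
```

Each slice needs only 32 data pins per port instead of 64, which is how the split works around the I/O limit of a single device.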

LaRCJones Slide 18 p174/ MAPLD 2004

Bit-Sliced FPGA Functional Block Diagram

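[Diagram: same blocks as the data-flow diagrams, with the SRAM Controller split into a Left SRAM Controller and a Right SRAM Controller. The LVDS FIFO, PCI Core, Left SRAM (128K × 64), and Right SRAM (128K × 64) are shown with 64-bit paths; the paths to and from each controller are labeled 32 bits.]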

LaRCJones Slide 19 p174/ MAPLD 2004

GIFTS Memory Board Architecture

• Requires 4 Actel RT54SX32S FPGAs
– SRAM Controller core bit-sliced into 2 FPGAs
– Based on total cell utilization:
  – PCI Core FPGA: 69%
  – Left SRAM Controller: 67%
  – Right SRAM Controller: 53%
  – LVDS FIFO: 72%
– PCI Core and LVDS FIFO each implemented on a separate FPGA, resulting in 4 FPGAs

LaRCJones Slide 20 p174/ MAPLD 2004

Block Diagram

[Diagram: labeled blocks are the Deserializer (LVDS data in), LVDS FIFO, Left SRAM/DMA Controller, Right SRAM Controller, DMA FIFO, Actel PCI Core (to the cPCI bus), and two groups of BAE 238A792 SRAM (2). Four Actel RT54SX32S FPGA labels appear. Data-path widths are labeled 16, 32, and 64 bits. Clock domains: LVDS Clock Domain, Core Clock Domain (delayed cPCI clock), and cPCI Clock Domain.]

LaRCJones Slide 21 p174/ MAPLD 2004

MEM Design Status

• Initial prototype design completed
– Limitations imposed by available FPGA size, speed, and I/O pins overcome
– Limitations imposed by Actel-supplied PCI Core overcome
– Limitations imposed by available SRAM speed overcome
– Complexity of design resulting from 3 asynchronous clock domains overcome

• Initial brassboard completed and tested
– Demonstrated fundamental design concept
– Successfully completed reliable DMA of data from LVDS stimulator to PCI-resident memory
– Uncovered several significant bugs and needed additional functionality
– Successfully redesigned and simulated at the behavioral and post-synthesis level
– Revised version of the design will work with existing PCB without redesign

LaRCJones Slide 22 p174/ MAPLD 2004

MEM Rev. 2 Design to-be-completed

• Complete timing layout and test in post-layout simulation with back-annotated timing
• Test on the brassboard
• Complete documentation of the design

LaRCJones Slide 23 p174/ MAPLD 2004

Backup

LaRCJones Slide 24 p174/ MAPLD 2004

Brassboard Layout

• Layout details
– CPCI requirements met for terminating resistor placement and trace lengths to the PCI FPGA
– CPCI 65 ohm impedance requirement met on all signal layers, facilitated by stackup design and the trace width and separation relationship
– 30 mil separation between asynchronous CPCI signals and others
– 100 ohm differential LVDS impedance requirement met on all signal layers, facilitated by stackup design, the trace width and separation relationship, and a custom thru-hole arrangement for the Micro_Twinax cable connection
– Differential LVDS signals surrounded by a ground guard plane
  • spaced 20 mils (2 x distance between pair) away from pairs
– Clocks are routed internal to the board, between planes, with ground guard traces on both sides
– Edges of the board will be milled to allow the 90 mil board to slide into 62 mil guides
– Layout approximately 99% complete

LaRCJones Slide 25 p174/ MAPLD 2004

Brassboard Layout

[Figure: Memory Board preliminary sketch. Labeled elements: cPCI connectors; regulator; Deserializer; PCI_CORE FPGA; LVDS_FIFO_FPGA; LEFT_SRAM_CTRL FPGA; RIGHT_SRAM_CTRL FPGA; SRAM SIMM sockets and board areas (U5, U6, U7, U8); panel-mount twin BNC cable connectors (.3 over board); non-grounded front panel with 6-32 mounting holes; .5 inch component keepouts for wedgelock channels; 2.25 inches minimum spacing callouts; ~2" callouts; "See attached detail". Dimension callouts: 160 mm, 233.35 mm, max 275 mm.]

LaRCJones Slide 26 p174/ MAPLD 2004

Brassboard Layout – Stackup

• Layers 1, 9
– 65 ohms: 5 mil wide / 10 mil separation
– 100 ohms differential: 7 mil wide / 10 mil separation
• Layers 3, 6
– 65 ohms: 5 mil wide / 10 mil separation
– 100 ohms differential: 7 mil wide / 12 mil separation