LaRC p174/ MAPLD 2004Jones Slide 1 Experiences in the Development of an FPGA Based Radiation...
LaRCJones Slide 1 p174/ MAPLD 2004
Experiences in the Development of an FPGA Based Radiation Tolerant Design
Mark Jones
757-864-7878
Dr. Robert Klenke
(804) 827-7007
LaRCJones Slide 2 p174/ MAPLD 2004
GIFTS
• Geosynchronous Imaging Fourier Transform Spectrometer
• Was not funded to completion
LaRCJones Slide 4 p174/ MAPLD 2004
GIFTS Control Module
[Figure: Control Module block diagram. Recoverable labels:]
• Backplane: 6U CPCI, 33 MHz, 32 bit, 3.3 Volt
• Boards: IC (Instrument Controller, BAE 750), IO, DL, EDS, MEM (2 MB SRAM)
LaRCJones Slide 5 p174/ MAPLD 2004
GIFTS Control Module Data Flow
[Figure: data flow between the Control Module boards over the CPCI bus. Recoverable labels:]
• Boards: MEM, DLINK, IC, IO, EDS; "Actel PCI Core" labels appear on several boards
• SM Data I/F: serialized LVDS, 21-bit data, 16 MHz
• SM Command I/F: 422 differential
• Spacecraft I/F: 1553
• Downlink: X-Band, 80 Mbps, SMQ-11
• Note on figure: requires sequential "block" readout to downlink
LaRCJones Slide 6 p174/ MAPLD 2004
MEM overall function and purpose
• The CM must accept SM LVDS data at up to 256 Mbits/sec (160 Mbits/sec average) and transfer that data to the communications payload at 80 Mbits/sec
• Meeting this throughput requires that the data be compressed (DWNLNK), which results in a non-continuous and variable throughput
– The compression ratio achieved by the DWNLNK varies with the entropy of the current image
– In the worst case, the DWNLNK is unable to match the bandwidth of the incoming data
[Figure: MEM to DWNLNK data path. Recoverable labels:]
• 160 Mbit/sec data from SM, no flow control
• MEM: 2 MByte SRAM-based FIFO buffers incoming data during low-compression periods on DWNLNK
• MEM to DWNLNK data transfer via cPCI bus
• DWNLNK: Rice compression (variable compression rates), CCSDS
• 80 Mbit/sec, fixed, data to S/C
LaRCJones Slide 7 p174/ MAPLD 2004
MEM overall function and purpose, continued
• This mismatch between the SM data rate and the DWNLNK output rate during periods of low image compression requires that the MEM store incoming data from the LVDS interface until the DWNLNK is ready to process and send it
• System engineering studies, using typical values of data entropy across an image, indicate that the MEM must contain at least 1 Mbyte of data storage to avoid overflow and the resulting loss of image data. To provide a 2X margin, a 2 Mbyte buffer is used
• The MEM's 2 Mbytes of SRAM are organized as a First In First Out (FIFO) buffer
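The buffering argument on this slide can be checked with rough arithmetic. The sketch below is a simplified model and not the system engineering study the slide cites. It assumes a constant 160 Mbit/sec input, assumes the DWNLNK drains the MEM at the 80 Mbit/sec downlink rate multiplied by the current compression ratio, and takes 1 Mbyte as 2^20 bytes. The function name and the sample ratios are illustrative only.

```python
MBIT = 1_000_000   # bits per Mbit (decimal, as used for link rates)
MBYTE = 1 << 20    # bytes per Mbyte (binary, assumed for SRAM sizes)


def seconds_until_overflow(buffer_bytes, in_mbps, out_mbps, compression_ratio):
    """Return the time for the FIFO to fill, or None if it never fills.

    Assumption: the DWNLNK consumes uncompressed data at
    out_mbps * compression_ratio, so the FIFO fills at the difference.
    """
    drain_mbps = out_mbps * compression_ratio
    net_fill_mbps = in_mbps - drain_mbps
    if net_fill_mbps <= 0:
        return None
    return buffer_bytes * 8 / (net_fill_mbps * MBIT)


# Sample compression ratios are hypothetical, chosen only to show the trend.
for ratio in (1.0, 1.5, 2.0):
    t = seconds_until_overflow(2 * MBYTE, 160, 80, ratio)
    label = "never fills" if t is None else f"{t * 1000:.0f} ms to fill"
    print(f"compression {ratio}:1 -> {label}")
```

Under these assumptions the buffer only grows while the compression ratio is below 2:1, and the 2 Mbyte FIFO bounds how long such a low-compression period can last before data is lost.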
LaRCJones Slide 8 p174/ MAPLD 2004
Memory Board Design Flow and Tools
• Actel Libero Platinum Toolset
– VHDL design entry using text editor
– Pre-synthesis functional simulation using ModelSim and a highly modified Actel-supplied PCI test bench
– Synthesis using the Synplicity synthesis tool
– Post-synthesis simulation using ModelSim
  • A fully licensed copy of ModelSim (as opposed to the reduced-performance version supplied with Libero) was necessary to complete post-synthesis and post-place & route simulations in reasonable time
– Place & route with Actel's Designer tools
  • Timer static timing analysis tool used to check rough timing and apply timing constraints
– Post-place & route timing simulation with ModelSim
LaRCJones Slide 9 p174/ MAPLD 2004
Functional Block Diagram – Clock Domains
[Figure: MEM functional block diagram with clock domains. Recoverable labels:]
• Blocks: Deserializer (LVDS data in), LVDS FIFO, SRAM Controller, DMA Engine, DMA FIFO, Actel PCI Core, cPCI bus
• Devices: Actel RT54SX32S FPGAs (one marked "2 – bit sliced"); BAE 238A792 SRAM (2), shown twice
• Clock domains: LVDS clock domain, core clock domain (delayed cPCI clock), cPCI clock domain
• Data path widths shown: 16, 32, 64
LaRCJones Slide 10 p174/ MAPLD 2004
GIFTS Memory Board Architecture
– 100 ohm differential LVDS impedance requirement
– CPCI requirements include:
• Bus terminating resistors within 0.5 inches of the cPCI connector
• 65 ohm board impedance for CPCI signals
• Specific trace lengths, which affect FPGA placement relative to the CPCI connector
– The data at the LVDS interface arrives as a serial bit stream. The serialization is transparent to the MEM, however, because the output of the Deserializer is parallel.
– There are 3 clock domains: the incoming LVDS clock, the PCI clock, and a third clock delayed from the PCI clock. The third domain exists because the CPCI spec allows only 1 load on the CPCI clock and no rad-hard (low SEU), low-skew buffer or PLL was found, so the DMA Engine and SRAM controllers run from the delayed clock.
– To keep the design independent of the amount of clock skew, and not reliant on place and route for proper function, this resulted in
• Chip-to-chip asynchronous data transfer using control signals that employ 4-state handshaking
– The "initiator" asserts the "Ready" signal
– The "receiver" asserts the "Acknowledge" signal
– The "initiator" de-asserts the "Ready" signal when done with the transfer
– The "receiver" de-asserts the "Acknowledge" signal, indicating it is done with the transfer and ready for the next
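The four handshake steps listed above can be traced with a small behavioral model. This is a sketch only: the actual design is VHDL spread across several FPGAs, and the class, method, and signal names here are hypothetical. It also omits the synchronizer flip-flops that a real clock-domain crossing would place on the Ready and Acknowledge signals.

```python
class FourPhaseLink:
    """Behavioral model of a Ready/Acknowledge four-phase handshake."""

    def __init__(self):
        self.ready = False      # driven by the initiator
        self.ack = False        # driven by the receiver
        self.data_bus = None    # held stable while Ready is asserted
        self.received = []

    def initiator_step(self, pending):
        """Advance the initiator; `pending` is the list of words left to send."""
        if not self.ready and not self.ack and pending:
            self.data_bus = pending.pop(0)   # drive the data first
            self.ready = True                # step 1: assert Ready
        elif self.ready and self.ack:
            self.ready = False               # step 3: de-assert Ready

    def receiver_step(self):
        """Advance the receiver by one step."""
        if self.ready and not self.ack:
            self.received.append(self.data_bus)  # latch the data
            self.ack = True                      # step 2: assert Acknowledge
        elif not self.ready and self.ack:
            self.ack = False                     # step 4: de-assert Acknowledge


def transfer(words):
    """Send all words across the link; return them with a (ready, ack) trace."""
    link = FourPhaseLink()
    pending = list(words)
    trace = []
    while pending or link.ready or link.ack:
        link.initiator_step(pending)
        trace.append((link.ready, link.ack))
        link.receiver_step()
        trace.append((link.ready, link.ack))
    return link.received, trace


received, trace = transfer([0xA, 0xB])
print(received)
print(trace[:4])
```

Because each side only reacts to the level of the other side's signal, the exchange does not depend on the relative phase of the two clocks, which is the property the slide is after.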
LaRCJones Slide 11 p174/ MAPLD 2004
GIFTS Memory Board Architecture
– The cPCI bus can transmit one 32-bit double word (Dword) on each cycle of the 33 MHz cPCI clock. To maintain 100% bandwidth on the cPCI bus, the PCI core must therefore be supplied with a 32-bit Dword on its input queue every 30.3 ns clock cycle.
– The SRAM devices originally available for the MEM design had a fastest access time of 35 ns. To improve bus bandwidth despite this restriction, two Dwords must be retrieved from the SRAM and passed to the PCI core in a single memory read operation.
– Because data continues to arrive across the LVDS interface during a DMA operation, the SRAM must support a read (to retrieve data for the current DMA operation) and a write (to store the just-received LVDS data) concurrently.
– These two considerations required that the SRAM in the MEM be configured as two 64-bit-wide banks, each 128K words deep (128K x 64), so that simultaneous read and write operations can be performed on opposite banks in a ping-pong fashion.
– The MEM design requires that all DMA data transfers be performed on a 64-bit boundary (i.e., an even number of 32-bit Dwords).
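The numbers on this slide can be worked through directly. The sketch below uses only the figures quoted above (33 MHz, 32-bit bus, 35 ns SRAM access, 128K x 64 banks). It assumes back-to-back SRAM accesses with no overhead, which is optimistic, and the names are illustrative.

```python
PCI_CLOCK_HZ = 33e6       # nominal cPCI clock (30.3 ns period)
PCI_WIDTH_BITS = 32
SRAM_ACCESS_NS = 35.0     # fastest access time of the original SRAM parts
BANK_WORDS = 128 * 1024   # 128K entries per bank
BANK_WIDTH_BITS = 64


def sram_read_mbps(bits_per_access, access_ns=SRAM_ACCESS_NS):
    """Sustained read rate if one access completes every access_ns."""
    return bits_per_access / access_ns * 1e3   # bits per ns -> Mbit/s


pci_peak_mbps = PCI_CLOCK_HZ * PCI_WIDTH_BITS / 1e6
bank_bytes = BANK_WORDS * BANK_WIDTH_BITS // 8

print(f"cPCI peak rate:         {pci_peak_mbps:.0f} Mbit/s")
print(f"SRAM at 32 bits/access: {sram_read_mbps(32):.0f} Mbit/s")
print(f"SRAM at 64 bits/access: {sram_read_mbps(64):.0f} Mbit/s")
print(f"Bank size: {bank_bytes} bytes, two banks: {2 * bank_bytes} bytes")
```

A 32-bit read every 35 ns falls short of the bus's peak rate, while a 64-bit read exceeds it, which is why two Dwords are fetched per access. Two 128K x 64 banks also account for the 2 Mbyte total quoted for the MEM.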
LaRCJones Slide 12 p174/ MAPLD 2004
Functional Block Diagram – Data Flow
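[Figure: data flow block diagram. Recoverable labels:]
• Blocks: LVDS FIFO (LVDS data in), SRAM Controller, PCI Core (cPCI bus), Left SRAM, Right SRAM
• Data path widths shown: 64
• SRAM banks: 128K x 64 each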
LaRCJones Slide 13 p174/ MAPLD 2004
Functional Block Diagram – Data Flow
[Figure: same block diagram labels as Slide 12.]
LaRCJones Slide 14 p174/ MAPLD 2004
Functional Block Diagram – Data Flow
[Figure: same block diagram labels as Slide 12.]
LaRCJones Slide 15 p174/ MAPLD 2004
Functional Block Diagram – Data Flow
[Figure: data flow block diagram annotated with example data words. Recoverable labels:]
• Blocks: LVDS FIFO (LVDS data in), SRAM Controller, PCI Core (cPCI bus), Left SRAM (128K x 64), Right SRAM (128K x 64); 64-bit data paths
• LVDS FIFO: incoming stream … w7, w6, w5, w4, w3, w2, w1, w0, held as w0 w1 w2 w3 and w4 w5 w6 w7, with w8 w9 w10 w11 following
• Right SRAM: w0 w1 w2 w3
• Left SRAM: w4 w5 w6 w7
• PCI Core: [w3 w2], [w1 w0]; then [w7 w6], [w5 w4]; [w3 w2], [w1 w0]
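The word labels in this figure suggest how incoming words are grouped into 64-bit SRAM entries and read back out as pairs of 32-bit Dwords. The sketch below is one possible reading of the figure, not a description of the actual controller. It assumes each w is a 16-bit word, that w0 occupies the low bits of an entry, and that successive entries alternate between the right and left banks. All function names are hypothetical.

```python
def pack_entry(words16):
    """Pack four 16-bit words into one 64-bit entry, w0 in the low bits."""
    assert len(words16) == 4
    entry = 0
    for i, w in enumerate(words16):
        entry |= (w & 0xFFFF) << (16 * i)
    return entry


def store_ping_pong(stream16):
    """Alternate 64-bit entries between two banks (assumed policy).

    Trailing words that do not fill a whole entry are left unstored.
    """
    banks = {"right": [], "left": []}
    order = ("right", "left")
    whole = len(stream16) - len(stream16) % 4
    for n in range(0, whole, 4):
        banks[order[(n // 4) % 2]].append(pack_entry(stream16[n:n + 4]))
    return banks


def entry_to_dwords(entry):
    """Split a 64-bit entry into two 32-bit Dwords, low Dword first."""
    return [entry & 0xFFFFFFFF, entry >> 32]


banks = store_ping_pong(list(range(8)))          # w0..w7 with value i for wi
for name in ("right", "left"):
    dwords = entry_to_dwords(banks[name][0])
    print(name, [f"{d:08x}" for d in dwords])
```

With this packing, the first Dword sent from the right bank carries w1 and w0 and the second carries w3 and w2, matching the [w3 w2], [w1 w0] grouping shown at the PCI Core.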
LaRCJones Slide 16 p174/ MAPLD 2004
GIFTS Memory Board Architecture
– The Actel PCI core IP was designed for a standard memory module, in which all of the data to be DMAed is available up front (otherwise the core is starved for data). That is not the case in this design, since at any given time the IC does not know exactly how much data is available for DMA transfer. It was therefore necessary to design a "DMA engine" into the MEM that keeps track of how much data has been received by the MEM and stored in the SRAM, and that configures and initiates the individual DMA operations in the cPCI core until the requested total amount of data has been transferred to the destination across the cPCI bus.
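The bookkeeping described for the DMA engine can be sketched as a loop that only starts a transfer when enough data is already in the SRAM. This is an illustrative model, not the VHDL implementation. The function name, the `max_burst_bytes` limit, and the polling-style `arrivals` input are all assumptions; the 8-byte alignment reflects the 64-bit boundary rule stated earlier in the deck.

```python
def run_dma_engine(requested_bytes, arrivals, max_burst_bytes=256):
    """Issue individual DMA operations until requested_bytes have been moved.

    arrivals: byte counts written into SRAM at each polling step, standing
    in for LVDS data arriving over time. requested_bytes should be a
    multiple of 8. Returns the list of burst sizes that were issued.
    """
    available = 0      # bytes stored in SRAM and not yet transferred
    transferred = 0
    bursts = []
    for arrived in arrivals:
        available += arrived
        while transferred < requested_bytes:
            burst = min(available, max_burst_bytes,
                        requested_bytes - transferred)
            burst -= burst % 8      # keep each transfer on a 64-bit boundary
            if burst == 0:
                break               # wait for more data; never starve the core
            bursts.append(burst)    # stands in for configuring and starting the core
            available -= burst
            transferred += burst
        if transferred >= requested_bytes:
            break
    return bursts


print(run_dma_engine(64, [20, 20, 40], max_burst_bytes=32))
```

The point of the design choice is that the PCI core is only ever handed a transfer whose data is already present, so the IC can request a total amount without knowing how much has arrived so far.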
LaRCJones Slide 17 p174/ MAPLD 2004
GIFTS Memory Board Architecture
• Bit-slicing SRAM Controller functionality
– 64-bit data paths are a result of memory speed and throughput
– However, they cannot be implemented in one RT54SXS due to I/O limitations
– This results in the SRAM Controller core being bit-sliced across 32-bit boundaries
LaRCJones Slide 18 p174/ MAPLD 2004
Bit-Sliced FPGA Functional Block Diagram
[Figure: bit-sliced data flow block diagram. Recoverable labels:]
• Blocks: LVDS FIFO, Left SRAM Controller, Right SRAM Controller, PCI Core, Left SRAM (128K x 64), Right SRAM (128K x 64)
• Data path widths shown: 64 and 32
LaRCJones Slide 19 p174/ MAPLD 2004
GIFTS Memory Board Architecture
• Requires 4 Actel RT54SX32S FPGAs
– SRAM Controller core bit-sliced into 2 FPGAs
– Based on total cell utilization:
  – PCI Core FPGA: 69%
  – Left SRAM Controller: 67%
  – Right SRAM Controller: 53%
  – LVDS FIFO: 72%
– PCI Core and LVDS FIFO each implemented on a separate FPGA, resulting in 4 FPGAs
LaRCJones Slide 20 p174/ MAPLD 2004
Block Diagram
[Figure: full MEM block diagram. Recoverable labels:]
• Blocks: Deserializer (LVDS data in), LVDS FIFO, Left SRAM/DMA Controller, Right SRAM Controller, DMA FIFO, Actel PCI Core, cPCI bus
• Devices: four Actel RT54SX32S FPGAs; BAE 238A792 SRAM (2), shown twice
• Clock domains: LVDS clock domain, core clock domain (delayed cPCI clock), cPCI clock domain
• Data path widths shown: 16, 32, 64
LaRCJones Slide 21 p174/ MAPLD 2004
MEM Design Status
• Initial prototype design completed
– Limitations imposed by available FPGA size, speed, and I/O pins overcome
– Limitations imposed by Actel-supplied PCI Core overcome
– Limitations imposed by available SRAM speed overcome
– Complexity of design resulting from 3 asynchronous clock domains overcome
• Initial brassboard completed and tested
– Demonstrated fundamental design concept
– Successfully completed reliable DMA of data from LVDS stimulator to PCI-resident memory
– Uncovered several significant bugs and needed additional functionality
– Successfully redesigned and simulated at the behavioral and post-synthesis level
– Revised version of the design will work with existing PCB without redesign
LaRCJones Slide 22 p174/ MAPLD 2004
MEM Rev. 2 Design to-be-completed
• Complete timing layout and test in post-layout simulation with back-annotated timing
• Test on the brassboard
• Complete documentation of the design
LaRCJones Slide 24 p174/ MAPLD 2004
Brassboard Layout
• Layout details
– CPCI requirements met for terminating resistor placement and trace lengths to the PCI FPGA
– CPCI 65 ohm impedance requirement met on all signal layers, facilitated by stackup design and the trace width and separation relationship
– 30 mils separation between asynchronous CPCI signals and others
– 100 ohm differential LVDS impedance requirement met on all signal layers, facilitated by stackup design, the trace width and separation relationship, and a custom thru-hole arrangement for the Micro_Twinax cable connection
– Differential LVDS signals surrounded by ground guard plane
  • Spaced 20 mils (2 x distance between pair) away from pairs
– Clocks are routed internal to the board, between planes, and have ground guard traces on both sides
– Edges of board will be milled to allow the 90 mil board to slide into 62 mil guides
– Layout approximately 99% complete
LaRCJones Slide 25 p174/ MAPLD 2004
Brassboard Layout
[Figure: Memory Board Preliminary Sketch. Recoverable labels:]
• Components: cPCI connectors, regulator, Deserializer, PCI_CORE FPGA, LVDS_FIFO_FPGA, LEFT_SRAM_CTRL FPGA, RIGHT_SRAM_CTRL FPGA
• SRAM SIMM sockets and board areas: U5, U6, U7, U8
• Front panel: panel mount twin BNC cable connections (".3overbd"); non-grounded front panel 6-32 mounting holes
• Keepouts and clearances: .5 inch component keepout for wedgelock channels; 2.25 inches minimum (marked in several places); ~2"
• Board dimensions shown: 160 mm, 233.35 mm, max 275 mm
• Note on figure: SEE ATTACHED DETAIL