George Mason University FPGA Devices & FPGA Design Flow ECE 448 Lecture 7.
Required reading
• P. Chu, RTL Hardware Design using VHDL
Chapter 1, Introduction to Digital System Design
• Spartan-6 FPGA CLB, User Guide
CLB Overview, Slice Description
Two competing implementation approaches
ASIC (Application Specific Integrated Circuit)
• designed all the way from behavioral description to physical layout
• designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field Programmable Gate Array)
• no physical layout design; design ends with a bitstream used to configure a device
• bought off the shelf and reconfigured by designers themselves
Which Way to Go?
ASICs
• High performance
• Low power
• Low cost in high volumes
FPGAs
• Off-the-shelf
• Low development cost
• Short time to market
• Reconfigurability
What is an FPGA?
[Figure: FPGA layout with configurable logic blocks, I/O blocks, and block RAMs]
Modern FPGA
[Figure: modern FPGA with RAM blocks, multipliers/DSP units, and logic resources; a device is characterized by (#Logic resources, #Multipliers/DSP units, #RAM_blocks)]
Graphics based on The Design Warrior’s Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043
Copyright © 2004 Mentor Graphics Corp. (www.mentor.com)
Major FPGA Vendors
SRAM-based FPGAs
• Xilinx, Inc. (~ 51% of the market)
• Altera Corp. (~ 34% of the market)
• Lattice Semiconductor
• Atmel
Xilinx and Altera together: ~ 85% of the market
Flash & antifuse FPGAs
• Actel Corp. (Microsemi SoC Products Group)
• QuickLogic Corp.
Xilinx
• Primary products: FPGAs and the associated CAD software
  • Programmable Logic Devices
  • ISE Alliance and Foundation Series Design Software
• Main headquarters in San Jose, CA
• Fabless* Semiconductor and Software Company
  • UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
  • Seiko Epson (Japan)
  • TSMC (Taiwan)
  • Samsung (Korea)
Xilinx FPGA Families
Technology   Low-cost                  High-performance
220 nm       -                         Virtex
180 nm       Spartan II, Spartan IIE   -
120/150 nm   -                         Virtex II, Virtex II Pro
90 nm        Spartan 3                 Virtex 4
65 nm        -                         Virtex 5
45 nm        Spartan 6                 -
40 nm        -                         Virtex 6
28 nm        Artix 7                   Virtex 7
Altera FPGA Families
Technology Low-cost Mid-range High-performance
130 nm Cyclone Stratix
90 nm Cyclone II Stratix II
65 nm Cyclone III Arria I Stratix III
40 nm Cyclone IV Arria II Stratix IV
28 nm Cyclone V Arria V Stratix V
FPGA Family
Spartan 6 FPGA Family
CLB Structure
General structure of an FPGA
[Figure: array of programmable logic blocks embedded in programmable interconnect]
Xilinx Spartan 6 CLB
Row & Column Relationship Between CLBs & Slices
Three Different Types of Slices
SLICEX: 50%, SLICEL: 25%, SLICEM: 25%
SLICEX
SLICEL
Xilinx Multipurpose LUT (MLUT)
[Figure: a 4-input LUT can also serve as a 16 x 1 RAM or a 16-bit shift register; the larger LUT can serve as a 64 x 1 ROM (logic), a 64 x 1 RAM, or a 32-bit shift register]
4-input LUT (Look-Up Table) in the Basic ROM Mode
• Look-Up tables are primary elements for logic implementation
• Each LUT can implement any function of 4 inputs
[Figure: a LUT with inputs x1, x2, x3, x4 and output y, configured by its truth table. Two example configurations are shown.]
x1 x2 x3 x4 | y (example 1) | y (example 2)
0  0  0  0  | 0             | 1
0  0  0  1  | 1             | 1
0  0  1  0  | 0             | 1
0  0  1  1  | 0             | 1
0  1  0  0  | 0             | 1
0  1  0  1  | 1             | 1
0  1  1  0  | 0             | 1
0  1  1  1  | 1             | 1
1  0  0  0  | 0             | 1
1  0  0  1  | 1             | 1
1  0  1  0  | 0             | 1
1  0  1  1  | 0             | 1
1  1  0  0  | 1             | 0
1  1  0  1  | 1             | 0
1  1  1  0  | 0             | 0
1  1  1  1  | 0             | 0
Example 2 depends only on x1 and x2, so a function of fewer than 4 inputs also fits in a single LUT.
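A LUT in ROM mode can be sketched in VHDL as a constant indexed by the inputs. This is only an illustration: the entity name lut4_rom and the packing of x1..x4 into one vector are assumptions, and the constant holds the y column 0100010101001100 from the first truth table.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity lut4_rom is  -- hypothetical name, for illustration only
  port (
    x : in  std_logic_vector(3 downto 0);  -- x(3) = x1 (MSB) ... x(0) = x4 (LSB)
    y : out std_logic
  );
end entity lut4_rom;

architecture dataflow of lut4_rom is
  -- Truth-table column y, indexed 0 to 15 by the value of x1 x2 x3 x4
  constant INIT : std_logic_vector(0 to 15) := "0100010101001100";
begin
  -- The inputs act as a ROM address; the stored bit is the function value
  y <= INIT(to_integer(unsigned(x)));
end architecture dataflow;
```

Changing only the constant changes the implemented function, which is why one LUT can realize any function of its inputs.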
6-Input LUT of Spartan 6
Reset and Set Configurations
• No set or reset
• Synchronous set
• Synchronous reset
• Asynchronous set (preset)
• Asynchronous reset (clear)
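The difference between the synchronous and asynchronous variants shows up in how the flip-flop process is written. The sketch below covers only the two reset cases; the entity and port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity ff_reset_styles is  -- hypothetical name, for illustration only
  port (
    clk, rst, d : in  std_logic;
    q_sync      : out std_logic;  -- flip-flop with synchronous reset
    q_async     : out std_logic   -- flip-flop with asynchronous reset (clear)
  );
end entity ff_reset_styles;

architecture rtl of ff_reset_styles is
begin
  -- Synchronous reset: rst is examined only on the rising clock edge
  sync_ff : process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        q_sync <= '0';
      else
        q_sync <= d;
      end if;
    end if;
  end process sync_ff;

  -- Asynchronous reset: rst is in the sensitivity list and overrides the clock
  async_ff : process (clk, rst)
  begin
    if rst = '1' then
      q_async <= '0';
    elsif rising_edge(clk) then
      q_async <= d;
    end if;
  end process async_ff;
end architecture rtl;
```

The set variants follow the same two patterns with '1' assigned instead of '0'.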
MLUT as a 32-bit Shift Register (SRL32)
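A minimal sketch of a 32-bit shift register written in a style that synthesis tools can typically map onto an LUT-based shift register: no reset, and only the last stage is read. Whether a given tool actually uses an SRL32 depends on the tool and its settings; the names below are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity shift_reg32 is  -- hypothetical name, for illustration only
  port (
    clk, en, din : in  std_logic;
    dout         : out std_logic
  );
end entity shift_reg32;

architecture rtl of shift_reg32 is
  signal sr : std_logic_vector(31 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        sr <= sr(30 downto 0) & din;  -- shift by one; the new bit enters at index 0
      end if;
    end if;
  end process;

  dout <= sr(31);  -- oldest of the 32 stored bits
end architecture rtl;
```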
Fast Carry Logic
• Each CLB contains separate logic and routing for the fast generation of sum & carry signals
  • Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
• Carry logic is independent of normal logic and routing resources
[Figure: carry logic and routing chain running from LSB to MSB]
Accessing Carry Logic
• All major synthesis tools can infer carry logic for arithmetic functions
  • Addition (SUM <= A + B)
  • Subtraction (DIFF <= A - B)
  • Comparators (if A < B then…)
  • Counters (count <= count + 1)
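The four arithmetic forms listed above can be collected in one small design. This is a sketch with hypothetical names and an assumed 8-bit width; it relies on the standard numeric_std package rather than on any vendor primitive, leaving the choice of carry logic to the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity carry_examples is  -- hypothetical name, for illustration only
  port (
    clk, rst : in  std_logic;
    a, b     : in  unsigned(7 downto 0);
    sum      : out unsigned(7 downto 0);
    diff     : out unsigned(7 downto 0);
    a_lt_b   : out std_logic;
    count    : out unsigned(7 downto 0)
  );
end entity carry_examples;

architecture rtl of carry_examples is
  signal count_r : unsigned(7 downto 0) := (others => '0');
begin
  sum    <= a + b;                     -- addition
  diff   <= a - b;                     -- subtraction
  a_lt_b <= '1' when a < b else '0';   -- comparator

  -- counter with synchronous reset
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_r <= (others => '0');
      else
        count_r <= count_r + 1;
      end if;
    end if;
  end process;

  count <= count_r;
end architecture rtl;
```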
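Full-adder
[Figure: full-adder (FA) block with inputs x, y, cin and outputs cout, s]
$x + y + c_{in} = (c_{out}\, s)_2 = 2 c_{out} + s$
x y cin | cout s
0 0 0   | 0    0
0 0 1   | 0    1
0 1 0   | 0    1
0 1 1   | 1    0
1 0 0   | 0    1
1 0 1   | 1    0
1 1 0   | 1    0
1 1 1   | 1    1
Carry-out as a function of x and y:
x y | COUT
0 0 | y
0 1 | CIN
1 0 | CIN
1 1 | y
Propagate = $x \oplus y$
Generate = $y$
Sum = Propagate $\oplus$ CIN = $x \oplus y \oplus$ CIN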
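The propagate/generate form of the full adder can be written directly in VHDL. The sketch below uses hypothetical names (p and g stand for Propagate and Generate, since generate is a reserved word in VHDL) and follows the COUT table: the carry-out is CIN when x and y differ, and y otherwise.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity full_adder_pg is  -- hypothetical name, for illustration only
  port (
    x, y, cin : in  std_logic;
    s, cout   : out std_logic
  );
end entity full_adder_pg;

architecture dataflow of full_adder_pg is
  signal p, g : std_logic;  -- Propagate and Generate
begin
  p    <= x xor y;                   -- Propagate = x xor y
  g    <= y;                         -- Generate  = y
  s    <= p xor cin;                 -- Sum = Propagate xor CIN
  cout <= cin when p = '1' else g;   -- 2-to-1 multiplexer selected by Propagate
end architecture dataflow;
```

In practice one would simply write the addition and let the tool infer this structure; the explicit form is shown only to mirror the equations.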
Carry & Control Logic in Xilinx FPGAs
Carry & Control Logic in Spartan 6 FPGAs
[Figure: carry and control logic with inputs x and y, split between the LUT and hardwired (fast) logic]
Examples:
Determine the amount of Spartan 6 resources needed to implement a given circuit
Circuit 1: Top level
[Figure: registers R0–R15 and block F; labeled signals w, a, b, c, d, y, m, clk, run]
Circuit 1: F – function
[Figure: blocks include a full adder (x, y, cin, cout, s), a <<<3 block, and a 2-to-4 decoder with enable (w1, w0, En, y3–y0); labeled signals a–h and output y]
Circuit 2: Top level
[Figure: registers R0–R15 and block F; labeled signals z, a, b, c, d, e, y, clk, run]
Circuit 2: F – function
[Figure: blocks include a half adder (x, y, cout, s), a >>2 block, and a priority encoder (w3–w0, y1, y0, z); labeled signals a–i and output y]
Circuit 3: Top level
Input/Output Blocks (IOBs)
Basic I/O Block Structure
[Figure: IOB with three flip-flops (pins D, EC, Q, SR, each with an FF enable) on the three-state control path, the output path, and the input path; labeled signals: three-state, output, clock, set/reset, direct input, registered input]
IOB Functionality
• IOB provides interface between the package pins and CLBs
• Each IOB can work as uni- or bi-directional I/O
• Outputs can be forced into High Impedance
• Inputs and outputs can be registered
  • advised for high-performance I/O
• Inputs can be delayed
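A bi-directional pin with a high-impedance state and registered paths can be described as follows. This is a sketch with hypothetical names; whether the registers end up inside the IOB depends on the tool and its options.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bidir_pin is  -- hypothetical name, for illustration only
  port (
    clk      : in    std_logic;
    oe       : in    std_logic;   -- '1' drives the pin, '0' releases it
    data_out : in    std_logic;
    data_in  : out   std_logic;
    pin      : inout std_logic
  );
end entity bidir_pin;

architecture rtl of bidir_pin is
  signal out_r, oe_r : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      out_r   <= data_out;  -- registered output
      oe_r    <= oe;        -- registered three-state control
      data_in <= pin;       -- registered input
    end if;
  end process;

  -- Output forced into high impedance when not enabled
  pin <= out_r when oe_r = '1' else 'Z';
end architecture rtl;
```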
Clock Management
A simple clock tree
[Figure: a clock signal from the outside world enters through a special clock pin and pad and is distributed to the flip-flops by a clock tree]
Clock Manager
[Figure: a clock signal from the outside world enters through a special clock pin and pad; the clock manager produces daughter clocks used to drive internal clock trees or output pins]
Jitter
[Figure: ideal clock signal compared with a real clock signal with jitter; cycles 1 to 4 shown superimposed]
Removing Jitter
[Figure: the clock manager takes a clock signal with jitter from the outside world and produces “clean” daughter clocks used to drive internal clock trees or output pins]
Frequency Synthesis
[Figure: daughter clocks at 1.0 x, 2.0 x, and 0.5 x the original clock frequency]
Phase shifting
[Figure 4-20: daughter clocks phase shifted by 0°, 90°, 180°, and 270°]
Clock Management Tiles
• DCM – Digital Clock Manager
• PLL – Phase Locked Loop
Spartan-6 Family Attributes
Spartan-6 FPGA Family Members
FPGA device present on the Digilent Nexys 3 board
XC6SLX16-CSG324C
• XC6S: Spartan 6 family
• LX: Logic Optimized
• 16: Size
• CSG: Package type (Ball Chip-Scale)
• 324: 324 pins
• C: Commercial temperature range, 0° C – 85° C
FPGA Design Flow
FPGA Design process (1)
Specification / Pseudocode
  Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in experiment 5, this time your unit has to be able to perform an encryption algorithm by itself, executing 32 rounds…..
On-paper hardware design (Block diagram & ASM chart)
VHDL description (Your Source Files)
  library ieee;
  use ieee.std_logic_1164.all;
  use ieee.std_logic_unsigned.all;

  entity RC5_core is
    port(
      clock, reset, encr_decr : in  std_logic;
      data_input  : in  std_logic_vector(31 downto 0);
      data_output : out std_logic_vector(31 downto 0);
      out_full    : in  std_logic;
      key_input   : in  std_logic_vector(31 downto 0);
      key_read    : out std_logic
    );
  end RC5_core;
Functional simulation
Synthesis
Post-synthesis simulation
FPGA Design process (2)
Implementation
Configuration
Timing simulation
On chip testing
Tools used in FPGA Design Flow
Design → functionally verified VHDL code
Synthesis (Synplify Premier or Xilinx XST) → Netlist
Implementation (Xilinx ISE) → Bitstream
Synthesis
Synthesis Tools
• Synplify Premier
• Xilinx XST
• … and others
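Logic Synthesis
VHDL description → Circuit netlist
  library ieee;
  use ieee.std_logic_1164.all;

  -- Entity reconstructed from the signals used by the architecture
  entity MLU is
    port(
      NEG_A, NEG_B, NEG_Y : in  STD_LOGIC;
      A, B, L1, L0        : in  STD_LOGIC;
      Y                   : out STD_LOGIC
    );
  end MLU;

  architecture MLU_DATAFLOW of MLU is
    signal A1 : STD_LOGIC;
    signal B1 : STD_LOGIC;
    signal Y1 : STD_LOGIC;
    signal MUX_0, MUX_1, MUX_2, MUX_3 : STD_LOGIC;
    -- Named selector; a bare (L1 & L0) in "with ... select" has an ambiguous type in some tools
    signal SEL : STD_LOGIC_VECTOR(1 downto 0);
  begin
    -- Optional inversion of inputs and output
    A1 <= A when (NEG_A = '0') else not A;
    B1 <= B when (NEG_B = '0') else not B;
    Y  <= Y1 when (NEG_Y = '0') else not Y1;

    -- Four candidate logic operations
    MUX_0 <= A1 and B1;
    MUX_1 <= A1 or B1;
    MUX_2 <= A1 xor B1;
    MUX_3 <= A1 xnor B1;

    -- 4-to-1 multiplexer selecting the operation
    SEL <= L1 & L0;
    with SEL select
      Y1 <= MUX_0 when "00",
            MUX_1 when "01",
            MUX_2 when "10",
            MUX_3 when others;
  end MLU_DATAFLOW;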
Circuit netlist (RTL view)
Mapping
[Figure: netlist logic mapped onto LUT0–LUT5 and flip-flops FF1, FF2]
Xilinx XST Inputs/Outputs
Xilinx XST Inputs
• RTL VHDL and/or Verilog files
• Constraints – XCF
  Xilinx constraints file in which you can specify synthesis, timing, and specific implementation constraints that can be propagated to the NGC file.
• Core files
  These files can be in either NGC or EDIF format. XST does not modify cores. It uses them to inform area and timing optimization surrounding the cores.
Xilinx XST Outputs
• NGC
  Netlist file with constraint information
• NGR
  This is a schematic representation of the pre-optimized design shown at the Register Transfer Level (RTL). This representation is in terms of generic symbols, such as adders, multipliers, counters, AND gates, and OR gates, and is generated after the HDL synthesis phase of the synthesis process.
• LOG
  This report contains the results from the synthesis run, including area and timing estimation.
RTL view in Synplify Premier
• General logic structures can be recognized in RTL view
[Figure: RTL view with an incrementer, a comparator, and a MUX marked]
Crossprobing between RTL view and code
• Each port, net or block can be chosen by mouse click from the browser or directly from the RTL View
• Double-clicking on an element shows its source code
• Reverse crossprobing is also possible: if a section of code is marked, the corresponding element of the RTL View is marked too
Technology View in Synplify Pro
• Technology view is a mapped RTL view. It can be opened by pressing its toolbar button or by double-clicking the “.srm” file
• As in the case of “RTL View”, the same buttons can be used here
• Two additional buttons are enabled: show critical path, open timing analyst
• Technology view is presented using device primitives
• A browser lists ports, nets and blocks
• Pay attention: technology view is usually large and presented on a number of sheets
Viewing critical path
• Critical path can be viewed by pressing the show critical path button
• Delay values are written near each component of the path
Implementation
• After synthesis, the entire implementation process is performed by FPGA vendor tools
Translation
[Figure: the circuit netlist from synthesis and the timing constraints from the User Constraint File (UCF, written with the Constraint Editor or a text editor) are translated into an NGD (Native Generic Database) file]
Mapping
[Figure: netlist logic mapped onto LUT0–LUT5 and flip-flops FF1, FF2]
Placing
[Figure: mapped logic placed into CLB slices of the FPGA]
Routing
[Figure: placed logic connected through the programmable connections of the FPGA]
Configuration
• Once a design is implemented, you must create a file that the FPGA can understand
• This file is called a bitstream: a BIT file (.bit extension)
• The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information
Two main stages of the FPGA Design Flow
Synthesis
  RTL Synthesis (technology independent)
  - Code analysis
  - Derivation of main logic constructions
  - Technology independent optimization
  - Creation of “RTL View”
  Map (technology dependent)
  - Mapping of extracted logic structures to device primitives
  - Technology dependent optimization
  - Application of “synthesis constraints”
  - Netlist generation
  - Creation of “Technology View”
Implementation
  Place & Route
  - Placement of generated netlist onto the device
  - Choosing best interconnect structure for the placed design
  - Application of “physical constraints”
  Configure
  - Bitstream generation
  - Burning device
Synthesis Report Example – Resource Utilization (1)
Device utilization summary:
---------------------------
Selected Device : 6slx4tqg144-3
Slice Logic Utilization:
Number of Slice Registers: 53 out of 4800 1%
Number of Slice LUTs: 163 out of 2400 6%
Number used as Logic: 163 out of 2400 6%
Slice Logic Distribution:
Number of LUT Flip Flop pairs used: 198
Number with an unused Flip Flop: 145 out of 198 73%
Number with an unused LUT: 35 out of 198 17%
Number of fully used LUT-FF pairs: 18 out of 198 9%
Number of unique control sets: 7
Synthesis Report Example – Resource Utilization (2)
IO Utilization:
Number of IOs: 43
Number of bonded IOBs: 43 out of 102 42%
Specific Feature Utilization:
Number of BUFG/BUFGCTRLs: 1 out of 16 6%
Number of DSP48A1s: 5 out of 8 62%
Synthesis Report Example – Timing
Timing Summary:
---------------
Speed Grade: -3
Minimum period: 6.031ns (Maximum Frequency: 165.817MHz)
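The maximum frequency in the report is the reciprocal of the minimum period. A quick check using the rounded period printed above:

```latex
\[
f_{\max} = \frac{1}{T_{\min}} = \frac{1}{6.031\ \text{ns}} \approx 165.81\ \text{MHz}
\]
```

This agrees with the reported 165.817 MHz to four significant digits; the small difference presumably comes from the tool working with an unrounded period.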
Map Report Example – Resource Utilization (1)
Design Summary
--------------
Slice Logic Utilization:
Number of Slice Registers: 54 out of 4,800 1%
Number used as Flip Flops: 53
Number used as Latches: 0
Number used as Latch-thrus: 0
Number used as AND/OR logics: 1
Number of Slice LUTs: 149 out of 2,400 6%
Number used as logic: 148 out of 2,400 6%
Number using O6 output only: 133
Number using O5 output only: 0
Number using O5 and O6: 15
Number used as ROM: 0
Number used as Memory: 0 out of 1,200 0%
Number used exclusively as route-thrus: 1
Map Report Example – Resource Utilization (2)
Slice Logic Distribution:
Number of occupied Slices: 58 out of 600 9%
Number of MUXCYs used: 32 out of 1,200 2%
Number of LUT Flip Flop pairs used: 162
Number with an unused Flip Flop: 109 out of 162 67%
Number with an unused LUT: 13 out of 162 8%
Number of fully used LUT-FF pairs: 40 out of 162 24%
Number of unique control sets: 7
Number of slice register sites lost
to control set restrictions: 35 out of 4,800 1%
IO Utilization:
Number of bonded IOBs: 43 out of 102 42%
Map Report Example – Resource Utilization (3)
Specific Feature Utilization:
Number of RAMB16BWERs: 0 out of 12 0%
Number of RAMB8BWERs: 0 out of 24 0%
…….
Number of DSP48A1s: 5 out of 8 62%
…….
Post-PAR Static Timing Report
Clock to Setup on destination clock clk_i
---------------+---------+---------+---------+---------+
               | Src:Rise| Src:Fall| Src:Rise| Src:Fall|
Source Clock   |Dest:Rise|Dest:Rise|Dest:Fall|Dest:Fall|
---------------+---------+---------+---------+---------+
clk_i          |    7.530|         |         |         |
---------------+---------+---------+---------+---------+
PAR Report
----------------------------------------------------------------------------------------------------------
  Constraint                                |    Check    | Worst Case |  Best Case | Timing |   Timing
                                            |             |    Slack   | Achievable | Errors |    Score
----------------------------------------------------------------------------------------------------------
  Autotimespec constraint for clock net clk | SETUP       |         N/A|     7.530ns|     N/A|           0
  _i_BUFGP                                  | HOLD        |     0.457ns|            |       0|           0
----------------------------------------------------------------------------------------------------------
Timing Report (1)
Timing constraint: Default period analysis for net "clk_i_BUFGP"
3354 paths analyzed, 309 endpoints analyzed, 0 failing endpoints
0 timing errors detected. (0 setup errors, 0 hold errors)
Minimum period is 7.530ns.
--------------------------------------------------------------------------------
Delay (setup path): 7.530ns (data path - clock path skew + uncertainty)
Source: a_register/q_o_4 (FF)
Destination: x_reg_inst/q_o_3 (FF)
Data Path Delay: 7.453ns (Levels of Logic = 2)
Clock Path Skew: -0.042ns (0.513 - 0.555)
Source Clock: clk_i_BUFGP rising
Destination Clock: clk_i_BUFGP rising
Clock Uncertainty: 0.035ns
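The reported setup-path delay can be checked against the formula in the report (data path minus clock path skew plus uncertainty) using the values listed above:

```latex
\[
T_{\text{setup path}} = 7.453\ \text{ns} - (-0.042\ \text{ns}) + 0.035\ \text{ns} = 7.530\ \text{ns}
\]
```

Because the clock path skew is negative, subtracting it lengthens the path, and the result matches the reported minimum period of 7.530 ns.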
86
Timing Report (2)
Maximum Data Path at Slow Process Corner: a_register/q_o_4 to x_reg_inst/q_o_3
    Location             Delay type         Delay(ns)  Physical Resource
                                                       Logical Resource(s)
    -------------------------------------------------  -------------------
    SLICE_X4Y36.AQ       Tcko                  0.447   a_register/q_o<4>
                                                       a_register/q_o_4
    DSP48_X0Y3.B4        net (fanout=21)       1.194   a_register/q_o<4>
    DSP48_X0Y3.M3        Tdspdo_B_M            3.364   Mmult_mult_unsigned
                                                       Mmult_mult_unsigned
    SLICE_X8Y39.C4       net (fanout=1)        2.050   mult_unsigned<3>
    SLICE_X8Y39.CLK      Tas                   0.398   x_reg_inst/q_o<3>
                                                       Mmux_x_57
                                                       Mmux_x_4_f7_2
                                                       Mmux_x_2_f8_2
                                                       x_reg_inst/q_o_3
    -------------------------------------------------  --------------------
    Total                                      7.453ns (4.209ns logic, 3.244ns route)
                                                       (56.5% logic, 43.5% route)
Timing Report (3)
--------------------------------------------------------------------------------
Delay (setup path): 7.484ns (data path - clock path skew + uncertainty)
Source: a_register/q_o_7_1 (FF)
Destination: x_reg_inst/q_o_3 (FF)
Data Path Delay: 7.391ns (Levels of Logic = 2)
Clock Path Skew: -0.058ns (0.513 - 0.571)
Source Clock: clk_i_BUFGP rising
Destination Clock: clk_i_BUFGP rising
Clock Uncertainty: 0.035ns
Clock Uncertainty: 0.035ns ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE
Total System Jitter (TSJ): 0.070ns
Total Input Jitter (TIJ): 0.000ns
Discrete Jitter (DJ): 0.000ns
Phase Error (PE): 0.000ns
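Substituting the jitter values listed above into the clock uncertainty formula from the report reproduces the reported figure:

```latex
\[
\text{Clock Uncertainty}
= \frac{\sqrt{TSJ^2 + TIJ^2} + DJ}{2} + PE
= \frac{\sqrt{(0.070)^2 + (0.000)^2} + 0.000}{2} + 0.000
= 0.035\ \text{ns}
\]
```

Likewise, for this second path the setup delay is 7.391 ns − (−0.058 ns) + 0.035 ns = 7.484 ns, as reported.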
Timing Report (4)
Maximum Data Path at Slow Process Corner: a_register/q_o_7_1 to x_reg_inst/q_o_3
    Location             Delay type         Delay(ns)  Physical Resource
                                                       Logical Resource(s)
    -------------------------------------------------  -------------------
    SLICE_X2Y33.AQ       Tcko                  0.447   a_register/q_o_7_2
                                                       a_register/q_o_7_1
    DSP48_X0Y3.B7        net (fanout=13)       1.132   a_register/q_o_7_1
    DSP48_X0Y3.M3        Tdspdo_B_M            3.364   Mmult_mult_unsigned
                                                       Mmult_mult_unsigned
    SLICE_X8Y39.C4       net (fanout=1)        2.050   mult_unsigned<3>
    SLICE_X8Y39.CLK      Tas                   0.398   x_reg_inst/q_o<3>
                                                       Mmux_x_57
                                                       Mmux_x_4_f7_2
                                                       Mmux_x_2_f8_2
                                                       x_reg_inst/q_o_3
    -------------------------------------------------  --------------------
    Total                                      7.391ns (4.209ns logic, 3.182ns route)
                                                       (56.9% logic, 43.1% route)
Xilinx FPGA Memories
Recommended reading
• Spartan-6 FPGA Block RAM Resources: User Guide (Google search: UG383)
• Spartan-6 FPGA Configurable Logic Block: User Guide (Google search: UG384)
• Xilinx FPGA Embedded Memory Advantages: White Paper (Google search: WP360)
• ISE In-Depth Tutorial, Section: Creating a CORE Generator Tool Module (Google search: ISE In-Depth Tutorial)
Memory Types
• Memory: RAM or ROM
• Memory: Single port or Dual port
• Memory: With asynchronous read or With synchronous read
Memory Types specific to Xilinx FPGAs
• Memory: Distributed (MLUT-based) or Block RAM-based (BRAM-based)
• Memory: Inferred or Instantiated
  • Instantiated: Manually or Using CORE Generator
FPGA Distributed Memory
Location of Distributed RAM
[Figure: modern FPGA with RAM blocks, DSP units, and logic resources; distributed RAM is located in the logic resources (CLB slices)]
Three Different Types of Slices
50% 25% 25%
Spartan-6 Multipurpose LUT (MLUT)
[Figure: the MLUT can serve as a 64 x 1 ROM (logic), a 64 x 1 RAM, or a 32-bit shift register]
Single-port 64 x 1-bit RAM
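A single-port 64 x 1-bit RAM with synchronous write and asynchronous read can be described behaviorally, leaving it to the synthesis tool to infer distributed memory. This is a sketch with hypothetical names; whether the tool maps it onto an MLUT depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram64x1s is  -- hypothetical name, for illustration only
  port (
    clk  : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(5 downto 0);  -- 6 address bits for 64 locations
    d    : in  std_logic;
    o    : out std_logic
  );
end entity ram64x1s;

architecture rtl of ram64x1s is
  type ram_t is array (0 to 63) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  -- Synchronous write
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(unsigned(addr))) <= d;
      end if;
    end if;
  end process;

  -- Asynchronous read: the output follows the address without waiting for a clock edge
  o <= ram(to_integer(unsigned(addr)));
end architecture rtl;
```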
Memories Built of Neighboring MLUTs
Memories built of 2 MLUTs:
• Single-port 128 x 1-bit RAM: RAM128x1S
• Dual-port 64 x 1-bit RAM: RAM64x1D
Memories built of 4 MLUTs:
• Single-port 256 x 1-bit RAM: RAM256x1S
• Dual-port 128 x 1-bit RAM: RAM128x1D
• Quad-port 64 x 1-bit RAM: RAM64x1Q
• Simple-dual-port 64 x 3-bit RAM: RAM64x3SDP (one address for read, one address for write)
Dual-port 64 x 1 RAM
• Dual-port 64 x 1-bit RAM: 64x1D
• Single-port 128 x 1-bit RAM: 128x1S
Total Size of Distributed RAM
FPGA Block RAM
Location of Block RAMs
[Figure: modern FPGA with RAM blocks, DSP units, and logic resources (CLB slices); block RAMs are the dedicated RAM blocks]
Spartan-6 Block RAM Amounts
Block RAM can have various configurations (port aspect ratios)
Configuration    Address range   Data width
16k x 1          0 – 16,383      1
8k x 2           0 – 8,191       2
4k x 4           0 – 4,095       4
2k x (8+1)       0 – 2,047       8+1
1024 x (16+2)    0 – 1,023       16+2
Block RAM Port Aspect Ratios
Block RAM Interface
Block RAM Ports
Block RAM with synchronous read in Read-First Mode
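Read-first behavior with synchronous read can be sketched behaviorally as below: on a write, the output register receives the old contents of the addressed location. The names and the 1024 x 16 size are assumptions for illustration, and whether a tool maps this description onto a block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity bram_read_first is  -- hypothetical name, for illustration only
  port (
    clk  : in  std_logic;
    en   : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(9 downto 0);
    di   : in  std_logic_vector(15 downto 0);
    do   : out std_logic_vector(15 downto 0)
  );
end entity bram_read_first;

architecture rtl of bram_read_first is
  type ram_t is array (0 to 1023) of std_logic_vector(15 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        -- Synchronous read: do is updated only on the clock edge.
        -- The signal ram still holds its old value here, so the previous
        -- contents are read even when a write happens on the same edge.
        do <= ram(to_integer(unsigned(addr)));
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= di;
        end if;
      end if;
    end if;
  end process;
end architecture rtl;
```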
Features of Block RAMs in Spartan-6 FPGAs