Chapter9 Intro FPGA
abdulhussain-amravatiwala -
8/2/2019 Chapter9 Intro FPGA
Introduction to FPGA Technology, Devices and Tools
FPGA Devices & Technology
World of Integrated Circuits
Full-custom ASICs
Semi-custom ASICs
User-programmable devices: PLD, FPGA
ASIC (Application-Specific Integrated Circuit)
  Designed all the way from behavioral description to physical layout
  Designs must be sent for expensive and time-consuming fabrication in a semiconductor foundry
FPGA (Field-Programmable Gate Array)
  Small development overhead
  No NRE (non-recurring engineering) costs
  Quick time to market
  No minimum quantity order
  Reprogrammable
How can we make programmable logic?
One-time programmable
  Fuses (destroy internal links with current)
  Anti-fuses (grow internal links)
  PROM
Reprogrammable
  EPROM
  EEPROM
  Flash
  SRAM (volatile)
What is an FPGA?
[Figure: FPGA block diagram showing configurable logic blocks, I/O blocks, and block RAMs]
Which Way to Go?
ASICs
  High performance
  Low power
  Low cost in high volumes
FPGAs
  Off-the-shelf
  Low development cost
  Short time to market
  Reconfigurability
Other FPGA Advantages
The manufacturing cycle for an ASIC is very costly and lengthy, and engages a lot of manpower
Mistakes not detected at design time have a large impact on development time and cost
FPGAs are perfect for rapid prototyping of digital circuits
Easy upgrades, as with software
Unique applications
  reconfigurable computing
Major FPGA Vendors
SRAM-based FPGAs
  Xilinx, Inc. and Altera Corp. (together share over 60% of the market)
  Atmel
  Lattice Semiconductor
Flash & antifuse FPGAs
  Actel Corp.
  QuickLogic Corp.
XILINX
Xilinx
Primary products: FPGAs and the associated CAD software
  Programmable Logic Devices
  ISE Alliance and Foundation Series Design Software
Main headquarters in San Jose, CA
Fabless* Semiconductor and Software Company
  UMC (Taiwan) {*Xilinx acquired an equity stake in UMC in 1996}
  Seiko Epson (Japan)
  TSMC (Taiwan)
Basic Spartan-II FPGA Block Diagram
CLB Structure
[Figure: a CLB made of two slices; each slice contains two look-up tables (inputs F1 to F4 and G1 to G4), carry and control logic, and two flip-flops, with signals CIN, COUT, CLK, CE, SR, BY, F5IN and outputs X, XB, Y, YB]
Each slice has 2 LUT-FF pairs with associated carry logic
Two 3-state buffers (BUFT) associated with each CLB, accessible by all CLB outputs
CLB Slice Structure
Each slice contains two sets of the following:
Four-input LUT
  Any 4-input logic function,
  or 16-bit x 1 sync RAM
  or 16-bit shift register
Carry & Control
  Fast arithmetic logic
  Multiplier logic
  Multiplexer logic
Storage element
  Latch or flip-flop
  Set and reset
  True or inverted inputs
  Sync. or async. control
LUT (Look-Up Table) Functionality
Look-up tables are the primary elements for logic implementation
Each LUT can implement any function of 4 inputs
[Figure: a 4-input LUT (inputs x1 to x4, output y) with two example truth tables. For inputs 0000 through 1111 the output columns read 0100010101001100 and 1111111111110000; the second is equivalent to a two-input gate on x1 and x2, y = NOT (x1 AND x2)]
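The look-up behaviour on this slide can be sketched as a small behavioral VHDL model. This is only an illustration: the entity name lut4_model, the generic INIT, and the choice of x1 as the most significant address bit are assumptions made for the example, not vendor primitives. The default contents x"0FFF" reproduce the second truth table of the figure (y is 0 only when x1 = x2 = 1).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioral model of a 4-input look-up table.
entity lut4_model is
  generic (
    -- Truth-table contents: bit i holds the output for input value i.
    INIT : std_logic_vector(15 downto 0) := x"0FFF"
  );
  port (
    x1, x2, x3, x4 : in  std_logic;  -- x1 is assumed to be the MSB
    y              : out std_logic
  );
end entity lut4_model;

architecture behavioral of lut4_model is
  signal addr : std_logic_vector(3 downto 0);
begin
  -- The four inputs form an address into the 16-entry table.
  addr <= x1 & x2 & x3 & x4;
  y    <= INIT(to_integer(unsigned(addr)));
end architecture behavioral;
```

Changing INIT changes the implemented function without changing the structure, which is the sense in which one LUT covers any function of 4 inputs.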
5-Input Functions implemented using two LUTs
One CLB slice can implement any function of 5 inputs
Logic function is partitioned between two LUTs
F5 multiplexer selects LUT
[Figure: two LUTs (address inputs A4 to A1 driven by F4 to F1, configurable as ROM or RAM) feeding the F5 multiplexer, which is controlled by BX and drives output X]
5-Input Functions implemented using two LUTs
X5 X4 X3 X2 X1 Y
0 0 0 0 0 0
0 0 0 0 1 1
0 0 0 1 0 0
0 0 0 1 1 0
0 0 1 0 0 1
0 0 1 0 1 1
0 0 1 1 0 0
0 0 1 1 1 0
0 1 0 0 0 1
0 1 0 0 1 0
0 1 0 1 0 0
0 1 0 1 1 1
0 1 1 0 0 1
0 1 1 0 1 1
0 1 1 1 0 1
0 1 1 1 1 1
1 0 0 0 0 0
1 0 0 0 1 0
1 0 0 1 0 0
1 0 0 1 1 0
1 0 1 0 0 0
1 0 1 0 1 0
1 0 1 1 0 0
1 0 1 1 1 1
1 1 0 0 0 0
1 1 0 0 1 1
1 1 0 1 0 0
1 1 0 1 1 1
1 1 1 0 0 0
1 1 1 0 1 1
1 1 1 1 0 0
1 1 1 1 1 0
[Figure: the two halves of the truth table are stored in two LUTs whose outputs are multiplexed onto OUT]
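As a sketch of how the table above might be split, the following VHDL stores the X5 = 0 half and the X5 = 1 half in two 16-bit constants and lets X5 choose between them, playing the role of the F5 multiplexer. The entity name func5_two_luts and the constant names are hypothetical; the two hex values were derived by hand from the Y column (bit i is the output for X4 X3 X2 X1 = i) and should be checked against the table before reuse.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model: a 5-input function built from two 4-input LUTs.
entity func5_two_luts is
  port (
    x : in  std_logic_vector(5 downto 1);  -- x(5) = X5 ... x(1) = X1
    y : out std_logic
  );
end entity func5_two_luts;

architecture behavioral of func5_two_luts is
  -- Rows of the truth table with X5 = 0 and with X5 = 1.
  constant LUT_X5_0 : std_logic_vector(15 downto 0) := x"F932";
  constant LUT_X5_1 : std_logic_vector(15 downto 0) := x"2A80";
  signal idx                : integer range 0 to 15;
  signal lut0_out, lut1_out : std_logic;
begin
  -- Both LUTs see the same four low-order inputs.
  idx      <= to_integer(unsigned(x(4 downto 1)));
  lut0_out <= LUT_X5_0(idx);
  lut1_out <= LUT_X5_1(idx);
  -- The fifth input selects which LUT output is used.
  y <= lut1_out when x(5) = '1' else lut0_out;
end architecture behavioral;
```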
Dedicated Expansion Multiplexers
MUXF5 combines 2 LUTs to create
  Any 5-input function (LUT5)
  Or selected functions up to 9 inputs
  Or 4x1 multiplexer
MUXF6 combines 2 slices to form
  Any 6-input function (LUT6)
  Or selected functions up to 19 inputs
  Or 8x1 multiplexer
Dedicated muxes are faster and more space efficient
[Figure: a CLB with two slices, each containing two LUTs and a MUXF5; a MUXF6 combines the two slices]
Distributed RAM
CLB LUT configurable as Distributed RAM
  A LUT equals 16x1 RAM
  Implements Single and Dual-Ports
  Cascade LUTs to increase RAM size
Synchronous write
Synchronous/Asynchronous read
  Accompanying flip-flops used for synchronous read
[Figure: one LUT maps to a RAM16X1S; two LUTs map to a RAM32X1S, a RAM16X2S, or a dual-port RAM16X1D (ports D, WE, WCLK, A0 to A3, DPRA0 to DPRA3, SPO, DPO)]
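A 16x1 memory with synchronous write and asynchronous read, as listed above, might be described as follows. This is a behavioral sketch under assumptions: the entity name ram16x1_sketch is made up, the port names only loosely follow the figure, and whether a given synthesis tool maps the description onto LUT-based distributed RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16 x 1 single-port RAM: synchronous write, asynchronous read.
entity ram16x1_sketch is
  port (
    wclk : in  std_logic;
    we   : in  std_logic;
    a    : in  std_logic_vector(3 downto 0);
    d    : in  std_logic;
    o    : out std_logic
  );
end entity ram16x1_sketch;

architecture behavioral of ram16x1_sketch is
  signal mem : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- Write happens only on the rising clock edge when WE is asserted.
  process (wclk)
  begin
    if rising_edge(wclk) then
      if we = '1' then
        mem(to_integer(unsigned(a))) <= d;
      end if;
    end if;
  end process;

  -- Read is combinational; registering o would give a synchronous read.
  o <= mem(to_integer(unsigned(a)));
end architecture behavioral;
```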
Shift Register
Register-rich FPGA
  Allows for addition of pipeline stages to increase throughput
Data paths must be balanced to keep desired functionality
[Figure: two parallel data paths labelled 64; one passes through Operation A (4 cycles) and Operation B (8 cycles), 12 cycles in total, the other through Operation C (3 cycles), leaving a 9-cycle imbalance]
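One way to remove such an imbalance is to delay the shorter path with a shift register. The sketch below is a generic delay line; the entity name delay_line and the generics WIDTH and DEPTH are hypothetical, and the defaults of 64 and 9 are only read off the figure labels. A synthesis tool may or may not map it onto LUT-based shift registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical delay line: dout is din delayed by DEPTH clock cycles.
entity delay_line is
  generic (
    WIDTH : positive := 64;  -- assumed data-path width
    DEPTH : positive := 9    -- cycles needed to balance the paths
  );
  port (
    clk  : in  std_logic;
    din  : in  std_logic_vector(WIDTH - 1 downto 0);
    dout : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity delay_line;

architecture rtl of delay_line is
  type stage_array is array (0 to DEPTH - 1)
    of std_logic_vector(WIDTH - 1 downto 0);
  signal stages : stage_array := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Each stage takes the previous stage's value once per clock.
      stages(0) <= din;
      for i in 1 to DEPTH - 1 loop
        stages(i) <= stages(i - 1);
      end loop;
    end if;
  end process;

  dout <= stages(DEPTH - 1);
end architecture rtl;
```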
Carry & Control Logic
[Figure: slice diagram with the carry and control logic blocks between the look-up tables and the flip-flops]
Fast Carry Logic
Each CLB contains separate logic and routing for the fast generation of sum & carry signals
  Increases efficiency and performance of adders, subtractors, accumulators, comparators, and counters
Carry logic is independent of normal logic and routing resources
[Figure: carry chain running from LSB to MSB through dedicated carry logic and routing]
Accessing Carry Logic
All major synthesis tools can infer carry logic for arithmetic functions
  Addition (SUM
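For instance, an adder written with the ordinary "+" operator is the kind of description from which carry logic is typically inferred. The entity name adder8 and the 8-bit width are arbitrary choices for this sketch; the actual mapping onto the carry chain depends on the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 8-bit adder with a 9-bit result (the top bit is the carry out).
entity adder8 is
  port (
    a, b : in  std_logic_vector(7 downto 0);
    sum  : out std_logic_vector(8 downto 0)
  );
end entity adder8;

architecture rtl of adder8 is
begin
  -- Operands are widened by one bit so the carry is not lost.
  sum <= std_logic_vector(resize(unsigned(a), 9) + resize(unsigned(b), 9));
end architecture rtl;
```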
Block RAM
Most efficient memory implementation
  Dedicated blocks of memory
Ideal for most memory requirements
  4 to 14 memory blocks
  4096 bits per block
  Use multiple blocks for larger memories
Builds both single and true dual-port RAMs
[Figure: Spartan-II true dual-port block RAM with Port A and Port B]
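A memory of exactly 4096 bits, here organised as 256 words of 16 bits, could be described as below. This is a single-port sketch with a registered (synchronous) read; the entity name bram_256x16 and its port names are assumptions, and whether it lands in a dedicated block RAM is decided by the synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 256 x 16 single-port RAM (4096 bits) with synchronous read.
entity bram_256x16 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;
    we   : in  std_logic;
    addr : in  std_logic_vector(7 downto 0);
    din  : in  std_logic_vector(15 downto 0);
    dout : out std_logic_vector(15 downto 0)
  );
end entity bram_256x16;

architecture rtl of bram_256x16 is
  type ram_t is array (0 to 255) of std_logic_vector(15 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Registered read: returns the word stored before this edge.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```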
Dual Port Block RAM
Dual-Port Bus Flexibility
Each port can be configured with a different data bus width
Provides easy data width conversion without any additional logic
[Figure: RAMB4_S4_S16. Port A: 4-bit width, 1K depth (WEA, ENA, RSTA, CLKA, ADDRA[9:0], DIA[3:0], DOA[3:0]). Port B: 16-bit width, 256 depth (WEB, ENB, RSTB, CLKB, ADDRB[7:0], DIB[15:0], DOB[15:0])]
Two Independent Single-Port RAMs
Added advantage of True Dual-Port
  No wasted RAM bits
Can split a Dual-Port 4K RAM into two Single-Port 2K RAMs
  Simultaneous independent access to each RAM
To access the lower RAM
  Tie the MSB address bit to Logic Low
To access the upper RAM
  Tie the MSB address bit to Logic High
[Figure: RAMB4_S1_S1 with 1-bit-wide ports A and B, each used as a 2K-deep RAM; one port's address is driven with GND, ADDR[10:0] and the other's with VCC, ADDR[10:0]]
I/O Banking
Basic I/O Block Structure
[Figure: I/O block with three flip-flops (D, EC, Q, SR), one each for the three-state control, the output path and the input path; signals include Three-State, Output, Clock, Set/Reset, Direct Input, Registered Input and FF Enable]
IOB Functionality
IOB provides interface between the package pins and CLBs
Each IOB can work as uni- or bi-directional I/O
Outputs can be forced into High Impedance
Inputs and outputs can be registered
  advised for high-performance I/O
Inputs can be delayed
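The bi-directional, registered and high-impedance behaviour listed above could be written as follows. This is only a sketch: the entity name bidir_pin and all port names are invented for the example, and packing the three registers into the I/O block itself is left to the implementation tools.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical bi-directional pin with registered input, output and enable.
entity bidir_pin is
  port (
    clk      : in    std_logic;
    out_en   : in    std_logic;  -- '1' drives the pin, '0' releases it
    data_out : in    std_logic;  -- value to drive onto the pin
    data_in  : out   std_logic;  -- registered value read from the pin
    pad      : inout std_logic   -- the package pin
  );
end entity bidir_pin;

architecture rtl of bidir_pin is
  signal out_reg, oe_reg, in_reg : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      out_reg <= data_out;  -- output-path register
      oe_reg  <= out_en;    -- three-state control register
      in_reg  <= pad;       -- input-path register
    end if;
  end process;

  -- 'Z' puts the output into high impedance so the pin can act as an input.
  pad     <= out_reg when oe_reg = '1' else 'Z';
  data_in <= in_reg;
end architecture rtl;
```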
Routing Resources
[Figure: array of CLBs interconnected through programmable switch matrices (PSMs)]
Clock Distribution
FPGA Nomenclature
ALTERA
Device Families & Tools
Logic Element: FLEX10K
Logic Array Block: FLEX10K
FLEX10K Architecture
Stratix Architecture
Stratix Device Family
Feature | EP1S10 | EP1S20 | EP1S25 | EP1S30 | EP1S40 | EP1S60 | EP1S80 | EP1S120
Logic Elements (LEs) | 10,570 | 18,460 | 25,660 | 32,470 | 41,250 | 57,120 | 79,040 | 114,140
M512 RAM Blocks (512 bits + parity) | 94 | 194 | 224 | 295 | 384 | 574 | 767 | 1,118
M4K RAM Blocks (4 Kbits + parity) | 60 | 82 | 138 | 171 | 183 | 292 | 364 | 520
M-RAM Blocks (512 Kbits + parity) | 1 | 2 | 2 | 4 | 4 | 6 | 9 | 12
Total RAM bits | 920,448 | 1,669,248 | 1,944,576 | 3,317,184 | 3,423,744 | 5,215,104 | 7,427,520 | 10,118,016
DSP Blocks | 6 | 10 | 10 | 12 | 14 | 18 | 22 | 28
Embedded Multipliers | 48 | 80 | 80 | 96 | 112 | 144 | 176 | 224
PLLs | 6 | 6 | 6 | 10 | 12 | 12 | 12 | 12
Maximum User I/O Pins | 426 | 586 | 706 | 726 | 822 | 1,022 | 1,238 | 1,314
Engineering Sample Availability | Now | Use Production | Use Production | N/A | Now | N/A | Now | 2003
Production Device Availability | March 2003 | Now | Now | Now | March 2003 | April 2003 | January 2003 | 2003
FPGA Technology Roadmap
Year | 1995 | 1996 | 1997 | 2000 | 2003 | 2004 ?
Technology (µm) | 0.6 | 0.35 | 0.25 | 0.18 | 0.13 | 0.07
Gate count | 25K | 100K | 250K | 1M | 100K LC*, 8 Mb RAM, 400 18x18 multipliers |
Transistor count | 3.5M | 12M | 23M | 75M | 430M | 1B
*Note: Xilinx Virtex-II Pro XC2VP100 (9/16/2003)
Advanced architecture on modern FPGAs
More guts
Additional components
RAM blocks
Dedicated multipliers
Tri-state buffers
Transceivers
Processor cores
DSP blocks
Dedicated Arithmetic Blocks
Altera
Xilinx
QuickLogic
Processor Cores
PowerPC on Virtex-II Pro
Embedded 300+ MHz Harvard Architecture Core
Low Power Consumption: 0.9 mW/MHz
Five-Stage Data Path Pipeline
Hardware Multiply/Divide Unit
Thirty-Two 32-bit General Purpose Registers
16 KB Two-Way Set-Associative Instruction Cache
16 KB Two-Way Set-Associative Data Cache
Memory Management Unit (MMU)
- 64-entry unified Translation Look-aside Buffers (TLB)
- Variable page sizes (1 KB to 16 MB)
Dedicated On-Chip Memory (OCM) Interface
Supports IBM CoreConnect Bus Architecture
Debug and Trace Support
Timer Facilities
ARM in Excalibur
Industry-standard ARM922T 32-bit RISC processor core operating up to 200 MHz
ARMv4T instruction set with Thumb extensions
Memory management unit (MMU) included for real-time operating systems (RTOS) support
Harvard cache architecture with 64-way set associative separate 8-Kbyte instruction and 8-Kbyte data caches
Embedded programmable on-chip peripherals
ETM9 embedded trace module to assist software debugging
Flexible interrupt controller
Universal asynchronous receiver/transmitter (UART)
General-purpose timer
Watchdog timer
FPGA Tools
Design process (1)
Specification (Lab Experiments)
Design and implement a simple unit that speeds up encryption with an RC5-like cipher with a fixed key set on an 8031 microcontroller. Unlike in Experiment 5, this time your unit has to be able to perform the encryption algorithm by itself, executing 32 rounds.
VHDL description (Your Source Files)
library IEEE;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
entity RC5_core is
  port(clock, reset, encr_decr: in std_logic;
       data_input: in std_logic_vector(31 downto 0);
       data_output: out std_logic_vector(31 downto 0);
       out_full: in std_logic;
       key_input: in std_logic_vector(31 downto 0);
       key_read: out std_logic
  );
end RC5_core;
Functional simulation
Synthesis
Post-synthesis simulation
Design process (2)
Implementation
Configuration
Timing simulation
On chip testing
Active-HDL
Simulation Tools
Synthesis Tools
Logic Synthesis
architecture MLU_DATAFLOW of MLU is
  signal A1: STD_LOGIC;
  signal B1: STD_LOGIC;
  signal Y1: STD_LOGIC;
  signal MUX_0, MUX_1, MUX_2, MUX_3: STD_LOGIC;
begin
A1
Features of synthesis tools
Interpret RTL code
Produce synthesized circuit netlist in a standard EDIF format
Give preliminary performance estimates
Some can display circuit schematics corresponding to EDIF netlist
Implementation
After synthesis the entire implementation process is performed by FPGA vendor tools
Xilinx ISE Foundation 6.2i
Altera Quartus II 4.0
3rd party tools for alliance version
Circuit Compilation
1. Technology Mapping
2. Placement
   Assign a logical LUT to a physical location.
3. Routing
   Select wire segments and switches for interconnection.
[Figure: logical LUTs being assigned to physical locations on the device]
Routing Example
[Figure: routing example showing programmable connections inside the FPGA]
Static Timing Analyzer
Performs static analysis of the circuit performance
Reports critical paths with all sources of delays
Determines maximum clock frequency
Static Timing Analysis
Critical Path: The Longest Path From Outputs of Registers to Inputs of Registers
[Figure: two flip-flops (in, out) clocked by clk, with combinational logic of delay tP logic between them]
tCritical = tP FF + tP logic + tS FF
Min. Clock Period = Length of The Critical Path
Max. Clock Frequency = 1 / Min. Clock Period
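As a worked illustration of these formulas, assume purely hypothetical delays of 1 ns for the flip-flop propagation delay, 7 ns for the logic and 2 ns for the flip-flop setup time. These numbers are chosen for easy arithmetic and are not taken from any datasheet.

```latex
\begin{align*}
t_{\text{Critical}} &= t_{P\,FF} + t_{P\,logic} + t_{S\,FF}
                     = 1\,\text{ns} + 7\,\text{ns} + 2\,\text{ns} = 10\,\text{ns} \\
f_{\max} &= \frac{1}{t_{\text{Critical}}} = \frac{1}{10\,\text{ns}} = 100\,\text{MHz}
\end{align*}
```

Under these assumptions, shortening the logic delay is the only way to raise the clock frequency, since the two flip-flop terms are fixed by the device.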
Configuration
Once a design is implemented, you must create a file that the FPGA can understand
  This file is called a bit stream: a BIT file (.bit extension)
The BIT file can be downloaded directly to the FPGA, or can be converted into a PROM file which stores the programming information