The DSP Primer 8 - Engineeringmbolic/elg6163/FPGATechnologyXIlinx.pdf · 2005-10-22 · Notes:...

August 2005, University of Strathclyde, Scotland, UK. For Academic Use Only. The DSP Primer 8: FPGA Technology.

8.1 Introduction

• This module will give a “top-down” overview of FPGA Technology based on various Xilinx devices;

• At the end of the section, the following will have been covered:

• FPGA Technology Roadmap and the various devices available - how FPGAs are progressing and what might lie ahead;

• Performance and flexibility - how FPGAs compare to DSP Processors and ASICs and why FPGAs have the advantage;

• FPGA Structure - a top down look at what an FPGA consists of down to the low level elements;

• Introduction to the FPGA design flow - an indication of the engineering process required to implement a design;

• How the digital logic of a design actually operates within the FPGA;

• Why pipelining and flip-flops/registers are “free” and are required for high clock rates;

• Memory available to designers within FPGAs and the different types/options available;

• How signals and clocks are effectively routed throughout the device;

• Input/Output interfacing capabilities of FPGAs;

• Dedicated arithmetic hardware.

8.2 FPGA Technology Trends

• General trend is bigger and faster;

• This is being achieved by increases in device density through ever smaller fabrication process technology;

• New generations of FPGAs are geared towards implementing entire systems on a single device;

• Features such as RAM, dedicated arithmetic hardware, clock management and transceivers are available in addition to the main programmable logic;

• FPGAs are also available with embedded processors (embedded in silicon or as cores within the programmable logic fabric);

Notes:

FPGAs are being incorporated as central processing elements in many applications such as consumer electronics, automotive, image/video processing, military/aerospace, base-stations, networking/communications, supercomputing and wireless applications.

The inclusion of embedded (i.e. actually present in silicon - not as soft IP) PowerPC processors in recent Xilinx devices makes design partitioning and implementation much easier. Many low-speed algorithms that involve a lot of decision making and jumps in execution are more suited to implementation by a microprocessor than by an FPGA. The inclusion of the PowerPC blocks by Xilinx is an acknowledgement of this and goes a long way to making the “System on an FPGA” goal possible.

Manufacturers may also provide embedded processors as “soft” IP cores. These cores are implemented on the main programmable logic fabric and associated development kits allow designers to write code to be executed.

Features such as dedicated arithmetic hardware, clock management and multi-standard, high speed I/O blocks all assist the engineer in implementing a given design. Problems associated with such features that plague ASIC (Application Specific Integrated Circuit) designers, such as clock skew, have all been solved by the FPGA manufacturer and can be essentially ignored by the FPGA engineer.

8.3 FPGA Families

• Flagship FPGA families (e.g. Xilinx Virtex-4) are aimed at implementing large systems on a single device;

• Flagship families are the biggest and most expensive and are not aimed at high volume applications where cost is a primary factor;

• High volume applications (i.e. where an ASIC would traditionally have been used) are catered for by cheaper FPGA families (e.g. Xilinx Spartan-3);

• High volume devices often contain the same features offered by the flagship devices at a smaller scale to control costs;

• Within FPGA families, multiple device sizes are available at scaling costs with associated scaling of features such as logic fabric, RAM, I/O pins, arithmetic hardware etc.

Notes:

Often, low cost, high volume FPGA families are derived directly from larger families, making the design process more familiar (e.g. Spartan-3 from Virtex-II, Spartan-II from Virtex).

Each FPGA family comes in different sizes/packages and speed grades. The exact device required will depend on factors related to requirements of the target design/application such as:

• Area;

• Data/sampling rates;

• Input/Outputs and associated data rates;

• Memory required;

• Requirement for embedded processor or not;

• Cost ($$$).

8.4 FPGA Performance and Flexibility (I)

• Performance of FPGAs is difficult to quantify because algorithms/systems can be flexibly implemented in many different ways;

• Multiply Accumulate (MAC) performance on flagship devices from Xilinx is in the region of hundreds of GMACs per second running at speeds of a few hundred MHz;

• FPGA manufacturers often give figures for maximum MAC/s using every piece of logic capable of multiplication - this of course does not reflect typical systems implemented on FPGAs;

• What is clear is that, due to parallelism, FPGAs easily outperform DSP Processors in terms of data/arithmetic throughput and flexibility;

• DSP Processors still have their place though - their design flow is better understood within the engineering community and some baseband algorithms do not yet map well to the FPGA fabric;
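The headline MAC/s figures quoted by manufacturers can be reproduced with simple arithmetic: every multiplier is assumed to complete one MAC on every clock cycle. The sketch below illustrates that calculation only. The function name and the device numbers (512 multipliers at 500 MHz) are hypothetical values chosen for illustration, not figures taken from a data sheet.

```python
def peak_gmacs(num_multipliers: int, clock_mhz: float) -> float:
    """Theoretical peak in GMAC/s: one MAC per multiplier per clock cycle."""
    return num_multipliers * clock_mhz / 1000.0


# Hypothetical device: 512 multipliers clocked at 500 MHz.
print(peak_gmacs(512, 500.0))  # 256.0 GMAC/s

# A real design uses only some of the multipliers, so the achieved
# figure is lower. Here an assumed 10% utilisation:
print(peak_gmacs(512, 500.0) * 0.10)
```

This is why the peak figure indicates potential rather than typical performance: it assumes a design built from nothing but multipliers.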

Notes:

MIPS (Millions of Instructions Per Second or perhaps Meaningless Indicator Of Performance) is often used to compare DSP Processors but cannot be used to quantify overall FPGA performance.

The problem is that FPGAs are flexible enough to implement algorithms in different ways to suit the requirements of a particular application.

For example, an application that requires 10 MACs (Multiply Accumulates) can be implemented on an FPGA or a DSP processor. The FPGA could implement the hardware to perform the 10 MACs one after the other in serial, taking 10 clock cycles, or in parallel, taking 1 clock cycle. Indeed it is possible to perform the 10 MACs in 5 clock cycles, or 2 clock cycles - as required. A DSP Processor does not have as much flexibility.

Why is this flexibility useful? If the 10 MACs must be performed quickly, the FPGA can use a lot of area and perform them in parallel in 1 clock cycle. If the 10 MACs can be done slowly (as defined by the system performance requirements), the FPGA can perform them serially using a 10th of the area but taking 10 clock cycles - i.e. the FPGA hardware implementation can be tailored to the application and take advantage of the application requirements/specification.

In this way, speed and area can be traded when implementing on FPGA - DSP processors do not have this option.

It should also be noted that it is very unlikely that anyone would ever implement an FPGA design that consisted only of multipliers! Figures given by manufacturers are merely intended to give an idea of the potential performance of these devices and by how far they outperform DSP Processors (considerably!)
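The speed/area trade described in these notes can be tabulated with a few lines of Python. This is a minimal sketch under an idealised assumption: area scales linearly with the number of MAC units and control overhead is ignored. The function name `mac_schedule` is illustrative only.

```python
from math import ceil


def mac_schedule(num_macs: int, mac_units: int) -> dict:
    """Cycles and relative area when num_macs MACs share mac_units hardware units."""
    if not 1 <= mac_units <= num_macs:
        raise ValueError("mac_units must be between 1 and num_macs")
    return {
        "mac_units": mac_units,
        "cycles": ceil(num_macs / mac_units),
        # Area relative to the fully parallel design (assumed linear in units).
        "relative_area": mac_units / num_macs,
    }


# The 10-MAC example: fully serial, two intermediate points, fully parallel.
for units in (1, 2, 5, 10):
    print(mac_schedule(10, units))
```

One unit gives 10 cycles at a tenth of the area, two units give 5 cycles, five units give 2 cycles, and ten units give 1 cycle at full area, matching the options listed above.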

8.5 FPGA Performance and Flexibility (II)

Notes:

More on DSP Processors vs FPGAs.

It must be remembered that an FPGA is still an ASIC - Xilinx are manufacturers of FPGAs but they are still fully custom integrated circuits at the end of the day - even though they are a special case due to the fact they are highly programmable...

DSP Processors are also ASICs and as ASIC process technology improves and chips get faster, DSP Processors will get faster...

but so will FPGAs because they are ASICs too! FPGAs already hold a performance advantage gap over DSP Processors and this gap will not close as silicon processes get better.

Diagram: “FPGAs: DSP for Consumer Digital Video Applications”, Xilinx, http://www.xilinx.com/esp/dvt/collateral/fpga_dsp_adv_in_dvt.pdf

8.6 FPGA Performance and Flexibility (III)

Notes:

A rather hand-wavy diagram that gives an indication of where FPGAs lie in the grand scheme of things in relation to Custom ICs (ASICs) and DSP Processors.

The surge in FPGA use by manufacturers of electronic systems does seem to indicate that this diagram is close to the mark however.

The costs and time involved in manufacturing ASICs are prohibitive (especially if bugs are found) when a designer can have a design running in hardware on an FPGA at their desk and iterate the design as many times as required with no expensive fabrication in sight!

Diagram: “FPGAs: DSP for Consumer Digital Video Applications”, Xilinx, http://www.xilinx.com/esp/dvt/collateral/fpga_dsp_adv_in_dvt.pdf

8.7 FPGA Design Flow

• This is a highly simplified overview of the Xilinx FPGA design flow;

• Numerous file format conversions occur between the many pieces of software;

• The engineer can control and influence all stages of the process via constraints and options;

• The FPGA market contains many companies that produce software tools for various stages of the flow;

• The final bitstream configures every part of the device required for the implemented design.

Notes:

A more detailed design flow is given below - this doesn’t even show all of the possible stages although it does contain most! It may become clear why the FPGA design flow produces so many files and directories when you consider all of the processes below. Several stages are grouped/automated and can be run by the high-level software tools if desired. The engineer usually has the option of running each stage manually however!

Flow diagrams: Xilinx Software Manuals, http://toolbox.xilinx.com/docsan/xilinx5/manuals.htm

8.8 Xilinx Virtex-II Pro FPGA Architecture

• High-level, generic view of the Xilinx Virtex-II Pro family;

• As device size increases, so does the amount of available resources such as embedded multipliers, processors and configurable logic;

• The CLBs (Configurable Logic Blocks) form the main programmable fabric of the device;

• DCMs (Digital Clock Managers) solve clock management issues such as skew, phase shifting and division;

• Larger devices also contain more user I/O pins and I/O functionality.

Notes:

An FPGA is rather abstract looking and it may not appear obvious how a user design maps to the actual hardware. Luckily, the software tools can take care of a lot of the complexity of doing this once the user has defined their design. There is still a considerable amount of work for the engineer however and this is especially true when pushing the limits of the hardware - at this point the software tools may not do a good enough job and the engineer must get in and around the “nuts and bolts” themselves!

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.9 Xilinx Virtex-II Configurable Logic Blocks

• One Xilinx Virtex-II CLB contains four slices (Virtex/Spartan series have two slices per CLB);

• Any digital logic design can be implemented within the slice logic housed by the CLBs;

• Slices are interconnected within their CLBs and via the switch matrix that links CLBs together;

• The Cin and Cout signals are significant because they are highly useful for implementing arithmetic functions. Two independent Cin/Cout columns exist per CLB column;

• One slice can implement a 2-bit full adder so one CLB can implement two independent 4-bit full adders as part of a larger bit-width calculation with other CLBs as required.
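The role of the Cin/Cout carry chain can be illustrated with a small behavioural model. The sketch below is an assumption-laden illustration, not a description of the actual slice circuitry: each "slice" is modelled as two full adders, and wider adders are formed by passing Cout of one slice into Cin of the next. All function names are hypothetical.

```python
def full_adder(a: int, b: int, cin: int) -> tuple:
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def slice_add2(a2: int, b2: int, cin: int) -> tuple:
    """Model one slice as a 2-bit adder: two full adders joined by the carry."""
    s0, c0 = full_adder(a2 & 1, b2 & 1, cin)
    s1, c1 = full_adder((a2 >> 1) & 1, (b2 >> 1) & 1, c0)
    return (s1 << 1) | s0, c1


def chained_add(a: int, b: int, width: int) -> tuple:
    """Add two unsigned width-bit numbers by chaining slices, Cout to Cin."""
    if width % 2:
        raise ValueError("width must be even (2 bits per modelled slice)")
    result, carry = 0, 0
    for i in range(0, width, 2):
        s, carry = slice_add2((a >> i) & 3, (b >> i) & 3, carry)
        result |= s << i
    return result, carry


# A 4-bit adder uses two modelled slices, i.e. half of a four-slice CLB.
print(chained_add(5, 6, 4))  # (11, 0)
print(chained_add(9, 7, 4))  # (0, 1): the carry out feeds the next slice
```

Extending `width` corresponds to continuing the carry chain into further slices and CLBs.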

Notes:

Once the user has entered their design (via VHDL/Verilog for example), the “Synthesis” process takes the design and works out how to implement it on the elements of a specific FPGA. The engineer specifies exactly which device to target (i.e. manufacturer, family, size, package type, speed grade). The synthesis process is a complex one that can turn any synthesisable VHDL/Verilog into a form that can be taken to FPGA by further software tools.

In the case of Xilinx, the Synthesis tool will decide how to perform the digital logic operations of the design using the slice logic available. The FPGA manufacturer tools then take the design through many more stages in order to get the design into a form from which a bitstream is produced that can be downloaded to an FPGA to configure it.

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.10 Xilinx Virtex-II Slices (I)

• The majority of user-design functionality will be implemented by the slices contained by the CLBs;

• For this reason, the primary measure of Xilinx FPGA device size is the number of slices present;

• Many interconnection possibilities exist between slice elements (connections and many elements not shown here);

• The Look Up Tables (LUTs) implement any 4-input boolean function - the majority of a user digital logic design will be implemented using the 4-input LUTs to perform the actual logic operations;

• LUTs can also be used as Shift-Registers or RAM - discussed later.
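A 4-input LUT can implement any 4-input boolean function because it simply stores the function's 16-entry truth table, with the four inputs acting as an address. The following sketch models that idea in Python. The helper names and the example function are hypothetical, and the bit ordering of the address is an arbitrary choice made for this illustration.

```python
from itertools import product


def make_lut4(func) -> list:
    """Build the 16-entry truth table a 4-input LUT would hold for func(a, b, c, d)."""
    return [int(bool(func(a, b, c, d))) for a, b, c, d in product((0, 1), repeat=4)]


def lut4_read(table: list, a: int, b: int, c: int, d: int) -> int:
    """Evaluate the LUT: the four inputs form a 4-bit address into the table."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]


def example(a, b, c, d):
    """Arbitrary example function: (a AND b) XOR (c OR d)."""
    return (a & b) ^ (c | d)


table = make_lut4(example)
print(table)                      # the 16 bits that would be loaded at configuration
print(lut4_read(table, 1, 1, 0, 0))  # 1
```

Changing the design changes only the 16 stored bits, not the hardware, which is the sense in which the configuration bitstream defines the logic.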

Notes:

Xilinx slices are where the actual “work” that implements the user design happens. The different elements can be interconnected in different ways as determined by the configuration bitstream.

The number of slices available on a device essentially determines its capacity since this is where it all happens!

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.11 Xilinx Virtex-II Slices (II)

• The registers provide the means of implementing synchronous logic;

• Registers are vital when designing for high clock rates - failure to use them will not yield high speed performance;

• The multiplexers and CY components provide some of the routing possibilities for signals through the slice (shown in more detail later);

• The “Arithmetic Logic” AND gate at the bottom has been included to make implementing multiplication more efficient.

Notes:

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.12 Xilinx Virtex-II Slice (top half)

Notes:

All of the interconnections and components are shown.

The software tools will take care of configuring every required element/connection - the user can also do so manually if required!

When the FPGA is configured with a bitstream (generated by the software tools), the contents of the LUTs and the routing between the slice elements are defined - forming the user design. The bitstream will also configure the connection between slices/CLBs etc.

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.13 Registers and Pipelining

[Diagram: “Without Pipelining” - a chain of LUTs between two registers forms the longest/critical path and requires a slow clock. “With Pipelining” - a register (D Q) follows each LUT, allowing a fast clock.]

• Possible FPGA clock rate is limited by the longest path between registers because the signals must travel further through LUTs/wires;

• Using the “free” slice registers keeps the longest path as short as possible and hence the possible clock rate as high as possible.

Notes:

This is one of the fundamental design principles of FPGA design and must be understood.

On each clock edge, signals must travel through their data path via routing lines, LUTs, MUXes etc. before arriving at the next flip-flop. This happens to signals within a design all over the device on every clock edge. Some signals will have further to travel than others and the longest (time) path between two flip-flops/registers is known as the “critical path”. It should be noted that the flip-flops are essentially free because every LUT is paired with a flip-flop that can register the LUT output as required.

It is this critical path that will determine the maximum clock rate that the FPGA can be clocked at. Remember that the user can choose the clock rate arbitrarily as required. If the critical path is too long, the design may not be able to be clocked fast enough to meet the specification of the application!

In this case, the engineer must return to the software tools/their design and try to make the design run faster. This may be achieved by, for example: pipelining, redesign, increasing the effort level of the software tools, adding/removing design constraints or manually editing the design in order to optimise the hardware and reduce the length of the critical path!

It should be noted that this is the most difficult part of FPGA design - what to do if a design does not meet timing! There are many options for the engineer to try and knowing which one(s) to use (and how to use them) can be a bit of a black art...
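The relationship between the critical path and the maximum clock rate can be put into numbers. The sketch below is a simplified model: it treats each register-to-register path as a single delay in nanoseconds and ignores register setup time, clock-to-output delay and routing variation. The three LUT delays are hypothetical values chosen only to show the effect of pipelining.

```python
def max_clock_mhz(path_delays_ns: list) -> float:
    """Maximum clock rate allowed by the slowest register-to-register path."""
    critical_path_ns = max(path_delays_ns)
    return 1000.0 / critical_path_ns


# Hypothetical delays of three LUT stages (including their local routing).
lut_delays_ns = [1.2, 0.9, 1.4]

# Without pipelining: one path passes through all three LUTs.
unpipelined = [sum(lut_delays_ns)]

# With pipelining: a register after every LUT, so each LUT is its own path.
pipelined = list(lut_delays_ns)

print(round(max_clock_mhz(unpipelined), 1))
print(round(max_clock_mhz(pipelined), 1))
```

In this model the pipelined version is limited only by its slowest single stage, at the cost of extra clock cycles of latency, which is the trade the “free” slice registers make cheap.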

8.14 Xilinx Virtex-II Block RAM

• Xilinx Virtex-II devices have dedicated 18 Kb (Kilo-bit) Block RAMs throughout the device;

• One of the largest Virtex-II Pro devices (XC2VP125) has 556 Block RAMs and so 556 * 18 = 10,008 Kb of Block RAM in total;

• Block RAM can be written at device configuration time or written/read during operation;

• Block RAM can be single or dual port - i.e. one address gives 2 pieces of data - excellent for DSP (sample and coefficient for ex.).

Notes:

Engineers specify how they want to use the RAM components from within their VHDL/Verilog code - the software tools then ensure that the actual hardware is made available to the design.

An example of using Block RAM could be to store the numeric values required to modulate a signal by a sine wave.

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf
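As a rough software analogy for the sine-wave example, the sketch below builds a sine lookup table of the kind that could be loaded into a Block RAM at configuration time and uses it to modulate a sample stream. The table depth (1024), word width (16 bits), phase step and function names are all assumptions made for illustration. The only figures taken from the slide are the 18 Kb Block RAM size and the 556-block total.

```python
import math

BLOCK_RAM_KBITS = 18  # per Block RAM, as quoted on the slide


def total_block_ram_kbits(num_blocks: int) -> int:
    """Total Block RAM capacity in Kb for a device with num_blocks blocks."""
    return num_blocks * BLOCK_RAM_KBITS


def sine_table(depth: int = 1024, width: int = 16) -> list:
    """One full sine cycle quantised to signed `width`-bit integers."""
    amplitude = (1 << (width - 1)) - 1
    return [round(amplitude * math.sin(2 * math.pi * k / depth)) for k in range(depth)]


def modulate(samples: list, table: list, step: int) -> list:
    """Multiply each sample by the table entry at a phase that advances by `step`."""
    out, phase = [], 0
    for x in samples:
        out.append(x * table[phase])
        phase = (phase + step) % len(table)
    return out


print(total_block_ram_kbits(556))  # 10008, matching the XC2VP125 figure
table = sine_table()
print(len(table) * 16 <= BLOCK_RAM_KBITS * 1024)  # the table fits in one 18 Kb block
print(modulate([1, 1, 1, 1], table, 256))
```

The phase step sets the modulating frequency relative to the sample rate: stepping through the table faster produces a higher-frequency sine.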

8.15 Xilinx Virtex-II Distributed RAM

• A LUT can store 16 bits and can be used as a 16x1 RAM;

• Two LUTs can form one 32x1 single-port RAM or one 16x1 dual-port RAM - i.e. the same address produces data from both RAMs;

• This flexibility allows several single/dual port RAM configurations of the 128 bits available within one CLB (4 slices * 2 LUTs * 16 bits = 128);

• A Virtex-II Pro with 55,616 slices therefore has 55,616 * 2 LUTs * 16 bits = 1,738 Kb of Distributed RAM;

• The ability to create small RAMs anywhere on the device is extremely useful - especially for DSP purposes.
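The distributed RAM totals on this slide follow directly from the per-LUT figures, and can be checked with a short calculation. The sketch below takes 1 Kb to mean 1024 bits, which is the interpretation under which the slide's 1,738 Kb figure is exact. The constant and function names are illustrative.

```python
BITS_PER_LUT = 16
LUTS_PER_SLICE = 2
SLICES_PER_CLB = 4


def distributed_ram_bits(num_slices: int) -> int:
    """Total distributed RAM in bits if every LUT is used as a 16x1 RAM."""
    return num_slices * LUTS_PER_SLICE * BITS_PER_LUT


# One CLB: 4 slices * 2 LUTs * 16 bits.
print(distributed_ram_bits(SLICES_PER_CLB))  # 128

# The 55,616-slice device quoted on the slide.
bits = distributed_ram_bits(55_616)
print(bits, bits / 1024)  # 1779712 bits = 1738.0 Kb
```

This is an upper bound: any LUT used for logic or as a shift register is no longer available as RAM.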

Notes:

An example of using a small distributed RAM could be a chipping sequence for use in a communications system. The sequence would be stored where it is needed to “chip” data as it proceeds through the system.

The ability to form larger single/dual port configurations from the smaller ones is further testament to FPGA flexibility - distributed RAMs need only be as large as required.

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf
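To make the chipping example concrete, the sketch below spreads each data bit with a stored 16-chip sequence, the size that would fit in a single 16x1 LUT RAM. This is an illustrative software model only: the chip values are arbitrary, not taken from any standard, and the majority-vote despreader is one simple choice among many.

```python
# Hypothetical 16-chip sequence (16 bits: the capacity of one 16x1 LUT RAM).
CHIPS = [1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]


def spread(data_bits: list, chips: list) -> list:
    """XOR every data bit with each chip in turn, producing len(chips) chips per bit."""
    return [bit ^ chip for bit in data_bits for chip in chips]


def despread(chip_stream: list, chips: list) -> list:
    """Recover data bits by comparing each block of chips with the stored sequence."""
    n = len(chips)
    out = []
    for i in range(0, len(chip_stream), n):
        block = chip_stream[i:i + n]
        agree = sum(1 for c, ref in zip(block, chips) if c == ref)
        # A data bit of 0 leaves the sequence unchanged; 1 inverts it.
        out.append(0 if agree > n / 2 else 1)
    return out


data = [1, 0, 0, 1]
tx = spread(data, CHIPS)
print(len(tx))               # 64 chips for 4 data bits
print(despread(tx, CHIPS))   # [1, 0, 0, 1]
```

In hardware the sequence would be read from the distributed RAM one chip per clock, right next to the logic that applies it.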

8.16 Shift Registers

• Xilinx LUTs can implement a 16-bit shift register (called an SRL16) and when combined with the register available to every LUT, 17 delays are possible in one half of a slice;

• Shift registers can be cascaded to form longer delays;

• The delay can be tapped at any point using the address lines to create delay lines of length less than the maximum.

[Diagram: shift register with address inputs A0-A3, clock input CLK, data input D and output Q, followed by a D/Q register.]
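The addressable-tap behaviour of the shift register can be modelled in a few lines. The class below is a behavioural sketch only: the class and method names are hypothetical, and the convention that address 0 selects a delay of one clock is an assumption of this model.

```python
class SRL16:
    """Behavioural model of a 16-stage shift register with an addressable output tap."""

    def __init__(self) -> None:
        self.stages = [0] * 16

    def clock(self, d: int) -> None:
        """Shift the new bit in at stage 0; the oldest bit falls off the end."""
        self.stages = [d & 1] + self.stages[:-1]

    def read(self, addr: int) -> int:
        """Output the stage selected by the 4-bit address (A3..A0)."""
        if not 0 <= addr <= 15:
            raise ValueError("addr must be in 0..15")
        return self.stages[addr]


srl = SRL16()
srl.clock(1)              # shift in a single 1 ...
for _ in range(4):
    srl.clock(0)          # ... followed by four 0s
print(srl.read(4))        # 1: the pulse has been delayed by five clocks
print(srl.read(0))        # 0: the most recent input
```

Changing `addr` selects a shorter or longer delay without changing the hardware, which is how delay lines of less than the maximum length are obtained.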

Notes:

The diagram opposite shows the SRL16s being cascaded to form a larger delay line.

Note the flexibility of the Xilinx LUTs - this is the 3rd mode they can operate in, in addition to LUT/RAM.

Diagram opposite: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.17 Xilinx Virtex-4 DSP48 Slice

• The Xilinx Virtex-4 DSP48 slice offers custom DSP functionality;

• 500MHz throughput

• However, the Transposed/Systolic FIR structures map more effectively in this case;

• Summation feedback is also available for serial implementations;

Notes:

The Virtex-4 DSP48 slice caters for two types of full-parallel FIR - Systolic and Transposed. The Systolic structure allows the highest performance due to maximum pipelining and no high input signal fanout. The Transposed structure has a fixed, low latency compared to the Systolic (whose latency increases with filter length) but the input signal fanout can limit performance, especially for large filters. Both architectures can be entirely implemented within DSP48 slices with no external logic.

Diagrams (Full-Parallel Transposed FIR, Full-Parallel Systolic FIR): “XtremeDSP Design Considerations User Guide”, http://www.xilinx.com
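The transposed structure can be sketched in software to show where the input fanout comes from: every input sample is multiplied by all coefficients at once, and the partial sums move through a chain of registers. The model below is a behavioural illustration using integer arithmetic, not a description of the DSP48 hardware. It is checked against a plain direct-form convolution, and all names are hypothetical.

```python
def fir_direct(x: list, h: list) -> list:
    """Reference direct-form FIR: y[n] = sum over k of h[k] * x[n - k]."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_transposed(x: list, h: list) -> list:
    """Transposed FIR: the input is broadcast to every tap, sums are registered."""
    regs = [0] * (len(h) - 1)   # registers between adjacent taps
    y = []
    for sample in x:
        # The same sample feeds every multiplier (the source of high fanout).
        products = [coeff * sample for coeff in h]
        y.append(products[0] + (regs[0] if regs else 0))
        # Each register takes its tap's product plus the register behind it.
        regs = [products[k + 1] + (regs[k + 1] if k + 1 < len(regs) else 0)
                for k in range(len(regs))]
    return y


coeffs = [1, 2, 3, 4]          # arbitrary example coefficients
signal = [1, 0, 0, 0, 5, -2, 7]
print(fir_transposed(signal, coeffs))
print(fir_transposed(signal, coeffs) == fir_direct(signal, coeffs))  # True
```

The output appears in the same cycle as the input regardless of filter length, which reflects the fixed, low latency attributed to the Transposed structure above.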

8.18 Xilinx Virtex-II Embedded Multipliers

• Embedded multipliers are arranged in columns between CLBs;

• Multipliers are 18 x 18 bit and are associated with BlockRAM for easy access to data;

• Can be combinatorial or pipelined running at over 300MHz;

• Combining embedded multipliers with LUT implemented accumulators allows MAC engines to be created (e.g. for use in filters);

• Cascade multipliers to implement larger width multiplications.
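A wider multiplication can be built from several narrower ones by splitting each operand and summing shifted partial products. The sketch below illustrates the principle for unsigned 34-bit operands split into 17-bit halves. It assumes the 18 x 18 multiplier is a signed two's complement block, so an unsigned 17-bit value fits in its operand range. The function names are hypothetical.

```python
HALF = 17
MASK = (1 << HALF) - 1


def mult18(a: int, b: int) -> int:
    """Model of one 18 x 18 signed multiplier: rejects out-of-range operands."""
    for operand in (a, b):
        if not -(1 << 17) <= operand < (1 << 17):
            raise ValueError("operand does not fit in 18-bit signed range")
    return a * b


def wide_multiply(a: int, b: int) -> int:
    """Unsigned 34 x 34 multiply built from four 18 x 18 partial products."""
    if not (0 <= a < (1 << 34) and 0 <= b < (1 << 34)):
        raise ValueError("operands must be unsigned 34-bit values")
    a_hi, a_lo = a >> HALF, a & MASK
    b_hi, b_lo = b >> HALF, b & MASK
    return ((mult18(a_hi, b_hi) << (2 * HALF))
            + ((mult18(a_hi, b_lo) + mult18(a_lo, b_hi)) << HALF)
            + mult18(a_lo, b_lo))


a, b = 12_345_678_901, 9_876_543_210
print(wide_multiply(a, b) == a * b)  # True
```

In hardware the four partial products would use four embedded multipliers with the shifted additions in slice logic, or fewer multipliers reused over several clock cycles.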

Notes:

Each embedded multiplier is associated with an adjacent BlockRAM and hence these elements share interconnect. When the multiplier is being used without the associated BlockRAM, the BlockRAM can still be used but with only 18 bits.

Again, multipliers can be implemented in the main fabric as required using purely slice logic or combining BlockRAM and slice implemented multiplier blocks. This may be necessary if no embedded multipliers are available or the design timing requirements are tight.

8.19 Xilinx Virtex-II Routing

• Xilinx Virtex-II series contains a multitude of routing that connects the elements of the device together;

• The configurable routing between CLBs (via the switch matrices) is complemented by dedicated routing for clock signals, carry chains etc.

Notes:

Routing signals around the device is usually left to the tools to implement. There is a massive number of possibilities to implement a design on an FPGA and the software tools may take many hours to actually produce a bitstream for a reasonable design.

The routing possibilities are described as being hierarchical due to the fact that different routing options are available depending on how far a signal has to travel. Clearly, keeping signals to as short routing distances as possible is preferable to ensure high clock rates.

The dedicated clock distribution lines are of special importance because when combined with the DCM (Digital Clock Management) blocks, they allow for high speed clocks to be fed throughout the device with no skew.

Diagram: Virtex-II Pro Platform FPGA Complete Data Sheet, Xilinx, http://direct.xilinx.com/bvdocs/publications/ds083.pdf

8.20 Xilinx Virtex-II I/O

• FPGAs are capable of interfacing with backplanes, buses and other systems at a board/system level;

• A multitude of current and emerging serial/parallel I/O standards are supported;

• In Virtex-II, up to 24 RocketIO Serial Transceiver blocks are available operating at full-duplex speeds of up to 3.125 Gb/s each;

• Also, in Virtex-II, user I/O pins support many single-ended and differential signalling standards up to 840 Mb/s LVDS (Low-Voltage Differential Signalling);

• Virtex-II Pro X family supports up to 20 channels at 10.3125 Gb/s.

Page 42

Notes:

Getting signals into and out of FPGAs requires high speed signals to be routed into and out of the device on some sort of board that houses the overall system and the FPGA(s).

The usual board-level difficulties with signal cross-talk, inductance, resonance etc. still exist but interfacing the FPGA to the board signals is quite achievable given the number of supported I/O standards.

The Virtex-II devices have dedicated RocketIO blocks to deal with high speed I/O requirements and many more general Select I/O pins for other interfaces. The specific formats supported by each are given below:

Supported standards from:
http://www.xilinx.com/products/virtex2pro/rocketio.htm
http://www.xilinx.com/products/virtex2pro/selectioultra.htm
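
To put the transceiver figures quoted on the slide in perspective, the aggregate line rate per direction is simply the channel count multiplied by the per-channel rate. The short sketch below works through that arithmetic. The helper names are hypothetical, and the 8b/10b line coding used for the payload estimate is an assumption made for illustration, since the slide does not state which encoding applies.

```python
def aggregate_gbps(channels, rate_gbps):
    """Total line rate in Gb/s, per direction, across all channels."""
    return channels * rate_gbps


def payload_gbps(line_rate_gbps, payload_bits=8, coded_bits=10):
    """Usable data rate after line coding (8b/10b assumed by default)."""
    return line_rate_gbps * payload_bits / coded_bits


# Figures from the slide: 24 channels at 3.125 Gb/s, 20 channels at 10.3125 Gb/s.
rocketio_total = aggregate_gbps(24, 3.125)
rocketio_x_total = aggregate_gbps(20, 10.3125)

print(f"24 x 3.125 Gb/s   = {rocketio_total} Gb/s per direction")
print(f"20 x 10.3125 Gb/s = {rocketio_x_total} Gb/s per direction")
print(f"payload at 8b/10b (assumed) = {payload_gbps(rocketio_total)} Gb/s")
```

Because the links are full-duplex, these totals are available in each direction at the same time.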

Page 43

8.21

Xilinx ASMBL Architecture

• Advanced Silicon Modular Block - basis of Virtex-4;

• Column based architecture with focused column types;

• Mixing column types in different ratios allows application domains with differing logic resource requirements to be more accurately targeted;

• Individual resource types (e.g. DSP/memory) can be scaled independently of the die size;

• Current FPGA architectures scale resource types primarily only with die size.

Page 44

Notes:

Trivia: ASMBL was renamed to “Advanced Silicon Modular Block” from “Application Specific Modular Block”.

The diagram below further illustrates how logic resources/features can be scaled independently of die size compared to traditional FPGA architectures.

Xilinx see ASMBL as the next stage in programmable logic evolution.

Diagrams: ASMBL Press Kit, Xilinx, http://www.xilinx.com/company/press/kits/asmbl.htm
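
The column-mixing idea can be sketched as a toy model in which a device is described by how many columns of each type it contains. Two devices with the same total number of columns (and so, in this simplified picture, a similar die width) can then offer very different resource mixes. Everything in the sketch is hypothetical: the column counts, the row count and the names `device_resources`, `logic_heavy` and `dsp_heavy` are invented for illustration and do not describe real Virtex-4 parts.

```python
def device_resources(column_mix, rows):
    """Resource count per type for a column-based device.

    column_mix maps a column type to the number of columns of that type;
    each column is assumed to hold `rows` blocks of its type.
    """
    return {kind: columns * rows for kind, columns in column_mix.items()}


# Two hypothetical devices, both 15 columns wide.
logic_heavy = {"logic": 12, "dsp": 1, "memory": 2}
dsp_heavy = {"logic": 8, "dsp": 4, "memory": 3}

for name, mix in (("logic-heavy", logic_heavy), ("dsp-heavy", dsp_heavy)):
    total_columns = sum(mix.values())
    print(name, total_columns, "columns:", device_resources(mix, rows=32))
```

In this model the DSP-heavy device has four times the DSP blocks of the logic-heavy one without any increase in the total column count, which is the sense in which resource types scale independently of die size.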

Page 45

8.22

Xilinx Virtex-4 Platforms

• Designers can select the most appropriate device according to feature requirements and cost;

• DSP is now a major focus industry-wide!

Page 46

Notes:
Page 47

8.23

Conclusion

• This module has presented an overview of FPGA technology to give a high-level understanding of:

  • What features cutting-edge FPGAs contain and the general trend towards larger, faster devices with more features to support entire systems being implemented on FPGAs (e.g. I/O Transceivers, DSP blocks);

  • Why FPGAs provide performance and flexibility advantages over DSP Processors and ASICs due to infinite reconfigurability, trading area for speed and performing operations in parallel as required;

  • Why FPGA performance is difficult to measure due to their inherent flexibility;

  • How the FPGA structure is generally organised hierarchically into CLBs/LABs, slices/LEs and elements such as LUTs/RAMs/SRL16s, MUXes and flip-flops, and how these elements are used/combined to implement a design;

  • The memory available on FPGAs;

  • Dedicated arithmetic hardware and the various configurations available;

  • The hierarchical routing lines that connect blocks together across the device and provide clock routing;

  • The complexity of the FPGA design flow and the number of software tools and processes that can be involved;

  • The various I/O standards available to allow FPGAs to interface with high-speed signals via board signals/buses/backplanes etc.;

  • Why flip-flops are “free” (they exist beside the LUTs anyway) and how they allow high clock rates.

Page 48

Notes: