FPGA / SoC Technology - Today and in the Future (FPGA / SOC teknologi - i dag og i fremtiden)

An Introduction to Xilinx All Programmable Solutions. FPGA Seminar, NOVI, Ålborg, May 31st, 2017

Transcript of FPGA / SOC teknologi - i dag og i fremtiden

Page 1: FPGA / SOC teknologi - i dag og i fremtiden

An Introduction to Xilinx All Programmable Solutions

FPGA Seminar

NOVI – Ålborg

May 31st, 2017

Page 2: FPGA / SOC teknologi - i dag og i fremtiden

© Copyright 2016 Xilinx.


Contact details:

Page 3: FPGA / SOC teknologi - i dag og i fremtiden


Agenda

Update on Xilinx FPGA / SOC solutions

Roadmap: Where is FPGA / SoC technology taking us, and what does the future hold?

Development tools for FPGA / SoC: now and in the future

Xilinx ReVision


Page 4: FPGA / SOC teknologi - i dag og i fremtiden


An Expanding All Programmable Portfolio

Page 5: FPGA / SOC teknologi - i dag og i fremtiden


Industry View of 20nm Technology Cost


*Source: Nvidia, 2013 International Trade Partner Conference

Page 6: FPGA / SOC teknologi - i dag og i fremtiden


Mid-Range Kintex® Portfolio for Price-Performance-per-Watt

Performance

1.7X Performance/Watt

Most cost-effective

Mainstream protocols

Highest DSP bandwidth

16G backplane support

The only FinFET mid-range FPGA

High-end features in the mid-range

2.4X Performance/Watt

1X

Page 7: FPGA / SOC teknologi - i dag og i fremtiden


Kintex® Portfolio: Expanding Mid-Range Capabilities

Maximum Values                            Kintex-7    Kintex UltraScale™   Kintex UltraScale+™
Logic Cells / System Logic Cells(1) (K)   478         1,451                1,143
Block RAM (Mb)                            34          76                   34.6
UltraRAM (Mb)                             -           -                    36
DSP Slices                                1,920       5,520                3,528
Peak DSP Performance (GMACs)              2,845       8,180                6,287
Transceiver Count                         32          64                   76
Peak Transceiver Line Rate (Gb/s)         12.5        16.3                 32.75
Peak Transceiver Bandwidth (Gb/s)         800         2,086                3,268
Integrated PCI Express®                   Gen2 x8     Gen3 x8              Gen3 x16, Gen4 x8
Memory Interface Performance (Mb/s)       DDR3-1866   DDR4-2400            DDR4-2666
I/O Pins                                  500         832                  668

1: UltraScale™ & UltraScale+™ devices are measured in System Logic Cells
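The peak transceiver bandwidth figures in the first two columns appear to follow from the transceiver count and peak line rate rows. The sketch below reproduces them under the assumption that bandwidth is counted in both directions (transmit plus receive); that factor of two is inferred from the numbers and is not stated on the slide. The function name is hypothetical. The third column is left out because a single count and line rate does not reproduce its 3,268 figure.

```python
def peak_transceiver_bandwidth_gbps(count, line_rate_gbps, directions=2):
    """Aggregate bandwidth in Gb/s: transceivers x line rate x directions.

    directions=2 (transmit plus receive) is an assumption inferred from the
    table values, not something the slide states.
    """
    return count * line_rate_gbps * directions


# First two columns of the table: (transceiver count, peak line rate in Gb/s).
first_column = peak_transceiver_bandwidth_gbps(32, 12.5)
second_column = peak_transceiver_bandwidth_gbps(64, 16.3)

print(first_column)   # the table lists 800 for this column
print(second_column)  # the table lists 2,086 for this column
```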

Page 8: FPGA / SOC teknologi - i dag og i fremtiden


Cost Optimized Solutions

Page 9: FPGA / SOC teknologi - i dag og i fremtiden


Introducing the new Cost-Optimized Portfolio

I/O Optimized: Spartan®-6, Spartan-7

• Spartan-6: Win 10 ISE® Tool Support

• Spartan-7: 2.5X Performance/Watt

• Spartan-7: Industry Leading Vivado Tool Support

Transceiver Optimized: Artix®-7

• Smaller Densities

System Optimized: Zynq®-7000

• Better processor scalability with single-core ARM Cortex-A9

Page 10: FPGA / SOC teknologi - i dag og i fremtiden


Continuing the Spartan Heritage

[Timeline figure, 1998 to 2016: Spartan, Spartan-XL, Spartan-II/IIE, Spartan-3, Spartan-3L, Spartan-3E, Spartan-3A, Spartan-3A DSP, Spartan-3AN, Spartan-6, Spartan-7. Process nodes shown: 0.5um, 90nm, 45nm, 28nm.]

Nearly two decades, and three quarters of a billion devices shipped

Page 11: FPGA / SOC teknologi - i dag og i fremtiden


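5 New Devices and One New Family: The Broadest Cost-Optimized All Programmable Portfolio

Devices in each family, arranged along an axis from Value to Mid-Range:

Spartan-6: LX4, LX9, LX16, LX25, LX45, LX75, LX100, LX150

Artix-7: A12T, A15T, A25T, A35T, A50T, A75T, A100T, A200T

Zynq-7000: Z-7007S, Z-7012S, Z-7014S, Z-7010, Z-7015, Z-7020

Spartan-7: S6, S15, S25, S50, S75, S100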

Page 12: FPGA / SOC teknologi - i dag og i fremtiden


Spartan-7 FPGA Overview

Industry’s best performance-per-watt for cost-sensitive applications

Security: Encryption, authentication; AES256 CBC & SHA-256

XADC & SYSMON: 1MSPS ADC; thermal monitoring

Small Package Form Factor: Only 28nm device in an 8x8mm package

High-Range I/O: Low cost interfacing; up to 1.25G LVDS; 3.3V

DDR3-800: Up to 800Mb/s; flexible soft controller

DSP: Wider 25x18 multiplier; 160 slices, 176GMACs

Block RAM: 36K/18K blocks; up to 4.2Mb total

2.5X Perf/Watt: 50% lower power & 30% faster than Spartan-6

Page 13: FPGA / SOC teknologi - i dag og i fremtiden


Spartan-7 FPGAs

Spartan®-7 FPGAs: I/O Optimization at the Lowest Cost and Highest Performance-per-Watt

Part Number XC7S6 XC7S15 XC7S25 XC7S50 XC7S75 XC7S100
Logic Cells 6,000 12,800 23,360 52,160 76,800 102,400
Slices 938 2,000 3,650 8,150 12,000 16,000
CLB Flip-Flops 7,500 16,000 29,200 65,200 96,000 128,000
Max. Distributed RAM (Kb) 70 150 313 600 832 1,100
Block RAM/FIFO w/ ECC (36 Kb each) 5 10 45 75 90 120
Total Block RAM (Kb) 180 360 1,620 2,700 3,240 4,320
Clock Mgmt Tiles (1 MMCM + 1 PLL) 2 2 3 5 8 8
Max. Single-Ended I/O Pins 100 100 150 250 400 400
Max. Differential I/O Pairs 48 48 72 120 192 192
DSP Slices 10 20 80 120 140 160
Analog Mixed Signal (AMS) / XADC 0 0 1 1 1 1
Configuration AES / HMAC Blocks 0 0 1 1 1 1
Commercial Speed Grade -1,-2 -1,-2 -1,-2 -1,-2 -1,-2 -1,-2
Industrial Speed Grade -1,-2,-1L -1,-2,-1L -1,-2,-1L -1,-2,-1L -1,-2,-1L -1,-2,-1L

Package(1); Body Area (mm); Available User I/O: 3.3V SelectIO™ HR I/O (values appear only for the devices offered in each package)
CPGA196 8x8 100 100
CSGA225 13x13 100 100 150
CSGA324 15x15 150 210
FTGB196 15x15 100 100 100 100
FGGA484 23x23 250 338 338
FGGA676 27x27 400 400

Notes:

1. Packages with the same last letter and number sequence, e.g., A484, are footprint compatible with all other Spartan-7 devices with the same sequence. The footprint compatible devices within this family are outlined.
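The Total Block RAM row can be cross-checked against the block count row, since each block is listed as 36 Kb. A minimal sketch using only values from the table (the dictionary and variable names are illustrative):

```python
BLOCK_SIZE_KB = 36  # "Block RAM/FIFO w/ ECC (36 Kb each)" from the table

# Number of 36 Kb blocks per device, copied from the table.
blocks_per_device = {
    "XC7S6": 5,
    "XC7S15": 10,
    "XC7S25": 45,
    "XC7S50": 75,
    "XC7S75": 90,
    "XC7S100": 120,
}

# Total block RAM in Kb is simply blocks x 36.
total_block_ram_kb = {
    part: count * BLOCK_SIZE_KB for part, count in blocks_per_device.items()
}

for part, total in total_block_ram_kb.items():
    print(part, total)
```

The results match the Total Block RAM (Kb) row: 180, 360, 1,620, 2,700, 3,240 and 4,320.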

Page 14: FPGA / SOC teknologi - i dag og i fremtiden


Artix®-7 FPGA Overview

The industry’s cost-optimized performance leader

Security: Encryption & authentication; AES256 CBC & SHA-256

XADC & SYSMON: 1Msps ADC reduces BOM cost; complies with reliability standards

Small Package Form Factor: Smallest for 35K-215K LCs; meets stringent SWAP-C

High-range I/O: Low cost interfacing; up to 300Gb/s LVDS bandwidth

6.6Gb/s GTP: Up to 211Gb/s bandwidth

DDR3-1066: Low-cost DRAM; up to 1,066Mb/s; flexible soft controller

DSP: Wider 25x18 multiplier; up to 740 slices and 931GMACs @ 629MHz

Block RAM: 36K/18K blocks; up to 12.8Mb total
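The DSP figure on this slide (740 slices, 931 GMACs at 629 MHz) can be reproduced with a short calculation. The sketch below assumes each slice is credited with two multiply-accumulates per clock cycle; that factor is inferred from the quoted numbers and is not stated on the slide, and the function name is hypothetical.

```python
def peak_gmacs(dsp_slices, fmax_mhz, macs_per_slice_per_cycle=2):
    """Peak DSP throughput in GMACs (10^9 multiply-accumulates per second).

    macs_per_slice_per_cycle=2 is an assumption chosen so that the result
    matches the slide; the slide itself does not give this factor.
    """
    return dsp_slices * fmax_mhz * macs_per_slice_per_cycle / 1000.0


artix7_peak = peak_gmacs(dsp_slices=740, fmax_mhz=629)
print(round(artix7_peak))  # the slide quotes 931 GMACs
```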

Page 15: FPGA / SOC teknologi - i dag og i fremtiden


Artix-7 FPGAs

Artix®-7 FPGAs: Transceiver Optimization at the Lowest Cost and Highest DSP Bandwidth (1.0V, 0.95V, 0.9V)

Notes:

1. Supports PCI Express Base 2.1 specification at Gen1 and Gen2 data rates.

2. Represents the maximum number of transceivers available. Note that the majority of devices are available without transceivers. See the Package section of this table for details.

3. Leaded package option available for all packages. See DS180, 7 Series FPGAs Overview for details.

4. Device migration is available within the Artix-7 family for like packages but is not supported between other 7 series families.

Part Number XC7A12T XC7A15T XC7A25T XC7A35T XC7A50T XC7A75T XC7A100T XC7A200T

Logic Resources

Logic Cells 12,800 16,640 23,360 33,280 52,160 75,520 101,440 215,360

Slices 2,000 2,600 3,650 5,200 8,150 11,800 15,850 33,650

CLB Flip-Flops 16,000 20,800 29,200 41,600 65,200 94,400 126,800 269,200

Memory Resources

Maximum Distributed RAM (Kb) 171 200 313 400 600 892 1,188 2,888

Block RAM/FIFO w/ ECC (36 Kb each) 20 25 45 50 75 105 135 365

Total Block RAM (Kb) 720 900 1,620 1,800 2,700 3,780 4,860 13,140

Clock Resources

CMTs (1 MMCM + 1 PLL) 3 5 3 5 5 6 6 10

I/O Resources

Maximum Single-Ended I/O 150 250 150 250 250 300 300 500

Maximum Differential I/O Pairs 72 120 72 120 120 144 144 240

Embedded Hard IP Resources

DSP Slices 40 45 80 90 120 180 240 740

PCIe® Gen2(1) 1 1 1 1 1 1 1 1

Analog Mixed Signal (AMS) / XADC 1 1 1 1 1 1 1 1

Configuration AES / HMAC Blocks 1 1 1 1 1 1 1 1

GTP Transceivers (6.6 Gb/s Max Rate)(2) 2 4 4 4 4 8 8 16

Speed Grades

Commercial -1, -2 -1, -2 -1, -2 -1, -2 -1, -2 -1, -2 -1, -2 -1, -2

Extended -2L, -3 -2L, -3 -2L, -3 -2L, -3 -2L, -3 -2L, -3 -2L, -3 -2L, -3

Industrial -1, -2, -1L -1, -2, -1L -1, -2, -1L -1, -2, -1L -1, -2, -1L -1, -2, -1L -1, -2, -1L -1, -2, -1L

Package(3),(4); Dimensions (mm); Ball Pitch (mm); Available User I/O: 3.3V SelectIO™ HR I/O (GTP Transceivers)

CPG236 10 x 10 0.5 106 (2) 106 (2) 106 (4) 106 (2) 106 (2)

CSG324 15 x 15 0.8 210 (0) 210 (0) 210 (0) 210 (0) 210 (0)

CSG325 15 x 15 0.8 150 (2) 150 (4) 150 (4) 150 (4) 150 (4)

FTG256 17 x 17 1.0 170 (0) 170 (0) 170 (0) 170 (0) 170 (0)

SBG484 / SBV484 19 x 19 0.8 285 (4)

Footprint Compatible

FGG484 23 x 23 1.0 250 (4) 250 (4) 250 (4) 285 (4) 285 (4)

FBG484 / FBV484 23 x 23 1.0 285 (4)

Footprint Compatible

FGG676 27 x 27 1.0 300 (8) 300 (8)

FBG676 / FBV676 27 x 27 1.0 400 (8)

FFG1156 / FFV1156 35 x 35 1.0 500 (16)
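The Logic Cells and CLB Flip-Flops rows in this table scale directly with the Slices row. The sketch below uses ratios inferred from the table itself (6.4 logic cells and 8 flip-flops per slice); these constants are an observation about the listed numbers, and the names are illustrative.

```python
LOGIC_CELLS_PER_SLICE = 6.4  # inferred from the table: 12,800 / 2,000
FLIP_FLOPS_PER_SLICE = 8     # inferred from the table: 16,000 / 2,000


def logic_cells(slices):
    """Logic cell count implied by a slice count, rounded to an integer."""
    return round(slices * LOGIC_CELLS_PER_SLICE)


def flip_flops(slices):
    """CLB flip-flop count implied by a slice count."""
    return slices * FLIP_FLOPS_PER_SLICE


# Slice counts copied from the table for three of the eight devices.
for part, slices in {"XC7A12T": 2000, "XC7A35T": 5200, "XC7A200T": 33650}.items():
    print(part, logic_cells(slices), flip_flops(slices))
```

For these three devices the results agree with the table rows (12,800 / 16,000, 33,280 / 41,600 and 215,360 / 269,200).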

Page 16: FPGA / SOC teknologi - i dag og i fremtiden


Migrating from Spartan-6: Spartan-7 or Artix-7?

Vivado support enables customers to build scalable cost optimized platforms

For designs requiring…

• Logic + GTs: Spartan-6 LXT

• Logic Only: Spartan-6 LX

Page 17: FPGA / SOC teknologi - i dag og i fremtiden


Xilinx Processing Heritage

10+ years, 4 Generations

[Timeline figure, performance versus year (2001, 2003, 2005, 2007, 2012):]

• 130nm: 405 Core, 300+ MHz, 450+ DMIPS

• 90nm: Dual 405 Cores, 450+ MHz, 700+ DMIPS

• 65nm: Dual 440 Cores, 550+ MHz, 1100+ DMIPS

• 28nm: Dual Cortex-A9 MPCore, 1 GHz, 5000 DMIPS

Page 18: FPGA / SOC teknologi - i dag og i fremtiden

Introducing Zynq-7000S Devices

Single-Core ARM® Devices Enhance Scalable Processing Portfolio

Lower Cost Entry Points Enhance Scalable Processing Portfolio

• Introducing single ARM Cortex™-A9 devices built on the proven Zynq-7000 architecture

• Offering the highest integration at the lowest cost within the Cost-Optimized Portfolio

• New devices fortify processor scalability from the entry-level to the high-end for embedded designs

Page 19: FPGA / SOC teknologi - i dag og i fremtiden


Zynq-7000S Offers Scalability in Motor Control

Zynq-7000S: Maximum Capabilities

• 2 Full Drives

• Fieldbus Protocols via PL

  • Profibus

  • CANopen

  • Others

Zynq-7000: Maximum Capabilities

• 4 Full Drives

• 2nd Cortex-A9 enables AnyBus IP

  • EtherCAT

  • Profinet

  • Powerlink

  • EtherNet/IP

  • Modbus

[Block diagrams]

Z-7014S: Processing System (ARM Cortex-A9); Programmable Logic (Motor Control Computations)

Z-7020: Processing System (ARM Cortex-A9, ARM Cortex-A9); Programmable Logic (Motor Control Computations)

Diagram labels: Fieldbus IP, AnyBus IP

Page 20: FPGA / SOC teknologi - i dag og i fremtiden


Introducing Zynq-7000S Devices

Application Processors (A9)

• Zynq-7000S: Single-Core, up to 766MHz

• Zynq-7000: Dual-Core, up to 1GHz

Integrated Memory Mapped Peripherals

• e.g. USB2.0, GigE

Integrated Analog

• Dual multi-channel 12-bit ADC

• Up to 1Msps

• Temp & Voltage sensors

Programmable Logic

• Zynq-7000S: Artix-7 Series FPGA, 23K-65K Logic Cells

• Zynq-7000: 7 Series FPGA, 28K-440K Logic Cells

Extensive IP Portfolio

• Standardized AXI4 interfaces

• Enables peripheral expansion

• Includes software drivers

Tightly Coupled Domains

• 3000+ PS/PL interconnects

• Low Latency

• Up to 100Gb of bandwidth

High Bandwidth Memory

• L1/L2 CPU Caches

• Dedicated On-Chip Memory (OCM)

• DDR3, DDR2, LPDDR2 w/ ECC

Page 21: FPGA / SOC teknologi - i dag og i fremtiden

Extending Scalability Across the Zynq® Portfolio

Portfolio tiers (figure axis): Cost-Optimized, Mid-Range, High-End

Device configurations shown:

• Single-Core ARM® Cortex™-A9; 28nm Artix®-7 FPGA

• Dual-core ARM Cortex-A9; 28nm Artix-7 FPGA

• Dual-core ARM Cortex-A9; 28nm Kintex®-7 FPGA

• Dual-Core ARM Cortex-R5; Dual-Core ARM Cortex-A53; 16nm FinFET+ Logic

• Dual-Core ARM Cortex-R5; Quad-Core ARM Cortex-A53; ARM Mali™-400 MP2; 16nm FinFET+ Logic

• Dual-Core ARM Cortex-R5; Quad-Core ARM Cortex-A53; ARM Mali-400 MP2; H.264/H.265 Video Codec; 16nm FinFET+ Logic

Page 22: FPGA / SOC teknologi - i dag og i fremtiden

Zynq®-7000 AP SoC Family

Cost-Optimized Devices: Z-7007S, Z-7012S, Z-7014S, Z-7010, Z-7015, Z-7020
Mid-Range Devices: Z-7030, Z-7035, Z-7045, Z-7100

Device Name Z-7007S Z-7012S Z-7014S Z-7010 Z-7015 Z-7020 Z-7030 Z-7035 Z-7045 Z-7100
Part Number XC7Z007S XC7Z012S XC7Z014S XC7Z010 XC7Z015 XC7Z020 XC7Z030 XC7Z035 XC7Z045 XC7Z100

Processing System (PS)

Processor Core:
• Z-7007S, Z-7012S, Z-7014S: Single-Core ARM® Cortex™-A9 MPCore™, up to 766MHz
• Z-7010, Z-7015, Z-7020: Dual-Core ARM Cortex-A9 MPCore, up to 866MHz
• Z-7030, Z-7035, Z-7045, Z-7100: Dual-Core ARM Cortex-A9 MPCore, up to 1GHz(1)
Processor Extensions: NEON™ SIMD Engine and Single/Double Precision Floating Point Unit per processor
L1 Cache: 32KB Instruction, 32KB Data per processor
L2 Cache: 512KB
On-Chip Memory: 256KB
External Memory Support(2): DDR3, DDR3L, DDR2, LPDDR2
External Static Memory Support(2): 2x Quad-SPI, NAND, NOR
DMA Channels: 8 (4 dedicated to PL)
Peripherals: 2x UART, 2x CAN 2.0B, 2x I2C, 2x SPI, 4x 32b GPIO
Peripherals w/ built-in DMA(2): 2x USB 2.0 (OTG), 2x Tri-mode Gigabit Ethernet, 2x SD/SDIO
Security(3): RSA Authentication of First Stage Boot Loader, AES and SHA 256b Decryption and Authentication for Secure Boot
Processing System to Programmable Logic Interface Ports (Primary Interfaces & Interrupts Only): 2x AXI 32b Master, 2x AXI 32b Slave; 4x AXI 64b/32b Memory; AXI 64b ACP; 16 Interrupts

Programmable Logic (PL)

7 Series PL Equivalent Artix®-7 Artix-7 Artix-7 Artix-7 Artix-7 Artix-7 Kintex®-7 Kintex-7 Kintex-7 Kintex-7
Logic Cells 23K 55K 65K 28K 74K 85K 125K 275K 350K 444K
Look-Up Tables (LUTs) 14,400 34,400 40,600 17,600 46,200 53,200 78,600 171,900 218,600 277,400
Flip-Flops 28,800 68,800 81,200 35,200 92,400 106,400 157,200 343,800 437,200 554,800
Total Block RAM (# 36Kb Blocks) 1.8Mb (50) 2.5Mb (72) 3.8Mb (107) 2.1Mb (60) 3.3Mb (95) 4.9Mb (140) 9.3Mb (265) 17.6Mb (500) 19.1Mb (545) 26.5Mb (755)
DSP Slices 60 120 170 80 160 220 400 900 900 2,020
PCI Express® - Gen2 x4 - - Gen2 x4 - Gen2 x4 Gen2 x8 Gen2 x8 Gen2 x8
Analog Mixed Signal (AMS) / XADC(2): 2x 12 bit, MSPS ADCs with up to 17 Differential Inputs
Security(3): AES & SHA 256b Decryption & Authentication for Secure Programmable Logic Config

Speed Grades (one entry per device group)

Commercial: -1 | -1 | -1 | -1
Extended: -2 | -2,-3 | -2,-3 | -2
Industrial: -1, -2 | -1, -2, -1L | -1, -2, -2L | -1, -2, -2L

Notes:

1. 1 GHz processor frequency is available only for -3 speed grades for devices in flip-chip packages. Please see the data sheet for more details.

2. Z-7007S and Z-7010 in CLG225 have restrictions on PS peripherals, memory interfaces, and I/Os. Please refer to the Technical Reference Manual for more details.

3. Security block is shared by the Processing System and the Programmable Logic.

Page 23: FPGA / SOC teknologi - i dag og i fremtiden


Zynq®-7000 All Programmable SoC Family: HR I/O, HP I/O, PS I/O, and Transceivers (GTP or GTX)

Cost-Optimized Devices: Z-7007S, Z-7012S, Z-7014S, Z-7010, Z-7015, Z-7020
Mid-Range Devices: Z-7030, Z-7035, Z-7045, Z-7100

Each entry reads "HR I/O, HP I/O / PS I/O(2), transceivers" (GTP transceivers for cost-optimized devices, GTX transceivers for mid-range devices). Entries are listed in device order, and only for the devices offered in that package.

Package Footprint; Dimensions (mm)(1); entries

CLG225 13x13: 54, 0 / 84(3), 0; 54, 0 / 84(3), 0
CLG400 17x17: 100, 0 / 128, 0; 125, 0 / 128, 0; 100, 0 / 128, 0; 125, 0 / 128, 0
CLG484 19x19: 200, 0 / 128, 0; 200, 0 / 128, 0
CLG485(4) 19x19: 150, 0 / 128, 4; 150, 0 / 128, 4
SBG485 / SBV485(4) 19x19: 50, 100 / 128, 4
FBG484 / FBV484 23x23: 100, 63 / 128, 4
FBG676 / FBV676(1) 27x27: 100, 150 / 128, 4; 100, 150 / 128, 8; 100, 150 / 128, 8
FFG676 / FFV676(1) 27x27: 100, 150 / 128, 4; 100, 150 / 128, 8; 100, 150 / 128, 8
FFG900 / FFV900 31x31: 212, 150 / 128, 16; 212, 150 / 128, 16; 212, 150 / 128, 16
FFG1156 / FFV1156 35x35: 250, 150 / 128, 16

Notes:

1. Devices in the same package are footprint compatible. FBG676 / FBV676 and FFG676 / FFV676 are also footprint compatible.

2. PS I/O count does not include dedicated DDR calibration pins.

3. PS DDR and PS MIO pin count is limited by package size. See DS190, Zynq-7000 All Programmable SoC Overview for details.

4. CLG485 and SBG485 / SBV485 are pin-to-pin compatible. See product data sheets and user guides for more details.

See DS190, Zynq-7000 All Programmable SoC Overview for package details.

Page 24: FPGA / SOC teknologi - i dag og i fremtiden


New Low-Cost Kits for Cost-Optimized Devices

• Zynq-7000S: Attack ASSPs needing companion FPGAs

  Avnet MiniZed Z007S Kit in June 2017

• Spartan 7: First Production 7S50 Silicon in June

  S7 ARTY 7S50 Kit in July 2017

  S7 ARTY 7S25 Kit in Dec 2017

• Artix-7: Enable new 7A25T & 7A12T design starts now!

  ARTY 7A35T Kit Available Now

$89

Page 25: FPGA / SOC teknologi - i dag og i fremtiden


Cost-Optimized Portfolio Supported with Free Vivado WebPACK™

Family Devices

ALL

ALL

ALL Zynq®-7000S + Zynq-7000 up to Z-7030

Drag and drop hundreds of Xilinx & partner 7 series IP blocks

– Includes MicroBlaze™ soft processor and AXI block-level interconnect

Industry’s only no-cost, mixed-language simulator with no code line limits

Best-in-class quality-of-results

Page 26: FPGA / SOC teknologi - i dag og i fremtiden

Portfolio at a Glance

Family: Spartan®-6 (FPGA) | Spartan-7 (FPGA) | Artix®-7 (FPGA) | Zynq®-7000 (SoC)(1)

Process Node: 45nm | 28nm | 28nm | 28nm

Processor: MicroBlaze™ Soft Processor | MicroBlaze Soft Processor | MicroBlaze Soft Processor | Single- or Dual-Core ARM® Cortex™-A9

Logic Density Range (Logic Cells): 4K → 150K | 6K → 102K | 12K → 200K | 28K → 85K

Max Memory Interface (Mb/s): DDR3-800 | DDR3-800 | DDR3-1066 | DDR3-1066

LVDS I/O Performance: 1.08Gb/s | 1.25Gb/s | 1.25Gb/s | 1.25Gb/s

Transceiver Max Gb/s: 3.2Gb/s | N/A | 6.6Gb/s | 6.25Gb/s

1: Cost-optimized devices based on Artix-7 programmable logic

Page 27: FPGA / SOC teknologi - i dag og i fremtiden

20nm UltraScale Update

Page 28: FPGA / SOC teknologi - i dag og i fremtiden


Block-Level Innovations Optimize Critical Paths for Massive Bandwidth and Processing

DSP (27x18): Wider multipliers, fewer blocks per function

DDR4 Memory I/O: 30% higher data rates; 20% lower power

Block RAM: Hardened data cascading; improved power, performance

Transceivers: 12.5G low speed grade; 16G & 28G backplane; 33G chip-to-chip

Integrated IP: 100G Ethernet MAC; 150G Interlaken; PCI Express Gen3

SSI Technology: Virtual monolithic die

Security: AES-GCM mode, greater key protection, more authentication schemes

Co-Optimized

Page 29: FPGA / SOC teknologi - i dag og i fremtiden

UltraScale Architecture Re-Designs the Core

[Figure: effect of routing resources and analytical placement. Logic cells scale as O(N²) while interconnect tracks scale as O(N), shown across the 40nm, 28nm and 20nm nodes. Diagram labels: Clock Domain 1, Clock Domain 2, Clock Domain 3, wire length, partially used CLB.]

Page 30: FPGA / SOC teknologi - i dag og i fremtiden


Integrated 100G Ethernet MAC, 150G Interlaken

Configuration Options (Hard IP: Lanes x Line Rate)

150G Interlaken: up to 12 x 12.5 Gb/s, or up to 6 x 25 Gb/s

100GE MAC: 10 x 10 Gb/s, or 4 x 25 Gb/s

Resource Savings

Interlaken (12 lane, 10G): 7-Series Soft IP vs. UltraScale Hard IP
LUTs 32,700 0
Fabric Flip Flops 46,200 1,536
BRAM 16 0
Transceivers 12 12

Ethernet MAC + PCS (10x10G): 7-Series Soft IP vs. UltraScale Hard IP
LUTs 70,000 0
Fabric Flip Flops 65,000 1,280
BRAM 41 0
Transceivers 10 10

[Chart: 7-Series Soft IP vs. UltraScale Integrated IP for Interlaken (12 lane, 10G) and Ethernet MAC + PCS (10x10G); savings shown: 80%, 90%]

Feature / Benefit

Large Scale Integration:

• More headroom for power budget

• Lower latency and higher performance

• Frees up logic for additional functionality, e.g., packet processing

• Simplified flow and easier routing for shorter run-times

• No licensing requirements

Multiple configuration options: Flexibility to meet existing and future design requirements
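The "lanes x line rate" configuration options on this slide each add up to the nominal rate of the hard IP block. A minimal sketch of that arithmetic, using only the lane counts and line rates quoted on the slide (the dictionary layout and names are illustrative):

```python
# (lanes, line rate in Gb/s) options quoted on the slide for each hard IP block.
configuration_options = {
    "150G Interlaken": [(12, 12.5), (6, 25)],
    "100GE MAC": [(10, 10), (4, 25)],
}

# Aggregate raw rate of each option is lanes x line rate.
aggregate_gbps = {
    name: [lanes * rate for lanes, rate in options]
    for name, options in configuration_options.items()
}

for name, rates in aggregate_gbps.items():
    print(name, rates)
```

Both Interlaken options come to 150 Gb/s and both Ethernet options to 100 Gb/s, so the choice between them is about lane count and per-lane speed, not total throughput.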

Page 31: FPGA / SOC teknologi - i dag og i fremtiden


2nd Generation 3D IC Infrastructure Enables Virtual Monolithic Design

Feature / Benefit

~20,000 registered routing lines between die:

• Enables >500 MHz datapath performance between SLRs

• Deterministic, predictable timing

Clocking architecture spans SLR boundaries: Abundant clock resources to meet demanding applications

Footprint compatibility between SSI and non-SSI devices: Ability to seamlessly migrate from monolithic to 3D-IC devices

[Figure: SLR0, SLR1 and SLR2 on a passive interposer, mounted on the substrate]

Page 32: FPGA / SOC teknologi - i dag og i fremtiden


UltraScale Demos – Delivering What We Promised

High Performance Proven in System Applications

Page 33: FPGA / SOC teknologi - i dag og i fremtiden


Kintex® UltraScale™ FPGAs

Device Name KU025(1) KU035 KU040 KU060 KU085 KU095 KU115

Logic Resources

System Logic Cells (K) 318 444 530 726 1,088 1,176 1,451

CLB Flip-Flops 290,880 406,256 484,800 663,360 995,040 1,075,200 1,326,720

CLB LUTs 145,440 203,128 242,400 331,680 497,520 537,600 663,360

Memory Resources

Maximum Distributed RAM (Kb) 4,230 5,908 7,050 9,180 13,770 4,800 18,360

Block RAM/FIFO w/ECC (36Kb each) 360 540 600 1,080 1,620 1,680 2,160

Block RAM/FIFO (18Kb each) 720 1,080 1,200 2,160 3,240 3,360 4,320

Total Block RAM (Mb) 12.7 19.0 21.1 38.0 56.9 59.1 75.9

Clock Resources

CMT (1 MMCM, 2 PLLs) 6 10 10 12 22 16 24

I/O DLL 24 40 40 48 56 64 64

I/O Resources

Maximum Single-Ended HP I/Os 208 416 416 520 572 650 676

Maximum Differential HP I/O Pairs 96 192 192 240 264 288 312

Maximum Single-Ended HR I/Os 104 104 104 104 104 52 156

Maximum Differential HR I/O Pairs 48 48 48 48 56 24 72

Integrated IP Resources

DSP Slices 1,152 1,700 1,920 2,760 4,100 768 5,520

System Monitor 1 1 1 1 2 1 2

PCIe® Gen1/2/3 1 2 3 3 4 4 6

Interlaken 0 0 0 0 0 2 0

100G Ethernet 0 0 0 0 0 2 0

16.3Gb/s Transceivers (GTH/GTY) 12 16 20 32 56 64 64

Speed Grades

Commercial -1 -1 -1 -1 -1 -1 -1

Extended -2 -2 -3 -2 -3 -2 -3 -2 -3 -2 -2 -3

Industrial -1 -2 -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -2 -1 -1L -2

Package Footprint(2, 3, 4); Package Dimensions (mm); HR I/O, HP I/O, GTH/GTY

A784 23x23(5) 104, 364, 8 104, 364, 8

A676 27x27 104, 208, 16 104, 208, 16

A900 31x31 104, 364, 16 104, 364, 16

A1156 35x35 104, 208, 12 104, 416, 16 104, 416, 20 104, 416, 28 52, 468, 28

A1517 40x40 104, 520, 32 104, 520, 48 104, 520, 48

Footprint Compatible with Virtex® UltraScale Devices

C1517 40x40 52, 468, 40

D1517 40x40 104, 234, 64

B1760 42.5x42.5 104, 572, 44 52, 650, 48 104, 598, 52

A2104 47.5x47.5 156, 676, 52

B2104 47.5x47.5 52, 650, 64 104, 598, 64

D1924 45x45 156, 676, 52

F1924 45x45 104, 520, 56 104, 624, 64

Notes:

1. Certain advanced configuration features are not supported in the KU025. Refer to the Configuring FPGAs section in DS890, UltraScale Architecture and Product Overview.

2. Maximum achievable performance is device and package dependent; consult the associated data sheet for details.

3. For full part number details, see the Ordering Information section in DS890, UltraScale Architecture and Product Overview.

4. See UG575, UltraScale Architecture Packaging and Pinouts User Guide for more information.

5. 0.8mm ball pitch. All other packages listed 1mm ball pitch.

Disclaimer: This document contains preliminary information and is subject to change without notice. Information provided herein relates to products and/or services not yet available for sale, and provided solely for information purposes and are not intended, or to be construed, as an offer for sale or an attempted commercialization of the products and/or services referred to herein. Please contact your Xilinx representative for the latest information.

Page 34: FPGA / SOC teknologi - i dag og i fremtiden


Virtex® UltraScale™ FPGAs

Device Name XCVU065 XCVU080 XCVU095 XCVU125 XCVU160 XCVU190 XCVU440

Logic Resources

System Logic Cells (K) 783 975 1,176 1,567 2,027 2,350 5,541

CLB Flip-Flops 716,160 891,424 1,075,200 1,432,320 1,852,800 2,148,480 5,065,920

CLB LUTs 358,080 445,712 537,600 716,160 926,400 1,074,240 2,532,960

Memory Resources

Maximum Distributed RAM (Kb) 4,830 3,980 4,800 9,660 12,690 14,490 28,710

Block RAM/FIFO w/ECC (36Kb each) 1,260 1,421 1,728 2,520 3,276 3,780 2,520

Block RAM/FIFO (18Kb each) 2,520 2,842 3,456 5,040 6,552 7,560 5,040

Total Block RAM (Mb) 44.3 50.0 60.8 88.6 115.2 132.9 88.6

Clock Resources

CMT (1 MMCM, 2 PLLs) 10 16 16 20 28 30 30

I/O DLL 40 64 64 80 120 120 120

Transceiver Fractional PLL 5 8 8 10 13 15 0

I/O Resources

Maximum Single-Ended HP I/Os 468 780 780 780 650 650 1,404

Maximum Differential HP I/O Pairs 216 360 360 360 300 300 648

Maximum Single-Ended HR I/Os 52 52 52 52 52 52 52

Maximum Differential HR I/O Pairs 24 24 24 24 24 24 24

Integrated IP Resources

DSP Slices 600 672 768 1,200 1,560 1,800 2,880

System Monitor 1 1 1 2 3 3 3

PCIe® Gen1/2/3 2 4 4 4 4 6 6

Interlaken 3 6 6 6 8 9 0

100G Ethernet 3 4 4 6 9 9 3

GTH 16.3Gb/s Transceivers 20 32 32 40 52 60 48

GTY 30.5Gb/s Transceivers 20 32 32 40 52 60 0

Speed Grades

Commercial -1

Extended -1H -2 -3 -1H -2 -3 -1H -2 -3 -1H -2 -3 -1H -2 -3 -1H -2 -3 -2 -3

Industrial -1 -2 -1 -2 -1 -2 -1 -2 -1 -2 -1 -2 -1 -1L -2

Package Footprint(1, 2); Package Dimensions (mm); HR I/O, HP I/O, GTH 16.3Gb/s, GTY 30.5Gb/s

Footprint Compatible with Kintex® UltraScale Devices

C1517 40x40 52, 468, 20, 20 52, 468, 20, 20 52, 468, 20, 20

D1517 40x40 52, 286, 32, 32 52, 286, 32, 32 52, 286, 40, 32

B1760 42.5x42.5 52, 650, 32, 16 52, 650, 32, 16 52, 650, 36, 16

A2104 47.5x47.5 52, 780, 28, 24 52, 780, 28, 24 52, 780, 28, 24

B2104 47.5x47.5 52, 650, 32, 32 52, 650, 32, 32 52, 650, 40, 36 52, 650, 40, 36 52, 650, 40, 36

C2104 47.5x47.5 52, 364, 32, 32 52, 364, 40, 40 52, 364, 52, 52 52, 364, 52, 52

B2377 50x50 52, 1248, 36, 0

A2577 52.5x52.5 0, 448, 60, 60

A2892 55x55 52, 1404, 48, 0

Notes:

1. For full part number details, see the Ordering Information section in DS890, UltraScale Architecture and Product Overview.

2. See UG575, UltraScale Architecture Packaging and Pinouts User Guide for more information.


Page 35: FPGA / SOC teknologi - i dag og i fremtiden

16nm UltraScale+ Update

Page 36: FPGA / SOC teknologi - i dag og i fremtiden


New & Enhanced UltraScale+™ Capabilities

DDR4

Page 37: FPGA / SOC teknologi - i dag og i fremtiden


Tuned Process for Optimal Performance/Watt

Optimal Operating Voltage Selection

Normalized Fabric Performance   1.0x   1.2x   1.6x   1.2x
Normalized Total Power          1.0x   0.7x   0.8x   0.5x
Performance/Watt                1.0x   1.7x   2x     2.4x
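The Performance/Watt row is the normalized fabric performance divided by the normalized total power for each column. A short sketch reproducing the slide's figures from its own two input rows (variable names are illustrative):

```python
# Normalized values copied from the slide, one entry per column.
normalized_performance = [1.0, 1.2, 1.6, 1.2]
normalized_power = [1.0, 0.7, 0.8, 0.5]

# Performance per watt is performance divided by power, rounded to one decimal.
performance_per_watt = [
    round(perf / power, 1)
    for perf, power in zip(normalized_performance, normalized_power)
]

print(performance_per_watt)
```

The last two columns show the trade-off the slide title refers to: one option delivers more fabric performance (1.6x) at 0.8x power, while the other keeps 1.2x performance and cuts power to 0.5x, giving the highest performance per watt.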

Page 38: FPGA / SOC teknologi - i dag og i fremtiden


UltraRAM: New Memory Technology

Up to 360Mb to replace external memory for cost, power, performance

Page 39: FPGA / SOC teknologi - i dag og i fremtiden


UltraRAM Capabilities


Features Block RAM UltraRAM

Density per block 36K/18K 288K

Configurable Port Width -

Asynchronous Clocking -

Built-in FIFO -

ECC

Unused site gating

Sleep mode

Deep-sleep mode (3-clk cycle wake-up time) -

Hardened data output cascading

Hardened data input & address cascade -

Hard cascade across column - deterministic latency -

Optional input cascade/pipelines stages -

Hardened address decoder -

[Diagram labels: DIN (72), ADDR]

UltraRAM vs. Block RAM Comparison (Sub-Set)

Different Capabilities for Different Use Models
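To give a sense of scale for the density figures quoted on the UltraRAM slides (288K per UltraRAM block, 36K per block RAM, up to 360Mb of UltraRAM), the sketch below converts a capacity into a block count. It assumes 1 Mb = 1024 Kb; the function name is hypothetical.

```python
ULTRARAM_BLOCK_KB = 288  # density per UltraRAM block, from the comparison table
BLOCK_RAM_KB = 36        # density per (36K) block RAM, from the comparison table
KB_PER_MB = 1024         # assumption: the slides use binary megabits


def blocks_needed(total_mb, block_kb):
    """Number of memory blocks that make up a capacity given in Mb."""
    return round(total_mb * KB_PER_MB / block_kb)


ultraram_blocks = blocks_needed(360, ULTRARAM_BLOCK_KB)
block_ram_equivalent = blocks_needed(360, BLOCK_RAM_KB)

print(ultraram_blocks, block_ram_equivalent)
```

Under these assumptions 360 Mb corresponds to 1,280 UltraRAM blocks, and the same capacity would take eight times as many 36K block RAMs.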


Page 40: FPGA / SOC teknologi - i dag og i fremtiden


New Integrated PCIe Gen3 x16 and Gen4 x8 Block

Capable of Multi-100G Ports

New Features / Benefits

Gen3 x16 (8 Gb/s per lane): Performance for today’s high-end systems, e.g., 100G data center

Gen4 x8 (16 Gb/s per lane): Enables next generation system topologies

Hardened SR-IOV (4 Physical, 252 Virtual Functions): Expanded virtualization for demanding data center applications

Increased Number of Tags:

• 256 managed tags and 256 user managed tags

• Enables more outstanding RD requests for greater system performance

New DMA IP: Complete end-to-end solution
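The two link configurations on this slide carry the same raw bandwidth per direction, which a short calculation makes explicit. The sketch below uses the per-lane rates from the slide; the 128b/130b line-encoding factor is the standard one for PCIe Gen3 and Gen4 and is not mentioned on the slide, and the function name is hypothetical.

```python
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b encoding used by PCIe Gen3 and Gen4


def pcie_bandwidth_gbps(rate_per_lane_gbps, lanes):
    """Return (raw, after-encoding) bandwidth in Gb/s for one direction."""
    raw = rate_per_lane_gbps * lanes
    return raw, raw * ENCODING_EFFICIENCY


gen3_x16_raw, gen3_x16_effective = pcie_bandwidth_gbps(8, 16)
gen4_x8_raw, gen4_x8_effective = pcie_bandwidth_gbps(16, 8)

print(gen3_x16_raw, round(gen3_x16_effective, 1))
print(gen4_x8_raw, round(gen4_x8_effective, 1))
```

Protocol overhead above the encoding layer (packet headers, flow control) is ignored here, so real payload throughput is lower than these figures.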

Page 41: FPGA / SOC teknologi - i dag og i fremtiden


Multi-Node Footprint Migration


20nm 16nm

Leverage system level investment across platforms

Future-proof migration path to 16nm

Page 42: FPGA / SOC teknologi - i dag og i fremtiden


Page 43: FPGA / SOC teknologi - i dag og i fremtiden


Virtex® UltraScale+™ FPGAs

Device Name VU3P VU5P VU7P VU9P VU11P VU13P

Logic

System Logic Cells (K) 862 1,314 1,724 2,586 2,822 3,763

CLB Flip-Flops (K) 788 1,201 1,576 2,364 2,580 3,441

CLB LUTs (K) 394 601 788 1,182 1,290 1,720

Memory

Max. Distributed RAM (Mb) 12.0 18.3 24.1 36.1 38.7 51.6

Total Block RAM (Mb) 25.3 36.0 50.6 75.9 70.9 94.5

UltraRAM (Mb) 90.0 132.2 180.0 270.0 270.0 360.0

Clocking Clock Management Tiles (CMTs) 10 20 20 30 12 16

Integrated IP

DSP Slices 2,280 3,474 4,560 6,840 8,928 11,904

PCIe® Gen3 x16 / Gen4 x8 2 4 4 6 3 4

150G Interlaken 3 4 6 9 6 8

100G Ethernet w/ RS-FEC 3 4 6 9 9 12

I/O

Max. Single-Ended HP I/Os 520 832 832 832 624 832

GTY 32.75Gb/s Transceivers 40 80 80 120 96 128

Speed Grades

Extended -1 -2L -3 -1 -2L -3 -1 -2L -3 -1 -2L -3 -1 -2L -3 -1 -2L -3

Industrial -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2

Footprint(1,2); Dimensions (mm); HP I/O, GTY 32.75Gb/s

Footprint Compatible with 20nm UltraScale Devices

C1517 40x40 520, 40

F1924(3) 45x45 624, 64

A2104 47.5x47.5 832, 52 832, 52 832, 52

A2104 52.5x52.5(4) 832, 52

B2104 47.5x47.5 702, 76 702, 76 702, 76 624, 76

B2104 52.5x52.5(4) 702, 76

C2104 47.5x47.5 416, 80 416, 80 416, 104 416, 96

C2104 52.5x52.5(4) 416, 104

A2577 52.5x52.5 448, 120 448, 96 448, 128

Notes:

1. For full part number details, see the Ordering Information section in DS890, UltraScale Architecture and Product Overview.

2. All packages are 1.0mm ball pitch.

3. GTY transceiver up to 16.3Gb/s. Refer to data sheet for details.

4. These 52.5x52.5mm packages have the same PCB ball footprint as the 47.5x47.5mm packages and are footprint compatible.

Page 44: FPGA / SOC teknologi - i dag og i fremtiden


Kintex® UltraScale+™ FPGAs

Device Name KU3P KU5P KU9P KU11P KU13P KU15P

Logic
System Logic Cells (K) 356 475 600 653 747 1,143
CLB Flip-Flops (K) 325 434 548 597 683 1,045
CLB LUTs (K) 163 217 274 299 341 523

Memory
Max. Distributed RAM (Mb) 4.7 6.1 8.8 9.1 11.3 9.8
Total Block RAM (Mb) 12.7 16.9 32.1 21.1 26.2 34.6
UltraRAM (Mb) 13.5 18.0 0 22.5 31.5 36.0

Clocking
Clock Management Tiles (CMTs) 4 4 4 8 4 11

Integrated IP
DSP Slices 1,368 1,824 2,520 2,928 3,528 1,968
PCIe® Gen3 x16 / Gen4 x8 1 1 0 4 0 5
150G Interlaken 0 0 0 2 0 4
100G Ethernet w/RS-FEC 0 1 0 1 0 4

I/O
Max. Single-Ended HD I/Os 96 96 96 96 96 96
Max. Single-Ended HP I/Os 208 208 208 416 208 572
GTH 16.3Gb/s Transceivers 0 0 28 32 28 44
GTY 32.75Gb/s Transceivers 16(1) 16(1) 0 20 0 32

Speed Grades
Extended -1 -2L -3 -1 -2L -3 -1 -2L -3 -1 -2L -3 -1 -2L -3 -1 -2L -3
Industrial -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2 -1 -1L -2

Packaging
Footprint(2,3); Dimensions (mm); HD I/O, HP I/O, GTH 16.3Gb/s, GTY 32.75Gb/s
B784 23x23(4) 96, 208, 0, 16 96, 208, 0, 16
A676 27x27 48, 208, 0, 16 48, 208, 0, 16
B676 27x27 72, 208, 0, 16 72, 208, 0, 16
D900 31x31 96, 208, 0, 16 96, 208, 0, 16 96, 312, 16, 0
E900 31x31 96, 208, 28, 0 96, 208, 28, 0
A1156 35x35 48, 416, 28, 0 48, 468, 28, 0
E1517 40x40 96, 416, 32, 20 96, 416, 32, 24
A1760 42.5x42.5 96, 416, 44, 32
E1760 42.5x42.5 96, 572, 32, 24

Notes:

1. GTY maximum data rate is limited.

2. Maximum achievable performance is device and package dependent; consult the associated data sheet for details.

3. For full part number details, see the Ordering Information section in DS890, UltraScale Architecture and Product Overview.

4. The B784 package is only offered in 0.8mm ball pitch. All other packages are 1.0mm ball pitch.

Page 45: FPGA / SOC teknologi - i dag og i fremtiden


Zynq UltraScale+ EG & EV

Page 46: FPGA / SOC teknologi - i dag og i fremtiden


The First All Programmable Multiprocessing SoC (MPSoC)

The Right Engines for the Right Tasks

Delivering 64-bit Performance and Terabyte Address Space

Delivering an Extra Node of Value

Page 47: FPGA / SOC teknologi - i dag og i fremtiden

Zynq® UltraScale+™ System Features

Page 48: FPGA / SOC teknologi - i dag og i fremtiden


Zynq® UltraScale+™ Block Diagram

Page 49: FPGA / SOC teknologi - i dag og i fremtiden


Unprecedented System Power Management

Designed with Lower Power Applications In Mind

Page 50: FPGA / SOC teknologi - i dag og i fremtiden

Zynq® UltraScale+™ Connection Diagram

Page 51: FPGA / SOC teknologi - i dag og i fremtiden


Application Processing System: ARM Cortex-A53

Feature: Benefit

ARMv8-A architecture, multicore Cortex-A53 up to 1.5 GHz:
• 64-bit increases compute capability while maintaining 32-bit compatibility
• ARM’s most power-efficient A5x APU & most widely used 64-bit processor
• 1 terabyte physical address space
• 2.7X performance/watt (DMIPS) vs. predecessor (processor comparison only)

NEON Technology: SIMD engine accelerates multimedia, signal & image processing algorithms

Floating-Point Unit (FPU):
• Hardware support for FP operations in half-, single- and double-precision
• IEEE 754-2008 compliant (current floating-point standard)

Hardware Virtualization: Enables multiple SW environments & apps simultaneous access to system resources

[Block diagram: Application Processing Unit with four ARM Cortex™-A53 cores, each with NEON™, Floating Point Unit, I-Cache w/Parity and D-Cache w/ECC; shared SCU and 1MB L2 w/ECC. Callouts: Performance, Power]
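The "1 terabyte physical address space" figure can be checked with a little arithmetic. The sketch below assumes a 40-bit physical address bus, which is the width commonly documented for the Cortex-A53 and is not stated on the slide; the names are illustrative only.

```python
# Assumption: 40-bit physical addressing (not stated on the slide).
PHYS_ADDR_BITS = 40
TIB = 1 << 40  # one tebibyte in bytes


def addressable_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


phys_tib = addressable_bytes(PHYS_ADDR_BITS) / TIB
print(f"{PHYS_ADDR_BITS}-bit physical addresses cover {phys_tib:.0f} TiB")
```

The 64-bit label therefore describes register and virtual-address width; the physical reach quoted on the slide corresponds to the narrower physical address bus.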

Page 52: FPGA / SOC teknologi - i dag og i fremtiden


Real-Time Processing System: ARM Cortex-R5

[Block diagram: Real-Time Processing Unit with two ARM Cortex™-R5 cores, each with Vector Floating Point Unit, Memory Protection Unit, 128 KB TCM w/ECC, 32 KB I-Cache w/ECC and 32 KB D-Cache w/ECC; GIC]

Feature: Benefit

ARMv7-R Architecture, up to 600MHz:
• Flagship ARM series for deterministic processing for critical real-time operation
• Offloads APU to perform compute-intensive tasks, reducing overall system power
• Supports Real-Time Operating Systems (RTOS) or Bare Metal

Dual-Core for Multi-Mode Operation:
• Lock-Step Mode for fault tolerance and fault detection, doubles TCM to 256KB
• Split-Mode with each real-time core operating autonomously

128KB Memory with ECC:
• Tightly coupled with processor for deterministic and low-latency response
• Ideal for critical code structures such as interrupt service routines

Safety Certifiable:
• Industry-proven to meet safety-critical standards
• e.g., IEC 61508 (industrial) and ISO 26262 (automotive)

[Figure: Lock-Step Configuration. A sample C code listing is shown next to a COMPARE block]
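The lock-step idea from the Cortex-R5 slide (two cores run the same code and a comparator flags any disagreement) can be illustrated in software. This is only a conceptual sketch: the real mechanism is implemented in hardware, and `run_lock_step`, `LockStepFault` and the `shadow` fault-injection hook are hypothetical names introduced here.

```python
from typing import Callable, Optional


class LockStepFault(Exception):
    """Raised when the two redundant executions disagree."""


def run_lock_step(task: Callable[[int], int], value: int,
                  shadow: Optional[Callable[[int], int]] = None) -> int:
    """Run `task` twice and compare the results, as a COMPARE block would.

    `shadow` lets a caller inject a faulty second execution for testing;
    by default both executions use the same function.
    """
    primary = task(value)
    checker = (shadow or task)(value)
    if primary != checker:
        raise LockStepFault(f"mismatch: {primary} != {checker}")
    return primary


# Fault-free case: both executions agree.
print(run_lock_step(lambda x: x * 2, 21))

# Injected fault: the shadow execution returns a corrupted result.
try:
    run_lock_step(lambda x: x * 2, 21, shadow=lambda x: x * 2 + 1)
except LockStepFault as err:
    print("fault detected:", err)
```

Lock-step detects a fault but cannot tell which execution was wrong, which is why it is described as fault detection rather than correction.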

Page 53: FPGA / SOC teknologi - i dag og i fremtiden


ARM-Based Graphics Processor

Feature: Benefit

ARM Mali™-400 MP2 up to 667MHz:
• Most power-optimized ARM GPU with Full HD support (1080p)
• Ideal for 2D vector graphics and 3D graphics (e.g., HMI, waveform processing)
• Supports open standards, e.g., OpenGL ES 1.1 & 2.0

Native Embedded Linux Support: Out-of-the-box drivers and libraries for graphics support

Dual Pixel Processors: Up to 1.3 GPix/s (fill rate) and 20 GFLOPS (shader rate)

Optimized Memory Interface: Tightly coupled w/memory controller for efficient communication with DisplayPort controller

[Block diagram: ARM Mali™-400 MP2 with Geometry Processor, Pixel Processors 1 and 2, Memory Management Unit and 64 KB L2 Cache]

Use cases shown: 2.5D/3D Visualization, On-Screen Displays, 1080p Resolution

Intensive fill rate for smoother transition and frame rate

High performance shaders for complex 3D scenes
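The quoted fill rate is consistent with simple arithmetic on the numbers in the feature table. The sketch below assumes each pixel processor retires one pixel per clock, which the slide does not state; the function name is illustrative.

```python
def fill_rate_gpix(pixel_processors: int, clock_mhz: float,
                   pixels_per_clock: int = 1) -> float:
    """Peak fill rate in gigapixels per second.

    pixels_per_clock=1 is an assumption, not a figure from the slide.
    """
    return pixel_processors * clock_mhz * pixels_per_clock / 1000.0


# Two pixel processors at 667 MHz, as listed for the Mali-400 MP2.
rate = fill_rate_gpix(pixel_processors=2, clock_mhz=667)
print(f"peak fill rate: {rate:.3f} GPix/s")
```

Under that assumption the result rounds to the "up to 1.3 GPix/s" shown on the slide.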

Page 54: FPGA / SOC teknologi - i dag og i fremtiden


Integrated H.264 / H.265 Video Codec Engine

Feature: Benefit

Integrated Video Codec Unit @ up to 667MHz:
• Broad application ranging from surveillance, digital cameras, broadcasting
• Up to 8 simultaneous streams coming from FPGA fabric or Processing System
• Higher display density, faster encoding, and lower power vs. soft implementation
• Up to 4Kx2K (60 fps) or 8Kx4K (15 fps)

Power Management, Performance Monitoring:
• Clock gating (dynamic savings), power gating (static/dynamic savings)
• Measure task execution time, bandwidth, and latency for fast design optimization

[Block diagram: Video Codec Unit with Encoder (x4), Decoder (x2) and Memory Controller; connected blocks shown: Camera, Ethernet (two), DisplayPort]
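The two headline modes of the video codec, 4Kx2K at 60 fps and 8Kx4K at 15 fps, turn out to demand the same pixel throughput. The sketch below assumes the UHD frame sizes 3840x2160 and 7680x4320; the slide only gives the shorthand, so those dimensions are an assumption.

```python
def pixel_rate(width: int, height: int, fps: int) -> int:
    """Pixels per second for a given frame size and frame rate."""
    return width * height * fps


# Assumed frame sizes (the slide says only "4Kx2K" and "8Kx4K").
rate_4k60 = pixel_rate(3840, 2160, 60)
rate_8k15 = pixel_rate(7680, 4320, 15)

print(f"4Kx2K @ 60 fps: {rate_4k60:,} pixels/s")
print(f"8Kx4K @ 15 fps: {rate_8k15:,} pixels/s")
```

An 8K frame has four times the pixels of a 4K frame, so a quarter of the frame rate keeps the codec at the same load.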

Page 55: FPGA / SOC teknologi - i dag og i fremtiden


Platform Management Unit

Dedicated Hardware for Power Management and Safety

Feature: Benefit

Power Management

Power Domains & Islands:
• ASIC-like, domain- & block-level power control to use only what’s needed when needed
• Eliminate static power of unused blocks

Power Management Framework:
• Xilinx-provided library to simplify & customize power control for application requirements
• Systematic power coordination between processing elements for reliable shutdown & resume

Functional Safety & System Management

SW Test Library & Error Handling: Xilinx-provided libraries to manage key processing elements & detect errors

Triple-Redundancy Processor: Continuous & reliable operation in the event of an error

[Figure: power domains. Battery Power Domain (VCC_PSBATT), Low Power Domain, Full Power Domain and PL Domain, covering the Processing System (Memory, Application Processing Unit with four A53 cores, General Connectivity, Security, System Control, PMU, System Monitor) and the Programmable Logic; individual blocks are marked Off or Power Down]

[Figure: Platform Management Unit Block Diagram. Triple Redundant Processor, 32KB ROM, 128KB RAM with ECC, Power Domain Controls, Peripheral & Memory Access, IO Unit & Interrupt Controller, Wake Signals]
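A triple-redundant processor can keep operating through a single error because three copies of a result are compared and the majority wins. The slide does not describe the voting scheme, so the sketch below assumes plain two-out-of-three majority voting; `majority_vote` is a hypothetical helper.

```python
from collections import Counter


def majority_vote(a: int, b: int, c: int) -> int:
    """Return the value that at least two of the three copies agree on.

    Assumption: simple 2-of-3 voting; raises if all three copies differ.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: all three copies disagree")
    return value


# One corrupted copy is outvoted by the two healthy ones.
print(majority_vote(0x5A, 0x5A, 0x7A))
```

Unlike a two-way lock-step comparison, three copies allow a single fault to be masked rather than merely detected.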

Page 56: FPGA / SOC teknologi - i dag og i fremtiden


UltraScale+™ Programmable Logic

Security, Reliability: Decryption, Anti-Tamper, SEU Resilience

External Memory: DDR4 at 2,666Mb/s

DSP: Floating & Fixed Point Enhanced

Block RAM: Hardened cascading

UltraRAM: Massive Capacity, SRAM replacement

Networking IP: 100G Ethernet, 150G Interlaken

Transceivers: 16G & 28G backplane, 32.75G chip-to-chip

PCI Express®: Gen3 x16, Gen4 x8

I/O Interfacing: High-Density I/O, MIPI D-PHY Support

Page 57: FPGA / SOC teknologi - i dag og i fremtiden


Embedded Software Development Tools

Xilinx Software Development Kit (SDK) for SW Dev and Project, Build, & Tool Chain Management

Feature: Benefit

Eclipse-Based IDE: Familiar software development environment

Linaro GCC Tool Chain: Industry-standard compiler tool chain for Embedded Linux & Bare Metal (included in SDK)

Multi-Core Debug: Debug & cross triggering for Cortex-A53s, Cortex-R5s, and MicroBlaze™ Processor

Performance Profiling & Analysis: Analyze interfaces across processing and programmable logic domains

Ecosystem Development Tools:
• Broad support for 3rd party dev tools & debug, e.g., ARM DS-5, Lauterbach TRACE32
• Designers use their preferred development & debug environment

Page 58: FPGA / SOC teknologi - i dag og i fremtiden


Reference Designs

Examples of System Topologies to Jump-Start Differentiation

Start System Development Immediately

[Figure: Example Design (SMP Linux / RPU Split). Labels: Reference Design (e.g., Boot Loaders, Firmware, Framework, OSs), Provided by Xilinx; SMP Linux; FreeRTOS; Inter-Processor Framework; APU; RPU with two single R5 cores; Message Passing; C-Code; User App (two)]

Features: Details & Benefits

Common System Topologies:
• Pre-built & validated
• Enables immediate application development

“Mini-Reference Designs”:
• Incrementally build to full system solution, e.g.,
  • OS implementation
  • ‘Hello World’ for each processor on top of OS
  • Processing System & FPGA logic integration
  • SDSoC software acceleration
  • OpenAMP communication

Available Topologies

SMP Linux / RPU Split:
• APU: SMP Linux
• RPU: Baremetal (R5 core 1), FreeRTOS (R5 core 2)

SMP Linux / RPU Lock-Step:
• APU: SMP Linux
• RPU: Baremetal (R5 core 1), FreeRTOS (R5 core 2)

Hypervisor:
• APU: SMP Linux
• RPU: Baremetal (R5 core 1), FreeRTOS (R5 core 2)

Baremetal


Page 62: FPGA / SOC teknologi - i dag og i fremtiden


UltraZed-EG SOM

• Xilinx Zynq UltraScale+ MPSoC
• DDR4 SDRAM (2GB)
• QSPI Flash (64MB)
• eMMC Flash (8GB)
• Gigabit Ethernet PHY
• USB 2.0 PHY
• PMBus Voltage Regulators

Page 63: FPGA / SOC teknologi - i dag og i fremtiden


UltraZed-EG SOM Mechanical Dimensions

Page 64: FPGA / SOC teknologi - i dag og i fremtiden


Zynq UltraScale+ CG

Page 65: FPGA / SOC teknologi - i dag og i fremtiden


Different Applications Have Different Processing Needs

Example applications: Motion Control, Machine Vision, ISM Applications

Processing blocks shown: Application Processor x2, Real-Time Processor x2; Real-Time Processor x2, Application Processor x4, Graphics Processor, Video Codec

Scalable Common Architecture - Feature and cost optimized by application

Page 66: FPGA / SOC teknologi - i dag og i fremtiden


Zynq® UltraScale+™ MPSoC: CG Devices

Application Processor: 64-bit Dual-Core

Zynq® UltraScale+™ MPSoC: EG & EV Devices

Application Processor: 64-bit Quad-Core

Blocks shown:
• Real-Time Processors: 32-bit Dual-Core
• Platform & Power Management: Granular Power Control, Functional Safety
• Configuration & Security Unit: Anti-Tamper & Trust, Industry Standards
• Fabric Acceleration: Customizable Engines, High Speed Connectivity
• Video Codec: 8K4K (15fps), 4K2K (60fps)
• High Speed Peripherals: Key Interfaces
• Graphics Processor: ARM Mali-400MP2
• Memory Subsystem: High Bandwidth, Low Latency

Page 67: FPGA / SOC teknologi - i dag og i fremtiden


Extending the Zynq® Portfolio

Tiers shown: High-End, Mid-Range, Low-End

• Dual-core ARM® Cortex™-A9, 28nm Artix®-7 FPGA
• Dual-core ARM Cortex-A9, 28nm Kintex®-7 FPGA
• Dual-Core ARM Cortex-R5, Dual-Core ARM Cortex-A53, 16nm FinFET+ Logic
• Dual-Core ARM Cortex-R5, Quad-Core ARM Cortex-A53, ARM Mali™-400 MP2, 16nm FinFET+ Logic
• Dual-Core ARM Cortex-R5, Quad-Core ARM Cortex-A53, ARM Mali-400 MP2, H.264/H.265 Video Codec, 16nm FinFET+ Logic

Page 68: FPGA / SOC teknologi - i dag og i fremtiden


Completing the Zynq UltraScale+ MPSoC Portfolio

Processor Scalability to meet diverse market requirements

Seven New CG Devices for Increased Market Reach: Dual-Core APU, Dual-Core RPU

Extended Range of EG Devices for Greater Flexibility: Quad-Core APU, Dual-Core RPU, GPU

EV Devices for Applications Requiring a Video Codec: Quad-Core APU, Dual-Core RPU, GPU, VCU

Page 69: FPGA / SOC teknologi - i dag og i fremtiden


Zynq UltraScale+ MPSoC Device Migration Table

Zynq® UltraScale+™ MPSoC

Pkg mm CG Devices EG Devices EV Devices

ZU2CG ZU3CG ZU4CG ZU5CG ZU6CG ZU7CG ZU9CG ZU2EG ZU3EG ZU4EG ZU5EG ZU6EG ZU7EG ZU9EG ZU11EG ZU15EG ZU17EG ZU19EG ZU4EV ZU5EV ZU7EV

A484 19 X X X X

A625 21 X X X X

C784 23 X X X X X X X X X

B900 31 X X x X X X X X X

C900 31 X X x X X

B1156 35 X X x X X

C1156 35 x x X X

B1517 40 X X X

F1517 40 x x X X

C1760 42.5 X X X

D1760 42.5 X X

E1924 45 X X

Page 70: FPGA / SOC teknologi - i dag og i fremtiden


16nm UltraScale+ Is Now In Production

Expanding On Our One Year Lead at 16nm

KU3P, KU5P, KU9P Devices

VU3P Device

ZU2, ZU3, ZU6, ZU9 EG/CG Devices

Page 71: FPGA / SOC teknologi - i dag og i fremtiden


Roadmap

Where is FPGA / SoC technology taking us, and what does the future hold?

Page 72: FPGA / SOC teknologi - i dag og i fremtiden


Bandwidth-Hungry Applications Drive Memory Solutions

Growing bandwidth gap between commodity memory solutions vs. requirements of high-end systems

Driving applications: 4K/8K Multi-Pass Video Processing; HPC Analytics & Image Recognition; Network Function Virtualization & Bridging

[Chart: Bandwidth vs. Year (2008, 2011, 2014, 2017) for Ethernet, Video, DSP Capability and DDR]

Ethernet Trend: 10G, 40G, 100G, 400G

Video Trend: 1080P, 2K, 4K, 8K

DDR Trend: 2,133 (DDR3), 2,667 (DDR4)

FPGA DSP Trend: 2,000 (40nm), 12,000 (16nm)

A revolutionary increase in memory bandwidth is needed
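The size of the gap becomes clearer when the first and last values of each trend line are turned into growth factors. This is a rough sketch that uses only the endpoint numbers printed on the slide; the units are left as the slide gives them, and the dictionary keys are illustrative labels.

```python
# (first value, last value) for each trend, copied from the slide.
TRENDS = {
    "Ethernet": (10, 400),       # 10G -> 400G
    "DDR": (2133, 2667),         # DDR3 -> DDR4
    "FPGA DSP": (2000, 12000),   # 40nm -> 16nm
}

growth = {name: last / first for name, (first, last) in TRENDS.items()}

for name, factor in growth.items():
    print(f"{name:9s} grew {factor:5.2f}x")
```

Interface and compute demands grow by factors of 6 to 40 over the period shown, while commodity DDR grows by about a quarter, which is the gap the slide's closing statement refers to.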

Page 73: FPGA / SOC teknologi - i dag og i fremtiden


Obtaining Superior Bandwidth-per-Watt

DDR-4 DIMM (standard commodity memory used in servers and PCs)
Bandwidth 21.3 GB/s; Depth 16 GB; Price / GB $; PCB Req High; pJ / bit ~27; Latency Med

RLDRAM-3 (low latency DRAM for packet buffering applications)
Bandwidth 12.8 GB/s; Depth 2 GB; Cost / GB $$; PCB Req High; pJ / bit ~40; Latency Low

HMC (Hybrid-Memory Cube, serial DRAM)
Bandwidth 160 GB/s; Depth 4 GB; Cost / GB $$$; PCB Req Med; pJ / bit ~30; Latency High

HBM (High Bandwidth Memory, DRAM integrated into the FPGA package)
Bandwidth 460 GB/s; Depth 8 GB; Cost / GB $$; PCB Req None; pJ / bit ~7; Latency Med

* Configurations compared: single DDR4 DIMM, two x36 RLDRAM-3, single HMC device, single FPGA with HBM
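Bandwidth-per-watt follows directly from the bandwidth and pJ/bit figures. The sketch below assumes the four value columns belong, in order, to DDR4, HMC, RLDRAM-3 and HBM (an interpretation of the slide layout) and that the pJ/bit numbers cover the whole interface at full bandwidth; the helper names are illustrative.

```python
# (bandwidth in GB/s, energy in pJ/bit), as read from the slide.
MEMORIES = {
    "DDR4 DIMM": (21.3, 27),
    "HMC": (160.0, 30),
    "RLDRAM-3": (12.8, 40),
    "HBM": (460.0, 7),
}


def interface_power_w(bandwidth_gbytes: float, pj_per_bit: float) -> float:
    """Power drawn when the interface runs at its full bandwidth."""
    bits_per_second = bandwidth_gbytes * 1e9 * 8
    return bits_per_second * pj_per_bit * 1e-12


def gbytes_per_watt(bandwidth_gbytes: float, pj_per_bit: float) -> float:
    """Bandwidth delivered per watt of interface power."""
    return bandwidth_gbytes / interface_power_w(bandwidth_gbytes, pj_per_bit)


for name, (bw, pj) in MEMORIES.items():
    print(f"{name:10s} {interface_power_w(bw, pj):6.2f} W "
          f"{gbytes_per_watt(bw, pj):6.2f} GB/s per W")

best = max(MEMORIES, key=lambda n: gbytes_per_watt(*MEMORIES[n]))
hbm_vs_ddr4 = MEMORIES["HBM"][0] / MEMORIES["DDR4 DIMM"][0]
print(f"best bandwidth-per-watt: {best}; HBM vs DDR4 bandwidth: {hbm_vs_ddr4:.1f}x")
```

Because GB/s per watt reduces to 125 divided by the pJ/bit figure, the ranking depends only on energy per bit; the absolute power still scales with how much bandwidth is actually used.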

Page 74: FPGA / SOC teknologi - i dag og i fremtiden


Introducing Virtex UltraScale+ HBM Devices

20X more bandwidth than a DDR4 DIMM

• DRAM stacks integrated using SSI Technology
• Dedicated hardened interface to the HBM for maximized bandwidth
• Built on the proven Virtex UltraScale+ FPGA platform
• Memory Controller uses AXI interface for easy integration using Vivado IPI
• HBM Gen2 represents the highest DRAM bandwidth available
• Hardened Cache Coherent Interconnect (CCIX) Ports

Page 75: FPGA / SOC teknologi - i dag og i fremtiden


Built Using Proven Assembly Technology

Xilinx pioneered CoWoS (SSI Technology) back in 28nm
– This is the 3rd generation of Xilinx using CoWoS (Chip-on-Wafer-on-Substrate)

CoWoS is the lowest risk assembly for Virtex UltraScale+ HBM

CoWoS is the de facto standard assembly for HBM integration
– GPU vendors are already using this assembly

[Image: White Paper circa 2012]

Page 76: FPGA / SOC teknologi - i dag og i fremtiden


Virtex® UltraScale+™ HBM FPGAs

Device Name VU31P VU33P VU35P VU37P

Logic

System Logic Cells (K) 970 970 1,915 2,860

CLB Flip-Flops (K) 887 887 1,751 2,615

CLB LUTs (K) 444 444 876 1,308

Memory

Max. Distributed RAM (Mb) 12.5 12.5 24.6 36.7

Total Block RAM (Mb) 23.6 23.6 47.3 70.9

UltraRAM (Mb) 90 90 180 270

HBM DRAM (Gb) 32 64 64 64

HBM AXI Ports 32 32 32 32

Clocking Clock Management Tiles (CMTs) 4 4 8 12

Integrated IP

DSP Slices 2,880 2,880 5,952 9,024

PCIe® Gen3 x16 / Gen4 x8 4 4 5 6

CCIX Ports(2) 4 4 4 4

150G Interlaken 0 0 2 4

100G Ethernet w/ RS-FEC 2 2 5 8

I/O

Max. Single-Ended HP I/Os 208 208 416 624

GTY 32.75Gb/s Transceivers 32 32 64 96

Speed Grades Extended(1) -1, -2L, -3 -1, -2L, -3 -1, -2L, -3 -1, -2L, -3

Footprint(1) Dimensions (mm) HP I/O, GTY 32.75Gb/s

Packaging

H1924 45x45 208, 32

H2104 47.5x47.5 208, 32 416, 64

H2892 55x55 416, 64 624, 96

Notes:

1. All packages are 1.0mm ball pitch.

2. A CCIX port requires the use of a PCIe Gen3 x16 / Gen4 x8 block
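Two quick conversions help relate this table to the memory comparison earlier in the deck. The sketch assumes the 460 GB/s figure quoted for "Single FPGA with HBM" applies to a 64 Gb device and is shared evenly across the 32 AXI ports; neither assumption is stated in the table.

```python
HBM_GBIT = 64            # HBM DRAM (Gb) for VU33P, VU35P, VU37P
AXI_PORTS = 32           # HBM AXI Ports
TOTAL_BW_GBYTES = 460.0  # assumed total bandwidth, from the earlier slide

capacity_gbytes = HBM_GBIT / 8                   # gigabits to gigabytes
per_port_gbytes = TOTAL_BW_GBYTES / AXI_PORTS    # even split is an assumption

print(f"capacity: {capacity_gbytes:.0f} GB")
print(f"bandwidth per AXI port: {per_port_gbytes:.3f} GB/s")
```

The 8 GB result matches the "Depth 8 GB" entry given for HBM in the bandwidth-per-watt comparison.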

Page 77: FPGA / SOC teknologi - i dag og i fremtiden


56G PAM4 Transceivers Coming to 16nm

“There Is One More Thing…”

CONFIDENCE
• 56G Test Chip, Jan 2016 (Demo Video)
• 4th Generation Adaptive RX Equalization
• Proven Foundation: Virtex UltraScale+, Swap GTYs for GTMs
• Test Chips in Progress
• More Details Later This Year
• Timed with Optics Availability
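PAM4 carries two bits per symbol by using four amplitude levels, so a 56 Gb/s lane runs at 28 GBaud. The sketch below illustrates the encoding with a Gray-coded level mapping, which is a common choice but an assumption here; the slide does not describe the mapping used by the transceivers.

```python
# Assumed Gray-coded mapping from bit pairs to the four PAM4 levels.
GRAY_PAM4 = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def pam4_encode(bits: list) -> list:
    """Map a bit sequence to PAM4 symbols, two bits per symbol."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def symbol_rate_gbaud(bit_rate_gbps: float, bits_per_symbol: int = 2) -> float:
    """Symbol rate needed to carry a given bit rate."""
    return bit_rate_gbps / bits_per_symbol


print(pam4_encode([0, 0, 0, 1, 1, 1, 1, 0]))
print(f"56 Gb/s PAM4 runs at {symbol_rate_gbaud(56):.0f} GBaud")
```

With Gray coding, neighbouring levels differ in a single bit, so the most likely symbol error corrupts only one bit.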

Page 78: FPGA / SOC teknologi - i dag og i fremtiden


The First All Programmable RFSoC

Integrated RF-Class Analog Technology

Full Programmability Across the Analog-Digital Signal Chain

Delivering up to 50-70% Power and Footprint Reduction

Page 79: FPGA / SOC teknologi - i dag og i fremtiden


Reduced Power, Form Factor, and Design Cycle

Benefits shown: Power, Form Factor, Design Cycle, I/O Timing Closure

[Figure: NIC w/Half the Height & Length, built on a Virtex® UltraScale™ VU35P with HBM. Labels: Role (IPSec, SSL, Firewall, GZIP, OSV, SHA-1/2), HBM Controller, PCIe/CCIX, 400GE MAC]

[Figure: discrete data converters compared with an All Programmable Device. Labels: ADC, DAC, Transceivers, JESD204 Converter Interface IP, Analog Design, Analog Interface, Digital Design, Embedded Design, Processing System, System Design. Power figures shown: 1.75 Watts, 2.25 Watts, 1 Watt]

Page 80: FPGA / SOC teknologi - i dag og i fremtiden


Advantages of All Programmable RFSoC

RF Sampling for Platform Flexibility

• RF-design moved to the digital domain for full programmability

• Reduces & minimizes analog signal processing components

Shorter Design Cycle

• Simplified system design with fewer components

• Eliminates JESD204B/C analog interface design

Dramatic System Footprint Reduction

• Eliminates discrete converters

• Enables scalability for increasing channel count

Reduced System Power

• Reduces data converter power

• Eliminates FPGA-to-Analog interface power

Page 81: FPGA / SOC teknologi - i dag og i fremtiden


Prior Experience with Analog Design & Integration

Fully Integrated Test Chip: 12-bit 4 GSPS ADCs, 14-bit 6.4 GSPS DACs

2012: 28nm Test Chip Designed & Validated

2014: Published Research Results (Integrated ADC & DAC with Virtex-7 FPGA)

2016: 16nm FinFET Test Chip Designed & Validated
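The converter figures above give a feel for why removing the external converter interface matters. The sketch computes the Nyquist bandwidth and raw sample data rate of each converter, then estimates how many JESD204B lanes a discrete part would need, assuming 12.5 Gb/s lanes with 8b/10b coding and no framing overhead; the lane estimate is an illustration, not a figure from the slides.

```python
import math


def nyquist_ghz(sample_rate_gsps: float) -> float:
    """Widest signal bandwidth representable at this sample rate."""
    return sample_rate_gsps / 2


def raw_rate_gbps(bits: int, sample_rate_gsps: float) -> float:
    """Raw sample data rate produced or consumed by one converter."""
    return bits * sample_rate_gsps


def lanes_needed(rate_gbps: float, lane_gbps: float = 12.5,
                 coding_efficiency: float = 0.8) -> int:
    """Serial lanes required; defaults assume JESD204B with 8b/10b coding."""
    return math.ceil(rate_gbps / (lane_gbps * coding_efficiency))


adc_rate = raw_rate_gbps(12, 4.0)   # 12-bit 4 GSPS ADC
dac_rate = raw_rate_gbps(14, 6.4)   # 14-bit 6.4 GSPS DAC

print(f"ADC: {nyquist_ghz(4.0):.1f} GHz Nyquist, {adc_rate:.1f} Gb/s, "
      f"{lanes_needed(adc_rate)} lanes")
print(f"DAC: {nyquist_ghz(6.4):.1f} GHz Nyquist, {dac_rate:.1f} Gb/s, "
      f"{lanes_needed(dac_rate)} lanes")
```

Several high-speed serial lanes per converter is the interface power and board area that an integrated converter avoids.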

Page 82: FPGA / SOC teknologi - i dag og i fremtiden


Development tools for FPGA / SoC: now and in the future

Page 83: FPGA / SOC teknologi - i dag og i fremtiden

Vivado Design Suite

• High-level Synthesis
• Standards based: IP reuse, Tcl, SDC
• IP Integrator
• Fast simulation and HW co-simulation
• 230+ LogiCORE & SmartCore IP

[Chart: runtime, ISim vs. Vivado, 3X]

Page 84: FPGA / SOC teknologi - i dag og i fremtiden


SDSoC: HW Acceleration from C/C++ Applications

• Move C/C++ functions to hardware
• Full system generation including driver and hardware connectivity
• System-level debug and profile
• Rapid HW partitioning and exploration

[Flow diagram: C/C++ Applications; System-level Profiling; Specify Functions for Acceleration; Full System Generation; Performance Estimation]

Page 85: FPGA / SOC teknologi - i dag og i fremtiden


Before SDSoC: HW/SW Partition Exploration

Involves Multiple Disciplines to Explore Architecture

[Figure: manual flow from a HW-SW partition spec to a "Met Req?" check, spanning PS and PL]

Component: Tool; Language or Format
• Application: SDK; C/C++
• Driver: SDK, OS Tools; C
• Datamover / PS-PL interface: IP Integrator; IPI project
• IP: Vivado; HLS, Verilog, VHDL

Page 86: FPGA / SOC teknologi - i dag og i fremtiden


SDSoC: Full-system Generation from Exploration

C/C++ Applications to System in hours

[Figure: C/C++ code (Func1(); Func2(); Func3();) with functions selected for PL feeds SDSoC, which covers the Application and Driver (PS), the Datamover and PS-PL interface, and the IP (PL), followed by a "Met Req?" check]

Page 87: FPGA / SOC teknologi - i dag og i fremtiden


SDSoC: Embedded C/C++ Applications Programming Experience

C/C++ Development

Easy to use Eclipse IDE

One click to accelerate functions in Programmable Logic (PL)

Optimized libraries
– Xilinx, ARM and Partners
– DSP, Video, fixed point, linear algebra, BLAS, OpenCV

Support for Linux, FreeRTOS and baremetal
– Additional OS support in future releases

Page 88: FPGA / SOC teknologi - i dag og i fremtiden


SDSoC: System Level Profiling

Rapid system performance estimation
– Full system estimation (programmable logic, data communication, processing system)
– Reports SW/HW cycle level performance and hardware utilization

Automated performance measurement
– Runtime measurement by instrumentation of cache, memory, and bus utilization

Page 89: FPGA / SOC teknologi - i dag og i fremtiden


SDSoC: Full System Optimizing Compiler

Rapid software configurable application acceleration using C/C++
– Automated function acceleration in programmable logic
– Up to 100X increase in performance vs. software
– System optimized for latency, bandwidth, and hardware utilization
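A per-function speedup of up to 100X does not translate into a 100X faster application unless nearly all of the runtime sits in the accelerated functions. Amdahl's law, which the slides do not mention and is brought in here purely as an illustration, makes that relationship explicit; the fractions used below are made-up examples.

```python
def amdahl_speedup(accelerated_fraction: float, function_speedup: float) -> float:
    """Overall speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of original runtime moved to hardware (0..1).
    function_speedup: speedup of that share (e.g. 100 for "100X").
    """
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / function_speedup)


# Hypothetical profiles: 50%, 90% and 99% of runtime in accelerated functions.
for fraction in (0.5, 0.9, 0.99):
    print(f"{fraction:.0%} accelerated at 100X -> "
          f"{amdahl_speedup(fraction, 100):.2f}x overall")
```

This is why profiling comes first in the flow: the functions chosen for programmable logic should be the ones that dominate runtime.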


Page 91: FPGA / SOC teknologi - i dag og i fremtiden


Machine learning learns from exposure to data rather than from explicitly programmed rules.

Multilayer neural networks are used to develop intelligent systems.

Convolutional Neural Networks (CNNs) are used for image detection.
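The core operation of a CNN is a small kernel slid across an image, followed by a non-linearity. The sketch below shows that operation in plain Python with a hand-written vertical-edge kernel; in a trained network the kernel weights would be learned from data, and the image and kernel values here are made up for illustration.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation, the 'convolution' used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Rectified linear unit: negative responses are clamped to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# A 3x4 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Hand-written kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = relu(conv2d(image, kernel))
print(feature_map)
```

The output is non-zero only in the column where the edge sits, which is the kind of local feature a convolutional layer passes on to later layers.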


Page 94: FPGA / SOC teknologi - i dag og i fremtiden


For deployment you always need 3 things!

• Framework: free and open-source software environment used to train and optimize your network model


Page 96: FPGA / SOC teknologi - i dag og i fremtiden


Frameworks

Libraries and Tools

Development Kits

Networks shown: DNN, CNN, GoogLeNet, SSD, FCN …

Page 97: FPGA / SOC teknologi - i dag og i fremtiden


reVISION: Enabling Software Defined Development Flow

• System Optimizing Compiler
• Machine Learning
• Scheduling of Pre-Optimized Neural Network Layers
• Optimized Accelerators & Data Motion Network
• .prototxt & Trained Weights

Networks shown: DNN, CNN, GoogLeNet, SSD, FCN …

Page 98: FPGA / SOC teknologi - i dag og i fremtiden


reVISION: Enabling Software Defined Development Flow

• C/C++/OpenCL Creation
• Profiling to Identify Bottlenecks
• System Optimizing Compiler
• Computer Vision
• Machine Learning
• Scheduling of Pre-Optimized Neural Network Layers
• Optimized Accelerators & Data Motion Network
• .prototxt & Trained Weights

Networks shown: DNN, CNN, GoogLeNet, SSD, FCN …