Building Multi-Processor FPGA SystemsHands-on Tutorial to Using FPGAs and Linux
Chris Martin <[email protected]>Member Technical Staff Embedded Applications
Agenda
2
Introduction
Problem: How to Integrate Multi-Processor Subsystems
Why…– Why would you do this?– Why use FPGAs?
Lab 1: Getting Started - Booting Linux and Boot-strapping NIOS
Lab 2: Inter-Processor Communication and Shared Peripherals
Lab 3: Locking and Tetris
Building Hardware: FPGA Hardware Tools & Build Flow
Building/Debugging Software: Software Tools & Build Flow
References
Q&A – All through out.
Subsystem1
3
The Problem – Integrating Multi-Processor Subsystems
Given a system with multiple processor sub-systems, these architecture decisions must be considered:
Inter-processor communication
Partitioning/sharing Peripherals (locking required)
Bandwidth & Latency Requirements
Processor
Periph 1 Periph 2 Periph 3
Subsystem2Processor
Periph 1 Periph 2 Periph 3
???
4
Why Do We Need to Integrate Multi-Processor Subsystems?
May have inherited processor subsystem from another development team or 3rd party
– Risk Mitigation by reducing change
Fulfill Latency and Bandwidth Requirements
– Real-time Considerations– If main processor not Real-Time
enabled, can add a real-time processor subsystem
Design partition / Sandboxing– Break the system into smaller
subsystems to service task– Smaller task can be designed easily
Leverage Software Resources– Sometimes problem is resolved in less
time by Processor/Software rather than Hardware design
– Sequencers, State-machines
5
Why do we want to integrate with FPGA?(or rather, HOW can FPGAs help?)
Bandwidth & Latency can be tailored
– Addresses Real-time aspects of System Solution
– FPGA logic has flexible interconnect
– Trade Data width with clock frequency with latency
Experimentation– Many processor subsystems
can be implemented– Allows you to experiment
changing microprocessor subsystem hardware designs
– Altera FPGA under-the-hood– However: Generic Linux
interfaces used and can be applied in any Linux system.
NIOS
ARM
APeripheral
N Peripheral
SharedPeripheral
Mailbox
Simple Multiprocessor System
And, why is Altera involved with Embedded Linux…
Why is Altera Involved with Embedded Linux?
6
More than 50% of FPGA designs include an embedded processor, and growing.Many embedded designs using LinuxOpen-source re-use.
– Altera Linux Development Team actively contributes to Linux Kernel
0
20,000
40,000
60,000
80,000
100,000
120,000
Without CPU With CPU
Source: Gartner September 2010
50%
Des
ign
Sta
rts
With Embedded ProcessorWithout Embedded Processor
SoCKit Board Architecture Overview
Lab focus- UART- DDR3- LEDs- Buttons
7
FPGA Fabric “Soft Logic”
SoC/FPGA Hardware Architecture Overview
ARM-to-FPGA Bridges- Data Width
configurable FPGA
- 42K Logic Macros
- Using no more than 14%
8
A9
I$ D$
A9
I$ D$
L2
EMIF
DDR
ROM
RAM
DMA
UART
SD/MMC
AXI BridgeFPGA2HPS
AXI BridgeHPS2FPGA
AXI BridgeLWHPS2FPGA
NIOS
RAMGPIO
SYS ID
32 32/64/128
32
32/64/128
Lab 1: Getting StartedBooting Linux and Boot-strapping NIOS
9
Topics Covered:– Configuring FPGA from SD/MMC and U-Boot– Booting Linux on ARM Cortex-A9– Configuring Device Tree– Resetting and Booting NIOS Processor– Building and compiling simple Linux Application
Key Example Code Provided:– C code for downloading NIOS code and resetting NIOS from ARM– Using U-boot to set ARM peripheral security bits
Full step-by-step instructions are included in lab manual.
Lab 1: Hardware Design Overview
10
NIOS Subsystem– 1 NIOS Gen 2 processor– 64k combined instruction/data
RAM (On-Chip RAM)– GPIO peripheral
ARM Subsystem– 2 Cortex-A9 (only using 1)– DDR3 External Memory– SD/MMC Peripheral– UART Peripheral
Shared Peripherals Dedicated Peripherals
Subsystem 1
Subsystem 2
Cortex-A9
GPIO
UART
SD/MMC
NIOS 0
RAM
EMIF
Lab1: Programmer View - Processor Address Maps
Address Base Peripheral
0xFFC0_2000 ARM UART
0x0003_0000 GPIO (LEDs)
0x0002_0000 System ID
0x0000_0000 On-chip RAM
Address Base Peripheral
0xFFC0_2000 UART
0xC003_0000 GPIO (LEDs)
0xC002_0000 System ID
0xC000_0000 On-chip RAM
11
NIOS ARM Cortex-A9
Lab 1: Peripheral Registers
Peripheral AddressOffset
Access Bit Definitions
Sys ID 0x0 RO [31:0] – System ID. Lab Default = 0x00001ab1
GPIO 0x0 R/W [31:0] – Drive GPIO output.Lab Uses for LED control, push button status and NIOS processor resets (from ARM). [3:0] - LED 0-3 Control. ‘0’ = LED off . ‘1’ = LED on[4] – NIOS 0 Reset[5] – NIOS 1 Reset[1:0] – Push Button Status
UART 0x14 RO Line Status Register[5] – TX FIFO Empty[0] – Data Ready (RX FIFO not-Empty)
UART 0x30 R/W Shadow Receive Buffer Register[7:0] – RX character from serial input
UART 0x34 R/W Shadow Transmit Register[7:0] – TX character to serial output
12
Lab 1: Processor Resets Via Standard Linux GPIO Interface
NIOS resets connected to GPIO
GPIO driver uses /sys/class/gpio interface
int main(int argc, char** argv){ int fd, gpio=168; char buf[MAX_BUF];
/* Export: echo ### > /sys/class/gpio/export */ fd = open("/sys/class/gpio/export", O_WRONLY); sprintf(buf, "%d", gpio); write(fd, buf, strlen(buf)); close(fd);
/* Set direction to Out: */ /* echo "out“ > /sys/class/gpio/gpio###/direction */ sprintf(buf, "/sys/class/gpio/gpio%d/direction", gpio); fd = open(buf, O_WRONLY); write(fd, "out", 3); /* write(fd, "in", 2); */ close(fd);
/* Set GPIO Output High or Low */ /* echo 1 > /sys/class/gpio/gpio###/value */ sprintf(buf, "/sys/class/gpio/gpio%d/value", gpio); fd = open(buf, O_WRONLY); write(fd, "1", 1); /* write(fd, "0", 1); */ close(fd);
/* Unexport: echo ### > /sys/class/gpio/unexport */ fd = open("/sys/class/gpio/unexport", O_WRONLY); sprintf(buf, "%d", gpio); write(fd, buf, strlen(buf)); close(fd);}13
Lab 1: Loading External Processor Code Via Standard Linux shared memory (mmap)
NIOS RAM address accessed via mmap()
Can be shared with other processes
R/W during load Read-only protection
after load
/* Map Physical address of NIOS RAM to virtual address segment with Read/Write Access */fd = open("/dev/mem", O_RDWR);load_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xc0000000);
/* Set size of code to load */ load_size = sizeof(nios_code)/sizeof(nios_code[0]);
/* Load NIOS Code */for(i=0; i < load_size ;i++){ *(load_address+i) = nios_code[i]; }
/* Set load address segment to Read-Only */mprotect(load_address, 0x10000, PROT_READ);
/* Un-map load address segment */munmap(load_address, 0x10000);
14
Lab 2: MailboxesNIOS/ARM Communication
15
Topics Covered:– Altera Mailbox Hardware IP
Key Example Code Provided:– C code for sending/receiving messages via hardware Mailbox IP
NIOS & ARM C Code– Simple message protocol– Simple Command parser
Full step-by-step instructions are included in lab manual. – User to add second NIOS processor mailbox control.
Subsystem 1
Subsystem 2
Lab 2: Hardware Design Overview
16
NIOS 0 & 1 Subsystems– NIOS Gen 2 processor– 64k combined instruction/data
RAM– GPIO (4 out, LED)– GPIO (2 in, Buttons)– Mailbox
ARM Subsystem– 2 Cortex-A9 (only using 1)– DDR3 External Memory– SD/MMC Peripheral– UART Peripheral
Cortex-A9
GPIO
UART
SD/MMC
NIOS 0
RAM
Shared Peripherals Dedicated Peripherals
MBox
Subsystem 3
GPIO
NIOS 1
RAMMBox
GPIO
EMIF
Lab2: Programmer View - Processor Address Maps
Address Base Peripheral
0xFFC0_2000 ARM UART
0x0007_8000 Mailbox (from ARM)
0x0007_0000 Mailbox (to ARM)
0x0005_0000 GPIO (In Buttons)
0x0003_0000 GPIO (Out LEDs)
0x0002_0000 System ID
0x0000_0000 On-chip RAM
Address Base Peripheral
0xFFC0_2000 UART
0x0007_8000 Mailbox (to NIOS 1)
0x0007_0000 Mailbox (from NIOS 1)
0x0006_8000 Mailbox (to NIOS 0)
0x0006_0000 Mailbox (from NIOS 0)
0xC003_0000 GPIO (LEDs)
0xC002_0000 System ID
0xC001_0000 NIOS 1 RAM
0xC000_0000 NIOS 0 RAM
17
NIOS 0 & 1 ARM Cortex-A9
Lab 2: Additional Peripheral (Mailbox) Registers
Peripheral AddressOffset
Access Bit Definitions
Mailbox 0x0 R/W [31:0] – RX/TX Data
Mailbox 0x8 R/W [1] – RX Message Queue Has Data
[0] – TX Message Queue Empty
18
LAB 2: Designing a Simple Message Protocol
Design Decisions:- Short Length: A single 32-bit word- Human Readable- Message transactions are closed-
loop. Includes ACK/NACK Format:
- Message Length: Four Bytes- First Byte is ASCII character
denoting message type.- Second Byte is ASCII char from
0-9 denoting processor number. - Third Byte is ASCII char from 0-9
denoting message data, except for ACK/NACK.
- Fourth Byte is always null character ‘\0’ to terminate string (human readable).
Message Types:- “G00”: Give Access to UART
(Push)- “A1A”: ACK- “N1A”:NACK
Can be Extended:- “L00”: LED Set/Ready- “B00”: Button Pressed- “R00”: Request UART Access
(Pull)
19
Byte 0 Byte 1 Byte 2 Byte3
‘L’ ‘0’ ‘0’ ‘\0’
‘A’ ‘0’ ‘0’ ‘\0’
Cortex-A9 NIOS 0
“G00”
“A0A”“N0N”
Lab 2: Inter-Processor Communication with Mailbox HWVia Standard Linux Shared Memory (mmap)
Wait for Mailbox Hardware message empty flag
Send message (4 bytes) Disable ARM/Linux
Access to UART Wait for RX message
received flag Re-enable ARM/Linux
UART Access
/* Map Physical address of Mailbox to virtual address segment with Read/Write Access */fd = open("/dev/mem", O_RDWR);mbox0_address = mmap(NULL, 0x10000, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0xff260000);<snip>/* Waiting for Message Queue to empty */while((*(volatile int*)(mbox0_address+0x2000+2) & 1) != 0 ) {} /* Send Granted/Go message to NIOS */send_message = "G00";*(mbox0_address+0x2000) = *(int *)send_message;
/* Disable ARM/Linux Access to UART (be careful here)*/config.c_cflag &= ~CREAD;if(tcsetattr(fd, TCSAFLUSH, &config) < 0) { }
/* Wait for Received Message */while((*(volatile int*)(mbox0_address+2) & 2) == 0 ) {}
/* Re-enable UART Access */config.c_cflag |= CREAD;tcsetattr(fd, TCSAFLUSH, &config);/* Read Received Message */printf(" - Message Received. DATA = '%s'.\n", (char*)(mbox0_address));
20
Lab 3: Putting It All Together – Tetris!Combining Locking and Communication
21
Topics Covered:– Linux Mutex
Key Example Code Provided:– C code showcasing using Mutexes for locking shared peripheral access– C code for multiple processor subsystem bringup and shutdown
Full step-by-step instructions are included in lab manual. – User to add code for second NIOS processor bringup, shutdown and
locking/control.
Subsystem 1
Subsystem 2
Lab 3: Hardware Design Overview (Same As Lab 2)
22
NIOS 0 & 1 Subsystems– NIOS Gen 2 processor– 64k combined instruction/data
RAM– GPIO (4 out, LED)– GPIO (2 in, Buttons)– Mailbox
ARM Subsystem– 2 Cortex-A9 (only using 1)– DDR3 External Memory– SD/MMC Peripheral– UART Peripheral
Cortex-A9
GPIO
UART
SD/MMC
NIOS 0
RAM
Shared Peripherals Dedicated Peripherals
MBox
Subsystem 3
GPIO
NIOS 1
RAMMBox
GPIO
EMIF
Lab 3: Programmer View - Processor Address Maps
Address Base Peripheral
0xFFC0_2000 ARM UART
0x0007_8000 Mailbox (from ARM)
0x0007_0000 Mailbox (to ARM)
0x0005_0000 GPIO (In Buttons)
0x0003_0000 GPIO (Out LEDs)
0x0002_0000 System ID
0x0000_0000 On-chip RAM
Address Base Peripheral
0xFFC0_2000 UART
0x0007_8000 Mailbox (to NIOS 1)
0x0007_0000 Mailbox (from NIOS 1)
0x0006_8000 Mailbox (to NIOS 0)
0x0006_0000 Mailbox (from NIOS 0)
0xC003_0000 GPIO (LEDs)
0xC002_0000 System ID
0xC001_0000 NIOS 1 RAM
0xC000_0000 NIOS 0 RAM
23
NIOS 0 & 1 ARM Cortex-A9
Available Linux Locking/Synchronization Mechanisms
24
Need to share peripherals– Choose a Locking Mechanism
Available in Linux– Mutex <- Chosen for this Lab– Completions– Spinlocks – Semaphores – Read-copy-update (decent for multiple
readers, single writer)– Seqlocks (decent for multiple readers, single
writer)
Available for Linux– MCAPI - openmcapi.org
NIOS 1
Tetris Message Protocol – Extended from Lab 2
25
NIOS Control Flow: – Wait for button press– Send Button press message– Wait for ACK (Free to write
to LED GPIO)– Write to LED GPIO– Send LED ready msg– Wait for ACK
ARM Control Flow: – Wait for button press
message– Lock LED GPIO Peripheral– Send ACK (Free to write to
LED GPIO)– Wait for LED ready msg– Send ACK– Read LED value– Release Lock/Mutex
Cortex-A9NIOS 0 “B00”
“A0A”
“A0A”
“L00”
“B10”
“A1A”
“A1A”
“L10”
Lab 3: Locking Hardware Peripheral AccessVia Linux Mutex
In this example, LED GPIO is accessed by multiple processors
Wrap LED critical section (LED status reads) with:- pthread_mutex_lock()- pthread_mutex_unlock()
Also need Mutex init/destroy:- pthread_mutex_init()- pthread_mutex_destroy()
pthread_mutex_t lock;<snip – Initialize/create/start> /* Initialize Mutex */ err = pthread_mutex_init(&lock, NULL);
/* Create 2 Threads */ i=0; while(i < 1) { err = pthread_create(&(tid[i]), NULL, &nios_buttons_get, &(nios_num[i])); i++; }
<snip – Critical Section> pthread_mutex_lock(&lock); /* Critical Section */ pthread_mutex_unlock(&lock);
<snip Stop/Destroy> /* Wait for threads to complete */ pthread_join(tid[0], NULL); pthread_join(tid[1], NULL);
/* Destroy/remove lock */ pthread_mutex_destroy(&lock);
26
References
27
Altera References
System Design Tutorials:
– http://www.alterawiki.com/wiki/Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab_-_Creating_Your_AXI
3_Component
– Designing_with_AXI_for_Altera_SoC_ARM_Devices_Workshop_Lab
– Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop
– http://www.alterawiki.com/wiki/Simple_HPS_to_FPGA_Comunication_for_Altera_SoC_ARM_Devices_Workshop_-_LAB2
Multiprocessor NIOS-only Tutorial:
– http://www.altera.com/literature/tt/tt_nios2_multiprocessor_tutorial.pdf
Quartus Handbook:
– https://www.altera.com/en_US/pdfs/literature/hb/qts/quartusii_handbook.pdf
Qsys:
– System Design with Qsys (PDF) section in the Handbook
– Qsys Tutorial: Step-by-step procedures and design example files to create and verify a system in Qsys
– Qsys 2-day instructor-led class: System Integration with Qsys
– Qsys webcasts and demonstration videos
SoC Embedded Design Suite User Guide:
– https://www.altera.com/en_US/pdfs/literature/ug/ug_soc_eds.pdf
Related Articles
29
Performance Analysis of Inter-Processor Communication Methods– http
://www.design-reuse.com/articles/24254/inter-processor-communication-multi-core-processors-reconfigurable-device.html
Communicating Efficiently between QorlQ Cores in Medical Applications
– https://cache.freescale.com/files/32bit/doc/brochure/PWRARBYNDBITSCE.pdf
Linux Inter-Process Communication:– http://www.tldp.org/LDP/tlk/ipc/ipc.html
Linux locking mechanisms (from ARM):– http://infocenter.arm.com/help/index.jsp?topic=/
com.arm.doc.dai0425/ch04s07s03.html
OpenMCAPI:– https://bitbucket.org/hollisb/openmcapi/wiki/Home
Mutex Examples:– http://www.thegeekstuff.com/2012/05/c-mutex-examples/
Thank YouThank You Full Tutorial Resources Online
- Project Wiki Page: http://rocketboards.org/foswiki/Projects/BuildingMultiProcessorSystems
Includes:- Source code- Hardware source- Hardware Quartus Projects- Software Eclipse Projects
BACKUP SLIDES
Post-Lab 1 Additional Topics
Hardware Design Flow and FPGA Boot with U-boot and SD/MMC
32
Building Hardware:Qsys (Hardware System Design Tool) User Interface
33
Connections between cores
Interfaces ExportedIn/out of system
Hardware and Software Work Flow Overview
34
Inputs:– Hardware Design (Qsys or RTL or Both)
Outputs (to load on boot media):– Preloader and U-boot Images– FPGA Programmation File: Raw Binary Format (RBF)– Device Tree Blob
Quartus&
Qsys
RBF
EclipseDS-5 & Debug Tools
Device Tree
Preloader & U-Boot
SDCARD Layout
35
Partition 1: FAT– Uboot scripts– FPGA HW Designs (RBF)– Device Tree Blobs– zImage– Lab material
Partition 2: EXT3 – Rootfs
Partition 3: Raw – Uboot/preloader
Partition 4: EXT3 – Kernel src
Updating SD Cards
36
More info found on Rocketboards.org– http://www.rocketboards.org/foswiki/Documentation/GSRD141SdCard
Automated Python Script to build SD Cards:– make_sdimage.py
File Update Procedure
zImage Mount DOS SD card partition 1 and replace file with new one:$ sudo mkdir sdcard$ sudo mount /dev/sdx1 sdcard/ $ sudo cp <file_name> sdcard/ $ sudo umount sdcard
soc_system.rbf
soc_system.dtb
u-boot.scr
preloader-mkpimage.bin $ sudo dd if=preloader-mkpimage.bin of=/dev/sdx3 bs=64k seek=0
u-boot-socfpga_cyclone5.img $ sudo dd if=u-boot-socfpga_cyclone5.img of=/dev/sdx3 bs=64k seek=4
root filesystem $ sudo dd if=altera-gsrd-image-socfpga_cyclone5.ext3 of=/dev/sdx2
Post-Lab 2 Additional TopicUsing Eclipse to Debug: NIOS Software Build Tools
37
Altera NIOS Software Design and Debug Tools
38
Nios II SBT for Eclipse key features:
– New project wizards and software templates
– Compiler for C and C++ (GNU)
– Source navigator, editor, and debugger
– Eclipse project-based tools– Download code to hardware
Key Multi-Processor System Design Points
39
Startup/Shutdown– Processor – Peripheral– Covered in Lab 1.
Communication between processors– What is the physical link?– What is the protocol & messaging method?– Message Bandwidth & Latency– Covered in Lab 2
Partitioning peripherals– Declare dedicated peripherals – only connected/controlled by one
processor– Declare shared peripherals – Connected/controlled by multiple
processors– Decide Upon Locking Mechanism– Covered in Lab 3
Post Lab 3 Additional TopicAltera SoC Embedded Design Suite
Altera Software Development Tools
41
Eclipse– For ARM Cortex-A9 (ARM Development Studio 5 – Altera Edition)– For NIOS
Pre-loader/U-Boot Generator
Device Tree Generator
Bare-metal Libraries
Compilers– GCC (for ARM and NIOS)– ARMCC (for ARM with license)
Linux Specific– Kernel Sources– Yocto & Angstrom recipes: http://
rocketboards.org/foswiki/Documentation/AngstromOnSoCFPGA_1 – Buildroot: http://
rocketboards.org/foswiki/Documentation/BuildrootForSoCFPGA
System Development Flow
FPGA Design Flow Software Design Flow
42
HardwareDevelopment
SoftwareDevelopment
Release Release• Quartus II Programmer• In-system Update
• Flash Programmer
Simulate Simulate• ModelSim, VCS, NCSim, etc.• AMBA-AXI and Avalon bus
functional models (BFMs)
Debug Debug• SignalTap™ II logic
analyzer• System Console
• GDB, Lauterbach, Eclipse
• Quartus II design software• Qsys system integration
tool• Standard RTL flow• Altera and partner IP
• Eclipse• GNU toolchain• OS/BSP: Linux, VxWorks• Hardware Libraries• Design Examples
Design Design
Inside the Golden System Reference Design
43
Complete system example design with Linux software support
Target Boards:– Altera SoC Development Kits– Arrow SoC Development Kits– Macnica SoC Development Kits
Hardware Design:– Simple custom logic design in
FPGA– All source code and Quartus II /
Qsys design files for reference
Software Design:– Includes Linux Kernel and
Application Source code– Includes all compiled binaries
---Topics – Back Up---
44
Introductions: Altera and SoC FPGAs
Development Tools– How to Build Hardware: FPGA Hardware Tools & Build Flow– How to Build Software: Software Tools & Build Flow– How to Debug all-of-the-above: Debug Tools
Key Multi-processor System Design Points
Hardware design– Shared peripherals– Available Hardware IP
Software design– Message Protocols– Linux tools/mechanism available today
Quartus – Hardware Development Tool
Quartus II User Interface
Quartus II main window provides a high level of visibility to each stage of the design flow- Project navigator provides direct
visual access to most of the key project information
- Tasks window allows you to use the tools and features of the Quartus II software and monitor their progress from a flow-based layout
- Tool View window shows various tools and design files
- Messages window outputs messages from each process of the run
46
Project Navigator
Tasks window
Tool View window
Messages window
47
Typical Hardware Design Flow
Design entry/RTL coding and early pin planning• Behavioral or structural description of design• Early pin planning allows board development in parallel
Functional verification
• Verify design behavior
Synthesis (mapping)• Translate design into device-specific primitives• Optimization to meet required area and
performance constraints
Placement and routing (fitting)• Place design in specific device resources with reference to
area and performance constraints• Connect resources with routing lines
Timing analysis• Verify performance specifications were met• Static timing analysis
Functional verification• Verify design will work in
target technology
PC board simulation and test• Simulate board design• Program and test device on board• On-chip tools for debuggingIn-system debugIn-system debug
Functional verificationFunctional verification
Functional verificationFunctional verification
Design creationDesign creation
Project creationProject creation
Project definition
Design compilationDesign compilation
Logic
I/O
Memory
Quartus II Feature Overview
48
Fully integrated development tool– Multiple design entry methods
Includes intellectual property- (IP-) based system design
– Up-front I/O assignment and validationEnables printed circuit board (PCB) layout early in the design process
– Incremental compilationReduces design compilation and improves timing closure
– Logic synthesisIncludes comprehensive integrated synthesis solutionAdvanced integration with third-party EDA synthesis software
– Timing-driven placement and routing
– Physical synthesisImproves performance without user intervention
– Verification solutionTimeQuest timing analyzerPowerPlay power analysis and optimizationFunctional simulation
– On-chip debug and verification suite In-system debugIn-system debug
Functional verificationFunctional verification
Functional verificationFunctional verification
Design creationDesign creation
Project creationProject creation
Project definition
Design compilationDesign compilation
Logic
I/O
Memory
49
Quartus II Feature Overview (1/2)Feature Quartus II Software
Project creation New project wizard
Design entry
HDL editor Schematic editor State machine editor MegaWizard™ Plug-In Manager
– Customization and generation of IP Qsys system integration tool
Design constraint assignments Assignment editor Pin planner Synopsys Design Constraint (SDC) editor
Synthesis Quartus II Integrated Synthesis (QIS) Third-party EDA synthesis Design assistant
Fitting and placing design into FPGA to meet user requirements
Fitter (including physical synthesis)
Design analysis and debug Netlist viewers Advisors
Power analysis PowerPlay power analyzer
50
Quartus II Feature Overview (2/2)
Feature Quartus II Software
Static timing analysis on post-fitted design TimeQuest timing analyzer
Viewing and editing design placement Chip Planner
Functional verification ModelSim®-Altera edition Third-party EDA simulation tools
Generation of device programming file Assembler
On-chip debug and verification
SignalTapTM II (embedded logic analyzer) In-system memory content editor Logic analyzer interface editor In-system sources and probes editor SignalProbe pins Transceiver Toolkit External memory interface toolkit
Technique to optimize design and improve productivity
Quartus II incremental compilation Physical synthesis optimization Design Space Explorer (DSE)
Quartus II Subscription Edition vs. Web Edition
51
Subscription Edition Web Edition
Device supported
MAX series devices: All (Excluding MAX7000 / 3000)
Cyclone III/IV/V FPGAs: All Arria II/V FPGAs: All Stratix III, IV, V FPGAs: All Cyclone V SoCs: All
MAX series devices: All (Excluding MAX7000 / 3000)
Cyclone V FPGAs: All (Excluding 5CEA9, 5CGXC9, and 5CGTD9)
Cyclone III/IV FPGAs: All Arria II GX FPGA: EP2AGX45 Cyclone V SoCs: All
Software features: Incremental compilation
and team-based design SSN Analyzer Transceiver Toolkit
Yes No
SignalTap II, SignalProbe Yes If TalkBack feature is enabled
Multi-processor support Yes If TalkBack feature is enabled
IP Base Suite MegaCore® functions Yes
No license required for OpenCore Plus hardware evaluation
License fee required for production use
Platform support Windows 32/64-bitLinux 32/64-bit
Windows 32/64-bitLinux 32/64-bit
License and maintenance terms
Perpetual(continues to work after
expiration)No license required except for IP core
Price $ Free
52
How to Get Started Using Quartus II Software
Download Quartus II software today and start designing with Altera programmable logic devices
Quartus II Handbook - http://www.altera.com/literature/lit-qts.jsp – Guides you through the programmable logic design cycle from design to
verification– Also covers third-party EDA vendor tool interfaces
Online demonstrations - http://www.altera.com/quartusdemos – Easiest way to learn about the latest Quartus II software features and
design flows
Training classes - https://mysupport.altera.com/etraining– Offers online training classes and live presentation coupled with hands-on
exercises to learn about Quartus II features and design flows
Agenda
Qsys – System Integration Platform
Qsys System Integration Platform
54
Avalon® Interfaces
AMBA® AXI, APB, AHB
High-Performance Interconnect
Based on Network-on-a-Chip (NoC)Architecture
Hierarchy Design Reuse
DesignSystem
Add toLibrary
Package as IP
Real-Time System Debug
Industry-Standard Interfaces
®
Qsys is Altera’s design environment for- Deployment of IP, with hierarchal support- Development platform for Altera custom solutions - Design platform for customers to quickly create system designs
Automated Testbench Generation
Qsys User Interface
55
Toolbar
Interfaces Exportedfor Hierarchy
Improved Validation Display
56
Qsys Benefits
Raises the level of design abstraction– System-level design and system visualization
Simplifies complex hierarchal system development– Automated interconnect generation
Provides a standard platform– IP integration, custom IP authoring, IP verification
Enables design re-use
Reduces time to market– System-level design reduces development time– Facilitates verification
Qsys improves productivity
57
Transport Layer Transfers packets to destination
Transaction Layer Converts command
packets to transactions and responses to response packets
Transaction Layer Converts transactions to
command packets and responses packets to responses
Network-on-Chip Architecture
MasterNetworkInterface
MasterNetworkInterface
Avalon-STAvalon-MMAXI-MM
Avalon-MMAXI-MM
MasterInterface
MasterInterface
Avalon STNetwork
(Response)
Avalon STNetwork
(Command)
SlaveNetworkInterface
SlaveNetworkInterface
SlaveInterface
SlaveInterface
58
Benefits of Network-On-Chip Approach
See white paper: Applying the Benefits of NoC Architecture to FPGA System Design
Independent implementation of transaction/transport layers– Different transport layer network topologies can be implemented without
transaction layer modificatione.g. High performance components on a wide high-frequency crossbar network
Supports standard interface interoperability– Mix and match interface types on transaction layer without transport layer
modification
Scalability– Segment network into sub-networks using
Bridges
Clock crossing logic
Industry-Standard Interfaces
59
Developer Standard Interface Protocol
Avalon® Interfaces
AMBA® AXI, AMBA APB, and AMBA AHB®
Qsys supports mixing of different interfaces
Target Qsys Applications
Qsys can be used in almost every FPGA design
Designs fall into two categories– Control plane
Memory mapped
Reading and writing to control and status registers
– Data planeStreaming
Data switching (muxing, demuxing), aggregation, bridges
61
“Packets………I care about Latency!”
Qsys packet format is wide– Packet format contains a complete transaction in a single clock cycle– Supports:
Writes with 0 cycles of latency
Reads with a round-trip latency of 1 cycle– You can control latency via Qsys configuration
Separate command and response network– Increases concurrency
Command traffic and Response traffic don’t compete for resources
62
Qsys: Wide Range of Compliant IP
Wide range of plug-and-play intellectual property (IP):
– Interface protocol IPE.g. PCIe, Ethernet 10/100/1000 Mbps (Triple-Speed Ethernet), Interlaken, JTAG, UART, SPI
– External memory interface IPE.g. DDR/DDR2/DDR3
– Video and imaging processing (VIP) IPE.g. VIP Suite including scaler, switch, deinterlacer, and alpha blending mixer
– Embedded processor IPE.g. Hardened ARM processor system, Nios II processor
– Verification IPE.g. Avalon-MM/-ST, AXI4, APB
>100 Qsys compliant IP available
Qsys as a Platform for System Integration
63
Connect IP and Systems
Automate Error-Prone Integration Tasks
Simplify Integration
Accelerate Development HDL
IP 1Custom 1
IP 2IP 3Custom 2
Interface protocols Memory DSP Embedded Bridges PLL Custom systems
Library of Available IP
64
Additional Resources
Watch online demos (3-5 min) www.altera.com/qsys
Complete the Qsys tutorial (2-3 hrs) www.altera.com/qsys
Watch free webcasts (10-15 mins)www.altera.com/qsys
Sign up for Qsys training www.altera.com/training
In-system Verification
Debug Challenges
66
Accessing and viewing internal signals
Not enough pins to use as test points
Capabilities in creating trigger conditions that correctly capture data
Verification of standard or proprietary protocol interfaces
Overall design process bottleneck
Debug Can Be Costly
On-chip Debug
67
Access and view internal signals
Store captured data in FPGA embedded memory
Use JTAG interface as debug ports
Incrementally add internal signals to view
Reduce Debug Cycles by
Using On-chip Debug Tools
On-chip Debug Technology
68
Debug tools communicate with the FPGA via standard JTAG interface
Multiple debug functions can share the JTAG interface simultaneously
– Altera’s system-level debugging (SLD) hub technology makes this possible
– All Altera tools and some third-party tools support the SLD hub JTAG interface
Download Cable
FPGA
JTAGTap
Controller
Node2
Node1
User's Design
(Core Logic)
NodeNSLD
Hub
NodeN-1
On-chip Debug Tools in Quartus II Software
69
SignalTap II logic analyzer– Captures and displays hardware events, fast turnaround times– Incrementally creates trigger conditions and adds signals to view– Uses captured data stored in on-chip RAM and JTAG interface for communication
In-system memory content editor– Displays content of on-chip memory– Enables modification of memory content in a running system
External logic analyzer interface– Uses external logic analyzer to view internal signals– Dynamically switches internal signals to output
In-system sources and probes– Stimulate and monitor internal signals without using on-chip RAM
Exception: SignalProbe incremental routing feature does not use JTAG interface (i.e. SLD hub technology)
– Quickly routes an internal node to a pin for observation
SignalTap II Logic Analyzer
70
Provides the most advanced triggering capabilities available in an FPGA-embedded logic analyzer
Proven to be invaluable in the lab– Captures bugs that would take weeks of simulation to uncover
Has broad customer adoptionFeatures and benefits
– An embedded logic analyzerUses available internal memory
– Probes state of internal signals without using external equipment or extra I/O pins
– Incremental compilation supportFast turnaround time when adding signals to view
– Advanced triggering for capturing difficult events/transactions– Power-up trigger support
Debug the initialization code– Megafunction support
Optionally, instantiate in HDL
71
In-system Memory Content Editor
Enables FPGA memory content and design constants to be updated in-system, via JTAG interface, without recompiling a design or reconfiguring the rest of the FPGA
– Fault injection into system– Update memory while system is running – Change value of coefficients in DSP applications– Easily perform “what if?” type experiments in-system in just seconds
Supports MIF and HEX formats for data interchange
Megafunctions supported– LPM_CONSTANT, LPM_ROM, LPM_RAM_DQ, ALTSYNCRAM (ROM and single-
port RAM mode)
Enable memory content editor
Enable memory content editor
72
In-system Memory Content Editor
Under Tools menu In-system Memory Content Editor
Altera SoC Embedded Design Suite
Included in SoC Embedded Design Suite (EDS)
74
Development Studio 5 Altera Edition– Awesome debugger, especially when combined with
USB Blaster II
Altera SoC FPGA System Trace Macrocells– Application development environment– Streamline system analyzer
Hardware Libraries
GNU-based bare-metal (EABI) compiler tools
U-Boot
Root file system to jump start software development
Pre-built Linux kernel– http://www.rocketboards.org for source trees and community access
System Development Flow
FPGA Design Flow Software Design Flow
75
HardwareDevelopment
SoftwareDevelopment
Release Release• Quartus II Programmer• In-system Update
• Flash Programmer
Simulate Simulate• ModelSim, VCS, NCSim, etc.• AMBA-AXI and Avalon bus
functional models (BFMs)
Debug Debug• SignalTap™ II logic
analyzer• System Console
• GNU, Lauterbach, DS5
• Quartus II design software• Qsys system integration
tool• Standard RTL flow• Altera and partner IP
• ARM Development Studio 5
• GNU toolchain• OS/BSP: Linux, VxWorks• Hardware Libraries• Design Examples
Design Design
Altera SoC Embedded Design Suite
FPGA Design Flow Software Design Flow
76
HardwareDevelopment
SoftwareDevelopment
Release Release• Quartus II Programmer• In-system Update
• Flash Programmer
Simulate Simulate• ModelSim, VCS, NCSim, etc.• AMBA-AXI and Avalon bus
functional models (BFMs)• Virtual Target
Debug Debug• SignalTap™ II logic
analyzer• System Console
• GNU, Lauterbach, DS5
• Quartus II design software• Qsys system integration
tool• Standard RTL flow• Altera and partner IP
• ARM Development Studio 5
• GNU toolchain• OS/BSP: Linux, VxWorks• Hardware Libraries• Design Examples
Design Design
Software Development
HW/SW Handoff
FPGA-Adaptive Debugging
Altera SoC Embedded Design Suite
77
Comprehensive Suite SW Dev Tools
Hardware / software handoff tools
Linux application development– Yocto Linux build environment– Pre-built binaries for Linux / U-Boot– Work in conjunction with the Community Portal
Bare-metal application development– SoC Hardware Libraries– Bare-metal compiler tools
FPGA-adaptive debugging– ARM DS-5 Altera Edition Toolkit
Design examples
Firmware Development
Hardware-to-Software Handoff
Linux Application
Development
FPGA-Adaptive
Debugging
Free Web Edition Subscription Edition Free 30-day Eval
Hardware-to-Software Handoff
78
system.sopcinfosystem.iswinfo
Device Tree Generator
Preloader Generator
.c & .h source files
Linux Device Tree
Qsys system info, SDRAM calibration files, ID / timestamp, HPS IOCSR data
Hardware
Software
Hardware / Software Handoff Tools
79
Allow hardware and software teams to work independently and follow their familiar design flows
Take Altera Quartus® II / Qsys output files and generate handoff files for the software design flow
Device Tree standard specifies hardware connectivity so that Linux kernel can boot up correctly
Linux Application Development
80
Yocto build support for Linux– Yocto standard enables open, versatile, and
cost-effective embedded software development– Allows a smooth transition to commercial Linux distributions
Pre-built Linux kernel, U-Boot, and root file system to jump start software development
– Link to community portal for source trees and community access
81
Bare-metal Application Development
Hardware Libraries– Software interface to all system
registers
– Functions to configure some basic system operations (e.g. clock speed settings, cache settings, FPGA configuration, etc.)
– Support board bring-up and diagnostics development
– Can be used by bare-metal application, device drivers, or RTOS
GNU-based bare-metal (EABI) compiler tools
Operating System
SoC FPGA
Application
HALBMALPAL
BSP
Bare-metal App
Hardware Libraries
Golden System Reference Design
82
Complete system design with Linux software support
– Simple custom logic design in FPGA
– All source code and Quartus II / Qsys design files for reference
– Include all compiled binaries- example can run on an Altera SoC Development Kit to jumpstart development
DS-5 Altera Edition- One Tool, Three Usages
83
• JTAG-Based Debugging
• Board Bring-up
• OS porting, Drivers Dev,
• Kernel Debug
• Application Debugging
• Linux User Space Code
• RTOS App Code
• System Integration
• System Debug
• FPGA-Adaptive Debugging
1
2
3
84
One Device, Two Debugging Tools?
Dedicated JTAG connection Visualize & control CPU
subsystem
JTAG
Dedicated JTAG connection Visualize & control FPGA
ARM® DS-5™ Toolkit Altera Quartus™ II Software
JTAGDSTREAM™
85
One Device, Two Debugging Tools?
Dedicated JTAG connection Visualize & control CPU
subsystem
JTAG
Dedicated JTAG connection Visualize & control FPGA
ARM® DS-5™ Toolkit Altera Quartus™ II Software
JTAGDSTREAM™
Debugging
Barrier
No single tool/cable to visualize
and control both CPU and
FPGA domains
No way for CPU and FPGA to
cross trigger and correlate
hardware and software events
No “fixed” debugger can
address the needs of
“changing” FPGA hardware
Industry First: FPGA-Adaptive Debugging
86
ARM® Development Studio 5 (DS-5™) Altera® Edition ToolkitRemoves debugging barrier between CPUs and FPGA Exclusive OEM agreement between Altera and ARMResult of innovation in silicon, software, and business model
Altera USB-Blaster™II
Connection
87
FPGA-Adaptive Debugging Features
Single USB-Blaster II cable for
simultaneous SW and HW debug
Automatic discovery of FPGA peripherals
and creation of register views
Hardware cross-triggering between the CPU and FPGA domains
Correlation of CPU software instructions and FPGA hardware events
Simultaneous debug and trace for Cortex-A9 cores and CoreSight™-compliant cores in FPGA
Statistical analysis of software load and bus traffic spanning the CPUs and FPGA
88
DS-5 Altera Edition Productivity-Boosting Features
Industry’s most advanced multicore debugger for ARM
JTAG based system-level debugging, gdbserver-based application debuggingin one package
Yocto plugin to enableLinux based application development
Integrated OS-aware analysisand debug capability
Visualization of SoC Peripherals
89
Register views assist the debug ofFPGA peripherals
– File generated by FPGA tool flow
– Automatically imported in DS-5 Debugger
Debug views for debug of software drivers
– Self-documenting– Grouped by peripheral,
register and bit-field
CMSIS
Peripheral register descriptions
90
FPGA-Adaptive, Unified DebuggingFPGA connected to debug and trace buses for non-intrusive capture and visualization of signal events
Simultaneous debug and trace connection to CPU coresand compatible IP
CorrelateFPGA signalevents withsoftware eventsand CPUinstructiontrace usingtriggers andtimestamps
Cross-Domain Debug 1
91
Trigger from software world to FPGA world
HARDWARE TRIGGER!
SOFTWARE TRIGGER
Cross-Domain Debug 2
92
Trigger from FPGA world to software world
HARDWARE TRIGGER
EXECUTION STOPOR
SW TRACE TRIGGER
EXECUTION STOPOR
HW TRACE TRIGGER
Correlate HW and SW Events
93
Debug event trigger point set from either:
SignalTap™ II Logic Analyzer
or DS-5 debugger
Captured trace can then be analyzed using timestamp-correlated events
SignalTap II Logic Analyzer
ARM® DS-5™ Toolkit
Timestamp Correlated
System-LevelPerformance Analysis
94
Performance bottlenecks in SoCs often come from the CPU interaction with the rest of the SoC
Streamline visualizes software activity with performance counters from the SoC and FPGA to enable full system-level analysis
Streamline only requires a TCP/IP connection to the SoC
ARM® DS-5™ Streamline
Linux OS Counters
Processor Counters,Aggregated, or Per Core
FPGA Block Counters
Power Consumption
Process/Thread Heat Map
Application Events
95
Altera SoC EDS- Key Benefits
One-stop shop from Altera
All the tools and examples for rapid starts
Familiar tools interface, easy to use
Share tools and knowledge to increase team productivity
Best multicore debugger tools for ARM architectureUnprecedented visibility and control acrossprocessor cores and across CPU, FPGA domains
Faster time to market, lower development costs!
Target Users and Usages
96
Web Edition
Subscription Edition
Board Bring-up Yes
Device Drivers Dev Yes
OS Porting Yes
Baremetal Programming Yes
RTOS Based App Dev Yes
Linux Based App Dev Yes Yes
Multicore App Debugging Yes
System Debugging Yes
SoC EDS Editions Summary
97
Component Key Feature Web Edition
SubscriptionEdition
30-DayEvaluation
Hardware/Software Handoff Tools
Preloader Image Generator x x x
Flash Image Creator x x x
Device Tree Generator (Linux) x x x
ARM DS-5 Altera Edition
Eclipse IDE x x x
ARM Compiler* x
Debugging over Ethernet (Linux) x x x
Debugging over USB-Blaster II JTAG x x
Automatic FPGA Register Views x x
Hardware Cross-triggering x x
CPU/FPGA Event Correlation x x
Compiler Tool Chains Linaro Tool Chain (Linux) x x x
CodeBench Lite EABI (Bare-metal) x x x
Hardware Libraries Bare-metal programming Support x x x
SoC Programming Examples
Golden System Reference Design x x x
*ARM Compiler is available in DS-5 Professional Edition, available directly from ARM
Coordinated Multi-Channel Delivery
98
Altera.com
Quartus II Programmer SignalTap II
Altera.com
Pre-built Binaries• Kernel• U-Boot• Yocto• Minimal RFS• Tool chains• Handoff tools• HW Libraries• Examples• Documentation
RocketBoards.org
Frequent Updates• Kernel source• U-Boot source• Yocto source• RFS source• Toolchain
source• Public git• Wiki• Mailman
Partners
BSPsMiddleware3rd Party Tools
Altera NIOS Software Design Tools
99
Nios II SBT for Eclipse key features:
– New project wizards and software templates
– Compiler for C and C++ (GNU)
– Source navigator, editor, and debugger
– Eclipse project-based tools
Top Related