© ARM 2016
ARM DS-5 support for Heterogeneous Platforms
Ronan Synnott
NXP FTF
Select Core Competency FAE
May 2016
Agenda
ARM Heterogeneous Systems
DS-5 Development Studio (DS-5)
Which Compiler?
Debug Capabilities
What do we mean by ARM Heterogeneous Systems?
Not all ARM processors are the same
Different Instruction Sets
Different Programmer's Models
Different Debug Capabilities and Needs
Heterogeneous designs are increasing sharply, driven by system-level requirements
Shared resources and improved interoperability
Power Efficiency
Security
Reduce overall BOM
How can we work efficiently with such a system?
ARM® Cortex®-A Portfolio (Q1 2016)

ARMv8-A
Cortex-A72: Highest performance 64/32-bit CPU
Cortex-A57: Proven high-performance 64/32-bit CPU
Cortex-A53: Balanced performance and efficiency 64/32-bit CPU
Cortex-A35: Highest efficiency 64/32-bit CPU
Cortex-A32: Smallest and lowest power 32-bit ARMv8-A

ARMv7-A
Cortex-A17: High-performance with lower power and smaller area relative to Cortex-A15
Cortex-A15: High-performance with infrastructure feature set
Cortex-A9: Well established mid-range processor used in many markets
Cortex-A7: Most efficient ARMv7-A CPU, higher performance than Cortex-A5
Cortex-A5: Smallest and lowest power ARMv7-A CPU, optimized for single-core

[Diagram: the cores are arranged in three tiers (High Performance, High Efficiency, Ultra High Efficiency); a key marks the big.LITTLE compatible cores]
ARM® Cortex®-R and Cortex-M Portfolio (Q1 2016)

Cortex-R
Cortex-R4: Real-time performance
Cortex-R5: Real-time performance with functional safety
Cortex-R7: High performance 4G modem and storage
Cortex-R8: Highest performance 5G modem and storage

Cortex-M
Cortex-M0: Lowest cost, low power
Cortex-M0+: Highest energy efficiency
Cortex-M3: Performance efficiency
Cortex-M4: Mainstream control & DSP
Cortex-M7: Maximum performance control & DSP

SecurCore
SC000: Optimized area, anti-tampering
SC300: Performance, anti-tampering
CoreLink IP to ease Integration
[Diagram: an example SoC built around the CoreLink™ CCN-512 Cache Coherent Network, with a snoop filter and 1-32MB L3 cache. Attached to it are Cortex-A57 and Cortex-A53 clusters (each port takes a Cortex CPU or CHI master), four DMC-520 memory controllers (x72 DDR4-3200), NIC-400 network interconnects, CoreLink MMU-500 for I/O virtualisation, a GIC-500, ACE and AHB connections, and peripherals including DSP, PCIe, 10-40 GbE, DPI, Crypto, SATA, USB, Flash, SRAM and GPIO]
CoreSight Multi-Core Infrastructure
Stop all cores at the same time to see the whole system state at that point
Use triggers to correlate different trace streams and gain visibility of real-time interaction between cores, which is not possible when stepping
Simultaneous trace of asynchronous cores, buses and intelligent peripherals
AMBA AHB trace and debug
Triggers correlate separate traces
Configured via JTAG or target-resident software
Output to trace port and/or embedded trace buffer
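[Diagram: CoreSight multi-source trace. A JTAG or SWD connection reaches the DAP, which drives the debug bus (APBv3). The Cortex-M and Cortex-A cores sit on the AMBA AXI/AHB interconnect, and each has a Cross Trigger Interface connected to the cross trigger matrix. The PTM, ETM and AHB trace sources feed the trace bus (ATB), which passes through a funnel to the trace buffer and trace port]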
DS-5 Development Studio
Integrated tool suite for software development on any ARM-based SoC
ARM Compiler: Optimized code generation and embedded libraries for ARM CPUs
Streamline Analyzer: System-wide performance analysis covering CPU, GPU, system and software
DS-5 Debugger: Multi-core, multi-cluster with OS awareness and CoreSight trace
DS-5 IDE: Based on industry-standard Eclipse CDT, with hundreds of compatible plug-ins
Choosing the Right Tool for the Job
Compilation technologies based on gcc, armcc and LLVM are all relevant for ARM
armcc: Deeply embedded, pre-ARMv8
gcc: Linux
LLVM: Next-gen embedded, Android
Where does ARM Compiler fit in?
[Diagram: software stacks running on the CPU(s), grouped under "Platforms" and "Bare metal". Layers shown include bootloader, Linux, Android, Windows, RTOS, secure RTOS, non-secure application, application and middleware, apps, and a validation test suite. Check marks indicate the layers that ARM Compiler targets]
Overview

ARM Compiler 5
Developed by ARM over 25 years
Supports all ARM cores up to v7-A
Mature, stable, safety certified
Long-term supported and maintained

ARM Compiler 6
Next generation ARM Compiler
Supports the entire ARM Cortex family, including ARMv8
New features and performance enhancements
Based on LLVM infrastructure
TÜV SÜD Certification for Functional Safety
The ARM Compiler 5 toolchain is certified by TÜV SÜD, a recognized safety industry expert
The certification enables customers to apply the ARM Compiler build tools for safety-related development up to SIL 3 (IEC 61508) and ASIL D (ISO 26262) without further qualification activities
The certified compiler is relevant to anyone doing safety-related software development on ARM-based devices
Includes safety-related applications in Automotive, Industrial, Medical, Railway, or Avionics
The accompanying Qualification Kit is applicable to other safety standards (not limited to ISO 26262 and IEC 61508)
ARM Compiler Qualification Kit
The Qualification Kit augments the TÜV certificate and the report accompanying the certificate
Safety manual referenced in the TÜV report
New defects reported within the Defect Report
Some customers don't accept 3rd party certifications and insist on qualifying the toolchain themselves
QK provides relevant evidence and usage guidelines
Useful for customers targeting safety standards other than ISO 26262 or IEC 61508
Multi-Core Debug and Trace
See call stack, registers, disassembly and trace for each core or thread
An SMP cluster is memory coherent: memory, breakpoints and code are global
Visualize ETM or PTM trace per core
OS and RTOS Support in DS-5 Debugger
Focus the debug context on a task or stack frame
Task-aware and pending breakpoints
OS tasks and resources
OS Availability
Linux Now
Android Now
FreeRTOS Now
Keil RTX Now
MQX Now
uCOS-II and III Now
ThreadX Now
Segger embOS Now
Quadros RTXC Now
Nucleus Now
eForce Now
API for custom OS Now
Memory Management Unit View
Translation Tables
Memory Map
Supports a multitude of translation regimes
Secure and non-secure memory
Exception (EL) and privilege (PL) levels
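To give a feel for what a translation-table view has to decode, here is a minimal sketch that splits a virtual address into per-level table indices. It assumes one specific regime only: an ARMv8-A AArch64 stage 1 translation with a 4KB granule and a 48-bit virtual address, where each level indexes 512 entries. The function name `split_va_4k` is hypothetical, and this is not how DS-5 implements its view; other granules and translation regimes use different bit ranges.

```python
def split_va_4k(va: int) -> dict:
    """Split a 48-bit virtual address into table indices (4KB granule assumed)."""
    if not 0 <= va < (1 << 48):
        raise ValueError("expected a 48-bit virtual address")
    return {
        "l0": (va >> 39) & 0x1FF,   # bits [47:39] index the level 0 table
        "l1": (va >> 30) & 0x1FF,   # bits [38:30] index the level 1 table
        "l2": (va >> 21) & 0x1FF,   # bits [29:21] index the level 2 table
        "l3": (va >> 12) & 0x1FF,   # bits [20:12] index the level 3 table
        "offset": va & 0xFFF,       # bits [11:0] are the offset within the 4KB page
    }


if __name__ == "__main__":
    # Build an address from known indices so the decoding is easy to check by eye.
    example_va = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
    print(hex(example_va), split_va_4k(example_va))
```

A debugger view performs this kind of decoding for the regime that is active at the current exception level and security state, then follows the descriptors it finds at each level.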
ARM CoreSight™ Trace Support
Follow your program execution step-by-step
Advantages of CoreSight trace support in DS-5
Zero-overhead ETM/PTM execution trace
Trace-to-source code matching
Instruction-based profile summary
Support for cycle accurate and system-wide timestamps
Trace triggering and filtering for buffer preservation
Log view for non-intrusive or low-intrusion instrumentation via ITM or STM
Stores up to 4GB of compressed trace in a DSTREAM unit
Up to 16-bit parallel trace with DSTREAM
Up to 20 Gbps serial trace capture with DSTREAM HSSTP probe
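The buffer size and capture rate quoted above can be put side by side with a little arithmetic. The sketch below estimates how long a 4GB buffer lasts if the serial link ran continuously at its 20 Gbps peak. It assumes "GB" means 2**30 bytes and that the link is saturated, which real trace streams usually are not; the function and constant names are illustrative only.

```python
def seconds_to_fill(buffer_bytes: int, rate_bits_per_s: float) -> float:
    """Time for a capture link running flat out to fill a trace buffer."""
    if rate_bits_per_s <= 0:
        raise ValueError("rate must be positive")
    return buffer_bytes * 8 / rate_bits_per_s


GIB = 2**30                 # assumption: the slide's "GB" is read as 2**30 bytes
BUFFER_BYTES = 4 * GIB      # "up to 4GB" of trace storage
HSSTP_RATE = 20e9           # "up to 20 Gbps" serial trace capture

if __name__ == "__main__":
    t = seconds_to_fill(BUFFER_BYTES, HSSTP_RATE)
    print(f"{t:.2f} s to fill the buffer at peak rate")
```

Under these assumptions the buffer fills in under two seconds, which is one way to see why trace triggering and filtering are listed as a means of preserving the buffer.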
Advanced CoreSight Support I
Flexible Cross-Triggering Configuration
Support for hardware-based run control synchronization across multiple cores
[Diagram: Core 0, Core 1 and other components, each connected to its own Cross Trigger Interface (CTI)]
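The idea behind cross-triggered run control can be sketched as a toy software model: when one core halts, its CTI raises an event on a shared channel, and every other CTI listening on that channel halts its own core. The classes below are hypothetical and model only the concept; they do not represent CoreSight registers or any DS-5 API.

```python
class CrossTriggerMatrix:
    """Toy model: broadcasts a channel event to every attached interface."""

    def __init__(self):
        self._interfaces = []

    def attach(self, cti):
        self._interfaces.append(cti)

    def broadcast(self, channel, source):
        for cti in self._interfaces:
            if cti is not source:
                cti.on_channel_event(channel)


class Core:
    def __init__(self, name):
        self.name = name
        self.halted = False


class CrossTriggerInterface:
    """Connects one core to the matrix and listens on one halt channel."""

    def __init__(self, core, matrix, halt_channel=0):
        self.core = core
        self.matrix = matrix
        self.halt_channel = halt_channel
        matrix.attach(self)

    def core_halted(self):
        # The local core stopped (for example on a breakpoint): tell the others.
        self.core.halted = True
        self.matrix.broadcast(self.halt_channel, source=self)

    def on_channel_event(self, channel):
        # Halt the local core only if the event arrived on our channel.
        if channel == self.halt_channel:
            self.core.halted = True


def demo():
    matrix = CrossTriggerMatrix()
    cores = [Core("core0"), Core("core1"), Core("other")]
    ctis = [CrossTriggerInterface(core, matrix) for core in cores]
    ctis[0].core_halted()              # core0 hits a breakpoint
    return [core.halted for core in cores]


if __name__ == "__main__":
    print(demo())
```

Giving a CTI a different `halt_channel` in this model leaves its core running, which mirrors the "flexible configuration" point: only the components wired to the same channel stop together.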
Advanced CoreSight Support II
Synchronization of trace streams via CoreSight timestamps
[Diagram: a global timestamp unit supplies a common timestamp to the ETM and PTM, which trace the CPUs, and to the STM, which traces system events]
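Once every trace source stamps its events from the same global counter, combining the streams is a merge of sorted sequences. The following sketch shows that step with Python's standard `heapq.merge`; the `TraceEvent` type, the source names and the payload strings are made up for illustration, and each input stream is assumed to be already ordered by timestamp.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class TraceEvent(NamedTuple):
    timestamp: int   # value of the shared global timestamp counter
    source: str      # which trace source produced the event
    payload: str     # decoded event description (illustrative)


def merge_by_timestamp(*streams: Iterable[TraceEvent]) -> Iterator[TraceEvent]:
    """Interleave per-source streams into one timeline ordered by timestamp."""
    return heapq.merge(*streams, key=lambda event: event.timestamp)


if __name__ == "__main__":
    etm = [
        TraceEvent(100, "ETM", "branch to main"),
        TraceEvent(250, "ETM", "branch to irq_handler"),
    ]
    stm = [
        TraceEvent(180, "STM", "DMA complete"),
        TraceEvent(300, "STM", "mailbox write"),
    ]
    for event in merge_by_timestamp(etm, stm):
        print(event.timestamp, event.source, event.payload)
```

Without a common time base the same merge would need per-source offset and drift correction first, which is the problem the global timestamp unit removes.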
Support for all Stages of Product Development
VSTREAM™: Virtual debug interface for RTL simulators and emulators
Fast Models: Virtual debug interface for virtual platforms
DSTREAM™: High performance debug and trace for FPGA and silicon
ULINKpro D: Low cost debug for MCUs (catalog MCU/MPU)
TCP/IP: Application debug and analysis on Linux/Android devices
A single IDE, compiler, debugger, trace and performance analysis toolset for all stages of product development leads to higher engineering efficiency
DS-5 Debug and Trace Probes
[Table: check marks in the original cells are not recoverable; only text and numeric values are shown]

                                              ULINKpro family*   DSTREAM
Supported Families
Device Creation (.rvc)
ARM7, ARM9, ARM11
Cortex-M                                      See device list
Cortex-R4, Cortex-R5                          See device list
Cortex-R7
Cortex-A5, Cortex-A7, Cortex-A8, Cortex-A9    See device list
Cortex-A12, Cortex-A15
Cortex-A57, Cortex-A53, Cortex-A72
Debug Performance
Maximum download speed [kB/s]                 1000               2500
Maximum JTAG clock [MHz]                      50                 60
Trace Capabilities
On-chip trace
Serial Wire Output (SWO) **
Maximum TPIU width [pins]                                        16
Maximum off-chip trace speed [Mb/s]                              9600
Trace buffer [GB]                                                4
Connectivity
USB 2.0
Fast Ethernet (10/100 Mb/s)
ARM DS-5 Streamline Performance Analyzer
Mali GPU Support
Analyze and optimize Mali™ GPU utilization
Monitor CPU and GPU cache usage

Optimize energy efficiency
Monitor actual power consumption with the ARM Energy Probe
Correlate software execution to actual power consumption

Customize it for Your System
Flexible architecture permits easy addition of new counters
Open source driver and daemon gives developers ultimate flexibility

Speed Up Your Code
Find out where the CPU is spending the most time
Tune code for optimal cache usage

OpenCL™ Visualizer
Visualization of OpenCL dependencies, helping you to balance resources between GPU and CPU better than ever

Drill down to the Source Code
Break performance down by function
View it alongside the disassembly
Streamline Agent-Based Architecture

Software based solution
Debug/trace unit not required
Open source user space daemon and optional kernel module
Graphical and command line interfaces

Lightweight software profiler
Process-to-source level sample-based profiler
Android ART-compiled software profiling
Very-low intrusiveness mode (no PC sampling)

Multiple data sources
CPU and GPU hardware performance counters
Mali GPU OpenGL and OpenCL events
OS tracepoints, ftrace events, custom instrumentation

[Diagram: target device (ARM processor; Linux kernel with Mali GPU drivers and the gator module; user space with OpenGL® ES applications & middleware and the gator daemon) connected to the host]
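Sample-based profiling, as listed above, boils down to periodically recording the program counter and then attributing each sample to the function that contains it. The sketch below shows only that attribution step, using a sorted symbol table and binary search. It is a general illustration of the technique, not gator's implementation; the addresses and function names are invented, and the symbol table is simplified to start addresses without sizes, so any sample past the last symbol is attributed to that last function.

```python
import bisect
from collections import Counter
from typing import Dict, Sequence, Tuple


def build_profile(symbols: Sequence[Tuple[int, str]],
                  pc_samples: Sequence[int]) -> Dict[str, float]:
    """Return the percentage of PC samples that fall in each function."""
    table = sorted(symbols)                      # (start_address, name) pairs
    starts = [address for address, _ in table]
    hits: Counter = Counter()
    for pc in pc_samples:
        # Find the last symbol whose start address is <= the sampled PC.
        index = bisect.bisect_right(starts, pc) - 1
        name = table[index][1] if index >= 0 else "<unknown>"
        hits[name] += 1
    total = len(pc_samples)
    if total == 0:
        return {}
    return {name: 100.0 * count / total for name, count in hits.items()}


if __name__ == "__main__":
    # Hypothetical symbol table and samples.
    symbols = [(0x8000, "main"), (0x8100, "memcpy"), (0x8200, "fir_filter")]
    samples = [0x8004, 0x8210, 0x8220, 0x8104, 0x8230, 0x7000]
    profile = build_profile(symbols, samples)
    for name, percent in sorted(profile.items(), key=lambda item: -item[1]):
        print(f"{name:12s} {percent:5.1f}%")
```

Because the cost is one recorded address per sampling interval, the overhead stays low, and the accuracy of the per-function percentages improves with the number of samples collected.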
Userspace-Only gator for CPU Analysis

Support for userspace data providers
Annotations
Linux counters (wait, cpufreq, cpuidle, etc.)
Custom counters from apps

Limitations
Provisional synchronization of userspace and perf time bases
Mali GPU not supported

[Diagram: target device (ARM processor; Linux kernel; user space with applications & middleware and the gator daemon) connected to the host]
Come see it in action
DS-5 with i.MX7 at Technology Lab
Pod #553
DS-5 and i.MX7
https://community.arm.com/groups/tools/blog/2016/02/24/getting-started-with-ds-5-and-imx7
http://bit.ly/1X6Sodg
DS-5 and i.MX6SoloX
https://community.arm.com/groups/tools/blog/2015/03/03/using-ds-5-with-the-freescale-imx-6solox
http://bit.ly/1rtwm8r
No one knows ARM processors better
Visit http://developer.arm.com/
The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited
(or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be
trademarks of their respective owners.
Copyright © 2016 ARM Limited