HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

37
1 Simon Segars EVP & GM, ARM August 18 th , 2011 ARM Processor Evolution: Bringing High Performance to Mobile Devices

Transcript of HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

Page 1: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 1/37

1

Simon Segars EVP & GM, ARMAugust 18th, 2011

ARM Processor Evolution:Bringing High Performance to Mobile Devices

Page 2: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 2/37

2

1980’s mobile computing

Page 3: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 3/37

3

HotChips 1981

§ 4MHz Z80 Processor 

§ 64KB memory§ Floppy drives

§ 5” screen

§ 24.5 lbs

§ $1,795

§ 11,000 units sold

Osbourne 1 image courtesy of www.oldcomputers.net

Page 4: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 4/37

4

Hot Chips 1983

§ Motorola DynaTAC

§ $3995§ 30 minutes talk time

§ 10 hours charge time

§ No texting or Bluetooth

Page 5: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 5/37

5

30 Years later … 

§ MacBook Air 

§ 1.7GHz Processor § 8GB memory

§ 256GB storage

§ 13” screen

§ 2.96 lbs

§ $1,599

Page 6: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 6/37

6

 Your phone is a lot more useful too

§ Samsung Galaxy S-II

§  Android 2.3§ 4.3” screen

§ 32GB memory

§ Web, weather, Angry Birds

§ ~1/20th the DynaTAC cost

Page 7: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 7/377

Connectivity driving computing

Source: Morgan Stanley, 2009

Page 8: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 8/378

Evolutionary Pressures

Functionality

Cost FormFactor 

Competition

Page 9: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 9/379

1990-2011

Page 10: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 10/3710

A long time ago in a village far away….

§  Acorn, Apple and VLSI formeda joint venture

§  13 engineers and a CEO

§  ARM the company was born§  21 years old in November 

§ Goal to design low power 

embedded 32b processors

§ But never to manufacture them

Page 11: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 11/3711

1993: An early mobile computer 

§ 1.4lb

§ Personal Applications

§ Expansion socket

§ Stylus interface

Page 12: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 12/3712

1992: ARM7TDMI – ‘thumb’

§  A small, low power 32 bit core for wireless baseband

§ 74,000 transistors, 40MHz, 4.2mm2

on 0.5µm

Page 13: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 13/3713

Early GSM phone

§ 1 week battery life

§ Contact list

§ Calculator 

§ SMS

§ Snake… § Not exactly a mobile computer 

Page 14: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 14/3714

Evolution

Page 15: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 15/3715

Moore’s law has helped a lot

1985 ARM13 6k gates

7mm x 7mm 

Cortex-M020nm 8k gates

0.07mm x 0.07mm

Dec 2010

1/10,000th size

Cortex-M0

Subsystem

.M0

System

Mali-400

IO

DualCortex-A9

Cortex-A9 SOC40nm 100M gates7.4mm x 6.9mm 

Page 16: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 16/3716

The next decade

There may be trouble ahead….

Page 17: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 17/3717

First the good news

Over 4 Billion people connectedvia mobile phones

BIG NUMBERS! 

Over 13 Billionsmartphone appsdownloaded in 2.5

years

During the 2010Holiday period $230M

was spent on EBayusing smartphones

Smartphones willleapfrog over the PC inthe developing world

LTE Deploymentstarting

In 2010 280Msmartphones were

shipped 

Smartphone data trafficwill exceed PC traffic in

2014

Page 18: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 18/3718

0

100

200

300

400

500

600

700

800

   S  m  a  r   t  p   h  o  n  e   V  o   l  u  m  e   (   M  u   )

Smartphone growing and growing

Source: Gartner Q4 2010 Forecast

Phase 2

Phase 3

Phase 1

ARM 926100 MHz

ARM 926200MHz

ARM 11500MHz

Cortex-A8650MHz

Cortex-A81GHz

2x Cortex-A91GHz

4x Cortex-A91GHz

2x Cortex-A151.5GHz

0

10

20

30

40

50

60

70

80

90

100

2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013

   R  e   l  a   t   i  v  e   P  e  r   f  o  r  m

  a  n  c  e

Page 19: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 19/3719

The “Superphone” for 2013 

§  Your primary internet accessdevice

§   Always connected broadband

§  Enough compute power toreplace your laptop

§  Your content with you

§  From HD to MP3 with cloud access

§  Multi-core processor 

§  Multi-core graphics

processor §  Multi-core coherency

§  28nm implementation

§  Security solution

§  12+MP Camera§  1080p playback and capture

§  128 Gbyte of storage

§  45 hours of 1080p video

§  HDMI video out

§  Full browser with HTML v5& Flash Player 

§  New UI paradigms

§  Voice, gestures, Augmented

Reality supported by OpenCL

§  Full ‘office’ support

§  Vibrant applications market

§  HD Gaming on device or screen

Device Vision

Enabling Technology

Visual Experience

 Applications

Page 20: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 20/3720

Another 1Bn sub $100 Computers

Device Features1+ Billion Opportunity

§  Consumers, OEMs & MNO allwant smartphones

§  Emerging markets to bypass thePC

§  Full internet connectivity

§  Growth controlled by retail price,

target $100-$200

§  Requirement beyond ARM11™performance

 Access to full range of apps in Android Marketplace

§  Full Android™ OS§   Access to full range of apps in

Google Marketplace

§  Full browser 

§  Flash Player 10 support

§  Full HTML 5 support

§  3D UI accelerated with OpenGL® ES 2.0

Page 21: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 21/3721

A whole lot of software

Operating Systems

innovating on

a 6-12 month period

InternetExplorer 

GoogleChrome

MozillaFirefox

<HTML>v5

WebkitFlashPlayer 10

 Application diversity delivers the personalized compute experience

 Apps & App Stores

100,000’s of Apps

15+ Billion Downloads

Full connectivity to the Internet and the Cloud

WebTechnology

support for all

standards

Diversity and rapid mobile OS innovation

 AIR

Page 22: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 22/3722

The center of your world

Page 23: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 23/37

23

TrustZone in 3 Steps

TrustedExecution

Environment

1. Define secure hardware architecture

§  Two separate domains:- normal and secure§  Extends across system:-

Processor, display, keypad, memory, clock, radios

2. Implement in silicon system on chip (SoC)

§  Physically enforcing secure/normal separation

3. Combine SoC with Secure OS

§  Separate but connected to main operating system

Result: A TrustedExecution Environment§  Ready to develop and deploy trusted services

TrustedExecution

Environment

CPUCPU

SECURENORMAL

Rich OS Secure OS

Page 24: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 24/37

24

But there are challenges

Transistor scalingBattery scaling

Modems Implementation

Page 25: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 25/37

25

Modem relative performance

§ 4G modem ~500x more complex than 2G

§  Control processor 

§  Dedicated data processing engines

§  More silicon area

§  More power consumed

0 100 200 300 400 500 600

4G

3.9G LTE

3G

2G

Page 26: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 26/37

26

Batteries haven’t helped

§ Historical 11% capacity growth

§ Not well matched to Moore's Law

§ Continued innovation required just to maintain 11%

§  New Si alloy materials or anode Carbon Nano Tubes may help

 Assuming 12 hours of use per day

4.5 kCal

30g 255 kCal49g

x2.2

Capacity

Page 27: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 27/37

27

Challenge: Implementation Complexity

§ ARM7TDMI

§  74K transistors

§  4.2 mm2 in 0.5µm technology

§  Largely hand-designed

§  3 corners, 1 voltage domain

§ Cortex-A9 MP Dual Core

§  >20M transistors

§  3.4 mm2 in 28nm technology

§  Complex timing sign-off 

§  12+ corners, 3 voltage domains

§  DFM, Variation modeling, … 

Page 28: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 28/37

28

Disaggregation – pros and cons

1970s

VerticalSuppliers

FullyVerticallyIntegrated

1980s

ASICVendors

System

Manufacturer 

ASIC

Vendor 

1990s

FablessSemis

Design &

Distribution

Manufacture

EDA

Today

IP DrivenDesign

Design

IP

EDA

Manufacture

Foundry

Page 29: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 29/37

29

Fabs: not as popular as they used to be

Al#sSemiconductor

DongbuHitek DongbuHitek

Freescale Freescale

Fujitsu Fujitsu

Globalfoundries Globalfoundries Freescale

GraceSemiconductor GraceSemiconductor Fujitsu

IBM IBM Globalfoundries

Infineon Infineon IBM

Intel Intel Infineon Fujitsu

Panasonic Panasonic Intel Panasonic

Renesas(NEC) Renesas(NEC) Panasonic Globalfoundries

Samsung Samsung Renesas(NEC) IBM

SeikoEpson SeikoEpson Samsung Intel

SMIC SMIC SMIC Renesas(NEC) Globalfoundries

Sony Sony Sony Samsung Intel

STMicroelectronics STMicroelectronics STMicroelectronics SMIC Panasonic Globalfoundries

TexasInstruments TexasInstruments TexasInstruments STMicroelectronics Samsung Intel

Toshiba Toshiba Toshiba Toshiba STMicroelectronics Samsung

TSMC TSMC TSMC TSMC TSMC STMicroelectronics

UMC UMC UMC UMC UMC TSMC

130nm 90nm 65nm/55nm 45/40nm 32/28nm 22/20nm

Page 30: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 30/37

30

Device Scaling

65nm

32nm 

40nm

28nm 

8nm

14nm

20nm 

10nm 

?nm

Strain

High-K

2x Patterning

3x Patterning? EUV??

????

Page 31: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 31/37

31

Think different!

Cortex-BigMulti-CPU

Multi-Gx

Hardware Platform

Runtime

Object

Compiler 

Application

Kernel

CPU-Audio

Modem

Video

Audio

5G

Web

ModemWiFi

Page 32: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 32/37

32

Advanced Processing

FetchDecodeRenameDispatch

NEONFPU

Multiply

Load/Store5 stages 7 stages

Integer pipeline

Cortex-A15 for High-end, power-efficient processing

§ 15-Stage Integer Pipeline

§ 2 extra cycles for multiply, load/store

§ 2-10 extra cycles for complex media instructions

   I  s  s  u  e

   W   BInt.

Branch

   I  s  s

  u  e

   I  s  s  u  e

   W

   B

   W   B

Page 33: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 33/37

33

Cortex-A15 System ScalabilityIntroduces Cache Coherent Interconnect

§  Processor to Processor Coherency and I/O coherency

§  Memory and synchronization barriers

§  Virtualization support with distributed virtual memory signaling

128-bit AMBA 4

Quad Cortex-A15 MPCore

A15

Processor Coherency (SCU)Up to 4MB L2 cache 

A15 A15 A15

CoreLink CCI-400 Cache Coherent Interconnect

128-bit AMBA 4   I   O

   c  o   h  e  r  e  n   t   d  e  v   i  c  e  s

MMU-400

Quad Cortex-A15 MPCore

A15

Processor Coherency (SCU)Up to 4MB L2 cache 

A15 A15 A15

System MMU

GIC-400

Page 34: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 34/37

34

Fully coherent SoCs

2011 Devices

§  Full coherency within CPU cluster §  Limited I/O coherency

§  Software managed coherency for SoC

2013 Devices

§  Full coherency for multiple CPU clusters

§  I/O coherency with graphics and other 

§  Simpler software programming model

2015 Devices

§  Full coherency on CPU, GPU and other 

§  True General Purpose Compute 

ApplicationsProcessor 

Fully Coherent 

2x or 4xCortex-A9 

Graphicsand Video 

Non-coherent or Software Managed 

Mali-400  Video 

ApplicationsProcessor 

Fully Coherent 

>4x Cortex-A15

Graphicsand Video 

I/O Coherent 

Mali-T604  Video 

   C   I

Applications

Processor 

Fully Coherent 

Next generationCompute cluster  

Graphics

and Video 

Next Gen  Video    C   I   G

  r  e  a   t  e  r   P  e  r   f  o  r  m  a  n  c  e  a  n   d   i  n   t  e  r  a  c   t   i  o  n

    L  o  w  e  r  e  n  e  r  g  y  p  e

  r   t  r  a  n  s  a  c   t   i  o  n

S C

Page 35: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 35/37

35

The holistic SoC Designer 

IP DrivenDesign

Design

IP

EDA

Manufacture

Foundry

 Architecture 

Implementation 

C l i

Page 36: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 36/37

36

Conclusion

§ 30 years have delivered incredible gains

§  Mobile computing now ubiquitous§  Smartphones, Superphones, Tablets

§ Silicon scaling has driven PPA gains

§  But it has to end somewhere, so get ready!

§ The future is Hetrogeneous

§  Multi-core CPU, multi-CPU, dedicated engines

§  A new battery would help too!

Page 37: HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

7/31/2019 HC23.18.Keynote1.Arm-Hot Chips 2011 SS Final

http://slidepdf.com/reader/full/hc2318keynote1arm-hot-chips-2011-ss-final 37/37

Thank you.