Helio X20: The First Tri-Gear Mobile SoC with CorePilot 3.0 Technology
August 2016
Tsung-Yao Lin, Ming-Hsien Lee, Loda Chou, Clavin Peng, Jih-Ming Hsu, Jia-Ming Chen, John-CC Chen, Alex Chiou, Artis Chiu, David Lee, Carrie Huang, Kenny Lee, TzuHeng Wang, Wei-Ting Wang, Yenchi Lee, Chi-Hui Wang, Pao-Ching Tseng, Ryan Chen, Kevin Jou

Page 1: Title slide (source: hotchips.org/.../HC28.22.210-Helio-TsungYaoLin-MediaTek-v2.pdf)

Helio X20: The First Tri-Gear Mobile SoC with CorePilot™ 3.0 Technology

August 2016

Tsung-Yao Lin, Ming-Hsien Lee, Loda Chou, Clavin Peng, Jih-Ming Hsu, Jia-Ming Chen, John-CC Chen, Alex Chiou, Artis Chiu, David Lee, Carrie Huang, Kenny Lee, TzuHeng Wang, Wei-Ting Wang, Yenchi Lee, Chi-Hui Wang, Pao-Ching Tseng, Ryan Chen, Kevin Jou

Page 2: Agenda

Tri-Gear Concept
Challenges
Key Technologies
• Tailored CPU cores for gears
• Enhanced coherent interconnect
• Hybrid scheduler
• Holistic gear allocation
• Adaptive thermal management
Achievements
Summary

Page 3: User Behavior Changed

Time spent per day (%), by scenario:

Scenario | Example application | Task load | 2013 | 2014 | 2015 | Change (2014 to 2015)
Web Browsing | Chrome Browser | Heavy ~ Medium | 20% | 14% | 10% | -4%
Gaming | Temple Run 2 | Heavy ~ Light | 32% | 32% | 15% | -17%
Social Messaging | Facebook | Medium | 24% | 28% | 31% | +3%
Entertainment, Utilities, and others | YouTube, Mail | Medium ~ Light | 24% | 26% | 44% | +18%

Source: Flurry Analytics

• Social messaging, entertainment, and utilities (with medium to light loads) take up to 75% of user time
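The 75% figure is the sum of the 2015 shares for Social Messaging (31%) and Entertainment, Utilities, and others (44%). A short check of that arithmetic, using only the percentages printed on the slide (the dictionary layout and function names are illustrative, not from the deck):

```python
# Time-spent shares per day in percent, as printed on the slide.
time_spent = {
    "Web Browsing": {2013: 20, 2014: 14, 2015: 10},
    "Gaming": {2013: 32, 2014: 32, 2015: 15},
    "Social Messaging": {2013: 24, 2014: 28, 2015: 31},
    "Entertainment, Utilities, and others": {2013: 24, 2014: 26, 2015: 44},
}


def medium_to_light_share(year):
    """Sum the two scenarios the slide describes as medium to light load."""
    return (time_spent["Social Messaging"][year]
            + time_spent["Entertainment, Utilities, and others"][year])


def column_total(year):
    """Total of all four scenarios for one year."""
    return sum(row[year] for row in time_spent.values())


def change(scenario):
    """Change in share from 2014 to 2015, in percentage points."""
    return time_spent[scenario][2015] - time_spent[scenario][2014]


print(medium_to_light_share(2015))  # 31 + 44
print([column_total(y) for y in (2013, 2014, 2015)])
print({name: change(name) for name in time_spent})
```

Each yearly column sums to 100%, and the computed changes match the slide's last column.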

Page 4: Task Load Distribution of Scenarios

[Figure: Energy Consumption of Scenarios. Stacked bars from 0% to 100% for Web Browsing, Gaming, Social Messaging, and Entertainment, Utilities & Others, each split into Heavy Load, Medium Load, Light Load, and Idle. Data labels as extracted: 12%, 28%, 17%, 42%, 38%, 48%, 47%, 36%, 33%, 13%.]

• Medium load tasks are important across all scenarios (36% ~ 48%)
• Heavy load tasks are still important for specific scenarios

Page 5: The Dual-Gear Dilemma

[Diagram: task-load axis running from Light Tasks through Medium Tasks to Heavy Tasks, with LITTLE at the light end and big at the heavy end.]

• big: game, multimedia
• LITTLE: always-on, connected

Page 6: The Dual-Gear Dilemma

[Diagram: the same Light / Medium / Heavy task axis. Medium tasks, labelled "Sustainable usage", fall between LITTLE and big.]

• big: game, multimedia
• LITTLE: always-on, connected

Executing medium load tasks on
• big wastes energy
• LITTLE cannot meet the performance requirement

Page 7: The Dual-Gear Dilemma

[Diagram: the same task axis with a new Mid gear covering Medium Tasks ("Sustainable usage") between LITTLE and big.]

• big: game, multimedia
• LITTLE: always-on, connected

Executing medium load tasks on
• Mid: balance between performance and power
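The dilemma can be sketched as a placement rule. With two gears, any task above the LITTLE ceiling has to run on big. With three gears, medium loads get their own target. The thresholds and function names below are hypothetical and chosen only for illustration; the deck does not give numeric load boundaries.

```python
# Hypothetical load boundaries, as a fraction of peak demand (not from the deck).
LIGHT_LIMIT = 0.3
MEDIUM_LIMIT = 0.7


def pick_gear_dual(load):
    """Dual-gear placement: everything above a light load lands on big."""
    return "LITTLE" if load <= LIGHT_LIMIT else "big"


def pick_gear_tri(load):
    """Tri-gear placement: medium loads go to the Mid gear."""
    if load <= LIGHT_LIMIT:
        return "Min"
    if load <= MEDIUM_LIMIT:
        return "Mid"
    return "Max"


for load in (0.1, 0.5, 0.9):
    print(load, pick_gear_dual(load), pick_gear_tri(load))
```

In this sketch a medium load of 0.5 is forced onto big in the dual-gear case, which is the "wasted energy" situation the slide describes, and onto Mid in the tri-gear case.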

Page 8: Introduction to Tri-Gear

[Figure: three power vs. performance plots (performance axis 0% to 100%) showing the Min, Mid, and Max gear curves. The performance range is divided into Low Power, Sustainable Performance, and High Performance regions.]

1. New Mid gear introduced
2. Min gear goes for even lower power, Max gear aims for higher performance
3. Reduced power consumption across entire performance range

Page 9: Challenges of Tri-Gear

Previous Dual-Gear

[Diagram: software (SW) layer above hardware (HW) layer, exchanging control and info.
 SW: Scheduler (right task to right CPU), Thermal Management, Power Management.
 SW goals: balance power and performance, maximize thermal performance, prevent overheating, minimize power consumption.
 HW: LITTLE (light tasks), big (heavy tasks), Coherent Interconnect.]

Evolving to Tri-Gear
• Revised scheduler
• Improved thermal sensing, power budgeting
• Improved gear management
• Enhanced coherent interconnect
• Tailored processors

Page 10: Agenda (repeated)

Page 11: Tailored CPU Cores for Three Gears

Gear | Cores | Clock
Max | 2 × A72 | 2.5GHz
Mid | 4 × A53 | 2.0GHz
Min | 4 × A53 | 1.4GHz

Mid gear for efficient performance
• +40% performance, Mid vs. Min
  − LIB and MEM optimizations
• +30% power-efficiency, Mid vs. Max
  − Multi-bit flip-flops optimization
  − Careful use of high-leakage LVT cells

Min, Max gears extend power/performance ranges

[Figure: Energy Consumption (0.5X to 2.5X) vs. Single-Thread Performance (0X to 3X) for the Max, Mid, and Min gears.]
* Energy and Performance scale relative to the highest point of Min curve

Page 12: Enhanced Coherent Interconnect

[Diagram: in the dual-gear design, LITTLE and big connect through two ACE ports to the Coherent Interconnect and on to Memory. In the tri-gear design, Min, Mid, and Max connect through three ACE ports to the Tri-Gear Coherent Interconnect and on to Memory.]

Enhanced from 2 ACE ports to 3 ACE ports
• Increased logic brings extra power
• ~50% power reduction by sub-module Fine-Grain Clock Gating (FGCG)

[Figure: Coherent Interconnect Power Comparison, with a -50% power annotation over the common usage range (surviving data label: 0.3).]
* Power is relative to 2-gear at 1GB/s

Page 13: Hybrid Scheduler

HMP Dual-Gear scheduler
• Limited to Dual-Gear
• Boot CPU is always on and cannot be migrated (Fixed CPU0)
  − Typically in LITTLE, so LITTLE cannot be off

Dual-level HMP scheduler for Tri-Gear?
• Might not be optimal
• Fixed CPU0 limits power saving opportunities

[Diagram, Dual-Gear scheduler: LITTLE (C0 to C3) and big (C0, C1), each scheduled with SMP (Symmetric Multi-Processing) and linked by HMP (Heterogeneous Multi-Processing). Fixed CPU0 sits in LITTLE.]

[Diagram, Tri-Gear scheduler: Min (C0 to C3), Mid (C0 to C3), and Max (C0, C1), each scheduled with SMP, with "HMP?" between gears and a Fixed CPU0.]

[Diagram: gear power states for LITTLE / big and for Min / Mid / Max, with inactive gears marked Power-Off.]

Page 14: Intelligent Core Activation Technology (ICAT)

ICAT assigns CPU0 dynamically
• Min gear can be off by task migration
• 8%~10% CPU power saved for medium load

[Diagram: Min (C0 to C3), Mid (C0 to C3), Max (C0, C1).
 Fixed CPU0: Min is always online because it hosts CPU0 (the boot CPU).
 Dynamic CPU0 with ICAT: Min can be offline (Power-Off).]

[Figure: Power/Tj curve. CPU Power (0.5X to 2.5X) vs. Tj (45°C to 85°C) for 1 thread and 2 threads, each with and without ICAT.]
* Power is relative to 1 thread with ICAT at 65°C
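The difference between a fixed and a dynamic CPU0 can be modelled in a few lines. This is a toy model under stated assumptions, not MediaTek's implementation: the function name and the set-based interface are invented for illustration, and only the cluster sizes come from the slides.

```python
# Cores per gear, as shown on the slides.
CLUSTERS = {"Min": 4, "Mid": 4, "Max": 2}


def gears_that_must_stay_on(busy, cpu0_home="Min", dynamic_cpu0=False):
    """Return the set of gears that must remain powered.

    busy:         gears that currently run tasks.
    cpu0_home:    gear hosting the boot CPU (CPU0) after boot.
    dynamic_cpu0: True models ICAT, where the CPU0 role can follow the tasks.
    """
    required = set(busy)
    if not dynamic_cpu0 or not required:
        # Fixed CPU0 pins its home gear; with no busy gear, CPU0 still needs a host.
        required.add(cpu0_home)
    return required


medium_load = {"Mid"}
print(sorted(gears_that_must_stay_on(medium_load, dynamic_cpu0=False)))
print(sorted(gears_that_must_stay_on(medium_load, dynamic_cpu0=True)))
```

With a fixed CPU0 a medium load keeps both Min and Mid powered. With a dynamic CPU0 only Mid stays on, which is the case where the slide reports 8%~10% CPU power saved.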

Page 15: Asymmetric Multi-Processing (AMP) with ICAT

AMP: enhanced HMP with dynamic gear operation for power saving
• Packing tasks to Mid for sustainable performance
• Packing tasks to Min for low power

[Diagram: Min (C0 to C3) and Mid (C0 to C3) under HMP compared with AMP, where task migration with ICAT packs tasks onto one gear.]

[Diagram, Tri-Gear scheduler: Min (C0 to C3), Mid (C0 to C3), and Max (C0, C1), each scheduled with SMP; AMP (Asymmetric Multi-Processing) spans Min and Mid, and HMP extends to Max.]

Page 16: Hybrid Scheduler

Hybrid = SMP + AMP + HMP

HMP for high performance
• Instant boost technology
  − Quick response to utilize Max for urgent or heavy tasks
• Inter-gear task migration
  − Dynamic threshold control for energy efficiency and responsiveness
  − Thread-group migration strategy to increase cluster (L2 cache) locality

[Figure: power vs. performance (0% to 100%). AMP over Min and Mid covers the Low Power and Sustainable Performance regions; HMP over Min, Mid, and Max covers High Performance.]
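A minimal sketch of threshold-based inter-gear migration with an instant-boost path, assuming a per-task load between 0 and 1. The threshold values, the `urgent` flag, and both function names are hypothetical; the deck names the techniques but does not specify how they are implemented.

```python
GEARS = ["Min", "Mid", "Max"]  # ordered from lowest power to highest performance


def next_gear(current, load, urgent=False, up=0.8, down=0.3):
    """Pick the gear for a task on its next scheduling decision.

    up/down are the migration thresholds. Passing different values per call
    stands in for "dynamic threshold control"; the gap between them gives
    hysteresis so a task does not bounce between gears.
    """
    if urgent:
        return "Max"  # instant boost: skip the step-by-step climb
    i = GEARS.index(current)
    if load > up and i < len(GEARS) - 1:
        return GEARS[i + 1]
    if load < down and i > 0:
        return GEARS[i - 1]
    return current


def migrate_group(placement, group, target):
    """Move every thread of one group together so they share one cluster's L2."""
    return {thread: (target if thread in group else gear)
            for thread, gear in placement.items()}


print(next_gear("Min", 0.9))               # climbs one gear
print(next_gear("Mid", 0.5))               # stays put inside the hysteresis band
print(next_gear("Min", 0.2, urgent=True))  # boosted straight to Max
print(migrate_group({"ui": "Min", "render": "Mid", "net": "Min"},
                    {"ui", "render"}, "Mid"))
```

Raising `up` favours energy efficiency, while lowering it favours responsiveness. That is the trade-off the slide attributes to dynamic threshold control.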

Page 17: Centralized Gear Allocation

Previous Power Management
• Dynamic Voltage & Frequency Scaling (DVFS) and Hot-Plug drivers consider inputs separately:
  − Power budget, performance requests, and system status such as load, Thread Level Parallelism (TLP)
• Big gear on/off controlled by Hot-Plug driver

Enhanced Power Management: Centralized Gear Allocation
• A holistic control to handle increased complexity
• Tracking steady states to avoid unnecessary gear migration overhead
• Linking to user-specified performance, normal, power-saving modes

[Diagram, previous: Power Budget Requests (Thermal, Battery...), Performance Requests (Heavy task, Scenario...), and Status feed CPU DVFS and CPU Hot-Plug separately.]

[Diagram, enhanced: the same requests and Status feed Centralized Gear Allocation, which controls both CPU DVFS and CPU Hot-Plug.]
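One way to read "tracking steady states" is as a debounce: the allocator merges the power-budget cap and the performance request into one target, and only switches gears once that target has held for several consecutive samples. The class below is a sketch under that assumption. The class name, the `hold` parameter, and the integer gear encoding are all hypothetical.

```python
GEARS = ["Min", "Mid", "Max"]


class CentralGearAllocator:
    """Toy central allocator: one decision point for all gear changes."""

    def __init__(self, hold=3):
        self.current = 0      # index into GEARS
        self.hold = hold      # samples a new target must persist before switching
        self._pending = None
        self._seen = 0

    def update(self, perf_request, power_cap):
        """perf_request and power_cap are gear indices (0=Min, 1=Mid, 2=Max)."""
        target = min(perf_request, power_cap)  # never exceed the power budget
        if target == self.current:
            self._pending, self._seen = None, 0
        elif target == self._pending:
            self._seen += 1
        else:
            self._pending, self._seen = target, 1
        if self._pending is not None and self._seen >= self.hold:
            self.current = self._pending
            self._pending, self._seen = None, 0
        return GEARS[self.current]


alloc = CentralGearAllocator(hold=3)
print(alloc.update(2, 2))  # a single spike is ignored
print(alloc.update(0, 2))
print([alloc.update(1, 2) for _ in range(3)])  # a sustained request is honoured
print(alloc.update(2, 1))  # the power cap limits the request to Mid
```

A short spike does not trigger a migration, so its overhead is avoided, while a sustained request moves the allocation after `hold` samples.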

Page 18: Adaptive Thermal Management (ATM)

Power budgeting by both core limit and frequency limit for all CPUs

Dual-Gear to Tri-Gear
• More possible solutions from core / frequency combination meeting power target
• 1.5X ~ 3X more possible solutions on core combination alone, depending on TLP

[Figure: Power (0X to 2X) vs. 2-Thread Performance (0X to 2X), one curve per core combination.
 Dual-Gear: 2Max, 1Max+1Min, 2Min.
 Tri-Gear: 2Max, 1Max+1Mid, 2Mid, 1Max+1Min, 2Min, 1Mid+1Min.]
* Power and performance are relative to the highest point of Max curve
* Each point in a curve represents a choice of gear / core / freq
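The growth in core combinations can be checked by enumeration. The sketch below counts how many ways a given number of threads can be spread across gears, one thread per core. The tri-gear cluster sizes (2 Max, 4 Mid, 4 Min) come from the deck. The dual-gear baseline of 2 Max + 4 Min is an assumption made here for comparison, since the slide does not state which dual-gear configuration it used.

```python
from itertools import product


def core_combinations(cluster_sizes, threads):
    """Count the ways to pick per-gear core counts that sum to `threads`."""
    ranges = [range(size + 1) for size in cluster_sizes.values()]
    return sum(1 for combo in product(*ranges) if sum(combo) == threads)


TRI_GEAR = {"Max": 2, "Mid": 4, "Min": 4}
DUAL_GEAR = {"Max": 2, "Min": 4}  # assumed baseline, not stated in the deck

for tlp in (1, 2, 3):
    tri = core_combinations(TRI_GEAR, tlp)
    dual = core_combinations(DUAL_GEAR, tlp)
    print(f"TLP={tlp}: dual={dual}, tri={tri}, ratio={tri / dual:.1f}X")
```

For two threads this gives 3 dual-gear and 6 tri-gear combinations, matching the curves listed on the slide. Under the assumed baseline, the ratios for one to three threads come out at 1.5X, 2.0X, and 3.0X, in line with the slide's "1.5X ~ 3X" range.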

Page 19: ATM for More Combinations

Previous power allocation
• Simple cost function: power efficiency only
• Large search space: chosen solution might not meet actual system requirement

Precise power allocation
• Comprehensive cost function: power efficiency, system requirement (#core, frequency and power), system overhead
• +10% Performance from considering system requirement
• -5°C max Tj from reducing system overhead: hot-plug vs. DVFS latency

[Figure: Power (0X to 3X) vs. Multi-Thread Performance (0X to 5X) for the Max and Min gears, shown for Previous Power Allocation (large search space) and Precise Power Allocation (reduced search space). Each plot marks the power budget and the frequency limit. Example workload: 1 Heavy + 3 Light tasks.]
* Power and performance are relative to the highest point of Max curve
* Geekbench v3 Multi-core Performance
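The contrast between the two cost functions can be illustrated with a small selection routine. This is a sketch only: the operating points, their numbers, the penalty weight, and all names are hypothetical, and the deck does not publish its actual cost function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    name: str
    cores: int
    perf: float   # relative performance (made-up values)
    power: float  # relative power (made-up values)


def previous_allocation(points, budget):
    """Power efficiency only: best performance per unit power under the budget."""
    feasible = [p for p in points if p.power <= budget]
    return max(feasible, key=lambda p: p.perf / p.power)


def precise_allocation(points, budget, min_cores, current_cores, hotplug_cost=0.1):
    """Also enforce the core-count requirement and penalize hot-plug changes."""
    feasible = [p for p in points
                if p.power <= budget and p.cores >= min_cores]

    def score(p):
        overhead = hotplug_cost * abs(p.cores - current_cores)
        return p.perf / p.power - overhead

    return max(feasible, key=score)


POINTS = [
    OperatingPoint("2Min", cores=2, perf=1.0, power=0.4),
    OperatingPoint("4Min", cores=4, perf=1.8, power=0.9),
    OperatingPoint("1Max+3Min", cores=4, perf=2.6, power=1.6),
]

# Workload from the slide: 1 heavy + 3 light tasks, so four cores are wanted.
print(previous_allocation(POINTS, budget=2.0).name)
print(precise_allocation(POINTS, budget=2.0, min_cores=4, current_cores=4).name)
```

With these made-up numbers the efficiency-only rule picks a two-core point that cannot host four tasks, while the requirement-aware rule restricts the search to four-core points, mirroring the reduced search space on the slide.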

Page 20: Agenda (repeated)

Page 21: Energy Saving from Tri-Gear CPU Architecture

[Figure: CPU energy consumption (0% to 100%) per scenario, comparing Dual-Gear (LITTLE + big) with Tri-Gear (Min + Mid + Max).]

Energy saving from Dual-Gear to Tri-Gear, in chart order:

Scenario | Category | Energy saving
VideoRecord+EIS | Utilities | -35%
Web Rollover | Web Browsing | -38%
Burst Photo | Utilities | -38%
Facebook | Social Messaging | -21%
Heavy Loading Game | Gaming | -12%

• Up to -38% CPU energy measured for scenarios used daily
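The five savings on this slide bound the range quoted elsewhere in the deck ("12% ~ 38% CPU energy saving"). A quick check, using only the values printed on the chart (the variable names are illustrative):

```python
# Energy savings from Dual-Gear to Tri-Gear, in percent, as printed on the chart.
savings_percent = [35, 38, 38, 21, 12]

best = max(savings_percent)
worst = min(savings_percent)
print(f"{worst}% ~ {best}% CPU energy saving")
```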

Page 22: CorePilot™ Technology Evolution

Chips shown: MT6592, MT6595, Helio P10, Helio X20
Generations shown: CorePilot™ 1.0, CorePilot™ 2.0, CorePilot™ 3.0

SMP (Symmetric Multi-Processing)
• Octa-core with SMP
HMP (Heterogeneous Multi-Processing)
• big.LITTLE HMP
• Global Task Scheduling
HC (Heterogeneous Computing)
• CPU+GPU Computing
• Dynamic Gear Migration for low power
Tri-Gear (Hybrid Tri-Gear Multi-Processing)
• Tri-Gear CPU Architecture
• 12% ~ 38% CPU energy saving

[Diagram: cluster layouts for each stage, built from LITTLE, big, Min, Mid, and Max clusters (cores C0 to C3, or C0 and C1 for Max) and a GPU.]

Page 23: Summary

Majority of tasks are medium and light loads
• Added Mid gear and enhanced Min gear

CorePilot™ 3.0 Key Technologies
• Tailored CPU cores for gears
• Enhanced coherent interconnect
• Hybrid scheduler
• Holistic gear allocation
• Adaptive thermal management

Benefit of Tri-Gear
• Up to 38% CPU energy saving for typical scenarios used daily over extended performance range

[Figure: power vs. performance plots (0% to 100%) showing the Min, Mid, and Max gear curves.]

Page 24

Copyright © MediaTek Inc. All rights reserved.