HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4....

37
Christina Delimitrou 1 and Christos Kozyrakis 2 1 Stanford/Cornell University, 2 Stanford University/EPFL http://mast.stanford.edu ASPLOS – April 5 th 2016 HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED CLOUD SYSTEMS

Transcript of HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4....

Page 1: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

Christina Delimitrou1 and Christos Kozyrakis2

1Stanford/Cornell University, 2Stanford University/EPFLhttp://mast.stanford.edu

ASPLOS– April5th 2016

HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED CLOUD SYSTEMS

Page 2: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

2

¨ Problem: cloud provisioning is difficult

¤ Many resource offerings à resource/cost inefficiencies

¤ Interference from other users à performance jitter

¨ HCloud: resource-efficient public cloud provisioning

¤ User provides resource reservations performance goals

¤ Automatic selection of instance type and configuration

¤ Adjust allocations between reserved and on-demand

¤ High utilization and low performance jitter

Executive Summary

Page 3: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

3

Motivation

Page 4: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12

Page 5: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlarge

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11

Page 6: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

Page 7: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

Cost, predictability, availability, flexibility, spin-up overheads,

retention time, load fluctuation, external load, sensitivity to interference, …

Page 8: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

t2.nanot2.micro t2.small t2.medium t2.largem4.large m4.xlarge m4.2xlargem4.4xlargem4.10xlargem3.medium m3.large m3.xlarge m3.2xlargec4.largec4.xlarge c4.2xlargec4.4xlargec4.8xlargec3.large c3.xlarge c3.2xlarge c3.4xlarge c3.8xlarger3.large r3.xlarge

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

n1-standard-1n1-standard-2n1-standard-4n1-standard-8n1-standard-16n1-standard-32n1-highmem-2n1-highmem-4n1-highmem-8n1-highmem-16n1-highmem-32n1-highcpu-2n1-highcpu-4n1-highcpu-8n1-highcpu-16n1-highcpu-32f1-microg1-small

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

A0A1A2A3A4A5A6A7A8A9A10A11D1D2D3D4D11D12D13D14D1 v2D2 v2D3 v2D4 v2D5 v2D11 v2

Page 9: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

9

Cloud Provisioning Goals

✗ Determine appropriate instance size/type

✗ Determine appropriate instance configuration

✗ Dynamically adjust allocation decisions at runtime

Page 10: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

10

Cloud Provisioning Scenarios

¨ Batch analytics, latency-critical applications, and scientific workloads in all scenarios¤ Batch: 50%¤ Latency: 40%¤ Scientific: 10%

Page 11: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

11

Provisioning Baselines¨ All reserved (SR):

¤ 16vCPU instances on GCE, no external load¤ Predictable performance, high cost, low flexibility¤ Applications provisioned for peak requirements

¨ All on-demand: ¤ Largest instances (OdF): only 16vCPU instances¤ Mixed instances (OdM): mix of large and smaller instances¤ Interference, low cost, high flexibility

¨ Resource management: ¤ Least-loaded scheduler

Page 12: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

12

Extracting Resource Preferences

¨ Exhaustive characterization is infeasible

Heterogeneity

Interference

Resources per server

Resource ratio

10 servers40 apps

100 servers300 apps

1000 servers1200 apps

Combinations

1,000

1,000,000

0

1,000,000,000

Systems

Number of servers

Application params

Page 13: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

13

Quasar Overview [ASPLOS’14]

Profiling

Cluster

QoS Scheduler

UserApp

App

App

App

App

Resourcepreferences

App

Data mining

Page 14: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

14

Quasar Overview [ASPLOS’14]

Cluster

Scheduler

Greedy scheduler

Profiling

User Resourcepreferences

Data mining Resource selection

App

Page 15: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

15

Quasar Overview [ASPLOS’14]

Cluster

Scheduler

Greedy scheduler

Profiling

User Resourcepreferences

Data mining Resource selection

App

Page 16: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

16

Quasar Overview [ASPLOS’14]

Profiling[5-30sec]

Data mining[20msec]

Resource selection[50msec-2sec]

Cluster

Scheduler

Greedy schedulerUser

App

App

App

App

Resourcepreferences

App

Page 17: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

17

Evaluation

Page 18: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

18

Evaluation

Page 19: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

19

Cloud Provisioning Goals

✔ Determine appropriate instance size/type

✗ Determine appropriate instance configuration

✗ Dynamically adjust allocation decisions at runtime

Page 20: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

20

Hybrid Cloud Resource Allocation

¨ Insight: ¤ Combine reserved and on-demand resources

¨ Challenge: ¤ Separate applications between reserved (~private) and

on-demand (public) resources

Page 21: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

21

Hybrid Allocation Policies

P1 P2 P3 P4 P5 P6 P7 P8556065707580859095

Perf.

nor

m to

Isol

atio

n (%

) On-Demand Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

✗ Account for application interference sensitivity

Page 22: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

22

Hybrid Allocation Policies

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P8556065707580859095

100

Perf.

nor

m to

Isol

atio

n (%

) On-Demand Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

✗ Do not overload reserved resources

✓ Account for application interference sensitivity

Page 23: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

23

Hybrid Allocation Policies

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P8556065707580859095

100

Perf.

nor

m to

Isol

atio

n (%

) On-Demand Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P8556065707580859095

100

Perf.

nor

m to

Isol

atio

n (%

) On-Demand Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

✗ Dynamic decisions (e.g., utilization limits)

✓ Account for application interference sensitivity

✓ Do not overload reserved resources

Page 24: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

24

HCloud

¨ Insights: ¤ Account for interference sensitivity¤ Set load limits dynamically

Reserved On-demand

AppApp

App

App

Page 25: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

25

HCloud

Reserved On-demand

AppApp

App

App

¨ Insights: ¤ Account for interference sensitivity¤ Set load limits dynamically

Page 26: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

26

HCloud

Reserved On-demand

AppApp

App

App

¨ Insights: ¤ Account for interference sensitivity¤ Set load limits dynamically

Page 27: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

27

HCloud

Reserved On-demand

AppApp

App

App

¨ Insights: ¤ Account for interference sensitivity¤ Set load limits dynamically (feedback loop on queue length)

Page 28: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

28

HCloud

Reserved On-demand

AppApp

App

App

¨ Insights: ¤ Account for interference sensitivity¤ Set load limits dynamically (feedback loop on queue length)

¨ What signals a sensitive job?

Page 29: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

29

HCloud

¨ Insights: ¤ Account for interference sensitivity¤ Set load limits dynamically

¨ What signals a sensitive job?

Reserved On-demand

AppApp

App

App

Page 30: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

30

Hybrid Allocation Policies

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P8556065707580859095

100

Perf.

nor

m to

Isol

atio

n (%

) On-Demand Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P8556065707580859095

100

Perf.

nor

m to

Isol

atio

n (%

) On-Demand Resources

HFHM

P1 P2 P3 P4 P5 P6 P7 P875

80

85

90

95

100

Perf.

nor

m to

Isol

atio

n (%

) Reserved Resources

HFHM

✓ Account for application interference sensitivity

✓ Do not overload reserved resources

✓ Dynamic decisions (e.g., utilization limits)

Page 31: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

31

Evaluation

Page 32: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

32

Cloud Provisioning Goals

✔ Determine appropriate instance size/type

✔ Determine appropriate instance configuration

✗ Dynamically adjust allocation decisions at runtime

Page 33: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

33

Adjusting Allocations at Runtime

¨ Runtime: monitor app performance¤ Remove co-scheduled apps¤ Move to larger instance within cluster¤ Move to reserved cluster

Reserved On-demand

Page 34: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

34

Conclusions

¨ HCloud: hybrid cloud provisioning (reserved & on-demand)¤ Account for interference sensitivity¤ Account for performance-cost tradeoffs¤ Adjust to load fluctuations¤ 2.1x better performance than on-demand, 50% lower cost than

reserved

¨ See paper for: ¤ Performance unpredictability analysis on EC2 and GCE¤ Sensitivity studies on system & app parameters

n Spin-up overheads, cost, retention time, load, app characteristics, external load, …

¤ Further workload scenarios

Page 35: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

35

¨ HCloud: hybrid cloud provisioning (reserved & on-demand)¤ Account for interference sensitivity¤ Account for performance-cost tradeoffs¤ Adjust to load fluctuations¤ 2.1x better performance than on-demand, 50% lower cost than

reserved

¨ See paper for: ¤ Performance unpredictability analysis on EC2 and GCE¤ Sensitivity studies on system & app parameters

n Spin-up overheads, cost, retention time, load, app characteristics, external load, …

¤ Further workload scenarios

Questions??

Page 36: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

36

Thank you

Questions??

Page 37: HCLOUD: RESOURCE-EFFICIENT PROVISIONING IN SHARED …kozyraki/publications/2016.h... · 2016. 4. 22. · 2 ¨ Problem: cloud provisioning is difficult ¤ Many resource offerings à

37

Resource Efficiency