Capacity Provisioning Problems in Geo-distributed Data Centers

Bhuvan Urgaonkar
Dept. of CSE
The Pennsylvania State University

Geo-distributed Data Centers

• Reasons for geo-distribution:
  - Latency
  - Availability

• What are the cost implications?

7/15/2014 Microsoft Faculty Summit 2014

What’s New?

• What is well-understood:
  - How to build single data centers cost-effectively
  - How to create distributed applications using an existing pool of data centers (that were built separately)
    E.g., content distribution networks such as Akamai
    E.g., recent work such as SpanStore on procuring resources from geo-distributed public clouds

• What is (likely) less well-explored:
  - Building a fleet of distributed data centers from scratch for supporting large-scale distributed workloads

• Approach: specific case studies -> general insights & challenges

How do costs change when we build a geo-distributed version of a centralized DC?

A Simple Thought Experiment

[Figure: IT cost (marked at 100 and 200) versus degree of distribution (1 to 4), showing base IT cost, base + spare IT cost, and costs of networking DCs. The accompanying diagram shows four DCs, each labeled 25 and 8.]
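The arithmetic behind the thought experiment can be sketched in a few lines. This is a minimal illustration, assuming that a base capacity of 100 is split evenly across the DCs and that the surviving DCs must absorb the full load of any single failed DC; the function name `total_capacity` and the even-split assumption are introduced here and are not taken from the slides.

```python
def total_capacity(base: float, n_dcs: int) -> float:
    """Total IT capacity so that the surviving DCs can absorb one failed DC's load.

    Assumes an even split of the base load and at most one DC failure at a time.
    """
    if n_dcs < 2:
        raise ValueError("need at least two DCs to tolerate one failure")
    per_dc_base = base / n_dcs
    # The failed DC's share is spread evenly over the remaining n_dcs - 1 DCs.
    per_dc_spare = per_dc_base / (n_dcs - 1)
    return n_dcs * (per_dc_base + per_dc_spare)


for n in (2, 3, 4):
    print(n, round(total_capacity(100, n), 1))
```

Under these assumptions, four DCs each carry a base of 25 plus a spare of roughly 8, and the base + spare total falls from 200 with two DCs toward the base of 100 as the degree of distribution grows.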

Costs: What have we made worse?

• Networking infrastructure to connect DCs

• Larger overall IT capacity
  - Redundancy for availability
    Higher for heterogeneous collection of DCs
  - Poorer statistical multiplexing

  How do we keep this “small”?

• Non-IT infrastructure (power+cooling) costs
  - To support higher IT capacity

  Can we keep non-IT infra. “size” small?
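The "poorer statistical multiplexing" point can be made concrete with a toy example. The sketch below uses two hypothetical hourly demand traces (`east` and `west`, invented for illustration) whose peaks fall at different hours, and compares a single shared DC, sized for the peak of the combined demand, with two isolated DCs, each sized for its own regional peak.

```python
# Hypothetical hourly demand for two regions whose peaks fall at different hours.
east = [30, 50, 80, 60, 40, 30]
west = [30, 30, 40, 60, 80, 50]

# One shared DC is sized for the peak of the combined demand.
centralized = max(e + w for e, w in zip(east, west))

# Two isolated DCs must each be sized for their own regional peak.
isolated = max(east) + max(west)

print(centralized, isolated)
```

The gap between the two numbers is the capacity lost when demand can no longer be pooled; letting DCs serve each other's clients within the latency bound is one way to win some of it back.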

Costs: What has improved?

• Revenue, due to better latency

• Aspects of availability

[Figure: IT cost (marked at 100 and 200) versus degree of distribution (1 to 4), showing base IT cost, base + spare IT cost, and costs of networking DCs, with curves labeled Avail.=0.99 and Avail.=0.999]

Can we lower the availability of individual DC infra.?
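A rough calculation suggests why lowering per-DC availability may be affordable. The sketch below assumes DCs fail independently (an assumption made here, not a claim from the slides) and that the fleet is provisioned to ride out any single DC failure, so the service is up whenever at most one DC is down. The function name `service_availability` is hypothetical.

```python
def service_availability(dc_avail: float, n_dcs: int) -> float:
    """Probability that at most one of n_dcs independent DCs is down."""
    all_up = dc_avail ** n_dcs
    exactly_one_down = n_dcs * (1 - dc_avail) * dc_avail ** (n_dcs - 1)
    return all_up + exactly_one_down


for n in (2, 3, 4):
    print(n, round(service_availability(0.99, n), 6))
```

Under these assumptions, a fleet of two to four DCs that are each only 0.99 available still delivers better than 0.999 availability for the service as a whole.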

Outline

• An example of cost-effective IT provisioning
• Keeping non-IT infrastructure costs low
• Lowering peak power related costs using batteries
• Conclusions

Problem Setting

• DC locations given

• Client demands known, time-varying

• Goal: determine total capacity at each DC
  - To meet latency constraints, and
  - To allow for one DC to fail

• Our optimizer: An LP
  - Generally, facility location problems are NP-hard
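One way to write such an LP is sketched below. It is an illustration under stated assumptions, not necessarily the formulation used in the talk. All symbols are introduced here: $d_i^t$ is the demand of client region $i$ at time $t$, $\ell_{ij}$ the latency from region $i$ to DC $j$, $L$ the latency bound, $C_j$ the capacity at DC $j$, and $x_{ij}^{t,f}$ the fraction of region $i$'s demand served by DC $j$ at time $t$ in scenario $f$, where $f = 0$ means no failure and $f = k$ means DC $k$ has failed (DCs are numbered from 1).

```latex
\begin{align*}
\min_{C,\,x}\quad & \sum_{j} C_j \\
\text{s.t.}\quad
 & \sum_{j} x_{ij}^{t,f} = 1 && \forall i, t, f \\
 & \sum_{i} d_i^{t}\, x_{ij}^{t,f} \le C_j && \forall j, t, f \\
 & x_{ij}^{t,f} = 0 && \text{if } \ell_{ij} > L \text{ or } j = f \\
 & x_{ij}^{t,f} \ge 0, \quad C_j \ge 0
\end{align*}
```

Because the DC locations are given, every constraint is linear in $C$ and $x$. Choosing the locations as well would add binary open/close variables, which is where the NP-hard facility location structure comes from.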

Results

• DC locations
  - 6 MS data centers in the US

• Client demand model
  - Exhibits time zone specific variation
  - Proportional to population

[Figure: demand curves labeled "New York demand" and "Oregon demand"]

Results

Experiments using demand measured for one Microsoft cluster, and 6 MS DC locations within US. L’ = L

[Figure: bar chart of total capacity (axis from 0 to 160) for five cases: single DC capacity; nearest DC (no failure); optimized (support 1 failure); without time-of-day; optimized (no failure). Callouts: "Availability (against 1 failure) for free!" and "Excess capacity for high availability".]

Details: Narayanan et al., “Towards leaner geo-distributed cloud infrastructure,” Proc. HotCloud 2014


A Closer Look at Power Infrastructure

[Figure: power infrastructure diagram. Components shown: utility substation, diesel generator, auto transfer switch (ATS), UPS battery, power distribution units (PDUs), server racks. Cost annotations: 1 $/W, 0.6 $/W, 0.3 $/W, 0.2 $/W.]

Lowering Peak Draw

[Figure: power versus time, showing a power cap, the re-shaped power draw beneath it, and the resulting cap-ex saving]

Using Energy Storage

[Figure: power (W) versus time, showing a power cap and the new draw obtained with an energy storage device (ESD), with no performance impact]

How to provision and harness ESDs in data centers?
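The peak-shaving idea can be sketched as a short simulation. This is a minimal illustration, assuming an ideal ESD with no conversion losses and no limit on charge or discharge rate; the function `shave_peaks`, its parameters, and the demand numbers are hypothetical and not taken from the talk.

```python
def shave_peaks(demand_w, cap_w, capacity_wh, step_h=1.0):
    """Return the utility draw after an ideal ESD clips demand to cap_w where it can."""
    stored_wh = capacity_wh  # assume the ESD starts fully charged
    draw = []
    for d in demand_w:
        if d > cap_w:
            # Discharge to cover the excess, limited by the energy left in the ESD.
            discharge_w = min(d - cap_w, stored_wh / step_h)
            stored_wh -= discharge_w * step_h
            draw.append(d - discharge_w)
        else:
            # Recharge using headroom under the cap, limited by free ESD capacity.
            charge_w = min(cap_w - d, (capacity_wh - stored_wh) / step_h)
            stored_wh += charge_w * step_h
            draw.append(d + charge_w)
    return draw


demand = [60, 70, 120, 130, 80, 60]  # hypothetical hourly demand in watts
new_draw = shave_peaks(demand, cap_w=100, capacity_wh=50)
print(max(demand), max(new_draw))
```

In this toy trace the ESD holds just enough energy to keep the draw at the cap through both peak hours and then recharges in the valley, so the power infrastructure could be sized for the cap rather than the raw peak.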

ESDs in Current Data Centers

[Figure: power hierarchy annotated with costs of 1 $/W, 0.6 $/W, 0.3 $/W, and 0.2 $/W, with "Cost Saving" marked at two points]

Why restrict ESDs to any one level of the datacenter power hierarchy (e.g., central or server)?

Ragone Plot

[Figure: Ragone plot of specific energy (Wh/kg, log scale from 0.1 to 10,000) versus specific power (W/kg, log scale from 10 to 1,000,000). Regions shown: batteries (lead acid, lithium ion), fuel cell, flywheels (FW), compressed air (CAES), capacitors, supercapacitors, ultracapacitors (UC).]

Capital Cost (Energy and Power)

                      Ultracapacitor  Flywheel  Lithium ion battery  Lead-acid battery  Compressed air
Energy Cost ($/kWh)   10,000          5,000     525                  200                50
Power Cost ($/kW)     100             250       175                  125                600

Why restrict to single ESD technology (e.g., Lead acid battery)?
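The case for mixing technologies can be illustrated with the cost figures from this slide. The sketch below makes two assumptions that should be checked against the cited Wang et al. paper: the pairing of each (energy cost, power cost) value with a technology, and a simple cost model in which a device costs whichever is larger, its energy-sized or its power-sized price. The names `ESD_COSTS`, `capital_cost`, and `cheapest` are hypothetical.

```python
# (energy cost in $/kWh, power cost in $/kW).
# The pairing of values with technologies is an assumption read off the slide's table.
ESD_COSTS = {
    "ultracapacitor": (10_000, 100),
    "flywheel": (5_000, 250),
    "lithium ion": (525, 175),
    "lead acid": (200, 125),
    "compressed air": (50, 600),
}


def capital_cost(tech: str, power_kw: float, energy_kwh: float) -> float:
    """Assumed cost model: pay for whichever of energy or power sizing dominates."""
    energy_cost, power_cost = ESD_COSTS[tech]
    return max(energy_cost * energy_kwh, power_cost * power_kw)


def cheapest(power_kw: float, energy_kwh: float) -> str:
    """Technology with the lowest assumed capital cost for the given requirement."""
    return min(ESD_COSTS, key=lambda t: capital_cost(t, power_kw, energy_kwh))


print(cheapest(power_kw=100, energy_kwh=1))    # short, sharp peak
print(cheapest(power_kw=100, energy_kwh=400))  # long, sustained peak
```

Under these assumptions a short, sharp peak is cheapest to cover with a power-cheap device and a long peak with an energy-cheap one, which is the intuition behind not restricting the design to a single technology.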

Multi-level Multi-technology ESDs

[Figure: power hierarchy (utility and diesel generator, ATS, PDUs, racks, server H/W) with an ESD attached at each level. Technologies labeled: compressed air, flywheel, battery, capacitor.]

Cost Savings for Google Workloads

Total cost without ESD is $12k/day

[Figure: savings ($/day, axis from 0 to 5,000) for four ESD configurations: single-tech, datacenter-level; single-tech, server-level; multi-tech, server-level; multi-tech, multi-level. Annotations give (savings, ESD cost) pairs of (4.9k, 0.4k), (4.7k, 0.3k), (5.2k, 0.3k), and (3.9k, 0.2k), and percentage labels of 25%, 30%, and 20%. Technology labels: "Datacenter: Fly Wheel + Compressed Air", "Server: Lead Acid", "Server: Ultracap + Lead Acid", "Datacenter: Compressed Air". Further axis labels: power (MW), time (hour).]

Details: Wang et al., “Energy Storage in the Datacenter: What, Where, and How Much?,” Proc. ACM Sigmetrics 2012


Related Work

• IT capacity provisioning
  - Capacity planning [Goiri et al. ICDCS’11]
    Showed that more DCs, each of lower availability (lower cost) but with extra geo-spares, is better
    Computed optimal capacity placements

• Lowering infrastructure availability/cost
  - Reducing the “size” of power infrastructure
    Under-provisioning backup generators [Wang14]
    Reducing component redundancy [Govindan11, Kansal13]
  - Less aggressive cooling design
    Has similarity in offering an availability vs cost trade-off [Schroeder@Sigmetrics12]
    Related work in geo-distributed setting: [Wierman]
  - Lower availability IT

Conclusions

• Cost-effective capacity provisioning of geo-distributed data centers presents opportunities for novel problems in optimization and system design
  - Putting together lower availability data centers with appropriate fault tolerance mechanisms during subsequent operation
  - Key source of difficulty is uncertainty of subsequent workload evolution
    Typical facility location based formulations might be inadequate
    Stochastic optimization? Robust optimization?

• More information: http://www.cse.psu.edu/~bhuvan

• Joint work with: Anand Sivasubramaniam, Aman Kansal, Di Wang, Sriram Govindan, Hosam Fathy, Iyswarya Narayanan
