Runtime Measurements in the Cloud · September 14th, 2010


Transcript of "Runtime Measurements in the Cloud" (J. Schad, J. Dittrich, and J. Quiané-Ruiz)

Page 1: Runtime Measurements in the Cloud

Runtime Measurements in the Cloud
Observing, Analyzing, and Reducing Variance

Jörg Schad, Jens Dittrich, and Jorge-Arnulfo Quiané-Ruiz

VLDB 2010
September 14th, Singapore

Information Systems Group, Saarland University

Pages 2-11: Motivation

The Hadoop++ prototype is ready:

J. Dittrich et al., Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing). VLDB 2010. [Presentation on Wednesday at 12:00, research track, Cloud computing session]

"Yeah! The prototype is ready."
"But where do I run our 100-node experiments?"
"O.K., run the experiments on Amazon EC2."

Experiment:
• Application: a MapReduce aggregation job
• Number of virtual nodes: 50
• Repetitions: once every hour

[Charts: runtime [sec] over 50 hourly measurements, shown for the local cluster and for the Amazon EC2 cluster]

"Oops!" — "Why?!" — "Let's investigate!"

Pages 12-17: Related Work

M. Armbrust et al., Above the Clouds: A Berkeley View of Cloud Computing. UCB Technical Report, 2009.
Summary: performance unpredictability is mentioned as one of the major obstacles for cloud computing.
→ Research challenges

S. Ostermann et al., A Performance Analysis of EC2 Cloud Computing Services for Scientific Computing. CloudComp, 2009.
Summary: evaluation of different Amazon cloud services in terms of cost and performance.
→ Absolute performance

D. Kossmann et al., An Evaluation of Alternative Architectures for Transaction Processing in the Cloud. SIGMOD, 2010. [Appeared after the VLDB'10 deadline]
Summary: cost and performance evaluation of different distributed database architectures and cloud providers.
→ Application performance

Variability in performance: still an open question ("?") — "Go for it!"

Page 18: Agenda

• Motivation
• Related Work
• Background
• Methodology
• Results & Analysis

[Next section: Background]

Page 19: Background — Amazon EC2

• Most popular cloud infrastructure
• Three locations: US, EU, and Asia [Asia was added after the VLDB'10 deadline]
• Different availability zones for the US location
• Linux-based virtual machines (instances)
• Five EC2 instance types: standard, micro [available from September 9th], high-memory, high-CPU, and cluster-compute [added after the VLDB'10 deadline]

Pages 20-21: Background — Standard Instances

• Small instance: 1.7 GB of main memory, 1 EC2 Compute Unit, 160 GB of local storage
• Large instance: 7.5 GB of main memory, 4 EC2 Compute Units, 850 GB of local storage
• Extra Large instance: 15 GB of main memory, 8 EC2 Compute Units, 1,690 GB of local storage

"One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor."

Page 22: Agenda

• Motivation
• Related Work
• Background
• Methodology
• Results & Analysis

[Next section: Methodology]

Pages 23-27: Methodology — What to Measure?

Microbenchmarks:
• CPU performance
• Memory performance
• Disk I/O (sequential and random)
• Internal network bandwidth
• External network bandwidth
• Instance startup

In this talk: CPU, memory, and disk I/O.

Pages 28-29: Methodology — How to Measure?

• CPU performance: Ubench
• Memory performance: Ubench
• Disk I/O (sequential and random): Bonnie++

Pages 30-33: Methodology — Goal of Our Study

• Do different instance types have different variations in performance?
• Do different locations or availability zones impact performance?
• Does performance depend on the time of day, weekday, or week?

(A subset of these questions is covered in this talk.)

Pages 34-35: Methodology — Setup

• Small and large instances in the US and EU locations [the Asia location was introduced after the VLDB deadline]
• Default settings for Ubench and Bonnie++
• Results reported in CET
• Baseline: our team's cluster at Saarland University
  • 2.66 GHz Quad Core Xeon CPU
  • 16 GB of main memory
  • 6x750 GB SATA hard disks
  • 50 Xeon-based virtual nodes

Pages 36-37: Methodology

Start: December 14, 2009. End: January 12, 2010. Duration: 31 days.
[Results were also collected for one additional month, but showed no additional pattern.]

Every hour (loop; see the sketch below):
• Create one small and one large instance
• On each instance (small and large): run Ubench, run Bonnie++, then the other benchmarks
• Kill the previous instances
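As a rough illustration of this hourly loop (not the authors' actual harness), the following Python sketch shows only the control flow; the instance-management helpers and the benchmark command lines are assumed placeholders that a real driver would replace with EC2 API calls and remote execution.

```python
import time

ROUNDS = 31 * 24  # 31 days, one round per hour

def launch_instance(size):
    # Placeholder: start one EC2 instance ("small" or "large") from the
    # prepared AMI and return its hostname. Real code would call the EC2 API.
    print(f"launching a {size} instance")
    return f"{size}-instance.example"

def terminate_instance(host):
    # Placeholder: shut the instance down so the next round gets a fresh one.
    print(f"terminating {host}")

def run_benchmarks(host):
    # Benchmarks run sequentially, one at a time, on the instance itself.
    # Command lines are illustrative only; results would be parsed and stored.
    for cmd in ("ubench", "bonnie++ -d /mnt"):
        print(f"[{host}] would run: {cmd}")  # e.g. executed via ssh in a real driver

previous = []
for _ in range(ROUNDS):
    for host in previous:                  # kill the previous instances
        terminate_instance(host)
    previous = [launch_instance("small"),  # create one small and one large instance
                launch_instance("large")]
    for host in previous:
        run_benchmarks(host)
    time.sleep(3600)                       # repeat every hour
```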

Pages 38-41: Methodology — Measure of Variation

• Several measures exist: range, variance, standard deviation, ...
• We need to compare data series on different scales
• Coefficient of Variation (COV): the ratio of the standard deviation (s) to the mean (x̄), i.e. COV = s / x̄

[Excerpt from the paper, shown on the slide:]

Table 1: Physical cluster — benchmark results obtained as baseline

           CPU             Memory          Sequential Read   Random Read   Network
           [Ubench score]  [Ubench score]  [KB/second]       [seconds]     [MB/second]
Mean (x̄)  1,248,629       390,267         70,036            215           924
Min        1,246,265       388,833         69,646            210           919
Max        1,250,602       391,244         70,786            219           925
Range      4,337           2,411           1,140             9             6
COV        0.001           0.003           0.006             0.019         0.002

... byte I/O and block I/O. For further details, please refer to [4]. In our study, we report results for sequential reads/writes and random-read block I/O, since these are the aspects that most influence database applications.

Network Bandwidth. We use the Iperf benchmark [8] to measure network performance. Iperf is a modern alternative for measuring maximum TCP and UDP bandwidth, developed by NLANR/DAST. It measures the maximum TCP bandwidth, allows users to tune various parameters and UDP characteristics, and reports results for bandwidth, delay jitter, and datagram loss. Unlike other network benchmarks (e.g., Netperf), Iperf consumes fewer system resources, which yields more precise results.

S3 Access. To evaluate S3, we measure the time required to upload a 100 MB file from one unused node of our physical cluster at Saarland University (which has no local network contention) to a newly created bucket on S3 (in either the US or the EU location). The bucket creation and deletion times are included in the measurement. It is worth noting that such a measurement also reflects the network congestion between our local cluster and the respective Amazon datacenter.

4.3 Benchmark Execution
We ran our benchmarks twice every hour during 31 days (from December 14 to January 12) on small and large instances. The reason for such a long measurement period is that we expected the performance results to vary considerably over time; it also allows a more meaningful analysis of the system performance of Amazon EC2. We have one additional month of data, but it shows no patterns beyond those presented here (1). We shut down all instances after 55 minutes, which forced Amazon EC2 to create new instances just before the benchmarks ran again. The main idea is to distribute our tests over different computing nodes and hence obtain a realistic overall measure for each benchmark. To prevent the benchmarks from influencing each other, we ran them sequentially, ensuring that only one benchmark was running at any time. Note that, since a single run can sometimes take longer than 30 minutes, we ran all benchmarks only once in such cases. To run the Iperf benchmark, we synchronized two instances just before running it, because Iperf requires two idle instances. Furthermore, since two instances are not necessarily in the same availability zone, network bandwidth is very likely to differ; thus, we ran separate experiments for the case where the two instances are in the same availability zone and the case where they are not.

4.4 Experimental Setup
We ran our experiments on Amazon EC2 using one small standard instance and one large standard instance in both the US and EU locations (we increase the number of instances in Section 7.1). Please refer to Section 3 for details on the hardware of these instances. For both instance types we used a Linux Fedora 8 OS. For each instance type we created one Amazon Machine Image per location, including the necessary benchmark code. We used the standard instances' local storage and mnt partitions for both types when running Bonnie++. We stored all benchmark results in a MySQL database hosted on our local file server.

To compare EC2 results with a meaningful baseline, we also ran all benchmarks — except instance startup and S3 — on our local cluster of physical nodes. It has the following configuration: one 2.66 GHz Quad Core Xeon CPU running a 64-bit platform with Linux openSUSE 11.1, 16 GB of main memory, 6x750 GB SATA hard disks, and three Gigabit network cards in bonding mode. As we had full control of this cluster, there was no additional workload on it during our experiments. Thus, this represents the best-case scenario, which we take as the baseline.

We used the default settings for Ubench, Bonnie++, and Iperf in all our experiments. As Ubench performance also depends on compiler performance, we used gcc-c++ 4.1.2 on all Amazon EC2 instances and on all physical nodes of our cluster. Finally, as Amazon EC2 is used by users from all over the world, and thus across different time zones, there is no single local time. This is why we decided to use CET as the coordinated time for presenting results.

4.5 Measure of Variation
Let us now introduce the measure we use to evaluate the variance in performance. There exist a number of measures to represent this: range, interquartile range, and standard deviation, among others. The standard deviation is a widely used measure of variance, but it is hard to compare across different measurements. In other words, a given standard deviation value only indicates how high or low the variance is relative to a single mean value. Furthermore, our study involves the comparison of different scales. For these two reasons, we use the Coefficient of Variation (COV), which is defined as the ratio of the standard deviation to the mean. Since we compute the COV over a sample of results, we use the sample standard deviation. Therefore, the COV is formally defined as

\mathrm{COV} = \frac{1}{\bar{x}} \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2}

Here N is the number of measurements, x_1, ..., x_N are the measured results, and x̄ is their mean. Note that we divide by N − 1 and not by N, since only N − 1 of the N differences (x_i − x̄) are independent [18].

In contrast to the standard deviation, the COV allows us to compare the degree of variation from one data series to another, even if the means differ.

(1) The entire dataset is publicly available on the project website [9].
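The COV defined above is straightforward to compute directly; the following small Python function is a sketch (not the authors' code) that uses the sample standard deviation, i.e. it divides by N − 1 exactly as in the formula.

```python
import math

def cov(measurements):
    """Coefficient of Variation: sample standard deviation divided by the mean."""
    n = len(measurements)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = sum(measurements) / n
    # Sample variance: divide by N - 1, since only N - 1 deviations are independent.
    variance = sum((x - mean) ** 2 for x in measurements) / (n - 1)
    return math.sqrt(variance) / mean

print(cov([100, 102, 98, 101, 99]))  # ~0.016, i.e. a COV of about 1.6%
```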

Page 42: Agenda

• Motivation
• Related Work
• Background
• Methodology
• Results & Analysis

[Next section: Results & Analysis]

Pages 43-45: Results & Analysis — CPU Performance

Large instances:

[Chart: CPU performance [Ubench score] per hourly measurement over weeks 52, 53, 1, 2, and 3; series: US location, EU location]

• COV of 24% (baseline: COV of 0.1%)
• Observation: two bands in performance

Pages 46-48: Results & Analysis — Memory Performance

Large instances:

[Chart: memory performance [Ubench score] per hourly measurement over weeks 52, 53, 1, 2, and 3; series: US location, EU location]

• COV of 10% (baseline: COV of 0.3%)
• Observation: again, two bands in performance

Pages 49-51: Results & Analysis — Random I/O Performance

Large instances:

[Chart: random-read disk performance [KB/s] per hourly measurement over weeks 52, 53, 1, 2, and 3; series: US location, EU location]

• COV of 13% (baseline: COV of 1.9%)
• Observation: one band in performance for the US location

Pages 52-55: Results & Analysis — Availability Zones

Large instances:

[Chart: CPU performance [Ubench score] per hourly measurement over weeks 52, 53, 1, 2, and 3; series: us-east-1a, us-east-1b, us-east-1c, us-east-1d]

"Availability Zones are distinct locations that are engineered to be insulated from failures in other Availability Zones..."

• Observation 1: us-east-1d results always fall in the upper band
• Observation 2: lower-band results belong only to us-east-1c

Pages 56-59: Results & Analysis — CPU Distribution (1/2)

Why two bands? What if we compare the performance of the different processor types?

• Request 5 Xeon-Opteron instance pairs, identifying the processor type by examining the /proc/cpuinfo file (see the sketch below)
• For each pair: run Ubench 150 times
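The processor-type check can be done by reading /proc/cpuinfo on the instance. A minimal sketch follows; the slides do not show the authors' exact script, and classifying by the strings "Xeon" and "Opteron" in the model name is an assumption.

```python
def processor_type(path="/proc/cpuinfo"):
    """Return "Xeon", "Opteron", or the raw model name from /proc/cpuinfo (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.lower().startswith("model name"):
                model = line.split(":", 1)[1].strip()
                if "Xeon" in model:
                    return "Xeon"
                if "Opteron" in model:
                    return "Opteron"
                return model
    return "unknown"

if __name__ == "__main__":
    print(processor_type())
```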

Pages 60-63: Results & Analysis — CPU Distribution (2/2)

[Chart: CPU performance [Ubench score] for the five Xeon-Opteron instance pairs]

• Different performance per processor type
• COV per processor type: 2% and 1%, while the COV over the combined measurements is 24%
• Observation: one CPU type corresponds to one underlying system [memory and I/O follow this pattern]

Pages 64-65: Results & Analysis — Larger Clusters

Cluster of 50 large instances:

[Chart: mean Ubench score vs. start point [hours]]

• Still two bands in performance
• Observation: 30% of the measurements fall into the low-performance band

Pages 66-68: Results & Analysis — MapReduce Job

• Application: a MapReduce aggregation job (a generic sketch follows below)
• Number of virtual nodes: 50
• Repetitions: once every hour

[Charts: runtime [sec] over 50 measurements, for the local cluster and for the Amazon EC2 cluster]

• Observation: upper-band results come from virtual clusters composed of 80% Xeon processors (vs. 20% for the lower band)
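The slides do not show the aggregation job itself. As an illustration only (this is not the authors' Hadoop++ job), a generic "sum per key" aggregation can be written as a Hadoop Streaming-style mapper and reducer; the tab-separated "key<TAB>numeric_value" input format is an assumption.

```python
import sys

def mapper(stream=sys.stdin):
    # Emit key/value pairs unchanged; input lines are "key<TAB>numeric_value".
    for line in stream:
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2:
            print(f"{parts[0]}\t{parts[1]}")

def reducer(stream=sys.stdin):
    # Input arrives sorted by key (as Hadoop Streaming guarantees); sum per key.
    current_key, total = None, 0.0
    for line in stream:
        key, value = line.rstrip("\n").split("\t", 1)
        if key != current_key and current_key is not None:
            print(f"{current_key}\t{total}")
            total = 0.0
        current_key = key
        total += float(value)
    if current_key is not None:
        print(f"{current_key}\t{total}")

if __name__ == "__main__":
    # Run as "python aggregate.py map" or "python aggregate.py reduce", e.g.:
    #   cat data.tsv | python aggregate.py map | sort | python aggregate.py reduce
    (mapper if (len(sys.argv) > 1 and sys.argv[1] == "map") else reducer)()
```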

Pages 69-74: Lessons Learned

"Ah, O.K.! So..."

• Be careful!
• High variance in performance: COV of up to 24%
  • Hard to interpret results
  • Repeatability only to a limited extent
• Two bands in performance
  • Partially due to different physical CPU types

Pages 75-80: Conclusion

• Amazon should:
  • reveal the physical details
  • allow users to specify physical characteristics
• Researchers should:
  • use equivalent virtual clusters to compare systems
  • report the underlying system type together with the results

By the way... Amazon recently introduced the cluster-compute instances [after the VLDB'10 deadline] — the variance is still significantly higher than in a local cluster.

Thank you!