Refining the Estimation of the Available Bandwidth in Inter-Cloud Links for Task Scheduling

Thiago A. L. Genez, Luiz F. Bittencourt, Nelson L. S. da Fonseca, Edmundo R. M. Madeira
Institute of Computing (IC), University of Campinas (UNICAMP), Campinas, SP, Brazil
IEEE GLOBECOM 2014, December 10, 2014

Outline

Introduction

Related Works

Procedure for Deflating Estimates of the Available Bandwidth

Evaluation

Final Considerations

Introduction

Workflow Scheduling Problem in Hybrid Clouds
- Peak demand time:
  • Private resources → overloaded or insufficient
  • Hybrid cloud: public resources + private resources
- What are the advantages of using public clouds?
  • Elasticity
  • Pay-as-you-go basis
- Workflow scheduling problem

Introduction

Current schedulers
- Not designed to cope with imprecise information
- Produce schedules without taking into account the variability of the available bandwidth in inter-cloud links
- Available bandwidth can increase or decrease at run time
- Application execution can lead to:
  • Violation of deadlines

Introduction

Purpose of this work

How can we reduce the negative impact of imprecise information about the inter-cloud available bandwidth on the schedules produced by a scheduler that was not designed to cope with such imprecise information?

Challenge

Use the original scheduling algorithm

Proposed Mechanism

Deflate the estimate of the inter-cloud available bandwidth based on the expected imprecision of that estimate, and provide the deflated bandwidth estimate as an input to the scheduler

Related Works

Rahman et al. (2010)
– Performance of the Amazon EC2 network
– Analysis of the packet delay of VMs to/from Amazon EC2
– Large delay variations
– Negatively impact the performance of scientific applications

Batista et al. (2010)
– Describe tools for estimating available bandwidth
– Produce estimations with large variability

Procedure for Deflating Estimates of the Available Bandwidth in Inter-cloud Links

[Figure: scheduling pipeline as first shown. Labels: available bandwidth estimation tool, estimate of the available bandwidth, scheduler, hybrid cloud, application workflow and deadline value, schedule.]

[Figure: the same pipeline with the procedure added. Additional labels: expected uncertainty value, procedure, deflated available bandwidth estimate.]

Procedure for Deflating Estimates of the Available Bandwidth in Inter-cloud Links

Procedure
- History of past executions of the target workflow
- When a workflow is about to be scheduled:
  1. Obtain the estimate of the available bandwidth
  2. Obtain the expected uncertainty value
  3. Query the history of past executions of the target workflow
  4. Calculate the deflating factor U
  Example: U = 10 ⇒ the scheduler receives 90% of the estimate of the available bandwidth
- The schedule produced is based on the expected uncertainty of the estimate of the available bandwidth in inter-cloud links
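
The last step can be illustrated with a minimal sketch. It assumes, as the U = 10 ⇒ 90% example suggests, that U is a percentage removed from the estimate. The function name `deflate_bandwidth` and the 60 Mbps input are illustrative and do not come from the slides.

```python
def deflate_bandwidth(estimate_mbps: float, u_percent: float) -> float:
    """Return the bandwidth estimate reduced by u_percent percent.

    Assumption: U is a percentage in [0, 100], so U = 10 keeps 90%.
    """
    if not 0.0 <= u_percent <= 100.0:
        raise ValueError("deflating factor U must be between 0 and 100")
    return estimate_mbps * (100.0 - u_percent) / 100.0


# Hypothetical input: a 60 Mbps estimate deflated with U = 10.
deflated = deflate_bandwidth(60.0, 10.0)
print(f"{deflated:.1f} Mbps")
```

With these inputs the scheduler would be given 54 Mbps instead of 60 Mbps, so it plans transfers more conservatively without any change to the scheduling algorithm itself.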

Procedure for Deflating Estimates of the Available Bandwidth in Inter-cloud Links

[Figure: pipeline with the procedure and a database, built up over several overlays. Labels: database, available bandwidth estimation tool, estimate of the available bandwidth (labeled "observed available bandwidth value" in the first overlay), expected uncertainty value, procedure, deflated available bandwidth, scheduler, hybrid cloud, application workflow and deadline value, schedule. The last overlay adds the labels "Untouched" and "Qualified Solution".]

Procedure for Deflating Estimates of the Available Bandwidth in Inter-cloud Links

Computation of the Deflating Factor U for the Target Workflow
- Multiple linear regression: f(x, y) = ax + by + c
  • x: current estimate of the available bandwidth
  • y: current expected uncertainty
  • Deflating factor U = f(x, y)

Computation of the coefficients a, b and c
- Target workflow G: dataset H_G
  • 5-tuple h_i = (bw, p, U, error_G^m, error_G^$)
- Subset H_k ⊆ H_G
  • For each pair (bw, p) in H_G, the tuples (bw, p, U_m) and (bw, p, U_$) are added into H_k
- Subset H_k is used by the multiple linear regression
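
The regression step can be sketched as an ordinary least-squares fit of the plane f(x, y) = ax + by + c to the (bw, p, U) tuples in H_k. The code below is only an illustration: the sample tuples are synthetic (they lie exactly on a made-up plane), the function names are hypothetical, and clamping U to [0, 100] is an assumption that the slides do not state.

```python
def fit_plane(samples):
    """Least-squares fit of U = a*x + b*y + c to (x, y, U) samples.

    Builds the 3x3 normal equations and solves them by Gauss-Jordan
    elimination with partial pivoting.
    """
    ata = [[0.0] * 3 for _ in range(3)]
    atu = [0.0] * 3
    for x, y, u in samples:
        row = (x, y, 1.0)
        for i in range(3):
            atu[i] += row[i] * u
            for j in range(3):
                ata[i][j] += row[i] * row[j]

    # Augmented matrix [A^T A | A^T u].
    m = [ata[i] + [atu[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("samples do not determine a unique plane")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def deflating_factor(coeffs, bw, p):
    """Evaluate U = f(bw, p); clamping to [0, 100] is an assumption."""
    a, b, c = coeffs
    return min(max(a * bw + b * p + c, 0.0), 100.0)


# Synthetic H_k: (bw in Mbps, p in %, U in %), not data from the paper.
H_K = [(10, 50, 12), (60, 50, 22), (10, 90, 28), (60, 90, 38), (30, 70, 24)]

coeffs = fit_plane(H_K)
print(deflating_factor(coeffs, bw=40, p=60))
```

Solving the normal equations directly keeps the sketch free of external dependencies; a numerical library's least-squares routine would serve the same purpose on a real history.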

Evaluation

Experimental Parameters
- Scheduler
  • HCOC scheduling algorithm
- Hybrid cloud scenario
  • 1 private cloud and 2 public clouds
  • Inter-cloud bandwidths of 10 to 60 Mbps
  • Intra-cloud bandwidths of 1 Gbps
- Simulator
  • Estimates the makespan and cost of the execution of the workflow

Evaluation

[Figure: simulation setup. Labels: DAX file, VMs file, scheduler, schedule, simulator, available bandwidth b, reduction factor U, uncertainty p, makespan and cost ($), database.]

Evaluation

Experimental Steps

1. A history of executions was created
   • Fixed bandwidth deflating factors U ∈ {0, 25, 50}
   • p varying from 45% to 99%
   • 100 simulations
2. Multiple linear regression (MLR) procedure
   • f(x, y) = ax + by + c
   • Fitted using 50% and 100% of the dataset
3. Use the equation f(x, y) to calculate the deflating factor U
   • 100 simulations

Evaluation

[Figure: percentage of qualified solutions versus uncertainty p for the Montage DAG. Series: U = 0, U = 25, U = 50, MLR 50%, MLR 100%. Inter-cloud available bandwidth of 60 Mbps; deadline D = Tmax × 3/7.]
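
As a rough illustration of the metric on the vertical axis, the sketch below computes the deadline D = Tmax × 3/7 and the share of simulated runs that meet it. It assumes that a solution counts as "qualified" when its makespan does not exceed the deadline, which the slides do not spell out. The function names, the Tmax value and the makespans are hypothetical.

```python
def deadline(t_max: float) -> float:
    """Deadline used in the plots: D = Tmax * 3/7."""
    return t_max * 3.0 / 7.0


def percent_qualified(makespans, d: float) -> float:
    """Percentage of runs whose makespan does not exceed the deadline d.

    Assumption: a solution is 'qualified' when it meets the deadline.
    """
    if not makespans:
        raise ValueError("at least one simulated makespan is required")
    met = sum(1 for m in makespans if m <= d)
    return 100.0 * met / len(makespans)


# Hypothetical values: Tmax = 70 and four simulated makespans.
d = deadline(70.0)
print(percent_qualified([25.0, 29.5, 30.0, 41.0], d))
```

In this made-up case the deadline is 30 and three of the four runs meet it, so 75% of the solutions would be counted as qualified.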

Evaluation

[Figure: average makespan estimation versus uncertainty p for the Montage DAG. Series: U = 0, U = 25, U = 50, MLR 50%, MLR 100%. Inter-cloud available bandwidth of 60 Mbps; deadline D = Tmax × 3/7.]

Final Considerations

Conclusion
- Current schedulers
  • Assume that the estimated available bandwidth is precise at scheduling time
  • Produce inefficient scheduling decisions
  • Miss deadlines and increase costs and makespan beyond what was expected
- The proposed procedure
  • Deflates the estimate of the available bandwidth in inter-cloud links
  • Uses a multiple linear regression approach
  • Increases the number of qualified solutions

Thank You!

Questions?

[email protected]

Acknowledgment: grant #2014/08607-4, São Paulo Research Foundation (FAPESP)