Opportune Job Shredding: An Efficient Approach for Scheduling Parameter Sweep Applications Rohan Kurian, Pavan Balaji, P. Sadayappan The Ohio State University

Page 1:

Opportune Job Shredding: An Efficient Approach for Scheduling Parameter Sweep Applications

Rohan Kurian, Pavan Balaji, P. Sadayappan

The Ohio State University

Page 2:

Parameter Sweep Applications

An important class of applications

Set of independent tasks

MCell Application: 3D simulations for sub-cellular architecture/physiology

GTOMO (Parallel Tomography) Application: multiple view-point simulation

Systems exist for scheduling on the Grid

Cluster-based Scheduling?

Page 3:

Application Level Schedulers

Manage the scheduling of applications

Break the application into appropriate chunks

APST (AppLeS Parameter Sweep Template)

NIMROD

Greedy approach to schedule PSA chunks

Page 4:

Presentation Roadmap

Job Scheduling in Clusters

Multi-Site Job Scheduling

PSA Scheduling Strategies

Multi-Site Scheduling of PSAs

Performance Evaluation

Conclusions

Page 5:

Job Scheduling in Clusters

Mapping arriving jobs to available resources

Multiple Schemes for Scheduling:

First Come First Serve (FCFS)

Conservative Scheduling

Aggressive or EASY Scheduling

Fair-Share Constraints: a user cannot have more than ‘N’ queued jobs

Submitting the multiple chunks of a PSA job: violation of Fair-Share constraints

Combine chunks to form a single parallel job
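The fair-share argument on this slide can be illustrated with a small sketch. Everything named here (`Job`, `violates_fair_share`, `combine_chunks`, the user name, the limit of 4) is hypothetical and not taken from the talk. The way the combined job's size is derived (processors summed, runtime taken as the longest chunk) is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    procs: int      # processors requested
    runtime: float  # estimated runtime


def violates_fair_share(queue, user, new_jobs, max_queued):
    """True if submitting `new_jobs` more jobs would exceed the per-user limit N."""
    queued = sum(1 for job in queue if job.user == user)
    return queued + new_jobs > max_queued


def combine_chunks(user, chunks):
    """Pack independent chunks into one parallel job (assumed model:
    all chunks run side by side, so procs add up and runtime is the longest chunk)."""
    return Job(user=user,
               procs=sum(c.procs for c in chunks),
               runtime=max(c.runtime for c in chunks))


chunks = [Job("alice", 1, 10.0) for _ in range(8)]
queue = []  # jobs already queued at the cluster

# Submitting the 8 chunks one by one breaks a limit of N = 4 queued jobs.
separate_ok = not violates_fair_share(queue, "alice", len(chunks), max_queued=4)

# Submitting them as one combined parallel job counts as a single queued job.
psa_job = combine_chunks("alice", chunks)
combined_ok = not violates_fair_share(queue, "alice", 1, max_queued=4)

print(separate_ok, combined_ok, psa_job.procs, psa_job.runtime)
```

Under these assumptions the separate submission is rejected while the combined 8-processor job is accepted.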

Page 6:

Formation of PSAs in Clusters

[Figure: diagram with the labels "Small Independent Tasks" and "Parallel Parameter Sweep Application".]


Page 8:

Multi-Site Job Scheduling

Multiple Simultaneous Requests

Job submitted to multiple sites

Started on the earliest cluster

Existing schemes have limitations

Heterogeneous Clusters

Different Scheduling Schemes
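As a rough illustration of the multiple-simultaneous-requests idea, the sketch below picks the site on which a job would start earliest. The function name `pick_site`, the site names, and the start-time estimates are all hypothetical. Withdrawing the duplicate requests from the other sites is an assumption of this sketch, not something stated on the slide.

```python
def pick_site(estimated_start):
    """Return (site with the earliest start, sites whose duplicate request is withdrawn).

    `estimated_start` maps a site name to the time at which the job would start there.
    """
    chosen = min(estimated_start, key=estimated_start.get)
    withdrawn = sorted(site for site in estimated_start if site != chosen)
    return chosen, withdrawn


# Hypothetical start-time estimates for one job submitted to three sites.
starts = {"site1": 120.0, "site2": 45.0, "site3": 300.0}
chosen, withdrawn = pick_site(starts)
print(chosen, withdrawn)
```

In a real deployment the job would simply sit in every local queue until one site starts it; using explicit start-time estimates keeps the example short.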

Page 9:

Multiple-simultaneous-requests

[Figure: diagram of Site 1, Site 2 and Site 3; each site is labeled with a Meta Scheduler, a Local Scheduler, and Jobs.]


Page 11:

PSA Scheduling Strategies

Flooding based Job Shredding:

Submit all chunks in the PSA at once

Greedy approach

Improves User and System metrics

Doesn’t ensure fairness to Non-PSA jobs

Opportune Job Shredding:

Uses an additional Application-Level Scheduler

Monitors the current schedule of the system

If no normal backfill is possible, allow PSA jobs to shred and backfill
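The opportune rule above can be sketched as a single scheduling step. This is a simplified model, not the authors' implementation: the names `Job`, `fits`, and `opportune_step` are hypothetical, and the schedule "hole" is reduced to a number of idle processors plus a time window before the next reservation.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int
    runtime: float


def fits(job, idle_procs, window):
    """A job can backfill if it fits in the idle processors and ends within the window."""
    return job.procs <= idle_procs and job.runtime <= window


def opportune_step(normal_queue, psa_chunks, idle_procs, window):
    """Return the jobs to start now in a hole of `idle_procs` processors for `window` time."""
    # Normal backfill has priority: take the first queued non-PSA job that fits.
    for job in normal_queue:
        if fits(job, idle_procs, window):
            return [job]
    # No normal backfill is possible: shred PSA chunks into the hole.
    started = []
    for chunk in psa_chunks:
        if fits(chunk, idle_procs, window):
            started.append(chunk)
            idle_procs -= chunk.procs
    return started


chunks = [Job(f"chunk{i}", 1, 5.0) for i in range(6)]

# Only a wide job is waiting, so the hole would stay empty without shredding.
shredded = opportune_step([Job("wide", 16, 50.0)], chunks, idle_procs=4, window=10.0)

# A narrow non-PSA job that fits takes the hole instead of the PSA chunks.
normal = opportune_step([Job("narrow", 2, 8.0)], chunks, idle_procs=4, window=10.0)

print([j.name for j in shredded], [j.name for j in normal])
```

Flooding, by contrast, would submit all six chunks up front, which is how it can take backfill opportunities away from non-PSA jobs.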


Page 13:

Multi-Site Scheduling for PSAs

Two-level Application Level Schedulers

No constraints on sites

Allowed to have different speeds

Allowed to have different scheduling policies

Similar to “Multiple Simultaneous Requests”

Simultaneous requests only for PSAs

Page 14:

Multi-Site Scheduling for PSAs

[Figure: diagram of Site 1, Site 2 and Site 3, each labeled with an App-Level Scheduler, a Job Queue and a Local Scheduler, together with a Meta Application-Level Scheduler.]


Page 16:

Performance Metrics

Response Time = Completion Time - Submit Time

Slowdown = Response Time / Runtime

Loss of Capacity (LOC):

LOC = min{(waiting jobs procs), idle procs}

T = Time for which this state lasts

LOC = LOC x T
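The three metrics can be written out as a short sketch. The function names and sample numbers are hypothetical. Summing the per-state LOC values over a trace is an assumption of this sketch, since the slide only defines LOC for a single state of duration T.

```python
def response_time(submit, completion):
    """Response Time = Completion Time - Submit Time."""
    return completion - submit


def slowdown(submit, completion, runtime):
    """Slowdown = Response Time / Runtime."""
    return response_time(submit, completion) / runtime


def loss_of_capacity(states):
    """Accumulate min(waiting procs, idle procs) x T over a sequence of system states.

    `states` is an iterable of (waiting_procs, idle_procs, duration) tuples.
    """
    return sum(min(waiting, idle) * duration for waiting, idle, duration in states)


# A job submitted at t=0 that completes at t=30 after running for 10 time units.
rt = response_time(0.0, 30.0)
sd = slowdown(0.0, 30.0, 10.0)

# Three hypothetical states: (procs requested by waiting jobs, idle procs, duration).
loc = loss_of_capacity([(8, 4, 10.0), (2, 6, 5.0), (0, 3, 20.0)])

print(rt, sd, loc)
```

The last state contributes nothing: processors are idle, but no job is waiting for them, so no capacity is counted as lost.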

Page 17:

Evaluation Scheme

Simulation based Approach

CTC trace from Feitelson’s archive

EASY backfilling used

For multi-site evaluation

CTC traces from 3 different months

Processing speeds in the ratio 2:1:3

Page 18:

Flooding Based Job Shredding

[Figure: two charts, "Average Slowdown (10% PSA Jobs)" and "Average Response Time (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: All Jobs, PSA Jobs, Non-PSA Jobs.]

• Up to 60% improvement for PSA Jobs

• Up to 90% worse performance for Non-PSA Jobs

Page 19:

Flooding: Job Category wise breakup

[Figure: two charts, "Average Response Time (10% PSA Jobs)" and "Average Slowdown (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: NarrowShort, NarrowLong, WideShort, WideLong.]

• Narrow Short Non-PSA jobs suffer most

• Loss of back-filling opportunities is the main reason

Page 20:

Flooding: Loss of Capacity

[Figure: chart "Loss Of Capacity (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease.]

• Up to 75% improvement in the Loss of Capacity

Page 21:

Opportune Job Shredding

[Figure: two charts, "Average Response Time (10% PSA Jobs)" and "Average Slowdown (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: All Jobs, PSA Jobs, Non-PSA Jobs.]

• Up to 70% improvement for PSA Jobs

• Less than 2% worsening in performance for Non-PSA Jobs

Page 22:

Opportune: Job Category wise breakup

[Figure: two charts, "Average Response Time (10% PSA Jobs)" and "Average Slowdown (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: NarrowShort, NarrowLong, WideShort, WideLong.]

• No category of Non-PSA jobs suffers more than 7%

Page 23:

Opportune: Loss of Capacity

[Figure: chart "Loss Of Capacity (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease.]

• Up to 12% improvement in the Loss of Capacity

Page 24:

Opportune (Multi-Site)

[Figure: two charts, "Average Response Time (10% PSA Jobs)" and "Average Slowdown (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: PSA Jobs and Non-PSA Jobs for each of Cluster1, Cluster2 and Cluster3.]

• Up to 95% improvement for PSA Jobs

• No significant loss of performance for Non-PSA jobs

Page 25:

Opportune (Multi-Site): Response Time

[Figure: chart "Average Response Time (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: PSA Jobs and Non-PSA Jobs for each of Cluster1, Cluster2 and Cluster3.]

• Up to 75% improvement for PSA Jobs

• No significant loss of performance for Non-PSA jobs

Page 26:

Opportune (Multi-Site): Slowdown

[Figure: chart "Average Slowdown (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: PSA Jobs and Non-PSA Jobs for each of Cluster1, Cluster2 and Cluster3.]

• Up to 95% improvement for PSA Jobs

• No significant loss of performance for Non-PSA jobs

Page 27:

Opportune (Multi-Site): Loss of Capacity

[Figure: chart "Loss Of Capacity (10% PSA Jobs)". X-axis: Load (1, 1.2, 1.5). Y-axis: Percentage decrease. Series: Cluster1, Cluster2, Cluster3.]

• Up to 45% improvement in the Loss of Capacity

Page 28:

Concluding Remarks

Opportune Job Shredding:

Efficient Scheduling of PSAs

Single Site and Multi-Site versions

Significant improvement for PSA jobs

Ensures that Non-PSA jobs are not affected

Plan to integrate this with Production Schedulers

Page 29:

Thank You!