Applying Scheduling and Tuning to On-line Parallel Tomography


Shava Smallen, Indiana University

Henri Casanova, Francine Berman, University of California at San Diego, San Diego Supercomputer Center


Outline

1. Introduction to On-line Parallel Tomography

2. Tunable On-line Parallel Tomography

3. User-directed application-level scheduler

4. Experiments

5. Summary


What is tomography?

• Tomography: a method for reconstructing the interior of an object from its projections

• National Center for Microscopy and Imaging Research (NCMIR)
  – Electron microscopy

[Image: electron microscope]


Example

[Image: tomogram of a spiny dendrite. Images courtesy of Steve Lamont.]

• Compute and data-intensive

• E.g., 2k x 2k dataset (pixels)
  – 2k units of work (slices)
  – Total input data size: 976 MB
  – Total output data size: 9.6 GB
  – Compute time: ~16 days on a standard workstation

• Off-line
  1. Data collection
  2. Data processing
  3. Data viewing


On-line Parallel Tomography

• Provide interactive soft real-time feedback on quality of data acquisition
  – High tomogram resolution and frequent refreshes
• Efficiency benefits for users and microscope

[Diagram: on-line parallel tomography]


NCMIR Compute Platform

• Distributed, multi-user, heterogeneous Grid

[Diagram: resources connected by a network]
  – Blue Horizon (SDSC): 1152 procs (AIX, LoadLeveler, Maui Scheduler)
  – NCMIR cluster: SGI Indigo2, SGI Octane (IRIX); SUN ULTRA, SUN Enterprise (Solaris)
  – Meteor cluster (SDSC): Pentium III dual procs (Linux)


Application Tunability

• On-line parallel tomography is a tunable application
  – [Chang, et al.] Availability of alternate configurations
    • Resource utilization
    • Output
• On-line parallel tomography output
  – Tomogram resolution
  – Refresh frequency
• Tunability controlled by configuration pair (f, r) where
  – f is the reduction factor (tomogram resolution)
  – r is the number of projections per refresh (refresh frequency)
  – E.g., (2, 3)

[Diagram: on-line parallel tomography, reduce(f)]
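The configuration pair (f, r) on this slide can be illustrated with a small sketch. It assumes that reduce(f) shrinks each image dimension by a factor of f and that the microscope delivers one projection every `acq_s` seconds; both are assumptions made for illustration, and `Config`, `reduced_size`, `refresh_period`, and the 45-second value are hypothetical names and numbers, not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    f: int  # reduction factor (controls tomogram resolution)
    r: int  # projections per refresh (controls refresh frequency)


def reduced_size(width, height, f):
    """Assumed meaning of reduce(f): shrink each dimension by a factor of f."""
    return width // f, height // f


def refresh_period(r, acq_s):
    """Seconds between refreshes if one projection arrives every acq_s seconds."""
    return r * acq_s


cfg = Config(f=2, r=3)                 # the (2, 3) example from the slide
print(reduced_size(2048, 2048, cfg.f)) # (1024, 1024)
print(refresh_period(cfg.r, 45.0))     # 135.0 (45 s per projection is hypothetical)
```

Under these assumptions, a larger f lowers resolution but shrinks the work per refresh, while a larger r lengthens the time between refreshes.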


Tunability/Scheduling

• At run-time, we need to find out which configuration pairs are feasible
  – Flexibility to allow for trade-offs between f and r
    • E.g., (2, 3) or (3, 2)
  – Resource availability
  – User bounds
    • E.g.,
      – Refresh at least once every 10 minutes
      – Minimum image resolution 256 x 256 pixels

• A configuration pair is feasible if we can find a corresponding schedule

• We choose an adaptive-scheduling approach
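As a rough illustration of how user bounds narrow the search, the sketch below enumerates the (f, r) pairs that satisfy the two example bounds and then keeps those for which a scheduler finds a schedule. The helper names, the `max_f` limit, the 45-second acquisition period, and the stand-in `find_schedule` callable are all assumptions for illustration; the slides do not specify them.

```python
def candidate_pairs(size, min_size, max_refresh_s, acq_s, max_f=8):
    """Enumerate (f, r) pairs that respect the user bounds."""
    pairs = []
    for f in range(1, max_f + 1):
        if size // f < min_size:           # resolution bound violated
            break
        r = 1
        while r * acq_s <= max_refresh_s:  # refresh bound still met
            pairs.append((f, r))
            r += 1
    return pairs


def feasible_pairs(pairs, find_schedule):
    """Keep the pairs for which find_schedule(f, r) returns a schedule (not None)."""
    return [p for p in pairs if find_schedule(*p) is not None]


# Example bounds from the slide: refresh at least every 10 minutes (600 s),
# minimum resolution 256 x 256, applied to a 2k x 2k dataset.
pairs = candidate_pairs(size=2048, min_size=256, max_refresh_s=600, acq_s=45)
print(len(pairs), (2, 3) in pairs)
```

A real `find_schedule` would consult resource availability; here any callable that returns `None` for infeasible pairs can be plugged in.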


Application-Level Scheduler (AppLeS)

AppLeS + application = self-scheduling application

• Enable an application to adaptively schedule its execution on distributed, heterogeneous resources in order to improve performance

• Type of information used:
  – static
    • e.g., application model, network topology, …
  – dynamic
    • e.g., Network Weather Service (NWS) - available CPU, bandwidth, …

User-directed AppLeS

• User-directed AppLeS
  – Involves user in scheduling process
  – Flexible

[Flow diagram between the user and the scheduler. Steps: generate request, process request, find schedule, display pairs, review pairs, adjust request, execute on-line parallel tomography. Branch labels: feasible, infeasible, accepts one, rejects all.]

On-line Parallel Tomography Architecture

[Diagram: a preprocessor, five workers, and a writer. Labels: projection, scanlines, slices, update tomogram.]


Scheduling Approach

• Constrained optimization problem based on soft real-time execution
  – compute constraint
    • static benchmark, dynamic CPU availability (NWS)
  – transfer constraint
    • topology info (ENV), dynamic bandwidth (NWS)
• Problem is a nonlinear program
  – Exploit small range of f to reduce to multiple mixed integer programs, which are solved via lp_solve
    • approximate solution
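The two constraints can be illustrated with a deliberately simplified sketch. This is not the mixed integer program solved with lp_solve: it is a greedy stand-in that, for a fixed f and r, computes how many slices each machine could own without violating a compute or a transfer limit. The model (work and slice size shrinking by f squared, one projection every `acq_s` seconds, output shipped once per refresh) and all names and numbers are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    sec_per_slice: float  # static benchmark: seconds per slice per projection at f=1
    cpu_avail: float      # dynamic CPU availability in (0, 1], e.g. from NWS
    bandwidth: float      # dynamic bandwidth to the writer in bytes/s, e.g. from NWS


def capacity(m, f, r, acq_s, slice_bytes):
    """Largest slice count machine m can own under both constraints (assumed model)."""
    t = m.sec_per_slice / (f * f)                      # assumed: work shrinks by f^2
    by_compute = acq_s * m.cpu_avail / t               # keep up with each projection
    by_transfer = r * acq_s * m.bandwidth / (slice_bytes / (f * f))  # ship per refresh
    return math.floor(min(by_compute, by_transfer))


def find_schedule(machines, f, r, n_slices, acq_s, slice_bytes):
    """Greedy allocation; returns {machine: slices} or None if (f, r) is infeasible."""
    needed = n_slices // f                             # assumed: slice count shrinks by f
    alloc = {}
    ranked = sorted(machines, key=lambda m: -capacity(m, f, r, acq_s, slice_bytes))
    for m in ranked:
        take = min(capacity(m, f, r, acq_s, slice_bytes), needed)
        if take > 0:
            alloc[m.name] = take
            needed -= take
    return alloc if needed == 0 else None


machines = [Machine("A", 1.0, 0.5, 1e6), Machine("B", 2.0, 1.0, 5e5)]
for f in (1, 2):                                       # small range of f, as on the slide
    print(f, find_schedule(machines, f, 1, 64, 10.0, 1e6))
```

Looping over the small set of f values mirrors the idea of fixing f to obtain one simpler subproblem per value; the subproblem solver itself is replaced here by the greedy rule.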


Experiments

• Goals:
  – Set 1: Scheduler Results
    • Evaluate scheduler efficacy
    • Evaluate impact of dynamic resource availability on scheduler efficacy
  – Set 2: Tunability Results
    • Evaluate usefulness of tunability
• Simulation
  – Number of experiments
  – Repeatability

NCMIR Grid

• Case Study:
  – week of traces: May 19-26, 2001
    • CPU availability (NWS)
    • Bandwidth (NWS)
    • Node availability (Maui scheduler showbf)


Scheduling Strategies

• 4 scheduling strategies

                          Assumes infinite bandwidth   Uses dynamic bandwidth info
Assumes dedicated CPU     wwa                          wwa+bw
Uses dynamic CPU info     wwa+cpu                      AppLeS
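The four strategies differ only in which dynamic information they consult. A minimal sketch of that distinction follows; the dictionary layout and the `scheduler_view` helper are hypothetical names used for illustration.

```python
import math

# Which dynamic information each strategy uses (from the 2 x 2 grid of strategies).
STRATEGIES = {
    "wwa":     {"dynamic_cpu": False, "dynamic_bw": False},
    "wwa+bw":  {"dynamic_cpu": False, "dynamic_bw": True},
    "wwa+cpu": {"dynamic_cpu": True,  "dynamic_bw": False},
    "AppLeS":  {"dynamic_cpu": True,  "dynamic_bw": True},
}


def scheduler_view(strategy, cpu_avail, bandwidth):
    """Return the (CPU availability, bandwidth) a strategy believes it has."""
    flags = STRATEGIES[strategy]
    cpu = cpu_avail if flags["dynamic_cpu"] else 1.0     # otherwise assume dedicated CPU
    bw = bandwidth if flags["dynamic_bw"] else math.inf  # otherwise assume infinite bandwidth
    return cpu, bw


for name in STRATEGIES:
    print(name, scheduler_view(name, cpu_avail=0.4, bandwidth=1e6))
```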


Simtomo

• Simulates an execution of on-line parallel tomography
• Uses Simgrid, Casanova [CCGrid’2001]
  – toolkit for evaluating scheduling algorithms
    • tasks
    • resources modeled using traces
  – E.g., parameter sweep applications [HCW’00]
• 2 types of simulations
  – Executed at 10 minute intervals
    • 1004 simulations x 4 schedulers


Simulation Types

1. Partially trace-driven (perfect load predictions)
2. Completely trace-driven (imperfect load predictions)

[Figure: a real trace and the corresponding load plots for each simulation type, on a 0 to 1 axis]


Performance Metric

• Relative refresh lateness

[Figure: relative refresh lateness, shown in terms of the actual refresh period and the expected refresh period (based on r)]
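The exact formula on this slide is not recoverable from the transcript. One plausible reading, offered purely as an assumption, is that lateness is the amount by which the actual refresh period exceeds the expected one, relative to the expected one. The function names and the 45-second acquisition period below are hypothetical.

```python
def expected_refresh_period(r, acq_s):
    """Expected seconds between refreshes, assuming one projection every acq_s seconds."""
    return r * acq_s


def relative_refresh_lateness(actual_s, expected_s):
    """Assumed definition: (actual - expected) / expected.

    A value of 0 means the refresh arrived on time; in this sketch an early
    refresh yields a negative value.
    """
    return (actual_s - expected_s) / expected_s


expected = expected_refresh_period(r=3, acq_s=45.0)  # 135.0 s
print(relative_refresh_lateness(162.0, expected))    # refresh arrived 27 s late
```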


Scheduling Results (1) (partially trace-driven)

[Figure: results for May 19-26, 2001. Annotations: 98%; importance of dynamic bandwidth info]


Scheduling Results (2) (completely trace-driven)

[Figure: results for May 19-26, 2001. Annotation: 57.1%]


Tunability Results

• How often does the pair change (i.e., tune)?
  – Assume a single user model where user always chooses pair with lowest f
  – Find the best pairs throughout simulated week
• Snapshot of Monday, May 21st
• On average, pair changed 25% of the time

[Timeline: 8:00, 9:00, 10:00, 11:00; pairs shown in order: (3,1), (2,2), (3,2), (2,2)]
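The tunability measurement can be sketched as follows: pick the best pair under the single-user model (lowest f), then count how often consecutive best pairs differ. Breaking ties on lowest r is an assumption, as are the helper names; the example sequence is the four pairs listed in the snapshot, in the order shown.

```python
def best_pair(feasible):
    """Single-user model: choose the pair with lowest f (ties on lowest r, assumed)."""
    return min(feasible, key=lambda p: (p[0], p[1]))


def change_fraction(pairs):
    """Fraction of consecutive time steps at which the best pair changed."""
    if len(pairs) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(pairs, pairs[1:]) if a != b)
    return changes / (len(pairs) - 1)


snapshot = [(3, 1), (2, 2), (3, 2), (2, 2)]
print(best_pair([(3, 1), (2, 2), (2, 5)]))
print(change_fraction(snapshot))
```

The snapshot covers only a few hours in which every listed pair differs from its predecessor, so its change fraction is not expected to match the 25% weekly average.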


Summary

• Tunable on-line parallel tomography at NCMIR

• Dynamic resource information improves scheduler efficacy– Dynamic bandwidth information is key

• Case for tunability in a Grid environment


Future Work

• Introduce cost, another tunable parameter: (f, r, $)
• More Grid simulations
  – Traces from various sites across US and Europe
• Generalizing to other applications
• Rescheduling
• Production use at NCMIR


Parallel Tomography at NCMIR

• Embarrassingly parallel

[Diagram: specimen with X, Y, and Z axes. Labels: projection, scanline, slice.]


Scheduling Latency

• Time to search for feasible triples

[Figure: scheduling latency for 1k x 1k and 2k x 2k datasets]
