Abstractions for Data-Intensive Computing on Condor Christopher Moretti University of Notre Dame

1Christopher Moretti – University of Notre Dame4/21/2009

Abstractions for Data-Intensive Computing

on Condor

Christopher Moretti

University of Notre Dame


I want to complete big workloads3.6B Hamming distance computations, .02 seconds each

on a 2GHz dual-core desktop computer, each creating 1 real number output: 2.3 CPUYrs and 29GB of output.

85M 1000x1000 dynamic programming tables, .04 seconds each on a 3GHz quad-core Xeon server: 39 CPUDays requiring a total of 77GB of input data.

500x500 recurrence matrix, recurrence functions 7 seconds each on the desktop: 22 CPUDays, not completely independently parallelizable.

… can these be run this week? How about this afternoon? Can we even turn these into “lunchtime” or “coffee refill” problems?


How? On my workstation.

Write my program, make sure to make it partitionable, because it takes a really long time and might crash, debug it. Now run it for 39 days – 2.3 years.

On my department’s 128-node research cluster Learn MPI, determine how I want to move many GBs of data

around, re-write my program and re-debug, wait until the cluster can give me 8-128 homogeneous nodes at once, or go buy my own. Now run it.

BlueGene Get $$$ or access, learn custom MPI-like computation and

communication working language, determine how I want to handle communication and data movement, re-write my program, wait for configuration or access, re-debug my program, re-run.


So? Serially Cluster Supercomputer

So I can either take my program as-is and it’ll take forever, or I can do a new custom implementation to a certain particular architecture and re-write and re-debug it every time we upgrade (assuming I’m lucky enough to have a BlueGene in the first place)?

Well what about Condor?


Yes, what about Condor?

How do I fit my

workload into jobs?

Which resources

?

What happens when things

fail?

How Many?What is Condor?

What do I do with the results?

How can I measure job

stats?

What about job input

data?

How long will it take?


FFFFFF

Abstractions Fill in the Gap

1CPU Multicore Cluster Condor Supercomputer

Here is my function: F(x,y)Here is a folder of files: set SHere is a static library I need.

set S of files lib.a

binary function F

F

F FFFFFF

FFFF exec exec rfork condor_submit submit

F FF F

FFF

FFFFFF

FFFFFF

FFFFFF


What is an abstraction?

Abstraction: a declarative specification of the computation and data of a workload.

A restricted pattern, not meant to be a general purpose programming language.

Uses data structures instead of files. Regular structure makes it tractable to model and

predict performance.

Allows a user to repeat the same pattern of work many times, making slight changes to the data and algorithms


Abstractions Example Approaches

Data management: distribute data only to nodes where it is necessary for computation. Distribute broadcasted data via efficient algorithms. Access data in a memory-efficient order. Use data structures instead of flat files.

Task management: assign appropriate amounts of data per discrete task. Adapt to the environment by choosing nodes showing good performance. Submit/manage tasks that do not overwhelm the batch system.


The All-Pairs ProblemAll-Pairs(Set S1,Set S2,Function F)yields a matrix M:Mij = F(S1i,S2j)

60K 20KB images >1GB3.6B comparisons

@ 50/s = 2.3 CPUYrsx 8B output = 29GB

1 .8 .1 0 0 .1

1 0 .1 .1 0

1 0 .1 .7

1 0

1 .1

1

F

http://www.cse.nd.edu/people/faculty_bio.php?id=100000049






All Pairs Abstraction

set S of filesbinary function F

F

M = AllPairs(F,S)

invocation


Biometrics All-Pairs at Notre Dame


R[4,2]

R[3,2] R[4,3]

R[4,4]R[3,4]R[2,4]

R[4,0]R[3,0]R[2,0]R[1,0]R[0,0]

R[0,1]

R[0,2]

R[0,3]

R[0,4]

Fx

yd

Fx

yd

Fx

yd

Fx

yd

Fx

yd

Fx

yd

F

F

y

y

x

x

d

d

x F Fx

yd yd

Wavefront Recurrence ( R[x,0], R[0,y], F(x,y,d) )


Implementing Wavefront

CompleteInput

CompleteOutput

Master

Worker

Worker FInput

Output

Input

Output

F


Genome Assembly

Bioinformatics sequencers can only extract DNA from samples 50-1000 basepairs (A,C,G,T) at a time.

Biologists need the DNA together in genome profiles of millions of contiguous basepairs.

Genome assembly is the process of putting the pieces of the puzzle back together again in the right configuration. A principal step is “overlapping”.

One way would be a huge All-Pairs problem, but this isn’t necessary. Algorithms exist to extract a sparse matrix of possible candidate pairs (two sequences that might overlap in the right answer). So we must only compute the overlaps for the candidates.


Seq1 Seq2Seq1 Seq3Seq2 Seq3Seq4 Seq5

>Seq1ATG*CTAG>Seq2A*G*CTGA…

Input Sequence Data

Candidate (Work) List

Master

Worker

WorkQueue:“Align”“>Seq1\nATG*CTAG\n…”

Output Alignment Results(raw format)

Inputdata Align


Purpose of a Suite of Abstractions

Engineering: Big systems to solve cool problems. Science:

What are common elements of abstractions? What is an intuitive interface for users to adapt their existing serial

solutions to larger problems?data computation

F M = AllPairs(F,S)

invocation

Set S

F R = Wavefront(F,R)Initial State

F O = Overlap(F,C,S)CandsSeqs


Challenges Remaining with Abstractions

Exploiting a regular pattern doesn’t mean that nothing can go wrong … It’s not just domain scientists who can make mistakes that

lead to disastrous consequences. Sometimes good solutions to problems beget new and

interesting problems. A motivating example to finish …


starter

Our tasks are done, we don’t need workers anymore!

condor_rm;exit();

starter

starter

starter

master

Universe=vanilla…TransferFiles=always…


shadow

schedd

starter

starter

starter

starter

Okay, I’ll send back the data the workers

generated.

Okay, I’ll send back the data the workers

generated.

.

.

.

.

.


shadow

schedd

starter

starter

starter

starter

Here’s the data the worker generated!


shadow

schedd

starter

starter

starter

starter



shadow

schedd

starter

starter

starter

starterHere’s the data the worker

generated!


shadow

schedd

starter

starter

starter

starter


ENOSPACE? What?


shadow

schedd

starter

starter

starter

starter

Ack! We didn’t want those files back, they were

temporary. “TransferFiles=Always”

was a mistake! Now we’re out of space!


shadow

schedd

starter

Hrm, I can’t transfer back their files. I guess I’ll hang out and won’t

remove the jobs until I can.


shadow

schedd

starter

I’m waiting to finish my condor_rm until I can

transfer files back to the submitting node.

We’re waiting to clean up local state until you kill

your jobs.


shadow

schedd

starter

I’m waiting to finish my condor_rm until I can

transfer files back to the submitting node.

We’re waiting to clean up local state until you kill

your jobs.ARGH!


Moral of the Story

Abstractions are a way to give domain scientists tools that don’t require drastically changing their already-complete solutions, but still allow for efficient HPC/HTC.

Exploiting a regular pattern doesn’t mean that nothing can go wrong.

Interesting challenges remain, among these: Predicting performance on an unpredictable system Monitoring and adapting to fit a changing system Dealing with entangling relationships

e.g. Remote state requires local state Tying together the lessons learned from these abstractions

What do they have in common? Why is one solution right for one problem and wrong for another?


For More Information Christopher Moretti

[email protected]

Douglas Thain [email protected]

Cooperative Computing Lab http://cse.nd.edu/~ccl

Abstractions for Data-Intensive Computing on Condor Christopher Moretti University of Notre Dame

Documents

Transcript of Abstractions for Data-Intensive Computing on Condor Christopher Moretti University of Notre Dame