
Space-based approach to high throughput computations in UNICORE 6 Grids

Bernd Schuller, Miriam Schumacher

Jülich Supercomputing Centre, Distributed Systems and Grid Computing
Forschungszentrum Jülich GmbH

UNICORE Summit, August 26, 2008

Outline

• Motivation: high-throughput computing
• What is a tuple space?
• "XML Spaces" based on WSRFlite / UNICORE 6
• Job execution using the tuple space
• Performance measurements and results

High-throughput computing – characteristics and challenges

• Characteristics
  – Many (small) jobs, many (small) resources
• Examples
  – Docking (e.g. WISDOM)
  – High-throughput screening, e.g. applying a QSAR model for property prediction to a very large number of structures

[Figure: split → process → join]

Example: find a drug for combating avian flu

Garcia-Sosa, A.T., Sild, S., Maran, U. ChemMedChem, submitted 2008.

Using docking for virtual screening

• Docking is very well suited for massive parallelization: 1 job per docking run

• Docking was run through the UNICORE command-line client. UNICORE was used for distributing and running the jobs and for collecting their output
• Single UNICORE site
• ~ 1,500 jobs per day on 20 cluster nodes
• Each job took around 15 minutes of real time on average
• 50,000-100,000 ligands per virtual screening
• 33-66 days on 20 nodes (or 7-12 days on 100 nodes)
• Promising strategies and molecules for new inhibitors of avian flu have been obtained

Garcia-Sosa, A.T., Sild, S., Maran, U. ChemMedChem, submitted 2008.
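The duration estimate on this slide follows from simple arithmetic on the quoted throughput. The following is a minimal sketch, assuming one job per ligand and throughput that scales linearly with the number of nodes; the function name and the linear-scaling assumption are illustrative and not taken from the study.

```python
def screening_days(ligands, jobs_per_day=1500, nodes=20, base_nodes=20):
    """Days needed to dock `ligands` ligands, one job per ligand.

    Assumption: throughput scales linearly with the node count, starting
    from the ~1,500 jobs per day quoted for 20 nodes.
    """
    throughput = jobs_per_day * nodes / base_nodes
    return ligands / throughput


for ligands in (50_000, 100_000):
    print(ligands,
          round(screening_days(ligands), 1),
          round(screening_days(ligands, nodes=100), 1))
```

Under these assumptions the 100-node estimate is simply one fifth of the 20-node estimate.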

High-throughput computing – problems of conventional architectures

• Challenges
  – Scalable resource discovery, efficient resource usage
• The "information gap": a lot of state information must be available to the information systems to allow efficient resource usage
• Scalability: information systems and schedulers (usually) become bottlenecks as execution nodes are added

High-throughput computing – UNICORE 6 based approaches

• Command-line client (UCC) batch mode

• UNICORE 6 Workflow system

• Tuple Space based approach

• Other(s)

Tuple Space basics

[Figure: clients access the tuple space through write(), read() and take(); entries are XML documents, and lookups use a query template.]

Tuple Spaces principles

• Tuple space stores entries
• Tuple space entries have a lifetime ("lease time")
• Basic API
  – write(Entry, LeaseTime)
    • inserts a new entry into the tuple space
  – read(Template, Timeout)
    • returns a matching entry
  – take(Template, Timeout)
    • returns a matching entry and removes it from the tuple space
  – notify(Template)
    • tuple space will notify the client upon insertion of a matching entry
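The four operations can be illustrated with a small in-memory model. This is a sketch only, not the UNICORE Spaces implementation: the class name TupleSpace, the use of Python dicts as entries and templates, and the callback argument to notify() are assumptions made for the example.

```python
import threading
import time


class TupleSpace:
    """In-memory tuple space; entries are dicts, templates are partial dicts."""

    def __init__(self):
        self._entries = []                 # list of (expiry time, entry)
        self._listeners = []               # list of (template, callback)
        self._cond = threading.Condition()

    @staticmethod
    def _matches(template, entry):
        return all(entry.get(k) == v for k, v in template.items())

    def write(self, entry, lease_time):
        """Insert an entry that expires after lease_time seconds."""
        with self._cond:
            self._entries.append((time.monotonic() + lease_time, dict(entry)))
            callbacks = [cb for tpl, cb in self._listeners
                         if self._matches(tpl, entry)]
            self._cond.notify_all()        # wake up blocked read()/take() calls
        for cb in callbacks:               # run callbacks outside the lock
            cb(dict(entry))

    def _find(self, template, remove):
        now = time.monotonic()
        # Drop entries whose lease has run out, then scan for a match.
        self._entries = [(exp, e) for exp, e in self._entries if exp > now]
        for i, (_, entry) in enumerate(self._entries):
            if self._matches(template, entry):
                if remove:
                    del self._entries[i]
                return dict(entry)
        return None

    def _wait_for(self, template, timeout, remove):
        deadline = time.monotonic() + timeout
        with self._cond:
            while True:
                found = self._find(template, remove)
                remaining = deadline - time.monotonic()
                if found is not None or remaining <= 0:
                    return found
                self._cond.wait(remaining)

    def read(self, template, timeout=0.0):
        """Return a matching entry, or None if none appears in time."""
        return self._wait_for(template, timeout, remove=False)

    def take(self, template, timeout=0.0):
        """Like read(), but also remove the entry from the space."""
        return self._wait_for(template, timeout, remove=True)

    def notify(self, template, callback):
        """Call callback(entry) whenever a matching entry is written."""
        with self._cond:
            self._listeners.append((template, callback))


space = TupleSpace()
space.write({"id": 1, "status": "NEW"}, lease_time=60)
print(space.read({"status": "NEW"}))   # entry stays in the space
print(space.take({"status": "NEW"}))   # entry is removed
print(space.take({"status": "NEW"}))   # nothing left to take
```

The only difference between read() and take() is whether the matched entry is removed, which is what makes take() usable as an atomic "claim this item" operation.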

Template-based queries

• read(), take() use "query by example"
• Supply a template for querying, with the relevant fields set
• Example:
  – give me an entry where the field "status" has the value "DONE"
• Intuitive and easy to use
• Not as powerful as a real query language (such as SQL, XPath or XQuery)
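Query by example over XML documents can be sketched as follows. The matching rule used here (every non-empty child element of the template must occur with the same text as a direct child of the entry) is an assumption for illustration; the slides do not specify the exact matching semantics of the XML space.

```python
import xml.etree.ElementTree as ET


def matches(template: ET.Element, entry: ET.Element) -> bool:
    """Return True if every field set in the template appears, with the
    same text, as a direct child of the entry."""
    if template.tag != entry.tag:
        return False
    for field in template:
        wanted = (field.text or "").strip()
        if not wanted:
            continue  # an empty field acts as a wildcard
        found = entry.find(field.tag)
        if found is None or (found.text or "").strip() != wanted:
            return False
    return True


entries = [
    ET.fromstring("<Job><JobID>a</JobID><Status>NEW</Status></Job>"),
    ET.fromstring("<Job><JobID>b</JobID><Status>DONE</Status></Job>"),
]

# "Give me an entry where the field Status has the value DONE."
template = ET.fromstring("<Job><Status>DONE</Status></Job>")

done = [e.findtext("JobID") for e in entries if matches(template, e)]
print(done)
```

A template can only express equality on the fields it sets, which is why this style is less expressive than XPath or SQL: there is no way to ask for ranges, negation or joins.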

JavaSpaces

• Java-based tuple space (stores Java objects)
• Part of Sun's Jini specification
• Open-source and commercial implementations exist
  – GigaSpaces (commercial)
  – Sun Jini
  – Blitz
• Communicates using Java RMI (but also SOAP etc.)

Tuple spaces: pros and cons

• Pro
  – Decouples communication
  – Enables highly scalable ("share-nothing") architectures
• Con
  – Tuple space itself is hard to distribute
  – Tuple space itself may become the bottleneck
  – Not all applications fit this model

Idea: "XML Space"

• Store any XML documents
• Use UNICORE 6 protocols and tools
• WSRF fits the tuple space model very nicely
  – resource + lifetime concepts
  – XML centric
• Diploma thesis by Miriam Schumacher
  – used UNICORE 6 / WSRFlite to implement such an "XML space"
  – Prototype + example application available: http://unicore.svn.sf.net/svnroot/unicore/contributions/unicore-spaces/trunk

UNICORE Spaces

• Small add-on to UNICORE 6
• Two services
  – Space
    • web service, offering write(), read(), take()
  – SpaceEntry
    • WSRF service
    • Each instance corresponds to one entry in the space
    • XML document is stored as a WSRF resource property
• Example client (SpaceClient)
  – ca. 300 lines of code

Job execution example

[Figure: job entries exchanged between Client and ExecutionNode through the tuple space]

1. Job entry with status NEW:

<Job xmlns="...">
  <JobID>my test job</JobID>
  <Status>NEW</Status>
  <JSDL>
    <jsdl:JobDescription>
      <jsdl:Application>...</...>
    </jsdl:JobDescription>
  </JSDL>
</Job>

2. Job entry with status SUBMITTED:

<Job xmlns="...">
  <JobID>my test job</JobID>
  <ServerJobID>...</ServerJobID>
  <ServerID>...</ServerID>
  <Status>SUBMITTED</Status>
  <Address>...</Address>
  <JSDL>...</JSDL>
</Job>

3. Job entry with status DONE:

<Job xmlns="...">
  <JobID>my test job</JobID>
  <ServerJobID>...</ServerJobID>
  <ServerID>...</ServerID>
  <Address>...</Address>
  <Status>DONE</Status>
  <JSDL>...</JSDL>
</Job>

Job execution

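[Figure: message sequence between Client, tuple space and ExecutionNode]

1. Client: write(), job.status=NEW
2. ExecutionNode: take(), job.status=NEW
3. ExecutionNode: write(), job.status=SUBMITTED
4. ExecutionNode: execute locally
5. ExecutionNode: write(), job.status=DONE
6. Client: take(), job.status=DONE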
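The six steps can be simulated with worker threads sharing a tiny in-memory space. This is a sketch under stated assumptions: the Space class, the dict-based entries, the field names and the summing "job" are all hypothetical stand-ins, and removing the SUBMITTED entry before writing the DONE entry is a choice made for this example, since the slides do not say how the entry is replaced.

```python
import threading
import time


class Space:
    """Tiny thread-safe stand-in for the tuple space (no leases)."""

    def __init__(self):
        self._entries = []
        self._cond = threading.Condition()

    def write(self, entry):
        with self._cond:
            self._entries.append(dict(entry))
            self._cond.notify_all()

    def take(self, template, timeout):
        """Remove and return the first entry matching the template, or None."""
        deadline = time.monotonic() + timeout
        with self._cond:
            while True:
                for i, entry in enumerate(self._entries):
                    if all(entry.get(k) == v for k, v in template.items()):
                        return self._entries.pop(i)
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return None
                self._cond.wait(remaining)


def execution_node(space, name, stop):
    """Worker loop covering steps 2 to 5."""
    while not stop.is_set():
        job = space.take({"status": "NEW"}, timeout=0.1)       # step 2
        if job is None:
            continue
        job.update(status="SUBMITTED", server_id=name)
        space.write(job)                                        # step 3
        result = sum(job["numbers"])                            # step 4 (stand-in for real work)
        # Assumption: the SUBMITTED entry is removed before DONE is written.
        space.take({"job_id": job["job_id"], "status": "SUBMITTED"}, timeout=1.0)
        job.update(status="DONE", result=result)
        space.write(job)                                        # step 5


space, stop = Space(), threading.Event()
workers = [threading.Thread(target=execution_node, args=(space, f"node{i}", stop))
           for i in range(1, 3)]
for w in workers:
    w.start()

for n in range(5):                                              # step 1
    space.write({"job_id": n, "status": "NEW", "numbers": [n, n + 1]})

results = {}
for _ in range(5):                                              # step 6
    done = space.take({"status": "DONE"}, timeout=5.0)
    results[done["job_id"]] = done["result"]

stop.set()
for w in workers:
    w.join()
print(sorted(results.items()))
```

Note that the client never addresses a particular execution node: whichever worker takes a NEW entry first runs the job, so adding workers requires no change on the client side.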

Components

[Figure: components]

• Client
  – write() new jobs
  – take() done jobs
• Tuple space
• ExecutionNode
  – Job Taker
    • take() new jobs
    • write() submitted jobs
    • write() done jobs
  – TSS, JMS and XNJS
    • Job Taker submits/starts jobs and is notified on status changes

Performance test configuration

• node0: Gateway, shared registry, XUUDB, tuple space
• node1 to node4: execution nodes
• node5: client
  – UCC in batch mode (without fetching output)
  – client for space-based batch mode

Some first results: job throughput

Nodes   Jobs (time in seconds)
        100     400     1000    5000
1       97
2       48      146     520
4       26      80      231     940
5       22

For comparison: UCC batch mode, 100 jobs @ 4 nodes = 126 seconds (UCC "tuned" to not check resource availability and to not fetch any output files)

read() / take() : performance

[Figure: mean lookup time [ms] (y-axis, 7.5 to 40) versus number of entries in the tuple space (x-axis, 0 to 10000)]

• Unit test: measures the mean time for read() of 100 random entries
• read()/take() becomes a bottleneck when many clients access the same space
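A measurement of this kind can be sketched as a micro-benchmark. The code below does not reproduce the slide's numbers, which were obtained against the actual service; it assumes a space that finds entries by a linear scan over in-memory dicts, which is one possible reason for lookup time to grow with the number of entries. The function name and entry layout are hypothetical.

```python
import random
import time


def mean_read_ms(n_entries, n_lookups=100, seed=0):
    """Mean time in ms to find a random entry by linear template matching."""
    entries = [{"id": i, "status": "NEW"} for i in range(n_entries)]
    rng = random.Random(seed)
    targets = [rng.randrange(n_entries) for _ in range(n_lookups)]
    start = time.perf_counter()
    for target in targets:
        template = {"id": target}
        # Linear scan: compare the template against entries until one matches.
        next(e for e in entries
             if all(e.get(k) == v for k, v in template.items()))
    return (time.perf_counter() - start) * 1000 / n_lookups


for size in (100, 1000, 10000):
    print(size, round(mean_read_ms(size), 4))
```

Absolute timings depend on the machine; only the trend across sizes is of interest.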

Summary:

• Very promising!
• Pro
  – Excellent for simple requirements
  – Highly scalable
  – Very simple to set up
• Con
  – Difficult for complex requirements (e.g. co-scheduling)
  – The tuple space might eventually become the bottleneck!

Outlook

• Other use cases for UNICORE Spaces?
  – any "document-oriented state machine" will be easy
• Implementation aspects
  – improve read() / take() performance (partitioning, indexing...)
  – investigate distributed/clustered space (hard!)
• Job processing example application:
  – more than a toy
  – Security
    • need to delegate trust to the workers
  – Input/Output data
    • stage-in from shared storage? From client?
    • getting results: not a problem once TD is in place
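Of the implementation options listed above, indexing is the easiest to illustrate. The following is a sketch of one possible approach, not a description of planned UNICORE Spaces work: an inverted index from (field, value) pairs to entry ids, so that a template lookup intersects small id sets instead of scanning every entry. The class name and the restriction to flat dicts with hashable values are assumptions.

```python
from collections import defaultdict


class IndexedSpace:
    """Entries are flat dicts with hashable values."""

    def __init__(self):
        self._entries = {}                   # entry id -> entry
        self._index = defaultdict(set)       # (field, value) -> entry ids
        self._next_id = 0

    def write(self, entry):
        entry_id = self._next_id
        self._next_id += 1
        self._entries[entry_id] = dict(entry)
        for item in entry.items():
            self._index[item].add(entry_id)
        return entry_id

    def _candidates(self, template):
        if not template:
            return set(self._entries)
        sets = [self._index.get(item, set()) for item in template.items()]
        return set.intersection(*sets)

    def read(self, template):
        ids = self._candidates(template)
        return dict(self._entries[min(ids)]) if ids else None

    def take(self, template):
        ids = self._candidates(template)
        if not ids:
            return None
        entry_id = min(ids)                  # oldest matching entry
        entry = self._entries.pop(entry_id)
        for item in entry.items():
            self._index[item].discard(entry_id)
        return entry


space = IndexedSpace()
space.write({"job_id": "a", "status": "NEW"})
space.write({"job_id": "b", "status": "DONE"})
print(space.take({"status": "DONE"}))
print(space.read({"status": "DONE"}))
```

The trade-off is extra bookkeeping on every write() and take(), in exchange for lookups whose cost depends on the number of matching entries rather than on the total size of the space.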

Downloads, documentation, tutorials, mailing lists, community links, and more: http://www.unicore.eu

Thank you!

Project website: http://www.chemomentum.org
Funded by the European Commission, IST-5-033437