Ecosys Experiment Engine

Post on 22-Jan-2018

Rob Simmonds
Grid Research Centre
University of Calgary

Grid Research Centre (GRC)

• Perform R&D in:
  – Grid computing
  – High Performance Computing
  – Cyberinfrastructure
  – Automation
  – Virtualization
• People
  – Executive director
  – Research director
  – 5 research staff
  – 5 students
• http://grid.ucalgary.ca

Objective

• Create Ecosys Experiment Engine by integrating CI components
• This should:
  – Manage multiple experiments and automate the running of jobs and movement of data
  – Manage jobs running on multiple clusters to improve throughput
  – Provide high level interface to reduce learning curve

Experiment overview

• Ecosys experiment consists of running large number of jobs using set of common data files
• Each job has specific input data file and creates multiple output files
• To run on distributed resources input data needs to be made available on each resource and output data returned to analysis site
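
The experiment structure described above can be sketched as a small data model. This is only an illustration, not the EEE implementation: the class names, the `stage_in_files` helper, and all file names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    # One job: its own input file plus the output files it creates.
    name: str
    input_file: str
    output_files: List[str] = field(default_factory=list)


@dataclass
class Experiment:
    # An experiment: common data files shared by every job, plus the jobs.
    name: str
    common_files: List[str]
    jobs: List[Job] = field(default_factory=list)

    def stage_in_files(self, job: Job) -> List[str]:
        """Files that must be present on an execution site before `job` starts."""
        return self.common_files + [job.input_file]


# Hypothetical example data.
exp = Experiment(
    name="demo",
    common_files=["soil.dat", "weather.dat"],
    jobs=[Job("run-001", "run-001.in", ["run-001.out", "run-001.log"])],
)
print(exp.stage_in_files(exp.jobs[0]))
```

The point of the sketch is that the common files only need to reach each site once, while the per-job input and the outputs move with every job.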

EEE Workflow

• Replicate common data to execution sites (clusters)
• Register these sites as ready
• Match “stagein – job start – stageout” to registered site(s)
• Job Starting Service (JSS) manages matchmaking and rule-based fault handling
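
A minimal sketch of the workflow steps above, assuming a very simple matchmaker: sites are registered once common data has been replicated, and each job's stagein, job start, stageout sequence is matched to a registered site. The slides do not describe the JSS rules, so the `FAULT_RULES` table, the `SiteRegistry` class, and the `run_step` callback are hypothetical stand-ins rather than the JSS design.

```python
from typing import Callable, Dict, List, Optional

STEPS = ("stagein", "job", "stageout")

# Hypothetical fault rules: which action to take when a given step fails.
FAULT_RULES: Dict[str, str] = {
    "stagein": "retry_other_site",
    "job": "retry_other_site",
    "stageout": "retry_same_site",
}


class SiteRegistry:
    """Tracks execution sites that already hold the replicated common data."""

    def __init__(self) -> None:
        self._ready: List[str] = []

    def register(self, site: str) -> None:
        if site not in self._ready:
            self._ready.append(site)

    def ready_sites(self) -> List[str]:
        return list(self._ready)


def run_job(
    job: str,
    registry: SiteRegistry,
    run_step: Callable[[str, str, str], bool],
    max_attempts: int = 3,
) -> Optional[str]:
    """Run stagein -> job -> stageout on a registered site.

    Returns the site where the whole sequence succeeded, or None.
    For simplicity, a retry reruns the full sequence.
    """
    sites = registry.ready_sites()
    if not sites:
        return None
    index = 0
    for _ in range(max_attempts):
        site = sites[index % len(sites)]
        failed = next((s for s in STEPS if not run_step(site, job, s)), None)
        if failed is None:
            return site
        if FAULT_RULES[failed] == "retry_other_site":
            index += 1  # move on to the next registered site
    return None


# Usage with a fake step runner in which stagein always fails on clusterA.
def fake_run_step(site: str, job: str, step: str) -> bool:
    return not (site == "clusterA" and step == "stagein")


registry = SiteRegistry()
registry.register("clusterA")
registry.register("clusterB")
print(run_job("run-001", registry, fake_run_step))
```

In this toy run the stagein failure on the first site triggers the "retry on another site" rule, so the job completes on the second registered site.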

CI tools employed

• Security – GSI, MyProxy
• Resource discovery – MDS4
• Job starting – Condor, GT4 GRAM
• Data management – SRB
• Data staging – Stork / GridFTP
• Portal – PHP based
• Execution management – JSS
  – Also explored Kepler, DAGMan

Kepler workflow

EEE Architecture

JSS Internals

Suggested future EEE work

• Continue to improve JSS fault handling
• Work with researchers to improve portal interface
• Add enhanced support for manipulating result data
  – Automated post processing to create spreadsheets/graphs as required
  – Embed interactive tools in portal environment to provide direct access to results data

Summary

• Have created Ecosys Experiment Engine using many existing CI components
• Application specific portals/gateways can provide simple access to complex automation tools
• Many open source CI tools need more development to make them robust and easy to integrate

http://grid.ucalgary.ca
