Frieda meets Pegasus-WMS
www.cs.wisc.edu/condor
Frieda meets Pegasus-WMS
› What if I want to define workflows that can flexibly take advantage of different grid resources?
› What if I want to register data products in a way that makes them available to others?
› What if I want to use the grid without a full Condor installation?
Pegasus Workflow Management System
› A higher level on top of DAGMan
› User creates an abstract workflow
› Pegasus maps abstract workflow to executable workflow
› DAGMan runs executable workflow
› Doesn’t need full Condor (schedd only)
Pegasus features
› Workflow has inter-job dependencies (similar to DAGMan)
› Pegasus can map jobs to grid sites
› Pegasus handles discovery and registration of data products
› Pegasus handles data transfer to/from grid sites
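
As a rough illustration of what inter-job dependencies mean in practice, the sketch below orders a few jobs so that each one runs only after its parents. The job names are hypothetical, and the ordering uses Python's standard `graphlib` module rather than anything from Pegasus or DAGMan.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs; each key maps to the set of jobs it depends on.
dependencies = {
    "prepare_a": set(),
    "prepare_b": set(),
    "compare": {"prepare_a", "prepare_b"},
    "summarize": {"compare"},
}

# static_order() yields a job only after all of its parents have been yielded.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Any order this prints is valid as long as parents precede children; a workflow engine is free to run independent jobs such as the two "prepare" jobs in parallel.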
Generating mosaics of the sky (Bruce Berriman, Caltech)

Mosaic size        Number    Input data   Intermediate   Total data   Approx. execution
(degrees square)*  of jobs   files        files          footprint    time (20 procs)
1                  232       53           588            1.2 GB       40 mins
2                  1,444     212          3,906          5.5 GB       49 mins
4                  4,856     747          13,061         20 GB        1 hr 46 mins
6                  8,586     1,444        22,850         38 GB        2 hrs 14 mins
10                 20,652    3,722        54,434         97 GB        6 hours

*The full moon is 0.5 deg. sq. when viewed from Earth; the full sky is ~400,000 deg. sq.
[Figure: Montage workflow diagram. Task nodes: Project (x3), Diff (x2), Fitplane (x2), BgModel, Background (x3), Add. Input files: Image1, Image2, Image3.]
Abstract Workflow (DAX)
› Pegasus workflow description: DAX
  › Workflow “high-level language”
  › Devoid of resource descriptions
  › Devoid of data locations
  › Refers to codes as logical transformations
  › Refers to data as logical files
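
To make the idea of an abstract workflow concrete, here is a minimal sketch in Python of jobs that name only logical transformations and logical files. It is not the DAX format or any Pegasus API: the `AbstractJob` class, the job IDs, and the file names `proj1`, `proj2`, and `diff12` are all hypothetical, and the dependencies are inferred here from shared file names purely for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class AbstractJob:
    job_id: str
    transformation: str                           # logical code name, not an executable path
    inputs: list = field(default_factory=list)    # logical file names, no locations
    outputs: list = field(default_factory=list)

# Three jobs in the spirit of the mosaic example; no sites or paths appear anywhere.
jobs = [
    AbstractJob("j1", "Project", ["Image1"], ["proj1"]),
    AbstractJob("j2", "Project", ["Image2"], ["proj2"]),
    AbstractJob("j3", "Diff", ["proj1", "proj2"], ["diff12"]),
]

def infer_dependencies(jobs):
    """Map each job to the jobs that produce its inputs."""
    producer = {f: j.job_id for j in jobs for f in j.outputs}
    return {
        j.job_id: sorted({producer[f] for f in j.inputs if f in producer})
        for j in jobs
    }

deps = infer_dependencies(jobs)
print(deps)
```

Because nothing in this description mentions where codes or data live, the same workflow can be mapped onto different resources without being rewritten.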
Basic Workflow Mapping
› Select where to run the computations
  › Change task nodes into nodes with executable descriptions
› Select which data to access
  › Add stage-in and stage-out nodes to move data
› Add nodes that register the newly-created data products
› Add nodes to create an execution directory on a remote site
› Write out the workflow in a form understandable by a workflow engine
  › Include provenance capture steps
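
The mapping steps above can be pictured with a small sketch that expands one abstract job into the extra nodes the list describes. This is an illustration under stated assumptions, not Pegasus's implementation: the `map_job` function, the dictionary-based catalogs, the executable path, and the example URL are all hypothetical placeholders.

```python
# Hypothetical catalogs; the path and URL are placeholders, not real locations.
transformation_catalog = {("Project", "viz"): "/opt/montage/bin/mProject"}
replica_catalog = {"Image1": "gsiftp://data.example.org/raw/Image1.fits"}

def map_job(job, site):
    """Expand one abstract job into a list of executable-workflow nodes."""
    nodes = [("create_dir", site)]                  # execution directory on the remote site
    for lfn in job["inputs"]:
        # Stage in each input from the physical location recorded in the catalog.
        nodes.append(("stage_in", replica_catalog[lfn], site))
    # Replace the logical transformation with a concrete executable for this site.
    executable = transformation_catalog[(job["transformation"], site)]
    nodes.append(("compute", executable, site))
    for lfn in job["outputs"]:
        nodes.append(("stage_out", lfn, site))      # move the product off the site
        nodes.append(("register", lfn))             # record the new data product
    return nodes

job = {"transformation": "Project", "inputs": ["Image1"], "outputs": ["proj1"]}
nodes = map_job(job, "viz")
for node in nodes:
    print(node)
```

A real mapper would also wire these nodes into the dependency graph and add provenance capture; the sketch only shows which kinds of nodes appear around a single compute task.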
Pegasus Workflow Mapping
Original workflow: 15 compute nodes, devoid of resource assignment
Resulting workflow mapped onto 3 Grid sites:
› 11 compute nodes (4 reduced based on available intermediate data)
› 13 data stage-in nodes
› 8 inter-site data transfers
› 14 data stage-out nodes to long-term storage
› 14 data registration nodes (data cataloging)
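
One way to picture the reduction of compute nodes based on available intermediate data is a backward walk from the requested outputs that stops wherever a file already exists. The sketch below is a toy model of that idea, not the algorithm Pegasus uses; the `reduce_workflow` function, the job names, and the file names are hypothetical.

```python
def reduce_workflow(jobs, targets, available):
    """Return the names of jobs that still need to run.

    jobs: {job_name: (input_files, output_files)}
    targets: files the user ultimately wants
    available: files that already exist somewhere (e.g. listed in a catalog)
    """
    producer = {f: name for name, (_, outs) in jobs.items() for f in outs}
    keep, seen, stack = set(), set(), list(targets)
    while stack:
        f = stack.pop()
        if f in seen or f in available:
            continue                      # already handled, or no need to recompute
        seen.add(f)
        if f not in producer:
            continue                      # raw input with no producing job
        job = producer[f]
        if job not in keep:
            keep.add(job)
            stack.extend(jobs[job][0])    # this job's inputs are now needed too
    return keep

jobs = {
    "project1": (["img1"], ["p1"]),
    "project2": (["img2"], ["p2"]),
    "diff": (["p1", "p2"], ["d"]),
    "add": (["d"], ["mosaic"]),
}

# The intermediate file "d" already exists, so only "add" has to run.
reduced = reduce_workflow(jobs, ["mosaic"], available={"img1", "img2", "d"})
full = reduce_workflow(jobs, ["mosaic"], available={"img1", "img2"})
print(sorted(reduced), sorted(full))
```

In this toy case three of the four jobs drop out, which is the same kind of saving as the 15-to-11 reduction on the slide.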
Pegasus WMS
[Figure: Pegasus WMS architecture]
› Inputs: Workflow Description in XML (DAX), Properties, Transformation Catalog, Site Catalog, Replica Catalog
› Submit Host: Pegasus Workflow Mapper, Condor DAGMan, Condor Schedd
› Execution resources: TeraGrid, Open Science Grid, campus resources, local machine
› Pegasus WMS restructures and optimizes the workflow, provides reliability
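
As a rough sketch of how the catalogs might be consulted together, the example below filters the requested sites down to those that are both described in a site catalog and have the needed transformation installed. The dictionary layout, the `candidate_sites` helper, and the paths are assumptions made for illustration; the real catalogs have their own formats.

```python
# Hypothetical catalog contents; site names "viz" and "local" are borrowed
# from the example commands in this deck, the paths are placeholders.
site_catalog = {
    "viz": {"scratch": "/scratch"},
    "local": {"scratch": "/tmp"},
}
transformation_catalog = {
    ("Project", "viz"): "/opt/montage/bin/mProject",
    ("Project", "local"): "/usr/local/bin/mProject",
    ("Add", "viz"): "/opt/montage/bin/mAdd",
}

def candidate_sites(transformation, requested_sites):
    """Sites that are known and have the transformation installed."""
    return [
        s for s in requested_sites
        if s in site_catalog and (transformation, s) in transformation_catalog
    ]

print(candidate_sites("Project", ["local", "viz"]))
print(candidate_sites("Add", ["local", "viz"]))
```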
Mapping a workflow
› To map a workflow, use the pegasus-plan command:
  % pegasus-plan -Dpegasus.user.properties=pegasus-wms/config/properties \
      --dir dags --sites viz --output local --force --nocleanup \
      --dax pegasus-wms/dax/montage.dax
› Creates executable workflow
Running a workflow
› To run a workflow, use the pegasus-run command:
  % pegasus-run -Dpegasus.user.properties=pegasus-wms/dags/train01/pegasus/montage/run0001/pegasus.51773.properties \
      pegasus-wms/dags/train01/pegasus/montage/run0001
› Runs condor_submit_dag and other tools
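
For scripted use, the plan and run steps could be assembled as argument lists before being handed to a process launcher. The sketch below only builds and prints the two command lines, using exactly the flags and paths from the slides; it does not execute anything, and the variable names are hypothetical.

```python
import shlex

# Arguments copied from the pegasus-plan example in this deck.
plan_cmd = [
    "pegasus-plan",
    "-Dpegasus.user.properties=pegasus-wms/config/properties",
    "--dir", "dags",
    "--sites", "viz",
    "--output", "local",
    "--force",
    "--nocleanup",
    "--dax", "pegasus-wms/dax/montage.dax",
]

# Run directory and properties file name as they appear in the pegasus-run example.
run_dir = "pegasus-wms/dags/train01/pegasus/montage/run0001"
run_cmd = [
    "pegasus-run",
    f"-Dpegasus.user.properties={run_dir}/pegasus.51773.properties",
    run_dir,
]

print(shlex.join(plan_cmd))
print(shlex.join(run_cmd))
```

On a submit host with Pegasus installed, each list could be passed to `subprocess.run(..., check=True)`. The run directory and properties file name come from one particular planning run and will differ for other runs.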
There’s much more…
› We’ve only scratched the surface of Pegasus’s capabilities
Pegasus links
› Pegasus home page: pegasus.isi.edu
› Tutorial materials available at: http://pegasus.isi.edu/tutorials.php
› For more questions: [email protected]