Virtual Geophysics Laboratory: Scientific workflows exploiting the cloud
Ryan Fraser, Terry Rankine, Lesley Wyborn, Joshua Vote, Ben Evans...
Presented by Robert Woodcock
October 2012
CSIRO | MINERALS DOWN UNDER FLAGSHIP
Gather data, process it, publish results
Simple, isn’t it?
Geo-information (knowledge, data): bedrock, surficial, mineral, geochemical, geochronologic, hydrogeological, geophysical
Virtual Geophysics Laboratory | Robert Woodcock
Data discovery
Layers discovered via remote registries
Layers consist of numerous remote data services
Some data services support subsetting
Some data services support reformatting, e.g. CSV, NetCDF, GeoTIFF
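As a rough illustration of subsetting and reformatting, the sketch below builds a request URL in the style of an OGC Web Coverage Service (WCS 1.0.0) GetCoverage call. The slides do not say which service protocol is involved, so the choice of WCS, the endpoint URL, and the layer name are assumptions made only for illustration.

```python
from urllib.parse import urlencode


def build_subset_request(endpoint, coverage, bbox, fmt, width=256, height=256):
    """Build a WCS 1.0.0-style GetCoverage URL for a spatial subset.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    fmt is the output format the service is asked to return.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat)")
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # the spatial subset
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,                           # the requested reformatting
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, not taken from the slides.
url = build_subset_request(
    "https://example.org/wcs", "magnetics_grid", (130.0, -25.0, 135.0, -20.0), "GeoTIFF"
)
print(url)
```

Swapping the `fmt` argument (for example to `"NetCDF"`) asks the same service for the same subset in a different format, provided the service supports it.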
Some data is only registered with flat files
Powered by the Spatial Information Services Stack
Common Platform: Marine, Environment, Water, Groundwater, Geology, Geophysics
Data processing
A variety of different scientific codes are already available in the form of “Toolboxes”
Further input files can be uploaded.
Input files are passed directly into the cloud
The steps so far have been building an environment to run a processing script
Either write your own...
...or build from existing templates
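One way to picture "build from existing templates" is plain placeholder substitution. The template text, the placeholder names, and the file name below are hypothetical; the slides do not describe how VGL templates actually work.

```python
from string import Template

# Hypothetical template: a tiny processing script with two placeholders.
SCRIPT_TEMPLATE = Template(
    "# Auto-generated processing script\n"
    "input_file = '$input_file'\n"
    "cell_size_m = $cell_size_m\n"
    "print('processing', input_file, 'at', cell_size_m, 'm')\n"
)


def render_script(input_file, cell_size_m):
    """Fill the template; substitute() raises KeyError if a placeholder is missing."""
    return SCRIPT_TEMPLATE.substitute(input_file=input_file, cell_size_m=cell_size_m)


script = render_script("subset.nc", 500)
print(script)
```

A user who prefers to write their own script would skip the template and supply the script text directly.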
What just happened?
Processing script/small input files uploaded
Start processing
Download big data sets
Perform data processing
Download Job Script/user input files
Upload processing results
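Read as a sequence, the steps above involve a portal side and a cloud side. The sketch below plays that sequence with in-memory dictionaries standing in for cloud storage and for the remote data service. The ordering of the steps, every function name, and the "processing" itself (a simple sum) are assumptions for illustration, not a description of the VGL implementation.

```python
def submit_job(storage, job_id, script, user_inputs):
    """Portal side: upload the processing script and small input files."""
    storage[job_id] = {"script": script, "inputs": dict(user_inputs), "results": None}


def run_job(storage, job_id, data_service):
    """Cloud side: fetch job files, pull the big data, process, upload results."""
    job = storage[job_id]
    script = job["script"]                     # download job script
    inputs = job["inputs"]                     # download user input files
    dataset = data_service[inputs["dataset"]]  # download big data set
    results = script(dataset)                  # perform data processing
    job["results"] = results                   # upload processing results
    return results


# Stand-ins: a dict as cloud storage, a dict as the remote data service.
storage = {}
data_service = {"gravity_subset": [1.5, 2.5, 4.0]}

submit_job(storage, "job-1", script=sum, user_inputs={"dataset": "gravity_subset"})
run_job(storage, "job-1", data_service)
print(storage["job-1"]["results"])
```

The point of the split is that only the script and small inputs travel through the portal, while the large data set is pulled by the cloud worker directly from the data service.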
Managing results - provenance
All of a job’s outputs are also accessible
Each job has a lifecycle that can be managed
Successful jobs can have their entire process captured in an ISO 19115 ‘provenance record’
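A managed lifecycle can be pictured as a small state machine, with the provenance record produced only once a job has finished successfully. The state names, the allowed transitions, and the record fields below are hypothetical; in particular, the dictionary is a loose stand-in and not an ISO 19115 encoding.

```python
from enum import Enum


class JobState(Enum):
    SAVED = "saved"
    PENDING = "pending"
    ACTIVE = "active"
    DONE = "done"
    FAILED = "failed"


# Hypothetical transition table; the real portal's states may differ.
ALLOWED = {
    JobState.SAVED: {JobState.PENDING},
    JobState.PENDING: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.DONE, JobState.FAILED},
    JobState.DONE: set(),
    JobState.FAILED: set(),
}


class Job:
    def __init__(self, name, inputs, script_name):
        self.name = name
        self.inputs = list(inputs)
        self.script_name = script_name
        self.outputs = []
        self.state = JobState.SAVED

    def advance(self, new_state):
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} not allowed")
        self.state = new_state

    def provenance_record(self):
        """Summarise inputs, process and outputs; only for successful jobs."""
        if self.state is not JobState.DONE:
            raise ValueError("provenance is only captured for successful jobs")
        return {
            "title": self.name,
            "sources": self.inputs,
            "process_step": self.script_name,
            "outputs": self.outputs,
        }


job = Job("inversion-01", ["gravity_subset.nc"], "invert.py")
job.advance(JobState.PENDING)
job.advance(JobState.ACTIVE)
job.outputs.append("model.nc")
job.advance(JobState.DONE)
record = job.provenance_record()
print(record)
```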
What just happened?
What’s the processing status?
What are the job input/outputs?
Publish the job’s process and results
Cloud storage will persist the final artefacts
Still under construction
Courtesy - www.textfiles.com
What’s left?
• BYO cloud allocation
  • Users should be able to authorise VGL to start jobs using their own compute/storage resources.
• Confidential Data
  • How do you get access to ‘restricted data’ in a secure manner?
  • Where can you store the results? (geographical restrictions)
• Massive Horizontal Scaling
  • What’s the best way to set up a truly elastic pool of CPUs for jobs to utilise?
• A Common Processing Services Platform – SISS-like?
[Diagram: layered view of integrated virtual laboratories]
Sustainable Resources Policy Societal Need
Integrated Virtual Labs: Virtual Solid Earth Sciences Laboratory, Environment V. Lab
Virtual Laboratories: Virtual Geophysical Laboratory, Virtual Core Laboratory, Virtual Geodesy Laboratory, Virtual Climate Laboratory, Virtual Water Laboratory
Virtual Libraries: Geophysics, Borehole data, Geodesy, Climate Modelling, Water Monitoring
Repeated five times in the diagram: Processing Services, Data, Middleware
Modelling & analytic tools