Accessing Cloud Computing to Support Water Resources Modeling
Scott D. Christensen, [email protected]
Nathan R. Swain, [email protected]. James Nelson, [email protected]
Norman L. Jones, [email protected]
This material is based upon work supported by the National Science Foundation under Grant No. 1135483.
Poster panels: Background; Applications; Tethys Platform Integration; CondorPy and TethysCluster; Summary
Advances in water resources modeling are providing us with better information; however, they require more computational power to run. Cloud computing enables universal access to cost-effective computing, yet there still remains a significant technical barrier to accessing these resources. Here we present a set of Python tools, TethysCluster and CondorPy, that have been developed to lower the barrier to modeling in the cloud by providing:
(1) programmatic access to dynamically scalable computing resources
(2) a batch scheduling system to queue and dispatch the jobs to the computing resources
(3) data management for job inputs and outputs
(4) the ability to dynamically create, submit, and monitor computing jobs
While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision support systems deployed as web apps.
Two Python modules have been developed to lower the technical barrier to
accessing cloud computing for performing large modeling tasks. TethysCluster
automates the process of provisioning diverse cloud resources and configuring
them with HTCondor. CondorPy interfaces with HTCondor to enable computing
jobs to be created, submitted, and monitored programmatically.
CondorPy and TethysCluster have been integrated into Tethys Platform, enabling
web apps to easily perform large computing tasks.
Stochastic Analysis
Uncertainty is inherent to hydrologic modeling,
and is often accounted for by performing a
stochastic analysis, which requires running
hundreds or thousands of model simulations.
For a spatially-distributed, physics-based model
such as GSSHA, running thousands of simulations
may take months or even years. TethysCluster
and CondorPy enable this type of analysis to be
done much faster using cloud computing.
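As a rough illustration of how the results of such an analysis might be combined, the sketch below computes a per-cell flood probability as the fraction of runs in which the cell is wet. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a real simulation (it is not GSSHA), and the rainfall range, cell capacities, and run count are invented.

```python
import random


def exceedance_probability(depth_runs, threshold=0.0):
    """Return, for each cell, the fraction of runs whose depth exceeds threshold."""
    n_runs = len(depth_runs)
    counts = [0] * len(depth_runs[0])
    for depths in depth_runs:
        for i, depth in enumerate(depths):
            if depth > threshold:
                counts[i] += 1
    return [count / n_runs for count in counts]


def toy_model(rain_mm, cell_capacity_mm):
    """Hypothetical stand-in for one model run: depth is the rainfall excess."""
    return [max(0.0, rain_mm - capacity) for capacity in cell_capacity_mm]


# Sample one uncertain input (rainfall) per run; the values are illustrative only.
rng = random.Random(42)
capacities = [20.0, 50.0, 80.0]
runs = [toy_model(rng.uniform(0.0, 100.0), capacities) for _ in range(1000)]
flood_prob = exceedance_probability(runs)
```

Because every run is independent of the others, each call to the model could be dispatched as a separate computing job, which is what makes this kind of analysis a natural fit for a pool of cloud resources.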
Job Manager
CondorPy has been integrated into the Tethys Platform Python SDK in the form of a job manager that enables developers to define computing jobs and submit them to the HTCondor pools to offload large computing tasks.
CondorPy
HTCondor is a software system that enables
High Throughput Computing (HTC) by managing
computing resources and scheduling computing
jobs. It enables diverse computing systems to be
linked together into a unified computing pool.
CondorPy serves as a cross-platform, high-level
interface for HTCondor, and allows jobs to be
created, submitted, and monitored from a Python
scripting environment. This interface facilitates the
use of HTCondor in a web environment like Tethys
Platform (see panel D).
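To give a sense of what a high-level interface has to produce, the sketch below renders an HTCondor submit description from a Python dictionary using only the standard library. It is not CondorPy's API; the function name `render_submit_description`, the script `run_model.sh`, and the file names are hypothetical and chosen only for illustration.

```python
def render_submit_description(attributes, queue=1):
    """Render submit-description text: one 'key = value' line per attribute,
    followed by a 'queue N' statement that requests N job instances."""
    lines = [f"{key} = {value}" for key, value in attributes.items()]
    lines.append(f"queue {queue}")
    return "\n".join(lines) + "\n"


job = {
    "universe": "vanilla",
    "executable": "run_model.sh",   # hypothetical model wrapper script
    "arguments": "$(Process)",      # HTCondor expands this to 0..N-1 per instance
    "output": "run_$(Process).out",
    "error": "run_$(Process).err",
    "log": "runs.log",
}
submit_text = render_submit_description(job, queue=100)
```

Text like this would normally be written to a file and handed to the `condor_submit` command; a Python interface removes that manual step so jobs can be defined and tracked from a script or web app.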
TethysCluster
Large modeling tasks often require a large amount of
computing resources. Commercial cloud providers
such as Amazon Web Services (AWS) and Microsoft
Azure provide on-demand, scalable resources;
however, configuring them with HTCondor can prove
challenging. StarCluster is a Python module that
automatically provisions and configures Linux
computing resources with AWS. TethysCluster is an
adaptation of StarCluster and expands its functionality
to work with both Linux and Windows resources with
AWS as well as Azure.
ci-water.github.io/condorpy
Tethys Platform
Tethys Platform is a water resources web development platform that lowers the barrier to creating web apps. Tethys Platform provides open source web GIS and visualization tools all integrated into a unified Python SDK.
Cluster Management
Cloud computing resources are easy to provision through the admin site of Tethys Portal, the web interface of Tethys Platform. TethysCluster works behind the scenes to automatically configure the cloud resources into an HTCondor computing pool.
Ensemble Forecast Processing
TethysCluster and CondorPy are used by the Streamflow Prediction Tool (a Tethys web app) to
automatically process a 52-member ensemble forecast produced by the European Centre for
Medium-Range Weather Forecasts. Every 12 hours, when a new forecast becomes available, a
scheduled Python script uses CondorPy to create 52 jobs, one per ensemble member. TethysCluster can
be used to automatically provision and de-provision cloud computing resources.
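A minimal sketch of the bookkeeping such a scheduled script might do is shown below: it picks the most recent forecast cycle and builds one job description per ensemble member. The 00 and 12 UTC cycle times, the job naming scheme, and the argument format are assumptions made for illustration, not details taken from the Streamflow Prediction Tool.

```python
from datetime import datetime, timezone


def latest_cycle(now):
    """Return the most recent cycle at or before 'now'.
    Assumes cycles start at 00 and 12 UTC (an illustrative assumption)."""
    hour = 12 if now.hour >= 12 else 0
    return now.replace(hour=hour, minute=0, second=0, microsecond=0)


def ensemble_jobs(cycle, members=52):
    """Build one hypothetical job description per ensemble member."""
    stamp = cycle.strftime("%Y%m%d%H")
    return [
        {"name": f"forecast_{stamp}_m{member:02d}",
         "arguments": f"{stamp} {member}"}
        for member in range(1, members + 1)
    ]


cycle = latest_cycle(datetime(2015, 6, 1, 14, 30, tzinfo=timezone.utc))
jobs = ensemble_jobs(cycle)
```

Each dictionary in `jobs` would then be turned into a computing job and submitted to the pool, so all 52 members can be processed at the same time rather than one after another.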
Hierarchical Modeling
Running high-fidelity models over large
domains often requires powerful computers
and lots of time. One way to alleviate this
problem is to partially parallelize the
computation by decomposing the domain into
smaller models. This results in a series of
hierarchical models whose execution must be
coordinated. CondorPy facilitates running this
type of workflow with HTCondor in a parallel
computing environment.
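One way to picture the coordination problem is as a dependency graph in which each sub-model can start only after the models upstream of it have finished. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to group models into stages whose members could run in parallel. The basin names and their layout are hypothetical, and this is not how CondorPy itself schedules the work.

```python
from graphlib import TopologicalSorter


def parallel_stages(upstream):
    """Group models into stages; every model in a stage depends only on
    models from earlier stages, so a stage's members can run concurrently."""
    sorter = TopologicalSorter(upstream)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # models whose upstream runs are done
        stages.append(ready)
        sorter.done(*ready)                 # mark the whole stage as finished
    return stages


# Hypothetical watershed: each key lists the sub-basins that drain into it.
upstream = {
    "outlet": {"mid_a", "mid_b"},
    "mid_a": {"head_1", "head_2"},
    "mid_b": {"head_3"},
}
stages = parallel_stages(upstream)
# stages == [["head_1", "head_2", "head_3"], ["mid_a", "mid_b"], ["outlet"]]
```

In this toy layout the three headwater models run first, the two mid-basin models run once their inflows are available, and the outlet model runs last.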
ci-water.github.io/TethysCluster
Probabilistic flood map resulting from 5000 model runs using the spatially-distributed, physics-based hydrologic model GSSHA.
Top: large watershed shown divided into hierarchical sub-basins. Bottom: diagram showing the parallelization and hierarchy of the models.
Screenshot of a Tethys web app, the Streamflow Prediction Tool, which uses CondorPy and TethysCluster to process ensemble forecasts.