Earth System Modeling Infrastructure
Cecelia DeLuca and the ESMF Team, NCAR/CISL
CCSM Software Engineering Working Group, June 17, 2009
Evolution
• Extending beyond historical emphasis on high performance frameworks
• Integration across model, data, and organizational elements
• Multi-agency and international contributions and partnerships essential
Elements of Modeling Infrastructure
1. Tight coupling tools and interfaces
• Emphasis on performance
• Frequent, high-volume transfers on high performance computers
• Examples: ESMF, Flexible Modeling System (FMS)
2. Loose coupling protocols and interfaces
• Emphasis on ease of use
• Lower-volume and infrequent transfers on desktop and distributed systems
• Examples: Open Modeling Interface (OpenMI, hydrology), web services
3. Science gateways
• Browse, search, and distribution of model components, models, and datasets
• Workflows, visualization, and analysis services
• Workspaces and management tools for collaboration
• Examples: Earth System Grid (ESG), Hydrologic Information System (HIS)
Elements of Modeling Infrastructure
4. Metadata conventions and ontologies
• Structured information describing modeling artifacts
• Ideally, with automated production of metadata from models
• Examples: CF conventions, CIM (from the EU Metafor project) in CMIP5
5. Governance
• Coordinated and controlled evolution of systems
• Example: ESMF Change Review Board, a quarterly multi-agency meeting that sets priorities and tasks and creates release schedules
Sample Applications
• Multi-model ensembles for weather and climate
• Automated modeling workflows
• Meeting stakeholder requests for model output outside the scope of the original research
• Coordinating standards development across the international community
Programs
• National Unified Operational Prediction Capability (NOAA and DoD operational NWP centers)
• Global Interoperability Program (NOAA climate and weather)
• Modeling Analysis and Prediction Program (NASA)
• Battlespace Environments Institute (DoD regional codes)
• Common Metadata for Climate Modelling Digital Repositories (METAFOR, IS-ENES)
ESMF Update
Cecelia DeLuca, Kathy Saint
Release Plan (2002-2010)
• ESMF v1: Prototype
• ESMF v2: Components, VM and Utils (ESMF_GridCompRun())
• ESMF v3: Index Space Operations (ESMF_ArraySparseMatMul())
• ESMF v4: Grid Operations (ESMF_GridCreate(), ESMF_FieldRegrid())
• ESMF v5: Standardization (build, init, data types, error handling, ...)
• Beta release May 2009: ESMF v4.0.0
Current Release: Beta Release 4.0.0, May 2009
• Basis for next public release
• On-line and off-line (file to file) parallel generation of interpolation weights, bilinear and higher order methods
• Two higher-level data representations: connected logically rectangular blocks or fully unstructured mesh
• Attributes for Metadata handling, read and write in XML or plain text formats
• Optimization of sparse matrix multiply for large processor counts (10,000+)
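The sparse matrix multiply at the heart of regridding can be sketched in a few lines. This is a minimal illustration, not ESMF code: the (destination index, source index, weight) triplet layout mirrors the SCRIP weight-file convention, and the field values are made up.

```python
# Sketch: applying regridding weights as a sparse matrix-vector multiply.
# The triplet layout (dst index, src index, weight) follows the SCRIP
# weight-file convention; field values are illustrative.

def apply_weights(weights, src_field, n_dst):
    """Apply sparse interpolation weights: dst[i] = sum_j w[i,j] * src[j]."""
    dst_field = [0.0] * n_dst
    for dst_i, src_j, w in weights:
        dst_field[dst_i] += w * src_field[src_j]
    return dst_field

# Bilinear-style example: destination point 0 blends two source points.
weights = [(0, 0, 0.75), (0, 1, 0.25), (1, 1, 1.0)]
src = [10.0, 20.0]
print(apply_weights(weights, src, 2))  # [12.5, 20.0]
```

In ESMF the same operation is distributed across processors, which is why optimizing it matters at 10,000+ cores: each processor owns only a slice of the weight triplets and the source/destination fields.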
The Flow-Following Finite Volume Icosahedral Model (FIM) from NOAA GSD is converting to ESMF to couple to National Weather Service models
ESMF parallel regrid:
• Grid-Grid, Grid-Mesh, Mesh-Mesh
• Bilinear and higher order
Performance Portability
• 30+ platform/compiler combinations regression tested nightly (3000+ system and unit tests)
• Newer ports include Solaris and native Windows
Performance at the petascale: the chart at right shows scaling of the ESMF sparse matrix multiply, used in regridding transformations, out to 16K processors (ESMF v3.1.0rp2, MCT 2.5.1).
Plot from Peggy Li, NASA/JPL. Tested on an ORNL XT4 in a variety of configurations; -N1 means 1 core per node.
[Figure: "ASMM Run-Time Comparison on XT4": run time (msec) versus number of processors for 3D Array, Bundle, and MCT implementations]
Loose coupling and integration with data services
• Prototype completed that translates ESMF initialize/run/finalize component interfaces into web services
• Supports loose coupling with hydrological, human impacts, and other services, and integration of components into service oriented architectures
• Will enable invocation of ESMF components and applications from a portal in a Grid-based (e.g. TeraGrid) gateway
• Data and metadata generated by ESMF Attributes during runs can be stored back to portals for search, browse, comparison, and decision support
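The core of the web-services approach is routing service requests to a component's initialize/run/finalize phases. A minimal sketch of that dispatch pattern, assuming a made-up NullComponent and operation names (this is not the actual ESMF web-services API):

```python
# Sketch of the loose-coupling idea: a dispatcher that exposes a component's
# initialize/run/finalize phases by operation name, the way a SOAP endpoint
# would. NullComponent and the operation names are illustrative assumptions.

class NullComponent:
    def __init__(self):
        self.log = []
    def initialize(self):
        self.log.append("initialize")
    def run(self):
        self.log.append("run")
    def finalize(self):
        self.log.append("finalize")

def dispatch(component, operation):
    """Route an incoming service request to the matching component phase."""
    handlers = {
        "Initialize": component.initialize,
        "Run": component.run,
        "Finalize": component.finalize,
    }
    if operation not in handlers:
        raise ValueError("unknown operation: " + operation)
    handlers[operation]()

comp = NullComponent()
for op in ("Initialize", "Run", "Finalize"):
    dispatch(comp, op)
print(comp.log)  # ['initialize', 'run', 'finalize']
```

Because ESMF components already expose these standard phases, wrapping them behind a service boundary requires no change to the component itself, which is the "2 for 1" point made below.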
Portal developed jointly with the Earth System Grid displays detailed scientific metadata … work funded by NSF.
Web services: design goals
• ESMF components can be turned into web services with minimal impact on component developers (Fortran and C++)
• Use standard web technologies: SOAP/XML, NetCDF, OpenDAP/THREDDS
• Simple deployment and user interface
• 2-for-1 interface: with ESMF you get both a tight coupling interface and a service interface
Service Architecture
[Diagram: a client application (SOAP client with WSDL) connects over the Internet to a Tomcat/Axis2 SOAP service; a component connector links the service through libnetesmf to the application's Grid/Coupler components, with a registrar and component list for discovery and OpenDAP/NetCDF for data access]
Current Status
• Initial prototype released
– Grid Component wrapper with SOAP interface
– Initialize, Run, Finalize component operations
• Current version
– Support for Coupler components
– Single SOAP service to support multiple components
– Data movement supported using OpenDAP access to NetCDF files
• Next steps
– New project to integrate with hydrology data and applications, including the Hydrologic Information System (HIS) Desktop
ESMF in CCSM4
Fei Liu, Bob Oehmke, Ufuk Turuncoglu
ESMF in CCSM4
• Motivation: create an ESMF-based CCSM4 platform to explore:
– Self-describing models
– Interoperability with other modeling systems
– Components as web services
– Automated model workflows
– Online regridding
– Improved scalability
• General approach:
– Introduce a wrapper layer between the CCSM4 driver and each ESMF component
– Convert MCT-based model components to ESMF-based components
– Verify against baseline global integrals to ensure runs are bitwise reproducing
– Completed for all CCSM4 dead, data, and active components
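The verification step compares global integrals of the converted component's fields against the MCT baseline, demanding exact equality rather than a tolerance. A minimal sketch, with made-up field values and an area-weighted sum standing in for the real global integrals:

```python
# Sketch of bitwise verification: global integrals of a converted component's
# fields must match the MCT baseline exactly, not merely to a tolerance.
# Field values and cell areas here are illustrative.

def global_integral(field, cell_areas):
    """Area-weighted global sum of a field."""
    return sum(v * a for v, a in zip(field, cell_areas))

def bitwise_match(baseline, candidate):
    """Bitwise reproduction: the floating-point values must be identical."""
    return baseline == candidate

areas = [1.0, 2.0, 1.5]
mct_field = [273.15, 280.0, 275.5]
esmf_field = [273.15, 280.0, 275.5]  # identical after conversion

print(bitwise_match(global_integral(mct_field, areas),
                    global_integral(esmf_field, areas)))  # True
```

Exact equality is a deliberately strict check: any reordering of floating-point operations introduced by the wrapper layer would show up immediately as a mismatch.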
Plan for CCSM4
• The MCT-based CCSM driver and shared code are retained; an "ESMF shr code" layer wraps the ESMF library
• Conversions into each ESMF-compliant model component: MCT → ESMF_Array, Domain → ESMF_ArrayBundle, Infodata → ESMF_State Attributes
• Conversions back to the driver: ESMF_Array → MCT, ESMF_ArrayBundle → Domain, ESMF_State Attributes → Infodata
• Model physics is kept independent of the data structures
Regridding
Bob Oehmke
Higher-Order Interpolation
• CCSM now uses ESMF higher-order interpolation as its standard non-conservative remapping
• Replaces bilinear interpolation of atmosphere to ocean states
• Results in a large reduction in noise in ocean surface transports (33% in a measured case)
• Higher-order interpolation:
– A patch is a 2nd-order n-D polynomial representing source data
– Patches are generated for each corner of a source cell
– Each patch is created by a least-squares fit through source data in the cells surrounding the corner
– The destination value is a weighted average of patch values
– The full algorithm was presented here in 2007 by David Neckels
– A longer description is in the ESMF v4.0.0 Reference Manual
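The patch idea can be illustrated in one dimension: fit a 2nd-order polynomial to source data around each corner by least squares, then blend the patch values at the destination point. This is a simplified sketch under assumed data; the real ESMF algorithm works on n-D grids with geometrically derived blend weights.

```python
# 1-D sketch of patch recovery: least-squares quadratic patches at two
# corners, blended at the destination point. Data and the 50/50 blend
# weights are illustrative assumptions.

def fit_quadratic(xs, ys):
    """Least-squares fit of y = c0 + c1*x + c2*x^2 via normal equations."""
    A = [[sum(x ** (i + j) for x in xs) for j in range(3)] for i in range(3)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the 3x3 system.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, 3):
            f = A[row][col] / A[col][col]
            for k in range(col, 3):
                A[row][k] -= f * A[col][k]
            b[row] -= f * b[col]
    c = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        c[row] = (b[row] - sum(A[row][k] * c[k] for k in range(row + 1, 3))) / A[row][row]
    return c

def eval_patch(c, x):
    return c[0] + c[1] * x + c[2] * x * x

# Source data sampled from y = x^2, so each quadratic patch recovers it.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x * x for x in xs]
left = fit_quadratic(xs[:4], ys[:4])   # patch at the left corner
right = fit_quadratic(xs[1:], ys[1:])  # patch at the right corner
# Destination midway between corners: equal-weight blend of patch values.
value = 0.5 * eval_patch(left, 2.5) + 0.5 * eval_patch(right, 2.5)
print(round(value, 6))  # 6.25
```

Because each patch uses data from several surrounding cells, the blended result varies smoothly across cell boundaries, which is what suppresses the derivative noise that plain bilinear interpolation produces.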
Weight Generation
• Higher-order weights are generated by an ESMF off-line application
– Takes netCDF grid files and generates a netCDF weight file
– Format is the same as SCRIP, so it can be used as an alternative
– Runs in parallel
• Multiple improvements over last year
– Improved accuracy
– More robust
– Pole options are now: no pole, full-circle average, and n-point stencil pole
• Future work
– More pole options (improved vector quantity handling)
– Weight generation for conservative regridding
Noise reduction in CCSM transport
Interpolation noise in a derivative of the zonal wind stress
[Figure: interpolation noise plotted against grid index in the latitudinal direction]
• ESMF higher order interpolation weights were used to map from a 2-degree Community Atmospheric Model (CAM) grid to an irregularly spaced POP ocean grid (384x320)
• dTAUx/dy was computed using interpolated fields – this is closely related to the curl of the wind stress, which drives the upper ocean circulation
• Noise is calculated as the deviation of a point from the average of itself and its four neighbors
• 33% reduction in noise globally compared to original bilinear interpolation
[Legend: black = bilinear; red = higher-order, ESMF v3.1.1; green = higher-order, ESMF v4.0.0]
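The noise diagnostic is simple to sketch: for each interior grid point, measure how far it deviates from the average over the 5-point stencil of itself and its four neighbors. The exact normalization used in the CCSM analysis is an assumption here; the field values are made up.

```python
# Sketch of the noise diagnostic: deviation of each interior point from the
# average of itself and its four neighbors, summed over the field. The
# normalization is an assumption; the test fields are illustrative.

def noise(field):
    """Sum of |deviation from the 5-point local average| over interior points."""
    ny, nx = len(field), len(field[0])
    total = 0.0
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            local = (field[j][i] + field[j - 1][i] + field[j + 1][i]
                     + field[j][i - 1] + field[j][i + 1]) / 5.0
            total += abs(field[j][i] - local)
    return total

smooth = [[float(i + j) for i in range(4)] for j in range(4)]
noisy = [row[:] for row in smooth]
noisy[1][1] += 1.0  # inject a single-point spike

print(noise(smooth) < noise(noisy))  # True
```

A linear field scores zero by this measure (each point equals its local average), so the metric isolates grid-scale jitter of the kind bilinear remapping introduces into derivative quantities.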
Attributes in CCSM4
Ufuk Turuncoglu
CCSM4 and ESMF Attributes
• ESMF Attributes hold metadata about model components, fields, etc.
• Using ESMF Attributes, metadata about CCSM4 driver fields was exported in XML format
• Metadata includes:
– name
– long name (or description)
– standard name
– units
– whether it is an import or export field
• All the field metadata uses the CF conventions and Metafor Common Information Model (CIM)
• The standard names of CCSM driver fields were collected and corrected by working with the Curator project and component liaisons.
CCSM and ESMF Attributes
• Example: contents of the output XML file for the CCSM atmosphere component:
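The XML contents of that slide did not survive transcription. As a stand-in, here is a hypothetical sketch of exporting the field metadata listed above to XML; the element names, the sample field, and the flat schema are assumptions, not the actual CIM/CF-based output of CCSM4.

```python
# Hypothetical sketch of field-metadata XML of the kind exported via ESMF
# Attributes. Element names, the sample field, and the flat layout are
# illustrative; the real CCSM4/CIM schema differs in detail.
import xml.etree.ElementTree as ET

def field_to_xml(meta):
    field = ET.Element("field")
    for key in ("name", "long_name", "standard_name", "units", "direction"):
        ET.SubElement(field, key).text = meta[key]
    return ET.tostring(field, encoding="unicode")

sample = {
    "name": "Sa_tbot",                    # hypothetical driver field name
    "long_name": "bottom-level air temperature",
    "standard_name": "air_temperature",   # CF standard name
    "units": "K",
    "direction": "export",                # import or export field
}
print(field_to_xml(sample))
```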
Towards CCSM Self-Describing Workflows
Ufuk Utku Turuncoglu [1], Sylvia Murphy [2], Cecelia DeLuca [2]
[1] Istanbul Technical University
[2] National Center for Atmospheric Research
* please come and see our poster for more information
Outline
• Motivation
• Basic Concepts
• Implementation of CCSM Workflow
• Preliminary Results
• Future Plans
Motivation
• Create easy to use work environments for running complex models like CCSM4
• Automatically document parameter changes and other aspects of model runs
• Hide interactions with complicated computing technologies
• Archive, reproduce and share experiments
Workflows
• A modeling workflow separates modeling tasks into smaller chunks, and combines them using dataflow pipelines.
• KEPLER was chosen because it is open source and supports different models of computation.
• A KEPLER director controls the execution of a workflow and sends execution instructions to the actors that are specialized to do sub-tasks. In other words, actors specify “what processing occurs” while the director specifies “when it occurs”.
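The director/actor split can be sketched in a few lines. Kepler itself is Java-based and far richer; this Python analogue, with made-up actor names, only illustrates the separation between "what processing occurs" (actors) and "when it occurs" (the director).

```python
# Minimal sketch of the director/actor pattern: actors define the processing,
# a sequential director decides the firing order. Actor names and the token
# passed between them are illustrative assumptions.

class Actor:
    def __init__(self, name, work):
        self.name = name
        self.work = work  # "what processing occurs"
    def fire(self, token):
        return self.work(token)

class SequentialDirector:
    """Decides 'when it occurs': fires actors one after another in order."""
    def __init__(self, actors):
        self.actors = actors
    def execute(self, token):
        trace = []
        for actor in self.actors:
            token = actor.fire(token)
            trace.append(actor.name)
        return token, trace

pipeline = SequentialDirector([
    Actor("stage_inputs", lambda t: t + ["inputs staged"]),
    Actor("run_model", lambda t: t + ["model run"]),
    Actor("postprocess", lambda t: t + ["output processed"]),
])
result, trace = pipeline.execute([])
print(trace)  # ['stage_inputs', 'run_model', 'postprocess']
```

Swapping the director (e.g. for a dataflow-driven one) changes the execution semantics without touching any actor, which is why Kepler's support for different models of computation mattered for the CCSM workflow.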
Provenance
• Provenance is defined as structured information that keeps track of the origin and derivation of the workflow.
• The basic types of provenance information:
– System (system environment, OS, CPU architecture, compiler versions, etc.)
– Data (history or lineage of data, data flows, inputs and outputs, data transformations)
– Process (statistics about the workflow run, transferred data size, elapsed time, etc.)
– Workflow (version, modifications, etc.)
• Seek to collect provenance in a format that is easily displayed in a portal and linked to data outputs
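Collecting the system and workflow categories above into one structured record might look like the following sketch; the record layout is an assumption, not the ORNL/NCSU schema, and only standard-library introspection is used.

```python
# Sketch of gathering system/process/workflow provenance into one record.
# The record layout is an illustrative assumption, not the ORNL/NCSU schema.
import platform
import time

def collect_provenance(workflow_version):
    return {
        "system": {
            "os": platform.system(),           # system environment
            "machine": platform.machine(),     # CPU architecture
            "python": platform.python_version(),
        },
        "process": {
            "collected_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        },
        "workflow": {
            "version": workflow_version,       # version, modifications, etc.
        },
    }

record = collect_provenance("0.1")
print(sorted(record))  # ['process', 'system', 'workflow']
```

A nested record like this serializes naturally to XML, which matches the goal of displaying provenance in a portal and linking it to data outputs.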
Components of Workflow Environment
Typical workflow
Hierarchical Collection of Provenance Information
The multi-component structure of CCSM makes it complicated to collect provenance information.
Conceptual CCSM Workflow
• Split into two parts: run preparation and postprocessing
• Postprocessing is automatically triggered when the run workflow completes
Kepler Graphical Interface
[Screenshot: the main workflow, with the CCSM workflow as a composite actor]
Development Tasks
• CCSM modifications for ESMF Attributes (explained earlier)
• A Perl script was written to set up the third-party environment on remote machines
• ORNL/NCSU scripts that gather system provenance information were modified to work with CCSM:
– Added support for multiple source directories, multi-component models, and XML output
– Collect extra system provenance information (compiler, etc.)
• Created new Kepler actors:
– Grid; CCSM build, modify, and run; post-processing
– Listener to trigger other workflows automatically
– Provenance recorder with a modified XML option
CCSM Workflow Actors
Created KEPLER actors:
• Grid actors: job submission to the TeraGrid
• CCSM-specific actors: create, modify (env_*.xml)
• Post-processing actors
• Utility actors
Results & Future Plans
• A fully active CCSM4 workflow with the BTR1 component set and 1.9x2.5_gx1v6 resolution was run on the TeraGrid (IBM BG/L Frost) for five days
• Provenance information was automatically collected
• Near-future plans:
– A longer simulation will be run at two different resolutions on two different computers (Frost and Kraken) simultaneously
– Export collected provenance information into ESG
Questions?
Contact: [email protected]
Project Repository: http://esmfcontrib.cvs.sourceforge.net/viewvc/esmfcontrib/workflow/
Demo Movie: http://www.be.itu.edu.tr/~turuncu/workflow.htm
CCSM Metadata for IPCC Assessment Report 5
Sylvia Murphy/NCAR
Curator and CMIP5
Curator is coordinating metadata generation, standardization, and implementation in science gateways
• We are partners with Metafor, who are responsible for the collection of metadata for AR5
• We are collaborating closely with the Earth System Grid (ESG) to implement metadata technologies to search, explore, and compare model information

Curator Partners
• National Center for Atmospheric Research
• Geophysical Fluid Dynamics Laboratory
• Georgia Institute of Technology
• EU Partners (BADC, BODC, etc.)
ESG data collections
CCSM AR4 Metadata
End-to-end modeling
• Linking datasets and simulations within a science gateway is just one part of an end-to-end modeling system that includes:
– Self-describing models
– A broader range of coupling interactions
– Web services
– Workflows
– Community-developed ontologies
– Science gateways
• Future work within Curator will combine these pieces into a variety of modeling systems