Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI...
![Page 1: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/1.jpg)
Metadata Standards for Gridded Climate Data in the Earth System Grid
Robert Drach LLNL/PCMDI
UCRL-PRES-149779
![Page 2: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/2.jpg)
Drach 2 Sept. 10, 2002
Overview
I. Earth System Grid: Grid Access to Climate Research Data
II. Metadata Standards for Gridded Climate Data
![Page 3: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/3.jpg)
Part I
ESG: Grid Access to Climate Research Data
![Page 4: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/4.jpg)
Earth System Grid Overview
The goal of ESG is to make climate data, particularly climate model data, an easily accessible community resource. The project is funded by the SciDAC program: Scientific Discovery through Advanced Computing.
Enabling researchers to understand and make effective use of very large, distributed climate datasets is critical. The broad strategy is to develop a collection of server-side capabilities in order to minimize the amount of data movement.
Multiple interfaces to ESG will allow researchers to focus on science rather than issues of data transfer, format, and dataset manipulation.
The foundation is Globus Grid technology.
![Page 5: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/5.jpg)
ESG uses Globus Grid technology.
Globus middleware supports linkage of distributed data archives, supercomputers, workstations, and local disk caches into data/computational grids.
GridFTP: high-performance, secure, robust data transfer mechanism (protocol, server, client library). ESG is integrating OpenDAP (DODS protocol) with the GridFTP protocol.
Single sign-on using Grid Security Infrastructure: proxy certificates.
Community Authorization Service (CAS).
Replica Location Service: manages copying and placement of files in a distributed environment; logical vs. physical files.
http://www.globus.org
![Page 6: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/6.jpg)
ESG: U.S. Collaborations & Development
ORNL: Climate storage & computational resources
ANL: Computational grids & grid-based applications
USC/ISI: Computational grids & grid-based applications
NCAR: Climate change prediction and scenarios
LBNL: Climate storage facility and access
LLNL: Model diagnostics & intercomparison
![Page 7: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/7.jpg)
Program for Climate Model Diagnosis and Intercomparison
Validation and intercomparison of atmospheric general circulation models, coupled ocean-atmosphere models
Development of analysis software, quality control, archiving, distribution of model results. Climate Data Analysis Tools (CDAT) is a Python-based analysis and visualization system.
Global warming detection studies
CMIP (coupled models) and AMIP (atmospheric GCMs) gather model simulation results from thirty modeling groups worldwide.
![Page 8: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/8.jpg)
PCMDI and Model Development
[Figure: flow diagram with the labels Modeling groups; Controlled simulation runs; Simulation data; PCMDI (diagnosis, quality control, data archival); Feedback to modelers; Observations; Data assimilation; Gridded observation data.]
![Page 9: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/9.jpg)
ESG-II Architecture
[Figure: architecture diagram with the labels Portals, Servers, and Middleware.]
![Page 10: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/10.jpg)
ESG: Metadata Services
[Figure: ESG metadata services diagram.]
Section labels: ESG Clients API & User Interfaces; High-Level Metadata Services; Core Metadata Services; Metadata Holdings.
Component labels: Publishing; Search & Discovery; Administration; Browsing & Display; Analysis & Visualization; Metadata & Data Registration; Metadata Discovery; Metadata Aggregation; Metadata Extraction; Metadata Display; Metadata Browsing; Metadata Query; Metadata Annotation; Metadata Validation; Metadata Access (update, insert, delete, query); Service Translation Library; Data & Metadata Catalog; Dublin Core Database; CF Database; mirror; Dublin Core XML Files; Comments XML Files.
![Page 11: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/11.jpg)
ESG is leveraging existing software and projects.
OpenDAP (DODS): Distributed Oceanographic Data System (Unidata). Integration of Globus GridFTP with DODS data access.
THREDDS: THematic Real-time Environmental Distributed Data Services (Unidata)
LAS: Live Access Server (NOAA Pacific Marine Environmental Laboratory). Works with CDAT, Ferret, GrADS, …
CDAT: Climate Data Analysis Tools (PCMDI); includes CDMS (Climate Data Management System) and VCDAT visualization
Community Data Portal project (NCAR)
NCL (NCAR)
Globus Grid technology (ANL, ISI): GridFTP, CAS (Community Authorization Service)
![Page 12: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/12.jpg)
CDAT: Example of an ESG GUI Client Access
![Page 13: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/13.jpg)
LAS/CDAT: Example of a Web-based Data Portal
Technology: Web-based (end user requirements); LAS, DODS, ESG (i.e., Globus), CDAT.
Portal should hide/simplify the Grid for users: single sign-on; community-based authorization; simplified resource location; remote job submission and management.
Accesses the ESG Grid Testbed.
![Page 14: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/14.jpg)
Part II
Metadata Standards for Gridded Climate Data
![Page 15: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/15.jpg)
Climate Model Datasets
Most climate simulation data are in the form of gridded datasets: collections of variables as a function of longitude, latitude, time, and vertical level.
A dataset is a logical container: a file; an aggregation of files; a collection of database tables.
Model-generated data: model data; derived data (zonal averages, global averages, virtual variables).
Observational data, including reanalyses.
Attributes in the form of (name, value) pairs, array values.
![Page 16: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/16.jpg)
Binary formats
Suitable basis for storing data, but lack the metadata to support certain application requirements.
netCDF (UCAR): array data model; flexible attribute/value metadata model; simple API.
HDF (NCSA, NASA): collection of APIs; can be tailored to specific data models including scientific data sets, satellite data, point data.
![Page 17: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/17.jpg)
Binary formats
GRIB (WMO, ECMWF, NCEP): mixed sequential/array data model; tailored for simulation output, supports common horizontal grid types; hardwired metadata model; good compression capabilities; lacks a standard API.
![Page 18: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/18.jpg)
Coordinate Systems
Self-describing binary formats are flexible, but underconstrain representation of coordinate systems.
[Figure: diagram with the labels Index Space, Coordinate Space, and Variable Space.]
Coordinate system: Time(i), Latitude(j,k), Longitude(j,k)
V = Temperature(Time, Latitude, Longitude)
V' = Temperature(i,j,k)
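The relationship between index space and coordinate space on this slide can be sketched in a few lines of plain Python. This is an illustration only: the 2 x 3 grid, all numeric values, and the helper names `coordinates`, `nearest_cell`, and `temperature_at` are assumptions, not part of the presentation or of any ESG software.

```python
# Hypothetical 2 x 3 curvilinear grid with two time steps; every value is made up.
time = [0.0, 1.0]                        # Time(i)
latitude = [[10.0, 10.5, 11.0],          # Latitude(j,k)
            [20.0, 20.5, 21.0]]
longitude = [[100.0, 110.0, 120.0],      # Longitude(j,k)
             [101.0, 111.0, 121.0]]

# V' = Temperature(i,j,k): values addressed purely by index.
temperature = [[[280 + 10 * i + 3 * j + k for k in range(3)]
                for j in range(2)]
               for i in range(2)]


def coordinates(i, j, k):
    """Map a point in index space to (time, latitude, longitude)."""
    return time[i], latitude[j][k], longitude[j][k]


def nearest_cell(lat, lon):
    """Inverse map: the (j, k) cell whose centre is closest to (lat, lon).

    Uses a naive squared distance in degrees and ignores longitude wraparound.
    """
    cells = ((j, k) for j in range(len(latitude)) for k in range(len(latitude[j])))
    return min(
        cells,
        key=lambda jk: (latitude[jk[0]][jk[1]] - lat) ** 2
        + (longitude[jk[0]][jk[1]] - lon) ** 2,
    )


def temperature_at(i, lat, lon):
    """V = Temperature(Time, Latitude, Longitude), evaluated via index space."""
    j, k = nearest_cell(lat, lon)
    return temperature[i][j][k]
```

Because latitude and longitude are themselves two-dimensional here, an application cannot infer the coordinate system from array shapes alone; it needs metadata that says which arrays play which role.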
![Page 19: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/19.jpg)
Horizontal Grids
Curvilinear grid: Los Alamos POP ocean model
Temperature(i,j)
Latitude(i,j)
Longitude(i,j)
Lat_bounds(i,j,4)
Lon_bounds(i,j,4)
![Page 20: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/20.jpg)
Horizontal Grids
Reduced grid
Temperature(i,j)
Latitude(i)
Longitude(i,j)
Lat_bounds(i,2)
Lon_bounds(i,j,4)
![Page 21: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/21.jpg)
Horizontal Grids
General grid: Colorado State geodesic grid
Temperature(npts)
Latitude(npts)
Longitude(npts)
Lat_bounds(npts,6)
Lon_bounds(npts,6)
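The three grid slides differ only in how the coordinate and bounds arrays are dimensioned. The sketch below records those dimension signatures as listed on the slides and computes the array shapes they imply. The dictionary layout, the function name `expected_shapes`, and the example sizes are assumptions for illustration.

```python
# Dimension signatures as listed on the three "Horizontal Grids" slides.
# Strings name a dimension; integers are fixed sizes (vertices per cell).
GRID_LAYOUTS = {
    "curvilinear": {
        "Temperature": ("i", "j"),
        "Latitude": ("i", "j"),
        "Longitude": ("i", "j"),
        "Lat_bounds": ("i", "j", 4),
        "Lon_bounds": ("i", "j", 4),
    },
    "reduced": {
        "Temperature": ("i", "j"),
        "Latitude": ("i",),
        "Longitude": ("i", "j"),
        "Lat_bounds": ("i", 2),
        "Lon_bounds": ("i", "j", 4),
    },
    "general": {
        "Temperature": ("npts",),
        "Latitude": ("npts",),
        "Longitude": ("npts",),
        "Lat_bounds": ("npts", 6),
        "Lon_bounds": ("npts", 6),
    },
}


def expected_shapes(grid_type, sizes):
    """Return {array name: shape} for a grid type, given dimension sizes."""
    layout = GRID_LAYOUTS[grid_type]
    return {
        name: tuple(sizes[d] if isinstance(d, str) else d for d in dims)
        for name, dims in layout.items()
    }


# Hypothetical sizes, chosen only to show the resulting shapes.
curvilinear = expected_shapes("curvilinear", {"i": 3, "j": 5})
general = expected_shapes("general", {"npts": 10})
```

A reader of the data can use such signatures to check that the coordinate arrays found in a file are consistent with the grid type the metadata declares.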
![Page 22: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/22.jpg)
Spatial/temporal location
Applications must be able to recognize the spatial/temporal coordinate axes.
Visualization: continental overlays
Data: selection by axis type
import cdms
file = cdms.open('sample.nc')
temperature = file['temperature']
# Select by coordinate value: the latitude band from 45S to 45N
data = temperature(latitude=(-45.0, 45.0))
![Page 23: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/23.jpg)
Time representation and calendars
Climate simulations use different types of calendars: 'proleptic' Gregorian; Julian; mixed Gregorian/Julian; no leap years (noleap); 30-day months.
Climatologies represent multi-year averages.
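To see why the calendar must be recorded in the metadata, the following sketch counts days under each of the calendars listed on this slide. The calendar labels (`proleptic_gregorian`, `julian`, `mixed`, `noleap`, `360_day`) and the function names are choices made for this example, and the mixed calendar is assumed to switch from Julian to Gregorian rules in October 1582.

```python
def is_leap(year, calendar):
    """Return True if `year` contains a leap day in the given calendar."""
    if calendar == "proleptic_gregorian":
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
    if calendar == "julian":
        return year % 4 == 0
    if calendar == "mixed":
        # Assumption: Julian rules through 1582, Gregorian rules afterwards.
        rule = "julian" if year <= 1582 else "proleptic_gregorian"
        return is_leap(year, rule)
    if calendar in ("noleap", "360_day"):
        return False
    raise ValueError("unknown calendar: %r" % (calendar,))


def days_in_year(year, calendar):
    """Number of days in `year` under the given calendar."""
    if calendar == "360_day":
        return 360  # twelve 30-day months
    if calendar == "mixed" and year == 1582:
        return 355  # ten days are skipped at the October 1582 switch
    return 366 if is_leap(year, calendar) else 365


def days_spanned(first_year, last_year, calendar):
    """Total days from the start of first_year to the end of last_year."""
    return sum(days_in_year(y, calendar) for y in range(first_year, last_year + 1))
```

For the century 1900 to 1999 these rules give 36524 days (proleptic Gregorian), 36525 (Julian), 36500 (noleap), and 36000 (30-day months), so a time value such as "days since 1900-01-01" is ambiguous unless the calendar is stated.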
![Page 24: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/24.jpg)
Metadata conventions
Several conventions have been developed to augment the netCDF data model.
They represent a balance between the needs of data producers and data consumers.
COARDS convention: 1D coordinate axes, rectilinear horizontal grids; axis identification based on units; variables limited to four dimensions; ordering of dimensions fixed.
http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html
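Axis identification based on units, as COARDS does it, can be sketched as a lookup on the `units` attribute of a coordinate variable. The unit spellings below are a partial, illustrative set rather than the full list the convention accepts, and the function name `axis_type` is an assumption.

```python
import re

# Partial, illustrative sets of unit strings; not the complete COARDS lists.
LATITUDE_UNITS = {"degrees_north", "degree_north", "degrees_N", "degree_N"}
LONGITUDE_UNITS = {"degrees_east", "degree_east", "degrees_E", "degree_E"}
PRESSURE_UNITS = {"Pa", "hPa", "mb", "millibar", "bar"}

# Time axes use the form "<unit> since <reference date>".
TIME_UNITS = re.compile(
    r"^(seconds?|minutes?|hours?|days?|months?|years?)\s+since\s+\S", re.IGNORECASE
)


def axis_type(units):
    """Guess the axis type of a coordinate variable from its units string.

    Returns "latitude", "longitude", "time", "vertical", or None if the
    units do not identify the axis.
    """
    units = units.strip()
    if units in LATITUDE_UNITS:
        return "latitude"
    if units in LONGITUDE_UNITS:
        return "longitude"
    if TIME_UNITS.match(units):
        return "time"
    if units in PRESSURE_UNITS:
        return "vertical"
    return None
```

The weakness of this approach is visible in the last branch: a vertical axis in units that are not pressure, or any axis with unexpected units, cannot be identified from the units alone.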
![Page 25: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/25.jpg)
Metadata conventions
CF (Climate and Forecast) convention: based on earlier conventions, COARDS and GDT.
Multidimensional coordinates (auxiliary coordinate variables); simplified axis identification.
Specific representation for several horizontal grid types: rectilinear, curvilinear, reduced grids.
Variables can have an arbitrary number of dimensions; no constraint on ordering of dimensions.
Non-Gregorian calendars; standard name table.
http://www.cgd.ucar.edu/cms/eaton/cf-metadata/
![Page 26: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/26.jpg)
Comparability of quantities
Ability to recognize comparable quantities is fundamental to model intercomparison.
CF defines a schema for standard name tables.
An XML representation is used for the table of standard variable names and descriptions.
The standard_name attribute is optional. There is no restriction on variable names.
Relationship to ontology development?
<standard_name_table>
  <institution>Program for Climate Model Diagnosis and Intercomparison</institution>
  <contact>[email protected]</contact>
  <entry id="surface_air_pressure">
    <canonical_units>Pa</canonical_units>
    <description>Pressure defined at the level of the mean topography within the grid box.</description>
  </entry>
  <alias id="mean_sea_level_pressure">
    <entry_id>air_pressure_at_sea_level</entry_id>
  </alias>
</standard_name_table>
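One way an application might use such a table is to resolve aliases and then test whether two variables carry the same standard name. The sketch below parses a cut-down table with Python's `xml.etree.ElementTree`. The `air_pressure_at_sea_level` entry is added here as an assumption so that the alias from the slide has a target; the function names are also illustrative.

```python
import xml.etree.ElementTree as ET

# Cut-down table modelled on the slide. The air_pressure_at_sea_level entry
# is an assumption added so the alias has something to point at.
TABLE_XML = """
<standard_name_table>
  <entry id="surface_air_pressure">
    <canonical_units>Pa</canonical_units>
  </entry>
  <entry id="air_pressure_at_sea_level">
    <canonical_units>Pa</canonical_units>
  </entry>
  <alias id="mean_sea_level_pressure">
    <entry_id>air_pressure_at_sea_level</entry_id>
  </alias>
</standard_name_table>
"""


def load_table(xml_text):
    """Return (units by standard name, alias -> standard name)."""
    root = ET.fromstring(xml_text)
    units = {e.get("id"): e.findtext("canonical_units") for e in root.findall("entry")}
    aliases = {a.get("id"): a.findtext("entry_id") for a in root.findall("alias")}
    return units, aliases


def canonical_name(name, units, aliases):
    """Resolve an alias; return None if the name is not in the table."""
    name = aliases.get(name, name)
    return name if name in units else None


def comparable(name_a, name_b, units, aliases):
    """True if both names resolve to the same standard name."""
    a = canonical_name(name_a, units, aliases)
    b = canonical_name(name_b, units, aliases)
    return a is not None and a == b


UNITS, ALIASES = load_table(TABLE_XML)
```

With this table, `mean_sea_level_pressure` and `air_pressure_at_sea_level` compare as the same quantity, while `surface_air_pressure` does not, regardless of what the variables are called in each model's output files.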
![Page 27: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/27.jpg)
ESG metadata
ESG has adopted the netCDF data model and the CF convention as standards. Other standards and conventions will follow.
NcML markup language.
[Figure: CF / NetCDF object model. Classes: Collection, NetCDFAggregate, NetCDFFile, Dimension, Attribute, Variable, CoordinateVariable, BoundaryVariable, GeneralCoordinateVariable, GriddedVariable, AuxiliaryCoordinateVariable. Legend: Inheritance; Relationship (1:n); Relationship (m:n); CF.]
![Page 28: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/28.jpg)
Aggregation
CF and NcML apply to data aggregates as well as files.
Data aggregation: collections of files/datasets are treated as single entities; array model; netCDF-like; tailored for extraction of 'hyperslabs' of data.
Aspects of aggregation: combining/merging variables; joining variables; creating new coordinate axes; overlaying/adding metadata; nesting datasets.
![Page 29: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/29.jpg)
Aggregation
Aggregation maps well to multifile datasets: multifile datasets can be thought of as 'partitioned' into files. Variables may 'span' multiple files. Usually a dataset is partitioned on time and/or vertical level axes.
PCMDI CDAT supports aggregations via the cdscan utility, which uses an XML representation.
THREDDS/DODS aggregation server (http://www.unidata.ucar.edu/projects/THREDDS/)
[Figure: diagram with axes labeled Time, Level, and Variable.]
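The idea of a variable that spans several files partitioned on time can be sketched as an index translation: a request against the aggregate time axis is split into per-file slices. This is not how cdscan or the THREDDS/DODS aggregation server is implemented; the class name, the method name, and the file names are hypothetical.

```python
from bisect import bisect_right


class TimeAggregate:
    """A variable partitioned along the time axis across several files."""

    def __init__(self, parts):
        # parts: list of (file name, number of time steps), in time order.
        self.names = [name for name, _ in parts]
        self.starts = []  # global index of each file's first time step
        total = 0
        for _, count in parts:
            self.starts.append(total)
            total += count
        self.length = total

    def locate(self, start, stop):
        """Split the global range [start, stop) into per-file slices.

        Returns a list of (file name, local start, local stop) tuples.
        """
        start = max(start, 0)
        stop = min(stop, self.length)
        slices = []
        idx = bisect_right(self.starts, start) - 1
        while start < stop and idx < len(self.names):
            if idx + 1 < len(self.starts):
                file_end = self.starts[idx + 1]
            else:
                file_end = self.length
            hi = min(stop, file_end)
            if hi > start:
                offset = self.starts[idx]
                slices.append((self.names[idx], start - offset, hi - offset))
            start = hi
            idx += 1
        return slices


# Hypothetical monthly data: two full years and one half year.
aggregate = TimeAggregate([("tas_1979.nc", 12), ("tas_1980.nc", 12), ("tas_1981.nc", 6)])
pieces = aggregate.locate(10, 26)
```

Here `pieces` lists the last two months of the first file, all of the second, and the first two months of the third, which is the information a server needs to assemble one hyperslab from three files without the client seeing the partitioning.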
![Page 30: Metadata Standards for Gridded Climate Data in the Earth System Grid Robert Drach LLNL/PCMDI UCRL-PRES-149779.](https://reader035.fdocuments.us/reader035/viewer/2022062722/56649f345503460f94c519b9/html5/thumbnails/30.jpg)
Summary
The Earth System Grid project is developing metadata services to support a variety of schemas and conventions.
The initial focus of ESG is to enable climate researchers to make effective use of distributed, model-generated datasets.
The netCDF schema and CF convention are the foundation for representation of this data.