Unidata Support Tools and Plans Unidata Policy and Users Committee May 2005.
THREDDS, CDM, OPeNDAP, netCDF and Related Conventions John Caron Unidata/UCAR Sep 2007.
THREDDS, CDM, OPeNDAP, netCDF and Related Conventions
John Caron
Unidata/UCAR
Sep 2007
Contents
• Overview
• THREDDS Data Server
• Unidata’s Common Data Model
0) Client
1) Request
2) Server
3) Response
0) The Client
What functionality is needed?
1. Scientific User
– Raw data
– Drill down to arbitrary detail
2. Decision Support
– “best effort” Visualization
– operational
1) The Request
What functionality is possible?
• Analogous to SQL language for RDBMS
• Implies a Data Model
• OGC vs File access APIs
– NetCDF/OPeNDAP/HDF5 : index space
– WXS : coordinate space
• Higher semantic level trumps if no significant extra cost.
– File APIs become implementation, not interface
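The index-space versus coordinate-space distinction can be illustrated with a minimal sketch. The latitude axis, the data values, and the helper names `subset_by_index` and `subset_by_coords` are all hypothetical; this is not the API of any of the libraries named above.

```python
from bisect import bisect_left, bisect_right

# Hypothetical 1-D latitude axis (degrees north, ascending), one value per point.
lats = [30.0, 32.5, 35.0, 37.5, 40.0, 42.5]
temps = [291.2, 289.9, 288.1, 286.4, 284.0, 281.7]


def subset_by_index(values, start, stop):
    """Index-space request: the client must already know the array layout."""
    return values[start:stop + 1]  # stop is inclusive here


def subset_by_coords(coords, values, lo, hi):
    """Coordinate-space request: the server maps coordinates to indices."""
    start = bisect_left(coords, lo)   # first index with coord >= lo
    stop = bisect_right(coords, hi)   # one past the last index with coord <= hi
    return values[start:stop]


# Both requests select the same three points (35.0 through 40.0 degrees north).
print(subset_by_index(temps, 2, 4))
print(subset_by_coords(lats, temps, 35.0, 40.0))
```

In the second form the client never sees array indices, which is the sense in which a file API can become implementation rather than interface.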
1) WCS Request
• Functionality
– Subsetting (bounding box, time range, variable)
– Optional reprojection/resampling
• Variants: KML/XML/SOAP+XML/REST
• Optional Functionality : 42 flavors
• Bad news for interoperability
• Is there an elephant to dictate a standard?
– E.g. IBM chose SQL/Relational model (1984)
2) Server
How do I serve my data?
• Do I need specialized personnel?
– $$, resource consumption, core competency
• What are the common requests?
– (that I should optimize for)?
3) Response
What comes back?
• Has to be a representation of the “answer” in the Data Model
• WCS allows anything
– Can't write a generic client
• Communities will form around a small number of variants
– No elephants in sight
3) Response : XML vs. binary
• Extensibility vs. Efficiency
• Binary: netCDF/GeoTIFF/HDF/etc
– reflect favorite formats of committee members
– Different data models : ideally need a formal mapping (but there aren't any yet)
– Domain experts can make use
• GML closely follows the OGC/ISO data model (WFS requires GML)
3) Response : XML vs. binary
• GML is waaaay too complex
– Ambitious
– OGC/ISO models are complex
– Reality is complex
– XML Schema is a disaster
• Google KML – “visualization format” not “data storage”
THREDDS Data Server
[Architecture diagram. Labels: HTTP Tomcat Server (motherlode.ucar.edu); THREDDS Server application; NetCDF-Java library; services HTTPServer, NetcdfSubset, WCS, OPeNDAP; Datasets; IDD Data; catalog.xml; configCatalog.xml]
THREDDS Catalogs
• XML over HTTP
• Hierarchical listing of online resources (datasets)
• Container for arbitrary search metadata
– Standard set maps to DC, GCMD, ADN
– Unidata/NCAR-CDP
• Metadata can be inherited
• Design goal: Make it easy for data providers
• TDS uses extended version for configuration
• Data Access URLs
– “Crossing the protocol boundary”
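A minimal sketch of how a client might walk a catalog and turn inherited metadata into data access URLs. The catalog content, dataset names, and the `access_urls` helper are hypothetical; the element names and namespace follow the THREDDS InvCatalog 1.0 schema, and only the standard library is used.

```python
import xml.etree.ElementTree as ET

NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Hypothetical catalog: one service, one collection with inherited metadata.
CATALOG = """\
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="Example">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Model output">
    <metadata inherited="true">
      <serviceName>odap</serviceName>
    </metadata>
    <dataset name="Run A" urlPath="model/runA.nc"/>
    <dataset name="Run B" urlPath="model/runB.nc"/>
  </dataset>
</catalog>
"""


def access_urls(catalog_xml, host):
    """Return (dataset name, access URL) pairs, honoring inherited serviceName."""
    root = ET.fromstring(catalog_xml)
    bases = {s.get("name"): s.get("base") for s in root.findall("t:service", NS)}
    found = []

    def walk(node, service):
        # Metadata marked inherited="true" applies to this dataset and its children.
        for meta in node.findall("t:metadata", NS):
            if meta.get("inherited") == "true":
                name = meta.findtext("t:serviceName", default=None, namespaces=NS)
                service = name or service
        url_path = node.get("urlPath")
        if url_path and service in bases:
            # Access URL = host + service base + dataset urlPath.
            found.append((node.get("name"), host + bases[service] + url_path))
        for child in node.findall("t:dataset", NS):
            walk(child, service)

    for top in root.findall("t:dataset", NS):
        walk(top, None)
    return found


for name, url in access_urls(CATALOG, "http://hostname.edu"):
    print(name, url)
```

The resulting URL belongs to a different protocol (here OPeNDAP) than the catalog itself, which is one reading of "crossing the protocol boundary".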
THREDDS OPeNDAP Server
• OPeNDAP is a protocol for remote access to CDM
• Current version 2.0; NASA ESE standard
– Working on new 4.0 protocol spec
• Based on Java-OPeNDAP library
– shared development by Unidata/opendap.org
• Any CDM dataset can be served
• Server4 (Hyrax):
– latest version of opendap.org C++ library
– THREDDS Catalogs replace dods_dir
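For illustration, a DAP 2.0 request is an index-space request: the client appends a hyperslab constraint of the form `[start:stride:stop]` per dimension to the dataset URL. The sketch below only builds such a URL; the dataset URL, the variable name `Temperature`, and the helper `dap2_url` are hypothetical.

```python
def dap2_url(dataset_url, variable, slices, suffix="ascii"):
    """Build a DAP2 request URL with an index-space hyperslab constraint.

    slices holds one (start, stride, stop) tuple per dimension;
    stop is inclusive in DAP2.
    """
    hyperslab = "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in slices
    )
    return f"{dataset_url}.{suffix}?{variable}{hyperslab}"


# First time step, every second row from 10 to 20, columns 0 to 99.
print(dap2_url(
    "http://hostname.edu/thredds/dodsC/model/runA.nc",
    "Temperature",
    [(0, 1, 0), (10, 2, 20), (0, 1, 99)],
))
```

Other common DAP2 suffixes include `dds`, `das` and `dods`; the `ascii` default is chosen here only because its response is human-readable.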
THREDDS WCS service
• CDM files that have Grid coordinate system
– evenly spaced x,y
• Allows subsetting the dataset by:
– Lat/lon or projection bounding box
– time and vertical coordinate range
– list of Variables
• Return formats
– GeoTIFF floating point, grayscale
– NetCDF/CF-1.0
• No reprojections, resamplings
• Uses WCS 1.0, work on WCS 1.1 in progress
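A coordinate-space subset request of this kind can be sketched as a WCS 1.0 key-value GetCoverage URL. The endpoint path, coverage name, and `wcs_get_coverage_url` helper are assumptions for illustration; format identifiers are server-specific, and a strict WCS 1.0.0 server may also require CRS and grid-size parameters that are omitted here.

```python
from urllib.parse import urlencode


def wcs_get_coverage_url(endpoint, coverage, bbox, time_range, fmt="GeoTIFF"):
    """Build a WCS 1.0 GetCoverage URL for one variable, bbox and time range.

    bbox is (min_x, min_y, max_x, max_y); time_range is (start, end) in ISO 8601.
    """
    params = {
        "service": "WCS",
        "version": "1.0.0",
        "request": "GetCoverage",
        "coverage": coverage,
        "bbox": ",".join(str(v) for v in bbox),
        "time": "/".join(time_range),
        "format": fmt,  # assumed identifier; check the server's capabilities
    }
    # Keep ',', '/' and ':' readable instead of percent-encoding them.
    return endpoint + "?" + urlencode(params, safe=",/:")


print(wcs_get_coverage_url(
    "http://hostname.edu/thredds/wcs/model/runA.nc",
    "Temperature",
    (-110.0, 30.0, -90.0, 45.0),
    ("2007-09-01T00:00:00Z", "2007-09-02T00:00:00Z"),
))
```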
NetCDF Subset Service
• Experiment with REST style web service
• Allows subsetting the dataset by:
– Lat/lon bounding box
– time and vertical coordinate range
– list of Variables
• NetCDF/CF, XML, CSV (spreadsheet)
• Gridded Data
– Output is a CF-1.0 netCDF file
– Variation of WCS (simplified request protocol)
• Grid as Point Datasets (experimental)
– Extract vertical profile, time series from one point in model data
• Station Data: METARs (7 day rolling archive)
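The "grid as point" idea can be sketched as follows: find the grid cell nearest the requested location, pull that cell's value at every time step, and write the result as CSV. The grid values, axis values and helper names are hypothetical, and nearest-neighbor selection is an assumption; the slides do not say how the service picks the point.

```python
import csv
import io


def nearest_index(coords, target):
    """Index of the coordinate value closest to target."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))


def point_time_series(times, lats, lons, grid, lat, lon):
    """Extract (time, value) pairs from grid[t][y][x] at the nearest grid point."""
    j = nearest_index(lats, lat)
    i = nearest_index(lons, lon)
    return [(t, grid[k][j][i]) for k, t in enumerate(times)]


def to_csv(rows, header=("time", "value")):
    """Serialize rows as spreadsheet-friendly CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical model output: 2 time steps on a 3 x 3 grid.
times = ["2007-09-01T00:00:00Z", "2007-09-01T06:00:00Z"]
lats = [30.0, 35.0, 40.0]
lons = [-110.0, -105.0, -100.0]
grid = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[11, 12, 13], [14, 15, 16], [17, 18, 19]],
]

print(to_csv(point_time_series(times, lats, lons, grid, 36.0, -104.0)))
```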
Common Data Model
[Architecture diagram. Labels: HTTP Tomcat Server (hostname.edu); THREDDS Server Application; NetCDF-Java library; services HTTPServer, NetcdfSubset, WCS, OPeNDAP; Datasets; IDD Data; catalog.xml; annotation “Then a miracle happens”]
NetCDF-Java architecture
[Architecture diagram. Labels: Application; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NcML; NetcdfFile; OPeNDAP; THREDDS Catalog.xml; I/O service provider with NetCDF-3, NetCDF-4, HDF5, GRIB, GINI, NIDS, Nexrad, DMSP, …]
Common Data Model File Formats
• General: NetCDF, HDF5, OPeNDAP
• Gridded: GRIB-1, GRIB-2
• Radar: NEXRAD level II and level III, DORADE, Chinese NEXRAD
• Point: BUFR
• Satellite: DMSP, GINI
• In Progress: NetCDF-4, McIDAS AREA, NPOESS, NOAA CLASS legacy files, Barrodale DataBlade, others
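One way to picture how many formats can sit behind a single data model is a registry of providers, each able to recognize its own format from the leading bytes of a file. The classes and the `find_provider` function below are a hypothetical sketch of that pattern, not the NetCDF-Java I/O service provider interface. The signatures are the well-known magic numbers; real files may carry headers before them, which this sketch ignores.

```python
class IOServiceProvider:
    """Hypothetical provider: recognizes one file format from its first bytes."""
    name = "unknown"

    def is_valid(self, header: bytes) -> bool:
        raise NotImplementedError


class NetCDF3Provider(IOServiceProvider):
    name = "NetCDF-3"

    def is_valid(self, header):
        # "CDF" followed by version byte 1 (classic) or 2 (64-bit offset).
        return header[:3] == b"CDF" and header[3:4] in (b"\x01", b"\x02")


class HDF5Provider(IOServiceProvider):
    name = "HDF5"

    def is_valid(self, header):
        return header.startswith(b"\x89HDF\r\n\x1a\n")


class GribProvider(IOServiceProvider):
    name = "GRIB"

    def is_valid(self, header):
        # "GRIB" marker, with the edition number (1 or 2) in the eighth byte.
        return header.startswith(b"GRIB") and len(header) >= 8 and header[7] in (1, 2)


PROVIDERS = [NetCDF3Provider(), HDF5Provider(), GribProvider()]


def find_provider(header):
    """Return the name of the first provider that recognizes the header, else None."""
    for provider in PROVIDERS:
        if provider.is_valid(header):
            return provider.name
    return None


print(find_provider(b"CDF\x01\x00\x00\x00\x00"))
print(find_provider(b"GRIB\x00\x00\x00\x02"))
```

Supporting a new format then means adding one provider to the list; code above the registry does not change.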
Common Data Model Layers
[Layer diagram. Labels: Scientific Datatypes (Grid, Point, Radial, Trajectory, Swath, Station Profile); Coordinate Systems; Data Access]
Common Data Model (Data Access Layer)
Coordinate Systems UML
NetCDF-4 file format
• NetCDF-4 C library
– 4.0 Beta implements CDM access layer
• Persistence format for complete CDM
– 4.1: adding Coordinate Systems: optional layer, focus on CF-1 (libcf)
– 4.?: merge OPeNDAP access
• NetCDF-Java library will read, maybe write
TDS / NcML aggregation
<dataset name="WEST-CONUS_4km Aggregation" urlPath="satellite/3.9/WEST-CONUS_4km">
  <netcdf>
    <aggregation dimName="time" type="joinNew">
      <scan location="/data/ldm/pub/satellite/3.9/WEST-CONUS_4km/" suffix=".gini" />
    </aggregation>
  </netcdf>
</dataset>
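Conceptually, a `joinNew` aggregation stacks the same variable from every scanned file along a new outer dimension (here `time`). The sketch below mimics that with nested lists; the `files` data, the variable name `IR`, and the helpers `shape` and `join_new` are hypothetical. It does not show how coordinate values for the new dimension are assigned, since the snippet above does not say.

```python
def shape(array):
    """Shape of a regular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0] if array else None
    return tuple(dims)


def join_new(files, variable):
    """Stack one variable from each file along a new outer dimension."""
    arrays = [f[variable] for f in files]
    if len({shape(a) for a in arrays}) > 1:
        raise ValueError("all files must hold the variable with the same shape")
    return arrays  # result shape: (number of files,) + per-file shape


# Three hypothetical single-time images, each a 2 x 2 grid.
files = [
    {"IR": [[1, 2], [3, 4]]},
    {"IR": [[5, 6], [7, 8]]},
    {"IR": [[9, 10], [11, 12]]},
]

aggregated = join_new(files, "IR")
print(shape(aggregated))
```

Each file contributes exactly one slice along the new dimension, so the number of scanned files sets the length of `time`.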
Forecast Model Run Collection (FMRC)
Scientific DataTypes
• Based on datasets Unidata is familiar with
– APIs are evolving
• How are data points connected?
• Intended to scale to large, multifile collections
• Intended to support “specialized queries”
– Space, Time
• Intend to create “standard” NetCDF file encoding conventions
Scientific DataTypes
• Grids
– Structured
– Swath
– Unstructured
• Point Observation
– Unconnected
– Station / Time Series
– Trajectory
– Profile
• Radial
Climate and Forecast (CF) Conventions
• Conventions for encoding coordinate systems, other semantics in netCDF
• Working for 10 years
– Version 1.0 in 2003
– Good for gridded data
• Current working groups
– Point/Station/Trajectory/Profile observations
– CRS (map to OGC)
• Governance in place
• Volunteer: motivated, practical, real
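As a small example of the semantics CF encodes, a time coordinate stores plain numbers plus a `units` attribute of the form "UNIT since REFERENCE". The sketch below decodes such values. The `decode_cf_time` helper is hypothetical, and it assumes the standard calendar, UTC, and a single reference-date layout, whereas CF permits more.

```python
from datetime import datetime, timedelta, timezone

_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_cf_time(values, units):
    """Decode 'UNIT since YYYY-MM-DD HH:MM:SS' offsets into UTC datetimes."""
    unit, since, reference = units.split(" ", 2)
    if since != "since" or unit not in _UNIT_SECONDS:
        raise ValueError(f"unsupported time units: {units!r}")
    epoch = datetime.strptime(reference, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return [epoch + timedelta(seconds=v * _UNIT_SECONDS[unit]) for v in values]


# Hypothetical attributes of a CF time coordinate variable.
time_attrs = {"standard_name": "time", "units": "hours since 2007-09-01 00:00:00"}

for stamp in decode_cf_time([0, 6, 12], time_attrs["units"]):
    print(stamp.isoformat())
```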
Summary: Unidata’s directions
• Client: both Scientific User and Decision Support
• Request in coordinate space
– WCS is fine, not a big architectural decision
• Server: TDS
– Files in native format, augmented by indexing/DB
• Response: netCDF/CF and GeoTIFF/KML or WMS/JPEG