NetCDF

NetCDF Ed Hartnett Unidata/UCAR [email protected]

NetCDF. Ed Hartnett Unidata/UCAR [email protected]. Unidata. Unidata - helps universities acquire, display, and analyze Earth-system data. UCAR – University Corporation for Atmospheric Research - a nonprofit consortium of 66 universities. SDSC Presentation, July 2005.

Page 1: NetCDF

NetCDF

Ed Hartnett, Unidata/UCAR

[email protected]

Page 2: NetCDF

Unidata

• Unidata - helps universities acquire, display, and analyze Earth-system data.

• UCAR – University Corporation for Atmospheric Research - a nonprofit consortium of 66 universities.

Page 3: NetCDF

SDSC Presentation, July 2005

• Intro to NetCDF Classic

• Intro to NetCDF-4

Page 4: NetCDF

What is NetCDF?

• A conceptual data model for scientific data.

• A set of APIs in C, F77, F90, Java, etc. to create and manipulate data files.

• Some portable binary formats.

• Useful for storing arrays of data and accompanying metadata.

Page 5: NetCDF

History of NetCDF

• 1988: netCDF developed at Unidata

• 1991: netCDF 2.0 released

• 1996: netCDF 3.0 released

• 2004: netCDF 3.6.0 released

• 2005: netCDF 4.0 beta released

Page 6: NetCDF

Getting netCDF

• Download latest release from the netCDF web page: http://www.unidata.ucar.edu/content/software/netcdf

• Builds and installs on most platforms with no configuration necessary.

• For a list of platforms on which netCDF versions have been built, and the output of building and testing netCDF, see the web site.

Page 7: NetCDF

NetCDF Portability

• NetCDF is tested on a wide variety of platforms, including Linux, AIX, SunOS, MacOS, IRIX, OSF1, Cygwin, and Windows.

• We test with native compilers when we can get them.

• 64-bit builds are supported with some configuration effort.

Page 8: NetCDF

What Comes with NetCDF

• NetCDF comes with 4 language APIs: C, C++, Fortran 77, and Fortran 90.

• Tools ncgen and ncdump.

• Tests.

• Documentation.

Page 9: NetCDF

NetCDF Java API

• The netCDF Java API is entirely separate from the C API.

• You don’t need to install the C API for the Java API to work.

• Java API contains many exciting features, such as remote access and more advanced coordinate systems.

Page 10: NetCDF

Tools to work with NetCDF Data

• The netCDF core library provides basic data access.

• ncgen and ncdump provide some helpful command line functionality.

• Many additional tools are available, see: http://www.unidata.ucar.edu/packages/netcdf/software.html

Page 11: NetCDF

CDL – Common Data Language

• Grammar defined for displaying information about netCDF files.

• Can be used to create files without programming.

• Can be used to generate a Fortran or C program (ncgen emits source code that creates the file described by the CDL).

• Used by ncgen/ncdump utilities.

Page 12: NetCDF

Example of CDL

netcdf foo {    // example netCDF specification in CDL

dimensions:
    lat = 10, lon = 5, time = unlimited;

variables:
    int     lat(lat), lon(lon), time(time);
    float   z(time,lat,lon), t(time,lat,lon);
    double  p(time,lat,lon);
    int     rh(time,lat,lon);

    lat:units = "degrees_north";
    lon:units = "degrees_east";

data:
    lat = 0, 10, 20, 30, 40, 50, 60, 70, 80, 90;
    lon = -140, -118, -96, -84, -52;
}

Page 13: NetCDF

Software Architecture of NetCDF-3

[Diagram components: V3 C API, V2 C API, F77 API, F90 API, C++ API, ncdump, ncgen, V2 C tests, V3 C tests, F77 tests.]

• Fortran, C++ and V2 APIs are all built on the C API.

• Other language APIs (perl, python, MatLab, etc.) use the C API.

Page 14: NetCDF

NetCDF Documentation

• Unidata distributes a NetCDF Users Guide which describes the data model in detail.

• A language-specific guide is provided for C, C++, Fortran 77, and Fortran 90 users.

• All documentation can be found at: http://my.unidata.ucar.edu/content/software/netcdf/docs

Page 15: NetCDF

NetCDF Jargon

• “Variable” – a multi-dimensional array of data, of any of 6 types (char, byte, short, int, float, or double).

• “Dimension” – information about an axis: its name and length.

• “Attribute” – a 1D array of metadata.

Page 16: NetCDF

More NetCDF Jargon

• “Coordinate Variable” – a 1D variable with the same name as a dimension, which stores values for each dimension value.

• “Unlimited Dimension” – a dimension which has no maximum size. Data can always be extended along the unlimited dimension.
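The jargon above can be made concrete with a small sketch. The structs below are hypothetical in-memory descriptions invented for illustration (they are not part of the netCDF API); the function simply applies the definition of a coordinate variable given on this slide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory description of a dimension and a variable;
 * these structs are illustrative and are not part of the netCDF API. */
struct dim {
    const char *name;
    size_t len;        /* current length */
    bool unlimited;    /* true for an unlimited dimension */
};

struct var {
    const char *name;
    int ndims;
    const int *dimids; /* indexes into an array of struct dim */
};

/* A coordinate variable is 1D and shares its name with its dimension. */
bool is_coordinate_variable(const struct var *v, const struct dim *dims)
{
    return v->ndims == 1 && strcmp(v->name, dims[v->dimids[0]].name) == 0;
}
```

With the dimensions from the CDL example (lat, lon, time), the variable lat(lat) would qualify as a coordinate variable, while z(time,lat,lon) would not.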

Page 17: NetCDF

The NetCDF Classic Data Model

• The netCDF Classic Data Model contains dimensions, variables, and attributes.

• At most one dimension may be unlimited.

• The Classic Data Model is embodied by netCDF versions 1 through 3.6.0.

• NetCDF is moving towards a new, richer data model: the Common Data Model.

Page 18: NetCDF

NetCDF Example

• Suppose a user wants to store temperature and pressure values on a 2D latitude/longitude grid.

• In addition to the data, the user wants to store information about the lat/lon grid.

• The user may have additional data to store, for example the units of the data values.

Page 19: NetCDF

NetCDF Model Example

• Variables: temperature, pressure

• Dimensions: latitude, longitude

• Attributes: Units: C (temperature), Units: mb (pressure)

• Coordinate Variables: latitude, longitude

Page 20: NetCDF

Important NetCDF Functions

• nc_create and nc_open to create and open files.

• nc_enddef, nc_close.

• nc_def_dim, nc_def_var, nc_put_att_*, to define dimensions, variables, and attributes.

• nc_inq, nc_inq_var, nc_inq_dim, nc_get_att_* to learn about dims, vars, and atts.

• nc_put_vara_*, nc_get_vara_* to write and read data.

Page 21: NetCDF

C Functions to Define Metadata

/* Create the file. */
if ((retval = nc_create(FILE_NAME, NC_CLOBBER, &ncid)))
    return retval;

/* Define the dimensions. */
if ((retval = nc_def_dim(ncid, LAT_NAME, LAT_LEN, &lat_dimid)))
    return retval;
if ((retval = nc_def_dim(ncid, LON_NAME, LON_LEN, &lon_dimid)))
    return retval;

/* Define the variables. Both use the (lat, lon) dimensions. */
dimids[0] = lat_dimid;
dimids[1] = lon_dimid;
if ((retval = nc_def_var(ncid, PRES_NAME, NC_FLOAT, NDIMS, dimids, &pres_varid)))
    return retval;
if ((retval = nc_def_var(ncid, TEMP_NAME, NC_FLOAT, NDIMS, dimids, &temp_varid)))
    return retval;

/* End define mode. */
if ((retval = nc_enddef(ncid)))
    return retval;

Page 22: NetCDF

C Functions to Write Data

/* Write the data. */
if ((retval = nc_put_var_float(ncid, pres_varid, pres_out)))
    return retval;
if ((retval = nc_put_var_float(ncid, temp_varid, temp_out)))
    return retval;

/* Close the file. */
if ((retval = nc_close(ncid)))
    return retval;
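The write calls above take the whole variable from one contiguous buffer. In the C API the last dimension varies fastest, so a variable defined on (lat, lon) is stored row by row. The sketch below shows one way such a buffer might be filled; the values of LAT_LEN and LON_LEN, the helper names, and the sample pressure values are assumptions made for illustration and do not come from the slides.

```c
#include <stddef.h>

#define LAT_LEN 6   /* assumed value; the slides do not give it */
#define LON_LEN 12  /* assumed value; the slides do not give it */

/* Offset of element (lat, lon) in a buffer declared with dimensions
 * (lat, lon): the last dimension, lon, varies fastest. */
size_t grid_index(size_t lat, size_t lon)
{
    return lat * LON_LEN + lon;
}

/* Fill a pressure buffer with made-up sample values so that each
 * element encodes its own (lat, lon) position. */
void fill_sample_pressure(float *pres_out)
{
    for (size_t lat = 0; lat < LAT_LEN; lat++)
        for (size_t lon = 0; lon < LON_LEN; lon++)
            pres_out[grid_index(lat, lon)] = 900.0f + (float)(lat * LON_LEN + lon);
}
```

A buffer of LAT_LEN * LON_LEN floats filled this way could then be handed to nc_put_var_float as pres_out.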

Page 23: NetCDF

C Example – Getting Data

/* Open the file. */
if ((retval = nc_open(FILE_NAME, 0, &ncid)))
    return retval;

/* Read the data. */
if ((retval = nc_get_var_float(ncid, 0, pres_in)))
    return retval;
if ((retval = nc_get_var_float(ncid, 1, temp_in)))
    return retval;

/* Do something useful with the data… */

/* Close the file. */
if ((retval = nc_close(ncid)))
    return retval;

Page 24: NetCDF

Data Reading and Writing Functions

• There are 5 ways to read/write data of each type.

• var1 – reads/writes a single value.

• var – reads/writes entire variable at once.

• vara – reads/writes an array subset.

• vars – reads/writes an array by slices.

• varm – reads/writes a mapped array.

• Ex.: nc_put_vars_short writes shorts by slices.
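To make the difference between these access modes concrete, here is a minimal sketch of what a vars-style selection picks out of a 2D row-major array. The function copy_strided_2d is a hypothetical helper written for this illustration; it mimics the start, count, and stride arguments but is not how the library is implemented. With every stride set to 1 the selection reduces to a vara-style subset, and with every count set to 1 it reduces to a single var1-style value.

```c
#include <stddef.h>

/* Copy a strided 2D subset out of a row-major array.
 * start[d], count[d], and stride[d] are the first index, the number of
 * values, and the step along dimension d. ncols is the length of the
 * second dimension of the source array.
 * Returns the number of values copied. Illustrative only. */
size_t copy_strided_2d(const short *src, size_t ncols,
                       const size_t start[2], const size_t count[2],
                       const ptrdiff_t stride[2], short *dst)
{
    size_t n = 0;
    for (size_t i = 0; i < count[0]; i++) {
        size_t row = start[0] + i * (size_t)stride[0];
        for (size_t j = 0; j < count[1]; j++) {
            size_t col = start[1] + j * (size_t)stride[1];
            dst[n++] = src[row * ncols + col];
        }
    }
    return n;
}
```

For a 4 x 6 array, start = {1, 0}, count = {2, 3}, and stride = {2, 2} would select rows 1 and 3 and columns 0, 2, and 4.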

Page 25: NetCDF

Attributes

• Attributes are 1-D arrays of any of the 6 netCDF types.

• Read/write them with functions like: nc_get_att_float and nc_put_att_int.

• Attributes may be attached to a variable, or may be global to the file.

Page 26: NetCDF

NetCDF File Formats

• Starting with 3.6.0, netCDF supports two binary data formats.

• NetCDF Classic Format is the format that has been in use for netCDF files from the beginning.

• NetCDF 64-bit Offset Format was introduced in 3.6.0 and allows much larger files.

• Use classic format unless you need the large files.
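The size limit of the classic format comes from storing the position of each variable as a 32-bit offset, which the 64-bit offset format widens. The sketch below is a simplified rule of thumb, not the library's actual check: it assumes variables are laid out back to back after the header and ignores padding and record variables. The function name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified rule of thumb: classic format records where each variable
 * begins as a 32-bit offset, so every variable must start below 2 GiB.
 * header_size and var_sizes are in bytes; variables are assumed to be
 * laid out back to back after the header (padding and record variables
 * are ignored in this sketch). */
bool fits_classic_offsets(uint64_t header_size,
                          const uint64_t *var_sizes, size_t nvars)
{
    uint64_t begin = header_size;
    for (size_t i = 0; i < nvars; i++) {
        if (begin > (uint64_t)INT32_MAX)
            return false;
        begin += var_sizes[i];
    }
    return true;
}
```

When a layout does not fit, the file would need the 64-bit offset format, which in the C API is requested by adding the NC_64BIT_OFFSET flag to the mode passed to nc_create.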

Page 27: NetCDF

NetCDF-3 Summary

• NetCDF is a software library and some binary data formats, useful for scientific data, developed at Unidata.

• NetCDF organizes data into variables, with dimensions and attributes.

• NetCDF has proven to be reliable, simple to use, and very popular.

Page 28: NetCDF

Why Add to NetCDF-3?

• Increasingly complex data sets call for greater organization.

• Size limits, unthinkably huge in 1988, are routinely reached in 2005.

• Parallel I/O is required for advanced Earth science applications.

• Interoperability with HDF5.

Page 29: NetCDF

NetCDF-4

• NetCDF-4 aims to provide the netCDF API as a front end for HDF5.

• Funded by NASA, executed at Unidata and NCSA.

• Includes reliable netCDF-3 code, and is fully backward compatible.

Page 30: NetCDF

NetCDF-4 Organizations

• Unidata/UCAR

• NCSA – The National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

• NASA – NetCDF-4 was funded by NASA award number AIST-02-0071.

Page 31: NetCDF

New Features of NetCDF-4

• Multiple unlimited dimensions.

• Groups to organize data.

• New types, including compound types and variable length arrays.

• Parallel I/O.

Page 32: NetCDF

The Common Data Model

• NetCDF-4, scheduled for beta-release this Summer, will conform to the Common Data Model.

• Developed by John Caron at Unidata, with the cooperation of HDF, OpenDAP, netCDF, and other software teams, CDM unites different models into a common framework.

• CDM is a superset of the NetCDF Classic Data Model

Page 33: NetCDF

The NetCDF-4 Data Model

• NetCDF-4 implements the Common Data Model.

• Adds groups; each group can contain variables, attributes, dimensions, and other groups.

• Dimensions are scoped so that variables in different groups can share dimensions.

• Compound types allow users to define new types, composed of other atomic or user-defined types.

• New integer and string types.
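Dimension scoping means a dimension defined in a group is visible in that group and in the groups nested beneath it. A minimal sketch of that lookup rule follows; the struct and the function find_dim_scope are hypothetical and stand in for whatever the library does internally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory group tree; not the netCDF-4 API. */
struct group {
    const struct group *parent;   /* NULL for the root group */
    const char *const *dim_names; /* dimensions defined in this group */
    size_t ndims;
};

/* Return the nearest group, starting at g and walking toward the root,
 * that defines a dimension called name, or NULL if none does. */
const struct group *find_dim_scope(const struct group *g, const char *name)
{
    for (; g != NULL; g = g->parent)
        for (size_t i = 0; i < g->ndims; i++)
            if (strcmp(g->dim_names[i], name) == 0)
                return g;
    return NULL;
}
```

Two sibling groups can share a dimension by defining it once in their common parent, while a dimension defined inside one sibling stays invisible to the other.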

Page 34: NetCDF

Software Architecture of NetCDF-4

[Diagram components: V4 C API, V3 C API, HDF5, V2 C API, F77 API, F90 API, C++ API, ncdump, ncgen, V2 C tests, V3 C tests, F77 tests.]

Page 35: NetCDF

NetCDF-4 Release Status

• Latest alpha release includes all netCDF-4 features – depends on latest HDF5 development snapshot.

• Beta release – due out in August, replaces artificial netCDF-4 constructs, and depends on a yet-to-be-released version of HDF5.

• Promotion from beta to full release will happen sometime in 2006.

Page 36: NetCDF

Building NetCDF-4

• NetCDF-4 requires that HDF5 version 1.8.3 be installed. This is not released yet.

• The latest HDF5 development release works with the latest netCDF alpha release.

• To build netCDF-4, specify --enable-netcdf-4 at configure.

Page 37: NetCDF

When to Use NetCDF-4 Format

• The new netCDF-4 features (groups, new types, parallel I/O) are only available for netCDF-4 format files.

• When you need HDF5 files.

• When portability is less important, until netCDF-4 becomes widespread.

Page 38: NetCDF

Versions and Formats

[Timeline of formats: Classic Format, 64-Bit Offset Format, NetCDF-4 Format]

• 1988: netCDF developed by Glenn Davis

• 1991: netCDF 2.0 released

• 1996: netCDF 3.0 released

• 2004: netCDF 3.6.0 released

• 2005: netCDF 4.0 beta released

Page 39: NetCDF

NetCDF-4 Feature Review

• Multiple unlimited dimensions.

• How to use groups.

• Using compound types.

• Other new types.

• Variable length arrays.

• Parallel I/O.

• HDF5 Interoperability.

Page 40: NetCDF

Multiple Unlimited Dimensions

• Unlimited dimensions are automatically expanded as new data are written.

• NetCDF-4 allows multiple unlimited dimensions.
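The automatic expansion can be pictured as bookkeeping on the current length of each unlimited dimension: a write that reaches past the current end pushes the length out. The helper below is a hypothetical sketch of that rule, treating every dimension as unlimited; it does not call the library.

```c
#include <stddef.h>

/* Grow the current lengths of unlimited dimensions to cover a write of
 * count[d] values starting at start[d]. dim_len holds the current length
 * of each dimension and is updated in place. Illustrative only: every
 * dimension here is treated as unlimited. */
void extend_for_write(size_t *dim_len, const size_t *start,
                      const size_t *count, size_t ndims)
{
    for (size_t d = 0; d < ndims; d++) {
        size_t end = start[d] + count[d];
        if (end > dim_len[d])
            dim_len[d] = end;
    }
}
```

Starting from two empty dimensions, writing a 2 x 3 block at the origin would leave lengths of 2 and 3, and a later single-value write at index (5, 1) would stretch only the first dimension, to 6.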

Page 41: NetCDF

Working with Groups

• Define a group, then use it as a container for the classic data model.

• Groups can be used to organize sets of data.

Page 42: NetCDF

An Example of Groups

[Diagram: three groups, Model_Run_1a, Model_Run_1, and Model_Run_2. Each contains rh, lat, lon, and temp, with units attributes and a history attribute.]

Page 43: NetCDF

New Functions to Use Groups

• Open/create returns ncid of root group.

• Create a new group with nc_def_grp.

nc_def_grp(int parent_ncid, char *name, int *new_ncid);

• Learn about groups with nc_inq_grps.

nc_inq_grps(int ncid, int *numgrps, int *ncids);

Page 44: NetCDF

C Example Using Groups

if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
if (nc_def_grp(ncid, DYNASTY, &tudor_id)) ERR;
if (nc_def_dim(tudor_id, DIM1_NAME, NC_UNLIMITED, &dimid)) ERR;
if (nc_def_grp(tudor_id, HENRY_VII, &henry_vii_id)) ERR;
if (nc_def_var(henry_vii_id, VAR1_NAME, NC_INT, 1, &dimid, &varid)) ERR;
if (nc_put_vara_int(henry_vii_id, varid, start, count, data_out)) ERR;
if (nc_close(ncid)) ERR;

Page 45: NetCDF

Create Complex Types

• Like C structs, compound types can be assembled into a user defined type.

• Compound types can be nested – that is, they can contain other compound types.

• New functions are needed to create new types.

• V2 API functions are used to read/write complex types.

Page 46: NetCDF

C Example of Compound Types

/* Create a file with a compound type. Write a little data. */
if (nc_create(FILE_NAME, NC_NETCDF4, &ncid)) ERR;
if (nc_def_compound(ncid, sizeof(struct s1), SVC_REC, &typeid)) ERR;
if (nc_insert_compound(ncid, typeid, BATTLES_WITH_KLINGONS,
                       HOFFSET(struct s1, i1), NC_INT)) ERR;
if (nc_insert_compound(ncid, typeid, DATES_WITH_ALIENS,
                       HOFFSET(struct s1, i2), NC_INT)) ERR;
if (nc_def_dim(ncid, STARDATE, DIM_LEN, &dimid)) ERR;
/* Pass the dimension ID that nc_def_dim just returned. */
if (nc_def_var(ncid, SERVICE_RECORD, typeid, 1, &dimid, &varid)) ERR;
if (nc_put_var(ncid, varid, data)) ERR;
if (nc_close(ncid)) ERR;
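Each field inserted into a compound type needs the byte offset of the matching struct member. The slide does not show the definition of struct s1, so the layout below (two int members) is an assumption chosen to match the two NC_INT fields; the accessor functions are hypothetical and exist only to show the values that the standard offsetof macro yields.

```c
#include <stddef.h>

/* Assumed layout of struct s1; the slide uses it but does not show its
 * definition. Two int members match the two NC_INT fields inserted. */
struct s1 {
    int i1;
    int i2;
};

/* Byte offsets of the two fields, as offsetof reports them. These are
 * the values the offset argument of each insert call needs. */
size_t s1_offset_i1(void) { return offsetof(struct s1, i1); }
size_t s1_offset_i2(void) { return offsetof(struct s1, i2); }
```

Taking offsets from the compiler rather than hard-coding them keeps the compound type in step with the struct if members are later added or reordered.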

Page 47: NetCDF

New Ints, Opaque, String Types

• Opaque types are bit-blobs of fixed size.

• String types allow multi-dimensional arrays of strings.

• New integer types: UBYTE, USHORT, UINT, UINT64, INT64.

Page 48: NetCDF

Variable Length Arrays

• Variable length arrays allow the efficient storage of arrays of variable size.

• For example: an array of soundings, each with a different number of elements.
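The storage saving can be seen by comparing a ragged layout with a fixed-size array padded to the longest sounding. The struct below is a hypothetical type written for this illustration, pairing a length with a pointer to the values; it is a sketch of the idea rather than the library's own definition.

```c
#include <stddef.h>

/* One variable-length element: a length plus a pointer to the values.
 * Hypothetical type for illustration. */
struct sounding {
    size_t len;          /* number of levels in this sounding */
    const float *temps;  /* temperature at each level */
};

/* Total number of values stored across all soundings. */
size_t total_levels(const struct sounding *s, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += s[i].len;
    return total;
}

/* Values a fixed-size 2D array would need: n rows padded to the longest. */
size_t padded_levels(const struct sounding *s, size_t n)
{
    size_t longest = 0;
    for (size_t i = 0; i < n; i++)
        if (s[i].len > longest)
            longest = s[i].len;
    return n * longest;
}
```

Three soundings with 3, 1, and 5 levels hold 9 values in the ragged layout, where a padded 3 x 5 array would reserve 15.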

Page 49: NetCDF

Parallel I/O with NetCDF-4

• Must use configure option --enable-parallel when building netCDF.

• Depends on HDF5 parallel features, which require MPI.

• Must create or open file with nc_create_par or nc_open_par.

• All metadata operations are collective.

• Adding a new record is collective.

• Variable reads/writes are independent by default, but can be changed to do collective operations.
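For independent reads and writes, each process typically works on its own slab of a variable. One common way to divide a dimension among processes is an even block decomposition, sketched below. This is an assumption about how an application might organize its writes, not something the slides prescribe; block_decompose is a hypothetical helper and makes no MPI or netCDF calls.

```c
#include <stddef.h>

/* Split len values along one dimension among nprocs processes as evenly
 * as possible. On return, *start and *count describe the contiguous slab
 * owned by process rank (0-based). The first len % nprocs ranks get one
 * extra value. Hypothetical helper; no MPI or netCDF calls are made. */
void block_decompose(size_t len, size_t nprocs, size_t rank,
                     size_t *start, size_t *count)
{
    size_t base = len / nprocs;
    size_t extra = len % nprocs;
    *count = base + (rank < extra ? 1 : 0);
    *start = rank * base + (rank < extra ? rank : extra);
}
```

Each process could then pass its own start and count to a vara-style write, so that the slabs cover the dimension without overlapping.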

Page 50: NetCDF

HDF5 Interoperability

• NetCDF-4 can interoperate with HDF5 with a SUBSET of HDF5 features.

• Will not work with HDF5 files that have looping groups, references, and types not found in netCDF-4.

• HDF5 file must use new dimension scale API to store shared dimension info.

• If an HDF5 file follows the Common Data Model, NetCDF-4 can interoperate on the same file.

Page 51: NetCDF

Future Plans for NetCDF

• NetCDF 4.0 release in 2006.

• Beta for next major version of netCDF in Summer, 2006.

• Full compatibility with Common Data Model.

• Remote access, including remote subsetting of data.

• XML-based representation of netCDF metadata.

• Full Fortran 90 support, but limited F77 support.

Page 52: NetCDF

For Further Information

• netCDF mailing list: [email protected]

• email Ed: [email protected]

• netCDF web site: www.unidata.ucar.edu