NASA HDF/HDF-EOS Data Access Challenges

Post on 16-Apr-2017

www.hdfgroup.org

The HDF Group

ESIP 2013 Summer Meeting 1

NASA HDF/HDF-EOS Data Access Challenges

H. Joe Lee (hyokee@hdfgroup.org)

Kent Yang (myang6@hdfgroup.org)

The HDF Group

July 9, 2013

Hal Varian, Google’s chief economist

“The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s going to be a hugely important skill in the next decades.”

For Earth Science Data Users

The ability to take NASA HDF/HDF-EOS data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it – that’s a hugely important skill right now.

Is it easy to take NASA HDF data?

No, not for the average data user.

Understand

“I'm new to IDL and HDF; and I'm currently working with MODIS L1B data. I found your examples very helpful. Is it possible to show how radiance is calculated?”

Process

“I work in NASA/GSFC GES-DISC on AIRS project. We have new idl version 8.1. But got a core dump error when we run EOS function EOS_SW_INQSWATH to inqure swath name from a AIRS level 2 product file. Need your help. Thanks.”

Extract Values

“Hi, I want to use the following TRMM data, http://mirador.gsfc.nasa.gov/...2A25.... Can you provide me some programs that deal with these datasets so that I can obtain the daily convective precipitation in the region 110-180E,0-40N during 2006?”

Visualize

“Can you please make the matlab file for reading ozone hdf5 files obtained from mls available to the public. I wanted to obtain ozone distribution over the world and ozone distributions with height etc. thank you :)….oh can you tell me which function can i use to plot latitude in the x-axis, pressure in the y-axis and a contour plot of ozone over it?”

Communicate

“Your prog is very helpful to verify my process. I have one more doubt. I am trying to convert this hdf to Geotiff using Matlab. Do have any written code to do the same. Doing it with HEG tool given an error specifying that 5D are only supported for SOM projections. Also I am doing all processing with Matlab. So could you pl. help me.”

NASA HDF Users See Challenges

in accessing satellite-product-specific (MODIS, AIRS, MLS), geo-location/time-specific (lat/lon/height/year) data with their favorite software packages (MATLAB/IDL/ArcGIS).

What Makes Access Challenging?

1. Some files use techniques that end users may not be familiar with, although the techniques may help store data efficiently.

2. Information from a source outside the files is required to retrieve the data in a physically meaningful manner.

3. Attributes do not comply with widely used conventions.

4. Metadata in the HDF file contains incorrect information.

Converted File Size Comparison

[Bar chart: size of the same file in three formats]

• HDF-EOS2: 72M

• NetCDF-4: 128M

• NetCDF-3: 656M (9X the HDF-EOS2 size)

Challenge 1: Unfamiliar Techniques

Users look for Latitude/Longitude datasets that match variable (e.g., Ozone) datasets.

Some HDF products have

• mismatched lat/lon.

• lat/lon information in metadata attribute.

• duplicate lat/lon information.

Swath Dimension Map Example

An HDF-EOS swath dimension map allows geolocation fields and data fields to have dimensions of different sizes.

• Latitude[512][512]

• Longitude[512][512]

• Data[1024][1024]
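A dimension map relates each geolocation index to a data index through an offset and an increment. The sketch below shows one way a user might expand the coarse Latitude array to the size of the data array. It assumes an offset of 0 and an increment of 2 along both dimensions (the slide gives only the array sizes) and fills the gaps by linear interpolation with NumPy. `expand_geolocation` is a hypothetical helper, not part of the HDF-EOS library, and the latitude values are made up for illustration.

```python
import numpy as np


def expand_geolocation(geo, data_shape, offset=(0, 0), increment=(2, 2)):
    """Interpolate a coarse 2-D geolocation array to the data array's shape.

    Assumed mapping: geolocation index i along an axis corresponds to data
    index offset + increment * i along the same axis.
    """
    out = np.asarray(geo, dtype=np.float64)
    for axis in (0, 1):
        # Data-space positions at which the coarse samples are defined.
        src = offset[axis] + increment[axis] * np.arange(out.shape[axis])
        # Every data-space position that needs a value.
        dst = np.arange(data_shape[axis])
        # np.interp holds the edge value for positions outside src.
        out = np.apply_along_axis(lambda v: np.interp(dst, src, v), axis, out)
    return out


# Sizes from the slide: 512 x 512 geolocation, 1024 x 1024 data.
coarse_lat = np.linspace(30.0, 40.0, 512)[:, None] * np.ones((1, 512))
full_lat = expand_geolocation(coarse_lat, (1024, 1024))
print(full_lat.shape)
```

Plain linear interpolation is a simplification: longitudes that cross the date line, and pixels beyond the last coarse sample, would need extra care in real use.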

NSIDC AMSR_E NCL Example

; Read the file as an HDF4 file to obtain dataset attributes.
hdf4_file = addfile("AMSR_E_L3_WeeklyOcean_V03_20020616.hdf", "r")

; Read the file as an HDF-EOS2 file to obtain lat and lon.
hdfeos2_file = addfile("AMSR_E_L3_WeeklyOcean_V03_20020616.hdf.he2", "r")

Users must call both the HDF4 and HDF-EOS2 APIs:

• The HDF4 API alone cannot resolve lat/lon.

• The HDF-EOS2 API alone cannot retrieve some attributes that are added later by HDF4 APIs.

Challenge 2: Information Outside HDF

Users must read the data product manual to find

• fill value / valid ranges

• units or discrete key values

• scale / offset equation

• physical description of data

Some products are not self-describing!
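In practice, this means the user copies values out of the manual and applies them by hand. The sketch below illustrates that step with NumPy. Every number in `MANUAL`, the units, and the helper name `to_physical` are hypothetical; the scale/offset equation used here (raw * scale_factor + add_offset) is also an assumption, since the real equation must come from the product's own documentation.

```python
import numpy as np

# Hypothetical values that a user would copy from a product manual.
MANUAL = {
    "fill_value": -9999,
    "valid_range": (0, 32767),
    "scale_factor": 0.01,
    "add_offset": 0.0,
    "units": "K",
}


def to_physical(raw, info):
    """Mask fill and out-of-range values, then apply scale and offset."""
    raw = np.asarray(raw)
    low, high = info["valid_range"]
    invalid = (raw == info["fill_value"]) | (raw < low) | (raw > high)
    values = raw * info["scale_factor"] + info["add_offset"]
    return np.ma.masked_array(values, mask=invalid)


# Illustrative raw counts: two usable values, one fill, one out of range.
raw = np.array([27315, -9999, 30000, -5], dtype=np.int16)
physical = to_physical(raw, MANUAL)
print(physical, MANUAL["units"])
```

Without the manual's values, the same array would be plotted as raw counts with the fill value dominating the color scale.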

Without Information Outside HDF

With Information Outside HDF

Challenge 3: The CF Conventions

Following the widely accepted CF conventions is important for interoperability, but some HDF products

• use non-alphanumeric characters.

• use non-CF attribute names and values.

• use non-CF scale / offset rules.

• use a different data type for an attribute (e.g., _FillValue) than for its variable.
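To make the scale / offset point concrete: CF defines unpacking as raw * scale_factor + add_offset, while some products document the form scale * (raw - offset) instead (MODIS L1B radiance is a commonly cited case). A CF-aware tool that applies the CF rule to such attributes gets the wrong answer. The sketch below compares the two rules and shows how the second can be rewritten as CF attributes. The function names and the numbers are illustrative assumptions, not taken from any product.

```python
import numpy as np


def unpack_cf(raw, scale_factor, add_offset):
    """CF rule: physical = raw * scale_factor + add_offset."""
    return np.asarray(raw, dtype=np.float64) * scale_factor + add_offset


def unpack_scale_of_difference(raw, scale, offset):
    """Non-CF rule seen in some products: physical = scale * (raw - offset)."""
    return scale * (np.asarray(raw, dtype=np.float64) - offset)


def to_cf_attributes(scale, offset):
    """Rewrite scale * (raw - offset) as CF scale_factor and add_offset."""
    return {"scale_factor": scale, "add_offset": -scale * offset}


raw = np.array([100, 200, 300], dtype=np.uint16)
scale, offset = 0.5, 20.0  # illustrative numbers, not from any product

intended = unpack_scale_of_difference(raw, scale, offset)
misread = unpack_cf(raw, scale, offset)  # CF rule applied to non-CF attributes
cf_attrs = to_cf_attributes(scale, offset)
repaired = unpack_cf(raw, **cf_attrs)
print(intended, misread, repaired)
```

Rewriting the attributes, rather than the data, keeps the packed values untouched while letting CF tools unpack them correctly.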

Attribute Type Mismatch Example

Int16 data[180][360] // Variable

String valid_range “0,100” // Attribute (Wrong)

Byte _FillValue 255 // Attribute (Wrong)

Int16 data[180][360] // Variable

Int16 valid_range 0,100 // Attribute (Correct)

Int16 _FillValue 255 // Attribute (Correct)
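The correction above can be expressed as a small type coercion. The sketch below assumes the attributes have already been read into a Python dictionary; `coerce_attributes` is a hypothetical helper, and only `valid_range` and `_FillValue` are handled.

```python
import numpy as np


def coerce_attributes(var_dtype, attrs):
    """Return a copy of attrs with valid_range and _FillValue cast to var_dtype."""
    fixed = dict(attrs)
    valid_range = fixed.get("valid_range")
    if isinstance(valid_range, str):
        # A string such as "0,100" becomes the numeric pair [0, 100].
        valid_range = [float(part) for part in valid_range.split(",")]
    if valid_range is not None:
        fixed["valid_range"] = np.array(valid_range).astype(var_dtype)
    if "_FillValue" in fixed:
        # [()] turns the 0-d array back into a NumPy scalar.
        fixed["_FillValue"] = np.array(fixed["_FillValue"]).astype(var_dtype)[()]
    return fixed


# The "wrong" attributes from the slide: a string range and a byte fill value.
wrong = {"valid_range": "0,100", "_FillValue": np.uint8(255)}
right = coerce_attributes(np.int16, wrong)
print(right)
```

A cast like this is only safe when the attribute value fits in the variable's type; a real tool would also need to check for overflow.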

Challenge 4: Incorrect Information

Sometimes, metadata contains incorrect information. This is rare, and such information is usually corrected immediately by data producers.

Incorrect Information Example

An NCL user reported that the same code doesn’t work for an older MOP02 HDF-EOS5 file.

In the 2008/01/01 file, StructMetadata has the wrong value:

  nTime = 250841130416

In the 2008/12/31 file, StructMetadata has the correct value:

  nTime = 2

LaRC ASDC fixed this already!
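An error like this can be caught by comparing the dimension sizes declared in metadata with the sizes actually stored. The sketch below assumes both have already been read into dictionaries. `find_bad_dimensions` is a hypothetical helper, and the stored size of 2 is an assumption based on the corrected file.

```python
def find_bad_dimensions(declared, actual):
    """Return names whose declared size disagrees with the size actually stored."""
    return {
        name: (size, actual.get(name))
        for name, size in declared.items()
        if actual.get(name) != size
    }


# Declared value from the slide; the stored size of 2 is an assumption
# taken from the corrected file.
declared = {"nTime": 250841130416}
actual = {"nTime": 2}
print(find_bad_dimensions(declared, actual))
```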

Good News

The recent effort from The HDF Group overcomes many challenges:

• HDF4/HDF5 OPeNDAP Handler with EnableCF option

• H4CF Conversion Toolkit with NcML / NCO examples

• HDF-EOS5 Augmentation Tool

• HDF-EOS2 Dumper Tool with comprehensive examples for MATLAB/IDL/NCL

The above tools and their examples are available at HDFEOS.org.

Challenge 1: Unfamiliar Techniques

HDF OPeNDAP handlers & H4CF Conversion Toolkit

• provide full geo-location information as explicit datasets.

HDF-EOS5 Augmentation Tool

• provides ways to associate geo-location information with existing datasets or to supply new ones.

HDF-EOS2 Dumper Tool

• prints out geo-location information in ASCII because MATLAB/IDL/NCL can read ASCII text data.

Challenge 2: Information Outside HDF

HDF OPeNDAP handlers

• provide fill value / valid range information.

• apply the CF scale / offset rule.

• calculate latitude and longitude values for some NASA non-EOS products.

• are tested against ncml_handler so that data centers can add additional information using NcML.

H4CF Conversion Toolkit (h4tonccf)

• provides NcML and NCO examples to add or edit attributes for converted NetCDF files.

Challenge 3: The CF Conventions

HDF OPeNDAP handlers & H4CF Conversion Toolkit

• flatten group hierarchies.

• change variable & attribute types, names, and values.

• add named dimensions.

• add coordinate information.

Challenge 4: Incorrect Information

HDF OPeNDAP handlers & H4CF Conversion Toolkit

• correct errors for old products temporarily.

• catch errors for new products.

Better News

We see fewer and fewer challenges in newer HDF products thanks to open communication and standardization efforts among Earth Science communities through meetings, telecons, and mailing lists.

• HDF – DAACs Telecons

• ESDSWG – H5CF Conventions

• ESIP

• CF (satellite) conventions mailing lists

Future Challenges

• Data Discovery

• Subsetting and Aggregation

• Sharing Research Data

Data Discovery

Some users still don’t know how to search for data or where to download it.

A spatial search in Reverb doesn’t guarantee that the matched HDF data files contain valid values at the specific location that the user is looking for.

Browse images are helpful, but users don’t want to examine them one by one.

Reverb Browse Image for O3 at Seoul

The returned HDF file has no value at Seoul
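A check for this situation could be automated instead of inspecting browse images one at a time. The sketch below assumes the granule's latitude, longitude, and data arrays are already in memory and that the fill value is known. `has_valid_value_near`, the 0.5-degree tolerance, the fill value, and the synthetic grid are all illustrative assumptions.

```python
import numpy as np


def has_valid_value_near(lat, lon, data, fill_value, point, max_deg=0.5):
    """Return True if any non-fill pixel lies within max_deg of point.

    Uses a simple latitude/longitude box rather than true distance.
    """
    lat0, lon0 = point
    near = (np.abs(lat - lat0) <= max_deg) & (np.abs(lon - lon0) <= max_deg)
    valid = data != fill_value
    return bool(np.any(near & valid))


SEOUL = (37.57, 126.98)  # approximate coordinates

# Synthetic 1-degree grid covering 30-44N, 120-134E.
lat, lon = np.meshgrid(
    np.arange(30.0, 45.0), np.arange(120.0, 135.0), indexing="ij"
)
data = np.full(lat.shape, -9999.0)
data[0, 0] = 300.0  # the only valid pixel is at 30N, 120E, far from Seoul

print(has_valid_value_near(lat, lon, data, -9999.0, SEOUL))
```

The granule's bounding box contains Seoul, so a spatial search would match it, yet the check reports that no usable value exists there.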

Subsetting and Aggregation

Customized on-demand HDF product generation based on the user’s query is desired. For example,

“Give me all L2 Ozone data at Seoul from 2002 to 2013 and allow me to download it as a single HDF file.”

Most HDF data products are packaged as daily granules covering large regions. A search returns thousands of HDF files, and users cannot download them one by one.
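The request above amounts to subsetting each granule by location and concatenating the results over time. The sketch below shows that logic on in-memory arrays only. The granule dictionaries, the bounding box around Seoul, and all values are made up for illustration, and writing the result to a single HDF file is left out.

```python
import numpy as np


def subset_granule(granule, bbox):
    """Return (times, values) for valid pixels of one granule inside bbox."""
    south, north, west, east = bbox
    inside = (
        (granule["lat"] >= south) & (granule["lat"] <= north)
        & (granule["lon"] >= west) & (granule["lon"] <= east)
        & (granule["value"] != granule["fill_value"])
    )
    values = granule["value"][inside]
    # Tag every selected pixel with the granule's date.
    times = np.repeat(granule["date"], values.size)
    return times, values


def aggregate(granules, bbox):
    """Concatenate the subsets of many granules into one time series."""
    parts = [subset_granule(g, bbox) for g in granules]
    times = np.concatenate([t for t, _ in parts])
    values = np.concatenate([v for _, v in parts])
    return times, values


# Two made-up daily granules with two pixels each.
g1 = {"date": np.datetime64("2002-09-01"), "fill_value": -9999.0,
      "lat": np.array([37.5, 10.0]), "lon": np.array([127.0, 50.0]),
      "value": np.array([300.0, 280.0])}
g2 = {"date": np.datetime64("2002-09-02"), "fill_value": -9999.0,
      "lat": np.array([37.6, 37.4]), "lon": np.array([126.9, 127.1]),
      "value": np.array([-9999.0, 305.0])}

seoul_box = (37.0, 38.0, 126.5, 127.5)  # south, north, west, east
times, values = aggregate([g1, g2], seoul_box)
print(times, values)
```

Run on the server side, a step like this would let a user receive one small time series instead of thousands of full granules.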

Reverb Query Result for AIRS at Seoul

Showing 1 to 9 of 5,047 granules

Sharing Research Data

How can users easily compose and publish new research data from different NASA data product sources?

“I’d like to combine AIRS Ozone and OMI Ozone data at Seoul from 2002-2013 and share it with journal editors.”

Can this be shared as a single URL query to the NASA data cloud?

Thanks!

Questions / Comments?

eoshelp@hdfgroup.org

Acknowledgements

This work was supported by Subcontract number 114820 under Raytheon Contract number NNG10HP02C, funded by the National Aeronautics and Space Administration (NASA), and by cooperative agreement number NNX08AO77A from NASA. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Raytheon or the National Aeronautics and Space Administration.