Profile of HDF-EOS5 Files Abe Taaheri, Raytheon IIS HDF & HDF-EOS Workshop XII October 2008.


Profile of HDF-EOS5 Files

Abe Taaheri, Raytheon IIS

HDF & HDF-EOS Workshop XII October 2008

Page 2

General HDF-EOS5 File Structure

• HDF-EOS5 file:

  any valid HDF5 file that contains a family of global attributes called coremetadata.X

• Optional data objects:

  - a family of global attributes called archivemetadata.X
  - any number of Swath, Grid, Point, ZA, and Profile data structures
  - another family of global attributes called StructMetadata.X

Page 3

General HDF-EOS5 File Structure

• Global Attributes provide:

- Info on the structure of HDF-EOS5 file

- Info on the data granule that file contains

• Other optional user-added global attributes, such as “PGEVersion” and “OrbitNumber”, are written as HDF5 attributes into a group called “FILE_ATTRIBUTES”

Page 4

General HDF-EOS5 File Structure

• coremetadata.X
  Used to populate searchable database tables within the ECS archives. Data users use this information to locate particular HDF-EOS5 data granules.

• archivemetadata.X
  Not searchable. Contains whatever information the file creator considers useful to be in the file, but which will not be directly accessible by ECS databases.

• StructMetadata.X
  Describes the contents and structure of the HDF-EOS file, e.g. dimensions, compression methods, geolocation, projection information, etc. that are associated with the data itself.
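The ".X" suffix names a numbered family of objects (coremetadata.0, coremetadata.1, and so on). The Python sketch below shows one way to reassemble such a family, assuming the members have already been read into a dictionary keyed by object name and that the full text is the concatenation of the members in numeric order. The helper name and the sample strings are hypothetical; this is not the HDF-EOS5 library's own routine.

```python
import re

def join_metadata_family(objects, family):
    """Concatenate family.0, family.1, ... in numeric order of the suffix."""
    pattern = re.compile(r"^" + re.escape(family) + r"\.(\d+)$")
    members = []
    for name, text in objects.items():
        match = pattern.match(name)
        if match:
            members.append((int(match.group(1)), text))
    # Sort numerically so that ".10" follows ".9" rather than ".1".
    members.sort(key=lambda member: member[0])
    return "".join(text for _, text in members)

# Hypothetical contents, for illustration only.
objects = {
    "StructMetadata.1": "END_GROUP=GridStructure\nEND\n",
    "StructMetadata.0": "GROUP=GridStructure\n",
    "coremetadata.0": "GROUP=INVENTORYMETADATA\n",
}
print(join_metadata_family(objects, "StructMetadata"))
```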

Page 5

General HDF-EOS5 File Structure

• An HDF-EOS5 file

– can contain any number of Grid, Point, Swath, Zonal Average, and Profile data structures

– has no size limits. A file containing thousands of objects could cause program execution slow-downs

– can be hybrid, containing plain HDF5 objects for special purposes. In a hybrid file:
    - the HDF5 objects must be accessed by the HDF5 library and not by HDF-EOS5 extensions
    - more knowledge of file contents is required on the part of an applications developer or data user

Page 6

Swath Structure

• Data which is organized by time, or other track parameter.

• Spacing can be irregular.

• Structure

– Geolocation information stored explicitly in Geolocation Field (2-D array)

– Data stored in 2-D or 3-D arrays

– Time stored in 1-D or 2-D array

– Geolocation/science data connected by structural metadata

Page 7

Swath Structure

• For a typical satellite swath, an instrument takes a series of scans perpendicular to the ground track of the satellite as it moves along that ground track

[Figure labels: Instrument, Profiles, Along Track]

• Or a sensor measures a vertical profile, instead of scanning across the ground track

Page 8

Swath Structure

• Swath_X groups are created when swaths are created.

• The parent groups of the Data/Geolocation fields are created when the fields are defined.

• Swath attributes are set as Object Attributes.

• Attributes for the Data, Profile, or Geolocation Fields groups are set as Group Attributes.

• Dataset-related attributes set for each data field or geolocation field are called Local Attributes. They may contain attributes such as fillvalue, units, etc.

• Each Data Field object can have Attributes and/or Dimension Scales.

[Figure: Swath group hierarchy. Legend: HDF5 Group, HDF5 Dataset, HDF5 Attribute]

“SWATHS” group
  “Swath_1” ... “Swath_N”
    Geolocation Fields
      Longitude, Latitude, Time, Colatitude
    Data Fields
      DataField.1 ... DataField.n
    Profile Fields
      ProfileField.1 ... ProfileField.n

Attribute labels in the figure:
  Object Attribute: <SwathName>:<AttrName>
  Group Attribute:  <DataFields>:<AttrName>
  Local Attribute:  <FieldName>:<AttrName>

Page 9

Swath Structure

Field Name    Data Type            Format
Longitude     float32 or float64   DD*, range [-180.0, 180.0]
Latitude      float32 or float64   DD*, range [-90.0, 90.0]
Colatitude    float32 or float64   DD*, range [0.0, 180.0]
Time          float64              TAI93 [seconds until(-) / since(+) midnight, 1/1/93]

* DD = Decimal Degree

• Geolocation Fields

− Geolocation fields allow the Swath to be accurately tied to particular points on the Earth’s surface.
− At least a time field (“Time”) or a latitude/longitude field pair (“Latitude” and “Longitude”) must be defined. “Colatitude” may be substituted for “Latitude.”
− Other geofields such as “Altitude” can be defined and mapped onto a dataDim.
− Fields must be either one- or two-dimensional.
− The “Time” field is always in TAI format (International Atomic Time).
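As a rough illustration of the conventions in the table above, the following Python sketch converts a TAI93 value to a calendar timestamp, derives colatitude from latitude, and checks a value against the listed ranges. The function names are hypothetical, colatitude is assumed to follow the usual definition (90 degrees minus latitude), and leap seconds are deliberately ignored, so the timestamp is on the TAI scale rather than exact UTC.

```python
from datetime import datetime, timedelta

# TAI93 counts seconds from midnight, 1 January 1993.
TAI93_EPOCH = datetime(1993, 1, 1)

# Valid ranges in decimal degrees, as listed in the table.
RANGES = {
    "Longitude": (-180.0, 180.0),
    "Latitude": (-90.0, 90.0),
    "Colatitude": (0.0, 180.0),
}

def tai93_to_datetime(seconds):
    """Add (possibly negative) TAI93 seconds to the epoch.

    Leap seconds are not applied, so the result is not exact UTC.
    """
    return TAI93_EPOCH + timedelta(seconds=seconds)

def colatitude_from_latitude(latitude_deg):
    """Colatitude measured from the north pole (assumed: 90 - latitude)."""
    return 90.0 - latitude_deg

def in_range(field_name, value):
    """Return True if value lies inside the range listed for field_name."""
    low, high = RANGES[field_name]
    return low <= value <= high

print(tai93_to_datetime(86400.0))
print(colatitude_from_latitude(41.49), in_range("Longitude", -87.37))
```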

Page 10

Swath Structure

• Data Fields

− Fields may have up to 8 dimensions.

− For multi-dimensional fields:

The dimension representing the “along track” must precede the dimension representing the scan or profile (in C-order).

( e.g. “Bands, DataTrack, DataXtrack” )

Page 11

Swath Structure

− Compression is selectable at the field level.
  ▪ All HDF5-supported compression methods are available through the HDF-EOS5 library.
  ▪ The compression method is stored within the file.
  ▪ Subsequent use of the library will un-compress the file.
  ▪ As in HDF5, the data needs to be chunked before the compression is applied.

− Field names:
  * may be up to 64 characters in length.
  * may use any character except ",", ";", and "/".
  * are case sensitive.
  * must be unique within a particular Swath structure.

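The field-name rules above can be expressed as a small check. This is a Python sketch only; the function name is hypothetical and the HDF-EOS5 library performs its own validation.

```python
MAX_FIELD_NAME_LENGTH = 64
FORBIDDEN_CHARACTERS = {",", ";", "/"}

def field_name_problems(name, existing_names):
    """Return a list of rule violations for a proposed field name.

    existing_names holds the names already defined in the same structure.
    The comparison is case sensitive, so "Temp" and "temp" are distinct.
    """
    problems = []
    if len(name) > MAX_FIELD_NAME_LENGTH:
        problems.append("longer than 64 characters")
    bad = sorted(FORBIDDEN_CHARACTERS & set(name))
    if bad:
        problems.append("contains forbidden characters: " + " ".join(bad))
    if name in existing_names:
        problems.append("not unique within the structure")
    return problems

existing = {"Temperature", "Pressure"}
print(field_name_problems("temperature", existing))  # differs only in case
print(field_name_problems("Temp/Surface", existing))
```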
Page 12

Compression Codes

Compression Code          Value   Explanation
HDFE_COMP_NONE            0       No Compression
HDFE_COMP_RLE             1       Run Length Encoding Compression (not supported)
HDFE_COMP_NBIT            2       NBIT Compression
HDFE_COMP_SKPHUFF         3       Skipping Huffman (not supported)
HDFE_COMP_DEFLATE         4       gzip Compression
HDFE_COMP_SZIP_CHIP       5       szip Compression, Compression exactly as in hardware
HDFE_COMP_SZIP_K13        6       szip Compression, allowing k split = 13 Compression
HDFE_COMP_SZIP_EC         7       szip Compression, entropy coding method
HDFE_COMP_SZIP_NN         8       szip Compression, nearest neighbor coding method
HDFE_COMP_SZIP_K13orEC    9       szip Compression, allowing k split = 13 Compression, or entropy coding method

For compression, the data storage must be CHUNKED first

Page 13

Compression Codes

Compression Code               Value   Explanation
HDFE_COMP_SZIP_K13orNN         10      szip Compression, allowing k split = 13 Compression, or nearest neighbor coding method
HDFE_COMP_SHUF_DEFLATE         11      shuffling + deflate (gzip) Compression
HDFE_COMP_SHUF_SZIP_CHIP       12      shuffling + Compression exactly as in hardware
HDFE_COMP_SHUF_SZIP_K13        13      shuffling + allowing k split = 13 Compression
HDFE_COMP_SHUF_SZIP_EC         14      shuffling + entropy coding method
HDFE_COMP_SHUF_SZIP_NN         15      shuffling + nearest neighbor coding method
HDFE_COMP_SHUF_SZIP_K13orEC    16      shuffling + allowing k split = 13 Compression, or entropy coding method
HDFE_COMP_SHUF_SZIP_K13orNN    17      shuffling + allowing k split = 13 Compression, or nearest neighbor coding method

For compression, the data storage must be CHUNKED first
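For quick lookups, the two tables above can be mirrored as an enumeration. The Python sketch below copies the names and values as they appear on the slides; it is not the library header, and the helper `describe` is hypothetical.

```python
from enum import IntEnum

class CompressionCode(IntEnum):
    """Compression codes as listed in the tables (names taken from the slides)."""
    HDFE_COMP_NONE = 0
    HDFE_COMP_RLE = 1
    HDFE_COMP_NBIT = 2
    HDFE_COMP_SKPHUFF = 3
    HDFE_COMP_DEFLATE = 4
    HDFE_COMP_SZIP_CHIP = 5
    HDFE_COMP_SZIP_K13 = 6
    HDFE_COMP_SZIP_EC = 7
    HDFE_COMP_SZIP_NN = 8
    HDFE_COMP_SZIP_K13orEC = 9
    HDFE_COMP_SZIP_K13orNN = 10
    HDFE_COMP_SHUF_DEFLATE = 11
    HDFE_COMP_SHUF_SZIP_CHIP = 12
    HDFE_COMP_SHUF_SZIP_K13 = 13
    HDFE_COMP_SHUF_SZIP_EC = 14
    HDFE_COMP_SHUF_SZIP_NN = 15
    HDFE_COMP_SHUF_SZIP_K13orEC = 16
    HDFE_COMP_SHUF_SZIP_K13orNN = 17

# Marked "not supported" in the table.
UNSUPPORTED = {CompressionCode.HDFE_COMP_RLE, CompressionCode.HDFE_COMP_SKPHUFF}

def describe(value):
    """Summarise a numeric compression code."""
    code = CompressionCode(value)
    return {
        "name": code.name,
        "supported": code not in UNSUPPORTED,
        "shuffle": "_SHUF_" in code.name,
        # Any real compression requires chunked storage first.
        "needs_chunking": code is not CompressionCode.HDFE_COMP_NONE,
    }

print(describe(11))
```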

Page 14

Swath Structure

[Figure: A “Normal” Dimension Map. Data dimension indices 0 to 19, geolocation dimension indices 0 to 9. Mapping Offset: 1, Increment: 2]

[Figure: A “Backwards” Dimension Map. Data dimension indices 0 to 9, geolocation dimension indices 0 to 19. Mapping Offset: -1, Increment: -2]

• Dimension maps:

- Glue that holds the SWATH together.

- Define the relationship between data fields and geolocation fields dimensions

- Can be normal or indexed mapping
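The offset and increment arithmetic of a non-indexed dimension map can be sketched in a few lines of Python. The function names are hypothetical. The normal case follows the figure (offset 1, increment 2). The treatment of negative values as a map from the data dimension onto a larger geolocation dimension is an assumption based on the “Backwards” figure, not a statement of the library's implementation.

```python
def normal_map(geo_index, offset, increment):
    """Data-dimension index described by a geolocation index (increment > 0)."""
    if increment <= 0:
        raise ValueError("a normal map needs a positive increment")
    return offset + increment * geo_index

def backwards_map(data_index, offset, increment):
    """Geolocation-dimension index for a data index (increment < 0).

    Assumption: the negative offset and increment are applied by magnitude.
    """
    if increment >= 0:
        raise ValueError("a backwards map needs a negative increment")
    return -offset + (-increment) * data_index

# Values from the figures: 10 geolocation points onto 20 data points, and back.
print([normal_map(i, 1, 2) for i in range(10)])
print([backwards_map(i, -1, -2) for i in range(10)])
```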

Page 15

Grid Structure

• Usage - Data which is organized by regular geographic spacing, specified by projection parameters.

• Structure

– Any number of 2-D to 8-D data arrays per structure

– Geolocation information contained in projection formula, coupled by structural metadata.

– Any number of Grid structures per file allowed.

Page 16

Grid Structure

• A grid contains:

- grid corner locations
- a set of projection equations (or references to them) along with their relevant parameters.

• The equations and parameters are used to compute the lon/lat for any point in the grid.

• Important features of a Grid data set:

- the data fields
- the dimensions
- the projection

[Figure: A Data Field in a Mercator-Projected Grid]

[Figure: A Data Field in an Interrupted Goode’s Homolosine-Projected Grid]
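The first half of that computation, going from a row and column to projected X-Y coordinates using only the corner locations and grid dimensions, can be sketched as follows. This Python example assumes pixel-center registration and evenly spaced cells. The function name and the sample corner values are hypothetical. Converting the resulting X-Y pair to lon/lat requires the inverse projection equations, which are not shown here.

```python
def cell_center(row, col, upper_left, lower_right, xdim, ydim):
    """Projected (x, y) of the center of cell (row, col).

    upper_left and lower_right are (x, y) corner points in projection units.
    Assumes evenly spaced cells and pixel-center registration.
    """
    ulx, uly = upper_left
    lrx, lry = lower_right
    dx = (lrx - ulx) / xdim
    dy = (lry - uly) / ydim  # negative when y decreases from top to bottom
    return (ulx + (col + 0.5) * dx, uly + (row + 0.5) * dy)

# Hypothetical 4-row by 5-column grid, for illustration only.
upper_left = (0.0, 100.0)
lower_right = (50.0, 0.0)
print(cell_center(0, 0, upper_left, lower_right, xdim=5, ydim=4))
print(cell_center(3, 4, upper_left, lower_right, xdim=5, ydim=4))
```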

Page 17

Grid Structure

Data Field characteristics:

− Fields may have up to 8 dimensions.
− Dimension order in field definitions:
  - C: “Band, YDim, XDim”
  - Fortran: “XDim, YDim, Band”
− Compression is selectable at the field level within a Grid. Subsequent use of the library will un-compress the file. Data needs to be tiled before the compression is applied.
− Field names must be unique within a particular Grid structure and are case sensitive. They may be up to 64 characters in length.
− Any character can be used with the exception of ",", ";", and "/".

Page 18

Grid Structure

• Fields are two- to eight-dimensional. Many fields will need no more than three dimensions: the predefined dimensions “XDim” and “YDim” and a third dimension for depth, height, or band.

Dimensions:

• Two predefined dimensions for Data Fields: “XDim” and “YDim”. They are:
  - defined when the grid is created
  - stored in the structure metadata
  - used to relate data fields to each other and to the geolocation information

Page 19

Grid Structure

• Projection:

− Is the heart of the Grid structure.
− Provides a convenient way to encode geolocation information as a set of mathematical equations, capable of transforming Earth coordinates (lat/long) to X-Y coordinates on a sheet of paper.
− The General Coordinate Transformation Package (GCTP) library contains all projection-related conversions and calculations.
− Supported projections:

  Geographic
  Mercator
  Transverse Mercator
  Universal Transverse Mercator
  Cylindrical Equal Area
  Hotine Oblique Mercator
  Space Oblique Mercator
  Sinusoidal*
  Integerized Sinusoidal
  Interrupted Goode’s Homolosine
  Polar Stereographic
  Lambert Azimuthal Equal Area
  Polyconic
  Albers Conical Equal Area
  Lambert Conformal Conic

* Sinusoidal is pseudocylindrical

Page 20

Point Structure

• Data is specified temporally and/or spatially, but with no particular organization

• Structure

– Tables used to store science data at a particular Lat/Long/Height

– Up to eight levels of data allowed. Structural metadata specifies relationship between levels.

Station       Lat     Lon
Chicago       41.49   -87.37
Los Angeles   34.03   -118.14
Washington    38.50   -77.00
Miami         25.45   -80.11

Time   Temp(C)
0800   -3
0900   -2
1000   -1
0800   20
0900   21
1000   22
1100   24
1000   6
1100   8
1200   9
1300   11
1400   12
0600   15
0700   16

Page 21

Point Structure

• Made up of a series of data records taken at [possibly] irregular time intervals and at scattered geographic locations

• Loosely organized form of geolocated data supported by HDF-EOS

• Levels are linked by a common field name called LinkField

• Usually shared info is stored in the Parent level, while data values are stored in the Child level

• The values for the LinkField in the Parent level must be unique

Lat     Lon       Temp(C)   Dewpt(C)
61.12   -149.48   15.00     5.00
45.31   -122.41   17.00     5.00
38.50   -77.00    24.00     7.00
38.39   -90.15    27.00     11.00
30.00   -90.05    22.00     7.00
37.45   -122.26   25.00     10.00
18.00   -76.45    27.00     4.00
43.40   -79.23    30.00     14.00
34.03   -118.14   25.00     4.00
32.45   -96.48    32.00     8.00
33.30   -112.00   30.00     10.00
42.15   -71.07    28.00     7.00
35.05   -106.40   30.00     9.00
34.12   -77.56    28.00     9.00
46.32   -87.25    30.00     8.00
47.36   -122.20   32.00     15.00
39.44   -104.59   31.00     16.00
21.25   -78.00    28.00     7.00
44.58   -93.15    32.00     13.00
41.49   -87.37    28.00     9.00
25.45   -80.11    19.00     3.00
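The parent/child relationship can be pictured with plain dictionaries. In this Python sketch the function name, the choice of "Station" as the LinkField, and the child records are hypothetical and purely illustrative; only the station coordinates come from the earlier table. It also enforces the uniqueness rule for the Parent level.

```python
def link_levels(parent_records, child_records, link_field):
    """Group child records under the parent record sharing their LinkField value."""
    linked = {}
    for record in parent_records:
        key = record[link_field]
        if key in linked:
            # LinkField values must be unique in the Parent level.
            raise ValueError(f"duplicate {link_field} value in parent level: {key!r}")
        linked[key] = {"parent": record, "children": []}
    for record in child_records:
        linked[record[link_field]]["children"].append(record)
    return linked

# Parent level: shared information (coordinates from the station table).
stations = [
    {"Station": "Chicago", "Lat": 41.49, "Lon": -87.37},
    {"Station": "Miami", "Lat": 25.45, "Lon": -80.11},
]
# Child level: hypothetical observations, for illustration only.
observations = [
    {"Station": "Chicago", "Time": "0800", "Temp": -3},
    {"Station": "Chicago", "Time": "0900", "Temp": -2},
    {"Station": "Miami", "Time": "0600", "Temp": 15},
]

linked = link_levels(stations, observations, "Station")
for name, entry in linked.items():
    print(name, len(entry["children"]))
```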

Page 22

Point Structure

• Point structure groups are created when the user creates “Point_1”, ...

• Data and Linkage groups are created automatically when the level is defined.

• The order in which the levels are defined determines the (0-based) level index.

• FWDPOINTER Linkage will not be set (actually, the first one is set to (-1,-1)) if the records in the Child level are not monotonic in LinkField.

• A level can contain any number of fields and records.

[Figure: Point group hierarchy. Legend: HDF5 Group]

“POINTS” group
  “Point_1” ... “Point_n”
    Data
      Level 1 ... Level n (Level Data)
    Linkage
      FWDPOINTER
      BCKPOINTER

Attribute labels in the figure:
  Object Attribute: <SwathName>:<AttrName>
  Group Attribute:  <SwathName>:<AttrName>
  Local Attribute:  <SwathName>:<AttrName>

Page 23

Zonal Average (ZA) Structure

• Generalized array structure with no geolocation linkage (basically a swath-like structure without geolocation).

• The interface is designed to support data that is not associated with specific geolocation information.

• Data can be organized by time or track parameter

• Data spacing can be irregular

• Structure

– Data stored in multidimensional arrays

– Time stored in 1-D or 2-D array

[Figure: ZA group hierarchy. Legend: HDF5 Group]

“ZAS” group
  “Za_1” ... “Za_n”
    Data Fields
      DataField.n

Attribute labels in the figure:
  Object Attribute: <SwathName>:<AttrName>
  Group Attribute:  <DataFields>:<AttrName>
  Local Attribute:  <FieldName>:<AttrName>

Page 24

“h5dump” output of a simple HDF-EOS5 file

HDF5 "Grid.he5" {
GROUP "/" {
   GROUP "HDFEOS" {
      GROUP "ADDITIONAL" {
         GROUP "FILE_ATTRIBUTES" {
         }
      }
      GROUP "GRIDS" {
         GROUP "TMGrid" {
            GROUP "Data Fields" {
               DATASET "Voltage" {
                  DATATYPE H5T_IEEE_F32BE
                  DATASPACE SIMPLE { ( 5, 7 ) / ( 5, 7 ) }
                  DATA {
                  (0,0): -1.11111,-1.11111,-1.11111,-1.11111,-1.11111,
                  (0,5): -1.11111,-1.11111,
                  ………………………………..
                  (4,0): -1.11111,-1.11111,-1.11111,-1.11111,-1.11111,
                  (4,5): -1.11111,-1.11111
                  }

Page 25

“h5dump” output of a simple HDF-EOS5 file (cont.)

                  ATTRIBUTE "_FillValue" {
                     DATATYPE H5T_IEEE_F32BE
                     DATASPACE SIMPLE { ( 1 ) / ( 1 ) }
                     DATA {
                     (0): -1.11111
                     }
                  }
               }
            }
         }
      }
   }
   GROUP "HDFEOS INFORMATION" {
      ATTRIBUTE "HDFEOSVersion" {
         DATATYPE H5T_STRING {
            STRSIZE 32;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }

Page 26

“h5dump” output of a simple HDF-EOS5 file (cont.)

         DATASPACE SCALAR
         DATA {
         (0): "HDFEOS_5.1.11"
         }
      }
      DATASET "StructMetadata.0" {
         DATATYPE H5T_STRING {
            STRSIZE 32000;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         }
         DATASPACE SCALAR
         DATA {
         (0): "GROUP=SwathStructure
               END_GROUP=SwathStructure
               GROUP=GridStructure
                  GROUP=GRID_1
                     GridName="TMGrid"
                     XDim=5
                     YDim=7

Page 27

“h5dump” output of a simple HDF-EOS5 file (cont.)

                     UpperLeftPointMtrs=(4855670.775390,9458558.924830)
                     LowerRightMtrs=(5201746.439830,-10466077.249420)
                     Projection=HE5_GCTP_TM
                     ProjParams=(0,0,0.999600,0,-75000000,0,5000000,0,0,0,0,0,0)
                     SphereCode=0
                     GROUP=Dimension
                        OBJECT=Dimension_1
                           DimensionName="Time"
                           Size=10
                        END_OBJECT=Dimension_1
                        OBJECT=Dimension_2
                           DimensionName="Unlim"
                           Size=-1
                        END_OBJECT=Dimension_2
                     END_GROUP=Dimension

Page 28

“h5dump” output of a simple HDF-EOS5 file (cont.)

                     GROUP=DataField
                        OBJECT=DataField_1
                           DataFieldName="Voltage"
                           DataType=H5T_NATIVE_FLOAT
                           DimList=("XDim","YDim")
                           MaxdimList=("XDim","YDim")
                        END_OBJECT=DataField_1
                     END_GROUP=DataField
                     GROUP=MergedFields
                     END_GROUP=MergedFields
                  END_GROUP=GRID_1
               END_GROUP=GridStructure
               GROUP=PointStructure
               END_GROUP=PointStructure
               GROUP=ZaStructure
               END_GROUP=ZaStructure
               END
               "
         }
      }
   }
}
}
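The StructMetadata text in the dump is a nested list of GROUP/OBJECT blocks with KEY=VALUE entries. The Python sketch below turns such text into nested dictionaries, assuming one statement per line. The function name is hypothetical, the sample is an abridged excerpt of the dump, and values are kept as strings. It is not the parser used by the HDF-EOS5 library.

```python
def parse_struct_metadata(text):
    """Parse GROUP/OBJECT blocks and KEY=VALUE lines into nested dicts."""
    root = {}
    stack = [root]
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line == "END":
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in ("GROUP", "OBJECT"):
            child = {}
            stack[-1][value] = child
            stack.append(child)
        elif key in ("END_GROUP", "END_OBJECT"):
            stack.pop()
        else:
            # Drop surrounding quotes from simple string values.
            stack[-1][key] = value.strip('"')
    return root

# Abridged excerpt of the StructMetadata.0 text, one statement per line.
sample = """\
GROUP=GridStructure
GROUP=GRID_1
GridName="TMGrid"
XDim=5
YDim=7
Projection=HE5_GCTP_TM
GROUP=Dimension
OBJECT=Dimension_1
DimensionName="Time"
Size=10
END_OBJECT=Dimension_1
END_GROUP=Dimension
END_GROUP=GRID_1
END_GROUP=GridStructure
END
"""

tree = parse_struct_metadata(sample)
grid = tree["GridStructure"]["GRID_1"]
print(grid["GridName"], grid["XDim"], grid["YDim"], grid["Projection"])
```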