NP-EMD.2006.580.0001
Profile of National Polar-Orbiting Operational Satellite System
(NPOESS) HDF5 Files
Chuck Nellis
NPOESS Program
Aurora, Colorado
Presentation Compiled by:
Kim Tomashosky, Ken Stone, Pat Purcell, Ron Andrews
NPOESS Program
Aurora, Colorado
Introduction
About NPOESS
• The National Polar-orbiting Operational Environmental Satellite System* (NPOESS) is a satellite system used to monitor global environmental conditions, and to collect and disseminate data related to:
  – Weather
  – Atmosphere
  – Oceans
  – Land
  – Near-space environment
• NPOESS will converge existing polar-orbiting satellite systems under a single national program
• Polar-orbiting satellites observe Earth from space
  – They collect and disseminate data on Earth's weather, atmosphere, oceans, land, and near-space environment
  – The polar orbiters are able to monitor the entire planet and provide data for long-range weather and climate forecasts
*http://www.ipo.noaa.gov/
About NPOESS, Continued
• Increases the timeliness and accuracy of severe weather event forecasts
• Will collect over 50 environmental measurements that are crucial to timely, accurate weather forecasts by military and civilian organizations. It will enable:
  – Increased accuracy in severe storm warnings and forecasting
  – Improved drought analysis and flood warnings
• Managed by the tri-agency Integrated Program Office* (IPO) utilizing personnel from the Department of Commerce, Department of Defense, and NASA
*http://www.ipo.noaa.gov/
NPOESS Data Products
• NPOESS Data Products are distributed in HDF5 format
  – Archived and made available to the community via the Comprehensive Large Array-data Stewardship System* (CLASS), an electronic library of NOAA environmental data
  – There is no “HDF-NPOESS” library; NPOESS Data Products have been designed using the native HDF5 library
• NPOESS Data Products
  – Raw Data Records (RDR)
  – Sensor Data Records (SDR) / Temperature Data Records (TDR)
  – Intermediate Products (IP)
  – Application Related Products (ARP)
  – Environmental Data Records (EDR)
*http://www.class.noaa.gov/
Data Organization
• Data Product Granules
  – A segment of data, with the size optimally determined to achieve maximum efficiency for an algorithm class
  – It is associated with an integer number of sensor scans, and its definition varies for sensors and data products
  – Gaps in granules are filled using a pre-defined ‘missing data’ fill value
  – Represented as a set of region reference pointers to sections of the respective dataset arrays
• Data Product Aggregations
  – A grouping of the same kind of granules packaged in HDF5 covering a temporal range
  – May contain as few as one granule and as many as an orbit of granules
  – Represented as a set of object reference pointers to the various groupings of data which make up a particular data product (one for each homogeneous dataset included in the granule)
NPOESS Documentation
• Documentation for the NPOESS Data Products
  – NPOESS Common Data Format Control Book – External (CDFCB-X)
    • Volume I – Overview
    • Volume II – RDR Formats
    • Volume III – SDR/TDR Formats
    • Volume IV – EDR/IP/ARP Formats
    • Volume V – Metadata
    • Volume VI – Ancillary Data, Auxiliary Data, Messages, and Reports
    • Volume VII – Application Packets
NPOESS HDF5 General Overview
HDF5 Conceptual Diagram
[Diagram labels: XML User Block; Root Group; Data; Product Group; Aggregation (Reference Object); Granules (Reference Regions)]
HDF5 XML User Block
• The XML User Block for NPOESS Data Products provides a ‘quick-look’ into the metadata of the associated HDF5 file
  – The size of the HDF5 XML User Block will be a multiple of 512 bytes
• The XML User Blocks are defined in the following volumes of the CDFCB-X:
  – Volume V – Metadata
    • Contains the XML User Block formats for:
      – Raw Data Records (RDR)
      – Sensor Data Records (SDR) / Temperature Data Records (TDR)
      – Intermediate Products (IP)
      – Application Related Products (ARP)
      – Environmental Data Records (EDR)
  – Volume VI – Ancillary, Auxiliary, Reports, and Messages
    • Contains the XML User Block formats for the Ancillary and Auxiliary data files that are delivered in HDF5
• Example elements:
  – Mission, Platform, and Instrument Names
  – Number_of_Data_Products
  – CollectionShortName(s)
  – Aggregation Information
  – Timestamps
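Because the user block sits ahead of the HDF5 content, it can be pulled out with ordinary file I/O. The sketch below is one possible approach, not the program's own tool: it looks for the standard 8-byte HDF5 signature on 512-byte boundaries and returns whatever precedes it. The XML shown, the `SampleUserBlock` element name, and the assumption that unused space is NUL-padded are illustrative only; the real layout is defined in the CDFCB-X.

```python
import io

# 8-byte signature that opens every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
BLOCK = 512  # the slide states the user block size is a multiple of 512 bytes


def read_user_block(stream):
    """Return the bytes stored ahead of the HDF5 signature.

    Checks offsets 0, 512, 1024, ... for the signature. Trailing NUL
    bytes are stripped on the assumption that they are padding.
    """
    offset = 0
    while True:
        stream.seek(offset)
        candidate = stream.read(len(HDF5_SIGNATURE))
        if len(candidate) < len(HDF5_SIGNATURE):
            raise ValueError("no HDF5 signature found on a 512-byte boundary")
        if candidate == HDF5_SIGNATURE:
            break
        offset += BLOCK
    stream.seek(0)
    return stream.read(offset).rstrip(b"\x00")


# Stand-in for a real product file: hypothetical XML padded to 1024 bytes,
# followed by the signature and some dummy payload.
xml = (b"<SampleUserBlock><Number_of_Data_Products>1"
       b"</Number_of_Data_Products></SampleUserBlock>")
fake_file = io.BytesIO(xml.ljust(2 * BLOCK, b"\x00") + HDF5_SIGNATURE + b"\x00" * 64)

recovered = read_user_block(fake_file)
print(recovered.decode("ascii"))
```

With a real file, `open(path, "rb")` would take the place of the in-memory stream.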
General HDF5 File Structure
[Diagram: general file structure. Multiplicities from the original figure are not reproduced.]
/
  AllData
    <CollectionShortName>_All
      Dataset_Array
  Data_Products
    <NPOESS Data Product CollectionShortName>
      <NPOESS Data Product CollectionShortName>_Gran_<n> (+Reference Regions)
      <NPOESS Data Product CollectionShortName>_Agg (+Reference Objects)
• / is the File Root Group
• The *_All Group contains all of the datasets in the file; it does not contain any metadata attributes
• One or more Data Product Groups may be present under the Data Products Group
• Data Product Groups contain individual granules pertaining to specific aggregations
• The datasets referenced by the Reference Objects are aggregations of granules
• The datasets referenced by the Reference Regions are specific granules
• Both the Granule and the Aggregation Datasets are datasets consisting of an array of References
NPOESS HDF5 Metadata Locations
• The NPOESS HDF5 Metadata is organized hierarchically, from the top down, in order to reduce duplication of information and to take advantage of the hierarchical nature of HDF5
  – Root Group
    • Data Products Group
      – Data Product (indicated by the specific product’s identifier)
        » Product Aggregation Dataset
        » Product Granule Dataset
HDF5 Conceptual Diagram - Data
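[Diagram: a dataset array drawn along the Time axis, 3200 elements Cross Track. Granule 0 and Granule 1 each span 256 rows In Track and meet at a Granule Boundary. A Reference Region points to each granule; a Reference Object points to the array as a whole.]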
NPOESS Quality Flags Overview
• The concept is to provide consistently stored, high-density quality information about the delivered data, simplifying usability while maintaining storage efficiency
• Each quality flag occupies one or more consecutive bits within a byte
• Quality flag arrays follow the structure of the data product
  – The size of the arrays is equal to or less than the size of the data to which the quality information applies (dimensions correspond to the data product arrays)
• Quality flags are stored in the HDF5 files as n two- or three-dimensional, 1-byte arrays
  – The number of arrays is dependent on the quality flag definitions, specific to each data product
  – Each byte may contain multiple bit-level flags
  – Quality flags will be ordered such that each flag is entirely contained within a single byte, occasionally resulting in a byte with reserved or meaningless bits
  – Byte alignment is the same for every quality flag array
    • First bit (left-most) is the LSB
Detailed NPOESS UML Models
RDR UML Model
[UML diagram; multiplicities from the original figure are not reproduced.]
«HDF5 Group» / (+Metadata[1..*])
  «HDF5 Group» AllData
    «HDF5 Group» <CollectionShortName>_All
      «HDF5 Dataset» Dataset_Array
  «HDF5 Group» Data_Products
    «HDF5 Group» <CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <CollectionShortName>_Gran_n (+Metadata[1..*], +Reference Regions)
      «HDF5 Dataset» <CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
    «HDF5 Group» <Spacecraft Diary CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <Spacecraft Diary CollectionShortName>_Gran_n (+Metadata[1..*], +Reference Regions)
      «HDF5 Dataset» <Spacecraft Diary CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
Common RDR Layout: Generic RDR
[Diagram label: Variable Storage Area Offsets]
• Static Header: contains the number of observations and offsets to the variable-length records
• APID List: contains the number of APs occurring for each specific APID per observation
• Packet Tracker: lists, per APID, the size and offset into the Application Packets Storage Area for each AP of the same type
• Application Packets Storage Area: stores the received APs in order of arrival
SDR/TDR UML Model
[UML diagram; multiplicities from the original figure are not reproduced.]
«HDF5 Group» / (+Metadata[1..*])
  «HDF5 Group» AllData
    «HDF5 Group» <CollectionShortName>_All
      «HDF5 Dataset» Dataset_Array
  «HDF5 Group» Data_Products
    «HDF5 Group» <SDR/TDR CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <SDR/TDR CollectionShortName>_Gran_n (+Metadata[1..*], +Reference Regions)
      «HDF5 Dataset» <SDR/TDR CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
    «HDF5 Group» <GEO CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <GEO CollectionShortName>_Gran_n (+Metadata[1..*], +Reference Regions)
      «HDF5 Dataset» <GEO CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
Note: N_GEO_Ref and inclusion of the GEO Group are dependent on the Packaging Option configured at the IDP. These elements are mutually exclusive.
EDR UML Model
[UML diagram; multiplicities from the original figure are not reproduced.]
«HDF5 Group» / (+Metadata[1..*])
  «HDF5 Group» AllData
    «HDF5 Group» <CollectionShortName>_All
      «HDF5 Dataset» Dataset_Array
  «HDF5 Group» Data_Products
    «HDF5 Group» <EDR/IP/ARP CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <EDR/IP/ARP CollectionShortName>_Gran_<n> (+Metadata[1..*], +Reference Regions)
      «HDF5 Dataset» <EDR/IP/ARP CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
    «HDF5 Group» <GEO CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <GEO CollectionShortName>_Gran_<n> (+Metadata[1..*], +Reference Regions)
      «HDF5 Dataset» <GEO CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
Note: N_GEO_Ref and inclusion of the GEO Group are dependent on the Packaging Option configured at the IDP. These elements are mutually exclusive.
Geolocation UML Model
[UML diagram; multiplicities from the original figure are not reproduced.]
«HDF5 Group» / (+Metadata[1..*])
  «HDF5 Group» AllData
    «HDF5 Group» <CollectionShortName>_All
      «HDF5 Dataset» Dataset_Array
  «HDF5 Group» Data_Products
    «HDF5 Group» <GEO CollectionShortName> (+Metadata[1..*])
      «HDF5 Dataset» <GEO CollectionShortName>_Agg (+Metadata[1..*], +Reference Objects)
      «HDF5 Dataset» <GEO CollectionShortName>_Gran_<n> (+Metadata[1..*], +Reference Regions)
Ancillary/Auxiliary UML Models
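[UML diagram; multiplicities and exact nesting are not recoverable from the original figure. Elements shown:]
«HDF5 Group» / (+Metadata[1..*])
«HDF5 Group» AllData
«HDF5 Group» <Collection Short Name> (+Metadata[1..*])
«HDF5 Group» <CollectionShortName>_All
Dataset_Array
+Reference Objects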
NPOESS Sample Data: Reading the NPOESS HDF5 File with the HDF API
VIIRS Ice Surface Temperature (IST) Environmental Data Record (EDR) Example UML Model
[UML diagram; multiplicities from the original figure are not reproduced.]
«HDF5 Group» / (+File Metadata[1..*])
  «HDF5 Group» DataProducts
    «HDF5 Group» VIIRS-IST-EDR (+General Product Metadata[1..*])
      «HDF5 Dataset» VIIRS-IST-EDR_Gran_<n> (+Granule Metadata[1..*])
      «HDF5 Dataset» VIIRS-IST-EDR_Agg (+Aggregate Metadata[1..*])
«H5T_Reference Array» VIIRS-IST-EDR (drawn twice in the original figure, with identical contents)
  +IST_Array : H5T_NATIVE_INT
  +QF1_VIIRSISTEDR : H5T_NATIVE_UCHAR
  +QF2_VIIRSISTEDR : H5T_NATIVE_UCHAR
  +QF3_VIIRSISTEDR : H5T_NATIVE_UCHAR
  +ISTFactors : H5T_NATIVE_FLOAT
These data arrays are different for each data product.
The NPOESS Granule - Product Profile: Ice Surface Temperature
• The Product Profile describes the NPOESS granule.
• For Ice Surface Temperature, the fields in the granule are:
  – IST_Array (Shown below)
  – QF1_VIIRSISTEDR (Shown below)
  – QF2_VIIRSISTEDR
  – QF3_VIIRSISTEDR
  – ISTFactors (Scale & Offset – Shown below)

Field
  Name: IST_Array
  Data Size: 2 bytes
  Field Offset: 0
Dimensions
  Name        Attribute Name  Granule Boundary  Dynamic  Min Array Size  Max Array Size
  inTrack     N_Rows          Yes               No       256             256
  crossTrack  N_Columns       No                No       3200            3200
Datum
  Description: Ice Surface Temperature
  Datum Offset: 0
  Unscaled Valid Range Min: 213
  Unscaled Valid Range Max: 275
  Measurement Units: K
  Scaled: Yes
  Scale Factor Name: ISTFactors
  Data Type: 16-bit unsigned integer
Fill Values
  Name                     Value
  NA_UINT16_FILL           65535
  MISS_UINT16_FILL         65534
  ONBOARD_PT_UINT16_FILL   65533
  ONGROUND_PT_UINT16_FILL  65532
  ERR_UINT16_FILL          65531
  ELINT_UINT16_FILL        65530
  VDNE_UINT16_FILL         65529
  SOUB_UINT16_FILL         65528
Legend Entries: none listed
The NPOESS Granule - Product Profile: IST Quality Flag Byte 1
Field
  Name: QF1_VIIRSISTEDR
  Data Size: 1 byte
  Field Offset: 0
Dimensions
  Name        Attribute Name  Granule Boundary  Dynamic  Min Array Size  Max Array Size
  inTrack     N_Rows          Yes               No       256             256
  crossTrack  N_Cols          No                No       3200            3200
Datum (Measurement Units: unitless for every entry; no Fill Values listed)
  IST Quality: Datum Offset 0; Scaled: No; Data Type: 2 bits
    Legend: High = 0; Medium = 1; Low = 2; No Retrieval = 3
  Algorithm: Datum Offset 2; Scaled: No; Data Type: 1 bit
    Legend: 2-Band Split Window Baseline = 0; Single-band (12 micrometer) Fallback = 1
  Day/Night: Datum Offset 3; Scaled: No; Data Type: 1 bit
    Legend: Night (85 Deg. < Solar Zenith Angle) = 0; Day (0 Deg. <= Solar Zenith Angle <= 85 Deg.) = 1
  Band M15 Brightness Temperature Range: Datum Offset 4; Scaled: No; Data Type: 1 bit
    Legend: Within Range (190K < BT(M15) < 340K) = 0; Out Of Range = 1
  Band M16 Brightness Temperature Range: Datum Offset 5; Data Type: 1 bit
    Legend: Within Range (190K < BT(M16) < 343K) = 0; Out Of Range = 1
  Active Fire: Datum Offset 6; Data Type: 1 bit
    Legend: No Active Fire = 0; Active Fire = 1
  Spare: Datum Offset 7; Data Type: 1 bit
    Legend: none listed
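The QF1 layout above translates directly into shift-and-mask operations. The following is a minimal sketch rather than program-supplied code: the field names are made up for illustration, and it assumes that Datum Offset counts bit positions starting from the least significant bit of the byte.

```python
# name -> (datum offset in bits, width in bits), copied from the QF1 profile.
# The dictionary keys are illustrative names, not identifiers from the CDFCB-X.
QF1_FIELDS = {
    "ist_quality": (0, 2),
    "algorithm": (2, 1),
    "day_night": (3, 1),
    "m15_bt_range": (4, 1),
    "m16_bt_range": (5, 1),
    "active_fire": (6, 1),
    "spare": (7, 1),
}

IST_QUALITY_LEGEND = {0: "High", 1: "Medium", 2: "Low", 3: "No Retrieval"}


def decode_qf1(byte_value):
    """Split one QF1 byte into its bit-level flags.

    Assumes datum offset 0 is the least significant bit of the byte.
    """
    if not 0 <= byte_value <= 255:
        raise ValueError("quality flag must fit in one byte")
    return {
        name: (byte_value >> offset) & ((1 << width) - 1)
        for name, (offset, width) in QF1_FIELDS.items()
    }


# Example byte: quality = 2 (Low), day/night = 1 (Day), active fire = 1.
flags = decode_qf1(0b01001010)
print(flags)
print(IST_QUALITY_LEGEND[flags["ist_quality"]])
```

The same table-driven approach would extend to QF2 and QF3 by swapping in their own offset and width tables.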
The NPOESS Granule - Product Profile: IST Scale Factors
Field
  Name: ISTFactors
  Data Size: 4 bytes
  Field Offset: 0
Dimensions
  Name                     Granule Boundary  Dynamic  Min Array Size  Max Array Size
  IST scale/offset values  Yes               No       2               2
Datum
  Description: Scale = first array element; Offset = second array element
  Measurement Units: Scale = unitless; Offset = K
  Scaled: No
  Data Type: 32-bit floating point
  Fill Values: none listed
  Legend Entries: none listed
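Taken together, the IST_Array and ISTFactors profiles suggest how a stored 16-bit value becomes a temperature. The sketch below assumes the common linear form, raw * scale + offset, which the slides do not state explicitly; the scale and offset numbers in the example are invented for illustration and are not real ISTFactors values.

```python
# Fill values listed in the IST_Array profile (value -> name).
FILL_VALUES = {
    65535: "NA_UINT16_FILL",
    65534: "MISS_UINT16_FILL",
    65533: "ONBOARD_PT_UINT16_FILL",
    65532: "ONGROUND_PT_UINT16_FILL",
    65531: "ERR_UINT16_FILL",
    65530: "ELINT_UINT16_FILL",
    65529: "VDNE_UINT16_FILL",
    65528: "SOUB_UINT16_FILL",
}


def unscale_ist(raw, ist_factors):
    """Convert one stored IST value to kelvin, or return None for a fill value.

    ist_factors is the two-element pair from ISTFactors: (scale, offset).
    The linear formula is an assumption, not taken from the slides.
    """
    if raw in FILL_VALUES:
        return None
    scale, offset = ist_factors
    return raw * scale + offset


# Hypothetical factors, chosen only so the arithmetic is easy to follow.
example_factors = (0.0625, 200.0)
kelvin = unscale_ist(1000, example_factors)
missing = unscale_ist(65534, example_factors)
print(kelvin, missing)
```

Returning None for fills keeps them from being mistaken for temperatures; a caller that needs the reason can look the raw value up in FILL_VALUES.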
VIIRS Ice Surface Temperature (IST) EDR – HDFView Screenshot
[Screenshot callouts: Data Arrays (float, int, etc.); Arrays of HDF References]
The NPOESS Granule – HDF View
The granule dataset array “VIIRS-IST-EDR_Gran_1” contains object IDs that “point” (dereference) to the second region of each dataset array under the “VIIRS-IST-EDR_All” group.
The first object ID in the VIIRS-IST-EDR_Gran_1 array dereferences to the middle portion of the IST_Array.
All of these “portions” cover the same time period and share the other granule-level metadata.
References to Regions
[Diagram: each dataset array is divided into three granules; each reference in VIIRS-IST-EDR_Gran_1 points into one of the arrays.]
  Dataset array                 Granule 0   Granule 1   Granule 2
  IST_Array (768 x 3200)        256 x 3200  256 x 3200  256 x 3200
  QF1_VIIRSISTEDR (768 x 3200)  256 x 3200  256 x 3200  256 x 3200
  QF2_VIIRSISTEDR (768 x 3200)  256 x 3200  256 x 3200  256 x 3200
  QF3_VIIRSISTEDR (768 x 3200)  256 x 3200  256 x 3200  256 x 3200
  ISTFactors (6)                2           2           2
VIIRS-IST-EDR_Gran_1[ ] = 71284376 71287112 71287384 71287656 71287928
Contains Object IDs to use in conjunction with the “H5Rdereference” and “H5Rget_region” commands.
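The region each reference selects can be reasoned about with simple index arithmetic. This sketch does not use the HDF5 library; it only reproduces the bookkeeping shown in the diagram, on the assumption that granules are stored back to back in index order along the first dimension. The function name is illustrative.

```python
ROWS_PER_GRANULE = 256   # inTrack size of one granule (from the product profile)
CROSS_TRACK = 3200       # crossTrack size
GRANULES_IN_FILE = 3     # the example file holds Granule 0, 1 and 2


def granule_rows(index, rows_per_granule=ROWS_PER_GRANULE):
    """Return the half-open (start, stop) range that granule `index` occupies."""
    if index < 0:
        raise ValueError("granule index must be non-negative")
    start = index * rows_per_granule
    return start, start + rows_per_granule


# Granule 1 of IST_Array: the "middle portion" of the 768 x 3200 array.
start, stop = granule_rows(1)
selected_points = (stop - start) * CROSS_TRACK
total_points = GRANULES_IN_FILE * ROWS_PER_GRANULE * CROSS_TRACK

# ISTFactors holds 2 elements per granule, so granule 1 occupies elements 2..3.
factors_range = granule_rows(1, rows_per_granule=2)

print(start, stop, selected_points, total_points, factors_range)
```

Here `selected_points` corresponds to the granule-sized selection and `total_points` to the extent of the full aggregated array.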
NPOESS HDF5 Files Summary
• The NPOESS Program delivers the official deliverable data products (RDR, SDR/TDR, EDR/ARP/IP) and dynamic ancillary data and auxiliary data in HDF5 Files
• The HDF5 Files have an XML User Block that can be accessed without HDF5 tools; it provides a “quick-look” into the metadata before opening the HDF5 file
• Metadata within the HDF5 files are stored as attributes
• There are general UML Models for the NPOESS official delivered data that provide a common framework
• Official deliverable data products are organized by reference objects (aggregations) which contain one or more reference regions (granules)
• Although data may be accessed directly through the All Data group, the Data Products group provides integrated access:
  – Allows the user to access both metadata and data through a common HDF5 group
    • Metadata is accessed directly by reading the Attribute values
    • Datasets may be accessed by dereferencing the object ID stored in the Data Products Group for the aggregation or granule
• NPOESS HDF5 files provide flexibility for a variety of end users.
Backup Slides
NPOESS Granules – Dereferencing to Datasets: Details
(See the HDF5 User’s Guide release 1.6.5, Chapter 2, “The HDF5 Library and Programming Model”, Section 2, “Dataspace Function Summaries” - H5S commands)
Note that the H5S API commands fall into two broad categories:
1. Dataspace Management & Query Functions
  • These functions operate on the entire dataspace
    – Entire dataspace is equivalent to an entire (temporal) aggregated array’s dataspace in an NPOESS HDF5 file under the “All_Data” group
  • Example: H5Sget_simple_extent_npoints
    – Returns the number of elements in the entire Array under “All_Data” for HDF5 NPOESS.
    – For VIIRS-IST-EDR_Gran_1, the first reference in the array (referencing the IST_Array) would return 768 x 3200 = 2,457,600 points.
2. Dataspace Selection Functions – hyperslabs and points
  • These functions operate on a hyperslab or a point selection
  • For NPOESS HDF5 files, the “selection” is equivalent to the granule (hyperslab) for a particular field (array)
  • The “selection” is the portion of the data array the reference “points” to:
    – Example: H5Sget_select_npoints
      » Determines the number of points in a dataspace selection.
      » For HDF5 NPOESS, this would be the number of points in a granule for a particular field
      » For VIIRS-IST-EDR_Gran_1, the first reference in the array (referencing the IST_Array) would return 256 x 3200 = 819,200 points.
    – Note that the “select” in the API command is short for “selection”. It is not a redundant term for “get”.
Extract from HDF5 User’s Guide (1.6.5), Section 4.2 - The Programming Model
Reading and Writing a Portion of a Dataset
A “selection” may be:
  – A hyperslab (NPOESS uses this only)
  – A union of hyperslabs
  – A list of independent points
  – Note: These illustrations show a mapping procedure to another dataspace. The HDF5 API does not do this when you dereference; this would be user defined.
h5dump Screenshot – VIIRS Sea Surface Temperature HDF5 File
• Another way to view the arrays of references (Aggregation and Granule dataset arrays) is with the h5dump utility:
– Granule:
– Aggregation:
– Note: Currently, the only way to match the object ID in the granule/aggregation datasets is to manually list the aggregation as shown above using h5dump or look up the order in the NPOESS Data Format Control Book - External. The HDF Group will add the ability to obtain the name of the dataset a reference points to in v1.8 beta.
Sample Code (p1) – Reads a Multi-Granule HDF5 NPOESS File
Note: This dereferences to a dataset region (selection). Use H5R_OBJECT to dereference to an object (for VIIRS-IST-EDR_Agg).
Sample Code (p2)
Sample Code (p3)
Sample Code (p4) – Code Output
Sample Files & HDF5 Reference API Summary
• NPOESS granules are made up of portions of one or more dataset arrays.
• In order to access a granule, the granule dataset must be read and each object ID dereferenced using the HDF Reference API (H5R)
• Use H5Sget_ ... commands to retrieve information about the entire dataspace of the array containing a reference’s selection (or hyperslab)
• Use H5Sget_select_ ... command to retrieve information about the selection only