HDF5 Advanced Topics Datatypes
HDF and HDF-EOS Workshop IX
November 30, 2005
Goal
• Introduce HDF5 datatypes
Topics
• Overview of HDF5 datatypes
• Simple atomic datatypes
• Composite atomic datatypes
• Compound datatypes
• Discovering HDF5 datatypes
Overview of HDF5 Datatypes
Datatypes
• A datatype is
  – A classification specifying the interpretation of a data element
  – Specifies for a given data element
    • the set of possible values it can have
    • the operations that can be performed
    • how the values of that type are stored
  – May be shared between different datasets in one file
General Operations on HDF5 Datatypes
• Create
  – H5Tcreate creates a datatype of the H5T_COMPOUND, H5T_OPAQUE, and H5T_ENUM classes
• Copy
  – H5Tcopy creates another instance of the datatype; can be applied to any datatype
• Commit
  – H5Tcommit creates a datatype object in the HDF5 file; a committed datatype can be shared between different datasets
• Open
  – H5Topen opens a datatype stored in the file
• Close
  – H5Tclose closes a datatype object
Programming model for HDF5 Datatypes
• Create
  – Use predefined HDF5 types
  – Create compound or composite datatypes
    • Create a datatype (by copying an existing one or by creating one of the H5T_COMPOUND, H5T_ENUM, or H5T_OPAQUE classes)
    • Create a datatype by querying the datatype of a dataset
  – Open a committed datatype from the file
  – Set datatype properties (length, precision, etc.)
• (Optional) Discover datatype properties (size, precision, members, etc.)
• Use the datatype to create a dataset/attribute, to write/read a dataset/attribute, or to set a fill value
• (Optional) Save the datatype in the file
• Close
  – No need to close predefined datatypes
Simple Atomic HDF5 Datatypes
HDF5 Atomic Datatypes
• Atomic type classes
  – standard integers & floats
  – strings (fixed and variable size)
  – pointers - references to objects/dataset regions
  – enumeration - names mapped to integers
  – opaque
  – bitfield
• An element of an atomic datatype is the smallest possible unit for an HDF5 I/O operation
  – Cannot write or read just the mantissa or exponent fields of floats or the sign field of integers
HDF5 Predefined Datatypes
• HDF5 Library provides predefined datatypes (symbols) for all atomic classes except opaque
  – H5T_<arch>_<base>
  – Examples:
    • H5T_IEEE_F64LE
    • H5T_STD_I32BE
    • H5T_C_S1
    • H5T_STD_REF_OBJ, H5T_STD_REF_DSETREG
    • H5T_NATIVE_INT
• Predefined datatypes do not have constant values; they are initialized when the library is initialized
HDF5 Predefined Datatypes
• Operations prohibited
  – Create (H5Tcreate)
  – Close (H5Tclose)
• Operations permitted
  – Copy (H5Tcopy)
  – Get size and other properties; set them on a copy, because the predefined datatypes themselves cannot be modified
When to use HDF5 Predefined Datatypes?
• In dataset and attribute creation operations
  – Argument to H5Dcreate or to H5Acreate
• In dataset and attribute read/write operations
  – Argument to H5Dwrite/read, H5Awrite/read
  – Use H5T_NATIVE_* types for application portability
• To create user-defined types
  – Fixed and variable-length strings
  – User-defined integers and floats (13-bit integer or non-standard floating-point)
• In composite type definitions
• Do not use for declaring variables
Storing strings in HDF5
• Array of characters
  – Access to each character
  – Extra work to access and interpret each string
• Fixed length
    string_id = H5Tcopy(H5T_C_S1);
    H5Tset_size(string_id, size);
  • Overhead for short strings
  • Can be compressed
• Variable length
    string_id = H5Tcopy(H5T_C_S1);
    H5Tset_size(string_id, H5T_VARIABLE);
  • Overhead as for all VL datatypes (later)
  • Compression will not be applied to actual data
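The space trade-off between the two string layouts can be estimated without touching the library. The sketch below is plain C and does not call HDF5; the function names are hypothetical. It compares the bytes used when every string occupies a fixed-size slot with the character payload of a variable-length layout. It deliberately ignores the per-element descriptor that a variable-length layout also needs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes used when every string occupies a slot of slot_size bytes.
 * Shorter strings are padded, which is the overhead for short strings. */
size_t fixed_length_bytes(const char *const *strings, size_t count, size_t slot_size)
{
    for (size_t i = 0; i < count; i++) {
        assert(strlen(strings[i]) <= slot_size);  /* every string must fit its slot */
    }
    return count * slot_size;
}

/* Bytes of character data when each string is stored at its own length.
 * The per-element bookkeeping of a variable-length layout is NOT counted. */
size_t variable_length_payload_bytes(const char *const *strings, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += strlen(strings[i]);
    }
    return total;
}
```

When the strings have very uneven lengths, the padded fixed-length total grows with the longest string, while the variable-length payload grows only with the actual text.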
Bitfield Datatype
• C bitfield
• Bitfield – sequence of bits packed in some integer type
• Examples of predefined datatypes
  – H5T_NATIVE_B64 – native 8-byte bitfield
  – H5T_STD_B32LE – standard little-endian 4-byte bitfield
• Created by copying a predefined bitfield type and setting precision, offset and padding
• Use n-bit filter to store significant bits only
Bitfield Datatype
Example: little-endian (LE), 0-padding, offset 3, precision 11
[Figure: 16-bit word 0 0 0 1 0 1 1 1 0 0 1 1 1 0 0 0, with bit positions 0, 7 and 15 marked]
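How offset and precision select the significant bits can be shown in plain C. This is a sketch that does not use the HDF5 API; the function name is hypothetical, and it assumes the 16 bits in the example are written with bit 15 on the left, which gives the word 0x1738.

```c
#include <assert.h>
#include <stdint.h>

/* Return the `precision` significant bits that start `offset` bits above
 * bit 0 of a 16-bit word. Bits outside that window are padding. */
uint16_t extract_field(uint16_t word, unsigned offset, unsigned precision)
{
    assert(precision >= 1 && offset + precision <= 16);
    uint32_t mask = (1u << precision) - 1u;      /* precision low bits set */
    return (uint16_t)((word >> offset) & mask);  /* drop low padding, mask high padding */
}
```

Under that reading of the figure, `extract_field(0x1738, 3, 11)` skips the three low padding bits and keeps the next eleven.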
Opaque Datatype
• Datatype that cannot be described by any other HDF5 datatype
• Element treated as a blob of data and not interpreted by the library
• Identified by
  – Size
  – Tag (ASCII string)
Reference Datatype
• Reference to an HDF5 object
  – Pointers to groups, datasets, and named datatypes in a file
    • Predefined datatype H5T_STD_REF_OBJ
    • H5Rcreate
    • H5Rdereference
• Reference to a dataset region (selection)
  – Pointer to the dataspace selection
    • Predefined datatype H5T_STD_REF_DSETREG
    • H5Rcreate
    • H5Rdereference
Enumeration Datatype
• Constructed after C/C++ enum type
• Name-value pairs
  – Name – ASCII string
  – Value – of any HDF5 integer type
  – H5Tenum_create
    • Creates the type based on integer type
  – H5Tenum_insert
    • Inserts name-value pairs
    • Order of insertion is not important
    • Two types are equal if they have the same pairs
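The rule that insertion order does not matter for equality can be modelled in a few lines of plain C. This is a toy model with hypothetical names, not the HDF5 implementation. It assumes member names are unique within a type and compares two types as sets of name-value pairs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_PAIRS 8

typedef struct { const char *name; int value; } enum_pair_t;
typedef struct { enum_pair_t pairs[MAX_PAIRS]; int count; } enum_model_t;

/* Append one name-value pair (names are assumed to be unique). */
void enum_insert(enum_model_t *e, const char *name, int value)
{
    assert(e->count < MAX_PAIRS);
    e->pairs[e->count].name = name;
    e->pairs[e->count].value = value;
    e->count++;
}

/* True if the pair (name, value) appears anywhere in e. */
static bool enum_contains(const enum_model_t *e, const char *name, int value)
{
    for (int i = 0; i < e->count; i++) {
        if (e->pairs[i].value == value && strcmp(e->pairs[i].name, name) == 0) {
            return true;
        }
    }
    return false;
}

/* Two models are equal if they hold the same pairs, in any order. */
bool enum_equal(const enum_model_t *a, const enum_model_t *b)
{
    if (a->count != b->count) {
        return false;
    }
    for (int i = 0; i < a->count; i++) {
        if (!enum_contains(b, a->pairs[i].name, a->pairs[i].value)) {
            return false;
        }
    }
    return true;
}
```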
Composite atomic HDF5 Datatypes
Array Datatype
• Element is a multidimensional array of elements
• Base type can be any HDF5 datatype
• Example
  – Time series of speed (v1(t), v2(t), v3(t))
  – Speed vector (v1, v2, v3) – all three components are needed; no subsetting by vector component (e.g. by v1)
HDF5 Fixed and Variable length array storage
[Figure: data elements of fixed and of variable length laid out along a time axis]
HDF5 Variable Length Datatypes: Programming issues
• Each element is represented by a C struct
    typedef struct {
        size_t len;
        void *p;
    } hvl_t;
• Base type can be any HDF5 type
• H5Tvlen_create(base_type)
Creation of HDF5 Variable length array
[Figure: hvl_t data[LENGTH]; each element has a length data[n].len and a pointer data[n].p to its data]
Creation of HDF5 Variable length array
    hvl_t data[LENGTH];
    for (i = 0; i < LENGTH; i++) {
        /* element i holds i+1 unsigned integers */
        data[i].p = malloc((i + 1) * sizeof(unsigned int));
        data[i].len = i + 1;
    }
    tvl = H5Tvlen_create(H5T_NATIVE_UINT);
[Figure: five data elements of increasing length, labelled from data[0].p to data[4].len]
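The ragged buffer built on this slide, and its cleanup, can be tried out without the library. The sketch below is plain C: `vl_elem_t` is a hypothetical stand-in with the same two fields as `hvl_t`, and the helper names are assumptions. It adds the allocation check that the slide omits and releases every per-element buffer, which is the job `H5Dvlen_reclaim` performs for data read by the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in with the same two fields as HDF5's hvl_t (hypothetical name). */
typedef struct {
    size_t len;  /* number of base-type values in this element */
    void  *p;    /* buffer holding those values */
} vl_elem_t;

/* Release every per-element buffer and reset the descriptors. */
void vl_release(vl_elem_t *data, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(data[i].p);
        data[i].p = NULL;
        data[i].len = 0;
    }
}

/* Make element i hold i+1 unsigned ints with the values i*10 + j.
 * Returns 0 on success, -1 if an allocation fails. */
int vl_fill(vl_elem_t *data, size_t count)
{
    assert(data != NULL);
    for (size_t i = 0; i < count; i++) {
        unsigned int *buf = malloc((i + 1) * sizeof *buf);
        if (buf == NULL) {
            vl_release(data, i);  /* undo the elements filled so far */
            return -1;
        }
        for (size_t j = 0; j <= i; j++) {
            buf[j] = (unsigned int)(i * 10 + j);
        }
        data[i].p = buf;
        data[i].len = i + 1;
    }
    return 0;
}

/* Total number of values stored across all elements. */
size_t vl_total_values(const vl_elem_t *data, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += data[i].len;
    }
    return total;
}
```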
HDF5 Variable Length Datatypes: Storage
[Figure labels: Dataset with variable length datatype; Global heap; Raw data]
Reading HDF5 Variable length array
When size and base datatype are known:
    hvl_t rdata[LENGTH];
    /* Create the memory datatype */
    tvl = H5Tvlen_create(H5T_NATIVE_UINT);
    ret = H5Dread(dataset, tvl, H5S_ALL, H5S_ALL, H5P_DEFAULT, rdata);
    /* Reclaim the read VL data; H5Dvlen_reclaim needs a real dataspace */
    space = H5Dget_space(dataset);
    ret = H5Dvlen_reclaim(tvl, space, H5P_DEFAULT, rdata);
Reading HDF5 Variable length array
When size is unknown:
    hvl_t *rdata;
    /* The number of elements comes from the dataset's dataspace */
    space = H5Dget_space(dataset);
    nelmts = H5Sget_simple_extent_npoints(space);
    rdata = (hvl_t *)malloc(nelmts * sizeof(hvl_t));
    /* Optional: size reports the bytes needed for the VL data itself */
    ret = H5Dvlen_get_buf_size(dataset, tvl, space, &size);
    ret = H5Dread(dataset, tvl, H5S_ALL, H5S_ALL, H5P_DEFAULT, rdata);
    …
    /* Reclaim the read VL data */
    ret = H5Dvlen_reclaim(tvl, space, H5P_DEFAULT, rdata);
    free(rdata);
Freeing HDF5 Variable length array
[Figure: H5Dvlen_reclaim releases the data that data[n].p points to; free releases the array of data[n].p / data[n].len descriptors]
Compound HDF5 Datatypes
HDF5 Compound Datatypes
• Compound types
  – Comparable to C structs
  – Members can be atomic or compound types
  – Members can be multidimensional
  – Can be written/read by a field or set of fields
  – Not all data filters can be applied (shuffling, SZIP)
  – H5Tcreate(H5T_COMPOUND) and H5Tinsert calls create a compound datatype
  – See H5Tget_member* functions for discovering properties of the HDF5 compound datatype
HDF5 Compound Datatypes: Creating and writing compound dataset
    typedef struct s1_t {
        int a;
        float b;
        double c;
    } s1_t;
    s1_t s1[LENGTH];
    /* Initialize the data */
    for (i = 0; i < LENGTH; i++) {
        s1[i].a = i;
        s1[i].b = i*i;
        s1[i].c = 1./(i+1);
    }
HDF5 Compound Datatypes: Creating and writing compound dataset
    /* Create datatype in memory. */
    s1_tid = H5Tcreate(H5T_COMPOUND, sizeof(s1_t));
    H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
    H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
    H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);
Note:
• Use HOFFSET macro instead of calculating offset by hand
• Order of H5Tinsert calls is not important if HOFFSET is used
HDF5 Compound Datatypes: Creating and writing compound dataset
    /* Create dataset and write data */
    dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);
    status = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);
Note:
• In this example memory and file datatypes are the same
• Type is not packed
• Use H5Tpack to save space in the file
    /* H5Tpack modifies the datatype in place, so pack a copy */
    s2_tid = H5Tcopy(s1_tid);
    status = H5Tpack(s2_tid);
    dataset = H5Dcreate(file, DATASETNAME, s2_tid, space, H5P_DEFAULT);
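What packing can save depends on the padding the compiler puts into the struct. The plain C sketch below does not call HDF5; `member_info_t` and both functions are hypothetical names. It records each member's offset with the standard `offsetof` macro, which plays the same role as HOFFSET, and compares `sizeof(s1_t)` with the sum of the member sizes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct s1_t {
    int a;
    float b;
    double c;
} s1_t;

/* One row per struct member: the same name/offset pair that is
 * passed to H5Tinsert, plus the member's size in memory. */
typedef struct {
    const char *name;
    size_t offset;
    size_t size;
} member_info_t;

/* Fill out[0..2] with the layout the compiler chose for s1_t and
 * return the packed size (sum of member sizes, no padding). */
size_t describe_s1(member_info_t out[3])
{
    assert(out != NULL);
    out[0] = (member_info_t){ "a_name", offsetof(s1_t, a), sizeof(int) };
    out[1] = (member_info_t){ "b_name", offsetof(s1_t, b), sizeof(float) };
    out[2] = (member_info_t){ "c_name", offsetof(s1_t, c), sizeof(double) };
    return out[0].size + out[1].size + out[2].size;
}

/* Bytes per element that are padding, i.e. what packing would remove. */
size_t s1_padding_bytes(void)
{
    return sizeof(s1_t) - (sizeof(int) + sizeof(float) + sizeof(double));
}
```

The result is platform dependent. If `s1_padding_bytes()` returns 0, a packed copy of this particular type occupies the same space as the unpacked one.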
File content with h5dump
    HDF5 "SDScompound.h5" {
    GROUP "/" {
       DATASET "ArrayOfStructures" {
          DATATYPE {
             H5T_STD_I32BE "a_name";
             H5T_IEEE_F32BE "b_name";
             H5T_IEEE_F64BE "c_name";
          }
          DATASPACE { SIMPLE ( 10 ) / ( 10 ) }
          DATA {
             { [ 0 ], [ 0 ], [ 1 ] },
             { [ 1 ], [ 1 ], [ 0.5 ] },
             { [ 2 ], [ 4 ], [ 0.333333 ] },
             ….
HDF5 Compound Datatypes: Reading compound dataset
    /* Create datatype in memory and read data. */
    dataset = H5Dopen(file, DATASETNAME);
    s2_tid = H5Dget_type(dataset);
    mem_tid = H5Tget_native_type(s2_tid, H5T_DIR_DEFAULT);
    status = H5Dread(dataset, mem_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);
Note:
• We could construct the memory type as we did in the writing example
• For general applications we need to discover the type in the file to work out the structure to read into
HDF5 Compound Datatypes: Reading compound dataset, subsetting by fields
    typedef struct ss_t {
        double c;
        float b;
    } ss_t;
    ss_t ss[LENGTH];
    …
    ss_tid = H5Tcreate(H5T_COMPOUND, sizeof(ss_t));
    H5Tinsert(ss_tid, "c_name", HOFFSET(ss_t, c), H5T_NATIVE_DOUBLE);
    H5Tinsert(ss_tid, "b_name", HOFFSET(ss_t, b), H5T_NATIVE_FLOAT);
    …
    status = H5Dread(dataset, ss_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, ss);
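The effect of reading with a memory type that names only some fields can be pictured as a per-record copy of the matching members. The sketch below is plain C with a hypothetical function name and does not call HDF5; the library matches members by name, whereas this model hard-codes the two fields. The point it shows is that the destination struct may list the fields in a different order and omit the rest.

```c
#include <assert.h>
#include <stddef.h>

/* Full record as written to the file. */
typedef struct s1_t {
    int a;
    float b;
    double c;
} s1_t;

/* Subset record: only c and b, in a different order. */
typedef struct ss_t {
    double c;
    float b;
} ss_t;

/* Copy the fields c and b of each source record into the subset
 * records; field a is skipped. */
void read_fields_c_b(const s1_t *src, ss_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i].c = src[i].c;
        dst[i].b = src[i].b;
    }
}
```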
Discovering HDF5 Datatypes
Discovering Datatype
1. get class
2. get size
3. if numeric atomic
   A. get precision, sign, padding, mantissa, exponent, etc.
   B. allocate space and read data
4. if VL, enum or array
   A. get super class; go to 2
5. if compound
   A. get number of members and members' offsets
   B. for each member type, go to 1
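The recursion in these steps can be sketched on a toy type tree. Everything below is a hypothetical model in plain C, not the HDF5 API: `type_node_t` stands in for a datatype handle, `super` for the base type that step 4 retrieves, and `members` for the member types of step 5. The walk counts the atomic types it finally reaches.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { CLS_INTEGER, CLS_FLOAT, CLS_ARRAY, CLS_COMPOUND } type_class_t;

/* Toy description of a datatype (hypothetical, for illustration only). */
typedef struct type_node {
    type_class_t cls;
    size_t size;                             /* size in bytes */
    const struct type_node *super;           /* base type, used by CLS_ARRAY */
    size_t nmembers;                         /* used by CLS_COMPOUND */
    const struct type_node *const *members;  /* member types */
} type_node_t;

/* Walk the type the way the numbered steps do and count atomic leaves. */
size_t count_atomic_leaves(const type_node_t *t)
{
    assert(t != NULL);
    switch (t->cls) {
    case CLS_ARRAY:
        /* step 4: descend into the base type */
        return count_atomic_leaves(t->super);
    case CLS_COMPOUND: {
        /* step 5: visit every member type */
        size_t n = 0;
        for (size_t i = 0; i < t->nmembers; i++) {
            n += count_atomic_leaves(t->members[i]);
        }
        return n;
    }
    default:
        /* step 3: numeric atomic type, nothing further to unpack */
        return 1;
    }
}
```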
HDF5 Compound Datatypes: Discovering Datatype
    s1_tid = H5Dget_type(dataset);
    if (H5Tget_class(s1_tid) == H5T_COMPOUND) {
        sz = H5Tget_size(s1_tid);
        nmemb = H5Tget_nmembers(s1_tid);
        for (i = 0; i < nmemb; i++) {
            s2_tid = H5Tget_member_type(s1_tid, i);
            /* the returned name must be released by the caller */
            name = H5Tget_member_name(s1_tid, i);
            offset = H5Tget_member_offset(s1_tid, i);
            if (H5Tget_class(s2_tid) == H5T_COMPOUND) {
                /* recursively analyze the nested type. */
            } else if (H5Tget_class(s2_tid) == H5T_ARRAY) {
                sz2 = H5Tget_size(s2_tid);
                H5Tget_array_dims(s2_tid, dim, NULL);
                s3_tid = H5Tget_super(s2_tid);
            }
        }
    }
HDF Information
• HDF Information Center
  – http://hdf.ncsa.uiuc.edu/
• HDF Help email address
  – [email protected]
• HDF users mailing list
  – [email protected]