NetCDF4 Performance Benchmark. Part I Will the performance in netCDF4 comparable with that in...

Post on 24-Dec-2015


NetCDF4 Performance Benchmark

Part I

Will the performance in netCDF4 be comparable with that in netCDF3?

Configurations

Dataset
  40 MB: 6 files
  1 MB: 6 files

Storage Layout
  Contiguous
  Chunked (HDF5 default cache size: 1 MB)
  Chunked (HDF5 cache size: 64 MB)

System Cache
  On: use all caches and buffers provided by the kernel
  Drop: "drop_caches" to read data from disk; "fsync" to write data to disk
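The "drop" setting can be sketched in a few lines. This is a minimal illustration, not the benchmark's actual harness: the helper names `write_and_sync` and `drop_page_cache` are hypothetical, and `drop_page_cache` assumes a Linux system and root privileges, so the sketch defines it without calling it.

```python
import os
import tempfile


def write_and_sync(path, payload):
    """Write payload and force it to disk with fsync before returning."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to push its buffers to disk
    return os.path.getsize(path)


def drop_page_cache():
    """Linux only, needs root: empty the page cache so reads hit the disk."""
    os.sync()  # write dirty pages first so they can be dropped
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")  # 3 = free page cache, dentries and inodes


# Usage sketch: time only the write plus the fsync, not a buffered write.
with tempfile.TemporaryDirectory() as tmp:
    size = write_and_sync(os.path.join(tmp, "sample.bin"), b"\0" * 4096)
    print("bytes on disk:", size)
```

Without the fsync, a write benchmark mostly measures how fast data can be copied into kernel buffers; without dropping caches, a read benchmark mostly measures memory speed.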

10 cases

Case  Dataset  Storage Layout          System Cache
1     40 MB    contiguous              on
2     40 MB    contiguous              drop
3     40 MB    chunked (64 MB cache)   on
4     40 MB    chunked (64 MB cache)   drop
5     40 MB    chunked (1 MB cache)    on
6     40 MB    chunked (1 MB cache)    drop
7     1 MB     contiguous              on
8     1 MB     contiguous              drop
9     1 MB     chunked (1 MB cache)    on
10    1 MB     chunked (1 MB cache)    drop

Default Hyperslab

One big hyperslab is selected

1. Contiguous layout with cache

Dataset   Storage Layout   System Cache
≈ 40 MB   contiguous       on

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

2. Contiguous layout w/o cache

Dataset   Storage Layout   System Cache
≈ 40 MB   contiguous       drop

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

3. Chunked layout with cache

Dataset   Storage Layout                       System Cache
≈ 40 MB   chunked (HDF5 cache size: 64 MB)     on

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

4. Chunked layout w/o cache

Dataset   Storage Layout                       System Cache
≈ 40 MB   chunked (HDF5 cache size: 64 MB)     drop

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

5. Chunked layout with cache

Dataset   Storage Layout                              System Cache
≈ 40 MB   chunked (HDF5 default cache size: 1 MB)     on

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

H5Pset_alloc_time(EARLY)

Dataset   Storage Layout                              System Cache
≈ 40 MB   chunked (HDF5 default cache size: 1 MB)     on

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3 chunked; annotation: H5Pset_alloc_time(EARLY)]

6. Chunked layout w/o cache

Dataset   Storage Layout                              System Cache
≈ 40 MB   chunked (HDF5 default cache size: 1 MB)     drop

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

7. Contiguous layout with cache

Dataset   Storage Layout   System Cache
≈ 1 MB    contiguous       on

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

8. Contiguous layout w/o cache

Dataset   Storage Layout   System Cache
≈ 1 MB    contiguous       drop

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

9. Chunked layout with cache

Dataset   Storage Layout                              System Cache
≈ 1 MB    chunked (HDF5 default cache size: 1 MB)     on

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

10. Chunked layout w/o cache

Dataset   Storage Layout                              System Cache
≈ 1 MB    chunked (HDF5 default cache size: 1 MB)     drop

[Figure: data read rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) vs. number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

Part II

Can I get better performance with netCDF4? If yes, under what circumstances can I get better performance?

Non-contiguous Access

Logical layout for 2-dimensional arrays

[Figure: logical layout of a 2-dimensional array; labels: 256, 256, 16384, 16, 1, 240]

Non-contiguous Access

Physical layout

[Figure: physical layout; 16384 non-adjacent data points; chunk size [16384][1], chunk size [8192][1], chunk size [4096][1]]
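The difference between the physical layouts can be sketched with simple counting. The slides do not state the array shape or the selection, so the sketch assumes a row-major [16384][256] array and a read of a single column. In contiguous storage those 16384 points are all separated on disk, while with column-shaped chunks they fall into a handful of contiguous pieces. The function names are hypothetical.

```python
import math


def contiguous_runs(n_rows, n_cols, col):
    """Count separate runs of adjacent file offsets for one column
    of a row-major (C order) array stored contiguously."""
    offsets = [row * n_cols + col for row in range(n_rows)]
    runs = 1
    for prev, cur in zip(offsets, offsets[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


def chunks_touched(n_rows, chunk_rows):
    """Chunks of shape [chunk_rows][1] needed to cover one full column."""
    return math.ceil(n_rows / chunk_rows)


N_ROWS, N_COLS = 16384, 256  # assumed shape, not given in the slides

print("contiguous layout:", contiguous_runs(N_ROWS, N_COLS, col=0), "separate points")
for chunk_rows in (16384, 8192, 4096):
    print(f"chunked [{chunk_rows}][1]:", chunks_touched(N_ROWS, chunk_rows), "chunk(s)")
```

Under these assumptions the contiguous layout needs 16384 separate accesses for the column, while the three chunk shapes need 1, 2, and 4 chunk reads.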

11. Non-contiguous Access

Dataset   Storage Layout                                System Cache
≈ 16 MB   contiguous; chunked (default chunk cache)     drop

[Figure: wall clock time to read one non-contiguous hyperslab (ms) by storage layout; bars: netCDF3 contiguous, netCDF4 contiguous, chunked [16384][1], chunked [8192][1], chunked [4096][1]]

[Figure: wall clock time to write non-contiguous hyperslabs (s) by storage layout; bars: netCDF3 contiguous, netCDF4 contiguous, chunked [16384][1], chunked [8192][1], chunked [4096][1]]

12. Chunked layout with cache

Dataset   Storage Layout                  System Cache
≈ 40 MB   chunked (chunk cache varies)    on

[Figure: data write rate (MB/s) vs. cache size for 5D dataset (1, 4, 8, 16, 32, 64 MB); series: netCDF3, netCDF4]

13. Compression

Dataset      Storage Layout                  System Cache
Radar data   chunked (default chunk cache)   drop

[Figure: wall clock time to read radar data (second) by dataset name (tile1, tile2, tile4); series: deflate compression level 1, without compression]

[Figure: wall clock time to write radar data (second) by dataset name (tile1, tile2, tile4); series: deflate compression level 1, without compression]

13. Compression

Compression ratio

Dataset   Uncompressed   Compressed   Compression Ratio
Tile1     72,132,892     3,432,559    21
Tile2     72,132,892     5,129,482    14
Tile3     72,132,892     3,069,254    23
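The ratio column can be reproduced from the two size columns. A short sketch, using only the figures from the table; the dictionary layout and function name are illustrative choices:

```python
# (uncompressed, compressed) sizes copied from the table above
SIZES = {
    "Tile1": (72_132_892, 3_432_559),
    "Tile2": (72_132_892, 5_129_482),
    "Tile3": (72_132_892, 3_069_254),
}


def compression_ratio(uncompressed, compressed):
    """Ratio of original size to compressed size (larger is better)."""
    return uncompressed / compressed


for name, (raw, packed) in SIZES.items():
    ratio = compression_ratio(raw, packed)
    print(f"{name}: {ratio:.2f} (whole-number part: {int(ratio)})")
```

The whole-number parts match the table's 21, 14, and 23, so the table's ratios appear to be truncated rather than rounded.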

Part III

Can netCDF4 performance be bad? How can I avoid the bad performance?

14. Chunk size

A chunk size that is too small is bad
A chunk size slightly smaller than (number of elements) / N is bad
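Why a chunk size just under (number of elements) / N hurts can be sketched with a little arithmetic. The sketch below assumes a 2-dimensional dataset with 3162 elements per dimension, square chunks, and edge chunks stored at full size; it compares chunk lengths 791 and 790, the values shown in the slides. The function name `chunk_grid` is hypothetical.

```python
import math


def chunk_grid(dim_len, chunk_len, ndims=2):
    """Return (chunks per dimension, total chunks, storage overhead factor)
    for a dataset with dim_len elements per dimension, assuming partial
    edge chunks are stored as full chunks."""
    per_dim = math.ceil(dim_len / chunk_len)
    n_chunks = per_dim ** ndims
    allocated = (per_dim * chunk_len) ** ndims   # elements actually stored
    overhead = allocated / dim_len ** ndims      # 1.0 means no waste
    return per_dim, n_chunks, overhead


for chunk_len in (791, 790):
    per_dim, n_chunks, overhead = chunk_grid(3162, chunk_len)
    print(f"chunk {chunk_len}: {per_dim} per dimension, "
          f"{n_chunks} chunks, {overhead:.2f}x the dataset size")
```

Under these assumptions, dropping the chunk length by one element, from 791 to 790, takes the grid from 4 to 5 chunks per dimension and the stored size from about 1.00x to about 1.56x the dataset, because a nearly empty extra row and column of chunks must be allocated.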

14. Chunk size

[Figure: chunk layout of a dataset; one panel shows chunk 0 to chunk 3, the other chunk 0 to chunk 8; labels: 3162, 791, 3162, 790, dataset, chunk]

[Figure: file size (MB) vs. number of elements for each dimension in a chunk (8, 16, 32, 50, 128, 200)]

[Figure: data write rate (MB/s) vs. number of elements for each dimension in a chunk (8, 16, 32, 50, 128, 200)]

14. Chunk size

Dataset   Storage Layout                  System Cache
≈ 64 MB   chunked (default chunk cache)   drop

[Figure: file size (MB) vs. number of elements for each dimension in a chunk (316, 527, 791, 1054, 1581, 2400, 3162)]

14. Chunk size (more)

Dataset   Storage Layout                  System Cache
≈ 64 MB   chunked (default chunk cache)   drop

[Figure: data write rate (MB/s) vs. number of elements for each dimension in a chunk (316, 527, 791, 1054, 1581, 2400, 3162); labels: n, n + 1, n - 1]

15. Many Hyperslab selections

H5Pcreate()

H5Dopen()

Conclusion

The performance in netCDF4 is comparable with that in netCDF3

Improvement
  Non-contiguous access pattern
  Adjusted cache size
  Compression

Pitfall
  Small chunk size
  Many small hyperslab selections