RACBVHs: Random-Accessible Compressed Bounding Volume Hierarchies
Tae-Joon Kim, Bochang Moon, Duksu Kim, Sung-Eui Yoon
KAIST (Korea Advanced Institute of Science and Technology)
To appear at IEEE TVCG 09
Goal
Design a compact representation for interactive geometric and graphics applications of massive models
Support random access
Support various applications
Improve performance of applications
Motivation – Massive Models
Take up large amounts of disk and memory space
Long data access time and low I/O performance
St. Matthew model (372M triangles)
Power plant with Hugo model (82M triangles)
Low Growth Rate of Data Access Speed
[Figure: accumulated growth rate during 1999~2009 (log scale, 1 to 1000). Categories: random disk access speed, sequential disk access speed, RAM access speed, CPU, GPU. Bar labels: 1.3X, 4.5X, 10X, 56X, 192X]
Reducing data access time is critical!
Large Scale Applications
Ray tracing
Collision detection
Visibility queries
Dynamic simulation
Motion planning
Large Scale Applications
Common geometric data structures
Meshes
Acceleration hierarchies
k-d trees
Bounding volume hierarchies (BVHs)
Usually require random access on meshes and hierarchies
Supporting random access is important!
Our Approach
Compress bounding volume hierarchies (BVHs) to reduce expensive data access time
Support random access
Support fast runtime decompression for performance improvements
Provide general API for various applications
Random-Accessible Compressed BVHs
Ray Tracing of St. Matthew Model
128M triangles (4GB)
256M BVH (8GB)
9.6:1 compression ratios for the BVH over original uncompressed BVH
4.4:1 runtime performance improvement
Related Work
Mesh compression
Compression and random access
Tree and BVH compression
Mesh Compression
Designed to achieve a maximum compression ratio or efficient transmission [Touma and Gotsman 98, Devillers and Gandoin 00]
Encode vertices [Alliez and Desbrun 01, Kälberer et al. 05]
Encode edges [Isenburg and Snoeyink 00]
Encode faces [Gumhold and Strasser 98, Rossignac 99, Lee et al. 02]
Do not directly support random access
Compression and Random Access
Quite common in video & audio encoding
E.g., MPEG video
Single or multi-resolution mesh compression [Choe et al. 04, Yoon and Lindstrom 07, Choe et al. 09]
[Gobbetti et al. 06, Kim et al. 06]
Regular volumetric grids and images [Ihm and Park 99, Rodler 99, Lefebvre and Hoppe 06]
Tree and BVH Compression
Tree compression [Zaks 80, Zerling 85, Katajainen and Makinen 90]
Do not support random access
Encode bounding volumes
Quantization: [Cline et al. 06, Mahovsky 05, Terdiman 03]
Hierarchical encoding: [Rusinkiewicz and Levoy 00, Hubo et al. 06]
Encode tree structures
Assume a particular tree structure: [Lauterbach et al. 07 and 08]
Assume a particular layout for the tree: [Lefebvre and Hoppe 07]
Outline
Overview
Compression at preprocessing
Runtime data access framework
Results
Overview - Compression at Preprocessing
Decompose the BVH into a set of clusters
Compress each cluster separately
Each cluster serves as an access point
[Figure: a BVH decomposed into clusters C1, C2, C3]
Overview - Runtime BVH Access Framework
[Diagram: Applications send a data request to main memory, which holds the cached data and an offset table with entries 0 to n. The compressed data pool (clusters c0 to cn) resides on the external drive. Requested clusters are decompressed using CPU power, and the requested data is returned to the applications]
Overview - Runtime BVH Access Framework
[Diagram: same framework with no external drive. The compressed data pool (clusters c0 to cn) is preloaded into main memory alongside the cached data and the offset table with entries 0 to n. Applications send a data request and receive the requested data]
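The access path in the diagrams (offset table, compressed data pool, cached data) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `zlib` stands in for the RACBVH codec, the node layout `NODE`, the LRU eviction policy, and the names `build_pool` and `ClusterCache` are all assumptions made for the example, and the cluster size is shrunk from 4K to 4 nodes.

```python
import struct
import zlib
from collections import OrderedDict

# Hypothetical uncompressed node: AABB min (3 floats), AABB max (3 floats),
# left and right child indices (2 ints).
NODE = struct.Struct("<6f2i")
CLUSTER_SIZE = 4  # nodes per cluster; the slides use e.g. 4K


def build_pool(nodes):
    """Compress each cluster of consecutive nodes separately.

    Returns the compressed data pool and an offset table where
    offsets[c] .. offsets[c + 1] delimits cluster c inside the pool.
    """
    pool = bytearray()
    offsets = [0]
    for start in range(0, len(nodes), CLUSTER_SIZE):
        raw = b"".join(NODE.pack(*n) for n in nodes[start:start + CLUSTER_SIZE])
        pool += zlib.compress(raw)  # stand-in codec (assumption)
        offsets.append(len(pool))
    return bytes(pool), offsets


class ClusterCache:
    """Random access to nodes; decompresses a cluster only on a cache miss."""

    def __init__(self, pool, offsets, capacity=2):
        self.pool = pool
        self.offsets = offsets
        self.capacity = capacity      # max number of decompressed clusters kept
        self.cache = OrderedDict()    # cluster id -> decompressed bytes (LRU order)
        self.misses = 0

    def get_node(self, index):
        cid, local = divmod(index, CLUSTER_SIZE)
        if cid in self.cache:
            self.cache.move_to_end(cid)           # mark as most recently used
        else:
            self.misses += 1
            blob = self.pool[self.offsets[cid]:self.offsets[cid + 1]]
            self.cache[cid] = zlib.decompress(blob)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)    # evict least recently used
        return NODE.unpack_from(self.cache[cid], local * NODE.size)
```

Because each cluster is compressed independently, fetching one node touches only the bytes of its own cluster, which is what makes the cluster an access point.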
Outline
Overview
Compression at preprocessing
Runtime data access framework
Results
Bounding Volume Hierarchies (BVHs)
A node of a BVH has:
Indices of child nodes
A bounding volume (BV)
We use the axis-aligned bounding box (AABB) and a binary BVH
Due to their simplicity and wide use
Can be easily extended to other types of BVs and k-ary BVHs
[Figure: a BVH node with its BV and the indices of its child nodes]
Layouts of BVHs
Layout: the order in which BV nodes are stored in memory
Affects the performance of applications
Our method preserves the original layouts
Supports any layout
Use cache-efficient layouts for the best performance [Yoon and Manocha 06]
[Figure: layout of a BVH]
Cluster-based Compression
Each cluster has:
A fixed number (e.g., 4K) of consecutive nodes in the layout
Compute clusters with cache-coherent nodes
Implicitly performed by using cache-coherent layouts of BVHs
[Figure: clusters C1, C2, C3]
Encoding Bounding Volumes
Quantize min and max extents of BVs
Predict child BVs using their parent BV
Use the simple median prediction
Encode prediction errors
Use a dictionary-based encoder for fast decompression
[Figure: parent BV, left BV, right BV, and the prediction error]
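The three steps above (quantize, predict, encode the error) can be illustrated with a small sketch. The slides do not spell out the predictor, so the code assumes one plausible reading of "median prediction": split the quantized parent box at the midpoint of its longest axis. The function names and the rounding rule (floor the min corner, ceil the max corner, so the quantized box always encloses the original) are likewise assumptions for illustration. The dictionary-based entropy coder is omitted.

```python
import math

BITS = 16                      # quantization bits per coordinate
LEVELS = (1 << BITS) - 1       # largest quantized value


def quantize_box(box, scene_min, scene_max):
    """Conservatively quantize an AABB to integer coordinates in [0, LEVELS]."""
    lo, hi = box
    qlo, qhi = [], []
    for a in range(3):
        scale = LEVELS / (scene_max[a] - scene_min[a])
        qlo.append(max(0, math.floor((lo[a] - scene_min[a]) * scale)))
        qhi.append(min(LEVELS, math.ceil((hi[a] - scene_min[a]) * scale)))
    return qlo, qhi


def predict_children(parent):
    """Assumed predictor: halve the parent box along its longest axis."""
    plo, phi = parent
    axis = max(range(3), key=lambda a: phi[a] - plo[a])
    mid = (plo[axis] + phi[axis]) // 2
    left_hi = list(phi)
    left_hi[axis] = mid
    right_lo = list(plo)
    right_lo[axis] = mid
    return (list(plo), left_hi), (right_lo, list(phi))


def residuals(actual, predicted):
    """Prediction errors: six small integers per child box."""
    (alo, ahi), (plo, phi) = actual, predicted
    return [x - y for x, y in zip(alo + ahi, plo + phi)]


def reconstruct(predicted, errors):
    """Decoder side: prediction plus error gives back the quantized box."""
    plo, phi = predicted
    values = [p + e for p, e in zip(plo + phi, errors)]
    return values[:3], values[3:]
```

When the prediction is close, the errors are small integers that repeat often, which is the situation in which a dictionary-based encoder tends to work well.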
Front-based Tree Structure Encoding
Maintain a front: the nodes that still have at least one uncompressed child
Average size of front: 13
4K nodes with cache-efficient layouts
Encode tree structures using only a few bits by connecting uncompressed nodes to a node in the front
[Figure: example BVH with nodes n0 to n14, marking compressed and not-yet-compressed nodes. The front holds n2, n3, n4, n5 at positions 0, 1, 2, 3]
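A simplified sketch of the front idea follows. It is an illustration under stated assumptions, not the encoder from the paper: nodes are visited in layout order, every parent precedes its children in that order, and each node is written as a pair (position of its parent in the front, leaf flag). The names `encode_structure` and `decode_structure` are hypothetical. With a front of about 13 entries, the position fits in roughly 4 bits before entropy coding.

```python
def encode_structure(parent, is_leaf):
    """Encode a binary tree given in layout order.

    parent[i] is the layout index of node i's parent (node 0 is the root).
    Returns one (front_position, leaf_flag) symbol per node.
    """
    front = []      # inner nodes that still lack at least one encoded child
    missing = {}    # inner node -> number of children not yet encoded
    symbols = []
    for node, leaf in enumerate(is_leaf):
        if node == 0:
            pos = None                       # the root has no parent
        else:
            pos = front.index(parent[node])  # where the parent sits in the front
            missing[parent[node]] -= 1
            if missing[parent[node]] == 0:
                front.pop(pos)               # both children known: leave the front
        symbols.append((pos, leaf))
        if not leaf:
            front.append(node)
            missing[node] = 2
    return symbols


def decode_structure(symbols):
    """Rebuild parent links and child lists by replaying the same front."""
    front, parent, children = [], [], {}
    for node, (pos, leaf) in enumerate(symbols):
        if pos is None:
            parent.append(-1)
        else:
            p = front[pos]
            parent.append(p)
            children[p].append(node)         # first arrival = left child (assumption)
            if len(children[p]) == 2:
                front.pop(pos)
        if not leaf:
            front.append(node)
            children[node] = []
    return parent, children
```

The decoder never needs absolute child indices: it replays the same front updates as the encoder, so a small front position is enough to attach each node.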
Outline
Overview
Compression at preprocessing
Runtime data access framework
Results
Various applications can transparently access our representation using the following API:
GetRootIndex (.)
GetBV (.)
IsLeaf (.)
GetLeftChildIndex (.)
GetRightChildIndex (.)
GetTriangleIndex (.)
[Diagram: Applications call the BVH Access API, which sits on top of main memory (cached data, offset table with entries 0 to n) and the compressed data pool (clusters c0 to cn) on the external drive; decompression happens behind the API]
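To show how an application might sit on top of these six calls, here is a hedged sketch of a point query. The slides elide the argument lists, so the signatures below (each call except `GetRootIndex` takes a node index) are an assumption, and `ToyBVH` is a hypothetical in-memory stand-in for the compressed representation.

```python
class ToyBVH:
    """Hypothetical uncompressed backend exposing the six API names."""

    def __init__(self, nodes):
        self.nodes = nodes  # list of dicts; leaves carry a "tri" entry

    def GetRootIndex(self):
        return 0

    def GetBV(self, i):
        return self.nodes[i]["bv"]  # (min corner, max corner)

    def IsLeaf(self, i):
        return "tri" in self.nodes[i]

    def GetLeftChildIndex(self, i):
        return self.nodes[i]["left"]

    def GetRightChildIndex(self, i):
        return self.nodes[i]["right"]

    def GetTriangleIndex(self, i):
        return self.nodes[i]["tri"]


def contains(bv, p):
    lo, hi = bv
    return all(lo[a] <= p[a] <= hi[a] for a in range(3))


def query_point(bvh, p):
    """Collect triangle indices of all leaves whose AABB contains point p."""
    hits = []
    stack = [bvh.GetRootIndex()]
    while stack:
        i = stack.pop()
        if not contains(bvh.GetBV(i), p):
            continue                      # prune this subtree
        if bvh.IsLeaf(i):
            hits.append(bvh.GetTriangleIndex(i))
        else:
            stack.append(bvh.GetRightChildIndex(i))
            stack.append(bvh.GetLeftChildIndex(i))  # left is visited first
    return hits
```

The traversal only uses the six calls, so swapping `ToyBVH` for a backend that decompresses clusters on demand would not change the application code.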
Outline
Overview
Compression at preprocessing
Runtime data access framework
Results
Compression Results
Model         Tri. (M)   Size (MB) of uncomp. BVH   Comp. ratio over uncomp. BVH   Compression bits per node
                                                                                   Tree     BV
St. Matthew   128        7,811                      9.6:1                          8.66     18.0
Iso surface   102        6,254                      8.8:1                          8.75     20.3
Lucy          28         1,684                      8.7:1                          8.70     20.7
Power plant   13         778                        12:1                           9.22     12.3
CAD turbine   1.8        108                        8.3:1                          9.19     21.6
16 bits quantization for BVs
4K nodes per cluster
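The reported ratios can be cross-checked against the bits-per-node columns. The sketch below assumes an uncompressed node of 32 bytes (256 bits); that size is not stated on the slides and is an assumption chosen because it reproduces the reported figures.

```python
UNCOMPRESSED_BITS_PER_NODE = 32 * 8  # assumption: 32-byte uncompressed node

# model -> (tree bits per node, BV bits per node), copied from the table
BITS_PER_NODE = {
    "St. Matthew": (8.66, 18.0),
    "Iso surface": (8.75, 20.3),
    "Lucy": (8.70, 20.7),
    "Power plant": (9.22, 12.3),
    "CAD turbine": (9.19, 21.6),
}


def compression_ratio(tree_bits, bv_bits):
    """Uncompressed bits per node divided by compressed bits per node."""
    return UNCOMPRESSED_BITS_PER_NODE / (tree_bits + bv_bits)


if __name__ == "__main__":
    for model, (tree_bits, bv_bits) in BITS_PER_NODE.items():
        print(f"{model}: {compression_ratio(tree_bits, bv_bits):.1f}:1")
```

For St. Matthew, for example, 256 / (8.66 + 18.0) is about 9.6, in line with the 9.6:1 in the table.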
Benchmark Applications
Ray tracing
Typically traverses large portions of BVHs
Collision detection
Accesses smaller and more localized portions of BVHs than ray tracing
Video for Benchmark Applications
Performance Improvement
Mainly due to higher I/O performance
Reduce the data size by using compression → reduce or remove the expensive disk I/O time
Use CPU power to efficiently decompress the data
Parallel Random Access
Ray tracing of St. Matthew model (128M tri.) using only primary rays
Scalability of Ray Tracing
[Figure: speedup versus number of threads (1 to 4). Values at 4 threads: Ideal, 4; Comp. Data, 3.6; Original, 0.8]
Limitation
Lowers performance (by up to 3%) for small models that fit in main memory
Overhead of identifying clusters
Less tight BVs due to conservative quantization
Summary & Conclusion
Low storage requirement
Achieve up to a 12:1 compression ratio
Improved performance with random accessibility
Achieve a 4:1 runtime performance improvement
Wide applicability
Allows various applications to access the compressed BVHs
High scalability
Achieve a 3.6:1 performance improvement when using 4 threads over a single thread
Future Work
Compact in-core representations
Apply to highly parallel architectures
GPU
Larrabee
Apply to levels-of-detail (LOD) hierarchies
Acknowledgments
Members of SGLab. in KAIST
Model contributors
Funding agencies
KAIST seed grant
Ministry of Knowledge Economy
Samsung
Microsoft Research Asia
Korea Research Foundation
Any Questions?
Q & A
Thank you!