Cache-Efficient Layouts of Bounding Volume Hierarchies (BVHs)

Post on 31-Dec-2015


Cache-Efficient Layouts of Bounding Volume Hierarchies (BVHs)

Sung-Eui Yoon

Lawrence Livermore National Laboratory

Dinesh Manocha

Univ. of North Carolina at Chapel Hill

2

Goal

● Compute cache-coherent layouts of bounding volume hierarchies (BVHs)
  ● For various geometric applications
  ● Handles any kind of BVHs and spatial partitioning hierarchies (e.g., kd-tree)

3

Bounding Volume Hierarchies (BVHs)

● Widely used data structures in:
  ● Ray tracing
  ● Collision detection
  ● Visibility culling

[Figures: ray tracing; dynamic simulation]

4

Bounding Volumes (BVs)

[Figure: triangles of a mesh]

● Axis-aligned bounding boxes (AABBs)
● Oriented bounding boxes (OBBs) [Gottschalk et al. 96]
● Spheres [Hubbard 93]
● Discrete orientation polytopes (k-DOPs) [Klosowski et al. 98]

5

Layout of BVHs

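● Nodes (and triangles) of BVHs are stored in arrays
● What is a good layout?
● How to compute cache-efficient layouts?

[Figure: a tree with root A, nodes B and C on the second level, and nodes D and E on the third level; a layout method maps it to a 1D layout of nodes: A B C D E]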
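As an illustration of how a layout method turns a tree into a 1D array order, the sketch below computes breadth-first and depth-first layouts of a five-node tree. The diagram does not say which node is the parent of D and E, so attaching them to B is an assumption, and the function names are hypothetical.

```python
from collections import deque

# Hypothetical tree; making D and E children of B is an assumption.
CHILDREN = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}

def breadth_first_layout(root, children):
    """Return nodes level by level, left to right."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children[node])
    return order

def depth_first_layout(root, children):
    """Return nodes in pre-order (parent before its subtrees)."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(children[node]))
    return order

bfl = breadth_first_layout("A", CHILDREN)  # A B C D E
dfl = depth_first_layout("A", CHILDREN)    # A B D E C
```

Under this assumed tree shape, the breadth-first order reproduces the 1D layout shown on the slide.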

6

Motivation

● Lower growth rate of data access speed

[Figure: growth rates during 1993 – 2004; values shown: 1.5X, 20X, 46X. Courtesy: http://www.hcibook.com/e3/online/moores-law/]

7

Memory Hierarchies and Caches

[Diagram: CPU, fast memory or cache, slow memory, and disk, connected by block transfers. Access times shown: 10^-6 sec, 10^-4 sec, 1 sec]
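One rough way to see why layout matters when data moves in blocks is to count how many distinct blocks a traversal touches. The helper below is hypothetical and is not the measurement used in the talk; it assumes fixed-size blocks measured in nodes, and the two example layouts assume a five-node tree in which D and E are children of B.

```python
def blocks_touched(layout, accesses, block_size):
    """Count the distinct blocks that hold the accessed nodes."""
    position = {node: index for index, node in enumerate(layout)}
    return len({position[node] // block_size for node in accesses})

breadth_first = ["A", "B", "C", "D", "E"]
depth_first = ["A", "B", "D", "E", "C"]
path = ["A", "B", "D"]  # a root-to-leaf traversal

bfl_blocks = blocks_touched(breadth_first, path, block_size=3)  # positions 0, 1, 3
dfl_blocks = blocks_touched(depth_first, path, block_size=3)    # positions 0, 1, 2
```

For the path A, C the comparison reverses (one block in the breadth-first order, two in the depth-first order), so neither fixed order is best for every query.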

8

Main Contributions

● An algorithm computing cache-efficient layouts of BVHs
  ● Probabilistic model
  ● Simple layout construction method
  ● Applicable to spatial partitioning hierarchies

9

Related Work

● Mesh layouts
● Layouts of search trees
● Layouts of BVHs

10

Related Work

● Mesh layouts
  ● Cache-coherent layouts of meshes and graphs [Yoon et al. 05, Yoon and Lindstrom 06]
● Layouts of search trees
● Layouts of BVHs

Require an input graph that represents access patterns on a BVH

11

Related Work

● Mesh layouts
● Layouts of search trees
  ● [Gil and Itai 99, Alstrup et al. 03]
● Layouts of BVHs

Require a probability function that each node will be accessed

12

Related Work

● Mesh layouts
● Layouts of search trees
● Layouts of BVHs
  ● Studied in collision detection [Ericson 04] and ray tracing [Havran 97]
  ● Blocking-based layouts [Terdiman 03, van Emde Boas 77]

13

Outline

● Probabilistic model
● Layout computation
● Results


15

Traversals of Collision Queries on BVHs

● Takes two objects
  ● Two 3D objects for collision detection
  ● One 3D object and one ray for ray tracing

[Figure: BVH1 and BVH2]
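A query between two objects is typically answered by traversing both hierarchies together and pruning node pairs whose bounding volumes do not overlap. The sketch below is a generic illustration rather than the talk's implementation: it assumes AABBs, a breadth-first queue, and a simple rule that descends the first hierarchy before the second. The `Node` class and all names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Node:
    lo: tuple                      # AABB minimum corner (x, y, z)
    hi: tuple                      # AABB maximum corner (x, y, z)
    children: list = field(default_factory=list)
    name: str = ""

def overlap(a, b):
    """True if the two AABBs overlap on every axis."""
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))

def collide(root1, root2):
    """Return (overlapping leaf pairs, every node pair that was tested)."""
    hits, visited = [], []
    queue = deque([(root1, root2)])
    while queue:
        a, b = queue.popleft()
        visited.append((a.name, b.name))
        if not overlap(a, b):
            continue                            # prune this pair
        if not a.children and not b.children:
            hits.append((a.name, b.name))       # leaf pair: candidate contact
        elif a.children:
            queue.extend((c, b) for c in a.children)
        else:
            queue.extend((a, c) for c in b.children)
    return hits, visited

bvh1 = Node((0, 0, 0), (4, 1, 1), name="A", children=[
    Node((0, 0, 0), (1, 1, 1), name="L1"),
    Node((3, 0, 0), (4, 1, 1), name="L2"),
])
bvh2 = Node((2.5, 0, 0), (5, 1, 1), name="R", children=[
    Node((2.5, 0, 0), (3.5, 1, 1), name="R1"),
    Node((4.5, 0, 0), (5, 1, 1), name="R2"),
])
hits, visited = collide(bvh1, bvh2)
```

The `visited` list is the access pattern that a layout would need to serve from as few blocks as possible.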

16

Two Localities

● Parent-child locality
● Spatial locality

17

Parent-child Locality

[Figure: BVH1 and BVH2 with nodes A and B and the node pair AB]

18

Spatial Locality

[Figure: BVH1 and BVH2 with nodes C, D, and E]

19

Probabilistic Model

● Quantify localities in a uniform way
  ● Measure the probability for localities
  ● Based on geometric relationships between bounding volumes

20

Probabilistic Model

● Two major factors
  ● Prob. that p is accessed
  ● Conditional prob. that p is also intersected given g is intersected
● Pr(n)
  ● Probability that a node, n, will be accessed during runtime traversal

$$\Pr(n) = \Pr(p)\,\Pr(X_p = 1 \mid X_g = 1),$$

where $X_p$ (or $X_g$) is a boolean random variable indicating collision between p (or g) and b

[Figure: nodes n, p, and g tested against b, with labels "Intersected" and "Accessed and intersected"]

21

Probability Computation

● $\Pr(X_p = 1 \mid X_g = 1)$: Conditional prob. that p is also intersected given g is intersected
● Do not know any information about b

[Figure: nodes n, p, and g tested against b, with "Intersected" labels]

22

Contact Space

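● Contact space of b against p and g
  ● Denoted as $S_p$ and $S_g$

$$\Pr(X_p = 1 \mid X_g = 1) = \frac{\mathrm{Vol}(S_p \cap S_g)}{\mathrm{Vol}(S_g)}$$

[Figure: nodes n, p, and g tested against b, with the contact spaces $S_p$, $S_g$, and $S_p \cap S_g$]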

23

Contact Space

● Assume b is a sphere
  ● Computed from Minkowski sum
● Configuration space, in general
  ● Too expensive to compute

[Figure: b against the contact spaces $S_p$, $S_g$, and $S_p \cap S_g$, shown for two cases]
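When b is a sphere of radius r, the contact space against a bounding volume is the Minkowski sum of that volume with the sphere, and for a box its volume has a closed form (the Steiner formula). The sketch below evaluates the volume ratio under two assumptions that are not stated in the slides: the bounding volumes are axis-aligned boxes, and p lies inside g, so the intersection of the two contact spaces is simply the contact space of p. Function names are hypothetical.

```python
import math

def inflated_box_volume(extents, r):
    """Volume of the Minkowski sum of a box (side lengths a, b, c) and a ball of radius r."""
    a, b, c = extents
    volume = a * b * c
    area = 2.0 * (a * b + b * c + c * a)
    edge_sum = a + b + c
    # Box + slabs over the faces + quarter-cylinders along the edges + ball octants at the corners.
    return volume + area * r + math.pi * edge_sum * r**2 + (4.0 / 3.0) * math.pi * r**3

def conditional_probability(p_extents, g_extents, r):
    """Vol(S_p) / Vol(S_g), assuming p is contained in g."""
    return inflated_box_volume(p_extents, r) / inflated_box_volume(g_extents, r)

ratio_point = conditional_probability((1, 1, 1), (2, 2, 2), r=0.0)   # 1/8, plain volume ratio
ratio_sphere = conditional_probability((1, 1, 1), (2, 2, 2), r=1.0)  # larger than 1/8
```

With r = 0 the ratio reduces to the plain volume ratio, and it grows toward 1 as the sphere gets larger relative to the boxes.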

24

Approximate Probability Computation

● Assumes "b" to be a point, a degenerate case
  ● Exact value is not required
  ● Only 5% incorrect decisions compared to considering many other cases
● Surface area heuristics (SAH) [MacDonald and Booth 90, Havran 00]
  ● Equivalent to our approximation
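If b is a point, the contact space of b against a bounding volume is the bounding volume itself, so the conditional probability becomes a ratio of box volumes. The sketch below assumes axis-aligned boxes given as (min corner, max corner) pairs and then applies the product form of Pr(n) from the probabilistic model. The function names and the sample value Pr(p) = 0.8 are hypothetical.

```python
def box_volume(lo, hi):
    """Volume of an axis-aligned box; zero if the box is empty."""
    volume = 1.0
    for low, high in zip(lo, hi):
        volume *= max(0.0, high - low)
    return volume

def intersection_volume(box_a, box_b):
    """Volume of the overlap of two axis-aligned boxes."""
    lo = tuple(max(a, b) for a, b in zip(box_a[0], box_b[0]))
    hi = tuple(min(a, b) for a, b in zip(box_a[1], box_b[1]))
    return box_volume(lo, hi)

def conditional_probability(p_box, g_box):
    """Pr(X_p = 1 | X_g = 1) when b is a point: Vol(p ∩ g) / Vol(g)."""
    return intersection_volume(p_box, g_box) / box_volume(*g_box)

def access_probability(pr_p, p_box, g_box):
    """Pr(n) = Pr(p) * Pr(X_p = 1 | X_g = 1)."""
    return pr_p * conditional_probability(p_box, g_box)

g_box = ((0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
p_box = ((0.0, 0.0, 0.0), (1.0, 2.0, 2.0))   # p fills half of g
pr_n = access_probability(0.8, p_box, g_box)
```

The sketch uses volumes because it follows the Vol(...) ratio of the contact-space definition; a surface-area variant would replace `box_volume` with a surface-area function.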

25

Outline

● Probabilistic model
● Layout computation
● Results

26

Overview of Layout Algorithm

● Cache-oblivious layout computation
  ● Does not assume any particular cache block size
  ● Designed to work well with various (geometric) block sizes [Yoon and Lindstrom 06]
● Two main steps in recursion
  ● Cluster construction w/ parent-child locality
  ● Layout clusters w/ spatial locality

27

Clustering

● Minimize the working set size during collision queries
  ● Maximize the sum of probabilities of nodes in a cluster
  ● NP-complete even for cache-aware layout given a search query [Gil and Itai 99]

28

Greedy Clustering

● Employ top-down greedy clustering
  ● Compute balanced sized clusters
  ● Maintain convexity [Gil and Itai 99]

[Figure: a cluster over tree nodes annotated with probabilities 0.9, 0.5, 0.8, and 0.1]
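One plausible reading of top-down greedy clustering is to start at the root and repeatedly add the highest-probability node whose parent is already in the cluster, which keeps the cluster a connected subtree. The sketch below follows that reading only; it does not balance cluster sizes, and the tree shape, the `max_size` parameter, and the function name are assumptions. The probabilities reuse the values shown in the figure.

```python
import heapq

def greedy_cluster(root, children, prob, max_size):
    """Grow a cluster from the root, always taking the most probable frontier node."""
    cluster = []
    frontier = [(-prob[root], root)]          # max-heap via negated probabilities
    while frontier and len(cluster) < max_size:
        _, node = heapq.heappop(frontier)
        cluster.append(node)
        # Children become eligible only after their parent joins the cluster.
        for child in children.get(node, []):
            heapq.heappush(frontier, (-prob[child], child))
    return cluster

# Hypothetical tree: root r with children a and b; a has children c and d.
children = {"r": ["a", "b"], "a": ["c", "d"]}
prob = {"r": 1.0, "a": 0.9, "b": 0.5, "c": 0.8, "d": 0.1}

cluster = greedy_cluster("r", children, prob, max_size=3)
```

With a cluster size of 3 the result is the root, the 0.9 node, and the 0.8 node, so the 0.5 sibling is left for another cluster even though it is closer to the root.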

29

Layout of Clusters

● Uses cache-oblivious layouts of meshes [Yoon et al. 05]

[Figure: spatial locality]


31

Outline

● Probabilistic model
● Layout computation
● Results

32

Results

● Collision detection
  ● Use oriented bounding box (OBB) [Gottschalk et al. 96]
  ● Breadth-first tree traversal
● Ray tracing
  ● Use kd-tree [Wald 04]
  ● Depth-first tree traversal

33

Collision Detection – Robot and Power Plant Models

[Figures: robot model (20k triangles); power plant model (1M triangles)]

34

Collision Detection – Performance Comparison I

[Bar chart: working set size (KB) and collision time (ms/100) for different layouts: COLBVH (our cache-oblivious layout), VEB (van Emde Boas layout), BFL (breadth-first layout), COML (cache-oblivious mesh layout), DFL (depth-first layout)]

41% ~ 500% performance improvement

35

Collision Detection – Performance Comparison II

[Bar chart: working set size (KB) and collision time (ms/100) for different layouts: COLBVH (our layout), VEB (van Emde Boas layout), BFL (breadth-first layout), COML (cache-oblivious mesh layout), DFL (depth-first layout)]

35% ~ 2600% performance improvement

36

Cache-Oblivious Layout vs Cache-Aware Layout

● Cache-aware layouts
  ● Take advantage of block size information (4KB)
● Minor performance degradation
  ● 8% compared to cache-aware layouts

37

Ray Tracing – Lucy Model

28 million triangles; Pentium IV with 1GB

38

Ray Tracing – Performance Comparison

[Bar chart: working set size (MB) and render time (sec) for different layouts: COLBVH (our layout), VEB (van Emde Boas layout), BFL (breadth-first layout), DFL (depth-first layout)]

77% ~ 180% performance improvement

39

Major Differences over Other Layouts

● Commonly used layouts
  ● Consider connectivity of trees
● Two improvements of our layouts
  ● Probabilistic model based on geometry
  ● Layout method considering two different localities

40

Limitations

● No guarantee that our layout always improves the performance

● May not improve the performance of computationally intensive queries (e.g., exact penetration depth computation)

● Assumes that collision algorithm does not use front tracking

41

Advantages

● Generality
  ● Works with any geometric hierarchies
  ● Does not require cache parameters
● Usability
  ● Can gain performance improvement without modifying code
  ● Replaces only data layouts

42

Conclusion

● Cache-efficient layouts of BVHs
  ● Probabilistic model
  ● Simple layout construction method
  ● Applied to collision detection and ray tracing

43

Ongoing and Future Work

● Extend to other proximity and LOD queries [Yoon et al. 06]

● Investigate other geometric hierarchies

● Improve the quality of hierarchies

● Apply to deforming models [Lauterbach et al. 06]

44

Acknowledgements

● Model contributors
● Funding agencies
  ● Army Research Office
  ● DARPA
  ● Intel
  ● Lawrence Livermore National Laboratory
  ● Microsoft
  ● National Science Foundation
  ● Office of Naval Research
  ● RDECOM

45

Acknowledgements

● Russ Gayle
● Ted Kim
● Ming Lin
● Peter Lindstrom
● Brandon Lloyd
● Valerio Pascucci
● Stephane Redon
● LLNL data analysis group members
● Anonymous reviewers

46

Questions?

Thanks!

47

UCRL-PRES-223220

This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-ENG-48.

Note: this talk is not supported or sanctioned by DoE, UC, LLNL, CASC

48

Additional slides

49

BVHs of Massive Models

● Complex and massive models

● High memory requirement
  ● Can have gigabyte data size

[Figures: Double eagle tanker (82M triangles); St. Matthew (372M); Isosurface (472M)]

50

Memory Hierarchies

Level          Size     Speed
Register       1KB      10^0 ns
Caches         1MB      10^1 ns
Main memory    1GB      10^2 ns
Disk storage   > 1GB    10^4 ns

51

Mesh Layouts

● Rendering sequences
  ● Triangle strips
  ● [Deering 95, Hoppe 99, Bogomjakov and Gotsman 02]
● Processing sequences
  ● [Isenburg and Gumhold 03, Isenburg and Lindstrom 04]

Assume that access pattern globally follows the layout order!

52

Mesh Layouts

● Cache-aware and cache-oblivious layouts of meshes and graphs [Yoon et al. 05, Yoon and Lindstrom 06]

Require an input graph that represents access patterns on a BVH

53

Layouts of Search Trees

● Cache-aware layout of search tree [Gil and Itai 99]

● Cache-oblivious search tree layout [Alstrup et al. 03]

Require a probability function that each node is accessed

54

Layouts of BVHs

● Real-Time Collision Detection book [Ericson 04]
● Layout analysis in ray tracing [Havran 97]
● Opcode [Terdiman 03]
  ● Uses blocking
● van Emde Boas layout [van Emde Boas 77]
  ● Uses recursive blocking
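The recursive blocking of the van Emde Boas layout can be sketched for a complete binary tree stored with heap indices (root 1, children 2i and 2i + 1). The tree is cut at roughly half its height, the top part is laid out first, and each bottom subtree follows, all recursively. Splitting the height with floor division is one common convention and is an assumption here, as are the function and parameter names.

```python
def veb_layout(root, height):
    """Return heap indices of a complete binary tree of `height` levels in van Emde Boas order."""
    if height == 1:
        return [root]
    top = height // 2            # levels in the top subtree (assumed floor split)
    bottom = height - top        # levels in each bottom subtree
    order = veb_layout(root, top)
    # Roots of the bottom subtrees are the descendants of `root` at depth `top`.
    first = root << top
    for subtree_root in range(first, first + (1 << top)):
        order.extend(veb_layout(subtree_root, bottom))
    return order

layout_h3 = veb_layout(1, 3)   # 7 nodes
layout_h4 = veb_layout(1, 4)   # 15 nodes
```

For four levels this yields 1 2 3, then 4 8 9, 5 10 11, 6 12 13, 7 14 15, so each small subtree occupies consecutive array positions regardless of the block size.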
