Ray Tracing and Photon Mapping on GPUs Tim PurcellStanford / NVIDIA.
-
Upload
frederica-hensley -
Category
Documents
-
view
249 -
download
4
Transcript of Ray Tracing and Photon Mapping on GPUs Tim PurcellStanford / NVIDIA.
Ray Tracing and Photon Ray Tracing and Photon Mapping on GPUsMapping on GPUs
Tim Purcell Stanford / NVIDIA
Small Sampling of GI on GPUsSmall Sampling of GI on GPUs
• Much more detail in the included papers
• Lots of other ‘global illumination on GPUs’ in the literature– The Ray Engine [Carr et al. 2002]– GPU Algorithms for Radiosity and Subsurface Scattering
[Carr et al. 2003]– Radiosity on Graphics Hardware [Coombe et al. 2004]– Lots and lots of shadow papers…
Subsurface ScatteringSubsurface Scattering
GPU Algorithms for Radiosity and Subsurface Scattering [Carr et al. 2003]
Ray TracingRay Tracing
SpecularSpecular
DiffuseDiffuse
DiffuseDiffuse
PP
TT
TT
SS SS
SSOccluderOccluder
Point LightPoint Light
R
MaterialMaterial
MaterialMaterial
MaterialMaterial
CameraCamera
Implementation OptionsImplementation Options
• GPU as a ray-triangle intersection engine [Carr et al. 2002]
– Rays and geometry streamed to GPU– Intersection calculation results read back– Acceleration structure traversal done on host CPU
• GPU as a ray tracing engine [Purcell et al. 2002]
– Scene geometry and acceleration structure stored on GPU– GPU performs ray generation, acceleration structure traversal,
intersection, and shading– Host provides camera info
Streaming Ray TracerStreaming Ray TracerGenerate Generate Eye RaysEye Rays
Traverse Traverse Acceleration Acceleration
StructureStructure
Intersect Intersect TrianglesTriangles
Shade Hits Shade Hits and and
Generate Generate Shading Shading
RaysRays
CamerCameraa
GridGrid
TriangleTriangless
MaterialMaterialss
Techniques UsedTechniques Used
• Data structure navigation– Texture memory stores data structures– Dependent texture fetches walk through data
• Flow control– Kernel binding based on occlusion query results– Efficient selective execution of kernels using early-z occlusion
culling– Difficulty in flow control disappearing with newest graphics
cards• PS 3.0
Texture Memory OrganizationTexture Memory Organization
xyzxyz xyzxyz xyzxyz xyzxyz xyzxyz xyzxyz …… xyzxyz
00 33 1111 3838 …… 564564
00 33 11 33 77 2121 216216 ……
xyzxyz xyzxyz xyzxyz xyzxyz xyzxyz xyzxyz …… xyzxyz
xyzxyz xyzxyz xyzxyz xyzxyz xyzxyz xyzxyz …… xyzxyz
Uniform GridUniform Grid3D Luminance 3D Luminance
TextureTexture
Triangle ListTriangle List1D Luminance1D Luminance
Texture Texture
TrianglesTriangles3x 1D RGB 3x 1D RGB
TexturesTextures
vox0vox0 vox1vox1 vox2vox2 vox3vox3 vox4vox4 vox5vox5 voxMvoxM
vox0vox0 vox2vox2
tri0tri0 tri1tri1 tri2tri2 tri3tri3 tri4tri4 tri5tri5 triNtriN
v0v0
v1v1
v2v2
Efficient Selective ExecutionEfficient Selective Execution
• Rendering giant screen filling quad not ideal
– Not all pixels need to process every rendering pass
• Proposed low-overhead early fragment kill
– Computation mask– Controllable early-Z
occlusion culling
• Trade computation for bandwidth
Original System ImplementationOriginal System Implementation
• ATI Radeon 9700 Pro (R300)
• ATI Fragment Program
Performance ResultsPerformance Results
• Radeon 9700 Pro– 100M ray-triangle intersections/s– 300K to 4.0M rays/s– Between 3 – 12 fps @ 256x256 pixels
• CPU implementation– 20M intersections/s P3 800 MHz [Wald et al. 2001]– 800K to 7.1M ray/s 2.5 GHz P4 [Wald et al. 2003]
• With simple shading: 1.8M to 2.3M rays/s
Photon MappingAlgorithm ReviewPhoton MappingAlgorithm Review• Photon tracing
– Emission, scattering, storing into k-d tree
– Similar to ray tracing
• Rendering
– Ray tracing for direct illumination
– Photon map visualization• Indirect bounce
Computational Challenge for GPUs #1Computational Challenge for GPUs #1
• Constructing a irregular or sparse data structure
Computational Challenge for GPUs #2Computational Challenge for GPUs #2
• Adaptive nearest neighbor search
– Noise vs. blur
Computational Challenge for GPUs #2Computational Challenge for GPUs #2
• Adaptive nearest neighbor search
– Noise vs. blur
Scatter on the GPUScatter on the GPU
• Sort photons into grid cells– Grid cell is sort key
• Two solutions– Simulate scatter with fragment programs
• Bitonic merge sort followed by binary search• Multiple rendering passes
– Vertex program with stencil buffer• Fixed number of photons per grid cell• Single rendering pass
Adaptive Nearest Neighbor SearchAdaptive Nearest Neighbor Search
• Iterative algorithm
• Accept or reject photons in cell visit order– No priority queue!– kNN-grid
Original System ImplementationOriginal System Implementation
• NVIDIA GeForce FX 5900 Ultra (NV35)
• Cg compiler 1.1
TracePhoton
s
BuildPhoton
Map
RayTraceScene
ComputeRadianceEstimate
Compute Lighting Render Image