![Page 1: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/1.jpg)
coherent ray tracing via stream filtering
christiaan gribble, karthik ramani
ieee/eurographics symposium on interactive ray tracing
august 2008
![Page 2: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/2.jpg)
acknowledgements
• early implementation
  – andrew kensler (utah)
  – ingo wald (intel)
  – solomon boulos (stanford)
• other contributors
  – steve parker & pete shirley (nvidia)
  – al davis & erik brunvand (utah)
![Page 3: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/3.jpg)
wide SIMD environments
• ray packets → SIMD processing
• increasing SIMD widths
  – current GPUs
  – intel’s larrabee
  – future processors
how to exploit wide SIMD units for fast ray tracing?
![Page 4: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/4.jpg)
stream filtering
• recast ray tracing algorithm
  – series of filter operations
  – applied to arbitrarily-sized groups of rays
• apply filters throughout rendering
  – eliminate inactive rays
  – improve SIMD efficiency
  – achieve interactive performance
![Page 5: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/5.jpg)
core concepts
• ray streams
  – groups of rays
  – arbitrary size
  – arbitrary order
• stream filters
  – set of conditional statements
  – executed across stream elements
  – extract only rays with certain properties
![Page 6: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/6.jpg)
core concepts
input stream: a b c d e f (each letter is a stream element)
test: conditional statement(s)
out_stream filter<test>(in_stream)
{
  foreach e in in_stream
    if (test(e) == true)
      out_stream.push(e)
  return out_stream
}
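The filter pseudocode on this slide can be sketched in Python as follows. This is only an illustration: the `(name, t_hit)` ray records and the "did the ray hit anything" test are hypothetical stand-ins, not something taken from the slides.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def stream_filter(in_stream: Iterable[T], test: Callable[[T], bool]) -> List[T]:
    """Return the elements of in_stream for which test(e) is true, in order."""
    out_stream: List[T] = []
    for e in in_stream:
        if test(e):
            out_stream.append(e)
    return out_stream


# Hypothetical ray records: (name, t_hit). A t_hit of None means "no hit".
Ray = Tuple[str, Optional[float]]
rays: List[Ray] = [
    ("a", 1.5), ("b", 0.7), ("c", None),
    ("d", 2.0), ("e", None), ("f", 0.2),
]

# Keep only the rays that hit something.
hits = stream_filter(rays, lambda r: r[1] is not None)
hit_names = [name for name, _ in hits]
```

Any predicate can be passed as `test`, which is what lets the same routine serve different stages of rendering.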
![Page 7: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/7.jpg)
SIMD filtering
• process stream in groups of N elements
• two steps
  – N-wide groups → boolean mask
  – boolean mask → partitioned stream
![Page 8: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/8.jpg)
SIMD filtering: step one
input stream: a b c d e f
test → boolean mask
![Page 9: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/9.jpg)
SIMD filtering: step one
input stream: a b c d e f
test → boolean mask (first group, a b c: t t f)
![Page 10: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/10.jpg)
SIMD filtering: step one
input stream: a b c d e f
test → boolean mask (second group, d e f; mask so far: t t f t f t)
![Page 11: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/11.jpg)
SIMD filtering
input stream: a b c d e f
test → boolean mask: t t f t f t
![Page 12: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/12.jpg)
SIMD filtering
input stream: a b c d e f
test → boolean mask: t t f t f t
partition → output stream: a b d f e c
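The two steps (mask generation over N-wide groups, then partitioning) can be sketched in plain Python. This is a scalar emulation under stated assumptions: real hardware would evaluate each group with one N-wide SIMD operation, and the names `simd_filter` and `n` are illustrative. The order of the inactive tail is an arbitrary choice in this sketch.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def simd_filter(
    stream: Sequence[T], test: Callable[[T], bool], n: int
) -> Tuple[List[T], List[bool], int]:
    """Emulate a two-step SIMD filter.

    Returns (partitioned stream, boolean mask, number of active elements).
    """
    # Step one: walk the stream in groups of n elements and build the mask.
    # Each group stands in for one n-wide SIMD evaluation of the test.
    mask: List[bool] = []
    for start in range(0, len(stream), n):
        group = stream[start:start + n]
        mask.extend(test(e) for e in group)

    # Step two: partition the stream so passing elements come first.
    passing = [e for e, m in zip(stream, mask) if m]
    failing = [e for e, m in zip(stream, mask) if not m]
    return passing + failing, mask, len(passing)


# Reproduce the slide's example: elements a, b, d, f pass the test.
active = {"a", "b", "d", "f"}
out_stream, mask, n_active = simd_filter(list("abcdef"), lambda e: e in active, n=3)
```

After the call, the first `n_active` entries of `out_stream` are the active elements, so later operations can ignore the tail.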
![Page 13: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/13.jpg)
hardware requirements
• wide SIMD ops (N > 4)
• scatter/gather memory ops
• partition op
![Page 14: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/14.jpg)
key characteristics
• all rays requiring same sequence of ops will always perform those ops together
  – independent of execution path
  – independent of order within stream
• coherence defined by ensuing ops
  – no guessing with heuristics
  – adapts to geometry, etc.
![Page 15: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/15.jpg)
![Page 16: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/16.jpg)
![Page 17: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/17.jpg)
application to ray tracing
• recast ray tracing algorithm as a sequence of filter operations
• possible to use filters in all three major stages of ray tracing
  – traversal
  – intersection
  – shading
![Page 18: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/18.jpg)
filter stacks
• sequence of stream filters
  – extract certain rays for processing
  – ignore others, process later
  – implicit or explicit
• traversal → implicit filter stack
• shading → explicit filter stack
![Page 19: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/19.jpg)
traversal: filter against node (drop inactive rays)
input stream: a b c d e f
current node: x (diagram also shows nodes y and z)
stack: w (0, 5), …
output stream: a b d f e c
![Page 20: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/20.jpg)
traversal: push back child
input stream: a b c d e f
current node: x (diagram also shows nodes y and z)
stack: y (0, 3), w (0, 5), …
output stream: a b d f
![Page 21: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/21.jpg)
traversal: push front child
input stream: a b c d e f
current node: x (diagram also shows nodes y and z)
stack: z (0, 3), y (0, 3), w (0, 5), …
output stream: a b d f
![Page 22: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/22.jpg)
traversal: continue to next traversal step
input stream: a b c d e f
current node: x (diagram also shows nodes y and z)
stack: z (0, 3), y (0, 3), w (0, 5), …
output stream: a b d f
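The traversal steps (filter against node, push back child, push front child, continue) can be sketched on a toy problem. Everything concrete here is an assumption made for illustration: "rays" are 1D points, each node bounds an interval, and the `Node` layout and function names are hypothetical. The point is only that each stack entry carries an active count, so the partitioned prefix of the stream acts as the implicit filter stack.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    name: str
    lo: float
    hi: float
    front: Optional["Node"] = None
    back: Optional["Node"] = None


def filter_against_node(rays: List[float], n_active: int, node: Node) -> int:
    """Partition the first n_active rays so those inside node come first."""
    head = rays[:n_active]
    passing = [r for r in head if node.lo <= r <= node.hi]
    failing = [r for r in head if not (node.lo <= r <= node.hi)]
    rays[:n_active] = passing + failing  # permutes only within the prefix
    return len(passing)


def traverse(root: Node, rays: List[float]) -> List[Tuple[str, List[float]]]:
    """Return (leaf name, sorted rays reaching that leaf) in visit order."""
    visited: List[Tuple[str, List[float]]] = []
    stack: List[Tuple[Node, int]] = [(root, len(rays))]
    while stack:
        node, n_active = stack.pop()
        n_active = filter_against_node(rays, n_active, node)
        if n_active == 0:
            continue  # no ray is active in this subtree
        if node.front is None and node.back is None:
            visited.append((node.name, sorted(rays[:n_active])))
            continue
        # Push the back child first so the front child is popped first.
        if node.back is not None:
            stack.append((node.back, n_active))
        if node.front is not None:
            stack.append((node.front, n_active))
    return visited


y = Node("y", 0.0, 5.0)
z = Node("z", 5.0, 10.0)
x = Node("x", 0.0, 10.0, front=y, back=z)
result = traverse(x, [1.0, 7.0, 12.0, 4.0, 9.0, 5.0])
```

Filtering a child only reorders rays inside the parent's active prefix, so the count stored with the sibling entry still describes the same set of rays when that entry is popped.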
![Page 23: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/23.jpg)
intersection
• explicit filter stacks
  – decompose test into sequence of filters
    • sequence of barycentric coordinate tests
    • …
  – too little coherence to necessitate additional filter ops
• simply apply test in N-wide SIMD fashion
![Page 24: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/24.jpg)
shading
• explicit filter stacks
  – extract & process elements
    • shadow rays for explicit direct lighting
    • rays that miss geometry
    • rays whose children sample direct illumination
    • …
  – streams are quite long
  – filter stacks are used to good effect
• shading achieves highest utilization
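An explicit filter stack for shading might look like the sketch below: an ordered list of named filters, each extracting the rays it matches and leaving the rest for later filters. The ray fields (`kind`, `hit`, `material`) and the three filters are hypothetical choices that loosely follow the bullet list above, not the authors' shader.

```python
from typing import Any, Callable, Dict, List, Tuple

Ray = Dict[str, Any]
Filter = Tuple[str, Callable[[Ray], bool]]


def apply_filter_stack(stream: List[Ray], stack: List[Filter]) -> Dict[str, List[Ray]]:
    """Apply filters in order; each extracts its matches from what remains."""
    extracted: Dict[str, List[Ray]] = {}
    remaining = list(stream)
    for label, test in stack:
        extracted[label] = [r for r in remaining if test(r)]
        remaining = [r for r in remaining if not test(r)]
    extracted["unmatched"] = remaining
    return extracted


# Hypothetical shading stream.
stream: List[Ray] = [
    {"id": 0, "kind": "camera", "hit": True, "material": "lambertian"},
    {"id": 1, "kind": "camera", "hit": False, "material": None},
    {"id": 2, "kind": "shadow", "hit": True, "material": None},
    {"id": 3, "kind": "camera", "hit": True, "material": "dielectric"},
    {"id": 4, "kind": "shadow", "hit": False, "material": None},
]

# Hypothetical filter stack, applied top to bottom.
shading_stack: List[Filter] = [
    ("shadow", lambda r: r["kind"] == "shadow"),
    ("miss", lambda r: not r["hit"]),
    ("sample_direct", lambda r: r["material"] == "lambertian"),
]

groups = apply_filter_stack(stream, shading_stack)
ids = {label: [r["id"] for r in rays] for label, rays in groups.items()}
```

Each extracted group holds only rays that need the same shading work, which is the property that would let a group be processed N elements at a time.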
![Page 25: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/25.jpg)
algorithm – summary
• general & flexible
• supports parallel execution
  – process only active elements
  – yields highest possible efficiency
  – adapts to geometry, etc.
• incurs low overhead
![Page 26: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/26.jpg)
hardware simulation
• why a custom core?
  – skeptical that algorithm could perform interactively
  – provides upper bound on expected performance
  – explore parameter space more easily
• if successful, implement for available architectures
![Page 27: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/27.jpg)
simulator highlights
• cycle-accurate
  – models stalls & data dependencies
  – models contention for components
• conservative
  – could be synthesized at 1 GHz @ 135 nm
  – we assume 500 MHz @ 90 nm
• additional details available in companion papers
![Page 28: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/28.jpg)
key questions
• does sufficient coherence exist to use wide SIMD units efficiently?
  – focus on SIMD utilization
• is interactive performance achievable with a custom core?
  – initial exploration of design space
![Page 29: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/29.jpg)
![Page 30: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/30.jpg)
![Page 31: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/31.jpg)
rendering
• monte carlo path tracing
  – explicit direct lighting
  – glossy, dielectric, & lambertian materials
  – depth-of-field effects
• tile-based, breadth-first rendering
![Page 32: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/32.jpg)
experimental setup
• 1024x1024 images
• stream size 1K or 4K rays
  – 1 spp → 32x32 or 64x64 pixels/tile
  – 64 spp → 4x4 or 8x8 pixels/tile
• per-frame stats
  – O(100s millions) rays/frame
  – O(100s millions) traversal ops
  – O(10s millions) intersection ops
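The tile sizes follow from the stream sizes: a stream holds one ray per sample for every pixel of a tile, so stream size = spp × tile width × tile height. A quick check of that arithmetic, assuming square tiles that evenly divide the 1024x1024 image (the variable names are illustrative):

```python
IMAGE_SIDE = 1024  # pixels per image side, from the slide

# (samples per pixel, tile side in pixels) for the four configurations listed.
configs = [(1, 32), (1, 64), (64, 4), (64, 8)]

# Initial stream size: one ray per sample per pixel in the tile.
stream_sizes = {(spp, side): spp * side * side for spp, side in configs}

# Number of tiles needed to cover the image at each tile size.
tiles_per_image = {side: (IMAGE_SIDE // side) ** 2 for _, side in configs}

for (spp, side), size in stream_sizes.items():
    print(f"{spp:>2} spp, {side}x{side} tile: {size} rays/stream, "
          f"{tiles_per_image[side]} tiles/image")
```

Both sample rates land on the same two stream sizes, 1024 (1K) and 4096 (4K) rays.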
![Page 33: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/33.jpg)
test scenes
• high geometric & illumination complexity
• representative of common scenarios
scenes shown: rtrt, conf, kala
![Page 34: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/34.jpg)
predicted performance
kala – frame rate (frames per second) by SIMD width

| | N = 8 | N = 12 | N = 16 |
|---|---|---|---|
| 32x32 streams | 6.73 | 11.78 | 13.34 |
| 64x64 streams | 8.34 | 13.45 | 15.65 |
![Page 35: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/35.jpg)
results – summary
• achieve high utilization
  – as high as 97%
  – SIMD widths of up to 16 elements
  – utilization increases with stream size
• achieve interactive performance
  – 15-25 fps
  – performance increases with stream size
  – currently requires custom core
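One way to see why utilization would rise with stream size is to count SIMD lanes. The slides do not define the metric, so the definition below (useful lanes divided by issued lanes, with each operation rounded up to whole N-wide groups) is an assumption for illustration, and `simd_utilization` is a hypothetical helper.

```python
import math
from typing import Sequence


def simd_utilization(active_counts: Sequence[int], n: int) -> float:
    """Fraction of issued SIMD lanes that do useful work.

    active_counts[i] is the number of active elements handled by the i-th
    operation; each operation is issued as ceil(count / n) groups of n lanes.
    """
    issued = sum(math.ceil(c / n) * n for c in active_counts)
    useful = sum(active_counts)
    return useful / issued if issued else 0.0


# A long filtered stream wastes at most part of its final group...
long_stream = simd_utilization([1000], n=16)
# ...while a short stream can waste a large share of its only group.
short_stream = simd_utilization([10], n=16)
```

Under this definition a filtered stream of 1000 active rays at N = 16 issues 63 groups (1008 lanes), while a stream of 10 rays fills only 10 of 16 lanes.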
![Page 36: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/36.jpg)
limitations – parallelism
• too few common ops → no improvement in utilization
• possible remedies
  – longer ray streams
  – parallel traversal
![Page 37: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/37.jpg)
limitations – hw support
• conventional cpus
  – narrow SIMD (4-wide SSE & altivec)
  – limited support for scatter/gather ops
  – partition op → software implementation
• possible remedies
  – custom core
  – current GPUs
  – time
![Page 38: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/38.jpg)
conclusions
• new approach to coherent ray tracing
  – process arbitrarily-sized groups of rays in SIMD fashion with high utilization
  – eliminate inactive elements, process only active rays
• stream filtering provides
  – sufficient coherence for wider-than-four SIMD processing
  – interactive performance with custom core
![Page 39: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/39.jpg)
future work
• additional hw simulation
  – parameter tuning
  – homogeneous multicore
  – heterogeneous multicore
  – …
• improved GPU-based implementation
• implementations for future processors
![Page 40: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/40.jpg)
(more) acknowledgements
• temple of kalabsha
  – veronica sundstedt
  – patrick ledda
  – other members of the university of bristol computer graphics group
• financial support
  – swezey scientific instrumentation fund
  – utah graduate research fellowship
  – nsf grants 0541009 & 0430063
![Page 41: Coherent ray tracing via stream filtering christiaan gribble karthik ramani ieee/eurographics symposium on interactive ray tracing august 2008.](https://reader036.fdocuments.us/reader036/viewer/2022062515/56649c9d5503460f9495ca5b/html5/thumbnails/41.jpg)