Deferred Rendering for Current and Future Rendering Pipelines

download Deferred Rendering for Current and  Future Rendering Pipelines

of 34

Transcript of Deferred Rendering for Current and Future Rendering Pipelines

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    1/34

    Beyond Programmable Shad

    ACM SIGG

    Deferred Rendering for Current and

    Future Rendering Pipelines

    Andrew LauritzenAdvanced Rendering Technology (ART)

    Intel Corporation

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    2/34

    Overview

    Forward shading Deferred shading and lighting

    Tile-based deferred shading

    Deferred multi-sample anti-aliasing (MSA

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    3/34

    Forward Shading

    Do everything we need to shade a pixel for each light

    Shadow attenuation (sampling shadow maps)

    Distance attenuation

    Evaluate lighting and accumulate

    Multi-pass requires resubmitting scene g

    Not a scalable solution

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    4/34

    Forward Shading Problems

    Ineffective light culling Object space at best

    Trade-off with shader permutations/batching

    Memory footprint of all inputs

    Everything must be resident at the same time

    Shading small triangles is inefficient

    Covered earlier in this course: [Fatahalian 20

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    5/34

    Conventional Deferred Shading

    Store lighting inputs in memory (G-buffer for each light

    Use rasterizer to scatter light volume and cull

    Read lighting inputs from G-buffer

    Compute lighting

    Accumulate lighting with additive blending

    Reorders computation to extract coheren

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    6/34

    Modern Implementation

    Cull with screen-aligned quads Cover light extents with axis-aligned boundin

    Full light meshes (spheres, cones) are generally o

    Can use oriented bounding box for narrow spot lig

    Use conservative single-direction depth test Two-pass stencil is more expensive than it is worth

    Depth bounds test on some hardware, but not batc

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    7/34

    Lit Scene (256 Point Lights)

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    8/34

    Quad-Based Light Culling

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    9/34

    Deferred Shading Problems

    Bandwidth overhead when lights overlap for each light

    Use rasterizer to scatter light volume and cull

    Read lighting inputs from G-bufferoverhead

    Compute lighting

    Accumulate lighting with additive blendingover

    Not doing enough work to amortize overh

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    10/34

    Improving Deferred Shading

    Reduce G-buffer overhead Access fewer things inside the light loop

    Deferred lighting / light pre-pass

    Amortize overhead

    Group overlapping lights and process them to

    Tile-based deferred shading

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    11/34

    Deferred Lighting / Light Pre-Pass

    Goal: reduce G-buffer overhead Split diffuse and specular terms

    Common concession is monochromatic spec

    Factor out constant terms from summatio

    Albedo, specular amount, etc.

    Sum inner terms over all lights

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    12/34

    Deferred Lighting / Light Pre-Pass

    Resolve pass combines factored compon Still best to store all terms in G-buffer up fron

    Better SIMD efficiency

    Incremental improvement for some hardw

    Relies on pre-factoring lighting functions

    Ability to vary resolve pass is not particularly

    See [Hoffman 2009] and [Stone 2009]

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    13/34

    Tile-Based Deferred Shading

    Goal: amortize overhead Use screen tiles to group lights

    Use tight tile frusta to cull non-intersecting lig

    Reduces number of lights to consider

    Read G-buffer once and evaluate all relevant Reduces bandwidth of overlapping lights

    See [Andersson 2009] for more details

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    14/34

    Lit Scene (1024 Point Lights)

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    15/34

    Tile-Based Light Culling

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    16/34

    Quad-Based Lighting Culling

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    17/34

    1

    2

    4

    8

    16

    16 32 64 128 256 512 1024

    FrameTime(ms)

    Number of Point Lights

    Quad (A

    Quad (N

    Tiled (NV

    Tiled (AT

    Light Culling Only at 1080p

    Beyond Programmable Shading, SIGGRAPH 2010

    Slope ~

    Slope ~

    Tile setup dominates

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    18/34

    1

    2

    4

    8

    16

    32

    16 32 64 128 256 512 1024

    FrameTime(ms)

    Number of Point Lights

    Deferred Shad

    Deferred Shad

    Deferred Ligh

    Deferred Ligh

    Tiled (NVIDIA

    Tiled (ATI 587

    Total Performance at 1080p

    Beyond Programmable Shading, SIGGRAPH 2010

    Deferred lighting slightly faster, but trends similarly

    Slope ~ 4 s /

    Slope ~ 20 s

    Few lights overlap

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    19/34

    Anti-aliasing

    Multi-sampling with deferred rendering re

    some work

    Regular G-buffer couples visibility and shadin

    Handle multi-frequency shading in user s

    Store G-buffer at sample frequency Only apply per-sample shading where neces

    Offers additional flexibility over forward rende

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    20/34

    Identifying Edges

    Forward MSAA causes redundant work

    It applies to all triangle edges, even for contin

    tessellated surfaces

    Want to find surfacediscontinuities

    Compare sample depths to depth derivatives Compare (shading) normal deviation over sa

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    21/34

    Per-Sample Shading Visualization

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    22/34

    MSAA with Quad-Based Methods

    Mark pixels for per-sample shading

    Stencil still faster than branching on most ha

    Probably gets scheduled better

    Shade in two passes: per-pixel and per-s

    Unfortunately, duplicates culling work Scheduling is still a problem

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    23/34

    Per-Sample Scheduling

    Beyond Programmable Shading, SIGGRAPH 2010

    Lack of spatial locality causes hardware

    scheduling inefficiency

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    24/34

    MSAA with Tile-Based Methods

    Handle per-pixel and per-sample in one p

    Avoids duplicate culling work

    Can use branching, but incurs scheduling pro

    Instead, reschedule per-sample pixels

    Shade sample 0 for the whole tile Pack a list of pixels that require per-sample shadin

    Redistribute threads to process additional samples

    Scatter per-sample shaded results

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    25/34

    Tile-Based MSAA at 1080p, 1024 Lig

    Beyond Programmable Shading, SIGGRAPH 2010

    0

    5

    10

    15

    20

    25

    30

    35

    Crytek Sponza

    (ATI 5870)

    2009 Game

    (ATI 5870)

    Crytek Sponza

    (NVIDIA 480)

    2009 Game

    (NVIDIA 480)

    FrameTime(ms)

    No MSAA

    4x MSAA (B

    4x MSAA (Pa

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    26/34

    1

    2

    4

    8

    16

    32

    64

    16 32 64 128 256 512 1024

    FrameTime(ms)

    Number of Point Lights

    Deferred Shad

    Deferred Ligh

    Deferred Shad

    Deferred Ligh

    Tiled (ATI 587

    Tiled (NVIDIA

    4x MSAA Performance at 1080p

    Beyond Programmable Shading, SIGGRAPH 2010

    Slope ~ 5 s /

    Slope ~ 35 s

    Tiled takes less of a hit from MSAA

    Deferred lighting even less compelling

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    27/34

    Conclusions

    Deferred shading is a useful rendering to

    Decouples shading from visibility

    Allows efficient user-space scheduling and c

    Tile-based methods win going forward

    Fastest and most flexible Enable efficient MSAA

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    28/34

    Future Work

    Hierarchical light culling

    Straightforward but would need lots of small

    Improve MSAA memory usage

    Irregular/compressed sample storage?

    Revisit binning pipelines? Sacrifice higher resolutions for better AA?

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    29/34

    Acknowledgements

    Microsoft and Crytek for the scene assets

    Johan Andersson from DICE

    Craig Kolb, Matt Pharr, and others in the

    Advanced Rendering Technology team a

    Nico Galoppo, Anupreet Kalra and Mike Bfrom Intel

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    30/34

    References

    [Andersson 2009] Johan Andersson, Parallel Graphics in Frostbite - Cu

    Future, http://s09.idav.ucdavis.edu/ [Fatahalian 2010] Kayvon Fatahalian, Evolving the Direct3D Pipeline fo

    Micropolygon Rendering, http://bps10.idav.ucdavis.edu/

    [Hoffman 2009] Naty Hoffman, Deferred Lighting Approaches,

    http://www.realtimerendering.com/blog/deferred-lighting-approaches/

    [Stone 2009] Adrian Stone, Deferred Shading Shines. Deferred Lighting

    Much., http://gameangst.com/?p=141

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    31/34

    Questions?

    Full source and demo available at:

    http://visual-computing.intel-

    research.net/art/publications/deferred_rende

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    32/34

    Backup

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    33/34

    Quad-Based Light Culling

    Accumulate many lights per draw call

    Render one point per light

    Vertex shader computes quad bounds for ligh

    Geometry shader expands into two triangles

    Pixel shader reads G-buffer and evaluates lig

    Beyond Programmable Shading, SIGGRAPH 2010

  • 7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines

    34/34

    Tile-Based Deferred Lighting?

    Can do deferred lighting with tiling...

    Not usually worth sacrificing the flexibility

    Bandwidth already minimized

    Additional resolve pass can make it slower o

    Exception: hardware considerations SPU lighting on Playstation 3

    Moving less data across the bus can be an overall

    Beyond Programmable Shading, SIGGRAPH 2010