Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as...

37
Advanced Uses of Pixel Local Storage Marius Bjørge ARM Developer Day - London Graphics Research Engineer December 3 rd 2015

Transcript of Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as...

Page 1: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Advanced Uses of Pixel Local Storage

Marius Bjørge

ARM Developer Day - London

Graphics Research Engineer

December 3rd 2015

Page 2: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 2

Agenda

Pixel Local Storage

Indirect lighting pipeline

Page 3: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 3

Pixel Local Storage

Exposed as EXT_shader_pixel_local_storage

Per-pixel scratch memory available to fragment shaders

Automatically discarded once a tile is fully processed

No impact on external memory bandwidth

Shader declares a view of PLS memory

Re-interpret PLS between different passes

Can have separate input and output views

Independent of framebuffer format

Page 4: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 4

Pixel Local Storage

Rendering pipeline changes

slightly when PLS is enabled

Writing to PLS bypasses blending

Note

Fragment order

Fragment tests still applies

PLS and color share the same

memory location

Tile executionMemory

Primitive setup

Rasterization

Fragment Shading

Position data

Varyings

Textures

Framebuffer Writeback

Uniforms

Blending

Tilebuffer

PLS / Color

Page 5: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 5

Deferred Shading

Popular technique on PC and console games

Very memory bandwidth intensive

Traditionally not a good fit for mobile

Page 6: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 6

Order Independent Transparency

Depth peeling

Approximate approaches

Multi-Layer Alpha Blending [Salvi

et al, 2014]

Adaptive Range

Page 7: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 7

Pixel Local Storage

OIT phaseOpaque phase

Fill gbufferLight

accumulationTonemap

Pixel Local Storage

Init OIT

R32UI R32UI R32UI R32UI

Transparent rendering

Resolve

ColorRGB10A2 RGB10A2 RG16F RG16F

At this point we change the layout of the PLS

Page 9: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Indirect Lighting Pipeline

Page 10: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 10

Related Work

Efficient Rendering With Tile

Local Storage [Bjorge14]

SH Irradiance Volumes

[Tatarchuk05]

Reflective Shadow Maps

[Dachsbacher09]

Virtual Point Lights

Light Propagation Volumes

[Kaplanyan09]

Page 11: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 11

Global illumination with ARM

Enlighten remains the GI solution of choice

for mobile

Performant

High quality

Proven

PLS + Enlighten = bandwidth saving! [Bjorge14]

This R&D project is separate

Exploration of PLS capabilities

Indirect lighting as first challenging use case

Results remain an approximation

Page 12: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 12

The Idea

A grid of probes are placed throughout the scene

Albedo, normal and depth information is stored

Resolution can be quite low (64x64, 32x32 or lower)

Selectively store as either cubemaps/latlon maps or spherical harmonics

Relighting does simple deferred shading style rendering per probe

Output: SH irradiance + reflection map

Page 13: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 13

High-Level Overview

Offline Runtime

Probe data

Probe scene for indirect lighting information

Relight probes

Irradiance volume

Reflection maps

Compute SH

Dynamic lights and reflector objects

Page 14: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Direct Lighting

Page 15: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Indirect Lighting

Page 16: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Reflections

Page 17: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Final Image

Page 18: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Lighting Only

Page 19: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Offline

Page 20: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 20

Offline – Probe Generation

For every probe in grid

Render scene from every direction

storing albedo, normal and depth

Save to a texture atlas

Depth == 1.0

Hemisphere

Page 21: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 21

Offline – Probe Storage

High frequency

Cubemaps / latlon maps

Low frequency

Spherical harmonics

Both are used in the demo application

Cubemaps for probes that are used for

environment mapping

Spherical harmonics for the rest

Gives the best quality/performance trade-off

Page 22: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 22

Offline – Probe Storage

Cubemaps

6kb

Latlon

2kb

Spherical harmonics

40 bytes

Page 23: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 23

Offline – Meta Data

Neighbour visibility information

Reduce amount of work required when updating probes

Snap texture coordinates to reduce light leakage

Hemisphere visibility

Avoid updating probes that are never affected by global lights (such as the hemisphere

or sun light)

Local cubemap bounding box

Page 24: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Runtime

Page 25: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 25

Runtime – Relighting

First step is to determine which probes require updating

Probe meta-data is really useful here

Dirty probes are pushed to queue for processing

Page 26: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 26

Runtime – Per Probe Pipeline

Determine contributing lights

Accumulate lightingParallel sum of

lighting data

Dynamic lights and reflector objects

Probe data

Blit to reflection map

Compute 2nd order SH

Store to 3D textures

Page 27: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 27

Runtime – Deferred Probe Relighting

Deferred shading implementation

PLS is really useful here

Sample from the baked albedo/depth and

normal maps

Possible to re-use existing deferred shading

lighting code path

Makes it easy to support custom light types

and reflector objects

Heavily batched

-X +X -Y +Y -Z +Z

-X +X -Y +Y -Z +Z

-X +X -Y +Y -Z +Z

-X +X -Y +Y -Z +Z

-X +X -Y +Y -Z +Z

Probe 0

Probe 1

Probe 2

Probe 3

Probe n

Page 28: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 28

Runtime – Compute SH

Parallel sum with spherical harmonics coefficients

Compute shader

… and/or multiple fragment reduction passes

Output formats

Pack down to separate red, green and blue 3D irradiance textures

Store in FP16 for best visual quality

Page 29: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 29

Runtime – Irradiance Volume

2nd order SH

3x 3D textures for red, green and blue

coefficients

Simple 3D texture lookup using

world-space position

Page 30: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 30

Runtime – Reflections

Output to cubemap matching size of

probes

glGenerateMipmap

Cubemap filter too expensive for runtime

Page 31: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 31

Runtime – Multiple Bounces

Temporal probe relighting

Slowly accumulate into irradiance volume

Page 32: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Single Bounce

Page 33: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

Multiple Bounces

Page 34: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 34

Future Work

Move “everything“ to compute

Octree representation

Occlusion culling?

Page 35: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 35

For more information visit the

Mali Developer Centre:

http://malideveloper.arm.com

• Revisit this talk in PDF and audio

format post event

• Download tools and resources

Page 36: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its

subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their

respective owners.

Copyright © 2015 ARM Limited

Thank you!

Questions?

Page 37: Advanced Uses of Pixel Local Storage - Microsoft · Pixel Local Storage Exposed as EXT_shader_pixel_local_storage Per-pixel scratch memory available to fragment shaders Automatically

© ARM 2015 37

References

1. http://www.ppsloan.org/publications/StupidSH36.pdf

2. http://www.crytek.com/cryengine/cryengine3/presentations/light-

propagation-volumes-in-cryengine-3

3. http://developer.amd.com/wordpress/media/2012/10/Tatarchuk_Irradi

ance_Volumes.pdf

4. http://www.vis.uni-stuttgart.de/~dachsbcn/download/rsm.pdf

5. http://www.geomerics.com/wp-content/uploads/2014/11/Efficient-

Rendering-with-Tile-Local-Storage.pdf