Vertex Shaders for Geometry Compression

by Kenneth Hurley
GDC San Francisco, March 5th, 2007

You might be an engineer if…
  The sales people at the local computer store can't answer any of your questions.
  You can type 70 words per minute but can't read your own handwriting.
  Your wife hasn't the foggiest idea of what you do at work.
  You can’t enjoy movies, because you are constantly analyzing the special effects.
  Your laptop computer costs more than your car.

Agenda
  Introduction
  Simple Compression
  Quantization
  Instancing with constants
  Uncompressing
  Questions

Introduction: Why do we need it?
  Reduce AGP/PCI bus transfers
    Optimal vertex data sizes are 8, 16, or 32 bytes
    AGP 1x = 266 MB/s
    AGP 2x = 533 MB/s
    AGP 4x = ~1 GB/s
    AGP 8x = ~2.1 GB/s
    PCIe provides 2.5 Gbps per lane; PCIe 2.0 raised that to 5 Gbps per lane
    Theoretical maximum is around 70 million triangles per second at 32 bytes per triangle for AGP 8x and PCIe (2.1 GB/s / 32 bytes is roughly 66 million)
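
The arithmetic behind the roughly 70 million figure can be sketched as follows. This is a minimal sketch, assuming the bus is the only bottleneck and that 2.1 GB/s means 2.1 x 10^9 bytes per second; the function name and parameter names are illustrative and not from the talk.

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on triangles per second that a bus can deliver, assuming the
// bus is the only bottleneck and each triangle costs bytesPerTriangle bytes.
// (Hypothetical helper; names are not from the talk.)
inline std::uint64_t MaxTrianglesPerSecond(std::uint64_t busBytesPerSecond,
                                           std::uint64_t bytesPerTriangle)
{
    assert(bytesPerTriangle > 0);
    return busBytesPerSecond / bytesPerTriangle;
}
```

Under these assumptions, halving the per-triangle size from 32 to 16 bytes doubles the bound, which is the motivation for the compression techniques that follow.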

Introduction: Why do we need it?
  Reduce AGP/PCI bus transfers (cont.)
    Is 70 million really true? Probably not
      Could be higher, could be lower
    Drivers can store triangles in video card memory
      But textures are there too
      Uploads of textures (managed textures) go across the same bus
    Even on consoles this is a win because of memory limitations and memory access

Introduction: Why do we need it?
  Speeds rendering
    Reduces video memory access
    Vertex pipes are filled faster even when loading from video memory (less access)
  Reduces memory consumption
    Of course, it does
  Win, win, win

Introduction: Why do we need it?
  PS3/RSX/NV 7xxx architectures fetch 1 vertex attribute (i.e. 1 float4) per clock
  Yet there are 8 vertex engines executing at an effective 8 instructions per clock
  Fetching 1 position, normal, tex coords, binormal and tangent is 5 attributes (5 clocks)
  The vertex shader can be 64 instructions long and still not be limited by instruction count!

Simple “Obvious” Compression
  If you don’t need it, don’t include it
    Don’t include unused components (Z, W)
    Or pack something else in there
  Remove component(s): normal, binormal, tangent
    Cross product to reconstruct in the vertex shader
  Remove UV (ST) and calculate in the vertex shader
  Use packed ARGB (D3DCOLOR), not floats
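
Two of the ideas above, packing a vector into four bytes and rebuilding the binormal with a cross product, can be sketched on the CPU side as follows. This is a minimal sketch, not the talk's code: the byte layout mirrors D3DCOLOR (A, R, G, B from the top byte down), and all function names, the 8-bit mapping, and the handedness convention binormal = normal x tangent are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Map a component in [-1, 1] to an unsigned byte in [0, 255].
inline std::uint8_t PackSnorm8(float v)
{
    float clamped = std::fmin(1.0f, std::fmax(-1.0f, v));
    return static_cast<std::uint8_t>(std::lround((clamped * 0.5f + 0.5f) * 255.0f));
}

// Inverse of PackSnorm8: byte in [0, 255] back to roughly [-1, 1].
inline float UnpackSnorm8(std::uint8_t b)
{
    return (static_cast<float>(b) / 255.0f) * 2.0f - 1.0f;
}

// Pack a unit vector into 32 bits; the spare top byte can carry something else.
inline std::uint32_t PackVector(const Vec3 &n, std::uint8_t spare = 0)
{
    return (static_cast<std::uint32_t>(spare) << 24) |
           (static_cast<std::uint32_t>(PackSnorm8(n.x)) << 16) |
           (static_cast<std::uint32_t>(PackSnorm8(n.y)) << 8) |
            static_cast<std::uint32_t>(PackSnorm8(n.z));
}

inline Vec3 UnpackVector(std::uint32_t c)
{
    return { UnpackSnorm8(static_cast<std::uint8_t>((c >> 16) & 0xFF)),
             UnpackSnorm8(static_cast<std::uint8_t>((c >> 8) & 0xFF)),
             UnpackSnorm8(static_cast<std::uint8_t>(c & 0xFF)) };
}

// The binormal is not stored; it is rebuilt from normal and tangent.
inline Vec3 Cross(const Vec3 &a, const Vec3 &b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}
```

With this layout a normal and a tangent cost 8 bytes instead of 24 as float3s, and the binormal costs nothing. In a real vertex shader the unpack and cross product would be a few instructions, and a mirrored UV mapping would need a sign bit, which could live in the spare byte.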

Quantization
  Quantization is constraining something to a discrete set of values
    In our case, reducing the number of bits/bytes used to represent floats or integers
  Quantization is lossy
    Trades bits/bytes for precision
    Find the acceptable error for your application
    Distant LOD objects can have higher errors, which will be less noticeable

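Quantization: Compression

  #define NUMBITS 16                  // number of bits to retain
  #define FRAC_SCALE (1 << NUMBITS)   // fixed-point scale, 2^NUMBITS

  // Signed variant: expects value in [-1, 1].
  // Float2Int and clamp are helper functions not shown on the slide.
  int QuantizeSigned(float value)
  {
      return Float2Int(clamp(value * FRAC_SCALE, -FRAC_SCALE, FRAC_SCALE));
  }

  // Unsigned variant: expects value in [0, 1].
  // Note: the endpoint FRAC_SCALE itself needs NUMBITS + 1 bits; clamp to
  // FRAC_SCALE - 1 if the result must fit in exactly NUMBITS bits.
  unsigned int QuantizeUnsigned(float value)
  {
      return Float2Int(clamp(value * FRAC_SCALE, 0, FRAC_SCALE));
  }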

Quantization: Decompression

  #define NUMBITS 16                  // number of bits to retain
  #define FRAC_SCALE (1 << NUMBITS)   // fixed-point scale, 2^NUMBITS

  float Decompress(int value)
  {
      // Undo the multiply done at compression time
      return ((float)value / FRAC_SCALE);
  }
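
A round trip makes the lossy nature of quantization concrete. The sketch below is an assumption-laden variant rather than the slide code: it scales by 2^16 - 1 instead of 1 << NUMBITS so that 1.0 fits in exactly 16 bits, and the names QuantizeUnorm16, DequantizeUnorm16 and kMaxError are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed scale: largest 16-bit unsigned value, so [0, 1] maps onto [0, 65535].
constexpr float kFracScale = 65535.0f;

// Quantize a value expected in [0, 1] to 16 bits (out-of-range input is clamped).
inline std::uint16_t QuantizeUnorm16(float value)
{
    float clamped = std::fmin(1.0f, std::fmax(0.0f, value));
    return static_cast<std::uint16_t>(std::lround(clamped * kFracScale));
}

// Reconstruct an approximation of the original value.
inline float DequantizeUnorm16(std::uint16_t q)
{
    return static_cast<float>(q) / kFracScale;
}

// With round-to-nearest, the round-trip error is about half a quantization step.
constexpr float kMaxError = 0.5f / kFracScale;
```

Half a step at 16 bits is under 0.001% of the value range, which is why 16-bit positions are usually acceptable; dropping to 8 bits raises the error to about 0.2% of the range, which may only be tolerable for distant LODs.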

Scaled Offset
  Separable components (scaled offset)
  Minimum and maximum for static objects are used to pick the offset point and scale
  For dynamic objects (animated, skinned, etc.), this minimum and maximum must include all dynamic changes

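Scaled Offset
  Redistributes quantization based on choosing a scale that covers the entire object

  void CalculateScaleandOffset(Vertex &vertices, float &offset, float &scale)
  {
      LowerRange =  maxfloat;   // start high so any vertex lowers it
      UpperRange = -maxfloat;   // start low so any vertex raises it
      for every vertex
      {
          LowerRange = min(LowerRange, Vertex);
          UpperRange = max(UpperRange, Vertex);
      }
      offset = LowerRange;                  // minimum of the object
      scale  = (UpperRange - LowerRange);   // extent of the object
  }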

Scaled Offset

  void ScaleandOffsetVerts(Vertex &vertices, Vertex &newVerts, float &offset, float &scale)
  {
      for every vertex
      {
          // (vertices - offset) / scale lies in [0, 1] before conversion
          newVerts = Float2Short((vertices - offset) / scale);
      }
  }

Scaled Offset: Decompression

  Vertex ScaleandOffsetVerts(Vertex v, float &offset, float &scale)
  {
      // Inverse of compression: rescale, then add the offset back
      return ((float)v * scale) + offset;
  }
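
The three scaled-offset steps (find the range, compress, decompress) can be put together for a single component as follows. This is a minimal sketch, assuming 16-bit unsigned storage and one float array per component; the struct and function names are illustrative, and the guard against a zero extent is an addition not present on the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Offset (minimum) and scale (extent) for one component, e.g. all x values.
struct ScaleOffset { float offset; float scale; };

inline ScaleOffset CalculateScaleAndOffset(const std::vector<float> &values)
{
    assert(!values.empty());
    auto range = std::minmax_element(values.begin(), values.end());
    float lo = *range.first;
    float extent = *range.second - lo;
    // Flat data would give a zero extent; use 1 to avoid dividing by zero.
    return { lo, extent > 0.0f ? extent : 1.0f };
}

// Map each value into [0, 1] relative to the object's range, then to 16 bits.
inline std::vector<std::uint16_t> Compress(const std::vector<float> &values,
                                           const ScaleOffset &so)
{
    std::vector<std::uint16_t> out;
    out.reserve(values.size());
    for (float v : values)
    {
        float normalized = (v - so.offset) / so.scale;
        normalized = std::min(1.0f, std::max(0.0f, normalized));
        out.push_back(static_cast<std::uint16_t>(std::lround(normalized * 65535.0f)));
    }
    return out;
}

// In a vertex shader this would be a single multiply-add per component.
inline float Decompress(std::uint16_t q, const ScaleOffset &so)
{
    return (static_cast<float>(q) / 65535.0f) * so.scale + so.offset;
}
```

Because the 65536 levels are spread over the object's own extent rather than a fixed range, a small object gets proportionally finer precision than a large one. In practice the division by 65535 can be folded into the scale constant uploaded to the shader.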

XVOX Demo

XVOX Demo
  Geomorphing terrain data in 36 bytes of vertex data
  Trilinear displacement mapping
  UV (ST) texture coordinates can be the same as dU, dV with a scale stored in constant memory
  Should probably use ambient occlusion or ambient aperture for lighting
    Or light in world space
  For time of day, use color ramp textures to light the terrain

XVOX Demo: Displacement mapping
  V’(u,v) = V(u,v) + d(u,v) * N(u,v)
  Assuming the normal is always up (terrain):
    V’(u,v) = ( u, v, d(u,v) )

  struct VS_INPUT
  {
      float4 d1_d2;      // 4 mipmap displacement values
      float4 u1v1_u2v2;  // UV displacements + UV+1
      float  lod;        // R-LOD selection
  };
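
The simplified terrain formula V'(u,v) = (u, v, d(u,v)) combined with geomorphing can be sketched on the CPU as follows. The slides do not show how the demo combines its four displacement values, so this sketch assumes the simplest case: one displacement from a fine LOD, one from a coarse LOD, and a morph factor in [0, 1] blending between them. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

inline float Lerp(float a, float b, float t)
{
    return a + (b - a) * t;
}

// V'(u, v) = (u, v, d(u, v)), where d is blended between two LODs.
// morph = 0 selects the fine displacement, morph = 1 the coarse one.
inline Vec3 DisplaceTerrainVertex(float u, float v,
                                  float dFine, float dCoarse, float morph)
{
    float t = std::fmin(1.0f, std::fmax(0.0f, morph));
    return { u, v, Lerp(dFine, dCoarse, t) };
}
```

Because the normal is assumed to point straight up, the full formula's multiply by N(u,v) disappears and the displacement simply becomes the height component, so no normal needs to be fetched for positioning.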

XVOX ideas
  Vertex streams
    The coordinates (u, v) are taken from a first vertex stream and the displacements d from a second vertex stream
    This is done so the (u, v) coordinates can be reused for each displacement-mapped square, resulting in less memory used

Transform Compression
  Compress by finding a dominant axis of the data
    Given vertex data, set up a covariance matrix and extract the eigenvectors from it
    See ShaderX for details, pp. 176-180
  Decompression is simply multiplication by the inverse of the compression matrix built from those eigenvectors
  Since we are already multiplying the position by a matrix for HCLIP space, the matrices can be rolled together
  Achieves 50% compression without additional overhead
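
The first step above, building a covariance matrix and finding the dominant axis, can be sketched as follows. This is a minimal sketch under stated assumptions: it uses power iteration to find only the largest eigenvector, which is a simple stand-in and not necessarily the method in the ShaderX article, and the type and function names are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Mat3 { double m[3][3]; };

// Covariance matrix of a point set about its mean.
inline Mat3 Covariance(const std::vector<Vec3> &pts)
{
    assert(!pts.empty());
    const double n = static_cast<double>(pts.size());
    double mean[3] = { 0.0, 0.0, 0.0 };
    for (const Vec3 &p : pts) { mean[0] += p.x; mean[1] += p.y; mean[2] += p.z; }
    for (double &c : mean) c /= n;

    Mat3 cov = {};
    for (const Vec3 &p : pts)
    {
        const double d[3] = { p.x - mean[0], p.y - mean[1], p.z - mean[2] };
        for (int r = 0; r < 3; ++r)
            for (int c = 0; c < 3; ++c)
                cov.m[r][c] += d[r] * d[c] / n;
    }
    return cov;
}

// Power iteration: repeatedly multiply a vector by the matrix and normalize.
// It converges toward the eigenvector with the largest eigenvalue.
inline Vec3 DominantAxis(const Mat3 &cov, int iterations = 64)
{
    const double s = 1.0 / std::sqrt(3.0);
    double v[3] = { s, s, s };
    for (int i = 0; i < iterations; ++i)
    {
        double next[3];
        for (int r = 0; r < 3; ++r)
            next[r] = cov.m[r][0] * v[0] + cov.m[r][1] * v[1] + cov.m[r][2] * v[2];
        const double len = std::sqrt(next[0] * next[0] + next[1] * next[1] + next[2] * next[2]);
        if (len == 0.0) break;  // degenerate input: keep the current estimate
        for (int r = 0; r < 3; ++r) v[r] = next[r] / len;
    }
    return { static_cast<float>(v[0]), static_cast<float>(v[1]), static_cast<float>(v[2]) };
}
```

A full implementation would extract all three eigenvectors to form an orthonormal basis, rotate the vertices into it, and then apply scaled-offset quantization along each axis; the inverse of that combined transform is what gets concatenated with the world-view-projection matrix.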

Idea from Displacement Maps
  Uses a quadtree/octree structure
  Object vertices are displaced from the quad/oct node corner or center
  When rendering each entity/object set in a node of the tree, set a vertex constant to the quad corner or center

More ideas for less bus traffic
  Put the vertex data in constant memory
    Particles/billboards or low-poly data can be stored there
    Then pass in the index as a UBYTE component of a D3DCOLOR
      Other 3 bytes can be used for normal, etc.
    2 shorts for scale and offset
      Offset from quad/octree node
    UV coordinates are a good thing to store there
  Example: low-poly trees
    Pass offset, scale and rotation around the up axis
    2 16-bit shorts and 1 float
    8 bytes per tree
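
The 8-bytes-per-tree layout can be sketched as a packed record plus a decode step. The slides give only the sizes (two 16-bit shorts and one float), so everything else here is an assumption made for illustration: the offset short is taken to hold two 8-bit grid coordinates relative to the quadtree node corner, the scale short is taken to be a fraction of a maximum scale, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-tree record: 2 shorts + 1 float = 8 bytes.
struct TreeInstance
{
    std::uint16_t offset;    // assumed: x cell in the high byte, z cell in the low byte
    std::int16_t  scale;     // assumed: 0..32767 maps to 0..maxScale
    float         rotation;  // rotation around the up axis, in radians
};
static_assert(sizeof(TreeInstance) == 8, "expected 8 bytes per tree");

struct DecodedTree { float x, z, scale, rotation; };

// nodeX/nodeZ (node corner), nodeSize and maxScale would live in shader constants.
inline DecodedTree DecodeTree(const TreeInstance &t,
                              float nodeX, float nodeZ,
                              float nodeSize, float maxScale)
{
    const float fx = static_cast<float>(t.offset >> 8) / 255.0f;
    const float fz = static_cast<float>(t.offset & 0xFF) / 255.0f;
    return { nodeX + fx * nodeSize,
             nodeZ + fz * nodeSize,
             (static_cast<float>(t.scale) / 32767.0f) * maxScale,
             t.rotation };
}
```

For comparison, a full 4x3 float world matrix per tree would take 48 bytes, so a record like this is a sixfold reduction at the cost of restricting trees to uniform scale and a single rotation axis.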

More ideas for less bus traffic
  Similar compression can be used for writing pixels (other than display)
    For example, deferred rendering storage
    Depth buffers

Summary
  Remove unused components
    Or use them for something else
  Quantize what you can
  Separate components by regions
  Displacement maps
  Reduce memory and AGP/video memory traffic!

References
  [Calver] Dean Calver, “Vertex Decompression in a Shader”, ShaderX, Wordware Publishing, pp. 172-187
  [Vlietinck] Jan Vlietinck, “Hardware trilinear displacement mapping without tessellator and vertex texturing”, online: http://users.belgacom.net/gc610902/technical.htm
  [Forsyth] Tom Forsyth, “Practical Displacement Maps”, GDC 2003, available from http://home.comcast.net/~tom_forsyth/papers/papers.html
  [Wloka] Matthias Wloka, personal communication, March 2007

Shameless Plug
  Our game engine will soon be available for free on the PC
    Editor (XML format), terrain painting, object placement
    Particle systems
    RakNet networking system
    Ageia physics
    Lua scripting
    A.I. through hierarchical state machines, events, scripting, triggers
    Sound through OpenAL
    Auto navmesh generation
    GUI editor
    Model viewer
    DirectX 9, DirectX 10, HDR, shadow maps, radiosity lightmaps (WIP)
    More…

Questions?
  More information available on the website:
    http://www.signaturedevices.com
    http://www.graffitientertainment.com (publishing subsidiary)
    submissions@graffitientertainment.com