Vertex Shaders for Geometry Compression
by Kenneth Hurley
GDC San Francisco, March 5th, 2007
You might be an engineer if…
  The sales people at the local computer store can't answer any of your questions.
  You can type 70 words per minute but can't read your own handwriting.
  Your wife hasn't the foggiest idea of what you do at work.
  You can’t enjoy movies, because you are constantly analyzing the special effects.
  Your laptop computer costs more than your car.
Agenda
  Introduction
  Simple Compression
  Quantization
  Instancing with constants
  Uncompressing
  Questions
Introduction Why do we need it?
  Reduce AGP/PCI bus transfers
    Optimal sizes are <8, 16, 32 for vertex data size
    AGP 1x = 266 MB/s
    AGP 2x = 533 MB/s
    AGP 4x = ~1 GB/s
    AGP 8x = ~2.1 GB/s
    PCIe provides 2.5 Gbps per lane, and PCIe 2.0 raised that to 5 Gbps per lane
    Theoretical maximum is around 70 million triangles per second at 32 bytes per triangle for AGP 8x and PCIe
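As a rough check on the figure above, dividing the bus bandwidth by the bytes per triangle gives an upper bound on triangles per second. The sketch below is plain arithmetic, not a measurement; the function name `maxTrianglesPerSecond` is hypothetical, and the bound ignores indexing, caching, and every other consumer of the bus.

```cpp
#include <cassert>

// Upper bound on triangles per second that fit through a bus,
// assuming the bus carries nothing else. Illustrative helper only.
double maxTrianglesPerSecond(double busBytesPerSecond, double bytesPerTriangle)
{
    assert(bytesPerTriangle > 0.0);
    return busBytesPerSecond / bytesPerTriangle;
}
```

With the AGP 8x figure of about 2.1 GB/s and 32 bytes per triangle this comes to roughly 66 million triangles per second, in line with the quoted ~70 million. Halving the bytes per triangle doubles the bound, which is the motivation for the rest of the talk.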
Introduction Why do we need it?
  Reduce AGP/PCI bus transfers (cont.)
    Is 70 million really true? Probably not
      Could be higher, could be lower
    Drivers can store triangles in video card memory
      But textures are there too
      Uploads of textures (managed textures) go across the same bus
    Even on consoles this is a win because of memory limitations and memory access
Introduction Why do we need it?
  Speeds rendering
    Reduces video memory access
    Vertex pipes are filled faster even if loading from video memory (less access)
  Reduces memory consumption
    Of course, it does
  Win, Win, Win
Introduction Why do we need it?
PS3/RSX/NV 7xxx architectures fetch 1 vertex attribute (i.e. 1 float4 per clock).
Yet there 8 vertex engines executing at an effective 8 instructions per
clock. Fetching 1 position, normal, tex coords,
binormal and tangent is 5 Attributes (5 clocks)
the vertex shader can be 64 instructions long and still not be limited by instruction count!
Simple “Obvious” Compression
  If you don’t need it, don’t include it
    Don’t include unused components (Z, W)
    Or pack something else in there
  Remove component(s): normal, binormal, tangent
    Cross product to reconstruct in the vertex shader
  Remove UV (ST) and calculate in the vertex shader
  Use packed ARGB (D3DCOLOR), not floats
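Two of the points above can be sketched on the CPU side: rebuilding the binormal with a cross product instead of storing it, and packing a unit vector into the four bytes of a D3DCOLOR-style ARGB value. This is a minimal sketch under stated assumptions: `Vec3`, `reconstructBinormal`, `packNormalARGB` and `unpackNormalARGB` are hypothetical names, the tangent frame is assumed orthonormal with binormal = normal × tangent, and the [-1, 1] to [0, 255] mapping is one common choice rather than anything the slides specify.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Rebuild the binormal as normal x tangent instead of storing it.
// Assumption: orthonormal frame with this handedness; mirrored UVs
// would need an extra stored sign.
Vec3 reconstructBinormal(const Vec3& n, const Vec3& t)
{
    return { n.y * t.z - n.z * t.y,
             n.z * t.x - n.x * t.z,
             n.x * t.y - n.y * t.x };
}

// Map one component from [-1, 1] to a byte in [0, 255].
static std::uint8_t toByte(float c)
{
    assert(!std::isnan(c));
    float clamped = std::fmin(1.0f, std::fmax(-1.0f, c));
    return static_cast<std::uint8_t>(std::lround((clamped * 0.5f + 0.5f) * 255.0f));
}

// Pack a unit vector into 32 bits using the ARGB byte order
// (A in the top byte, then R, G, B). The spare A byte is free for other data.
std::uint32_t packNormalARGB(const Vec3& n, std::uint8_t spare = 0)
{
    return (std::uint32_t(spare) << 24) |
           (std::uint32_t(toByte(n.x)) << 16) |
           (std::uint32_t(toByte(n.y)) << 8) |
            std::uint32_t(toByte(n.z));
}

// Inverse mapping; in a vertex shader this would be a single multiply-add.
Vec3 unpackNormalARGB(std::uint32_t c)
{
    auto fromByte = [](std::uint32_t b) { return (b / 255.0f) * 2.0f - 1.0f; };
    return { fromByte((c >> 16) & 0xFFu), fromByte((c >> 8) & 0xFFu), fromByte(c & 0xFFu) };
}
```

Packing a normal this way takes 4 bytes instead of 12, at the cost of roughly 8 bits of precision per component, and it leaves one byte over for something else.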
Quantization
  Quantization is constraining something to a discrete set of values
    In our case, reducing the #bits/#bytes used to represent floats or integers
  Quantization is lossy
    Trades #bits/#bytes for precision
    Find the acceptable error for your application
    Distant LOD objects can have higher errors, and they will be less noticeable
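Quantization Compression

#define NUMBITS 16                  // number of bits to retain
#define FRAC_SCALE (1 << NUMBITS)   // renamed so it does not clash with the fracScale parameter

// Signed variant: value * fracScale is clamped to [-fracScale, fracScale].
int QuantizeSigned(float value, float fracScale)
{
    return Float2Int(clamp(value * fracScale, -fracScale, fracScale));
}

// Unsigned variant: value * fracScale is clamped to [0, fracScale].
unsigned int QuantizeUnsigned(float value, float fracScale)
{
    return Float2Int(clamp(value * fracScale, 0, fracScale));
}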
Quantization Decompression

#define NUMBITS 16                  // number of bits to retain
#define FRAC_SCALE (1 << NUMBITS)

float Decompress(int value)
{
    // Inverse of quantization: divide by the scale that compression multiplied by.
    return ((float)value / FRAC_SCALE);
}
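The quantize/decompress pair can be tried out as a round trip. The following is a minimal, self-contained sketch, not the slide's exact code: it handles only the unsigned [0, 1] case, the names `quantizeUnorm` and `dequantizeUnorm` are hypothetical, and it scales by 2^16 - 1 rather than 1 << 16 so that an input of 1.0 still fits in 16 bits (an assumption made for this example).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int   kNumBits = 16;                              // bits retained
constexpr float kMaxCode = float((1u << kNumBits) - 1u);    // 65535

// Quantize a value in [0, 1] to an unsigned 16-bit code (lossy).
std::uint16_t quantizeUnorm(float value)
{
    assert(!std::isnan(value));
    float clamped = std::fmin(1.0f, std::fmax(0.0f, value));
    return static_cast<std::uint16_t>(std::lround(clamped * kMaxCode));
}

// Recover an approximation of the original value.
float dequantizeUnorm(std::uint16_t code)
{
    return float(code) / kMaxCode;
}
```

Because the code rounds to the nearest step, the round-trip error is at most half a step, about 1/131070 of the range. Lowering `kNumBits` trades precision for size in the way the Quantization slide describes.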
Scaled Offset
  Separable components (scaled offset)
  Minimum and maximum for static objects are used to pick the offset point and scale
  For dynamic objects (animated, skinned, etc.) this minimum and maximum must include all dynamic changes
Scaled Offset
  Redistributes quantization by choosing a scale that covers the entire object

void CalculateScaleandOffset(Vertex &vertices, float &offset, float &scale)
{
    offset = maxfloat;          // running minimum over all vertices
    UpperRange = -maxfloat;     // running maximum over all vertices
    for every vertex
    {
        offset = min(offset, Vertex);
        UpperRange = max(UpperRange, Vertex);
    }
    scale = (UpperRange - offset);
}
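Scaled Offset
void ScaleandOffsetVerts(Vertex &vertices, Vertex &newVerts, float &offset, float &scale)
{
    for every vertex
    {
        // (vertices - offset) / scale lies in [0, 1]; Float2Short is assumed
        // to map that normalized value onto the short range.
        newVerts = Float2Short((vertices - offset) / scale);
    }
}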
Scaled Offset Decompression
Vertex ScaleandOffsetVerts(Vertex v, float &offset, float &scale)
{
    return (((float)v * scale) + offset);
}
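The three scaled-offset steps (find the range, compress, decompress) can be put together as a runnable sketch for a single component, such as all X coordinates of an object. The names `ScaleOffset`, `calculateScaleAndOffset`, `compress` and `decompress` are hypothetical, and storing the result as an unsigned 16-bit code is an assumption; the slides only say "short".

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct ScaleOffset { float scale; float offset; };

// offset = minimum of the component, scale = its extent over the whole object.
ScaleOffset calculateScaleAndOffset(const std::vector<float>& values)
{
    assert(!values.empty());
    auto mm = std::minmax_element(values.begin(), values.end());
    float lo = *mm.first;
    float extent = *mm.second - lo;
    // Guard against a flat component so the division below stays defined.
    return { extent > 0.0f ? extent : 1.0f, lo };
}

// Map each value into [0, 1] relative to the object, then to a 16-bit code.
std::vector<std::uint16_t> compress(const std::vector<float>& values, ScaleOffset so)
{
    std::vector<std::uint16_t> out;
    out.reserve(values.size());
    for (float v : values) {
        float unit = std::min(1.0f, std::max(0.0f, (v - so.offset) / so.scale));
        out.push_back(static_cast<std::uint16_t>(std::lround(unit * 65535.0f)));
    }
    return out;
}

// What the vertex shader would do: one multiply and one add per component.
float decompress(std::uint16_t code, ScaleOffset so)
{
    return (float(code) / 65535.0f) * so.scale + so.offset;
}
```

In a real shader the 1/65535 factor would normally be folded into the scale constant, so decompression stays a single multiply-add. For animated or skinned objects the input to `calculateScaleAndOffset` has to include every pose, as the Scaled Offset slide notes.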
XVOX Demo
XVOX Demo
  Geomorphing terrain data in 36 bytes of vertex data
  Trilinear displacement mapping
  UV (ST) texture coordinates can be the same as dU, dV with a scale stored in constant memory
  Should probably use ambient occlusion or ambient aperture for lighting
    Or light in world space
  For time of day, use color ramp textures to light the terrain
XVOX Demo Displacement mapping
  V’(u,v) = V(u,v) + d(u,v) * N(u,v)
  Assuming the normal is always up (terrain):
  V’(u,v) = ( u, v, d(u,v) )

struct VS_INPUT
{
    float4 d1_d2;       // 4 mipmap displacement values
    float4 u1v1_u2v2;   // UV displacements + UV+1
    float  lod;         // R-LOD selection
};
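The terrain special case of the displacement formula can be written out on the CPU for illustration. This is only a sketch: the linear blend between a fine and a coarse displacement is an assumed geomorph, not necessarily the blend the demo uses, and `Vec3`, `displaceTerrainVertex` and its parameter names are hypothetical.

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// Terrain case of V'(u,v) = V(u,v) + d(u,v) * N(u,v) with the normal fixed
// to "up": the displaced position is simply (u, v, d).
// Assumption: d is a linear blend of two displacement levels, with
// lodBlend = 0 selecting the fine level and 1 the coarse level.
Vec3 displaceTerrainVertex(float u, float v, float dFine, float dCoarse, float lodBlend)
{
    assert(lodBlend >= 0.0f && lodBlend <= 1.0f);
    float d = dFine + (dCoarse - dFine) * lodBlend;
    return { u, v, d };
}
```

Because only the displacement changes per vertex, the (u, v) pair never has to be stored as a full position.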
XVOX ideas
  Vertex streams
    The coordinates (u, v) are taken from a first vertex stream and the displacements d from a second vertex stream.
    This is done so the (u, v) coordinates can be reused for each displacement-mapped square, resulting in less memory used.
Transform Compression
  Compress by finding a dominant axis of the data
    Given vertex data, set up a covariance matrix and extract the eigenvectors from it
    See ShaderX for details, pp. 176-180
  Decompression is simply the inverse of the compression matrix built from those eigenvectors
  Since we are already multiplying the position by a matrix to reach HCLIP space, the matrices can be rolled together
  Achieves 50% compression without additional overhead
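The "rolled together" point rests on matrix multiplication being associative: applying the decompression matrix and then the clip-space matrix gives the same result as applying their product once. The sketch below shows only that step and leaves out the covariance and eigenvector computation. `Mat4`, `Vec4`, `multiply`, `transform` and `rollTogether` are hypothetical names, and the column-vector, row-major convention is an assumption of the example.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major storage, column vectors
using Vec4 = std::array<float, 4>;

// Standard 4x4 matrix product r = a * b.
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};  // zero-initialized
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// r = m * v for a column vector v.
Vec4 transform(const Mat4& m, const Vec4& v)
{
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        for (int k = 0; k < 4; ++k)
            r[i] += m[i][k] * v[k];
    return r;
}

// Done once per object on the CPU: the shader then uses one matrix,
// so decompression adds no per-vertex instructions.
Mat4 rollTogether(const Mat4& worldViewProj, const Mat4& decompress)
{
    return multiply(worldViewProj, decompress);
}
```

The combined matrix is uploaded in place of the usual world-view-projection constant, which is why the slide can claim no additional overhead in the shader.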
Idea from Displacement Maps
  Uses a quadtree/octree structure
  Object vertices are displaced from the quad/oct node corner or center
  When rendering each entity/object set in a node of the tree, set a vertex constant to the quad corner or center
More ideas for less bus traffic
  Put the vertex data in constant memory
    Particles/billboards or low-poly data can be stored there
    Then pass in the index as a UBYTE component of a D3DCOLOR
      Other 3 bytes can be used for normal, etc.
    2 shorts for scale and offset
      Offset from quad/octree node
    UV coordinates are a good thing to store there
  Example: low-poly trees
    Pass offset, scale and rotation around the up axis
    2 16-bit shorts and 1 float
    8 bytes per tree
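The 8-bytes-per-tree figure can be checked with a small struct. This is a sketch under assumptions: the slides do not say which value goes in which field, so the layout below (offset and scale in the two shorts, rotation in the float) and the names `TreeInstance` and `decodeScale` are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// One low-poly tree instance: two 16-bit shorts and one float.
// The field assignment is an assumption made for this example.
struct TreeInstance {
    std::uint16_t offset;    // quantized offset from the quad/octree node
    std::uint16_t scale;     // quantized uniform scale
    float         rotation;  // rotation around the up axis, in radians
};
static_assert(sizeof(TreeInstance) == 8, "tree instance should be 8 bytes");

// Expand the quantized scale into the float range [minScale, maxScale].
float decodeScale(const TreeInstance& t, float minScale, float maxScale)
{
    assert(maxScale >= minScale);
    return minScale + (t.scale / 65535.0f) * (maxScale - minScale);
}
```

The shared tree mesh itself would live in constant memory or a separate stream, so only these 8 bytes vary per instance.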
More ideas for less bus traffic
  Similar compression can be used for writing pixels (other than display)
    For example, deferred rendering storage
    Depth buffers
Summary
  Remove unused components
    Or use them for something else
  Quantize what you can
  Separate components by regions
  Displacement maps
  Reduce memory and AGP/video memory traffic!
References
  [Calver] Dean Calver, “Vertex Decompression in a Shader”, ShaderX, Wordware Publishing, pp. 172-187
  [Vlietinck] Jan Vlietinck, “Hardware trilinear displacement mapping without tessellator and vertex texturing”, online, http://users.belgacom.net/gc610902/technical.htm
  [Forsyth] Tom Forsyth, “Practical Displacement Maps”, GDC, 2003, available from http://home.comcast.net/~tom_forsyth/papers/papers.html
  [Wloka] Matthias Wloka, “Personal Communication”, March 2007
Shameless Plug
  Our game engine will soon be available for free on the PC
    Editor (XML format), terrain painting, object placement
    Particle systems
    RakNet networking system
    Ageia physics
    Lua scripting
    A.I. through hierarchical state machines, events, scripting, triggers
    Sound through OpenAL
    Auto navmesh generation
    GUI editor
    Model viewer
    DirectX 9, DirectX 10, HDR, shadow maps, radiosity lightmaps (WIP)
    More…
Questions?
  More information available on the website:
  http://www.signaturedevices.com & http://www.graffitientertainment.com (Publishing Subsidiary)
  submissions@graffitientertainment.com