Texturing Massive Terrain
Colt McAnlis
Graphics Programmer – Blizzard
60 minutes (ish)
What are we talking about?
Keys to Massive texturing
Texturing data is too large to fit into memory
Texturing data is unique
Lots of resolution: down to maybe 1 meter per pixel
What we’re ignoring
Vertex data
General terrain texturing issues
Low-end hardware
Review of technologies
What We’re covering
Paging & Caches
DXT++ Compression
Compositing Frameworks
Editing Issues
Example-Based Texture Synthesis
The World : So Far..
What’s Visible?
Only a subsection is visible at a time; non-visible areas remain on disk
New pages must be streamed in quickly, limited by disk I/O
Fast frustum movements kill perf: new pages occur frequently
Radial Paging
Instead, page in a full radius around the player
Only far-away pages need to be streamed in
Distance based resolution
Chunks stream in levels of mip-maps
As distance changes, so does LOD; new mip levels are brought in from disk
Textures are typically divided across chunk bounds: not ideal for draw call counts...
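As a rough illustration of the idea (not the talk's actual code), here is a small helper that picks which distance ring / stack level a chunk should stream into, under the assumption that each ring covers twice the radius of the previous one; `ring0Radius` and the doubling rule are illustrative choices.

```cpp
// Minimal sketch of distance-based resolution selection: pick which mip-stack
// level a chunk should stream into, assuming each ring covers twice the
// distance of the previous one. All names here are illustrative.
int SelectStackLevel(float distToChunk,     // distance from viewer to chunk center
                     float ring0Radius,     // radius covered by the finest level
                     int   levelCount)      // number of levels in the mip stack
{
    // Level 0 is full resolution; each further ring doubles the covered radius
    // and halves the effective texel density.
    int level = 0;
    float radius = ring0Radius;
    while (distToChunk > radius && level < levelCount - 1) {
        radius *= 2.0f;
        ++level;
    }
    return level;
}
```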
Typical Setup
Each chunk has its own mip chain
Difficult to filter across boundaries
One mip to rule them..
But we don't need full chains at each chunk; radial paging requires less memory
Would be nice to have easier filtering
What if we had one large mip-chain?
Mip Stack
Use one texture per 'distance'; resolution is consistent for that range
All textures are the same size; as distance increases, quality decreases
Can store as a 3D texture / texture array and bind only 1 texture to the GPU
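A minimal, API-agnostic sketch of what such a mip stack might look like as a data structure; the type and field names are illustrative, and in a real engine the levels would live in a GPU texture array (or 3D texture) so only one resource is bound.

```cpp
// Sketch of the "mip stack": one same-sized texture per distance ring,
// bindable as a single texture array. Illustrative names only.
#include <cstdint>
#include <vector>

struct MipStack {
    uint32_t texelsPerSide;            // every level is the same pixel size
    uint32_t levelCount;               // one level per distance ring
    // One slice of texel data per level; on the GPU this would be a texture
    // array (or 3D texture) so only one resource gets bound per draw.
    std::vector<std::vector<uint8_t>> levels;

    // Each level covers a larger world-space area at the same pixel size, so
    // world-space texel density halves as the ring index increases.
    float TexelsPerMeter(float level0Density, uint32_t level) const {
        return level0Density / float(1u << level);
    }
};
```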
Big textures
The benefit of this is that we can use 1 texture
Texturing is no longer a reason for breaking batches
No more filtering-across-boundary issues: 1 sample at 1 level gets proper filtering
Mip mapping still poses a problem though, since the mips are separated out
Mipping solution
Each 'distance' only needs 2 mips: the current mip and the next smallest
At distance boundaries, mip levels should be identical; the current distance is mipped out to the next distance
Memory vs. perf vs. quality tradeoff; YMMV
MipChain
Mip Transition
Updating the huge texture
How do we update the texture? As a GPU resource?
Should use render-to-texture to fill it, but what about compression?
Can't RTT to a compressed target, and GPU compression is limited
Not enough cycles for good quality; shouldn't you be GPU bound anyway?
So then use the CPU to fill it? Lock + memcpy
What We’re covering
Paging & Caches
DXT++ Compression
Compositing Frameworks
Editing Issues
Example-Based Texture Synthesis
Compressing Textures
Goal: fill the large texture on the CPU
Problem: DXT is good, but other systems are better (JPEG)
id Software: JPEG -> RGBA8 -> DXT
Re-compressing decompressed streams: 2nd-level quality artifacts can be introduced
Decompress / recompress speeds?
Compressing DXT
We have to end up at a GPU-friendly format sooner or later...
Remove the middle man? We would need to decompress directly to DXT
Means we need to compress the DXT data even MORE
Let’s look at DXT layout
High 565 color
Low 565 color
2-bit selectors
DXT1: results in 4 bpp
In reality you tend to have a lot of blocks: a 512x512 texture is 16k blocks
…
Really, there are two different types of data per texture:
16-bit block colors and 2-bit selectors
Each one can be compressed even further
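For reference, this is the standard DXT1/BC1 block layout being described: two 16-bit 5:6:5 endpoint colors plus sixteen 2-bit selectors, 8 bytes per 4x4 block.

```cpp
// Standard DXT1/BC1 block: 8 bytes per 4x4 texel block = 4 bits per pixel.
#include <cstdint>

struct Dxt1Block {
    uint16_t color0;     // "high" endpoint, packed 5:6:5
    uint16_t color1;     // "low"  endpoint, packed 5:6:5
    uint32_t selectors;  // sixteen 2-bit indices, one per texel in the block
};
static_assert(sizeof(Dxt1Block) == 8, "DXT1 blocks are 8 bytes");

// Fetch the 2-bit selector for texel (x, y) inside the block
// (assumes the 32-bit field was read on a little-endian target).
inline uint32_t SelectorAt(const Dxt1Block& b, uint32_t x, uint32_t y) {
    return (b.selectors >> (2u * (y * 4u + x))) & 0x3u;
}
```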
Block Colors
Input texture: potentially millions of colors
16-bit compressed: only the actually used colors
Two unique colors per block
But what if that unique color exists in other blocks? We're duplicating data
Let's focus on trying to remove duplicates
Huffman Encoding
Lossless data compression: builds a least-bits dictionary
i.e. more frequently used values get smaller bit representations
String: AAAABBBCCD (80 bits)
Result: 0000101010110110111 (19 bits)
Symbol  Used  Encode
A       40%   0
B       30%   10
C       20%   110
D       10%   111
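To make the example concrete, here is a small, generic Huffman code builder (a sketch, not the codec used in the talk); for AAAABBBCCD it reproduces the code lengths in the table above, though the exact bit patterns depend on how ties are broken.

```cpp
// Compact Huffman code builder: merge the two least frequent nodes until one
// tree remains, then walk the tree to emit codes.
#include <cstdio>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <vector>

struct Node {
    char symbol;                         // valid only for leaves
    size_t count;
    std::unique_ptr<Node> left, right;
};
struct NodeCmp {                         // min-heap on count
    bool operator()(const Node* a, const Node* b) const { return a->count > b->count; }
};

static void EmitCodes(const Node* n, const std::string& prefix,
                      std::map<char, std::string>& out) {
    if (!n->left && !n->right) { out[n->symbol] = prefix.empty() ? "0" : prefix; return; }
    if (n->left)  EmitCodes(n->left.get(),  prefix + "0", out);
    if (n->right) EmitCodes(n->right.get(), prefix + "1", out);
}

int main() {
    const std::string input = "AAAABBBCCD";
    std::map<char, size_t> counts;
    for (char c : input) counts[c]++;

    // Build the tree by repeatedly merging the two least frequent nodes.
    std::priority_queue<Node*, std::vector<Node*>, NodeCmp> heap;
    for (auto& kv : counts) heap.push(new Node{kv.first, kv.second, nullptr, nullptr});
    while (heap.size() > 1) {
        Node* a = heap.top(); heap.pop();
        Node* b = heap.top(); heap.pop();
        Node* parent = new Node{0, a->count + b->count, nullptr, nullptr};
        parent->left.reset(a);
        parent->right.reset(b);
        heap.push(parent);
    }
    std::unique_ptr<Node> root(heap.top());

    std::map<char, std::string> codes;
    EmitCodes(root.get(), "", codes);

    size_t bits = 0;
    for (char c : input) bits += codes[c].size();
    for (auto& kv : codes) std::printf("%c -> %s\n", kv.first, kv.second.c_str());
    std::printf("encoded size: %zu bits (raw: %zu bits)\n", bits, input.size() * 8);
    return 0;
}
```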
Huffman block colors
More common colors are given shorter codes
4096 identical 565 colors = 8 KB; Huffman encoded = 514 bytes (4k single-bit codes plus one 16-bit table entry)
Problem : As number of unique colors increases, Huffman becomes less effective.
Goal : Minimize Unique Colors
Similar colors can be quantized; the human eye won't notice
Vector quantization groups large data sets into correlated groups
Group elements can then be replaced with a single value
Compressing Block Colors
Step #1: vector-quantize the unique input colors, reducing the number of unique colors
Step #2: Huffman-encode the quantized colors; per DXT block, store the Huffman index rather than the 565 color
W00t..
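A minimal sketch of the block-color path (illustrative, not the talk's implementation): each 565 endpoint is mapped to its nearest entry in an already-built codebook, and the resulting index stream is what would then be Huffman-coded.

```cpp
// Step #1 sketch: replace each DXT endpoint color with the index of its
// nearest codebook entry. The index stream (not the raw 565 colors) is what
// gets Huffman-coded in step #2. A real pipeline would build the codebook
// with proper vector quantization (e.g. k-means); here it is assumed given.
#include <cstdint>
#include <limits>
#include <vector>

struct Rgb { int r, g, b; };

// Expand a packed 5:6:5 color so we can measure distance in RGB space.
// A real implementation would expand to 8-bit or weight the channels.
static Rgb Unpack565(uint16_t c) {
    return { (c >> 11) & 31, (c >> 5) & 63, c & 31 };
}
static int Dist2(Rgb a, Rgb b) {
    int dr = a.r - b.r, dg = a.g - b.g, db = a.b - b.b;
    return dr * dr + dg * dg + db * db;
}

std::vector<uint16_t> QuantizeEndpoints(const std::vector<uint16_t>& endpoints565,
                                        const std::vector<uint16_t>& codebook565)
{
    std::vector<uint16_t> indices;
    indices.reserve(endpoints565.size());
    for (uint16_t c : endpoints565) {
        Rgb col = Unpack565(c);
        int best = 0, bestDist = std::numeric_limits<int>::max();
        for (size_t i = 0; i < codebook565.size(); ++i) {
            int d = Dist2(col, Unpack565(codebook565[i]));
            if (d < bestDist) { bestDist = d; best = (int)i; }
        }
        indices.push_back((uint16_t)best);   // small, repetitive symbols for Huffman
    }
    return indices;
}
```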
Selector bits
Each selector block is a small number of bits
Chain 2-bit selectors together to make a larger symbol
Can use Huffman on these too!
Huffman’s revenge!!
A 4x4 array of 2-bit values gives four 8-bit symbols per block
That might be too small to get good compression results
Or a single 32-bit value, which doesn't help much if there are a lot of unique selectors
Do tests on your data to find the ideal size; 8-16 bits works well in practice
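A tiny sketch of the selector-chaining idea: here four 2-bit selectors (one row of a block) are packed into an 8-bit symbol before Huffman coding; the symbol width is the tunable the slide mentions.

```cpp
// Pack 2-bit selectors into 8-bit symbols for the Huffman coder.
#include <cstdint>
#include <vector>

// selectors: one 2-bit value (0..3) per texel, in block scan order.
std::vector<uint8_t> PackSelectorsToBytes(const std::vector<uint8_t>& selectors)
{
    std::vector<uint8_t> symbols;
    symbols.reserve(selectors.size() / 4);
    for (size_t i = 0; i + 3 < selectors.size(); i += 4) {
        uint8_t sym = (uint8_t)((selectors[i + 0] & 3)
                              | ((selectors[i + 1] & 3) << 2)
                              | ((selectors[i + 2] & 3) << 4)
                              | ((selectors[i + 3] & 3) << 6));
        symbols.push_back(sym);   // these 8-bit symbols feed the Huffman coder
    }
    return symbols;
}
```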
(Diagram) DXT data is separated into block colors and selector bits. The block colors go through vector quantization and Huffman coding, producing a Huffman table plus color indexes; the selector bits are Huffman coded into a Huffman table plus selector indexes. Everything is then written to disk.
Compressing DXT: rehash
(Diagram) Decompressing: the Huffman tables plus the color and selector indexes are decoded back into block colors and selector bits, which are used to fill the DXT blocks.
Results: 1024x1024 diffuse
Uncompressed: 3 MB
DXT1 (4 bpp): 512 KB
DXT1++: 91 KB (0.7 bpp)
Results: 1024x1024 AO
Uncompressed: 1 MB
DXT3A (4 bpp): 512 KB
DXT++: 9 KB (0.07 bpp)
BACK UP!
Getting back to texturing...
Insert the decompressed data into a mip-stack level
Can lock the mip-stack level and update the sub-region on the CPU
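A minimal, API-agnostic sketch of that lock-and-copy update, assuming a CPU-visible destination with a row pitch (e.g. a LockRect/Map style API); names are illustrative.

```cpp
// Copy freshly decompressed DXT blocks into a locked sub-region of one
// mip-stack level, one block row at a time, because the destination pitch
// usually differs from the tightly packed source row size.
#include <cstdint>
#include <cstring>

void CopyDxtBlocksIntoLockedRegion(
    uint8_t*       dstBits,        // pointer returned by the lock
    size_t         dstPitch,       // bytes per block row in the destination
    const uint8_t* srcBlocks,      // tightly packed source DXT blocks
    size_t         widthInBlocks,  // region width  / 4
    size_t         heightInBlocks, // region height / 4
    size_t         bytesPerBlock)  // 8 for DXT1, 16 for DXT3/5
{
    const size_t srcRowBytes = widthInBlocks * bytesPerBlock;
    for (size_t row = 0; row < heightInBlocks; ++row) {
        std::memcpy(dstBits + row * dstPitch,
                    srcBlocks + row * srcRowBytes,
                    srcRowBytes);
    }
}
```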
Decompression not the only way..
What We’re covering
Paging & Caches
DXT++ Compression
Compositing Frameworks
Editing Issues
Example-Based Texture Synthesis
Paged data
Pages for the cache can come from anywhere; they don't have to be compressed unique data
What about splatting?
It's the standard screenspace method; can we use it to fill the cache?
Frame buffer splatting
Splatting is the standard texturing method: re-render the terrain to the screen, bind a new texture & alpha each time, and accumulate the results via blending
De facto for terrain texturing
2D Splatting : Compositing
The same process can work for our caching scheme, with the same memory benefits
Don't splat to screen space; composite to a page in the cache
What about compression? Can't composite & compress directly
Alpha blending + DXT compression???
Composite -> ARGB8 -> DXT
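To show the accumulation math, here is a CPU-side sketch of one compositing pass onto a cache page; in practice this runs as render-to-texture on the GPU, and all names here are illustrative.

```cpp
// Blend a repeating detail texture into a cache page using a splat mask.
#include <cstdint>
#include <vector>

struct Image {
    int width, height;
    std::vector<uint8_t> rgba;   // 4 bytes per pixel
};

// 'tile' repeats across the page; 'mask' is the per-pixel splat weight.
// The mask is assumed page-resolution here for simplicity; in practice it is
// lower resolution and sampled with filtering.
void CompositeLayer(Image& page, const Image& tile, const std::vector<uint8_t>& mask)
{
    for (int y = 0; y < page.height; ++y) {
        for (int x = 0; x < page.width; ++x) {
            int p = (y * page.width + x) * 4;
            int t = ((y % tile.height) * tile.width + (x % tile.width)) * 4;
            int a = mask[y * page.width + x];           // 0..255 splat weight
            for (int c = 0; c < 3; ++c) {               // standard lerp per channel
                page.rgba[p + c] =
                    (uint8_t)((page.rgba[p + c] * (255 - a) + tile.rgba[t + c] * a) / 255);
            }
        }
    }
}
```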
Why composite?
Compression is awesome, but we could get better results
Repeating textures + low-res alphas = large memory wins
Decouples us from vert count and overdraw, which is a great thing!
Why not composite?
Quality vs. perf tradeoff: hard to get unique quality at the same perf; more blends = worse perf
Trades uniqueness for memory; tiled features are very visible
Effectively wasting cycles re-creating the same asset every frame
End Goal
Mix of compositing & decompression
Fun ideas for foreground / background: switch between them based on distance
Fun ideas for low-end platforms: high end gets decompression, low end gets compositing
Fun ideas for doing both!
(Diagram) A really flexible pipeline: disk data can either be decompressed or run through the 2D compositor, with CPU or GPU compression on the way into the cache.
What We’re covering
Paging & Caches
DXT++ Compression
Compositing Frameworks
Editing Issues
Example-Based Texture Synthesis
Authoring issues
UR A T00L (programmer..)
Standard pipelines choke on this data: they're designed for 1-user -> 1-asset work and are mostly driven by source control setups
Need to address massive texturing directly
Multi-user editing
The problem with allowing multiple artists to texture a planet: 1 artist per planet is slow...
Standard source control concepts fail: if all texturing is in one file, it can only safely be edited by one person at a time
Solution: 2 million separate files?
Need a better setup
Texture Control Server
Allows multiple users to edit texturing; user feedback is highly important
Edited areas are highlighted immediately to other users
Highlighted means 'has been changed' and 'you can't change it'
(Diagram) Artist A makes a change, the Texturing Server updates the data, and Artist B sees it
Custom Submission
A custom merge tool is required; each machine only checks in its sparse changes
The server handles merges before submitting to the actual source control, acting as a 'man in the middle'
(Diagram) Artist A and Artist B send changes to the Texturing Server, which submits to Source Control
Batching ideas
What about planet-sized batch operations? Could we modify the entire planet at once? Would that ignore affected areas?
Double-edged sword: it's important to still have batching; maybe limit batch operation distances, or flag attempts to modify an already-edited area?
Batch texture generation
Common texturing concepts: set texture by slope, set texture by height, set texture by area
Could we extend it further?
Complex interactions
View 'set' operations as 'masks'; set texturing by procedural functions
Combine masks in a graph setup, a common concept (.kkrieger, World Machine, etc.)
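As a small example of a 'set by slope' operation expressed as a mask that a graph could then combine with others, here is an illustrative sketch; the thresholds and names are assumptions, not the talk's tooling.

```cpp
// Build a per-texel 0..1 mask from terrain normals: flat ground -> 0,
// steep surfaces (cliffs) -> 1, with a linear ramp in between. Masks like
// this can be multiplied, maxed, etc. in a graph.
#include <vector>

struct Float3 { float x, y, z; };

std::vector<float> SlopeMask(const std::vector<Float3>& normals,  // unit normals, z-up
                             float minSlope,   // below this, mask = 0 (flat ground)
                             float maxSlope)   // above this, mask = 1 (cliffs)
{
    std::vector<float> mask(normals.size());
    for (size_t i = 0; i < normals.size(); ++i) {
        float slope = 1.0f - normals[i].z;           // 0 = flat, 1 = vertical
        float t = (slope - minSlope) / (maxSlope - minSlope);
        mask[i] = t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);
    }
    return mask;
}
```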
Mask Graphs = good
Masks can re-generate based on vertex changes, as long as you store the graph, not the mask
Generate multiple masks for other data: apply trees, objects, etc.
Cool algorithms here for all
What We’re covering
Paging & Caches
DXT++ Compression
Compositing Frameworks
Editing Issues
Example-Based Texture Synthesis
Texture Tiling Problems
Repeating textures cause problems: it takes more blends to reduce the repetition
That increases memory and increases the perf burden
Would be nice to fix that automagically
Example Based Texture Synthesis
Generates the output texture per-pixel, choosing each new pixel based on its current neighborhood
Represent an input pixel as a function of its neighbors
Create a search acceleration structure and find the exemplar 'neighborhood' most similar to the input
This is known as 'per-pixel' synthesis
(Figure) Exemplar and the texture being synthesized
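A brute-force sketch of the per-pixel synthesis loop described above (grayscale, causal neighborhoods, no acceleration structure, no multi-resolution); it only shows the core idea and is not the implementation from the talk.

```cpp
// For each output pixel in scan order, compare its already-synthesized
// neighborhood against every exemplar neighborhood and copy the best match.
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

struct Gray { int w, h; std::vector<uint8_t> px; };   // grayscale for brevity

static uint8_t At(const Gray& img, int x, int y) {
    x = ((x % img.w) + img.w) % img.w;                // toroidal wrap
    y = ((y % img.h) + img.h) % img.h;
    return img.px[y * img.w + x];
}

// Sum of squared differences over the causal (already written) half of a
// (2r+1)-wide neighborhood.
static long NeighborhoodCost(const Gray& out, int ox, int oy,
                             const Gray& ex,  int exx, int eyy, int r) {
    long cost = 0;
    for (int dy = -r; dy <= 0; ++dy) {
        for (int dx = -r; dx <= r; ++dx) {
            if (dy == 0 && dx >= 0) break;            // only pixels written so far
            int d = (int)At(out, ox + dx, oy + dy) - (int)At(ex, exx + dx, eyy + dy);
            cost += (long)d * d;
        }
    }
    return cost;
}

Gray Synthesize(const Gray& exemplar, int outW, int outH, int r) {
    Gray out{outW, outH, std::vector<uint8_t>((size_t)outW * outH)};
    for (auto& p : out.px) p = (uint8_t)(std::rand() & 0xFF);   // noise seed
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            long best = std::numeric_limits<long>::max();
            int bestX = 0, bestY = 0;
            for (int ey = 0; ey < exemplar.h; ++ey)             // brute-force NN search
                for (int ex = 0; ex < exemplar.w; ++ex) {
                    long c = NeighborhoodCost(out, x, y, exemplar, ex, ey, r);
                    if (c < best) { best = c; bestX = ex; bestY = ey; }
                }
            out.px[y * out.w + x] = At(exemplar, bestX, bestY);
        }
    }
    return out;
}
```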
Issues
Basically a nearest-neighbor search, which doesn't give the best quality
Only correcting the input pixel based on the previously corrected neighborhood introduces sequential dependencies
Need to increase the neighborhood size to get better results, which increases sample time
(Figure) Exemplar and noisy output image
Appearance-Space Texture Synthesis
Lefebvre & Hoppe 2006 (Microsoft Research)
Multi-resolution: fixes pixels at various scales in the output image, which keeps coarse texture features and reduces image artifacts
GPU based
Highly controllable: artist- or mesh-provided vector fields
For terrain
Can synthesize large textures rather than having the same repeating texture
Use terrain normals as input: allows the texture to 'flow' with the contours
Allow artists to adjust the vectors so they can paint custom swirls etc.
Could even use it to synthesize terrain vertex data
But that’s another talk ;)
Issues
Still too slow to composite MASSIVE terrain at edit time; synthesize the whole planet? That would have to be a render-farm process
Actually, still too slow even for non-massive terrain... Maybe generate custom decals?
But what about the CPU? Multicore may shed light on it; future research?
In Conclusion
Use 1 texture resource for texture data: the MipStack structure
Use DXT++ to decrease the footprint without going through RGBA -> DXT
Multi-input cache-filling algorithms: stream + composite
Use a custom texturing server
Make texture synthesis faster!! I'm talking to you, Mr. Hoppe ;)
Thank you! Andrew Foster, Rich Geldreich, Ken Adams
Questions? cmcanlis@blizzard.com