Sony Computer Entertainment Development Conference 2nd - 3rd August 2001



Page 1: Sony Computer Entertainment Development Conference 2nd - 3rd August 2001

Sony Computer Entertainment Development Conference

2nd - 3rd August 2001

Page 2: GS Master class

Confidential Information of Sony Computer Entertainment Europe

Mark Breugelmans

Page 3: What we know about the GS

• GS memory is 4 megabytes
• GS fill rate is 1.2 gigapixels/sec (textured)
• GS input bandwidth is 64 bits wide
  – We can stream up to 1.2 gigabytes a second
• GS polygon throughput is determined by:
  – Set-up time (number of cycles per vertex)
  – Polygon size (number of pixels to draw)

Page 4: Getting data in

• GS runs at 150 MHz but with only a 64-bit input
• That’s around 24 megabytes/frame (PAL) to be shared between textures and geometry
• Geometry
  – Use strips for fastest geometry set-up
• Textures
  – Always pack 4, 8 and 16-bit textures into 32-bit format beforehand for fastest transfer
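
The figures on this slide follow from the clock rate and bus width. The short sketch below reproduces the arithmetic; the 50 frames per second figure is an assumption (the PAL refresh rate, with one frame per refresh), and the function names are made up for illustration.

```python
GS_CLOCK_HZ = 150_000_000    # 150 MHz, from the slide
GS_INPUT_BITS = 64           # input width, from the slide
PAL_FRAMES_PER_SECOND = 50   # assumption: the game updates at the PAL rate of 50 Hz


def input_bytes_per_second(clock_hz=GS_CLOCK_HZ, input_bits=GS_INPUT_BITS):
    """Peak input rate if one full-width word arrives on every clock cycle."""
    return clock_hz * input_bits // 8


def input_bytes_per_frame(frames_per_second=PAL_FRAMES_PER_SECOND):
    """Peak number of bytes that can reach the GS during one frame."""
    return input_bytes_per_second() // frames_per_second


print(input_bytes_per_second())   # 1200000000 (1.2 gigabytes/sec)
print(input_bytes_per_frame())    # 24000000 (24 megabytes per PAL frame)
```

This is a theoretical ceiling; the measured transfer rates quoted elsewhere in the talk are lower.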

Page 5: Texture Transfer Rates

• Theoretical rate is 1.2 gigabytes/sec
• Transfer rates
  – 32, 24, 16-bit: 1200 megabytes/sec (1065*)
  – 8-bit: 900 megabytes/sec (799*)
  – 4-bit: 600 megabytes/sec (383*)
  – (* PATH3 measured values)
• Sample code shows you how to convert
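
To see why the upload format matters, the sketch below estimates how long the same block of texture data takes to upload at the measured rates quoted on this slide. It assumes 1 megabyte = 10**6 bytes and ignores any set-up overhead; the function names are hypothetical.

```python
# Measured PATH3 rates from the slide, in megabytes/sec, keyed by upload format.
MEASURED_MB_PER_SEC = {32: 1065, 24: 1065, 16: 1065, 8: 799, 4: 383}


def texture_bytes(width, height, bits_per_texel):
    """Size of the raw texel data in bytes."""
    return width * height * bits_per_texel // 8


def upload_microseconds(num_bytes, upload_format_bits):
    """Bytes divided by MB/sec gives microseconds when 1 MB is 10**6 bytes."""
    return num_bytes / MEASURED_MB_PER_SEC[upload_format_bits]


size = texture_bytes(256, 256, 4)          # a 256x256 4-bit texture
as_4bit = upload_microseconds(size, 4)     # uploaded in 4-bit format
as_32bit = upload_microseconds(size, 32)   # same bytes pre-packed as 32-bit
print(f"{size} bytes: {as_4bit:.1f} us as 4-bit, {as_32bit:.1f} us as 32-bit")
```

For this example the 4-bit upload takes roughly 86 microseconds against roughly 31 microseconds for the same data packed as 32-bit.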

Page 6: Small triangles and set-up time

• At most 8 textured pixels are drawn per cycle
• Triangles of up to 8x4 pixels can be drawn within the set-up time
• The GS is not very efficient for tiny triangles

[Chart: Untextured and Textured results for triangle sizes 1x1, 2x2, 4x4, 8x4, 8x8 and 16x16; vertical axis 0 to 20]

Page 7: Small triangles and Fill-rate

• Pixels are drawn by the GS in groups of 8
• Small triangles will not make use of this

  Triangle size   Pixels drawn/cycle
  1x1             0.12
  2x2             0.5
  4x4             2
  8x8             5.27
  16x16           6.13
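
The pixels-per-cycle figures can be turned into an effective fill rate by multiplying by the clock rate. The sketch below does that, assuming the 150 MHz clock and the peak of 8 pixels per cycle quoted elsewhere in the talk; the helper names are illustrative only.

```python
GS_CLOCK_MHZ = 150          # from the "Getting data in" slide
PEAK_PIXELS_PER_CYCLE = 8   # pixels are drawn in groups of 8

# Pixels drawn per cycle for each triangle size, from the slide.
PIXELS_PER_CYCLE = {"1x1": 0.12, "2x2": 0.5, "4x4": 2.0, "8x8": 5.27, "16x16": 6.13}


def fill_rate_mpixels(size):
    """Effective fill rate in megapixels/sec for a given triangle size."""
    return PIXELS_PER_CYCLE[size] * GS_CLOCK_MHZ


def utilisation(size):
    """Fraction of the 8 pixels/cycle peak actually achieved."""
    return PIXELS_PER_CYCLE[size] / PEAK_PIXELS_PER_CYCLE


for size in PIXELS_PER_CYCLE:
    print(f"{size:>6}: {fill_rate_mpixels(size):7.1f} Mpixels/sec, "
          f"{utilisation(size):6.1%} of peak")
```

On these numbers a 1x1 triangle reaches about 18 megapixels/sec, while a 16x16 triangle reaches about 920 megapixels/sec.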

Page 8: Fill rate factors

• Triangle size
• Texture to pixel size
• Texture filtering modes (Tri-linear, mip-maps)
• Fog
• Caches
  – Texture page buffer
  – Frame/Z page buffer

Page 9: Frame/Z Buffer Page caches

• Frame and Z-Buffer: 8k
  – Split into 2 buffers: 32x32x32-bit = 4k each
• Page refill is very fast
  – 8192 bits per cycle (150 gigabytes/sec bandwidth!)
  – Whole 8k page buffer refilled in 8 cycles

[Diagram: Z buffer page 32x32, Frame page 32x32]
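
The page buffer figures are consistent with each other, as the small check below shows. It assumes the 150 MHz clock quoted earlier in the talk; the function names are made up for this sketch.

```python
GS_CLOCK_HZ = 150_000_000      # from the "Getting data in" slide
REFILL_BITS_PER_CYCLE = 8192   # from this slide


def buffer_bytes(width=32, height=32, bits_per_pixel=32):
    """Size of one page buffer (frame or Z) in bytes."""
    return width * height * bits_per_pixel // 8


def refill_cycles(num_bytes):
    """Cycles needed to refill num_bytes at 8192 bits per cycle."""
    return num_bytes * 8 // REFILL_BITS_PER_CYCLE


def refill_bytes_per_second():
    """Internal refill bandwidth in bytes per second."""
    return REFILL_BITS_PER_CYCLE // 8 * GS_CLOCK_HZ


print(buffer_bytes())                     # 4096 (4k each)
print(refill_cycles(2 * buffer_bytes()))  # 8 cycles for the whole 8k
print(refill_bytes_per_second())          # 153600000000, about 150 gigabytes/sec
```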

Page 10: Frame/Z Page Cache misses

• Frame/Z page cache gets filled line by line as drawing scans down
  – Fill rate while varying height is roughly constant
  – Fill rate while varying width varies with page misses
• Cache misses for the Frame/Z page don’t drop fill-rate much below 1 gigapixel/sec
• Textures are usually more of a problem

Page 11: Fill-rate vs. Triangle size

[Chart: fill rate (vertical axis, 0 to 1400) against triangle size (2x2, 4x4, 8x8, 16x16, 32x32, 64x64, 128x128, 256x256) for Untextured and Textured* triangles]

*Texture is on cache without reducing size

Page 12: Level of detail

• As polygon counts head into the millions, polygon sizes in pixels shrink rapidly
• PA scans of games suggest that better use of LOD would benefit some games significantly
  – The back of a 5000 polygon car may result in just 50 visible pixels once projected onto the screen
  – Similarly, there’s no point having detailed textures that are going to be shrunk so much

Page 13: A pixel density test

• Set all vertices to:
  – red=0, green=1, blue=0
  – alpha blend = destination + source
  – z test = disabled
  – texture = disabled
• Lighter areas show you where there is high density or overdraw
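
The effect of this test can be imitated in software. The sketch below is only a model of the idea, not GS code: axis-aligned rectangles stand in for primitives, "green=1" is read as one step of an 8-bit channel, and the result is clamped at 255. All of these are assumptions made for illustration.

```python
def overdraw_map(width, height, rects, increment=1):
    """Accumulate additive 'green' per pixel for a list of rectangles.

    rects holds (x0, y0, x1, y1) half-open screen rectangles that stand in
    for drawn primitives. Each covered pixel gets destination + source.
    """
    green = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in rects:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                # Additive blend, clamped to the 8-bit channel range.
                green[y][x] = min(255, green[y][x] + increment)
    return green


density = overdraw_map(8, 8, [(0, 0, 4, 4), (2, 2, 6, 6)])
for row in density:
    print(" ".join(str(v) for v in row))
```

Pixels covered by both rectangles end up with a value of 2, which corresponds to the lighter areas the slide describes.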

Page 14: Texture Page caches

• Texture cache: 8k

  Texel format   Page size
  32-bit         64x32
  16-bit         64x64
  8-bit          128x64
  4-bit          128x128

Also used for 24, 8H, 4HL, 4HH
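
Each of the page sizes above works out to the same 8k of texel data, which the following check confirms. The dictionary and function names are made up for this sketch.

```python
# Page dimensions in texels for each texel depth, from the slide.
PAGE_DIMENSIONS = {32: (64, 32), 16: (64, 64), 8: (128, 64), 4: (128, 128)}


def page_bytes(bits_per_texel):
    """Bytes of texel data held by one page at the given depth."""
    width, height = PAGE_DIMENSIONS[bits_per_texel]
    return width * height * bits_per_texel // 8


for bits in PAGE_DIMENSIONS:
    print(f"{bits:>2}-bit page: {page_bytes(bits)} bytes")   # 8192 bytes each
```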

Page 15: Texture Cache misses example

• 64x32 sprite, 24-bit texture

  Texture size   Fill-rate   GS cycles
  64x32          1158        262
  65x32          596         514

• One pixel outside the page halves fill rate!
• Texture cache misses are based on the texture co-ordinates, not the original texture size
• Crossing texture pages also affects the cache
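
A quick way to spot this problem offline is to count how many texture pages a primitive's texel range touches. The sketch below assumes integer texel co-ordinates, a texture that starts on a page boundary, and pages laid out in a regular grid; `pages_touched` is a hypothetical helper, not part of any SDK.

```python
def pages_touched(u0, v0, u1, v1, page_w, page_h):
    """Count the pages overlapped by the texel rectangle [u0, u1) x [v0, v1)."""
    columns = (u1 - 1) // page_w - u0 // page_w + 1
    rows = (v1 - 1) // page_h - v0 // page_h + 1
    return columns * rows


# 24-bit textures use a 64x32 page.
print(pages_touched(0, 0, 64, 32, 64, 32))   # 1: fits in a single page
print(pages_touched(0, 0, 65, 32, 64, 32))   # 2: one texel column spills over
```

Any primitive for which this count is greater than 1 is a candidate for subdivision or for re-packing its texture.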

Page 16: Crossing Texture Pages efficiently

• The blocks in the pages are zig-zagged in 1/4s, 1/16s etc. for efficiency
• Use at most 1/2 page width and height to avoid crossing 3 quarters, which causes many block reloads / page misses

[Diagram: one region that crosses 2 quarters of a page and one that crosses 3 quarters]

Page 17: Recommended subdivision

• PA scans showing GS wait for texture
• Suggested subdivision for each texture mode:

  Texture mode        Subdivision
  4-bit (128x128)     64x64
  8-bit (128x64)      64x32
  16-bit (64x64)      32x32
  24/32-bit (64x32)   32x16

[PA scans: Not subdivided 256x256 (4-bit); Subdivided 256x256 (4-bit)]
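
One possible way to apply the table is to split a primitive's texel range along a grid of the suggested tile size. The sketch below assumes integer texel co-ordinates and an axis-aligned range; how the matching screen-space vertices are generated is left out, and the names are hypothetical.

```python
# Suggested subdivision (width, height) in texels per texture depth, from the slide.
SUBDIVISION = {4: (64, 64), 8: (64, 32), 16: (32, 32), 24: (32, 16), 32: (32, 16)}


def subdivide(u0, v0, u1, v1, bits_per_texel):
    """Split the texel range [u0, u1) x [v0, v1) along the suggested grid.

    Returns a list of (u0, v0, u1, v1) tiles, none larger than the
    suggested subdivision and none crossing a grid line.
    """
    tile_w, tile_h = SUBDIVISION[bits_per_texel]
    tiles = []
    v = v0
    while v < v1:
        v_next = min(v1, (v // tile_h + 1) * tile_h)   # next grid row or the end
        u = u0
        while u < u1:
            u_next = min(u1, (u // tile_w + 1) * tile_w)   # next grid column or the end
            tiles.append((u, v, u_next, v_next))
            u = u_next
        v = v_next
    return tiles


tiles = subdivide(0, 0, 256, 256, 4)
print(len(tiles))   # 16 tiles of 64x64 for a 256x256 4-bit texture
```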

Page 18: Reducing texture cache miss

• Use 4-bit or 8-bit textures
• Clamp texture to page size to keep in page
  – Bilinear may fetch 1 pixel outside your co-ordinate range
• Either/Or
  – Keep all textures within one page
  – Sub-divide polygons until the ST co-ordinates of each polygon stay within a half cache page

Page 19: Texture reduction penalty

[Chart: fill rate (vertical axis, 0 to 1200) against texture co-ordinates (8x8, 16x16, 32x32, 64x64, 128x128, 256x256) for a 4-bit texture on a 16x16 sprite]

Page 20: Mip-maps

• Good for avoiding texture reduction
  – Look better
  – May help reduce texture transfers for distant drawing
• Watch out for performance on large polygons
  – Mip-maps in different pages can cause multiple texture cache reloads

Page 21: Mip-maps on large primitives

• Primitive is drawn line by line
  – Wall reloads all mip-maps for every line
  – Road loads each mip-map only once

[Diagram: mip-map levels numbered 4, 3, 2, 1 on one primitive and 1, 2, 3, 4 on the other]

Page 22: Tri-linear performance

• Tri-linear fill rate is 1/2 the speed of bilinear
  – It’s fetching twice the number of pixels
• When two mip-map levels are in different pages, tri-linear is 8x slower than bilinear
  – Due to multiple page loads per pixel
• Solutions
  – Keep smaller mip-maps in the same page
  – Disable tri-linear for near mip-map levels
  – Perhaps do tri-linear as 2 passes with alpha

Page 23: Fill-rate and Fog

[Chart: fill rate (vertical axis, 0 to 1200) for Textured* and Texture*+Fog]

*Texture is on cache without reducing size

Page 24: Alternative FOG

• For larger textured primitives it is quicker to do fog as a second pass
• Technique
  – 1st pass: draw a textured primitive
  – 2nd pass: draw a Gouraud-shaded, alpha-blended primitive
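
The reason a second pass can stand in for per-pixel fog is that both come down to the same linear blend. The sketch below models this per pixel under stated assumptions: the fog amount runs from 0 (no fog) to 1 (fully fogged), the second pass draws the fog colour with the fog amount as its alpha, and the blend is source * alpha + destination * (1 - alpha). It is a numerical illustration, not GS register set-up.

```python
def fog_single_pass(texel, fog_colour, fog_amount):
    """Fog applied while texturing: mix texel and fog colour in one step."""
    return tuple(t * (1.0 - fog_amount) + f * fog_amount
                 for t, f in zip(texel, fog_colour))


def alpha_blend(source, destination, alpha):
    """Standard blend between the incoming colour and the frame buffer."""
    return tuple(s * alpha + d * (1.0 - alpha)
                 for s, d in zip(source, destination))


def fog_two_pass(texel, fog_colour, fog_amount):
    """Pass 1 writes the plain texel; pass 2 blends the fog colour over it."""
    frame = texel                                      # 1st pass: textured primitive
    return alpha_blend(fog_colour, frame, fog_amount)  # 2nd pass: fog-coloured primitive


texel = (200.0, 100.0, 50.0)
fog = (128.0, 128.0, 128.0)
print(fog_single_pass(texel, fog, 0.25))
print(fog_two_pass(texel, fog, 0.25))
```

In the two-pass version the fog amount would be carried in the vertex alpha of the second primitive, so it is interpolated across the polygon along with the Gouraud colour.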

Page 25: Scissoring

• Early pixel reject
  – Pixels discarded in lines
  – Eliminates all page misses and texture loads
  – Speed depends on location of triangle

[Diagram: timings for 16x16, 64x64 and 128x128 triangles at different locations. Note: all timings in GS cycles. Values as extracted, in groups of three: 25 25 2; 52 52 34; 79 1135 34; 12 12 2; 25 26 18; 36 280 18; 4 4 2; 7 7 6; 9 12 6]

Page 26: Context changes with TEX0_1

• TEX0_1 only takes 2 GS cycles if the CLUT isn’t loaded and the texture address isn’t changed
• TEX2_1 (CLUT) is no quicker than TEX0_1; it just masks some of the TEX0_1 fields

Page 27: CLUTs

• Loading a new CLUT causes 2 things to happen
  – New CLUT must be loaded
  – Texture cache is invalidated
• Loading just a CLUT is no faster than loading both CLUT and texture
• However, selecting an already loaded CLUT is a zero cost operation

Page 28: Fill-rates: Summary

• Texture page caches have the biggest effect on fill rate
  – Subdivide large texture co-ordinate ranges
  – Keep mip-maps in the same page
• Texture reduction also costs fill rate as texel reads become the bottleneck
• Frame buffer page misses aren’t too bad
  – Cost for big polygons is not bad compared to texture penalties

Page 29: Making the most of VRAM

• 4-bit and 8-bit palettised textures are the most compact
• Tiled textures with repeat and region repeat
• Multi-pass techniques
  – Alpha blending is zero cost
    • Useful for multi-pass techniques
  – Useful blend types
    • Standard blend between SRC and FRAME
    • Multiply blend (using alpha channel)

Page 30: Tiling textures

• Very easy way to add detail for little cost
• Repeat range
  – 0.10.4 UV (0 - 1024)
  – 1.11.4 ST (+-2048), which is 4x the range
  – Number of repeats reduces for larger textures
• Watch out when scissoring massively tiled polygons
  – Perspective errors
  – Recalculate smaller texture co-ordinates
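
The last point under "Repeat range" is simple division: the co-ordinate range is fixed, so a larger texture fits into it fewer times. The sketch below takes the two ranges straight from the slide and treats them as spans measured in texels, which is an assumption of this example.

```python
UV_RANGE_TEXELS = 1024       # 0 - 1024, from the slide
ST_RANGE_TEXELS = 2 * 2048   # +-2048, from the slide (4x the UV range)


def max_repeats(texture_size, coordinate_range):
    """How many times a texture of the given size fits in the range."""
    return coordinate_range // texture_size


for size in (64, 128, 256):
    print(f"{size}x{size}: {max_repeats(size, UV_RANGE_TEXELS)} repeats with UV, "
          f"{max_repeats(size, ST_RANGE_TEXELS)} with ST")
```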

Page 31: Texture Compression

• Monochrome textures can compress really well to 4-bit

Page 32: Texture Compression

• The eye is sensitive to gradual changes in luminance, so palettes look bad in this case
• In this case it would be better to reduce the image in size and use the GS bilinear filter to interpolate

Page 33: Texture Compression

• You can add a low bit depth detail map to a low resolution interpolated image
• Total size of the 2 images is much less than a single 24-bit image. We can also use tiling.

Page 34: 2 Pass Texture Compression

[Diagram: the original 24-bit or 32-bit image is split into two maps]

• Colour map: 1/16 area of original, 8-bit CLUT up to 32-bit
• Detail map: full-size, 2-bit or 4-bit grayscale

Page 35: Texture Compression

• Detail map CLUT is concentrated around the centre
  – Eye is sensitive to small changes in luminance
• Detail map is calculated as:
  – original pixel / colour map pixel = alpha multiply, which is then mapped to a CLUT

[Diagram: detail map CLUT scale running from 0.0 to 2.0, centred on 1.0]
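
The detail map calculation can be sketched for a single channel as follows. This is a simplified model: it uses 16 evenly spaced CLUT levels over the 0.0 to 2.0 range, whereas the slide says the real CLUT is concentrated around the centre, and it treats a zero colour map pixel as a neutral factor of 1.0. The function names and these choices are assumptions for illustration.

```python
MAX_FACTOR = 2.0   # top of the 0.0 - 2.0 multiply range shown on the slide
LEVELS = 16        # 4-bit detail map; evenly spaced here for simplicity


def detail_factor(original, colour):
    """original pixel / colour map pixel, limited to the multiply range."""
    if colour == 0:
        return 1.0   # assumption: nothing to scale, so store a neutral factor
    return min(MAX_FACTOR, original / colour)


def quantise(factor):
    """Map a multiply factor to the nearest of the evenly spaced CLUT indices."""
    return round(factor / MAX_FACTOR * (LEVELS - 1))


def dequantise(index):
    """Multiply factor stored in a CLUT entry."""
    return index / (LEVELS - 1) * MAX_FACTOR


def reconstruct(colour, index):
    """Decompress: colour map pixel multiplied by the detail factor."""
    return min(255, round(colour * dequantise(index)))


index = quantise(detail_factor(120, 100))
print(index, reconstruct(100, index))
```

A production encoder would pick the CLUT levels to be denser near 1.0, as the slide suggests, so that small luminance changes are reproduced more accurately.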

Page 36: 2-bit Luminance Textures

[Diagram: a 4-bit image holding two 2-bit textures. CLUT 1 covers the bit patterns x x 0 0, x x 0 1, x x 1 0 and x x 1 1; CLUT 2 covers 0 0 x x, 0 1 x x, 1 0 x x and 1 1 x x (x = ignored bit)]
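
The trick in this diagram can be sketched as follows: two 2-bit images share one 4-bit texture, and each is selected simply by choosing a CLUT that ignores the other image's bits. Which pair of bits belongs to which CLUT is read from the diagram and is an assumption here, as are the four luminance values.

```python
def pack(low_image, high_image):
    """Combine two 2-bit images (values 0-3) into one 4-bit image."""
    return [(high << 2) | low for low, high in zip(low_image, high_image)]


def make_cluts(luminance):
    """Build two 16-entry CLUTs from a 4-entry luminance ramp.

    CLUT 1 looks only at the low two bits of the index,
    CLUT 2 looks only at the high two bits.
    """
    clut1 = [luminance[index & 3] for index in range(16)]
    clut2 = [luminance[index >> 2] for index in range(16)]
    return clut1, clut2


luminance = [0, 85, 170, 255]   # hypothetical 4-step grey ramp
image_a = [0, 1, 2, 3]
image_b = [3, 3, 0, 1]
packed = pack(image_a, image_b)
clut1, clut2 = make_cluts(luminance)

print([clut1[p] for p in packed])   # decodes image_a
print([clut2[p] for p in packed])   # decodes image_b
```

Since selecting an already loaded CLUT is described in the talk as a zero cost operation, switching between the two images should be cheap once both CLUTs are resident.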

Page 37: Texture Compression

• Decompressing the texture
  – Draw low resolution colour map normally
  – Draw detail map with alpha multiply
• Two alternatives for detail map drawing
  – Decompress to a new texture first
  – Draw directly using two passes
• Colour map can serve as a low-res mip-map
  – Detail map can be faded in for close-ups
• Benefit is reduced GIF->GS data transfer

Page 38: Interlace Flickering

• For high resolution you need to run the TV interlaced
  – Odd and even lines are drawn on alternate frames
  – Any image not drawn on both lines flickers
• Scan line blending solves the problem
• This flickering is much more of a problem than edge aliasing

Page 39: Interlace Flickering - Solutions

• Choose appropriate mip-map textures
• For games not guaranteed to run in a frame
  – Use 2 circuit method (very easy)
• If you can run in a frame you can save some VRAM compared to the 2 circuit method
  – Sprite method: saves 1/2 a display buffer
  – Motion blur method: saves all VRAM
  – 2 pass method: saves all VRAM but 2x polygons

Page 40: Super-sampling techniques and edge Anti-aliasing

• Edge anti-aliasing is nice but you must sort your polygons and it’s slower to draw
• Down-sampling is easy but expensive in VRAM
  – Draw objects to large off-screen buffers and down-sample (we can still Z test if we scale up Z first)
• An alternative method
  – Render 4x with 25% alpha and 1/2 pixel offset in 4 directions. Same effect using extra polygons rather than VRAM

Page 41: One last thing - Loading screens and framing out

• Framing out on loading
  – Use field mode perhaps
  – You could use 16-bit field mode in the z buffer?
  – Use a low res background with 2nd circuit text?

Page 42: Summary

• Maximising GS input paths
  – Transfer textures as 32-bit
  – Consider detail textures and texture tiling
• Keeping up fill-rates
  – Subdivide textures to within caches
  – Don’t reduce textures
  – Make use of LOD to avoid <1 pixel area triangles
  – Watch out for penalties on Fog and Mip-maps