SIGGRAPH Asia 2012: GPU-accelerated Path Rendering


Presented at SIGGRAPH Asia 2012 in Singapore on Friday, 30 November 14:15 - 16:00 during the "Points and Vectors" session. Find the paper at http://developer.nvidia.com/game/gpu-accelerated-path-rendering or on Slideshare. For thirty years, resolution-independent 2D standards (e.g. PostScript, SVG) have relied largely on CPU-based algorithms for the filling and stroking of paths. Learn about our approach to accelerate path rendering with our GPU-based "Stencil, then Cover" programming interface. We've built and productized our OpenGL-based system.

Transcript of SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Page 1: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering
Page 2: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

GPU-accelerated Path Rendering

Mark Kilgard & Jeff Bolz
NVIDIA Corporation
November 30, 2012

Page 3: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

GPUs are good at a lot of stuff

Page 4: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Games

Battlefield 3, EA

Page 5: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Data visualization

Page 6: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Product design

Catia

Page 7: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Physics simulation

CUDA N-Body [Nyland et al., GPU Gems 3, 2007]

Page 8: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Interactive ray tracing

OptiX [Parker et al., SIGGRAPH 2010]

Page 9: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Game physics

PhysX [Tonge et al., SIGGRAPH 2012]

Page 10: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Molecular modeling

NCSA

Page 11: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Impressive stuff

Page 12: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

What about advancing 2D graphics?

Page 13: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Can GPUs render & improve the immersive web?

Page 14: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Complete Web Pages Rendered via OpenGL

without Pre-rendered Glyph Bitmaps and all on GPU

Page 15: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Not just zoomed & rotated, also perspective

No tricks

Every glyph is rendered from its outline; no render-to-texture

Magnify & minify with no transitional pixelization or tile popping artifacts

Synced to refresh rate; 60 Hz updates

Page 16: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Live demo!

Web page

Control points of TrueType glyphs visualized

Zoomed in

Projected

Page 17: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

What is path rendering?

A rendering approach
– Resolution-independent two-dimensional graphics
– Occlusion & transparency depend on rendering order
– So-called “Painter’s Algorithm”
– Basic primitive is a path to be filled or stroked
– Path is a sequence of path commands
– Commands are moveto, lineto, curveto, arcto, closepath, etc.

Standards
– Content: PostScript, PDF, TrueType fonts, Flash, Scalable Vector Graphics (SVG), HTML5 Canvas, Silverlight, Office drawings
– APIs: Apple Quartz 2D, Khronos OpenVG, Microsoft Direct2D, Cairo, Skia, Qt::QPainter, Anti-Grain Geometry
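To make "a path is a sequence of path commands" concrete, here is a minimal sketch in Python. The `PathCommand` type and `parse_path` helper are hypothetical names introduced only for illustration, and the one-letter syntax (M, L, Q, C, Z) is borrowed from SVG path data; arcs and relative (lowercase) commands are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Scalar coordinates consumed by each command (SVG-style letters, absolute only).
COORDS_PER_COMMAND = {"M": 2, "L": 2, "Q": 4, "C": 6, "Z": 0}


@dataclass
class PathCommand:
    op: str                    # "M" moveto, "L" lineto, "Q"/"C" curveto, "Z" closepath
    coords: Tuple[float, ...]  # control points flattened as x0, y0, x1, y1, ...


def parse_path(text: str) -> List[PathCommand]:
    """Parse a whitespace- or comma-separated path string into commands."""
    tokens = text.replace(",", " ").split()
    commands = []
    i = 0
    while i < len(tokens):
        op = tokens[i]
        if op not in COORDS_PER_COMMAND:
            raise ValueError(f"unknown path command: {op!r}")
        n = COORDS_PER_COMMAND[op]
        coords = tuple(float(t) for t in tokens[i + 1 : i + 1 + n])
        if len(coords) != n:
            raise ValueError(f"command {op} needs {n} coordinates")
        commands.append(PathCommand(op, coords))
        i += 1 + n
    return commands


triangle = parse_path("M 0 0 L 10 0 L 5 8 Z")
num_commands = len(triangle)                        # 4 commands
num_coords = sum(len(c.coords) for c in triangle)   # 6 scalar coordinates
```

Counting commands and scalar coordinates this way gives the same kind of per-scene statistics the slides quote (paths, commands, coordinates).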

Page 18: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Path Rendering Standards

Document Printing and Exchange

Immersive Web Experience

2D Graphics Programming Interfaces

Office Productivity Applications

Resolution-Independent Fonts

OpenType

TrueType

Flash

Open XML Paper (XPS)

Java 2D API

Mac OS X 2D API

Khronos API

Adobe Illustrator

Inkscape Open Source

Scalable Vector Graphics

QtGui API

HTML 5

Page 19: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Seminal Path Rendering Paper

John Warnock & Douglas Wyatt, Xerox PARC
Presented SIGGRAPH 1982

Warnock founded Adobe months later

John Warnock, Adobe founder

Page 20: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Reasons to GPU-accelerate Path Rendering

Power wall

More functionality with less latency… with less power

Increasing screen resolutions

Multi-touch

Increasing screen densities

Immersive 2D web content

Page 21: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Live Demo

New York Times rendered from its resolution-independent form

Flash content

Classic PostScript content

Complex text rendering

Page 22: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Live demo!

Dragon, and zoomed dragon

3D dice, but really 2D + gradients

Dashed stroking

Complex gradient content

Gradients with blending

Maps with text

Page 23: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Last Year’s SIGGRAPH Results in Real-time

“Digital Micrography” Ron Maharik, Mikhail Bessmeltsev, Alla Sheffer, Ariel Shamir, and Nathan Carr

SIGGRAPH 2011

“Girl with Words in Her Hair” scene

591 paths

338,507 commands

1,244,474 scalar coordinates

Page 24: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Our Contributions

A novel “stencil, then cover” programming interface for path rendering, well-suited to acceleration by GPUs

Our NV_path_rendering API

Our programming interface’s efficient implementation within OpenGL to avoid CPU bottlenecks

Productized, shipping in GeForce/Quadro drivers

Accompanying algorithms to handle
– tessellation-free stenciled stroking of paths
– standard stroking embellishments such as dashing
– clipping paths to arbitrary paths
– mixing 3D and path rendering

Page 25: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Notable Prior Art

Loop & Blinn 2005: Resolution independent curve rendering using programmable graphics hardware

Kokojima, et al. 2006: Resolution independent rendering of deformable vector objects using graphics hardware

Rueda, et al. 2008: GPU-based rendering of curved polygons using simplicial coverings

Page 26: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

CPU vs. GPU at Rendering Tasks over Time

Goal of our research is to make path rendering a GPU task

Render all interactive pixels, whether 3D or 2D or web content with the GPU

[Figure: two charts, “Pipelined 3D Interactive Rendering” and “Path Rendering”. Y axis: 0% to 100%; X axis: years 1998 to 2012; series: GPU and CPU.]

Page 27: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Our Approach

“Stencil, then Cover” (StC)
– Map the path rendering task from a sequential algorithm… to a pipelined and massively parallel task
– Break path rendering into two steps
  – First, “stencil” the path’s coverage into the stencil buffer
  – Second, conservatively “cover” the path
    – Test against the path coverage determined in the first step
    – Shade the path
    – And reset the stencil value to render the next path

Step 1: Stencil

Step 2: Cover

repeat
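The two-step loop can be sketched on the CPU to show the data flow. This is only an illustration under stated assumptions, not the driver's implementation: paths are plain polygons, there is one sample per pixel at the pixel center, and the function names `winding_number`, `stencil_fill`, and `cover_fill` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def winding_number(p: Point, poly: List[Point]) -> int:
    """Signed count of how many times the closed polygon winds around p."""
    x, y = p
    wn = 0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        # Sign tells which side of the directed edge the point lies on.
        side = (x1 - x0) * (y - y0) - (x - x0) * (y1 - y0)
        if y0 <= y < y1 and side > 0:      # upward crossing, point on the left
            wn += 1
        elif y1 <= y < y0 and side < 0:    # downward crossing, point on the right
            wn -= 1
    return wn


def stencil_fill(stencil, poly):
    """Step 1: write the path's coverage into the stencil buffer only."""
    for row in range(len(stencil)):
        for col in range(len(stencil[0])):
            stencil[row][col] += winding_number((col + 0.5, row + 0.5), poly)


def cover_fill(stencil, color_buf, poly, color):
    """Step 2: draw a conservative box; stencil test, shade, reset."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    rows = range(max(0, math.floor(min(ys))), min(len(stencil), math.floor(max(ys)) + 1))
    cols = range(max(0, math.floor(min(xs))), min(len(stencil[0]), math.floor(max(xs)) + 1))
    for row in rows:
        for col in cols:
            if stencil[row][col] != 0:        # non-zero fill rule
                color_buf[row][col] = color   # "shade" the covered pixel
                stencil[row][col] = 0         # reset for the next path


W = H = 8
stencil = [[0] * W for _ in range(H)]
color_buf = [["."] * W for _ in range(H)]
paths = [([(1, 1), (5, 1), (5, 5), (1, 5)], "R"),
         ([(3, 3), (7, 3), (7, 7), (3, 7)], "B")]
for poly, color in paths:   # rendering order decides occlusion
    stencil_fill(stencil, poly)
    cover_fill(stencil, color_buf, poly, color)
```

Because the cover step resets every stencil value it touches, the buffer is clean before the next path is stenciled, and the later path simply paints over the earlier one where they overlap.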

Page 28: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Our Implemented System: NV_path_rendering

OpenGL extension to GPU-accelerate path rendering

Uses “stencil, then cover” (StC) approach via OpenGL calls
– Create a path object
– Step 1: “Stencil” the path object into the stencil buffer
  – GPU provides fast stenciling of filled or stroked paths
– Step 2: “Cover” the path object and stencil test against its coverage stenciled by the prior step
  – Application can configure arbitrary shading during the step
– More details later

Supports the union of functionality of all major path rendering standards
– Includes all stroking embellishments
– Includes first-class text and font support
– Allows functionality to mix with traditional 3D and programmable shading

Page 29: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Vertex assembly

Primitive assembly

Rasterization

Fragment operations

Display

Vertex operations

Application

Primitive operations

Texture memory

Pixel assembly (unpack)

Pixel operations

Pixel pack

Vertex pipeline

Pixel pipeline

Application

transform feedback

readback

Framebuffer

Raster operations

Path pipeline

Path specification

Transform path

Fill/Stroke Stenciling

Fill/Stroke Covering

Page 30: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Stencil Fill Process Visualized

Visualization of “invisible” stencil-only geometry generated during stencil step

Net result of stencil increments and decrements is path’s winding number
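The slides do not spell out which stencil-only geometry is generated, so the following is a sketch of one classic construction rather than the driver's method: fan triangles from the first vertex, increment the count inside counter-clockwise triangles, and decrement inside clockwise ones. The helper names are hypothetical, and sample points are chosen off the triangle edges, because a real rasterizer relies on tie-breaking rules that this sketch omits.

```python
def signed_area2(a, b, c):
    """Twice the signed area of triangle abc; positive when counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def inside_triangle(p, a, b, c):
    """True when p is inside or on triangle abc, for either orientation."""
    d1 = signed_area2(a, b, p)
    d2 = signed_area2(b, c, p)
    d3 = signed_area2(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def fan_winding(p, poly):
    """Net stencil count at p after drawing the triangle fan anchored at poly[0]."""
    anchor = poly[0]
    count = 0
    for b, c in zip(poly[1:], poly[2:]):
        area = signed_area2(anchor, b, c)
        if area != 0 and inside_triangle(p, anchor, b, c):
            count += 1 if area > 0 else -1   # increment or decrement by facing
    return count


# A square with a V-shaped notch cut into its top edge (concave polygon).
notched = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
inside_right = fan_winding((3, 1), notched)     # covered by one +1 triangle
in_notch = fan_winding((3, 2.8), notched)       # +1 and -1 triangles cancel
inside_left = fan_winding((0.5, 2), notched)    # covered by one +1 triangle
```

The point in the notch is covered by two fan triangles of opposite orientation, so its increments and decrements cancel to zero, which is why no tessellation into non-overlapping triangles is needed.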

Page 31: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Cover Fill Geometry Visualized

Page 32: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Stroking Approach

Stroked line segments are straightforward
– Drawn as rectangles into the stencil buffer

Curved stroked segments are more involved
– Curved segments are broken into stroked quadratic segments
– Hulls are formed around each stroked quadratic segment
– An intricate fragment discard shader solves the cubic equation for every sample to determine the sample’s containment in the quadratic stroke segment
– If contained, the sample’s stencil value is updated

Caps & joins are also drawn into the stencil buffer

Covering geometry is computed as union of rectangles, hulls, and cap/join geometry
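Why a cubic? For a quadratic Bezier B(t), the points of the curve whose normal passes through a sample p satisfy (B(t) - p) · B'(t) = 0, which is a cubic in t. The sketch below is a CPU illustration of that test, not the shader itself: `real_roots` and `in_quadratic_stroke` are hypothetical names, caps and joins are ignored, and a sample counts as contained when some root t in [0, 1] lies within half the stroke width.

```python
import math


def real_roots(a, b, c, d, eps=1e-12):
    """Real roots of a*t^3 + b*t^2 + c*t + d = 0, including degenerate cases."""
    if abs(a) < eps:                        # quadratic or lower degree
        if abs(b) < eps:
            return [] if abs(c) < eps else [-d / c]
        disc = c * c - 4 * b * d
        if disc < 0:
            return []
        s = math.sqrt(disc)
        return [(-c - s) / (2 * b), (-c + s) / (2 * b)]
    # Depressed cubic u^3 + p*u + q = 0 with t = u - b/(3a).
    b, c, d = b / a, c / a, d / a
    p = c - b * b / 3
    q = 2 * b ** 3 / 27 - b * c / 3 + d
    shift = -b / 3
    disc = (q / 2) ** 2 + (p / 3) ** 3
    if disc > 0:                            # one real root (Cardano)
        s = math.sqrt(disc)
        u = math.copysign(abs(-q / 2 + s) ** (1 / 3), -q / 2 + s)
        v = math.copysign(abs(-q / 2 - s) ** (1 / 3), -q / 2 - s)
        return [u + v + shift]
    if abs(p) < eps:                        # triple root
        return [shift]
    r = 2 * math.sqrt(-p / 3)               # three real roots (trigonometric form)
    phi = math.acos(max(-1.0, min(1.0, 3 * q / (p * r))))
    return [r * math.cos((phi - 2 * math.pi * k) / 3) + shift for k in range(3)]


def in_quadratic_stroke(p, p0, p1, p2, width):
    """True if p lies within width/2 of the curve along a normal at some t in [0, 1]."""
    ax, ay = p0[0] - 2 * p1[0] + p2[0], p0[1] - 2 * p1[1] + p2[1]   # A
    bx, by = p1[0] - p0[0], p1[1] - p0[1]                           # B
    dx, dy = p0[0] - p[0], p0[1] - p[1]                             # D = P0 - p
    # (A t^2 + 2 B t + D) . (A t + B) = 0, expanded by powers of t.
    coeffs = (ax * ax + ay * ay,
              3 * (ax * bx + ay * by),
              2 * (bx * bx + by * by) + ax * dx + ay * dy,
              bx * dx + by * dy)
    half = width / 2
    for t in real_roots(*coeffs):
        if 0 <= t <= 1:
            qx = ax * t * t + 2 * bx * t + dx    # B(t) - p
            qy = ay * t * t + 2 * by * t + dy
            if qx * qx + qy * qy <= half * half:
                return True
    return False


curve = ((0, 0), (2, 2), (4, 0))                       # apex at B(0.5) = (2, 1)
near = in_quadratic_stroke((2, 1.2), *curve, 1.0)      # 0.2 from the curve
far = in_quadratic_stroke((2, 2), *curve, 1.0)         # 1.0 from the curve
behind = in_quadratic_stroke((-0.3, -0.3), *curve, 1.0)  # behind the start point
```

The last sample is within half a stroke width of the start point but its normal foot falls at t < 0, so it is rejected; that region would belong to a cap, which the slides say is drawn separately.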

Page 33: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Quadratic Stroking Hulls Visualized

Simple quadratic Bezier segment, moving control points

Drawn with stroking

Non-convex hull used for the stroking stencil step is visualized

Page 34: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Intricate Path’s Stroking Example

Zoomed stroking

Same zoom: Stencil hull geometry

Join style geometry

Page 35: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Excellent Geometric Fidelity for Stroking

Correct stroking is hard
– Lots of CPU implementations approximate stroking

GPU-accelerated stroking avoids such short-cuts

GPU has FLOPS to compute true stroke point containment

GPU-accelerated

OpenVG reference

Cairo

Qt

Stroking with tight end-point curve

Page 36: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Combined for Complex Scenes with Many Paths

Stencil Fill Geometry

Cover Fill Geometry

Filling-only Result

Stencil Stroke Geometry

Cover Stroke Geometry

Stroking-only Result

Complete Tiger

240 paths, 2,510 commands, 12,174 coordinates

Page 37: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

NV_path_rendering Compared to Alternatives

[Figure: “Alternative APIs rendering same content”. Y axis: frames per second, 0 to 2,000; X axis: window resolution in pixels, 100x100 to 1100x1100; series: Cairo, Qt, Skia Bitmap, Skia Ganesh FBO (16x), Skia Ganesh Aliased (1x), Direct2D GPU, Direct2D WARP.]

[Figure: “With Release 300 driver NV_path_rendering”. Same axes; series: 16x, 8x, 4x, 2x, 1x.]

Configuration: GPU: GeForce GTX 480 (GF100); CPU: Core i7 950 @ 3.07 GHz

Alternative approaches are all much slower

Page 38: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Detail on Alternatives

[Figure: “Same results, changed Y Axis”. Y axis: frames per second, 0 to 250; X axis: window resolution in pixels, 100x100 to 1100x1100; series: Cairo, Qt, Skia Bitmap, Skia Ganesh FBO (16x), Skia Ganesh Aliased (1x), Direct2D GPU, Direct2D WARP.]

[Figure: “Alternative APIs rendering same content”. Y axis: frames per second, 0 to 2,000; same X axis and series. Annotation: “Fast, but unacceptable quality”.]

Configuration: GPU: GeForce GTX 480 (GF100); CPU: Core i7 950 @ 3.07 GHz

Page 39: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Across a range of scenes… Release 300 GeForce GTX 480 Speedups over Alternatives

[Figure: twelve speedup charts, one per scene. Y axis: speedup, logarithmic, 0.10 to 1000.00; X axis: window resolution in pixels, 100x100 to 1100x1100. Scenes: tiger, Welsh_dragon, Celtic_round_dogs, butterfly, spikes, American_Samoa, cowboy, Buonaparte, Embrace_the_World, Yokozawa, Cougar, tiger_clipped_by_heart. Series: NVpr16/Cairo, NVpr16/SkiaBitmap, NVpr16/SkiaGanesh, NVpr16/Direct2D GPU, NVpr16/Direct2D WARP.]

Y axis is logarithmic; it shows how many times faster NV_path_rendering is than each competitor

Page 40: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Partial Solutions Not Enough

Path rendering has 30 years of heritage and history

Can’t do a 90% solution and expect Software to change

Trying to “mix” CPU and GPU methods doesn’t work

Expensive to move software—needs to be an unambiguous win

Must surpass CPU approaches on all fronts
– Performance
– Quality
– Functionality
– Conformance to standards
– More power efficient
– Enable new applications

John Warnock, Adobe founder

Inspiration: Perceptive Pixel

Page 41: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Dashing Content Examples

Dashing character outlines for quilted look

Frosting on cake is dashed elliptical arcs with round end caps for “beaded” look; flowers are also dashing

Same cake missing dashed stroking details

Artist made windows with dashed line segment

Technical diagrams and charts often employ dashing

All content shown is fully GPU rendered

Page 42: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

First-class, Resolution-independent Font Support

Fonts are a standard, first-class part of all path rendering systems
– Foreign to 3D graphics systems such as OpenGL and Direct3D, but natural for path rendering
– Because letter forms in fonts have outlines defined with paths
  – TrueType, PostScript, and OpenType fonts all use outlines to specify glyphs

NV_path_rendering makes font support easy
– Can specify a range of path objects with
  – A specified font
  – Sequence or range of Unicode character points
– No requirement for applications to use font API to load glyphs
  – You can also load glyphs “manually” from your own glyph outlines

Page 43: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Rendering Paths Clipped to Some Other Arbitrary Path

Example clipping the PostScript tiger to a heart constructed from two cubic Bezier curves

unclipped tiger

tiger with pink background clipped to heart
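The slides do not say how the stencil buffer is partitioned for clipping, so the following is only one plausible arrangement, sketched on the CPU: reserve one stencil bit for the clip path and let the draw path's stencil step pass only where that bit is set. `CLIP_BIT`, `FILL_MASK`, and `render_clipped` are hypothetical names, and both shapes are given as simple inside-predicates instead of real paths.

```python
CLIP_BIT = 0x80    # assumed: top stencil bit reserved for the clip path
FILL_MASK = 0x7F   # assumed: remaining bits hold the draw path's coverage


def render_clipped(width, height, clip_inside, draw_inside, color, background="."):
    """Render draw_inside clipped to clip_inside, one sample per pixel center."""
    stencil = [[0] * width for _ in range(height)]
    image = [[background] * width for _ in range(height)]
    pixels = [(r, c) for r in range(height) for c in range(width)]
    # Pass 1: stencil the clip path, touching only the clip bit.
    for r, c in pixels:
        if clip_inside(c + 0.5, r + 0.5):
            stencil[r][c] |= CLIP_BIT
    # Pass 2: stencil the draw path, but only where the clip bit is set.
    for r, c in pixels:
        if stencil[r][c] & CLIP_BIT and draw_inside(c + 0.5, r + 0.5):
            stencil[r][c] |= 1
    # Pass 3: cover; shade where fill bits are set, then clear only those bits.
    for r, c in pixels:
        if stencil[r][c] & FILL_MASK:
            image[r][c] = color
            stencil[r][c] &= CLIP_BIT
    return image, stencil


def in_circle(x, y):
    return (x - 4) ** 2 + (y - 4) ** 2 <= 9


def in_left_half(x, y):
    return x < 4


image, stencil = render_clipped(8, 8, in_circle, in_left_half, "T")
```

Since the cover pass clears only the fill bits, the clip bit survives, so many paths (the tiger's 240, say) can be drawn against the same stenciled clip without re-stenciling it.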

Page 44: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Complex Clipping Example

cowboy clip is the union of 1,366 paths

tiger is 240 paths

result of clipping tiger to the union of all the cowboy paths

Page 45: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

NV_path_rendering is more than just matching CPU vector graphics

3D and vector graphics mix

2D in perspective is free

Superior quality

Arbitrary programmable shader on paths: bump mapping

GPU

CPU Competitors

Page 46: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Mixing 3D Depth Buffering and Path Rendering

– PostScript tigers surrounding Utah teapot
– Plus overlaid TrueType font rendering
– No textures involved, no multi-pass

Page 47: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Live demo!

Very fast

Teapots + tigers in same 3D scene

Zoom on tigers

All the detail is there

Solid or wireframe teapots

Page 48: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Handling Uncommon Path Rendering Functionality: Projection

Projection “just works”
– Because GPU does everything with perspective-correct interpolation

Page 49: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Example of Bump Mapping on Path Rendered Text

Phrase “Brick wall!” is path rendered and bump mapped with a Cg fragment shader

light source position

Page 50: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Handling Common Path Rendering Functionality: Filtering

GPUs are highly efficient at image filtering
– Fast texture mapping
– Mipmapping
– Anisotropic filtering
– Wrap modes

CPUs aren't really

GPU

Qt

Cairo

Moiré artifacts

Page 51: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Anti-aliasing Discussion

Good anti-aliasing is a big deal for path rendering
– Particularly true for font rendering of small point sizes
– Features of glyphs are often on the scale of a pixel or less

NV_path_rendering uses multiple stencil samples per pixel for reasonable antialiasing
– Otherwise, image quality is poor
– 4 samples/pixel bare minimum
– 8 or 16 samples/pixel is pretty sufficient
  – But 16 requires expensive 2x2 supersampling of 4x multisampling
  – 16x is quite memory intensive

Alternative: quality vs. performance tradeoff
– Fast enough to render multiple passes to improve quality
– Approaches
  – Accumulation buffer
  – Alpha accumulation
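A small sketch shows why the sample count matters for features narrower than a pixel. It assumes a regular grid of sample positions inside each pixel, which is a simplification (real GPU multisample patterns differ), and `pixel_coverage` is a hypothetical helper that stands in for counting stencil samples that pass.

```python
def pixel_coverage(inside, px, py, grid):
    """Fraction of the grid*grid sample points in pixel (px, py) that are covered."""
    hits = 0
    for j in range(grid):
        for i in range(grid):
            x = px + (i + 0.5) / grid
            y = py + (j + 0.5) / grid
            hits += inside(x, y)   # True counts as 1
    return hits / (grid * grid)


def thin_stem(x, y):
    """A glyph-like vertical stem 0.3 pixels wide; true coverage of pixel (2, 0) is 0.3."""
    return 2.0 <= x < 2.3


one_sample = pixel_coverage(thin_stem, 2, 0, 1)       # 1 sample/pixel
four_samples = pixel_coverage(thin_stem, 2, 0, 2)     # 4 samples/pixel
sixteen_samples = pixel_coverage(thin_stem, 2, 0, 4)  # 16 samples/pixel
```

With one sample the stem vanishes entirely (coverage 0.0); with 4 samples it is overestimated at 0.5; with 16 samples the estimate of 0.25 is close to the true 0.3.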

Page 52: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Real Flash Scene

conflation artifacts abound, rendered by Skia

same scene, GPU-rendered without conflation

conflation is aliasing & edge coverage percents are unpredictable in general; means conflated pixels flicker when animated slowly

Page 53: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Improved Color Space: sRGB Path Rendering

Modern GPUs have native support for perceptually-correct
– sRGB framebuffer blending
– sRGB texture filtering
– No reason to tolerate uncorrected linear RGB color artifacts!
– More intuitive for artists to control

Negligible expense for GPU to perform sRGB-correct rendering

However, quite expensive for software path renderers to perform sRGB rendering
– Not done in practice

Radial color gradient example moving from saturated red to blue

linear RGB: transition between saturated red and saturated blue has dark purple region

sRGB: perceptually smooth transition from saturated red to saturated blue
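The dark purple band can be reproduced numerically. The sketch below uses the standard sRGB transfer functions and compares blending the stored values directly (the uncorrected case the slide labels "linear RGB") with decoding to linear light, blending, and re-encoding. The function names are hypothetical and the code illustrates the arithmetic only, not how a GPU implements it.

```python
def srgb_to_linear(c):
    """Decode one sRGB-encoded channel in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode one linear-light channel in [0, 1] to sRGB."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def mix_stored(a, b, t):
    """Blend stored values directly, ignoring the sRGB transfer curve."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))


def mix_srgb_correct(a, b, t):
    """Decode to linear light, blend, then re-encode."""
    lin_a = tuple(srgb_to_linear(c) for c in a)
    lin_b = tuple(srgb_to_linear(c) for c in b)
    return tuple(linear_to_srgb(c) for c in mix_stored(lin_a, lin_b, t))


red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
uncorrected_mid = mix_stored(red, blue, 0.5)      # (0.5, 0.0, 0.5)
corrected_mid = mix_srgb_correct(red, blue, 0.5)  # red and blue channels near 0.735
```

Halfway through the gradient the uncorrected blend stores 0.5 in the red and blue channels, while the sRGB-correct blend stores about 0.735, which is visibly brighter and removes the dark band.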

Page 54: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Trying Out NV_path_rendering

Operating system support
– Windows 2000, XP, Vista, and 7, plus Linux, FreeBSD, and Solaris
– Unfortunately no Mac support

GPU support
– GeForce 8 and up (easy rule: all CUDA-capable GPUs)
– Most efficient on Fermi and Kepler GPUs
– Current performance can be expected to improve

Shipping since NVIDIA’s Release 275 drivers
– Available since summer 2011

New Release 300+ drivers have remarkable NV_path_rendering performance

Try it, you’ll like it

There’s an SDK freely available with example code! https://developer.nvidia.com/nv-path-rendering

Page 55: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering

Future Work

Using NV_path_rendering in actual web and 2D applications

Standardizing the programming interface

Moving these algorithms to mobile devices

Path rendering test bed on Nexus 7

Page 56: SIGGRAPH Asia 2012: GPU-accelerated Path Rendering