OpenGL 4.3 Overview SIGGRAPH - Khronos Group

BOF Siggraph 2012 Barthold Lichtenbelt OpenGL ARB chair


© Copyright Khronos Group, 2010 - Page 2

OpenGL BOF Agenda

• Latest news and features in OpenGL

- Barthold Lichtenbelt, NVIDIA

• Cool things you never dreamed you could do with OpenGL?

- Bill Licea-Kane, AMD

• Left 4 Dead 2 Linux: From 6 to 300 FPS in OpenGL

- Rich Geldreich, Valve

• 20 years of OpenGL

- Kurt Akeley, co-founder of SGI and the OpenGL API

• Followed by Party!

• Trivia throughout


Sponsors


• Rob Barris

• Tadamasa Teranishi

• Tomohiro Matsumoto

• Jesse Barker

• Lingjun Chen

• Glenn Fredericks

• Masahito Hirose

• John Kessenich

• Arzhange Safdarzadeh

• Tom Olson

• Lawrence McDonough

• Mark Kilgard

• Takeshi Haga

• Takeshi Hirai

• Ian Romanick

• Laurent Billy

• Benj Lipchak

• Sergey Kosarevsky

• Christophe Riccio

• Dominic Agoro-Ombaka

• Vicki and Dave Shreiner

• Kentaro Suzuki a.k.a. “hole”

• Kentaro Oku "kioku/System K"

• The English Tiddlywinks Association

• Several anonymous sponsors

Cass Everitt


OpenGL is 20 years old today!


OpenGL 20th Birthday - Then and Now

                                  1992                    2012 Mobile             2012 PC
                                  Reality Engine          NVIDIA Tegra 3          NVIDIA GeForce GTX 680
                                  (8 Geometry Engines,    (Nexus 7 Android        (Kepler GK104)
                                  4 Raster Manager        tablet)
                                  boards)

Triangles / sec (millions)        1                       103 (x103)              1800 (x1800)
Pixel fragments / sec (millions)  240                     1040 (x4.3)             14,400 (x60)
GigaFLOPS                         0.64                    15.6 (x25)              3090 (x4830)
Power                             1.5 kW                  <5 W

Multipliers are relative to the 1992 system.

Images: Ideas in Motion (SGI); Rage (id Software)


OpenGL Latest Updates

• Games

- Valve's Left 4 Dead 2 on Linux uses OpenGL (7/2012) - http://www.extremetech.com/gaming/133824-valve-opengl-is-faster-than-directx-even-on-windows

- Doom3 source code released (11/2011)

• Books

- OpenGL Insights released (8/2012)

- OpenGL 4.0 Shading Language Cookbook released (1/2012)

- Graphics Shaders: Theory and Practice, second edition released (11/2011)

- Learning Modern 3D Graphics Programming (2012) - http://www.arcsynthesis.org/gltut/


OpenGL Ecosystem News

• Tools updated to support OpenGL 4.2

- GLView (2/2012)

- GLEW (7/2012) and GL3W

- GLIntercept (11/2011)

- http://www.g-truc.net/ (8/2011)

• New projects

- GLCapsViewer - http://delphigl.de/glcapsviewer/listreports.php (8/2011)

- Regal for OpenGL - https://github.com/p3/regal (2012)

- Proland - http://proland.inrialpes.fr/index.html (5/2012)

• New Tutorials - http://www.opengl-tutorial.org/


Announcing 4.3


Accelerating OpenGL Innovation

Bringing state-of-the-art functionality to cross-platform graphics

[Timeline figure, 2004 to 2012]

• OpenGL releases shown: 2.0, 2.1, 3.0, 3.1, 3.2, 3.3/4.0, 4.1, 4.2, 4.3

• DirectX releases shown: 9.0c, 10.0, 10.1, 11, 11.1


What is new in OpenGL 4.3?

• texture functionality

- ARB_texture_view

- ARB_internalformat_query2

- ARB_copy_image

- ARB_texture_buffer_range

- ARB_stencil_texturing

- ARB_texture_storage_multisample

• buffer functionality

- ARB_shader_storage_buffer_object

- ARB_invalidate_subdata

- ARB_clear_buffer_object

- ARB_vertex_attrib_binding

- ARB_robust_buffer_access_behavior


• pipeline functionality

- ARB_compute_shader

- ARB_multi_draw_indirect

- KHR_debug

- ARB_program_interface_query

- ARB_ES3_compatibility

• extensions

- KHR_texture_compression_astc_ldr

- ARB_robustness_isolation

• GLSL 4.3 functionality

- ARB_shader_image_size

- ARB_explicit_uniform_location

- ARB_texture_query_levels

- ARB_arrays_of_arrays

- ARB_fragment_layer_viewport


OpenGL 4.3 Pipelines

[Pipeline diagram. Arrows indicate data flow. Legend: t = texture binding, b = buffer binding; each stage is marked as either a programmable stage or a fixed function stage.]

• Graphics pipeline (from application): Vertex Puller, Vertex Shader, Tessellation Control Shader, Tessellation Primitive Gen., Tessellation Eval. Shader, Geometry Shader, Transform Feedback, Rasterization, Fragment Shader, Per-Fragment Operations, Framebuffer

• Pixel path (from application): Pixel Assembly, Pixel Operations, Pixel Pack

• Compute pipeline (from application): Dispatch, Compute Shader

• Buffer bindings (b): Element Array Buffer, Draw Indirect Buffer, Vertex Buffer Object, Transform Feedback Buffer, Dispatch Indirect Buffer, Pixel Unpack Buffer, Pixel Pack Buffer, Atomic Counter, Shader Storage, Uniform Block

• Texture bindings (t): Texture Image

• Texture or buffer bindings (t/b): Image Load / Store, Texture Fetch


Compute Shaders

• Execute algorithmically general purpose GLSL shaders

- Operate on buffers, images and textures

• Process graphics data in the context of the graphics pipeline

- Easier than interoperating with a compute API IF processing ‘close to the pixel’

• Complementary to OpenCL

- Not a full heterogeneous (CPU/GPU) programming framework using full ANSI C

• Standard part of all OpenGL 4.3 implementations

- Matches DirectX 11 functionality

Example applications: image processing, AI simulation, ray tracing, wave simulation, global illumination


Compute Shaders for Physics Processing

• Credit: Dr. Mike Bailey, Oregon State University… also…

• Notes and sample code on OpenGL Compute Shader

- http://web.engr.oregonstate.edu/~mjb/sig12/


Compute programming model

Dispatch: a 3 x 2 grid of work groups

Work Group (0, 1)   Work Group (1, 1)   Work Group (2, 1)

Work Group (0, 0)   Work Group (1, 0)   Work Group (2, 0)

Work Group (1, 1): a 4 x 2 grid of invocations

Invocation (0,1)   Invocation (1,1)   Invocation (2,1)   Invocation (3,1)

Invocation (0,0)   Invocation (1,0)   Invocation (2,0)   Invocation (3,0)

in uvec3 gl_NumWorkGroups; // Number of workgroups dispatched

const uvec3 gl_WorkGroupSize; // Size of each work group for current shader

in uvec3 gl_WorkGroupID; // Index of current work group being executed

in uvec3 gl_LocalInvocationID; // index of current invocation in a work group

in uvec3 gl_GlobalInvocationID; // Unique ID across all work groups and invocations

gl_WorkGroupSize = (4,2,1)

gl_WorkGroupID = (1,1,0)

gl_LocalInvocationID = (2,1,0)

gl_GlobalInvocationID = (6,3,0)
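The relationship between the built-in IDs on this slide can be checked on the CPU. The sketch below is plain C rather than GLSL; the `uvec3` struct and the function name `global_invocation_id` are illustrative stand-ins, not part of any OpenGL header. It assumes a work group size of 1 along z, since every axis of a work group holds at least one invocation.

```c
#include <assert.h>

/* Three unsigned components, standing in for GLSL's uvec3. */
typedef struct { unsigned x, y, z; } uvec3;

/* Mirrors the GLSL relationship
 *   gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID
 * computed per component. */
static uvec3 global_invocation_id(uvec3 group_id, uvec3 group_size, uvec3 local_id)
{
    /* A local ID always lies inside its work group. */
    assert(local_id.x < group_size.x);
    assert(local_id.y < group_size.y);
    assert(local_id.z < group_size.z);

    uvec3 global = {
        group_id.x * group_size.x + local_id.x,
        group_id.y * group_size.y + local_id.y,
        group_id.z * group_size.z + local_id.z
    };
    return global;
}
```

With work group (1,1,0), size (4,2,1) and local ID (2,1,0), the function yields (6,3,0), matching the values on the slide.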


Compute memory hierarchy

[Diagram: memory visible at each level of a dispatch]

• Dispatch (all work groups): Shader Storage Buffer Object (SSBO), Image, Uniform Buffer Object (UBO), Texture Buffer Object (TexBO), Texture

• Work Group: Shared Variables

• Invocation: Local Variables

void memoryBarrier();

void memoryBarrierAtomicCounter();

void memoryBarrierBuffer();

void memoryBarrierImage();

void memoryBarrierShared(); // Only for compute shaders

void groupMemoryBarrier(); // Only for compute shaders

Use memory barriers to order reads/writes accessible to other invocations

Use void barrier() to synchronize invocations in a work group
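To make the role of the barriers concrete, here is a CPU-side simulation in C of a work group summing its values through a shared array. It is only a sketch: the inner loops stand in for invocations that would run in parallel on a GPU, `GROUP_SIZE` and `group_sum` are hypothetical names, and the comments mark where a GLSL compute shader would typically place `memoryBarrierShared()` followed by `barrier()`.

```c
#include <assert.h>
#include <stddef.h>

#define GROUP_SIZE 8 /* invocations per work group; a power of two in this sketch */

/* Sums GROUP_SIZE values the way one work group might, using an array that
 * stands in for a GLSL "shared" variable. Each inner-loop iteration plays
 * the role of one invocation. */
static float group_sum(const float *input)
{
    float shared_vals[GROUP_SIZE];
    assert(input != NULL);

    /* Every invocation stores its own element into shared memory. */
    for (size_t local = 0; local < GROUP_SIZE; ++local)
        shared_vals[local] = input[local];
    /* Barrier point: all stores must be visible before any invocation reads. */

    for (size_t stride = GROUP_SIZE / 2; stride > 0; stride /= 2) {
        /* Only the first "stride" invocations are active in this step. */
        for (size_t local = 0; local < stride; ++local)
            shared_vals[local] += shared_vals[local + stride];
        /* Barrier point: this step must finish before the next one starts. */
    }
    return shared_vals[0];
}
```

Without synchronization at the marked points, a real invocation could read a partner's slot before that partner has written it; the sequential loops here hide that hazard, which is why the barrier locations are called out explicitly.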


Texture Views

• “View” a texture's data store in multiple ways

- Re-interpret the format/type

- Clamp mip-map level range

- Clamp array slice range

• No new object types introduced

• Conceptual split of a texture object

- Data store holding texels

- View state describing which part of data store to use

- View state describing how to interpret elements in data store

- An embedded sampler object

- Texture parameters

• Multiple textures share same data store

- Data store ref counted


Texture Views

[Diagram: how a texture object feeds the texture lookup hardware]

• Texture Object

- Texture Parameters

- Sampler Parameters (mutable)

- Texture View Parameters (immutable)

- Texel Data (mutable, ref counted): mipmap chain created with TexStorage*()

• Sampler Object

- Sampler Parameters (mutable)

• Texture Lookup Hardware

- Reads the texture levels selected by the view

- Uses the sampler object if bound

- Output goes to the rest of the pipeline


Creation of New Texture View

[Diagram: the original texture object and a new texture object share one texel data store]

• Texture Object (created with TexStorage*())

- Texture Parameters

- Sampler Parameters (mutable)

- Texture View Parameters (immutable)

• New Texture Object (created with TextureView())

- Texture Parameters (reset to default)

- Sampler Parameters (reset to default)

- Texture View Parameters (immutable)

• Texel Data (mutable, ref counted)

- One mipmap chain; each texture object sees the texture levels selected by its view

• Texture Lookup Hardware (one path per texture object)

- Uses a sampler object (mutable Sampler Parameters) if bound

- Output goes to the rest of the pipeline

New texture state set with TextureView()

enum internalformat // base internal format

enum target // texture target

uint minlevel // first level of mipmap

uint numlevels // number of mipmap levels

uint minlayer // 1st layer of array texture

uint numlayers // number of layers in array
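The way a view's minlevel/numlevels (or minlayer/numlayers) pair selects part of the parent data store can be sketched in C. This is an illustration only: `ViewRange` and `clamp_view_range` are hypothetical names, and the behavior shown (a request reaching past the end of the parent is clamped, a start index outside the parent is rejected) follows the ARB_texture_view extension text as far as it is recalled here. Treating an empty range as invalid is a choice made by this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Range of mipmap levels (or array layers) a view exposes. */
typedef struct { unsigned min, num; } ViewRange;

/* parent_count: levels (or layers) in the original texture.
 * min, num:     requested first index and count for the view.
 * Returns 1 and fills *out on success, 0 if the request is invalid. */
static int clamp_view_range(unsigned parent_count, unsigned min, unsigned num,
                            ViewRange *out)
{
    assert(out != NULL);
    if (min >= parent_count || num == 0)
        return 0; /* start lies outside the parent, or nothing was requested */

    unsigned available = parent_count - min;
    out->min = min;
    out->num = num < available ? num : available; /* clamp to parent extent */
    return 1;
}
```

For a parent with 10 mipmap levels, a request for 5 levels starting at level 8 would produce a view of levels 8 and 9 only.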


KHR_debug

• Builds on ARB_debug_output

• Callback with debug information

- Or write to log

• Messages grouped by {source, type, ID, severity}

- Source: GL API, GLSL shader, application, third-party, debugger

- Type: Error, performance, undefined behavior, portability

- ID: Unique identifier for each message

- Severity: High, medium, low

• Label objects

- Human readable text

• Annotate command stream

- Markers: Identify some event in your code

- Groups: Encapsulate command stream and control debug verbosity


MultiDraw*Indirect()

• MultiDraw{Arrays/Elements}Indirect

- Combines MultiDraw with DrawIndirect

• MultiDraw

- MultiDraw functions can help reduce validation overhead, especially for many "low complexity" draw calls, while keeping each "sub-object" addressable

• DrawIndirect

- Store draw command inputs in host or GPU buffers

• Provides efficient system for GPU to generate its own work

- Use XFB or SSBO/compute to write the draw command buffers

- For example for culling (setting count to zero), LOD picking (changing count/firstIndex)...

struct DEICommand {
    uint count;
    uint instanceCount;
    uint firstIndex;
    int baseVertex;
    uint baseInstance;
};

[Diagram: the host, a graphics shader, or a compute shader writes the MultiDraw indirect buffer, which holds an array of DEICommand entries. CAD example: individual model features (bevels…)]
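The culling idea mentioned above (setting count to zero) can be sketched on the CPU in C. In practice a compute shader would write these commands into a GPU buffer; the loop below is only a stand-in for that work. `cull_commands` and the `visible` flag array are hypothetical, and the struct mirrors the DEICommand layout from the slide using fixed-width types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the slide's DEICommand: five tightly packed 32-bit
 * values, 20 bytes per command. */
typedef struct {
    uint32_t count;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  baseVertex;
    uint32_t baseInstance;
} DEICommand;

/* Zeroes "count" for every command whose visibility flag is 0, so that
 * sub-draw renders nothing. Returns how many commands still draw. */
static size_t cull_commands(DEICommand *cmds, const unsigned char *visible, size_t n)
{
    assert(n == 0 || (cmds != NULL && visible != NULL));
    size_t live = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!visible[i])
            cmds[i].count = 0; /* culled: the draw stays in the buffer but is empty */
        if (cmds[i].count != 0)
            ++live;
    }
    return live;
}
```

Leaving culled entries in place, rather than compacting the array, keeps every sub-object at a fixed index, which fits the slide's point about keeping each sub-object addressable.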


Shader Storage Buffer Objects (SSBO)

• Read/write and atomic operations on variables stored in a buffer object

- Think writeable UBOs

• New binding point SHADER_STORAGE_BUFFER

- Queriable limits on number of storage blocks per shader type - MAX_<SHADER>_STORAGE_BLOCKS

• Support large buffers

- Implementations must support a block size of at least 16 MB

• New std430 memory layout

- Pack scalar arrays more efficiently

• Can use C-style code in a shader to read/write

- Example on next slide

• Especially useful for compute shaders

- No built-in outputs

- Data transfer has to be through buffers or images


Shader Storage Buffer Objects (SSBO)

// Shader side (GLSL)
struct MyVertex {
    vec2 tex[2];      // tightly packed array in std430
    vec3 pos;
    int materialIdx;
};

// GLSL requires a block name; "VertexBlock" is a placeholder, not from the slide
layout(std430, binding = 4) buffer VertexBlock {
    MyVertex Vertices[]; // unsized array allowed at end of buffer
};

... // compute data to store in Vertices[]
Vertices[i].materialIdx = idx; // directly write to buffer content

// Application side (C)
glGenBuffers(1, &posSSBO);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, posSSBO);
glBufferData(GL_SHADER_STORAGE_BUFFER, .... );
glUseProgram(MyShaderProgram);
// Index 4 matches "binding = 4" in the shader's layout qualifier
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 4, posSSBO);
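The "tightly packed array" comment is the key difference between std430 and the older std140 layout. The C sketch below works out the byte offsets of the MyVertex members under both layouts. It hard-codes the rules relevant to this one struct (vec2 is 8 bytes, vec3 is 12 bytes aligned to 16, int is 4 bytes, and std140 rounds array element strides up to 16); `MyVertexLayout`, `align_up` and `my_vertex_layout` are illustrative names, and this is not a general layout calculator.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets of each MyVertex member, plus the stride between
 * consecutive elements of Vertices[]. */
typedef struct { size_t tex, pos, materialIdx, stride; } MyVertexLayout;

/* Rounds offset up to the next multiple of alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment > 0);
    return (offset + alignment - 1) / alignment * alignment;
}

/* use_std140 != 0: std140 rules; use_std140 == 0: std430 rules. */
static MyVertexLayout my_vertex_layout(int use_std140)
{
    MyVertexLayout layout;
    /* std140 pads every array element to 16 bytes; std430 keeps vec2 at 8. */
    size_t vec2_stride = use_std140 ? 16 : 8;
    size_t offset = 0;

    layout.tex = offset;                 /* vec2 tex[2] */
    offset += 2 * vec2_stride;

    offset = align_up(offset, 16);       /* vec3 aligns like a vec4 */
    layout.pos = offset;
    offset += 12;

    offset = align_up(offset, 4);        /* int fits right after the vec3 */
    layout.materialIdx = offset;
    offset += 4;

    layout.stride = align_up(offset, 16); /* struct aligns to its largest member */
    return layout;
}
```

Under these rules each MyVertex takes 32 bytes in std430 and 48 bytes in std140, so the std430 buffer is a third smaller for the same vertex count.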


New KHR and ARB extensions

• KHR_texture_compression_astc_ldr

- Adaptive Scalable Texture Compression (ASTC)

- 1 to 4 components, low bit rates from under 1 bit/pixel up to 8 bits/pixel

• ARB_robustness_isolation

- If application causes GPU reset, no other application will be affected

• All 4.3 functionality also available as ARB extensions

[Image comparison: original at 24 bpp; ASTC compression at 8 bpp, 3.56 bpp, and 2 bpp]
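The bit rates quoted for ASTC follow from its block structure: every compressed block occupies 128 bits, and the block footprint in texels is selectable. A small C sketch of that arithmetic is shown below; `astc_bits_per_pixel` is an illustrative helper, and the pairing of the slide's rates with the 4x4, 6x6 and 8x8 footprints is inferred from the arithmetic rather than stated on the slide.

```c
#include <assert.h>

/* Every ASTC block is 128 bits, regardless of its footprint. */
#define ASTC_BLOCK_BITS 128.0

/* Bits per pixel for a 2D block footprint of block_width x block_height texels. */
static double astc_bits_per_pixel(unsigned block_width, unsigned block_height)
{
    assert(block_width > 0 && block_height > 0);
    return ASTC_BLOCK_BITS / (double)(block_width * block_height);
}
```

A 4x4 footprint gives 8 bpp, 6x6 gives about 3.56 bpp, 8x8 gives 2 bpp, and 12x12 falls below 1 bpp, which covers the range named in the bullet above.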


OpenGL 4.3 reference pages

Huge thanks to Graham Sellers!!!


Specification re-ordering

• Shader and buffer centric

- Fixed function interfaces described as alternates

• Introduces concepts and objects at high level

- Before being used later in document

• Error summaries for commands

• Removed duplication of language

• Consistent uses of phrases and terminology

• Aligned section numbering between Core and Compatibility profiles


Conclusion

• OpenGL 4.3 adds major new functionality

- Compute shaders

- Advanced buffer management

- Advanced texture management

- Advanced GPU work creation

• OpenGL usage is rising sharply

- WebGL

- Mobile platforms

- Linux

• OpenGL is 20 years old today!

- Awesome achievement


Rest of the evening

• Get Drink

• Kurt Akeley Presentation

• Toast

• Party

- LIVE DEMO: Viewperf 12

- Play with the O2!

Get your drink and COME BACK to toast to OpenGL


OpenGL 4.3 details


OpenGL 4.3 new texture functionality

• ARB_texture_view

- Provide different ways to interpret texture data without duplicating the texture

- Match DX11 functionality

• ARB_internalformat_query2

- Find out actual supported limits for most texture parameters

• ARB_copy_image

- Direct copy of pixels between textures and render buffers

• ARB_texture_buffer_range

- Create a texture buffer object corresponding to a subrange of a buffer’s data store

• ARB_stencil_texturing

- Read stencil bits of a packed depth-stencil texture

• ARB_texture_storage_multisample

- Immutable storage objects for multisampled textures


OpenGL 4.3 new buffer functionality

• ARB_shader_storage_buffer_object

- Enables all shader stages to read and write to very large buffers

- structs, arrays, scalars, etc

• ARB_invalidate_subdata

- Invalidate all or some of the contents of textures and buffers

• ARB_clear_buffer_object

- Clear a buffer object with a constant value

• ARB_vertex_attrib_binding

- Separate vertex attribute state from the data stores of each array

• ARB_robust_buffer_access_behavior

- Shader reads and writes through an object may only access data owned by the application

- Applies to out of bounds accesses


OpenGL 4.3 new pipeline functionality

• ARB_compute_shader

- Introduces new shader stage

- Enables advanced processing algorithms that harness the parallelism of GPUs

• ARB_multi_draw_indirect

- Draw many GPU generated objects with one call

• KHR_debug

- Enhanced debug context support

• ARB_program_interface_query

- Generic API to enumerate active variables and interface blocks for each stage

- Enumerate active variables in interfaces between separable program objects

• ARB_ES3_compatibility

- Adds OpenGL ES 3.0 features not previously present in OpenGL

- Brings EAC and ETC2 texture compression formats


GLSL 4.3 new functionality

• ARB_shader_image_size

- Query size of an image in a shader

• ARB_explicit_uniform_location

- Set location of a default-block uniform in the shader

• ARB_texture_query_levels

- Query number of mipmap levels accessible through a sampler uniform

• ARB_arrays_of_arrays

- Allows multi-dimensional arrays in GLSL. float f[4][3];

• ARB_fragment_layer_viewport

- gl_Layer and gl_ViewportIndex now available to fragment shader


Texture object state

Texture View Parameters (immutable)

• <target>

• TEXTURE_INTERNAL_FORMAT

• TEXTURE_VIEW_{MIN,NUM}_LEVEL

• TEXTURE_VIEW_{MIN,NUM}_LAYER

• TEXTURE_IMMUTABLE_LEVELS

• TEXTURE_SHARED_SIZE

• TEXTURE_{RED,GREEN,BLUE,ALPHA,DEPTH,STENCIL}_SIZE

• TEXTURE_{RED,GREEN,BLUE,ALPHA,DEPTH}_TYPE

• IMAGE_FORMAT_COMPATIBILITY_TYPE

Texture Parameters

• TEXTURE_WIDTH

• TEXTURE_HEIGHT

• TEXTURE_DEPTH

• TEXTURE_SAMPLES

• TEXTURE_FIXED_SAMPLE_LOCATIONS

• TEXTURE_COMPRESSED

• TEXTURE_COMPRESSED_IMAGE_SIZE

• TEXTURE_IMMUTABLE_FORMAT

• TEXTURE_SWIZZLE_{R,G,B,A}

• TEXTURE_MAX_LEVEL

• TEXTURE_BASE_LEVEL

• DEPTH_STENCIL_TEXTURE_MODE

Sampler Parameters (mutable)

• TEXTURE_BORDER_COLOR

• TEXTURE_COMPARE_{FUNC,MODE}

• TEXTURE_LOD_BIAS

• TEXTURE_{MAX,MIN}_LOD

• TEXTURE_{MAG,MIN}_FILTER

• TEXTURE_WRAP_{S,T,R}

State is immutable, unless listed in italics