OpenGL BOF, SIGGRAPH 2012
Barthold Lichtenbelt
OpenGL ARB chair
© Copyright Khronos Group, 2010 - Page 2
OpenGL BOF Agenda
• Latest news and features in OpenGL
- Barthold Lichtenbelt, NVIDIA
• Cool things you never dreamed you could do with OpenGL?
- Bill Licea-Kane, AMD
• Left 4 Dead 2 Linux: From 6 to 300 FPS in OpenGL
- Rich Geldreich, Valve
• 20 years of OpenGL
- Kurt Akeley, co-founder of SGI and the OpenGL API
• Followed by Party!
• Trivia throughout
Sponsors
• Rob Barris
• Tadamasa Teranishi
• Tomohiro Matsumoto
• Jesse Barker
• Lingjun Chen
• Glenn Fredericks
• Masahito Hirose
• John Kessenich
• Arzhange Safdarzadeh
• Tom Olson
• Lawrence McDonough
• Mark Kilgard
• Takeshi Haga
• Takeshi Hirai
• Ian Romanick
• Laurent Billy
• Benj Lipchak
• Sergey Kosarevsky
• Christophe Riccio
• Dominic Agoro-Ombaka
• Vicki and Dave Shreiner
• Kentaro Suzuki a.k.a. “hole”
• Kentaro Oku “kioku/System K”
• The English Tiddlywinks Association
• Several anonymous sponsors
• Cass Everitt
OpenGL is 20 years old today!
OpenGL 20th Birthday - Then and Now
                                  1992                       2012 Mobile                2012 PC
System                            Reality Engine             NVIDIA Tegra 3             NVIDIA GeForce GTX 680
                                  (8 Geometry Engines,       (Nexus 7 Android tablet)   (Kepler GK104)
                                  4 Raster Manager boards)
Triangles / sec (millions)        1                          103 (x103)                 1800 (x1800)
Pixel Fragments / sec (millions)  240                        1040 (x4.3)                14,400 (x60)
GigaFLOPS                         0.64                       15.6 (x25)                 3090 (x4830)
Power                             1.5 kW                     <5 W
Images: Ideas in Motion (SGI); Rage (id Software)
OpenGL Latest Updates
• Games
- Valve’s Left 4 Dead 2 on Linux uses OpenGL (7/2012) - http://www.extremetech.com/gaming/133824-valve-opengl-is-faster-than-directx-even-on-windows
- Doom3 source code released (11/2011)
• Books
- OpenGL Insights released (8/2012)
- OpenGL 4.0 Shading Language Cookbook released (1/2012)
- Graphics Shaders: Theory and Practice, second edition released (11/2011)
- Learning Modern 3D Graphics Programming (2012) - http://www.arcsynthesis.org/gltut/
OpenGL Ecosystem News
• Tools updated to support OpenGL 4.2
- GLView (2/2012)
- GLEW (7/2012) and GL3W
- GLIntercept (11/2011)
- http://www.g-truc.net/ (8/2011)
• New projects
- GLCapsViewer - http://delphigl.de/glcapsviewer/listreports.php (8/2011)
- Regal for OpenGL - https://github.com/p3/regal (2012)
- Proland - http://proland.inrialpes.fr/index.html (5/2012)
• New Tutorials - http://www.opengl-tutorial.org/
Announcing 4.3
Accelerating OpenGL Innovation
Bringing state-of-the-art functionality to cross-platform graphics
[Timeline, 2004 to 2012]
• OpenGL releases: 2.0, 2.1, 3.0, 3.1, 3.2, 3.3/4.0, 4.1, 4.2, 4.3
• DirectX releases: 9.0c, 10.0, 10.1, 11, 11.1
What is new in OpenGL 4.3?
• texture functionality
- ARB_texture_view
- ARB_internalformat_query2
- ARB_copy_image
- ARB_texture_buffer_range
- ARB_stencil_texturing
- ARB_texture_storage_multisample
• buffer functionality
- ARB_shader_storage_buffer_object
- ARB_invalidate_subdata
- ARB_clear_buffer_object
- ARB_vertex_attrib_binding
- ARB_robust_buffer_access_behavior
What is new in OpenGL 4.3?
• pipeline functionality
- ARB_compute_shader
- ARB_multi_draw_indirect
- KHR_debug
- ARB_program_interface_query
- ARB_ES3_compatibility
• extensions
- KHR_texture_compression_astc_ldr
- ARB_robustness_isolation
• GLSL 4.3 functionality
- ARB_shader_image_size
- ARB_explicit_uniform_location
- ARB_texture_query_levels
- ARB_arrays_of_arrays
- ARB_fragment_layer_viewport
OpenGL 4.3 Pipelines
[Diagram: data flow through the OpenGL 4.3 graphics and compute pipelines, each fed from the application]
• Graphics pipeline: Vertex Puller, Vertex Shader, Tessellation Control Shader, Tessellation Primitive Gen., Tessellation Eval. Shader, Geometry Shader, Transform Feedback, Rasterization, Fragment Shader, Per-Fragment Operations, Framebuffer
• Pixel path: Pixel Assembly, Pixel Operations, Pixel Pack
• Compute pipeline: Dispatch, Compute Shader
• Resources (t = texture binding, b = buffer binding): Element Array Buffer b, Draw Indirect Buffer b, Vertex Buffer Object b, Transform Feedback Buffer b, Dispatch Indirect Buffer b, Pixel Unpack Buffer b, Pixel Pack Buffer b, Texture Image t, Texture Fetch t/b, Image Load / Store t/b, Atomic Counter b, Shader Storage b, Uniform Block b
• Legend: programmable stages and fixed function stages are drawn differently; arrows indicate data flow
Compute Shaders
• Execute algorithmically general-purpose GLSL shaders
- Operate on buffers, images and textures
• Process graphics data in the context of the graphics pipeline
- Easier than interoperating with a compute API IF processing ‘close to the pixel’
• Complementary to OpenCL
- Not a full heterogeneous (CPU/GPU) programming framework using full ANSI C
• Standard part of all OpenGL 4.3 implementations
- Matches DirectX 11 functionality
[Images: image processing, AI simulation, ray tracing, wave simulation, global illumination]
Compute Shaders for Physics Processing
• Credit: Dr. Mike Bailey, Oregon State University… also…
• Notes and sample code on OpenGL Compute Shader
- http://web.engr.oregonstate.edu/~mjb/sig12/
Compute programming model
[Diagram: a dispatch of 3 x 2 work groups, Work Group (0, 0) to Work Group (2, 1); Work Group (1, 1) is expanded into 4 x 2 invocations, Invocation (0,0) to Invocation (3,1)]
in uvec3 gl_NumWorkGroups; // Number of workgroups dispatched
const uvec3 gl_WorkGroupSize; // Size of each work group for current shader
in uvec3 gl_WorkGroupID; // Index of current work group being executed
in uvec3 gl_LocalInvocationID; // index of current invocation in a work group
in uvec3 gl_GlobalInvocationID; // Unique ID across all work groups and invocations
gl_WorkGroupSize = (4,2,1)
gl_WorkGroupID = (1,1,0)
gl_LocalInvocationID = (2,1,0)
gl_GlobalInvocationID = (6,3,0)
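The example values above follow from the relationship gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID. The following is a minimal CPU-side sketch of that arithmetic in C, not shader code; the UVec3 type and both function names are made up for illustration. It assumes a work group size of (4,2,1), since every dimension of a work group is at least 1.

```c
#include <stdint.h>

/* Hypothetical stand-in for GLSL's uvec3. */
typedef struct { uint32_t x, y, z; } UVec3;

/* Per-component form of
 *   gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize
 *                         + gl_LocalInvocationID               */
static UVec3 global_invocation_id(UVec3 work_group_id,
                                  UVec3 work_group_size,
                                  UVec3 local_invocation_id)
{
    UVec3 g;
    g.x = work_group_id.x * work_group_size.x + local_invocation_id.x;
    g.y = work_group_id.y * work_group_size.y + local_invocation_id.y;
    g.z = work_group_id.z * work_group_size.z + local_invocation_id.z;
    return g;
}

/* Flattened index of an invocation inside its work group, the
 * quantity GLSL exposes as gl_LocalInvocationIndex. */
static uint32_t local_invocation_index(UVec3 work_group_size,
                                       UVec3 local_invocation_id)
{
    return local_invocation_id.z * work_group_size.x * work_group_size.y
         + local_invocation_id.y * work_group_size.x
         + local_invocation_id.x;
}
```

With work group ID (1,1,0) and local invocation ID (2,1,0), the first function reproduces the global ID (6,3,0) shown on the slide.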
Compute memory hierarchy
[Diagram: three levels of memory visible to a compute shader]
• Dispatch: Shader Storage Buffer Object (SSBO), Image, Uniform Buffer Object (UBO), Texture Buffer Object (TexBO), Texture
• Work Group: Shared Variables
• Invocation: Local Variables
void memoryBarrier();
void memoryBarrierAtomicCounter();
void memoryBarrierBuffer();
void memoryBarrierImage();
void memoryBarrierShared(); // Only for compute shaders
void groupMemoryBarrier(); // Only for compute shaders
Use memory barriers to order reads/writes accessible to other invocations
Use void barrier() to synchronize invocations in a work group
Texture Views
• “View” texture data store multiple ways
- Re-interpret the format/type
- Clamp mip-map level range
- Clamp array slice range
• No new object types introduced
• Conceptual split of a texture object
- Data store holding texels
- View state describing which part of data store to use
- View state describing how to interpret elements in data store
- An embedded sampler object
- Texture parameters
• Multiple textures share same data store
- Data store ref counted
Texture Views
[Diagram: a texture object holds texture parameters, sampler parameters (mutable), texture view parameters (immutable), and a reference to texel data (mutable, ref counted) created with TexStorage*(). The view selects texture levels from the mipmap chain. The texture lookup hardware uses the sampler parameters (mutable) of a sampler object if one is bound, and feeds the rest of the pipeline.]
Creation of New Texture View
[Diagram: the original texture object, created with TexStorage*(), and a new texture object, created with TextureView(), share the same texel data (mutable, ref counted). Each texture object has its own texture view parameters (immutable), which select the texture levels of the mipmap chain that it sees. The new texture object's texture parameters and sampler parameters are reset to default. For each object, the texture lookup hardware uses a sampler object if one is bound, and feeds the rest of the pipeline.]
New texture state set with TextureView():
enum internalformat // base internal format
enum target // texture target
uint minlevel // first level of mipmap
uint numlevels // number of mipmap levels
uint minlayer // first layer of array texture
uint numlayers // number of layers in array
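How minlevel, numlevels, minlayer and numlayers turn into the window a view selects can be sketched on the CPU. The C code below is an illustration under stated assumptions, not the driver's logic: the ViewRange type and make_view_range function are hypothetical, and the rules it encodes (requested counts are clamped to what the parent exposes, a start outside the parent is rejected, and a view of a view is offset by the parent's own window) are based on a reading of ARB_texture_view that should be checked against the extension text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of the level/layer window a texture exposes,
 * expressed relative to the shared data store. */
typedef struct {
    uint32_t min_level, num_levels;
    uint32_t min_layer, num_layers;
} ViewRange;

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Computes the window of a new view created from `parent`.
 * Returns false when the request cannot select anything (assumed to
 * correspond to an INVALID_VALUE error in the real API). */
static bool make_view_range(const ViewRange *parent,
                            uint32_t minlevel, uint32_t numlevels,
                            uint32_t minlayer, uint32_t numlayers,
                            ViewRange *out)
{
    if (minlevel >= parent->num_levels || minlayer >= parent->num_layers)
        return false;
    if (numlevels == 0 || numlayers == 0)
        return false;

    /* Offsets accumulate so that a view of a view still addresses
     * the original data store. */
    out->min_level  = parent->min_level + minlevel;
    out->min_layer  = parent->min_layer + minlayer;

    /* Counts are clamped to what remains in the parent's window. */
    out->num_levels = min_u32(numlevels, parent->num_levels - minlevel);
    out->num_layers = min_u32(numlayers, parent->num_layers - minlayer);
    return true;
}
```

For a parent with 10 levels and 6 layers, a request for levels starting at 2 with an oversized count yields levels 2 to 9, and a second view created from that one is measured from the original level 0.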
KHR_debug
• Builds on ARB_debug_output
• Callback with debug information
- Or write to log
• Messages grouped by {source, type, ID, severity}
- Source: GL API, GLSL shader, application, third-party, debugger
- Type: Error, performance, undefined behavior, portability
- ID: Unique identifier for each message
- Severity: High, medium, low
• Label objects
- Human readable text
• Annotate command stream
- Markers: Identify some event in your code
- Groups: Encapsulate command stream and control debug verbosity
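The severity grouping above can be used to decide which messages an application logs. The C sketch below shows the kind of filtering a debug callback might do. It defines GLenum, GLuint and the three severity tokens locally so that it compiles without OpenGL headers; in a real program they come from the GL headers, and the function would be called from a callback registered with glDebugMessageCallback. The helper names and the output format are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Local stand-ins so the sketch builds without OpenGL headers. */
typedef unsigned int GLenum;
typedef unsigned int GLuint;
#define GL_DEBUG_SEVERITY_HIGH   0x9146
#define GL_DEBUG_SEVERITY_MEDIUM 0x9147
#define GL_DEBUG_SEVERITY_LOW    0x9148

/* Orders severities explicitly; the token values themselves are not
 * a reliable ranking. Unknown severities rank lowest. */
static int severity_rank(GLenum severity)
{
    switch (severity) {
    case GL_DEBUG_SEVERITY_HIGH:   return 3;
    case GL_DEBUG_SEVERITY_MEDIUM: return 2;
    case GL_DEBUG_SEVERITY_LOW:    return 1;
    default:                       return 0;
    }
}

static const char *severity_name(GLenum severity)
{
    switch (severity) {
    case GL_DEBUG_SEVERITY_HIGH:   return "high";
    case GL_DEBUG_SEVERITY_MEDIUM: return "medium";
    case GL_DEBUG_SEVERITY_LOW:    return "low";
    default:                       return "other";
    }
}

/* Hypothetical helper: writes one log line into `out` when the
 * message is at least as severe as `min_severity`. Returns 0 when
 * the message is filtered out, otherwise snprintf's result. */
static int format_debug_message(char *out, size_t out_size,
                                GLenum min_severity, GLuint id,
                                GLenum severity, const char *message)
{
    if (severity_rank(severity) < severity_rank(min_severity))
        return 0;
    return snprintf(out, out_size, "[GL %s] id=%u: %s",
                    severity_name(severity), id, message);
}
```

Keeping the filter in a plain function like this makes it easy to test without a GL context.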
MultiDraw*Indirect()
• MultiDraw{Arrays/Elements}Indirect
- Combines MultiDraw with DrawIndirect
• MultiDraw
- MultiDraw functions can help reduce validation overhead, especially for many "low complexity" draw calls, while keeping each "sub-object" addressable
• DrawIndirect
- Store draw command inputs in host or GPU buffers
• Provides efficient system for GPU to generate its own work
- Use XFB or SSBO/compute to write the draw command buffers
- For example for culling (setting count to zero), LOD picking (changing count/firstIndex)...
struct DEICommand {
uint count;
uint instanceCount;
uint firstIndex;
int baseVertex;
uint baseInstance;
};
[Diagram: CAD example. Individual model features (bevels…) are stored as separate commands in a MultiDraw indirect buffer, which can be filled by the host, a graphics shader, or a compute shader.]
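The culling idea mentioned above (setting count to zero so a command draws nothing) can be sketched on the host side. The C code below mirrors the DEICommand struct from the slide with fixed-width types; the cull_commands function and its visibility array are hypothetical, and in practice the same write could be done by a compute shader into an SSBO.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field order as the DEICommand struct on the slide,
 * written with fixed-width C types (five 32-bit fields). */
typedef struct {
    uint32_t count;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  baseVertex;
    uint32_t baseInstance;
} DEICommand;

/* Hypothetical culling pass: a command whose sub-object is not
 * visible gets count = 0, so it stays in the buffer but draws no
 * indices. Returns how many commands still draw something. */
static size_t cull_commands(DEICommand *cmds,
                            const uint8_t *visible,
                            size_t n)
{
    size_t drawn = 0;
    for (size_t i = 0; i < n; ++i) {
        if (!visible[i])
            cmds[i].count = 0;
        if (cmds[i].count != 0)
            ++drawn;
    }
    return drawn;
}
```

The resulting array would then be uploaded to a buffer bound to GL_DRAW_INDIRECT_BUFFER and submitted with a single glMultiDrawElementsIndirect call; because culled commands keep their slot, each sub-object stays addressable by its index.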
Shader Storage Buffer Objects (SSBO)
• Read/write and atomic operations on variables stored in a buffer object
- Think writeable UBOs
• New binding point SHADER_STORAGE_BUFFER
- Queryable limits on number of storage blocks per shader type - MAX_<SHADER>_STORAGE_BLOCKS
• Support large buffers
- Implementations must support storage blocks of at least 16 MB
• New std430 memory layout
- Pack scalar arrays more efficiently
• Can use C-style code in a shader to read/write
- Example on next slide
• Especially useful for compute shaders
- No built-in outputs
- Data transfer has to be through buffers or images
Shader Storage Buffer Objects (SSBO)
// GLSL shader side
struct MyVertex {
    vec2 tex[2]; // tightly packed array in std430
    vec3 pos;
    int materialIdx;
};
layout(std430, binding = 4) buffer VertexBlock { // a block name is required; "VertexBlock" is a placeholder
    MyVertex Vertices[]; // unsized array allowed at end of buffer
};
... // compute data to store in Vertices[]
Vertices[i].materialIdx = idx; // directly write to buffer content

// C application side
glGenBuffers(1, &posSSBO);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, posSSBO);
glBufferData(GL_SHADER_STORAGE_BUFFER, .... );
glUseProgram(MyShaderProgram);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 4, posSSBO); // binding point 4 matches "binding = 4" in the shader
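When the application fills such a buffer from the CPU, its C struct has to match the std430 layout of MyVertex. The sketch below is one way to mirror it, assuming the usual std430 rules: the vec2 array has an 8-byte stride, the vec3 is aligned to 16 bytes, and the int fits in the 4 bytes that follow it, giving a 32-byte element. The type name MyVertexStd430 and the vertex_offset helper are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-side mirror of the GLSL MyVertex struct under std430. */
typedef struct {
    float   tex[2][2];    /* vec2 tex[2]: stride 8, bytes 0..15   */
    float   pos[3];       /* vec3 pos: 16-byte aligned, offset 16 */
    int32_t materialIdx;  /* int: offset 28, right after pos      */
} MyVertexStd430;

/* Compile-time checks that the C compiler laid the struct out the
 * way the shader is assumed to read it. */
_Static_assert(offsetof(MyVertexStd430, tex) == 0,          "tex offset");
_Static_assert(offsetof(MyVertexStd430, pos) == 16,         "pos offset");
_Static_assert(offsetof(MyVertexStd430, materialIdx) == 28, "materialIdx offset");
_Static_assert(sizeof(MyVertexStd430) == 32,                "element stride");

/* Byte offset of Vertices[i] from the start of the buffer block. */
static size_t vertex_offset(size_t i)
{
    return i * sizeof(MyVertexStd430);
}
```

Under std140 the same vec2 array would use a 16-byte stride and the element would grow to 48 bytes, which is the packing difference the slide's comment refers to.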
New KHR and ARB extensions
• KHR_texture_compression_astc_ldr
- Adaptive Scalable Texture Compression (ASTC)
- 1-4 components, low bit rates from < 1 bit/pixel to 8 bits/pixel
• ARB_robustness_isolation
- If application causes GPU reset, no other application will be affected
• All 4.3 functionality also available as ARB extensions
[Images: original at 24bpp compared with ASTC compression at 8bpp, 3.56bpp and 2bpp]
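The bit rates quoted for ASTC follow from its block structure: every compressed block occupies 128 bits, and the block footprint determines how many pixels share those bits. A small C sketch of that arithmetic follows; the function names are made up for illustration, and the footprints mentioned afterwards are examples of 2D block sizes rather than the full list.

```c
/* Every ASTC block is 128 bits, whatever its footprint. */
#define ASTC_BLOCK_BITS 128.0

/* Bits per pixel for a 2D block of block_width x block_height pixels. */
static double astc_bits_per_pixel(unsigned block_width,
                                  unsigned block_height)
{
    return ASTC_BLOCK_BITS / (double)(block_width * block_height);
}

/* Compression ratio relative to an uncompressed source with
 * source_bpp bits per pixel (24 for the RGB original above). */
static double astc_compression_ratio(double source_bpp,
                                     unsigned block_width,
                                     unsigned block_height)
{
    return source_bpp / astc_bits_per_pixel(block_width, block_height);
}
```

A 4x4 footprint gives 8 bits per pixel, 6x6 gives about 3.56, 8x8 gives 2, and 12x12 drops below 1 bit per pixel, which matches the range stated on the slide.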
OpenGL 4.3 reference pages
Huge thanks to Graham Sellers!!!
Specification re-ordering
• Shader and buffer centric
- Fixed function interfaces described as alternates
• Introduces concepts and objects at high level
- Before being used later in document
• Error summaries for commands
• Removed duplication of language
• Consistent uses of phrases and terminology
• Aligned section numbering between Core and Compatibility profiles
Conclusion
• OpenGL 4.3 adds major new functionality
- Compute shaders
- Advanced buffer management
- Advanced texture management
- Advanced GPU work creation
• OpenGL usage is rising sharply
- WebGL
- Mobile platforms
- Linux
• OpenGL is 20 years old today!
- Awesome achievement
Rest of the evening
• Get Drink
• Kurt Akeley Presentation
• Toast
• Party
- LIVE DEMO: Viewperf 12
- Play with the O2!
Get your drink and COME BACK to toast to OpenGL
OpenGL 4.3 details
OpenGL 4.3 new texture functionality
• ARB_texture_view
- Provide different ways to interpret texture data without duplicating the texture
- Match DX11 functionality
• ARB_internalformat_query2
- Find out actual supported limits for most texture parameters
• ARB_copy_image
- Direct copy of pixels between textures and render buffers
• ARB_texture_buffer_range
- Create a texture buffer object corresponding to a subrange of a buffer’s data store
• ARB_stencil_texturing
- Read stencil bits of a packed depth-stencil texture
• ARB_texture_storage_multisample
- Immutable storage objects for multisampled textures
OpenGL 4.3 new buffer functionality
• ARB_shader_storage_buffer_object
- Enables all shader stages to read and write to very large buffers
- structs, arrays, scalars, etc
• ARB_invalidate_subdata
- Invalidate all or some of the contents of textures and buffers
• ARB_clear_buffer_object
- Clear a buffer object with a constant value
• ARB_vertex_attrib_binding
- Separate vertex attribute state from the data stores of each array
• ARB_robust_buffer_access_behavior
- Shader reads/writes to an object may only touch data owned by the application
- Applies to out-of-bounds accesses
OpenGL 4.3 new pipeline functionality
• ARB_compute_shader
- Introduces new shader stage
- Enables advanced processing algorithms that harness the parallelism of GPUs
• ARB_multi_draw_indirect
- Draw many GPU generated objects with one call
• KHR_debug
- Enhanced debug context support
• ARB_program_interface_query
- Generic API to enumerate active variables and interface blocks for each stage
- Enumerate active variables in interfaces between separable program objects
• ARB_ES3_compatibility
- Adds OpenGL ES 3.0 features not previously present in OpenGL
- Brings EAC and ETC2 texture compression formats
GLSL 4.3 new functionality
• ARB_shader_image_size
- Query size of an image in a shader
• ARB_explicit_uniform_location
- Set location of a default-block uniform in the shader
• ARB_texture_query_levels
- Query number of mipmap levels accessible through a sampler uniform
• ARB_arrays_of_arrays
- Allows multi-dimensional arrays in GLSL, e.g. float f[4][3];
• ARB_fragment_layer_viewport
- gl_Layer and gl_ViewportIndex now available to fragment shader
Texture object state
• Texture View Parameters (immutable)
- <target>
- TEXTURE_INTERNAL_FORMAT
- TEXTURE_VIEW_{MIN,NUM}_LEVEL
- TEXTURE_VIEW_{MIN,NUM}_LAYER
- TEXTURE_IMMUTABLE_LEVELS
- TEXTURE_SHARED_SIZE
- TEXTURE_{RED,GREEN,BLUE,ALPHA,DEPTH,STENCIL}_SIZE
- TEXTURE_{RED,GREEN,BLUE,ALPHA,DEPTH}_TYPE
- IMAGE_FORMAT_COMPATIBILITY_TYPE
• Texture Parameters
- TEXTURE_WIDTH
- TEXTURE_HEIGHT
- TEXTURE_DEPTH
- TEXTURE_SAMPLES
- TEXTURE_FIXED_SAMPLE_LOCATIONS
- TEXTURE_COMPRESSED
- TEXTURE_COMPRESSED_IMAGE_SIZE
- TEXTURE_IMMUTABLE_FORMAT
- TEXTURE_SWIZZLE_{R,G,B,A}
- TEXTURE_MAX_LEVEL
- TEXTURE_BASE_LEVEL
- DEPTH_STENCIL_TEXTURE_MODE
• Sampler Parameters (mutable)
- TEXTURE_BORDER_COLOR
- TEXTURE_COMPARE_{FUNC,MODE}
- TEXTURE_LOD_BIAS
- TEXTURE_{MAX,MIN}_LOD
- TEXTURE_{MAG,MIN}_FILTER
- TEXTURE_WRAP_{S,T,R}
• State is immutable, unless listed in italics