Seamless Compute and OpenGL Graphics Development in NVIDIA® Nsight 3.0™ Visual Studio Edition …and Beyond 3/20/2013

Page 2: Seamless Compute and OpenGL Graphics Development in NVIDIA

NVIDIA Confidential

Agenda

Computational Graphics and Visual Computing

Developer Challenges

Maximus™

Getting help from Nsight

…and Beyond

Conclusion and Q&A

Page 3: Seamless Compute and OpenGL Graphics Development in NVIDIA

Computational Graphics – Visual Computing

Deferred Shading

Ambient Occlusion

Simulation & Visualization

Medical imaging

Manufacturing

Page 4: Seamless Compute and OpenGL Graphics Development in NVIDIA

Compute Shader Based Deferred Shading

Graphics pipeline rasterizes gbuffers for opaque surfaces

Compute pipeline uses gbuffers

— culls light sources

— computes lighting

— combines with shading

Johann Anderson - Compute Shader Based Deferred Shading
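
The compute pass on this slide begins by culling light sources. A minimal CPU-side sketch of that culling step in C++ follows. It rests on assumptions that are not in the slides: square screen tiles, lights already projected to screen-space circles, and the hypothetical names `ScreenLight` and `cullLightsForTile`. A compute shader would typically run one thread group per tile and also test against the tile's depth range from the G-buffer, which is left out here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical light description: a circle in screen space (pixels).
struct ScreenLight {
    float x, y;    // projected center
    float radius;  // projected radius of influence
};

// Returns the indices of the lights whose circle touches the square tile
// that starts at pixel (tileX * tileSize, tileY * tileSize).
std::vector<std::size_t> cullLightsForTile(const std::vector<ScreenLight>& lights,
                                           int tileX, int tileY, int tileSize) {
    const float minX = static_cast<float>(tileX * tileSize);
    const float minY = static_cast<float>(tileY * tileSize);
    const float maxX = minX + static_cast<float>(tileSize);
    const float maxY = minY + static_cast<float>(tileSize);

    std::vector<std::size_t> visible;
    for (std::size_t i = 0; i < lights.size(); ++i) {
        const ScreenLight& l = lights[i];
        // Closest point of the tile rectangle to the light center.
        const float cx = std::max(minX, std::min(l.x, maxX));
        const float cy = std::max(minY, std::min(l.y, maxY));
        const float dx = l.x - cx;
        const float dy = l.y - cy;
        // Keep the light if that closest point lies within its radius.
        if (dx * dx + dy * dy <= l.radius * l.radius) {
            visible.push_back(i);
        }
    }
    return visible;
}
```

Calling `cullLightsForTile(lights, 0, 0, 16)` returns the light list for the top-left 16×16 tile; the lighting step would then loop over only those indices for every pixel in the tile.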

Page 7: Seamless Compute and OpenGL Graphics Development in NVIDIA

Multi-GPU Platform Overview

[Diagram: APP, OS, WDDM and GPU Driver on the CPU; I/O Hub; MEM; two GPUs, each with MEM, CE1, CE2 and kernels (<<<.>>>)]

Page 8: Seamless Compute and OpenGL Graphics Development in NVIDIA

Single-GPU Simulation and Visualization

[Timeline diagram: CPU and GPU rows across frames Fn-1 and Fn, showing Kernels (<<<…>>>), Mem. xfers and Draw; callouts 1 and 2]

Page 9: Seamless Compute and OpenGL Graphics Development in NVIDIA

Multi-GPU Simulation and Visualization

[Timeline diagram: one CPU row and two GPU rows across frames Fn-1, Fn and Fn+1, showing Kernels (<<<…>>>), Mem. xfers and Draw; callouts 1, 2 and ?]
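
As a rough way to reason about the difference between the single-GPU and multi-GPU timelines, here is an idealized back-of-envelope model in C++. It is an illustration under assumptions the slides do not state: on one GPU the compute, transfer and draw stages of a frame are assumed to run back to back, while with one compute GPU and one graphics GPU they are assumed to overlap perfectly across frames, so the slowest stage sets the frame interval. `FrameStages`, `serialFrameMs` and `pipelinedFrameMs` are hypothetical names, and the model ignores context switches and synchronization stalls.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-frame stage costs in milliseconds.
struct FrameStages {
    double computeMs;   // CUDA kernels
    double transferMs;  // memory transfers between compute and graphics
    double drawMs;      // OpenGL draw calls
};

// Single GPU, no overlap assumed: the stages run one after another.
inline double serialFrameMs(const FrameStages& s) {
    return s.computeMs + s.transferMs + s.drawMs;
}

// Two GPUs, perfect overlap assumed: while frame n is drawn, frame n+1 is
// already being simulated, so the slowest stage bounds the frame interval.
inline double pipelinedFrameMs(const FrameStages& s) {
    return std::max({s.computeMs, s.transferMs, s.drawMs});
}
```

With 6 ms of kernels, 2 ms of transfers and 4 ms of drawing, the model gives 12 ms per frame on one GPU and 6 ms per frame in the pipelined case. Real applications usually land somewhere in between, which is why measuring the actual timeline matters.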

Page 10: Seamless Compute and OpenGL Graphics Development in NVIDIA

Developer Challenges

Debugging GPU Compute and Graphics and interop

Context switching overhead

Multi-Core/Multi-GPU race conditions

Data transfers – host-to-device, device-to-host, P2P

Asynchronous transfers

Multi-core/Multi-threaded

Multi-GPU – Graphics/Graphics, Graphics/Compute

Concurrent kernel execution

Driver models (TCC/WDDM)

Page 11: Seamless Compute and OpenGL Graphics Development in NVIDIA

NVIDIA® Maximus™

Compute Companion Processor

Professional Graphics Processor

Page 12: Seamless Compute and OpenGL Graphics Development in NVIDIA

NVIDIA® Nsight™ Visual Studio Edition

Visual Studio integrated development for GPU and CPU

Profile, Debug, Build

Page 13: Seamless Compute and OpenGL Graphics Development in NVIDIA

NVIDIA Nsight for Graphics Developers

OpenGL 4.2, Direct3D 9/11

• Frame debugger - OpenGL 4.2, Direct3D 9/11

• HLSL and GLSL Shader debugger

• Frame profiler - OpenGL 4.2, Direct3D 9/11

• Application and system trace

Page 14: Seamless Compute and OpenGL Graphics Development in NVIDIA

NVIDIA Nsight for Compute Developers

CUDA 4.2 and 5.0 and Maximus™

• CUDA Debugger

• CUDA Memory Checker

• Application and system trace

• CUDA Profiler

Page 15: Seamless Compute and OpenGL Graphics Development in NVIDIA

Two Methods for Fast Ray-Cast Ambient Occlusion

Render depth & normal buffers using OpenGL

Calculate AO using CUDA

Deferred shading in OpenGL

[Timeline diagram: two GPU rows across frames Fn-1 and Fn, showing Kernels (<<<…>>>), Mem. xfers and Draw]

Page 17: Seamless Compute and OpenGL Graphics Development in NVIDIA

PhysX Fracture Demo

Fracture computation in CUDA

PhysX GPU Rigid body collision in CUDA

OpenGL for rendering opaque scene and composite final image

[Timeline diagram: two GPU rows across frames Fn-1 and Fn, showing Kernels (<<<…>>>), Mem. xfers and Draw]

Page 19: Seamless Compute and OpenGL Graphics Development in NVIDIA

Macro Analysis

Heterogeneous System Utilization

Page 20: Seamless Compute and OpenGL Graphics Development in NVIDIA

NVIDIA Tools Extension API

Markers and Ranges

— Colors, string and payload

NVTX

— Host code decoration

— Frame Debugger scrubber

— Frame Profiling

— System Trace

NVTXT

— Custom data provider for System Trace
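
Host code decoration with NVTX ranges and markers might look like the following C++ sketch. `nvtxRangePushA`, `nvtxRangePop` and `nvtxMarkA` are NVTX calls declared in `nvToolsExt.h`; everything else is an assumption made for illustration: the `USE_NVTX` opt-in macro, the no-op stand-ins that let the sketch build without the NVTX headers, the `ScopedRange` helper with its depth counter, and the `simulateAndDrawFrame` function.

```cpp
#include <cassert>

#ifdef USE_NVTX  // hypothetical opt-in macro: define it when building against NVTX
#include <nvToolsExt.h>
#else
// No-op stand-ins so the sketch builds without the NVTX headers.
inline int nvtxRangePushA(const char*) { return 0; }
inline int nvtxRangePop() { return 0; }
inline void nvtxMarkA(const char*) {}
#endif

// Hypothetical RAII helper: opens a named range on construction and closes
// it on destruction, so every push is matched by a pop even on early return.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) { nvtxRangePushA(name); ++depth(); }
    ~ScopedRange() { nvtxRangePop(); --depth(); }
    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;

    // Ranges currently open on this thread (bookkeeping for this sketch only).
    static int& depth() {
        thread_local int d = 0;
        return d;
    }
};

// Hypothetical frame function: nested ranges for the frame and its phases.
// Returns the nesting depth seen inside the simulation range.
inline int simulateAndDrawFrame() {
    ScopedRange frame("Frame");
    int depthInsideSimulation = 0;
    {
        ScopedRange simulation("Simulation (CUDA)");
        // ... launch kernels here ...
        depthInsideSimulation = ScopedRange::depth();
    }
    nvtxMarkA("Simulation finished");  // instantaneous marker on the timeline
    {
        ScopedRange draw("Draw (OpenGL)");
        // ... issue draw calls here ...
    }
    return depthInsideSimulation;
}
```

The colors and payload mentioned on the slide are set through the `Ex` variants of these calls (for example `nvtxRangePushEx`), which take an `nvtxEventAttributes_t` structure instead of a plain string.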

Page 21: Seamless Compute and OpenGL Graphics Development in NVIDIA

High Performance Areas to Watch For

Kernel execution concurrency

Asynchronous memory transfers (Quadro/Tesla)

WDDM kernel fifos

Minimize Compute/Graphics context switches on a single GPU

Multi-GPU and P2P xfers

Page 22: Seamless Compute and OpenGL Graphics Development in NVIDIA

Single-GPU Memory Transfers

[Diagram: APP, OS, WDDM and GPU Driver on the CPU; I/O Hub; MEM; two GPUs, each with MEM, CE1, CE2 and kernels (<<<.>>>)]

Page 23: Seamless Compute and OpenGL Graphics Development in NVIDIA

Multi-GPU memory transfers – P2H2P

[Diagram: APP, OS, WDDM and GPU Driver on the CPU; I/O Hub; MEM; two GPUs, each with MEM, CE1, CE2 and kernels (<<<.>>>)]

Page 24: Seamless Compute and OpenGL Graphics Development in NVIDIA

Multi-GPU memory transfers – P2P

[Diagram: APP, OS, WDDM and GPU Driver on the CPU; I/O Hub; MEM; two GPUs, each with MEM, CE1, CE2 and kernels (<<<.>>>)]

Page 25: Seamless Compute and OpenGL Graphics Development in NVIDIA

Multi-GPU memory transfers

[Diagram 1: CUDA Context on GPU 1 and OpenGL Context on GPU 2; Buffer passes through RAM using CUDA memcpy and glBufferSubData]

[Diagram 2: CUDA Context on GPU 1, Aux. CUDA Context and OpenGL Context on GPU 2; Buffer moves with CUDA memcpy and reaches OpenGL through API Interop]

Page 27: Seamless Compute and OpenGL Graphics Development in NVIDIA

…Beyond

GLSL Compute shader

Single GPU Graphics and Compute debugging

Full Compute and Graphics capture and replay

Source code generation of Compute and Graphics trace

Intra-frame debugging – snapshots of interop buffers…

Page 28: Seamless Compute and OpenGL Graphics Development in NVIDIA

Intra-Frame CUDA/OpenGL interop support

[Timeline diagram: one CPU row and two GPU rows across frames Fn-1 and Fn, showing Kernels (<<<…>>>), Mem. xfers and Draw]

Page 29: Seamless Compute and OpenGL Graphics Development in NVIDIA

Conclusion

Multi-GPU developer tools are available

Start GPU Debugging…

Use NVToolsExt

System Trace is your friend

Page 30: Seamless Compute and OpenGL Graphics Development in NVIDIA

Q&A

Download

— http://www.nvidia.com/nsight

Nsight Visual Studio Edition Developer Forums

— https://devtalk.nvidia.com/default/board/84/nsight-visual-studio-edition

Nsight documentation

— Start All Programs NVIDIA Nsight Visual Studio Edition 3.0 User Guide

— http://http.developer.nvidia.com/NsightVisualStudio/3.0/Documentation/UserGuide/HTML/Nsight_Visual_Studio_Edition_User_Guide.htm

Nsight instruction videos

— http://www.gputechconf.com/object/gtc-express-webinar.html

Special thanks to Tero Karras, Shalini Venkataraman, and David Luebke for the demo content