Seamless Compute and OpenGL Graphics Development in NVIDIA® Nsight™ Visual Studio Edition 3.0 …and Beyond
3/20/2013
NVIDIA Confidential
Agenda
Computational Graphics and Visual Computing
Developer Challenges
Maximus™
Getting help from Nsight
…and Beyond
Conclusion and Q&A
Computational Graphics – Visual Computing
Deferred Shading
Ambient Occlusion
Simulation & Visualization
Medical imaging
Manufacturing
…
Compute Shader Based Deferred Shading
Graphics pipeline rasterizes gbuffers for opaque surfaces
Compute pipeline uses gbuffers
— culls light sources
— computes lighting
— combines with shading
Johan Andersson - Compute Shader Based Deferred Shading
Multi-GPU Platform Overview
[Diagram: CPU side with APP, OS, WDDM, GPU Driver, and MEM; an I/O hub; two GPUs, each with MEM, copy engines CE1 and CE2, and kernel execution (<<<…>>>)]
Single-GPU Simulation and Visualization
[Timeline: CPU and GPU rows across frames Fn-1 and Fn, showing kernels (<<<…>>>), memory transfers, and draw calls; callouts 1 and 2]
Multi-GPU Simulation and Visualization
[Timeline: one CPU row and two GPU rows across frames Fn-1, Fn, and Fn+1, showing kernels (<<<…>>>), memory transfers, and draw calls; callouts 1, 2, and ?]
Developer Challenges
Debugging GPU Compute and Graphics and interop
Context switching overhead
Multi-Core/Multi-GPU race conditions
Data transfers – host-to-device, device-to-host, P2P
Asynchronous transfers
Multi-core/Multi-threaded
Multi-GPU – Graphics/Graphics, Graphics/Compute
Concurrent kernel execution
Driver models (TCC/WDDM)
NVIDIA® Maximus™
Compute companion processor
Professional graphics processor
NVIDIA® Nsight™ Visual Studio Edition
Visual Studio integrated development for GPU and CPU
Build, Debug, Profile
NVIDIA Nsight for Graphics Developers
OpenGL 4.2, Direct3D 9/11
• Frame debugger - OpenGL 4.2, Direct3D 9/11
• HLSL and GLSL Shader debugger
• Frame profiler - OpenGL 4.2, Direct3D 9/11
• Application and system trace
NVIDIA Nsight for Compute Developers
CUDA 4.2 and 5.0 and Maximus™
• CUDA Debugger
• CUDA Memory Checker
• Application and system trace
• CUDA Profiler
Two Methods for Fast Ray-Cast Ambient Occlusion
Render depth & normal buffers using OpenGL
Calculate AO using CUDA
Deferred shading in OpenGL
[Timeline: two GPU rows across frames Fn-1 and Fn, showing kernels (<<<…>>>), memory transfers, and draw calls]
PhysX Fracture Demo
Fracture computation in CUDA
PhysX GPU Rigid body collision in CUDA
OpenGL for rendering opaque scene and composite final image
[Timeline: two GPU rows across frames Fn-1 and Fn, showing a series of kernels (<<<…>>> … <<<…>>>), memory transfers, and draw calls]
Macro Analysis
Heterogeneous System Utilization
NVIDIA Tools Extension API
Markers and Ranges
— Colors, string and payload
NVTX
— Host code decoration
— Frame Debugger scrubber
— Frame Profiling
— System Trace
NVTXT
— Custom data provider for System Trace
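The markers and ranges listed above might be used roughly as follows. This is a minimal sketch, assuming the NVTX C API from `nvToolsExt.h` as shipped with the CUDA Toolkit; the helper `pushRange`, the range names, and the colors are hypothetical and chosen only for illustration.

```cuda
#include <cstdint>
#include <cstdio>
#include <nvToolsExt.h>  // NVTX header from the CUDA Toolkit; link against nvToolsExt

// Hypothetical helper (not part of NVTX): opens a range that carries
// an ARGB color, an ASCII label, and a 64-bit integer payload.
static void pushRange(const char* label, uint32_t argb, uint64_t payload)
{
    nvtxEventAttributes_t attr = {};
    attr.version          = NVTX_VERSION;
    attr.size             = NVTX_EVENT_ATTRIB_STRUCT_SIZE;
    attr.colorType        = NVTX_COLOR_ARGB;
    attr.color            = argb;
    attr.payloadType      = NVTX_PAYLOAD_TYPE_UNSIGNED_INT64;
    attr.payload.ullValue = payload;
    attr.messageType      = NVTX_MESSAGE_TYPE_ASCII;
    attr.message.ascii    = label;
    nvtxRangePushEx(&attr);
}

int main()
{
    for (uint64_t frame = 0; frame < 3; ++frame) {
        pushRange("Frame", 0xFF00FF00u, frame);      // outer range, one per frame

        pushRange("Simulate", 0xFFFF0000u, frame);   // nested range
        // ... launch CUDA kernels here ...
        nvtxRangePop();                              // closes "Simulate"

        nvtxMarkA("Simulation done");                // instantaneous marker

        pushRange("Draw", 0xFF0000FFu, frame);
        // ... issue OpenGL draw calls here ...
        nvtxRangePop();                              // closes "Draw"

        nvtxRangePop();                              // closes "Frame"
    }
    std::printf("emitted NVTX ranges for 3 frames\n");
    return 0;
}
```

Ranges pushed on a thread nest, so each `nvtxRangePop` closes the most recently opened range on that thread. Without a tool attached, the calls are expected to cost very little, which is why host code is usually decorated permanently rather than only while profiling.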
High Performance Areas to Watch For
Kernel execution concurrency
Asynchronous memory transfers (Quadro/Tesla)
WDDM kernel FIFOs
Minimize Compute/Graphics context switches on a single GPU
Multi-GPU and P2P xfers
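As an illustration of the first two items, the sketch below overlaps asynchronous transfers and kernel execution using pinned host memory and two CUDA streams. It is a minimal example under stated assumptions: the kernel `scale`, the `CHECK` macro, the chunk size, and the stream count are hypothetical, and whether copies and kernels actually overlap depends on the GPU's copy engines and the driver model.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-checking macro used only in this sketch.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical kernel: multiplies each element by a constant factor.
__global__ void scale(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    const int    kStreams = 2;
    const int    kChunk   = 1 << 20;                  // elements per stream
    const size_t kBytes   = kChunk * sizeof(float);   // bytes per stream

    float* host = nullptr;
    float* dev  = nullptr;
    // Pinned (page-locked) host memory is required for truly asynchronous copies.
    CHECK(cudaMallocHost(&host, kStreams * kBytes));
    CHECK(cudaMalloc(&dev, kStreams * kBytes));
    for (int i = 0; i < kStreams * kChunk; ++i) host[i] = 1.0f;

    cudaStream_t streams[kStreams];
    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamCreate(&streams[s]));

    // Each stream uploads its chunk, processes it, and downloads the result.
    // Work in different streams may overlap on hardware that supports it.
    for (int s = 0; s < kStreams; ++s) {
        float* h = host + s * kChunk;
        float* d = dev + s * kChunk;
        CHECK(cudaMemcpyAsync(d, h, kBytes, cudaMemcpyHostToDevice, streams[s]));
        scale<<<(kChunk + 255) / 256, 256, 0, streams[s]>>>(d, kChunk, 2.0f);
        CHECK(cudaGetLastError());
        CHECK(cudaMemcpyAsync(h, d, kBytes, cudaMemcpyDeviceToHost, streams[s]));
    }

    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamSynchronize(streams[s]));
    std::printf("host[0] after round trip: %f\n", host[0]);

    for (int s = 0; s < kStreams; ++s) CHECK(cudaStreamDestroy(streams[s]));
    CHECK(cudaFree(dev));
    CHECK(cudaFreeHost(host));
    return 0;
}
```

A system trace of a run like this is one way to check whether the copies and kernels really overlap or end up serialized.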
Single-GPU Memory Transfers
[Diagram: CPU side with APP, OS, WDDM, GPU Driver, and MEM; an I/O hub; two GPUs, each with MEM, copy engines CE1 and CE2, and kernel execution (<<<…>>>)]
Multi-GPU memory transfers – P2H2P
[Diagram: CPU side with APP, OS, WDDM, GPU Driver, and MEM; an I/O hub; two GPUs, each with MEM, copy engines CE1 and CE2, and kernel execution (<<<…>>>)]
Multi-GPU memory transfers – P2P
[Diagram: CPU side with APP, OS, WDDM, GPU Driver, and MEM; an I/O hub; two GPUs, each with MEM, copy engines CE1 and CE2, and kernel execution (<<<…>>>)]
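A peer-to-peer copy between two GPUs could be sketched with the CUDA runtime as shown here. This is an illustrative example, not the talk's demo code: the device indices 0 and 1, the buffer size, and the `CHECK` macro are assumptions, and whether direct peer access is available depends on the hardware, topology, and driver model.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical error-checking macro used only in this sketch.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

int main()
{
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        std::printf("this sketch needs at least two CUDA devices\n");
        return 0;
    }

    const int    src   = 0;           // assumed source GPU
    const int    dst   = 1;           // assumed destination GPU
    const size_t bytes = 64u << 20;   // 64 MiB, arbitrary

    void* srcBuf = nullptr;
    void* dstBuf = nullptr;
    CHECK(cudaSetDevice(src));
    CHECK(cudaMalloc(&srcBuf, bytes));
    CHECK(cudaSetDevice(dst));
    CHECK(cudaMalloc(&dstBuf, bytes));

    // Ask whether dst can directly address memory that lives on src.
    int canAccess = 0;
    CHECK(cudaDeviceCanAccessPeer(&canAccess, dst, src));
    if (canAccess) {
        // dst is the current device here; grant it access to src's memory.
        CHECK(cudaDeviceEnablePeerAccess(src, 0));
    }
    std::printf("direct peer access %s\n", canAccess ? "enabled" : "unavailable");

    // cudaMemcpyPeer is valid in both cases; without peer access the
    // runtime falls back to staging the copy through host memory.
    CHECK(cudaMemcpyPeer(dstBuf, dst, srcBuf, src, bytes));
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFree(dstBuf));
    CHECK(cudaSetDevice(src));
    CHECK(cudaFree(srcBuf));
    return 0;
}
```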
Multi-GPU memory transfers
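[Diagram 1: a CUDA context and an OpenGL context on GPU 1 and GPU 2, exchanging data through a buffer in RAM using CUDA memcpy and glBufferSubData]
[Diagram 2: a CUDA context, an auxiliary CUDA context, and an OpenGL context on GPU 1 and GPU 2, exchanging data through a buffer using CUDA memcpy and API interop]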
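The API interop step, in which a CUDA context writes into a buffer owned by an OpenGL context on the same GPU, might look like the sketch below. It assumes the CUDA runtime's graphics interop API from `cuda_gl_interop.h`; the function `updateBufferFromCuda`, the kernel `fill`, and the buffer layout are hypothetical. It is a fragment rather than a full program: the caller is assumed to have created an OpenGL context, made it current, and allocated `vbo` with room for at least `n` floats.

```cuda
#include <cuda_runtime.h>
// Pulls in the platform OpenGL header; on Windows, include <windows.h> first.
#include <cuda_gl_interop.h>

// Hypothetical kernel: writes a constant into every element.
__global__ void fill(float* data, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = value;
}

// Hypothetical helper: fills an OpenGL buffer object from CUDA.
// For brevity it registers and unregisters the buffer on every call;
// real code would normally register once at startup.
cudaError_t updateBufferFromCuda(GLuint vbo, int n, float value)
{
    cudaGraphicsResource_t resource = nullptr;
    cudaError_t err = cudaGraphicsGLRegisterBuffer(
        &resource, vbo, cudaGraphicsRegisterFlagsWriteDiscard);
    if (err != cudaSuccess) return err;

    // While mapped, the buffer belongs to CUDA and OpenGL must not touch it.
    err = cudaGraphicsMapResources(1, &resource, 0);
    if (err == cudaSuccess) {
        float* devPtr = nullptr;
        size_t bytes  = 0;
        err = cudaGraphicsResourceGetMappedPointer(
            reinterpret_cast<void**>(&devPtr), &bytes, resource);
        if (err == cudaSuccess) {
            if (bytes >= static_cast<size_t>(n) * sizeof(float)) {
                fill<<<(n + 255) / 256, 256>>>(devPtr, n, value);
                err = cudaGetLastError();
            } else {
                err = cudaErrorInvalidValue;  // buffer too small for n floats
            }
        }
        // Unmapping hands the buffer back to OpenGL for drawing.
        cudaError_t unmapErr = cudaGraphicsUnmapResources(1, &resource, 0);
        if (err == cudaSuccess) err = unmapErr;
    }

    cudaGraphicsUnregisterResource(resource);
    return err;
}
```

The map and unmap calls are the synchronization points between the two APIs, so keeping the number of map/unmap pairs per frame small is one way to limit compute/graphics switching on a single GPU.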
…Beyond
GLSL Compute shader
Single GPU Graphics and Compute debugging
Full Compute and Graphics capture and replay
Source code generation of Compute and Graphics trace
Intra-frame debugging – snapshots of interop buffers…
Intra-Frame CUDA/OpenGL interop support
[Timeline: CPU row and two GPU rows across frames Fn-1 and Fn, showing kernels (<<<…>>>), memory transfers, and draw calls]
Conclusion
Multi-GPU developer tools are available
Start GPU Debugging…
Use NVToolsExt
System Trace is your friend
Q&A
Download
— http://www.nvidia.com/nsight
Nsight Visual Studio Edition Developer Forums
— https://devtalk.nvidia.com/default/board/84/nsight-visual-studio-edition
Nsight documentation
— Start > All Programs > NVIDIA Nsight Visual Studio Edition 3.0 > User Guide
— http://http.developer.nvidia.com/NsightVisualStudio/3.0/Documentation/UserGuide/HTML/Nsight_Visual_Studio_Edition_User_Guide.htm
Nsight instruction videos
— http://www.gputechconf.com/object/gtc-express-webinar.html
Special Thanks – Tero Karras, Shalini Venkataraman, David Luebke for the demo content