Video Basics. Agenda Digital Video Compressing Video Audio Video Encoding in tools.
HIGH-PERFORMANCE GPU VIDEO ENCODING...AGENDA GPU Video Encoding Overview NVIDIA Video Encoding...
Transcript of HIGH-PERFORMANCE GPU VIDEO ENCODING...AGENDA GPU Video Encoding Overview NVIDIA Video Encoding...
ABHIJIT PATAIT
SR. MANAGER, NVIDIA
HIGH-PERFORMANCE GPU VIDEO ENCODING
AGENDA GPU Video Encoding Overview
NVIDIA Video Encoding Capabilities
Kepler, Maxwell Gen 1, Maxwell Gen 2
Software API
Performance & Quality
Roadmap
WHY GPU VIDEO ENCODING?
BENEFITS Low power
Fixed function hardware, free CPU
Reduced memory transfers
Low latency
High performance
Higher density
Scalability
Automatic benefit from improvements in hardware
Ease of programming
Linux, Windows, C/C++, Application portability
NVIDIA VIDEO ENCODER CAPABILITIES
MAIN FEATURES
Feature Benefits
H.264 base, main, high profiles Wide range of use-cases
H.265/HEVC main profile Lower bitrates at same quality
High performance (4K @ 60 fps) “Blazing-speed” encoding
YUV 4:2:0 and 4:4:4 support High quality encoding without chroma
subsampling
QP maps Customizable quality, region of interest encoding
4K encoding in hardware High resolution encode
API - NV Encode SDK & GRID SDK Flexible, Win/Linux, DirectX/CUDA
Independent of CUDA Use CUDA and encode simultaneously
FEATURE COMPARISON Kepler Maxwell Gen 1 (GM10x) Maxwell Gen 2 (GM20x)
H.264 only H.264 only H.264 and HEVC/H.265
Planar 4:4:4 & proprietary
4:4:4; no lossless encoding
Standard 4:4:4 and H.264
lossless encoding
Standard 4:4:4 and H.264
lossless encoding
~240 fps 2-pass encoding @
720p
~500 fps 2-pass encoding @
720p
~900 fps 2-pass encoding @
720p
GRID K340/K520, K1/K2,
Quadro, Tesla K10/K20
Maxwell-based GRID & Quadro
products
TBA
GeForce – 2 full-speed encode
sessions/system
GeForce – 2 full-speed encode
sessions/system
GeForce – 2 full-speed encode
sessions/system
NV Encode SDK 1.0-5.0 (Now) NV Encode SDK 4.0+ (Now) NV Encode SDK 5.0+ (Now)
GRID SDK 1.x, 2.2, 2.3 (Now) GRID SDK 3.0+ (Now) In development
WHAT’S NEW – HARDWARE HEVC
8-bit encoding
Main8 profile
Optimized for low-latency applications (I and P frames)
> 300 fps at very high quality 720p
H.264
Improved performance (~80% higher compared to 1st Gen Maxwell)
4:4:4 and lossless
WHAT’S NEW - SOFTWARE NVENC SDK 5.0
NVIDIA GPU driver 347.18 and above
HEVC
Unified API for H.264 and HEVC
Linux & Windows
Intra refresh, ref-pic invalidation, etc. for H.264 and HEVC
Support for all NVENC hardware up to GM20x
Adaptive quantization
Quality improvements
All-new sample applications, including a performance application
SOFTWARE API
USING NVENC N
VEN
C S
DK
• No capture
• Transcoding
• Archiving
• Video editing
• CUDA pre-process + encoding
• Granular encoder settings
• D3D, CUDA interop
Direct
Encode
GRID
SD
K
• Capture + encode
• Optimized for low-latency apps
• Capture + CUDA pre-process + encoding
• Encoder settings optimized for streaming
• D3D, CUDA interop
Capture +
Encode
DIRECT ENCODE (NVENC SDK)
Client application
NVENC API
NVENC
Driver
DirectX
Driver
CUDA
Driver
NVENC firmware + hardware
Initialize,
Configure HW
HW Encode
Encoded
bitstream Configure, Encode
CAPTURE & ENCODE (GRID SDK)
Client application
NvFBC/NvIFR
NVENC
Driver
DirectX/OGL
Driver
NVENC Hardware
Capture
YUV
GPU 3D Engine
DX/OGL Present
Encode
Encoded
Bitstream
NVENC SDK (1/2) Available on NVIDIA developer zone
https://developer.nvidia.com/nvidia-video-codec-sdk
Current release: 5.0
Interface header, documentation, sample application
.dll/.so included in the driver
Unified API for Windows and Linux
Works on x86/x64
API’s, presets, rate control modes for
Low-latency streaming
Transcoding
Video conferencing
NVENC SDK (2/2) Unified API for H.264 and HEVC
Flexibility
Dynamic resolution/bitrate change
Low-level encoder settings
Windows, Linux, DirectX, CUDA, OGL (via CUDA)
Works on GeForce (2 sessions/system)
Error concealment
Reference picture invalidation
Intra-refresh
Greater flexibility for quality/performance trade-off
Lossless encoding only in NVENC SDK
GRID SDK ENCODE NDA only – older release available on NV developer zone
https://developer.nvidia.com/grid-app-game-streaming
Current release: 3.1 (Now – NDA), 2.3 (Public)
Interface header, documentation, sample apps
.dll/.so included in the driver
Windows and Linux
Works on x86/x64
Presets and API’s for
Remote graphics (Cloud gaming, remote desktop, capture & stream)
Optimized for low latency
QUALITY
H.264 QUALITY – 1-PASS ENCODING
34
36
38
40
42
44
46
6 8 10 12 15 18 20
PSN
R (
dB)
bitrate (Mbps)
H.264 quality with 1-pass rate control
Default
LL-Default
HP
HQ
BD
LL-HQ
H.264 QUALITY – 2-PASS ENCODING
34
36
38
40
42
44
46
6 8 10 12 15 18 20
PSN
R (
dB)
bitrate (Mbps)
H.264 quality with 2-pass rate control
Default
LL-Default
HP
HQ
BD
LL-HQ
COMPARISON: 1-PASS VS 2-PASS
37.5
38
38.5
39
39.5
40
40.5
Default LL-Default HP HQ BD LL-HQ
PSN
R (
dB)
Encoder preset
H.264 quality comparison: 1-pass vs 2-pass
1-p
ass
2-p
ass
1-p
ass
1-p
ass
1-p
ass
1-p
ass
1-p
ass
2-p
ass
2-p
ass
2-p
ass
2-p
ass
2-p
ass
BITRATE SAVINGS
39.5 dB 41.0 dB 42.0 dB
6
8
9.8
4
6
8
Bitrate savings - Default preset
39.5 dB 41.0 dB 42.0 dB
5.8
7.8
9.7
3.9
5.8
7.9
Bitrate savings - HQ preset
33% 25% 18% 33% 26% 19% Bitrate savings
H.2
64
H.2
64
H.2
64
H.2
64
H.2
64
H.2
64
HEVC
HEVC
HEVC
HEVC
HEVC
HEVC
H.264 VS HEVC
Courtesy: Vanguard video
H.264 VS HEVC
Courtesy: Vanguard video
PERFORMANCE
H.264 PERFORMANCE – GM20X
1080p, GM20x
1 p
ass
2 p
ass
2 p
ass
2 p
ass
2 p
ass
1 p
ass
1 p
ass
1 p
ass
fps
50 fps
100 fps
150 fps
200 fps
250 fps
300 fps
350 fps
400 fps
450 fps
500 fps
HP LL-HP HQ LL-HQ
Single pass 464 fps 342 fps 291 fps 293 fps
Two pass 306 fps 246 fps 171 fps 181 fps
Encode F
PS
H.264 Performance (1080p)
1 p
ass
1 p
ass
1 p
ass
1 p
ass
2 p
ass
2 p
ass
2 p
ass
2 p
ass
H.264/HEVC PERF COMPARISON
fps
50 fps
100 fps
150 fps
200 fps
250 fps
300 fps
350 fps
HP LL-HP HQ LL-HQ
H.264 306 fps 246 fps 171 fps 181 fps
H.265 220 fps 214 fps 102 fps 153 fps
Encode F
PS
H.264/HEVC Performance: 2-pass
H.2
64
H.2
64
H.2
64
H.2
64
HEVC
HEVC
HEVC
HEVC
PERFORMANCE - TREND
fps
200 fps
400 fps
600 fps
800 fps
1000 fps
1200 fps
1400 fps
1600 fps
1800 fps
Kepler (2011) Maxwell Gen 1 (2013) Maxwell Gen 2 (2014) Future
Performance
ROADMAP
ROADMAP Core GPU chip IP
Motion estimation only mode – 2H2015
SAO, 10/12-bit, HEVC B-frames
Lossless/4:4:4
Improved quality for screen content encoding
ME performance and quality enhancements
Today: 4K@60fps
Next: 8K@??
THANK YOU [email protected]