Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution
Transcript of Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution
![Page 1: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/1.jpg)
VICTOR H. S. HA, PH.D.
VPG MEDIA AND DISPLAY IP, INTEL CORP.
![Page 2: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/2.jpg)
Title: Ultra High Definition (UHD) Video Scaling: Low-Power (LP) Hardware (HW) Fixed-Function (FF) vs. Convolutional Neural Network (CNN)-based Super-Resolution (SR)
1. Gen9 Intel® Processor Graphics: SFC Media HW FF, Advanced Video Scaler in SFC
2. Super-Resolution Scaling: Convolutional Neural Network, Super-Resolution Scaling using CNN
3. Compare
![Page 3: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/3.jpg)
Gen9 Intel® processor graphics
![Page 4: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/4.jpg)
![Page 5: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/5.jpg)
![Page 6: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/6.jpg)
UHD End-to-End Support in Gen9 Intel® Processor Graphics
UHD Decode, Encode, Display
UHD Content
UHD Display
UHD Capture
UHD Video Scaling Support
• Upscale from HD to UHD
• Downscale from UHD to HD
Display Port* (DP), Embedded DisplayPort* (eDP), Miracast* and other names and brands may be claimed as the property of others
* GPU Accelerated; Media Codec support may not be available on all operating systems and applications.
![Page 7: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/7.jpg)
![Page 8: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/8.jpg)
Why Is UHD Scaling Different?
SD to HD Scaling
• Pixel Resolution from 720x480 to 1920x1080
• Aspect Ratio from 4:3 to 16:9
• SD Video in Low Quality, often requiring De-interlace, De-noise, De-blocking, Sharpening, etc.
• 345,600 pixels to 2,073,600 pixels
FHD to 4K UHD Scaling
• Pixel Resolution from 1920x1080 to 3840x2160
• Aspect Ratio stays at 16:9
• FHD Video already in High-Quality with Crisp Details
• 2,073,600 pixels to 8,294,400 pixels
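The pixel counts above follow directly from the frame dimensions. A minimal sketch of the arithmetic (the dictionary keys and function names are illustrative, not from the slides):

```python
# Pixel counts for the two scaling scenarios listed above.
RESOLUTIONS = {
    "SD": (720, 480),
    "FHD": (1920, 1080),
    "4K UHD": (3840, 2160),
}

def pixel_count(name):
    """Number of pixels in one frame of the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height

def upscale_ratio(src, dst):
    """Ratio of output pixels to input pixels."""
    return pixel_count(dst) / pixel_count(src)

for src, dst in [("SD", "FHD"), ("FHD", "4K UHD")]:
    print(f"{src} -> {dst}: {pixel_count(src):,} -> {pixel_count(dst):,} pixels "
          f"({upscale_ratio(src, dst):.0f}x)")
```

SD to HD multiplies the pixel count by 6 and changes the aspect ratio, while FHD to 4K UHD is a clean 4x (2x per axis) at the same aspect ratio.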
![Page 9: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/9.jpg)
[Figure: Gen9 Intel® Processor Graphics block diagram showing Unslice (Geometry, FF Media), Slice Common, and Subslice]
• 6th Generation Intel Core Processor Graphics on 14nm Process
• Support of Latest APIs
o DirectX* 12/11.3
o OpenCL 2.0
o OpenGL* 4.4
• Scalable uArch Partitioning similar to 5th Generation Intel® Core™ Architecture
o Unslice, Slice, Subslice, etc.
• Improved Design for Better Energy Efficiency
• Flexible and Finer-grain Power Management
* Other names and brands may be claimed as the property of others
![Page 10: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/10.jpg)
Multi-Format Codec (MFX)
• HEVC Decode
• HEVC Encode
• HEVC 10bit Decode (GPU Accelerated)
• JPEG / MJPEG Decode
• JPEG / MJPEG Encode
• MPEG2 Decode and Encode
• AVC Decode and Encode
• VP8 Decode and Encode
![Page 11: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/11.jpg)
Video Quality Engine (VQE)
• Video Processing and Enhancement
• 16bit per channel processing pipe
• RAW image processing pipe
• De-noise
• De-interlace
• Contrast/Saturation Enhancement
• Skin-tone Detection and Enhancement
• Color Space Conversion (BT2020)
• Color Correction
![Page 12: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/12.jpg)
Scaler and Format Converter (SFC)
• Dedicated Media FF HW
• Advanced Video Scaler (AVS)
• Sharpness Enhancement
• Color Space Conversion
• Chroma Sampling
• Rotation and other Format Conversions
Media Sampler
• Video Motion Estimation (VME)
• Advanced Video Scaler (AVS)
• Sharpness Enhancement
![Page 13: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/13.jpg)
SFC (Scaler and Format Converter)
Low-Power UHD Video Playback
• New SFC HW pipe is added to deliver Ultra Low Power media playback experience
• SFC is connected inline (without memory read/write) to MFX (video decode) and VQE (video processing)
![Page 14: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/14.jpg)
SFC AVS Example #1
Video Decode → Scaling → Display (or Encode)
• GEN8 without SFC: MFX Video Decode, Media Sampler AVS, VQE Video Enhancement, MFX Video Encode
• GEN9 with SFC: MFX Video Decode, SFC AVS as VD-SFC (Video Decode SFC), VQE Video Enhancement, MFX Video Encode
[Figure: block diagrams of both pipelines, with memory read/write marked between stages]
![Page 15: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/15.jpg)
SFC AVS Example #2
Video Quality Enhancement → Scaling → Display (or Encode)
• GEN8 without SFC: MFX Video Decode, VQE Video Enhancement, Media Sampler AVS, MFX Video Encode
• GEN9 with SFC: MFX Video Decode, VQE Video Enhancement, SFC AVS as VE-SFC (Video Enhance SFC), MFX Video Encode
[Figure: block diagrams of both pipelines, with memory read/write marked between stages]
![Page 16: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/16.jpg)
SFC (Scaler and Format Converter)
SFC pipeline delivers many benefits:
• Inline Connection: Reduced bandwidth and power consumption
• SFC handles scaling, detail enhancement, color space conversion, and other format conversion on the fly
• 12bit Data Path ready for Ultra-HD (UHD), High Dynamic Range (HDR), Wide Color Gamut (WCG)
• Frees up EU resources (slice/subslice) from media use cases, so they can be power-gated when not used
• SFC can process UHD Video (3840x2160 @ 60fps) while operating in a power-efficient low-frequency mode
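To put the inline-connection benefit in numbers, the sketch below computes the pixel rate of a 3840x2160 @ 60fps stream and the memory traffic of one extra write-then-read pass between pipeline stages. The bytes-per-pixel value is an assumption for illustration (4:2:0 chroma with each 12-bit sample stored in 16 bits), not a figure from the slides.

```python
# Pixel rate of a UHD 60 fps stream, plus an ASSUMED memory cost per extra pass.
WIDTH, HEIGHT, FPS = 3840, 2160, 60
BYTES_PER_PIXEL = 3.0  # assumption: 1.5 samples per pixel (4:2:0) x 2 bytes per sample

def pixel_rate(width, height, fps):
    """Pixels that must be processed per second."""
    return width * height * fps

def pass_bandwidth_gb_s(width, height, fps, bytes_per_pixel):
    """GB/s needed to write every frame to memory and read it back once."""
    one_way = pixel_rate(width, height, fps) * bytes_per_pixel
    return 2 * one_way / 1e9

print(f"{pixel_rate(WIDTH, HEIGHT, FPS):,} pixels/s")
print(f"{pass_bandwidth_gb_s(WIDTH, HEIGHT, FPS, BYTES_PER_PIXEL):.2f} GB/s per extra memory pass")
```

Under these assumptions, each memory round trip that the inline SFC connection avoids is on the order of a few GB/s of traffic.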
![Page 17: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/17.jpg)
AVS (Advanced Video Scaler) in SFC
AVS is a Low-Power Fixed-Function Hardware block in SFC
• Real-time video scaling in a 12-bit-per-channel data path
• Consists of a pair of spatial filters, Sharp Filter and Smooth Filter
Adaptive Mode
• The results of the two filters are alpha-blended to generate the output pixel value
• The alpha blending factor, α, is computed for each pixel from neighboring pixels
[Figure: the Input Pixel feeds the Sharp Filter, the Smooth Filter, and the Blending Factor Computation; the two filter results are combined (+) using the Blending Factor to produce the Output Pixel]
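The adaptive mode can be illustrated with a 1D 2x upscaler. This is only a sketch: the slides do not give the AVS filter taps or the blending-factor computation, so the 4-tap cubic "sharp" filter, the 2-tap linear "smooth" filter, the overshoot-based `blend_factor` heuristic, and the convention that alpha = 1 selects the sharp result are all assumptions made here for illustration.

```python
def smooth_tap(p0, p1, p2, p3):
    """Smooth filter (assumed): linear interpolation between the two nearest pixels."""
    return 0.5 * (p1 + p2)

def sharp_tap(p0, p1, p2, p3):
    """Sharp filter (assumed): 4-tap cubic with negative lobes, which can overshoot."""
    return (-p0 + 9 * p1 + 9 * p2 - p3) / 16.0

def blend_factor(p0, p1, p2, p3):
    """Hypothetical alpha in [0, 1] computed from neighboring pixels.

    alpha = 1 keeps the sharp result; alpha drops to 0 when the sharp
    filter overshoots the range of the two nearest pixels (ringing).
    """
    lo, hi = min(p1, p2), max(p1, p2)
    sharp = sharp_tap(p0, p1, p2, p3)
    overshoot = max(lo - sharp, sharp - hi, 0.0)
    span = max(p0, p1, p2, p3) - min(p0, p1, p2, p3)
    if span == 0:
        return 1.0
    return max(0.0, 1.0 - 16.0 * overshoot / span)

def upscale_2x(row):
    """2x upscale: keep each input pixel and insert one blended midpoint after it."""
    n = len(row)
    out = []
    for i in range(n):
        p0 = row[max(i - 1, 0)]
        p1 = row[i]
        p2 = row[min(i + 1, n - 1)]
        p3 = row[min(i + 2, n - 1)]
        alpha = blend_factor(p0, p1, p2, p3)
        mid = alpha * sharp_tap(p0, p1, p2, p3) + (1 - alpha) * smooth_tap(p0, p1, p2, p3)
        out.extend([p1, mid])
    return out

step_edge = [0, 0, 0, 1, 1, 1]  # hard edge, as in computer graphics content
print(upscale_2x(step_edge))
```

On the hard step edge the sharp filter alone would dip below 0 and rise above 1 next to the edge; the per-pixel blend falls back to the smooth result there, so the output stays within the input range.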
![Page 18: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/18.jpg)
AVS Smooth Filter
[Figure: Reference Ground Truth (1440x960) vs. Smooth Filter (720x480 to 1440x960)]
** Blurrier than Reference Ground Truth **
![Page 19: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/19.jpg)
AVS Sharp Filter
[Figure: Reference Ground Truth (1440x960) vs. Sharp Filter (720x480 to 1440x960)]
** Similar to Reference Ground Truth **
![Page 20: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/20.jpg)
AVS Sharper Filter
[Figure: Reference Ground Truth (1440x960) vs. Sharper Filter (720x480 to 1440x960), with a visual artifact marked]
** Sharper than Reference Ground Truth **
![Page 21: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/21.jpg)
Sharp vs. Smooth Filter
Smooth Filter Sharper Filter
** Ringing Artifacts **
![Page 22: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/22.jpg)
Adaptive Mode in AVS
Sharp Filter
• Sharp and Crisp Output on Natural Scenes
• Ringing on Computer Graphics
Smooth Filter
• Blurrier Output on Natural Scenes
• Ringing-free Output on Computer Graphics
Adaptive Mode
• Best of Both Filters possible based on Per-Pixel Adjustment
• Sharp Output on Natural Scenes
• Ringing-free Output on Computer Graphics
![Page 23: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/23.jpg)
![Page 24: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/24.jpg)
Adaptive Mode 1
[Figure: Adaptive Mode On vs. Sharper Filter]
• Adaptive Mode On: ** Sharper than Smooth Filter without Ringing **
• Sharper Filter: ** Ringing Artifacts **
![Page 25: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/25.jpg)
Adaptive Mode 2
Adaptive Mode On Smooth Filter
** Sharper than Smooth Filter **
![Page 26: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/26.jpg)
![Page 27: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/27.jpg)
Media Scaler Interface
Interface: Video Scaler
• Intel® Media Server Studio SDK (https://software.intel.com/en-us/media-sdk)
o Microsoft Windows* DXVA: SFC AVS (default)
o LibVA (Android/Linux): SFC AVS (default)
• macOS*: SFC and AVS
• Application SW specifies input/output formats:
o conf.vpp.In.Width, Height, CropX, CropY, CropW, CropH
o conf.vpp.Out.Width, Height, CropX, CropY, CropW, CropH
• MSDK then configures the video processing pipeline accordingly
* Other names and brands may be claimed as the property of others
![Page 28: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/28.jpg)
neuron to convolutional neural networks for Super-resolution scaling
![Page 29: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/29.jpg)
![Page 30: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/30.jpg)
From Neuron to CNN
[Figure: roadmap diagram linking Neuron, CNN, Scaling, Super Resolution, Sparse Coding, Sparse Coding Super Resolution, Sparse Coding Deep Network, and CNN-based SR]
![Page 31: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/31.jpg)
neuron to convolutional neural networks
![Page 32: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/32.jpg)
Neuron
A neuron
• Is a nerve cell in brains, spinal cords, etc.
• Processes and transmits data through electrical/chemical signals
• Can give rise to multiple dendrites, but not more than one axon
• Sends signals from its axon to a dendrite of another neuron via a synapse (with many exceptions to these rules)
• Connects to other neurons to form neural networks
• A human brain contains about 100 billion neurons
• Each neuron has 5K to 100K synaptic connections to other neurons
[Figure: neuron anatomy with labels for input signals, dendrites, cell body, nucleus, axon, axon terminals, and output signal]
![Page 33: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/33.jpg)
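Artificial Neuron
• A Neuron has a single Axon and multiple Dendrites
o Dendrites receive incoming electrical signals
o Electrical signal is sent out from an Axon to Dendrites
$$f = b + \sum_{i=0}^{n} w_i x_i \quad \text{and} \quad out = \begin{cases} 0 & \text{if } f < 0 \\ 1 & \text{if } f \ge 0 \end{cases}$$
[Figure: artificial neuron with inputs x0 ... xn, weights w0 ... wn, bias b, a summation node, and output out, drawn next to the biological neuron (input signals, dendrites, cell body, nucleus, axon, axon terminals, output signal)]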
![Page 34: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/34.jpg)
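Artificial Neuron – what does it do?
NAND gate is universal for computation: any logic can be built up out of NAND gates

NAND Gate

| x0 | x1 | x0 AND x1 | x0 NAND x1 |
|----|----|-----------|------------|
| 0  | 0  | 0         | 1          |
| 0  | 1  | 0         | 1          |
| 1  | 0  | 0         | 1          |
| 1  | 1  | 1         | 0          |

An artificial neuron (perceptron with 2 inputs) can implement a NAND gate:
• input = (x0, x1)
• weights = (w0, w1) = (-2, -2)
• bias b = 3
• out = 0 if f < 0, 1 if f ≥ 0

Artificial Neuron

| x0 | x1 | f  | out |
|----|----|----|-----|
| 0  | 0  | 3  | 1   |
| 0  | 1  | 1  | 1   |
| 1  | 0  | 1  | 1   |
| 1  | 1  | -1 | 0   |

$$f = b + \sum_{i=0}^{n} w_i x_i \quad \text{and} \quad out = \begin{cases} 0 & \text{if } f < 0 \\ 1 & \text{if } f \ge 0 \end{cases}$$
[Figure: artificial neuron with inputs x0, x1, weights w0, w1, bias b, a summation node, and output out]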
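The NAND perceptron on this slide can be checked in a few lines. The weights (-2, -2) and bias 3 are the values given above; the function and constant names are illustrative.

```python
def neuron(inputs, weights, bias):
    """Artificial neuron: f = b + sum(w_i * x_i); out = 1 if f >= 0 else 0."""
    f = bias + sum(w * x for w, x in zip(weights, inputs))
    out = 1 if f >= 0 else 0
    return f, out

NAND_WEIGHTS = (-2, -2)
NAND_BIAS = 3

print("x0 x1  f out")
for x0 in (0, 1):
    for x1 in (0, 1):
        f, out = neuron((x0, x1), NAND_WEIGHTS, NAND_BIAS)
        print(f"{x0:2d} {x1:2d} {f:2d} {out:3d}")
```

The printed rows reproduce the f and out columns of the Artificial Neuron table, and the out column matches x0 NAND x1.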
![Page 35: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/35.jpg)
Neural Network
Connect multiple artificial neurons
• Simple compute devices become interconnected
• Connections between neurons determine the function of the overall network
• Massively parallel structure allows fast results with slow neurons
• Multi-layer networks are more powerful
[Figure: artificial neurons arranged in Layer 1 and Layer 2, with inputs in0, in1 and outputs out0, out1, out2]
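As a small illustration of how connections determine the function of the network, the sketch below wires four NAND neurons (weights -2, -2 and bias 3, as on the earlier slide) into an XOR function. This wiring is a textbook construction chosen for illustration; it is not the network drawn on the slide.

```python
def nand(x0, x1):
    """NAND implemented as a single artificial neuron (weights -2, -2, bias 3)."""
    f = 3 + (-2) * x0 + (-2) * x1
    return 1 if f >= 0 else 0

def xor_network(in0, in1):
    """XOR built only from NAND neurons; the wiring defines the function."""
    shared = nand(in0, in1)      # first stage
    left = nand(in0, shared)     # second stage
    right = nand(in1, shared)    # second stage
    return nand(left, right)     # output stage

for in0 in (0, 1):
    for in1 in (0, 1):
        print(in0, in1, xor_network(in0, in1))
```

A single neuron of this kind cannot compute XOR, so the example also shows why multi-layer networks are more powerful.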
![Page 36: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/36.jpg)
Convolutional Neural Networks (CNN)
What is it?
• Multiple layers of artificial neural networks
• Some layers performing Convolution Operations that extract features (e.g., edges) from input images
• 2D Convolution Operation is
$$f(x, y) = \sum_{i=-\infty}^{\infty} \sum_{j=-\infty}^{\infty} w(i, j)\, x(x - i, y - j)$$
Usages:
• Image Classification
• Object Detection
• Face Recognition
• Denoise
• Deblurring
• Super-Resolution Scaling
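The 2D convolution f(x, y) can be written out for a finite kernel. A minimal pure-Python sketch, assuming a square kernel of odd size whose indices i, j run from -r to r, and treating pixels outside the image as zero (both are choices made here, not stated on the slide):

```python
def conv2d(image, kernel):
    """f(x, y) = sum_i sum_j w(i, j) * image(x - i, y - j), same-size output."""
    height, width = len(image), len(image[0])
    r = len(kernel) // 2  # kernel offsets run over -r .. r
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for j in range(-r, r + 1):
                for i in range(-r, r + 1):
                    sx, sy = x - i, y - j
                    if 0 <= sx < width and 0 <= sy < height:
                        acc += kernel[j + r][i + r] * image[sy][sx]
            out[y][x] = acc
    return out

# Image with a vertical edge and a kernel that responds to horizontal change.
image = [[0, 0, 1, 1] for _ in range(3)]
edge_kernel = [[0, 0, 0],
               [1, 0, -1],
               [0, 0, 0]]
for row in conv2d(image, edge_kernel):
    print(row)
```

The response is non-zero only around the edge (and at the right border, where the zero padding creates an artificial edge), which is the kind of feature extraction the slide refers to.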
![Page 37: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/37.jpg)
Convolution using a Neuron
• Each neuron processes a small part (receptive field) of input image using shared weights in convolutional layers
What’s it good for? Why use it?
• Instead of designing and optimizing each convolution kernel manually, train the network to solve difficult problems simply by feeding input and output pairs (i.e., feature extraction process is learned by the network)
[Figure: a 3x3 Convolution Kernel (w0 ... w8) applied to 3x3 Image Patches (x0 ... x8) at three positions of the Input Image, shown alongside the artificial neuron and 2D convolution formulas]
![Page 38: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/38.jpg)
CNN-based Super-Resolution
![Page 39: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/39.jpg)
Super-Resolution
• The term has been used by many to mean many different things over the years
• We will define what we mean by it in this talk, and then move on
Super-Resolution as Upscaling
• Input = Low-resolution Image (e.g., 1920x1080 RGB picture)
• Output = High-resolution Image (e.g., 3840x2160 RGB picture)
• Super-Resolution Requirements:
o Use a single input image to generate a single output image, i.e., Single-frame (Spatial) SR
o Output image quality is better than traditional scalers based on interpolation (bilinear, bicubic, etc.)
o No visual artifacts are introduced by SR upscaling
![Page 40: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/40.jpg)
Publications on CNN-based SR
SCN from University of Illinois at Urbana-Champaign
1. Image Super-Resolution via Sparse Representation, Huang et al., TIP 2010
2. Coupled Dictionary Training for Image Super-Resolution, Huang et al., TIP 2012
3. Deep Networks for Image Super-Resolution with Sparse Prior, Huang et al., ICCV 2015
4. Self-Tuned Deep Super Resolution, Huang et al., CVPR 2015
5. Robust Single Image Super-Resolution via Deep Networks with Sparse Prior, Huang et al., TIP 2016
SRCNN from The Chinese University of Hong Kong
1. Learning a Deep Convolutional Network for Image Super-Resolution, Tang et al., ECCV 2014
2. Image Super-Resolution using Deep Convolutional Networks, Tang et al., TPAMI 2016
DRCN from Seoul National University
1. Deeply-Recursive Convolutional Network for Image Super-Resolution, Kim et al., CVPR 2016
2. Accurate Image Super-Resolution using Very Deep Convolutional Networks, Kim et al., CVPR 2016
Other
• Technische Universität München: Image Super-Resolution with Fast Approximate Convolutional Sparse Coding, Smagt et al., ICONIP 2014
• Huaqiao University: Deep Network Cascade for Image Super-Resolution, Chen et al., ECCV 2014
![Page 41: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/41.jpg)
Publications on CNN-based SR
compared to all SFSR (CNN-based or not) solutions
![Page 42: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/42.jpg)
From Sparse Coding to CNN-based SR
![Page 43: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/43.jpg)
Sparse Coding
• Reconstruct input signal x using a linear combination of basis vectors of a Dictionary D with sparse coefficients α
o x = D ⋅ α
• where x is an n x 1 input vector,
D is an n x m matrix, an overcomplete (m > n) Dictionary with m basis vectors, and
α is an m x 1 sparse code vector
• Sparse = Most of the sparse code coefficients in α are zero, i.e., α is a sparse representation of x
• Optimal sparse code is obtained as
$$\alpha = \arg\min_{z} E(x, z), \qquad E(x, z) = \frac{1}{2} \lVert x - Dz \rVert_2^2 + \lambda \lVert z \rVert_1$$
[Figure: Input Vector x → Encoder (Dictionary D; ISTA/CoD iterative or LISTA/LCoD approximate) → Sparse Code α]
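The iterative encoder named on this slide, ISTA, can be sketched in pure Python. The sketch minimizes E(x, z) with an L1 weight `lam` (the weight symbol appears to be lost from the slide text, so its presence and value are assumptions), uses the squared Frobenius norm of D as a safe step-size bound, and runs on a tiny made-up dictionary rather than anything from the talk.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def energy(x, D, z, lam):
    """E(x, z) = 0.5 * ||x - D z||_2^2 + lam * ||z||_1."""
    residual = [x[r] - sum(D[r][c] * z[c] for c in range(len(z))) for r in range(len(x))]
    return 0.5 * sum(e * e for e in residual) + lam * sum(abs(v) for v in z)

def ista(x, D, lam, iterations=2000):
    """Iterative Shrinkage-Thresholding: gradient step on the data term, then shrink."""
    n, m = len(D), len(D[0])
    # Squared Frobenius norm upper-bounds the Lipschitz constant of the gradient.
    L = sum(D[r][c] ** 2 for r in range(n) for c in range(m))
    z = [0.0] * m
    for _ in range(iterations):
        residual = [x[r] - sum(D[r][c] * z[c] for c in range(m)) for r in range(n)]
        grad = [-sum(D[r][c] * residual[r] for r in range(n)) for c in range(m)]
        z = [soft_threshold(z[c] - grad[c] / L, lam / L) for c in range(m)]
    return z

# Toy overcomplete dictionary (n = 2, m = 3); x is twice the third basis vector.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
x = [1.2, 1.6]
z = ista(x, D, lam=0.1)
print([round(v, 4) for v in z], round(energy(x, D, z, 0.1), 4))
```

Although x could also be written with the first two basis vectors, the L1 term steers the solution to a code that uses only the third one. The slow iterative convergence seen here is what the approximate encoders (LISTA/LCoD) are meant to avoid.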
![Page 44: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/44.jpg)
Sparse Coding Super-Resolution
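Super-Resolution Reconstruction
• $y = D_y \cdot \alpha_y$, $\alpha_y = \alpha_x$, $D_x \cdot \alpha_x = x$
• y: 3x3 LR Image Patch, with LR Sparse Representation $\alpha_y$
• x: 9x9 HR Image Patch, with HR Sparse Representation $\alpha_x$
• $D_y$: Overcomplete LR Dictionary (m = 1024); $D_x$: Overcomplete HR Dictionary (m = 1024)
• Each patch is a Linear Combination of Dictionary Elements
• Joint Dictionary Training: Iterative Optimization using 100,000 random image patch pairs
[Figure: the LR Image Patch y passes through the Sparse Code Encoder; the shared sparse representation is combined with the HR Dictionary to reconstruct the HR Image Patch x]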
![Page 45: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/45.jpg)
SCN (Sparse Coding based Network)
Sparse Coding Super-Resolution → Deep Network Super-Resolution
1. Layer #1 (Convolutional Layer H): image patch/feature y is extracted from the LR image $I_y$ with $m_y$ filters
2. Layer #2 and #3 (Sparse Code Encoder as k-iterations of LISTA network): Sparse code α is computed from y
3. Layer #4 (Reconstruction): Sparse code α is multiplied with HR Dictionary $D_x$ to reconstruct HR image patch x
4. Layer #5 (Convolutional Layer G): All HR patches x are combined to HR Image $I_x$
Legend: $I_y$ = LR Image, y = LR Image Patch, α = Sparse Code, x = HR Image Patch, $I_x$ = HR Image
[Figure: SCN network diagram with the Sparse Code Encoder marked] Fig. 2 from “Robust Single Image Super-Resolution via Deep Networks with Sparse Prior”, IEEE Transactions on Image Processing, Vol. 25, Issue 7, pp. 3194-3207, 2016
![Page 46: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/46.jpg)
SCN: 5-Layer Deep Network for Super-Resolution
Deep Network Architecture
• 2 Convolutional Layers (H and G) and 3 Layers for Sparse Coding Encoder
• All parameters trained via back-propagation using MSE cost function
• Network learns more complex function beyond the sparse coding model
• Performs better than sparse coding results even with dictionary size reduced from 1024 to 128
Advantages of SCN
• LISTA sub-network to enforce sparse representation, i.e., better interpretation of filter responses and parameter initialization based on domain knowledge in sparse coding
• Better SR results, faster training speed and smaller model size
Subjective Quality Assessment
• Best Visual Quality against other SFSR solutions (sharper boundaries, richer textures, no ringing)
• Scale ratio is fixed for the network → Use a cascade of multiple SCNs + bicubic downscaler
• Cascade of multiple networks is better than a single network trained with a large scale factor
![Page 47: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/47.jpg)
Quality Study via Simulation
![Page 48: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/48.jpg)
Table of Contents
1. Gen9 Intel® Processor Graphics: SFC Media HW FF, Advanced Video Scaler in SFC
2. Super-Resolution Scaling: Convolutional Neural Network, Super-Resolution Scaling using CNN
3. Compare: PSNR, MSE, Visual Inspection
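The comparison step lists MSE and PSNR as objective metrics. A minimal sketch of their standard definitions, assuming single-channel images stored as nested lists and a peak value of 255 for 8-bit data (the slides do not state the bit depth or color handling used in the study):

```python
import math

def mse(reference, test):
    """Mean squared error between two equally sized single-channel images."""
    total, count = 0.0, 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(reference, test)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / error)

ground_truth = [[0, 255], [255, 0]]
upscaled = [[0, 255], [255, 10]]
print(mse(ground_truth, upscaled), round(psnr(ground_truth, upscaled), 2))
```

Higher PSNR means the upscaled output is numerically closer to the reference ground truth, which is why the study pairs it with visual inspection: a sharper image with ringing can score differently than it looks.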
![Page 49: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/49.jpg)
Capturing LR and HR Test Images
1. Camera Capture
• LR: Camera Capture in FHD Mode at 1936x1288, then cropped to 720x480
• HR: Camera Capture in UHD Mode at 3888x2592, then cropped to 1440x960
2. Optical Scanner
• LR: Scan a letter-size printed document in 300dpi Mode at 2478x3228, then cropped to 720x480
• HR: Scan the same printed document in 600dpi Mode at 4956x6456, then cropped to 1440x960
3. Screen Capture (www.intel.com)
• LR: Screen Capture of Intel Website at 100% Zoom, then cropped to 720x480
• HR: Screen Capture of the same Intel Website at 200% Zoom, then cropped to 1440x960
[Figure: Test Image #1, Test Image #2, Test Image #3]
![Page 50: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/50.jpg)
SR Test Scenarios
Scaling Solutions
• SFC AVS: Gen9 Intel® Processor Graphics Media HW FF SFC AVS in SW Simulation
• SCN: Sparse-Coding Network (SCN) is CNN-based SR from Huang et al.; MATLAB codes and network parameters are available at http://www.ifp.illinois.edu/~dingliu2/iccv15/
2x Upscaling for 1920x1080 to 3840x2160
• SFC AVS: 2x
• SCN: 2x
4x Upscaling for 1920x1080 to 7680x4320
• SFC AVS: 4x
• SCN: 2x (SCN), then 2x (SCN)
1.3x Upscaling for 1920x1080 to 2560x1440
• SFC AVS: 1.3x
• SCN: 2x (SCN), then 0.65x (MATLAB Bicubic)
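Because the SCN scale ratio is fixed at 2x, each scenario above decomposes into a number of 2x passes followed by a residual bicubic resize. A minimal sketch of that bookkeeping (the function name `cascade_plan` is illustrative):

```python
def cascade_plan(src_width, dst_width, net_factor=2):
    """Return (number of fixed-ratio network passes, residual resize factor)."""
    target = dst_width / src_width
    passes = 0
    reached = 1
    while reached < target:
        reached *= net_factor
        passes += 1
    return passes, target / reached

for dst in (3840, 7680, 2560):
    passes, residual = cascade_plan(1920, dst)
    print(f"1920 -> {dst}: {passes} x 2x (SCN), then {residual:.3f}x resize")
```

For 1920 to 2560 the exact factors are about 1.333x overall and 0.667x for the downscale; the 1.3x and 0.65x on the slide are presumably rounded figures.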
![Page 51: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/51.jpg)
![Page 52: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/52.jpg)
[Figure: SFC AVS output vs. SCN output, with a visual artifact marked]
• SCN result is sharper than AVS
• SCN adds some visual artifacts
• +1 to AVS or on Par
![Page 53: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/53.jpg)
[Figure: SCN output vs. SFC AVS output, with the added halo marked]
• SCN has the halo problem that is more pronounced in 4x upscaling
• +1 to AVS
![Page 54: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/54.jpg)
[Figure: SCN output vs. SFC AVS output, with ringing and severe color bleeding marked]
• SCN result is sharper, but with more visible ringing and color bleeding artifacts
• +1 to AVS
![Page 55: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/55.jpg)
SR Test Results

| Upscaling Ratio | Test 1        | Test 2  | Test 3  |
|-----------------|---------------|---------|---------|
| 1.3x            | SFC AVS       | SFC AVS | SCN     |
| 2x              | SFC AVS       | SFC AVS | SCN     |
| 4x              | SFC AVS / SCN | SFC AVS | SFC AVS |

Overall
• SFC AVS and SCN performed well against the ground truth and quite close to each other in the 3 test examples
• SFC AVS seems to have a slight advantage over SCN on these 3 test examples
But, Why...?
• SCN has not been trained on a wide range of non-natural scenes / computer graphics contents
• Test input images are high-quality LR images, but SCN is trained on very blurry LR input images (Gaussian Blurring + Downsample + Bicubic Upsample)
• Better understanding of CNN architecture, training database, and training strategies is required
![Page 56: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/56.jpg)
![Page 57: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/57.jpg)
![Page 58: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/58.jpg)
![Page 59: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/59.jpg)
Summary
• Super-Resolution scaling solutions have been developed using the CNN framework and show great potential for high-quality Super-Resolution scaling
• SFC AVS produces very high-quality output that is comparable to current state-of-the-art CNN-based SR solutions
• CNN-based SR scaling can be further improved with more intelligent training and architecture in the future
• Gen9 Intel® Processor Graphics adds a new HW FF called SFC
• SFC AVS provides a high-quality video scaling solution at low power
• Adaptive mode in AVS combines the benefits of smooth and sharp filters on a per-pixel basis for superior output quality
• Use the Gen9 Intel HW FF Scaler for low-power, high-performance, high-quality UHD 4K60 scaling
• Use Gen9 Intel® Processor Graphics for CNN-based SR running on OpenCL for enhanced UHD picture quality
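The idea of combining smooth and sharp filters on a per-pixel basis can be illustrated with a small sketch. This is not the SFC AVS algorithm, whose coefficients and blending rule are not given in these slides. The kernels, the contrast measure, the `threshold` parameter, and all function names below are assumptions chosen for illustration, and the example filters a single row at its original resolution rather than resampling it.

```python
def smooth_filter(row, i):
    """Hypothetical 3-tap low-pass kernel (1, 2, 1)/4 with edge clamping."""
    left = row[max(i - 1, 0)]
    right = row[min(i + 1, len(row) - 1)]
    return (left + 2 * row[i] + right) / 4.0


def sharp_filter(row, i):
    """Hypothetical 3-tap sharpening kernel (-1, 6, -1)/4 with edge clamping."""
    left = row[max(i - 1, 0)]
    right = row[min(i + 1, len(row) - 1)]
    return (-left + 6 * row[i] - right) / 4.0


def adaptive_blend(row, threshold=32.0):
    """Blend the two filter outputs per pixel.

    The weight moves toward the sharp filter where the local contrast
    (difference between the two neighbours) is high, and toward the
    smooth filter in flat regions. The result is clamped to 0..255.
    """
    out = []
    for i in range(len(row)):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, len(row) - 1)]
        contrast = abs(right - left)
        w = min(contrast / threshold, 1.0)  # 0 = all smooth, 1 = all sharp
        value = (1.0 - w) * smooth_filter(row, i) + w * sharp_filter(row, i)
        out.append(min(max(value, 0.0), 255.0))
    return out


flat = adaptive_blend([100, 100, 100, 100, 100])
edge = adaptive_blend([0, 0, 255, 255])
```

In this sketch a flat row passes through unchanged, while a hard step edge keeps its full contrast instead of being softened as it would be by the smooth kernel alone.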
![Page 60: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/60.jpg)
Q & A
![Page 61: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/61.jpg)
Acknowledgement
Many thanks go to the following individuals from Intel:
• Yi-jen Chiu
• Keith Rowe
• Niranjan S Mulay
• Ping Liu
• Furong Zhang
• Wen-fu Kao
• Vidhya Krishnan
• Sungye Kim
• Charles Lingle, Jon Kennedy and other tech reviewers
• Michaelle Gonzalez, Naomi Pitfield, and the SIGGRAPH Team
![Page 62: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/62.jpg)
Legal Notices and Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/performance.
Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.
All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. © 2016 Intel Corporation. Intel, the Intel logo, OpenCL and others are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
![Page 63: Ultra HD Video Scaling: Low-Power HW FF vs. CNN-based Super-Resolution](https://reader030.fdocuments.us/reader030/viewer/2022020113/588627861a28ab8f2c8b6365/html5/thumbnails/63.jpg)