Transcoding of MPEG Compressed Bitstreams: Techniques and ...
1
EE665000 Video Processing
Transcoding of MPEG Compressed Bitstreams: Techniques and Application
2
Outline
Introduction
- Purpose of Transcoder
-Transcoding Application
Overview of Transcoding Techniques
-Bit-rate Reduction
-Temporal Resolution Reduction
-Spatial Resolution Reduction
3
Outline
An example : MPEG-2 to MPEG-4
-Drift error analysis for spatial resolution reduction
-Novel drift compensation architectures and techniques
-Comparisons of complexity and quality
Conclusion / Future Work
4
Purpose of Transcoder
Bitstream -> Bitstream
-Bit rate reduction
SDTV: 6Mbps -> 3Mbps, HDTV: 19.2Mbps -> 11Mbps
-Frame rate reduction
30 frame/s -> 10 frame/s: surveillance application
-Resolution reduction
HDTV -> SDTV
720*480i, 30Hz -> 352*240p, 10Hz
5
Purpose of Transcoder
Syntax conversion
MPEG-2 -> MPEG-4 to support mobile devices
MPEG-2 Transport Stream -> MPEG-2 Program Stream to support DVD
Other Conversions
Video summarization for a compact representation of content; satisfy time constraints
Color depth reduction, e.g., support for 4-bit PDA display
Text summarization, e.g., compact viewing, including in HTML-to-WML
Multi-modal, e.g., text-to-speech, audio-driven animation model
6
Transcoding Example
Transcoding research focuses on efficient techniques to perform such conversions
7
Concept: Universal Multimedia Access (UMA)
8
Use Case: Video Server
Deliver Video From Server to Mobile Device
-For broadcast or surveillance content
9
Use Case: Surveillance
10
Use Case: Set-Top Box
11
Use Case: DVD Recorder System
12
Use Case: DTV Distribution to Remote Devices
13
Use Case: Enhanced Server Operation
14
Application Environments
Transcoding is needed to fill the gaps between content, network, terminal and user
15
Overview of Video Transcoding Techniques
Conventional Approaches
Full decoding, post-processing, full re-encoding
Highest quality, but an expensive solution
In most cases, requires a hardware-based solution
Low-Cost Approaches
Target similar quality as conventional approach, but with much lower complexity
Architectures that utilize compressed-domain processing can provide savings
Low-cost solutions may be more flexible as they also enable software solutions
16
Bit Rate Reduction
Purpose
Bandwidth savings for efficient transmission
Compatibility with certain profile/level, e.g., MPEG-4 Simple Profile @ Level 0
Main Issues
Drift compensation architecture
Rate control algorithm
Trade-off between quality and complexity
17
Bit Rate Reduction
Technical challenges
Picture quality degradation: re-quantization error, drift
Complexity reduction with partial decoding
Approaches
Cutting high frequencies, Requantization
Open-Loop and Closed-loop architectures
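To illustrate why the two architectures behave differently, here is a minimal sketch of requantization drift. It is a toy model, not the transcoder described in these slides: each frame is a single number, motion compensation is the identity, and every frame is a P-frame. The names `requantize` and `drift_per_frame` and the step size are assumptions chosen for illustration.

```python
def requantize(value, step):
    """Uniform requantization to a coarser step size."""
    return round(value / step) * step


def drift_per_frame(residuals, step, closed_loop):
    """Return the decoder-side drift after each P-frame.

    Toy model (assumption): x_n = x_{n-1} + e_n at the original decoder and
    y_n = y_{n-1} + q_n at the decoder of the transcoded stream, so the
    drift d_n = x_n - y_n changes by the requantization error of each frame.
    """
    drift = 0.0
    history = []
    for e in residuals:
        # Closed loop: add the accumulated error back before requantizing.
        target = e + drift if closed_loop else e
        q = requantize(target, step)
        drift = drift + e - q
        history.append(drift)
    return history


residuals = [1.0] * 10
open_loop = drift_per_frame(residuals, step=3.0, closed_loop=False)
closed = drift_per_frame(residuals, step=3.0, closed_loop=True)
print(open_loop[-1])                  # error keeps accumulating
print(max(abs(d) for d in closed))    # stays within half a step
```

In this toy model the closed-loop error never exceeds half a quantizer step, while the open-loop error grows with every predicted frame until the next intra frame resets it.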
18
Bit-Rate Reduction Architectures
19
Experimental Results
Comparison of Open-Loop and Closed Loop architectures
Original sequence encoded at 2Mbps, N=30, M=3
Transcoded to fixed QP=15 with both architectures; plot shows I/P frames only
Severe drift with open-loop
20
Joint Transcoding
In many communication systems, it is desirable to distribute an aggregate rate over multiple programs.
In spatial domain, this is known as statistical multiplexing (StatMux)
Encode pictures proportional to encoding complexity
Complexity is determined from pixel domain
Distribute bits to achieve min distortion across all programs
21
Joint Transcoding
If the programs are already encoded, joint transcoding techniques will minimize the distortion
Extract normalized activity measures from original quantizer scales
Reassign target distributions
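As a rough sketch of the reassignment step, the aggregate rate can be split in proportion to a per-program complexity estimate. The estimate used here, coded bits times average quantizer scale, follows the global complexity measure of MPEG-2 Test Model 5 style rate control; it is an assumption and not necessarily the normalized activity measure used in these slides. `allocate_rates` and the program names are hypothetical.

```python
def allocate_rates(programs, total_rate):
    """Split total_rate across programs in proportion to complexity.

    programs maps a name to (coded_bits, average_quantizer_scale), both of
    which can be read from the incoming bitstream without full decoding.
    """
    complexity = {name: bits * qscale
                  for name, (bits, qscale) in programs.items()}
    total = sum(complexity.values())
    return {name: total_rate * c / total for name, c in complexity.items()}


programs = {"news": (100, 4), "sports": (300, 12)}
targets = allocate_rates(programs, total_rate=10.0)
print(targets)
```

The program that needed more bits at a coarser quantizer receives the larger share of the aggregate rate.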
22
Block Diagram of Joint Transcoder
23
Temporal Resolution Reduction
Purpose
Bandwidth savings for efficient transmission
Reduce number of frames/sec to meet processing requirements at terminal
Main Issues
Estimate new motion vector based on incoming motion vectors
Estimate new residual based on incoming residual values
24
Temporal Resolution Reduction
Technical Challenges
Picture quality degradation
Avoid MV re-estimation
Minimize mismatch between predictive and residual components
Approaches
MV interpolation via bilinear interpolation
MV interpolation via majority voting
25
MV Interpolation
Problem: estimate the MV between the current frame and its new reference frame
MV_skip = mv + mv_int
Solutions (how to determine mv_int)
-Bilinear interpolation:
-Majority voting:
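The formulas for the two options did not survive in this transcript, so the following is only a sketch under stated assumptions: the block displaced by mv overlaps up to four MBs in the dropped frame, the weights w are the overlap areas, bilinear interpolation is the weighted average of those MBs' MVs, and majority voting is read as taking the MV of the MB with the largest weight (in line with the condition w_i ≥ w_j on the next slide). Integer-pel MVs are assumed, and `overlap_weights` and `compose_mv` are hypothetical names.

```python
def overlap_weights(mv, mb=16):
    """Fraction of the displaced block covering each of the four MBs.

    Order: top-left, top-right, bottom-left, bottom-right.
    """
    dx, dy = mv[0] % mb, mv[1] % mb   # offset inside the MB grid
    wx = (mb - dx, dx)
    wy = (mb - dy, dy)
    return [wx[i] * wy[j] / (mb * mb) for j in (0, 1) for i in (0, 1)]


def compose_mv(mv, neighbor_mvs, method="bilinear", mb=16):
    """Return MV_skip = mv + mv_int for a block whose reference was dropped."""
    w = overlap_weights(mv, mb)
    if method == "bilinear":
        mv_int = (sum(wi * m[0] for wi, m in zip(w, neighbor_mvs)),
                  sum(wi * m[1] for wi, m in zip(w, neighbor_mvs)))
    else:
        # Majority voting: take the MV of the MB with the largest overlap.
        best = max(range(4), key=lambda k: w[k])
        mv_int = neighbor_mvs[best]
    return (mv[0] + mv_int[0], mv[1] + mv_int[1])


neighbors = [(8, 0), (0, 0), (8, 0), (0, 0)]
print(compose_mv((4, 0), neighbors, "bilinear"))
print(compose_mv((4, 0), neighbors, "majority"))
```

Both variants reuse the incoming MVs only, which is what lets the transcoder avoid MV re-estimation.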
26
Estimating New Residue
Residue Compensation
-Need to minimize mismatch between new MV and residue
-New residue corresponding to MV interpolation by majority voting:
residue_skip = residue_i + residue
where w_i ≥ w_j
27
Spatial Resolution Reduction
Purpose
Bitstream that can be decoded and displayed on a low resolution screen
Bandwidth savings for efficient transmission
Compatibility with certain profile/level, e.g., MPEG-4 Simple Profile
Main Issues
Motion vectors corresponding to reduced resolution reference frame
Obtaining texture information for lower resolution MBs
Drift compensation architecture
28
Spatial Resolution Reduction
Technical challenges
Picture quality degradation
Down-conversion filtering
Motion vector mapping
Approaches
Cascaded approach: full decoding, spatial down sampling, and full re-encoding
Low-cost approaches that avoid spatial down-sampling and full re-encoding
29
Case Study: MPEG-2 to MPEG-4
Motivation
MPEG-2 in the DTV/DVD market has created a large amount of digital infrastructure and broadcast-quality content
MPEG-4 adopted for mobile multimedia communications
Error-resilient transmission to low resolution displays on mobile devices
There will be a large demand for this specific transcoding technology
30
Case Study: MPEG-2 to MPEG-4
Topics to be Covered
Syntax Conversion: at higher and lower layers
MB-level conversions, e.g., MV mapping, texture down-sampling
Analysis of drift errors when transcoding to a lower spatial resolution
Presentation of various architectures to overcome sources of drift
Rate control and bit allocation issues
Evaluation of complexity and quality
31
Macroblock Conversions
Spatial resolution reduced by half [4MB to 1MB]
-Motion vector mapping
-Texture down-sampling
-Mixed block processing
32
Motion Vector Mapping
Frame-Based
4:1 mapping vs. 1:1 mapping
-use adaptive mapping based on variance of 4 motion vectors
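A minimal sketch of adaptive mapping, assuming the resolution is halved in each dimension (so MVs are halved) and that the output format allows one MV per 8*8 block when the four incoming MVs disagree. The variance threshold and the name `map_motion_vectors` are hypothetical; the slides do not give the actual decision rule.

```python
def map_motion_vectors(mvs, var_threshold=4.0):
    """Map the MVs of four input MBs onto one down-sampled MB.

    Returns the mapping type and the list of output MVs.
    """
    n = len(mvs)
    mean_x = sum(x for x, _ in mvs) / n
    mean_y = sum(y for _, y in mvs) / n
    variance = sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in mvs) / n
    if variance <= var_threshold:
        # Coherent motion: one averaged MV for the whole output MB.
        return "4:1", [(mean_x / 2, mean_y / 2)]
    # Divergent motion: keep one scaled MV per 8*8 block.
    return "1:1", [(x / 2, y / 2) for x, y in mvs]


print(map_motion_vectors([(4, 2)] * 4))
print(map_motion_vectors([(0, 0), (16, 0), (0, 16), (16, 16)]))
```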
33
Motion Vector Mapping
Field-Based
May have up to eight 16*8 MVs (2 per MB) for mapping
Use top-field MV as default
If motion_vertical_field_select[0][0] == 1, i.e., the bottom field is used to predict the top field, then the top-field and bottom-field motion vectors are averaged
34
Texture Down-Sampling
Actual implementation
-use separable 1D filters to compute down-converted blocks
-mathematically equivalent filters can be derived in spatial domain
-filtering can be adapted to work on a field basis
-corresponding up-conversion filters are also available
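The separable structure can be sketched in the pixel domain. This example assumes a simple 2-tap averaging filter applied first along rows and then along columns; the filters actually used in the slides are not given in the transcript, and the function names are hypothetical.

```python
def downsample_1d(samples):
    """Halve the length of a sequence by averaging adjacent pairs."""
    return [(samples[i] + samples[i + 1]) / 2
            for i in range(0, len(samples) - 1, 2)]


def downsample_2d(block):
    """Apply the 1D filter to every row, then to every column."""
    rows = [downsample_1d(r) for r in block]
    cols = [downsample_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


block = [[1, 3, 5, 7],
         [1, 3, 5, 7],
         [9, 11, 13, 15],
         [9, 11, 13, 15]]
small = downsample_2d(block)
print(small)
```

Because the filter is separable, the 2D operation costs two passes of a 1D filter rather than one full 2D convolution.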
35
Mixed Block Processor
Purpose
-Pre-process selected MBs to ensure no mixed modes within one MB
-Mixed coding modes within MB not supported by coding standards
Processing
-Map MB modes so that all sub-blocks have same mode, either all intra or inter
-Modify MV’s and DCT coefficients to correspond with MB modes
Example of Mixed Block
[Figure: four input MBs, MB(0), MB(1), MB(k), MB(k+1), of which three are Inter (Inter DCT, Inter MV) and one is Intra (Intra DCT, zero MV), become sub-blocks b(0)-b(3) of a single MB(x) after down-conversion; all sub-blocks must have the same mode]
36
Mixed Block Processor (Cont’d)
Three possible methods
-ZeroOut
Convert mixed-block MB modes to Inter
MV’s and DCT coefficients set to Zero
37
Mixed Block Processor (Cont’d)
-IntraInter
Convert mixed-block MB modes to Inter
MVs for Intra blocks are predicted from neighbors
Corresponding Inter DCT coefficients are computed
-InterIntra
Convert mixed-block MB modes to Intra
MVs for mixed blocks set to zero
Corresponding Intra DCT coefficients are computed
Decoding loop is needed for these options
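Of the three methods, ZeroOut is the only one that needs no decoding loop, so it is the easiest to sketch. The example below assumes that "set to zero" applies to the intra sub-blocks being converted, while inter sub-blocks keep their data; the dict layout and the name `zero_out` are hypothetical.

```python
def zero_out(sub_blocks):
    """Apply ZeroOut to the four sub-blocks of one down-converted MB.

    Each sub-block is a dict with keys "mode", "mv", and "dct".
    """
    modes = {b["mode"] for b in sub_blocks}
    if len(modes) == 1:
        return sub_blocks              # not a mixed block, nothing to do
    result = []
    for b in sub_blocks:
        if b["mode"] == "intra":
            # Convert to inter with zero MV and zero residual.
            b = {"mode": "inter", "mv": (0, 0), "dct": [0] * len(b["dct"])}
        result.append(b)
    return result


mixed = [
    {"mode": "inter", "mv": (2, 1), "dct": [5, 1, 0, 0]},
    {"mode": "inter", "mv": (2, 0), "dct": [3, 0, 0, 0]},
    {"mode": "intra", "mv": (0, 0), "dct": [90, 4, 2, 1]},
    {"mode": "inter", "mv": (1, 1), "dct": [4, 2, 0, 0]},
]
fixed = zero_out(mixed)
print([b["mode"] for b in fixed])
```

A zero MV with a zero residual simply copies the co-located region of the reference frame, which is cheap but discards the intra texture.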
38
Reference Architecture
39
Open-Loop Architecture
Open-Loop analysis
40
Drift Error Analysis
Approach
Compare closed-loop reference with simple open-loop architecture
We discuss P frames only, since B frames do not introduce drift error propagation
Rationale
Expose all the possible sources of drift errors
41
Reference Analysis
P-frame analysis
$g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(y_{n-1}^2)$
42
Drift Error Analysis
Drift error terms: error due to quantization (dq) and error due to down-sampling (dr)
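One way to see where the two error terms come from is to add and subtract $M_r(y_{n-1}^1)$ in the reference residual. This is a sketch that assumes $y_{n-1}^1 = D(x_{n-1}^1)$ is the down-sampled original reference and that $M_r$ is linear. The labels $d_n^q$ and $d_n^r$ are assumed to correspond to the dq and dr named on the architecture slides.

```latex
\begin{aligned}
g_n^2 &= D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(y_{n-1}^2) \\
      &= D(e_n^1)
         + \underbrace{M_r(y_{n-1}^1 - y_{n-1}^2)}_{d_n^q}
         + \underbrace{D(M_f(x_{n-1}^1)) - M_r(D(x_{n-1}^1))}_{d_n^r}
\end{aligned}
```

Under this reading, $d_n^q$ vanishes when the transcoded reference equals the down-sampled original, and $d_n^r$ vanishes when down-sampling and motion compensation commute. An open-loop transcoder that forwards only $D(e_n^1)$ would leave both terms uncorrected.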
43
Drift Compensation Architectures
“Drift Low”
-Drift compensation in reduced resolution
“Drift Full”
-Drift compensation in original resolution
“MC Low”
-Drift compensation by partial re-encoding
“Intra Refresh”
-Drift compensation by intra block refresh
44
Drift Low Architecture
45
Drift Low Architecture
Reduced resolution residual is approximated as
$g_n^2 = D(e_n^1) + M_r(y_{n-1}^1 - y_{n-1}^2)$
Assumes the following approximation
$D(M_f(x_{n-1}^1)) = M_r(D(x_{n-1}^1)) = M_r(y_{n-1}^1)$
Architecture attempts to eliminate dq
46
Drift Full Architecture
47
Drift Full Architecture
Reduced resolution residual is approximated as
$g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1 - x_{n-1}^2))$
Assumes the following approximation
$M_r(y_{n-1}^2) = D(M_f(U(y_{n-1}^2))) = D(M_f(x_{n-1}^2))$
Architecture attempts to eliminate dq and dr
48
MC Low Architecture
49
MC Low Architecture
Reduced resolution residual is approximated as
$g_n^2 = D(e_n^1) + D(M_f(x_{n-1}^1)) - M_r(D(x_{n-1}^1))$
Assumes the following approximation
$y_{n-1}^2 = y_{n-1}^1 = D(x_{n-1}^1)$
Architecture attempts to eliminate dr
50
Intra Refresh Architecture
51
Intra Refresh Architecture
Inter-Intra used to convert inter-coded blocks to intra
Intra-coded blocks not subject to drift, therefore aim to stop drift propagation for both dq and dr
Flexible and capable of correcting error caused by MV mapping as well
Two steps involved:
-Estimate amount of drift
-Translate drift estimate into an intra-refresh rate
Intra refresh must work jointly with rate control
52
Profile Definitions of Version 1
Simple Profile
─ Basic tools: I/P-VOP, AC/DC prediction, 4MV, and unrestricted MV
─ Short header and Error Resilience tools
Core Profile
─ Simple + Binary Shape, Quantization Method 1/2 and B-VOP
Main Profile
─ Core + Grey Shape, Interlace and Sprite
Simple Scalable Profile
─ Simple + Spatial and temporal scalability and B-VOP
53
Profile Definitions of Version 1
N-Bit Profile
─ Core + N-Bit
Animated 2D Mesh
─ Core + Scalable Still Texture, 2D Dynamic Mesh
Basic Animated Texture
─ Binary Shape, Scalable Still Texture and 2D Dynamic Mesh
Still Scalable Texture
─ Scalable Still Texture
Simple Face
─ Face Animation Parameters
54
Profile Definitions of Version 2
Advanced Real Time Simple Profile
─ Simple +
─ Advanced error resilience with channel,
─ Improved temporal scalability with low buffering delay
Core Scalable Profile
─ Simple scalable +
─ Core +
─ SNR, Spatial/Temporal Scalability for Region or Object of Interest
55
Profile Definitions of Version 2
Advanced Coding Efficiency Profile
─ Tool for improving coding efficiency for both rectangular and arbitrary shaped objects
─ For applications such as mobile broadcast reception
Advanced Scalable Texture Profile
─ Tool for decoding arbitrary shaped texture and still image including scalable shape coding
56
Profile Definitions of Version 2
Advanced Core Profile
─ Core Profile +
─ Tool for decoding arbitrary shaped video objects and arbitrary shaped scalable still image
Simple Face and Body Animation Profile
─ Simple face animation + body animation
57
58
Comparison of Transcoding Arch.
Reference Architecture
─ 2-loop solution; corrects for all types of errors
Residual value can change with modified motion vector
Also, compensates for re-quantization error in inter-coded blocks
Intra Refresh Architecture
─ 1-loop solution; uses intra-block refresh to correct for errors
Residual value cannot change with modified motion vector
No compensation for re-quantization errors in inter-coded blocks
59
Comparison of Transcoding Arch.
MC Low Architecture
─ 1.5-loop solution; uses partial encoder to compensate for errors
Residual value can change with modified motion vector
No compensation for re-quantization errors in inter-coded blocks
─ Quality and complexity should be between intra refresh and reference
60
Comparison of Transcoding Arch.
61
Complexity Analysis [Non-Optimized]
Simulation
- Machine: Pentium 4, 1.8GHz, 512MB
- Content: Highway19 @ 384Kbps, 30 sec duration
62
Complexity Analysis [Optimized]
Simulation
- Machine: Pentium 4, 1.8GHz, 512MB
- Content: Highway19 @ 384Kbps, 30 sec duration
63
Complexity Reductions
Down-Conversion Optimizations
- For intra refresh architecture
Float-to-integer, exploit filter symmetry and zero coefficients
Approximately 70% improvement for down-conversion (5.4s to 1.6s)
- For reference and partial encoder architectures
Replace frequency synthesis filter with averaging filter
64
Complexity Reductions
Speeding up FDCT, IDCT, and MC
- MMX implementation for FDCT; 26% overall reduction (20.0s to 14.9s)
- SSE2 implementation for IDCT; 9% overall reduction (16.3s to 14.9s)
- MMX implementation for common block-based processes
Common processes include averaging, clipping, block addition
These optimized routines have a significant impact on MC
65
Observations on Complexity
Overall improvement is quite high
- 61% for Intra Refresh
- 71% for Reference; 74% for partial Encoding
Transcoding multiple streams in software is feasible
- 2 streams can be supported by reference; 3 streams by proposed methods
- All methods provide acceptable quality
Further complexity reduction
- Computation for RC_Quant can be reduced by avoiding division operations
- Majority of complexity now in DecTime and MB_Code portions
- Other marginal gains may be possible if data is restructured
66
Experimental Results: Akiyo
Akiyo
- Low motion and low-level of detail
- CIF (352*288) -> QCIF (176*144), N=15, M=3, drop B
- Source bit rate: 512Kbps
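The "N=15, M=3, drop B" setting can be unpacked with a small sketch. It assumes the common convention that N is the GOP length and M is the spacing between anchor (I/P) frames, and it lists frame types in a simplified display order; the function names are hypothetical.

```python
def gop_pattern(n, m):
    """Frame types of one GOP (N = GOP length, M = anchor-frame spacing)."""
    return "".join(
        "I" if i == 0 else ("P" if i % m == 0 else "B") for i in range(n)
    )


def drop_b_frames(pattern):
    """Keep only the anchor frames, as a 'drop B' transcoder would."""
    return pattern.replace("B", "")


gop = gop_pattern(15, 3)
kept = drop_b_frames(gop)
print(gop)
print(kept, len(kept) / len(gop))
```

Under these assumptions one frame in three survives, so a 30 frame/s source would come out at 10 frame/s. Since B frames are not used as references in this GOP structure, dropping them requires no MV or residual re-estimation.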
67
Akiyo (Cont’d)
68
Experimental Results: Foreman
Foreman
-Medium motion and medium-level of detail
- CIF (352*288) -> QCIF (176*144), N=15, M=3, drop B
- Source bit rate: 2Mbps
69
Foreman (Cont’d)
70
Experimental Results: Football
Football
- Fast motion and high-level of detail
- CCIR601 (720*480) -> SIF (352*240), N=15, M=3, drop B
- Source bit rate: 6Mbps
71
Football (Cont’d)
72
Summary of MPEG-2 to MPEG-4
Key observations
- DriftFull with InterIntra more complex than Reference
Not recommended to be used
- Simple sequences with low motion and low level of detail
Zeroout: reasonably good quality
InterIntra, IntraInter, Intra_Refresh, MC_Low, DriftLow: high quality
- Sequences with medium to high motion
Artifacts can be found in Zeroout, InterIntra, IntraInter, DriftLow
Intra_Refresh, MC_Low comparable to Reference
73
Summary of MPEG-2 to MPEG-4
Summary
- Intra Refresh
Offers best trade-off between quality and complexity
Flexible and adaptable, i.e., easily scaled in terms of complexity-quality
-MC Low
Provides a reasonable quality-complexity trade-off
A good alternative to Reference, but less dynamic compared to Intra-Refresh
74
Transcoding of FGS to Simple Profile (1)
Application scenario
75
Transcoding of FGS to Simple Profile (2)
Conceptual illustration
Technical issues
How to combine the two bitstreams in DCT domain or even at bitstream level by advanced processing
How to minimize the efforts in the combining processes for converting the two FGS bitstreams into an MPEG-4 Simple Profile bitstream
76
Transcoding of FGS to Simple Profile (3)
Reference architecture
77
Transcoding of FGS to Simple Profile (4)
Analysis of Reference Architecture
- P-frame analysis
78
Transcoding of FGS to Simple Profile (5)
Proposed Architecture
79
Transcoding of FGS to Simple Profile (6)
Simulation results
80
Future Transcoding Considerations
Industry Need
- Describing a dynamic usage environment
Capabilities of the terminal and network
User preference and natural environment conditions
Types of services that are available
- Transcoding should be performed according to usage environment
- This is one of the targets for the emerging MPEG-21 standard
81
Future Transcoding Considerations
Research Topic
- Transcoding strategy is needed for multiple transcoding possibilities
- For example:
Send QCIF @ 30Hz or CIF @ 10Hz
Key frame w/audio or QCIF @ 7.5Hz
-What is a suitable quality metric for optimal transcoding strategy?
- How to measure distortion across spatio-temporal scales?
82
Conclusion
Transcoding is a bridge between standards in many applications
Transcoding is a very useful tool for video streaming systems in which the content format at the server has been defined
Transcoding is a useful component for UMA which is concerned with the access to any multimedia content from any type of terminal or network. This is an important part of MPEG-21