Post on 08-Jan-2016
Coding formats for images and video
Ralf Schäfer
Fraunhofer Heinrich-Hertz-Institut
schaefer@hhi.de
http://ip.hhi.de
Slide 2
Outline
• Introduction
• Some fundamentals in image coding
• JPEG-2000 and Motion JPEG-2000: Compression tools for production
• MPEG-4: New functionalities for interactivity
• H.264/AVC: A step forward in compression technology
• Experimental results and comparison of MPEG-2 and H.264/AVC
• Scalable coding
• Multiview coding
• Next generation video coding
• Conclusions
Slide 3
Compression as enabling technology
Sources: live content, computer animation, post production, recorded content
Storage media / archive: media encoder, lossless or „quasi“ lossless
Transmission (unicast, multicast, broadcast): media encoder, lossy
Slide 4
Image formats and data rates
                      TV (ITU-R 601)   HD(TV)      D-Cinema (3k x 4k)
Pixels per line       720              1920        4096
Number of lines       576/480          1080        3072
Frame rate [Hz]       25/30            24/25/30    24
Components            YUV              YUV         RGB
Bit/component         8/10             8/10        12
Data rate [Mbit/s]    165.9            < 995.3     10,872
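The data rates in the table follow from pixels x lines x frame rate x samples per pixel x bits per sample. A minimal sketch that reproduces them, assuming 4:2:2 sampling for YUV (2 samples per pixel on average), 8 bit, 25 Hz for TV and 30 Hz for HD; the function name is hypothetical:

```python
def raw_data_rate_mbit(width, height, fps, samples_per_pixel, bits_per_sample):
    """Uncompressed video data rate in Mbit/s (1 Mbit = 10**6 bit)."""
    return width * height * fps * samples_per_pixel * bits_per_sample / 1e6

# Assumption: YUV is sampled 4:2:2 (2 samples per pixel on average), RGB is 4:4:4.
tv = raw_data_rate_mbit(720, 576, 25, 2, 8)          # ITU-R 601, 8 bit
hd = raw_data_rate_mbit(1920, 1080, 30, 2, 8)        # upper bound for 8-bit HD
dcinema = raw_data_rate_mbit(4096, 3072, 24, 3, 12)  # RGB, 12 bit

print(round(tv, 1), round(hd, 1), round(dcinema))
```

With these assumptions the three values round to the figures shown in the table.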
Slide 5
Capacity of and transmission time for movies (90 min)
                                    Uncompressed   Archiving   Distribution
                                    (CF = 1)       (CF = 2)    (CF = 20/45*)
Capacity [GB]           HD          672            336         34
                        3k x 4k     7,339          3,598       163
Number of DVDs          HD          143            71          8
                        3k x 4k     1,562          781         35
Transm. time [h]        HD          39.3           19.6        2
@ 38 Mbit/s             3k x 4k     429            214.5       9.5
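The capacity and transmission-time figures can be checked with a few lines of arithmetic. This is only a sketch: the 4.7 GB DVD capacity is an assumption, as is the reading that CF = 20 applies to HD and CF = 45 to 3k x 4k (both match the table); the function names are hypothetical.

```python
def capacity_gb(rate_mbit_s, minutes=90, cf=1):
    """Storage needed for a movie, in GB (10**9 byte), at compression factor cf."""
    return rate_mbit_s * minutes * 60 / 8 / 1000 / cf

def transmission_hours(rate_mbit_s, link_mbit_s=38, minutes=90, cf=1):
    """Time in hours to send the movie over a link of link_mbit_s."""
    return rate_mbit_s * minutes * 60 / cf / link_mbit_s / 3600

DVD_GB = 4.7  # assumption: single-layer DVD

hd_rate, dcinema_rate = 995.3, 10872  # Mbit/s, uncompressed
print(round(capacity_gb(hd_rate)))                   # HD, uncompressed
print(round(capacity_gb(hd_rate) / DVD_GB))          # number of DVDs
print(round(transmission_hours(hd_rate), 1))         # hours at 38 Mbit/s
print(round(capacity_gb(dcinema_rate, cf=45)))       # 3k x 4k, distribution
```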
Slide 6
Concept of DCT coding (JPEG, MPEG, H.26x)
Processing chain: block scanning -> DCT -> quantisation -> zig-zag scanning -> VLC -> channel
Input block: 8 x 8 x 8 bit = 512 bit
DCT coefficients: 8 x 8 x 10 bit = 640 bit
10 92 31 12  1  0  0  0
14 13  5  0  0  0  0  0
15 36  3  0  0  0  0  0
 4  4  2  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
Quantised coefficients: 8 x 8 x 4 bit = 256 bit
10 70 30 10  0  0  0  0
10 10 10  0  0  0  0  0
10  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
Zig-zag scanned: 10, 70, 10, 10, 10, 30, 10, 10, 0, 0, 0, 0, ....
VLC: 01, 00111, 01, 01, 01, 010, 01, 01, 000001 (000001 = EOB) -> 26 bit
Compression factor = 512/26 ≈ 20
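The zig-zag scan and the bit count of this example can be reproduced with a short sketch. The codeword table below is simply read off the slide for this one block; it is a toy table, not the real JPEG or MPEG code tables, and all names are hypothetical.

```python
# Quantised 8x8 block from the slide (the remaining five rows are all zero).
block = [[10, 70, 30, 10, 0, 0, 0, 0],
         [10, 10, 10, 0, 0, 0, 0, 0],
         [10, 0, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(5)]

def zigzag_order(n=8):
    """(row, col) pairs in zig-zag order: anti-diagonals, alternating direction."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]))

scanned = [block[r][c] for r, c in zigzag_order()]

# Drop the trailing run of zeros; it is signalled by one end-of-block code.
last = max(i for i, v in enumerate(scanned) if v != 0)
symbols = scanned[:last + 1]

# Toy codewords taken from the slide (assumption: valid for this example only).
VLC = {10: "01", 70: "00111", 30: "010"}
EOB = "000001"
bits = "".join(VLC[v] for v in symbols) + EOB

print(symbols)
print(len(bits), round(512 / len(bits), 1))
```

The scan concentrates the non-zero coefficients at the start of the sequence, so a single EOB codeword replaces the remaining 56 zeros.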
Slide 7
Compression using temporal prediction
[Block diagram: DCT encoder, channel, DCT decoder, frame store, motion estimation, motion compensation, motion vectors]
Difference image (= 0 without motion)
Difference image (with motion)
Difference image (with motion compensation)
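Motion estimation can be illustrated with full-search block matching. This is a minimal sketch under stated assumptions: frames are lists of rows, the cost measure is the sum of absolute differences (SAD), and block size, search range and function names are hypothetical choices, not taken from the slides.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of `ref`
    displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def full_search(cur, ref, bx, by, n=4, search=2):
    """Return (cost, dx, dy) of the best match within +/- `search` samples."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= h and
                      0 <= bx + dx and bx + dx + n <= w)
            if inside:
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best

# Reference frame: a ramp; current frame: same content moved 2 right, 1 down.
ref = [[10 * y + x for x in range(12)] for y in range(12)]
cur = [[ref[y - 1][x - 2] if y >= 1 and x >= 2 else 0 for x in range(12)]
       for y in range(12)]

print(sad(cur, ref, 4, 4, 0, 0, 4))   # residual energy without compensation
print(full_search(cur, ref, 4, 4))    # best match and its motion vector
```

In this toy case the difference block is non-zero without compensation and vanishes once the block is predicted along the estimated motion vector.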
Slide 8
I, P and B frames
P frames - Uni-directional predictive coding
B frames - Bi-directional predictive coding
I frames - Intracoding (JPEG)
I B ...... B P B B P
Slide 9
ISO and ITU-T standards for image & video coding
[Figure: standards plotted over time (1990 to 2002) against bit rate (8 kbit/s, 64 kbit/s, 1 Mbit/s, 20 Mbit/s, 100 Mbit/s, 500 Mbit/s).
Standards shown: JPEG, ITU H.261, MPEG-1, MPEG-2, ITU H.263, MPEG-4 (Vers. 1, 2, 3), JPEG-2000, ITU/MPEG (JVT) H.264/AVC.
Application ranges: videophone/conference, mobile video services, CD-ROM, SDTV, HDTV, TV/HDTV production, D-Cinema production.]
Slide 10
JPEG
Lossy coding
Encoder: DCT -> Q -> VLC -> to channel/storage media
Decoder: from channel/storage media -> VLD -> Q-1 -> IDCT
Lossless coding (JPEG/JPEG-LS)
(Adaptive) spatial prediction -> entropy coding
Spatial prediction from neighbouring pixels:
Line n-1:  c b d
Line n:    a
Disadvantage: Motion JPEG is not standardized!
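As an illustration of adaptive spatial prediction, the sketch below implements the median edge detector (MED) predictor, which to my understanding is the fixed predictor used in JPEG-LS. It assumes a is the left neighbour, b the neighbour above and c the neighbour above-left of the current pixel; the function name is hypothetical.

```python
def med_predict(a, b, c):
    """MED prediction from left (a), above (b) and above-left (c) neighbours."""
    if c >= max(a, b):
        return min(a, b)   # edge suspected: follow the smaller neighbour
    if c <= min(a, b):
        return max(a, b)   # edge suspected: follow the larger neighbour
    return a + b - c       # smooth area: planar prediction

# Only the prediction error is entropy coded, which keeps the scheme lossless.
pixel = 17
residual = pixel - med_predict(10, 20, 15)
print(residual)
```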
Slide 11
JPEG2000
JPEG 2000 is the successor of the JPEG standard.
Work started in 1997.
The most important criterion was the overall environment in which images would be used in the future.
JPEG 2000 is a wavelet-based compression scheme that delivers better quality than JPEG and allows scalability without having to store redundant data.
JPEG 2000 delivers about 20% better compression than JPEG. At more extreme compression ratios, it delivers significantly better quality.
JPEG 2000 supports both lossless and lossy compression in a single codec, a very desirable feature in applications such as medical imaging and post-production.
Slide 12
JPEG 2000
Encoder: pre-processing -> 2D DWT -> Q -> coefficient bit modelling -> arithmetic encoder -> to channel/storage media
Decoder: from channel/storage media -> arithmetic decoder -> coefficient bit modelling -> Q-1 -> 2D IDWT -> post-processing
Slide 13
JPEG2000: Scalability
JPEG 2000 is scalable in both SNR and resolution without transcoding:
Example: Scalability in Resolution
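Resolution scalability follows from the wavelet decomposition: the low-pass band is a half-resolution version of the signal and can be decoded on its own. A minimal one-dimensional sketch of one level of the reversible 5/3 lifting transform (the integer wavelet used for lossless JPEG 2000 coding), assuming even-length input and mirrored borders; the function names are hypothetical:

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting transform on an even-length list.
    Returns (low-pass band s, high-pass band d)."""
    n = len(x) // 2
    d = []
    for i in range(n):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < len(x) else x[2 * i]  # mirror at end
        d.append(x[2 * i + 1] - (left + right) // 2)               # predict step
    s = []
    for i in range(n):
        prev = d[i - 1] if i > 0 else d[0]                         # mirror at start
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)                # update step
    return s, d

def dwt53_inverse(s, d):
    """Exact inverse of dwt53_forward (undoes update, then predict)."""
    n = len(s)
    even = [s[i] - ((d[i - 1] if i > 0 else d[0]) + d[i] + 2) // 4
            for i in range(n)]
    x = []
    for i in range(n):
        right = even[i + 1] if i + 1 < n else even[i]
        x.extend([even[i], d[i] + (even[i] + right) // 2])
    return x

signal = [10, 12, 14, 13, 9, 7, 8, 20]
low, high = dwt53_forward(signal)
print(low, high)
print(dwt53_inverse(low, high) == signal)
```

A decoder that only needs half the resolution stops after reading `low`; applying the same step to `low` again gives the next lower resolution.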
Slide 14
Motion JPEG 2000 (Part 3)
Based on Part 1 codec of JPEG2000 standard
Motion Image specific additions
intraframe based coding scheme
MPEG-4 based file format
Synchronisation of audio and video
Metadata embedding
Multi-component, multi-sampling formats e.g. YUV422, RGB 444
Slide 15
MPEG-4: New Functionalities
• MPEG-4 scene description allows:
– hierarchical structuring of scenes
– combination of natural and synthetic video & audio objects
– interaction with single scene elements
Slide 16
Immersive Conference Terminal
Audio Speakers
Semi-circular table
61” Plasma display
Cameras
• Seamless transition between the real and virtual world
• Life-sized upper body images
• Natural reproduction of gestures and body language
• 3D representation of the remote participants including provision of eye contact
Slide 17
„Intelligent scrambling“ for Pay-TV
Broadcast of a tennis match without the players
Only those paying for admission get the players
Slide 18
H.264/AVC Video Coding
[Block diagram: the input video signal is split into macroblocks of 16x16 pixels; coder control; transform/scaling/quantisation; embedded decoder with inverse scaling & transform and motion-compensated predictor (intra/inter switch); motion estimator; entropy coding of control data, quantised transform coefficients and motion data.]
Slide 19
Common Elements with other Standards
• Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
• Input: association of luma and chroma and conventional sub-sampling of chroma (4:2:0)
• Block motion displacement
• Motion vectors over picture boundaries
• Variable block-size motion
• Block transforms
• Scalar quantization
• I, P, and B coding types
Slide 20
Motion Compensation Accuracy
[Block diagram of the H.264/AVC encoder, including intra-frame prediction and de-blocking filter]
Motion vector accuracy 1/4 (6-tap filter)
MB types: 16x16, 16x8, 8x16, 8x8
8x8 types: 8x8, 8x4, 4x8, 4x4
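The 6-tap filter behind the quarter-sample accuracy can be sketched in one dimension. The taps (1, -5, 20, 20, -5, 1)/32 for half-sample positions and the rounded average for quarter-sample positions are, as far as I know, those of H.264/AVC luma interpolation; the function names and the restriction to a single row are simplifications for illustration.

```python
def clip255(v):
    """Clip to the 8-bit sample range."""
    return max(0, min(255, v))

def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1].
    Needs the six integer samples row[i - 2] .. row[i + 3]."""
    e, f, g, h, i2, j = row[i - 2:i + 4]
    return clip255((e - 5 * f + 20 * g + 20 * h - 5 * i2 + j + 16) >> 5)

def quarter_sample(row, i):
    """Quarter-sample value between row[i] and the half sample to its right:
    rounded average of the two."""
    return (row[i] + half_sample(row, i) + 1) >> 1

edge = [0, 0, 0, 255, 255, 255]
print(half_sample(edge, 2), quarter_sample(edge, 2))
```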
Slide 21
Multiple Reference Frames
[Block diagram of the H.264/AVC encoder, including intra-frame prediction and de-blocking filter]
• Multiple reference frames
• Generalized B frames
Slide 23
Transform Coding
[Block diagram of the H.264/AVC encoder, including intra-frame prediction and de-blocking filter]
• 4x4 block integer transform
• Main Profile: adaptive block size transform (8x4, 4x8, 8x8)
• Repeated transform of DC coefficients for 8x8 chroma and 16x16 intra luma blocks
H = [ 1  1  1  1 ]
    [ 2  1 -1 -2 ]
    [ 1 -1 -1  1 ]
    [ 1 -2  2 -1 ]
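A minimal sketch of the 4x4 integer transform, assuming the H.264/AVC core transform matrix with rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1), (1, -2, 2, -1) and the forward transform Y = H X H^T. The scaling that makes the transform orthonormal is, to my understanding, folded into quantisation and is left out here; helper names are hypothetical.

```python
H = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward_transform(x):
    """Y = H X H^T, computed exactly in integer arithmetic."""
    return matmul(matmul(H, x), transpose(H))

flat = [[1] * 4 for _ in range(4)]
print(forward_transform(flat))        # only the DC coefficient is non-zero
print(matmul(H, transpose(H)))        # rows are orthogonal, norms differ
```

Because all entries are small integers, encoder and decoder compute identical results, which avoids the drift caused by mismatched floating-point DCT implementations.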
Slide 24
Intra Prediction
[Block diagram of the H.264/AVC encoder, including intra-frame prediction and de-blocking filter]
Directional spatial prediction (9 types for luma, 1 chroma)
[Figure: numbered prediction directions]
Neighbouring samples (capitals) and the 4x4 block to be predicted (lower case):
Q A B C D E F G H
I a b c d
J e f g h
K i j k l
L m n o p
M
N
O
P
• e.g., Mode 3: diagonal down/right prediction: a, f, k, p are predicted by (A + 2Q + I + 2) >> 2
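The diagonal down/right example can be extended to the whole 4x4 block. The sketch below assumes the sample naming of the slide (Q corner, A to D above, I to L to the left) and applies the same 3-tap rule (p + 2q + r + 2) >> 2 along every diagonal, which is how I understand this mode to work in H.264/AVC; the function name is hypothetical.

```python
def predict_diagonal_down_right(top, left, corner):
    """top = [A, B, C, D], left = [I, J, K, L], corner = Q.
    Returns the 4x4 prediction block as a list of rows."""
    # Neighbours laid out along the block border: L K J I Q A B C D
    edge = left[::-1] + [corner] + top
    pred = []
    for y in range(4):
        row = []
        for x in range(4):
            k = 4 + x - y  # position on the border hit by the 45 degree diagonal
            row.append((edge[k - 1] + 2 * edge[k] + edge[k + 1] + 2) >> 2)
        pred.append(row)
    return pred

pred = predict_diagonal_down_right(top=[10, 0, 0, 0], left=[30, 0, 0, 0], corner=20)
# a, f, k, p lie on the main diagonal: (A + 2Q + I + 2) >> 2
print([pred[i][i] for i in range(4)])
```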
Slide 25
Deblocking Filter
• Improves subjective visual and objective quality of the decoded picture
• Is significantly superior to post filtering
• Filtering affects the edges of the 4x4 block structure
• Highly content-adaptive filtering procedure mainly removes blocking artifacts and does not unnecessarily blur the visual content
Slide 26
Deblocking Filter: Subjective Result for Inter
[Figure: left, without filter; right, with H.264/AVC deblocking]
Slide 28
Variable Length Coding
Two schemes depending on profile:
Context adaptive VLC (CAVLC)
Context-based Adaptive Binary Arithmetic Codes (CABAC)
-> 10-15% gain over CAVLC
Slide 29
Grouping of Capabilities into Profiles
Four profiles: Baseline, Main, Extended, and High
• Baseline (Videoconferencing & Wireless)
– I and P picture types (not B)
– In-loop deblocking filter
– 1/4-sample motion compensation
– Tree-structured motion segmentation down to 4x4 block size
– VLC-based entropy coding
– Some enhanced error resilience features: flexible macroblock ordering, arbitrary slice ordering, redundant slices
Slide 30
Main and Extended Profiles
• Main Profile (esp. Broadcast/Entertainment)
– All Baseline features except enhanced error resilience features
– B pictures
– CABAC
– Adaptive block-size transforms
– MB-level frame/field switching
– Adaptive weighting for B and P picture prediction
– Note: Main is not exactly a superset of Baseline
• Extended Profile
– All Baseline features
– B pictures
– More error resilience: Data partitioning
– SP/SI switching pictures
– Note: Extended (Profile X) is a superset of Baseline
Slide 31
New Features in High Profiles
• Larger transforms
– 8x8 transform
– Drop 4x8, 8x4, or larger, 16-point…
• Filtered intra prediction modes for 8x8 block size
• Quantization matrix
– 4x4, 8x8, intra, inter trans. coefficients weighted differently
• Coding in various color spaces
– 4:4:4, 4:2:2, 4:2:0, Monochrome, with/without Alpha
– New integer color transform (a VUI-message item)
Slide 32
High Profiles
The High profile (HP):
Supporting 8-bit video with 4:2:0 sampling, addressing high-end consumer use and other applications using high-resolution video without a need for extended chroma formats or extended sample accuracy.
The High 10 profile (Hi10P):
Supporting 4:2:0 video with up to 10 bits of representation accuracy per sample.
The High 4:2:2 profile (H422P):
Supporting up to 4:2:2 chroma sampling and up to 10 bits per sample.
The High 4:4:4 profile (H444P):
Supporting up to 4:4:4 chroma sampling, up to 12 bits per sample, and additionally supporting efficient lossless region coding and an integer residual color transform for coding RGB video while avoiding color-space transformation error.
Slide 33
Test Set Results for Streaming Application
(B pictures used when in profile)
Average rate savings relative to:
Codec          MPEG-4 ASP   H.263 HLP   MPEG-2
H.264/AVC      39%          49%         64%
MPEG-4 ASP     -            17%         43%
H.263 HLP      -            -           31%
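A rate saving s at equal quality corresponds to a compression gain of 1 / (1 - s). The short sketch below converts the MPEG-2 column of the table; the function name is hypothetical.

```python
def compression_gain(rate_saving):
    """Factor by which the bit rate shrinks for a given fractional saving."""
    return 1.0 / (1.0 - rate_saving)

savings_vs_mpeg2 = {"H.264/AVC": 0.64, "MPEG-4 ASP": 0.43, "H.263 HLP": 0.31}
gains = {codec: round(compression_gain(s), 2) for codec, s in savings_vs_mpeg2.items()}
print(gains)
```

For H.264/AVC the 64% saving corresponds to a factor between 2 and 3 relative to MPEG-2.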
Slide 35
Comparison of MPEG-2 and H.264/AVC
[Figure: CIF, 30 Hz; H.264/AVC @ 512 kbit/s vs. MPEG-2 @ 512 kbit/s]
Slide 36
Comparison of MPEG-2 and H.264/AVC
[Figure: CIF, 30 Hz; H.264/AVC @ 512 kbit/s vs. MPEG-2 @ 512 kbit/s]
Slide 37
Comparison of MPEG-2 and H.264/AVC
[Figure: CIF, 30 Hz; H.264/AVC @ 340 kbit/s vs. MPEG-2 @ 1024 kbit/s]
Slide 38
Comparison of MPEG-2 and H.264/AVC
[Figure: CIF, 30 Hz; H.264/AVC @ 340 kbit/s vs. MPEG-2 @ 1024 kbit/s]
H.264/AVC Adoptions and Applications
Wireless broadcast and mobile networks adoptions and applications
• Optional codec in 3GPP Release 6
• Optional codec in DVB (AVC) for DVB-H
• Mandatory in DMB (DAB application)
• Mandatory codec in Japanese 1 Segment ISDB-T system
Broadcast adoptions and applications
• Optional codec in DVB (DVB-AVC)
• To be adopted as optional codec by ATSC
• Optional codec in Japan (ARIB) and Korea
• HDTV services via satellite (DirecTV, EchoStar, BSkyB, Premiere, …)
• The only mandatory codec for HDTV services in Europe (EICTA)
• SDTV services via IPTV (SBC, KPN, Belgacom, France Telecom, ...)
Storage adoptions and applications
• Mandatory codec for HD-DVD
• Mandatory codec for Blu-ray Disc
• Mandatory codec for UMD in Sony Play Station Portable 3
• Used in Apple iPod Video
Internet
• Used in Adobe Flash Player
Slide 39
HHI‘s role in video coding and standardization
• Associated Rapporteur of ITU-T/SG 16/VCEG (T. Wiegand) 2000 - …
• Co-chair of Joint Video Team (MPEG/VCEG) (T. Wiegand) 2001 - …
• Co-chair of MPEG Video (T. Wiegand) 2005 - …
• HHI prepared MPEG-4 reference software (RD optimised) and the H.26L proposal for the MPEG tests in 2001 -> foundation of JVT
• HHI is responsible for the integration and maintenance of the official H.264/AVC reference software (K. Sühring) 2002 - ...
• Editor of the H.264/AVC standard (T. Wiegand) 2002 - ...
• Coordinator of video for DVB-H and editor in DVB-CBMS (T. Wiegand) 2005
• Editor of the visual parts of TS 102 005 and TS 101154 in DVB-AVC (T. Wiegand) 2003 - 2005
• Chairman of ITG-FA 3.2 „Digital Image Coding” (R. Schäfer)
• Editor of the SVC standard (H. Schwarz and T. Wiegand) 2005 - ...
• Chairman of 3DAV Group of MPEG (A. Smolic)
Slide 40
Scalable Video Coding
Facing the scenario of heterogeneous media delivery:
• Different users
• Different needs
• Different displays
• Different links
Flexible source coding, i.e. scalability, is needed
• Simple adaptation to different bit-rates, frame rates or spatial resolutions of the video content on a bit-stream level
Slide 41
Scalable Video Coding
[Figure: a scene is encoded once by a scalable video encoder into a data stream of 2048 kbit/s; scalable video decoders extract the part of the stream they need]
2048 kbit/s   TV @ 60 Hz
512 kbit/s    CIF @ 30 Hz
256 kbit/s    CIF @ 15 Hz
32 kbit/s     QCIF @ 7.5 Hz
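Adaptation on bit-stream level means that a receiver, or a network node, simply keeps the largest sub-stream that fits its link. The sketch below is purely illustrative: the pairing of bit rates and formats is an assumption read off the figure in descending order, and the function and table names are hypothetical.

```python
# (cumulative bit rate in kbit/s, decodable format); pairing assumed from the figure
OPERATING_POINTS = [
    (32, "QCIF @ 7.5 Hz"),
    (256, "CIF @ 15 Hz"),
    (512, "CIF @ 30 Hz"),
    (2048, "TV @ 60 Hz"),
]

def select_operating_point(link_kbit_s):
    """Highest operating point whose rate fits the link, or None if none fits."""
    best = None
    for rate, fmt in OPERATING_POINTS:
        if rate <= link_kbit_s:
            best = (rate, fmt)
    return best

print(select_operating_point(300))
print(select_operating_point(5000))
```

No transcoding is involved: the layers above the selected operating point are dropped from the stream.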
Slide 42
Slide 43
SNR Scalability: Typical Encoding
[Block diagram: each layer consists of hierarchical MCP & intra prediction followed by base layer coding of texture and motion; higher layers use inter-layer prediction (intra, motion, residual); the lowest layer is an H.264/AVC compatible encoder (H.264/AVC MCP & intra prediction) that produces an H.264/AVC-compatible base layer bit-stream; a multiplexer combines all layers into the scalable bit-stream.]
Graceful Degradation in Video Transmission
[Figure: a client connected to servers A, B, C, D and E]
• Mobile ad-hoc networks: time varying connectivity, throughput, errors, and delay
• Design a robust transmission system for video
• Combine channel coding (Raptor codes) with error resilient source coding (SVC)
• Graceful degradation
Slide 44
Scalability of Video - Modalities
• Temporal: change of frame rate (30 Hz, 15 Hz, 7.5 Hz)
• Spatial: change of frame size (TV, CIF, QCIF)
• Fidelity: change of quality (e.g. SNR), from coarse to fine
Slide 47
3D-Television (1)
• Video + Depth concept adopted by MPEG (under chairmanship of HHI)
• Coding & transmission of 2D video
• Generation of per pixel depth information & coding of depth map
• Rendering at the decoder
• Intermediate views can be generated within a certain operating range => head motion parallax viewing
Slide 48
Slide 49
3D-Television (2)
[Figure labels: single user, multiple user, head tracking, 2D, 3D, 3D warp]
Layered Coding Syntax
Base layer: DVB MPEG-2 decoder
Advanced layer: advanced layer decoder
Backward compatible to DVB
Can be decoded by any existing STB
Advanced 3D features can be used depending on functionality of an advanced STB and the attached display
Slide 50
Multiview Coding (MVC) in MPEG
8 responses to the Call for Proposals on MVC were received:
• 5 from industry (cooperations), 2 from research institutions, 1 from a university
• 2 from Korea, 2 from Japan, 2 from USA, 2 from Germany
Examples of test sequences for MVC test
Slide 52
H.265: Next Generation Video Coding
Objective:
• Reduction of bit rate by 50% compared to H.264/AVC at equal quality
Applications:
• Mobile Internet
• (Mobile) broadcast services
• Immersive entertainment services
• Digital Cinema & beyond
• Robust video transmission
• Interactive services with low delay
Slide 53
H.265: Next Generation Video Coding - Technologies
Coding technologies:
• Motion compensated long term prediction
• Parametric motion models
• Coding with hierarchical prediction structures
• Texture analysis and synthesis
• Model-aided coding
• Adaptive quantisation
• Visual quality models
• Etc.
Slide 54
Texture analysis and synthesis
• Strongly textured image areas are difficult to encode (-> high data rate)
• The eye can be deceived in strongly textured image areas
Where is the original texture ?
Slide 55
Video coding with Texture Analysis (TA) and Synthesis (TS)
[Block diagram: Video In -> Encoder -> Bits -> Decoder -> Video Out; the Texture Analyzer (TA) at the encoder passes side info to the Texture Synthesizer (TS) at the decoder]
Slide 56
H.264 Codec with integrated TA/TS branch
• Bitrate gain: 35%
• But very high processing power is required
Slide 57
Conclusions (1): Milestones in Video Compression
[Figure: PSNR [dB] (26 to 38) over bit-rate [kbps] (0 to 500) for the sequence Foreman, 10 Hz, QCIF, 133 frames encoded.
Curves: DCT (Motion JPEG, 1985), H.120 (1988), H.261 (1991), H.263 (1995), MPEG-4 (P2) (1999), H.264 (2002), H.265 (2010 ?)]
Slide 58
Conclusions (2)
• Image & video coding are key technologies for multimedia and communication
• MPEG-4 (Part 2) provides a number of interesting functionalities for interactivity, however it has not really reached the market
• The H.264/AVC Video Coding Standard is the most powerful open video coding standard with increased compression factor of 2-3 compared to MPEG-2
• Using H.264/AVC, HD material can today be coded at similar rates as TV in the beginning of MPEG-2
• H.264/AVC is used in many areas ranging from low data rate mobile services up to HDTV
• Scalable Video Coding is just entering the market. Promising applications are QVGA/VGA transmission in mobile networks and 720p/1080p transmission of HDTV
• Multiview coding (MVC) is currently under development in MPEG and JVT and market penetration is expected in 3-5 years from now due to increased interest in 3D technologies.
• H.265 is the next step in video coding. Very complex and advanced coding schemes have to be used in order to achieve another factor of 2 in compression efficiency.
• HHI plays a leading role in the development of video coding standards