Adaptive Rate-Distortion Based Wyner-Ziv Video Coding

Lina Karam
Image, Video, and Usability (IVU) Lab
Department of Electrical Engineering
Arizona State University
Tempe, AZ
[email protected]


Outline

• Motivation

• Existing DVC Approaches

• BLAST-DVC: Rate-distortion based BitpLAne SelecTive decoding for pixel-domain Distributed Video Coding

• AQT-DVC: Rate-distortion based Adaptive QuanTization for transform-domain Distributed Video Coding

• Enhanced AQT-DVC

• Conclusion and future directions


Motivation

[Figure: Frame 60 and Frame 61 of Mother and Daughter (CIF, 352 x 288), consecutive in time]

Spatial and Temporal Redundancy

Motion Estimation and Compensation

Reference Frame (Frame 197) Current Frame (Frame 198)

CIF Mother & Daughter


Residual Error (No Motion Compensation)

[Figure: difference (residual) frame, intensity scale from 0.1 to 0.9]

Difference (Residual) Frame = Frame 198 – Frame 197

Motion Estimation and Compensation

Reference Frame (Frame 197) Current Frame (Frame 198) = Reference Frame + Error

CIF Mother & Daughter


Full Search Motion Estimation

[8x8] block motion vectors superimposed on Reference Frame (Frame 197)
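As a rough illustration of what full-search motion estimation involves, the sketch below tests every integer displacement in a small window and keeps the one with the lowest matching cost. The sum of absolute differences (SAD) criterion, the ±4 search range, and the names `sad` and `full_search` are assumptions made for this example; the slides do not state which criterion or range was used.

```python
def sad(cur, ref, bx, by, dx, dy, bs):
    """Sum of absolute differences between a block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(bs):
        for x in range(bs):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, bs=8, search=4):
    """Exhaustively test all integer displacements in [-search, search]
    and return (dx, dy, cost) of the best match for the block at (bx, by)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            inside = (0 <= by + dy and by + dy + bs <= h
                      and 0 <= bx + dx and bx + dx + bs <= w)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bs)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

With an 8x8 block and a ±4 window this already evaluates up to 81 candidate positions of 64 pixels each per block, which gives a feel for why the search dominates encoder complexity.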


Motion Compensation

Motion Compensated Reference (Frame 197)

PSNR = 40.8 dB, MSE = 5.4

Residual Error (No Motion Compensation)

[Figure: residual frame, intensity scale from 0.1 to 0.9]

Residual Error (16x16 blocks, full pixel)

PSNR = 39.4 dB, MSE = 7.5

[Figure: residual frame, intensity scale from 0.1 to 0.9]

Residual Error (4x4 blocks, quarter pixel)

PSNR = 45 dB, MSE = 2.1

[Figure: residual frame, intensity scale from 0.1 to 0.9]
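The PSNR and MSE figures quoted on these slides can be cross-checked with the usual relation PSNR = 10 log10(peak^2 / MSE). The sketch below assumes 8-bit video (peak value 255); the function name `psnr_from_mse` is chosen for this example only.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error, assuming 8-bit samples."""
    return 10.0 * math.log10(peak * peak / mse)


# MSE values quoted on the slides.
for label, mse in [("motion-compensated reference", 5.4),
                   ("16x16 blocks, full pixel", 7.5),
                   ("4x4 blocks, quarter pixel", 2.1)]:
    print(f"{label}: {psnr_from_mse(mse):.1f} dB")
```

The first two values reproduce the quoted 40.8 dB and 39.4 dB; the third comes out slightly below the rounded 45 dB shown on the slide.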

Variable block size (16x16 – 4x4) + quarter-pel + multi-frame motion compensation + R-D optimization (H.264, 2004)

85%

So, what is the problem?


Power profile of H.264 decoder with QCIF@15fps

Deblocking filter      34%
Motion compensation    29%
Misc.                  18%
Intra predictor        10%
Syntax parser           5%
CAVLC+IQ+IZZ+IDCT       4%

From: T.-A. Liu, T.-M. Lin, S.-Z. Wang, et al., “A low-power dual-mode video decoder for mobile applications,” IEEE Communications Magazine, vol. 44, no. 8, pp. 119-126, Aug. 2006.

• Encoder performs both Motion Estimation and Compensation
• Motion Estimation is much more computationally complex and consumes much more power than Motion Compensation

Distributed Video Coding: Motivation

Conventional video coding
• MPEGx or H.26x
• High complexity video encoder due to motion estimation.

Emerging applications
• Video compression with mobile devices
‒ Low complexity video encoder is preferred to reduce the hardware cost and to extend battery life.
• Video compression for sensor networks
‒ Low complexity video encoder is also preferred to reduce the hardware cost and to extend battery life.
‒ Inter-sensor communication may not be allowed or needs to be minimized.

Two main frameworks
• Multi-View/Multi-Camera
• Single-View/Single-Camera (Wyner-Ziv Video Coding)

Distributed Video Coding: Objectives

Intraframe encoding and interframe decoding
• Move complexity (motion estimation) from encoder to decoder
• Achieve interframe compression rate-distortion performance

Distributed source coding
• Compress consecutive frames separately
• Decode the frames jointly at the decoder
• Motivated by the work of Slepian-Wolf (1973) and Wyner-Ziv (1976)
‒ Slepian-Wolf: it is possible to losslessly compress two statistically dependent sources in a distributed fashion at a rate equal to their joint entropy
‒ Wyner-Ziv: it is possible to compress in a distributed fashion and achieve the same rate-distortion performance as when coding in a non-distributed fashion (Gaussian memoryless sources and mean-square error distortion).

How can we do this?


Distributed Video Coding (DVC): How?

Back to Mother & Daughter…

Reference Frame (Frame 197); Current Frame (Frame 198) = Reference Frame + “Error”

DVC problem becomes: correct or reduce the “Error” without using Motion Estimation at the encoder and without knowing what the “Error” is!

Similar to a channel coding problem => can make use of channel codes

Distributed Video Coding (DVC): Example

QCIF (176x144) Foreman

[Figure: Frames 1 to 5; Frames 1, 3, and 5 are intra-coded; parity bits or syndrome bits are sent for Frames 2 and 4]

• Encoder: intra-codes the odd-numbered frames and sends parity bits or syndrome bits for the even-numbered frames
• Decoder:
- Recovers even-numbered frames from the intra-coded odd-numbered frames
- Odd-numbered frames are considered to be a distorted version of even-numbered frames; i.e., Frame(2n) = Frame(2n-1) + “Error”
- “Error” is corrected using parity bits or syndrome bits
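To make the role of syndrome bits concrete, here is the classic textbook toy case rather than the codec described in the talk: a 3-bit source word X and 3-bit side information Y that differ in at most one position. The encoder sends only a 2-bit syndrome identifying the coset of the repetition code {000, 111} that contains X, and the decoder picks the coset member closest to Y. The function names `syndrome` and `decode` are chosen for this sketch.

```python
from itertools import product


def syndrome(x):
    """2-bit syndrome of a 3-bit word with respect to the repetition code."""
    return (x[0] ^ x[1], x[1] ^ x[2])


def decode(s, y):
    """Return the word in the coset with syndrome `s` closest to side info `y`."""
    coset = [c for c in product((0, 1), repeat=3) if syndrome(c) == s]
    return min(coset, key=lambda c: sum(a != b for a, b in zip(c, y)))


x = (1, 0, 1)          # source word, known only to the encoder
y = (1, 1, 1)          # side information at the decoder (one bit differs)
print(decode(syndrome(x), y))   # recovers x from 2 bits plus y
```

Each coset holds a word and its complement, so as long as Y is within one bit of X the closest coset member is X itself. Practical DVC systems apply the same idea with much longer turbo or LDPC codes.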

Distributed Video Coding (DVC): Example

• Issue 1: “Error” can be large => need to send a lot of parity bits => large bitrate

[Figure: Frame 55 and Frame 56]

• Strategy: at the decoder, try to reconstruct even frames using received odd frames (e.g., bi-directional motion-compensated interpolation).

Distributed Video Coding (DVC): Example

QCIF (176x144) Foreman

• Decoder: Side Information Generation

[Figure: Frames 1 to 5; Frames 2 and 4 are interpolated from their neighboring frames]

Interpolated frames are called “side information”

• Issue 2: How to generate high-quality side information?
• Issue 3: How do we determine the number of needed parity or syndrome bits?
- Sending too many will waste bits
- Sending too few might leave large distortions uncorrected

Existing Approaches

PRISM (Puri et al., IEEE Trans. IP, Oct 2007)

• Syndrome-based Wyner-Ziv coding by dividing the codeword space into cosets

• After quantization, a bitplane representation is used

• Most significant bits can be inferred from side information

• Least significant bits (syndrome bits) need to be encoded and sent to the decoder

• Issues:

- Syndrome coding rate is fixed in advance

- Coding can stop if the CRC check fails => correctness not guaranteed

- Coding performance decreases significantly if the source statistics are unknown. In practice, the source correlation is not known in advance and is hard to estimate

Existing Approaches

Feedback-channel-based DVC by Aaron et al., 2004, and Girod et al., 2005

• Bitplane coding

• Rate-Compatible Punctured Turbo (RCPT) codes used to generate parity bits (Slepian-Wolf coding) for each bitplane

• Feedback channel used to request parity bits based on need

• No need to determine the number of parity bits to send in advance

• Hybrid FEC/ARQ-like scheme

‒ Feedback channel is used to acknowledge the decoding correctness (e.g., a CRC can be used to check correctness)

‒ Bitrate is determined on the fly.

‒ Decoding success can be guaranteed.

Existing Approaches: Feedback-based DVC (Girod’s Group)

[Block diagram: intraframe encoder, interframe decoder.
Wyner-Ziv encoder: Wyner-Ziv frames S -> DCT -> 2^Mk-level quantizer -> extract bitplanes (bitplane 1, bitplane 2, ..., bitplane Mk) -> Slepian-Wolf encoder (RCPT) -> buffer.
Wyner-Ziv decoder: Slepian-Wolf decoder (sends "request bits" back to the buffer) -> reconstruction -> IDCT -> decoded Wyner-Ziv frames S’.
Key frames K: conventional intraframe encoder -> conventional intraframe decoder -> decoded Key frames K’ -> side information generation (MCTI) -> DCT -> side information supplied to the Slepian-Wolf decoder and to the reconstruction.]

For pixel-domain, no DCT, IDCT

Existing Approaches: DISCOVER (Artigas et al., PCS 2007)

[Block diagram: intraframe encoder, interframe decoder.
Wyner-Ziv encoder: Wyner-Ziv frames S -> DCT -> 2^Mk-level quantizer -> extract bitplanes (bitplane 1, bitplane 2, ..., bitplane Mk) -> Slepian-Wolf encoder (LDPCA*) -> buffer.
Wyner-Ziv decoder: Slepian-Wolf decoder (sends "request bits" back to the buffer) -> reconstruction -> IDCT -> decoded Wyner-Ziv frames S’.
Key frames K: conventional intraframe encoder -> conventional intraframe decoder -> decoded Key frames K’ -> side information generation (hierarchical subpixel ME with smoothing filter) -> DCT -> side information.]

Significant R-D performance improvement

* LDPCA provided by Girod’s Group – Varodayan et al., 2006

Issues with Existing Approaches

• Issue 1: Existing DVC schemes do not adapt the Slepian-Wolf decoding to the local characteristics of the video => every bitplane is Slepian-Wolf decoded based on the bit budget, starting from the MSB and proceeding to the LSB.
- Decoding stops when no error is detected or when the bit budget is exhausted. Some important locations and bitplanes might not be decoded!

Question:
Can we skip some less important regions and bitplanes without decoding them?
How do we measure the significance of a bitplane?

Issues with Existing Approaches

• Issue 2: Existing DVC schemes do not adapt the quantization to the local characteristics of the video => during the encoding, a single quantizer matrix (one fixed quantizer for each subband) is selected for the whole video.

Question:
Can we adapt the quantization matrix to the local characteristics of the video so as to minimize the bits needed for LDPCA decoding while maximizing the quality?

Proposed Strategy

• Divide each video frame into partitions in order to exploit local characteristics
• Allocate bits to a partition only if they result in sufficient distortion reduction
- Determined using Distortion-Rate (D-R) ratios: D-R = D/R, where D = distortion reduction resulting from allocating R bits.
• Minimum allowed distortion reduction per bit is specified in terms of a target Distortion-Rate (D-R) ratio T_{D-R}
- Allocate bits only if D/R of the partition is > T_{D-R}
• D/R is an indication of how much distortion reduction (quality) a bit can buy us on average for the considered partition
• Bits can be allocated to a partition via Slepian-Wolf (LDPCA) decoding and/or by selecting the quantization matrix
• Target T_{D-R} is used to control the bit-rate: set low for high bit-rate coding, and high for low bit-rate coding

Challenge: How to Measure Distortion-Rate Ratio?

• The original source information is not available at the decoder, so the distortion D cannot be exactly measured.
• The bitrate R cannot be known without decoding.
• Proposed Approach: Distortion-Rate Ratio estimation performed at the decoder using the side information frames and the source correlation model
‒ The complexity of the encoder is not increased
‒ More flexibility, as the decoder can selectively decode the bitplanes based on a target distortion-rate ratio. The target distortion-rate ratio can be changed so that different R-D operating points can be achieved.
‒ Error probability needs to be estimated at the decoder

BLAST-DVC: Pixel-Domain BitpLAne SelecTive Decoding

[Block diagram.
Wyner-Ziv frame encoder: Wyner-Ziv frame X_i -> divide into sub-images X_{i,1}, X_{i,2}, ..., X_{i,M} -> extract bitplanes x_{i,m,1}, x_{i,m,2}, ..., x_{i,m,k} -> LDPCA encoder and CRC generator -> buffer; block indices decoding of the decoder's requests.
Wyner-Ziv frame decoder: decoded Key frames X̂_{i-1} and X̂_{i+1} -> motion compensated interpolation -> side information, divided into sub-images -> rate-distortion ratio estimation -> block indices encoding (request bits by block indices) -> LDPCA decoders (receiving parity bits + CRC bits) -> minimum distance symbol reconstruction (minimum-distortion pixel reconstruction) -> sub-images X’_{i,1}, X’_{i,2}, ..., X’_{i,M} -> merge sub-images -> decoded Wyner-Ziv frame X’_i.]

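BLAST-DVC: Distortion-Rate Ratio Estimation

Source Correlation Model

• Let D be the difference between the source information X and its side information X_side.

• D can be modeled as a random variable with a Laplacian distribution:

\[
P(D = d) =
\begin{cases}
\int_{-\infty}^{-254.5} 0.5\,\alpha \exp(-\alpha |x|)\,dx, & \text{if } d = -255 \\
\int_{d-0.5}^{d+0.5} 0.5\,\alpha \exp(-\alpha |x|)\,dx, & \text{if } -255 < d < 255 \\
\int_{254.5}^{\infty} 0.5\,\alpha \exp(-\alpha |x|)\,dx, & \text{if } d = 255
\end{cases}
\]

• α can be estimated from the co-located blocks of the two motion-compensated Key frames \hat{X}_{i-1} and \hat{X}_{i+1} (Brites et al., 2006):

\[
\alpha_m^2 = \frac{2}{\hat{\sigma}_m^2}, \qquad
\hat{\sigma}_m^2 = \frac{1}{N} \sum_{n=1}^{N} \left( \hat{X}_{i-1,m,n} - \hat{X}_{i+1,m,n} \right)^2,
\]

where m is the partition index and n is the pixel location in the partition.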
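The source correlation model can be sketched numerically as follows. The sketch assumes integer differences in [-255, 255], with the two tails folded into the end bins as on the slide, and uses the Laplacian identity variance = 2/α² to obtain α from the mean squared difference between co-located pixels of the two motion-compensated Key frames. The function names are hypothetical, and details such as any scaling of the difference are assumptions rather than the authors' exact estimator.

```python
import math


def laplacian_cdf(x, alpha):
    """CDF of a zero-mean Laplacian with density 0.5*alpha*exp(-alpha*|x|)."""
    if x < 0:
        return 0.5 * math.exp(alpha * x)
    return 1.0 - 0.5 * math.exp(-alpha * x)


def difference_pmf(d, alpha):
    """P(D = d) for integer d in [-255, 255]; tails are folded into the end bins."""
    if d <= -255:
        return laplacian_cdf(-254.5, alpha)
    if d >= 255:
        return 1.0 - laplacian_cdf(254.5, alpha)
    return laplacian_cdf(d + 0.5, alpha) - laplacian_cdf(d - 0.5, alpha)


def estimate_alpha(prev_mc, next_mc):
    """Estimate alpha for one partition from co-located pixels of the two
    motion-compensated Key frames (assumes the pixels are not all identical)."""
    n = len(prev_mc)
    variance = sum((a - b) ** 2 for a, b in zip(prev_mc, next_mc)) / n
    return math.sqrt(2.0 / variance)


alpha = estimate_alpha([10, 12, 9, 11], [8, 12, 11, 13])
print(alpha, difference_pmf(0, alpha))
```

A larger α means a more peaked distribution, i.e. side information that is expected to be closer to the source.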


BLAST-DVC: Rate Estimation

• Let P_{n,k} be the error probability at pixel n in bitplane k of the considered partition.

• Average of the error probabilities over the sub-image:

\[
P_k = \frac{1}{N} \sum_{n=1}^{N} P_{n,k}
\]

• The number of bits needed for the considered kth bitplane can be computed as:

\[
R_k = -\left( P_k \log_2 P_k + (1 - P_k) \log_2 (1 - P_k) \right) N + R_{CRC}
\]
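The rate estimate is the binary entropy of the average error probability, scaled by the number of pixels, plus the CRC overhead. A minimal sketch, assuming a base-2 logarithm so that the result is in bits and treating the CRC length as a free parameter (`crc_bits` is a placeholder, not a value from the slides):

```python
import math


def estimate_rate(error_probs, crc_bits):
    """Estimated number of bits needed to correct one bitplane of a sub-image.

    error_probs: per-pixel bit error probabilities P_{n,k}
    crc_bits:    number of CRC bits sent for the bitplane (assumed parameter)
    """
    n = len(error_probs)
    p = sum(error_probs) / n          # average error probability P_k
    if p <= 0.0 or p >= 1.0:
        entropy = 0.0                 # limit of the binary entropy at 0 and 1
    else:
        entropy = -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
    return n * entropy + crc_bits


print(estimate_rate([0.05] * 396, crc_bits=8))
```

This is an idealized (Slepian-Wolf bound) estimate; a practical LDPCA decoder typically needs somewhat more bits than the entropy suggests.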


BLAST-DVC: Error Probability Estimation

The probability of bit error can be expressed as:

\[
P_{n,k} = P(b_{n,k} = 1, b'_{n,k} = 0 \mid b_{n,r}, b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1)
+ P(b_{n,k} = 0, b'_{n,k} = 1 \mid b_{n,r}, b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1)
\]

where b_{n,k} and b'_{n,k} denote the bits in the kth bitplane corresponding to the nth pixel in the original sub-image and in the side information (generated through motion-compensated interpolation), respectively. DBP stands for Decoded Bit Plane.

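BLAST-DVC: Error Probability Estimation

\[
P(b_{n,k} = 1 \mid b_{n,r}, b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1)
= \frac{\sum_{s=1}^{S} \sum_{d = B_{n,k,s} - X'_{n,k-1} + 0.5}^{U_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
       {\sum_{s=1}^{S} \sum_{d = L_{n,k,s} - X'_{n,k-1} + 0.5}^{U_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
\]

and

\[
P(b_{n,k} = 0 \mid b_{n,r}, b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1)
= \frac{\sum_{s=1}^{S} \sum_{d = L_{n,k,s} - X'_{n,k-1} + 0.5}^{B_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
       {\sum_{s=1}^{S} \sum_{d = L_{n,k,s} - X'_{n,k-1} + 0.5}^{U_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
\]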
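A simplified version of this estimate can be sketched for the case where all previous k-1 bitplanes have been decoded, so that only a single interval of candidate pixel values remains (S = 1). The 8-bit pixel depth, the bitplane numbering with k = 1 as the MSB, the omission of the truncation at ±255, and the function names are all assumptions made for this illustration.

```python
import math


def lap_pmf(d, alpha):
    """Probability that the Laplacian difference falls in [d - 0.5, d + 0.5]."""
    def cdf(x):
        return 0.5 * math.exp(alpha * x) if x < 0 else 1.0 - 0.5 * math.exp(-alpha * x)
    return cdf(d + 0.5) - cdf(d - 0.5)


def bit_error_probability(x_side, prefix, k, alpha, depth=8):
    """Probability that bit k of the original pixel differs from bit k of the
    side-information pixel, given the k-1 already decoded MSBs (`prefix`)."""
    shift = depth - k
    lo = prefix << (shift + 1)            # smallest value consistent with prefix
    mid = lo + (1 << shift)               # first value whose bit k equals 1
    hi = lo + (1 << (shift + 1)) - 1      # largest value consistent with prefix
    total = sum(lap_pmf(y - x_side, alpha) for y in range(lo, hi + 1))
    p_one = sum(lap_pmf(y - x_side, alpha) for y in range(mid, hi + 1)) / total
    side_bit = (x_side >> shift) & 1
    return 1.0 - p_one if side_bit == 1 else p_one


# MSB decoded as 0, so candidates are 0..127; estimate the error of the next bit.
print(bit_error_probability(x_side=10, prefix=0, k=2, alpha=0.2))
print(bit_error_probability(x_side=63, prefix=0, k=2, alpha=0.2))
```

The error probability is tiny when the side-information pixel sits well inside a bin and approaches 0.5 when it sits next to the bin boundary.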

BLAST-DVC: Distortion Estimation

• Estimate the distortion reduction obtained if the target bitplane is decoded:

\[
\Delta D_k = D_k - \hat{D}_k
\]

• Average distortion estimation for a sub-image:

- Average distortion if the target bitplane is not LDPCA decoded:

\[
D_k = \frac{1}{N} \sum_{n=1}^{N} E\left[ (X_n - X'_{n,k-1})^2 \right]
\]

where X'_{n,k-1} is the partially reconstructed pixel value based on the previously determined k-1 bitplanes and the side information.

- Average distortion if the target bitplane is LDPCA decoded:

\[
\hat{D}_k = \frac{1}{N} \sum_{n=1}^{N} E\left[ (X_n - \mathrm{Recon}(X_n, X'_{n,k-1}))^2 \right]
\]

where Recon(X_n, X'_{n,k-1}) is the partially reconstructed pixel value when the target bitplane is LDPCA-decoded => minimum distance symbol reconstruction is used.

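Minimum Distortion Reconstruction

\[
X'_{n,k} = \mathrm{Recon}(X_n, X'_{n,k-1}) =
\begin{cases}
\lfloor X_n / \Delta_k \rfloor \Delta_k, & \text{if } \lfloor X'_{n,k-1} / \Delta_k \rfloor < \lfloor X_n / \Delta_k \rfloor \\
X'_{n,k-1}, & \text{if } \lfloor X'_{n,k-1} / \Delta_k \rfloor = \lfloor X_n / \Delta_k \rfloor \\
\lfloor X_n / \Delta_k \rfloor \Delta_k + \Delta_k - 1, & \text{if } \lfloor X'_{n,k-1} / \Delta_k \rfloor > \lfloor X_n / \Delta_k \rfloor
\end{cases}
\]

[Figure: three cases showing the positions of the original pixel X_n, the previous estimate X'_{n,k-1} (side information), and the reconstruction X'_{n,k} relative to a quantization bin of width Δ_k; the difference between the original and the side information is a Laplacian RV]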
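Minimum distance reconstruction amounts to clipping the previous estimate to the quantization bin that the decoded bitplanes identify. A minimal sketch, assuming integer pixel values and a bin width `delta` that is a power of two; the decoder does not know `x` itself, only its bin, which is all this function uses (the name `recon` mirrors the slide's Recon but the code is illustrative):

```python
def recon(x, x_prev, delta):
    """Clip the previous estimate x_prev to the bin of width delta containing x.

    Only the bin index x // delta is used, which is exactly what the decoder
    knows once the corresponding bitplanes have been decoded.
    """
    lower = (x // delta) * delta      # smallest value in the decoded bin
    upper = lower + delta - 1         # largest value in the decoded bin
    return min(max(x_prev, lower), upper)


# Original pixel 70 lies in the bin [64, 127] when delta = 64.
print(recon(70, 50, 64))    # estimate below the bin -> 64
print(recon(70, 100, 64))   # estimate inside the bin -> 100
print(recon(70, 200, 64))   # estimate above the bin -> 127
```

When the side information already falls in the correct bin it is kept unchanged, so decoding a bitplane only alters pixels whose side information was wrong at that precision.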

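Distortion Estimation – Bitplane not decoded

\[
D_k = \frac{1}{N} \sum_{n=1}^{N} E\left[ (X_n - X'_{n,k-1})^2 \mid b_{n,r}, b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1 \right]
= \frac{1}{N} \sum_{n=1}^{N}
\frac{\sum_{s=1}^{S} \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} P(X_n = y)\,(y - X'_{n,k-1})^2}
     {\sum_{s=1}^{S} \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} P(X_n = y)},
\]

where S = 2^p and p is the number of NDBPs.

Consider that the MSB is 0 and we want to determine the next bit.

[Figure: P(X | X_side) over pixel values y from 0 to 255, with the interval bounds L_1, B_1, U_1 and the side information X_side marked; bins 00 and 01; the error is y - X_side. Estimated value => next bit is 1]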

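Distortion Estimation – Bitplane LDPCA-decoded

\[
\hat{D}_k = \frac{1}{N} \sum_{n=1}^{N} E\left[ (X_n - \mathrm{Recon}(X_n, X'_{n,k-1}))^2 \mid b_{n,r}, b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1 \right]
= \frac{1}{N} \sum_{n=1}^{N}
\frac{\sum_{s=1}^{S} \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} P(X_n = y)\,(y - \mathrm{Recon}(y, X'_{n,k-1}))^2}
     {\sum_{s=1}^{S} \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} P(X_n = y)}
\]

Consider that the MSB is 0 and we want to determine the next bit.

[Figure: P(X | X_side) over pixel values y from 0 to 255, with the interval bounds L_1, B_1, U_1 and the side information X_side marked; bins 00, 01, 10, 11; the error is y - Recon(y, X_side)]

• If y is in Bin 00, Recon(y, X_side) is the value in Bin 00 closest to X_side
• If y is in Bin 01, Recon(y, X_side) is X_side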

Bitplane Decoding Selection

Once the rate Rk and the distortion reduction ΔDk are obtained, a targeted distortion-rate ratio t can be chosen to determine whether bitplane decoding should be performed.

If ΔDk / Rk < t , the current bitplane is not decoded (NDBP case)

If ΔDk / Rk ≥ t , CRC bits are requested followed progressively by parity/syndrome bits, one parity/syndrome bit at a time, so that error correction can be applied to the current sub-image bitplane by means of LDPCA until no errors are detected (DBP case).
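The selection rule can be written as a small decision function. In the sketch below the per-bitplane estimates are passed in as precomputed (ΔD_k, R_k) pairs; this is a simplification, since in the scheme described here the estimates for a bitplane depend on which earlier bitplanes were decoded. The sample numbers and the name `select_bitplanes` are illustrative only.

```python
def select_bitplanes(estimates, t):
    """Label each bitplane 'DBP' (decode) or 'NDBP' (skip).

    estimates: list of (delta_d, rate) pairs, one per bitplane, MSB first
    t:         target distortion-rate ratio
    """
    labels = []
    for delta_d, rate in estimates:
        # Decode only if each requested bit buys at least t units of
        # distortion reduction on average.
        labels.append("DBP" if delta_d / rate >= t else "NDBP")
    return labels


estimates = [(900.0, 300.0), (200.0, 250.0), (20.0, 200.0)]  # made-up values
print(select_bitplanes(estimates, t=0.5))   # ['DBP', 'DBP', 'NDBP']
print(select_bitplanes(estimates, t=1.0))   # ['DBP', 'NDBP', 'NDBP']
```

Raising t skips more bitplanes and so lowers the bitrate, which is how the target ratio acts as the rate control knob.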

Proposed BLAST-DVC

[Block diagram.
Encoder: Wyner-Ziv frame X_i -> sub-images X_{i,1}, X_{i,2}, ..., X_{i,M} -> extract bitplanes x_{i,m,1}, x_{i,m,2}, ..., x_{i,m,k} -> LDPCA encoder and CRC generator -> buffer.
Decoder: decoded Key frames X̂_{i-1} and X̂_{i+1} -> side information sub-images -> request bits by block indices -> LDPCA decoders (receiving parity bits + CRC bits) -> reconstruction -> sub-images X’_{i,1}, X’_{i,2}, ..., X’_{i,M} -> decoded Wyner-Ziv frame X’_i.]


Simulation Setup

QCIF Video Sequences (176x144)
Frame rate: 15 frames per second.
Number of partitions per frame = 64 (22x18 each)

Comparison with the following systems:
• H.264 Inter: I-B-I-B
• H.264 Intra only
• DISCOVER by X. Artigas et al.
‒ Transform domain DVC, GOP = 2.
• PDDVC (non-adaptive best pixel-domain system)
‒ Pixel domain DVC, GOP = 2.
‒ Special case of the proposed system but with no partitions (1 partition per frame)

Simulation Results

[R-D plots; annotations on the plots: 2.0 dB, 1.6 dB, 22% reduction, 18% reduction]

Simulation Results

[R-D plots; annotations on the plots: 1.4 dB, 0.8 dB, 18% reduction, 18% reduction]

Visual Testing Setup

9 subjects took the test.
Two video sequences are randomly placed side by side on a 19” Dell Ultrasharp screen.

Score
• 1: DISCOVER is much better than BLAST DVC
• 2: DISCOVER is better than BLAST DVC
• 3: same quality
• 4: DISCOVER is worse than BLAST DVC
• 5: DISCOVER is much worse than BLAST DVC

Visual testing

Sequences: Hall Monitor, Foreman

Operating Point                        A        B        C        D
Average Bitrate (kbps)   DISCOVER    73.60    97.64   167.01   293.73
                         BLAST       71.43    97.62   166.48   291.63
Average PSNR (dB)        DISCOVER    28.71    29.93    32.38    35.51
                         BLAST       28.19    29.34    31.68    34.59

Operating Point                        A        B        C        D
Average Bitrate (kbps)   DISCOVER    87.62   100.28   140.38   208.25
                         BLAST       83.53    89.69   121.57   185.45
Average PSNR (dB)        DISCOVER    31.48    32.07    34.31    37.27
                         BLAST       31.49    32.02    34.29    37.25

Sequence average bitrate is 140.38 kbps and average PSNR is 34.31 dB for DISCOVER.
Sequence average bitrate is 121.57 kbps and average PSNR is 34.29 dB for the proposed system.

[Figure: one decoded frame from each system]
DISCOVER: frame bits: 5.34 kbits, frame PSNR: 33.21 dB
Proposed System: frame bits: 3.36 kbits, frame PSNR: 32.89 dB

Sequence average bitrate is 167.01 kbps and average PSNR is 32.38 dB for DISCOVER.
Sequence average bitrate is 166.48 kbps and average PSNR is 31.68 dB for the proposed system.

[Figure: one decoded frame from each system]
DISCOVER: frame bits: 8.61 kbits, frame PSNR: 33.16 dB
Proposed System: frame bits: 5.83 kbits, frame PSNR: 31.84 dB

[Video comparison]
DISCOVER: compressed at 15 fps, 167.01 kbps
BLAST-DVC: compressed at 15 fps, 166.48 kbps

AQT-DVC: Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization

Motivation
• Transform-domain DVC performance is better than pixel-domain DVC performance, especially for high-motion sequences.
• Rate-distortion based adaptive quantization provides a better quantization scheme in terms of rate-distortion performance.

Considerations:
• Feedback channel
- Minimize the traffic on the feedback channel. The bitplane selective scheme is not applicable because the number of bitplanes might be too large.
-> One quantization matrix for each partition (M 4x4 DCT blocks)
• Partition size versus LDPCA block size
- A smaller partition size keeps the flexibility of the quantization scheme.
- A larger LDPCA block size provides a better error correction ability and reduces the feedback channel traffic.
-> One LDPCA code for a bitplane of a subband.
-> Due to the different adaptive quantizers, the resulting bitplanes are not rectangular (irregular shape) and have undefined values => need to modify LDPCA

Sample Quantizer Matrices

Each matrix describes the number of quantization levels used for each of the 16 DCT subbands


Adaptive Quantization

[Figure: a frame of 4x4 DCT blocks grouped into partitions; every block in a partition is labeled with the partition's quantizer matrix index Q (values 1, 3, 4, and 5 in the example)]

Q: Quantizer matrix index

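LDPCA Adaptation

[Figure: LDPCA encoder Tanner graph with source nodes x1, ..., x8, syndrome nodes s1, ..., s8, and accumulated syndrome nodes a1, ..., a8]

[Figure: LDPCA encoder Tanner graph corresponding to the transmission of only the even-indexed subset of the accumulated syndrome (a2, a4, a6, a8, with combined syndromes s1+s2, s3+s4, s5+s6, s7+s8)]

[Figure: Tanner graph after eliminating redundant nodes (source nodes x1, x2, x3; syndromes s1+s2, s3+s4, s5+s6; accumulated syndromes a2, a4, a6)]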

AQT-DVC

[Block diagram.
Wyner-Ziv frame encoder: Wyner-Ziv frame X_i -> DCT -> adaptive quantization (driven by the quantizer set index) -> extract bitplanes -> LDPCA encoder and CRC generator -> buffer -> parity bits + CRC bits, sent on request.
Wyner-Ziv frame decoder: Key frames X_{i-1} and X_{i+1} -> side information -> DCT -> distortion-rate estimation -> quantizer set selection -> quantizer set index; LDPCA decoder -> reconstruction -> inverse DCT -> decoded Wyner-Ziv frame X’_i.]

Same D-R concept but different equations for D and R

Quantizer Matrix Selection

• Each R-D point corresponds to a quantizer matrix

• Two criteria for quantizer selection

- D/R is larger than the target D-R threshold T_{D-R} = t

- The quantizer matrix results in the largest distortion reduction

[Figure: average distortion D versus average bitrate R; the side information lies at R = 0, the points M_0, M_1, ..., M_7 correspond to the candidate quantizer matrices, and a line of slope t identifies the selected quantizer set]
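The two selection criteria can be combined as follows: keep only the quantizer matrices whose distortion reduction per bit exceeds t, then pick the one with the largest distortion reduction. This is a sketch under assumptions: the (index, rate, distortion) tuples are made-up values, distortion reduction is measured relative to the side-information-only distortion, and returning `None` when nothing qualifies (meaning no bits are spent) is a choice made for the example.

```python
def select_quantizer(d_side, candidates, t):
    """Pick a quantizer matrix index for one partition.

    d_side:     estimated distortion when only side information is used (R = 0)
    candidates: list of (index, rate, distortion) estimates per quantizer matrix
    t:          target distortion-rate ratio
    """
    best = None
    for index, rate, distortion in candidates:
        reduction = d_side - distortion
        # Criterion 1: distortion reduction per bit must exceed the target.
        if rate > 0 and reduction / rate > t:
            # Criterion 2: among those, keep the largest distortion reduction.
            if best is None or reduction > best[1]:
                best = (index, reduction)
    return None if best is None else best[0]


candidates = [(1, 10.0, 70.0), (2, 25.0, 50.0), (3, 60.0, 40.0), (4, 120.0, 35.0)]
print(select_quantizer(100.0, candidates, t=0.8))   # 3
print(select_quantizer(100.0, candidates, t=1.5))   # 2
```

Lowering t admits finer quantizers for the partition, trading bits for quality in the same way the threshold does in the pixel-domain scheme.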


Simulation Setup

QCIF Video Sequences (176x144)

Frame rate: 15 frames per second.

Partition size = 16x16 pixels (four 4x4 DCT blocks)

Four LDPCA codes to accommodate variable-size bitplanes: 396, 792, 1188, and 1584

Comparison with the following systems:
• GOP = 2
‒ H.264 Inter: I-B-I-B
‒ H.264 Intra only
‒ DISCOVER by X. Artigas et al. (LDPCA length: 1584)

Simulation Results

Up to 1.4 dB compared to DISCOVER

Operating Point                        A        B        C        D
Average Bitrate (kbps)   DISCOVER    73.60    97.64   167.01   293.73
                         AQT-DVC     71.99   103.02   168.31   290.52
Average PSNR (dB)        DISCOVER    28.71    29.93    32.38    35.51
                         AQT-DVC     28.70    29.93    32.37    35.49

Operating Point                        A        B        C        D
Average Bitrate (kbps)   DISCOVER    87.62   100.28   140.38   208.25
                         AQT-DVC     85.60    98.51   141.56   207.78
Average PSNR (dB)        DISCOVER    31.48    32.07    34.31    37.27
                         AQT-DVC     32.02    32.78    35.17    38.60

[Video comparison]
DISCOVER: compressed at 15 fps, 167.01 kbps
AQT-DVC: compressed at 15 fps, 168.31 kbps

AQT-DVC

An inaccurate estimate of the source correlation model might result in inappropriate quantization matrix selection and might cause significant R-D performance loss in AQT-DVC.
• The model estimation depends solely on the two neighboring motion-compensated Key frames.

[Figure: previous Key frame, original WZ frame, next Key frame; motion-compensated previous Key frame, side information, motion-compensated next Key frame]

eAQT-DVC Procedure

1. Encoder: Coarsely quantize and encode all DC coefficients (syndromes sent to the decoder)
2. Decoder: Decode and reconstruct all DC coefficients
3. Decoder: Estimate the Laplacian model parameters of all DCT coefficients by using the motion-compensated Key frames and the reconstructed coarsely quantized DC coefficients
4. Decoder: Use the obtained Laplacian models to estimate rate-distortion ratios for all coefficients with respect to all available quantization matrices
5. Decoder: Select the best quantization matrices (in the R-D sense), one for each partition (matrix index sent to the encoder)
6. Encoder: Receive quantization matrix index
7. Encoder: Quantize and encode all DCT coefficients (syndromes sent to the decoder)
8. Decoder: Decode and reconstruct all DCT coefficients

Simulation Results (High-motion Sequences)

Operating Point                        A        B        C        D
Average Bitrate (kbps)   DISCOVER    73.60    97.64   167.01   293.73
                         eAQT        71.43    97.62   166.48   291.63
Average PSNR (dB)        DISCOVER    28.71    29.93    32.38    35.51
                         eAQT        28.19    29.34    31.68    34.59

Operating Point                        A        B        C        D
Average Bitrate (kbps)   DISCOVER    87.62   100.28   140.38   208.25
                         eAQT        89.58   100.84   142.51   209.17
Average PSNR (dB)        DISCOVER    31.48    32.07    34.31    37.27
                         eAQT        32.02    32.74    35.65    38.61

[Video comparison]
DISCOVER: compressed at 15 fps, 167.01 kbps
eAQT-DVC: compressed at 15 fps, 166.48 kbps

Conclusion

Adaptive distributed video coding

Distortion-Rate estimation for distributed video coding

• Allows allocation of more bits to significant regions

• A bitplane selective decoding scheme for pixel-domain DVC

• An adaptive quantization for transform-domain DVC

• PSNR improvement as much as 2.0 dB on the decoded video.

• Superior visual quality on the decoded video.


Future Research Directions

• Explore a more accurate source probability model

• Variable block size locally-adaptive DVC scheme

• Improved DVC without feedback channel

• Real-time decoding

• Multi-View compression/3D TV

• Perceptual-based DVC

Related Publications

Wei-Jung Chien and Lina J. Karam, “Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization,” to appear in the IET Journal of Image Processing, Special Issue on Distributed Video Coding.

Wei-Jung Chien and Lina J. Karam, “BLAST-DVC: BitpLAne SelecTive Distributed Video Coding,” Springer Journal of Multimedia Tools and Applications, Special Issue on Distributed Video Coding, July 2009.

Wei-Jung Chien and Lina J. Karam, “AQT-DVC: Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization,” accepted to IEEE International Conference on Image Processing, 2009.

Wei-Jung Chien and Lina J. Karam, “Bitplane Selective Distributed Video Coding,” Asilomar Conference on Signals, Systems and Computers, 2008.

Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Rate-Distortion Based Selective Decoding for Pixel-Domain Distributed Video Coding,” IEEE International Conference on Image Processing, pp. 1132-1135, 2008.

Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Block Adaptive Wyner-Ziv Coding for Transform-Domain Distributed Video Coding,” IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. I-525-8, 2007.

Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Distributed Video Coding with Lossy Side Information,” IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. II-69-72, 2006.

Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Distributed Video Coding with 3-D Recursive Search Block Matching,” IEEE International Symposium on Circuits and Systems, pp. 5415-5418, 2006.

Wei-Jung Chien and Lina Karam

Wei-Jung Chien and President Obama

Thank you
