
CHAPTER 13

EXPERIMENTS, RESULTS AND ITS ANALYSIS

13.1. Mechanization

The raw video sequences used in this research work are YUV 4:2:0 sequences. Many YUV sequences of different resolutions are available on the web and are used by most researchers. These video sequences can be downloaded from websites such as http://see.xidian.edu.cn/vipsl/database_video.html, http://videocoders.com/yuv.html, http://trace.eas.asu.edu/yuv/ and http://media.xiph.org/video/derf/. In addition to the YUV sequences, two reference encoder software packages (JM16.1 and x264) are downloaded and used.
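As a small illustration of what a raw YUV 4:2:0 file contains, the sketch below computes the plane sizes of one frame and the number of frames in a file. It assumes 8-bit planar storage (one full-resolution luma plane followed by two quarter-resolution chroma planes); the function names are hypothetical and are not taken from the research code.

```python
def yuv420_plane_sizes(width, height):
    """Return (luma_bytes, chroma_bytes_per_plane, frame_bytes).

    Assumption: 8-bit planar 4:2:0, i.e. each chroma plane is
    subsampled by 2 horizontally and vertically.
    """
    if width % 2 or height % 2:
        raise ValueError("4:2:0 subsampling assumes even width and height")
    luma = width * height
    chroma = (width // 2) * (height // 2)
    return luma, chroma, luma + 2 * chroma


def frame_count(file_size_bytes, width, height):
    """Number of complete frames stored in a raw 4:2:0 file of the given size."""
    _, _, frame_bytes = yuv420_plane_sizes(width, height)
    return file_size_bytes // frame_bytes


if __name__ == "__main__":
    # CIF frame: 352 x 288 luma samples plus two 176 x 144 chroma planes.
    print(yuv420_plane_sizes(352, 288))
    print(frame_count(45_619_200, 352, 288))
```

Under these assumptions a CIF frame occupies 152064 bytes, so a 300-frame CIF sequence is a file of 45619200 bytes.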

In this research work, the architecture and design of homogeneous video transcoding are detailed. The entire transcoder is mechanized (implemented) in the MATLAB environment. The functionality of the decoder, encoder and resizer is checked as explained in this chapter. Both reference and research transcoder models are established, and the functionality and robustness of the transcoders are verified.

13.1.1. Inputs and Outputs of Transcoder

The inputs and outputs of the transcoder are shown in Fig. 13.1.

Fig. 13.1 Inputs and outputs of transcoder

[Figure 13.1 block diagram. Central block: Compressed Domain Transcoder. Labels: H.264 Bitstream (High resolution); H.264 Bitstream (Lower resolution); Output Resolution; Encoding QP; Standalone / Reuse; PSNR; File size; Complexity; Bits / frame.]

13.1.2. Functionality Verification of Compressed Domain Decoder

The output of the compressed domain decoder must comply with the H.264 Standard. The test bitstreams are generated by encoding YUV video files with the x264 encoder. There are 23 YUV videos used in the experimentation.

QCIF (176x144) – Coastguard, Foreman, Mother, Salesman

CIF (352x288) – Akiyo, Foreman, Mobile, Soccer

SD (720x576) – Animation, Ant, Balloon, Carrace, Stadium, Unknown Trailer

C640x360 – Gangnam (downloaded video)

HD720p (1280x720) – Ducks Take Off, Parkrun, Shields

HD1080p (1920x1080) – Bluesky, Dinner, Gangnam, RiverBed, Touchdown

These sequences are encoded by x264 with 8 different QPs (1, 7, 14, 21, 28, 35, 42 and 49). The resulting H.264 Baseline bitstreams are decoded by the compressed domain decoder. The compressed domain decoded frames are inverse-transformed to reconstruct spatial domain frames. The same bitstream is decoded by the JM decoder to generate reference video frames. The two sets of frames are compared as shown in Fig. 13.2. The outputs are always found to be identical, so the output of the comparator is always zero.
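The comparator step described above can be sketched as follows. This is only an illustration of the zero-difference check, not the MATLAB code used in the research work; frames are assumed to be flat sequences of 8-bit samples, and the function names are hypothetical.

```python
def max_abs_difference(frame_a, frame_b):
    """Largest absolute sample difference between two frames of equal size."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of samples")
    return max((abs(a - b) for a, b in zip(frame_a, frame_b)), default=0)


def frames_match(decoded_frames, reference_frames):
    """True only if every decoded frame equals its reference frame exactly."""
    if len(decoded_frames) != len(reference_frames):
        return False
    return all(
        max_abs_difference(a, b) == 0
        for a, b in zip(decoded_frames, reference_frames)
    )


if __name__ == "__main__":
    ours = [bytes([16, 128, 235]), bytes([17, 120, 200])]
    jm = [bytes([16, 128, 235]), bytes([17, 120, 200])]
    print(frames_match(ours, jm))  # identical frames: comparator output is zero
```

A bit-exact match is the natural criterion here because H.264 decoding is specified exactly, so any non-zero difference points to a decoder error rather than rounding noise.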

Fig. 13.2 Checking the compliance of Compressed Domain Decoder (Spatial Domain)

[Figure 13.2 block diagram. Labels: H.264 Bitstream; Compressed Domain H.264 Baseline Decoder; Parsed Information / Syntax Elements; Compressed domain decoded frame; Compressed Domain Resizer; Resize Ratio; Compressed domain resized frame; Reuse Engine; Modified Syntax Elements; Compressed Domain H.264 Baseline Encoder; Transcoded H.264 Bitstream; Spatial Domain H.264 Baseline Decoder (JM); Inverse Transform; difference output: always ZERO for H.264 Standard compliance.]

The outputs of the spatial and compressed domain decoders are also compared in the compressed domain, as shown in Fig. 13.3. The difference is always found to be zero. This confirms that the compressed domain decoder complies with the H.264 Standard.

Fig. 13.3 Checking the compliance of Compressed Domain Decoder (Compressed Domain)

In total, 23 x 8 = 184 bitstreams are decoded by the compressed domain decoder to check its functionality. For all the bitstreams, the compressed domain decoder performed as per the H.264 Standard without any error.

[Figure 13.3 block diagram. Labels: H.264 Bitstream; Compressed Domain H.264 Baseline Decoder; Parsed Information / Syntax Elements; Compressed domain decoded frame; Compressed Domain Resizer; Resize Ratio; Compressed domain resized frame; Reuse Engine; Modified Syntax Elements; Compressed Domain H.264 Baseline Encoder; Transcoded H.264 Bitstream; Spatial Domain H.264 Baseline Decoder (JM); Forward Transform; difference output: always ZERO for H.264 Standard compliance.]

13.1.3. Functionality Verification of Compressed Domain Resizer

The compressed domain resizer is tested with the above-mentioned set of YUV sequences for different resolutions. Each sequence is resized by a spatial domain resizer, which is the imresize.m function in the MATLAB Image Processing Toolbox. The same sequence is forward-transformed and resized by the compressed domain resizer. The output of this resizer is inverse-transformed to bring the result into the spatial domain. The same input and output frame resolutions are specified for both the spatial and compressed domain resizers. PSNR is calculated by comparing the results of the two resizers frame by frame, as shown in Fig. 13.4.

Fig. 13.4 Checking the functionality of Compressed Domain Resizer

The PSNR in all cases is more than 50 dB. The output resolutions used for resizing the input videos are 176x144, 352x288, 480x320, 640x480 and 720x576. The resizing ratios are arbitrary, and it is found that the compressed domain resizer works for any downsizing resolution.

[Figure 13.4 block diagram. Labels: YUV Sequence; I/P & O/P resolution; Forward Transform; Compressed domain Resizer; Inverse Transform; Spatial Domain Resizer; PSNR Calculation; PSNR > 50dB.]

13.1.4. Functionality Verification of Compressed Domain Encoder

The functionality of the compressed domain encoder is checked as shown in Fig. 13.5. The YUV sequence is forward-transformed and sent as input to the compressed domain encoder. The encoder compresses the input video into a bitstream and also produces compressed domain reconstructed frames. These compressed domain frames are inverse-transformed, and the PSNR of each frame is calculated with reference to the input YUV sequence.

Fig. 13.5 Checking the functionality of Compressed Domain Encoder

The output of the encoder, i.e., the H.264 bitstream, is decoded by the JM decoder. During decoding, the PSNR of each frame is calculated with reference to the input YUV sequence. It is found that the two sets of PSNRs match frame by frame for every component. This shows that the compressed domain encoder is H.264 compliant.

[Figure 13.5 block diagram. Labels: YUV Sequence; Forward Transform; Compressed domain Encoder; QP; H.264 bitstream; Compressed domain Reconstructed Frame; Inverse Transform; PSNR Calculation; JM Decoder; PSNR Calculation; Zero error.]

13.1.5. Creation of Reference Model

The available reference software packages (JM16.1 and x264) are standalone encoders and do not perform transcoding. A transcoding setup is therefore built around the reference software, but reuse techniques cannot be implemented during its encoding. The architecture of the reference model is shown in Fig. 13.6.

Fig. 13.6 Architecture of Reference Model (Classical Spatial Domain Transcoder by

Reference Software) with compressed domain resizer

The input H.264 bitstreams are first decoded by the reference software. In order to have a fair comparison, the compressed domain resizer is used here as well: the decoded video sequences are transformed to the compressed domain and resized by the compressed domain resizer. The outputs of the compressed domain resizer are inverse-transformed to give spatial domain resized video sequences. These video sequences are encoded by the reference software to get the required H.264 bitstream. The resizing ratio and QP are noted so that the same values are used in the research (MATLAB) model.

[Figure 13.6 block diagram. Labels: Input H.264 Bitstream; Spatial Domain Decoder; Decoded Spatial Domain Video; Forward Transform; Decoded Transform Domain Video; Compressed Domain Resizer; Resized Transform Domain Video; Inverse Transform; Resized Spatial Domain Video; Spatial Domain Encoder; Output H.264 Bitstream.]

13.1.6. Creation of Research Models

The compressed domain decoder, compressed domain resizer, reuse engine and compressed domain encoder are coded in MATLAB. There are two models in this research work, namely the Standalone Model and the Reuse Model. The standalone model is shown in Fig. 13.7. It is coded such that it complies with the H.264 Standard.

Fig. 13.7 Architecture of Research Standalone Model

The reuse model is shown in Fig. 13.8. Here the switches are closed to enable the reuse engine, which supplies modified syntax elements to the encoder. The reuse model is also coded such that it complies with the H.264 Standard.

Fig. 13.8 Architecture of Research Reuse Model

[Figure 13.7 and Figure 13.8 block diagrams. Labels (both figures): H.264 Bitstream; Compressed Domain H.264 Baseline Decoder; Parsed Information / Syntax Elements; Compressed domain decoded frame; Compressed Domain Resizer; Resize Ratio; Compressed domain resized frame; Reuse Engine; Modified Syntax Elements; Compressed Domain H.264 Baseline Encoder; Transcoded H.264 Bitstream.]

13.1.7. Compliance to H.264 Standard

Since the resized video fed to the encoder must be the same in the reference model and the research model for a fair comparison, the compressed domain resizer is used in the reference model with appropriate forward and inverse transform operations, as shown in Fig. 13.6. It is ensured that the output of the resizer in the reference model is the same as that in the research model.

The spatial domain resized video is then encoded by the reference software. The output of the spatial domain encoder complies with the H.264 Standard and is called the reference output. The syntax elements of these transcoded bitstreams are checked for compliance with the H.264 Standard by simply decoding the transcoded bitstreams with the JM decoder.

The research model is set first to the Standalone Model and then to the Reuse Model, and the transcoding is performed in each case. The output of the transcoders must comply with the H.264 Standard.

Fig. 13.9 Architecture for H.264 Standard compliance check

As shown in Fig. 13.9, the compressed domain reconstructed frames used in the research model are converted to spatial domain reconstructed frames. The transcoded H.264 bitstream is decoded to spatial domain frames by the reference decoder. These two sets of video frames are compared and found to be the same, which shows that the transcoded H.264 bitstream complies with the H.264 Standard.

[Figure 13.9 block diagram. Labels: H.264 Bitstream; Research Model (Standalone / Reuse); Compressed domain reconstructed frame; Inverse Transform; Spatial domain reconstructed frame; Transcoded H.264 Bitstream; Spatial Domain H.264 Baseline Decoder (JM); Spatial domain decoded frame; difference output: always ZERO for H.264 Standard compliance.]

13.2. Experiments

The research work shows that homogeneous video transcoding through the integer transform is possible. This section explains the experimentation with the entire video transcoding pipeline in the MATLAB environment.

For the presentation in this research work, three different CIF sequences (with 4:2:0 chroma subsampling) have been used as data to assess the efficacy of the algorithm developed. They are 1) Akiyo (low motion), 2) Foreman (camera panning and fast motion) and 3) Mobile (colourful, with various motions). Each video sequence has 300 frames, and the frame rate is 25 fps. The compression characteristics of these video sequences are shown in Fig. 13.10, which shows how the sequences differ in motion and behaviour. This plot is obtained by compressing the video sequences with the x264 software at different QPs (varying from 1 to 51). The curves are separated by a considerable distance, which indicates that the sequences have different motion characteristics.

Fig. 13.10 Characteristics of CIF sequences (Akiyo, Foreman and Mobile)

The input of the transcoder is an H.264 bitstream, which is obtained by compressing the YUV video sequences with the x264 or JM reference software. The identified videos are compressed into H.264 bitstreams by x264 with QP = 7 in Baseline Profile with one reference frame and a Group of Pictures (GOP) of 25 (i.e., intra frame interval = 25).
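The encoder settings above could be expressed as an x264 command line roughly as sketched below. The exact options used in the research work are not stated, so this argument list is an assumption built from standard x264 options (`--profile`, `--qp`, `--ref`, `--keyint`, `--fps`, `--input-res`, `-o`); the file names and the helper function are hypothetical.

```python
def x264_command(yuv_path, out_path, width, height, qp, fps=25, keyint=25):
    """Build an x264 argument list for a constant-QP Baseline encode.

    Assumed mapping of the settings in the text:
      Baseline Profile      -> --profile baseline
      QP                    -> --qp
      one reference frame   -> --ref 1
      intra frame interval  -> --keyint
    """
    return [
        "x264",
        "--profile", "baseline",
        "--qp", str(qp),
        "--ref", "1",
        "--keyint", str(keyint),
        "--fps", str(fps),
        "--input-res", f"{width}x{height}",  # needed for headerless raw YUV input
        "-o", out_path,
        yuv_path,
    ]


if __name__ == "__main__":
    # Hypothetical file names for the CIF Akiyo sequence.
    print(" ".join(x264_command("akiyo_cif.yuv", "akiyo_qp7.264", 352, 288, qp=7)))
```

Note that x264 treats `--keyint` as a maximum interval between IDR frames, so additional options may be needed if a strictly periodic intra frame is required.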

Results are obtained from the Reference Model and from the Standalone and Reuse Models of the Research Model.

[Plot for Fig. 13.10: File size (KB), 0 to 7000, vs. QP, 1 to 51; series: Akiyo, Foreman, Mobile.]

13.3. Results of Transcoder Experiments

The metrics measured as results of transcoding are listed below.

1. File size

2. Quality (in terms of PSNR)

3. Complexity (in terms of million operations per frame, mopf)

The Akiyo, Foreman and Mobile sequences are compressed by the x264 encoder with QP = 7, and the resulting H.264 bitstreams are used as input to the transcoder models. The frame size is reduced from CIF (352 x 288) to QCIF (176 x 144). The QP at the encoder is varied from 7 to 42 in steps of 7 (i.e., QP = 7, 14, 21, 28, 35, 42).

The file sizes (in kilobytes) of the transcoded bitstreams for the Reference Model and for the Standalone and Reuse Models of the Research Model are listed in Table 13.1. The quality of the output of these models for the given bitstreams is measured in terms of PSNR (in dB) and listed in Table 13.2. The complexity of these models for the given bitstreams, in terms of mopf, is listed in Table 13.3.

13.4. Analysis of Results

The results of the research model and the reference model are analysed in terms of the metrics given below.

1. Perceptual Video Quality

2. Objective Quality (PSNR) of Frames

3. Bits consumed by Frames

4. Complexity (mopf) of Frames

5. Objective Quality (PSNR) vs. Bitrate – RD-Plot

6. Quality + Bitrate + Complexity of Frames – RDC-Plot

Table 13.1 File sizes (kilobytes) of transcoded bitstreams for resizing ratio 2:1

Ref. = Reference Model; Stand. = Research Model (Standalone); Reuse = Research Model (Reuse)

QP    Akiyo                     Foreman                   Mobile
      Ref.    Stand.  Reuse     Ref.    Stand.  Reuse     Ref.    Stand.  Reuse
7     883     865     968       3023    3443    3321      5261    5354    5423
14    328     326     345       1419    1775    1651      3410    3518    3584
21    148     142     154       617     826     738       1948    2013    2068
28    61      58      57        223     332     272       842     857     629
35    27      26      25        85      134     120       241     229     191
42    15      14      10        40      58      52        71      79      89

Table 13.2 Quality (PSNR in dB) of transcoded bitstreams for resizing ratio 2:1

Ref. = Reference Model; Stand. = Research Model (Standalone); Reuse = Research Model (Reuse)

QP    Akiyo                     Foreman                   Mobile
      Ref.    Stand.  Reuse     Ref.    Stand.  Reuse     Ref.    Stand.  Reuse
7     53.5    53.8    54.0      52.6    53.2    52.9      52.3    52.8    52.7
14    48.5    48.9    48.8      46.7    47.3    47.2      45.6    46.4    46.3
21    44.2    44.3    44.4      41.8    42.1    42.0      39.5    39.8    39.6
28    39.4    38.6    38.2      37.1    37.2    36.0      33.4    33.8    32.4
35    35.1    33.6    34.6      33.3    33.1    31.1      28.2    28.2    25.2
42    31.4    30.8    31.9      30.0    28.7    29.6      24.5    22.6    20.0

Table 13.3 Complexity (mopf) of transcoded bitstreams for resizing ratio 2:1

Ref. = Reference Model; Stand. = Research Model (Standalone); Reuse = Research Model (Reuse)

QP    Akiyo                     Foreman                   Mobile
      Ref.    Stand.  Reuse     Ref.    Stand.  Reuse     Ref.    Stand.  Reuse
7     44.23   62.29   18.33     47.18   67.40   41.90     63.62   89.60   27.21
14    38.88   49.85   17.67     45.94   57.42   25.93     48.03   60.04   25.08
21    33.73   47.51   13.85     41.28   55.79   24.48     40.75   53.62   22.07
28    34.39   47.11   12.29     37.39   52.67   21.17     29.14   37.85   19.74
35    28.94   40.18   11.84     32.72   44.83   18.51     28.66   35.83   15.76
42    20.85   29.79   10.71     29.10   39.32   15.23     18.60   23.25   14.29

13.4.1. Perceptual Video Quality

The visual quality and bitrate of the research model are compared with those of the reference model. The quality of the resized video is comparable with that of the reference model, with little visible distortion. Three different input bitstreams of the same frame size, CIF (352 x 288), with QP = 7 are transcoded to QCIF (176 x 144) with QP = 14. Sample frames are shown in Fig. 13.11 to Fig. 13.13. The Akiyo sequence, a slow-motion news-reading sequence, is shown in Fig. 13.11.

[Figure 13.11 panels: (a) Input bitstream, CIF (352 x 288); (b) Reference Model, QCIF (176 x 144); (c) Research Model (Reuse), QCIF (176 x 144)]

Fig. 13.11 Frame number 1 of Akiyo Sequence a) Input bitstream b) Transcoded by reference model c) Transcoded by research model

The Foreman sequence, which is commonly used in the video compression community, is shown in Fig. 13.12. The decoded bitstream (in CIF resolution) is shown on the left side of Fig. 13.12. The video sequence is resized to QCIF resolution. The resized video is then encoded by the reference model and by the research model, and the results are shown on the right side of Fig. 13.12. Mobile is the fast-moving and colourful sequence. Its decoded frame and the frames encoded by the reference model and the research model are shown in Fig. 13.13. The perceptual visual quality of the reference and research outputs is the same for these sequences. Visual comparisons of the reference and research models for these sequences transcoded with QP = 14 are shown in Fig. 13.14 to Fig. 13.16.

[Figure 13.12 panels: (a) Input bitstream, CIF (352 x 288); (b) Reference Model, QCIF (176 x 144)]

Fig. 13.12 Frame number 90 of Foreman sequence a) Input bitstream b) Transcoded by reference model c) Transcoded by research model

[Figure 13.13 panels: (a) Input bitstream, CIF (352 x 288); (b) Reference Model, QCIF (176 x 144)]

Fig. 13.13 Frame number 238 of Mobile sequence a) Input bitstream b) Transcoded by reference model c) Transcoded by research model

[Figure 13.14 columns: Reference Model; Research Model (Standalone); Research Model (Reuse)]

Fig. 13.14 Frame number 1, 61, 121, 181 and 241 of transcoded Akiyo Sequence

Fig. 13.15 Frame number 1, 61, 121, 181 and 241 of transcoded Foreman Sequence

Fig. 13.16 Frame number 1, 61, 121, 181 and 241 of transcoded Mobile Sequence

13.4.2. Objective Quality (PSNR) of Frames

The PSNR of each frame, measured with respect to the resizer output (spatial domain), has been compared for the output bitstreams of the reference model and the research models at QP = 14. Fig. 13.17 to Fig. 13.19 indicate that the PSNR of the research model is very close to that of the reference model and that the deviation is negligible. A deviation of up to 2 dB cannot be identified by the naked eye. The PSNR of each component is calculated as in Eq. (13-1).

[Equation (13-1): expression and symbol definitions not recoverable from the source]

For a 4:2:0 sequence, the average quality, i.e., the PSNR of a frame, is calculated as a weighted average of the three components as in Eq. (13-2).

[Equation (13-2): expression not recoverable from the source]

The three terms of Eq. (13-2) are the PSNR of the Luma component, that of the Cb component and that of the Cr component.
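A per-component PSNR and a weighted frame PSNR can be sketched as below. The sketch assumes the conventional 8-bit definition, PSNR = 10 log10(255^2 / MSE), and, as a further assumption, weights of 4:1:1 for Y:Cb:Cr (the ratio of sample counts in a 4:2:0 frame). The weights actually used in Eq. (13-2) may differ, so they are left as a parameter; all function names are hypothetical.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized sample sequences."""
    if len(ref) != len(test) or len(ref) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=255):
    """PSNR in dB of one component; infinite when the planes are identical."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)


def weighted_psnr(psnr_y, psnr_cb, psnr_cr, weights=(4, 1, 1)):
    """Weighted average of component PSNRs.

    The default weights (4, 1, 1) are an assumption, not the thesis values.
    """
    wy, wcb, wcr = weights
    return (wy * psnr_y + wcb * psnr_cb + wcr * psnr_cr) / (wy + wcb + wcr)


if __name__ == "__main__":
    y_ref, y_out = [100, 100, 100, 100], [101, 99, 100, 100]
    print(round(psnr(y_ref, y_out), 2))
    print(weighted_psnr(40.0, 46.0, 46.0))
```

The same `psnr` function applies to the resizer and encoder checks earlier in the chapter, where the two inputs are the corresponding planes of the frames being compared.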

An I-frame normally takes more bits than a P-frame when a video sequence is compressed at the same QP. The spikes in Fig. 13.17 to Fig. 13.19 are the PSNRs of the I-frames in the sequence. The intra frame interval is set to 25 while encoding, so the first frame of every group of 25 frames is compressed as an I-frame. Accordingly, the spikes repeat every 25 frames.
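The positions of the I-frames, and hence of the expected PSNR spikes, follow directly from the intra frame interval. A minimal sketch, assuming frames are numbered from 1 as in the figures and that the interval is strictly periodic (the function name is hypothetical):

```python
def intra_frame_numbers(total_frames, intra_interval, first_frame=1):
    """Frame numbers coded as I-frames for a fixed intra frame interval."""
    return list(range(first_frame, first_frame + total_frames, intra_interval))


if __name__ == "__main__":
    # 300-frame sequence with an intra frame interval of 25.
    print(intra_frame_numbers(300, 25))
```

For a 300-frame sequence this gives 12 I-frames, at frames 1, 26, 51 and so on up to 276.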

The qualities of the bitstreams for different QPs are compared in Fig. 13.20 to Fig. 13.22. The PSNRs of all frames in a sequence are averaged to get the mean PSNR for a given QP. Fig. 13.20 to Fig. 13.22 plot mean PSNR vs. QP. It is observed that the mean PSNRs of the standalone and reuse research models are very close or equal to that of the reference model.


Fig. 13.17 Quality Comparison of Akiyo Sequence

Fig. 13.18 Quality Comparison of Foreman Sequence

Fig. 13.19 Quality Comparison of Mobile Sequence

[Plots for Fig. 13.17 to Fig. 13.19: PSNR (dB) vs. Frame Number; series: Reference Model, Research Model (Standalone), Research Model (Reuse)]

Fig. 13.20 Mean PSNR vs. QP of Akiyo Sequence

Fig. 13.21 Mean PSNR vs. QP of Foreman Sequence

Fig. 13.22 Mean PSNR vs. QP of Mobile Sequence

[Plots for Fig. 13.20 to Fig. 13.22: Mean PSNR (dB), 0 to 60, vs. Quantization Parameter (QP), 7 to 42]

13.4.3. Bits consumed by Frames

Fig. 13.23 to Fig. 13.25 show the bits consumed by each frame of these sequences. The results show that the research model takes about 10% more bits than the reference model for each frame. The deviation is hardly visible in the Akiyo and Mobile plots, but it is visible in the Foreman plot.

Fig. 13.23 Bits spent Comparison of Akiyo Sequence

Fig. 13.24 Bits spent Comparison of Foreman Sequence

The reasons for the research model taking more bits than the reference model are given below.

1. The reference model works in the spatial domain; the research model works in the compressed domain, where the precision is limited in mode decision and motion estimation.

2. The reference model uses rigorous sliding techniques to find the best match in motion estimation; the research model uses a restricted search technique to suit hardware implementation.

[Plots for Fig. 13.23 and Fig. 13.24: Bits per frame vs. Frame Number]

In the Foreman sequence, there is a scene transition from the 170th frame to the 220th frame. These frames use more bits to code more residue coefficients, which results in the higher quality seen in Fig. 13.18.

Fig. 13.25 Bits spent Comparison of Mobile Sequence

The bits taken by the bitstreams for different QPs are compared in Fig. 13.26 to Fig. 13.28. The bits of all frames in a sequence are summed to get the total bits (file size) for a given QP. Fig. 13.26 to Fig. 13.28 plot file size vs. QP. It is observed that the file sizes of the standalone and reuse research models are about 10% more than that of the reference model.
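The relative file-size overhead can be computed from the tabulated values as sketched below. The example uses the QP = 7 row of Table 13.1 for the Reference Model and the Research Model (Reuse); assigning the three column groups to Akiyo, Foreman and Mobile, in that order, is an assumption based on the order in which the sequences are discussed, and the function name is hypothetical.

```python
def percent_overhead(research_kb, reference_kb):
    """Extra file size of the research model relative to the reference, in percent."""
    return 100.0 * (research_kb - reference_kb) / reference_kb


# (reference, reuse) file sizes in kilobytes, QP = 7 row of Table 13.1.
# The sequence-to-column mapping is assumed, as noted in the text above.
QP7_FILE_SIZES_KB = {
    "Akiyo": (883, 968),
    "Foreman": (3023, 3321),
    "Mobile": (5261, 5423),
}

if __name__ == "__main__":
    for name, (reference, reuse) in QP7_FILE_SIZES_KB.items():
        print(f"{name}: {percent_overhead(reuse, reference):.1f}% extra")
```

A negative result from `percent_overhead` simply means the research model produced the smaller file for that entry.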

Fig. 13.26 File size vs. QP of Akiyo Sequence

[Plot for Fig. 13.25: Bits per frame vs. Frame Number]

[Plot for Fig. 13.26: File size (kilobytes), 0 to 1000, vs. QP, 7 to 42]

Fig. 13.27 File size vs. QP of Foreman Sequence

Fig. 13.28 File size vs. QP of Mobile Sequence

13.4.4. Complexity (mopf) of Frames

The computational complexity is calculated from the number of operations, such as addition, subtraction, multiplication and shifting, in the transcoding path. The complexity of decoding, resizing and encoding each frame is computed in terms of operations. The computational complexities obtained from transcoding by the reference model and by the research model (Standalone and Reuse Models) are compared in this research work. Because the research model works in the compressed domain, its complexity is 10-20% higher than that of the spatial domain reference model. With reuse techniques, however, the complexity of the research model is 70-80% lower than that of the spatial domain reference model. The complexities for the above-mentioned sequences are shown in Fig. 13.29 to Fig. 13.31.

[Plots for Fig. 13.27 and Fig. 13.28: File size (kilobytes) vs. QP, 7 to 42]

Fig. 13.29 Complexity comparison for Akiyo Sequence

Fig. 13.30 Complexity comparison for Foreman Sequence

Fig. 13.31 Complexity comparison for Mobile Sequence

[Plots for Fig. 13.29 to Fig. 13.31: Complexity (million operations per frame) vs. Frame Number]

The computational complexities of all frames in a sequence are averaged to get the mean complexity (mopf) for a given QP. The computational complexities of the bitstreams for different QPs are compared in Fig. 13.32 to Fig. 13.34, which plot mean complexity vs. QP. It is observed that the mean complexity of the research model (reuse) is 70-80% less than that of the standalone and reference models. The plots show that at least 75% of the computational complexity is saved. In slow-moving sequences like Akiyo, 80% savings are possible. Savings of 78% and 75% are observed for the Foreman and Mobile sequences respectively.

Fig. 13.32 Mean Complexity vs. QP for Akiyo Sequence

Fig. 13.33 Mean Complexity vs. QP for Foreman Sequence

[Plots for Fig. 13.32 and Fig. 13.33: Mean Complexity (mopf), 0 to 70, vs. QP, 7 to 42]

Fig. 13.34 Mean Complexity vs. QP for Mobile Sequence

13.4.5. PSNR vs. Bitrate (RD Plot)

Next, PSNR is plotted against bitrate (RD plot) to evaluate the quality. The video quality vs. bitrate of the research model is very close to that of the reference model. The comparisons for the Akiyo, Foreman and Mobile sequences are shown in Fig. 13.35 to Fig. 13.37. The quality at different bitrates is on par with that of the reference software.
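For an RD plot, each file size has to be converted to a bitrate. A minimal sketch of that conversion is given below, assuming 1 kilobyte = 1000 bytes and using the sequence length and frame rate stated earlier (300 frames at 25 fps); the base-2 logarithm is only one possible reading of the "powers of 2" bitrate axis, and the function name is hypothetical.

```python
import math


def bitrate_kbps(file_size_kb, frames, fps, bytes_per_kb=1000):
    """Average bitrate in kilobits per second for a coded sequence.

    Assumption: bytes_per_kb = 1000; pass 1024 if binary kilobytes were meant.
    """
    duration_s = frames / fps
    return file_size_kb * bytes_per_kb * 8 / duration_s / 1000


if __name__ == "__main__":
    # Akiyo, Reference Model, QP = 7: 883 kilobytes (Table 13.1).
    rate = bitrate_kbps(883, frames=300, fps=25)
    print(round(rate, 1), round(math.log2(rate), 2))
```

Pairing each bitrate with the mean PSNR at the same QP gives one point of the RD curve for a model.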

Fig. 13.35 Quality vs. Bitrate of Akiyo Sequence

[Plot for Fig. 13.34: Mean Complexity (mopf), 0 to 100, vs. QP, 7 to 42]

[Plot for Fig. 13.35: PSNR (dB), 0 to 60, vs. Bitrate (in powers of 2) (kb)]

Fig. 13.36 Quality vs. Bitrate of Foreman Sequence

Fig. 13.37 Quality vs. Bitrate of Mobile Sequence

13.4.6. Rate-Distortion-Complexity (RDC) Plot

A new metric called the RDC-plot is introduced in this research work. The bits consumed, the quality in terms of SSE and the complexity in mopf are combined for each frame as follows.

[Equation (13-2): RDC expression and symbol definitions not recoverable from the source; one of its terms is listed in Table A1.6 (Annexure – I)]

The RDC-plots for the same sequences are shown in Fig. 13.38 to Fig. 13.40. They indicate that the RDC parameter of the research work is always less than that of the reference model. The research model saves 30% to 50% of the overall computation with optimized rate and quality.

[Plots for Fig. 13.36 and Fig. 13.37: PSNR (dB), 0 to 60, vs. Bitrate]

Fig. 13.38 Rate-Distortion-Complexity comparison for Akiyo Sequence

Fig. 13.39 Rate-Distortion-Complexity comparison for Foreman Sequence

Fig. 13.40 Rate-Distortion-Complexity comparison for Mobile Sequence

[Plots for Fig. 13.38 to Fig. 13.40: RDC Cost vs. QP, 7 to 42]

13.5. Results of different resizing ratios

The transcoder is tested with different resizing ratios in order to test the stability and robustness of the compressed domain resizer. Two different resizing cases are presented here: 1) 1280x720 to 720x576, i.e., HD720p to SD, and 2) 720x576 to 480x320, i.e., SD to 320p. The width-wise resizing ratios are 1.778:1 and 1.5:1, and the height-wise resizing ratios are 1.25:1 and 1.8:1 respectively. This shows that the compressed domain resizer also handles arbitrary resizing ratios.
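The quoted ratios follow directly from the input and output resolutions, as the short sketch below shows (the function name is hypothetical).

```python
def resize_ratios(src, dst):
    """Return (width_ratio, height_ratio) for a (width, height) downsizing."""
    (src_w, src_h), (dst_w, dst_h) = src, dst
    return src_w / dst_w, src_h / dst_h


if __name__ == "__main__":
    for src, dst in [((1280, 720), (720, 576)), ((720, 576), (480, 320))]:
        w_ratio, h_ratio = resize_ratios(src, dst)
        print(f"{src} -> {dst}: width {w_ratio:.3f}:1, height {h_ratio:.3f}:1")
```

In both cases the width and height ratios differ from each other and are not integers, which is what makes these cases a test of arbitrary-ratio resizing.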

Table 13.4 File sizes (kilobytes) of transcoded bitstreams for different resizing ratios

Ref. = Reference Model; Stand. = Research Model (Standalone); Reuse = Research Model (Reuse)

QP    Shields (HD to SD)          Stadium (SD to 320p)
      Ref.     Stand.   Reuse     Ref.     Stand.   Reuse
7     13875    10055    8207      17482    19282    20036
14    5720     5953     4225      9808     10818    11241
21    3636     3235     2976      4610     5085     5283
28    1026     1183     879       1607     1772     1842
35    504      543      521       500      551      573
42    255      250      246       237      261      272

Table 13.5 Quality (PSNR in dB) of transcoded bitstreams for different resizing ratios

Ref. = Reference Model; Stand. = Research Model (Standalone); Reuse = Research Model (Reuse)

QP    Shields (HD to SD)             Stadium (SD to 320p)
      Ref.      Stand.    Reuse      Ref.      Stand.    Reuse
7     53.2563   53.1901   52.7249    52.3092   52.5705   52.4010
14    46.6552   46.6485   46.3962    45.7937   46.0225   45.8741
21    41.5879   41.4440   41.4158    39.9838   40.1835   40.0540
28    37.0409   36.8473   36.1874    34.1302   34.3007   34.1901
35    33.0036   32.5472   31.9388    29.4681   29.6153   29.5198
42    30.4831   29.1440   28.5534    26.8067   26.9406   26.8537

Table 13.6 Complexity (mopf) of transcoded bitstreams for different resizing ratios

Ref. = Reference Model; Stand. = Research Model (Standalone); Reuse = Research Model (Reuse)

QP    Shields (HD to SD)                Stadium (SD to 320p)
      Ref.       Stand.     Reuse       Ref.       Stand.     Reuse
7     193.6453   261.8464   97.1455     100.3472   139.6943   49.3139
14    121.9546   159.4850   60.1628     87.3475    121.5973   42.9254
21    106.2155   136.3968   45.9893     72.9563    101.5632   35.8531
28    97.8327    110.9638   39.0806     54.2751    75.5569    26.6726
35    94.8630    105.5490   35.2117     50.9672    70.9519    25.0470
42    91.9594    101.4523   32.5880     45.8765    63.8651    22.5452

One frame sampled from the input 1280x720 resolution bitstream is shown in Fig. 13.41. A sample frame from the output 720x576 resolution bitstream is shown in Fig. 13.42.

Fig. 13.41 Sample frame from HD1280x720 Shields Sequence

Fig. 13.42 Sample frame from SD720x576 transcoded Sequence


One frame sampled from the input 720x576 resolution bitstream is shown in Fig. 13.43. A sample frame from the output 480x320 resolution bitstream is shown in Fig. 13.44.

Fig. 13.43 Sample frame from SD720x576 Stadium Sequence

Fig. 13.44 Sample frame from 480x320 transcoded Sequence


The PSNR plots of the reference model, standalone model and reuse model for these two sequences at QP = 14 are shown in Fig. 13.45 and Fig. 13.46.

Fig. 13.45 Quality Comparison of Shields Sequence

Fig. 13.46 Quality Comparison of Stadium Sequence

[Plots for Fig. 13.45 and Fig. 13.46: PSNR (dB) vs. Frame Number]

