CHAPTER 13
EXPERIMENTS, RESULTS AND ITS ANALYSIS
13.1. Mechanization
The raw video sequences used in this research work are YUV 4:2:0 sequences. Many
YUV sequences of different resolutions are available online and are used by most
researchers. These video sequences can be downloaded from websites such as
http://see.xidian.edu.cn/vipsl/database_video.html, http://videocoders.com/yuv.html,
http://trace.eas.asu.edu/yuv/ and http://media.xiph.org/video/derf/. In addition to the YUV
sequences, two reference encoders (JM16.1 and x264) are downloaded and used.
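Because every raw input is a planar YUV 4:2:0 file, the frame boundaries follow directly from the resolution. The sketch below is illustrative only: the research transcoder itself is implemented in MATLAB, the helper names `yuv420_frame_bytes` and `split_yuv420_frame` are hypothetical, and 8-bit samples with even frame dimensions are assumed.

```python
def yuv420_frame_bytes(width, height):
    """Bytes in one 8-bit planar YUV 4:2:0 frame (even dimensions assumed)."""
    luma = width * height
    # Cb and Cr are each subsampled by 2 horizontally and vertically.
    chroma = (width // 2) * (height // 2)
    return luma + 2 * chroma


def split_yuv420_frame(buf, width, height):
    """Split the bytes of one frame into its Y, Cb and Cr planes."""
    if len(buf) != yuv420_frame_bytes(width, height):
        raise ValueError("buffer length does not match the frame size")
    luma = width * height
    chroma = (width // 2) * (height // 2)
    y = buf[:luma]
    cb = buf[luma:luma + chroma]
    cr = buf[luma + chroma:]
    return y, cb, cr
```

For a CIF frame (352x288) this gives 101376 luma bytes plus two chroma planes of 25344 bytes each.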
In this research work, the architecture and design of homogeneous video transcoding
are detailed. The entire transcoder is implemented in the MATLAB environment. The
functionality of the decoder, encoder and resizer is checked as explained in this
chapter. Both reference and research transcoder models are established, and the
functionality and robustness of the transcoders are verified.
13.1.1. Inputs and Outputs of Transcoder
Inputs and outputs of the research work are shown in Fig. 13.1.
Fig. 13.1 Inputs and outputs of transcoder
[Block diagram: the Compressed Domain Transcoder receives the H.264 bitstream (high resolution), the output resolution, the encoding QP and the mode (Standalone / Reuse); it produces the H.264 bitstream (lower resolution), PSNR, file size, complexity and bits / frame.]
13.1.2. Functionality Verification of Compressed Domain Decoder
The output of the compressed domain decoder must comply with the H.264 Standard. The
test bitstreams are generated by encoding YUV video files with the x264 encoder. There are 23
YUV videos used in the experiments.
QCIF (176x144) – Coastguard, Foreman, Mother, Salesman
CIF (352x288) – Akiyo, Foreman, Mobile, Soccer
SD (720x576) – Animation, Ant, Balloon, Carrace, Stadium, Unknown Trailer
640x360 – Gangnam (downloaded video)
HD720p (1280x720) – Ducks Take Off, Parkrun, Shields
HD1080p (1920x1080) – Bluesky, Dinner, Gangnam, RiverBed, Touchdown
These sequences are encoded by x264 with 8 different QPs (1, 7, 14, 21, 28, 35,
42 and 49). The resulting H.264 Baseline bitstreams are decoded by the compressed domain
decoder. The compressed domain decoded frames are inverse-transformed to
reconstruct spatial domain frames. The same bitstream is decoded by the JM decoder to
generate reference video frames. The two sets of frames are compared as shown in Fig. 13.2.
The outputs are always found to be identical, so the output of the comparator is zero.
Fig. 13.2 Checking the compliance of Compressed Domain Decoder (Spatial Domain)
[Block diagram: the H.264 bitstream is decoded by the Compressed Domain H.264 Baseline Decoder; the compressed domain decoded frame is inverse-transformed and subtracted from the output of the Spatial Domain H.264 Baseline Decoder (JM). The difference is always ZERO for H.264 Standard compliance. The other transcoder blocks are also drawn: Compressed Domain Resizer (resize ratio, compressed domain resized frame), Reuse Engine (parsed information, syntax elements, modified syntax elements) and Compressed Domain H.264 Baseline Encoder (transcoded H.264 bitstream).]
The outputs of the spatial and compressed domain decoders are also compared in the
compressed domain, as shown in Fig. 13.3. The difference is always found to be zero, which
confirms that the compressed domain decoder complies with the H.264 Standard.
In total, 23 x 8 = 184 bitstreams are decoded by the compressed domain decoder to
check its functionality. For all of these bitstreams, the compressed domain decoder
performed as per the H.264 Standard without any error.
Fig. 13.3 Checking the compliance of Compressed Domain Decoder (Compressed Domain)
[Block diagram: the H.264 bitstream is decoded by the Compressed Domain H.264 Baseline Decoder; the output of the Spatial Domain H.264 Baseline Decoder (JM) is forward-transformed and subtracted from the compressed domain decoded frame. The difference is always ZERO for H.264 Standard compliance. The other transcoder blocks are also drawn: Compressed Domain Resizer (resize ratio, compressed domain resized frame), Reuse Engine (parsed information, syntax elements, modified syntax elements) and Compressed Domain H.264 Baseline Encoder (transcoded H.264 bitstream).]
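The test matrix of Section 13.1.2 and the zero-difference criterion used by the comparator can be sketched as follows. This is an illustrative outline, not the MATLAB code used in the research work: the names `SEQUENCES`, `QPS`, `build_test_matrix` and `is_bit_exact` are hypothetical, and frames are modelled as plain nested lists of sample values.

```python
# Sequences and QPs as listed in Section 13.1.2.
SEQUENCES = {
    "QCIF (176x144)": ["Coastguard", "Foreman", "Mother", "Salesman"],
    "CIF (352x288)": ["Akiyo", "Foreman", "Mobile", "Soccer"],
    "SD (720x576)": ["Animation", "Ant", "Balloon", "Carrace",
                     "Stadium", "Unknown Trailer"],
    "640x360": ["Gangnam"],
    "HD720p (1280x720)": ["Ducks Take Off", "Parkrun", "Shields"],
    "HD1080p (1920x1080)": ["Bluesky", "Dinner", "Gangnam",
                            "RiverBed", "Touchdown"],
}
QPS = [1, 7, 14, 21, 28, 35, 42, 49]


def build_test_matrix():
    """Return every (resolution, sequence, QP) combination to be decoded."""
    return [(res, name, qp)
            for res, names in SEQUENCES.items()
            for name in names
            for qp in QPS]


def is_bit_exact(frame_a, frame_b):
    """True when the sample-wise difference of two frames is all zero."""
    if len(frame_a) != len(frame_b):
        return False
    for row_a, row_b in zip(frame_a, frame_b):
        if len(row_a) != len(row_b):
            return False
        if any(a - b != 0 for a, b in zip(row_a, row_b)):
            return False
    return True
```

A compliance run would call `is_bit_exact` on each pair of frames (compressed domain decoder output after the inverse transform versus JM decoder output) for every entry of `build_test_matrix()`.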
13.1.3. Functionality Verification of Compressed Domain Resizer
The compressed domain resizer is tested with the above-mentioned set of YUV
sequences at different resolutions. Each sequence is resized by a spatial domain resizer,
namely the imresize.m function of the MATLAB Image Processing Toolbox. The same sequence
is forward-transformed and resized by the compressed domain resizer, and the output of that
resizer is inverse-transformed to bring the result into the spatial domain. The input and
output frame resolutions are the same for both the spatial and compressed domain resizers.
PSNR is calculated by comparing the results of the two resizers frame by frame, as shown in
Fig. 13.4.
The PSNR is more than 50 dB in all cases. The resolutions used for
resizing the input videos are 176x144, 352x288, 480x320, 640x480 and 720x576. The
resizing ratios are arbitrary, and it is found that the compressed domain resizer works for any
downsizing resolution.
Fig. 13.4 Checking the functionality of Compressed Domain Resizer
[Block diagram: the YUV sequence is forward-transformed, resized by the Compressed Domain Resizer and inverse-transformed; the same YUV sequence is resized by the Spatial Domain Resizer; both resizers receive the same I/P & O/P resolution; the PSNR calculation between the two results gives PSNR > 50 dB.]
13.1.4. Functionality Verification of Compressed Domain Encoder
The functionality of the compressed domain encoder is checked as shown in Fig. 13.5.
The YUV sequence is forward-transformed and sent as input to the compressed domain
encoder. The encoder compresses the input video into a bitstream and also produces compressed
domain reconstructed frames. These compressed domain frames are inverse-transformed, and the
PSNR of each frame is calculated with reference to the input YUV sequence.
The output of the encoder, i.e., the H.264 bitstream, is decoded by the JM decoder. During
decoding, the PSNR of each frame is calculated with reference to the input YUV sequence. The
PSNRs are found to match frame by frame and component by component, which shows that the
compressed domain encoder is H.264 compliant.
Fig. 13.5 Checking the functionality of Compressed Domain Encoder
[Block diagram: the YUV sequence is forward-transformed and encoded by the Compressed Domain Encoder at the given QP; the compressed domain reconstructed frame is inverse-transformed and passed to a PSNR calculation; the H.264 bitstream is decoded by the JM Decoder and passed to a second PSNR calculation; the two PSNR results show zero error.]
13.1.5. Creation of Reference Model
The available reference software (JM16.1 and x264) are standalone encoders
and do not perform transcoding. A transcoding setup is therefore built around the reference
software. However, it is not possible to implement reuse techniques while encoding with it. The
architecture of the reference model is shown in Fig. 13.6.
The input H.264 bitstreams are first decoded by the reference software. For a
fair comparison, the compressed domain resizer is used here as well: the decoded video
sequences are transformed to the compressed domain and resized by the compressed domain
resizer. The outputs of the compressed domain resizer are then inverse-transformed to give
spatial domain resized video sequences. These video sequences are encoded by the reference
software to obtain the required H.264 bitstream. The resizing ratio and QP are noted so that
the same values can be used in the research (MATLAB) model.
Fig. 13.6 Architecture of Reference Model (Classical Spatial Domain Transcoder by
Reference Software) with compressed domain resizer
[Block diagram, in processing order: Input H.264 Bitstream, Spatial Domain Decoder, Decoded Spatial Domain Video, Forward Transform, Decoded Transform Domain Video, Compressed Domain Resizer, Resized Transform Domain Video, Inverse Transform, Resized Spatial Domain Video, Spatial Domain Encoder, Output H.264 Bitstream.]
13.1.6. Creation of Research Models
The compressed domain decoder, compressed domain resizer, reuse engine and
compressed domain encoder are coded in MATLAB. There are two models in this research
work, namely the Standalone Model and the Reuse Model. The standalone model is shown in Fig.
13.7. It is coded such that it complies with the H.264 Standard.
Fig. 13.7 Architecture of Research Standalone Model
The reuse model is shown in Fig. 13.8. Here the switches are closed to enable the reuse
engine, which supplies modified syntax elements to the encoder. The reuse model is likewise
coded such that it complies with the H.264 Standard.
Fig. 13.8 Architecture of Research Reuse Model
[Block diagrams for Fig. 13.7 and Fig. 13.8, blocks and signals: H.264 Bitstream; Compressed Domain H.264 Baseline Decoder; Parsed Information and Syntax Elements; Compressed domain decoded frame; Compressed Domain Resizer; Resize Ratio; Compressed domain resized frame; Reuse Engine; Modified Syntax Elements; Compressed Domain H.264 Baseline Encoder; Transcoded H.264 Bitstream.]
13.1.7. Compliance to H.264 Standard
Since the resized video fed to the encoder must be the same in the reference model and the
research model for a fair comparison, the compressed domain resizer is used in the reference
model with appropriate forward and inverse transform operations, as shown in Fig. 13.6. It is
verified that the output of the resizer in the reference model is the same as that in the
research model.
The spatial domain resized video is then encoded by the reference software. The output
of the spatial domain encoder complies with the H.264 Standard and is called the
reference output. The syntax elements of these transcoded bitstreams are checked and
found to comply with the H.264 Standard simply by decoding the transcoded bitstreams with the
JM decoder.
The research model is set first to the Standalone Model and then to the Reuse Model, and
transcoding is performed in each case. The output of the transcoders must comply with the
H.264 Standard.
Fig. 13.9 Architecture for H.264 Standard compliance check
As shown in Fig. 13.9, the compressed domain reconstructed frames used
in the research model are converted to spatial domain reconstructed frames. The transcoded
H.264 bitstream is decoded to spatial domain frames by the reference decoder. The two sets of
video frames are compared and found to be identical, which shows that the transcoded H.264
bitstream complies with the H.264 Standard.
[Block diagram: the Research Model (Standalone / Reuse) takes the H.264 bitstream and outputs the transcoded H.264 bitstream and the compressed domain reconstructed frame; the reconstructed frame is inverse-transformed into the spatial domain reconstructed frame; the transcoded bitstream is decoded by the Spatial Domain H.264 Baseline Decoder (JM) into the spatial domain decoded frame; the difference between the two frames is always ZERO for H.264 Standard compliance.]
13.2. Experiments
Homogeneous video transcoding through the integer transform is shown to be possible
in this research work. This section explains the experiments carried out on the entire video
transcoding pipeline in the MATLAB environment.
For presentation in this research work, three CIF sequences (with
4:2:0 chroma subsampling) are used as data to assess the efficacy of the algorithm
developed: 1) Akiyo (low motion), 2) Foreman (camera panning and
fast motion) and 3) Mobile (colourful, with various kinds of motion). Each video sequence
has 300 frames at a frame rate of 25 fps. The compression characteristics of these
video sequences are shown in Fig. 13.10, which illustrates the motion / behavioural
differences between the sequences. The plot is obtained by compressing the sequences with
the x264 software at different QPs (varying from 1 to 51). The curves are separated by a
considerable distance, which indicates that the sequences have different motion
characteristics.
Fig. 13.10 Characteristics of CIF sequences (Akiyo, Foreman and Mobile)
[Plot: File size (KB) vs. QP (1 to 51); series: Akiyo, Foreman, Mobile]
The input of the transcoder is an H.264 bitstream obtained by compressing the
YUV video sequences with the x264 or JM reference software. The selected videos are
compressed into H.264 bitstreams by x264 with QP = 7 in Baseline Profile with one
reference frame and a Group of Pictures (GOP) of 25 (i.e., intra frame interval = 25).
Results are obtained from the Reference Model and from the Standalone and Reuse Models of
the Research Model.
13.3. Results of Transcoder Experiments
The metrics measured as results of transcoding are listed below.
1. File size
2. Quality (in terms of PSNR)
3. Complexity (in terms of million operations per frame, mopf)
The Akiyo, Foreman and Mobile sequences are compressed by the x264 encoder with QP =
7. The resulting H.264 bitstreams are used as input to the transcoder models. The frame size
is reduced from CIF (352 x 288) to QCIF (176 x 144). The QP at the encoder is varied from 7
to 42 in steps of 7 (i.e., QP = 7, 14, 21, 28, 35, 42).
The file sizes (in kilobytes) of the transcoded bitstreams for the Reference Model and for
the Standalone and Reuse Models of the Research Model are listed in Table 13.1. The
quality of the output of those models for the given bitstreams is measured in terms of PSNR
(in dB) and listed in Table 13.2. The complexity in terms of mopf of those models for the
given bitstreams is listed in Table 13.3.
13.4. Analysis of Results
The results of the research model and the reference model are analysed in terms of the
metrics given below.
1. Perceptual Video Quality
2. Objective Quality (PSNR) of Frames
3. Bits consumed by Frames
4. Complexity (mopf) of Frames
5. Objective Quality (PSNR) vs. Bitrate – RD-Plot
6. Quality + Bitrate + Complexity of Frames – RDC-Plot
Table 13.1 File sizes (kilobytes) of transcoded bitstreams for resizing ratio 2:1
(Ref. = Reference Model; SA = Research Model (Standalone); Reuse = Research Model (Reuse))
      |        Akiyo         |       Foreman        |        Mobile
QP    | Ref.   SA     Reuse  | Ref.   SA     Reuse  | Ref.   SA     Reuse
7     | 883    865    968    | 3023   3443   3321   | 5261   5354   5423
14    | 328    326    345    | 1419   1775   1651   | 3410   3518   3584
21    | 148    142    154    | 617    826    738    | 1948   2013   2068
28    | 61     58     57     | 223    332    272    | 842    857    629
35    | 27     26     25     | 85     134    120    | 241    229    191
42    | 15     14     10     | 40     58     52     | 71     79     89
Table 13.2 Quality (PSNR in dB) of transcoded bitstreams for resizing ratio 2:1
(Ref. = Reference Model; SA = Research Model (Standalone); Reuse = Research Model (Reuse))
      |        Akiyo         |       Foreman        |        Mobile
QP    | Ref.   SA     Reuse  | Ref.   SA     Reuse  | Ref.   SA     Reuse
7     | 53.5   53.8   54.0   | 52.6   53.2   52.9   | 52.3   52.8   52.7
14    | 48.5   48.9   48.8   | 46.7   47.3   47.2   | 45.6   46.4   46.3
21    | 44.2   44.3   44.4   | 41.8   42.1   42.0   | 39.5   39.8   39.6
28    | 39.4   38.6   38.2   | 37.1   37.2   36.0   | 33.4   33.8   32.4
35    | 35.1   33.6   34.6   | 33.3   33.1   31.1   | 28.2   28.2   25.2
42    | 31.4   30.8   31.9   | 30.0   28.7   29.6   | 24.5   22.6   20.0
Table 13.3 Complexity (mopf) of transcoded bitstreams for resizing ratio 2:1
(Ref. = Reference Model; SA = Research Model (Standalone); Reuse = Research Model (Reuse))
      |        Akiyo         |       Foreman        |        Mobile
QP    | Ref.   SA     Reuse  | Ref.   SA     Reuse  | Ref.   SA     Reuse
7     | 44.23  62.29  18.33  | 47.18  67.40  41.90  | 63.62  89.60  27.21
14    | 38.88  49.85  17.67  | 45.94  57.42  25.93  | 48.03  60.04  25.08
21    | 33.73  47.51  13.85  | 41.28  55.79  24.48  | 40.75  53.62  22.07
28    | 34.39  47.11  12.29  | 37.39  52.67  21.17  | 29.14  37.85  19.74
35    | 28.94  40.18  11.84  | 32.72  44.83  18.51  | 28.66  35.83  15.76
42    | 20.85  29.79  10.71  | 29.10  39.32  15.23  | 18.60  23.25  14.29
13.4.1. Perceptual Video Quality
The visual quality and bitrate of the research work are compared with those of the
reference model. The quality of the resized video is comparable with that of the reference
model, with little distortion. Three input bitstreams of the same frame size, CIF (352 x 288),
encoded with QP = 7, are transcoded to QCIF (176 x 144) with QP = 14. Sample frames are shown
in Fig. 13.11 to Fig. 13.13. The Akiyo sequence, a slow-motion news-reading sequence, is
shown in Fig. 13.11.
Fig. 13.11 Frame number 1 of Akiyo sequence: a) input bitstream, CIF (352 x 288); b) transcoded by reference model, QCIF (176 x 144); c) transcoded by research model (Reuse), QCIF (176 x 144)
The Foreman sequence, which is commonly used in the video compression community, is shown
in Fig. 13.12. The decoded bitstream (in CIF resolution) is shown on the left side of Fig. 13.12.
The video sequence is resized to QCIF resolution, and the resized video is then encoded by the
reference model and by the research model; the results are shown on the right side of Fig. 13.12.
Mobile is the fast-moving and colourful sequence. The decoded frame and the frames encoded by
the reference model and the research model for the Mobile sequence are shown in Fig. 13.13. The
perceptual visual qualities of these sequences are the same. Visual comparisons of the reference
and research models for these sequences, transcoded with QP = 14, are shown in Fig. 13.14 to Fig.
13.16.
Fig. 13.12 Frame number 90 of Foreman sequence: a) input bitstream, CIF (352 x 288); b) transcoded by reference model, QCIF (176 x 144); c) transcoded by research model (Reuse), QCIF (176 x 144)
Fig. 13.13 Frame number 238 of Mobile sequence: a) input bitstream, CIF (352 x 288); b) transcoded by reference model, QCIF (176 x 144); c) transcoded by research model (Reuse), QCIF (176 x 144)
Fig. 13.14 Frame numbers 1, 61, 121, 181 and 241 of transcoded Akiyo sequence (columns: Reference Model, Research Model (Standalone), Research Model (Reuse))
Fig. 13.15 Frame numbers 1, 61, 121, 181 and 241 of transcoded Foreman sequence (columns: Reference Model, Research Model (Standalone), Research Model (Reuse))
Fig. 13.16 Frame numbers 1, 61, 121, 181 and 241 of transcoded Mobile sequence (columns: Reference Model, Research Model (Standalone), Research Model (Reuse))
13.4.2. Objective Quality (PSNR) of Frames
For each frame, the PSNR of the output bitstreams of the reference model and the research
model at QP = 14 is computed with respect to the resizer output (spatial domain) and compared.
Fig. 13.17 to Fig. 13.19 indicate that the PSNR of the research model is very close to that of
the reference model; the deviation is negligible. A deviation of less than about 2 dB is not
noticeable to the naked eye.
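PSNR for each component is calculated as follows.
$$PSNR = 10 \log_{10}\left(\frac{255^2}{MSE}\right) \qquad (13\text{-}1)$$
where $MSE$ is the mean squared error between the reference samples and the transcoded samples of that component.
For a 4:2:0 sequence, the average quality (PSNR) of a frame is calculated as a weighted
average of the three component PSNRs (Eq. 13-2), where $PSNR_Y$ is the PSNR of the luma
component, $PSNR_{Cb}$ is that of the Cb component and $PSNR_{Cr}$ is that of the Cr component.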
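A minimal sketch of the per-component PSNR and a frame-level weighted average is given below. It assumes 8-bit samples (peak value 255) and models each plane as a nested list. The function names are hypothetical, and the default 4:1:1 weighting is an assumption based on the relative sample counts of a 4:2:0 frame; the weights actually used in Eq. (13-2) are not reproduced here, so they are left as a parameter.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized planes (nested lists)."""
    total = 0
    count = 0
    for ref_row, test_row in zip(ref, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count


def psnr(ref, test, peak=255):
    """PSNR in dB; returns infinity when the two planes are identical."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / err)


def frame_psnr_420(psnr_y, psnr_cb, psnr_cr, weights=(4, 1, 1)):
    """Weighted average of the three component PSNRs.

    ASSUMPTION: the default 4:1:1 weights follow the 4:2:0 sample counts
    (four luma samples per Cb/Cr pair); they are illustrative only.
    """
    w_y, w_cb, w_cr = weights
    return (w_y * psnr_y + w_cb * psnr_cb + w_cr * psnr_cr) / (w_y + w_cb + w_cr)
```

Identical planes give an infinite PSNR, which corresponds to the zero comparator output used in the compliance checks of Section 13.1.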
An I-frame normally takes more bits than a P-frame when a video sequence is compressed
at the same QP. The spikes in Fig. 13.17 to Fig. 13.19 are the PSNRs of the I-frames
in the sequence. The intra frame interval is set to 25 while encoding, so the first frame of
every 25 frames is compressed as an I-frame. Accordingly, the spikes repeat every
25 frames.
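The positions at which such spikes are expected can be listed with a short sketch. The function name is hypothetical, frames are numbered from 1 as in the plots, and a strictly fixed intra interval is assumed (no additional I-frames inserted by the encoder).

```python
def intra_frame_numbers(total_frames, intra_interval):
    """1-based frame numbers coded as I-frames for a fixed intra interval."""
    return list(range(1, total_frames + 1, intra_interval))
```

For a 300-frame sequence with an intra frame interval of 25, this yields frames 1, 26, 51 and so on, up to frame 276.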
The qualities of the bitstreams for different QPs are compared in Fig. 13.20 to Fig.
13.22. The PSNRs of the entire sequence are averaged to get the mean PSNR for a given QP, and
Fig. 13.20 to Fig. 13.22 plot mean PSNR against QP. It is observed that the mean
PSNRs of the research models (standalone and reuse) are very close or equal to that of the
reference model.
Fig. 13.17 Quality Comparison of Akiyo Sequence
Fig. 13.18 Quality Comparison of Foreman Sequence
Fig. 13.19 Quality Comparison of Mobile Sequence
[Plots for Fig. 13.17 to Fig. 13.19: PSNR (dB) vs. Frame Number; series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
Fig. 13.20 Mean PSNR vs. QP of Akiyo Sequence
Fig. 13.21 Mean PSNR vs. QP of Foreman Sequence
Fig. 13.22 Mean PSNR vs. QP of Mobile Sequence
[Plots for Fig. 13.20 to Fig. 13.22: Mean PSNR (dB) vs. Quantization Parameter (QP = 7, 14, 21, 28, 35, 42); series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
13.4.3. Bits consumed by Frames
Fig. 13.23 to Fig. 13.25 show the bits consumed by each frame of those
sequences. The results show that the research model takes about 10% more bits than the
reference model for each frame. The deviation is hardly visible in the Akiyo and Mobile plots,
but it is visible in the Foreman plot.
Fig. 13.23 Bits spent Comparison of Akiyo Sequence
Fig. 13.24 Bits spent Comparison of Foreman Sequence
[Plots for Fig. 13.23 and Fig. 13.24: Bits per frame vs. Frame Number; series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
The reasons why the research model takes more bits than the reference model are given below.
1. The reference model works in the spatial domain, whereas the research model works in the
compressed domain, where the precision of mode decision and motion
estimation is limited.
2. The reference model uses a rigorous sliding search to find the best match in
motion estimation, whereas the research model uses a restricted search technique to
suit hardware implementation.
In the Foreman sequence, there is a scene transition from the 170th frame to the 220th frame.
These frames use more bits to code more residual coefficients, which results in the
higher quality seen in Fig. 13.18.
Fig. 13.25 Bits spent Comparison of Mobile Sequence
[Plot for Fig. 13.25: Bits per frame vs. Frame Number; series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
The bits taken by the bitstreams for different QPs are compared in Fig. 13.26 to Fig.
13.28. The bits of the entire sequence are summed to get the total bits (file size) for a given
QP, and Fig. 13.26 to Fig. 13.28 plot file size against QP. It is observed that the file
sizes of the research models (standalone and reuse) are about 10% larger than that of the
reference model.
Fig. 13.26 File size vs. QP of Akiyo Sequence
Fig. 13.27 File size vs. QP of Foreman Sequence
Fig. 13.28 File size vs. QP of Mobile Sequence
[Plots for Fig. 13.26 to Fig. 13.28: File size (kilobytes) vs. Quantization Parameter (QP = 7, 14, 21, 28, 35, 42); series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
13.4.4. Complexity (mopf) of Frames
The computational complexity is calculated from the number of operations, such as
addition, subtraction, multiplication and shifting, in the transcoding path. The complexity of
decoding, resizing and encoding each frame is computed in terms of operations. The
computational complexities obtained from transcoding by the reference model and
by the research model (Standalone and Reuse Models) are compared in this research work.
Because the research model works in the compressed domain, its complexity is 10-20% higher
than that of the spatial domain reference model. With reuse techniques, however, the complexity
of the research model is 70-80% lower than that of the spatial domain reference model. The
complexities of the above-mentioned sequences are shown in Fig. 13.29 to Fig. 13.31.
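To illustrate what counting operations means in practice, the sketch below tallies the additions, subtractions and shifts of one 4x4 H.264 forward core transform in its usual butterfly form and converts a per-block count into million operations per frame (mopf). It is only an illustration: the counting rules used in the research work are not detailed here, the class and function names are hypothetical, and scaling and quantization are omitted.

```python
class OpCounter:
    """Tallies arithmetic operations by type."""

    def __init__(self):
        self.adds = 0    # additions and subtractions
        self.shifts = 0  # bit shifts

    def total(self):
        return self.adds + self.shifts


def forward_1d(x, ops):
    """4-point forward core transform: 8 additions/subtractions, 2 shifts."""
    a0 = x[0] + x[3]
    a1 = x[1] + x[2]
    a2 = x[1] - x[2]
    a3 = x[0] - x[3]
    y0 = a0 + a1
    y1 = (a3 << 1) + a2
    y2 = a0 - a1
    y3 = a3 - (a2 << 1)
    ops.adds += 8
    ops.shifts += 2
    return [y0, y1, y2, y3]


def forward_4x4(block, ops):
    """2-D core transform: transform the rows, then the columns."""
    rows = [forward_1d(row, ops) for row in block]
    cols = [forward_1d([rows[i][j] for i in range(4)], ops) for j in range(4)]
    # cols[j][i] holds coefficient (i, j); return it in row-major order.
    return [[cols[j][i] for j in range(4)] for i in range(4)]


def mopf(ops_per_block, blocks_per_frame):
    """Convert a per-block operation count to million operations per frame."""
    return ops_per_block * blocks_per_frame / 1e6
```

Under these assumptions one 4x4 block costs 64 additions or subtractions and 16 shifts, so the 1584 luma blocks of a QCIF frame contribute about 0.127 mopf for this stage alone.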
Fig. 13.29 Complexity comparison for Akiyo Sequence
Fig. 13.30 Complexity comparison for Foreman Sequence
Fig. 13.31 Complexity comparison for Mobile Sequence
[Plots for Fig. 13.29 to Fig. 13.31: Complexity (million operations per frame) vs. Frame Number; series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
The computational complexities over the entire sequence are averaged to get the mean
complexity (mopf) for a given QP. The mean complexities of the bitstreams for
different QPs are compared in Fig. 13.32 to Fig. 13.34, which
plot mean complexity against QP. It is observed that the mean complexity of the research
model (reuse) is 70-80% less than that of the standalone and reference models. The plots
show that at least 75% of the computational complexity is saved. In slow-moving sequences
such as Akiyo, savings of 80% are possible. Savings of 78% and 75% are observed for the
Foreman and Mobile sequences respectively.
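A relative saving of this kind is a simple ratio of mean complexities. The helper below is a hypothetical sketch of that calculation; which baseline (reference or standalone model) and which QP a quoted percentage refers to must be chosen by the caller.

```python
def saving_percent(baseline, candidate):
    """Relative reduction of `candidate` with respect to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - candidate) / baseline
```

The function can be applied row by row to the mean complexities (in mopf) of Table 13.3, with the reuse model as `candidate`.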
Fig. 13.32 Mean Complexity vs. QP for Akiyo Sequence
Fig. 13.33 Mean Complexity vs. QP for Foreman Sequence
Fig. 13.34 Mean Complexity vs. QP for Mobile Sequence
[Plots for Fig. 13.32 to Fig. 13.34: Mean Complexity (mopf) vs. Quantization Parameter (QP = 7, 14, 21, 28, 35, 42); series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
13.4.5. PSNR vs. Bitrate (RD Plot)
Next, the PSNR vs. bitrate plot (RD plot) is used to evaluate the quality. The video
quality vs. bitrate of the research model is very close to that of the reference model. The
comparisons for the Akiyo, Foreman and Mobile sequences are shown in
Fig. 13.35 to Fig. 13.37. It is clearly seen that the quality at different bitrates is on par
with that of the reference software.
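One way to obtain a bitrate axis from the measured file sizes is sketched below, using the sequence length and frame rate stated in Section 13.2 (300 frames at 25 fps). The function names are hypothetical, and two points are assumptions: that 1 kilobyte is taken as 1000 bytes (adjustable through `bytes_per_kb`) and that the "powers of 2" axis of the RD plots is a base-2 logarithm of the bitrate; the exact unit used in the figures is not stated.

```python
import math


def mean_bitrate_kbps(file_size_kb, frames, fps, bytes_per_kb=1000):
    """Average bitrate in kbit/s of a bitstream of `file_size_kb` kilobytes."""
    duration_s = frames / fps
    bits = file_size_kb * bytes_per_kb * 8
    return bits / duration_s / 1000.0


def log2_bitrate(file_size_kb, frames, fps, bytes_per_kb=1000):
    """Bitrate expressed as a power of 2 (ASSUMED axis convention)."""
    return math.log2(mean_bitrate_kbps(file_size_kb, frames, fps, bytes_per_kb))
```

Each (file size, mean PSNR) pair from Table 13.1 and Table 13.2 then gives one point of an RD curve.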
Fig. 13.35 Quality vs. Bitrate of Akiyo Sequence
Fig. 13.36 Quality vs. Bitrate of Foreman Sequence
Fig. 13.37 Quality vs. Bitrate of Mobile Sequence
[Plots for Fig. 13.35 to Fig. 13.37: PSNR (dB) vs. Bitrate (in powers of 2) (kb); series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
13.4.6. Rate-Distortion-Complexity (RDC) Plot
A new metric, called the RDC plot, is introduced in this research work. The
bits consumed, the quality in terms of SSE and the complexity in mopf are
combined as follows for each frame.
(13-2)
where
is listed in Table A1.6 (Annexure – I). The RDC plots for the same
sequences are shown in Fig. 13.38 to Fig. 13.40. They indicate that the RDC parameter of the
research work is always less than that of the reference model. It saves 30% to 50% of the
overall computation with optimized rate and quality.
Fig. 13.38 Rate-Distortion-Complexity comparison for Akiyo Sequence
Fig. 13.39 Rate-Distortion-Complexity comparison for Foreman Sequence
Fig. 13.40 Rate-Distortion-Complexity comparison for Mobile Sequence
[Plots for Fig. 13.38 to Fig. 13.40: RDC Cost vs. Quantization Parameter (QP = 7, 14, 21, 28, 35, 42); series: Reference Model, Research Model (Standalone), Research Model (Reuse)]
13.5. Results of different resizing ratios
The transcoder is tested for different resizing ratios in order to test the stability and
robustness of the compressed domain resizer. Two resizing cases are presented
here: 1) 1280x720 to 720x576, i.e., HD720p to SD, and 2) 720x576 to 480x320, i.e., SD to
320p. The width-wise resizing ratios are 1.778:1 and 1.5:1, and the height-wise resizing
ratios are 1.25:1 and 1.8:1 respectively. This shows that the compressed domain resizer
handles arbitrary resizing ratios as well.
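The quoted ratios follow directly from the input and output resolutions, as the following sketch shows (the function name is hypothetical).

```python
def resize_ratios(src, dst):
    """Return (width_ratio, height_ratio) for resizing src=(w, h) to dst=(w, h)."""
    (src_w, src_h), (dst_w, dst_h) = src, dst
    return src_w / dst_w, src_h / dst_h
```

Because the width-wise and height-wise ratios differ in both cases, the resizer is exercised with non-uniform, non-integer downsizing.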
Table 13.4 File sizes (kilobytes) of transcoded bitstreams for different resizing ratios
(Ref. = Reference Model; SA = Research Model (Standalone); Reuse = Research Model (Reuse))
      |    Shields (HD to SD)     |        SD to 320p
QP    | Ref.     SA       Reuse   | Ref.     SA       Reuse
7     | 13875    10055    8207    | 17482    19282    20036
14    | 5720     5953     4225    | 9808     10818    11241
21    | 3636     3235     2976    | 4610     5085     5283
28    | 1026     1183     879     | 1607     1772     1842
35    | 504      543      521     | 500      551      573
42    | 255      250      246     | 237      261      272
Table 13.5 Quality (PSNR in dB) of transcoded bitstreams for different resizing ratios
(Ref. = Reference Model; SA = Research Model (Standalone); Reuse = Research Model (Reuse))
      |    Shields (HD to SD)     |        SD to 320p
QP    | Ref.     SA       Reuse   | Ref.     SA       Reuse
7     | 53.2563  53.1901  52.7249 | 52.3092  52.5705  52.4010
14    | 46.6552  46.6485  46.3962 | 45.7937  46.0225  45.8741
21    | 41.5879  41.4440  41.4158 | 39.9838  40.1835  40.0540
28    | 37.0409  36.8473  36.1874 | 34.1302  34.3007  34.1901
35    | 33.0036  32.5472  31.9388 | 29.4681  29.6153  29.5198
42    | 30.4831  29.1440  28.5534 | 26.8067  26.9406  26.8537
Table 13.6 Complexity (mopf) of transcoded bitstreams for different resizing ratios
(Ref. = Reference Model; SA = Research Model (Standalone); Reuse = Research Model (Reuse))
      |      Shields (HD to SD)      |          SD to 320p
QP    | Ref.      SA        Reuse    | Ref.      SA        Reuse
7     | 193.6453  261.8464  97.1455  | 100.3472  139.6943  49.3139
14    | 121.9546  159.4850  60.1628  | 87.3475   121.5973  42.9254
21    | 106.2155  136.3968  45.9893  | 72.9563   101.5632  35.8531
28    | 97.8327   110.9638  39.0806  | 54.2751   75.5569   26.6726
35    | 94.8630   105.5490  35.2117  | 50.9672   70.9519   25.0470
42    | 91.9594   101.4523  32.5880  | 45.8765   63.8651   22.5452
One frame sampled from the input 1280x720 bitstream is shown
in Fig. 13.41. A sample frame from the output 720x576 bitstream is shown in Fig.
13.42.
Fig. 13.41 Sample frame from HD1280x720 Shields Sequence
Fig. 13.42 Sample frame from SD720x576 transcoded Sequence
One frame sampled from the input 720x576 bitstream is shown in
Fig. 13.43. A sample frame from the output 480x320 bitstream is shown in Fig.
13.44.
Fig. 13.43 Sample frame from SD720x576 Stadium Sequence
Fig. 13.44 Sample frame from 480x320 transcoded Sequence
The PSNR plots for reference model, standalone model and reuse model for these two
sequences at QP=14 are shown in Fig. 13.45 and Fig. 13.46.
Fig. 13.45 Quality Comparison of Shields Sequence
Fig. 13.46 Quality Comparison of Stadium Sequence
[Plots for Fig. 13.45 and Fig. 13.46: PSNR (dB) vs. Frame Number; series: Reference Model, Research Model (Standalone), Research Model (Reuse)]