IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 52, NO. 3, MARCH 2004 359
Balance of 0, 1 Bits for Huffman and Reversible Variable-Length Coding
Jia-Yu Lin, Ying Liu, and Ke-Chu Yi
Abstract—This letter proposes a novel algorithm to obtain a suboptimal solution for the balance of bit distribution after Huffman coding. The algorithm is simple, and can be embedded in the conventional Huffman coding process. In addition, the letter also discusses the bit-balance problem for reversible variable-length codes (RVLCs) based on Huffman coding. Analytical and experimental results suggest that the new algorithm is very useful in improving the 0/1 balance property for Huffman codes and RVLCs.
Index Terms—Balance of 0/1 bits, Huffman codes, reversible variable-length codes (RVLCs).
I. INTRODUCTION
HUFFMAN coding [1] and its variants have been widely
used in data compression, audio coding, image coding,
and video coding. Normally, the performance of Huffman encoders is measured not only by the compression effectiveness,
but also by other criteria [2], such as self-synchronizing ability,
memory, and searching efficiency. To improve error-resiliency
capabilities, reversible variable-length codes (RVLCs) [4]–[7]
based on Huffman coding have been introduced. In this letter, the problem of probability distribution of zeros
and ones in binary Huffman and RVLC streams is discussed.
In general, the more balanced the zeros and ones are in the bit
stream, the better the bit stream is for further processing and
transmission. Bit balance in the output stream of the source
encoder minimizes the influence of source statistics on the
channel-coding performance. For error correction, it is always
assumed that the input data sequence is basically random with equally probable zeros and ones [8]. Correspondingly,
the assumption that each codeword is equally likely to be
transmitted is usually made when analyzing the probability
of decoding error for the binary symmetric channel [3]. Furthermore, the performance of subsystems, such as bit timing
recovery, frame synchronization, equalization in time domain,
and time-frequency property of modulation signals, could be
improved by bit-balanced transmission.
However, the bit distribution in conventional Huffman codes
may be significantly unbalanced. In [3], Montgomery pointed
out that a source has balanced-bit probabilities for any optimal
source code if and only if the source is dyadic. This is a rare situation. Even for codes that are 97% to 99% efficient, the probability of the more likely bit may be significantly greater than 1/2 [3].

Paper approved by K. Rose, the Editor for Source-Channel Coding of the IEEE Communications Society. Manuscript received January 13, 2002; revised November 7, 2002; April 5, 2003; and July 24, 2003. This work was supported by NSFC 60172029.
J.-Y. Lin and Y. Liu are with the School of Electronic Science and Engineering, National University of Defence Technology, Changsha, Hunan 410073, China (e-mail: [email protected], [email protected]; [email protected]).
K.-C. Yi is with the State Key Laboratory on Integrated Service Networks, Xidian University, Xi'an, Shanxi 710071, China (e-mail: [email protected]).
Digital Object Identifier 10.1109/TCOMM.2004.823568

In [3], the upper bounds for the maximum and minimum
probability values of the more likely bit are given. However, no
work was done on algorithms to minimize the difference in bit
probabilities. This letter proposes a suboptimal algorithm for
the construction of Huffman codes with balanced-bit probabilities, and discusses the bit balance in bidirectionally decodable
streams [4] and RVLCs [5]–[7].
II. CONSTRUCTION OF BIT-BALANCED HUFFMAN CODES
Generally, there are two steps in constructing Huffman codes.
First, we construct a Huffman tree according to the occurrence
probabilities of source symbols; then, assign either zero or one
(in the binary case) to each branch of the Huffman tree. Usually,
the assignment of zeros or ones to the left or right branches is
fixed. We will refer to this kind of Huffman code as "conventional Huffman codes." However, we may simply reverse the
assignment for each pair of sibling branches independently, and
still obtain a Huffman coder satisfying the prefix condition. This
idea allows us to balance the bit probabilities in the final bit
stream.
For source symbols occurring with probabilities $p_1, p_2, \ldots, p_N$, we denote the corresponding code lengths as $l_1, l_2, \ldots, l_N$. Then the average code length is given by $\bar{l} = \sum_{i=1}^{N} p_i l_i$. Suppose in the $i$th codeword, the numbers of one and zero bits are $a_i$ and $b_i$, respectively. We have $a_i + b_i = l_i$. We define the average numbers of one and zero in all the codewords as $\bar{a} = \sum_{i=1}^{N} p_i a_i$ and $\bar{b} = \sum_{i=1}^{N} p_i b_i$. The bit probabilities in the code stream are $P_1 = \bar{a}/\bar{l}$ and $P_0 = \bar{b}/\bar{l}$.
Assuming that a Huffman tree has been generated, and that bits have been assigned to the branches, we can calculate statistically the occurrence frequency of one and zero according to the weights of branches. Let $W_0$ be the set of weights of all branches labeled as zero. Let $W_1$ be defined analogously. Let $w_{i0}$ and $w_{i1}$ denote the weights of the sibling pair of branches connected to the $i$th internal node, with $w_{i0} \in W_0$ and $w_{i1} \in W_1$. We have $\bar{b} = \sum_{i=1}^{N-1} w_{i0}$ and $\bar{a} = \sum_{i=1}^{N-1} w_{i1}$ (note that there are $N-1$ internal nodes, including the root). Thus, the difference between $\bar{b}$ and $\bar{a}$ is $D = \bar{b} - \bar{a} = \sum_{i=1}^{N-1} (w_{i0} - w_{i1})$, and our goal is to minimize $|D|$.
After the construction of the Huffman tree (before the assignment of bit labels), the weights of the left and right branches connected to the $i$th internal node are settled, denoted as $w_{iL}$ and $w_{iR}$, respectively. We may assume $w_{iL} \ge w_{iR}$, according to the common routine to construct the Huffman tree. We denote the difference of weights in a sibling pair of branches as $d_i = w_{iL} - w_{iR} \ge 0$.
0090-6778/04$20.00 © 2004 IEEE
TABLE I
HUFFMAN CODES AND RVLCS FOR ENGLISH ALPHABET (C1: CONVENTIONAL HUFFMAN CODES; C2: OUR BIT-BALANCED HUFFMAN CODES; C3: SYMMETRICAL RVLC IN [6]; C4: OUR CORRESPONDING SYMMETRICAL RVLC; C5: ASYMMETRICAL RVLC IN [7]; C6: OUR CORRESPONDING ASYMMETRICAL RVLC)
Now, when assigning zero and one to every pair of sibling branches in the conventional way (without loss of generality, assuming that zero and one are assigned to the left and right branches, respectively), we have $w_{i0} = w_{iL}$ and $w_{i1} = w_{iR}$. Thus, $w_{i0} - w_{i1} = d_i$. But, the assignment of zero and one to any pair of sibling branches could be reversed, resulting in $w_{i0} - w_{i1} = -d_i$. So, we have $D = \sum_{i=1}^{N-1} s_i d_i$, with the conventional assignment for a sibling pair of branches leading to $s_i = 1$, and the reversed assignment to $s_i = -1$. As to the bit assignment in conventional Huffman codes, $s_i = 1$ for all the pairs of sibling branches, so $D = \sum_{i=1}^{N-1} d_i$. Obviously, this makes no attempt to minimize $|D|$. As far as the bit-balance criterion is concerned, it could be the worst code-assignment scheme.
We should be able to find the most suitable $\{s_i\}$ to minimize $|D|$, which is a special scheme to assign the labels for the $N-1$ pairs of sibling branches accordingly. We can find the optimal solution by exhaustively searching the $2^{N-1}$ possible cases of $\{s_i\}$. But when $N$ is large, it is computationally expensive to find such a global optimum solution. We present a suboptimal algorithm, which involves much less computational complexity, as follows.
First of all, we sort $d_1, d_2, \ldots, d_{N-1}$ in an increasing order with respect to their values, and denote the sorted list by $e_1, e_2, \ldots, e_{N-1}$, where $e_1 \le e_2 \le \cdots \le e_{N-1}$. The problem can then be restated as minimizing $|D| = |\sum_{j=1}^{N-1} t_j e_j|$, subject to the constraint $t_j \in \{+1, -1\}$. The search for $t_1, t_2, \ldots, t_{N-1}$ is carried out iteratively as follows.

Step 0) We may arbitrarily set $t_{N-1} = 1$; then $D = e_{N-1}$, and set $j = N-2$.
Step 1) If $D > 0$, set $t_j = -1$; otherwise, set $t_j = 1$.
Step 2) Set $D = D + t_j e_j$ and $j = j - 1$. If $j \ge 1$, go to Step 1; otherwise, end.

We get the proper $\{t_j\}$, whose values are segment-wise taken as $+1$ and $-1$. A suboptimal solution for the $\{s_i\}$ will be obtained accordingly, by mapping each $t_j$ back to its original (unsorted) index.
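As a concrete illustration, the three steps above can be sketched in Python; the function name and the toy weight differences below are ours, not from the letter:

```python
def choose_signs(d):
    """Greedy sign selection (Steps 0-2): given the sibling-pair weight
    differences d_1..d_{N-1}, pick t_j in {+1, -1} over the sorted list
    e_1 <= ... <= e_{N-1} so that |D| = |sum_j t_j * e_j| stays small."""
    order = sorted(range(len(d)), key=lambda i: d[i])  # indices of e_1..e_{N-1}
    t = [0] * len(d)
    D = 0.0
    for idx in reversed(order):        # start from the largest difference
        t[idx] = -1 if D > 0 else 1    # choose the sign pulling D toward zero
        D += t[idx] * d[idx]
    return t, D                        # signs (in original order) and final D
```

For example, `choose_signs([0.05, 0.1, 0.15, 0.3])` drives the running sum $D$ back to (nearly) zero, whereas keeping all signs equal, as in the conventional assignment, would give $|D| = 0.6$.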
Although the solution may not be optimal, the algorithm is simple to implement. Except for the ordering of $\{d_i\}$, only $N-2$ additions and comparisons are needed. The algorithm may be embedded in the assignment of Huffman code bits. When constructing the Huffman tree, weight differences in pairs of sibling branches could be recorded while producing the internal nodes. Then, the suboptimal $\{s_i\}$ is found using the algorithm shown above, and the bit-label assignment to branches is decided accordingly. That is, when $s_i$ is $1$, assign zero and one to the left and right branches of internal node $i$, respectively. When $s_i$ is $-1$, the assignment of zero and one is reversed.
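Embedding the sign selection into codebook generation might look like the following sketch. This is our own heap-based reading of the scheme, not the authors' code; all identifiers are illustrative, and the "left" branch is taken to be the heavier sibling:

```python
import heapq
import itertools

def balanced_huffman(probs):
    """Build a Huffman tree, record the weight difference d_i at each
    internal node, pick signs greedily, then label branches: s_i = +1
    puts 0 on the heavier (left) branch, s_i = -1 reverses it.
    Returns a {symbol: codeword} dict."""
    counter = itertools.count()                 # tie-breaker for the heap
    heap = [(p, next(counter), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    nodes = []                                  # (heavy child, light child, d_i)
    while len(heap) > 1:
        w1, _, n1 = heapq.heappop(heap)
        w2, _, n2 = heapq.heappop(heap)
        heavy, light = (n1, n2) if w1 >= w2 else (n2, n1)
        node_id = len(nodes)
        nodes.append((heavy, light, abs(w1 - w2)))
        heapq.heappush(heap, (w1 + w2, next(counter), node_id))
    root = heap[0][2]
    # greedy sign selection on the recorded differences d_i
    d = [n[2] for n in nodes]
    order = sorted(range(len(d)), key=lambda i: d[i])
    s, D = [0] * len(d), 0.0
    for i in reversed(order):
        s[i] = -1 if D > 0 else 1
        D += s[i] * d[i]
    # walk the tree; internal nodes are ints, leaves are symbol strings
    code = {}
    def walk(node, prefix):
        if not isinstance(node, int):           # leaf: a source symbol
            code[node] = prefix or "0"
            return
        heavy, light, _ = nodes[node]
        zero, one = (heavy, light) if s[node] == 1 else (light, heavy)
        walk(zero, prefix + "0")
        walk(one, prefix + "1")
    walk(root, "")
    return code
```

Because only the branch labels change, the codeword lengths, and hence the compression performance, are identical to those of the conventional Huffman code.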
The English alphabet [5] is shown in Table I. The occurrence probabilities of letters, the conventional Huffman codes (C1), and our bit-balanced Huffman codes (C2) are listed. The two Huffman codes have the same average code length. Both schemes minimize the code variances,
resulting in the value 0.888 613. However, $P_1$ and $P_0$ of the conventional Huffman codes are 0.456 410 72 and 0.543 589 28, respectively, with the absolute difference 0.087 178 55; while $P_1$ and $P_0$ of our Huffman codes are both close to 0.5, with the absolute difference only 0.000 096 35. The result shows that the new method is much better than the conventional one under the bit-balance criterion.
III. BALANCE OF ZEROS AND ONES FOR RVLCS
A. Bidirectionally Decodable Streams
The bidirectionally decodable stream [4] is generated from
a Huffman code by reversing the original codewords, and per-
forming a bitwise exclusive OR operation on the original and
reversed bit streams, where the codebook of the Huffman code
is left unchanged. So, our method depicted above can be applied here. This section discusses the relationship between the
bit balance in the original Huffman code and the bit balance in
the corresponding bidirectionally decodable stream.
Assume the bit probabilities in the original encoded stream are $P_0$ and $P_1$. According to the bitwise exclusive OR operation, the bit probabilities in the bidirectionally decodable stream are $P_0' = P_0^2 + P_1^2$ and $P_1' = 2 P_0 P_1$, ignoring the leading and trailing zeros [4]. So, with the difference between the bit probabilities in the original encoded stream being $\Delta = |P_0 - P_1|$, the one in the bidirectionally decodable stream is $\Delta' = |P_0' - P_1'| = (P_0 - P_1)^2 = \Delta^2$. Since $0 \le \Delta \le 1$, we have $\Delta' \le \Delta$. That is, the bidirectionally decodable stream decreases the bit-probability difference, compared with the original one of the Huffman code stream.
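The quadratic shrinkage of the difference can be checked numerically. The sketch below simply XORs a random bit stream with its reversal; it is a statistical check of the probability identity only, not an implementation of Girod's full construction:

```python
import random

random.seed(7)
p1 = 0.4                                   # assumed P_1 of the original stream
n = 200000
bits = [1 if random.random() < p1 else 0 for _ in range(n)]
# bitwise XOR of the stream with its reversed copy
xored = [a ^ b for a, b in zip(bits, reversed(bits))]

delta = abs(1 - 2 * p1)                    # |P_0 - P_1| = 0.2
p1x = sum(xored) / n                       # empirical P_1', about 2 * P_0 * P_1
delta_x = abs(1 - 2 * p1x)                 # empirical Delta'
# delta_x should be close to delta**2 = 0.04
```

With a difference of 0.2 in the source stream, the XOR-ed stream's difference lands near 0.04, matching $\Delta' = \Delta^2$.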
B. RVLC With Redesigned Codebook
There are symmetrical [5], [6] and asymmetrical [5]–[7] RVLCs constructed from a given Huffman code. These approaches design new codebooks based on original Huffman
codes. The redesigned codebooks satisfy both the prefix and
suffix condition. Our method discussed above can not be
applied to either of them, since the suffix condition may be
violated when the bit assignment is changed. However, if RVLC
codewords with the same length are permutated, i.e., their
assignment changed to different source symbols which have
the same codeword length, the prefix and suffix conditions will
still be satisfied, and the compression effectiveness and coding efficiency will be retained. This is an adjustment of the bit-probability distribution at the codeword level, whereas the adjustment of the Huffman codes discussed above is processed at the bit level.
We call the group of source symbols with equal codeword
length “a source symbol segment,” in which permutations of
codewords’ assignment are tried. Since the number of source
symbols in a source symbol segment is usually not large (the
maximum is eight, in the example of Table I), we can try all
of the permutations. We process from source symbol segments with shorter codeword lengths (with larger occurrence probabilities) to those with longer lengths (with smaller probabilities). We search in each source symbol segment to minimize $|D_k| = |D_{k-1} + \sum_{i,j \in S_k} x_{ij}\, p_i (b_j - a_j)|$, where $D_k$ means the partial sum of the probability difference till symbol segment $k$, and $i$ and $j$ belong to the index set $S_k$ of symbol segment $k$. $x_{ij}$ is the element of permutation matrix $X$, and when $x_{ij} = 1$, codeword $j$ is assigned to symbol $i$.
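A minimal sketch of this codeword-level search follows. The function and variable names are ours; it assumes an existing symbol-to-codeword map and exhaustively permutes each equal-length segment, keeping the running difference $D$ (zeros minus ones, probability-weighted) closest to zero:

```python
from itertools import permutations

def rebalance_rvlc(symbols, probs, codes):
    """Within each group of equal-length codewords (a "source symbol
    segment"), try every assignment of codewords to symbols and keep
    the one minimizing the running |D_k|. Returns the permuted
    symbol->codeword map and the final difference D."""
    by_len = {}
    for sym in symbols:
        by_len.setdefault(len(codes[sym]), []).append(sym)
    new_codes, D = {}, 0.0
    for length in sorted(by_len):           # shorter (likelier) segments first
        segment = by_len[length]
        words = [codes[s] for s in segment]
        best = None
        for perm in permutations(words):
            # candidate partial sum D_k with this codeword assignment
            Dk = D + sum(probs[s] * (w.count("0") - w.count("1"))
                         for s, w in zip(segment, perm))
            if best is None or abs(Dk) < abs(best[0]):
                best = (Dk, perm)
        D = best[0]
        new_codes.update(zip(segment, best[1]))
    return new_codes, D
```

Since codewords only move between symbols of equal length, the prefix and suffix conditions and the average code length are untouched, as the text requires.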
In Table I, C3 is the symmetrical code from [6], with the bit-probability difference of 0.018 681 04, which could be decreased to 0.001 824 64 (see C4), without loss in compression performance. C5 is the asymmetrical RVLC from [7], with the bit-probability difference of 0.048 581 26, which could be decreased to 0.004 790 58 (see C6).
IV. CONCLUSION
In this letter, we have discussed the problem of Huffman
codes and RVLCs with balanced zeros and ones in encoded bit
streams. This letter proposes an effective algorithm to make
the bit probabilities balanced. The algorithm can be embedded
in the construction of the Huffman codebook, with little
complexity. In the analysis of RVLCs based on the Huffman
codes, we showed that the bidirectionally decodable stream had good performance under the bit-balance criterion, and it could
be combined with the proposed algorithm to further decrease
the bit-probability difference. For symmetrical and asymmet-
rical RVLCs, probability differences could be decreased by
reassigning codewords to source symbols after the creation
of codebooks. The analytic and experimental results suggest
that the proposed algorithm is quite promising in designing
Huffman codes with balanced zero and one probabilities.
ACKNOWLEDGMENT
The authors would like to thank the reviewers who provided
very valuable feedback. References [2], [3], and [7] were recommended by them. This paper was clarified substantially with
their help. The authors would also like to thank Prof. W.-D. Kou,
the Director of the State Key Lab of Integrated Service Net-
works (Xidian University), China, and Prof. K. Rose, the Ed-
itor, who provided considerable help in modifying the text of
the letter.
REFERENCES
[1] D. A. Huffman, "A method for the construction of minimum-redundancy codes," Proc. IRE, vol. 40, pp. 1098–1101, Sept. 1952.
[2] J. Abrahams, "Code and parse trees for lossless source encoding," Commun. Inform. Syst., vol. 1, no. 2, pp. 113–146, Apr. 2001.
[3] B. L. Montgomery, H. Diamond, and B. Kumar, "Bit probabilities of optimal binary source codes," IEEE Trans. Inform. Theory, vol. 36, pp. 1446–1450, June 1990.
[4] B. Girod, "Bidirectionally decodable streams of prefix codewords," IEEE Commun. Lett., vol. 3, pp. 245–247, Aug. 1999.
[5] Y. Takishima, M. Wada, and H. Murakami, "Reversible variable-length codes," IEEE Trans. Commun., vol. 43, pp. 158–162, Mar. 1995.
[6] C. W. Tsai and J. L. Wu, "On constructing the Huffman code-based reversible variable-length codes," IEEE Trans. Commun., vol. 49, pp. 1506–1509, Sept. 2001.
[7] K. Lakovic and J. Villasenor, "On design of error-correcting reversible variable-length codes," IEEE Commun. Lett., vol. 6, pp. 337–339, Aug. 2002.
[8] J. G. Proakis, Digital Communications, 3rd ed. New York: McGraw-Hill, 1998.