
Distributed Source Coding for Sensor Networks

Compression via channel coding

Zixiang Xiong, Angelos D. Liveris, and Samuel Cheng

In recent years, sensor research has been undergoing a quiet revolution, promising to have a significant impact throughout society that could quite possibly dwarf previous milestones in the information revolution. MIT Technology Review ranked wireless sensor networks, which consist of many tiny, low-power, and cheap wireless sensors, as the number one emerging technology. Unlike PCs or the Internet, which are designed to support all types of applications, sensor networks are usually mission driven and application specific (be it detection of biological agents and toxic chemicals; environmental measurement of temperature, pressure, and vibration; or real-time area video surveillance). Thus they must operate under a set of unique constraints and requirements. For example, in contrast to many other wireless devices (e.g., cellular phones, PDAs, and laptops), in which energy can be recharged from time to time, the energy provisioned for a wireless sensor node is not expected to be renewed throughout its mission. The limited amount of energy available to wireless sensors has a significant impact on all aspects of a wireless sensor network, from the amount of information that the node can process to the volume of wireless communication it can carry across large distances. Realizing the great promise of sensor networks requires more than a mere advance in individual technologies; it relies on many components working together in an efficient, unattended, comprehensible, and trustworthy manner. One of the enabling technologies for sensor networks is distributed source coding (DSC), which refers to the compression of multiple correlated sensor outputs [1]–[4] that do not communicate with each other (hence distributed coding). These sensors send their compressed outputs to a central point [e.g., the base station (BS)] for joint decoding.



To motivate DSC, consider a wireless video sensor network consisting of clusters of low-cost video sensor nodes (VSNs), an aggregation node (AN) for each cluster, and a BS for surveillance applications. The lower-tier VSNs are used for data acquisition and processing; the upper-tier ANs are used for data fusion and for transmitting information out of the network. This type of network is expected to operate unattended over an extended period of time; as such, VSN and AN power consumption impose severe system constraints. Additionally, the traditional one-to-many video processing that is routinely applied in sophisticated video encoders (e.g., MPEG compression) is not suitable for use on a VSN. This is because under the traditional broadcast paradigm the video encoder is the computational workhorse of the video codec; consequently, computational complexity is dominated by the motion estimation operation. The decoder, on the other hand, is a relatively lightweight device operating in a "slave" mode to the encoder. The severe power constraints at VSNs thus bring about two basic requirements: 1) an extremely low-power and low-complexity wireless video encoder, which is critical to prolonging the lifetime of a wireless video sensor node, and 2) high compression efficiency, since bit rate directly impacts transmission power consumption at a node.

DSC allows a many-to-one video coding paradigm that effectively swaps encoder-decoder complexity with respect to conventional (one-to-many) video coding, thereby representing a fundamental conceptual shift in video processing. Under this paradigm, the encoder at each VSN is designed as simply and efficiently as possible, while the decoder at the BS is powerful enough to perform joint decoding. Furthermore, each VSN can operate independently of its neighbors; consequently, a receiver is not needed for video processing at a VSN, which enables the system to save a substantial amount of hardware cost and communication (i.e., receiver) power. In practice, depending also on the nature of the sensor network, the VSN might still need a receiver to take care of other operations, such as routing, control, and synchronization, but such a receiver will be significantly less sophisticated.

Under this new DSC paradigm, a challenging problem is to achieve the same efficiency (e.g., joint entropy of correlated sources) as traditional video coding while not requiring sensors to communicate with each other. A moment of thought reveals that this might be possible because correlation exists among readings from closely placed neighboring sensors, and the decoder can exploit such correlation with DSC; in traditional video coding, this is done at the encoder. As an example, suppose we have two correlated 8-b gray-scale images X and Y whose same-location pixel values x and y are related by x ∈ {y − 3, y − 2, y − 1, y, y + 1, y + 2, y + 3, y + 4}. In other words, the correlation of x and y is characterized by −3 ≤ x − y ≤ 4, i.e., x assumes only eight different values around y. Thus, coding x given y would take 3 b. But in DSC, we simply take the pixel value x modulo eight, which also reduces the required bits to three. Specifically, let x = 121 and y = 119. Instead of transmitting both x and y at 8 b/p without loss, we transmit y = 119 and x′ = x (mod 8) = 1 in distributed coding. Consequently, x′ indexes the set that x belongs to, i.e., x ∈ {1, 9, 17, . . . , 249}, and the joint decoder picks the element x = 121 closest to y = 119.
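The following minimal Python sketch (the helper names are ours, for illustration) carries out this toy scheme; it works only under the assumed correlation −3 ≤ x − y ≤ 4 for every pixel pair:

    # Toy DSC sketch: send only x mod 8; recover x from the side information y.
    # Valid only under the assumed correlation -3 <= x - y <= 4 (8-b pixels).

    def dsc_encode(x: int) -> int:
        """The encoder transmits the 3-b coset index x mod 8."""
        return x % 8

    def dsc_decode(x_mod: int, y: int) -> int:
        """Scan the coset {x_mod, x_mod + 8, ..., x_mod + 248} for the member
        nearest the window y - 3 .. y + 4; comparing against its center
        y + 0.5 breaks the tie between offsets -4 and +4 toward the valid one."""
        return min(range(x_mod, 256, 8), key=lambda v: abs(v - y - 0.5))

    x, y = 121, 119
    assert dsc_encode(x) == 1                 # 121 mod 8 = 1
    assert dsc_decode(dsc_encode(x), y) == x  # decoder picks 121, closest to 119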

The above is but one simple example showcasing the feasibility of DSC. Slepian and Wolf [1] theoretically showed that separate encoding (with increased complexity at the joint decoder) is as efficient as joint encoding for lossless compression. Similar results were obtained by Wyner and Ziv [2] with regard to lossy coding of jointly Gaussian sources. Driven by applications like sensor networks, DSC has recently become a very active research area, more than 30 years after Slepian and Wolf laid its theoretical foundation [4].

A tutorial article, "Distributed Compression in a Dense Microsensor Network" [5], appeared in IEEE Signal Processing Magazine in 2002. Central to [5] is a practical DSC scheme called DISCUS (distributed source coding using syndromes) [6]. The current article is intended as a sequel to [5], with the main aim of covering our work on DSC and other relevant research efforts ignited by DISCUS.

We note that DSC is only one of the communication layers in a network, and its interaction with the lower communication layers, such as the transport, the network, the medium access control (MAC), and the physical layers, is crucial for exploiting the promised gains of DSC. Issues related to cross-layer design (e.g., queuing management [7], resource allocation [8], call admission control [9], and MAC [10]) are addressed in other articles in this issue. DSC cannot be used without proper synchronization between the nodes of a sensor network, i.e., several assumptions are made about the routing and scheduling algorithms and their connection to the utilized DSC scheme. In addition, compared to a separate design, further gains can be obtained by jointly designing the distributed source codes with the underlying protocols, channel codes, and modulation schemes.

Many relevant results and references on DSC could not be included in this article. We refer readers to a longer version with more figures and tables and a comprehensive bibliography at http://lena.tamu.edu.

Slepian-Wolf Coding

Let {(X_i, Y_i)}_{i=1}^∞ be a sequence of independent and identically distributed (i.i.d.) drawings of a pair of correlated discrete random variables X and Y. For lossless compression with X̂ = X and Ŷ = Y after decompression, we know from Shannon's source coding theory [3] that a rate given by the joint entropy H(X, Y) of X and Y is sufficient if we are encoding them together [see Figure 1(a)]. For example, we can first compress Y into H(Y) bits per sample and, based on the complete knowledge of Y at both the encoder and the decoder, we then compress X into H(X|Y) bits per sample.


But what if X and Y must be separately encoded for some user to reconstruct both of them?

One simple way is to do separate coding with rate R = H(X) + H(Y), which is greater than H(X, Y) when X and Y are correlated. In a landmark paper [1], Slepian and Wolf showed that R = H(X, Y) is sufficient even for separate encoding of correlated sources [see Figure 1(b)]! The Slepian-Wolf theorem says that the achievable region of DSC for discrete sources X and Y is given by R1 ≥ H(X|Y), R2 ≥ H(Y|X), and R1 + R2 ≥ H(X, Y), which is shown in Figure 2. The proof of the Slepian-Wolf theorem is based on random binning. Binning is a key concept in DSC and refers to partitioning the space of all possible outcomes of a random source into disjoint subsets or bins. Examples explaining the binning process are given in "Slepian-Wolf Coding Examples." The achievability of Slepian-Wolf coding was generalized by Cover [3] to arbitrary ergodic processes, countably infinite alphabets, and an arbitrary number of correlated sources.

Slepian-Wolf Coding of Two Binary Sources

Just like in Shannon's channel coding theorem [3], the random binning argument used in the proof of the Slepian-Wolf theorem is asymptotic and nonconstructive. For practical Slepian-Wolf coding, we can first try to design codes to approach the blue corner point A, with R1 + R2 = H(X|Y) + H(Y) = H(X, Y), in the Slepian-Wolf rate region of Figure 2. This is a problem of source coding of X with side information Y at the decoder, as depicted in Figure 3 in "Slepian-Wolf Coding Examples." If this can be done, then the other corner point B of the Slepian-Wolf rate region can be approached by swapping the roles of X and Y, and all points between these two corner points can be realized by time sharing; for example, using the two codes designed for the corner points 50% of the time each will result in the midpoint C, as the short sketch below illustrates.
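For instance, with the binary-triplet correlation of "Slepian-Wolf Coding Examples" (H(X) = H(Y) = 3 b and H(X|Y) = H(Y|X) = 2 b), the time-sharing arithmetic is a one-liner (a sketch; the rate values are taken from that example):

    # Time sharing between the two corner points of the Slepian-Wolf region.
    # Rates from the binary-triplet example: H(X)=H(Y)=3, H(X|Y)=H(Y|X)=2.
    A = (2.0, 3.0)            # corner A: R1 = H(X|Y), R2 = H(Y)
    B = (3.0, 2.0)            # corner B: R1 = H(X),   R2 = H(Y|X)

    def time_share(alpha):
        """Use the code for A a fraction alpha of the time, B the rest."""
        return tuple(alpha * a + (1 - alpha) * b for a, b in zip(A, B))

    print(time_share(0.5))    # (2.5, 2.5): midpoint C, still summing to H(X,Y)=5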

Although constructive approaches (e.g., [12], [13]) have been proposed to directly approach the midpoint C in Figure 2, and progress has recently been made in practical code designs (see [14] and references therein) that can approach any point between A and B, due to space limitations we only consider code designs for the corner points (or source coding with side information at the decoder) in this article. The former approaches are referred to as symmetric and the latter as asymmetric. In asymmetric coding, our aim is to code X at a rate that approaches H(X|Y) based on the conditional statistics of (or the correlation model between) X and Y but not the specific y at the encoder. Wyner first realized the close connection of DSC to channel coding and suggested the use of linear channel codes as a constructive approach for Slepian-Wolf coding in his 1974 paper [11]. The basic idea was to partition the space of all possible source outcomes into disjoint bins (sets) that are the cosets of some "good" linear channel code for the specific correlation model (see "Slepian-Wolf Coding Examples" for examples and a more detailed explanation). Consider the case of binary symmetric sources and the Hamming distance measure, with a linear (n, k) binary block code; there are 2^(n−k) distinct syndromes, each indexing a bin (set) of 2^k binary words of length n. Each bin is a coset code of the linear binary block code, which means that the Hamming distance properties of the original linear code are preserved in each bin.


▲ 1. (a) Joint encoding of X and Y. The encoders collaborate and a rate H(X, Y) is sufficient. (b) Distributed/separate encoding of X and Y. The encoders do not collaborate. The Slepian-Wolf theorem says that a rate H(X, Y) is also sufficient provided that decoding of X and Y is done jointly.

▲ 2. The Slepian-Wolf rate region for two sources.



In compressing, a sequence of n input bits is mapped into its corresponding (n − k) syndrome bits, achieving a compression ratio of n : (n − k). This approach, known as "Wyner's scheme" [11] for some time, was only recently used in [6] for practical Slepian-Wolf code designs based on conventional channel codes like block and trellis codes.


Slepian-Wolf Coding Examples

1) Assume X and Y are equiprobable binary triplets with X, Y ∈ {0, 1}³ and they differ at most in one position. (We first start with the same example given in [5] and [6] and extend it to more general cases later.) Then H(X) = H(Y) = 3 b. Because the Hamming distance between X and Y is dH(X, Y) ≤ 1, for a given Y there are four equiprobable choices of X. For example, when Y = 000, X ∈ {000, 100, 010, 001}. Hence H(X|Y) = 2 b. For joint encoding of X and Y, three bits are needed to convey Y and two additional bits to index the four possible choices of X associated with Y; thus a total of H(X, Y) = H(Y) + H(X|Y) = 5 b suffice.

For source coding with side information at the decoder as depicted in Figure 3, the side information Y is perfectly known at the decoder but not at the encoder; by the Slepian-Wolf theorem, it is still possible to send H(X|Y) = 2 b instead of H(X) = 3 b for X and decode it without loss at the joint decoder. This can be done by first partitioning the set of all possible outcomes of X into four bins (sets) Z00, Z01, Z10, and Z11, with Z00 = {000, 111}, Z01 = {001, 110}, Z10 = {010, 101}, and Z11 = {011, 100}, and then sending two bits for the index s of the bin (set) Zs that X belongs to. In forming the bins Zs, we make sure that each of them has two elements with Hamming distance dH = 3. For joint decoding with s (hence Zs) and side information Y, we pick in bin Zs the X with dH(X, Y) ≤ 1. Unique decoding is guaranteed because the two elements in each bin Zs have Hamming distance dH = 3. Thus we indeed achieve the Slepian-Wolf limit of H(X, Y) = H(Y) + H(X|Y) = 3 + 2 = 5 b in this example with lossless decoding.

To cast the above example in the framework of coset codes and syndromes [11], we form the parity-check matrix H of the rate-1/3 repetition channel code as

    H = [ 1 1 0
          1 0 1 ].   (1)

Then, if we think of x as a length-3 binary word, the index s of the bin Zs is just the syndrome s = xH^T associated with all x ∈ Zs. Sending the 2-b syndrome s instead of the original 3-b x achieves a compression ratio of 3:2. In partitioning the eight x according to their syndromes into four disjoint bins Zs, we preserve the Hamming distance properties of the repetition code in each bin Zs. This ensures the same decoding performance for different syndromes.

In channel coding, the set of the length-3 vectors x satisfying s = xH^T is called a coset code Cs of the linear channel code C00 defined by H (s = xH^T = 00 for the linear code). It is easy to see in this example that each coset code Cs corresponds to a bin Zs, i.e., all the members of the bin Zs are code words of the coset code Cs and vice versa: all the code words of Cs also belong to Zs. The Hamming distance properties between the code words of the linear rate-1/3 repetition channel code C00 are the same as those between the code words of each coset code Cs. Given the index s of the bin Zs, i.e., a specific coset code Cs, the side information Y can indicate its closest code word in Cs and thus recover X, or equivalently x, without an error.
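The binning above is small enough to enumerate directly. The following Python sketch (numpy assumed; written by us for illustration) builds the four bins from the matrix H in (1) and checks the distance property:

    # Sketch of the binning in example 1, using the parity-check matrix of (1).
    # Arithmetic is over GF(2).
    import numpy as np

    H = np.array([[1, 1, 0],
                  [1, 0, 1]])  # rate-1/3 repetition code, eq. (1)

    def syndrome(x):
        return tuple(x @ H.T % 2)  # s = x H^T over GF(2)

    # Partition all eight length-3 words into four bins by syndrome.
    bins = {}
    for i in range(8):
        x = np.array([(i >> 2) & 1, (i >> 1) & 1, i & 1])
        bins.setdefault(syndrome(x), []).append(x)

    # Each bin holds two words at Hamming distance 3 (the coset structure).
    for s, words in bins.items():
        assert int(np.sum(words[0] != words[1])) == 3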

2) We now generalize the above example to the case when X and Y are equiprobable (2^m − 1)-bit binary sources, where m ≥ 3 is a positive integer. The correlation model between X and Y is again characterized by dH(X, Y) ≤ 1. Let n = 2^m − 1 and k = n − m; then H(X) = H(Y) = n b, H(X|Y) = m b, and H(X, Y) = n + m b. Assuming that H(Y) = n b are spent in coding Y so that it is perfectly known at the decoder, for Slepian-Wolf coding of X we enlist the help of the m × n parity-check matrix H of the (n, k) binary Hamming channel code. When m = 3,

    H = [ 1 0 0 1 0 1 1
          0 1 0 1 1 1 0
          0 0 1 0 1 1 1 ].   (2)

For each n-b input x, its corresponding syndrome s = xH^T is coded using H(X|Y) = m b, achieving the Slepian-Wolf limit with a compression ratio of n : m for X. In doing so, we again partition the total of 2^n binary words according to their syndromes into 2^m disjoint bins (or Zs's), indexed by the syndrome s, with each set containing 2^k elements. With this partition, all 2^k code words of the linear (n, k) binary Hamming channel code C0 are included in the bin Z0, and the code distance properties (minimum Hamming distance of three) are preserved in each coset code Cs. This way X, or equivalently x, is recovered correctly.

Again, a linear channel code with its coset codes was used to do the binning and hence to construct a Slepian-Wolf limit-achieving source code. The reason behind the use of channel codes is that the correlation between the source X and the side information Y can be modeled with a virtual "correlation channel." The input of this "channel" is X and its output is Y. For a received syndrome s, the decoder uses Y together with the correlation statistics to determine which code word of the coset code Cs was input to the "channel." So if the linear channel code C0 is a good channel code for the "correlation channel," then the Slepian-Wolf source code defined by the coset codes Cs is also a good source code for this type of correlation.
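For m = 3 the whole chain of example 2 fits in a few lines. The sketch below (numpy assumed; helper names are ours) compresses a 7-b word to its 3-b syndrome with the matrix of (2) and recovers it from side information y with dH(x, y) ≤ 1:

    # Slepian-Wolf coding of a 7-b x with the (7,4) Hamming code of eq. (2),
    # decoding with side information y; assumes the correlation dH(x, y) <= 1.
    import numpy as np

    H = np.array([[1, 0, 0, 1, 0, 1, 1],
                  [0, 1, 0, 1, 1, 1, 0],
                  [0, 0, 1, 0, 1, 1, 1]])  # eq. (2)

    def encode(x):
        """Compress 7 bits to the 3-b syndrome s = x H^T."""
        return x @ H.T % 2

    def decode(s, y):
        """Pick the word in coset C_s closest (in Hamming distance) to y.
        Since dH(x, y) <= 1 and the code has minimum distance 3, the error
        pattern e = x XOR y is the unique weight-<=1 word whose syndrome is
        s XOR y H^T, so x = y XOR e."""
        s_e = (s + y @ H.T) % 2            # syndrome of the error pattern
        if not s_e.any():
            return y.copy()                 # x == y
        # A weight-1 error in position j has syndrome equal to column j of H.
        j = int(np.where((H.T == s_e).all(axis=1))[0][0])
        x_hat = y.copy()
        x_hat[j] ^= 1
        return x_hat

    x = np.array([1, 0, 1, 1, 0, 0, 1])
    y = x.copy(); y[4] ^= 1                 # side information differs in one bit
    assert (decode(encode(x), y) == x).all()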


If the correlation between X and Y can be modeled by a binary channel, Wyner's syndrome concept can be extended to all binary linear codes, and state-of-the-art near-capacity channel codes such as turbo and LDPC codes [15] can be employed to approach the Slepian-Wolf limit. A short summary of turbo and LDPC codes is provided in "Turbo and LDPC Codes." There is, however, a small probability of loss in general at the Slepian-Wolf decoder due to channel coding (zero-error coding of correlated sources is outside the scope of this article). In practice, the linear channel code rate and code design in Wyner's scheme depend on the correlation model. Toy examples like those in "Slepian-Wolf Coding Examples" are tailor-designed with correlation models satisfying H(X) = n b and H(X|Y) = n − k b so that binary (n, k) Hamming codes can be used to exactly achieve the Slepian-Wolf limits.

A more practical correlation model than those of "Slepian-Wolf Coding Examples" is the binary symmetric model, where {(X_i, Y_i)}_{i=1}^∞ is a sequence of i.i.d. drawings of a pair of correlated binary Bernoulli(0.5) random variables X and Y, and the correlation between X and Y is modeled by a "virtual" binary symmetric channel (BSC) with crossover probability p. For this channel, H(X|Y) = H(p) = −p log2 p − (1 − p) log2(1 − p). Although this correlation model looks simple, the Slepian-Wolf coding problem is not trivial. As the BSC is a well-studied channel model in channel coding, with a number of capacity-approaching code designs available [15], through the close connection between Slepian-Wolf coding and channel coding this progress could be exploited to design Slepian-Wolf limit-approaching codes. The first such practical designs [13], [16], [17] borrowed concepts from channel coding, especially turbo codes, but did not establish the more direct link with channel coding through the syndromes and the coset codes, as in "Slepian-Wolf Coding Examples."
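As a quick numeric check (a sketch; the crossover value p = 0.11 is our illustrative choice), the binary entropy function determines how far the syndrome rate can be pushed:

    # Numeric check of the Slepian-Wolf limit for the binary symmetric model.
    from math import log2

    def H(p: float) -> float:
        """Binary entropy H(p) = -p log2 p - (1-p) log2 (1-p)."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * log2(p) - (1 - p) * log2(1 - p)

    # With crossover probability p = 0.11, H(X|Y) = H(p) ~ 0.5 b, so an ideal
    # Slepian-Wolf code compresses X by 2:1 given Y at the decoder (cf. Figure 4).
    print(round(H(0.11), 3))  # ~0.5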


▲ 3. Lossless source coding with side information at the decoder as one case of Slepian-Wolf coding.


Turbo and LDPC Codes

Although Gallager discovered low-density parity-check (LDPC) codes 40 years ago, most of the recent developments in near-capacity codes on graphs and iterative decoding [15], ranging from turbo codes to the rediscovery of LDPC codes, happened only in the last decade.

Turbo codes, and more generally concatenated codes, are a class of near-capacity channel codes that involve concatenation of encoders via random interleaving in the encoder and iterative maximum a posteriori (MAP) decoding, also called Bahl-Cocke-Jelinek-Raviv (BCJR) decoding, for the component codes in the decoder. A carefully designed concatenation of two or more codes performs better than each of the component codes alone. The role of the pseudorandom interleaver is to make the code appear random while maintaining enough code structure to permit decoding. The BCJR decoding algorithm allows exchange of information between decoders and thus an iterative interaction between the decoders (hence "turbo"), leading to near-capacity performance.

LDPC codes [15] are block codes, best described by their parity-check matrix H and the associated bipartite graph. The parity-check matrix H of a binary LDPC code has a small number of ones (hence low density). The way these ones are spread in H is described by the degree distribution polynomials λ(x) and ρ(x), which indicate the percentages of columns and rows of H, respectively, with different Hamming weights (numbers of ones). When both λ(x) and ρ(x) have only a single term, the LDPC code is regular; otherwise it is irregular. In general, an optimized irregular LDPC code is expected to be more powerful than a regular one of the same code word length and code rate. Given both λ(x) and ρ(x), the code rate is exactly determined, but there are several channel codes, and thus several H's, that can be formed. Usually one is constructed randomly.

The bipartite graph of an LDPC code is an equivalent representation of the parity-check matrix H. Each column is represented with a variable or left node and each row with a check or right node. All variable (left) nodes are put in one column, all check (right) nodes in a parallel column, and then wherever there is a one in H, there is an edge connecting the corresponding variable and check nodes. The bipartite graph is used in the decoding procedure, allowing the application of the message-passing algorithm [15]. (Iterative decoding algorithms, including the sum-product algorithm, the forward/backward algorithm in speech recognition, and the belief propagation algorithm in artificial intelligence, are collectively called message-passing algorithms.) LDPC codes are the most powerful channel codes available today [15], and they can be designed via density evolution [15].
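As a concrete illustration of these definitions, the following sketch (our own toy H, not an optimized LDPC matrix) reads off the edges of the bipartite graph and the node-perspective degree counts behind λ(x) and ρ(x):

    # Edges and degree distributions of a tiny parity-check matrix.
    import numpy as np
    from collections import Counter

    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 0, 0, 1, 1],
                  [0, 0, 1, 1, 0, 1]])

    # One edge per 1 in H: (variable node j, check node i).
    edges = [(j, i) for i in range(H.shape[0]) for j in range(H.shape[1]) if H[i, j]]

    # Node-perspective degree counts; lambda(x) and rho(x) summarize the
    # fractions of columns/rows with each Hamming weight.
    var_deg = Counter(H.sum(axis=0))   # column weights -> counts
    chk_deg = Counter(H.sum(axis=1))   # row weights -> counts
    print(edges[:4], dict(var_deg), dict(chk_deg))
    # Here every column has weight 2 and every row weight 3: a regular code.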

Both turbo and LDPC codes exhibit near-capacity performance with reasonable complexity over most conventional channels, and algorithms have been devised for the design of very good turbo and LDPC codes. However, the design procedure for LDPC codes is more flexible and less complex and therefore allows faster, easier, and more precise design. This has made LDPC codes the most powerful channel codes over both conventional and unconventional channels. But turbo, and more generally concatenated, codes are still employed in many applications when short block lengths are needed or when one of the component codes is required to satisfy additional properties (e.g., for joint source-channel coding).


A turbo scheme with structured component codes was used in [16], and parity bits were sent in [13] and [17] instead of syndrome bits as advocated in Wyner's scheme. Code design that did follow Wyner's scheme for this problem was done by Liveris et al. [18] with turbo/LDPC codes, achieving performance better than in [13], [16], and [17] and very close to the Slepian-Wolf limit H(p).

Some simulation results from [18] are shown in Figure 4. The horizontal axis in Figure 4 shows the amount of correlation, i.e., lower H(p) means higher correlation, and the vertical axis shows the probability of error for the decoded X. All coding schemes in Figure 4 achieve 2:1 compression, and therefore the Slepian-Wolf limit is shown to be 0.5 b at almost zero error probability.


▲ 4. Slepian-Wolf coding of binary X with side information Y at the decoder based on turbo/LDPC codes. The rate for all codes is fixed at 1/2.


Wyner-Ziv Coding: The Binary Symmetric Case and the Quadratic Gaussian Case

Wyner-Ziv coding generalizes the setup of Slepian-Wolf coding [1] in that coding of X in Figure 3 is with respect to a fidelity criterion rather than lossless.

The Binary Symmetric Case

X and Y are binary symmetric sources, the correlation between them is modeled as a BSC with crossover probability p, and the distortion measure is the Hamming distance. We can write X = Y ⊕ E, where E is a Bernoulli(p) source. Then the rate-distortion function R_E(D) for E serves as the performance limit R_X|Y(D) of lossy coding of X given Y at both the encoder and the decoder. From [3] we have

    R_X|Y(D) = R_E(D) = H(p) − H(D) for 0 ≤ D ≤ min{p, 1 − p}, and 0 for D > min{p, 1 − p}.   (3)

On the other hand, the Wyner-Ziv rate-distortion function in this case is [2], [21]

    R*_WZ(D) = l.c.e.{H(p ∗ D) − H(D), (p, 0)}, 0 ≤ D ≤ p,   (4)

the lower convex envelope of H(p ∗ D) − H(D) and the point (D = p, R = 0), where p ∗ D = (1 − p)D + (1 − D)p. For p ≤ 0.5, R*_WZ(D) ≥ R_X|Y(D), with equality only at two distortion-rate points: the zero-rate point (p, 0) and the zero-distortion (or Slepian-Wolf) point (0, H(p)); see Figure 6 (shown later) for p = 0.27. Thus Wyner-Ziv coding suffers rate loss in this binary symmetric case. When D = 0, the Wyner-Ziv problem degenerates to the Slepian-Wolf problem with R*_WZ(0) = R_X|Y(0) = H(X|Y) = H(p).
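A small numeric sketch of (4) for p = 0.27 (our own discretized computation; the tangent-chord step is one simple way of taking the lower convex envelope with the point (p, 0)):

    # Evaluating (4) numerically for p = 0.27 (cf. Figure 6).
    import numpy as np

    def Hb(u):
        u = np.clip(u, 1e-12, 1 - 1e-12)
        return -u * np.log2(u) - (1 - u) * np.log2(1 - u)

    p = 0.27
    D = np.linspace(0.0, p, 200)
    G = Hb((1 - p) * D + (1 - D) * p) - Hb(D)   # H(p * D) - H(D)

    # l.c.e.{G, (p, 0)}: follow the (convex) curve G up to the point where the
    # chord to (p, 0) is tangent, then follow that chord.
    slopes = (0.0 - G) / (p - D + 1e-12)        # slopes of chords ending at (p, 0)
    k = int(np.argmax(slopes))                  # tangency = max-slope chord
    R = np.where(D <= D[k], G, G[k] * (p - D) / (p - D[k]))

    print(round(float(R[0]), 3))   # R at D = 0 is H(p) ~ 0.841, the Slepian-Wolf point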

The Quadratic Gaussian Case

X and Y are zero-mean and stationary Gaussian memoryless sources, and the distortion metric is the mean-squared error (MSE). Let the covariance matrix of X and Y be

    Σ = [ σ_X²       ρσ_Xσ_Y
          ρσ_Xσ_Y    σ_Y²   ]

with |ρ| < 1; then [2]

    R*_WZ(D) = R_X|Y(D) = (1/2) log+[σ_X²(1 − ρ²)/D],   (5)

where log+ x = max{log2 x, 0}. There is no rate loss with Wyner-Ziv coding in this quadratic Gaussian case. If Y can be written as Y = X + Z, with independent X ∼ N(0, σ_X²) and Z ∼ N(0, σ_Z²), then

    R*_WZ(D) = R_X|Y(D) = (1/2) log+[σ_Z²/((1 + σ_Z²/σ_X²)D)].   (6)

On the other hand, if X = Y + Z, with independent Y ∼ N(0, σ_Y²) and Z ∼ N(0, σ_Z²), then

    R*_WZ(D) = R_X|Y(D) = (1/2) log+(σ_Z²/D).   (7)


The performance limit of the more general Wyner-Ziv coding, described in "Wyner-Ziv Coding: The Binary Symmetric Case and the Quadratic Gaussian Case" and in the section "The Binary Symmetric Case," is also included. The code-word length of each Slepian-Wolf code is also given in Figure 4. It can be seen that the higher the correlation, the lower the probability of error for a certain Slepian-Wolf code. The more powerful the Slepian-Wolf code, the lower the correlation needed for the code to achieve very low probability of error. Clearly, stronger channel codes for the BSC (e.g., the same family of codes with longer code word length in Figure 4) result in better Slepian-Wolf codes in the binary symmetric correlation setup.

This last statement can be generalized to any correlation model: if the correlation between the source output X and the side information Y can be modeled with a "virtual" correlation channel, then a good channel code over this channel can provide us with a good Slepian-Wolf code through the syndromes and the associated coset codes. Thus the seemingly source coding problem of Slepian-Wolf coding is actually a channel coding one, and near-capacity channel codes such as turbo and LDPC codes can be used to approach the Slepian-Wolf limits.

Slepian-Wolf Coding of Multiple Sources with Arbitrary Correlation

Practical Slepian-Wolf code designs for other correlation models and more than two sources have appeared recently in the literature (e.g., [19]) using powerful turbo and LDPC codes. However, there is no systematic approach to general practical Slepian-Wolf code design yet, in the sense of being able to account for an arbitrary number of sources with nonbinary alphabets and possibly with memory in the marginal source and/or correlation statistics. Most designs do not consider sources with memory and/or the memory in the correlation between the sources. The only approach taking correlation with memory into account is [20]. The lack of such results is due to the fact that the channel coding analog of such a general Slepian-Wolf coding problem is a code design problem over channels that resemble more involved communication channels and thus has not been adequately studied yet.

Nevertheless, there has been a significant amount of work, and for a number of different scenarios the available Slepian-Wolf code designs perform well. These designs are very important, as asymmetric or symmetric Slepian-Wolf coding plays the role of conditional or joint entropy coding, respectively. Slepian-Wolf coding can be considered not only the analog of entropy coding in classic lossless source coding but also the extension of entropy coding to problems with side information and/or distributed sources. In these more general problems, near-lossless compression of a source down to its entropy is a special case of Slepian-Wolf coding when there is no correlation either between the source and the side information (asymmetric setup) or between the different sources (symmetric setup).

So, apart from its importance as a separate problem, when combined with quantization, Slepian-Wolf coding can provide a practical approach to lossy DSC problems, such as the Wyner-Ziv problem considered next, similarly to the way quantization and entropy coding are combined in classic lossy source coding.

Wyner-Ziv Coding

In the previous section, we focused on lossless source coding of discrete sources with side information at the decoder as one case of Slepian-Wolf coding. In sensor network applications, we are often dealing with continuous sources; then the problem of rate distortion with side information at the decoder arises. The question to ask is how many bits are needed to encode X under the constraint that the average distortion between X and the coded version X̂ satisfies E{d(X, X̂)} ≤ D, assuming the side information Y is available at the decoder but not at the encoder. This problem, first considered by Wyner and Ziv in [2], is one instance of DSC with Y available uncoded as side information at the decoder. It generalizes the setup of [1] in that coding of discrete X is with respect to a fidelity criterion rather than lossless. For both discrete and continuous alphabet cases and general distortion metrics d(·), Wyner and Ziv [2] gave the rate-distortion function R*_WZ(D) for this problem. We include in "Wyner-Ziv Coding: The Binary Symmetric Case and the Quadratic Gaussian Case" the Wyner-Ziv rate-distortion functions for the binary symmetric case and the quadratic Gaussian case.

The important thing about Wyner-Ziv coding is that it usually suffers rate loss when compared to lossy coding of X with the side information Y available at both the encoder and the decoder (see the binary symmetric case in "Wyner-Ziv Coding: The Binary Symmetric Case and the Quadratic Gaussian Case"). One exception is when X and Y are jointly Gaussian with the MSE measure (the quadratic Gaussian case in "Wyner-Ziv Coding: The Binary Symmetric Case and the Quadratic Gaussian Case"). There is no rate loss with Wyner-Ziv coding in this case, which is of special interest in practice (e.g., sensor networks) because many image and video sources can be modeled as jointly Gaussian (after mean subtraction). Pradhan et al. [22] recently extended the no-rate-loss condition for Wyner-Ziv coding to X = Y + Z, where Z is independent Gaussian but X and Y could follow more general distributions.


From an information-theoretic perspective, according to [23], there are granular gain and boundary gain in source coding, and packing gain and shaping gain in channel coding. Wyner-Ziv coding is foremost a source coding (i.e., a rate-distortion) problem; one should consider the granular gain and the boundary gain. In addition, the side information necessitates channel coding for compression (e.g., via Wyner's syndrome-based binning scheme [11]), which utilizes a linear channel code together with its coset codes. Thus, channel coding in Wyner-Ziv coding is not conventional in the sense that there is only packing gain but no shaping gain. One needs to establish the equivalence between the boundary gain in source coding and the packing gain in channel coding for Wyner-Ziv coding; this is feasible because channel coding for compression in Wyner-Ziv coding can perform conditional entropy coding to achieve the boundary gain, the same way entropy coding achieves the boundary gain in classic source coding [23], [24, p. 123]. In Wyner-Ziv coding, one can then aim for the granular gain via source coding and the boundary gain via channel coding.

From a practical viewpoint, because we are introducing loss/distortion to the source with Wyner-Ziv coding, source coding is needed to quantize X. "Main Results in Source Coding (Quantization Theory)" reviews the main results in source coding (quantization theory) regarding Gaussian sources.

Usually there is still correlation remaining between the quantized version of X and the side information Y, and Slepian-Wolf coding should be employed to exploit this correlation to reduce the rate. Since Slepian-Wolf coding is based on channel coding, Wyner-Ziv coding is, in a nutshell, a source-channel coding problem. There is quantization loss due to source coding and binning loss due to channel coding. To reach the Wyner-Ziv limit, one needs to employ both source codes [e.g., trellis-coded quantization (TCQ)] that can achieve the granular gain and channel codes (e.g., turbo and LDPC codes) that can approach the Slepian-Wolf limit. In addition, the side information Y can be used in jointly decoding and estimating X at the decoder to help reduce the distortion d(X, X̂) for nonbinary sources, especially at low bit rates. The intuition is that in decoding X, the joint decoder should rely more on Y when the rate is too low for the coded version of X to be useful in terms of lowering the distortion. On the other hand, when the rate is high, the coded version of X becomes more reliable than Y, so the decoder should put more weight on the former in estimating X. Figure 5 depicts the block diagram of a generic Wyner-Ziv coder.

To illustrate the basic concepts of Wyner-Ziv coding, we consider nested lattice codes, which were introduced by Zamir et al. [21] as codes that can achieve the Wyner-Ziv limit asymptotically for large dimensions. Practical nested lattice code implementation was first done in [25]. The coarse lattice is nested in the fine lattice in the sense that each point of the coarse lattice is also a point of the fine lattice but not vice versa. The fine code in the nested pair plays the role of source coding while each coarse coset code does channel coding. To encode, x is first quantized with respect to the fine source code, resulting in quantization loss. However, only the index identifying the bin (coset channel code) that contains the quantized x is coded, to save rate. Using this coded index, the decoder finds in the bin (coset code) the code word closest to the side information y as the best estimate of x. Due to the binning process employed in nested coding, the Wyner-Ziv decoder suffers a small probability of error.


Main Results in Source Coding (Quantization Theory)

For a Gaussian source X ∼ N(0, σ_X²) with the MSE distortion metric, the rate-distortion function [3] is R_X(D) = (1/2) log+(σ_X²/D) and the distortion-rate function is D_X(R) = σ_X² 2^(−2R). At high rate, the minimum-MSE (or Lloyd-Max) scalar quantizer, which is nonuniform, performs 4.35 dB away from D_X(R) [24], and the entropy-coded scalar quantizer (or uniform scalar quantization followed by ideal entropy coding) for X suffers only a 1.53 dB loss with respect to D_X(R) [24]. High-dimensional lattice vector quantizers [24] can be found in dimensions 2, 4, 8, 16, and 24 that perform 1.36, 1.16, 0.88, 0.67, and 0.50 dB away, respectively, from D_X(R). In addition, trellis-coded quantization (TCQ) can be employed to implement equivalent vector quantizers in even higher dimensions, achieving better results. For example, TCQ can perform 0.46 dB away from D_X(R) at 1 b/s for Gaussian sources [24], and entropy-coded TCQ can get as close as 0.2 dB to D_X(R) for any source with a smooth PDF [24].
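The 1.53 dB figure for entropy-coded scalar quantization is easy to reproduce empirically. The following Monte Carlo sketch (stepsize and sample count are our illustrative choices) quantizes a unit-variance Gaussian source uniformly and measures how far its (rate, distortion) pair sits from D_X(R):

    # Monte Carlo sketch of the 1.53 dB gap: uniform scalar quantization of a
    # Gaussian source followed by ideal entropy coding, compared to D_X(R).
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(1_000_000)
    q = 0.25                                  # small stepsize => high-rate regime
    idx = np.round(x / q).astype(int)
    mse = np.mean((x - idx * q) ** 2)

    # Rate of ideal entropy coding = empirical entropy of the index stream.
    _, counts = np.unique(idx, return_counts=True)
    prob = counts / idx.size
    R = -np.sum(prob * np.log2(prob))

    gap_db = 10 * np.log10(mse / 2.0 ** (-2 * R))   # distance from D_X(R) = 2^(-2R)
    print(round(float(R), 2), round(float(gap_db), 2))  # gap ~ 1.53 dB at high rate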

▲ 5. Block diagram of a generic Wyner-Ziv coder.



To reduce this decoding error probability, the coarse channel code has to be strong, with a large minimum distance between its code words. This means that the dimensionality of the coarse linear/lattice code needs to be high. It is proven in [21] that infinite-dimensional source and channel codes are needed to reach the Wyner-Ziv limit. This result is asymptotic, however, and not directly practical. Implemented code designs for the binary symmetric and the quadratic Gaussian cases are discussed next.

The Binary Symmetric Case

Recall that in Wyner's scheme [11] for lossless Slepian-Wolf coding, a linear (n, k) binary block code is used. There are 2^(n−k) distinct syndromes, each indexing a set (bin) of 2^k binary words of length n that preserve the Hamming distance properties of the original code. In compressing, a sequence of n input bits is mapped into its corresponding (n − k) syndrome bits, achieving a compression ratio of n : (n − k).

For lossy Wyner-Ziv coding, Shamai, Verdú, and Zamir generalized Wyner's scheme using nested linear binary block codes [21], [26]. According to this nested scheme, a linear (n, k2) binary block code is again used to partition the space of all binary words of length n into 2^(n−k2) bins of 2^(k2) elements, each indexed by a unique syndrome value. Out of these 2^(n−k2) bins only 2^(k1−k2) (k1 ≥ k2) are used, and the elements of the remaining 2^(n−k2) − 2^(k1−k2) sets are "quantized" to the closest, in the Hamming distance sense, binary word of the allowable 2^(k1−k2) × 2^(k2) = 2^(k1) ones. This "quantization" can be viewed as an (n, k1) binary block source code. Then the linear (n, k2) binary block code can be considered to be a coarse channel code nested inside the (n, k1) fine source code.

To come close to the Wyner-Ziv limit, both codes in the above nested scheme should be good, i.e., a good fine source code is needed with a good coarse channel subcode [21], [26]. Knowing how to employ good channel codes based on Wyner's scheme (k1 = n) [18], Liveris et al. proposed a scheme in [27] based on concatenated codes, where the constructions in [18] guarantee the use of good channel codes. As for the source code, its operation resembles that of TCQ, and hence it is expected to be a good source code. The scheme in [27] can come within 0.09 b of the theoretical limit (see Figure 6). This is the only result reported so far for the binary Wyner-Ziv problem.

The binary symmetric Wyner-Ziv problem does not seem to be practical, but due to its simplicity, it provides useful insight into the interaction between source and channel coding. The rate loss from the Wyner-Ziv limit in the case of binary Wyner-Ziv coding can be clearly separated into source coding loss and channel coding loss. For example, in Figure 6 the rate gap in bits between the simulated points and the Wyner-Ziv limit can be separated into source coding rate loss and channel coding rate loss. This helps us understand how to quantify and combine rate losses in Wyner-Ziv coding.

In the binary setup, there is no estimation involved as in the general scheme of Figure 5. The design process consists of two steps. First, a good classic binary quantizer is selected, i.e., a quantizer that can minimize the distortion D close to the distortion-rate function of a single Bernoulli(0.5) source at a given rate. The second step is to design a Slepian-Wolf encoder matched to the quantizer codebook. The better the matching of the Slepian-Wolf code constraints (parity-check equations) to the quantizer codebook, the better the performance of the decoder. Joint source-channel decoding in Figure 5 refers to the fact that the decoder combines the Slepian-Wolf code constraints with the quantizer codebook to reconstruct X. This binary design approach is very helpful in understanding the Gaussian Wyner-Ziv code design.


▲ 6. Simulated performance of the nested scheme in [27] for binary Wyner-Ziv coding for correlation p = 0.27. The time-sharing line between the zero-rate point (p, 0) and the Slepian-Wolf point (0, H(p)) is also shown.



The Quadratic Gaussian Case

For practical code design in this case, one can first consider lattice codes [28] and trellis-based codes [23] that have been used for both source and channel coding in the past and focus on finding good nesting codes among them. Following Zamir et al.'s theoretical nested lattice coding scheme, Servetto [25] proposed explicit nested lattice constructions based on similar sublattices for the high-correlation case. Figure 7 shows the simplest one-dimensional (1-D) nested lattice/scalar quantizer with N = 4 bins, where the fine source code employs a uniform scalar quantizer with stepsize q and the coarse channel code uses a 1-D lattice code with minimum distance dmin = Nq. The distortion consists of two parts: the "good" distortion introduced from quantization by the source code and the "bad" distortion from decoding errors of the channel code. To reduce the "good" distortion, it is desirable to choose a small quantization stepsize q; on the other hand, to limit the "bad" distortion, dmin = Nq should be maximized to minimize the channel decoding error probability Pe. Thus, for a fixed N, there exists an optimal q that minimizes the total distortion.
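A minimal simulation of this 1-D nested quantizer (the parameter values and the Gaussian correlation model are our illustrative assumptions) shows the interplay between the two kinds of distortion:

    # Sketch of the 1-D nested lattice/scalar quantizer of Figure 7 with N = 4
    # bins: fine code = uniform quantizer with stepsize q; coarse code = lattice
    # with d_min = N q. Parameter values are illustrative.
    import numpy as np

    N, q = 4, 0.4
    rng = np.random.default_rng(1)
    x = rng.standard_normal(100_000)
    y = x + 0.1 * rng.standard_normal(100_000)   # side information at the decoder

    # Encoder: quantize with the fine code, transmit only the bin index (2 b).
    fine = np.round(x / q).astype(int)
    bin_idx = fine % N

    # Decoder: within the coset {(bin_idx + k*N) * q}, pick the point closest to y.
    k = np.round((y / q - bin_idx) / N)
    x_hat = (bin_idx + k * N) * q

    mse = np.mean((x_hat - x) ** 2)
    print(round(float(mse), 4), round(q * q / 12, 4))  # total vs. "good" distortion alone

A larger noise variance in y makes channel-decoding errors (the "bad" distortion) appear, while a smaller q shrinks the "good" distortion but also shrinks d_min = Nq, illustrating the tradeoff that fixes the optimal q.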

Research on trellis-based nested codes as a way of realizing high-dimensional nested lattice codes has started only recently. For example, in DISCUS [6], two source codes (scalar quantization and TCQ) and two channel codes (scalar coset code and trellis-based coset code [23]) are used in source-channel coding for the Wyner-Ziv problem, resulting in four combinations. One of them (scalar quantization with scalar coset code) is nested scalar quantization, and another one (TCQ with trellis-based coset code) can effectively be considered as nested TCQ.

Nested lattice or TCQ constructions in [6] might be the first approach one would attempt, because source and channel codes of about the same dimension are utilized. However, in this setup, the coarse channel code is not strong enough. This can be seen clearly from Figure 8, where performance bounds [28], [29] of lattice source and channel codes are plotted together. With nested scalar quantization, the fine source code (scalar quantization) leaves unexploited a maximum granular gain of only 1.53 dB [24], but the coarse channel code (scalar coset code) suffers more than 6.5 dB loss with respect to capacity (with Pe = 10^−6). On the other hand, Figure 8 indicates that a lattice channel code at dimension 250 still performs more than 1 dB away from capacity. Following the 6-dB rule, i.e., that every 6 dB corresponds to 1 b, which is approximately true for both source and channel coding, the decibel gaps can be converted into rate losses (bits) and then combined into a single rate loss from the Wyner-Ziv rate-distortion function.

Nested TCQ employed in DISCUS [6] can be viewed as a nested lattice vector quantizer, where the lattice source code corresponding to TCQ has a smaller gap from the performance limit than the lattice channel code (trellis-based coset code) of about the same dimension does. As the dimensionality increases, lattice source codes reach the ceiling of 1.53 dB in granular gain much faster than lattice channel codes approach the capacity.


▲ 7. A 1-D nested lattice/uniform quantizer with four bins for the quadratic Gaussian Wyner-Ziv problem, where Y is the side information only available at the decoder. (a) "Good" distortion only. (b) "Good" and "bad" distortion.


▲ 8. Lower bound in terms of the performance gap (in dB) of lattice source codes from the rate-distortion function of Gaussian sources and lattice channel codes from the capacity of Gaussian channels, as a function of dimensionality.



Consequently, one needs channel codes of much higher dimension than source codes to achieve the same loss, and in practice the Wyner-Ziv limit should be approached with nested codes of different dimensionalities. The need for strong channel codes for Wyner-Ziv coding was also emphasized in [21].

This leads to the second approach [30], based on Slepian-Wolf coded nested quantization (SWC-NQ), i.e., a nested scheme followed by a second layer of binning. At high rate, asymptotic performance bounds of SWC-NQ similar to those in classic source coding were established in [30], showing that ideal Slepian-Wolf coded 1-D/two-dimensional (2-D) nested lattice quantization performs 1.53/1.36 dB worse than the Wyner-Ziv distortion-rate function D*_WZ(R) with probability almost one. Performances close to the corresponding theoretical limits were obtained by using 1-D and 2-D nested lattice quantization, together with irregular LDPC codes for Slepian-Wolf coding.

The third practical nested approach to Wyner-Ziv coding involves combined source and channel coding. The main scheme in this approach has been the combination of a classic scalar quantizer (no binning in the quantization) and a powerful Slepian-Wolf code. The intuition is that all the binning should be left to the Slepian-Wolf code, which allows the best possible binning (a high-dimensional channel code). This limits the performance loss of such a Wyner-Ziv code to that from source coding alone. Some first interesting results were given in [17], where, assuming ideal Slepian-Wolf coding and high rate, the use of classic quantization seemed to be sufficient, leading to a similar 1.53 dB gap for classic scalar quantization with ideal SWC. In a more general context, this approach could be viewed as a form of nesting with fixed finite source code dimension and larger channel code dimension. This generalized context can include the turbo-trellis Wyner-Ziv codes introduced in [31], where the source code is a TCQ nested with a turbo channel code. However, the scheme in [31] can also be classified as a nested one in the second approach. Wyner-Ziv coding based on TCQ and LDPC codes was presented in [32], which shows that at high rate, TCQ with ideal Slepian-Wolf coding performs 0.2 dB away from the theoretical limit D*_WZ(R) with probability almost one. Practical designs with TCQ, irregular LDPC code based Slepian-Wolf coding, and optimal estimation at the decoder can perform 0.82 dB away from D*_WZ(R) at medium bit rates (e.g., ≥ 1.5 b/s). With 2-D trellis-coded vector quantization (TCVQ), the performance gap to D*_WZ(R) is only 0.66 dB at 1.0 b/s and 0.47 dB at 3.3 b/s [32]. Thus we are approaching the theoretical performance limit of Wyner-Ziv coding.


▲ 9. Performance of Gaussian Wyner-Ziv coding with scalar quantization and Slepian-Wolf coding.


▲ 10. (a) Classic source coding. (b) DSC (source coding with side information at the decoder). (c) Equivalent encoder structure of DSC for correlated sources.



▲ 11. SWC-NQ for Wyner-Ziv coding of i.i.d. sources.



Comparison of Different Approaches to the Quadratic Gaussian Case

Based on scalar quantization, Figure 9 illustrates the performance difference between classic uniform scalar quantization (USQ), the first approach with nested scalar quantization (NSQ), the second with SWC-NQ, and the third with ideal Slepian-Wolf coded uniform scalar quantization (SWC-USQ).

Although the last two approaches perform roughly the same, using nesting/binning in the quantization step in the second approach of SWC-NQ has the advantage that, even without the additional Slepian-Wolf coding step, nested quantization (e.g., NSQ) alone performs better than the third approach of SWC-USQ without Slepian-Wolf coding, which degenerates to just classic quantization (e.g., USQ). At high rate, the nested quantizer asymptotically becomes almost a nonnested regular one, so that strong channel coding is guaranteed, and there is a constant gain in rate at the same distortion level due to nesting when compared with classic quantization. The role of Slepian-Wolf coding in both SWC-NQ and SWC-USQ is to exploit the correlation between the quantized source and the side information for further compression and, in SWC-NQ, to act as a complementary binning layer that makes the overall channel code stronger.

SWC-NQ generalizes the classic source coding approach of quantization (Q) and entropy coding (EC) in the sense that the quantizer performs quite well alone and can exhibit further rate savings by employing a powerful Slepian-Wolf code.


From Classic Source Coding to Wyner-Ziv Coding

We shall start with the general case where, in addition to the correlation between X and Y, {X_i}_{i=1}^∞ are also correlated (we thus implicitly assume that {Y_i}_{i=1}^∞ are correlated as well). The classic transform, quantization, and entropy coding (T-Q-EC) paradigm for X is shown in Figure 10(a). When the side information Y is available at the decoder, we immediately obtain the DSC paradigm in Figure 10(b). Note that Y is not available at the encoder; the dotted lines in Figure 10(b) only serve as a reminder that the design of each of the T, Q, and EC components should reflect the fact that Y is available at the decoder.

The equivalent encoder structure of DSC in Figure 10(b) is redrawn in Figure 10(c). Each component in the classic T-Q-EC paradigm is now replaced with one that takes into account the side information at the decoder. For example, the first component, "T with side info," could be the conditional Karhunen-Loève transform [33], which is beyond the scope of this article.

Assuming that "T with side information" is doing a perfect job in the sense that it completely decorrelates X conditional on Y, from this point on we will assume i.i.d. X and Y and focus on "Q with side information" and "EC with side information" in Figure 10(c). We rename the former nested quantization (the latter is exactly Slepian-Wolf coding) and end up with the encoder structure of SWC-NQ for Wyner-Ziv coding of i.i.d. sources in Figure 11.

Nested quantization in SWC-NQ plays the role of source-channel coding, in which the source coding component relies on the fine code and the channel coding component on the coarse code of the nested pair of codes. The channel coding component is introduced to the nested quantizer precisely because the side information Y is available at the decoder. It effectively implements a binning scheme to take advantage of this fact in the quantizer. Nested quantization in Figure 11 thus corresponds to quantization in classic source coding.

For practical lossless source coding, conventional techniques (e.g., Huffman coding, arithmetic coding, Lempel-Ziv coding, and others referred to in [34]) have dominated so far. However, if one regards lossless source coding as a special case of Slepian-Wolf coding without side information at the decoder, then channel coding techniques can also be used for source coding based on syndromes (see the references in [34]). In this light, the Slepian-Wolf coding component in Figure 11 can be viewed as the counterpart of entropy coding in classic source coding. Although the idea of using channel codes for source coding dates back to the Shannon-MacMillan theorem [3], and theoretical results appeared later [34], practical turbo/LDPC code based noiseless data compression schemes did not appear until very recently [34], [35].
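A toy example may help illustrate the syndrome idea. The sketch below is an illustration of the general principle, not a construction from [34] or [35]: it uses the (7,4) Hamming code to compress a 7-bit block x to its 3-bit syndrome, and the decoder recovers x exactly from the syndrome and side information y, provided x and y differ in at most one bit position, which is the correlation this particular code can handle.

    import numpy as np

    # Parity-check matrix of the (7,4) Hamming code; column j is the binary
    # representation of j (least significant bit in the first row).
    H = np.array([[1, 0, 1, 0, 1, 0, 1],
                  [0, 1, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])

    def sw_encode(x):
        """Compress 7 bits to the 3-bit syndrome s = Hx (mod 2)."""
        return H @ x % 2

    def sw_decode(s, y):
        """Recover the unique x with syndrome s within Hamming distance 1 of y."""
        e_syn = (s + H @ y) % 2      # syndrome of the error pattern e = x XOR y
        x_hat = y.copy()
        if e_syn.any():              # one flipped bit: its position is read off
            pos = int(e_syn[0] + 2 * e_syn[1] + 4 * e_syn[2]) - 1  # H's columns
            x_hat[pos] ^= 1
        return x_hat

    rng = np.random.default_rng(1)
    x = rng.integers(0, 2, 7)
    y = x.copy()
    y[rng.integers(7)] ^= 1          # side information: x with one bit flipped
    assert np.array_equal(sw_decode(sw_encode(x), y), x)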

Starting from syndrome-based approaches for entropy coding, one can easily make the schematic connection between entropy-coded quantization for classic source coding and SWC-NQ for Wyner-Ziv coding, as syndrome-based approaches can also be employed for Slepian-Wolf coding (or source coding with side information at the decoder) in the latter case. Performance-wise, the work in [30], [32] reveals that the performance gap of high-rate Wyner-Ziv coding (with ideal Slepian-Wolf coding) to D*_WZ(R) is exactly the same as that of high-rate classic source coding (with ideal entropy coding) to the distortion-rate function D_X(R). This interesting and important finding is highlighted in Table 1.

    Classic Source Coding                 Wyner-Ziv Coding
    Coding Scheme     Gap to D_X(R)       Coding Scheme       Gap to D*_WZ(R)
    ECSQ [24]         1.53 dB             SWC-NSQ [30]        1.53 dB
    ECLQ (2-D) [28]   1.36 dB             SWC-NQ (2-D) [30]   1.36 dB
    ECTCQ [24]        0.2 dB              SWC-TCQ [32]        0.2 dB

Table 1. High-rate classic source coding versus high-rate Wyner-Ziv coding.
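For reference, the quantities compared in Table 1 take a particularly simple form in the quadratic Gaussian case. The following are standard high-rate expressions, not new results of this article:

    \[
      D_X(R) = \sigma_X^2 \, 2^{-2R}, \qquad
      D^{*}_{WZ}(R) = \sigma_{X|Y}^2 \, 2^{-2R},
    \]

so the two distortion-rate functions differ only in the variance term (the source variance versus the conditional variance given the side information), and the 1.53 dB gap of entropy-coded scalar quantization to either bound is the familiar

    \[
      10 \log_{10} \frac{\pi e}{6} \approx 1.53~\mathrm{dB}.
    \]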


This connection between entropy-coded quantization for classic source coding and SWC-NQ for Wyner-Ziv coding is developed in “From Classic Source Coding to Wyner-Ziv Coding.”
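To see what the Slepian-Wolf stage buys on top of nested quantization, the following rough sketch estimates the rate of an ideal Slepian-Wolf code for the coset index B as the empirical conditional entropy H(B | Y). Discretizing Y in order to estimate this quantity, and all parameter values, are assumptions made purely for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    q, N, n = 0.5, 4, 200_000
    x = rng.normal(0.0, 1.0, n)
    y = x + rng.normal(0.0, 0.1, n)
    b = np.mod(np.round(x / q).astype(int), N)   # coset index produced by NSQ
    yq = np.round(y / q).astype(int)             # discretized side information
    yq -= yq.min()                               # shift to nonnegative bin labels

    def entropy(counts):
        p = counts[counts > 0] / counts.sum()
        return -np.sum(p * np.log2(p))

    h_b = entropy(np.bincount(b))                # rate without Slepian-Wolf coding
    h_by = entropy(np.bincount(yq * N + b)) - entropy(np.bincount(yq))
    print(f"H(B) = {h_b:.2f} b/sample, H(B|Y) ~= {h_by:.2f} b/sample")

The gap between the two entropies is the additional compression an ideal Slepian-Wolf code extracts from the correlation between the quantized source and the side information.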

Successive Wyner-Ziv Coding

Successive or scalable image coding, made popular by SPIHT [36], is attractive in practical applications such as networked multimedia. For Wyner-Ziv coding, scalability is also a desirable feature in applications that go beyond sensor networks. Steinberg and Merhav [37] recently extended Equitz and Cover's work [38] on successive refinement of information to Wyner-Ziv coding, showing that both the doubly symmetric binary source and the jointly Gaussian source are successively refinable. Cheng et al. [39] further pointed out that the broader class of sources that satisfy the general condition of no rate loss [22] for Wyner-Ziv coding is also successively refinable. A practical layered Wyner-Ziv code design for Gaussian sources, based on nested scalar quantization and multilevel LDPC codes for Slepian-Wolf coding, was also presented in [39]. Layered Wyner-Ziv coding of real video sources was introduced in [40].
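Here is a minimal sketch of the layered idea, assuming the NSQ base layer from the earlier example: each refinement layer spends one extra bit per sample to halve the decoder's uncertainty cell. The multilevel LDPC codes that [39] uses to Slepian-Wolf code the layers are omitted here; the parameters are illustrative.

    import numpy as np

    rng = np.random.default_rng(3)
    q0, N, n = 1.0, 4, 50_000
    x = rng.normal(0.0, 1.0, n)
    y = x + rng.normal(0.0, 0.1, n)

    # Base layer: nested scalar quantization, decoded with side information y.
    b = np.mod(np.round(x / q0).astype(int), N)
    x_hat = (b + N * np.round((y / q0 - b) / N)) * q0
    print(f"layer 0: MSE = {np.mean((x - x_hat)**2):.5f}")

    # Refinement layers: one bit per layer indicates which half of the
    # current uncertainty cell contains x; the cell width halves each time.
    cell = q0
    for layer in (1, 2, 3):
        bit = (x > x_hat).astype(float)
        cell /= 2
        x_hat = x_hat + (bit - 0.5) * cell
        print(f"layer {layer}: MSE = {np.mean((x - x_hat)**2):.5f}")

A decoder that stops after any layer still has a usable reconstruction, which is exactly the scalability property that motivates layered Wyner-Ziv coding.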

Applications of DSC in Sensor Networks

In the discussions above we mainly considered lossless (Slepian-Wolf) and lossy (Wyner-Ziv) source coding with side information only at the decoder, as most of the work so far in DSC has focused on these two problems. For any network, and especially a sensor network, this means that the nodes transmitting correlated information need to cooperate in groups of two or three, so that one node provides the side information and another can compress its information down to the Slepian-Wolf or the Wyner-Ziv limit. This approach has been followed in [41].

As pointed out in [41], such cooperation in groups of two and three sensor nodes means that less complex decoding algorithms should be employed to save decoding processing power, as the decoder is itself a sensor node or a data processing node of limited power [41]. This is not a major issue: several low-complexity channel decoding algorithms have been studied in recent years, so the decoding loss incurred by using lower complexity algorithms can be kept small.

A way to move beyond this assumption of cooperation in small groups, and the associated low-complexity decoding, is to employ DSC schemes for multiple sources. Several Slepian-Wolf coding approaches for multiple sources (lossless DSC) have been discussed earlier. However, theoretical performance limits in the more general setting of lossy DSC for multiple sources still remain elusive. Even Wyner-Ziv coding assumes perfect side information at the decoder, i.e., it cannot really be considered a two-source problem. Code designs for lossy DSC with two sources, where the side information is also coded, are scarce in the literature (exceptions are [12], [14], and [42]).

The main issue for practical deployment of DSC is the correlation model. Although there has been significant effort in DSC designs for different correlation models, in some cases even application-specific ones [5], [12], [41], [45], in practice it is usually hard to come up with a joint probability mass or density function in sensor networks, especially if there is little room for training or little information about the current network topology. In some applications, e.g., video surveillance networks, the correlation statistics can be mainly a function of the locations of the sensor nodes. If the sensor network has a training mode option and/or can track the varying network topology, adaptive or universal DSC could be used to follow time-varying correlation and still achieve gains. Such universal DSC, which works well for a variety of correlation statistics, seems to be the most appropriate approach for sensor networks, but it is still an open and very challenging DSC problem.

Measurement noise, another important issue in sensor networks, can also be addressed through DSC. Following the Wyner-Ziv coding approach presented in the previous section, measurement noise, if not taken into account, causes a mismatch between the actual correlation between the sources and the noiseless correlation statistics used in decoding, which means worse performance. One way to resolve this issue is the robust code design discussed before. But if the noise statistics are known, even approximately, they can be incorporated into the correlation model and thus considered in the code design for DSC.


    Source Coding                                  Channel Coding
    DSC [1], [2] (source coding with side          Data hiding [46], [47] (channel coding
      information at the decoder)                    with side information at the encoder)
    Multiterminal source coding [4]                Coding for MIMO/broadcast channels [21], [48]
    Multiple description coding [49]               Coding for multiple access channels [3]

Table 2. Dual problems in source and channel coding, where the encoder for the source coding problem is the functional dual of the decoder for the corresponding channel coding problem, and vice versa.



There is in fact one specific DSC problem, the chief executive officer (CEO) problem [44], which considers such a scenario. In this problem, the CEO of a company employs a number of agents to observe an event, and each agent provides the CEO with his or her own (noisy) version of the event. The agents are not allowed to convene, and the goal of the CEO is to recover as much information as possible about the actual event from the noisy observations received from the agents, while minimizing the total information rate from the agents (the sum rate). The CEO problem can hence account for the measurement noise at the sensor nodes. Preliminary practical code constructions for the CEO problem appeared in [12] and [42], based on Wyner-Ziv coding approaches, but they are limited to special cases.
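The following toy sketch illustrates the CEO setting itself, with the simplest possible fusion rule (averaging) and no binning; the practical constructions in [12] and [42] replace this with Wyner-Ziv style coding of the agents' observations. All parameter values are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(4)
    n, L, q = 50_000, 4, 0.5
    x = rng.normal(0.0, 1.0, n)                # the hidden event
    obs = x + rng.normal(0.0, 0.3, (L, n))     # agents' independent noisy views

    u = np.round(obs / q) * q                  # each agent quantizes on its own
    x_hat = u.mean(axis=0)                     # the CEO fuses the L descriptions

    print(f"single-agent MSE: {np.mean((u[0] - x)**2):.4f}")
    print(f"fused MSE:        {np.mean((x_hat - x)**2):.4f}")

Even this naive fusion shows the basic tradeoff: more agents reduce the effect of measurement noise at the cost of a higher sum rate, which is exactly what CEO-problem code designs aim to optimize.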

Another issue is the cross-layer design aspect of DSC. DSC can be considered to sit at the top of the sensor network protocol stack, the application layer. Therefore, DSC sets several requirements for the underlying layers, especially the next lower layer, the transport layer, regarding synchronization between packets from correlated nodes.

DSC can also be designed to work together with the transport layer to make retransmissions smarter. In that sense, scalable DSC [39] seems to be the way to implement such smart retransmissions.

One last aspect of the cross-layer design is the joint design with the lowest layer in the protocol stack, the physical layer. In a packet transmitted through the network, the correlated compressed data can have weaker protection than the associated header, thus saving some overhead, because the available side information at the decoding node can make up for this weaker protection.

Related Topics

So far we have motivated DSC with applications to sensor networks. Research on this application area [5], [12], [41], [43] has just begun. Significant work remains to be done to fully harness the potential of these next-generation sensor networks, as discussed in the last section. However, DSC is also related to a number of different source and channel coding problems in networks.

First of all, due to the duality [22], [45] between DSC (source coding with side information) and channel coding with side information [46], [47] (data hiding/digital watermarking), there are many application areas related to DSC (e.g., coding for the multiple access channel [3] and MIMO/broadcast channels [21], [48]). Table 2 summarizes dual problems in source and channel coding, where the encoder for the source coding problem is the functional dual of the decoder for the corresponding channel coding problem, and vice versa. We can see that DSC represents only the starting point of a class of related research problems.

Furthermore, a recent theoretical result established multiple description coding [49] as a special case of DSC with colocated sources, with multiple descriptions easily generated by embedded coding [36] plus unequal error protection. In this context, iterative decoding approaches in DSC immediately suggest that the same iterative (turbo) technique can be employed for decoding multiple descriptions. Yet another new application of DSC is layered coding for video streaming, where the error drifting problem in standard MPEG-4 FGS coding can potentially be eliminated with distributed video coding [17], [40], [43], [50].

Last but not least, DSC principles can be applied to reliable communications with uncoded side information and systematic source-channel coding [26]. The latter includes embedding digital signals into analog (e.g., TV) channels and communication over channels with unknown SNRs. Applying DSC algorithms to these real-world problems promises to be exciting and fruitful.

Acknowledgments

This work was supported by NSF CAREER Grant MIP-00-96070, NSF Grant CCR-01-04834, ONR YIP Grant N00014-01-1-0531, and ARO YIP Grant DAAD19-00-1-0509.

Zixiang Xiong received the Ph.D. degree in electrical engineering in 1996 from the University of Illinois at Urbana-Champaign. From 1997 to 1999, he was with the University of Hawaii. Since 1999, he has been with the Department of Electrical Engineering at Texas A&M University as an associate professor. His current research interests are coding for multiterminal communication networks, joint source-channel coding, and genomic signal processing. He received an NSF CAREER Award in 1999, an ARO Young Investigator Award in 2000, and an ONR Young Investigator Award in 2001. He also received faculty fellow awards in 2001, 2002, and 2003 from Texas A&M University. He is currently an associate editor for IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Signal Processing, and IEEE Transactions on Image Processing.

Angelos D. Liveris received the Diploma in electrical and computer engineering from the National Technical University of Athens, Greece, in 1998. Since 1999, he has been working toward the Ph.D. degree in electrical engineering at Texas A&M University, College Station. His research interests include network information theory, distributed source and channel coding, detection, and wireless communications.

Samuel Cheng received the B.Eng. degree in electrical and electronic engineering from the University of Hong Kong and the M.Phil. degree in physics and the M.S. degree in electrical engineering from the Hong Kong University of Science and Technology and the University of Hawaii, Honolulu, respectively. He is currently pursuing the Ph.D. degree in electrical engineering at Texas A&M University, College Station.



References

[1] D. Slepian and J.K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. 19, no. 4, pp. 471–480, 1973.
[2] A. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inform. Theory, vol. 22, no. 1, pp. 1–10, 1976.
[3] T. Cover and J. Thomas, Elements of Information Theory. New York: Wiley, 1991.
[4] T. Berger, “Multiterminal source coding,” in The Information Theory Approach to Communications, G. Longo, Ed. New York: Springer-Verlag, 1977.
[5] S. Pradhan, J. Kusuma, and K. Ramchandran, “Distributed compression in a dense microsensor network,” IEEE Signal Processing Mag., vol. 19, pp. 51–60, Mar. 2002.
[6] S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): Design and construction,” IEEE Trans. Inform. Theory, vol. 49, pp. 626–643, Mar. 2003.
[7] S. Bohacek, K. Shah, G. Arce, and M. Davis, “Signal processing challenges in active queue management,” IEEE Signal Processing Mag., vol. 21, Sept. 2004.
[8] R. Berry and E. Yeh, “Cross-layer wireless resource allocation: Fundamental performance limits,” IEEE Signal Processing Mag., vol. 21, Sept. 2004.
[9] R. Rao, C. Comaniciu, T. Lakshman, and H.V. Poor, “Integrated call admission control in DS-CDMA wireless multimedia networks,” IEEE Signal Processing Mag., vol. 21, Sept. 2004.
[10] L. Tong, V. Naware, and P. Venkitasubramaniam, “Signal processing in random access: A cross-layer perspective,” IEEE Signal Processing Mag., vol. 21, Sept. 2004.
[11] A. Wyner, “Recent results in the Shannon theory,” IEEE Trans. Inform. Theory, vol. 20, no. 1, pp. 2–10, 1974.
[12] S. Pradhan and K. Ramchandran, “Distributed source coding: Symmetric rates and applications to sensor networks,” in Proc. DCC’00, Snowbird, UT, 2000, pp. 363–372.
[13] Y. Zhao and J. Garcia-Frias, “Data compression of correlated non-binary sources using punctured turbo codes,” in Proc. DCC’02, Snowbird, UT, 2002, pp. 242–251.
[14] V. Stankovic, A. Liveris, Z. Xiong, and C. Georghiades, “Design of Slepian-Wolf codes by channel code partitioning,” in Proc. DCC’04, Snowbird, UT, 2004, pp. 302–311.
[15] “Special issue on codes on graphs and iterative algorithms,” IEEE Trans. Inform. Theory, vol. 47, pp. 493–849, Feb. 2001.
[16] P. Mitran and J. Bajcsy, “Coding for the Wyner-Ziv problem with turbo-like codes,” in Proc. ISIT’02, Lausanne, Switzerland, 2002, p. 91.
[17] B. Girod, A. Aaron, S. Rane, and D. Rebollo-Monedero, “Distributed video coding,” Proc. IEEE, 2004 (to appear).
[18] A. Liveris, Z. Xiong, and C. Georghiades, “Compression of binary sources with side information at the decoder using LDPC codes,” IEEE Commun. Lett., vol. 6, no. 10, pp. 440–442, 2002.
[19] C. Lan, A. Liveris, K. Narayanan, Z. Xiong, and C. Georghiades, “Slepian-Wolf coding of multiple M-ary sources using LDPC codes,” in Proc. DCC’04, Snowbird, UT, 2004, p. 549.
[20] J. Garcia-Frias and W. Zhong, “LDPC codes for compression of multiterminal sources with hidden Markov correlation,” IEEE Commun. Lett., vol. 7, no. 3, pp. 115–117, 2003.
[21] R. Zamir, S. Shamai, and U. Erez, “Nested linear/lattice codes for structured multiterminal binning,” IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1250–1276, 2002.
[22] S. Pradhan, J. Chou, and K. Ramchandran, “Duality between source coding and channel coding and its extension to the side information case,” IEEE Trans. Inform. Theory, vol. 49, no. 5, pp. 1181–1203, 2003.
[23] M.V. Eyuboglu and G.D. Forney, “Lattice and trellis quantization with lattice- and trellis-bounded codebooks: High-rate theory for memoryless sources,” IEEE Trans. Inform. Theory, vol. 39, no. 1, pp. 46–59, 1993.
[24] D. Taubman and M. Marcellin, JPEG2000: Image Compression Fundamentals, Standards, and Practice. Norwell, MA: Kluwer, 2001.
[25] S. Servetto, “Lattice quantization with side information,” in Proc. DCC’00, Snowbird, UT, 2000, pp. 510–519.
[26] S. Shamai, S. Verdu, and R. Zamir, “Systematic lossy source/channel coding,” IEEE Trans. Inform. Theory, vol. 44, no. 2, pp. 564–579, 1998.
[27] A. Liveris, Z. Xiong, and C. Georghiades, “Nested turbo codes for the binary Wyner-Ziv problem,” in Proc. ICIP’03, Barcelona, Spain, 2003, pp. 601–604.
[28] J. Conway and N. Sloane, Sphere Packings, Lattices and Groups. New York: Springer-Verlag, 1998.
[29] V. Tarokh, A. Vardy, and K. Zeger, “Universal bounds on the performance of lattice codes,” IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 670–681, 1999.
[30] Z. Liu, S. Cheng, A. Liveris, and Z. Xiong, “Slepian-Wolf coded nested quantization (SWC-NQ) for Wyner-Ziv coding: Performance analysis and code design,” in Proc. DCC’04, Snowbird, UT, 2004, pp. 322–331.
[31] J. Chou, S. Pradhan, and K. Ramchandran, “Turbo and trellis-based constructions for source coding with side information,” in Proc. DCC’03, Snowbird, UT, 2003, pp. 33–42.
[32] Y. Yang, S. Cheng, Z. Xiong, and W. Zhao, “Wyner-Ziv coding based on TCQ and LDPC codes,” in Proc. Asilomar Conf. Signals, Systems, and Computers, Pacific Grove, CA, 2003, pp. 825–829.
[33] M. Gastpar, P. Dragotti, and M. Vetterli, “The distributed, partial, and conditional Karhunen-Loeve transforms,” in Proc. DCC’03, Snowbird, UT, 2003, pp. 283–292.
[34] G. Caire, S. Shamai, and S. Verdu, “Lossless data compression with low-density parity-check codes,” in Multiantenna Channels: Capacity, Coding and Signal Processing, G. Foschini and S. Verdu, Eds. Providence, RI: American Math. Soc., 2003.
[35] J. Garcia-Frias and Y. Zhao, “Compression of binary memoryless sources using punctured turbo codes,” IEEE Commun. Lett., vol. 6, pp. 394–396, Sept. 2002.
[36] J.M. Shapiro, “Embedded image coding using zerotrees of wavelet coefficients,” IEEE Trans. Signal Processing, vol. 41, pp. 3445–3463, Dec. 1993.
[37] Y. Steinberg and N. Merhav, “On successive refinement for the Wyner-Ziv problem,” IEEE Trans. Inform. Theory, to be published.
[38] W. Equitz and T. Cover, “Successive refinement of information,” IEEE Trans. Inform. Theory, vol. 37, pp. 269–274, Mar. 1991.
[39] S. Cheng and Z. Xiong, “Successive refinement for the Wyner-Ziv problem and layered code design,” in Proc. DCC’04, Snowbird, UT, 2004, p. 531.
[40] Q. Xu and Z. Xiong, “Layered Wyner-Ziv video coding,” in Proc. VCIP’04, San Jose, CA, 2004, pp. 83–91.
[41] J. Chou, D. Petrovic, and K. Ramchandran, “A distributed and adaptive signal processing approach to reducing energy consumption in sensor networks,” in Proc. INFOCOM 2003, San Francisco, CA, 2003, pp. 1054–1062.
[42] Y. Yang, V. Stankovic, Z. Xiong, and W. Zhao, “Asymmetric code design for remote multiterminal source coding,” in Proc. DCC’04, Snowbird, UT, 2004, p. 572.
[43] R. Puri and K. Ramchandran, “PRISM: A new ‘reversed’ multimedia coding paradigm,” in Proc. ICIP’03, Barcelona, Spain, 2003, pp. 617–620.
[44] T. Berger, Z. Zhang, and H. Viswanathan, “The CEO problem [multiterminal source coding],” IEEE Trans. Inform. Theory, vol. 42, pp. 887–902, May 1996.
[45] R. Barron, B. Chen, and G. Wornell, “The duality between information embedding and source coding with side information and some applications,” IEEE Trans. Inform. Theory, vol. 49, pp. 1159–1180, May 2003.
[46] S.I. Gelfand and M.S. Pinsker, “Coding for channel with random parameters,” Probl. Contr. Inform. Theory, vol. 9, no. 1, pp. 19–31, 1980.
[47] M. Costa, “Writing on dirty paper,” IEEE Trans. Inform. Theory, vol. 29, no. 3, pp. 439–441, 1983.
[48] G. Caire and S. Shamai, “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inform. Theory, vol. 49, no. 7, pp. 1691–1706, 2003.
[49] V. Goyal, “Multiple description coding: Compression meets the network,” IEEE Signal Processing Mag., vol. 18, no. 5, pp. 74–93, 2001.
[50] A. Sehgal, A. Jagmohan, and N. Ahuja, “Wyner-Ziv coding of video: An error-resilient compression framework,” IEEE Trans. Multimedia, vol. 6, pp. 249–258, Apr. 2004.
