CODING SCHEMES FOR
DETERMINISTIC INTERFERENCE CHANNELS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF
ELECTRICAL ENGINEERING
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Bernd Bandemer
December 2011
http://creativecommons.org/licenses/by-nc-nd/3.0/us/
This dissertation is online at: http://purl.stanford.edu/sy560th1179
© 2011 by Bernd Frank Bandemer. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Abbas El-Gamal, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Arogyaswami Paulraj
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Itschak Weissman
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Abstract
One of the canonical unsolved problems in network information theory is to find the capacity region of the interference channel. The problem is motivated by today’s wireless
communication systems which have experienced a steep growth in participant density and
thus increasingly operate in the interference-limited regime. Data rates are no longer limited
by propagation path loss and thermal noise, but instead, by simultaneous transmissions in
the same frequency band.
The interference channel models such concurrent communication among several transmitter–receiver pairs using a shared medium. Its capacity region describes the optimal
trade-off between simultaneously achievable data rates. While considerable progress has
been made in characterizing the capacity region for the case with two sender–receiver pairs,
much less is known for interference channels with three or more user pairs.
This dissertation contributes to this area by investigating a class of deterministic (noise-free) interference channels with three user pairs. A series of three coding schemes and
corresponding achievable rate regions is developed, each of which subsumes and improves
upon its predecessor. As a baseline, a first transmission scheme is considered where point-to-point random codes are combined with receivers that disregard the special statistical
structure of the interfering signals and simply treat them as white noise. It is shown that
despite its simplicity, this scheme achieves the sum capacity for an important subclass of
channels.
The baseline scheme is not optimal in general. In order to overcome its shortfalls, two
different viewpoints on the interference channel are taken. The receiver-centric view states
that each received signal is composed of multiple independent structured transmissions,
among which only the desired one must be decoded correctly. An approach is developed that allows the receivers to exploit the structure of the combined interference signal without insisting on decoding any of the interfering messages partly or fully. This interference decoding
scheme results in a second achievable rate region that is shown to strictly dominate treating
interference as noise. It also contains as a special case the scheme that decodes
the undesired messages uniquely.
The complementary view of the interference channel is transmitter-centric. The observation that each sender affects all receivers, but needs to convey a message only to one of
them while minimizing the disturbance caused at the others leads to a new model of communication with disturbance constraints. Disturbance is measured by a mutual information
expression that represents the rate of unwanted information flow from the transmitter to
the side receivers. The disturbance-constrained communication problem is first studied in
isolation, and its optimal coding schemes are identified. The rate–disturbance trade-off is
established for the single constraint case, where the optimal encoding scheme turns out to be
the same as the Han–Kobayashi scheme for the two-user-pair interference channel. For the
case of communication with two disturbance constraints, the best known encoding scheme
involves rate splitting, Marton coding, and superposition coding, and is shown to be optimal
in several nontrivial cases.
Finally, the two viewpoints are consolidated by applying the codebook structure from
communication with two disturbance constraints in the three-user-pair interference channel
and combining it with interference-decoding receivers. This yields a third achievable rate
region that is the central result of this dissertation. It is strictly larger than the two previous
inner bounds to the capacity region. Furthermore, it is shown to achieve the capacity region
of each two-user-pair subchannel embedded within the three-pair interference channel, and
as such, the coding scheme generalizes the Han–Kobayashi scheme to more than two user
pairs.
While the results are presented in the framework of the deterministic interference channel
with three user pairs, the modular approach of separating the transmitter- and receiver-centric viewpoints, as well as the new coding schemes, applies in principle to general discrete
memoryless interference channels.
Acknowledgment
It is my pleasure to thank those whose support has made my doctoral work and this dissertation possible. Most importantly, I owe my gratitude to Professor Abbas El Gamal for
accepting me as his doctoral student and kindling my interest in information theory. Since I first met him, I have been, and continue to be, impressed by the depth of his knowledge. It was a
great joy to learn from him and contribute to the field with him.
I am indebted to Professor Arogyaswami Paulraj for his service as my associate doctoral
adviser. He accepted me into his research group early on and offered me a stimulating
academic home to grow in. I also thank him for keeping my eye on the practical consequences
of information theory in wireless communications.
I would like to thank Professor Tsachy Weissman for serving as dissertation reader and
providing constructive feedback and encouragement. I am grateful to Professor Andrea
Goldsmith and Professor Ramesh Johari for participating as examiners in my oral dissertation
defense and contributing much appreciated questions and insightful suggestions.
The experience during my doctoral studies has been greatly enhanced by my colleagues
and fellow researchers. I would like to thank Yeow-Khiang Chia, Han-I Su, Lei Zhao, Paul
Cuff, and Haim Permuter for many fruitful discussions and inspiring insights, information-
theoretic and otherwise. I have also thoroughly enjoyed and benefited from discussions with
visiting professors David Tse and Pramod Viswanath. From Professor Paulraj’s group, I am
grateful to my scientific collaborators Aydin Sezgin, Gonzalo Vazquez Vilar, Nicolai Czink,
and Taemin Kim, and to my officemates Alireza Ghaderipoor, Amin Mobasher, Gokmen
Altay, Heunchul Lee, Jan Haase, Martin Wrulich, Mohamad Charafeddine, Moon-Sik Lee,
Naoki Kita, Simon Umbricht, Stephanie Pereira, Takayuki Shimizu, A. J. Thiruvengadam,
and Hyunjong Yang, who made day-to-day life in the office fun and memorable. In addition,
I thank ISL assistants Kelly Yilmaz, Denise Murphy, and Rashmi Shah for administrative
help.
I would like to express my appreciation and gratitude to Eric and Illeana Benhamou,
who have generously supported my doctoral studies through a Stanford Graduate Fellowship.
This has given me the academic freedom to pursue what I was most interested in.
My deepest gratitude belongs to my family. I wholeheartedly thank my parents Heidi
and Werner Bandemer for always being there for me. It is only through their support that I
could get this far. I thank my sister and brother-in-law Sabine and Jan Bandemer for being
a source of inspiration and happiness. My heartfelt thankfulness goes to my love and best
friend Leila Zia, who brightens my life every day.
Contents
Abstract v
Acknowledgment vii
1 Introduction 1
1.1 Brief survey of known results . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Interference channels with two user pairs . . . . . . . . . . . . . . 3
1.1.2 Interference channels with more than two user pairs . . . . . . . . . 5
1.2 Three-user-pair deterministic interference channel . . . . . . . . . . . . . . 6
1.3 Organization of this dissertation . . . . . . . . . . . . . . . . . . . . . . . 10
2 Treating interference as noise 13
2.1 Inner bound by treating interference as noise . . . . . . . . . . . . . . . . 13
2.2 Binary-field 3-DIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.1 Converse proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.2 Achievability proof . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.3 Optimal assignment matrices . . . . . . . . . . . . . . . . . . . . . 28
3 Interference Decoding 31
3.1 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Interference-decoding inner bound . . . . . . . . . . . . . . . . . . 32
3.1.2 Capacity region under strong interference . . . . . . . . . . . . . . 34
3.1.3 Comparison to treating interference as noise . . . . . . . . . . . . . 35
3.1.4 Extension to 3-DIC with noisy observations . . . . . . . . . . . . . 38
3.1.5 Interference decoding is not optimal in general . . . . . . . . . . . 40
3.2 Proof of Theorem 3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.3 Proof of Theorem 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4 Proof of Theorem 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4 Communication with disturbance constraints 53
4.1 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.1 Rate–disturbance region for a single disturbance constraint . . . . . 55
4.1.2 Inner and outer bounds for the deterministic channel with two disturbance constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Proofs for a single disturbance constraint . . . . . . . . . . . . . . . . . . . 68
4.2.1 Proof of achievability for Theorem 4.1 . . . . . . . . . . . . . . . . 68
4.2.2 Proof of converse for Theorem 4.1 . . . . . . . . . . . . . . . . . . 69
4.2.3 Proof of Corollary 4.1 . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.4 Proof of Corollary 4.2 . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2.5 Proof of Theorem 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.3 Proofs for two disturbance constraints . . . . . . . . . . . . . . . . . . . . 77
4.3.1 Proof of Theorem 4.3 . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.3.2 Proof of Theorem 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.3.3 Proof of Theorem 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.4 Proof of Corollary 4.3 . . . . . . . . . . . . . . . . . . . . . . . . 88
5 General achievable rate region for 3-DIC 91
5.1 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.1.1 Achievable rate region for 3-DIC . . . . . . . . . . . . . . . . . . 92
5.1.2 Alternative characterization of the achievable rate region . . . . . . 96
5.1.3 Region without Ul . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.1.4 Special case: One-to-many 3-DIC . . . . . . . . . . . . . . . . . . 101
5.2 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.2.1 Codebook generation for Theorem 5.1 and Corollary 5.1 . . . . . . 104
5.2.2 Error probability analysis for Corollary 5.1 . . . . . . . . . . . . . 106
5.2.3 Equivalence of Theorem 5.1 and Corollary 5.1 . . . . . . . . . . . 113
5.2.4 Proof of Corollary 5.3 . . . . . . . . . . . . . . . . . . . . . . . . 114
6 Conclusion 117
A Useful auxiliary results 119
A.1 Probability decomposition by index and by value . . . . . . . . . . . . . 119
A.2 Independence lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
B Application of new techniques to 2-DIC 123
B.1 2-DIC has no saturation gain . . . . . . . . . . . . . . . . . . . . . . . . 126
C Mathematical notation 129
Bibliography 133
List of Tables
4.1 Message subsets for decoding error events. . . . . . . . . . . . . . . . . . . 79
5.1 Shorthand notation for terms related to transmitter 1. . . . . . . . . . . . . 94
5.2 Shorthand notation for terms related to transmitter 2. . . . . . . . . . . . . 96
5.3 Shorthand notation for terms related to transmitter 3. . . . . . . . . . . . . 97
5.4 Message subsets M1i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.5 Message subsets M2j. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.6 Message subsets M3j. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.7 Index subsets for union bound and corresponding sufficient conditions. . . . 112
B.1 2-DIC shorthand notation for terms related to transmitter 1. . . . . . . . . . 123
B.2 2-DIC shorthand notation for terms related to transmitter 2. . . . . . . . . . 124
List of Figures
1.1 Interference channel with K transmitter–receiver pairs. . . . . . . . . . . . 2
1.2 Deterministic interference channel with two user pairs (2-DIC). . . . . . . . 5
1.3 Deterministic interference channel with three user pairs (3-DIC). . . . . . . 7
1.4 3-DIC from the viewpoint of the first receiver. . . . . . . . . . . . . . . . . 7
1.5 Additive 3-DIC example (Example 1.1). . . . . . . . . . . . . . . . . . . . 9
1.6 Receiver and transmitter point of view of interference channels. . . . . . . 11
2.1 Region of Theorem 2.1 for the additive 3-DIC example. . . . . . . . . . . . 14
2.2 Cyclically symmetric binary-field 3-DIC. . . . . . . . . . . . . . . . . . . 16
2.3 Illustration of Theorem 2.2. . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Components of received signal Y1 for the converse proof of Theorem 2.2. . 19
2.5 Parameter regions for achievability proof of Theorem 2.2. . . . . . . . . . . 24
2.6 Transmit and received signal in region “Df”. . . . . . . . . . . . . . . . . . 25
2.7 Rules for verifying decodability. . . . . . . . . . . . . . . . . . . . . . . . 26
3.1 Region R1(p), which ensures decodability at the first receiver. . . . . . . . 34
3.2 Region of Theorem 3.1 for the additive 3-DIC example. . . . . . . . . . . . 37
3.3 Comparison of interference decoding and treating interference as noise. . . 38
3.4 Gaussian interference channel with BPSK. . . . . . . . . . . . . . . . . . . 40
3.5 2-DIC example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Capacity region for a deterministic MAC. . . . . . . . . . . . . . . . . . . 47
4.1 Communication system with disturbance constraints. . . . . . . . . . . . . 54
4.2 Example of R(U,X), the constituent region of R. . . . . . . . . . . . . . . 57
4.3 The link between 2-DIC and communication with disturbance constraints. . 58
4.4 Deterministic example with one disturbance constraint. . . . . . . . . . . . 58
4.5 Constituent region for Theorem 4.3. . . . . . . . . . . . . . . . . . . . . . 62
4.6 Deterministic channel with two disturbance constraints (Example 4.2). . . 65
4.7 Two-dimensional projections of the rate–disturbance region for Example 4.2. 66
4.8 Constituent region for Theorem 4.2. . . . . . . . . . . . . . . . . . . . . . 74
4.9 Illustration of decoding error events, for m0 = 1. . . . . . . . . . . . . . . 80
4.10 Constituent region for Corollary 4.4. . . . . . . . . . . . . . . . . . . . . . 88
5.1 Region of Corollary 5.2 for the additive 3-DIC example. . . . . . . . . . . 101
5.2 Comparison of the regions in Theorems 2.1 and 3.1 and Corollary 5.2. . . . 102
5.3 One-to-many special case of 3-DIC. . . . . . . . . . . . . . . . . . . . . . 102
List of Theorems
Theorem 1.1 Capacity region of 2-DIC, El Gamal–Costa 1982 . . . . . . . . . . . 5
Theorem 2.1 Treating interference as noise . . . . . . . . . . . . . . . . . . . . . . 13
Theorem 2.2 Normalized symmetric capacity for binary-field 3-DIC . . . . . . . . 17
Theorem 3.1 Interference-decoding inner bound . . . . . . . . . . . . . . . . . . . 33
Theorem 3.2 3-DIC capacity region with strong interference and invertible hk . . . 35
Theorem 3.3 Interference decoding versus treating interference as noise . . . . . . 36
Theorem 3.4 Interference decoding for 3-DIC with noisy observations . . . . . . . 39
Theorem 4.1 Rate–disturbance region of DMC-1-DC . . . . . . . . . . . . . . . . 55
Theorem 4.2 Gaussian vector channel with one disturbance constraint . . . . . . . 60
Theorem 4.3 Inner bound for deterministic DMC-2-DC . . . . . . . . . . . . . . . 61
Theorem 4.4 Outer bound for deterministic DMC-2-DC . . . . . . . . . . . . . . . 63
Theorem 4.5 Rate–disturbance region of certain deterministic DMC-2-DC . . . . . 64
Theorem 5.1 Inner bound to the capacity region of 3-DIC . . . . . . . . . . . . . . 94
Theorem B.1 Capacity region of 2-DIC . . . . . . . . . . . . . . . . . . . . . . . . 124
Corollary 4.1 Rate–disturbance region of deterministic DMC-1-DC . . . . . . . . . 56
Corollary 4.2 Gaussian channel with one disturbance constraint . . . . . . . . . . . 59
Corollary 4.3 Rate–disturbance region with degraded side receivers . . . . . . . . . 67
Corollary 4.4 Simpler inner bound for deterministic DMC-2-DC . . . . . . . . . . 87
Corollary 5.1 Alternative inner bound to the capacity region of 3-DIC . . . . . . . 97
Corollary 5.2 Inner bound to the capacity region of 3-DIC, no Ul . . . . . . . . . . 100
Corollary 5.3 Inner bound to the capacity region of one-to-many 3-DIC . . . . . . 103
Corollary A.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Corollary A.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Corollary B.1 Capacity region of 2-DIC, no saturation . . . . . . . . . . . . . . . . 126
Lemma 3.1 Packing lemma for pairs . . . . . . . . . . . . . . . . . . . . . . . . . 43
Lemma A.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Lemma A.2 Independence lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Chapter 1
Introduction
The information-theoretic interference channel is a model for concurrent data transmission
using a coupled medium. Consider, for example, a wireless communication system in which
many participating devices have to use the same spectrum for regulation or implementation
reasons, and spatial proximity leads to interference between them. Similarly, nearby copper
wires in communication systems such as digital subscriber line (DSL) may be insufficiently
isolated from each other, thus introducing cross-talk between adjacent lines. From the
system designer’s point of view, it is important to understand the impact of interference on
system performance. How much does the presence of interference impair the achievable
transmission rates? Furthermore, algorithms and transmission schemes need to be devised
that are able to handle such interference. How should a communication system be designed
to be robust to interference?
To put these questions in a mathematical framework, consider the memoryless inter-
ference channel with K sender–receiver pairs depicted in Figure 1.1. In this channel, K
non-cooperating transmitters would each like to send a message to a corresponding receiver.
The concurrent transmissions are coupled to each other by way of the shared channel through
which they occur, which creates a trade-off between the reliably achievable data rates of
each communication. The goal of studying the interference channel is to characterize this
data rate trade-off.
Formally, the problem of finding this trade-off can be expressed as follows. The channel
consists of input and output alphabets X1, . . . ,XK and Y1, . . . ,YK and a collection of
Figure 1.1. Interference channel with K transmitter–receiver pairs.
conditional probability distributions p(y1, . . . , yK |x1, . . . , xK). In each channel use, the
channel outputs are drawn randomly from a probability distribution that depends on the
channel inputs.
A (2^{nR_1}, . . . , 2^{nR_K}, n) block code of data rates (R_1, . . . , R_K) ∈ ℝ_+^K and block length n consists of K encoding functions

x_k^n : {1:2^{nR_k}} → X_k^n, for k ∈ {1:K},

that map a message m_k into a transmitted codeword x_k^n(m_k), and K decoding functions

m̂_k : Y_k^n → {1:2^{nR_k}}, for k ∈ {1:K},

that map a received sequence y_k^n into an estimate m̂_k(y_k^n) of the message. The probability of error of a code is defined as

P_e^{(n)} = P{(M̂_1, M̂_2, . . . , M̂_K) ≠ (M_1, M_2, . . . , M_K)},

where the messages M_k are now random variables, independent of each other and uniformly distributed over the message sets {1:2^{nR_k}}, for k ∈ {1:K}.
A rate tuple (R_1, . . . , R_K) is achievable in the K-pair interference channel if there exists a sequence of (2^{nR_1}, . . . , 2^{nR_K}, n) block codes such that the probability of error tends to zero as the block length grows to infinity,

lim_{n→∞} P_e^{(n)} = 0.
The capacity region C of the K-pair interference channel is the closure of the set of all
achievable rate tuples. Our goal is to identify C for a given interference channel, since it
exactly describes the trade-off between data rates of the participating user pairs. We are
also interested in the nature of the transmission scheme that achieves capacity, as it offers
guidance and insight for practical system design.
1.1 Brief survey of known results
The interference channel was first introduced in [Ahl74]. A survey of early work is given
in [Meu94], and an up-to-date account of known results is contained in [EK11]. Despite
years of investigation, the capacity region of the interference channel remains unknown in
general. This is the case even for the two sender–receiver pair interference channel (K = 2).
1.1.1 Interference channels with two user pairs
The capacity region is known for certain classes of interference channels. When the interference is very strong [Car75], it is optimal for both receivers to decode both messages
completely. In fact, the presence of interference does not impair the per-user capacity in this
case. When the interference is strong [Sat81, CG87], it is still optimal for both receivers to
decode both messages, but the achievable rates are less than in the interference-free case.
When interference is weak enough, a natural scheme is to treat it as noise. For two-user
Gaussian interference channels, which are of particular practical relevance, this scheme
achieves the sum capacity provided the gains on the cross links are sufficiently small [AV09,
SKC09, MK09].
The best known general inner bound for the two-user-pair interference channel is
achieved by the Han–Kobayashi scheme [HK81], for which a simplified description has been
recently developed in [CMGE08]. The scheme combines the ingredients of the schemes that
work well for strong and weak interference. Each message is divided into a common part
which is decoded at both receivers, and a private part which is decoded only at the desired
receiver and treated as noise at the undesired receiver. Codebooks are then constructed using
superposition coding. The resulting achievable rate region is optimal for all interference
channels for which the capacity region is known. For the Gaussian interference channel,
a simplified Han–Kobayashi scheme with Gaussian input distributions has been shown
in [ETW08] to be at most half a bit per user away from the capacity region. This result is
shown using outer-bounding techniques first developed in [Kra04].
Injective deterministic interference channel with two user pairs
An interference channel that is of particular interest to us is the two-user-pair deterministic
interference channel (2-DIC) depicted in Figure 1.2. The channel consists of two sender alphabets Xl, for l ∈ {1:2}, and two receiver alphabets Yk, for k ∈ {1:2}, loss functions glk that model the links between each sender and receiver, and a function fk at each receiver
that maps the two impinging signals into the receiver observations Y1 and Y2. The channel
is memoryless and the outputs are deterministic functions of the inputs,
Y1 = f1(X11, X21),
Y2 = f2(X22, X12), where
Xlk = glk(Xl).
We assume that the functions fk are injective in each argument, that is, they become one-to-one when either one of their arguments is fixed. For example, for Y1 = f1(X11, X21), this
assumption is equivalent to H(X11 |X21) = H(Y1 |X21) and H(X21 |X11) = H(Y1 |X11) for every probability mass function (pmf) p(x11, x21). An example of a function that is injective in
each argument (but not injective) is regular addition.
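On finite alphabets, injectivity in each argument can be checked mechanically by brute force. A small illustrative sketch (not from the dissertation) confirming that ordinary addition on {0, 1, 2} is injective in each argument but not injective overall:

```python
from itertools import product

def injective_in_each_argument(f, X, Y):
    """Check that f(., y) is one-to-one for every fixed y, and that
    f(x, .) is one-to-one for every fixed x."""
    for y in Y:
        if len({f(x, y) for x in X}) != len(X):
            return False
    for x in X:
        if len({f(x, y) for y in Y}) != len(Y):
            return False
    return True

def injective(f, X, Y):
    """Check that f is one-to-one on the full product alphabet X x Y."""
    images = [f(x, y) for x, y in product(X, Y)]
    return len(set(images)) == len(images)

add = lambda x, y: x + y
A = [0, 1, 2]
print(injective_in_each_argument(add, A, A))  # True
print(injective(add, A, A))                   # False: 0+2 == 1+1 == 2+0
```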
As shown in [EC82], the capacity region of this channel is known and achieved by the
Han–Kobayashi scheme.
Figure 1.2. Deterministic interference channel with two user pairs (2-DIC).
Theorem 1.1 (Capacity region of 2-DIC, El Gamal–Costa 1982).
The capacity region C2-DIC of the two-pair deterministic interference channel is the set of rate pairs (R1, R2) that satisfy

R1 ≤ H(X11 |Q),
R2 ≤ H(X22 |Q),
R1 + R2 ≤ H(Y1 |X12, Q) + H(Y2 |X21, Q),
R1 + R2 ≤ H(X11 |X12, Q) + H(Y2 |Q),
R1 + R2 ≤ H(X22 |X21, Q) + H(Y1 |Q),
2R1 + R2 ≤ H(Y1 |Q) + H(X11 |X12, Q) + H(Y2 |X21, Q),
R1 + 2R2 ≤ H(Y2 |Q) + H(X22 |X21, Q) + H(Y1 |X12, Q),

for some distribution p = p(q)p(x1|q)p(x2|q).
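The entropy bounds of Theorem 1.1 can be evaluated numerically for any fixed input pmf. The sketch below is an illustration under assumed parameters that are not from the dissertation: a hypothetical 2-DIC with identity loss functions, f1 = f2 = XOR on binary alphabets, a degenerate time-sharing variable Q, and uniform independent inputs.

```python
from itertools import product
from math import log2

# Evaluate the seven bounds of Theorem 1.1 for one fixed input pmf
# (degenerate Q). Assumed toy channel: all g_lk = identity and
# f1 = f2 = XOR, which is injective in each argument.

X1 = X2 = (0, 1)
p1 = {0: 0.5, 1: 0.5}      # p(x1)
p2 = {0: 0.5, 1: 0.5}      # p(x2)

g = lambda x: x            # loss functions (identity)
f = lambda a, b: a ^ b     # receiver functions (XOR)

def H(*coords):
    """Entropy (in bits) of selected coordinates of the joint
    distribution of (x11, x12, x21, x22, y1, y2)."""
    marg = {}
    for x1, x2 in product(X1, X2):
        x11, x12, x21, x22 = g(x1), g(x1), g(x2), g(x2)
        v = dict(x11=x11, x12=x12, x21=x21, x22=x22,
                 y1=f(x11, x21), y2=f(x22, x12))
        key = tuple(v[c] for c in coords)
        marg[key] = marg.get(key, 0.0) + p1[x1] * p2[x2]
    return -sum(p * log2(p) for p in marg.values() if p > 0)

Hc = lambda a, b: H(a, b) - H(b)   # conditional entropy H(a | b)

bounds = {
    "R1":      H("x11"),
    "R2":      H("x22"),
    "R1+R2a":  Hc("y1", "x12") + Hc("y2", "x21"),
    "R1+R2b":  Hc("x11", "x12") + H("y2"),
    "R1+R2c":  Hc("x22", "x21") + H("y1"),
    "2R1+R2":  H("y1") + Hc("x11", "x12") + Hc("y2", "x21"),
    "R1+2R2":  H("y2") + Hc("x22", "x21") + Hc("y1", "x12"),
}
print(bounds)
```

For this strong-interference toy channel, the bounds H(X11 |X12) + H(Y2) and H(X22 |X21) + H(Y1) both evaluate to 1 bit, so for this pmf the region collapses to R1 + R2 ≤ 1.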
1.1.2 Interference channels with more than two user pairs
Much less is known about interference channels with more than two user pairs. In addition
to containing all complexities of the two-pair case, these channels exhibit the interesting
property that decoding at each receiver is impaired by the joint effect of interference from all
other senders rather than by each sender’s signal separately. Consequently, dealing directly
with the effect of the combined interference signal is expected to achieve higher rates.
One such coding scheme is interference alignment [MMK08, CJ08], in which the code
is designed so that the combined interference signal at each receiver is confined (aligned) to
a subset of the receiver signal space. Depending on the specific channel, this alignment may
be achieved via linear subspaces, signal scale levels, time delay slots, or number-theoretic
bases of rationally independent real numbers [EO09, MOMK09]. In some cases, e.g., the
multiple-input multiple-output (MIMO) Gaussian interference channel [CJ08], the decoder
simply treats interference as noise. In general, however, decoding can be thought of as a
two-step procedure. In the first step, the received signal is projected onto the desired signal
subspace, e.g., by multiplying it by a matrix as for the MIMO case [MMK08, CJ08] or by
separating each received symbol into its constituent lattice points as for the scalar Gaussian
case [MOMK09]. In the second step, interference-unaware decoding is performed on the
projection of the received signal.
A natural question that has not been answered in the literature is how to generalize the
Han–Kobayashi scheme to interference channels with more than two user pairs. For the
Gaussian case, it was shown in [BPT10] that a straightforward extension using a partial
message for each subset of receivers and superposition coding does not work well in general.
1.2 Three-user-pair deterministic interference channel
In this dissertation, we investigate the three-user-pair deterministic interference channel
(3-DIC) depicted in Figure 1.3, which we first introduced in [BE10, BE11d].
The channel consists of three sender–receiver alphabet pairs (Xl,Yl), loss functions glk that model the links between each sender and receiver, and a function at each receiver that maps the three impinging signals into the receiver observation Yl, for k, l ∈ {1:3}. Each of
these functions is composed of two stages, as depicted in Figure 1.4 for the first receiver,
namely an interference combining function hl and a receiver function fl. In the spirit of
2-DIC, we assume that the functions hl and fl are injective in each argument. For Y1 =
f1(X11, S1), this condition can be expressed in terms of entropies as H(X11 |S1) = H(Y1 |S1) and H(S1 |X11) = H(Y1 |X11) for every joint distribution p(x11, s1). The channel is assumed to
Figure 1.3. Deterministic interference channel with three user pairs (3-DIC).
Figure 1.4. 3-DIC from the viewpoint of the first receiver.
be memoryless. Its outputs are then given as
Yl = fl(Xll, Sl), where (1.1)
Xlk = glk(Xl),
S1 = h1(X21, X31),
S2 = h2(X32, X12),
S3 = h3(X13, X23).
This interference channel model is a natural choice to consider for several reasons. First,
it allows us to explore the effect of interference without noise. Although we are eventually
interested in understanding noisy interference channels, it is generally a good idea to study
simplified models first before we can hope to analyze the general case.
Second, the model allows us to argue explicitly about the combined effect of interference
on the receivers by giving direct access to the combined interference signals S1, S2, and S3.
Although we focus on the case with three user pairs, the insights generalize beyond it, since the step from two to three pairs is more difficult than the step from three to an
arbitrary number of pairs.
Third, the 3-DIC generalizes the 2-DIC discussed above, for which the capacity region
is known and achieved by the Han–Kobayashi scheme. This gives some hope that an
appropriate extension of Han–Kobayashi may be optimal for more than two user pairs.
Fourth, this class of interference channels includes the binary-field deterministic model
proposed in [ADT07, ADT11], which has been shown to approximate the two-pair Gaussian
interference channel well at high signal-to-noise ratios [BT08]. This is of great interest in
wireless communication.
Finally, although many of our results apply in principle to general interference channels,
their presentation and analysis is much simpler for the deterministic case. Focusing on the
3-DIC allows us to concentrate on the essence of the new ideas.
Example 1.1 (Additive 3-DIC). Consider a cyclically symmetric 3-DIC with
X1 = X2 = X3 = {0, 1, 2},
Y1 = Y2 = Y3 = {0, 1, 2, 3, 4},
g11 = g22 = g33 = Id,
g12 = g23 = g31 = {0 ↦ 0, 1 ↦ 1, 2 ↦ 1},
g13 = g21 = g32 = {0 ↦ 0, 1 ↦ 1, 2 ↦ 0},
h1 = h2 = h3 = +,
f1 = f2 = f3 = +.
The loss functions are inspired by the Blackwell broadcast channel [Meu77], and the inter-
ference combining functions and receiver functions are taken to be addition. The resulting
input-to-output mapping is shown in Figure 1.5. This 3-DIC is cyclically symmetric, i.e.,
the channel is invariant to cyclic relabeling of the pairs (performing subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 or 1 ↦ 3 ↦ 2 ↦ 1).
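Since the alphabets in Example 1.1 are tiny, the channel mapping can be enumerated exhaustively. The following illustrative sketch implements the mappings above and checks both the output alphabet and the cyclic symmetry:

```python
from itertools import product

# Example 1.1: cyclically symmetric additive 3-DIC.
ga = {0: 0, 1: 1, 2: 1}   # g12 = g23 = g31
gb = {0: 0, 1: 1, 2: 0}   # g13 = g21 = g32
gid = {0: 0, 1: 1, 2: 2}  # g11 = g22 = g33 = Id

def channel(x1, x2, x3):
    """Y_l = f_l(X_ll, S_l) with h_l = f_l = ordinary addition."""
    y1 = gid[x1] + gb[x2] + ga[x3]   # X11 + h1(X21, X31)
    y2 = ga[x1] + gid[x2] + gb[x3]   # X22 + h2(X32, X12)
    y3 = gb[x1] + ga[x2] + gid[x3]   # X33 + h3(X13, X23)
    return y1, y2, y3

inputs = list(product(range(3), repeat=3))

# Outputs stay within the stated alphabet {0, ..., 4}
assert all(set(channel(*x)) <= set(range(5)) for x in inputs)

# Cyclic symmetry: relabeling 1 -> 2 -> 3 -> 1 leaves the channel invariant
for x1, x2, x3 in inputs:
    y1, y2, y3 = channel(x1, x2, x3)
    assert channel(x3, x1, x2) == (y3, y1, y2)
print("cyclic symmetry verified")
```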
The 3-DIC capacity region is not known in general. In this dissertation, we make
progress in characterizing it by means of inner bounds and corresponding encoding schemes.
Figure 1.5. Additive 3-DIC example (Example 1.1): the input-to-output mapping of the channel.
1.3 Organization of this dissertation
The main body of the dissertation consists of four chapters. In Chapter 2, we establish
a baseline inner bound to the capacity region of the 3-DIC. This bound is achieved by
using point-to-point (non-layered) random codebooks at the transmitters, and treating all
interference as noise at the receivers. We show that this simple scheme can achieve sum
capacity for an important subclass of 3-DIC, in which the inputs and outputs are vectors of
bits and the channel loss functions are vector shift operations. This binary field model has
been shown to approximate Gaussian interference channels in the interference-limited (low
noise) regime.
In the subsequent chapters, we study the interference channel from two different view-
points as shown in Figure 1.6. From the point of view of each receiver, the channel resembles
a multiple-access channel [EK11]; see Figure 1.6(a). However, the receiver is interested in
decoding only one of the transmitted messages. In particular, the receiver is not required
to decode the undesired messages partly or fully. In Chapter 3, we focus on this receiver-
centric view. Assuming simple point-to-point random codes at the transmitters, we devise an
interference decoding receiver that does not uniquely decode any of the interfering messages,
but exploits the structure in the combined interfering signal to increase the achievable data
rate. This idea leads to an inner bound to the 3-DIC capacity region which is strictly larger
than the region achieved by treating interference as noise.
In Chapter 4, we take the opposite viewpoint as depicted in Figure 1.6(b). From the point
of view of each transmitter, the channel resembles a broadcast channel [EK11]. However,
the sender wishes to send a message only to one of the receivers while causing the least
disturbance to the other receivers. Abstracting this line of thought, we define the setting of
communication with disturbance constraints. We measure disturbance in terms of the rate
of undesired information flow from the sender. In the case of a single disturbance constraint,
the optimal encoding scheme turns out to be rate splitting and superposition coding, which
coincides with the Han–Kobayashi scheme for two-pair interference channels. This gives us
hope that a coding scheme for communication with two disturbance constraints would work
well in three-pair interference channels. Consequently, we develop inner and outer bounds
on the rate–disturbance region with two disturbance constraints.
Figure 1.6. Receiver and transmitter points of view of interference channels: (a) receiver point of view; (b) transmitter point of view.
Finally, in Chapter 5, we combine the insight from the preceding chapters to develop
a new inner bound to the 3-DIC capacity region which constitutes the main result of this
dissertation. We borrow the codebook structure from communication with disturbance
constraints and the receiver architecture from interference decoding and combine the two
pieces in a modular fashion. The resulting coding scheme achieves a larger rate region
than any previously known scheme. We argue that this is the natural way to generalize the
Han–Kobayashi scheme to interference channels with more than two user pairs.
Chapter 6 concludes the dissertation. There are three appendices: Appendix A contains
supporting mathematical results that may be useful in other applications. Appendix B applies
the 3-DIC techniques developed in this dissertation to the 2-DIC and shows how the result
collapses to Theorem 1.1. Appendix C summarizes our mathematical notation.
How to read this dissertation. Each of the following chapters consists of two parts. The
first part contains the theorems and corollaries that constitute the results in that chapter, a
discussion of their important properties, some concrete examples, and a high-level sketch
of the main proof ideas. Subsequent sections contain the proofs in detail. Some rather
technical parts are labeled as propositions and are proved outside the main flow of the text.
It is recommended to approach the material in the fashion of successive refinement: A first
pass over the material would include only the first section of each chapter, saving a deeper
descent into the mathematical underpinnings for later passes.
Chapter 2
Treating interference as noise
In this chapter,¹ we review a first inner bound on the capacity region of the 3-DIC, which
serves as a benchmark for subsequent results. We also show that the inner bound achieves
the sum capacity for a special case of 3-DIC.
2.1 Inner bound by treating interference as noise
The following achievable rate region is well known in the literature [EK11] and applies to
general interference channels.
Theorem 2.1 (Treating interference as noise). The set RTIN of rate triples (R1, R2, R3) such that

Rk ≤ I(Xk; Yk | Q),   k ∈ {1:3},   (2.1)
for some pmf p(q)p(x1|q)p(x2|q)p(x3|q) constitutes an inner bound to the capacity
region of the memoryless interference channel with three user pairs.
To achieve this bound, a user pair does not need to know the codebooks of other user pairs.
Each receiver decodes only its own message. Although the interfering signal exhibits temporal
structure originating from the codebooks of the undesired transmitters, this structure is
disregarded by the receiver. Instead, the interference is regarded as independent samples
from a fixed distribution, i.e., it is treated as white noise.

¹The results in this chapter were first published in [BVVE09, BE10, BE11d].
Note that this inner bound, with appropriate selections of the input pmfs, includes the
interference alignment inner bounds in [CJ08, JV08]. In the case of 3-DIC as defined
in Section 1.2, we can identify the alignment effect in the rate conditions. Consider the
condition for the first message rate R1,
R1 ≤ I(X1;Y1 |Q) = H(Y1 |Q)−H(S1 |Q).
Recall that the combined interference S1 is a function of the individual interference signals
X21 and X31. Thus,
H(S1 |Q) ≤ H(X21 |Q) +H(X31 |Q).
Alignment occurs when the inequality is strict, i.e., the effect of the combined interference
signal is less severe than the sum of the effects of the interference signals individually.
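For the additive example above, this strictness can be checked numerically. The following sketch (my own, with i.i.d. uniform inputs assumed purely for illustration) computes the relevant entropies:

```python
# Alignment in Example 1.1: with X2, X3 i.i.d. uniform on {0,1,2},
# H(S1) is strictly smaller than H(X21) + H(X31).
from fractions import Fraction
from math import log2

def H(pmf):
    """Entropy in bits of a pmf given as {value: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

g21 = {0: 0, 1: 1, 2: 0}
g31 = {0: 0, 1: 1, 2: 1}
u = Fraction(1, 3)

p21, p31, pS1 = {}, {}, {}
for x in range(3):
    p21[g21[x]] = p21.get(g21[x], 0) + u   # pmf of X21 = g21(X2)
    p31[g31[x]] = p31.get(g31[x], 0) + u   # pmf of X31 = g31(X3)
for a, pa in p21.items():
    for b, pb in p31.items():
        pS1[a + b] = pS1.get(a + b, 0) + pa * pb   # pmf of S1 = X21 + X31

assert H(pS1) < H(p21) + H(p31)   # strict inequality: alignment occurs
```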
Continuation of Example 1.1. Recall the channel in Example 1.1 on page 8. The inner
bound to the capacity of this channel given by Theorem 2.1 is depicted in Figure 2.1.
Figure 2.1. Region of Theorem 2.1 for the additive 3-DIC example.
2.2 Binary-field 3-DIC
In this section, we discuss an important subclass of 3-DIC for which Theorem 2.1 achieves
the sum capacity. We specialize the 3-DIC model as follows. Let the input and output
alphabets be

Xk = F_2^N,
Yk = F_2^{2N},

for k ∈ {1:3}, where F_2 denotes the binary finite field. The inputs and outputs of the
channel are thus column vectors of bits. We choose the combining functions hk and fk as
componentwise finite-field addition in F_2^{2N}. Finally, the loss functions g_lk map from
F_2^N to F_2^{2N} and are given as

g11 = g22 = g33 : x ↦ Z x,
g12 = g23 = g31 : x ↦ S↓^{(1−β)N} Z x,
g13 = g21 = g32 : x ↦ S↑^{(α−1)N} Z x.

Here, Z ∈ F_2^{2N×N} is a zero-padding matrix, defined as

Z = [ 0_{N×N}
      I_N     ],

and S↑, S↓ ∈ F_2^{2N×2N} are up-shift and down-shift matrices, respectively, such that

S↑ [x1, x2, . . . , x_{2N−1}, x_{2N}]^T = [x2, x3, . . . , x_{2N}, 0]^T,
S↓ [x1, x2, . . . , x_{2N−1}, x_{2N}]^T = [0, x1, . . . , x_{2N−2}, x_{2N−1}]^T.
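The following sketch (my own numeric illustration, using integer matrices reduced mod 2) mirrors these definitions and verifies that the zero padding protects the data under up-shifts while a down-shift clips the lowest component:

```python
# Zero-padding and shift operators of the binary-field 3-DIC, as numpy
# matrices over F2 (entries in {0,1}, arithmetic taken mod 2).
import numpy as np

def Z(N):
    """Zero-padding matrix: an N x N zero block stacked on top of I_N."""
    return np.vstack([np.zeros((N, N), dtype=int), np.eye(N, dtype=int)])

def S_up(M):
    """Up-shift on length-M vectors: (x1,...,xM) -> (x2,...,xM,0)."""
    return np.eye(M, k=1, dtype=int)

def S_down(M):
    """Down-shift on length-M vectors: (x1,...,xM) -> (0,x1,...,x_{M-1})."""
    return np.eye(M, k=-1, dtype=int)

N = 3
x = np.array([1, 0, 1])                    # an input in F2^N
zx = Z(N) @ x % 2                          # padded to length 2N
assert list(zx) == [0, 0, 0, 1, 0, 1]
# Up-shift only discards a padding zero, so the data survives:
assert list(S_up(2 * N) @ zx % 2) == [0, 0, 1, 0, 1, 0]
# Down-shift clips the lowest component (here a data bit is lost):
assert list(S_down(2 * N) @ zx % 2) == [0, 0, 0, 0, 1, 0]
```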
The channel is parameterized by the triple (N,α, β), which we constrain to α ∈ [1, 2],
β ∈ [0, 1], and αN, βN ∈ Z. The parameters α and β characterize the amount of up/down-
shift on the cross links and thus loosely correspond to channel gains. Note that due to the
zero-padding matrix Z, the up-shift operation retains the complete information of its input,
while the down-shift operation incurs clipping at the low end of the vector.
This specialized 3-DIC is cyclically symmetric. Each transmitter causes interference
to one receiver through the up-shift function, and to one through the down-shift function.
Likewise, each receiver experiences one up-shifted and one down-shifted interfering signal.
The channel is depicted in Figure 2.2.
The binary-field 3-DIC is of interest because of its connection to Gaussian interference
channels. Deterministic channels of this type, as models for noisy networks, were first
proposed in [ADT07, ADT11]. In [BT08], the two-user Gaussian interference channel was
studied and it was shown that there is a correspondence between the generalized degrees of
freedom of the Gaussian case and the capacity of the deterministic binary-field case. The
deterministic model thus captures the asymptotic behavior of the Gaussian channel in the
interference-limited regime. Some progress toward generalizing this result to more than two
user-pairs has been made in [JV08], where the solution is found for the fully symmetric case
where α = β.
In addition to the capacity region C, define the sum capacity as RΣ = sup{R1 + R2 + R3 | (R1, R2, R3) ∈ C} and the symmetric capacity as Rsym = sup{R | (R, R, R) ∈ C}. By symmetry of the channel and convexity of the capacity region, RΣ = 3Rsym.

Figure 2.2. Cyclically symmetric binary-field 3-DIC.

Furthermore,
define the normalized symmetric capacity dsym = Rsym/N , where the normalization is with
respect to the interference-free symmetric capacity N .
Before we state the sum capacity result for the binary-field 3-DIC, define the function

V(x) = (1 + |x − 1|) / 2 = { x/2       if x ≥ 1,
                             1 − x/2   if x < 1.
Remark 2.1. The normalized symmetric capacity of the binary-field 2-DIC with parameters
(N,α), where α ∈ [0,∞), can be expressed with the function V as
dsym = min{1, V(α), V(2α)}.
This was shown in [ETW08, BT08] and is essentially a consequence of Theorem 1.1,
specialized to the binary-field case.
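Remark 2.1 can be evaluated directly; the following sketch (my own) reproduces a few points of the resulting "W curve":

```python
# The function V and the 2-DIC normalized symmetric capacity of Remark 2.1.
def V(x):
    return (1 + abs(x - 1)) / 2        # = x/2 for x >= 1, 1 - x/2 for x < 1

def dsym_2dic(alpha):
    return min(1, V(alpha), V(2 * alpha))

assert dsym_2dic(0.0) == 1.0               # no interference
assert dsym_2dic(0.5) == 0.5               # local minimum of the "W"
assert abs(dsym_2dic(2/3) - 2/3) < 1e-12   # local maximum of the "W"
assert dsym_2dic(1.0) == 0.5               # local minimum of the "W"
assert dsym_2dic(2.0) == 1.0               # very strong interference
```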
We are now ready to state the normalized symmetric capacity result for the binary-field
3-DIC defined above for a large set of (α, β) parameters.
Theorem 2.2 (Normalized symmetric capacity for binary-field 3-DIC). The normalized symmetric capacity of the cyclically symmetric binary-field 3-DIC with parameters (α, β) ∈ [1, 2] × [0, 1], where α ≥ 2β or α ≥ β/2 + 1, is

dsym = min{1, V(α), V(β), V(2β), V(α − β)}.
Figure 2.3 illustrates the theorem. The claimed dsym is piecewise linear in (α, β), and
the figure shows the linear regions in the parameter plane. The value of dsym is indicated by
shading.
Remark 2.2. The theorem implies that dsym is independent of N . For fixed α and β, all
valid values of N (satisfying αN, βN ∈ Z) yield the same dsym.
Remark 2.3. The result of the theorem continues to hold for the cyclically symmetric
binary-field K-DIC.
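The claimed dsym can likewise be evaluated numerically; the sketch below (my own, with the theorem's region-of-validity condition included as an assertion) checks a few parameter points:

```python
# Numeric evaluation of the Theorem 2.2 formula.
def V(x):
    return (1 + abs(x - 1)) / 2

def dsym_3dic(alpha, beta):
    # region of validity claimed by the theorem: alpha >= 2*beta or
    # alpha >= beta/2 + 1 (i.e., everywhere except region "X")
    assert alpha >= 2 * beta or alpha >= beta / 2 + 1
    return min(1, V(alpha), V(beta), V(2 * beta), V(alpha - beta))

assert dsym_3dic(2.0, 0.0) == 1.0   # no cross interference
assert dsym_3dic(2.0, 0.5) == 0.5   # V(2*beta) = V(1) binds
assert dsym_3dic(1.5, 0.5) == 0.5   # V(alpha - beta) = V(1) also binds
```

In agreement with Remark 2.2, the formula involves only (α, β), not N.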
Figure 2.3. Illustration of Theorem 2.2 in the (α, β) parameter plane. The result applies everywhere except in region “X”. The value of dsym is represented by different levels of shading, and local maxima are marked by a star.
The proof is given in two parts below. In Subsection 2.2.1, we prove the converse by
allowing the receivers access to some additional “genie” information. In Subsection 2.2.2,
we prove achievability by identifying optimal input distributions for Theorem 2.1.
2.2.1 Converse proof
The upper bounds 1, V(α), V(β), and V(2β) follow in a straightforward way from the known
degrees-of-freedom result for the two-user-pair case (see Remark 2.1). This can be shown by
giving the complete signal X_k^n of one of the interferers as genie information to the receivers,
thus effectively reducing the three-user-pair case to the two-pair case.
Hence we focus on proving the bound V(α− β) by generalizing the methods introduced
in [EC82] to the case at hand. First note that Fano's inequality implies, for every k,

nRk ≤ I(X_k^n; Y_k^n) + nεn,

where εn tends to zero as the block length n grows to infinity.
Without overlap between interferers
First consider α − β ≥ 1, which corresponds to the first line in the definition of V(α − β).
In this case, the two interfering signals do not overlap within the received signal, as shown
in Figure 2.4(a). For example, at receiver 1, the sparsity patterns of X21 and X31 are disjoint.
We can write

I(X_1^n; Y_1^n) (a)= I(X_1^n; Y_1^n, X_23^n)
                   = I(X_1^n; X_23^n) + I(X_1^n; Y_1^n | X_23^n)
                (b)= H(Y_1^n | X_23^n) − H(Y_1^n | X_1^n, X_23^n),

where in (a), X_23^n is a form of “genie” information given to the receiver, which does not
increase the mutual information since X23 is not interfered with in Y1, and (b) uses the
independence between the messages from the first and second transmitters. Now consider the
last term.
H(Y_1^n | X_1^n, X_23^n) = H(X_11^n + X_21^n + X_31^n | X_1^n, X_23^n)
                         = H(X_21^n + X_31^n | X_23^n)
                      (a)= H(X_31^n) + H(X_21^n | X_23^n)
                         = H(X_31^n) + H(X̄_23^n | X_23^n),

where (a) follows from the fact that X21 and X31 do not overlap and different transmitters'
signals are independent, and X̄23 is the part of X2 that is not contained in X23 (see
Figure 2.4(a)). We conclude that

I(X_1^n; Y_1^n) = H(Y_1^n | X_23^n) − H(X_31^n) − H(X̄_23^n | X_23^n).

Figure 2.4. Components of received signal Y1 for the converse proof: (a) interferers do not overlap; (b) interferers overlap. The thick horizontal line at the bottom of the figure symbolizes the “noise level”, i.e., the lower end of the vector where further down-shifts cause loss of information. The received signal Y1 is the elementwise sum of the three signals.
Writing analogous equations for I(X_2^n; Y_2^n) and I(X_3^n; Y_3^n), and adding all three of them,
we arrive at

n(R1 + R2 + R3 − 3εn) ≤ H(Y_1^n | X_23^n) + H(Y_2^n | X_31^n) + H(Y_3^n | X_12^n)
                        − H(X_12^n) − H(X̄_12^n | X_12^n) − H(X_23^n)
                        − H(X̄_23^n | X_23^n) − H(X_31^n) − H(X̄_31^n | X_31^n)
                      = H(Y_1^n | X_23^n) + H(Y_2^n | X_31^n) + H(Y_3^n | X_12^n)
                        − H(X_1^n) − H(X_2^n) − H(X_3^n).
Considering that nRk ≤ H(X_k^n) + nεn for all k ∈ {1:3}, we conclude that

2n(R1 + R2 + R3 − 6εn) ≤ H(Y_1^n | X_23^n) + H(Y_2^n | X_31^n) + H(Y_3^n | X_12^n)
                       ≤ nH(Y1 | X23) + nH(Y2 | X31) + nH(Y3 | X12),

where single-letterization is performed by using the chain rule and omitting part of the
conditioning. The right-hand side of the last inequality is maximized by letting each component
of each Xk be independent Bern(1/2). Thus

2(R1 + R2 + R3) ≤ 3N(α − β),

and finally,

dsym = Rsym/N ≤ (α − β)/2.
With overlap between interferers
Now consider the case where α − β < 1, i.e., the two interfering signals at each receiver
overlap in signal space; see Figure 2.4(b). Define the topmost (1 − (α − β))N bits of Xk as
Tk. We will augment the genie information X_23^n of the previous subsection by T_31^n. This is
exactly the part of the X3-based interference that overlaps with the X2-based interference.
Similar to the previous case, we conclude

I(X_1^n; Y_1^n) ≤ I(X_1^n; Y_1^n, X_23^n, T_31^n)
               = I(X_1^n; X_23^n, T_31^n) + I(X_1^n; Y_1^n | X_23^n, T_31^n)
               = H(Y_1^n | X_23^n, T_31^n) − H(Y_1^n | X_1^n, X_23^n, T_31^n).
The last term becomes
H(Y_1^n | X_1^n, X_23^n, T_31^n) = H(X_11^n + X_21^n + X_31^n | X_1^n, X_23^n, T_31^n)
                                 = H(X_21^n + X_31^n | X_23^n, T_31^n)
                              (a)= H(T̄_31^n | T_31^n) + H(X̄_23^n | X_23^n),

where T̄31 denotes the part of X31 that is not included in T31. Its size is N(α − 1). We are
allowed to separate the terms in (a) because the overlapping part is resolved by T31.
Again, repeating the same steps for all three rates, we arrive at

n(R1 + R2 + R3 − 3εn) ≤ H(Y_1^n | X_23^n, T_31^n) − H(T̄_12^n | T_12^n) − H(X̄_12^n | X_12^n)
                      + H(Y_2^n | X_31^n, T_12^n) − H(T̄_23^n | T_23^n) − H(X̄_23^n | X_23^n)
                      + H(Y_3^n | X_12^n, T_23^n) − H(T̄_31^n | T_31^n) − H(X̄_31^n | X_31^n).

Since T12 and T̄12 form X12, which when combined with X̄12 forms X1, we can write

n(R1 − εn) ≤ H(X_1^n) = H(T_12^n) + H(T̄_12^n | T_12^n) + H(X̄_12^n | T_12^n, T̄_12^n),

where the conditioning on (T_12^n, T̄_12^n) is equivalent to conditioning on X_12^n.
Using this expression and its equivalent for R2 and R3 with the previous inequality, we
obtain
2n(R1 + R2 + R3 − 6εn) ≤ H(Y_1^n | X_23^n, T_31^n) + H(T_12^n) + H(Y_2^n | X_31^n, T_12^n)
                       + H(T_23^n) + H(Y_3^n | X_12^n, T_23^n) + H(T_31^n)
                       ≤ n(H(Y1 | X23, T31) + H(T12) + H(Y2 | X31, T12)
                       + H(T23) + H(Y3 | X12, T23) + H(T31)).

Again, the right-hand side is maximized by choosing all Xk components independently
according to Bern(1/2), yielding

2(R1 + R2 + R3) ≤ 3N + 3N(1 − (α − β)),
dsym ≤ 1 − (α − β)/2,

which matches the definition of V(α − β) for α − β < 1. This concludes the converse proof
of Theorem 2.2. ∎
2.2.2 Achievability proof
We prove achievability by identifying optimal input distributions for Theorem 2.1. We use
input distributions of the form

Xk = G(α, β) Dk,   for k ∈ {1:3},

where G(α, β) is an assignment matrix of size N × N·dsym(α, β) with elements from F_2,
and Dk is a vector of N·dsym(α, β) independent message bits, each distributed Bern(1/2).
We further constrain the coding scheme in several ways. Firstly, all three transmitters use
the same assignment matrix G(α, β). Secondly, there is no coding across multiple channel
uses. Specifically, the assignment matrices are chosen such that I(Xk;Yk) = H(Xk), i.e.,
from observing the channel output Yk at a single time, the decoder can reconstruct the
complete transmit vector Xk that was sent at that time, and thereby, the message bits that
are contained in it. Like the encoding step, this reconstruction is implemented by a linear
operator applied to Yk. Finally, the proposed G matrices have at most one non-zero
element per row, i.e., each component of Xk is assigned either an information bit or a zero.
While these assumptions may seem overly restrictive, they are sufficient for our purposes.
Indeed, it is surprising that such a constrained set of codes is able to meet the upper bound
of Subsection 2.2.1.
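As a toy illustration of such a constrained assignment matrix (my own construction with hypothetical dimensions, not one of the G matrices of Subsection 2.2.3), consider:

```python
# An assignment matrix G with at most one non-zero element per row: each
# component of Xk is either a message bit or zero, so the map D -> X = G D
# is injective and H(Xk) = H(Dk).
import numpy as np
from itertools import product

N, m = 4, 2                      # input length and number of message bits
G = np.zeros((N, m), dtype=int)
G[0, 0] = 1                      # component 1 carries data bit 1
G[2, 1] = 1                      # component 3 carries data bit 2; rows 2, 4 are zero

assert all(row.sum() <= 1 for row in G)
images = {tuple(G @ np.array(d) % 2) for d in product([0, 1], repeat=m)}
assert len(images) == 2 ** m     # injective: all 2^m transmit vectors distinct
```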
Remark 2.4. If the number N of components in the input vectors is small, it can severely
limit our options in terms of assignment matrices. The following argument can circumvent
this problem by expanding a given channel to one with larger N. To this end, take the
binary-field 3-DIC with parameters (N, α, β), and consider L ≥ 2 subsequent channel uses
with channel inputs X_{k,1}, . . . , X_{k,L}. Let us interleave these vectors into a supersymbol

Xk = Σ_{l=1}^{L} (I_N ⊗ e_l) X_{k,l},

and likewise for the outputs Yk. Here, ⊗ denotes the Kronecker product, and e_l is the lth
column of I_L. The resulting channel {X1, X2, X3} → {Y1, Y2, Y3} is then a binary-field
3-DIC with parameters (LN, α, β). Through this method, we have increased N to LN, where
L can be arbitrarily large. Note that dsym is unaffected by this transformation since it is
normalized by N. In light of this transformation, we assume from now on that N is large
enough that any fraction of N that we incur evaluates to an integer.
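The interleaving can be sketched as follows (my own illustration; integer labels are used instead of bits so that the final placement is visible):

```python
# Interleaving supersymbol of Remark 2.4, with N = 2 and L = 3. Component i
# of the L channel uses ends up in a contiguous run of L positions.
import numpy as np

N, L = 2, 3
I_N, I_L = np.eye(N, dtype=int), np.eye(L, dtype=int)
xs = [np.array([10 * l + 1, 10 * l + 2]) for l in range(L)]  # inputs X_{k,l}

x_super = sum(np.kron(I_N, I_L[:, [l]]) @ xs[l][:, None] for l in range(L))
# Same-position components across channel uses are adjacent:
assert x_super.ravel().tolist() == [1, 11, 21, 2, 12, 22]
```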
The assignment matrix G depends on the channel parameters α and β. The set of interest
{(α, β)} is divided into 18 regions “Aa” to “Ee” as shown in Figure 2.5. Compared to
Figure 2.3, some of the parameter regions are subdivided (for example, “Ea” and “Eb”),
which indicates that a different kind of assignment matrix G is needed even within a
parameter range where dsym is linear in α and β.
Optimal assignment matrices G for all parameter regions in Figure 2.5 are listed in
Subsection 2.2.3 on page 28. For each region, we specify the affine constraints on (α, β)
that define the region and the optimal input assignment matrix G. The latter is given in terms
of the resulting transmit vector Xk. In the following we discuss the details for one particular
example, which is representative for all other cases.
Optimal input distribution in region “Df”
This region is parameterized by (α, β) = (4/3 + ε, 2/3 + δ), with ε ≤ 2δ, ε ≥ δ/2, and δ ≤ 1/3.
Figure 2.5. Regions in the (α, β) parameter plane for the achievability proof of Theorem 2.2.

Figure 2.6(a), copied from Subsection 2.2.3, represents an optimal assignment G by means
of the resulting transmit vector. The vector Xk is subdivided into data blocks (hatched) that
correspond to non-zero rows of G, and zero blocks (gray) that correspond to all-zero rows
of G. Some data blocks occur twice. We denote such block pairs as twins. Twins carry
the same data bits, albeit in reverse order as discussed later. The length of each block as a
fraction of N is annotated in the figure.
To prove achievability of Theorem 2.2, we require the transmit vector to be both valid
and decodable. By valid we mean (a) all block lengths are non-negative for the range of
(ε, δ) that constitute the region, (b) the sum of the block lengths is 1, and (c) adding the
sizes of all data blocks, counting twins only once, results in the desired dsym as claimed in
Theorem 2.2, e.g., 2/3− δ/2 for our case. By decodable, we mean that using this transmit
vector assignment, the receiver can recover all desired data blocks from the received signal,
or equivalently, I(Xk;Yk) = H(Xk).
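For region “Df”, the validity conditions (a)–(c) can be checked mechanically. The sketch below (my own, in exact rational arithmetic; which blocks are data and which are twins is my inference from the block-length arithmetic, namely that two of the three (1/3 − δ) blocks carry data, one is the zero block, and the remaining blocks form two twin pairs) verifies them at sample points inside the region:

```python
# Validity checks for the region "Df" assignment of Figure 2.6(a).
from fractions import Fraction as F

def blocks(e, d):
    """The seven block lengths, as fractions of N."""
    t = F(1, 3)
    return [t - d, e - d/2, t - d, -e + 2*d, e - d/2, t - d, -e + 2*d]

def data_once(e, d):
    """Total data length with each twin pair counted once (inferred split)."""
    t = F(1, 3)
    return 2*(t - d) + (e - d/2) + (-e + 2*d)

# sample (epsilon, delta) with eps <= 2*delta, eps >= delta/2, delta <= 1/3:
for e, d in [(F(1, 6), F(1, 6)), (F(1, 12), F(1, 8)), (F(1, 4), F(1, 5))]:
    assert e <= 2*d and 2*e >= d and d <= F(1, 3)
    assert all(b >= 0 for b in blocks(e, d))   # (a) non-negative lengths
    assert sum(blocks(e, d)) == 1              # (b) lengths sum to 1
    assert data_once(e, d) == F(2, 3) - d/2    # (c) matches dsym = 2/3 - d/2
```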
To verify decodability, consider Figure 2.6(b), which uses the same conventions as
Figure 2.4. The receiver sees the sum of data blocks from different transmitters, each
characterized by its length and shift location. Blocks from different transmitters may or may
not overlap. Decoding is performed sequentially, block by block. In each step, one of three
rules is applied in order to decode additional data blocks, which are then removed from the
received signal. The three decoding rules are as follows.

Figure 2.6. Transmit and received signal in region “Df”. (a) Proposed assignment G, shown as the resulting transmit vector; the block lengths are given as fractions of N, such that the sum of all block lengths is 1. (b) Received signal Y1, at α = 1.6, β = 0.9, with dsym = 0.55; blocks in different columns carry different data.
Direct readout. Consider the situation in Figure 2.7(b). If a data block (i) does not
overlap with any other data block and (ii) is located above the noise level, then its data
content can be read out directly from the received signal. It is crucial that both (i) and (ii)
hold for all (α, β) in the region, since the length and location of the blocks in Figure 2.6(b)
change when α and β vary. A block that has been read out is then removed from the received
signal. If the block is one of a twin, its sibling is removed as well.
Overlapping twins (A). Consider Figure 2.7(c). If two twin pairs exist such that (i) they
have the same block length, b1 = b2, (ii) they have the same separation, s1 = s2, (iii) the
relative shift between the pairs is less than the separation, c < s1, and (iv) the solid green
sections of (A) in Figure 2.7(c) do not overlap with any other data block and are above the
noise floor, then both twin pairs can be decoded and canceled from the received signal.

Figure 2.7. Rules for verifying decodability. Legend (a) applies to “direct readout”, shown in (b), and to two variants of “overlapping twins”, shown in (c).

As before, conditions (i)–(iv) must hold for all (α, β) in the region. To see this, consider the
following successive decoding argument [JV08]. Let the two copies within a twin be in
reverse order of each other. First, the lowest part of the bottom blue twin is read out. Its data
reappears on the top end of the upper blue twin, thus revealing a chunk of data on the top
end of the upper orange twin. This data in turn is replicated at the lower end of the
bottom orange twin, which exposes a new part of the bottom blue twin. The process repeats
until both twin pairs are completely decoded.
Overlapping twins (B). This rule is a variant of the previous one, where pattern (B)
replaces pattern (A) in Figure 2.7 (c). Decoding proceeds similarly, but starting from the
inside end of the twins.
In our example, the sequence of steps that completely decodes X1 is annotated in
Figure 2.6(b): First, block 1 is decoded via direct readout. The now-known data block
and its twin are removed from the received signal Y1. The same rule allows block 2 to
be decoded, which is then removed from Y1. Each removal step makes more room for
subsequent rule applications. Next, the overlapping twins (A) rule is applied to the two pairs
of twins 3. Continuing in the same fashion, the removal of blocks 1, 2 and 3 enables the two
twin pairs 4 to be decoded using the overlapping twins (B) rule. Finally, data blocks 5 and 6
can be recovered by direct readout, which completes the decoding process. By symmetry,
the signals at the other two receivers can be similarly decoded.
The assignments for all other regions as listed in Subsection 2.2.3 can be shown to be
valid and decodable using the same procedure. This concludes the achievability proof of
Theorem 2.2. �
2.2.3 Optimal assignment matrices
For each region, we list the parameters, the affine constraints defining the region, the rate dsym, and the block lengths of the resulting transmit vector (in order, as fractions of N; in every case they sum to 1).

Region Aa.
Parameters: α = 2 + ε, β = δ.
Constraints: δ ≥ 0, δ ≤ 1 + ε, ε ≤ −δ.
Rate dsym: 1 + ε/2 − δ/2.
Transmit vector blocks: δ, −ε/2 − δ/2, 1 + ε, −ε/2 − δ/2.

Region Ab.
Parameters: α = 2 + ε, β = δ.
Constraints: ε ≤ 0, δ ≤ 1/2, ε ≥ −δ.
Rate dsym: 1 − δ.
Transmit vector blocks: δ, 1 − δ.

Region Ba.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ 3δ, ε ≤ 1/5 + δ, ε ≥ 1/10 − δ/2.
Rate dsym: 3/5 − ε/2 + δ/2.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε/2 − δ/2, 1/5 − ε/2 − δ/2, −1/5 + 2ε + δ, 1/5 − ε/2 − δ/2, 1/5 − ε/2 − δ/2, 1/5 + ε.

Region Bb.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ −δ/3, ε ≤ 1/5 + δ, ε ≤ 1/10 − δ/2, ε ≥ 3δ.
Rate dsym: 3/5 − ε/2 + δ/2.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε/2 − δ/2, (3/2)ε + δ/2, 1/5 − 2ε − δ, (3/2)ε + δ/2, 1/5 − ε/2 − δ/2, 1/5 + ε.

Region Bc.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ δ/2, ε ≤ 1/5 + δ, ε ≤ −δ/3.
Rate dsym: 3/5 − ε/2 + δ/2.
Transmit vector blocks: 1/5 + δ/2, 1/5 + ε, −(3/2)ε − δ/2, 1/5 + ε, −(3/2)ε − δ/2, 1/5 + ε, 1/5 + δ/2.

Region Bd.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≤ δ/2, ε ≤ −2δ, ε ≥ −1/5.
Rate dsym: 3/5 + ε/2.
Transmit vector blocks: −ε + δ/2, 1/5 + ε, −ε + δ/2, 1/5 + ε, −ε/2 − δ, 1/5 + ε, −ε/2 − δ, 1/5 + ε, −ε + δ/2, 1/5 + ε, −ε + δ/2.

Region Be.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≤ δ/2, ε ≥ −2δ, δ ≤ 1/10.
Rate dsym: 3/5 − δ.
Transmit vector blocks: −ε + δ/2, 1/5 + ε, −ε + δ/2, 1/5 + ε, 1/5 − 2δ, 1/5 + ε, −ε + δ/2, 1/5 + ε, −ε + δ/2.

Region Bf.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ δ/2, ε ≤ 3δ, ε ≤ 1/10 − δ/2.
Rate dsym: 3/5 − δ.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε + δ, 2ε − δ, 1/5 − 2ε − δ, 2ε − δ, 1/5 − ε + δ, 1/5 + ε.

Region Bg.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≤ 3δ, δ ≤ 1/10, ε ≥ 1/10 − δ/2.
Rate dsym: 3/5 − δ.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε + δ, 1/5 − 2δ, −1/5 + 2ε + δ, 1/5 − 2δ, 1/5 − ε + δ, 1/5 + ε.

Region Da.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≥ 2δ, ε ≥ −δ, ε ≤ 1/3 + δ.
Rate dsym: 2/3 − ε/2 + δ/2.
Transmit vector blocks: 1/3 − δ, ε/2 + δ/2, 1/3 − δ, ε/2 + δ/2, 1/3 − ε + δ.

Region Db.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≤ −δ, ε ≥ δ/2, δ ≥ −1/6.
Rate dsym: 2/3 + δ.
Transmit vector blocks: 1/3 + δ/2, 1/3 − δ, 1/3 + δ/2.

Region Dc.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≤ δ/2, ε ≥ 2δ, δ ≥ −1/6.
Rate dsym: 2/3 + δ.
Transmit vector blocks: −ε + δ/2, 1/3 + ε, −ε + δ/2, 1/3 + 2ε − 2δ, −ε + δ/2, 1/3 + ε, −ε + δ/2.

Region Df.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≤ 2δ, ε ≥ δ/2, δ ≤ 1/3.
Rate dsym: 2/3 − δ/2.
Transmit vector blocks: 1/3 − δ, ε − δ/2, 1/3 − δ, −ε + 2δ, ε − δ/2, 1/3 − δ, −ε + 2δ.

Region Ea.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 0, δ ≥ −1/15, ε ≥ 3δ.
Rate dsym: 2/3 + δ.
Transmit vector blocks: −3δ, 1/3 + 2δ, −3δ, 1/3 + 5δ, −3δ, 1/3 + 2δ.

Region Eb.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 0, δ ≤ −1/15, δ ≥ −1/6, ε ≥ 3δ.
Rate dsym: 2/3 + δ.
Transmit vector blocks: −3δ, 1/3 + 2δ, 1/3 + 2δ, −1/3 − 5δ, 1/3 + 2δ, 1/3 + 2δ.

Region Ec.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 3δ, ε ≥ −1/3 + δ, ε ≤ −1/3 − 2δ.
Rate dsym: 2/3 + ε/2 − δ/2.
Transmit vector blocks: −ε/2 − (3/2)δ, 1/3 + ε/2 + δ/2, 1/3 + ε/2 + δ/2, −1/3 − ε − 2δ, 1/3 + ε/2 + δ/2, 1/3 + ε/2 + δ/2, −ε/2 + (3/2)δ.

Region Ed.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 3δ, ε ≥ −1/3 − 2δ, ε ≥ −1/3 + δ, ε ≤ −3δ.
Rate dsym: 2/3 + ε/2 − δ/2.
Transmit vector blocks: −ε/2 − (3/2)δ, 1/3 + ε/2 + δ/2, −ε/2 − (3/2)δ, 1/3 + ε + 2δ, −ε/2 − (3/2)δ, 1/3 + ε/2 + δ/2, −ε/2 + (3/2)δ.

Region Ee.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 0, δ ≤ 1/3 + ε, δ ≥ −ε/3.
Rate dsym: 2/3 + ε/2 − δ/2.
Transmit vector blocks: 1/3 − δ, 1/3 − δ, ε/2 + (3/2)δ, 1/3 − δ, ε/2 + (3/2)δ, −ε.
Chapter 3
Interference Decoding
In this chapter,¹ we use interference decoding to develop an inner bound for the capacity
region of the 3-DIC defined in Section 1.2 on page 6. The idea is to treat the combined
interference signal as one entity at the receivers instead of artificially separating different
sources of interference from each other.
Explicit decoding of the combined interference signal was first discussed in [BPT10] for
the many-to-one Gaussian interference channel. The authors argue that with Gaussian codes,
decoding the combined interference is tantamount to decoding each interfering sender’s
codeword. On the other hand, with structured (lattice) codes, the combined interference can
be made to appear essentially as a codeword from a single interferer. Lattice codes [EZ04,
ELZ05] have found applications in a number of Gaussian interference network settings, see,
e.g., [SJV+08, MDFT11, TY11, SD11]. In general, for channels with inherent linearity such
as Gaussian interference channels, it is natural to consider decoding linear combinations of
interfering codewords, instead of individual codewords. This idea is developed in [NG09]
for Gaussian relay networks, leading to a compute–forward relaying scheme.
We focus on the 3-DIC, where the combined interference signal takes values from a
finite set, and therefore a certain type of alignment can be observed without resorting to
complicated structured codes [NG08]. We assume point-to-point codes without rate split-
ting or superposition coding since such codes are widely deployed and it is interesting to
investigate the benefit of using a more sophisticated receiver instead of treating interference
as noise.

¹The results in this chapter were first published in [BE10, BE11d].

Specifically, each receiver simultaneously decodes the intended message and the
combined interference without penalizing incorrect decoding of the latter. Of course, one
does not expect this scheme to be optimal in general, since even for the two-user-pair case,
superposition coding is required for optimality. Note that for our class of deterministic chan-
nels, algebraic structures such as linear subspaces or lattices do not exist in general. Hence,
our decoder does not use the two-step procedure as in the work on Gaussian channels and
their corresponding high SNR deterministic models (see the discussion in Subsection 1.1.2
on page 5).
The key observation is that depending on the input pmfs and the message rates, the
number of possible combined interference sequences can be equal to the number of in-
terfering message pairs, the number of typical combined interference sequences, or some
combination of the two. In our scheme, each sender does not need to know the other senders’
codebooks. However, we use simultaneous decoding, which requires that the receivers know
all codebooks. As in the recent characterization of the Han–Kobayashi region [CMGE08],
we do not require the interference decoding to be correct with arbitrarily small probability
of error.
3.1 Results and discussion
In the following we summarize the results in this chapter.
3.1.1 Interference-decoding inner bound
Fix the random tuple (Q,X1, X2, X3) ∼ p = p(q)p(x1|q)p(x2|q)p(x3|q), where Q is a
time-sharing random variable from alphabet Q. Define the region R1(p) ⊂ R3+ to consist of
the rate triples (R1, R2, R3) such that
R1 ≤ H(X11 | Q),                                                        (3.1)
R1 + min{R2, H(X21 | Q)} ≤ H(Y1 | X31, Q),                              (3.2)
R1 + min{R3, H(X31 | Q)} ≤ H(Y1 | X21, Q),                              (3.3)
R1 + min{R2 + R3, R2 + H(X31 | Q), H(X21 | Q) + R3, H(S1 | Q)} ≤ H(Y1 | Q).   (3.4)
Similarly, define the regions R2(p) and R3(p) by making the subscript replacements 1 → 2 → 3 → 1 and 1 → 3 → 2 → 1 in R1(p), respectively.
Let co(S) denote the convex hull of the set S.
Theorem 3.1 (Interference-decoding inner bound). The region

RID = co( ⋃p R1(p) ∩ R2(p) ∩ R3(p) ),

where p = p(q)p(x1|q)p(x2|q)p(x3|q), is an inner bound to the 3-DIC capacity region.
Region Rk(p) ensures decodability at receiver k, and the intersection R1(p) ∩R2(p) ∩R3(p) ensures decodability at all three receivers. The proof for this theorem is given in
Section 3.2.
Remark 3.1 (Saturation). The min terms on the left hand side of the inequalities arise
from counting the effective number of interfering sequences at various links of the channel.
For example, consider the min{R2, H(X21 |Q)} term in (3.2). If R2 is small, the number
of distinct sequences that can occur at X21 is equal to the number of possible messages from
sender 2. As R2 increases beyond H(X21 |Q), the number of possible sequences at X21
“saturates” to the number of typical sequences, which is roughly 2nH(X21 |Q). In this case,
we can increase the rate of the second sender further without negatively impacting the first
receiver. The min expressions in (3.3) and (3.4) likewise capture the saturation effects at
X31 and S1, respectively.
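The saturation effect of Remark 3.1 can be checked empirically. The following sketch (our own illustration; the block length and pmf are arbitrary) draws 2^{nR} i.i.d. codewords from a Bernoulli(1/2) pmf with H = 1 bit and measures the exponent of the number of distinct sequences, which tracks min{R, H}.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12  # short block length, for speed only

def distinct_exponent(R, p=(0.5, 0.5)):
    """Draw 2^{nR} i.i.d. codewords from pmf p and return (1/n) log2 of
    the number of distinct length-n sequences that actually occur."""
    M = int(2 ** (n * R))
    codewords = rng.choice(len(p), size=(M, n), p=p)
    return np.log2(len({tuple(c) for c in codewords})) / n

# Below H = 1 bit the exponent is about R; beyond H it saturates near H.
for R in (0.25, 0.5, 1.0, 1.5):
    print(R, round(distinct_exponent(R), 3))
```

For R = 1.5 the exponent stays near 1 bit: the codeword count exceeds the number of typical sequences, which is exactly the saturation captured by the min term in (3.2).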
An example of region R1(p) is plotted in Figure 3.1. The region is unbounded in the R2
and R3 directions, due to saturation. This is expected, since regardless of the values of R2
and R3, S1 can always be treated as noise to achieve a non-zero rate R1. However, as R2
and R3 become smaller, the proposed scheme takes advantage of the structure in S1 and can
thereby increase R1.
Figure 3.1. Region R1(p), which ensures decodability at the first receiver. (Axes: R1, R2, R3.)
Remark 3.2 (Convexity). The regions R1(p), R2(p), and R3(p), and thus their intersection, are generally nonconvex. By virtue of time sharing, we are allowed to convexify. However, this convexification is not achieved by the coded time-sharing mechanism of Q, hence the explicit convex hull operation in the theorem.
3.1.2 Capacity region under strong interference
Consider the subclass of 3-DIC with strong interference and invertible hk in which the
following two conditions hold.
First, the loss functions glk are such that
min{H(X12), H(X13)} ≥ H(X11),
min{H(X21), H(X23)} ≥ H(X22),
min{H(X31), H(X32)} ≥ H(X33),
for all product input pmfs p(x1)p(x2)p(x3). This condition implies that interference is
strong.
Second, the functions hk are invertible, i.e.,
H(S1) = H(X21) +H(X31),
H(S2) = H(X12) +H(X32),
H(S3) = H(X13) +H(X23),
for all product input pmfs p(x1)p(x2)p(x3). With the conditional invertibility property of
fk, the channel becomes a non-symmetric version of the deterministic model for the SIMO
interference channel described in [GJ11]. In both cases, a receiver can uniquely recover
both interfering signals given the received sequence and the desired transmitted sequence.
The capacity region under these conditions is achieved by interference decoding.
Theorem 3.2 (3-DIC capacity region with strong interference and invertible hk). The capacity region of the 3-DIC under strong interference and invertible hk functions
is the set of rate triples (R1, R2, R3) such that
Rk ≤ H(Xkk |Q), k ∈ {1:3},
R1 +R2 ≤ min{H(Y1 |X31, Q), H(Y2 |X32, Q)},
R1 +R3 ≤ min{H(Y1 |X21, Q), H(Y3 |X23, Q)},
R2 +R3 ≤ min{H(Y2 |X12, Q), H(Y3 |X13, Q)},
R1 +R2 +R3 ≤ min{H(Y1 |Q), H(Y2 |Q), H(Y3 |Q)},
for some (Q,X1, X2, X3) ∼ p(q)p(x1|q)p(x2|q)p(x3|q) with |Q| ≤ 12.
The proof of this theorem is given in Section 3.3.
3.1.3 Comparison to treating interference as noise
In the two-user-pair interference channel, decoding both messages at each receiver and
treating interference as noise are considered as two extreme schemes. The extremes are
bridged by the Han–Kobayashi scheme in which part of the interference is decoded and the
rest is treated as noise [EK11]. While treating interference as noise is better for channels
with weak interference, decoding both messages is optimal under strong interference. We show, surprisingly, that for the 3-DIC under consideration, treating interference as noise is a special case of interference decoding!
In Section 3.4, we establish the following result.
Theorem 3.3 (Interference decoding versus treating interference as noise). The rate region achievable by treating interference as noise (Theorem 2.1) is included in
the interference-decoding rate region of Theorem 3.1, i.e.,
RTIN ⊆ RID.
The difference between treating interference as noise and interference decoding is
essentially that the former assumes that the combined interference signal Sk is always
saturated, while the latter distinguishes between saturated and non-saturated cases. Later in
this section, we argue that the above inclusion result is tightly coupled to the definition of
the 3-DIC.
The following example shows that the inclusion of Theorem 3.3 can be strict, i.e., the
treating interference as noise region is strictly contained in the interference-decoding region.
Continuation of Example 1.1. Recall the additive 3-DIC in Example 1.1 on page 8. For
this channel, the interference-decoding rate region strictly contains the region achievable
by treating interference as noise. To demonstrate this, we computed the approximation of
the interference decoding inner bound depicted in Figure 3.2. Since it is computationally
infeasible to enumerate the potentially arbitrarily large number of conditional distributions
of inputs given Q as required by Theorem 3.1, we used the following procedure. We first
assume Q = ∅ and consider a grid over all input distributions p(x1)p(x2)p(x3). For each
grid point, we compute the achievable rate region as given by Theorem 3.1.
We represent the region as the convex hull of its corner points. The final approximation is
obtained by taking the union of all such corner points over the grid. Note that due to the
simple structure of RTIN in Theorem 2.1, which consists of a union of rectangular boxes, this
method can compute RTIN to arbitrary precision provided the grid is sufficiently fine (see
Figure 2.1 on page 14). On the other hand, when applied to Theorem 3.1, our approximation
method yields a possibly strictly smaller inner bound than RID.
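A minimal version of this grid procedure can be sketched in code. The channel below is a toy binary 3-DIC with modulo-2 links (not the additive channel of Example 1.1, whose alphabets are larger); all helper names are our own, and Q is taken to be degenerate as in the text.

```python
import itertools
import numpy as np

def H(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * np.log2(p) for p in pmf.values() if p > 0)

def pushforward(pmf, f):
    """pmf of f(x1, x2, x3) when (x1, x2, x3) ~ pmf."""
    out = {}
    for x, p in pmf.items():
        out[f(*x)] = out.get(f(*x), 0.0) + p
    return out

def cond_H(pmf, f, g):
    """H(f(X) | g(X)) for X ~ pmf."""
    return H(pushforward(pmf, lambda *x: (f(*x), g(*x)))) - H(pushforward(pmf, g))

# Toy binary 3-DIC at receiver 1: S1 = X2 xor X3, Y1 = X1 xor S1.
Y1 = lambda x1, x2, x3: x1 ^ x2 ^ x3

rhs_points = []
grid = np.linspace(0.05, 0.5, 6)          # grid over Bernoulli input parameters
for a, b, c in itertools.product(grid, repeat=3):
    pmf = {x: (a if x[0] else 1-a) * (b if x[1] else 1-b) * (c if x[2] else 1-c)
           for x in itertools.product((0, 1), repeat=3)}
    rhs_points.append((
        H(pushforward(pmf, lambda x1, x2, x3: x1)),   # H(X11), RHS of (3.1)
        cond_H(pmf, Y1, lambda x1, x2, x3: x3),       # H(Y1|X31), RHS of (3.2)
        cond_H(pmf, Y1, lambda x1, x2, x3: x2),       # H(Y1|X21), RHS of (3.3)
        H(pushforward(pmf, Y1)),                      # H(Y1),     RHS of (3.4)
    ))
```

The corner points of R1(p) follow from these right-hand sides together with the min terms; the final approximation is the convex hull of the union of corner points over the grid (and over R2(p) and R3(p)).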
Figure 3.3 depicts the intersection of the three-dimensional regions in Figures 2.1 and 3.2
with the plane defined by R1 = R3. Note that the same maximum sum rate RΣ = 3 is
achieved by both schemes. However, while treating interference as noise does so at exactly
one rate triple (R1 = R2 = R3 = 1), interference decoding achieves the maximal sum rate
at many different asymmetric rate triples.
Remark 3.3. As we have seen in Subsection 2.2, treating interference as noise achieves the
sum capacity for the cyclically symmetric binary-field 3-DIC in a wide range of parameters
(α, β). It would be interesting to investigate whether interference decoding can achieve
higher sum rates than treating interference as noise in the (α, β) range where the sum
capacity is not known. Moreover, even in the range where we know the sum capacity,
interference decoding may achieve higher asymmetric rates than treating interference as
noise, as in the additive 3-DIC example. The main challenge in settling these questions is
the prohibitively large space of possible input distributions in Theorem 3.1.
Figure 3.2. Region of Theorem 3.1 for the additive 3-DIC example (axes R1, R2, R3). Compare to Figure 2.1 on page 14.
Figure 3.3. Comparison of interference decoding and treating interference as noise (Theorems 2.1 and 3.1) for the additive 3-DIC example, plotted over (R1 + R3)/√2 and R2; the line RΣ = 3 marks the maximum sum rate.
3.1.4 Extension to 3-DIC with noisy observations
In this subsection, we consider the 3-DIC with noisy observations. In this generalization of
3-DIC, the channel outputs in (1.1) on page 8 are observed through memoryless channels
Yk → Zk for k ∈ {1:3}. Thus receiver k now observes a noisy version Zk of Yk, which may
be from a discrete or a continuous alphabet.
The interference-decoding inner bound generalizes to the 3-DIC with noisy observations
as follows. Let (Q,X1, X2, X3) ∼ p = p(q)p(x1|q)p(x2|q)p(x3|q). Define the region
R′1(p) ⊂ R3+ as the set of rate triples (R1, R2, R3) such that

R1 ≤ I(X1; Z1 | S1, Q),
R1 + min{R2, H(X21 | Q)} ≤ I(X1, X21; Z1 | X31, Q),
R1 + min{R3, H(X31 | Q)} ≤ I(X1, X31; Z1 | X21, Q),
R1 + min{R2 + R3, R2 + H(X31 | Q), H(X21 | Q) + R3, H(S1 | Q)} ≤ I(X1, S1; Z1 | Q).
Similarly, define the regions R′2(p) and R′3(p) by making the subscript replacements 1 → 2 → 3 → 1 and 1 → 3 → 2 → 1 in R′1(p), respectively.
Theorem 3.4 (Interference decoding for 3-DIC with noisy observations). The region

R′ID = ⋃p R′1(p) ∩ R′2(p) ∩ R′3(p),

where p = p(q)p(x1|q)p(x2|q)p(x3|q), is an inner bound to the capacity region of the
3-DIC with noisy observations.
The proof of this theorem proceeds completely analogously to the proof of Theorem 3.1 as presented in Section 3.2, and thus its details are omitted. Note that the inclusion of Theorem 3.3 does not generalize to the case with noisy observations; the formal reason is discussed in Remark 3.4 on page 52.
The following example demonstrates the inner bound for the 3-DIC with noisy observa-
tions. It also serves to illustrate that treating interference as noise can perform better than
interference decoding for this channel model.
Example 3.1 (Gaussian interference channel with BPSK). Consider the Gaussian inter-
ference channel with finite input alphabets. The channel output at receiver k is
Yk = ∑l=1..3 glk Xl,
Zk = Yk + Nk,                                            (3.5)
where glk ∈ R is the path gain from transmitter l to receiver k, and Nk is additive white
Gaussian noise of average power σ2. This is a realistic model for a wireless interference
channel where the transmitter hardware is based on digital signal processing (DSP) and
digital-to-analog conversion (DAC). For example, Xl = {+1,−1} represents a system with
a binary constellation, e.g., binary phase-shift keying (BPSK). Equation (3.5) represents
continuous-valued outputs (soft outputs), but our model would also apply if a quantizer is
added (hard outputs), for example due to analog–digital conversion (ADC) at the receivers.
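To make the comparison concrete, the rate achieved at receiver 1 by treating interference as noise, I(X1; Z1), can be computed numerically for the parameters used in Figure 3.4. This is our own sketch: Z1 is an equal-weight Gaussian mixture over the eight BPSK sign patterns, so both differential entropies reduce to mixture entropies.

```python
import itertools
import numpy as np

g11, g21, g31 = 1.8, 1.0, 1.1   # path gains into receiver 1 (Figure 3.4)
sigma2 = 0.1                    # noise power

def mixture_entropy(means, var, lo=-7.0, hi=7.0, m=20001):
    """Differential entropy (bits) of an equal-weight Gaussian mixture,
    by numerical integration on a fine grid."""
    z = np.linspace(lo, hi, m)
    pdf = sum(np.exp(-(z - mu) ** 2 / (2 * var)) for mu in means)
    pdf /= len(means) * np.sqrt(2 * np.pi * var)
    dz = z[1] - z[0]
    return float(-(pdf * np.log2(np.maximum(pdf, 1e-300))).sum() * dz)

signs = list(itertools.product((1, -1), repeat=3))
h_Z = mixture_entropy([g11*a + g21*b + g31*c for a, b, c in signs], sigma2)
# Conditioning on X1 = +1 leaves a mixture over the 4 interference patterns
# (by symmetry, conditioning on X1 = -1 gives the same entropy).
h_Z_given_X1 = mixture_entropy([g11 + g21*b + g31*c for b, c in
                                itertools.product((1, -1), repeat=2)], sigma2)
R1_tin = h_Z - h_Z_given_X1     # I(X1; Z1) with interference treated as noise
print(round(R1_tin, 3))
```

By the cyclic symmetry of the example, three times this per-user rate should be roughly consistent with the treating-interference-as-noise sum rate of 2.51 reported below.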
Figure 3.4 shows approximations of the inner bounds for a cyclically symmetric Gaussian
interference channel with BPSK inputs and continuous outputs. In contrast to the noiseless
case, neither the interference-decoding region nor the region achieved by treating interference
as noise contains the other, i.e., Theorem 3.3 does not hold for 3-DIC with noisy observations.
In particular, the sum rates achieved by treating interference as noise and interference
decoding are 2.51 and 2.37, respectively. Intuitively, interference decoding attempts to
separate the combined interference from the additive noise. As such, it may achieve lower
rates than simply treating interference as noise for which this separation is not enforced.
This discrepancy is more pronounced for low values of SNR, and it vanishes asymptotically
as SNR grows.
3.1.5 Interference decoding is not optimal in general
As in treating interference as noise, the interference-decoding scheme uses point-to-point
codes. Although the decoder in interference decoding is more sophisticated than the one
underlying Theorem 2.1, the interference-decoding inner bound is not optimal in general.
Figure 3.4. Gaussian interference channel with BPSK: rate regions achieved by interference decoding (dashed outline) and treating interference as noise (shaded) for a cyclically symmetric Gaussian interference channel with Xk ∈ {+1,−1}, path gains g11 = 1.8, g21 = 1.0, g31 = 1.1, and noise power σ2 = 0.1. (Axes: R1, R2, R3.)
To exemplify this, consider the following example of a 2-DIC (cf. Subsection 1.1.1).
Example 3.2. Consider the 2-DIC in Figure 3.5(a) with input alphabets X1 = {0, 1, 2}, X2 = {0, 1}, loss functions g12 = {0 → 0, 1 → 0, 2 → 1} and g11 = g22 = g21 = Id, and
receiver functions f1 = f2 being addition. The outputs of the channel are thus given by
Y1 = X1 +X2,
Y2 = g12(X1) +X2.
The interference-decoding inner bound in Theorem 3.1 reduces to the set of rate pairs
(R1, R2) such that
R1 ≤ H(X1 |Q),
R2 ≤ H(X2 |Q),
R1 + min{R2, H(S1 |Q)} ≤ H(Y1 |Q),
R2 + min{R1, H(S2 |Q)} ≤ H(Y2 |Q),
for some p(q)p(x1|q)p(x2|q).
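For a concrete data point, the entropies entering these constraints can be evaluated at, say, uniform inputs (one grid point in the approximation procedure described earlier; the snippet is our own illustration, with Q degenerate).

```python
import itertools
import numpy as np

def H(pmf):
    """Entropy in bits of {outcome: probability}."""
    return -sum(p * np.log2(p) for p in pmf.values() if p > 0)

p1 = {0: 1/3, 1: 1/3, 2: 1/3}   # uniform input pmfs
p2 = {0: 1/2, 1: 1/2}
g12 = {0: 0, 1: 0, 2: 1}        # loss function of Example 3.2

def out_pmf(f):
    """pmf of f(x1, x2) under the product input distribution."""
    out = {}
    for x1, x2 in itertools.product(p1, p2):
        out[f(x1, x2)] = out.get(f(x1, x2), 0.0) + p1[x1] * p2[x2]
    return out

HY1 = H(out_pmf(lambda a, b: a + b))         # H(Y1), Y1 = X1 + X2
HY2 = H(out_pmf(lambda a, b: g12[a] + b))    # H(Y2), Y2 = g12(X1) + X2
HS1 = H(p2)                                  # S1 = X2
HS2 = H(out_pmf(lambda a, b: g12[a]))        # S2 = g12(X1)
print(round(HY1, 3), round(HY2, 3), HS1, round(HS2, 3))
```

With these values (H(Y1) ≈ 1.92, H(S1) = 1 bit), the four inequalities above can be traced out for this grid point; repeating over a grid of input pmfs yields the interference-decoding region in Figure 3.5(b).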
Figure 3.5(b) compares this inner bound to the capacity region given by Theorem 1.1 and
to the region achievable by treating interference as noise (Theorem 2.1). Not surprisingly, in-
terference decoding does not achieve the full capacity. To achieve capacity, Han–Kobayashi
rate splitting and superposition coding are needed.
Figure 3.5. 2-DIC example. (a) Block diagram of the channel, with X1 ∈ {0, 1, 2}, X2 ∈ {0, 1}, Y1 ∈ {0, 1, 2, 3}, and Y2 ∈ {0, 1, 2}. (b) Capacity region and inner bounds in the (R1, R2) plane: capacity, interference decoding, and treating interference as noise.
3.2 Proof of Theorem 3.1
We first present a key lemma which formalizes the notion of link saturation as discussed in
Remark 3.1 by generalizing the packing lemma stated in [EK11].
Lemma 3.1 (Packing lemma for pairs). Let (U, A, B, C) ∼ p = p(u)p(a|u)p(b|u)p(c|a, b, u). Let Un ∼ ∏i=1..n pU(ui). For each m ∈ {1:2^{nRA}}, let An(m) ∼ ∏i=1..n pA|U(ai | ui). For each l ∈ {1:2^{nRB}}, let Bn(l) ∼ ∏i=1..n pB|U(bi | ui), conditionally independent of each An(m) given Un. Let Cn ∼ ∏i=1..n pC|U(ci | ui), conditionally independent of each An(m) and Bn(l) given Un. There exists a δ(ε) with limε→0 δ(ε) = 0 such that if

min{RA, H(A | U)} + min{RB, H(B | U)} < I(A, B; C | U) − δ(ε),

then P{(Un, An(m), Bn(l), Cn) ∈ T(n)ε for some m, l} → 0 as n → ∞, where typicality, entropies, and mutual information are with respect to p.
Proof: Applying the packing lemma in [EK11] with U = U, X = (A, B), and Y = C immediately establishes the convergence if RA + RB < I(A, B; C | U) − δ(ε). Next, we prove convergence when RB + H(A | U) < I(A, B; C | U) − δ(ε). To this end, we bound the probability in question as

P{(Un, An(m), Bn(l), Cn) ∈ T(n)ε for some m, l}
  ≤ ∑l=1..2^{nRB} ∑un∈T(n)ε(U) P{Un = un} ∑bn∈T(n)ε(B | un) P{Bn(l) = bn | Un = un}
      · P{(un, An(m), bn, Cn) ∈ T(n)ε for some m},                      (3.6)

where the inner sums over un and bn are each bounded by 1. To bound the last probability term, we apply Corollary A.2 with Ai = An(m), D = Cn, Q = T(n)ε(A, C | un, bn), QA = T(n)ε(A | un, bn), and PD given by the bound

P{(un, an, bn, Cn) ∈ T(n)ε} ≤ 2^{−n(I(A,B;C|U)−δ1(ε))},

which follows from the joint typicality lemma in [EK11]. The corollary then implies

P{(un, An(m), bn, Cn) ∈ T(n)ε for some m} ≤ |T(n)ε(A | un, bn)| · 2^{−n(I(A,B;C|U)−δ1(ε))}
  ≤ 2^{n(H(A|U,B)+δ2(ε)−I(A,B;C|U)+δ1(ε))}
  (a)= 2^{n(H(A|U)−I(A,B;C|U)+δ(ε))},

where in step (a), H(A | U, B) = H(A | U) by the Markov chain A − U − B, and δ(ε) = δ1(ε) + δ2(ε). Substituting into (3.6), we have

P{(Un, An(m), Bn(l), Cn) ∈ T(n)ε for some m, l} ≤ 2^{n(RB+H(A|U)−I(A,B;C|U)+δ(ε))}.

Clearly, this probability converges to zero as n → ∞ if RB + H(A | U) < I(A, B; C | U) − δ(ε). In the same manner, convergence follows from RA + H(B | U) < I(A, B; C | U) − δ(ε). Thus convergence is implied by min{RA + RB, RA + H(B | U), H(A | U) + RB} < I(A, B; C | U) − δ(ε), and the desired result follows by recalling that H(A | U) + H(B | U) ≥ I(A, B; C | U). ∎
We are now ready to prove Theorem 3.1. We begin by fixing an input distribution
p(q)p(x1|q)p(x2|q)p(x3|q).
Codebook generation. Randomly generate a sequence qn according to ∏i=1..n pQ(qi). For each k ∈ {1:3}, randomly and conditionally independently generate sequences xnk(mk), mk ∈ {1:2^{nRk}}, each according to ∏i=1..n pXk|Q(xki | qi). From the channel definition, this procedure induces intermediate sequences xnkl(mk) for l ∈ {1:3}, combined interference sequences sn1(m2, m3), sn2(m1, m3), sn3(m1, m2), and output sequences ynk(m1, m2, m3).
Encoding. To send the message mk ∈ {1:2^{nRk}}, k ∈ {1:3}, encoder k transmits xnk(mk).
Decoding. The receivers use simultaneous non-unique decoding. Upon observing yn1, decoder 1 declares that m̂1 has been sent if it is the unique message such that

(qn, xn1(m̂1), sn1(m2, m3), xn21(m2), xn31(m3), yn1) ∈ T(n)ε

for some m2, m3. Decoding at the other receivers is performed similarly.
Analysis of the probability of error. Without loss of generality, assume that mk = 1 for k ∈ {1:3}. Define Emlk = {(Qn, Xn1(m), Sn1(l, k), Xn21(l), Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε}, and the events

E0 = Ec111,
E1 = {Em11 for some m ≠ 1},
E2 = {Eml1 for some m, l ≠ 1},
E3 = {Em1k for some m, k ≠ 1},
E4 = {Emlk for some m, l, k ≠ 1}.

Then the probability of decoding error at the first receiver averaged over codebooks is upper bounded as P(E) = P(E0 ∪ E1 ∪ E2 ∪ E3 ∪ E4) ≤ ∑j=0..4 P(Ej). We bound each term. First, by the law of large numbers, P(E0) → 0 as n → ∞.
Next consider

E1 ⊆ {(Qn, Xn1(m), Sn1(1, 1), Yn1(1, 1, 1)) ∈ T(n)ε for some m ≠ 1}.

By the packing lemma in [EK11], the probability of this event tends to zero as n → ∞ if

R1 < I(X1; Y1 | S1, Q) − δ(ε),

which simplifies to

R1 < H(X11 | Q) − δ(ε).                                  (3.7)
The event E2 can be treated as follows. Consider

E2 ⊆ {(Qn, Xn1(m), Xn21(l), Xn31(1), Yn1(1, 1, 1)) ∈ T(n)ε for some m, l ≠ 1}.

Using Lemma 3.1 with Un = (Qn, Xn31(1)), An = Xn1, Bn = Xn21, and Cn = Yn1(1, 1, 1), we conclude that P(E2) → 0 if

R1 + min{R2, H(X21 | X31, Q)} < I(X1, X21; Y1 | X31, Q) − δ(ε),

or, equivalently,

R1 + min{R2, H(X21 | Q)} < H(Y1 | X31, Q) − δ(ε).        (3.8)
Completely symmetrically, P(E3)→ 0 as n→∞ if
R1 + min{R3, H(X31 |Q)} < H(Y1 |X21, Q)− δ(ε). (3.9)
Finally, we bound the probability of E4 by the following proposition, which is proved below.

Proposition 3.1. P(E4) vanishes as n → ∞ if

R1 + min{R2 + R3, R2 + H(X31 | Q), H(X21 | Q) + R3, H(S1 | Q)} < H(Y1 | Q) − δ(ε).   (3.10)
As per Remark 3.1, the intuition behind this proposition is that the min term represents the effective number of sequences that appear at S1. Recall that S1 is the output of a deterministic multiple access channel with inputs X21 and X31 and input-to-output mapping h1. Figure 3.6 shows the number of output sequences for different ranges of R2 and R3. Note that when (R2, R3) is in the deterministic MAC capacity region, the number of output sequences is simply 2^{n(R2+R3)}. For (R2, R3) outside the capacity region, the number of output sequences saturates in one or both dimensions. The logarithm of the number of output sequences divided by n appears in the min expression of the proposition.
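The counting just described amounts to a one-line function (our own restatement of the min term in (3.10), with rates and entropies in bits):

```python
def interference_exponent(R2, R3, H21, H31, HS1):
    """Exponent of the effective number of combined-interference sequences
    at S1, as annotated in Figure 3.6."""
    return min(R2 + R3,   # both inputs unsaturated (inside the MAC region)
               R2 + H31,  # X31 saturated
               H21 + R3,  # X21 saturated
               HS1)       # both saturated

# With H(X21|Q) = H(X31|Q) = 1 and an invertible h1 (so H(S1|Q) = 2):
assert interference_exponent(0.3, 0.4, 1, 1, 2) == 0.7   # inside MAC region
assert interference_exponent(5.0, 5.0, 1, 1, 2) == 2.0   # fully saturated
```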
Collecting (3.7) to (3.10) yields the conditions defining R1(p). The probability of error at the second and third receivers can be bounded similarly, leading to the conditions defining R2(p) and R3(p). This concludes the proof of Theorem 3.1. ∎
Figure 3.6. Capacity region for a deterministic MAC with inputs at rates R2 and R3. The number of output sequences as a function of the number of input sequences is annotated in each region: 2^{n(R2+R3)} inside the capacity region, 2^{n(R2+H(X31|Q))} and 2^{n(R3+H(X21|Q))} when one input saturates, and 2^{nH(S1|Q)} when both saturate.
Proof of Proposition 3.1: The first and last terms in the min expression follow immediately from Lemma 3.1 by disregarding the special structure of Sn1. To obtain the second term, consider

P(E4) = P{(Qn, Xn1(m), Sn1(l, k), Xn21(l), Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε for some m, l, k ≠ 1}
  ≤ ∑m=2..2^{nR1} ∑l=2..2^{nR2} ∑qn∈T(n)ε(Q) P{Qn = qn} ∑xn1∈T(n)ε(X1 | qn) P{Xn1(m) = xn1 | Qn = qn}
      · ∑xn21∈T(n)ε(X21 | qn) P{Xn21(l) = xn21 | Qn = qn}
      · P{(qn, xn1, xn21, Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε for some k ≠ 1},

where the sums over qn, xn1, and xn21 are each bounded by 1. To bound the last probability term, we apply Corollary A.2 with Ai = Xn31(k), D = Yn1(1, 1, 1), Q = T(n)ε(X31, Y1 | qn, xn1, xn21), QA = T(n)ε(X31 | qn, xn1, xn21), and PD given by the bound

P{(qn, xn1, xn21, xn31, Yn1(1, 1, 1)) ∈ T(n)ε} ≤ 2^{−n(I(X1,X21,X31;Y1|Q)−δ1(ε))},

which is a consequence of the joint typicality lemma in [EK11]. Thus, by the corollary,

P{(qn, xn1, xn21, Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε for some k ≠ 1}
  ≤ 2^{n(H(X31|Q)+δ2(ε))} · 2^{−n(H(Y1|Q)−δ1(ε))}.

Letting δ(ε) = δ1(ε) + δ2(ε), it follows that P(E4) is bounded by 2^{n(R1+R2+H(X31|Q)−H(Y1|Q)+δ(ε))}, which clearly tends to zero as n → ∞ if R1 + R2 + H(X31|Q) < H(Y1|Q) − δ(ε). Thus the second term in the min expression is established. The third term follows likewise. ∎
3.3 Proof of Theorem 3.2
Proof of achievability: We prove achievability by specializing Theorem 3.1. Specifically,
we show that under strong interference and invertible hk, the regions Rk of Theorem 3.1 simplify to the regions R′′k below while maintaining R1 ∩ R2 ∩ R3 = R′′1 ∩ R′′2 ∩ R′′3.
Recall the definition of R1(p) as the set of rate triples (R1, R2, R3) that satisfy inequali-
ties (3.1) to (3.4). Further recall the analogous definitions of R2 and R3, which include the
inequalities
R2 ≤ H(X22 |Q), (3.11)
R3 ≤ H(X33 |Q). (3.12)
When combined with the strong interference assumption, inequalities (3.11) and (3.12) imply
that the min expressions in (3.2) and (3.3) simplify to R2 and R3, respectively. Furthermore,
the sum of (3.11) and (3.12) implies that
R2 +R3 ≤ H(X22 |Q) +H(X33 |Q)
≤ H(X21 |Q) +H(X31 |Q)
= H(S1 |Q),
where we have used the invertibility of h1. Therefore, the min expression in (3.4) simplifies
to R2 +R3.
Consequently, for p = p(q)p(x1|q)p(x2|q)p(x3|q), define R′′1(p) as the set of rate triples
(R1, R2, R3) such that
R1 ≤ H(X11 |Q),
R1 +R2 ≤ H(Y1 |X31, Q),
R1 +R3 ≤ H(Y1 |X21, Q),
R1 +R2 +R3 ≤ H(Y1 |Q).
Likewise, define the regions R′′2(p) and R′′3(p) by replacing subscripts following 1 → 2 → 3 → 1 and 1 → 3 → 2 → 1 in R′′1(p), respectively. Then Theorem 3.1 implies that
⋃p R′′1(p) ∩ R′′2(p) ∩ R′′3(p)

is achievable, and the theorem follows by expanding the intersection operations. ∎
Proof of converse: Consider a sequence of codes with rates (R1, R2, R3), empirical pmf p(xn1)p(xn2)p(xn3), and P(n)e tending to 0 as n → ∞. First, note that

nR1 ≤ I(Xn1; Yn1) + nεn
    = I(Xn11; Yn1) + nεn
    ≤ H(Xn11) + nεn
    ≤ nH(X11 | Q) + nεn,

where Q is a time-sharing random variable uniformly distributed over {1:n}. Next, consider
n(R1 + R2)
  ≤ I(Xn1; Yn1) + I(Xn2; Yn2) + nεn
  = H(Yn1) − H(Yn1 | Xn1) + H(Yn2) − H(Yn2 | Xn2) + nεn
  = H(Yn1) − H(Sn1) + H(Yn2) − H(Sn2) + nεn
  = H(Yn1) − H(Xn31) + (H(Yn2) − H(Xn21) − H(Xn12) − H(Xn32)) + nεn
  ≤ H(Yn1 | Xn31) + nεn
  ≤ nH(Y1 | X31, Q) + nεn,

where we have used H(Xn22) ≤ H(Xn21) and H(Yn2) ≤ H(Xn22) + H(Xn12) + H(Xn32). In the same way, it can be shown that
n(R1 +R3) ≤ nH(Y1 |X21, Q) + nεn.
Finally,

n(R1 + R2 + R3)
  ≤ H(Yn1) − H(Sn1) + H(Yn2) − H(Sn2) + H(Yn3) − H(Sn3) + nεn
  = H(Yn1) + (H(Yn2) − H(Xn21) − H(Xn12) − H(Xn32))
      + (H(Yn3) − H(Xn31) − H(Xn13) − H(Xn23)) + nεn
  ≤ nH(Y1 | Q) + nεn.
Thus, all four conditions related to the first receiver have been shown. Analogous steps
yield the remaining bounds. Finally, the cardinality bound on Q can be established using
the convex cover method described in [EK11]. ∎
3.4 Proof of Theorem 3.3
We show that the inner bound in Theorem 2.1 is included in the inner bound of Theorem 3.1.
The conditions of region R1 in Theorem 3.1 can be made more stringent by replacing the
min expression with any one of its argument terms. For example, (R1, R2, R3) ∈ R1 is
implied by
R1 ≤ H(X11 |Q),
R1 +H(X21 |Q) ≤ H(Y1 |X31, Q),
R1 +H(X31 |Q) ≤ H(Y1 |X21, Q),
R1 +H(S1 |Q) ≤ H(Y1 |Q),
or, equivalently,
R1 ≤ min{H(X11|Q),
H(Y1|X31, Q)−H(X21|Q),
H(Y1|X21, Q)−H(X31|Q),
H(Y1|Q)−H(S1|Q)}. (3.13)
To simplify this expression, consider
H(X11 |Q) ≥ I(X11;Y1 |Q)
= H(Y1 |Q)−H(Y1 |X11, Q)
= H(Y1 |Q)−H(S1 |Q),
as well as

(H(Y1 | X21, Q) − H(X31 | Q)) − (H(Y1 | Q) − H(S1 | Q))
  = H(Y1, X21 | Q) − H(X21 | Q) − H(S1 | X21, Q) − H(Y1 | Q) + H(S1 | Q)
  = H(X21 | Y1, Q) − H(X21 | S1, Q)
  ≥ H(X21 | Y1, S1, Q) − H(X21 | S1, Q)
  = 0,

where the first equality uses H(X31 | Q) = H(S1 | X21, Q),
and, by symmetry,
(H(Y1 | X31, Q) − H(X21 | Q)) − (H(Y1 | Q) − H(S1 | Q)) ≥ 0.
Thus, the min in (3.13) is always achieved by the last term, and (3.13) simplifies to
R1 ≤ H(Y1 |Q)−H(S1 |Q) = I(X1;Y1 |Q).
Using a similar argument, it follows that the conditions for R2 and R3 in Theorem 3.1 are implied by (2.1). ∎
Remark 3.4. In the case with noisy observations, this proof fails in the following manner.
Interference decoding entails the inequality

R1 ≤ I(X1, S1; Z1 | Q) − H(S1 | Q)
   = I(X1; Z1 | Q) + I(S1; Z1 | X1, Q) − H(S1 | Q)
   = I(X1; Z1 | Q) + H(S1 | X1, Q) − H(S1 | X1, Z1, Q) − H(S1 | Q)
   = I(X1; Z1 | Q) − H(S1 | X1, Z1, Q).
The first term is the achievable rate when treating interference as noise. The second term is
zero when the channel is noiseless and acts as a penalty when noise is introduced.
Chapter 4
Communication with disturbance constraints
The problem of communication with disturbance constraints is motivated by the
broadcast view of the interference channel, in which each sender wishes to communicate a
message only to one of the receivers while causing the least disturbance to the other receivers.
In this chapter1, we focus on studying the problem of communication with disturbance
constraints itself, in isolation from the interference channel.
Alice wishes to communicate a message to Bob while causing the least disturbance to
nearby Dick, Diane, and Diego, who are not interested in the communication from Alice.
Assume a discrete memoryless broadcast channel p(y, z1, . . . , zK |x) between Alice X , Bob
Y , and their preoccupied friends Z1, . . . , ZK as depicted in Figure 4.1. We measure the
disturbance at side receiver Zj by the amount of undesired information rate (1/n)I(Xn;Znj )
originating from the sender X , and require this rate not to exceed Rd,j in the limit. The
problem is to determine the optimal trade-off between the message communication rate R
and the disturbance rates Rd,j .
For a single disturbance constraint, we show that the optimal encoding scheme is rate
splitting and superposition coding, which is the same as the Han–Kobayashi scheme for
the two user-pair interference channel [HK81, CMGE08]. This motivates us to study
communication with more than one disturbance constraint with the hope of finding good
coding schemes for interference channels with more than two user pairs, specifically the
3-DIC defined in Subsection 1.2. (The results in this chapter were first published in [BE11b, BE11c].) To this end, we establish inner and outer bounds on the
Figure 4.1. Communication system with disturbance constraints. The message M is encoded into Xn and sent over p(y, z1, . . . , zK | x); the decoder observes Yn and outputs M̂, while the side receivers observe Zn1, . . . , ZnK subject to (1/n)I(Xn; Znj) ≤ Rd,j.
rate–disturbance region for the deterministic channel model with two disturbance constraints
that are tight in some nontrivial special cases. In the following section we provide needed
definitions and present an extended summary of the results. The proofs are presented in
subsequent sections.
4.1 Results and discussion
Consider the discrete memoryless communication system with K disturbance constraints
(henceforth referred to as DMC-K-DC) depicted in Figure 4.1. The channel consists
of K + 2 finite alphabets X , Y , Zj , j ∈ {1 :K}, and a collection of conditional pmfs
p(y, z1, . . . , zK | x). A (2^{nR}, n) code for the DMC-K-DC consists of the message set {1:2^{nR}}, an encoding function xn : {1:2^{nR}} → Xn, and a decoding function m̂ : Yn → {1:2^{nR}}. We
assume that the message M is uniformly distributed over {1:2nR}. A rate–disturbance tuple
(R,Rd,1, . . . , Rd,K) ∈ RK+1+ is achievable for the DMC-K-DC if there exists a sequence of
(2nR, n) codes such that
lim n→∞ P{M̂ ≠ M} = 0,
lim sup n→∞ (1/n)I(Xn; Znj) ≤ Rd,j,   j ∈ {1:K}.
The rate–disturbance region R of the DMC-K-DC is the closure of the set of all achievable
tuples (R,Rd,1, . . . , Rd,K).
Remark 4.1. Like the message rate R, the disturbance rates Rd,j, for j ∈ {1:K}, are measured in units of bits per channel use.
Remark 4.2. The disturbance measure (1/n)I(Xn; Znj) can be expanded as (1/n)H(Znj) − (1/n)H(Znj | Xn). The first term is the entropy rate of the received signal Zj and is caused
by both the transmission itself and by noise inherent to the channel. Subtracting the second
term separates out the noise part. (For channels with additive white noise, e.g., the Gaussian
case, the second term is exactly the differential entropy of each noise sample.)
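As a sanity check of this decomposition, note that for a memoryless channel with i.i.d. inputs the measure reduces to the single-letter I(X; Z). The following sketch (our own example, not from the text) evaluates it for a binary symmetric side channel:

```python
import numpy as np

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p*np.log2(p) - (1-p)*np.log2(1-p)

def bsc_disturbance(px1, eps):
    """Per-symbol disturbance rate I(X; Z) for X ~ Bern(px1) observed
    through a BSC(eps): H(Z) minus the channel-noise part H(Z|X) = h2(eps)."""
    pz1 = px1 * (1 - eps) + (1 - px1) * eps
    return h2(pz1) - h2(eps)

assert bsc_disturbance(0.5, 0.0) == 1.0          # noiseless: full input entropy leaks
assert abs(bsc_disturbance(0.5, 0.5)) < 1e-12    # pure noise: nothing leaks
```

Subtracting h2(eps) is precisely the separation of the noise part described in the remark.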
Remark 4.3. The results in this chapter remain essentially true if disturbance is measured
by (1/n)H(Znj ) instead. If the channel is deterministic, the two measures coincide.
Remark 4.4. The disturbance constraint (1/n)I(Xn;Znj ) ≤ Rd,j is reminiscent of the
information leakage rate constraint for the wiretap channel [Wyn75, CK78], which is of
the form (1/n)I(M ;Znj ) ≤ Rleak. Replacing M with Xn, however, dramatically changes
the problem and the optimal coding scheme. In the wiretap channel, the key component of
the optimal encoding scheme is randomized encoding, which helps control the leakage rate
(1/n)I(M ;Znj ). Such randomization reduces the achievable transmission rate for a given
disturbance constraint, hence is not desirable in our setting.
The rate–disturbance region is not known in general. We establish the following results.
4.1.1 Rate–disturbance region for a single disturbance constraint
Consider the case with a single disturbance constraint, i.e., K = 1, and relabel Z1 as Z and
Rd,1 as Rd. We fully characterize the rate–disturbance region for this case.
Theorem 4.1 (Rate–disturbance region of DMC-1-DC). The rate–disturbance region R of the DMC-1-DC is the set of rate pairs (R, Rd) such
that
R ≤ I(X;Y ),
Rd ≥ I(X;Z |U),
R−Rd ≤ I(X;Y |U)− I(X;Z |U),
for some pmf p(u, x) with |U| ≤ |X |+ 1.
Let R(U,X) be the rate region defined by the rate constraints in the theorem for a
fixed joint pmf (U,X) ∼ p(u, x). This rate region is illustrated in Figure 4.2. The rate–
disturbance region is simply the union of these regions over all p(u, x) and is convex without
the need for a time-sharing random variable.
The proof of Theorem 4.1 is given in Subsections 4.2.1 and 4.2.2. Achievability is
established using rate splitting and superposition coding. Receiver Y decodes the satellite
codeword while receiver Z distinguishes only the cloud center. Note that this encoding
scheme is identical to the Han–Kobayashi scheme for the two user-pair interference chan-
nel [HK81, CMGE08].
We now consider three interesting special cases.
Deterministic channel
Assume that Y and Z are deterministic functions of X . We show that the rate–disturbance
region in Theorem 4.1 reduces to the following.
Corollary 4.1 (Rate–disturbance region of deterministic DMC-1-DC). The rate–disturbance region for the deterministic channel with one disturbance constraint
is the set of rate pairs (R,Rd) such that
R ≤ H(Y ),
R−Rd ≤ H(Y |Z),
for some pmf p(x).
Clearly, this region is convex. Alternatively, the region can be written as the set of rate
pairs (R,Rd) such that
R ≤ H(Y |Q),
Rd ≥ I(Y ;Z |Q),
for some joint pmf p(q, x) with |Q| ≤ 2. Corollary 4.1 and the alternative description of the
Figure 4.2. Example of R(U,X), the constituent region of R, in the (Rd, R) plane; the quantities I(X;Y), I(X;Y|U), and I(X;Z|U) are marked on the axes, and one boundary face has 45° slope.
region are established by substituting U = Z in the region of Theorem 4.1 and simplifying
the resulting region as detailed in Subsection 4.2.3.
Remark 4.5. Recall the 2-DIC of Subsection 1.1.1 (see Figure 1.2 on page 5). According
to Theorem 1.1, the capacity region is achieved by the Han–Kobayashi scheme in which
the transmitters use superposition codebooks generated according to p(x12)p(x1|x12) and
p(x21)p(x2|x21). Now consider the dashed orange boxes in Figure 4.3, where some of the
signals are relabeled with respect to Figure 1.2. Corollary 4.1 shows that the same encoding
scheme achieves the disturbance-constrained capacity for the channels X1 → (Y ′1 , Z1) and
X2 → (Y ′2 , Z2), shown as dashed boxes in Figure 4.3. Here, Y ′1 and Y ′2 are the desired
receivers, and Z1 and Z2 are the side receivers associated with disturbance constraints. Note
that decodability of the desired messages at receivers Y1 and Y2 in the interference channel
implies decodability at Y ′1 and Y ′2 in the channels with disturbance constraint, respectively.
Example 4.1. Consider the deterministic channel depicted in Figure 4.4(a) and its rate–
disturbance region in Figure 4.4(b). Note that rates R ≤ 1 can be achieved with zero
disturbance rate by restricting the transmission to input symbols {0, 1} (or {2, 3}), which
map to different symbols at Y , but are indistinguishable at Z. On the other hand, for large Rd, the disturbance constraint is inactive and R is bounded only by the unconstrained capacity
log(3). In addition to the optimal region achieved by superposition coding, the figure also
shows the strictly suboptimal region achieved by simple non-layered random codes.
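The two corner points of Corollary 4.1 can be checked numerically for a channel of this type. The sketch below is a minimal illustration; the maps y_of_x and z_of_x are an assumption chosen to be consistent with the description above (inputs {0, 1} and {2, 3} collide at Z but not at Y), and the actual maps of Figure 4.4(a) may differ.

```python
import numpy as np

# Hypothetical deterministic maps on X = {0,1,2,3}, consistent with the text:
# {0,1} (and {2,3}) map to distinct Y symbols but are indistinguishable at Z,
# and Y takes three values, so the unconstrained capacity is log2(3).
y_of_x = [0, 1, 2, 0]   # Y as a function of X
z_of_x = [0, 0, 1, 1]   # Z as a function of X

def entropy(p):
    p = np.asarray([q for q in p if q > 0])
    return -np.sum(p * np.log2(p))

def corner_points(px):
    """Corner points P1 = (H(Y|Z), 0) and P2 = (H(Y), I(Y;Z)) for fixed p(x)."""
    py, pz, pyz = np.zeros(3), np.zeros(2), np.zeros((3, 2))
    for x, p in enumerate(px):
        py[y_of_x[x]] += p
        pz[z_of_x[x]] += p
        pyz[y_of_x[x], z_of_x[x]] += p
    h_y = entropy(py)
    h_y_given_z = entropy(pyz.flatten()) - entropy(pz)
    return (h_y_given_z, 0.0), (h_y, h_y - h_y_given_z)

# Restricting the input to {0,1} achieves rate 1 at zero disturbance rate.
p1, p2 = corner_points([0.5, 0.5, 0.0, 0.0])
```

Sweeping p(x) over a grid and taking the convex hull of all such corner points reproduces the superposition-coding region of Figure 4.4(b).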
Figure 4.3. The link between 2-DIC and communication with disturbance constraints. (Encoders M1 → X1 and M2 → X2; link functions g11, g12, g21, g22 and f1, f2; receivers Y1, Y2 with side outputs Z1, Z2; dashed boxes mark the channels X1 → (Y′1, Z1) and X2 → (Y′2, Z2).)
Figure 4.4. Deterministic example with one disturbance constraint. (a) Channel block diagram: M → X ∈ {0, 1, 2, 3}, with deterministic outputs Y ∈ {0, 1, 2} and Z ∈ {0, 1}. (b) Rate–disturbance region (Rd versus R), comparing single-user codebooks with superposition codebooks.
Gaussian channel
Consider the problem of communication with one disturbance constraint for the Gaussian
channel
Y = X +W1,
Z = X +W2,
where the noise is W1 ∼ N (0, 1) and W2 ∼ N (0, N). Assume an average power constraint
P on the transmitted signal X .
The case N ≤ 1 is not interesting, since then Y is a degraded version of Z and the
disturbance rate is simply given by the data rate R. If N > 1, Z is a degraded version of Y ,
and the rate–disturbance region reduces to the following.
Corollary 4.2 (Gaussian channel with one disturbance constraint). The rate–disturbance region of the Gaussian channel with parameters P > 0 and N > 1
is the set of rate pairs (R,Rd) such that
R ≤ C(αP ),
Rd ≥ C(αP/N),
for some α ∈ [0, 1], where C(x) = (1/2) log(1 + x) for x ≥ 0.
Achievability is proved using Gaussian codes with power αP. The converse follows by defining α⋆ ∈ [0, 1] such that R = C(α⋆P) and applying the vector entropy power inequality to Z^n = Y^n + W̃_2^n, where W̃_2 ∼ N(0, N − 1) is the excess noise. The details are given in Subsection 4.2.4. An alternative proof of the corollary using the relation between mutual information and minimum mean-square error estimation [GSV05] was given in [BS11].
Note that this is a degenerate form of the Han–Kobayashi scheme because the constraint
from the multiple access side of the interference channel is not taken into consideration.
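Corollary 4.2 parameterizes the boundary of the region by a single power-control parameter α, which makes it easy to trace numerically. A minimal sketch (rates in bits; the values P = 10 and N = 4 are chosen purely for illustration):

```python
import numpy as np

def C(x):
    """C(x) = (1/2) log(1 + x), computed in bits here for concreteness."""
    return 0.5 * np.log2(1.0 + x)

def gaussian_region_boundary(P, N, num=101):
    """Boundary of the rate-disturbance region of Corollary 4.2, traced by
    sweeping the power-control parameter alpha over [0, 1]."""
    alphas = np.linspace(0.0, 1.0, num)
    R = C(alphas * P)          # achievable message rate
    Rd = C(alphas * P / N)     # minimum disturbance rate at receiver Z
    return R, Rd

# Example: P = 10, N = 4 (Z is a degraded version of Y since N > 1).
R, Rd = gaussian_region_boundary(10.0, 4.0)
# alpha = 1 gives the unconstrained capacity C(P) at disturbance rate C(P/N).
```

Since N > 1, every point satisfies Rd ≤ R: reducing power trades message rate for a more-than-proportional reduction in disturbance.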
Vector Gaussian channel
Now consider the vector Gaussian channel with one disturbance constraint
Y = X +W1,
Z = X +W2,
where X ∈ Rn and the noise is W1 ∼ N (0, K1) and W2 ∼ N (0, K2) for some positive
semidefinite covariance matrices K1, K2 ∈ Rn×n. Assume an average transmit power
constraint tr(KX) ≤ P , where KX = E(XXT) is the covariance matrix of X . This case is
not degraded in general.
Theorem 4.2 (Gaussian vector channel with one disturbance constraint). The rate–disturbance region of the Gaussian vector channel with parameters P, K1, and K2 is the convex hull of the set of pairs (R,Rd) such that

R ≤ (1/2) log( |KU + KV + K1| / |K1| ),
R − Rd ≤ (1/2) log( (|KV + K1| |K2|) / (|KV + K2| |K1|) ),
Rd ≥ (1/2) log( |KV + K2| / |K2| ),

for some positive semidefinite matrices KU, KV ∈ Rn×n with tr(KU + KV) ≤ P.
Achievability of this rate–disturbance region is shown by applying Theorem 4.1. Using
the discretization procedure in [EK11], it can be shown that the theorem continues to hold
with the power constraint additionally applied to the set of permissible input distributions.
The claimed region then follows by considering the special case where the input distribution
p(u, x) is jointly Gaussian. To prove the converse, we use an extremal inequality in [LV07]
to show that Gaussian input distributions are sufficient. The details of the proof are given in
Subsection 4.2.5.
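The three constraints of Theorem 4.2 are easy to evaluate for candidate covariance splits. A small sketch, with the R − Rd bound written in the equivalent form I(X;Y|U) − I(X;Z|U) from Theorem 4.1 evaluated for Gaussian inputs; the parameter matrices are an assumption chosen for illustration:

```python
import numpy as np

def logdet(K):
    """log2 |K| via the numerically stable slogdet."""
    return np.linalg.slogdet(K)[1] / np.log(2)

def theorem_4_2_bounds(KU, KV, K1, K2):
    """Evaluate the three constraints of Theorem 4.2 for fixed KU, KV:
    returns (max R, max R - Rd, min Rd), all in bits."""
    R_max = 0.5 * (logdet(KU + KV + K1) - logdet(K1))
    RmRd_max = 0.5 * (logdet(KV + K1) + logdet(K2)
                      - logdet(KV + K2) - logdet(K1))
    Rd_min = 0.5 * (logdet(KV + K2) - logdet(K2))
    return R_max, RmRd_max, Rd_min

# Illustrative parameters for n = 2 (assumed, not from the text).
K1 = np.eye(2)
K2 = np.diag([2.0, 0.5])
KU = np.diag([1.0, 0.0])
KV = np.diag([0.0, 1.0])
R_max, RmRd_max, Rd_min = theorem_4_2_bounds(KU, KV, K1, K2)
```

In the scalar case (n = 1, K1 = 1, K2 = N, KV = αP) the expressions collapse to the bounds of Corollary 4.2.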
4.1.2 Inner and outer bounds for the deterministic channel with two disturbance constraints
The correspondence between optimal encoding for the channel with one disturbance constraint and the Han–Kobayashi scheme for the interference channel suggests that the optimal
coding scheme for K disturbance constraints may provide an efficient (if not optimal)
scheme for the interference channel with more than two user pairs. This is particularly true
for the 3-DIC, since it is an extension of the 2-DIC for which the Han–Kobayashi scheme
is optimal (see Remark 4.5). Consequently, we restrict our attention to the deterministic
version of the DMC-2-DC.
First, we establish the following inner bound on the rate–disturbance region.
Theorem 4.3 (Inner bound for deterministic DMC-2-DC). The rate–disturbance region R of the deterministic channel with two disturbance constraints is inner-bounded by the set of rate triples (R,Rd,1, Rd,2) such that
R ≤ H(Y ), (4.1)
Rd,1 +Rd,2 ≥ I(Z1;Z2 |U), (4.2)
R−Rd,1 ≤ H(Y |Z1, U), (4.3)
R−Rd,2 ≤ H(Y |Z2, U), (4.4)
R−Rd,1 −Rd,2 ≤ H(Y |Z1, Z2, U)− I(Z1;Z2 |U), (4.5)
2R−Rd,1 −Rd,2 ≤ H(Y |Z1, Z2, U) +H(Y |U)− I(Z1;Z2 |U), (4.6)
for some pmf p(u, x).
The inner bound is convex. The region R(U,X) defined by the inequalities in the
theorem for a fixed p(u, x) is illustrated in Figure 4.5. The expression I(Z1;Z2 |U) appears
in three of the inequalities. As in Marton coding for the 2-receiver broadcast channel with a
common message, it is the penalty incurred in encoding independent messages via correlated
sequences. The auxiliary random variable U can be used to reduce this penalty. For example,
if U is chosen such that it is a function of both Z1 and Z2 individually, then
I(Z1;Z2 |U) = H(Z1 |U) +H(Z2 |U)−H(Z1, Z2 |U)
= H(Z1) +H(Z2)−H(Z1, Z2)−H(U)
= I(Z1;Z2)−H(U),
i.e., the penalty I(Z1;Z2 |U) with U is less than the penalty I(Z1;Z2) without U .
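The identity above is easy to verify numerically. A toy check, assuming (purely for illustration) a joint pmf built from three independent fair bits A, B, C with Z1 = (A, B), Z2 = (A, C), and common function U = A, so that U is a function of Z1 alone and of Z2 alone:

```python
import itertools
import numpy as np

def H(p):
    p = np.asarray([q for q in p if q > 1e-12])
    return -np.sum(p * np.log2(p))

# Toy joint distribution: Z1 = (A, B), Z2 = (A, C) for fair, independent bits.
joint = {}
for a, b, c in itertools.product([0, 1], repeat=3):
    key = ((a, b), (a, c))
    joint[key] = joint.get(key, 0.0) + 1.0 / 8.0

def mutual_info(joint):
    """I(Z1; Z2) from a joint pmf given as {(z1, z2): prob}."""
    p1, p2 = {}, {}
    for (z1, z2), p in joint.items():
        p1[z1] = p1.get(z1, 0.0) + p
        p2[z2] = p2.get(z2, 0.0) + p
    return H(p1.values()) + H(p2.values()) - H(joint.values())

def cond_mutual_info(joint, u_of_z1):
    """I(Z1; Z2 | U) where U = u_of_z1(z1) is a common function."""
    total = 0.0
    for u in {u_of_z1(z1) for (z1, _) in joint}:
        sub = {k: p for k, p in joint.items() if u_of_z1(k[0]) == u}
        pu = sum(sub.values())
        total += pu * mutual_info({k: p / pu for k, p in sub.items()})
    return total

i_full = mutual_info(joint)                      # I(Z1;Z2)
i_cond = cond_mutual_info(joint, lambda z1: z1[0])  # I(Z1;Z2|U), U = A
h_u = 1.0                                        # H(A) for a fair bit
# Identity from the text: I(Z1;Z2|U) = I(Z1;Z2) - H(U)
```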
Remark 4.6. The right-hand side of condition (4.6) can be equivalently expressed as
H(Y |Z1, Z2, U) +H(Y |U)− I(Z1;Z2 |U)
= H(Y |Z1, U) + H(Y |Z2, U) − I(Z1;Z2 |U, Y ).

This shows that the condition is at least as strict as the sum of conditions (4.3) and (4.4).
The encoding scheme for Theorem 4.3 involves rate splitting, Marton coding, and
superposition coding. The analysis of the probability of error, however, is complicated by the
fact that receiver Y wishes to decode all parts of the message as detailed in Subsection 4.3.1.
Receivers Z1 and Z2 each observe a satellite codeword from a superposition codebook.
Note that the encoding scheme can be readily extended to the general (non-deterministic)
DMC-2-DC.
Figure 4.5. Constituent region R(U,X) for Theorem 4.3, in (Rd,1, Rd,2, R) space. Each face is annotated by the inequality (4.1)–(4.6) that defines it.
To complement the inner bound, we establish the following outer bound on the rate–disturbance region of the deterministic channel with two disturbance constraints.
Theorem 4.4 (Outer bound for deterministic DMC-2-DC). If a rate triple (R,Rd,1, Rd,2) is achievable for the deterministic channel with two
disturbance constraints, then it must satisfy the conditions
R ≤ H(Y |Q),
Rd,1 ≥ I(Y ;Z1 |Q),
Rd,2 ≥ I(Y ;Z2 |Q),
for some pmf p(q, x) with |Q| ≤ 3.
The proof of this outer bound is given in Subsection 4.3.2. Note that this outer bound is
very similar in form to the alternative description of Corollary 4.1 for the single-constraint
deterministic case.
The inner bound in Theorem 4.3 and the outer bound in Theorem 4.4 coincide in some
special cases. To discuss these, we introduce the following notation. Since all channel
outputs are functions of X , they can be equivalently thought of as set partitions of the
input alphabet X . Set partitions form a partially ordered set (poset) under the refinement
relation. Since this poset is a complete lattice [Sta11], the following concepts are well-defined. For two set partitions (functions) f and g, let f ≼ g denote that f is a refinement of g (equivalently, g is degraded with respect to f ), let f ∧ g be the intersection of the two set partitions (the function that returns both f and g), and let f ∨ g denote the finest set partition of which both f and g are refinements (the Gács–Körner–Witsenhausen common part of f and g, cf. [GK73, Wit75]).
The inner bound of Theorem 4.3 coincides with the outer bound of Theorem 4.4 if Z1 or
Z2 is a degraded version of Y ∧ (Z1 ∨ Z2), i.e., if the output Y together with the common
part of Z1 and Z2 determine Z1 or Z2 completely.
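These lattice operations are straightforward to compute for functions on a finite alphabet. A sketch with hypothetical example partitions (the merging loop computes connected components of the blocks, which is one elementary way to realize the common part f ∨ g):

```python
from itertools import product

# Functions on X = {0,1,2,3}, represented as output lists (assumed examples).
f = [0, 0, 1, 1]   # partition {0,1}, {2,3}
g = [0, 1, 1, 1]   # partition {0}, {1,2,3}

def is_refinement(f, g):
    """f is a refinement of g: f determines g (g degraded w.r.t. f)."""
    table = {}
    for a, b in zip(f, g):
        if table.setdefault(a, b) != b:
            return False
    return True

def meet(f, g):
    """f ^ g: the function that returns both f and g."""
    return [(a, b) for a, b in zip(f, g)]

def join(f, g):
    """f v g: the finest partition of which both f and g are refinements
    (the Gacs-Korner-Witsenhausen common part).  Merge any two inputs that
    share an f-block or a g-block, until the labeling is stable."""
    labels = list(range(len(f)))          # start from singletons
    changed = True
    while changed:
        changed = False
        for x, y in product(range(len(f)), repeat=2):
            if (f[x] == f[y] or g[x] == g[y]) and labels[x] != labels[y]:
                new, old = min(labels[x], labels[y]), max(labels[x], labels[y])
                labels = [new if l == old else l for l in labels]
                changed = True
    return labels
```

For the f and g above, the common part is the trivial single-block partition, since the blocks of f and g overlap everywhere.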
Theorem 4.5 (Rate–disturbance region of certain deterministic DMC-2-DC). The rate–disturbance region R of the deterministic channel with two disturbance constraints is given by the outer bound of Theorem 4.4 if

Y ∧ (Z1 ∨ Z2) ≼ Z1, or
Y ∧ (Z1 ∨ Z2) ≼ Z2.
The theorem is proved by specializing Theorem 4.3 as detailed in Subsection 4.3.3. In the case when Z1 or Z2 is a degraded version of Y alone, achievability follows by setting U = ∅ in Theorem 4.3. Otherwise, we let U = Z1 ∨ Z2. This is intuitive, since U corresponds to the common-message step in the Marton encoding scheme.
Example 4.2. Consider the deterministic channel depicted in Figure 4.6(a). The desired
receiver output Y is a refinement of both side receiver outputs Z1 and Z2, and hence, Theorem 4.5 applies. Figure 4.6(b) depicts the rate–disturbance region, numerically approximated
by evaluating each point in a regular grid over the distributions p(x) and subsequently taking
the convex hull. Figure 4.7(a) contrasts the single-constraint case (Rd,2 is set to infinity
and thus inactive) with the case where both side receivers are under the same disturbance
rate constraint (Rd,1 = Rd,2). As expected, imposing an additional disturbance constraint
can significantly reduce the achievable message rate. Figure 4.7(b) illustrates the trade-off
between the disturbance rates Rd,1 and Rd,2 at the two side receivers, for a fixed data rate R.
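The numerical sweep described above can be sketched for the outer bound of Theorem 4.4: each p(x) contributes a candidate point (H(Y), I(Y;Z1), I(Y;Z2)), and the convex hull over all p(x) plays the role of the time-sharing variable Q. The deterministic maps below are stand-ins, not the actual maps of Figure 4.6(a):

```python
import itertools
import numpy as np

def H(p):
    p = p[p > 1e-12]
    return -np.sum(p * np.log2(p))

# Stand-in deterministic maps on X = {0,1,2,3} with Y a refinement of Z1, Z2.
y_of_x  = [0, 1, 2, 3]
z1_of_x = [0, 1, 2, 2]
z2_of_x = [0, 0, 1, 2]

def outer_bound_point(px):
    """Candidate point (R, Rd1, Rd2) = (H(Y), I(Y;Z1), I(Y;Z2)) of
    Theorem 4.4 for a fixed p(x)."""
    px = np.asarray(px)
    def dist(m, size):
        p = np.zeros(size)
        for x, q in enumerate(px):
            p[m[x]] += q
        return p
    hy = H(dist(y_of_x, 4))
    def mi(zmap, zsize):
        joint = np.zeros((4, zsize))
        for x, q in enumerate(px):
            joint[y_of_x[x], zmap[x]] += q
        return hy + H(dist(zmap, zsize)) - H(joint.flatten())
    return hy, mi(z1_of_x, 3), mi(z2_of_x, 3)

# Coarse grid over p(x); the region is the convex hull of these points.
grid = [p for p in itertools.product(np.linspace(0, 1, 6), repeat=3)
        if sum(p) <= 1.0]
points = [outer_bound_point(list(p) + [max(0.0, 1.0 - sum(p))]) for p in grid]
```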
Figure 4.6. Deterministic channel with two disturbance constraints (Example 4.2). (a) Block diagram of the channel: M → X ∈ {0, 1, 2, 3}, with deterministic outputs Y and side outputs Z1, Z2 ∈ {0, 1, 2}. (b) Rate–disturbance region in (Rd,1, Rd,2, R) space.
Figure 4.7. Two-dimensional projections of the rate–disturbance region for Example 4.2. (a) Rate R versus disturbance rate Rd for a single disturbance constraint (Rd,1 = Rd, Rd,2 = ∞) and for symmetric disturbance constraints (Rd,1 = Rd,2 = Rd). (b) Contour lines of the rate–disturbance region in the (Rd,1, Rd,2) plane at constant rates R = 1.1, 1.2, . . . , 2.0.
We conclude this section by considering another case in which we can fully characterize
the rate–disturbance region of the deterministic channel with two disturbance constraints. If
Z1 is a degraded version of Z2 (or vice versa), the region R of Theorem 4.3 is optimal and
simplifies to the following.
Corollary 4.3 (Rate–disturbance region with degraded side receivers). The rate–disturbance region R of the deterministic channel with two disturbance constraints with Z1 ≼ Z2 or Z2 ≼ Z1 is the set of rate triples (R,Rd,1, Rd,2) such that

R ≤ H(Y ),
R − Rd,1 ≤ H(Y |Z1),
R − Rd,2 ≤ H(Y |Z2),
for some pmf p(x).
Achievability follows as a special case of Theorem 4.3. The encoding scheme underlying
the theorem carefully avoids introducing an ordering between the side receiver signals Z1
and Z2, but such an ordering is naturally given by the channel here. Consequently, the corollary
follows by setting the auxiliary U equal to the output at the degraded side receiver. This
turns the encoding scheme into superposition coding with three layers. The details are given
in Subsection 4.3.4.
Note that the region of Corollary 4.3 is akin to the deterministic case with one disturbance
constraint in Corollary 4.1. In both cases, the side receiver signals need not be degraded
with respect to Y .
4.2 Proofs for a single disturbance constraint
4.2.1 Proof of achievability for Theorem 4.1
Achievability is proved as follows.
Codebook generation. Fix a pmf p(u, x).
1. Split the message M into two independent messages M0 and M1 with rates R0 and
R1, respectively. Hence R = R0 +R1.
2. For each m0 ∈ {1 : 2^{nR0}}, independently generate a sequence u^n(m0) according to ∏_{i=1}^n p(ui).

3. For each (m0,m1) ∈ {1 : 2^{nR0}} × {1 : 2^{nR1}}, independently generate a sequence x^n(m0,m1) according to ∏_{i=1}^n p(xi | ui(m0)).
Encoding. To send message m = (m0,m1), transmit xn(m0,m1).
Decoding. Upon receiving y^n, declare that (m̂0, m̂1) has been sent if it is the unique message pair such that

(u^n(m̂0), x^n(m̂0, m̂1), y^n) ∈ T_ε^{(n)}(U,X, Y ).
Analysis of the probability of error. We are using a superposition code over the channel
from X to Y . Using the law of large numbers and the packing lemma in [EK11], it can be
shown that the probability of error tends to zero as n→∞ if
R1 < I(X;Y |U)− δ(ε), (4.7)
R0 +R1 < I(X;Y )− δ(ε). (4.8)
Analysis of disturbance rate. In the following, we analyze the disturbance rate averaged
over codebooks C.
I(X^n;Z^n | C) ≤ H(Z^n,M0 | C) − H(Z^n |X^n, C)
= H(M0) + H(Z^n |M0, C) − H(Z^n |X^n)
(a)
≤ nR0 + H(Z^n |U^n) − nH(Z |X)
≤ nR0 + nH(Z |U) − nH(Z |X,U)
= nR0 + nI(X;Z |U)
≤ nRd, (4.9)
where (a) follows since U^n is a function of the codebook C and M0. Substituting R = R0 + R1 and using Fourier–Motzkin elimination on inequalities (4.7), (4.8), and (4.9) completes the proof of achievability. □
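For completeness, the Fourier–Motzkin step can be written out; the display below is a reconstruction of the routine elimination, not taken verbatim from the text.

```latex
% Eliminate R_0, R_1 from (4.7)--(4.9), using R = R_0 + R_1 and R_0, R_1 \ge 0.
\begin{align*}
R &= R_0 + R_1 < I(X;Y) - \delta(\epsilon)
  && \text{directly from (4.8)},\\
R - R_d &\le (R_0 + R_1) - \bigl(R_0 + I(X;Z\,|\,U)\bigr)
  && \text{since (4.9) gives } R_d \ge R_0 + I(X;Z\,|\,U)\\
&< I(X;Y\,|\,U) - I(X;Z\,|\,U) - \delta(\epsilon)
  && \text{from (4.7)},\\
R_d &\ge R_0 + I(X;Z\,|\,U) \ge I(X;Z\,|\,U)
  && \text{from (4.9) and } R_0 \ge 0,
\end{align*}
% which are exactly the three constraints of Theorem 4.1.
```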
4.2.2 Proof of converse for Theorem 4.1
Consider a sequence of codes with P_e^{(n)} → 0 as n → ∞ and the joint pmf that it induces on (M,X^n, Y^n, Z^n), assuming M ∼ Unif{1 : 2^{nR}}. Define the time-sharing random variable Q ∼ Unif{1 : n}, independent of everything else. We use the identification U = (Q, Y_{Q+1}^n, Z^{Q−1}), and let X = X_Q, Y = Y_Q, and Z = Z_Q. Note that (X, Y, Z) is consistent with the channel. Then
R ≤ I(X;Y ) + εn,
as in the converse proof for point-to-point channel capacity, which uses the same identifica-
tions of random variables. On the other hand,
nRd ≥ I(X^n;Z^n)
= H(Z^n) − H(Z^n |X^n)
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}) − H(Z_i |X_i) )
≥ ∑_{i=1}^n H(Z_i |Z^{i−1}, Y_{i+1}^n) − nH(Z |X)
= nH(Z |U) − nH(Z |X,U)
= nI(X;Z |U).
Finally,

n(Rd − R) ≥ I(X^n;Z^n) − nR
(a)
≥ H(Z^n) − H(Z^n |X^n) − I(M ;Y^n) − nε_n
(b)
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}) − I(M ;Y_i |Y_{i+1}^n) ) − nH(Z |X) − nε_n
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) + I(Y_{i+1}^n;Z_i |Z^{i−1}) − H(Y_i |Y_{i+1}^n) + H(Y_i |M, Y_{i+1}^n) ) − nH(Z |X) − nε_n
(c)
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) + I(Y_i;Z^{i−1} |Y_{i+1}^n) − H(Y_i |Y_{i+1}^n) + H(Y_i |X_i) ) − nH(Z |X) − nε_n
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) − H(Y_i |Z^{i−1}, Y_{i+1}^n) + H(Y_i |X_i, Z^{i−1}, Y_{i+1}^n) ) − nH(Z |X) − nε_n
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) − I(X_i;Y_i |Z^{i−1}, Y_{i+1}^n) ) − nH(Z |X) − nε_n
(d)
= nH(Z |U) − nI(X;Y |U) − nH(Z |X,U) − nε_n
= nI(X;Z |U) − nI(X;Y |U) − nε_n,
where (a) uses Fano’s inequality, (b) single-letterizes the noise term H(Z^n |X^n) with equality due to memorylessness of the channel, (c) applies Csiszár’s sum identity to the second term and channel memorylessness to the fourth term, and (d) uses the previous definitions of auxiliary random variables. Finally, the cardinality bound on U is established using the convex cover method in [EK11]. □
4.2.3 Proof of Corollary 4.1
Using the deterministic nature of the channel, the region in Theorem 4.1 reduces to the set
of rate pairs (R,Rd) such that
R ≤ H(Y ), (4.10)
Rd ≥ H(Z |U), (4.11)
Rd ≥ R +H(Z |U)−H(Y |U), (4.12)
for some pmf p(u, x). Now fixing a rate R and a pmf p(x) and varying p(u|x) to minimize
Rd, the right hand sides of (4.11) and (4.12) are lower bounded by
H(Z |U) ≥ 0,
and
R +H(Z |U)−H(Y |U) = R +H(Z |U)−H(Y, Z |U) +H(Z |Y, U)
= R−H(Y |Z,U) +H(Z |Y, U)
≥ R−H(Y |Z).
Note that the particular choice U = Z simultaneously achieves both lower bounds with
equality and is therefore sufficient. The rate–disturbance region thus reduces to Corollary 4.1.
For a fixed pmf p(x), this region has exactly two corner points: P1 = (H(Y |Z), 0) and
P2 = (H(Y ), I(Y ;Z)). As we vary p(x), there is one corner point P1 that dominates all
other P1 points. The pmf p(x) for this dominant P1 can be constructed by maximizing
H(Y |Z) as follows. For each z ∈ Z , define Yz ⊆ Y to be the set of y symbols that are
compatible with z. Let z⋆ be a symbol that maximizes |Yz|. For each element of Yz⋆, pick exactly one x that is compatible with it and z⋆. Finally, place equal probability mass on each of these x values, and zero mass on all others. This pmf on X yields the dominant corner point P1, namely (log(|Yz⋆|), 0). Moreover, for this distribution, P2 coincides with P1. Therefore, the net contribution (modulo convexification) of each pmf p(x) to the rate–disturbance region amounts to its corner point P2. This implies the alternative description of the region. The cardinality bound on Q follows from the convex cover method in [EK11]. □
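The corner-point construction can be stated as a short procedure. A sketch, reusing the hypothetical deterministic maps assumed for Example 4.1 (not the actual maps of the text):

```python
import numpy as np

# Hypothetical deterministic maps on X = {0,1,2,3} (assumed for illustration).
y_of_x = [0, 1, 2, 0]
z_of_x = [0, 0, 1, 1]

def dominant_p1_pmf(y_of_x, z_of_x):
    """Construct the p(x) maximizing H(Y|Z): find the z* whose compatible
    set Y_z is largest, pick one x per compatible y, and spread probability
    uniformly over those x.  Returns the pmf and the rate log2 |Y_{z*}|."""
    # Y_z: set of y symbols compatible with each z
    y_sets = {}
    for x in range(len(y_of_x)):
        y_sets.setdefault(z_of_x[x], set()).add(y_of_x[x])
    z_star = max(y_sets, key=lambda z: len(y_sets[z]))
    chosen = {}
    for x in range(len(y_of_x)):          # one x per (y, z*) pair
        if z_of_x[x] == z_star and y_of_x[x] not in chosen:
            chosen[y_of_x[x]] = x
    px = np.zeros(len(y_of_x))
    px[list(chosen.values())] = 1.0 / len(chosen)
    return px, np.log2(len(chosen))
```

For these maps the construction recovers the uniform distribution on {0, 1} and the zero-disturbance rate point (1, 0).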
4.2.4 Proof of Corollary 4.2
Achievability is straightforward using a random Gaussian codebook with power control, and
upper-bounding the disturbance rate at receiver Z by white Gaussian noise. The converse
can be seen as follows. Clearly, R ≤ C(P). Let α⋆ ∈ [0, 1] be such that R = C(α⋆P). Then

nC(α⋆P) = nR ≤ I(X^n;Y^n) + nε_n
= h(Y^n) − h(Y^n |X^n) + nε_n,

and therefore,

h(Y^n) ≥ (n/2) log(2πe) + nC(α⋆P) − nε_n
= (n/2) log( 2πe(1 + α⋆P) ) − nε_n.
Since N > 1, we can write the physically degraded form of the channel as Y = X + W1, Z = Y + W̃2, where W̃2 ∼ N(0, N − 1) is the excess noise that receiver Z experiences in addition to receiver Y. Applying the vector entropy power inequality to Z^n = Y^n + W̃_2^n, we conclude

(1/n) h(Z^n) ≥ (1/2) log( 2^{(2/n)h(Y^n)} + 2^{(2/n)h(W̃_2^n)} )
≥ (1/2) log( 2^{−2ε_n} · 2πe(1 + α⋆P) + 2πe(N − 1) )
≥ (1/2) log( 2πe(N + α⋆P) ) − ε_n,
and finally,

Rd ≥ (1/n) I(X^n;Z^n)
= (1/n) h(Z^n) − (1/2) log(2πeN)
≥ C(α⋆P/N) − ε_n.

This concludes the proof of Corollary 4.2. □
4.2.5 Proof of Theorem 4.2
Recall the shape of R(U,X) depicted in Figure 4.2. The coordinates of the corner points A
and B are given by
A(U,X) : R = h(X +W1)− h(W1), (4.13)
Rd = h(X +W2 |U) + h(X +W1)− h(X +W1 |U)− h(W2), (4.14)
B(U,X) : R = h(X +W1 |U)− h(W1), (4.15)
Rd = h(X +W2 |U)− h(W2). (4.16)
Proof of achievability: We specialize Theorem 4.1. Consider the specific p(u, x) constructed as follows. For given positive semidefinite matrices KU , KV ∈ Rn×n with
tr(KU +KV ) ≤ P , let
U ∼ N (0, KU),
V ∼ N (0, KV ),
X = U + V,
where U and V are independent. Then, the terms in Theorem 4.1 evaluate to

I(X;Y ) = h(Y ) − h(W1) = (1/2) log( |KU + KV + K1| / |K1| ),
I(X;Y |U) = h(Y |U) − h(W1) = (1/2) log( |KV + K1| / |K1| ),
I(X;Z |U) = h(Z |U) − h(W2) = (1/2) log( |KV + K2| / |K2| ).
Simplifying the right hand sides and introducing time-sharing leads to the desired result.
For completeness, the coordinates of A and B for given matrices KU , KV are

A(KU , KV ) : R = (1/2) log( |KU + KV + K1| / |K1| ), (4.17)
Rd = (1/2) log( (|KV + K2| |KU + KV + K1|) / (|K2| |KV + K1|) ), (4.18)
B(KU , KV ) : R = (1/2) log( |KV + K1| / |K1| ), (4.19)
Rd = (1/2) log( |KV + K2| / |K2| ). (4.20)

The constituent region R(U,X) for fixed KU and KV is depicted in Figure 4.8. □
Proof of converse: The converse proof of Theorem 4.1 continues to hold, and we only need to show that Gaussian input distributions are sufficient. We proceed as follows. Since the rate–disturbance region is convex, its boundary can be fully characterized by maximizing R − λRd for each λ > 0. We write

R − λRd ≤ max_{(R,Rd)∈R} { R − λRd } = max_{(U,X)} max_{(R,Rd)∈R(U,X)} { R − λRd },

where the outer optimization is over the joint distribution of (U,X) and the inner optimization is over the region achieved by that distribution. The inner optimization can be solved explicitly as follows. For ease of presentation, assume for the moment that the power constraint is of the form KX ⪯ S for some positive semidefinite matrix S. (That is, valid KX are precisely those that result in the matrix S − KX being positive semidefinite.)
Figure 4.8. Constituent region for Theorem 4.2, using a Gaussian superposition codebook with parameters KU and KV. (Corner points A and B are joined by a 45° segment; the intercepts are (1/2) log(|KU + KV + K1|/|K1|), (1/2) log(|KV + K1|/|K1|), and (1/2) log(|KV + K2|/|K2|).)
First, consider λ ≤ 1. For any distribution (U,X) ∼ p(u, x), point A(U,X) achieves a value of the inner optimization at least as large as point B(U,X), or any point on the line between them. Using the coordinates of A(U,X) in (4.13) and (4.14), we can write

R − λRd ≤ max_{(U,X)} { λ (h(X + W1 |U) − h(X + W2 |U)) + (1 − λ) h(X + W1) − h(W1) + λ h(W2) }
(a)
≤ λ · max_{(U,X)} { h(X + W1 |U) − h(X + W2 |U) } + (1 − λ) · max_{(U,X)} { h(X + W1) } − h(W1) + λ h(W2)
(b)
≤ λ · max_{KX ⪯ S} { (1/2) log( |KX + K1| / |KX + K2| ) } + (1 − λ) · max_{KX ⪯ S} { (1/2) log( (2πe)^n |KX + K1| ) } − (1/2) log( (2πe)^n |K1| ) + (λ/2) log( (2πe)^n |K2| ).
In (a), the two maximizations are taken independently. In step (b), the first maximization is achieved by a Gaussian X that is independent of U, due to a theorem proved by Liu and Viswanath [LV07, Thm. 8]. The optimization is now only over covariance matrices. Let K⋆ be an optimizer of this first maximization. The second maximization is also achieved by a Gaussian X, and is optimized by KX = S since f(KX) = |KX + K1| is matrix monotone.
It follows that

R − λRd ≤ (λ/2) log( |K⋆ + K1| / |K⋆ + K2| ) + ((1 − λ)/2) log( (2πe)^n |S + K1| ) − (1/2) log( (2πe)^n |K1| ) + (λ/2) log( (2πe)^n |K2| )
= (1/2) log( |S + K1| / |K1| ) − (λ/2) log( (|K⋆ + K2| |S + K1|) / (|K⋆ + K1| |K2|) ).

But this upper bound is achieved with equality by Gaussian superposition codebooks, namely through the point A(KU , KV ) as specified by equations (4.17) and (4.18), with KU = S − K⋆ and KV = K⋆.
Now, consider λ > 1. The argument proceeds analogously to the previous case. For completeness’ sake, the details are as follows. We can write the inner optimization explicitly using the coordinates of B(U,X) in (4.15) and (4.16) as

R − λRd ≤ max_{(U,X)} { h(X + W1 |U) − λ h(X + W2 |U) } + λ h(W2) − h(W1)
(a)
≤ max_{KX ⪯ S} { (1/2) log( (2πe)^n |KX + K1| ) − (λ/2) log( (2πe)^n |KX + K2| ) } + (λ/2) log( (2πe)^n |K2| ) − (1/2) log( (2πe)^n |K1| ).
The optimum in (a) is achieved by a Gaussian X (independent of U) by virtue of [LV07, Thm. 8], while the other two terms are independent of the optimization variable. Let K⋆ be an optimizer. Then

R − λRd ≤ (1/2) log( |K⋆ + K1| / |K1| ) − (λ/2) log( |K⋆ + K2| / |K2| ).
This upper bound is achieved with equality by Gaussian superposition codebooks through the point B(KU , KV ) as given by equations (4.19) and (4.20) with KU = 0 and KV = K⋆. This is a power control strategy, similar to the scalar Gaussian case.
We have thus shown that under a power constraint KX ⪯ S, Gaussian superposition codes are optimal. The conclusion extends to the sum power constraint tr(KX) ≤ P by observing that

{KX : tr(KX) ≤ P} = ⋃_{S ⪰ 0, tr(S) ≤ P} {KX : KX ⪯ S}.

In other words, the sum power constraint can be expressed as a union of constraints of the type KX ⪯ S, for each of which Gaussian superposition codes are optimal. Therefore, a Gaussian superposition code must be optimal overall, too. □
4.3 Proofs for two disturbance constraints
4.3.1 Proof of Theorem 4.3
Codebook generation. Fix a pmf p(u, x). Split the rate as R = R0 + R1 + R2 + R3. Define the auxiliary rates R̃1 ≥ R1 and R̃2 ≥ R2, let ε′ > 0, and define the set partitions

{1 : 2^{nR̃1}} = L1(1) ∪ · · · ∪ L1(2^{nR1}),
{1 : 2^{nR̃2}} = L2(1) ∪ · · · ∪ L2(2^{nR2}),

where L1(·) and L2(·) are indexed sets of size 2^{n(R̃1−R1)} and 2^{n(R̃2−R2)}, respectively.

1. For each m0 ∈ {1 : 2^{nR0}}, generate u^n(m0) according to ∏_{i=1}^n p(ui).

2. For each l1 ∈ {1 : 2^{nR̃1}}, generate z1^n(m0, l1) according to ∏_{i=1}^n p(z1i | ui(m0)). Likewise, for each l2 ∈ {1 : 2^{nR̃2}}, generate z2^n(m0, l2) according to ∏_{i=1}^n p(z2i | ui(m0)).

3. For each (m0,m1,m2), let S(m0,m1,m2) be the set of all pairs (l1, l2) from the product set L1(m1) × L2(m2) such that (z1^n(m0, l1), z2^n(m0, l2)) ∈ T_{ε′}^{(n)}(Z1, Z2 | u^n(m0)).

4. For each (m0, l1, l2) and m3 ∈ {1 : 2^{nR3}}, generate x^n(m0, l1, l2, m3) according to ∏_{i=1}^n p(xi | ui(m0), z1i(l1), z2i(l2)) if (l1, l2) ∈ S(m0,m1,m2). Otherwise, draw it from Unif(X^n).

5. Choose (l1^{(m0,m1,m2)}, l2^{(m0,m1,m2)}) uniformly from S(m0,m1,m2). If S(m0,m1,m2) is empty, choose (1, 1).

Encoding. To send message m = (m0,m1,m2,m3), transmit the sequence x^n(m0, l1^{(m0,m1,m2)}, l2^{(m0,m1,m2)}, m3).
Decoding. Let ε > ε′. Upon receiving y^n, define the tuple

T(m0,m1,m2,m3) = ( u^n(m0), z1^n(m0, l1^{(m0,m1,m2)}), z2^n(m0, l2^{(m0,m1,m2)}), x^n(m0, l1^{(m0,m1,m2)}, l2^{(m0,m1,m2)}, m3), y^n ).

Declare that m̂ = (m̂0, m̂1, m̂2, m̂3) has been sent if it is the unique message such that T(m̂0, m̂1, m̂2, m̂3) ∈ T_ε^{(n)}(U,Z1, Z2, X, Y ).
Analysis of the probability of error. Without loss of generality, assume that m0 = m1 = m2 = m3 = 1 is transmitted. Define the following events.

Ee1 : S(1, 1, 1) is empty,
Ee2 : S(1, 1, 1) contains two distinct pairs with equal first or second component,
Ei : { T(m0,m1,m2,m3) ∈ T_ε^{(n)}(U,Z1, Z2, X, Y ) for some (m0,m1,m2,m3) ∈ Mi }, i ∈ {0 : 5},

where the message subsets Mi are specified in Table 4.1. Defining the “encoding error” event Ee = Ee1 ∪ Ee2 and the “decoding error” event Ed = E0^c ∪ E1 ∪ E2 ∪ E3 ∪ E4 ∪ E5, the probability of error can be upper-bounded as

P(E) ≤ P(Ee ∪ Ed) ≤ P(Ee) + P(Ed | Ee^c).

The motivation for introducing Ee2 as an “error” is to simplify the analysis of the second probability term.
We bound P(Ee) by the following proposition. Let r1 = R̃1 − R1 and r2 = R̃2 − R2.

Proposition 4.1. P(Ee) → 0 as n → ∞ if

r1 + r2 > I(Z1;Z2 |U) + δ(ε′), (4.21)
r1/2 + r2 < I(Z1;Z2 |U) − δ(ε′), (4.22)
r1 + r2/2 < I(Z1;Z2 |U) − δ(ε′). (4.23)
Proof sketch: First, consider Ee1. As in the proof of Marton’s inner bound for the broadcast
channel, the mutual covering lemma [EK11] implies P(Ee1)→ 0 as n→∞ if (4.21) holds.
Now consider Ee2, for which we need to control the number of typical pairs that can
occur in the same “row” or “column” of the product set L1(m1)×L2(m2), i.e., for the same
l1 or l2 coordinate. The probability P(Ee2) tends to zero provided that (4.22) and (4.23) hold.
This is akin to the birthday problem [Mis39], where k samples are drawn uniformly and independently from {1 : N}, and the interest is in samples that have the same value (collisions). It is well known that for the probability of collision to be pc, the number of samples required is roughly k ≈ √(−2N ln(1 − pc)), which scales with √N. In our case, the number of samples is the cardinality of the set S(m0,m1,m2), which is roughly k = 2^{n(r1+r2−I(Z1;Z2|U))}. The samples are categorized into N1 = 2^{nr1} and N2 = 2^{nr2} classes along rows and columns, respectively. To achieve a probability of collision pc → 0 along both dimensions, we need k ≪ min{√N1, √N2}, which yields exactly the conditions (4.22) and (4.23).

A rigorous proof is given below on page 82. □
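The √N scaling quoted above is easy to reproduce. A quick Monte Carlo sketch (the classical N = 365 birthday instance serves as a sanity check):

```python
import math
import random

def collision_probability(N, k, trials=20000, seed=1):
    """Monte Carlo estimate of the probability that k uniform samples
    from {1..N} contain at least one collision."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(k):
            v = rng.randrange(N)
            if v in seen:
                hits += 1
                break
            seen.add(v)
    return hits / trials

# The approximation k = sqrt(-2 N ln(1 - pc)) from the text: for N = 365
# and pc = 0.5 it gives k of about 22.5, matching the classical answer k = 23.
k_approx = math.sqrt(-2 * 365 * math.log(1 - 0.5))
```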
We bound the probability P(Ed | Ece ) by the following proposition.
Proposition 4.2. P(Ed | Ece )→ 0 as n→∞ if
R3 < H(Y |Z1, Z2, U)− δ(ε), (4.24)
R1 +R3 < H(Y |Z2, U) + I(Z1;Z2 |U)− δ(ε), (4.25)
Message subset m0 m1 m2 m3
M0 1 1 1 1
M1 1 1 1 6= 1
M2 1 6= 1 1 anyM3 1 1 6= 1 anyM4 1 6= 1 6= 1 anyM5 6= 1 any any any
Table 4.1. Message subsets for decoding error events.
80 CHAPTER 4. COMMUNICATION WITH DISTURBANCE CONSTRAINTS
R2 +R3 < H(Y |Z1, U) + I(Z1;Z2 |U)− δ(ε), (4.26)
R1 + R2 +R3 < H(Y |U) + I(Z1;Z2 |U)− δ(ε), (4.27)
R0 + R1 + R2 +R3 < H(Y ) + I(Z1;Z2 |U)− δ(ε). (4.28)
Proof sketch: The events of which Ed is composed are illustrated in Figure 4.9, which also
depicts the structure of the codebook for m0 = 1. The product sets L1(m1) × L2(m2),
for each (m1,m2), are represented by shaded squares. In each product set, the sequence
pair selected in step 5 of the codebook generation procedure is shown with its superposed
xn codewords, as created in step 4. The correct codeword xn(1, 1, 1, 1) is shown as a
white circle which is connected to the received sequence yn. The codewords that may be
mistakenly detected at the receiver are shown as black circles. The product sets associated
with decoding error events E1, E2, E3, and E4 are labeled 1, 2, 3, and 4, respectively.
We bound the probability of each sub-event of Ed. First, note that by the conditional typicality lemma in [EK11], P(E0^c) → 0 as n → ∞ (this relies on ε′ < ε). The probabilities of the events E1 through E5 conditioned on Ee^c tend to zero as n → ∞ under conditions (4.24) through (4.28), correspondingly.
Figure 4.9. Illustration of decoding error events, for m0 = 1. (The product sets L1(m1) × L2(m2) are shown as shaded squares; in each set, the sequence pair chosen in step 5 carries its superposed x^n codewords, with the correct codeword x^n(1, 1, 1, 1) connected to the received sequence y^n and potentially confusable codewords shown alongside.)
The events E2 and E3 require the most careful analysis, since the true codeword, namely x^n(1, 1, 1, 1), and the codewords with which it may be confused can share the same z1^n or z2^n sequence (see the dashed line and the circles on it in Figure 4.9). Moreover, even when the chosen pairs in two different product sets do not share one of the two coordinates (see the chosen pairs for (m1,m2) = (1, 1) and (2, 1) in Figure 4.9), correlation could potentially be caused by the selection procedure in step 5 of codebook generation. We use the independence lemma (Lemma A.2) to show that the event Ee^c prevents this correlation leakage from occurring. The application of the lemma is what distinguishes this analysis from the conventional Marton inner bound for broadcast channels [Mar79, EM81]. There, analysis of the selection process can be altogether avoided since each receiver decodes only one of the two coordinates.

A detailed proof for the event E3 is given below on page 82; the other events follow likewise. □
Analysis of disturbance rate. When viewed by receiver Z1, the codeword for message m = (m0,m1,m2,m3) appears as z1^n(m0, l1^{(m0,m1,m2)}). We can pessimistically assume that all sequences z1^n(m0, l1) as created in step 2 of codebook generation can be seen at the receiver for some message m. Therefore, the number of possible sequences at Z1, and thus its disturbance rate, is upper-bounded by H(Z1^n) ≤ n(R0 + R̃1). Applying the same argument for Z2, the proposed scheme achieves

R0 + R̃1 ≤ Rd,1, (4.29)
R0 + R̃2 ≤ Rd,2. (4.30)

Conclusion of the proof. Collecting inequalities (4.21) through (4.30), recalling R = R0 + R1 + R2 + R3, and using the Fourier–Motzkin procedure to eliminate R0, R1, R̃1, R2, R̃2, and R3 leads to the (R,Rd,1, Rd,2) region claimed in the theorem.
Finally, the statement of Remark 4.6 follows from
− I(Z1;Z2 |U) + I(Z1;Z2 |U, Y )
= −H(Z2 |U) +H(Z2 |U,Z1) +H(Z2 |U, Y )−H(Z2 |U, Y, Z1)
= −I(Y ;Z2 |U) + I(Y ;Z2 |U,Z1),
which leads to the equality
H(Y |Z1, Z2, U) +H(Y |U)− I(Z1;Z2 |U) + I(Z1;Z2 |U, Y )
= H(Y |Z1, Z2, U) +H(Y |U)− I(Y ;Z2 |U) + I(Y ;Z2 |U,Z1)
= H(Y |Z1, U) +H(Y |Z2, U).
This concludes the proof of Theorem 4.3. □
Proof of Proposition 4.1: The product bin (m_1, m_2) = (1, 1) for m_0 = 1 contains lm sequence pairs, where l = 2^{nr_1} and m = 2^{nr_2}. Each pair (Z_1^n(1, l_1), Z_2^n(1, l_2)), for l_1 ∈ {1:l} and l_2 ∈ {1:m}, is jointly typical with probability p ≐ 2^{−nI(Z_1;Z_2|U)}. Now fix one coordinate, say l_1 = 1. The corresponding "row" of the bin contains m sequences Z_2^n(1, l_2), each of which is jointly typical with Z_1^n(1, 1) independently with probability p. Let K be the total number of typical sequences in this row. Then

P{K = 0} = (1 − p)^m,
P{K = 1} = mp(1 − p)^{m−1},
P{K ≥ 2} = 1 − (1 − p + mp)(1 − p)^{m−1} ≤ m^2 p^2,

where the last step uses (1 − p)^{m−1} ≥ 1 − (m − 1)p. We have thus upper-bounded the probability of encountering two or more typical pairs in a single row. Consequently, the probability of two or more typical pairs occurring in any row is upper-bounded by lm^2 p^2. Substituting the definitions leads to the desired inequality. The same argument applies to the columns of the bin. □
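Since K is binomial with parameters m and p, the final bound can be checked numerically; below is a minimal sketch (toy values for m and p, with p standing in for the joint-typicality probability 2^{−nI(Z_1;Z_2|U)}):

```python
def p_two_or_more(m: int, p: float) -> float:
    """Exact P{K >= 2} for K ~ Binomial(m, p), via
    1 - P{K = 0} - P{K = 1} = 1 - (1 - p + m*p) * (1 - p)**(m - 1)."""
    return 1.0 - (1.0 - p + m * p) * (1.0 - p) ** (m - 1)

# The proposition's bound: P{K >= 2} <= m**2 * p**2.  Checked on a grid
# of toy bin sizes m and typicality probabilities p.
for m in (10, 100, 1000):
    for p in (1e-4, 1e-3, 1e-2):
        assert p_two_or_more(m, p) <= m**2 * p**2
```

The bound holds for every m ≥ 1 and p ∈ [0, 1], since the Bernoulli inequality used in the proof is unconditional.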
Proof of Proposition 4.2, exemplified for E_3: We analyze the probability of E_3 as follows. First,

E_3 = { (U^n(1), Z_1^n(1, L_1^{(1,1,m_2)}), Z_2^n(1, L_2^{(1,1,m_2)}), X^n(1, L_1^{(1,1,m_2)}, L_2^{(1,1,m_2)}, m_3), Y^n) ∈ T_ε^{(n)} for some m_2 ≠ 1, m_3 }
4.3. PROOFS FOR TWO DISTURBANCE CONSTRAINTS 83
⊆ { (U^n(1), Z_1^n(1, L_1^{(1,1,m_2)}), Z_2^n(1, l_2), X^n(1, L_1^{(1,1,m_2)}, l_2, m_3), Y^n) ∈ T_ε^{(n)} for some m_2 ≠ 1, m_3, l_2 ∉ L_2(1) }.
Define the event E_eq = {L_1^{(1,1,m_2)} = L_1^{(1,1,1)}}, which allows us to write P(E_3 | E_e^c) = P(E_3 ∩ E_eq | E_e^c) + P(E_3 ∩ E_eq^c | E_e^c). We consider both terms separately. First,

E_3 ∩ E_eq ⊆ { (U^n(1), Z_1^n(1, L_1^{(1,1,1)}), Z_2^n(1, l_2), X^n(1, L_1^{(1,1,1)}, l_2, m_3), Y^n) ∈ T_ε^{(n)} for some l_2 ∉ L_2(1), m_3 }.
Thus,

P(E_3 ∩ E_eq | E_e^c)
  ≤ ∑_{(u^n, z_1^n, y^n) ∈ T_ε^{(n)}} P{ U^n(1) = u^n, Z_1^n(1, L_1^{(1,1,1)}) = z_1^n, Y^n = y^n | E_e^c }
      · ∑_{l_2 ∉ L_2(1)} ∑_{m_3 = 1}^{2^{nR_3}} P{ (u^n, z_1^n, Z_2^n(1, l_2), X^n(1, L_1^{(1,1,1)}, l_2, m_3), y^n) ∈ T_ε^{(n)} | E_e^c }
  ≤ 2^{n(R̃_2 + R_3)} P★,
where P★ is shorthand for the last P{·} expression. Continue with

P★ = ∑_{(z_2^n, x^n) ∈ T_ε^{(n)}(Z_2, X | u^n, z_1^n, y^n)} P{ Z_2^n(1, l_2) = z_2^n, X^n(1, L_1^{(1,1,1)}, l_2, m_3) = x^n | U^n(1) = u^n, Z_1^n(1, L_1^{(1,1,1)}) = z_1^n, Y^n = y^n, E_e^c }
  (a)= ∑_{(z_2^n, x^n) ∈ T_ε^{(n)}(Z_2, X | u^n, z_1^n, y^n)} p(z_2^n | u^n) · p(x^n | z_1^n, z_2^n, u^n)
  ≤ 2^{n(H(X, Z_2 | Z_1, Y, U) − H(Z_2 | U) − H(X | Z_1, Z_2, U) + δ(ε))}
  = 2^{n(−H(Y | Z_1, U) − I(Z_1; Z_2 | U) + δ(ε))},

where the last inequality uses |T_ε^{(n)}(Z_2, X | u^n, z_1^n, y^n)| ≐ 2^{nH(X, Z_2 | Z_1, Y, U)}, p(z_2^n | u^n) ≐ 2^{−nH(Z_2 | U)}, and p(x^n | z_1^n, z_2^n, u^n) ≐ 2^{−nH(X | Z_1, Z_2, U)}.
In step (a), we have used the fact that l_2 ∉ L_2(1), and therefore Z_2^n(1, l_2) relates to a bin other than the first one. It is independent of the conditions Y^n = y^n and E_e^c, both of which relate only to the (1, 1) bin for m_0 = 1. A similar argument applies to the second term. Substituting back into the previous chain of inequalities implies that P(E_3 ∩ E_eq | E_e^c) → 0 as n → ∞ if inequality (4.26) holds.
Next, consider

E_3 ∩ E_eq^c ⊆ { (U^n(1), Z_1^n(1, l_1), Z_2^n(1, l_2), X^n(1, l_1, l_2, m_3), Y^n) ∈ T_ε^{(n)} for some l_1 ∈ L_1(1) \ {L_1^{(1,1,1)}}, l_2 ∉ L_2(1), m_3 }.
We argue

P(E_3 ∩ E_eq^c | E_e^c)
  ≤ ∑_{(u^n, y^n) ∈ T_ε^{(n)}} P{ U^n(1) = u^n, Y^n = y^n | E_e^c }
      · ∑_{l_1 ∈ L_1(1) \ {L_1^{(1,1,1)}}} ∑_{l_2 ∉ L_2(1)} ∑_{m_3 = 1}^{2^{nR_3}} P{ (u^n, Z_1^n(1, l_1), Z_2^n(1, l_2), X^n(1, l_1, l_2, m_3), y^n) ∈ T_ε^{(n)} | U^n(1) = u^n, Y^n = y^n, E_e^c }
  ≤ 2^{n(R̃_1 − R_1 + R̃_2 + R_3)} P★,
where P★ represents the last P{·} expression. Finally,

P★ = ∑_{(z_1^n, z_2^n, x^n) ∈ T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)} P{ Z_1^n(1, l_1) = z_1^n, Z_2^n(1, l_2) = z_2^n, X^n(1, l_1, l_2, m_3) = x^n | U^n(1) = u^n, Y^n = y^n, E_e^c }
  = ∑_{(z_1^n, z_2^n, x^n) ∈ T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)} ∑_{z_2^n(l_2') for all l_2' ∈ L_2(1)} P{ Z_2^n(1, l_2') = z_2^n(l_2') for all l_2' ∈ L_2(1) | E_e^c }
      · P{ Z_1^n(1, l_1) = z_1^n, Z_2^n(1, l_2) = z_2^n, X^n(1, l_1, l_2, m_3) = x^n | U^n(1) = u^n, Y^n = y^n, Z_2^n(1, l_2') = z_2^n(l_2') for all l_2' ∈ L_2(1), E_e^c }
  (a)≤ ∑_{(z_1^n, z_2^n, x^n) ∈ T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)} p(z_1^n | u^n, E_e^c) · p(z_2^n | u^n) · p(x^n | z_1^n, z_2^n, u^n)
  ≤ 2^{n(H(X, Z_1, Z_2 | Y, U) − H(Z_1 | U) − H(Z_2 | U) − H(X | Z_1, Z_2, U) + δ(ε))}
  = 2^{n(−H(Y | U) − I(Z_1; Z_2 | U) + δ(ε))},

where |T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)| ≐ 2^{nH(X, Z_1, Z_2 | Y, U)}, p(z_1^n | u^n, E_e^c) ≐ 2^{−nH(Z_1 | U)} by step (b) below, p(z_2^n | u^n) ≐ 2^{−nH(Z_2 | U)}, and p(x^n | z_1^n, z_2^n, u^n) ≐ 2^{−nH(X | Z_1, Z_2, U)}.
Here, (a) uses the fact that for the l_1 indices in question, Z_1^n(1, l_1) is independent of Y^n. This is a consequence of the independence between the selected Z_1^n(1, L_1^{(1,1,1)}) and the other (non-selected) Z_1^n(1, l_1) due to Lemma A.2. The lemma applies because the event is conditioned (1) on E_e^c, which ensures that L_1^{(1,1,1)} is picked uniformly as required by the lemma, and (2) on Z_2^n(1, l_2') for all l_2' ∈ L_2(1), which provides the qualifying set A' of the lemma.
Step (b) follows from

p(z_1^n | u^n, E_e^c) = p(z_1^n | u^n) · p(E_e^c | u^n, z_1^n) / p(E_e^c | u^n)
  ≤ p(z_1^n | u^n) · 1 / p(E_e^c | u^n)
  ≤ p(z_1^n | u^n) · 1 / (1 − 2^{−δn})
  ≤ 2^{−n(H(Z_1 | U) − ε)} · 2^{nδ'}
  = 2^{−n(H(Z_1 | U) − ε − δ')}.

Here, δ is the minimum slack of the three conditions for E_e^c in Lemma 4.1. Note that for any δ, δ' > 0, we can find an N_0 such that 1/(1 − 2^{−δn}) ≤ 2^{nδ'} for all n ≥ N_0.
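The existence of such an N_0 follows because the left-hand side decreases toward 1 while 2^{nδ'} grows without bound; a quick numerical check (the values of δ and δ' below are illustrative, not from the text):

```python
def first_n(delta: float, delta_prime: float) -> int:
    """Smallest n with 1/(1 - 2**(-delta*n)) <= 2**(delta_prime*n).
    The left-hand side strictly decreases toward 1 and the right-hand
    side strictly increases, so beyond this n the inequality always holds."""
    n = 1
    while 1.0 / (1.0 - 2.0 ** (-delta * n)) > 2.0 ** (delta_prime * n):
        n += 1
    return n

# Illustrative slack values: delta = 0.1, delta' = 0.01.
N0 = first_n(0.1, 0.01)
```

Because both sides are strictly monotone, scanning for the first qualifying n is a valid way to exhibit N_0.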
We conclude that P(E_3 ∩ E_eq^c | E_e^c) → 0 as n → ∞ if

R̃_1 − R_1 + R̃_2 + R_3 ≤ H(Y | U) + I(Z_1; Z_2 | U) − δ(ε).

This is implied by (4.27), which stems from analyzing E_4, and may thus be omitted. □
4.3.2 Proof of Theorem 4.4
First, consider

nR ≤ I(X^n; Y^n) + nε_n
  = ∑_{i=1}^n I(X^n; Y_i | Y^{i−1}) + nε_n
  = ∑_{i=1}^n I(X_i; Y_i | Y^{i−1}) + nε_n
  = nI(X; Y | Q) + nε_n
  = nH(Y | Q) + nε_n.
Furthermore,

nR_{d,1} ≥ I(X^n; Z_1^n)
  ≥ I(Y^n; Z_1^n)
  = ∑_{i=1}^n I(Y_i; Z_1^n | Y^{i−1})
  ≥ ∑_{i=1}^n I(Y_i; Z_{1i} | Y^{i−1})
  = nI(Y; Z_1 | Q),

where Y = Y_T, Z_1 = Z_{1T}, and Q = (Y^{T−1}, T) with T ∼ Unif{1:n}. The same argument leads to

nR_{d,2} ≥ nI(Y; Z_2 | Q),

with the same random variable identifications and, additionally, Z_2 = Z_{2T}. Finally, the cardinality bound on Q follows from the convex cover method in [EK11]. □
4.3.3 Proof of Theorem 4.5
First, we specialize Theorem 4.3 as follows.
Corollary 4.4 (Simpler inner bound for deterministic DMC-2-DC). The rate–disturbance region R of the deterministic channel with two disturbance constraints is inner-bounded by the set of rate triples (R, R_{d,1}, R_{d,2}) such that
R ≤ H(Y ), (4.31)
Rd,1 ≥ I(Y ;Z1, U), (4.32)
Rd,2 ≥ I(Y ;Z2, U), (4.33)
Rd,1 +Rd,2 ≥ I(Y ;Z1, Z2, U) + I(Y ;U) + I(Z1;Z2 |U)
= I(Y ;Z1, U) + I(Y ;Z2, U) + I(Z1;Z2 |U, Y ), (4.34)
for some pmf p(u, x).
The two equivalent expressions in (4.34) originate from Remark 4.6 on page 62. An
example of the constituent regions of Corollary 4.4 for fixed p(u, x) is depicted in Figure 4.10.
The figure also illustrates how the corollary follows from Theorem 4.3: Each constituent
region of the corollary is a strict subset of the constituent region of the theorem, for the same
p(u, x).
Proof of Corollary 4.4: In Theorem 4.3, consider the case where (4.1) is met with equality,
i.e., R = H(Y ). This yields a subset region which is still achievable. It simplifies to
Rd,1 +Rd,2 ≥ I(Z1;Z2 |U), (4.35)
Rd,1 ≥ I(Y ;Z1, U), (4.36)
Rd,2 ≥ I(Y ;Z2, U), (4.37)
Rd,1 +Rd,2 ≥ I(Y ;Z1, Z2, U) + I(Z1;Z2 |U), (4.38)
Rd,1 +Rd,2 ≥ I(Y ;Z1, Z2, U) + I(Y ;U) + I(Z1;Z2 |U)
= I(Y ;Z1, U) + I(Y ;Z2, U) + I(Z1;Z2 |U, Y ). (4.39)
Figure 4.10. Constituent region for Corollary 4.4, for a fixed p(u, x). Each face is annotated by the inequality that defines it. For comparison, the constituent region of Theorem 4.3 is shown with dashed lines (see Figure 4.5).
Clearly, conditions (4.35) and (4.38) are dominated by inequality (4.39), since I(Y; U) and I(Y; Z_1, Z_2, U) are nonnegative, and the desired result follows. □
Proof of achievability for Theorem 4.5: We further specialize Corollary 4.4. We choose
U = Z1 ∨ Z2, i.e., the common part of Z1 and Z2. This implies that condition (4.34) can be
omitted, since I(Z1;Z2 |U, Y ) = 0 for all p(u, x) by assumption. Furthermore, U can be
dropped from conditions (4.32) and (4.33) by virtue of being a function of Z1 and Z2. We
conclude that
R ≤ H(Y ), (4.40)
Rd,1 ≥ I(Y ;Z1), (4.41)
Rd,2 ≥ I(Y ;Z2), (4.42)
is achievable for all p(x). Adding a time-sharing random variable Q completes the proof.
Note that in the special case where Y ≼ Z_1 or Y ≼ Z_2, the same conclusion holds with the choice U = ∅. □
4.3.4 Proof of Corollary 4.3
Proof of achievability: We prove the result for Z_1 ≼ Z_2; the other case follows by symmetry. We specialize the achievable region of Theorem 4.3 by choosing U = Z_2. The
rate–disturbance constraints are
R ≤ H(Y ), (4.43)
Rd,1 +Rd,2 ≥ 0, (4.44)
R−Rd,1 ≤ H(Y |Z1), (4.45)
R−Rd,2 ≤ H(Y |Z2), (4.46)
R−Rd,1 −Rd,2 ≤ H(Y |Z1), (4.47)
2R−Rd,1 −Rd,2 ≤ H(Y |Z1) +H(Y |Z2). (4.48)
Clearly, (4.44) is vacuous. Furthermore, (4.47) is dominated by (4.45), and (4.48) is dominated by the sum of (4.45) and (4.46). This completes the proof. □
Proof of converse: The first inequality follows from Fano's inequality as

nR ≤ I(X^n; Y^n) + nε_n = H(Y^n) + nε_n ≤ nH(Y) + nε_n,

where Y = Y_Q and Q ∼ Unif{1:n}. The other two inequalities follow as

n(R − R_{d,1}) ≤ nR − I(X^n; Z_1^n)
  ≤ H(Y^n) − H(Z_1^n) + nε_n
  ≤ H(Y^n, Z_1^n) − H(Z_1^n) + nε_n
  = H(Y^n | Z_1^n) + nε_n
  ≤ nH(Y | Z_1) + nε_n,

with Z_1 = Z_{1Q}, and likewise for n(R − R_{d,2}). □
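To make the resulting region concrete, the following sketch evaluates the relevant entropy terms for a small hypothetical deterministic channel with Z_1 ≼ Z_2 (the maps y, z_1, z_2 below are illustrative choices, not from the text). For a fixed p(x), the constraints of Corollary 4.3 then read R ≤ H(Y), R − R_{d,1} ≤ H(Y | Z_1), and R − R_{d,2} ≤ H(Y | Z_2):

```python
import math
from collections import defaultdict

def H(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def cond_H(joint):
    """H(Y | Z) from a joint pmf {(y, z): probability}: H(Y, Z) - H(Z)."""
    pz = defaultdict(float)
    for (_, z), p in joint.items():
        pz[z] += p
    return H(joint) - H(pz)

# Illustrative deterministic maps on X = {0..7}: y is the full input,
# z2 its two high bits, z1 its top bit, so Z1 is a function of Z2 (Z1 ≼ Z2).
y, z1, z2 = (lambda x: x), (lambda x: x // 4), (lambda x: x // 2)
px = {x: 1 / 8 for x in range(8)}

pY, pYZ1, pYZ2 = defaultdict(float), defaultdict(float), defaultdict(float)
for x, p in px.items():
    pY[y(x)] += p
    pYZ1[(y(x), z1(x))] += p
    pYZ2[(y(x), z2(x))] += p

print(H(pY), cond_H(pYZ1), cond_H(pYZ2))  # 3.0 2.0 1.0
```

For the uniform input pmf this gives the corner R ≤ 3, R − R_{d,1} ≤ 2, R − R_{d,2} ≤ 1 of the region for this particular p(x).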
Chapter 5
General achievable rate region for 3-DIC
In this chapter,¹ we synthesize the results of the previous chapters and develop a new encoding scheme for the 3-DIC, along with its corresponding achievable rate region. This scheme generalizes the Han–Kobayashi scheme and performs strictly better than the previously discussed inner bounds on the 3-DIC capacity region. The key idea is to combine the receiver-centric insight obtained from interference decoding (Chapter 3) with the transmitter-centric viewpoint that led to the communication-with-disturbance-constraints setting in Chapter 4. We borrow the codebook construction via Marton coding and superposition coding from the latter, and apply saturation arguments at the receiver similar to the former, which permits us to take advantage of the structure of the combined interfering signal without decoding any part of the interfering messages. The proofs are relegated to Section 5.2.
5.1 Results and discussion
We first discuss two equivalent characterizations of the achievable rate region (Subsections 5.1.1 and 5.1.2). The description of the regions is complicated by an auxiliary random variable U_l for each transmitter l ∈ {1:3} and by a Fourier–Motzkin operator that cannot be evaluated symbolically. Thus, we explore two simplifications of the achievable rate region. First, we specialize the region to the case where U_l = ∅ for all l. This leads to a weaker inner bound that is simpler to evaluate numerically (Subsection 5.1.3), and we provide a numerical example for this case. Second, we apply the general inner bound to the special case of the one-to-many 3-DIC, in which only one of the transmitters causes interference. The bound simplifies via symbolic evaluation of the Fourier–Motzkin operator and thus permits a more explicit characterization than in the general case (Subsection 5.1.4).

¹The results in this chapter were first published in [BE11a].
5.1.1 Achievable rate region for 3-DIC
We will need the following notation. Fix a joint pmf for (Q,U1, X1, U2, X2, U3, X3) of the
form
p = p(q)p(u1, x1|q)p(u2, x2|q)p(u3, x3|q).
Here, Q and U_l, for l ∈ {1:3}, are auxiliary random variables of arbitrary cardinality. While Q is a time-sharing random variable that is common to all three transmitters, U_l is associated with the l-th transmitter only. Define the rate region R_1(p) ⊂ ℝ_+^{18} to consist of the rate tuples

(R_{10}, R_{11}, R_{12}, R_{13}, R̃_{12}, R̃_{13},
 R_{20}, R_{22}, R_{23}, R_{21}, R̃_{23}, R̃_{21},
 R_{30}, R_{33}, R_{31}, R_{32}, R̃_{31}, R̃_{32})    (5.1)
such that

R̃_{12} − R_{12} + R̃_{13} − R_{13} ≥ I(X_{12}; X_{13} | U_1, Q),  (5.2)
R̃_{12} − R_{12} + (R̃_{13} − R_{13})/2 ≤ I(X_{12}; X_{13} | U_1, Q),  (5.3)
(R̃_{12} − R_{12})/2 + R̃_{13} − R_{13} ≤ I(X_{12}; X_{13} | U_1, Q),  (5.4)
R̃_{12} ≥ R_{12},  (5.5)
R̃_{13} ≥ R_{13},  (5.6)
and, for all i ∈ {1:5},

r_{1i} ≤ H(X_{11} | c_{1i}, Q) + t_{1i},  (5.7)
r_{1i} + R̃_{21} ≤ H(Y_1 | c_{1i}, U_2, X_{31}, Q) + t_{1i},  (5.8)
r_{1i} + R̃_{31} ≤ H(Y_1 | c_{1i}, X_{21}, U_3, Q) + t_{1i},  (5.9)
r_{1i} + min{R_{20} + R̃_{21}, H(X_{21} | Q)} ≤ H(Y_1 | c_{1i}, X_{31}, Q) + t_{1i},  (5.10)
r_{1i} + min{R_{30} + R̃_{31}, H(X_{31} | Q)} ≤ H(Y_1 | c_{1i}, X_{21}, Q) + t_{1i},  (5.11)
r_{1i} + min{R̃_{21} + R̃_{31}, H(S_1 | U_2, U_3, Q)} ≤ H(Y_1 | c_{1i}, U_2, U_3, Q) + t_{1i},  (5.12)
r_{1i} + min{R_{20} + R̃_{21} + R̃_{31}, H(X_{21} | Q) + R̃_{31}, H(S_1 | U_3, Q)} ≤ H(Y_1 | c_{1i}, U_3, Q) + t_{1i},  (5.13)
r_{1i} + min{R̃_{21} + R_{30} + R̃_{31}, R̃_{21} + H(X_{31} | Q), H(S_1 | U_2, Q)} ≤ H(Y_1 | c_{1i}, U_2, Q) + t_{1i},  (5.14)
r_{1i} + min{R_{20} + R̃_{21} + R_{30} + R̃_{31}, R_{20} + R̃_{21} + H(X_{31} | Q), H(X_{21} | Q) + R_{30} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | c_{1i}, Q) + t_{1i}.  (5.15)
In the latter set of conditions, lower-case symbols are placeholders for the terms specified in
Table 5.1. The term r1i represents rates, the term c1i stands for sets of random variables on
which certain entropy terms are conditioned, and t1i is an additive term. For example, with
i = 3, condition (5.13) corresponds to the inequality

R̃_{13} + R_{11} + min{R_{20} + R̃_{21} + R̃_{31}, H(X_{21} | Q) + R̃_{31}, H(S_1 | U_3, Q)} ≤ H(Y_1 | U_1, X_{12}, U_3, Q) + I(X_{12}; X_{13} | U_1, Q).  (5.16)
Similarly, define the regions R_2(p) and R_3(p) by making the subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in the definition of R_1(p), respectively.
Define an operator FM that maps a convex 18-dimensional set of rate vectors of the
form (5.1) to a 3-dimensional rate region by substituting Rl0 = Rl −Rl1 −Rl2 −Rl3, for
l ∈ {1:3}, and subsequently projecting onto the coordinates (R_1, R_2, R_3). The operator FM can be implemented by Fourier–Motzkin elimination.
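A single Fourier–Motzkin elimination step can be sketched as follows (a minimal illustrative implementation, not the dissertation's code): rows of A x ≤ b with positive and negative coefficients on the eliminated variable are paired so that the variable cancels, and rows not involving it are kept.

```python
from itertools import product

def fm_eliminate(A, b, j):
    """One Fourier-Motzkin step: eliminate variable j from {x : A x <= b}."""
    pos  = [(r, c) for r, c in zip(A, b) if r[j] > 0]
    neg  = [(r, c) for r, c in zip(A, b) if r[j] < 0]
    keep = [(r, c) for r, c in zip(A, b) if r[j] == 0]
    for (rp, cp), (rn, cn) in product(pos, neg):
        # normalize both rows so the x_j coefficients are +1 and -1, then add
        row = [a / rp[j] - bb / rn[j] for a, bb in zip(rp, rn)]
        keep.append((row, cp / rp[j] - cn / rn[j]))
    return [r for r, _ in keep], [c for _, c in keep]

# Toy system in (R0, R1):  R0 + R1 <= 1,  -R0 <= -0.2,  -R1 <= 0.
# Eliminating R0 (index 0) leaves -R1 <= 0 and R1 <= 0.8.
A, b = [[1, 1], [-1, 0], [0, -1]], [1.0, -0.2, 0.0]
A2, b2 = fm_eliminate(A, b, 0)
```

Repeating the step for each auxiliary rate and reading off the surviving inequalities in (R_1, R_2, R_3) is exactly the projection denoted by FM.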
We are now ready to state the main result of this chapter.
Theorem 5.1 (Inner bound to the capacity region of 3-DIC). The region

R_IB = ⋃_p FM{ R_1(p) ∩ R_2(p) ∩ R_3(p) },

where p = p(q)p(u_1, x_1|q)p(u_2, x_2|q)p(u_3, x_3|q), is an inner bound to the capacity region of the 3-DIC.
Remark 5.1 (Saturation). The min terms on the left hand side of conditions (5.7) to (5.15)
correspond to different modes of signal saturation, as in our discussion of interference
decoding in Chapter 3. There are numerous modes of saturation here, since the transmitters
employ a more sophisticated scheme than single-user random codes.
Remark 5.2 (Convexity). The regions R1(p), R2(p), and R3(p) ensure decodability at
the first, second, and third receiver, respectively. As in Chapter 3, they are generally
nonconvex. The regions can alternatively be written as finite unions of convex components
by expanding the cases in which the min terms take on each of their arguments. The
intersection R1(p) ∩R2(p) ∩R3(p) is also generally nonconvex. By virtue of time-sharing,
we are allowed to convexify, as shown in the theorem. This convex hull operation is useful
even for a single fixed distribution p. However, it is not achieved by the coded time-sharing mechanism of Q. The explicit convex hull operation also ensures that the argument of FM is in fact a convex set.

i  r_{1i}                                 c_{1i}                  t_{1i}
1  R_{11}                                 {U_1, X_{12}, X_{13}}   0
2  R̃_{12} + R_{11}                        {U_1, X_{13}}           I(X_{12}; X_{13} | U_1, Q)
3  R̃_{13} + R_{11}                        {U_1, X_{12}}           I(X_{12}; X_{13} | U_1, Q)
4  R̃_{12} + R̃_{13} + R_{11}               {U_1}                   I(X_{12}; X_{13} | U_1, Q)
5  R_{10} + R̃_{12} + R̃_{13} + R_{11}      ∅                       I(X_{12}; X_{13} | U_1, Q)

Table 5.1. Shorthand notation for terms related to transmitter 1.
Remark 5.3 (Fourier–Motzkin elimination). In contrast to other settings that use rate splitting, the Fourier–Motzkin elimination denoted by FM cannot be carried out symbolically, due to the convex hull operation in its argument. However, this does not hinder numerical evaluation of the region: for each fixed p, the set R_1(p) ∩ R_2(p) ∩ R_3(p), as represented by its extreme points, can be computed explicitly, and FM can then be evaluated by numerical Fourier–Motzkin elimination.
Remark 5.4 (Relation to previous bounds). It is not known whether the inner bound is
tight in general. However, it strictly includes the interference decoding inner bound in
Theorem 3.1, and thereby, the bound obtained by single-user random codes and treating
interference as noise in Theorem 2.1, i.e.,
RTIN ⊆ RID ⊆ RIB.
This follows by setting Ul = Xl and Rl = Rl0 for all l ∈ {1:3}.
Remark 5.5 (Generalization of the Han–Kobayashi scheme). The two-pair projections
of the inner bound are optimal, i.e., if one of the three rates, say R3, is set to zero, the
two-dimensional region that the inner bound achieves for (R1, R2) is in fact the capacity
region of the interference channel that consists of the first and second user pair. This follows
by setting U_l = ∅ for all l ∈ {1:3}, letting R̃_{12} = R_{12}, R̃_{13} = R_{13} + I(X_{12}; X_{13} | Q), R̃_{21} = R_{21}, R̃_{23} = R_{23} + I(X_{21}; X_{23} | Q), and replacing all min terms in (5.7) to (5.15)
with their first argument. Moreover, the codebook structure that underlies the inner bound
contains superposition codebooks as a special case. Hence the proposed encoding scheme
subsumes the Han–Kobayashi scheme and generalizes it naturally to more than two user
pairs.
Before we discuss the proof of Theorem 5.1, it is instructive to consider the following
alternative formulation of the inner bound.
5.1.2 Alternative characterization of the achievable rate region
While the following alternative characterization of the inner bound is more difficult to
compute than the region of Theorem 5.1, it allows deeper insight into the structure of the
decodability conditions (see Remark 5.7).
Define a new region R'_1(p) similar to R_1(p) above, but with conditions (5.7) to (5.15) replaced by

r_{1i} + min{r_{21j} + r_{31k}, H(S_1 | c_{21j}, c_{31k}, Q)} ≤ H(Y_1 | c_{1i}, c_{21j}, c_{31k}, Q) + t_{1i},
    for all i ∈ {1:5}, j ∈ {1:3}, k ∈ {1:3}.  (5.17)
The lower-case symbols indexed by i, j, and k are placeholders for the terms specified in
Tables 5.1, 5.2, and 5.3, respectively. For example, the case where i = 3, j = 3, and k = 2 corresponds to the inequality

R̃_{13} + R_{11} + min{ min{R_{20} + R̃_{21}, R_{20} + H(X_{21} | U_2, Q), H(X_{21} | Q)} + min{R̃_{31}, H(X_{31} | U_3, Q)}, H(S_1 | U_3, Q) } ≤ H(Y_1 | U_1, X_{12}, U_3, Q) + I(X_{12}; X_{13} | U_1, Q).  (5.18)
Similarly, define the regions R'_2(p) and R'_3(p) by making the subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in the definition of R'_1(p), respectively.
j  r_{21j}                                                           c_{21j}
1  0                                                                 {X_{21}}
2  min{R̃_{21}, H(X_{21} | U_2, Q)}                                   {U_2}
3  min{R_{20} + R̃_{21}, R_{20} + H(X_{21} | U_2, Q), H(X_{21} | Q)}   ∅

Table 5.2. Shorthand notation for terms related to transmitter 2.
Corollary 5.1 (Alternative inner bound to the capacity region of 3-DIC). The region

R'_IB = ⋃_p FM{ R'_1(p) ∩ R'_2(p) ∩ R'_3(p) },

where p = p(q)p(u_1, x_1|q)p(u_2, x_2|q)p(u_3, x_3|q), is an inner bound to the capacity region of the 3-DIC.
Remark 5.6. The regions R'_IB and R_IB of Corollary 5.1 and Theorem 5.1 are equal. This is proved in Subsection 5.2.3.
Remark 5.7 (R'_IB is logically simpler than R_IB). The condition in inequality (5.17) exposes a product structure with individual "factors" related to the first, second, and third
transmitter as specified in Tables 5.1, 5.2, and 5.3, respectively. This structure reflects the
fact that the transmitted messages are independent and there is no cooperation between the
transmitting nodes.
Remark 5.8 (R_IB is computationally simpler than R'_IB). The sets R_1(p) and R'_1(p) are both defined by fifty inequality conditions. There is a natural one-to-one correspondence between the inequalities for R_1(p) and R'_1(p), in which conditions (5.7) through (5.15) for some index i correspond to inequality (5.17) for the same i and all j, k ∈ {1:3}. However, the individual conditions are much simpler for R_1(p) than for R'_1(p). Consider, for example, the corresponding conditions (5.16) and (5.18). Expanding the nested min terms in (5.18)
leads to

R̃_{13} + R_{11} + min{ R_{20} + R̃_{21} + R̃_{31},
                      R_{20} + R̃_{21} + H(X_{31} | U_3, Q),
                      R_{20} + H(X_{21} | U_2, Q) + R̃_{31},
                      R_{20} + H(X_{21} | U_2, Q) + H(X_{31} | U_3, Q),
                      H(X_{21} | Q) + R̃_{31},
                      H(S_1 | U_3, Q) } ≤ H(Y_1 | U_1, X_{12}, U_3, Q) + I(X_{12}; X_{13} | U_1, Q).  (5.19)

k  r_{31k}                                                           c_{31k}
1  0                                                                 {X_{31}}
2  min{R̃_{31}, H(X_{31} | U_3, Q)}                                   {U_3}
3  min{R_{30} + R̃_{31}, R_{30} + H(X_{31} | U_3, Q), H(X_{31} | Q)}   ∅

Table 5.3. Shorthand notation for terms related to transmitter 3.
The difference between the expression in (5.16) and the one in (5.19) is that the former
has fewer arguments in the min term than the latter. The non-convex set R1(p) therefore
consists of fewer convex components than R ′1(p), which reduces the computational effort
required to evaluate the region.
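The case expansion behind Remarks 5.2 and 5.8 is simply the distribution of intersections over the min-term cases: choosing one argument per min term yields one convex (polyhedral) component, and the region is the union over all choices. A schematic sketch (toy conditions, not the actual rate constraints):

```python
from itertools import product

def in_region(x, conditions):
    """Membership test: every condition 'base + min(terms) <= rhs' holds.
    Each condition is (base, terms, rhs), all evaluated at the point x."""
    return all(base(x) + min(t(x) for t in terms) <= rhs(x)
               for base, terms, rhs in conditions)

def in_union_of_components(x, conditions):
    """Same region written as a union of convex components: pick one min
    argument per condition and require the resulting linear inequalities."""
    return any(all(base(x) + term(x) <= rhs(x)
                   for (base, _, rhs), term in zip(conditions, choice))
               for choice in product(*[terms for _, terms, _ in conditions]))

# Toy 2-d check that both descriptions agree on a grid of points.
conds = [(lambda x: x[0], [lambda x: x[1], lambda x: 0.3], lambda x: 1.0),
         (lambda x: x[1], [lambda x: x[0], lambda x: 0.5], lambda x: 1.2)]
grid = [(i / 10, j / 10) for i in range(15) for j in range(15)]
assert all(in_region(x, conds) == in_union_of_components(x, conds) for x in grid)
```

The number of components is the product of the min-term argument counts, which is why regions with fewer min arguments, such as R_1(p), are cheaper to evaluate.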
The proof of Theorem 5.1 and Corollary 5.1 is divided into three parts, discussed in
Subsections 5.2.1 through 5.2.3. Both results are based on the same encoding scheme, which
originates from the setting of communication with disturbance constraints in Chapter 4
and is described in detail in Subsection 5.2.1. The error probability analysis that leads
to Corollary 5.1 is detailed in Subsection 5.2.2. We treat the signal from the undesired
transmitters using methods from interference decoding with point-to-point codes (see Chapter 3). Each of the corresponding message indices is treated either by the union bound or by
Corollary A.2. The interplay between the union bound and this corollary is the formal reason
for the min terms on the left hand side of the inequalities in Theorem 5.1 and Corollary 5.1.
The operational meaning of these terms is the saturation of different links as discussed in
Remark 5.1. Next, the signal from the desired transmitter is treated by borrowing from
the analysis in disturbance-constrained communication (see Subsection 4.3.1). Finally, in
Subsection 5.2.3, we show that the regions in Theorem 5.1 and Corollary 5.1 are equal,
which concludes the proof of the theorem.
Remark 5.9. In principle, the saturation analysis leading to the min terms in the regions
applies to the 2-DIC as defined in Subsection 1.1.1 as well. However, the capacity region of
that channel can be achieved without considering saturation at all, as given by Theorem 1.1.
In Appendix B, we demonstrate that applying the new tools leads to an alternative achievable
rate region, and show how that region reverts back to the known capacity result. Thus,
saturation analysis provides a benefit only in interference channels with more than two user
pairs, which in an abstract sense agrees with the cases when interference alignment [MMK08,
CJ08] is beneficial.
5.1.3 Region without Ul
The encoding scheme underlying Theorem 5.1 is adopted from the setting of communication
with disturbance constraints. Motivated by the observation in Theorem 4.5 that the auxiliary
random variable U is unnecessary in some special cases of the disturbance-constrained
setting, we specialize the result of Theorem 5.1 to the case with Ul = ∅ and Rl0 = 0 for
l ∈ {1:3}. In terms of the encoding scheme, this corresponds to Marton coding without
common message in the broadcast channel, as opposed to the common message case that
results in the general theorem.
Fix a pmf for (Q,X1, X2, X3) of the form
p = p(q)p(x1|q)p(x2|q)p(x3|q).
Define the rate region R''_1(p) ⊂ ℝ_+^{15} to consist of the rate tuples

(R_{11}, R_{12}, R_{13}, R̃_{12}, R̃_{13}, R_{21}, R_{22}, R_{23}, R̃_{21}, R̃_{23}, R_{31}, R_{32}, R_{33}, R̃_{31}, R̃_{32})  (5.20)
such that

R_{11} ≤ H(X_{11} | X_{12}, X_{13}, Q),
R̃_{12} + R_{11} ≤ H(X_{11} | X_{13}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} ≤ H(X_{11} | X_{12}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} ≤ H(X_{11} | Q) + I(X_{12}; X_{13} | Q),
R_{11} + R̃_{21} ≤ H(Y_1 | X_{12}, X_{13}, X_{31}, Q),
R̃_{12} + R_{11} + R̃_{21} ≤ H(Y_1 | X_{13}, X_{31}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} + R̃_{21} ≤ H(Y_1 | X_{12}, X_{31}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} + R̃_{21} ≤ H(Y_1 | X_{31}, Q) + I(X_{12}; X_{13} | Q),
R_{11} + R̃_{31} ≤ H(Y_1 | X_{12}, X_{13}, X_{21}, Q),
R̃_{12} + R_{11} + R̃_{31} ≤ H(Y_1 | X_{13}, X_{21}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} + R̃_{31} ≤ H(Y_1 | X_{12}, X_{21}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} + R̃_{31} ≤ H(Y_1 | X_{21}, Q) + I(X_{12}; X_{13} | Q),
R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | X_{12}, X_{13}, Q),
R̃_{12} + R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | X_{13}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | X_{12}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | Q) + I(X_{12}; X_{13} | Q),
R̃_{12} − R_{12} + R̃_{13} − R_{13} ≥ I(X_{12}; X_{13} | Q),
R̃_{12} − R_{12} + (R̃_{13} − R_{13})/2 ≤ I(X_{12}; X_{13} | Q),
(R̃_{12} − R_{12})/2 + R̃_{13} − R_{13} ≤ I(X_{12}; X_{13} | Q),
R̃_{12} ≥ R_{12},
R̃_{13} ≥ R_{13}.
Similarly, define the regions R''_2(p) and R''_3(p) by making the subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in the definition of R''_1(p), respectively.
Define an operator FM'' that maps a convex 15-dimensional set of rate vectors of the form (5.20) to a 3-dimensional rate region by substituting R_{l1} = R_l − R_{l2} − R_{l3}, for l ∈ {1:3}, and subsequently projecting onto the coordinates (R_1, R_2, R_3). The operator FM'' can be implemented by Fourier–Motzkin elimination.
Corollary 5.2 (Inner bound to the capacity region of 3-DIC, no U_l). The region

R''_IB = ⋃_p FM''{ R''_1(p) ∩ R''_2(p) ∩ R''_3(p) },

where p = p(q)p(x_1|q)p(x_2|q)p(x_3|q), is an inner bound to the capacity region of the 3-DIC.
Remark 5.10 (R''_IB is computationally simpler than R'_IB and R_IB). By virtue of containing far fewer min terms on the left-hand side of its inequalities, the region of Corollary 5.2 is easier to evaluate computationally than the regions in Theorem 5.1 and Corollary 5.1. (See Remark 5.8.)
Continuation of Example 1.1. Recall the additive 3-DIC in Example 1.1 on page 8.
Figure 5.1 depicts a numerical approximation of the inner bound in Corollary 5.2. The
optimal trade-off between R1 and R2 when R3 = 0 is achieved in Figure 5.1, as per
Remark 5.5. The same is not true for the interference decoding inner bound in Figure 3.2
on page 37, or the inner bound by treating interference as noise in Figure 2.1 on page 14.
Figure 5.2 depicts the intersection of the three-dimensional regions with the plane defined
by R1 = R3. The intersection highlights the improvement of Corollary 5.2, and thus of the
general achievable rate region in Theorem 5.1, over the interference decoding inner bound.
5.1.4 Special case: One-to-many 3-DIC
Consider the one-to-many 3-DIC depicted in Figure 5.3. In this special case, interference
is caused only by the second transmitter, i.e., X12 = X13 = X31 = X32 = ∅. Furthermore,
there are no loss functions on the desired links, i.e., X11 = X1, Y2 = X22 = X2, and
Figure 5.1. Region of Corollary 5.2 for the additive 3-DIC example. Compare to Figure 2.1 onpage 14 and Figure 3.2 on page 37.
Figure 5.2. Comparison of the regions in Theorem 2.1 (treating interference as noise), Theorem 3.1 (interference decoding), and Corollary 5.2 for the additive 3-DIC example.
X33 = X3. This special case is of interest since the region of Theorem 5.1 is sufficiently
simple to permit symbolic evaluation of the Fourier–Motzkin elimination steps. The result
for the special case is thus more concrete than the general result.
Let R(1)(p) be the set of triples (R1, R2, R3) ∈ R3+ satisfying
Rk ≤ H(Xk |Q), k ∈ {1:3},
R1 +R2 ≤ H(Y1 |Q) +H(X2 |U2, X21, Q),
R2 +R3 ≤ H(Y3 |Q) +H(X2 |U2, X23, Q),
Figure 5.3. One-to-many special case of 3-DIC.
R1 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X21, X23 |U2, Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y1 |Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y3 |Q).
Let R(2)(p) be the set of triples (R1, R2, R3) ∈ R3+ satisfying
R1 ≤ I(X1;Y1 |Q),
Rk ≤ H(Xk |Q), k ∈ {2, 3},
R2 +R3 ≤ H(Y3 |Q) +H(X2 |U2, X23, Q),
R1 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X21, X23 |U2, Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y3 |Q).
Let R(3)(p) be the set of triples (R1, R2, R3) ∈ R3+ satisfying
Rk ≤ H(Xk |Q), k ∈ {1, 2},
R3 ≤ I(X3;Y3 |Q),
R1 +R2 ≤ H(Y1 |Q) +H(X2 |U2, X21, Q),
R1 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X21, X23 |U2, Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y1 |Q).
Corollary 5.3 (Inner bound to the capacity region of one-to-many 3-DIC). The region

⋃_p ( R^{(1)}(p) ∪ R^{(2)}(p) ∪ R^{(3)}(p) ),

where p = p(q)p(x_1|q)p(u_2, x_2|q)p(x_3|q), is an inner bound to the capacity region of the one-to-many 3-DIC.
Taking the union over p of the sets R^{(1)}, R^{(2)}, and R^{(3)} individually results in three different inner bounds, none of which dominates the others in general. The conditions for R^{(2)} and R^{(3)} are of the same form as if the first and the third receiver, respectively, were treating interference as noise. The region R^{(1)} then corresponds to both the first and the third receiver decoding part of the second transmitter's message. Note, however, that all three regions are achieved
by the same encoding scheme and with the same decoding rule. As in the simpler setting of
interference decoding with point-to-point codes in Chapter 3, treating interference as noise
is subsumed in the general coding scheme as a special case. The proof of Corollary 5.3 is
given in Subsection 5.2.4.
5.2 Proofs
5.2.1 Codebook generation for Theorem 5.1 and Corollary 5.1
For the sake of simplified notation, we omit the auxiliary random variable Q throughout
this subsection and the subsequent analysis. To obtain the proof with Q, the codebook
generation procedure as described below must be augmented by generating a coded time
sharing sequence qn i.i.d. from p(q), and conditioning all subsequent analysis steps on it in
the usual way [EK11].
Fix a pmf p(u1, x1)p(u2, x2)p(u3, x3). We begin by describing the generation procedure
for the first transmitter. The codebook is constructed as in the deterministic case of communication with two disturbance constraints (see Theorem 4.3 on page 61 and its proof in Subsection 4.3.1 on page 77), using rate splitting, Marton coding, and superposition coding.
The crosslink outputs X12 and X13 take the place of the side receivers that are not interested
in any part of the message.
Split the rate as R_1 = R_{10} + R_{11} + R_{12} + R_{13}. Define the auxiliary rates R̃_{12} ≥ R_{12} and R̃_{13} ≥ R_{13}, let ε' > 0, and define the set partitions

{1:2^{nR̃_{12}}} = L_{12}(1) ∪ · · · ∪ L_{12}(2^{nR_{12}}),
{1:2^{nR̃_{13}}} = L_{13}(1) ∪ · · · ∪ L_{13}(2^{nR_{13}}),

where L_{12}(·) and L_{13}(·) are indexed sets of size 2^{n(R̃_{12}−R_{12})} and 2^{n(R̃_{13}−R_{13})}, respectively.
1. For each m_{10} ∈ {1:2^{nR_{10}}}, generate u_1^n(m_{10}) according to ∏_{i=1}^n p(u_{1i}).

2. For each l_{12} ∈ {1:2^{nR̃_{12}}}, generate x_{12}^n(m_{10}, l_{12}) according to ∏_{i=1}^n p(x_{12i} | u_{1i}(m_{10})). Likewise, for each l_{13} ∈ {1:2^{nR̃_{13}}}, generate a sequence x_{13}^n(m_{10}, l_{13}) according to ∏_{i=1}^n p(x_{13i} | u_{1i}(m_{10})).

3. For each triple (m_{10}, m_{12}, m_{13}), let S(m_{10}, m_{12}, m_{13}) be the set of all pairs (l_{12}, l_{13}) from the product set L_{12}(m_{12}) × L_{13}(m_{13}) such that (x_{12}^n(m_{10}, l_{12}), x_{13}^n(m_{10}, l_{13})) ∈ T_{ε'}^{(n)}(X_{12}, X_{13} | u_1^n(m_{10})).

4. For each (m_{10}, l_{12}, l_{13}) and m_{11} ∈ {1:2^{nR_{11}}}, generate x_1^n(m_{10}, l_{12}, l_{13}, m_{11}) according to ∏_{i=1}^n p(x_{1i} | u_{1i}(m_{10}), x_{12i}(l_{12}), x_{13i}(l_{13})) if (l_{12}, l_{13}) ∈ S(m_{10}, m_{12}, m_{13}). Otherwise, draw it from Unif(X^n).

5. Choose (l_{12}^{(m_{10}, m_{12}, m_{13})}, l_{13}^{(m_{10}, m_{12}, m_{13})}) uniformly at random from S(m_{10}, m_{12}, m_{13}). If S(m_{10}, m_{12}, m_{13}) is empty, choose (1, 1).
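Steps 3 and 5 above can be sketched as follows, with the typicality test replaced by a toy joint-type check on random binary sequences (the bin sizes, block length, and acceptance criterion are illustrative simplifications, not the dissertation's parameters):

```python
import itertools
import random

random.seed(0)
n = 16                                                                # toy block length
bin12 = [[random.randint(0, 1) for _ in range(n)] for _ in range(8)]  # L12(m12)
bin13 = [[random.randint(0, 1) for _ in range(n)] for _ in range(8)]  # L13(m13)

def jointly_typical(a, b, target=0.5, tol=0.2):
    """Toy stand-in for the joint-typicality test: the empirical agreement
    between the two sequences must be close to its expected value."""
    agree = sum(x == y for x, y in zip(a, b)) / len(a)
    return abs(agree - target) <= tol

# Step 3: S collects all index pairs whose sequences pass the joint test.
S = [(l12, l13)
     for l12, l13 in itertools.product(range(len(bin12)), range(len(bin13)))
     if jointly_typical(bin12[l12], bin13[l13])]

# Step 5: choose one pair uniformly at random; fall back to a fixed pair
# (here (0, 0), standing in for (1, 1)) if S is empty.
chosen = random.choice(S) if S else (0, 0)
```

The uniform choice in the last line is precisely what the independence lemma (Lemma A.2) requires of the selection step.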
Note the notational difference in the way the rates R1i are indexed here as opposed
to Subsection 4.3.1. Here, the steps in codebook generation follow the scheme R_{10} → (R_{12}, R_{13}) → R_{11}. The first index of the rate variables is always 1 and represents the first
transmitter. The second index uses the intuition that R10 is “common” in the sense that it
affects both side receivers, and R11 is “private” and does not appear at either side receiver.
(In Subsection 4.3.1, there is only one transmitter and thus no first index. The second index
follows the notational convention R_0 → (R_1, R_2) → R_3.)
Codebooks for the second and third transmitters are generated similarly by applying the subscript substitutions 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in each step of the procedure.
Encoding. To send message m_1 = (m_{10}, m_{12}, m_{13}, m_{11}), transmit x_1^n(m_{10}, l_{12}^{(m_{10}, m_{12}, m_{13})}, l_{13}^{(m_{10}, m_{12}, m_{13})}, m_{11}).
Decoding. The receivers use simultaneous non-unique decoding. The first receiver
observes yn1 . Define the tuple
T (m10,m12,m13,m11,m20, l21,m30, l31)
=(un1 (m10), xn12(m10, l
(m10,m12,m13)12 ), xn13(m10, l
(m10,m12,m13)13 ),
106 CHAPTER 5. GENERAL ACHIEVABLE RATE REGION FOR 3-DIC
xn1 (m10, l(m10,m12,m13)12 , l
(m10,m12,m13)13 ,m11),
un2 (m20), xn21(m20, l21), un3 (m30), xn31(m30, l31), sn1 (m20, l21,m30, l31), yn1
).
Let $\varepsilon > \varepsilon'$. Declare that $\hat{m}_1 = (\hat{m}_{10}, \hat{m}_{12}, \hat{m}_{13}, \hat{m}_{11})$ has been sent if it is the unique message such that
$$T(\hat{m}_{10}, \hat{m}_{12}, \hat{m}_{13}, \hat{m}_{11}, m_{20}, l_{21}, m_{30}, l_{31}) \in \mathcal{T}_\varepsilon^{(n)}(U_1, X_{12}, X_{13}, X_1, U_2, X_{21}, U_3, X_{31}, S_1, Y_1)$$
for some $m_{20}, l_{21}, m_{30}, l_{31}$.
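The rule "unique in the desired message, existential in the others" can be illustrated on a much smaller assumed example: a two-sender deterministic channel $y = x_1 \oplus x_2$, not the 3-DIC itself. For a deterministic channel, joint typicality of $(x_1(\hat{m}_1), x_2(m_2), y^n)$ forces consistency in every position, so the typicality check reduces to exact equality; the receiver recovers $\hat{m}_1$ uniquely, while uniqueness in $m_2$ is never required.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M1, M2 = 64, 4, 4                    # toy block length and codebook sizes

C1 = rng.integers(0, 2, size=(M1, n))   # random codebook of sender 1
C2 = rng.integers(0, 2, size=(M2, n))   # random codebook of sender 2

m1_true, m2_true = 2, 3
y = C1[m1_true] ^ C2[m2_true]           # deterministic channel y = x1 XOR x2

# simultaneous non-unique decoding: m1_hat must be unique,
# the companion index m2 only needs to exist
consistent = {a for a in range(M1) for b in range(M2)
              if np.array_equal(C1[a] ^ C2[b], y)}
m1_hat = consistent.pop() if len(consistent) == 1 else None
```

With high probability no incorrect pair matches the output in all $n$ positions, so the decoder returns the transmitted first index.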
5.2.2 Error probability analysis for Corollary 5.1
Without loss of generality, assume that $m_{l0} = m_{l1} = m_{l2} = m_{l3} = 1$ is transmitted by each user $l \in \{1:3\}$. To analyze the probability of error at the first receiver, define the following events.
$$\begin{aligned} \mathcal{E}_{e1} &: \mathcal{S}(1,1,1) \text{ is empty},\\ \mathcal{E}_{e2} &: \mathcal{S}(1,1,1) \text{ contains two distinct pairs with equal first or second component},\\ \mathcal{E}_0 &= \{T(1,1,1,1,m_{20},l_{21},m_{30},l_{31}) \notin \mathcal{T}_\varepsilon^{(n)} \text{ for all } m_{20}, l_{21}, m_{30}, l_{31}\},\\ \mathcal{E}_{ijk} &= \{T(m_{10},m_{12},m_{13},m_{11},m_{20},l_{21},m_{30},l_{31}) \in \mathcal{T}_\varepsilon^{(n)}\\ &\qquad \text{for some } (m_{10},m_{12},m_{13},m_{11}) \in \mathcal{M}_{1i},\ (m_{20},l_{21}) \in \mathcal{M}_{2j},\ (m_{30},l_{31}) \in \mathcal{M}_{3k}\}, \end{aligned}$$
for $i \in \{1:5\}$, $j \in \{1:3\}$, $k \in \{1:3\}$, where the message subsets $\mathcal{M}_{1i}$, $\mathcal{M}_{2j}$, and $\mathcal{M}_{3k}$ are specified in Tables 5.4, 5.5, and 5.6.
With "encoding" and "decoding" error events
$$\mathcal{E}_e = \mathcal{E}_{e1} \cup \mathcal{E}_{e2}, \qquad \mathcal{E}_d = \mathcal{E}_0 \cup \bigcup_{i,j,k} \mathcal{E}_{ijk},$$
the probability of error is upper bounded by
$$\mathrm{P}(\mathcal{E}) \le \mathrm{P}(\mathcal{E}_e) + \mathrm{P}(\mathcal{E}_d \mid \mathcal{E}_e^c).$$
As in the case of communication with disturbance constraints (Subsection 4.3.1), $\mathrm{P}(\mathcal{E}_e) \to 0$ as $n \to \infty$ if conditions (5.2), (5.3), and (5.4) are fulfilled. We treat $\mathrm{P}(\mathcal{E}_d \mid \mathcal{E}_e^c)$ term by term via the union bound. First, note that by the conditional typicality lemma in [EK11], $\mathrm{P}(\mathcal{E}_0 \mid \mathcal{E}_e^c) \to 0$ as $n \to \infty$ (this relies on $\varepsilon' < \varepsilon$). Next, we bound each term $\mathrm{P}(\mathcal{E}_{ijk} \mid \mathcal{E}_e^c)$. For each $i \in \{1:5\}$, $j \in \{1:3\}$, and $k \in \{1:3\}$, we show that $\mathrm{P}(\mathcal{E}_{ijk} \mid \mathcal{E}_e^c) \to 0$ as $n \to \infty$ is implied by condition (5.17) with the same indices $i$, $j$, and $k$.
As an example, consider the case of $i = 3$, $j = 3$, $k = 2$. The probability of the event
$$\mathcal{E}_{332} = \big\{T(1, 1, m_{13}, m_{11}, m_{20}, l_{21}, 1, l_{31}) \in \mathcal{T}_\varepsilon^{(n)} \text{ for some } m_{13} \neq 1,\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\},$$
conditioned on $\mathcal{E}_e^c$, tends to zero as $n \to \infty$ if (5.18) is true. Recalling the expansion (5.19),
i   Message subset   m10    m12    m13    m11
1   M11              1      1      1      ≠ 1
2   M12              1      ≠ 1    1      any
3   M13              1      1      ≠ 1    any
4   M14              1      ≠ 1    ≠ 1    any
5   M15              ≠ 1    any    any    any

Table 5.4. Message subsets $\mathcal{M}_{1i}$.
j   Message subset   m20    l21
1   M21              1      = L21^(1,1,1)
2   M22              1      ≠ L21^(1,1,1)
3   M23              ≠ 1    any

Table 5.5. Message subsets $\mathcal{M}_{2j}$.
the claim is that $\mathrm{P}(\mathcal{E}_{332} \mid \mathcal{E}_e^c) \to 0$ follows from any of the following sufficient conditions:
$$\begin{aligned} \tilde{R}_{13} + R_{11} + R_{20} + \tilde{R}_{21} + \tilde{R}_{31} &\le \Diamond, &&(5.21\text{a})\\ \tilde{R}_{13} + R_{11} + R_{20} + \tilde{R}_{21} + H(X_{31} \mid U_3) &\le \Diamond, &&(5.21\text{b})\\ \tilde{R}_{13} + R_{11} + R_{20} + H(X_{21} \mid U_2) + \tilde{R}_{31} &\le \Diamond, &&(5.21\text{c})\\ \tilde{R}_{13} + R_{11} + R_{20} + H(X_{21} \mid U_2) + H(X_{31} \mid U_3) &\le \Diamond, &&(5.21\text{d})\\ \tilde{R}_{13} + R_{11} + H(X_{21}) + \tilde{R}_{31} &\le \Diamond, &&(5.21\text{e})\\ \tilde{R}_{13} + R_{11} + H(S_1 \mid U_3) &\le \Diamond, &&(5.21\text{f}) \end{aligned}$$
where $\Diamond$ stands for the right-hand side of conditions (5.18) and (5.19), namely $H(Y_1 \mid U_1, X_{12}, U_3) + I(X_{12}; X_{13} \mid U_1)$.
Recall the definition of the tuple $T$ and write
$$\begin{aligned} \mathcal{E}_{332} = \big\{ &\big(U_1^n(1),\ X_{12}^n(1, L_{12}^{(1,1,m_{13})}),\ X_{13}^n(1, L_{13}^{(1,1,m_{13})}),\ X_1^n(1, L_{12}^{(1,1,m_{13})}, L_{13}^{(1,1,m_{13})}, m_{11}),\\ &\ U_2^n(m_{20}),\ X_{21}^n(m_{20}, l_{21}),\ U_3^n(1),\ X_{31}^n(1, l_{31}),\ S_1^n(m_{20}, l_{21}, 1, l_{31}),\ Y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{13} \neq 1,\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\}\\ \subseteq \big\{ &\big(U_1^n(1),\ X_{12}^n(1, L_{12}^{(1,1,m_{13})}),\ X_{13}^n(1, l_{13}),\ X_1^n(1, L_{12}^{(1,1,m_{13})}, l_{13}, m_{11}),\\ &\ U_2^n(m_{20}),\ X_{21}^n(m_{20}, l_{21}),\ U_3^n(1),\ X_{31}^n(1, l_{31}),\ S_1^n(m_{20}, l_{21}, 1, l_{31}),\ Y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{13} \neq 1,\ l_{13} \notin \mathcal{L}_{13}(1),\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\}, \end{aligned}$$
where we have augmented the event by replacing the random variables $\{L_{13}^{(1,1,m_{13})} : m_{13} \neq 1\}$ with the set of all their possible values, i.e., all $l_{13} \notin \mathcal{L}_{13}(1)$.
There are two cases, depending on whether or not E332 occurs in conjunction with the
k   Message subset   m30    l31
1   M31              1      = L31^(1,1,1)
2   M32              1      ≠ L31^(1,1,1)
3   M33              ≠ 1    any

Table 5.6. Message subsets $\mathcal{M}_{3k}$.
event $\mathcal{E}_{\mathrm{eq}} = \{L_{12}^{(1,1,m_{13})} = L_{12}^{(1,1,1)}\}$. We write $\mathrm{P}(\mathcal{E}_{332} \mid \mathcal{E}_e^c) = \mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}} \mid \mathcal{E}_e^c) + \mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}^c \mid \mathcal{E}_e^c)$ and treat both terms separately. First, consider
$$\begin{aligned} \mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}} \subseteq \big\{ &\big(U_1^n(1),\ X_{12}^n(1, L_{12}^{(1,1,1)}),\ X_{13}^n(1, l_{13}),\ X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}),\\ &\ U_2^n(m_{20}),\ X_{21}^n(m_{20}, l_{21}),\ U_3^n(1),\ X_{31}^n(1, l_{31}),\ S_1^n(m_{20}, l_{21}, 1, l_{31}),\ Y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } l_{13} \notin \mathcal{L}_{13}(1),\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\}. \end{aligned}$$
Thus,
$$\begin{aligned} &\mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}} \mid \mathcal{E}_e^c)\\ &\le \sum_{(u_1^n, x_{12}^n, u_3^n, y_1^n) \in \mathcal{T}_\varepsilon^{(n)}} \mathrm{P}\big\{U_1^n(1) = u_1^n,\ X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n,\ U_3^n(1) = u_3^n,\ Y_1^n = y_1^n \,\big|\, \mathcal{E}_e^c\big\}\\ &\quad \cdot \sum_{l_{13} \notin \mathcal{L}_{13}(1)}\ \sum_{m_{11}=1}^{2^{nR_{11}}} \mathrm{P}\big\{\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), U_2^n(m_{20}), X_{21}^n(m_{20}, l_{21}),\\ &\qquad\qquad u_3^n, X_{31}^n(1, l_{31}), S_1^n(m_{20}, l_{21}, 1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)} \text{ for some } m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}\\ &\le 2^{n(\tilde{R}_{13} + R_{11})}\, P_1, \hspace{8em} (5.22) \end{aligned}$$
where $P_1$ is shorthand for the last $\mathrm{P}\{\cdot\}$ expression. The conditioning $u_1^n, x_{12}^n, u_3^n, y_1^n$ in $P_1$ is our abbreviating notation for $U_1^n(1) = u_1^n$, $X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n$, $U_3^n(1) = u_3^n$, $Y_1^n = y_1^n$. The expression for $P_1$ can be bounded in several ways, each leading to one of the conditions (5.21a) through (5.21f).
To show that conditions (5.21a) to (5.21e) are sufficient, we bound $P_1$ by omitting $S_1^n$ from the typicality requirement:
$$\begin{aligned} P_1 \le \mathrm{P}\big\{ &\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), U_2^n(m_{20}), X_{21}^n(m_{20}, l_{21}), u_3^n, X_{31}^n(1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}. \end{aligned}$$
The subsequent process of bounding $P_1$ decomposes into two stages. The first stage treats the signals from transmitters 2 and 3 using methods from interference decoding with point-to-point codes as developed in Chapter 3. The second stage treats the signals from the desired transmitter and borrows from the analysis of communication with disturbance constraints in Chapter 4.
The first stage itself is subdivided into a union-bound step and a step based on Corol-
lary A.2. Let us focus on (5.21d) to illustrate the proof. In this case, the union bound is
applied to the index m20, and Corollary A.2 accounts for the remaining indices l21 and l31.
$$\begin{aligned} P_1 \le \sum_{m_{20}=2}^{2^{nR_{20}}}\ &\underbrace{\sum_{u_2^n \in \mathcal{T}_\varepsilon^{(n)}(U_2 \mid u_1^n, x_{12}^n, u_3^n, y_1^n)}}_{\doteq\ 2^{nH(U_2 \mid U_1, X_{12}, U_3, Y_1)}}\ \underbrace{\mathrm{P}\big\{U_2^n(m_{20}) = u_2^n \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}}_{\doteq\ 2^{-nH(U_2)}}\\ &\cdot \mathrm{P}\big\{\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), u_2^n, X_{21}^n(m_{20}, l_{21}), u_3^n, X_{31}^n(1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\qquad \text{for some } l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_2^n, u_3^n, y_1^n\big\}\\ \le\ & 2^{n(R_{20} + H(U_2 \mid U_1, X_{12}, U_3, Y_1) - H(U_2) + \delta_1(\varepsilon))}\, P_2, \hspace{8em} (5.23) \end{aligned}$$
where $P_2$ is shorthand for the last probability term.
As the second step, we bound $P_2$ by using Corollary A.2, with
$$\begin{aligned} A_i &= (X_{21}^n(m_{20}, l_{21}), X_{31}^n(1, l_{31})), \quad i = (l_{21}, l_{31}),\\ D &= (X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11})),\\ \mathcal{Q} &= \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1, X_{21}, X_{31} \mid u_1^n, x_{12}^n, u_2^n, u_3^n, y_1^n). \end{aligned}$$
We have
$$\mathcal{Q}_A = \mathcal{T}_\varepsilon^{(n)}(X_{21}, X_{31} \mid u_1^n, x_{12}^n, u_2^n, u_3^n, y_1^n), \qquad |\mathcal{Q}_A| \le 2^{n(H(X_{21}, X_{31} \mid U_1, X_{12}, U_2, U_3, Y_1) + \delta_2(\varepsilon))}. \qquad (5.24)$$
The final building block for using the corollary is to analyze $\mathrm{P}\{(a, D) \in \mathcal{Q}\}$ for any fixed $a = (x_{21}^n, x_{31}^n)$. This constitutes the second stage of the bounding procedure, as it relates only to signals from the first (desired) transmitter. We have
$$\begin{aligned} &\mathrm{P}\{(a, D) \in \mathcal{Q}\}\\ &= \mathrm{P}\big\{(X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11})) \in \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1 \mid u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n)\\ &\hspace{18em} \big|\ \mathcal{E}_e^c, u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n\big\}\\ &= \sum_{(x_{13}^n, x_1^n) \in \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1 \mid u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n)} \mathrm{P}\big\{X_{13}^n(1, l_{13}) = x_{13}^n,\ X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}) = x_1^n\\ &\hspace{6em} \big|\ \mathcal{E}_e^c,\ U_1^n(1) = u_1^n,\ X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n,\ U_2^n(m_{20}) = u_2^n,\\ &\hspace{6em}\phantom{\big|\ } X_{21}^n(m_{20}, l_{21}) = x_{21}^n,\ U_3^n(1) = u_3^n,\ X_{31}^n(1, l_{31}) = x_{31}^n,\ Y_1^n = y_1^n\big\}\\ &\stackrel{(a)}{=} \underbrace{\sum_{(x_{13}^n, x_1^n) \in \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1 \mid u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n)}}_{\doteq\ 2^{nH(X_{13}, X_1 \mid U_1, X_{12}, U_2, X_{21}, U_3, X_{31}, Y_1)}}\ \underbrace{\mathrm{P}\big\{X_{13}^n(1, l_{13}) = x_{13}^n \,\big|\, U_1^n(1) = u_1^n\big\}}_{\doteq\ 2^{-nH(X_{13} \mid U_1)}}\\ &\hspace{6em} \cdot \underbrace{\mathrm{P}\big\{X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}) = x_1^n \,\big|\, x_{12}^n, x_{13}^n, u_1^n\big\}}_{\doteq\ 2^{-nH(X_1 \mid X_{12}, X_{13}, U_1)}}\\ &\le 2^{n(-H(Y_1 \mid U_1, X_{12}, U_3) - I(X_{12}; X_{13} \mid U_1) + H(U_2, X_{21}) + H(X_{31} \mid U_3) - H(U_2, X_{21}, X_{31} \mid U_1, X_{12}, U_3, Y_1) + \delta_3(\varepsilon))}\\ &\stackrel{(b)}{=} P_D. \hspace{8em} (5.25) \end{aligned}$$
In (a), the first term under the sum arises by omitting irrelevant conditions. In particular, the conditions $U_2^n(m_{20}) = u_2^n$ and $X_{21}^n(m_{20}, l_{21}) = x_{21}^n$ can be omitted because $m_{20} \neq 1$. The conditions $U_3^n(1) = u_3^n$ and $X_{31}^n(1, l_{31}) = x_{31}^n$ are omitted because of the Markov chain $(U_3^n, X_{31}^n) - Y_1^n - X_{13}^n$. The condition $X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n$ can be omitted since $l_{13} \notin \mathcal{L}_{13}(1)$. Finally, the conditions $\mathcal{E}_e^c$ and $Y_1^n = y_1^n$ can be omitted because they relate only to the $(1,1)$ bin for $m_{10} = 1$, whereas $X_{13}^n(1, l_{13})$ relates to a bin other than the first due to $l_{13} \notin \mathcal{L}_{13}(1)$. Similar simplifications lead to the second term under the sum. In (b), we note that the expression is independent of $a$ and can thus serve as the required $P_D$ for Corollary A.2.
Backtracking through the stages of the proof, Corollary A.2 implies P2 ≤ |QA| · PD,
where the two terms are given in (5.24) and (5.25). Substituting in (5.23) and eventually
in (5.22) implies that P(E332 ∩ Eeq | Ece )→ 0 as n→∞ follows from (5.21d).
The remaining conditions among (5.21a) through (5.21e) can be shown by varying the division of labor between the union bound and Corollary A.2, that is, by applying the union bound to different subsets of the indices $\{m_{20}, l_{21}, l_{31}\}$. Table 5.7 summarizes the correspondence between subsets and conditions. Note that the layering in codebook generation implies that if the union bound is used on $l_{21}$, then it must also be used on $m_{20}$; only the subsets that satisfy this layering condition appear in the table. The last row of the table corresponds to treating all three indices via the corollary. This leads to a bound that is dominated by (5.21f) and is thus irrelevant.
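The division-of-labor rule behind Table 5.7 can be made mechanical. The Python sketch below regenerates the left-hand sides of the sufficient conditions from the chosen index subset; the rate and entropy labels (with `R~` marking the bin-index rates) are purely symbolic and are read off conditions (5.21a) through (5.21e), so the mapping encoded here is an illustration of the stated rule rather than an independent derivation.

```python
# rate contributed by each union-bounded index
union_rate = {"m20": "R20", "l21": "R~21", "l31": "R~31"}

# entropy term contributed by the indices left to Corollary A.2
# (read off (5.21a)-(5.21e); for {m20, l21} the entropy is unconditioned
# on U2 because m20 is itself among the unknown indices)
corollary_term = {
    frozenset(): "",
    frozenset({"l31"}): "H(X31|U3)",
    frozenset({"l21"}): "H(X21|U2)",
    frozenset({"l21", "l31"}): "H(X21|U2) + H(X31|U3)",
    frozenset({"m20", "l21"}): "H(X21)",
}

def lhs(union_set):
    # layering: a union bound on l21 requires a union bound on m20 as well
    assert "l21" not in union_set or "m20" in union_set
    terms = ["R~13", "R11"] + [union_rate[i] for i in ("m20", "l21", "l31")
                               if i in union_set]
    rest = corollary_term[frozenset({"m20", "l21", "l31"}) - frozenset(union_set)]
    return " + ".join(terms + ([rest] if rest else []))

table_5_7 = {
    "(5.21a)": lhs({"m20", "l21", "l31"}),
    "(5.21b)": lhs({"m20", "l21"}),
    "(5.21c)": lhs({"m20", "l31"}),
    "(5.21d)": lhs({"m20"}),
    "(5.21e)": lhs({"l31"}),
}
```

Each generated left-hand side is compared against the common right-hand side $\Diamond$; the tightest applicable condition depends on which of the rate and entropy terms are smaller.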
To show that condition (5.21f) is sufficient, we bound $P_1$ by omitting $U_2^n$, $X_{21}^n$, and $X_{31}^n$ from the typicality requirement,
$$\begin{aligned} P_1 \le \mathrm{P}\big\{ &\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), u_3^n, S_1^n(m_{20}, l_{21}, 1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}, \end{aligned}$$
and then use Corollary A.2 with
$$\begin{aligned} A_i &= S_1^n(m_{20}, l_{21}, 1, l_{31}), \quad i = (m_{20}, l_{21}, l_{31}),\\ D &= (X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11})),\\ \mathcal{Q} &= \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1, S_1 \mid u_1^n, x_{12}^n, u_3^n, y_1^n). \end{aligned}$$
In order to complete the proof, we need to consider the event $\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}^c$. We omit the details, as they do not contain new ideas. As for $\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}$, the analysis decomposes into two
Union bound applied for indices     Condition
{m20, l21, l31}                     (5.21a)
{m20, l21}                          (5.21b)
{m20, l31}                          (5.21c)
{m20}                               (5.21d)
{l31}                               (5.21e)
{}                                  (none)

Table 5.7. Index subsets for union bound and corresponding sufficient conditions.
stages, relating to transmitters 2 and 3, and to transmitter 1, respectively. The first stage uses a combination of the union bound and Corollary A.2. In the second stage, the independence lemma (Lemma A.2) is needed to rule out correlation leakage through the Marton selection process. Eventually, the conditions for $\mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}^c \mid \mathcal{E}_e^c) \to 0$ are subsumed by the conditions for $\mathrm{P}(\mathcal{E}_{432}) \to 0$.
This concludes the proof of Corollary 5.1. $\blacksquare$
5.2.3 Equivalence of Theorem 5.1 and Corollary 5.1
Finally, we show that the regions RIB and R ′IB in Theorem 5.1 and Corollary 5.1 are equal.
It is clear that RIB ⊆ R ′IB, since the conditions of the theorem are more stringent than those
of the corollary (compare Remark 5.8). To show the converse inclusion, we establish that
every rate point in R ′IB is contained in RIB. The key idea is to vary the auxiliary rates that
define the rate split while keeping the overall rate unchanged. The procedure is analogous to
the analysis of the two-user-pair case in Appendix B.1.
Consider a fixed distribution $p$. We are given a rate split
$$(R_{10}', R_{12}', R_{13}', \tilde{R}_{12}', \tilde{R}_{13}', R_{11}'),$$
which satisfies the conditions of Corollary 5.1. Define
$$\Delta_{12} = \tilde{R}_{12}' - \min\{\tilde{R}_{12}', H(X_{12} \mid U_1)\}, \qquad \Delta_{13} = \tilde{R}_{13}' - \min\{\tilde{R}_{13}', H(X_{13} \mid U_1)\},$$
and let the modified rate split $(R_{10}, R_{12}, R_{13}, \tilde{R}_{12}, \tilde{R}_{13}, R_{11})$ be given as
$$\begin{aligned} R_{10} &= R_{10}', & R_{12} &= R_{12}' - \Delta_{12}, & R_{13} &= R_{13}' - \Delta_{13},\\ \tilde{R}_{12} &= \tilde{R}_{12}' - \Delta_{12}, & \tilde{R}_{13} &= \tilde{R}_{13}' - \Delta_{13}, & R_{11} &= R_{11}' + \Delta_{12} + \Delta_{13}. \end{aligned}$$
First note that this rate split maintains the same overall rate $R_{10} + R_{12} + R_{13} + R_{11}$ as the original rate split. To verify non-negativity of each component rate, first note that $R_{10}, \tilde{R}_{12}, \tilde{R}_{13}, R_{11} \ge 0$ by definition. Furthermore,
$$R_{12} = R_{12}' - \tilde{R}_{12}' + \min\{\tilde{R}_{12}', H(X_{12} \mid U_1)\} \ge 0,$$
where the inequality follows from $R_{12}' \ge 0$ and condition (5.3). Likewise, it follows that $R_{13} \ge 0$. The modified rate split is thus valid. It remains to be shown that $(R_{10}, R_{12}, R_{13}, \tilde{R}_{12}, \tilde{R}_{13}, R_{11})$ satisfies conditions (5.2) to (5.15), using the fact that the tuple $(R_{10}', R_{12}', R_{13}', \tilde{R}_{12}', \tilde{R}_{13}', R_{11}')$ satisfies (5.2) to (5.6) and (5.17). This is a tedious but straightforward exercise, which we omit here. This concludes the proof that the statements of Theorem 5.1 and Corollary 5.1 are equivalent, and thereby, the proof of Theorem 5.1. $\blacksquare$
5.2.4 Proof of Corollary 5.3
We apply Theorem 5.1. Since the first transmitter does not cause any interference, we set $U_1 = \emptyset$ and $R_{10} = R_{12} = \tilde{R}_{12} = R_{13} = \tilde{R}_{13} = 0$, i.e., the entire rate $R_1$ is contained in $R_{11}$, and the codebook at the first transmitter degenerates to a non-layered (single-user) random codebook according to $p(x_1)$. We proceed analogously for the third transmitter. Using these simplifications, the intersection $\mathcal{R}_1(p) \cap \mathcal{R}_2(p) \cap \mathcal{R}_3(p)$ of Theorem 5.1 is represented by the reduced set of conditions
$$\begin{aligned} R_1 &\le H(X_1 \mid Q), &&(5.26)\\ R_1 + \tilde{R}_{21} &\le H(Y_1 \mid U_2, Q), &&(5.27)\\ R_1 + \min\{R_{20} + \tilde{R}_{21},\, H(X_{21} \mid Q)\} &\le H(Y_1 \mid Q), &&(5.28)\\ R_{22} &\le H(Y_2 \mid U_2, X_{21}, X_{23}, Q), &&(5.29)\\ \tilde{R}_{21} + R_{22} &\le H(Y_2 \mid U_2, X_{23}, Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.30)\\ \tilde{R}_{23} + R_{22} &\le H(Y_2 \mid U_2, X_{21}, Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.31)\\ \tilde{R}_{21} + \tilde{R}_{23} + R_{22} &\le H(Y_2 \mid U_2, Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.32)\\ R_{20} + \tilde{R}_{21} + \tilde{R}_{23} + R_{22} &\le H(Y_2 \mid Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.33)\\ \tilde{R}_{21} - R_{21} + \tilde{R}_{23} - R_{23} &\ge I(X_{21}; X_{23} \mid U_2, Q), &&(5.34)\\ \tilde{R}_{21} - R_{21} + (\tilde{R}_{23} - R_{23})/2 &\le I(X_{21}; X_{23} \mid U_2, Q), &&(5.35)\\ (\tilde{R}_{21} - R_{21})/2 + \tilde{R}_{23} - R_{23} &\le I(X_{21}; X_{23} \mid U_2, Q), &&(5.36)\\ \tilde{R}_{21} &\ge R_{21}, &&(5.37)\\ \tilde{R}_{23} &\ge R_{23}, &&(5.38)\\ R_3 &\le H(X_3 \mid Q), &&(5.39)\\ R_3 + \tilde{R}_{23} &\le H(Y_3 \mid U_2, Q), &&(5.40)\\ R_3 + \min\{R_{20} + \tilde{R}_{23},\, H(X_{23} \mid Q)\} &\le H(Y_3 \mid Q), &&(5.41) \end{aligned}$$
where conditions (5.26) to (5.28) are from $\mathcal{R}_1(p)$, conditions (5.29) to (5.38) are from $\mathcal{R}_2(p)$, and conditions (5.39) to (5.41) are from $\mathcal{R}_3(p)$ in Theorem 5.1, respectively. Due to the min terms on the left-hand sides of (5.28) and (5.41), these conditions specify a non-convex region which equals the union of four convex sets $\mathcal{R}^{(1)}(p) \cup \mathcal{R}^{(2)}(p) \cup \mathcal{R}^{(3)}(p) \cup \mathcal{R}^{(4)}(p)$. Each such set is obtained by replacing the two min terms with one of their two arguments, respectively. Thus the region of Theorem 5.1 becomes
$$\begin{aligned} \bigcup_p \mathrm{FM}\Big\{\overline{\mathcal{R}_1(p) \cap \mathcal{R}_2(p) \cap \mathcal{R}_3(p)}\Big\} &= \bigcup_p \mathrm{FM}\Big\{\overline{\mathcal{R}^{(1)}(p) \cup \mathcal{R}^{(2)}(p) \cup \mathcal{R}^{(3)}(p) \cup \mathcal{R}^{(4)}(p)}\Big\}\\ &\stackrel{(a)}{=} \bigcup_p \overline{\mathrm{FM}\{\mathcal{R}^{(1)}(p)\} \cup \mathrm{FM}\{\mathcal{R}^{(2)}(p)\} \cup \mathrm{FM}\{\mathcal{R}^{(3)}(p)\} \cup \mathrm{FM}\{\mathcal{R}^{(4)}(p)\}}\\ &\stackrel{(b)}{=} \bigcup_p \overline{\mathcal{R}^{(1)}(p) \cup \mathcal{R}^{(2)}(p) \cup \mathcal{R}^{(3)}(p)}, \end{aligned}$$
where in (a), we have exchanged the convex hull operator and the Fourier–Motzkin operator
FM. In step (b), the operator FM is evaluated symbolically. For i ∈ {1:3}, the expression
FM{R(i)(p)} evaluates to the region R(i)(p) claimed in the corollary. The fourth term
FM{R(4)(p)} turns out to be a subset of R(1)(p) and can thus be omitted. Thus we have
proved Corollary 5.3. $\blacksquare$
Chapter 6
Conclusion
The main result of this dissertation is the inner bound to the capacity region of the 3-DIC given in Theorem 5.1 and the insight into coding scheme design that it entails. The inner bound strictly includes all previously known inner bounds, and thus contributes to a deeper understanding of interference channels in general. The two main ingredients are the codebook design, which is inspired by communication with disturbance constraints, and the receiver architecture, which is drawn from interference decoding.
The coding scheme that we obtained by combining these viewpoints provides a natural
extension of the Han–Kobayashi scheme to interference channels with more than two
user pairs. It turns out that the key property of the Han–Kobayashi scheme that allows
generalization is not splitting the message into public and private components and building a
layered superposition codebook. Instead, the invariant between the Han–Kobayashi scheme
for two-pair channels and its proposed generalization to the three-pair case is that both
schemes solve the underlying disturbance-constrained communication problem.
On a more abstract level, the modular approach that we have taken may be applicable
in other problems of multi-terminal information theory as well. Only by focusing on the
transmitter and receiver end of the problem individually first and solving the associated
problems in isolation did we gain the insight for approaching the 3-DIC problem.
Finally, although we described the encoding scheme and the resulting achievable rate
region using the example of deterministic channels with three user pairs, the key ideas can be
readily applied to discrete memoryless interference channels, and to interference channels
with a larger number of user pairs. The main difficulty in generalizing to interference
channels with noise is that the associated disturbance-constrained communication problem
then requires additional auxiliary random variables, and consequently, the codebook structure
at the side receivers does not simplify as in the deterministic case.
Appendix A
Useful auxiliary results
A.1 Probability decomposition by index and by value
In this section, we state and prove a lemma and two corollaries that are useful in error
probability analyses that involve saturation arguments. They bound the probability of a
union of events, such as the probability that any one of the possible incorrect message
combinations appears to be correct at a receiver in the 3-DIC. The proofs are given below.
Lemma A.1. Let $A_1, \ldots, A_n$ be identically distributed random variables from an alphabet $\mathcal{A}$, and let $D$ be a random variable from alphabet $\mathcal{D}$. Let $\mathcal{Q} \subset \mathcal{A} \times \mathcal{D}$ be a set of "qualified" pairs. Then
$$\mathrm{P}\Big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\Big) \le \sum_{i=1}^n \mathrm{P}\{(A_i, D) \in \mathcal{Q}\}, \qquad (\text{A.1})$$
$$\mathrm{P}\Big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\Big) \le \sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\}. \qquad (\text{A.2})$$
Remark A.1. Inequality (A.1) is the well-known union bound; it decomposes the probabil-
ity by index. The second inequality (A.2) decomposes the probability by value.
Remark A.2. Note that the random variable $D$ is crucial in inequality (A.2). With $D = \emptyset$, the terms in the sum are essentially indicator functions, and the right-hand side of the bound generally becomes larger than one and is thus useless. For the bound to be useful, the randomness of $D$ must act as dithering that equalizes the probability $\mathrm{P}\{(a, D) \in \mathcal{Q}\}$ over $a$.
For the following two corollaries, in addition to the assumptions in Lemma A.1, let $\mathcal{Q}_A$ be the subset of values from $\mathcal{A}$ that can qualify at all,¹ and let $P_D$ be given such that $\mathrm{P}\{(a, D) \in \mathcal{Q}\} \le P_D$ for all $a$.

Corollary A.1. $\mathrm{P}\big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\big) \le n\, \mathrm{P}\{A_i \in \mathcal{Q}_A\} \cdot P_D$.

Corollary A.2. $\mathrm{P}\big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\big) \le |\mathcal{Q}_A| \cdot P_D$.
Remark A.3. The factor nP{Ai ∈ QA} in Corollary A.1 is the expected number of random
variables Ai that qualify by themselves (for some d). It thus relates to counting random
variables, which matches the interpretation of enumerating random variable indices. On the
other hand, the factor |QA| in Corollary A.2 counts a set of values of random variables.
Proof of Lemma A.1: Inequality (A.1) is the union bound. To see inequality (A.2), consider
$$\begin{aligned} \mathrm{P}\Big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\Big) &= \sum_{a^n \in \mathcal{A}^n} \mathrm{P}\{A^n = a^n\}\ \mathrm{P}\Big(\bigcup_{i=1}^n \{(a_i, D) \in \mathcal{Q}\}\Big)\\ &\stackrel{(a)}{\le} \sum_{a^n \in \mathcal{A}^n} \mathrm{P}\{A^n = a^n\}\ \mathrm{P}\Big(\bigcup_{a \in \mathcal{A}} \{(a, D) \in \mathcal{Q}\}\Big)\\ &\stackrel{(b)}{\le} \sum_{a^n \in \mathcal{A}^n} \mathrm{P}\{A^n = a^n\} \sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\}\\ &= \sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\}, \end{aligned}$$
where (a) uses the fact that the union contains at most $|\mathcal{A}|$ distinct events, and (b) uses the union bound. $\blacksquare$
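The difference between the two decompositions can be seen numerically. The toy instance below (all parameters are assumptions for illustration) is built so that only four values of $a$ can ever qualify: the by-value bound (A.2) then stays below one, which is exactly the situation exploited by Corollary A.2 with $|\mathcal{Q}_A| = 4$, while the by-index bound (A.1) saturates once $n$ is large.

```python
import numpy as np

rng = np.random.default_rng(2)
nA, n, trials = 20, 50, 200_000

# Q = {(a, d) : a < 4 and (a + d) mod 20 < 3}, so Q_A = {0, 1, 2, 3}
def qualifies(a, d):
    return (a < 4) & (((a + d) % nA) < 3)

A = rng.integers(0, nA, size=(trials, n))   # A_1, ..., A_n, i.i.d. uniform
D = rng.integers(0, nA, size=(trials, 1))   # the dithering variable D

p_union = qualifies(A, D).any(axis=1).mean()            # P(union of events)
bound_by_index = n * qualifies(A[:, :1], D).mean()      # right side of (A.1)
bound_by_value = sum(qualifies(a, D[:, 0]).mean()       # right side of (A.2)
                     for a in range(nA))
```

Here `bound_by_value` is about $4 \cdot 3/20 = 0.6$ regardless of $n$, while `bound_by_index` grows linearly in $n$ and is already vacuous at $n = 50$.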
Proof of Corollary A.1: Refine the right-hand side of (A.1) as
$$\sum_{i=1}^n \mathrm{P}\{(A_i, D) \in \mathcal{Q}\} = \sum_{i=1}^n \sum_{a \in \mathcal{Q}_A} \mathrm{P}\{A_i = a\}\ \mathrm{P}\{(a, D) \in \mathcal{Q}\} \le n\, P_D \cdot \mathrm{P}\{A_i \in \mathcal{Q}_A\}. \qquad \blacksquare$$
¹ $\mathcal{Q}_A = \{a \in \mathcal{A} : (a, d) \in \mathcal{Q} \text{ for some } d\}$.
Proof of Corollary A.2: Develop the right-hand side of (A.2) as
$$\sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\} = \sum_{a \in \mathcal{Q}_A} \mathrm{P}\{(a, D) \in \mathcal{Q}\} \le |\mathcal{Q}_A| \cdot P_D. \qquad \blacksquare$$
A.2 Independence lemma
Lemma A.2 (Independence lemma). Consider a finite set $\mathcal{A}$ and a subset $\mathcal{A}' \subset \mathcal{A}$. Let $p_A$ be an arbitrary pmf over $\mathcal{A}$. Let the random vector $A^n$ be distributed proportionally to the product distribution $\prod_{l=1}^n p_A(a_l)$, restricted to the support set $\{a^n : a_k \in \mathcal{A}' \text{ for some } k\}$. Let $I$ be drawn uniformly from $\{i : A_i \in \mathcal{A}'\}$. Let $J = ((I + s - 1) \bmod n) + 1$ for some integer $s \in \{1:(n-1)\}$. Then, the random variables $A_I$ and $A_J$ are independent.
Proof: We prove the lemma for $s = 1$; the remaining cases follow by symmetry. For ease of notation, define the specialized modulo operator $\llbracket x \rrbracket = 1 + ((x-1) \bmod n)$, the indicator function $\mathbb{1}_{\mathcal{A}'}(a) = 1$ if $a \in \mathcal{A}'$ and $0$ otherwise, and the shorthand notations $Y = A_I$ and $Z = A_J$. Notice that
$$p(a^n) = \begin{cases} \frac{1}{c} \prod_{l=1}^n p_A(a_l) & \text{if } a_k \in \mathcal{A}' \text{ for some } k \in \{1:n\},\\ 0 & \text{otherwise}, \end{cases}$$
where $c$ is a normalization constant, the exact value of which is not relevant. Further,
$$p(i \mid a^n) = \begin{cases} \frac{1}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} & \text{if } a_i \in \mathcal{A}',\\ 0 & \text{otherwise}. \end{cases}$$
The joint distribution of $(A^n, I, J, Y, Z)$ is then
$$p(a^n, i, j, y, z) = \begin{cases} \frac{p(a^n)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} & \text{if } a_i \in \mathcal{A}',\ a_i = y,\ a_j = z, \text{ and } j = \llbracket i+1 \rrbracket,\\ 0 & \text{otherwise}. \end{cases}$$
Partially marginalizing, it follows that
$$p(y, z) = \sum_{i=1}^n\ \sum_{\substack{a^n :\ a_i \in \mathcal{A}'\\ a_i = y,\ a_{\llbracket i+1 \rrbracket} = z}} \frac{p(a^n)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)}.$$
It is clear that $p(y, z) = p(y)p(z) = 0$ if $y \notin \mathcal{A}'$. On the other hand, for $y \in \mathcal{A}'$, we have
$$p(y, z) = \sum_{i=1}^n\ \sum_{\substack{a^n :\ a_i = y\\ a_{\llbracket i+1 \rrbracket} = z}} \frac{\prod_{l=1}^n p_A(a_l)}{c \sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)}.$$
The fraction under the sum is invariant under permutations of $a^n$. Therefore,
$$p(y, z) = \frac{1}{c} \sum_{i=1}^n\ \sum_{\substack{a^n :\ a_1 = y\\ a_2 = z}} \frac{\prod_{l=1}^n p_A(a_l)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} = \frac{n}{c} \sum_{a^n = (y, z, a_3^n)} \frac{\prod_{l=1}^n p_A(a_l)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} = \frac{n\, p_A(y)\, p_A(z)}{c} \sum_{a_3^n \in \mathcal{A}^{n-2}} \frac{\prod_{l=3}^n p_A(a_l)}{1 + \mathbb{1}_{\mathcal{A}'}(z) + \sum_{k=3}^n \mathbb{1}_{\mathcal{A}'}(a_k)},$$
where $a_3^n$ are the last $n-2$ components of $a^n$. Observe that $p(y, z)$ separates into a function of $z$ and a function of $y$. Independence is thus established. $\blacksquare$
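Because the statement is exact, it can be verified exhaustively for a small instance with rational arithmetic. The choices of alphabet, pmf, $n$, and $s$ below are arbitrary; the snippet enumerates all supported sequences $a^n$, forms the exact joint pmf of $(Y, Z) = (A_I, A_J)$, and checks that it factors.

```python
import itertools
from fractions import Fraction

alphabet = (0, 1, 2)
Aprime = {1, 2}
pA = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
n, s = 4, 1

joint = {}                     # unnormalized joint pmf of (Y, Z)
c = Fraction(0)                # normalization constant of p(a^n)
for an in itertools.product(alphabet, repeat=n):
    hits = [i for i in range(n) if an[i] in Aprime]
    if not hits:
        continue               # a^n outside the support set
    w = Fraction(1)
    for a in an:
        w *= pA[a]
    c += w
    for i in hits:             # I is uniform over {i : a_i in A'}
        j = (i + s) % n        # 0-indexed version of J = ((I + s - 1) mod n) + 1
        key = (an[i], an[j])
        joint[key] = joint.get(key, Fraction(0)) + w / len(hits)

joint = {k: v / c for k, v in joint.items()}
py = {y: sum(v for (yy, _), v in joint.items() if yy == y) for y in alphabet}
pz = {z: sum(v for (_, zz), v in joint.items() if zz == z) for z in alphabet}
independent = all(joint.get((y, z), Fraction(0)) == py[y] * pz[z]
                  for y in alphabet for z in alphabet)
```

Since `Fraction` arithmetic is exact, the factorization check is an equality test rather than a numerical tolerance.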
Appendix B
Application of new techniques to 2-DIC
In this appendix, we show that the capacity region of the 2-DIC as given by Theorem 1.1 can
be written in the same notational framework as Theorem 5.1 and Corollary 5.1. Furthermore,
we show that saturation effects do not play a crucial role for the 2-DIC.
The capacity region of the 2-DIC can alternatively be written as follows. Fix a joint distribution for $(Q, X_1, X_2)$ of the form $p = p(q)p(x_1|q)p(x_2|q)$. Let the region $\mathcal{R}_1'(p) \subset \mathbb{R}_+^4$ be the set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ that satisfy
$$r_{1i} + r_{2j} \le H(Y_1 \mid c_{1i}, c_{2j}, Q), \quad \text{for all } i, j \in \{1, 2\}. \qquad (\text{B.1})$$
The lower-case symbols are placeholders for the terms shown in Tables B.1 and B.2.
Similarly, define $\mathcal{R}_2'(p)$ by making the subscript replacement $1 \mapsto 2 \mapsto 1$ in the definition of $\mathcal{R}_1'(p)$. Define the operator $\mathrm{FM}'$ as the specialized Fourier–Motzkin elimination that maps a convex 4-dimensional set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ to a 2-dimensional rate region by substituting $R_{12} = R_1 - R_{11}$ and $R_{21} = R_2 - R_{22}$, and then projecting onto the coordinates $(R_1, R_2)$.
i   r1i                c1i
1   R11                {X12}
2   R12 + R11          ∅

Table B.1. 2-DIC shorthand notation for terms related to transmitter 1.
Theorem B.1 (Capacity region of 2-DIC). The capacity region of the 2-DIC is equal to the set
$$\mathscr{R}_{\text{2-DIC}}' = \overline{\bigcup_p \mathrm{FM}'\Big\{\overline{\mathcal{R}_1'(p) \cap \mathcal{R}_2'(p)}\Big\}},$$
where $p = p(q)p(x_1|q)p(x_2|q)$.
Remark B.1. The region in this theorem has the same product structure as the 3-DIC region
in Corollary 5.1.
Achievability of the region $\mathscr{R}_{\text{2-DIC}}'$ in Theorem B.1 follows from Han–Kobayashi coding. The first transmitter constructs $2^{nR_{12}}$ cloud center codewords according to $p(x_{12})$, and $2^{nR_{11}}$ satellite codewords according to $p(x_1|x_{12})$ for each cloud center. The second transmitter proceeds likewise. The error probability analysis is entirely analogous to the proof of Corollary 5.1 in Subsection 5.2.2. Each combination of $(i, j)$ in condition (B.1) corresponds to a certain error event at the first receiver. Since correct decoding is required only for the messages associated with $R_{11}$ and $R_{12}$, we can take advantage, at least formally, of saturation effects as expressed by the min term in Table B.2.
Remark B.2 (Explicit notation for Theorem B.1). The region R ′1(p) can be rewritten by
expanding conditions (B.1) explicitly. It is the set of rate tuples (R11, R12, R21, R22) that
j   r2j                          c2j
1   0                            {X21}
2   min{R21, H(X21|Q)}           ∅

Table B.2. 2-DIC shorthand notation for terms related to transmitter 2.
satisfy
$$\begin{aligned} \mathcal{R}_1'(p):\qquad R_{11} &\le H(X_{11} \mid X_{12}, Q), &&(\text{B.2a})\\ R_{12} + R_{11} &\le H(X_{11} \mid Q), &&(\text{B.2b})\\ R_{11} + \min\{R_{21}, H(X_{21} \mid Q)\} &\le H(Y_1 \mid X_{12}, Q), &&(\text{B.2c})\\ R_{12} + R_{11} + \min\{R_{21}, H(X_{21} \mid Q)\} &\le H(Y_1 \mid Q). &&(\text{B.2d}) \end{aligned}$$
Likewise, the region $\mathcal{R}_2'(p)$ is the set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ that satisfy
$$\begin{aligned} \mathcal{R}_2'(p):\qquad R_{22} &\le H(X_{22} \mid X_{21}, Q), &&(\text{B.3a})\\ R_{21} + R_{22} &\le H(X_{22} \mid Q), &&(\text{B.3b})\\ R_{22} + \min\{R_{12}, H(X_{12} \mid Q)\} &\le H(Y_2 \mid X_{21}, Q), &&(\text{B.3c})\\ R_{21} + R_{22} + \min\{R_{12}, H(X_{12} \mid Q)\} &\le H(Y_2 \mid Q). &&(\text{B.3d}) \end{aligned}$$
Before proving the converse of Theorem B.1, it is instructive to consider a corollary to Theorem 1.1. To this end, define the modified set $\mathcal{R}_1(p)$ that contains all rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ satisfying
$$\begin{aligned} \mathcal{R}_1(p):\qquad R_{11} &\le H(X_{11} \mid X_{12}, Q), &&(\text{B.4a})\\ R_{12} + R_{11} &\le H(X_{11} \mid Q), &&(\text{B.4b})\\ R_{11} + R_{21} &\le H(Y_1 \mid X_{12}, Q), &&(\text{B.4c})\\ R_{12} + R_{11} + R_{21} &\le H(Y_1 \mid Q). &&(\text{B.4d}) \end{aligned}$$
Likewise, let $\mathcal{R}_2(p)$ be the set of all tuples that satisfy
$$\begin{aligned} \mathcal{R}_2(p):\qquad R_{22} &\le H(X_{22} \mid X_{21}, Q), &&(\text{B.5a})\\ R_{21} + R_{22} &\le H(X_{22} \mid Q), &&(\text{B.5b})\\ R_{22} + R_{12} &\le H(Y_2 \mid X_{21}, Q), &&(\text{B.5c})\\ R_{21} + R_{22} + R_{12} &\le H(Y_2 \mid Q). &&(\text{B.5d}) \end{aligned}$$
Corollary B.1 (Capacity region of 2-DIC, no saturation). The capacity region of the 2-DIC is equal to the set
$$\mathscr{R}_{\text{2-DIC}} = \bigcup_p \mathrm{FM}'\{\mathcal{R}_1(p) \cap \mathcal{R}_2(p)\}, \qquad (\text{B.6})$$
where $p = p(q)p(x_1|q)p(x_2|q)$.

Proof: Use Fourier–Motzkin elimination to evaluate the $\mathrm{FM}'$ operator in $\mathscr{R}_{\text{2-DIC}}$. $\square$
Remark B.3. We note the formal similarity between the sets in Theorem B.1 and Corol-
lary B.1. A seeming difference is that the set R2-DIC does not have either of the two convex
hull operators of R ′2-DIC. This difference is vacuous. In fact, the operators could be added to
the expression in (B.6) without changing it. The inner convex hull operator is superfluous
because R1(p) ∩R2(p) is convex by construction, while the outer convex hull operator is
subsumed by coded timesharing via Q.
Converse proof for Theorem B.1: It is clear from (B.2) and (B.4) that the set R1(p) is a
subset of R ′1(p) since the conditions of the former are more stringent than those of the latter
(every min expression is no greater than its arguments). Likewise, R2(p) is a subset of
R ′2(p). It follows that R2-DIC is a subset of R ′2-DIC because set intersection, the convex hull
operation, FM′, and set union are monotone with respect to set inclusion. We have thus
established that $\mathscr{R}_{\text{2-DIC}}'$ contains the capacity region. $\blacksquare$
B.1 2-DIC has no saturation gain
It follows that both R2-DIC and R ′2-DIC are equal to the capacity region C2-DIC. Although the
region R ′2-DIC formally exploits saturation effects through the min terms in (B.2) and (B.3),
in actuality, such saturation gains are not available in the 2-DIC. This is clear from the
modified region R2-DIC, which does not contain saturation terms in (B.4) and (B.5), but
nevertheless equals the capacity region. Although the constituent regions R ′1(p) ∩R ′2(p)
generally strictly include the saturation-unaware R1(p) ∩R2(p), the final regions R ′2-DIC
and R2-DIC are the same, and consequently, saturation need not be considered in the 2-DIC.
The strict inclusion is lost during the projection operation in FM′ and relates to the
underlying rate splitting. It can be understood directly by proving R ′2-DIC ⊆ R2-DIC as
follows. Consider a fixed distribution p = p(q)p(x1|q)p(x2|q), and an achievable rate pair
(R1, R2) with a particular rate split R1 = R′11 +R′12 and R2 = R′21 +R′22 in R ′1(p)∩R ′2(p).
The min terms may be equal to either of their arguments, i.e., achievability of this particular
rate split may or may not rely on saturation. In any case, we can construct a modified split
R1 = R11 +R12 and R2 = R21 +R22 that maintains the same total rates but does not rely
on saturation, i.e., is contained in R1(p) ∩R2(p).
Specifically, let
$$\Delta_1 = R_{12}' - \min\{R_{12}', H(X_{12} \mid Q)\}, \qquad \Delta_2 = R_{21}' - \min\{R_{21}', H(X_{21} \mid Q)\},$$
and define the modified rate split $(R_{11}, R_{12}, R_{21}, R_{22})$ as
$$\begin{aligned} R_{12} &= R_{12}' - \Delta_1 = \min\{R_{12}', H(X_{12} \mid Q)\}, &&(\text{B.7})\\ R_{21} &= R_{21}' - \Delta_2 = \min\{R_{21}', H(X_{21} \mid Q)\}, &&(\text{B.8})\\ R_{11} &= R_{11}' + \Delta_1, &&(\text{B.9})\\ R_{22} &= R_{22}' + \Delta_2. &&(\text{B.10}) \end{aligned}$$
This is a valid rate split since the total rates R1 and R2 are maintained and each component
rate is non-negative. We need to show that the modified rate split is in R1(p) ∩ R2(p),
i.e., it satisfies (B.4a) through (B.5d). By substituting (B.7) to (B.10), it is straightforward
(if tedious) to see that (B.4a) follows from (B.2a) and (B.2b), (B.4b) follows from (B.2b),
(B.4c) follows from (B.2c) and (B.2d), (B.4d) follows from (B.2d), (B.5a) follows from
(B.3a) and (B.3b), (B.5b) follows from (B.3b), (B.5c) follows from (B.3c) and (B.3d), and
finally, (B.5d) follows from (B.3d).
Appendix C
Mathematical notation
Sets.
$\emptyset$                          empty set
$\mathcal{X}, \mathcal{Y}, \ldots$   discrete sets
$|\mathcal{X}|$                      set cardinality
$\mathscr{C}, \mathscr{R}, \ldots$   continuous sets
$\mathbb{F}_2$                       Galois field of order 2, i.e., $\{0, 1\}$
$\mathbb{Z}$                         the set of integers
$\{1:n\}$                            the set $\{1, 2, \ldots, n\}$
$\mathbb{R}$                         the set of real numbers
$\mathbb{R}_+$                       the set of non-negative real numbers
$\mathcal{T}_\varepsilon^{(n)}(X)$   set of typical sequences $x^n$, as defined in [EK11]
Functions.
$\mathrm{Id}$                        identity mapping, $\mathrm{Id}(x) = x$
$\llbracket x \rrbracket$            $1 + ((x-1) \bmod n)$, modulo-$n$ operator with indexing starting at 1
$\log$                               logarithm to base 2
$H(X)$                               entropy, in bits (using log to base 2)
$h(X)$                               differential entropy, in bits (using log to base 2)
$I(X; Y)$                            mutual information, in bits (using log to base 2)
$\otimes$                            Kronecker product
$\overline{S}$                       the convex hull of the set $S$
$\mathrm{FM}, \mathrm{FM}', \mathrm{FM}''$    specialized Fourier–Motzkin projection operators
Probability and random variables.
$\mathcal{E}$                        event
$\mathrm{P}(\mathcal{E})$            probability of the event $\mathcal{E}$
$\mathrm{P}\{\text{condition}\}$     shorthand for $\mathrm{P}(\{\text{condition}\})$
$X, Y, \ldots$                       random variables
$\mathrm{E}(X)$                      expected value of the random variable $X$
$X \sim p$                           the random variable $X$ is distributed according to $p$
$p_X(x)$                             probability mass function (or probability density function) of the random variable $X$
$p(x)$                               shorthand for $p_X(x)$ (when the context is clear)
$\mathcal{N}(\mu, \Sigma)$           Gaussian probability density function of mean vector $\mu$ and covariance matrix $\Sigma$
$\mathrm{Unif}(\mathcal{X})$         uniform distribution over the set $\mathcal{X}$
Matrices, vectors, and sequences.
$x_m^n$                              the vector (sequence) $(x_m, x_{m+1}, \ldots, x_n)$
$x^n$                                shorthand for $x_1^n$
$X^T$                                transpose
$K_X, S, \ldots$                     matrices
$\mathrm{tr}(S)$                     matrix trace
$|S|$                                matrix determinant
$K_X \preceq S$                      partial order induced by the positive semidefinite matrix cone, i.e., $S - K_X$ is positive semidefinite
$I_L$                                identity matrix of size $L \times L$
$0_{m \times n}$                     zero matrix of size $m \times n$
$e_l$                                canonical unit vector, the $l$th column of $I_L$
$S_\uparrow$                         up-shift matrix
$S_\downarrow$                       down-shift matrix
$Z$                                  zero-padding matrix
Other symbols and abbreviations.
$X_{lk}$                             in 3-DIC context, the signal from transmitter $l$ arriving at receiver $k$
$f \preccurlyeq g$                   partial order of set partition refinement, $f$ is a refinement of $g$
$f \vee g$                           the finest set partition of which both $f$ and $g$ are refinements
$f \wedge g$                         intersection of two set partitions
pmf                                  probability mass function
$\square$                            end of proof sketch
$\blacksquare$                       end of proof, q.e.d.
Bibliography
[ADT07] A. S. Avestimehr, S. N. Diggavi, and D. N. C. Tse, “A deterministic approach
to wireless relay networks.” (Sep. 2007), presented at the 45th Annual Allerton
Conference on Communication, Control, and Computing (Monticello, IL),
arXiv:0710.3777.
[ADT11] A. S. Avestimehr, S. N. Diggavi, and D. N. C. Tse, “Wireless network
information flow: A deterministic approach.” IEEE Trans. Inf. Theory, vol. 57,
no. 4, pp. 1872–1905 (Apr. 2011).
[Ahl74] R. Ahlswede, “The capacity region of a channel with two senders and two
receivers.” Ann. Probab., vol. 2, no. 5, pp. 805–814 (1974).
[AV09] V. S. Annapureddy and V. V. Veeravalli, “Gaussian interference networks:
Sum capacity in the low-interference regime and new outer bounds on the
capacity region.” IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3032–3050 (Jul.
2009).
[BE10] B. Bandemer and A. El Gamal, “Interference decoding for deterministic chan-
nels.” In Proceedings of ISIT 2010, Austin, TX (Jun. 2010).
[BE11a] B. Bandemer and A. El Gamal, “An achievable rate region for the 3-user-pair
deterministic interference channel.” In Proceedings of the 49th Annual Allerton
Conference on Communication, Control, and Computing, Monticello, IL (Sep.
2011), invited paper.
[BE11b] B. Bandemer and A. El Gamal, “Communication with disturbance constraints.”
In Proceedings of ISIT 2011, St. Petersburg, Russia (Aug. 2011).
[BE11c] B. Bandemer and A. El Gamal, “Communication with disturbance con-
straints.” IEEE Trans. Inf. Theory (Nov. 2011), submitted for publication,
arXiv:1103.0996.
[BE11d] B. Bandemer and A. El Gamal, “Interference decoding for deterministic chan-
nels.” IEEE Trans. Inf. Theory, vol. 57, no. 5, pp. 2966–2975 (May 2011),
arXiv:1001.4588.
[BPT10] G. Bresler, A. Parekh, and D. Tse, “The approximate capacity of the many-
to-one and one-to-many Gaussian interference channels.” IEEE Trans. Inf.
Theory, vol. 56, no. 9, pp. 4566–4592 (Sep. 2010), arXiv:0804.4489.
[BS11] R. Bustin and S. Shamai (Shitz), “MMSE of ‘bad’ codes.” IEEE Trans. Inf.
Theory (Jun. 2011), submitted for publication, arXiv:1106.1017.
[BT08] G. Bresler and D. Tse, “The two-user Gaussian interference channel: A de-
terministic view.” Euro. Trans. Telecomm., vol. 19, no. 4, pp. 333–354 (Jun.
2008), arXiv:0807.3222.
[BVVE09] B. Bandemer, G. Vazquez-Vilar, and A. El Gamal, “On the sum capacity
of a class of cyclically symmetric deterministic interference channels.” In
Proceedings of ISIT 2009, Seoul, Korea (Jun. 2009).
[Car75] A. B. Carleial, “A case where interference does not reduce capacity.” IEEE
Trans. Inf. Theory, vol. 21, no. 5, pp. 569–570 (Sep. 1975).
[CG87] M. Costa and A. El Gamal, “The capacity region of the discrete memoryless
interference channel with strong interference (corresp.).” IEEE Trans. Inf.
Theory, vol. 33, no. 5, pp. 710–711 (Sep. 1987).
[CJ08] V. R. Cadambe and S. A. Jafar, “Interference alignment and degrees of
freedom of the K-user interference channel.” IEEE Trans. Inf. Theory, vol. 54,
no. 8, pp. 3425–3441 (Aug. 2008).
[CK78] I. Csiszár and J. Körner, “Broadcast channels with confidential messages.”
IEEE Trans. Inf. Theory, vol. 24, no. 3, pp. 339–348 (May 1978).
[CMGE08] H.-F. Chong, M. Motani, H. K. Garg, and H. El Gamal, “On the Han-
Kobayashi region for the interference channel.” IEEE Trans. Inf. Theory,
vol. 54, no. 7, pp. 3188–3195 (Jul. 2008).
[EC82] A. A. El Gamal and M. H. M. Costa, “The capacity region of a class of
deterministic interference channels.” IEEE Trans. Inf. Theory, vol. 28, no. 2,
pp. 343–346 (Mar. 1982).
[EK11] A. El Gamal and Y.-H. Kim, Network Information Theory. Cambridge Univer-
sity Press (2011).
[ELZ05] U. Erez, S. Litsyn, and R. Zamir, “Lattices which are good for (almost) ev-
erything.” IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3401–3416 (Oct.
2005).
[EM81] A. El Gamal and E. C. van der Meulen, “A proof of Marton’s coding theorem
for the discrete memoryless broadcast channel.” IEEE Trans. Inf. Theory,
vol. 27, no. 1, pp. 120–122 (Jan. 1981).
[EO09] R. H. Etkin and E. Ordentlich, “The degrees-of-freedom of the K-user
Gaussian interference channel is discontinuous at rational channel coefficients.”
IEEE Trans. Inf. Theory, vol. 55, no. 11, pp. 4932–4946 (Nov. 2009).
[ETW08] R. H. Etkin, D. N. C. Tse, and H. Wang, “Gaussian interference channel
capacity to within one bit.” IEEE Trans. Inf. Theory, vol. 54, no. 12, pp.
5534–5562 (Dec. 2008).
[EZ04] U. Erez and R. Zamir, “Achieving (1/2) log(1 + SNR) on the AWGN channel
with lattice encoding and decoding.” IEEE Trans. Inf. Theory, vol. 50, no. 10,
pp. 2293–2314 (Oct. 2004).
[GJ11] T. Gou and S. A. Jafar, “Sum capacity of a class of symmetric SIMO Gaussian
interference channels within O(1).” IEEE Trans. Inf. Theory, vol. 57, no. 4,
pp. 1932–1958 (Apr. 2011), arXiv:0905.1745.
[GK73] P. Gács and J. Körner, “Common information is far less than mutual informa-
tion.” Problems of Control and Information Theory, vol. 2, no. 2, pp. 149–162
(1973).
[GSV05] D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual information and minimum
mean-square error in Gaussian channels.” IEEE Trans. Inf. Theory, vol. 51,
no. 4, pp. 1261–1282 (Apr. 2005).
[HK81] T. S. Han and K. Kobayashi, “A new achievable rate region for the interference
channel.” IEEE Trans. Inf. Theory, vol. 27, no. 1, pp. 49–60 (Jan. 1981).
[JV08] S. A. Jafar and S. Vishwanath, “Generalized degrees of freedom of the sym-
metric K user Gaussian interference channel.” (Apr. 2008), arXiv:0804.4489.
[Kra04] G. Kramer, “Outer bounds on the capacity of Gaussian interference channels.”
IEEE Trans. Inf. Theory, vol. 50, no. 3, pp. 581–586 (Mar. 2004).
[LV07] T. Liu and P. Viswanath, “An extremal inequality motivated by multiterminal
information-theoretic problems.” IEEE Trans. Inf. Theory, vol. 53, no. 5, pp.
1839–1851 (May 2007).
[Mar79] K. Marton, “A coding theorem for the discrete memoryless broadcast channel.”
IEEE Trans. Inf. Theory, vol. 25, no. 3, pp. 306–311 (May 1979).
[MDFT11] S. Mohajer, S. N. Diggavi, C. Fragouli, and D. N. C. Tse, “Approximate
capacity of a class of Gaussian interference-relay networks.” IEEE Trans. Inf.
Theory, vol. 57, no. 5, pp. 2837–2864 (May 2011).
[Meu77] E. C. van der Meulen, “A survey of multi-way channels in information theory:
1961–1976.” IEEE Trans. Inf. Theory, vol. 23, no. 1, pp. 1–37 (Jan. 1977).
[Meu94] E. C. van der Meulen, “Some reflections on the interference channel.” In R. E.
Blahut, D. J. Costello, U. Maurer, and T. Mittelholzer (Editors), Communica-
tions and Cryptography: Two Sides of One Tapestry, pp. 409–421, Kluwer,
Boston (1994).
[Mis39] R. von Mises, “Über Aufteilungs- und Besetzungs-Wahrscheinlichkeiten.”
Revue de la Faculté des Sciences de l’Université d’Istanbul, vol. 4, pp. 145–
163 (1939), reprinted in “Selected Papers of Richard von Mises”, vol. 2 (Eds.
P. Frank, S. Goldstein, M. Kac, W. Prager, G. Szegő, and G. Birkhoff),
Providence, RI: American Mathematical Society, pp. 313–334, 1964.
[MK09] A. S. Motahari and A. K. Khandani, “Capacity bounds for the Gaussian
interference channel.” IEEE Trans. Inf. Theory, vol. 55, no. 2, pp. 620–643
(Feb. 2009).
[MMK08] M. A. Maddah-Ali, A. S. Motahari, and A. K. Khandani, “Communication
over MIMO X channels: Interference alignment, decomposition, and perfor-
mance analysis.” IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3457–3470 (Aug.
2008).
[MOMK09] A. S. Motahari, S. Oveis Gharan, M. A. Maddah-Ali, and A. K. Khandani,
“Real interference alignment: Exploiting the potential of single antenna sys-
tems.” IEEE Trans. Inf. Theory (Nov. 2009), submitted for publication,
arXiv:0908.2282.
[NG08] B. Nazer and M. Gastpar, “The case for structured random codes in network
capacity theorems.” Euro. Trans. Telecomm., Special Issue on New Directions
in Information Theory, vol. 19, no. 4, pp. 455–474 (Jun. 2008).
[NG09] B. Nazer and M. Gastpar, “Compute-and-forward: Harnessing interference
through structured codes.” IEEE Trans. Inf. Theory (Aug. 2009), submitted
for publication, arXiv:0908.2119v2.
[Sat81] H. Sato, “The capacity of the Gaussian interference channel under strong
interference.” IEEE Trans. Inf. Theory, vol. 27, no. 6, pp. 786–788 (Nov.
1981).
[SD11] Y. Song and N. Devroye, “Structured interference-mitigation in two-hop net-
works.” In Proceedings of the Information Theory and Applications Workshop
(ITA), La Jolla, CA (Feb. 2011).
[SJV+08] S. Sridharan, A. Jafarian, S. Vishwanath, S. A. Jafar, and S. Shamai (Shitz),
“A layered lattice coding scheme for a class of three user Gaussian interfer-
ence channels.” In Proceedings of the 46th Annual Allerton Conference on
Communication, Control, and Computing, pp. 531–538, Monticello, IL (Sep.
2008).
[SKC09] X. Shang, G. Kramer, and B. Chen, “A new outer bound and the noisy-
interference sum-rate capacity for Gaussian interference channels.” IEEE
Trans. Inf. Theory, vol. 55, no. 2, pp. 689–699 (Feb. 2009).
[Sta11] R. P. Stanley, Enumerative Combinatorics, vol. 1. Cambridge University Press,
2nd ed. (2011), URL http://www-math.mit.edu/~rstan/ec/.
[TY11] Y. Tian and A. Yener, “The Gaussian interference relay channel: Improved
achievable rates and sum rate upper bounds using a potent relay.” IEEE Trans.
Inf. Theory, vol. 57, no. 5, pp. 2865–2879 (May 2011).
[Wit75] H. S. Witsenhausen, “On sequences of pairs of dependent random variables.”
SIAM Journal on Applied Mathematics, vol. 28, no. 1, pp. 100–113 (Jan. 1975).
[Wyn75] A. D. Wyner, “The wire-tap channel.” Bell System Technical Journal, vol. 54,
no. 8, pp. 1355–1387 (Oct. 1975).