CODING SCHEMES FOR
DETERMINISTIC INTERFERENCE CHANNELS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF
ELECTRICAL ENGINEERING
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Bernd Bandemer
December 2011
http://creativecommons.org/licenses/by-nc-nd/3.0/us/
This dissertation is online at: http://purl.stanford.edu/sy560th1179
© 2011 by Bernd Frank Bandemer. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Abbas El-Gamal, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Arogyaswami Paulraj
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Itschak Weissman
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Abstract
One of the canonical unsolved problems in network information theory is to find the capacity region of the interference channel. The problem is motivated by today’s wireless
communication systems which have experienced a steep growth in participant density and
thus increasingly operate in the interference-limited regime. Data rates are no longer limited
by propagation path loss and thermal noise, but instead, by simultaneous transmissions in
the same frequency band.
The interference channel models such concurrent communication among several transmitter–receiver pairs using a shared medium. Its capacity region describes the optimal
trade-off between simultaneously achievable data rates. While considerable progress has
been made in characterizing the capacity region for the case with two sender–receiver pairs,
much less is known for interference channels with three or more user pairs.
This dissertation contributes to this area by investigating a class of deterministic (noise-free) interference channels with three user pairs. A series of three coding schemes and
corresponding achievable rate regions is developed, each of which subsumes and improves
upon its predecessor. As a baseline, a first transmission scheme is considered where point-to-point random codes are combined with receivers that disregard the special statistical
structure of the interfering signals and simply treat them as white noise. It is shown that
despite its simplicity, this scheme achieves the sum capacity for an important subclass of
channels.
The baseline scheme is not optimal in general. In order to overcome its shortfalls, two
different viewpoints on the interference channel are taken. The receiver-centric view states
that each received signal is composed of multiple independent structured transmissions,
among which only the desired one must be decoded correctly. An approach is developed that allows the receivers to exploit the structure of the combined interference signal without insisting on decoding any of the interfering messages partly or fully. This interference decoding
scheme results in a second achievable rate region that is shown to strictly dominate treating
interference as noise. It also contains as a special case the scheme that decodes
the undesired messages uniquely.
The complementary view of the interference channel is transmitter-centric. The observation that each sender affects all receivers, but needs to convey a message only to one of
them while minimizing the disturbance caused at the others leads to a new model of communication with disturbance constraints. Disturbance is measured by a mutual information
expression that represents the rate of unwanted information flow from the transmitter to
the side receivers. The disturbance-constrained communication problem is first studied in
isolation, and its optimal coding schemes are identified. The rate–disturbance trade-off is
established for the single constraint case, where the optimal encoding scheme turns out to be
the same as the Han–Kobayashi scheme for the two-user-pair interference channel. For the
case of communication with two disturbance constraints, the best known encoding scheme
involves rate splitting, Marton coding, and superposition coding, and is shown to be optimal
in several nontrivial cases.
Finally, the two viewpoints are consolidated by applying the codebook structure from
communication with two disturbance constraints in the three-user-pair interference channel
and combining it with interference-decoding receivers. This yields a third achievable rate
region that is the central result of this dissertation. It is strictly larger than the two previous
inner bounds to the capacity region. Furthermore, it is shown to achieve the capacity region
of each two-user-pair subchannel embedded within the three-pair interference channel, and
as such, the coding scheme generalizes the Han–Kobayashi scheme to more than two user
pairs.
While the results are presented in the framework of the deterministic interference channel
with three user pairs, the modular approach of separating the transmitter- and receiver-centric viewpoints, as well as the new coding schemes, applies in principle to general discrete
memoryless interference channels.
Acknowledgment
It is my pleasure to thank those whose support has made my doctoral work and this dissertation possible. Most importantly, I owe my gratitude to Professor Abbas El Gamal for
accepting me as his doctoral student and kindling my interest in information theory. Since I first met him, I have been, and continue to be, impressed by the depth of his knowledge. It was a
great joy to learn from him and contribute to the field with him.
I am indebted to Professor Arogyaswami Paulraj for his service as my associate doctoral
adviser. He accepted me into his research group early on and offered me a stimulating
academic home to grow in. I also thank him for keeping my eye on the practical consequences
of information theory in wireless communications.
I would like to thank Professor Tsachy Weissman for serving as dissertation reader and
providing constructive feedback and encouragement. I am grateful to Professor Andrea
Goldsmith and Professor Ramesh Johari for participating as examiners in my oral dissertation
defense and contributing much appreciated questions and insightful suggestions.
The experience during my doctoral studies has been greatly enhanced by my colleagues
and fellow researchers. I would like to thank Yeow-Khiang Chia, Han-I Su, Lei Zhao, Paul
Cuff, and Haim Permuter for many fruitful discussions and inspiring insights, information-
theoretic and otherwise. I have also thoroughly enjoyed and benefited from discussions with
visiting professors David Tse and Pramod Viswanath. From Professor Paulraj’s group, I am
grateful to my scientific collaborators Aydin Sezgin, Gonzalo Vazquez Vilar, Nicolai Czink,
and Taemin Kim, and to my officemates Alireza Ghaderipoor, Amin Mobasher, Gokmen
Altay, Heunchul Lee, Jan Haase, Martin Wrulich, Mohamad Charafeddine, Moon-Sik Lee,
Naoki Kita, Simon Umbricht, Stephanie Pereira, Takayuki Shimizu, A. J. Thiruvengadam,
and Hyunjong Yang, who made day-to-day life in the office fun and memorable. In addition,
I thank ISL assistants Kelly Yilmaz, Denise Murphy, and Rashmi Shah for administrative
help.
I would like to express my appreciation and gratitude to Eric and Illeana Benhamou,
who have generously supported my doctoral studies through a Stanford Graduate Fellowship.
This has given me the academic freedom to pursue what I was most interested in.
My deepest gratitude belongs to my family. I wholeheartedly thank my parents Heidi
and Werner Bandemer for always being there for me. It is only through their support that I
could get this far. I thank my sister and brother-in-law Sabine and Jan Bandemer for being
a source of inspiration and happiness. My heartfelt thankfulness goes to my love and best
friend Leila Zia, who brightens my life every day.
Contents
Abstract v
Acknowledgment vii
1 Introduction 1
1.1 Brief survey of known results . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.1 Interference channels with two user pairs . . . . . . . . . . . . . . 3
1.1.2 Interference channels with more than two user pairs . . . . . . . . . 5
1.2 Three-user-pair deterministic interference channel . . . . . . . . . . . . . . 6
1.3 Organization of this dissertation . . . . . . . . . . . . . . . . . . . . . . . 10
2 Treating interference as noise 13
2.1 Inner bound by treating interference as noise . . . . . . . . . . . . . . . . 13
2.2 Binary-field 3-DIC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.1 Converse proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2.2 Achievability proof . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.3 Optimal assignment matrices . . . . . . . . . . . . . . . . . . . . . 28
3 Interference Decoding 31
3.1 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.1.1 Interference-decoding inner bound . . . . . . . . . . . . . . . . . . 32
3.1.2 Capacity region under strong interference . . . . . . . . . . . . . . 34
3.1.3 Comparison to treating interference as noise . . . . . . . . . . . . . 35
3.1.4 Extension to 3-DIC with noisy observations . . . . . . . . . . . . . 38
3.1.5 Interference decoding is not optimal in general . . . . . . . . . . . 40
3.2 Proof of Theorem 3.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.3 Proof of Theorem 3.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4 Proof of Theorem 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4 Communication with disturbance constraints 53
4.1 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.1 Rate–disturbance region for a single disturbance constraint . . . . . 55
4.1.2 Inner and outer bounds for the deterministic channel with two disturbance constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Proofs for a single disturbance constraint . . . . . . . . . . . . . . . . . . . 68
4.2.1 Proof of achievability for Theorem 4.1 . . . . . . . . . . . . . . . . 68
4.2.2 Proof of converse for Theorem 4.1 . . . . . . . . . . . . . . . . . . 69
4.2.3 Proof of Corollary 4.1 . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.4 Proof of Corollary 4.2 . . . . . . . . . . . . . . . . . . . . . . . . 72
4.2.5 Proof of Theorem 4.2 . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.3 Proofs for two disturbance constraints . . . . . . . . . . . . . . . . . . . . 77
4.3.1 Proof of Theorem 4.3 . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.3.2 Proof of Theorem 4.4 . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.3.3 Proof of Theorem 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.3.4 Proof of Corollary 4.3 . . . . . . . . . . . . . . . . . . . . . . . . 88
5 General achievable rate region for 3-DIC 91
5.1 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.1.1 Achievable rate region for 3-DIC . . . . . . . . . . . . . . . . . . 92
5.1.2 Alternative characterization of the achievable rate region . . . . . . 96
5.1.3 Region without Ul . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.1.4 Special case: One-to-many 3-DIC . . . . . . . . . . . . . . . . . . 101
5.2 Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
5.2.1 Codebook generation for Theorem 5.1 and Corollary 5.1 . . . . . . 104
5.2.2 Error probability analysis for Corollary 5.1 . . . . . . . . . . . . . 106
5.2.3 Equivalence of Theorem 5.1 and Corollary 5.1 . . . . . . . . . . . 113
5.2.4 Proof of Corollary 5.3 . . . . . . . . . . . . . . . . . . . . . . . . 114
6 Conclusion 117
A Useful auxiliary results 119
A.1 Probability decomposition by index and by value . . . . . . . . . . . . . 119
A.2 Independence lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
B Application of new techniques to 2-DIC 123
B.1 2-DIC has no saturation gain . . . . . . . . . . . . . . . . . . . . . . . . 126
C Mathematical notation 129
Bibliography 133
List of Tables
4.1 Message subsets for decoding error events. . . . . . . . . . . . . . . . . . . 79
5.1 Shorthand notation for terms related to transmitter 1. . . . . . . . . . . . . 94
5.2 Shorthand notation for terms related to transmitter 2. . . . . . . . . . . . . 96
5.3 Shorthand notation for terms related to transmitter 3. . . . . . . . . . . . . 97
5.4 Message subsets M1i. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.5 Message subsets M2j. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.6 Message subsets M3j. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.7 Index subsets for union bound and corresponding sufficient conditions. . . . 112
B.1 2-DIC shorthand notation for terms related to transmitter 1. . . . . . . . . . 123
B.2 2-DIC shorthand notation for terms related to transmitter 2. . . . . . . . . . 124
List of Figures
1.1 Interference channel with K transmitter–receiver pairs. . . . . . . . . . . . 2
1.2 Deterministic interference channel with two user pairs (2-DIC). . . . . . . . 5
1.3 Deterministic interference channel with three user pairs (3-DIC). . . . . . . 7
1.4 3-DIC from the viewpoint of the first receiver. . . . . . . . . . . . . . . . . 7
1.5 Additive 3-DIC example (Example 1.1). . . . . . . . . . . . . . . . . . . . 9
1.6 Receiver and transmitter point of view of interference channels. . . . . . . 11
2.1 Region of Theorem 2.1 for the additive 3-DIC example. . . . . . . . . . . . 14
2.2 Cyclically symmetric binary-field 3-DIC. . . . . . . . . . . . . . . . . . . 16
2.3 Illustration of Theorem 2.2. . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Components of received signal Y1 for the converse proof of Theorem 2.2. . 19
2.5 Parameter regions for achievability proof of Theorem 2.2. . . . . . . . . . . 24
2.6 Transmit and received signal in region “Df”. . . . . . . . . . . . . . . . . . 25
2.7 Rules for verifying decodability. . . . . . . . . . . . . . . . . . . . . . . . 26
3.1 Region R1(p), which ensures decodability at the first receiver. . . . . . . . 34
3.2 Region of Theorem 3.1 for the additive 3-DIC example. . . . . . . . . . . . 37
3.3 Comparison of interference decoding and treating interference as noise. . . 38
3.4 Gaussian interference channel with BPSK. . . . . . . . . . . . . . . . . . . 40
3.5 2-DIC example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.6 Capacity region for a deterministic MAC. . . . . . . . . . . . . . . . . . . 47
4.1 Communication system with disturbance constraints. . . . . . . . . . . . . 54
4.2 Example of R(U,X), the constituent region of R. . . . . . . . . . . . . . . 57
4.3 The link between 2-DIC and communication with disturbance constraints. . 58
4.4 Deterministic example with one disturbance constraint. . . . . . . . . . . . 58
4.5 Constituent region for Theorem 4.3. . . . . . . . . . . . . . . . . . . . . . 62
4.6 Deterministic channel with two disturbance constraints (Example 4.2). . . 65
4.7 Two-dimensional projections of the rate–disturbance region for Example 4.2. 66
4.8 Constituent region for Theorem 4.2. . . . . . . . . . . . . . . . . . . . . . 74
4.9 Illustration of decoding error events, for m0 = 1. . . . . . . . . . . . . . . 80
4.10 Constituent region for Corollary 4.4. . . . . . . . . . . . . . . . . . . . . . 88
5.1 Region of Corollary 5.2 for the additive 3-DIC example. . . . . . . . . . . 101
5.2 Comparison of the regions in Theorems 2.1 and 3.1 and Corollary 5.2. . . . 102
5.3 One-to-many special case of 3-DIC. . . . . . . . . . . . . . . . . . . . . . 102
List of Theorems
Theorem 1.1 Capacity region of 2-DIC, El Gamal–Costa 1982 . . . . . . . . . . . 5
Theorem 2.1 Treating interference as noise . . . . . . . . . . . . . . . . . . . . . . 13
Theorem 2.2 Normalized symmetric capacity for binary-field 3-DIC . . . . . . . . 17
Theorem 3.1 Interference-decoding inner bound . . . . . . . . . . . . . . . . . . . 33
Theorem 3.2 3-DIC capacity region with strong interference and invertible hk . . . 35
Theorem 3.3 Interference decoding versus treating interference as noise . . . . . . 36
Theorem 3.4 Interference decoding for 3-DIC with noisy observations . . . . . . . 39
Theorem 4.1 Rate–disturbance region of DMC-1-DC . . . . . . . . . . . . . . . . 55
Theorem 4.2 Gaussian vector channel with one disturbance constraint . . . . . . . 60
Theorem 4.3 Inner bound for deterministic DMC-2-DC . . . . . . . . . . . . . . . 61
Theorem 4.4 Outer bound for deterministic DMC-2-DC . . . . . . . . . . . . . . . 63
Theorem 4.5 Rate–disturbance region of certain deterministic DMC-2-DC . . . . . 64
Theorem 5.1 Inner bound to the capacity region of 3-DIC . . . . . . . . . . . . . . 94
Theorem B.1 Capacity region of 2-DIC . . . . . . . . . . . . . . . . . . . . . . . . 124
Corollary 4.1 Rate–disturbance region of deterministic DMC-1-DC . . . . . . . . . 56
Corollary 4.2 Gaussian channel with one disturbance constraint . . . . . . . . . . . 59
Corollary 4.3 Rate–disturbance region with degraded side receivers . . . . . . . . . 67
Corollary 4.4 Simpler inner bound for deterministic DMC-2-DC . . . . . . . . . . 87
Corollary 5.1 Alternative inner bound to the capacity region of 3-DIC . . . . . . . 97
Corollary 5.2 Inner bound to the capacity region of 3-DIC, no Ul . . . . . . . . . . 100
Corollary 5.3 Inner bound to the capacity region of one-to-many 3-DIC . . . . . . 103
Corollary A.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Corollary A.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Corollary B.1 Capacity region of 2-DIC, no saturation . . . . . . . . . . . . . . . . 126
Lemma 3.1 Packing lemma for pairs . . . . . . . . . . . . . . . . . . . . . . . . . 43
Lemma A.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Lemma A.2 Independence lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Chapter 1
Introduction
The information-theoretic interference channel is a model for concurrent data transmission
using a coupled medium. Consider, for example, a wireless communication system in which
many participating devices have to use the same spectrum for regulation or implementation
reasons, and spatial proximity leads to interference between them. Similarly, nearby copper
wires in communication systems such as digital subscriber line (DSL) may be insufficiently
isolated from each other, thus introducing cross-talk between adjacent lines. From the
system designer’s point of view, it is important to understand the impact of interference on
system performance. How much does the presence of interference impair the achievable
transmission rates? Furthermore, algorithms and transmission schemes need to be devised
that are able to handle such interference. How should a communication system be designed
to be robust to interference?
To put these questions in a mathematical framework, consider the memoryless inter-
ference channel with K sender–receiver pairs depicted in Figure 1.1. In this channel, K
non-cooperating transmitters would each like to send a message to a corresponding receiver.
The concurrent transmissions are coupled to each other by way of the shared channel through
which they occur, which creates a trade-off between the reliably achievable data rates of
each communication. The goal of studying the interference channel is to characterize this
data rate trade-off.
Formally, the problem of finding this trade-off can be expressed as follows. The channel
consists of input and output alphabets X1, . . . ,XK and Y1, . . . ,YK and a collection of
Figure 1.1. Interference channel with K transmitter–receiver pairs.
conditional probability distributions p(y1, . . . , yK |x1, . . . , xK). In each channel use, the
channel outputs are drawn randomly from a probability distribution that depends on the
channel inputs.
A (2^{nR_1}, . . . , 2^{nR_K}, n) block code of data rates (R_1, . . . , R_K) ∈ ℝ_+^K and block length n consists of K encoding functions

x_k^n : {1:2^{nR_k}} → X_k^n, for k ∈ {1:K},

that map a message m_k into a transmitted codeword x_k^n(m_k), and K decoding functions

m̂_k : Y_k^n → {1:2^{nR_k}}, for k ∈ {1:K},

that map a received sequence y_k^n into an estimate m̂_k(y_k^n) of the message. The probability of error of a code is defined as

P_e^{(n)} = P{(M̂_1, M̂_2, . . . , M̂_K) ≠ (M_1, M_2, . . . , M_K)},

where the messages M_k are now random variables, independent of each other and uniformly distributed over the message sets {1:2^{nR_k}}, for k ∈ {1:K}.
A rate tuple (R_1, . . . , R_K) is achievable in the K-pair interference channel if there exists a sequence of (2^{nR_1}, . . . , 2^{nR_K}, n) block codes such that the probability of error tends to zero as the block length grows to infinity,

lim_{n→∞} P_e^{(n)} = 0.
The capacity region C of the K-pair interference channel is the closure of the set of all
achievable rate tuples. Our goal is to identify C for a given interference channel, since it
exactly describes the trade-off between data rates of the participating user pairs. We are
also interested in the nature of the transmission scheme that achieves capacity, as it offers
guidance and insight for practical system design.
1.1 Brief survey of known results
The interference channel was first introduced in [Ahl74]. A survey of early work is given
in [Meu94], and an up-to-date account of known results is contained in [EK11]. Despite
years of investigation, the capacity region of the interference channel remains unknown in
general. This is the case even for the two sender–receiver pair interference channel (K = 2).
1.1.1 Interference channels with two user pairs
The capacity region is known for certain classes of interference channels. When the interference is very strong [Car75], it is optimal for both receivers to decode both messages
completely. In fact, the presence of interference does not impair the per-user capacity in this
case. When the interference is strong [Sat81, CG87], it is still optimal for both receivers to
decode both messages, but the achievable rates are less than in the interference-free case.
When interference is weak enough, a natural scheme is to treat it as noise. For two-user
Gaussian interference channels, which are of particular practical relevance, this scheme
achieves the sum capacity provided the gains on the cross links are sufficiently small [AV09,
SKC09, MK09].
The best known general inner bound for the two-user-pair interference channel is
achieved by the Han–Kobayashi scheme [HK81], for which a simplified description has been
recently developed in [CMGE08]. The scheme combines the ingredients of the schemes that
work well for strong and weak interference. Each message is divided into a common part
which is decoded at both receivers, and a private part which is decoded only at the desired
receiver and treated as noise at the undesired receiver. Codebooks are then constructed using
superposition coding. The resulting achievable rate region is optimal for all interference
channels for which the capacity region is known. For the Gaussian interference channel,
a simplified Han–Kobayashi scheme with Gaussian input distributions has been shown
in [ETW08] to be at most half a bit per user away from the capacity region. This result is
shown using outer-bounding techniques first developed in [Kra04].
Injective deterministic interference channel with two user pairs
An interference channel that is of particular interest to us is the two-user-pair deterministic
interference channel (2-DIC) depicted in Figure 1.2. The channel consists of two sender alphabets Xl, for l ∈ {1:2}, and two receiver alphabets Yk, for k ∈ {1:2}, loss functions glk that model the links between each sender and receiver, and a function fk at each receiver
that maps the two impinging signals into the receiver observations Y1 and Y2. The channel
is memoryless and the outputs are deterministic functions of the inputs,
Y1 = f1(X11, X21),
Y2 = f2(X22, X12), where
Xlk = glk(Xl).
We assume that the functions fk are injective in each argument, that is, they become one-to-one when either one of their arguments is fixed. For example, for Y1 = f1(X11, X21), this
assumption is equivalent to H(X11 |X21) = H(Y1 |X21) and H(X21 |X11) = H(Y1 |X11) for every probability mass function (pmf) p(x11, x21). An example of a function that is injective in
each argument (but not injective) is regular addition.
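On finite alphabets, injectivity in each argument can be checked mechanically by brute force. A small illustrative sketch (not from the dissertation) confirming that ordinary addition on {0, 1, 2} is injective in each argument but not injective overall:

```python
from itertools import product

def injective_in_each_argument(f, X, Y):
    """Check that f(., y) is one-to-one for every fixed y, and that
    f(x, .) is one-to-one for every fixed x."""
    for y in Y:
        if len({f(x, y) for x in X}) != len(X):
            return False
    for x in X:
        if len({f(x, y) for y in Y}) != len(Y):
            return False
    return True

def injective(f, X, Y):
    """Check that f is one-to-one on the full product alphabet X x Y."""
    images = [f(x, y) for x, y in product(X, Y)]
    return len(set(images)) == len(images)

add = lambda x, y: x + y
A = [0, 1, 2]
print(injective_in_each_argument(add, A, A))  # True
print(injective(add, A, A))                   # False: 0+2 == 1+1 == 2+0
```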
As shown in [EC82], the capacity region of this channel is known and achieved by the
Han–Kobayashi scheme.
Figure 1.2. Deterministic interference channel with two user pairs (2-DIC).
Theorem 1.1 (Capacity region of 2-DIC, El Gamal–Costa 1982).
The capacity region C2-DIC of the two-pair deterministic interference channel is the set of rate pairs (R1, R2) that satisfy

R1 ≤ H(X11 |Q),
R2 ≤ H(X22 |Q),
R1 + R2 ≤ H(Y1 |X12, Q) + H(Y2 |X21, Q),
R1 + R2 ≤ H(X11 |X12, Q) + H(Y2 |Q),
R1 + R2 ≤ H(X22 |X21, Q) + H(Y1 |Q),
2R1 + R2 ≤ H(Y1 |Q) + H(X11 |X12, Q) + H(Y2 |X21, Q),
R1 + 2R2 ≤ H(Y2 |Q) + H(X22 |X21, Q) + H(Y1 |X12, Q),

for some distribution p = p(q)p(x1|q)p(x2|q).
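The entropy bounds of Theorem 1.1 can be evaluated numerically for any fixed input pmf. The sketch below is an illustration under assumed parameters that are not from the dissertation: a hypothetical 2-DIC with identity loss functions, f1 = f2 = XOR on binary alphabets, a degenerate time-sharing variable Q, and uniform independent inputs.

```python
from itertools import product
from math import log2

# Evaluate the seven bounds of Theorem 1.1 for one fixed input pmf
# (degenerate Q). Assumed toy channel: all g_lk = identity and
# f1 = f2 = XOR, which is injective in each argument.

X1 = X2 = (0, 1)
p1 = {0: 0.5, 1: 0.5}      # p(x1)
p2 = {0: 0.5, 1: 0.5}      # p(x2)

g = lambda x: x            # loss functions (identity)
f = lambda a, b: a ^ b     # receiver functions (XOR)

def H(*coords):
    """Entropy (in bits) of selected coordinates of the joint
    distribution of (x11, x12, x21, x22, y1, y2)."""
    marg = {}
    for x1, x2 in product(X1, X2):
        x11, x12, x21, x22 = g(x1), g(x1), g(x2), g(x2)
        v = dict(x11=x11, x12=x12, x21=x21, x22=x22,
                 y1=f(x11, x21), y2=f(x22, x12))
        key = tuple(v[c] for c in coords)
        marg[key] = marg.get(key, 0.0) + p1[x1] * p2[x2]
    return -sum(p * log2(p) for p in marg.values() if p > 0)

Hc = lambda a, b: H(a, b) - H(b)   # conditional entropy H(a | b)

bounds = {
    "R1":      H("x11"),
    "R2":      H("x22"),
    "R1+R2a":  Hc("y1", "x12") + Hc("y2", "x21"),
    "R1+R2b":  Hc("x11", "x12") + H("y2"),
    "R1+R2c":  Hc("x22", "x21") + H("y1"),
    "2R1+R2":  H("y1") + Hc("x11", "x12") + Hc("y2", "x21"),
    "R1+2R2":  H("y2") + Hc("x22", "x21") + Hc("y1", "x12"),
}
print(bounds)
```

For this strong-interference toy channel, the bounds H(X11 |X12) + H(Y2) and H(X22 |X21) + H(Y1) both evaluate to 1 bit, so for this pmf the region collapses to R1 + R2 ≤ 1.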
1.1.2 Interference channels with more than two user pairs
Much less is known about interference channels with more than two user pairs. In addition
to containing all complexities of the two-pair case, these channels exhibit the interesting
property that decoding at each receiver is impaired by the joint effect of interference from all
other senders rather than by each sender’s signal separately. Consequently, dealing directly
with the effect of the combined interference signal is expected to achieve higher rates.
One such coding scheme is interference alignment [MMK08, CJ08], in which the code
is designed so that the combined interference signal at each receiver is confined (aligned) to
a subset of the receiver signal space. Depending on the specific channel, this alignment may
be achieved via linear subspaces, signal scale levels, time delay slots, or number-theoretic
bases of rationally independent real numbers [EO09, MOMK09]. In some cases, e.g., the
multiple-input multiple-output (MIMO) Gaussian interference channel [CJ08], the decoder
simply treats interference as noise. In general, however, decoding can be thought of as a
two-step procedure. In the first step, the received signal is projected onto the desired signal
subspace, e.g., by multiplying it by a matrix as for the MIMO case [MMK08, CJ08] or by
separating each received symbol into its constituent lattice points as for the scalar Gaussian
case [MOMK09]. In the second step, interference-unaware decoding is performed on the
projection of the received signal.
A natural question that has not been answered in the literature is how to generalize the
Han–Kobayashi scheme to interference channels with more than two user pairs. For the
Gaussian case, it was shown in [BPT10] that a straightforward extension using a partial
message for each subset of receivers and superposition coding does not work well in general.
1.2 Three-user-pair deterministic interference channel
In this dissertation, we investigate the three-user-pair deterministic interference channel
(3-DIC) depicted in Figure 1.3, which we first introduced in [BE10, BE11d].
The channel consists of three sender–receiver alphabet pairs (Xl,Yl), loss functions glk that model the links between each sender and receiver, and a function at each receiver that maps the three impinging signals into the receiver observation Yl, for k, l ∈ {1:3}. Each of
these functions is composed of two stages, as depicted in Figure 1.4 for the first receiver,
namely an interference combining function hl and a receiver function fl. In the spirit of
2-DIC, we assume that the functions hl and fl are injective in each argument. For Y1 =
f1(X11, S1), this condition can be expressed in terms of entropies as H(X11 |S1) = H(Y1 |S1) and H(S1 |X11) = H(Y1 |X11) for every joint distribution p(x11, s1). The channel is assumed to
Figure 1.3. Deterministic interference channel with three user pairs (3-DIC).
Figure 1.4. 3-DIC from the viewpoint of the first receiver.
be memoryless. Its outputs are then given as
Yl = fl(Xll, Sl), where (1.1)
Xlk = glk(Xl),
S1 = h1(X21, X31),
S2 = h2(X32, X12),
S3 = h3(X13, X23).
This interference channel model is a natural choice to consider for several reasons. First,
it allows us to explore the effect of interference without noise. Although we are eventually
interested in understanding noisy interference channels, it is generally a good idea to study
simplified models first before we can hope to analyze the general case.
Second, the model allows us to argue explicitly about the combined effect of interference
on the receivers by giving direct access to the combined interference signals S1, S2, and S3.
Although we focus on the case with three user pairs, the insights generalize beyond it, since the step from two to three pairs is more difficult than the step from three to an
arbitrary number of pairs.
Third, the 3-DIC generalizes the 2-DIC discussed above, for which the capacity region
is known and achieved by the Han–Kobayashi scheme. This gives some hope that an
appropriate extension of Han–Kobayashi may be optimal for more than two user pairs.
Fourth, this class of interference channels includes the binary-field deterministic model
proposed in [ADT07, ADT11], which has been shown to approximate the two-pair Gaussian
interference channel well at high signal-to-noise ratios [BT08]. This is of great interest in
wireless communication.
Finally, although many of our results apply in principle to general interference channels,
their presentation and analysis is much simpler for the deterministic case. Focusing on the
3-DIC allows us to concentrate on the essence of the new ideas.
Example 1.1 (Additive 3-DIC). Consider a cyclically symmetric 3-DIC with
X1 = X2 = X3 = {0, 1, 2},
Y1 = Y2 = Y3 = {0, 1, 2, 3, 4},
g11 = g22 = g33 = Id,
g12 = g23 = g31 = {0 ↦ 0, 1 ↦ 1, 2 ↦ 1},
g13 = g21 = g32 = {0 ↦ 0, 1 ↦ 1, 2 ↦ 0},
h1 = h2 = h3 = +,
f1 = f2 = f3 = +.
The loss functions are inspired by the Blackwell broadcast channel [Meu77], and the inter-
ference combining functions and receiver functions are taken to be addition. The resulting
input-to-output mapping is shown in Figure 1.5. This 3-DIC is cyclically symmetric, i.e.,
the channel is invariant to cyclic relabeling of the pairs (performing subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 or 1 ↦ 3 ↦ 2 ↦ 1).
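Since the alphabets in Example 1.1 are tiny, the channel mapping can be enumerated exhaustively. The following illustrative sketch implements the mappings above and checks both the output alphabet and the cyclic symmetry:

```python
from itertools import product

# Example 1.1: cyclically symmetric additive 3-DIC.
ga = {0: 0, 1: 1, 2: 1}   # g12 = g23 = g31
gb = {0: 0, 1: 1, 2: 0}   # g13 = g21 = g32
gid = {0: 0, 1: 1, 2: 2}  # g11 = g22 = g33 = Id

def channel(x1, x2, x3):
    """Y_l = f_l(X_ll, S_l) with h_l = f_l = ordinary addition."""
    y1 = gid[x1] + gb[x2] + ga[x3]   # X11 + h1(X21, X31)
    y2 = ga[x1] + gid[x2] + gb[x3]   # X22 + h2(X32, X12)
    y3 = gb[x1] + ga[x2] + gid[x3]   # X33 + h3(X13, X23)
    return y1, y2, y3

inputs = list(product(range(3), repeat=3))

# Outputs stay within the stated alphabet {0, ..., 4}
assert all(set(channel(*x)) <= set(range(5)) for x in inputs)

# Cyclic symmetry: relabeling 1 -> 2 -> 3 -> 1 leaves the channel invariant
for x1, x2, x3 in inputs:
    y1, y2, y3 = channel(x1, x2, x3)
    assert channel(x3, x1, x2) == (y3, y1, y2)
print("cyclic symmetry verified")
```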
The 3-DIC capacity region is not known in general. In this dissertation, we make
progress in characterizing it by means of inner bounds and corresponding encoding schemes.
Figure 1.5. Additive 3-DIC example (Example 1.1): the input-to-output mapping of the channel.
1.3 Organization of this dissertation
The main body of the dissertation consists of four chapters. In Chapter 2, we establish
a baseline inner bound to the capacity region of the 3-DIC. This bound is achieved by
using point-to-point (non-layered) random codebooks at the transmitters, and treating all
interference as noise at the receivers. We show that this simple scheme can achieve sum
capacity for an important subclass of 3-DIC, in which the inputs and outputs are vectors of
bits and the channel loss functions are vector shift operations. This binary field model has
been shown to approximate Gaussian interference channels in the interference-limited (low
noise) regime.
In the subsequent chapters, we study the interference channel from two different view-
points as shown in Figure 1.6. From the point of view of each receiver, the channel resembles
a multiple-access channel [EK11]; see Figure 1.6(a). However, the receiver is interested in
decoding only one of the transmitted messages. In particular, the receiver is not required
to decode the undesired messages partly or fully. In Chapter 3, we focus on this receiver-
centric view. Assuming simple point-to-point random codes at the transmitters, we devise an
interference decoding receiver that does not uniquely decode any of the interfering messages,
but exploits the structure in the combined interfering signal to increase the achievable data
rate. This idea leads to an inner bound to the 3-DIC capacity region which is strictly larger
than the region achieved by treating interference as noise.
In Chapter 4, we take the opposite viewpoint as depicted in Figure 1.6(b). From the point
of view of each transmitter, the channel resembles a broadcast channel [EK11]. However,
the sender wishes to send a message only to one of the receivers while causing the least
disturbance to the other receivers. Abstracting this line of thought, we define the setting of
communication with disturbance constraints. We measure disturbance in terms of the rate
of undesired information flow from the sender. In the case of a single disturbance constraint,
the optimal encoding scheme turns out to be rate splitting and superposition coding, which
coincides with the Han–Kobayashi scheme for two-pair interference channels. This gives us
hope that a coding scheme for communication with two disturbance constraints would work
well in three-pair interference channels. Consequently, we develop inner and outer bounds
on the rate–disturbance region with two disturbance constraints.
Figure 1.6. Receiver and transmitter points of view of interference channels: (a) receiver point of view; (b) transmitter point of view.
Finally, in Chapter 5, we combine the insight from the preceding chapters to develop
a new inner bound to the 3-DIC capacity region which constitutes the main result of this
dissertation. We borrow the codebook structure from communication with disturbance
constraints and the receiver architecture from interference decoding and combine the two
pieces in a modular fashion. The resulting coding scheme achieves a larger rate region
than any previously known scheme. We argue that this is the natural way to generalize the
Han–Kobayashi scheme to interference channels with more than two user pairs.
Chapter 6 concludes the dissertation. There are three appendices: Appendix A contains
supporting mathematical results that may be useful in other applications. Appendix B applies
the 3-DIC techniques developed in this dissertation to the 2-DIC and shows how the result
collapses to Theorem 1.1. Appendix C summarizes our mathematical notation.
How to read this dissertation. Each of the following chapters consists of two parts. The
first part contains the theorems and corollaries that constitute the results in that chapter, a
discussion of their important properties, some concrete examples, and a high-level sketch
of the main proof ideas. Subsequent sections contain the proofs in detail. Some rather
technical parts are labeled as propositions and are proved outside the main flow of the text.
It is recommended to approach the material in the fashion of successive refinement: A first
pass over the material would include only the first section of each chapter, saving a deeper
descent into the mathematical underpinnings for later passes.
Chapter 2
Treating interference as noise
In this chapter,¹ we review a first inner bound on the capacity region of the 3-DIC, which
serves as a benchmark for subsequent results. We also show that the inner bound achieves
the sum capacity for a special case of 3-DIC.
2.1 Inner bound by treating interference as noise
The following achievable rate region is well known in the literature [EK11] and applies to
general interference channels.
Theorem 2.1 (Treating interference as noise). The set RTIN of rate triples (R1, R2, R3) such that

Rk ≤ I(Xk; Yk | Q),   k ∈ {1:3},   (2.1)
for some pmf p(q)p(x1|q)p(x2|q)p(x3|q) constitutes an inner bound to the capacity
region of the memoryless interference channel with three user pairs.
To achieve this bound, a user pair does not need to know the codebooks of other user pairs.
Each receiver decodes only its own message. Although the interfering signal exhibits temporal
structure originating from the codebooks of the undesired transmitters, this structure is
disregarded by the receiver. Instead, the interference is regarded as independent samples
from a fixed distribution, i.e., it is treated as white noise.

¹The results in this chapter were first published in [BVVE09, BE10, BE11d].
Note that this inner bound, with appropriate selections of the input pmfs, includes the
interference alignment inner bounds in [CJ08, JV08]. In the case of 3-DIC as defined
in Section 1.2, we can identify the alignment effect in the rate conditions. Consider the
condition for the first message rate R1,
R1 ≤ I(X1;Y1 |Q) = H(Y1 |Q)−H(S1 |Q).
Recall that the combined interference S1 is a function of the individual interference signals
X21 and X31. Thus,
H(S1 |Q) ≤ H(X21 |Q) +H(X31 |Q).
Alignment occurs when the inequality is strict, i.e., the effect of the combined interference
signal is less severe than the sum of the effects of the interference signals individually.
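For the additive example above, this strictness can be checked numerically. The following sketch (my own, with i.i.d. uniform inputs assumed purely for illustration) computes the relevant entropies:

```python
# Alignment in Example 1.1: with X2, X3 i.i.d. uniform on {0,1,2},
# H(S1) is strictly smaller than H(X21) + H(X31).
from fractions import Fraction
from math import log2

def H(pmf):
    """Entropy in bits of a pmf given as {value: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

g21 = {0: 0, 1: 1, 2: 0}
g31 = {0: 0, 1: 1, 2: 1}
u = Fraction(1, 3)

p21, p31, pS1 = {}, {}, {}
for x in range(3):
    p21[g21[x]] = p21.get(g21[x], 0) + u   # pmf of X21 = g21(X2)
    p31[g31[x]] = p31.get(g31[x], 0) + u   # pmf of X31 = g31(X3)
for a, pa in p21.items():
    for b, pb in p31.items():
        pS1[a + b] = pS1.get(a + b, 0) + pa * pb   # pmf of S1 = X21 + X31

assert H(pS1) < H(p21) + H(p31)   # strict inequality: alignment occurs
```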
Continuation of Example 1.1. Recall the channel in Example 1.1 on page 8. The inner
bound to the capacity of this channel given by Theorem 2.1 is depicted in Figure 2.1.
Figure 2.1. Region of Theorem 2.1 for the additive 3-DIC example.
2.2 Binary-field 3-DIC
In this section, we discuss an important subclass of 3-DIC for which Theorem 2.1 achieves
the sum capacity. We specialize the 3-DIC model as follows. Let the input and output
alphabets be

Xk = F_2^N,
Yk = F_2^{2N},

for k ∈ {1:3}, where F_2 denotes the binary finite field. The inputs and outputs of the
channel are thus column vectors of bits. We choose the combining functions hk and fk as
componentwise finite-field addition in F_2^{2N}. Finally, the loss functions g_lk map from
F_2^N to F_2^{2N} and are given as

g11 = g22 = g33 : x ↦ Z x,
g12 = g23 = g31 : x ↦ S↓^{(1−β)N} Z x,
g13 = g21 = g32 : x ↦ S↑^{(α−1)N} Z x.

Here, Z ∈ F_2^{2N×N} is a zero-padding matrix, defined as

Z = [ 0_{N×N}
      I_N     ],

and S↑, S↓ ∈ F_2^{2N×2N} are up-shift and down-shift matrices, respectively, such that

S↑ [x1, x2, . . . , x_{2N−1}, x_{2N}]^T = [x2, x3, . . . , x_{2N}, 0]^T,
S↓ [x1, x2, . . . , x_{2N−1}, x_{2N}]^T = [0, x1, . . . , x_{2N−2}, x_{2N−1}]^T.
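The following sketch (my own numeric illustration, using integer matrices reduced mod 2) mirrors these definitions and verifies that the zero padding protects the data under up-shifts while a down-shift clips the lowest component:

```python
# Zero-padding and shift operators of the binary-field 3-DIC, as numpy
# matrices over F2 (entries in {0,1}, arithmetic taken mod 2).
import numpy as np

def Z(N):
    """Zero-padding matrix: an N x N zero block stacked on top of I_N."""
    return np.vstack([np.zeros((N, N), dtype=int), np.eye(N, dtype=int)])

def S_up(M):
    """Up-shift on length-M vectors: (x1,...,xM) -> (x2,...,xM,0)."""
    return np.eye(M, k=1, dtype=int)

def S_down(M):
    """Down-shift on length-M vectors: (x1,...,xM) -> (0,x1,...,x_{M-1})."""
    return np.eye(M, k=-1, dtype=int)

N = 3
x = np.array([1, 0, 1])                    # an input in F2^N
zx = Z(N) @ x % 2                          # padded to length 2N
assert list(zx) == [0, 0, 0, 1, 0, 1]
# Up-shift only discards a padding zero, so the data survives:
assert list(S_up(2 * N) @ zx % 2) == [0, 0, 1, 0, 1, 0]
# Down-shift clips the lowest component (here a data bit is lost):
assert list(S_down(2 * N) @ zx % 2) == [0, 0, 0, 0, 1, 0]
```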
The channel is parameterized by the triple (N,α, β), which we constrain to α ∈ [1, 2],
β ∈ [0, 1], and αN, βN ∈ Z. The parameters α and β characterize the amount of up/down-
shift on the cross links and thus loosely correspond to channel gains. Note that due to the
zero-padding matrix Z, the up-shift operation retains the complete information of its input,
while the down-shift operation incurs clipping at the low end of the vector.
This specialized 3-DIC is cyclically symmetric. Each transmitter causes interference
to one receiver through the up-shift function, and to one through the down-shift function.
Likewise, each receiver experiences one up-shifted and one down-shifted interfering signal.
The channel is depicted in Figure 2.2.
The binary-field 3-DIC is of interest because of its connection to Gaussian interference
channels. Deterministic channels of this type, as models for noisy networks, were first
proposed in [ADT07, ADT11]. In [BT08], the two-user Gaussian interference channel was
studied and it was shown that there is a correspondence between the generalized degrees of
freedom of the Gaussian case and the capacity of the deterministic binary-field case. The
deterministic model thus captures the asymptotic behavior of the Gaussian channel in the
interference-limited regime. Some progress toward generalizing this result to more than two
user-pairs has been made in [JV08], where the solution is found for the fully symmetric case
where α = β.
In addition to the capacity region C, define the sum capacity as RΣ = sup{R1 + R2 + R3 | (R1, R2, R3) ∈ C} and the symmetric capacity as Rsym = sup{R | (R, R, R) ∈ C}. By symmetry of the channel and convexity of the capacity region, RΣ = 3Rsym.

Figure 2.2. Cyclically symmetric binary-field 3-DIC.

Furthermore,
define the normalized symmetric capacity dsym = Rsym/N , where the normalization is with
respect to the interference-free symmetric capacity N .
Before we state the sum capacity result for the binary-field 3-DIC, define the function

V(x) = (1 + |x − 1|) / 2 = { x/2       if x ≥ 1,
                             1 − x/2   if x < 1.
Remark 2.1. The normalized symmetric capacity of the binary-field 2-DIC with parameters
(N,α), where α ∈ [0,∞), can be expressed with the function V as
dsym = min{1, V(α), V(2α)}.
This was shown in [ETW08, BT08] and is essentially a consequence of Theorem 1.1,
specialized to the binary-field case.
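Remark 2.1 can be evaluated directly; the following sketch (my own) reproduces a few points of the resulting "W curve":

```python
# The function V and the 2-DIC normalized symmetric capacity of Remark 2.1.
def V(x):
    return (1 + abs(x - 1)) / 2        # = x/2 for x >= 1, 1 - x/2 for x < 1

def dsym_2dic(alpha):
    return min(1, V(alpha), V(2 * alpha))

assert dsym_2dic(0.0) == 1.0               # no interference
assert dsym_2dic(0.5) == 0.5               # local minimum of the "W"
assert abs(dsym_2dic(2/3) - 2/3) < 1e-12   # local maximum of the "W"
assert dsym_2dic(1.0) == 0.5               # local minimum of the "W"
assert dsym_2dic(2.0) == 1.0               # very strong interference
```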
We are now ready to state the normalized symmetric capacity result for the binary-field
3-DIC defined above for a large set of (α, β) parameters.
Theorem 2.2 (Normalized symmetric capacity for binary-field 3-DIC). The normalized symmetric capacity of the cyclically symmetric binary-field 3-DIC with parameters (α, β) ∈ [1, 2] × [0, 1], where α ≥ 2β or α ≥ β/2 + 1, is

dsym = min{1, V(α), V(β), V(2β), V(α − β)}.
Figure 2.3 illustrates the theorem. The claimed dsym is piecewise linear in (α, β), and
the figure shows the linear regions in the parameter plane. The value of dsym is indicated by
shading.
Remark 2.2. The theorem implies that dsym is independent of N . For fixed α and β, all
valid values of N (satisfying αN, βN ∈ Z) yield the same dsym.
Remark 2.3. The result of the theorem continues to hold for the cyclically symmetric
binary-field K-DIC.
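The claimed dsym can likewise be evaluated numerically; the sketch below (my own, with the theorem's region-of-validity condition included as an assertion) checks a few parameter points:

```python
# Numeric evaluation of the Theorem 2.2 formula.
def V(x):
    return (1 + abs(x - 1)) / 2

def dsym_3dic(alpha, beta):
    # region of validity claimed by the theorem: alpha >= 2*beta or
    # alpha >= beta/2 + 1 (i.e., everywhere except region "X")
    assert alpha >= 2 * beta or alpha >= beta / 2 + 1
    return min(1, V(alpha), V(beta), V(2 * beta), V(alpha - beta))

assert dsym_3dic(2.0, 0.0) == 1.0   # no cross interference
assert dsym_3dic(2.0, 0.5) == 0.5   # V(2*beta) = V(1) binds
assert dsym_3dic(1.5, 0.5) == 0.5   # V(alpha - beta) = V(1) also binds
```

In agreement with Remark 2.2, the formula involves only (α, β), not N.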
Figure 2.3. Illustration of Theorem 2.2 in the (α, β) parameter plane. The result applies everywhere except in region “X”. The value of dsym is represented by different levels of shading, and local maxima are marked by a star.
The proof is given in two parts below. In Subsection 2.2.1, we prove the converse by
allowing the receivers access to some additional “genie” information. In Subsection 2.2.2,
we prove achievability by identifying optimal input distributions for Theorem 2.1.
2.2.1 Converse proof
The upper bounds 1, V(α), V(β), and V(2β) follow in a straightforward way from the known
degrees-of-freedom result for the two-user-pair case (see Remark 2.1). This can be shown by
giving the complete signal X_k^n of one of the interferers as genie information to the receivers,
thus effectively reducing the three-user-pair case to the two-pair case.
Hence we focus on proving the bound V(α− β) by generalizing the methods introduced
in [EC82] to the case at hand. First note that Fano's inequality implies, for every k,

nRk ≤ I(X_k^n; Y_k^n) + nεn,

where εn tends to zero as the block length n grows to infinity.
Without overlap between interferers
First consider α − β ≥ 1, which corresponds to the first line in the definition of V(α − β).
In this case, the two interfering signals do not overlap within the received signal, as shown
in Figure 2.4(a). For example, at receiver 1, the sparsity patterns of X21 and X31 are disjoint.
We can write

I(X_1^n; Y_1^n) (a)= I(X_1^n; Y_1^n, X_23^n)
                   = I(X_1^n; X_23^n) + I(X_1^n; Y_1^n | X_23^n)
                (b)= H(Y_1^n | X_23^n) − H(Y_1^n | X_1^n, X_23^n),

where in (a), X_23^n is a form of “genie” information given to the receiver, which does not
increase the mutual information since X23 is not interfered with in Y1, and (b) uses the
independence between the messages from the first and second transmitters. Now consider the
last term.
H(Y_1^n | X_1^n, X_23^n) = H(X_11^n + X_21^n + X_31^n | X_1^n, X_23^n)
                         = H(X_21^n + X_31^n | X_23^n)
                      (a)= H(X_31^n) + H(X_21^n | X_23^n)
                         = H(X_31^n) + H(X̄_23^n | X_23^n),

where (a) follows from the fact that X21 and X31 do not overlap and different transmitters'
signals are independent, and X̄23 is the part of X2 that is not contained in X23 (see
Figure 2.4(a)). We conclude that

I(X_1^n; Y_1^n) = H(Y_1^n | X_23^n) − H(X_31^n) − H(X̄_23^n | X_23^n).

Figure 2.4. Components of received signal Y1 for the converse proof: (a) interferers do not overlap; (b) interferers overlap. The thick horizontal line at the bottom of the figure symbolizes the “noise level”, i.e., the lower end of the vector where further down-shifts cause loss of information. The received signal Y1 is the elementwise sum of the three signals.
Writing analogous equations for I(X_2^n; Y_2^n) and I(X_3^n; Y_3^n), and adding all three of them,
we arrive at

n(R1 + R2 + R3 − 3εn) ≤ H(Y_1^n | X_23^n) + H(Y_2^n | X_31^n) + H(Y_3^n | X_12^n)
                        − H(X_12^n) − H(X̄_12^n | X_12^n) − H(X_23^n)
                        − H(X̄_23^n | X_23^n) − H(X_31^n) − H(X̄_31^n | X_31^n)
                      = H(Y_1^n | X_23^n) + H(Y_2^n | X_31^n) + H(Y_3^n | X_12^n)
                        − H(X_1^n) − H(X_2^n) − H(X_3^n).
Considering that nRk ≤ H(X_k^n) + nεn for all k ∈ {1:3}, we conclude that

2n(R1 + R2 + R3 − 6εn) ≤ H(Y_1^n | X_23^n) + H(Y_2^n | X_31^n) + H(Y_3^n | X_12^n)
                       ≤ nH(Y1 | X23) + nH(Y2 | X31) + nH(Y3 | X12),

where single-letterization is performed by using the chain rule and omitting part of the
conditioning. The right-hand side of the last inequality is maximized by letting each component
of each Xk be independent Bern(1/2). Thus

2(R1 + R2 + R3) ≤ 3N(α − β),

and finally,

dsym = Rsym/N ≤ (α − β)/2.
With overlap between interferers
Now consider the case where α − β < 1, i.e., the two interfering signals at each receiver
overlap in signal space; see Figure 2.4(b). Define the topmost (1 − (α − β))N bits of Xk as
Tk. We will augment the genie information X_23^n of the previous subsection by T_31^n. This is
exactly the part of the X3-based interference that overlaps with the X2-based interference.
Similar to the previous case, we conclude

I(X_1^n; Y_1^n) ≤ I(X_1^n; Y_1^n, X_23^n, T_31^n)
               = I(X_1^n; X_23^n, T_31^n) + I(X_1^n; Y_1^n | X_23^n, T_31^n)
               = H(Y_1^n | X_23^n, T_31^n) − H(Y_1^n | X_1^n, X_23^n, T_31^n).
The last term becomes
H(Y_1^n | X_1^n, X_23^n, T_31^n) = H(X_11^n + X_21^n + X_31^n | X_1^n, X_23^n, T_31^n)
                                 = H(X_21^n + X_31^n | X_23^n, T_31^n)
                              (a)= H(T̄_31^n | T_31^n) + H(X̄_23^n | X_23^n),

where T̄31 denotes the part of X31 that is not included in T31. Its size is N(α − 1). We are
allowed to separate the terms in (a) because the overlapping part is resolved by T31.
Again, repeating the same steps for all three rates, we arrive at

n(R1 + R2 + R3 − 3εn) ≤ H(Y_1^n | X_23^n, T_31^n) − H(T̄_12^n | T_12^n) − H(X̄_12^n | X_12^n)
                      + H(Y_2^n | X_31^n, T_12^n) − H(T̄_23^n | T_23^n) − H(X̄_23^n | X_23^n)
                      + H(Y_3^n | X_12^n, T_23^n) − H(T̄_31^n | T_31^n) − H(X̄_31^n | X_31^n).

Since T12 and T̄12 form X12, which when combined with X̄12 forms X1, we can write

n(R1 − εn) ≤ H(X_1^n) = H(T_12^n) + H(T̄_12^n | T_12^n) + H(X̄_12^n | T_12^n, T̄_12^n),

where the conditioning on (T_12^n, T̄_12^n) is equivalent to conditioning on X_12^n.
Using this expression and its equivalent for R2 and R3 with the previous inequality, we
obtain
2n(R1 + R2 + R3 − 6εn) ≤ H(Y_1^n | X_23^n, T_31^n) + H(T_12^n) + H(Y_2^n | X_31^n, T_12^n)
                       + H(T_23^n) + H(Y_3^n | X_12^n, T_23^n) + H(T_31^n)
                       ≤ n(H(Y1 | X23, T31) + H(T12) + H(Y2 | X31, T12)
                       + H(T23) + H(Y3 | X12, T23) + H(T31)).

Again, the right-hand side is maximized by choosing all Xk components independently
according to Bern(1/2), yielding

2(R1 + R2 + R3) ≤ 3N + 3N(1 − (α − β)),
dsym ≤ 1 − (α − β)/2,

which matches the definition of V(α − β) for α − β < 1. This concludes the converse proof
of Theorem 2.2. ∎
2.2.2 Achievability proof
We prove achievability by identifying optimal input distributions for Theorem 2.1. We use
input distributions of the form

Xk = G(α, β) Dk,   for k ∈ {1:3},

where G(α, β) is an assignment matrix of size N × N·dsym(α, β) with elements from F_2,
and Dk is a vector of N·dsym(α, β) independent message bits, each distributed Bern(1/2).
We further constrain the coding scheme in several ways. Firstly, all three transmitters use
the same assignment matrix G(α, β). Secondly, there is no coding across multiple channel
uses. Specifically, the assignment matrices are chosen such that I(Xk;Yk) = H(Xk), i.e.,
from observing the channel output Yk at a single time, the decoder can reconstruct the
complete transmit vector Xk that was sent at that time, and thereby, the message bits that
are contained in it. Like the encoding step, this reconstruction is implemented by a linear
operator applied to Yk. Finally, the proposed G matrices have at most one non-zero
element per row, i.e., each component of Xk is assigned either an information bit or a zero.
While these assumptions may seem overly restrictive, they are sufficient for our purposes.
Indeed, it is surprising that such a constrained set of codes is able to meet the upper bound
of Subsection 2.2.1.
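As a toy illustration of such a constrained assignment matrix (my own construction with hypothetical dimensions, not one of the G matrices of Subsection 2.2.3), consider:

```python
# An assignment matrix G with at most one non-zero element per row: each
# component of Xk is either a message bit or zero, so the map D -> X = G D
# is injective and H(Xk) = H(Dk).
import numpy as np
from itertools import product

N, m = 4, 2                      # input length and number of message bits
G = np.zeros((N, m), dtype=int)
G[0, 0] = 1                      # component 1 carries data bit 1
G[2, 1] = 1                      # component 3 carries data bit 2; rows 2, 4 are zero

assert all(row.sum() <= 1 for row in G)
images = {tuple(G @ np.array(d) % 2) for d in product([0, 1], repeat=m)}
assert len(images) == 2 ** m     # injective: all 2^m transmit vectors distinct
```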
Remark 2.4. If the number N of components in the input vectors is small, it can severely
limit our options in terms of assignment matrices. The following argument can circumvent
this problem by expanding a given channel to one with larger N. To this end, take the
binary-field 3-DIC with parameters (N, α, β), and consider L ≥ 2 subsequent channel uses
with channel inputs X_{k,1}, . . . , X_{k,L}. Let us interleave these vectors into a supersymbol

Xk = Σ_{l=1}^{L} (I_N ⊗ e_l) X_{k,l},

and likewise for the outputs Yk. Here, ⊗ denotes the Kronecker product, and e_l is the lth
column of I_L. The resulting channel {X1, X2, X3} → {Y1, Y2, Y3} is then a binary-field
3-DIC with parameters (LN, α, β). Through this method, we have increased N to LN, where
L can be arbitrarily large. Note that dsym is unaffected by this transformation since it is
normalized by N. In light of this transformation, we assume from now on that N is large
enough that any fraction of N that we incur evaluates to an integer.
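The interleaving can be sketched as follows (my own illustration; integer labels are used instead of bits so that the final placement is visible):

```python
# Interleaving supersymbol of Remark 2.4, with N = 2 and L = 3. Component i
# of the L channel uses ends up in a contiguous run of L positions.
import numpy as np

N, L = 2, 3
I_N, I_L = np.eye(N, dtype=int), np.eye(L, dtype=int)
xs = [np.array([10 * l + 1, 10 * l + 2]) for l in range(L)]  # inputs X_{k,l}

x_super = sum(np.kron(I_N, I_L[:, [l]]) @ xs[l][:, None] for l in range(L))
# Same-position components across channel uses are adjacent:
assert x_super.ravel().tolist() == [1, 11, 21, 2, 12, 22]
```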
The assignment matrix G depends on the channel parameters α and β. The set of interest
{(α, β)} is divided into 18 regions “Aa” to “Ee” as shown in Figure 2.5. Compared to
Figure 2.3, some of the parameter regions are subdivided (for example, “Ea” and “Eb”),
which indicates that a different kind of assignment matrix G is needed even within a
parameter range where dsym is linear in α and β.
Optimal assignment matrices G for all parameter regions in Figure 2.5 are listed in
Subsection 2.2.3 on page 28. For each region, we specify the affine constraints on (α, β)
that define the region and the optimal input assignment matrix G. The latter is given in terms
of the resulting transmit vector Xk. In the following we discuss the details for one particular
example, which is representative for all other cases.
Optimal input distribution in region “Df”
This region is parameterized by (α, β) = (4/3 + ε, 2/3 + δ), with ε ≤ 2δ, ε ≥ δ/2, and δ ≤ 1/3.
Figure 2.5. Regions in the (α, β) parameter plane for the achievability proof of Theorem 2.2.

Figure 2.6(a), copied from Subsection 2.2.3, represents an optimal assignment G by means
of the resulting transmit vector. The vector Xk is subdivided into data blocks (hatched) that
correspond to non-zero rows of G, and zero blocks (gray) that correspond to all-zero rows
of G. Some data blocks occur twice. We denote such block pairs as twins. Twins carry
the same data bits, albeit in reverse order as discussed later. The length of each block as a
fraction of N is annotated in the figure.
To prove achievability of Theorem 2.2, we require the transmit vector to be both valid
and decodable. By valid we mean (a) all block lengths are non-negative for the range of
(ε, δ) that constitute the region, (b) the sum of the block lengths is 1, and (c) adding the
sizes of all data blocks, counting twins only once, results in the desired dsym as claimed in
Theorem 2.2, e.g., 2/3− δ/2 for our case. By decodable, we mean that using this transmit
vector assignment, the receiver can recover all desired data blocks from the received signal,
or equivalently, I(Xk;Yk) = H(Xk).
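For region “Df”, the validity conditions (a)–(c) can be checked mechanically. The sketch below (my own, in exact rational arithmetic; which blocks are data and which are twins is my inference from the block-length arithmetic, namely that two of the three (1/3 − δ) blocks carry data, one is the zero block, and the remaining blocks form two twin pairs) verifies them at sample points inside the region:

```python
# Validity checks for the region "Df" assignment of Figure 2.6(a).
from fractions import Fraction as F

def blocks(e, d):
    """The seven block lengths, as fractions of N."""
    t = F(1, 3)
    return [t - d, e - d/2, t - d, -e + 2*d, e - d/2, t - d, -e + 2*d]

def data_once(e, d):
    """Total data length with each twin pair counted once (inferred split)."""
    t = F(1, 3)
    return 2*(t - d) + (e - d/2) + (-e + 2*d)

# sample (epsilon, delta) with eps <= 2*delta, eps >= delta/2, delta <= 1/3:
for e, d in [(F(1, 6), F(1, 6)), (F(1, 12), F(1, 8)), (F(1, 4), F(1, 5))]:
    assert e <= 2*d and 2*e >= d and d <= F(1, 3)
    assert all(b >= 0 for b in blocks(e, d))   # (a) non-negative lengths
    assert sum(blocks(e, d)) == 1              # (b) lengths sum to 1
    assert data_once(e, d) == F(2, 3) - d/2    # (c) matches dsym = 2/3 - d/2
```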
To verify decodability, consider Figure 2.6(b), which uses the same conventions as
Figure 2.4. The receiver sees the sum of data blocks from different transmitters, each
characterized by its length and shift location. Blocks from different transmitters may or may
not overlap. Decoding is performed sequentially, block by block. In each step, one of three
rules is applied in order to decode additional data blocks, which are then removed from the
received signal. The three decoding rules are as follows.

Figure 2.6. Transmit and received signal in region “Df”. (a) Proposed assignment G, shown as the resulting transmit vector; the block lengths are given as fractions of N, such that the sum of all block lengths is 1. (b) Received signal Y1, at α = 1.6, β = 0.9, with dsym = 0.55; blocks in different columns carry different data.
Direct readout. Consider the situation in Figure 2.7(b). If a data block (i) does not
overlap with any other data block and (ii) is located above the noise level, then its data
content can be read out directly from the received signal. It is crucial that both (i) and (ii)
hold for all (α, β) in the region, since the length and location of the blocks in Figure 2.6(b)
change when α and β vary. A block that has been read out is then removed from the received
signal. If the block is one of a twin, its sibling is removed as well.
Overlapping twins (A). Consider Figure 2.7(c). If two twin pairs exist such that (i) they
have the same block length, b1 = b2, (ii) they have the same separation, s1 = s2, (iii) the
relative shift between the pairs is less than the separation, c < s1, and (iv) the solid green
sections of (A) in Figure 2.7(c) do not overlap with any other data block and are above the
noise floor, then both twin pairs can be decoded and canceled from the received signal.

Figure 2.7. Rules for verifying decodability. Legend (a) applies to “direct readout”, shown in (b), and to two variants of “overlapping twins”, shown in (c).

As before, conditions (i)–(iv) must hold for all (α, β) in the region. To see this, consider the
following successive decoding argument [JV08]. Let the two copies within a twin be in
reverse order of each other. First, the lowest part of the bottom blue twin is read out. Its data
reappears on the top end of the upper blue twin, thus revealing a chunk of data on the top
end of the upper orange twin. This data in turn is replicated at the lower end of the
bottom orange twin, which exposes a new part of the bottom blue twin. The process repeats
until both twin pairs are completely decoded.
Overlapping twins (B). This rule is a variant of the previous one, where pattern (B)
replaces pattern (A) in Figure 2.7 (c). Decoding proceeds similarly, but starting from the
inside end of the twins.
In our example, the sequence of steps that completely decodes X1 is annotated in
Figure 2.6(b): First, block 1 is decoded via direct readout. The now-known data block
and its twin are removed from the received signal Y1. The same rule allows block 2 to
be decoded, which is then removed from Y1. Each removal step makes more room for
subsequent rule applications. Next, the overlapping twins (A) rule is applied to the two pairs
of twins 3. Continuing in the same fashion, the removal of blocks 1, 2 and 3 enables the two
twin pairs 4 to be decoded using the overlapping twins (B) rule. Finally, data blocks 5 and 6
can be recovered by direct readout, which completes the decoding process. By symmetry,
the signals at the other two receivers can be similarly decoded.
The assignments for all other regions as listed in Subsection 2.2.3 can be shown to be
valid and decodable using the same procedure. This concludes the achievability proof of
Theorem 2.2. �
2.2.3 Optimal assignment matrices
For each region, we list the parameters, the affine constraints defining the region, the rate dsym, and the block lengths of the resulting transmit vector (in order, as fractions of N; in every case they sum to 1).

Region Aa.
Parameters: α = 2 + ε, β = δ.
Constraints: δ ≥ 0, δ ≤ 1 + ε, ε ≤ −δ.
Rate dsym: 1 + ε/2 − δ/2.
Transmit vector blocks: δ, −ε/2 − δ/2, 1 + ε, −ε/2 − δ/2.

Region Ab.
Parameters: α = 2 + ε, β = δ.
Constraints: ε ≤ 0, δ ≤ 1/2, ε ≥ −δ.
Rate dsym: 1 − δ.
Transmit vector blocks: δ, 1 − δ.

Region Ba.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ 3δ, ε ≤ 1/5 + δ, ε ≥ 1/10 − δ/2.
Rate dsym: 3/5 − ε/2 + δ/2.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε/2 − δ/2, 1/5 − ε/2 − δ/2, −1/5 + 2ε + δ, 1/5 − ε/2 − δ/2, 1/5 − ε/2 − δ/2, 1/5 + ε.

Region Bb.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ −δ/3, ε ≤ 1/5 + δ, ε ≤ 1/10 − δ/2, ε ≥ 3δ.
Rate dsym: 3/5 − ε/2 + δ/2.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε/2 − δ/2, (3/2)ε + δ/2, 1/5 − 2ε − δ, (3/2)ε + δ/2, 1/5 − ε/2 − δ/2, 1/5 + ε.

Region Bc.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ δ/2, ε ≤ 1/5 + δ, ε ≤ −δ/3.
Rate dsym: 3/5 − ε/2 + δ/2.
Transmit vector blocks: 1/5 + δ/2, 1/5 + ε, −(3/2)ε − δ/2, 1/5 + ε, −(3/2)ε − δ/2, 1/5 + ε, 1/5 + δ/2.

Region Bd.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≤ δ/2, ε ≤ −2δ, ε ≥ −1/5.
Rate dsym: 3/5 + ε/2.
Transmit vector blocks: −ε + δ/2, 1/5 + ε, −ε + δ/2, 1/5 + ε, −ε/2 − δ, 1/5 + ε, −ε/2 − δ, 1/5 + ε, −ε + δ/2, 1/5 + ε, −ε + δ/2.

Region Be.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≤ δ/2, ε ≥ −2δ, δ ≤ 1/10.
Rate dsym: 3/5 − δ.
Transmit vector blocks: −ε + δ/2, 1/5 + ε, −ε + δ/2, 1/5 + ε, 1/5 − 2δ, 1/5 + ε, −ε + δ/2, 1/5 + ε, −ε + δ/2.

Region Bf.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≥ δ/2, ε ≤ 3δ, ε ≤ 1/10 − δ/2.
Rate dsym: 3/5 − δ.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε + δ, 2ε − δ, 1/5 − 2ε − δ, 2ε − δ, 1/5 − ε + δ, 1/5 + ε.

Region Bg.
Parameters: α = 6/5 + ε, β = 2/5 + δ.
Constraints: ε ≤ 3δ, δ ≤ 1/10, ε ≥ 1/10 − δ/2.
Rate dsym: 3/5 − δ.
Transmit vector blocks: 1/5 − ε + δ, 1/5 − ε + δ, 1/5 − 2δ, −1/5 + 2ε + δ, 1/5 − 2δ, 1/5 − ε + δ, 1/5 + ε.

Region Da.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≥ 2δ, ε ≥ −δ, ε ≤ 1/3 + δ.
Rate dsym: 2/3 − ε/2 + δ/2.
Transmit vector blocks: 1/3 − δ, ε/2 + δ/2, 1/3 − δ, ε/2 + δ/2, 1/3 − ε + δ.

Region Db.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≤ −δ, ε ≥ δ/2, δ ≥ −1/6.
Rate dsym: 2/3 + δ.
Transmit vector blocks: 1/3 + δ/2, 1/3 − δ, 1/3 + δ/2.

Region Dc.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≤ δ/2, ε ≥ 2δ, δ ≥ −1/6.
Rate dsym: 2/3 + δ.
Transmit vector blocks: −ε + δ/2, 1/3 + ε, −ε + δ/2, 1/3 + 2ε − 2δ, −ε + δ/2, 1/3 + ε, −ε + δ/2.

Region Df.
Parameters: α = 4/3 + ε, β = 2/3 + δ.
Constraints: ε ≤ 2δ, ε ≥ δ/2, δ ≤ 1/3.
Rate dsym: 2/3 − δ/2.
Transmit vector blocks: 1/3 − δ, ε − δ/2, 1/3 − δ, −ε + 2δ, ε − δ/2, 1/3 − δ, −ε + 2δ.

Region Ea.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 0, δ ≥ −1/15, ε ≥ 3δ.
Rate dsym: 2/3 + δ.
Transmit vector blocks: −3δ, 1/3 + 2δ, −3δ, 1/3 + 5δ, −3δ, 1/3 + 2δ.

Region Eb.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 0, δ ≤ −1/15, δ ≥ −1/6, ε ≥ 3δ.
Rate dsym: 2/3 + δ.
Transmit vector blocks: −3δ, 1/3 + 2δ, 1/3 + 2δ, −1/3 − 5δ, 1/3 + 2δ, 1/3 + 2δ.

Region Ec.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 3δ, ε ≥ −1/3 + δ, ε ≤ −1/3 − 2δ.
Rate dsym: 2/3 + ε/2 − δ/2.
Transmit vector blocks: −ε/2 − (3/2)δ, 1/3 + ε/2 + δ/2, 1/3 + ε/2 + δ/2, −1/3 − ε − 2δ, 1/3 + ε/2 + δ/2, 1/3 + ε/2 + δ/2, −ε/2 + (3/2)δ.

Region Ed.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 3δ, ε ≥ −1/3 − 2δ, ε ≥ −1/3 + δ, ε ≤ −3δ.
Rate dsym: 2/3 + ε/2 − δ/2.
Transmit vector blocks: −ε/2 − (3/2)δ, 1/3 + ε/2 + δ/2, −ε/2 − (3/2)δ, 1/3 + ε + 2δ, −ε/2 − (3/2)δ, 1/3 + ε/2 + δ/2, −ε/2 + (3/2)δ.

Region Ee.
Parameters: α = 2 + ε, β = 2/3 + δ.
Constraints: ε ≤ 0, δ ≤ 1/3 + ε, δ ≥ −ε/3.
Rate dsym: 2/3 + ε/2 − δ/2.
Transmit vector blocks: 1/3 − δ, 1/3 − δ, ε/2 + (3/2)δ, 1/3 − δ, ε/2 + (3/2)δ, −ε.
Chapter 3
Interference Decoding
In this chapter,¹ we use interference decoding to develop an inner bound for the capacity
region of the 3-DIC defined in Section 1.2 on page 6. The idea is to treat the combined
interference signal as one entity at the receivers instead of artificially separating different
sources of interference from each other.
Explicit decoding of the combined interference signal was first discussed in [BPT10] for
the many-to-one Gaussian interference channel. The authors argue that with Gaussian codes,
decoding the combined interference is tantamount to decoding each interfering sender’s
codeword. On the other hand, with structured (lattice) codes, the combined interference can
be made to appear essentially as a codeword from a single interferer. Lattice codes [EZ04,
ELZ05] have found applications in a number of Gaussian interference network settings, see,
e.g., [SJV+08, MDFT11, TY11, SD11]. In general, for channels with inherent linearity such
as Gaussian interference channels, it is natural to consider decoding linear combinations of
interfering codewords, instead of individual codewords. This idea is developed in [NG09]
for Gaussian relay networks, leading to a compute–forward relaying scheme.
We focus on the 3-DIC, where the combined interference signal takes values from a
finite set, and therefore a certain type of alignment can be observed without resorting to
complicated structured codes [NG08]. We assume point-to-point codes without rate split-
ting or superposition coding since such codes are widely deployed and it is interesting to
investigate the benefit of using a more sophisticated receiver instead of treating interference
as noise.

¹The results in this chapter were first published in [BE10, BE11d].

Specifically, each receiver simultaneously decodes the intended message and the
combined interference without penalizing incorrect decoding of the latter. Of course, one
does not expect this scheme to be optimal in general, since even for the two-user-pair case,
superposition coding is required for optimality. Note that for our class of deterministic chan-
nels, algebraic structures such as linear subspaces or lattices do not exist in general. Hence,
our decoder does not use the two-step procedure as in the work on Gaussian channels and
their corresponding high SNR deterministic models (see the discussion in Subsection 1.1.2
on page 5).
The key observation is that depending on the input pmfs and the message rates, the
number of possible combined interference sequences can be equal to the number of in-
terfering message pairs, the number of typical combined interference sequences, or some
combination of the two. In our scheme, each sender does not need to know the other senders’
codebooks. However, we use simultaneous decoding, which requires that the receivers know
all codebooks. As in the recent characterization of the Han–Kobayashi region [CMGE08],
we do not require the interference decoding to be correct with arbitrarily small probability
of error.
3.1 Results and discussion
In the following we summarize the results in this chapter.
3.1.1 Interference-decoding inner bound
Fix the random tuple (Q,X1, X2, X3) ∼ p = p(q)p(x1|q)p(x2|q)p(x3|q), where Q is a
time-sharing random variable from alphabet Q. Define the region R1(p) ⊂ R3+ to consist of
the rate triples (R1, R2, R3) such that
R1 ≤ H(X11 | Q),                                                        (3.1)
R1 + min{R2, H(X21 | Q)} ≤ H(Y1 | X31, Q),                              (3.2)
R1 + min{R3, H(X31 | Q)} ≤ H(Y1 | X21, Q),                              (3.3)
R1 + min{R2 + R3, R2 + H(X31 | Q), H(X21 | Q) + R3, H(S1 | Q)} ≤ H(Y1 | Q).   (3.4)
Similarly, define the regions R2(p) and R3(p) by making the subscript replacements 1 → 2 → 3 → 1 and 1 → 3 → 2 → 1 in R1(p), respectively.
Let co(S) denote the convex hull of the set S.
Theorem 3.1 (Interference-decoding inner bound). The region

RID = co( ⋃p R1(p) ∩ R2(p) ∩ R3(p) ),

where p = p(q)p(x1|q)p(x2|q)p(x3|q), is an inner bound to the 3-DIC capacity region.
Region Rk(p) ensures decodability at receiver k, and the intersection R1(p) ∩R2(p) ∩R3(p) ensures decodability at all three receivers. The proof for this theorem is given in
Section 3.2.
Remark 3.1 (Saturation). The min terms on the left hand side of the inequalities arise
from counting the effective number of interfering sequences at various links of the channel.
For example, consider the min{R2, H(X21 |Q)} term in (3.2). If R2 is small, the number
of distinct sequences that can occur at X21 is equal to the number of possible messages from
sender 2. As R2 increases beyond H(X21 |Q), the number of possible sequences at X21
“saturates” to the number of typical sequences, which is roughly 2nH(X21 |Q). In this case,
we can increase the rate of the second sender further without negatively impacting the first
receiver. The min expressions in (3.3) and (3.4) likewise capture the saturation effects at
X31 and S1, respectively.
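The saturation effect of Remark 3.1 can be checked empirically. The following sketch (our own illustration; the block length and pmf are arbitrary) draws 2^{nR} i.i.d. codewords from a Bernoulli(1/2) pmf with H = 1 bit and measures the exponent of the number of distinct sequences, which tracks min{R, H}.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12  # short block length, for speed only

def distinct_exponent(R, p=(0.5, 0.5)):
    """Draw 2^{nR} i.i.d. codewords from pmf p and return (1/n) log2 of
    the number of distinct length-n sequences that actually occur."""
    M = int(2 ** (n * R))
    codewords = rng.choice(len(p), size=(M, n), p=p)
    return np.log2(len({tuple(c) for c in codewords})) / n

# Below H = 1 bit the exponent is about R; beyond H it saturates near H.
for R in (0.25, 0.5, 1.0, 1.5):
    print(R, round(distinct_exponent(R), 3))
```

For R = 1.5 the exponent stays near 1 bit: the codeword count exceeds the number of typical sequences, which is exactly the saturation captured by the min term in (3.2).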
An example of region R1(p) is plotted in Figure 3.1. The region is unbounded in the R2
and R3 directions, due to saturation. This is expected, since regardless of the values of R2
and R3, S1 can always be treated as noise to achieve a non-zero rate R1. However, as R2
and R3 become smaller, the proposed scheme takes advantage of the structure in S1 and can
thereby increase R1.
Figure 3.1. Region R1(p), which ensures decodability at the first receiver. (Axes: R1, R2, R3.)
Remark 3.2 (Convexity). The regions R1(p), R2(p), and R3(p), and thus their intersection, are generally nonconvex. By virtue of time sharing, we are allowed to convexify. However, this convexification is not achieved by the coded time-sharing mechanism of Q, hence the explicit convex hull operation in the theorem.
3.1.2 Capacity region under strong interference
Consider the subclass of 3-DIC with strong interference and invertible hk in which the
following two conditions hold.
First, the loss functions glk are such that
min{H(X12), H(X13)} ≥ H(X11),
min{H(X21), H(X23)} ≥ H(X22),
min{H(X31), H(X32)} ≥ H(X33),
for all product input pmfs p(x1)p(x2)p(x3). This condition implies that interference is
strong.
Second, the functions hk are invertible, i.e.,
H(S1) = H(X21) +H(X31),
H(S2) = H(X12) +H(X32),
H(S3) = H(X13) +H(X23),
for all product input pmfs p(x1)p(x2)p(x3). With the conditional invertibility property of
fk, the channel becomes a non-symmetric version of the deterministic model for the SIMO
interference channel described in [GJ11]. In both cases, a receiver can uniquely recover
both interfering signals given the received sequence and the desired transmitted sequence.
The capacity region under these conditions is achieved by interference decoding.
Theorem 3.2 (3-DIC capacity region with strong interference and invertible hk). The capacity region of the 3-DIC under strong interference and invertible hk functions
is the set of rate triples (R1, R2, R3) such that
Rk ≤ H(Xkk |Q), k ∈ {1:3},
R1 +R2 ≤ min{H(Y1 |X31, Q), H(Y2 |X32, Q)},
R1 +R3 ≤ min{H(Y1 |X21, Q), H(Y3 |X23, Q)},
R2 +R3 ≤ min{H(Y2 |X12, Q), H(Y3 |X13, Q)},
R1 +R2 +R3 ≤ min{H(Y1 |Q), H(Y2 |Q), H(Y3 |Q)},
for some (Q,X1, X2, X3) ∼ p(q)p(x1|q)p(x2|q)p(x3|q) with |Q| ≤ 12.
The proof of this theorem is given in Section 3.3.
3.1.3 Comparison to treating interference as noise
In the two-user-pair interference channel, decoding both messages at each receiver and
treating interference as noise are considered as two extreme schemes. The extremes are
bridged by the Han–Kobayashi scheme in which part of the interference is decoded and the
rest is treated as noise [EK11]. While treating interference as noise is better for channels
with weak interference, decoding both messages is optimal under strong interference. We show, surprisingly, that for the 3-DIC under consideration, treating interference as noise is a special case of interference decoding!
In Section 3.4, we establish the following result.
Theorem 3.3 (Interference decoding versus treating interference as noise). The rate region achievable by treating interference as noise (Theorem 2.1) is included in
the interference-decoding rate region of Theorem 3.1, i.e.,
RTIN ⊆ RID.
The difference between treating interference as noise and interference decoding is
essentially that the former assumes that the combined interference signal Sk is always
saturated, while the latter distinguishes between saturated and non-saturated cases. Later in
this section, we argue that the above inclusion result is tightly coupled to the definition of
the 3-DIC.
The following example shows that the inclusion of Theorem 3.3 can be strict, i.e., the
treating interference as noise region is strictly contained in the interference-decoding region.
Continuation of Example 1.1. Recall the additive 3-DIC in Example 1.1 on page 8. For
this channel, the interference-decoding rate region strictly contains the region achievable
by treating interference as noise. To demonstrate this, we computed the approximation of
the interference decoding inner bound depicted in Figure 3.2. Since it is computationally
infeasible to enumerate the potentially arbitrarily large number of conditional distributions
of inputs given Q as required by Theorem 3.1, we used the following procedure. We first
assume Q = ∅ and consider a grid over all input distributions p(x1)p(x2)p(x3). For each
grid point, we compute the achievable rate region as given by Theorem 3.1.
We represent the region as the convex hull of its corner points. The final approximation is
obtained by taking the union of all such corner points over the grid. Note that due to the
simple structure of RTIN in Theorem 2.1, which consists of a union of rectangular boxes, this
method can compute RTIN to arbitrary precision provided the grid is sufficiently fine (see
Figure 2.1 on page 14). On the other hand, when applied to Theorem 3.1, our approximation
method yields a possibly strictly smaller inner bound than RID.
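A minimal version of this grid procedure can be sketched in code. The channel below is a toy binary 3-DIC with modulo-2 links (not the additive channel of Example 1.1, whose alphabets are larger); all helper names are our own, and Q is taken to be degenerate as in the text.

```python
import itertools
import numpy as np

def H(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * np.log2(p) for p in pmf.values() if p > 0)

def pushforward(pmf, f):
    """pmf of f(x1, x2, x3) when (x1, x2, x3) ~ pmf."""
    out = {}
    for x, p in pmf.items():
        out[f(*x)] = out.get(f(*x), 0.0) + p
    return out

def cond_H(pmf, f, g):
    """H(f(X) | g(X)) for X ~ pmf."""
    return H(pushforward(pmf, lambda *x: (f(*x), g(*x)))) - H(pushforward(pmf, g))

# Toy binary 3-DIC at receiver 1: S1 = X2 xor X3, Y1 = X1 xor S1.
Y1 = lambda x1, x2, x3: x1 ^ x2 ^ x3

rhs_points = []
grid = np.linspace(0.05, 0.5, 6)          # grid over Bernoulli input parameters
for a, b, c in itertools.product(grid, repeat=3):
    pmf = {x: (a if x[0] else 1-a) * (b if x[1] else 1-b) * (c if x[2] else 1-c)
           for x in itertools.product((0, 1), repeat=3)}
    rhs_points.append((
        H(pushforward(pmf, lambda x1, x2, x3: x1)),   # H(X11), RHS of (3.1)
        cond_H(pmf, Y1, lambda x1, x2, x3: x3),       # H(Y1|X31), RHS of (3.2)
        cond_H(pmf, Y1, lambda x1, x2, x3: x2),       # H(Y1|X21), RHS of (3.3)
        H(pushforward(pmf, Y1)),                      # H(Y1),     RHS of (3.4)
    ))
```

The corner points of R1(p) follow from these right-hand sides together with the min terms; the final approximation is the convex hull of the union of corner points over the grid (and over R2(p) and R3(p)).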
Figure 3.3 depicts the intersection of the three-dimensional regions in Figures 2.1 and 3.2
with the plane defined by R1 = R3. Note that the same maximum sum rate RΣ = 3 is
achieved by both schemes. However, while treating interference as noise does so at exactly
one rate triple (R1 = R2 = R3 = 1), interference decoding achieves the maximal sum rate
at many different asymmetric rate triples.
Remark 3.3. As we have seen in Subsection 2.2, treating interference as noise achieves the
sum capacity for the cyclically symmetric binary-field 3-DIC in a wide range of parameters
(α, β). It would be interesting to investigate whether interference decoding can achieve
higher sum rates than treating interference as noise in the (α, β) range where the sum
capacity is not known. Moreover, even in the range where we know the sum capacity,
interference decoding may achieve higher asymmetric rates than treating interference as
noise, as in the additive 3-DIC example. The main challenge in settling these questions is
the prohibitively large space of possible input distributions in Theorem 3.1.
Figure 3.2. Region of Theorem 3.1 for the additive 3-DIC example (axes R1, R2, R3). Compare to Figure 2.1 on page 14.
Figure 3.3. Comparison of interference decoding and treating interference as noise (Theorems 2.1 and 3.1) for the additive 3-DIC example, plotted over (R1 + R3)/√2 and R2; the line RΣ = 3 marks the maximum sum rate.
3.1.4 Extension to 3-DIC with noisy observations
In this subsection, we consider the 3-DIC with noisy observations. In this generalization of
3-DIC, the channel outputs in (1.1) on page 8 are observed through memoryless channels
Yk → Zk for k ∈ {1:3}. Thus receiver k now observes a noisy version Zk of Yk, which may
be from a discrete or a continuous alphabet.
The interference-decoding inner bound generalizes to the 3-DIC with noisy observations
as follows. Let (Q,X1, X2, X3) ∼ p = p(q)p(x1|q)p(x2|q)p(x3|q). Define the region
R′1(p) ⊂ R3+ as the set of rate triples (R1, R2, R3) such that

R1 ≤ I(X1; Z1 | S1, Q),
R1 + min{R2, H(X21 | Q)} ≤ I(X1, X21; Z1 | X31, Q),
R1 + min{R3, H(X31 | Q)} ≤ I(X1, X31; Z1 | X21, Q),
R1 + min{R2 + R3, R2 + H(X31 | Q), H(X21 | Q) + R3, H(S1 | Q)} ≤ I(X1, S1; Z1 | Q).
Similarly, define the regions R′2(p) and R′3(p) by making the subscript replacements 1 → 2 → 3 → 1 and 1 → 3 → 2 → 1 in R′1(p), respectively.
Theorem 3.4 (Interference decoding for 3-DIC with noisy observations). The region

R′ID = ⋃p R′1(p) ∩ R′2(p) ∩ R′3(p),

where p = p(q)p(x1|q)p(x2|q)p(x3|q), is an inner bound to the capacity region of the
3-DIC with noisy observations.
The proof of this theorem proceeds completely analogously to the proof of Theorem 3.1 as presented in Section 3.2, and thus its details are omitted. Note that the inclusion of Theorem 3.3 does not generalize to the case with noisy observations; the formal reason is discussed in Remark 3.4 on page 52.
The following example demonstrates the inner bound for the 3-DIC with noisy observa-
tions. It also serves to illustrate that treating interference as noise can perform better than
interference decoding for this channel model.
Example 3.1 (Gaussian interference channel with BPSK). Consider the Gaussian inter-
ference channel with finite input alphabets. The channel output at receiver k is
Yk = ∑l=1..3 glk Xl,
Zk = Yk + Nk,                                            (3.5)
where glk ∈ R is the path gain from transmitter l to receiver k, and Nk is additive white
Gaussian noise of average power σ2. This is a realistic model for a wireless interference
channel where the transmitter hardware is based on digital signal processing (DSP) and
digital-to-analog conversion (DAC). For example, Xl = {+1,−1} represents a system with
a binary constellation, e.g., binary phase-shift keying (BPSK). Equation (3.5) represents
continuous-valued outputs (soft outputs), but our model would also apply if a quantizer is
added (hard outputs), for example due to analog–digital conversion (ADC) at the receivers.
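To make the comparison concrete, the rate achieved at receiver 1 by treating interference as noise, I(X1; Z1), can be computed numerically for the parameters used in Figure 3.4. This is our own sketch: Z1 is an equal-weight Gaussian mixture over the eight BPSK sign patterns, so both differential entropies reduce to mixture entropies.

```python
import itertools
import numpy as np

g11, g21, g31 = 1.8, 1.0, 1.1   # path gains into receiver 1 (Figure 3.4)
sigma2 = 0.1                    # noise power

def mixture_entropy(means, var, lo=-7.0, hi=7.0, m=20001):
    """Differential entropy (bits) of an equal-weight Gaussian mixture,
    by numerical integration on a fine grid."""
    z = np.linspace(lo, hi, m)
    pdf = sum(np.exp(-(z - mu) ** 2 / (2 * var)) for mu in means)
    pdf /= len(means) * np.sqrt(2 * np.pi * var)
    dz = z[1] - z[0]
    return float(-(pdf * np.log2(np.maximum(pdf, 1e-300))).sum() * dz)

signs = list(itertools.product((1, -1), repeat=3))
h_Z = mixture_entropy([g11*a + g21*b + g31*c for a, b, c in signs], sigma2)
# Conditioning on X1 = +1 leaves a mixture over the 4 interference patterns
# (by symmetry, conditioning on X1 = -1 gives the same entropy).
h_Z_given_X1 = mixture_entropy([g11 + g21*b + g31*c for b, c in
                                itertools.product((1, -1), repeat=2)], sigma2)
R1_tin = h_Z - h_Z_given_X1     # I(X1; Z1) with interference treated as noise
print(round(R1_tin, 3))
```

By the cyclic symmetry of the example, three times this per-user rate should be roughly consistent with the treating-interference-as-noise sum rate of 2.51 reported below.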
Figure 3.4 shows approximations of the inner bounds for a cyclically symmetric Gaussian
interference channel with BPSK inputs and continuous outputs. In contrast to the noiseless
case, neither the interference-decoding region nor the region achieved by treating interference
as noise contains the other, i.e., Theorem 3.3 does not hold for 3-DIC with noisy observations.
In particular, the sum rates achieved by treating interference as noise and interference
decoding are 2.51 and 2.37, respectively. Intuitively, interference decoding attempts to
separate the combined interference from the additive noise. As such, it may achieve lower
rates than simply treating interference as noise for which this separation is not enforced.
This discrepancy is more pronounced for low values of SNR, and it vanishes asymptotically
as SNR grows.
3.1.5 Interference decoding is not optimal in general
As in treating interference as noise, the interference-decoding scheme uses point-to-point
codes. Although the decoder in interference decoding is more sophisticated than the one
underlying Theorem 2.1, the interference-decoding inner bound is not optimal in general.
Figure 3.4. Gaussian interference channel with BPSK: rate regions achieved by interference decoding (dashed outline) and treating interference as noise (shaded) for a cyclically symmetric Gaussian interference channel with Xk ∈ {+1,−1}, path gains g11 = 1.8, g21 = 1.0, g31 = 1.1, and noise power σ2 = 0.1. (Axes: R1, R2, R3.)
To exemplify this, consider the following example of a 2-DIC (cf. Subsection 1.1.1).
Example 3.2. Consider the 2-DIC in Figure 3.5(a) with input alphabets X1 = {0, 1, 2}, X2 = {0, 1}, loss functions g12 = {0 → 0, 1 → 0, 2 → 1} and g11 = g22 = g21 = Id, and
receiver functions f1 = f2 being addition. The outputs of the channel are thus given by
Y1 = X1 +X2,
Y2 = g12(X1) +X2.
The interference-decoding inner bound in Theorem 3.1 reduces to the set of rate pairs
(R1, R2) such that
R1 ≤ H(X1 |Q),
R2 ≤ H(X2 |Q),
R1 + min{R2, H(S1 |Q)} ≤ H(Y1 |Q),
R2 + min{R1, H(S2 |Q)} ≤ H(Y2 |Q),
for some p(q)p(x1|q)p(x2|q).
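For a concrete data point, the entropies entering these constraints can be evaluated at, say, uniform inputs (one grid point in the approximation procedure described earlier; the snippet is our own illustration, with Q degenerate).

```python
import itertools
import numpy as np

def H(pmf):
    """Entropy in bits of {outcome: probability}."""
    return -sum(p * np.log2(p) for p in pmf.values() if p > 0)

p1 = {0: 1/3, 1: 1/3, 2: 1/3}   # uniform input pmfs
p2 = {0: 1/2, 1: 1/2}
g12 = {0: 0, 1: 0, 2: 1}        # loss function of Example 3.2

def out_pmf(f):
    """pmf of f(x1, x2) under the product input distribution."""
    out = {}
    for x1, x2 in itertools.product(p1, p2):
        out[f(x1, x2)] = out.get(f(x1, x2), 0.0) + p1[x1] * p2[x2]
    return out

HY1 = H(out_pmf(lambda a, b: a + b))         # H(Y1), Y1 = X1 + X2
HY2 = H(out_pmf(lambda a, b: g12[a] + b))    # H(Y2), Y2 = g12(X1) + X2
HS1 = H(p2)                                  # S1 = X2
HS2 = H(out_pmf(lambda a, b: g12[a]))        # S2 = g12(X1)
print(round(HY1, 3), round(HY2, 3), HS1, round(HS2, 3))
```

With these values (H(Y1) ≈ 1.92, H(S1) = 1 bit), the four inequalities above can be traced out for this grid point; repeating over a grid of input pmfs yields the interference-decoding region in Figure 3.5(b).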
Figure 3.5(b) compares this inner bound to the capacity region given by Theorem 1.1 and
to the region achievable by treating interference as noise (Theorem 2.1). Not surprisingly, in-
terference decoding does not achieve the full capacity. To achieve capacity, Han–Kobayashi
rate splitting and superposition coding are needed.
Figure 3.5. 2-DIC example. (a) Block diagram of the channel, with X1 ∈ {0, 1, 2}, X2 ∈ {0, 1}, Y1 ∈ {0, 1, 2, 3}, and Y2 ∈ {0, 1, 2}. (b) Capacity region and inner bounds in the (R1, R2) plane: capacity, interference decoding, and treating interference as noise.
3.2 Proof of Theorem 3.1
We first present a key lemma which formalizes the notion of link saturation as discussed in
Remark 3.1 by generalizing the packing lemma stated in [EK11].
Lemma 3.1 (Packing lemma for pairs). Let (U, A, B, C) ∼ p = p(u)p(a|u)p(b|u)p(c|a, b, u). Let Un ∼ ∏i=1..n pU(ui). For each m ∈ {1:2^{nRA}}, let An(m) ∼ ∏i=1..n pA|U(ai | ui). For each l ∈ {1:2^{nRB}}, let Bn(l) ∼ ∏i=1..n pB|U(bi | ui), conditionally independent of each An(m) given Un. Let Cn ∼ ∏i=1..n pC|U(ci | ui), conditionally independent of each An(m) and Bn(l) given Un. There exists a δ(ε) with limε→0 δ(ε) = 0 such that if

min{RA, H(A | U)} + min{RB, H(B | U)} < I(A, B; C | U) − δ(ε),

then P{(Un, An(m), Bn(l), Cn) ∈ T(n)ε for some m, l} → 0 as n → ∞, where typicality, entropies, and mutual information are with respect to p.
Proof: Applying the packing lemma in [EK11] with U = U, X = (A, B), and Y = C immediately establishes the convergence if RA + RB < I(A, B; C | U) − δ(ε). Next, we prove convergence when RB + H(A | U) < I(A, B; C | U) − δ(ε). To this end, we bound the probability in question as

P{(Un, An(m), Bn(l), Cn) ∈ T(n)ε for some m, l}
  ≤ ∑l=1..2^{nRB} ∑un∈T(n)ε(U) P{Un = un} ∑bn∈T(n)ε(B | un) P{Bn(l) = bn | Un = un}
      · P{(un, An(m), bn, Cn) ∈ T(n)ε for some m},                      (3.6)

where the inner sums over un and bn are each bounded by 1. To bound the last probability term, we apply Corollary A.2 with Ai = An(m), D = Cn, Q = T(n)ε(A, C | un, bn), QA = T(n)ε(A | un, bn), and PD given by the bound

P{(un, an, bn, Cn) ∈ T(n)ε} ≤ 2^{−n(I(A,B;C|U)−δ1(ε))},

which follows from the joint typicality lemma in [EK11]. The corollary then implies

P{(un, An(m), bn, Cn) ∈ T(n)ε for some m} ≤ |T(n)ε(A | un, bn)| · 2^{−n(I(A,B;C|U)−δ1(ε))}
  ≤ 2^{n(H(A|U,B)+δ2(ε)−I(A,B;C|U)+δ1(ε))}
  (a)= 2^{n(H(A|U)−I(A,B;C|U)+δ(ε))},

where in step (a), H(A | U, B) = H(A | U) by the Markov chain A − U − B, and δ(ε) = δ1(ε) + δ2(ε). Substituting into (3.6), we have

P{(Un, An(m), Bn(l), Cn) ∈ T(n)ε for some m, l} ≤ 2^{n(RB+H(A|U)−I(A,B;C|U)+δ(ε))}.

Clearly, this probability converges to zero as n → ∞ if RB + H(A | U) < I(A, B; C | U) − δ(ε). In the same manner, convergence follows from RA + H(B | U) < I(A, B; C | U) − δ(ε). Thus convergence is implied by min{RA + RB, RA + H(B | U), H(A | U) + RB} < I(A, B; C | U) − δ(ε), and the desired result follows by recalling that H(A | U) + H(B | U) ≥ I(A, B; C | U). ∎
We are now ready to prove Theorem 3.1. We begin by fixing an input distribution
p(q)p(x1|q)p(x2|q)p(x3|q).
Codebook generation. Randomly generate a sequence qn according to ∏i=1..n pQ(qi). For each k ∈ {1:3}, randomly and conditionally independently generate sequences xnk(mk), mk ∈ {1:2^{nRk}}, each according to ∏i=1..n pXk|Q(xki | qi). From the channel definition, this procedure induces intermediate sequences xnkl(mk) for l ∈ {1:3}, combined interference sequences sn1(m2, m3), sn2(m1, m3), sn3(m1, m2), and output sequences ynk(m1, m2, m3).
Encoding. To send the message mk ∈ {1:2^{nRk}}, k ∈ {1:3}, encoder k transmits xnk(mk).
Decoding. The receivers use simultaneous non-unique decoding. Upon observing yn1, decoder 1 declares that m̂1 has been sent if it is the unique message such that

(qn, xn1(m̂1), sn1(m2, m3), xn21(m2), xn31(m3), yn1) ∈ T(n)ε

for some m2, m3. Decoding at the other receivers is performed similarly.
Analysis of the probability of error. Without loss of generality, assume that mk = 1 for k ∈ {1:3}. Define Emlk = {(Qn, Xn1(m), Sn1(l, k), Xn21(l), Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε}, and the events

E0 = Ec111,
E1 = {Em11 for some m ≠ 1},
E2 = {Eml1 for some m, l ≠ 1},
E3 = {Em1k for some m, k ≠ 1},
E4 = {Emlk for some m, l, k ≠ 1}.

Then the probability of decoding error at the first receiver averaged over codebooks is upper bounded as P(E) = P(E0 ∪ E1 ∪ E2 ∪ E3 ∪ E4) ≤ ∑j=0..4 P(Ej). We bound each term. First, by the law of large numbers, P(E0) → 0 as n → ∞.
Next consider

E1 ⊆ {(Qn, Xn1(m), Sn1(1, 1), Yn1(1, 1, 1)) ∈ T(n)ε for some m ≠ 1}.

By the packing lemma in [EK11], the probability of this event tends to zero as n → ∞ if

R1 < I(X1; Y1 | S1, Q) − δ(ε),

which simplifies to

R1 < H(X11 | Q) − δ(ε).                                  (3.7)
The event E2 can be treated as follows. Consider

E2 ⊆ {(Qn, Xn1(m), Xn21(l), Xn31(1), Yn1(1, 1, 1)) ∈ T(n)ε for some m, l ≠ 1}.

Using Lemma 3.1 with Un = (Qn, Xn31(1)), An = Xn1, Bn = Xn21, and Cn = Yn1(1, 1, 1), we conclude that P(E2) → 0 if

R1 + min{R2, H(X21 | X31, Q)} < I(X1, X21; Y1 | X31, Q) − δ(ε),

or, equivalently,

R1 + min{R2, H(X21 | Q)} < H(Y1 | X31, Q) − δ(ε).        (3.8)
Completely symmetrically, P(E3)→ 0 as n→∞ if
R1 + min{R3, H(X31 |Q)} < H(Y1 |X21, Q)− δ(ε). (3.9)
Finally, we bound the probability of E4 by the following proposition, which is proved below.

Proposition 3.1. P(E4) vanishes as n → ∞ if

R1 + min{R2 + R3, R2 + H(X31 | Q), H(X21 | Q) + R3, H(S1 | Q)} < H(Y1 | Q) − δ(ε).   (3.10)
As per Remark 3.1, the intuition behind this proposition is that the min term represents the effective number of sequences that appear at S1. Recall that S1 is the output of a deterministic multiple access channel with inputs X21 and X31 and input-to-output mapping h1. Figure 3.6 shows the number of output sequences for different ranges of R2 and R3. Note that when (R2, R3) is in the deterministic MAC capacity region, the number of output sequences is simply 2^{n(R2+R3)}. For (R2, R3) outside the capacity region, the number of output sequences saturates in one or both dimensions. The logarithm of the number of output sequences divided by n appears in the min expression of the proposition.
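The counting just described amounts to a one-line function (our own restatement of the min term in (3.10), with rates and entropies in bits):

```python
def interference_exponent(R2, R3, H21, H31, HS1):
    """Exponent of the effective number of combined-interference sequences
    at S1, as annotated in Figure 3.6."""
    return min(R2 + R3,   # both inputs unsaturated (inside the MAC region)
               R2 + H31,  # X31 saturated
               H21 + R3,  # X21 saturated
               HS1)       # both saturated

# With H(X21|Q) = H(X31|Q) = 1 and an invertible h1 (so H(S1|Q) = 2):
assert interference_exponent(0.3, 0.4, 1, 1, 2) == 0.7   # inside MAC region
assert interference_exponent(5.0, 5.0, 1, 1, 2) == 2.0   # fully saturated
```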
Collecting (3.7) to (3.10) yields the conditions defining R1(p). The probability of error at the second and third receivers can be bounded similarly, leading to the conditions defining R2(p) and R3(p). This concludes the proof of Theorem 3.1. ∎
Figure 3.6. Capacity region for a deterministic MAC with inputs at rates R2 and R3. The number of output sequences as a function of the number of input sequences is annotated in each region: 2^{n(R2+R3)} inside the capacity region, 2^{n(R2+H(X31|Q))} and 2^{n(R3+H(X21|Q))} when one input saturates, and 2^{nH(S1|Q)} when both saturate.
Proof of Proposition 3.1: The first and last terms in the min expression follow immediately from Lemma 3.1 by disregarding the special structure of Sn1. To obtain the second term, consider

P(E4) = P{(Qn, Xn1(m), Sn1(l, k), Xn21(l), Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε for some m, l, k ≠ 1}
  ≤ ∑m=2..2^{nR1} ∑l=2..2^{nR2} ∑qn∈T(n)ε(Q) P{Qn = qn} ∑xn1∈T(n)ε(X1 | qn) P{Xn1(m) = xn1 | Qn = qn}
      · ∑xn21∈T(n)ε(X21 | qn) P{Xn21(l) = xn21 | Qn = qn}
      · P{(qn, xn1, xn21, Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε for some k ≠ 1},

where the sums over qn, xn1, and xn21 are each bounded by 1. To bound the last probability term, we apply Corollary A.2 with Ai = Xn31(k), D = Yn1(1, 1, 1), Q = T(n)ε(X31, Y1 | qn, xn1, xn21), QA = T(n)ε(X31 | qn, xn1, xn21), and PD given by the bound

P{(qn, xn1, xn21, xn31, Yn1(1, 1, 1)) ∈ T(n)ε} ≤ 2^{−n(I(X1,X21,X31;Y1|Q)−δ1(ε))},

which is a consequence of the joint typicality lemma in [EK11]. Thus, by the corollary,

P{(qn, xn1, xn21, Xn31(k), Yn1(1, 1, 1)) ∈ T(n)ε for some k ≠ 1}
  ≤ 2^{n(H(X31|Q)+δ2(ε))} · 2^{−n(H(Y1|Q)−δ1(ε))}.

Letting δ(ε) = δ1(ε) + δ2(ε), it follows that P(E4) is bounded by 2^{n(R1+R2+H(X31|Q)−H(Y1|Q)+δ(ε))}, which clearly tends to zero as n → ∞ if R1 + R2 + H(X31|Q) < H(Y1|Q) − δ(ε). Thus the second term in the min expression is established. The third term follows likewise. ∎
3.3 Proof of Theorem 3.2
Proof of achievability: We prove achievability by specializing Theorem 3.1. Specifically,
we show that under strong interference and invertible hk, the regions Rk of Theorem 3.1 simplify to the regions R′′k below while maintaining R1 ∩ R2 ∩ R3 = R′′1 ∩ R′′2 ∩ R′′3.
Recall the definition of R1(p) as the set of rate triples (R1, R2, R3) that satisfy inequali-
ties (3.1) to (3.4). Further recall the analogous definitions of R2 and R3, which include the
inequalities
R2 ≤ H(X22 |Q), (3.11)
R3 ≤ H(X33 |Q). (3.12)
When combined with the strong interference assumption, inequalities (3.11) and (3.12) imply
that the min expressions in (3.2) and (3.3) simplify to R2 and R3, respectively. Furthermore,
the sum of (3.11) and (3.12) implies that
R2 +R3 ≤ H(X22 |Q) +H(X33 |Q)
≤ H(X21 |Q) +H(X31 |Q)
= H(S1 |Q),
where we have used the invertibility of h1. Therefore, the min expression in (3.4) simplifies
to R2 +R3.
Consequently, for p = p(q)p(x1|q)p(x2|q)p(x3|q), define R′′1(p) as the set of rate triples
(R1, R2, R3) such that
R1 ≤ H(X11 |Q),
R1 +R2 ≤ H(Y1 |X31, Q),
R1 +R3 ≤ H(Y1 |X21, Q),
R1 +R2 +R3 ≤ H(Y1 |Q).
Likewise, define the regions R′′2(p) and R′′3(p) by replacing subscripts following 1 → 2 → 3 → 1 and 1 → 3 → 2 → 1 in R′′1(p), respectively. Then Theorem 3.1 implies that
⋃p R′′1(p) ∩ R′′2(p) ∩ R′′3(p)

is achievable, and the theorem follows by expanding the intersection operations. ∎
Proof of converse: Consider a sequence of codes with rates (R1, R2, R3), empirical pmf p(xn1)p(xn2)p(xn3), and P(n)e tending to 0 as n → ∞. First, note that

nR1 ≤ I(Xn1; Yn1) + nεn
    = I(Xn11; Yn1) + nεn
    ≤ H(Xn11) + nεn
    ≤ nH(X11 | Q) + nεn,

where Q is a time-sharing random variable uniformly distributed over {1:n}. Next, consider
n(R1 + R2)
  ≤ I(Xn1; Yn1) + I(Xn2; Yn2) + nεn
  = H(Yn1) − H(Yn1 | Xn1) + H(Yn2) − H(Yn2 | Xn2) + nεn
  = H(Yn1) − H(Sn1) + H(Yn2) − H(Sn2) + nεn
  = H(Yn1) − H(Xn31) + (H(Yn2) − H(Xn21) − H(Xn12) − H(Xn32)) + nεn
  ≤ H(Yn1 | Xn31) + nεn
  ≤ nH(Y1 | X31, Q) + nεn,

where we have used H(Xn22) ≤ H(Xn21) and H(Yn2) ≤ H(Xn22) + H(Xn12) + H(Xn32). In the same way, it can be shown that
n(R1 +R3) ≤ nH(Y1 |X21, Q) + nεn.
Finally,

n(R1 + R2 + R3)
  ≤ H(Yn1) − H(Sn1) + H(Yn2) − H(Sn2) + H(Yn3) − H(Sn3) + nεn
  = H(Yn1) + (H(Yn2) − H(Xn21) − H(Xn12) − H(Xn32))
      + (H(Yn3) − H(Xn31) − H(Xn13) − H(Xn23)) + nεn
  ≤ nH(Y1 | Q) + nεn.
Thus, all four conditions related to the first receiver have been shown. Analogous steps
yield the remaining bounds. Finally, the cardinality bound on Q can be established using
the convex cover method described in [EK11]. ∎
3.4 Proof of Theorem 3.3
We show that the inner bound in Theorem 2.1 is included in the inner bound of Theorem 3.1.
The conditions of region R1 in Theorem 3.1 can be made more stringent by replacing the
min expression with any one of its argument terms. For example, (R1, R2, R3) ∈ R1 is
implied by
R1 ≤ H(X11 |Q),
R1 +H(X21 |Q) ≤ H(Y1 |X31, Q),
R1 +H(X31 |Q) ≤ H(Y1 |X21, Q),
R1 +H(S1 |Q) ≤ H(Y1 |Q),
or, equivalently,
R1 ≤ min{H(X11|Q),
H(Y1|X31, Q)−H(X21|Q),
H(Y1|X21, Q)−H(X31|Q),
H(Y1|Q)−H(S1|Q)}. (3.13)
To simplify this expression, consider
H(X11 |Q) ≥ I(X11;Y1 |Q)
= H(Y1 |Q)−H(Y1 |X11, Q)
= H(Y1 |Q)−H(S1 |Q),
as well as

(H(Y1 | X21, Q) − H(X31 | Q)) − (H(Y1 | Q) − H(S1 | Q))
  = H(Y1, X21 | Q) − H(X21 | Q) − H(S1 | X21, Q) − H(Y1 | Q) + H(S1 | Q)
  = H(X21 | Y1, Q) − H(X21 | S1, Q)
  ≥ H(X21 | Y1, S1, Q) − H(X21 | S1, Q)
  = 0,

where the first equality uses H(X31 | Q) = H(S1 | X21, Q),
and, by symmetry,
(H(Y1 | X31, Q) − H(X21 | Q)) − (H(Y1 | Q) − H(S1 | Q)) ≥ 0.
Thus, the min in (3.13) is always achieved by the last term, and (3.13) simplifies to
R1 ≤ H(Y1 |Q)−H(S1 |Q) = I(X1;Y1 |Q).
Using a similar argument, it follows that the conditions for R2 and R3 in Theorem 3.1 are implied by (2.1). ∎
Remark 3.4. In the case with noisy observations, this proof fails in the following manner.
Interference decoding entails the inequality

R1 ≤ I(X1, S1; Z1 | Q) − H(S1 | Q)
   = I(X1; Z1 | Q) + I(S1; Z1 | X1, Q) − H(S1 | Q)
   = I(X1; Z1 | Q) + H(S1 | X1, Q) − H(S1 | X1, Z1, Q) − H(S1 | Q)
   = I(X1; Z1 | Q) − H(S1 | X1, Z1, Q).
The first term is the achievable rate when treating interference as noise. The second term is
zero when the channel is noiseless and acts as a penalty when noise is introduced.
Chapter 4
Communication with disturbance constraints
The problem of communication with disturbance constraints is motivated by the
broadcast view of the interference channel, in which each sender wishes to communicate a
message only to one of the receivers while causing the least disturbance to the other receivers.
In this chapter1, we focus on studying the problem of communication with disturbance
constraints itself, in isolation from the interference channel.
Alice wishes to communicate a message to Bob while causing the least disturbance to
nearby Dick, Diane, and Diego, who are not interested in the communication from Alice.
Assume a discrete memoryless broadcast channel p(y, z1, . . . , zK |x) between Alice X , Bob
Y , and their preoccupied friends Z1, . . . , ZK as depicted in Figure 4.1. We measure the
disturbance at side receiver Zj by the amount of undesired information rate (1/n)I(Xn;Znj )
originating from the sender X , and require this rate not to exceed Rd,j in the limit. The
problem is to determine the optimal trade-off between the message communication rate R
and the disturbance rates Rd,j .
For a single disturbance constraint, we show that the optimal encoding scheme is rate
splitting and superposition coding, which is the same as the Han–Kobayashi scheme for
the two user-pair interference channel [HK81, CMGE08]. This motivates us to study
communication with more than one disturbance constraint with the hope of finding good
coding schemes for interference channels with more than two user pairs, specifically the
3-DIC defined in Subsection 1.2. (The results in this chapter were first published in [BE11b, BE11c].) To this end, we establish inner and outer bounds on the
Figure 4.1. Communication system with disturbance constraints. The message M is encoded into Xn and sent over p(y, z1, . . . , zK | x); the decoder observes Yn and outputs M̂, while the side receivers observe Zn1, . . . , ZnK subject to (1/n)I(Xn; Znj) ≤ Rd,j.
rate–disturbance region for the deterministic channel model with two disturbance constraints
that are tight in some nontrivial special cases. In the following section we provide needed
definitions and present an extended summary of the results. The proofs are presented in
subsequent sections.
4.1 Results and discussion
Consider the discrete memoryless communication system with K disturbance constraints
(henceforth referred to as DMC-K-DC) depicted in Figure 4.1. The channel consists
of K + 2 finite alphabets X , Y , Zj , j ∈ {1 :K}, and a collection of conditional pmfs
p(y, z1, . . . , zK | x). A (2^{nR}, n) code for the DMC-K-DC consists of the message set {1:2^{nR}}, an encoding function xn : {1:2^{nR}} → Xn, and a decoding function m̂ : Yn → {1:2^{nR}}. We
assume that the message M is uniformly distributed over {1:2nR}. A rate–disturbance tuple
(R,Rd,1, . . . , Rd,K) ∈ RK+1+ is achievable for the DMC-K-DC if there exists a sequence of
(2nR, n) codes such that
lim n→∞ P{M̂ ≠ M} = 0,
lim sup n→∞ (1/n)I(Xn; Znj) ≤ Rd,j,   j ∈ {1:K}.
The rate–disturbance region R of the DMC-K-DC is the closure of the set of all achievable
tuples (R,Rd,1, . . . , Rd,K).
Remark 4.1. Like the message rate R, the disturbance rates Rd,j, for j ∈ {1:K}, are measured in units of bits per channel use.
Remark 4.2. The disturbance measure (1/n)I(Xn; Znj) can be expanded as (1/n)H(Znj) − (1/n)H(Znj | Xn). The first term is the entropy rate of the received signal Zj and is caused
by both the transmission itself and by noise inherent to the channel. Subtracting the second
term separates out the noise part. (For channels with additive white noise, e.g., the Gaussian
case, the second term is exactly the differential entropy of each noise sample.)
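As a sanity check of this decomposition, note that for a memoryless channel with i.i.d. inputs the measure reduces to the single-letter I(X; Z). The following sketch (our own example, not from the text) evaluates it for a binary symmetric side channel:

```python
import numpy as np

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p*np.log2(p) - (1-p)*np.log2(1-p)

def bsc_disturbance(px1, eps):
    """Per-symbol disturbance rate I(X; Z) for X ~ Bern(px1) observed
    through a BSC(eps): H(Z) minus the channel-noise part H(Z|X) = h2(eps)."""
    pz1 = px1 * (1 - eps) + (1 - px1) * eps
    return h2(pz1) - h2(eps)

assert bsc_disturbance(0.5, 0.0) == 1.0          # noiseless: full input entropy leaks
assert abs(bsc_disturbance(0.5, 0.5)) < 1e-12    # pure noise: nothing leaks
```

Subtracting h2(eps) is precisely the separation of the noise part described in the remark.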
Remark 4.3. The results in this chapter remain essentially true if disturbance is measured
by (1/n)H(Znj ) instead. If the channel is deterministic, the two measures coincide.
Remark 4.4. The disturbance constraint (1/n)I(Xn;Znj ) ≤ Rd,j is reminiscent of the
information leakage rate constraint for the wiretap channel [Wyn75, CK78], which is of
the form (1/n)I(M ;Znj ) ≤ Rleak. Replacing M with Xn, however, dramatically changes
the problem and the optimal coding scheme. In the wiretap channel, the key component of
the optimal encoding scheme is randomized encoding, which helps control the leakage rate
(1/n)I(M ;Znj ). Such randomization reduces the achievable transmission rate for a given
disturbance constraint, hence is not desirable in our setting.
The rate–disturbance region is not known in general. We establish the following results.
4.1.1 Rate–disturbance region for a single disturbance constraint
Consider the case with a single disturbance constraint, i.e., K = 1, and relabel Z1 as Z and
Rd,1 as Rd. We fully characterize the rate–disturbance region for this case.
Theorem 4.1 (Rate–disturbance region of DMC-1-DC). The rate–disturbance region R of the DMC-1-DC is the set of rate pairs (R, Rd) such
that
R ≤ I(X;Y ),
Rd ≥ I(X;Z |U),
R−Rd ≤ I(X;Y |U)− I(X;Z |U),
for some pmf p(u, x) with |U| ≤ |X |+ 1.
Let R(U,X) be the rate region defined by the rate constraints in the theorem for a
fixed joint pmf (U,X) ∼ p(u, x). This rate region is illustrated in Figure 4.2. The rate–
disturbance region is simply the union of these regions over all p(u, x) and is convex without
the need for a time-sharing random variable.
The proof of Theorem 4.1 is given in Subsections 4.2.1 and 4.2.2. Achievability is
established using rate splitting and superposition coding. Receiver Y decodes the satellite
codeword while receiver Z distinguishes only the cloud center. Note that this encoding
scheme is identical to the Han–Kobayashi scheme for the two user-pair interference chan-
nel [HK81, CMGE08].
We now consider three interesting special cases.
Deterministic channel
Assume that Y and Z are deterministic functions of X . We show that the rate–disturbance
region in Theorem 4.1 reduces to the following.
Corollary 4.1 (Rate–disturbance region of deterministic DMC-1-DC). The rate–disturbance region for the deterministic channel with one disturbance constraint
is the set of rate pairs (R,Rd) such that
R ≤ H(Y ),
R−Rd ≤ H(Y |Z),
for some pmf p(x).
Clearly, this region is convex. Alternatively, the region can be written as the set of rate
pairs (R,Rd) such that
R ≤ H(Y |Q),
Rd ≥ I(Y ;Z |Q),
for some joint pmf p(q, x) with |Q| ≤ 2. Corollary 4.1 and the alternative description of the
Figure 4.2. Example of R(U,X), the constituent region of R, in the (Rd, R) plane; the quantities I(X;Y), I(X;Y|U), and I(X;Z|U) are marked on the axes, and one boundary face has 45° slope.
region are established by substituting U = Z in the region of Theorem 4.1 and simplifying
the resulting region as detailed in Subsection 4.2.3.
Remark 4.5. Recall the 2-DIC of Subsection 1.1.1 (see Figure 1.2 on page 5). According
to Theorem 1.1, the capacity region is achieved by the Han–Kobayashi scheme in which
the transmitters use superposition codebooks generated according to p(x12)p(x1|x12) and
p(x21)p(x2|x21). Now consider the dashed orange boxes in Figure 4.3, where some of the
signals are relabeled with respect to Figure 1.2. Corollary 4.1 shows that the same encoding
scheme achieves the disturbance-constrained capacity for the channels X1 → (Y ′1 , Z1) and
X2 → (Y ′2 , Z2), shown as dashed boxes in Figure 4.3. Here, Y ′1 and Y ′2 are the desired
receivers, and Z1 and Z2 are the side receivers associated with disturbance constraints. Note
that decodability of the desired messages at receivers Y1 and Y2 in the interference channel
implies decodability at Y ′1 and Y ′2 in the channels with disturbance constraint, respectively.
Example 4.1. Consider the deterministic channel depicted in Figure 4.4(a) and its rate–
disturbance region in Figure 4.4(b). Note that rates R ≤ 1 can be achieved with zero
disturbance rate by restricting the transmission to input symbols {0, 1} (or {2, 3}), which
map to different symbols at Y , but are indistinguishable at Z. On the other hand, for large Rd, the disturbance constraint is inactive and R is bounded only by the unconstrained capacity
log(3). In addition to the optimal region achieved by superposition coding, the figure also
shows the strictly suboptimal region achieved by simple non-layered random codes.
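The two corner points of Corollary 4.1 can be checked numerically for a channel of this type. The sketch below is a minimal illustration; the maps y_of_x and z_of_x are an assumption chosen to be consistent with the description above (inputs {0, 1} and {2, 3} collide at Z but not at Y), and the actual maps of Figure 4.4(a) may differ.

```python
import numpy as np

# Hypothetical deterministic maps on X = {0,1,2,3}, consistent with the text:
# {0,1} (and {2,3}) map to distinct Y symbols but are indistinguishable at Z,
# and Y takes three values, so the unconstrained capacity is log2(3).
y_of_x = [0, 1, 2, 0]   # Y as a function of X
z_of_x = [0, 0, 1, 1]   # Z as a function of X

def entropy(p):
    p = np.asarray([q for q in p if q > 0])
    return -np.sum(p * np.log2(p))

def corner_points(px):
    """Corner points P1 = (H(Y|Z), 0) and P2 = (H(Y), I(Y;Z)) for fixed p(x)."""
    py, pz, pyz = np.zeros(3), np.zeros(2), np.zeros((3, 2))
    for x, p in enumerate(px):
        py[y_of_x[x]] += p
        pz[z_of_x[x]] += p
        pyz[y_of_x[x], z_of_x[x]] += p
    h_y = entropy(py)
    h_y_given_z = entropy(pyz.flatten()) - entropy(pz)
    return (h_y_given_z, 0.0), (h_y, h_y - h_y_given_z)

# Restricting the input to {0,1} achieves rate 1 at zero disturbance rate.
p1, p2 = corner_points([0.5, 0.5, 0.0, 0.0])
```

Sweeping p(x) over a grid and taking the convex hull of all such corner points reproduces the superposition-coding region of Figure 4.4(b).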
Figure 4.3. The link between 2-DIC and communication with disturbance constraints. (Encoders M1 → X1 and M2 → X2; link functions g11, g12, g21, g22 and f1, f2; receivers Y1, Y2 with side outputs Z1, Z2; dashed boxes mark the channels X1 → (Y′1, Z1) and X2 → (Y′2, Z2).)
Figure 4.4. Deterministic example with one disturbance constraint. (a) Channel block diagram: M → X ∈ {0, 1, 2, 3}, with deterministic outputs Y ∈ {0, 1, 2} and Z ∈ {0, 1}. (b) Rate–disturbance region (Rd versus R), comparing single-user codebooks with superposition codebooks.
Gaussian channel
Consider the problem of communication with one disturbance constraint for the Gaussian
channel
Y = X +W1,
Z = X +W2,
where the noise is W1 ∼ N (0, 1) and W2 ∼ N (0, N). Assume an average power constraint
P on the transmitted signal X .
The case N ≤ 1 is not interesting, since then Y is a degraded version of Z and the
disturbance rate is simply given by the data rate R. If N > 1, Z is a degraded version of Y ,
and the rate–disturbance region reduces to the following.
Corollary 4.2 (Gaussian channel with one disturbance constraint). The rate–disturbance region of the Gaussian channel with parameters P > 0 and N > 1
is the set of rate pairs (R,Rd) such that
R ≤ C(αP ),
Rd ≥ C(αP/N),
for some α ∈ [0, 1], where C(x) = (1/2) log(1 + x) for x ≥ 0.
Achievability is proved using Gaussian codes with power αP. The converse follows by defining α⋆ ∈ [0, 1] such that R = C(α⋆P) and applying the vector entropy power inequality to Z^n = Y^n + W̃_2^n, where W̃_2 ∼ N(0, N − 1) is the excess noise. The details are given in Subsection 4.2.4. An alternative proof of the corollary using the relation between mutual information and minimum mean-square error estimation [GSV05] was given in [BS11].
Note that this is a degenerate form of the Han–Kobayashi scheme because the constraint
from the multiple access side of the interference channel is not taken into consideration.
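Corollary 4.2 parameterizes the boundary of the region by a single power-control parameter α, which makes it easy to trace numerically. A minimal sketch (rates in bits; the values P = 10 and N = 4 are chosen purely for illustration):

```python
import numpy as np

def C(x):
    """C(x) = (1/2) log(1 + x), computed in bits here for concreteness."""
    return 0.5 * np.log2(1.0 + x)

def gaussian_region_boundary(P, N, num=101):
    """Boundary of the rate-disturbance region of Corollary 4.2, traced by
    sweeping the power-control parameter alpha over [0, 1]."""
    alphas = np.linspace(0.0, 1.0, num)
    R = C(alphas * P)          # achievable message rate
    Rd = C(alphas * P / N)     # minimum disturbance rate at receiver Z
    return R, Rd

# Example: P = 10, N = 4 (Z is a degraded version of Y since N > 1).
R, Rd = gaussian_region_boundary(10.0, 4.0)
# alpha = 1 gives the unconstrained capacity C(P) at disturbance rate C(P/N).
```

Since N > 1, every point satisfies Rd ≤ R: reducing power trades message rate for a more-than-proportional reduction in disturbance.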
Vector Gaussian channel
Now consider the vector Gaussian channel with one disturbance constraint
Y = X +W1,
Z = X +W2,
where X ∈ Rn and the noise is W1 ∼ N (0, K1) and W2 ∼ N (0, K2) for some positive
semidefinite covariance matrices K1, K2 ∈ Rn×n. Assume an average transmit power
constraint tr(KX) ≤ P , where KX = E(XXT) is the covariance matrix of X . This case is
not degraded in general.
Theorem 4.2 (Gaussian vector channel with one disturbance constraint). The rate–disturbance region of the Gaussian vector channel with parameters P, K1, and K2 is the convex hull of the set of pairs (R,Rd) such that

R ≤ (1/2) log( |KU + KV + K1| / |K1| ),
R − Rd ≤ (1/2) log( (|KV + K1| |K2|) / (|KV + K2| |K1|) ),
Rd ≥ (1/2) log( |KV + K2| / |K2| ),

for some positive semidefinite matrices KU, KV ∈ Rn×n with tr(KU + KV) ≤ P.
Achievability of this rate–disturbance region is shown by applying Theorem 4.1. Using
the discretization procedure in [EK11], it can be shown that the theorem continues to hold
with the power constraint additionally applied to the set of permissible input distributions.
The claimed region then follows by considering the special case where the input distribution
p(u, x) is jointly Gaussian. To prove the converse, we use an extremal inequality in [LV07]
to show that Gaussian input distributions are sufficient. The details of the proof are given in
Subsection 4.2.5.
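The three constraints of Theorem 4.2 are easy to evaluate for candidate covariance splits. A small sketch, with the R − Rd bound written in the equivalent form I(X;Y|U) − I(X;Z|U) from Theorem 4.1 evaluated for Gaussian inputs; the parameter matrices are an assumption chosen for illustration:

```python
import numpy as np

def logdet(K):
    """log2 |K| via the numerically stable slogdet."""
    return np.linalg.slogdet(K)[1] / np.log(2)

def theorem_4_2_bounds(KU, KV, K1, K2):
    """Evaluate the three constraints of Theorem 4.2 for fixed KU, KV:
    returns (max R, max R - Rd, min Rd), all in bits."""
    R_max = 0.5 * (logdet(KU + KV + K1) - logdet(K1))
    RmRd_max = 0.5 * (logdet(KV + K1) + logdet(K2)
                      - logdet(KV + K2) - logdet(K1))
    Rd_min = 0.5 * (logdet(KV + K2) - logdet(K2))
    return R_max, RmRd_max, Rd_min

# Illustrative parameters for n = 2 (assumed, not from the text).
K1 = np.eye(2)
K2 = np.diag([2.0, 0.5])
KU = np.diag([1.0, 0.0])
KV = np.diag([0.0, 1.0])
R_max, RmRd_max, Rd_min = theorem_4_2_bounds(KU, KV, K1, K2)
```

In the scalar case (n = 1, K1 = 1, K2 = N, KV = αP) the expressions collapse to the bounds of Corollary 4.2.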
4.1.2 Inner and outer bounds for the deterministic channel with two disturbance constraints
The correspondence between optimal encoding for the channel with one disturbance constraint and the Han–Kobayashi scheme for the interference channel suggests that the optimal
coding scheme for K disturbance constraints may provide an efficient (if not optimal)
scheme for the interference channel with more than two user pairs. This is particularly true
for the 3-DIC, since it is an extension of the 2-DIC for which the Han–Kobayashi scheme
is optimal (see Remark 4.5). Consequently, we restrict our attention to the deterministic
version of the DMC-2-DC.
First, we establish the following inner bound on the rate–disturbance region.
Theorem 4.3 (Inner bound for deterministic DMC-2-DC). The rate–disturbance region R of the deterministic channel with two disturbance constraints is inner-bounded by the set of rate triples (R,Rd,1, Rd,2) such that
R ≤ H(Y ), (4.1)
Rd,1 +Rd,2 ≥ I(Z1;Z2 |U), (4.2)
R−Rd,1 ≤ H(Y |Z1, U), (4.3)
R−Rd,2 ≤ H(Y |Z2, U), (4.4)
R−Rd,1 −Rd,2 ≤ H(Y |Z1, Z2, U)− I(Z1;Z2 |U), (4.5)
2R−Rd,1 −Rd,2 ≤ H(Y |Z1, Z2, U) +H(Y |U)− I(Z1;Z2 |U), (4.6)
for some pmf p(u, x).
The inner bound is convex. The region R(U,X) defined by the inequalities in the
theorem for a fixed p(u, x) is illustrated in Figure 4.5. The expression I(Z1;Z2 |U) appears
in three of the inequalities. As in Marton coding for the 2-receiver broadcast channel with a
common message, it is the penalty incurred in encoding independent messages via correlated
sequences. The auxiliary random variable U can be used to reduce this penalty. For example,
if U is chosen such that it is a function of both Z1 and Z2 individually, then
I(Z1;Z2 |U) = H(Z1 |U) +H(Z2 |U)−H(Z1, Z2 |U)
= H(Z1) +H(Z2)−H(Z1, Z2)−H(U)
= I(Z1;Z2)−H(U),
i.e., the penalty I(Z1;Z2 |U) with U is less than the penalty I(Z1;Z2) without U .
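The identity above is easy to verify numerically. A toy check, assuming (purely for illustration) a joint pmf built from three independent fair bits A, B, C with Z1 = (A, B), Z2 = (A, C), and common function U = A, so that U is a function of Z1 alone and of Z2 alone:

```python
import itertools
import numpy as np

def H(p):
    p = np.asarray([q for q in p if q > 1e-12])
    return -np.sum(p * np.log2(p))

# Toy joint distribution: Z1 = (A, B), Z2 = (A, C) for fair, independent bits.
joint = {}
for a, b, c in itertools.product([0, 1], repeat=3):
    key = ((a, b), (a, c))
    joint[key] = joint.get(key, 0.0) + 1.0 / 8.0

def mutual_info(joint):
    """I(Z1; Z2) from a joint pmf given as {(z1, z2): prob}."""
    p1, p2 = {}, {}
    for (z1, z2), p in joint.items():
        p1[z1] = p1.get(z1, 0.0) + p
        p2[z2] = p2.get(z2, 0.0) + p
    return H(p1.values()) + H(p2.values()) - H(joint.values())

def cond_mutual_info(joint, u_of_z1):
    """I(Z1; Z2 | U) where U = u_of_z1(z1) is a common function."""
    total = 0.0
    for u in {u_of_z1(z1) for (z1, _) in joint}:
        sub = {k: p for k, p in joint.items() if u_of_z1(k[0]) == u}
        pu = sum(sub.values())
        total += pu * mutual_info({k: p / pu for k, p in sub.items()})
    return total

i_full = mutual_info(joint)                      # I(Z1;Z2)
i_cond = cond_mutual_info(joint, lambda z1: z1[0])  # I(Z1;Z2|U), U = A
h_u = 1.0                                        # H(A) for a fair bit
# Identity from the text: I(Z1;Z2|U) = I(Z1;Z2) - H(U)
```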
Remark 4.6. The right-hand side of condition (4.6) can be equivalently expressed as
H(Y |Z1, Z2, U) +H(Y |U)− I(Z1;Z2 |U)
= H(Y |Z1, U) + H(Y |Z2, U) − I(Z1;Z2 |U, Y ).

This shows that the condition is at least as strict as the sum of conditions (4.3) and (4.4).
The encoding scheme for Theorem 4.3 involves rate splitting, Marton coding, and
superposition coding. The analysis of the probability of error, however, is complicated by the
fact that receiver Y wishes to decode all parts of the message as detailed in Subsection 4.3.1.
Receivers Z1 and Z2 each observe a satellite codeword from a superposition codebook.
Note that the encoding scheme can be readily extended to the general (non-deterministic)
DMC-2-DC.
Figure 4.5. Constituent region R(U,X) for Theorem 4.3, in (Rd,1, Rd,2, R) space. Each face is annotated by the inequality (4.1)–(4.6) that defines it.
To complement the inner bound, we establish the following outer bound on the rate–disturbance region of the deterministic channel with two disturbance constraints.
Theorem 4.4 (Outer bound for deterministic DMC-2-DC). If a rate triple (R,Rd,1, Rd,2) is achievable for the deterministic channel with two
disturbance constraints, then it must satisfy the conditions
R ≤ H(Y |Q),
Rd,1 ≥ I(Y ;Z1 |Q),
Rd,2 ≥ I(Y ;Z2 |Q),
for some pmf p(q, x) with |Q| ≤ 3.
The proof of this outer bound is given in Subsection 4.3.2. Note that this outer bound is
very similar in form to the alternative description of Corollary 4.1 for the single-constraint
deterministic case.
The inner bound in Theorem 4.3 and the outer bound in Theorem 4.4 coincide in some
special cases. To discuss these, we introduce the following notation. Since all channel
outputs are functions of X , they can be equivalently thought of as set partitions of the
input alphabet X . Set partitions form a partially ordered set (poset) under the refinement
relation. Since this poset is a complete lattice [Sta11], the following concepts are well-defined. For two set partitions (functions) f and g, let f ≼ g denote that f is a refinement of g (equivalently, g is degraded with respect to f ), let f ∧ g be the intersection of the two set partitions (the function that returns both f and g), and let f ∨ g denote the finest set partition of which both f and g are refinements (the Gács–Körner–Witsenhausen common part of f and g, cf. [GK73, Wit75]).
The inner bound of Theorem 4.3 coincides with the outer bound of Theorem 4.4 if Z1 or
Z2 is a degraded version of Y ∧ (Z1 ∨ Z2), i.e., if the output Y together with the common
part of Z1 and Z2 determine Z1 or Z2 completely.
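These lattice operations are straightforward to compute for functions on a finite alphabet. A sketch with hypothetical example partitions (the merging loop computes connected components of the blocks, which is one elementary way to realize the common part f ∨ g):

```python
from itertools import product

# Functions on X = {0,1,2,3}, represented as output lists (assumed examples).
f = [0, 0, 1, 1]   # partition {0,1}, {2,3}
g = [0, 1, 1, 1]   # partition {0}, {1,2,3}

def is_refinement(f, g):
    """f is a refinement of g: f determines g (g degraded w.r.t. f)."""
    table = {}
    for a, b in zip(f, g):
        if table.setdefault(a, b) != b:
            return False
    return True

def meet(f, g):
    """f ^ g: the function that returns both f and g."""
    return [(a, b) for a, b in zip(f, g)]

def join(f, g):
    """f v g: the finest partition of which both f and g are refinements
    (the Gacs-Korner-Witsenhausen common part).  Merge any two inputs that
    share an f-block or a g-block, until the labeling is stable."""
    labels = list(range(len(f)))          # start from singletons
    changed = True
    while changed:
        changed = False
        for x, y in product(range(len(f)), repeat=2):
            if (f[x] == f[y] or g[x] == g[y]) and labels[x] != labels[y]:
                new, old = min(labels[x], labels[y]), max(labels[x], labels[y])
                labels = [new if l == old else l for l in labels]
                changed = True
    return labels
```

For the f and g above, the common part is the trivial single-block partition, since the blocks of f and g overlap everywhere.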
Theorem 4.5 (Rate–disturbance region of certain deterministic DMC-2-DC). The rate–disturbance region R of the deterministic channel with two disturbance constraints is given by the outer bound of Theorem 4.4 if

Y ∧ (Z1 ∨ Z2) ≼ Z1, or
Y ∧ (Z1 ∨ Z2) ≼ Z2.
The theorem is proved by specializing Theorem 4.3 as detailed in Subsection 4.3.3. In the case when Z1 or Z2 is a degraded version of Y alone, achievability follows by setting U = ∅ in Theorem 4.3. Otherwise, we let U = Z1 ∨ Z2. This is intuitive, since U corresponds to the common-message step in the Marton encoding scheme.
Example 4.2. Consider the deterministic channel depicted in Figure 4.6(a). The desired
receiver output Y is a refinement of both side receiver outputs Z1 and Z2, and hence, Theorem 4.5 applies. Figure 4.6(b) depicts the rate–disturbance region, numerically approximated
by evaluating each point in a regular grid over the distributions p(x) and subsequently taking
the convex hull. Figure 4.7(a) contrasts the single-constraint case (Rd,2 is set to infinity
and thus inactive) with the case where both side receivers are under the same disturbance
rate constraint (Rd,1 = Rd,2). As expected, imposing an additional disturbance constraint
can significantly reduce the achievable message rate. Figure 4.7(b) illustrates the trade-off
between the disturbance rates Rd,1 and Rd,2 at the two side receivers, for a fixed data rate R.
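The numerical sweep described above can be sketched for the outer bound of Theorem 4.4: each p(x) contributes a candidate point (H(Y), I(Y;Z1), I(Y;Z2)), and the convex hull over all p(x) plays the role of the time-sharing variable Q. The deterministic maps below are stand-ins, not the actual maps of Figure 4.6(a):

```python
import itertools
import numpy as np

def H(p):
    p = p[p > 1e-12]
    return -np.sum(p * np.log2(p))

# Stand-in deterministic maps on X = {0,1,2,3} with Y a refinement of Z1, Z2.
y_of_x  = [0, 1, 2, 3]
z1_of_x = [0, 1, 2, 2]
z2_of_x = [0, 0, 1, 2]

def outer_bound_point(px):
    """Candidate point (R, Rd1, Rd2) = (H(Y), I(Y;Z1), I(Y;Z2)) of
    Theorem 4.4 for a fixed p(x)."""
    px = np.asarray(px)
    def dist(m, size):
        p = np.zeros(size)
        for x, q in enumerate(px):
            p[m[x]] += q
        return p
    hy = H(dist(y_of_x, 4))
    def mi(zmap, zsize):
        joint = np.zeros((4, zsize))
        for x, q in enumerate(px):
            joint[y_of_x[x], zmap[x]] += q
        return hy + H(dist(zmap, zsize)) - H(joint.flatten())
    return hy, mi(z1_of_x, 3), mi(z2_of_x, 3)

# Coarse grid over p(x); the region is the convex hull of these points.
grid = [p for p in itertools.product(np.linspace(0, 1, 6), repeat=3)
        if sum(p) <= 1.0]
points = [outer_bound_point(list(p) + [max(0.0, 1.0 - sum(p))]) for p in grid]
```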
Figure 4.6. Deterministic channel with two disturbance constraints (Example 4.2). (a) Block diagram of the channel: M → X ∈ {0, 1, 2, 3}, with deterministic outputs Y and side outputs Z1, Z2 ∈ {0, 1, 2}. (b) Rate–disturbance region in (Rd,1, Rd,2, R) space.
Figure 4.7. Two-dimensional projections of the rate–disturbance region for Example 4.2. (a) Rate R versus disturbance rate Rd for a single disturbance constraint (Rd,1 = Rd, Rd,2 = ∞) and for symmetric disturbance constraints (Rd,1 = Rd,2 = Rd). (b) Contour lines of the rate–disturbance region in the (Rd,1, Rd,2) plane at constant rates R = 1.1, 1.2, . . . , 2.0.
We conclude this section by considering another case in which we can fully characterize
the rate–disturbance region of the deterministic channel with two disturbance constraints. If
Z1 is a degraded version of Z2 (or vice versa), the region R of Theorem 4.3 is optimal and
simplifies to the following.
Corollary 4.3 (Rate–disturbance region with degraded side receivers). The rate–disturbance region R of the deterministic channel with two disturbance constraints with Z1 ≼ Z2 or Z2 ≼ Z1 is the set of rate triples (R,Rd,1, Rd,2) such that

R ≤ H(Y ),
R − Rd,1 ≤ H(Y |Z1),
R − Rd,2 ≤ H(Y |Z2),
for some pmf p(x).
Achievability follows as a special case of Theorem 4.3. The encoding scheme underlying
the theorem carefully avoids introducing an ordering between the side receiver signals Z1
and Z2, but such an ordering is naturally given by the channel here. Consequently, the corollary
follows by setting the auxiliary U equal to the output at the degraded side receiver. This
turns the encoding scheme into superposition coding with three layers. The details are given
in Subsection 4.3.4.
Note that the region of Corollary 4.3 is akin to the deterministic case with one disturbance
constraint in Corollary 4.1. In both cases, the side receiver signals need not be degraded
with respect to Y .
4.2 Proofs for a single disturbance constraint
4.2.1 Proof of achievability for Theorem 4.1
Achievability is proved as follows.
Codebook generation. Fix a pmf p(u, x).
1. Split the message M into two independent messages M0 and M1 with rates R0 and
R1, respectively. Hence R = R0 +R1.
2. For each m0 ∈ {1 : 2^{nR0}}, independently generate a sequence u^n(m0) according to ∏_{i=1}^n p(ui).

3. For each (m0,m1) ∈ {1 : 2^{nR0}} × {1 : 2^{nR1}}, independently generate a sequence x^n(m0,m1) according to ∏_{i=1}^n p(xi | ui(m0)).
Encoding. To send message m = (m0,m1), transmit xn(m0,m1).
Decoding. Upon receiving y^n, declare that (m̂0, m̂1) has been sent if it is the unique message pair such that

(u^n(m̂0), x^n(m̂0, m̂1), y^n) ∈ T_ε^{(n)}(U,X, Y ).
Analysis of the probability of error. We are using a superposition code over the channel
from X to Y . Using the law of large numbers and the packing lemma in [EK11], it can be
shown that the probability of error tends to zero as n→∞ if
R1 < I(X;Y |U)− δ(ε), (4.7)
R0 +R1 < I(X;Y )− δ(ε). (4.8)
Analysis of disturbance rate. In the following, we analyze the disturbance rate averaged
over codebooks C.
I(X^n;Z^n | C) ≤ H(Z^n,M0 | C) − H(Z^n |X^n, C)
= H(M0) + H(Z^n |M0, C) − H(Z^n |X^n)
(a)
≤ nR0 + H(Z^n |U^n) − nH(Z |X)
≤ nR0 + nH(Z |U) − nH(Z |X,U)
= nR0 + nI(X;Z |U)
≤ nRd, (4.9)
where (a) follows since U^n is a function of the codebook C and M0. Substituting R = R0 + R1 and using Fourier–Motzkin elimination on inequalities (4.7), (4.8), and (4.9) completes the proof of achievability. □
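For completeness, the Fourier–Motzkin step can be written out; the display below is a reconstruction of the routine elimination, not taken verbatim from the text.

```latex
% Eliminate R_0, R_1 from (4.7)--(4.9), using R = R_0 + R_1 and R_0, R_1 \ge 0.
\begin{align*}
R &= R_0 + R_1 < I(X;Y) - \delta(\epsilon)
  && \text{directly from (4.8)},\\
R - R_d &\le (R_0 + R_1) - \bigl(R_0 + I(X;Z\,|\,U)\bigr)
  && \text{since (4.9) gives } R_d \ge R_0 + I(X;Z\,|\,U)\\
&< I(X;Y\,|\,U) - I(X;Z\,|\,U) - \delta(\epsilon)
  && \text{from (4.7)},\\
R_d &\ge R_0 + I(X;Z\,|\,U) \ge I(X;Z\,|\,U)
  && \text{from (4.9) and } R_0 \ge 0,
\end{align*}
% which are exactly the three constraints of Theorem 4.1.
```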
4.2.2 Proof of converse for Theorem 4.1
Consider a sequence of codes with P_e^{(n)} → 0 as n → ∞ and the joint pmf that it induces on (M,X^n, Y^n, Z^n), assuming M ∼ Unif{1 : 2^{nR}}. Define the time-sharing random variable Q ∼ Unif{1 : n}, independent of everything else. We use the identification U = (Q, Y_{Q+1}^n, Z^{Q−1}), and let X = X_Q, Y = Y_Q, and Z = Z_Q. Note that (X, Y, Z) is consistent with the channel. Then
R ≤ I(X;Y ) + εn,
as in the converse proof for point-to-point channel capacity, which uses the same identifica-
tions of random variables. On the other hand,
nRd ≥ I(X^n;Z^n)
= H(Z^n) − H(Z^n |X^n)
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}) − H(Z_i |X_i) )
≥ ∑_{i=1}^n H(Z_i |Z^{i−1}, Y_{i+1}^n) − nH(Z |X)
= nH(Z |U) − nH(Z |X,U)
= nI(X;Z |U).
Finally,

n(Rd − R) ≥ I(X^n;Z^n) − nR
(a)
≥ H(Z^n) − H(Z^n |X^n) − I(M ;Y^n) − nε_n
(b)
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}) − I(M ;Y_i |Y_{i+1}^n) ) − nH(Z |X) − nε_n
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) + I(Y_{i+1}^n;Z_i |Z^{i−1}) − H(Y_i |Y_{i+1}^n) + H(Y_i |M, Y_{i+1}^n) ) − nH(Z |X) − nε_n
(c)
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) + I(Y_i;Z^{i−1} |Y_{i+1}^n) − H(Y_i |Y_{i+1}^n) + H(Y_i |X_i) ) − nH(Z |X) − nε_n
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) − H(Y_i |Z^{i−1}, Y_{i+1}^n) + H(Y_i |X_i, Z^{i−1}, Y_{i+1}^n) ) − nH(Z |X) − nε_n
= ∑_{i=1}^n ( H(Z_i |Z^{i−1}, Y_{i+1}^n) − I(X_i;Y_i |Z^{i−1}, Y_{i+1}^n) ) − nH(Z |X) − nε_n
(d)
= nH(Z |U) − nI(X;Y |U) − nH(Z |X,U) − nε_n
= nI(X;Z |U) − nI(X;Y |U) − nε_n,
where (a) uses Fano’s inequality, (b) single-letterizes the noise term H(Z^n |X^n) with equality due to memorylessness of the channel, (c) applies Csiszár’s sum identity to the second term and channel memorylessness to the fourth term, and (d) uses the previous definitions of auxiliary random variables. Finally, the cardinality bound on U is established using the convex cover method in [EK11]. □
4.2.3 Proof of Corollary 4.1
Using the deterministic nature of the channel, the region in Theorem 4.1 reduces to the set
of rate pairs (R,Rd) such that
R ≤ H(Y ), (4.10)
Rd ≥ H(Z |U), (4.11)
Rd ≥ R +H(Z |U)−H(Y |U), (4.12)
for some pmf p(u, x). Now fixing a rate R and a pmf p(x) and varying p(u|x) to minimize
Rd, the right hand sides of (4.11) and (4.12) are lower bounded by
H(Z |U) ≥ 0,
and
R +H(Z |U)−H(Y |U) = R +H(Z |U)−H(Y, Z |U) +H(Z |Y, U)
= R−H(Y |Z,U) +H(Z |Y, U)
≥ R−H(Y |Z).
Note that the particular choice U = Z simultaneously achieves both lower bounds with
equality and is therefore sufficient. The rate–disturbance region thus reduces to Corollary 4.1.
For a fixed pmf p(x), this region has exactly two corner points: P1 = (H(Y |Z), 0) and
P2 = (H(Y ), I(Y ;Z)). As we vary p(x), there is one corner point P1 that dominates all
other P1 points. The pmf p(x) for this dominant P1 can be constructed by maximizing
H(Y |Z) as follows. For each z ∈ Z , define Yz ⊆ Y to be the set of y symbols that are
compatible with z. Let z⋆ be a symbol that maximizes |Yz|. For each element of Yz⋆, pick exactly one x that is compatible with it and z⋆. Finally, place equal probability mass on each of these x values, and zero mass on all others. This pmf on X yields the dominant corner point P1, namely (log(|Yz⋆|), 0). Moreover, for this distribution, P2 coincides with P1. Therefore, the net contribution (modulo convexification) of each pmf p(x) to the rate–disturbance region amounts to its corner point P2. This implies the alternative description of the region. The cardinality bound on Q follows from the convex cover method in [EK11]. □
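The corner-point construction can be stated as a short procedure. A sketch, reusing the hypothetical deterministic maps assumed for Example 4.1 (not the actual maps of the text):

```python
import numpy as np

# Hypothetical deterministic maps on X = {0,1,2,3} (assumed for illustration).
y_of_x = [0, 1, 2, 0]
z_of_x = [0, 0, 1, 1]

def dominant_p1_pmf(y_of_x, z_of_x):
    """Construct the p(x) maximizing H(Y|Z): find the z* whose compatible
    set Y_z is largest, pick one x per compatible y, and spread probability
    uniformly over those x.  Returns the pmf and the rate log2 |Y_{z*}|."""
    # Y_z: set of y symbols compatible with each z
    y_sets = {}
    for x in range(len(y_of_x)):
        y_sets.setdefault(z_of_x[x], set()).add(y_of_x[x])
    z_star = max(y_sets, key=lambda z: len(y_sets[z]))
    chosen = {}
    for x in range(len(y_of_x)):          # one x per (y, z*) pair
        if z_of_x[x] == z_star and y_of_x[x] not in chosen:
            chosen[y_of_x[x]] = x
    px = np.zeros(len(y_of_x))
    px[list(chosen.values())] = 1.0 / len(chosen)
    return px, np.log2(len(chosen))
```

For these maps the construction recovers the uniform distribution on {0, 1} and the zero-disturbance rate point (1, 0).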
4.2.4 Proof of Corollary 4.2
Achievability is straightforward using a random Gaussian codebook with power control, and
upper-bounding the disturbance rate at receiver Z by white Gaussian noise. The converse
can be seen as follows. Clearly, R ≤ C(P). Let α⋆ ∈ [0, 1] be such that R = C(α⋆P). Then

nC(α⋆P) = nR ≤ I(X^n;Y^n) + nε_n
= h(Y^n) − h(Y^n |X^n) + nε_n,

and therefore,

h(Y^n) ≥ (n/2) log(2πe) + nC(α⋆P) − nε_n
= (n/2) log( 2πe(1 + α⋆P) ) − nε_n.
Since N > 1, we can write the physically degraded form of the channel as Y = X + W1, Z = Y + W̃2, where W̃2 ∼ N(0, N − 1) is the excess noise that receiver Z experiences in addition to receiver Y. Applying the vector entropy power inequality to Z^n = Y^n + W̃_2^n, we conclude

(1/n) h(Z^n) ≥ (1/2) log( 2^{(2/n)h(Y^n)} + 2^{(2/n)h(W̃_2^n)} )
≥ (1/2) log( 2^{−2ε_n} · 2πe(1 + α⋆P) + 2πe(N − 1) )
≥ (1/2) log( 2πe(N + α⋆P) ) − ε_n,
and finally,

Rd ≥ (1/n) I(X^n;Z^n)
= (1/n) h(Z^n) − (1/2) log(2πeN)
≥ C(α⋆P/N) − ε_n.

This concludes the proof of Corollary 4.2. □
4.2.5 Proof of Theorem 4.2
Recall the shape of R(U,X) depicted in Figure 4.2. The coordinates of the corner points A
and B are given by
A(U,X) : R = h(X +W1)− h(W1), (4.13)
Rd = h(X +W2 |U) + h(X +W1)− h(X +W1 |U)− h(W2), (4.14)
B(U,X) : R = h(X +W1 |U)− h(W1), (4.15)
Rd = h(X +W2 |U)− h(W2). (4.16)
Proof of achievability: We specialize Theorem 4.1. Consider the specific p(u, x) constructed as follows. For given positive semidefinite matrices KU , KV ∈ Rn×n with
tr(KU +KV ) ≤ P , let
U ∼ N (0, KU),
V ∼ N (0, KV ),
X = U + V,
where U and V are independent. Then, the terms in Theorem 4.1 evaluate to

I(X;Y ) = h(Y ) − h(W1) = (1/2) log( |KU + KV + K1| / |K1| ),
I(X;Y |U) = h(Y |U) − h(W1) = (1/2) log( |KV + K1| / |K1| ),
I(X;Z |U) = h(Z |U) − h(W2) = (1/2) log( |KV + K2| / |K2| ).
Simplifying the right hand sides and introducing time-sharing leads to the desired result.
For completeness, the coordinates of A and B for given matrices KU , KV are

A(KU , KV ) : R = (1/2) log( |KU + KV + K1| / |K1| ), (4.17)
Rd = (1/2) log( (|KV + K2| |KU + KV + K1|) / (|K2| |KV + K1|) ), (4.18)
B(KU , KV ) : R = (1/2) log( |KV + K1| / |K1| ), (4.19)
Rd = (1/2) log( |KV + K2| / |K2| ). (4.20)

The constituent region R(U,X) for fixed KU and KV is depicted in Figure 4.8. □
Proof of converse: The converse proof of Theorem 4.1 continues to hold, and we only need to show that Gaussian input distributions are sufficient. We proceed as follows. Since the rate–disturbance region is convex, its boundary can be fully characterized by maximizing R − λRd for each λ > 0. We write

R − λRd ≤ max_{(R,Rd)∈R} { R − λRd } = max_{(U,X)} max_{(R,Rd)∈R(U,X)} { R − λRd },

where the outer optimization is over the joint distribution of (U,X) and the inner optimization is over the region achieved by that distribution. The inner optimization can be solved explicitly as follows. For ease of presentation, assume for the moment that the power constraint is of the form KX ⪯ S for some positive semidefinite matrix S. (That is, valid KX are precisely those that result in the matrix S − KX being positive semidefinite.)
Figure 4.8. Constituent region for Theorem 4.2, using a Gaussian superposition codebook with parameters KU and KV. (Corner points A and B are joined by a 45° segment; the intercepts are (1/2) log(|KU + KV + K1|/|K1|), (1/2) log(|KV + K1|/|K1|), and (1/2) log(|KV + K2|/|K2|).)
First, consider λ ≤ 1. For any distribution (U,X) ∼ p(u, x), point A(U,X) achieves a value of the inner optimization at least as large as point B(U,X), or any point on the line between them. Using the coordinates of A(U,X) in (4.13) and (4.14), we can write

R − λRd ≤ max_{(U,X)} { λ (h(X + W1 |U) − h(X + W2 |U)) + (1 − λ) h(X + W1) − h(W1) + λ h(W2) }
(a)
≤ λ · max_{(U,X)} { h(X + W1 |U) − h(X + W2 |U) } + (1 − λ) · max_{(U,X)} { h(X + W1) } − h(W1) + λ h(W2)
(b)
≤ λ · max_{KX ⪯ S} { (1/2) log( |KX + K1| / |KX + K2| ) } + (1 − λ) · max_{KX ⪯ S} { (1/2) log( (2πe)^n |KX + K1| ) } − (1/2) log( (2πe)^n |K1| ) + (λ/2) log( (2πe)^n |K2| ).
In (a), the two maximizations are taken independently. In step (b), the first maximization is achieved by a Gaussian X that is independent of U, due to a theorem proved by Liu and Viswanath [LV07, Thm. 8]. The optimization is now only over covariance matrices. Let K⋆ be an optimizer of this first maximization. The second maximization is also achieved by a Gaussian X, and is optimized by KX = S since f(KX) = |KX + K1| is matrix monotone.
It follows that

R − λRd ≤ (λ/2) log( |K⋆ + K1| / |K⋆ + K2| ) + ((1 − λ)/2) log( (2πe)^n |S + K1| ) − (1/2) log( (2πe)^n |K1| ) + (λ/2) log( (2πe)^n |K2| )
= (1/2) log( |S + K1| / |K1| ) − (λ/2) log( (|K⋆ + K2| |S + K1|) / (|K⋆ + K1| |K2|) ).

But this upper bound is achieved with equality by Gaussian superposition codebooks, namely through the point A(KU , KV ) as specified by equations (4.17) and (4.18), with KU = S − K⋆ and KV = K⋆.
Now, consider λ > 1. The argument proceeds analogously to the previous case. For completeness’ sake, the details are as follows. We can write the inner optimization explicitly using the coordinates of B(U,X) in (4.15) and (4.16) as

R − λRd ≤ max_{(U,X)} { h(X + W1 |U) − λ h(X + W2 |U) } + λ h(W2) − h(W1)
(a)
≤ max_{KX ⪯ S} { (1/2) log( (2πe)^n |KX + K1| ) − (λ/2) log( (2πe)^n |KX + K2| ) } + (λ/2) log( (2πe)^n |K2| ) − (1/2) log( (2πe)^n |K1| ).
The optimum in (a) is achieved by a Gaussian X (independent of U) by virtue of [LV07, Thm. 8], while the other two terms are independent of the optimization variable. Let K⋆ be an optimizer. Then

R − λRd ≤ (1/2) log( |K⋆ + K1| / |K1| ) − (λ/2) log( |K⋆ + K2| / |K2| ).
This upper bound is achieved with equality by Gaussian superposition codebooks through the point B(KU , KV ) as given by equations (4.19) and (4.20) with KU = 0 and KV = K⋆. This is a power control strategy, similar to the scalar Gaussian case.
We have thus shown that under a power constraint KX ⪯ S, Gaussian superposition codes are optimal. The conclusion extends to the sum power constraint tr(KX) ≤ P by observing that

{KX : tr(KX) ≤ P} = ⋃_{S ⪰ 0, tr(S) ≤ P} {KX : KX ⪯ S}.

In other words, the sum power constraint can be expressed as a union of constraints of the type KX ⪯ S, for each of which Gaussian superposition codes are optimal. Therefore, a Gaussian superposition code must be optimal overall, too. □
4.3 Proofs for two disturbance constraints
4.3.1 Proof of Theorem 4.3
Codebook generation. Fix a pmf p(u, x). Split the rate as R = R0 + R1 + R2 + R3. Define the auxiliary rates R̃1 ≥ R1 and R̃2 ≥ R2, let ε′ > 0, and define the set partitions

{1 : 2^{nR̃1}} = L1(1) ∪ · · · ∪ L1(2^{nR1}),
{1 : 2^{nR̃2}} = L2(1) ∪ · · · ∪ L2(2^{nR2}),

where L1(·) and L2(·) are indexed sets of size 2^{n(R̃1−R1)} and 2^{n(R̃2−R2)}, respectively.

1. For each m0 ∈ {1 : 2^{nR0}}, generate u^n(m0) according to ∏_{i=1}^n p(ui).

2. For each l1 ∈ {1 : 2^{nR̃1}}, generate z1^n(m0, l1) according to ∏_{i=1}^n p(z1i | ui(m0)). Likewise, for each l2 ∈ {1 : 2^{nR̃2}}, generate z2^n(m0, l2) according to ∏_{i=1}^n p(z2i | ui(m0)).

3. For each (m0,m1,m2), let S(m0,m1,m2) be the set of all pairs (l1, l2) from the product set L1(m1) × L2(m2) such that (z1^n(m0, l1), z2^n(m0, l2)) ∈ T_{ε′}^{(n)}(Z1, Z2 | u^n(m0)).

4. For each (m0, l1, l2) and m3 ∈ {1 : 2^{nR3}}, generate x^n(m0, l1, l2, m3) according to ∏_{i=1}^n p(xi | ui(m0), z1i(l1), z2i(l2)) if (l1, l2) ∈ S(m0,m1,m2). Otherwise, draw it from Unif(X^n).

5. Choose (l1^{(m0,m1,m2)}, l2^{(m0,m1,m2)}) uniformly from S(m0,m1,m2). If S(m0,m1,m2) is empty, choose (1, 1).

Encoding. To send message m = (m0,m1,m2,m3), transmit the sequence x^n(m0, l1^{(m0,m1,m2)}, l2^{(m0,m1,m2)}, m3).
Decoding. Let ε > ε′. Upon receiving y^n, define the tuple

T(m0,m1,m2,m3) = ( u^n(m0), z1^n(m0, l1^{(m0,m1,m2)}), z2^n(m0, l2^{(m0,m1,m2)}), x^n(m0, l1^{(m0,m1,m2)}, l2^{(m0,m1,m2)}, m3), y^n ).

Declare that m̂ = (m̂0, m̂1, m̂2, m̂3) has been sent if it is the unique message such that T(m̂0, m̂1, m̂2, m̂3) ∈ T_ε^{(n)}(U,Z1, Z2, X, Y ).
Analysis of the probability of error. Without loss of generality, assume that m0 = m1 = m2 = m3 = 1 is transmitted. Define the following events.

Ee1 : S(1, 1, 1) is empty,
Ee2 : S(1, 1, 1) contains two distinct pairs with equal first or second component,
Ei : { T(m0,m1,m2,m3) ∈ T_ε^{(n)}(U,Z1, Z2, X, Y ) for some (m0,m1,m2,m3) ∈ Mi }, i ∈ {0 : 5},

where the message subsets Mi are specified in Table 4.1. Defining the “encoding error” event Ee = Ee1 ∪ Ee2 and the “decoding error” event Ed = E0^c ∪ E1 ∪ E2 ∪ E3 ∪ E4 ∪ E5, the probability of error can be upper-bounded as

P(E) ≤ P(Ee ∪ Ed) ≤ P(Ee) + P(Ed | Ee^c).

The motivation for introducing Ee2 as an “error” is to simplify the analysis of the second probability term.
We bound P(Ee) by the following proposition. Let r1 = R̃1 − R1 and r2 = R̃2 − R2.

Proposition 4.1. P(Ee) → 0 as n → ∞ if

r1 + r2 > I(Z1;Z2 |U) + δ(ε′), (4.21)
r1/2 + r2 < I(Z1;Z2 |U) − δ(ε′), (4.22)
r1 + r2/2 < I(Z1;Z2 |U) − δ(ε′). (4.23)
Proof sketch: First, consider Ee1. As in the proof of Marton’s inner bound for the broadcast
channel, the mutual covering lemma [EK11] implies P(Ee1)→ 0 as n→∞ if (4.21) holds.
Now consider Ee2, for which we need to control the number of typical pairs that can
occur in the same “row” or “column” of the product set L1(m1)×L2(m2), i.e., for the same
l1 or l2 coordinate. The probability P(Ee2) tends to zero provided that (4.22) and (4.23) hold.
This is akin to the birthday problem [Mis39], where k samples are drawn uniformly and independently from {1 : N}, and the interest is in samples that have the same value (collisions). It is well known that for the probability of collision to be pc, the number of samples required is roughly k ≈ √(−2N ln(1 − pc)), which scales with √N. In our case, the number of samples is the cardinality of the set S(m0,m1,m2), which is roughly k = 2^{n(r1+r2−I(Z1;Z2|U))}. The samples are categorized into N1 = 2^{nr1} and N2 = 2^{nr2} classes along rows and columns, respectively. To achieve a probability of collision pc → 0 along both dimensions, we need k ≪ min{√N1, √N2}, which yields exactly the conditions (4.22) and (4.23).

A rigorous proof is given below on page 82. □
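The √N scaling quoted above is easy to reproduce. A quick Monte Carlo sketch (the classical N = 365 birthday instance serves as a sanity check):

```python
import math
import random

def collision_probability(N, k, trials=20000, seed=1):
    """Monte Carlo estimate of the probability that k uniform samples
    from {1..N} contain at least one collision."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seen = set()
        for _ in range(k):
            v = rng.randrange(N)
            if v in seen:
                hits += 1
                break
            seen.add(v)
    return hits / trials

# The approximation k = sqrt(-2 N ln(1 - pc)) from the text: for N = 365
# and pc = 0.5 it gives k of about 22.5, matching the classical answer k = 23.
k_approx = math.sqrt(-2 * 365 * math.log(1 - 0.5))
```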
We bound the probability P(Ed | Ece ) by the following proposition.
Proposition 4.2. P(Ed | Ece )→ 0 as n→∞ if
R3 < H(Y |Z1, Z2, U)− δ(ε), (4.24)
R1 +R3 < H(Y |Z2, U) + I(Z1;Z2 |U)− δ(ε), (4.25)
Message subset m0 m1 m2 m3
M0 1 1 1 1
M1 1 1 1 6= 1
M2 1 6= 1 1 anyM3 1 1 6= 1 anyM4 1 6= 1 6= 1 anyM5 6= 1 any any any
Table 4.1. Message subsets for decoding error events.
80 CHAPTER 4. COMMUNICATION WITH DISTURBANCE CONSTRAINTS
R2 +R3 < H(Y |Z1, U) + I(Z1;Z2 |U)− δ(ε), (4.26)
R1 + R2 +R3 < H(Y |U) + I(Z1;Z2 |U)− δ(ε), (4.27)
R0 + R1 + R2 +R3 < H(Y ) + I(Z1;Z2 |U)− δ(ε). (4.28)
Proof sketch: The events of which Ed is composed are illustrated in Figure 4.9, which also
depicts the structure of the codebook for m0 = 1. The product sets L1(m1) × L2(m2),
for each (m1,m2), are represented by shaded squares. In each product set, the sequence
pair selected in step 5 of the codebook generation procedure is shown with its superposed
xn codewords, as created in step 4. The correct codeword xn(1, 1, 1, 1) is shown as a
white circle which is connected to the received sequence yn. The codewords that may be
mistakenly detected at the receiver are shown as black circles. The product sets associated
with decoding error events E1, E2, E3, and E4 are labeled 1, 2, 3, and 4, respectively.
We bound the probability of each sub-event of Ed. First, note that by the conditional typicality lemma in [EK11], P(E0^c) → 0 as n → ∞ (this relies on ε′ < ε). The probabilities of the events E1 through E5 conditioned on Ee^c tend to zero as n → ∞ under conditions (4.24) through (4.28), correspondingly.
Figure 4.9. Illustration of decoding error events, for m0 = 1. (The product sets L1(m1) × L2(m2) are shown as shaded squares; in each set, the sequence pair chosen in step 5 carries its superposed x^n codewords, with the correct codeword x^n(1, 1, 1, 1) connected to the received sequence y^n and potentially confusable codewords shown alongside.)
The events E2 and E3 require the most careful analysis, since the true codeword, namely x^n(1, 1, 1, 1), and the codewords with which it may be confused can share the same z1^n or z2^n sequence (see the dashed line and the circles on it in Figure 4.9). Moreover, even when the chosen pairs in two different product sets do not share one of the two coordinates (see the chosen pairs for (m1,m2) = (1, 1) and (2, 1) in Figure 4.9), correlation could potentially be caused by the selection procedure in step 5 of codebook generation. We use the independence lemma (Lemma A.2) to show that the event Ee^c prevents this correlation leakage from occurring. The application of the lemma is what distinguishes this analysis from the conventional Marton inner bound for broadcast channels [Mar79, EM81]. There, analysis of the selection process can be altogether avoided since each receiver decodes only one of the two coordinates.

A detailed proof for the event E3 is given below on page 82; the other events follow likewise. □
Analysis of disturbance rate. When viewed by receiver Z1, the codeword for message m = (m0,m1,m2,m3) appears as z1^n(m0, l1^{(m0,m1,m2)}). We can pessimistically assume that all sequences z1^n(m0, l1) as created in step 2 of codebook generation can be seen at the receiver for some message m. Therefore, the number of possible sequences at Z1, and thus its disturbance rate, is upper-bounded by H(Z1^n) ≤ n(R0 + R̃1). Applying the same argument for Z2, the proposed scheme achieves

R0 + R̃1 ≤ Rd,1, (4.29)
R0 + R̃2 ≤ Rd,2. (4.30)

Conclusion of the proof. Collecting inequalities (4.21) through (4.30), recalling R = R0 + R1 + R2 + R3, and using the Fourier–Motzkin procedure to eliminate R0, R1, R̃1, R2, R̃2, and R3 leads to the (R,Rd,1, Rd,2) region claimed in the theorem.
Finally, the statement of Remark 4.6 follows from
− I(Z1;Z2 |U) + I(Z1;Z2 |U, Y )
= −H(Z2 |U) +H(Z2 |U,Z1) +H(Z2 |U, Y )−H(Z2 |U, Y, Z1)
= −I(Y ;Z2 |U) + I(Y ;Z2 |U,Z1),
which leads to the equality
H(Y |Z1, Z2, U) +H(Y |U)− I(Z1;Z2 |U) + I(Z1;Z2 |U, Y )
= H(Y |Z1, Z2, U) +H(Y |U)− I(Y ;Z2 |U) + I(Y ;Z2 |U,Z1)
= H(Y |Z1, U) +H(Y |Z2, U).
This concludes the proof of Theorem 4.3. □
Proof of Proposition 4.1: The product bin (m_1, m_2) = (1, 1) for m_0 = 1 contains lm sequence pairs, where l = 2^{nr_1} and m = 2^{nr_2}. Each pair (Z_1^n(1, l_1), Z_2^n(1, l_2)), for l_1 ∈ {1:l} and l_2 ∈ {1:m}, is jointly typical with probability p ≐ 2^{−nI(Z_1;Z_2|U)}. Now fix one coordinate, say l_1 = 1. The corresponding "row" of the bin contains m sequences Z_2^n(1, l_2), each of which is jointly typical with Z_1^n(1, 1) independently with probability p. Let K be the total number of typical sequences in this row. Then

P{K = 0} = (1 − p)^m,
P{K = 1} = mp(1 − p)^{m−1},
P{K ≥ 2} = 1 − (1 − p + mp)(1 − p)^{m−1} ≤ m^2 p^2,

where the last step uses (1 − p)^{m−1} ≥ 1 − (m − 1)p. We have thus upper-bounded the probability of encountering two or more typical pairs in a single row. Consequently, the probability of two or more typical pairs occurring in any row is upper-bounded by lm^2 p^2. Substituting the definitions leads to the desired inequality. The same argument applies to the columns of the bin. □
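Since K is binomial with parameters m and p, the final bound can be checked numerically; below is a minimal sketch (toy values for m and p, with p standing in for the joint-typicality probability 2^{−nI(Z_1;Z_2|U)}):

```python
def p_two_or_more(m: int, p: float) -> float:
    """Exact P{K >= 2} for K ~ Binomial(m, p), via
    1 - P{K = 0} - P{K = 1} = 1 - (1 - p + m*p) * (1 - p)**(m - 1)."""
    return 1.0 - (1.0 - p + m * p) * (1.0 - p) ** (m - 1)

# The proposition's bound: P{K >= 2} <= m**2 * p**2.  Checked on a grid
# of toy bin sizes m and typicality probabilities p.
for m in (10, 100, 1000):
    for p in (1e-4, 1e-3, 1e-2):
        assert p_two_or_more(m, p) <= m**2 * p**2
```

The bound holds for every m ≥ 1 and p ∈ [0, 1], since the Bernoulli inequality used in the proof is unconditional.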
Proof of Proposition 4.2, exemplified for E_3: We analyze the probability of E_3 as follows. First,

E_3 = { (U^n(1), Z_1^n(1, L_1^{(1,1,m_2)}), Z_2^n(1, L_2^{(1,1,m_2)}), X^n(1, L_1^{(1,1,m_2)}, L_2^{(1,1,m_2)}, m_3), Y^n) ∈ T_ε^{(n)} for some m_2 ≠ 1, m_3 }
4.3. PROOFS FOR TWO DISTURBANCE CONSTRAINTS 83
⊆ { (U^n(1), Z_1^n(1, L_1^{(1,1,m_2)}), Z_2^n(1, l_2), X^n(1, L_1^{(1,1,m_2)}, l_2, m_3), Y^n) ∈ T_ε^{(n)} for some m_2 ≠ 1, m_3, l_2 ∉ L_2(1) }.
Define the event E_eq = {L_1^{(1,1,m_2)} = L_1^{(1,1,1)}}, which allows us to write P(E_3 | E_e^c) = P(E_3 ∩ E_eq | E_e^c) + P(E_3 ∩ E_eq^c | E_e^c). We consider both terms separately. First,

E_3 ∩ E_eq ⊆ { (U^n(1), Z_1^n(1, L_1^{(1,1,1)}), Z_2^n(1, l_2), X^n(1, L_1^{(1,1,1)}, l_2, m_3), Y^n) ∈ T_ε^{(n)} for some l_2 ∉ L_2(1), m_3 }.
Thus,

P(E_3 ∩ E_eq | E_e^c)
  ≤ ∑_{(u^n, z_1^n, y^n) ∈ T_ε^{(n)}} P{ U^n(1) = u^n, Z_1^n(1, L_1^{(1,1,1)}) = z_1^n, Y^n = y^n | E_e^c }
      · ∑_{l_2 ∉ L_2(1)} ∑_{m_3 = 1}^{2^{nR_3}} P{ (u^n, z_1^n, Z_2^n(1, l_2), X^n(1, L_1^{(1,1,1)}, l_2, m_3), y^n) ∈ T_ε^{(n)} | E_e^c }
  ≤ 2^{n(R̃_2 + R_3)} P★,
where P★ is shorthand for the last P{·} expression. Continue with

P★ = ∑_{(z_2^n, x^n) ∈ T_ε^{(n)}(Z_2, X | u^n, z_1^n, y^n)} P{ Z_2^n(1, l_2) = z_2^n, X^n(1, L_1^{(1,1,1)}, l_2, m_3) = x^n | U^n(1) = u^n, Z_1^n(1, L_1^{(1,1,1)}) = z_1^n, Y^n = y^n, E_e^c }
  (a)= ∑_{(z_2^n, x^n) ∈ T_ε^{(n)}(Z_2, X | u^n, z_1^n, y^n)} p(z_2^n | u^n) · p(x^n | z_1^n, z_2^n, u^n)
  ≤ 2^{n(H(X, Z_2 | Z_1, Y, U) − H(Z_2 | U) − H(X | Z_1, Z_2, U) + δ(ε))}
  = 2^{n(−H(Y | Z_1, U) − I(Z_1; Z_2 | U) + δ(ε))},

where the last inequality uses |T_ε^{(n)}(Z_2, X | u^n, z_1^n, y^n)| ≐ 2^{nH(X, Z_2 | Z_1, Y, U)}, p(z_2^n | u^n) ≐ 2^{−nH(Z_2 | U)}, and p(x^n | z_1^n, z_2^n, u^n) ≐ 2^{−nH(X | Z_1, Z_2, U)}.
In step (a), we have used the fact that l_2 ∉ L_2(1), and therefore Z_2^n(1, l_2) relates to a bin other than the first one. It is independent of the conditions Y^n = y^n and E_e^c, both of which relate only to the (1, 1) bin for m_0 = 1. A similar argument applies to the second term. Substituting back into the previous chain of inequalities implies that P(E_3 ∩ E_eq | E_e^c) → 0 as n → ∞ if inequality (4.26) holds.
Next, consider

E_3 ∩ E_eq^c ⊆ { (U^n(1), Z_1^n(1, l_1), Z_2^n(1, l_2), X^n(1, l_1, l_2, m_3), Y^n) ∈ T_ε^{(n)} for some l_1 ∈ L_1(1) \ {L_1^{(1,1,1)}}, l_2 ∉ L_2(1), m_3 }.
We argue

P(E_3 ∩ E_eq^c | E_e^c)
  ≤ ∑_{(u^n, y^n) ∈ T_ε^{(n)}} P{ U^n(1) = u^n, Y^n = y^n | E_e^c }
      · ∑_{l_1 ∈ L_1(1) \ {L_1^{(1,1,1)}}} ∑_{l_2 ∉ L_2(1)} ∑_{m_3 = 1}^{2^{nR_3}} P{ (u^n, Z_1^n(1, l_1), Z_2^n(1, l_2), X^n(1, l_1, l_2, m_3), y^n) ∈ T_ε^{(n)} | U^n(1) = u^n, Y^n = y^n, E_e^c }
  ≤ 2^{n(R̃_1 − R_1 + R̃_2 + R_3)} P★,
where P★ represents the last P{·} expression. Finally,

P★ = ∑_{(z_1^n, z_2^n, x^n) ∈ T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)} P{ Z_1^n(1, l_1) = z_1^n, Z_2^n(1, l_2) = z_2^n, X^n(1, l_1, l_2, m_3) = x^n | U^n(1) = u^n, Y^n = y^n, E_e^c }
  = ∑_{(z_1^n, z_2^n, x^n) ∈ T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)} ∑_{z_2^n(l_2') for all l_2' ∈ L_2(1)} P{ Z_2^n(1, l_2') = z_2^n(l_2') for all l_2' ∈ L_2(1) | E_e^c }
      · P{ Z_1^n(1, l_1) = z_1^n, Z_2^n(1, l_2) = z_2^n, X^n(1, l_1, l_2, m_3) = x^n | U^n(1) = u^n, Y^n = y^n, Z_2^n(1, l_2') = z_2^n(l_2') for all l_2' ∈ L_2(1), E_e^c }
  (a)≤ ∑_{(z_1^n, z_2^n, x^n) ∈ T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)} p(z_1^n | u^n, E_e^c) · p(z_2^n | u^n) · p(x^n | z_1^n, z_2^n, u^n)
  ≤ 2^{n(H(X, Z_1, Z_2 | Y, U) − H(Z_1 | U) − H(Z_2 | U) − H(X | Z_1, Z_2, U) + δ(ε))}
  = 2^{n(−H(Y | U) − I(Z_1; Z_2 | U) + δ(ε))},

where |T_ε^{(n)}(Z_1, Z_2, X | u^n, y^n)| ≐ 2^{nH(X, Z_1, Z_2 | Y, U)}, p(z_1^n | u^n, E_e^c) ≐ 2^{−nH(Z_1 | U)} by step (b) below, p(z_2^n | u^n) ≐ 2^{−nH(Z_2 | U)}, and p(x^n | z_1^n, z_2^n, u^n) ≐ 2^{−nH(X | Z_1, Z_2, U)}.
Here, (a) uses the fact that for the l_1 indices in question, Z_1^n(1, l_1) is independent of Y^n. This is a consequence of the independence between the selected Z_1^n(1, L_1^{(1,1,1)}) and the other (non-selected) Z_1^n(1, l_1) due to Lemma A.2. The lemma applies because the event is conditioned (1) on E_e^c, which ensures that L_1^{(1,1,1)} is picked uniformly as required by the lemma, and (2) on Z_2^n(1, l_2') for all l_2' ∈ L_2(1), which provides the qualifying set A' of the lemma.
Step (b) follows from

p(z_1^n | u^n, E_e^c) = p(z_1^n | u^n) · p(E_e^c | u^n, z_1^n) / p(E_e^c | u^n)
  ≤ p(z_1^n | u^n) · 1 / p(E_e^c | u^n)
  ≤ p(z_1^n | u^n) · 1 / (1 − 2^{−δn})
  ≤ 2^{−n(H(Z_1 | U) − ε)} · 2^{nδ'}
  = 2^{−n(H(Z_1 | U) − ε − δ')}.

Here, δ is the minimum slack of the three conditions for E_e^c in Lemma 4.1. Note that for any δ, δ' > 0, we can find an N_0 such that 1/(1 − 2^{−δn}) ≤ 2^{nδ'} for all n ≥ N_0.
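The existence of such an N_0 follows because the left-hand side decreases toward 1 while 2^{nδ'} grows without bound; a quick numerical check (the values of δ and δ' below are illustrative, not from the text):

```python
def first_n(delta: float, delta_prime: float) -> int:
    """Smallest n with 1/(1 - 2**(-delta*n)) <= 2**(delta_prime*n).
    The left-hand side strictly decreases toward 1 and the right-hand
    side strictly increases, so beyond this n the inequality always holds."""
    n = 1
    while 1.0 / (1.0 - 2.0 ** (-delta * n)) > 2.0 ** (delta_prime * n):
        n += 1
    return n

# Illustrative slack values: delta = 0.1, delta' = 0.01.
N0 = first_n(0.1, 0.01)
```

Because both sides are strictly monotone, scanning for the first qualifying n is a valid way to exhibit N_0.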
We conclude that P(E_3 ∩ E_eq^c | E_e^c) → 0 as n → ∞ if

R̃_1 − R_1 + R̃_2 + R_3 ≤ H(Y | U) + I(Z_1; Z_2 | U) − δ(ε).

This is implied by (4.27), which stems from analyzing E_4, and may thus be omitted. □
4.3.2 Proof of Theorem 4.4
First, consider

nR ≤ I(X^n; Y^n) + nε_n
  = ∑_{i=1}^n I(X^n; Y_i | Y^{i−1}) + nε_n
  = ∑_{i=1}^n I(X_i; Y_i | Y^{i−1}) + nε_n
  = nI(X; Y | Q) + nε_n
  = nH(Y | Q) + nε_n.
Furthermore,

nR_{d,1} ≥ I(X^n; Z_1^n)
  ≥ I(Y^n; Z_1^n)
  = ∑_{i=1}^n I(Y_i; Z_1^n | Y^{i−1})
  ≥ ∑_{i=1}^n I(Y_i; Z_{1i} | Y^{i−1})
  = nI(Y; Z_1 | Q),

where Y = Y_T, Z_1 = Z_{1T}, and Q = (Y^{T−1}, T) with T ∼ Unif{1:n}. The same argument leads to

nR_{d,2} ≥ nI(Y; Z_2 | Q),

with the same random variable identifications and, additionally, Z_2 = Z_{2T}. Finally, the cardinality bound on Q follows from the convex cover method in [EK11]. □
4.3.3 Proof of Theorem 4.5
First, we specialize Theorem 4.3 as follows.
Corollary 4.4 (Simpler inner bound for deterministic DMC-2-DC). The rate–disturbance region R of the deterministic channel with two disturbance constraints is inner-bounded by the set of rate triples (R, R_{d,1}, R_{d,2}) such that
R ≤ H(Y ), (4.31)
Rd,1 ≥ I(Y ;Z1, U), (4.32)
Rd,2 ≥ I(Y ;Z2, U), (4.33)
Rd,1 +Rd,2 ≥ I(Y ;Z1, Z2, U) + I(Y ;U) + I(Z1;Z2 |U)
= I(Y ;Z1, U) + I(Y ;Z2, U) + I(Z1;Z2 |U, Y ), (4.34)
for some pmf p(u, x).
The two equivalent expressions in (4.34) originate from Remark 4.6 on page 62. An
example of the constituent regions of Corollary 4.4 for fixed p(u, x) is depicted in Figure 4.10.
The figure also illustrates how the corollary follows from Theorem 4.3: Each constituent
region of the corollary is a strict subset of the constituent region of the theorem, for the same
p(u, x).
Proof of Corollary 4.4: In Theorem 4.3, consider the case where (4.1) is met with equality,
i.e., R = H(Y ). This yields a subset region which is still achievable. It simplifies to
Rd,1 +Rd,2 ≥ I(Z1;Z2 |U), (4.35)
Rd,1 ≥ I(Y ;Z1, U), (4.36)
Rd,2 ≥ I(Y ;Z2, U), (4.37)
Rd,1 +Rd,2 ≥ I(Y ;Z1, Z2, U) + I(Z1;Z2 |U), (4.38)
Rd,1 +Rd,2 ≥ I(Y ;Z1, Z2, U) + I(Y ;U) + I(Z1;Z2 |U)
= I(Y ;Z1, U) + I(Y ;Z2, U) + I(Z1;Z2 |U, Y ). (4.39)
Figure 4.10. Constituent region for Corollary 4.4, for a fixed p(u, x). Each face is annotated by the inequality that defines it. For comparison, the constituent region of Theorem 4.3 is shown with dashed lines (see Figure 4.5).
Clearly, conditions (4.35) and (4.38) are dominated by inequality (4.39), since I(Y; U) and I(Y; Z_1, Z_2, U) are nonnegative, and the desired result follows. □
Proof of achievability for Theorem 4.5: We further specialize Corollary 4.4. We choose
U = Z1 ∨ Z2, i.e., the common part of Z1 and Z2. This implies that condition (4.34) can be
omitted, since I(Z1;Z2 |U, Y ) = 0 for all p(u, x) by assumption. Furthermore, U can be
dropped from conditions (4.32) and (4.33) by virtue of being a function of Z1 and Z2. We
conclude that
R ≤ H(Y ), (4.40)
Rd,1 ≥ I(Y ;Z1), (4.41)
Rd,2 ≥ I(Y ;Z2), (4.42)
is achievable for all p(x). Adding a time-sharing random variable Q completes the proof.
Note that in the special case where Y ≼ Z_1 or Y ≼ Z_2, the same conclusion holds with the choice U = ∅. □
4.3.4 Proof of Corollary 4.3
Proof of achievability: We prove the result for Z_1 ≼ Z_2; the other case follows by symmetry. We specialize the achievable region of Theorem 4.3 by choosing U = Z_2. The
rate–disturbance constraints are
R ≤ H(Y ), (4.43)
Rd,1 +Rd,2 ≥ 0, (4.44)
R−Rd,1 ≤ H(Y |Z1), (4.45)
R−Rd,2 ≤ H(Y |Z2), (4.46)
R−Rd,1 −Rd,2 ≤ H(Y |Z1), (4.47)
2R−Rd,1 −Rd,2 ≤ H(Y |Z1) +H(Y |Z2). (4.48)
Clearly, (4.44) is vacuous. Furthermore, (4.47) is dominated by (4.45), and (4.48) is dominated by the sum of (4.45) and (4.46). This completes the proof. □
Proof of converse: The first inequality follows from Fano's inequality as

nR ≤ I(X^n; Y^n) + nε_n = H(Y^n) + nε_n ≤ nH(Y) + nε_n,

where Y = Y_Q and Q ∼ Unif{1:n}. The other two inequalities follow as

n(R − R_{d,1}) ≤ nR − I(X^n; Z_1^n)
  ≤ H(Y^n) − H(Z_1^n) + nε_n
  ≤ H(Y^n, Z_1^n) − H(Z_1^n) + nε_n
  = H(Y^n | Z_1^n) + nε_n
  ≤ nH(Y | Z_1) + nε_n,

with Z_1 = Z_{1Q}, and likewise for n(R − R_{d,2}). □
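To make the resulting region concrete, the following sketch evaluates the relevant entropy terms for a small hypothetical deterministic channel with Z_1 ≼ Z_2 (the maps y, z_1, z_2 below are illustrative choices, not from the text). For a fixed p(x), the constraints of Corollary 4.3 then read R ≤ H(Y), R − R_{d,1} ≤ H(Y | Z_1), and R − R_{d,2} ≤ H(Y | Z_2):

```python
import math
from collections import defaultdict

def H(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def cond_H(joint):
    """H(Y | Z) from a joint pmf {(y, z): probability}: H(Y, Z) - H(Z)."""
    pz = defaultdict(float)
    for (_, z), p in joint.items():
        pz[z] += p
    return H(joint) - H(pz)

# Illustrative deterministic maps on X = {0..7}: y is the full input,
# z2 its two high bits, z1 its top bit, so Z1 is a function of Z2 (Z1 ≼ Z2).
y, z1, z2 = (lambda x: x), (lambda x: x // 4), (lambda x: x // 2)
px = {x: 1 / 8 for x in range(8)}

pY, pYZ1, pYZ2 = defaultdict(float), defaultdict(float), defaultdict(float)
for x, p in px.items():
    pY[y(x)] += p
    pYZ1[(y(x), z1(x))] += p
    pYZ2[(y(x), z2(x))] += p

print(H(pY), cond_H(pYZ1), cond_H(pYZ2))  # 3.0 2.0 1.0
```

For the uniform input pmf this gives the corner R ≤ 3, R − R_{d,1} ≤ 2, R − R_{d,2} ≤ 1 of the region for this particular p(x).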
Chapter 5
General achievable rate region for 3-DIC
In this chapter,¹ we synthesize the results of the previous chapters and develop a new encoding scheme for the 3-DIC, along with its corresponding achievable rate region. This scheme generalizes the Han–Kobayashi scheme and performs strictly better than the previously discussed inner bounds on the 3-DIC capacity region. The key idea is to combine the receiver-centric insight obtained from interference decoding (Chapter 3) with the transmitter-centric viewpoint that led to the communication-with-disturbance-constraints setting in Chapter 4. We borrow the codebook construction via Marton coding and superposition coding from the latter, and apply saturation arguments at the receiver similar to the former, which permits us to take advantage of the structure of the combined interfering signal without decoding any part of the interfering messages. The proofs are relegated to Section 5.2.
5.1 Results and discussion
We first discuss two equivalent characterizations of the achievable rate region (Subsections 5.1.1 and 5.1.2). The description of the regions is complicated by an auxiliary random variable U_l for each transmitter l ∈ {1:3} and by a Fourier–Motzkin operator that cannot be evaluated symbolically. Thus, we explore two simplifications of the achievable rate region. First, we specialize the region to the case where U_l = ∅ for all l. This leads to a weaker inner bound that is simpler to evaluate numerically (Subsection 5.1.3), and we provide a numerical example for this case. Second, we apply the general inner bound to the special case of the one-to-many 3-DIC, in which only one of the transmitters causes interference. The bound simplifies via symbolic evaluation of the Fourier–Motzkin operator and thus permits a more explicit characterization than in the general case (Subsection 5.1.4).

¹The results in this chapter were first published in [BE11a].
5.1.1 Achievable rate region for 3-DIC
We will need the following notation. Fix a joint pmf for (Q,U1, X1, U2, X2, U3, X3) of the
form
p = p(q)p(u1, x1|q)p(u2, x2|q)p(u3, x3|q).
Here, Q and U_l, for l ∈ {1:3}, are auxiliary random variables of arbitrary cardinality. While Q is a time-sharing random variable that is common to all three transmitters, U_l is associated with the l-th transmitter only. Define the rate region R_1(p) ⊂ ℝ_+^{18} to consist of the rate tuples

(R_{10}, R_{11}, R_{12}, R_{13}, R̃_{12}, R̃_{13},
 R_{20}, R_{22}, R_{23}, R_{21}, R̃_{23}, R̃_{21},
 R_{30}, R_{33}, R_{31}, R_{32}, R̃_{31}, R̃_{32})    (5.1)
such that

R̃_{12} − R_{12} + R̃_{13} − R_{13} ≥ I(X_{12}; X_{13} | U_1, Q),  (5.2)
R̃_{12} − R_{12} + (R̃_{13} − R_{13})/2 ≤ I(X_{12}; X_{13} | U_1, Q),  (5.3)
(R̃_{12} − R_{12})/2 + R̃_{13} − R_{13} ≤ I(X_{12}; X_{13} | U_1, Q),  (5.4)
R̃_{12} ≥ R_{12},  (5.5)
R̃_{13} ≥ R_{13},  (5.6)
and, for all i ∈ {1:5},

r_{1i} ≤ H(X_{11} | c_{1i}, Q) + t_{1i},  (5.7)
r_{1i} + R̃_{21} ≤ H(Y_1 | c_{1i}, U_2, X_{31}, Q) + t_{1i},  (5.8)
r_{1i} + R̃_{31} ≤ H(Y_1 | c_{1i}, X_{21}, U_3, Q) + t_{1i},  (5.9)
r_{1i} + min{R_{20} + R̃_{21}, H(X_{21} | Q)} ≤ H(Y_1 | c_{1i}, X_{31}, Q) + t_{1i},  (5.10)
r_{1i} + min{R_{30} + R̃_{31}, H(X_{31} | Q)} ≤ H(Y_1 | c_{1i}, X_{21}, Q) + t_{1i},  (5.11)
r_{1i} + min{R̃_{21} + R̃_{31}, H(S_1 | U_2, U_3, Q)} ≤ H(Y_1 | c_{1i}, U_2, U_3, Q) + t_{1i},  (5.12)
r_{1i} + min{R_{20} + R̃_{21} + R̃_{31}, H(X_{21} | Q) + R̃_{31}, H(S_1 | U_3, Q)} ≤ H(Y_1 | c_{1i}, U_3, Q) + t_{1i},  (5.13)
r_{1i} + min{R̃_{21} + R_{30} + R̃_{31}, R̃_{21} + H(X_{31} | Q), H(S_1 | U_2, Q)} ≤ H(Y_1 | c_{1i}, U_2, Q) + t_{1i},  (5.14)
r_{1i} + min{R_{20} + R̃_{21} + R_{30} + R̃_{31}, R_{20} + R̃_{21} + H(X_{31} | Q), H(X_{21} | Q) + R_{30} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | c_{1i}, Q) + t_{1i}.  (5.15)
In the latter set of conditions, lower-case symbols are placeholders for the terms specified in
Table 5.1. The term r1i represents rates, the term c1i stands for sets of random variables on
which certain entropy terms are conditioned, and t1i is an additive term. For example, with
i = 3, condition (5.13) corresponds to the inequality

R̃_{13} + R_{11} + min{R_{20} + R̃_{21} + R̃_{31}, H(X_{21} | Q) + R̃_{31}, H(S_1 | U_3, Q)} ≤ H(Y_1 | U_1, X_{12}, U_3, Q) + I(X_{12}; X_{13} | U_1, Q).  (5.16)
Similarly, define the regions R_2(p) and R_3(p) by making the subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in the definition of R_1(p), respectively.
Define an operator FM that maps a convex 18-dimensional set of rate vectors of the
form (5.1) to a 3-dimensional rate region by substituting Rl0 = Rl −Rl1 −Rl2 −Rl3, for
l ∈ {1:3}, and subsequently projecting onto the coordinates (R_1, R_2, R_3). The operator FM can be implemented by Fourier–Motzkin elimination.
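A single Fourier–Motzkin elimination step can be sketched as follows (a minimal illustrative implementation, not the dissertation's code): rows of A x ≤ b with positive and negative coefficients on the eliminated variable are paired so that the variable cancels, and rows not involving it are kept.

```python
from itertools import product

def fm_eliminate(A, b, j):
    """One Fourier-Motzkin step: eliminate variable j from {x : A x <= b}."""
    pos  = [(r, c) for r, c in zip(A, b) if r[j] > 0]
    neg  = [(r, c) for r, c in zip(A, b) if r[j] < 0]
    keep = [(r, c) for r, c in zip(A, b) if r[j] == 0]
    for (rp, cp), (rn, cn) in product(pos, neg):
        # normalize both rows so the x_j coefficients are +1 and -1, then add
        row = [a / rp[j] - bb / rn[j] for a, bb in zip(rp, rn)]
        keep.append((row, cp / rp[j] - cn / rn[j]))
    return [r for r, _ in keep], [c for _, c in keep]

# Toy system in (R0, R1):  R0 + R1 <= 1,  -R0 <= -0.2,  -R1 <= 0.
# Eliminating R0 (index 0) leaves -R1 <= 0 and R1 <= 0.8.
A, b = [[1, 1], [-1, 0], [0, -1]], [1.0, -0.2, 0.0]
A2, b2 = fm_eliminate(A, b, 0)
```

Repeating the step for each auxiliary rate and reading off the surviving inequalities in (R_1, R_2, R_3) is exactly the projection denoted by FM.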
We are now ready to state the main result of this chapter.
Theorem 5.1 (Inner bound to the capacity region of 3-DIC). The region

R_IB = ⋃_p FM{ R_1(p) ∩ R_2(p) ∩ R_3(p) },

where p = p(q)p(u_1, x_1|q)p(u_2, x_2|q)p(u_3, x_3|q), is an inner bound to the capacity region of the 3-DIC.
Remark 5.1 (Saturation). The min terms on the left hand side of conditions (5.7) to (5.15)
correspond to different modes of signal saturation, as in our discussion of interference
decoding in Chapter 3. There are numerous modes of saturation here, since the transmitters
employ a more sophisticated scheme than single-user random codes.
Remark 5.2 (Convexity). The regions R1(p), R2(p), and R3(p) ensure decodability at
the first, second, and third receiver, respectively. As in Chapter 3, they are generally
nonconvex. The regions can alternatively be written as finite unions of convex components
by expanding the cases in which the min terms take on each of their arguments. The
intersection R1(p) ∩R2(p) ∩R3(p) is also generally nonconvex. By virtue of time-sharing,
we are allowed to convexify, as shown in the theorem. This convex hull operation is useful
even for a single fixed distribution p. However, it is not achieved by the coded time-sharing mechanism of Q. The explicit convex hull operation also ensures that the argument of FM is in fact a convex set.

i  r_{1i}                                 c_{1i}                  t_{1i}
1  R_{11}                                 {U_1, X_{12}, X_{13}}   0
2  R̃_{12} + R_{11}                        {U_1, X_{13}}           I(X_{12}; X_{13} | U_1, Q)
3  R̃_{13} + R_{11}                        {U_1, X_{12}}           I(X_{12}; X_{13} | U_1, Q)
4  R̃_{12} + R̃_{13} + R_{11}               {U_1}                   I(X_{12}; X_{13} | U_1, Q)
5  R_{10} + R̃_{12} + R̃_{13} + R_{11}      ∅                       I(X_{12}; X_{13} | U_1, Q)

Table 5.1. Shorthand notation for terms related to transmitter 1.
Remark 5.3 (Fourier–Motzkin elimination). In contrast to other settings that use rate splitting, the Fourier–Motzkin elimination denoted by FM cannot be carried out symbolically, due to the convex hull operation in its argument. However, this does not hinder numerical evaluation of the region: for each fixed p, the set R_1(p) ∩ R_2(p) ∩ R_3(p), as represented by its extreme points, can be computed explicitly, and FM can then be evaluated by numerical Fourier–Motzkin elimination.
Remark 5.4 (Relation to previous bounds). It is not known whether the inner bound is
tight in general. However, it strictly includes the interference decoding inner bound in
Theorem 3.1, and thereby, the bound obtained by single-user random codes and treating
interference as noise in Theorem 2.1, i.e.,
RTIN ⊆ RID ⊆ RIB.
This follows by setting Ul = Xl and Rl = Rl0 for all l ∈ {1:3}.
Remark 5.5 (Generalization of the Han–Kobayashi scheme). The two-pair projections
of the inner bound are optimal, i.e., if one of the three rates, say R3, is set to zero, the
two-dimensional region that the inner bound achieves for (R1, R2) is in fact the capacity
region of the interference channel that consists of the first and second user pair. This follows
by setting U_l = ∅ for all l ∈ {1:3}, letting R̃_{12} = R_{12}, R̃_{13} = R_{13} + I(X_{12}; X_{13} | Q), R̃_{21} = R_{21}, R̃_{23} = R_{23} + I(X_{21}; X_{23} | Q), and replacing all min terms in (5.7) to (5.15)
with their first argument. Moreover, the codebook structure that underlies the inner bound
contains superposition codebooks as a special case. Hence the proposed encoding scheme
subsumes the Han–Kobayashi scheme and generalizes it naturally to more than two user
pairs.
Before we discuss the proof of Theorem 5.1, it is instructive to consider the following
alternative formulation of the inner bound.
5.1.2 Alternative characterization of the achievable rate region
While the following alternative characterization of the inner bound is more difficult to
compute than the region of Theorem 5.1, it allows deeper insight into the structure of the
decodability conditions (see Remark 5.7).
Define a new region R'_1(p) similar to R_1(p) above, but with conditions (5.7) to (5.15) replaced by

r_{1i} + min{r_{21j} + r_{31k}, H(S_1 | c_{21j}, c_{31k}, Q)} ≤ H(Y_1 | c_{1i}, c_{21j}, c_{31k}, Q) + t_{1i},
    for all i ∈ {1:5}, j ∈ {1:3}, k ∈ {1:3}.  (5.17)
The lower-case symbols indexed by i, j, and k are placeholders for the terms specified in
Tables 5.1, 5.2, and 5.3, respectively. For example, the case where i = 3, j = 3, and k = 2 corresponds to the inequality

R̃_{13} + R_{11} + min{ min{R_{20} + R̃_{21}, R_{20} + H(X_{21} | U_2, Q), H(X_{21} | Q)} + min{R̃_{31}, H(X_{31} | U_3, Q)}, H(S_1 | U_3, Q) } ≤ H(Y_1 | U_1, X_{12}, U_3, Q) + I(X_{12}; X_{13} | U_1, Q).  (5.18)
Similarly, define the regions R'_2(p) and R'_3(p) by making the subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in the definition of R'_1(p), respectively.
j  r_{21j}                                                           c_{21j}
1  0                                                                 {X_{21}}
2  min{R̃_{21}, H(X_{21} | U_2, Q)}                                   {U_2}
3  min{R_{20} + R̃_{21}, R_{20} + H(X_{21} | U_2, Q), H(X_{21} | Q)}   ∅

Table 5.2. Shorthand notation for terms related to transmitter 2.
Corollary 5.1 (Alternative inner bound to the capacity region of 3-DIC). The region

R'_IB = ⋃_p FM{ R'_1(p) ∩ R'_2(p) ∩ R'_3(p) },

where p = p(q)p(u_1, x_1|q)p(u_2, x_2|q)p(u_3, x_3|q), is an inner bound to the capacity region of the 3-DIC.
Remark 5.6. The regions R'_IB and R_IB of Corollary 5.1 and Theorem 5.1 are equal. This is proved in Subsection 5.2.3.
Remark 5.7 (R'_IB is logically simpler than R_IB). The condition in inequality (5.17) exposes a product structure with individual "factors" related to the first, second, and third
transmitter as specified in Tables 5.1, 5.2, and 5.3, respectively. This structure reflects the
fact that the transmitted messages are independent and there is no cooperation between the
transmitting nodes.
Remark 5.8 (R_IB is computationally simpler than R'_IB). The sets R_1(p) and R'_1(p) are both defined by fifty inequality conditions. There is a natural one-to-one correspondence between the inequalities for R_1(p) and R'_1(p), in which conditions (5.7) through (5.15) for some index i correspond to inequality (5.17) for the same i and all j, k ∈ {1:3}. However, the individual conditions are much simpler for R_1(p) than for R'_1(p). Consider, for example, the corresponding conditions (5.16) and (5.18). Expanding the nested min terms in (5.18)
leads to

R̃_{13} + R_{11} + min{ R_{20} + R̃_{21} + R̃_{31},
                      R_{20} + R̃_{21} + H(X_{31} | U_3, Q),
                      R_{20} + H(X_{21} | U_2, Q) + R̃_{31},
                      R_{20} + H(X_{21} | U_2, Q) + H(X_{31} | U_3, Q),
                      H(X_{21} | Q) + R̃_{31},
                      H(S_1 | U_3, Q) } ≤ H(Y_1 | U_1, X_{12}, U_3, Q) + I(X_{12}; X_{13} | U_1, Q).  (5.19)

k  r_{31k}                                                           c_{31k}
1  0                                                                 {X_{31}}
2  min{R̃_{31}, H(X_{31} | U_3, Q)}                                   {U_3}
3  min{R_{30} + R̃_{31}, R_{30} + H(X_{31} | U_3, Q), H(X_{31} | Q)}   ∅

Table 5.3. Shorthand notation for terms related to transmitter 3.
The difference between the expression in (5.16) and the one in (5.19) is that the former
has fewer arguments in the min term than the latter. The non-convex set R1(p) therefore
consists of fewer convex components than R ′1(p), which reduces the computational effort
required to evaluate the region.
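The case expansion behind Remarks 5.2 and 5.8 is simply the distribution of intersections over the min-term cases: choosing one argument per min term yields one convex (polyhedral) component, and the region is the union over all choices. A schematic sketch (toy conditions, not the actual rate constraints):

```python
from itertools import product

def in_region(x, conditions):
    """Membership test: every condition 'base + min(terms) <= rhs' holds.
    Each condition is (base, terms, rhs), all evaluated at the point x."""
    return all(base(x) + min(t(x) for t in terms) <= rhs(x)
               for base, terms, rhs in conditions)

def in_union_of_components(x, conditions):
    """Same region written as a union of convex components: pick one min
    argument per condition and require the resulting linear inequalities."""
    return any(all(base(x) + term(x) <= rhs(x)
                   for (base, _, rhs), term in zip(conditions, choice))
               for choice in product(*[terms for _, terms, _ in conditions]))

# Toy 2-d check that both descriptions agree on a grid of points.
conds = [(lambda x: x[0], [lambda x: x[1], lambda x: 0.3], lambda x: 1.0),
         (lambda x: x[1], [lambda x: x[0], lambda x: 0.5], lambda x: 1.2)]
grid = [(i / 10, j / 10) for i in range(15) for j in range(15)]
assert all(in_region(x, conds) == in_union_of_components(x, conds) for x in grid)
```

The number of components is the product of the min-term argument counts, which is why regions with fewer min arguments, such as R_1(p), are cheaper to evaluate.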
The proof of Theorem 5.1 and Corollary 5.1 is divided into three parts, discussed in
Subsections 5.2.1 through 5.2.3. Both results are based on the same encoding scheme, which
originates from the setting of communication with disturbance constraints in Chapter 4
and is described in detail in Subsection 5.2.1. The error probability analysis that leads
to Corollary 5.1 is detailed in Subsection 5.2.2. We treat the signal from the undesired
transmitters using methods from interference decoding with point-to-point codes (see Chapter 3). Each of the corresponding message indices is treated either by the union bound or by
Corollary A.2. The interplay between the union bound and this corollary is the formal reason
for the min terms on the left hand side of the inequalities in Theorem 5.1 and Corollary 5.1.
The operational meaning of these terms is the saturation of different links as discussed in
Remark 5.1. Next, the signal from the desired transmitter is treated by borrowing from
the analysis in disturbance-constrained communication (see Subsection 4.3.1). Finally, in
Subsection 5.2.3, we show that the regions in Theorem 5.1 and Corollary 5.1 are equal,
which concludes the proof of the theorem.
Remark 5.9. In principle, the saturation analysis leading to the min terms in the regions
applies to the 2-DIC as defined in Subsection 1.1.1 as well. However, the capacity region of
that channel can be achieved without considering saturation at all, as given by Theorem 1.1.
In Appendix B, we demonstrate that applying the new tools leads to an alternative achievable
rate region, and show how that region reverts back to the known capacity result. Thus,
saturation analysis provides a benefit only in interference channels with more than two user
pairs, which in an abstract sense agrees with the cases when interference alignment [MMK08,
CJ08] is beneficial.
5.1.3 Region without Ul
The encoding scheme underlying Theorem 5.1 is adopted from the setting of communication
with disturbance constraints. Motivated by the observation in Theorem 4.5 that the auxiliary
random variable U is unnecessary in some special cases of the disturbance-constrained
setting, we specialize the result of Theorem 5.1 to the case with Ul = ∅ and Rl0 = 0 for
l ∈ {1:3}. In terms of the encoding scheme, this corresponds to Marton coding without
common message in the broadcast channel, as opposed to the common message case that
results in the general theorem.
Fix a pmf for (Q,X1, X2, X3) of the form
p = p(q)p(x1|q)p(x2|q)p(x3|q).
Define the rate region R''_1(p) ⊂ ℝ_+^{15} to consist of the rate tuples

(R_{11}, R_{12}, R_{13}, R̃_{12}, R̃_{13}, R_{21}, R_{22}, R_{23}, R̃_{21}, R̃_{23}, R_{31}, R_{32}, R_{33}, R̃_{31}, R̃_{32})  (5.20)
such that

R_{11} ≤ H(X_{11} | X_{12}, X_{13}, Q),
R̃_{12} + R_{11} ≤ H(X_{11} | X_{13}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} ≤ H(X_{11} | X_{12}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} ≤ H(X_{11} | Q) + I(X_{12}; X_{13} | Q),
R_{11} + R̃_{21} ≤ H(Y_1 | X_{12}, X_{13}, X_{31}, Q),
R̃_{12} + R_{11} + R̃_{21} ≤ H(Y_1 | X_{13}, X_{31}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} + R̃_{21} ≤ H(Y_1 | X_{12}, X_{31}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} + R̃_{21} ≤ H(Y_1 | X_{31}, Q) + I(X_{12}; X_{13} | Q),
R_{11} + R̃_{31} ≤ H(Y_1 | X_{12}, X_{13}, X_{21}, Q),
R̃_{12} + R_{11} + R̃_{31} ≤ H(Y_1 | X_{13}, X_{21}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} + R̃_{31} ≤ H(Y_1 | X_{12}, X_{21}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} + R̃_{31} ≤ H(Y_1 | X_{21}, Q) + I(X_{12}; X_{13} | Q),
R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | X_{12}, X_{13}, Q),
R̃_{12} + R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | X_{13}, Q) + I(X_{12}; X_{13} | Q),
R̃_{13} + R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | X_{12}, Q) + I(X_{12}; X_{13} | Q),
R̃_{12} + R̃_{13} + R_{11} + min{R̃_{21} + R̃_{31}, H(S_1 | Q)} ≤ H(Y_1 | Q) + I(X_{12}; X_{13} | Q),
R̃_{12} − R_{12} + R̃_{13} − R_{13} ≥ I(X_{12}; X_{13} | Q),
R̃_{12} − R_{12} + (R̃_{13} − R_{13})/2 ≤ I(X_{12}; X_{13} | Q),
(R̃_{12} − R_{12})/2 + R̃_{13} − R_{13} ≤ I(X_{12}; X_{13} | Q),
R̃_{12} ≥ R_{12},
R̃_{13} ≥ R_{13}.
Similarly, define the regions R''_2(p) and R''_3(p) by making the subscript replacements 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in the definition of R''_1(p), respectively.
Define an operator FM'' that maps a convex 15-dimensional set of rate vectors of the form (5.20) to a 3-dimensional rate region by substituting R_{l1} = R_l − R_{l2} − R_{l3}, for l ∈ {1:3}, and subsequently projecting onto the coordinates (R_1, R_2, R_3). The operator FM'' can be implemented by Fourier–Motzkin elimination.
Corollary 5.2 (Inner bound to the capacity region of 3-DIC, no U_l). The region

R''_IB = ⋃_p FM''{ R''_1(p) ∩ R''_2(p) ∩ R''_3(p) },

where p = p(q)p(x_1|q)p(x_2|q)p(x_3|q), is an inner bound to the capacity region of the 3-DIC.
Remark 5.10 (R''_IB is computationally simpler than R'_IB and R_IB). By virtue of containing far fewer min terms on the left-hand side of its inequalities, the region of Corollary 5.2 is easier to evaluate computationally than the regions in Theorem 5.1 and Corollary 5.1. (See Remark 5.8.)
Continuation of Example 1.1. Recall the additive 3-DIC in Example 1.1 on page 8.
Figure 5.1 depicts a numerical approximation of the inner bound in Corollary 5.2. The
optimal trade-off between R1 and R2 when R3 = 0 is achieved in Figure 5.1, as per
Remark 5.5. The same is not true for the interference decoding inner bound in Figure 3.2
on page 37, or the inner bound by treating interference as noise in Figure 2.1 on page 14.
Figure 5.2 depicts the intersection of the three-dimensional regions with the plane defined
by R1 = R3. The intersection highlights the improvement of Corollary 5.2, and thus of the
general achievable rate region in Theorem 5.1, over the interference decoding inner bound.
5.1.4 Special case: One-to-many 3-DIC
Consider the one-to-many 3-DIC depicted in Figure 5.3. In this special case, interference
is caused only by the second transmitter, i.e., X12 = X13 = X31 = X32 = ∅. Furthermore,
there are no loss functions on the desired links, i.e., X11 = X1, Y2 = X22 = X2, and
Figure 5.1. Region of Corollary 5.2 for the additive 3-DIC example. Compare to Figure 2.1 onpage 14 and Figure 3.2 on page 37.
Figure 5.2. Comparison of the regions in Theorem 2.1 (treating interference as noise), Theorem 3.1 (interference decoding), and Corollary 5.2 for the additive 3-DIC example.
X33 = X3. This special case is of interest since the region of Theorem 5.1 is sufficiently
simple to permit symbolic evaluation of the Fourier–Motzkin elimination steps. The result
for the special case is thus more concrete than the general result.
Let R(1)(p) be the set of triples (R1, R2, R3) ∈ R3+ satisfying
Rk ≤ H(Xk |Q), k ∈ {1:3},
R1 +R2 ≤ H(Y1 |Q) +H(X2 |U2, X21, Q),
R2 +R3 ≤ H(Y3 |Q) +H(X2 |U2, X23, Q),
Figure 5.3. One-to-many special case of 3-DIC.
R1 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X21, X23 |U2, Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y1 |Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y3 |Q).
Let R(2)(p) be the set of triples (R1, R2, R3) ∈ R3+ satisfying
R1 ≤ I(X1;Y1 |Q),
Rk ≤ H(Xk |Q), k ∈ {2, 3},
R2 +R3 ≤ H(Y3 |Q) +H(X2 |U2, X23, Q),
R1 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X21, X23 |U2, Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y3 |Q).
Let R(3)(p) be the set of triples (R1, R2, R3) ∈ R3+ satisfying
Rk ≤ H(Xk |Q), k ∈ {1, 2},
R3 ≤ I(X3;Y3 |Q),
R1 +R2 ≤ H(Y1 |Q) +H(X2 |U2, X21, Q),
R1 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X21, X23 |U2, Q),
R1 +R2 +R3 ≤ I(X1;Y1 |U2, Q) + I(X3;Y3 |U2, Q) +H(X2 |U2, Q) + I(U2;Y1 |Q).
Corollary 5.3 (Inner bound to the capacity region of one-to-many 3-DIC). The region

⋃_p ( R^{(1)}(p) ∪ R^{(2)}(p) ∪ R^{(3)}(p) ),

where p = p(q)p(x_1|q)p(u_2, x_2|q)p(x_3|q), is an inner bound to the capacity region of the one-to-many 3-DIC.
Taking the union over p of the sets R^{(1)}, R^{(2)}, and R^{(3)} individually results in three different inner bounds, none of which dominates the others in general. The conditions for R^{(2)} and R^{(3)} are of the same form as if the first and the third receiver, respectively, were treating interference as noise. The region R^{(1)} then corresponds to both the first and the third receiver decoding part of the second transmitter's message. Note, however, that all three regions are achieved
by the same encoding scheme and with the same decoding rule. As in the simpler setting of
interference decoding with point-to-point codes in Chapter 3, treating interference as noise
is subsumed in the general coding scheme as a special case. The proof of Corollary 5.3 is
given in Subsection 5.2.4.
5.2 Proofs
5.2.1 Codebook generation for Theorem 5.1 and Corollary 5.1
For the sake of simplified notation, we omit the auxiliary random variable Q throughout
this subsection and the subsequent analysis. To obtain the proof with Q, the codebook
generation procedure as described below must be augmented by generating a coded time
sharing sequence qn i.i.d. from p(q), and conditioning all subsequent analysis steps on it in
the usual way [EK11].
Fix a pmf p(u1, x1)p(u2, x2)p(u3, x3). We begin by describing the generation procedure
for the first transmitter. The codebook is constructed as in the deterministic case of communication with two disturbance constraints (see Theorem 4.3 on page 61 and its proof in Subsection 4.3.1 on page 77), using rate splitting, Marton coding, and superposition coding.
The crosslink outputs X12 and X13 take the place of the side receivers that are not interested
in any part of the message.
Split the rate as R_1 = R_{10} + R_{11} + R_{12} + R_{13}. Define the auxiliary rates R̃_{12} ≥ R_{12} and R̃_{13} ≥ R_{13}, let ε' > 0, and define the set partitions

{1:2^{nR̃_{12}}} = L_{12}(1) ∪ · · · ∪ L_{12}(2^{nR_{12}}),
{1:2^{nR̃_{13}}} = L_{13}(1) ∪ · · · ∪ L_{13}(2^{nR_{13}}),

where L_{12}(·) and L_{13}(·) are indexed sets of size 2^{n(R̃_{12}−R_{12})} and 2^{n(R̃_{13}−R_{13})}, respectively.
1. For each m_{10} ∈ {1:2^{nR_{10}}}, generate u_1^n(m_{10}) according to ∏_{i=1}^n p(u_{1i}).

2. For each l_{12} ∈ {1:2^{nR̃_{12}}}, generate x_{12}^n(m_{10}, l_{12}) according to ∏_{i=1}^n p(x_{12i} | u_{1i}(m_{10})). Likewise, for each l_{13} ∈ {1:2^{nR̃_{13}}}, generate a sequence x_{13}^n(m_{10}, l_{13}) according to ∏_{i=1}^n p(x_{13i} | u_{1i}(m_{10})).

3. For each triple (m_{10}, m_{12}, m_{13}), let S(m_{10}, m_{12}, m_{13}) be the set of all pairs (l_{12}, l_{13}) from the product set L_{12}(m_{12}) × L_{13}(m_{13}) such that (x_{12}^n(m_{10}, l_{12}), x_{13}^n(m_{10}, l_{13})) ∈ T_{ε'}^{(n)}(X_{12}, X_{13} | u_1^n(m_{10})).

4. For each (m_{10}, l_{12}, l_{13}) and m_{11} ∈ {1:2^{nR_{11}}}, generate x_1^n(m_{10}, l_{12}, l_{13}, m_{11}) according to ∏_{i=1}^n p(x_{1i} | u_{1i}(m_{10}), x_{12i}(l_{12}), x_{13i}(l_{13})) if (l_{12}, l_{13}) ∈ S(m_{10}, m_{12}, m_{13}). Otherwise, draw it from Unif(X^n).

5. Choose (l_{12}^{(m_{10}, m_{12}, m_{13})}, l_{13}^{(m_{10}, m_{12}, m_{13})}) uniformly at random from S(m_{10}, m_{12}, m_{13}). If S(m_{10}, m_{12}, m_{13}) is empty, choose (1, 1).
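Steps 3 and 5 above can be sketched as follows, with the typicality test replaced by a toy joint-type check on random binary sequences (the bin sizes, block length, and acceptance criterion are illustrative simplifications, not the dissertation's parameters):

```python
import itertools
import random

random.seed(0)
n = 16                                                                # toy block length
bin12 = [[random.randint(0, 1) for _ in range(n)] for _ in range(8)]  # L12(m12)
bin13 = [[random.randint(0, 1) for _ in range(n)] for _ in range(8)]  # L13(m13)

def jointly_typical(a, b, target=0.5, tol=0.2):
    """Toy stand-in for the joint-typicality test: the empirical agreement
    between the two sequences must be close to its expected value."""
    agree = sum(x == y for x, y in zip(a, b)) / len(a)
    return abs(agree - target) <= tol

# Step 3: S collects all index pairs whose sequences pass the joint test.
S = [(l12, l13)
     for l12, l13 in itertools.product(range(len(bin12)), range(len(bin13)))
     if jointly_typical(bin12[l12], bin13[l13])]

# Step 5: choose one pair uniformly at random; fall back to a fixed pair
# (here (0, 0), standing in for (1, 1)) if S is empty.
chosen = random.choice(S) if S else (0, 0)
```

The uniform choice in the last line is precisely what the independence lemma (Lemma A.2) requires of the selection step.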
Note the notational difference in the way the rates R1i are indexed here as opposed
to Subsection 4.3.1. Here, the steps in codebook generation follow the scheme R_{10} → (R_{12}, R_{13}) → R_{11}. The first index of the rate variables is always 1 and represents the first
transmitter. The second index uses the intuition that R10 is “common” in the sense that it
affects both side receivers, and R11 is “private” and does not appear at either side receiver.
(In Subsection 4.3.1, there is only one transmitter and thus no first index. The second index
follows the notational convention R_0 → (R_1, R_2) → R_3.)
Codebooks for the second and third transmitters are generated similarly by applying the subscript substitutions 1 ↦ 2 ↦ 3 ↦ 1 and 1 ↦ 3 ↦ 2 ↦ 1 in each step of the procedure.
Encoding. To send message m_1 = (m_{10}, m_{12}, m_{13}, m_{11}), transmit x_1^n(m_{10}, l_{12}^{(m_{10}, m_{12}, m_{13})}, l_{13}^{(m_{10}, m_{12}, m_{13})}, m_{11}).
Decoding. The receivers use simultaneous non-unique decoding. The first receiver
observes yn1 . Define the tuple
T (m10,m12,m13,m11,m20, l21,m30, l31)
=(un1 (m10), xn12(m10, l
(m10,m12,m13)12 ), xn13(m10, l
(m10,m12,m13)13 ),
106 CHAPTER 5. GENERAL ACHIEVABLE RATE REGION FOR 3-DIC
xn1 (m10, l(m10,m12,m13)12 , l
(m10,m12,m13)13 ,m11),
un2 (m20), xn21(m20, l21), un3 (m30), xn31(m30, l31), sn1 (m20, l21,m30, l31), yn1
).
Let $\varepsilon > \varepsilon'$. Declare that $\hat{m}_1 = (\hat{m}_{10}, \hat{m}_{12}, \hat{m}_{13}, \hat{m}_{11})$ has been sent if it is the unique message such that
$$T(\hat{m}_{10}, \hat{m}_{12}, \hat{m}_{13}, \hat{m}_{11}, m_{20}, l_{21}, m_{30}, l_{31}) \in \mathcal{T}_\varepsilon^{(n)}(U_1, X_{12}, X_{13}, X_1, U_2, X_{21}, U_3, X_{31}, S_1, Y_1)$$
for some $m_{20}, l_{21}, m_{30}, l_{31}$.
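The rule "unique in the desired message, existential in the others" can be illustrated on a much smaller assumed example: a two-sender deterministic channel $y = x_1 \oplus x_2$, not the 3-DIC itself. For a deterministic channel, joint typicality of $(x_1(\hat{m}_1), x_2(m_2), y^n)$ forces consistency in every position, so the typicality check reduces to exact equality; the receiver recovers $\hat{m}_1$ uniquely, while uniqueness in $m_2$ is never required.

```python
import numpy as np

rng = np.random.default_rng(1)
n, M1, M2 = 64, 4, 4                    # toy block length and codebook sizes

C1 = rng.integers(0, 2, size=(M1, n))   # random codebook of sender 1
C2 = rng.integers(0, 2, size=(M2, n))   # random codebook of sender 2

m1_true, m2_true = 2, 3
y = C1[m1_true] ^ C2[m2_true]           # deterministic channel y = x1 XOR x2

# simultaneous non-unique decoding: m1_hat must be unique,
# the companion index m2 only needs to exist
consistent = {a for a in range(M1) for b in range(M2)
              if np.array_equal(C1[a] ^ C2[b], y)}
m1_hat = consistent.pop() if len(consistent) == 1 else None
```

With high probability no incorrect pair matches the output in all $n$ positions, so the decoder returns the transmitted first index.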
5.2.2 Error probability analysis for Corollary 5.1
Without loss of generality, assume that $m_{l0} = m_{l1} = m_{l2} = m_{l3} = 1$ is transmitted by each user $l \in \{1:3\}$. To analyze the probability of error at the first receiver, define the following events.
$$\begin{aligned} \mathcal{E}_{e1} &: \mathcal{S}(1,1,1) \text{ is empty},\\ \mathcal{E}_{e2} &: \mathcal{S}(1,1,1) \text{ contains two distinct pairs with equal first or second component},\\ \mathcal{E}_0 &= \{T(1,1,1,1,m_{20},l_{21},m_{30},l_{31}) \notin \mathcal{T}_\varepsilon^{(n)} \text{ for all } m_{20}, l_{21}, m_{30}, l_{31}\},\\ \mathcal{E}_{ijk} &= \{T(m_{10},m_{12},m_{13},m_{11},m_{20},l_{21},m_{30},l_{31}) \in \mathcal{T}_\varepsilon^{(n)}\\ &\qquad \text{for some } (m_{10},m_{12},m_{13},m_{11}) \in \mathcal{M}_{1i},\ (m_{20},l_{21}) \in \mathcal{M}_{2j},\ (m_{30},l_{31}) \in \mathcal{M}_{3k}\}, \end{aligned}$$
for $i \in \{1:5\}$, $j \in \{1:3\}$, $k \in \{1:3\}$, where the message subsets $\mathcal{M}_{1i}$, $\mathcal{M}_{2j}$, and $\mathcal{M}_{3k}$ are specified in Tables 5.4, 5.5, and 5.6.
With "encoding" and "decoding" error events
$$\mathcal{E}_e = \mathcal{E}_{e1} \cup \mathcal{E}_{e2}, \qquad \mathcal{E}_d = \mathcal{E}_0 \cup \bigcup_{i,j,k} \mathcal{E}_{ijk},$$
the probability of error is upper bounded by
$$\mathrm{P}(\mathcal{E}) \le \mathrm{P}(\mathcal{E}_e) + \mathrm{P}(\mathcal{E}_d \mid \mathcal{E}_e^c).$$
As in the case of communication with disturbance constraints (Subsection 4.3.1), $\mathrm{P}(\mathcal{E}_e) \to 0$ as $n \to \infty$ if conditions (5.2), (5.3), and (5.4) are fulfilled. We treat $\mathrm{P}(\mathcal{E}_d \mid \mathcal{E}_e^c)$ term by term via the union bound. First, note that by the conditional typicality lemma in [EK11], $\mathrm{P}(\mathcal{E}_0 \mid \mathcal{E}_e^c) \to 0$ as $n \to \infty$ (this relies on $\varepsilon' < \varepsilon$). Next, we bound each term $\mathrm{P}(\mathcal{E}_{ijk} \mid \mathcal{E}_e^c)$. For each $i \in \{1:5\}$, $j \in \{1:3\}$, and $k \in \{1:3\}$, we show that $\mathrm{P}(\mathcal{E}_{ijk} \mid \mathcal{E}_e^c) \to 0$ as $n \to \infty$ is implied by condition (5.17) with the same indices $i$, $j$, and $k$.
As an example, consider the case of $i = 3$, $j = 3$, $k = 2$. The probability of the event
$$\mathcal{E}_{332} = \big\{T(1, 1, m_{13}, m_{11}, m_{20}, l_{21}, 1, l_{31}) \in \mathcal{T}_\varepsilon^{(n)} \text{ for some } m_{13} \neq 1,\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\},$$
conditioned on $\mathcal{E}_e^c$, tends to zero as $n \to \infty$ if (5.18) is true. Recalling the expansion (5.19),
i   Message subset   m10    m12    m13    m11
1   M11              1      1      1      ≠ 1
2   M12              1      ≠ 1    1      any
3   M13              1      1      ≠ 1    any
4   M14              1      ≠ 1    ≠ 1    any
5   M15              ≠ 1    any    any    any

Table 5.4. Message subsets $\mathcal{M}_{1i}$.
j   Message subset   m20    l21
1   M21              1      = L21^(1,1,1)
2   M22              1      ≠ L21^(1,1,1)
3   M23              ≠ 1    any

Table 5.5. Message subsets $\mathcal{M}_{2j}$.
the claim is that $\mathrm{P}(\mathcal{E}_{332} \mid \mathcal{E}_e^c) \to 0$ follows from any of the following sufficient conditions:
$$\begin{aligned} \tilde{R}_{13} + R_{11} + R_{20} + \tilde{R}_{21} + \tilde{R}_{31} &\le \Diamond, &&(5.21\text{a})\\ \tilde{R}_{13} + R_{11} + R_{20} + \tilde{R}_{21} + H(X_{31} \mid U_3) &\le \Diamond, &&(5.21\text{b})\\ \tilde{R}_{13} + R_{11} + R_{20} + H(X_{21} \mid U_2) + \tilde{R}_{31} &\le \Diamond, &&(5.21\text{c})\\ \tilde{R}_{13} + R_{11} + R_{20} + H(X_{21} \mid U_2) + H(X_{31} \mid U_3) &\le \Diamond, &&(5.21\text{d})\\ \tilde{R}_{13} + R_{11} + H(X_{21}) + \tilde{R}_{31} &\le \Diamond, &&(5.21\text{e})\\ \tilde{R}_{13} + R_{11} + H(S_1 \mid U_3) &\le \Diamond, &&(5.21\text{f}) \end{aligned}$$
where $\Diamond$ stands for the right-hand side of conditions (5.18) and (5.19), namely $H(Y_1 \mid U_1, X_{12}, U_3) + I(X_{12}; X_{13} \mid U_1)$.
Recall the definition of the tuple $T$ and write
$$\begin{aligned} \mathcal{E}_{332} = \big\{ &\big(U_1^n(1),\ X_{12}^n(1, L_{12}^{(1,1,m_{13})}),\ X_{13}^n(1, L_{13}^{(1,1,m_{13})}),\ X_1^n(1, L_{12}^{(1,1,m_{13})}, L_{13}^{(1,1,m_{13})}, m_{11}),\\ &\ U_2^n(m_{20}),\ X_{21}^n(m_{20}, l_{21}),\ U_3^n(1),\ X_{31}^n(1, l_{31}),\ S_1^n(m_{20}, l_{21}, 1, l_{31}),\ Y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{13} \neq 1,\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\}\\ \subseteq \big\{ &\big(U_1^n(1),\ X_{12}^n(1, L_{12}^{(1,1,m_{13})}),\ X_{13}^n(1, l_{13}),\ X_1^n(1, L_{12}^{(1,1,m_{13})}, l_{13}, m_{11}),\\ &\ U_2^n(m_{20}),\ X_{21}^n(m_{20}, l_{21}),\ U_3^n(1),\ X_{31}^n(1, l_{31}),\ S_1^n(m_{20}, l_{21}, 1, l_{31}),\ Y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{13} \neq 1,\ l_{13} \notin \mathcal{L}_{13}(1),\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\}, \end{aligned}$$
where we have augmented the event by replacing the random variables $\{L_{13}^{(1,1,m_{13})} : m_{13} \neq 1\}$ with the set of all their possible values, i.e., all $l_{13} \notin \mathcal{L}_{13}(1)$.
There are two cases, depending on whether or not E332 occurs in conjunction with the
k   Message subset   m30    l31
1   M31              1      = L31^(1,1,1)
2   M32              1      ≠ L31^(1,1,1)
3   M33              ≠ 1    any

Table 5.6. Message subsets $\mathcal{M}_{3k}$.
event $\mathcal{E}_{\mathrm{eq}} = \{L_{12}^{(1,1,m_{13})} = L_{12}^{(1,1,1)}\}$. We write $\mathrm{P}(\mathcal{E}_{332} \mid \mathcal{E}_e^c) = \mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}} \mid \mathcal{E}_e^c) + \mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}^c \mid \mathcal{E}_e^c)$ and treat both terms separately. First, consider
$$\begin{aligned} \mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}} \subseteq \big\{ &\big(U_1^n(1),\ X_{12}^n(1, L_{12}^{(1,1,1)}),\ X_{13}^n(1, l_{13}),\ X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}),\\ &\ U_2^n(m_{20}),\ X_{21}^n(m_{20}, l_{21}),\ U_3^n(1),\ X_{31}^n(1, l_{31}),\ S_1^n(m_{20}, l_{21}, 1, l_{31}),\ Y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } l_{13} \notin \mathcal{L}_{13}(1),\ m_{11},\ m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)}\big\}. \end{aligned}$$
Thus,
$$\begin{aligned} &\mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}} \mid \mathcal{E}_e^c)\\ &\le \sum_{(u_1^n, x_{12}^n, u_3^n, y_1^n) \in \mathcal{T}_\varepsilon^{(n)}} \mathrm{P}\big\{U_1^n(1) = u_1^n,\ X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n,\ U_3^n(1) = u_3^n,\ Y_1^n = y_1^n \,\big|\, \mathcal{E}_e^c\big\}\\ &\quad \cdot \sum_{l_{13} \notin \mathcal{L}_{13}(1)}\ \sum_{m_{11}=1}^{2^{nR_{11}}} \mathrm{P}\big\{\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), U_2^n(m_{20}), X_{21}^n(m_{20}, l_{21}),\\ &\qquad\qquad u_3^n, X_{31}^n(1, l_{31}), S_1^n(m_{20}, l_{21}, 1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)} \text{ for some } m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}\\ &\le 2^{n(\tilde{R}_{13} + R_{11})}\, P_1, \hspace{8em} (5.22) \end{aligned}$$
where $P_1$ is shorthand for the last $\mathrm{P}\{\cdot\}$ expression. The conditioning $u_1^n, x_{12}^n, u_3^n, y_1^n$ in $P_1$ is our abbreviating notation for $U_1^n(1) = u_1^n$, $X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n$, $U_3^n(1) = u_3^n$, $Y_1^n = y_1^n$. The expression for $P_1$ can be bounded in several ways, each leading to one of the conditions (5.21a) through (5.21f).
To show that conditions (5.21a) to (5.21e) are sufficient, we bound $P_1$ by omitting $S_1^n$ from the typicality requirement:
$$\begin{aligned} P_1 \le \mathrm{P}\big\{ &\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), U_2^n(m_{20}), X_{21}^n(m_{20}, l_{21}), u_3^n, X_{31}^n(1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}. \end{aligned}$$
The subsequent process of bounding $P_1$ decomposes into two stages. The first stage treats the signals from transmitters 2 and 3 using methods from interference decoding with point-to-point codes as developed in Chapter 3. The second stage treats the signals from the desired transmitter and borrows from the analysis of communication with disturbance constraints in Chapter 4.
The first stage itself is subdivided into a union-bound step and a step based on Corol-
lary A.2. Let us focus on (5.21d) to illustrate the proof. In this case, the union bound is
applied to the index m20, and Corollary A.2 accounts for the remaining indices l21 and l31.
$$\begin{aligned} P_1 \le \sum_{m_{20}=2}^{2^{nR_{20}}}\ &\underbrace{\sum_{u_2^n \in \mathcal{T}_\varepsilon^{(n)}(U_2 \mid u_1^n, x_{12}^n, u_3^n, y_1^n)}}_{\doteq\ 2^{nH(U_2 \mid U_1, X_{12}, U_3, Y_1)}}\ \underbrace{\mathrm{P}\big\{U_2^n(m_{20}) = u_2^n \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}}_{\doteq\ 2^{-nH(U_2)}}\\ &\cdot \mathrm{P}\big\{\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), u_2^n, X_{21}^n(m_{20}, l_{21}), u_3^n, X_{31}^n(1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\qquad \text{for some } l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_2^n, u_3^n, y_1^n\big\}\\ \le\ & 2^{n(R_{20} + H(U_2 \mid U_1, X_{12}, U_3, Y_1) - H(U_2) + \delta_1(\varepsilon))}\, P_2, \hspace{8em} (5.23) \end{aligned}$$
where $P_2$ is shorthand for the last probability term.
As the second step, we bound $P_2$ by using Corollary A.2, with
$$\begin{aligned} A_i &= (X_{21}^n(m_{20}, l_{21}), X_{31}^n(1, l_{31})), \quad i = (l_{21}, l_{31}),\\ D &= (X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11})),\\ \mathcal{Q} &= \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1, X_{21}, X_{31} \mid u_1^n, x_{12}^n, u_2^n, u_3^n, y_1^n). \end{aligned}$$
We have
$$\mathcal{Q}_A = \mathcal{T}_\varepsilon^{(n)}(X_{21}, X_{31} \mid u_1^n, x_{12}^n, u_2^n, u_3^n, y_1^n), \qquad |\mathcal{Q}_A| \le 2^{n(H(X_{21}, X_{31} \mid U_1, X_{12}, U_2, U_3, Y_1) + \delta_2(\varepsilon))}. \qquad (5.24)$$
The final building block for using the corollary is to analyze $\mathrm{P}\{(a, D) \in \mathcal{Q}\}$ for any fixed $a = (x_{21}^n, x_{31}^n)$. This constitutes the second stage of the bounding procedure, as it relates only to signals from the first (desired) transmitter. We have
$$\begin{aligned} &\mathrm{P}\{(a, D) \in \mathcal{Q}\}\\ &= \mathrm{P}\big\{(X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11})) \in \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1 \mid u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n)\\ &\hspace{18em} \big|\ \mathcal{E}_e^c, u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n\big\}\\ &= \sum_{(x_{13}^n, x_1^n) \in \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1 \mid u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n)} \mathrm{P}\big\{X_{13}^n(1, l_{13}) = x_{13}^n,\ X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}) = x_1^n\\ &\hspace{6em} \big|\ \mathcal{E}_e^c,\ U_1^n(1) = u_1^n,\ X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n,\ U_2^n(m_{20}) = u_2^n,\\ &\hspace{6em}\phantom{\big|\ } X_{21}^n(m_{20}, l_{21}) = x_{21}^n,\ U_3^n(1) = u_3^n,\ X_{31}^n(1, l_{31}) = x_{31}^n,\ Y_1^n = y_1^n\big\}\\ &\stackrel{(a)}{=} \underbrace{\sum_{(x_{13}^n, x_1^n) \in \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1 \mid u_1^n, x_{12}^n, u_2^n, x_{21}^n, u_3^n, x_{31}^n, y_1^n)}}_{\doteq\ 2^{nH(X_{13}, X_1 \mid U_1, X_{12}, U_2, X_{21}, U_3, X_{31}, Y_1)}}\ \underbrace{\mathrm{P}\big\{X_{13}^n(1, l_{13}) = x_{13}^n \,\big|\, U_1^n(1) = u_1^n\big\}}_{\doteq\ 2^{-nH(X_{13} \mid U_1)}}\\ &\hspace{6em} \cdot \underbrace{\mathrm{P}\big\{X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}) = x_1^n \,\big|\, x_{12}^n, x_{13}^n, u_1^n\big\}}_{\doteq\ 2^{-nH(X_1 \mid X_{12}, X_{13}, U_1)}}\\ &\le 2^{n(-H(Y_1 \mid U_1, X_{12}, U_3) - I(X_{12}; X_{13} \mid U_1) + H(U_2, X_{21}) + H(X_{31} \mid U_3) - H(U_2, X_{21}, X_{31} \mid U_1, X_{12}, U_3, Y_1) + \delta_3(\varepsilon))}\\ &\stackrel{(b)}{=} P_D. \hspace{8em} (5.25) \end{aligned}$$
In (a), the first term under the sum arises by omitting irrelevant conditions. In particular, the conditions $U_2^n(m_{20}) = u_2^n$ and $X_{21}^n(m_{20}, l_{21}) = x_{21}^n$ can be omitted because $m_{20} \neq 1$. The conditions $U_3^n(1) = u_3^n$ and $X_{31}^n(1, l_{31}) = x_{31}^n$ are omitted because of the Markov chain $(U_3^n, X_{31}^n) - Y_1^n - X_{13}^n$. The condition $X_{12}^n(1, L_{12}^{(1,1,1)}) = x_{12}^n$ can be omitted since $l_{13} \notin \mathcal{L}_{13}(1)$. Finally, the conditions $\mathcal{E}_e^c$ and $Y_1^n = y_1^n$ can be omitted because they relate only to the $(1,1)$ bin for $m_{10} = 1$, whereas $X_{13}^n(1, l_{13})$ relates to a bin other than the first due to $l_{13} \notin \mathcal{L}_{13}(1)$. Similar simplifications lead to the second term under the sum. In (b), we note that the expression is independent of $a$ and can thus serve as the required $P_D$ for Corollary A.2.
Backtracking through the stages of the proof, Corollary A.2 implies P2 ≤ |QA| · PD,
where the two terms are given in (5.24) and (5.25). Substituting in (5.23) and eventually
in (5.22) implies that P(E332 ∩ Eeq | Ece )→ 0 as n→∞ follows from (5.21d).
The remaining conditions among (5.21a) through (5.21e) can be shown by varying the division of labor between the union bound and Corollary A.2, that is, by applying the union bound to different subsets of the indices $\{m_{20}, l_{21}, l_{31}\}$. Table 5.7 summarizes the correspondence between subsets and conditions. Note that the layering in codebook generation implies that if the union bound is used on $l_{21}$, then it must also be used on $m_{20}$; only the subsets that satisfy this layering condition appear in the table. The last row of the table corresponds to treating all three indices via the corollary. This leads to a bound that is dominated by (5.21f) and is thus irrelevant.
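The division-of-labor rule behind Table 5.7 can be made mechanical. The Python sketch below regenerates the left-hand sides of the sufficient conditions from the chosen index subset; the rate and entropy labels (with `R~` marking the bin-index rates) are purely symbolic and are read off conditions (5.21a) through (5.21e), so the mapping encoded here is an illustration of the stated rule rather than an independent derivation.

```python
# rate contributed by each union-bounded index
union_rate = {"m20": "R20", "l21": "R~21", "l31": "R~31"}

# entropy term contributed by the indices left to Corollary A.2
# (read off (5.21a)-(5.21e); for {m20, l21} the entropy is unconditioned
# on U2 because m20 is itself among the unknown indices)
corollary_term = {
    frozenset(): "",
    frozenset({"l31"}): "H(X31|U3)",
    frozenset({"l21"}): "H(X21|U2)",
    frozenset({"l21", "l31"}): "H(X21|U2) + H(X31|U3)",
    frozenset({"m20", "l21"}): "H(X21)",
}

def lhs(union_set):
    # layering: a union bound on l21 requires a union bound on m20 as well
    assert "l21" not in union_set or "m20" in union_set
    terms = ["R~13", "R11"] + [union_rate[i] for i in ("m20", "l21", "l31")
                               if i in union_set]
    rest = corollary_term[frozenset({"m20", "l21", "l31"}) - frozenset(union_set)]
    return " + ".join(terms + ([rest] if rest else []))

table_5_7 = {
    "(5.21a)": lhs({"m20", "l21", "l31"}),
    "(5.21b)": lhs({"m20", "l21"}),
    "(5.21c)": lhs({"m20", "l31"}),
    "(5.21d)": lhs({"m20"}),
    "(5.21e)": lhs({"l31"}),
}
```

Each generated left-hand side is compared against the common right-hand side $\Diamond$; the tightest applicable condition depends on which of the rate and entropy terms are smaller.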
To show that condition (5.21f) is sufficient, we bound $P_1$ by omitting $U_2^n$, $X_{21}^n$, and $X_{31}^n$ from the typicality requirement,
$$\begin{aligned} P_1 \le \mathrm{P}\big\{ &\big(u_1^n, x_{12}^n, X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11}), u_3^n, S_1^n(m_{20}, l_{21}, 1, l_{31}), y_1^n\big) \in \mathcal{T}_\varepsilon^{(n)}\\ &\text{for some } m_{20} \neq 1,\ l_{21},\ l_{31} \neq L_{31}^{(1,1,1)} \,\big|\, \mathcal{E}_e^c, u_1^n, x_{12}^n, u_3^n, y_1^n\big\}, \end{aligned}$$
and then use Corollary A.2 with
$$\begin{aligned} A_i &= S_1^n(m_{20}, l_{21}, 1, l_{31}), \quad i = (m_{20}, l_{21}, l_{31}),\\ D &= (X_{13}^n(1, l_{13}), X_1^n(1, L_{12}^{(1,1,1)}, l_{13}, m_{11})),\\ \mathcal{Q} &= \mathcal{T}_\varepsilon^{(n)}(X_{13}, X_1, S_1 \mid u_1^n, x_{12}^n, u_3^n, y_1^n). \end{aligned}$$
In order to complete the proof, we need to consider the event $\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}^c$. We omit the details, as they do not contain new ideas. As for $\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}$, the analysis decomposes into two
Union bound applied for indices     Condition
{m20, l21, l31}                     (5.21a)
{m20, l21}                          (5.21b)
{m20, l31}                          (5.21c)
{m20}                               (5.21d)
{l31}                               (5.21e)
{}                                  (none)

Table 5.7. Index subsets for union bound and corresponding sufficient conditions.
stages, relating to transmitters 2 and 3, and to transmitter 1, respectively. The first stage uses a combination of the union bound and Corollary A.2. In the second stage, the independence lemma (Lemma A.2) is needed to rule out correlation leakage through the Marton selection process. Eventually, the conditions for $\mathrm{P}(\mathcal{E}_{332} \cap \mathcal{E}_{\mathrm{eq}}^c \mid \mathcal{E}_e^c) \to 0$ are subsumed by the conditions for $\mathrm{P}(\mathcal{E}_{432}) \to 0$.
This concludes the proof of Corollary 5.1. $\blacksquare$
5.2.3 Equivalence of Theorem 5.1 and Corollary 5.1
Finally, we show that the regions RIB and R ′IB in Theorem 5.1 and Corollary 5.1 are equal.
It is clear that RIB ⊆ R ′IB, since the conditions of the theorem are more stringent than those
of the corollary (compare Remark 5.8). To show the converse inclusion, we establish that
every rate point in R ′IB is contained in RIB. The key idea is to vary the auxiliary rates that
define the rate split while keeping the overall rate unchanged. The procedure is analogous to
the analysis of the two-user-pair case in Appendix B.1.
Consider a fixed distribution $p$. We are given a rate split
$$(R_{10}', R_{12}', R_{13}', \tilde{R}_{12}', \tilde{R}_{13}', R_{11}'),$$
which satisfies the conditions of Corollary 5.1. Define
$$\Delta_{12} = \tilde{R}_{12}' - \min\{\tilde{R}_{12}', H(X_{12} \mid U_1)\}, \qquad \Delta_{13} = \tilde{R}_{13}' - \min\{\tilde{R}_{13}', H(X_{13} \mid U_1)\},$$
and let the modified rate split $(R_{10}, R_{12}, R_{13}, \tilde{R}_{12}, \tilde{R}_{13}, R_{11})$ be given as
$$\begin{aligned} R_{10} &= R_{10}', & R_{12} &= R_{12}' - \Delta_{12}, & R_{13} &= R_{13}' - \Delta_{13},\\ \tilde{R}_{12} &= \tilde{R}_{12}' - \Delta_{12}, & \tilde{R}_{13} &= \tilde{R}_{13}' - \Delta_{13}, & R_{11} &= R_{11}' + \Delta_{12} + \Delta_{13}. \end{aligned}$$
First note that this rate split maintains the same overall rate $R_{10} + R_{12} + R_{13} + R_{11}$ as the original rate split. To verify non-negativity of each component rate, first note that $R_{10}, \tilde{R}_{12}, \tilde{R}_{13}, R_{11} \ge 0$ by definition. Furthermore,
$$R_{12} = R_{12}' - \tilde{R}_{12}' + \min\{\tilde{R}_{12}', H(X_{12} \mid U_1)\} \ge 0,$$
where the inequality follows from $R_{12}' \ge 0$ and condition (5.3). Likewise, it follows that $R_{13} \ge 0$. The modified rate split is thus valid. It remains to be shown that $(R_{10}, R_{12}, R_{13}, \tilde{R}_{12}, \tilde{R}_{13}, R_{11})$ satisfies conditions (5.2) to (5.15), using the fact that the tuple $(R_{10}', R_{12}', R_{13}', \tilde{R}_{12}', \tilde{R}_{13}', R_{11}')$ satisfies (5.2) to (5.6) and (5.17). This is a tedious but straightforward exercise, which we omit here. This concludes the proof that the statements of Theorem 5.1 and Corollary 5.1 are equivalent, and thereby, the proof of Theorem 5.1. $\blacksquare$
5.2.4 Proof of Corollary 5.3
We apply Theorem 5.1. Since the first transmitter does not cause any interference, we set $U_1 = \emptyset$ and $R_{10} = R_{12} = \tilde{R}_{12} = R_{13} = \tilde{R}_{13} = 0$, i.e., the entire rate $R_1$ is contained in $R_{11}$, and the codebook at the first transmitter degenerates to a non-layered (single-user) random codebook according to $p(x_1)$. We proceed analogously for the third transmitter. Using these simplifications, the intersection $\mathcal{R}_1(p) \cap \mathcal{R}_2(p) \cap \mathcal{R}_3(p)$ of Theorem 5.1 is represented by the reduced set of conditions
$$\begin{aligned} R_1 &\le H(X_1 \mid Q), &&(5.26)\\ R_1 + \tilde{R}_{21} &\le H(Y_1 \mid U_2, Q), &&(5.27)\\ R_1 + \min\{R_{20} + \tilde{R}_{21},\, H(X_{21} \mid Q)\} &\le H(Y_1 \mid Q), &&(5.28)\\ R_{22} &\le H(Y_2 \mid U_2, X_{21}, X_{23}, Q), &&(5.29)\\ \tilde{R}_{21} + R_{22} &\le H(Y_2 \mid U_2, X_{23}, Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.30)\\ \tilde{R}_{23} + R_{22} &\le H(Y_2 \mid U_2, X_{21}, Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.31)\\ \tilde{R}_{21} + \tilde{R}_{23} + R_{22} &\le H(Y_2 \mid U_2, Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.32)\\ R_{20} + \tilde{R}_{21} + \tilde{R}_{23} + R_{22} &\le H(Y_2 \mid Q) + I(X_{21}; X_{23} \mid U_2, Q), &&(5.33)\\ \tilde{R}_{21} - R_{21} + \tilde{R}_{23} - R_{23} &\ge I(X_{21}; X_{23} \mid U_2, Q), &&(5.34)\\ \tilde{R}_{21} - R_{21} + (\tilde{R}_{23} - R_{23})/2 &\le I(X_{21}; X_{23} \mid U_2, Q), &&(5.35)\\ (\tilde{R}_{21} - R_{21})/2 + \tilde{R}_{23} - R_{23} &\le I(X_{21}; X_{23} \mid U_2, Q), &&(5.36)\\ \tilde{R}_{21} &\ge R_{21}, &&(5.37)\\ \tilde{R}_{23} &\ge R_{23}, &&(5.38)\\ R_3 &\le H(X_3 \mid Q), &&(5.39)\\ R_3 + \tilde{R}_{23} &\le H(Y_3 \mid U_2, Q), &&(5.40)\\ R_3 + \min\{R_{20} + \tilde{R}_{23},\, H(X_{23} \mid Q)\} &\le H(Y_3 \mid Q), &&(5.41) \end{aligned}$$
where conditions (5.26) to (5.28) are from $\mathcal{R}_1(p)$, conditions (5.29) to (5.38) are from $\mathcal{R}_2(p)$, and conditions (5.39) to (5.41) are from $\mathcal{R}_3(p)$ in Theorem 5.1, respectively. Due to the min terms on the left-hand sides of (5.28) and (5.41), these conditions specify a non-convex region which equals the union of four convex sets $\mathcal{R}^{(1)}(p) \cup \mathcal{R}^{(2)}(p) \cup \mathcal{R}^{(3)}(p) \cup \mathcal{R}^{(4)}(p)$. Each such set is obtained by replacing the two min terms with one of their two arguments, respectively. Thus the region of Theorem 5.1 becomes
$$\begin{aligned} \bigcup_p \mathrm{FM}\Big\{\overline{\mathcal{R}_1(p) \cap \mathcal{R}_2(p) \cap \mathcal{R}_3(p)}\Big\} &= \bigcup_p \mathrm{FM}\Big\{\overline{\mathcal{R}^{(1)}(p) \cup \mathcal{R}^{(2)}(p) \cup \mathcal{R}^{(3)}(p) \cup \mathcal{R}^{(4)}(p)}\Big\}\\ &\stackrel{(a)}{=} \bigcup_p \overline{\mathrm{FM}\{\mathcal{R}^{(1)}(p)\} \cup \mathrm{FM}\{\mathcal{R}^{(2)}(p)\} \cup \mathrm{FM}\{\mathcal{R}^{(3)}(p)\} \cup \mathrm{FM}\{\mathcal{R}^{(4)}(p)\}}\\ &\stackrel{(b)}{=} \bigcup_p \overline{\mathcal{R}^{(1)}(p) \cup \mathcal{R}^{(2)}(p) \cup \mathcal{R}^{(3)}(p)}, \end{aligned}$$
where in (a), we have exchanged the convex hull operator and the Fourier–Motzkin operator
FM. In step (b), the operator FM is evaluated symbolically. For i ∈ {1:3}, the expression
FM{R(i)(p)} evaluates to the region R(i)(p) claimed in the corollary. The fourth term
FM{R(4)(p)} turns out to be a subset of R(1)(p) and can thus be omitted. Thus we have
proved Corollary 5.3. $\blacksquare$
Chapter 6
Conclusion
The main result of this dissertation is the inner bound to the capacity region of the 3-DIC given in Theorem 5.1 and the insight into coding scheme design that it entails. The inner bound strictly includes all previously known inner bounds, and thus contributes to a deeper understanding of interference channels in general. The two main ingredients are the codebook design, which is inspired by communication with disturbance constraints, and the receiver architecture, which is drawn from interference decoding.
The coding scheme that we obtained by combining these viewpoints provides a natural
extension of the Han–Kobayashi scheme to interference channels with more than two
user pairs. It turns out that the key property of the Han–Kobayashi scheme that allows
generalization is not splitting the message into public and private components and building a
layered superposition codebook. Instead, the invariant between the Han–Kobayashi scheme
for two-pair channels and its proposed generalization to the three-pair case is that both
schemes solve the underlying disturbance-constrained communication problem.
On a more abstract level, the modular approach that we have taken may be applicable
in other problems of multi-terminal information theory as well. Only by focusing on the
transmitter and receiver end of the problem individually first and solving the associated
problems in isolation did we gain the insight for approaching the 3-DIC problem.
Finally, although we described the encoding scheme and the resulting achievable rate
region using the example of deterministic channels with three user pairs, the key ideas can be
readily applied to discrete memoryless interference channels, and to interference channels
with a larger number of user pairs. The main difficulty in generalizing to interference
channels with noise is that the associated disturbance-constrained communication problem
then requires additional auxiliary random variables, and consequently, the codebook structure
at the side receivers does not simplify as in the deterministic case.
Appendix A
Useful auxiliary results
A.1 Probability decomposition by index and by value
In this section, we state and prove a lemma and two corollaries that are useful in error
probability analyses that involve saturation arguments. They bound the probability of a
union of events, such as the probability that any one of the possible incorrect message
combinations appears to be correct at a receiver in the 3-DIC. The proofs are given below.
Lemma A.1. Let $A_1, \ldots, A_n$ be identically distributed random variables from an alphabet $\mathcal{A}$, and let $D$ be a random variable from alphabet $\mathcal{D}$. Let $\mathcal{Q} \subset \mathcal{A} \times \mathcal{D}$ be a set of "qualified" pairs. Then
$$\mathrm{P}\Big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\Big) \le \sum_{i=1}^n \mathrm{P}\{(A_i, D) \in \mathcal{Q}\}, \qquad (\text{A.1})$$
$$\mathrm{P}\Big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\Big) \le \sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\}. \qquad (\text{A.2})$$
Remark A.1. Inequality (A.1) is the well-known union bound; it decomposes the probabil-
ity by index. The second inequality (A.2) decomposes the probability by value.
Remark A.2. Note that the random variable $D$ is crucial in inequality (A.2). With $D = \emptyset$, the terms in the sum are essentially indicator functions, and the right-hand side of the bound generally becomes larger than one and is thus useless. For the bound to be useful, the randomness of $D$ must act as dithering that equalizes the probability $\mathrm{P}\{(a, D) \in \mathcal{Q}\}$ over $a$.
For the following two corollaries, in addition to the assumptions in Lemma A.1, let $\mathcal{Q}_A$ be the subset of values from $\mathcal{A}$ that can qualify at all,¹ and let $P_D$ be given such that $\mathrm{P}\{(a, D) \in \mathcal{Q}\} \le P_D$ for all $a$.

Corollary A.1. $\mathrm{P}\big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\big) \le n\, \mathrm{P}\{A_i \in \mathcal{Q}_A\} \cdot P_D$.

Corollary A.2. $\mathrm{P}\big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\big) \le |\mathcal{Q}_A| \cdot P_D$.
Remark A.3. The factor nP{Ai ∈ QA} in Corollary A.1 is the expected number of random
variables Ai that qualify by themselves (for some d). It thus relates to counting random
variables, which matches the interpretation of enumerating random variable indices. On the
other hand, the factor |QA| in Corollary A.2 counts a set of values of random variables.
Proof of Lemma A.1: Inequality (A.1) is the union bound. To see inequality (A.2), consider
$$\begin{aligned} \mathrm{P}\Big(\bigcup_{i=1}^n \{(A_i, D) \in \mathcal{Q}\}\Big) &= \sum_{a^n \in \mathcal{A}^n} \mathrm{P}\{A^n = a^n\}\ \mathrm{P}\Big(\bigcup_{i=1}^n \{(a_i, D) \in \mathcal{Q}\}\Big)\\ &\stackrel{(a)}{\le} \sum_{a^n \in \mathcal{A}^n} \mathrm{P}\{A^n = a^n\}\ \mathrm{P}\Big(\bigcup_{a \in \mathcal{A}} \{(a, D) \in \mathcal{Q}\}\Big)\\ &\stackrel{(b)}{\le} \sum_{a^n \in \mathcal{A}^n} \mathrm{P}\{A^n = a^n\} \sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\}\\ &= \sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\}, \end{aligned}$$
where (a) uses the fact that the union contains at most $|\mathcal{A}|$ distinct events, and (b) uses the union bound. $\blacksquare$
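The difference between the two decompositions can be seen numerically. The toy instance below (all parameters are assumptions for illustration) is built so that only four values of $a$ can ever qualify: the by-value bound (A.2) then stays below one, which is exactly the situation exploited by Corollary A.2 with $|\mathcal{Q}_A| = 4$, while the by-index bound (A.1) saturates once $n$ is large.

```python
import numpy as np

rng = np.random.default_rng(2)
nA, n, trials = 20, 50, 200_000

# Q = {(a, d) : a < 4 and (a + d) mod 20 < 3}, so Q_A = {0, 1, 2, 3}
def qualifies(a, d):
    return (a < 4) & (((a + d) % nA) < 3)

A = rng.integers(0, nA, size=(trials, n))   # A_1, ..., A_n, i.i.d. uniform
D = rng.integers(0, nA, size=(trials, 1))   # the dithering variable D

p_union = qualifies(A, D).any(axis=1).mean()            # P(union of events)
bound_by_index = n * qualifies(A[:, :1], D).mean()      # right side of (A.1)
bound_by_value = sum(qualifies(a, D[:, 0]).mean()       # right side of (A.2)
                     for a in range(nA))
```

Here `bound_by_value` is about $4 \cdot 3/20 = 0.6$ regardless of $n$, while `bound_by_index` grows linearly in $n$ and is already vacuous at $n = 50$.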
Proof of Corollary A.1: Refine the right-hand side of (A.1) as
$$\sum_{i=1}^n \mathrm{P}\{(A_i, D) \in \mathcal{Q}\} = \sum_{i=1}^n \sum_{a \in \mathcal{Q}_A} \mathrm{P}\{A_i = a\}\ \mathrm{P}\{(a, D) \in \mathcal{Q}\} \le n\, P_D \cdot \mathrm{P}\{A_i \in \mathcal{Q}_A\}. \qquad \blacksquare$$
¹ $\mathcal{Q}_A = \{a \in \mathcal{A} : (a, d) \in \mathcal{Q} \text{ for some } d\}$.
Proof of Corollary A.2: Develop the right-hand side of (A.2) as
$$\sum_{a \in \mathcal{A}} \mathrm{P}\{(a, D) \in \mathcal{Q}\} = \sum_{a \in \mathcal{Q}_A} \mathrm{P}\{(a, D) \in \mathcal{Q}\} \le |\mathcal{Q}_A| \cdot P_D. \qquad \blacksquare$$
A.2 Independence lemma
Lemma A.2 (Independence lemma). Consider a finite set $\mathcal{A}$ and a subset $\mathcal{A}' \subset \mathcal{A}$. Let $p_A$ be an arbitrary pmf over $\mathcal{A}$. Let the random vector $A^n$ be distributed proportionally to the product distribution $\prod_{l=1}^n p_A(a_l)$, restricted to the support set $\{a^n : a_k \in \mathcal{A}' \text{ for some } k\}$. Let $I$ be drawn uniformly from $\{i : A_i \in \mathcal{A}'\}$. Let $J = ((I + s - 1) \bmod n) + 1$ for some integer $s \in \{1:(n-1)\}$. Then, the random variables $A_I$ and $A_J$ are independent.
Proof: We prove the lemma for $s = 1$; the remaining cases follow by symmetry. For ease of notation, define the specialized modulo operator $\llbracket x \rrbracket = 1 + ((x-1) \bmod n)$, the indicator function $\mathbb{1}_{\mathcal{A}'}(a) = 1$ if $a \in \mathcal{A}'$ and $0$ otherwise, and the shorthand notations $Y = A_I$ and $Z = A_J$. Notice that
$$p(a^n) = \begin{cases} \frac{1}{c} \prod_{l=1}^n p_A(a_l) & \text{if } a_k \in \mathcal{A}' \text{ for some } k \in \{1:n\},\\ 0 & \text{otherwise}, \end{cases}$$
where $c$ is a normalization constant, the exact value of which is not relevant. Further,
$$p(i \mid a^n) = \begin{cases} \frac{1}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} & \text{if } a_i \in \mathcal{A}',\\ 0 & \text{otherwise}. \end{cases}$$
The joint distribution of $(A^n, I, J, Y, Z)$ is then
$$p(a^n, i, j, y, z) = \begin{cases} \frac{p(a^n)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} & \text{if } a_i \in \mathcal{A}',\ a_i = y,\ a_j = z, \text{ and } j = \llbracket i+1 \rrbracket,\\ 0 & \text{otherwise}. \end{cases}$$
Partially marginalizing, it follows that
$$p(y, z) = \sum_{i=1}^n\ \sum_{\substack{a^n :\ a_i \in \mathcal{A}'\\ a_i = y,\ a_{\llbracket i+1 \rrbracket} = z}} \frac{p(a^n)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)}.$$
It is clear that $p(y, z) = p(y)p(z) = 0$ if $y \notin \mathcal{A}'$. On the other hand, for $y \in \mathcal{A}'$, we have
$$p(y, z) = \sum_{i=1}^n\ \sum_{\substack{a^n :\ a_i = y\\ a_{\llbracket i+1 \rrbracket} = z}} \frac{\prod_{l=1}^n p_A(a_l)}{c \sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)}.$$
The fraction under the sum is invariant under permutations of $a^n$. Therefore,
$$p(y, z) = \frac{1}{c} \sum_{i=1}^n\ \sum_{\substack{a^n :\ a_1 = y\\ a_2 = z}} \frac{\prod_{l=1}^n p_A(a_l)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} = \frac{n}{c} \sum_{a^n = (y, z, a_3^n)} \frac{\prod_{l=1}^n p_A(a_l)}{\sum_{k=1}^n \mathbb{1}_{\mathcal{A}'}(a_k)} = \frac{n\, p_A(y)\, p_A(z)}{c} \sum_{a_3^n \in \mathcal{A}^{n-2}} \frac{\prod_{l=3}^n p_A(a_l)}{1 + \mathbb{1}_{\mathcal{A}'}(z) + \sum_{k=3}^n \mathbb{1}_{\mathcal{A}'}(a_k)},$$
where $a_3^n$ are the last $n-2$ components of $a^n$. Observe that $p(y, z)$ separates into a function of $z$ and a function of $y$. Independence is thus established. $\blacksquare$
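Because the statement is exact, it can be verified exhaustively for a small instance with rational arithmetic. The choices of alphabet, pmf, $n$, and $s$ below are arbitrary; the snippet enumerates all supported sequences $a^n$, forms the exact joint pmf of $(Y, Z) = (A_I, A_J)$, and checks that it factors.

```python
import itertools
from fractions import Fraction

alphabet = (0, 1, 2)
Aprime = {1, 2}
pA = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
n, s = 4, 1

joint = {}                     # unnormalized joint pmf of (Y, Z)
c = Fraction(0)                # normalization constant of p(a^n)
for an in itertools.product(alphabet, repeat=n):
    hits = [i for i in range(n) if an[i] in Aprime]
    if not hits:
        continue               # a^n outside the support set
    w = Fraction(1)
    for a in an:
        w *= pA[a]
    c += w
    for i in hits:             # I is uniform over {i : a_i in A'}
        j = (i + s) % n        # 0-indexed version of J = ((I + s - 1) mod n) + 1
        key = (an[i], an[j])
        joint[key] = joint.get(key, Fraction(0)) + w / len(hits)

joint = {k: v / c for k, v in joint.items()}
py = {y: sum(v for (yy, _), v in joint.items() if yy == y) for y in alphabet}
pz = {z: sum(v for (_, zz), v in joint.items() if zz == z) for z in alphabet}
independent = all(joint.get((y, z), Fraction(0)) == py[y] * pz[z]
                  for y in alphabet for z in alphabet)
```

Since `Fraction` arithmetic is exact, the factorization check is an equality test rather than a numerical tolerance.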
Appendix B
Application of new techniques to 2-DIC
In this appendix, we show that the capacity region of the 2-DIC as given by Theorem 1.1 can
be written in the same notational framework as Theorem 5.1 and Corollary 5.1. Furthermore,
we show that saturation effects do not play a crucial role for the 2-DIC.
The capacity region of the 2-DIC can alternatively be written as follows. Fix a joint distribution for $(Q, X_1, X_2)$ of the form $p = p(q)p(x_1|q)p(x_2|q)$. Let the region $\mathcal{R}_1'(p) \subset \mathbb{R}_+^4$ be the set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ that satisfy
$$r_{1i} + r_{2j} \le H(Y_1 \mid c_{1i}, c_{2j}, Q), \quad \text{for all } i, j \in \{1, 2\}. \qquad (\text{B.1})$$
The lower-case symbols are placeholders for the terms shown in Tables B.1 and B.2.
Similarly, define $\mathcal{R}_2'(p)$ by making the subscript replacement $1 \mapsto 2 \mapsto 1$ in the definition of $\mathcal{R}_1'(p)$. Define the operator $\mathrm{FM}'$ as the specialized Fourier–Motzkin elimination that maps a convex 4-dimensional set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ to a 2-dimensional rate region by substituting $R_{12} = R_1 - R_{11}$ and $R_{21} = R_2 - R_{22}$, and then projecting onto the coordinates $(R_1, R_2)$.
i   r1i                c1i
1   R11                {X12}
2   R12 + R11          ∅

Table B.1. 2-DIC shorthand notation for terms related to transmitter 1.
Theorem B.1 (Capacity region of 2-DIC). The capacity region of the 2-DIC is equal to the set
$$\mathscr{R}_{\text{2-DIC}}' = \overline{\bigcup_p \mathrm{FM}'\Big\{\overline{\mathcal{R}_1'(p) \cap \mathcal{R}_2'(p)}\Big\}},$$
where $p = p(q)p(x_1|q)p(x_2|q)$.
Remark B.1. The region in this theorem has the same product structure as the 3-DIC region
in Corollary 5.1.
Achievability of the region $\mathscr{R}_{\text{2-DIC}}'$ in Theorem B.1 follows from Han–Kobayashi coding. The first transmitter constructs $2^{nR_{12}}$ cloud center codewords according to $p(x_{12})$, and $2^{nR_{11}}$ satellite codewords according to $p(x_1|x_{12})$ for each cloud center. The second transmitter proceeds likewise. The error probability analysis is entirely analogous to the proof of Corollary 5.1 in Subsection 5.2.2. Each combination of $(i, j)$ in condition (B.1) corresponds to a certain error event at the first receiver. Since correct decoding is required only for the messages associated with $R_{11}$ and $R_{12}$, we can take advantage, at least formally, of saturation effects as expressed by the min term in Table B.2.
Remark B.2 (Explicit notation for Theorem B.1). The region R ′1(p) can be rewritten by
expanding conditions (B.1) explicitly. It is the set of rate tuples (R11, R12, R21, R22) that
j   r2j                          c2j
1   0                            {X21}
2   min{R21, H(X21|Q)}           ∅

Table B.2. 2-DIC shorthand notation for terms related to transmitter 2.
satisfy
$$\begin{aligned} \mathcal{R}_1'(p):\qquad R_{11} &\le H(X_{11} \mid X_{12}, Q), &&(\text{B.2a})\\ R_{12} + R_{11} &\le H(X_{11} \mid Q), &&(\text{B.2b})\\ R_{11} + \min\{R_{21}, H(X_{21} \mid Q)\} &\le H(Y_1 \mid X_{12}, Q), &&(\text{B.2c})\\ R_{12} + R_{11} + \min\{R_{21}, H(X_{21} \mid Q)\} &\le H(Y_1 \mid Q). &&(\text{B.2d}) \end{aligned}$$
Likewise, the region $\mathcal{R}_2'(p)$ is the set of rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ that satisfy
$$\begin{aligned} \mathcal{R}_2'(p):\qquad R_{22} &\le H(X_{22} \mid X_{21}, Q), &&(\text{B.3a})\\ R_{21} + R_{22} &\le H(X_{22} \mid Q), &&(\text{B.3b})\\ R_{22} + \min\{R_{12}, H(X_{12} \mid Q)\} &\le H(Y_2 \mid X_{21}, Q), &&(\text{B.3c})\\ R_{21} + R_{22} + \min\{R_{12}, H(X_{12} \mid Q)\} &\le H(Y_2 \mid Q). &&(\text{B.3d}) \end{aligned}$$
Before proving the converse of Theorem B.1, it is instructive to consider a corollary to Theorem 1.1. To this end, define the modified set $\mathcal{R}_1(p)$ that contains all rate tuples $(R_{11}, R_{12}, R_{21}, R_{22})$ satisfying
$$\begin{aligned} \mathcal{R}_1(p):\qquad R_{11} &\le H(X_{11} \mid X_{12}, Q), &&(\text{B.4a})\\ R_{12} + R_{11} &\le H(X_{11} \mid Q), &&(\text{B.4b})\\ R_{11} + R_{21} &\le H(Y_1 \mid X_{12}, Q), &&(\text{B.4c})\\ R_{12} + R_{11} + R_{21} &\le H(Y_1 \mid Q). &&(\text{B.4d}) \end{aligned}$$
Likewise, let $\mathcal{R}_2(p)$ be the set of all tuples that satisfy
$$\begin{aligned} \mathcal{R}_2(p):\qquad R_{22} &\le H(X_{22} \mid X_{21}, Q), &&(\text{B.5a})\\ R_{21} + R_{22} &\le H(X_{22} \mid Q), &&(\text{B.5b})\\ R_{22} + R_{12} &\le H(Y_2 \mid X_{21}, Q), &&(\text{B.5c})\\ R_{21} + R_{22} + R_{12} &\le H(Y_2 \mid Q). &&(\text{B.5d}) \end{aligned}$$
Corollary B.1 (Capacity region of 2-DIC, no saturation). The capacity region of the 2-DIC is equal to the set
$$\mathscr{R}_{\text{2-DIC}} = \bigcup_p \mathrm{FM}'\{\mathcal{R}_1(p) \cap \mathcal{R}_2(p)\}, \qquad (\text{B.6})$$
where $p = p(q)p(x_1|q)p(x_2|q)$.

Proof: Use Fourier–Motzkin elimination to evaluate the $\mathrm{FM}'$ operator in $\mathscr{R}_{\text{2-DIC}}$. $\square$
Remark B.3. We note the formal similarity between the sets in Theorem B.1 and Corol-
lary B.1. A seeming difference is that the set R2-DIC does not have either of the two convex
hull operators of R ′2-DIC. This difference is vacuous. In fact, the operators could be added to
the expression in (B.6) without changing it. The inner convex hull operator is superfluous
because R1(p) ∩R2(p) is convex by construction, while the outer convex hull operator is
subsumed by coded timesharing via Q.
Converse proof for Theorem B.1: It is clear from (B.2) and (B.4) that the set R1(p) is a
subset of R ′1(p) since the conditions of the former are more stringent than those of the latter
(every min expression is no greater than its arguments). Likewise, R2(p) is a subset of
R ′2(p). It follows that R2-DIC is a subset of R ′2-DIC because set intersection, the convex hull
operation, FM′, and set union are monotone with respect to set inclusion. We have thus
established that $\mathscr{R}_{\text{2-DIC}}'$ contains the capacity region. $\blacksquare$
B.1 2-DIC has no saturation gain
It follows that both R2-DIC and R ′2-DIC are equal to the capacity region C2-DIC. Although the
region R ′2-DIC formally exploits saturation effects through the min terms in (B.2) and (B.3),
in actuality, such saturation gains are not available in the 2-DIC. This is clear from the
modified region R2-DIC, which does not contain saturation terms in (B.4) and (B.5), but
nevertheless equals the capacity region. Although the constituent regions R ′1(p) ∩R ′2(p)
generally strictly include the saturation-unaware R1(p) ∩R2(p), the final regions R ′2-DIC
and R2-DIC are the same, and consequently, saturation need not be considered in the 2-DIC.
The strict inclusion is lost during the projection operation in FM′ and relates to the
underlying rate splitting. It can be understood directly by proving R ′2-DIC ⊆ R2-DIC as
follows. Consider a fixed distribution p = p(q)p(x1|q)p(x2|q), and an achievable rate pair
(R1, R2) with a particular rate split R1 = R′11 +R′12 and R2 = R′21 +R′22 in R ′1(p)∩R ′2(p).
The min terms may be equal to either of their arguments, i.e., achievability of this particular
rate split may or may not rely on saturation. In any case, we can construct a modified split
R1 = R11 +R12 and R2 = R21 +R22 that maintains the same total rates but does not rely
on saturation, i.e., is contained in R1(p) ∩R2(p).
Specifically, let
$$\Delta_1 = R_{12}' - \min\{R_{12}', H(X_{12} \mid Q)\}, \qquad \Delta_2 = R_{21}' - \min\{R_{21}', H(X_{21} \mid Q)\},$$
and define the modified rate split $(R_{11}, R_{12}, R_{21}, R_{22})$ as
$$\begin{aligned} R_{12} &= R_{12}' - \Delta_1 = \min\{R_{12}', H(X_{12} \mid Q)\}, &&(\text{B.7})\\ R_{21} &= R_{21}' - \Delta_2 = \min\{R_{21}', H(X_{21} \mid Q)\}, &&(\text{B.8})\\ R_{11} &= R_{11}' + \Delta_1, &&(\text{B.9})\\ R_{22} &= R_{22}' + \Delta_2. &&(\text{B.10}) \end{aligned}$$
This is a valid rate split since the total rates R1 and R2 are maintained and each component
rate is non-negative. We need to show that the modified rate split is in R1(p) ∩ R2(p),
i.e., it satisfies (B.4a) through (B.5d). By substituting (B.7) to (B.10), it is straightforward
(if tedious) to see that (B.4a) follows from (B.2a) and (B.2b), (B.4b) follows from (B.2b),
(B.4c) follows from (B.2c) and (B.2d), (B.4d) follows from (B.2d), (B.5a) follows from
(B.3a) and (B.3b), (B.5b) follows from (B.3b), (B.5c) follows from (B.3c) and (B.3d), and
finally, (B.5d) follows from (B.3d).
Appendix C
Mathematical notation
Sets.
$\emptyset$                          empty set
$\mathcal{X}, \mathcal{Y}, \ldots$   discrete sets
$|\mathcal{X}|$                      set cardinality
$\mathscr{C}, \mathscr{R}, \ldots$   continuous sets
$\mathbb{F}_2$                       Galois field of order 2, i.e., $\{0, 1\}$
$\mathbb{Z}$                         the set of integers
$\{1:n\}$                            the set $\{1, 2, \ldots, n\}$
$\mathbb{R}$                         the set of real numbers
$\mathbb{R}_+$                       the set of non-negative real numbers
$\mathcal{T}_\varepsilon^{(n)}(X)$   set of typical sequences $x^n$, as defined in [EK11]
Functions.
$\mathrm{Id}$                        identity mapping, $\mathrm{Id}(x) = x$
$\llbracket x \rrbracket$            $1 + ((x-1) \bmod n)$, modulo-$n$ operator with indexing starting at 1
$\log$                               logarithm to base 2
$H(X)$                               entropy, in bits (using log to base 2)
$h(X)$                               differential entropy, in bits (using log to base 2)
$I(X; Y)$                            mutual information, in bits (using log to base 2)
$\otimes$                            Kronecker product
$\overline{S}$                       the convex hull of the set $S$
$\mathrm{FM}, \mathrm{FM}', \mathrm{FM}''$    specialized Fourier–Motzkin projection operators
Probability and random variables.
$\mathcal{E}$                        event
$\mathrm{P}(\mathcal{E})$            probability of the event $\mathcal{E}$
$\mathrm{P}\{\text{condition}\}$     shorthand for $\mathrm{P}(\{\text{condition}\})$
$X, Y, \ldots$                       random variables
$\mathrm{E}(X)$                      expected value of the random variable $X$
$X \sim p$                           the random variable $X$ is distributed according to $p$
$p_X(x)$                             probability mass function (or probability density function) of the random variable $X$
$p(x)$                               shorthand for $p_X(x)$ (when the context is clear)
$\mathcal{N}(\mu, \Sigma)$           Gaussian probability density function of mean vector $\mu$ and covariance matrix $\Sigma$
$\mathrm{Unif}(\mathcal{X})$         uniform distribution over the set $\mathcal{X}$
Matrices, vectors, and sequences.
$x_m^n$                              the vector (sequence) $(x_m, x_{m+1}, \ldots, x_n)$
$x^n$                                shorthand for $x_1^n$
$X^T$                                transpose
$K_X, S, \ldots$                     matrices
$\mathrm{tr}(S)$                     matrix trace
$|S|$                                matrix determinant
$K_X \preceq S$                      partial order induced by the positive semidefinite matrix cone, i.e., $S - K_X$ is positive semidefinite
$I_L$                                identity matrix of size $L \times L$
$0_{m \times n}$                     zero matrix of size $m \times n$
$e_l$                                canonical unit vector, the $l$th column of $I_L$
$S_\uparrow$                         up-shift matrix
$S_\downarrow$                       down-shift matrix
$Z$                                  zero-padding matrix
Other symbols and abbreviations.
$X_{lk}$                             in 3-DIC context, the signal from transmitter $l$ arriving at receiver $k$
$f \preccurlyeq g$                   partial order of set partition refinement, $f$ is a refinement of $g$
$f \vee g$                           the finest set partition of which both $f$ and $g$ are refinements
$f \wedge g$                         intersection of two set partitions
pmf                                  probability mass function
$\square$                            end of proof sketch
$\blacksquare$                       end of proof, q.e.d.
Bibliography
[ADT07] A. S. Avestimehr, S. N. Diggavi, and D. N. C. Tse, “A deterministic approach
to wireless relay networks.” (Sep. 2007), presented at the 45th Annual Allerton
Conference on Communication, Control, and Computing (Monticello, IL),
arXiv:0710.3777.
[ADT11] A. S. Avestimehr, S. N. Diggavi, and D. N. C. Tse, “Wireless network
information flow: A deterministic approach.” IEEE Trans. Inf. Theory, vol. 57,
no. 4, pp. 1872–1905 (Apr. 2011).
[Ahl74] R. Ahlswede, “The capacity region of a channel with two senders and two
receivers.” Ann. Probab., vol. 2, no. 5, pp. 805–814 (1974).
[AV09] V. S. Annapureddy and V. V. Veeravalli, “Gaussian interference networks:
Sum capacity in the low-interference regime and new outer bounds on the
capacity region.” IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3032–3050 (Jul.
2009).
[BE10] B. Bandemer and A. El Gamal, “Interference decoding for deterministic chan-
nels.” In Proceedings of ISIT 2010, Austin, TX (Jun. 2010).
[BE11a] B. Bandemer and A. El Gamal, “An achievable rate region for the 3-user-pair
deterministic interference channel.” In Proceedings of the 49th Annual Allerton
Conference on Communication, Control, and Computing, Monticello, IL (Sep.
2011), invited paper.
[BE11b] B. Bandemer and A. El Gamal, “Communication with disturbance constraints.”
In Proceedings of ISIT 2011, St. Petersburg, Russia (Aug. 2011).
[BE11c] B. Bandemer and A. El Gamal, “Communication with disturbance con-
straints.” IEEE Trans. Inf. Theory (Nov. 2011), submitted for publication,
arXiv:1103.0996.
[BE11d] B. Bandemer and A. El Gamal, “Interference decoding for deterministic chan-
nels.” IEEE Trans. Inf. Theory, vol. 57, no. 5, pp. 2966–2975 (May 2011),
arXiv:1001.4588.
[BPT10] G. Bresler, A. Parekh, and D. Tse, “The approximate capacity of the many-
to-one and one-to-many Gaussian interference channels.” IEEE Trans. Inf.
Theory, vol. 56, no. 9, pp. 4566–4592 (Sep. 2010), arXiv:0804.4489.
[BS11] R. Bustin and S. Shamai (Shitz), “MMSE of ‘bad’ codes.” IEEE Trans. Inf.
Theory (Jun. 2011), submitted for publication, arXiv:1106.1017.
[BT08] G. Bresler and D. Tse, “The two-user Gaussian interference channel: A de-
terministic view.” Euro. Trans. Telecomm., vol. 19, no. 4, pp. 333–354 (Jun.
2008), arXiv:0807.3222.
[BVVE09] B. Bandemer, G. Vazquez-Vilar, and A. El Gamal, “On the sum capacity
of a class of cyclically symmetric deterministic interference channels.” In
Proceedings of ISIT 2009, Seoul, Korea (Jun. 2009).
[Car75] A. B. Carleial, “A case where interference does not reduce capacity.” IEEE
Trans. Inf. Theory, vol. 21, no. 5, pp. 569–570 (Sep. 1975).
[CG87] M. Costa and A. El Gamal, “The capacity region of the discrete memoryless
interference channel with strong interference (corresp.).” IEEE Trans. Inf.
Theory, vol. 33, no. 5, pp. 710–711 (Sep. 1987).
[CJ08] V. R. Cadambe and S. A. Jafar, “Interference alignment and degrees of
freedom of the K-user interference channel.” IEEE Trans. Inf. Theory, vol. 54,
no. 8, pp. 3425–3441 (Aug. 2008).
[CK78] I. Csiszár and J. Körner, “Broadcast channels with confidential messages.”
IEEE Trans. Inf. Theory, vol. 24, no. 3, pp. 339–348 (May 1978).
[CMGE08] H.-F. Chong, M. Motani, H. K. Garg, and H. El Gamal, “On the Han-
Kobayashi region for the interference channel.” IEEE Trans. Inf. Theory,
vol. 54, no. 7, pp. 3188–3195 (Jul. 2008).
[EC82] A. A. El Gamal and M. H. M. Costa, “The capacity region of a class of
deterministic interference channels.” IEEE Trans. Inf. Theory, vol. 28, no. 2,
pp. 343–346 (Mar. 1982).
[EK11] A. El Gamal and Y.-H. Kim, Network Information Theory. Cambridge Univer-
sity Press (2011).
[ELZ05] U. Erez, S. Litsyn, and R. Zamir, “Lattices which are good for (almost) ev-
erything.” IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3401–3416 (Oct.
2005).
[EM81] A. El Gamal and E. C. van der Meulen, “A proof of Marton’s coding theorem
for the discrete memoryless broadcast channel.” IEEE Trans. Inf. Theory,
vol. 27, no. 1, pp. 120–122 (Jan. 1981).
[EO09] R. H. Etkin and E. Ordentlich, “The degrees-of-freedom of the K-user
Gaussian interference channel is discontinuous at rational channel coefficients.”
IEEE Trans. Inf. Theory, vol. 55, no. 11, pp. 4932–4946 (Nov. 2009).
[ETW08] R. H. Etkin, D. N. C. Tse, and H. Wang, “Gaussian interference channel
capacity to within one bit.” IEEE Trans. Inf. Theory, vol. 54, no. 12, pp.
5534–5562 (Dec. 2008).
[EZ04] U. Erez and R. Zamir, “Achieving (1/2) log(1 + SNR) on the AWGN channel
with lattice encoding and decoding.” IEEE Trans. Inf. Theory, vol. 50, no. 10,
pp. 2293–2314 (Oct. 2004).
[GJ11] T. Gou and S. A. Jafar, “Sum capacity of a class of symmetric SIMO Gaussian
interference channels within O(1).” IEEE Trans. Inf. Theory, vol. 57, no. 4,
pp. 1932–1958 (Apr. 2011), arXiv:0905.1745.
[GK73] P. Gács and J. Körner, “Common information is far less than mutual informa-
tion.” Problems of Control and Information Theory, vol. 2, no. 2, pp. 149–162
(1973).
[GSV05] D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual information and minimum
mean-square error in Gaussian channels.” IEEE Trans. Inf. Theory, vol. 51,
no. 4, pp. 1261–1282 (Apr. 2005).
[HK81] T. S. Han and K. Kobayashi, “A new achievable rate region for the interference
channel.” IEEE Trans. Inf. Theory, vol. 27, no. 1, pp. 49–60 (Jan. 1981).
[JV08] S. A. Jafar and S. Vishwanath, “Generalized degrees of freedom of the sym-
metric K user Gaussian interference channel.” (Apr. 2008), arXiv:0804.4489.
[Kra04] G. Kramer, “Outer bounds on the capacity of Gaussian interference channels.”
IEEE Trans. Inf. Theory, vol. 50, no. 3, pp. 581–586 (Mar. 2004).
[LV07] T. Liu and P. Viswanath, “An extremal inequality motivated by multiterminal
information-theoretic problems.” IEEE Trans. Inf. Theory, vol. 53, no. 5, pp.
1839–1851 (May 2007).
[Mar79] K. Marton, “A coding theorem for the discrete memoryless broadcast channel.”
IEEE Trans. Inf. Theory, vol. 25, no. 3, pp. 306–311 (May 1979).
[MDFT11] S. Mohajer, S. N. Diggavi, C. Fragouli, and D. N. C. Tse, “Approximate
capacity of a class of Gaussian interference-relay networks.” IEEE Trans. Inf.
Theory, vol. 57, no. 5, pp. 2837–2864 (May 2011).
[Meu77] E. C. van der Meulen, “A survey of multi-way channels in information theory:
1961–1976.” IEEE Trans. Inf. Theory, vol. 23, no. 1, pp. 1–37 (Jan. 1977).
[Meu94] E. C. van der Meulen, “Some reflections on the interference channel.” In R. E.
Blahut, D. J. Costello, U. Maurer, and T. Mittelholzer (Editors), Communica-
tions and Cryptography: Two Sides of One Tapestry, pp. 409–421, Kluwer,
Boston (1994).
[Mis39] R. von Mises, “Über Aufteilungs- und Besetzungs-Wahrscheinlichkeiten.”
Revue de la Faculté des Sciences de l’Université d’Istanbul, vol. 4, pp. 145–
163 (1939), reprinted in “Selected Papers of Richard von Mises”, vol. 2 (Eds.
P. Frank, S. Goldstein, M. Kac, W. Prager, G. Szegő, and G. Birkhoff),
Providence, RI: American Mathematical Society, pp. 313–334, 1964.
[MK09] A. S. Motahari and A. K. Khandani, “Capacity bounds for the Gaussian
interference channel.” IEEE Trans. Inf. Theory, vol. 55, no. 2, pp. 620–643
(Feb. 2009).
[MMK08] M. A. Maddah-Ali, A. S. Motahari, and A. K. Khandani, “Communication
over MIMO X channels: Interference alignment, decomposition, and perfor-
mance analysis.” IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3457–3470 (Aug.
2008).
[MOMK09] A. S. Motahari, S. Oveis Gharan, M. A. Maddah-Ali, and A. K. Khandani,
“Real interference alignment: Exploiting the potential of single antenna sys-
tems.” IEEE Trans. Inf. Theory (Nov. 2009), submitted for publication,
arXiv:0908.2282.
[NG08] B. Nazer and M. Gastpar, “The case for structured random codes in network
capacity theorems.” Euro. Trans. Telecomm., Special Issue on New Directions
in Information Theory, vol. 19, no. 4, pp. 455–474 (Jun. 2008).
[NG09] B. Nazer and M. Gastpar, “Compute-and-forward: Harnessing interference
through structured codes.” IEEE Trans. Inf. Theory (Aug. 2009), submitted
for publication, arXiv:0908.2119v2.
[Sat81] H. Sato, “The capacity of the Gaussian interference channel under strong
interference.” IEEE Trans. Inf. Theory, vol. 27, no. 6, pp. 786–788 (Nov.
1981).
[SD11] Y. Song and N. Devroye, “Structured interference-mitigation in two-hop net-
works.” In Proceedings of the Information Theory and Applications Workshop
(ITA), La Jolla, CA (Feb. 2011).
[SJV+08] S. Sridharan, A. Jafarian, S. Vishwanath, S. A. Jafar, and S. Shamai (Shitz),
“A layered lattice coding scheme for a class of three user Gaussian interfer-
ence channels.” In Proceedings of the 46th Annual Allerton Conference on
Communication, Control, and Computing, pp. 531–538, Monticello, IL (Sep.
2008).
[SKC09] X. Shang, G. Kramer, and B. Chen, “A new outer bound and the noisy-
interference sum-rate capacity for Gaussian interference channels.” IEEE
Trans. Inf. Theory, vol. 55, no. 2, pp. 689–699 (Feb. 2009).
[Sta11] R. P. Stanley, Enumerative Combinatorics, vol. 1. Cambridge University Press,
2nd ed. (2011), URL http://www-math.mit.edu/~rstan/ec/.
[TY11] Y. Tian and A. Yener, “The Gaussian interference relay channel: Improved
achievable rates and sum rate upper bounds using a potent relay.” IEEE Trans.
Inf. Theory, vol. 57, no. 5, pp. 2865–2879 (May 2011).
[Wit75] H. S. Witsenhausen, “On sequences of pairs of dependent random variables.”
SIAM Journal on Applied Mathematics, vol. 28, no. 1, pp. 100–113 (Jan. 1975).
[Wyn75] A. D. Wyner, “The wire-tap channel.” Bell System Technical Journal, vol. 54,
no. 8, pp. 1355–1387 (Oct. 1975).