Markov, Shannon, and Turbo Codes: The Benefits of Hindsight
Professor Stephen B. Wicker
School of Electrical Engineering
Cornell University
Ithaca, NY 14853
Introduction
Theme 1: Digital Communications, Shannon and Error Control Coding
Theme 2: Markov and the Statistical Analysis of Systems with Memory
Synthesis: Turbo Error Control: Parallel Concatenated Encoding and Iterative Decoding
Digital Telecommunication
The classical design problem: transmitter power vs. bit error rate (BER)
Complications:
– Physical Distance
– Co-Channel and Adjacent Channel Interference
– Nonlinear Channels
Shannon and Information Theory
Noisy Channel Coding Theorem (1948):
– Every channel has a capacity C.
– If we transmit at a data rate that is less than capacity, there exists an error control code that provides arbitrarily low BER.
For an AWGN channel:
$$C = W \log_2\!\left(1 + \frac{E_s}{N_0}\right) \ \text{bits per second}$$
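As a quick numeric illustration of the capacity formula (a sketch; the bandwidth and SNR values below are hypothetical, not from the slides):

```python
import math

def awgn_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon capacity C = W * log2(1 + SNR) in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

# Hypothetical example: 1 MHz of bandwidth at a linear SNR of 15.
c = awgn_capacity(1e6, 15.0)
print(f"{c:.0f} bits per second")  # 4000000, since log2(16) = 4
```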
Coding Gain
Coding Gain: P_UNCODED − P_CODED
– The difference in power required by the uncoded and coded systems to obtain a given BER.
NCCT: Almost 10 dB possible on an AWGN channel with binary signaling.
1993: NASA/ESA Deep Space Standard provides 7.7 dB.
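To make the "almost 10 dB" figure concrete, one can numerically find the Eb/N0 an uncoded binary (BPSK) system needs to hit a given BER; the target BER of 1e-5 below is an assumption for illustration:

```python
import math

def bpsk_ber(ebno_db: float) -> float:
    """Uncoded BPSK bit error rate: Q(sqrt(2 Eb/N0))."""
    ebno = 10.0 ** (ebno_db / 10.0)
    # Q(x) = 0.5 * erfc(x / sqrt(2)), so Q(sqrt(2*ebno)) = 0.5 * erfc(sqrt(ebno)).
    return 0.5 * math.erfc(math.sqrt(ebno))

# Bisect for the Eb/N0 (in dB) that gives a target BER of 1e-5.
lo, hi = 0.0, 20.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if bpsk_ber(mid) > 1e-5:
        lo = mid  # BER too high: need more power
    else:
        hi = mid
print(f"Uncoded BPSK needs about {lo:.1f} dB Eb/N0 for BER 1e-5")
```

The result (roughly 9.6 dB) shows how much room there is between an uncoded system and the Shannon limit.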
Classical Error Control Coding
MAP Sequence Decoding Problem:
– Find X that maximizes p(X|Y).
– Derive estimate of U from estimate of X.
– General problem is NP-Hard, related to many optimization problems.
– Polynomial time solutions exist for special cases.

[Figure: U = (u_1, ..., u_k) → Encoder → X → Noisy Channel → Y]
Class P Decoding Techniques
Hard decision: MAP decoding reduces to minimum distance decoding.
– Example: Berlekamp algorithm (RS codes)
Soft decision: Received signals are quantized.
– Example: Viterbi algorithm (Convolutional Codes)
These techniques do NOT minimize information error rate.
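A minimal sketch of minimum distance decoding, using the (3,1) repetition code as a toy codebook (the codebook is an illustrative assumption; real decoders like Berlekamp's exploit algebraic structure rather than searching the codebook):

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def min_distance_decode(received, codebook):
    """Hard-decision decoding: pick the codeword closest in Hamming distance."""
    return min(codebook, key=lambda c: hamming_distance(received, c))

# Toy codebook: the (3,1) repetition code.
codebook = [(0, 0, 0), (1, 1, 1)]
print(min_distance_decode((1, 0, 1), codebook))  # (1, 1, 1)
```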
Binary Convolutional Codes
Memory is incorporated into the encoder in an obvious way.
The resulting code can be analyzed using a state diagram.
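A sketch of such an encoder (the slides do not specify one; the rate-1/2 feedforward encoder with the common (7, 5) octal generators below is an illustrative assumption):

```python
def conv_encode(bits, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Rate-1/2 feedforward binary convolutional encoder (generators 7, 5 octal).

    The two memory cells form the encoder state, so the code can be
    analyzed as a 4-state state diagram.
    """
    state = [0, 0]  # encoder memory (shift register)
    out = []
    for b in bits:
        window = [b] + state
        out.append(sum(w * g for w, g in zip(window, g1)) % 2)
        out.append(sum(w * g for w, g in zip(window, g2)) % 2)
        state = [b, state[0]]  # shift the new bit in
    return out

print(conv_encode([1, 0, 1, 1]))  # [1, 1, 1, 0, 0, 0, 0, 1]
```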
Trellis for a Convolutional Code
Trees and Sequential Decoding
Convolutional code can be depicted as a tree.
Tree and metric define a metric space.
Sequential decoding is a local search of a metric space.
Search complexity is a polynomial function of memory order.
May not terminate in a finite amount of time.
Local search methodology to return...
Theme 2: Markov and Memory
Markov was, among many other things, a cryptanalyst.
– Interested in the structure of written text.
– Certain letters can only be followed by certain others.
Markov Chains:
– Let I be a countable set of states and let λ be a probability measure on I.
– Let random variable S range over I and set λ_i = p(S = i).
– Let P = {p_ij} be a stochastic matrix with rows and columns indexed by I.
– S = (S_n)_{n≥0} is a Markov chain with initial distribution λ and transition matrix P if
  - S_0 has distribution λ
  - p(S_{n+1} = j | S_0, S_1, S_2, …, S_{n−1}, S_n = i) = p(S_{n+1} = j | S_n = i) = p_ij
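A minimal sketch of the definition above: sample a path from a two-state chain (the transition matrix is hypothetical) and check that the empirical occupancy approaches the stationary distribution.

```python
import random

# Hypothetical two-state chain: states 0 and 1; rows of P sum to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def sample_chain(n_steps, s0=0, seed=1):
    """Draw S_0, ..., S_n; the next state depends only on the current one."""
    rng = random.Random(seed)
    path, s = [s0], s0
    for _ in range(n_steps):
        s = 0 if rng.random() < P[s][0] else 1
        path.append(s)
    return path

path = sample_chain(100_000)
# Empirical occupancy of state 0 approaches the stationary value 5/6 ≈ 0.833.
print(path.count(0) / len(path))
```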
Hidden Markov Models
HMM :
– Markov chain X = X1, X2, …
– Sequence of r.v.’s Y = Y1, Y2, … that are a probabilistic function f() of X.
Inference Problem: Observe Y and infer:
– Initial state of X
– State transition probabilities for X
– Probabilistic function f()
Hidden Markov Models are Everywhere...
Duration of eruptions by Old Faithful
Movement of locusts (Locusta migratoria)
Suicide rate in Cape Town, SA
Progress of epidemics
Econometric models
Decoding of convolutional codes
Baum-Welch Algorithm
Lloyd Welch and Leonard Baum developed an iterative solution to the HMM inference problem (~1962).
The application-specific solution was classified for many years.
Published in general form:
– L. E. Baum and T. Petrie, “Statistical Inference for Probabilistic Functions of Finite State Markov Chains,” Ann. Math. Stat., 37:1554–1563, 1966.
BW Overview
Member of the class of algorithms now known as “Expectation-Maximization”, or “EM”, algorithms.
– Initial hypothesis θ_0
– Series of estimates generated by the mapping θ_i = T(θ_{i−1})
– P(θ_0) ≤ P(θ_1) ≤ P(θ_2) ≤ …, where θ* = lim_{i→∞} θ_i is the maximum likelihood parameter estimate.
Forward - Backward Algorithm: Exploiting the Markov Property
Goal: Derive probability measure p(x_j, y).
BW algorithm recursively computes α's and β's.

$$
\begin{aligned}
p(x_j, y) &= p(x_j, y_j^-) \cdot p(y_j \mid x_j, y_j^-) \cdot p(y_j^+ \mid x_j, y_j, y_j^-) \\
&= p(x_j, y_j^-) \cdot p(y_j \mid x_j) \cdot p(y_j^+ \mid x_j) \\
&= \underbrace{\alpha(x_j)}_{\text{past}} \cdot \underbrace{\gamma(x_j)}_{\text{present}} \cdot \underbrace{\beta(x_j)}_{\text{future}}
\end{aligned}
$$
Forward and Backward Flow
Define flow(x_i, x_j) to be the probability that a random walk starting at x_i will terminate at x_j.
α(x_j) is the forward flow to x_j at time j.
β(x_j) is the backward flow to x_j at time j.

$$\alpha(x_j) = p(x_j, y_j^-) = \sum_{x_{j-1} \in X_{j-1}} \alpha(x_{j-1})\, Q(x_j \mid x_{j-1})\, \gamma(x_{j-1})$$

$$\beta(x_j) = p(y_j^+ \mid x_j) = \sum_{x_{j+1} \in X_{j+1}} Q(x_{j+1} \mid x_j)\, \gamma(x_{j+1})\, \beta(x_{j+1})$$
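The recursions can be checked numerically on a toy HMM. In this sketch the two-state model (transition matrix Q, emission table E, prior pi) and the observed sequence y are entirely hypothetical; the check is that summing α·γ·β over states yields the same value p(y) at every time index.

```python
# Hypothetical 2-state HMM observed for 5 steps.
Q  = [[0.8, 0.2], [0.3, 0.7]]    # Q[x][x']: transition probabilities
E  = [[0.9, 0.1], [0.2, 0.8]]    # E[x][symbol]: emission probabilities
pi = [0.6, 0.4]                  # initial distribution
y  = [0, 0, 1, 0, 1]             # observed symbols

T, S = len(y), len(pi)
gamma = [[E[x][y[j]] for x in range(S)] for j in range(T)]  # gamma_j(x) = p(y_j | x_j)

# Forward flow: alpha_j(x) = p(x_j, y_j^-), over the strict past.
alpha = [pi[:]]
for j in range(1, T):
    alpha.append([sum(alpha[j-1][xp] * gamma[j-1][xp] * Q[xp][x]
                      for xp in range(S)) for x in range(S)])

# Backward flow: beta_j(x) = p(y_j^+ | x_j), over the strict future.
beta = [[1.0] * S for _ in range(T)]
for j in range(T - 2, -1, -1):
    beta[j] = [sum(Q[x][xn] * gamma[j+1][xn] * beta[j+1][xn]
                   for xn in range(S)) for x in range(S)]

# p(x_j, y) = alpha * gamma * beta; summing over x_j gives p(y) for every j.
p_y = [sum(alpha[j][x] * gamma[j][x] * beta[j][x] for x in range(S))
       for j in range(T)]
print(p_y)  # the same value repeated T times
```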
Earliest Reference to Backward-Forward Algorithm
Several of the woodsmen began to move slowly toward her and observing them closely, the little girl saw that they were turned backward, but really walking forward. “We have to go backward forward!” cried Dorothy. “Hurry up, before they catch us.”
– Ruth Plumly Thompson, The Lost King of Oz, pg. 120, The Reilly & Lee Co., 1925.
Generalization: Belief Propagation in Polytrees
Judea Pearl (1988)
Each node in a polytree separates the graph into two distinct subgraphs.
X d-separates upper and lower variables, implying conditional independence.
Spatial Recursion and Message Passing
Synthesis: BCJR
1974: Bahl, Cocke, Jelinek, and Raviv apply a portion of the BW algorithm to trellis decoding of convolutional and block codes.
– Forward and backward trellis flow: APP that a given branch is traversed.
– Info bit APP: sum of probabilities for branches associated with a particular bit value.
BW/BCJR
[Figure: trellis section annotated with α(u_j), γ(u_j), and β(u_j)]
Synthesis Crescendo: Turbo Coding
May 25, 1993: C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon Limit Error-Correcting Coding and Decoding: Turbo Codes.”
Two Key Elements:
– Parallel Concatenated Encoders
– Iterative Decoding
Parallel Concatenated Encoders
One “systematic” and two parity streams are generated from the information.
Recursive (IIR) convolutional encoders are used as “component” encoders.
Recursive Binary Convolutional Encoders
Impact of the Interleaver
Only a small number of low-weight input sequences are mapped to low-weight output sequences.
The interleaver ensures that if the output of one component encoder has low weight, the output of the other probably will not.
PCC emphasis: minimize the number of low-weight codewords, as opposed to maximizing the minimum weight.
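A toy illustration of the interleaver's role (the block length, permutation seed, and clustered input pattern below are hypothetical): an input whose ones are clustered, and would therefore drive one component encoder to a low-weight output, reaches the second encoder in permuted order with the ones almost surely spread apart.

```python
import random

def interleave(bits, perm):
    """Read the input bits in the order given by the permutation."""
    return [bits[p] for p in perm]

rng = random.Random(42)
k = 32
perm = list(range(k))
rng.shuffle(perm)  # hypothetical random interleaver

# A low-weight input with its ones clustered together...
u = [0] * k
u[10] = u[11] = u[12] = 1

# ...is seen by the second component encoder in permuted order.
v = interleave(u, perm)
print([i for i, b in enumerate(v) if b])  # positions of the ones after interleaving
```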
The PCE Decoding Problem
[Figure: U = (u_1, ..., u_k) feeds the CC1 encoder directly, producing X_1, and feeds the CC2 encoder through an interleaver, producing X_2; U, X_1, and X_2 pass through the channel, yielding Y_s = (y_1^s, ..., y_k^s), Y_1 = (y_1^1, ..., y_k^1), and Y_2 = (y_1^2, ..., y_k^2).]

$$\mathrm{BEL}_i(a) = p(u_i = a \mid y) = \underbrace{\lambda_i(a)}_{\text{systematic term}} \cdot \underbrace{\pi_i(a)}_{\text{a priori term}} \cdot \underbrace{\sum_{u : u_i = a} p(y_1 \mid x_1)\, p(y_2 \mid x_2) \prod_{\substack{j=1 \\ j \ne i}}^{k} \lambda_j(u_j)\, \pi_j(u_j)}_{\text{extrinsic term}}$$
Turbo Decoding
BW/BCJR decoders are associated with each component encoder.
Decoders take turns estimating and exchanging distributions on the information bits.
Alternating Estimates of Information APP
Decoder 1: BW/BCJR derives

$$\mathrm{BEL}_i^1(a) = \alpha \underbrace{\lambda_i(a)}_{\text{systematic term}} \cdot \underbrace{\pi_i^2(a)}_{\text{updated term}} \cdot \underbrace{\sum_{u : u_i = a} p(y_1 \mid x_1) \prod_{\substack{j=1 \\ j \ne i}}^{k} \lambda_j(u_j)\, \pi_j(u_j)}_{\text{extrinsic term}}$$

Decoder 2: BW/BCJR derives

$$\mathrm{BEL}_i^2(a) = \alpha \underbrace{\lambda_i(a)}_{\text{systematic term}} \cdot \underbrace{\pi_i^1(a)}_{\text{updated term}} \cdot \underbrace{\sum_{u : u_i = a} p(y_2 \mid x_2) \prod_{\substack{j=1 \\ j \ne i}}^{k} \lambda_j(u_j)\, \pi_j(u_j)}_{\text{extrinsic term}}$$
Converging Estimates
Information exchanged by the decoders must not be strongly correlated with systematic info or earlier exchanges.
$$\pi_i^{(m)}(a) = \begin{cases} \dfrac{\alpha \Pr\{u_i = a \mid Y_s = y_s, Y_1 = y_1\}}{\lambda_i(a)\, \pi_i^{(m-1)}(a)} & \text{if } m \text{ is odd} \\[2ex] \dfrac{\alpha \Pr\{u_i = a \mid Y_s = y_s, Y_2 = y_2\}}{\lambda_i(a)\, \pi_i^{(m-1)}(a)} & \text{if } m \text{ is even} \end{cases}$$
Impact and Questions
Turbo coding provides coding gain near 10 dB.
– Within 0.3 dB of the Shannon limit.
– NASA/ESA DSN: 1 dB = $80M in 1996.
Issues:
– Sometimes turbo decoding fails to correct all of the errors in the received data. Why?
– Sometimes the component decoders do not converge. Why?
– Why does turbo decoding work at all?
Cross-Entropy Between the Component Decoders
Cross entropy, or the Kullback-Leibler distance, is a measure of the distance between two distributions.
Joachim Hagenauer et al. have suggested using a cross-entropy threshold as a stopping condition for turbo decoders.
$$D = \sum_{j=1}^{N} \sum_{a=0}^{1} \pi^1(u_j = a \mid Y) \log \frac{\pi^1(u_j = a \mid Y)}{\pi^2(u_j = a \mid Y)}$$
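A minimal sketch of this stopping rule. The per-bit distributions from the two component decoders below are made-up values, and the threshold 0.05 is an arbitrary choice for illustration:

```python
import math

def cross_entropy(pi1, pi2):
    """Kullback-Leibler distance between two decoders' per-bit distributions.

    pi1[j][a], pi2[j][a]: each decoder's probability that bit j equals a.
    """
    return sum(pi1[j][a] * math.log(pi1[j][a] / pi2[j][a])
               for j in range(len(pi1)) for a in (0, 1))

# Hypothetical distributions over 3 information bits.
d1 = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
d2 = [[0.85, 0.15], [0.25, 0.75], [0.55, 0.45]]

D = cross_entropy(d1, d2)
print(D)           # small nonnegative value; zero iff the decoders agree
print(D < 0.05)    # stop iterating once D falls below the chosen threshold
```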
Correlating Decoder Errors with Cross-Entropy
Neural Networks do the Thinking
Neural networks can implement any piecewise-continuous function.
Goal: Emulation of indicator functions for turbo decoder error and convergence.
Two Experiments:
– FEDN: Predict eventual error and convergence at the beginning of the decoding process.
– DEDN: Detect error and convergence at the end of the decoding process.
Network Performance
Missed detection occurs when the number of errors is small.
The average weight of error events in NN-assisted turbo is far less than that of CRC-assisted turbo decoding.
When coupled with a code combining protocol, NN-assisted turbo is extremely reliable.
What Did the Networks Learn?
Examined weights generated during training.
Network monitors slope of cross entropy (rate of descent).
Conjecture:
– Turbo decoding is a local search algorithm that attempts to minimize cross-entropy cycles.
– Topology of search space is strongly determined by initial cross entropy.
Exploring the Conjecture
Turbo Simulated Annealing (Buckley, Hagenauer, Krishnamachari, Wicker)
– Nonconvergent turbo decoding is nudged out of local minimum cycles by randomization (heat).
Turbo Genetic Decoding (Krishnamachari, Wicker)
– Multiple processes are started in different places in the search space.
Turbo Coding: A Change in Error Control Methodology
“Classical” response to Shannon:
– Derive probability measure on transmitted sequence, not actual information.
– Explore optimal solutions to special cases of NP-Hard problem.
– Optimal, polynomial-time decoding algorithms limit choice of codes.
“Modern”: Exploit Markov property to obtain temporal/spatial recursion:
– Derive probability measure on information, not codeword.
– Explore suboptimal solutions to more difficult cases of NP-Hard problem.
– Iterative decoding
– Graph Theoretic Interpretation of Code Space
– Variations on Local Search
The Future
Relation of cross entropy to impact of cycles in belief propagation.
Near-term abandonment of PCE's as unnecessarily restrictive.
Increased emphasis on low-density parity-check codes and expander codes.
– Decoding algorithms that look like solutions to K-SAT problem.
– Iteration between subgraphs.
– Increased emphasis on decoding as local search.