Mathematical Preliminaries

Matrix Theory
Vectors — the $n$th element of a vector $\mathbf{u}$ is written $u(n)$:

$$\mathbf{u} = \{u(n)\} = \begin{bmatrix} u(1) \\ u(2) \\ \vdots \\ u(N) \end{bmatrix} \qquad \text{(column vector)}$$

Matrices — the element in the $m$th row and $n$th column of $\mathbf{A}$ is $a(m,n)$:

$$\mathbf{A} = \{a(m,n)\} = \begin{bmatrix} a(1,1) & a(1,2) & \cdots & a(1,N) \\ a(2,1) & a(2,2) & \cdots & a(2,N) \\ \vdots & \vdots & & \vdots \\ a(M,1) & a(M,2) & \cdots & a(M,N) \end{bmatrix} = [\mathbf{a}_1 \ \mathbf{a}_2 \ \cdots \ \mathbf{a}_N]$$

where $\mathbf{a}_k = [a(1,k), a(2,k), \ldots, a(M,k)]^T$ is the $k$th column vector of $\mathbf{A}$.
Lexicographic Ordering (Stacking Operation)

Row-ordered form of an $M \times N$ matrix:

$$\mathbf{x} = [x(1,1)\ x(1,2)\ \cdots\ x(1,N)\ x(2,1)\ \cdots\ x(2,N)\ \cdots\ x(M,1)\ \cdots\ x(M,N)]^T = \begin{bmatrix} \mathbf{r}_1 \\ \mathbf{r}_2 \\ \vdots \\ \mathbf{r}_M \end{bmatrix}$$

where $\mathbf{r}_k = [x(k,1), x(k,2), \ldots, x(k,N)]^T$ is the $k$th row vector.

Column-ordered form of a matrix:

$$\mathbf{x} = [x(1,1)\ x(2,1)\ \cdots\ x(M,1)\ x(1,2)\ \cdots\ x(M,2)\ \cdots\ x(1,N)\ \cdots\ x(M,N)]^T = \begin{bmatrix} \mathbf{c}_1 \\ \mathbf{c}_2 \\ \vdots \\ \mathbf{c}_N \end{bmatrix}$$

where $\mathbf{c}_k = [x(1,k), x(2,k), \ldots, x(M,k)]^T$ is the $k$th column vector.
Transposition and Conjugation Rules

$$[\mathbf{A}^T]^{-1} = [\mathbf{A}^{-1}]^T, \qquad [\mathbf{A}\mathbf{B}]^T = \mathbf{B}^T\mathbf{A}^T, \qquad [\mathbf{A}^*]^{-1} = [\mathbf{A}^{-1}]^*, \qquad [\mathbf{A}\mathbf{B}]^* = \mathbf{A}^*\mathbf{B}^*$$

Toeplitz matrices — constant along every diagonal, $t(m,n) = t_{m-n}$:

$$\mathbf{T} = \begin{bmatrix} t_0 & t_{-1} & t_{-2} & \cdots & t_{-(N-1)} \\ t_1 & t_0 & t_{-1} & & \vdots \\ t_2 & t_1 & t_0 & \ddots & \\ \vdots & & \ddots & \ddots & t_{-1} \\ t_{N-1} & \cdots & & t_1 & t_0 \end{bmatrix}$$

Circulant matrices — each row is a circular shift of the previous one, $c(m,n) = c_{(m-n) \bmod N}$:

$$\mathbf{C} = \begin{bmatrix} c_0 & c_{N-1} & c_{N-2} & \cdots & c_1 \\ c_1 & c_0 & c_{N-1} & \cdots & c_2 \\ c_2 & c_1 & c_0 & \cdots & c_3 \\ \vdots & & & \ddots & \vdots \\ c_{N-1} & c_{N-2} & c_{N-3} & \cdots & c_0 \end{bmatrix}$$
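As a quick illustration (not in the original slides), SciPy provides helpers that build these matrices directly; the assertions check the defining element patterns $t(m,n) = t_{m-n}$ and $c(m,n) = c_{(m-n) \bmod N}$:

```python
import numpy as np
from scipy.linalg import toeplitz, circulant

# Toeplitz: specified by its first column and first row.
T = toeplitz(c=[1, 2, 3, 4], r=[1, -1, -2, -3])
# Circulant: each column is a cyclic shift of the first column.
C = circulant([1, 2, 3, 4])

N = 4
# Entries are constant along diagonals (Toeplitz) ...
assert all(T[m, n] == T[m + 1, n + 1] for m in range(N - 1) for n in range(N - 1))
# ... and constant along "wrapped" diagonals (circulant).
assert all(C[m, n] == C[(m + 1) % N, (n + 1) % N] for m in range(N) for n in range(N))
```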
Linear Convolution Using a Toeplitz Matrix

$$y(n) = h(n) * x(n) = \sum_{k=0}^{N_x-1} h(n-k)\,x(k)$$

where $h(n) = 0$ for $n < 0$ or $n \ge N_h$, and $x(n) = 0$ for $n < 0$ or $n \ge N_x$. Writing out successive output samples:

$$n = 0:\quad y(0) = \sum_{k=0}^{N_x-1} h(-k)\,x(k) = h(0)\,x(0)$$

$$n = 1:\quad y(1) = \sum_{k=0}^{N_x-1} h(1-k)\,x(k) = h(1)\,x(0) + h(0)\,x(1)$$

$$n = l:\quad y(l) = \sum_{k=0}^{N_x-1} h(l-k)\,x(k) = h(l)\,x(0) + h(l-1)\,x(1) + \cdots + h(0)\,x(l)$$
Collecting these equations in matrix form:

$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \\ \vdots \\ y(N_h{+}N_x{-}2) \end{bmatrix} = \begin{bmatrix} h(0) & 0 & \cdots & 0 \\ h(1) & h(0) & & \vdots \\ h(2) & h(1) & \ddots & 0 \\ \vdots & \vdots & & h(0) \\ h(N_h{-}1) & h(N_h{-}2) & & h(1) \\ 0 & h(N_h{-}1) & & \vdots \\ \vdots & & \ddots & h(N_h{-}2) \\ 0 & \cdots & 0 & h(N_h{-}1) \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \\ \vdots \\ x(N_x{-}1) \end{bmatrix}$$

$$\mathbf{y} = \mathbf{H}\mathbf{x}$$

where $\mathbf{H}$ is an $(N_h + N_x - 1) \times N_x$ Toeplitz matrix.
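A minimal sketch of this construction (the helper name `conv_toeplitz` is mine, not from the slides): build $\mathbf{H}$ from $h$ and confirm that $\mathbf{H}\mathbf{x}$ reproduces direct linear convolution:

```python
import numpy as np
from scipy.linalg import toeplitz

def conv_toeplitz(h, n_x):
    """(N_h + N_x - 1) x N_x Toeplitz matrix H with H @ x == h * x."""
    col = np.r_[h, np.zeros(n_x - 1)]     # first column: h followed by zeros
    row = np.r_[h[0], np.zeros(n_x - 1)]  # first row: h(0), then zeros
    return toeplitz(col, row)

h = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0, 7.0])
H = conv_toeplitz(h, len(x))              # 6 x 4, since N_h + N_x - 1 = 6
assert np.allclose(H @ x, np.convolve(h, x))
```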
N-point circular convolution:

$$y(n) = h(n) \circledast_N x(n) = \sum_{k=0}^{N-1} \tilde h(n-k)\,x(k), \qquad 0 \le n \le N-1$$

where $\tilde h(n) = h(n \bmod N)$ is the periodic extension of $h(n)$, and $x(n) = 0$ for $n < 0$ or $n \ge N$. Writing out successive output samples:

$$y(0) = \sum_{k=0}^{N-1} \tilde h(-k)\,x(k) = h(0)\,x(0) + \sum_{k=1}^{N-1} h(N-k)\,x(k)$$

$$y(1) = \sum_{k=0}^{N-1} \tilde h(1-k)\,x(k) = h(1)\,x(0) + h(0)\,x(1) + \sum_{k=2}^{N-1} h(N+1-k)\,x(k)$$

and in general

$$y(l) = \sum_{k=0}^{N-1} \tilde h(l-k)\,x(k) = h(l)\,x(0) + \cdots + h(0)\,x(l) + \sum_{k=l+1}^{N-1} h(N+l-k)\,x(k)$$

Circular Convolution Using a Circulant Matrix
$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \\ y(N{-}1) \end{bmatrix} = \begin{bmatrix} h(0) & h(N{-}1) & h(N{-}2) & \cdots & h(1) \\ h(1) & h(0) & h(N{-}1) & \cdots & h(2) \\ h(2) & h(1) & h(0) & \cdots & h(3) \\ \vdots & & & \ddots & \vdots \\ h(N{-}1) & h(N{-}2) & h(N{-}3) & \cdots & h(0) \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ \vdots \\ x(N{-}1) \end{bmatrix}$$

$$\mathbf{y} = \mathbf{H}\mathbf{x} \qquad (\mathbf{H}: \text{circulant matrix})$$

Circular convolution + zero padding → linear convolution: if the period is chosen as $N \ge N_x + N_h - 1$, circular convolution gives the same result as linear convolution, where $h(n) = 0$ for $n \ge N_h$ and $x(n) = 0$ for $n \ge N_x$.
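A small sketch of this equivalence (the helper `circ_conv` is mine): multiply by a circulant matrix for the circular convolution, zero-pad both sequences to $N = N_x + N_h - 1$, and compare with linear convolution; the FFT route works because the DFT diagonalizes circulant matrices:

```python
import numpy as np
from scipy.linalg import circulant

def circ_conv(h, x):
    """N-point circular convolution as a circulant-matrix product."""
    return circulant(h) @ x

h = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0, 7.0])

N = len(h) + len(x) - 1                 # period large enough: N >= Nx + Nh - 1
hp = np.r_[h, np.zeros(N - len(h))]
xp = np.r_[x, np.zeros(N - len(x))]
assert np.allclose(circ_conv(hp, xp), np.convolve(h, x))

# Same result via the DFT.
assert np.allclose(np.fft.ifft(np.fft.fft(hp) * np.fft.fft(xp)).real,
                   np.convolve(h, x))
```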
(ex) Linear convolution as a Toeplitz matrix operation: take $h(n) = \delta(n) + \delta(n-2)$, i.e. $h = \{1, 0, 1\}$ with $M = 3$ samples, and an input of length $N = 5$, so the output has $L = M + N - 1 = 3 + 5 - 1 = 7$ samples:

$$y(n) = h(n) * x(n) = \sum_{k=0}^{4} h(n-k)\,x(k)$$

$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \\ y(4) \\ y(5) \\ y(6) \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \\ x(4) \end{bmatrix}$$

(ex) Circular convolution as a circulant matrix operation: take $h(n) = n$, $\tilde h(n) = h(n \bmod N)$, $N = 4$:

$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \end{bmatrix} = \begin{bmatrix} 0 & 3 & 2 & 1 \\ 1 & 0 & 3 & 2 \\ 2 & 1 & 0 & 3 \\ 3 & 2 & 1 & 0 \end{bmatrix} \begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{bmatrix}$$
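Both examples are easy to verify numerically (sketch; any test input works):

```python
import numpy as np
from scipy.linalg import circulant

# Toeplitz example: h(n) = delta(n) + delta(n-2), length-5 input.
h = np.array([1, 0, 1])
x = np.arange(1, 6)
H = np.array([[1,0,0,0,0],[0,1,0,0,0],[1,0,1,0,0],[0,1,0,1,0],
              [0,0,1,0,1],[0,0,0,1,0],[0,0,0,0,1]])
assert np.allclose(H @ x, np.convolve(h, x))   # 7 = 3 + 5 - 1 output samples

# Circulant example: h(n) = n, N = 4; circulant([0,1,2,3]) matches the matrix above.
C = circulant([0, 1, 2, 3])
x4 = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(C @ x4,
                   np.fft.ifft(np.fft.fft([0, 1, 2, 3]) * np.fft.fft(x4)).real)
```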
Orthogonal and Unitary Matrices

Orthogonal: $\mathbf{A}^{-1} = \mathbf{A}^T$, i.e. $\mathbf{A}\mathbf{A}^T = \mathbf{A}^T\mathbf{A} = \mathbf{I}$

Unitary: $\mathbf{A}^{-1} = \mathbf{A}^{*T}$, i.e. $\mathbf{A}\mathbf{A}^{*T} = \mathbf{A}^{*T}\mathbf{A} = \mathbf{I}$

Positive Definiteness and Quadratic Forms

$\mathbf{A}$ is called positive definite if $\mathbf{A}$ is a Hermitian matrix and $Q = \mathbf{x}^{*T}\mathbf{A}\mathbf{x} > 0$ for all $\mathbf{x} \ne \mathbf{0}$.

$\mathbf{A}$ is called positive semidefinite (nonnegative) if $\mathbf{A}$ is a Hermitian matrix and $Q = \mathbf{x}^{*T}\mathbf{A}\mathbf{x} \ge 0$ for all $\mathbf{x} \ne \mathbf{0}$.

Theorem: if $\mathbf{A}$ is a symmetric positive definite matrix, then all its eigenvalues are positive and the determinant of $\mathbf{A}$ satisfies

$$0 < |\mathbf{A}| \le \prod_{k=1}^{N} a(k,k)$$
Diagonal Forms

For any Hermitian matrix $\mathbf{R}$ there exists a unitary matrix $\mathbf{\Phi}$ such that

$$\mathbf{\Phi}^{*T}\mathbf{R}\mathbf{\Phi} = \mathbf{\Lambda} \qquad (\text{or } \mathbf{R} = \mathbf{\Phi}\mathbf{\Lambda}\mathbf{\Phi}^{*T})$$

where $\mathbf{\Lambda}$ is the diagonal matrix containing the eigenvalues of $\mathbf{R}$.

Eigenvalue and eigenvector:

$$\mathbf{R}\boldsymbol{\phi}_k = \lambda_k\boldsymbol{\phi}_k, \qquad k = 1, \ldots, N$$

$\lambda_k$: eigenvalue, $\boldsymbol{\phi}_k$: eigenvector, and $\mathbf{\Phi} = [\boldsymbol{\phi}_1\,|\,\boldsymbol{\phi}_2\,|\,\cdots\,|\,\boldsymbol{\phi}_N]$.
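A short numerical check of the diagonal form (sketch; `numpy.linalg.eigh` handles Hermitian matrices):

```python
import numpy as np

R = np.array([[2.0, 1.0 + 1.0j],
              [1.0 - 1.0j, 3.0]])        # Hermitian: equals its conjugate transpose
lam, Phi = np.linalg.eigh(R)             # eigenvalues and orthonormal eigenvectors

assert np.allclose(Phi.conj().T @ R @ Phi, np.diag(lam))   # Phi*T R Phi = Lambda
assert np.allclose(Phi.conj().T @ Phi, np.eye(2))          # Phi is unitary
```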
Block Matrices

Block matrices: the elements are themselves matrices,

$$\mathbf{A} = \{\mathbf{A}_{m,n}\} = \begin{bmatrix} \mathbf{A}_{1,1} & \mathbf{A}_{1,2} & \cdots & \mathbf{A}_{1,n} \\ \mathbf{A}_{2,1} & \mathbf{A}_{2,2} & \cdots & \mathbf{A}_{2,n} \\ \vdots & \vdots & & \vdots \\ \mathbf{A}_{m,1} & \mathbf{A}_{m,2} & \cdots & \mathbf{A}_{m,n} \end{bmatrix}$$

(ex) 2-D convolution:

$$y(m,n) = \sum_{m'=0}^{2}\sum_{n'=0}^{1} x(m',n')\,h(m-m',n-n'), \qquad 0 \le m \le 3,\ 0 \le n \le 2$$

$$x(m,n) = \begin{bmatrix} 1 & 2 \\ 4 & 5 \\ 1 & 3 \end{bmatrix}, \qquad h(m,n) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad y(m,n) = \begin{bmatrix} 1 & 3 & 2 \\ 5 & 10 & 3 \\ 5 & 5 & -2 \\ 1 & 2 & -3 \end{bmatrix}$$

Column stacking operation: write the images column by column,

$$\mathbf{X} = [\mathbf{x}_0\ \mathbf{x}_1], \qquad \mathbf{Y} = [\mathbf{y}_0\ \mathbf{y}_1\ \mathbf{y}_2]$$

so that, e.g., $\mathbf{x}_0 = [1\ 4\ 1]^T$, $\mathbf{x}_1 = [2\ 5\ 3]^T$, $\mathbf{y}_0 = [1\ 5\ 5\ 1]^T$.
Let $\mathbf{x}_n$ and $\mathbf{y}_n$ be the column vectors above. Then

$$\mathbf{y}_n = \sum_{n'=0}^{1} \mathbf{H}_{n-n'}\,\mathbf{x}_{n'}, \qquad \text{where } \mathbf{H}_n \triangleq \{h(m-m',n)\},\ 0 \le m \le 3,\ 0 \le m' \le 2$$

$$\mathbf{H}_0 = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \mathbf{H}_1 = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}$$

Stacking the column vectors,

$$\begin{bmatrix} \mathbf{y}_0 \\ \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix} = \begin{bmatrix} \mathbf{H}_0 & \mathbf{0} \\ \mathbf{H}_1 & \mathbf{H}_0 \\ \mathbf{0} & \mathbf{H}_1 \end{bmatrix} \begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \end{bmatrix}$$

$$\mathbf{y} = \mathbf{H}\mathbf{x} \qquad (\mathbf{H}: \text{block matrix, here block Toeplitz})$$
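A compact sketch of the whole block-matrix pipeline (assembled with `np.block`; `flatten(order="F")` performs the column stacking):

```python
import numpy as np
from scipy.signal import convolve2d

x = np.array([[1, 2], [4, 5], [1, 3]])
h = np.array([[1, 1], [1, -1]])

H0 = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1]])      # from h(., 0)
H1 = np.array([[1, 0, 0], [-1, 1, 0], [0, -1, 1], [0, 0, -1]])   # from h(., 1)
Z = np.zeros((4, 3), dtype=int)
H = np.block([[H0, Z], [H1, H0], [Z, H1]])                       # 12 x 6 block Toeplitz

y_stacked = H @ x.flatten(order="F")             # column-stacked input [x0; x1]
y = y_stacked.reshape((4, 3), order="F")         # un-stack into the 4 x 3 output
assert np.array_equal(y, convolve2d(x, h))       # matches full 2-D convolution
```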
Kronecker Products

Definition:

$$\mathbf{A} \otimes \mathbf{B} \triangleq \{a(m,n)\,\mathbf{B}\} = \begin{bmatrix} a(1,1)\mathbf{B} & \cdots & a(1,N)\mathbf{B} \\ \vdots & & \vdots \\ a(M,1)\mathbf{B} & \cdots & a(M,N)\mathbf{B} \end{bmatrix}$$

(ex)

$$\mathbf{A} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad \mathbf{B} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

$$\mathbf{A} \otimes \mathbf{B} = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 4 & 3 & 4 \\ 1 & 2 & 1 & 2 \\ 3 & 4 & 3 & 4 \end{bmatrix}, \qquad \mathbf{B} \otimes \mathbf{A} = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 1 & 1 & 2 & 2 \\ 3 & 3 & 4 & 4 \\ 3 & 3 & 4 & 4 \end{bmatrix}$$

Properties (Table 2.7), e.g.

$$(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = (\mathbf{A}\mathbf{C}) \otimes (\mathbf{B}\mathbf{D})$$

Evaluating $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D})$ directly takes $O(N^6)$ operations (for $N \times N$ factors), whereas $(\mathbf{A}\mathbf{C}) \otimes (\mathbf{B}\mathbf{D})$ takes $O(N^4)$ operations.
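The identity (and the cost asymmetry it exploits) can be checked in a few lines (sketch with random matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))

# (A (x) B)(C (x) D) multiplies two 9x9 matrices; (AC) (x) (BD) only 3x3 ones.
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
assert np.allclose(lhs, rhs)
```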
Separable Transformation

Transformation on an $N \times M$ image $\mathbf{U}$:

$$\mathbf{V} = \mathbf{A}\mathbf{U}\mathbf{B}^T \quad : \text{matrix form}$$

$$\mathbf{v} = (\mathbf{A} \otimes \mathbf{B})\,\mathbf{u} \quad : \text{vector form (row-ordered } \mathbf{u}, \mathbf{v})$$

Consider the general transformation

$$v(k,l) = \sum_m \sum_n t(k,l;m,n)\,u(m,n)$$

If $t(k,l;m,n) = a(k,m)\,b(l,n)$ (separable), then

$$v(k,l) = \sum_m a(k,m)\sum_n b(l,n)\,u(m,n)$$

Let $\mathbf{v}_k^T$ and $\mathbf{u}_m^T$ be the rows of $\mathbf{V}$ and $\mathbf{U}$. Then

$$\mathbf{v}_k^T = \sum_m a(k,m)\,[\mathbf{B}\mathbf{u}_m]^T$$

and stacking the rows $[\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_M]^T$ gives the row-ordered form $\mathbf{v} = (\mathbf{A} \otimes \mathbf{B})\,\mathbf{u}$.
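The matrix/vector equivalence is easy to confirm numerically, with NumPy's C-order `flatten` playing the role of row ordering (sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 3, 4
A = rng.standard_normal((N, N))          # acts along image columns
B = rng.standard_normal((M, M))          # acts along image rows
U = rng.standard_normal((N, M))

V = A @ U @ B.T                          # matrix form
v = np.kron(A, B) @ U.flatten()          # vector form on the row-ordered image
assert np.allclose(v, V.flatten())
```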
Random Signals

Definitions:

- Random signal: a sequence of random variables
- Mean: $\mu_u(n) = E[u(n)]$
- Variance: $\sigma_u^2(n) = E[|u(n) - \mu_u(n)|^2]$
- Covariance: $\sigma_{uu}^2(n,n') = \mathrm{Cov}[u(n), u(n')] = E\{[u(n)-\mu_u(n)][u(n')-\mu_u(n')]^*\}$
- Cross covariance: $\sigma_{uv}^2(n,n') = \mathrm{Cov}[u(n), v(n')] = E\{[u(n)-\mu_u(n)][v(n')-\mu_v(n')]^*\}$
- Autocorrelation: $r_{uu}(n,n') = E[u(n)\,u^*(n')] = \sigma_{uu}^2(n,n') + \mu_u(n)\,\mu_u^*(n')$
- Cross correlation: $r_{uv}(n,n') = E[u(n)\,v^*(n')] = \sigma_{uv}^2(n,n') + \mu_u(n)\,\mu_v^*(n')$
Representation for an $N \times 1$ vector:

- Mean vector: $\boldsymbol{\mu} = E[\mathbf{u}] = \{\mu(n)\}$ : $N \times 1$ vector
- Covariance matrix: $\mathbf{C}_{uu} = \mathrm{Cov}[\mathbf{u}] = E[(\mathbf{u}-\boldsymbol{\mu})(\mathbf{u}-\boldsymbol{\mu})^{*T}] = \{\sigma_{uu}^2(n,n')\}$ : $N \times N$ matrix
- Cross covariance matrix: $\mathbf{C}_{uv} = \mathrm{Cov}[\mathbf{u},\mathbf{v}] = E[(\mathbf{u}-\boldsymbol{\mu}_u)(\mathbf{v}-\boldsymbol{\mu}_v)^{*T}] = \{\sigma_{uv}^2(n,n')\}$ : $N \times N$ matrix

Gaussian (or normal) distribution:

$$p_u(u) = \frac{1}{\sqrt{2\pi\sigma_u^2}}\exp\left\{-\frac{|u-\mu_u|^2}{2\sigma_u^2}\right\}$$

Gaussian random processes: a process is Gaussian if the joint probability density of any finite sub-sequence is a Gaussian distribution,

$$p_u(\mathbf{u}) = p(u_1, u_2, \ldots, u_N) = \left[(2\pi)^{N/2}\,|\mathbf{C}|^{1/2}\right]^{-1}\exp\left\{-\tfrac{1}{2}(\mathbf{u}-\boldsymbol{\mu})^{*T}\mathbf{C}^{-1}(\mathbf{u}-\boldsymbol{\mu})\right\}$$

where $\mathbf{C}$ is the covariance matrix.
Stationary Processes

Strict-sense stationary: the joint density of any partial sequence is the same as that of the shifted sequence,

$$F(x(n), x(n+1), \ldots, x(n+k-1)) = F(x(n+n_0), x(n+n_0+1), \ldots, x(n+n_0+k-1)) \quad \text{for all } n_0,\ k$$

Wide-sense stationary:

$$E[u(n)] = \mu = \text{constant}, \qquad E[u(n)\,u^*(n')] = r_{uu}(n-n')$$

so the covariance matrix is Toeplitz.

Gaussian process: wide-sense stationary = strict-sense stationary.
Orthogonal: $E[xy^*] = 0$

Independent: $p_{x,y}(x,y) = p_x(x)\,p_y(y)$

Uncorrelated: $E[xy^*] = E[x]\,E[y^*]$, or equivalently $E[(x-\mu_x)(y-\mu_y)^*] = 0$

(ex) Covariance matrix of a first-order stationary Markov sequence $u(n)$ with $r(n) = \rho^{|n|}$, $|\rho| < 1$:

$$\mathbf{C} = \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{N-1} \\ \rho & 1 & \rho & \cdots & \rho^{N-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{N-3} \\ \vdots & & & \ddots & \vdots \\ \rho^{N-1} & \rho^{N-2} & \cdots & \rho & 1 \end{bmatrix} \quad : \text{Toeplitz}$$

Markov processes: a process is $p$th-order Markov if

$$\mathrm{prob}[u(n)\,|\,u(n-1), u(n-2), \ldots] = \mathrm{prob}[u(n)\,|\,u(n-1), \ldots, u(n-p)] \quad \forall n$$
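A sketch of this covariance matrix in code (its Toeplitz structure and positive definiteness are immediate to check):

```python
import numpy as np
from scipy.linalg import toeplitz

rho, N = 0.9, 8
C = toeplitz(rho ** np.arange(N))          # C[i, j] = rho ** |i - j|

assert np.all(np.linalg.eigvalsh(C) > 0)   # positive definite for |rho| < 1
```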
Karhunen-Loeve (KL) Transform

KL transform of $\mathbf{x}$:

$$\mathbf{y} = \mathbf{\Phi}^{*T}\mathbf{x}, \qquad \mathbf{\Phi}: N \times N \text{ unitary matrix}$$

Property — the elements of $\mathbf{y}$ are orthogonal:

$$E[\mathbf{y}\mathbf{y}^{*T}] = \mathbf{\Phi}^{*T}E[\mathbf{x}\mathbf{x}^{*T}]\mathbf{\Phi} = \mathbf{\Phi}^{*T}\mathbf{R}\mathbf{\Phi} = \mathbf{\Lambda}$$

$$E[y(k)\,y^*(l)] = \lambda_k\,\delta(k-l)$$

$\mathbf{\Phi}^{*T}$ is called the KL transform matrix; the rows of $\mathbf{\Phi}^{*T}$ are the conjugate eigenvectors of $\mathbf{R}$.
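A minimal sketch, assuming the Markov covariance from the previous example as $\mathbf{R}$: the KL transform built from the eigenvectors of $\mathbf{R}$ diagonalizes it exactly, so the transform coefficients are uncorrelated:

```python
import numpy as np
from scipy.linalg import toeplitz

rho, N = 0.9, 8
R = toeplitz(rho ** np.arange(N))       # autocorrelation matrix of the process

lam, Phi = np.linalg.eigh(R)            # columns of Phi: eigenvectors of R
Lam = Phi.T @ R @ Phi                   # E[y y^T] for y = Phi^T x

assert np.allclose(Lam, np.diag(lam))   # coefficients of y are decorrelated
```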
Discrete Random Fields

Definitions:

- Discrete random field: each sample of a 2-D sequence $u(m,n)$ is a random variable
- Mean: $\mu(m,n) = E[u(m,n)]$
- Covariance: $\sigma_{uu}^2(m,n;m',n') = \mathrm{Cov}[u(m,n), u(m',n')] = E\{[u(m,n)-\mu(m,n)][u(m',n')-\mu(m',n')]^*\}$
- White noise field: $\sigma_{xx}^2(m,n;m',n') = \sigma_{xx}^2(m,n)\,\delta(m-m',n-n')$
- Symmetry: $\sigma_{uu}^2(m,n;m',n') = \sigma_{uu}^{2*}(m',n';m,n)$
Separable and Isotropic Image Covariance Functions

Separable (nonstationary case):

$$\sigma_{xx}^2(m,n;m',n') = \sigma_1^2(m,m')\,\sigma_2^2(n,n')$$

Separable stationary covariance function:

$$r_{xx}(m,n) = \sigma^2\rho_1^{|m|}\rho_2^{|n|}, \qquad |\rho_1| < 1,\ |\rho_2| < 1$$

Nonseparable exponential function:

$$r_{xx}(m,n) = \sigma^2\exp\left\{-\sqrt{\alpha_1^2 m^2 + \alpha_2^2 n^2}\right\}$$

which for $\alpha_1 = \alpha_2 = \alpha$ becomes $r_{xx}(m,n) = \sigma^2\exp(-\alpha d)$, $d \triangleq \sqrt{m^2+n^2}$ (isotropic or circularly symmetric).

Estimation of mean and autocorrelation:

$$\hat\mu = \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N} u(m,n)$$

$$\hat\sigma_{xx}^2(m,n) = \frac{1}{MN}\sum_{m'=1}^{M-m}\sum_{n'=1}^{N-n}\left[u(m'+m, n'+n) - \hat\mu\right]\left[u(m',n') - \hat\mu\right]$$
SDF (Spectral Density Function)

Definition: the Fourier transform of the autocorrelation function.

1-D case:

$$S_{uu}(f) = \mathrm{SDF}\{u(n)\} = \sum_n r_{uu}(n)\,e^{-j2\pi fn}, \qquad r_{uu}(n) = \int_{-0.5}^{0.5} S_{uu}(f)\,e^{j2\pi fn}\,df$$

2-D case:

$$S_{uu}(u,v) = \mathrm{SDF}\{u(m,n)\} = \sum_m\sum_n r_{uu}(m,n)\,e^{-j2\pi(um+vn)}$$

$$r_{uu}(m,n) = \int_{-0.5}^{0.5}\int_{-0.5}^{0.5} S_{uu}(u,v)\,e^{j2\pi(um+vn)}\,du\,dv$$

Average power:

$$r_{uu}(0,0) = \int_{-0.5}^{0.5}\int_{-0.5}^{0.5} S_{uu}(u,v)\,du\,dv$$

(ex) The SDF of a stationary white noise field: $r_{xx}(m,n) = \sigma^2\delta(m,n)\ \Rightarrow\ S(u,v) = \sigma^2$ (flat).
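As a 1-D sanity check (sketch, not from the slides): for the first-order Markov covariance $r(n) = \rho^{|n|}$, the SDF sum is a geometric series with the standard closed form used below:

```python
import numpy as np

rho, L = 0.8, 4096                       # L: truncation length for the sum over n
n = np.arange(-L, L + 1)
r = rho ** np.abs(n)                     # autocorrelation r(n) = rho**|n|

f = np.linspace(-0.5, 0.5, 101)
S = np.array([np.sum(r * np.exp(-2j * np.pi * fk * n)).real for fk in f])
S_closed = (1 - rho**2) / (1 - 2 * rho * np.cos(2 * np.pi * f) + rho**2)
assert np.allclose(S, S_closed, atol=1e-6)
```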
Estimation Theory

Mean square estimates: estimate the random variable $x$ by a suitable function $g(\mathbf{y})$ of the observations such that

$$E[|x - g(\mathbf{y})|^2] = \iint |x - g(y)|^2 f_{xy}(x,y)\,dx\,dy \quad \text{is minimized.}$$

But $f_{xy}(x,y) = f_{x|y}(x|y)\,f_y(y)$, so

$$E[|x - g(\mathbf{y})|^2] = \int f_y(y)\left[\int |x - g(y)|^2 f_{x|y}(x|y)\,dx\right]dy$$

Since the integrand is non-negative, it is sufficient to minimize

$$\int |x - g(y)|^2 f_{x|y}(x|y)\,dx \quad \text{for every } y$$

$$\Rightarrow\quad \hat g(\mathbf{y}) = \int x\,f_{x|y}(x|y)\,dx = E[x\,|\,\mathbf{y}]$$
This is the minimum mean square estimate (MMSE). Also

$$E[\hat x] = E[g(\mathbf{y})] = E[E[x|\mathbf{y}]] = E[x]$$

→ unbiased estimator.

◆ Theorem

Let $\mathbf{y} \triangleq [y_1\ y_2\ y_3\ \cdots\ y_N]^T$ and $x$ be jointly Gaussian with zero mean. The MMSE estimate is

$$E[x\,|\,\mathbf{y}] = \sum_{i=1}^{N} a_i y_i,$$

where the $a_i$ are chosen such that

$$E\left[\left(x - \sum_{i=1}^{N} a_i y_i\right)y_k\right] = 0 \quad \forall\, k = 1, 2, \ldots, N$$

(Pf) The random variables $x - \sum_{i=1}^{N} a_i y_i$ and $y_1, y_2, \ldots, y_N$ are jointly Gaussian. But the first one is uncorrelated with all the rest, so it is independent of them. Thus, the error $x - \sum_{i=1}^{N} a_i y_i$ is independent of the random vector $\mathbf{y}$.
Therefore

$$E\left[\left(x - \sum_{i=1}^{N} a_i y_i\right)\Big|\,\mathbf{y}\right] = E\left[x - \sum_{i=1}^{N} a_i y_i\right] = E[x] - \sum_{i=1}^{N} a_i E[y_i] = 0$$

$$\Rightarrow\quad E[x\,|\,\mathbf{y}] = E\left[\sum_{i=1}^{N} a_i y_i\,\Big|\,\mathbf{y}\right] = \sum_{i=1}^{N} a_i y_i$$

For the linear estimate $\hat x = \sum_{n=1}^{N}\alpha(n)\,y(n)$, minimize

$$\min_{\{\alpha(n)\}} E[(x - \hat x)^2] = \min_{\{\alpha(n)\}} E[e^2], \qquad e \triangleq x - \sum_{n=1}^{N}\alpha(n)\,y(n)\ :\ \text{estimation error}$$

Setting

$$\frac{\partial E[e^2]}{\partial\alpha(n)} = 0$$

yields

$$E[y(n)\,e] = 0, \qquad n = 1, 2, \ldots, N$$
The estimation error is minimized if

$$E[y(n)\,e] = 0, \qquad n = 1, 2, \ldots, N$$

→ orthogonality principle.

If $x$ and $\{y(n)\}$ are independent:

$$\hat x = E[x\,|\,\mathbf{y}] = E[x]$$

If they are zero-mean Gaussian random variables:

$$\hat x = \sum_{n=1}^{N}\alpha(n)\,y(n)\ :\ \text{a linear combination of } \{y(n)\}$$

where $\alpha(n)$ is determined by solving linear equations.
Orthogonality Principle

The minimum mean square estimation error vector is orthogonal to every random variable functionally related to the observations, i.e., for any $g(\mathbf{y}) = g(y(1), y(2), \ldots, y(N))$:

$$E[(x - \hat x)\,g(\mathbf{y})] = 0$$

Since $\hat x$ is a function of $\mathbf{y}$,

$$E[\hat x\,g(\mathbf{y})] = E[E[x|\mathbf{y}]\,g(\mathbf{y})] = E[E[x\,g(\mathbf{y})\,|\,\mathbf{y}]] = E[x\,g(\mathbf{y})]$$

so that $E[(x-\hat x)\,\hat x] = 0$ and $E[(x - \hat x)\,g(\mathbf{y})] = 0$.

Substituting $\hat x = \sum_{n=1}^{N}\alpha(n)\,y(n)$:

$$\sum_{n=1}^{N}\alpha(n)\,E[y(k)\,y(n)] = E[x\,y(k)], \qquad k = 1, \ldots, N$$

In matrix notation:

$$\mathbf{R}_{yy}\,\boldsymbol{\alpha} = \mathbf{r}_{xy}, \qquad \mathbf{R}_{yy} = \{E[y(k)\,y(n)]\},\quad \mathbf{r}_{xy} = \{E[x\,y(n)]\}$$
Minimum MSE:

$$\sigma_e^2 = \sigma_x^2 - \boldsymbol{\alpha}^T\mathbf{r}_{xy}$$

If $x$, $y(n)$ are nonzero-mean r.v.'s:

$$\hat x = \mu_x + \sum_{n=1}^{N}\alpha(n)\left[y(n) - \mu_y(n)\right]$$

If $x$, $y(n)$ are non-Gaussian, the results still give the best linear mean square estimate.
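A sketch of the linear MMSE recipe on sampled data (the sample-moment estimates and variable names are mine): solve the normal equations $\mathbf{R}_{yy}\boldsymbol\alpha = \mathbf{r}_{xy}$ and confirm the orthogonality principle on the resulting error:

```python
import numpy as np

rng = np.random.default_rng(3)

N, n_samp = 4, 200_000
A = rng.standard_normal((N + 1, N + 1))
z = rng.standard_normal((n_samp, N + 1)) @ A.T   # correlated zero-mean samples
x, y = z[:, 0], z[:, 1:]                         # x: target, y: observations

Ryy = y.T @ y / n_samp                  # sample E[y(k) y(n)]
rxy = y.T @ x / n_samp                  # sample E[x y(n)]
alpha = np.linalg.solve(Ryy, rxy)
err = x - y @ alpha

# Orthogonality: the error is uncorrelated with every observation.
assert np.max(np.abs(y.T @ err / n_samp)) < 1e-8
```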
Information Theory

Information:

$$I_k = -\log_2 p_k \ \text{[bits]}$$

where $p_k$, $k = 1, \ldots, L$, are the probabilities of independent messages.

Entropy:

$$H = -\sum_{k=1}^{L} p_k\log_2 p_k \ \text{[bits/message]}$$

$$H_{\max} = -\sum_{k=1}^{L}\frac{1}{L}\log_2\frac{1}{L} = \log_2 L \ \text{[bits]}, \qquad \text{attained when } p_k = \frac{1}{L}$$

For a binary source, i.e., $L = 2$, $p_1 = p$, $p_2 = 1 - p$, $0 \le p \le 1$:

$$H = -p\log_2 p - (1-p)\log_2(1-p)$$

[Figure: $H(p)$ versus $p$ — zero at $p = 0$ and $p = 1$, maximum of 1 bit at $p = 0.5$.]
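A small helper makes these definitions concrete (sketch; the $0\log 0 = 0$ convention is handled by dropping zero-probability terms):

```python
import numpy as np

def entropy(p):
    """H = -sum p_k log2 p_k, in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                         # 0 log 0 is taken as 0
    return -np.sum(p * np.log2(p))

assert np.isclose(entropy([0.5, 0.5]), 1.0)        # fair binary source: 1 bit
assert np.isclose(entropy([1.0, 0.0]), 0.0)        # certain outcome: no information
assert np.isclose(entropy(np.full(8, 1/8)), 3.0)   # uniform over 8: log2(8) bits
```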
Let $x$ be a discrete r.v. with $S_x = \{1, 2, \ldots, K\}$, and for each $k$ let the event $A_k \triangleq \{x = k\}$ with $p_k = \Pr[x = k]$.

The uncertainty of $A_k$ is low if $p_k$ is close to one, and it is high if $p_k$ is small.

Uncertainty of an event:

$$I_x(k) = \ln\frac{1}{\Pr(x = k)}, \qquad I_x(k) = 0 \ \text{ if } \Pr(x = k) = 1$$

Entropy:

$$H_x = E[I_x(k)] = \sum_{k=1}^{K}\Pr(k)\,\ln\frac{1}{\Pr(k)}$$

Unit: bit, when the logarithm is base 2.
Entropy as a Measure of Information

Consider the event $A_k$, describing the emission of symbol $s_k$ by the source with probability $p_k$:

1) If $p_k = 1$ and $p_i = 0$ for all $i \ne k$: no surprise ⇒ no information when $s_k$ is emitted by the source.

2) If $p_k$ is low: more surprise ⇒ more information when $s_k$ is emitted by the source.

$$I(s_k) = \log\frac{1}{p_k} \ :\ \text{amount of information gained after observing the event } s_k$$

$$H_x = E[I(s_k)] \ :\ \text{average information per source symbol}$$
Ex) 16 balls: 4 balls labeled "1", 4 balls labeled "2", 2 balls "3", 2 balls "4", and one ball each of "5", "6", "7", "8".

Question: find out the number of the drawn ball through a series of yes/no questions.

$$H_x = 2\cdot\frac{1}{4}\log_2 4 + 2\cdot\frac{1}{8}\log_2 8 + 4\cdot\frac{1}{16}\log_2 16 = \frac{44}{16} \ \text{bit/ball}$$

1) Ask in sequence: "x=1?", "x=2?", ..., "x=7?", stopping at the first "yes"; seven "no" answers mean x=8.

The average number of questions asked:

$$E[L] = 1\cdot\frac{1}{4} + 2\cdot\frac{1}{4} + 3\cdot\frac{1}{8} + 4\cdot\frac{1}{8} + 5\cdot\frac{1}{16} + 6\cdot\frac{1}{16} + 7\cdot\frac{1}{16} + 7\cdot\frac{1}{16} = \frac{51}{16}$$
2) Ask balanced questions instead: "x≤2?"; on "yes" ask "x=1?" (deciding between 1 and 2); on "no" ask "x≤4?", then "x=3?" (deciding 3 vs. 4), and so on with "x≤6?", "x=5?", and finally "x=7?" (deciding 7 vs. 8).

$$E[L] = 2\cdot\frac{1}{4} + 2\cdot\frac{1}{4} + 3\cdot\frac{1}{8} + 3\cdot\frac{1}{8} + 4\cdot\frac{1}{16} + 4\cdot\frac{1}{16} + 4\cdot\frac{1}{16} + 4\cdot\frac{1}{16} = \frac{44}{16}$$

⇒ The problem of designing the series of questions to identify x is exactly the same as the problem of encoding the output of an information source.
x=1  (p=1/4)   000   yes / yes             ⇒ 11
x=2  (p=1/4)   001   yes / no              ⇒ 10
x=3  (p=1/8)   010   no / yes / yes        ⇒ 011
x=4  (p=1/8)   011   no / yes / no         ⇒ 010
x=5  (p=1/16)  100   no / no / yes / yes   ⇒ 0011
x=6  (p=1/16)  101   no / no / yes / no    ⇒ 0010
x=7  (p=1/16)  110   no / no / no / yes    ⇒ 0001
x=8  (p=1/16)  111   no / no / no / no     ⇒ 0000

3 bit/symbol fixed-length code vs. a variable-length code matched to the $p_k$ ⇒ Huffman code: short codes for frequent source symbols, long codes for rare source symbols. The entropy of x gives the minimum achievable average number of bits required to identify the outcome of x.
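Numerically, the variable-length code above exactly achieves the source entropy (sketch):

```python
import numpy as np

p = np.array([1/4, 1/4, 1/8, 1/8, 1/16, 1/16, 1/16, 1/16])
code_len = np.array([2, 2, 3, 3, 4, 4, 4, 4])      # lengths of 11, 10, 011, ...

H = -np.sum(p * np.log2(p))                        # source entropy
L_avg = np.sum(p * code_len)                       # average code length
assert np.isclose(H, 2.75) and np.isclose(L_avg, 2.75)
```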
Noiseless Coding Theorem (Shannon, 1948):

$$\min(R) = H_x + \varepsilon \ \text{bit/symbol}$$

where $R$ is the transmission rate and $\varepsilon$ is a positive quantity that can be made arbitrarily close to zero by a sophisticated coding procedure utilizing an appropriate amount of encoding delay.
Rate Distortion Function

Distortion:

$$D = E[(x - y)^2]$$

where $x$ is a Gaussian r.v. of variance $\sigma_x^2$ and $y$ is the reproduced value.

Rate distortion function of $x$:

$$R_D = \max\left[0,\ \frac{1}{2}\log_2\frac{\sigma_x^2}{D}\right] = \begin{cases}\dfrac{1}{2}\log_2(\sigma_x^2/D), & 0 \le D \le \sigma_x^2 \\[4pt] 0, & D > \sigma_x^2\end{cases}$$

[Figure: rate distortion function for a Gaussian source — $R_D$ decreases monotonically from large values near $D = 0$ to zero at $D = \sigma_x^2$.]

For N Gaussian r.v.'s $\{x(0), x(1), \ldots, x(N-1)\}$ with reproduced values $\{y(0), y(1), \ldots, y(N-1)\}$ and a fixed average distortion $D$:

$$R_D = \frac{1}{N}\sum_{k=0}^{N-1}\max\left[0,\ \frac{1}{2}\log_2\frac{\sigma_k^2}{\theta}\right]$$

where $\theta$ is determined by solving

$$D = \frac{1}{N}\sum_{k=0}^{N-1}\min\left[\theta,\ \sigma_k^2\right]$$
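A sketch of this "reverse water-filling" computation (the function and the bisection bracket are my choices, assuming $0 < D < \frac{1}{N}\sum_k\sigma_k^2$):

```python
import numpy as np
from scipy.optimize import brentq

def rate_distortion(var, D):
    """R_D = mean(max(0, 0.5 log2(var/theta))), with theta solving
    D = mean(min(theta, var))."""
    var = np.asarray(var, dtype=float)
    theta = brentq(lambda th: np.mean(np.minimum(th, var)) - D,
                   1e-12, var.max())
    return np.mean(np.maximum(0.0, 0.5 * np.log2(var / theta)))

# Single-variable sanity check: R(D) = 0.5 log2(sigma^2 / D).
assert np.isclose(rate_distortion([4.0], 1.0), 1.0)
print(rate_distortion([4.0, 2.0, 1.0, 0.5], D=0.5))   # bits per r.v.
```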