Chapter 8. Principal-Components Analysis
Neural Networks and Learning Machines (Haykin)
Lecture Notes of Self-learning Neural Algorithms
Byoung-Tak Zhang
School of Computer Science and Engineering
Seoul National University
Contents
8.1 Introduction
8.2 Principles of Self-Organization
8.3 Self-organized Feature Analysis
8.4 Principal-Components Analysis
8.5 Hebbian-Based Maximum Eigenfilter
8.6 Hebbian-Based PCA
8.7 Case Study: Image Decoding
Summary and Discussion
8.1 Introduction
• Supervised learning
  - Learning from labeled examples
• Semisupervised learning
  - Learning from unlabeled and labeled examples
• Unsupervised learning
  - Learning from examples without a teacher
  - Self-organized learning: neurobiological considerations; locality of learning (immediate local behavior of neurons)
  - Statistical learning theory: mathematical considerations; less emphasis on locality of learning
8.2 Principles of Self-Organization (1/2)
• Principle 1: Self-amplification (self-reinforcement)
  - Synaptic modification self-amplifies, following Hebb's postulate of learning:
    1) If the two neurons of a synapse are activated simultaneously, the synaptic strength is selectively increased.
    2) If the two neurons of a synapse are activated asynchronously, the synaptic strength is selectively weakened or eliminated.
  - Four key mechanisms of a Hebbian synapse: a time-dependent mechanism, a local mechanism, an interactive mechanism, and a conjunctional (correlational) mechanism
  - Hebbian update: Δw_kj(n) = η y_k(n) x_j(n)
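The Hebbian update above can be sketched numerically. A minimal sketch: the weights, input, and learning rate below are hypothetical toy values, chosen only to illustrate rule 1 (co-activation strengthens a synapse):

```python
import numpy as np

def hebb_update(w, x, eta=0.1):
    """One Hebbian step, Delta w_kj = eta * y_k * x_j, for a single linear neuron."""
    y = w @ x                 # postsynaptic activity y_k
    return w + eta * y * x    # weights of co-active inputs are strengthened

w = np.array([0.5, -0.2])     # hypothetical initial weights
x = np.array([1.0, 0.0])      # only the first input is active
w_new = hebb_update(w, x)     # -> [0.55, -0.2]: only the active synapse grows
```

Note that repeated application of this rule grows the weights without bound; the normalized rule of Section 8.5 addresses this.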
8.2 Principles of Self-Organization (2/2)
• Principle 2: Competition
  - Limitation of available resources
  - The most vigorously growing (fittest) synapses or neurons are selected at the expense of the others.
  - Synaptic plasticity (adjustability of a synaptic weight)
• Principle 3: Cooperation
  - Modifications in synaptic weights at the neural level and in neurons at the network level tend to cooperate with each other.
  - Lateral interaction among a group of excited neurons
• Principle 4: Structural information
  - The underlying structure (redundancy) in the input signal is acquired by a self-organizing system.
  - An inherent characteristic of the input signal
8.3 Self-organized Feature Analysis

Figure 8.1: Layout of Linsker's modular, self-adaptive model with overlapping receptive fields, a model of the mammalian visual system.
8.4 Principal-Components Analysis (1/8)

Does there exist an invertible linear transformation T such that the truncation of Tx is optimum in the mean-square-error sense?

x: m-dimensional data vector (a realization of X)
X: m-dimensional random vector
q: m-dimensional unit vector

Projection:  A = X^T q = q^T X
Variance:    σ² = E[A²] = E[(q^T X)(X^T q)] = q^T E[X X^T] q = q^T R q
R = E[X X^T]: the m-by-m correlation matrix of X
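The identity σ² = q^T R q can be checked numerically. A minimal sketch, assuming zero-mean toy data (the per-coordinate scaling factors are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy zero-mean data: 5000 samples of a 3-dimensional random vector
X = rng.standard_normal((5000, 3)) @ np.diag([3.0, 1.0, 0.5])
R = X.T @ X / len(X)           # sample correlation matrix R = E[X X^T]

q = np.array([1.0, 0.0, 0.0])  # a unit vector
a = X @ q                      # projections A = q^T X, one per sample
# The mean-square projection equals the quadratic form q^T R q
assert np.isclose(np.mean(a**2), q @ R @ q)
```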
8.4 Principal-Components Analysis (2/8)

Variance probe:  ψ(q) = σ² = q^T R q      (**)

For any small perturbation δq that preserves the unit norm, ψ(q + δq) = ψ(q). Introducing a scalar factor λ leads to the eigenvalue problem:
    R q = λ q

λ_1, λ_2, ..., λ_m: eigenvalues of R
q_1, q_2, ..., q_m: associated eigenvectors of R
    R q_j = λ_j q_j,   j = 1, 2, ..., m
    λ_1 > λ_2 > ... > λ_j > ... > λ_m
    Q = [q_1, q_2, ..., q_j, ..., q_m],   R Q = Q Λ

Eigendecomposition:
    i)  Q^T R Q = Λ,  i.e.  q_k^T R q_j = λ_j if k = j, and 0 if k ≠ j
    ii) R = Q Λ Q^T = Σ_{i=1}^{m} λ_i q_i q_i^T      (spectral theorem)

From (**), we see that ψ(q_j) = λ_j,   j = 1, 2, ..., m
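Both forms of the eigendecomposition, and the fact that ψ(q_j) = λ_j, can be verified with NumPy; the toy data below is an arbitrary choice for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 4))
R = X.T @ X / len(X)               # symmetric sample correlation matrix

lam, Q = np.linalg.eigh(R)         # eigh returns ascending eigenvalues
lam, Q = lam[::-1], Q[:, ::-1]     # reorder so lambda_1 >= lambda_2 >= ...

# i) Q^T R Q = Lambda (diagonal matrix of eigenvalues)
assert np.allclose(Q.T @ R @ Q, np.diag(lam))
# ii) R = sum_i lambda_i q_i q_i^T  (spectral theorem)
assert np.allclose(R, sum(l * np.outer(q, q) for l, q in zip(lam, Q.T)))
# The variance probe at an eigenvector equals its eigenvalue: psi(q_j) = lambda_j
assert np.isclose(Q[:, 0] @ R @ Q[:, 0], lam[0])
```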
8.4 Principal-Components Analysis (3/8)

• Summary of the eigenstructure of PCA
  1) The eigenvectors of the correlation matrix R of the random vector X define the unit vectors q_j, representing the principal directions along which the variance probes ψ(q_j) have their extremal values.
  2) The associated eigenvalues λ_j define those extremal values of the variance probes.
8.4 Principal-Components Analysis (4/8)

Data vector x: a realization of X;  a: a realization of A
    a_j = q_j^T x = x^T q_j,   j = 1, 2, ..., m
a_j: the projections of x onto the principal directions (the principal components)

Analysis (encoding):
    a = [a_1, a_2, ..., a_m]^T = [x^T q_1, x^T q_2, ..., x^T q_m]^T = Q^T x

Reconstruction (synthesis) of the original data x:
    x = Q a = Σ_{j=1}^{m} a_j q_j,   since Q Q^T = I
8.4 Principal-Components Analysis (5/8)

Dimensionality reduction
    λ_1, λ_2, ..., λ_ℓ: the ℓ largest eigenvalues of R

Decoder (reconstruction from ℓ components):
    x̂ = Σ_{j=1}^{ℓ} a_j q_j = [q_1, q_2, ..., q_ℓ] [a_1, a_2, ..., a_ℓ]^T,   ℓ ≤ m

Encoder for x: a linear projection from R^m to R^ℓ:
    [a_1, a_2, ..., a_ℓ]^T = [q_1, q_2, ..., q_ℓ]^T x,   ℓ ≤ m

Figure 8.2: The two phases of PCA. (a) Encoding. (b) Decoding.
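The two phases can be sketched directly from the formulas above; the toy data and the choice ℓ = 2 are arbitrary assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(2)
# Toy data with strongly unequal variances per coordinate
X = rng.standard_normal((3000, 5)) * np.array([4.0, 2.0, 1.0, 0.3, 0.1])
R = X.T @ X / len(X)
_, Q = np.linalg.eigh(R)
Q = Q[:, ::-1]                     # columns q_1, ..., q_m, descending eigenvalue

ell = 2
x = X[0]
a = Q[:, :ell].T @ x               # encode: a = [q_1^T x, ..., q_ell^T x]
x_hat = Q[:, :ell] @ a             # decode: x_hat = sum_{j<=ell} a_j q_j

# With ell = m the reconstruction is exact, since Q Q^T = I
assert np.allclose(Q @ (Q.T @ x), x)
```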
8.4 Principal-Components Analysis (6/8)

Approximation error vector:
    e = x − x̂
    e = Σ_{i=ℓ+1}^{m} a_i q_i

Figure 8.3: Relationship between the data vector x, its reconstructed version x̂, and the error vector e.
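Two properties follow from this decomposition: e is orthogonal to x̂ (the geometry Figure 8.3 depicts), and the mean error energy equals the sum of the discarded eigenvalues. A numerical check on arbitrary toy data:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((20000, 4)) * np.array([3.0, 1.5, 0.5, 0.2])
R = X.T @ X / len(X)
lam, Q = np.linalg.eigh(R)
lam, Q = lam[::-1], Q[:, ::-1]

ell = 2
A = X @ Q                              # all m principal components per sample
X_hat = A[:, :ell] @ Q[:, :ell].T      # reconstruction from the leading ell
E = X - X_hat                          # error vectors e = x - x_hat

# e is orthogonal to x_hat for every sample (orthonormal eigenvectors)
assert np.allclose((E * X_hat).sum(axis=1), 0.0)
# Mean error energy E[||e||^2] equals the sum of the discarded eigenvalues
assert np.isclose((E**2).sum(axis=1).mean(), lam[ell:].sum())
```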
8.4 Principal-Components Analysis (7/8)

Figure 8.4: A cloud of data points. The projection onto Axis 1 has maximum variance and shows a bimodal distribution.
8.4 Principal-Components Analysis (8/8)
Figure 8.5: Digital compression of handwritten digits using PCA.
8.5 Hebbian-Based Maximum Eigenfilter (1/4)

A linear neuron with Hebbian adaptation:
    y = Σ_{i=1}^{m} w_i x_i

The synaptic weight w_i varies with time. The plain Hebbian update
    w_i(n+1) = w_i(n) + η y(n) x_i(n),   i = 1, 2, ..., m
grows without bound, so it is normalized (Oja's rule):
    w_i(n+1) = w_i(n) + η y(n) [x_i(n) − y(n) w_i(n)]

With the effective input  x_i′(n) = x_i(n) − y(n) w_i(n):
    w_i(n+1) = w_i(n) + η y(n) x_i′(n)
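A minimal simulation of the normalized rule on synthetic 2-D data whose dominant eigenvector is the first coordinate axis; the learning rate and sample count are arbitrary assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
# Toy data: variance 9 along axis 0, variance 0.25 along axis 1
X = rng.standard_normal((20000, 2)) * np.array([3.0, 0.5])

w = rng.standard_normal(2) * 0.1       # small random initial weights
eta = 0.001
for x in X:
    y = w @ x                          # y(n) = w^T(n) x(n)
    w = w + eta * y * (x - y * w)      # w(n+1) = w(n) + eta y(n)[x(n) - y(n) w(n)]

# w converges (approximately) to the unit dominant eigenvector, up to sign
assert abs(np.linalg.norm(w) - 1.0) < 0.1
assert abs(w[0]) > 0.9
```

The rule both normalizes ||w|| toward 1 and rotates w toward q_1, matching the convergence result stated on the next slide.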
8.5 Hebbian-Based Maximum Eigenfilter (2/4)

Figure 8.6: Signal-flow graph representation of the maximum eigenfilter.
8.5 Hebbian-Based Maximum Eigenfilter (3/4)

Matrix formulation:
    x(n) = [x_1(n), x_2(n), ..., x_m(n)]^T
    w(n) = [w_1(n), w_2(n), ..., w_m(n)]^T
    y(n) = x^T(n) w(n) = w^T(n) x(n)

    w(n+1) = w(n) + η y(n) [x(n) − y(n) w(n)]
           = w(n) + η x^T(n) w(n) [x(n) − x^T(n) w(n) w(n)]
           = w(n) + η [x(n) x^T(n) w(n) − (w^T(n) x(n) x^T(n) w(n)) w(n)]
8.5 Hebbian-Based Maximum Eigenfilter (4/4)

Asymptotic stability of the maximum eigenfilter:
    w(t) → q_1   as t → ∞

A single linear neuron governed by this self-organizing learning rule adaptively extracts the first principal component of a stationary input:
    x(n) ≈ y(n) q_1   for n → ∞

A Hebbian-based linear neuron with the learning rule
    w(n+1) = w(n) + η y(n) [x(n) − y(n) w(n)]
converges with probability 1 to a fixed point such that
    1) lim_{n→∞} σ²(n) = λ_1
    2) lim_{n→∞} w(n) = q_1,  with  lim_{n→∞} ||w(n)|| = 1
8.6 Hebbian-Based PCA (1/3)

Figure 8.7: Feedforward network with a single layer of computational nodes.

A layer of linear neurons with m inputs and ℓ outputs, ℓ < m;
w_ji: synaptic weight from input i to output j
    y_j(n) = Σ_{i=1}^{m} w_ji(n) x_i(n),   j = 1, 2, ..., ℓ
8.6 Hebbian-Based PCA (2/3)

Generalized Hebbian Algorithm (GHA)
    y_j(n) = Σ_{i=1}^{m} w_ji(n) x_i(n),   j = 1, 2, ..., ℓ

Weight update rule:
    Δw_ji(n) = η ( y_j(n) x_i(n) − y_j(n) Σ_{k=1}^{j} w_ki(n) y_k(n) )

Rewriting:
    Δw_ji(n) = η y_j(n) [ x_i′(n) − w_ji(n) y_j(n) ]
    x_i′(n) = x_i(n) − Σ_{k=1}^{j−1} w_ki(n) y_k(n)
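The GHA update can be sketched compactly; the toy data and learning rate are arbitrary assumptions, and the lower-triangular mask implements the sum over k ≤ j:

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy 3-D data; the eigenvalues of R are roughly [9, 2.25, 0.09]
X = rng.standard_normal((20000, 3)) * np.array([3.0, 1.5, 0.3])

ell, m = 2, 3
W = rng.standard_normal((ell, m)) * 0.1    # row j holds the weights of neuron j
eta = 0.002
for x in X:
    y = W @ x                              # y_j(n) = sum_i w_ji(n) x_i(n)
    # Delta w_ji = eta * (y_j x_i - y_j * sum_{k<=j} w_ki y_k), vectorized:
    L = np.tril(np.outer(y, y))            # L[j, k] = y_j y_k for k <= j, else 0
    W = W + eta * (np.outer(y, x) - L @ W)

R = X.T @ X / len(X)
_, Q = np.linalg.eigh(R)
Q = Q[:, ::-1]
# Row j of W aligns (up to sign) with the j-th leading eigenvector of R
assert abs(W[0] @ Q[:, 0]) > 0.9
assert abs(W[1] @ Q[:, 1]) > 0.9
```

The first neuron behaves exactly like the maximum eigenfilter; each later neuron sees the input with the earlier components deflated away, so neuron j extracts the j-th principal component.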
8.6 Hebbian-Based PCA (3/3)

Figure 8.8: Signal-flow graph of the GHA.

    Δw_ji(n) = η y_j(n) x_i″(n)
    x_i″(n) = x_i′(n) − w_ji(n) y_j(n)

    w_ji(n+1) = w_ji(n) + Δw_ji(n)
    w_ji(n) = z⁻¹[w_ji(n+1)]      (z⁻¹: unit-time delay)
8.7 Case Study: Image Decoding (1/3)

Figure 8.9: Signal-flow graph representation of how the reconstructed vector x̂ is computed in the GHA:
    x̂(n) = Σ_{k=1}^{ℓ} y_k(n) q_k
8.7 Case Study: Image Decoding (2/3)
Figure 8.10: (a) An image of Lena used in the image-coding experiment. (b) 8 × 8 masks representing the synaptic weights learned by the GHA. (c) Reconstructed image of Lena obtained using the dominant 8 principal components, without quantization. (d) Reconstructed image of Lena with an 11-to-1 compression ratio, using quantization.
8.7 Case Study: Image Decoding (3/3)
Figure 8.11: Image of peppers
Summary and Discussion
• PCA = dimensionality reduction = compression
• The Generalized Hebbian Algorithm (GHA) is a neural algorithm for PCA.
• Dimensionality reduction
  1) Representation of data
  2) Reconstruction of data
• Two views of unsupervised learning
  1) Bottom-up view
  2) Top-down view
• Nonlinear PCA methods
  1) Hebbian networks
  2) Replicator networks (autoencoders)
  3) Principal curves
  4) Kernel PCA