
Welcome to Adaptive Signal Processing! 1

From Merriam-Webster’s Collegiate Dictionary:

Main Entry: ad·ap·ta·tion
Pronunciation: "a-"dap-'tA-sh&n, -d&p-
Function: noun
Date: 1610
1 : the act or process of adapting : the state of being adapted
2 : adjustment to environmental conditions: as
a : adjustment of a sense organ to the intensity or quality of stimulation
b : modification of an organism or its parts that makes it more fit for existence under the conditions of its environment
3 : something that is adapted; specifically : a composition rewritten into a new form
- ad·ap·ta·tion·al /-shn&l, -sh&-n&l/ adjective
- ad·ap·ta·tion·al·ly adverb


Lectures and exercises 2

Lectures: Tuesdays 08.15-10.00 in room E:1406

Exercises: Wednesdays 08.15-10.00 in room E:1145

Computer exercises: Wednesdays 13.15-15.00 in room E:4115, or Thursdays 10.15-12.00 in room E:4115

Lab sessions: Lab I: Adaptive channel equalizer in room E:4115. Lab II: Adaptive filter on a DSP in room E:4115. Sign up on the lists on the webpage from Monday Nov 1.


Course literature 3

Book: Simon Haykin, Adaptive Filter Theory, 4th edition, Prentice-Hall, 2001. ISBN: 0-13-090126-1 (Hardcover)

Chapters: Backgr., (2), 4, 5, 6, 7, 8, 9, 13.2, 14.1

(3rd edition: Intr., 1, (5), 8, 9, 10, 13, 16.1, 17.2)

Exercise material: Exercise compendium (course home page)

Computer exercises (course home page)

Lab sessions (course home page)

Other material: Lecture notes (course home page)

Matlab code (course home page)


Contents - References in the 4th edition 4

Week 1: Review of OSB (Hayes, or chap. 2), The method of steepest descent (chap. 4)

Week 2: The LMS algorithm (chap. 5)

Week 3: Modified LMS algorithms (chap. 6)

Week 4: Frequency-domain adaptive filters (chap. 7)

Week 5: The RLS algorithm (chap. 8–9)

Week 6: Tracking and implementation aspects (chap. 13.2, 14.1)

Week 7: Summary


Contents - References in the 3rd edition 5

Week 1: Review of OSB (Hayes, or chap. 5), The method of steepest descent (chap. 8)

Week 2: The LMS algorithm (chap. 9)

Week 3: Modified LMS algorithms (chap. 9)

Week 4: Frequency-domain adaptive filters (chap. 10, 1)

Week 5: The RLS algorithm (chap. 11)

Week 6: Tracking and implementation aspects (chap. 16.1, 17.2)

Week 7: Summary


Lecture 1 6

This lecture deals with

• Repetition of the course Optimal signal processing (OSB)

• The method of steepest descent


Recap of Optimal signal processing (OSB) 7

The following problems were treated in OSB

• Signal modeling: either a model with both poles and zeros, or a model with only poles (vocal tract) or only zeros (lips).

• Inverse filter of FIR type: deconvolution or equalization of a channel.

• Wiener filter: filtering, equalization, prediction and deconvolution.


Optimal Linear Filtering 8

[Block diagram: the input signal u(n) passes through the filter w, giving the output signal y(n); y(n) is subtracted from the desired signal d(n) to form the estimation error e(n) = d(n) − y(n).]

The filter w = [w0 w1 w2 ...]^T that minimizes the estimation error e(n), such that the output signal y(n) resembles the desired signal d(n) as closely as possible, is sought.


Optimal Linear Filtering 9

In order to determine the optimal filter, a cost function J, which penalizes the deviation e(n), is introduced. The larger e(n) is, the higher the cost.

From OSB you know several different strategies, e.g.,

• The total squared error (LS): deterministic description of the signal.

J = Σ e²(n), summed over n = n1, ..., n2

• The mean squared error (MS): stochastic description of the signal.

J = E{|e(n)|²}

• The mean squared error with an extra constraint:

J = E{|e(n)|²} + λ|u(n)|²


Optimal Linear Filtering 10

The cost function J(n) = E{|e(n)|^p} can be used for any p ≥ 1, but most often p = 2 is chosen. This choice gives a convex cost function, which is referred to as the Mean Squared Error.

J = E{e(n)e*(n)} = E{|e(n)|²}                (MSE)


Optimal Linear Filtering 11

In order to find the optimal filter coefficients, J is minimized with respect to them. This is done by differentiating J with respect to w0, w1, ..., and then setting the derivatives to zero. Here, it is important that the cost function is convex, i.e., that there is a global minimum.

The minimization is expressed in terms of the gradient operator ∇,

∇J = 0

where ∇J is called the gradient vector.

In particular, the choice of the squared cost function (the Mean Squared Error) leads to the Wiener-Hopf system of equations.


Optimal Linear Filtering 12

In matrix form, the cost function J = E{|e(n)|²} can be written

J(w) = E{[d(n) − w^H u(n)][d(n) − w^H u(n)]*}
     = σd² − w^H p − p^H w + w^H R w

where

w = [w0 w1 ... wM−1]^T                             (M × 1)

u(n) = [u(n) u(n−1) ... u(n−M+1)]^T                (M × 1)

R = E{u(n) u^H(n)} =
    [ r(0)      r(1)      ...  r(M−1)
      r*(1)     r(0)      ...  r(M−2)
      ...       ...       ...  ...
      r*(M−1)   r*(M−2)   ...  r(0)   ]            (M × M)

p = E{u(n) d*(n)} = [p(0) p(−1) ... p(−(M−1))]^T   (M × 1)

σd² = E{d(n) d*(n)}


Optimal Linear Filtering 13

The gradient operator yields

∇J(w) = 2 ∂J(w)/∂w* = 2 ∂/∂w* (σd² − w^H p − p^H w + w^H R w)
      = −2p + 2Rw

If the gradient vector is set to zero, the Wiener-Hopf system of equations results,

R wo = p                (Wiener-Hopf)

whose solution is the Wiener filter,

wo = R⁻¹ p              (Wiener filter)

In other words, the Wiener filter is optimal when the cost function is the MSE.
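As a small numerical check, the example values that appear on the error-performance surface slide later in this lecture can be plugged in directly. A minimal Matlab sketch, assuming those values:

```matlab
% Minimal sketch, assuming the two-coefficient example values used later
% in the lecture (R, p and sigma_d^2 as on that slide).
R   = [1.1 0.5; 0.5 1.1];     % correlation matrix of u(n)
p   = [0.5272; -0.4458];      % cross-correlation between u(n) and d(n)
s2d = 0.9486;                 % variance of the desired signal d(n)
wo   = R \ p                  % Wiener filter, approx [0.8360; -0.7853]
Jmin = s2d - p'*wo            % minimum cost, approx 0.158
```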


Optimal Linear Filtering 14

The cost function's dependence on the filter coefficients w can be made clear if it is written in canonical form:

J(w) = σd² − w^H p − p^H w + w^H R w
     = σd² − p^H R⁻¹ p + (w − wo)^H R (w − wo)

Here, the Wiener-Hopf equations and the expression for the Wiener filter have been used, in addition to the fact that the following decomposition can be made:

w^H R w = (w − wo)^H R (w − wo) − wo^H R wo + wo^H R w + w^H R wo

With the optimal filter w = wo, the minimal error Jmin is achieved:

Jmin ≡ J(wo) = σd² − p^H R⁻¹ p                (MMSE)


Optimal Linear Filtering 15

Error-performance surface for an FIR filter with two coefficients, w = [w0, w1]^T.

[Figure: surface plot of J(w) and the corresponding contour plot over (w0, w1), with the minimum at wo.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], σd² = 0.9486,
wo = [0.8360 −0.7853]^T, Jmin = 0.1579
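The surface can be reproduced with a short Matlab sketch; the statistics are those given above, while the plotting range is an assumption:

```matlab
% Sketch of the error-performance surface for the two-coefficient example
% (values from the slide; the plotting range is chosen arbitrarily).
R   = [1.1 0.5; 0.5 1.1];
p   = [0.5272; -0.4458];
s2d = 0.9486;
[W0, W1] = meshgrid(-2:0.05:2, -2:0.05:2);
J = zeros(size(W0));
for k = 1:numel(W0)
    w    = [W0(k); W1(k)];
    J(k) = s2d - 2*(w'*p) + w'*R*w;   % quadratic cost for real-valued data
end
figure; surf(W0, W1, J); xlabel('w_0'); ylabel('w_1'); zlabel('J(w)');
figure; contour(W0, W1, J, 30); xlabel('w_0'); ylabel('w_1');
```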


Steepest Descent 16

The method of steepest descent is a recursive method for finding the Wiener filter when the statistics of the signals are known.

The method of steepest descent is not an adaptive filter, but it serves as a basis for the LMS algorithm, which is presented in Lecture 2.


Steepest Descent 17

The method of steepest descent is a recursive method that iterates towards the solution of the Wiener-Hopf equations. The statistics (R, p) are assumed to be known. The purpose is to avoid inverting R (this saves computations).

• Set start values for the filter coefficients, w(0) (n = 0).

• Determine the gradient ∇J(n), which points in the direction in which the cost function increases the most: ∇J(n) = −2p + 2Rw(n).

• Adjust w(n+1) in the direction opposite to the gradient, but weight the adjustment with the step-size parameter µ:

w(n+1) = w(n) + (µ/2)[−∇J(n)]

• Repeat steps 2 and 3 (a sketch of the recursion follows below).
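A minimal Matlab sketch of this recursion, assuming the statistics of the running example and a freely chosen step size:

```matlab
% Minimal sketch of the steepest-descent recursion (statistics assumed known;
% values taken from the lecture's running example, step size chosen freely).
R  = [1.1 0.5; 0.5 1.1];
p  = [0.5272; -0.4458];
mu = 0.1;                         % step size, must satisfy 0 < mu < 2/max(eig(R))
w  = [0; 0];                      % start values w(0)
for n = 0:99
    gradJ = -2*p + 2*R*w;         % gradient of the cost function at w(n)
    w     = w + 0.5*mu*(-gradJ);  % w(n+1) = w(n) + (mu/2)*[-grad J(n)]
end
w                                 % approaches wo = R\p = [0.8360; -0.7853]
```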


Convergence, filter coefficients 18

Since the method of steepest descent contains feedback, there is a risk that the algorithm diverges. This limits the choice of the step-size parameter µ. An example of the critical choice of µ is given below. The statistics are the same as in the previous example.

[Figure: the filter coefficients w(n) versus iteration n (0 to 100) for µ = 0.1, 1.0, 1.25 and 1.5, converging towards wo0 and wo1 for the smaller step sizes.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], wo = [0.8360 −0.7853]^T, w(0) = [0 0]^T
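The figure can be approximately recreated with the short Matlab sketch below (same assumed example values). Since the eigenvalues of R are 0.6 and 1.6, the run with µ = 1.5 exceeds the stability bound 2/λmax = 1.25 and diverges:

```matlab
% Sketch: filter-coefficient trajectories for different step sizes
% (example values from the slide; number of iterations chosen as 100).
R = [1.1 0.5; 0.5 1.1];  p = [0.5272; -0.4458];
mus = [0.1 1.0 1.25 1.5];
nIter = 100;
figure; hold on;
for m = 1:length(mus)
    w = zeros(2, nIter+1);               % w(:,1) = w(0) = [0; 0]
    for n = 1:nIter
        w(:,n+1) = w(:,n) + mus(m)*(p - R*w(:,n));   % steepest-descent update
    end
    plot(0:nIter, w.');                  % both coefficients versus iteration n
end
xlabel('Iteration n'); ylabel('w(n)');
```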


Convergence, error surface 19

The influence of the step-size parameter on the convergence can be seen when analyzing J(w). The example below illustrates the convergence towards Jmin for different choices of µ.

[Figure: contour plot of J(w) over (w0, w1) with the steepest-descent trajectories from w(0) towards wo for µ = 0.1, 0.5 and 1.0.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], wo = [0.8360 −0.7853]^T, w(0) = [1 1.7]^T


Convergence analysis 20

How should µ be chosen? A small value gives slow convergence, while a large value risks divergence.

Perform an eigenvalue decomposition of R in the expression for J(w(n)):

J(n) = Jmin + (w(n) − wo)^H R (w(n) − wo)
     = Jmin + (w(n) − wo)^H Q Λ Q^H (w(n) − wo)
     = Jmin + ν^H(n) Λ ν(n) = Jmin + Σk λk |νk(n)|²

The convergence of the cost function thus depends on ν(n), i.e., on the convergence of w(n), through the relationship ν(n) = Q^H (w(n) − wo).
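As a small numerical check with the R of the running example (whose eigenvalues are 0.6 and 1.6), the decomposition can be verified in Matlab:

```matlab
% Small check of the eigenvalue decomposition R = Q*Lambda*Q^H
% (R from the lecture's running example; its eigenvalues are 0.6 and 1.6).
R = [1.1 0.5; 0.5 1.1];
[Q, Lambda] = eig(R);
norm(Q*Lambda*Q' - R)     % essentially zero
```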


Convergence analysis 21

With the observation that w(n) = Q ν(n) + wo, the update of the cost function can be derived:

w(n+1) = w(n) + µ[p − R w(n)]

Q ν(n+1) + wo = Q ν(n) + wo + µ[p − R Q ν(n) − R wo]

ν(n+1) = ν(n) − µ Q^H R Q ν(n) = (I − µΛ) ν(n)

νk(n+1) = (1 − µλk) νk(n)                (element k of ν(n))

The latter is a first-order difference equation, with the solution

νk(n) = (1 − µλk)^n νk(0)

For this equation to converge it is required that |1 − µλk| < 1, which leads to the stability criterion of the method of steepest descent:

0 < µ < 2/λmax                (Stability, S.D.)
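For the running example this bound is easy to evaluate (Matlab sketch, R as before):

```matlab
% Stability bound for the steepest-descent step size, 0 < mu < 2/lambda_max
% (R from the running example).
R      = [1.1 0.5; 0.5 1.1];
lam    = eig(R);               % eigenvalues 0.6 and 1.6
mu_max = 2/max(lam)            % = 1.25, consistent with the divergence at mu = 1.5
```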


Convergence, time constants 22

The time constants indicate how many iterations it takes until the respective error has decreased by a factor e⁻¹, where e denotes the base of the natural logarithm. The smaller the time constant, the better.

The time constant τk for eigenmode k (eigenvalue λk) is

τk = −1 / ln(1 − µλk) ≈ 1/(µλk)  for µ << 1                (Time constant τk)

If the convergence of the whole coefficient vector w(n) is considered, the speed of convergence is limited by the largest and the smallest eigenvalues of R, λmax and λmin. This time constant is denoted τa:

−1 / ln(1 − µλmax) ≤ τa ≤ −1 / ln(1 − µλmin)                (Time constant τa)
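A minimal sketch evaluating the time constants for the running example, assuming µ = 0.1 as in the figures:

```matlab
% Time constants of the steepest-descent eigenmodes for the running example
% (mu = 0.1 assumed, as in the convergence figures).
R   = [1.1 0.5; 0.5 1.1];
mu  = 0.1;
lam = eig(R);                       % eigenvalues 0.6 and 1.6
tau = -1 ./ log(1 - mu*lam)         % exact time constants per eigenmode
tau_approx = 1 ./ (mu*lam)          % approximation for mu << 1
```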


Learning Curve 23

[Figure: learning curves, i.e., the cost function J(n) versus iteration n (0 to 60), for µ = 0.1, 0.5 and 1.0, all decaying towards Jmin.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], wo = [0.8360 −0.7853]^T, w(0) = [0 0]^T

How fast an adaptive filter converges is usually shown in a learning curve, which is a plot of J(n) as a function of the iteration n.

For SD, J(n) approaches Jmin. Since SD is deterministic, it is misleading to talk about learning curves in this case.
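The curves in the figure can be generated with a short Matlab sketch (same assumed example values), using J(n) = Jmin + (w(n) − wo)^H R (w(n) − wo):

```matlab
% Sketch: learning curves J(n) for the steepest-descent example
% (values from the slide; iteration count chosen as 60).
R   = [1.1 0.5; 0.5 1.1];  p = [0.5272; -0.4458];  s2d = 0.9486;
wo  = R\p;  Jmin = s2d - p'*wo;
nIter = 60;  mus = [0.1 0.5 1.0];
Jn = zeros(length(mus), nIter+1);
for m = 1:length(mus)
    w = [0; 0];
    for n = 0:nIter
        Jn(m, n+1) = Jmin + (w - wo)'*R*(w - wo);   % cost at iteration n
        w = w + mus(m)*(p - R*w);                   % steepest-descent update
    end
end
plot(0:nIter, Jn.'); xlabel('Iteration n'); ylabel('J(n)');
```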


Summary Lecture 1 24

Review of OSB

• Quadratic cost function, J = E{|e(n)|²}

• Definition of u(n), w, R and p

• The correlations R and p are assumed to be known in advance.

• The gradient vector ∇J

• The optimal filter coefficients are given by wo = R⁻¹p

• The optimal (minimal) cost Jmin ≠ 0


Summary Lecture 1 25

Summary of the method of steepest descent

• Recursive solution to the Wiener filter

• The statistics (R and p) are assumed to be known

• The gradient vector ∇J(n) is time-dependent but deterministic, and points in the direction in which the cost function increases the most.

• Recursion of the filter weights: w(n+1) = w(n) + (µ/2)[−∇J(n)]

• The cost function J(n) → Jmin as n → ∞, i.e., w(n) → wo as n → ∞

• For convergence it is required that the step size satisfies 0 < µ < 2/λmax

• The speed of convergence is determined by µ and the eigenvalues of R:

−1 / ln(1 − µλmax) ≤ τa ≤ −1 / ln(1 − µλmin)


To read 26

• Review of OSB: Hayes, or Haykin Chapter 2.

• Background on adaptive filters: Haykin, Background and Preview.

• Steepest descent: Haykin Chapter 4.

Exercises: 2.1, 2.2, 2.5, 3.1, 3.3, 3.5, 3.7, (3.2, 3.4, 3.6)

Computer exercise, theme: Implementation of the method of steepest descent.


Examples of adaptive systems 27

Here follows a number of examples of applications that will be discussed during the course. The material here is far from exhaustive; further examples can be found in Haykin, Background and Preview.

The idea with these examples is that you should already now start thinking about how adaptive filters can be used in different contexts.

Formulations of the type "use a filter of the same order as the system" deserve a clarification. Normally, the order of the system is not known, so several alternatives have to be investigated. If the order is increased successively, one will, where applicable, notice that beyond a certain length, further increases of the filter length give no further improvement.


Example: Inverse modelling 28

[Block diagram: the exciting signal d(n) drives the investigated system, whose output u(n) is fed to an adaptive FIR filter; the filter output d̂(n) is compared with d(n) delayed by Δ samples (z^−Δ), and the error e(n) drives the adaptive algorithm.]

In inverse modelling, the adaptive filter is connected in cascade with the investigated system. If the system has only poles, an adaptive filter of the corresponding order can be used (see the sketch below).
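A minimal sketch of this setup under simplifying assumptions: an invented first-order all-pole system, a direct least-squares/Wiener estimate instead of an adaptive algorithm, and no extra delay (Δ = 0):

```matlab
% Hypothetical sketch of inverse modelling: an all-pole system is inverted
% by an FIR filter of the corresponding order (system and lengths assumed).
N = 20000;
d = sign(randn(N,1));                   % exciting signal d(n)
u = filter(1, [1 -0.7], d);             % investigated system 1/(1 - 0.7 z^-1)
M = 2;                                  % FIR inverse of matching order
U = toeplitz(u, [u(1) zeros(1,M-1)]);   % rows: [u(n) u(n-1)]
w = (U'*U) \ (U'*d)                     % LS/Wiener estimate, close to [1; -0.7]
```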


Example: Modelling/Identification 29

[Block diagram: the input u(n) drives both the investigated system, giving d(n), and the adaptive FIR filter, giving d̂(n); the error e(n) = d(n) − d̂(n) drives the adaptive algorithm, i.e., the filter is connected in parallel with the system.]

In modelling/identification, the adaptive filter is connected in parallel with the investigated system. If the investigated system only has zeros, it is appropriate to use a corresponding length for the filter. If the system has both poles and zeros, a long adaptive FIR filter is generally required (a sketch follows below).
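A minimal sketch, again with an invented system and a Wiener/least-squares estimate in place of the adaptive algorithm; all names and lengths below are assumptions:

```matlab
% Hypothetical sketch of identification: the correlations R and p are
% estimated from data and the Wiener solution recovers the unknown system.
N = 10000;
h = [0.8 -0.3 0.2];                     % unknown system (zeros only)
u = randn(N,1);                         % input signal
d = filter(h, 1, u) + 0.01*randn(N,1);  % desired signal: system output plus a little noise
M = 3;                                  % adaptive FIR filter length (matches the system here)
U = toeplitz(u, [u(1) zeros(1,M-1)]);   % rows: [u(n) u(n-1) u(n-2)]
R = (U'*U)/N;                           % estimated correlation matrix
p = (U'*d)/N;                           % estimated cross-correlation vector
w = R\p                                 % close to h(:) = [0.8; -0.3; 0.2]
```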


Example: Echo canceller I 30

[Block diagram: Speaker 1's signal u(n) is fed both to the hybrid and to an adaptive FIR filter; the signal on the return path from the hybrid is d(n), the filter output is d̂(n), and the error e(n) = d(n) − d̂(n) drives the adaptive algorithm. Speaker 2 is connected on the other side of the hybrid.]

In telephony, the speech from Speaker 1 leaks through at the hybrid. Speaker 1 will then hear his or her own voice as an echo. This is what one wants to remove. Normally, the adaptation is stopped while Speaker 2 is talking. The filtering, however, runs all the time.


Example: Echo canceller II 31

[Block diagram: as above, but the echo path is acoustic; the signal u(n) from Speaker 1 is played through a loudspeaker, the echo-path impulse response together with Speaker 2's speech gives the microphone signal d(n), the adaptive FIR filter produces the echo estimate d̂(n), and e(n) = d(n) − d̂(n) drives the adaptive algorithm.]

This structure is applicable to speakerphones, video-conferencing systems and the like. Just as in the telephony case, Speaker 1 hears himself or herself as an echo. However, this effect is more noticeable here, since the microphone picks up what is emitted by the loudspeaker. The adaptation is normally stopped when Speaker 2 is talking, but the filtering runs all the time.


Example: Adaptive Line Enhancer 32

[Block diagram: the periodic signal s(n) plus the coloured disturbance v(n) form d(n); a version delayed by Δ samples, u(n) = d(n − Δ), is fed to the adaptive FIR filter, whose output d̂(n) is subtracted from d(n) to form e(n), which drives the adaptive algorithm.]

Information in the form of a periodic signal is disturbed by coloured noise, which is correlated with itself within a certain time frame. By delaying the signal enough that the noise (in u(n) and d(n), respectively) becomes uncorrelated, the noise can be suppressed through linear prediction.


Example: Channel equalizer 33

[Block diagram: a pseudo-noise sequence p(n) is sent through the channel C(z) and noise v(n) is added, giving u(n), which is fed to the adaptive FIR filter; the filter output d̂(n) is compared with p(n) delayed by Δ samples, which acts as the desired signal d(n) while the training switch is closed, and the error e(n) drives the adaptive algorithm.]

A known pseudo-noise sequence is used to estimate an inverse model of the channel ("training"). Thereafter, the adaptation is stopped, but the filter continues to operate on the transmitted signal. The purpose is to remove the channel's influence on the transmitted signal.
