
Welcome to Adaptive Signal Processing! 1

From Merriam-Webster’s Collegiate Dictionary:

Main Entry: ad·ap·ta·tion
Pronunciation: "a-"dap-'tA-sh&n, -d&p-
Function: noun
Date: 1610
1 : the act or process of adapting : the state of being adapted
2 : adjustment to environmental conditions: as
a : adjustment of a sense organ to the intensity or quality of stimulation
b : modification of an organism or its parts that makes it more fit for existence under the conditions of its environment
3 : something that is adapted; specifically : a composition rewritten into a new form
- ad·ap·ta·tion·al /-shn&l, -sh&-n&l/ adjective
- ad·ap·ta·tion·al·ly adverb


Lectures and exercises 2

Lectures: Tuesdays 08.15-10.00 in room E:1406

Exercises: Wednesdays 08.15-10.00 in room E:1145

Computer exercises: Wednesdays 13.15-15.00 in room E:4115, or Thursdays 10.15-12.00 in room E:4115

Lab sessions: Lab I: Adaptive channel equalizer in room E:4115. Lab II: Adaptive filter on a DSP in room E:4115. Sign up on the lists on the webpage from Monday Nov 1.


Course literature 3

Book: Simon Haykin, Adaptive Filter Theory, 4th edition, Prentice-Hall, 2001. ISBN: 0-13-090126-1 (Hardcover)

Chapters: Backgr., (2), 4, 5, 6, 7, 8, 9, 13.2, 14.1

(3rd edition: Intr., 1, (5), 8, 9, 10, 13, 16.1, 17.2)

Exercise material: Exercise compendium (course home page)

Computer exercises (course home page)

Lab sessions (course home page)

Other material: Lecture notes (course home page)

Matlab code (course home page)


Contents - References in the 4th edition 4

Week 1: Review of OSB (Hayes, or chap. 2), The method of steepest descent (chap. 4)

Week 2: The LMS algorithm (chap. 5)

Week 3: Modified LMS algorithms (chap. 6)

Week 4: Frequency-domain adaptive filters (chap. 7)

Week 5: The RLS algorithm (chap. 8–9)

Week 6: Tracking and implementation aspects (chap. 13.2, 14.1)

Week 7: Summary


Contents - References in the 3rd edition 5

Week 1: Review of OSB (Hayes, or chap. 5), The method of steepest descent (chap. 8)

Week 2: The LMS algorithm (chap. 9)

Week 3: Modified LMS algorithms (chap. 9)

Week 4: Frequency-domain adaptive filters (chap. 10, 1)

Week 5: The RLS algorithm (chap. 11)

Week 6: Tracking and implementation aspects (chap. 16.1, 17.2)

Week 7: Summary


Lecture 1 6

This lecture deals with

• Repetition of the course Optimal signal processing (OSB)

• The method of steepest descent


Recap of Optimal signal processing (OSB) 7

The following problems were treated in OSB

• Signal modeling: either a model with both poles and zeros, or a model with only poles (vocal tract) or only zeros (lips).

• Inverse filter of FIR type: deconvolution or equalization of a channel.

• Wiener filter: filtering, equalization, prediction and deconvolution.


Optimal Linear Filtering 8

[Block diagram: the input signal u(n) passes through the filter w, giving the output signal y(n); y(n) is subtracted from the desired signal d(n) to form the estimation error e(n) = d(n) − y(n).]

The filter w = [w0 w1 w2 ...]^T that minimizes the estimation error e(n), such that the output signal y(n) resembles the desired signal d(n) as closely as possible, is sought.


Optimal Linear Filtering 9

In order to determine the optimal filter, a cost function J, which penalizes the deviation e(n), is introduced. The larger e(n) is, the higher the cost.

From OSB you know several different strategies, e.g.,

• The total squared error (LS): deterministic description of the signal.

J = Σ e²(n), summed over n = n1, ..., n2

• The mean squared error (MS): stochastic description of the signal.

J = E{|e(n)|²}

• The mean squared error with an extra constraint:

J = E{|e(n)|²} + λ|u(n)|²


Optimal Linear Filtering 10

The cost function J(n) = E{|e(n)|^p} can be used for any p ≥ 1, but most often p = 2 is chosen. This choice gives a convex cost function, which is referred to as the Mean Squared Error.

J = E{e(n)e*(n)} = E{|e(n)|²}                (MSE)


Optimal Linear Filtering 11

In order to find the optimal filter coefficients, J is minimized with respect to them. This is done by differentiating J with respect to w0, w1, ..., and then setting the derivatives to zero. Here, it is important that the cost function is convex, i.e., that there is a global minimum.

The minimization is expressed in terms of the gradient operator ∇,

∇J = 0

where ∇J is called the gradient vector.

In particular, the choice of the squared cost function (the Mean Squared Error) leads to the Wiener-Hopf system of equations.


Optimal Linear Filtering 12

In matrix form, the cost function J = E{|e(n)|²} can be written

J(w) = E{[d(n) − w^H u(n)][d(n) − w^H u(n)]*}
     = σd² − w^H p − p^H w + w^H R w

where

w = [w0 w1 ... wM−1]^T                             (M × 1)

u(n) = [u(n) u(n−1) ... u(n−M+1)]^T                (M × 1)

R = E{u(n) u^H(n)} =
    [ r(0)      r(1)      ...  r(M−1)
      r*(1)     r(0)      ...  r(M−2)
      ...       ...       ...  ...
      r*(M−1)   r*(M−2)   ...  r(0)   ]            (M × M)

p = E{u(n) d*(n)} = [p(0) p(−1) ... p(−(M−1))]^T   (M × 1)

σd² = E{d(n) d*(n)}


Optimal Linear Filtering 13

The gradient operator yields

∇J(w) = 2 ∂J(w)/∂w* = 2 ∂/∂w* (σd² − w^H p − p^H w + w^H R w)
      = −2p + 2Rw

If the gradient vector is set to zero, the Wiener-Hopf system of equations results,

R wo = p                (Wiener-Hopf)

whose solution is the Wiener filter,

wo = R⁻¹ p              (Wiener filter)

In other words, the Wiener filter is optimal when the cost function is the MSE.
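As a small numerical check, the example values that appear on the error-performance surface slide later in this lecture can be plugged in directly. A minimal Matlab sketch, assuming those values:

```matlab
% Minimal sketch, assuming the two-coefficient example values used later
% in the lecture (R, p and sigma_d^2 as on that slide).
R   = [1.1 0.5; 0.5 1.1];     % correlation matrix of u(n)
p   = [0.5272; -0.4458];      % cross-correlation between u(n) and d(n)
s2d = 0.9486;                 % variance of the desired signal d(n)
wo   = R \ p                  % Wiener filter, approx [0.8360; -0.7853]
Jmin = s2d - p'*wo            % minimum cost, approx 0.158
```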


Optimal Linear Filtering 14

The cost function's dependence on the filter coefficients w can be made clear if it is written in canonical form:

J(w) = σd² − w^H p − p^H w + w^H R w
     = σd² − p^H R⁻¹ p + (w − wo)^H R (w − wo)

Here, the Wiener-Hopf equations and the expression for the Wiener filter have been used, in addition to the fact that the following decomposition can be made:

w^H R w = (w − wo)^H R (w − wo) − wo^H R wo + wo^H R w + w^H R wo

With the optimal filter w = wo, the minimal error Jmin is achieved:

Jmin ≡ J(wo) = σd² − p^H R⁻¹ p                (MMSE)


Optimal Linear Filtering 15

Error-performance surface for an FIR filter with two coefficients, w = [w0, w1]^T.

[Figure: surface plot of J(w) and the corresponding contour plot over (w0, w1), with the minimum at wo.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], σd² = 0.9486,
wo = [0.8360 −0.7853]^T, Jmin = 0.1579
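The surface can be reproduced with a short Matlab sketch; the statistics are those given above, while the plotting range is an assumption:

```matlab
% Sketch of the error-performance surface for the two-coefficient example
% (values from the slide; the plotting range is chosen arbitrarily).
R   = [1.1 0.5; 0.5 1.1];
p   = [0.5272; -0.4458];
s2d = 0.9486;
[W0, W1] = meshgrid(-2:0.05:2, -2:0.05:2);
J = zeros(size(W0));
for k = 1:numel(W0)
    w    = [W0(k); W1(k)];
    J(k) = s2d - 2*(w'*p) + w'*R*w;   % quadratic cost for real-valued data
end
figure; surf(W0, W1, J); xlabel('w_0'); ylabel('w_1'); zlabel('J(w)');
figure; contour(W0, W1, J, 30); xlabel('w_0'); ylabel('w_1');
```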


Steepest Descent 16

The method of steepest descent is a recursive method for finding the Wiener filter when the statistics of the signals are known.

The method of steepest descent is not an adaptive filter, but it serves as a basis for the LMS algorithm, which is presented in Lecture 2.


Steepest Descent 17

The method of steepest descent is a recursive method that iterates towards the solution of the Wiener-Hopf equations. The statistics (R, p) are assumed to be known. The purpose is to avoid inverting R (this saves computations).

• Set start values for the filter coefficients, w(0) (n = 0).

• Determine the gradient ∇J(n), which points in the direction in which the cost function increases the most: ∇J(n) = −2p + 2Rw(n).

• Adjust w(n+1) in the direction opposite to the gradient, but weight the adjustment with the step-size parameter µ:

w(n+1) = w(n) + (µ/2)[−∇J(n)]

• Repeat steps 2 and 3 (a sketch of the recursion follows below).
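A minimal Matlab sketch of this recursion, assuming the statistics of the running example and a freely chosen step size:

```matlab
% Minimal sketch of the steepest-descent recursion (statistics assumed known;
% values taken from the lecture's running example, step size chosen freely).
R  = [1.1 0.5; 0.5 1.1];
p  = [0.5272; -0.4458];
mu = 0.1;                         % step size, must satisfy 0 < mu < 2/max(eig(R))
w  = [0; 0];                      % start values w(0)
for n = 0:99
    gradJ = -2*p + 2*R*w;         % gradient of the cost function at w(n)
    w     = w + 0.5*mu*(-gradJ);  % w(n+1) = w(n) + (mu/2)*[-grad J(n)]
end
w                                 % approaches wo = R\p = [0.8360; -0.7853]
```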


Convergence, filter coefficients 18

Since the method of steepest descent contains feedback, there is a risk that the algorithm diverges. This limits the choice of the step-size parameter µ. An example of the critical choice of µ is given below. The statistics are the same as in the previous example.

[Figure: the filter coefficients w(n) versus iteration n (0 to 100) for µ = 0.1, 1.0, 1.25 and 1.5, converging towards wo0 and wo1 for the smaller step sizes.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], wo = [0.8360 −0.7853]^T, w(0) = [0 0]^T
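The figure can be approximately recreated with the short Matlab sketch below (same assumed example values). Since the eigenvalues of R are 0.6 and 1.6, the run with µ = 1.5 exceeds the stability bound 2/λmax = 1.25 and diverges:

```matlab
% Sketch: filter-coefficient trajectories for different step sizes
% (example values from the slide; number of iterations chosen as 100).
R = [1.1 0.5; 0.5 1.1];  p = [0.5272; -0.4458];
mus = [0.1 1.0 1.25 1.5];
nIter = 100;
figure; hold on;
for m = 1:length(mus)
    w = zeros(2, nIter+1);               % w(:,1) = w(0) = [0; 0]
    for n = 1:nIter
        w(:,n+1) = w(:,n) + mus(m)*(p - R*w(:,n));   % steepest-descent update
    end
    plot(0:nIter, w.');                  % both coefficients versus iteration n
end
xlabel('Iteration n'); ylabel('w(n)');
```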


Convergence, error surface 19

The influence of the step-size parameter on the convergence can be seen when analyzing J(w). The example below illustrates the convergence towards Jmin for different choices of µ.

[Figure: contour plot of J(w) over (w0, w1) with the steepest-descent trajectories from w(0) towards wo for µ = 0.1, 0.5 and 1.0.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], wo = [0.8360 −0.7853]^T, w(0) = [1 1.7]^T


Convergence analysis 20

How should µ be chosen? A small value gives slow convergence, while a large value risks divergence.

Perform an eigenvalue decomposition of R in the expression for J(w(n)):

J(n) = Jmin + (w(n) − wo)^H R (w(n) − wo)
     = Jmin + (w(n) − wo)^H Q Λ Q^H (w(n) − wo)
     = Jmin + ν^H(n) Λ ν(n) = Jmin + Σk λk |νk(n)|²

The convergence of the cost function thus depends on ν(n), i.e., on the convergence of w(n), through the relationship ν(n) = Q^H (w(n) − wo).
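As a small numerical check with the R of the running example (whose eigenvalues are 0.6 and 1.6), the decomposition can be verified in Matlab:

```matlab
% Small check of the eigenvalue decomposition R = Q*Lambda*Q^H
% (R from the lecture's running example; its eigenvalues are 0.6 and 1.6).
R = [1.1 0.5; 0.5 1.1];
[Q, Lambda] = eig(R);
norm(Q*Lambda*Q' - R)     % essentially zero
```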


Convergence analysis 21

With the observation that w(n) = Q ν(n) + wo, the update of the cost function can be derived:

w(n+1) = w(n) + µ[p − R w(n)]

Q ν(n+1) + wo = Q ν(n) + wo + µ[p − R Q ν(n) − R wo]

ν(n+1) = ν(n) − µ Q^H R Q ν(n) = (I − µΛ) ν(n)

νk(n+1) = (1 − µλk) νk(n)                (element k of ν(n))

The latter is a first-order difference equation, with the solution

νk(n) = (1 − µλk)^n νk(0)

For this equation to converge it is required that |1 − µλk| < 1, which leads to the stability criterion of the method of steepest descent:

0 < µ < 2/λmax                (Stability, S.D.)
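For the running example this bound is easy to evaluate (Matlab sketch, R as before):

```matlab
% Stability bound for the steepest-descent step size, 0 < mu < 2/lambda_max
% (R from the running example).
R      = [1.1 0.5; 0.5 1.1];
lam    = eig(R);               % eigenvalues 0.6 and 1.6
mu_max = 2/max(lam)            % = 1.25, consistent with the divergence at mu = 1.5
```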


Convergence, time constants 22

The time constants indicate how many iterations it takes until the respective error has decreased by a factor e⁻¹, where e denotes the base of the natural logarithm. The smaller the time constant, the better.

The time constant τk for eigenmode k (eigenvalue λk) is

τk = −1 / ln(1 − µλk) ≈ 1/(µλk)  for µ << 1                (Time constant τk)

If the convergence of the whole coefficient vector w(n) is considered, the speed of convergence is limited by the largest and the smallest eigenvalues of R, λmax and λmin. This time constant is denoted τa:

−1 / ln(1 − µλmax) ≤ τa ≤ −1 / ln(1 − µλmin)                (Time constant τa)
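A minimal sketch evaluating the time constants for the running example, assuming µ = 0.1 as in the figures:

```matlab
% Time constants of the steepest-descent eigenmodes for the running example
% (mu = 0.1 assumed, as in the convergence figures).
R   = [1.1 0.5; 0.5 1.1];
mu  = 0.1;
lam = eig(R);                       % eigenvalues 0.6 and 1.6
tau = -1 ./ log(1 - mu*lam)         % exact time constants per eigenmode
tau_approx = 1 ./ (mu*lam)          % approximation for mu << 1
```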


Learning Curve 23

[Figure: learning curves, i.e., the cost function J(n) versus iteration n (0 to 60), for µ = 0.1, 0.5 and 1.0, all decaying towards Jmin.]

p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], wo = [0.8360 −0.7853]^T, w(0) = [0 0]^T

How fast an adaptive filter converges is usually shown in a learning curve, which is a plot of J(n) as a function of the iteration n.

For SD, J(n) approaches Jmin. Since SD is deterministic, it is misleading to talk about learning curves in this case.
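The curves in the figure can be generated with a short Matlab sketch (same assumed example values), using J(n) = Jmin + (w(n) − wo)^H R (w(n) − wo):

```matlab
% Sketch: learning curves J(n) for the steepest-descent example
% (values from the slide; iteration count chosen as 60).
R   = [1.1 0.5; 0.5 1.1];  p = [0.5272; -0.4458];  s2d = 0.9486;
wo  = R\p;  Jmin = s2d - p'*wo;
nIter = 60;  mus = [0.1 0.5 1.0];
Jn = zeros(length(mus), nIter+1);
for m = 1:length(mus)
    w = [0; 0];
    for n = 0:nIter
        Jn(m, n+1) = Jmin + (w - wo)'*R*(w - wo);   % cost at iteration n
        w = w + mus(m)*(p - R*w);                   % steepest-descent update
    end
end
plot(0:nIter, Jn.'); xlabel('Iteration n'); ylabel('J(n)');
```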


Summary Lecture 1 24

Review of OSB

• Quadratic cost function, J = E{|e(n)|²}

• Definition of u(n), w, R and p

• The correlations R and p are assumed to be known in advance.

• The gradient vector ∇J

• The optimal filter coefficients are given by wo = R⁻¹p

• The optimal (minimal) cost Jmin ≠ 0


Summary Lecture 1 25

Summary of the method of steepest descent

• Recursive solution to the Wiener filter

• The statistics (R and p) are assumed to be known

• The gradient vector ∇J(n) is time-dependent but deterministic, and points in the direction in which the cost function increases the most.

• Recursion of the filter weights: w(n+1) = w(n) + (µ/2)[−∇J(n)]

• The cost function J(n) → Jmin as n → ∞, i.e., w(n) → wo as n → ∞

• For convergence it is required that the step size satisfies 0 < µ < 2/λmax

• The speed of convergence is determined by µ and the eigenvalues of R:

−1 / ln(1 − µλmax) ≤ τa ≤ −1 / ln(1 − µλmin)


To read 26

• Review of OSB: Hayes, or Haykin Chapter 2.

• Background on adaptive filters: Haykin, Background and Preview.

• Steepest descent: Haykin Chapter 4.

Exercises: 2.1, 2.2, 2.5, 3.1, 3.3, 3.5, 3.7, (3.2, 3.4, 3.6)

Computer exercise, theme: Implementation of the method of steepest descent.


Examples of adaptive systems 27

Here follows a number of examples of applications that will be discussed during the course. The material here is far from exhaustive; further examples can be found in Haykin, Background and Preview.

The idea with these examples is that you should already now start thinking about how adaptive filters can be used in different contexts.

Formulations of the type "use a filter of the same order as the system" deserve a clarification. Normally, the order of the system is not known, so several alternatives have to be investigated. If the order is increased successively, one will, where applicable, notice that beyond a certain length, further increases of the filter length give no further improvement.


Example: Inverse modelling 28

[Block diagram: the exciting signal d(n) drives the investigated system, whose output u(n) is fed to an adaptive FIR filter; the filter output d̂(n) is compared with d(n) delayed by Δ samples (z^−Δ), and the error e(n) drives the adaptive algorithm.]

In inverse modelling, the adaptive filter is connected in cascade with the investigated system. If the system has only poles, an adaptive filter of the corresponding order can be used (see the sketch below).
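A minimal sketch of this setup under simplifying assumptions: an invented first-order all-pole system, a direct least-squares/Wiener estimate instead of an adaptive algorithm, and no extra delay (Δ = 0):

```matlab
% Hypothetical sketch of inverse modelling: an all-pole system is inverted
% by an FIR filter of the corresponding order (system and lengths assumed).
N = 20000;
d = sign(randn(N,1));                   % exciting signal d(n)
u = filter(1, [1 -0.7], d);             % investigated system 1/(1 - 0.7 z^-1)
M = 2;                                  % FIR inverse of matching order
U = toeplitz(u, [u(1) zeros(1,M-1)]);   % rows: [u(n) u(n-1)]
w = (U'*U) \ (U'*d)                     % LS/Wiener estimate, close to [1; -0.7]
```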


Example: Modelling/Identification 29

[Block diagram: the input u(n) drives both the investigated system, giving d(n), and the adaptive FIR filter, giving d̂(n); the error e(n) = d(n) − d̂(n) drives the adaptive algorithm, i.e., the filter is connected in parallel with the system.]

In modelling/identification, the adaptive filter is connected in parallel with the investigated system. If the investigated system only has zeros, it is appropriate to use a corresponding length for the filter. If the system has both poles and zeros, a long adaptive FIR filter is generally required (a sketch follows below).
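A minimal sketch, again with an invented system and a Wiener/least-squares estimate in place of the adaptive algorithm; all names and lengths below are assumptions:

```matlab
% Hypothetical sketch of identification: the correlations R and p are
% estimated from data and the Wiener solution recovers the unknown system.
N = 10000;
h = [0.8 -0.3 0.2];                     % unknown system (zeros only)
u = randn(N,1);                         % input signal
d = filter(h, 1, u) + 0.01*randn(N,1);  % desired signal: system output plus a little noise
M = 3;                                  % adaptive FIR filter length (matches the system here)
U = toeplitz(u, [u(1) zeros(1,M-1)]);   % rows: [u(n) u(n-1) u(n-2)]
R = (U'*U)/N;                           % estimated correlation matrix
p = (U'*d)/N;                           % estimated cross-correlation vector
w = R\p                                 % close to h(:) = [0.8; -0.3; 0.2]
```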


Example: Echo canceller I 30

[Block diagram: Speaker 1's signal u(n) is fed both to the hybrid and to an adaptive FIR filter; the signal on the return path from the hybrid is d(n), the filter output is d̂(n), and the error e(n) = d(n) − d̂(n) drives the adaptive algorithm. Speaker 2 is connected on the other side of the hybrid.]

In telephony, the speech from Speaker 1 leaks through at the hybrid. Speaker 1 will then hear his or her own voice as an echo. This is what one wants to remove. Normally, the adaptation is stopped while Speaker 2 is talking. The filtering, however, runs all the time.


Example: Echo canceller II 31

[Block diagram: as above, but the echo path is acoustic; the signal u(n) from Speaker 1 is played through a loudspeaker, the echo-path impulse response together with Speaker 2's speech gives the microphone signal d(n), the adaptive FIR filter produces the echo estimate d̂(n), and e(n) = d(n) − d̂(n) drives the adaptive algorithm.]

This structure is applicable to speakerphones, video-conferencing systems and the like. Just as in the telephony case, Speaker 1 hears himself or herself as an echo. However, this effect is more noticeable here, since the microphone picks up what is emitted by the loudspeaker. The adaptation is normally stopped when Speaker 2 is talking, but the filtering runs all the time.


Example: Adaptive Line Enhancer 32

[Block diagram: the periodic signal s(n) plus the coloured disturbance v(n) form d(n); a version delayed by Δ samples, u(n) = d(n − Δ), is fed to the adaptive FIR filter, whose output d̂(n) is subtracted from d(n) to form e(n), which drives the adaptive algorithm.]

Information in the form of a periodic signal is disturbed by coloured noise, which is correlated with itself within a certain time frame. By delaying the signal enough that the noise (in u(n) and d(n), respectively) becomes uncorrelated, the noise can be suppressed through linear prediction.


Example: Channel equalizer 33

[Block diagram: a pseudo-noise sequence p(n) is sent through the channel C(z) and noise v(n) is added, giving u(n), which is fed to the adaptive FIR filter; the filter output d̂(n) is compared with p(n) delayed by Δ samples, which acts as the desired signal d(n) while the training switch is closed, and the error e(n) drives the adaptive algorithm.]

A known pseudo-noise sequence is used to estimate an inverse model of the channel ("training"). Thereafter, the adaptation is stopped, but the filter continues to operate on the transmitted signal. The purpose is to remove the channel's influence on the transmitted signal.
