6. Steady-State Performance of Adaptive Filters

The performance of stochastic gradient adaptive algorithms involves inherent approximations. As a result, the implementations are subject to "gradient noise". We wish to compare various algorithms to each other, so a framework under which comparisons can be made must be developed. The framework may at first appear obtuse (not in the direction expected), but it avoids approximations in order to provide a rigorous structure. In comparing algorithms, the specific algorithm or cost function is applied to the framework, at which time simplifying assumptions are applied in order to compare algorithms. Think of it as going as far as possible on solid ground and then taking a few steps off, versus hand-waving and hoping the entire time. Although the path is longer, the concepts are well founded mathematically.

The following arguments are based on two foundations:

1) the Energy-Conservation Relation, and
2) the Variance Relation.

Every adaptive algorithm can have these two relations derived through fundamental mathematics for stationary statistical/probabilistic cases without significant approximations or assumptions. (Yes, assuming a stationary model may itself be considered significant.) From these relations we then make minimal assumptions to derive the desired performance or properties. The pain is going through the math to develop the relations …

Most important sections (from preamble): 6.3 and 6.5, Chapter 15 in the on-line textbook. Applications to algorithms and steady-state performance: Chapters 16-19.


6.1 Performance Measure

Stochastic gradient techniques establish iterative procedures that arrive at an optimal solution:

$$ w_i = w_{i-1} + \mu \left( R_{du} - R_{uu}\, w_{i-1} \right) $$

with the covariance matrices

$$ R_{uu} = E\left[ u_i^H u_i \right] \qquad\text{and}\qquad R_{du} = E\left[ u_i^H d(i) \right] $$

and the convergence condition required for the step size

$$ 0 < \mu < \frac{2}{\lambda_{\max}} $$

(λ_max the largest eigenvalue of R_uu). Arrival at the optimal weight:

$$ w_{opt} = R_{uu}^{-1} R_{du} $$

The iterative cost function

$$ J(i) = E\left[ \left( d(i) - u_i w_{i-1} \right)\left( d(i) - u_i w_{i-1} \right)^H \right] $$

$$ J(i) = \sigma_d^2 - R_{du}^H w_{i-1} - w_{i-1}^H R_{du} + w_{i-1}^H R_{uu} w_{i-1} $$

which results in

$$ J_{\min} = E\left[ \left( d(i) - u_i w_{opt} \right)\left( d(i) - u_i w_{opt} \right)^H \right] = \sigma_d^2 - R_{du}^H R_{uu}^{-1} R_{du} $$

The LMS technique (and other adaptive techniques in general) uses instantaneous approximations of the covariance matrices

$$ \hat{R}_{du} = u_i^H\, d(i) \qquad\text{and}\qquad \hat{R}_{uu} = u_i^H u_i $$

to derive the weight iteration equation

$$ w_i = w_{i-1} + \mu\, u_i^H \left( d(i) - u_i w_{i-1} \right) $$

The behavior of the LMS weight estimate is much more complex than that of the stochastic gradient technique, and it need not converge. The convergence error is defined by the a-priori output estimation error

$$ e(i) = d(i) - u_i w_{i-1} $$

which directly relates to a cost function:

$$ |e(i)|^2 = \left( d(i) - u_i w_{i-1} \right)\left( d(i) - u_i w_{i-1} \right)^H $$

When the adaptation does not converge to a solution, the effect is referred to as "gradient noise", because the adaptation step formed does not step in the direction of the true gradient. These are, in fact, steps where the instantaneous approximation of the covariance matrices has failed. If "future steps" are sufficiently correct the algorithm recovers; if not, errors continue and the algorithm fails to converge.
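As a concrete reference for the iteration above, here is a minimal LMS sketch in Python/NumPy; the function name and the assumption of recorded real-valued data arrays are illustrative, not from the notes:

    import numpy as np

    def lms(d, U, mu):
        """LMS over recorded data: d[i] is the scalar reference, U[i] is the regressor row u_i."""
        N, M = U.shape
        w = np.zeros(M)                      # w_{-1} = 0
        e = np.zeros(N)
        for i in range(N):
            e[i] = d[i] - U[i] @ w           # a-priori output error e(i) = d(i) - u_i w_{i-1}
            w = w + mu * U[i] * e[i]         # w_i = w_{i-1} + mu u_i^H e(i)  (real data assumed)
        return w, e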


Primary performance measures:

Cost function:

$$ J(i) = E\left[ \left( d(i) - u_i w_{i-1} \right)\left( d(i) - u_i w_{i-1} \right)^H \right] $$

with the stochastic-gradient approachable minimum

$$ J_{\min} = \sigma_d^2 - R_{du}^H R_{uu}^{-1} R_{du} $$

Squared a-priori error function (notice the similarity to the iterative cost function):

$$ e(i) = d(i) - u_i w_{i-1} $$

$$ |e(i)|^2 = \left( d(i) - u_i w_{i-1} \right)\left( d(i) - u_i w_{i-1} \right)^H $$

Squared a-posteriori error function:

$$ r(i) = d(i) - u_i w_i $$

$$ |r(i)|^2 = \left( d(i) - u_i w_i \right)\left( d(i) - u_i w_i \right)^H $$

Stochastic Equations

It is useful to treat the adaptive update equations as stochastic difference equations rather than as deterministic difference equations. The distinction: in the adaptive recursions the samples approximate the probability/statistics, but expected values are never taken; further, the values d, u, and w have not been assumed to be random variables. Now, as a form of analysis, we can take the variables d and u as random variables (and w as a sum of random variables) and compute the expected values and covariance matrices. Note: the book is careful to change the notation from non-bold samples to bold random variables. Look close or you will miss it. Bad news: my notes will not be doing this. On with the computations ….


Excess Mean-Square Error and Misadjustment

The steady-state mean-square error (MSE) criterion is defined as

$$ \text{MSE} = \lim_{i\to\infty} E\,|e(i)|^2 $$

where we are again using the a-priori output estimation error, but now as a random-variable model:

$$ e(i) = d(i) - u_i w_{i-1} $$

The excess mean-square error (EMSE) is defined as

$$ \text{EMSE} = \text{MSE} - J_{\min} = \text{MSE} - \left( \sigma_d^2 - R_{du}^H R_{uu}^{-1} R_{du} \right) $$

where J_min may be computed from the stochastic cost minimum. The adaptive filter misadjustment is defined as

$$ M = \frac{\text{EMSE}}{J_{\min}} = \frac{\text{MSE}}{J_{\min}} - 1 $$

Note: since

$$ \text{MSE} = \sigma_d^2 - R_{du}^H R_{uu}^{-1} R_{du} + \text{EMSE} = J_{\min} + \text{EMSE} $$

and the EMSE is non-negative, M ≥ 0. Notice that the misadjustment describes how close the algorithm gets to achieving the minimum cost function.
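As a quick numeric illustration (values chosen for illustration, not from the notes): if $J_{\min} = 0.01$ and the filter settles at $\text{MSE} = 0.011$, then $\text{EMSE} = 0.011 - 0.01 = 0.001$ and $M = 0.001/0.01 = 0.1$, i.e., a 10% misadjustment.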


6.2 Stationary Data Model

For a stochastic process with a known optimal solution, we can use the orthogonality property of linear least-mean-squares estimators (Theorem 2.4.1), which states

$$ E\left[ u_i^H \left( d(i) - u_i w^o \right) \right] = 0 $$

If we define an estimation error

$$ v(i) = d(i) - u_i w^o $$

then

$$ E\left[ u_i^H v(i) \right] = 0 $$

The result can also be stated as

$$ d(i) = u_i w^o + v(i) $$

where v(i) is uncorrelated with u_i. The variance of v(i) is then

$$ \sigma_v^2 = E\,|v(i)|^2 = E\left[ \left( d(i) - u_i w^o \right)\left( d(i) - u_i w^o \right)^H \right] = J_{\min} $$

Linear Regression Model

Given any random variables { d(i), u_i } with second-order moments { R_uu, R_du, σ_d² }, we can always assume that the random variables are related via a model of the form

$$ d(i) = u_i w^o + v(i) $$

We introduce a stronger assumption: the sequence v(i) is i.i.d. and independent of all u_j. That is, we assume the data { d(i), u_i } satisfy the following conditions:

(a) There exists a vector w^o such that $d(i) = u_i w^o + v(i)$.

(b) The noise sequence { v(i) } is i.i.d. with variance $\sigma_v^2 = E\,|v(i)|^2$.

(c) The sequence v(i) is independent of u_j for all i and j.

(d) The initial condition w_{-1} is independent of all { d(j), u_j, v(j) }.

(e) The regressor covariance matrix is $R_{uu} = E\left[ u_i^H u_i \right] > 0$.

(f) The random variables { d(i), u_i, v(i) } have zero mean.

This is a stationary environment (for the moment) in which to develop results and relationships.
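A minimal sketch of generating data that satisfies conditions (a)-(f), assuming a real white-Gaussian regressor and noise model chosen purely for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    M, N = 8, 50_000
    w_o = rng.standard_normal(M)             # the model vector w^o (assumed)
    sigma_v = 0.1

    U = rng.standard_normal((N, M))          # i.i.d. zero-mean regressors u_i (rows)
    v = sigma_v * rng.standard_normal(N)     # i.i.d. noise, independent of every u_j
    d = U @ w_o + v                          # d(i) = u_i w^o + v(i)

    # Bookkeeping check: J_min should equal sigma_v^2
    Ruu = U.T @ U / N                        # sample estimate of R_uu
    Rdu = U.T @ d / N                        # sample estimate of R_du
    J_min = np.mean(d**2) - Rdu @ np.linalg.solve(Ruu, Rdu)
    print(J_min, sigma_v**2)                 # both ~0.01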


Useful Independence Results

For the LMS algorithm

$$ w_i = w_{i-1} + \mu\, u_i^H \left( d(i) - u_i w_{i-1} \right) $$

the weight estimates can be expressed in terms of the reference signals and the regressors:

$$ w_j = F\left( w_{-1};\; d(0), \ldots, d(j);\; u_0, \ldots, u_j \right) $$

The estimator error v(i) is independent of each of the terms in the function for w_j when j < i. The weight error vector can be defined as

$$ \tilde{w}_j = w^o - w_j $$

so the estimator error v(i) is also independent of the weight error vector $\tilde{w}_j$ for j < i. The a-priori estimation error can be defined as

$$ e_a(i) = u_i \tilde{w}_{i-1} = u_i \left( w^o - w_{i-1} \right) $$

and, since $e_a(i)$ depends only on $u_i$ and on data up to time i-1, the estimator error v(i) is also independent of the a-priori estimation error.

Deriving an Alternate Expression for EMSE

From the previous definitions

$$ \text{MSE} = \lim_{i\to\infty} E\,|e(i)|^2, \qquad \text{EMSE} = \text{MSE} - J_{\min} $$

and now

$$ \sigma_v^2 = E\,|v(i)|^2 = E\left[ \left( d(i) - u_i w^o \right)\left( d(i) - u_i w^o \right)^H \right] = J_{\min} $$

Using the a-priori error and v(i),

$$ e(i) = d(i) - u_i w_{i-1}, \qquad v(i) = d(i) - u_i w^o $$

and substituting $d(i) = u_i w^o + v(i)$, we have

$$ e(i) = v(i) + u_i w^o - u_i w_{i-1} $$

$$ e(i) = v(i) + u_i \tilde{w}_{i-1} $$

$$ e(i) = v(i) + e_a(i) $$

the error function in terms of the estimator error and the a-priori error.


With the assumed independence, forming the expected value of the squared magnitude:

$$ E\,|e(i)|^2 = E\left[ \left( v(i) + e_a(i) \right)\left( v(i) + e_a(i) \right)^H \right] $$

$$ E\,|e(i)|^2 = E\,|v(i)|^2 + E\left[ v(i)\, e_a^H(i) \right] + E\left[ e_a(i)\, v^H(i) \right] + E\,|e_a(i)|^2 $$

$$ E\,|e(i)|^2 = E\,|v(i)|^2 + E\,|e_a(i)|^2 $$

$$ E\,|e(i)|^2 = \sigma_v^2 + E\,|e_a(i)|^2 $$

Defining the MSE and EMSE with the alternate terms:

$$ \text{MSE} = \lim_{i\to\infty} E\,|e(i)|^2 = \sigma_v^2 + \lim_{i\to\infty} E\,|e_a(i)|^2 $$

and

$$ \text{EMSE} = \text{MSE} - J_{\min} = \sigma_v^2 + \lim_{i\to\infty} E\,|e_a(i)|^2 - J_{\min} $$

Since

$$ \sigma_v^2 = E\,|v(i)|^2 = J_{\min} $$

an alternate definition for the EMSE is related to the a-priori error as

$$ \text{EMSE} = \lim_{i\to\infty} E\,|e_a(i)|^2 = \lim_{i\to\infty} E\,\left| u_i \tilde{w}_{i-1} \right|^2 $$

If the EMSE is known, the MSE is found by

$$ \text{MSE} = \text{EMSE} + J_{\min} = \text{EMSE} + \sigma_v^2 $$

and the adaptive filter misadjustment is defined as

$$ M = \frac{\text{EMSE}}{\sigma_v^2} = \frac{\text{MSE}}{\sigma_v^2} - 1 $$


Error Quantities Recap

A-priori output estimation error:

$$ e(i) = d(i) - u_i w_{i-1} $$

A-posteriori output estimation error:

$$ r(i) = d(i) - u_i w_i $$

Weight error vector:

$$ \tilde{w}_j = w^o - w_j $$

A-priori estimation error:

$$ e_a(i) = u_i \tilde{w}_{i-1} = u_i \left( w^o - w_{i-1} \right) $$

A-posteriori estimation error:

$$ e_p(i) = u_i \tilde{w}_i = u_i \left( w^o - w_i \right) $$

Now we are ready for the Energy-Conservation Relation


6.3 Fundamental Energy-Conservation Relation

The energy terms of interest are

$$ \|\tilde{w}_i\|^2, \qquad |e_a(i)|^2, \qquad |e_p(i)|^2 $$

The generic form for an adaptive update can be defined as

$$ w_i = w_{i-1} + \mu\, u_i^H\, g[e(i)] $$

where g[e(i)] denotes a function based on the a-priori output estimation error. It is convenient to generalize the solution in this way to cover the widest range of algorithms. A list of the g[·] functions for LMS, ε-NLMS, and other algorithms is found in Table 6.2.
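A sketch of the generic update with a pluggable g[·], shown for the two choices used later in these notes (LMS: g[e] = e; ε-NLMS: g[e] = e/(ε + ‖u‖²)); the other rows of Table 6.2 would slot in the same way (real data assumed for simplicity):

    import numpy as np

    def adapt(d, U, mu, g):
        """Generic update w_i = w_{i-1} + mu u_i^H g[e(i)]."""
        N, M = U.shape
        w = np.zeros(M)
        for i in range(N):
            e = d[i] - U[i] @ w              # a-priori output error
            w = w + mu * U[i] * g(e, U[i])   # generic adaptation step
        return w

    g_lms  = lambda e, u: e                              # LMS
    g_nlms = lambda e, u, eps=1e-3: e / (eps + u @ u)    # epsilon-NLMS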

Continuing ….


Defining the update in terms of the optimal weight, or the distance from the optimal weight:

$$ w^o - w_i = w^o - w_{i-1} - \mu\, u_i^H\, g[e(i)] $$

The iterative weight error from the optimum is then described as

$$ \tilde{w}_i = \tilde{w}_{i-1} - \mu\, u_i^H\, g[e(i)] $$

where the tilde over the w denotes the weight error:

$$ \tilde{w}_i = w^o - w_i \qquad\text{and}\qquad \tilde{w}_{i-1} = w^o - w_{i-1} $$

The a-priori estimation error is described in terms of the weight error as

$$ e_a(i) = u_i \tilde{w}_{i-1} = u_i \left( w^o - w_{i-1} \right) $$

and the a-posteriori estimation error can be defined in terms of the a-priori estimation error and the function g[·]:

$$ e_p(i) = u_i \tilde{w}_i = u_i \left( w^o - w_i \right) $$

$$ e_p(i) = u_i \left( \tilde{w}_{i-1} - \mu\, u_i^H\, g[e(i)] \right) = e_a(i) - \mu\, u_i u_i^H\, g[e(i)] $$

which can be rewritten as

$$ e_p(i) = e_a(i) - \mu\, \|u_i\|^2\, g[e(i)] $$

In combination, these equations provide an alternate description of an adaptive filter:

$$ e_a(i) = u_i \tilde{w}_{i-1}, \qquad e(i) = e_a(i) + v(i), \qquad \tilde{w}_i = \tilde{w}_{i-1} - \mu\, u_i^H\, g[e(i)] $$

In studying the behavior:

1. Steady-state behavior: relating to the steady-state values of $E\,\|\tilde{w}_i\|^2$, $E\,|e_a(i)|^2$, and $E\,|e(i)|^2$.

2. Stability: based on the range of step sizes μ for which the variances $E\,\|\tilde{w}_i\|^2$ and $E\,|e_a(i)|^2$ remain bounded.

3. Transient behavior: the time evolution of the values $E\,|e_a(i)|^2$ and the weight terms $E\,\|\tilde{w}_i\|^2$ and $E\left[ \tilde{w}_i \right]$.


6.3.1 Algebraic Derivation of the Energy-Conservation Relation

We are going to "balance" the a-priori error and the updated weight error against the a-posteriori error and the previous weight error. Starting with the a-priori/a-posteriori error relationship, solve for g[·] as a function of the estimation errors:

$$ e_p(i) = e_a(i) - \mu\, \|u_i\|^2\, g[e(i)] $$

$$ \mu\, g[e(i)] = \frac{e_a(i) - e_p(i)}{\|u_i\|^2} $$

Second, substitute for g[·] in the weight-error update equation:

$$ \tilde{w}_i = \tilde{w}_{i-1} - \mu\, u_i^H\, g[e(i)] $$

$$ \tilde{w}_i = \tilde{w}_{i-1} - u_i^H\, \frac{e_a(i) - e_p(i)}{\|u_i\|^2} $$

Rewriting in "balanced" terms of "new" and "old" (weights and errors):

$$ \tilde{w}_i + \frac{u_i^H}{\|u_i\|^2}\, e_a(i) = \tilde{w}_{i-1} + \frac{u_i^H}{\|u_i\|^2}\, e_p(i) $$

Form the energy relationship for both sides of the equation … the "magnitude squared" of each side:

$$ \left\| \tilde{w}_i + \frac{u_i^H}{\|u_i\|^2}\, e_a(i) \right\|^2 = \left\| \tilde{w}_{i-1} + \frac{u_i^H}{\|u_i\|^2}\, e_p(i) \right\|^2 $$

where the two sides expand to

$$ \|\tilde{w}_i\|^2 + \frac{2\,\text{Re}\left[ e_a^*(i)\, u_i \tilde{w}_i \right]}{\|u_i\|^2} + \frac{|e_a(i)|^2}{\|u_i\|^2} = \|\tilde{w}_{i-1}\|^2 + \frac{2\,\text{Re}\left[ e_p^*(i)\, u_i \tilde{w}_{i-1} \right]}{\|u_i\|^2} + \frac{|e_p(i)|^2}{\|u_i\|^2} $$

Since $u_i \tilde{w}_i = e_p(i)$ and $u_i \tilde{w}_{i-1} = e_a(i)$, the cross terms on the two sides are both equal to $2\,\text{Re}\left[ e_a^*(i)\, e_p(i) \right] / \|u_i\|^2$ and cancel, and finally


$$ \|\tilde{w}_i\|^2 + \frac{1}{\|u_i\|^2}\, |e_a(i)|^2 = \|\tilde{w}_{i-1}\|^2 + \frac{1}{\|u_i\|^2}\, |e_p(i)|^2 $$

Defining an iterative stepping factor

$$ \bar{\mu}(i) = \begin{cases} 1/\|u_i\|^2, & \|u_i\|^2 \neq 0 \\ 0, & \|u_i\|^2 = 0 \end{cases} $$

The energy-conservation relation is defined as

$$ \|\tilde{w}_i\|^2 + \bar{\mu}(i)\, |e_a(i)|^2 = \|\tilde{w}_{i-1}\|^2 + \bar{\mu}(i)\, |e_p(i)|^2 $$

The energy of the updated weight error plus the scaled squared a-priori estimation error equals the energy of the previous weight error plus the scaled squared a-posteriori estimation error; therefore it is described as "energy conservation". Note: this is an exact relationship with no critical assumptions!

An alternate statement of the energy relation, based on the same observations, is formed by multiplying through by $\|u_i\|^2$ to become

$$ \|u_i\|^2\, \|\tilde{w}_i\|^2 + |e_a(i)|^2 = \|u_i\|^2\, \|\tilde{w}_{i-1}\|^2 + |e_p(i)|^2 $$

From the textbooks: “The important fact to emphasize here is that no approximations have been used to establish the energy relation (6.3.10 or 15.32); it is an exact relation that shows how the energies of the weight-error vectors at two successive time instants are related to the energies of the a-priori and a-posteriori estimation errors.” p. 289.
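Because the relation is exact per iteration, it can be checked to machine precision in simulation. A minimal sketch for LMS with real data (all signal parameters illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    M, N, mu = 4, 1000, 0.05
    w_o = rng.standard_normal(M)
    w = np.zeros(M)
    for i in range(N):
        u = rng.standard_normal(M)                     # regressor u_i
        d = u @ w_o + 0.1 * rng.standard_normal()
        e = d - u @ w                                  # e(i)
        ea = u @ (w_o - w)                             # a-priori error e_a(i)
        w_new = w + mu * u * e                         # LMS: g[e] = e
        ep = u @ (w_o - w_new)                         # a-posteriori error e_p(i)
        mub = 1.0 / (u @ u)                            # iterative stepping factor
        lhs = (w_o - w_new) @ (w_o - w_new) + mub * ea**2
        rhs = (w_o - w) @ (w_o - w) + mub * ep**2
        assert np.isclose(lhs, rhs)                    # exact, every iteration
        w = w_new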


6.4 Fundamental Variance Relation

The energy-conservation relation has important ramifications in the study of adaptive filters. Chapter 6 applies the relation to steady-state performance. Chapter 7 applies the relation to tracking analysis. Chapter 8 applies the relation to finite-precision analysis (skipped in this class). Chapter 9 applies the relation to transient analysis.

6.4.1 Steady-State Filter Operation

What is intended is that the filter is operating in steady state … it is working and has adapted.

Theorem 6.4.1 (Steady-State): An adaptive filter will be said to operate in steady-state if it holds that

$$ E\left[ \tilde{w}_i \right] \to s \quad\text{as } i \to \infty \qquad (\text{usually } s = 0) $$

$$ E\left[ \tilde{w}_i \tilde{w}_i^H \right] \to C \quad\text{and}\quad E\left[ \tilde{w}_{i-1} \tilde{w}_{i-1}^H \right] \to C \quad\text{as } i \to \infty $$

That is, the mean and covariance matrix tend to finite constant values. In addition, it follows that

$$ E\,\|\tilde{w}_i\|^2 = E\,\|\tilde{w}_{i-1}\|^2 \to c \quad\text{as } i \to \infty, \qquad\text{where } c = \text{Tr}(C) $$

6.4.2 Variance Relation for Steady-State Performance

Taking the expected value of the energy-conservation relation:

$$ E\,\|\tilde{w}_i\|^2 + E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\,\|\tilde{w}_{i-1}\|^2 + E\left[ \bar{\mu}(i)\, |e_p(i)|^2 \right] $$

At steady state the weight-error energies on the two sides are equal, so we expect the scaled a-priori and a-posteriori estimation-error energies to be equal:

$$ E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\left[ \bar{\mu}(i)\, |e_p(i)|^2 \right] \quad\text{as } i \to \infty $$

Let's investigate the expected values. Expanding the a-posteriori term based on

$$ e_p(i) = e_a(i) - \mu\, \|u_i\|^2\, g[e(i)] $$

we therefore have

$$ E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\left[ \bar{\mu}(i)\, \left| e_a(i) - \mu\, \|u_i\|^2\, g[e(i)] \right|^2 \right] $$


Expanding the terms:

$$ E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] - \mu\, E\left[ \bar{\mu}(i)\, \|u_i\|^2\, e_a^*(i)\, g[e(i)] \right] - \mu\, E\left[ \bar{\mu}(i)\, \|u_i\|^2\, g^*[e(i)]\, e_a(i) \right] + \mu^2\, E\left[ \bar{\mu}(i)\, \|u_i\|^4\, |g[e(i)]|^2 \right] $$

Simplifying the 2nd, 3rd, and 4th terms, using $\bar{\mu}(i)\, \|u_i\|^2 = 1$ for $\|u_i\|^2 \neq 0$:

$$ \mu\, E\left[ \bar{\mu}(i)\, \|u_i\|^2\, e_a^*(i)\, g[e(i)] \right] = \mu\, E\left[ e_a^*(i)\, g[e(i)] \right] $$

$$ \mu\, E\left[ \bar{\mu}(i)\, \|u_i\|^2\, g^*[e(i)]\, e_a(i) \right] = \mu\, E\left[ g^*[e(i)]\, e_a(i) \right] $$

$$ \mu^2\, E\left[ \bar{\mu}(i)\, \|u_i\|^4\, |g[e(i)]|^2 \right] = \mu^2\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] $$

Substituting:

$$ E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] - \mu\, E\left[ e_a^*(i)\, g[e(i)] \right] - \mu\, E\left[ g^*[e(i)]\, e_a(i) \right] + \mu^2\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] $$

The first term can be subtracted from both sides of the equation, resulting in

$$ 0 = -\mu\, E\left[ e_a^*(i)\, g[e(i)] \right] - \mu\, E\left[ g^*[e(i)]\, e_a(i) \right] + \mu^2\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] $$

or

$$ \mu\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] = E\left[ e_a^*(i)\, g[e(i)] \right] + E\left[ g^*[e(i)]\, e_a(i) \right] $$

and

$$ \mu\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, g[e(i)] \right] $$

Note: this is an exact relationship, with no assumptions beyond steady-state operation!

Theorem 6.4.2 (Variance Relation): For an adaptive filter of the form

$$ w_i = w_{i-1} + \mu\, u_i^H\, g[e(i)] $$

and for any data, assuming the filter is operating in steady-state, the following relationship holds:

$$ \mu\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, g[e(i)] \right] \qquad\text{as } i \to \infty $$

The energy-conservation and variance relations are the bases used to evaluate the steady-state performance of the adaptive systems described; see Sections 6.5-6.11. They are also used to define the EMSE, the misadjustment, and the steady-state tracking performance for nonstationary systems (Chap. 7).
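The steady-state balance $E[\bar{\mu}(i)|e_a(i)|^2] = E[\bar{\mu}(i)|e_p(i)|^2]$ can itself be checked by Monte Carlo once the filter has converged; a sketch for LMS with illustrative parameters:

    import numpy as np

    rng = np.random.default_rng(7)
    M, N, mu, sigma_v = 4, 200_000, 0.02, 0.1
    w_o = rng.standard_normal(M)
    w = np.zeros(M)
    sa = sp = 0.0
    count = 0
    for i in range(N):
        u = rng.standard_normal(M)
        d = u @ w_o + sigma_v * rng.standard_normal()
        ea = u @ (w_o - w)                   # e_a(i)
        w = w + mu * u * (d - u @ w)         # LMS update
        ep = u @ (w_o - w)                   # e_p(i)
        if i > N // 2:                       # accumulate only after convergence
            mub = 1.0 / (u @ u)
            sa += mub * ea**2
            sp += mub * ep**2
            count += 1
    print(sa / count, sp / count)            # nearly equal at steady state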


6.5 Mean-Square Performance of LMS (on-line text Chap. 16)

For the LMS filter

$$ e(i) = d(i) - u_i w_{i-1}, \qquad w_i = w_{i-1} + \mu\, u_i^H\, e(i) $$

the g[·] function (Table 16.2) is

$$ g[e(i)] = e(i) $$

Taking the variance relation and substituting:

$$ \mu\, E\left[ \|u_i\|^2\, |e(i)|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, e(i) \right] $$

Using $e(i) = v(i) + e_a(i)$:

$$ \mu\, E\left[ \|u_i\|^2\, |v(i) + e_a(i)|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i) \left( v(i) + e_a(i) \right) \right] $$

Expanding the left side and the right side (to find and eliminate orthogonal terms):

$$ \mu\, E\left[ \|u_i\|^2 \left( |v(i)|^2 + v(i)\, e_a^*(i) + e_a(i)\, v^*(i) + |e_a(i)|^2 \right) \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, v(i) \right] + 2\, E\,|e_a(i)|^2 $$

Using the orthogonality of the noise to the a-priori error, the cross terms vanish:

$$ \mu\, E\left[ \|u_i\|^2\, |v(i)|^2 \right] + \mu\, E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] = 2\, E\,|e_a(i)|^2 $$

Again using the independence of the noise from the observations, along with the known relations

$$ E\,|v(i)|^2 = \sigma_v^2 \qquad\text{and}\qquad E\,\|u_i\|^2 = \text{Tr}(R_{uu}) $$

the equation becomes, after substitution and swapping left and right,

$$ 2\, E\,|e_a(i)|^2 = \mu\, \sigma_v^2\, \text{Tr}(R_{uu}) + \mu\, E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] $$

The definition of the EMSE was

$$ \text{EMSE} = \lim_{i\to\infty} E\,|e_a(i)|^2 $$

So for LMS we have

$$ 2\, \zeta_{LMS} = \mu\, \sigma_v^2\, \text{Tr}(R_{uu}) + \mu\, E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] $$

or

$$ \zeta_{LMS} = \text{EMSE}_{LMS} = \frac{\mu}{2} \left( \sigma_v^2\, \text{Tr}(R_{uu}) + E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] \right) $$

and the misadjustment becomes

$$ M_{LMS} = \frac{\text{EMSE}_{LMS}}{\sigma_v^2} = \frac{\mu}{2\, \sigma_v^2} \left( \sigma_v^2\, \text{Tr}(R_{uu}) + E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] \right) $$


Now we will make some assumptions.

Sufficiently Small Step-Size Assumption

Assumption: μ is sufficiently small that the a-priori estimation-error term is negligible:

$$ E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] \ll \sigma_v^2\, \text{Tr}(R_{uu}) $$

Then

$$ \zeta_{LMS} \approx \frac{\mu}{2}\, \sigma_v^2\, \text{Tr}(R_{uu}) $$

and

$$ M_{LMS} = \frac{\text{EMSE}_{LMS}}{\sigma_v^2} \approx \frac{\mu}{2}\, \text{Tr}(R_{uu}) $$

Separation Principle Assumption

Assumption based on separation (independence) in the expected value. Let

$$ E\left[ \|u_i\|^2\, |e_a(i)|^2 \right] \approx E\,\|u_i\|^2\;\, E\,|e_a(i)|^2 = \text{Tr}(R_{uu})\;\, E\,|e_a(i)|^2 $$

Then

$$ 2\, \zeta_{LMS} = \mu\, \sigma_v^2\, \text{Tr}(R_{uu}) + \mu\, \text{Tr}(R_{uu})\, \zeta_{LMS} $$

$$ \zeta_{LMS} \left( 2 - \mu\, \text{Tr}(R_{uu}) \right) = \mu\, \sigma_v^2\, \text{Tr}(R_{uu}) $$

For $\mu\, \text{Tr}(R_{uu}) < 2$,

$$ \zeta_{LMS} = \frac{\mu\, \sigma_v^2\, \text{Tr}(R_{uu})}{2 - \mu\, \text{Tr}(R_{uu})} \qquad\text{and}\qquad M_{LMS} = \frac{\text{EMSE}_{LMS}}{\sigma_v^2} = \frac{\mu\, \text{Tr}(R_{uu})}{2 - \mu\, \text{Tr}(R_{uu})} $$

Note: this result is not based on μ being sufficiently small!

White Gaussian Input Data

See the text for a closed-form solution … but most input observations do not consist of only white random data.

Conclusion

The performance of the LMS algorithm depends on the input covariance matrix R_uu, the step size μ, and the length of the filter (through the size of Tr[R_uu]).
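A quick way to exercise these formulas is to simulate LMS under the stationary data model and compare the measured EMSE with $\mu \sigma_v^2 \text{Tr}(R_{uu}) / (2 - \mu\,\text{Tr}(R_{uu}))$; all parameter values below are illustrative:

    import numpy as np

    rng = np.random.default_rng(2)
    M, N, mu, sigma_v = 8, 200_000, 0.01, 0.1
    w_o = rng.standard_normal(M)

    w = np.zeros(M)
    ea2 = np.zeros(N)
    for i in range(N):
        u = rng.standard_normal(M)           # white regressors: R_uu = I, Tr(R_uu) = M
        d = u @ w_o + sigma_v * rng.standard_normal()
        ea2[i] = (u @ (w_o - w))**2          # |e_a(i)|^2
        w = w + mu * u * (d - u @ w)

    emse_sim = np.mean(ea2[N//2:])           # average after convergence
    emse_th = mu * sigma_v**2 * M / (2 - mu * M)
    print(emse_sim, emse_th)                 # should agree closely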


6.6 Mean-Square Performance of ε-NLMS

For the ε-NLMS filter

$$ e(i) = d(i) - u_i w_{i-1}, \qquad w_i = w_{i-1} + \mu\, \frac{u_i^H}{\varepsilon + \|u_i\|^2}\, e(i) $$

The g[·] function is

$$ g[e(i)] = \frac{e(i)}{\varepsilon + \|u_i\|^2} $$

The variance relation

$$ \mu\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, g[e(i)] \right] $$

becomes

$$ \mu\, E\left[ \frac{\|u_i\|^2\, |e(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] = 2\, \text{Re}\, E\left[ \frac{e_a^*(i)\, e(i)}{\varepsilon + \|u_i\|^2} \right] $$

Substituting $e(i) = v(i) + e_a(i)$:

$$ \mu\, E\left[ \frac{\|u_i\|^2\, |v(i) + e_a(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] = 2\, \text{Re}\, E\left[ \frac{e_a^*(i) \left( v(i) + e_a(i) \right)}{\varepsilon + \|u_i\|^2} \right] $$

Jumping forward with the same orthogonality arguments as for LMS:

$$ \mu\, E\left[ \frac{\|u_i\|^2\, |v(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] + \mu\, E\left[ \frac{\|u_i\|^2\, |e_a(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] = 2\, E\left[ \frac{|e_a(i)|^2}{\varepsilon + \|u_i\|^2} \right] $$

Using the known relation $E\,|v(i)|^2 = \sigma_v^2$ and the independence of v(i) from u_i:

$$ \mu\, \sigma_v^2\, E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] + \mu\, E\left[ \frac{\|u_i\|^2\, |e_a(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] = 2\, E\left[ \frac{|e_a(i)|^2}{\varepsilon + \|u_i\|^2} \right] $$

This is the ε-NLMS variance relation.


ε-NLMS Separation Principle Assumption

Assumption: let

$$ E\left[ \frac{\|u_i\|^2\, |e_a(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] \approx E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right]\, E\,|e_a(i)|^2 $$

and

$$ E\left[ \frac{|e_a(i)|^2}{\varepsilon + \|u_i\|^2} \right] \approx E\left[ \frac{1}{\varepsilon + \|u_i\|^2} \right]\, E\,|e_a(i)|^2 $$

Then the variance relation becomes

$$ \mu\, \sigma_v^2\, E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] + \mu\, E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right]\, E\,|e_a(i)|^2 = 2\, E\left[ \frac{1}{\varepsilon + \|u_i\|^2} \right]\, E\,|e_a(i)|^2 $$

Using the following definitions,

$$ \eta_u \triangleq E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right], \qquad \beta_u \triangleq E\left[ \frac{1}{\varepsilon + \|u_i\|^2} \right], \qquad \text{EMSE} = \lim_{i\to\infty} E\,|e_a(i)|^2 $$

the previous equation becomes

$$ \mu\, \sigma_v^2\, \eta_u + \mu\, \eta_u\, \zeta_{\varepsilon\text{-NLMS}} = 2\, \beta_u\, \zeta_{\varepsilon\text{-NLMS}} $$

with the result

$$ \zeta_{\varepsilon\text{-NLMS}} = \frac{\mu\, \sigma_v^2\, \eta_u}{2\, \beta_u - \mu\, \eta_u} $$

and the misadjustment becomes

$$ M_{\varepsilon\text{-NLMS}} = \frac{\text{EMSE}}{\sigma_v^2} = \frac{\mu\, \eta_u}{2\, \beta_u - \mu\, \eta_u} $$

For Epsilon Small

Assumption #2: for ε small, another approximation is

$$ \eta_u = E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] \approx E\left[ \frac{1}{\|u_i\|^2} \right] \approx E\left[ \frac{1}{\varepsilon + \|u_i\|^2} \right] = \beta_u $$

Therefore

$$ \zeta_{\varepsilon\text{-NLMS}} \approx \frac{\mu\, \sigma_v^2}{2 - \mu} $$

and the misadjustment becomes

$$ M_{\varepsilon\text{-NLMS}} \approx \frac{\mu}{2 - \mu} $$


ε-NLMS: Small Epsilon and the "Steady-State Approximation"

Starting from the variance relation again,

$$ \mu\, E\left[ \|u_i\|^2\, |g[e(i)]|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, g[e(i)] \right] $$

$$ \mu\, \sigma_v^2\, E\left[ \frac{\|u_i\|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] + \mu\, E\left[ \frac{\|u_i\|^2\, |e_a(i)|^2}{\left( \varepsilon + \|u_i\|^2 \right)^2} \right] = 2\, E\left[ \frac{|e_a(i)|^2}{\varepsilon + \|u_i\|^2} \right] $$

Letting epsilon go to zero:

$$ \mu\, \sigma_v^2\, E\left[ \frac{1}{\|u_i\|^2} \right] + \mu\, E\left[ \frac{|e_a(i)|^2}{\|u_i\|^2} \right] = 2\, E\left[ \frac{|e_a(i)|^2}{\|u_i\|^2} \right] $$

$$ (2 - \mu)\, E\left[ \frac{|e_a(i)|^2}{\|u_i\|^2} \right] = \mu\, \sigma_v^2\, E\left[ \frac{1}{\|u_i\|^2} \right] $$

The steady-state approximation replaces the expectation of the ratio on the left by a ratio of expectations:

$$ E\left[ \frac{|e_a(i)|^2}{\|u_i\|^2} \right] \approx \frac{E\,|e_a(i)|^2}{E\,\|u_i\|^2} $$

Using the relation $E\,\|u_i\|^2 = \text{Tr}(R_{uu})$:

$$ (2 - \mu)\, \frac{\zeta_{\varepsilon\text{-NLMS}}}{\text{Tr}(R_{uu})} = \mu\, \sigma_v^2\, E\left[ \frac{1}{\|u_i\|^2} \right] $$

Collecting terms:

$$ \zeta_{\varepsilon\text{-NLMS}} = \frac{\mu\, \sigma_v^2\, \text{Tr}(R_{uu})\, E\left[ \|u_i\|^{-2} \right]}{2 - \mu} $$

and the misadjustment becomes

$$ M_{\varepsilon\text{-NLMS}} = \frac{\text{EMSE}}{\sigma_v^2} = \frac{\mu\, \text{Tr}(R_{uu})\, E\left[ \|u_i\|^{-2} \right]}{2 - \mu} $$

Conclusion

The performance of the ε-NLMS algorithm under the first approach is independent of the input covariance matrix R_uu and is set by the step size μ.
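The three ε-NLMS approximations can be compared numerically by estimating the expectations via Monte Carlo; a sketch under an assumed white Gaussian regressor model (so Tr(R_uu) = M):

    import numpy as np

    rng = np.random.default_rng(3)
    M, mu, eps, sigma_v2 = 8, 0.2, 1e-3, 0.01
    u2 = np.sum(rng.standard_normal((100_000, M))**2, axis=1)   # samples of ||u_i||^2

    eta  = np.mean(u2 / (eps + u2)**2)       # E[ ||u||^2 / (eps + ||u||^2)^2 ]
    beta = np.mean(1.0 / (eps + u2))         # E[ 1 / (eps + ||u||^2) ]
    zeta_sep   = mu * sigma_v2 * eta / (2 * beta - mu * eta)     # separation principle
    zeta_small = mu * sigma_v2 / (2 - mu)                        # small-epsilon result
    zeta_ss    = mu * sigma_v2 * M * np.mean(1 / u2) / (2 - mu)  # steady-state approximation
    print(zeta_sep, zeta_small, zeta_ss)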


6.9 Mean-Square Performance of RLS

Repeating elements of the adaptive equations and derivation:

$$ \hat{R}_{uu}(i) = \sum_{j=0}^{i} \lambda^{i-j}\, u_j^H u_j $$

$$ P_i = \left( \lambda^{i+1}\, \varepsilon\, I + \sum_{j=0}^{i} \lambda^{i-j}\, u_j^H u_j \right)^{-1} $$

$$ P_i = \lambda^{-1} \left( P_{i-1} - \frac{\lambda^{-1}\, P_{i-1}\, u_i^H u_i\, P_{i-1}}{1 + \lambda^{-1}\, u_i P_{i-1} u_i^H} \right), \qquad\text{where } P_{-1} = \varepsilon^{-1} I $$

$$ e(i) = d(i) - u_i w_{i-1}, \qquad w_i = w_{i-1} + P_i\, u_i^H\, e(i) $$

Energy-Conservation Relation

$$ \tilde{w}_i = \tilde{w}_{i-1} - P_i\, u_i^H\, e(i) $$

$$ e_a(i) = u_i \tilde{w}_{i-1} = u_i \left( w^o - w_{i-1} \right), \qquad e_p(i) = u_i \tilde{w}_i = u_i \left( w^o - w_i \right) $$

$$ e_p(i) = u_i \tilde{w}_{i-1} - u_i P_i u_i^H\, e(i) $$

Define the $P_i$-weighted norm

$$ \|u_i\|_{P_i}^2 = u_i\, P_i\, u_i^H $$

so that

$$ e_p(i) = e_a(i) - \|u_i\|_{P_i}^2\, e(i) $$

The error becomes

$$ e(i) = \frac{e_a(i) - e_p(i)}{\|u_i\|_{P_i}^2} $$

and substituting into the weight-error recursion,

$$ \tilde{w}_i = \tilde{w}_{i-1} - P_i u_i^H\, \frac{e_a(i) - e_p(i)}{\|u_i\|_{P_i}^2} $$

$$ \tilde{w}_i + \frac{P_i u_i^H}{\|u_i\|_{P_i}^2}\, e_a(i) = \tilde{w}_{i-1} + \frac{P_i u_i^H}{\|u_i\|_{P_i}^2}\, e_p(i) $$


Form the energy relationship for both sides of the equation with the weighted norm $\|x\|_{P_i^{-1}}^2 = x^H P_i^{-1} x$:

$$ \left\| \tilde{w}_i + \frac{P_i u_i^H}{\|u_i\|_{P_i}^2}\, e_a(i) \right\|_{P_i^{-1}}^2 = \left\| \tilde{w}_{i-1} + \frac{P_i u_i^H}{\|u_i\|_{P_i}^2}\, e_p(i) \right\|_{P_i^{-1}}^2 $$

Expanding both sides, the $P_i$ and $P_i^{-1}$ multiplications cancel in the cross and quadratic terms:

$$ \|\tilde{w}_i\|_{P_i^{-1}}^2 + \frac{2\, \text{Re}\left[ e_a^*(i)\, u_i \tilde{w}_i \right]}{\|u_i\|_{P_i}^2} + \frac{|e_a(i)|^2}{\|u_i\|_{P_i}^2} = \|\tilde{w}_{i-1}\|_{P_i^{-1}}^2 + \frac{2\, \text{Re}\left[ e_p^*(i)\, u_i \tilde{w}_{i-1} \right]}{\|u_i\|_{P_i}^2} + \frac{|e_p(i)|^2}{\|u_i\|_{P_i}^2} $$

Since $u_i \tilde{w}_i = e_p(i)$ and $u_i \tilde{w}_{i-1} = e_a(i)$, the cross terms on the two sides are identical and cancel, and finally

$$ \|\tilde{w}_i\|_{P_i^{-1}}^2 + \frac{1}{\|u_i\|_{P_i}^2}\, |e_a(i)|^2 = \|\tilde{w}_{i-1}\|_{P_i^{-1}}^2 + \frac{1}{\|u_i\|_{P_i}^2}\, |e_p(i)|^2 $$

Defining

$$ \bar{\mu}(i) = \begin{cases} 1/\|u_i\|_{P_i}^2, & \|u_i\|_{P_i}^2 \neq 0 \\ 0, & \|u_i\|_{P_i}^2 = 0 \end{cases} $$

this becomes

$$ \|\tilde{w}_i\|_{P_i^{-1}}^2 + \bar{\mu}(i)\, |e_a(i)|^2 = \|\tilde{w}_{i-1}\|_{P_i^{-1}}^2 + \bar{\mu}(i)\, |e_p(i)|^2 $$

For RLS this is a new energy-conservation relation. The analysis technique can again be applied.
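As with LMS, the weighted relation is exact per iteration and can be checked numerically; a sketch with illustrative parameters:

    import numpy as np

    rng = np.random.default_rng(4)
    M, N, lam, eps = 4, 500, 0.99, 1e-3
    w_o = rng.standard_normal(M)
    w = np.zeros(M)
    P = np.eye(M) / eps                                       # P_{-1} = (1/eps) I
    for i in range(N):
        u = rng.standard_normal(M)                            # regressor u_i
        d = u @ w_o + 0.1 * rng.standard_normal()
        Pu = P @ u
        P = (P - np.outer(Pu, Pu) / (lam + u @ Pu)) / lam     # recursion for P_i
        e = d - u @ w
        w_new = w + (P @ u) * e                               # w_i = w_{i-1} + P_i u_i^H e(i)
        uP2 = u @ P @ u                                       # ||u_i||_{P_i}^2
        ea = u @ (w_o - w)                                    # e_a(i)
        ep = u @ (w_o - w_new)                                # e_p(i)
        Pinv = np.linalg.inv(P)
        lhs = (w_o - w_new) @ Pinv @ (w_o - w_new) + ea**2 / uP2
        rhs = (w_o - w) @ Pinv @ (w_o - w) + ep**2 / uP2
        assert np.isclose(lhs, rhs)                           # exact weighted energy conservation
        w = w_new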


Steady-State Approximations

Covariance matrix:

$$ P_i^{-1} = \lambda^{i+1}\, \varepsilon\, I + \sum_{j=0}^{i} \lambda^{i-j}\, u_j^H u_j $$

$$ E\left[ P_i^{-1} \right] = \lambda^{i+1}\, \varepsilon\, I + \sum_{j=0}^{i} \lambda^{i-j}\, E\left[ u_j^H u_j \right] = \lambda^{i+1}\, \varepsilon\, I + \left( \sum_{j=0}^{i} \lambda^{i-j} \right) R_{uu} $$

Taking the limit (for λ < 1) and assuming ε small:

$$ \lim_{i\to\infty} E\left[ P_i^{-1} \right] = \frac{1}{1-\lambda}\, R_{uu} $$

Then we allow the approximation

$$ E\left[ P_i \right] \approx \left( E\left[ P_i^{-1} \right] \right)^{-1} = (1-\lambda)\, R_{uu}^{-1} $$

Weight norm squared at steady state:

$$ E\,\|\tilde{w}_i\|^2 = E\,\|\tilde{w}_{i-1}\|^2 \to c = \text{Tr}(C) $$

which is used to approximate the scaled norms as

$$ E\,\|\tilde{w}_i\|_{P_i^{-1}}^2 \approx \text{Tr}\left( C\, E\left[ P_i^{-1} \right] \right) = \frac{1}{1-\lambda}\, \text{Tr}\left( C\, R_{uu} \right) \approx E\,\|\tilde{w}_{i-1}\|_{P_i^{-1}}^2 $$

Excess Mean-Square Error Performance

With

$$ E\,\|\tilde{w}_i\|_{P_i^{-1}}^2 = E\,\|\tilde{w}_{i-1}\|_{P_i^{-1}}^2 $$

the expected value of the energy-conservation relation

$$ \|\tilde{w}_i\|_{P_i^{-1}}^2 + \bar{\mu}(i)\, |e_a(i)|^2 = \|\tilde{w}_{i-1}\|_{P_i^{-1}}^2 + \bar{\mu}(i)\, |e_p(i)|^2 $$

becomes

$$ E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\left[ \bar{\mu}(i)\, |e_p(i)|^2 \right] $$

Expanding the a-posteriori term as before, using

$$ e_p(i) = e_a(i) - \|u_i\|_{P_i}^2\, e(i) $$

$$ E\left[ \bar{\mu}(i)\, |e_a(i)|^2 \right] = E\left[ \bar{\mu}(i)\, \left| e_a(i) - \|u_i\|_{P_i}^2\, e(i) \right|^2 \right] $$


which after manipulation results in

$$ E\left[ \|u_i\|_{P_i}^2\, |e(i)|^2 \right] = 2\, \text{Re}\, E\left[ e_a^*(i)\, e(i) \right] $$

Using $e(i) = v(i) + e_a(i)$, this becomes

$$ \sigma_v^2\, E\,\|u_i\|_{P_i}^2 + E\left[ \|u_i\|_{P_i}^2\, |e_a(i)|^2 \right] = 2\, E\,|e_a(i)|^2 = 2\, \zeta_{RLS} $$

Separation approximation:

$$ E\left[ \|u_i\|_{P_i}^2\, |e_a(i)|^2 \right] \approx E\,\|u_i\|_{P_i}^2\;\, E\,|e_a(i)|^2 $$

where

$$ E\,\|u_i\|_{P_i}^2 = E\left[ u_i P_i u_i^H \right] \approx \text{Tr}\left( R_{uu}\, E\left[ P_i \right] \right) = (1-\lambda)\, \text{Tr}\left( R_{uu} R_{uu}^{-1} \right) = (1-\lambda)\, M $$

with M the filter length. Substituting,

$$ \sigma_v^2\, (1-\lambda)\, M + (1-\lambda)\, M\, \zeta_{RLS} = 2\, \zeta_{RLS} $$

Resulting in

$$ \zeta_{RLS} = \frac{(1-\lambda)\, M\, \sigma_v^2}{2 - (1-\lambda)\, M} $$

and the misadjustment becomes

$$ M_{RLS} = \frac{\text{EMSE}}{\sigma_v^2} = \frac{(1-\lambda)\, M}{2 - (1-\lambda)\, M} $$

An approximation for λ very close to 1:

$$ \zeta_{RLS} \approx \frac{(1-\lambda)\, M\, \sigma_v^2}{2} $$

and the misadjustment becomes

$$ M_{RLS} \approx \frac{(1-\lambda)\, M}{2} $$

Conclusion

The performance of the RLS algorithm depends on the selection of the "forgetting factor" λ and the length of the filter M. As with ε-NLMS, the performance is independent of the input covariance matrix R_uu.
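The three misadjustment formulas can be placed side by side for an illustrative setting (white input, so Tr(R_uu) = M; values are not from the notes):

    # LMS (separation principle), eps-NLMS (small epsilon), RLS (separation approximation)
    M, mu, lam = 8, 0.01, 0.995

    M_lms  = mu * M / (2 - mu * M)                   # ~0.042
    M_nlms = mu / (2 - mu)                           # ~0.005
    M_rls  = (1 - lam) * M / (2 - (1 - lam) * M)     # ~0.020
    print(M_lms, M_nlms, M_rls)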


To be reviewed in class: 6.7.

Problem 6.7: Consider now that the adaptive filter has error nonlinearities,

$$ w_i = w_{i-1} + \mu\, u_i^H\, g[e(i)] $$

Previously we obtained expressions for the EMSE of a variety of filters corresponding to particular choices of g[·]. In this problem we derive an expression for the EMSE based on a generic error function g[·]. For this purpose the textbook makes two assumptions:

1. At steady state, ‖u_i‖² is independent of e(i).

2. At steady state, the estimation error e_a(i) is circular Gaussian.

The validity of the final assumption rests on the fact that, for long filters, the estimation error $e_a(i) = u_i \tilde{w}_{i-1}$ is the sum of a large number of individual components. Using the central limit theorem, we can then argue that the distribution of e_a(i) can be approximated as Gaussian.

(a) If we denote the EMSE of the filter by ζ, then we have to show that

$$ \zeta = \frac{\mu}{2}\, \text{Tr}(R_{uu})\, \frac{h_G}{h_U} $$

where

$$ h_G \triangleq E\,\left| g[e(i)] \right|^2 \qquad\text{and}\qquad h_U \triangleq \frac{\text{Re}\, E\left[ e_a^*(i)\, g[e(i)] \right]}{E\,|e_a(i)|^2} \qquad\text{as } i \to \infty $$

This can be proven as soon as we use the assumption that ‖u_i‖² is independent of e(i) in the variance relation.

(b) We now use the second assumption to argue that, in steady state, the terms $\text{Re}\, E\left[ e_a^*(i)\, g[e(i)] \right]$ and $E\,|g[e(i)]|^2$ are functions of $E\,|e_a(i)|^2$ alone. This is not intuitive: g[e(i)] is a function of e_a and v, and since the two are independent, their joint pdf is equal to the product of their individual pdfs. Hence we can write

$$ E\,\left| g[e(i)] \right|^2 = \iint \left| g(e_a + v) \right|^2 f_{e_a}(e_a)\, f_v(v)\, de_a\, dv $$

Using the fact that $\zeta = E\,|e_a(i)|^2$, and since e_a(i) is circular Gaussian, we can prove the required result. We can likewise show that h_G and h_U are functions of ζ.
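Since h_G and h_U reduce to functions of ζ under the Gaussian assumption, they can be evaluated by Monte Carlo for any candidate nonlinearity; a sketch for the sign-error choice g(e) = sign(e), with real-valued data assumed for simplicity:

    import numpy as np

    def h_terms(zeta, sigma_v2, g, n=1_000_000, seed=5):
        """Monte-Carlo estimates of h_G = E|g(e)|^2 and h_U = E[e_a g(e)] / E|e_a|^2."""
        rng = np.random.default_rng(seed)
        ea = np.sqrt(zeta) * rng.standard_normal(n)        # e_a ~ N(0, zeta)
        v  = np.sqrt(sigma_v2) * rng.standard_normal(n)    # v independent of e_a
        e = ea + v
        hG = np.mean(g(e)**2)
        hU = np.mean(ea * g(e)) / np.mean(ea**2)
        return hG, hU

    hG, hU = h_terms(zeta=1e-3, sigma_v2=0.01, g=np.sign)
    print(hG, hU)    # hG = 1 exactly for sign(e); hU follows from the Gaussian model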


Dr. Bazuin's Simulation/Computation Concerns

From the discussion of the excess mean-square error (EMSE) comes the value of the MSE. From the previous definitions,

$$ \text{MSE} = \lim_{i\to\infty} E\,|e(i)|^2, \qquad \text{MSE} = \text{EMSE} + J_{\min} $$

with

$$ e(i) = d(i) - u_i w_{i-1} \qquad\text{and}\qquad e(i) = v(i) + e_a(i) $$

where

$$ e_a(i) = u_i \tilde{w}_{i-1} = u_i \left( w^o - w_{i-1} \right) $$

and

$$ \sigma_v^2 = E\,|v(i)|^2 = E\left[ \left( d(i) - u_i w^o \right)\left( d(i) - u_i w^o \right)^H \right] = J_{\min} $$

I dislike v(i):

1) It is a noise term that should have a measured magnitude, but it is usually simply assigned one.

2) I want it to relate to the minimum mean-square value and/or the linear least-mean-squares value from Sections 2 and 3.

Possible considerations in performing simulations:

- Generate the appropriate vectors and matrices from the simulation data.

- After the adaptive simulation is complete, use the generated vectors/matrices to compute the appropriate optimal weights and the stochastic minimum estimation error.

Now the appropriate estimation-error results can be compared to the adaptive results! An excellent place to demonstrate this is using the simulations of Section 5 for the DFE. In addition, the factors that go into computing the EMSE can be independently evaluated and visualized in terms of e(i) and the a-priori and a-posteriori errors (which require an optimal solution to compute).
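A sketch of that procedure, assuming an LMS run under the stationary model (all parameters illustrative): record the data and the output error, then estimate R_uu and R_du, compute w_opt and J_min, and compare the adaptive MSE against J_min plus the predicted EMSE.

    import numpy as np

    rng = np.random.default_rng(6)
    Mlen, N, mu, sigma_v = 8, 100_000, 0.01, 0.1
    w_o = rng.standard_normal(Mlen)
    U = rng.standard_normal((N, Mlen))
    d = U @ w_o + sigma_v * rng.standard_normal(N)

    # Adaptive run (LMS), recording the output error e(i)
    w = np.zeros(Mlen)
    e = np.zeros(N)
    for i in range(N):
        e[i] = d[i] - U[i] @ w
        w = w + mu * U[i] * e[i]

    # After the run: estimate moments, then the optimal weights and J_min
    Ruu = U.T @ U / N
    Rdu = U.T @ d / N
    w_opt = np.linalg.solve(Ruu, Rdu)
    J_min = np.mean(d**2) - Rdu @ w_opt              # sigma_d^2 - R_du^H R_uu^{-1} R_du

    mse_adaptive = np.mean(e[N//2:]**2)              # measured steady-state MSE
    emse_pred = mu * J_min * Mlen / (2 - mu * Mlen)  # LMS formula with Tr(R_uu) ~ Mlen
    print(mse_adaptive, J_min + emse_pred)           # the two should agree closely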