ESE 524 Detection and Estimation Theory

Joseph A. O'Sullivan
Samuel C. Sachs Professor
Electronic Systems and Signals Research Laboratory
Electrical and Systems Engineering
Washington University
211 Urbauer Hall
314-935-4173 (Lynda Markham answers)
[email protected]

J. A. O'S. ESE 524, Lecture 7, 02/03/09

Announcements
Problem Set 1 is due today
Problem Set 2 is posted, due 02/13/09
No class 02/10, 02/12, and 02/19
Make up on Fridays: 02/20, 02/27
Class website: http://classes.engineering.wustl.edu/ese524/
Questions?

Last Class: Information Rate Functions and Performance Bounds
Motivation
Chernoff Bound
Binary Hypothesis Testing
Tilted Distributions
Relative Entropy
Information Rate Functions
Additivity of Information
Examples:
  Gaussian same covariance, different means
  Poisson different means
  Gaussian same mean, different variances

Last Class: Information Rate Functions: Motivation
Receiver Operating Characteristic is often not easily computable
Performance is often a function of a few parameters: SNR, N, other statistics
Bounds on performance may be more easily computed
The bounds below guarantee a level of performance for optimal tests
Bounds are exponential in the rate function
Rate function is additive in information (proportional to N)

Last Class: Chernoff Bound
Let x be a random variable with probability density function $p_x(X)$. Define the moment generating function and the log-moment generating function for real-valued s.
The Chernoff bound is a bound on the tail probability.
The probability of a rare event is closely approximated by the Chernoff bound.
The (information) rate function equals the Legendre-Fenchel transform of the log-moment generating function.
The rate function can be used to bound the probability of any open set in N dimensions.

$$\Phi(s) = E\left[e^{sx}\right] = \int_{-\infty}^{\infty} p_x(X)\, e^{sX}\, dX$$
$$\Phi(s) \ge \int_{A}^{\infty} p_x(X)\, e^{sX}\, dX \ge e^{sA} \int_{A}^{\infty} p_x(X)\, dX, \quad \text{for all } s \ge 0$$
$$\ln P(x \ge A) \le -sA + \ln \Phi(s), \quad \text{for all } s \ge 0$$
$$I(A) = \sup_{s \ge 0} \left[ sA - \ln \Phi(s) \right]$$
$$\ln P(x \ge A) \le -I(A)$$
$$\phi(s) = \ln \Phi(s)$$

For random vectors:

$$\phi(\mathbf{s}) = \ln E\left[e^{\mathbf{s}^T \mathbf{x}}\right], \quad \text{for } \mathbf{s} \in \mathbb{R}^N$$
$$\ln P(\mathbf{x} \in \mathcal{A}) \le -\inf_{\mathbf{X} \in \mathcal{A}} I(\mathbf{X})$$
$$I(\mathbf{X}) = \sup_{\mathbf{s}} \left[ \mathbf{s}^T \mathbf{X} - \ln \Phi(\mathbf{s}) \right]$$
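As a worked illustration of the bound, consider a standard Gaussian random variable. This distribution is chosen here only as an example and is not taken from the slide; the rate function then has a closed form:

```latex
\begin{align*}
x &\sim \mathcal{N}(0,1): \quad \Phi(s) = E\left[e^{sx}\right] = e^{s^2/2},
   \qquad \phi(s) = \ln \Phi(s) = \frac{s^2}{2} \\
I(A) &= \sup_{s \ge 0} \left[ sA - \frac{s^2}{2} \right] = \frac{A^2}{2}
   \quad \text{attained at } s^* = A, \text{ for } A \ge 0 \\
P(x \ge A) &\le e^{-A^2/2}
\end{align*}
```

For A = 3 the bound is about 0.011 while the exact tail probability is about 0.00135, so the bound captures the exponent but not the prefactor.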

Chernoff Bound Summary
For random variables:

$$\Phi(s) = E\left[e^{sx}\right] = \int_{-\infty}^{\infty} p_x(X)\, e^{sX}\, dX$$
$$\phi(s) = \ln \Phi(s)$$
$$I(A) = \sup_{s \ge 0} \left[ sA - \phi(s) \right]$$
$$\ln P(x \ge A) \le -I(A)$$

For random vectors:

$$\phi(\mathbf{s}) = \ln E\left[e^{\mathbf{s}^T \mathbf{x}}\right], \quad \text{for } \mathbf{s} \in \mathbb{R}^N$$
$$\ln P(\mathbf{x} \in \mathcal{A}) \le -\inf_{\mathbf{X} \in \mathcal{A}} I(\mathbf{X})$$
$$I(\mathbf{X}) = \sup_{\mathbf{s}} \left[ \mathbf{s}^T \mathbf{X} - \ln \Phi(\mathbf{s}) \right]$$

Binary Hypothesis Testing
Probabilities of miss and false alarm are tail probabilities.
Upper bound them using Chernoff bound information rate functions given hypotheses 1 and 0.
Performance is better than the computed point: the optimal (P_D, P_F) is "up and to the left" of the point given by the bounds.

$$l(\mathbf{r}) = \ln \frac{p(\mathbf{r}|H_1)}{p(\mathbf{r}|H_0)}$$

False alarm (hypothesis 0):

$$\Phi_0(s) = E\left[e^{s\, l(\mathbf{r})} \,\middle|\, H_0\right] = \int_{-\infty}^{\infty} p_{l|H_0}(L|H_0)\, e^{sL}\, dL$$
$$P_F = P(l(\mathbf{r}) \ge \gamma \,|\, H_0)$$
$$\ln P(l(\mathbf{r}) \ge \gamma \,|\, H_0) \le -I_0(\gamma)$$
$$I_0(\gamma) = \sup_{s \ge 0} \left[ s\gamma - \ln \Phi_0(s) \right]$$
$$\phi_0(s) = \ln \Phi_0(s)$$

Miss (hypothesis 1):

$$P_M = P(l(\mathbf{r}) \le \gamma \,|\, H_1)$$
$$\ln P_M \le -I_1(\gamma)$$
$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \,\middle|\, H_1\right] = \int_{-\infty}^{\infty} p_{l|H_1}(L|H_1)\, e^{\sigma L}\, dL$$
$$I_1(\gamma) = \sup_{\sigma \le 0} \left[ \sigma\gamma - \ln \Phi_1(\sigma) \right]$$
$$\phi_1(\sigma) = \ln \Phi_1(\sigma)$$

Aside
Tail probabilities can be on either side.
For the bound, the variable in the (log-) moment generating function is a dummy variable.
The variables in the log-moment generating functions under the two hypotheses are different.

$$P_M = P(l(\mathbf{r}) \le \gamma \,|\, H_1), \qquad \ln P_M \le -I_1(\gamma)$$
$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \,\middle|\, H_1\right] = \int_{-\infty}^{\infty} p_{l|H_1}(L|H_1)\, e^{\sigma L}\, dL$$
$$\Phi_1(\sigma) \ge \int_{-\infty}^{\gamma} p_{l|H_1}(L|H_1)\, e^{\sigma L}\, dL \ge e^{\sigma\gamma} \int_{-\infty}^{\gamma} p_{l|H_1}(L|H_1)\, dL, \quad \text{for all } \sigma \le 0$$
$$I_1(\gamma) = \sup_{\sigma \le 0} \left[ \sigma\gamma - \ln \Phi_1(\sigma) \right]$$

Tilted Distributions
The moment generating functions are for the log-likelihood ratio given the hypotheses.
If the supremum defining the rate function is achieved at an interior point, then the derivative is zero.
Tilt the original distribution until the mean of the log-likelihood function equals the threshold.

$$l(\mathbf{r}) = \ln \frac{p(\mathbf{r}|H_1)}{p(\mathbf{r}|H_0)}$$
$$\Phi_0(s) = E\left[e^{s\, l(\mathbf{r})} \,\middle|\, H_0\right] = \int_{-\infty}^{\infty} p(\mathbf{R}|H_0)\, e^{s \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}}\, d\mathbf{R} = \int_{-\infty}^{\infty} \left[p(\mathbf{R}|H_0)\right]^{1-s} \left[p(\mathbf{R}|H_1)\right]^{s}\, d\mathbf{R}$$
$$I_0(\gamma) = \sup_{s \ge 0} \left[ s\gamma - \ln \Phi_0(s) \right]$$
$$\gamma = \frac{d}{ds} \ln \Phi_0(s^*) = \frac{\frac{d}{ds}\Phi_0(s^*)}{\Phi_0(s^*)} = E_{s^*}\left[l(\mathbf{r})\right]$$
$$p_s(\mathbf{R}) = \frac{p(\mathbf{R}|H_0)\, e^{s \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}}}{\Phi_0(s)}$$
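The step from the derivative of the log-moment generating function to the mean under the tilted density can be written out as follows. This is a sketch that assumes differentiation under the integral sign is permitted; $E_s$ denotes expectation with respect to $p_s$.

```latex
\begin{align*}
\frac{d}{ds}\Phi_0(s)
  &= \int p(\mathbf{R}|H_0)\, l(\mathbf{R})\, e^{s\, l(\mathbf{R})}\, d\mathbf{R} \\
\frac{d}{ds}\ln \Phi_0(s)
  &= \frac{1}{\Phi_0(s)} \int p(\mathbf{R}|H_0)\, l(\mathbf{R})\, e^{s\, l(\mathbf{R})}\, d\mathbf{R}
   = \int p_s(\mathbf{R})\, l(\mathbf{R})\, d\mathbf{R}
   = E_s\left[ l(\mathbf{r}) \right]
\end{align*}
```

Setting the derivative of $s\gamma - \ln \Phi_0(s)$ to zero therefore gives $\gamma = E_{s^*}[l(\mathbf{r})]$.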

Relationship to Relative Entropy
Relative entropy is a quantitative measure of information, given in bits or nats.
Information rate function equals the relative entropy between the tilted pdf and the pdf under the hypothesis.
Duality of exponential family and its mean: mean determines parameter; parameter determines mean.

$$D(p\|q) = \sum p \log \frac{p}{q}$$
$$\ln x \ge 1 - \frac{1}{x} \;\Rightarrow\; \sum p \log \frac{p}{q} \ge \sum p \left(1 - \frac{q}{p}\right) = 0$$
$$D(p\|q) \ge 0$$
$$D(p_s\|p_0) = \int p_s(\mathbf{R}) \ln \frac{p_s(\mathbf{R})}{p(\mathbf{R}|H_0)}\, d\mathbf{R}$$
$$p_s(\mathbf{R}) = \frac{p(\mathbf{R}|H_0)\, e^{s \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}}}{\Phi_0(s)}$$
$$D(p_s\|p_0) = E_s\left[s\, l(\mathbf{r})\right] - \ln \Phi_0(s) = s\gamma_s - \ln \Phi_0(s) = I_0(\gamma_s)$$
$$\gamma_s = E_s\left[l(\mathbf{r})\right]$$
$$\gamma \to s^*, \qquad s \to \gamma_s$$
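The identity between the relative entropy and the rate function follows in one step from the form of the tilted density. A short derivation, using only the definitions on this slide:

```latex
\begin{align*}
\ln \frac{p_s(\mathbf{R})}{p(\mathbf{R}|H_0)} &= s\, l(\mathbf{R}) - \ln \Phi_0(s) \\
D(p_s\|p_0) &= E_s\left[ s\, l(\mathbf{r}) - \ln \Phi_0(s) \right]
             = s\, E_s\left[ l(\mathbf{r}) \right] - \ln \Phi_0(s)
             = s\gamma_s - \ln \Phi_0(s)
\end{align*}
```

Because $\gamma_s = \frac{d}{ds}\ln\Phi_0(s)$, the value $s$ is the stationary point of $s'\gamma_s - \ln\Phi_0(s')$, so for $s \ge 0$ the last expression equals $I_0(\gamma_s)$.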

Relationship to Relative Entropy Part 2
Relative entropy between the tilted density and the density under hypothesis 1 is simply related to that under hypothesis 0.

$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \,\middle|\, H_1\right]$$
$$I_1(\gamma) = \sup_{\sigma \le 0} \left[ \sigma\gamma - \ln \Phi_1(\sigma) \right]$$
$$\gamma = \frac{d}{d\sigma} \ln \Phi_1(\sigma^*) = \frac{\frac{d}{d\sigma}\Phi_1(\sigma^*)}{\Phi_1(\sigma^*)}$$
$$p_{\sigma+1}(\mathbf{R}) = \frac{p(\mathbf{R}|H_1)\, e^{\sigma \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}}}{\Phi_1(\sigma)} = \frac{p(\mathbf{R}|H_0)\, e^{(\sigma+1) \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}}}{\Phi_0(\sigma+1)}$$
$$p_s(\mathbf{R}) = \frac{p(\mathbf{R}|H_0)\, e^{s \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}}}{\Phi_0(s)}$$
$$\gamma = \frac{\frac{d}{d\sigma}\Phi_0(\sigma^*+1)}{\Phi_0(\sigma^*+1)} = E_{\sigma^*+1}\left[l(\mathbf{r})\right]$$
$$D(p_s\|p_1) = \int p_s(\mathbf{R}) \ln \frac{p_s(\mathbf{R})}{p(\mathbf{R}|H_1)}\, d\mathbf{R} = E_s\left[(s-1)\, l(\mathbf{r})\right] - \ln \Phi_0(s)$$
$$D(p_s\|p_1) = -\gamma_s + s\gamma_s - \ln \Phi_0(s) = -\gamma_s + I_0(\gamma_s)$$

Summary of Simple Relationships
Find I0 as a function of the threshold; subtract threshold to get I1.
Plot bounds.
Vary parameters to gain a better understanding.
Information is additive. N i.i.d.: (N I0, N I1)

$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \,\middle|\, H_1\right] = E\left[\exp\left((\sigma+1) \ln \frac{p(\mathbf{R}|H_1)}{p(\mathbf{R}|H_0)}\right) \,\middle|\, H_0\right]$$
$$\Phi_1(\sigma) = \Phi_0(\sigma+1)$$
$$I_0(\gamma) = I_1(\gamma) + \gamma$$
$$\Phi_1(0) = \Phi_0(0) = 1$$
$$\Phi_0(s) \text{ is a convex function}$$
$$\gamma = -D(p_0\|p_1): \quad (0,\; D(p_0\|p_1)) \text{ is a point}$$
$$\gamma = D(p_1\|p_0): \quad (D(p_1\|p_0),\; 0) \text{ is a point}$$
$$\gamma = 0: \quad (D(p_s\|p_0),\; D(p_s\|p_1)) \text{ is a point}, \quad \frac{d}{ds}\ln \Phi_0(s) = 0$$
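These relationships can be checked on the Gaussian example with the same covariance and different means listed in the outline. The sketch below assumes $\mathbf{r} \sim \mathcal{N}(\mathbf{m}_i, \mathbf{K})$ under $H_i$ and writes $d^2 = (\mathbf{m}_1 - \mathbf{m}_0)^T \mathbf{K}^{-1} (\mathbf{m}_1 - \mathbf{m}_0)$; the symbols $\mathbf{m}_i$, $\mathbf{K}$, and $d^2$ are introduced here for illustration and do not appear on the slide.

```latex
\begin{align*}
l(\mathbf{r}) \,|\, H_0 &\sim \mathcal{N}\!\left(-\tfrac{d^2}{2},\, d^2\right), \qquad
l(\mathbf{r}) \,|\, H_1 \sim \mathcal{N}\!\left(+\tfrac{d^2}{2},\, d^2\right) \\
\Phi_0(s) &= \exp\!\left(\tfrac{1}{2}\, s(s-1)\, d^2\right), \qquad
\Phi_1(\sigma) = \exp\!\left(\tfrac{1}{2}\, \sigma(\sigma+1)\, d^2\right) = \Phi_0(\sigma+1) \\
\gamma_s &= \frac{d}{ds}\ln \Phi_0(s) = \left(s - \tfrac{1}{2}\right) d^2 \\
I_0(\gamma) &= \frac{\left(\gamma + d^2/2\right)^2}{2 d^2}, \qquad
I_1(\gamma) = I_0(\gamma) - \gamma = \frac{\left(\gamma - d^2/2\right)^2}{2 d^2},
\qquad -\tfrac{d^2}{2} \le \gamma \le \tfrac{d^2}{2}
\end{align*}
```

The three special points are then $(0, d^2/2)$ at $\gamma = -d^2/2$, $(d^2/2, 0)$ at $\gamma = d^2/2$, and $(d^2/8, d^2/8)$ at $\gamma = 0$, with $D(p_0\|p_1) = D(p_1\|p_0) = d^2/2$ in this symmetric case.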

Information Rate Functions and Performance Bounds
Example: Exponential Distributions
Additivity of Information
Examples:
  Gaussian same covariance, different means
  Poisson different means
  Gaussian same mean, different variances

Summary of Simple Relationships
Find I0 as a function of the threshold; subtract threshold to get I1.
Plot bounds.
Vary parameters to gain a better understanding.

$$\Phi_1(\sigma) = E\left[e^{\sigma\, l(\mathbf{r})} \,\middle|\, H_1\right]$$
$$\Phi_1(\sigma) = \Phi_0(\sigma+1)$$
$$I_0(\gamma) = I_1(\gamma) + \gamma$$
$$\Phi_1(0) = \Phi_0(0) = 1$$
$$\Phi_0(s) \text{ is a convex function}$$
$$\gamma = -D(p_0\|p_1): \quad (0,\; D(p_0\|p_1)) \text{ is a point}$$
$$\gamma = D(p_1\|p_0): \quad (D(p_1\|p_0),\; 0) \text{ is a point}$$
$$\gamma = 0: \quad (D(p_s\|p_0),\; D(p_s\|p_1)) \text{ is a point}, \quad \frac{d}{ds}\ln \Phi_0(s) = 0$$
$$0 \le s \le 1$$

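Example: Exponential Distributions

$$H_0: \; r_i \sim \lambda e^{-\lambda R_i}, \quad R_i \ge 0, \; i = 1, 2, \ldots, N, \text{ i.i.d.}$$
$$H_1: \; r_i \sim \alpha e^{-\alpha R_i}, \quad R_i \ge 0, \; i = 1, 2, \ldots, N, \text{ i.i.d.}$$
$$\lambda > \alpha$$
$$l(\mathbf{R}) = (\lambda - \alpha) \sum_{i=1}^{N} R_i + N \ln \frac{\alpha}{\lambda}$$
$$\Phi_0(s) = E\left[e^{s\, l(\mathbf{r})} \,\middle|\, H_0\right] = \int_{-\infty}^{\infty} \left[p(\mathbf{R}|H_0)\right]^{1-s} \left[p(\mathbf{R}|H_1)\right]^{s}\, d\mathbf{R}$$
$$\Phi_0(s) = \prod_{i=1}^{N} \int_{-\infty}^{\infty} \left[p(R_i|H_0)\right]^{1-s} \left[p(R_i|H_1)\right]^{s}\, dR_i$$

The moment generating function for N i.i.d. components is the N-th power of the one-component moment generating function.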

Ex: Exponential Distributions
Compute the moment generating functions and information rate functions.
Plot parametric curves as functions of s: γs, I0(γs), I1(γs) = -γs + I0(γs).

One component:

$$\Phi_0(s) = \int_{0}^{\infty} \lambda^{1-s} \alpha^{s}\, e^{-((1-s)\lambda + s\alpha) R}\, dR = \frac{\lambda^{1-s} \alpha^{s}}{(1-s)\lambda + s\alpha}$$
$$I_0(\gamma) = \sup_{s \ge 0} \left[ s\gamma - \ln \Phi_0(s) \right]$$
$$\gamma_s = \frac{d}{ds} \ln \Phi_0(s) = \frac{\frac{d}{ds}\Phi_0(s)}{\Phi_0(s)} = E_s\left[l(r)\right] = \frac{\lambda - \alpha}{(1-s)\lambda + s\alpha} + \ln \frac{\alpha}{\lambda}$$
$$I_0(\gamma_s) = s \left[ \frac{\lambda - \alpha}{(1-s)\lambda + s\alpha} + \ln \frac{\alpha}{\lambda} \right] - \ln \frac{\lambda^{1-s} \alpha^{s}}{(1-s)\lambda + s\alpha}$$
$$I_0(\gamma_s) = \frac{\lambda}{(1-s)\lambda + s\alpha} - 1 - \ln \frac{\lambda}{(1-s)\lambda + s\alpha}$$
$$I_1(\gamma_s) = \frac{\alpha}{(1-s)\lambda + s\alpha} - 1 - \ln \frac{\alpha}{(1-s)\lambda + s\alpha}$$
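The closed forms above can be sanity-checked numerically. The following is a minimal sketch, not part of the original lecture code: it compares the parametric I0 with a brute-force supremum over a grid of s, and confirms I1 = I0 - γ. The values of lambda and alpha match the lecture's plots; the grid sizes and variable names are illustrative choices.

```matlab
% Sketch: check the closed-form I_0 against a brute-force Legendre transform.
% lambda, alpha match the lecture's example; grids are illustrative choices.
lambda = 1; alpha = 0.5;

% One-component log-moment generating function under H0.
lnPhi0 = @(s) (1-s)*log(lambda) + s*log(alpha) - log((1-s)*lambda + s*alpha);

% Parametric closed forms, evaluated at a few tilt parameters s.
s = 0.1:0.1:0.9;
D = (1-s)*lambda + s*alpha;
gammas = (lambda-alpha)./D + log(alpha/lambda);
info0 = lambda./D - 1 - log(lambda./D);
info1 = alpha./D - 1 - log(alpha./D);

% Brute-force supremum of s*gamma - ln Phi_0(s) over a fine grid of s.
sgrid = 0:1e-4:1;
info0num = zeros(size(gammas));
for n = 1:numel(gammas)
    info0num(n) = max(sgrid*gammas(n) - lnPhi0(sgrid));
end

maxErr0 = max(abs(info0num - info0));            % grid search vs closed form
maxErr1 = max(abs((info0 - gammas) - info1));    % I1 = I0 - gamma
fprintf('max |I0 grid - I0 closed form| = %.2e\n', maxErr0);
fprintf('max |I0 - gamma - I1|          = %.2e\n', maxErr1);
```

Restricting the grid to 0 <= s <= 1 is sufficient here because the objective is concave in s and each threshold γs was generated from a tilt parameter inside that interval.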

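Example continued
We know exact performance.
The thresholds are related.
The log-probabilities can be compared to bounds.

$$p_{l|H_i}(L|H_i) = \frac{L^{N-1}\, e^{-L/\mu_i}}{\mu_i^{N}\, (N-1)!}, \quad L \ge 0$$
$$\mu_0 = \frac{1}{\lambda}, \qquad \mu_1 = \frac{1}{\alpha}$$
$$P_D = \int_{\gamma'}^{\infty} p_{l|H_1}(L|H_1)\, dL, \qquad P_F = \int_{\gamma'}^{\infty} p_{l|H_0}(L|H_0)\, dL$$
$$l(\mathbf{R}) = (\lambda - \alpha) \sum_{i=1}^{N} R_i + N \ln \frac{\alpha}{\lambda}$$
$$\sum_{i=1}^{N} R_i = \frac{1}{\lambda - \alpha} \left[ l(\mathbf{R}) - N \ln \frac{\alpha}{\lambda} \right]$$
$$\gamma' = \frac{1}{\lambda - \alpha} \left[ \gamma - N \ln \frac{\alpha}{\lambda} \right], \qquad \gamma'' = \lambda \gamma'$$
$$P_F = e^{-\lambda\gamma'} \sum_{k=0}^{N-1} \frac{(\lambda\gamma')^k}{k!} = e^{-\gamma''} \sum_{k=0}^{N-1} \frac{(\gamma'')^k}{k!}$$
$$P_D = \exp\left(-\frac{\alpha\gamma''}{\lambda}\right) \sum_{k=0}^{N-1} \frac{1}{k!} \left(\frac{\alpha\gamma''}{\lambda}\right)^k$$
$$\frac{1}{N} \ln P_F(\gamma'') = \frac{1}{N} \ln P_F\left( \frac{\lambda}{\lambda - \alpha} \left[ \gamma - N \ln \frac{\alpha}{\lambda} \right] \right)$$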
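The finite-sum expression for P_F can be compared with the regularized upper incomplete gamma function and with the Chernoff bound exp(-N I0). The sketch below is an illustration under assumed parameter values (N, lambda, alpha, and the tilt s are chosen here, with lambda and alpha matching the lecture's plots); it relies on MATLAB's `gammainc(x, a, 'upper')`.

```matlab
% Sketch: exact false-alarm probability for the sum of N exponentials,
% compared with the Chernoff bound exp(-N*I_0). Values are illustrative.
N = 10; lambda = 1; alpha = 0.5;

s = 0.5;                                          % tilt parameter, 0 <= s <= 1
D = (1-s)*lambda + s*alpha;
gammas = (lambda-alpha)/D + log(alpha/lambda);    % per-sample threshold gamma_s
info0 = lambda/D - 1 - log(lambda/D);             % I_0(gamma_s)

% Threshold on lambda*sum(R_i):
% gamma'' = N*lambda/(lambda-alpha) * (gamma_s - ln(alpha/lambda))
gpp = N*lambda/(lambda-alpha) * (gammas - log(alpha/lambda));

k = 0:N-1;
pfSum = exp(-gpp) * sum(gpp.^k ./ factorial(k));  % finite-sum form of P_F
pfInc = gammainc(gpp, N, 'upper');                % regularized upper incomplete gamma
bound = exp(-N*info0);                            % Chernoff bound on P_F
fprintf('P_F (sum) = %.4e, P_F (gammainc) = %.4e, bound = %.4e\n', ...
        pfSum, pfInc, bound);
```

Sweeping s over [0, 1] instead of fixing it traces out the same comparison along the whole bound curve.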

Matlab Code

function [pf,pd]=gammainforate(N,lambda,alpha)
% Exact P_F and P_D for the sum of N i.i.d. exponentials (assumes N >= 2),
% plus the parametric information rate functions.
gamma=0:N/200:(N+5);              % threshold gamma'' = lambda*gamma'
k=1:(N-1);
kfac=factorial(k);
gamexp=[gamma];
if N>2,                           % rows 2..N-1 hold gamma.^kk
  for kk=2:N-1;
    gamexp=[gamexp; gamma.^kk];
  end
end
pf=exp(-gamma).*(1+(1./kfac)*gamexp);
gd=alpha*gamma/lambda;            % same threshold scaled for hypothesis 1
gamexp=[gd];
if N>2,
  for kk=2:N-1;
    gamexp=[gamexp; gd.^kk];
  end
end
pd=exp(-gd).*(1+(1./kfac)*gamexp);
s=0:0.01:1;                       % tilt parameter
info0=lambda./((1-s)*lambda+s*alpha)-1-log(lambda./((1-s)*lambda+s*alpha));
gammas=(lambda-alpha)./((1-s)*lambda+s*alpha)+log(alpha/lambda);
info1=alpha./((1-s)*lambda+s*alpha)-1-log(alpha./((1-s)*lambda+s*alpha));

[Figure: ROC in blue, Bound in red. P_D versus P_F, on linear and logarithmic axes.]
[Figure: I0 in blue, I1 in red. Information Rates versus Threshold γ.]
alpha=.5, lambda=1, N=10, 100

Matlab Code

figure1 = figure;
axes('Parent',figure1,'PlotBoxAspectRatio',[1 1.049 2.098],'LineWidth',1.5,...
  'FontSize',16,'DataAspectRatio',[1 1 1]);
ylim([0 1]);
box('on'); hold('all');
% Create plot
plot(pf,pd,'LineWidth',1.5,'Color',[0 0 1]);
plot(exp(-N*info0),1-exp(-N*info1),'LineWidth',1.5,'Color',[1 0 0]);
xlabel('P_F','FontSize',16);
ylabel('P_D','FontSize',16);
title('ROC in blue, Bound in red','FontSize',16);

figure1 = figure;
axes('Parent',figure1,'LineWidth',1.5,'FontSize',16);
box('on'); hold('all');
% Create plot
plot(gammas,info0,'LineWidth',1.5,'Color',[0 0 1]);
plot(gammas,info1,'LineWidth',1.5,'Color',[1 0 0]);
axis tight
xlabel('Threshold \gamma','FontSize',16);
ylabel('Information Rates','FontSize',16);
title('I_0 in blue, I_1 in red','FontSize',16);

[Figure: ROC in blue, Bound in red. P_D versus P_F.]
[Figure: I0 in blue, I1 in red. Information Rates versus Threshold γ.]

Relationship of Bounds to Log-Probability

gammas1=gamma*(lambda-alpha)/(N*lambda)+log(alpha/lambda);   % per-sample threshold gamma/N
plot(gammas1,-log(pf)/N,'LineWidth',1.5,'Color',[0 1 0]);
plot(gammas1,-log(1-pd)/N,'LineWidth',1.5,'Color',[0 1 0]);

$$\frac{1}{N} \ln P_F(\gamma'') = \frac{1}{N} \ln P_F\left( \frac{\lambda}{\lambda - \alpha} \left[ \gamma - N \ln \frac{\alpha}{\lambda} \right] \right)$$
$$\frac{\gamma''}{N}\, \frac{\lambda - \alpha}{\lambda} + \ln \frac{\alpha}{\lambda} = \frac{\gamma}{N}$$
$$-\frac{1}{N} \ln P_F\left( \frac{N\lambda}{\lambda - \alpha} \left[ \gamma_s - \ln \frac{\alpha}{\lambda} \right] \right) \;\xrightarrow[N \to \infty]{}\; I_0(\gamma_s)$$

[Figure: Information Rates versus Threshold γ, with the normalized log-probabilities overlaid.]

Information is Additive
Relative entropy of product distributions equals the sum of the relative entropies.
Log-moment generating functions add.
Information rate functions add.
Information is additive.
N i.i.d.: (N I0, N I1)
Exponential error bounds.
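The additivity statements can be written out for N i.i.d. components. In the sketch below, $p$ and $q$ denote the one-component densities; the superscript $(N)$ for N-component quantities is notation introduced here for illustration.

```latex
\begin{align*}
D\!\left( \prod_{i=1}^{N} p \,\Big\|\, \prod_{i=1}^{N} q \right)
  &= \int \prod_{j=1}^{N} p(R_j) \sum_{i=1}^{N} \ln \frac{p(R_i)}{q(R_i)}\, d\mathbf{R}
   = \sum_{i=1}^{N} \int p(R_i) \ln \frac{p(R_i)}{q(R_i)}\, dR_i
   = N\, D(p\|q) \\
\ln \Phi_0^{(N)}(s) &= N \ln \Phi_0(s) \\
I_0^{(N)}(N\gamma) &= \sup_{s \ge 0} \left[ sN\gamma - N \ln \Phi_0(s) \right] = N\, I_0(\gamma) \\
P_F &\le e^{-N I_0(\gamma)}, \qquad P_M \le e^{-N I_1(\gamma)}
\end{align*}
```

Here $\gamma$ is the per-sample threshold, so the threshold on the N-sample log-likelihood ratio is $N\gamma$.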

