
An Adaptive Detection Algorithm

E.J. KELLY
Lincoln Laboratory

A general problem of signal detection in a background of unknown Gaussian noise is addressed, using the techniques of statistical hypothesis testing. Signal presence is sought in one data vector, and another independent set of signal-free data vectors is available which share the unknown covariance matrix of the noise in the former vector. A likelihood ratio decision rule is derived and its performance evaluated in both the noise-only and signal-plus-noise cases.

Manuscript received February 18, 1985; revised June 26, 1985.

This work was supported by the United States Air Force under Contract F19628-85-C-0002.

The views expressed are those of the author and do not reflect the official policy or position of the U.S. Government.

Author's address: Lincoln Laboratory, Massachusetts Institute of Technology, P.O. Box 73, Lexington, MA 02173.

0018-9251/86/0300-0115 $1.00 © 1986 IEEE

I. INTRODUCTION

In a well-known paper [1], Reed, Mallett, and Brennan (RMB) discuss an adaptive procedure for the detection of a signal of known form in the presence of noise (interference) which is assumed to be Gaussian, but whose covariance matrix is totally unknown. Two sets of input data are used, which for convenience will be called the primary and secondary inputs. The possibility of signal presence is accepted for the primary data, while the secondary inputs are assumed to contain only noise, independent of and statistically identical to the noise components of the primary data.

In the RMB procedure, the secondary inputs are used to form an estimate of the noise covariance, from which a weight vector for the detection of the known signal is determined. This weight vector is then applied to the primary data in the form of a standard colored noise matched filter. The implication is that the output of this filter is compared with a threshold for signal detection, but no rule is given for the determination of this threshold, whose value controls the probability of false alarm (PFA). In fact, no predetermined threshold can be assigned to achieve a given PFA, since the detector is supposed to operate in an interference environment of unknown form and intensity.

Instead, the RMB paper provides an analysis of the signal-to-noise ratio (SNR) of the filter output, for given values of the secondary data. This SNR is a function of the secondary data and is therefore a random variable. The probability density function (PDF) of this SNR is deduced, and this PDF has the remarkable property of being independent of the actual noise covariance matrix; it is a function only of the dimensional parameters of the problem. In the intended application, the secondary data would (hopefully) be sufficient in quantity to support a good estimate of the noise covariance, and a threshold could presumably be determined from this estimate. The resulting final decision statistic is then more complicated than the matched filter output itself, and being a nonlinear function of the secondary inputs, the detection and false alarm probabilities are not functions of the SNR alone, leaving the actual performance of the procedure undetermined.

In this study the original problem (with a slight generalization of the signal model) is reconsidered as an exercise in hypothesis testing, and the ad hoc RMB procedure is replaced by a likelihood ratio test. No optimality properties are claimed for this test, involving as it does the maximization of two likelihood functions over a set of unknown parameters. The form of the test is, however, reasonable, and the RMB matched filter output appears as a portion of the likelihood ratio detection statistic. This test exhibits the desirable property that its PFA is independent of the covariance matrix (level and structure) of the actual noise encountered. This is a generalization of the familiar constant false alarm rate (CFAR) behavior of detectors using scalar input data, in which only the level of the noise is unknown. In

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. AES-22, NO. 1 MARCH 1986 115

addition, it is shown that the effect of signal presence depends only on the dimensional parameters of the problem and a parameter which is the same as the SNR of a conventional colored noise matched filter.

The PDF of this detection statistic itself is derived here, for both the noise-alone case and the signal-plus-noise case. In the former, the PDF proves to be very simple, and the dependence of the PFA on threshold is exactly the same as that of a simple scalar CFAR problem, in which detection is based on one complex sample and the threshold is proportional to the sum of the squares of a number of independent noise samples. The corresponding PDF in the presence of a signal component exhibits the effect of an SNR loss factor, which obeys the same beta distribution as the SNR loss factor studied in the RMB paper. In fact, the behavior of the detector is identical to that of a simple scalar CFAR system with a fluctuating target, where the latter fluctuation is governed by the beta distribution instead of one of the more usual radar models.

The final detection probability of the likelihood ratio algorithm is obtained in the form of a finite series. The method used here leans heavily on the techniques of the RMB paper (which in turn leans directly on the analysis of Capon and Goodman [2]) with the difference that the matrix transformations required here are carried out directly on the variables of the problem, so that much less reliance is placed on known properties of the Wishart distribution.

It should be emphasized that the output of this likelihood ratio algorithm, like that of the RMB procedure, is a decision on signal presence, and not a sequence of processed data samples from which the interference component has been reduced (nulled) and in which actual signal detection remains to be accomplished. For this reason the direct application of the algorithm to a real radar problem would require the storage of data from an array of inputs (such as adaptive array elements), perhaps also sampled to form range-gated outputs for each pulse and collected for a sequence of pulses (such as those forming a coherent processing interval). The practicality of such a processor, which involves the inversion of a correspondingly large matrix, or the possibility of its simplification, are topics not addressed in this paper.

II. FORMULATION OF THE PROBLEM

The mathematical setting for the formulation of this detection problem is actually quite general, but it is introduced here first in a relatively specific way, in order to lend concreteness by way of example. Suppose that the antenna system of a radar provides a number, say N_a, of RF signals. These may be the outputs of array elements, subarrays, beamformers, or any mix of the above. The radar waveform is supposed to be a simple burst of identical pulses, say N_p in number, and target detection is to be based upon the returns from this burst.

In effect, the radar front end carries out amplification, filtering, and reduction to baseband, at which point the quadrature signals are subjected to pulse compression, the final stage of filtering. The order in which these things are carried out is immaterial to the present model, since it is not addressed to the problems of realization and channel matching, although these are of great importance in practice. The I and Q output pairs are next sampled to form range-gate samples for each pulse, say, N_g range gates/pulse. This results in a total of N_aN_pN_g complex samples for the burst. Signal presence is sought in one range gate at a time, hence the primary data consists of the N_aN_p samples from a single, unnamed range gate. These samples are arranged in a column vector z of dimension N = N_aN_p. The secondary data consist of the outputs of K range gates, forming a subset of the N_g − 1 remaining ones, and these are described by the set of vectors z(k), where k = 1, …, K. The decision rule will be formulated in terms of the totality of input data, without the a priori assignment of different functions to the primary and secondary inputs.

The secondary data are assumed to be free of signal components, at least in the design of the algorithm, and any selection rules applied to make this assumption more plausible are ignored. The primary data may contain a signal vector, written in the form bs, where b is an unknown complex scalar amplitude, and s is a column vector of N components describing the signal which is sought. The modeled variation of signal amplitude and phase among the array inputs is included in s, as well as pulse-to-pulse variations, such as those relating to a particular target Doppler velocity. The problem of unknown Doppler, or other unknown signal parameters, is mentioned briefly below. It should be noted that the signal vector s can be normalized in any convenient way, since an unknown amplitude factor is already included, and we retain the freedom to assign a norm to s at a later point, where it will be most advantageous to make a specific choice.

The total noise components of the data vectors, representing all sources of internal and external noise and interference, are modeled as zero-mean complex Gaussian random vectors. The noise component of the primary vector z is characterized by the unknown covariance matrix M. Each of the z(k) is assumed to share this N × N covariance matrix, and the vectors z and the z(k) are all mutually independent. All Gaussian vectors are assumed to have the circular property usually associated with I and Q pairs.

The key features of this model are the Gaussian assumption, the independence of the range-gated output vectors, and the assumption of a common covariance matrix. The structure of the N vectors, in particular the doubly indexed model used to account for multiple pulses and multiple array outputs, is not exploited at all in the following, and a notation for the components of these vectors is not required.
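The statistical model of this section — mutually independent, zero-mean, circular complex Gaussian vectors z and z(k) sharing an unknown N × N covariance matrix M — is easy to simulate. The following sketch (NumPy; the matrix M, the seed, and all variable names are illustrative choices of ours, not the paper's) draws such data and checks that the sample covariance of the secondaries approaches M:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 4, 20000  # vector dimension and number of secondary vectors (illustrative)

# An arbitrary Hermitian positive definite covariance matrix M.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
M = A @ A.conj().T + N * np.eye(N)
L = np.linalg.cholesky(M)

# Circular complex Gaussian vectors: real and imaginary parts independent,
# each with variance 1/2, then colored by a square root of M.
W = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
Z = W @ L.T                      # row k is the secondary vector z(k)

S = Z.T @ Z.conj()               # S = sum_k z(k) z(k)^dagger (used in Section III)
M_sample = S / K                 # sample covariance matrix
err = np.max(np.abs(M_sample - M))
print(err)                       # shrinks as K grows
```

With the seed fixed, the maximum entrywise deviation is already small at K = 20000; the circularity convention (independent real and imaginary parts of equal variance) matches the usual I and Q model mentioned above.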


III. THE LIKELIHOOD RATIO TEST

Consider a single input vector from the secondary data set, say z(k). If the covariance matrix of this vector is M,

M = E\{z(k)\, z(k)^\dagger\}

then the N-dimensional Gaussian PDF of this complex random vector is

f[z(k)] = \frac{1}{\pi^N \|M\|} \exp\left[ -z(k)^\dagger M^{-1} z(k) \right].

In the notation used here the double bars signify the determinant of a matrix, and the superscript dagger symbolizes the conjugate transpose of a vector. Each of the secondary data vectors has this same PDF, and under the noise-alone hypothesis, the primary vector does so as well, hence the joint PDF of all the input data is the product:

f_0[z, z(1), \ldots, z(K)] = f[z] \prod_{k=1}^{K} f[z(k)].

If v is any N vector, we can write the following inner product in the form of a matrix trace:

v^\dagger M^{-1} v = \mathrm{tr}(M^{-1} V)

where V is the outer product matrix

V = v v^\dagger.

When this equivalence is applied to all the factors of the joint PDF, it is seen that the latter may be written in the convenient form

f_0[z, z(1), \ldots, z(K)] = \left\{ \frac{1}{\pi^N \|M\|} \exp\left[ -\mathrm{tr}(M^{-1} T_0) \right] \right\}^{K+1}

where

T_0 = \frac{1}{K+1} \left[ z z^\dagger + \sum_{k=1}^{K} z(k)\, z(k)^\dagger \right].

Under the signal-plus-noise hypothesis, the z(k) have the same PDF as before, and the PDF of the primary vector is obtained by replacing z by

z - E{z} = z - bs.

The resulting joint PDF of the inputs is then

f_1[z, z(1), \ldots, z(K)] = \left\{ \frac{1}{\pi^N \|M\|} \exp\left[ -\mathrm{tr}(M^{-1} T_1) \right] \right\}^{K+1}

where now

T_1 = \frac{1}{K+1} \left[ (z - bs)(z - bs)^\dagger + \sum_{k=1}^{K} z(k)\, z(k)^\dagger \right].

In the likelihood ratio testing procedure, the PDF of the inputs is maximized over all unknown parameters, separately for each of the two hypotheses. The ratio of these maxima is the detection statistic, and the hypothesis whose PDF is in the numerator is accepted as true if it exceeds some preassigned threshold. The maximizing parameter values are, by definition, the maximum likelihood (ML) estimators of these parameters, hence the maximized PDFs are obtained by replacing the unknown parameters by their ML estimators.

We begin with the noise-alone hypothesis, maximizing over the unknown covariance matrix M. Of all positive definite M matrices, the one which maximizes the expression inside the curly brackets of this PDF is simply T_0. This is equivalent to the statement that the ML estimator of a covariance matrix is equal to the sample covariance matrix, which is well known [3]. When this estimator is substituted in the PDF, the trace which appears there becomes the trace of the N × N unit matrix, which is just N, and we find

\max_M f_0 = \left[ \frac{1}{(e\pi)^N \|T_0\|} \right]^{K+1}.

The same procedure applied to the signal-plus-noise hypothesis yields the formula

\max_M f_1 = \left[ \frac{1}{(e\pi)^N \|T_1\|} \right]^{K+1}

and it remains to maximize this expression over the complex unknown signal amplitude b. Since b appears only in this PDF, we can form a likelihood ratio L(b) at this point and subsequently maximize it over b. It is more convenient to work with the (K+1)st root of this ratio, and we put

L(b) = \{\ell(b)\}^{K+1}.

Obviously,

\ell(b) = \frac{\|T_0\|}{\|T_1\|}

and the final likelihood ratio test takes the form

\ell = \max_b \ell(b) = \frac{\|T_0\|}{\min_b \|T_1\|} > \ell_0.

The threshold parameter on the right will evidently be greater than unity, since the denominator on the left equals the numerator for the choice b = 0, and we are maximizing over b.

To proceed, we define the matrix

S = \sum_{k=1}^{K} z(k)\, z(k)^\dagger

which involves only the secondary data. This matrix is K times the sample covariance matrix of these data, and it satisfies the well-known Wishart distribution. The only property of this distribution that we need here is the fact that for K > N, a condition we now impose, the matrix S is nonsingular with probability one. S is, of course, positive definite, and hence Hermitian. We use an easily proved lemma to evaluate the determinants of both sides of the equation

KELLY: ADAPTIVE DETECTION ALGORITHM 117

(K+1)\, T_0 = S + z z^\dagger

with the result

(K+1)^N \|T_0\| = \|S\| \left( 1 + z^\dagger S^{-1} z \right).

Similarly, we have

(K+1)^N \|T_1\| = \|S\| \left( 1 + (z - bs)^\dagger S^{-1} (z - bs) \right).

Now is a good time to minimize this quantity over b, and we do this by completing the square:

(z - bs)^\dagger S^{-1} (z - bs) = (z^\dagger S^{-1} z) + |b|^2 (s^\dagger S^{-1} s) - 2\,\mathrm{Re}\left\{ b\, (z^\dagger S^{-1} s) \right\}

= (z^\dagger S^{-1} z) + (s^\dagger S^{-1} s) \left| b - \frac{(s^\dagger S^{-1} z)}{(s^\dagger S^{-1} s)} \right|^2 - \frac{|(s^\dagger S^{-1} z)|^2}{(s^\dagger S^{-1} s)}.

The minimum is clearly attained when the positive factor containing b is made to vanish, and the resulting likelihood ratio is given by

\ell = \max_b \ell(b) = \frac{1 + (z^\dagger S^{-1} z)}{1 + (z^\dagger S^{-1} z) - \dfrac{|(s^\dagger S^{-1} z)|^2}{(s^\dagger S^{-1} s)}}.

It is convenient to introduce the quantity η, defined by

\eta = \frac{|(s^\dagger S^{-1} z)|^2}{(s^\dagger S^{-1} s)\,[1 + (z^\dagger S^{-1} z)]}

so that

\ell = \frac{1}{1 - \eta}.

Then the test

\ell > \ell_0

is equivalent to the test

\eta > \eta_0 = 1 - \frac{1}{\ell_0}.

We note that η lies between the values zero and one.
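The chain of identities above — the determinant ratio ℓ = ‖T_0‖/min_b ‖T_1‖, the minimizing amplitude obtained by completing the square, and the equivalence ℓ = 1/(1 − η) — can be verified numerically. A minimal sketch (NumPy, on illustrative data; the seed and all variable names are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 4, 12  # K > N, so S is nonsingular with probability one

z = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
Zk = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # signal vector

S = Zk.T @ Zk.conj()                        # S = sum_k z(k) z(k)^dagger
Si = np.linalg.inv(S)

# eta as defined in the text.
eta = (np.abs(s.conj() @ Si @ z) ** 2
       / ((s.conj() @ Si @ s).real * (1 + (z.conj() @ Si @ z).real)))

# ell computed directly as the determinant ratio ||T0|| / min_b ||T1||,
# with the minimizing b from the completed square.
T0 = (S + np.outer(z, z.conj())) / (K + 1)
b_hat = (s.conj() @ Si @ z) / (s.conj() @ Si @ s)
d = z - b_hat * s
T1 = (S + np.outer(d, d.conj())) / (K + 1)
ell = (np.linalg.det(T0) / np.linalg.det(T1)).real

print(eta, ell)
```

The check confirms both that ℓ = 1/(1 − η) and that η falls strictly between zero and one, as stated above.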

If the target model is generalized, so that the signal vector still contains one or more unknown parameters (such as target Doppler), the likelihood ratio obtained above must next be maximized over these parameters. It is clear that this is equivalent to maximizing η itself over the remaining target parameters. This maximization generally cannot be carried out explicitly, and the standard technique is to approximate it by evaluating the test statistic, in this case η, for a discrete set of target parameters, forming a filter bank, and declaring target presence if any filter output exceeds the threshold. Our purpose in discussing this here is only to show how our test can be generalized in this straightforward way, but from now on we ignore any additional target parameters, which is equivalent to concentrating on the performance of a single member of the filter bank.

For comparison with the RMB procedure, we introduce M̂, the ML estimator of the noise covariance, based on the secondary data alone. We have already noted that this estimator is equal to

\hat{M} = \frac{1}{K} S.

The likelihood ratio test can then be written in the form

\frac{|(s^\dagger \hat{M}^{-1} z)|^2}{(s^\dagger \hat{M}^{-1} s)\left[ 1 + \frac{1}{K}(z^\dagger \hat{M}^{-1} z) \right]} > K \eta_0.

We note that the secondary inputs enter this test only through the sample covariance matrix M̂ and also that

(s^\dagger \hat{M}^{-1} z) = (\hat{w}^\dagger z)

where ŵ is the RMB weight vector

\hat{w} = \hat{M}^{-1} s.

The RMB test itself is just

|(\hat{w}^\dagger z)|^2 > \text{threshold}

which has the form of the colored noise matched filter test, with M̂ replacing the usual known covariance matrix of the noise.

The presence of the signal-dependent factor in the denominator of the expression for η causes this detection statistic to be unchanged if the signal vector is altered by a scalar factor. Since the normalization of this vector has been left arbitrary, this invariance is highly desirable. In effect, this factor in the denominator is normalizing s for us, in terms of the estimated noise covariance. The entire detection statistic is also invariant to a common change of scale of all the input data vectors, a minimal CFAR requirement. Further properties of η are developed in the following section.

In the limit of very large K, one expects the estimator M̂ to converge to the true covariance matrix M, at least in probability. Moreover, it can be shown that the quantity

(z^\dagger M^{-1} z)

an inner product utilizing the actual covariance matrix instead of its estimator, obeys the chi-squared distribution, with 2N degrees of freedom, and hence this term, when divided by K, converges to zero in probability, as K grows without bound. In this sense the likelihood ratio test passes over into the conventional colored noise matched filter test, as the number of sample vectors in the secondary data set becomes very large.
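The passage to the conventional matched filter can be seen numerically: the extra term (1/K)(z†M̂⁻¹z) in the denominator of the test dies away as the number of secondary vectors grows. A short sketch (NumPy; the covariance M, the seed, and the chosen values of K are illustrative assumptions of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
M = A @ A.conj().T + N * np.eye(N)
L = np.linalg.cholesky(M)

def draw(K):
    """K independent circular complex Gaussian rows with covariance M."""
    W = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    return W @ L.T

z = draw(1)[0]                    # primary vector (noise only)
extras = []
for K in (8, 80, 8000):
    Zk = draw(K)
    M_hat = (Zk.T @ Zk.conj()) / K            # ML estimator from the secondaries
    extra = (z.conj() @ np.linalg.inv(M_hat) @ z).real / K
    extras.append(extra)
print(extras)   # the term (1/K) z^dag M_hat^-1 z shrinks as K grows
```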

IV. PROPERTIES OF THE LIKELIHOOD RATIO TEST

The likelihood ratio test is discussed in terms of the random variable η, the decision statistic eventually obtained in the preceding section. The definition of η, as well as that of the matrix S on which it depends, are reproduced here for convenience:


\eta = \frac{|(s^\dagger S^{-1} z)|^2}{(s^\dagger S^{-1} s)\,[1 + (z^\dagger S^{-1} z)]}

S = \sum_{k=1}^{K} z(k)\, z(k)^\dagger.

The random variable η is, of course, a function of both the primary and secondary data, and as a preliminary to discussing its actual PDF, some useful properties are first derived. We begin with the noise-alone case, and assume that the actual noise covariance matrix is M.

The matrix M is positive definite, and hence a positive definite square root matrix can be defined. Since M can be diagonalized by a unitary transformation, it can be represented in the form

M = U \Lambda U^\dagger

where the columns of the unitary matrix U are eigenvectors of M, and Λ is diagonal. The diagonal elements of Λ, say λ(n), (n = 1, …, N), are the real, positive eigenvalues of M. In case of degeneracy of an eigenvalue, the corresponding eigenvectors are assumed to have been orthogonalized. The square root may be defined by the representation

M^{1/2} = U \Lambda^{1/2} U^\dagger

where Λ^{1/2} is diagonal, with diagonal elements [λ(n)]^{1/2}. The matrix M^{-1/2} is similarly defined in terms of Λ^{-1/2}, and it is easily seen to be the inverse of M^{1/2}. Uniqueness of the square roots is not necessary for our purpose, only their existence and positive definite (hence also Hermitian) character.

Now consider the vector

\hat{z} = M^{-1/2} z

and the similarly transformed secondaries

\hat{z}(k) = M^{-1/2} z(k).

The new vectors are zero-mean Gaussian variables, but with covariance matrix equal to I_N, the N × N identity matrix. This follows directly from the definitions:

E\{\hat{z}\, \hat{z}^\dagger\} = M^{-1/2} E\{z z^\dagger\} M^{-1/2} = M^{-1/2} M M^{-1/2} = I_N

with identical reasoning for the transformed secondaries. The linear transformation introduced here is, of course, a whitening transformation.

We note that the scalar η depends on the data and signal vectors only through inner products. By inverting the whitening transformation we may evaluate, for example, the product

(z^\dagger S^{-1} z) = (\hat{z}^\dagger M^{1/2} S^{-1} M^{1/2} \hat{z}) = \left( \hat{z}^\dagger \left( M^{-1/2} S M^{-1/2} \right)^{-1} \hat{z} \right).

We define the new matrix

\hat{S} = M^{-1/2} S M^{-1/2}

and substitute for S, finding

\hat{S} = \sum_{k=1}^{K} M^{-1/2} z(k)\, z(k)^\dagger M^{-1/2} = \sum_{k=1}^{K} \hat{z}(k)\, \hat{z}(k)^\dagger.

Therefore, the new matrix Ŝ is K times the sample covariance matrix of the whitened secondaries, and the random variable

\Sigma \equiv (z^\dagger S^{-1} z) = (\hat{z}^\dagger \hat{S}^{-1} \hat{z})

is seen to be independent of M, being expressible as a function of K+1 independent Gaussian vectors, each of dimension N, and each sharing the covariance matrix I_N. The PDF of Σ, like that of the RMB signal-to-noise ratio, is therefore a universal function of the dimensional parameters, N and K, alone.

The other inner products in the decision statistic are handled in an analogous manner; thus

(s^\dagger S^{-1} z) = (s^\dagger M^{-1/2} \hat{S}^{-1} \hat{z}) = (t^\dagger \hat{S}^{-1} \hat{z})

where t stands for the whitened signal vector

t = M^{-1/2} s.

At this point we make the deferred definition of signal normalization, by taking t to be a unit vector:

(t^\dagger t) = (s^\dagger M^{-1} s) = 1.

This choice gives specific meaning to the signal amplitude parameter b, whose square is now a proper signal-to-noise ratio, for which we introduce the symbol a:

a \equiv |b|^2 = \left( E\{z\}^\dagger M^{-1} E\{z\} \right).

When the obvious substitutions are made in the final inner product, we obtain

\eta = \frac{|(t^\dagger \hat{S}^{-1} \hat{z})|^2}{(t^\dagger \hat{S}^{-1} t)\,[1 + (\hat{z}^\dagger \hat{S}^{-1} \hat{z})]}.
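The invariance used here — every inner product in η keeps its value when z, z(k), and s are replaced by their whitened counterparts ẑ, ẑ(k), and t — can be confirmed numerically. A sketch (NumPy; the eigendecomposition-based square root follows the construction in the text, while the covariance M, the seed, and the names are illustrative choices of ours):

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 4, 10

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
M = A @ A.conj().T + N * np.eye(N)

# M^{-1/2} via the eigendecomposition M = U Lambda U^dagger.
lam, U = np.linalg.eigh(M)
M_mhalf = U @ np.diag(1 / np.sqrt(lam)) @ U.conj().T

Lc = np.linalg.cholesky(M)
z = Lc @ (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
Zk = ((rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)) @ Lc.T
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)

def eta_of(z, Zk, s):
    S = Zk.T @ Zk.conj()
    Si = np.linalg.inv(S)
    return (np.abs(s.conj() @ Si @ z) ** 2
            / ((s.conj() @ Si @ s).real * (1 + (z.conj() @ Si @ z).real)))

eta_raw = eta_of(z, Zk, s)
eta_white = eta_of(M_mhalf @ z, Zk @ M_mhalf.T, M_mhalf @ s)  # z_hat, z_hat(k), t
print(eta_raw, eta_white)
```

The two evaluations agree to machine precision, which is exactly the statement that η depends on M only through the whitened quantities.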

The dependence on M is now confined to t, and it is shown below that even this dependence on the true covariance matrix is illusory. When a signal is present, z is replaced by

z = bs + n

where n has all the properties attributed to z in the noise-alone case. In this situation, the whitened data vector is

\hat{z} = M^{-1/2} z = b t + v

where

v = M^{-1/2} n

which is statistically identical to the whitened data vector in the noise-alone case. We have therefore found that when a signal is present, the PDF of η depends on M only through b and t, and the dependence on the unit vector t is again only apparent, as we now show.

Suppose the whitening transformation is followed by a unitary one, in which the whitened vectors are expressed as the products of a unitary matrix and a new set of random vectors. These new random vectors are


statistically indistinguishable from their predecessors, and it would only be confusing to introduce a new notation for them. Tracing this transformation through the inner products, we find that only the normalized signal vector is changed: t is replaced by

t_1 = U_1 t

where U_1 is the unitary matrix characterizing this last transformation. Any unit vector in the complex N space can be realized as t_1 by such a transformation. In particular, we can cause t_1 to be a coordinate vector, for which a single element is unity, the remaining (N − 1) elements vanishing. It is for this reason that the PDF of η depends on M only through the meaning of the signal amplitude parameter, b. In fact, this PDF can depend only on b, N, and K, and hence the false alarm probability of the likelihood ratio detector, namely

P_{FA} = \Pr\{\eta > \eta_0\}

is independent of M, and this is the generalized CFAR property claimed in Section I.
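The generalized CFAR property can be illustrated by Monte Carlo: under the noise-alone hypothesis, the empirical exceedance probability Pr{η > η₀} comes out the same whether the noise covariance is the identity or a strongly colored matrix. A sketch (NumPy; the two covariances, the threshold η₀ = 0.4, and the trial count are arbitrary illustrative choices, and no closed-form PFA is asserted):

```python
import numpy as np

rng = np.random.default_rng(4)
N, K, trials = 3, 9, 4000
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # fixed signal vector

def eta_samples(M, trials):
    """Noise-only samples of eta for noise covariance M."""
    L = np.linalg.cholesky(M)
    out = np.empty(trials)
    for i in range(trials):
        z = L @ (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
        Zk = ((rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N)))
              / np.sqrt(2)) @ L.T
        Si = np.linalg.inv(Zk.T @ Zk.conj())
        out[i] = (np.abs(s.conj() @ Si @ z) ** 2
                  / ((s.conj() @ Si @ s).real * (1 + (z.conj() @ Si @ z).real)))
    return out

M1 = np.eye(N)                                             # white noise
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
M2 = B @ B.conj().T + 0.1 * np.eye(N)                      # strongly colored noise

eta0 = 0.4
pfa1 = np.mean(eta_samples(M1, trials) > eta0)
pfa2 = np.mean(eta_samples(M2, trials) > eta0)
print(pfa1, pfa2)   # empirical false alarm rates agree to within Monte Carlo error
```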

V. PROBABILITY DISTRIBUTION OF THE TEST STATISTIC

We now take advantage of our freedom to make a unitary transformation, and choose for t_1 a vector whose first element is unity, all others being zero. This can be accomplished by choosing for U_1 a matrix whose first row is the conjugate transpose of t, and whose other rows are the conjugates of unit vectors orthogonal to t. Understanding that this choice has been made, we drop the subscript on t, so that η is still given by the formula in Section IV.

This form for t makes it expedient to decompose all vectors into two components, an A component consisting of the first element only, and a B component consisting of the rest of the vector. Thus we write

\hat{z} = \begin{bmatrix} \hat{z}_A \\ \hat{z}_B \end{bmatrix}

where the A component is a scalar and the B component is an (N − 1) vector. In this notation, the signal vector is just

t = \begin{bmatrix} 1 \\ 0 \end{bmatrix}

the zero being (N − 1) dimensional. Matrices are decomposed in analogous fashion, and we write

\hat{S} = \begin{bmatrix} \hat{S}_{AA} & \hat{S}_{AB} \\ \hat{S}_{BA} & \hat{S}_{BB} \end{bmatrix}.

Note that the AA element is a scalar, the BA element is an (N − 1) dimensional column vector, and so on. We also give a name to the inverse of this matrix, decomposing it as well:

\hat{S}^{-1} = \begin{bmatrix} \hat{S}^{AA} & \hat{S}^{AB} \\ \hat{S}^{BA} & \hat{S}^{BB} \end{bmatrix}.

With this notation we have, simply,

(t^\dagger \hat{S}^{-1} t) = \hat{S}^{AA}

while

(t^\dagger \hat{S}^{-1} \hat{z}) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \hat{S}^{AA} & \hat{S}^{AB} \\ \hat{S}^{BA} & \hat{S}^{BB} \end{bmatrix} \begin{bmatrix} \hat{z}_A \\ \hat{z}_B \end{bmatrix} = \hat{S}^{AA} \hat{z}_A + \hat{S}^{AB} \hat{z}_B.

It is important to keep in mind that we now have a four-fold decomposition of the total input data set into primary and secondary vectors, each of which is divided into A and B components.

According to the Frobenius relations for partitioned matrices,

\hat{S}^{AA} = \left( \hat{S}_{AA} - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{S}_{BA} \right)^{-1}

which is a scalar, and also

\hat{S}^{BA} = -\hat{S}_{BB}^{-1} \hat{S}_{BA} \hat{S}^{AA}.

Since the N × N sample covariance matrix and its inverse are Hermitian, we obtain

\hat{S}^{AB} = (\hat{S}^{BA})^\dagger = -\hat{S}^{AA} \hat{S}_{AB} \hat{S}_{BB}^{-1}

and therefore

(t^\dagger \hat{S}^{-1} \hat{z}) = \hat{S}^{AA} \left( \hat{z}_A - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B \right).

The final inner product is expanded as follows:

(\hat{z}^\dagger \hat{S}^{-1} \hat{z}) = \hat{S}^{AA} |\hat{z}_A|^2 + 2\,\mathrm{Re}\left\{ \hat{z}_A^* \hat{S}^{AB} \hat{z}_B \right\} + (\hat{z}_B^\dagger \hat{S}^{BB} \hat{z}_B)

where we have applied the identity

(u^\dagger v) = (v^\dagger u)^*

to the (N − 1) dimensional inner product

\hat{z}_B^\dagger \hat{S}^{BA} \hat{z}_A.

Next we complete the square in this last expression, writing

(\hat{z}^\dagger \hat{S}^{-1} \hat{z}) = \hat{S}^{AA} \left| \hat{z}_A + (\hat{S}^{AA})^{-1} \hat{S}^{AB} \hat{z}_B \right|^2 + \hat{z}_B^\dagger \left( \hat{S}^{BB} - (\hat{S}^{AA})^{-1} \hat{S}^{BA} \hat{S}^{AB} \right) \hat{z}_B

and by using a Frobenius relation in reverse we see that

\hat{S}^{BB} - (\hat{S}^{AA})^{-1} \hat{S}^{BA} \hat{S}^{AB} = \hat{S}_{BB}^{-1}.

Finally, combining these results, we find

(\hat{z}^\dagger \hat{S}^{-1} \hat{z}) = \hat{S}^{AA} \left| \hat{z}_A - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B \right|^2 + (\hat{z}_B^\dagger \hat{S}_{BB}^{-1} \hat{z}_B).

When these evaluations are substituted into our expression for η, the result can be expressed in the apparently simple form:

\eta = \frac{X}{1 + X + \Sigma_B}.

We have introduced here the notation


\Sigma_B = (\hat{z}_B^\dagger \hat{S}_{BB}^{-1} \hat{z}_B)

which is retained, and the temporary notation

X = \hat{S}^{AA} \left| \hat{z}_A - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B \right|^2.

Note that Σ_B is just like the quantity Σ defined earlier, except that the dimensionality of the vectors involved is now N − 1. The last form of the decision test, namely

\eta > \eta_0

is evidently equivalent to

X > \frac{\eta_0}{1 - \eta_0} \left( 1 + \Sigma_B \right).

We leave the test in this form for a time, while we examine the statistical properties of the quantities which enter into it.

A previous evaluation for the leading factor in X can be used to obtain the following form:

X = \frac{\left| \hat{z}_A - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B \right|^2}{\hat{S}_{AA} - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{S}_{BA}}.

We make use of the definitions to express the denominator as a sum:

\hat{S}_{AA} - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{S}_{BA} = \sum_{k=1}^{K} \left( \hat{z}_A(k) - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B(k) \right) \hat{z}_A(k)^*.

This is the same as the sum of squares

\sum_{k=1}^{K} \left| \hat{z}_A(k) - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B(k) \right|^2

because the terms supplied to complete this square add up to zero:

\sum_{k=1}^{K} \left( \hat{z}_A(k) - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B(k) \right) \left( \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B(k) \right)^*

= \sum_{k=1}^{K} \hat{z}_A(k)\, \hat{z}_B(k)^\dagger \hat{S}_{BB}^{-1} \hat{S}_{BA} - \hat{S}_{AB} \hat{S}_{BB}^{-1} \left[ \sum_{k=1}^{K} \hat{z}_B(k)\, \hat{z}_B(k)^\dagger \right] \hat{S}_{BB}^{-1} \hat{S}_{BA}

= \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{S}_{BA} - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{S}_{BB} \hat{S}_{BB}^{-1} \hat{S}_{BA} = 0.

The evaluation of the sums here follows from the definitions of the partitioned matrix elements. We introduce the notation

y(k) \equiv \hat{z}_A(k) - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B(k)

for the terms of the sum, and the analogous notation

y \equiv \hat{z}_A - \hat{S}_{AB} \hat{S}_{BB}^{-1} \hat{z}_B

for the quantity appearing in the numerator of X, so that the likelihood ratio test can be written in the more explicit form

\frac{|y|^2}{\sum_{k=1}^{K} |y(k)|^2} > \frac{\eta_0}{1 - \eta_0} \left( 1 + \Sigma_B \right).
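The partitioned form is easy to check against the original definition of η: with the signal vector equal to the first coordinate vector, X = |y|²/Σ|y(k)|² and Σ_B computed from the B components reproduce η = X/(1 + X + Σ_B) exactly. A numerical sketch (NumPy, on already-whitened illustrative data; the seed and names are ours):

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 4, 10

z = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
Zk = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
t = np.zeros(N, dtype=complex)
t[0] = 1.0                                  # signal = first coordinate vector

# eta from the original definition.
S = Zk.T @ Zk.conj()
Si = np.linalg.inv(S)
eta = (np.abs(t.conj() @ Si @ z) ** 2
       / ((t.conj() @ Si @ t).real * (1 + (z.conj() @ Si @ z).real)))

# A/B decomposition: S_AB is a row (N-1)-vector, S_BB is (N-1) x (N-1).
S_AB, S_BB = S[0, 1:], S[1:, 1:]
S_BBi = np.linalg.inv(S_BB)
w = S_AB @ S_BBi                            # the row vector S_AB S_BB^{-1}

y = z[0] - w @ z[1:]                        # y = z_A - S_AB S_BB^{-1} z_B
yk = Zk[:, 0] - Zk[:, 1:] @ w               # y(k), k = 1..K
Sigma_B = (z[1:].conj() @ S_BBi @ z[1:]).real

X = np.abs(y) ** 2 / np.sum(np.abs(yk) ** 2)
print(eta, X / (1 + X + Sigma_B))
```

The agreement also verifies, as a by-product, the sum-of-squares identity for the denominator of X derived just above.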

We proceed by fixing the B vectors temporarily, and consider the probability densities of all quantities entering into the decision statistic to be conditioned on these values. The conditional probabilities of detection and false alarm are evaluated first, and the condition is then removed by taking expectation values over the joint PDF of the B vectors. With the B vectors fixed, only the K+1 scalar A components are random, and we show now, under this condition, that y and the y(k) are Gaussian variables, that y is uncorrelated with the y(k), and that the latter have a covariance matrix with simple properties.

Using the definitions of the ys and of the AB matrix element which enters there, we can express these quantities in the form

y = \hat{z}_A - \sum_{k=1}^{K} \hat{z}_A(k)\, \hat{z}_B(k)^\dagger \hat{S}_{BB}^{-1} \hat{z}_B

and

y(k) = \hat{z}_A(k) - \sum_{i=1}^{K} \hat{z}_A(i)\, \hat{z}_B(i)^\dagger \hat{S}_{BB}^{-1} \hat{z}_B(k).

This represents the ys as linear combinations of the A components, and hence proves their conditional Gaussian character. Moreover, the y(k) have zero mean in all cases, while the conditional mean, written E_B, of y in the general case is

E_B\, y = E\, \hat{z}_A = b

as a result of our choice of signal vector.

The linear dependence of the ys on the A components

is best expressed in terms of the quantities

q(k) \equiv \hat{z}_B(k)^\dagger \hat{S}_{BB}^{-1} \hat{z}_B

and

Q(i, k) \equiv \hat{z}_B(i)^\dagger \hat{S}_{BB}^{-1} \hat{z}_B(k)

which are constants under the conditioning. Obviously,

y = \hat{z}_A - \sum_{k=1}^{K} \hat{z}_A(k)\, q(k)

and

y(k) = \hat{z}_A(k) - \sum_{i=1}^{K} \hat{z}_A(i)\, Q(i, k).

The q(k) may be considered as the components of a Kvector q, and the Q(i,k) as the elements of a K X Kmatrix Q. The desired properties of the ys flow from thefollowing facts about this new vector and matrix:

Qq = q

and

Q2 = Q.

KELLY: ADAPTIVE DETECTION ALGORITHM 121

To prove the first of these, we write it out in component form:

$$\sum_{i=1}^{K} Q(k,i)\, q(i) = \sum_{i=1}^{K} z_B(k)^\dagger S_{BB}^{-1} z_B(i)\, z_B(i)^\dagger S_{BB}^{-1} z_B.$$

The sum over $i$ regenerates the $BB$ matrix element:

$$\sum_{i=1}^{K} z_B(i)\, z_B(i)^\dagger = S_{BB}$$

(as happened when the denominator of the likelihood ratio was expressed as a sum of squares), and the result follows immediately. The idempotent character of $Q$ is proved in the same way. We also note that $Q$ is Hermitian, and that its trace is $N-1$:

$$\operatorname{Tr}(Q) = \sum_{k=1}^{K} z_B(k)^\dagger S_{BB}^{-1} z_B(k) = \operatorname{Tr}\!\left( S_{BB}^{-1} \sum_{k=1}^{K} z_B(k)\, z_B(k)^\dagger \right) = \operatorname{Tr}(I_{N-1}) = N-1.$$

Note that we are dealing with the trace of a $K \times K$ matrix on the left side here, and of $(N-1) \times (N-1)$ matrices on the right. All of these results will be required in the following.
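These properties of $q$ and $Q$ are straightforward to confirm numerically. The sketch below (our own NumPy construction with synthetic whitened data) builds $Q$ and $q$ from random secondaries and checks that $Q$ is Hermitian and idempotent, that $Qq = q$, and that $\operatorname{Tr}(Q) = N-1$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 5, 14  # dimension and number of secondary vectors

def cgauss(shape, rng):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

ZB = cgauss((K, N - 1), rng)   # B components of the secondaries, z_B(k) as rows
zB = cgauss(N - 1, rng)        # B component of the primary vector

S_BB = ZB.T @ ZB.conj()        # S_BB = sum_k z_B(k) z_B(k)^dagger
S_inv = np.linalg.inv(S_BB)

# q(k) = z_B(k)^dagger S_BB^{-1} z_B  and  Q(i,k) = z_B(i)^dagger S_BB^{-1} z_B(k)
q = ZB.conj() @ S_inv @ zB
Q = ZB.conj() @ S_inv @ ZB.T
```

The idempotence check works because `ZB.T @ ZB.conj()` regenerates `S_BB`, exactly as in the component-form proof above.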

The fact that $y$ and the $y(k)$ are conditionally uncorrelated now follows easily from the independence of the A components themselves:

$$E_B\, y\, y(k)^* = -E_B\!\left( \sum_{i=1}^{K} q(i)\, z_A(i)\, z_A(k)^* \right) + E_B\!\left( \sum_{i=1}^{K} q(i)\, z_A(i) \sum_{n=1}^{K} z_A(n)^*\, Q(n,k)^* \right) = -q(k) + \sum_{i=1}^{K} q(i)\, Q(k,i) = 0.$$

Next, consider the conditional variance of $y$:

$$E_B\, |y-b|^2 = 1 + \sum_{k=1}^{K} |q(k)|^2.$$

Substituting for the $q(k)$, we have

$$\sum_{k=1}^{K} |q(k)|^2 = \sum_{k=1}^{K} z_B^\dagger S_{BB}^{-1} z_B(k)\, z_B(k)^\dagger S_{BB}^{-1} z_B = z_B^\dagger S_{BB}^{-1} z_B = \xi_B$$

and hence

$$E_B\, |y-b|^2 = 1 + \xi_B.$$

This last result is responsible for a significant simplification of the statistics of the likelihood ratio test.

Finally, we compute the conditional covariance of the $y(k)$. We use the notation $\delta(i,k)$ for the elements of the unit matrix, so that $y(k)$ can be written

$$y(k) = \sum_{i=1}^{K} z_A(i)\, [\delta(i,k) - Q(i,k)].$$

Using the independence of the A variables again, we obtain

$$E\, y(k)\, y(n)^* = \sum_{i=1}^{K} [\delta(i,k) - Q(i,k)]\,[\delta(i,n) - Q(i,n)]^* = \delta(k,n) - Q(n,k) - Q(k,n)^* + \sum_{i=1}^{K} Q(i,k)\, Q(i,n)^*.$$

Since $Q$ is Hermitian and idempotent, we find the simple result

$$E\, y(k)\, y(n)^* = \delta(n,k) - Q(n,k).$$

The likelihood ratio test is now rearranged slightly to read

$$\frac{|y|^2}{1+\xi_B} \;\gtrless\; (\eta_0 - 1) \sum_{k=1}^{K} |y(k)|^2.$$

In view of the fact that the conditional variance of $y$ equals the denominator on the left, it makes sense to define a normalized variable

$$w \equiv \frac{y}{(1+\xi_B)^{1/2}}.$$

Conditioned on the B vectors, $w$ is Gaussian and independent of the $y(k)$. It has a conditional variance of unity, and a conditional mean value:

$$E_B\, w = \frac{b}{(1+\xi_B)^{1/2}}.$$

In the noise-alone case, the conditioning has no effect on the PDF of $w$. The sum over $k$ is also given a name:

$$T \equiv \sum_{k=1}^{K} |y(k)|^2$$

and the test is now written

$$|w|^2 \;\gtrless\; (\eta_0 - 1)\, T,$$

where the original threshold constant has been reintroduced. In fact, it is easily verified that our original likelihood ratio $\ell$ is given by

$$\ell = 1 + \frac{|w|^2}{T}.$$

We now turn to the properties of $T$. Given the B vectors, the joint PDF of the $y(k)$ is zero-mean Gaussian with covariance matrix $J$:

$$J(i,k) \equiv \delta(i,k) - Q(k,i).$$

The conditional characteristic function of $T$ is therefore

$$\Phi_B(\lambda) \equiv E_B\{ e^{i\lambda T} \} = |I - i\lambda J|^{-1}.$$

Since $Q$ is idempotent, its eigenvalues are either zero or

IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. AES-22, NO. 1 MARCH 1986 122

one, and from the value of its trace we see that $Q$ must have exactly $N-1$ unit eigenvalues. It follows that $J$ has $K+1-N$ unit eigenvalues, the others being zero, and thus

$$\Phi_B(\lambda) = (1 - i\lambda)^{-(K+1-N)}.$$

This is the characteristic function of a chi-squared random variable, and the PDF of $T$ is simply

$$f_B(T) = \frac{T^{K-N}\, e^{-T}}{(K-N)!}.$$

It is remarkable that the statistical properties of $T$ are independent of the actual values of the conditioning B vectors, and we can consequently drop the subscript on its PDF. Moreover, $T$ is statistically equivalent to the sum of the squares of $K+1-N$ independent, complex Gaussian variables, each of which has zero mean and unit variance. If we let $w(k)$, $(k = 1, \ldots, K+1-N)$, be such a set, then $T$ is statistically indistinguishable from the sum

$$\sum_{k=1}^{K+1-N} |w(k)|^2.$$
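The chi-squared character of $T$, and its independence of the conditioning data, can be checked by simulation. A minimal sketch (NumPy; identity noise covariance, which involves no loss of generality by the invariance argument of the text; sample sizes and tolerances are our choices): draw fresh secondaries, form $T = \sum_k |y(k)|^2$, and compare the sample mean and variance with the predicted value $L = K+1-N$ for a complex chi-squared variable with $L$ unit-variance terms.

```python
import numpy as np

rng = np.random.default_rng(3)
N, K = 3, 8
L = K + 1 - N  # predicted mean and variance of T

def cgauss(shape, rng):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def sample_T(rng):
    Z = cgauss((K, N), rng)                       # noise-only secondaries
    S = Z.T @ Z.conj()                            # sample covariance sum
    v = np.linalg.solve(S[1:, 1:].T, S[0, 1:])    # S_AB S_BB^{-1}
    yk = Z[:, 0] - Z[:, 1:] @ v                   # the y(k)
    return np.sum(np.abs(yk) ** 2)

T = np.array([sample_T(rng) for _ in range(4000)])
mean_T, var_T = T.mean(), T.var()
```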

The properties of the likelihood ratio test are therefore identical to the properties of the simple test

$$|w|^2 \;\gtrless\; (\eta_0 - 1) \sum_{k=1}^{K+1-N} |w(k)|^2,$$

where the $w(k)$ are now also taken to be independent of $w$. The probability of the truth of this inequality is still conditioned on the B vectors, but this conditioning appears only through the quantity $\xi_B$, which is contained in the conditional mean of $w$.

This equivalent decision rule represents the behavior of a simple scalar CFAR test, in which the power in one complex sample (a single radar hit), being tested for signal presence, is compared to a threshold proportional to the sum of the powers of $K+1-N$ samples of noise. This problem is quite familiar, and the test just described is also a likelihood ratio test in the corresponding situation. The performance of the scalar CFAR detector is very simple, and in particular, its PFA is just

$$\mathrm{PFA} = \left( \frac{1}{\eta_0} \right)^{K+1-N}.$$

In this case, when the signal amplitude is zero, the conditioning B vectors do not appear at all, and hence this simple formula gives the PFA for our original likelihood ratio test.
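This CFAR property is directly testable: pick $\eta_0$ from the formula for a target PFA and count false alarms in noise-only trials of the full adaptive statistic. A sketch (NumPy; trial count, seed, and tolerance are our choices; whitened coordinates are used, which by the analysis above costs no generality):

```python
import numpy as np

rng = np.random.default_rng(4)
N, K = 3, 8
L = K + 1 - N
pfa_target = 0.1
eta0 = pfa_target ** (-1.0 / L)   # invert PFA = (1/eta0)^L for the threshold

def cgauss(shape, rng):
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def statistic(rng):
    Z = cgauss((K, N), rng)       # secondaries, noise only
    z = cgauss(N, rng)            # primary, noise only
    S = Z.T @ Z.conj()
    S_BB = S[1:, 1:]
    v = np.linalg.solve(S_BB.T, S[0, 1:])
    y = z[0] - v @ z[1:]
    yk = Z[:, 0] - Z[:, 1:] @ v
    xi_B = (z[1:].conj() @ np.linalg.solve(S_BB, z[1:])).real
    return abs(y) ** 2 / ((1 + xi_B) * np.sum(np.abs(yk) ** 2))

trials = 20000
pfa_hat = np.mean([statistic(rng) > eta0 - 1 for _ in range(trials)])
```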

The probability of detection (PD) of the scalar CFAR test is also well known, and in our case it depends on the conditional SNR, which is the squared magnitude of the conditional mean of $w$. In terms of the colored noise matched filter SNR $\alpha$ defined earlier, and the quantity

$$r \equiv \frac{1}{1+\xi_B},$$

this conditional SNR is just $r\alpha$. The factor $r$ represents a

loss factor, applied to the SNR, and caused by the necessity of estimating the noise covariance matrix. The PD of the CFAR detector can be expressed in a particularly convenient way as a finite sum [4]:

$$\mathrm{PD} = 1 - \frac{1}{\eta_0^L} \sum_{k=1}^{L} \binom{L}{k} (\eta_0 - 1)^k\, G_k\!\left( \frac{r\alpha}{\eta_0} \right)$$

where $L = K+1-N$. The function $G$ which enters here is itself a finite sum:

$$G_k(y) = e^{-y} \sum_{n=0}^{k-1} \frac{y^n}{n!}.$$
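Kelly cites [4] for this finite sum; its structure can be checked independently. At zero SNR it must collapse to the PFA expression above ($G_k(0)=1$ and the binomial theorem give $\mathrm{PD} = \eta_0^{-L}$), and for a fixed conditional SNR it can be compared with a direct simulation of the equivalent scalar CFAR test. A sketch (NumPy; `snr` plays the role of the conditional SNR $r\alpha$; sample sizes and tolerances are our choices):

```python
import numpy as np
from math import comb, exp, factorial

def G(k, y):
    """G_k(y) = exp(-y) * sum_{n=0}^{k-1} y^n / n!"""
    return exp(-y) * sum(y ** n / factorial(n) for n in range(k))

def pd_cfar(snr, eta0, L):
    """Finite-sum PD of the scalar CFAR test with L noise samples."""
    s = sum(comb(L, k) * (eta0 - 1) ** k * G(k, snr / eta0) for k in range(1, L + 1))
    return 1.0 - s / eta0 ** L

# Monte Carlo of the equivalent test |w|^2 > (eta0 - 1) * sum_k |w(k)|^2
rng = np.random.default_rng(5)
L, snr = 6, 4.0
eta0 = 0.001 ** (-1.0 / L)        # threshold set for PFA = 1e-3
M = 200000
w = np.sqrt(snr) + (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
T = rng.gamma(shape=L, scale=1.0, size=M)   # sum of L unit-mean exponentials
pd_mc = np.mean(np.abs(w) ** 2 > (eta0 - 1) * T)
```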

In order to complete our computation of the PD of the likelihood ratio test, we must take the expectation value of this conditional PD over the joint PDF of the B vectors. These, however, enter the final result only through the loss factor $r$, which acts as a fluctuation model for the signal. Unlike more familiar fluctuation models, this one is characterized by a factor lying in the range zero to one. The present situation is similar to that discussed in the RMB paper, except that besides an SNR loss, our test will suffer a CFAR loss as well, when compared with a colored noise matched filter test in which everything is known concerning the noise or interference.

Although the loss factor found here depends on the primary data, through its B component, while the RMB loss factor is a function only of the secondary data, it turns out that the two factors have exactly the same PDF. The proof of this interesting result is deferred to the Appendix, in which the RMB loss factor is also expressed in our notation, and the evaluation of the PDFs of both these quantities is carried out in parallel.

The PDF shared by these loss functions is the beta distribution:

$$f(r) = \frac{K!}{L!\,(N-2)!}\, (1-r)^{N-2}\, r^L$$

and the final expression for the PD of our test can be written

$$\mathrm{PD} = 1 - \frac{1}{\eta_0^L} \sum_{k=1}^{L} \binom{L}{k} (\eta_0 - 1)^k\, H_k\!\left( \frac{\alpha}{\eta_0} \right).$$

In this formula, the $H$ functions are the expected values of the $G$s:

$$H_k(y) = \int_0^1 G_k(ry)\, f(r)\, dr.$$

These integrals are elementary, although not simple, and their detailed evaluation is not given here. The results of Section VI are based on these formulas.
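Although the closed forms of the $H_k$ are not reproduced here, they are cheap to evaluate by quadrature, which is enough to reproduce the performance curves of Section VI. A sketch (NumPy; Gauss-Legendre quadrature is our choice, not the paper's, and all names are ours):

```python
import numpy as np
from math import comb, exp, factorial

def G(k, y):
    return exp(-y) * sum(y ** n / factorial(n) for n in range(k))

def beta_pdf(r, N, K):
    """Beta PDF of the loss factor r, with L = K + 1 - N."""
    L = K + 1 - N
    c = factorial(K) / (factorial(L) * factorial(N - 2))
    return c * (1 - r) ** (N - 2) * r ** L

def H(k, y, N, K, nodes=200):
    # H_k(y) = integral_0^1 G_k(r y) f(r) dr, by Gauss-Legendre on [0, 1]
    x, wts = np.polynomial.legendre.leggauss(nodes)
    r = 0.5 * (x + 1.0)
    return 0.5 * np.sum(wts * np.array([G(k, ri * y) for ri in r]) * beta_pdf(r, N, K))

def pd_kelly(alpha, eta0, N, K):
    """PD of the likelihood ratio test at matched filter SNR alpha."""
    L = K + 1 - N
    s = sum(comb(L, k) * (eta0 - 1) ** k * H(k, alpha / eta0, N, K)
            for k in range(1, L + 1))
    return 1.0 - s / eta0 ** L
```

For example, with N = 10, K = 20 (so L = 11) and PFA = $10^{-6}$ (i.e., $\eta_0 = 10^{6/11}$), `pd_kelly` should trace the likelihood ratio curve of Fig. 1.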

VI. NUMERICAL RESULTS AND DISCUSSION

The performance of the likelihood ratio test depends only on the dimensional integers N and K and the SNR parameter $\alpha$. The latter is a function of the true signal strength and the intensity and character of the actual noise and interference. Our analysis deals with a very general problem, and nothing can be said about the anticipated values of $\alpha$. The ability of a system to function effectively in interference depends principally on the arrangements which have been made in its design to achieve a good colored noise matched filter SNR in its intended environment. These arrangements will usually take the form of diversity of RF inputs in one form or another. An additional requirement is the need to have inputs available from which the actual noise characteristics can be estimated, and this is the aspect of the problem which has been addressed here. In particular, for given values of PD and PFA, we can determine what SNR is actually required to achieve those values using the likelihood ratio detector, and compare that number with the SNR which would be adequate to achieve identical performance if the noise covariance matrix were known in advance. The difference is the penalty for having to estimate the noise covariance, and we expect that penalty to vary sharply with the number K of available secondary input vectors.

This penalty has two components, one due to the CFAR character of the decision rule and another due to the effective SNR loss factor. The latter is expected to behave much as the results of the RMB analysis would predict, based on the statistical properties of the loss factor alone. The CFAR loss will decrease as the value of K increases, and it may be expected to depend largely on that parameter, while the SNR loss effect depends roughly on the ratio of K to N.

These expectations are borne out by the numerical consequences of our analysis, as shown in the accompanying figures. In Figs. 1-4, PD is shown as a function of $\alpha$, the SNR, for three detectors (PFA is fixed at $10^{-6}$ for these curves). The detector performing best is a matched filter with known noise covariance, and the worst is the likelihood ratio detector which, of course, is estimating the noise covariance. The middle curve (dashed) in these plots shows the performance of a simple, scalar CFAR detector using $L = K+1-N$ noise samples, and it differs from the behavior of the likelihood ratio detector only in that the SNR loss factor has been ignored. This detector is included in the comparison in

[Fig. 1. PD versus SNR (dB) for the matched filter, scalar CFAR, and likelihood ratio detectors; parameters N = 10, K = 20, L = 11; PFA = 1.0E-6.]

[Fig. 2. PD versus SNR (dB).]

[Fig. 3. PD versus SNR (dB).]

[Fig. 4. PD versus SNR (dB).]

order to show how much of the degradation imposed by noise estimation is due to each of the two contributing effects.

We note that doubling both K and N has little effect on the portion of the degradation due to SNR loss, while the CFAR part is reduced, simply because K is being increased. The curves also show the significant improvement which results from increasing the ratio of K to N. When this ratio is equal to 5, the SNR loss contribution to the performance degradation is about 0.9 dB, in agreement with the mean value of the SNR loss as obtained from the beta distribution. Likewise, the CFAR contributions are directly comparable with the ordinary CFAR loss for a detector of nonfluctuating targets with no noncoherent integration (i.e., a single radar hit).
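The quoted 0.9 dB figure can be checked directly from the beta distribution of the loss factor. A sketch (NumPy; whether one averages $r$ itself or $10\log_{10} r$ is our choice, and both land near 0.85 dB for N = 10, K = 50, i.e., K/N = 5):

```python
import numpy as np
from math import factorial

def mean_loss_db(N, K, nodes=400):
    """Average SNR loss, in dB, implied by the beta PDF of the loss factor r."""
    L = K + 1 - N
    c = factorial(K) / (factorial(L) * factorial(N - 2))
    x, w = np.polynomial.legendre.leggauss(nodes)
    r = 0.5 * (x + 1.0)                      # quadrature nodes mapped to (0, 1)
    f = c * (1 - r) ** (N - 2) * r ** L      # beta density of r
    mean_r = 0.5 * np.sum(w * r * f)         # E[r]; closed form is (K+2-N)/(K+1)
    mean_db = 0.5 * np.sum(w * 10 * np.log10(r) * f)   # E[10 log10 r]
    return 10 * np.log10(mean_r), mean_db

loss_of_mean, mean_of_loss = mean_loss_db(10, 50)
```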


[Figure legend residue (one of Figs. 2-4): matched filter and likelihood ratio curves; parameters N = 10, K = 50, L = 41; PFA = 1.0E-6.]

The detector performance is characterized in a different way in Figs. 5-8, which show the additional SNR required, when estimating the noise covariance, to achieve the same PD and PFA as a matched filter for known noise. In all these figures, the PD is specified at 0.9, but the results will not depend strongly on the chosen PD level, since the curves of PD versus SNR are nearly parallel for the two detectors. Three PFA values are represented on each plot. The independent variable for these curves is the number of secondary vectors, and this variable always covers the range 2N through 5N, for 4 different N values. The SNR loss is not strictly a function of the ratio K/N, but generally decreases with increasing N, with this ratio held constant. The loss shown is the total loss, due to the CFAR effect and the SNR loss factor itself.

[Fig. 5. SNR loss (dB) versus K.]

[Fig. 6. SNR loss (dB) versus K; vector dimension 10; K from 20 to 50.]

[Fig. 7. SNR loss (dB) versus K.]

[Fig. 8. SNR loss (dB) versus K; vector dimension 40; detection probability 0.9; PFA = 10^-c for several values of c; K from 80 to 200.]

It was noted earlier that K, the number of secondary vectors, must exceed N, the dimension of each of the data vectors, in order to have a nonsingular sample covariance matrix. It is clear from the results just discussed that K must exceed N by a significant factor if noise estimation is not to cause a serious loss in performance. Since N is the dimension of the total vector of data used for detection, the requirements on the number of secondaries can be extremely large. In the radar example mentioned in Section II, N was the product of the number of RF channels $N_a$ and the number of pulses $N_p$ in a coherent processing interval. To supply enough secondaries from range-gate outputs, other than the one being tested for signal presence, the bandwidth of the radar would have to exceed some minimum value.

The chief reason for the requirement of many secondaries is the generality of our formulation, in which any interference covariance matrix is allowed. In the radar example, this includes the possibility of arbitrary correlation between interference inputs from pulse returns widely spaced in time, although it might well be realistic to assume independence (but not statistical identity) of the interference inputs accompanying distinct pulse returns. This is a more difficult problem, for which only an approximate solution has been found [5].

APPENDIX. THE PROBABILITY DENSITY FUNCTION OF THE LOSS FACTOR

The SNR loss factor derived in the text was expressed in the form

$$r = \frac{1}{1+\xi_B}$$

where

$$\xi_B = z_B^\dagger S_{BB}^{-1} z_B.$$

Before discussing the PDF of $\xi_B$, from which the PDF of $r$ follows easily, we express the RMB loss factor $\rho$ in our notation. From the RMB paper,

$$\rho = \frac{|s^\dagger \hat{w}|^2}{(s^\dagger M^{-1} s)(\hat{w}^\dagger M \hat{w})}$$


where $M$ is the actual noise covariance matrix, $s$ is the signal vector, and $\hat{w}$ is a weight vector:

$$\hat{w} = k\, \hat{M}^{-1} s.$$

In this last formula, $\hat{M}$ is the sample covariance matrix of the secondary data and $k$ is an arbitrary constant. The loss factor itself is the ratio of the conditional SNR of the output of a filter which uses $\hat{w}$ as a weight, relative to the SNR of the colored noise matched filter for known $M$. The conditioning in this case corresponds to given values of the secondary data.

Choosing $k = 1/K$, we obtain

$$\hat{w} = S^{-1} s$$

in terms of the $S$ matrix used in the text, and then

$$\rho = \frac{(s^\dagger S^{-1} s)^2}{(s^\dagger M^{-1} s)(s^\dagger S^{-1} M S^{-1} s)}.$$

Note that $\rho$ is unaffected if $s$ is changed by a constant factor.
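Two properties of $\rho$ follow directly from this expression: it lies in $(0, 1]$ (a Cauchy-Schwarz consequence, writing $s^\dagger S^{-1} s = (M^{-1/2}s)^\dagger (M^{1/2} S^{-1} s)$) and it is invariant to rescaling of $s$. A sketch (NumPy, with a randomly generated positive definite $M$ and synthetic secondaries; all names are ours):

```python
import numpy as np

rng = np.random.default_rng(8)
N, K = 5, 15

# A random positive definite noise covariance M and a square-root factor.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
M = A @ A.conj().T + N * np.eye(N)
M_half = np.linalg.cholesky(M)

# Secondary data with covariance M; S is the (unnormalized) sample covariance.
X = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
Z = X @ M_half.T
S = Z.T @ Z.conj()

s = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # arbitrary signal vector

def rho(s, S, M):
    """RMB loss factor rho = (s' S^-1 s)^2 / ((s' M^-1 s)(s' S^-1 M S^-1 s))."""
    Sis = np.linalg.solve(S, s)
    num = np.abs(s.conj() @ Sis) ** 2
    den = (s.conj() @ np.linalg.solve(M, s)).real * (Sis.conj() @ M @ Sis).real
    return num / den

r1 = rho(s, S, M)
r2 = rho(3.7j * s, S, M)   # rescaling s (even by a complex factor) leaves rho unchanged
```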

We now carry out the whitening transformation, as in the text, and normalize the signal vector as before. The result is

$$\rho = \frac{(\epsilon^\dagger \tilde{S}^{-1} \epsilon)^2}{\epsilon^\dagger \tilde{S}^{-2} \epsilon}.$$

A unitary transformation is now applied to convert the signal vector to the final one used in the text, and the matrices are decomposed in the same way. This gives the simple expression

$$\rho = \frac{\left[ (S^{-1})_{AA} \right]^2}{(S^{-2})_{AA}}.$$

It is clear at this point that the PDF of $\rho$ will be independent of the actual covariance matrix $M$.

Using the Frobenius relations, we obtain

$$(S^{-2})_{AA} = \left[ (S^{-1})_{AA} \right]^2 + (S^{-1})_{AB} (S^{-1})_{BA} = \left[ (S^{-1})_{AA} \right]^2 \left( 1 + S_{AB} S_{BB}^{-2} S_{BA} \right)$$

and therefore

$$\rho = \frac{1}{1+\xi_\rho}$$

where

$$\xi_\rho \equiv S_{AB}\, S_{BB}^{-2}\, S_{BA}.$$

Note that the RMB loss factor depends on the secondary data only, both A and B components, while $r$ depends on the B components of both primary and secondary data.

We proceed to analyze the two loss factors together, and begin by conditioning on the B components of the secondary data vectors, on which both loss factors depend. Then

$$S_{BB} = \sum_{k=1}^{K} z_B(k)\, z_B(k)^\dagger$$

is a constant matrix, positive definite, and nonsingular for all sets of conditioning vectors (except for a set of probability zero). We can therefore introduce the square root of this matrix and define the vectors

$$t_B \equiv S_{BB}^{-1/2}\, z_B$$

and

$$t_\rho \equiv S_{BB}^{-1}\, S_{BA}.$$

With the conditioning, these quantities are zero-mean Gaussian vectors; the former is a linear function of the elements of the B component of the whitened primary vector, and the latter is expressible in terms of the secondary A components:

$$t_\rho = S_{BB}^{-1} \sum_{k=1}^{K} z_B(k)\, z_A(k)^*.$$

We use the subscript $C$ to denote the present conditioning, and compute the conditional covariance matrices of these vectors:

$$E_C\, t_B t_B^\dagger = S_{BB}^{-1/2}\, \left( E_C\, z_B z_B^\dagger \right)\, S_{BB}^{-1/2} = S_{BB}^{-1}$$

and

$$E_C\, t_\rho t_\rho^\dagger = S_{BB}^{-1}\, W\, S_{BB}^{-1}$$

where

$$W = E_C \sum_{k=1}^{K} z_B(k)\, z_A(k)^* \sum_{i=1}^{K} z_A(i)\, z_B(i)^\dagger = \sum_{k=1}^{K} z_B(k)\, z_B(k)^\dagger = S_{BB}.$$

Therefore

$$E_C\, t_\rho t_\rho^\dagger = S_{BB}^{-1}$$

and the two $t$ vectors are statistically equivalent under the conditioning. They therefore share the same final PDF when the conditioning is removed by averaging over the secondary B vectors. Since

$$\xi_B = t_B^\dagger t_B$$

and

$$\xi_\rho = t_\rho^\dagger t_\rho,$$

this proves that the loss factors themselves are statistically identical, and we continue with the loss factor $r$.

Since $t_B$ is a Gaussian $(N-1)$ vector, the conditional joint PDF of its components is

$$f_C(t_B) = \frac{|S_{BB}|}{\pi^{N-1}}\, e^{-t_B^\dagger S_{BB} t_B}.$$

The $S$ matrix which enters here is itself subject to the Wishart PDF, which in the present case (in which the sample vectors are of dimension $N-1$ and the covariance matrix is equal to the identity) takes the form

$$f(A) = \frac{|A|^{K+1-N}\, e^{-\operatorname{tr}(A)}}{C(N-1, K)}.$$

In this formula

$$C(N, K) = \pi^{N(N-1)/2} \prod_{n=1}^{N} (K-n)!$$

is the Wishart normalization factor. The volume element for this PDF will be written $d(A)$. It is $(N-1)^2$ dimensional, ranging over the diagonal elements and the real and imaginary parts of the upper off-diagonals of all positive definite matrices $A$. For our purpose, only the normalization integral of the Wishart PDF is required, hence we need not dwell on the detailed properties of this fascinating distribution.

Since $t_B$ depends on the conditioning data only through the $S$ matrix, its unconditioned PDF can be written

$$f(t_B) = \frac{1}{\pi^{N-1}} \int \cdots \int |A|\, e^{-\operatorname{tr}(AC)}\, f(A)\, d(A).$$

As in the text, we have replaced the exponential part of the Gaussian PDF by a trace, this time involving the open product matrix

$$C = t_B\, t_B^\dagger.$$

When we substitute for the Wishart PDF in the expression above, we encounter the integral

$$\int \cdots \int |A|^{K+2-N}\, e^{-\operatorname{tr}[A(I_{N-1} + C)]}\, d(A).$$

This is the same as the normalization integral for another Wishart PDF, of dimensions $N-1$ and $K+1$, and for which the underlying sample vectors share the covariance matrix

$$(I_{N-1} + C)^{-1}.$$

The normalization factor for this slightly more general Wishart PDF is just

$$C(N-1, K+1)\, |I_{N-1} + C|^{-(K+1)} = C(N-1, K+1)\, [1 + \xi_B]^{-(K+1)}$$

(the evaluation of the determinant uses the same lemma utilized in Section III). Combining these facts, we obtain the simple result

$$f(t_B) = \frac{1}{\pi^{N-1}}\, \frac{C(N-1, K+1)}{C(N-1, K)}\, [1 + \xi_B]^{-(K+1)} = \frac{K!}{\pi^{N-1}\, (K+1-N)!}\, [1 + \xi_B]^{-(K+1)}.$$

The remainder of the derivation is identical to the final few steps given in [1, Appendix]. The norm of the vector $t_B$ is interpreted as the square of the radial coordinate in a $(2N-2)$-dimensional Cartesian space, a change to polar coordinates is made, and the angular coordinates integrated out. This process yields the PDF of $\xi_B$, and then a simple change of variable provides the desired PDF of $r$:

$$f(r) = \frac{K!}{(K+1-N)!\,(N-2)!}\, (1-r)^{N-2}\, r^{K+1-N}.$$

REFERENCES

[1] Reed, I.S., Mallett, J.D., and Brennan, L.E. (1974) Rapid convergence rate in adaptive arrays. IEEE Transactions on Aerospace and Electronic Systems, AES-10 (Nov. 1974), 853-863.

[2] Capon, J., and Goodman, N.R. (1970) Probability distributions for estimators of the frequency-wavenumber spectrum. Proceedings of the IEEE, 58 (Oct. 1970), 1785-1786.

[3] Goodman, N.R. (1963) Statistical analysis based on a certain multivariate complex Gaussian distribution. Annals of Mathematical Statistics, 34 (Mar. 1963), 152-177.

[4] Kelly, E.J. (1981) Finite-sum expressions for signal detection probabilities. Technical Report 566, Lincoln Laboratory, M.I.T., May 20, 1981.

[5] Kelly, E.J. (1985) Adaptive detection in non-stationary interference. Technical Report 724, Lincoln Laboratory, M.I.T., June 25, 1985.

Edward J. Kelly was born in Philadelphia, PA, in 1924. He received the S.B. degree in 1945, and the Ph.D. degree in 1950, both from the Massachusetts Institute of Technology, Cambridge, MA, in theoretical physics.

He was a staff member at Brookhaven National Laboratory, Upton, NY, from 1950 to 1952, and joined M.I.T. Lincoln Laboratory in 1952, where he is now a member of the Senior Staff.

His work at Lincoln Laboratory has been in the areas of radar detection and estimation theory, applied seismology, air traffic control, spread spectrum communications and, currently, space-based radar systems analysis.

