Reduced sensitivity algorithm for optical processors using constraints and ridge regression


David Casasent and Anjan Ghosh

Optical linear algebra processors that involve solutions of linear algebraic equations have significant potential in adaptive and inference machines. We present an algorithm that includes constraints on the accuracy of the processor and improves the accuracy of the results obtained from such analog processors. The constraint algorithm matches the problem to the accuracy of the processor. Calculation of the adaptive weights in a phased array radar is used as a case study. Simulation results demonstrate the benefits obtained. The desensitization of the calculated weights to computational errors in the processor is quantified. Ridge regression is used to determine the parameter needed in the algorithm.

1. Introduction

Optical matrix-vector processors represent the basic functional block in optical crossbar switches,1 many optical artificial neural networks,2 optical associative processors,3 general optical linear algebra processors,4 and many other applications. To remain cost-effective, the present belief5 is that these systems should operate on analog data. This implies that the accuracy of the resultant processor will be low, with 8-10 bits of accuracy possible with ac coupling and heterodyne detection.6 This paper addresses the fact that many problems are ill-formulated and thus require much higher accuracy than is needed and merited. Section II provides several examples of when this arises and advances a new technique to match the accuracy of the problem to the accuracy of the processor. The specific case study chosen is calculation of the adaptive weights in a phased array radar. However, the general problem addressed, the solution of a system of linear algebraic equations (LAEs), arises in many applications. Section III briefly describes the optical processor considered and its error sources. Section IV advances simulated data to quantify the performance improvement obtained. Our summary and conclusions then follow in Sec. V.

When this work was done both authors were with Carnegie Mellon University, Department of Electrical & Computer Engineering, Center for Excellence in Optical Data Processing, Pittsburgh, Pennsylvania 15213-3890; A. Ghosh is now with University of Iowa, Department of Electrical & Computer Engineering, Iowa City, Iowa 52242.

Received 31 August 1987.
0003-6935/88/081607-05$02.00/0.
© 1988 Optical Society of America.

II. Accuracy Constraints and Ridge Regression

To provide motivation for changing a given matrix-vector problem, we recall two examples. In earlier work on the solution of the algebraic Riccati equation for the optimal control vector for an F100 aircraft engine,7 the original plant matrix was found to have a very large condition number, thus requiring high accuracy in calculating its inverse. However, we found and showed that small changes in the parameters of the model did not affect the closed-loop poles of the system (the performance parameters of concern), but did significantly reduce the condition number of the matrix and hence the required computational accuracy. This situation often arises, since the numerical values employed in models for many control problems have significant variation.

A second example, and the one we will employ in our case study, concerns calculation of the adaptive weights in a phased array radar. Here, the issues8 are that the antenna elements will have noise and that in many applications (e.g., sonar) the locations of the antenna elements are not necessarily accurately known. In both instances, techniques have been developed8 to calculate adaptive weights that are less sensitive to antenna noise and antenna location errors. These methods involve modifications of the original covariance matrix. Such alterations are, in fact, necessary since, if a narrow and deep antenna pattern null were produced without considering antenna noise and antenna element location errors, it would generally miss the intended jammer.8

We modify these prior techniques for a different purpose. We consider the solution to a system of LAEs on an analog processor with limited accuracy. Our intent is to desensitize the solution and algorithm to processor errors (rather than to errors in the antenna element locations).

15 April 1988 / Vol. 27, No. 8 / APPLIED OPTICS 1607

Fig. 1. Frequency-multiplexed optical matrix-vector processor. [Schematic: input laser diodes (LDs), AO cell, FT lens, and output detector plane; graphics not recoverable from the transcript.]

We now consider and detail the algorithm for the case of an adaptive phased array radar. Optimum adaptive array processing involves the calculation of the set of N adaptive weights (denoted by the vector w) to maximize the array gain

p(w) = |w^H s|^2 / (w^H R w),  (1)

where the numerator is the received signal power from the desired direction (specified by the steering vector s), the denominator is the average received power in all directions, and R is the noise covariance matrix. We use lowercase and uppercase boldface letters to denote vectors and matrices, respectively. Maximizing p(w) as a function of w is equivalent to maximizing the SNR at the output of the array. The weights w that achieve this solve the LAEs:

Rw = s, (2)

that is,

w = R^{-1} s.  (3)

This is referred to as a supergain or superdirective design. This solution is very sensitive to computational errors in the w solution, to additive antenna noise, and to differences in the locations of the antenna elements.
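To make Eqs. (1)-(3) concrete, the following sketch computes the supergain weights and the array gain numerically. The matrix R and steering vector s here are illustrative stand-ins of our own choosing, not the paper's radar data.

```python
import numpy as np

# Illustrative Hermitian covariance R and steering vector s (not the paper's data).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
R = A @ A.conj().T + 0.1 * np.eye(4)   # Hermitian positive definite
s = np.ones(4, dtype=complex)

def gain(w):
    """Array gain p(w) = |w^H s|^2 / (w^H R w) of Eq. (1)."""
    return abs(np.vdot(w, s)) ** 2 / np.real(np.vdot(w, R @ w))

# Supergain solution of Eqs. (2)-(3): w = R^{-1} s.
w_opt = np.linalg.solve(R, s)

# Any other weight vector achieves no larger gain (Cauchy-Schwarz argument).
w_other = rng.standard_normal(4) + 1j * rng.standard_normal(4)
```

Since p(w) is a generalized Rayleigh quotient, gain(w_opt) equals s^H R^{-1} s and upper-bounds gain(w) for every other choice of w.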

If P(θ) and P_0(θ) are the power received in the direction θ with and without computational errors in w, and P*(θ) is a normalization factor (the nominal power received in the direction θ with no antenna noise and no errors in w), it can be shown8 that

E{P(θ)}/P*(θ) = P_0(θ)/P*(θ) + c σ_n^2 α_w,  (4)

where E{·} is the expected value operator, c is a function of the antenna, σ_n^2 is the variance of the noise power, and α_w = (w^H w)/|w^H s|^2 is referred to as a sensitivity factor. The second term in Eq. (4) is the error, and reducing α_w thus makes E{P(θ)} closer to P_0(θ) for all θ. The purpose of the algorithm to be discussed is to reduce the factor α_w. This factor α_w is a function of w and measures the susceptibility of the gain p(w) in Eq. (1) to random computational or other errors in w.

In the algorithm to be considered, we maximize p(w) with the constraint that α_w lies close to a specified value α_0. This is equivalent to minimizing

1/p(w) + k(α_w − α_0),  (5)

where k is a Lagrange multiplier [which is the sensitivity of 1/p(w) with respect to the constraint on α_w]. It can be shown8 that the solution to Eq. (5) is also the solution of the LAEs:

(R + kI)w = s, (6)

where I is the identity matrix. The specified sensitivity factor value α_0 and the Lagrange multiplier k are related by

α_0 = {s^H [R + kI]^{-H} [R + kI]^{-1} s} / |s^H [R + kI]^{-1} s|^2.  (7)

One can specify α_0 and then determine k. However, we will specify k and then determine α_0, since it is computationally easier. This is also attractive since the new LAEs in Eq. (6) that result from nonzero choices for k are seen to be a perturbed version of the original LAEs in Eq. (2), with k being the perturbation added to the diagonal elements of R. One cannot determine k analytically, and thus we employ ridge regression techniques9 to determine it. This involves calculation of w(k) for several k values and selecting the largest k such that w(k) remains approximately constant (stable) for larger k values [i.e., w(k) is stable with respect to the perturbations k].
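The k-selection procedure can be sketched numerically as below. The matrix, vector, and grid of k values are illustrative assumptions of our own; only the recipe, solving Eq. (6) over a grid of k and watching w(k) stabilize, follows the text.

```python
import numpy as np

# Illustrative ill-conditioned Hermitian R and steering vector s (not the paper's data).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
R = A @ A.conj().T + 1e-3 * np.eye(4)
s = np.ones(4, dtype=complex)

def weights(k):
    """Desensitized weights of Eq. (6): (R + kI) w = s."""
    return np.linalg.solve(R + k * np.eye(4), s)

def sensitivity(w):
    """Sensitivity factor alpha_w = (w^H w) / |w^H s|^2."""
    return np.real(np.vdot(w, w)) / abs(np.vdot(w, s)) ** 2

# Ridge-regression-style sweep: compute w(k) on a grid of k values and
# measure how much w changes between successive grid points.
ks = [0.0, 0.01, 0.02, 0.05, 0.1, 0.2]
ws = [weights(k) for k in ks]
changes = [np.linalg.norm(w2 - w1) / np.linalg.norm(w1)
           for w1, w2 in zip(ws, ws[1:])]
# A k value is deemed "stable" once the successive relative change in w(k)
# drops below a chosen tolerance.
```

Increasing k always lowers the sensitivity factor, at the cost of perturbing the original problem more, which is why the sweep stops at the smallest k that already gives a stable w(k).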

III. Optical Processor

The optical processor considered is shown in Fig. 1. It has been extensively discussed in the literature.10

This system is shown with five input laser diodes (LDs) imaged through separate regions of an acoustooptic (AO) cell whose output is integrated at plane P3. With a vector b fed to the LDs and with three columns of the matrix A fed one column in parallel (with its rows frequency-multiplexed) to the AO cell, the P3 outputs on the three detectors shown are the three vector inner products of b and the three columns of A, i.e., the matrix-vector product Ab. The algorithm described in Sec. II (and the general trends that we will show in Sec. IV) apply to any optical linear algebra processor architecture. The system of Fig. 1 was chosen since it has been extensively analyzed in the literature.


Our concern is the effect of errors in the analog system of Fig. 1. The major error sources11 we consider include: spatial-gain errors in the LD point modulators, acoustic attenuation (α) in the AO cell, and detector noise. The first error source is modeled as a spatially and temporally fixed Gaussian random variable multiplying the point modulator inputs, and the detector errors are modeled as temporally uncorrelated Gaussian random variables added to the detector outputs.
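As a rough numerical stand-in for these error sources (error magnitudes and the scaling of the detector noise are illustrative choices, and the acoustic attenuation term is omitted), the analog matrix-vector product can be modeled as:

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_matvec(A, b, gain_err=0.01, det_noise=0.005):
    """Simplified model of A @ b on an analog processor such as Fig. 1.

    gain_err:  std. dev. of the fixed multiplicative error on each LD input.
    det_noise: std. dev. of additive detector noise, relative to the output norm.
    (Acoustic attenuation in the AO cell is omitted from this sketch.)
    """
    g = 1.0 + gain_err * rng.standard_normal(b.shape)   # spatial LD gain errors
    y = A @ (g * b)                                     # analog inner products
    y = y + det_noise * np.linalg.norm(y) * rng.standard_normal(y.shape)
    return y

A = rng.standard_normal((3, 5))
b = rng.standard_normal(5)
rel_err = (np.linalg.norm(noisy_matvec(A, b) - A @ b)
           / np.linalg.norm(A @ b))
```

Feeding such a noisy product into a solution algorithm is what produces the weight errors quantified in Sec. IV.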

IV. Simulated Results

The adaptive phased array radar problem we considered to quantify the improvement possible with the algorithm of Sec. II was a four-element end-fire phased array with elements spaced by λ/4 and with the azimuth angle θ measured from the array normal.

If the condition number of R is C1 = e_max/e_min, where the e_n are the magnitudes of the eigenvalues, the condition number of the perturbed matrix R + kI in Eq. (6) is C2 = (e_max + k)/(e_min + k). Thus, as k increases, C2 decreases, and we expect a lower accuracy processor to be sufficient. This is the insight behind why the perturbed problem in Eq. (6) permits a solution on a lower accuracy processor. We cannot reduce C2 to arbitrary levels, since the sensitivity α_w will saturate as k exceeds some level. Thus, we select k to reduce C2 to allow a solution with a processor of specified accuracy. The effect of the condition number on performance and errors has been detailed elsewhere.12 For a given algorithm and from this prior theory, it should be possible to relate the required k and C2 to the accuracy of the chosen processor. Let us now quantify the improvements possible, the k values used, and the allowed optical system errors for several scenarios.
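A quick check of the C2 = (e_max + k)/(e_min + k) relation, using hypothetical eigenvalue magnitudes rather than the paper's:

```python
import numpy as np

# Hypothetical eigenvalue magnitudes of a 4x4 covariance matrix R
# (illustrative values, not the paper's).
e = np.array([0.05, 0.5, 2.0, 4.0])

C1 = e.max() / e.min()          # condition number of R

def C2(k):
    """Condition number of the perturbed matrix R + kI."""
    return (e.max() + k) / (e.min() + k)

# C2 falls monotonically as k grows, approaching 1 as k -> infinity,
# which is why the perturbed problem tolerates a lower-accuracy processor.
```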

Table I. Effect of the Lagrange Parameter on the Adaptive Weight Problem

  Lagrange       Condition    Array gain      Sensitivity factor
  parameter k    number C2    p[w(k)] (dB)    α[w(k)] (dB)
  0              87.1         11.06            3.15
  0.02           46.8         10.69            0.87
  0.1            16.9          9.29           -2.88

We first consider the array with isotropic external noise added to each received antenna element. A normalized R matrix is used to make the results independent of the variance σ_n^2 of the noise, and s is chosen to steer the antenna at θ = 90°. From our ridge regression analysis, we found w(k) to stabilize at k ≈ 0.1 [i.e., further increases in k will reduce p(w) by more than 2 dB from the ideal and will result in negligible further reductions in the sensitivity factor α_w]. Table I summarizes the effect of k. We see that as k increases, the condition number C2 decreases (by a factor of 5 for the example shown), the array gain drops by only 1.77 dB, and the sensitivity factor (which is a measure of how much errors in the calculated w will change the performance from the ideal case) is reduced significantly, by ≈6 dB (lower values for α_w are preferable).

From the reduced C2 value obtained as k increases, we expect an analog processor with a given accuracy to perform better for reduced C2 cases. We tested and quantified this concept for three different solution algorithms for Eq. (6). These are listed in column 1 of Table II. Their implementation on the system of Fig. 1 has been detailed elsewhere,12-14 and thus their steps are not repeated here. The optical system errors assumed are noted in column 2. They include spatial gain errors in the LDs, acoustic attenuation α (we assume a 1-mm AO cell), and detector noise. The error values listed are different for the direct and iterative algorithms, with the values used chosen to yield approximately the same percentage error in the calculated weights for the original problem in Eq. (2) with k = 0. We note that the iterative algorithm is more sensitive to acoustic attenuation and that the direct algorithms are more sensitive to detector noise, in agreement with prior results.11 Column 3 lists the perturbation k used, and columns 4-6 give the results for different performance measures. The percent error in the norm of the calculated weights (column 4) measures the error in the calculated w with respect to the ideal w. As seen, the percent error in the norm is quite large and nearly equal (56-66%) for k = 0 for all algorithms (due to our choice of the values for the optical system error sources). Column 5 lists the difference in decibels in the gain p(w) from the ideal gain in Eq. (1) as k increases. The major purpose of this paper is to quantify the improvement obtained in these performance measures as k is increased in our constraint algorithm. As seen, the percent error in the calculated w improves from ≈56-66% to ≈4% as k is increased. This is a significant improvement (by a factor of 15) in the accuracy of the calculated result from the optical processor. The difference in the gain p(w) in Eq. (1) is only a negligible 0.02-0.09 dB for these k = 0.1 weight errors. Thus, significant improvements in performance result from this algorithm. We note that the percent error in the gain will be significantly less (here by a factor of 10) than the percent error in the weights (in agreement with earlier data11). Figure 2 shows these effects pictorially in terms of the antenna patterns that result with k = 0 and k = 0.1 for the LDLT algorithm with (solid lines) and without (dashed lines) optical error sources. Figure 2(a) shows considerable differences with and without optical system errors when our algorithm was not used (k = 0). When k = 0.1 [Fig. 2(b)], our new algorithm results in quite similar antenna patterns, even with optical system error sources and their resultant errors in the calculated w present.

Table II. Examples of the Algorithm's Desensitivity to Optical System Error Sources with Isotropic Antenna Noise

  Algorithm used   Error sources present       Lagrange       Error in weights   Error in gain
                                               parameter k    Δ‖w‖ (%)           Δp (dB)
  Direct LDLT      1% spatial LD errors,       0              66.54              0.34
  Cholesky         α = 0.1 dB/cm,              0.02           14.1               0.28
  decomposition    0.05% detector noise        0.1             4.0               0.02
  Direct LU        1% spatial LD errors,       0              66.49              0.34
  decomposition    α = 0.1 dB/cm,              0.02           13.9               0.28
                   0.05% detector noise        0.1             4.01              0.03
  Iterative        1% spatial LD errors,       0              56.29              0.30
  Richardson       α = 0.02 dB/cm,             0.02            8.97              0.19
                   0.6% detector noise         0.1             4.36              0.09

Fig. 2. Antenna patterns for isotropic noise with (solid lines) and without (dashed lines) optical system errors in the calculation of the weights for the LDLT algorithm: (a) with k = 0 (no desensitivity) and (b) with the Lagrange parameter k = 0.1. [Panels plot gain vs azimuth angle in degrees; graphics not recoverable from the transcript.]
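The qualitative Table II behavior, weight errors shrinking as k improves the conditioning, can be reproduced with a toy model. The matrix, its eigenvalues, and the use of a small random data perturbation as a stand-in for the analog error sources are all our own illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic ill-conditioned symmetric "covariance" with chosen eigenvalues
# (illustrative; not the paper's matrix).
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
R = Q @ np.diag([1e-3, 0.5, 2.0, 5.0]) @ Q.T
s = np.ones(n)

def percent_weight_error(k, noise=1e-3):
    """Percent error in w when (R + kI)w = s is solved with ~0.1% data errors."""
    M = R + k * np.eye(n)
    w_true = np.linalg.solve(M, s)
    # Small relative perturbation of the data: a crude stand-in for the
    # analog computational errors of the optical processor.
    M_err = M * (1.0 + noise * rng.standard_normal(M.shape))
    s_err = s * (1.0 + noise * rng.standard_normal(s.shape))
    w_err = np.linalg.solve(M_err, s_err)
    return 100.0 * np.linalg.norm(w_err - w_true) / np.linalg.norm(w_true)

# Average over trials: the k = 0 system (condition number ~5000) amplifies
# the data errors far more than the k = 0.1 system (condition number ~50).
err_k0 = np.mean([percent_weight_error(0.0) for _ in range(50)])
err_k1 = np.mean([percent_weight_error(0.1) for _ in range(50)])
```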

Next, we consider the case when the external noise is directional (i.e., corresponding to jammers at specific angles). In this case, we expect the antenna pattern to produce a null at the jammer's location (whereas, in the case of isotropic noise, the antenna pattern sidelobe structure in Fig. 2 is not easily predicted). Table III shows the results obtained with the LDLT algorithm for different k values for the case of 90% directional noise (a jammer at θ = 30°) and 10% isotropic noise present. We note that some isotropic antenna noise is necessary or the condition number of R will be infinite. The k values considered were determined by a ridge regression analysis, which showed w(k) to stabilize at k = 0.01. The lower k value here results since e_min was much less in this case. We also note that the condition number increases significantly with an increase in directional noise (but is independent of the angle of the jammer). For k = 0, we calculated C1 = 1237, which was reduced to C2 = 290 for k = 0.01. We used only the LDLT Cholesky decomposition and backsubstitution algorithm here (since R is Hermitian). The optical system error sources used are noted in column 1, the Lagrange parameter in column 2, and the performance measures in columns 3 and 4. As before, we observe a significant, factor of 12, improvement in the percent error in the solution calculated with optical system errors (from 72.5% to 6.2%) when the desensitization constraint algorithm is used, and an insignificant (0.08-dB) difference in the gain p(w). Thus, the desensitized solution algorithm also performs well for the case of directional jammers.
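As an aside on how such a covariance can be formed (our own construction, with illustrative power levels and an assumed steering-vector convention): a single jammer contributes a rank-1 term, and the isotropic floor is what keeps R invertible.

```python
import numpy as np

def steering(theta_deg, n=4, d_over_lambda=0.25):
    """Plane-wave steering vector for a uniform linear array, elements lambda/4 apart."""
    m = np.arange(n)
    return np.exp(2j * np.pi * d_over_lambda * m * np.sin(np.radians(theta_deg)))

v_j = steering(30.0)                       # jammer at theta = 30 deg

R_dir = 0.9 * np.outer(v_j, v_j.conj())    # 90% directional noise power (rank 1)
R = R_dir + 0.1 * np.eye(4)                # plus 10% isotropic noise power

# Without the isotropic term, R_dir alone is singular (infinite condition
# number), which is why some isotropic antenna noise is necessary.
```

The sin(θ) phase term is the standard linear-array convention with the angle measured from the array normal, matching the geometry stated in Sec. IV.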

The graphical radiation pattern data obtained with the w(k = 0.01) weights applied with (solid lines) and without (dashed lines) optical error sources are shown in Fig. 3. These data clearly show nearly identical resultant antenna patterns and hence the successful desensitivity of the algorithm to optical processor error sources and errors in the calculated weights. With a perfect optical system, a 43-dB null occurs at the θ = 30° jammer location (the dashed line). The null depth obtained with optical system errors and the associated errors in the calculated weights was 41 dB (the solid line). These two null depths are quite close and not significantly different from the 48-dB null that would have been obtained from the original algorithm with no isotropic antenna noise present. Thus, the algorithm preserves the null depth and its location as well as desensitizing the solution to processor and antenna noise and positional errors.

Table III. Example of the Algorithm's Desensitivity to Optical System Error Sources for the LDLT Solution with Directional and Isotropic Noise

  Error sources present      Lagrange       Error in weights   Error in gain
                             parameter k    Δ‖w‖ (%)           Δp (dB)
  0.5% spatial LD errors,    0              72.5               0.63
  α = 0.05 dB/cm,
  0.005% detector noise      0.01            6.2               0.08

Fig. 3. Antenna patterns for directive noise (a jammer at 30°) plus additive antenna noise with (solid line) and without (dashed line) optical system errors in the calculation of the weights for the LDLT solution using our desensitivity algorithm with k = 0.01. [Plot of gain vs azimuth angle in degrees; graphics not recoverable from the transcript.]

V. Summary and Conclusion

We have presented an algorithm that desensitizes the calculated results (obtained on an analog optical processor) to error sources in the processor, to errors due to noise in the input data, and to errors in the problem definition (i.e., errors in the locations of the antenna elements for the case of a phased array). A ridge regression technique is used to determine the perturbation parameter to use in the algorithm. Calculation of the weights in an adaptive phased array radar was used as a case study to demonstrate and quantify the results obtained. We found a significant improvement (by a factor of 10-15) in the accuracy of the calculated weights using this new algorithm, that optical system error sources have a negligible (4-6%) effect on the results (for matrices with condition numbers in excess of 1200), and that this is achieved with negligible differences (0.02-0.08 dB) in the array gain. The algorithm and its performance were demonstrated for three different optical linear algebra algorithms (direct LDLT and LU decomposition, and the iterative Richardson algorithm) and for the cases of isotropic antenna noise and the combination of directional jammers and additive antenna noise. This algorithm appears to be quite attractive and appropriate for use on analog optical processors.

References

1. Special Issue on Optical Interconnections, Opt. Eng. 25, No. 10 (Oct. 1986).
2. D. Psaltis and N. Farhat, "Optical Information Processing Based on an Associative-Memory Model of Neural Nets with Thresholding and Feedback," Opt. Lett. 10, 98 (1985).
3. R. Krishnapuram and D. Casasent, "Optical Associative Processor for General Linear Transformations," Appl. Opt. 26, 3641 (1987).
4. Special Issue on Optical Computing, Proc. IEEE (July 1984).
5. D. Psaltis and R. A. Athale, "High Accuracy Computation with Linear Analog Optical Systems: a Critical Study," Appl. Opt. 25, 3071 (1986).
6. E. Pochapsky and D. Casasent, "Complex Data Handling in Analog and High-Accuracy Optical Linear Algebra Processors," Proc. Soc. Photo-Opt. Instrum. Eng. 752, 155 (1987).
7. C. Neuman, D. Casasent, and R. Baumbick, "An Electro-Optical Processor for the Optimal Control of F100 Aircraft Engines," in Proceedings, EOSD (Nov. 1981), pp. 311-320.
8. E. Gilbert and S. Morgan, "Optimum Design of Directive Antenna Arrays Subject to Random Variations," Bell Syst. Tech. J. 34, 637 (1955).
9. B. Repasky and B. Breed, "Application of Ridge Regression Analysis in Optimum Array Processing," in Proceedings, ICASSP 80 (CH1559-4/80) (Apr. 1980), Vol. 1, pp. 299-302.
10. D. Casasent and J. Jackson, "Space and Frequency-Multiplexed Optical Linear Algebra Processor: Fabrication and Initial Tests," Appl. Opt. 25, 2258 (1986).
11. D. Casasent and A. Ghosh, "Optical Linear Algebra Processors: Noise and Error-Source Modeling," Opt. Lett. 10, 252 (1985).
12. D. Casasent, A. Ghosh, and C. Neuman, "A Quadratic Matrix Algorithm for Linear Algebra Processors," J. Large-Scale Syst. 9, 35 (1985).
13. D. Casasent and A. Ghosh, "LU and Cholesky Decomposition on an Optical Systolic Array Processor," Opt. Commun. 46, 270 (1983).
14. A. Ghosh, D. Casasent, and C. Neuman, "Performance of Direct and Iterative Algorithms on an Optical Systolic Processor," Appl. Opt. 24, 3883 (1985).

The support of this research by a grant from the Air Force Office of Scientific Research (AFOSR-84-0239) is gratefully appreciated and acknowledged.
