Matrix Formulation for Linear and First-Order Nonlinear ...

E.Fontana MatrixFormulationforLinearandFirst-OrderNonlinearRegressionAnalysiswithMultidimensionalFunctions

1

MatrixFormulationforLinearandFirst-OrderNonlinearRegressionAnalysiswithMultidimensionalFunctions

EduardoFontana

Dep.deEletrônicaeSistemasUniversidadeFederaldePernambuco

UFPE–CTG-BlocoA,Sala421,Recife–PE50740-550BrasilE-mail:[email protected]

Published:January20,2014Abstract:Amatrixformulationispresentedtoallowlinearandfirst-ordernonlinearregressionanalysiswith multidimensional model functions. Examples of pseudocodes are presented illustrating theimplementationoftheformulationfornumericalcomputation.

1.Introduction

Regressionanalysisisanimportanttoolforthedeterminationoftherelationship

between variables from experimental datameasured in physical, biological, statistical

and other phenomena. If a goodmodel function is known to govern the experiment,

parameterscanbeinferredbyfittingthemodelfunctiontothedata.

The formulation broadly available in the literature employs a summation

procedurethatisgeneralizedtoamatrixmethodtoderivetheexpressionsthatleadto

thefinalsetofequationsfortheunknownparameters[1].Inthispaper,oneemploysa

matrix formulation [2]directly toease thedevelopmentof structuredalgorithms.The

formulation is derived for both linear and first-order nonlinear regression analysis.

Examples are shown on how to determine the intermediatematrices that lead to the

solution of the problem for both the linear and nonlinear cases. A pseudocode is

presentedforthedevelopmentofsimplenonlinearregressionalgorithms.

2.StatementoftheProblem Considerarealfunction modelingaphysicalquantityw,with

X = x1 x2 ... xM( )T , (1)

representinganindependentvariable,M-element,columnmatrix1and

1Inthepresentformulationmatricesarerepresentedascapitalletterswithanappendedtildesymbol.ThetransposeofamatrixisrepresentedbyaddingthesuperscriptTtothematrix.Them-throwisrepresentedbyappendingthesubscriptmtothematrix,andthen-thcolumn,byappendingthesuperscript<m>.

f X, A( )


2

A = a1 a2 ... aL( )T , (2)

acolumnmatrixwithLparametersthatdefinethemodelfunction.

Forinstance,forthemodelfunction

f = a + bx + cy + dxy ,

M=2andL=4andthematrices X and A are

X = x y( )T A = a b c d( )T .

Inthefollowing X willbereferredtoastheindependentvectorand A ,asthe

parametervector.Inpractice,thephysicalquantitywismeasuredonNdistinctpoints

representedasacolumnmatrix

W = w1 w2 ... wN( )T . (3)

The problem in regression analysis is to determine the parameter vector A

yieldingafunctionfthatbestdescribesthemeasuredquantityw.Onewaytosolvethis

problem is to determine a solution A that minimizes the square error [1]. This

parameter,inmatrixnotation,canbewrittenintheform

ξ ≡ W − F( )T W − F( ) , (4)

with

F = f Y 1 , A( ) f Y 2 , A( ) ... f Y N , A( )⎛⎝⎜

⎞⎠⎟T

(5)

representingacolumnmatrixinwhichthen-thelementisthefunctionfevaluatedatthe

point

X = Y n , (6)

with Y n representingthen-thcolumnoftheM × N matrix

Y = Y 1 Y 2 ... Y N( ) (7)

Given that ξ is a positive definite function, if a localminimumexists, it can be

obtainedbyimposingthecondition


3

∇ξ = 0 (8)

with

∇ ≡∂∂a1

∂∂a2

... ∂∂aL

⎛

⎝⎜

⎞

⎠⎟

T

(9)

representingthenablaoperator,intheL-dimensionalparameterspace,and

0 ≡ 0 0 ... 0( )T , (10)

theL-dimensionalnullvector.

Expression(8)representsasetofL-equationsontheLunknownparameters.Iff

isagoodmodelfunctionfortheproblem,ingeneralonlyminimawillbeobtained.The

bestsolutioncanthenbechosenasthatyieldingthesmallestvalueof ξ .

As detailed in the following sections, an exact solution can be obtained for the

caseofmodelfunctionsthatarelinearintheparameters.Fornonlinearmodelfunctions

anumericalsolutionhastobeobtained.

3.LinearRegression

Afunctionthatislinearintheelementsoftheparametervectorcanbewrittenin

theform

f X, A( ) = G X⎡⎣ ⎤⎦T A , (11)

with

G X = g1 X( ) g2 X( ) ... gL X( )( )T , (12)

being an L-element vector that defines the distinct functions associated to the L

elements of the parameter vectors. Superscript X is used to allow defining a more

general rectangular G matrix from which G Xwould represent a given column,

associatedwith the independentvector X . Thiscanbebetterunderstoodbynoticing

that(5)canbecastintotheform

F = GT A , (13)


4

with

G = G 1 G 2 ... G N( ) , (14)

representinganL × N matrixwiththen-thcolumnbeingdefinedaccordingto(12),i.e.,

G n = g1 Yn( ) g2 Y

n( ) ... gL Yn( )⎛

⎝⎜⎞⎠⎟T

(15)

Forexample,forthefunction

f = aexp(xy) + bcos x + y( )+ csin xyz( )+ d ,onehasM=3,L=4andonecoulddefine,accordingto(1),(2),(11)and(12),

X = x y z( )T , A = a b c d( )T . g1 X( ) = exp x1x2( ) , g2 X( ) = sin x1 + x2( ) , g3 X( ) = sin x1x2x3( ) , g4 X( ) =1 ,i.e.,

G X = exp x1x2( ) sin x1 + x2( ) sin x1x2x3( ) 1( )T ,with x1 = x, x2 = y, x3 = z .

To determine the solution to the parameter vector, the differential operation

givenby(8)isfirstcomputedusing(4)yielding

∇ξ = ∇ WT W − WT F − FT W + FT F( ) . (16)

Noticingthat W isindependentoftheparametervector,oneobtains

∇ WT F + FT W − FT F( ) = 0 (17)

Thel-thelementof(17)isoftheform


5

∂∂al

WT F + FT W − FT F( ) = 0 ,

orequivalently

∂ FT

∂alW − F( ) = 0 (18)

Forthelatterexpression,theproperty

AT B = BT A , (19)

wasused.

Before proceeding, it is instructive to introduce the generalized derivative

operator,foravectorofKelements,

′H ≡ ∇h1∇h2 ... ∇hK( )T . (20)

Expression(20)representsaK × L matrix.Accordingtothisdefinition,setting H = F ,

(18)canbegroupedassetofLequationsoftheform

′F T W − F( ) = 0 , (21)

with

′F T = ∇f Y 1 , A( ) ∇f Y 2 , A( ) ... ∇f Y N , A( )⎛⎝

⎞⎠ , (22)

representingan L× N matrixand 0 thenullvectoroflengthL.

Noticethat,byuseof(13),

∂ F∂al

= GT ∂ A∂al

,

orequivalently

∂ FT

∂al= ∂ AT

∂alG , (23)

whichcanbegeneralized,byuseofthedefinition(20)totherelation

′F T = ′A T G . (24)

Byapplyingthedefinition(22)tovector A ,itisstraightforwardtoshowthat

′A = ℑL , (25)

with ℑL representingthe L × L identitymatrix.From(24)and(25)oneobtains

′F T = G , (26)


6

Inserting(26)into(21)yieldsFrom(13),(18)canbecastintotheform

G W − F( ) = 0 (27)

By inserting (13) into (23) one obtains, after a few algebraicmanipulations, the final

solutionfortheunknownparametervector,

A = G GT( )−1 G W( ) (28)

Noticethat G GT isan L × L squarematrixand G W isavectoroflengthL.

4.First-OrderNonlinearRegression

Forafunctionthatisnonlinearontheelementsoftheparametervector,onecan

obtain an approximate solution by making a first-order linear approximation for the

modelvector.Smallcorrectionsareobtainedbyiteration.Atthek-thiterationstep,one

obtainsasetofLlinearequationsforthecorrectiontobemadeontheparametervector.

In order to develop the procedure, let A k( ) the parameter vector at thek-th iterationstep.Giventhisvector,oneexpandsthemodelfunction,tofirstorderintheform

f X, A( ) = f X, A k( )⎡⎣ ⎤⎦ +∇T f X, A k( )⎡⎣ ⎤⎦ A − A k( )⎡⎣ ⎤⎦ . (29)

Usingthedefinition(5),(29)canbegeneralizedtotheform

F A( ) = F k( )+ ′F k( ) A − A k( )⎡⎣ ⎤⎦ (30)

with

′F k( ) ≡ ∇f Y 1 , A k( )( ) ∇f Y 2 , A k( )( ) ... ∇f Y N , A k( )( )⎛⎝

⎞⎠

T

(31)

representing theN × L generalizedderivativematrixof thevector F , according to the

definition(20),asper(22).In(30)oneusesthenotation F k( ) = F A k( )⎡⎣ ⎤⎦ .

In order to determine the correction to be obtained at each iteration step, one

considerstheconditiongivenby(21),re-writtenas


7

′F A( )⎡⎣ ⎤⎦T W − F A( )⎡⎣ ⎤⎦ =

0 . (32)

Using(30)intheaboveequationandnoticingthattofirstorder

′F A( ) ≈ ′F k( ) ,

yields

′F k( )⎡⎣ ⎤⎦T W − F k( )− ′F k( ) A − A k( )⎡⎣ ⎤⎦{ } = 0 , (33)

whichafterafewalgebraicmanipulationsyields,

Δ A k( ) = ′F k( )⎡⎣ ⎤⎦T ′F k( ){ }−1

′F k( )⎡⎣ ⎤⎦T W − F k( )⎡⎣ ⎤⎦{ } , (34)

with

Δ A k( ) ≡ A− A k( ) , (35)

representingthedifferentialcorrectionintheparametervector.

Expression (34) is calculated iteratively until differential correction becomes

smallerthanacertainpreseterrorparameter.

In the following section, examples on how to define thematrices for both the

linearandnonlinear casesare shown.Apseudocode is shown for thedevelopmentof

algorithmsforthecaseofnonlinearregressionanalysisofdata.

5.Examples

5.1Linearregressionexample ConsideronceagaintheexampleofSection3with

f = aexp(xy) + bcos x + y( )+ csin xyz( )+ d ,


8

onehasM=3,L=4.AssumethatNvaluesareobtainedforthequantityw,i.e.

W = w1 w2 ... wN( )T . (36)

From(7)

Y =

x1 x2 ... xNy1 y2 ... yNz1 z2 ... zN

⎛

⎝

⎜⎜⎜⎜

⎞

⎠

⎟⎟⎟⎟

. (37)

From(12)

G X = exp xy( ) sin x + y( ) sin xyz( ) 1( )T (38)

andfrom(15),

G =

exp x1y1( ) exp x2y2( ) ... exp xN yN( )sin x1 + y1( ) sin x2 + y2( ) ... sin xN + yN( )sin x1y1z1( ) sin x2y2z2( ) ... sin xN yN zN( )

1 1 1 1

⎛

⎝

⎜⎜⎜⎜⎜

⎞

⎠

⎟⎟⎟⎟⎟

(39)

Oneobtainsa4× N Gmatrixandexpressions(36)and(39)aresufficienttodetermine

thesolutiongivenby(28).

5.2Nonlinearregressionexample

Forthesakeofsimplicity,considerthe3-parametergaussianmodelforasingle

variablefunction

f x, A( ) = aexp − x − x( ) / w⎡⎣ ⎤⎦2⎧

⎨⎩

⎫⎬⎭, (40)

wherea is theamplitude, x is thecentroidandw is thehalfwidthof thegaussian.The

parametervectoris

A = a x w( )T (41)

Assumeonceagain thatasetofNdatapoints isobtained, representedby(36).

Accordingto(34)allthatisnecessaryistodeterminethematrices F and ′F .Assuming

thatatthek-thiterationtheparametervectorisgivenby

A k( ) = a k( ) x k( ) w k( )( )T . (42)

From(5)


9

F k( ) =

a k( )exp − x1 − x k( )⎡⎣ ⎤⎦ / w k( ){ }2{ }a k( )exp − x2 − x k( )⎡⎣ ⎤⎦ / w k( ){ }2{ }

...

a k( )exp − xN − x k( )⎡⎣ ⎤⎦ / w k( ){ }2{ }

⎛

⎝

⎜⎜⎜⎜⎜⎜⎜⎜

⎞

⎠

⎟⎟⎟⎟⎟⎟⎟⎟

, (43)

andfrom(31),oneobtains,afterafewalgebraicmanipulations

′F k( ) =

e− x1−x k( )⎡⎣ ⎤⎦/w k( ){ }2 2a k( ) x1 − x k( )⎡⎣ ⎤⎦ / w k( )⎡⎣ ⎤⎦

2⎧⎨⎩

⎫⎬⎭e− x1−x k( )⎡⎣ ⎤⎦/w k( ){ }2 2a k( ) x1 − x k( )⎡⎣ ⎤⎦

2/ w k( )⎡⎣ ⎤⎦

3⎧⎨⎩

⎫⎬⎭e− x1−x k( )⎡⎣ ⎤⎦/w k( ){ }2

e− x2−x k( )⎡⎣ ⎤⎦/w k( ){ }2 2a k( ) x2 − x k( )⎡⎣ ⎤⎦ / w k( )⎡⎣ ⎤⎦

2⎧⎨⎩

⎫⎬⎭e− x2−x k( )⎡⎣ ⎤⎦/w k( ){ }2 2a k( ) x3 − x k( )⎡⎣ ⎤⎦

2/ w k( )⎡⎣ ⎤⎦

3⎧⎨⎩

⎫⎬⎭e− x3−x k( )⎡⎣ ⎤⎦/w k( ){ }2

... ... ...

e− xN −x k( )⎡⎣ ⎤⎦/w k( ){ }2 2a k( ) xN − x k( )⎡⎣ ⎤⎦ / w k( )⎡⎣ ⎤⎦

2⎧⎨⎩

⎫⎬⎭e− xN −x k( )⎡⎣ ⎤⎦/w k( ){ }2 2a k( ) xN − x k( )⎡⎣ ⎤⎦

2/ w k( )⎡⎣ ⎤⎦

3⎧⎨⎩

⎫⎬⎭e− xN −x k( )⎡⎣ ⎤⎦/w k( ){ }2

⎛

⎝

⎜⎜⎜⎜⎜⎜⎜⎜⎜

⎞

⎠

⎟⎟⎟⎟⎟⎟⎟⎟⎟

. (44)

Together with (36), (43) and (44) are sufficient to determine the differential

correctionfortheparametervector.

5.3Pseudocodefornonlinearregression Table I shows a pseudocode for the computational implementation of theprocedure, for the case of nonlinear regression analysis. As can be noticed from theprogram structure, by use of thematrix formulation presented in this paper one canorganizethealgorithmusingarathersimpleandmodularscheme.

References[1] Douglas M. Bates and Donald G. Watts, “Nonlinear regression analysis and itsapplications,”JohnWiley&Sons,Inc. (1988).

[2]G.B.ArfkenandH. J.Weber,“MathematicalMethodsforPhysicists,”Chapter3,5th.ed.,SanDiego,CA,AcademicPress(2001).


10

TableI–Pseudocodefornonlinearregressionanalysis.1.Initialization:

• Definemodelfunction • ReaddataandstoreintomatrixW• DefineparametervectorA• Setinitialguesstotheparametervector: • Setahighvalueforanerrorparameter: • Definefunction

• Definesystemmatrixfunction

• Defineinputvectorfunction

• Setamaximumvalueforthechangeinmagnitudeoftheparametervector 2.Calculation:

whileerr>ε

A Remarks:

• Mathcadprogrammingstyleisusedasamodel.• Thevectorizeoperation producesamatrixinwhicheachelementistheabsolutevalueofthedifferencebetweenvectors• Thefunctionmax()calculatesthemaximumelementofthevector.• ThelastvalueofAstorestheapproximatesolutiontotheparametervector.

Matrix Formulation for Linear and First-Order Nonlinear ...

Documents

Transcript of Matrix Formulation for Linear and First-Order Nonlinear ...