Data Modeling and Least Squares Fitting
Transcript of Data Modeling and Least Squares Fitting
Data Modeling and Least Squares Fitting
COS 323
Data Modeling
• Given: data points, functional form, find constants in function
• Example: given (x_i, y_i), find line through them; i.e., find a and b in y = ax+b
[Figure: scatter of data points (x_1, y_1) … (x_7, y_7) with fitted line y = ax+b]
Data Modeling
• You might do this because you actually care about those numbers…
– Example: measure position of falling object, fit parabola

$$ p = -\tfrac{1}{2}\,g\,t^2 $$
[Figure: position vs. time with fitted parabola; estimate g from the fit]
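The fit on this slide can be sketched in a few lines of Python (not part of the original lecture; data are synthetic and noise-free, so the fit recovers g exactly). For the one-parameter model p = a t² with a = −g/2, setting the derivative of Σᵢ(pᵢ − a tᵢ²)² to zero gives a = Σᵢ tᵢ²pᵢ / Σᵢ tᵢ⁴:

```python
# Sketch: estimate g by least-squares fitting p = a*t^2, where a = -g/2.
ts = [0.1 * i for i in range(1, 11)]       # sample times (s), made up
g_true = 9.81
ps = [-0.5 * g_true * t * t for t in ts]   # noise-free positions (m)

# Minimizing sum_i (p_i - a t_i^2)^2 gives a = sum(t^2 p) / sum(t^4).
a = sum(t * t * p for t, p in zip(ts, ps)) / sum(t ** 4 for t in ts)
g_est = -2.0 * a
```

With real (noisy) measurements the same formula would give a least-squares estimate of g rather than the exact value.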
Data Modeling
• … or because some aspect of behavior is unknown and you want to ignore it
– Example: measuring relative resonant frequency of two ions, want to ignore magnetic field drift
Least Squares
• Nearly universal formulation of fitting: minimize squares of differences between data and function
– Example: for fitting a line, minimize

$$ \sum_i \bigl( y_i - (a x_i + b) \bigr)^2 $$

with respect to a and b
– Most general solution technique: take derivatives w.r.t. unknown variables, set equal to zero
Least Squares
• Computational approaches:
– General numerical algorithms for function minimization
– Take partial derivatives; general numerical algorithms for root finding
– Specialized numerical algorithms that take advantage of the form of the function
– Important special case: linear least squares
Linear Least Squares
• General pattern:

$$ \text{Given } (x_i, y_i),\ \text{solve for } a, b, c: \qquad y_i = a\,f(x_i) + b\,g(x_i) + c\,h(x_i) $$

• Note that the dependence on the unknowns is linear, not necessarily the function!
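As a sketch in Python (illustrative, not from the lecture): with hypothetical basis functions f, g, h, each data point contributes one row [f(xᵢ), g(xᵢ), h(xᵢ)], and only the unknowns a, b, c need to enter linearly:

```python
import math

# Hypothetical basis functions: any fixed functions of x are allowed,
# because only the dependence on the unknowns a, b, c must be linear.
def f(x): return 1.0
def g(x): return x
def h(x): return math.sin(x)

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
# One row per data point: y_i ≈ a*f(x_i) + b*g(x_i) + c*h(x_i)
A = [[f(x), g(x), h(x)] for x in xs]
```

Fitting y = a + b·x + c·sin(x) this way is still linear least squares, even though the model is not a linear function of x.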
Solving Linear Least Squares Problem
• Take partial derivatives:

$$ E(a,b) = \sum_i \bigl( y_i - a\,f(x_i) - b\,g(x_i) \bigr)^2 $$

$$ \frac{\partial E}{\partial a} = -2\sum_i f(x_i)\bigl( y_i - a\,f(x_i) - b\,g(x_i) \bigr) = 0 \;\Rightarrow\; a\sum_i f(x_i)f(x_i) + b\sum_i f(x_i)g(x_i) = \sum_i y_i\,f(x_i) $$

$$ \frac{\partial E}{\partial b} = -2\sum_i g(x_i)\bigl( y_i - a\,f(x_i) - b\,g(x_i) \bigr) = 0 \;\Rightarrow\; a\sum_i f(x_i)g(x_i) + b\sum_i g(x_i)g(x_i) = \sum_i y_i\,g(x_i) $$
Solving Linear Least Squares Problem
• For convenience, rewrite as a matrix equation:

$$ \begin{pmatrix} \sum_i f(x_i)f(x_i) & \sum_i f(x_i)g(x_i) \\ \sum_i f(x_i)g(x_i) & \sum_i g(x_i)g(x_i) \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_i y_i\,f(x_i) \\ \sum_i y_i\,g(x_i) \end{pmatrix} $$

• Factor:

$$ \begin{pmatrix} f(x_1) & g(x_1) \\ f(x_2) & g(x_2) \\ \vdots & \vdots \end{pmatrix}^{\mathsf T} \begin{pmatrix} f(x_1) & g(x_1) \\ f(x_2) & g(x_2) \\ \vdots & \vdots \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} f(x_1) & g(x_1) \\ f(x_2) & g(x_2) \\ \vdots & \vdots \end{pmatrix}^{\mathsf T} \begin{pmatrix} y_1 \\ y_2 \\ \vdots \end{pmatrix} $$
Linear Least Squares
• There’s a different derivation of this: an overconstrained linear system

$$ A\,\mathbf{x} = \mathbf{b} $$

• A has n rows and m < n columns: more equations than unknowns
Linear Least Squares
• Interpretation: find x that comes “closest” to satisfying Ax = b
– i.e., minimize b – Ax
– i.e., minimize |b – Ax|
– Equivalently, minimize |b – Ax|² or (b – Ax)·(b – Ax)

$$ \min_{\mathbf{x}}\;(\mathbf{b} - A\mathbf{x})^{\mathsf T}(\mathbf{b} - A\mathbf{x}) $$

$$ (\mathbf{b} - A\mathbf{x})^{\mathsf T}(\mathbf{b} - A\mathbf{x}) = \mathbf{b}^{\mathsf T}\mathbf{b} - 2\,\mathbf{x}^{\mathsf T}A^{\mathsf T}\mathbf{b} + \mathbf{x}^{\mathsf T}A^{\mathsf T}A\,\mathbf{x} $$

$$ \frac{\partial}{\partial \mathbf{x}} = -2\,A^{\mathsf T}\mathbf{b} + 2\,A^{\mathsf T}A\,\mathbf{x} = 0 \;\Rightarrow\; A^{\mathsf T}A\,\mathbf{x} = A^{\mathsf T}\mathbf{b} $$
Linear Least Squares
• If fitting data to a linear function:
– Rows of A are functions of x_i
– Entries in b are y_i
– Minimizing sum of squared differences!

$$ A = \begin{pmatrix} f(x_1) & g(x_1) \\ f(x_2) & g(x_2) \\ \vdots & \vdots \end{pmatrix}, \qquad \mathbf{b} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \end{pmatrix} $$

$$ A^{\mathsf T}A = \begin{pmatrix} \sum_i f(x_i)f(x_i) & \sum_i f(x_i)g(x_i) \\ \sum_i f(x_i)g(x_i) & \sum_i g(x_i)g(x_i) \end{pmatrix}, \qquad A^{\mathsf T}\mathbf{b} = \begin{pmatrix} \sum_i y_i\,f(x_i) \\ \sum_i y_i\,g(x_i) \end{pmatrix} $$
Linear Least Squares
• Compare the two expressions we’ve derived – equal!

$$ \begin{pmatrix} \sum_i f(x_i)f(x_i) & \sum_i f(x_i)g(x_i) \\ \sum_i f(x_i)g(x_i) & \sum_i g(x_i)g(x_i) \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sum_i y_i\,f(x_i) \\ \sum_i y_i\,g(x_i) \end{pmatrix} $$

$$ \begin{pmatrix} f(x_1) & g(x_1) \\ \vdots & \vdots \end{pmatrix}^{\mathsf T} \begin{pmatrix} f(x_1) & g(x_1) \\ \vdots & \vdots \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} f(x_1) & g(x_1) \\ \vdots & \vdots \end{pmatrix}^{\mathsf T} \begin{pmatrix} y_1 \\ \vdots \end{pmatrix} $$
Ways of Solving Linear Least Squares
• Option 1:

    for each x_i, y_i:
        compute f(x_i), g(x_i), etc.
        store in row i of A
        store y_i in b
    compute (A^T A)^-1 A^T b

• (A^T A)^-1 A^T is known as the “pseudoinverse” of A
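A minimal sketch of Option 1 in Python for a two-parameter line fit, with the 2×2 inverse written out by hand (everything here is illustrative; in practice one would call a library routine rather than invert explicitly):

```python
# Sketch: pseudoinverse solve for y = a + b*x, so A has two columns.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # synthetic data, exactly y = 1 + 2x

A = [[1.0, x] for x in xs]           # row i holds f(x_i) = 1, g(x_i) = x_i
b = ys                               # entries of b are the y_i

# A^T A is 2x2, so it can be inverted directly.
AtA = [[sum(r[i] * r[j] for r in A) for j in range(2)] for i in range(2)]
Atb = [sum(r[i] * yi for r, yi in zip(A, b)) for i in range(2)]
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
inv = [[ AtA[1][1] / det, -AtA[0][1] / det],
       [-AtA[1][0] / det,  AtA[0][0] / det]]

# x = (A^T A)^-1 A^T b: the pseudoinverse applied to b
a_fit = inv[0][0] * Atb[0] + inv[0][1] * Atb[1]
b_fit = inv[1][0] * Atb[0] + inv[1][1] * Atb[1]
```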
Ways of Solving Linear Least Squares
• Option 2:

    for each x_i, y_i:
        compute f(x_i), g(x_i), etc.
        store in row i of A
        store y_i in b
    compute A^T A and A^T b
    solve A^T A x = A^T b

• These are known as the “normal equations” of the least squares problem
Ways of Solving Linear Least Squares
• These can be inefficient, since A is typically much larger than A^T A and A^T b
• Option 3:

    for each x_i, y_i:
        compute f(x_i), g(x_i), etc.
        accumulate outer product in U
        accumulate product with y_i in v
    solve U x = v
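Option 3 sketched in Python for a two-basis case (f(x) = 1, g(x) = x, chosen for illustration): U and v are accumulated one data point at a time, so A is never stored:

```python
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0, 2.5, 3.0, 3.5, 4.0]       # synthetic data, exactly y = 2 + 0.5x

U = [[0.0, 0.0], [0.0, 0.0]]         # accumulates A^T A
v = [0.0, 0.0]                        # accumulates A^T b

for x, y in zip(xs, ys):
    row = [1.0, x]                    # f(x) = 1, g(x) = x for this sketch
    for i in range(2):
        v[i] += row[i] * y
        for j in range(2):
            U[i][j] += row[i] * row[j]    # outer product of this row

# Solve the 2x2 system U x = v by elimination.
m = U[1][0] / U[0][0]
b_fit = (v[1] - m * v[0]) / (U[1][1] - m * U[0][1])
a_fit = (v[0] - U[0][1] * b_fit) / U[0][0]
```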
Special Case: Constant
• Let’s try to model a function of the form y = a
• In this case, f(x_i) = 1 and we are solving

$$ \Bigl(\sum_i 1\Bigr)\,a = \sum_i y_i \;\Rightarrow\; a = \frac{1}{n}\sum_i y_i $$

• Punchline: the mean is the least-squares estimator for the best constant fit
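A quick Python check of the punchline (numbers are arbitrary): scanning a grid of candidate constants, none beats the mean:

```python
# Sketch: the mean minimizes the sum of squared differences to a constant.
ys = [2.0, 4.0, 9.0]
mean = sum(ys) / len(ys)             # least-squares constant fit

def sse(a):
    return sum((y - a) ** 2 for y in ys)

# Brute-force scan over candidate constants in [0, 10].
best = min(sse(0.01 * k) for k in range(1001))
```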
Special Case: Line
• Fit to y = a + bx

$$ A = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \end{pmatrix}, \qquad A^{\mathsf T}A = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix}, \qquad A^{\mathsf T}\mathbf{b} = \begin{pmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{pmatrix} $$

$$ a = \frac{\sum_i x_i^2 \sum_i y_i - \sum_i x_i \sum_i x_i y_i}{n\sum_i x_i^2 - \bigl(\sum_i x_i\bigr)^2}, \qquad b = \frac{n\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n\sum_i x_i^2 - \bigl(\sum_i x_i\bigr)^2} $$
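The closed-form a and b transcribed into Python (a sketch with synthetic data; the perturbations happen to cancel, so the fit is exact here):

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 6.9, 9.1]            # y = 1 + 2x plus small perturbations

n = len(xs)
Sx  = sum(xs)
Sy  = sum(ys)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))

den = n * Sxx - Sx * Sx              # n*sum(x^2) - (sum x)^2
a = (Sxx * Sy - Sx * Sxy) / den      # intercept
b = (n * Sxy - Sx * Sy) / den        # slope
```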
Weighted Least Squares
• Common case: the (x_i, y_i) have different uncertainties associated with them
• Want to give more weight to measurements of which you are more certain
• Weighted least squares minimization:

$$ \min \sum_i w_i \bigl( y_i - f(x_i) \bigr)^2 $$

• If the uncertainty is σ_i, best to take w_i = 1/σ_i²
Weighted Least Squares
• Define the weight matrix W as

$$ W = \begin{pmatrix} w_1 & & & 0 \\ & w_2 & & \\ & & w_3 & \\ 0 & & & w_4 \end{pmatrix} $$

• Then solve the weighted least squares problem via

$$ A^{\mathsf T} W A\,\mathbf{x} = A^{\mathsf T} W\,\mathbf{b} $$
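A sketch in Python: with W diagonal, AᵀWA and AᵀWb are just the usual sums with a per-point weight multiplied in, so W never has to be built explicitly. Data and weights here are made up; the points lie exactly on y = 1 + 2x, which any positive weighting recovers:

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]            # exactly y = 1 + 2x
ws = [4.0, 1.0, 1.0, 0.25]           # hypothetical weights, e.g. 1/sigma^2

# Entries of A^T W A and A^T W b for rows [1, x].
S1  = sum(ws)
Sx  = sum(w * x for w, x in zip(ws, xs))
Sxx = sum(w * x * x for w, x in zip(ws, xs))
Sy  = sum(w * y for w, y in zip(ws, ys))
Sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))

den = S1 * Sxx - Sx * Sx
a = (Sxx * Sy - Sx * Sxy) / den      # weighted intercept
b = (S1 * Sxy - Sx * Sy) / den       # weighted slope
```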
Error Estimates from Linear Least Squares
• For many applications, finding the values is useless without an estimate of their accuracy
• Residual is b – Ax
• Can compute χ² = (b – Ax)·(b – Ax)
• How do we tell whether the answer is good?
– Lots of measurements
– χ² is small
– χ² increases quickly with perturbations to x
Error Estimates from Linear Least Squares
• Let’s look at the increase in χ² when we perturb x by Δx:

$$ \chi^2 = \bigl(\mathbf{b} - A(\mathbf{x} + \Delta\mathbf{x})\bigr)^{\mathsf T}\bigl(\mathbf{b} - A(\mathbf{x} + \Delta\mathbf{x})\bigr) $$

$$ = (\mathbf{b} - A\mathbf{x})^{\mathsf T}(\mathbf{b} - A\mathbf{x}) - 2\,\Delta\mathbf{x}^{\mathsf T}A^{\mathsf T}(\mathbf{b} - A\mathbf{x}) + \Delta\mathbf{x}^{\mathsf T}A^{\mathsf T}A\,\Delta\mathbf{x} $$

At the minimum, A^T(b – Ax) = 0, so

$$ \Delta\chi^2 = \Delta\mathbf{x}^{\mathsf T}A^{\mathsf T}A\,\Delta\mathbf{x} $$

• So, the bigger A^T A is, the faster the error increases as we move away from the current x
Error Estimates from Linear Least Squares
• C = (A^T A)⁻¹ is called the covariance of the data
• The “standard variance” in our estimate of x is

$$ \sigma_{\mathbf{x}}^2 = \frac{\chi^2}{n - m}\,C $$

• This is a matrix:
– Diagonal entries give the variance of the estimates of the components of x
– Off-diagonal entries explain mutual dependence
• n – m is (# of samples) minus (# of degrees of freedom in the fit): consult a statistician…
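A sketch in Python for a line fit (made-up data, n = 4 samples, m = 2 parameters): fit via the closed form, compute χ² from the residuals, build C = (AᵀA)⁻¹ with the 2×2 inverse, and read variances off the diagonal of (χ²/(n−m))·C:

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.1, 6.9]            # noisy samples of roughly y = 1 + 2x
n, m = len(xs), 2

Sx, Sxx = sum(xs), sum(x * x for x in xs)
Sy, Sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))
den = n * Sxx - Sx * Sx

a = (Sxx * Sy - Sx * Sxy) / den      # intercept
b = (n * Sxy - Sx * Sy) / den        # slope

# chi^2 = (b - Ax).(b - Ax): sum of squared residuals
chi2 = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# C = (A^T A)^-1 for the 2x2 case
C = [[ Sxx / den, -Sx / den],
     [-Sx / den,   n / den]]

# Variance of the estimates: diagonal of (chi^2 / (n - m)) * C
var_a = chi2 / (n - m) * C[0][0]
var_b = chi2 / (n - m) * C[1][1]
```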
Special Case: Constant
• For the constant fit y = a, we had a = (1/n) Σ_i y_i, and A^T A = n, so C = 1/n:

$$ \sigma_a^2 = \frac{\chi^2}{n-1}\,C = \frac{\sum_i (y_i - a)^2}{n-1} \cdot \frac{1}{n} $$

• The first factor, square-rooted, is the “standard deviation of samples”; dividing it by √n gives the “standard deviation of the mean”
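The constant-fit formulas sketched in Python with arbitrary numbers:

```python
ys = [4.0, 6.0, 5.0, 7.0, 3.0]
n = len(ys)
a = sum(ys) / n                      # least-squares constant = mean

chi2 = sum((y - a) ** 2 for y in ys)
var_samples = chi2 / (n - 1)         # square of "standard deviation of samples"
var_mean = var_samples / n           # square of "standard deviation of mean"
```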
Things to Keep in Mind
• In general, uncertainty in estimated parameters goes down slowly: like 1/√(# samples)
• Formulas for special cases (like fitting a line) are messy: simpler to think in the A^T A x = A^T b form
• All of these minimize “vertical” squared distance
– The square is not always appropriate
– Vertical distance is not always appropriate