Week06&07_Curve Fitting.pdf
-
Upload
noorhafiz07 -
Category
Documents
-
view
214 -
download
0
Transcript of Week06&07_Curve Fitting.pdf
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 1/38
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 2/38
WEEK 6&7
Curve Fitting Linear regression
Polynomial regression
Multiple regression
General linear least squares
Nonlinear regression 2
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 3/38
At the end of this topic, the students will be able:
To fits the data using linear and polynomial
regression
To fits the data using multiple linear and
non-linear regression
To assess and choose the preferred method
for any particular problems
LESSON OUTCOMES
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 4/38
4
• Describes techniques to fit curves (curve fitting ) to
discrete data to obtain intermediate estimates.
• There are two general approaches for curve fitting: – Data exhibit a significant degree of error . The strategy is to
derive a single curve that represents the general trend of the data. – Data is very precise. The strategy is to pass a curve or a series of
curves through each of the points.
• In engineering two types of applications are
encountered normally when fitting experimental data: – Trend analysis. Predicting values of dependent variable, may
include extrapolation beyond data points or interpolation betweendata points.
– Hypothesis testing. Comparing existing mathematical model with
measured data.
Curve Fitting
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 5/38
5
Three attempts to fit a “best”
curve through uncertain data
points.
a) Least-squares regression Linear regression
Polynomial regression
Multiple regression General linear least squares
Nonlinear regression
b) Linear interpolation
c) Curvilinear interpolation
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 6/38
6
Mathematical background in Simple Statistics
• Arithmetic mean . The sum of the individual data
points (yi) divided by the number of points (n).
• Standard deviation . The most common measure of a
spread for a sample.
or
ni
n
y y
i
,,1
mean)theand pointsdata between
residualstheof squarestheof sumtotalt(S
)(
12
y yS
nS s
it
t y
1
/22
2
n
n y y s
ii
y
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 7/38
7
• Variance . Representation of spread by the square of
the standard deviation.
• Coefficient of variation . Has the utility to quantify the
spread of data and provides a normalized measure of
the spread.
1
)( 2
2
n
y y s
i
y
%100.. y
svc
y
Degrees of freedom
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 8/38
8
Least-Squares Regression
• Regression analysis is a study of the relationships among
variables.
• Get the ‘best’ straight line to fit through a set of uncertain
data points.
• Calculate the slope and the intercept of the line.
• Also fit the ‘best’ polynomial to data.
• Consider multiple linear regression for a case when one
variable depends on two or more variables in linear function.
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 9/38
9
Linear Regression
• Fitting a straight line to a set of pairedobservations: (x 1 , y 1 ), (x 2 , y 2 ),… ,(x n , y n ) .
• The mathematical expression for straight line:
y=a 0 +a 1 x+e
where a 0 = intercept
a 1 = slopee = error, or residual, between the model and theobservations (e = y – a 0 – a 1 x ; discrepancy between
true value of y and approximate value, a 0 + a 1 x )
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 10/38
10
Criteria for a ‘Best’ Fit
• Minimize the sum of the residual errors for all available data:
n = total number of points
• However, this is an inadequate criterion
because the errors cancel.
• So, another logical criteria might be to minimize the sum of
the absolute values of discrepancies:
• However, this criterion is also
inadequate. It will minimize the sum of
the absolute values. This criterion also
does not yield a unique best fit.
n
i
ioi
n
i
i xaa ye1
1
1
)(
n
iii
n
ii xaa ye 1
101
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 11/38
11
• Best strategy is to minimize the sum of the squares of the
residuals between the measured y and the y calculated withthe linear model:
• Yields a unique line for a given set of data. The line chosen thatminimizes the maximum distance that individual point fall
from the line.
n
i
n
i
iiii
n
i
ir xaa y y yeS 1 1
2
10
2
1
2 )()model,measured,( Eq.(17.3)
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 12/38
12
Least-Squares Fit of a Straight Line
• To determine values for a0 and a1, Eq.(17.3) is differentiated
with respect to each coefficient:
00
2
10
10
1
1
1
1
2
10
where
0
0
0)(2
0)(2
)(
naa
xa xa x y
xaa y
x xaa ya
S
xaa yaS
xaa yS
iiii
ii
iioir
ioi
o
r
n
i
iir
Setting derivatives equal to
zero will result in a
minimum Sr
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 13/38
13
xa ya
x xn
y x y xna
y xa xa x
ya xna
ii
iiii
iiii
ii
10
221
12
0
10
rule,sCramer'usingBy
Mean values
Normal equations,
can be solved
simultaneously
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 14/38
14
Example 1
Fit a straight line to the x and y values in the first twocolumns of Table 1
x y
1.0 0.5
2.0 2.5
3.0 2.04.0 4.0
5.0 3.5
6.0 6.0
7.0 5.5
4286.37
24y 24y
4
7
28x 28x
140x 5.119x 7
i
i
2
ii
i yn
0714.0)4)(8393.0(4286.3 8393.0)28()140(7
)24)(28()5.119(7
021
10221
aa
xa ya x xn
y x y xn
aii
iiii
0714.08393.0
isfitsquares-leastThe
x y
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 15/38
15
y = 0.8393x + 0.0714R² = 0.8683
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0
y
x
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 16/38
16
Exercise 1
Fit the best straight line to the following set of x and yvalues using the method of least-squares.
x y
0 2
1 5
2 9
3 15
4 17
5 24
6 25
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 17/38
17
y = 4.1071x + 1.5357R² = 0.9822
0
5
10
15
20
25
30
0 1 2 3 4 5 6 7
y
x
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 18/38
18
Quantification of Error of Linear Regression
• Recall that the sum of the residuals squared
which measures the deviations of the regression
line from the actual data values is
n
iii
n
iir xaa yeS 1
2
101
2
)(
The square of the residual
represents the square of the
vertical distance between thedata and another measure of
central tendency-the straight line.
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 19/38
19
• The analogy can be extended further for cases:
The spread of points around the line is of similar
magnitude along the entire range of data. The distribution of these points about the line is
normal.
• The following statistical formulas may be used to
quantify the error associated with linear regression
2 )(
1
)(
/
2
10
2
n
S s xaa yS
n
S s y yS
r x yiir
t yit
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 20/38
20
• Standard error of the estimate
quantifies the spread the data.• sy/x quantifies the spread
around the regression line
• Original standard deviation,
sy quantified the spreadaround the mean.
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 21/38
21
“Goodness” of our fit
Determine
1) Total sum of the squares around the mean for the
dependent variable, y, is S t
2) Sum of the squares of residuals around the
regression line is S r3) Difference between two quantities, S t -S r quantifies
the improvement or error reduction due to
describing data in terms of a straight line rather than
as an average value.
4) Correlation of determination , r 2 and correlation
coefficient , r .
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 22/38
22
t
r t
t
r t
S
S S r
S
S S r
2
coefficient of determination
• For a perfect fit, Sr= 0 and r=r2=1, signifying that the
line explains 100 percent of the variability of thedata.
• For r =r2=0, Sr=St, the fit represents no improvement.
correlation coefficient
2222
iiii
iiii
y yn x xn
y x y xnr
In a more convenient form for computation of r is
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 23/38
23
Example 3Determine the total standard deviation, standard
error of the estimate and coefficient of correlation forthe linear regression line obtained in the Example 1.
xi yi (yi-ymean)2 (yi-a0-a1xi)2
1.0 0.5 8.5765 0.1687
2.0 2.5 0.8622 0.5625
3.0 2.0 2.0408 0.3473
4.0 4.0 0.3265 0.3265
5.0 3.5 0.0051 0.5897
6.0 6.0 6.6122 0.7970
7.0 5.5 4.2908 0.1994
24.0000 22.7143 2.9911
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 24/38
24
932.0868.0
868.07143.22
9911.27143.22
0.77342-7
2.9911
2
9911.2)(
1.94571-7
22.7143
1
7143.22)(
2
/
2
10
2
r
S
S S r
n
S s
xaa yS
n
S s
y yS
t
r t
r x y
iir
t y
it
Solution
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 25/38
25
Exercise 4Determine the total standard deviation, standard
error of the estimate and coefficient of correlation forthe linear regression line obtained in the Example 2.
x y
0 21 5
2 9
3 15
4 175 24
6 25
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 26/38
26
Linearization of Nonlinear Relationships
• Most of the engineering functions has nonlinearrelationships.
• These functions cannot adequately be described by
a linear line.
• Transformations can be used to express the data ina form that is compatible with linear regression.
• Examples of functions that can be linearized are
Exponential equation
Power equation
Saturation-growth rate equation
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 27/38
27
x
e y1
1
2
2
x y x
x
y 33
11 lnln x y22 logloglog x y
33
3 111
x y
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 28/38
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 29/38
29
Polynomial Regression
• Some engineering data is poorly
represented by a straight line wherethe error is high.
• Example (a) and (b): A curve would
be suited to fit the data.
• Instead of trying to linearize some
of the nonlinear functions and use
linear regression,we may alternatively
fit polynomials to the data using
polynomial regression.
• The least-squares procedure can be
readily extended to fit the data to a
higher order polynomial.
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 30/38
30
• For a second-order polynomial or quadratic:
y=a 0 +a 1 x+a 2 x 2 + e
• For this case, the sum of the squares of the residual is
• Derivative with respect to each of the unknown coefficients of the
polynomial
• These eq. set to be zero and develop set of normal equations:
n
i
iiir xa xaa yS 1
22
210 )(
)(2
)(2
)(2
2
21
2
2
2
21
1
2
211
ii
i
i
xa xaa y xa
S
xa xaa y xa
S
xa xaa y
a
S
ioir
ioiir
ioi
o
r
iiiii
iiiii
iii
y xa xa xa x
y xa xa xa x
ya xa xan
2
2
4
1
3
0
2
2
3
1
2
0
2
2
10
)()()(
)()()(
)()()(
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 31/38
31
• The coefficients of the unknowns can be calculated directly
from the observed data.
• Then, solving a system of three simultaneous linear
equations.
• The standard error is formulated as
• Coefficient of determination for polynomial regression can
also be as
)1(/
mn
S S r
x y
t
r t
S
S S r
2
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 32/38
32
Example 6Fit a second order polynomial to the data
xi yi
0 2.1
1 7.7
2 13.6
3 27.2
4 40.9
5 61.1
Solution: Refer to page 472 text book
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 33/38
33
Multiple Linear Regression
Case where y is a linear function of two or more independent
variable:
y=a 0 +a 1 x 1 +a 2 x 2 + e
n x1i x2i a0 yi
x1i (x1i)2 x1ix2i a1 = x1iyi
x2i x1ix2i (x2i)2 a2 x2iyi
n
i
iiir xa xaa yS 1
222110 )(
)1(/
mn
S S r
x y Example:
Refer to page 475 text book
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 34/38
34
• Multiple linear regression can also be extended to the power
equation of the general form,
• This equation is extremely useful when fitting experimental
data.
• To use multiple regression analysis, we transform the above
equation by taking logarithm of base 10 which yield
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 35/38
35
General Linear Least Squares
residualsE
tscoefficienunknownA
variabledependenttheof valuedobservedY
t variableindependentheof valuesmeasuredat the
functions basistheof valuescalculatedtheof matrix
functions basis1are
10
221100
Z
E A Z Y
m , z , , z z
e z a z a z a z a y
m
mm
2
1 0
n
i
m
j
ji jir z a yS
Minimized by taking its partial
derivative with respect to each of
the coefficients and setting the
resulting equation equal to zero
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 36/38
36
YTZAZTZ
• The normal equations can be expressed concisely in matrix
form as
1Y T Z Z T Z A
• To solve it, matrix inverse can be employed:
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 37/38
37
Nonlinear Regression
• The nonlinear models as those which have a nonlinear
dependence on their parameters. There are times when thenonlinear model must fit the data.
• As with linear least squares, nonlinear regression is based on
determining values of the parameters that minimize the sum of the
squares of the residual.
• The Gauss-Newton method is one of the procedures to achieve
the least-squares criterion. It uses a Taylor series expansion to
approximate the nonlinear equation in linear form.
• The least-squares theory is applied to obtain new parameter
values that move in the direction of minimizing the residuals.
8/20/2019 Week06&07_Curve Fitting.pdf
http://slidepdf.com/reader/full/week0607curve-fittingpdf 38/38
38
m
o
nn
nn
j
a
a
a
x f y
x f y
x f y
a
f
a
f
a
f
a
f
a
f
a
f
Z
E A j
Z D
.
.
. A
)(
.
.
.
)(
)(
D
122
11
10
1
2
0
2
1
1
0
1
criterion.stoppingaceptablean belowfallsuntil
%100
converges,solutiontheuntilrepeatedis procedureThis
1,11,1
0,01,0
1,
,1,
a
jk
jk jk
k a
aa
a j
a j
a
a j
a j
a
DT
j Z A
j Z
T
j Z
For example nonlinear models,
Example:
Refer to page
483 text book