Page 1: Week #6 : Function Fitting; Animation Goals: Least-Squares ...math272/Notes/Annotated_Online/notes06... · Week #6 : Function Fitting; Animation Goals: Least-Squares Function tting

Week #6 : Function Fitting; Animation

Goals:

• Least-Squares Function fitting

• Animation in MATLAB

Function Fitting

So far in MATLAB, we have not dealt with data with error:

• Splines go exactly through given points.

• Equation solving methods assume we know all input values exactly.

Another important use of numerical tools is to go from (error-prone) data to a mathematical model.

Data Files and MATLAB

The first step in working with data is getting it into memory. Look up what the following commands do in MATLAB.

• dlmread -

• csvread -

• textread -

• xlsread -
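As a rough illustration of the reading step, the sketch below writes a tiny comma-separated file and reads it back with `csvread` and `dlmread`. The file contents and variable names are made up for the example; `xlsread` plays the same role for spreadsheet files such as the one used in the exercises.

```matlab
% Sketch: write a tiny comma-separated file, then read it back two ways.
% The file name and its contents are invented for illustration.
fname = [tempname() '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '7.5,62\n9.0,81\n6.0,55\n');
fclose(fid);

A = csvread(fname);          % comma-separated numeric file -> matrix
B = dlmread(fname, ',');     % same idea, with the delimiter given explicitly

disp(size(A));               % rows are records, columns are variables
quiz = A(:, 1);              % first column
exam = A(:, 2);              % second column
delete(fname);               % clean up the temporary file
```

Either call returns an ordinary numeric matrix, so individual variables are picked out by column indexing.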

Example Dataset

Exercise: Start a new script, W5_8.m, and have it read in QuizAndExamGrades.xls

• Can be done with xlsread.

• Can also use double-click in Directory listing.

In what format is the data now stored in MATLAB?

Histograms, Subplots and Function Fitting - 1

Histograms and Subplots

Exercise: Look at the distribution of each variable separately using a histogram. The MATLAB command for this is hist.

Arranging plots can be helpful.

Exercise: In the script, use the commands figure(1) and figure(2) for each separate plot.

Exercise: Look up the subplot command in the Help system. Modify the script so it shows all the graphs in the same graph window using subplot.

How does the subplot command lay out the sub-windows?
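A minimal sketch of the histogram and subplot steps follows. It uses synthetic stand-ins for the quiz and exam columns (the variables `quiz` and `exam` are assumptions, not the course data), so it runs without the spreadsheet.

```matlab
% Synthetic stand-ins for the two grade columns (not the course data).
k = (1:40)';
quiz = mod(7*k, 11);                 % integer values between 0 and 10
exam = 45 + 4*quiz + 8*sin(k);       % loosely increasing with quiz

figure(1);
subplot(1, 2, 1);                    % grid of 1 row, 2 columns; select panel 1
hist(quiz);
title('Quiz grades');
subplot(1, 2, 2);                    % select panel 2, to the right of panel 1
hist(exam);
title('Exam grades');

% hist can also return the bin counts instead of drawing them.
counts = hist(exam, 5);              % 5 equally spaced bins
disp(counts);
```

In `subplot(m, n, p)` the panels are numbered along the rows first, so with a 2-by-2 grid panel 2 is top right and panel 3 is bottom left.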

Relationships

Of more interest than each variable separately is how they are related.

Exercise: Generate a scatter plot of the exam vs. test grades.

Once you have the scatterplot, what might you want to do next?

Fitting Curves to Data

Exercise: In the MATLAB plot window, select Tools/Basic Fitting

There is a ‘spline’ option: what happens when you try it? Explain what happened.

Exercise: Play around:

• Move legend out of the way

• Get formula for best fit linear and quadratic curves

• What does the big right-arrow button do?
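The same best-fit formulas can also be obtained from a script. The sketch below uses `polyfit` on synthetic data (the `quiz` and `exam` vectors are placeholders, not the course file); whether the Basic Fitting tool reports identical digits on the real data is not checked here.

```matlab
% Synthetic stand-ins for the grade data (not the course file).
k = (1:40)';
quiz = mod(7*k, 11);
exam = 45 + 4*quiz + 8*sin(k);

pLin  = polyfit(quiz, exam, 1);   % [p1, p2] for y = p1*x + p2
pQuad = polyfit(quiz, exam, 2);   % [p1, p2, p3] for y = p1*x^2 + p2*x + p3

fprintf('linear:    y = %.3f x + %.3f\n', pLin(1), pLin(2));
fprintf('quadratic: y = %.3f x^2 + %.3f x + %.3f\n', pQuad(1), pQuad(2), pQuad(3));

% polyval evaluates a fitted polynomial at new x values.
predicted = polyval(pLin, 8);
fprintf('linear prediction at quiz = 8: %.2f\n', predicted);
```

Coefficients are returned in descending powers of x, matching the ordering p1, p2, p3 used in these notes.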

Model Selection - Which Fit is “Best”?

MATLAB is supposedly finding the “best fit line” or “best fit curve”, i.e. of all possible straight lines, the linear fit shown is the best straight line.

However, what does that mean if we want to compare the best straight line fit to the best quadratic fit?

Guidelines

Which models keep closest to the actual data points?

• linear, quadratic, or higher order?

Which models match logic/intuition/practical constraints better?

• linear, quadratic, or higher order?

Always ask: Is a closer fit to the data substantial enough to justify higher-order fittings?

We will study the question of “how high a degree should I use” in a more systematic way next class.

Defining the “Best Fit” Within One Model

For today, we will look at selecting

• the best linear fit, among all possible linear fits, or

• the best quadratic fit, among all possible quadratic fits, etc.

How is the best model within each family selected? Or in other words, what makes the “best fit line” the best?

Mathematics of Least Squares

Our data is a set of $(x_i, y_i)$ pairs. Before we find the best curve, we select/limit ourselves to one predictive family of functions, e.g.

• Linear: $y = p_1 x + p_2$

• Quadratic: $y = p_1 x^2 + p_2 x + p_3$

Definition: Finding the “best fit” means “find values for $p_i$ that minimize the squared error”:

$$\sum_i (y_i - \hat{y}_i)^2$$

where $\hat{y}_i$ is the value that the fitted function predicts at $x_i$.
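To make the definition concrete, the sketch below computes the sum of squared errors for a least-squares line and for a slightly different line. The data are invented for the example, and `polyfit` is used here as the least-squares solver.

```matlab
% Invented data: a straight line plus an alternating +/-0.5 "error".
x = (0:9)';
noise = 0.5*(-1).^x;
y = 2*x + 1 + noise;

p    = polyfit(x, y, 1);            % least-squares values of p1 and p2
yhat = polyval(p, x);               % fitted values
sse  = sum((y - yhat).^2);          % the quantity being minimized

% Any other line gives a larger sum of squares, e.g. after nudging the slope.
pOther   = p + [0.05, 0];
sseOther = sum((y - polyval(pOther, x)).^2);

fprintf('SSE at the least-squares fit: %.4f\n', sse);
fprintf('SSE after nudging the slope:  %.4f\n', sseOther);
```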

Graphically

Naming and Symbols

What symbols are traditionally used to describe the various components of function fitting?

• Original Data

• Fitted function

• Fitted values

• Residuals

Least-Squares Error

Least-squares error is the standard means by which we select the best fit. The best function fit is selected, from all possible curves in the same family, so as to minimize the sum of y errors squared.

Are other definitions of “best fit” possible?

Why do we use the least-squares definition of “best” so often?

When is the “best fit” not the best?

Remember, “best fit” is shorthand for “least-squares fit”. This fit being the “best” depends on two key assumptions:

1. errors between fitted curve and the data (called the residuals) have a normal/Gaussian distribution around the fit line.

2. residuals have the same spread, regardless of x and y values.

Sketch some data and “best fit” curves with and without these properties.

Generating Residual Plots

Exercise: Download and run W6_1.m. It loads in the data from QuizAndExamGrades.xlsx and generates a scatterplot of test vs. exam grades.

Exercise: Use Tools/Basic Fitting to generate a linear fit to the data.

Exercise: Click the “Plot Residuals” checkbox. You’ll see a warning: ignore it because we’ll be fixing the issue in the next step.

Exercise: Choose the “Scatter plot” option for the residuals.
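A residual scatter plot can also be produced directly in a script. The following is a sketch on synthetic data (the `quiz` and `exam` vectors are placeholders for the spreadsheet columns), not a reproduction of what the Basic Fitting tool draws.

```matlab
% Synthetic stand-ins for the grade data (not the course file).
k = (1:40)';
quiz = mod(7*k, 11);
exam = 45 + 4*quiz + 8*sin(k);

p     = polyfit(quiz, exam, 1);
resid = exam - polyval(p, quiz);     % residual = data minus fitted value

figure(1);
subplot(2, 1, 1);                    % top panel: data and fitted line
plot(quiz, exam, 'o', quiz, polyval(p, quiz), '-');
ylabel('exam');
subplot(2, 1, 2);                    % bottom panel: residuals around zero
plot(quiz, resid, 'o', [min(quiz) max(quiz)], [0 0], 'k-');
xlabel('quiz');
ylabel('residual');
```

For a linear fit with an intercept, the residuals always average to zero over the whole dataset; what matters is whether they also do so over each x interval.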

What to Look For

The least-squares fit may be skewed or sub-optimal if

1. residuals do not average to zero over some x intervals,

2. scale of residuals changes from left to right, or

3. residuals of specific data points are exceptional.

Now focus on the exam vs. quiz data. Is there...

1. roughly zero average to the residuals everywhere?

2. consistent scale of residuals from left to right?

3. data for which residuals are exceptional?

Problem 1: Non-Zero Mean Residuals

There are two common reasons why the mean of the residuals might not be zero over some x intervals.

1) Two different processes or relationships are present over the span of the data, e.g. low-grade students have a different quiz/exam relationship than students with higher grades.

• Possible Solution: Take subsets of data

2) There is an effect in your experiment that is not accounted for in your formula.

• If you have a theoretical model, Possible Solution: Check model assumptions and experimental effects.

• If you don’t have a theoretical model, then the formula you’re using is likely too simple. Possible Solution: Experiment with higher-order polynomials, or non-polynomials.

In Exam Data

In our example, there is no theoretical model. However, we could make an argument that splitting the data is appropriate, because only higher quiz grades are useful for predicting the exam mark.

MATLAB

Exercise: Pick a cutoff value for the quiz grades, and have MATLAB select all the data points that fit that criterion.

Exercise: Perform a linear fit on only the selected higher-quiz-score data points.

Compare the quality of the fit to the earlier full-dataset results.
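One way to select a subset is logical indexing. The sketch below assumes a cutoff of 5 and synthetic `quiz`/`exam` vectors, both chosen only for illustration; comparing root-mean-square residuals is one possible measure of fit quality, not necessarily the one intended in the exercise.

```matlab
% Synthetic stand-ins for the grade data (not the course file).
k = (1:40)';
quiz = mod(7*k, 11);
exam = 45 + 4*quiz + 8*sin(k);

cutoff = 5;                          % hypothetical threshold
keep   = quiz >= cutoff;             % logical vector, true for points to keep
quizHi = quiz(keep);
examHi = exam(keep);

pAll = polyfit(quiz, exam, 1);
pHi  = polyfit(quizHi, examHi, 1);

% Root-mean-square residual, so fits on different numbers of points compare fairly.
rmsAll = sqrt(mean((exam   - polyval(pAll, quiz)).^2));
rmsHi  = sqrt(mean((examHi - polyval(pHi, quizHi)).^2));
fprintf('all data: %d points, RMS residual %.3f\n', numel(quiz), rmsAll);
fprintf('subset:   %d points, RMS residual %.3f\n', numel(quizHi), rmsHi);
```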

Give an example from materials science where you would see this type of transition from one function to another in a dataset.

Problem 2: Inconsistent Variability

Least-squares fits can also be thrown off by inconsistent variability in the residuals. (This commonly occurs when errors are multiplicative.)

Give an example of when experimental errors might be multiplicative rather than additive.

Exercise: In the file W6_2.m, load and plot PressureTempData.xlsx.

Note that this data contains two columns, with temperature as the input, and the resultant pressure of a gas as the output.

What should the relationship be between temperature and pressure?

Problems with Variability

Recall that the “best fit” line is selected to minimize squared error.

Indicate on the graph which points become most important in the fit. Which are the least important?

Dealing with Inconsistent Variability

One common approach to dealing with errors that scale with x is to take the log transform of the data.

Exercise: In your script, take the log of the data (both temperature and pressure), and plot the new form of the data.

Comment on the spread of the residuals after the transform.
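The sketch below illustrates the transform on synthetic data in which pressure is proportional to temperature with a made-up relative error of up to 5%. The ranges, constants and variable names are assumptions for illustration, not values from PressureTempData.xlsx.

```matlab
% Synthetic stand-in for the pressure/temperature file: P proportional to T,
% with an invented relative error of up to 5% on every reading.
k = (1:46)';
T = linspace(100, 1000, 46)';
relErr = 0.05*sin(3*k);
P = 0.3*T.*(1 + relErr);

% Fit on the raw scale.
pRaw = polyfit(T, P, 1);
rRaw = P - polyval(pRaw, T);

% Fit on the log-log scale; the slope is the exponent of a power law.
logT = log(T);
logP = log(P);
pLog = polyfit(logT, logP, 1);
rLog = logP - polyval(pLog, logT);

% Compare the residual spread in the lower and upper halves of the range.
lowHalf = 1:23;
highHalf = 24:46;
fprintf('raw residual spread: low T %.3f, high T %.3f\n', std(rRaw(lowHalf)), std(rRaw(highHalf)));
fprintf('log residual spread: low T %.4f, high T %.4f\n', std(rLog(lowHalf)), std(rLog(highHalf)));
fprintf('fitted exponent on the log-log scale: %.3f\n', pLog(1));
```

If the error really is relative, the two log-scale spreads should be much closer to each other than the two raw-scale spreads.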

The log transform is effective in this case because there was a consistent relative error across the range of the data. The log transform turns this relative/percent/multiplicative error into additive error.

Show how this works mathematically.
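One way to see this, assuming each measurement equals the true value $f(x_i)$ scaled by a small relative error $\varepsilon_i$ (an error model assumed here for illustration):

```latex
% Assumed error model: measurement = true value times (1 + eps_i),
% with relative errors eps_i of similar size across the whole range.
y_i = f(x_i)\,(1 + \varepsilon_i)
\quad\Longrightarrow\quad
\log y_i = \log f(x_i) + \log(1 + \varepsilon_i)
\approx \log f(x_i) + \varepsilon_i
\qquad (|\varepsilon_i| \ll 1)
```

On the original scale the absolute error $f(x_i)\,\varepsilon_i$ grows with $f(x_i)$; on the log scale the error term is approximately $\varepsilon_i$ alone, so its size no longer depends on where the point lies.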

Sometimes other transformations such as inverting the data (1/x) can also reduce the effect of inconsistent variability. The key is to try to understand the source of difference before trying to fix it!

Problem 3: Problems with Bad Measurements

Exercise: In W6_3.m, load and plot PressureTempDataWithFailures.xlsx.

Comment on any odd features you notice in the data.

Exercise: Generate the linear fit, and comment on its relationship to the data.

Least-squares fits are dramatically affected by data that lies outside normally distributed error. These points are often referred to as outliers.

The further from the “obvious” curve an outlier is, the more effect it has on the final “best”-fit curve!

Identifying Outliers

Residual plots are great for helping identify outliers.

Exercise: Generate the residual plot for the pressure/temperature data with errors, and use it to identify the outlier(s) in the data.

When faced with one or more outliers, the simplest solution is to remove them (manually or automatically).

Exercise: In your script, remove the outliers, re-plot the cleaned data and re-fit a linear function.
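A possible automatic version of this step is sketched below on synthetic data with two planted bad readings. The median-based residual scale and the factor of 6 are illustrative choices made for this example, not a rule taken from these notes.

```matlab
% Synthetic linear data with two planted "failed" measurements
% (a stand-in for the course file, not its actual contents).
T = linspace(100, 1000, 46)';
P = 0.3*T + 2*sin(3*(1:46)');
P([10 30]) = [150; 20];

pAll  = polyfit(T, P, 1);
resid = P - polyval(pAll, T);

% Flag points whose residual is far from the typical residual.
% Both the median-based scale and the factor of 6 are illustrative choices.
dev   = abs(resid - median(resid));
isOut = dev > 6*median(dev);

Tclean = T(~isOut);                  % keep only the unflagged points
Pclean = P(~isOut);
pClean = polyfit(Tclean, Pclean, 1);

fprintf('flagged points: %s\n', mat2str(find(isOut)'));
fprintf('slope with outliers: %.4f, after removal: %.4f\n', pAll(1), pClean(1));
```

Flagged points should still be inspected by eye before they are discarded, since an automatic rule can also remove legitimate data.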

If simply removing the outliers is not possible or not acceptable, more complex fitting schemes (not based on least-squares) should be used (methods designed to be more tolerant of/less strongly affected by outliers).
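As one example of such a scheme, chosen here purely for illustration, the sketch below minimizes the sum of absolute residuals instead of squared residuals, using `fminsearch` started from the least-squares line. The data are synthetic, and this is only one of several possible outlier-tolerant approaches.

```matlab
% Synthetic linear data with two planted bad points (not the course file).
T = linspace(100, 1000, 46)';
P = 0.3*T + 2*sin(3*(1:46)');
P([10 30]) = [150; 20];

pLS = polyfit(T, P, 1);                      % ordinary least-squares line

% Sum of absolute (not squared) residuals: large misses count only linearly.
absErr = @(p) sum(abs(P - polyval(p, T)));
pLAD = fminsearch(absErr, pLS);              % search starting from the LS line

fprintf('least squares:            slope %.4f, intercept %.2f\n', pLS(1), pLS(2));
fprintf('least absolute deviation: slope %.4f, intercept %.2f\n', pLAD(1), pLAD(2));
```

Because large residuals are not squared, a single bad reading pulls this fit less strongly than it pulls the least-squares line.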

Least-Squares Summary

• MATLAB includes easy-to-use least-squares fitting tools.

• Ask before you start: “Interpolation or function fitting?”

• “Best fit” means “least-squares fit”.

• Least-squares is “best” if the errors/residuals satisfy very specific criteria:

– normally distributed with mean zero, and

– consistent variability for all x values.

• To prefer a higher-degree fit to a lower-degree one, you must provide a theoretical or practical justification; the higher-order model will always have lower (or at worst equal) error, and will always be more complex, so those are not useful justifications.
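The last point can be checked numerically. In the sketch below (synthetic data, degrees 1 to 4 chosen arbitrarily), the sum of squared errors does not increase as the polynomial degree rises, even though the underlying trend is linear.

```matlab
% Invented data: a linear trend with a small wiggle.
x = linspace(0, 1, 15)';
y = 1 + 2*x + 0.1*cos(20*x);

maxDeg = 4;
sse = zeros(1, maxDeg);
for deg = 1:maxDeg
    p = polyfit(x, y, deg);                      % best fit within this family
    sse(deg) = sum((y - polyval(p, x)).^2);      % its sum of squared errors
    fprintf('degree %d: SSE = %.6f\n', deg, sse(deg));
end
```

Each lower-degree family is contained in the higher-degree one, which is why a lower error alone cannot justify the extra terms.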

Animation

See the Animation Handout for sample code.
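The handout's code is not reproduced in these notes. As a generic sketch only (not the handout's approach), a simple animation can be built by drawing a line once and updating its data in a loop; the travelling sine wave and the frame delay below are arbitrary choices.

```matlab
% Generic animation pattern: draw once, then update the plotted data in a loop.
x = linspace(0, 2*pi, 200);
h = plot(x, sin(x));                 % keep the handle of the line object
axis([0 2*pi -1.5 1.5]);             % fix the axes so they do not rescale

for t = linspace(0, 2*pi, 60)
    set(h, 'YData', sin(x - t));     % shift the wave instead of re-plotting
    drawnow;                         % force the figure to refresh now
    pause(0.02);                     % arbitrary frame delay in seconds
end
```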