S1: Chapter 7 Regression
Dr J Frost ([email protected]) www.drfrostmaths.com
Last modified: 22nd January 2016
What is regression?
[Scatter diagram: exam mark (y-axis) against time spent revising (x-axis)]
I record people's exam marks as well as the time they spent revising. I want to predict how well someone will do based on the time they spent revising. How would I do this?
$y = 20 + 3x$
What we've done here is come up with a model to explain the data, i.e. a line $y = a + bx$. We've then tried to set $a$ and $b$ such that the resulting $y$ value matches the actual exam marks as closely as possible. The "regression" bit is the act of setting the parameters of our model (here the gradient and y-intercept of the line of best fit) to best explain the data.
What is regression?
[Scatter diagram: rabbit population (y-axis) against time (x-axis)]
In this chapter we only cover linear regression, where our chosen model is a straight line.
But in general we could use any model that might best explain the data. Population tends to grow exponentially rather than linearly, so we might make our model an exponential one such as $y = ab^x$ and then try to use regression to work out the best $a$ and $b$ to use.
Explanatory and Response Variables
[Scatter diagram: exam mark (y-axis) against time spent revising (x-axis)]
! An independent (or explanatory) variable is one that is set independently of other variables. It goes on the x-axis.
! A dependent (or response) variable is one whose values are determined by the values of the independent variable. It goes on the y-axis.
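So how do we numerically find the line of best fit?
[Diagram: scatter points about a line of best fit on x and y axes, with the vertical residuals of seven points numbered 1 to 7]
The residuals are the errors between the value predicted by the model and the y value of each data point.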
We minimise the total of the squares of the residuals.
Why squared?
This is known as a least squares regression line.
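One way to see why the residuals are squared is to compute them for a small case. The sketch below uses the line $y = 20 + 3x$ from earlier; the four data points are made up purely for illustration and are not from the slides.

```python
# Hypothetical data points (illustration only, not from the slides).
xs = [1, 2, 3, 4]
ys = [24, 25, 30, 31]

def predict(x):
    """Model from the earlier slide: y = 20 + 3x."""
    return 20 + 3 * x

# Residual = actual y value minus the value predicted by the model.
residuals = [y - predict(x) for x, y in zip(xs, ys)]

# Positive and negative residuals cancel when simply added...
total_residual = sum(residuals)

# ...but squaring makes every error count as positive.
sum_of_squares = sum(r ** 2 for r in residuals)

print(residuals)       # [1, -1, 1, -1]
print(total_residual)  # 0
print(sum_of_squares)  # 4
```

Here the plain total of the residuals is 0 even though no point lies on the line, whereas the total of the squares is 4.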
It turns out (using differentiation techniques you'll see in C2) that the $a$ and $b$ we use to minimise the total (squared) error are:
$b = \frac{S_{xy}}{S_{xx}}$
$a = \bar{y} - b\bar{x}$
$y = a + bx$
To remember the gradient, I think of the chromosomes of men (XY) and women (XX). Men come out top!
The point of means $(\bar{x}, \bar{y})$ is on the line, i.e. $\bar{y} = a + b\bar{x}$. Hence this gives us $a = \bar{y} - b\bar{x}$.
Notice that in regression, we write the terms in ascending powers of $x$, contrary to algebraic convention. Hence $a$ is the $y$-intercept, not the gradient.
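The formulae can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definitions $S_{xx} = \sum (x - \bar{x})^2$ and $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$; the function name `least_squares_line` and the data are made up for the example.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares regression line y = a + bx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx          # gradient: S_xy over S_xx
    a = y_bar - b * x_bar    # intercept: forces the line through the means
    return a, b

# Hypothetical data (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.1f} + {b:.1f}x")

# The point of means lies on the line: a + b * x_bar equals y_bar.
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
print(abs((a + b * x_bar) - y_bar) < 1e-9)
```

For this data $S_{xx} = 10$ and $S_{xy} = 6$, so $b = 0.6$ and $a = 4 - 0.6 \times 3 = 2.2$.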
Example
$b = \frac{S_{xy}}{S_{xx}}$
$a = \bar{y} - b\bar{x}$
Mass, x (kg)   | 20 | 40   | 60   | 80   | 100
Length, y (cm) | 48 | 55.1 | 56.3 | 61.2 | 68
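a) Calculate $S_{xx}$ and $S_{xy}$ (You may use that $\sum x = 300$, $\sum y = 288.6$, $\sum x^2 = 22000$, $\sum y^2 = 16879.14$, $\sum xy = 18238$, $\bar{x} = 60$, $\bar{y} = 57.72$)
b) Calculate the regression line of $y$ on $x$.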
Broculator Tip: Your calculator will calculate $a$ and $b$ while in STATS mode (under the Reg menu)
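As a check on the example, the calculation can be sketched in Python using the mass and length values from the table. This assumes mass is the $x$ variable and length the $y$ variable, and uses the standard summary-statistic forms $S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$ and $S_{xy} = \sum xy - \frac{\sum x \sum y}{n}$.

```python
# Data from the example table (mass assumed to be x, length assumed to be y).
masses = [20, 40, 60, 80, 100]
lengths = [48, 55.1, 56.3, 61.2, 68]

n = len(masses)
sum_x = sum(masses)
sum_y = sum(lengths)
sum_xx = sum(x * x for x in masses)
sum_xy = sum(x * y for x, y in zip(masses, lengths))

# Summary-statistic forms of S_xx and S_xy.
s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                  # gradient
a = sum_y / n - b * sum_x / n    # intercept, from a = y_bar - b * x_bar

print(f"S_xx = {s_xx:.0f}, S_xy = {s_xy:.0f}")
print(f"y = {a:.2f} + {b:.4f}x")
```

By hand, $S_{xx} = 22000 - \frac{300^2}{5} = 4000$ and $S_{xy} = 18238 - \frac{300 \times 288.6}{5} = 922$, giving $b = 0.2305$ and $a = 57.72 - 0.2305 \times 60 = 43.89$.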
Test Your Understanding: May 2009 Q5
! For "comment on reliability of estimate" questions, always give one of:
• Reliable (1) because inside the range of the data/interpolating (1).
• Unreliable (1) because outside the range of the data/extrapolating (1).
• Reliable (1) because just outside the range of the data (1).
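The first two cases can be sketched as a simple range check. The function name `classify_estimate` is made up for illustration, and the x values reuse the masses from the earlier example. The "just outside the range" case is a judgement call and is deliberately not modelled here.

```python
def classify_estimate(x_value, xs):
    """Say whether predicting at x_value interpolates or extrapolates the data xs."""
    if min(xs) <= x_value <= max(xs):
        return "interpolating"   # inside the range of the data: reliable
    return "extrapolating"       # outside the range of the data: unreliable

masses = [20, 40, 60, 80, 100]
print(classify_estimate(50, masses))    # interpolating
print(classify_estimate(150, masses))   # extrapolating
```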
Note that after finding $a$ and $b$, you still need to write the equation at the end for the final mark!
A common error is to do . The first row (the explanatory variable) is always the "$x$" one.
Exercises
On provided sheet. Answers on next slides.
(Note that Q7 and Q8 use "coding". We will cover this next lesson.)
Help with wordy questions:
"Explain why this diagram would support the fitting of a regression line of $y$ onto $x$." The variables have a linear relationship, i.e. the points are close to the implied straight line of best fit.
"Interpret the gradient/slope of the line/interpret $b$." As (x) increases by 1, (y) increases/decreases by ___.
"Interpret the y-intercept/interpret $a$." The value (y) takes when (x) is 0.
"Which is the explanatory variable? Explain your answer." (x) is the explanatory variable because (x) influences (y).
"Explain the method of least squares." We minimise the sum of the squares of the residuals (draw a diagram).
Coding
We've previously considered how coding affects means, variances and the PMCC. So how does it affect the regression line?
Eight samples of carbon steel were produced with different percentages, of carbon in them. Each sample was heated in a furnace until it melted and the temperature, in °C, at which it melted was recorded.
The results were coded such that and .
Suppose that we found the regression line of on was . Then what is the regression line in terms of the original variables and ?
Just replace the variables using the substitution and rearrange. That's it!
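The substitute-and-rearrange idea can be checked numerically. In the sketch below everything specific is an assumption for illustration: the names `c` (carbon percentage) and `m` (melting temperature), the coding $x = 10c$ and $y = \frac{m - 700}{5}$, and the data values are all hypothetical, not taken from the slide. Substituting into $y = a + bx$ gives $\frac{m - 700}{5} = a + 10bc$, i.e. $m = (700 + 5a) + 50bc$.

```python
def fit_line(xs, ys):
    """Least squares line y = a + bx; returns (a, b)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    return y_bar - b * x_bar, b

def decode_line(a, b, x_shift, x_scale, y_shift, y_scale):
    """Undo the coding x = (c - x_shift)/x_scale, y = (m - y_shift)/y_scale.

    Substituting into y = a + bx and rearranging for m gives
    m = intercept + gradient * c, returned as (intercept, gradient).
    """
    gradient = y_scale * b / x_scale
    intercept = y_shift + y_scale * a - gradient * x_shift
    return intercept, gradient

# Hypothetical original data (illustration only).
c_values = [0.1, 0.2, 0.3, 0.4, 0.5]
m_values = [1530, 1515, 1495, 1480, 1470]

# Hypothetical coding: x = 10c and y = (m - 700)/5.
x_shift, x_scale = 0.0, 0.1
y_shift, y_scale = 700.0, 5.0
xs = [(c - x_shift) / x_scale for c in c_values]
ys = [(m - y_shift) / y_scale for m in m_values]

a, b = fit_line(xs, ys)                       # line in the coded variables
decoded = decode_line(a, b, x_shift, x_scale, y_shift, y_scale)
direct = fit_line(c_values, m_values)         # line fitted to the original data

print(decoded)
print(direct)
```

The two printed pairs agree up to rounding error, which is the point: fitting on the coded data and then substituting back gives the same line as fitting on the original data.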
More Examples
The length and height of an Ewok were coded using and . If the equation of the regression line of on is:
what is the equation of the regression line of on ?
The maths mark and English mark of some stormtroopers are coded using and . If the equation of the regression line of on is:
what is the equation of the regression line of on ?
Just For Fun…