Chapter 11

39
© The McGraw-Hill Companies, Inc., 2000 11-1 11-1 Chapter 11 Chapter 11 Correlation and Correlation and Regression Regression

description

11-1. Chapter 11. Correlation and Regression. Outline. 11-2. 11-1 Introduction 11-2 Scatter Plots 11-3 Correlation 11-4 Regression. Outline. 11-3. 11-5 Coefficient of Determination and Standard Error of Estimate. Objectives. 11-4. - PowerPoint PPT Presentation

Transcript of Chapter 11

Chapter 11Chapter 11
Outline
Outline
11-3
Objectives
Draw a scatter plot for a set of ordered pairs.
Find the correlation coefficient.
Find the equation of the regression line.
11-4
Objectives
Find the standard error of estimate.
Find a prediction interval.
11-2 Scatter Plots
A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.
11-6
11-2 Scatter Plots - Example
Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects.
The data is given on the next slide.
11-7
11-2 Scatter Plots - Example
11-2 Scatter Plots - Example
11-10
11-11
11-3 Correlation Coefficient
The correlation coefficient computed from the sample data measures the strength and direction of a relationship between two variables.
Sample correlation coefficient, r.
11-3 Range of Values for the Correlation Coefficient
11-13
11-3 Formula for the Correlation Coefficient r
11-14
The McGraw-Hill Companies, Inc., 2000
11-3 Correlation Coefficient - Example (Verify)
Compute the correlation coefficient for the age and blood pressure data.
11-15
11-3 The Significance of the Correlation Coefficient
The population corelation coefficient, , is the correlation between all possible pairs of data values (x, y) taken from a population.
11-16
11-3 The Significance of the Correlation Coefficient
H0: = 0 H1: 0
This tests for a significant correlation between the variables in the population.
11-17
11-3 Formula for the t tests for the Correlation Coefficient
11-18
t
n
r
with
d
f
n
11-3 Example
Test the significance of the correlation coefficient for the age and blood pressure data. Use = 0.05 and
r = 0.897.
H0: = 0 H1: 0
11-3 Example
Step 2: Find the critical values. Since = 0.05 and there are 6 – 2 = 4 degrees of freedom, the critical values are t = +2.776 and t = –2.776.
Step 3: Compute the test value. t = 4.059 (verify).
11-20
11-3 Example
Step 4: Make the decision. Reject the null hypothesis, since the test value falls in the critical region (4.059 > 2.776).
Step 5: Summarize the results. There is a significant relationship between the variables of age and blood pressure.
11-21
11-4 Regression
The scatter plot for the age and blood pressure data displays a linear pattern.
We can model this relationship with a straight line.
This regression line is called the line of best fit or the regression line.
The equation of the line is y = a + bx.
11-22
11-4 Formulas for the Regression Line y = a + bx.
11-23
2
2
2
2
2

11-4 Example
Find the equation of the regression line for the age and the blood pressure data.
Substituting into the formulas give a = 81.048 and b = 0.964 (verify).
Hence, y = 81.048 + 0.964x.
Note, a represents the intercept and b the slope of the line.
11-24
11-4 Example
11-4 Using the Regression Line to
Predict
The regression line can be used to predict a value for the dependent variable (y) for a given value of the independent variable (x).
Caution: Use x values within the experimental region when predicting y values.
11-26
11-4 Example
Use the equation of the regression line to predict the blood pressure for a person who is 50 years old.
Since y = 81.048 + 0.964x, then
y = 81.048 + 0.964(50) = 129.248 129.
Note that the value of 50 is within the range of x values.
11-27
11-5 Coefficient of Determination and Standard Error of Estimate
The coefficient of determination, denoted by r2, is a measure of the variation of the dependent variable that is explained by the regression line and the independent variable.
11-28
11-5 Coefficient of Determination and Standard Error of Estimate
r2 is the square of the correlation coefficient.
The coefficient of nondetermination is (1 – r2).
Example: If r = 0.90, then r2 = 0.81.
11-29
11-5 Coefficient of Determination and Standard Error of Estimate
The standard error of estimate, denoted by sest, is the standard deviation of the observed y values about the predicted y values.
The formula is given on the next slide.
11-30
11-5 Formula for the Standard Error of Estimate
11-31
11-5 Standard Error of Estimate - Example
From the regression equation, y = 55.57 + 8.13x and n = 6, find sest.
Here, a = 55.57, b = 8.13, and n = 6.
Substituting into the formula gives sest = 6.48 (verify).
11-32
11-5 Prediction Interval
A prediction interval is an interval constructed about a predicted y value, y , for a specified x value.
11-33
11-5 Prediction Interval
For given value, we can state with (1 – )100% confidence that the interval will contain the actual mean of the y values that correspond to the given value of x.
11-34
11-5 Formula for the Prediction Interval about a Value y
11-35
11-5 Prediction interval - Example
A researcher collects the data shown on the next slide and determines that there is a significant relationship between the age of a copy machine and its monthly maintenance cost. The regression equation is
y = 55.57 + 8.13x. Find the 95% prediction interval for the monthly maintenance cost of a machine that is 3 years old.
11-36
11-5 Prediction Interval - Example
11-5 Prediction Interval - Example
x2 = 82,
y = 55.57 + 8.13(3) = 79.96
Step 3: Find sest
11-38
11-5 Prediction Interval - Example
t/2 = 2.776, d.f. = 6 – 2 = 4 for 95%
60.53 < y < 99.39 (verify)
Hence, one can be 95% confident that the interval 60.53 < y < 99.39 contains the actual value of y.
11-39
Subject
Age,
x
Pressure,
y
A
43
128
B
48
120
C
56
135
D
61
143
E
67
141
F
70
152
7
0
6
0
5
0
4
0
1
5
0
1
4
0
1
3
0
1
2
0
A
g
e
P
r
e
s
s
u
r
e
7
0
6
0
5
0
4
0
1
5
0
1
4
0
1
3
0
1
2
0
A
g
e
P
r
e
s
s
u
r
e
1
5
1
0
5
9
0
8
0
7
0
6
0
5
0
4
0
N
u
m
b
e
r
o
f
a
b
s
e
n
c
e
s
F
i
n
a
l
g
r
a
d
e
1
5
1
0
5
9
0
8
0
7
0
6
0
5
0
4
0
N
u
m
b
e
r
o
f
a
b
s
e
n
c
e
s
F
i
n
a
l
g
r
a
d
e
x
y
xy
x
y
Substituti
ng
in
the
formula
for
r
gives
r