Page 1: Chapter 13 – Correlation and Simple Regression

Slides prepared by Jeff Heyl, Lincoln University

Concise Managerial Statistics
KVANLI, PAVUR, KEELING

©2006 Thomson/South-Western

Page 2: Bivariate Data

Figure 13.1, panels (a) and (b). Y axis: Square footage (hundreds), 5 to 35. X axis: Income (thousands), 20 to 80.

Page 3: Coefficient of Correlation

The strength of the linear relationship between two variables is measured by the coefficient of correlation, r.

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2}\,\sqrt{\sum (y - \bar{y})^2}} = \frac{\sum xy - (\sum x)(\sum y)/n}{\sqrt{\sum x^2 - (\sum x)^2/n}\,\sqrt{\sum y^2 - (\sum y)^2/n}}$$

Page 4: Coefficient of Correlation Properties

1. r ranges from -1.0 to 1.0
2. The larger |r| is, the stronger the linear relationship
3. The sign of r tells you whether the relationship between X and Y is a positive (direct) or a negative (inverse) relationship
4. r = 1 or -1 implies that a perfect linear pattern exists between the two variables, that they are perfectly correlated

Page 5: Sum of Squares

$SS_X$ = sum of squares for X

$$SS_X = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$$

$SS_Y$ = sum of squares for Y

$$SS_Y = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$$

$SCP_{XY}$ = sum of cross products for XY

$$SCP_{XY} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$$

Page 6: Sum of Squares

With $SS_X$, $SS_Y$, and $SCP_{XY}$ as defined on Page 5:

$$r = \frac{SCP_{XY}}{\sqrt{SS_X}\,\sqrt{SS_Y}}$$
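The computational formulas for the sums of squares and for r can be sketched in a few lines of Python. This is only an illustration: the function names and the data are hypothetical and do not come from the slides.

```python
import math

def sums_of_squares(x, y):
    """Return (SS_X, SS_Y, SCP_XY) using the computational formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_x = sum(v * v for v in x) - sum_x ** 2 / n
    ss_y = sum(v * v for v in y) - sum_y ** 2 / n
    scp_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    return ss_x, ss_y, scp_xy

def correlation(x, y):
    """r = SCP_XY / (sqrt(SS_X) * sqrt(SS_Y))."""
    ss_x, ss_y, scp_xy = sums_of_squares(x, y)
    return scp_xy / math.sqrt(ss_x * ss_y)

# Hypothetical data: y is exactly 2x + 1, so the points are perfectly correlated.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
ss_x, ss_y, scp_xy = sums_of_squares(x, y)
r = correlation(x, y)
```

Here SS_X = 10, SS_Y = 40, and SCP_XY = 20, so r = 20 / sqrt(400) = 1, which matches property 4 on Page 4.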

Page 7: Scatter Diagram and Correlation Coefficient

Figure 13.2

Page 8: Vertical Distances

Figure 13.3: vertical distances $d_1, d_2, \dots, d_{10}$ between the data points and Line L. Y axis: Square footage. X axis: Income, 20 to 80.

Page 9: Least Squares Line

The least squares line is the line through the data that minimizes the sum of the squared vertical distances between the observations and the line:

$$\sum d^2 = d_1^2 + d_2^2 + d_3^2 + \dots + d_n^2$$

$$b_1 = \frac{SCP_{XY}}{SS_X} \qquad b_0 = \bar{y} - b_1 \bar{x}$$
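A minimal sketch of the slope and intercept formulas, using a small hypothetical data set (the function name and the numbers are illustrative assumptions, not the real estate data from the slides):

```python
def least_squares_line(x, y):
    """Return (b0, b1) with b1 = SCP_XY / SS_X and b0 = y_bar - b1 * x_bar."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: x_bar = 3, y_bar = 4, SS_X = 10, SCP_XY = 6.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = least_squares_line(x, y)
```

For these values b1 = 6 / 10 = 0.6 and b0 = 4 - 0.6(3) = 2.2.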

Page 10: Least Squares Line

Figure 13.6: the line $\hat{Y} = b_0 + b_1 X$ with vertical distances $d_1$ and $d_2$. At X = 50 the figure marks Y and $\hat{Y}$; the distance is $Y - \hat{Y}$. Y axis: Square footage. X axis: Income.

Page 11: Sum of Squares of Error

$$SSE = \sum d^2 = \sum (y - \hat{y})^2$$

$$SSE = SS_Y - \frac{(SCP_{XY})^2}{SS_X}$$
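The two expressions for SSE should agree. A short check, assuming a hypothetical data set and an illustrative function name:

```python
def sse_two_ways(x, y):
    """Return SSE from the residuals and from SS_Y - SCP_XY^2 / SS_X."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_x = sum((a - x_bar) ** 2 for a in x)
    ss_y = sum((b - y_bar) ** 2 for b in y)
    scp_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    b1 = scp_xy / ss_x
    b0 = y_bar - b1 * x_bar
    # Sum of squared vertical distances from each point to the fitted line.
    sse_residuals = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sse_shortcut = ss_y - scp_xy ** 2 / ss_x
    return sse_residuals, sse_shortcut

# Hypothetical data, not from the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sse_residuals, sse_shortcut = sse_two_ways(x, y)
```

Both values come out to 2.4 for this data (SS_Y = 6, SCP_XY = 6, SS_X = 10), up to floating-point rounding.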

Page 12: Least Squares Line for Real Estate Data

Figure 13.5: the fitted line $\hat{Y} = 4.975 + .3539X$. At X = 50, Y = 20 and $\hat{Y} = 22.67$. Y axis: Square footage. X axis: Income.

Page 13: Assumptions for the Simple Regression Model

$$Y = \beta_0 + \beta_1 X + e$$

1. The mean of each error component is zero
2. Each error component (random variable) follows an approximate normal distribution
3. The variance of the error component is the same for each value of X
4. The errors are independent of each other

Page 14: Assumption 1 for the Simple Regression Model

Figure 13.6: the line $Y = \beta_0 + \beta_1 X$ and the model $Y = \beta_0 + \beta_1 X + e$, with the means $\mu_{Y|35}$ and $\mu_{Y|50}$ marked at X = 35 and X = 50 and the error e centered at 0. Y axis: Square footage. X axis: Income.

Page 15: Violation of Assumption 3

Figure 13.7: the line $Y = \beta_0 + \beta_1 X$ with the error e shown at X = 35, 50, and 60. Y axis: Square footage. X axis: Income.

Page 16: Assumptions 1, 2, 3 for the Simple Regression Model

Figure 13.8: the error e shown at X = 35, 50, and 60, centered at 0 in each case. Y axis: Square footage. X axis: Income.

Page 17: Estimating the Error Variance, $\sigma_e^2$

$$s^2 = \hat{\sigma}_e^2 = \text{estimate of } \sigma_e^2 = \frac{SSE}{n - 2}$$

where

$$SSE = \sum (y - \hat{y})^2 = SS_Y - \frac{(SCP_{XY})^2}{SS_X}$$
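As a rough sketch, the estimate can be computed directly from the three sums. The function name and the input values below are hypothetical:

```python
import math

def error_variance(ss_x, ss_y, scp_xy, n):
    """s^2 = SSE / (n - 2), with SSE = SS_Y - SCP_XY^2 / SS_X."""
    sse = ss_y - scp_xy ** 2 / ss_x
    return sse / (n - 2)

# Hypothetical sums for a sample of n = 5 points.
s2 = error_variance(ss_x=10.0, ss_y=6.0, scp_xy=6.0, n=5)
s = math.sqrt(s2)  # estimated standard deviation of the errors
```

With these inputs SSE = 6 - 36/10 = 2.4, so s^2 = 2.4 / 3 = 0.8. The divisor is n - 2 because two parameters, the intercept and the slope, are estimated from the data.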

Page 18: Three Possible Populations

Figure 13.9: (a) $\beta_1 = 0$, (b) $\beta_1 > 0$, (c) $\beta_1 < 0$. Each panel plots Y against X.

Page 19: Hypothesis Test on the Slope of the Regression Line

Two-Tailed Test

$H_0$: $\beta_1 = 0$ (X provides no information)
$H_a$: $\beta_1 \neq 0$ (X does provide information)

Test statistic:

$$t = \frac{b_1 - \beta_1}{s / \sqrt{SS_X}} = \frac{b_1 - \beta_1}{s_{b_1}}$$

Reject $H_0$ if $|t| > t_{\alpha/2,\,n-2}$

Page 20: Hypothesis Test on the Slope of the Regression Line

One-Tailed Test

Test statistic:

$$t = \frac{b_1}{s_{b_1}}$$

$H_0$: $\beta_1 \leq 0$, $H_a$: $\beta_1 > 0$: reject $H_0$ if $t > t_{\alpha,\,n-2}$

$H_0$: $\beta_1 \geq 0$, $H_a$: $\beta_1 < 0$: reject $H_0$ if $t < -t_{\alpha,\,n-2}$
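The test on the slope might be sketched as follows. The function names and inputs are hypothetical, and the critical values are passed in from a t table rather than computed (3.182 and 2.353 are the table values for 3 degrees of freedom at upper-tail areas .025 and .05).

```python
import math

def slope_t_statistic(b1, s, ss_x):
    """t = b1 / s_b1, where s_b1 = s / sqrt(SS_X) and beta_1 = 0 under H0."""
    s_b1 = s / math.sqrt(ss_x)
    return b1 / s_b1

def reject_two_tailed(t, t_crit):
    """Ha: beta_1 != 0. t_crit is t_{alpha/2, n-2}."""
    return abs(t) > t_crit

def reject_upper_tailed(t, t_crit):
    """Ha: beta_1 > 0. t_crit is t_{alpha, n-2}."""
    return t > t_crit

# Hypothetical fit with n = 5: b1 = 0.6, s^2 = 0.8, SS_X = 10.
t = slope_t_statistic(b1=0.6, s=math.sqrt(0.8), ss_x=10.0)
two_tailed = reject_two_tailed(t, t_crit=3.182)      # alpha = .05, 3 df
upper_tailed = reject_upper_tailed(t, t_crit=2.353)  # alpha = .05, 3 df
```

Here t is about 2.12, so neither test rejects the null hypothesis at the 5% level. With so few points, even a visibly positive slope is not statistically significant.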

Page 21: t Curve with 8 df

Figure 13.10: t curve with the rejection region beginning at t = 1.860.

Page 22: Real Estate Example

Figure 13.11

Page 23: Real Estate Example

Figure 13.12

Page 24: Real Estate Example

Figure 13.13

Page 25: Real Estate Example

Figure 13.14

Page 26: Scatter Diagram

Figure 13.15: scatter diagram with the fitted line $\hat{Y} = -.814 + .3526X$. Y axis: Liquid assets (% of annual income), 10 to 30. X axis: Age, 12 to 60.

Page 27: Scatter Diagram

Figure 13.15

$SS_X = 1268.67$, $\bar{x} = 43.667$
$SS_Y = 348.92$, $\bar{y} = 14.583$
$SCP_{XY} = 447.33$

$$r = \frac{SCP_{XY}}{\sqrt{SS_X}\,\sqrt{SS_Y}} = .672$$
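The arithmetic on this slide can be reproduced from the summary values it lists. The sketch below uses only those numbers; the variable names are illustrative. The slope and intercept it produces appear to match the line shown with the scatter diagram on Page 26.

```python
import math

# Summary values as given on the slide.
ss_x = 1268.67
ss_y = 348.92
scp_xy = 447.33
x_bar = 43.667
y_bar = 14.583

r = scp_xy / math.sqrt(ss_x * ss_y)  # coefficient of correlation
b1 = scp_xy / ss_x                   # slope of the least squares line
b0 = y_bar - b1 * x_bar              # intercept of the least squares line

print(round(r, 3), round(b1, 4), round(b0, 3))
```

Rounded, these are r = .672, b1 = .3526, and b0 = -.814.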

Page 28: Confidence Interval for $\beta_1$

The $(1 - \alpha) \cdot 100\%$ confidence interval for $\beta_1$ is

$$b_1 - t_{\alpha/2,\,n-2}\, s_{b_1} \quad \text{to} \quad b_1 + t_{\alpha/2,\,n-2}\, s_{b_1}$$
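A small sketch of the interval, assuming the critical value is looked up in a t table and passed in. The function name and the numbers are hypothetical:

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Return (lower, upper) = b1 -/+ t_{alpha/2, n-2} * s_b1."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin

# Hypothetical fit with n = 5 (3 df): b1 = 0.6, s_b1 = 0.2828, t_{.025,3} = 3.182.
lower, upper = slope_confidence_interval(b1=0.6, s_b1=0.2828, t_crit=3.182)
```

The resulting 95% interval, roughly -0.30 to 1.50, contains zero, which agrees with a two-tailed test at the 5% level that fails to reject $\beta_1 = 0$.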

Page 29: Curvilinear Relationship

Figure 13.16: Y plotted against X.

Page 30: Measuring the Strength of the Model

$$r = \frac{SCP_{XY}}{\sqrt{SS_X}\,\sqrt{SS_Y}}$$

$H_0$: $\rho = 0$ (no linear relationship exists between X and Y)
$H_a$: $\rho \neq 0$ (a linear relationship does exist)

$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$
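The test statistic for the correlation depends only on r and n. A minimal sketch with hypothetical inputs:

```python
import math

def correlation_t_statistic(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), compared against t with n - 2 df."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

# Hypothetical sample: r = 0.6 from n = 18 observations (16 df).
t = correlation_t_statistic(r=0.6, n=18)
```

Here (1 - 0.36) / 16 = 0.04, whose square root is 0.2, so t = 0.6 / 0.2 = 3.0.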

Page 31: Danger of Assuming Causality

A high statistical correlation does not imply causality.

There are many situations when variables are highly correlated because a factor not being studied affects the variables being studied.

Page 32: Coefficient of Determination

$$SSE = SS_Y - \frac{(SCP_{XY})^2}{SS_X}$$

$$r^2 = \frac{(SCP_{XY})^2}{SS_X\, SS_Y}$$

$r^2$ = coefficient of determination

$$r^2 = 1 - \frac{SSE}{SS_Y}$$

= percentage of explained variation in the dependent variable using the simple linear regression model
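The two forms of the coefficient of determination are algebraically the same. A quick numerical check, using hypothetical sums and an illustrative function name:

```python
def r_squared_two_ways(ss_x, ss_y, scp_xy):
    """Return r^2 as SCP_XY^2 / (SS_X * SS_Y) and as 1 - SSE / SS_Y."""
    direct = scp_xy ** 2 / (ss_x * ss_y)
    sse = ss_y - scp_xy ** 2 / ss_x
    from_sse = 1 - sse / ss_y
    return direct, from_sse

# Hypothetical sums, not from the slides.
direct, from_sse = r_squared_two_ways(ss_x=10.0, ss_y=6.0, scp_xy=6.0)
```

Both forms give 0.6, meaning the line explains 60% of the variation in Y for this hypothetical sample.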

Page 33: Total Variation, $SS_Y$

Figure 13.17: a sample point $(x, y)$, the corresponding point $(x, \hat{y})$ on the line $\hat{Y} = b_0 + b_1 X$, and the horizontal line at $\bar{y}$, with the deviations $y - \hat{y}$, $\hat{y} - \bar{y}$, and $y - \bar{y}$ marked.

Page 34: Total Variation, $SS_Y$

Figure 13.17 (as on Page 33), with the decomposition:

$$SS_Y = SSR + SSE$$

$$SSR = \frac{(SCP_{XY})^2}{SS_X}$$

Page 35: Estimation and Prediction Using the Simple Linear Model

The least squares line can be used to estimate average values or predict individual values.

Page 36: Confidence Interval for $\mu_{Y|x_0}$

$(1 - \alpha) \cdot 100\%$ confidence interval for $\mu_{Y|x_0}$:

$$\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}} \quad \text{to} \quad \hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$$

$$s_{\hat{Y}} = s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$$
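The interval for the mean of Y at a given x0 can be sketched as below. All names and numbers are hypothetical, and the critical value comes from a t table.

```python
import math

def mean_confidence_interval(b0, b1, s, n, x_bar, ss_x, x0, t_crit):
    """Confidence interval for the mean of Y at X = x0."""
    y_hat = b0 + b1 * x0
    s_y_hat = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_hat - t_crit * s_y_hat, y_hat + t_crit * s_y_hat

# Hypothetical fit with n = 5: Y-hat = 2.2 + 0.6X, s^2 = 0.8, x_bar = 3, SS_X = 10.
fit = dict(b0=2.2, b1=0.6, s=math.sqrt(0.8), n=5, x_bar=3.0, ss_x=10.0, t_crit=3.182)
at_mean = mean_confidence_interval(x0=3.0, **fit)  # x0 equal to x_bar
at_edge = mean_confidence_interval(x0=5.0, **fit)  # x0 away from x_bar
```

At x0 = 3 the interval is about 2.73 to 5.27. At x0 = 5 it is wider, because the term $(x_0 - \bar{x})^2 / SS_X$ grows as x0 moves away from the mean of X.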

Page 37: Confidence and Prediction Intervals

Figure 13.18

Page 38: Confidence and Prediction Intervals

Figure 13.19

Page 39: Confidence and Prediction Intervals

Figure 13.20

Page 40: 95% Confidence Intervals

Figure 13.21: upper and lower confidence limits around the line $\hat{Y} = 4.975 + .3539X$, with the values 20.27 and 12.33 and x = 49.8 marked. Y axis: 5 to 35. X axis: 20 to 70.

Page 41: Prediction Interval for $Y_{x_0}$

$$\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}} \quad \text{to} \quad \hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X}}$$

$$s_Y^2 = s^2 \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_X} \right]$$
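The prediction interval differs from the confidence interval for the mean only by the extra 1 under the square root. A sketch with hypothetical names and values, again taking the critical value from a t table:

```python
import math

def prediction_interval(b0, b1, s, n, x_bar, ss_x, x0, t_crit):
    """Prediction interval for an individual Y at X = x0."""
    y_hat = b0 + b1 * x0
    s_y = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / ss_x)
    return y_hat - t_crit * s_y, y_hat + t_crit * s_y

# Hypothetical fit with n = 5: Y-hat = 2.2 + 0.6X, s^2 = 0.8, x_bar = 3, SS_X = 10.
lower, upper = prediction_interval(
    b0=2.2, b1=0.6, s=math.sqrt(0.8), n=5, x_bar=3.0, ss_x=10.0, x0=3.0, t_crit=3.182
)
half_width = (upper - lower) / 2
```

The half-width is about 3.12 here, noticeably larger than the 1.27 the confidence interval for the mean would give for the same inputs, since an individual value carries its own error variance in addition to the uncertainty in the fitted line.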

Page 42: 95% Confidence Intervals

Figure 13.22: prediction interval limits (24.43 and 8.17) and confidence interval limits (20.27 and 12.33), with x = 49.8 marked. Y axis: 5 to 35. X axis: 20 to 70.

Page 43: Checking Model Assumptions

1. The errors are normally distributed with a mean of zero
2. The variance of the errors remains constant. For example, you should not observe larger errors associated with larger values of X.
3. The errors are independent

Page 44: Examination of Residuals

Figure 13.23, panels (a) and (b): residuals $Y - \hat{Y}$ plotted against X.

Page 45: Examination of Residuals

Figure 13.24: residuals $Y - \hat{Y}$ plotted against Time, 1992 to 2001, around 0.

Page 46: Autocorrelation and the Durbin-Watson Statistic

Range from 0 to 4
Ideal value is 2
As DW decreases from 2, positive autocorrelation increases
As DW increases from 2, negative autocorrelation increases

$$DW = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2}$$
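The statistic is easy to compute from residuals listed in time order. The function name and the residual series below are hypothetical, chosen to show values on either side of 2:

```python
def durbin_watson(residuals):
    """DW = sum_{t=2..T} (e_t - e_{t-1})^2 / sum_{t=1..T} e_t^2."""
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals in time order.
dw_alternating = durbin_watson([1, -1, 1, -1])  # signs flip every period
dw_runs = durbin_watson([1, 1, -1, -1])         # same sign persists
```

The alternating series gives DW = 12 / 4 = 3.0, above 2, which points toward negative autocorrelation. The series with runs gives DW = 4 / 4 = 1.0, below 2, which points toward positive autocorrelation.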

Page 47: Autocorrelation and the Durbin-Watson Statistic

Figure 13.25

Page 48: Checking for Outliers

Figure 13.26

Page 49: Identifying Outlying Values

Outlying sample values can be found by calculating the sample leverage

$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{SS_X}$$

$$SS_X = \sum x^2 - \frac{(\sum x)^2}{n}$$

A sample value is considered an outlier if its leverage is greater than 4/n or 6/n.
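A sketch of the leverage calculation, using the 4/n cutoff. The function name and the data are hypothetical; the last x value is deliberately far from the others.

```python
def leverages(x):
    """h_i = 1/n + (x_i - x_bar)^2 / SS_X for each observation."""
    n = len(x)
    x_bar = sum(x) / n
    ss_x = sum(v * v for v in x) - sum(x) ** 2 / n
    return [1 / n + (v - x_bar) ** 2 / ss_x for v in x]

# Hypothetical X values: x_bar = 4 and SS_X = 50.
x = [1, 2, 3, 4, 10]
h = leverages(x)
cutoff = 4 / len(x)
flagged = [h_i > cutoff for h_i in h]
```

The leverages are 0.38, 0.28, 0.22, 0.20, and 0.92, so only the point at x = 10 exceeds the cutoff of 0.8. In simple regression the leverages always sum to 2, which is a useful sanity check.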

Page 50: Identifying Outlying Values

The standard deviation of the predicted Y value is

$$s_{\hat{Y}} = s \sqrt{h_i}$$

The confidence interval is

$$\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{h_i} \quad \text{to} \quad \hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{h_i}$$

The prediction interval is

$$\hat{Y} - t_{\alpha/2,\,n-2}\, s \sqrt{1 + h_i} \quad \text{to} \quad \hat{Y} + t_{\alpha/2,\,n-2}\, s \sqrt{1 + h_i}$$

Page 51: Real Estate Example

Figure 13.27(a)

Page 52: Real Estate Example

Figure 13.27(b)

Page 53: Identifying Outlying Values

Unusually large or small values of the dependent variable (Y) can generally be detected using the sample standardized residuals.

Estimated standard deviation of the ith residual:

$$s \sqrt{1 - h_i}$$

$$\text{Standardized residual} = \frac{Y_i - \hat{Y}_i}{s \sqrt{1 - h_i}}$$

An observation is thought to have an outlying value of Y if its standardized residual is > 2 or < -2.
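A minimal sketch of the standardized residual and the cutoff of 2 in absolute value. The function names and the inputs are hypothetical:

```python
import math

def standardized_residual(y, y_hat, s, h):
    """(Y_i - Y-hat_i) / (s * sqrt(1 - h_i))."""
    return (y - y_hat) / (s * math.sqrt(1 - h))

def is_outlying_y(std_res):
    """Flag the observation if its standardized residual is > 2 or < -2."""
    return abs(std_res) > 2

# Hypothetical observation: Y = 5, Y-hat = 4, s^2 = 0.8, leverage h = 0.2.
std_res = standardized_residual(y=5.0, y_hat=4.0, s=math.sqrt(0.8), h=0.2)
flag = is_outlying_y(std_res)
```

Here the denominator is sqrt(0.8) times sqrt(0.8) = 0.8, so the standardized residual is 1.25 and the observation is not flagged.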

Page 54: Identifying Influential Observations

Cook's distance measure

$$D_i = (\text{standardized residual})^2 \cdot \frac{1}{2} \cdot \frac{h_i}{1 - h_i} = \frac{(Y_i - \hat{Y}_i)^2}{2 s^2} \cdot \frac{h_i}{(1 - h_i)^2}$$

You may conclude the ith observation is influential if the corresponding $D_i$ measure is > .8.
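The two forms of Cook's distance should give the same number. A short check under hypothetical inputs, with illustrative function names:

```python
import math

def cooks_distance_from_std_res(std_res, h):
    """D_i = (standardized residual)^2 * (1/2) * h_i / (1 - h_i)."""
    return std_res ** 2 * 0.5 * h / (1 - h)

def cooks_distance_direct(y, y_hat, s, h):
    """D_i = (Y_i - Y-hat_i)^2 / (2 s^2) * h_i / (1 - h_i)^2."""
    return (y - y_hat) ** 2 / (2 * s ** 2) * h / (1 - h) ** 2

# Hypothetical observation: Y = 5, Y-hat = 4, s^2 = 0.8, leverage h = 0.2.
y, y_hat, s, h = 5.0, 4.0, math.sqrt(0.8), 0.2
std_res = (y - y_hat) / (s * math.sqrt(1 - h))
d_from_std_res = cooks_distance_from_std_res(std_res, h)
d_direct = cooks_distance_direct(y, y_hat, s, h)
influential = d_direct > 0.8
```

Both forms give about 0.195, well below .8, so this hypothetical observation would not be called influential.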

Page 55: Leverages, Standardized Residuals, and Cook's Distance Measures

Figure 13.28

Page 56: Summary of Figures 13.26 and 13.28

Point   Outlying in X value   Outlying in Y value     Influential observation
        (h_i > .4)            (|stand. res.| > 2)     (D_i > .8)
A       No                    Yes                     No
B       No                    No                      No
C       Yes                   Yes                     Yes

Table 13.1

Page 57: Engine Capacity and MPG

Figure 13.29

Page 58: Engine Capacity and MPG

Figure 13.30

Page 59: Engine Capacity and MPG

Figure 13.31

Page 60: Engine Capacity and MPG

Figure 13.32

Page 61: Engine Capacity and MPG

Figure 13.33