Correlation examining relationships. Five Descriptive Questions What is the middle of the set of...

Post on 05-Jan-2016

Correlation: examining relationships

Five Descriptive Questions

What is the middle of the set of scores?
How spread out are the scores?
Where do specific scores fall in the distribution of scores?
What is the shape of the distribution?
How do different variables relate to each other?

Correlation

Once you know:
– Middle
– Spread
– Shape
– Relative position of specific cases

It is now useful to know relationships between variables.

Correlation

Direction of Relationships
– Positive or Negative
Magnitude of Relationships
– Weak, Moderate, Strong
Scatterplots
Outliers

Correlation

Quantitative index of association
Scaling of Pearson r
– -1 = perfect negative relationship
– 0 = no relationship
– +1 = perfect positive relationship
Most common measure of association for interval and ratio variables
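As a minimal sketch of how the index is computed, the function below implements the standard Pearson product-moment formula (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the small data sets are made up for illustration; they are not from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative scores (not data from the slides)
x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])   # y rises with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])   # y falls as x rises
r_none = pearson_r(x, [1, 3, 2, 3, 1])   # no linear trend
print(r_pos, r_neg, r_none)
```

The three calls land on the three anchor points of the scale: +1, -1, and 0.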

Examples

Parent educational level and student academic achievement
Parent income or SES and student academic achievement
Coping strategies and perceived stress

Correlation

For positive correlations between two variables:
– High values on x tend to be associated with high values on y
– Low values on x tend to be associated with low values on y

[Scatterplot: Curriculum (x) vs. Total Score (y); high positive correlation, r = .825]

[Scatterplot: GABIRTH (x) vs. WEIGHT (y)]

[Scatterplot: GAOBS (x) vs. WEIGHT (y)]

[Scatterplot: FRL (x) vs. TURNOVER (y); r = .337, 2001-2002 NC State System Level Data]

Correlation

For negative correlations between two variables:
– Low values on x tend to be associated with high values on y
– High values on x tend to be associated with low values on y

[Scatterplot: Perceived Control (x) vs. PSS total (y); r = -.613]

[Scatterplot: FRL (x) vs. EOG (y); r = -.716, 2001-2002 NC State System Level Data]

[Scatterplot: TURNOVER (x) vs. EOG (y); r = -.560, 2001-2002 NC State System Level Data]

Interpretation Guidelines

Correlation is not causality.
Correlation is necessary for causal inference, but not sufficient.
Causal inference requires experimental designs.

Interpretation Guidelines

Rum use and number of people entering the priesthood.
Square footage of home and student academic achievement.
Percent of women in a state who earn high salaries and percent of public officials who are women.

Interpretation Guidelines

The third variable problem.
– SES and home size.
The risk factor vs. causal agent problem.
– Length of time smoking and life expectancy.
The direction of causality problem.
– Productivity and job satisfaction.
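The third variable problem can be sketched with simulated data. In the hypothetical setup below, SES drives both home size and achievement, and neither of those two affects the other. All numbers are randomly generated for illustration, the variable names are assumptions, and the first-order partial correlation formula is a standard way to "hold the third variable constant", not something the slides prescribe.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y with z held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated, hypothetical data: SES influences both other variables,
# but home size has no direct effect on achievement.
rng = random.Random(0)          # fixed seed so the run is repeatable
n = 5000
ses = [rng.gauss(0, 1) for _ in range(n)]
home_size = [s + rng.gauss(0, 1) for s in ses]
achievement = [s + rng.gauss(0, 1) for s in ses]

r_home_ach = pearson_r(home_size, achievement)
r_home_ses = pearson_r(home_size, ses)
r_ach_ses = pearson_r(achievement, ses)
r_partial = partial_r(r_home_ach, r_home_ses, r_ach_ses)

print("home size vs. achievement:", round(r_home_ach, 3))
print("same, controlling for SES:", round(r_partial, 3))
```

In this simulation the raw correlation is clearly positive, while the partial correlation is close to zero, which is the pattern expected when a third variable accounts for the relationship.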

Interpretation GuidelinesInterpretation Guidelines

R assumes a linear relationship. R assumes a linear relationship. R will underestimate curvilinear R will underestimate curvilinear

relationships.relationships.Restriction of range will lower Restriction of range will lower

correlation.correlation.Outliers, gaps in distributions, non-Outliers, gaps in distributions, non-

normal distributions can all influence r.normal distributions can all influence r.Be aware of subgroups.Be aware of subgroups.
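Two of these points are easy to check numerically. The sketch below uses made-up data: a perfectly curvilinear relationship (y = x squared over a symmetric range) and a noisy linear relationship measured first over the full range of x and then over only its middle portion. The data and variable names are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# 1. Curvilinear relationship: y is completely determined by x,
#    yet the linear index r does not detect it.
x_curve = list(range(-5, 6))
y_curve = [x ** 2 for x in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# 2. Restriction of range: same underlying relationship, but the second
#    correlation uses only cases with x between 5 and 14.
x_full = list(range(20))
y_full = [x + (3 if x % 2 == 0 else -3) for x in x_full]   # line plus fixed "noise"
r_full = pearson_r(x_full, y_full)

restricted = [(x, y) for x, y in zip(x_full, y_full) if 5 <= x <= 14]
x_mid = [x for x, _ in restricted]
y_mid = [y for _, y in restricted]
r_restricted = pearson_r(x_mid, y_mid)

print("curvilinear r:     ", round(r_curve, 3))
print("full-range r:      ", round(r_full, 3))
print("restricted-range r:", round(r_restricted, 3))
```

The curvilinear case yields an r of zero despite a perfect relationship, and the restricted sample yields a lower r than the full sample even though the data-generating rule is identical.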

Interpretation Guidelines

Examine the scatterplot.
Examine the distributions of both variables.
Be aware of the other descriptive statistics on both variables.
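One way to follow this advice is to print basic descriptive statistics for each variable next to the correlation. The helper below is a small sketch using Python's standard `statistics` module; the function name `describe` and the scores are hypothetical.

```python
import statistics

def describe(values):
    """Basic descriptive statistics for one variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),   # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# Hypothetical scores, for illustration only
stress = [12, 18, 25, 31, 40]
depression = [3, 9, 8, 15, 22]

for name, values in (("stress", stress), ("depression", depression)):
    print(name, describe(values))
```

A very small standard deviation or a narrow min-max span on either variable is a hint that the range may be restricted.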

Interpreting Magnitude

-1.0: perfect negative relationship
-1.0 to -0.7: strong
-0.7 to -0.3: moderate
-0.3 to 0.0: weak
0.0: no relationship
0.0 to 0.3: weak
0.3 to 0.7: moderate
0.7 to 1.0: strong
1.0: perfect positive relationship
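The cut points on this scale (0.3 and 0.7) can be turned into a small labeling helper. This is a sketch only: the slide does not say which label a value exactly at 0.3 or 0.7 receives, so the choice here (0.3 counts as moderate, 0.7 as strong) is an assumption, as is the function name.

```python
def describe_magnitude(r):
    """Label a correlation using the 0.3 / 0.7 cut points from the scale."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 0:
        return "no relationship"
    if size == 1:
        return f"perfect {direction}"
    # Assumption: boundary values fall into the stronger category
    if size < 0.3:
        strength = "weak"
    elif size < 0.7:
        strength = "moderate"
    else:
        strength = "strong"
    return f"{strength} {direction}"

# Correlations reported on the slides
for r in (0.825, 0.337, -0.613, -0.716, -0.279):
    print(r, describe_magnitude(r))
```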

Outliers

You can look at outliers in the univariate case (within the distribution of a single variable) and in the bivariate case (within the scatterplot of points representing values on two variables).

Examine the scatterplots for values out of the pattern.
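To see why a point out of the pattern matters, the sketch below computes r for ten made-up points that follow a clear upward trend, then again after adding a single bivariate outlier. The data are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for two equal-length sequences of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Ten made-up points that follow an upward trend with a little scatter
x = list(range(1, 11))
y = [xi + (1 if xi % 2 == 0 else -1) for xi in x]
r_without = pearson_r(x, y)

# Add one case far outside the pattern: very high on x, very low on y
x_out = x + [30]
y_out = y + [0]
r_with = pearson_r(x_out, y_out)

print("without outlier:", round(r_without, 3))
print("with outlier:   ", round(r_with, 3))
```

In this example one extreme case is enough to turn a strong positive correlation into a negative one, which is why the scatterplot should be checked before r is interpreted.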

[Scatterplot: GAOBS (x) vs. GABIRTH (y)]

[Scatterplot: AGEDAYS (x) vs. WEIGHT (y)]

[Scatterplot: GABIRTH (x) vs. AGEDAYS (y)]

[Scatterplot: WEIGHT (x) vs. NPBASEHR (y)]

What would you expect?

Teacher age
Classroom quality

[Scatterplot: Total Score (x) vs. Age of Teacher (y)]

r = -.279

What would you expect?

Perceived stress
Depression

r = .582

[Scatterplot: BDI Total (x) vs. PSS total (y)]

What would you expect?

Depression
Self-acceptance

r = -.596

[Scatterplot: Self-Acceptance (x) vs. BDI Total (y)]

What would you expect?

Emotional Exhaustion
Depersonalization

r = .574