Chapter 15 Association Between Variables Measured at the Interval-Ratio Level.


Chapter 15

Association Between Variables Measured at the Interval-Ratio Level


Chapter Outline

Interpreting the Correlation Coefficient: r²
The Correlation Matrix
Testing Pearson’s r for Significance
Interpreting Statistics: The Correlates of Crime


Scattergrams

Scattergrams have two dimensions:
The X (independent) variable is arrayed along the horizontal axis.
The Y (dependent) variable is arrayed along the vertical axis.


Scattergrams

Each dot on a scattergram is a case.
The dot is placed at the intersection of the case’s scores on X and Y.


Scattergrams

[Figure: scattergram "Turnout By % College". X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]

Shows the relationship between % College Educated (X) and Voter Turnout (Y) on election day for the 50 states.


Scattergrams

[Figure: scattergram "Turnout By % College". X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]

Horizontal (X) axis: % of a state's population with a college education. Scores range from 15.3% to 34.6% and increase from left to right.


Scattergrams

[Figure: scattergram "Turnout By % College". X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]

Vertical (Y) axis: voter turnout. Scores range from 44.1% to 70.4% and increase from bottom to top.


Scattergrams: Regression Line

[Figure: scattergram "Turnout By % College" with regression line. X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]

A single straight line that comes as close as possible to all data points.
Indicates the strength and direction of the relationship.


Scattergrams: Strength of Regression Line

The greater the extent to which dots are clustered around the regression line, the stronger the relationship.

This relationship is weak to moderate in strength.

[Figure: scattergram "Turnout By % College". X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]


Scattergrams: Direction of Regression Line

Positive: regression line rises left to right.
Negative: regression line falls left to right.
This is a positive relationship: as % college educated increases, turnout increases.

[Figure: scattergram "Turnout By % College". X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]


Scattergrams

Inspection of the scattergram should always be the first step in assessing the correlation between two I-R variables.

[Figure: scattergram "Turnout By % College". X axis: % College, ticks from 15 to 35. Y axis: Turnout, ticks from 43 to 73.]


The Regression Line: Formula

This formula defines the regression line:

Y = a + bX

Where:
Y = score on the dependent variable
a = the Y intercept, or the point where the regression line crosses the Y axis
b = the slope of the regression line, or the amount of change produced in Y by a unit change in X
X = score on the independent variable


Regression Analysis

Before using the formula for the regression line, a and b must be calculated.
Compute b first, using Formula 15.3 (we won’t do any calculation for this chapter).
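For reference, the slope of a least-squares regression line can be written in the computational form below. This is the standard textbook expression and is only assumed to correspond to Formula 15.3; N (the number of cases) is a symbol the slides do not define.

```latex
b = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}{N\sum X^{2} - \left(\sum X\right)^{2}}
```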


Regression Analysis

The Y intercept (a) is computed from Formula 15.4.
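The standard least-squares intercept, assumed here to be what Formula 15.4 denotes, is computed from the slope and the means of the two variables. The symbols \bar{X} and \bar{Y} (the means of X and Y) are not defined on the slides.

```latex
a = \bar{Y} - b\,\bar{X}
```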


Regression Analysis

For the relationship between % college educated and turnout:
b (slope) = .42
a (Y intercept) = 50.03
Regression formula: Y = 50.03 + .42X

A slope of .42 means that turnout increases by .42 (less than half a percentage point) for every one-unit increase in % college educated.

The Y intercept means that the regression line crosses the Y axis at Y = 50.03.


Predicting Y

What turnout would be expected in a state where only 10% of the population was college educated?

What turnout would be expected in a state where 70% of the population was college educated?

This is a positive relationship, so the value of Y increases as X increases:
For X = 10, Y = 50.03 + .42(10) = 54.23
For X = 70, Y = 50.03 + .42(70) = 79.43
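The prediction step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `predict_turnout` is hypothetical, and the defaults a = 50.03 and b = .42 are taken from the regression formula given for the 50 states.

```python
def predict_turnout(pct_college, a=50.03, b=0.42):
    """Return the predicted turnout Y = a + b*X for a given % college educated (X).

    The defaults are the intercept and slope reported for the 50 states;
    the function name is a hypothetical choice for this sketch.
    """
    return a + b * pct_college


# Predicted turnout for the two hypothetical states discussed above.
for x in (10, 70):
    print(f"X = {x}: predicted turnout = {predict_turnout(x):.2f}")
```

Both X values lie outside the observed range of % college (15.3% to 34.6%), so these predictions extrapolate beyond the data and should be read with that in mind.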


Pearson correlation coefficient

But of course, this is just an estimate of turnout based on % college educated, and many other factors also affect voter turnout.

How much of the variation in voter turnout depends on % college educated? The relevant statistic is the coefficient of determination (r squared), but first we need to learn about Pearson’s correlation coefficient (r).


Pearson’s r

Pearson’s r is a measure of association for I-R variables.
It varies from -1.0 to +1.0.
The relationship may be positive (as X increases, Y increases) or negative (as X increases, Y decreases).
For the relationship between % college educated and turnout, r = .32.
The relationship is positive: as level of education increases, turnout increases.
How strong is the relationship? For that we use r squared, but first, let’s look at the calculation process.
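As a reference for the calculation, the standard computational formula for Pearson’s r is shown below. It is given as a general textbook expression rather than as the slides' own numbered formula; N (the number of cases) is a symbol the slides do not define.

```latex
r = \frac{N\sum XY - \left(\sum X\right)\left(\sum Y\right)}
         {\sqrt{\left[N\sum X^{2} - \left(\sum X\right)^{2}\right]\left[N\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
```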


Example of Computation

The computation and interpretation of a, b, and Pearson’s r will be illustrated using Problem 15.1.

The variables are:
Voter turnout (Y)
Average years of school (X)

The sample is 5 cities. This is only to simplify computations; 5 is much too small a sample for serious research.


Example of Computation

The scores on each variable are displayed in table format: Y = Turnout, X = Years of Education.

City    X       Y
A       11.9    55
B       12.1    60
C       12.7    65
D       12.8    68
E       13.0    70


Example of Computation

Sums are needed to compute b, a, and Pearson’s r.

        X       Y      X²        Y²       XY
        11.9    55     141.61    3025     654.5
        12.1    60     146.41    3600     726
        12.7    65     161.29    4225     825.5
        12.8    68     163.84    4624     870.4
        13.0    70     169       4900     910
Sum     62.5    318    782.15    20374    3986.4
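The sums in the table are enough to compute b, a, and Pearson’s r. The sketch below does so with the standard computational least-squares formulas, which are assumed (not confirmed by the slides) to match the chapter's numbered formulas; the variable names are illustrative.

```python
import math

# Scores for the five cities: X = years of education, Y = turnout.
x = [11.9, 12.1, 12.7, 12.8, 13.0]
y = [55, 60, 65, 68, 70]
n = len(x)

# The five sums shown in the bottom row of the table.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Standard computational formulas (assumed to correspond to the chapter's formulas).
numerator = n * sum_xy - sum_x * sum_y
x_term = n * sum_x2 - sum_x ** 2
y_term = n * sum_y2 - sum_y ** 2

b = numerator / x_term                   # slope
a = sum_y / n - b * (sum_x / n)          # Y intercept: mean of Y minus b times mean of X
r = numerator / math.sqrt(x_term * y_term)  # Pearson's r

print(f"b = {b:.2f}, a = {a:.2f}, r = {r:.2f}")
```

With these five cases the slope comes out at about 12.67, the intercept at about -94.73, and r rounds to 0.98.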


Interpreting Pearson’s r

An r of 0.98 indicates an extremely strong relationship between average years of education and voter turnout for these five cities.

The coefficient of determination is r² = .96. Knowing education level improves our prediction of voter turnout by 96%. This is a PRE measure (like lambda and gamma).

We could also say that education explains 96% of the variation in voter turnout.


Interpreting Pearson’s r

Our first example provides a more realistic value for r.
The r between turnout and % college educated for the 50 states was: r = .32
This is a weak to moderate, positive relationship.
The value of r² is .10.

Percent college educated explains 10% of the variation in turnout.
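The step from r to the coefficient of determination is a single squaring, which a short Python sketch makes concrete. The function name `coefficient_of_determination` is a hypothetical choice, and the two r values are the rounded figures reported in the examples.

```python
def coefficient_of_determination(r):
    """Return r squared, the proportion of variation in Y explained by X."""
    return r ** 2


# Rounded r values reported for the two examples in this chapter.
for label, r in (("five cities", 0.98), ("50 states", 0.32)):
    r2 = coefficient_of_determination(r)
    print(f"{label}: r = {r:.2f}, r squared = {r2:.2f}, unexplained = {1 - r2:.2f}")
```

The unexplained share (1 minus r squared) is the part of the variation in turnout attributable to factors other than education.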