1 Relationships We have examined how to measure relationships between two categorical variables...

  • date post

    22-Dec-2015


Relationships

We have examined how to measure relationships between two categorical variables (chi-square) and between one categorical variable and one measurement variable (t-test, F-test)

Now we look at relationships between two measurement variables


Interval variable relations

We want to describe the relationship in terms of form and strength

We want to make inferences to the population


Our Tools

Correlation to measure strength of relationship

Regression to measure form of relationship


Regression

Begin with a scatterplot of two measurement variables, X and Y

Let X be the independent variable

Let Y be the dependent variable

Plot each case as we have done before at the beginning of the course.


Scatterplot

[Figure: scatterplot of homicides per 100,000 people (Y axis, 0.00 to 30.00) against percent of families below poverty (X axis, 0.00 to 15.00)]


Note the outlier: Dallas


Relationships

Each city is represented by an X score (percent poor) and a Y score (homicide rate)

We are asking about the relationship between poverty and homicide

Does homicide change as percent poor changes? If so, in what way and how much?


Looking at the scatterplot

We see that as percent poor (poverty) increases (from left to right on the graph), the homicide rate increases (from low to high on the graph)


Scatterplot

[Figure: scatterplot of homicides per 100,000 people (Y axis, 0.00 to 30.00) against percent of families below poverty (X axis, 0.00 to 15.00)]


Representing relationships

We represent the relationship with a straight line that goes through the middle of the points on the graph

This line is the regression line

It shows the average homicide rate for every level of poverty.


Regression Line

[Figure: regression line drawn on the plot area (X axis, 0.00 to 15.00; Y axis, 0.00 to 30.00)]


Regression Line

Every line is represented by a formula

The regression line has the following general formula:

$$\hat{y} = a + bx$$

‘a’ represents the intercept of the line

‘b’ represents the slope of the line

y-hat is the predicted value of y for a given x value


Regression of homicide on poverty

$$\hat{y} = -.815 + .944x$$

a = -.815, b = .944

x is percent poor

y is homicide rate


Slope, the value of b

The slope of the regression line is positive: it goes from the lower left to the upper right.

The slope measures the amount of change in the dependent variable for every unit change in the independent variable

b = .944. There is an increase of .944 units in y for every increase of 1.0 in x
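
As a small illustration of this reading of the slope, the sketch below uses the value b = .944 quoted on the slide to show that a change of k units in x changes the predicted y by k × b. The helper name `predicted_change` is hypothetical and is introduced only for this example.

```python
# Slope reported on the slide for the regression of homicide rate on percent poor.
B = 0.944

def predicted_change(delta_x, slope=B):
    """Change in predicted y when x changes by delta_x along the regression line."""
    return slope * delta_x

# A 1-unit increase in percent poor raises the predicted homicide rate by b.
print(round(predicted_change(1.0), 3))
# A 5-unit "run" along x gives a "rise" of 5 x .944 in predicted y.
print(round(predicted_change(5.0), 3))
```

Because the line is straight, the change does not depend on where along x the step starts.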


Regression Line, slope

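[Figure: regression line plotted against percent families below poverty (X axis, 0.00 to 10.00; Y axis, 0.00 to 20.00). A “run” of 5 units along X corresponds to a “rise” of 5 x .944 units along the regression line.]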


Intercept, the value of a

The intercept is the point where the regression line crosses the Y axis

This point is the value of Y when X is zero

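a = -.815. The predicted rate of homicide is -.815 when there is zero poverty. A negative homicide rate cannot occur, so here the intercept mainly fixes the position of the line rather than giving a realistic prediction.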


Calculate b

$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$


Calculate a

First calculate b, then

$$a = \bar{y} - b\bar{x}$$
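
The two calculations can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the slides' numbers: the slope is the sum of cross-products of deviations divided by the sum of squared deviations of x, and the intercept is the mean of y minus b times the mean of x. The function name `fit_line` and the data are made up for the example; they are not the city data.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator of b: sum of cross-products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator of b: sum of squared deviations of x from its mean.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # The intercept needs b, so it is calculated second.
    a = y_bar - b * x_bar
    return a, b

# Made-up illustrative data (assumption: NOT the data behind the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = fit_line(xs, ys)
print(round(a, 3), round(b, 3))  # prints 2.2 0.6
```

A consequence of computing a this way is that the fitted line always passes through the point of means (x-bar, y-bar).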


Calculate predicted y

After calculating a and b, one can use the regression line formula to calculate predicted values of y for every actual value of x

$$\hat{y} = -.815 + .944x$$
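
A short sketch of this step, using the coefficients quoted on the slides (a = -.815, b = .944). The function name `predict` and the three poverty levels are assumptions chosen only for illustration; they are not cases from the original data.

```python
A = -0.815  # intercept reported on the slides
B = 0.944   # slope reported on the slides

def predict(x, a=A, b=B):
    """Predicted homicide rate (per 100,000) for x percent of families below poverty."""
    return a + b * x

# Hypothetical poverty levels, chosen only to show the arithmetic.
for x in [5.0, 10.0, 15.0]:
    print(x, round(predict(x), 3))
```

For example, at 10 percent poor the predicted rate is -.815 + .944 × 10 = 8.625 homicides per 100,000 people.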


Prediction errors

Prediction errors are the differences between the actual values of y and the predicted values of y (actual minus predicted)

$$e_i = y_i - \hat{y}_i$$
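
To make the sign convention concrete, here is a minimal sketch that computes each error as actual minus predicted, using the coefficients quoted on the slides. The helper name `residuals` and the two cases are hypothetical; they are not cities from the original data.

```python
A, B = -0.815, 0.944  # coefficients reported on the slides

def residuals(xs, ys, a=A, b=B):
    """Return e_i = y_i - y-hat_i (actual minus predicted) for each case."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Two hypothetical cases: (percent poor, homicide rate).
xs = [5.0, 10.0]
ys = [6.0, 7.0]
for e in residuals(xs, ys):
    print(round(e, 3))
```

A positive error means the case lies above the regression line (the line under-predicts it); a negative error means it lies below.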


Prediction errors

[Figure: the regression line with labels marking an actual value, its predicted value on the line, and the error (actual minus predicted) between them]


Ordinary Least Squares: OLS

The regression line is the “best fitting” line through the data points in the graph

It is the line that minimizes the sum of the squared error terms, hence “least squares”

Minimize: $$\sum e_i^2 = \sum (y_i - \hat{y}_i)^2$$
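
The least-squares property can be checked numerically. The sketch below, which assumes made-up data rather than the city data, fits the line with the usual formulas for b and a and then confirms that nudging either coefficient away from the fitted values raises the sum of squared errors. The names `sse` and `ols` are hypothetical helpers.

```python
def sse(a, b, xs, ys):
    """Sum of squared errors for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

def ols(xs, ys):
    """Least-squares intercept and slope from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Made-up illustrative data (assumption: NOT the data behind the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = ols(xs, ys)
best = sse(a, b, xs, ys)
print(round(best, 3))

# Moving a or b away from the fitted values increases the SSE.
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    print(da, db, sse(a + da, b + db, xs, ys) > best)
```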


Sums of Squared Errors

[Figure: surface plot of the sum of squared errors (SS, vertical axis, 460.0 to 640.0) over values of a (-1.0 to -0.6) and b (0.7 to 1.1)]


Sum of Squared Errors

Sum of squared errors for selected values of a (columns) and b (rows):

  b \ a    -1.0    -0.9    -0.8    -0.7    -0.6
  0.7     638.7   630.2   622.0   614.3   607.0
  0.8     572.9   567.6   562.8   558.3   554.3
  0.9     537.9   535.9   534.3   533.2   532.4
  1.0     533.7   535.0   536.7   538.8   541.3
  1.1     560.4   565.0   569.9   575.3   581.1

Minimum is 531.57 when a = -.815, b = .944
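
A table like the one on this slide can be produced by evaluating the sum of squared errors over a grid of candidate values of a and b. The sketch below does this for made-up data, since the city data are not reproduced in the slides, so the numbers it prints will not match the slide. The grid values and variable names are assumptions chosen so that the least-squares solution for this toy data set (a = 2.2, b = 0.6) falls on the grid.

```python
def sse(a, b, xs, ys):
    """Sum of squared errors for the line y-hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up illustrative data (assumption: NOT the data behind the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a_values = [2.0, 2.1, 2.2, 2.3, 2.4]  # candidate intercepts (columns)
b_values = [0.4, 0.5, 0.6, 0.7, 0.8]  # candidate slopes (rows)

# SSE for every (a, b) pair on the grid.
grid = {(a, b): sse(a, b, xs, ys) for b in b_values for a in a_values}

# Print the grid with a across the top and b down the side.
print("b \\ a " + "".join(f"{a:8.1f}" for a in a_values))
for b in b_values:
    print(f"{b:6.1f}" + "".join(f"{grid[(a, b)]:8.2f}" for a in a_values))

# The smallest entry identifies the best-fitting line among the candidates.
best_a, best_b = min(grid, key=grid.get)
print("grid minimum at a =", best_a, "b =", best_b)
```

A grid only locates the minimum to the resolution of its spacing; the formulas for b and a give the exact minimizing values directly.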