Correlation and Regression PS397 Testing and Measurement January 16, 2007 Thanh-Thanh Tieu.


Page 1:

Correlation and Regression

PS397 Testing and Measurement

January 16, 2007

Thanh-Thanh Tieu

Page 2:

Canoe.ca

Title of article implies that when women are depressed, they tend to drink more

Correlational relationship between drinking and depression in women

http://lifewise.canoe.ca/Living/2007/01/05/3176991-cp.html

Page 3:

Scatter Diagram

Visual display of the relationship between variables

Bivariate distribution: two scores for each individual

Each point shows where an individual scores on both x and y

E.g., relationship between high school average and university average

Participant 11: 3.2 high school GPA, 3.3 university GPA

Page 4:

Correlation

What does one variable tell us about the other?

Looks at how the two variables covary

Changes in one correspond to changes in the other

Correlation coefficient tells us the direction and magnitude of the relationship

i.e., how variables are related (+/-) and the strength of the relationship

Page 5:

Correlation Coefficient

[Figure: scatter diagrams illustrating Positive Correlation, Negative Correlation, and No Correlation]

Page 6:

Correlation Coefficient

Correlation coefficient varies from -1.0 (perfect negative relationship) to +1.0 (perfect positive relationship)

Accounts for the individual’s deviation above and below the group mean on each variable

Above the mean on both variables = 2 positive standard scores

Below the mean on both variables = 2 negative standard scores

Page 7:

Correlation Coefficient

Pearson correlation coefficient is the mean of these products of standard scores:

$$r = \frac{1}{N} \sum \left(\frac{x - \bar{x}}{s_x}\right)\left(\frac{y - \bar{y}}{s_y}\right)$$

Positive Value: standard scores have the same sign and are of approximately equal size

Negative Value: standard score is above the mean on one variable and below the mean on the other (cross product is negative)

No Correlation: some products are positive and some are negative, so they cancel out
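As a rough illustration of the "mean of products" idea, the sketch below converts each score to a standard score and averages the cross products. The function name `pearson_r` and the GPA values are hypothetical, chosen only for illustration; the standard deviations are assumed to divide by N, which is what makes r the mean of the products.

```python
from statistics import fmean, pstdev


def pearson_r(xs, ys):
    """Pearson r as the mean of the products of paired standard scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 scores")
    mean_x, mean_y = fmean(xs), fmean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)  # population SD, i.e. divide by N
    if sd_x == 0 or sd_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    # One cross product of standard scores per individual.
    products = [
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ]
    return fmean(products)


# Hypothetical GPAs for five students (illustrative only, not course data).
high_school_gpa = [2.5, 3.0, 3.2, 3.6, 3.9]
university_gpa = [2.4, 2.9, 3.3, 3.4, 3.8]

print(round(pearson_r(high_school_gpa, university_gpa), 3))
```

Students who sit above the mean on both variables, or below the mean on both, contribute positive products, which is why this made-up sample yields a strongly positive r.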

Page 8:

Regression

If you had no other information, what is the best prediction for a person’s grade in a course?

Often we have other information (e.g., grades on other courses, midterm grades, etc.)

If variables are correlated with the variable of interest, this information can help us improve our prediction

This process is called regression

Page 9:

Regression

Regression line: best fitting straight line through a set of points in a scatter diagram

Principle of Least Squares: the regression line is the line with the minimum squared deviation of the points from it
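The least squares principle can be sketched in a few lines. The helper names `fit_line` and `sum_squared_dev` and the data points are hypothetical; the slope and intercept use the standard closed-form least squares solution for one predictor, which is an assumption about the method rather than something taken from the slides.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + bX."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx       # slope
    a = mean_y - b * mean_x   # intercept: line passes through (mean_x, mean_y)
    return a, b


def sum_squared_dev(xs, ys, a, b):
    """Sum of squared vertical deviations of the points from Y' = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical scores (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

best_a, best_b = fit_line(xs, ys)
print(sum_squared_dev(xs, ys, best_a, best_b))        # fitted line
print(sum_squared_dev(xs, ys, best_a, best_b + 0.2))  # a steeper line does worse
```

Any other intercept or slope gives a larger sum of squared deviations than the fitted line, which is what "best fitting" means here.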

Page 10:

Regression Line

Y’ = a + bX

Y’ = predicted score

a = intercept, the value of Y when X is 0; the point where the regression line crosses the Y axis

b = regression coefficient, slope of the regression line

X = known score

Page 11:

Regression Line

Page 12:

Regression Line

Y’ = a + bX

Y’ = 20 + .1X

Where Y’ = predicted grade for course, X = SAT score

slope = .1

intercept = 20
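To make the equation concrete, the sketch below plugs a few SAT scores into Y’ = 20 + .1X. The function name `predicted_grade` and the SAT scores are hypothetical; only the intercept of 20 and the slope of .1 come from the slide.

```python
def predicted_grade(sat_score, intercept=20.0, slope=0.1):
    """Y' = a + bX with the slide's values a = 20 and b = .1."""
    return intercept + slope * sat_score


for sat_score in (400, 500, 600):  # hypothetical SAT scores
    print(sat_score, round(predicted_grade(sat_score), 1))
```

With a slope of .1, every additional 10 SAT points raises the predicted grade by 1 point, and a score of 0 would give the intercept, 20.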

Page 13:

Regression

What if there were no correlation between X and Y?

What would the regression line look like?
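One way to explore this question is to fit a line to scores that are uncorrelated by construction. In the sketch below the data and the helper `fit_line` are hypothetical, and the fit uses the standard closed-form least squares solution.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + bX."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical scores chosen so that the cross products sum to zero (r = 0).
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 2, 1, 3]

a, b = fit_line(xs, ys)
print(a, b, fmean(ys))
```

The slope comes out as 0 and the intercept equals the mean of Y, so the line is flat: knowing X does not change the prediction, and the best guess for everyone is the mean of Y.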

Page 14:

Regression

The larger the value of b (in absolute terms), the more information we have about Y by knowing X

Page 15:

Regression

What happens if both variables are in terms of standard scores?

Y’ = a + bX

a = 0

b = r, the correlation between X and Y

Regression equation would be: $Z_{Y'} = r Z_X$

Correlation: special case of regression where both variables are in standard scores
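The claim that a = 0 and b = r for standard scores can be checked numerically. The sketch below standardizes two sets of hypothetical scores, fits a least squares line to the Z scores, and compares the slope with r computed as the mean of the Z-score products. The helper names and data are assumptions for illustration.

```python
from statistics import fmean, pstdev


def z_scores(values):
    """Convert raw scores to standard scores (population SD, divide by N)."""
    mean, sd = fmean(values), pstdev(values)
    return [(v - mean) / sd for v in values]


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + bX."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx
    return mean_y - b * mean_x, b


# Hypothetical raw scores (illustrative only).
xs = [2.5, 3.0, 3.2, 3.6, 3.9]
ys = [2.4, 2.9, 3.3, 3.4, 3.8]

zx, zy = z_scores(xs), z_scores(ys)
r = fmean([p * q for p, q in zip(zx, zy)])  # mean of Z-score products
a, b = fit_line(zx, zy)
print(round(a, 6), round(b, 6), round(r, 6))
```

Up to floating point rounding, the intercept is 0 and the slope matches r.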

Page 16:

Regression Problems

Break into groups of 3 people and complete the problems on the handout

Page 17:

Terms Used in Correlation & Regression

Residual: difference between predicted and observed values

Y - Y’

Σ residuals = 0

Standard Error of the Estimate: standard deviation of residuals, kind of an average of residuals

A measure of accuracy of prediction

Smaller = more accurate predictions because differences between Y and Y’ are small

$$S_{yx} = \sqrt{\frac{\sum (Y - Y')^2}{N - 2}}$$
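Both terms can be illustrated with a small sketch. It assumes the standard error of the estimate is the square root of the sum of squared residuals divided by N - 2; the helper names and the scores are hypothetical.

```python
from math import sqrt
from statistics import fmean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + bX."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx
    return mean_y - b * mean_x, b


def residuals(xs, ys, a, b):
    """Observed minus predicted value (Y - Y') for each individual."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def standard_error_of_estimate(res):
    """sqrt(sum of squared residuals / (N - 2)); assumes N > 2."""
    return sqrt(sum(e ** 2 for e in res) / (len(res) - 2))


# Hypothetical scores (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 2.9, 4.2, 4.8, 6.1]

a, b = fit_line(xs, ys)
res = residuals(xs, ys, a, b)
print(round(sum(res), 9))                         # residuals sum to about 0
print(round(standard_error_of_estimate(res), 3))  # typical size of a miss
```

The residuals of a least squares line always sum to zero apart from rounding, so their sum says nothing about accuracy; the standard error of the estimate is what summarizes how far predictions tend to be from observed scores.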

Page 18:

Terms Used in Correlation & Regression

Coefficient of Determination (r²): % of total variation in one set of scores that we know as a function of information about the other set

Cross Validation: calculate the standard error of estimate in a group of participants other than the one used to get the equation

Restricted Range: when restrictions on the sample inhibit variability, the observed correlation will likely be deflated
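The coefficient of determination can be computed directly as the share of variation in Y that the regression line accounts for. The sketch below assumes a single predictor and a least squares line, in which case 1 minus (residual variation / total variation) equals r squared. The helper names and the scores are hypothetical.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line Y' = a + bX."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    b = sum_xy / sum_xx
    return mean_y - b * mean_x, b


def r_squared(xs, ys):
    """Proportion of the total variation in Y accounted for by X."""
    a, b = fit_line(xs, ys)
    mean_y = fmean(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_residual / ss_total


# Hypothetical GPAs (illustrative only).
high_school_gpa = [2.5, 3.0, 3.2, 3.6, 3.9]
university_gpa = [2.4, 2.9, 3.3, 3.4, 3.8]

print(round(r_squared(high_school_gpa, university_gpa), 3))
```

Multiplying the result by 100 gives the percentage reading used in the definition above.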

Page 19:

Terms Used in Correlation & Regression

Correlation-Causation Problem: correlation between two variables does not necessarily mean that one causes the other

E.g., aggression and TV viewing

Third Variable Explanation: the possibility that a third variable that hasn’t been measured causes both

E.g., poor social adjustment may cause both aggression and TV viewing

Page 20:

Multiple Regression

Looks at relationship among three or more variables

E.g., predicting course grade from SAT scores and average from previous year

$$y' = a + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$$

Where k = # of predictor variables

Example: predicting law school GPA from undergrad GPA, professors’ ratings, age

Law school GPA = .8 (Z score of undergrad GPA) + .24 (Z score of profs’ ratings) + .03 (Z score of age)

Page 21:

Multiple Regression

$$y' = a + .8x_1 + .24x_2 + .03x_3$$

When variables are expressed in Z-units, weights are standardized regression coefficients

Also called B’s or betas

If not in Z-units, weights are raw regression coefficients

Also called b’s

Need to be careful when predictor variables are highly correlated

Best when predictor variables are uncorrelated
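The law school example can be worked through for a single applicant. The weights .8, .24, and .03 come from the slides; the applicant's Z scores, the dictionary keys, and the function name are hypothetical. The sketch assumes all variables are in Z-units, so the intercept is taken to be 0 and the prediction is itself a Z score.

```python
# Standardized weights (betas) from the law school example.
BETAS = {"undergrad_gpa": 0.80, "prof_ratings": 0.24, "age": 0.03}


def predicted_law_gpa_z(z_scores):
    """Weighted sum of predictor Z scores; intercept assumed to be 0."""
    return sum(BETAS[name] * z_scores[name] for name in BETAS)


# Hypothetical applicant: 1 SD above the mean on undergrad GPA, half an SD
# above on professors' ratings, 1 SD below the mean on age.
applicant = {"undergrad_gpa": 1.0, "prof_ratings": 0.5, "age": -1.0}

print(round(predicted_law_gpa_z(applicant), 2))
```

Because the weights share a common scale, they can be compared directly: in this example undergrad GPA carries far more weight in the prediction than age.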

Page 22:

Teaching Evaluation

For: Thanh-Thanh Tieu

Date: January 16, 2007

Class: Correlation & Regression, PS397

Strengths of the Lecture

Suggestions for Improvement

Additional Comments