Psychology 820 Correlation Regression & Prediction.

Transcript of Psychology 820 Correlation Regression & Prediction.

Page 1: Psychology 820 Correlation Regression & Prediction.

Psychology 820

Correlation

Regression & Prediction

Page 2: Psychology 820 Correlation Regression & Prediction.

Concept of Correlation

A coefficient of correlation (r or ρ ‘rho’) is a statistical summary of the degree and direction of relationship or association between two variables (X and Y)

Degree of Relationship

The magnitude of a correlation ranges from 0 to 1.00; with its sign, the coefficient ranges from -1.00 to +1.00

Direction of Relationship

Positive (+) relationship: High score on X goes with a High score on Y

Negative (-) relationship: High score on X goes with a Low score on Y

Page 3: Psychology 820 Correlation Regression & Prediction.

The Bivariate Normal Distribution

A family of three dimensional surfaces

Page 4: Psychology 820 Correlation Regression & Prediction.

Scatterplots

The chief purpose of the scatterplot is to study the nature of the relationship between two variables.

Components of r

Pearson Product Moment Correlation
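The slide names the components of r without listing them. A minimal sketch, assuming the usual deviation-score form of the Pearson coefficient (sum of cross-products divided by the square root of the product of the two sums of squares). The function name and the scores are made up for illustration and are not course data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations (numerator of the covariance)
    sp_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations (numerators of the two variances)
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp_xy / sqrt(ss_x * ss_y)

# Hypothetical scores, for illustration only
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 4))  # 0.7746
```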

Page 5: Psychology 820 Correlation Regression & Prediction.

Additional Measures of Relationships

Spearman Rank Correlation: Both X and Y are ranks

Phi Coefficient: Both X and Y are dichotomies

Point-Biserial Coefficient: One dichotomous variable and one continuous measure

Biserial Correlation: One artificial dichotomy and one continuous measure

Tetrachoric Coefficient: Both X and Y are artificial dichotomies
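As an illustration of the first measure in this list, the Spearman coefficient can be obtained by ranking both variables and applying the Pearson formula to the ranks. This is a sketch with hypothetical helper names and made-up scores; tied values are given the average of their ranks.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical scores: Y always rises with X, but not along a straight line
x = [10, 20, 30, 40, 50]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

Here the ranks agree perfectly, so the rank correlation is 1 while the Pearson r on the raw scores stays below 1. The phi and point-biserial coefficients are likewise Pearson r applied to variables coded 0/1.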

Page 6: Psychology 820 Correlation Regression & Prediction.

Linear and Curvilinear Relationships

Only the degree of linear relationship is described by r or ρ

If there is a substantial nonlinear relationship between two variables, a different correlation coefficient (such as eta η) should be used
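A small sketch of why r can miss a curvilinear relationship: in the made-up data below Y is completely determined by X (Y = X squared), yet the linear correlation is zero. The helper name is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical, perfectly U-shaped relationship
x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]

r = pearson_r(x, y)  # the positive and negative cross-products cancel out
```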

Page 7: Psychology 820 Correlation Regression & Prediction.

Linear Transformations and Correlation

Any linear transformation of X or Y leaves the magnitude of the correlation coefficient unchanged (multiplying or dividing by a negative constant reverses its sign)

This includes transformations to z-scores or T-scores, adding or subtracting a constant, and multiplying or dividing by a non-zero constant
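The point about linear transformations can be checked numerically. A sketch with made-up scores and hypothetical helper names; T-scores are taken as 50 + 10z.

```python
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

def z_scores(values):
    """Standard scores using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r_raw = pearson_r(x, y)
r_z = pearson_r(z_scores(x), y)                         # X as z-scores
r_t = pearson_r([50 + 10 * z for z in z_scores(x)], y)  # X as T-scores
r_shift = pearson_r([v + 100 for v in x], [v / 4 for v in y])
r_neg = pearson_r([-2 * v for v in x], y)               # negative multiplier
```

The first four values agree to rounding error; the last has the same size with the opposite sign.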

Page 8: Psychology 820 Correlation Regression & Prediction.

Effects of Variability on Correlation

The variability (heterogeneity) of the sample has an important influence on r

Range restriction
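Range restriction can be illustrated with simulated data. In this sketch the seed, sample size, population slope of 0.6, and cutoff of 1.0 are all arbitrary assumptions; the scores are generated, not observed.

```python
import random
from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

rng = random.Random(820)  # arbitrary seed so the run is repeatable

# Simulated standard scores with a population correlation near 0.6
x = [rng.gauss(0, 1) for _ in range(5000)]
y = [0.6 * a + 0.8 * rng.gauss(0, 1) for a in x]
r_full = pearson_r(x, y)

# Keep only cases more than one SD above the mean on X (a restricted range)
kept = [(a, b) for a, b in zip(x, y) if a > 1.0]
r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])

print(round(r_full, 2), round(r_restricted, 2))
```

The correlation in the restricted (more homogeneous) group comes out noticeably smaller than in the full sample.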

Page 9: Psychology 820 Correlation Regression & Prediction.

Causation and Correlation

Correlation must be carefully distinguished from causation.

Third Variable Factor

Effect of Outliers

Page 10: Psychology 820 Correlation Regression & Prediction.

Regression and Prediction

Prediction and correlation are opposite sides of the same coin

Regression is usually the statistical method of choice when the predicted variable is measured on an ordinal, interval, or ratio scale.

Simple linear regression (1 IV & 1 DV) extends to multiple regression (more than 1 IV)

Page 11: Psychology 820 Correlation Regression & Prediction.

The Regression Effect

The sons of tall fathers tend to be taller than average, but shorter than their fathers.

The sons of short fathers tend to be shorter than average, but taller than their fathers.

Regression to the Mean
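The regression effect follows from the standard-score form of the least-squares prediction equation. A brief sketch, where the subscripts X and Y denote father and son and the hat marks a predicted value:

```latex
\hat{z}_Y = r \, z_X ,
\qquad
|r| < 1 \ \text{and}\ z_X \neq 0
\;\Longrightarrow\;
|\hat{z}_Y| < |z_X|
```

For example, with a hypothetical r of 0.5, a father 2 standard deviations above the mean has a predicted son 1 standard deviation above the mean: taller than average, but closer to the mean than his father.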

Page 12: Psychology 820 Correlation Regression & Prediction.

Regression Equation

Y = b X + c (the equation of a straight line)

Line of best fit

Line of least-squares

Prediction equation
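The least-squares values of b and c can be computed directly from deviation scores. A minimal sketch, assuming the usual formulas (b is the sum of cross-products divided by the sum of squares of X; c makes the line pass through the two means). The function name and the scores are made up.

```python
def least_squares_line(x, y):
    """Return slope b and intercept c of the least-squares line Y = bX + c."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp_xy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    b = sp_xy / ss_x           # slope
    c = mean_y - b * mean_x    # intercept: the line passes through the means
    return b, c

# Hypothetical scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b, c = least_squares_line(x, y)
predicted = b * 6 + c          # predicted Y for a new case with X = 6
print(round(b, 2), round(c, 2), round(predicted, 2))  # 0.6 2.2 5.8
```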

Page 13: Psychology 820 Correlation Regression & Prediction.

Proportion of Variance Interpretation of Correlation

The coefficient of determination (r²) is the proportion of variance in Y that can be accounted for by knowing X and, conversely, the proportion of variance in X that can be accounted for by knowing Y.

The coefficient of nondetermination (k² = 1 - r²) is the proportion of variance “not accounted for”
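The proportion-of-variance reading can be verified by splitting the total sum of squares of Y into a residual part and an explained part. A sketch with made-up scores and hypothetical variable names:

```python
# Hypothetical scores, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sp_xy = sum((a - mean_x) * (v - mean_y) for a, v in zip(x, y))
ss_x = sum((a - mean_x) ** 2 for a in x)
ss_y = sum((v - mean_y) ** 2 for v in y)   # total variation in Y

# Least-squares line and the variation it leaves unexplained
b = sp_xy / ss_x
c = mean_y - b * mean_x
ss_residual = sum((v - (b * a + c)) ** 2 for a, v in zip(x, y))

k2 = ss_residual / ss_y              # proportion not accounted for
r2 = 1 - k2                          # proportion accounted for
r2_direct = sp_xy ** 2 / (ss_x * ss_y)  # square of Pearson r, for comparison
print(round(r2, 2), round(k2, 2))    # 0.6 0.4
```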

Page 14: Psychology 820 Correlation Regression & Prediction.

Homoscedasticity

This assumption means that the variance around the regression line is the same for all values of the predictor variable (X). The plot on the right shows a violation of this assumption. For the lower values on the X-axis, the points are all very near the regression line. For the higher values on the X-axis, there is much more variability around the regression line.

In a bivariate normal distribution, the variance of scores on Y is the same for all values of X. This property (equal variance of Y scores for each value of X) is known as homoscedasticity.

Page 15: Psychology 820 Correlation Regression & Prediction.

Part Correlation

It is the correlation of X1 (IQ) with X2 (achievement posttest) after the portion of the posttest that can be predicted from the pretest has been removed.
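A part (semipartial) correlation can be computed from the three pairwise correlations. This sketch assumes the pretest is labeled X3, which the slide does not state, and the correlation values are invented for illustration.

```python
from math import sqrt

def part_correlation(r12, r13, r23):
    """Correlation of X1 with the part of X2 not predictable from X3.

    Assumes X3 is the variable removed from X2 only (here, the pretest).
    """
    return (r12 - r13 * r23) / sqrt(1 - r23 ** 2)

# Hypothetical correlations: IQ-posttest, IQ-pretest, posttest-pretest
r12, r13, r23 = 0.60, 0.50, 0.70
part = part_correlation(r12, r13, r23)
print(round(part, 3))  # 0.35
```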

Page 16: Psychology 820 Correlation Regression & Prediction.

Partial Correlation

Simple extension of part correlation

The correlation of X1 and X2 with X3 “held constant”, removed, or partialed out is a partial correlation.
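The first-order partial correlation has a closed form in terms of the three pairwise correlations. A sketch using that standard formula; the function name and the correlation values are hypothetical.

```python
from math import sqrt

def partial_correlation(r12, r13, r23):
    """Correlation of X1 and X2 with X3 removed from both variables."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical pairwise correlations, for illustration only
r12, r13, r23 = 0.60, 0.50, 0.70
partial = partial_correlation(r12, r13, r23)
print(round(partial, 3))  # 0.404
```

The difference from the part correlation is the denominator: X3 is removed from both X1 and X2 rather than from X2 alone.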

Page 17: Psychology 820 Correlation Regression & Prediction.

Multiple Regression

Multiple regression is the statistical method most commonly employed for predicting Y from two or more independent variables.

Page 18: Psychology 820 Correlation Regression & Prediction.

Multiple Correlation

The correlation between Y and predicted Y (Ŷ), when the prediction is based on two or more independent variables, is termed multiple correlation (R).
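For two predictors this definition can be followed step by step: fit the least-squares equation, compute the predicted scores, and correlate them with Y. A sketch with made-up data and hypothetical names; the last lines compare the result with the textbook two-predictor formula based on the pairwise correlations.

```python
from math import sqrt

def mean(v):
    return sum(v) / len(v)

def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))

def pearson_r(u, v):
    return cross(u, v) / sqrt(cross(u, u) * cross(v, v))

# Hypothetical scores on two predictors and one criterion
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 6, 6, 9, 9]

# Least-squares weights for two predictors (solution of the normal equations)
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
a = mean(y) - b1 * mean(x1) - b2 * mean(x2)

# Multiple R is the simple correlation between Y and predicted Y
y_hat = [a + b1 * p + b2 * q for p, q in zip(x1, x2)]
multiple_r = pearson_r(y, y_hat)

# Same quantity from the pairwise correlations
r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
r_formula = sqrt((r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12)
                 / (1 - r_12 ** 2))
```

Multiple R is never smaller than the largest single correlation between Y and a predictor.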