Statistics lecture 4 Relationships Between Measurement Variables.
Statistics for the Social Sciences Psychology 340 Fall 2006 Relationships between variables.
Date posted: 21-Dec-2015
Statistics for the Social Sciences
Psychology 340, Fall 2006
Relationships between variables
Correlation
• Write down what (you think) a correlation is.
• Write down an example of a correlation.
• Association between scores on two variables
– Age and coordination skills in children: as kids get older, their motor coordination tends to improve
– Price and quality: generally, the more expensive something is, the higher its quality
Correlation and Causality
• Correlational research design
– Correlation as a kind of research design (observational designs)
– Correlation as a statistical procedure
Another thing to consider about correlation
• Correlations describe relationships between two variables, but DO NOT explain why the variables are related
Suppose that Dr. Steward finds that rates of spilled coffee and severity of plane turbulence are strongly positively correlated.
One might argue that turbulence causes coffee spills
One might argue that spilling coffee causes turbulence
Suppose that Dr. Cranium finds a positive correlation between head size and digit span (roughly the number of digits you can remember).
One might argue that the bigger your head, the larger your digit span
One might argue that head size and digit span both increase with age (but head size and digit span aren’t directly related)
For many years instructors have noted that the reported fatality rate of grandparents increases during midterm and final exam periods.
One might argue that college exams cause grandparent death
Relationships between variables
• Properties of a correlation
– Form (linear or non-linear)
– Direction (positive or negative)
– Strength (none, weak, strong, perfect)
• To examine this relationship you should:
– Make a scatterplot - a picture of the relationship
– Compute the Correlation Coefficient - a numerical description of the relationship
Graphing Correlations
• Steps for making a scatterplot (scatter diagram)
1. Draw axes and assign variables to them
2. Determine range of values for each variable and mark on axes
3. Mark a dot for each person’s pair of scores
Scatterplot
• Plots one variable against the other
• Each point corresponds to a different individual
• Imagine a line through the data points
• Useful for “seeing” the relationship
– Form, Direction, and Strength
Person   X   Y
A        6   6
B        1   2
C        5   6
D        3   4
E        3   2
[Figure: scatterplot of the five (X, Y) pairs, with X on the horizontal axis and Y on the vertical axis, both running from 1 to 6]
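The plotting steps above can be sketched in a few lines of Python. This is only an illustration: it draws the five (X, Y) pairs from the slide as a text grid, and the function name `text_scatter` and the grid layout are assumptions made here, not part of the lecture.

```python
# Five (X, Y) pairs from the slide, keyed by person.
points = {"A": (6, 6), "B": (1, 2), "C": (5, 6), "D": (3, 4), "E": (3, 2)}

def text_scatter(points, lo=1, hi=6):
    """Return a text scatterplot: one row per Y value (largest on top),
    one column per X value. Each person's label marks their point."""
    rows = []
    for y in range(hi, lo - 1, -1):
        cells = []
        for x in range(lo, hi + 1):
            # Use the person's label where a point falls, a dot elsewhere.
            label = next((name for name, xy in points.items() if xy == (x, y)), ".")
            cells.append(label)
        rows.append(f"{y} " + " ".join(cells))
    # X-axis tick labels along the bottom.
    rows.append("  " + " ".join(str(x) for x in range(lo, hi + 1)))
    return "\n".join(rows)

print(text_scatter(points))
```

Reading the grid from lower left to upper right, the labels drift upward, which is the positive, roughly linear pattern the slide asks you to imagine a line through.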
Form
• Linear
• Non-linear
[Figure: example scatterplots of a linear and a non-linear relationship]
Direction
Positive
• X & Y vary in the same direction
• As X goes up, Y goes up
• Positive Pearson’s r
Negative
• X & Y vary in opposite directions
• As X goes up, Y goes down
• Negative Pearson’s r
[Figure: two scatterplots of Y against X, one positive and one negative]
Strength
• The strength of the relationship
– Spread around the line (note the axis scales)
– Correlation coefficient will range from -1 to +1
• Zero means “no (linear) relationship”
• The farther the r is from zero, the stronger the relationship
Strength
r = -1.0: “perfect negative corr.”, r² = 100%
r = 0.0: “no relationship”, r² = 0%
r = +1.0: “perfect positive corr.”, r² = 100%
The farther from zero, the stronger the relationship
The Correlation Coefficient
• Formulas for the correlation coefficient:
– Used this one in PSY138: $r = \frac{\sum Z_X Z_Y}{N}$
– Common alternative: $r = \frac{SP}{\sqrt{SS_X SS_Y}}$, where $SP = \sum (X - \bar{X})(Y - \bar{Y})$
Computing Pearson’s r (using SP)
• Step 1: SP (Sum of the Products)
$SP = \sum (X - \bar{X})(Y - \bar{Y})$
        X     Y     X - X̄    Y - Ȳ    (X - X̄)(Y - Ȳ)
        6     6      2.4      2.0       4.8
        1     2     -2.6     -2.0       5.2
        5     6      1.4      2.0       2.8
        3     4     -0.6      0.0       0.0
        3     2     -0.6     -2.0       1.2
Mean    3.6   4.0
Sum                  0.0      0.0      14.0 = SP
For example, 2.4 = 6 - 3.6, 2.0 = 6 - 4.0, and 4.8 = 2.4 * 2.0.
Quick check: the deviations in each column sum to 0.0.
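Step 1 can be checked with a short Python sketch. The data are the five pairs from the slide; the helper names `mean`, `deviations`, and `sum_of_products` are chosen here for illustration.

```python
X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]

def mean(values):
    return sum(values) / len(values)

def deviations(values):
    """Each score minus the mean of its own variable."""
    m = mean(values)
    return [v - m for v in values]

def sum_of_products(x, y):
    """SP: multiply paired deviations, then add the products up."""
    return sum(dx * dy for dx, dy in zip(deviations(x), deviations(y)))

SP = sum_of_products(X, Y)
print(round(SP, 2))  # 14.0, up to floating-point rounding
```

The "quick check" from the slide carries over: `sum(deviations(X))` and `sum(deviations(Y))` should both be zero apart from floating-point error.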
Computing Pearson’s r (using SP)
• Step 2: SSX & SSY
        X     Y     X - X̄    Y - Ȳ    (X - X̄)(Y - Ȳ)    (X - X̄)²    (Y - Ȳ)²
        6     6      2.4      2.0       4.8               5.76        4.0
        1     2     -2.6     -2.0       5.2               6.76        4.0
        5     6      1.4      2.0       2.8               1.96        4.0
        3     4     -0.6      0.0       0.0               0.36        0.0
        3     2     -0.6     -2.0       1.2               0.36        4.0
Mean    3.6   4.0
Sum                  0.0      0.0      14.0 = SP         15.20 = SSX  16.0 = SSY
For example, 5.76 = 2.4² and 4.0 = 2.0².
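A minimal Python sketch of Step 2, using the same five pairs. The function name `sum_of_squares` is an illustrative choice.

```python
X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]

def sum_of_squares(values):
    """SS: sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

SSX = sum_of_squares(X)
SSY = sum_of_squares(Y)
print(round(SSX, 2), round(SSY, 2))  # 15.2 16.0
```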
Computing Pearson’s r (using SP)
• Step 3: compute r
SP = 14.0, SSX = 15.20, SSY = 16.0
$r = \frac{SP}{\sqrt{SS_X SS_Y}} = \frac{14}{\sqrt{15.2 \times 16}} = \frac{14}{15.59} \approx 0.90$
• Appears linear
• Positive relationship
• Fairly strong relationship
• .90 is far from 0, near +1
[Figure: scatterplot of the five (X, Y) pairs, both axes running from 1 to 6]
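Putting the three steps together, here is a minimal Python sketch of the SP formula. The function name `pearson_r` is an illustrative choice, and the function assumes neither variable is constant (otherwise the denominator is zero).

```python
import math

X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]

def pearson_r(x, y):
    """Pearson's r from SP, SSX, and SSY."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))   # Step 1
    ssx = sum((a - mx) ** 2 for a in x)                   # Step 2
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)                      # Step 3

r = pearson_r(X, Y)
print(round(r, 3))  # 0.898
```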
The Correlation Coefficient
• Formulas for the correlation coefficient:
– Used this one in PSY138: $r = \frac{\sum Z_X Z_Y}{N}$
– Common alternative: $r = \frac{SP}{\sqrt{SS_X SS_Y}}$, where $SP = \sum (X - \bar{X})(Y - \bar{Y})$
Computing Pearson’s r (using z-scores)
• Step 1: compute standard deviation for X and Y (note: keep track of sample or population)
• For this example we will assume the data are from a population
          X     Y     X - X̄    Y - Ȳ    (X - X̄)²    (Y - Ȳ)²
          6     6      2.4      2.0      5.76        4.0
          1     2     -2.6     -2.0      6.76        4.0
          5     6      1.4      2.0      1.96        4.0
          3     4     -0.6      0.0      0.36        0.0
          3     2     -0.6     -2.0      0.36        4.0
Mean      3.6   4.0
Sum                    0.0      0.0     15.20 = SSX  16.0 = SSY
Std dev   1.74  1.79
$\sigma_X = \sqrt{\frac{SS_X}{N}} = \sqrt{\frac{15.2}{5}} = 1.74$
$\sigma_Y = \sqrt{\frac{SS_Y}{N}} = \sqrt{\frac{16.0}{5}} = 1.79$
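The "sample or population" note matters because the two versions divide SS by different numbers. As a small sketch, Python's standard `statistics` module provides both: `pstdev` divides by N (the population formula used on this slide) and `stdev` divides by N - 1 (the sample formula).

```python
import statistics

X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]

# Population standard deviation: sqrt(SS / N), as on the slide.
sigma_x = statistics.pstdev(X)
sigma_y = statistics.pstdev(Y)

# Sample standard deviation: sqrt(SS / (N - 1)), shown only for contrast.
s_x = statistics.stdev(X)

print(round(sigma_x, 2), round(sigma_y, 2), round(s_x, 2))  # 1.74 1.79 1.95
```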
Computing Pearson’s r (using z-scores)
• Step 2: compute z-scores
$Z_X = \frac{X - \bar{X}}{\sigma_X}$, $Z_Y = \frac{Y - \bar{Y}}{\sigma_Y}$
          X     Y     X - X̄    Y - Ȳ    ZX       ZY
          6     6      2.4      2.0      1.38     1.1
          1     2     -2.6     -2.0     -1.49    -1.1
          5     6      1.4      2.0      0.8      1.1
          3     4     -0.6      0.0     -0.34     0.0
          3     2     -0.6     -2.0     -0.34    -1.1
Mean      3.6   4.0
Std dev   1.74  1.79
Sum                                      0.0      0.0
For example, ZX = 2.4 / 1.74 = 1.38 and ZY = 2.0 / 1.79 = 1.1.
Quick check: the z-scores in each column sum to 0.0.
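Step 2 can be sketched in Python as follows. The function name `z_scores` is an illustrative choice, and the population standard deviation is used to match the assumption made in Step 1.

```python
import statistics

X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]

def z_scores(values):
    """Convert raw scores to z-scores using the population SD."""
    m = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - m) / sd for v in values]

ZX = z_scores(X)
ZY = z_scores(Y)
print([round(z, 2) for z in ZX])  # [1.38, -1.49, 0.8, -0.34, -0.34]
print([round(z, 1) for z in ZY])  # [1.1, -1.1, 1.1, 0.0, -1.1]
```

As on the slide, each list of z-scores sums to zero (apart from floating-point error), which is a handy check before moving on to Step 3.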
Computing Pearson’s r (using z-scores)
• Step 3: compute r
          X     Y     ZX       ZY      ZX * ZY
          6     6      1.38     1.1     1.52
          1     2     -1.49    -1.1     1.64
          5     6      0.8      1.1     0.88
          3     4     -0.34     0.0     0.0
          3     2     -0.34    -1.1     0.37
Mean      3.6   4.0
Std dev   1.74  1.79
Sum                    0.0      0.0     4.41
$r = \frac{\sum Z_X Z_Y}{N} = \frac{4.41}{5} \approx 0.88$ with the rounded z-scores shown; without rounding the z-scores first, r ≈ 0.90, the same value the SP formula gives.
• Appears linear
• Positive relationship
• Fairly strong relationship
• .90 is far from 0, near +1
[Figure: scatterplot of the five (X, Y) pairs, both axes running from 1 to 6]
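The two formulas are algebraically the same, so they should agree whenever the z-scores are not rounded along the way. A minimal Python sketch that checks this on the slide's data (all helper names are illustrative):

```python
import math

X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]
N = len(X)

def mean(values):
    return sum(values) / len(values)

def population_sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def z_scores(values):
    m, sd = mean(values), population_sd(values)
    return [(v - m) / sd for v in values]

# z-score formula: r = sum(ZX * ZY) / N
r_from_z = sum(zx * zy for zx, zy in zip(z_scores(X), z_scores(Y))) / N

# SP formula: r = SP / sqrt(SSX * SSY)
dx = [x - mean(X) for x in X]
dy = [y - mean(Y) for y in Y]
SP = sum(a * b for a, b in zip(dx, dy))
SSX = sum(a ** 2 for a in dx)
SSY = sum(b ** 2 for b in dy)
r_from_sp = SP / math.sqrt(SSX * SSY)

print(round(r_from_z, 3), round(r_from_sp, 3))  # 0.898 0.898
```

The small gap between 4.41 / 5 and this value comes only from rounding the z-scores to one or two decimals in the hand calculation.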
A few more things to consider about correlation
• Correlations are greatly affected by the range of scores in the data
– Consider the relationship between height and age
• Extreme scores can have dramatic effects on correlations
– A single extreme score can radically change r
• When considering "how good" a relationship is, we really should consider r² (coefficient of determination), not just r.
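Two of these points can be illustrated with a short Python sketch. The first five pairs are the lecture's example data; the sixth pair (10, 0) is a hypothetical extreme score invented here purely to show the effect, and `pearson_r` is an illustrative helper name.

```python
import math

def pearson_r(x, y):
    """Pearson's r via SP / sqrt(SSX * SSY)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

X = [6, 1, 5, 3, 3]
Y = [6, 2, 6, 4, 2]

r = pearson_r(X, Y)
r_squared = r ** 2  # coefficient of determination

# Hypothetical extreme score (not from the lecture): one person at (10, 0).
r_with_outlier = pearson_r(X + [10], Y + [0])

print(round(r, 2), round(r_squared, 2), round(r_with_outlier, 2))
```

With the original five pairs, r² is about 0.81, noticeably smaller than r itself. Adding the single made-up point is enough to flip the sign of r, which is the sense in which one extreme score can radically change a correlation.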
Correlation in Research Articles
• Correlation matrix
– A display of the correlations between more than two variables
[Table: example correlation matrix from a research article; one of the variables is Acculturation]
• Why have a “-”?
• Why only half the table filled with numbers?
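One way to explore the two questions above is to build a small correlation matrix by hand. The sketch below uses made-up scores for three hypothetical variables (`var_a`, `var_b`, `var_c`); none of the numbers come from the lecture or from the article shown on the slide.

```python
import math

def pearson_r(x, y):
    """Pearson's r via SP / sqrt(SSX * SSY)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sp = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return sp / math.sqrt(ssx * ssy)

# Hypothetical scores, invented for illustration only.
data = {
    "var_a": [2, 4, 5, 7, 9],
    "var_b": [1, 3, 2, 6, 8],
    "var_c": [9, 7, 8, 3, 2],
}
names = list(data)

# Full matrix: r for every pair of variables, including each with itself.
matrix = [[pearson_r(data[a], data[b]) for b in names] for a in names]

def lower_triangle(names, matrix):
    """Format the matrix the way articles often do: lower half only,
    with '-' on the diagonal."""
    lines = []
    for i, name in enumerate(names):
        cells = [f"{matrix[i][j]:.2f}" for j in range(i)] + ["-"]
        lines.append(f"{name}  " + "  ".join(cells))
    return "\n".join(lines)

print(lower_triangle(names, matrix))
```

In the full matrix every diagonal entry is 1 (each variable correlates perfectly with itself) and the entry for (a, b) equals the entry for (b, a), so printing the diagonal values and the upper half would add no new information.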
Next time
• Predicting a variable based on other variables