Describing Variables Hypotheses Testing May 26, 2008 Ivan Katchanovski, Ph.D. POL 242Y-Y.
Chapter 4: Describing the relation between two variables
Univariate data: Only one variable is measured per subject. Example: height.
Bivariate data: Two variables are measured per subject. Example: height and weight.
This chapter deals with bivariate data.
Section 4.1: Scatter diagrams and correlation
• Scatter diagrams (plots) show the data.
• Correlation is a statistic measuring the linear relationship between the x and y variables.
Quantitative bivariate data are usually displayed in a scatter plot (also called scatter diagram).
[Figure: Scatterplot of weight vs gestation. x-axis: weeks of gestation (35 to 42); y-axis: birth weight in grams (2500 to 3500).]
The “X” variable goes by many names:
• Explanatory variable
• Predictor variable
• Independent variable
Weeks of gestation is our x-variable here.
The “Y” variable also goes by many names:
• Response variable
• Dependent variable
• Outcome variable
Birth weight is our y-variable here.
When deciding which variable goes on the x-axis and which on the y-axis, think “cause and effect”.
The “cause” variable should be on the x-axis. The “effect” variable should be on the y-axis.
Sample correlation coefficient (also called the “linear” or “Pearson product moment” correlation coefficient)
-1 <= r <= 1
If the slope is negative, r < 0. If the slope is positive, r > 0.
r = -1 or r = 1 means all points lie exactly on a straight line.
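For reference, one standard way to write the sample correlation coefficient is the z-score form, which is the form the practice problem below works through. The notation is assumed here rather than taken from the slides: \bar{x} and \bar{y} are the sample means, s_x and s_y are the sample standard deviations (computed with an n - 1 divisor), and n is the number of (x, y) pairs.

```latex
r = \frac{1}{n-1} \sum_{i=1}^{n}
    \left( \frac{x_i - \bar{x}}{s_x} \right)
    \left( \frac{y_i - \bar{y}}{s_y} \right)
```

Each factor in the sum is a z-score, so r is, up to the n - 1 divisor, the average product of the x and y z-scores.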
The bottom row of the figure shows clear patterns, but they are not linear, so r does not capture them.
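A small sketch of why a non-linear pattern can escape r. The data are made up for illustration (they are not from the slides): y is exactly x squared on x values symmetric around zero, so the relationship is perfect but not linear. The helper name `pearson_r` is hypothetical and uses the usual sample formula with n - 1 divisors.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Sample correlation: sum of z-score products divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)  # sample SDs (n - 1 divisor)
    n = len(xs)
    return sum(((a - mx) / sx) * ((b - my) / sy) for a, b in zip(xs, ys)) / (n - 1)


# Illustrative (made-up) data: a perfect curve and a perfect line.
xs = [-2, -1, 0, 1, 2]
ys_curve = [v ** 2 for v in xs]      # y = x^2, a clear pattern that is not linear
ys_line = [2 * v + 1 for v in xs]    # y = 2x + 1, a perfect straight line

r_curve = pearson_r(xs, ys_curve)
r_line = pearson_r(xs, ys_line)

print(f"curve: r = {r_curve:.3f}")   # essentially 0 despite the perfect pattern
print(f"line:  r = {r_line:.3f}")    # essentially 1
```

The curve gives r of essentially zero because the positive and negative z-score products cancel, which is why it pays to look at the scatter plot before trusting r.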
http://xkcd.com/552/
Practice:
x y
4 3
6 0
8 6
n = ?  Mean(x) = ?  Mean(y) = ?  SD(x) = ?  SD(y) = ?
x’s z-scores = ?  y’s z-scores = ?
n = 3, Mean(x) = 6, Mean(y) = 3, SD(x) = 2, SD(y) = 3, r = 0.5
[Figure: scatter plot of the three practice points; x-axis from 4 to 8, y-axis from 0 to 6.]
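The practice answer can be checked step by step with a short sketch. It assumes the z-score form of the correlation coefficient with sample standard deviations (n - 1 divisor), which reproduces the SD values quoted in the answer; the function names `z_scores` and `pearson_r` are hypothetical helpers, not part of the slides.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values using the sample mean and sample SD (n - 1 divisor)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson_r(xs, ys):
    """Sample correlation: sum of z-score products divided by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


# Practice data from the slide.
x = [4, 6, 8]
y = [3, 0, 6]

print("n =", len(x))
print("Mean(x) =", mean(x), " Mean(y) =", mean(y))
print("SD(x) =", stdev(x), " SD(y) =", stdev(y))
print("x's z-scores:", z_scores(x))
print("y's z-scores:", z_scores(y))
print("r =", pearson_r(x, y))
```

By hand, the z-scores are (-1, 0, 1) for x and (0, -1, 1) for y, so the products sum to 1 and r = 1 / (3 - 1) = 0.5.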
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
Correlation = ? (A) 0 (B) 0.41 (C) 0.97 (D) 1
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) -1.2 B) -1 C) -0.99 D) 100
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) -1 B) -0.99 C) -0.5 D) -0.25
http://thedoghousediaries.com/2723
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) -1 B) +0.5 C) +1 D) 100
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) -0.9 B) +0.02 C) +0.9 D) +1
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) +0.02 B) +0.04 C) +0.96 D) +1
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) -1 B) -0.71 C) +1 D) +1.5
[Figure: scatter plot of Y vs X; both axes run from 60 to 140.]
A) -0.71 B) -0.2 C) +0.92 D) 1
[Figure: scatter plot of Y vs X; x-axis from 10 to 20, y-axis from 0 to 25.]
A) -1 B) -0.06 C) +0.99 D) +1