Post on 19-Dec-2015
Characteristics of Scatterplots
• Form
• Direction
• Strength
FORM
• Linear
• Curvilinear
• Clustering
• Outliers
• Other patterns
DIRECTION
• POSITIVE: Large values of X are associated with large values of Y, and small values of X are associated with small values of Y.
   • For example, IQ and SAT.
• NEGATIVE: Large values of one variable are associated with small values of the other variable.
   • For example, SPEED and ACCURACY.
STRENGTH
• If the points do not fall along a straight line, then there is NO linear association.
• If the points fall nearly along a straight line, then there is a STRONG linear association.
• If the points fall exactly along a straight line, then there is a PERFECT linear association.
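[Figure: scatterplot axes. The y-axis shows the response variable, also called the dependent variable (DV). The x-axis shows the explanatory variable, also called the independent variable (IV).]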
Dinosaur-bone example
How to calculate Pearson’s r
Specimen   Femur   Humerus   z-score (Femur)   z-score (Humerus)   Product of z-scores
A          38      41        -1.53             -1.57               2.41
B          56      63        -0.17             -0.19               0.03
C          59      70         0.06              0.25               0.02
D          64      72         0.44              0.38               0.17
E          74      84         1.20              1.13               1.36
Mean       58.20   66.00      0.00              0.00               0.99
St Dev     13.20   15.89      1.00              1.00
Note: the 0.99 in the last column is the sum of the products divided by n - 1 = 4, which is Pearson’s r.
The Pearson product-moment correlation coefficient
$$r = \frac{1}{n-1} \sum_{i=1}^{n} z_{x_i} z_{y_i}$$
To calculate r:
• Convert the X variable to z scores.
• Convert the Y variable to z scores.
• Multiply each pair of z scores.
• Add up the products and divide by n-1.
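The four steps can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the helper names `z_scores` and `pearson_r` are made up for this example, and the data are the femur and humerus lengths from the dinosaur-bone table.

```python
from math import sqrt

def z_scores(values):
    """Standardize values using the sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson_r(xs, ys):
    """Multiply paired z-scores, add up the products, and divide by n - 1."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Femur and humerus lengths for specimens A to E.
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

r = pearson_r(femur, humerus)
print(round(r, 2))  # 0.99
```

The result agrees with the 0.99 shown in the last column of the table.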
The range of the correlation coefficient
• -1: perfect negative linear relationship
• 0: no linear relationship
• +1: perfect positive linear relationship
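A small sketch can make the three landmark values concrete. The data below are invented for illustration, and `pearson_r` is a hypothetical helper written in the deviation form, which is algebraically the same as summing z-score products and dividing by n - 1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r in deviation form (equivalent to the z-score formula)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented data: the same x values paired with three different y patterns.
xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]     # points exactly on an upward line
falling = [10 - 3 * x for x in xs]   # points exactly on a downward line
curved = [(x - 3) ** 2 for x in xs]  # symmetric U shape, no linear trend

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_curved = pearson_r(xs, curved)
print(r_rising, r_falling, r_curved)
```

The three values should come out at, or within rounding error of, +1, -1 and 0. The U-shaped case is a reminder that r near 0 means no linear relationship, not no relationship at all.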
Outliers and influential cases
• An outlier is a case which does not follow the overall pattern of the others
• An influential case is one which draws the regression line toward its point in the scatterplot.
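To see how much a single influential case can move the line, here is a rough sketch with invented numbers (not the data from Example 2.18). The helper `least_squares` is a hypothetical name for an ordinary least-squares fit of y on x.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Invented data: five points lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope_without, _ = least_squares(xs, ys)

# Add one case far out in the x direction that does not follow the pattern.
slope_with, _ = least_squares(xs + [20], ys + [5])

print(slope_without, slope_with)
```

With these made-up values the fitted slope falls from 2 to nearly 0 once the extra point is included, because a case that is extreme in x pulls the line toward itself.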
Example 2.18
[Scatterplot: Gesell Adaptive Score (y-axis, 0 to 140) plotted against age at first word, in months (x-axis, 0 to 50).]
ESTABLISHING CAUSATION
Association does not imply causation.
When there is an association between X and Y:
• Perhaps X causes Y
• Perhaps Y causes X
• Perhaps some third variable causes both X and Y
COMMON RESPONSE
Two variables might be associated because they share a common cause.
• For example, SAT scores and College Grades are highly associated, but probably not because scoring well on the SAT causes a student to get high grades in college.
• Being a good student, etc., would be the common cause of the SATs and the grades.
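A short simulation can illustrate common response. Everything here is an assumption made for the sketch: the variable names, the noise levels, and the idea that a single "ability" score drives both outcomes. The numbers are simulated, not real SAT or grade data.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r in deviation form (equivalent to the z-score formula)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical common cause: each student's underlying ability.
ability = [random.gauss(0, 1) for _ in range(n)]

# Both outcomes depend on ability plus independent noise.
# Neither outcome appears in the formula for the other.
sat = [a + random.gauss(0, 0.5) for a in ability]
grades = [a + random.gauss(0, 0.5) for a in ability]
r_raw = pearson_r(sat, grades)

# Subtract the common cause: only the independent noise is left.
sat_rest = [s - a for s, a in zip(sat, ability)]
grades_rest = [g - a for g, a in zip(grades, ability)]
r_rest = pearson_r(sat_rest, grades_rest)

print(round(r_raw, 2), round(r_rest, 2))
```

In this setup the two outcomes are strongly correlated even though neither influences the other, and the correlation largely disappears once the shared cause is removed.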
CONFOUNDING
For example, there is a strong positive association between Number of Years of Education and Annual Income.
• In part, getting more education allows people to get better, higher-paying jobs.
• But these variables are confounded with others, such as socio-economic status (SES).
ESTABLISHING CAUSATION
The best way to establish that X causes Y is to have a controlled experiment, in which X is varied by the experimenter and the effects on Y can be seen.
But experimentation is not always possible.
ESTABLISHING CAUSATION
Without an experiment, the case for causation is stronger when:
• The association is strong.
• The association is consistent.
• Stronger treatments are associated with stronger responses.
• The alleged cause precedes the effect in time.
• The alleged cause is plausible.