Characteristics of Scatterplots: Form, Direction, Strength

Post on 19-Dec-2015

Characteristics of Scatterplots

• Form

• Direction

• Strength

FORM

• Linear

• Curvilinear

• Clustering

• Outliers

• Other patterns

DIRECTION

• POSITIVE: Large values of X are associated with large values of Y, and small values of X are associated with small values of Y.

• For example, IQ and SAT.

• NEGATIVE: Large values of one variable are associated with small values of the other variable.

• For example, SPEED and ACCURACY.

STRENGTH

• If the points do not fall along a straight line, then there is NO linear association.

• If the points fall nearly along a straight line, then there is a STRONG linear association.

• If the points fall exactly along a straight line, then there is a PERFECT linear association.

y: Response variable, also called the dependent variable (DV)

x: Explanatory variable, also called the independent variable (IV)

Dinosaur-bone example

How to calculate Pearson’s r

Specimen   Femur   Humerus   z-score (Femur)   z-score (Humerus)   Product of z-scores
A          38      41        -1.53             -1.57               2.41
B          56      63        -0.17             -0.19               0.03
C          59      70         0.06              0.25               0.02
D          64      72         0.44              0.38               0.17
E          74      84         1.20              1.13               1.36

Mean       58.20   66.00      0.00              0.00               0.99
St Dev     13.20   15.89      1.00              1.00

The Pearson product-moment correlation coefficient

$$r = \frac{1}{n-1} \sum_{i=1}^{n} z_{x_i} z_{y_i}$$

The Pearson product-moment correlation coefficient

• Convert the X variable to z scores

• Convert the Y variable to z scores

• Multiply each pair of z scores

• Add up the products and divide by n-1
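The steps above can be sketched in Python using the femur and humerus lengths from the dinosaur-bone example. The function names `z_scores` and `pearson_r` are illustrative choices, not part of the original slides, and the sketch assumes the sample standard deviation (dividing by n-1), which matches the standard deviations shown in the table.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert raw values to z scores using the sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # divides by n-1, as in the table above
    return [(v - m) / s for v in values]

def pearson_r(xs, ys):
    """Pearson's r: sum of the z-score products divided by n-1."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    products = [a * b for a, b in zip(zx, zy)]
    return sum(products) / (len(xs) - 1)

# Data from the dinosaur-bone example (specimens A to E)
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

r = pearson_r(femur, humerus)
print(round(r, 2))  # 0.99
```

The result agrees with the 0.99 in the last column of the table, which is the sum of the z-score products divided by n-1 = 4.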

The range of the correlation coefficient

• r = -1: perfect negative linear relationship

• r = 0: no linear relationship

• r = +1: perfect positive linear relationship
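As a rough illustration of the two ends and the middle of this range, the sketch below computes r for three small, made-up data sets: points exactly on a rising line, points exactly on a falling line, and points on a parabola. The data and the helper name `pearson_r` are assumptions for illustration only. The parabola case is a reminder that r near 0 means no linear relationship, even when the form is strongly curvilinear.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's r from z scores (sample standard deviation, n-1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

xs = [-2, -1, 0, 1, 2]  # hypothetical X values

rising = [2 * x + 1 for x in xs]    # exactly on a line with positive slope
falling = [-3 * x + 5 for x in xs]  # exactly on a line with negative slope
curved = [x ** 2 for x in xs]       # exactly on a parabola (curvilinear)

r_rising = pearson_r(xs, rising)    # approximately +1
r_falling = pearson_r(xs, falling)  # approximately -1
r_curved = pearson_r(xs, curved)    # approximately 0

print(r_rising, r_falling, r_curved)
```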

Outliers and influential cases

• An outlier is a case that does not follow the overall pattern of the others.

• An influential case is one that draws the regression line toward its point in the scatterplot.
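One way to see what "draws the regression line toward its point" means is to fit a least-squares line with and without a single case and compare the slopes. The sketch below does this with made-up numbers (not the data from Example 2.18); the function name `least_squares_slope` and all values are hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares regression line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: five cases that lie exactly on the line y = 2x
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

# One extra hypothetical case, far to the right and well below that line
slope_without = least_squares_slope(xs, ys)
slope_with = least_squares_slope(xs + [20], ys + [5])

print(f"{slope_without:.2f}")  # 2.00
print(f"{slope_with:.2f}")     # 0.02
```

In this sketch a single case with an extreme X value flattens the fitted line almost completely, which is why such a case is called influential.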

Example 2.18

[Scatterplot: Gesell Adaptive Score (vertical axis, 0 to 140) versus Age at first word, in months (horizontal axis, 0 to 50)]

ESTABLISHING CAUSATION

Association does not imply causation

When there is association between X and Y

• Perhaps X causes Y

• Perhaps Y causes X

• Perhaps some third variable causes both X and Y

COMMON RESPONSE

Two variables might be associated because they share a common cause.

• For example, SAT scores and College Grades are highly associated, but probably not because scoring well on the SAT causes a student to get high grades in college.

• Being a good student, etc., would be the common cause of the SATs and the grades.
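A small simulation can make the idea of a common cause concrete. In the sketch below, a hypothetical variable `ability` drives both a simulated `sat` score and simulated `grades`, and neither of those two affects the other. All numbers are generated, not real data, and the noise levels and sample size are arbitrary assumptions.

```python
import random
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson's r from z scores (sample standard deviation, n-1)."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    products = [((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# Hypothetical common cause, plus independent noise for each outcome
ability = [rng.gauss(0, 1) for _ in range(n)]
sat = [a + rng.gauss(0, 0.5) for a in ability]
grades = [a + rng.gauss(0, 0.5) for a in ability]

# The two outcomes are strongly associated...
r_raw = pearson_r(sat, grades)

# ...but once the common cause is subtracted out, only independent
# noise is left and the association largely disappears.
sat_rest = [s - a for s, a in zip(sat, ability)]
grades_rest = [g - a for g, a in zip(grades, ability)]
r_rest = pearson_r(sat_rest, grades_rest)

print(round(r_raw, 2), round(r_rest, 2))
```

Under these assumptions the first correlation should be large and positive and the second close to zero, even though `sat` never enters the formula for `grades`.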

CONFOUNDING

• For example, there is a strong positive association between Number of Years of Education and Annual Income.

• In part, getting more education allows people to get better, higher-paying jobs.

• But these variables are confounded with others, such as socio-economic status (SES).

ESTABLISHING CAUSATION

• The best way to establish that X causes Y is to have a controlled experiment, in which X is varied by the experimenter and the effects on Y can be seen.

• But experimentation is not always possible.

ESTABLISHING CAUSATION

• The association is strong

• The association is consistent

• Stronger treatments are associated with stronger responses

• The alleged cause precedes the effect in time

• The alleged cause is plausible