Cor. & Regression

7/29/2019 Cor. & Regression http://slidepdf.com/reader/full/cor-regression 1/27

Transcript of Cor. & Regression

Page 1: Cor. & Regression


Page 2: Cor. & Regression


The correlation coefficient (r):

• Measures the relative strength of the linear relationship between two variables
• Unit-less
• Ranges between -1 and 1
• The closer to -1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker any linear relationship
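The properties listed for the correlation coefficient can be checked numerically. The following is a minimal sketch, assuming the usual sample Pearson formula; the function name `pearson_r` and the small data sets are illustrative and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]                      # made-up data for illustration
r_pos = pearson_r(xs, [2, 4, 6, 8, 10])   # points on an increasing line
r_neg = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a decreasing line
r_original = pearson_r(xs, [2, 4, 5, 4, 6])
# Rescaling X (a change of units) should leave r unchanged, since r is unit-less
r_rescaled = pearson_r([x * 1000 for x in xs], [2, 4, 5, 4, 6])
```

Points lying exactly on a line give r at the ends of the range, and changing the units of X does not change r.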

Page 3: Cor. & Regression


[Figure: six scatter plots of Y against X, labeled r = -1, r = -.6, r = 0, r = +.3, r = +1, and r = 0]

Page 4: Cor. & Regression


[Figure: scatter plots of Y against X showing linear relationships and curvilinear relationships]

Page 5: Cor. & Regression


[Figure: scatter plots of Y against X showing strong relationships and weak relationships]

Page 6: Cor. & Regression


[Figure: scatter plots of Y against X showing no relationship]

Page 7: Cor. & Regression


In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (= predictor) variable (X) and the other the dependent (= outcome) variable (Y).

Page 8: Cor. & Regression


Y = mX + B?

[Figure: a line with intercept B and slope m]

Page 9: Cor. & Regression


A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.

Page 10: Cor. & Regression


The linear regression model:

Love of Math = 5 + .01 * math SAT score

◦ intercept = 5
◦ slope = .01

P = .22; not significant
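As a rough illustration of how the fitted line is read, the sketch below evaluates it at two SAT scores. The function name `predict_love_of_math` and the chosen scores (600 and 700) are assumptions for the example; only the intercept 5 and slope .01 come from the slide.

```python
def predict_love_of_math(math_sat_score, intercept=5.0, slope=0.01):
    """Evaluate the fitted line from the slide: intercept + slope * X."""
    return intercept + slope * math_sat_score

# Hypothetical SAT scores, 100 points apart
at_600 = predict_love_of_math(600)
at_700 = predict_love_of_math(700)

# With a slope of .01, a 100-point difference in X corresponds to 100 * .01 in Y
change_per_100_points = at_700 - at_600
```

Because the slide reports P = .22, the data do not give good evidence that this slope differs from zero, so the predicted difference should be read with that in mind.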

Page 11: Cor. & Regression


If you know something about X, this knowledge helps you predict something about Y.

Page 12: Cor. & Regression


The average baby weight in Mumbai is 3400 grams.

Your "best guess" at a random baby's weight, given no information about the baby, is what?

3400 grams

But what if you have relevant information? Can you make a better guess?

Page 13: Cor. & Regression


X = gestation time

Assume that babies that gestate for longer are born heavier, all other things being equal. Pretend (at least for the purposes of this example) that this relationship is linear.

Example: suppose a one-week increase in gestation, on average, leads to a 100-gram increase in birth-weight.

Page 14: Cor. & Regression


[Figure: scatter plot of Y = birth weight (g) against X = gestation time (weeks)]

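The best-fit line is chosen such that the sum of the squared (why squared?) distances of the points ($Y_i$'s) from the line is minimized.

Or mathematically (maxima and minima from calculus):

$$\frac{\partial}{\partial m}\sum_i \big(Y_i - (m x_i + b)\big)^2 = 0, \qquad \frac{\partial}{\partial b}\sum_i \big(Y_i - (m x_i + b)\big)^2 = 0$$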
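One way to carry out the calculus step is sketched below for the line $Y = mx + b$. The shorthand $S(m, b)$ for the sum of squares and $\bar{x}$, $\bar{Y}$ for the sample means is notation added here, not taken from the slides. Squaring keeps positive and negative distances from cancelling and gives a smooth function that can be differentiated.

```latex
\begin{aligned}
S(m, b) &= \sum_{i=1}^{n} \bigl(Y_i - (m x_i + b)\bigr)^2 \\[4pt]
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{n} \bigl(Y_i - m x_i - b\bigr) = 0
  \;\Longrightarrow\; b = \bar{Y} - m\,\bar{x} \\[4pt]
\frac{\partial S}{\partial m} &= -2 \sum_{i=1}^{n} x_i \bigl(Y_i - m x_i - b\bigr) = 0
  \;\Longrightarrow\; m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{aligned}
```

The second result follows by substituting $b = \bar{Y} - m\,\bar{x}$ into the second condition and solving for $m$.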

Page 15: Cor. & Regression


A new baby is born that had gestated for just 30 weeks. What's your best guess at the birth-weight? Are you still best off guessing 3400? NO!

Page 16: Cor. & Regression


[Figure: Y = birth weight (g) against X = gestation time (weeks), with 3000 g marked at 30 weeks]

Page 17: Cor. & Regression


[Figure: Y = birth weight (g) against X = gestation time (weeks), with the point (x, y) = (30, 3000) marked]

Page 18: Cor. & Regression


The babies that gestate for 30 weeks appear to center around a weight of 3000 grams.

In math-speak: $E(Y \mid X = 30 \text{ weeks}) = 3000$ grams

Page 19: Cor. & Regression


Note that not every Y-value ($Y_i$) sits on the line. There's variability.

$Y_i = 3000 + \text{random error}_i$

In fact, babies that gestate for 30 weeks have birth-weights that center at 3000 grams, but vary around 3000 with some variance $\sigma^2$.

◦ Approximately what distribution do birth-weights follow? Normal. $Y \mid X = 30 \text{ weeks} \sim N(3000, \sigma^2)$
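To make the normal model at 30 weeks concrete, the sketch below uses Python's standard-library `statistics.NormalDist`. The slides do not give a value for the variance, so the standard deviation of 500 grams is purely a hypothetical choice for illustration.

```python
from statistics import NormalDist

SIGMA_GRAMS = 500.0  # hypothetical standard deviation; the slides leave sigma unspecified

# Birth weights at 30 weeks of gestation: Y | X = 30 ~ N(3000, sigma^2)
weights_at_30_weeks = NormalDist(mu=3000.0, sigma=SIGMA_GRAMS)

# Probability that such a baby weighs within one standard deviation of 3000 g
p_within_one_sd = weights_at_30_weeks.cdf(3500.0) - weights_at_30_weeks.cdf(2500.0)

# Probability that such a baby weighs at least the overall average of 3400 g
p_at_least_3400 = 1.0 - weights_at_30_weeks.cdf(3400.0)
```

Under this assumed spread, a 30-week baby reaching the overall average of 3400 grams is possible but not the typical case, which is why 3000 is the better guess.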

Page 20: Cor. & Regression


[Figure: Y = birth weight (g) against X = gestation time (weeks), with 20, 30, and 40 weeks marked on the X axis]

Page 21: Cor. & Regression


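[Figure: Y = baby weights (g) against X = gestation times (weeks), with a distribution of weights shown at 20, 30, and 40 weeks]

$Y \mid X = 40 \text{ weeks} \sim N(4000, \sigma^2)$

$Y \mid X = 30 \text{ weeks} \sim N(3000, \sigma^2)$

$Y \mid X = 20 \text{ weeks} \sim N(2000, \sigma^2)$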

Page 22: Cor. & Regression


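$E(Y \mid X = 40 \text{ weeks}) = 4000$

$E(Y \mid X = 30 \text{ weeks}) = 3000$

$E(Y \mid X = 20 \text{ weeks}) = 2000$

$E(Y \mid X) = \mu_{Y \mid X} = 100 \text{ grams/week} \times X \text{ weeks}$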

Page 23: Cor. & Regression


Y's are modeled…

$Y_i = 100 \cdot X + \text{random error}_i$

◦ $100 \cdot X$: fixed, exactly on the line
◦ $\text{random error}_i$: follows a normal distribution
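The model can be tried out by simulation: generate birth weights as 100 times gestation time plus normal error, then see whether a least-squares fit recovers the line. This is a sketch under stated assumptions. The error standard deviation (200 grams), the sample size (1000), the uniform spread of gestation times over 20 to 40 weeks, and the helper name `fit_line` are all hypothetical choices, not values from the slides.

```python
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

rng = random.Random(0)   # fixed seed so the sketch is repeatable
SIGMA_GRAMS = 200.0      # hypothetical error standard deviation

# Simulate Y_i = 100 * X_i + random error_i for gestation times of 20 to 40 weeks
weeks = [rng.uniform(20.0, 40.0) for _ in range(1000)]
weights = [100.0 * x + rng.gauss(0.0, SIGMA_GRAMS) for x in weeks]

slope, intercept = fit_line(weeks, weights)
predicted_at_30 = slope * 30.0 + intercept   # estimate of E(Y | X = 30 weeks)
```

With this much simulated data the fitted slope should land close to 100 grams per week and the prediction at 30 weeks close to 3000 grams, although individual simulated babies still scatter around the line.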

Page 24: Cor. & Regression


Linear regression assumes that…

◦ 1. The relationship between X and Y is linear
◦ 2. Y is distributed normally at each value of X
◦ 3. The variance of Y at every value of X is the same (homogeneity of variances)

Why? The math requires it: the mathematical process is called "least squares" because it fits the regression line by minimizing the squared errors from the line (mathematically easy, but not general; it relies on the above assumptions).

Page 25: Cor. & Regression


More than one predictor…

$\hat{Y} = \alpha + \beta_1 X + \beta_2 W + \beta_3 Z$

Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant.
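The "held constant" reading of a coefficient can be illustrated with a small sketch. The coefficient values and predictor values below are hypothetical, chosen only so the arithmetic is easy to follow, and the function name `predict` is likewise an assumption.

```python
def predict(x, w, z, intercept, b_x, b_w, b_z):
    """Predicted outcome from a linear model with three predictors."""
    return intercept + b_x * x + b_w * w + b_z * z

# Hypothetical coefficients chosen only for illustration
coefs = dict(intercept=10.0, b_x=2.0, b_w=-3.0, b_z=0.5)

baseline = predict(x=4.0, w=1.0, z=6.0, **coefs)
x_plus_one = predict(x=5.0, w=1.0, z=6.0, **coefs)   # X up by 1, W and Z held constant

# The change in the prediction equals the coefficient on X
change_from_x = x_plus_one - baseline
```

Raising X by one unit while leaving W and Z untouched changes the prediction by exactly the coefficient on X, which is the interpretation given in the slide.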

Page 26: Cor. & Regression


[Figure: model diagram with the boxes Purchase Satisfaction, Control Variables, Revisit Intention, and Product Quality; scale labels shown: 5-item scale, 5-item scale, 10-item scale]

Page 27: Cor. & Regression


• Cluster Sampling

• Sample Size: 450