Cor. & Regression

Post on 14-Apr-2018

7/29/2019 Cor. & Regression

http://slidepdf.com/reader/full/cor-regression 1/27

The correlation coefficient (r):

• Measures the relative strength of the linear relationship between two variables
• Unit-less
• Ranges between –1 and 1
• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker the linear relationship
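
The properties above can be checked with a minimal sketch of the Pearson correlation coefficient. The function name `pearson_r` and the toy data are assumptions made for illustration; they are not from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])      # points on a rising line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])    # points on a falling line
# Unit-less: rescaling X (for example, changing its units) leaves r unchanged.
r_rescaled = pearson_r([100 * x for x in xs], [2, 4, 6, 8, 10])
# A symmetric curve: Y clearly depends on X, but not linearly.
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
```

In this toy data the rising and falling lines give r at the two ends of the range, while the symmetric curve gives r = 0 even though Y depends on X, because r only measures the linear part of a relationship.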

[Figure: six scatter plots of Y against X, labeled r = -1, r = -.6, r = 0, r = +.3, r = +1, and a second panel with r = 0]

[Figure: scatter plots of Y against X contrasting linear relationships with curvilinear relationships]

[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]

[Figure: scatter plots of Y against X showing no relationship]

In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (= predictor) variable (X) and the other the dependent (= outcome) variable (Y).

Y = mX + B?

[Figure: a straight line on X-Y axes, with B and m labeled]

A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.

The linear regression model:

Love of Math = 5 + .01*math SAT score

Labels: 5 is the intercept; .01 is the slope. P = .22; not significant.
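
As a small sketch of how the intercept and slope are used, the model can be evaluated at two math SAT scores. The scores 500 and 600 are hypothetical inputs chosen for illustration; only the 5 and the .01 come from the slide.

```python
INTERCEPT = 5.0   # from the slide's model
SLOPE = 0.01      # change in "love of math" per math SAT point

def predicted_love_of_math(math_sat_score):
    """Predicted outcome under Love of Math = 5 + .01 * math SAT score."""
    return INTERCEPT + SLOPE * math_sat_score

low = predicted_love_of_math(500)    # hypothetical score
high = predicted_love_of_math(600)   # hypothetical score
# A 100-point difference in SAT score changes the prediction by 100 * slope.
difference = high - low
```

The difference between the two predictions depends only on the slope and the gap in X; the intercept cancels out.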

If you know something about X, this knowledge helps you predict something about Y.

The average baby weight in Mumbai is 3400 grams.

Your “best guess” at a random baby’s weight, given no information about the baby, is what?

3400 grams

But what if you have relevant information? Can you make a better guess?

X = gestation time

Assume that babies that gestate for longer are born heavier, all other things being equal. Pretend (at least for the purposes of this example) that this relationship is linear.

Example: suppose a one-week increase in gestation, on average, leads to a 100-gram increase in birth weight.

[Figure: scatter plot of Y = birth weight (g) against X = gestation time (weeks)]

The best-fit line is chosen such that the sum of the squared (why squared?) distances of the points ($Y_i$'s) from the line is minimized.

Or mathematically (maxima and minima from calculus):

$$\text{Derivative}\left[\sum_i \left(Y_i - (m X_i + b)\right)^2\right] = 0$$
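
Setting the derivatives with respect to m and b to zero gives the standard closed-form solution: the slope is the sum of cross-products of deviations divided by the sum of squared X deviations, and the intercept makes the line pass through the point of means. The sketch below applies it; the function names and the gestation and weight values are made up for illustration and are not data from the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x   # the fitted line passes through (mean_x, mean_y)
    return m, b

def sum_squared_errors(xs, ys, m, b):
    """Sum of squared vertical distances of the points from the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical gestation times (weeks) and birth weights (grams).
weeks = [28, 30, 32, 34, 36, 38, 40]
grams = [2750, 3050, 3150, 3450, 3550, 3850, 3950]

m, b = fit_line(weeks, grams)
best = sum_squared_errors(weeks, grams, m, b)
# Moving the slope or the intercept away from the fitted values increases the error.
worse_slope = sum_squared_errors(weeks, grams, m + 1, b)
worse_intercept = sum_squared_errors(weeks, grams, m, b + 10)
```

For this made-up data the fitted slope comes out at about 100 grams per week, and any nudge to the slope or intercept gives a larger sum of squared errors, which is what "minimized" means here.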

A new baby is born that had gestated for just 30 weeks. What’s your best guess at the birth weight? Are you still best off guessing 3400? NO!

[Figure: Y = birth weight (g) against X = gestation time (weeks), with X = 30 and Y = 3000 marked on the axes]

[Figure: Y = birth weight (g) against X = gestation time (weeks), with the point (x, y) = (30, 3000) marked]

The babies that gestate for 30 weeks appear to center around a weight of 3000 grams.

In math-speak: $E(Y \mid X = 30\ \text{weeks}) = 3000$ grams

Note that not every Y-value ($Y_i$) sits on the line. There’s variability.

$Y_i = 3000 + \text{random error}_i$

In fact, babies that gestate for 30 weeks have birth weights that center at 3000 grams, but vary around 3000 with some variance $\sigma^2$.

◦ Approximately what distribution do birth weights follow? Normal. $Y \mid X = 30\ \text{weeks} \sim N(3000, \sigma^2)$
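
To make the idea of weights varying around 3000 grams concrete, the sketch below draws simulated birth weights for 30-week babies from a normal distribution. The standard deviation of 400 grams is a hypothetical value chosen for illustration; the slides do not give one.

```python
import random
import statistics

MEAN_30_WEEKS = 3000.0   # grams, from the slides
SIGMA = 400.0            # grams; hypothetical, the slides leave the spread unspecified

rng = random.Random(0)   # fixed seed so the run is repeatable
weights = [rng.gauss(MEAN_30_WEEKS, SIGMA) for _ in range(10_000)]

sample_mean = statistics.fmean(weights)
sample_sd = statistics.stdev(weights)
# Share of simulated babies within one standard deviation of 3000 g
# (roughly 68% is expected for a normal distribution).
within_one_sd = sum(abs(w - MEAN_30_WEEKS) <= SIGMA for w in weights) / len(weights)
```

Individual simulated babies land well above or below 3000 grams, but the sample mean stays close to it, which is the sense in which 3000 is the best guess for a 30-week gestation.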

[Figure: Y = birth weight (g) against X = gestation time (weeks), with X = 20, 30, and 40 weeks marked]

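[Figure: Y = baby weights (g) against X = gestation times (weeks), with X = 20, 30, and 40 weeks marked and annotated with the distributions below]

$Y \mid X = 40\ \text{weeks} \sim N(4000, \sigma^2)$

$Y \mid X = 30\ \text{weeks} \sim N(3000, \sigma^2)$

$Y \mid X = 20\ \text{weeks} \sim N(2000, \sigma^2)$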

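$E(Y \mid X = 40\ \text{weeks}) = 4000$

$E(Y \mid X = 30\ \text{weeks}) = 3000$

$E(Y \mid X = 20\ \text{weeks}) = 2000$

$E(Y \mid X) = \mu_{Y \mid X} = 100\ \text{grams/week} \times X\ \text{weeks}$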
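
The conditional means above follow one rule, which can be written as a short function. This is a sketch; the function name is an assumption, while the 100 grams per week and the overall average of 3400 grams come from the slides.

```python
GRAMS_PER_WEEK = 100         # slope from the slides
OVERALL_AVERAGE = 3400       # grams; the best guess with no information

def expected_birth_weight(gestation_weeks):
    """Expected birth weight in grams, given gestation time in weeks."""
    return GRAMS_PER_WEEK * gestation_weeks

expected = {weeks: expected_birth_weight(weeks) for weeks in (20, 30, 40)}
# {20: 2000, 30: 3000, 40: 4000}

# How far off the no-information guess is for a 30-week baby.
miss_if_ignoring_gestation = abs(OVERALL_AVERAGE - expected_birth_weight(30))
```

For a 30-week baby the no-information guess misses the conditional mean by 400 grams, which is why knowing X gives a better guess at Y.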

The Y’s are modeled as:

$Y_i = 100 \cdot X_i + \text{random error}_i$

• $100 \cdot X_i$: fixed, exactly on the line
• random error: follows a normal distribution
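
The two parts of the model, a fixed part on the line plus normally distributed random error, can be simulated as a rough sketch. The error standard deviation of 400 grams, the sample size, and the range of gestation times are hypothetical choices; the slope estimate uses least squares for a line through the origin, to match the no-intercept model written above.

```python
import random

SLOPE = 100.0    # grams per week, from the slides
SIGMA = 400.0    # grams; hypothetical spread of the random error

rng = random.Random(1)   # fixed seed so the run is repeatable

def simulate_birth_weight(weeks):
    fixed_part = SLOPE * weeks              # exactly on the line
    random_error = rng.gauss(0.0, SIGMA)    # normal scatter around the line
    return fixed_part + random_error

# Hypothetical sample: 5000 babies with gestation between 20 and 40 weeks.
weeks = [rng.randint(20, 40) for _ in range(5000)]
weights = [simulate_birth_weight(w) for w in weeks]

# The random errors should average out to roughly zero.
errors = [y - SLOPE * x for x, y in zip(weeks, weights)]
mean_error = sum(errors) / len(errors)

# Least-squares slope for a line through the origin.
estimated_slope = sum(x * y for x, y in zip(weeks, weights)) / sum(x * x for x in weeks)
```

Although no simulated baby sits exactly on the line, the estimated slope lands close to the 100 grams per week used to generate the data.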

Linear regression assumes that:
◦ 1. The relationship between X and Y is linear
◦ 2. Y is distributed normally at each value of X
◦ 3. The variance of Y at every value of X is the same (homogeneity of variances)

Why? The math requires it. The mathematical process is called “least squares” because it fits the regression line by minimizing the squared errors from the line (mathematically easy, but not general: it relies on the above assumptions).
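
One rough way to look at the equal-variance assumption is to fit the line, compute the residuals, and compare their spread at low and high values of X. The sketch below is only an informal check, not a formal test; the helper `residual_spread_by_half` and the toy data (built so that the scatter grows with X) are assumptions made for illustration.

```python
import statistics

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x

def residual_spread_by_half(xs, ys):
    """Standard deviation of residuals for the lower and upper half of X values."""
    m, b = fit_line(xs, ys)
    pairs = sorted(zip(xs, ys))                       # order the points by X
    residuals = [y - (m * x + b) for x, y in pairs]
    half = len(residuals) // 2
    return statistics.stdev(residuals[:half]), statistics.stdev(residuals[half:])

# Hypothetical data whose scatter around the line grows as X grows.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
scatter = [1, -1, 2, -2, 3, -3, 4, -4]
ys = [10 * x + e for x, e in zip(xs, scatter)]

sd_low, sd_high = residual_spread_by_half(xs, ys)
```

Here the residuals in the upper half of X are clearly more spread out than those in the lower half, which is the kind of pattern that would cast doubt on assumption 3.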

More than one predictor:

$E(Y) = \alpha + \beta_1 X + \beta_2 W + \beta_3 Z$

Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant.
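
The "held constant" interpretation can be illustrated with a small sketch. The intercept, the three coefficients, and the predictor values are hypothetical numbers chosen for illustration; the slides give none.

```python
# Hypothetical intercept and coefficients for predictors X, W, and Z.
INTERCEPT = 10.0
COEF_X, COEF_W, COEF_Z = 2.0, -0.5, 3.0

def expected_outcome(x, w, z):
    """Expected outcome under a linear model with three predictors."""
    return INTERCEPT + COEF_X * x + COEF_W * w + COEF_Z * z

baseline = expected_outcome(x=4.0, w=6.0, z=1.0)
# Raise X by one unit while holding W and Z at the same values.
x_plus_one = expected_outcome(x=5.0, w=6.0, z=1.0)
change = x_plus_one - baseline   # equals the coefficient on X
```

Whatever values W and Z are held at, a one-unit increase in X moves the expected outcome by exactly the coefficient on X.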

[Figure: research model with the boxes Product Quality, Purchase Satisfaction, Revisit Intention, and Control Variables; scale labels: 5-item scale, 5-item scale, 10-item scale]

• Cluster Sampling

• Sample Size: 450