Cor. & Regression
Uploaded by akhil-vashishtha
Transcript of Cor. & Regression
7/29/2019 Cor. & Regression
http://slidepdf.com/reader/full/cor-regression 1/27
The correlation coefficient r:
• Measures the relative strength of the linear relationship between two variables
• Unit-less
• Ranges between –1 and 1
• The closer to –1, the stronger the negative linear relationship
• The closer to 1, the stronger the positive linear relationship
• The closer to 0, the weaker any linear relationship
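The properties above can be checked with a minimal sketch of the usual Pearson formula. The function name `pearson_r` and the tiny data sets are made up for illustration; they are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, chosen only to show the extremes.
xs = [1, 2, 3, 4, 5]
perfect_pos = pearson_r(xs, [2, 4, 6, 8, 10])   # points on a rising line
perfect_neg = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a falling line
no_linear = pearson_r(xs, [4, 1, 0, 1, 4])      # symmetric curve, no linear trend

# Unit-less: rescaling X (say, from kilograms to grams) leaves r unchanged.
rescaled = pearson_r([x * 1000 for x in xs], [2, 4, 6, 8, 10])

print(perfect_pos, perfect_neg, no_linear, rescaled)
```

The third data set is a reminder that r near 0 only rules out a linear relationship; the points there follow a clear curve.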
[Figure: scatter plots of Y against X with panels labelled r = -1, r = -.6, r = 0, r = +.3, r = +1, and a further panel labelled r = 0]
[Figure: scatter plots of Y against X contrasting linear relationships with curvilinear relationships]
[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]
[Figure: scatter plots of Y against X showing no relationship]
In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (= predictor) variable (X) and the other the dependent (= outcome) variable (Y).
Y = mX + B?
[Figure: a straight line with intercept B and slope m]
A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.
The linear regression model:
Love of Math = 5 + .01*math SAT score
• intercept: 5
• slope: .01
• P=.22; not significant
If you know something about X, this knowledge helps you predict something about Y.
The average baby weight in Mumbai is 3400 grams.
Your "best guess" at a random baby's weight, given no information about the baby, is what?
3400 grams
But what if you have relevant information? Can you make a better guess?
X = gestation time
Assume that babies that gestate for longer are born heavier, all other things being equal.
Pretend (at least for the purposes of this example) that this relationship is linear.
Example: suppose a one-week increase in gestation, on average, leads to a 100-gram increase in birth-weight.
[Figure: scatter plot of Y = birth-weight (g) against X = gestation time (weeks) with a fitted line]
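The best-fit line is chosen such that the sum of the squared (why squared?) distances of the points (Yi's) from the line is minimized.
Or mathematically (maxima and minima from calculus), set the derivatives with respect to m and b to zero:
$\frac{\partial}{\partial m}\sum_i \left(Y_i - (mX_i + b)\right)^2 = 0$ and $\frac{\partial}{\partial b}\sum_i \left(Y_i - (mX_i + b)\right)^2 = 0$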
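Solving those two derivative conditions gives the standard closed form: m is the sum of cross-deviations divided by the sum of squared X-deviations, and b is mean(Y) minus m times mean(X). A minimal sketch follows; the five (weeks, grams) pairs are invented for illustration and are not data from the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

def sum_squared_errors(xs, ys, m, b):
    """Sum of squared vertical distances of the points from the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical gestation times (weeks) and birth-weights (grams).
weeks = [20, 25, 30, 35, 40]
grams = [2050, 2450, 3000, 3550, 3950]

m, b = fit_line(weeks, grams)
best = sum_squared_errors(weeks, grams, m, b)

# Any other line, e.g. a slightly steeper or shifted one, does worse.
steeper = sum_squared_errors(weeks, grams, m + 1, b)
shifted = sum_squared_errors(weeks, grams, m, b + 10)

print(m, b, best, steeper, shifted)
```

Squaring keeps positive and negative misses from cancelling and makes the minimization solvable with plain calculus, which is one answer to the "why squared?" aside.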
A new baby is born that had gestated for just 30 weeks. What's your best guess at the birth-weight?
Are you still best off guessing 3400? NO!
[Figure: Y = birth-weight (g) against X = gestation time (weeks); the fitted line is read off at X = 30, giving Y = 3000]
[Figure: Y = birth-weight (g) against X = gestation time (weeks); the point (x, y) = (30, 3000) is marked on the line]
The babies that gestate for 30 weeks appear to center around a weight of 3000 grams.
In math-speak… E(Y/X=30 weeks) = 3000 grams
Note that not every Y-value (Yi) sits on the line. There's variability.
$Y_i = 3000 + \text{random error}_i$
In fact, babies that gestate for 30 weeks have birth-weights that center at 3000 grams, but vary around 3000 with some variance $\sigma^2$.
◦ Approximately what distribution do birth-weights follow? Normal. $Y/X{=}30\ \text{weeks} \sim N(3000, \sigma^2)$
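To make the conditional distribution concrete, here is a small sketch using Python's `statistics.NormalDist`. The slides do not give a value for the standard deviation, so the 300 grams used here is purely an assumption for illustration.

```python
from statistics import NormalDist

# ASSUMPTION: sigma is not stated in the slides; 300 g is a made-up value.
SIGMA_GRAMS = 300.0

# Birth-weights of babies with 30 weeks of gestation: N(3000, sigma^2).
weights_at_30_weeks = NormalDist(mu=3000.0, sigma=SIGMA_GRAMS)

# Share of 30-week babies within one standard deviation of 3000 g.
share_within_one_sd = (
    weights_at_30_weeks.cdf(3000.0 + SIGMA_GRAMS)
    - weights_at_30_weeks.cdf(3000.0 - SIGMA_GRAMS)
)

# Share more than two standard deviations below the center.
share_below_two_sd = weights_at_30_weeks.cdf(3000.0 - 2 * SIGMA_GRAMS)

print(round(share_within_one_sd, 4), round(share_below_two_sd, 4))
```

Whatever the true sigma is, the same shares apply at one and two standard deviations; only the gram values at those cut-offs change.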
[Figure: Y = birth-weight (g) against X = gestation time (weeks), with X values 20, 30 and 40 marked]
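[Figure: Y = baby weights (g) against X = gestation times (weeks), with a normal curve drawn around the line at X = 20, 30 and 40]
$Y/X{=}40\ \text{weeks} \sim N(4000, \sigma^2)$
$Y/X{=}30\ \text{weeks} \sim N(3000, \sigma^2)$
$Y/X{=}20\ \text{weeks} \sim N(2000, \sigma^2)$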
E(Y/X=40 weeks) = 4000
E(Y/X=30 weeks) = 3000
E(Y/X=20 weeks) = 2000
$E(Y/X) = \mu_{Y/X} = 100\ \text{grams/week} \times X\ \text{weeks}$
Y's are modeled…
$Y_i = 100 \cdot X_i + \text{random error}_i$
• 100*X: fixed, exactly on the line
• random error: follows a normal distribution
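The model can be tried out by simulating from it and then fitting a line to the simulated points. This is a sketch under stated assumptions: the error standard deviation (300 g), the 20 to 40 week range, the sample size, and the seed are all invented here, and `simulate_births` and `fit_line` are illustrative helper names.

```python
import random

def simulate_births(n, slope=100.0, sigma=300.0, seed=0):
    """Draw (weeks, grams) pairs from grams = slope * weeks + normal error."""
    rng = random.Random(seed)
    weeks = [rng.uniform(20.0, 40.0) for _ in range(n)]       # assumed range
    grams = [slope * w + rng.gauss(0.0, sigma) for w in weeks]  # fixed part + error
    return weeks, grams

def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

weeks, grams = simulate_births(2000)
slope_hat, intercept_hat = fit_line(weeks, grams)

# The estimates should land near the true values (slope 100, intercept 0),
# but not exactly on them, because of the random error term.
print(slope_hat, intercept_hat)
```

Individual simulated babies scatter widely around the line, yet the fitted slope recovers the fixed part of the model closely once there are enough points.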
Linear regression assumes that…
◦ 1. The relationship between X and Y is linear
◦ 2. Y is distributed normally at each value of X
◦ 3. The variance of Y at every value of X is the same (homogeneity of variances)
Why? The math requires it: the mathematical process is called "least squares" because it fits the regression line by minimizing the squared errors from the line (mathematically easy, but not general; it relies on the above assumptions).
More than one predictor…
$\hat{Y} = \alpha + \beta_1 X + \beta_2 W + \beta_3 Z$
Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant.
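A sketch of the "held constant" interpretation, using only the standard library: it builds the normal equations for an intercept plus three predictors and solves them by Gauss-Jordan elimination. The six data rows and the generating coefficients (10, 2, -3, 0.5) are invented, and the outcome is noise-free so the fit recovers them exactly; real data would not behave this neatly.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def fit_multiple_regression(rows, ys):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical predictors (X, W, Z) and a noise-free outcome.
rows = [(1, 0, 2), (2, 1, 0), (3, 1, 1), (4, 0, 3), (5, 2, 1), (6, 1, 4)]
ys = [10 + 2 * x - 3 * w + 0.5 * z for x, w, z in rows]

alpha, beta_x, beta_w, beta_z = fit_multiple_regression(rows, ys)

def predict(x, w, z):
    return alpha + beta_x * x + beta_w * w + beta_z * z

# Raise X by one unit while holding W and Z fixed: the change equals beta_x.
change_per_unit_x = predict(4, 1, 2) - predict(3, 1, 2)

print(alpha, beta_x, beta_w, beta_z, change_per_unit_x)
```

The last line is the coefficient interpretation in action: with W and Z unchanged, a one-unit step in X moves the prediction by exactly the X coefficient.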
[Figure: model diagram with boxes labelled Purchase Satisfaction, Control Variables, Revisit Intention and Product Quality; scale labels: 5 ITEM SCALE, 5 ITEM SCALE, 10 ITEM SCALE]
• Cluster Sampling
• Sample Size: 450