Correlation continued and simple linear...
Correlation continued and simple linear regression
Outline for today
Better know a player: Mark Prior
Review and continuation of correlation
Simple linear regression!
Worksheet 3: Jeter BA boxplot
What is the reason for the outlier?
Review
Worksheet 3: interpreting z-scores
BA z-scores
Jeter: 1.02
Ruth: 2.168
Gehrig: -0.389
Mantle: 2.70
Mattingly: 1.46
How do we interpret what a good z-score is?
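A z-score says how many standard deviations a value sits above or below the mean of its comparison group. The following is a minimal sketch in R, assuming made-up batting averages; `league_ba` and `player_ba` are hypothetical and are not the worksheet data.

```r
# Hypothetical batting averages for a comparison group (not the worksheet data)
league_ba <- c(0.250, 0.262, 0.271, 0.280, 0.255, 0.290, 0.268, 0.301)
player_ba <- 0.310  # hypothetical player

# z-score: distance from the group mean in units of standard deviations
z <- (player_ba - mean(league_ba)) / sd(league_ba)
print(z)

# If the values were roughly normal, pnorm(z) would approximate the
# share of the group that falls below this player
share_below <- pnorm(z)
print(share_below)
```

The `pnorm` step only makes sense if the batting averages are close to normally distributed, which is the question the next slide raises.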
Are BAs normally distributed?
Scatter plots
A scatter plot graphs the relationship between two variables.
If there is an explanatory and response variable, then the explanatory variable is put on the x-axis and the response variable is put on the y-axis.
R: plot(x, y)
[Figure: scatter plot with axes labeled "Runs scored" and "Winning %"]
Correlation
The correlation is a measure of the strength and direction of a linear association between two variables.
R: cor(x, y)
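To make the definition concrete, the sketch below computes the correlation two ways: with `cor()` and by hand from z-scores (sum of the products of paired z-scores, divided by n - 1). The `runs` and `wins` vectors are hypothetical numbers chosen for illustration only.

```r
# Hypothetical team totals (illustrative numbers only)
runs <- c(650, 700, 720, 760, 800, 850)
wins <- c(70, 78, 80, 85, 90, 97)

# Built-in Pearson correlation
r_builtin <- cor(runs, wins)

# Same quantity from z-scores: sum of products divided by (n - 1)
z_runs <- (runs - mean(runs)) / sd(runs)
z_wins <- (wins - mean(wins)) / sd(wins)
r_manual <- sum(z_runs * z_wins) / (length(runs) - 1)

print(c(builtin = r_builtin, manual = r_manual))
```

The two values agree up to floating-point error, which shows that r depends only on the standardized values and not on the units of either variable.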
Correlation examples
Correlation Examples
Runs allowed and wins: r = -.55
(runs scored)/(runs allowed) and wins: r = .93
Correlation cautions
1. A strong positive or negative correlation does not (necessarily) imply a cause and effect relationship between two variables.
2. A correlation near zero does not (necessarily) mean that two variables are not associated. Correlation only measures the strength of a linear relationship.
Correlation cautions
1. A strong positive or negative correlation does not (necessarily) imply a cause and effect relationship between two variables.
2. A correlation near zero does not (necessarily) mean that two variables are not associated. Correlation only measures the strength of a linear relationship.
3. Correlation can be heavily influenced by outliers. Always plot your data!
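Cautions 2 and 3 can be checked with a short experiment. This is a sketch on artificial data: the vectors `x`, `y`, `u`, `v`, the seed, and the outlier at (10, 10) are all assumptions made for illustration.

```r
# Caution 2: a perfect but nonlinear relationship can have r near 0
x <- -5:5
y <- x^2
r_curve <- cor(x, y)  # symmetric parabola: no linear trend to detect

# Caution 3: a single outlier can change r a lot
set.seed(1)                 # arbitrary seed so the sketch is repeatable
u <- rnorm(20)
v <- rnorm(20)              # generated independently of u
r_clean <- cor(u, v)
r_outlier <- cor(c(u, 10), c(v, 10))  # add one extreme point at (10, 10)

print(c(curve = r_curve, clean = r_clean, with_outlier = r_outlier))
```

In the first case y is completely determined by x, yet the correlation is essentially zero. In the second, one added point is enough to pull r well above its value on the original 20 points, which is why plotting the data first matters.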
Anscombe’s quartet (r = 0.81)
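Anscombe's quartet ships with base R as the `anscombe` data frame (columns `x1` to `x4` and `y1` to `y4`), so the claim can be reproduced directly. The sketch below computes the four correlations, each about 0.816, and then draws the four panels; the 2-by-2 plot layout is just one convenient choice.

```r
# anscombe is part of base R's datasets package
r_values <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
print(round(r_values, 3))

# Plot all four panels to see how different the data sets look
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i))
}
par(op)  # restore the previous plot layout
```

The correlations are nearly identical, but the scatter plots show a linear cloud, a curve, a line with one outlier, and a vertical cluster with one extreme point.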
Offensive statistics
What do the following abbreviations stand for?
H, BB, PA, AB, OBP, BA, SlugPct
Hits: 1B + 2B + 3B + HR
Walks: 4 balls
Plate Appearances: Number of times “up”
At Bats: PA - BB
On-Base Percentage: (H + BB)/PA
Batting Average: H/AB
Slugging percentage: (1·1B + 2·2B + 3·3B + 4·HR)/AB
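The formulas above translate directly into R. The sketch below uses the slide's simplified definitions (the official statistics also account for events such as hit-by-pitch and sacrifices) and a hypothetical season line; every count here is made up for illustration.

```r
# Hypothetical season counts for one player (illustrative only)
singles <- 120; doubles <- 30; triples <- 5; hr <- 20
bb <- 60    # walks
pa <- 650   # plate appearances

h  <- singles + doubles + triples + hr   # hits
ab <- pa - bb                            # at bats, per the slide's simplified definition

ba   <- h / ab                           # batting average
obp  <- (h + bb) / pa                    # on-base percentage
slug <- (1 * singles + 2 * doubles + 3 * triples + 4 * hr) / ab  # slugging percentage

print(round(c(BA = ba, OBP = obp, Slug = slug), 3))
```

Note that slugging percentage weights each hit by the number of bases it earns, so it rewards power in a way batting average does not.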
Who would you rather have on your team?
Derek Jeter, David Ortiz
Who is a better hitter: Derek Jeter or David Ortiz?
Jeter has a better batting average
Who is a better hitter: Derek Jeter or David Ortiz?
Ortiz hits more home runs
Who is a better hitter: Derek Jeter or David Ortiz?
Is power or batting average more important?
Compare them based on the “best” statistic
How do we determine which statistic is best?
Runs scored and wins
[Figure: scatter plot with axes labeled "Runs scored" and "Wins"]
The great cycle of baseball
More wins
More fans
More $$$
Better players
Score more runs
We can evaluate how ‘good’ a statistic is based on how well it correlates with the number of runs a team scores
What is the best statistic to use?
One idea: the ‘best’ statistic to judge a player is the statistic that is most correlated with runs
• We can then use this to examine how good a hitter is
We will use a data set that has season total statistics going back to 1961
These statistics include the total runs a team scored, total team HR, total team BA, etc.
load('/home/shared/baseball_stats_2007/data/team_batting_stats.Rda')
What is the best statistic to use?
Statistics to compare:
1. Home runs (HR)
2. Batting average (BA)
3. On-base percentage (OBP)
4. Slugging percentage (Slug)
5. On-base percentage + slugging percentage (OPS)
For each of these 5 statistics:
• Create a scatter plot between the statistic and runs (R)
• Calculate the correlation between the statistic and runs (R)
Once you have found the statistic that is most correlated with runs, create a side-by-side boxplot to compare Derek Jeter and David Ortiz's data on this statistic
To get the team yearly total statistics, run: load('/home/shared/baseball_stats_2017/team_batting_stats.Rda')
Useful functions:
• plot(x, y) # create a scatter plot of different statistics and runs
• cor(x, y) # calculate the correlation between different statistics and runs
• boxplot(v1, v2, names = c('Derek', 'David')) # compare players on this ‘best’ statistic
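The exercise can be organized as a loop over the five statistics. The sketch below runs on a synthetic stand-in, because the slides do not show the object or column names stored in the .Rda file; `team_stats` and the columns `HR`, `BA`, `OBP`, `Slug`, `OPS`, and `R` are assumptions, and the fake numbers say nothing about which statistic wins on the real data.

```r
# Synthetic stand-in for the team batting data; the real file's object and
# column names are not shown in the slides, so these names are assumptions
set.seed(42)
n <- 50
team_stats <- data.frame(
  HR   = round(runif(n, 100, 250)),
  BA   = runif(n, 0.240, 0.290),
  OBP  = runif(n, 0.300, 0.360),
  Slug = runif(n, 0.370, 0.470)
)
team_stats$OPS <- team_stats$OBP + team_stats$Slug
# Fake runs that depend on OPS plus noise, purely so the example has structure
team_stats$R <- 2000 * team_stats$OPS - 800 + rnorm(n, sd = 30)

stat_names <- c("HR", "BA", "OBP", "Slug", "OPS")

# Correlation of each statistic with runs
cors <- sapply(stat_names, function(s) cor(team_stats[[s]], team_stats$R))
print(sort(cors, decreasing = TRUE))

# One scatter plot per statistic
for (s in stat_names) {
  plot(team_stats[[s]], team_stats$R, xlab = s, ylab = "Runs (R)")
}

# Name of the statistic with the largest correlation
best <- names(which.max(cors))
print(best)
```

With the real data loaded, the same loop would apply once `team_stats` and the column names are replaced by whatever the file actually contains.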
Results…
Correlation between HR and runs
Correlation between BA and runs
Correlation between OBP and runs
Correlation between Slug and runs
Correlation between OPS and runs
The winner…
On-base plus slugging seems like the best statistic to use!
Ortiz has a better on-base plus slugging!
Who is a better hitter: Derek Jeter or David Ortiz?
Better know a player: Derek Jeter
Onion infographic
Other Onion articles
Regression
Regression is a method of using one variable to predict the value of a second variable.
In linear regression we fit a line to the data, called the regression line.
Regression line: runs/game as a function of team batting average (2013)
Equation for a line
ŷ=a+b·x
Response=a+b·Explanatory
Wins runs regression
ŵ=14.47+.088·runs
a=14.47
b=.088
ŷ=a+b·x
R: lm(y ~ x)
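As a sketch of how `lm(y ~ x)` produces the intercept a and slope b, the example below fits a line to hypothetical `runs` and `wins` vectors. The numbers are invented, so the fitted coefficients will not match the 14.47 and .088 shown on the slide.

```r
# Hypothetical team seasons (illustrative numbers, not the course data)
runs <- c(620, 655, 690, 710, 745, 780, 810, 850)
wins <- c(68, 72, 75, 79, 81, 84, 88, 93)

fit <- lm(wins ~ runs)             # response ~ explanatory
a <- coef(fit)[["(Intercept)"]]    # intercept
b <- coef(fit)[["runs"]]           # slope
print(c(a = a, b = b))

# Draw the data with the fitted regression line on top
plot(runs, wins)
abline(fit)

# Predicted wins for a team scoring 750 runs, which equals a + b * 750
pred_750 <- predict(fit, newdata = data.frame(runs = 750))
print(pred_750)
```

The formula puts the response on the left of `~` and the explanatory variable on the right, mirroring ŷ = a + b·x.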
Interpreting the slope and intercept
ŷ=a+b·x
The slope b represents the predicted change in the response variable y given a one unit change in the explanatory variable x
The intercept a represents the predicted value of the response variable y if the explanatory variable x were 0
Using the regression line to make predictions
1. Approximately how many additional runs do you need to score for an additional win?
2. How many wins will you have if you score 0 runs all season?
a=14.47
b=.088
ŷ=a+b·x
ŵ=14.47+.088·Runs
Wins runs regression
1. An additional win for ~11 additional runs scored
2. There will be 14.47 wins if you score 0 runs all season
a=14.47
b=.088
ŷ=a+b·x
ŵ=14.47+.088·Runs
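Both answers follow from the slide's a and b with one line of arithmetic each. A small sketch, where `runs_per_win` and `wins_at_zero` are names introduced here for illustration:

```r
a <- 14.47   # intercept from the slide
b <- 0.088   # slope from the slide: predicted wins per additional run

runs_per_win <- 1 / b        # runs needed to raise predicted wins by 1
wins_at_zero <- a + b * 0    # predicted wins with 0 runs (the intercept)

print(runs_per_win)
print(wins_at_zero)
```

The second value is a prediction far outside the range of any real season, so it is better read as where the fitted line crosses the axis than as a realistic win total.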
Example 2: Using the regression line to make predictions
a=-3.27
b=29.36
ŷ=a+b·x
1. If a team had a batting average of 0.270, how many runs would you expect in a game?
If a team had a batting average of 0.270, how many runs would you expect in a game?
(R/G)expected = 29.36 * BA - 3.27
(R/G)expected = 29.36 * .270 - 3.27
(R/G)expected = 4.6572
How about if a team is batting .250?
a=-3.27
b=29.36
ŷ=a+b·x
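The same calculation can be wrapped in a small function so that any batting average can be plugged in. This is a sketch using the slide's a and b; the function name `predict_runs_per_game` is hypothetical.

```r
a <- -3.27   # intercept from the slide
b <- 29.36   # slope from the slide

# Predicted runs per game for a given team batting average
predict_runs_per_game <- function(ba) a + b * ba

rg_270 <- predict_runs_per_game(0.270)
rg_250 <- predict_runs_per_game(0.250)
print(c(BA_.270 = rg_270, BA_.250 = rg_250))
```

With these coefficients, a team batting .270 is predicted to score about 4.66 runs per game and a team batting .250 about 4.07.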