Predicting an MVP - SAMSI · 2016-06-21
Predicting an MVP
Brian King, Derek Zhang, Juleen Graham, Erin Henning, Ryan Haney
How is an MVP selected?
◼ From 1979-1995, NBA players voted for the MVP
◼ 1995-2010: votes came strictly from a panel of sportswriters and broadcasters
- Panelists from the US and Canada each cast a vote for 1st through 5th place selections
◼ 2010 onward: one ballot is cast by fans through online voting
https://en.wikipedia.org/wiki/NBA_Most_Valuable_Player_Award
Trends?
◼ What caused a change in trend from Centers/Forwards to Guards/Forwards?
Questions
◼ What are the most important statistical criteria for choosing an MVP?
◼ Can we create a model to predict the probability of an individual winning the MVP award?
Procedures
◼ Data from the 1991-1992 season to the 2015-2016 season
- The 150 players with the most playing time in each season
◼ Logistic Regression Model
◼ Fit the model on data from the 1991-1992 to 2012-2013 seasons
◼ Predicted on the 2013-2014 to 2015-2016 seasons
◼ Compared the “order” of the predictions to the true voting order
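The procedure above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the rows, the two standardized features, the player labels, and the gradient-descent fitting routine are all assumptions made up for the example. It shows the three steps named on the slide: split by season, fit a logistic model on the earlier seasons, and rank the later-season players by predicted probability.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, x):
    # w[0] is the intercept; w[1:] pair up with the features in x.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    # Plain batch gradient descent on the average log-loss.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, target in zip(X, y):
            err = predict_proba(w, x) - target
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

# Hypothetical rows: (season end year, player, [standardized PPG, standardized APG], MVP flag).
rows = [
    (2012, "A", [1.5, 1.0], 1),
    (2012, "B", [0.5, 0.2], 0),
    (2012, "C", [-0.5, -0.3], 0),
    (2013, "D", [1.8, 0.5], 1),
    (2013, "E", [0.3, 0.9], 0),
    (2013, "F", [-1.0, 0.1], 0),
    (2014, "G", [1.6, 0.8], 1),
    (2014, "H", [0.2, 0.1], 0),
    (2014, "I", [-0.8, -0.2], 0),
]

# Split by season rather than at random, mirroring the slide's train/predict seasons.
LAST_TRAIN_SEASON = 2013
train = [r for r in rows if r[0] <= LAST_TRAIN_SEASON]
test = [r for r in rows if r[0] > LAST_TRAIN_SEASON]

weights = fit_logistic([r[2] for r in train], [r[3] for r in train])

# Rank held-out players by predicted MVP probability, highest first.
probs = {r[1]: predict_proba(weights, r[2]) for r in test}
ranking = sorted(probs, key=probs.get, reverse=True)
print(ranking)
```

Splitting by season keeps future seasons out of the fitting data, which a random split would not guarantee.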
The Logistic Regression Model
Where X_i = predictor variable
Assumptions
● Binary response variable (MVP or not)
● Continuous, independent explanatory variables
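For reference, a standard way to write a logistic regression model with predictors $X_1, \dots, X_k$ is shown below. The coefficient symbols $\beta_i$, the count $k$, and the symbol $p$ are notation assumed here rather than taken from the slide.

$$
\log\frac{p}{1-p} = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k,
\qquad
p = P(\text{MVP} = 1 \mid X_1, \dots, X_k) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k)}}
$$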
The Variables
◼ Points Per Game
◼ Blocks, Steals, Assists, Rebounds
◼ Effective Field Goal Percentage
◼ Position
◼ Personal Fouls, Age, Minutes Played, Turnovers, ...
2013-14 Season: All Stats
Actual
MVP: Kevin Durant
2nd: LeBron James
3rd: Blake Griffin
4th: Joakim Noah
5th: James Harden
Kevin Love: 11th
Stephen Curry: 6th
LaMarcus Aldridge: 10th
Prediction
MVP: Kevin Love
2nd: LeBron James
3rd: Kevin Durant
4th: Stephen Curry
5th: LaMarcus Aldridge
Blake Griffin: 12th
Joakim Noah: 31st
James Harden: 8th
2013-14 Season: MVStats
Prediction
MVP: Kevin Durant
2nd: LeBron James
3rd: Kevin Love
4th: Stephen Curry
5th: Chris Paul
Blake Griffin: 7th
Joakim Noah: 23rd
James Harden: 9th
Actual
MVP: Kevin Durant
2nd: LeBron James
3rd: Blake Griffin
4th: Joakim Noah
5th: James Harden
Kevin Love: 11th
Stephen Curry: 6th
Chris Paul: 7th
2014-15 Season
Prediction
MVP: Russell Westbrook
2nd: LeBron James
3rd: Chris Paul
4th: James Harden
5th: Stephen Curry
Anthony Davis: 10th
Actual
MVP: Stephen Curry
2nd: James Harden
3rd: LeBron James
4th: Russell Westbrook
5th: Anthony Davis
Chris Paul: 6th
2015-16 Season
Prediction
MVP: Stephen Curry
2nd: Russell Westbrook
3rd: LeBron James
4th: Kevin Durant
5th: James Harden
Kawhi Leonard: 26th
Actual
MVP: Stephen Curry
2nd: Kawhi Leonard
3rd: LeBron James
4th: Russell Westbrook
5th: Kevin Durant
James Harden: 9th
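One simple way to compare the predicted order with the true voting order is to count how many predicted top-five players also finished in the actual top five, and how many landed in exactly the right place. The sketch below uses the top-five lists from the season slides above. The two scoring functions are illustrative choices, not a metric the slides themselves define.

```python
# Top-five lists transcribed from the season slides (predicted vs. actual voting order).
results = {
    "2013-14 (All Stats)": {
        "predicted": ["Kevin Love", "LeBron James", "Kevin Durant",
                      "Stephen Curry", "LaMarcus Aldridge"],
        "actual": ["Kevin Durant", "LeBron James", "Blake Griffin",
                   "Joakim Noah", "James Harden"],
    },
    "2013-14 (MVStats)": {
        "predicted": ["Kevin Durant", "LeBron James", "Kevin Love",
                      "Stephen Curry", "Chris Paul"],
        "actual": ["Kevin Durant", "LeBron James", "Blake Griffin",
                   "Joakim Noah", "James Harden"],
    },
    "2014-15": {
        "predicted": ["Russell Westbrook", "LeBron James", "Chris Paul",
                      "James Harden", "Stephen Curry"],
        "actual": ["Stephen Curry", "James Harden", "LeBron James",
                   "Russell Westbrook", "Anthony Davis"],
    },
    "2015-16": {
        "predicted": ["Stephen Curry", "Russell Westbrook", "LeBron James",
                      "Kevin Durant", "James Harden"],
        "actual": ["Stephen Curry", "Kawhi Leonard", "LeBron James",
                   "Russell Westbrook", "Kevin Durant"],
    },
}

def top5_overlap(predicted, actual):
    # Players appearing in both top-five lists, regardless of position.
    return len(set(predicted) & set(actual))

def exact_matches(predicted, actual):
    # Players predicted at exactly the position they finished in.
    return sum(p == a for p, a in zip(predicted, actual))

summary = {
    season: (top5_overlap(r["predicted"], r["actual"]),
             exact_matches(r["predicted"], r["actual"]))
    for season, r in results.items()
}

for season, (overlap, exact) in summary.items():
    print(f"{season}: {overlap}/5 in the actual top five, {exact}/5 in the exact position")
```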
Random Forests
◼ Decision Tree Learning
◼ Bootstrap Aggregating
◼ Random Subspace Method
Decision Tree Learning
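[Figure: example decision tree. The root splits on Points Per Game (Pts < x vs. Pts > x), the next node splits on Assists Per Game (Assists < x vs. Assists > x), followed by Rebounds Per Game; each leaf holds the 0/1 (non-MVP/MVP) outcomes of the players that reach it.]
At each step, the algorithm chooses the variable that best splits the data into successes and failures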
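A single split-selection step can be sketched as follows. The slide does not say which criterion is used to judge the "best" split, so Gini impurity is assumed here, and the six players and their stat lines are made up for illustration.

```python
def gini(labels):
    # Gini impurity of a list of 0/1 labels; 0.0 means the group is pure.
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels, feature_names):
    # Try every feature and every midpoint between consecutive distinct values,
    # and keep the split with the lowest size-weighted impurity.
    best = None  # (weighted impurity, feature name, threshold)
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] < threshold]
            right = [y for row, y in zip(rows, labels) if row[j] >= threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, name, threshold)
    return best

# Hypothetical players: [points, assists, rebounds] per game, and MVP flag (1 = success).
feature_names = ["PPG", "APG", "RPG"]
rows = [
    [30.1, 5.0, 7.0],
    [28.0, 8.5, 4.5],
    [18.0, 9.0, 3.0],
    [15.5, 3.0, 11.0],
    [22.0, 4.0, 6.0],
    [12.0, 6.0, 8.0],
]
labels = [1, 1, 0, 0, 0, 0]

impurity, feature, threshold = best_split(rows, labels, feature_names)
print(feature, threshold, impurity)
```

A full tree repeats this step recursively on the left and right groups until a stopping rule is met.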
Bootstrap Aggregating
◼ A random forest consists of B randomized tree models, indexed b = 1, …, B
◼ Each model (tree) is built with a bootstrap sample of the original data (a sample of the same size as the original data, drawn with replacement)
◼ Training many trees on the same data set leads to problems (possibly recreating the same tree)
◼ Averaging the predictions from all the individual regression trees leads to better performance
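The bullets above can be illustrated with a small sketch. Everything here is assumed for the example: the (PPG, MVP flag) pairs are invented, and each "tree" is reduced to a depth-1 regression stump so the code stays short. The point is only to show bootstrap sampling with replacement followed by averaging over B fitted models.

```python
import random

def bootstrap_sample(data, rng):
    # Same size as the original data, drawn with replacement.
    n = len(data)
    return [data[rng.randrange(n)] for _ in range(n)]

def fit_stump(sample):
    # Depth-1 regression tree on (x, y) pairs: pick the threshold with the
    # smallest squared error and predict the mean response on each side.
    overall = sum(y for _, y in sample) / len(sample)
    xs = sorted({x for x, _ in sample})
    best = (float("inf"), None, overall, overall)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in sample if x < t]
        right = [y for x, y in sample if x >= t]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((y - mean_l) ** 2 for y in left)
               + sum((y - mean_r) ** 2 for y in right))
        if sse < best[0]:
            best = (sse, t, mean_l, mean_r)
    _, t, mean_l, mean_r = best
    if t is None:  # every x in this bootstrap sample was identical
        return lambda x: overall
    return lambda x: mean_l if x < t else mean_r

# Hypothetical (points per game, MVP flag) pairs.
data = [(12, 0), (15, 0), (18, 0), (21, 0), (24, 0), (27, 1), (30, 1), (32, 1)]

B = 25
rng = random.Random(0)
forest = [fit_stump(bootstrap_sample(data, rng)) for _ in range(B)]

def bagged_predict(x):
    # Average the predictions of all B trees.
    return sum(tree(x) for tree in forest) / len(forest)

print(bagged_predict(31), bagged_predict(13))
```

A random forest additionally restricts each split to a random subset of the variables (the random subspace method), which this single-variable sketch leaves out.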
Random Forest Interpretation
◼ Samples not included in any given bootstrap sample are called “out-of-bag” samples
◼ %IncMSE “=” how much worse the predictions are when a permuted version of the variable is used instead of the true values
◼ Build tree, make predictions using “real” data values, record the error (MSE) of this
◼ Permute values of variable in the out-of-bag sample, re-do predictions, recompute MSE
◼ %IncMSE is how much the error increases for the permuted samples vs the true samples
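The permutation idea behind %IncMSE can be sketched on one set of out-of-bag rows. This is a simplified illustration under stated assumptions: the `model` is a hypothetical fixed rule standing in for one fitted tree, the rows are made up, and the increase is averaged over several shuffles. Real random forest implementations repeat this per tree, on that tree's own out-of-bag rows, and may normalize the result differently.

```python
import random

def mse(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_increase(model, rows, targets, col, rng, n_repeats=20):
    # Average increase in MSE when column `col` is shuffled across rows.
    baseline = mse(model, rows, targets)
    total = 0.0
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        total += mse(model, permuted, targets) - baseline
    return total / n_repeats

# Hypothetical stand-in for a fitted tree: it only looks at column 0 (PPG).
def model(row):
    return 1.0 if row[0] > 25 else 0.0

# Hypothetical out-of-bag rows: [points per game, rebounds per game], and MVP flags.
oob_rows = [[30, 5], [12, 9], [28, 3], [15, 7], [31, 4], [10, 8]]
oob_targets = [1, 0, 1, 0, 1, 0]

rng = random.Random(0)
inc_ppg = permutation_increase(model, oob_rows, oob_targets, 0, rng)
inc_rpg = permutation_increase(model, oob_rows, oob_targets, 1, rng)
print(inc_ppg, inc_rpg)
```

Shuffling a variable the model relies on breaks its link to the response and raises the error, while shuffling a variable the model ignores leaves the error unchanged.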
Most Important MVP Variables
According to the Random Forest method (% Increase MSE):
PPG 0.0020350589
APG 0.0010963324
MPG 0.0010791207
SPG 0.0010197867
PFPG 0.0007518895
TPG 0.0007515411
eFG% 0.0007482838
BPG 0.0006166867
Age 0.0002998612
RPG 0.0001863932
POS 0.0001672975
According to Logistic Regression (absolute value of z-score):
PPG 5.167
RPG 3.395
PFPG 3.08
APG 2.58
Age 2.291
eFG% 2.104
BPG 1.566
POS 1.459
MPG 0.62
TPG 0.314
SPG 0.128
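To put a number on how closely the two importance rankings agree, one option is Spearman's rank correlation. The slides do not report this statistic; the sketch below simply applies the textbook formula to the two orderings listed above.

```python
# Variable orderings from most to least important, as listed on the slide.
rf_order = ["PPG", "APG", "MPG", "SPG", "PFPG", "TPG", "eFG%", "BPG", "Age", "RPG", "POS"]
lr_order = ["PPG", "RPG", "PFPG", "APG", "Age", "eFG%", "BPG", "POS", "MPG", "TPG", "SPG"]

rf_rank = {name: i + 1 for i, name in enumerate(rf_order)}
lr_rank = {name: i + 1 for i, name in enumerate(lr_order)}

# Spearman's rho for rankings without ties: 1 - 6 * sum(d^2) / (n * (n^2 - 1)).
n = len(rf_order)
sum_d2 = sum((rf_rank[name] - lr_rank[name]) ** 2 for name in rf_order)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, round(rho, 3))
```

A value near 1 would mean the two methods order the variables almost identically, and a value near 0 means little agreement. Here rho comes out at about 0.09, which is consistent with the lists: both put PPG first, but they disagree sharply on RPG, MPG and SPG.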
Drawbacks to our models
◼ Only one MVP can be crowned every year
◼ Predictions using our models assume that the response variable (MVP or not) is independent between players
◼ As a result, the predicted probabilities for a season do not sum to 1
◼ Our models can rank players in likelihood of winning MVP, but cannot give explicit probabilities
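A toy example makes this drawback concrete. The player labels and probabilities below are invented, and the rescaling at the end is only one possible post-hoc adjustment, shown as an assumption rather than something the slides propose.

```python
# Hypothetical per-player probabilities for one season from a model that
# treats each player's MVP outcome as independent of the others.
season_probs = {"Player A": 0.62, "Player B": 0.41, "Player C": 0.08, "Player D": 0.02}

# Nothing forces these to sum to 1, even though exactly one MVP is chosen.
total = sum(season_probs.values())

# The ordering is still usable for ranking candidates.
ranking = sorted(season_probs, key=season_probs.get, reverse=True)

# Assumed adjustment: rescale within the season so the values sum to 1.
# The rescaled numbers keep the ranking but are not guaranteed to be calibrated.
normalized = {player: p / total for player, p in season_probs.items()}

print(round(total, 2), ranking)
```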
Conclusions
◼ The most important variables are:
◼ Points Per Game
◼ Assists Per Game
◼ Rebounds Per Game
◼ The least important variables include:
◼ Blocks Per Game
◼ Steals Per Game
◼ The problem with defensive production
◼ MVP Voting: Stat-Driven, but not completely
◼ Steve Nash, 2005
Future Work
◼ Further research into possible interaction between variables
◼ Better interpretability of logistic regression predictions
◼ Impact of team on MVP prospects
◼ Change in MVP selection criteria over the years
◼ Changes in rules over the years
◼ Growing data set and possible outcomes
Thanks!
Questions?