An Intelligence Approach to Evaluation of Sports Teams by Edward Kambour, Ph.D.
Agenda
  I. College Football
  II. Linear Model
  III. Generalized Linear Model
  IV. Intelligence (Bayesian) Approach
  V. Results
  VI. Other Sports
  VII. Future Work
General Background
  Goals
    Forecast winners of future games
      Beat the Bookie!
    Estimate the outcome of unscheduled games
      What's the probability that Iowa would have beaten Ohio St?
    Generate reasonable rankings
Major College Football
  No playoff system
  "Computer rankings" are an element of the BCS
  114 teams
  12 games for each in a season
Linear Model
  Rothman (1970s), Harville (1977), Stefani (1977), …, Kambour (1991), …, Sagarin???
  Response, Y, is the net result (point-spread)
  Parameter, $\theta$, is the vector of ratings
  For a game involving teams i and j,
    $E[Y] = \theta_i - \theta_j$
Linear Model (cont.)
  Let X be a row vector with
    $X_k = \begin{cases} 1 & \text{if } k = i \\ -1 & \text{if } k = j \\ 0 & \text{otherwise} \end{cases}$
  Then $E[Y] = X\theta$
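The row vector above can be sketched in a few lines of Python. This is only an illustration: the function names, the zero-based team indices, and the four ratings are made up, not taken from the talk.

```python
def design_row(i, j, n_teams):
    """Row of X for a game between teams i and j: +1 at i, -1 at j, 0 elsewhere."""
    row = [0] * n_teams
    row[i] = 1
    row[j] = -1
    return row


def expected_margin(row, ratings):
    """E[Y] as the dot product of the row with the ratings vector."""
    return sum(x * r for x, r in zip(row, ratings))


ratings = [10.0, 3.5, -2.0, 0.0]  # hypothetical ratings for four teams
x = design_row(0, 2, len(ratings))
print(x, expected_margin(x, ratings))
```

The dot product reduces to the rating of team i minus the rating of team j, which is the expectation stated on the previous slide.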
Regression Model Notes
  Least Squares
    Normality, Homogeneity
  College Football
    Estimate 100 parameters
    Sample size for a full season is about 600
    Design Matrix is sparse and not full rank
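One way to see why the design matrix is not full rank: every row holds a +1 and a -1, so adding the same constant to all ratings leaves every fitted margin unchanged. A small sketch with made-up ratings and games:

```python
def predicted_margin(i, j, ratings):
    """Fitted margin for team i against team j under the linear model."""
    return ratings[i] - ratings[j]


ratings = [10.0, 3.5, -2.0]             # hypothetical ratings
shifted = [r + 100.0 for r in ratings]  # same ratings plus a constant
games = [(0, 1), (1, 2), (0, 2)]

# Both rating vectors give identical predictions for every game,
# so least squares cannot tell them apart.
same = all(
    abs(predicted_margin(i, j, ratings) - predicted_margin(i, j, shifted)) < 1e-9
    for i, j in games
)
print(same)
```

In practice this is usually handled with a constraint (for example, ratings summing to zero) or, as later in the talk, with a prior.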
Home-field Advantage
  Generic Advantage (Stefani, 1980)
    Force i to be home team and j the visiting team
    Add an intercept term to X
    Adds one more parameter to estimate
    Same advantage for every team: UAB = Alabama, Rice = Texas A&M
  Team Specific Advantage
    Doubles the number of parameters to estimate
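The two home-field options differ only in how the design row is extended. The sketch below assumes a particular column layout (ratings first, home-advantage columns after); that layout and the function names are illustrative choices, not something the slides specify.

```python
def row_generic(home, away, n_teams):
    """Generic advantage: ratings columns plus one shared intercept column."""
    row = [0] * (n_teams + 1)
    row[home] = 1
    row[away] = -1
    row[n_teams] = 1  # intercept: the same home edge for every team
    return row


def row_team_specific(home, away, n_teams):
    """Team-specific advantage: ratings columns plus one home column per team."""
    row = [0] * (2 * n_teams)
    row[home] = 1
    row[away] = -1
    row[n_teams + home] = 1  # only the home team's own advantage is switched on
    return row


print(row_generic(2, 0, 4))
print(row_team_specific(2, 0, 4))
```

With L teams the generic version has L + 1 columns and the team-specific version has 2L, which is the doubling noted on the slide.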
Linear Model Issues
  Normality
  Homogeneity
  Lots of parameters, with relatively small sample size
    Overfitting
    The bookie takes you to the cleaners!
Linear Model Issues (cont.)
  Should we model point differential?
    A and B play twice
      A by 34 in first, B by 14 in the second
      A by 10 each time
  Running up the score (or lack thereof)
  BCS: Thou shalt not use margin of victory in thy ratings!
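The two-game example can be checked with a few lines of arithmetic. Margins are written from team A's point of view; the helper names are illustrative.

```python
def mean_margin(margins):
    """Average point differential from team A's point of view."""
    return sum(margins) / len(margins)


def wins(margins):
    """Number of games team A won."""
    return sum(1 for m in margins if m > 0)


blowout_then_loss = [34, -14]  # A by 34, then B by 14
two_steady_wins = [10, 10]     # A by 10 each time

print(mean_margin(blowout_then_loss), wins(blowout_then_loss))
print(mean_margin(two_steady_wins), wins(two_steady_wins))
```

Both series average A by 10, so a model fit to raw point differential treats them alike, even though A went 1-1 in one and 2-0 in the other.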
Logistic Regression
  Rothman (1970s)
  Linear Model
  Use binary variable
    Winning is all that matters
    Avoid margin of victory
  Coin Flips
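A minimal sketch of the win/loss version, assuming the standard logistic link applied to the rating difference. The slides do not spell out the link function, so treat this as one common choice rather than the exact model used.

```python
import math


def win_probability(rating_i, rating_j):
    """P(team i beats team j) under a logistic model of the rating difference."""
    return 1.0 / (1.0 + math.exp(-(rating_i - rating_j)))


print(win_probability(5.0, 5.0))  # evenly matched teams: a fair coin flip
print(win_probability(7.0, 4.0))  # stronger team: a coin weighted in its favor
```

Each game is treated as a weighted coin flip whose bias depends only on the two ratings.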
Logistic Regression Issues
  Still have sample size issues
  Throw away a lot of information
  Undefeated teams
Transformations
  Transform the differentials to normality
    Power transformations
  Rothman logistic transform
    Transforms points to probabilities for logistic regression
  "Diminishing returns" transforms
    Downweights runaway scores
Power Transforms
  Transform the point-spread
    $Y = \operatorname{sign}(Z)\,|Z|^a$
  a = 1 straight margin of victory
  a = 0 just win baby
  a = 0 Poisson or Gamma "ish"
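The signed power transform is short enough to write out. In this sketch a tie (Z = 0) maps to 0, following the usual convention sign(0) = 0; the function name is illustrative.

```python
import math


def power_transform(z, a):
    """Signed power transform: sign(z) * |z|**a, with a tie mapped to 0."""
    if z == 0:
        return 0.0
    return math.copysign(abs(z) ** a, z)


print(power_transform(34, 1))    # a = 1: straight margin of victory
print(power_transform(34, 0))    # a = 0: only the winner matters
print(power_transform(-14, 0))
print(power_transform(16, 0.5))  # in between: large margins are compressed
```

Values of a between 0 and 1 shrink blowouts relative to close games while keeping the sign of the result.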
Maximum Likelihood Transform
  1995-2002 seasons
  MLE = 0.98

  Power   -2 ln(likelihood)
  0.1     52487
  0.3     41213
  0.5     35128
  0.67    32597
  0.8     31418
  1       31193
Predicting the Score
  Model point differential
    $Y_1 = S_i - S_j$
  Additionally model the sum of the points scored
    $Y_2 = S_i + S_j$
  Fit a similar linear model (different parameter estimates)
  Forecast home and visitors score
    $H = (Y_1 + Y_2)/2$, $V = (Y_2 - Y_1)/2$
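Recovering the two scores from the forecast difference and sum is simple arithmetic. The input values below are made up for illustration.

```python
def scores_from(diff, total):
    """Home and visitor scores from the forecast difference (Y1) and sum (Y2)."""
    home = (diff + total) / 2
    visitor = (total - diff) / 2
    return home, visitor


# A forecast of "home by 14, 48 total points" corresponds to a 31-17 final.
print(scores_from(14, 48))
```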
Another Transformation Idea
  Scores (touchdowns or field goals) are arrivals, maybe Poisson
    Final score = 7 times a Poisson + 3 times a Poisson + …
  Transform the scores to homogeneity and normality first
  The differences (and sums) should follow suit
Square Root Transform
  Since the score is "similar" to a linear combination of Poissons, square root should work
  Transformation
    $T = \sqrt{S + k}$
  Why k?
    For small Poisson arrival rates, get better performance (Anscombe, 1948)
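To illustrate why a shifted square root helps, the sketch below computes the exact variance of sqrt(X + k) for a single Poisson count X by summing over its probability mass function. It uses k = 3/8, Anscombe's constant for a plain Poisson variable, purely as an illustration; the talk fits its own k to football scores, which are not a single Poisson.

```python
import math


def sqrt_variance(lam, k, n_max=400):
    """Exact Var[sqrt(X + k)] for X ~ Poisson(lam), summed over the pmf."""
    p = math.exp(-lam)  # P(X = 0)
    first = second = 0.0
    for n in range(n_max):
        t = math.sqrt(n + k)
        first += p * t
        second += p * t * t
        p *= lam / (n + 1)  # step to P(X = n + 1)
    return second - first * first


# The raw Poisson variance equals lam, so it triples from 10 to 30;
# after the transform the variance stays close to 1/4 for every rate.
for lam in (10, 20, 30):
    print(lam, round(sqrt_variance(lam, 0.375), 4))
```

A roughly constant variance is the homogeneity the linear model wants before differences and sums are taken.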
Likelihood Test
  LRT: No transformation vs. square root with fitted k
  Used College Football results from 1995-2002
  k = 21
  Transformation was significantly better
    p-value = 0.0023, chi-square = 9.26
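The reported p-value can be reproduced from the chi-square statistic, assuming the test has one degree of freedom (the slides do not state the degrees of freedom; one is the value that matches the reported numbers). For one degree of freedom the upper tail has a closed form in terms of the complementary error function.

```python
import math


def chi2_upper_tail_1df(x):
    """P(chi-square with 1 degree of freedom > x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))


p_value = chi2_upper_tail_1df(9.26)
print(round(p_value, 4))
```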
Predicting the Score with Transform
  Model point differential
    $Y_1 = \sqrt{S_i + 21} - \sqrt{S_j + 21}$
  Additionally model the sum of the points scored
    $Y_2 = \sqrt{S_i + 21} + \sqrt{S_j + 21}$
  Forecast home and visitors score
    $H = ((Y_1 + Y_2)/2)^2 - 21$, $V = ((Y_2 - Y_1)/2)^2 - 21$
  Note the point differential is the product
    $H - V = Y_1 Y_2$
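A round trip through the transformed scale shows how the pieces fit together. The sketch subtracts k after squaring to undo the shift inside the square root, and checks that the point differential equals the product of the two modeled quantities. The scores used are made up.

```python
import math

K = 21  # shift constant reported in the talk


def transformed_targets(score_i, score_j, k=K):
    """Difference and sum of the square-root-transformed scores."""
    t_i = math.sqrt(score_i + k)
    t_j = math.sqrt(score_j + k)
    return t_i - t_j, t_i + t_j


def forecast_scores(y1, y2, k=K):
    """Invert the transform: square the half-sum and half-difference, remove k."""
    home = ((y1 + y2) / 2) ** 2 - k
    visitor = ((y2 - y1) / 2) ** 2 - k
    return home, visitor


y1, y2 = transformed_targets(28, 15)  # sqrt(49) = 7 and sqrt(36) = 6
home, visitor = forecast_scores(y1, y2)
print(home, visitor, home - visitor, y1 * y2)
```

The shift cancels in the difference, so the spread depends only on the product of the two modeled quantities.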
Unresolved Linear Model Issues
  Overfitting
  History
    Going into the season, we have a good idea as to how teams will do
    The best teams tend to stay the best
    The worst teams tend to stay the worst
    Changes happen
      Kansas State
Intelligence Model
  Concept
    The ratings and home-field advantages for year t are similar to those of year t-1.
    There is some drift from one year to the next.
  Model
    $\theta_t = \theta_{t-1} + \delta_t$, where $\delta_t \sim N(\mathbf{0}, \sigma^2 \Sigma)$
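The drift idea is a random walk from season to season. The sketch below simulates one with independent normal drifts of equal size for every team; the drift size, starting ratings, and function name are illustrative assumptions, not fitted values from the talk.

```python
import random


def simulate_ratings(initial, drift_sd, n_seasons, rng):
    """Random-walk ratings: each season is last season's ratings plus normal drift."""
    seasons = [list(initial)]
    for _ in range(n_seasons - 1):
        previous = seasons[-1]
        seasons.append([r + rng.gauss(0.0, drift_sd) for r in previous])
    return seasons


rng = random.Random(2002)  # fixed seed so the run is repeatable
history = simulate_ratings([3.0, 0.0, -3.0], drift_sd=0.5, n_seasons=5, rng=rng)
for season in history:
    print([round(r, 2) for r in season])
```

With a small drift the ordering of teams tends to persist from one season to the next, yet a team can still move a long way over several seasons.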
Intelligence Model (Details)
  Notation
    L teams
    M seasons of data
    $N_i$ games in the ith season
    $X_i$: the $N_i$ by 2L "X" matrix for season i
    $Y_i$: the $N_i$ vector of results for season i
    $\theta_i$: the 2L vector of ratings and home-field advantages for season i
Details (cont.)
  Data Distribution: For all i = 1, 2, …, M
    $Y_i \mid \theta_i, \sigma^2 \sim N(X_i \theta_i, \sigma^2 I)$ (independent)
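Details (cont.)
  Prior Distribution
    $\theta_1 \mid \sigma^2 \sim N\!\left(\mathbf{0},\ \sigma^2 \begin{bmatrix} I & 0 \\ 0 & 0.05\,I \end{bmatrix}\right)$
    $\theta_i \mid \theta_{i-1}, \sigma^2 \sim N\!\left(\theta_{i-1},\ \sigma^2 \begin{bmatrix} 0.25\,I & 0 \\ 0 & 0.01\,I \end{bmatrix}\right)$ for $i = 2, \ldots, M$
    $\sigma^{-2} \sim \text{Gamma}(2, 0.5)$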
Details (finally, the end)
  The Posterior Distribution of $\theta_M$ and $\sigma^{-2}$ is closed form and can be calculated by an iterative method
  The Predictive Distribution for future results (transformed sum or difference) is straightforward correlated normal (given the variance)
Forecasts
  For Scores
    Simply untransform
    $E[Z^2] = \mathrm{Var}[Z] + E[Z]^2$
  For the point-spread
    Product of two normals
    Simulate 10000 results
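Both forecasting steps can be sketched briefly. The expected score applies the second-moment identity and then removes the shift k = 21 from the square-root transform. The spread simulation draws the transformed difference and sum as independent normals, which is a simplifying assumption (the talk describes them as correlated), and all means and standard deviations below are made up.

```python
import random


def expected_score(mean_t, var_t, k=21):
    """E[S] = E[T^2] - k, using E[T^2] = Var[T] + E[T]^2."""
    return var_t + mean_t ** 2 - k


def simulate_spread(mean_y1, sd_y1, mean_y2, sd_y2, n=10000, seed=0):
    """Draws of the point spread as the product of two independent normals."""
    rng = random.Random(seed)
    return [rng.gauss(mean_y1, sd_y1) * rng.gauss(mean_y2, sd_y2) for _ in range(n)]


print(expected_score(7.0, 0.25))

draws = simulate_spread(1.0, 1.2, 13.0, 1.0)
win_prob = sum(d > 0 for d in draws) / len(draws)
mean_spread = sum(draws) / len(draws)
print(round(win_prob, 3), round(mean_spread, 2))
```

The simulated draws give a win probability directly, and comparing them to a betting line gives the probability of covering the spread.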
Enhanced Model
  Fit the prior parameters
    Hierarchical models
    Drifts and initial variances
  No closed form for posterior and predictive distributions (at least as far as I know)
  The complete conditionals are straightforward, so Gibbs sampling will work (eventually)
Results (www.geocities.com/kambour/football.html)
  2002 Final Rankings

  Team            Rating          Home
  Miami           72.23 (1.03)    0.21 (0.04)
  Kansas St       72.04 (1.04)    0.44 (0.03)
  USC             71.95 (1.03)    0.04 (0.03)
  Oklahoma        71.85 (1.02)    0.18 (0.03)
  Texas           71.57 (1.03)    0.36 (0.03)
  Georgia         71.49 (1.03)    0.02 (0.03)
  Alabama         71.45 (1.03)    -0.09 (0.03)
  Iowa            71.30 (1.03)    0.21 (0.04)
  Florida St      71.29 (1.02)    0.43 (0.03)
  Virginia Tech   71.25 (1.03)    0.12 (0.03)
  Ohio St         71.18 (1.03)    0.27 (0.03)
Bowl Predictions

  Ohio St 17, Miami Fl (-13) 31           0.8255   0.5228
  Washington St 21, Oklahoma (-6.5) 31    0.7347   0.5797
  Iowa 21, USC (-6) 30                    0.7174   0.5721
  NC State (E) 20, Notre Dame 17          0.5639   0.5639
  Florida St (+4) 24, Georgia 27          0.5719   0.5320
2002 Final Record
  Picking Winners: 522 – 157 (0.769)
  Against the Vegas lines: 367 – 307 – 5 (0.544)
  Best Bets: 9 – 7 (0.563)
    In 2001, 11 – 4
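The listed percentages can be reproduced from the records. The figure against the Vegas lines matches if a push counts as half a win; that convention is inferred from the numbers, not stated on the slide.

```python
def record_pct(wins, losses, pushes=0):
    """Winning percentage with each push counted as half a win (assumed convention)."""
    games = wins + losses + pushes
    return (wins + 0.5 * pushes) / games


print(round(record_pct(522, 157), 4))     # picking winners
print(round(record_pct(367, 307, 5), 4))  # against the Vegas lines
print(round(record_pct(9, 7), 4))         # best bets
```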
ESPN College Pick'em (http://games.espn.go.com/cpickem/leader)
  1. Barry Schultz 5830
  2. Jim Dobbs 5687
  3. Michael Reeves 5651
  4. Fup Biz 5594
  5. Joe * 5587
  6. Rising Cream 5562
  7. Intelligence Ratings 5559
Ratings System Comparison (http://tbeck.freeshell.org/fb/awards2002.html)
  Todd Beck, Ph.D.
    Statistician, Rush Institute
  Intelligence Ratings – Best Predictors
College Football Conclusions
  Can forecast the outcome of games
  Capture the random nature
    High variability
    Sparse design
  Scientists should avoid BCS
    Statistical significance is impossible
    Problem Complexity
    Other issues
NFL
  Similar to College Football
    Square root transform is applicable
  Drift is a little higher than College Football
  Better design matrix
  Small sample size
  Playoff
NFL Results (www.geocities.com/kambour/NFL.html)
  2002 Final Rankings (after the Super Bowl)

  Team            Rating    Home
  Tampa Bay       70.72     0.29
  Oakland         70.57     0.28
  Philadelphia    70.55     0.10
  New England     70.16     0.12
  Atlanta         70.13     0.20
  NY Jets         70.10     -0.01
  Pittsburgh      69.95     0.28
  Green Bay       69.92     0.28
  Kansas City     69.90     0.51
  Denver          69.89     0.50
  Miami           69.89     0.49
2002 Final NFL Record
  Picking Winners: 162 – 104 – 1 (0.609)
  Against the Vegas lines: 135 – 128 – 4 (0.513)
  Best Bets: 9 – 8 (0.529)
NFL Europe
  Similar to College and NFL
    Square root transform
  Dramatic drift
    Teams change dramatically in mid-season
  Few teams
    Better design matrix
College Basketball
  Transform?
    Much more normal (Central Limit Theorem)
  A lot more games
    Intersectional games
  Less emphasis on programs than in College Football
    More drift
  NCAA tournament
NCAA Basketball Pre-tournament Ratings

  Team          Rating    Home
  Arizona       100.06    3.97
  Kentucky      99.33     4.32
  Kansas        95.89     3.85
  Texas         93.42     4.44
  Duke          92.90     4.66
  Oklahoma      90.19     4.31
  Florida       90.65     3.99
  Wake Forest   88.70     3.65
  Syracuse      88.50     3.49
  Xavier        87.89     3.37
  Louisville    87.88     4.16
NBA
  Similar to College Basketball
    Normal – No transformation
  A lot more games – fewer teams
  Playoffs are completely different from regular season
    Regular season – very balanced, strong home court
    Post season – less balanced, home court lessened
Hockey
  Transform
    Rare events = "Poissonish"
    Square root with k around 1
  A lot more games
  History matters
  Playoffs seem similar to regular season
  Balance
Soccer
  Similar to hockey
  Transform
    Square root with low k
  Not a lot of games
  Friendlies versus cup play
  Home pitch is pronounced
    Varies widely
Soccer Results
  Correctly forecasted 2002 World Cup final
    Brazil over Germany
  Correctly forecasted US run to quarter-finals
  Won the PROS World Cup Soccer Pool
Future Enhancements
  Hierarchical Approaches
    Conferences
  More complicated drift models
    Correlations
    Individual drifts
    Drift during the season
    Mean correcting drift
  More informative priors