Talent Acquisition Analytics – Lessons Learned from Rating Competitors in Games and Sports
Mark E. Glickman, Ph.D.
Department of Health Policy & Management
Boston University School of Public Health
Statistical analysis in talent acquisition:
"Moneyball" revolution in talent assessment in sports
Data-driven methods in hiring have increased substantially in the last 10-15 years
Statistical methods for evaluating performance in games and sports have been around for 50+ years.
What can human resource managers/executives learn from statistical methods in games/sports?
Outline of talk:
Who am I?
Statistical approaches to measuring competitor strength in games/sports
Examples (chess, Olympic sports)
Closing the loop: circling back to talent acquisition
Disclaimer:
My area of expertise is statistical methods, not talent acquisition.
This talk is not intended to be a tutorial. Do not expect to be able to apply the described methods at the conclusion of the presentation.
This presentation is intended to make you aware of an under-utilized way of approaching talent acquisition analytics that you may find worth considering.
A little bit about me:
Research Professor of Health Policy and Management at Boston University; teach statistics classes at Harvard as Visiting Professor
Editor in Chief, Journal of Quantitative Analysis in Sports
Co-organizer of the biennial New England Symposium on Statistics in Sports
My original interest in measuring competitor strength: chess ratings
I've also consulted for several companies on statistical issues in recruiting and talent assessment.
Measuring competitor ability: Paired comparisons
Observe game outcomes among competitors (e.g., tournaments, leagues)
Interested in measuring relative team/player abilities
Usual goal: Predict future performance
Typical setup
Basic example:
Adam, Bob and Carl play checkers in pairs.
Adam vs Bob: Adam wins 8 games out of 10
Adam vs Carl: Adam wins 6 games out of 10
Bob vs Carl: Bob wins 7 games out of 10
Can conclude: Adam > Bob > Carl
Adam defeats Bob about 80% of the time, Adam defeats Carl about 60% of the time, and Bob defeats Carl about 70% of the time.
Basic example – altered slightly:
Adam, Bob and Carl play checkers in pairs.
Adam vs Bob: Adam wins 8 games out of 10
Adam vs Carl: Adam wins 6 games out of 10
Bob and Carl do not compete
Can conclude: Adam > Bob and Adam > Carl (approximately)
Difficult to conclude what to expect if Bob and Carl were to compete. Need extra assumptions.
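One way to supply such extra assumptions is a paired-comparison model such as Bradley-Terry (my choice for illustration; the talk does not name a specific model here), under which the Bob-versus-Carl probability can be inferred from each player's results against Adam:

```python
# Bradley-Terry sketch (illustrative assumption): each player has a
# strength s, and P(i beats j) = s_i / (s_i + s_j).

def strength_vs(anchor_strength, anchor_win_prob):
    # Solve anchor/(anchor + s) = anchor_win_prob for the opponent's s.
    return anchor_strength * (1 - anchor_win_prob) / anchor_win_prob

s_adam = 1.0                       # fix Adam's strength as the reference
s_bob = strength_vs(s_adam, 0.8)   # Adam beat Bob in 8 of 10 games
s_carl = strength_vs(s_adam, 0.6)  # Adam beat Carl in 6 of 10 games

# Implied probability that Bob beats Carl, though they never played.
p_bob_beats_carl = s_bob / (s_bob + s_carl)
print(round(p_bob_beats_carl, 3))  # about 0.273 under this model
```

Interestingly, this model would favor Carl over Bob, because Carl fared better against Adam; different assumptions can give different answers, which is exactly the point.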
How a statistician views a competitor:
Paper bag!
Bag contains many numbered slips of paper
Each time a player competes, he/she draws a numbered slip of paper from the bag
The average value of the numbers in the bag can be thought of as the player’s overall average strength
Generally we do not know the average strength, but we hope to estimate it from performances and game outcomes
Two competitors…
…two bags!
Key idea:
When two players compete, each draws a single slip from his/her bag and compares the two values.
If my value is higher, I win. If yours is higher, you win.
With enough games, we should be able to estimate the relationship between the average values in our two bags.
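The bag metaphor is easy to simulate; in the sketch below each "bag" is modeled as a normal distribution of performance numbers, with means and spread invented for illustration:

```python
import random

random.seed(1)

# Each competitor's "bag" is a normal distribution; the mean is the
# player's average strength (values here are assumptions for the demo).
mean_a, mean_b, spread = 1600.0, 1500.0, 200.0

def play_game():
    # Each player draws one slip; the higher draw wins.
    return random.gauss(mean_a, spread) > random.gauss(mean_b, spread)

n_games = 100_000
wins_a = sum(play_game() for _ in range(n_games))
print(wins_a / n_games)  # fraction of games won by the stronger player
```

With enough simulated games, the observed win fraction pins down the gap between the two bag averages, which is what a rating system tries to recover from real game outcomes.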
A “rating system” is designed to estimate the average ability for each competitor.
Complication: Competitor strength is likely changing over time, so rating systems need to account for changing strength.
Arpad Elo's solution:
The most commonly used rating system was developed by Arpad Elo in the late 1950s and adopted by the US Chess Federation (USCF) in 1960.
A rating is a number between 100 and 3000 that measures an individual’s playing strength.
The higher the rating, the stronger the player.
Elo’s system is dynamic; when a player wins, his/her rating increases, and when a player loses, his/her rating declines.
Key formula in Elo rating system:
Suppose a player competes against n opponents. The Elo rating update formula (in its simplest form) is

new rating = old rating + K × [(score1 − expected1) + … + (scoren − expectedn)]

where score1, …, scoren are each 1 if the result against the corresponding opponent was a win, 0 if a loss, and 0.5 if a draw; expected1, …, expectedn are winning expectancies based on the pre-tournament rating differences; and K is a constant governing how quickly the rating changes.
Example calculation:
Suppose a player rated 1500 competes in a tournament against 3 opponents with ratings 1400, 1550 and 1700.
Suppose further that the player wins against the first opponent and loses the second two games.
The winning expectancies (from the winning expectancy curve) against these three opponents are 0.640, 0.429, and 0.240.
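The update for this example can be computed directly; the K-factor of 32 below is my assumption for illustration (the USCF actually uses several K-factors depending on the player):

```python
# Elo update for the worked example: a 1500-rated player scores
# 1 win and 2 losses against opponents rated 1400, 1550, 1700.

def expected(r_player, r_opponent):
    # Standard Elo winning expectancy from the rating difference.
    return 1.0 / (1.0 + 10.0 ** (-(r_player - r_opponent) / 400.0))

rating = 1500.0
results = [(1400, 1.0), (1550, 0.0), (1700, 0.0)]  # (opponent, score)
K = 32  # assumed update constant

update = K * sum(score - expected(rating, opp) for opp, score in results)
print(round(rating + update))  # new rating, about 1490
```

The expectancies computed by `expected` (0.640, 0.429, 0.240) match the values quoted from the winning expectancy curve; the player scored 1 point against an expected 1.31, so the rating drops by roughly 10 points.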
Top ten US players: July 2014
1 Nakamura, Hikaru NY USA 2855
2 Kamsky, Gata NY USA 2799
3 Akobian, Varuzhan KS USA 2757
4 Gareyev, Timur NV USA 2736
5 Onischuk, Alexander TX USA 2733
6 Shankland, Sam CA USA 2725
7 Robson, Ray MO USA 2716
8 Lenderman, Aleksandr NY USA 2690
9 Erenburg, Sergey VA USA 2677
10 Ramirez, Alejandro TX USA 2673
Top ten under 21 y/o: July 2014
1 Robson, Ray 19 MO USA 2716
2 Naroditsky, Daniel A 18 CA USA 2654
3 Holt, Conrad 20 KS USA 2645
4 Troff, Kayden W 16 UT USA 2606
5 Yang, Darwin 17 TX USA 2582
6 Sevian, Samuel 13 MA USA 2576
7 Xiong, Jeffery 13 TX USA 2548
8 Harmon-Vellotti, Luke C 15 ID USA 2526
9 Chandra, Akshat 15 NJ USA 2516
10 Ostrovskiy, Aleksandr 18 NY USA 2504
Multicompetitor games/sports:
Games/sports include races (human, car, horse), poker, gymnastics, diving, golf, and so on.
A game outcome is a rank-ordering of the competitors.
Goal: Produce ratings for all competitors based on the results of (possibly) many competitions.
Statistical foundation:
More paper bags!
Same idea as in head-to-head games:
Each competitor has his/her own bag of numbers with a (possibly) different average
For each competition, every player draws a number out of his/her bag
The rank ordering of the numbers drawn determines the finish placement of the competition
Same complication as in head-to-head games: need to acknowledge that player abilities are changing over time
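The multicompetitor version of the bag model is a one-line extension of the head-to-head simulation: every competitor draws, and the draws are sorted. A sketch (the normal "bags", names, means, and spread are assumptions for illustration):

```python
import random

random.seed(7)

# Average strengths ("bag" means) for four hypothetical competitors.
means = {"A": 1700, "B": 1600, "C": 1550, "D": 1500}
SPREAD = 150  # common spread of each bag, assumed for illustration

def run_event():
    # Everyone draws one number; finish order is the sorted draws.
    draws = {name: random.gauss(m, SPREAD) for name, m in means.items()}
    return sorted(draws, key=draws.get, reverse=True)

print(run_event())  # one simulated rank-ordering of the four competitors
```

A multicompetitor rating system works in the opposite direction: given many observed finish orders like this one, it estimates the underlying bag averages.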
Application: Alpine skiing – women's downhill
Alpine skiing downhill data:
Event results provided by US Olympic Committee.
103 events from Feb 2002 through Dec 2013.
Total of 268 women competing in roughly 1.5 events per year, on average.
Events vary from 8 to 66 competitors.
Goal: Determine skier ratings over time
Top 10 women Alpine downhill skiers: December 2013
Rank Skier's Name Rating Std Dev
1 Hilde Gerg 2067 240
2 Michaela Dorfmeister 2061 274
3 Tina Maze 1970 72
4 Marion Rolland 1944 93
5 Anna Fenninger 1914 83
6 Elena Fanchini 1868 80
7 Julia Mancuso 1860 62
8 Lara Gut 1858 76
9 Elisabeth Görgl 1857 72
10 Maria Höfl-Riesch 1855 78
Results for Lindsey Vonn and Julia Mancuso
From competitor ratings to talent acquisition analytics:
Numerically rate features of job candidates, e.g., on a Likert scale, or from objective metrics
Combine different features into one overall score
Screen or rank-order candidates based on the combined score
Common approach to talent acquisition: Scales formed from numerically measured indicators of talent.
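As a sketch of that common approach, candidate features can be scored on a Likert scale and collapsed into one overall number by a weighted sum (the features, weights, and scores below are invented for illustration):

```python
# Common approach: each candidate gets Likert-style feature scores
# (1-5), which are collapsed into one number by a weighted sum.
weights = {"experience": 0.5, "communication": 0.3, "culture_fit": 0.2}

candidates = {
    "Candidate 1": {"experience": 4, "communication": 5, "culture_fit": 3},
    "Candidate 2": {"experience": 5, "communication": 3, "culture_fit": 4},
}

def overall(scores):
    # Weighted combination of feature scores into a single value.
    return sum(weights[f] * scores[f] for f in weights)

# Screen or rank-order candidates by the combined score.
ranked = sorted(candidates, key=lambda c: overall(candidates[c]), reverse=True)
print(ranked)
```

Note how sensitive the final ordering is to the chosen weights; that sensitivity is one of the difficulties discussed next.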
Difficulties with this approach:
High respondent burden – assigning scores can take non-trivial mental effort
Often difficult to discriminate between candidates when assigning numerical values
Simple combining of objective metrics into single values relies heavily on the method of combining
Numerically assigned values can often be biased towards high values (e.g., indicating good and excellent candidate features)
Competitor-based model:
Rather than combining metrics for different candidates, perform direct comparisons
Many types of biases/difficulties present in numerical scoring are absent in comparison approaches
Similarity to competitor statistical models:
Each HR staff-person evaluating a job candidate forms their own impression (drawing a number from the candidate's paper bag)
The overall merit of a job candidate can be understood as the average of many HR evaluations
Analogy to rating systems for games/sports: We can learn about the overall merit by having HR staff compare candidates head-to-head, or in rank order.
Candidates’ merits may change over time
Example application:
30 candidates apply for a job; five HR staff are assigned to make 10 paired comparisons on a subset of the candidates
Apply (for example) the Elo rating system to the results of the comparisons, and rank order candidates based on computed ratings
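As a sketch of this idea, the Elo update can be applied comparison-by-comparison to recorded candidate pairings (the candidate names and comparison results below are invented, and the starting rating and K = 32 are assumptions for illustration):

```python
# Apply Elo sequentially to paired comparisons of job candidates.
# Each comparison is (winner, loser) as judged by an HR evaluator.

K = 32          # assumed update constant
START = 1500.0  # every candidate begins at the same rating

def expected(r_a, r_b):
    # Standard Elo winning expectancy for rating r_a against r_b.
    return 1.0 / (1.0 + 10.0 ** (-(r_a - r_b) / 400.0))

comparisons = [("Ann", "Ben"), ("Ann", "Cal"), ("Ben", "Cal"),
               ("Ann", "Ben"), ("Cal", "Ben")]  # invented example data

ratings = {}
for winner, loser in comparisons:
    rw = ratings.setdefault(winner, START)
    rl = ratings.setdefault(loser, START)
    ratings[winner] = rw + K * (1.0 - expected(rw, rl))
    ratings[loser] = rl + K * (0.0 - expected(rl, rw))

# Rank-order candidates by computed rating.
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(name, round(ratings[name]))
```

Because the update is zero-sum with a common K, rating points only move between compared candidates, and the final ordering reflects both who beat whom and how surprising each result was.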
A few comments:
Can also ask HR staff to rank order batches of candidates, and use multicompetitor rating systems to determine candidate ratings
Candidates who reapply for jobs over time may be changing in desirability or fit, so their ratings ought to change over time
Many competitor rating systems can be used "off the shelf" without changing any formulas; Elo is one, and my Glicko system is another
Moral of the story:
Incorporating quantitative methods into talent acquisition is not new, but little attention has been paid to methods used in rating competitors in games/sports
Evaluation schemes for talent acquisition based on head-to-head comparisons or rank orderings permit use of existing games/sports rating systems.
Thanks for listening!
Contact E-mail: [email protected]