Talent Acquisition Analytics – Lessons Learned from Rating Competitors in Games and Sports
Mark E. Glickman, Ph.D.
Department of Health Policy & Management
Boston University School of Public Health
Statistical analysis in talent acquisition:
"Moneyball" revolution in talent assessment in sports
Data-driven methods in hiring have increased substantially in the last 10-15 years
Statistical methods for evaluating performance in games and sports have been around for 50+ years.
What can human resource managers/executives learn from statistical methods in games/sports?
Outline of talk:
Who am I?
Statistical approaches to measuring competitor strength in games/sports
Examples (chess, Olympic sports)
Closing the loop: circling back to talent acquisition
Disclaimer:
My area of expertise is statistical methods, not talent acquisition.
This talk is not intended to be a tutorial. Do not expect to be able to apply the described methods at the conclusion of the presentation.
This presentation is intended to make you aware of an under-utilized way of approaching talent acquisition analytics that you may find worth considering.
A little bit about me:
Research Professor of Health Policy and Management at Boston University; teach statistics classes at Harvard as Visiting Professor
Editor in Chief, Journal of Quantitative Analysis in Sports
Co-organizer of the biennial New England Symposium on Statistics in Sports
My original interest in measuring competitor strength: chess ratings
I've also consulted for several companies on statistical issues in recruiting and talent assessment.
Measuring competitor ability: Paired comparisons
Observe game outcomes among competitors (e.g., tournaments, leagues)
Interested in measuring relative team/player abilities
Usual goal: Predict future performance
Typical setup
Basic example:
Adam, Bob and Carl play checkers in pairs.
Adam vs Bob: Adam wins 8 games out of 10
Adam vs Carl: Adam wins 6 games out of 10
Bob vs Carl: Bob wins 7 games out of 10
Can conclude: Adam > Bob > Carl
Adam defeats Bob about 80% of the time, Adam defeats Carl about 60% of the time, and Bob defeats Carl about 70% of the time.
Basic example – altered slightly:
Adam, Bob and Carl play checkers in pairs.
Adam vs Bob: Adam wins 8 games out of 10
Adam vs Carl: Adam wins 6 games out of 10
Bob and Carl do not compete
Can conclude: Adam > Bob and Adam > Carl (approximately)
Difficult to conclude what to expect if Bob and Carl were to compete. Need extra assumptions.
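One way to supply such extra assumptions is a paired-comparison model such as Bradley-Terry (my choice for illustration; the talk does not name a specific model here), under which the Bob-versus-Carl probability can be inferred from each player's results against Adam:

```python
# Bradley-Terry sketch (illustrative assumption): each player has a
# strength s, and P(i beats j) = s_i / (s_i + s_j).

def strength_vs(anchor_strength, anchor_win_prob):
    # Solve anchor/(anchor + s) = anchor_win_prob for the opponent's s.
    return anchor_strength * (1 - anchor_win_prob) / anchor_win_prob

s_adam = 1.0                       # fix Adam's strength as the reference
s_bob = strength_vs(s_adam, 0.8)   # Adam beat Bob in 8 of 10 games
s_carl = strength_vs(s_adam, 0.6)  # Adam beat Carl in 6 of 10 games

# Implied probability that Bob beats Carl, though they never played.
p_bob_beats_carl = s_bob / (s_bob + s_carl)
print(round(p_bob_beats_carl, 3))  # about 0.273 under this model
```

Interestingly, this model would favor Carl over Bob, because Carl fared better against Adam; different assumptions can give different answers, which is exactly the point.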
How a statistician views a competitor:
Paper bag!
Bag contains many numbered slips of paper
Each time a player competes, he/she draws a numbered slip of paper from the bag
The average value of the numbers in the bag can be thought of as the player’s overall average strength
Generally we do not know the average strength, but we hope to estimate it from performances and game outcomes
Two competitors…
…two bags!
Key idea:
When two players compete, each draws a single slip from his/her bag and compares the two values.
If my value is higher, I win. If yours is higher, you win.
With enough games, we should be able to estimate the relationship between the average values in our two bags.
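The bag metaphor is easy to simulate; in the sketch below each "bag" is modeled as a normal distribution of performance numbers, with means and spread invented for illustration:

```python
import random

random.seed(1)

# Each competitor's "bag" is a normal distribution; the mean is the
# player's average strength (values here are assumptions for the demo).
mean_a, mean_b, spread = 1600.0, 1500.0, 200.0

def play_game():
    # Each player draws one slip; the higher draw wins.
    return random.gauss(mean_a, spread) > random.gauss(mean_b, spread)

n_games = 100_000
wins_a = sum(play_game() for _ in range(n_games))
print(wins_a / n_games)  # fraction of games won by the stronger player
```

With enough simulated games, the observed win fraction pins down the gap between the two bag averages, which is what a rating system tries to recover from real game outcomes.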
A “rating system” is designed to estimate the average ability for each competitor.
Complication: Competitor strength is likely changing over time, so rating systems need to account for changing strength.
Arpad Elo's solution:
The most commonly used rating system was developed by Arpad Elo in the late 1950s and adopted by the US Chess Federation (USCF) in 1960.
A rating is a number between 100 and 3000 that measures an individual’s playing strength.
The higher the rating, the stronger the player.
Elo’s system is dynamic; when a player wins, his/her rating increases, and when a player loses, his/her rating declines.
Key formula in Elo rating system:
Suppose a player competes against n opponents. The Elo rating update formula (in its simplest form) is

new rating = old rating + K × [(score1 − expected1) + … + (scoren − expectedn)]

where score1, …, scoren are each 1 if the result against the corresponding opponent was a win, 0 if a loss, and 0.5 if a draw; expected1, …, expectedn are winning expectancies based on the pre-tournament rating differences; and K is a constant governing how quickly the rating changes.
Example calculation:
Suppose a player rated 1500 competes in a tournament against 3 opponents with ratings 1400, 1550 and 1700.
Suppose further that the player wins against the first opponent and loses the second two games.
The winning expectancies (from the winning expectancy curve) against these three opponents are 0.640, 0.429, and 0.240.
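The update for this example can be computed directly; the K-factor of 32 below is my assumption for illustration (the USCF actually uses several K-factors depending on the player):

```python
# Elo update for the worked example: a 1500-rated player scores
# 1 win and 2 losses against opponents rated 1400, 1550, 1700.

def expected(r_player, r_opponent):
    # Standard Elo winning expectancy from the rating difference.
    return 1.0 / (1.0 + 10.0 ** (-(r_player - r_opponent) / 400.0))

rating = 1500.0
results = [(1400, 1.0), (1550, 0.0), (1700, 0.0)]  # (opponent, score)
K = 32  # assumed update constant

update = K * sum(score - expected(rating, opp) for opp, score in results)
print(round(rating + update))  # new rating, about 1490
```

The expectancies computed by `expected` (0.640, 0.429, 0.240) match the values quoted from the winning expectancy curve; the player scored 1 point against an expected 1.31, so the rating drops by roughly 10 points.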
Top ten US players: July 2014
1 Nakamura, Hikaru NY USA 2855
2 Kamsky, Gata NY USA 2799
3 Akobian, Varuzhan KS USA 2757
4 Gareyev, Timur NV USA 2736
5 Onischuk, Alexander TX USA 2733
6 Shankland, Sam CA USA 2725
7 Robson, Ray MO USA 2716
8 Lenderman, Aleksandr NY USA 2690
9 Erenburg, Sergey VA USA 2677
10 Ramirez, Alejandro TX USA 2673
Top ten under 21 y/o: July 2014
1 Robson, Ray 19 MO USA 2716
2 Naroditsky, Daniel A 18 CA USA 2654
3 Holt, Conrad 20 KS USA 2645
4 Troff, Kayden W 16 UT USA 2606
5 Yang, Darwin 17 TX USA 2582
6 Sevian, Samuel 13 MA USA 2576
7 Xiong, Jeffery 13 TX USA 2548
8 Harmon-Vellotti, Luke C 15 ID USA 2526
9 Chandra, Akshat 15 NJ USA 2516
10 Ostrovskiy, Aleksandr 18 NY USA 2504
Multicompetitor games/sports:
Games/sports include races (human, car, horse), poker, gymnastics, diving, golf, and so on.
A game outcome is a rank-ordering of the competitors.
Goal: Produce ratings for all competitors based on the results of (possibly) many competitions.
Statistical foundation:
More paper bags!
Same idea as in head-to-head games:
Each competitor has his/her own bag of numbers with a (possibly) different average
For each competition, every player draws a number out of his/her bag
The rank ordering of the numbers drawn determines the finish placement of the competition
Same complication as in head-to-head games: need to acknowledge that player abilities are changing over time
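The multicompetitor version of the bag model is a one-line extension of the head-to-head simulation: every competitor draws, and the draws are sorted. A sketch (the normal "bags", names, means, and spread are assumptions for illustration):

```python
import random

random.seed(7)

# Average strengths ("bag" means) for four hypothetical competitors.
means = {"A": 1700, "B": 1600, "C": 1550, "D": 1500}
SPREAD = 150  # common spread of each bag, assumed for illustration

def run_event():
    # Everyone draws one number; finish order is the sorted draws.
    draws = {name: random.gauss(m, SPREAD) for name, m in means.items()}
    return sorted(draws, key=draws.get, reverse=True)

print(run_event())  # one simulated rank-ordering of the four competitors
```

A multicompetitor rating system works in the opposite direction: given many observed finish orders like this one, it estimates the underlying bag averages.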
Application: Alpine skiing – women's downhill
Alpine skiing downhill data:
Event results provided by US Olympic Committee.
103 events from Feb 2002 through Dec 2013.
Total of 268 women competing in roughly 1.5 events per year, on average.
Events vary from 8 to 66 competitors.
Goal: Determine skier ratings over time
Top 10 women Alpine downhill skiers: December 2013
Rank Skier's Name Rating Std Dev
1 Hilde Gerg 2067 240
2 Michaela Dorfmeister 2061 274
3 Tina Maze 1970 72
4 Marion Rolland 1944 93
5 Anna Fenninger 1914 83
6 Elena Fanchini 1868 80
7 Julia Mancuso 1860 62
8 Lara Gut 1858 76
9 Elisabeth Görgl 1857 72
10 Maria Höfl-Riesch 1855 78
Results for Lindsey Vonn and Julia Mancuso
From competitor ratings to talent acquisition analytics:
Numerically rate features of job candidates, e.g., on a Likert scale, or from objective metrics
Combine different features into one overall score
Screen or rank-order candidates based on the combined score
Common approach to talent acquisition: Scales formed from numerically measured indicators of talent.
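As a sketch of that common approach, candidate features can be scored on a Likert scale and collapsed into one overall number by a weighted sum (the features, weights, and scores below are invented for illustration):

```python
# Common approach: each candidate gets Likert-style feature scores
# (1-5), which are collapsed into one number by a weighted sum.
weights = {"experience": 0.5, "communication": 0.3, "culture_fit": 0.2}

candidates = {
    "Candidate 1": {"experience": 4, "communication": 5, "culture_fit": 3},
    "Candidate 2": {"experience": 5, "communication": 3, "culture_fit": 4},
}

def overall(scores):
    # Weighted combination of feature scores into a single value.
    return sum(weights[f] * scores[f] for f in weights)

# Screen or rank-order candidates by the combined score.
ranked = sorted(candidates, key=lambda c: overall(candidates[c]), reverse=True)
print(ranked)
```

Note how sensitive the final ordering is to the chosen weights; that sensitivity is one of the difficulties discussed next.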
Difficulties with this approach:
High respondent burden – assigning scores can take non-trivial mental effort
Often difficult to discriminate between candidates when assigning numerical values
Simple combining of objective metrics into single values relies heavily on the method of combining
Numerically assigned values can often be biased towards high values (e.g., indicating good and excellent candidate features)
Competitor-based model:
Rather than combining metrics for different candidates, perform direct comparisons
Many types of biases/difficulties present in numerical scoring are absent in comparison approaches
Similarity to competitor statistical models:
Each HR staff-person evaluating a job candidate forms their own impression (drawing a number from the candidate's paper bag)
The overall merit of a job candidate can be understood as the average of many HR evaluations
Analogy to rating systems for games/sports: We can learn about the overall merit by having HR staff compare candidates head-to-head, or in rank order.
Candidates’ merits may change over time
Example application:
30 candidates apply for a job; five HR staff are assigned to make 10 paired comparisons on a subset of the candidates
Apply (for example) the Elo rating system to the results of the comparisons, and rank order candidates based on computed ratings
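As a sketch of this idea, the Elo update can be applied comparison-by-comparison to recorded candidate pairings (the candidate names and comparison results below are invented, and the starting rating and K = 32 are assumptions for illustration):

```python
# Apply Elo sequentially to paired comparisons of job candidates.
# Each comparison is (winner, loser) as judged by an HR evaluator.

K = 32          # assumed update constant
START = 1500.0  # every candidate begins at the same rating

def expected(r_a, r_b):
    # Standard Elo winning expectancy for rating r_a against r_b.
    return 1.0 / (1.0 + 10.0 ** (-(r_a - r_b) / 400.0))

comparisons = [("Ann", "Ben"), ("Ann", "Cal"), ("Ben", "Cal"),
               ("Ann", "Ben"), ("Cal", "Ben")]  # invented example data

ratings = {}
for winner, loser in comparisons:
    rw = ratings.setdefault(winner, START)
    rl = ratings.setdefault(loser, START)
    ratings[winner] = rw + K * (1.0 - expected(rw, rl))
    ratings[loser] = rl + K * (0.0 - expected(rl, rw))

# Rank-order candidates by computed rating.
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(name, round(ratings[name]))
```

Because the update is zero-sum with a common K, rating points only move between compared candidates, and the final ordering reflects both who beat whom and how surprising each result was.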
A few comments:
Can also ask HR staff to rank order batches of candidates, and use multicompetitor rating systems to determine candidate ratings
Candidates who reapply for jobs over time may be changing in desirability or fit, so their ratings ought to change over time
Many competitor rating systems can be used "off the shelf" without changing any formulas; Elo is one, and my Glicko system is another
Moral of the story:
Incorporating quantitative methods into talent acquisition is not new, but little attention has been paid to methods used in rating competitors in games/sports
Evaluation schemes for talent acquisition based on head-to-head comparisons or rank orderings permit use of existing games/sports rating systems.
Thanks for listening!
Contact E-mail: [email protected]