Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van...

download Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

of 32

Transcript of Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van...

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    1/32Electronic copy available at: http://ssrn.com/abstract=2129879

    Beyond Chance?The Persistence of Performance in Online Poker

    Rogier J.D. Potter van LoonErasmus School of Economics, Erasmus University Rotterdam

    [email protected]

    Martijn J. van den Assem *Erasmus School of Economics, Erasmus University Rotterdam

    [email protected]

    Dennie van DolderErasmus School of Economics, Erasmus University Rotterdam

    [email protected]

    [ Preliminary version, comments and suggestions are welcome ]

    Abstract: A major issue in the widespread controversy about the legality of poker and

    the appropriate taxation of winnings is whether poker should be considered a game of

    skill or a game of chance. To inform this debate we present an analysis into the role of

    skill in the performance of online poker players, using a large database with hundreds

    of millions of player-hand observations from real money ring games at three different

    stakes levels. We find that players whose earlier profitability was in the top (bottom)

    deciles perform better (worse) and are substantially more likely to end up in the top

    (bottom) performance deciles of the following time period. Regression analyses of

    performance on historical performance and other skill-related proxies provide further

    evidence for persistence and predictability. Our results suggest that skill is an

    important factor in online poker.

    JEL: C01, D80, K00, K34

    Version: August 21, 2012 (First online version: August 15, 2012)

    Keywords: poker, onl ine poker, no limit texas holdem, online gambling, skill,

    chance, game of skill, game of chance, performance persistence

    * Contact details for manuscript correspondence: Erasmus University of Rotterdam, P.O. Box 1738, 3000 DR, Rotterdam, the Netherlands.

    E: [email protected], T: +31104081285, F: +31104089165. We gratefully acknowledge support from the Erasmus Research Institute

    of Management, the Netherlands Organisation for Scientific Research (NWO), and the Tinbergen Institute.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    2/32Electronic copy available at: http://ssrn.com/abstract=2129879

    [2]

    Poker is the most popular card game in the world. Every day, hundreds of thousands of people play

    poker for real money on the Internet (Online Poker Traffic Reports). In 2011, online poker rooms

    generated approximately $4 billion in revenues (H2 Gambling Capital). The popularity of the game is

    also evidenced by the many TV reports of major poker tournaments and the number of participants

    in these tournaments. In 2012, for example, 6,598 people paid $10,000 to participate in the most

    renowned poker tournament, the Main Event of the World Series of Poker in Las Vegas.

    At the same time, there is a widespread controversy about the legality of poker and the appropriate

    taxation of winnings. A key issue in the debate is whether poker is to be considered a game of chance

    or a game of skill. Unlike with games of skill, organizing or playing a game of chance is prohibited or

    restricted in many countries. Also, many countries have a separate gam(bl)ing tax for games of

    chance, while money won in a game of skill is mostly subject to regular income tax. Kelly, Dhar and

    Verbiest (2007) map legislation and case law on poker for various countries, and show that there is

    great variation. US regulation is even different across states. Over recent years, several legal scholars

    have argued that poker is a skill game and should be recognized as such (Cabot and Hannum, 2005;

    Grohman, 2006; Tselnik, 2007).

    Authorities often have a less permissive stance towards online poker than towards live poker. In the

    US, for example, the Unlawful Internet Gambling Enforcement Act (UIGEA) that was implemented in

    2006 had a major impact: although the Act does not forbid online gambling, it does prohibit the

    transfer of funds to and from online gambling businesses. As depositing money is necessary for

    playing online poker, this Act effectively declares online poker illegal. If poker would be considered as

    a game of skill the UIGEA would probably no longer affect the online poker business.

    Behavioral research hints that there are important skill elements in poker. For optimal play in the

    long run, poker players generally need to apply the expected value criterion to every decision in the

    game. There is considerable evidence, however, that they will systematically deviate from this ideal

    and that they will do so in different degrees. One issue is that people have a tendency to evaluatetheir decisions one at a time and thereby weigh losses more heavily than gains, a phenomenon

    labeled myopic loss aversion (Benartzi and Thaler, 1995). In addition, people are typically risk averse

    for gains, but risk seeking for losses (the reflection effect; Kahneman and Tversky, 1979). Loss

    aversion and reflection lead to path-dependent risk preferences if people do not update their

    reference point after prior gains or losses, a phenomenon indeed found in various studies (Thaler

    and Johnson, 1990, Kameda and Davis, 1990; Post et al ., 2008). P eoples nonlinear sensitivity to

    probabilities is another reason why behavior diverges from the expected value criterion (Allais, 1953;

    Kahneman and Tversky, 1979; Tversky and Kahneman, 1992; Tversky and Fox, 1995).

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    3/32

    [3]

    Perhaps even more important, probabilities in decision problems in poker are rarely precise as they

    depend on the strategic choices of other players. As a result, players are likely to go wrong

    systematically: typically, people are overly optimistic about their own chances when they have a

    sense of control (Langer, 1975; Langer and Roth, 1975; Weinstein, 1980; Svenson, 1981), overly

    confident about their probability judgments (Alpert and Raiffa, 1982; Fischhoff et al ., 1977), and

    averse to ambiguity in a degree that depends on how competent they feel in evaluating the gamble

    (Ellsberg, 1961; Fox and Tversky, 1995). Also, instead of using Bayesian statistics they judge

    probabilities by the subjective ease with which relevant instances come to mind and by focusing on

    representativeness (Kahneman and Tversky, 1972, 1973; Tversky and Kahneman, 1971, 1973; Rabin,

    2002).

    Some behavioral studies have directly examined the behavior of poker players on some of these

    issues. Park and Santos-Pinto (2010) survey poker players and report that they overestimate their

    own expected performance. Smith et al . (2009) find that online players display path-dependent risk

    attitudes by playing less cautiously after big losses. Similarly, Eil and Lien (2010) find that players who

    have lost tend to play longer and less conservatively.

    Differences between players in the degree to which they are prone to these deficiencies lead to

    differences in their expected performance. Such heterogeneity seems natural and has also been

    shown in a large number of studies. Recent research has uncovered factors that explain systematic

    differences in peoples proneness . For example, Cesarini et al . (2011) and Cronqvist and Siegel (2011)

    find survey-based evidence from twins that genetic differences are a source of heterogeneity in

    anomalous choice behavior, and Stanovich and West (1998) and Grinblatt, Keloharju and Linnainmaa

    (2011) report a relation between susceptibility to behavioral biases and cognitive ability.

    Simultaneously, players are likely to have different levels of strategic sophistication. At the first level,

    a player considers her own cards only. A second-level player understands that opponents base their

    decisions on their own cards likewise, and updates the relative strength of her cards according toother players actions. At the third level, a player also bears in mind that opponents are influenced by

    her actions, entailing the awareness of opportunities to bluff. In theory, there are infinitely many

    such levels. 1 Heterogeneity in strategic sophistication is illustrated, for example, in a field experiment

    by Palacios-Huerta and Volij (2009): in a game called the centipede game, experienced chess players

    chose the (rational) action dictated by backward induction more often than college students. Other

    examples are Stahl and Wilson (1994), Haruvy, Stahl and Wilson (2001), and Camerer, Ho and Chong

    1 The analysis of optimal strategies in poker has captivated many game theorists. See, for example, vonNeumann and Morgenstern (1944), Kuhn (1950), Nash and Shapley (1950), and Friedman (1971).

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    4/32

    [4]

    (2004). Altogether, these behavioral insights suggest that a players performance in the game of

    poker is determined by more than chance alone.

    Two different research tracks have examined the skill component in poker. One track of prior studies

    focuses on developing and calculating measures of skill in games. Borm and van der Genugten(2001), Dreef, Borm and van der Genugten (2003, 2004a, 2004b), and Hendrickx et al . (2008) propose

    measures that compare the performances of different types of players, including an informed

    hypothetical player who knows exactly the cards that will be drawn. The use of their approach is,

    however, limited to relatively simple games. Because of the virtually infinite number of possible

    game situations that result from the many different choice (betting) options that players have and

    because of the importance of players hidden higher order beliefs, the approach cannot be accurately

    implemented for the most popular form of poker, No Limit Texas Holdem. Nevertheless, even for

    simple poker variants, the different studies report a substantial degree of skill. Heubeck (2008)

    reviews the various kinds of proposed skill measures.

    The second track of prior studies takes a more empirically oriented approach. Likewise, these papers

    suggest that poker involves a skill component. Larkey et al . (1997) and Cabot and Hannum (2005) ran

    large-scale simulations with different pre-defined playing strategies and find that their more

    sophisticated strategies performed better. DeDonno and Detterman (2008) carried out experiments

    with student-subjects and demonstrate that the group of players who received strategic instructions

    during the session outperformed the control group. Siler (2009) analyzes online poker data and

    establishes that performance is related to playing style, and that style and performance differences

    between players decrease with the level of the stakes.

    Our analysis is closest to the empirical studies by Croson, Fishman and Pope (2008) and Levitt and

    Miles (2011). Both these studies examine whether there is persistence in the performance of poker

    players. Croson, Fishman and Pope analyze how well players who have finished in the top 18 of a

    high-stakes tournament fare when they are among the final 18 players in a subsequent majortournament, and they compare their results with those from a similar analysis for professional golf.

    They find that previous finishes predict current finishes, and that the skill differences across the

    poker players in their sample are similar to those across the golfers. Levitt and Miles analyze a data

    set that comprises the complete rankings of all players who entered a 2010 World Series of Poker

    tournament. They report that players who were a priori classified as being especially skilled indeed

    outperformed the other players.

    In the present paper we analyze the role of skill in the performance of online poker players, using alarge database with 415.9 million player-hand observations from real money ring games at three

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    5/32

    [5]

    different stakes levels. We believe that it is natural to use online poker data to inform the chance-

    skill debate, because the debate is especially oriented towards issues regarding the legality of

    internet poker and the taxation of winnings from online play. Moreover, the vast amount of data

    that is available allows for powerful analyses. To the best of our knowledge, we are the first to make

    use of online poker data to examine whether there is persistence in poker pl ayers performance.

    We define skill as anything that affects a players performance other than chance. Because the effect

    of chance cancels out in the long run, predictable differences in performance between players would

    indicate that skill matters. In a pure game of chance each players expected winnings are zero (in

    absence of costs), and there is no persistence or positive autocorrelation in their performance:

    players performance over a given period is independent of their performance over any other period.

    Our results indicate that skill is an important factor. When we split our sample into subperiods, we

    find that players whose performance was in the top (bottom) deciles of the previous period perform

    better (worse) and are more likely to end up in the top (bottom) deciles of the current period. Our

    regression analyses of performance on past performance and other skill proxies reinforce the

    evidence of persistence in performance.

    The remainder of the paper is structured as follows: Section I discusses our data and presents

    descriptive statistics, Section II analyzes the persistence of performance using decile analyses,

    Section III reports on regression analyses, and Section IV concludes.

    I. Data and Descriptive Statistics

    For our analysis we use data on real money ring games (cash games) played at one of the major

    online poker sites. We consider No Limit (NL) Texas Holdem only because this variant is by far the

    most popular form of poker worldwide. Our data is acquired through an online service called

    HHDealer. In recent years, several companies have specialized in gathering and trading so-called

    hand histories from online poker rooms. 2 With software applications they continuously collect

    information on hands played at online poker tables. Numerous players buy these data to have

    information on the playing styles of others. Because of limited resources, hand history providers are

    unable to store data on every hand that is played online. Out of the websites that responded to our

    2

    A hand is the game that is played between two subsequent shuffles of the deck: dealing of cards, betting,and awarding of the pot (in another context, the term can also refer to the cards dealt to a player). Withplaying a hand we mean to be dealt in. Whenever possible, we avoid the use of poker ter minology. Webelieve that reading the paper does not require the readers understanding of the game.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    6/32

    [6]

    inquiries, HHDealer was able to provide the largest number of hands for an uninterrupted period of

    twelve months. We purchased all available data for the games that had been played at three

    particular stakes levels in the period October 2009 September 2010. In poker, stakes levels are

    distinguished by the size of the small and the big blind bet. To ground our analysis on distinct stakes

    levels, we selected data from so-called low, medium and high stakes games, with big blind

    sizes of $0.25, $2 and $10, respectively. These games are also known as $25NL, $200NL, and

    $1,000NL because of the maximum buy-in amount, which is normally one hundred times the big

    blind.

    The resulting raw data set contains a total of 76.7 million different hands. The average number of

    players participating in a hand is 5.4, yielding 415.9 million different player-hand observations. Of

    these, 151.4 million (36.4%) are from the low stakes games, 228.1 million (54.8%) are from the

    medium stakes, and 36.4 million (8.8%) are from the high stakes. The smallest number of

    observations in a month is recorded in February 2010 (16.5 million, or 4.0%), and relates to a

    software change that temporarily made data mining more difficult. The peak is in January 2010 (60.3

    million, or 14.5%).

    Table 1 summarizes the data. Our sample contains over half a million different players. 3 About

    375,000 of them played at least one hand at a low stakes table ($0.25 big blind), 222,000 played in a

    medium stakes game ($2 big blind) and 34,000 played in a large stakes game ($10 big blind). Playershardly switched between these three levels: nearly all hands (96%) were played at the stakes level at

    which the given player played most frequently. A small minority (16%) was active at more than one

    of the three levels, but even these players still played 90 percent of their hands at their most favorite

    level.

    Players who participated in a high stakes game played, on average, 1,085 hands at that particular

    level. For the medium and small stakes this number is 1,027 and 403, respectively. The average

    number of hands played at the three levels combined is 774. There is much variation across playersin the number of hands that they played. One exceptional player was involved in approximately

    760,000 hands (0.2% of our sample), while 56.2 percent of all players participated in less than one

    hundred hands. The degree of concentration is high: the one percent most active players played 58.1

    percent of all hands, and 12.5 percent played 90 percent. Figure 1 displays the cumulative

    distribution of hands played for each of the three different stakes levels.

    3 We interpret each account as a separate player. People are not allowed to have multiple accounts.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    7/32

    [7]

    Table 1: Summary Statistics

    The table shows descriptive statistics for our full sample of 415.9 million player-hand observations from realmoney ring games No Limit Texas Holdemat three different stakes levels. For each stakes level and for thethree levels combined, the first two rows show the number of players and the number of hands at theaggregate level. Note that some players have played at more than one stakes level. The other rows provide

    statistics at the player level on the number of hands played, the total winnings expressed as the number of bigblinds won, and the average number of big blinds won per hundred hands. Profitability statistics are shownwith and without correction for rake (the commission taken by the operator). Winnings are corrected for rakeby adding back rake in proportion to players contribution to the pot. The small, medium and large stakesgames have big blinds of $0.25, $2 and $10, respectively.

    Smallstakes

    Mediumstakes

    Largestakes

    Allstakes

    Total players 375,485 222,079 33,564 537,277

    Total hands (in millions) 151.4 228.1 36.4 415.9

    Hands

    Mean 403 1,027 1,085 774Minimum 1 1 1 1Median 54 84 70 75Maximum 343,971 759,602 461,743 760,701Stdev 2,621 8,381 7,004 6,191

    Big blinds won(raked)

    Mean 35 53 20 47Minimum 8,971 10,490 6,887 10,490Median 18 30 21 25Maximum 11,834 29,877 31,349 30,501Stdev 189 391 406 317

    Big blinds won(not raked)

    Mean 0 0 0 0

    Minimum 7,422 7,988 6,618 7,962Median 9 23 20 16Maximum 36,539 51,815 39,280 51,578Stdev 289 666 477 511

    Big blinds won per 100 hands(raked)

    Mean 102 106 106 97Minimum 15,696 15,740 12,673 15,740Median 28 32 24 28Maximum 11,020 15,000 10,030 15,000Stdev 473 459 588 441

    Big blinds won per 100 hands

    (not raked)

    Mean 90 99 104 88Minimum 15,547 15,715 12,671 15,715Median 19 26 22 21Maximum 11,150 15,000 10,035 15,000Stdev 467 458 588 438

    Table 1 also shows statistics on players winnings, both before and after commission taken by the

    operator. This commission is known as rake . To compare and combine performance statistics

    across stakes levels, winnings are scaled by the size of the big blind. For example, a profit of 5 big

    blinds corresponds to a profit of $50 at the high stakes, and $1.25 at the low stakes. To also account

    for differences in the number of hands played, performance is expressed as the number of big blinds

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    8/32

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    9/32

    [9]

    -100 -80 -60 -40 -20 0 20 40 60 80 1000

    0.005

    0.01

    0.015

    0.02

    0.025

    0.03

    0.035

    Big blinds per 100 hands

    D e n s i t y

    Figure 2: Distribution of Players Performance. The figure displays the distribution of the performance ofthe 51,067 players who participated in over 1,000 hands included in our sample of 415.9 million player-handobservations from re al money ring games No Limit Texas Holdem. For each player, performance is measuredas the number of big blinds won per 100 hands.

    that depends on the number of players at the table. To correct players winnings for rake, we add

    back the rake in proportion to their contributions to the pot. On average, rake reduces players

    performance in our sample by 9.5 bb/100. As a result of the fixed nominal cap, the effect of rake on

    players win rates is larger for games with smaller stakes. In the absence of rake, 38 percent of all

    players would have made a profit. The extreme values for the best and worst win rate scores in the

    table were recorded for very lucky and unlucky players who played only one or two hands.

    For our analysis of the role of skill in performance, we measure performance as the win rate in big

    blinds won per 100 hands before deduction of rake. Later on, we introduce a second performance

    measure. We control for rake, because we do not want our findings to be conditional on the rake

    structure that is employed by the operator. Rake is not an intrinsic element of the game, and the per-

    centages and caps differ across sites. Moreover, it remains unclear what amount of rake players truly

    pay. They can easily participate in reward schemes and receive deposit bonuses that partly make up

    for it, and via affiliates of the operator they can enter into so- called rakeback deals that reimburse

    25 percent or more of the rake that they paid. Figure 2 shows the distribution of rake-corrected

    performance for our full sample period for players who participated in more than 1,000 hands.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    10/32

    [10]

    II. Decile Analyses

    Under the null hypothesis that poker is a game of chance alone, there is no relation between a

    players performance scores across different subperiods. Alternatively, if skill plays a material role in

    the game of poker, we would expect a player s performance in one particular subperiod to beindicative of her performance in later subperiods. In this section we subdivide players into deciles

    based on their performance in the first six months of our sample period and examine how these

    various deciles fared in the last six months. We run similar analyses for quarterly periods. In the next

    section we look at the persistence and predictability of performance through regression analysis.

    Our sample period covers twelve consecutive months. For our first decile analyses we split up our

    data into two subsamples (October 2009 March 2010 and April September 2010), and rank the

    different players into deciles according to the average number of big blinds they have won per hand

    across the first period ( the ranking period). Because small collections of hands are likely to yield

    very noisy indicators of performance, we filter out players who have played less than 1,000 hands

    during this ranking period. This leaves a sample of 13,681 players for the small stakes level, 16,212

    for the medium stakes level, and 2,725 for the large stakes level. A total of 32,716 players

    participated in 1,000 or more hands at the three levels combined. On average, they played 5,831

    hands each (median: 2223). Next, we examine the average performance of the various deciles of

    players over the second period of six months (the measurement period) . To prevent selectioneffects, we impose no restriction on the number of hands in this measurement period. As explained

    in the previous section, hand outcomes are corrected for rake and scaled by the size of the big blind.

    Table 2 shows the results for the three individual stakes levels (Panel A, B and C) and for the three

    levels combined (Panel D). The left part of the table includes the average performance (in bb/100) for

    each decile over the ranking period, while the right part displays how well each decile fared in the

    measurement period. 5 In a nutshell, the results indicate that there is substantial and significant

    persistence in performance: deciles of players that performed relatively well in the first period onaverage continued to do so in the second period. The findings for individual stakes levels are

    generally similar to those for the three levels combined, and our discussion below therefore mainly

    concentrates on the aggregate results.

    5 Note that the deciles comprise less players in the measurement period than in the ranking period. Players

    either ceased to play at some point, moved up or down in stakes, or were simply not covered in our handhistories. For the three stakes levels combined, out of the 32,716 players who played at least 1,000 handsduring the first six months, a subgroup of 18,311 players were also active during the subsequent six months. Onaverage, they played 5,017 hands in this second period (median: 730) and 7,318 (median: 2530) in the first.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    11/32

    [11]

    Table 2: Standard Performance Measure Deciles for Six-Month Periods

    The table ranks all players who played 1,000 hands or more over the first six months of our sample period intodeciles by their performance over these six months, where performance is measured as the average number ofbig blinds won per hundred hands after correction for rake. For each decile, the first columns show the numberof included players ( N) and their average performance (bb/100) for this ranking period (Period 1). The next

    columns show the number of players from the decile who played at least one hand in the last six months of oursample period, as well as their average performance for this measurement period (Period 2) and how they rankon average relative to all other players who played at least one hand (Rank). Average decile performance(bb/100) is expressed both as a straight average (unweighted) and as a weighted average where the weightsare either players numb er of hands (weighted by n ) or the square roots of players number of hands ( weightedby n). Panel A (B/C) shows the results for observations from the small (medium/high) stakes level only. PanelD shows the results for observations from all stakes levels combined. For each panel, the table shows theSpearman rank correlation between the two periods for the average performance at the decile level (for eachweighting method) and for performance at the player level (in Rank column).

    Period 1 Period 2

    bb/100 bb/100Decile N

    un-weighted

    weightedby n

    weightedby n

    Nun-

    weightedweighted

    by nweighted

    by nRank

    Panel A: Small stakes1 1,368 34.7 34.1 33.3 619 7.7 4.8 8.3 2,9312 1,368 19.9 19.8 19.6 660 3.6 6.9 9.3 2,9033 1,368 14.1 14.1 14.0 654 3.5 6.5 8.8 2,9324 1,368 10.2 10.2 10.2 641 7.3 6.3 8.0 2,9385 1,369 7.2 7.2 7.2 598 1.0 6.7 7.8 2,8826 1,368 4.4 4.4 4.5 587 3.3 5.0 7.3 3,0507 1,368 1.4 1.5 1.5 571 9.7 4.4 6.9 3,0828 1,368 2.7 2.7 2.6 573 4.8 2.3 5.8 3,2169 1,368 8.9 8.9 8.8 584 8.1 1.3 3.2 3,326

    10 1,368 29.8 29.2 28.6 683 16.3 11.3 5.9 3,584Correlation 0.455 0.782 0.964 0.110( p-value) (0.191) (0.012) (0.000) (0.000)

    Panel B: Medium stakes1 1,621 34.8 33.5 31.7 959 18.7 -3.3 4.1 4,3992 1,621 16.0 15.7 15.3 976 22.1 1.4 6.8 4.1543 1,622 10.0 9.9 9.7 938 13.7 3.8 6.9 3,9444 1,621 6.2 6.2 6.2 910 22.2 2.7 5.8 4,0475 1,621 2.9 3.0 3.2 839 12.2 0.8 4.3 4,1396 1,621 0.7 0.6 0.5 861 26.0 2.7 2.7 4,436

    7 1,621 5.3 5.3 5.2 842 26.1 5.3 1.4 4,6158 1,622 11.3 11.2 11.1 841 33.4 6.4 1.4 4,6139 1,621 21.0 20.8 20.5 874 50.8 13.3 4.5 4,986

    10 1,621 46.7 46.0 45.0 890 57.2 24.1 9.6 5,400Correlation 0.842 0.782 0.855 0.135( p-value) (0.004) (0.012) (0.004) (0.000)

    We first discuss the results where second period decile performance is calculated as the unweighted

    average performance across players. In general, players from higher-ranked deciles outperform

    players from lower-ranked deciles. For example, the average player from the top decile for the three

    stakes levels combined lost 17.5 bb/100, while the average player from the bottom decile lost 47.3

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    12/32

    [12]

    Table 2 (continued)

    Period 1 Period 2bb/100 bb/100

    Decile Nun-

    weighted

    weighted

    by n

    weighted

    by nN

    un-

    weighted

    weighted

    by n

    weighted

    by nRank

    Panel C: High stakes1 273 36.6 35.5 33.7 176 4.8 3.8 6.0 7802 272 16.2 16.0 15.8 176 12.5 0.8 2.5 8573 273 9.6 9.6 9.5 181 5.6 3.2 3.1 7904 272 5.9 5.8 5.8 187 3.0 2.3 4.2 7945 273 2.9 2.8 2.8 187 2.0 1.8 3.9 8236 272 0.1 0.0 0.1 176 17.1 0.9 3.8 8807 273 4.2 4.1 4.1 161 6.4 2.7 1.4 8448 272 9.3 9.3 9.3 144 5.6 3.2 1.6 8719 273 17.0 16.8 16.6 161 33.1 8.8 1.9 961

    10 27240.9 40.0 38.8

    14723.1 11.2 2.6

    916Correlation 0.612 0.879 0.806 0.093( p-value) (0.066) (0.002) (0.008) (0.000)

    Panel D: All stakes1 3,272 34.1 33.2 32.1 1,840 17.5 0.2 5.9 8,8682 3,271 17.5 17.4 17.1 1,897 12.2 3.2 7.1 8,4663 3,272 11.6 11.5 11.4 1,854 6.8 4.8 7.1 8,2264 3,271 7.7 7.7 7.7 1,835 17.6 4.1 6.7 8,4005 3,272 4.6 4.6 4.7 1,825 9.9 2.9 5.4 8,5506 3,272 1.4 1.5 1.6 1,744 15.7 2.0 4.8 8,7107 3,271 2.6 2.6 2.4 1,790 18.9 2.4 2.7 9,5058 3,272 8.0 7.9 7.8 1,725 16.6 4.1 1.1 9,548

    9 3,271 16.5 16.3 16.2 1,868 26.2 8.5 2.1 10,03110 3,272 41.5 40.7 39.7 1,933 47.3 21.2 9.2 11,172

    Correlation 0.624 0.782 0.927 0.142( p-value) (0.060) (0.012) (0.000) (0.000)

    bb/100; the difference of 29.9 bb/100 is statistically significant ( t = 5.054; p < 0.001). However, across

    all deciles, the Spearman rank correlation between the average decile performances in the ranking

    period and those in the measurement period is only marginally significant ( = 0.624; p = 0.060). At

    the individual stakes levels, this correlation analysis yields a mixed picture: no significant correlation

    for the small stakes ( = 0.455; p = 0.191), significant correlation for the medium stakes ( = 0.842;

    p = 0.004), and only marginally significant correlation for the large stakes ( = 0.612; p = 0.066).

    The unweighted average in period two is negative for virtually all deciles. This result is related to the

    equal weight assigned to every player in calculating decile performance. There is much variation

    across players in the number of hands they played in Period 2: this number ranges from 1 to 622,915.

    Because liquidity constraints can force players to stop playing when losses accumulate, a negative

    average result from a bad sequence of hands is less likely to be cancelled out or diluted by

    subsequent hands than a positive result after a streak of luck. As a result, (extremely) negative

    average performances at the player level are more likely to occur than (extremely) positive average

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    13/32

    [13]

    performances. Indeed, players who played relatively few hands in Period 2 have lower scores: those

    who played less than 100 hands (18.1% of all active players) recorded a score of 80.3 bb/100, while

    the others (81.9%) recorded 5.5 bb/100 on average (not tabulated).

    The substantial share of players who played relatively few hands in the measurement period mayalso explain why the decile-level correlation between the average performances in the ranking and

    measurement period is only marginally significant. The average performance across such a short

    sequence of hands is relatively noisy, and the widely varying scores consequently distort the strength

    of the correlation. In fact, players who played only a few hands are given a questionably large weight

    when decile performance is expressed as a straight average across players. Using a weighted average

    with players number of hands as weights would avoid this problem, and we therefore propose this

    measure as an alternative indicator. (Note that this weighted average is identical to the average

    profitability per hand across all hands played by the players in a decile combined.) Because players

    who played only infrequently are hardly reflected in this alternative measure, we also consider a

    compromise weighting method that uses the square roots of players number s of hands as weights.

    Evaluating on the weighted average performance measures strengthens the pattern observed.

    Players from higher-ranked deciles again outperform players from lower-ranked deciles in Period 2.

    For example, hands played by players in the top decile yielded a profit of 5.9 bb/100 across all stakes

    levels, while hands played by bottom decile players lead to a loss of 9.2 bb/100 (difference: 15.1bb/100; t = 15.290; p < 0.001). The Spearman rank correlations across the deciles between

    performances in the two periods are generally higher with weighted than with unweighted average

    scores and also always statistically significant both for the three individual stakes levels and for the

    three levels combined (for all stakes and weightings: 0.782, p 0.012). Note that the second

    period performance is positive for most deciles when players numbers of hands are used as weights.

    This is striking, because, by definition, the average winnings per hand are zero across all hands in our

    sample. Apparently, players who played 1,000 or more hands in the prior six months played more

    profitable than others. This, in itself, might indicate that experience pays off in this game.

    The persistence of performance also appears from how players in a given decile rank relative to all

    other players in Period 2. The last column of Table 2 shows that players from higher-ranked deciles

    generally rank higher than players from lower-ranked deciles do. For example, for all stakes levels

    combined, the average rank of top-decile players is 8,868 (out of 18,311), while that of bottom-decile

    players is 11,172 ( t = 12.391; p < 0.001).

    At the individual player level, the strength of the correlation between players Period 1 and Period 2

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    14/32

    [14]

    Table 3: Standard Performance Measure Deciles for Three-Month Periods

    The table ranks all players who played 1,000 hands or more over a given three-month period into deciles bytheir performance over these months at the small, medium and large stakes combined. The performancemeasure that is used to rank players is the standard performance measure, where performance is measured asthe average number of big blinds won per hundred hands after correction for rake (bb/100). For each resulting

    decile, statistics are shown for the ranking period (Period 1) and for the subsequent measurement period ofthree months (Period 2). Our one-year sample period is divided into four quarters, where Q1 =October December 2009, Q2 = January March 2010, Q3 = April June 2010, and Q4 = July September2010. In Panel A (B/C), the ranking period is Q1 (Q2/Q3) and the measurement period is Q2 (Q3/Q4). Apartfrom the time intervals, all statistics are defined as in Table 2.

    Period 1 Period 2bb/100 bb/100

    Decile Nun-

    weightedweighted

    by nweighted

    by nN

    un-weighted

    weightedby n

    weightedby n

    Rank

    Panel A: Ranking in Q1, measurement in Q2

    1 1,997 34.6 33.8 32.9 1,362 16.8 0.7 4.2 6,7602 1,997 17.7 17.5 17.3 1,412 9.0 2.4 5.7 6,5903 1,998 11.7 11.7 11.6 1,423 3.7 4.3 6.7 6,3544 1,997 8.0 8.0 7.9 1,393 7.2 4.6 6.6 6,3625 1,997 5.0 5.1 5.1 1,437 6.2 3.3 5.6 6,6066 1,997 2.2 2.3 2.3 1,378 6.6 2.3 4.5 6,5927 1,997 1.3 1.2 1.1 1,347 6.7 0.9 3.9 6,9078 1,998 6.1 6.1 6.0 1,336 11.4 2.0 1.7 7,1939 1,997 13.6 13.5 13.4 1,324 21.7 5.5 0.4 7,583

    10 1,997 36.0 35.3 34.5 1,406 36.5 13.7 6.3 8,199Correlation 0.406 0.661 0.782 0.114( p-value) (0.247) (0.044) (0.012) (0.000)

    Panel B: Ranking in Q2, measurement in Q31 1,584 37.3 36.4 35.4 978 12.3 0.7 5.6 4,7802 1,584 19.9 19.8 19.6 1,026 16.0 3.9 7.2 4,5513 1,584 13.5 13.4 13.2 1,014 3.9 5.3 7.8 4,4514 1,584 9.1 9.1 9.0 999 2.1 5.3 6.7 4,3415 1,584 5.7 5.7 5.7 969 4.4 3.2 5.7 4,6746 1,583 2.4 2.4 2.4 936 12.5 2.3 4.7 4,6587 1,584 1.6 1.5 1.4 937 12.2 0.8 3.9 4,9378 1,584 6.6 6.6 6.5 938 13.7 1.6 3.0 5,0619 1,584 14.9 14.8 14.8 958 24.2 6.2 0.3 5,275

    10 1,584 40.3 39.4 38.3 987 51.7 17.8 7.3 6,028Correlation 0.539 0.697 0.867 0.132( p-value) (0.113) (0.031) (0.003) (0.000)

    Panel C: Ranking in Q3, measurement in Q41 1,167 37.4 36.3 34.9 780 14.0 3.6 7.3 3,7692 1,166 19.7 19.6 19.5 811 11.7 4.5 7.2 3,7003 1,167 13.5 13.4 13.2 835 6.7 5.1 7.5 3,6664 1,166 9.3 9.3 9.3 871 3.6 5.5 7.2 3,5995 1,167 5.9 5.9 6.0 848 2.3 4.3 5.7 3,7096 1,167 2.5 2.5 2.5 802 11.6 2.0 4.7 3,9527 1,166 1.5 1.4 1.3 751 12.3 0.9 4.2 4,0728 1,167 7.1 7.0 6.9 755 15.7 0.4 2.8 4,1329 1,166 16.0 15.9 15.7 773 34.0 6.2 0.9 4,527

    10 1,167 42.0 41.1 40.1 790 31.2 18.6 9.5 5,068Correlation 0.564 0.830 0.964 0.162( p-value) (0.096) (0.006) (0.000) (0.000)

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    15/32

    [15]

    ranks is rather moderate. The correlation coefficient ranges between 0.093 (for the large stakes) and

    0.142 (for the three stakes levels combined). The relatively low degree of correlation as compared to

    the correlation coefficients at the decile level reflects the relevance of variance in performance at the

    individual level in particular of the variance for players who played only few hands in Period 2.

    Statistically, however, the rank correlation at the individual player level is highly significant for every

    (sub)sample (all p < 0.001).

    Table 3 reports the results for a similar analysis that uses three rather than six months as the ranking

    and measurement period. Regardless of the pair of successive quarters that we use, we observe the

    same pattern of persistence as in the previous analyses: higher-ranked deciles generally outperform

    lower-ranked deciles. Again, the correlations are stronger when we reduce the influence of relatively

    infrequent players by calculating performance as a weighted average. At the individual player level,

    the rank correlation is always highly significant (all p < 0.001).

    Thus far we have ranked players on the basis of their average performance in big blinds. Though

    simple and natural, this approach ignores the importance of differences between players in the

    number of hands that they played. Few would share the view that a player who has won 100 big

    blinds over 200 hands (50 bb/100h) is to be considered a better performing player than someone

    who has won 40,000 big blinds over 100,000 hands (40 bb/100h). One of the drawbacks of the

    previous approach is that it does not account for the basic statistic rule that the sampling distribution

    of the mean depends on the sample size ( n i ): the greater (smaller) the number of observations, the

    less (more) likely that their mean takes an extreme value. For example, if we consider two players

    with equal ability from a larger population, the player who participates in a smaller number of hands

    is more likely to be classified in one of the top or bottom deciles if players are ranked by their

    average winnings per hand. Similarly, the previous approach does not account for differences in

    playing style or the standard deviation of winnings ( s i ): when two players are equally profitable, the

    more adventurous (conservative) player is more (less) likely to end up in one of the two extremes of

    the ranking.

    We therefore propose an alternative measure to rank players that does account for the number of

    hands and playing style of an individual player ( i ):

    (1)i i

    i

    i i

    i i

    i

    i i

    ns

    BB

    n / s

    n / BBratewinstdev

    ratewinPRM

    where BBi is the sum of big blinds won (before deduction of rake), s i is the standard deviation of big

    blinds won, and n i is the number of hands played. We label this measure th e performance

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    16/32

    [16]

    Table 4: Performance Robustness Measure Deciles for Six-Month Periods

    The table ranks all players who played 1,000 hands or more over the first six months of our sample period intodeciles by their performance over these six months. Here, the performance measure that is used to rankplayers is the performance robustness measure, which is defined as the average number of big blinds won aftercorrection for rake divided by its estimated standard error. The estimated standard error is the sample

    standard deviation of the rake-corrected winnings per hand divided by the square root of the number of hands.The statistics shown for each resulting decile are defined as in Table 2. The various panels and correlationcoefficients are also identically defined.

    Period 1 Period 2bb/100 bb/100

    Decile Nun-

    weightedweighted

    by nweighted

    by nN

    un-weighted

    weightedby n

    weightedby n

    Rank

    Panel A: Small stakes1 1,368 23.9 18.9 14.7 664 3.9 8.4 9.3 2,7202 1,368 20.4 16.9 13.4 604 4.3 7.0 8.3 2,921

    3 1,368 16.6 14.2 11.7 597 18.5 4.8 7.1 2,9494 1,368 13.4 11.6 9.6 615 7.7 7.2 8.3 2,8245 1,369 9.7 8.5 7.1 624 14.6 3.8 7.5 3,1216 1,368 6.1 5.4 4.7 602 10.8 4.1 6.3 3,0757 1,368 1.9 1.7 1.4 629 6.3 3.3 6.3 3,1068 1,368 3.3 3.0 2.6 598 9.5 1.0 5.7 3,2979 1,368 10.2 9.4 8.3 590 8.6 1.9 2.7 3,281

    10 1,368 28.0 25.9 23.3 647 12.3 9.4 4.8 3,569Correlation 0.321 0.952 0.952 0.128( p-value) (0.368) (0.000) (0.000) (0.000)

    Panel B: Medium stakes1 1,621 21.1 13.1 9.6 1,081 3.6 5.6 6.8 3,5502 1,621 20.3 14.8 10.1 931 7.7 2.5 5.7 3,9543 1,622 14.7 11.3 7.8 916 21.9 0.2 4.9 4,2754 1,621 9.5 7.7 5.7 862 29.8 4.1 3.2 4,5115 1,621 4.2 3.5 2.7 822 29.7 4.4 2.8 4,5076 1,621 0.9 0.8 0.6 872 37.8 5.5 1.6 4,6087 1,621 6.6 5.7 4.5 853 27.1 5.6 1.6 4,5938 1,622 12.9 11.3 9.4 853 39.8 7.3 1.1 4,7429 1,621 22.9 20.4 17.1 859 41.4 12.9 2.4 4,962

    10 1,621 41.6 37.5 31.8 881 49.7 18.0 6.9 5,227Correlation 0.915 1.000 1.000 0.178( p-value) (0.000) (0.000) (0.000) (0.000)

    Panel C: High stakes1 273 23.5 15.0 10.3 207 12.8 4.6 4.3 6922 272 20.8 13.9 7.9 177 0.9 2.6 3.8 7963 273 13.7 10.0 6.6 182 14.6 1.0 1.8 8884 272 8.6 6.8 4.8 175 11.4 0.3 4.0 8395 273 4.3 3.4 2.4 170 2.8 2.4 5.4 7996 272 0.2 0.1 0.1 169 14.6 0.2 2.9 8847 273 5.7 4.6 3.6 155 13.3 3.3 0.1 8748 272 11.5 9.4 6.9 157 3.3 4.2 1.0 8929 273 18.7 15.4 11.7 149 36.0 7.4 2.1 930

    10 272 35.3 29.7 23.1 155 22.3 10.4 2.8 948Correlation 0.697 0.915 0.806 0.125

    ( p-value) (0.031) (0.000) (0.008) (0.000)

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    17/32

    [17]

    Table 4 (continued)

    Period 1 Period 2bb/100 bb/100

    Decile Nun-

    weighted

    weighted

    by n

    weighted

    by nN

    un-

    weighted

    weighted

    by n

    weighted

    by nRank

    Panel D: All stakes1 3,272 21.8 14.7 10.5 1,994 2.1 6.4 7.2 7,5602 3,271 20.0 15.4 10.8 1,825 5.4 4.2 6.4 8,1933 3,272 15.5 12.4 9.0 1,814 6.9 2.9 5.7 8,4544 3,271 11.0 9.1 6.8 1,806 20.4 0.9 5.4 8,9075 3,272 6.7 5.6 4.4 1,772 25.6 0.5 4.2 9,0606 3,272 2.0 1.7 1.4 1,790 21.1 0.8 4.3 9,1937 3,271 3.4 2.9 2.4 1,809 22.7 3.7 2.4 9,5858 3,272 9.4 8.3 6.7 1,794 21.6 5.7 1.0 9,8059 3,271 18.2 16.2 13.7 1,828 26.9 8.2 0.7 10,009

    10 3,27237.5 33.8 28.7

    1,87938.9 16.1 6.4

    10,894Correlation 0.915 1.000 0.988 0.174( p-value) (0.000) (0.000) (0.000) (0.000)

    robustness measure (PRM). In fact, PRM i equals the t -value of a test of a players observed

    performance against the null-hypothesis of zero expected performance.

    Table 4 presents the new results for six-month periods. Accounting for playing style and number of

    hands in the measurement period strengthens the previous evidence for performance persistence.

    Deciles of players who rank higher by their PRM i again generally fare better than lower-ranked

    deciles. However, the new ranking method turns out to be more accurate: in many cases, the

    performance of a decile in Period 2 is now perfectly or almost perfectly monotonically decreasing

    with the rank of a decile for Period 1. For example, for the aggregate data, the rank correlation is

    0.988 (1.000) when Period 2 decile performance is measured with (the square root of) players

    numbers of hands as weights. For each stakes level, the rank correlation of performance at the

    individual player level is stronger as well. The new coefficients are about two to four percentage

    points larger, and range between 0.125 (large stakes) and 0.178 (medium stakes). Table 5 reports

    similar results for three-month periods.

    Another way to look at the persistence of performance is through transition probabilities. Table 6

    shows transition probabilities across performance deciles for players who played 1,000 hands or

    more over the first six months of our sample period. These players are ranked on the basis of their

    performance twice: for Period 1 and for Period 2. Each probability indicates the empirical probability

    of transitioning from a given performance decile in the first half-year period to a given performance

    decile in the second half-year period. Players for whom we have no observations for the second

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    18/32

    [18]

    Table 5: Performance Robustness Measure Deciles for Three-Month Periods

    The table ranks all players who played 1,000 hands or more over a given three-month period into deciles bytheir performance over these months at the small, medium and large stakes combined. The performancemeasure that is used to rank players is the performance robustness measure (see notes to Table 4). For eachresulting decile, statistics are shown for the ranking period (Period 1) and for the subsequent measurement

    period of three months (Period 2). Our one-year sample period is divided into four quarters, where Q1 =October December 2009, Q2 = January March 2010, Q3 = April June 2010, and Q4 = July September2010. In Panel A (B/C), the ranking period is Q1 (Q2/Q3) and the measurement period is Q2 (Q3/Q4). Apartfrom the time intervals, all statistics are defined as in Table 2.

    Period 1 Period 2bb/100 bb/100

    Decile Nun-

    weightedweighted

    by nweighted

    by nN

    un-weighted

    weightedby n

    weightedby n

    Rank

    Panel A: Ranking in Q1, measurement in Q21 1,997 22.5 15.7 11.2 1,514 0.6 5.2 6.5 6,110

    2 1,997 19.9 15.6 11.3 1,404 5.4 4.6 6.5 6,2573 1,998 15.2 12.2 9.0 1,384 10.4 2.9 5.5 6,5254 1,997 11.3 9.4 7.1 1,415 5.1 2.6 5.4 6,5935 1,997 7.3 6.2 4.9 1,342 12.4 1.5 4.6 6,7466 1,997 3.2 2.7 2.2 1,348 14.9 0.8 3.1 7,0207 1,997 1.6 1.4 1.1 1,355 13.7 1.1 3.5 7,1668 1,998 7.5 6.7 5.6 1,325 12.4 2.5 1.5 7,1149 1,997 15.0 13.5 11.5 1,374 21.9 5.3 1.0 7,723

    10 1,997 32.9 29.8 25.5 1,357 30.4 10.5 4.8 7,972Correlation 0.879 1.000 0.976 0.142( p-value) (0.002) (0.000) (0.000) (0.000)

    Panel B: Ranking in Q2, measurement in Q3

    1 1,584 25.4 18.3 13.3 1,102 2.0 6.3 7.3 4,1852 1,584 21.3 16.5 12.0 1,017 2.7 4.5 6.7 4,4833 1,584 17.3 14.1 10.6 970 10.6 3.3 6.4 4,6584 1,584 12.6 10.3 7.8 953 7.8 3.7 5.3 4,5555 1,584 7.9 6.7 5.3 926 10.9 0.7 4.4 4,7926 1,583 3.2 2.8 2.2 950 12.6 0.5 4.2 4,8027 1,584 2.0 1.7 1.4 948 13.6 0.9 3.2 5,0108 1,584 7.8 6.9 5.8 943 19.3 3.8 1.8 5,2239 1,584 16.5 14.6 12.3 967 26.1 4.8 2.7 5,282

    10 1,584 37.0 33.5 29.0 966 45.3 13.1 4.0 5,845Correlation 0.988 0.988 0.988 0.158( p-value) (0.000) (0.000) (0.000) (0.000)

    Panel C: Ranking in Q3, measurement in Q41 1,167 24.9 17.3 12.6 910 3.0 6.8 7.3 3,2922 1,166 22.2 17.3 12.7 821 6.9 4.7 6.3 3,5993 1,167 17.2 14.0 10.7 836 13.0 5.3 7.6 3,6234 1,166 12.6 10.4 7.8 812 7.7 2.7 5.1 3,9425 1,167 8.1 6.8 5.3 804 9.0 2.0 5.5 3,9076 1,167 3.3 2.8 2.2 767 12.2 0.1 3.6 4,1087 1,166 1.8 1.6 1.3 744 25.9 0.3 4.3 4,1398 1,167 8.2 7.2 5.9 755 19.6 2.5 2.6 4,2779 1,166 17.6 15.9 13.8 772 14.8 4.5 0.9 4,412

    10 1,167 38.8 35.1 30.2 795 38.3 13.6 4.0 4,962

    Correlation 0.879 0.988 0.939 0.194( p-value) (0.002) (0.000) (0.000) (0.000)

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    19/32

    [19]

    Table 6: Transition Probabilities

    The table shows the transition probabilities across performance deciles for players who played 1,000 hands ormore over the first six months of our sample period. Each probability indicates the empirical probability oftransitioning from a given performance decile in the first half-year period (HY1) to a given performance decilein the second half-year period (HY2). In Panel A, the performance measure that is used to rank players in both

    HY1 and HY2 is the standard performance measure, where performance is measured as the average number ofbig blinds won per hundred hands after correction for rake (bb/100). In Panel B, the performance measure thatis used for both HY1 and HY2 is the performance robustness measure, which is defined as the average numberof big blinds won after correction for rake divided by its estimated standard error. Players for whom we haveno observations for HY2 are not included in the HY2 ranking.

    HY1decile

    HY2 decile1 2 3 4 5 6 7 8 9 10

    Panel A: Ranking by Standard Performance Measure1 0.132 0.121 0.106 0.079 0.078 0.078 0.096 0.096 0.102 0.1132 0.110 0.123 0.124 0.109 0.099 0.090 0.094 0.090 0.073 0.089

    3 0.090 0.123 0.135 0.129 0.113 0.107 0.088 0.077 0.079 0.0594 0.072 0.094 0.143 0.151 0.143 0.104 0.097 0.067 0.059 0.0705 0.084 0.087 0.119 0.137 0.148 0.115 0.095 0.081 0.069 0.0646 0.102 0.096 0.096 0.132 0.112 0.114 0.097 0.099 0.076 0.0757 0.085 0.091 0.098 0.080 0.103 0.115 0.112 0.118 0.104 0.0948 0.110 0.099 0.066 0.075 0.084 0.122 0.111 0.119 0.114 0.1019 0.107 0.086 0.070 0.060 0.074 0.097 0.121 0.130 0.137 0.118

    10 0.107 0.080 0.043 0.051 0.050 0.062 0.090 0.124 0.184 0.210Panel B: Ranking by Performance Robustness Measure

    1 0.234 0.133 0.139 0.105 0.089 0.083 0.072 0.061 0.046 0.0402 0.139 0.119 0.109 0.114 0.108 0.111 0.086 0.092 0.065 0.0563 0.122 0.116 0.108 0.106 0.115 0.089 0.102 0.087 0.080 0.0764 0.103 0.114 0.102 0.098 0.101 0.110 0.094 0.096 0.096 0.0875 0.086 0.103 0.097 0.122 0.103 0.102 0.104 0.093 0.099 0.0906 0.087 0.093 0.106 0.099 0.098 0.111 0.098 0.108 0.104 0.0967 0.065 0.096 0.099 0.083 0.103 0.102 0.109 0.116 0.118 0.1098 0.064 0.087 0.077 0.096 0.099 0.100 0.116 0.110 0.133 0.1189 0.048 0.079 0.091 0.094 0.092 0.109 0.109 0.114 0.129 0.136

    10 0.041 0.059 0.068 0.084 0.094 0.087 0.113 0.126 0.133 0.196

    period are not included in the ranking for the second period, so essentially the probabilities are

    conditional on participation in the second six months.

    In Panel A, the performance measure that is used to rank players is the standard performance

    measure (bb/100) after correction for rake. The fraction of players in the top decile of Period 1 who

    end up in the top (second-best) decile in Period 2 is 13.2 (12.1) percent; players who are in the worst

    decile end up in the worst (second-worst) decile 21.0 (18.4) percent of the time. These empirical

    probabilities are all substantially greater than the value of 10 percent that would be expected under

    the null hypothesis of no performance persistence (all p < 0.001). At the same time, however, there is

    also some evidence that the likelihood of ending up at the opposite extreme is greater than 10

    percent. For example, the chance of transitioning from the very best (worst) category to the very

    worst (best) is 11.3 (10.7) percent. This pattern is symptomatic of the inadequacy of the ranking

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    20/32

    [20]

    measure used here: players with a higher variance of their average winnings due to adventurous or

    infrequent play are more likely to end up in the extreme win rate categories. Ranking players on the

    basis of our alternative performance robustness measure controls for this variance effect.

    In Panel B, players are ranked on the basis of their PRM i . The results are compelling: players from thetop decile reappear in this decile 23.4 percent of the time, and with a probability of 4.0 percent they

    end up in the bottom decile relatively infrequently. Similarly, losers are unlikely to turn into winners:

    the worst ten percent of players rank among the best ten percent in the next six months only 4.1

    percent of the time and among the worst ten percent 19.6 percent of the time. The empirical

    probabilities are even more telling when we look at percentiles (not tabulated): the very best one

    percent of players in Period 1 rank among the very best one percent in Period 2 12.3 percent of the

    time, and among the best ten percent 41.1 percent of the time (12.3 and 4.1 times the base rate).

    They are among the worst ten percent only 2.0 percent of the time. Similarly, the least successful

    players from Period 1 often keep performing badly: the worst percentile stay in that category 10.5

    percent of the time, and belong to the worst decile in 31.0 percent of the cases. They rarely

    outperform: the best decile is reached only 3.5 percent of the time.

    III. Regression Analyses

    To further analyze the role of skill we regress performance over the final six months of our sample on

    performance over the first six months, and on other measures that may serve as skill proxies. We

    consider the following variables:

    - SPM: the standard performance measure or win rate, defined as the average number of

    big blinds won per hundred hands after correction for rake.

    - PRM: the performance robustness measure, defined as the average number of big blinds

    won per hand after correction for rake divided by its estimated standard error. The

    estimated standard error is the sample standard deviation of the rake-corrected winnings

    per hand divided by the square root of the number of hands.

    - Hands (log) : the natural logarithm of the number of hands played. This variable is a proxy for

    the experience of players and thus a possible indicator of skill.

    - Tightness : one minus the proportion of hands in which a player voluntarily wagered money

    in the first betting round (called or raised before the flop). The degree of tightness is one ofthe two simple measures that are typically used to broadly categorize players playing styles.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    21/32

    [21]

    Generally, tighter play is thought to be indicative of a better player. Common mistakes in

    poker are to impatiently look for action and to overestimate the profitability of playing a

    given hand.

    - Aggressiveness : the number of time a player led the betting (bet or raised) as aproportion of the total number of times the player volun tarily wagered money (bet,

    called or raised). The aggression factor is the other of the two simple playing style

    measures. Aggressive play is generally thought to yield a higher expected performance than

    passive play, because increasing the cost of playing at the right times can pressure other

    players to give up stronger cards or to pay off with weaker ones.

    To avoid endogeneity issues, all explanatory variables are based on data from Period 1 (the first six

    months) only. We run two sets of regressions, one for the standard performance measure and the

    other for our performance robustness measure. In the former case, we face the issue of

    heteroskedasticity: the variance of the error term is proportional to the sample variance of the

    number of big blinds won ( 2i s ) and inversely proportional to the number of hands played ( n i ) in

    Period 2. We therefore apply weighted least squares (WLS) to estimate these regression models,

    with the inverse of the variance of the error term, 2i i s / n , as weight. Intuitively it also seems obvious

    that the more precise performance measurements of players with a lower variance and a larger

    number of hands should be given more weight. When our performance robustness measure is the

    dependent variable we use ordinary least squares (OLS) because the errors have constant variance by

    construction.

    Panel A of Table 7 presents the WLS results for the standard performance measure. In each

    univariate regression, performance is significantly related to the skill proxy from the previous period

    (all p < 0.001). Not just the historical performance measure (Model 1), but also the number of hands

    played (Model 2) and the two style measures (Models 3 and 4) predict performance to a limited but

    statistically significant extent. Players who participated in more hands in the previous period perform

    better, as do players who adopted a tight and aggressive playing style. Combined, the measures

    explain 6.5 percent of the variance in performance. The smaller-than-unity coefficient in Model 1

    suggests that there is regression to the mean in players performance over time . There is, however,

    good reason to believe that this coefficient systematically underestimates the effect of skill on

    performance. We will address this issue below.

    We obtain qualitatively similar results when we use our performance robustness measure (Panel B).

    However, for most individual variables the explanatory power is now higher. The percentage of

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    22/32

    [22]

    Table 7: Regression Results

    The table displays regression analysis results for our subsample of players who played 1,000 hands or moreover the first six months of our sample period and at least 1 hand over the second six months. The dependentvariable is either the standard performance measure (Panel A) or the performance robustness measure (PanelB). The standard performance measure is defined as the average number of big blinds won per hundred hands

    after correction for rake (bb/100). The performance robustness measure is the average number of big blindswon after correction for rake divided by its estimated standard error. The explanatory variables are calculatedusing data from Period 1 only. SPM is the players standard performance measure or win rate . PRM is herperformance robustness measure. Hands (log) is the natural logarithm of the number of hands played.Tightness is one minus the proportion of hands in which the player voluntarily wagered money in the firstbetting round. Aggressiveness is the number of time she led the betting as a proportion of the total number oftimes the player voluntarily wagered money. The results reported in Panel A are weighted least squaresregression results with the ratio of a players number of hands and her sample variance of the number of bigblinds won in Period 2 as weight. Panel B presents ordinary least squares results. In all regressions N = 18,311.The p-values are in parentheses.

    Model 1 Model 2 Model 3 Model 4 Model 5Panel A: Standard Performance Measure (WLS)Constant 3.999 2.103 18.756 9.015 21.200

    (0.000) (0.001) (0.000) (0.000) (0.000)SPM 0.233 0.188

    (0.000) (0.000)Hands (log) 0.780 0.286

    (0.000) (0.000)Tightness 29.266 20.708

    (0.000) (0.000) Aggressiveness 19.416 7.656

    (0.000) (0.000)

    R2 0.040 0.008 0.035 0.014 0.065

    Panel B: Performance Robustness Measure (OLS)Constant 0.051 2.831 1.896 2.156 3.476

    (0.000) (0.000) (0.000) (0.000) (0.000)PRM 0.301 0.178

    (0.000) (0.000)Hands (log) 0.354 0.185

    (0.000) (0.000)Tightness 2.677 1.400

    (0.000) (0.000) Aggressiveness 3.146 1.359

    (0.000) (0.000)R2 0.080 0.067 0.056 0.030 0.125

    variance explained by the joint skill proxies is 12.5 percent, which is about twice as high as the

    empirical fit of the previous multivariate specification. The performance robustness measure

    provides the largest contribution to the predictive power. In isolation, this variable explains 8.0

    percent of the variance in performance.

    Although these results reinforce our earlier findings of performance persistence and the role of skill

    in poker, the major part of variance in performance remains unexplained by the models and appears

    to be attributable to chance. However, an issue that we have not yet explicitly addressed so far is the

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    23/32

    [23]

    problem of errors in variables. In the ideal situation we would know every players precise skill level,

    but given the lack of this information we have to use noisy proxies. When explanatory variables are

    mismeasured, coefficients estimated via standard regression methods are biased towards zero and

    the true explanatory power of the variables is underestimated. The low empirical fit of the regression

    models indicates that measurement error is a serious issue for the historical performance measures:

    if a random factor explains much of the variation in performance, any measurement of previous

    performance is likely to be subject to a large degree of randomness as well. 6

    The bias of the estimated coefficient towards zero as a consequence of measurement error is known

    as attenuation or regression dilution . Although measurement error is not a problem for predictive

    modeling, it may give an unjust impression of the size of the effect of skill on performance here and

    falsely suggest that a players skill is not a stable quality over time. To account for error in both the

    dependent and the independent variable we therefore also run a so-called Deming regression

    (methodological details are in the Appendix). The results indicate that the standard regression

    understates the size of the effect of skill on performance to a considerable extent: when the win rate

    from Period 2 is regressed on the win rate from Period 1, we now obtain a coefficient of 1.758

    (t = 16.584; p < 0.001). When the performance robustness measure is used for the dependent and for

    the independent variable, the new coefficient is 1.238 ( t = 37.416; p < 0.001). The new coefficients

    are not only clearly higher than the values of 0.233 and 0.301 reported before, but also significantly

    greater than unity ( p < 0.001). We have no clear explanation for the latter; taken at face value, the

    coefficients indicate that the disparity in performance between players increases over time.

    The underestimation of the true explanatory power of skill as a consequence of measurement error

    decreases with the number of hands that are used to calculate the proxy for skill. With more

    underlying observations, measurement error becomes relatively less important: for both historical

    performance measures the variance of the measurement error decreases as a fraction of the

    variance of the explanatory variable. For the standard performance measure the variance of the

    measurement error decreases as the number of hands increases. On average, the measured win rate

    values are closer to players expected win rates. For the performance robustness measure the

    variance of the measurement error is constant (at unity), but here an increase in the number of

    hands leads to more distinctive differences between players with a different expected win rate,

    reducing the relative size of the measurement errors.

    6 The playing style variables are measured with relatively little error because they are based on a large numberof draws from a binomial distribution. Their relatively poor predictive power appears to be especially related totheir more indirect reflection of skill.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    24/32

    [24]

    To illustrate the effect of the number of observations per player on the empirical fit, we run

    regressions for the pooled results of teams of players. More precisely, we first rank players on the

    basis of their performance in Period 1. We then group the players into percentiles, where the best

    one percent of players forms a group, the second best forms another group, et cetera. Next, for each

    percentile we calculate Period 1 and Period 2 performance across all hands of the group of players

    combined. Lastly, we regress the pooled Period 2 performance on the pooled Period 1 performance.

    The average hypothetical player has now played about 1,907,513 hands in Period 1 (rather than

    7,318) and 918,678 in Period 2 (rather than 5,017). The results are remarkable. When performance is

    expressed as the win rate the R2 is 72.4 percent, and when the performance robustness measure is

    used the R2 is 86.3 percent.

    Finally, we have also performed the various analyses in this section for the three separate stakes

    levels. The results and conclusions are all similar to the results for the aggregate sample. A

    noteworthy pattern is that the various coefficients are smallest for the large stakes subsample, which

    indicates that skill differences between players decrease with the level of the stakes. For successive

    subperiods of three months, the results are qualitatively similar but statistically less significant as a

    consequence of the smaller number of observations.

    IV. Discussion and Conclusions

    Through several analyses we have examined whether skill differences between online poker players

    can explain differences in their performance. The results provide strong evidence against the

    hypothesis that poker is a game of pure chance. For a game of pure chance there would be no

    correlation in the winnings of players across successive time intervals. In our large database for three

    different stakes levels, however, we do find significant persistence in the performance of players

    over time. On average, players who rank higher (lower) in profitability over the previous subperiod

    perform better (worse) during the current subperiod. For example, players from the best decile over

    the first six months of our sample period earn about 30 to 40 big blinds per 100 hands more during

    the next six months than players from the worst decile. When we rank players on the basis of how

    well they did according to our alternative performance robustness measure, we find that a player

    who is in the top ten percent ranks among the top ten percent of the next period 23.4 percent of the

    time, and among the worst ten percent only 4.0 percent of the time. The differences are even more

    pronounced if we look at the top one percent: 12.3 percent of these players end up in the top one

    percent the next period, 41.1 percent end up in the top ten percent, and only 2.0 percent end up in

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    25/32

    [25]

    the bottom ten percent. Similarly, those who perform the worst in a given period hardly ever turn

    into the best performing players in the next period.

    Our regression results reinforce these findings, and they also show that current performance is not

    only related to historical performance but also in some extent to simple measures of playing style.Players who are characterized by a tight and aggressive playing style generally perform better than

    their loose and passive opponents. Performance is also related to the number of hands that subjects

    have played over the previous period: more frequent or experienced players achieve better results.

    This finding can indicate that better players choose to play more and that players learn from playing.

    Both interpretations conflict with the pure-chance hypothesis.

    Although the evidence clearly goes against the pure-chance hypothesis and points at the relevance of

    skill, the explanatory power of the regression models is not very high. Only a minor part of the

    differenc es in players performance is captured by the explanatory variables, which suggest s that the

    role of skill is relatively unimportant compared to the role of chance. However, drawing this

    conclusion is inappropriate for three reasons.

    First, the reported R2 values underestimate the true role of skill because of measurement error in the

    main explanatory variable. Past performance is only an imperfect proxy for skill or expected

    performance, especially when the number of underlying observations per player is relatively small.

    The low explanatory power of the regression models indicates that measurement error is a serious

    problem here: if a random factor explains much of the variation in performance, any measurement

    of previous performance is likely to be subject to a large degree of randomness as well.

    Second, the extent to which skill differences explain differences in performance also depends on the

    number of hands over which performance is measured. In the regressions, this number varies from

    only a few to several hundreds of thousands, but the median number is only approximately 700. For

    greater volumes of hands per player, the R2 would improve.

    Our regressions with the pooled performance scores of percentiles of players illustrate the relevance

    of these two issues and yielded an explanatory power as high as 72 and 86 percent. The effects are

    even more convincingly illustrated by the decile analyses: for very large groups of players, the (rank)

    correlation between past and current performance is near-perfect or even perfect. This implies that

    if sufficiently many hands are played, skill explains practically all variation in performance.

    Third, our sample consists of only a select, relatively homogenous subset of the larger population. To

    avoid the inclusion of extremely noisy indicators of performance we have only included players whohave played at least 1,000 hands over the first subperiod. This way, we have unintentionally excluded

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    26/32

    [26]

    relatively infrequent and less experienced players. People who have never played the game at all

    before are obviously totally absent in our data. Every player in our final sample will thus normally be

    well informed about the rules of the game and master some basic strategic concepts. If we would

    also observe beginners playing the game, the differences in performance across players would

    presumably be greater and skill measures would then probably explain a greater part of these

    differences. Even then we would formally still underestimate the role of skill, because even a

    complete beginner would put his cognitive abilities at work and play better than a player who makes

    random decisions. Next to this, the homogeneity in our sample is probably strengthened because of

    players self-selection into stakes levels on the basis of their perception of their relative skill level.

    Better players are more likely to play for larger stakes, while worse or beginning players might feel

    more comfortable at smaller stakes.

    The selection effect is not unique for poker. In many games, people play against opponents of

    relatively similar ability. As a consequence, a regression of performance on historical performance

    will leave a substantial amount of variance unexplained even for sports and games like chess or

    bridge. Sports betting is a multi-billion dollar industry that can exist because of the importance of

    chance elements in matches between relatively equally skilled players or teams. Still, we consider

    most sports as games of skill. Similarly, when the outcome of a chess match between two equally

    skilled chess masters is unpredictable we neither conclude that the two are playing a game of

    chance. What distinguishes poker from other games is that poker incorporates an explicit and salient

    source of randomness in the form of the shuffled deck of cards, and this makes the association with

    chance more obvious.

    In the light of these considerations we believe that we can legitimately conclude that skill is an

    important factor in online ring game poker. Our analyses have shown that there are structural

    differences between players that explain a fair share of the differences in their performance over a

    medium number of hands.

    Our study has been limited to online play, which calls the generalizability of this conclusion into

    question. Still, if skill is important in online poker it is likely to be even more important for brick-and-

    mortar play, because the latter requires not only to master the game but also to account for body

    language and other subtle forms of communication. Players sitting face-to-face need to carefully

    control their behavior to not reveal the strength of their hands, and by observing others they can

    sometimes discover useful tells about their play. At the same time, body language can also be used

    to deliberately mislead opponents. Furthermore, players patience is put to the test more in live play

    than in online play because fewer hands are dealt per hour.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    27/32

    [27]

    We have also considered cash game play only. Both online and offline poker are also frequently

    played in tournament form. We have intentionally analyzed the cash game variant, because the value

    of a given amount of chips wagered in a tournament hand depends on the phase of the tournament

    and individual players chip stack size relative to her opponents. This greatly complicates the analysis

    of performance using hand-level data. A more straightforward approach for tournaments would be

    to analyze players finishes, which is precisely what Croson, Fishman and Pope (2008) and Levitt and

    Miles (2011) do for major live events. It is hard to tell whether tournament poker requires more or

    less skill than the cash game variant, but a significant difference is unlikely. In their early phases,

    tournaments are probably very similar to cash games. At later stages, chance becomes more

    important because the blind bets become larger relative to players chip stacks, which effectively

    reduces the opportunities for strategic betting. However, for the same reason, meticulous hand

    selection (which dealt hands to play and which not) is even more consequential then. Furthermore,

    especially at later stages, players also need to factor in the prize money structure in their decisions.

    Given the many similarities with ring games, we believe that our results can be safely generalized to

    tournament poker as well. Nevertheless, future work could exploit the large amount of data available

    on tournament results and conduct similar analyses as we did. For further research it would also be

    interesting to look at alternative but less popular game variants such as Pot Limit Omaha or Limit

    Texas Holdem.

    Appendix: Deming Regression

    Deming regression was first introduced by Adcock (1878). Kummell (1879) extended the method by

    allowing for the errors in the dependent and in the independent variable to have different variances

    (although the former was still assumed to be proportional to the latter for all observations). The

    method is named after the statistician W. Edwards Deming who propagated it (Deming, 1943). For

    our regression, we use the following model:

    (A1) *1,*

    2, i i y y

    (A2) t i t i t i y y ,*,,

    where *,t i y are the true values and t i y , are the observed values for the performance of player i in

    period t (t = 1,2) and where i ,t is the measurement error that is normally distributed with a mean of

    zero.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    28/32

    [28]

    While standard regression approaches minimize the sum of squares of residuals for the dependent

    variable ( e i ,2) only, we here minimize the sum of squares of the standardized residuals for both the

    dependent and the independent variable:

    (A3)

    N

    i i

    i

    i

    i eeSSR1 1,

    21,

    2,

    22,

    )var()var(

    This leads to the following objective function:

    (A4)

    N

    i i

    i i

    i

    i i

    y y

    y y y y

    N 1 1,

    2*1,1,

    2,

    2*1,2,

    ,,,, )var(

    )var(

    min*

    1,*

    1,1

    As show in York (1966), the solution is:

    (A5) N

    i i i i

    N

    i i i i

    y zw

    y zw

    11,

    12,

    ~

    ~

    (A6) w w y y 12

    where

    (A7) w t t i t i y y y ,,~

    (A8) N

    i i

    N

    i t i i

    w t

    w

    y w

    y

    1

    1,

    (A9) 11,22, i i i v v w

    (A10) t i ,ti, varv

    (A11) 2,1,1,2, i i i i i i y v y v w z

    In most applications, the variance of the measurement error is not known at the level of individual

    observations. In order to solve the optimization problem it is then usually assumed that the ratio

    1,2, varvar i i is the same for all i . We are in the unique situation that we do have accurate

    estimates, which allows us to obtain unbiased estimates of and . When the performance ofpoker players is measured as the average number of big blinds won, the variance of the

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    29/32

    [29]

    measurement error var( i ,t ) for any specific observation is approximated by t i t i ns ,2, , or the ratio of

    the underlying sample variance of the number of big blinds won ( 2,t i s ) and the number of hands ( n i,t ).

    (We assume that the sample variance of the number of big blinds won is stationary, that is,

    t ss t i i 2,2 .) When our performance robustness measure is used, the measurement error variance is

    always equal to unity.

    References

    Adcock, R.J. 1878. AProblem in Least S quares. The Analyst , 5(2): 53-54.

    Allais, Maurice. 1953. Le Comportement de lHomme Rationnel devant le Risque: Critique desPostulats et Axiomes de lcole Amricaine.Econometrica , 21(4): 503 546.

    Alpert, Marc, and Howard Raiffa. 1982. A Progress Report on the Training of Probability Assessors . In Judgment under Uncertainty: Heuristics and Biases , ed. Daniel Kahneman, Paul Slovic, and AmosTversky, 294-305. New York: Cambridge University Press.

    Benartzi, Shlomo, and Richard H. Thaler. 1995. Myopic Loss Aversion and the Equity PremiumPuzzle.Quarterly Journal of Economics , 110(1): 75-92.

    Borm, Peter E.M., and Ben B. van der Genugten. 2001 . On a Relative Measure of Skill for Games

    with Chance E lements. Top , 9(1): 91-114.

    Cabot, Anthony, and Robert C. Hannum. 2005 . Poker: Public Policy, Law, Mathematics, and theFuture of an American Tradition . Thomas M. Cooley Law Review , 22(2): 443-514.

    Camerer, Colin F., Teck-Hua Ho, and Juin-Kuan Chong. 2004. A Cognitive Hierarchy Model ofGames . Quarterly Journal of Economics , 119(3): 861-898.

    Cesarini, David, Magnus Johannesson, Patrik K.E. Magnusson, and Bjrn Wallace. 2011. TheBehavioral Genetics of Behavioral Anomalies . Management Science , 58(1): 21-34.

    Cronqvist, Henrik, and Stephan Siegel. 2011. Why Do Individuals Exhibit Investment Biases?Working paper.

    Croson, Rachel, Peter Fishman, and Devin G. Pope. 2008. Poker Superstars: Sk ill or Luck? Chance ,21(4): 25-28.

    DeDonno, Michael A., and Douglas K. Detterman. 2008. Poker Is a Skill . Gaming Law Review ,12(1): 31-36.

    Deming, W. Edwards. 1943. Statistical Adjustment of Data . New York: John Wiley & Sons.

    Dreef, Marcel, Peter E.M. Borm, and Ben B. van der Genugten. 2003. On Strategy and Relative Skillin Poker. International Game Theory Review , 5(2): 83-103.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    30/32

    [30]

    Dreef, Marcel, Peter E.M. Borm, and Ben B. van der Genugten. 2004a. Measuring Skill in Games:Several Approaches Discussed . Mathematical Methods of Operations Research , 59(3): 375-391.

    Dreef, Marcel, Peter E.M. Borm, and Ben B. van der Genugten. 2004b. A New Relative SkillMeasure for Games with Chance Elements. Managerial and Decision Economics , 25(5): 255-264.

    Eil, David, and Jamie W. Lien. 2010. Staying Ahead and Getting Even: Risk Preferences of ProfitablePoker Players . Working paper.

    Ellsberg, Daniel. 1961. Risk, Ambiguity, and the Savage Axioms.Quarterly Journal of Economics ,75(4): 643-669.

    Fischhoff, Baruch, Paul Slovic, and Sarah Lichtenstein. 1977. Knowing with Certainty: theAppropriateness of Extreme Confidence. Journal of Experimental Psychology: Human Perception andPerformance , 3(4): 552564.

    Friedman, Lawrence. 1971. Optimal Bluffing Strategies in Poke r. Management Science , 17(12):764-771.

    Fox, Craig R., and Amos Tversky. 1995. Ambiguity Aversion and Comparative Ignorance.Quarterly Journal of Economics , 110(3): 585-603.

    Grinblatt, Mark, Matti Keloharju, and Juhani T. Linnainmaa. 2011. IQ, Trading Behavior, andPerformance . Journal of Financial Economics , 104(2): 339-362.

    Grohman, Christopher. 2006. Reconsidering Regulation: A Historical View of the Legality of InternetPoker and Discussion of the Internet Gambling Ban of 2006. Journal of Legal Technology RiskManagement , 1(1): 34-74.

    Haruvy, Ernan, Dale O. Stahl, and Paul W. Wilson. 2001. Modeling and Testing for Heterogeneity inObserved Strategic Behavior . Review of Economics and Statistics , 83(1): 146-157.

    Hendrickx, Ruud L.P., Peter E.M. Borm, Ben B. van der Genugten, and Pim Hilbers. 2008.Measuring Skill in More-Person G ames with Applications to Poker. Working paper.

    Heubeck, Steven. 2008. Measuring Skill in Games: A Critical Review of Methodologies . Gaming LawReview and Economics , 12(3): 231-238.

    Kahneman, Daniel, and Amos Tversky. 1972. Subjective Probability: A Judgment ofRepresentativeness. Cognitive Psychology , 3(3): 430-454.

    Kahneman, Daniel, and Amos Tversky. 1973. On the Psychology of Prediction . PsychologicalReview , 80(4): 237-251.

    Kahneman, Daniel, and Amos Tversky. 1979. Prospect Theory: An Analysis of Decision under Risk.Econometrica , 47(2): 263-292.

    Kameda, Tatsuya, and James H. Davis. 1990. The Function of the Reference Point in Individual andGroup Risk Decision Making.Organizational Behavior and Human Decision Processes , 46(1): 55-76.

    Kelly, Joseph M., Zeeshan Dhar, and Thibault Verbiest. 2007. Poker and the Law: Is It a Game ofSkill or Chance and Legally Does It Matter? Gaming Law Review , 11(3): 190-202.

  • 8/11/2019 Beyond Chance? The Persistence of Performance in Online Poker - Potter van Loon, Van den Assem, Van Dolder

    31/32

    [31]

    Kuhn, Harold W. 1950. A Simplified Two-Person Poker. In Contributions to the Theory of Games ,ed. Harold W. Kuhn, and Albert W. Tucker, 97-104. Princeton: Princeton University Press.

    Kummell, Charles H. 1879. Reduction of Observation Equations Which Contain More Than OneObserved Quantity. The Analyst , 6(4): 97-105.

    Larkey, Patrick, Joseph B. Kadane, Robert Austin, and Shmuel Zamir. 1997. Skill in Games .Management Science , 43(5): 596-609.

    Nash, John F., and Lloyd S. Shapley. 1950. A Simple Three-Pers on Poker Game. In Contributions tothe Theory of Games , ed. Harold W. Kuhn, and Albert W. Tucker, 105-116. Princeton: PrincetonUniversity Press.

    Langer, Ellen J. 1975. The Illusion of Control . Journal of Personality and Social Psychology , 32(2):311-328.

    Langer, Ellen J., and Jane Roth. 1975. Heads I Win, Tails It s Chance: The Illusion of Control as aFunction of the Sequence of Outcomes in a Purely Chance Task . Journal of Personality and SocialPsychology , 32(6): 951-955.

    Levitt, Steven D., and Thomas J. Miles. 2011. The Role of Skill versus Luck in Poker: Evidence fromthe World Series of Poker . Journal of Sports Economics , forthcoming.

    Palacios-Huerta, Ignacio, and Oscar Volij. 2009. Field Centipedes . American Economic Review ,99(4): 1619-1635.

    Park, Young Joon, and Lus Santos-Pinto. 2010. Overconfidence in Tournaments: Evidence from theField. Theory and Decision , 69(1): 143-166.

    Post, Thierry, Martijn J. van den Assem, Guido Baltussen, and Richard H. Thaler. 2008. Deal or NoDeal? Decision Making under Risk in a Large-Payoff Game Show . Am