AN ANALYSIS
OF BOWLING SCORES AND HANDICAP SYSTEMS
by
Wenjun Chen
M.Sc. Beijing Institute of Technology
A PROJECT SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in the Department of Mathematics and Statistics
of
Simon Fraser University
© Wenjun Chen 1991
SIMON FRASER UNIVERSITY
August, 1991
All rights reserved. This work may not be
reproduced in whole or in part, by photocopy
or other means, without the permission of the author.
APPROVAL
Name: Wenjun Chen
Degree: Master of Science
Title of project: An Analysis of Bowling Scores and Handicap Systems
Examining Committee: Dr. A. Lachlan
Chair
Dr. R. Routledge, Committee Member
Dr. C. Dean, External Examiner
Date Approved: November 5, 1991
Abstract
This project uses the Box-Cox transformation and goodness of fit techniques to find a
model describing the distribution of bowling scores based on the analysis of actual bowling
data. We have found that the logarithm of bowling scores is approximately normally
distributed with a constant variance. The simulation of bowling scores based on a Fortran
program confirms this model. The project also uses Monte Carlo methods to investigate
the effect of various handicap systems based on the proposed model.
Acknowledgements
My sincere thanks to Dr. Tim Swartz for his invaluable advice and guidance and time
that he spent with me in the preparation of this project. I would also like to thank him
for his supervision during my studies.
Thanks also go to Dr. R. Routledge, Dr. K. L. Weldon and Dr. C. Dean for their
assistance. I would also like to express my thanks to Dr. M. A. Stephens and to F.
Bellavance for their guidance and advice in doing the project. I would also like to thank
my fellow students and friends for any help that they gave me.
I would also like to extend my thanks to Ms. Sherry Swartz who supplied the data set
used in this project. It was a pleasure to work with such a "good" data set.
Finally, I acknowledge with humble gratitude, the constant encouragement and every
possible help from my mother, father, brothers and sisters. Without them, I would never
have become what I am today. I would also like to acknowledge all the help and support
extended by my husband.
Contents
Abstract
Acknowledgements
Dedication
Contents
List of Tables
List of Figures
1 Introduction
2 Description of Five Pin Bowling and The Data Set
2.1 Five Pin Bowling
2.2 The Data Set
3 Characterizing The Bowling Scores
3.1 Initial Exploration of the Data Set
3.2 Mean-Variance Relationship
3.3 Profile Analysis of Bowling Scores
3.4 Normality of the Bowling Scores
4 Modelling The Bowling Scores
4.1 Box-Cox Transformation
4.2 Goodness of Fit Technique in Testing for Normality
4.3 The Results of Goodness of Fit for Bowling Scores
4.4 The Proposed Model for Bowling Scores
4.5 The Property of Equal Variances
5 Simulated Bowling Scores
5.1 Assumptions of Simulation
5.2 The Results of Simulation
6 Handicap Systems
6.1 The Remington Rand Study
6.2 The Monte Carlo Study
6.3 The Results of Our Study
Appendix
A The Data Set
B A Program which Simulates Bowling Scores
Bibliography
List of Tables
2.1 Two Scoring Sheets
3.1 Brief Summary of The Data Set
4.1 The Results of Bowling Scores
4.2 The Results of Logarithms of Bowling Scores
4.3 The Sample Averages and Sample Variances of Logarithm Data
5.1 The Results of Simulation (N = 10000)
6.1 Estimated Probabilities of the Favourite Winning
List of Figures
2.1 Pin Count
3.1 The Scatter Plots of Scores vs Games for Players 1 to 20
3.2 The Scatter Plots of Scores vs Games for Players 21 to 40
3.3 The Plot of Standard Deviation vs Average
3.4 The Samples of SL and SH
3.5 Possible Standard Deviation vs Average Curve
3.6 Overall Average Plots
3.7 Plots Based on Scores of All Players
3.8 The Histograms of Bowling Scores for Players 1 to 20
3.9 The Histograms of Bowling Scores for Players 21 to 40
3.10 The Distribution of Standardized Bowling Scores
4.1 Normality of the Logarithms of Bowling Scores
4.2 The Sample Variances vs Averages Plot for Logarithm Data
4.3 Normality of the Logarithm of Bowling Scores ($\sigma^2 = 0.0328$)
5.1 Tree Diagram of Possible Outcomes
5.2 The Results of p1=0.15, p2=0.15, p3=0.25, p4=0.45
5.3 The Results of p1=0.15, p2=0.15, p3=0.1, p4=0.6
5.4 The Results of p1=0.3, p2=0.3, p3=0.1, p4=0.3
6.1 The Probability of the Favourite Winning the Game
Chapter 1
Introduction
In the fall semester of 1990, I contacted Dr. Tim Swartz about analyzing a data set for
my M.Sc. project. Several days later, he mentioned that he had been watching television
and noticed that unlike most major sports, in professional bowling very little quantitative
analysis is provided to the viewing audience. In particular it seemed that all that could
be said about a bowler X was that he/she maintained a bowling average of Y.
This observation inspired us to ask the following questions: Can more be said about a
bowler's tendencies beyond reporting his/her bowling average? Do bowling scores follow
a particular distribution? What impact might this have on handicapping? What are the
handicap systems currently in use? Are they fair?
In order to pursue these questions we required a practical data set of bowling scores.
Sherry Swartz, Dr. Tim Swartz's sister, has participated in a bowling league for several
years. She mailed us a total of approximately 2300 five pin bowling scores from an actual
league. These are the scores upon which our analysis is based.
In our attempt to gain a quantitative understanding of bowling scores we also came
in contact with several Canadian bowling agencies and individuals. We describe these
helpful encounters throughout the project.
Chapter 2 introduces the game of five pin bowling; we describe the rules, the equipment
and the scoring procedure. In this chapter we also describe the data set on which my
project is based.
In Chapter 3 we carry out exploratory data analysis. Through the use of simple
descriptive statistics, plots and tests we gain some feeling for the data set. This exploratory
work also provides ideas regarding future directions for our analysis.
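Chapter 4 reviews the Box-Cox transformation
$$y^{(\lambda)} = \begin{cases} (y^{\lambda} - 1)/\lambda & \text{if } \lambda \neq 0 \\ \log y & \text{if } \lambda = 0 \end{cases}$$
which is used in my project to maximize the p-value in a test of normality involving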
bowling scores. The main result of this chapter is that the logarithms of bowling scores
are approximately normally distributed.
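As a minimal sketch of the transformation, the standard Box-Cox form can be written in a few lines of Python. The function name `box_cox`, the parameter name `lam` and the sample scores are illustrative assumptions and do not come from the project's own Fortran code.

```python
import math

def box_cox(y, lam):
    """Standard Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("the Box-Cox transformation requires positive values")
    if lam == 0:
        # The lam = 0 case is defined as the logarithm.
        return math.log(y)
    return (y ** lam - 1.0) / lam

# Hypothetical game scores, for illustration only.
scores = [120, 160, 204]
log_scores = [box_cox(s, 0) for s in scores]
print(log_scores)
```

With `lam = 0` the transform reduces to the logarithm, which is the case the main result of Chapter 4 refers to.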
In Chapter 5 a simulation experiment based on a Fortran program is presented. It is
used to verify the proposed model obtained in Chapter 4. Adjustable parameters which
describe different bowling skills are considered.
In Chapter 6, the Remington Rand handicap study is described. We use our model to
investigate the effect of various handicap systems on the probability of winning and then
compare our results with the Remington Rand study.
Chapter 2
Description of Five Pin Bowling
and The Data Set
This chapter contains an introduction to the sport of five pin bowling with an emphasis
on the scoring procedure. We also describe the data set upon which our study is based.
2.1 Five Pin Bowling
Bowling is one of the most popular sports for participants of all ages, regardless of sex,
shape or physical condition. The two most popular types of bowling in Canada are ten pin
bowling and five pin bowling. For five pin bowling, five pins are used in the game. A game
of five pin bowling consists of ten frames and should be played with regulation equipment
on regulation lanes. Each frame consists of a maximum of three legally delivered balls
rolled by the same bowler down the lane in succession. If a bowler knocks down all
5 pins in fewer than 3 attempts, the frame is considered complete. An exception occurs in the
tenth frame where 3 balls are always delivered. If in the tenth frame all 5 pins are knocked
down during the first or second attempts then the 5 pins are reset. The object of the
game is to score as many points as possible in ten frames. The score is the total number
of points corresponding to the pins knocked down in the ten frames (plus bonuses). The
scores assigned to each pin are recorded in Figure 2.1.
Figure 2.1: Pin Count
The following are some common scoring terms:
(1) strike: All pins are knocked down by the first ball bowled in a frame.
(2) spare: All pins are knocked down by the first two balls bowled in a frame.
(3) corner pin: All pins are knocked down by the first ball with the exception of a single
corner pin.
(4) head pin: The head pin is picked out by the first ball bowled in a frame.
The basic rules for scoring a game of bowling are as follows:
(1) no strike or spare: Merely add the total points corresponding to the pins knocked
down on the three balls.
(2) strike: Fifteen points plus a bonus of the number of points accumulated by the next
two balls rolled.
(3) spare: Fifteen points plus a bonus of the number of points accumulated by the next
ball rolled.
A perfect game of 450 is scored by recording strikes in each of the ten frames. In
addition this requires knocking down all 5 pins on both the second and third balls of the
tenth frame. An example of the scoring for a typical bowling game is given below.
frame 1 The player knocks down all pins except the headpin using three balls.
Score: 10 points.
frame 2 The player knocks down all the pins using three balls.
Count: 15 points.
Score: 25 points.
frame 3 The player uses only two balls to knock down all pins. This is called a spare.
Count: 15 points. However, as each frame's count is for three balls, the player adds
the count from his first ball in the next frame.
Score: INCOMPLETE.
frame 4 The player knocks down the 3 pin with his first ball. We therefore add 3 points
to the 15 points of his spare making a count of 18 for the third frame.
Score: 43 points.
The player then knocks down the headpin (5) and the left 2 pin. This makes his
count for the fourth frame 3+5+2=10 points.
Score: 53 points.
frame 5 The player records a strike. Count: 15 points plus points scored with the next
two balls bowled.
Score: INCOMPLETE.
frame 6 The player makes another strike and credits the fifth frame with 15 points. Fifth
frame score: still incomplete. Sixth frame count: 15 points plus points scored with
the next two balls bowled.
Score: INCOMPLETE.
frame 7 With the first ball, the player picks the headpin (5). This completes the fifth
frame. Count: 15+15+5=35 points, fifth frame score: 88. On the second try he
scores 5 points. The player adds the 10 points to the strike count of the sixth frame.
Sixth frame count 25, score: 113. Then the player knocks down the 2 pin, his count
for the seventh frame: 12 points.
Score: 125 points.
frame 8 A strike is recorded. Count: 15 points plus points scored with the next two balls
bowled.
Score: INCOMPLETE.
frame 9 With the first ball, the player picks the 3 pin. With his second ball, he counts 12
points knocking down all remaining pins for a spare. This completes the 8th frame.
Count: 15+3+12=30 points. Eighth frame score 155.
Score: INCOMPLETE.
frame 10 A strike is recorded and 15 points are credited to the ninth frame. Ninth frame
count: 15+15=30. Score 185. The strike in the tenth frame permits two additional
attempts. The player gets the three pin on his second ball and the five on his third
ball. Player's tenth frame count: 15 (for the strike)+3+5=23.
Game score: 208 points.
Scoring Sheet
Frame    1        2         3      4        5
Balls    5 5 -    3 2 10    5 /    3 5 2    X
Score    10       25        43     53       88

Frame    6        7         8      9        10
Balls    X        5 5 2     X      3 /      X 3 5
Score    113      125       155    185      208
Here "X" means a strike and "/" means a spare.
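The scoring rules and the worked example above can be checked with a short Python sketch. The function name `score_game` and the input layout (one list of ball point values per frame) are assumptions made for illustration. The per-ball splits used for frames 1 and 2 are also assumptions, since the text only states their totals of 10 and 15 points.

```python
def score_game(frames):
    """Return the running score after each of the ten frames.

    frames: ten lists of point values, one value per ball delivered.
    Frames 1 to 9 hold one ball (strike), two balls (spare) or three
    balls (open frame); frame 10 always holds three balls.
    """
    balls = [points for frame in frames for points in frame]
    running, total, i = [], 0, 0
    for frame_no in range(1, 11):
        if frame_no == 10:
            # Tenth frame: three balls are always delivered and simply added.
            count = sum(balls[i:])
        elif balls[i] == 15:
            # Strike: 15 plus the points from the next two balls rolled.
            count = 15 + balls[i + 1] + balls[i + 2]
            i += 1
        elif balls[i] + balls[i + 1] == 15:
            # Spare: 15 plus the points from the next ball rolled.
            count = 15 + balls[i + 2]
            i += 2
        else:
            # No strike or spare: add the points from the three balls.
            count = balls[i] + balls[i + 1] + balls[i + 2]
            i += 3
        total += count
        running.append(total)
    return running

example = [
    [5, 5, 0],    # frame 1: 10 points with three balls (split assumed)
    [3, 2, 10],   # frame 2: all pins with three balls (split assumed)
    [5, 10],      # frame 3: spare
    [3, 5, 2],    # frame 4
    [15],         # frame 5: strike
    [15],         # frame 6: strike
    [5, 5, 2],    # frame 7
    [15],         # frame 8: strike
    [3, 12],      # frame 9: spare
    [15, 3, 5],   # frame 10: strike, then two more balls
]
running = score_game(example)
print(running)
```

A perfect game, nine frames of `[15]` followed by `[15, 15, 15]`, gives a final score of 450 under the same rules.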
Handicapping:
In a league in which the range of abilities is wide, the league may adopt handicap
rules. A handicap attempts to "even up" the chance of winning between opposing teams.
There are two basic kinds of handicapping currently used by the Canada Five Pin Bowlers'
Association.
A. Individual handicap systems:
(1) 80% of the difference between the bowler's average and a base figure of 225.
(2) 66% of the difference between the bowler's average and a base figure of 200.
(3) 75% of the difference between the bowler's average and a base figure of 200.
(4) 75% of the difference between the bowler's average and a base figure of 220.
The various handicap methods mentioned above are applied to a 160 average bowler
to produce the following handicaps.
System Handicap
80% of 225 52
66% of 200 27
75% of 200 30
75% of 220 46
These handicaps are then added to a bowler's gross score at the end of each game to
give a net score. In the case of a bowler whose average exceeded the base figure, a zero
handicap would be assigned.
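The individual handicap rule can be sketched as follows. The function name `individual_handicap` is hypothetical, and the sketch reports unrounded values because the text does not state the rounding convention in use. Plain arithmetic gives 26.4 and 45 for the second and fourth systems, whereas the table above lists 27 and 46.

```python
def individual_handicap(average, base, fraction):
    """Fraction of the gap between a base figure and the bowler's average.

    A bowler whose average exceeds the base figure receives a zero handicap.
    """
    return max(0.0, fraction * (base - average))

# The four systems listed above, applied to a 160 average bowler.
systems = [
    ("80% of 225", 225, 0.80),
    ("66% of 200", 200, 0.66),
    ("75% of 200", 200, 0.75),
    ("75% of 220", 220, 0.75),
]
for label, base, fraction in systems:
    print(label, round(individual_handicap(160, base, fraction), 1))

# Net score = gross score + handicap (the gross score of 172 is hypothetical).
net_score = 172 + individual_handicap(160, 225, 0.80)
```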
B. Team handicap systems:
(1) Team handicaps are determined by adding the averages of the team players for each
of two opposing teams. Then 80% of the difference between the team totals is taken
as the handicap for the weaker team in each individual game.
(2) In deciding the three game handicap total, multiply the single game team handicap
by 3. This total would be added to the total team score for the three games.
(3) When a team's strength is not identical for the three games, the three game handicap
shall be the total of the handicap allowed for each of the three games. This may
happen for example when a team has 6 bowlers, only 5 are permitted to bowl in a
given game and some rotation scheme is used.
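A minimal sketch of the team rules follows. The function names and the player averages are hypothetical; the 80% figure and the rule of summing per-game handicaps come from the description above.

```python
def team_handicap(averages_a, averages_b, fraction=0.80):
    """Single game handicap for the weaker of two teams.

    Returns (weaker_team, handicap), where weaker_team is "A", "B" or None.
    """
    total_a, total_b = sum(averages_a), sum(averages_b)
    handicap = fraction * abs(total_a - total_b)
    if total_a < total_b:
        return "A", handicap
    if total_b < total_a:
        return "B", handicap
    return None, 0.0

def three_game_handicap(per_game_handicaps):
    """Three game total: the sum of the handicaps allowed in each game.

    When the line-up is the same in all three games this equals three
    times the single game handicap.
    """
    return sum(per_game_handicaps)

# Hypothetical averages for two three-player line-ups.
weaker, handicap = team_handicap([160, 150, 140], [170, 160, 150])
print(weaker, handicap, three_game_handicap([handicap] * 3))
```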
2.2 The Data Set
The data set of bowling scores on which my project is based came from an actual five
pin bowling league in Kitchener/Waterloo. The scoring sheets were provided to us by the
league scorer (Ms. Sherry Swartz). Each scoring sheet is a weekly summary showing the
results between two competing teams. The scorer recorded the league name, the date and
the particular lane used by the competing teams. For each team the scoring sheet provides
us with the team number, team players, individual and team handicaps and individual
and team totals. From the scoring sheet one can also determine the number of points that
each team scores for the week. A team obtains two points for every game won and a single
point for having the largest grand total. Therefore 7 points are shared by two competing
teams and the maximum number of points that a team can score in one week is 7 points.
Table 2.1 illustrates the format of two sheets.
There are 4 scoring sheets provided each week. This represents a league consisting
of 8 teams with 5 players per team. However players change teams from time to time
and new players occasionally join the league. For example, Mary was a member of team
8 on December 19, 1990 and became a member of team 5 on January 2, 1991. Also
Diana happened to join team 4 on October 31, 1990. It is also possible that some players
might choose to leave the league. The total number of players recorded in the scoring
sheets exceeds 40. However, some players only participated in three or six games. The data
set was collected weekly from September 5, 1990 to January 9, 1991 excluding the week
of Christmas. This resulted in a total of 19 weeks. The players bowled 3 games every
Wednesday evening over the span of 19 weeks giving a maximum of 57 bowling scores per
player.
The scoring sheets illustrate all information about individual and team scores. However,
because the members of teams varied from week to week, team scoring will not be
the focus of this project. We will instead concentrate on individual scores. In our study
we have selected the forty players who have been the most active in the league. Therefore
for each player we will have a maximum of 57 game scores plus possibly some missing
values. This results in approximately 2300 game scores. Appendix A lists all data used in
the analysis.
Table 2.1: Two Scoring Sheets
[The entries of this table are not legible in the source. Both sheets are for lanes 19 & 20 on Dec. 19, 1990: the first is for Team 8 vs Team 7 and the second for Team 6 vs Team 5. For each team a sheet lists the columns Avg., Players, Hcp., Game 1, Game 2, Game 3 and Total, followed by rows for Total, Team Handicap, Grand Total, Games Won and Points Won.]
The players in the study vary in sex, age and experience. Unfortunately, we could not
get information concerning these variables and we will therefore not take these factors into
account. The only discriminating factors which we will consider in our analysis of bowling
scores are overall ability, changes over weeks and changes between the 3 games bowled in
the same evening.
Chapter 3
Characterizing The Bowling
Scores
Before a formal analysis is given in each section of this chapter, we present an exploratory
analysis by using simple descriptive statistics and plots. We begin by carrying out an
informal graphical exploration of the data as this is often helpful in highlighting special
features of the data and may be helpful in determining the direction in which we should
continue our analysis. We then attempt to characterize the data set formally by considering
the mean-variance relationship, the profile analysis and the property of normality.
3.1 Initial Exploration of the Data Set
The scatter plots of scores vs games for each of the forty bowlers are given in Figure 3.1
and Figure 3.2.
[Figure 3.1: The Scatter Plots of Scores vs Games for Players 1 to 20. One panel per player; horizontal axis: games, vertical axis: scores.]
[Figure 3.2: The Scatter Plots of Scores vs Games for Players 21 to 40. One panel per player; horizontal axis: games, vertical axis: scores.]
Figures 3.1 and 3.2 are drawn using the same vertical and horizontal scales. From
studying the two figures, we see that the bowling game scores vary widely for different
players. The scores range from a minimum of 71, obtained by player 17 in game 2, to a
maximum of 329, obtained by player 25 in game 40. The bowling skills are
quite different from player to player. For example players 2, 11, 12 and 17 tend to have
lower bowling scores, having average scores of 140, 120, 131 and 126; players 5, 10, 25
and 29 have higher bowling scores, having average scores of 204, 198, 197 and 204. The
variations of scores are also quite different for different players. For instance players 1, 10,
22 and 25 have ranges of 165, 157, 138 and 189. Players 11, 14, and 16 have ranges of 74,
99 and 87. It also seems to be the case that if a player has a high average, then he/she
also tends to have a high variation. We will investigate the mean-variance relationship in
more detail in Section 3.2. Missing values can also be noted from the plots. Some players
such as players 1, 5 and 6 bowled from the beginning to the end of the bowling season.
They do not have any missing values. Other players were occasionally absent even though
I chose the 40 players with fewest missing values. For example the scores of player 20
are missing between games 34 and 54. In order to get a more quantitative understanding
of the bowling scores for different players, Table 3.1 gives a brief summary of the data
set including the average, sample variance and actual number of games that each player
bowled.
Table 3.1: Brief Summary of The Data Set
Player  Games  Avg.    S^2       Player  Games  Avg.    S^2
1       57     195.9   1415.5    21      48     190.5   968.7
2       54     138.8   492.5     22      51     203.5   1198.8
3       48     160.2   432.7     23      45     188.6   1026.4
4       48     187.7   911.0     24      54     180.0   937.5
5       57     204.3   1130.8    25      51     196.7   1699.0
6       57     145.0   672.4     26      57     157.7   689.0
7       51     155.9   938.6     27      57     140.9   712.8
8       54     180.3   948.3     28      45     119.9   694.6
9       57     162.4   1088.8    29      54     203.8   1233.1
10      57     197.5   1237.2    30      57     162.2   576.4
11      51     119.7   312.4     31      51     163.2   703.1
12      57     131.2   639.3     32      54     144.7   608.2
13      54     162.3   918.8     33      57     151.6   1015.1
14      51     142.6   447.1     34      54     164.2   928.8
15      48     153.0   1016.8    35      48     136.6   856.0
16      51     143.1   411.4     36      57     171.6   891.0
17      57     126.1   705.0     37      54     128.1   800.8
18      54     163.6   797.8     38      57     169.2   1339.1
19      51     161.5   942.8     39      42     141.0   703.9
20      39     155.4   839.3     40      51     160.0   1056.4
From Table 3.1 we see that 13 players participated in all 57 games. Player 20 participated
in the fewest games (39) amongst the 40 bowlers. The averages of game scores vary
from 119.7 (player 11) to 204.3 (player 5). The variances of scores also vary widely from
312.4 (player 11) to 1699 (player 25).
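The per-player quantities reported in this section (games bowled, average, sample variance and range) can be computed as in the sketch below. The function name `summarize`, the use of `None` for a missed game and the sample scores are assumptions made for illustration. The variance uses the usual n - 1 divisor, which is assumed here because the text does not give the formula.

```python
def summarize(scores):
    """Games bowled, average, sample variance and range for one player.

    Missing games are represented by None and are skipped.
    """
    played = [s for s in scores if s is not None]
    n = len(played)
    average = sum(played) / n
    # Sample variance with the n - 1 divisor (an assumption, see above).
    variance = sum((s - average) ** 2 for s in played) / (n - 1)
    return {
        "games": n,
        "average": average,
        "variance": variance,
        "range": max(played) - min(played),
    }

# Hypothetical scores for one player, with one missed game.
summary = summarize([150, None, 170, 160])
print(summary)
```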
3.2 Mean-Variance Relationship
Section 3.1 briefly noted the relationship between mean and variance. We now
investigate this relationship in more detail. Figure 3.3 gives a
plot of standard deviations vs averages for each of the 40 players. The smoothing line was
obtained by using the lowess command in S-plus.
As mentioned in Section 3.1, Figure 3.3 suggests that standard deviations tend to
increase with average. In this section we use formal statistical methods to test this
phenomenon, applying three different statistical tests.
In order to test this hypothesis, we divided the standard deviations into two groups:
SL and SH. SL is the set of standard deviations corresponding to the 20 lowest averages
and SH is the set of standard deviations corresponding to the 20 highest averages. Figure
3.4 illustrates the two samples.
Method 1: Mann-Whitney Test
The distribution of standard deviations of bowling scores is unknown. Therefore the
non-parametric method may be preferable here.
The main assumptions of the Mann-Whitney test:
(1) The data consist of a random sample of observations $SL_1, SL_2, \ldots, SL_{n_1}$ from a
population with unknown median $M_{SL}$ and an independent random sample of
observations $SH_1, SH_2, \ldots, SH_{n_2}$ from a population with unknown median $M_{SH}$.
(2) The distribution functions of the two populations differ only with respect to location,
if they differ at all.
[Figure 3.3: The Plot of Standard Deviation vs Average. Two panels, "The Plot without Smoothing Line" and "The Plot with Smoothing Line"; horizontal axis: Average (120 to 200), vertical axis: standard deviation.]
[Figure 3.4: The Samples of SL and SH. Standard deviations plotted against Average (120 to 200), with the SL and SH groups marked.]
The hypothesis:
$H_0: M_{SL} = M_{SH}$
$H_1: M_{SL} < M_{SH}$
The test statistic is $T = S - n_1(n_1 + 1)/2 = 164$, where $S$ is the sum of the ranks assigned to
the sample observations from the first population.
Decision rules: We reject $H_0$ at the $\alpha$ level if the computed $T$ is less than the critical
value given below.
Critical Values of the Mann-Whitney test statistic ($n_1 = 20$)
p \ n2   15    16    17    18    19    20
.01      81    88    94    101   108   115
.025     91    99    106   113   120   128
.05      101   108   116   124   131   139
.10      11    120   128   136   144   152
Since $T = 164$ exceeds the critical value of 152 for $n_2 = 20$, we cannot reject $H_0$ even at
the $\alpha = 10\%$ level.
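The statistic T = S - n1(n1 + 1)/2 can be computed with the sketch below, which gives tied values the average of the ranks they occupy. The function names and the tiny samples are hypothetical; the project's own SL and SH values are not listed in this section.

```python
def average_ranks(values):
    """Ranks 1..n of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of rank positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def mann_whitney_t(first, second):
    """T = S - n1(n1 + 1)/2, with S the rank sum of the first sample."""
    ranks = average_ranks(list(first) + list(second))
    n1 = len(first)
    s = sum(ranks[:n1])
    return s - n1 * (n1 + 1) / 2

# Hypothetical samples, for illustration only.
t_low = mann_whitney_t([1, 2, 3], [4, 5, 6])
t_tied = mann_whitney_t([1, 2], [2, 3])
print(t_low, t_tied)
```

Under the decision rule above, the null hypothesis is rejected only when the computed T falls below the tabulated critical value.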
Method 2: Two Sample t-Test
Assumptions of t-test:
(1) SL and SH are independent samples from normal populations.
(2) The standard deviations of SL and SH are identical.
Assumption (1) is clearly violated here. However, the t-test is robust for samples
from certain non-normal distributions. Graphically, there seems to be no reason to lose faith in
assumption (2).
The hypothesis:
$H_0: E(SL) = E(SH)$
$H_1: E(SL) < E(SH)$
We calculate
$$T = \frac{\overline{SL} - \overline{SH}}{S_p \sqrt{1/n_1 + 1/n_2}} = -1.40$$
with 38 degrees of freedom and p-value = .085, which leads us to not reject $H_0$ (there is
some mild evidence against $H_0$).
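A minimal sketch of the pooled two sample t statistic follows, assuming the usual form in which the difference of sample means is divided by the pooled standard deviation times sqrt(1/n1 + 1/n2). The function name `pooled_t` and the sample values are hypothetical.

```python
import math

def pooled_t(x, y):
    """Two sample t statistic with a pooled variance estimate.

    Returns (t, degrees_of_freedom).
    """
    n1, n2 = len(x), len(y)
    mean_x, mean_y = sum(x) / n1, sum(y) / n2
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt((ss_x + ss_y) / df)
    t = (mean_x - mean_y) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, df

# Hypothetical samples, for illustration only.
t_stat, df = pooled_t([1, 2, 3], [2, 3, 4])
print(t_stat, df)
```

With two samples of 20 standard deviations each, the degrees of freedom are 20 + 20 - 2 = 38, as reported above.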
Method 3: Spearman Rank Correlation Coefficient
Method 1 and method 2 tested for a difference between means in SL and SH. Both
methods were simple to use but gave only mild evidence of a difference between means.
We now use Spearman's rank correlation method to determine whether there is evidence
of increasing standard deviation with respect to mean. This test imposes more structure
than the previous 2 methods which simply divided the data in half.
The assumptions of Spearman's test are that the data consist of a random sample of
n pairs of observations and that each pair of observations represents two measurements
taken on the same object or individual. Let $(X_i, Y_i)$ denote the average and standard
deviation of player $i$ and let $R(X_i)$ ($R(Y_i)$) be the rank of $X_i$ ($Y_i$) relative to all other
values of $X$ ($Y$). If ties occur among the $X$'s ($Y$'s), each tied value is assigned the average of
the rank positions for which it is tied.
The hypothesis:
$H_0$: $X$ and $Y$ are independent
$H_1$: There is a direct relationship between $X$ and $Y$
The test statistic is
$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 0.724$$
where $d_i^2 = [R(X_i) - R(Y_i)]^2$ and $n = 40$.
Decision rules: We reject $H_0$ at the $\alpha$ level if the computed value of $r_s$ is greater than
the critical value which is given below.
Critical Values ($n = 40$)
$\alpha$      .25    .10    .05    .01    .005
$r(\alpha)$   .110   .207   .264   .368   .507
Since 0.724 exceeds .507, we reject $H_0$ even at the $\alpha = 0.005$ level.
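The rank correlation r_s = 1 - 6 * sum(d_i^2) / (n(n^2 - 1)) can be sketched as below, with tied values assigned the average of their rank positions as described above. The function names and the small samples are hypothetical.

```python
def average_ranks(values):
    """Ranks 1..n of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of rank positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def spearman_rs(x, y):
    """Spearman rank correlation from the sum of squared rank differences."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical (average, standard deviation) values for three players.
rs = spearman_rs([120, 160, 200], [18, 30, 25])
print(rs)
```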
From the exploratory graphical approach and the third test there seems to be some
evidence that the standard deviation increases as the average increases. Possible reasons
for the phenomenon might be as follows: (1) In bowling, strikes and spares affect scores
dramatically. Players with high averages have more chance of getting a strike or a spare.
Therefore the variations of scores are wider than for players with lower average scores.
(2) Bowling scores range from 0 to 450. If a player obtains 0 for every game, the average
score of the player is 0 and the variance is also 0. It is the same if a player obtains 450
for every game. The average score of the player is 450 and the variance is 0. Using these
two fixed points we might therefore expect the standard deviation versus average curve to
appear as in Figure 3.5. The maximum point of variation is not intended to occur at any
specific point along the horizontal axis. For our case study, we collected bowling scores
with average scores between 120 and 205. Therefore corresponding to Figure 3.5 there is
an increasing relationship between standard deviation and average as we expected.
[Figure 3.5: Possible Standard Deviation vs Average Curve. Horizontal axis: Average; vertical axis: standard deviation.]
3.3 Profile Analysis of Bowling Scores
The bowling season of the league from which the data set was collected took place from
September 1990 to April 1991. As described in Section 2.2 the players bowled a maximum
of 57 games. Three games are bowled in one evening (Wednesday evening) every week.
We are therefore interested in answering the following questions: Do the players improve
their bowling skills from week to week or maybe from game to game each week? The
answer to these questions is important as it gives some indication of whether the scores
for each bowler are identically distributed.
Before proceeding further, we introduce some convenient terminology. In every bowling
evening, the players bowled three games (ordered games). The corresponding scores are
called the first game score, the second game score and the third game score respectively.
By the week average we mean the average game score for each week. Therefore we have
at most 57 game scores, 19 first game scores, 19 second game scores, 19 third game scores
and 19 week averages for each player.
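Let $X_{ijk}$ be the bowling score for the $i$th player ($i = 1, 2, \ldots, 40$) in the $j$th game
($j = 1, 2, 3$) in the $k$th week ($k = 1, 2, \ldots, 19$). Then
$\bar{X}_{\cdot jk} = \frac{1}{40} \sum_{i=1}^{40} X_{ijk}$ is the overall average game score of game $j$ in week $k$,
$\bar{X}_{\cdot \cdot k} = \frac{1}{120} \sum_{i=1}^{40} \sum_{j=1}^{3} X_{ijk}$ is the overall average of week $k$ and
$\bar{X}_{\cdot j \cdot} = \frac{1}{760} \sum_{i=1}^{40} \sum_{k=1}^{19} X_{ijk}$ is the overall average of the $j$th game.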
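The three overall averages can be sketched as below for scores stored as `x[i][j][k]` (player i, game j, week k, all zero-based). The function names and the tiny data set are hypothetical. One assumption differs from the formulas above: missing scores, stored as `None`, are skipped and each average is taken over the scores actually present, since the text does not say how absences were handled.

```python
def mean_of(values):
    """Mean of the values that are present (None marks a missed game)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)

def game_week_average(x, j, k):
    """Overall average score of game j in week k, over all players."""
    return mean_of(player[j][k] for player in x)

def week_average(x, k):
    """Overall average of week k, over all players and the three games."""
    return mean_of(player[j][k] for player in x for j in range(len(player)))

def game_average(x, j):
    """Overall average of the j-th game, over all players and weeks."""
    return mean_of(v for player in x for v in player[j])

# Hypothetical data: 2 players, 3 games per evening, 2 weeks.
x = [
    [[150, 160], [155, 165], [160, 170]],
    [[100, None], [110, 120], [120, 130]],
]
print(game_week_average(x, 0, 0), week_average(x, 0), game_average(x, 0))
```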
In order to investigate the questions posed earlier we give the plots of overall averages
of the 40 players. If players improved their bowling skills significantly from week to week
or from game to game, we could see it from the overall average plots. Figure 3.6 gives the
plots of overall week averages vs weeks and overall average game scores vs games.
[Figure 3.6: Overall Average Plots. Two panels: "Overall Average Week Scores vs Weeks" and "Overall Average Game Scores vs Games".]
[Figure 3.7: Plots Based on Scores of All Players. Two panels: "Scores vs Weeks" and "Scores vs Games".]
From Figure 3.6 we see that there is some indication of improvement over the three
games in a given week (2nd plot) and possibly some very mild evidence of improvement
over the season (1st plot). Figure 3.7 gives the plots of scores vs weeks and scores vs games.
We cannot see any indication of improvement over the three games or over the
weeks from these plots. However, we will need to test these conjectures formally as these
plots do not convey the variability associated with each observation. Because the players
have quite different average bowling scores, we stratify our population by choosing each
player as a subject.
For $i = 1, 2, \ldots, 40$ we test:
\[ H_{i0}: \mu_{i1} = \mu_{i2} = \mu_{i3} \]
\[ H_{i1}: \text{not the case that } \mu_{i1} = \mu_{i2} = \mu_{i3} \]
where $\mu_{i1}$ = mean score of player $i$ for the first game,
$\mu_{i2}$ = mean score of player $i$ for the second game and
$\mu_{i3}$ = mean score of player $i$ for the third game.
In a similar manner we also test for a difference over weeks. The null hypothesis states
that there is no difference between the means of the 19 weekly scores.
Because we are interested in both the game effect and the week effect for each player
we use the two factor analysis of variance model for repeated measurement designs. We
note that analysis of variance models are robust with respect to small departures from
normality. Using the statistical package SAS we obtain a p-value of 4.2 % for a difference
between games and a p-value of 17.5 % for a difference between weeks. Therefore there
seems to be at most mild evidence for an effect due to games and no evidence for an effect
due to weeks. We will therefore assume these effects do not exist.
3.4 Normality of the Bowling Scores
As mentioned earlier it seems that currently the only quantitative comment that is routinely
made concerning a bowler's ability is the reporting of his/her bowling average. We
would like to do more than this by possibly describing the distribution from which bowling
scores arise. Initially we conjectured that bowling scores for each individual are approximately
normally distributed with unknown mean and variance. We made this conjecture
as bowling scores arise as the sum of scores of 10 frames which suggests that the central
limit theorem may approximately hold here. Figures 3.8 and 3.9 show the histograms of
bowling scores for each player.
From Figures 3.8 and 3.9, it seems that some histograms are approximately normally
distributed (e.g. the histogram for player 29). However more often than not, the histograms
seem to have a long right tail (e.g. the histogram for player 22). We will use a graphical method
to pool the results of each of the 40 bowlers to determine whether the bowling scores
are approximately normally distributed. The null hypothesis is
\[ H_0: x_{ijk} \sim N(\mu_i, \sigma_i^2), \quad i = 1, 2, \ldots, 40. \]
Pierce[5] has suggested a method of combining tests based on several samples, for
testing $H_0$: the sample comes from a distribution $F(x_i, \theta_i)$, with $\theta_i$ containing unknown
location and/or scale parameters $\mu_i$ and $\sigma_i$. The true value of these parameters may be
different for each test. For sample $i$, let $\hat{\mu}_i$ and $\hat{\sigma}_i$ be the maximum likelihood estimates.
Define standardized values $w_{ijk} = (x_{ijk} - \hat{\mu}_i)/\hat{\sigma}_i$, $i = 1, 2, \ldots, 40$ (in our case). Based
on the proposal of Pierce, the $w_{ijk}$ for all 40 samples should be pooled to form one large
sample of size $n = \sum_{i=1}^{40} n_i$ where $n_i$ is the size of the $i$th sample. The limiting distribution
of $w_{ijk}$ will be the same as its limiting distribution for individual samples. Therefore the
above hypothesis is changed to the null hypothesis $H_0: w_{ijk} \sim N(0, 1)$.
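The standardize-and-pool step can be sketched as follows. This is a minimal illustration, not Pierce's own code: the name `standardize_and_pool` is an assumption, each inner list is taken to hold one player's scores, and the standard deviation uses the divisor $n_i$ because the text specifies maximum likelihood estimates. Each player is assumed to have at least two distinct scores.

```python
import math


def standardize_and_pool(samples):
    """Standardize each player's scores by that player's own MLEs, then pool.

    `samples` is a list of per-player score lists (hypothetical layout).
    """
    pooled = []
    for scores in samples:
        n = len(scores)
        mu_hat = sum(scores) / n
        # Maximum likelihood estimate of the standard deviation (divisor n, not n - 1).
        sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in scores) / n)
        pooled.extend((x - mu_hat) / sigma_hat for x in scores)
    return pooled
```

By construction the pooled values have mean zero and mean square one, so a histogram or q-q plot of them against the standard normal isolates the shape of the distribution from each player's level and spread.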
Figure 3.10 shows the histogram of standardized and pooled bowling scores Wijk for all
players and the q-q plot with the standardized normal distribution. A line of zero intercept
and unit slope is added to the q-q plot in order to measure easily whether the Wijk are
approximately normally distributed. Both the histogram and the q-q plot suggest that
bowling scores are skewed to the right. In the q-q plot the left tail departs much more from
the straight line than the right tail. Therefore the bowling scores are not approximately
normally distributed. In Chapter 4 we will use the Box-Cox transformation and formal
statistical methods to find a better model for the bowling scores.
[Figure 3.8: The Histograms of Bowling Scores for Players 1 to 20. One panel per player; horizontal axis: Bowling Scores.]

[Figure 3.9: The Histograms of Bowling Scores for Players 21 to 40. One panel per player; horizontal axis: Bowling Scores.]

[Figure 3.10: The Distribution of Standardized Bowling Scores. Panels: Histogram of Standardized Values (horizontal axis: Standardized Bowling Score); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
Chapter 4
Modelling The Bowling Scores
In Chapter 3, we mentioned that the bowling scores are not approximately normally
distributed. In this chapter, we will use the Box-Cox transformation technique to find an
improved model for the bowling scores.
4.1 Box-Cox Transformation
In general, the Box-Cox family of transformations is given by
\[ y^{(a)} = \begin{cases} \dfrac{x^a - 1}{a} & \text{if } a \neq 0 \\ \ln(x) & \text{if } a = 0 \end{cases} \]
where the transformed $y^{(a)}$ is "more normal" than $x$. In our case let
\[ y_{ijk}^{(a)} = \begin{cases} \dfrac{x_{ijk}^a - 1}{a} & \text{if } a \neq 0 \\ \ln(x_{ijk}) & \text{if } a = 0 \end{cases} \]
where $i$ stands for one of the players 1 through 40, $j$ stands for one of the games 1 through
3 and $k$ stands for one of the weeks 1 through 19. We consider different parameters $a$ in order
to maximize the p-value in testing the hypotheses that $y_{ijk}^{(a)} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$,
$j = 1, 2, 3$ and $k = 1, 2, \ldots, 19$.
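As a small illustration, the transformation can be written as a single function. The name `box_cox` is an assumption made here; the formula is the one given above, with the logarithm as the $a = 0$ case.

```python
import math


def box_cox(x, a):
    """Box-Cox transform of a positive score x with parameter a."""
    if x <= 0:
        raise ValueError("the Box-Cox transformation requires a positive value")
    if a == 0:
        return math.log(x)
    return (x ** a - 1.0) / a
```

For small nonzero $a$ the value is close to $\ln(x)$, which is why the family is continuous in $a$ and why $a = 0$ can be compared directly with neighbouring values.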
4.2 Goodness of Fit Technique in Testing for Normality
Suppose that a random sample of size $n$ is given by $X_1, X_2, \ldots, X_n$, and let
$X_{(1)} < X_{(2)} < \cdots < X_{(n)}$ be the order statistics. Suppose that the distribution of $X$ is $F(x)$. The
empirical distribution function (edf) for the sample is defined by
\[ F_n(x) = \frac{\text{number of observations} \le x}{n}, \qquad -\infty < x < \infty. \]
Edf statistics are a class of goodness of fit statistics which measure the difference between
Fn(x) and F(x). The Anderson-Darling edf statistic is defined by
\[ A^2 = n \int_{-\infty}^{\infty} [F_n(x) - F(x)]^2 \, [F(x)(1 - F(x))]^{-1} \, dF(x). \]
The $A^2$ statistic is a general purpose (omnibus) goodness of fit statistic although on the
whole it is most powerful when $F(x)$ departs from the true distribution in the tail.
A modified statistic $A^{2*}$ is used in the test for normality with $\mu$ and $\sigma$ unknown and
estimated by the mles. It is given by
\[ A^{2*} = A^2 (1.0 + 0.75/n + 2.25/n^2). \]
Tables providing significance levels for the Anderson-Darling test for normality can be
found in D'Agostino and Stephens[1].
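In practice $A^2$ is usually evaluated through the standard computational form $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)[\ln z_{(i)} + \ln(1 - z_{(n+1-i)})]$, where $z_{(i)}$ is the fitted normal cdf at the $i$th order statistic. The sketch below follows that form. The function names are assumptions, and the standard deviation is taken with divisor $n - 1$ (the sample variance used later in Section 4.3), which is a choice made here rather than something the text fixes. Significance levels are not computed; they would still come from the published tables.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def anderson_darling_normal(xs):
    """Return (A2, A2_star) for a test of normality with estimated mean and variance."""
    n = len(xs)
    mean = sum(xs) / n
    # Assumption: sample standard deviation with divisor n - 1.
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    z = [normal_cdf((x - mean) / sd) for x in sorted(xs)]
    total = sum(
        (2 * i + 1) * (math.log(z[i]) + math.log(1.0 - z[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    a2_star = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_star
```

Applying the function to the logarithms of one player's scores, rather than to the raw scores, gives the statistic used for the transformed data with $a = 0$.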
Fisher's method is a method for combining independent tests from several samples.
Suppose that $k$ tests are to be made of the null hypotheses $H_{01}, H_{02}, \ldots, H_{0k}$. Let $H_0$ be
the composite hypothesis that all $H_{0i}$ are true. Let $p_i$ be the p-value corresponding to the
$i$th test. Then when $H_{0i}$ is true, $p_i$ is $U(0, 1)$ as long as the test statistics are continuous.
The statistic
\[ P = -2 \sum_{i=1}^{k} \log(p_i) \]
under $H_0$, has the $\chi^2_{2k}$ distribution.
We will use the modified Anderson-Darling statistic $A^{2*}$ to test the hypothesis of
normality for the individual bowling scores by finding the p-value $p_i$, $i = 1, 2, \ldots, 40$. We
then use Fisher's method to test the composite hypothesis that all $H_{0i}$ are true.
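Fisher's method is short enough to sketch directly. Because the reference distribution has an even number of degrees of freedom, its upper tail has the closed form $e^{-x/2}\sum_{j=0}^{k-1}(x/2)^j/j!$, so no statistical library is needed. The function names are assumptions. The p-values passed in must be strictly positive, so values that have been rounded to .00 for a table cannot be used as they stand.

```python
import math


def chi2_upper_tail_even_df(x, k):
    """P(chi-square with 2k degrees of freedom exceeds x), closed form for even df."""
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half / j  # builds (x/2)^j / j! incrementally
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return Fisher's statistic P and its overall p-value for independent tests."""
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_upper_tail_even_df(statistic, k)
```

With $k = 40$ tests the reference distribution is $\chi^2_{80}$, so a statistic near 80 is unremarkable while one far above it signals that too many of the individual p-values are small.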
4.3 The Results of Goodness of Fit for Bowling Scores
In Section 3.4, we used a graphical method to show that the bowling scores are not
approximately normally distributed. Here we give the results of testing the normality of
bowling scores by using the modified Anderson-Darling statistic and Fisher's method. The
hypotheses are:
\[ H_{0i}: x_{ijk} \sim N(\mu_i, \sigma_i^2), \quad i = 1, 2, \ldots, 40 \]
\[ H_1: \text{not all } H_{0i} \text{ are true} \]
where $\mu_i$ and $\sigma_i^2$ are unknown and are estimated by the sample average $\bar{x}_i$ and the sample
variance $s_i^2$ respectively.
Table 4.1 gives the modified Anderson-Darling statistic $A^{2*}$ and the p-value for each
of the above hypotheses.

Table 4.1: The Results of Bowling Scores

Player   A^{2*}   p-value      Player   A^{2*}   p-value
  1        .27      .68         21        .39      .38
  2        .32      .53         22        .65      .09
  3        .32      .54         23*       .86      .03
  4        .39      .39         24        .33      .51
  5        .27      .67         25*      1.34      .00
  6        .65      .09         26        .56      .14
  7*      1.70      .00         27*      1.69      .00
  8        .50      .21         28        .54      .16
  9*      1.09      .01         29        .24      .78
 10        .36      .45         30        .29      .63
 11        .63      .10         31        .16      .94
 12        .66      .08         32        .61      .11
 13        .40      .37         33        .65      .09
 14        .54      .17         34        .54      .17
 15*      1.11      .01         35        .15      .97
 16        .29      .60         36        .26      .72
 17        .58      .13         37*      1.31      .00
 18*      1.36      .00         38        .33      .52
 19        .45      .28         39        .26      .71
 20        .31      .57         40*       .82      .04

"*" means that the p-value is smaller than .05.
From Table 4.1 we observe that nine out of forty players have bowling scores which
are significantly different from the normal population at the 5% significance level. We
now use Fisher's method for combining the tests of forty samples and get the statistic
$P = -2 \sum_{i=1}^{40} \log(p_i) = 176.9$ with an overall p-value of approximately 0. We therefore reject
the null hypothesis that the bowling scores are approximately normally distributed.
4.4 The Proposed Model for Bowling Scores
The Box-Cox family of transformations of bowling scores is given by
\[ y_{ijk}^{(a)} = \begin{cases} \dfrac{x_{ijk}^a - 1}{a} & \text{if } a \neq 0 \\ \ln(x_{ijk}) & \text{if } a = 0. \end{cases} \]
We want to find the value of the parameter $a$ so that the transformed $y_{ijk}^{(a)}$ are approximately
normally distributed. We test the hypotheses $H_{0i}: y_{ijk}^{(a)} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$
for a specified $a$ by using the modified Anderson-Darling statistic and Fisher's method.
We do this rather than the traditional maximum likelihood approach since the maximum
likelihood method requires a constant variance amongst individuals and we have no prior
reason to believe this. The "best" model is the one whose value $a$ gives the maximum
overall p-value. Trying different values of $a$ ranging from -2 to 2, we find that the logarithms of
bowling scores ($a = 0$) have a nearly maximal overall p-value of 0.07. The maximum,
$p_{\max} = .07$, occurs at $a = -.05$. However we would like to choose $a = 0$, because the
difference in p-values is small. We
mention that the log transformation is the variance stabilizing transformation resulting
from a model where the standard deviation is proportional to the mean. Table 4.2 gives
the results of testing the normality of the logarithms of bowling scores.
We see that the logarithms of bowling scores for players 7, 8, 23 and 27 are significantly
different from the normal distribution at the 5% significance level. We reject four out of
forty hypotheses $H_{0i}: y_{ijk}^{(0)} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$. Using Fisher's method for
combining tests of forty samples, we get the statistic $P = -2 \sum_{i=1}^{40} \log(p_i) = 99.8$ with
overall p-value 0.07. We therefore tentatively accept the hypothesis that the logarithms
of bowling scores are approximately normally distributed.
We mention that the p-values obtained above are appropriate for a specified value $a$.
We have optimally determined $a$ and then computed the p-value as though $a$ was specified.
Technically this is not ideal and the true p-value should be smaller than the reported
p-value of 0.07. Despite this we believe that the log normal approximation is good as will
be seen in the q-q plot.

Table 4.2: The Results of Logarithms of Bowling Scores

Player   A^{2*}   p-value      Player   A^{2*}   p-value
  1        .39      .39         21        .26      .72
  2        .33      .51         22        .43      .32
  3        .23      .82         23*       .93      .02
  4        .67      .53         24        .35      .48
  5        .15      .96         25        .61      .11
  6        .32      .53         26        .25      .74
  7*       .92      .02         27*       .88      .02
  8*      1.10      .01         28        .35      .47
  9        .54      .17         29        .51      .19
 10        .26      .71         30        .17      .93
 11        .53      .17         31        .23      .80
 12        .30      .58         32        .31      .56
 13        .18      .92         33        .30      .58
 14        .25      .74         34        .43      .31
 15        .44      .29         35        .28      .64
 16        .39      .39         36        .42      .33
 17        .42      .33         37        .70      .07
 18        .72      .06         38        .10      .99
 19        .39      .38         39        .18      .92
 20        .21      .86         40        .61      .11

"*" means that the p-value is smaller than .05.
To compare the results for the bowling scores obtained from Section 3.4 by using
standardized and pooled data, we give the histogram and standardized q-q plot for the
logarithms of bowling scores in Figure 4.1. We see that the histogram of standardized
and pooled logarithms of bowling scores is approximately normally distributed. The
q-q plot is almost a straight line through the origin and with unit slope. The only place
of departure is in the tail. It seems that the tails of the normal distribution may be
slightly thicker than the tails of the logarithms of bowling scores. However as mentioned
earlier, the Anderson-Darling statistic is very sensitive in detecting departures from the
true distribution in the tail. We therefore accept the hypothesis that the logarithms of
bowling scores are approximately normally distributed.
4.5 The Property of Equal Variances
After having found that the logarithms of bowling scores are approximately normally
distributed, we would like to know whether the equality of variances holds. Table 4.3 lists
the averages and the sample variances of the logarithms of bowling scores. The sample
variances vary from 0.017 to 0.050.
Table 4.3: The Sample Averages and Sample Variances of Logarithm Data

Player   Averages   Sample Var.      Player   Averages   Sample Var.
  1        5.26        .038           21        5.24        .027
  2        4.92        .026           22        5.30        .028
  3        5.07        .017           23        5.23        .030
  4        5.22        .029           24        5.18        .029
  5        5.31        .027           25        5.26        .039
  6        4.96        .030           26        5.05        .027
  7        5.03        .035           27        4.93        .032
  8        5.18        .034           28        4.76        .045
  9        5.07        .039           29        5.30        .033
 10        5.27        .031           30        5.08        .022
 11        4.77        .022           31        5.08        .027
 12        4.88        .037           32        4.96        .029
 13        5.07        .034           33        5.00        .044
 14        4.95        .021           34        5.08        .035
 15        5.01        .038           35        4.89        .050
 16        4.95        .020           36        5.13        .033
 17        4.82        .043           37        4.83        .046
 18        5.08        .027           38        5.11        .045
 19        5.07        .036           39        4.93        .035
 20        5.03        .034           40        5.05        .043
Figure 4.2 gives a plot of the sample variances against the averages.
We see that the sample variances of logarithms of bowling scores are approximately the
same amongst all players. We will use Bartlett's method for testing whether the variances
are approximately equal for the logarithms of bowling scores.
[Figure 4.1: Normality of the Logarithms of Bowling Scores. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]

[Figure 4.2: The Sample Variances vs Averages Plot for Logarithm Data. Horizontal axis: Averages.]

The assumptions of the Bartlett test are:
(1) Each of the $k$ populations is normal.
(2) Independent random samples are obtained from each population.
The hypothesis is:
\[ H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 \]
\[ H_1: \text{not all of the } \sigma_i^2 \text{ are equal} \]
Let $s_1^2, \ldots, s_k^2$ denote the sample variances from the $k$ normal populations and let $df_i$
denote the degrees of freedom associated with the sample variance $s_i^2$. Then the mean
square error is given by
\[ MSE = \frac{1}{df_T} \sum_{i=1}^{k} df_i \, s_i^2 \]
where
\[ df_T = \sum_{i=1}^{k} df_i. \]
The test statistic is
\[ B = \frac{1}{C} \left[ df_T \log(MSE) - \sum_{i=1}^{k} df_i \log(s_i^2) \right] \]
where
\[ C = 1 + \frac{1}{3(k-1)} \left[ \left( \sum_{i=1}^{k} \frac{1}{df_i} \right) - \frac{1}{df_T} \right]. \]
Under $H_0$, $B$ is approximately distributed as $\chi^2_{k-1}$.
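The Bartlett statistic depends only on the sample variances and their degrees of freedom, so it can be sketched without access to the raw scores. The function name `bartlett_statistic` is an assumption; the formulas for $MSE$, $C$ and $B$ are those of the Bartlett test as stated in this section. The function returns $B$ together with the degrees of freedom $k - 1$ of its approximate chi-square distribution and leaves the p-value to tables or a statistics package.

```python
import math


def bartlett_statistic(sample_variances, dfs):
    """Return (B, k - 1) for Bartlett's test of equal variances."""
    k = len(sample_variances)
    df_total = sum(dfs)
    # Pooled variance estimate: degrees-of-freedom weighted mean of the s_i^2.
    mse = sum(df * s2 for df, s2 in zip(dfs, sample_variances)) / df_total
    c = 1.0 + (sum(1.0 / df for df in dfs) - 1.0 / df_total) / (3.0 * (k - 1))
    weighted_logs = sum(df * math.log(s2) for df, s2 in zip(dfs, sample_variances))
    b = (df_total * math.log(mse) - weighted_logs) / c
    return b, k - 1
```

When all sample variances are equal the statistic is zero, and it grows as the variances spread apart, which is the behaviour the test relies on.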
In studying the equality of variances for the logarithms of bowling scores, the above
two assumptions hold. We have $df_i = n_i - 1$ where $n_i$ is the number of games in which
player $i$ participated. The sample variance of the logarithms of bowling scores for player $i$
is $s_i^2$ and $k = 40$. The test statistic $B = 57.6$ yields a p-value of .03. A Spearman Rank
test was also carried out as in Section 3.2 and the p-value was found to be insignificant.
Therefore there seems to be mild evidence of differences amongst the variances. However
all that we care about is that the differences are not too big. Therefore we suggest
that the variances are approximately equal amongst players and that the logarithms of
bowling scores are approximately normally distributed with a constant variance estimated
by $\hat{\sigma}^2 = \frac{1}{40} \sum_{i=1}^{40} s_i^2 = 0.0328$. An approximate 95% confidence interval for $\sigma^2$ based on
normality is (0.0220, 0.0541).
Having revised our model we standardize logarithms of bowling scores as in Section
3.4 by using the sample average $\bar{x}_i$ for each player together with $\hat{\sigma}^2 = 0.0328$ obtained above. We then
construct the standardized q-q plot and histogram for pooled values based on this new
model. Figure 4.3 gives the plot of normality of logarithms of bowling scores with constant
variance.
Comparing Figure 4.1 and Figure 4.3, we observed that Figure 4.3 is more normal than
Figure 4.1. The reason might be that in Figure 4.1 fewer data are in the tail compared
with the standard normal. In Figure 4.3 the $i$th bowler contributes the terms
$(x_{ijk} - \bar{x}_i)/\hat{\sigma}$ to
the pooled data. Therefore those bowlers which have a small $s_i$ are going to contribute
terms that are clustered more tightly about zero and those bowlers which have a larger $s_i$
are now going to contribute terms that are more spread out about zero. The net effect is
a longer tailed distribution which is what we observed.
[Figure 4.3: Normality of the Logarithm of Bowling Scores ($\sigma^2 = 0.0328$). Histogram of standardized values and q-q plot for the pooled data.]
Note that some very strong modelling assumptions have been conjectured; i.e. that
logarithms of bowling scores are approximately normally distributed with a constant
variance. However this conclusion is based on a single league and it may be unreasonable
to extend this inference to populations in general. In the next chapter we hope to show
through simulation that the result is approximately valid for a wide range of bowling
abilities.
Chapter 5
Simulated Bowling Scores
In Chapter 4, we found that the logarithm of bowling scores is approximately normally
distributed. However we have some concern over the adequacy of the approximation due
to the 7% p-value. Perhaps the approximation is quite good and the questionable p-value
can be attributed to the effect of sample size on the meaning of significance tests. For
example, it is well known that with a very large data set a precise $H_0$ will almost always
be rejected (see Royall[10]). In any case we would like to confirm the adequacy of the
approximation. In this chapter, we use a Fortran program to simulate bowling scores to
confirm the model.
5.1 Assumptions of Simulation
As we know, the actual mechanism underlying a bowling game is impossible to describe
and to simulate. We give some simplifications concerning the bowling mechanism in order
to make the simulation easy.
(A) There is no curve on each ball bowled.
(B) A bowler always aims directly at the middle of the pin of interest and can miss the
pin by no more than 1 pin to the right or to the left.
(C) The bowler has equal accuracy to the left or to the right; the chance of hitting either
the adjacent right pin or the adjacent left pin is the same.
(D) There is no learning effect. Every ball bowled is independent of every other.
We also simplify the outcomes of each ball bowled. These outcomes are described below.
Other outcomes can arise in practice, but they are far less probable and have a structure
similar to one of the possibilities listed. That is, they count approximately the same
number of points and have nearly the same implications for successive balls in the frame.
1. The outcomes resulting from the first ball bowled in a frame are one of following
four possible results:
(a) Strike: A bowler knocks down all pins with probability p1 and the score is 15 points.
(b) Corner: A bowler knocks down all pins except a single corner pin with probability p2
and the score is 13 points.
(c) Headpin: A bowler knocks down the headpin with probability p3 and the score is 5
points.
(d) 3-2: A bowler knocks down the 3-pin and the 2-pin on the same side with probability
p4 = 1 - (p1 + p2 + p3) and the score is 5 points.
2. The outcomes resulting from the second ball bowled in a frame are conditional on
the result of the first ball.
(a) A strike is recorded and the frame is completed. No second ball is available.
(b) A corner pin is left standing after the first ball. There are 2 possible outcomes for
the second ball:
(1) Spare: The bowler knocks down the corner pin. The probability p1+p2+p3
relates to a ball which does not miss its intended pin. A spare is recorded and
the score is 15 points.
(2) The corner pin remains. The bowler does not knock down the corner pin
with probability p4 and the score remains unchanged.
(c) A headpin is picked as the result of the first ball. There are four possible outcomes
for the second ball:
(1) 3-2: The bowler knocks down the 3-pin and 2-pin on either the right or left side.
The probability p1+p2 relates to a ball which is aimed at the 3-pin and hits but
does not punch out the 3-pin and the score is 5 + 5 = 10.
(2) 3-pin: The bowler knocks down the 3-pin with probability p3 and the score is
5 + 3 = 8.
(3) 2-pin: The bowler knocks down the 2-pin and the score is 5 + 2 = 7. The
probability p4/2 relates to the bowler's tendency to miss to the left or to the
right with equal probability.
(4) Miss: The bowler rolls the ball through the headpin channel with probability
p4/2 and the score is still 5 points.
(d) The 3-2 combination is picked as the result of the first ball. There are five possible
outcomes for the second ball:
(1) Headpin: The bowler picks the headpin with the second ball with probability
p3 and the score is 5 +5 = 10.
(2) Spare: The bowler knocks down all remaining pins with the second ball with
probability p1+p2/2. A spare is recorded and the score is 15.
(3) Miss: The bowler rolls the ball through the 3-2 channel with probability p4/2
and the score is unchanged.
(4) 3-2: The bowler knocks down the other 3-pin and 2-pin with probability p4/2
and the score is 5 +5 = 10.
(5) hp-3: The bowler knocks down both the head pin and the 3-pin with probability
p2/2 and the score is 5 +5 +3 = 13.
3. For the third ball of the frame there are 14 different results based on the second
ball. Figure 5.1 graphically depicts all outcomes and probabilities. The numbers within
circles represent the cumulative scores and the letters A, B, C and D after the second ball
indicate that the same situations have occurred at other places in the chart.
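The first-ball and second-ball rules listed above are complete enough to tabulate the exact distribution of the points standing after two balls of a frame. The sketch below does only that: it is not the Fortran program of Appendix B, the function name `two_ball_distribution` is an assumption, and third-ball outcomes and any strike or spare bonuses under the scoring rules are deliberately left out because their details come from Figure 5.1 and the scoring rules rather than from the list above.

```python
def two_ball_distribution(p1, p2, p3):
    """Map points after two balls of a frame to their probability.

    p4 is implied by p4 = 1 - (p1 + p2 + p3); a strike ends the frame at 15.
    """
    p4 = 1.0 - (p1 + p2 + p3)
    dist = {}

    def add(points, prob):
        dist[points] = dist.get(points, 0.0) + prob

    # First ball (a): strike.
    add(15, p1)
    # First ball (b): corner pin left, then spare or corner pin remains.
    add(15, p2 * (p1 + p2 + p3))
    add(13, p2 * p4)
    # First ball (c): headpin, then 3-2, 3-pin, 2-pin or miss.
    add(10, p3 * (p1 + p2))
    add(8, p3 * p3)
    add(7, p3 * p4 / 2)
    add(5, p3 * p4 / 2)
    # First ball (d): 3-2, then headpin, spare, miss, other 3-2 or hp-3.
    add(10, p4 * p3)
    add(15, p4 * (p1 + p2 / 2))
    add(5, p4 * p4 / 2)
    add(10, p4 * p4 / 2)
    add(13, p4 * p2 / 2)
    return dist
```

For instance, with p1 = 0.3, p2 = 0.3 and p3 = 0.2 the probability of clearing all pins within two balls works out to 0.3 + 0.3(0.8) + 0.2(0.45) = 0.63, and the probabilities over all outcomes sum to one, which is a useful check that the conditional probabilities in each branch are consistent.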
[Figure 5.1: Tree Diagram of Possible Outcomes. Legend: hp = headpin; 3-2 = 3-pin and 2-pin knocked down; 3 = 3-pin knocked down; 2 = 2-pin knocked down; hp-3 = headpin and 3-pin knocked down.]
5.2 The Results of Simulation
Based on the simplifications and the scoring rules, a Fortran program has been coded to
simulate bowling games. Appendix B lists the computer program.
We use the Fortran program to simulate bowling scores by choosing different parameters
p1, p2, p3 and p4 roughly based on the observations of actual bowlers and on the
considerations of variability of abilities where p1 + p2 + p3 + p4 = 1. Table 5.1 lists
some typical results of the simulation where averages range between 120 and 305. The
results include the sample averages and sample variances of the bowling scores based on
N simulations for a set of chosen parameters.
Table 5.1: The Results of Simulation (N = 10000)

 p1     p2     p3     p4     Average   Variance
0.3    0.3    0.2    0.2       250       1388
0.2    0.2    0.3    0.3       199       1157
0.2    0.3    0.2    0.3       219       1197
0.4    0.3    0.2    0.1       282       1507
0.25   0.25   0.2    0.3       224       1335
0.1    0.1    0.3    0.5       149        731
0.15   0.15   0.25   0.45      173        978
0.15   0.15   0.1    0.6       171       1015
0.05   0.05   0.35   0.65      125        412
0.1    0.2    0.3    0.4       169        855
0.2    0.2    0.15   0.45      197       1218
0.25   0.15   0.1    0.5       202       1329
0.4    0.4    0.1    0.1       302       1355
0.25   0.2    0.25   0.3       214       1338
0.25   0.25   0.1    0.4       223       1356
0.3    0.3    0.1    0.3       250       1412
We also give some standardized q-q plots and histograms based on the logarithms of
the simulated bowling scores in Figures 5.2-5.4. The sample variances of the logarithms of
bowling scores are 0.0327, 0.0361 and 0.0234 for the data sets which contributed Figures
5.2, 5.3 and 5.4 respectively. They belong to the range of variances of logarithms of actual
bowling scores. The plots in Figures 5.2, 5.3 and 5.4 are obtained by using the constant
variance $\sigma^2 = 0.0328$. Figure 5.2 gives one of the "best" amongst the simulated scores,
Figure 5.3 gives the result of a typical simulation and Figure 5.4 gives one of the "worst"
results. By "best" we mean that it fits the normal model best and by "worst" we mean
that it fits the normal model worst. Note that in the worst case the average of simulated
bowling scores = 250 which extends the range of averages of the actual data. However
it is still not clear which values of the parameters p1, p2, p3 and p4 lead to a good
approximation.
It has been observed in Figures 5.2-5.4 that in each case the left sample quantiles
fall below the normal quantiles. A possible explanation for this is the inadequacy of the
simulation model. The simplified model may make it unrealistically easy to obtain a low
score.
From Figures 5.2-5.4, we find that the logarithms of simulated bowling scores are
approximately normally distributed with a constant variance. We have therefore confirmed
the proposed model of Chapter 4 by using a Fortran program to simulate bowling scores.
[Figure 5.2: The Results of p1=0.15, p2=0.15, p3=0.25, p4=0.45. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Simulated Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]

[Figure 5.3: The Results of p1=0.15, p2=0.15, p3=0.1, p4=0.6. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Simulated Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]

[Figure 5.4: The Results of p1=0.3, p2=0.3, p3=0.1, p4=0.3. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Simulated Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
Chapter 6
Handicap Systems
From preceding chapters we have concluded that the logarithm of bowling scores is approximately
normally distributed with a constant variance $\sigma^2 = 0.0328$. In this chapter we
will use Monte Carlo methods to investigate the effect of various handicap systems based
on the proposed model of Chapter 4 and then compare our results with the Remington
Rand study.
6.1 The Remington Rand Study
There are various handicap systems currently used in league and tournament play. Detailed
information about handicap systems is given in Section 2.1. The Remington
Rand study[8] processed over 100,000 league bowling scores and the results suggested that
the individual handicap system of 80% of the difference between the bowler's average and
a base figure of 225 is the fairest handicap system. We tried to get more information about
the Remington Rand study and their criteria of "fairest". Unfortunately we did not get
any reply from the Ontario 5 Pin Bowlers' Association on this matter. In the remainder
of this chapter, we use our model to investigate the effects of various handicap systems on
the probability of winning. By a fair handicap system we mean one which tries to "even
up" the chance of winning in a match between competitors of various strengths.
6.2 The Monte Carlo Study
Let $x_{ij}$ be the $j$th bowling score of player $i$ with mean $m_i$. In this way we can compare
bowlers of various abilities by changing the value of $m_i$. We know from our model in the
previous chapters that $y_{ij} = \log(x_{ij}) \sim N(u_i, \sigma^2)$ where $\sigma^2 = 0.0328$. Since $x_{ij} = e^{y_{ij}}$,
by completing the square it is easy to show that $m_i = E(x_{ij}) = E(e^{y_{ij}}) = e^{u_i + \sigma^2/2}$ and
therefore $u_i = \log(m_i) - \sigma^2/2$. We easily simulate bowling scores for player $i$ by generating
random variates from the $N(\log(m_i) - \sigma^2/2, \sigma^2)$ distribution with $\sigma^2 = 0.0328$ and then
exponentiating.
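The completing-the-square step can be written out as follows. This is the standard calculation of the mean of a log normal variable, sketched with $u_i$ and $\sigma^2$ as in the text; the integration variable $y$ is notation introduced here.

```latex
\begin{align*}
E(e^{y_{ij}})
  &= \int_{-\infty}^{\infty} e^{y} \,
     \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\left\{ -\frac{(y - u_i)^2}{2\sigma^2} \right\} dy \\
  &= e^{u_i + \sigma^2/2} \int_{-\infty}^{\infty}
     \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\left\{ -\frac{(y - u_i - \sigma^2)^2}{2\sigma^2} \right\} dy \\
  &= e^{u_i + \sigma^2/2},
\end{align*}
% using  y - (y - u_i)^2 / (2\sigma^2)
%      = u_i + \sigma^2/2 - (y - u_i - \sigma^2)^2 / (2\sigma^2),
% and the fact that the remaining integrand is a normal density.
```

With $\sigma^2 = 0.0328$ the correction term is $\sigma^2/2 = 0.0164$, so a bowler with average $m_i = 150$ has $u_i = \log(150) - 0.0164 \approx 4.994$.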
In more detail the method for generating bowling scores is as follows:
(1) Create average bowling scores $m_i$ and $m_j$ for player $i$ and player $j$. The values of $m_i$
and $m_j$ ($m_i \le m_j$) are between 100 and 240 to reflect realistic abilities. We consider
all possible combinations with averages ranging by 10 point intervals. By restricting
$m_i \le m_j$ we refer to player $i$ as the underdog and player $j$ as the favourite.
(2) For each pair of players we generate 10,000 bowling scores $x_{ik}$ and $x_{jk}$, $k = 1, 2, \ldots, 10000$,
according to the log normal distribution described above.
(3) We then add a handicap to both $x_{ik}$ and $x_{jk}$ based on the handicap system currently
under study and obtain the total game scores. We consider handicap systems 1, 2,
3 and 4 corresponding to the descriptions in Section 2.1.
(4) We then estimate the probability of the favourite defeating the underdog in a given
game by considering the fraction of the 10,000 games won by the favourite over the
underdog.
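Steps (1) to (4) can be sketched as below. Several things are assumptions made for this illustration: the function names, the fixed random seed, and the example handicap rule, which is the "80% of the difference between the bowler's average and a base figure of 225" rule quoted from the Remington Rand study, floored at zero for averages above 225. The exact definitions of handicap systems 1 to 4 are those of Section 2.1 and are not reproduced here, so any other rule can be passed in as the `handicap` argument. A score is obtained by exponentiating a normal variate with mean $\log(m) - \sigma^2/2$ and variance $\sigma^2 = 0.0328$.

```python
import math
import random

SIGMA2 = 0.0328  # constant variance of the log scores (Chapter 4)


def simulate_score(mean_score, rng):
    """Draw one log normal bowling score whose expected value is mean_score."""
    mu = math.log(mean_score) - SIGMA2 / 2.0
    return math.exp(rng.gauss(mu, math.sqrt(SIGMA2)))


def handicap_80_of_225(average):
    """Example rule: 80% of (225 - average); the floor at zero is an assumption."""
    return max(0.0, 0.8 * (225.0 - average))


def favourite_win_probability(m_underdog, m_favourite, handicap, n_games=10000, seed=1):
    """Estimate the probability that the favourite's total beats the underdog's."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        underdog_total = simulate_score(m_underdog, rng) + handicap(m_underdog)
        favourite_total = simulate_score(m_favourite, rng) + handicap(m_favourite)
        if favourite_total > underdog_total:
            wins += 1
    return wins / n_games
```

Calling `favourite_win_probability(100, 120, handicap_80_of_225)` gives an estimate for one pair of averages; looping over averages from 100 to 240 would fill in a table of estimates, each with a standard error of at most 0.005 when 10,000 games are used.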
6.3 The Results of Our Study
Table 6.1 lists the bowlers' averages and the probabilities of the favourite defeating the
underdog based on the four handicap systems. This has been done using 20 point intervals.

Table 6.1: Estimated Probabilities of the Favourite Winning

Underdog   Favourite   Handicap1   Handicap2   Handicap3   Handicap4
  100        100         0.50        0.50        0.50        0.50
  100        120         0.56        0.60        0.57        0.57
  100        140         0.60        0.68        0.63        0.63
  100        160         0.63        0.73        0.67        0.67
  100        180         0.67        0.78        0.71        0.61
  100        200         0.68        0.80        0.73        0.73
  100        220         0.70        0.90        0.86        0.75
  100        240         0.81        0.96        0.93        0.87
  120        120         0.50        0.50        0.50        0.50
  120        140         0.54        0.57        0.55        0.55
  120        160         0.58        0.64        0.60        0.60
  120        180         0.61        0.70        0.64        0.64
  120        200         0.64        0.73        0.67        0.67
  120        220         0.66        0.86        0.82        0.71
  120        240         0.77        0.93        0.90        0.83
  140        140         0.50        0.50        0.50        0.50
  140        160         0.54        0.57        0.55        0.55
  140        180         0.58        0.64        0.60        0.60
  140        200         0.59        0.67        0.62        0.62
  140        220         0.64        0.82        0.78        0.67
  140        240         0.74        0.90        0.87        0.79
  160        160         0.51        0.51        0.51        0.51
  160        180         0.53        0.56        0.54        0.54
  160        200         0.56        0.61        0.58        0.58
  160        220         0.59        0.76        0.73        0.61
  160        240         0.70        0.85        0.83        0.74
  180        180         0.50        0.50        0.50        0.50
  180        200         0.53        0.55        0.53        0.53
  180        220         0.55        0.70        0.68        0.56
  180        240         0.67        0.81        0.80        0.71
  200        200         0.49        0.49        0.49        0.49
  200        220         0.53        0.65        0.65        0.54
  200        240         0.64        0.76        0.76        0.67
  220        220         0.50        0.50        0.50        0.50
  220        240         0.61        0.64        0.64        0.64
  240        240         0.51        0.51        0.51        0.51

As we expected, from Table 6.1 we observe that the stronger player always has an
advantage under each of the four handicap systems. We see this as a good thing as it
offers incentive to improve one's bowling skills. On the other hand the advantage of the
stronger player should not be so great as to discourage the weaker player. From this point
of view Table 6.1 indicates that handicap system 1 may be the most preferable as the
advantage of the favourite over the underdog is not as dramatic as with handicap systems
2, 3 and 4. For example the favourite with an average of 220 has probability 0.54, 0.69,
0.67 and 0.55 of defeating the underdog with an average of 180 under handicap systems
1, 2, 3 and 4 respectively.
To gain a better understanding of Table 6.1 we present it graphically in Figure 6.1.
Figure 6.1 plots the probability of the favourite defeating the underdog
under the four handicap systems (H1, H2, H3 and H4) for an underdog with a fixed average.
Any lack of smoothness in the plots is due to errors in our estimates and should
be ignored. The standard error of each estimated probability is at most $\sqrt{0.5 \times 0.5/10000} = 0.005$.
The estimation errors can also be seen in Table 6.1, where the probability of the favourite
winning should always be 0.50 when $m_j = m_{j'}$, that is, when the two bowlers have equal averages.
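The bound on the standard error follows from the binomial variance of an estimated proportion. As a short worked derivation, write $\hat{p}$ for the fraction of the $n = 10{,}000$ simulated games won by the favourite and $p$ for the true winning probability (these symbols are introduced here for illustration only):

\[
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} \;\le\; \sqrt{\frac{0.5 \times 0.5}{10000}} = \frac{0.5}{100} = 0.005 ,
\]

since $p(1-p)$ is largest at $p = 0.5$. Two standard errors therefore amount to about 0.01, which is consistent with the entries of 0.49 and 0.51 that appear in the equal-average rows of Table 6.1.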
Figure 6.1 shows clearly that handicap system 1 is the fairest. This gives the same result
as the Remington Rand study: the individual handicap system of 80% of the difference
between the bowler's average and a base figure of 225 is the fairest handicap system to
use in league or tournament play.
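As a worked illustration of this rule, write $m$ for a bowler's average and $h$ for the resulting handicap (notation introduced here, not taken from the study):

\[
h = 0.8\,(225 - m), \qquad h(180) = 0.8 \times 45 = 36, \qquad h(220) = 0.8 \times 5 = 4 .
\]

If the handicap is added to each bowler's game score, which is the usual convention and is assumed here, the bowler averaging 180 receives 32 points more than the bowler averaging 220. This offsets only 80% of the 40 point gap between their averages, so the stronger bowler keeps a modest edge.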
[Figure 6.1 consists of four panels, one for each underdog average (100, 130, 160 and 190). Each panel plots the probability of the favourite winning against the average of the favourite for handicap systems H1 to H4.]
Figure 6.1: The Probability of the Favourite Winning the Game
Appendix A
The Data Set
Game Player
1 2 3 4 5 6 7 8 9 10
1 173 152 161 NA 215 125 NA 127 148 257
2 165 164 172 NA 200 103 NA 185 107 161
3 182 148 204 NA 210 128 NA 205 167 203
4 159 130 NA NA 182 133 NA 159 155 178
5 147 152 NA NA 141 88 NA 190 148 142
6 194 172 NA NA 176 122 NA 128 187 170
7 221 153 173 184 248 141 135 176 138 295
8 153 167 171 180 229 150 161 211 136 200
9 159 149 136 147 177 135 158 212 110 185
10 183 113 147 197 159 119 137 195 139 224
11 141 143 151 191 138 133 205 209 121 146
12 252 184 187 161 167 170 193 206 138 227
13 181 137 122 188 185 147 221 150 215 193
14 210 122 147 219 204 167 137 107 136 165
15 205 124 149 178 198 146 106 230 212 230
16 163 118 NA 149 164 146 147 193 126 162
17 189 162 NA 222 196 162 153 145 155 161
18 189 120 NA 200 184 154 123 160 143 176
19 234 139 198 139 257 146 129 205 126 147
20 205 146 126 213 260 116 143 106 135 199
21 179 126 162 190 234 172 149 192 132 188
22 199 130 166 187 173 167 147 191 202 163
23 238 161 162 220 201 133 226 182 156 255
24 133 161 169 230 214 209 157 176 218 211
25 199 130 166 187 173 167 147 191 202 163
26 238 161 162 220 201 133 226 182 156 255
27 133 161 169 230 214 209 157 176 218 211
28 223 125 NA 173 198 159 146 187 168 200
29 235 139 NA 202 195 116 163 188 134 164
30 239 125 NA 159 199 156 127 171 167 205
31 207 101 137 178 242 113 144 220 170 210
32 214 124 155 124 229 157 120 129 145 217
33 163 141 131 183 252 111 124 173 248 253
34 221 122 162 190 222 177 113 203 220 145
35 242 173 171 186 190 131 173 192 162 184
36 196 121 151 113 184 126 115 165 173 188
37 209 113 137 225 214 142 211 178 136 200
38 139 146 181 162 299 159 191 181 159 266
39 250 146 146 255 202 149 156 180 116 225
40 146 NA 146 196 236 142 152 200 184 235
41 256 NA 130 224 162 158 152 144 211 171
42 297 NA 151 197 180 144 136 144 168 205
43 224 101 148 235 187 124 179 194 219 211
44 188 177 148 165 229 147 165 160 156 228
45 166 187 203 190 246 135 138 182 217 190
46 235 119 187 223 162 141 193 NA 166 228
47 132 102 202 142 209 159 141 NA 185 200
48 194 121 164 204 148 132 167 NA 157 138
49 206 144 156 NA 172 135 149 138 132 154
50 232 140 178 NA 239 163 137 148 174 199
51 259 136 149 NA 195 113 130 173 118 175
52 178 119 154 148 285 152 142 234 145 185
53 181 ~2 169 195 205 119 213 231 193 157
54 156 105 140 183 234 176 143 220 155 229
55 146 96 182 165 208 120 124 192 134 172
56 212 142 126 178 213 155 141 241 149 216
57 196 163 185 184 208 232 207 177 169 240
Game Player
11 12 13 14 15 16 17 18 19 20
1 85 123 119 NA 170 125 114 NA 151 200
2 109 109 201 NA 153 147 71 NA 196 154
3 95 111 137 NA 157 153 102 NA 158 156
4 123 140 144 132 139 154 103 154 115 187
5 158 117 177 153 172 129 102 144 157 193
6 131 109 146 111 143 147 91 158 115 121
7 123 106 151 NA NA 103 103 205 128 162
8 153 160 188 NA NA 111 114 213 136 132
9 155 124 158 NA NA 112 94 217 159 135
10 84 137 148 154 136 140 99 171 186 171
11 114 152 132 149 115 190 168 144 122 151
12 119 136 143 148 123 190 124 137 237 159
13 135 118 173 180 123 154 136 158 136 184
14 158 112 157 163 141 151 126 153 181 138
15 112 135 156 132 213 135 141 145 197 121
16 110 157 135 108 136 113 130 188 190 170
17 110 117 125 184 126 146 87 174 164 163
18 129 166 156 122 154 125 138 138 139 183
19 126 119 170 119 150 158 145 118 121 132
20 109 124 179 142 144 151 120 168 195 129
21 123 124 154 147 158 162 132 175 164 176
22 108 150 157 160 NA 117 125 171 135 173
23 136 122 205 144 NA 151 133 155 183 131
24 115 194 176 171 NA 157 140 234 174 149
25 108 150 157 160 NA 117 125 171 135 173
26 136 122 205 144 NA 151 133 155 183 131
27 115 194 176 171 NA 157 140 234 174 149
28 115 182 132 124 145 174 156 148 148 223
29 111 128 161 130 125 145 142 118 194 126
30 108 75 135 125 136 146 146 153 155 110
31 104 148 190 124 131 NA 133 124 154 128
32 119 162 185 144 101 NA 155 147 132 163
33 107 96 115 207 177 NA 103 179 204 117
34 127 180 189 133 208 NA 158 142 152 105
35 132 139 169 143 128 NA 155 178 160 166
36 102 131 108 135 116 NA 218 135 180 221
37 NA 115 131 137 157 148 139 164 215 NA
38 NA 103 146 136 136 142 125 137 218 NA
39 NA 121 186 118 131 124 155 168 170 NA
40 113 121 195 182 143 134 101 138 134 NA
41 124 142 169 155 143 136 147 158 216 NA
42 129 146 156 135 176 163 126 206 185 NA
43 86 94 168 141 180 163 123 145 183 NA
44 111 146 164 112 152 167 114 136 119 NA
45 107 99 148 116 275 137 104 150 162 NA
46 NA 118 261 119 177 164 104 170 131 NA
47 NA 155 203 167 146 136 102 156 144 NA
48 NA 107 106 142 186 122 119 145 128 NA
49 96 157 142 134 130 116 155 142 NA NA
50 131 141 133 171 130 158 121 202 NA NA
51 136 162 172 153 192 125 102 228 NA NA
52 111 108 NA 146 104 115 105 163 135 NA
53 132 113 NA 125 155 161 196 147 112 NA
54 151 113 NA 132 214 178 80 190 175 NA
55 109 97 188 144 160 135 111 188 NA 189
56 128 122 239 133 183 128 124 164 NA 148
57 138 128 150 117 153 136 131 136 NA 141
Game Player
21 22 23 24 25 26 27 28 29 30
1 NA NA NA NA NA 162 114 NA NA 164
2 NA NA NA NA NA 172 105 NA NA 186
3 NA NA NA NA NA 156 108 NA NA 208
4 148 187 NA 153 152 177 92 78 234 153
5 172 176 NA 133 145 144 119 90 211 186
6 152 240 NA 191 187 130 110 121 188 148
7 174 155 212 174 151 151 130 112 221 140
8 181 227 260 156 216 131 206 85 200 183
9 160 232 199 223 222 153 221 152 263 169
10 163 204 162 181 182 140 129 103 237 109
11 177 246 233 163 175 175 154 123 191 157
12 190 289 141 133 211 161 179 119 169 152
13 172 188 218 135 164 112 123 104 246 161
14 180 182 168 174 171 133 144 100 186 164
15 211 184 197 167 205 185 139 126 188 145
16 157 205 192 162 212 165 132 NA 148 154
17 176 179 161 148 201 156 212 NA 191 138
18 212 261 228 182 241 142 165 NA 281 134
19 201 178 234 191 204 217 107 151 223 145
20 207 153 142 202 156 129 140 145 200 146
21 233 276 152 210 169 152 160 118 229 200
22 166 221 210 192 181 145 176 123 228 135
23 210 193 204 140 158 165 161 111 187 177
24 237 151 161 212 177 142 135 124 230 147
25 166 221 210 192 181 145 176 123 228 135
26 210 193 204 140 158 165 161 111 187 177
27 237 151 161 212 177 142 135 124 230 147
28 195 236 215 171 159 134 132 82 273 141
29 152 213 186 158 199 128 166 146 246 134
30 233 263 155 149 207 216 142 101 242 161
31 NA 183 149 151 196 136 162 117 204 138
32 NA 162 200 231 140 153 152 115 128 166
33 NA 179 225 226 220 161 199 133 183 186
34 NA 189 209 212 246 123 132 95 179 169
35 NA 179 226 141 183 144 127 169 214 175
36 NA 185 164 187 189 222 123 142 146 157
37 244 NA 131 254 271 195 127 114 158 216
38 189 NA 187 172 156 190 113 111 209 154
39 223 NA 226 193 272 186 132 96 135 181
40 233 179 NA 204 329 167 155 81 254 156
41 216 162 NA 225 182 176 127 76 226 128
42 185 199 NA 153 298 201 134 136 203 147
43 167 208 212 195 NA 170 125 107 156 134
44 193 180 157 166 NA 166 128 131 218 192
45 182 210 208 137 NA 121 150 138 212 183
46 142 253 210 180 215 157 128 NA 231 201
47 215 238 237 147 149 169 147 NA 211 169
48 189 218 161 128 205 147 142 NA 190 190
49 129 218 143 168 184 209 126 109 174 174
50 185 165 175 219 243 163 149 221 239 163
51 235 237 155 180 184 124 123 128 186 138
52 162 152 NA 183 171 131 117 NA 198 118
53 140 243 NA 202 158 126 133 NA 235 217
54 263 203 NA 176 200 122 121 NA 141 194
55 216 250 170 198 240 140 125 150 172 172
56 185 192 157 217 277 187 129 119 163 163
57 179 188 180 229 165 178 132 137 184 171
Game Player
31 32 33 34 35 36 37 38 39 40
1 NA NA 221 NA NA 159 119 176 NA NA
2 NA NA 165 NA NA 197 181 146 NA NA
3 NA NA 189 NA NA 171 164 177 NA NA
4 167 125 138 138 NA 193 148 294 NA NA
5 139 133 198 147 NA 190 110 128 NA NA
6 131 127 158 224 NA 134 113 174 NA NA
7 173 122 96 181 NA 153 114 168 119 158
8 129 114 100 177 NA 134 107 151 137 157
9 127 150 127 163 NA 178 118 101 95 126
10 184 125 219 119 115 140 112 141 110 132
11 128 153 135 158 156 140 139 152 140 218
12 151 103 161 141 138 179 91 207 107 95
13 NA 134 137 147 132 181 146 138 132 156
14 NA 182 128 128 128 144 162 195 131 195
15 NA 179 196 190 138 155 106 233 165 132
16 158 168 109 176 81 183 NA 213 166 155
17 173 141 148 120 71 132 NA 153 153 112
18 159 141 131 194 90 150 NA 128 123 158
19 170 135 213 151 107 147 128 147 155 212
20 185 149 125 166 191 204 115 160 124 148
21 185 163 153 168 120 183 115 165 158 137
22 174 134 107 124 106 106 126 193 183 103
23 199 187 120 200 125 218 83 111 206 155
24 196 131 166 203 156 165 169 132 157 120
25 174 134 107 124 106 106 126 193 183 103
26 199 187 120 200 125 218 83 111 206 155
27 196 131 166 203 156 165 169 132 157 120
28 115 119 117 125 136 191 103 118 131 125
29 146 153 165 150 108 194 113 169 118 145
30 191 106 131 208 150 176 111 163 139 138
31 121 169 134 212 119 132 102 153 151 191
32 149 185 158 132 95 131 155 191 105 189
33 184 133 188 198 101 180 199 191 105 150
34 147 139 144 172 118 224 111 144 134 187
35 211 133 136 172 119 181 84 146 168 150
36 135 199 121 229 155 124 114 154 136 154
37 156 137 186 122 141 152 90 162 149 148
38 163 145 154 188 188 167 121 174 145 159
39 128 139 126 118 127 198 129 174 141 142
40 140 145 203 154 165 186 142 185 146 195
41 148 141 182 142 164 170 152 242 160 154
42 166 145 154 155 149 227 124 222 119 166
43 148 166 140 214 162 176 95 136 128 210
44 160 176 143 121 138 167 125 181 118 220
45 159 155 160 157 126 146 115 134 93 137
46 169 157 115 162 147 175 179 168 NA 165
47 214 202 169 142 145 208 150 159 NA 193
48 225 138 173 208 177 166 120 191 NA 202
49 179 99 146 184 191 174 105 202 NA 195
50 112 106 134 189 143 230 181 159 NA 142
51 187 161 209 170 138 186 121 203 NA 138
52 144 117 136 135 200 211 129 232 NA 206
53 155 137 124 141 104 150 186 155 NA 188
54 179 177 163 147 159 227 162 206 NA 214
55 141 149 187 170 152 151 107 120 119 201
56 169 120 190 156 124 176 121 187 148 144
57 185 119 121 154 174 180 127 206 161 166
Appendix B
A Program which Simulates
Bowling Scores
c This is a program which simulates bowling scores
c**********************************************************************
c Simplifying Assumptions:
c (1) no curve on ball
c (2) aim straight on
c (3) equal accuracy to left .or. right
c (4) no learning effect (balls are independent)
c (5) misses target by at most 1 pin
c*********************************************************************
program main
parameter (n=10000)
dimension gscore(n),fscore(10),p(4),t(3)
real p,mscore,sscore
double precision drand,pp
integer gscore,fscore,tstrike,tspare,i,j,t,k
open(8,file='output')
mscore=0
sscore=0
read(*,*) (p(i),i=1,4)
p(2)=p(1)+p(2)
p(3)=p(2)+p(3)
p(4)=p(3)+p(4)
c Warn when the cumulative probability p(4) differs from 1
if (p(4) .lt. .99999999 .or. p(4) .gt. 1.00000001) then
print*, 'NOT proper selection of probability'
endif
do 99 i=1,n
do 20 k=1,10
20 fscore(k)=0
t(1)=0
t(2)=0
t(3)=0
tstrike=0
tspare=0
do 88 j=1,10
40 continue
pp=drand(0)
c This is a strike
if (pp .le. p(1)) then
fscore(j)=fscore(j)+15
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+15
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+15
t(3)=t(3)+1
endif
t(1)=t(1)+1
tstrike=tstrike+1
c This is a corner
elseif (pp .gt. p(1) .and. pp .le. p(2)) then
fscore(j)=fscore(j)+13
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+13
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+13
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
call corner(fscore,t,tspare,tstrike,p,j)
else
goto 30
endif
c This is a headpin
else if (pp .gt. p(2) .and. pp .le. p(3)) then
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+5
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
call headpin(fscore,t,tspare,tstrike,p,j)
else
goto 30
endif
c This is a 3-2
elseif( pp .gt. p(3) .and. pp .le. 1) then
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+5
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
call bov32(fscore,t,tspare,tstrike,p,j)
else
goto 30
endif
endif
30 if (t(1) .ge. 3) then
t(1)=t(2)
t(2)=t(3)
t(3)=0
tstrike=0
elseif(t(1) .lt. 3) then
goto 40
endif
88 continue
gscore(i)=O
do 12 j=1,10
12 gscore(i)=gscore(i)+fscore(j)
99 continue
write(8,101) (gscore(i),i=1,n)
do 19 i=1,n
19 mscore=mscore+gscore(i)
mscore=mscore/n
do 29 i=1,n
29 sscore=sscore+(gscore(i)-mscore)**2
sscore=sscore/(n-1)
write(*,*) mscore,sscore
101 format(1x,40i5)
stop
end
c This subroutine calculates the outcomes of a corner pin
subroutine corner(fscore,t,tspare,tstrike,p,j)
dimension fscore(10),p(4),t(3)
real p
double precision drand,pp
integer fscore,tstrike,tspare,j,t
10 pp=drand(0)
if( pp .le. p(3)) then
fscore(j)=fscore(j)+2
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+2
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+2
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
tspare=tspare+1
endif
elseif (pp .gt. p(3) .and. pp .le. 1) then
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
endif
if( tstrike .eq. 2 .and. j .le. 8) then
t(3)=t(3)+1
endif
t(1)=t(1)+1
if ( t(1) .lt. 3) then
goto 10
endif
endif
return
end
c This subroutine calculates the outcomes of a headpin
subroutine headpin(fscore,t,tspare,tstrike,p,j)
dimension fscore(10),p(4),t(3)
real p
double precision drand,pp
integer fscore,tstrike,tspare,j,t
10 pp=drand(0)
if( pp .le. p(2)) then
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+5
t(3)=t(3)+1
endif
t(1)=t(1)+1
if (t(1) .lt. 3) then
call hp32(fscore,t,tspare,tstrike,p,j)
endif
elseif (pp .gt. p(2) .and. pp .le. p(3)) then
fscore(j)=fscore(j)+3
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+3
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+3
t(3)=t(3)+1
endif
t(1)=t(1)+1
if (t(1) .lt. 3) then
call hp32(fscore,t,tspare,tstrike,p,j)
endif
elseif (pp .gt. p(3) .and. pp .lt. (1+p(3))/2) then
fscore(j)=fscore(j)+2
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+2
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+2
t(3)=t(3)+1
endif
t(1)=t(1)+1
if (t(1) .lt. 3) then
call hp32(fscore,t,tspare,tstrike,p,j)
endif
elseif (pp .gt. (p(3)+p(4))/2 ) then
t(1)=t(1)+1
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
endif
if( tstrike .eq. 2 .and. j .le. 8) then
t(3)=t(3)+1
endif
if(t(1) .lt. 3) then
goto 10
endif
endif
return
end
c This subroutine calculates the outcomes of a 3-2
subroutine bov32(fscore,t,tspare,tstrike,p,j)
dimension fscore(10),p(4),t(3)
real p
double precision drand,pp
integer fscore,tstrike,tspare,j,t
10 pp=drand(0)
if(pp .lt. (p(1)+p(2))/2) then
fscore(j)=fscore(j)+10
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+10
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+10
t(3)=t(3)+1
endif
t(1)=t(1)+1
if (t(1) .le. 3) then
tspare=tspare+1
endif
return
elseif (pp .gt. (p(1)+p(2))/2 .and. pp .le. p(2)) then
fscore(j)=fscore(j)+8
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+8
endif
continue
if( tstrike.eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+8
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
call corner(fscore,t,tspare,tstrike,p,j)
endif
return
elseif ( pp .gt. p(2) .and. pp .le. p(3)) then
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
continue
if( tstrike.eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+5
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
call hp32(fscore,t,tspare,tstrike,p,j)
endif
return
elseif (pp .gt. p(3) .and. pp .le. (p(3)+p(4))/2) then
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
endif
if( tstrike .eq. 2 .and. j .le. 8) then
t(3)=t(3)+1
endif
t(1)=t(1)+1
if ( t(1) .lt. 3) then
goto 10
endif
else
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
continue
if( tstrike.eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+5
t(3)=t(3)+1
endif
t(1)=t(1)+1
if(t(1) .lt. 3) then
call b3232(fscore,t,tspare,tstrike,p,j)
endif
endif
return
end
c This subroutine calculates the outcomes of head pin and a 3-2
subroutine hp32(fscore,t,tspare,tstrike,p,j)
dimension fscore(10),p(4),t(3)
real p
double precision drand,pp
integer fscore,tstrike,tspare,j,t
10 pp=drand(O)
if( pp .le. p(2)) then
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
continue
if( tstrike.eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+5
t(3)=t(3)+1
endif
t(1)=t(1)+1
return
elseif ( pp .gt. p(2) .and. pp .le. p(3)) then
fscore(j)=fscore(j)+3
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+3
endif
continue
if( tstrike.eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+3
t(3)=t(3)+1
endif
t(1)=t(1)+1
elseif ( pp .gt. p(3) .and. pp .le. (p(3)+p(4))/2) then
fscore(j)=fscore(j)+2
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+2
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
fscore(j+2)=fscore(j+2)+2
t(3)=t(3)+1
endif
t(1)=t(1)+1
else
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
endif
continue
if( tstrike .eq. 2 .and. j .le. 8) then
t(3)=t(3)+1
endif
t(1)=t(1)+1
endif
return
end
c This subroutine calculates the outcomes of a 3-2 on both sides
subroutine b3232(fscore,t,tspare,tstrike,p,j)
dimension fscore(10),p(4),t(3)
real p
double precision drand,pp
integer fscore,tstrike,tspare,j,t
pp=drand(0)
if (pp .le. p(3)) then
fscore(j)=fscore(j)+5
if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
fscore(j+1)=fscore(j+1)+5
endif
t(1)=t(1)+1
return
elseif ( pp .gt. p(3) .and. pp .le.p(4)) then
t(1)=t(1)+1
if((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
t(2)=t(2)+1
endif
return
endif
return
end
Bibliography
[1] D'Agostino, R. B. and Stephens, M. A., Goodness of Fit Techniques, New York,
Marcel Dekker, 1986.
[2] Daniel, Wayne W., Applied Nonparametric Statistics, Second Edition, Boston,
PWS-Kent, 1990.
[3] Box, G. E. P., Hunter, W. G. and Hunter, J. S., Statistics for Experimenters, New
York, John Wiley & Sons, 1978.
[4] Neter, J., Wasserman, W. and Kutner, M. H., Applied Linear Statistical Models,
Second Edition, Homewood, Richard D. Irwin, 1985.
[5] Pierce, D. A. and Kopecky, R. J., Testing goodness of fit for the distribution of
errors in regression models, Technical Report Symp. 16, Department of Statistics,
Stanford University, 1978.
[6] Huynh, H., Some Approximate Tests for Repeated Measurement Designs,
Psychometrika, Vol. 43, No. 2, June, 1978, 161-175.
[7] Let's Go Bowling, Canadian 5 Pin Bowlers' Association.
[8] Official Rules and Regulations Governing The Sport of 5 Pin Bowling, Canadian
5 Pin Bowlers' Association, 1987.
[9] 5 Pin Bowling Specifications and Standards Manual, Canadian 5 Pin Bowlers'
Association, 1987.
[10] Royall, R. M., The effect of sample size on the meaning of significance tests, The
American Statistician, Vol. 40, No. 4, November, 1986, 313-315.