A REVIEW OF OCCUPANCY PROBLEMS AND THEIR APPLICATIONS WITH A MATLAB DEMO
Samuel Khuvis, Undergraduate
Nagaraj Neerchal, Professor of Statistics
Department of Mathematics and Statistics, University of Maryland Baltimore County,
1000 Hilltop Circle, Baltimore, MD 21250
Abstract
Consider an experiment of randomly distributing r balls into n cells. One can conceive of several easily described probability problems related to this experiment. Obtaining the probability that no two adjacent cells are empty, finding the distribution of the number of balls occupying a given cell, and deriving the distribution of the smallest number of balls over all cells are a few examples of such problems, which are collectively referred to as occupancy problems. Solutions to some of these problems are non-trivial, and some naturally give rise to well-known probability distributions such as the binomial and multinomial distributions.
Occupancy problems have found important applications in many areas. Bose-Einstein and Fermi-Dirac statistics are the most celebrated examples of such applications. More recently, questions from genetics involving non-randomness of occurrence of mutagen-induced mutations across loci have also been connected to this general topic. In this poster, we provide a glimpse of the probability calculations underlying occupancy problems and demonstrate them using an interactive MATLAB program.
Examples
Applications
Fig. 1: Screenshot of the MATLAB demo used to visualize occupancy problems. Six different operations may be selected on the right.
Fig. 2: Four realizations (panels A-D) generated by the MATLAB demo of an experiment in which 5 balls are thrown into 4 cells.
Basic Calculations Concerning the Occupancy Problem
$|S| = n \cdot n \cdots n = n^r$ (r factors, one for each ball)
In the program, realizations were generated by:
For i = 1 to number of balls
    Generate a random integer from 1 to the number of cells, with each
    value equally likely to occur
End
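The loop above can be sketched in Python. The poster's demo is written in MATLAB and its source is not shown here, so this is only an illustration: the function name `throw_balls` and the optional seeded generator are assumptions, not part of the original program.

```python
import random

def throw_balls(r, n, rng=None):
    """Throw r balls into n cells; return the number of balls in each cell."""
    if rng is None:
        rng = random.Random()
    counts = [0] * n
    for _ in range(r):
        cell = rng.randint(1, n)   # uniform integer in 1..n, both ends included
        counts[cell - 1] += 1      # cells are numbered from 1, list indices from 0
    return counts

# One realization of 5 balls in 4 cells, as in Fig. 2
# (the seed is fixed only so the run is repeatable)
print(throw_balls(5, 4, random.Random(0)))
```

Each call produces one realization; the returned list always sums to r.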
Conclusion
This has only been a basic introduction to occupancy
problems and there are many other calculations that
may be done based on the experiment of throwing r
balls into n cells.
These problems have many applications in the natural
sciences, especially in physics. More complex
calculations are able to explain the behaviors of
elementary particles. Using the simulation method, we
can begin to understand the probability distributions
which arise from these models.
Binomial Distributions
-A binomial distribution describes the results of an experiment in which:
1. There is a sequence of n trials, where n is fixed in advance
2. Each trial results in one of two possible outcomes, denoted either a success or a failure
3. The trials are independent, so the outcome of any particular trial does not influence the outcome of any other trial
4. The probability of success (p) is constant from trial to trial
-The probability of x successes is $P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}$, for x = 0, 1, …, n
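Binomial Distribution of 1st Cell
T = number of balls in cell 1, where T is a random variable
t = 0, 1, 2, …, r, where r is the number of balls being thrown
$\{T = t\} = \{(x_1, \ldots, x_r)$ : exactly t of the $x_i$'s are 1 and the others are not 1$\}$, where $x_i$ is the cell that receives ball i
Number of outcomes in $\{T = t\}$ = $\binom{r}{t} (n-1)^{r-t}$
Dividing by the $n^r$ equally likely outcomes, $P(T = t) = \binom{r}{t} \left(\frac{1}{n}\right)^t \left(1 - \frac{1}{n}\right)^{r-t}$
So, T has a binomial distribution: T ~ Bin(r, 1/n)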
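The counting argument for cell 1 can be checked on a small case. The Python sketch below is not taken from the poster's demo, and its function names are illustrative. It lists all n^r equally likely outcomes, tallies how many put exactly t balls in cell 1, and compares the result with the Bin(r, 1/n) probabilities.

```python
from itertools import product
from math import comb

def first_cell_pmf_by_enumeration(r, n):
    """P(T = t) for t = 0..r, found by listing all n**r equally likely outcomes."""
    counts = [0] * (r + 1)
    for outcome in product(range(1, n + 1), repeat=r):  # outcome[i] = cell of ball i
        counts[outcome.count(1)] += 1                   # balls that landed in cell 1
    return [c / n**r for c in counts]

def binomial_pmf(r, p):
    """Bin(r, p) probabilities for t = 0..r."""
    return [comb(r, t) * p**t * (1 - p)**(r - t) for t in range(r + 1)]

r, n = 5, 4
exact = first_cell_pmf_by_enumeration(r, n)
formula = binomial_pmf(r, 1 / n)
for t in range(r + 1):
    print(t, round(exact[t], 6), round(formula[t], 6))
```

Enumeration is only practical for small r and n, since the number of outcomes grows as n^r; for larger cases the formula or a simulation is the realistic route.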
Multinomial Distribution
-A multinomial distribution is similar to a binomial distribution, except that each trial has more than 2 possible outcomes instead of exactly 2
-Let $T_1$ = number of balls in cell 1 and $T_2$ = number of balls in cell 2
-The third outcome is a ball going into a cell other than cell 1 or cell 2
-$(T_1, T_2)$ ~ multinomial, with r trials and outcome probabilities 1/n, 1/n, and (n-2)/n
Minimum Calculations
Y = minimum number of balls occurring in any cell
So, Y = 0 exactly when at least one cell is empty.
For Y > 0, it is non-trivial to calculate P(Y = y) without the use of a simulation.
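For the case Y = 0, the classical inclusion-exclusion formula for the probability that no cell is empty gives an exact answer, and the remaining values of Y can be estimated by simulation. The Python sketch below illustrates both. It is not the poster's MATLAB code, and the function names, the seed, and the choice of 10,000 realizations with 50 balls and 25 cells (matching the demo's Output 4) are illustrative assumptions.

```python
import random
from math import comb

def prob_some_cell_empty(r, n):
    """P(Y = 0): probability that at least one of n cells receives none of r balls."""
    # Inclusion-exclusion: P(no cell empty) = sum_k (-1)^k C(n, k) (1 - k/n)^r
    p_none_empty = sum((-1) ** k * comb(n, k) * (1 - k / n) ** r for k in range(n + 1))
    return 1 - p_none_empty

def simulate_min_distribution(r, n, reps, rng):
    """Empirical distribution of Y over reps realizations, as {y: relative frequency}."""
    freq = {}
    for _rep in range(reps):
        counts = [0] * n
        for _ball in range(r):
            counts[rng.randrange(n)] += 1   # each cell equally likely
        y = min(counts)
        freq[y] = freq.get(y, 0) + 1
    return {y: c / reps for y, c in sorted(freq.items())}

rng = random.Random(2024)  # seed fixed only so the run is repeatable
print(prob_some_cell_empty(50, 25))
print(simulate_min_distribution(50, 25, 10_000, rng))
```

The simulated frequency of Y = 0 should sit close to the exact value, which gives a quick sanity check on the simulation before trusting its estimates for Y > 0.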
Statistical Mechanics
-We have r indistinguishable particles and a phase space subdivided into n small regions, or cells, with the particles randomly distributed among these cells
-It would seem that all arrangements are equally probable; however, physicists have shown that this is not the case. Instead, there are two statistics to describe the behavior of particles:
-Fermi-Dirac Statistics
-In this model, no two particles may be in the same cell and all distinguishable arrangements have equal probabilities
-This means that r ≤ n, so any of the arrangements can be chosen by randomly selecting which r cells contain a particle. Each arrangement has a probability of $1/\binom{n}{r}$. This model describes the behavior of electrons, protons, and neutrons.
-Bose-Einstein Statistics
-In this model, each distinguishable arrangement is given a probability of $1/\binom{n+r-1}{r}$
-This has been shown experimentally to describe the behavior of photons, nuclei, and atoms that have an even number of elementary particles
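The arrangement counts behind these probabilities can be confirmed by brute force for small r and n. In the Python sketch below (illustrative only, with assumed function and variable names), a Bose-Einstein arrangement is treated as a multiset of r cells, a Fermi-Dirac arrangement as a set of r distinct cells, and the distinguishable-ball case as an ordered tuple of cells.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb

def count_arrangements(r, n):
    """Count arrangements of r particles in n cells under each model, by enumeration."""
    cells = range(n)
    distinguishable = sum(1 for _ in product(cells, repeat=r))               # n**r
    bose_einstein = sum(1 for _ in combinations_with_replacement(cells, r))  # C(n+r-1, r)
    fermi_dirac = sum(1 for _ in combinations(cells, r))                     # C(n, r)
    return distinguishable, bose_einstein, fermi_dirac

r, n = 3, 5
print(count_arrangements(r, n))                 # → (125, 35, 10)
print(n**r, comb(n + r - 1, r), comb(n, r))     # → 125 35 10
```

Under each model the probability of a single arrangement is the reciprocal of the matching count, for example 1/35 for Bose-Einstein and 1/10 for Fermi-Dirac with r = 3 and n = 5.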
Population Genetics
-Since genetic data are often analyzed as categorical observations, computing the expected frequencies under different genetic models can be described as an occupancy problem
-These calculations are important in genetics when testing the non-randomness of mutagen-induced mutations across loci.
-The occupancy problem is applied to these analyses to address the problem of an inadequate sample size combinatorially.
-In this application, r is the size of the random sample and n is the number of classes being analyzed in the sample
MATLAB Demo
Function 1: Can generate one realization at a time for a certain number of balls and cells.
Function 2: Can simulate a large number of realizations and empirically compute probabilities.
Function 3: Allows the user to change the number of balls and cells.
Output 1: One arrangement of 50 balls and 25 cells
Output 2: Randomly generates birthdays of 50 people
Output 3: Displays the number of balls in each cell over 1,000 realizations
Output 4: Displays the minimum number of balls in any cell for each of 10,000 realizations, where each arrangement has the 50 balls and 25 cells selected by the user
Output 5: Displays the distribution of the number of balls in the first cell over 1,000 realizations
Output 6: Shows how many days share a given number of birthdays
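The birthday outputs are an occupancy experiment in which people play the role of balls and days play the role of cells. The Python sketch below shows one way such a realization and its summary could be produced. It is not the demo's MATLAB code: the function names are hypothetical, and it assumes 365 equally likely days with leap years ignored.

```python
import random
from collections import Counter

def birthday_realization(people, days=365, rng=None):
    """Assign each person a uniformly random birthday; return the births on each day."""
    if rng is None:
        rng = random.Random()
    per_day = [0] * days
    for _ in range(people):
        per_day[rng.randrange(days)] += 1   # assumption: every day equally likely
    return per_day

def days_by_birth_count(per_day):
    """Map k -> number of days on which exactly k of the people were born."""
    return dict(sorted(Counter(per_day).items()))

# 50 people, as in the demo's birthday output (seed fixed only for repeatability)
per_day = birthday_realization(50, rng=random.Random(7))
print(days_by_birth_count(per_day))
```

Any key of 2 or more in the printed table means that at least two people in that realization share a birthday.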
Acknowledgments: I would like to thank Andrew Raim and all of the members of CIRC for their
help.
References
Feller, William. An Introduction to Probability Theory and Its Applications. New York: John Wiley & Sons, 1950.
Chakraborty, Ranajit. “A Class of Population Genetic Questions Formulated as the Generalized Occupancy Problem.” Genetics Society of America (1993) 953-958.