Input Modeling Modeling and Simulation CS 313 1. Poisson distribution 2 It is a discrete...

Input Modeling

Modeling and Simulation

CS 313

Poisson distribution

It is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space, if these events occur with a known average rate and independently of the time since the last event.

Example: Suppose you typically get 4 pieces of mail per day. That becomes your expectation, but there will be a certain spread: sometimes a little more, sometimes a little less, and once in a while nothing at all.

Given only the average rate for a certain period of observation (pieces of mail per day), the Poisson distribution tells you how likely it is that you will get 3, or 5, or 11, or any other number, during one period of observation.

Poisson distribution

The distribution equation: If the expected number of occurrences in a given interval is λ, then the probability that there are exactly k occurrences (k being a non-negative integer, k = 0, 1, 2, ...) is equal to:

$$f(k; \lambda) = P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$

Where:

- e is the base of the natural logarithm (e = 2.71828...)
- k is the number of occurrences of an event, the probability of which is given by the function (the value of the random variable)
- k! is the factorial of k
- λ is a positive real number, equal to the expected number of occurrences during the given interval (the average rate)

For instance, if the events occur on average 4 times per minute, and one is interested in the probability of an event occurring k times in a 10-minute interval, one would use a Poisson distribution as the model with λ = 10 × 4 = 40.
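The Poisson formula can be evaluated directly. The following is a minimal Python sketch, not part of the original slides; the function name `poisson_pmf` and the chosen values of k are illustrative assumptions. It uses the mail example (λ = 4 per day) and the 10-minute example (λ = 40).

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for a Poisson distribution with mean lam."""
    if k < 0:
        return 0.0
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Mail example: an average of 4 pieces of mail per day.
for k in (0, 3, 5, 11):
    print(f"P(X = {k:2d} | lam = 4)  = {poisson_pmf(k, 4.0):.4f}")

# Rate example: 4 events per minute observed for 10 minutes gives lam = 40.
lam_10_min = 10 * 4
print(f"P(X = 40 | lam = {lam_10_min}) = {poisson_pmf(40, lam_10_min):.4f}")
```

Only the average rate λ is needed; the probability of any count k then follows from the formula.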

Poisson distribution example

Suppose 2 customers arrived at an office today, and take this as the average arrival rate, λ = 2 customers per day. Calculate the probability that exactly 3 customers arrive tomorrow.

Step 1: Find $e^{-\lambda}$, where λ = 2 and e = 2.718:
$e^{-\lambda} = (2.718)^{-2} = 0.135$

Step 2: Find $\lambda^x$, where λ = 2 and x = 3:
$\lambda^x = 2^3 = 8$

Step 3: Find f(x):
$f(x) = e^{-\lambda} \lambda^x / x!$
$f(3) = (0.135)(8) / 3! = 0.18$

Hence there is an 18% probability that exactly 3 customers arrive tomorrow.
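The three steps can be checked numerically. This is a small Python sketch added for illustration, with variable names of our own choosing; it uses `math.exp` instead of the rounded constant e = 2.718, so intermediate values may differ slightly in later decimal places.

```python
import math

lam = 2.0  # average arrivals per day, taken from today's count
x = 3      # number of arrivals of interest

exp_term = math.exp(-lam)                         # Step 1: e^(-lam)
power_term = lam ** x                             # Step 2: lam^x
prob = exp_term * power_term / math.factorial(x)  # Step 3: f(x)

print(f"e^-lam = {exp_term:.3f}")
print(f"lam^x  = {power_term:.0f}")
print(f"f(3)   = {prob:.2f}")
```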

Exponential distribution

It is a family of continuous probability distributions. It describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate.

The probability density function (pdf) of an exponential distribution is

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Where:

- λ is the parameter of the distribution (the rate, λ > 0)
- x is the random variable
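To connect the pdf to simulation input, the sketch below evaluates the density and draws inter-arrival times by inverse-transform sampling. It is an illustrative Python example, not from the slides: the function names, the rate λ = 4, the seed, and the sample size are all assumptions.

```python
import math
import random


def exponential_pdf(x: float, lam: float) -> float:
    """Density lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def sample_exponential(lam: float, rng: random.Random) -> float:
    """Inverse-transform sample: solve U = 1 - exp(-lam * x) for x."""
    u = rng.random()                 # uniform on [0, 1)
    return -math.log(1.0 - u) / lam  # 1 - u lies in (0, 1], so log is defined


rng = random.Random(313)  # fixed seed so the run is repeatable
lam = 4.0                 # assumed rate: 4 events per unit time
samples = [sample_exponential(lam, rng) for _ in range(10_000)]
sample_mean = sum(samples) / len(samples)

print(f"pdf at x = 0.5:      {exponential_pdf(0.5, lam):.4f}")
print(f"theoretical mean:    {1 / lam:.4f}")
print(f"mean of the samples: {sample_mean:.4f}")
```

The sample mean should land close to 1/λ, the mean time between events in the corresponding Poisson process.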

Binomial distribution

Definition:

The Binomial distribution is one of the discrete probability distributions. It is used when there are exactly two mutually exclusive outcomes of a trial. These outcomes are appropriately labeled Success and Failure.

The Binomial distribution is used to obtain the probability of observing r successes in n trials, with the probability of success on a single trial denoted by p.

Function: $P(X = r) = {}^nC_r \, p^r (1-p)^{n-r}$

Where:

- n = number of trials
- r = number of successful trials
- p = probability of success on a single trial
- 1-p = probability of failure on a single trial
- ${}^nC_r = \dfrac{n!}{(n-r)! \, r!}$

Binomial distribution example

Toss a coin 12 times. What is the probability of getting exactly 7 heads?

Step 1: Number of trials n = 12. Number of successes r = 7, since we define getting a head as success. Probability of success on any single trial p = 0.5.

Step 2: Calculate ${}^nC_r$:
${}^nC_r = ( n! / (n-r)! ) / r!$
= ( 12! / (12-7)! ) / 7!
= ( 12! / 5! ) / 7!
= ( 479001600 / 120 ) / 5040
= 3991680 / 5040
= 792

Step 3: Find $p^r$:
$p^r = 0.5^7 = 0.0078125$

Step 4: To find $(1-p)^{n-r}$, calculate 1-p and n-r:
1-p = 1-0.5 = 0.5
n-r = 12-7 = 5

Step 5: Find $(1-p)^{n-r}$:
$(1-p)^{n-r} = 0.5^5 = 0.03125$

Step 6: Solve $P(X = r) = {}^nC_r \, p^r (1-p)^{n-r}$:
= 792 * 0.0078125 * 0.03125
= 0.193359375

The probability of getting exactly 7 heads is 0.19.
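The six steps translate directly into code. The sketch below is an illustrative Python check, not part of the slides; `binomial_pmf` is a name chosen here, and `math.comb` from the standard library computes the binomial coefficient.

```python
import math


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r): probability of r successes in n independent trials."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)


n, r, p = 12, 7, 0.5            # Step 1: trials, successes, success probability

n_choose_r = math.comb(n, r)    # Step 2: number of ways to choose r of n trials
p_to_r = p ** r                 # Step 3: p^r
q_to_rest = (1 - p) ** (n - r)  # Steps 4 and 5: (1-p)^(n-r)

print(f"nCr           = {n_choose_r}")
print(f"p^r           = {p_to_r}")
print(f"(1-p)^(n-r)   = {q_to_rest}")
print(f"P(X = {r})      = {binomial_pmf(r, n, p):.4f}")  # Step 6
```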

Goodness-of-Fit Tests

Conduct hypothesis testing on the input data distribution using:

- Kolmogorov-Smirnov test
- Chi-square test

Goodness-of-fit tests provide helpful guidance for evaluating the suitability of a potential input model.

- No single correct distribution exists in a real application.
- If very little data is available, a test is unlikely to reject any candidate distribution.
- If a lot of data is available, a test is likely to reject all candidate distributions.

Chi-Square test

Intuition: comparing the histogram of the data to the shape of the candidate density or mass function.

Valid for large sample sizes when parameters are estimated by maximum likelihood.

By arranging the n observations into a set of k class intervals or cells, the test statistic is:

$$\chi_0^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

Where:

- $O_i$ is the observed frequency (number of occurrences per interval/value)
- $E_i$ is the expected frequency, $E_i = n \cdot p_i$, where $p_i$ is the theoretical probability of the ith interval and n is the total number of data values
- The suggested minimum for $E_i$ is 5
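As a small illustration of the statistic, the Python sketch below sums (O_i - E_i)^2 / E_i over the cells. It is not from the slides: the function name is our own, and the observed counts and the equal-probability candidate model are hypothetical data chosen only to show the calculation.

```python
def chi_square_statistic(observed, expected):
    """Sum (O_i - E_i)^2 / E_i over all class intervals."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 observations in 6 cells, candidate model with p_i = 1/6.
observed = [8, 12, 9, 11, 10, 10]
n = sum(observed)
probabilities = [1 / 6] * 6
expected = [n * p for p in probabilities]  # E_i = n * p_i

print(f"n = {n}, smallest E_i = {min(expected):.1f}")
print(f"chi-square statistic = {chi_square_statistic(observed, expected):.3f}")
```

Every expected frequency here is at or above the suggested minimum of 5, so no cells need to be combined before computing the statistic.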

Chi-Square test

The hypotheses of a chi-square test are:

- H0: the random variable, X, conforms to the distributional assumption with the parameter(s) given by the estimate(s).
- H1: the random variable X does not conform.

If the distribution tested is discrete and combining adjacent cells is not required (each $E_i$ already meets the minimum requirement), each value of the random variable should be a class interval, and:

$p_i = p(x_i) = P(X = x_i)$
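When some expected frequencies fall below the minimum, adjacent cells are combined. The slides do not specify a procedure, so the Python sketch below uses one simple assumed rule: sweep left to right, accumulate cells until the expected count reaches the minimum, and fold any leftover tail into the last completed cell. The function name and the data are hypothetical.

```python
def combine_cells(observed, expected, min_expected=5.0):
    """Merge adjacent cells so every merged cell has expected count >= min_expected."""
    merged_obs, merged_exp = [], []
    acc_obs, acc_exp = 0, 0.0
    for o, e in zip(observed, expected):
        acc_obs += o
        acc_exp += e
        if acc_exp >= min_expected:
            # The accumulated cell is large enough; close it and start a new one.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
            acc_obs, acc_exp = 0, 0.0
    if acc_obs or acc_exp:
        if merged_exp:
            # Fold the undersized tail into the last completed cell.
            merged_obs[-1] += acc_obs
            merged_exp[-1] += acc_exp
        else:
            # All cells together are still below the minimum; keep one cell.
            merged_obs.append(acc_obs)
            merged_exp.append(acc_exp)
    return merged_obs, merged_exp


# Hypothetical observed and expected frequencies for values 0, 1, ..., 7.
observed = [2, 7, 15, 20, 9, 4, 2, 1]
expected = [3.0, 9.0, 14.0, 18.0, 10.0, 4.0, 1.5, 0.5]

new_obs, new_exp = combine_cells(observed, expected)
print("observed:", new_obs)
print("expected:", new_exp)
```

Other grouping rules are possible, and the choice of rule changes the cells that enter the test.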

Chi-Square test example

Vehicle Arrival Example (continued):

- The histogram appears to be Poisson.
- We find the estimated mean (λ) to be 3.64.
- Using the Poisson pmf:

$$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$

For λ = 3.64, the probabilities are:

p(0) = 0.026    p(6)   = 0.085
p(1) = 0.096    p(7)   = 0.044
p(2) = 0.174    p(8)   = 0.020
p(3) = 0.211    p(9)   = 0.008
p(4) = 0.192    p(10)  = 0.003
p(5) = 0.140    p(≥11) = 0.001
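The listed probabilities can be reproduced from the Poisson pmf with λ = 3.64. The Python sketch below is an illustrative check, not part of the slides; the names are our own, and the tail probability p(≥11) is computed as one minus the sum of p(0) through p(10).

```python
import math

lam = 3.64  # estimated mean from the vehicle arrival data


def poisson_pmf(x: int, lam: float) -> float:
    """Return P(X = x) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


probs = {x: poisson_pmf(x, lam) for x in range(11)}  # p(0) ... p(10)
tail = 1.0 - sum(probs.values())                     # p(>= 11)

for x, p in probs.items():
    print(f"p({x}) = {p:.3f}")
print(f"p(>=11) = {tail:.3f}")
```

Multiplying each probability by the number of observations n gives the expected frequencies E_i used in the test.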

Chi-Square test example cont.

H0: the random variable is Poisson distributed.

H1: the random variable is not Poisson distributed.

Because the calculated chi-square statistic exceeds the tabulated critical value, the hypothesis (H0) is rejected.

Chi-Square test

- The chi-square test can accommodate estimation of parameters.
- The chi-square test requires data to be placed in intervals.
- Changing the number of classes and the interval width affects the value of the calculated and tabulated chi-square.
- A hypothesis can be accepted when the data is grouped one way and rejected when it is grouped another way.

Kolmogorov-Smirnov test

Intuition: formalize the idea behind examining a q-q plot.

A more powerful test, particularly useful when:

- Sample sizes are small
- No parameters have been estimated from the data
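The slides do not give the test statistic, so the following Python sketch uses the standard Kolmogorov-Smirnov definition: the largest vertical distance D between the empirical CDF of the sample and the candidate CDF. The function name, the sample values, and the exponential candidate with a rate fixed in advance (not estimated from the data) are all hypothetical.

```python
import math


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of sample and the candidate cdf."""
    xs = sorted(sample)
    n = len(xs)
    # The empirical CDF jumps from i/n to (i+1)/n at the i-th sorted value,
    # so the gap is checked just after and just before each jump.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


# Hypothetical inter-arrival times and a candidate exponential model
# whose rate is assumed known beforehand.
data = [0.12, 0.95, 0.31, 0.07, 0.58, 1.40, 0.22]
rate = 2.0


def exponential_cdf(x: float) -> float:
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


d = ks_statistic(data, exponential_cdf)
print(f"D = {d:.4f} for n = {len(data)} observations")
```

The value of D is then compared against a tabulated critical value for the sample size and significance level; that table is not included here.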