2 Discrete Distributions -...
![Page 1: 2 Discrete Distributions - homepage.univie.ac.athomepage.univie.ac.at/florian.frommlet/WS2007/AdvStat/stat416_2.pdf · Probability STAT 416 Spring 2007 2 Discrete Distributions 1.](https://reader031.fdocuments.us/reader031/viewer/2022011820/5ea4d47aa137483da2093ba3/html5/thumbnails/1.jpg)
Probability, STAT 416, Spring 2007
2 Discrete Distributions
1. Introduction
2. Mean and Variance
3. Binomial Distribution
4. Poisson Distribution
5. Other Discrete Distributions
2.1 Introduction
Example: Fair dice, observations: 1, 2, 3, 4, 5, 6
Each observation has probability p = 1/6:
P (1) = 1/6, P (2) = 1/6, . . .
We observe realizations of a random variable
Random variable: Map from a (suitable) probability space into the real numbers, X : Ω → R
Examples:
Ω = {1, 2, 3, 4, 5, 6}
P (i) = 1/6, i = 1, . . . , 6
X(i) = i
Example continued
Two fair dice, sum of observations X = X1 + X2
X1 and X2 are both random variables as before (independent)
Ω = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
P (2) = P (12) = 1/36
P (3) = P (11) = 2/36
P (4) = P (10) = 3/36
P (5) = P (9) = 4/36
P (6) = P (8) = 5/36
P (7) = 6/36
X : Ω → R, X(i) = i
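The probabilities listed above can be reproduced by enumerating all 36 equally likely outcomes of the two dice. The following is a minimal sketch; the name `pmf_sum` is chosen here for illustration and does not come from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes (x1, x2) of two independent fair dice
counts = Counter(x1 + x2 for x1, x2 in product(range(1, 7), repeat=2))

# Probability function of the sum X = X1 + X2 (illustrative name)
pmf_sum = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, prob in pmf_sum.items():
    print(s, prob)  # fractions are printed in lowest terms, e.g. 2/36 as 1/18
```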
Discrete random variable
Sample space Ω with a finite or countable number of elements, i.e. indexed by N: Ω = {x1, x2, x3, . . .}
It is always possible to identify the sample space Ω with the set of all possible observations of the random variable
The random variable X then has the form X : Ω → R, X(xi) = xi
and is fully described by its probability function:
P : Ω → [0, 1], P (xi) = pi
The probabilities of the elementary events fully describe the distribution of a discrete random variable
Cumulative distribution function (cdf)
F : R→ [0, 1], F (x) = P (X ≤ x)
Example: Fair dice
[Figure: CDF of the fair dice; horizontal axis x from −2 to 8, vertical axis F(x) = P(X ≤ x) from 0 to 1]
Uniform distribution
n possible events with equal probability
Ω = {1, . . . , n}, P (i) = 1/n
Cumulative distribution function:
$$F(x) = \begin{cases} 0, & x < 1 \\ i/n, & i \le x < i + 1,\ i = 1, \dots, n - 1 \\ 1, & x \ge n \end{cases}$$
at x ∈ Ω the CDF has jumps of size 1/n
⇒ connection between CDF and probability function
P (i) = F (i)− F (i− 1), for i ∈ Ω
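As a small illustration of the connection between CDF and probability function, the sketch below evaluates the uniform CDF and recovers P(i) from the jumps F(i) − F(i − 1). The function name `uniform_cdf` and the choice n = 6 are assumptions made for this example.

```python
import math
from fractions import Fraction

def uniform_cdf(x, n):
    """F(x) = P(X <= x) for the uniform distribution on {1, ..., n}."""
    if x < 1:
        return Fraction(0)
    if x >= n:
        return Fraction(1)
    return Fraction(math.floor(x), n)  # i/n for i <= x < i + 1

n = 6  # fair dice as an example
# Jump heights of the CDF at the points of the sample space
pmf_from_cdf = {i: uniform_cdf(i, n) - uniform_cdf(i - 1, n) for i in range(1, n + 1)}
print(pmf_from_cdf)
```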
Properties of the CDF
Specifically for discrete random variables:
CDF is a monotonically non-decreasing step function with jumps at events with positive probability
Generally, the following holds for a CDF:
• $P(X = x) = F(x) - F(x^-)$, where $F(x^-) = \lim_{h \to x,\, h < x} F(h)$
due to the definition $F(x) = P(X \le x)$
• $P(a < X \le b) = F(b) - F(a)$
• $\lim_{a \to -\infty} F(a) = 0$, $\lim_{b \to \infty} F(b) = 1$
• $F(x)$ is monotonically non-decreasing
Exercise
CDF of a random variable X given by
$$F(x) = \begin{cases} 0, & x < 1 \\ 1 - 2^{-k}, & k \le x < k + 1,\ k = 1, 2, \dots \end{cases}$$
1. Draw the CDF in the range of x ∈ [0, 5]
2. Determine the probability function of X
3. Compute the probability that X > 5
2.2 Mean and Variance
Essential properties of a distribution
Important for practical purposes
⇒ Reduction of information of data
Mean is a measure of central tendency, also called expected value; it corresponds to the arithmetic mean of a sample
Variance is a measure of dispersion; it corresponds to the mean squared deviation from the mean of a sample
Both figures are based on moments of the distribution and are of major importance specifically for the normal distribution
Mean
Discrete random variable X with probability space (Ω, P )
Definition of mean:
$$E(X) = \sum_{x \in \Omega} x\, P(x)$$
Weighted sum of values of Ω
weights are the corresponding probabilities of events
Usual notation: µ = E(X)
Example Fair dice:
$$E(X) = 1 \cdot \tfrac{1}{6} + 2 \cdot \tfrac{1}{6} + \dots + 6 \cdot \tfrac{1}{6} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5$$
Transformation of random variables
Discrete random variable X with probability space (Ω, P )
Specifically for all $x \in \Omega$: $P(x) = p_x$
Additionally given f : Ω → R, image set f(Ω)
Definition: f(X) is the random variable Y : f(Ω) → R with
$$Y(y) = y \quad \text{and} \quad P(y) = \sum_{x \in \Omega:\, f(x) = y} p_x$$
I.e. values of events x ∈ Ω are transformed into f(x)
Probabilities are added for all x with equal images f(x)
Examples for transformation
1) Fair dice, $f(x) = x^2$, $Y = X^2$:
$Y(y) = y$ with $y \in \Omega_Y := \{1, 4, 9, 16, 25, 36\}$
$P(1) = P(4) = P(9) = P(16) = P(25) = P(36) = 1/6$
2) Fair dice, $g(x) = (x - 3.5)^2$, $Z = (X - 3.5)^2$:
$Z(z) = z$ with $z \in \Omega_Z := \{2.5^2, 1.5^2, 0.5^2\} = \{6.25, 2.25, 0.25\}$
$P(6.25) = p_1 + p_6 = 1/3$
$P(2.25) = p_2 + p_5 = 1/3$
$P(0.25) = p_3 + p_4 = 1/3$
Exercise: $\Omega = \{-1, 0, 1\}$, $P(X = -1) = P(X = 1) = 1/4$, $P(X = 0) = 1/2$
Compute the distributions of $Y = X^2$ and $Z = X^3$
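The two fair-dice examples above follow one recipe: map every x to f(x) and add up the probabilities of all x that share an image. A minimal sketch of that recipe is given below; the helper name `transform_pmf` is hypothetical and exact fractions are used only to keep the arithmetic readable.

```python
from collections import defaultdict
from fractions import Fraction

def transform_pmf(pmf, f):
    """Distribution of Y = f(X): add p_x over all x with the same image f(x)."""
    out = defaultdict(Fraction)  # missing keys start at probability 0
    for x, p_x in pmf.items():
        out[f(x)] += p_x
    return dict(out)

die = {x: Fraction(1, 6) for x in range(1, 7)}

pmf_Y = transform_pmf(die, lambda x: x ** 2)                     # Y = X^2
pmf_Z = transform_pmf(die, lambda x: (x - Fraction(7, 2)) ** 2)  # Z = (X - 3.5)^2

print(pmf_Y)
print({float(z): p for z, p in pmf_Z.items()})
```

Because squaring is one-to-one on {1, ..., 6}, Y keeps six values, whereas g maps pairs of outcomes to the same image, so Z has only three.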
Expectation of functions
Example: Fair dice – continued:
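1) $E(f(X)) = E(Y) = 1 \cdot 1/6 + 4 \cdot 1/6 + \dots + 36 \cdot 1/6 = \frac{1 + 4 + 9 + 16 + 25 + 36}{6} = 91/6 = 15.1667$
2) $E(g(X)) = E(Z) = 6.25/3 + 2.25/3 + 0.25/3 = 2.9167$
In general: Computation of expectation of f(X):
$$E(f(X)) = \sum_{x \in \Omega} f(x) P(x)$$
Weighted sum of the values of f(Ω)
Note: grouping the terms by their image y = f(x) gives
$$\sum_{x \in \Omega} f(x) P(x) = \sum_{y \in f(\Omega)} \; \sum_{x \in \Omega:\, f(x) = y} f(x) P(x) = \sum_{y \in f(\Omega)} y\, P_Y(y)$$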
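The general formula can be checked directly on the fair dice: the expectation is a weighted sum of f(x) over the original sample space, so the distribution of f(X) never has to be built. The sketch below assumes a hypothetical helper `expectation`.

```python
from fractions import Fraction

def expectation(pmf, f=lambda x: x):
    """E(f(X)) as the weighted sum of f(x) P(x) over the sample space."""
    return sum(f(x) * p for x, p in pmf.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}

mu = expectation(die)                             # E(X)
e_square = expectation(die, lambda x: x ** 2)     # E(X^2)
e_dev = expectation(die, lambda x: (x - mu) ** 2) # E((X - 3.5)^2)

print(mu, e_square, e_dev)
print(float(e_square), float(e_dev))
```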
Linear Transformation
For general a, b ∈ R:
E(aX + b) = aE(X) + b
Proof:
$$E(aX + b) = \sum_{x \in \Omega} (ax + b) P(x) = a \sum_{x \in \Omega} x P(x) + b \sum_{x \in \Omega} P(x) = aE(X) + b$$
Specifically: E(X − µ) = E(X − E(X)) = 0
Variance
Definition:
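$$\operatorname{Var}(X) := E\big((X - \mu)^2\big)$$
Usual notation: $\sigma^2 = \operatorname{Var}(X)$
σ . . . Standard deviation: $SD(X) = \sqrt{\operatorname{Var}(X)}$
It holds $\operatorname{Var}(X) = E(X^2) - \mu^2$
$$\begin{aligned} E\big((X - \mu)^2\big) &= \sum_{x \in \Omega} (x - \mu)^2 P(x) = \sum_{x \in \Omega} (x^2 - 2\mu x + \mu^2) P(x) \\ &= \sum_{x \in \Omega} x^2 P(x) - 2\mu \sum_{x \in \Omega} x P(x) + \mu^2 \sum_{x \in \Omega} P(x) \\ &= E(X^2) - 2\mu^2 + \mu^2 = E(X^2) - \mu^2 \end{aligned}$$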
Example for variance
Three random variables X1, X2, X3
X1 = 0 with probability 1
$X_2$ uniformly distributed on $\{-1, 0, 1\}$
$X_3$ uniformly distributed on $\{-50, -25, 0, 25, 50\}$
All three random variables have mean 0
$\operatorname{Var}(X_1) = 0^2 \cdot P(0) = 0$
$\operatorname{Var}(X_2) = (-1)^2 \cdot 1/3 + 1^2 \cdot 1/3 = 2/3$
$\operatorname{Var}(X_3) = (-50)^2 \cdot 1/5 + (-25)^2 \cdot 1/5 + 25^2 \cdot 1/5 + 50^2 \cdot 1/5 = 1250$
Variance gives additional information on the distribution
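The three variances can be recomputed with both the definition E((X − µ)²) and the shortcut E(X²) − µ². This is a sketch only; the helper names (`mean`, `var_definition`, `var_shortcut`, `uniform_on`) are invented for the example.

```python
from fractions import Fraction

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var_definition(pmf):
    """Var(X) = E((X - mu)^2)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def var_shortcut(pmf):
    """Var(X) = E(X^2) - mu^2."""
    return sum(x ** 2 * p for x, p in pmf.items()) - mean(pmf) ** 2

def uniform_on(values):
    values = list(values)
    return {v: Fraction(1, len(values)) for v in values}

X1 = {0: Fraction(1)}
X2 = uniform_on([-1, 0, 1])
X3 = uniform_on([-50, -25, 0, 25, 50])

for name, pmf in [("X1", X1), ("X2", X2), ("X3", X3)]:
    print(name, mean(pmf), var_definition(pmf), var_shortcut(pmf))
```

All three means are 0, so only the variance distinguishes the distributions.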
Properties of variance
For general a, b ∈ R:
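$$\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)$$
Proof:
$$\operatorname{Var}(aX + b) = E\big((aX + b - a\mu - b)^2\big) = a^2 E\big((X - \mu)^2\big) = a^2 \operatorname{Var}(X)$$
Specifically: $\operatorname{Var}(-X) = \operatorname{Var}(X)$
$\operatorname{Var}(X + b) = \operatorname{Var}(X)$
At times $E(X^2) - \mu^2$ is easier to compute than $E\big((X - \mu)^2\big)$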
Exercise: Variance of fair dice with both formulas
Moments of a distribution
k-th moment of a random variable: $m_k := E(X^k)$
k-th central moment: $z_k = E\big((X - \mu)^k\big)$
$m_1$ . . . mean
$z_2 = m_2 - m_1^2$ . . . variance
Of practical importance also third and fourth moment
Skewness: $\nu(X) := \frac{z_3}{\sigma^3} = E(X_*^3)$ where $X_* := (X - \mu)/\sigma$
• ν(X) = 0 . . . symmetric distribution
• ν(X) < 0 . . . left skewed
• ν(X) > 0 . . . right skewed
Kurtosis: $\frac{z_4}{\sigma^4} = E(X_*^4)$
(has to do with curvature → Normal distribution)
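Skewness and kurtosis are the third and fourth moments of the standardized variable, so one helper covers both. The sketch below uses the fair dice and, as an assumed extra example that is not in the slides, a Bernoulli variable with p = 0.1; `standardized_moment` is a hypothetical name.

```python
import math

def standardized_moment(pmf, k):
    """z_k / sigma^k, i.e. E(X_*^k) for X_* = (X - mu) / sigma."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    z_k = sum((x - mu) ** k * p for x, p in pmf.items())
    return z_k / math.sqrt(var) ** k

die = {x: 1 / 6 for x in range(1, 7)}  # symmetric around 3.5
bernoulli = {0: 0.9, 1: 0.1}           # assumed example: most mass on 0, long tail to the right

skew_die = standardized_moment(die, 3)
kurt_die = standardized_moment(die, 4)
skew_bern = standardized_moment(bernoulli, 3)

print(skew_die, kurt_die, skew_bern)
```

The dice gives a skewness of zero up to floating-point rounding, while the Bernoulli example comes out positive (right skewed).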
Exercise: Skewness
Random variable X has the following distribution:
P (1) = 0.05, P (2) = 0.1, P (3) = 0.3, P (4) = 0.5, P (5) = 0.05
Draw probability function and CDF
Compute skewness!
Compute skewness for the mildly changed distribution
P (1) = 0.05, P (2) = 0.3, P (3) = 0.3, P (4) = 0.3, P (5) = 0.05
2.3 Binomial distribution
Bernoulli trial: Two possible outcomes (0 or 1)
P (X = 1) = p, P (X = 0) = q where q = 1− p
E.g. fair coin: p = 1/2
Example: Throw an unfair coin twice. P (head) = p = 0.7
Compute probability distribution of Z, the number of heads!
Sample space $\Omega_Z = \{0, 1, 2\}$
Throwing both coins independently!
$P(Z = 0) = P(X_1 = 0, X_2 = 0) = P(X_1 = 0) P(X_2 = 0) = 0.3^2 = 0.09$
$P(Z = 1) = P(X_1 = 0, X_2 = 1) + P(X_1 = 1, X_2 = 0) = 2 \cdot P(X_1 = 0) P(X_2 = 1) = 2 \cdot 0.3 \cdot 0.7 = 0.42$
$P(Z = 2) = P(X_1 = 1, X_2 = 1) = P(X_1 = 1) P(X_2 = 1) = 0.7^2 = 0.49$
Binomial distribution
n independent Bernoulli trials, P (X = 1) = p
Y : Number of successes (trials with outcome 1) is binomially distributed:
$$P(Y = k) = \binom{n}{k} p^k q^{n-k}$$
Proof: Independence ⇒ Probability for each single sequence with k successes (1) and n − k failures (0) is given by $p^k (1 - p)^{n-k}$
Number of such sequences: number of k-combinations without repetition, i.e. $\binom{n}{k}$
Notation: Y ∼ B(n, p)
Exercise: Throw independently five fair coins
Compute distribution of the number of heads!
Example binomial distribution
Exam with failure rate of 20%
Distribution of number of successes for 10 students?
$$P(X = 7) = \binom{10}{7} \cdot 0.8^7 \cdot 0.2^3 = 0.2013$$
[Figure: probability function of B(10, 0.8); horizontal axis k = 0, . . . , 10, vertical axis from 0 to 0.35]
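Both the unfair-coin example (n = 2, p = 0.7) and the exam example (n = 10, p = 0.8) can be evaluated with the binomial probability function. A minimal sketch using only the Python standard library follows; `binomial_pmf` is a name chosen here.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(Y = k) for Y ~ B(n, p): comb(n, k) sequences, each with probability p^k q^(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

coin = [binomial_pmf(k, 2, 0.7) for k in range(3)]    # two throws, P(head) = 0.7
exam = [binomial_pmf(k, 10, 0.8) for k in range(11)]  # 10 students, pass rate 0.8

print([round(v, 2) for v in coin])  # [0.09, 0.42, 0.49]
print(round(exam[7], 4))            # 0.2013
```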
Examples binomial distribution: n = 10
[Figure: probability functions of B(10, p) in four panels, p = 0.1, p = 0.2, p = 0.3 and p = 0.5; horizontal axis k = 0, . . . , 10, vertical axis from 0 to 0.4]
Exercise: S.R. Example 6f
Communication system: n components, each functions independently with probability p
Total system operates if at least one-half of its components work
1. For which values of p is a 5-component system more likely to work than a 3-component system?
2. Generalize: For which values of p is a (2k + 1)-component system more likely to work than a (2k − 1)-component system?
Application: Drawing with replacement
population of N objects
• M of N objects have some property E
• Draw n objects with replacement
The number X of drawn objects with property E is binomially distributed:
X ∼ B(n, M/N)
Exercise: Bowl with 3 black and 9 white balls; draw 5 balls with replacement, X . . . number of drawn balls that are black
• Probability function of X?
• Expected value of X?
Mean of binomial distribution
X ∼ B(n, p) ⇒ E(X) = np
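Using $k \binom{n}{k} = n \binom{n-1}{k-1}$ we obtain
$$E(X) = \sum_{k=1}^{n} k \binom{n}{k} p^k q^{n-k} = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} q^{n-k} = np \sum_{i=0}^{n-1} \binom{n-1}{i} p^i q^{n-1-i}$$
and due to the binomial theorem
$$\sum_{i=0}^{n-1} \binom{n-1}{i} p^i q^{n-1-i} = (p + q)^{n-1} = 1$$
Alternative proof: Differentiate the binomial expansion of $(p + q)^n$ with respect to $p$, then set $q = 1 - p$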
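The alternative proof can be spelled out as follows. This is one possible reading of the hint, in which p and q are treated as independent variables while differentiating and q = 1 − p is substituted only at the end.

```latex
\begin{align*}
\sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} &= (p+q)^n
  && \text{(binomial theorem)} \\
\sum_{k=0}^{n} k \binom{n}{k} p^{k-1} q^{n-k} &= n\,(p+q)^{n-1}
  && \text{(differentiate with respect to } p\text{)} \\
\sum_{k=0}^{n} k \binom{n}{k} p^{k} q^{n-k} &= np\,(p+q)^{n-1}
  && \text{(multiply by } p\text{)} \\
E(X) &= np
  && \text{(set } q = 1-p\text{)}
\end{align*}
```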
Variance of binomial distribution
X ∼ B(n, p) ⇒ Var (X) = npq
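Again using $k \binom{n}{k} = n \binom{n-1}{k-1}$ we obtain
$$E(X^2) = \sum_{k=1}^{n} k^2 \binom{n}{k} p^k q^{n-k} = np \sum_{k=1}^{n} k \binom{n-1}{k-1} p^{k-1} q^{n-k} = np \sum_{i=0}^{n-1} (i + 1) \binom{n-1}{i} p^i q^{n-1-i} = np\,\{(n - 1)p + 1\}$$
where the last sum equals $E(W) + 1 = (n - 1)p + 1$ for $W \sim B(n - 1, p)$, and thus
$$\operatorname{Var}(X) = E(X^2) - \mu^2 = np\,\{(n - 1)p + 1\} - (np)^2 = np(1 - p)$$
Alternative proof: Differentiate the binomial expansion of $(p + q)^n$ twice with respect to $p$, then set $q = 1 - p$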
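The closed forms for E(X), E(X²) and Var(X) can be sanity-checked by summing over the probability function with exact fractions. The parameters n = 10 and p = 1/5 below are arbitrary choices for illustration, not values from the slides.

```python
from fractions import Fraction
from math import comb

def binomial_pmf_table(n, p):
    """Probability function of B(n, p) as a dict k -> P(X = k)."""
    q = 1 - p
    return {k: comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)}

n, p = 10, Fraction(1, 5)  # assumed example parameters
pmf = binomial_pmf_table(n, p)

mean = sum(k * w for k, w in pmf.items())
second_moment = sum(k * k * w for k, w in pmf.items())
variance = second_moment - mean ** 2

print(mean, second_moment, variance)
print(n * p, n * p * ((n - 1) * p + 1), n * p * (1 - p))
```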
2.4 Poisson distribution
Definition: $\Omega = \mathbb{N}_0 = \{0, 1, 2, \dots\}$
$$P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}, \quad \lambda > 0$$
Notation: X ∼ P(λ)
Poisson-distributed random variable can take in principle arbitrarily large values - though with very small probability
Example: λ = 2
$$P(X \le 1) = \frac{2^0}{0!} e^{-2} + \frac{2^1}{1!} e^{-2} = (1 + 2) e^{-2} = 0.4060$$
$$P(X > 4) = 1 - P(X \le 4) = 1 - \left(1 + 2 + \frac{4}{2} + \frac{8}{6} + \frac{16}{24}\right) e^{-2} = 1 - 0.9473 = 0.0527$$
Examples Poisson distribution
[Figure: probability functions of P(λ) in four panels, λ = 1, λ = 1.5, λ = 3 and λ = 5; horizontal axis k = 0, . . . , 12, vertical axis from 0 to 0.4]
Application
To model rare events
Examples
• Number of clients within a certain time frame
• Radioactive decay
• Number of errors per slide
• Number of people older than 100 years (per 1 000 000)
• Number of false alarms per day
• etc.
Connection between Poisson distributed events and the time in between two events ⇒ Exponential distribution
Assumptions
Events happening at certain points in time are Poisson-distributed under the following assumptions
• Probability that exactly 1 event occurs within a given time interval of length h is approximately λh
• Probability that 2 or more events occur within a given time interval of length h is very small compared to h
• Looking at two time intervals that do not overlap, the number of events in one interval is independent of the number of events in the other interval
For each time interval [t1, t2] the number of occurring events is Poisson distributed with parameter λ(t2 − t1).
Example
Suppose that the number of earthquakes per week is Poisson distributed with parameter λ = 2
1. What is the probability of at least 3 earthquakes during the next week?
2. What is the probability of at least 3 earthquakes during the next two weeks?
Solution: 1)
$$P(X \ge 3) = 1 - P(X \le 2) = 1 - \left(1 + 2 + \frac{4}{2}\right) e^{-2} = 0.3233$$
2) Now we have a time interval of 2 weeks, therefore we get a Poisson distribution with parameter 2λ = 4
$$P(X \ge 3) = 1 - P(X \le 2) = 1 - \left(1 + 4 + \frac{16}{2}\right) e^{-4} = 0.7619$$
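The earthquake example scales the parameter with the length of the time interval, which is easy to express in code. The sketch below assumes a rate per week and an interval length in weeks; `poisson_pmf` and `prob_at_least` are names made up for this illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ P(lam)."""
    return lam ** k / factorial(k) * exp(-lam)

def prob_at_least(m, rate, length):
    """P(at least m events) in an interval of the given length; parameter is rate * length."""
    lam = rate * length
    return 1 - sum(poisson_pmf(k, lam) for k in range(m))

one_week = prob_at_least(3, rate=2, length=1)
two_weeks = prob_at_least(3, rate=2, length=2)

print(round(one_week, 4))   # 0.3233
print(round(two_weeks, 4))  # 0.7619
```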
Mean and variance
X ∼ P(λ) ⇒ E(X) = λ
Proof:
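$$E(X) = \sum_{k=0}^{\infty} k \frac{\lambda^k}{k!} e^{-\lambda} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k - 1)!} = \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} = \lambda e^{-\lambda} e^{\lambda} = \lambda$$
X ∼ P(λ) ⇒ Var (X) = λ
Proof:
$$E(X^2) = \sum_{k=0}^{\infty} k^2 \frac{\lambda^k}{k!} e^{-\lambda} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{k \lambda^k}{(k - 1)!} = \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{(j + 1) \lambda^j}{j!} = \lambda(\lambda + 1)$$
$$\operatorname{Var}(X) = E(X^2) - E(X)^2 = \lambda(\lambda + 1) - \lambda^2 = \lambda$$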
Exercise
Suppose that a book has, on average, one typo on every third page.
1. What is the probability that you find at least two errors on the page that you are reading right now?
2. What is the probability that you find at least two errors within 10 pages?
3. What is the probability that you find at least two errors on any of 10 pages?
Approximation of binomial distribution
X ∼ B(n, p), where n large and p small (e.g. n > 10 and p < 0.05)
⇒ X ∼ P(np)
i.e. X is approximately Poisson-distributed with parameter λ = np
Motivation: Let λ := np
$$P(X = k) = \frac{n!}{k!\,(n - k)!} p^k q^{n-k} = \frac{n(n - 1) \cdots (n - k + 1)}{k!} \cdot \frac{\lambda^k}{n^k} \cdot \frac{(1 - \lambda/n)^n}{(1 - \lambda/n)^k}$$
For n large and moderate λ (i.e. p small) we have
$$\frac{n(n - 1) \cdots (n - k + 1)}{n^k} \approx 1, \qquad (1 - \lambda/n)^k \approx 1, \qquad (1 - \lambda/n)^n \approx e^{-\lambda}$$
and thus $P(X = k) \approx \frac{\lambda^k}{k!} e^{-\lambda}$
Example Poisson approximation
Comparison of Poisson approximation (λ = 0.5) with exact CDF of binomial distribution (n = 10, p = 0.05)
[Figure: both CDFs for x from 0 to 6, vertical axis from 0.55 to 1; blue: B(10, 0.05), red: P(0.5)]
Binomial:
$$P(X \le 3) = 0.95^{10} + 10 \cdot 0.05 \cdot 0.95^9 + 45 \cdot 0.05^2 \cdot 0.95^8 + 120 \cdot 0.05^3 \cdot 0.95^7 = 0.99897150206211$$
Poisson approximation:
$$P(X \le 3) \approx \left(1 + 0.5 + \frac{0.5^2}{2} + \frac{0.5^3}{6}\right) e^{-0.5} = 0.99824837744371$$
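The comparison above can be repeated for any n and p by evaluating both CDFs side by side. A minimal sketch with hypothetical helper names `binomial_cdf` and `poisson_cdf`:

```python
from math import comb, exp, factorial

def binomial_cdf(x, n, p):
    """P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ P(lam)."""
    return sum(lam ** k / factorial(k) for k in range(x + 1)) * exp(-lam)

n, p = 10, 0.05
exact = binomial_cdf(3, n, p)
approx = poisson_cdf(3, n * p)  # Poisson parameter lambda = n * p = 0.5

print(exact, approx, abs(exact - approx))
```

For these parameters the two values differ by less than 0.001.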
2.5 Other discrete Distributions
We will discuss
• Geometric
• Hypergeometric
Apart from that
• Negative binomial (more general: Panjer)
• Generalized Poisson
• Zeta distribution
• etc.
Wikipedia very helpful
Geometric distribution
Independent Bernoulli trials with success probability p
X . . . number of trials until the first success
Therefore $P(X = k) = q^{k-1} p$
k − 1 failures, each with probability q = 1 − p, followed by one success
Exercise: Bowl with N white and M black balls
Drawing with replacement
a) Probability that it takes exactly k trials until one draws a black ball
b) Probability that it takes at most k trials until one draws a black ball
Geometric distribution
Compare shape of distribution later with density of exponential distribution
[Figure: probability function of a geometric distribution; horizontal axis k = 0, . . . , 10, vertical axis from 0 to 0.35]
Memorylessness
Mean and variance
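Note that $\sum_{j=0}^{\infty} q^j = \frac{1}{1 - q}$ and thus $\sum_{k=1}^{\infty} q^{k-1} p = \frac{p}{1 - q} = \frac{p}{p} = 1$
Differentiate:
$$\sum_{k=1}^{\infty} k q^{k-1} = \frac{d}{dq} \sum_{k=0}^{\infty} q^k = \frac{1}{(1 - q)^2}$$
$$E(X) = \sum_{k=1}^{\infty} k q^{k-1} p = \frac{p}{(1 - q)^2} = \frac{1}{p}$$
Differentiate again:
$$\sum_{k=1}^{\infty} k(k - 1) q^{k-2} = \frac{d^2}{dq^2} \sum_{k=0}^{\infty} q^k = \frac{2}{(1 - q)^3}$$
$$E(X^2) = \sum_{k=1}^{\infty} k^2 q^{k-1} p = pq \sum_{k=1}^{\infty} k(k - 1) q^{k-2} + p \sum_{k=1}^{\infty} k q^{k-1} = \frac{2pq}{p^3} + \frac{1}{p}$$
And thus:
$$\operatorname{Var}(X) = E(X^2) - E(X)^2 = \frac{2}{p^2} - \frac{1}{p} - \frac{1}{p^2} = \frac{1 - p}{p^2}$$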
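The series results can be checked numerically by truncating the infinite sums. The sketch below assumes p = 0.3 and a cut-off of 2000 terms, both arbitrary choices; the neglected tail is far below floating-point precision for this p.

```python
def geometric_pmf(k, p):
    """P(X = k) = q^(k-1) p for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

p = 0.3              # assumed success probability
ks = range(1, 2001)  # assumed truncation point of the infinite sums

total = sum(geometric_pmf(k, p) for k in ks)
mean = sum(k * geometric_pmf(k, p) for k in ks)
second_moment = sum(k * k * geometric_pmf(k, p) for k in ks)
variance = second_moment - mean ** 2

print(total, mean, variance)
print(1 / p, (1 - p) / p ** 2)  # closed forms for comparison
```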
Hypergeometric distribution
Binomial distribution: Drawing with replacement
Exercise: Bowl, 3 black balls, 5 white balls
Draw 4 balls with and without replacement respectively.
Compute for both cases the distribution of the number of drawn black balls!
[Figure: probability functions of the number of drawn black balls in two panels, with replacement and without replacement; horizontal axis k = 0, . . . , 4, vertical axis from 0 to 0.45]
Hypergeometric distribution
N objects of which M have some property E. Draw n objects without replacement, X . . . number of drawn objects with property E.
$$P(X = k) = \frac{\binom{M}{k} \binom{N - M}{n - k}}{\binom{N}{n}}$$
We use the definition $\binom{a}{b} = 0$ whenever $a < b$
Clearly we have P (X = k) = 0 if M < k
I cannot draw more black balls than there are in the bowl
Also clear that P (X = k) = 0 if N − M < n − k
I cannot draw more white balls than there are in the bowl
Thus: $\Omega = \{k : \max(0, n - N + M) \le k \le \min(n, M)\}$
Mean and variance
Without proof (easy but slightly tedious computations)
$$E(X) = n \frac{M}{N}, \qquad \operatorname{Var}(X) = n \frac{M}{N} \left(1 - \frac{M}{N}\right) \frac{N - n}{N - 1}$$
Define $p := M/N$ and compare with binomial distribution
E(X) = np: same formula as for the binomial distribution
$\operatorname{Var}(X) = np(1 - p) \frac{N - n}{N - 1}$: asymptotically like the binomial distribution
because $\lim_{N \to \infty} \frac{N - n}{N - 1} = 1$
If N and M are very large compared to n, then we have approximately $X \sim B(n, M/N)$ (without proof)
Example hypergeometric distribution
Quality control: Delivery of 30 boxes with eggs,
10 boxes contain at least one broken egg,
Take a sample of size 6
• Compute probability that two boxes within the sample contain broken eggs?
N = 30, M = 10, n = 6
$$P(X = 2) = \frac{\binom{10}{2} \binom{20}{4}}{\binom{30}{6}} = 0.3672$$
• Mean and variance for the number of boxes within the sample that contain broken eggs?
$$E(X) = 6 \cdot \frac{10}{30} = 2; \qquad \operatorname{Var}(X) = 6 \cdot \frac{1}{3} \cdot \frac{2}{3} \cdot \frac{24}{29} = 1.1034$$
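The numbers in this example can be reproduced from the probability function alone. The sketch below uses exact fractions and the name `hypergeom_pmf`, which is an assumption of this example. Python's `math.comb(a, b)` returns 0 when b > a, matching the convention used for the binomial coefficient in the definition.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(X = k): k objects with property E among n drawn without replacement."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

N, M, n = 30, 10, 6  # boxes in total, boxes with a broken egg, sample size
support = range(max(0, n - N + M), min(n, M) + 1)
pmf = {k: hypergeom_pmf(k, N, M, n) for k in support}

mean = sum(k * w for k, w in pmf.items())
variance = sum(k * k * w for k, w in pmf.items()) - mean ** 2

print(round(float(pmf[2]), 4))          # 0.3672
print(mean, round(float(variance), 4))  # 2 1.1034
```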
Exercise: Approximation by binomial distribution
Lottery with 1000 lots, 200 are winning
Assume you buy 5 lots
1. Compute probability that at least one lot will win
Solution: 0.6731
2. Compute the same probability using the binomial approximation
Solution: 0.6723
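Both stated solutions follow from the complement "no winning lot among the five". A short sketch, with variable names chosen for illustration:

```python
from math import comb

N, M, n = 1000, 200, 5  # lots in total, winning lots, lots bought

# Exact (hypergeometric): all 5 lots come from the N - M losing ones
p_exact = 1 - comb(N - M, n) / comb(N, n)

# Binomial approximation: treat the draws as independent with p = M / N
p_binom = 1 - (1 - M / N) ** n

print(round(p_exact, 4))  # 0.6731
print(round(p_binom, 4))  # 0.6723
```

The two values agree to about three decimal places because N and M are large compared to n.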
Summary discrete distributions
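• Uniform: $\Omega = \{x_1, \dots, x_n\}$, $P(X = x_k) = 1/n$
• Binomial: $X \sim B(n, p)$, $P(X = k) = \binom{n}{k} p^k q^{n-k}$
We have $E(X) = np$, $\operatorname{Var}(X) = npq$, $\Omega = \{0, \dots, n\}$
• Poisson: $X \sim P(\lambda)$, $P(X = k) = \frac{\lambda^k}{k!} e^{-\lambda}$
We have $E(X) = \lambda$, $\operatorname{Var}(X) = \lambda$, $\Omega = \{0, 1, 2, \dots\}$
• Geometric: $P(X = k) = p\, q^{k-1}$
We have $E(X) = p^{-1}$, $\operatorname{Var}(X) = q\, p^{-2}$, $\Omega = \{1, 2, \dots\}$
• Hypergeometric: $P(X = k) = \binom{M}{k} \binom{N - M}{n - k} \Big/ \binom{N}{n}$
We have $E(X) = np$, $\operatorname{Var}(X) = np(1 - p) \frac{N - n}{N - 1}$, $p = \frac{M}{N}$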