1
Discrete Probability
Hsin-Lung Wu, Assistant Professor
Advanced Algorithms, 2008 Fall
Date posted: 22-Dec-2015
2
Sample space: a set S of elementary events, e.g. the 36 ways two dice can fall.
An event A is a subset of S, e.g. rolling a 7 with two dice.
A probability distribution Pr{} is a map from events of S to the real numbers.
Probability axioms:
$$\Pr\{A\} \ge 0 \text{ for every event } A$$
$$\Pr\{S\} = 1$$
$$\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\}, \text{ if } A \cap B = \emptyset$$
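The definitions above can be checked on the two-dice example by enumerating the sample space. This is a minimal sketch, assuming a uniform distribution on the 36 outcomes; the helper `pr` and the event names are illustrative and not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all 36 ordered outcomes of two six-sided dice.
S = list(product(range(1, 7), repeat=2))

def pr(event):
    """Uniform probability of an event, given as a predicate on outcomes."""
    favorable = sum(1 for s in S if event(s))
    return Fraction(favorable, len(S))

def rolling_seven(s):
    return s[0] + s[1] == 7

def doubles(s):
    return s[0] == s[1]

def seven_or_doubles(s):
    return rolling_seven(s) or doubles(s)

print(len(S))             # 36
print(pr(rolling_seven))  # 1/6
# Axioms: non-negativity, Pr{S} = 1, additivity for disjoint events.
print(pr(lambda s: True) == 1)
print(pr(seven_or_doubles) == pr(rolling_seven) + pr(doubles))
```

Rolling a 7 and rolling doubles are disjoint events (a double always sums to an even number), so the additivity axiom applies to them.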
3
Theorem: $\Pr\{A \cup B\} = \Pr\{A\} + \Pr\{B\} - \Pr\{A \cap B\} \le \Pr\{A\} + \Pr\{B\}$
Theorem: For discrete probability distributions, $\Pr\{A\} = \sum_{s \in A} \Pr\{s\}$
4
A random variable (r.v.) X is a function from S to the real numbers.
The event "X = x" is defined as $\{s \in S : X(s) = x\}$.
E.g. rolling two dice:
|S| = 36 possible outcomes.
Uniform distribution: each element has the same probability 1/|S| = 1/36.
Let X be the sum of the dice.
Pr{X = 5} = 4/36, from the outcomes {(1, 4), (2, 3), (3, 2), (4, 1)}.
Expected value: $E[X] = \sum_x x \Pr\{X = x\}$
Linearity: $E[aX + Y] = aE[X] + E[Y]$
X1: number on die 1
X2: number on die 2
X = X1 + X2, E[X1] = E[X2] = (1/6)(1+2+3+4+5+6) = 21/6, so E[X] = E[X1] + E[X2] = 7
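The dice example can be reproduced numerically. The sketch below assumes the uniform distribution on the 36 outcomes; `expectation`, `X1`, `X2` and `X` are illustrative names for the quantities on this slide.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
p = Fraction(1, len(S))                   # probability of each outcome

def expectation(rv):
    """E[rv] = sum over outcomes s of rv(s) * Pr{s}."""
    return sum(rv(s) * p for s in S)

def X1(s):
    return s[0]          # number on die 1

def X2(s):
    return s[1]          # number on die 2

def X(s):
    return s[0] + s[1]   # sum of the dice

pr_x_is_5 = sum(p for s in S if X(s) == 5)
print(pr_x_is_5)        # 1/9, i.e. 4/36
print(expectation(X1))  # 7/2, i.e. 21/6
print(expectation(X))   # 7
```

Linearity gives E[X] = E[X1] + E[X2] without enumerating the distribution of the sum, which is what the last line confirms.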
5
Independence
Two random variables X and Y are independent if, for all x and y,
$$\Pr\{X = x \text{ and } Y = y\} = \Pr\{X = x\} \Pr\{Y = y\}$$
If X and Y are independent, then $E[XY] = E[X]E[Y]$
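As a quick check of the definition, the sketch below tests independence by brute force on two fair dice. The two dice are independent, while die 1 and the sum of both dice are not. The function names are hypothetical helpers, not taken from the slides.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))  # two fair dice, uniform

def pr(event):
    return Fraction(sum(1 for s in S if event(s)), len(S))

def expectation(rv):
    return sum(Fraction(rv(s), len(S)) for s in S)

def independent(X, Y):
    """Check Pr{X=x and Y=y} == Pr{X=x} * Pr{Y=y} for every pair (x, y)."""
    xs = {X(s) for s in S}
    ys = {Y(s) for s in S}
    return all(
        pr(lambda s: X(s) == x and Y(s) == y)
        == pr(lambda s: X(s) == x) * pr(lambda s: Y(s) == y)
        for x in xs
        for y in ys
    )

def X1(s):
    return s[0]

def X2(s):
    return s[1]

def total(s):
    return s[0] + s[1]

print(independent(X1, X2))     # True
print(independent(X1, total))  # False
print(expectation(lambda s: X1(s) * X2(s)) == expectation(X1) * expectation(X2))  # True
```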
6
Indicator random variables
Given a sample space S and an event A, the indicator random variable I{A} associated with event A is defined as:
$$I\{A\} = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{otherwise} \end{cases}$$
7
E.g.: Consider flipping a fair coin.
Sample space S = {H, T}
Define random variable Y with Pr{Y = H} = Pr{Y = T} = 1/2.
We can define an indicator r.v. $X_H$ associated with the coin coming up heads, i.e. Y = H:
$$X_H = I\{Y = H\} = \begin{cases} 1 & \text{if } Y = H \\ 0 & \text{if } Y = T \end{cases}$$
$$E[X_H] = E[I\{Y = H\}] = 1 \cdot \Pr\{Y = H\} + 0 \cdot \Pr\{Y = T\} = \frac{1}{2}$$
8
Lemma: Given a sample space S and an event A in the sample space S, let $X_A = I\{A\}$. Then $E[X_A] = \Pr\{A\}$.
Proof:
$$E[X_A] = E[I\{A\}] = 1 \cdot \Pr\{A\} + 0 \cdot \Pr\{\bar{A}\} = \Pr\{A\}$$
9
The birthday paradox:
How many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year?
(1) Suppose there are k people and there are n days in a year.
$b_i$: the i-th person's birthday, i = 1,…,k
$\Pr\{b_i = r\} = 1/n$, for i = 1,…,k and r = 1,2,…,n
$\Pr\{b_i = r, b_j = r\} = \Pr\{b_i = r\} \cdot \Pr\{b_j = r\} = 1/n^2$ (birthdays are assumed independent)
10
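The probability that persons i and j share a birthday is
$$\Pr\{b_i = b_j\} = \sum_{r=1}^{n} \Pr\{b_i = r, b_j = r\} = \sum_{r=1}^{n} \frac{1}{n^2} = \frac{1}{n}$$
Define event $A_i$: person i's birthday is different from person j's for all j < i.
$B_k = \bigcap_{i=1}^{k} A_i$: the event that k people have distinct birthdays, so $B_k = A_k \cap B_{k-1}$.
$\Pr\{B_k\} = \Pr\{B_{k-1} \cap A_k\} = \Pr\{B_{k-1}\} \Pr\{A_k \mid B_{k-1}\}$, where $\Pr\{B_1\} = \Pr\{A_1\} = 1$.
Unrolling the recurrence and using $1 + x \le e^x$:
$$
\begin{aligned}
\Pr\{B_k\} &= \Pr\{B_{k-1}\} \Pr\{A_k \mid B_{k-1}\} \\
&= \Pr\{B_{k-2}\} \Pr\{A_{k-1} \mid B_{k-2}\} \Pr\{A_k \mid B_{k-1}\} \\
&\;\;\vdots \\
&= \Pr\{B_1\} \Pr\{A_2 \mid B_1\} \Pr\{A_3 \mid B_2\} \cdots \Pr\{A_k \mid B_{k-1}\} \\
&= 1 \cdot \left(\frac{n-1}{n}\right) \left(\frac{n-2}{n}\right) \cdots \left(\frac{n-k+1}{n}\right) \\
&= 1 \cdot \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \\
&\le e^{-1/n} e^{-2/n} \cdots e^{-(k-1)/n} \\
&= e^{-\sum_{i=1}^{k-1} i/n} = e^{-k(k-1)/2n} \\
&\le \frac{1}{2}, \quad \text{when } -\frac{k(k-1)}{2n} \le \ln\frac{1}{2}
\end{aligned}
$$
So the prob. that all k birthdays are distinct is at most 1/2 when $k(k-1) \ge 2n \ln 2$, i.e. $k \ge \left(1 + \sqrt{1 + (8 \ln 2) n}\right)/2$.
For n = 365, we have k ≥ 23.
■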
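The bound can be compared with the exact product $(1 - 1/n)(1 - 2/n) \cdots (1 - (k-1)/n)$. The following is a minimal sketch, assuming n = 365 equally likely and independent birthdays; the function names are illustrative.

```python
import math
from fractions import Fraction

def pr_all_distinct(k, n=365):
    """Exact Pr{B_k}: product of (n - i) / n for i = 0, ..., k - 1."""
    prob = Fraction(1)
    for i in range(k):
        prob *= Fraction(n - i, n)
    return prob

def exponential_upper_bound(k, n=365):
    """The bound e^{-k(k-1)/2n} obtained from 1 + x <= e^x."""
    return math.exp(-k * (k - 1) / (2 * n))

def smallest_k_from_bound(n=365):
    """Smallest integer k with k >= (1 + sqrt(1 + (8 ln 2) n)) / 2."""
    return math.ceil((1 + math.sqrt(1 + 8 * math.log(2) * n)) / 2)

def smallest_k_exact(n=365):
    """Smallest k for which the exact Pr{B_k} is at most 1/2."""
    k = 1
    while pr_all_distinct(k, n) > Fraction(1, 2):
        k += 1
    return k

print(smallest_k_from_bound())  # 23
print(smallest_k_exact())       # 23
print(float(pr_all_distinct(23)), exponential_upper_bound(23))
```

For n = 365 the exponential bound is tight enough that it gives the same threshold, k = 23, as the exact product.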
11
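(2) Analysis using indicator random variables
For each pair (i, j) of the k people in the room, define the indicator r.v. $X_{ij}$, for 1 ≤ i < j ≤ k, by
$$X_{ij} = I\{\text{person } i \text{ and person } j \text{ have the same birthday}\} = \begin{cases} 1 & \text{if person } i \text{ and person } j \text{ have the same birthday} \\ 0 & \text{otherwise} \end{cases}$$
$$E[X_{ij}] = \Pr\{\text{person } i \text{ and person } j \text{ have the same birthday}\} = \frac{1}{n}$$
Let $X = \sum_{i=1}^{k} \sum_{j=i+1}^{k} X_{ij}$ be the number of pairs of people with the same birthday. Then
$$E[X] = E\left[\sum_{i=1}^{k} \sum_{j=i+1}^{k} X_{ij}\right] = \sum_{i=1}^{k} \sum_{j=i+1}^{k} E[X_{ij}] = \binom{k}{2} \frac{1}{n} = \frac{k(k-1)}{2n}$$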
12
When k(k-1) ≥ 2n, the expected number of pairs of people with the same birthday is at least 1:
$$k^2 - k - 2n \ge 0 \implies k \ge \frac{1 + \sqrt{1 + 8n}}{2}$$
For n = 365, k ≥ 28, we expect to find at least one matching pair.
■
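The indicator-variable result E[X] = k(k-1)/2n can be verified by brute force on a small instance, and the threshold for n = 365 computed directly. This is a sketch under the same uniform, independent birthday assumption; the small values k = 4, n = 5 in the brute-force check are arbitrary choices for illustration.

```python
import math
from fractions import Fraction
from itertools import combinations, product

def expected_pairs(k, n=365):
    """E[X] = k(k-1) / 2n, by linearity of expectation."""
    return Fraction(k * (k - 1), 2 * n)

def expected_pairs_brute_force(k, n):
    """Average number of matching pairs over all n**k birthday assignments."""
    total = 0
    for birthdays in product(range(n), repeat=k):
        total += sum(
            1 for i, j in combinations(range(k), 2) if birthdays[i] == birthdays[j]
        )
    return Fraction(total, n ** k)

def smallest_k_with_expected_pair(n=365):
    """Smallest integer k with k(k-1) >= 2n."""
    return math.ceil((1 + math.sqrt(1 + 8 * n)) / 2)

print(expected_pairs_brute_force(4, 5) == expected_pairs(4, 5))  # True
print(smallest_k_with_expected_pair())                           # 28
print(float(expected_pairs(28)))
```

Note that this threshold (expected number of matching pairs at least 1) answers a different question from the 50% threshold of the first analysis, which is why the two values of k differ.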
13
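Balls and bins problem:
Randomly toss identical balls into b bins, numbered 1,2,…,b. The probability that a tossed ball lands in any given bin is 1/b.
(a) How many balls fall in a given bin? If n balls are tossed, the expected number of balls that fall in the given bin is n/b.
(b) How many balls must one toss, on the average, until a given bin contains a ball? The number of tosses has a geometric distribution with probability 1/b. Writing E for its expected value:
$$
\begin{aligned}
E &= 1 \cdot \frac{1}{b} + 2 \cdot \frac{1}{b}\left(1 - \frac{1}{b}\right) + 3 \cdot \frac{1}{b}\left(1 - \frac{1}{b}\right)^2 + \cdots \\
\left(1 - \frac{1}{b}\right) E &= 1 \cdot \frac{1}{b}\left(1 - \frac{1}{b}\right) + 2 \cdot \frac{1}{b}\left(1 - \frac{1}{b}\right)^2 + \cdots \\
E - \left(1 - \frac{1}{b}\right) E &= \frac{1}{b}\left(1 + \left(1 - \frac{1}{b}\right) + \left(1 - \frac{1}{b}\right)^2 + \cdots\right) = \frac{1}{b} \cdot \frac{1}{1 - \left(1 - \frac{1}{b}\right)} = 1 \\
\frac{1}{b} E &= 1 \implies E = b
\end{aligned}
$$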
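For part (b), the expected value of a geometric distribution with success probability 1/b can be checked in two ways: a partial sum of the series, and a seeded simulation. This is only a sketch; the number of series terms, the number of trials and the seed are arbitrary assumptions.

```python
import random

def expected_tosses_series(b, terms=10_000):
    """Partial sum of sum_{i>=1} i * (1 - 1/b)**(i - 1) * (1/b)."""
    p = 1.0 / b
    return sum(i * (1 - p) ** (i - 1) * p for i in range(1, terms + 1))

def average_tosses_simulated(b, trials=20_000, seed=0):
    """Average number of tosses until a fixed bin (bin 0 here) receives a ball."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tosses = 1
        while rng.randrange(b) != 0:  # each toss lands in bin 0 with prob. 1/b
            tosses += 1
        total += tosses
    return total / trials

b = 10
print(expected_tosses_series(b))    # close to b
print(average_tosses_simulated(b))  # close to b, up to sampling noise
```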
14
(c) (Coupon collector's problem) How many balls must one toss until every bin contains at least one ball?
Call a toss in which a ball falls into an empty bin a "hit". We want to know the expected number n of tosses required to get b hits.
The i-th stage consists of the tosses after the (i-1)-st hit until the i-th hit.
For each toss during the i-th stage, there are i-1 bins that contain balls and b-i+1 empty bins.
Thus, for each toss in the i-th stage, the probability of obtaining a hit is (b-i+1)/b.
Let $n_i$ be the number of tosses in the i-th stage. Thus the number of tosses required to get b hits is $n = \sum_{i=1}^{b} n_i$.
Each $n_i$ has a geometric distribution with probability of success (b-i+1)/b, so $E[n_i] = b/(b-i+1)$.
$$E[n] = E\left[\sum_{i=1}^{b} n_i\right] = \sum_{i=1}^{b} E[n_i] = \sum_{i=1}^{b} \frac{b}{b-i+1} = b \sum_{i=1}^{b} \frac{1}{i} = b(\ln b + O(1)) = O(b \ln b)$$
■
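The stage-by-stage sum can be evaluated exactly. The sketch below computes $\sum_{i=1}^{b} b/(b-i+1)$ with rational arithmetic and compares it with b ln b; the function name is illustrative.

```python
import math
from fractions import Fraction

def expected_tosses(b):
    """E[n] = sum over stages i = 1..b of b / (b - i + 1), i.e. b times H_b."""
    return sum(Fraction(b, b - i + 1) for i in range(1, b + 1))

print(expected_tosses(6))  # 147/10, i.e. 14.7 tosses on average for b = 6
for b in (10, 100, 1000):
    print(b, float(expected_tosses(b)), b * math.log(b))
```

The exact value lies between b ln b and b(ln b + 1), consistent with E[n] = b(ln b + O(1)).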
15
Streaks
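Flip a fair coin n times. What is the longest streak of consecutive heads? Ans: the expected length is Θ(lg n).
Let $A_{ik}$ be the event that a streak of heads of length at least k begins with the i-th coin flip.
$$\Pr\{A_{ik}\} = 1/2^k$$
For $k = 2\lceil \lg n \rceil$:
$$\Pr\{A_{i,2\lceil \lg n \rceil}\} = 1/2^{2\lceil \lg n \rceil} \le 1/2^{2 \lg n} = 1/n^2$$
$$\Pr\left\{\bigcup_{i=1}^{n-2\lceil \lg n \rceil+1} A_{i,2\lceil \lg n \rceil}\right\} \le \sum_{i=1}^{n-2\lceil \lg n \rceil+1} \frac{1}{n^2} < \sum_{i=1}^{n} \frac{1}{n^2} = \frac{1}{n}$$
For j = 0,1,2,…,n, let $L_j$ be the event that the longest streak of heads has length exactly j, and let L be the length of the longest streak.
$$E[L] = \sum_{j=0}^{n} j \Pr\{L_j\}$$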
16
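Note that the events $L_j$ for j = 0,1,…,n are disjoint. So the probability that a streak of heads of length $2\lceil \lg n \rceil$ or more begins anywhere is
$$\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < \frac{1}{n}$$
Thus, $\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 1/n$ while $\sum_{j=0}^{n} \Pr\{L_j\} = 1$. We have $\sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} \le 1$.
$$
\begin{aligned}
E[L] &= \sum_{j=0}^{n} j \Pr\{L_j\} \\
&= \sum_{j=0}^{2\lceil \lg n \rceil - 1} j \Pr\{L_j\} + \sum_{j=2\lceil \lg n \rceil}^{n} j \Pr\{L_j\} \\
&< \sum_{j=0}^{2\lceil \lg n \rceil - 1} (2\lceil \lg n \rceil) \Pr\{L_j\} + \sum_{j=2\lceil \lg n \rceil}^{n} n \Pr\{L_j\} \\
&= 2\lceil \lg n \rceil \sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} + n \sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} \\
&< 2\lceil \lg n \rceil \cdot 1 + n \cdot (1/n) = O(\lg n)
\end{aligned}
$$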
17
For r ≥ 1,
$$\Pr\{A_{i,r\lceil \lg n \rceil}\} = 1/2^{r\lceil \lg n \rceil} \le 1/n^r$$
The probability is at most $n/n^r = 1/n^{r-1}$ that the longest streak is at least $r\lceil \lg n \rceil$.
Claim: The expected length of the longest streak of heads in n coin flips is Ω(lg n).
We look for streaks of length s by partitioning the n flips into approximately n/s groups of s flips each.
18
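Take $s = \lfloor (\lg n)/2 \rfloor$.
$$\Pr\{A_{i,\lfloor (\lg n)/2 \rfloor}\} = 1/2^{\lfloor (\lg n)/2 \rfloor} \ge 1/\sqrt{n}$$
The probability that a streak of heads of length $\lfloor (\lg n)/2 \rfloor$ does not begin in position i is at most $1 - 1/\sqrt{n}$.
The groups are formed from mutually exclusive, independent coin flips, so the prob. that every one of the groups fails to be a streak of length $\lfloor (\lg n)/2 \rfloor$ is at most
$$
\begin{aligned}
\left(1 - 1/\sqrt{n}\right)^{\lfloor n/\lfloor (\lg n)/2 \rfloor \rfloor} &\le \left(1 - 1/\sqrt{n}\right)^{n/\lfloor (\lg n)/2 \rfloor - 1} \\
&\le \left(1 - 1/\sqrt{n}\right)^{2n/\lg n - 1} \\
&\le e^{-(2n/\lg n - 1)/\sqrt{n}} \\
&= O(e^{-\lg n}) = O(1/n)
\end{aligned}
$$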
19
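Thus, the prob. that the longest streak is at least $\lfloor (\lg n)/2 \rfloor$ is
$$\sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} \Pr\{L_j\} \ge 1 - O(1/n)$$
Why? If any one group comes up all heads, the longest streak has length at least $\lfloor (\lg n)/2 \rfloor$, and all groups fail with probability O(1/n).
$$
\begin{aligned}
E[L] &= \sum_{j=0}^{n} j \Pr\{L_j\} \\
&= \sum_{j=0}^{\lfloor (\lg n)/2 \rfloor - 1} j \Pr\{L_j\} + \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} j \Pr\{L_j\} \\
&\ge \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} \lfloor (\lg n)/2 \rfloor \Pr\{L_j\} \\
&= \lfloor (\lg n)/2 \rfloor \sum_{j=\lfloor (\lg n)/2 \rfloor}^{n} \Pr\{L_j\} \\
&\ge \lfloor (\lg n)/2 \rfloor \left(1 - O(1/n)\right) = \Omega(\lg n)
\end{aligned}
$$
■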
20
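Using indicator r.v.:
Let $X_{ik} = I\{A_{ik}\}$.
Let $X = \sum_{i=1}^{n-k+1} X_{ik}$.
$$E[X] = E\left[\sum_{i=1}^{n-k+1} X_{ik}\right] = \sum_{i=1}^{n-k+1} E[X_{ik}] = \sum_{i=1}^{n-k+1} \Pr\{A_{ik}\} = \sum_{i=1}^{n-k+1} \frac{1}{2^k} = \frac{n-k+1}{2^k}$$
If k = c lg n, for some positive constant c,
$$E[X] = \frac{n - c \lg n + 1}{2^{c \lg n}} = \frac{n - c \lg n + 1}{n^c} = \frac{1}{n^{c-1}} - \frac{(c \lg n - 1)/n}{n^{c-1}} = \Theta\left(\frac{1}{n^{c-1}}\right)$$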
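The formula E[X] = (n-k+1)/2^k can be confirmed by enumerating all flip sequences for a small n. This is a brute-force sketch; the function name and the values n = 10, k = 3 are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def expected_streak_starts(n, k):
    """Average, over all 2**n flip sequences, of the number of positions i
    at which k consecutive heads begin (the events A_ik)."""
    total = 0
    for flips in product("HT", repeat=n):
        for i in range(n - k + 1):
            if all(f == "H" for f in flips[i:i + k]):
                total += 1
    return Fraction(total, 2 ** n)

n, k = 10, 3
print(expected_streak_starts(n, k))  # 1
print(Fraction(n - k + 1, 2 ** k))   # 1
```

Each of the n-k+1 starting positions contributes 1/2^k regardless of overlaps between windows, which is exactly what linearity of expectation provides.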
21
If c is large, the expected number of streaks of length c lg n is very small, so such streaks are unlikely to occur.
If c = 1/2, then we obtain
$$E[X] = \Theta\left(1/n^{1/2-1}\right) = \Theta\left(n^{1/2}\right)$$
and we expect that there will be a large number of streaks of length (1/2) lg n.
Therefore, one streak of such a length is very likely to occur.
Conclusion: The length of the longest streak is Θ(lg n).
■
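To see the Θ(lg n) behaviour concretely, the expected longest streak can be computed exactly for moderate n. The sketch below uses a dynamic program over the length of the current run of heads, which is one possible approach and not something taken from the slides, together with the identity $E[L] = \sum_{m \ge 1} \Pr\{L \ge m\}$.

```python
import math
from fractions import Fraction

def pr_longest_at_most(n, m):
    """Pr{L <= m} for n fair flips, by counting sequences with no run of
    heads longer than m. counts[r] = number of prefixes ending in exactly
    r consecutive heads."""
    counts = [0] * (m + 1)
    counts[0] = 1  # the empty prefix ends in a run of 0 heads
    for _ in range(n):
        new = [0] * (m + 1)
        new[0] = sum(counts)        # append a tail: the run resets to 0
        for r in range(m):
            new[r + 1] = counts[r]  # append a head: the run grows, stays <= m
        counts = new
    return Fraction(sum(counts), 2 ** n)

def expected_longest_streak(n):
    """E[L] = sum_{m=1}^{n} Pr{L >= m} = sum_{m=0}^{n-1} (1 - Pr{L <= m})."""
    return sum(1 - pr_longest_at_most(n, m) for m in range(n))

for n in (16, 64, 128):
    print(n, float(expected_longest_streak(n)), math.log2(n))
```

For each n printed, the exact value stays between (lg n)/2 and 2 lg n + 1, matching the lower and upper bounds derived in this section.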
22
The on-line hiring problem:
To hire an assistant, an employment agency sends one candidate each day. After interviewing that person you decide to either hire that person or not. The process stops when a person is hired.
What is the trade-off between minimizing the amount of interviewing and maximizing the quality of the candidate hired?
25
Let $M(j) = \max_{1 \le i \le j} \{\text{score}(i)\}$.
Let S be the event that the best-qualified applicant is chosen.
Let $S_i$ be the event that the best-qualified applicant is chosen and is the i-th one interviewed.
The $S_i$ are disjoint and we have $\Pr\{S\} = \sum_{i=1}^{n} \Pr\{S_i\}$.
If the best-qualified applicant is one of the first k, it cannot be chosen, so $\Pr\{S_i\} = 0$ for i = 1,…,k and thus
$\Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\}$.
26
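Let $B_i$ be the event that the best-qualified applicant is in position i.
Let $O_i$ denote the event that none of the applicants in positions k+1 through i-1 is chosen.
If $S_i$ happens, then $B_i$ and $O_i$ must both happen.
$B_i$ and $O_i$ are independent! Why? $O_i$ depends only on the relative order of the first i-1 applicants, while $B_i$ depends only on whether the applicant in position i is better than all the others.
$\Pr\{S_i\} = \Pr\{B_i \cap O_i\} = \Pr\{B_i\} \Pr\{O_i\}$.
Clearly, $\Pr\{B_i\} = 1/n$.
$\Pr\{O_i\} = k/(i-1)$. Why? $O_i$ happens exactly when the best of the first i-1 applicants is among the first k, and that applicant is equally likely to be in any of the i-1 positions.
Thus $\Pr\{S_i\} = k/(n(i-1))$.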
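Summing Pr{S_i} over i = k+1,…,n gives the overall success probability, which can be checked against a brute-force enumeration of all interview orders. The sketch below assumes the usual strategy behind this analysis, which is not spelled out in the surviving slides: reject the first k applicants, then hire the first later applicant who scores higher than everyone seen so far, falling back to the last applicant if nobody does. All function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def pr_success_formula(n, k):
    """Pr{S} = sum_{i=k+1}^{n} k / (n (i - 1)), for 1 <= k < n."""
    return sum(Fraction(k, n * (i - 1)) for i in range(k + 1, n + 1))

def pr_success_brute_force(n, k):
    """Fraction of the n! interview orders in which the assumed strategy
    hires the best applicant (the one with score n)."""
    wins = 0
    total = 0
    for scores in permutations(range(1, n + 1)):
        total += 1
        best_of_first_k = max(scores[:k])  # M(k): best score among the rejected
        hired = scores[-1]                 # assumed fallback: hire the last one
        for s in scores[k:]:
            if s > best_of_first_k:        # first applicant better than all seen
                hired = s
                break
        wins += hired == n
    return Fraction(wins, total)

n = 6
for k in range(1, n):
    print(k, pr_success_formula(n, k), pr_success_brute_force(n, k))
```

For n = 6 the two columns agree for every k, and the largest success probability appears at an intermediate k, which is the trade-off the problem statement asks about.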