CSE 321 Discrete Structures
Winter 2008
Lecture 19
Probability Theory
Announcements
• Readings – Probability Theory
  – 6.1, 6.2 (5.1, 5.2) Probability Theory
  – 6.3 (New material!) Bayes’ Theorem
  – 6.4 (5.3) Expectation
• Advanced Counting Techniques – Ch. 7
  – Not covered
Discrete Probability
Experiment: Procedure that yields an outcome
Sample space: Set of all possible outcomes
Event: subset of the sample space
If S is a sample space of equally likely outcomes and E is an event, then the probability of E is p(E) = |E| / |S|
Example: Dice
Example: Poker
Probability of 4 of a kind
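The four-of-a-kind count can be checked with a short Python sketch (not part of the slides): pick the rank, take all four cards of it, then any one of the 48 remaining cards.

```python
from math import comb

# Four of a kind: 13 choices of rank, all comb(4, 4) = 1 ways to take
# the four cards of that rank, then any of the remaining 48 cards.
hands = comb(52, 5)                      # 2,598,960 five-card hands
four_kind = 13 * comb(4, 4) * 48
p = four_kind / hands
print(four_kind, hands, p)               # 624 2598960 ≈ 0.00024
```

So roughly 1 hand in 4,165 is four of a kind.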
Combinations of Events
Eᶜ is the complement of E
P(Eᶜ) = 1 – P(E)
P(E₁ ∪ E₂) = P(E₁) + P(E₂) – P(E₁ ∩ E₂)
Probability Concepts
• Probability Distribution
• Conditional Probability
• Independence
• Bernoulli Trials / Binomial Distribution
• Random Variable
Discrete Probability Theory
• Set S
• Probability distribution p : S → [0,1]
  – For s ∈ S, 0 ≤ p(s) ≤ 1
  – Σs∈S p(s) = 1
• Event E, E ⊆ S
• p(E) = Σs∈E p(s)
Examples
Conditional Probability
Let E and F be events with p(F) > 0. The conditional probability of E given F, denoted p(E | F), is defined as:
p(E | F) = p(E ∩ F) / p(F)
Examples
Independence
The events E and F are independent if and only if p(E ∩ F) = p(E)p(F)
E and F are independent if and only if p(E | F) = p(E)
Are these independent?
• Flip a coin three times
  – E: the first coin is a head
  – F: the second coin is a head
• Roll two dice
  – E: the sum of the two dice is 5
  – F: the first die is a 1
• Roll two dice
  – E: the sum of the two dice is 7
  – F: the first die is a 1
• Deal two five card poker hands
  – E: hand one has four of a kind
  – F: hand two has four of a kind
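The two dice cases can be settled by brute-force enumeration, a sketch not in the slides: list all 36 equally likely rolls and test whether p(E ∩ F) = p(E)p(F).

```python
from itertools import product
from fractions import Fraction

# All 36 equally likely outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

def indep(E, F):
    # E and F are independent iff p(E ∩ F) = p(E) p(F).
    return prob(lambda r: E(r) and F(r)) == prob(E) * prob(F)

# Sum is 5 vs. first die is 1: p(E ∩ F) = 1/36 but p(E)p(F) = 1/54.
print(indep(lambda r: r[0] + r[1] == 5, lambda r: r[0] == 1))  # False
# Sum is 7 vs. first die is 1: 1/36 = (1/6)(1/6), so independent.
print(indep(lambda r: r[0] + r[1] == 7, lambda r: r[0] == 1))  # True
```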
Bernoulli Trials and Binomial Distribution
• Bernoulli Trial
  – Success probability p, failure probability q
The probability of exactly k successes in n independent Bernoulli trials is C(n, k) pᵏ qⁿ⁻ᵏ
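A small Python helper (`binomial_pmf` is a name introduced here, not from the slides) makes the formula concrete:

```python
from math import comb

def binomial_pmf(n, k, p):
    # Probability of exactly k successes in n independent
    # Bernoulli trials with success probability p.
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example: exactly 2 heads in 4 fair flips is C(4,2) / 2^4 = 6/16.
print(binomial_pmf(4, 2, 0.5))  # 0.375
```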
Random Variables
A random variable is a function from a sample space to the real numbers
Bayes’ Theorem
Suppose that E and F are events from a sample space S such that p(E) > 0 and p(F) > 0. Then
p(F | E) = p(E | F) p(F) / [ p(E | F) p(F) + p(E | Fᶜ) p(Fᶜ) ]
False Positives, False Negatives
Let D be the event that a person has the disease
Let Y be the event that a person tests positive for the disease
Testing for disease
Disease is very rare: p(D) = 1/100,000
Testing is accurate:
  False negative: 1%
  False positive: 0.5%
Suppose you get a positive result. What do you conclude?
P(D|Y)
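Plugging the slide's numbers into Bayes' theorem, a sketch not in the slides:

```python
p_d = 1 / 100_000        # prior: disease prevalence, p(D)
p_y_given_d = 0.99       # 1% false negatives, so p(Y | D) = 0.99
p_y_given_nd = 0.005     # 0.5% false positives, p(Y | not D)

# Bayes' theorem: p(D|Y) = p(Y|D)p(D) / [p(Y|D)p(D) + p(Y|Dᶜ)p(Dᶜ)]
p_y = p_y_given_d * p_d + p_y_given_nd * (1 - p_d)
p_d_given_y = p_y_given_d * p_d / p_y
print(p_d_given_y)       # ≈ 0.00198
```

Even with a positive test, the chance of disease is only about 0.2%: the disease is so rare that false positives dominate.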
Spam Filtering
From: Zambia Nation Farmers Union [[email protected]]
Subject: Letter of assistance for school installation
To: Richard Anderson
Dear Richard,
I hope you are fine, Iam through talking to local headmen about the possible assistance of school installation. the idea is and will be welcome.
I trust that you will do your best as i await for more from you.
Once again
Thanking you very much
Sebastian Mazuba.
Bayesian Spam filters
• Classification domain– Cost of false negative– Cost of false positive
• Criteria for spam– v1agra, ONE HUNDRED MILLION USD
• Basic question: given an email message, what is the probability it is spam, based on the spam criteria?
Email message with phrase “Account Review”
• 250 of 20,000 messages known to be spam contain the phrase
• 5 of 10,000 messages known not to be spam contain the phrase
• Assuming 50% of messages are spam, what is the probability that a message with “Account Review” is spam?
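With a 50/50 prior the priors cancel in Bayes' theorem, leaving only the two conditional phrase frequencies. A quick check, not part of the slides:

```python
p_phrase_given_spam = 250 / 20_000   # 0.0125
p_phrase_given_ham = 5 / 10_000      # 0.0005

# Equal priors p(spam) = p(ham) = 0.5 cancel in Bayes' theorem:
# p(spam | phrase) = p(phrase|spam) / (p(phrase|spam) + p(phrase|ham))
p_spam = p_phrase_given_spam / (p_phrase_given_spam + p_phrase_given_ham)
print(p_spam)                        # ≈ 0.9615
```

A message containing “Account Review” is about 96% likely to be spam under these assumptions.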
Proving Bayes’ Theorem
Expectation
The expected value of random variable X(s) on sample space S is:
E(X) = Σs∈S p(s) X(s)
Flip a coin until the first head
Expected number of flips?
Probability Space:
Computing the expectation:
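The number of flips X satisfies p(X = k) = (1/2)ᵏ, and E(X) = Σ k (1/2)ᵏ = 2. A numerical sketch (not in the slides) truncating the infinite sum:

```python
from fractions import Fraction

# X = number of flips up to and including the first head.
# p(X = k) = (1/2)^k, so E(X) = sum over k >= 1 of k * (1/2)^k = 2.
# The tail beyond k = 60 contributes less than 2^-50, so truncate there.
expectation = sum(Fraction(k, 2**k) for k in range(1, 61))
print(float(expectation))  # ≈ 2.0
```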
Linearity of Expectation
E(X1 + X2) = E(X1) + E(X2)
E(aX) = aE(X)
Hashing
H : M → [0..n-1]
If k elements have been hashed to random locations, what is the expected number of elements in bucket j?
What is the expected number of collisions when hashing k elements to random locations?
Hashing analysis
Sample space: [0..n-1] × [0..n-1] × · · · × [0..n-1]
Random Variables
Xj = number of elements hashed to bucket j
C = total number of collisions
Bij = 1 if element i is hashed to bucket j
Bij = 0 if element i is not hashed to bucket j
Cab = 1 if element a is hashed to the same bucket as element b
Cab = 0 if element a is hashed to a different bucket than element b
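By linearity, E(Xj) = Σi E(Bij) = k/n and E(C) = Σa&lt;b E(Cab) = C(k,2)/n. A brute-force check over a tiny sample space (a sketch, not part of the slides; n = 5 and k = 3 are arbitrary small values):

```python
from math import comb
from itertools import product
from fractions import Fraction

n, k = 5, 3  # small enough to enumerate all n^k equally likely outcomes

outcomes = list(product(range(n), repeat=k))  # the full sample space
exp_bucket = Fraction(0)  # E(X_0): expected count in bucket 0
exp_coll = Fraction(0)    # E(C): expected number of colliding pairs
for h in outcomes:
    exp_bucket += Fraction(sum(1 for x in h if x == 0), len(outcomes))
    exp_coll += Fraction(sum(1 for a in range(k) for b in range(a + 1, k)
                             if h[a] == h[b]), len(outcomes))

print(exp_bucket == Fraction(k, n))          # True: k/n per bucket
print(exp_coll == Fraction(comb(k, 2), n))   # True: C(k,2)/n collisions
```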
Counting inversions
Let p1, p2, . . . , pn be a permutation of 1 . . . n.
(pi, pj) is an inversion if i < j and pi > pj.
4, 2, 5, 1, 3
1, 6, 4, 3, 2, 5
7, 6, 5, 4, 3, 2, 1
Expected number of inversions for a random permutation
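Each pair (i, j) is inverted in exactly half of all permutations, so by linearity the expected number of inversions is C(n,2)/2 = n(n-1)/4. A verification by enumeration, not part of the slides:

```python
from itertools import permutations
from fractions import Fraction

def inversions(p):
    # Count pairs i < j with p[i] > p[j].
    return sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
               if p[i] > p[j])

print(inversions((4, 2, 5, 1, 3)))  # 6, the first example above

# Average over all n! permutations equals n(n-1)/4.
n = 5
perms = list(permutations(range(1, n + 1)))
avg = Fraction(sum(inversions(p) for p in perms), len(perms))
print(avg, Fraction(n * (n - 1), 4))  # both 5
```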
Insertion sort
for i := 1 to n-1 {
    j := i;
    while (j > 0 and A[j-1] > A[j]) {
        swap(A[j-1], A[j]);
        j := j - 1;
    }
}
4 2 5 1 3
Expected number of swaps for Insertion Sort
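Each adjacent swap removes exactly one inversion, so the number of swaps equals the number of inversions, and the expected swap count for a random permutation is n(n-1)/4. A Python sketch of the slide's pseudocode, checking this on the example:

```python
def insertion_sort_swaps(a):
    # Python version of the insertion sort above, counting swaps.
    a = list(a)
    swaps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j - 1] > a[j]:
            a[j - 1], a[j] = a[j], a[j - 1]  # each swap fixes one inversion
            swaps += 1
            j -= 1
    return swaps

def inversions(p):
    return sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
               if p[i] > p[j])

print(insertion_sort_swaps([4, 2, 5, 1, 3]))  # 6
print(inversions([4, 2, 5, 1, 3]))            # 6: same count
```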
Left to right maxima
max_so_far := A[0];
for i := 1 to n-1
    if (A[i] > max_so_far)
        max_so_far := A[i];
5, 2, 9, 14, 11, 18, 7, 16, 1, 20, 3, 19, 10, 15, 4, 6, 17, 12, 8
What is the expected number of left-to-right maxima in a random permutation?
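Position i holds a left-to-right maximum exactly when the largest of the first i values sits at position i, which happens with probability 1/i; linearity then gives E = 1 + 1/2 + · · · + 1/n = Hₙ, the nth harmonic number. A check by enumeration, not part of the slides:

```python
from fractions import Fraction
from itertools import permutations

def lr_maxima(p):
    # Count positions whose value exceeds everything before them.
    count, best = 0, 0
    for x in p:
        if x > best:
            count, best = count + 1, x
    return count

n = 5
perms = list(permutations(range(1, n + 1)))
avg = Fraction(sum(lr_maxima(p) for p in perms), len(perms))
harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
print(avg == harmonic)  # True: both equal H_5 = 137/60
```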