Lecture 7-cs648 Randomized Algorithms


Transcript of Lecture 7-cs648 Randomized Algorithms

Page 1: Lecture 7-cs648 Randomized Algorithms

Randomized Algorithms CS648

Lecture 7: Two applications of the Union Theorem
• Balls into Bins experiment: maximum load
• Randomized Quick Sort: concentration of the running time

Page 2: Lecture 7-cs648 Randomized Algorithms

Union theorem

Theorem: Suppose there is an event $E$ defined over a probability space $(\Omega, P)$ such that $E = \bigcup_{i=1}^{n} E_i$. Then

$P(E) \le \sum_{i=1}^{n} P(E_i)$

Furthermore, if $P(E_i)$ is the same for each $i$, then

$P(E) \le n \cdot P(E_1)$

Page 3: Lecture 7-cs648 Randomized Algorithms

Union theorem

When to use the Union theorem: Suppose we wish to get an upper bound on $P(E)$, but it turns out to be difficult to calculate $P(E)$ directly.

How to use the Union theorem: Try to express $E$ as a union of events $E_i$ (usually identical) such that it is easy to calculate $P(E_i)$. Then we can get an upper bound on $P(E)$ as

$P(E) \le \sum_{i} P(E_i)$
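As a small illustration of the recipe above, the following sketch checks the Union theorem exactly on a toy sample space. The two-dice experiment and the names `omega`, `prob` and `events` are assumptions made for this example only; they do not come from the lecture.

```python
from fractions import Fraction
from itertools import product

# Toy sample space (an assumption for illustration): two fair dice,
# all 36 outcomes equally likely.
omega = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

# E_i: die number i shows a six. The two events have the same probability.
events = [lambda w: w[0] == 6, lambda w: w[1] == 6]

# E: at least one die shows a six, i.e. the union of E_1 and E_2.
p_union = prob(lambda w: any(e(w) for e in events))
p_bound = sum(prob(e) for e in events)

print(p_union, p_bound)  # prints: 11/36 1/3
```

The bound 1/3 slightly overshoots the exact value 11/36 because the outcome (6, 6) is counted twice; the Union theorem trades this slack for ease of calculation.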

Page 4: Lecture 7-cs648 Randomized Algorithms

APPLICATION 1 OF THE UNION THEOREM

BALLS INTO BINS: MAXIMUM LOAD


Page 5: Lecture 7-cs648 Randomized Algorithms

Balls into Bins

Ball-bin Experiment: There are $m$ balls and $n$ bins. Each ball selects its bin uniformly at random, independently of the other balls, and falls into it.

Used in:
• Hashing
• Load balancing in distributed environments

[Figure: balls 1, 2, …, $m$ being thrown into bins 1, 2, …, $i$, …, $n$]

Page 6: Lecture 7-cs648 Randomized Algorithms

Balls into Bins

Ball-bin Experiment: There are $m$ balls and $n$ bins. Each ball selects its bin uniformly at random, independently of the other balls, and falls into it.

Theorem: For the case $m = n$, with very high probability every bin has $O(\log n)$ balls.
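The experiment is easy to simulate. The sketch below is only an empirical illustration, not part of the proof; the function name `max_load`, the seed and the chosen values of $n$ are assumptions made for this example.

```python
import math
import random
from collections import Counter

def max_load(m, n, rng):
    """Throw m balls into n bins uniformly and independently; return the maximum load."""
    loads = Counter(rng.randrange(n) for _ in range(m))
    return max(loads.values())

if __name__ == "__main__":
    rng = random.Random(648)  # fixed seed, chosen arbitrarily
    for n in (100, 1_000, 10_000):
        # Largest maximum load seen over 20 independent runs with m = n.
        worst = max(max_load(n, n, rng) for _ in range(20))
        print(n, worst, round(math.log2(n), 1))
```

Printing $\log_2 n$ next to the observed maximum load gives a rough sense of scale for the $O(\log n)$ claim.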

Page 7: Lecture 7-cs648 Randomized Algorithms

Balls into Bins: The main difficulty and the way out

Event $E$: There is some bin having at least $c \log n$ balls.
Observation: It is too difficult to calculate $P(E)$ directly.

Question: What is the way out?

Page 8: Lecture 7-cs648 Randomized Algorithms

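Balls into Bins: From the perspective of the $j$th bin

Event $E$: There is some bin having at least $c \log n$ balls.
Event $E_j$: The $j$th bin has at least $c \log n$ balls.
Question: What is the relation between $E$ and $E_j$?
Answer: $E = \bigcup_{j=1}^{n} E_j$. Since $P(E_j)$ is the same for every $j$, the Union theorem gives

$P(E) \le n \cdot P(E_j)$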

Page 9: Lecture 7-cs648 Randomized Algorithms

Balls into Bins: From the perspective of the $j$th bin

Event $E$: There is some bin having at least $c \log n$ balls.
Event $E_j$: The $j$th bin has at least $c \log n$ balls.
Observation: Since $P(E) \le n \cdot P(E_j)$, in order to show $P(E) < n^{-4}$ it suffices to show

$P(E_j) < n^{-5}$

Page 10: Lecture 7-cs648 Randomized Algorithms

AIM: TO SHOW $P(E_j) < n^{-5}$

P($j$TH BIN HAS AT LEAST $c \log n$ BALLS) $< n^{-5}$

Page 11: Lecture 7-cs648 Randomized Algorithms

Calculating $P(E_j)$

$P[E_j]$ is bounded through a chain of inequalities that uses Stirling's formula and a suitable choice of the constant $c$, ending with $P[E_j] < n^{-5}$.
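One standard way to carry out such a calculation is sketched below. It is a reconstruction under stated assumptions and may differ in its steps from the slide: $m = n$, the threshold is $k = c \log_2 n$ (rounded up to an integer if necessary, which only makes the event less likely), and $E_j$ denotes the event that bin $j$ receives at least $k$ balls.

```latex
\begin{align*}
P(E_j)
  &\le \binom{n}{k}\left(\frac{1}{n}\right)^{k}
     && \text{some set of $k$ balls all falls into bin $j$} \\
  &\le \left(\frac{en}{k}\right)^{k}\left(\frac{1}{n}\right)^{k}
     && \text{since } k! \ge (k/e)^{k} \text{ (Stirling's formula)} \\
  &=   \left(\frac{e}{k}\right)^{k} \\
  &<   2^{-k}
     && \text{whenever } k > 2e \\
  &=   n^{-c}
     && \text{for } k = c \log_2 n .
\end{align*}
```

Choosing $c = 5$ gives $P(E_j) < n^{-5}$ for every $n \ge 3$, because then $5 \log_2 n > 2e$.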

Page 12: Lecture 7-cs648 Randomized Algorithms

Balls into Bins

Theorem: If $n$ balls are thrown uniformly at random and independently into $n$ bins, then with probability at least $1 - n^{-4}$, the maximum load of any bin will be $O(\log n)$ balls.

Note: With a slightly more careful calculation, it can be shown that the maximum load will be $O((\log n)/\log \log n)$.
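For small $n$ the bound can be checked against the exact binomial tail. The sketch below assumes the threshold $5 \log_2 n$ and the target $n^{-4}$ for the union bound; these constants, and the function name `tail_at_least`, are assumptions of this example rather than values confirmed by the slides.

```python
from fractions import Fraction
from math import ceil, comb, log2

def tail_at_least(n, k):
    """Exact P(a fixed bin receives at least k of n balls thrown into n bins)."""
    p = Fraction(1, n)
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

for n in (8, 64, 256):
    k = ceil(5 * log2(n))          # threshold c*log2(n) with c = 5 (assumed)
    union = n * tail_at_least(n, k)  # Union theorem over the n bins
    print(n, k, float(union), union < Fraction(1, n**4))
```

For $n = 8$ the threshold exceeds the number of balls, so the tail probability is exactly zero; the larger cases show how much room the Union theorem leaves.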

Page 13: Lecture 7-cs648 Randomized Algorithms

APPLICATION 2 OF THE UNION THEOREM

RANDOMIZED QUICK SORT: THE SECRET OF ITS POPULARITY


Page 14: Lecture 7-cs648 Randomized Algorithms

Concentration of Randomized Quick Sort

$X$: random variable for the number of comparisons during Randomized Quick Sort on an array $A$ of $n$ elements

We know: $E[X] \approx 2 n \ln n$

Our aim: $P(X > c\,n \log_a n) < n^{-d}$. For any constant $d$, we can find constants $a$ and $c$ such that the above inequality holds.

We shall show that $P(X > 8 n \log_{4/3} n) < n^{-7}$

Page 15: Lecture 7-cs648 Randomized Algorithms

Concentration of Randomized Quick Sort: Tools needed

1. Slightly generalized Union theorem: Suppose there is an event $E$ defined over a probability space $(\Omega, P)$ such that $E \subseteq \bigcup_{i} E_i$. Then

$P(E) \le \sum_{i} P(E_i)$

2. The probability that we get fewer than $t$ HEADS during $8t$ tosses of a fair coin is less than $(3/4)^{8t}$.
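The coin-tossing tool can be checked exactly for small values. The sketch below assumes the reading "fewer than $t$ HEADS in $8t$ tosses, bounded by $(3/4)^{8t}$"; that reading, the range of $t$ and the function name `prob_fewer_heads` are assumptions of this example.

```python
from fractions import Fraction
from math import comb

def prob_fewer_heads(t, tosses):
    """Exact P(fewer than t HEADS in `tosses` flips of a fair coin)."""
    return Fraction(sum(comb(tosses, h) for h in range(t)), 2**tosses)

for t in range(1, 11):
    exact = prob_fewer_heads(t, 8 * t)
    bound = Fraction(3, 4) ** (8 * t)   # assumed form of the bound
    print(t, float(exact), float(bound), exact < bound)
```

A numerical check of this kind does not replace the proof, but it confirms that the stated bound is not violated for the first few values of $t$.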

Page 16: Lecture 7-cs648 Randomized Algorithms

Randomized QuickSort: The main difficulty and the way out

Question: What is the main difficulty in showing $P(X > 8 n \log_{4/3} n) < n^{-7}$?
Answer: There is no direct way to bound $P(X > 8 n \log_{4/3} n)$ because
• the sample space is too huge
• the sample space is non-uniform

Question: How could we bound $E[X]$?
Answer: By taking a microscopic view of Randomized Quick Sort.

[Figure: elements of $A$ arranged in increasing order of values, with $e_i$ and $e_j$ marked]


Page 18: Lecture 7-cs648 Randomized Algorithms

Randomized QuickSort from the perspective of $e_i$

[Figure: elements of $A$ arranged in increasing order of values; $e_i$ takes part in successive recursive calls until it leaves the algorithm]

Page 19: Lecture 7-cs648 Randomized Algorithms

Randomized QuickSort from the perspective of $e_i$

$X_i$: number of recursive calls in which $e_i$ participates before being selected as a pivot.

Question: Is there any relation between $X$ and $X_i$?

Page 20: Lecture 7-cs648 Randomized Algorithms

Randomized QuickSort: A new way to count the comparisons

Key idea: Assign each comparison during a recursive call to the non-pivot element.

Question: Is there any relation between $X$ and $X_i$?
Answer: $X = \sum_{i=1}^{n} X_i$
Observation: If $X > 8 n \log_{4/3} n$, there must be at least one $i$ such that

$X_i > 8 \log_{4/3} n$
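The counting idea can be made concrete with an instrumented quick sort. In the sketch below, `total` plays the role of the total number of comparisons and `calls[v]` the number of recursive calls in which element `v` takes part as a non-pivot; these names, the list-based (not in-place) implementation and the assumption of distinct keys are choices made for this example.

```python
import random

def quicksort_with_counts(values, rng):
    """Sort distinct values; return (sorted list, total comparisons, per-element counts)."""
    total = 0
    calls = {v: 0 for v in values}

    def sort(sub):
        nonlocal total
        if len(sub) <= 1:
            return list(sub)
        pivot = rng.choice(sub)          # uniformly random pivot
        smaller, larger = [], []
        for v in sub:
            if v == pivot:
                continue                 # the pivot is not compared with itself
            total += 1                   # one comparison of v with the pivot ...
            calls[v] += 1                # ... charged to the non-pivot element v
            (smaller if v < pivot else larger).append(v)
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(values)), total, calls

rng = random.Random(648)                 # fixed seed, chosen arbitrarily
data = rng.sample(range(10_000), 500)    # 500 distinct keys
ordered, total, calls = quicksort_with_counts(data, rng)
print(total, sum(calls.values()), max(calls.values()))
```

The first two printed numbers always agree, which is the identity between the total count and the sum of the per-element counts. Since the total is at most $n$ times the largest per-element count, a large total forces some single element to have a large count.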

Page 21: Lecture 7-cs648 Randomized Algorithms

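Randomized QuickSort: Applying the Union theorem

Observation: If $X > 8 n \log_{4/3} n$, there must be at least one $i$ such that $X_i > 8 \log_{4/3} n$.

Event $E$: $X > 8 n \log_{4/3} n$
Event $E_i$: $X_i > 8 \log_{4/3} n$
Question: What is the relation between $E$ and $E_i$?
Answer: $E \subseteq \bigcup_{i=1}^{n} E_i$, so $P(E) \le \sum_{i=1}^{n} P(E_i)$
Observation: In order to show $P(E) < n^{-7}$, it suffices to show

$P(X_i > 8 \log_{4/3} n) < n^{-8}$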

Page 22: Lecture 7-cs648 Randomized Algorithms

AIM: TO SHOW $P(X_i > 8 \log_{4/3} n) < n^{-8}$

Page 23: Lecture 7-cs648 Randomized Algorithms

Randomized Quick Sort

Definition: A recursive call is good if the pivot is selected from the middle half, and bad otherwise.

P(a recursive call is good) = 1/2, since half of the elements lie in the middle half.
Notation: The size of a recursive call is the size of the subarray it sorts.

[Figure: subarray in increasing order of values, with the middle half marked]

Page 24: Lecture 7-cs648 Randomized Algorithms

Randomized Quick Sort

Observation: If a recursive call is good, the size of each of its child recursive calls reduces by a factor of at least $4/3$; that is, each child has at most $3/4$ of the elements.

Page 25: Lecture 7-cs648 Randomized Algorithms

Randomized Quick Sort

Question: What is the maximum number of good recursive calls that $e_i$ can have?
Answer: $\log_{4/3} n$.
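The answer follows from a short calculation, sketched here under the assumption that a good call shrinks each child to at most $3/4$ of its size and that a bad call never increases the size. The symbol $n_k$ is introduced only for this sketch.

```latex
% n_k: size of the recursive call containing e_i after it has been through k good calls (n_0 = n)
n_k \;\le\; \left(\tfrac{3}{4}\right)^{k} n
\quad\text{and}\quad n_k \ge 1
\;\;\Longrightarrow\;\;
\left(\tfrac{4}{3}\right)^{k} \le n
\;\;\Longrightarrow\;\;
k \le \log_{4/3} n .
```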

Page 26: Lecture 7-cs648 Randomized Algorithms

Randomized Quick Sort: Summary from the perspective of $e_i$

During Randomized Quick Sort, element $e_i$
• participates in a sequence of recursive calls, each of which is good independently with probability $1/2$;
• leaves the algorithm on or before participating in $\log_{4/3} n$ good recursive calls.

$E_i$ can be re-stated as: $e_i$ participated in more than $8 \log_{4/3} n$ recursive calls, but fewer than $\log_{4/3} n$ of them turned out to be good.

$P(E_i) < (3/4)^{8 \log_{4/3} n} = n^{-8}$

This uses the fact that the probability of getting fewer than $t$ HEADS during $8t$ tosses of a fair coin is less than $(3/4)^{8t}$, with $t = \log_{4/3} n$.

Page 27: Lecture 7-cs648 Randomized Algorithms

Randomized Quick Sort: Final result

Theorem: Let $X$ be the random variable for the number of comparisons during Randomized Quick Sort on an input of size $n$. Then

$P(X > 8 n \log_{4/3} n) < n^{-7}$

Homework: Rework the calculation to find the smallest possible constant $c$ such that $P(X > c\,n \log_{4/3} n) < n^{-7}$
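To get a feel for how far the threshold in the theorem sits above the average, the sketch below compares it with the exact expected number of comparisons, $2(n+1)H_n - 4n$, a standard formula for randomized quick sort on distinct keys. The threshold $8 n \log_{4/3} n$ is the reading assumed here, and the function names are hypothetical.

```python
from fractions import Fraction
from math import log

def expected_comparisons(n):
    """Exact E[X] = 2(n+1)H_n - 4n for randomized quick sort on n distinct keys."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n

def threshold(n):
    """The value 8 * n * log_{4/3}(n) (assumed reading of the theorem)."""
    return 8 * n * log(n, 4 / 3)

for n in (10, 100, 1000):
    mean = float(expected_comparisons(n))
    print(n, round(mean, 1), round(threshold(n), 1), round(threshold(n) / mean, 1))
```

The large ratio in the last column suggests that the constant 8 is far from tight, which is the point of the homework.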

Page 28: Lecture 7-cs648 Randomized Algorithms

SOME WELL KNOWN AND WELL STUDIED RANDOM VARIABLES

Page 29: Lecture 7-cs648 Randomized Algorithms

Bernoulli Random Variable

Bernoulli Random Variable: A random variable X is said to be a Bernoulli random variable with parameter $p$ if it takes value 1 with probability $p$ and value 0 with probability $1 - p$. The corresponding random experiment is usually called a Bernoulli trial.

Example: Tossing a coin (with HEADS probability $p$) once; HEADS corresponds to 1 and TAILS corresponds to 0.

$E[X] = p$

Page 30: Lecture 7-cs648 Randomized Algorithms

Binomial Random Variable

Binomial Random Variable: Let $X_1, \ldots, X_n$ be $n$ independent Bernoulli random variables with parameter $p$. Then the random variable $X = \sum_{i=1}^{n} X_i$ is said to be a Binomial random variable with parameters $n$ and $p$.

Example: Number of HEADS when we toss a coin (with HEADS probability $p$) $n$ times.

Homework: Prove, without any knowledge of binomial coefficients, that $E[X] = np$.
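The definitions of the Bernoulli and Binomial random variables translate directly into samplers. The following is a minimal sketch; the function names, the seed and the parameters 20 and 0.3 are assumptions for illustration.

```python
import random

def bernoulli(p, rng):
    """One Bernoulli trial: 1 with probability p, otherwise 0."""
    return 1 if rng.random() < p else 0

def binomial(n, p, rng):
    """Sum of n independent Bernoulli(p) trials."""
    return sum(bernoulli(p, rng) for _ in range(n))

rng = random.Random(648)  # fixed seed, chosen arbitrarily
samples = [binomial(20, 0.3, rng) for _ in range(10_000)]
print(sum(samples) / len(samples))  # sample mean; compare with n*p = 6
```

Building the Binomial sampler as a sum of Bernoulli trials mirrors the definition, and it is the same decomposition the homework asks for when computing $E[X]$ by linearity of expectation.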

Page 31: Lecture 7-cs648 Randomized Algorithms

Geometric Random Variable

Geometric Random Variable: Consider an infinite sequence of independent and identical Bernoulli trials with parameter $p$. Let X denote the number of these trials up to and including the trial which gives the first 1. Then X is called a Geometric random variable with parameter $p$.

Example: Number of tosses of a coin (with HEADS probability $p$) needed to get the first HEADS.

Homework:
• Find the probability $P(X = k)$.
• Prove that $E[X] = 1/p$.

Page 32: Lecture 7-cs648 Randomized Algorithms

Negative Binomial Random Variable

Negative Binomial Random Variable: Let $X_1, \ldots, X_n$ be $n$ independent Geometric random variables with parameter $p$. Then the random variable $X = \sum_{i=1}^{n} X_i$ is said to be a negative-Binomial random variable with parameters $n$ and $p$.

Example: Number of tosses of a coin (with HEADS probability $p$) needed to get $n$ HEADS.

Homework:
• Guess why it is called a "negative" Binomial random variable.
• Find the probability $P(X = k)$.
• Prove, without any knowledge of binomial coefficients, that $E[X] = n/p$.
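The Geometric and negative-Binomial random variables can be sampled in the same spirit, by repeating Bernoulli trials until enough 1s appear. This is a sketch under the assumption $p > 0$; the function names, seed and parameters are hypothetical.

```python
import random

def geometric(p, rng):
    """Number of Bernoulli(p) trials up to and including the first 1 (requires p > 0)."""
    trials = 1
    while rng.random() >= p:   # this trial gave 0, so toss again
        trials += 1
    return trials

def negative_binomial(n, p, rng):
    """Sum of n independent Geometric(p) variables: tosses needed to see n HEADS."""
    return sum(geometric(p, rng) for _ in range(n))

rng = random.Random(648)  # fixed seed, chosen arbitrarily
samples = [negative_binomial(5, 0.25, rng) for _ in range(10_000)]
print(sum(samples) / len(samples))  # sample mean; compare with n/p = 20
```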