Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small...

22
Large Deviations vs. Small Deviations 1

Transcript of Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small...

Page 1: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Large Deviations vs.

Small Deviations

1

Page 2: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Sum of IID binary random variable

Sn =

n∑

i=1

Xi

Xi =

{

1 with prob. p

0 with prob. 1 − p

2

Page 3: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Small deviations vs. large deviations

3

Page 4: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Small deviations vs. large deviations

4

Page 5: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Large and small deviations are incompatible

Scale Prob.limit

Small Deviation analysis Sm/√m const

Large deviation analysis Sm/m e−cm

5

Page 6: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Hoeffding / Chernoff / Sanov’s bound

6

Page 7: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

3-element distributionsand the 2D simplex

7

Page 8: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Sanov’s theorem

Distributionswithin KL-divergence < a

from P

xP

(1,0,0)

(0,1,0)

(0,0,1)

8

Page 9: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

On the virtues of the cumulative distribution

function

9

Page 10: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

matlab code

10

Page 11: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

CDFs and sorting

• To Create a CDF you need to sort

• Can knowing the CDF help you sort?

11

Page 12: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

No. of comparisons needed for sorting

• There are n! permutations

• Identifying the correct permutation requires log(n!) comparisons.

• log(n!) ~ n log n

• Merge-sort achieves time O(n log n)

12

Page 13: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Efficient sorting for uniform distribution

• n Elements drawn IID from uniform distribution over [0,1]

• A = array[1:n] of lists

• given x insert it into list at A[floor(x n)]

• Lists would rarely have > 1 element

• time: O(n)

13

Page 14: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Efficient sorting using the CDF

x

A1

n

14

Page 15: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Efficient sorting for an IID source

• Source is IID but distribution is unknown

• Sort first m instances to estimate CDF

• Use CDF to sort rest of data.

• How large should m be?

15

Page 16: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Stability of empirical CDF

250 instances 1000 instances

16

Page 17: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

The difference between two empirical CDFs

% input: samples d1,d2n = length(d1) % assuming length(d1)==length(d2)vals = [d1; d2];labels = [repmat(1,n,1); repmat(-1,n,1)];

[s,i] = sort(vals); % sort labels by s_labels = labels(i); % increasing vals

d = cumsum(s_labels);plot(0:length(d),[0;d]);

17

Page 18: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Typical cumulative difference

18

Page 19: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Typical cumulative difference

19

Page 20: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Maximal divergence of a returning random walk

20

Page 21: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

The Glivenko-Cantelli Theorem

= empirical CDF using n random instancesFn(x)= CDFF (x)

! = error tolerance

Pxn

1

!

maxx

|Fn(x) ! F (x)| " ε

"

# ae−bn!

2

Elegant proof.Devroye, Gyorfi, Lugosi / A probabilistic Theory of Pattern Recognition, page 192Best Constants: a=b=2, Complex proof. MASSART, P. (1990). The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality. Ann. of Probability 18, 1269-1293.

xn

1 = !x1, x2, . . . , xn" = sample

21

Page 22: Large Deviations vs. Small Deviations · Small deviations vs. large deviations 4. Large and small deviations are incompatible Scale Prob. limit Small Deviation analysis Sm/√m const

Empirical test of the Glivenko-Cantelli Theorem

250 instances 1000 instances

22