Testing Collections of Properties Reut Levi Dana Ron Ronitt Rubinfeld ICS 2011.

Page 1:

Testing Collections of Properties

Reut Levi, Dana Ron, Ronitt Rubinfeld

ICS 2011

Page 2:

Shopping distribution

What properties do your distributions have?

Page 3:

Testing closeness of two distributions:

Transactions in California vs. transactions in New York

Trend change?

Page 4:

Testing independence:

Shopping patterns: independent of zip code?

Page 5:

This work: Many distributions

Page 6:

One distribution:

D is an arbitrary black-box distribution over [n] that generates i.i.d. samples.

Sample complexity in terms of n? (can it be sublinear?)

[Diagram: D → samples → Test → Pass/Fail?]

Page 7:

Some answers…

Uniformity: Θ(n^{1/2}) [Goldreich, Ron 00], [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], [Paninski 08]

Identity: Θ̃(n^{1/2}) [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01]

Closeness: Θ̃(n^{2/3}) [Batu, Fortnow, Rubinfeld, Smith, White], [Valiant 08]

Independence: O(n_1^{2/3} n_2^{1/3}), Ω(n_1^{2/3} n_2^{1/3}) [Batu, Fortnow, Fischer, Kumar, Rubinfeld, White 01], this work

Entropy: n^{1/β^2 + o(1)} [Batu, Dasgupta, Kumar, Rubinfeld 05], [Valiant 08]

Support size: Θ(n / log n) [Raskhodnikova, Ron, Shpilka, Smith 09], [Valiant, Valiant 10]

Monotonicity on total order: Θ̃(n^{1/2}) [Batu, Kumar, Rubinfeld 04]

Monotonicity on poset: n^{1-o(1)} [Bhattacharyya, Fischer, Rubinfeld, Valiant 10]
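To make the sublinear flavor of the uniformity entry concrete, here is a minimal Python sketch of a collision-based check, in the spirit of collision-counting uniformity testers. It is an illustration under assumptions, not the algorithm from any of the cited papers: the names `collision_rate` and `looks_uniform`, the threshold (1 + eps^2 / 2) / n, and the sample count are all hypothetical choices made for the demo. The idea is that the fraction of colliding sample pairs estimates the sum over x of D(x)^2, which equals 1/n exactly when D is uniform and is larger otherwise.

```python
import random
from collections import Counter

def collision_rate(samples):
    """Fraction of unordered sample pairs that collide (estimates sum_x D(x)^2)."""
    s = len(samples)
    counts = Counter(samples)
    collisions = sum(c * (c - 1) // 2 for c in counts.values())
    return collisions / (s * (s - 1) / 2)

def looks_uniform(draw, n, eps, num_samples):
    """Accept if the collision rate is close to 1/n, its value under uniform.

    draw: zero-argument function returning one i.i.d. sample from D over range(n).
    The threshold (1 + eps**2 / 2) / n is an illustrative assumption.
    """
    samples = [draw() for _ in range(num_samples)]
    return collision_rate(samples) <= (1 + eps ** 2 / 2) / n

rng = random.Random(0)
n = 100
uniform = lambda: rng.randrange(n)       # uniform over range(n)
half = lambda: rng.randrange(n // 2)     # uniform over half the domain (far from uniform)

print(looks_uniform(uniform, n, 0.5, 2000))
print(looks_uniform(half, n, 0.5, 2000))
```

The sample count of 2000 is deliberately generous so that the demo is stable; the point is only that the statistic depends on collisions among samples rather than on seeing every domain element.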

Page 8:

Collection of distributions:

Two models:

Sampling model: Get (i, x) for random i, x ~ D_i

Query model: Get (i, x) for query i and x ~ D_i

Sample complexity in terms of n, m?

[Diagram: D_1, D_2, …, D_m → samples → Test → Pass/Fail?]

Further refinement: Known or unknown distribution on i’s?
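The two access models can be sketched as a small Python oracle. This is only an illustration: the class name `Collection`, its method names, and the explicit probability lists are assumptions for the demo (a real tester sees the distributions only through samples). In the sampling model the oracle chooses i according to a weight distribution on the indices; in the query model the tester chooses i.

```python
import random

class Collection:
    """m distributions over range(n), each given as a list of n probabilities."""

    def __init__(self, dists, weights=None, seed=0):
        self.dists = dists
        self.m = len(dists)
        self.n = len(dists[0])
        # Distribution on the indices i; uniform if none is given (an assumption).
        self.weights = weights or [1 / self.m] * self.m
        self.rng = random.Random(seed)

    def _draw_from(self, i):
        """Draw x ~ D_i."""
        return self.rng.choices(range(self.n), weights=self.dists[i])[0]

    def sample(self):
        """Sampling model: the oracle draws a random i, then x ~ D_i."""
        i = self.rng.choices(range(self.m), weights=self.weights)[0]
        return i, self._draw_from(i)

    def query(self, i):
        """Query model: the tester picks i, then x ~ D_i."""
        return i, self._draw_from(i)

# Two point-mass distributions, so x always reveals which D_i produced it.
c = Collection([[1.0, 0.0], [0.0, 1.0]])
print(c.sample(), c.query(1))
```

Whether the tester is told `weights` or has to cope without them corresponds to the known versus unknown weights refinement.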

Page 9:

Properties considered:

Equivalence: All distributions are equal

"Clusterability": Distributions can be clustered into k clusters such that within a cluster, all distributions are close
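As a concrete reading of the clusterability definition, the following Python sketch verifies a proposed clustering of explicitly given distributions. It is a definition check, not a tester: the function names, the use of L1 distance as the notion of "close", and the closeness parameter `beta` are assumptions introduced for illustration.

```python
def l1_distance(p, q):
    """L1 distance between two distributions given as probability lists."""
    return sum(abs(a - b) for a, b in zip(p, q))

def is_valid_clustering(dists, clusters, k, beta):
    """Check that clusters partitions the indices of dists into at most k
    groups, and that every pair inside a group is within L1 distance beta."""
    indices = sorted(i for c in clusters for i in c)
    if len(clusters) > k or indices != list(range(len(dists))):
        return False
    return all(l1_distance(dists[i], dists[j]) <= beta
               for c in clusters for i in c for j in c)

dists = [[0.5, 0.5], [0.5, 0.5], [1.0, 0.0]]
print(is_valid_clustering(dists, [[0, 1], [2]], k=2, beta=0.0))
print(is_valid_clustering(dists, [[0, 1, 2]], k=1, beta=0.5))
```

With `beta = 0` and `k = 1` the check reduces to the Equivalence property.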

Page 10:

Equivalence vs. independence

Process of drawing pairs: Draw i ∈ [m], x ~ D_i; output (i, x)

Easy fact: i and x are independent iff the D_i's are all equal
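The easy fact can be checked numerically on small examples. The Python sketch below builds the joint distribution of (i, x) and compares it with the product of its marginals; the function names and the example numbers are illustrative, and it assumes every index i has positive weight (an index of weight zero never appears, so its D_i cannot affect independence).

```python
def joint(weights, dists):
    """Joint pmf P[i][x] = w_i * D_i(x) of the pair (i, x)."""
    return [[w * p for p in d] for w, d in zip(weights, dists)]

def is_independent(P, tol=1e-12):
    """Check P[i][x] == P_I(i) * P_X(x) for every cell, up to tol."""
    row = [sum(r) for r in P]          # marginal of i
    col = [sum(c) for c in zip(*P)]    # marginal of x
    return all(abs(P[i][x] - row[i] * col[x]) <= tol
               for i in range(len(P)) for x in range(len(col)))

w = [0.25, 0.75]
equal = [[0.5, 0.3, 0.2], [0.5, 0.3, 0.2]]
unequal = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]

print(is_independent(joint(w, equal)))
print(is_independent(joint(w, unequal)))
```

When all D_i equal a common D, the marginal of x is D itself, so each cell is w_i * D(x); conversely, independence forces D_i(x) = P[i][x] / w_i to equal the marginal of x for every i.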

Page 11:

Results

Def: (D_1, …, D_m) has the Equivalence property if D_i = D_{i'} for all 1 ≤ i, i' ≤ m.

Regime   Lower Bound           Upper Bound
n > m    Ω(n^{2/3} m^{1/3})    Õ(n^{2/3} m^{1/3})  (Unknown Weights)
m > n    Ω(n^{1/2} m^{1/2})    Õ(n^{1/2} m^{1/2})  (Known Weights)

Also yields “tight” lower bound for independence testing
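For a sense of scale, the Python sketch below evaluates the growth rates from the results table for sample parameter values. The function name is hypothetical, and the output is only an order of magnitude: constants, polylogarithmic factors, and the dependence on the distance parameter are all ignored.

```python
def equivalence_sample_scale(n, m):
    """Order of growth from the results table, ignoring constants,
    polylog factors, and the distance parameter."""
    if n > m:
        return n ** (2 / 3) * m ** (1 / 3)
    return n ** 0.5 * m ** 0.5

n, m = 10**6, 10**3
scale = equivalence_sample_scale(n, m)   # roughly 1e5
print(scale, n * m)
```

With n = 10^6 and m = 10^3 the scale is about 10^5, compared with n * m = 10^9 entries needed to write down all m distributions explicitly.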

Page 12:

Clusterability

Can we cluster the distributions such that, in each cluster, the distributions are (very) close?

Sample complexity of the test is O(k n^{2/3}), for n = domain size, k = number of clusters

No dependence on the number of distributions

Closeness requirement is very stringent

Page 13:

Open Questions

• Clusterability in the sampling model, with a less stringent notion of "close"

• Other properties of collections?

• E.g., all distributions are shifts of each other?

Page 14:

Thank you