Posted: 22-Dec-2015

Tracing a Single User

Joint work with Noga Alon

Group Testing

Dorfman raised the following problem in 1941:

All American inductees gave blood samples, which were tested for the presence of a syphilitic antigen.

We assume that the number of infected blood samples r is much smaller than the total number m.

Testing each sample separately requires m tests.

Group Testing (cont.)

Instead, one can test pools that contain blood from a set of samples.

If the outcome is negative – none of the samples in the pool is infected.

Otherwise, the pool contains at least one infected sample, which can be determined by further tests.

This way, fewer than m tests are needed.
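As an illustration of why pooling saves tests, here is a minimal sketch of a two-stage scheme: test each pool once, then retest every member of a positive pool. The slides only say "further tests", so the retest-everyone rule, the function name `two_stage_pooling` and the pool size are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def two_stage_pooling(samples: Sequence[bool], pool_size: int) -> Tuple[List[int], int]:
    """Return (indices of infected samples, number of tests used).

    Assumed scheme: one test per pool, then one test per member of
    every pool that came back positive.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be positive")
    infected: List[int] = []
    tests = 0
    for start in range(0, len(samples), pool_size):
        pool = range(start, min(start + pool_size, len(samples)))
        tests += 1  # one test on the pooled blood
        if any(samples[i] for i in pool):
            for i in pool:  # positive pool: retest each member
                tests += 1
                if samples[i]:
                    infected.append(i)
    return infected, tests


# m = 100 samples, r = 2 of them infected (made-up data).
samples = [False] * 100
samples[17] = samples[63] = True
found, used = two_stage_pooling(samples, pool_size=10)
print(found, used)  # [17, 63] 30
```

With these made-up numbers, 10 pool tests plus 20 retests replace 100 individual tests. Note that this scheme is adaptive: the second stage depends on the outcomes of the first.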

Molecular Biology

In recent years this problem has gained popularity again in the field of molecular biology.

For example, given a large set of DNA sequences, we look for all those that contain a specific short subsequence.

We can use a method similar to that of the blood testing problem.

Molecular Biology (cont.)

In some applications, we are interested in finding one sequence that contains the short subsequence, rather than all of them.

Parallelization

Often, we would prefer to conduct all experiments simultaneously, even at the cost of increasing the number of experiments.

Thus, we need our tests to be non-adaptive, i.e. the pool tested in each experiment is independent of the outcomes of other experiments.

Non-Adaptive Tests

      a1  a2  .   .   .   .   am    outcome
T1    0   1   1   1   1   0   0     1
T2    1   1   1   1   0   0   1     1
.     0   1   0   0   1   1   0     1
.     1   0   1   0   0   0   0     0
.     1   0   0   1   1   0   1     1
Tn    0   0   1   0   0   1   1     0
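The following sketch shows how the outcome column of a non-adaptive test matrix arises: test Ti is positive exactly when its row contains a 1 in the column of some positive sample. The matrix is the one from the slide. Which samples are positive is an assumption, since the slide's highlighting did not survive; a2 and a4 are one choice that reproduces the outcome column shown.

```python
from typing import List, Sequence

# Rows are the tests T1..T6, columns are the samples a1..a7 (from the slide).
TESTS: List[List[int]] = [
    [0, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
]


def outcomes(tests: Sequence[Sequence[int]], positives: Sequence[int]) -> List[int]:
    """Outcome of each test: 1 if its pool contains a positive sample."""
    return [int(any(row[j] for j in positives)) for row in tests]


# Assumed positives: a2 and a4, i.e. 0-based columns 1 and 3.
print(outcomes(TESTS, [1, 3]))  # [1, 1, 1, 0, 1, 0]
```

With this particular matrix, the pair a2, a5 produces the same outcome vector, so the outcomes alone do not always determine the full set of positives.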

r-SUT Definition

Definition: Let F be a family of subsets of [n] = {1,…,n}. F is called r-single-user-tracing superimposed (r-SUT) if for all F_1,…,F_k ⊆ F with |F_i| ≤ r,

\[ \bigcup_{A \in F_1} A = \cdots = \bigcup_{A \in F_k} A \;\Longrightarrow\; \bigcap_{i=1}^{k} F_i \neq \emptyset. \]

In other words, given the union of up to r sets from F, one can identify at least one of those sets.
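For very small families the definition can be checked by brute force. The sketch below groups all nonempty subfamilies of at most r sets by their union and verifies that the subfamilies sharing a union have a common member. The function name `is_r_sut` is hypothetical, the subfamilies are assumed nonempty and the sets distinct, and the running time is exponential, so this is only a way to experiment with the definition.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Sequence, Set


def is_r_sut(family: Sequence[Set[int]], r: int) -> bool:
    """Brute-force check of the r-SUT property (tiny inputs only)."""
    # For each achievable union, keep the indices common to every
    # nonempty subfamily of size <= r that produces this union.
    common: Dict[FrozenSet[int], Set[int]] = {}
    for size in range(1, r + 1):
        for idx in combinations(range(len(family)), size):
            union = frozenset().union(*(family[i] for i in idx))
            if union in common:
                common[union] &= set(idx)
            else:
                common[union] = set(idx)
    # r-SUT iff no union is left with an empty set of common members.
    return all(common.values())


print(is_r_sut([{1}, {2}, {3}], 2))     # True
print(is_r_sut([{1}, {2}, {1, 2}], 2))  # False: {1} ∪ {2} = {1, 2}
```

Checking the intersection over all subfamilies with a given union suffices, because any smaller collection of such subfamilies has an intersection at least as large.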

Communication

Suppose that m users share a common channel.

Each user is associated with a vector in {0,1}^n.

All active users transmit their vectors, and a single receiver gets the OR of all transmitted vectors.

Given that at most r users are active simultaneously, we would like the receiver to be able to identify at least one of them.
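A minimal sketch of what the receiver could do, under assumptions: it searches all sets of at most r users whose OR equals the received word and reports the users common to all of them. The three user vectors and the names `received_word` and `surely_active` are made up for this example, and the exhaustive search is only meant to make the requirement concrete, not to be efficient.

```python
from itertools import combinations
from typing import Optional, Sequence, Set, Tuple

Vector = Tuple[int, ...]


def received_word(vectors: Sequence[Vector], active: Sequence[int]) -> Vector:
    """Bitwise OR of the vectors of the active users."""
    n = len(vectors[0])
    return tuple(int(any(vectors[u][j] for u in active)) for j in range(n))


def surely_active(vectors: Sequence[Vector], word: Vector, r: int) -> Set[int]:
    """Users contained in every set of at most r users that explains `word`."""
    common: Optional[Set[int]] = None
    for size in range(1, r + 1):
        for users in combinations(range(len(vectors)), size):
            if received_word(vectors, users) == word:
                common = set(users) if common is None else common & set(users)
    return common if common is not None else set()


# Hypothetical vectors for m = 3 users with n = 4.
vectors = [(1, 1, 0, 0), (0, 1, 0, 0), (0, 0, 1, 1)]
word = received_word(vectors, [0, 1])  # users 0 and 1 transmit
print(surely_active(vectors, word, r=2))  # {0}
```

Here the receiver cannot tell whether user 1 is active, since user 0 alone produces the same word, but it can be certain about user 0. Identifying at least one active user is exactly what is asked for.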

Maximal r-SUT Families

Let g(n,r) denote the maximum size of an r-SUT family of subsets of [n].

Let R_g(r) = lim sup_{n→∞} log g(n,r) / n.

Csűrös and Ruszinkó: There exist constants c_1, c_2 > 0 s.t.

\[ \frac{c_1}{r^2} \le R_g(r) \le \frac{c_2}{r}. \]

Our result: R_g(r) = Ω(1/r) (and hence Θ(1/r)).

Lower Bound

Let m = 2^{n/(20r)}.

We construct a family F = {F_1,…,F_m} of subsets of [n] at random as follows:

For all 1 ≤ i ≤ m and 1 ≤ j ≤ n, independently, put j in F_i with probability 1/r.
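The random construction can be sketched in a few lines. The function name `random_family`, the fixed seed, the sample values n = 200 and r = 2, and rounding m down to an integer are all assumptions made for this example.

```python
import random
from typing import List, Set


def random_family(n: int, r: int, m: int, seed: int = 0) -> List[Set[int]]:
    """Draw m random subsets of {1,...,n}.

    Each element j is put in each set independently with probability 1/r.
    """
    rng = random.Random(seed)
    return [
        {j for j in range(1, n + 1) if rng.random() < 1.0 / r}
        for _ in range(m)
    ]


n, r = 200, 2
m = int(2 ** (n / (20 * r)))  # 2^(200/40) = 2^5 = 32
family = random_family(n, r, m)
print(m, sum(len(s) for s in family) / m)  # 32 and an average size near n/r
```

Each set has expected size n/r, and the family has size 2^{n/(20r)}, which corresponds to a rate of 1/(20r) when the logarithm is taken to base 2.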

Lower Bound (cont.)

We show that F is r-SUT with positive probability.

We say a configuration of F_1,…,F_k ⊆ F with |F_i| ≤ r and $\bigcap_{i=1}^{k} F_i = \emptyset$ is bad if all the unions $\bigcup_{A \in F_i} A$ are equal.

We show that with positive probability there are no bad configurations.

Lower Bound (cont.)

We show that with probability > ½ no small configuration is bad, and that with probability > ½ no large configuration is bad.

Therefore, with positive probability there is no bad configuration.

Small Configurations

Proposition: With probability > ½ the following holds: for all s < 2r and distinct A_1,…,A_s ∈ F, there exists j ∈ [n] that belongs to exactly one of the sets A_1,…,A_s.

Corollary: With probability > ½ no small configuration, i.e. one with $\left|\bigcup_{i=1}^{k} F_i\right| < 2r$, is bad.
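The corollary follows from the proposition in one step. Let A_1,…,A_s be the sets appearing in a small configuration, and let j belong only to A_t. Since the subfamilies have empty intersection, some F_i omits A_t while another contains it, so their unions differ at j and the configuration is not bad. The sketch below checks the property from the proposition for a given list of sets; the function name `private_element` is hypothetical.

```python
from typing import Dict, Optional, Sequence, Set


def private_element(sets: Sequence[Set[int]]) -> Optional[int]:
    """Return the smallest element lying in exactly one of the sets, or None."""
    counts: Dict[int, int] = {}
    for s in sets:
        for j in s:
            counts[j] = counts.get(j, 0) + 1
    singles = [j for j, c in counts.items() if c == 1]
    return min(singles) if singles else None


print(private_element([{1, 2}, {2, 3}]))          # 1
print(private_element([{1, 2}, {2, 3}, {1, 3}]))  # None
```

The second call shows the situation the proposition rules out: every element is covered twice, and indeed any two of these three sets have the same union.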

Small Configurations (cont.)

[Figure: diagram of the sets A1, …, A9.]

Large Configurations

Proposition: With probability > ½ the following holds.

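For all distinct A_1,…,A_r, B_1,…,B_r ∈ F,

\[ \bigcup_{i=1}^{r} B_i \not\subseteq \bigcup_{i=1}^{r} A_i. \]

Corollary: With probability > ½ no large configuration, i.e. one with $\left|\bigcup_{i=1}^{k} F_i\right| \ge 2r$, is bad.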
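One way to see why a statement of this kind holds for the random family is a union bound. This is a sketch under the assumptions m = 2^{n/(20r)} and r ≥ 2, and it is not necessarily the computation used in the talk. Fix distinct A_1,…,A_r, B_1,…,B_r and an element j, and write x = (1 - 1/r)^r, so that 1/4 ≤ x < 1/e.

```latex
\[
  \Pr\Bigl[\, j \in \bigcup_{i=1}^{r} B_i \setminus \bigcup_{i=1}^{r} A_i \Bigr]
  = (1 - x)\,x \;\ge\; \tfrac{1}{4}\cdot\tfrac{3}{4} = \tfrac{3}{16},
\]
% x(1-x) is increasing on [0, 1/2], so its minimum over [1/4, 1/e) is at x = 1/4.
\[
  \Pr\Bigl[\, \bigcup_{i=1}^{r} B_i \subseteq \bigcup_{i=1}^{r} A_i \Bigr]
  \le \Bigl(\tfrac{13}{16}\Bigr)^{n}
  \quad\text{(the elements } j \text{ are independent),}
\]
\[
  \Pr[\text{some choice of } A_1,\dots,A_r,B_1,\dots,B_r \text{ fails}]
  \le m^{2r}\Bigl(\tfrac{13}{16}\Bigr)^{n}
  = \Bigl(2^{1/10}\cdot\tfrac{13}{16}\Bigr)^{n}
  < 0.88^{\,n},
\]
```

which is below ½ for all sufficiently large n.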

Large Configurations (cont.)

[Figure: diagram with the labels A1, A2, A3, B1, B2, B3 and Ai.]

Tracing Multiple Users

Recently, Laczay and Ruszinkó have introduced the following generalization of r-SUT families.

For integers n, r ≥ 2, and 1 ≤ k ≤ r, a family F of subsets of [n] is called k-out-of-r multiple-user-tracing superimposed (MUT_k(r)) if, given the union of any ℓ ≤ r sets from F, one can identify at least min(k,ℓ) of them.

Tracing Multiple Users (cont.)

Let h(n,r,k) denote the maximum size of a MUT_k(r) family of subsets of [n].

Let R_h(r,k) = lim sup_{n→∞} log h(n,r,k) / n.

We have shown that there are constants c_1, c_2, c_3, c_4 > 0 s.t.

\[ \min\left(\frac{c_1}{r}, \frac{c_2}{k^2}\right) \le R_h(r,k) \le \min\left(\frac{c_3}{r}, \frac{c_4 \log k}{k^2}\right). \]
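A short worked consequence of these bounds, assuming they have the form min(c_1/r, c_2/k^2) ≤ R_h(r,k) ≤ min(c_3/r, c_4 log k / k^2): when k ≤ √r the two sides match up to constants.

```latex
\[
  k \le \sqrt{r}
  \;\Longrightarrow\;
  \frac{c_2}{k^2} \ge \frac{c_2}{r}
  \;\Longrightarrow\;
  \frac{\min(c_1, c_2)}{r}
  \;\le\; \min\!\left(\frac{c_1}{r}, \frac{c_2}{k^2}\right)
  \;\le\; R_h(r,k)
  \;\le\; \frac{c_3}{r},
\]
\[
  \text{so } R_h(r,k) = \Theta(1/r) \text{ for all } k \le \sqrt{r}.
\]
```

In this range, tracing k users costs only a constant factor in the rate compared with tracing a single user.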

Open Problems

We have shown that R_g(r) = Θ(1/r), but the question of finding the exact constant is still open.

This problem is open even for the case of r = 2.

1/3 ≤ R_g(2) ≤ 1/2 + o(1). The lower bound follows by a careful analysis of the random construction; the upper bound follows from a result of Coppersmith and Shearer.

Open Problems (cont.)

We show how to construct an r-SUT family in time m^{O(r)}, where m is the size of the family. It would be interesting to find explicit constructions for all r.

There are other related problems for which there are still gaps between lower and upper bounds: multiple-user tracing families, r-superimposed families, disjointly r-superimposed families, and graph identifying codes.