
Recovery thresholds for ℓ1 optimization in binary compressed sensing

Mihailo Stojnic
School of Industrial Engineering

Purdue University

West Lafayette IN 47907, USA

Email: [email protected]

Abstract—Recently, [3], [7] theoretically analyzed the success of a polynomial ℓ1 optimization algorithm in solving an under-determined system of linear equations. In a large dimensional and statistical context, [3], [7] proved that if the number of equations (measurements in the compressed sensing terminology) in the system is proportional to the length of the unknown vector then there is a sparsity (number of non-zero elements of the unknown vector), also proportional to the length of the unknown vector, such that ℓ1 optimization succeeds in solving the system. In this paper, we consider the same problem while additionally assuming that all non-zero elements are equal to each other. We provide a performance analysis of a slightly modified ℓ1 optimization. As expected, the obtained recoverable sparsity proportionality constants improve on the equivalent ones that can be obtained if no information about the non-zero elements is available. In addition, we conducted a sequence of numerical experiments and observed that the obtained theoretical proportionality constants are in solid agreement with the ones obtained experimentally.

Index Terms: compressed sensing, ℓ1 optimization

I. INTRODUCTION

In the last several years the area of compressed sensing has been the subject of extensive research. One of the most crucial compressed sensing problems is finding the sparsest solution of an under-determined system of equations. While this problem had been known for a long time, it is the work of [3] and [7] that rigorously proved for the first time that a sparse enough solution can be recovered by solving a linear program in polynomial time. These results generated an enormous amount of research, with possible applications ranging from high-dimensional geometry, image reconstruction, single-pixel camera design, decoding of linear codes, channel estimation in wireless communications, to machine learning, data-streaming algorithms, DNA micro-arrays, magneto-encephalography, etc. (more on the compressed sensing problems, their importance, and the wide spectrum of different applications can be found in the excellent references [1], [5], [13], [21]–[23], [31]).

As is well known, solving an under-determined system of linear equations assumes finding x such that

\[
Ax = y \qquad (1)
\]

where (in the compressed sensing terminology) A is an m × n (m < n) measurement matrix and y is an m × 1 measurement vector. The standard compressed sensing context assumes that x is an n × 1 unknown k-sparse vector (see Figure 1; here and in the rest of the paper, under a k-sparse vector we assume a vector that has at most k nonzero components). The main topic of this paper will be compressed sensing of the so-called ideally sparse signals (more on the so-called approximately sparse signals can be found in e.g. [4], [33]). Also, in the rest of the paper we will assume the so-called linear regime, i.e. we will assume that k = βn and that the number of measurements is m = αn, where α and β are absolute constants independent of n.

Fig. 1. Model of a linear system; vector x is k-sparse.

The following ℓ1 optimization approach to solving (1) attracted a great deal of attention in recent years:

\[
\min \ \|x\|_1 \quad \text{subject to} \quad Ax = y. \qquad (2)
\]
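Problem (2) is what makes the approach polynomial: it can be rewritten as a linear program by splitting x into its positive and negative parts. The following is a minimal sketch of that reformulation (our illustration only, assuming scipy is available; the paper does not prescribe a particular solver):

```python
import numpy as np
from scipy.optimize import linprog

def l1_minimize(A, y):
    """Solve (2) by writing x = x_plus - x_minus with x_plus, x_minus >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)            # at the optimum, sum(x_plus) + sum(x_minus) = ||x||_1
    A_eq = np.hstack([A, -A])     # A (x_plus - x_minus) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n), method="highs")
    return res.x[:n] - res.x[n:]
```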

Quite remarkably, for certain statistical matrices A, the authors of [3], [7] were able to show that if α and n are given, then any unknown vector x with no more than k = βn (where β is an absolute constant dependent on α and explicitly calculated in [3], [7]) non-zero elements can be recovered, with overwhelming probability, by solving (2). (Under overwhelming probability we in this paper assume a probability that is no more than a number exponentially decaying in n away from 1.) As expected, this assumes that y was in fact generated by that x and given to us. (More on the practically very important scenario when the available measurements are noisy versions of y can be found in [3], [32].)

The above described scenario is the standard compressed sensing setup. Such a setup does not assume availability of any a priori knowledge about the structure of the unknown k-sparse signal x. In certain applications one can however encounter signals that have an a priori known structure (see, e.g. [1], [14], [15], [28]–[30], [34]) and then try to design recovery algorithms that would exploit the knowledge of the signal structure. In this paper we will also consider signals that have certain structure. Namely, we will be interested in the recovery of the so-called binary sparse signals x. For the simplicity of the exposition we will under binary sparse signals assume signals whose components are either zero or one (more on this or similar discrete types of signals, as well as on their potential applications, can be found in e.g. [6], [8], [11], [19], [20], [31]). Consequently, under binary k-sparse signals x we will consider signals that have k components equal to one and the remaining ones equal to zero. It is also relatively easy to note that the binary sparse signals are a subclass of the so-called non-negative sparse signals studied in e.g. [11], [18], [27], [35]. To recover the non-negative sparse signals one usually considers a modification of (2); that modification assumes adding positivity constraints on each of the components of x (see, e.g. [11], [18], [27], [35]). In a similar fashion, to recover a binary k-sparse x we will consider the following modification of (2) (see e.g. [11]):

\[
\begin{aligned}
&\min \ \|x\|_1\\
&\text{subject to} \quad Ax = y\\
&\phantom{\text{subject to} \quad} 0 \le x_i \le 1, \ 1 \le i \le n. \qquad (3)
\end{aligned}
\]
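Because the constraints in (3) force x ≥ 0, the objective ‖x‖1 there is simply ∑_i x_i, so (3) is itself a linear program. A minimal sketch of how one might solve it (our illustration; the function name is ours and the paper does not prescribe a particular solver):

```python
import numpy as np
from scipy.optimize import linprog

def recover_binary(A, y):
    """Solve (3): minimize sum(x) subject to A x = y and 0 <= x_i <= 1."""
    n = A.shape[1]
    res = linprog(c=np.ones(n), A_eq=A, b_eq=y,
                  bounds=[(0.0, 1.0)] * n, method="highs")
    return res.x
```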

Let βw be the maximum allowable value of the constant β such that (3) finds the given binary k-sparse solution of (1) with overwhelming probability. The value of βw is often referred to as the weak threshold (for more on the definitions of the weak threshold see, e.g. [7], [10], [27]). For a specific group of randomly generated matrices A, we will in this paper determine the values of βw for the entire range of α, i.e. for 0 ≤ α ≤ 1.

II. BASIC THEOREMS

In this section we introduce two useful theorems that will be of key importance in our subsequent analysis. First we introduce a null-space characterization of the matrix A that guarantees that the solutions of (1) and (3) coincide. Since the analysis is clearly insensitive to which particular locations of the nonzero elements are chosen, we can for the simplicity of the exposition and without loss of generality assume that the components x1, x2, . . . , x_{n−k} of x are equal to zero and the components x_{n−k+1}, x_{n−k+2}, . . . , x_n of x are equal to one. (One should note that while for our analysis it is assumed that the components x1, x2, . . . , x_{n−k} of x are equal to zero, this fact is not known to the algorithm.) Under these assumptions we have the following theorem (similar characterizations adopted in different contexts can be found in [12], [17], [26], [28], [33], [35]).

Theorem 1: (Nonzero part of x has fixed location; all non-zero elements of x are equal to one) Assume that an m × n measurement matrix A is given. Let x be a binary k-sparse vector whose nonzero components are equal to one and let x1 = x2 = · · · = x_{n−k} = 0. Further, assume that y = Ax and that w is an n × 1 vector. Let I = {1, 2, . . . , n − k} and I^c = {n − k + 1, n − k + 2, . . . , n}. If for every nonzero w ∈ R^n such that Aw = 0, w_i ≥ 0, i ∈ I, and w_i ≤ 0, i ∈ I^c, it holds that

\[
-\sum_{i=n-k+1}^{n} w_i < \sum_{i=1}^{n-k} w_i \qquad (4)
\]

then the solution of (3) is the binary k-sparse solution of (1).

Proof: Follows directly from the corresponding results in e.g. [26], [35] by adding the obvious restriction w_i ≤ 0, i ∈ I^c.
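For a concrete (non-asymptotic) matrix A and a fixed support, condition (4) can be checked numerically: over the cone {Aw = 0, w_i ≥ 0 for i ∈ I, w_i ≤ 0 for i ∈ I^c}, (4) is equivalent to ∑_i w_i > 0 for every nonzero w, which can be tested by a single linear program after normalizing the ℓ1 norm of w. The sketch below is our illustration only (the normalization trick and the tolerance are our assumptions, not part of the paper):

```python
import numpy as np
from scipy.optimize import linprog

def condition_4_holds(A, k, tol=1e-9):
    """Check (4) when the nonzero part of x sits on the last k coordinates."""
    m, n = A.shape
    c = np.ones(n)                                             # objective: sum_i w_i
    sign_row = np.concatenate([np.ones(n - k), -np.ones(k)])   # equals ||w||_1 on the cone
    A_eq = np.vstack([A, sign_row])
    b_eq = np.concatenate([np.zeros(m), [1.0]])
    bounds = [(0, None)] * (n - k) + [(None, 0)] * k           # w_i >= 0 on I, w_i <= 0 on I^c
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    # Infeasibility means the cone meets the null-space only at w = 0, so (4) holds vacuously.
    return res.status == 2 or (res.status == 0 and res.fun > tol)
```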

Having a matrix A such that (4) holds would be enough for the solution of (3) to coincide with the binary k-sparse solution of (1). If one assumes that m and k are proportional to n (the case of interest in this paper), then the construction of deterministic matrices A that would satisfy (4) is not an easy task (in fact, one may say that it is one of the most fundamental open problems in the area of theoretical compressed sensing). However, turning to random matrices significantly simplifies things. As we will see later in the paper, the random matrices A that have the null-space uniformly distributed in the Grassmanian will turn out to be a very convenient choice. The following phenomenal result from [16] that relates to such matrices will be the key ingredient in the analysis that will follow.

Theorem 2: ([16] Escape through a mesh) Let S be a subset of the unit Euclidean sphere S^{n−1} in R^n. Let Y be a random (n − m)-dimensional subspace of R^n, distributed uniformly in the Grassmanian with respect to the Haar measure. Let

\[
w(S) = E \sup_{w \in S} (h^T w) \qquad (5)
\]

where h is a random column vector in R^n with i.i.d. N(0, 1) components. Assume that $w(S) < \left(\sqrt{m} - \frac{1}{4\sqrt{m}}\right)$. Then

\[
P(Y \cap S = \emptyset) > 1 - 3.5\, e^{-\frac{\left(\sqrt{m} - \frac{1}{4\sqrt{m}} - w(S)\right)^2}{18}}. \qquad (6)
\]

Remark: Gordon's original constant 3.5 was substituted by 2.5 in [24]. Both constants are fine for our subsequent analysis.

III. PROBABILISTIC ANALYSIS OF THE NULL-SPACE CHARACTERIZATIONS

In this section we probabilistically analyze the validity of the null-space characterization given in Theorem 1. We will show how one can obtain the values of the weak threshold βw for the entire range 0 ≤ α ≤ 1 based on such an analysis.

A. Weak threshold

As masterfully noted in [24], Theorem 2 can be used to probabilistically analyze (4). Following the procedure of [27] we set

\[
S_w = \Big\{ w \in S^{n-1} \ \Big|\ -\sum_{i\in I^c} w_i \ge \sum_{i\in I} w_i, \ w_i \ge 0, i \in I, \ w_i \le 0, i \in I^c \Big\} \qquad (7)
\]


and

\[
w(S_w) = E \sup_{w \in S_w} (h^T w) \qquad (8)
\]

where h is a random column vector in R^n with i.i.d. N(0, 1) components and S^{n−1} is the unit n-dimensional sphere. Let Y be an (n − m)-dimensional subspace of R^n uniformly distributed in the Grassmanian. Furthermore, let Y be the null-space of A. Then as long as $w(S_w) < \left(\sqrt{m} - \frac{1}{4\sqrt{m}}\right)$, Y will miss S_w (i.e. (4) will be satisfied) with probability no smaller than the one given in (6). More precisely, if α = m/n is a constant (the case of interest in this paper), n, m are large, and w(S_w) is smaller than but proportional to √m, then P(Y ∩ S_w = ∅) → 1. This in turn is equivalent to having

\[
P\Big( \big(\forall w \in \mathbb{R}^n \,|\, Aw = 0, w_i \ge 0, i \in I, w_i \le 0, i \in I^c\big), \ -\sum_{i\in I^c} w_i < \sum_{i\in I} w_i \Big) \longrightarrow 1
\]

which according to Theorem 1 means that the solution of (3) coincides with probability 1 with the binary k-sparse solution of (1). For any given value of α ∈ (0, 1] a threshold value of β can then be determined as the maximum β such that $w(S_w) < \left(\sqrt{m} - \frac{1}{4\sqrt{m}}\right)$. That maximum β will be exactly the value of the weak threshold βw. If one is only concerned with finding a possible value for βw, it is easy to note that instead of computing w(S_w) it is sufficient to find an upper bound on it. However, as we will soon see, to determine as good values of βw as possible, the upper bound on w(S_w) should be as tight as possible. The main contribution of this work, as well as of [27], is a fairly precise estimate of w(S_w). As in [27] we can write

\[
w(h, S_w) = \max_{w \in S_w} (h^T w) = \max_{w \in S_w} \Big( \sum_{i\in I} h_i w_i + \sum_{i\in I^c} (-h_i)(-w_i) \Big). \qquad (9)
\]

Let

\[
h^{(I)} = (h_1, h_2, \dots, h_{n-k})^T \quad \text{and} \quad h^{(I^c)} = (-h_{n-k+1}, -h_{n-k+2}, \dots, -h_n)^T.
\]

Further, let $h^{(I)}_{(i)}$ be the i-th smallest of the elements of $h^{(I)}$. In a similar fashion, let $h^{(I^c)}_{(i)}$ be the i-th smallest of the elements of $h^{(I^c)}$. Set

\[
\bar{h} = \big( h^{(I)}_{(1)}, h^{(I)}_{(2)}, \dots, h^{(I)}_{(n-k)}, h^{(I^c)}_{(1)}, h^{(I^c)}_{(2)}, \dots, h^{(I^c)}_{(k)} \big)^T.
\]

Then one can simplify (9) in the following way:

\[
\begin{aligned}
w(h, S_w) = \max_{a \in \mathbb{R}^n} \ & \bar{h}^T a\\
\text{subject to} \ & a_i \ge 0, \ 1 \le i \le n\\
& \sum_{i=n-k+1}^{n} a_i \ge \sum_{i=1}^{n-k} a_i\\
& \sum_{i=1}^{n} a_i^2 \le 1. \qquad (10)
\end{aligned}
\]
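For moderate n, the quantity w(S_w) in (8) (equivalently, the expected optimal value of (10)) can also be estimated directly by Monte Carlo: sample h, form h̄, solve (10) as a small convex program, and average. The sketch below is our illustration only (it assumes the cvxpy modeling package; the function name and sample size are placeholders):

```python
import numpy as np
import cvxpy as cp

def w_Sw_estimate(n, k, num_samples=50, seed=0):
    """Monte Carlo estimate of w(S_w) via the optimization problem (10)."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(num_samples):
        h = rng.standard_normal(n)
        # h_bar: sorted elements of h(I) followed by sorted elements of h(I^c).
        h_bar = np.concatenate([np.sort(h[:n - k]), np.sort(-h[n - k:])])
        a = cp.Variable(n, nonneg=True)
        constraints = [cp.sum(a[n - k:]) >= cp.sum(a[:n - k]),
                       cp.sum_squares(a) <= 1]
        vals.append(cp.Problem(cp.Maximize(h_bar @ a), constraints).solve())
    return float(np.mean(vals))

# The escape-through-a-mesh condition then asks whether this estimate stays
# below sqrt(m) - 1/(4*sqrt(m)) for the m of interest.
```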

Let z ∈ R^n be a column vector such that z_i = 1, 1 ≤ i ≤ (n − k), and z_i = −1, n − k + 1 ≤ i ≤ n.

Repeating the derivation given in [27] one has, based on Lagrange duality theory, that there are $c_w^{(1)} = (1 - \theta_w^{(1)})n \le (n-k)$ and $c_w^{(2)} = (1 - \theta_w^{(2)})n \le k$ such that

\[
w(S_w) = E(w(h, S_w)) \le O(\sqrt{n}\, e^{-n\cdot \mathrm{const}}) + \Bigg( E\!\!\sum_{i=c_w^{(1)}+1}^{n-k}\!\! \bar{h}_i^2 \;+\; E\!\!\sum_{i=n-k+c_w^{(2)}+1}^{n}\!\! \bar{h}_i^2 \;-\; \frac{\Big( E(\bar{h}^T z) - E\sum_{i=1}^{c_w^{(1)}} \bar{h}_i + E\sum_{i=n-k+1}^{n-k+c_w^{(2)}} \bar{h}_i \Big)^2}{n - c_w^{(1)} - c_w^{(2)}} \Bigg)^{\frac{1}{2}}, \qquad (11)
\]

where $\bar{h}_i$ is the i-th element of vector $\bar{h}$ and const is an absolute constant independent of n. One should note the appearance of $c_w^{(2)}$, which corresponds to the fact that now (differently from [27]) all components of a in (10) are non-negative. Furthermore, similarly to what was established in [27], one can establish the following two relations that $c_w^{(1)}$ and $c_w^{(2)}$ satisfy:

\[
\frac{E(\bar{h}^T z) - E\sum_{i=1}^{c_w^{(1)}} \bar{h}_i + E\sum_{i=n-k+1}^{n-k+c_w^{(2)}} \bar{h}_i}{(n - c_w^{(1)} - c_w^{(2)})/(1-\epsilon)} \ \ge\ F_c^{-1}\!\left( \frac{(1+\epsilon)\, c_w^{(1)}}{n(1-\beta)} \right)
\]
\[
\frac{E\sum_{i=1}^{c_w^{(1)}} \bar{h}_i - E(\bar{h}^T z) - E\sum_{i=n-k+1}^{n-k+c_w^{(2)}} \bar{h}_i}{(n - c_w^{(1)} - c_w^{(2)})/(1+\epsilon)} \ \ge\ F_c^{-1}\!\left( \frac{(1-\epsilon)\, c_w^{(2)}}{n\beta} \right). \qquad (12)
\]

As in [27], $F_c^{-1}(\cdot)$ in the inequalities in (12) stands for the inverse cdf of the zero-mean, unit variance Gaussian random variable and ε > 0 is an arbitrarily small constant independent of n. When n is large, the first term on the right hand side of the inequality in (11) goes to zero. Finding $c_w^{(1)}$ and $c_w^{(2)}$ that satisfy (12) and minimize the second term on the right hand side of the inequality in (11) is then enough to determine a valid upper bound on w(S_w) and therefore to compute the values of the weak threshold. The theorem that we present below summarizes how an attainable value of the weak threshold can be computed. It follows by literally repeating each of the remaining steps of the procedure established in [27] while relying on results from [2], [25].

To facilitate the presentation of the theorem, before proceeding further we first simplify the notation. Let erfinv be the inverse of the standard error function associated with a zero-mean, unit variance Gaussian random variable. Further, let f1(θ, β) be the following function of the arguments θ and β:

\[
f_1(\theta, \beta) = 2\left( \mathrm{erfinv}\left( 2\,\frac{1-\theta}{1-\beta} - 1 \right) \right)^2. \qquad (13)
\]

Also, let f2(θ, β) be the following function of the arguments θ and β:

\[
f_2(\theta, \beta) =
\begin{cases}
(1-\beta) - (1-\theta) + \dfrac{(1-\beta)\sqrt{f_1(\theta,\beta)}}{\sqrt{2\pi e^{f_1(\theta,\beta)}}}, & \frac{2(1-\theta)}{1-\beta} \ge 1\\[2ex]
(1-\beta) - (1-\theta) - \dfrac{(1-\beta)\sqrt{f_1(\theta,\beta)}}{\sqrt{2\pi e^{f_1(\theta,\beta)}}}, & \text{otherwise.}
\end{cases} \qquad (14)
\]


Finally, let f3(θ^{(1)}, θ^{(2)}, β) be the following function of the arguments θ^{(1)}, θ^{(2)}, and β:

\[
f_3(\theta^{(1)}, \theta^{(2)}, \beta) = \frac{(1-\beta)\, e^{-\left( \mathrm{erfinv}\left( \frac{2(1-\theta^{(1)})}{1-\beta} - 1 \right) \right)^2}}{\sqrt{2\pi}} - \frac{\beta\, e^{-\left( \mathrm{erfinv}\left( \frac{2(1-\theta^{(2)})}{\beta} - 1 \right) \right)^2}}{\sqrt{2\pi}}. \qquad (15)
\]

Theorem 3: (Weak threshold; non-zero elements of x are equal to one) Let A be an m × n measurement matrix in (1) with the null-space uniformly distributed in the Grassmanian. Let the unknown x in (1) be binary k-sparse. Further, let the locations of the nonzero elements of x be arbitrarily chosen but fixed. Assume that the non-zero elements of x are equal to one. Let k, m, n be large and let α = m/n and βw = k/n be constants independent of m, n, and k. Let ε > 0 be an arbitrarily small constant and let the functions f1(·, ·), f2(·, ·), and f3(·, ·, ·) be as defined in (13), (14), and (15), respectively. Further, let $\theta_w^{(1)}$, $(\beta_w \le \theta_w^{(1)} \le 1)$, and $\theta_w^{(2)}$, $(1 - \frac{\beta_w}{2} \le \theta_w^{(2)} \le 1)$, be such that

\[
\frac{(1-\epsilon)\, f_3(\theta_w^{(1)}, \theta_w^{(2)}, \beta_w)}{\theta_w^{(1)} + \theta_w^{(2)} - 1} \ \ge\ \sqrt{2}\, \mathrm{erfinv}\left( \frac{2(1+\epsilon)(1-\theta_w^{(1)})}{1-\beta_w} - 1 \right)
\]
\[
\frac{(1+\epsilon)\, f_3(\theta_w^{(1)}, \theta_w^{(2)}, \beta_w)}{\theta_w^{(1)} + \theta_w^{(2)} - 1} \ \le\ -\sqrt{2}\, \mathrm{erfinv}\left( \frac{2(1-\epsilon)(1-\theta_w^{(2)})}{\beta_w} - 1 \right). \qquad (16)
\]

If α and βw further satisfy

\[
\alpha > f_2(\theta_w^{(1)}, \beta_w) + f_2(\theta_w^{(2)}, 1-\beta_w) - \frac{\big( f_3(\theta_w^{(1)}, \theta_w^{(2)}, \beta_w) \big)^2}{\theta_w^{(1)} + \theta_w^{(2)} - 1} \qquad (17)
\]

then the solution of (3) coincides with overwhelming probability with the binary k-sparse solution of (1).

Proof: Follows from the previous discussion and the analysis presented in [27].
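The computation prescribed by Theorem 3 is straightforward to carry out numerically. The following Python sketch is our illustration only: it evaluates f1, f2, f3 from (13)–(15), treats the two relations in (16) as equalities in the limit ε → 0 (an assumption of the sketch), solves them with a generic root-finder, and then evaluates the right-hand side of (17). Sweeping β and recording the resulting α reproduces the kind of curve shown in Figure 2; the starting point handed to the root-finder is ours and is not guaranteed to work for every β.

```python
import numpy as np
from scipy.special import erfinv
from scipy.optimize import fsolve

def f1(th, b):
    # Equation (13).
    return 2.0 * erfinv(2.0 * (1.0 - th) / (1.0 - b) - 1.0) ** 2

def f2(th, b):
    # Equation (14).
    base = (1.0 - b) - (1.0 - th)
    corr = (1.0 - b) * np.sqrt(f1(th, b)) / np.sqrt(2.0 * np.pi * np.exp(f1(th, b)))
    return base + corr if 2.0 * (1.0 - th) / (1.0 - b) >= 1.0 else base - corr

def f3(t1, t2, b):
    # Equation (15).
    return ((1.0 - b) * np.exp(-erfinv(2.0 * (1.0 - t1) / (1.0 - b) - 1.0) ** 2)
            - b * np.exp(-erfinv(2.0 * (1.0 - t2) / b - 1.0) ** 2)) / np.sqrt(2.0 * np.pi)

def alpha_of_beta(beta):
    """Smallest alpha allowed by (17) once (16) is imposed with equality (eps -> 0)."""
    def residuals(v):
        t1, t2 = v
        lhs = f3(t1, t2, beta) / (t1 + t2 - 1.0)
        return [lhs - np.sqrt(2.0) * erfinv(2.0 * (1.0 - t1) / (1.0 - beta) - 1.0),
                lhs + np.sqrt(2.0) * erfinv(2.0 * (1.0 - t2) / beta - 1.0)]
    t1, t2 = fsolve(residuals, [(1.0 + beta) / 2.0, 1.0 - beta / 4.0])
    return f2(t1, beta) + f2(t2, 1.0 - beta) - f3(t1, t2, beta) ** 2 / (t1 + t2 - 1.0)

if __name__ == "__main__":
    for beta in (0.05, 0.1, 0.2, 0.3, 0.4):
        a = alpha_of_beta(beta)
        print(f"beta_w = {beta:.2f} -> alpha ~ {a:.3f} (beta/alpha ~ {beta / a:.3f})")
```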

The results for the weak threshold obtained from the above theorem are presented in Figure 2. For comparison, we also show the values of the weak threshold for the so-called non-negative signals. The threshold values obtained in that case correspond to the ones computed in [27] and (as demonstrated in [27]) match the best currently known ones from [9], [10]. Since binary signals are a subclass of the non-negative signals (and as such are clearly more structured), one would then expect that they have higher recoverable sparsity thresholds. Figure 2 hints that the ℓ1 optimization algorithm from (3) indeed recovers higher sparsity. Also, it is interesting to note that if α → 1/2 then β/α → 1. This means that in binary compressed sensing one needs at most m = n/2 measurements no matter how sparse the signal is (more on this phenomenon the interested reader can find in an excellent recent work [20]). Another interesting point is that in binary compressed sensing the understanding of sparsity can be twofold. Traditionally one views sparsity as the number of non-zero components of x. However, given the symmetry of the binary signals, it is relatively easy to see that the sparsity can be viewed as the number of zeros as well (a simple signal transformation x′_i = 1 − x_i, 1 ≤ i ≤ n, translates one sparsity to the other). This, on the other hand, implies that if one can recover all signals of sparsity (understood in the traditional way as the number of non-zeros) less than or equal to k with m measurements, then with the same m measurements one can recover all signals of sparsity larger than or equal to (n − k) as well.
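To spell the transformation out at the level of the measurements: if x has (n − k) ones and we set x′ = 1 − x (which is binary k-sparse), then

\[
A x' = A(\mathbf{1} - x) = A\mathbf{1} - Ax = A\mathbf{1} - y,
\]

so running (3) on the modified measurement vector A1 − y recovers x′, and x = 1 − x′ follows.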

Fig. 2. Weak threshold; ℓ1 optimization, x_i ∈ {0, 1}. (The figure plots β/α versus α for two curves: signed x [Donoho/Tanner; Stojnic] and x_i ∈ {0, 1} [present paper].)

IV. NUMERICAL EXPERIMENTS

In this section we briefly discuss the results that we obtained from numerical experiments. In all our numerical experiments we fixed n = 200. We then generated matrices A of size m × n with m/2 ∈ {10, 20, 30, 40, 45, 49, 59, . . . , 99}. The components of the measurement matrices A were generated as i.i.d. zero-mean unit variance Gaussian random variables. For each m we generated k-sparse signals x for several different values of k from the transition zone (the locations of the non-zero elements of x were chosen randomly and fixed). For each combination (k, m) we generated 200 different problem instances and recorded the number of times the ℓ1 optimization algorithm from (3) failed to recover the correct k-sparse x. The obtained data were then interpolated and graphically presented in Figure 3. The color of any point in Figure 3 shows the probability of having ℓ1 optimization from (3) succeed for the combination (α, β) that corresponds to that point. The colors are mapped to probabilities according to the scale on the right hand side of the figure. The simulated results can naturally be compared to the weak threshold theoretical prediction. Hence, we also show in Figure 3 the theoretical value for the weak threshold calculated according to Theorem 3 (and shown in Figure 2). We observe that the simulation results are in good agreement with the theoretical calculation.
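A minimal sketch of one cell of this experiment (our illustration; the per-trial re-drawing of the support and the recovery tolerance below are our simplifications, not the exact protocol used for Figure 3):

```python
import numpy as np
from scipy.optimize import linprog

def empirical_success(n, m, k, trials=200, seed=0):
    """Fraction of trials in which (3) recovers a binary k-sparse x from m Gaussian measurements."""
    rng = np.random.default_rng(seed)
    successes = 0
    for _ in range(trials):
        A = rng.standard_normal((m, n))
        x = np.zeros(n)
        x[rng.choice(n, size=k, replace=False)] = 1.0        # k components equal to one
        y = A @ x
        res = linprog(np.ones(n), A_eq=A, b_eq=y,
                      bounds=[(0.0, 1.0)] * n, method="highs")
        if res.status == 0 and np.allclose(res.x, x, atol=1e-4):
            successes += 1
    return successes / trials

# Example: one (alpha, beta) point with n = 200, m = 100, k = 30.
# print(empirical_success(200, 100, 30))
```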

Fig. 3. Experimentally recoverable sparsity; ℓ1 optimization, x_i ∈ {0, 1}. (The figure plots β/α versus α; the color of each point encodes the empirical success probability of (3), with the theoretical threshold for x_i ∈ {0, 1} overlaid.)

V. CONCLUSION

In this paper we considered recovery of sparse signals from a reduced number of linear measurements. We additionally assumed that the sparse signal is binary, which essentially means that the non-zero components are equal to one. We provided a theoretical performance analysis of a modified polynomial ℓ1 optimization algorithm when used for recovery of such signals. Under the assumption that the measurement matrix A has a basis of the null-space distributed uniformly in the Grassmanian, we derived lower bounds on the values of the recoverable weak thresholds in the so-called linear regime, i.e. in the regime when the recoverable sparsity is proportional to the length of the unknown vector. The obtained threshold results suggest that unknown sparse signals with additional binary structure can have higher recoverable sparsity.

The results presented in this paper seem encouraging and one may wonder if further extensions are possible. One of the most important observations is that the sparse signals that we considered in this paper are discrete (in fact, very simple discrete signals). For such simple discrete signals we were able to show that one can improve the recoverable thresholds. Naturally, one can ask if results of similar flavor can be obtained for more general discrete signals. For example, it would be very interesting (especially given the importance of digital signals in practical considerations) to see if one can design an adaptation of the standard ℓ1 optimization so that unknown signals with components from, say, an integer set larger than the binary one considered here can be recovered as well. This will be the subject of future work.

REFERENCES

[1] R. Baraniuk, V. Cevher, M. Duarte, and C. Hegde. Model-based compressive sensing. Available online at http://www.dsp.ece.rice.edu/cs/.

[2] A. Barvinok and A. Samorodnitsky. Random weighting, asymptotic counting, and inverse isoperimetry. Israel Journal of Mathematics, 158:159–191, 2007.

[3] E. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. on Information Theory, 52:489–509, December 2006.

[4] A. Cohen, W. Dahmen, and R. DeVore. Compressed sensing and best k-term approximation. Journal of the American Mathematical Society, 22(1), January 2009.

[5] S. F. Cotter and B. D. Rao. Sparse channel estimation via matching pursuit with application to equalization. IEEE Trans. on Communications, 50(3), 2002.

[6] W. Dai and O. Milenkovic. Weighted superimposed codes and constrained integer compressed sensing. IEEE Trans. on Information Theory, 55(9):2215–2219, September 2009.

[7] D. Donoho. High-dimensional centrally symmetric polytopes with neighborliness proportional to dimension. Disc. Comput. Geom., 35(4):617–652, 2006.

[8] D. Donoho, A. Maleki, and A. Montanari. Message-passing algorithms for compressed sensing. Proc. National Academy of Sciences, 106(45):18914–18919, Nov. 2009.

[9] D. Donoho and J. Tanner. Sparse nonnegative solutions of underdetermined linear equations by linear programming. Proc. National Academy of Sciences, 102(27):9446–9451, 2005.

[10] D. Donoho and J. Tanner. Thresholds for the recovery of sparse solutions via l1 minimization. Proc. Conf. on Information Sciences and Systems, March 2006.

[11] D. Donoho and J. Tanner. Counting the faces of randomly projected hypercubes and orthants with application. 2008. Available online at http://www.dsp.ece.rice.edu/cs/.

[12] D. L. Donoho and X. Huo. Uncertainty principles and ideal atomic decompositions. IEEE Trans. Inform. Theory, 47(7):2845–2862, November 2001.

[13] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R. Baraniuk. Single-pixel imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2), 2008.

[14] Y. C. Eldar and M. Mishali. Robust recovery of signals from a structured union of subspaces. 2008. Available at arXiv:0807.4581.

[15] A. Ganesh, Z. Zhou, and Y. Ma. Separation of a subspace-sparse signal: Algorithms and conditions. IEEE International Conference on Acoustics, Speech and Signal Processing, pages 3141–3144, April 2009.

[16] Y. Gordon. On Milman's inequality and random subspaces which escape through a mesh in R^n. Geometric Aspects of Functional Analysis, Isr. Semin. 1986–87, Lect. Notes Math., 1317, 1988.

[17] R. Gribonval and M. Nielsen. Sparse representations in unions of bases. IEEE Trans. Inform. Theory, 49(12):3320–3325, December 2003.

[18] M. A. Khajehnejad, A. G. Dimakis, W. Xu, and B. Hassibi. Sparse recovery of positive signals with minimal expansion. Preprint, 2009. Available at arXiv:0902.4045.

[19] O. L. Mangasarian and M. C. Ferris. Uniqueness of integer solution of linear equations. Data Mining Institute Technical Report 09-01, July 2009. Available online at http://www.cs.wisc.edu/math-prog/tech-reports.

[20] O. L. Mangasarian and B. Recht. Probability of unique integer solution to a system of linear equations. Data Mining Institute Technical Report 09-02, September 2009. Available online at http://www.cs.wisc.edu/math-prog/tech-reports.

[21] F. Parvaresh, H. Vikalo, S. Misra, and B. Hassibi. Recovering sparse signals using sparse measurement matrices in compressed DNA microarrays. IEEE Journal of Selected Topics in Signal Processing, 2(3):275–285, June 2008.

[22] B. Recht, M. Fazel, and P. A. Parrilo. Guaranteed minimum-rank solution of linear matrix equations via nuclear norm minimization. 2007. Available online at http://www.dsp.ece.rice.edu/cs/.

[23] J. Romberg. Imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2):14–20, 2008.

[24] M. Rudelson and R. Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Comm. on Pure and Applied Math., 61(8), 2007.

[25] S. M. Stigler. The asymptotic distribution of the trimmed mean. Annals of Statistics, 1:472–477, 1973.

[26] M. Stojnic. A simple performance analysis of ℓ1-optimization in compressed sensing. ICASSP, International Conference on Acoustics, Speech and Signal Processing, April 2009.

[27] M. Stojnic. Various thresholds for ℓ1-optimization in compressed sensing. Preprint, 2009. Available at arXiv:0907.3666.

[28] M. Stojnic, F. Parvaresh, and B. Hassibi. On the reconstruction of block-sparse signals with an optimal number of measurements. IEEE Trans. on Signal Processing, August 2009.

[29] J. Tropp, A. C. Gilbert, and M. Strauss. Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Processing, Aug. 2005.

[30] E. van den Berg and M. P. Friedlander. Joint-sparse recovery from multiple measurements. Preprint, 2009. Available at arXiv:0904.2051.

[31] H. Vikalo, F. Parvaresh, and B. Hassibi. On sparse recovery of compressed DNA microarrays. Asilomar Conference, November 2007.

[32] M. J. Wainwright. Sharp thresholds for high-dimensional and noisy recovery of sparsity. Proc. Allerton Conference on Communication, Control, and Computing, September 2006.

[33] W. Xu and B. Hassibi. Compressed sensing over the Grassmann manifold: A unified analytical framework. 2008. Available online at http://www.dsp.ece.rice.edu/cs/.

[34] A. C. Zelinski, L. L. Wald, K. Setsompop, V. K. Goyal, and E. Adalsteinsson. Sparsity-enforced slice-selective MRI RF excitation pulse design. IEEE Trans. on Medical Imaging, 27(9):1213–1229, Sep. 2008.

[35] Y. Zhang. A simple proof for recoverability of ℓ1-minimization (II): the nonnegative case. Rice CAAM Department Technical Report TR05-10, 2005. Available at http://www.dsp.ece.rice.edu/cs/.
