Sparse Parameter Estimation: Compressed Sensing meets Matrix Pencil

Yuejie Chi
Departments of ECE and BMI
The Ohio State University

Colorado School of Mines
December 9, 2014
Acknowledgement
Joint work with Yuxin Chen (Stanford) and Yuanxin Li (Ohio State).
– Y. Chen and Y. Chi, Robust Spectral Compressed Sensing via Structured Matrix Completion, IEEE Trans. Information Theory, http://arxiv.org/abs/1304.8126

– Y. Li and Y. Chi, Off-the-Grid Line Spectrum Denoising and Estimation with Multiple Measurement Vectors, submitted, http://arxiv.org/abs/1408.2242
Sparse Fourier Analysis

In many sensing applications, one is interested in identification of a parametric signal:

$$x(t) = \sum_{i=1}^{r} d_i e^{j2\pi \langle t, f_i \rangle}, \qquad t \in \llbracket n_1 \rrbracket \times \cdots \times \llbracket n_K \rrbracket$$

($f_i \in [0,1)^K$: frequencies, $d_i$: amplitudes, $r$: model order)

Occam's razor: the number of modes $r$ is small.

Sensing with a minimal cost: how to identify the parametric signal model from a small subset of entries of $x(t)$?

This problem has many (classical) applications in communications, remote sensing, and array signal processing.
Applications in Communications and Sensing
Multipath channels: a (relatively) small number of strong paths.
Radar target identification: a (relatively) small number of strong scatterers.
Applications in Imaging

Swap time and frequency:

$$z(t) = \sum_{i=1}^{r} d_i\, \delta(t - t_i)$$

Applications in microscopy imaging and astrophysics.

Resolution is limited by the point spread function of the imaging system.
Something Old: Parametric Estimation

Exploring physically-meaningful constraints: shift invariance of the harmonic structure,

$$x(t - \tau) = \sum_{i=1}^{r} d_i e^{j2\pi\langle t - \tau, f_i\rangle} = \sum_{i=1}^{r} d_i e^{-j2\pi\langle \tau, f_i\rangle} e^{j2\pi\langle t, f_i\rangle}.$$

Prony's method: root-finding.

SVD-based approaches: ESPRIT [Roy and Kailath, 1989], MUSIC [Schmidt, 1986], matrix pencil [Hua and Sarkar, 1990; Hua, 1992].

Spectrum-blind sampling [Bresler, 1996], finite rate of innovation [Vetterli, 2001], Xampling [Eldar, 2011].

Pros: perfect recovery from (equi-spaced) O(r) samples.

Cons: sensitive to noise and outliers; usually requires prior knowledge of the model order.
Something New: Compressed Sensing

Exploring sparsity: compressed sensing [Candes and Tao, 2006; Donoho, 2006] captures the attributes (sparsity) of signals from a small number of samples.

Discretize the frequency and assume a sparse representation over the discretized basis:

$$f_i \in \mathcal{F} = \left\{0, \tfrac{1}{n_1}, \ldots, \tfrac{n_1 - 1}{n_1}\right\} \times \left\{0, \tfrac{1}{n_2}, \ldots, \tfrac{n_2 - 1}{n_2}\right\} \times \cdots$$

Pros: perfect recovery from O(r log n) random samples; robust against noise and outliers.

Cons: sensitive to gridding error.
Sensitivity to Basis Mismatch
A toy example: $x(t) = e^{j2\pi f_0 t}$:

– The success of CS relies on sparsity in the DFT basis.
– Basis mismatch: physics places $f_0$ off the grid by a frequency offset.
  ∗ Basis mismatch translates a sparse signal into an incompressible signal.
[Figure: DFT-basis recovery of the off-grid tone. Mismatch Δθ = 0.1π/N yields normalized recovery error 0.0816; Δθ = 0.5π/N yields 0.3461; Δθ = π/N yields 1.0873.]
– Finer grid does not help, and one never estimates the true continuous-valued frequencies! [Chi, Scharf, Pezeshki, Calderbank 2011]
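A quick numerical illustration of this effect (a Python sketch with parameters chosen here, not taken from the slides): a tone half a DFT bin off the grid spreads its energy over many DFT coefficients.

```python
import numpy as np

# A sketch of basis mismatch: an off-grid tone is not sparse in the DFT basis.
N = 256
t = np.arange(N)
for name, f0 in [("on-grid", 10 / N), ("off-grid", 10.5 / N)]:
    x = np.exp(2j * np.pi * f0 * t)
    X = np.abs(np.fft.fft(x)) ** 2
    # count how many DFT coefficients are needed to capture 99% of the energy
    energy = np.sort(X)[::-1].cumsum() / X.sum()
    k = int(np.searchsorted(energy, 0.99)) + 1
    print(f"{name}: {k} DFT coefficients capture 99% of the energy")
```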
Our Approach

Conventional approaches enforce physically-meaningful constraints, but not sparsity;

Compressed sensing enforces sparsity, but not physically-meaningful constraints;

Approach: we combine sparsity with physically-meaningful constraints, so that we can stably estimate the continuous-valued frequencies from a minimal number of time-domain samples:

– revisit matrix pencil, proposed for array signal processing;
– revitalize matrix pencil by combining it with convex optimization.
Two-Dimensional Frequency Model
Stack the signal $x(t) = \sum_{i=1}^{r} d_i e^{j2\pi\langle t, f_i\rangle}$ into a matrix $X \in \mathbb{C}^{n_1 \times n_2}$.

The matrix X has the following Vandermonde decomposition:

$$X = Y \cdot \underbrace{D}_{\text{diagonal matrix}} \cdot\, Z^T.$$

Here, $D := \mathrm{diag}\{d_1, \cdots, d_r\}$ and Y, Z are Vandermonde matrices:

$$Y := \begin{bmatrix} 1 & 1 & \cdots & 1 \\ y_1 & y_2 & \cdots & y_r \\ \vdots & \vdots & \vdots & \vdots \\ y_1^{n_1-1} & y_2^{n_1-1} & \cdots & y_r^{n_1-1} \end{bmatrix}, \qquad Z := \begin{bmatrix} 1 & 1 & \cdots & 1 \\ z_1 & z_2 & \cdots & z_r \\ \vdots & \vdots & \vdots & \vdots \\ z_1^{n_2-1} & z_2^{n_2-1} & \cdots & z_r^{n_2-1} \end{bmatrix},$$

where $y_i = \exp(j2\pi f_{1i})$, $z_i = \exp(j2\pi f_{2i})$, $f_i = (f_{1i}, f_{2i})$.
Goal: we observe a random subset of entries of X, and wish to recover the missing entries.
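The model is easy to simulate. A minimal Python sketch (dimensions and frequencies chosen arbitrarily here) builds X from its Vandermonde factors:

```python
import numpy as np

# A sketch of the 2-D frequency model: X = Y diag(d) Z^T for r random
# frequency pairs f_i = (f_{1i}, f_{2i}).
rng = np.random.default_rng(0)
n1, n2, r = 16, 16, 4
f1, f2 = rng.random(r), rng.random(r)                      # frequencies in [0, 1)
d = rng.standard_normal(r) + 1j * rng.standard_normal(r)   # complex amplitudes

Y = np.exp(2j * np.pi * np.outer(np.arange(n1), f1))       # n1 x r Vandermonde
Z = np.exp(2j * np.pi * np.outer(np.arange(n2), f2))       # n2 x r Vandermonde
X = Y @ np.diag(d) @ Z.T

print(np.linalg.matrix_rank(X))  # r, when r <= min(n1, n2)
```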
Matrix Completion?

Recall that $X = Y D Z^T$, where $D := \mathrm{diag}\{d_1, \cdots, d_r\}$ is diagonal and Y, Z are the Vandermonde matrices from the previous slide.

Quick observation: X can be a low-rank matrix with rank(X) = r.

Question: can we apply matrix completion algorithms directly on X?

Yes, but it yields sub-optimal performance:
– it requires at least $r \max\{n_1, n_2\}$ samples.

No, X is no longer low-rank if $r > \min(n_1, n_2)$:
– note that r can be as large as $n_1 n_2$.
Revisiting Matrix Pencil: Matrix Enhancement

Given a data matrix X, Hua proposed the following matrix enhancement for two-dimensional frequency models [Hua 1992]:

Choose two pencil parameters $k_1$ and $k_2$; the enhanced form $X_e$ is a $k_1 \times (n_1 - k_1 + 1)$ block Hankel matrix:

$$X_e = \begin{bmatrix} X_0 & X_1 & \cdots & X_{n_1-k_1} \\ X_1 & X_2 & \cdots & X_{n_1-k_1+1} \\ \vdots & \vdots & \vdots & \vdots \\ X_{k_1-1} & X_{k_1} & \cdots & X_{n_1-1} \end{bmatrix},$$

where each block is a $k_2 \times (n_2 - k_2 + 1)$ Hankel matrix:

$$X_l = \begin{bmatrix} x_{l,0} & x_{l,1} & \cdots & x_{l,n_2-k_2} \\ x_{l,1} & x_{l,2} & \cdots & x_{l,n_2-k_2+1} \\ \vdots & \vdots & \vdots & \vdots \\ x_{l,k_2-1} & x_{l,k_2} & \cdots & x_{l,n_2-1} \end{bmatrix}.$$
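For concreteness, a minimal Python sketch of this construction (an illustration, not code from the talk), using scipy's Hankel helper:

```python
import numpy as np
from scipy.linalg import hankel

# A sketch of matrix enhancement: map an n1 x n2 data matrix X into the
# k1 x (n1 - k1 + 1) block Hankel matrix Xe whose blocks are the
# k2 x (n2 - k2 + 1) Hankel matrices built from the rows of X.
def enhance(X, k1, k2):
    n1, n2 = X.shape
    # Hankel block X_l: first column x_{l,0..k2-1}, last row x_{l,k2-1..n2-1}
    blocks = [hankel(X[l, :k2], X[l, k2 - 1:]) for l in range(n1)]
    return np.block([[blocks[i + j] for j in range(n1 - k1 + 1)]
                     for i in range(k1)])
```

Applied to the X from the earlier model sketch, `np.linalg.matrix_rank(enhance(X, 8, 8))` should return r for generic frequencies, anticipating the low-rank property on the next slide.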
Low Rankness of the Enhanced Matrix

[Figure: a composite matrix C = A* + B*, the sum of a low-rank matrix (unknown rank and eigenvectors) and a sparse "errors" matrix (unknown support and values).]
Choose pencil parameters $k_1 = \Theta(n_1)$ and $k_2 = \Theta(n_2)$; the dimensionality of $X_e$ is proportional to $n_1 n_2 \times n_1 n_2$.

The enhanced matrix can be decomposed as follows [Hua 1992]:

$$X_e = \begin{bmatrix} Z_L \\ Z_L Y_d \\ \vdots \\ Z_L Y_d^{k_1-1} \end{bmatrix} D \left[ Z_R,\ Y_d Z_R,\ \cdots,\ Y_d^{n_1-k_1} Z_R \right],$$

– $Z_L$ and $Z_R$ are Vandermonde matrices specified by $z_1, \ldots, z_r$;
– $Y_d = \mathrm{diag}[y_1, y_2, \cdots, y_r]$.

The enhanced form $X_e$ is low-rank:

– $\mathrm{rank}(X_e) \le r$;
– spectral sparsity ⇒ low rankness, which holds even with damped modes.
Enhanced Matrix Completion (EMaC)
The natural algorithm is to find the enhanced matrix with the minimal rank satisfying the measurements:

$$\min_{M\in\mathbb{C}^{n_1\times n_2}} \ \mathrm{rank}(M_e) \quad \text{subject to} \quad M_{i,j} = X_{i,j},\ \forall (i,j)\in\Omega,$$

where Ω denotes the sampling set.

Motivated by matrix completion, we solve its convex relaxation:

$$\text{(EMaC)}: \quad \min_{M\in\mathbb{C}^{n_1\times n_2}} \ \|M_e\|_* \quad \text{subject to} \quad M_{i,j} = X_{i,j},\ \forall (i,j)\in\Omega,$$

where $\|\cdot\|_*$ denotes the nuclear norm.
The algorithm is referred to as Enhanced Matrix Completion (EMaC).
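A sketch of EMaC in cvxpy (an illustration assuming a complex-capable solver such as SCS is installed; not the authors' implementation):

```python
import cvxpy as cp
import numpy as np

# A minimal sketch of EMaC. Omega is a boolean mask of observed entries of X.
def enhance_expr(M, n1, n2, k1, k2):
    """Enhanced block Hankel form of a cvxpy matrix expression M."""
    def hankel_block(l):
        # row p of the k2 x (n2-k2+1) Hankel block is M[l, p : p + n2-k2+1]
        return cp.vstack([M[l, p:p + n2 - k2 + 1] for p in range(k2)])
    return cp.bmat([[hankel_block(i + j) for j in range(n1 - k1 + 1)]
                    for i in range(k1)])

def emac(X, Omega, k1, k2):
    n1, n2 = X.shape
    M = cp.Variable((n1, n2), complex=True)
    constraints = [M[i, j] == X[i, j] for i, j in np.argwhere(Omega)]
    objective = cp.Minimize(cp.normNuc(enhance_expr(M, n1, n2, k1, k2)))
    cp.Problem(objective, constraints).solve()
    return M.value
```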
Enhanced Matrix Completion (EMaC)
$$\text{(EMaC)}: \quad \min_{M\in\mathbb{C}^{n_1\times n_2}} \ \|M_e\|_* \quad \text{subject to} \quad M_{i,j} = X_{i,j},\ \forall (i,j)\in\Omega$$
Existing matrix completion results do not apply: they would require at least O(nr) samples.
Question: How many samples do we need?
[Figure: sampling pattern of the enhanced matrix, with observed (√) and missing (?) entries.]
Introduce Coherence Measure

[Figure: the 2-D Dirichlet kernel, plotted against frequency separation on the x and y axes.]
Define the 2-D Dirichlet kernel:

$$\mathcal{D}(k_1, k_2, f_1, f_2) := \frac{1}{k_1 k_2}\left(\frac{1 - e^{-j2\pi k_1 f_1}}{1 - e^{-j2\pi f_1}}\right)\left(\frac{1 - e^{-j2\pi k_2 f_2}}{1 - e^{-j2\pi f_2}}\right).$$

Define $G_L$ and $G_R$ as $r \times r$ Gram matrices such that

$$(G_L)_{i,l} = \mathcal{D}(k_1, k_2, f_{1i} - f_{1l}, f_{2i} - f_{2l}),$$
$$(G_R)_{i,l} = \mathcal{D}(n_1 - k_1 + 1, n_2 - k_2 + 1, f_{1i} - f_{1l}, f_{2i} - f_{2l}).$$
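These quantities are easy to compute. A Python sketch (with arbitrary dimensions and random frequencies) builds GL, GR and reports their smallest singular values, which the incoherence condition on the next slide bounds from below:

```python
import numpy as np

# A sketch of the coherence measure via the 2-D Dirichlet kernel.
def dirichlet(k1, k2, f1, f2):
    def ratio(k, f):
        den = 1 - np.exp(-2j * np.pi * f)
        return k if abs(den) < 1e-12 else (1 - np.exp(-2j * np.pi * k * f)) / den
    return ratio(k1, f1) * ratio(k2, f2) / (k1 * k2)

def gram(k1, k2, f1s, f2s):
    r = len(f1s)
    return np.array([[dirichlet(k1, k2, f1s[i] - f1s[l], f2s[i] - f2s[l])
                      for l in range(r)] for i in range(r)])

rng = np.random.default_rng(0)
n1 = n2 = 16
k1 = k2 = 8
f1s, f2s = rng.random(4), rng.random(4)       # r = 4 random frequency pairs
GL = gram(k1, k2, f1s, f2s)
GR = gram(n1 - k1 + 1, n2 - k2 + 1, f1s, f2s)
print(np.linalg.svd(GL, compute_uv=False).min(),
      np.linalg.svd(GR, compute_uv=False).min())
```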
Introduce Incoherence Measure

The incoherence condition holds w.r.t. $\mu$ if

$$\sigma_{\min}(G_L) \ge \frac{1}{\mu}, \qquad \sigma_{\min}(G_R) \ge \frac{1}{\mu}.$$

Examples: $\mu = \Theta(1)$ under many scenarios:

– randomly generated frequencies;
– (mild) perturbation of grid points;
– in 1D, with $k_1 \approx \frac{n_1}{2}$: well-separated frequencies (Liao and Fannjiang, 2014), $\Delta = \min_{i \neq j} |f_i - f_j| \gtrsim \frac{2}{n_1}$, which is about 2 times the Rayleigh limit.
Theoretical Guarantees for Noiseless Case
Theorem [Chen and Chi, 2013] (Noiseless Samples): Let $n = n_1 n_2$. If all measurements are noiseless, then EMaC recovers X perfectly with high probability if

$$m > C \mu r \log^4 n,$$

where C is some universal constant.
Implications
– deterministic signal model, random observations;
– the coherence condition µ depends only on the frequencies, not the amplitudes;
– near-optimal within logarithmic factors: Θ(r polylog n);
– general theoretical guarantees for Hankel (Toeplitz) matrix completion; see applications in control, MRI, natural language processing, etc.
Phase Transition
[Phase transition diagram: x-axis m (number of samples), y-axis r (sparsity level).]

Figure 1: Phase transition diagram where spike locations are randomly generated, shown for the case n1 = n2 = 15.
Robustness to Bounded Noise
Assume the samples are noisy, $X^o = X + N$, where N is bounded noise:

$$\text{(EMaC-Noisy)}: \quad \min_{M\in\mathbb{C}^{n_1\times n_2}} \ \|M_e\|_* \quad \text{subject to} \quad \|P_\Omega(M - X^o)\|_F \le \delta.$$

Theorem [Chen and Chi, 2013] (Noisy Samples): Suppose $X^o$ is a noisy copy of X that satisfies $\|P_\Omega(X - X^o)\|_F \le \delta$. Under the conditions of Theorem 1, the solution $\hat{X}$ to EMaC-Noisy satisfies

$$\|\hat{X}_e - X_e\|_F \le \left(2\sqrt{n} + 8n + \frac{8\sqrt{2}\, n^2}{m}\right)\delta$$

with probability exceeding $1 - n^{-2}$.

Implications: the average entry-wise inaccuracy is bounded above by $O(\frac{n}{m}\delta)$. In practice, EMaC-Noisy usually yields better estimates.
Singular Value Thresholding (Noisy Case)
Several optimized solvers for Hankel matrix completion exist, for example [Fazel et al. 2013, Liu and Vandenberghe 2009].
Algorithm 1: Singular Value Thresholding for EMaC
1: initialize: set M0 = Xe and t = 0.
2: repeat
3:   1) Qt ← Dτt(Mt)   (singular-value thresholding)
4:   2) Mt ← HankelX0(Qt)   (projection onto a Hankel matrix consistent with the observations)
5:   3) t ← t + 1
6: until convergence
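A minimal 1-D Python sketch of this iteration (the fixed threshold τ and iteration count are assumptions for illustration, not the schedule used in the talk):

```python
import numpy as np
from scipy.linalg import hankel, svd

# A sketch of SVT for 1-D EMaC: alternate singular-value soft-thresholding
# of the Hankel form with projection back onto signals matching the data.
def svt_emac_1d(x_obs, mask, k, tau=1.0, n_iter=200):
    n = len(x_obs)
    x = np.where(mask, x_obs, 0).astype(complex)
    for _ in range(n_iter):
        M = hankel(x[:k], x[k - 1:])              # enhanced (Hankel) form
        U, s, Vh = svd(M, full_matrices=False)    # 1) singular-value thresholding
        Q = (U * np.maximum(s - tau, 0)) @ Vh
        # 2) project back to a Hankel-generating signal by averaging Q along
        #    anti-diagonals, then re-impose the observed entries
        y, cnt = np.zeros(n, complex), np.zeros(n)
        for i in range(k):
            for j in range(n - k + 1):
                y[i + j] += Q[i, j]
                cnt[i + j] += 1
        x = y / cnt
        x[mask] = x_obs[mask]
    return x
```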
[Figure 2: true vs. reconstructed signal (amplitude vs. vectorized time); dimension 101 × 101, r = 30, m/(n1n2) = 5.8%, SNR = 10 dB.]
Robustness to Sparse Outliers
What if a constant portion of measurements are arbitrarily corrupted?
[Figure: clean signal and noisy samples (real part vs. data index).]
– Robust PCA approach [Candes et al. 2011, Chandrasekaran et al. 2011].
– Solve the following program:

$$\text{(RobustEMaC)}: \quad \min_{M, S\in\mathbb{C}^{n_1\times n_2}} \ \|M_e\|_* + \lambda \|S_e\|_1 \quad \text{subject to} \quad (M + S)_{i,l} = X^{\text{corrupted}}_{i,l},\ \forall (i,l)\in\Omega$$
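A cvxpy sketch of RobustEMaC, reusing the `enhance_expr` helper from the EMaC sketch above (again an illustration, not the authors' solver); the choice of λ anticipates the theorem on the next slide:

```python
import cvxpy as cp
import numpy as np

# A minimal sketch of RobustEMaC: nuclear norm on M_e plus an entrywise
# l1 penalty on S_e, with observed entries split as M + S.
def robust_emac(X_corrupted, Omega, k1, k2):
    n1, n2 = X_corrupted.shape
    m, n = int(Omega.sum()), n1 * n2
    lam = 1.0 / np.sqrt(m * np.log(n))
    M = cp.Variable((n1, n2), complex=True)
    S = cp.Variable((n1, n2), complex=True)
    Me = enhance_expr(M, n1, n2, k1, k2)
    Se = enhance_expr(S, n1, n2, k1, k2)
    objective = cp.normNuc(Me) + lam * cp.sum(cp.abs(Se))   # entrywise l1 norm
    constraints = [M[i, j] + S[i, j] == X_corrupted[i, j]
                   for i, j in np.argwhere(Omega)]
    cp.Problem(cp.Minimize(objective), constraints).solve()
    return M.value, S.value
```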
Theoretical Guarantees for Robust Recovery

Theorem [Chen and Chi, 2013] (Sparse Outliers): Set $n = n_1 n_2$ and $\lambda = \frac{1}{\sqrt{m \log n}}$. Let the fraction of corrupted entries be $s \le 20\%$, selected uniformly at random. Then RobustEMaC recovers X with high probability if

$$m > C \mu r^2 \log^3 n,$$

where C is some universal constant.

Implications:

– slightly more samples: $m \sim \Theta(r^2 \log^3 n)$;
– robust to a constant portion of outliers: $s \sim \Theta(m)$.

In summary, EMaC achieves robust recovery with respect to dense and sparse errors from a near-optimal number of samples.

[Figure: clean signal, recovered signal, and recovered corruptions (real part vs. data index).]
Phase Transition for Line Spectrum Estimation
Fix the amount of corruption as 10% of the total number of samples:
[Phase transition diagram: x-axis m (number of samples), y-axis r (sparsity level).]

Figure 3: Phase transition diagram where spike locations are randomly generated, shown for the case n = 125.
Related Work
Cadzow's denoising with full observation: a non-convex heuristic to denoise line spectrum data based on the Hankel form.
Atomic norm minimization with random observations: recently proposed by [Tang et al., 2013] for compressive line spectrum estimation off the grid:

$$\min_{s} \ \|s\|_{\mathcal{A}} \quad \text{subject to} \quad P_\Omega(s) = P_\Omega(x),$$

where the atomic norm is defined as $\|x\|_{\mathcal{A}} = \inf\left\{\sum_i |d_i| : x(t) = \sum_i d_i e^{j2\pi f_i t}\right\}$.

– Random signal model: if the frequencies are separated by 4 times the Rayleigh limit and the phases are random, then perfect recovery with O(r log r log n) samples;
– no stability result with random observations;
– extendable to multi-dimensional frequencies [Chi and Chen, 2013], but the SDP characterization is more complicated [Xu et al. 2013].
(Numerical) Comparison with Atomic Norm Minimization
Phase transition for 1D spectrum estimation: the phase transition for atomic norm minimization is very sensitive to the separation condition. EMaC, in contrast, is insensitive to the separation.
[Three phase transition diagrams: x-axis m (number of samples), y-axis r (sparsity level); panels (a), (b), (c).]

Figure: phase transition for atomic norm minimization without separation (a) and with separation (b), and EMaC without separation (c). The inclusion of separation doesn't change the phase transition of EMaC.
(Numerical) Comparison with Atomic Norm Minimization
Phase transition for 2D spectrum estimation: the phase transition for atomic norm minimization is very sensitive to the separation condition. EMaC, in contrast, is insensitive to the separation. Here the problem dimension n1 = n2 = 8 is relatively small, and the atomic norm minimization approach seems favorable regardless of separation.
[Three phase transition diagrams: x-axis m (number of samples), y-axis r (sparsity level); panels (a), (b), (c).]

Figure: phase transition for atomic norm minimization without separation (a) and with separation (b), and EMaC without separation (c). The inclusion of separation doesn't change the phase transition of EMaC.
Extension to Multiple Measurement Vectors Model
When multiple snapshots are available, it is possible to exploit the covariance structure to reduce the number of sensors. Without loss of generality, consider 1D:

$$x_\ell(t) = \sum_{i=1}^{r} d_{i,\ell}\, e^{j2\pi t f_i}, \qquad t \in \{0, 1, \ldots, n-1\},$$

where $x_\ell = [x_\ell(0), x_\ell(1), \ldots, x_\ell(n-1)]^T$, $\ell = 1, \ldots, L$.

We assume the coefficients $d_{i,\ell} \sim \mathcal{CN}(0, \sigma_i^2)$; then the covariance matrix

$$\Sigma = \mathbb{E}[x_\ell x_\ell^H] = \mathrm{toep}(u)$$

is a PSD Toeplitz matrix with $\mathrm{rank}(\Sigma) = r$.

The frequencies can be estimated without separation from u using $\ell_1$ minimization with nonnegative constraints [Donoho and Tanner, 2005].
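A small Python sketch verifying this structure (random frequencies and variances, chosen here for illustration):

```python
import numpy as np
from scipy.linalg import toeplitz

# A sketch: with d_{i,l} ~ CN(0, sigma_i^2), Sigma = A diag(sigma^2) A^H
# is PSD Toeplitz with rank r, where A is the n x r Vandermonde matrix.
rng = np.random.default_rng(0)
n, r = 32, 3
f = rng.random(r)
sigma2 = rng.random(r) + 0.5
A = np.exp(2j * np.pi * np.outer(np.arange(n), f))
Sigma = A @ np.diag(sigma2) @ A.conj().T
u = Sigma[:, 0]                                  # first column defines toep(u)
print(np.allclose(Sigma, toeplitz(u)),           # True: Hermitian Toeplitz
      np.linalg.matrix_rank(Sigma))              # r
```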
Observation with the Sparse Ruler
Observation pattern: instead of random observations, we assume a deterministic observation pattern Ω over a (minimum) sparse ruler for all snapshots:

$$y_\ell = x_{\Omega,\ell} = \{x_\ell(t),\ t \in \Omega\}, \qquad \ell = 1, \ldots, L.$$

Sparse ruler in 1D: for $\Omega \subseteq \{0, \ldots, n-1\}$,

– define the difference set $\Delta = \{|i - j| : i, j \in \Omega\}$;
– Ω is called a length-n sparse ruler if $\Delta = \{0, \ldots, n-1\}$ (see the sketch below);
– examples:
  ∗ when n = 21, Ω = {0, 1, 2, 6, 7, 8, 17, 20};
  ∗ nested arrays, co-prime arrays [Pal and Vaidyanathan, 2010, 2011];
  ∗ minimum sparse rulers [Redei and Renyi, 1949].

Roughly $|\Omega| = O(\sqrt{n})$.
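A minimal sketch of the sparse ruler check, verifying the n = 21 example above:

```python
# A sketch of the sparse ruler property: the difference set of Omega
# must cover {0, ..., n-1}.
def is_sparse_ruler(omega, n):
    return {abs(i - j) for i in omega for j in omega} == set(range(n))

print(is_sparse_ruler({0, 1, 2, 6, 7, 8, 17, 20}, 21))  # True
```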
Covariance Estimation on Sparse Ruler Entries
Consider the observation on Ω = {0, 1, 2, 5, 8}:

$$\mathbb{E}[y_\ell y_\ell^H] = \mathbb{E}\begin{bmatrix} x_\ell(0) \\ x_\ell(1) \\ x_\ell(2) \\ x_\ell(5) \\ x_\ell(8) \end{bmatrix} \begin{bmatrix} x_\ell(0)^H & x_\ell(1)^H & x_\ell(2)^H & x_\ell(5)^H & x_\ell(8)^H \end{bmatrix} = \begin{bmatrix} u_0 & u_1^H & u_2^H & u_5^H & u_8^H \\ u_1 & u_0 & u_1^H & u_4^H & u_7^H \\ u_2 & u_1 & u_0 & u_3^H & u_6^H \\ u_5 & u_4 & u_3 & u_0 & u_3^H \\ u_8 & u_7 & u_6 & u_3 & u_0 \end{bmatrix},$$

which exposes every entry of u, and hence gives the exact full covariance matrix Σ = toep(u), in the absence of noise with an infinite number of snapshots.
Two-step Structured Covariance Estimation
In practice, measurements are noisy with a finite number of snapshots:

$$y_\ell = x_{\Omega,\ell} + w_\ell, \qquad \ell = 1, \ldots, L,$$

where $w_\ell \sim \mathcal{CN}(0, \sigma^2 I)$.

Two-step covariance estimation:

– form the sample covariance matrix of $y_\ell$:

$$\hat{\Sigma}_{\Omega,L} = \frac{1}{L} \sum_{\ell=1}^{L} y_\ell y_\ell^H;$$

– determine the Toeplitz covariance matrix with an SDP:

$$\hat{u} = \arg\min_{u:\ \mathrm{toep}(u) \succeq 0} \ \frac{1}{2}\left\|P_\Omega(\mathrm{toep}(u)) - \hat{\Sigma}_{\Omega,L}\right\|_F^2 + \lambda\, \mathrm{Tr}(\mathrm{toep}(u)),$$

where λ is some regularization parameter.
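A cvxpy sketch of the second step (an illustration, not the authors' solver); the Toeplitz structure is imposed by tying the diagonals of a Hermitian PSD variable together:

```python
import cvxpy as cp
import numpy as np

# A minimal sketch of the structured covariance SDP on sparse ruler entries.
def fit_toeplitz_cov(Sigma_hat, omega, n, lam):
    idx = np.array(sorted(omega))
    T = cp.Variable((n, n), hermitian=True)
    constraints = [T >> 0]
    # Toeplitz structure: entries are equal along each diagonal
    for k in range(n):
        for i in range(1, n - k):
            constraints.append(T[i + k, i] == T[k, 0])
    # P_Omega restricts toep(u) to the rows/columns indexed by Omega
    loss = 0.5 * cp.sum_squares(T[idx, :][:, idx] - Sigma_hat)
    cp.Problem(cp.Minimize(loss + lam * cp.real(cp.trace(T))),
               constraints).solve()
    return T.value[:, 0]        # u: the first column of toep(u)
```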
Theoretical Guarantee
Theorem [Li and Chi]: Let $u^\star$ be the ground truth. Set

$$\lambda \ge C \max\left\{\sqrt{\frac{r \log(Ln)}{L}},\ \frac{r \log(Ln)}{L}\right\} \left\|\Sigma^\star_\Omega\right\|$$

with $\Sigma^\star_\Omega = \mathbb{E}[y_m y_m^H]$, for some constant C. Then, with probability at least $1 - L^{-1}$, the solution satisfies

$$\frac{1}{\sqrt{n}}\|\hat{u} - u^\star\|_F \le 16\lambda\sqrt{r}$$

if Ω is a complete sparse ruler.

Remark:

– $\frac{1}{\sqrt{n}}\|\hat{u} - u^\star\|_F$ is small as soon as $L \gtrsim O(r^2 \log n)$;
– as the rank r (as large as n) can be larger than $|\Omega|$ (as small as $\sqrt{n}$), this allows frequency estimation even when the snapshots cannot be recovered.
Numerical Simulations

[Figure: recovered spectra for the ground truth, CS with DFT frame, correlation-aware estimation with DFT frame, atomic norm minimization, and structured covariance estimation.]

The algorithm also applies to other observation patterns, e.g. random. Setting: n = 64, L = 400, |Ω| = 5, and r = 6.
Final Remarks
Sparse parameter estimation is possible by leveraging the shift-invariance structure embedded in the matrix pencil together with recent matrix completion techniques;

Fundamental performance is determined by the proximity of the frequencies, as measured by the condition number of the Gram matrix formed by sampling the Dirichlet kernel;

Recovering more lines than the number of sensors is made possible by exploiting second-order statistics;

Future work: how to compare conventional algorithms with convex optimization approaches?
Q&A
Publications available on arXiv:
Robust Spectral Compressed Sensing via Structured Matrix Completion, IEEE Trans. Information Theory, http://arxiv.org/abs/1304.8126

Off-the-Grid Line Spectrum Denoising and Estimation with Multiple Measurement Vectors, submitted, http://arxiv.org/abs/1408.2242
Thank You! Questions?