Richard Baraniuk
Rice University
Compressive Sensing
Acknowledgements
• For assistance preparing this presentation
• Rice DSP group
– Petros Boufounos, Volkan Cevher
– Mark Davenport, Marco Duarte, Chinmay Hegde, Jason Laska, Shri Sarvotham, …
• Mike Wakin, University of Michigan
– geometry of CS, embeddings
• Justin Romberg, Georgia Tech
– optimization algorithms
Agenda
• Introduction to Compressive Sensing (CS)
– motivation
– basic concepts
• CS Theoretical Foundation
– geometry of sparse and compressible signals
– coded acquisition
– restricted isometry property (RIP)
– signal recovery
• CS Applications
• Related concepts and work
Pressure is on Digital Sensors
• Success of digital data acquisition is placing increasing pressure on signal/image processing hardware and software to support
– higher resolution / denser sampling
» ADCs, cameras, imaging systems, microarrays, …
– large numbers of sensors
» image databases, camera arrays, distributed wireless sensor networks, …
– increasing numbers of modalities
» acoustic, RF, visual, IR, UV, x-ray, gamma ray, …
• Deluge of data
» how to acquire, store, fuse, process efficiently?
Digital Data Acquisition
• Foundation: Shannon sampling theorem
“if you sample densely enough (at the Nyquist rate), you can perfectly reconstruct the original analog data”
[Figure: sampling in time and in space]
Sensing by Sampling
• Long-established paradigm for digital data acquisition
– uniformly sample data at Nyquist rate (2x Fourier bandwidth)
[Diagram: sample → too much data!]
– compress data
[Diagram: sample → compress (JPEG, JPEG2000, …) → transmit/store → receive → decompress]
Sparsity / Compressibility
[Figure: image pixels → large wavelet coefficients (blue = 0)]
[Figure: wideband signal samples → large Gabor (TF) coefficients; axes: time, frequency]
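To make the idea of compressibility concrete, here is a minimal sketch that keeps only the K largest transform coefficients of a smooth signal and measures the relative error. The DCT basis, the test signal, and the names `dct_matrix` and `k_term_error` are illustrative assumptions; the slides use wavelet and Gabor transforms instead.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix; row k is the k-th basis vector."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C

N = 256
t = np.linspace(0.0, 1.0, N)
# Assumed smooth test signal (not from the slides).
signal = np.exp(-3 * t) + 0.5 * np.sin(2 * np.pi * 4 * t)

Psi = dct_matrix(N)
coeffs = Psi @ signal  # analysis: transform coefficients

def k_term_error(K):
    """Relative error after keeping only the K largest-magnitude coefficients."""
    keep = np.argsort(np.abs(coeffs))[-K:]
    approx_coeffs = np.zeros(N)
    approx_coeffs[keep] = coeffs[keep]
    approx = Psi.T @ approx_coeffs  # synthesis with the orthonormal basis
    return np.linalg.norm(signal - approx) / np.linalg.norm(signal)

errors = {K: k_term_error(K) for K in (4, 16, 64, N)}
for K, err in errors.items():
    print(f"K = {K:3d}: relative error = {err:.2e}")
```

Because the basis is orthonormal, the error can only shrink as K grows, and it vanishes when all N coefficients are kept.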
Sample / Compress
• Long-established paradigm for digital data acquisition
– uniformly sample data at Nyquist rate
– compress data
[Diagram: sample → compress (sparse/compressible wavelet transform) → transmit/store → receive → decompress]
What’s Wrong with this Picture?
• Why go to all the work to acquire N samples only to discard all but K pieces of data?
What’s Wrong with this Picture?
• Sample: linear processing, linear signal model (bandlimited subspace)
• Compress: nonlinear processing, nonlinear signal model (union of subspaces)
[Diagram: sample → compress (sparse/compressible wavelet transform) → transmit/store → receive → decompress]
Compressive Sensing
• Directly acquire “compressed” data
• Replace samples by more general “measurements”
[Diagram: compressive sensing → transmit/store → receive → reconstruct]
Compressive Sensing
Theory I: Geometrical Perspective
Sampling
• Signal x is K-sparse in basis/dictionary Ψ
– WLOG assume sparse in space domain
[Diagram: sparse signal x with K nonzero entries]
Compressive Sampling
• When data is sparse/compressible, can directly acquire a condensed representation with no/little information loss through linear dimensionality reduction
[Diagram: measurements y = Φx; sparse signal x with K nonzero entries]
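The acquisition step can be sketched in a few lines. This is only an illustration under assumed sizes (N = 1000, M = 100, K = 10) with an iid Gaussian Φ; none of these values come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 1000, 100, 10  # assumed signal length, measurements, sparsity

# K-sparse signal: K nonzero entries at random locations.
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.standard_normal(K)

# Linear dimensionality reduction with a random M x N matrix.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x

# Only the K columns of Phi on the support of x contribute to y.
y_from_support = Phi[:, support] @ x[support]
print(y.shape, np.allclose(y, y_from_support))
```

The last check shows why Φ behaves like an M×K matrix once the support of x is fixed.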
How Can It Work?
• Projection Φ not full rank…
… and so loses information in general
• Ex: Infinitely many x’s map to the same y
• But we are only interested in sparse vectors
• Φ is effectively M×K
[Diagram: the K columns of Φ that meet the nonzero entries of x]
• Design Φ so that each of its M×K submatrices is full rank
How Can It Work?
• Goal: Design Φ so that its M×2K submatrices are full rank
– difference between two K-sparse vectors is 2K-sparse in general
– preserve information in K-sparse signals
– Restricted Isometry Property (RIP) of order 2K
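For reference, the RIP of order 2K named above is usually stated as below. This is the standard textbook form; the isometry constant δ_{2K} and the vector name v are symbols introduced here, not taken from the slide.

```latex
(1-\delta_{2K})\,\|v\|_2^2 \;\le\; \|\Phi v\|_2^2 \;\le\; (1+\delta_{2K})\,\|v\|_2^2
\qquad \text{for all $2K$-sparse } v, \quad 0 < \delta_{2K} < 1 .
```

Taking v = x1 − x2 for two K-sparse signals shows that distinct K-sparse signals always produce distinct measurements, which is the "preserve information" bullet above.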
Unfortunately…
• Goal: Design Φ so that its M×2K submatrices are full rank (Restricted Isometry Property – RIP)
• Unfortunately, a combinatorial, NP-hard design problem
Insight from the 80’s [Kashin, Gluskin]
• Draw Φ at random
– iid Gaussian
– iid Bernoulli
…
• Then Φ has the RIP with high probability provided $M = O(K \log(N/K))$
– M×2K submatrices are full rank
– stable embedding for sparse signals
– extends to compressible signals in ℓp balls
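A quick numerical sanity check of the claim above: draw one Gaussian Φ and see how much it stretches or shrinks randomly chosen K-sparse vectors. The sizes and number of trials are assumptions. This samples a few supports only; it does not certify the RIP, which would require checking every support.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K, trials = 1000, 256, 10, 200  # assumed sizes

# iid Gaussian matrix, scaled so that E||Phi x||^2 = ||x||^2.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

ratios = np.empty(trials)
for t in range(trials):
    x = np.zeros(N)
    x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
    ratios[t] = np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2

# Ratios near 1 mean the energy of sparse vectors is approximately preserved.
print(f"min ratio = {ratios.min():.3f}, max ratio = {ratios.max():.3f}")
```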
Compressive Data Acquisition
• Measurements y = random linear combinations of the entries of x
• WHP Φ does not distort structure of sparse signals
– no information loss
[Diagram: measurements y = Φx; sparse signal x with K nonzero entries]
CS Signal Recovery
• Goal: Recover signal x from measurements y
• Problem: Random projection Φ not full rank (ill-posed inverse problem)
• Solution: Exploit the sparse/compressible geometry of acquired signal x
Concise Signal Structure
• Sparse signal: only K out of N coordinates nonzero
– model: union of K-dimensional subspaces aligned w/ coordinate axes
• Compressible signal: sorted coordinates decay rapidly to zero
– model: ℓp ball with p ≤ 1
[Plot: coefficient magnitude vs. sorted index, showing power-law decay]
CS Signal Recovery
• Random projection Φ not full rank
• Recovery problem: given y = Φx, find x
• Null space of Φ: (N−M)-dim hyperplane at random angle
• So search in null space for the “best” x according to some criterion
– ex: least squares
CS Signal Recovery
• Recovery (ill-posed inverse problem): given y = Φx, find x (sparse)
• ℓ2: $\hat{x} = \arg\min_{y = \Phi x} \|x\|_2 = \Phi^T(\Phi\Phi^T)^{-1}y$ (pseudoinverse): fast, wrong
Why ℓ2 Doesn’t Work
• For signals sparse in the space/time domain, the least squares, minimum ℓ2 solution is almost never sparse
[Figure: null space of Φ translated to x (random angle)]
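The failure of least squares is easy to see numerically. The sketch below (assumed sizes, Gaussian Φ) computes the minimum-norm solution with the pseudoinverse and counts how many of its entries are non-negligible.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, K = 200, 50, 5  # assumed sizes

x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x

# Minimum l2-norm solution of y = Phi x via the pseudoinverse.
x_l2 = np.linalg.pinv(Phi) @ y

# Count entries that are not negligible relative to the largest one.
num_significant = int(np.sum(np.abs(x_l2) > 1e-6 * np.abs(x_l2).max()))
print(f"true nonzeros: {K}, significant entries in l2 solution: {num_significant}")
print(f"consistent with measurements: {np.allclose(Phi @ x_l2, y)}")
```

The solution explains the measurements and has a norm no larger than that of x, but its energy is spread over essentially all N coordinates.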
CS Signal Recovery
• Reconstruction/decoding (ill-posed inverse problem): given y = Φx, find x
• ℓ2: fast, wrong
• ℓ0: $\hat{x} = \arg\min_{y = \Phi x} \|x\|_0$, where $\|x\|_0$ is the number of nonzero entries (“find sparsest x in translated nullspace”)
– correct: only M = 2K measurements required to stably reconstruct K-sparse signal
– slow: NP-hard problem
CS Signal Recovery
• Recovery (ill-posed inverse problem): given y = Φx, find x (sparse)
• ℓ2: fast, wrong
• ℓ0: correct, slow
• ℓ1: $\hat{x} = \arg\min_{y = \Phi x} \|x\|_1$: correct, efficient, mild oversampling [Candes, Romberg, Tao; Donoho]
– linear program
– number of measurements required: $M = O(K \log(N/K)) \ll N$
Why ℓ1 Works
• For signals sparse in the space/time domain, minimum ℓ1 solution = sparsest solution (with high probability) if $M = O(K \log(N/K)) \ll N$
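The ℓ1 problem can be posed as a linear program by splitting x into nonnegative parts, x = u − v. The sketch below uses `scipy.optimize.linprog` on a small synthetic problem; the sizes, the seed, and the choice of solver are assumptions made for illustration, not the setup used in the talk.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
N, M, K = 128, 64, 5  # assumed sizes

# K-sparse signal with magnitudes in [1, 2] and random signs.
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = (
    rng.choice([-1.0, 1.0], size=K) * (1.0 + rng.random(K))
)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x

# min sum(u) + sum(v)  subject to  Phi (u - v) = y,  u >= 0,  v >= 0.
c = np.ones(2 * N)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_l1 = res.x[:N] - res.x[N:]

print(f"solver success: {res.success}")
print(f"max recovery error: {np.max(np.abs(x_l1 - x)):.2e}")
```

At the optimum, u and v have disjoint supports, so the objective equals the ℓ1 norm of x.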
Universality
• Random measurements can be used for signals sparse in any basis: if x = Ψα with α sparse, then y = Φx = ΦΨα
[Diagram: y = ΦΨα; sparse coefficient vector α with K nonzero entries]
Compressive Sensing
• Directly acquire “compressed” data
• Replace N samples by M random projections
[Diagram: random measurements → transmit/store → receive → linear program]
Compressive Sensing
Theory II: Stable Embedding
Johnson-Lindenstrauss Lemma
• JL Lemma: random projection stably embeds a cloud of Q points whp provided $M = O(\log Q)$
• Proved via concentration inequality
• Same techniques link JLL to RIP [B, Davenport, DeVore, Wakin, Constructive Approximation, 2008]
[Figure: cloud of Q points]
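A small experiment illustrates the lemma: project a random point cloud and compare all pairwise squared distances before and after. The sizes and the Gaussian point cloud are assumptions; `pairwise_sq_dists` is a helper written for this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, Q = 2000, 400, 50  # assumed ambient dimension, projected dimension, points

points = rng.standard_normal((Q, N))
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
projected = points @ Phi.T  # each row is Phi applied to one point

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    G = X @ X.T
    sq = np.diag(G)
    return sq[:, None] + sq[None, :] - 2 * G

iu = np.triu_indices(Q, k=1)  # each unordered pair once
ratios = pairwise_sq_dists(projected)[iu] / pairwise_sq_dists(points)[iu]
print(f"squared-distance ratios: min = {ratios.min():.3f}, max = {ratios.max():.3f}")
```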
Connecting JL to RIP
Consider effect of random JL Φ on each K-plane
– construct covering of points Q on unit sphere
– JL: isometry for each point with high probability
– union bound ⇒ isometry for all points q in Q
– extend to isometry for all x in K-plane
– union bound ⇒ isometry for all K-planes
[Figure: K-planes]
Favorable JL Distributions
• Gaussian
• Bernoulli/Rademacher [Achlioptas]
• “Database-friendly” [Achlioptas]
• Random Orthoprojection to R^M [Gupta, Dasgupta]
RIP as a “Stable” Embedding
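• RIP of order 2K implies: for all K-sparse x1 and x2,
$(1-\delta_{2K}) \le \frac{\|\Phi x_1 - \Phi x_2\|_2^2}{\|x_1 - x_2\|_2^2} \le (1+\delta_{2K})$
[Figure: K-planes]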
Connecting JL to RIP [with R. DeVore, M. Davenport, R. Baraniuk]
• Theorem: Supposing Φ is drawn from a JL-favorable distribution,* then with probability at least $1 - e^{-C_1 M}$, Φ meets the RIP with $K \le C_2 \cdot M / \log(N/K)$
* Gaussian/Bernoulli/database-friendly/orthoprojector
• Bonus: universality (repeat argument for any ΦΨ)
• See also Mendelson et al. concerning subgaussian ensembles
Compressive Sensing
Recovery Algorithms
CS Recovery Algorithms
• Convex optimization: [with or without noise]
– Linear programming
– BPDN
– GPSR
– FPC
– Bregman iteration
– Dantzig selector, …
• Iterative greedy algorithms
– Matching Pursuit (MP)
– Orthogonal Matching Pursuit (OMP)
– StOMP
– ROMP
– CoSaMP, …
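As one concrete instance of the greedy family, here is a minimal sketch of Orthogonal Matching Pursuit on a noiseless synthetic problem. The function name `omp`, the sizes, and the assumption that K is known are choices made for this illustration.

```python
import numpy as np

def omp(Phi, y, K):
    """Orthogonal Matching Pursuit: greedily pick K columns of Phi to explain y."""
    N = Phi.shape[1]
    support = []
    residual = y.copy()
    coef = np.zeros(0)
    for _ in range(K):
        # Column most correlated with the current residual.
        correlations = np.abs(Phi.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick a column twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit on the chosen columns, then update the residual.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(N)
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(5)
N, M, K = 256, 128, 5  # assumed sizes
x = np.zeros(N)
x[rng.choice(N, size=K, replace=False)] = (
    rng.choice([-1.0, 1.0], size=K) * (1.0 + rng.random(K))
)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
x_omp = omp(Phi, Phi @ x, K)
print(f"max recovery error: {np.max(np.abs(x_omp - x)):.2e}")
```

Compared with the linear-programming route, each iteration costs one correlation and one small least-squares solve, which is why greedy methods are attractive when K is small.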
Compressive Sensing
Theory Summary
CS Hallmarks
• CS changes the rules of the data acquisition game
– exploits a priori signal sparsity information
• Stable
– acquisition/recovery process is numerically stable
• Universal
– same random projections / hardware can be used for any compressible signal class (generic)
• Asymmetrical (most processing at decoder)
– conventional: smart encoder, dumb decoder
– CS: dumb encoder, smart decoder
• Random projections weakly encrypted
CS Hallmarks
• Democratic
– each measurement carries the same amount of information
– robust to measurement loss and quantization
– simple encoding
• Ex: wireless streaming application with data loss
– conventional: complicated (unequal) error protection of compressed data
» DCT/wavelet low frequency coefficients
– CS: merely stream additional measurements and reconstruct using those that arrive safely (fountain-like)
Compressive Sensing In Action
Art
Gerhard Richter, 4096 Farben / 4096 Colours
1974, 254 cm × 254 cm, Lacquer on Canvas, Catalogue Raisonné: 359
Museum Collection: Staatliche Kunstsammlungen Dresden (on loan)
Sales history: 11 May 2004, Christie's New York, Post-War and Contemporary Art (Evening Sale), Lot 34, US$3,703,500
Gerhard Richter, Cologne Cathedral Stained Glass
Compressive Sensing In Action
Cameras
“Single-Pixel” CS Camera (w/ Kevin Kelly)
[Diagram: scene → random pattern on DMD array → single photon detector → image reconstruction or processing]
• Flip mirror array M times to acquire M measurements
• Sparsity-based (linear programming) recovery
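The measurement process of the single-pixel camera can be simulated directly: each mirror pattern selects a random subset of pixels, and the detector reports their summed intensity. The scene, image size, and number of patterns below are made-up illustrative values, and the sketch covers acquisition only, not recovery.

```python
import numpy as np

rng = np.random.default_rng(6)
side = 32
N = side * side
M = 300  # assumed number of mirror patterns / measurements

# Toy scene with two bright rectangles (illustrative only).
scene = np.zeros((side, side))
scene[8:12, 10:20] = 1.0
scene[20:26, 5:9] = 0.5
x = scene.ravel()

# Each row is one random 0/1 mirror pattern on the DMD.
patterns = rng.integers(0, 2, size=(M, N))

# The single detector sums the light from the mirrors that are "on".
y = np.array([np.sum(x[p == 1]) for p in patterns])

# Equivalent +/-1 measurements, given the total scene intensity.
y_pm = 2.0 * y - x.sum()
print(y.shape, np.allclose(y, patterns @ x))
```

The last line of the block relies on the identity (2b − 1)·x = 2 b·x − 1·x, so zero-mean ±1 measurements can be formed from 0/1 patterns plus one extra all-on measurement.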
Single Pixel Camera
[Photo labels: object, LED (light source), DMD+ALP board, lens 1, lens 2, photodiode circuit]
First Image Acquisition
[Figure: target (65536 pixels); reconstructions from 1300 measurements (2%) and 11000 measurements (16%)]
Second Image Acquisition
[Figure: 4096-pixel image; reconstruction from 500 random measurements]
CS Low-Light Imaging with PMT
true color low-light imaging
256 x 256 image with 10:1 compression [Nature Photonics, April 2007]
Hyperspectral Imaging
[Figure: spectrometer output in blue, red, and near IR bands]
Video Acquisition
• Measure via stream of random projections
– shutterless camera
• Reconstruct using sparse model for video structure
– 3-D wavelets (localized in space-time)
[Figure: 3D complex wavelet; axes: space, time]
[Figure: original 64x64x64 video and three reconstructions]
– 3-D wavelet thresholding: 2000 coeffs, MSE = 3.5
– frame-by-frame 2-D CS recon: 20000 coeffs, MSE = 18.4
– joint 3-D CS recon: 20000 coeffs, MSE = 3.2
Miniature DMD-based Cameras
• TI DLP “picoprojector” destined for cell phones
Georgia Tech Analog Imager
• Bottleneck in imager arrays is data readout
• Instead of quantizing pixel values, take CS inner products in analog
• Potential for tremendous (factor of 10000) power savings
[Hassler, Anderson, Romberg, et al]
Other Imaging Apps
• Compressive camera arrays
– sampling rate scales logarithmically in both number of pixels and number of cameras
• Compressive radar and sonar
– simplified receiver
– exploring radar/sonar networks
• Compressive DNA microarrays
– smaller, more agile arrays for bio-sensing
– exploit sparsity in presentation of organisms to array [O. Milenkovic et al]
Compressive Sensing In Action
A/D Converters
Analog-to-Digital Conversion
• Nyquist rate limits reach of today’s ADCs
• “Moore’s Law” for ADCs:
– technology Figure of Merit incorporating sampling rate and dynamic range doubles every 6-8 years
• DARPA Analog-to-Information (A2I) program
– wideband signals have high Nyquist rate but are often sparse/compressible
– develop new ADC technologies to exploit
– new tradeoffs among Nyquist rate, sampling rate, dynamic range, …
[Figure: frequency hopper spectrogram; axes: time, frequency]
Analog-to-Information Conversion
• Sample near signal’s (low) “information rate” rather than its (high) Nyquist rate
• Practical hardware: randomized demodulator (CDMA receiver)
• $M = O(K \log(N/K))$, where M = A2I sampling rate, K = number of tones / window, N = Nyquist bandwidth
Example: Frequency Hopper
[Figure: spectrogram from Nyquist rate sampling vs. sparsogram from 20x sub-Nyquist sampling]
Analog-to-Information Conversion
[Diagram: analog signal → A2I → digital measurements → DSP → information/statistics]
• Analog-to-information (A2I) converter takes analog input signal and creates discrete (digital) measurements
• Extension of analog-to-digital converter that samples at the signal’s “information rate” rather than its Nyquist rate
Streaming Measurements
• Streaming applications: cannot fit entire signal into a processing buffer at one time
– streaming requires special Φ
[Diagram: measurements y = Φx]
Random Demodulator
[Figure: random demodulator block diagram and example waveform plot]
Parallel Architecture
[Diagram: x(t) feeds Z parallel branches; branch z mixes x(t) with p_z(t) and is sampled at t = mT to give y_z[m], for z = 1, …, Z]
[Candes, Romberg, Wakin, et al]
Random Demodulator
• Theorem [Tropp et al 2007]
If the sampling rate satisfies
then locally Fourier K-sparse signals can be recovered exactly with probability
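A discrete-time sketch of the random demodulator, under the usual idealization: multiply the Nyquist-rate samples by a random ±1 chipping sequence, then integrate and dump over blocks of length N/M. The sizes, the tone frequencies, and the matrix names H and Phi are assumptions introduced for this illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
N, M = 512, 64        # assumed Nyquist-rate samples per window, low-rate samples
L = N // M            # integrate-and-dump block length

# Multitone input: a few active frequencies out of N (assumed values).
n = np.arange(N)
tones = [37, 101, 200]
x = sum(np.cos(2 * np.pi * f * n / N) for f in tones)

# Step 1: demodulate with a random +/-1 chipping sequence.
chips = rng.choice([-1.0, 1.0], size=N)

# Step 2: integrate over consecutive blocks and sample at the low rate.
y = (chips * x).reshape(M, L).sum(axis=1)

# The same operation as a single M x N measurement matrix Phi = H D.
H = np.kron(np.eye(M), np.ones((1, L)))  # block-summing matrix
Phi = H @ np.diag(chips)
print(Phi.shape, np.allclose(Phi @ x, y))
```

Each column of Phi has exactly one nonzero entry, which is what makes the structure cheap to realize in hardware compared with a dense random matrix.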
Empirical Results
Example: Frequency Hopper
• Random demodulator AIC at 8x sub-Nyquist
[Figure: spectrogram vs. sparsogram]
Compressive Sensing In Action
Data Processing
Information Scalability
• Many applications involve signal inference and not reconstruction
detection < classification < estimation < reconstruction
(reconstruction is fairly computationally intense)
• Good news: CS supports efficient learning, inference, processing directly on compressive measurements
• Random projections ~ sufficient statistics for signals with concise geometrical structure
Matched Filter
• Detection/classification with K unknown articulation parameters
– Ex: position and pose of a vehicle in an image
– Ex: time delay of a radar signal return
• Matched filter: joint parameter estimation and detection/classification
– compute sufficient statistic for each potential target and articulation
– compare “best” statistics to detect/classify
Matched Filter Geometry
• Detection/classification with K unknown articulation parameters
• Images are points in R^N
• Classify by finding closest target template to data for each class (AWG noise)
– distance or inner product
[Figure: data point and target templates from generative model or training data (points)]
• As template articulation parameter changes, points map out a K-dim nonlinear manifold
• Matched filter classification = closest manifold search
[Figure: articulation parameter space mapped to the manifold; data point]
CS for Manifolds
• Theorem: $M = O(K \log N)$ random measurements stably embed manifold whp [B, Wakin, FOCM ’08]
– related work: [Indyk and Naor, Agarwal et al., Dasgupta and Freund]
• Stable embedding
• Proved via concentration inequality arguments (JLL/CS relation)
• Enables parameter estimation and MF detection/classification directly on compressive measurements
– K very small in many applications (# articulations)
Example: Matched Filter
• Detection/classification with K=3 unknown articulation parameters
1. horizontal translation
2. vertical translation
3. rotation
Smashed Filter
• Detection/classification with K=3 unknown articulation parameters (manifold structure)
• Dimensionally reduced matched filter directly on compressive measurements
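A toy version of the smashed filter, simplified to 1-D signals with a single articulation parameter (circular shift, so K = 1 rather than the K = 3 of the slides). The two templates, the sizes, and the noise level are assumptions for illustration. Classification is a nearest-neighbor search over the projected manifold samples.

```python
import numpy as np

rng = np.random.default_rng(8)
N, M = 128, 30  # assumed signal length and number of measurements
n = np.arange(N)

# Two hypothetical target classes, each articulated by a circular shift.
templates = {
    "box": np.where(n < 16, 1.0, 0.0),
    "triangle": np.clip(1.0 - np.abs(n - 8) / 8.0, 0.0, None),
}
# Sample each class manifold at every integer shift.
manifolds = {
    name: np.stack([np.roll(t, s) for s in range(N)])
    for name, t in templates.items()
}

Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Noisy compressive measurements of an unknown class and shift.
true_class, true_shift = "box", 41
y = Phi @ np.roll(templates[true_class], true_shift) + 0.01 * rng.standard_normal(M)

# Smashed filter: closest point on each projected manifold, then best class.
best = None
for name, manifold_points in manifolds.items():
    dists = np.linalg.norm(manifold_points @ Phi.T - y, axis=1)
    shift = int(np.argmin(dists))
    if best is None or dists[shift] < best[2]:
        best = (name, shift, dists[shift])
est_class, est_shift, _ = best
print(est_class, est_shift)
```

No reconstruction is performed at any point: both the shift estimate and the class decision are made in the M-dimensional measurement space.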
Smashed Filter
• Random shift and rotation (K=3 dim. manifold)
• Noise added to measurements
• Goal: identify most likely position for each image class, then identify most likely class using nearest-neighbor test
[Plots: avg. shift estimate error vs. number of measurements M; classification rate (%) vs. number of measurements M; curves for increasing noise]
Compressive Sensing (Theoretical Aside)
CS for Manifolds
• Random projections preserve information
– Compressive Sensing (sparse signal embeddings)
– Johnson-Lindenstrauss lemma (point cloud embeddings)
• What about manifolds?
[Figure: K-planes]
Random Projections
Manifold Embeddings [Whitney (1936); Sauer et al. (1991)]
• Manifold: K-dimensional, smooth, compact
• M ≥ 2K+1 linear measurements
– necessary for injectivity (in general)
– sufficient for injectivity (w.p. 1) when Φ Gaussian
• But not enough for efficient, robust recovery
Recall: K-planes Embedding
[Figure: K-planes]
• M ≥ 2K linear measurements
– necessary for injectivity
– sufficient for injectivity (w.p. 1) when Φ Gaussian
• But not enough for efficient, robust recovery
Stable Manifold Embedding [with R. Baraniuk]
Theorem:
Let F ⊂ R^N be a compact K-dimensional manifold with
– condition number 1/τ (curvature, self-avoiding)
– volume V
Let Φ be a random M×N orthoprojector with
$M = O\left(\frac{K \log(N V \tau^{-1} \epsilon^{-1}) \log(1/\rho)}{\epsilon^2}\right)$
Then with probability at least 1−ρ, the following statement holds: For every pair x1, x2 ∈ F,
$(1-\epsilon)\sqrt{M/N} \le \frac{\|\Phi x_1 - \Phi x_2\|_2}{\|x_1 - x_2\|_2} \le (1+\epsilon)\sqrt{M/N}$
Stable Manifold Embedding
Sketch of proof:
– construct a sampling of points
» on manifold at fine resolution
» from local tangent spaces
– apply JL to these points
– extend to entire manifold
Implications:
Nonadaptive (even random) linear projections can efficiently capture & preserve structure of manifold
See also: Indyk and Naor, Agarwal et al., Dasgupta and Freund
Manifold Learning
• Given training points in R^N, learn the mapping to the underlying K-dimensional articulation manifold
• ISOMAP, LLE, HLLE, …
• Ex: images of rotating teapot
– articulation space = circle
• Ex: Up/Down Left/Right manifold, K=2 [Tenenbaum, de Silva, Langford]
Compressive Manifold Learning
• ISOMAP algorithm based on geodesic distances between points
• Random measurements preserve these distances
• Theorem: If , then the ISOMAP residual variance in the projected domain is bounded by the additive error factor
[Figure: ISOMAP embeddings of translating disk manifold (K=2): full data (N=4096), M = 100, M = 50, M = 25] [Hegde et al ’08]
Compressive Sensing In Action
Networked Sensing
Multi-Signal CS
• Sensor networks: intra-sensor and inter-sensor correlation
• Can we exploit these to jointly compress?
• Popular approach: collaboration
– inter-sensor communication overhead
• Ongoing challenge in information theory
– Slepian-Wolf coding, …
• One solution: Compressive Sensing
Distributed CS (DCS)
• “Measure separately, reconstruct jointly”
• Ingredients
– models for joint sparsity
– algorithms for joint reconstruction
– theoretical results for measurement savings
• Zero collaboration, trivially scalable, robust
• Low complexity, universal encoding
Real Data Example
• Light Sensing in Intel Berkeley Lab
• 49 sensors, N = 1024 samples each, Ψ = wavelets
[Figure panels: K=100; M=400; M=400]
Multisensor Fusion
• Network of J sensors viewing articulating object
• Collection of images lies on K-dim manifold in
• Complexity/bandwidth for fusion and inference
– naïve w/ no compression
– individual CS (sensor by sensor)
– joint CS (fuse all sensors)
Multi-Camera Vision Apps
• Background subtraction from multi-camera compressive video
– innovations (due to object motion) sparse in space domain
– estimate innovations from random projections of video
– only few measurements required per camera thanks to overlapping field of view
Multi-Camera Vision Apps
• 3-D shape inference from compressive measurements
– estimate innovations from random projections of video
– fuse into 3D shape by exploiting object sparsity in 3D volume
Compressive Sensing
Related Work
Related Work
• Streaming algorithms
• Multiband sampling
• Finite rate of innovation (FROI) sampling
• General sparse approximation
Compressive Sensing
The End
CS – Summary
• Compressive sensing
– integrates sensing, compression, processing
– exploits signal sparsity/compressibility information
– enables new sensing modalities, architectures, systems
• Why CS works: stable embedding for signals with concise geometric structure
– sparse signals | compressible signals | manifolds
• Information scalability for compressive inference
– compressive measurements ~ sufficient statistics
– many fewer measurements may be required to detect/classify/estimate than to reconstruct
dsp.rice.edu/cs