Amit Singer · 2017-08-21 · SNR= 1=40, 26000 images, 10 defocus groups. Amit Singer (Princeton...
Multireference Alignment, Cryo-EM, and XFEL
Amit Singer
Princeton University
Department of Mathematics and Program in Applied and Computational Mathematics
August 16, 2017
Amit Singer (Princeton University) August 2017 1 / 32
Joint work with...
Afonso Bandeira (NYU)
Tamir Bendory (Princeton)
Nicolas Boumal (Princeton)
Roy Lederman (Princeton)
Will Leeb (Princeton)
Tejal Bhamre (Princeton -> Apple)
Joao Pereira (Princeton)
Nir Sharon (Princeton)
Teng Zhang (Central Florida)
Zhizhen (Jane) Zhao (UIUC)
Edgar Dobriban (Stanford -> UPenn)
Lydia Liu (Princeton -> Berkeley)
Multi-reference alignment of 1D periodic signals
[Figure panels: σ = 0, σ = 0.1, σ = 1.2]
High SNR: pairwise alignment succeeds.
Low SNR: pairwise alignment fails. How to use information in many (n > 2) signals?
Can we reconstruct the signal accurately while estimating most shifts poorly?
How many observations are needed for an accurate reconstruction?
Shift invariant features
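$y_i = R_i x + \varepsilon_i$, with $x, y_i, \varepsilon_i \in \mathbb{R}^L$, $\varepsilon_i \sim \mathcal{N}(0, \sigma^2 I_{L \times L})$, $i = 1, 2, \ldots, n$
1. Zero frequency / average pixel value:
$\frac{1}{n} \sum_{i=1}^{n} \hat{y}_i(0) \to \hat{x}(0)$ as $n \to \infty$. Need $n \gtrsim \sigma^2$
2. Power spectrum / autocorrelation:
$\frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i(k)|^2 \to |\hat{x}(k)|^2 + \sigma^2$ as $n \to \infty$. Need $n \gtrsim \sigma^4$
3. Bispectrum / triple correlation (Tukey, 1953):
$\frac{1}{n} \sum_{i=1}^{n} \hat{y}_i(k_1)\hat{y}_i(k_2)\hat{y}_i(-k_1 - k_2) \to \hat{x}(k_1)\hat{x}(k_2)\hat{x}(-k_1 - k_2)$ as $n \to \infty$. Need $n \gtrsim \sigma^6$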
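The shift invariance of these three features can be checked numerically. The following is a minimal sketch, assuming NumPy's unnormalized DFT and a cyclic shift implemented with `np.roll`; the helper name `invariants` and the toy signal are illustrative and not taken from the talk.

```python
import numpy as np

def invariants(y):
    """Return (zero frequency, power spectrum, bispectrum) of a 1D periodic signal."""
    L = len(y)
    Y = np.fft.fft(y)
    k = np.arange(L)
    mean = Y[0]
    power = np.abs(Y) ** 2
    # B[k1, k2] = Y[k1] * Y[k2] * Y[-(k1 + k2) mod L]
    bispec = Y[:, None] * Y[None, :] * Y[(-(k[:, None] + k[None, :])) % L]
    return mean, power, bispec

rng = np.random.default_rng(0)
x = rng.standard_normal(8)      # toy signal (assumption)
shifted = np.roll(x, 3)         # cyclic shift R x

for a, b in zip(invariants(x), invariants(shifted)):
    print(np.allclose(a, b))    # each feature agrees for x and its shift
```

A cyclic shift multiplies the k-th Fourier coefficient by a unit-modulus phase that is linear in k, so the phases cancel in all three products.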
Estimation using shift invariant features
The bispectrum $B_x(k_1, k_2) = \hat{x}(k_1)\hat{x}(k_2)\hat{x}(-k_1 - k_2)$ contains phase information and is invertible (up to global shift)
Sadler, Giannakis (JOSA A 1992)
Kakarala (1992; arXiv 2009)
It is possible to accurately reconstruct the signal from sufficiently many noisy shifted copies for arbitrarily low SNR without estimating the shifts and even when estimation of shifts is poor
Notice that if shifts are known, then $n \gtrsim 1/\mathrm{SNR}$ is sufficient.
Unknown shifts make a big difference: $n \gtrsim 1/\mathrm{SNR}^3$.
No method can succeed with fewer measurements!
Perry, Weed, Bandeira, Rigollet, S (arXiv 2017)
Abbe, Pereira, S (ISIT 2017)
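A heuristic way to see where the cubic scaling comes from, written as a back-of-the-envelope sketch rather than the argument of the cited papers. Write $\hat{y}_i(k)$ for the Fourier coefficients of $y_i$, and assume a signal of fixed energy (so that SNR is proportional to $1/\sigma^2$) and a low-SNR regime in which noise dominates each coefficient:

```latex
\operatorname{Var}\!\left[\hat{y}_i(k_1)\,\hat{y}_i(k_2)\,\hat{y}_i(-k_1-k_2)\right] \approx \sigma^6
\quad\Longrightarrow\quad
\operatorname{Var}\!\left[\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i(k_1)\,\hat{y}_i(k_2)\,\hat{y}_i(-k_1-k_2)\right] \approx \frac{\sigma^6}{n}
```

For the averaged bispectrum to be accurate relative to a signal term of order one, the right-hand side must be small, which gives $n \gtrsim \sigma^6 \propto 1/\mathrm{SNR}^3$. The same counting with one factor gives $\sigma^2$ for the mean and with two factors gives $\sigma^4$ for the power spectrum.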
Maximum Likelihood vs. Invariant Features
[Figure, left panels: recovered signals; legend: signal, invariant features MRA, expectation maximization. Right panel: computation time [s] versus #observations n; legend: invariant features approach, expectation maximization.]
($n = 10^4$, $\sigma = 1$)
Invariant features: only one pass over data, data can come as a stream and not be stored, parallel computation
Expectation-Maximization, Stochastic Gradient Descent: multiple passes over the data, all data needs to be stored, parallel computation.
Bendory, Boumal, Ma, Zhao, S (arXiv 2017)
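The one-pass, streaming character of the invariant features approach can be illustrated with a small accumulator. This is a sketch under stated assumptions, not the authors' implementation: the class name `StreamingInvariants` is hypothetical, the noise level σ is assumed known, the shifts are drawn uniformly, and the orthonormal DFT is used so that each noise coefficient has variance σ². Only the mean and power spectrum are accumulated here; a bispectrum accumulator would follow the same pattern.

```python
import numpy as np

class StreamingInvariants:
    """One-pass accumulator for the mean and power spectrum (hypothetical helper)."""

    def __init__(self, L, sigma):
        self.sigma = sigma
        self.n = 0
        self.mean_sum = 0.0
        self.power_sum = np.zeros(L)

    def update(self, y):
        # Orthonormal DFT, so each noise coefficient has variance sigma^2.
        Y = np.fft.fft(y, norm="ortho")
        self.n += 1
        self.mean_sum += Y[0].real
        self.power_sum += np.abs(Y) ** 2

    def estimates(self):
        mean_est = self.mean_sum / self.n
        power_est = self.power_sum / self.n - self.sigma ** 2  # remove the noise bias
        return mean_est, power_est

rng = np.random.default_rng(0)
L, sigma, n = 16, 1.0, 20000
x = rng.standard_normal(L)                  # toy signal (assumption)
acc = StreamingInvariants(L, sigma)
for _ in range(n):
    y = np.roll(x, rng.integers(L)) + sigma * rng.standard_normal(L)
    acc.update(y)                           # y can be discarded after this call
mean_est, power_est = acc.estimates()
```

Memory use is independent of n, and accumulators built on separate data chunks can be merged by adding their sums, which is what makes the computation parallel.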
Non-uniform but unknown distribution of shifts
[Figure: recovered signals; legend: true signal, gradient-based algorithm, expectation maximization]
($n = 10^5$, $\sigma = 2$, $\Pr\{R_i = t\} \propto \exp(-\tfrac{1}{2} t^2 / 5^2)$).
Non-uniform is easier than uniform (only second-order moments): $n \gtrsim \sigma^4$
Moments-based method is faster and more accurate than EM
Heterogeneity from invariant features
[Figure: plot panels comparing estimated signal and true signal]
($n = 10^6$, $\sigma = 1.5$)
Two step procedure:
1. Compute invariant features averages over entire data
2. Find two (or more) signals and their population proportions that agree with the computed averages. No clustering!
Why does it work? Count degrees of freedom
Main motivation: Cryo-EM
In a basement room, deep in the bowels of a steel-clad building in Cambridge, a major insurgency is under way.
A hulking metal box, some three metres tall, is quietly beaming terabytes’ worth of data through thick orange cables that disappear off through the ceiling. It is one of the world’s most advanced cryo-electron microscopes: a device that uses electron beams to photograph frozen biological molecules and lay bare their molecular shapes. The microscope is so sensitive that a shout can ruin an experiment, says Sjors Scheres, a structural biologist at the UK Medical Research Council Laboratory of Molecular Biology (LMB), as he stands dwarfed beside the £5-million (US$7.7-million) piece of equipment. “The UK needs many more of these, because there’s going to be a boom,” he predicts.
In labs around the world, cryo-electron microscopes such as this one are sending tremors through the field of structural biology. In the past three years, they have revealed exquisite details of protein-making ribosomes, quivering membrane proteins and other key cell molecules,
THE REVOLUTION WILL NOT BE CRYSTALLIZED
MOVE OVER X-RAY CRYSTALLOGRAPHY. CRYO-ELECTRON MICROSCOPY IS KICKING UP A STORM IN STRUCTURAL BIOLOGY BY REVEALING THE HIDDEN MACHINERY OF THE CELL.
BY EWEN CALLAWAY
ILLUSTRATION BY VIKTOR KOEN
172 | NATURE | VOL 525 | 10 SEPTEMBER 2015
© 2015 Macmillan Publishers Limited. All rights reserved
www.sciencemag.org SCIENCE VOL 343 28 MARCH 2014 1443
The Resolution Revolution
BIOCHEMISTRY
Werner Kühlbrandt
Advances in detector technology and image processing are yielding high-resolution electron cryo-microscopy structures of biomolecules.
Precise knowledge of the structure of macromolecules in the cell is essential for understanding how they function. Structures of large macromolecules can now be obtained at near-atomic resolution by averaging thousands of electron microscope images recorded before radiation damage accumulates. This is what Amunts et al. have done in their research article on page 1485 of this issue (1), reporting the structure of the large subunit of the mitochondrial ribosome at 3.2 Å resolution by electron cryo-microscopy (cryo-EM). Together with other recent high-resolution cryo-EM structures (2–4) (see the figure), this achievement heralds the beginning of a new era in molecular biology, where structures at near-atomic resolution are no longer the prerogative of x-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy.
Ribosomes are ancient, massive protein-RNA complexes that translate the linear genetic code into three-dimensional proteins. Mitochondria—semi-autonomous organelles
Near-atomic resolution with cryo-EM. (A) The large subunit of the yeast mitochondrial ribosome at 3.2 Å reported by Amunts et al. In the detailed view below, the base pairs of an RNA double helix and a magnesium ion (blue) are clearly resolved. (B) TRPV1 ion channel at 3.4 Å (2), with a detailed view of residues lining the ion pore on the four-fold axis of the tetrameric channel. (C) F420-reducing [NiFe] hydrogenase at 3.36 Å (3). The detail shows an α helix in the FrhA subunit with resolved side chains. The maps are not drawn to scale.
January 2016 Volume 13 No 1
Single-particle cryo-electron microscopy (cryo-EM) is our choice for Method of the Year 2015 for its newfound ability to solve protein structures at near-atomic resolution. Featured is the 2.2-Å cryo-EM structure of β-galactosidase as recently reported by Bartesaghi et al. (Science 348, 1147-1151, 2015). Cover design by Erin Dewalt.
Main motivation: Cryo-EM
Why cryo-electron microscopy?
Mapping the structure of molecules without crystallizing them
Imaging of heterogeneous samples, with mixtures of molecules or multiple conformations
Why now?
Advancements in detector technology have led to near-atomic resolution mapping
How does it work?
Schematic drawing of the imaging process:
The basic cryo-EM problem:
Image formation model and inverse problems
Projection images $I_i(x, y) = T_i * \int_{-\infty}^{\infty} \phi(x R_i^1 + y R_i^2 + z R_i^3)\, dz$ + “noise”.
$n$ images ($i = 1, \ldots, n$), images of size $N \times N$ pixels
$\phi : \mathbb{R}^3 \to \mathbb{R}$ is the scattering density of the molecule.
Cryo-EM basic problem: Estimate $\phi$ given $I_1, \ldots, I_n$.
The heterogeneity problem: Estimate $\phi_1, \ldots, \phi_n$ given $I_1, \ldots, I_n$.
Kam’s method
Why was Kam’s method mostly forgotten?
Idea that was ahead of its time: There was not enough data to accurately calculate second and third order statistics.
Requires uniform distribution of viewing directions.
Maximum likelihood framework prevailed.
Fourier projection-slice theorem
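The theorem states that the 2D Fourier transform of a projection image equals the central slice of the 3D Fourier transform of the volume perpendicular to the projection direction. A minimal discrete check, assuming a random toy volume on an 8×8×8 grid and projection along the z axis (no rotation, CTF, or noise):

```python
import numpy as np

rng = np.random.default_rng(0)
vol = rng.standard_normal((8, 8, 8))         # toy density sampled on an (x, y, z) grid

proj = vol.sum(axis=2)                       # projection: integrate along z
proj_ft = np.fft.fft2(proj)                  # 2D Fourier transform of the projection

slice_from_3d = np.fft.fftn(vol)[:, :, 0]    # k_z = 0 plane of the 3D Fourier transform

print(np.allclose(proj_ft, slice_from_3d))   # the two arrays coincide
```

For a general viewing direction the slice is the plane through the origin orthogonal to that direction, which on a grid requires interpolation; the axis-aligned case above avoids that and holds exactly.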
Kam’s method revisited
Spherical harmonics expansion
$\phi(k, \theta, \varphi) = \sum_{\ell=0}^{L} \sum_{m=-\ell}^{\ell} A_{\ell m}(k) Y_\ell^m(\theta, \varphi)$
PCA / 2D covariance gives for each $\ell$ the $A_{\ell m}$'s up to an orthogonal matrix of size $(2\ell + 1) \times (2\ell + 1)$ through
$C_\ell(k_1, k_2) = \sum_{m=-\ell}^{\ell} A_{\ell m}(k_1) A_{\ell m}^*(k_2)$, or $C_\ell = A_\ell A_\ell^*$
$C_\ell$ is the analogue of power spectrum. Autocorrelation of 3D structure is obtained through Fourier slice theorem.
From phase retrieval to orthogonal matrix retrieval
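$C_\ell = A_\ell A_\ell^T = A_\ell O_\ell O_\ell^T A_\ell^T$
Cholesky decomposition of $C_\ell$ determines $A_\ell$ up to a $(2\ell + 1) \times (2\ell + 1)$ orthogonal matrix $O_\ell$.
Orthogonal matrix retrieval: Bispectrum / homology modelling.
Homology: Compute $C_\ell = F_\ell F_\ell^T$, want $A_\ell = F_\ell O_\ell$ such that $A_\ell \approx B_\ell$
$O_\ell = \operatorname{argmin}_{O \in O(2\ell+1)} \|F_\ell O - B_\ell\|_F^2$
Closed form solution using singular value decomposition:
$O_\ell = V_\ell U_\ell^T$ where $B_\ell^T F_\ell = U_\ell \Sigma_\ell V_\ell^T$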
Bhamre, Zhang, S (ISBI 2015)
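The closed-form solution is the orthogonal Procrustes solution and is easy to try on a toy example. The sketch below is an illustration under assumptions, not the authors' code: the function name `orthogonal_retrieval` is hypothetical, the coefficient matrix is a random square real matrix (so that the Cholesky factor exists), and the homologous structure is idealized as equal to the truth, B = A.

```python
import numpy as np

def orthogonal_retrieval(F, B):
    """Return the orthogonal O minimizing ||F O - B||_F, via the SVD of B^T F."""
    U, _, Vt = np.linalg.svd(B.T @ F)
    return Vt.T @ U.T                      # O = V U^T

rng = np.random.default_rng(0)
ell = 2
d = 2 * ell + 1
# Toy "true" coefficients; the identity shift keeps the example well conditioned.
A = rng.standard_normal((d, d)) + 3.0 * np.eye(d)

C = A @ A.T                                # what the 2D covariance provides
F = np.linalg.cholesky(C)                  # C = F F^T, so A = F O for some orthogonal O

B = A                                      # idealized homologous structure (assumption)
O = orthogonal_retrieval(F, B)
A_rec = F @ O

print(np.allclose(O @ O.T, np.eye(d)), np.allclose(A_rec, A))
```

With a real homologous structure B differs from A, so F O only approximates A; how good that approximation is depends on how similar the two structures are.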
Orthogonal matrix retrieval via homology modeling
Bhamre, Zhang, S (arXiv, 2017)
[Figure, top: EMDB 8118 (a), EMDB 8117 (b). Bottom: Homologous structure (a), Ground Truth (b), Least Squares (c), Twicing (d), Anisotropic Twicing (e).]
Synthetic Dataset: TRPV1 with DxTx and RTX.
SNR = 1/40, 26000 images, 10 defocus groups.
Improved covariance estimation
Steerable PCA: Covariance matrix commutes with in-plane rotations, hence block-diagonalized in Fourier-Bessel basis (or any other angular Fourier basis)
Zhao, Shkolnisky, S (IEEE Computational Imaging, 2016)
CTF correction
Eigenvalue shrinkage
Bhamre, Zhang, S (Journal Structural Biology, 2016)
Application to denoising
Bhamre, Zhang, S (Journal Structural Biology, 2016)
[Figure panels: Raw, Closest projection, TWF, CWF]
CWF = Covariance Wiener Filter, TWF = Traditional Wiener Filter
TRPV1, K2 direct electron detector
35645 motion corrected, picked particle images of 256×256 pixels belonging to 935 defocus groups (Liao et al., Nature 2013)
Third order moment tensor for 3D ab-initio reconstruction
Work in progress: extension to non-uniform distributions, uniqueness? constraints? (positivity at low frequency?)
Why bother?
Extremely fast: just one or two passes over the data; single pass is much faster than a typical refinement iteration
Streaming?
Validation tool: No starting model to refine, no need to worry about rotations estimated correctly
Kam’s method for XFEL
X-ray free electron laser (XFEL) molecular imaging (Gaffney and Chapman, Science 2007)
Kam’s method for XFEL vs. Cryo-EM
Ewald spheres
Uniform distribution
No CTF, no shifts
Low photon count: Poisson noise
Non-negativity constraint
ePCA: Exponential family PCA
Liu, Dobriban, S (arXiv 2017)
Demo of ePCA on XFEL images
[Figure: XFEL diffraction images (n = 70,000, p = 65,536). Panels: (a) Clean intensity maps, (b) Noisy photon counts, (c) Denoised (PCA), (d) Denoised (ePCA).]
PCA for Exponential Family Distributions
One-parameter exponential family with density
fθ(y) = exp[θy − A(θ)]
No commonly agreed upon version of PCA for non-Gaussian data (Jolliffe 2002)
Likelihood/generalized linear latent variable models (Collins et al. 2001; Knott and Bartholomew 1999; Udell et al. 2016)
  lack of global convergence guarantees
  slow
Gaussianizing transforms: wavelet, Anscombe (Anscombe 1948; Starck et al. 2010)
  unsuitable for low-intensity
Problem formulation
Sampling model for Poisson
Each p-dim latent vector is drawn i.i.d. from distribution D
$(X(1), \cdots, X(p))^\top = X \sim D(\mu, \Sigma)$
(e.g., the noiseless image). $\mu$ and $\Sigma$ are the mean and covariance of $D$.
Observations $Y_i \sim Y \in \mathbb{R}^p$ (e.g., the noisy image)
Model for $Y$: draw latent $X \in \mathbb{R}^p$, then
$Y = (Y(1), \cdots, Y(p))^\top$ where $Y(j) \sim \mathrm{Poisson}(X(j))$
Goal: Recover information about the original distribution $D$, i.e. $\Sigma$.
Recover $X$.
Summary of ePCA
ePCA can be seen as a sequence of improved covariance estimators
Table: Covariance estimators
Name | Formula | Motivation
Sample covariance | $S = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \bar{Y})(Y_i - \bar{Y})^\top$ | -
Diagonal debiasing | $S_d = S - \operatorname{diag}[V(\bar{Y})]$ | Hierarchy
Homogenization | $S_h = D_n^{-1/2} S_d D_n^{-1/2}$ | Heteroskedasticity
Shrinkage | $S_{h,\eta} = \rho(S_h)$ | High dimensionality
Heterogenization | $S_{he} = D_n^{1/2} S_{h,\eta} D_n^{1/2}$ | Heteroskedasticity
Scaling | $S_s = \sum \alpha_i v_i v_i^\top$ ($S_{he} = \sum v_i v_i^\top$) | Heteroskedasticity
Conclusions
Method of moments paves the way to signal(s) recovery through one or two passes over the data, no alignment and no clustering.
Reconstruction is possible at any SNR, given sufficiently many observations.
Qualitative determination of the number of observations needed as a function of the SNR:
$1/\mathrm{SNR}$, $1/\mathrm{SNR}^2$, $1/\mathrm{SNR}^3$
Improved high dimensional covariance estimation (shrinkage, steerable, Poisson)
ASPIRE: Algorithms for Single Particle Reconstruction
Open source toolbox, publicly available: http://spr.math.princeton.edu/
References
A. S. Bandeira, M. Charikar, A. Singer, A. Zhu, “Multireference Alignment using Semidefinite Programming”, in Proceedings of the 5th conference on Innovations in Theoretical Computer Science (ITCS ’14), pp. 459–470 (2014).
Z. Zhao, A. Singer, “Rotationally Invariant Image Representation for Viewing Direction Classification in Cryo-EM”, Journal of Structural Biology, 186 (1), pp. 153–166 (2014).
T. Bhamre, T. Zhang, A. Singer, “Orthogonal Matrix Retrieval in Cryo-Electron Microscopy”, in IEEE 12th International Symposium on Biomedical Imaging (ISBI 2015), pp. 1048–1052, 16-19 April 2015.
T. Bhamre, T. Zhang, A. Singer, “Denoising and Covariance Estimation of Single Particle Cryo-EM Images”, Journal of Structural Biology, 195 (1), pp. 72–81 (2016).
T. Bendory, N. Boumal, C. Ma, Z. Zhao, A. Singer, “Bispectrum Inversion with Application to Multireference Alignment”, https://arxiv.org/abs/1705.00641 (2017).
E. Abbe, J. Pereira, A. Singer, “Sample Complexity of the Boolean Multireference Alignment Problem”, IEEE International Symposium on Information Theory (ISIT) (2017).
A. Perry, J. Weed, A. S. Bandeira, P. Rigollet, A. Singer, “The sample complexity of multi-reference alignment”, https://arxiv.org/abs/1707.00943 (2017).
L. Liu, E. Dobriban, A. Singer, “ePCA: High Dimensional Exponential Family PCA”, preprint. Available at https://arxiv.org/abs/1611.05550 (2017).