Sparse Generalized Principal Components Analysis with Applications to Neuroimaging
Genevera I. Allen
Department of Statistics, Rice University, Department of Pediatrics-Neurology, Baylor College of Medicine,
& Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital.
March 11, 2013
G. I. Allen (BCM & Rice) Sparse GPCA March 11, 2013 1 / 24
1 Motivation
2 Generalized PCA and Sparse GPCA
3 Results
Review: Principal Components Analysis
Principal Components Analysis (PCA):
Dimension reduction.
Exploratory data analysis.
PCA Problem:
$$\underset{v_k}{\text{maximize}} \;\; \mathrm{Var}(X v_k) = v_k^T X^T X v_k \quad \text{subject to} \;\; v_k^T v_k = 1 \;\&\; v_k^T v_{k'} = 0 \;\; \forall\, k' < k.$$
PC: $z_k = X v_k$.
Given by the singular value decomposition (SVD) of the data matrix: $X = U D V^T$, then $Z = X V$.
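The PCA-via-SVD recipe above can be sketched in a few lines of NumPy (a minimal illustration on synthetic data, not from the talk's software; column-centering is assumed so that $v_k^T X^T X v_k$ is proportional to a variance):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
X = X - X.mean(axis=0)            # center columns so X^T X is proportional to a covariance

# SVD of the data matrix: X = U D V^T
U, d, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

Z = X @ V                         # principal components: z_k = X v_k

# Loadings are orthonormal: V^T V = I
assert np.allclose(V.T @ V, np.eye(V.shape[1]))
# Each d_k^2 equals v_k^T X^T X v_k, the variance maximized in the PCA problem
assert np.allclose(np.diag(V.T @ X.T @ X @ V), d**2)
```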
When does PCA (SVD) fail?
1 High-dimensional data.
   - Fix: Sparsity - Sparse PCA (Johnstone and Lu, 2004).
2 Structured factors.
   - Fix: Smoothness - Functional PCA (Rice and Silverman, 1991).
   - Fix: Sparsity - Sparse PCA (Jolliffe et al., 2003).
3 Strong dependencies among row and/or column variables? Structured data?
   - Transposable data: dependencies among the rows and/or columns of a data matrix.
PCA in Neuroimaging
Multivariate analysis techniques are used for finding regions of interest and activation patterns, and for understanding functional connectivity, but . . .
(Viviani et al., 2005)
PCA and Correlated Noise
StarPlus fMRI Data
StarPlus Data (Subject 04847) (Mitchell et al., 2004):
- Task: object identification.
- 20 tasks in which the sentence agrees with the image.
- 20 tasks in which the sentence opposes the image.
- Each task lasted 27 seconds (55 time points).
- Images: 64 × 64 × 8.
- Data set: 4,698 voxels × 40 tasks × 55 time points.

Goal: Use pattern-recognition techniques to find regions of interest and activation patterns related to object identification.
StarPlus PCA Results
Classical PCA:
StarPlus PCA Results
Sparse PCA:
Objectives
Goals:
1 Incorporate known noise structure and/or dependencies into PCA problems.
2 Develop a framework for regularization of PCA factors.
3 Provide computationally feasible solutions and algorithms in ultra-high-dimensional settings.
1 Motivation
2 Generalized PCA and Sparse GPCA
3 Results
PCA Model
$$X_{n \times p} = \sum_{k=1}^{K} d_k\, u_k v_k^T + E_{n \times p}.$$

Random: $d_k$. Fixed: $U = [u_1, \ldots, u_K]$ and $V = [v_1, \ldots, v_K]$.

Independent noise: $E_{ij} \overset{iid}{\sim} (0, \sigma^2)$, or $\mathrm{Cov}(\mathrm{vec}(E)) = \sigma^2 I_{(p)} \otimes I_{(n)}$.

SVD loss function: $\|X - U D V^T\|_F^2$.
- $\|\cdot\|_F$ is the Frobenius norm (sum of squared errors).
- Error terms are weighted equally.
- Cross-product errors between elements $ij$ and $i'j'$ are ignored.
Our Generative Model
$$X_{n \times p} = \sum_{k=1}^{K} d_k\, u_k v_k^T + E_{n \times p}.$$

Random: $d_k$. Fixed: $U = [u_1, \ldots, u_K]$ and $V = [v_1, \ldots, v_K]$.

Noise: two-way (separable) dependencies:
$$\mathrm{Cov}(\mathrm{vec}(E)) = \Delta \otimes \Sigma,$$
with $\Delta \in \mathbb{R}^{p \times p}$ the column covariance and $\Sigma \in \mathbb{R}^{n \times n}$ the row covariance:
$$\Delta \otimes \Sigma = \begin{pmatrix} \Delta_{11}\Sigma & \Delta_{12}\Sigma & \cdots & \Delta_{1p}\Sigma \\ \Delta_{21}\Sigma & \Delta_{22}\Sigma & & \vdots \\ \vdots & & \ddots & \\ \Delta_{p1}\Sigma & \cdots & & \Delta_{pp}\Sigma \end{pmatrix}.$$

Signal factors are assumed to be orthogonal with respect to the noise covariance: $U^T \Sigma U = I$ and $V^T \Delta V = I$.
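The separable covariance $\Delta \otimes \Sigma$ is easy to check numerically; a small sketch (synthetic covariances, not neuroimaging data) confirming that the $(j, j')$ block of the Kronecker product is $\Delta_{jj'}\,\Sigma$, exactly as in the display above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 3, 4

# Build valid (positive semi-definite) row and column covariances
A = rng.standard_normal((n, n)); Sigma = A @ A.T    # row covariance, n x n
B = rng.standard_normal((p, p)); Delta = B @ B.T    # column covariance, p x p

K = np.kron(Delta, Sigma)                           # Cov(vec(E)) = Delta ⊗ Sigma, (np x np)

# The (j, j') block of Delta ⊗ Sigma is Delta[j, j'] * Sigma
for j in range(p):
    for jp in range(p):
        block = K[j*n:(j+1)*n, jp*n:(jp+1)*n]
        assert np.allclose(block, Delta[j, jp] * Sigma)
```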
GPCA Optimization Problem
SVD Problem
$$\underset{U,\, D,\, V}{\text{minimize}} \;\; \|X - U D V^T\|_F^2 \quad \text{subject to} \;\; U^T U = I_{(K)},\; V^T V = I_{(K)} \;\&\; \mathrm{diag}(D) \geq 0.$$
How can we modify the Frobenius norm?
GPCA Optimization Problem
Generalized Least Squares Matrix Decomposition (GMD / GPCA) Problem:
$$\underset{U,\, D,\, V}{\text{minimize}} \;\; \|X - U D V^T\|_{Q,R}^2 \quad \text{subject to} \;\; U^T Q U = I_{(K)},\; V^T R V = I_{(K)} \;\&\; \mathrm{diag}(D) \geq 0.$$

- Quadratic operators: $Q \in \mathbb{R}^{n \times n}$ and $R \in \mathbb{R}^{p \times p}$, positive semi-definite.
- $Q,R$-norm: $\|X\|_{Q,R} = \sqrt{\mathrm{tr}(Q X R X^T)}$.
- Generalization of the Frobenius norm: if $Q = R = I$, the GMD is equivalent to the SVD.
Quadratic Operators
Interpretations:
1 Matrix-variate normal:
   - $\|X - U D V^T\|_{Q,R}^2 \propto \ell_{n,p}(U D V^T, Q^{-1}, R^{-1})$.
   - $Q$ and $R$ behave like inverse row and column covariances.
2 Covariance decomposition. Under certain model assumptions,
   $$\mathrm{Cov}(\mathrm{vec}(X)) = \sum_k \mathrm{Var}(d_k)\, (v_k v_k^T) \otimes (u_k u_k^T) + R \otimes Q,$$
   where $V^T R V = I$ and $U^T Q U = I$.
3 Smoothing matrices: factors $U$ and $V$ are as smooth as the smallest eigenvectors of $Q$ and $R$.
4 Weighting matrices: up-weight cross-product errors in the loss according to the covariance between variables.
Quadratic Operators
Classes of quadratic operators:
1 Model-based operators:
   - Random field covariances.
   - Temporal process covariances or inverse covariances.
   - Gaussian Markov random fields.
2 Smoothing operators:
   - Functional data analysis.
3 Graphical operators:
   - Graph Laplacians.

[Figures: spatial graphical operator; temporal smoothing operator.]
GPCA Solution
GMD Solution: Let $\tilde{X} = Q^{1/2} X R^{1/2}$, and let $\tilde{X} = \tilde{U} \tilde{D} \tilde{V}^T$ be the SVD of $\tilde{X}$. Then the GMD solution, $\hat{X} = U^* D^* (V^*)^T$, is given by:
$$U^* = Q^{-1/2} \tilde{U}, \quad V^* = R^{-1/2} \tilde{V}, \quad \& \quad D^* = \tilde{D}.$$

$U^*$: sample GPCs (scores); $V^*$: GPCA loadings (directions).

Alternative computational approaches:
- Linear algebra tricks when $n \ll p$.
- Deflation via the generalized power method: performs alternating generalized least squares regressions.
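The GMD solution above translates directly into code: take an ordinary SVD in the transformed space and map back. A sketch, which assumes $Q$ and $R$ are strictly positive definite so the inverse square roots exist (the names `mat_power` and `gmd` are illustrative, not from the talk's software):

```python
import numpy as np

def mat_power(M, power):
    """Symmetric matrix power via eigendecomposition (assumes M is positive definite)."""
    w, P = np.linalg.eigh(M)
    return P @ np.diag(w**power) @ P.T

def gmd(X, Q, R, K):
    """Rank-K GMD: SVD of Q^{1/2} X R^{1/2}, mapped back by Q^{-1/2} and R^{-1/2}."""
    Xt = mat_power(Q, 0.5) @ X @ mat_power(R, 0.5)
    Ut, d, Vtt = np.linalg.svd(Xt, full_matrices=False)
    U = mat_power(Q, -0.5) @ Ut[:, :K]
    V = mat_power(R, -0.5) @ Vtt.T[:, :K]
    return U, d[:K], V

rng = np.random.default_rng(3)
n, p, K = 8, 5, 2
X = rng.standard_normal((n, p))
A = rng.standard_normal((n, n)); Q = A @ A.T + n * np.eye(n)   # strictly positive definite
B = rng.standard_normal((p, p)); R = B @ B.T + p * np.eye(p)

U, d, V = gmd(X, Q, R, K)

# GMD factors satisfy the generalized orthogonality constraints
assert np.allclose(U.T @ Q @ U, np.eye(K))
assert np.allclose(V.T @ R @ V, np.eye(K))
```

With $Q = R = I$ this reduces to the truncated SVD, matching the Frobenius-norm special case.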
Regularized GPCA
Regularized GPCA Optimization Problem:
$$\underset{v,\, u}{\text{maximize}} \;\; u^T Q X R v - \lambda_v P_1(v) - \lambda_u P_2(u) \quad \text{subject to} \;\; u^T Q u \leq 1 \;\&\; v^T R v \leq 1.$$

- Theorem: if $P(\cdot)$ is any norm or semi-norm, then the factor-wise solutions are given by solving a penalized regression problem and re-scaling.
- Options: $P(x) = \|x\|_1$ (sparsity), group sparsity, $\ell_q$ balls, total variation, etc.
- Multiple factors are computed via deflation.
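To make the theorem concrete for $P(v) = \|v\|_1$ with sparsity on $v$ only ($\lambda_u = 0$), one can alternate a generalized-power-method $u$-step with a soft-thresholded, re-scaled $v$-step. This is an illustrative sketch under simplifying assumptions, not the exact algorithm from the talk: it takes $R$ diagonal so the penalized regression step has a closed-form soft-threshold.

```python
import numpy as np

def soft(x, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_gpca_factor(X, Q, r, lam, iters=100):
    """One sparse GPCA factor (illustrative): alternating updates for
    max u^T Q X R v - lam * ||v||_1  s.t.  u^T Q u <= 1, v^T R v <= 1,
    with R = diag(r) so the v-step is a closed-form soft-threshold plus re-scaling."""
    n, p = X.shape
    v = np.ones(p) / np.sqrt(r.sum())        # feasible start: v^T R v = 1
    for _ in range(iters):
        # u-step: u ∝ X R v, re-scaled so that u^T Q u = 1
        a = X @ (r * v)
        u = a / np.sqrt(a @ Q @ a)
        # v-step: penalized regression solution, then generalized re-scaling
        c = r * (X.T @ (Q @ u))              # linear term R X^T Q u
        vhat = soft(c, lam) / r
        nrm = np.sqrt((r * vhat) @ vhat)     # ||vhat||_R
        if nrm == 0:
            return u, vhat                   # penalty zeroed the factor out
        v = vhat / nrm
    return u, v

rng = np.random.default_rng(4)
n, p = 40, 30
# Planted sparse rank-one signal plus small noise
v0 = np.zeros(p); v0[:5] = 1 / np.sqrt(5)
u0 = rng.standard_normal(n); u0 /= np.linalg.norm(u0)
X = 5 * np.outer(u0, v0) + 0.1 * rng.standard_normal((n, p))

Q = np.eye(n)
r = np.ones(p)
u, v = sparse_gpca_factor(X, Q, r, lam=0.5)

assert np.isclose(u @ Q @ u, 1.0)                    # generalized unit norm on u
assert np.isclose((r * v) @ v, 1.0)                  # generalized unit norm on v
assert np.count_nonzero(np.abs(v) > 1e-8) < p // 2   # penalty induced sparsity
```

The `if nrm == 0` branch matters in practice: for large enough $\lambda$, the soft-threshold can zero out the entire factor, which is the signal to stop extracting components.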
1 Motivation
2 Generalized PCA and Sparse GPCA
3 Results
StarPlus PCA Results
Classical PCA:
StarPlus PCA Results
Sparse PCA:
StarPlus GPCA Results
Generalized PCA:
StarPlus GPCA Results
Sparse Generalized PCA:
StarPlus GPCA Results

Results:
- Identified the "ventral stream" (brain regions associated with object identification).
- Anatomical regions:
   - Bilateral occipital.
   - Left-lateralized inferior temporal.
   - Inferior frontal.

(Pennick & Kana, 2012)

[Figures: SGPCA 1, axial slices 2 and 3.]
PCA Comparisons
Extent of dimension reduction achieved:
Other Extensions
- Non-negative (and sparse) GPCA.
- Tensor GPCA, or Higher-Order GPCA:
   - Based on the Tucker decomposition.
- Sparse Higher-Order GPCA:
   - Computes sequential rank-one decompositions that are a relaxation of the CANDECOMP/PARAFAC decomposition.
- Applications: multi-subject neuroimaging data.
Concluding Remarks
When to use GPCA vs. PCA:
- Structured data (variables are associated with a specific location).
- Smooth or functional data.
- Data with a low signal-to-noise ratio.

Future Work:

- Statistical work: how to choose $Q$ and $R$, the rank of the decomposition, and the level of sparsity; consistency studies.
- Applications in neuroimaging: comparisons to ICA methods.
- Other applications: genomics, proteomics, image data, time series and longitudinal data, spatio-temporal data, climate studies, remote sensing.
Concluding Remarks
R Package & Matlab Toolbox
Coming Soon . . .
Code available from www.stat.rice.edu/~gallen.
Acknowledgments
Funding:
National Science Foundation DMS-1209017
Collaborators:
Logan Grosenick, Center for Mind and Brain, Stanford University.
Jonathan Taylor, Statistics, Stanford University.
Mirjana Maletic-Savatic, Jan and Dan Duncan Neurological Research Institute & Baylor College of Medicine.
Frederick Campbell, PhD Candidate, Statistics, Rice University.
References
G. I. Allen, L. Grosenick & J. Taylor, "A generalized least squares decomposition," arXiv:1102.3074, Rice University Technical Report No. TR2011-03, 2011.

G. I. Allen, "Regularized tensor decompositions and higher-order principal components analysis," arXiv:1202.2476, 2012.

G. I. Allen, "Sparse higher-order principal components analysis," in Artificial Intelligence and Statistics, 2012.

G. I. Allen & M. Maletic-Savatic, "Sparse non-negative generalized PCA with applications to metabolomics," Bioinformatics, 27(21):3029-3035, 2011.

F. D. Campbell & G. I. Allen, "Algorithms and approaches for analyzing massive structured data with Sparse Generalized PCA," in preparation.

G. I. Allen, C. Peterson, M. Vannucci & M. Maletic-Savatic, "Regularized partial least squares with an application to NMR spectroscopy," Statistical Analysis and Data Mining, to appear, 2013.