Background

Principal Components Analysis (PCA)
• Goal: Extract the r leading eigenvectors of the sample covariance matrix A_0
• Typical solution: Alternate between two tasks
  1. Rank-one variance maximization:
     x_t = argmax_x x^T A_{t-1} x   s.t.  x^T x = 1
  2. Hotelling's matrix deflation:
     A_t = A_{t-1} - x_t x_t^T A_{t-1} x_t x_t^T
• Hotelling's deflation removes the contribution of x_t from A_{t-1}
• Primary drawback: Non-sparse solutions

Alternative Deflation Methods

Projection Deflation
• Method: A_t = (I - x_t x_t^T) A_{t-1} (I - x_t x_t^T)
• Intuition: Projects the data onto the orthocomplement of the space spanned by x_t

Schur Complement Deflation
• Method: A_t = A_{t-1} - A_{t-1} x_t x_t^T A_{t-1} / (x_t^T A_{t-1} x_t)
• Intuition: The conditional variance of the data variables given the new sparse principal component

Deflation Properties

Method                      x_t^T A_t x_t = 0   A_t x_t = 0   A_t ⪰ 0   A_s x_t = 0, ∀ s > t
Hotelling's (HD)                    √                X             X              X
Projection (PD)                     √                √             √              X
Schur Complement (SCD)              √                √             √              √
Orthog. Hotelling's (OHD)           √                X             X              X
Orthog. Projection (OPD)            √                √             √              √
Generalized (GD)                    √                √             √              √
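The deflation updates above can be compared numerically. A minimal sketch (NumPy assumed; the covariance matrix is synthetic and purely illustrative) deflates a true eigenvector with each method and checks the variance-annihilation, orthogonality, and positive-semidefiniteness properties:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((180, 13))
A = np.cov(X, rowvar=False)               # sample covariance matrix A_0

w, V = np.linalg.eigh(A)                  # eigenvalues ascending
x = V[:, -1]                              # leading (true) eigenvector

# Hotelling's deflation: A_t = A_{t-1} - x x^T A_{t-1} x x^T
A_hd = A - (x @ A @ x) * np.outer(x, x)

# Projection deflation: A_t = (I - x x^T) A_{t-1} (I - x x^T)
P = np.eye(13) - np.outer(x, x)
A_pd = P @ A @ P

# Schur complement deflation: A_t = A_{t-1} - A_{t-1} x x^T A_{t-1} / (x^T A_{t-1} x)
A_scd = A - np.outer(A @ x, A @ x) / (x @ A @ x)

for name, At in [("HD", A_hd), ("PD", A_pd), ("SCD", A_scd)]:
    print(name,
          np.isclose(x @ At @ x, 0.0),              # variance of x annihilated
          np.allclose(At @ x, 0.0, atol=1e-8),      # A_t orthogonal to x
          np.linalg.eigvalsh(At).min() >= -1e-8)    # still positive semidefinite
```

For a true eigenvector the three updates coincide (A x = λx makes each correction term equal λ x x^T), so every property holds for all three; the differences only appear once x is merely a pseudo-eigenvector.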
Sparse PCA
• Goal: Extract r sparse "pseudo-eigenvectors": high-variance directions with few non-zero components
• Typical solution: Alternate between two tasks
  1. Constrained rank-one variance maximization:
     x_t = argmax_x x^T A_{t-1} x   s.t.  x^T x = 1,  Card(x) ≤ k_t
  2. Hotelling's matrix deflation (borrowed from PCA)

The Problem
Hotelling's deflation was designed for eigenvectors, not pseudo-eigenvectors. Compare:
• Hotelling's deflation for PCA (x_t a true eigenvector)
  - Annihilates the variance of x_t: x_t^T A_t x_t = 0
  - Renders A_t orthogonal to x_t: A_t x_t = 0
  - Preserves positive semidefiniteness: A_t ⪰ 0
• Hotelling's deflation for Sparse PCA (x_t a pseudo-eigenvector)
  - Annihilates the variance of x_t
  - Does not render A_t orthogonal to x_t
  - Does not preserve positive semidefiniteness
Key properties are lost in the Sparse PCA setting.
Goal: Recover the lost properties with new deflation methods.

Orthogonalized Projection Deflation
• Successive pseudo-eigenvectors are not orthogonal
• Annihilating the full vector x_t can reintroduce components removed on earlier rounds
• Method:
  q_t = (I - Q_{t-1} Q_{t-1}^T) x_t / ||(I - Q_{t-1} Q_{t-1}^T) x_t||
  A_t = (I - q_t q_t^T) A_{t-1} (I - q_t q_t^T)
• Q_{t-1} = orthonormal basis for the previously extracted pseudo-eigenvectors
• Intuition: Eliminates only the additional variance contributed by x_t
• Orthogonalized Hotelling's Deflation is defined similarly
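The loss of Hotelling's guarantees for pseudo-eigenvectors can be seen numerically. A small sketch (NumPy assumed; the data and the naive truncation used to build a sparse vector are illustrative, not one of the sparse PCA algorithms discussed here):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((180, 13))
A = np.cov(X, rowvar=False)

# Naive sparse "pseudo-eigenvector": keep the 4 largest-magnitude
# entries of the leading eigenvector, zero the rest, renormalize.
# (Real sparse PCA solvers do better; this only breaks exact eigen-ness.)
x = np.linalg.eigh(A)[1][:, -1].copy()
x[np.argsort(np.abs(x))[:-4]] = 0.0
x /= np.linalg.norm(x)

A_hd = A - (x @ A @ x) * np.outer(x, x)   # Hotelling's deflation
P = np.eye(13) - np.outer(x, x)
A_pd = P @ A @ P                          # projection deflation

# Both still annihilate the variance of x ...
print(np.isclose(x @ A_hd @ x, 0.0), np.isclose(x @ A_pd @ x, 0.0))
# ... but only projection deflation renders A_t orthogonal to x,
print(np.allclose(A_hd @ x, 0.0, atol=1e-8), np.allclose(A_pd @ x, 0.0, atol=1e-8))
# and projection deflation is guaranteed to stay positive semidefinite.
print(np.linalg.eigvalsh(A_pd).min() >= -1e-8)
```

Hotelling's update can also push eigenvalues negative in this setting (the result is no longer a covariance matrix); projection deflation, a two-sided multiplication by a projection, is PSD by construction.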
Reformulating Sparse PCA
• New goal: Explicitly maximize the additional variance criterion
• Solve a sparse generalized eigenvector problem:
  max_x  x^T (I - Q_{t-1} Q_{t-1}^T) A_0 (I - Q_{t-1} Q_{t-1}^T) x
  s.t.   x^T (I - Q_{t-1} Q_{t-1}^T) x = 1,  Card(x) ≤ k_t
• Yields a generalized deflation procedure

Generalized Deflation
• Method (with B_0 = I):
  q_t = B_{t-1} x_t
  A_t = (I - q_t q_t^T) A_{t-1} (I - q_t q_t^T)
  B_t = B_{t-1} (I - q_t q_t^T)
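The generalized deflation procedure can be sketched as a loop. This is an illustrative sketch (NumPy assumed): the constrained maximization step is replaced by a naive truncate-and-renormalize heuristic rather than GSLDA or DC-PCA, and the dataset is synthetic.

```python
import numpy as np

def generalized_deflation(A0, cards):
    """Illustrative generalized deflation: q_t = B_{t-1} x_t,
    A_t = (I - q_t q_t^T) A_{t-1} (I - q_t q_t^T),
    B_t = B_{t-1} (I - q_t q_t^T), starting from B_0 = I."""
    d = A0.shape[0]
    A, B = A0.copy(), np.eye(d)
    qs = []
    for k in cards:
        # Heuristic stand-in for the sparse constrained maximization:
        # truncate the leading eigenvector of A_{t-1} to cardinality k.
        v = np.linalg.eigh(A)[1][:, -1].copy()
        v[np.argsort(np.abs(v))[:-k]] = 0.0
        x = v / np.sqrt(v @ B @ v)         # enforce x^T B_{t-1} x = 1
        q = B @ x                          # q_t = B_{t-1} x_t
        P = np.eye(d) - np.outer(q, q)
        A = P @ A @ P                      # A_t
        B = B @ P                          # B_t
        qs.append(q)
    return A, B, np.column_stack(qs)

rng = np.random.default_rng(2)
X = rng.standard_normal((180, 13))
A_final, B_final, Q = generalized_deflation(np.cov(X, rowvar=False), [4, 4, 4])

print(np.allclose(Q.T @ Q, np.eye(3)))    # the q_t come out orthonormal
print(np.allclose(A_final @ Q, 0.0))      # every q_t is fully deflated away
```

Note the division of labor: the normalization x^T B_{t-1} x = 1 makes each q_t a unit vector orthogonal to its predecessors, so B_t remains the projection onto the orthocomplement of all q_1, ..., q_t.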
Experiments

Set up: Leading deflation-based Sparse PCA algorithms
• GSLDA (Moghaddam et al., ICML '06)
• DC-PCA (Sriperumbudur et al., ICML '07)
• Each algorithm outfitted with each deflation technique

Pit props dataset: 13 variables, 180 observations
DC-PCA cumulative % variance, cardinality 4,4,4,4,4,4

        HD       PD       SCD      OHD      OPD      GD
PC 1    22.60%   22.60%   22.60%   22.60%   22.60%   22.60%
PC 2    39.60%   39.60%   38.60%   39.60%   39.60%   39.60%
PC 3    46.80%   50.90%   53.40%   46.80%   50.90%   51.00%
PC 4    56.80%   62.10%   62.30%   52.90%   62.10%   62.20%
PC 5    66.10%   70.20%   73.70%   59.90%   70.20%   71.30%
PC 6    73.40%   77.80%   79.30%   63.20%   77.20%   78.90%

Gene expression dataset: 21 genes, 5759 fly nuclei
GSLDA cumulative % variance, cardinality 9,7,6,5,3,2,2,2

        HD       PD       SCD      OHD      OPD      GD
PC 1    21.00%   21.00%   21.00%   21.00%   21.00%   21.00%
PC 2    38.20%   38.10%   38.10%   38.20%   38.10%   38.20%
PC 3    52.10%   51.90%   52.00%   52.00%   51.90%   52.20%
PC 4    60.50%   60.60%   60.40%   60.40%   60.40%   61.00%
PC 5    65.70%   67.40%   67.10%   65.90%   67.10%   68.20%
PC 6    69.30%   71.00%   70.40%   70.00%   70.00%   72.10%
PC 7    72.50%   74.00%   73.40%   72.80%   73.70%   75.70%
PC 8    75.10%   76.80%   77.00%   75.90%   76.60%   79.60%