Deflation Methods for Sparse PCA

Post on 07-Aug-2020

Background

Principal Components Analysis (PCA)

• Goal: Extract the r leading eigenvectors of the sample covariance matrix A_0 ⪰ 0.
• Typical solution: Alternate between two tasks:
  1. Rank-one variance maximization:
       x_t = argmax_x x^T A_{t-1} x   s.t.  x^T x = 1
  2. Hotelling's matrix deflation:
       A_t = A_{t-1} − (x_t^T A_{t-1} x_t) x_t x_t^T
• Deflation removes the contribution of x_t from A_{t-1}.
• Primary drawback: Non-sparse solutions.
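The two-task loop above can be sketched numerically. This is a minimal illustration, not the poster's code: `leading_eigenvector` and `pca_by_deflation` are our own names, with power iteration standing in for the variance-maximization step.

```python
import numpy as np

def leading_eigenvector(A, n_iter=500):
    """Task 1: argmax_x x^T A x s.t. x^T x = 1, via power iteration."""
    x = np.ones(A.shape[0]) / np.sqrt(A.shape[0])
    for _ in range(n_iter):
        x = A @ x
        x /= np.linalg.norm(x)
    return x

def pca_by_deflation(A0, r):
    """Alternate variance maximization with Hotelling's deflation."""
    A = A0.copy()
    xs = []
    for _ in range(r):
        x = leading_eigenvector(A)
        xs.append(x)
        # Task 2: A_t = A_{t-1} - (x_t^T A_{t-1} x_t) x_t x_t^T
        A = A - (x @ A @ x) * np.outer(x, x)
    return np.array(xs)

rng = np.random.default_rng(0)
M = rng.standard_normal((100, 5))
A0 = M.T @ M / 100                 # sample covariance matrix, A0 >= 0
V = pca_by_deflation(A0, r=2)
_, U = np.linalg.eigh(A0)          # reference eigendecomposition
# Up to sign, the loop recovers the two leading eigenvectors of A0.
print(abs(V[0] @ U[:, -1]), abs(V[1] @ U[:, -2]))
```

For true eigenvectors this deflation is exact, which is precisely the property that breaks down in the sparse setting described next.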

Sparse PCA

• Goal: Extract r sparse "pseudo-eigenvectors": high-variance directions with few non-zero components.
• Typical solution: Alternate between two tasks:
  1. Constrained rank-one variance maximization:
       x_t = argmax_x x^T A_{t-1} x   s.t.  x^T x = 1, Card(x) ≤ k_t
  2. Hotelling's matrix deflation (borrowed from PCA).

The Problem

Hotelling's deflation was designed for eigenvectors, not pseudo-eigenvectors. Compare:

• Hotelling's deflation for PCA (x_t a true eigenvector of A_{t-1}):
  • Annihilates the variance of x_t: x_t^T A_t x_t = 0
  • Renders A_t orthogonal to x_t: A_t x_t = 0
  • Preserves positive semidefiniteness: A_t ⪰ 0
• Hotelling's deflation for Sparse PCA (x_t a sparse pseudo-eigenvector):
  • Annihilates the variance of x_t
  • Does not render A_t orthogonal to x_t
  • Does not preserve positive semidefiniteness

Key properties are lost in the Sparse PCA setting.
Goal: Recover the lost properties with new deflation methods.
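This comparison is easy to check numerically (our own illustration; the sparse vector and random seed are arbitrary): deflating with a unit vector that is not an eigenvector still zeroes its variance, but orthogonality and positive semidefiniteness fail.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((50, 6))
A = M.T @ M / 50                       # sample covariance, A >= 0

x = np.zeros(6)
x[:2] = [0.6, 0.8]                     # sparse unit vector, Card(x) = 2,
                                       # generically NOT an eigenvector of A

At = A - (x @ A @ x) * np.outer(x, x)  # Hotelling's deflation

print("x^T A_t x = 0 :", np.isclose(x @ At @ x, 0.0))   # still holds
print("A_t x = 0     :", np.allclose(At @ x, 0.0))      # fails
print("min eigenvalue:", np.linalg.eigvalsh(At).min())  # negative
```

The last two failures are linked: for a positive semidefinite matrix B, x^T B x = 0 forces B x = 0, so A_t x ≠ 0 together with x^T A_t x = 0 already rules out A_t ⪰ 0.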

Alternative Deflation Methods

Projection Deflation (PD)

    A_t = (I − x_t x_t^T) A_{t-1} (I − x_t x_t^T)

• Intuition: Projects the data onto the orthocomplement of the space spanned by x_t.

Schur Complement Deflation (SCD)

    A_t = A_{t-1} − A_{t-1} x_t x_t^T A_{t-1} / (x_t^T A_{t-1} x_t)

• Intuition: Conditional variance of the data variables given the new sparse principal component.

Orthogonalized Projection Deflation (OPD)

    q_t = (I − Q_{t-1} Q_{t-1}^T) x_t / ||(I − Q_{t-1} Q_{t-1}^T) x_t||
    A_t = (I − q_t q_t^T) A_{t-1} (I − q_t q_t^T)

• Q_{t-1} = orthonormal basis for the extracted pseudo-eigenvectors.
• Intuition: Eliminates only the additional variance contributed by x_t. Successive pseudo-eigenvectors are not orthogonal, so annihilating a full vector can reintroduce old components; OPD deflates only the new direction q_t.
• Orthogonalized Hotelling's Deflation (OHD) is defined similarly.
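The three updates above can be written in a few lines each (a minimal sketch; the function names are ours, not the poster's):

```python
import numpy as np

def projection_deflation(A, x):
    """A_t = (I - x x^T) A_{t-1} (I - x x^T), x a unit vector."""
    P = np.eye(len(x)) - np.outer(x, x)
    return P @ A @ P

def schur_complement_deflation(A, x):
    """A_t = A_{t-1} - A_{t-1} x x^T A_{t-1} / (x^T A_{t-1} x)."""
    Ax = A @ x
    return A - np.outer(Ax, Ax) / (x @ Ax)

def orthogonalized_projection_deflation(A, x, Q):
    """Project x off span(Q) (Q orthonormal columns), renormalize,
    then projection-deflate by the resulting direction q_t only."""
    q = x - Q @ (Q.T @ x)
    q /= np.linalg.norm(q)
    return projection_deflation(A, q), q

rng = np.random.default_rng(2)
M = rng.standard_normal((50, 6))
A = M.T @ M / 50
x = np.zeros(6); x[:2] = [0.6, 0.8]        # sparse pseudo-eigenvector

for deflate in (projection_deflation, schur_complement_deflation):
    At = deflate(A, x)
    # Both restore orthogonality and positive semidefiniteness even
    # though x is not an eigenvector of A:
    assert np.isclose(x @ At @ x, 0) and np.allclose(At @ x, 0)
    assert np.linalg.eigvalsh(At).min() > -1e-10

# OPD example: after extracting x, deflate an overlapping second direction.
A1 = projection_deflation(A, x)
y = np.zeros(6); y[1:3] = [0.6, 0.8]
A2, q = orthogonalized_projection_deflation(A1, y, Q=x.reshape(-1, 1))
assert np.isclose(abs(q @ x), 0)           # q is orthogonal to x
```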

Reformulating Sparse PCA

New goal: Explicitly maximize the additional variance criterion.

• Solve a sparse generalized eigenvector problem:

    max_x  x^T (I − Q_{t-1} Q_{t-1}^T) A_{t-1} (I − Q_{t-1} Q_{t-1}^T) x
    s.t.   x^T (I − Q_{t-1} Q_{t-1}^T) x = 1,  Card(x) ≤ k_t

• Yields the generalized deflation procedure (GD):

    q_t = B_{t-1} x_t,   B_t = B_{t-1} (I − q_t q_t^T)
    A_t = (I − q_t q_t^T) A_{t-1} (I − q_t q_t^T)
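The generalized procedure can be sketched as follows. This is our own illustration, and the cardinality constraint is dropped for simplicity, so the leading generalized eigenvector is computed densely; a real sparse PCA solver would supply x_t instead. With B_0 = I, the constraint x^T B_{t-1} x = 1 makes q_t = B_{t-1} x_t a unit vector.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((40, 5))
A = M.T @ M / 40                       # A_0, sample covariance
B = np.eye(5)                          # B_0 = I
I = np.eye(5)
xs, As = [], []
for t in range(3):
    # Dense surrogate for the sparse generalized eigenproblem:
    # maximize x^T (B A B) x subject to x^T B x = 1.
    _, U = np.linalg.eigh(B @ A @ B)
    x = U[:, -1]
    x = x / np.sqrt(x @ B @ x)         # enforce x^T B x = 1
    q = B @ x                          # unit vector by construction
    A = (I - np.outer(q, q)) @ A @ (I - np.outer(q, q))
    B = B @ (I - np.outer(q, q))
    xs.append(x); As.append(A.copy())

# Generalized deflation leaves every later matrix orthogonal to every
# earlier pseudo-eigenvector: A_s x_t = 0 for all s >= t.
for t, x in enumerate(xs):
    for s in range(t, len(As)):
        assert np.allclose(As[s] @ x, 0)
```

The key point the assertions exercise: each x_t lies in span(q_1, …, q_t), and every deflated matrix annihilates all previously extracted q's, so the orthogonality property holds even for non-orthogonal pseudo-eigenvectors.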

Deflation Properties

Method                      x_t^T A_t x_t = 0   A_t x_t = 0   A_t ⪰ 0   A_s x_t = 0, ∀ s > t
Hotelling's (HD)                    √                X            X              X
Projection (PD)                     √                √            √              X
Schur Complement (SCD)              √                √            √              √
Orthog. Hotelling's (OHD)           √                X            X              X
Orthog. Projection (OPD)            √                √            √              √
Generalized (GD)                    √                √            √              √
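The per-method patterns for HD, PD, and SCD can be spot-checked numerically (our own harness, not from the poster): run two rounds of each deflation with overlapping sparse non-eigenvectors and test the four properties.

```python
import numpy as np

def hd(A, x):                                  # Hotelling's deflation
    return A - (x @ A @ x) * np.outer(x, x)

def pd(A, x):                                  # projection deflation
    P = np.eye(len(x)) - np.outer(x, x)
    return P @ A @ P

def scd(A, x):                                 # Schur complement deflation
    Ax = A @ x
    return A - np.outer(Ax, Ax) / (x @ Ax)

rng = np.random.default_rng(4)
M = rng.standard_normal((60, 6))
A0 = M.T @ M / 60
x1 = np.zeros(6); x1[:2] = [0.6, 0.8]          # sparse non-eigenvectors
x2 = np.zeros(6); x2[1:3] = [0.8, 0.6]         # with overlapping support

for name, deflate in [("HD", hd), ("PD", pd), ("SCD", scd)]:
    A1 = deflate(A0, x1)
    A2 = deflate(A1, x2)
    row = [np.isclose(x1 @ A1 @ x1, 0),            # x_t^T A_t x_t = 0
           np.allclose(A1 @ x1, 0),                # A_t x_t = 0
           np.linalg.eigvalsh(A1).min() > -1e-10,  # A_t >= 0
           np.allclose(A2 @ x1, 0, atol=1e-8)]     # A_s x_t = 0, s > t
    print(name, ["√" if ok else "X" for ok in row])
```

The overlapping supports matter for the last column: with disjoint (hence orthogonal) sparse directions, even projection deflation would happen to leave A_2 x_1 = 0.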

Experiments

Set up: Leading deflation-based Sparse PCA algorithms
• GSLDA (Moghaddam et al., ICML '06)
• DC-PCA (Sriperumbudur et al., ICML '07)
• Each algorithm outfitted with each deflation technique.

Pit props dataset: 13 variables, 180 observations
DC-PCA cumulative % variance, cardinality 4,4,4,4,4,4

        HD       PD       SCD      OHD      OPD      GD
PC 1    22.60%   22.60%   22.60%   22.60%   22.60%   22.60%
PC 2    39.60%   39.60%   38.60%   39.60%   39.60%   39.60%
PC 3    46.80%   50.90%   53.40%   46.80%   50.90%   51.00%
PC 4    56.80%   62.10%   62.30%   52.90%   62.10%   62.20%
PC 5    66.10%   70.20%   73.70%   59.90%   70.20%   71.30%
PC 6    73.40%   77.80%   79.30%   63.20%   77.20%   78.90%

Gene expression dataset: 21 genes, 5759 fly nuclei
GSLDA cumulative % variance, cardinality 9,7,6,5,3,2,2,2

        HD       PD       SCD      OHD      OPD      GD
PC 1    21.00%   21.00%   21.00%   21.00%   21.00%   21.00%
PC 2    38.20%   38.10%   38.10%   38.20%   38.10%   38.20%
PC 3    52.10%   51.90%   52.00%   52.00%   51.90%   52.20%
PC 4    60.50%   60.60%   60.40%   60.40%   60.40%   61.00%
PC 5    65.70%   67.40%   67.10%   65.90%   67.10%   68.20%
PC 6    69.30%   71.00%   70.40%   70.00%   70.00%   72.10%
PC 7    72.50%   74.00%   73.40%   72.80%   73.70%   75.70%
PC 8    75.10%   76.80%   77.00%   75.90%   76.60%   79.60%