Advanced Introduction to Machine Learning
10715, Fall 2014
Structured Sparsity, with application in Computational
Genomics
Eric Xing
Lecture 3, September 15, 2014
Reading:
© Eric Xing @ CMU, 2014
Structured Sparsity
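• Sparsity: $\Omega(\beta) = \sum_i |\beta_i|$
• Group sparsity: $\Omega(\beta) = \sum_c \|\beta_{G_c}\|_2 = \sum_c \sqrt{\sum_{i \in G_c} \beta_i^2}$
• Total variation sparsity: $\Omega(\beta) = \sum_c \|\beta_{G_c}\|_{TV} = \sum_c \sum_{i \in G_c} |\beta_i - \beta_{i-1}|$
Estimator: $\beta^* = \arg\min_\beta L(X, Y, \beta) + \Omega(\beta)$
[Figure: response y connected to inputs x1, x2, x3, ..., xn-1, xn through coefficients indexed 1, 2, 3, ..., n-1, n]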
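As a rough illustration (not part of the slides), the three penalties can be evaluated with NumPy. The function names and the example vector are hypothetical, and the total-variation term here only differences consecutive coefficients inside each group, which is one possible reading of the formula.

```python
import numpy as np

def l1_penalty(beta):
    """Sparsity: sum_i |beta_i|."""
    return float(np.sum(np.abs(beta)))

def group_penalty(beta, groups):
    """Group sparsity: sum over groups of the L2 norm of each group's coefficients."""
    return float(sum(np.linalg.norm(beta[g]) for g in groups))

def tv_penalty(beta, groups):
    """Total variation: sum over groups of |beta_i - beta_{i-1}| for neighbors within a group."""
    return float(sum(np.sum(np.abs(np.diff(beta[g]))) for g in groups))

# Hypothetical coefficient vector with three groups of two coefficients each.
beta = np.array([3.0, 4.0, 0.0, 0.0, 1.0, 1.0])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]

print(l1_penalty(beta))             # |3| + |4| + |1| + |1|
print(group_penalty(beta, groups))  # ||(3,4)|| + ||(0,0)|| + ||(1,1)||
print(tv_penalty(beta, groups))     # |4-3| + |0-0| + |1-1|
```

The group penalty charges the all-zero group nothing, which is what drives whole groups to zero, while the total-variation penalty charges the constant group (1, 1) nothing, which favors piecewise-constant coefficients.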
Healthy
Sick
ACTCGTACGTAGACCTAGCATTACGCAATAATGCGA
ACTCGAACCTAGACCTAGCATTACGCAATAATGCGA
TCTCGTACGTAGACGTAGCATTACGCAATTATCCGA
ACTCGAACCTAGACCTAGCATTACGCAATTATCCGA
ACTCGTACGTAGACGTAGCATAACGCAATAATGCGA
TCTCGTACCTAGACGTAGCATAACGCAATAATCCGA
ACTCGAACCTAGACCTAGCATAACGCAATTATCCGA
Single nucleotide polymorphism (SNP)
Causal (or "associated") SNP
Genetic Basis of Diseases
A . . . . T . . G . . . . . C . . . . . . T . . . . . A . . G .
A . . . . A . . C . . . . . C . . . . . . T . . . . . A . . G .
T . . . . T . . G . . . . . G . . . . . . T . . . . . T . . C .
A . . . . A . . C . . . . . C . . . . . . T . . . . . T . . C .
A . . . . T . . G . . . . . G . . . . . . A . . . . . A . . G .
T . . . . T . . C . . . . . G . . . . . . A . . . . . A . . C .
A . . . . A . . C . . . . . C . . . . . . A . . . . . T . . C .
Data
Genotype Phenotype
• Cancer: Dunning et al. 2009
• Diabetes: Dupuis et al. 2010
• Atopic dermatitis: Esparza-Gordillo et al. 2009
• Arthritis: Suzuki et al. 2008
Standard Approach
causal SNP
a univariate phenotype: e.g., disease/control
Genetic Association Mapping
Genetic Basis of Complex Diseases
[Figure: the same genotype sequences, grouped into Healthy and Cancer; label: Causal SNPs]
Genetic Basis of Complex Diseases
[Figure: the same Healthy/Cancer genotype sequences; labels: Causal SNPs, Biological mechanism]
Genetic Basis of Complex Diseases: Intermediate Phenotype
[Figure: the same Healthy/Cancer genotype sequences; labels: Causal SNPs, Gene expression, Clinical records]
Association to intermediate phenotypes
Genetic Association for Asthma Clinical Traits
[Figure: genotype sequence and a network of clinical traits; labels: Subnetworks for lung physiology, Subnetwork for quality of life]
Statistical challenge: How to find associations to a multivariate entity in a graph?
Gene Expression Trait Analysis
[Figure: genotype sequence and a gene expression matrix; axes: Genes, Samples]
Statistical challenge: How to find associations to a multivariate entity in a tree?
Association with Phenome
Traditional Approach
causal SNP
a univariate phenotype: gene expression level
Structured Association
Multivariate complex syndrome (e.g., asthma): age at onset, history of eczema, genome-wide expression profile
Sparse Associations
Pleiotropic effects
Epistatic effects
Structured Sparse Association: a New Paradigm
Standard Approach (Phenotypes): consider one phenotype at a time
vs.
New Approach (Phenome): consider multiple correlated phenotypes (phenome) jointly
Sparse Learning
Linear Model:
Lasso (Sparse Linear Regression)
Why a sparse solution? Compare penalizing the L1 norm of the coefficients with constraining it (the two forms are equivalent).
[R. Tibshirani 96]
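A minimal sketch of sparse linear regression, assuming the usual lasso objective 0.5 * ||y - Xb||^2 + lam * ||b||_1 (the slide's own formula did not survive extraction). The solver below is plain proximal gradient (ISTA) with soft-thresholding, chosen for brevity; the function names, the synthetic data, and the value of `lam` are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)            # gradient of the squared-error term
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Synthetic data: 20 inputs, only inputs 2 and 7 truly affect the output.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_b = np.zeros(20)
true_b[[2, 7]] = [2.0, -3.0]
y = X @ true_b + 0.1 * rng.standard_normal(100)

b_hat = lasso_ista(X, y, lam=10.0)
print(np.flatnonzero(b_hat))  # indices of the selected inputs
```

The soft-thresholding step sets small coefficients exactly to zero, which is where the sparsity comes from; the surviving coefficients are shrunk slightly toward zero.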
Multi-Task Extension
Multi-Task Linear Model:
Coefficients for k-th task:
Coefficient Matrix:
Input:
Output:
Coefficients for a variable (2nd)
Coefficients for a task (2nd)
Outline
• Background: Sparse multivariate regression for disease association studies
• Structured association: a new paradigm
  • Association to a graph-structured phenome: Graph-guided fused lasso (Kim & Xing, PLoS Genetics, 2009)
  • Association to a tree-structured phenome: Tree-guided group lasso (Kim & Xing, ICML 2010)
Multivariate Regression for Single-Trait Association Analysis
$y = X\beta$, where y is the trait (e.g., 2.1), X is the genotype, and β is the association strength (unknown)
[Figure: genotype shown as allele pairs T G, A A, C C, A T, G A, A G, T A]
Multivariate Regression for Single-Trait Association Analysis
[Figure: trait (2.1) = genotype (allele pairs T G, A A, C C, A T, G A, A G, T A) × association strength]
Many non-zero associations: Which SNPs are truly significant?
Lasso for Reducing False Positives (Tibshirani, 1996)
[Figure: trait (2.1) = genotype (allele pairs T G, A A, C C, A T, G A, A G, T A) × association strength]
Lasso penalty for sparsity: $+ \lambda \sum_{j=1}^{J} |\beta_j|$
Many zero associations (sparse results), but what if there are multiple related traits?
Multivariate Regression for Multiple-Trait Association Analysis
[Figure: traits (3.4, 1.5, 2.1, 0.9, 1.8) = genotype × association strengths; trait groups: Allergy, Lung physiology; SNP relationships: LD, Synthetic lethal]
Association strength between SNP j and Trait i: $\beta_{j,i}$
Lasso penalty: $+ \lambda \sum_{i,j} |\beta_{j,i}|$
How to combine information across multiple traits to increase the power?
Multivariate Regression for Multiple-Trait Association Analysis
[Figure: traits (3.4, 1.5, 2.1, 0.9, 1.8) = genotype × association strengths; trait groups: Allergy, Lung physiology]
Association strength between SNP j and Trait i: $\beta_{j,i}$
Lasso penalty: $+ \lambda \sum_{i,j} |\beta_{j,i}|$
+ We introduce a graph-guided fusion penalty
Multiple-trait Association: Graph-Constrained Fused Lasso
Step 1: Thresholded correlation graph of phenotypes
Step 2: Graph-constrained fused lasso
• Lasso penalty
• Graph-constrained fusion penalty (fusion)
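The two steps can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes a squared-error loss, uses the plain difference |β_jm - β_jl| on every edge (some formulations also account for the sign of the correlation, which is omitted here), and all names, the threshold `rho`, and the toy data are hypothetical.

```python
import numpy as np

def thresholded_correlation_graph(Y, rho):
    """Step 1: connect traits m and l when |corr(Y[:, m], Y[:, l])| > rho."""
    R = np.corrcoef(Y, rowvar=False)  # K x K trait correlation matrix
    K = R.shape[0]
    return [(m, l) for m in range(K) for l in range(m + 1, K) if abs(R[m, l]) > rho]

def gc_fused_lasso_objective(X, Y, B, edges, lam, gamma):
    """Step 2: squared-error loss + lasso penalty + fusion penalty over graph edges.

    X: N x J genotypes, Y: N x K traits, B: J x K association strengths.
    """
    loss = np.sum((Y - X @ B) ** 2)
    lasso = lam * np.sum(np.abs(B))
    fusion = gamma * sum(np.sum(np.abs(B[:, m] - B[:, l])) for m, l in edges)
    return float(loss + lasso + fusion)

# Toy traits: columns 0 and 1 move together, column 2 does not.
Y_demo = np.array([[1.0, 1.1, 0.3],
                   [2.0, 1.9, -0.2],
                   [3.0, 3.2, 0.1],
                   [4.0, 3.9, -0.4],
                   [5.0, 5.1, 0.2]])
edges = thresholded_correlation_graph(Y_demo, rho=0.5)
print(edges)  # only the strongly correlated trait pair is kept

# Toy coefficients: SNP 0 affects traits 0 and 1 equally, SNP 1 affects only trait 1.
B_demo = np.array([[1.0, 1.0, 0.0],
                   [0.0, 2.0, 0.0]])
X_demo = np.eye(2)
print(gc_fused_lasso_objective(X_demo, X_demo @ B_demo, B_demo, edges, lam=1.0, gamma=0.5))
```

In the toy coefficients, SNP 0 pays no fusion cost because its effects on the two connected traits agree, while SNP 1 is charged for the mismatch.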
Fusion Penalty
Fusion penalty: $|\beta_{jk} - \beta_{jm}|$
For two correlated traits (connected in the network), the association strengths may have similar values.
Association strength between SNP j and Trait k: $\beta_{jk}$
Association strength between SNP j and Trait m: $\beta_{jm}$
Graph-Constrained Fused Lasso
Overall effect:
• Fusion effect propagates to the entire network
• Association between SNPs and subnetworks of traits
Multiple-trait Association: Graph-Weighted Fused Lasso
Overall effect:
• Subnetwork structure is embedded as densely connected nodes with large edge weights
• Edges with small weights are effectively ignored
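One way to picture the weighted variant: instead of thresholding, every pair of traits contributes a fusion term scaled by a weight derived from their correlation. The sketch below assumes the weight is the absolute correlation; that choice, the function name, and the toy numbers are assumptions for illustration only.

```python
import numpy as np

def gw_fusion_penalty(B, R, gamma, weight=np.abs):
    """Fusion penalty over all trait pairs, each scaled by weight(correlation).

    B: J x K association strengths, R: K x K trait correlation matrix.
    """
    K = B.shape[1]
    total = 0.0
    for m in range(K):
        for l in range(m + 1, K):
            total += weight(R[m, l]) * np.sum(np.abs(B[:, m] - B[:, l]))
    return float(gamma * total)

# One SNP, three traits; traits 0 and 1 are strongly correlated.
B_w = np.array([[1.0, 3.0, 1.0]])
R_w = np.array([[1.0, 0.9, 0.1],
                [0.9, 1.0, 0.0],
                [0.1, 0.0, 1.0]])
print(gw_fusion_penalty(B_w, R_w, gamma=1.0))
```

Here the mismatch between traits 1 and 2 costs nothing because their edge weight is zero, matching the remark that small-weight edges are effectively ignored.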
Estimating Parameters
Quadratic programming formulation
• Graph-constrained fused lasso
• Graph-weighted fused lasso
Many publicly available software packages for solving convex optimization problems can be used
Simulation Results
50 SNPs taken from HapMap chromosome 7, CEU population
10 traits
[Figure panels: Trait Correlation Matrix; Thresholded Trait Correlation Network; True Regression Coefficients; Single SNP-Single Trait Test (significant at α = 0.01); Lasso; Graph-guided Fused Lasso. Axes: Phenotypes × SNPs. Color scale: No association to High association]
Simulation Results
Association to a Tree-structured Phenome
Tree-Guided Group Lasso
In a simple case of two genes:
• Low height: tight correlation, joint selection
• Large height: weak correlation, separate selection
[Figure: two two-leaf trees of height h drawn over Genotypes]
Tree-Guided Group Lasso
In a simple case of two genes: select the child nodes jointly or separately?
• L1 penalty: lasso penalty, separate selection
• L2 penalty: group lasso, joint selection
• Elastic net
Tree-guided group lasso
[Figure: two-leaf tree of height h]
Tree-Guided Group Lasso
For a general tree (internal node heights h1, h2): select the child nodes jointly or separately?
• Joint selection
• Separate selection
Tree-guided group lasso
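A rough sketch of how a tree can trade off joint and separate selection. It assumes each internal node of normalized height h in [0, 1] mixes the two options as h × (sum of the children's penalties) + (1 - h) × (L2 norm over all leaves below the node), so a low node behaves like group lasso and a high node like lasso. That weighting, the nested-tuple tree encoding, and the function names are assumptions for illustration, not necessarily the exact penalty on the slide.

```python
import numpy as np

# Tree encoding (hypothetical): a leaf is an int trait index,
# an internal node is a tuple (height, [children]).

def leaves(node):
    """Trait indices under a node."""
    if isinstance(node, int):
        return [node]
    _, children = node
    return [i for child in children for i in leaves(child)]

def tree_penalty(beta, node):
    """Penalty for one SNP's coefficients across traits, guided by the tree."""
    if isinstance(node, int):
        return abs(float(beta[node]))
    h, children = node
    separate = sum(tree_penalty(beta, child) for child in children)  # L1-like part
    joint = float(np.linalg.norm(beta[leaves(node)]))                # L2 (group) part
    return h * separate + (1.0 - h) * joint

# Two genes under one parent: h = 0 is pure group lasso, h = 1 is pure lasso.
beta2 = np.array([3.0, 4.0])
print(tree_penalty(beta2, (0.0, [0, 1])))
print(tree_penalty(beta2, (1.0, [0, 1])))

# General tree: genes 0 and 1 merge at height h1 = 0.2, then gene 2 joins at h2 = 0.8.
beta3 = np.array([3.0, 4.0, 0.0])
print(tree_penalty(beta3, (0.8, [(0.2, [0, 1]), 2])))
```

Because the recursion reuses each child's own mixed penalty, tightly correlated subtrees stay grouped even when their parent sits high in the tree.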
Previously, in Jenatton, Audibert & Bach, 2009
Balanced Shrinkage
Estimating Parameters
Second-order cone program
Many publicly available software packages for solving convex optimization problems can be used
Illustration with Simulated Data
[Figure panels: True association strengths; Lasso; Tree-guided group lasso. Axes: SNPs × Phenotypes. Color scale: No association to High association]
Yeast eQTL Analysis
[Figure panels: Hierarchical clustering tree; Single-Marker Single-Trait Test; Tree-guided group lasso. Axes: SNPs × Phenotypes. Color scale: No association to High association]
Ultimately …
Pleiotropic effects
Epistatic effects
Structured Input/Output-Lasso
[Equation not recoverable from the extraction. Legible fragments: an arg min with a squared-error loss over traits k = 1..K and samples i = 1..N involving Y, X, and Z; sums over SNPs j = 1..p and SNP pairs (r, s) ∈ U; group terms over (r, s) ∈ S_m]
This full model incorporates input/output structure of the dataset as well as epistatic effects guided by genetic interaction networks
U: genetic interaction network
S_m: m-th cluster in the SNP network
Lasso penalty: within-group sparsity
Input structure: group selection of correlated epistatic SNPs
Output structure: group selection of SNPs across multiple correlated traits
[Lee, Zhu and Xing, submitted 2010]
Sensitivity and Specificity varying the number of SNPs
For each number of SNPs, we show the average performance over 5 different simulated datasets
Marginal SNP: Methods taking advantage of output structures outperform others.
Epistatic SNP: Methods taking advantage of input structures outperform others.
IO-Lasso outperforms other methods for detecting both marginal & epistatic eQTLs
Computation Time
Proximal Gradient Descent
Original Problem:
Approximation Problem:
Gradient of the Approximation:
Geometric Interpretation
Smooth approximation
Uppermost Line: Nonsmooth
Uppermost Line: Smooth
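To make the "uppermost line" picture concrete, here is a small sketch assuming the approximation is a Nesterov-style smoothing; the slide's own formulas did not survive extraction, so this is a standard construction shown for the scalar case only. The absolute value |x| is the upper envelope of the lines αx with |α| ≤ 1; subtracting μα²/2 inside the maximum rounds off the kink. The name `smoothed_abs` and the value of `mu` are hypothetical.

```python
import numpy as np

def smoothed_abs(x, mu):
    """Smooth approximation of |x|: max over |alpha| <= 1 of alpha*x - mu*alpha^2/2.

    Returns the smoothed value and its gradient (the maximizing alpha).
    """
    alpha = np.clip(x / mu, -1.0, 1.0)  # closed-form maximizer
    value = alpha * x - 0.5 * mu * alpha ** 2
    return value, alpha

x = np.array([-2.0, 0.0, 0.05, 2.0])
mu = 0.1
vals, grads = smoothed_abs(x, mu)
print(vals)   # close to |x|, but differentiable at 0
print(grads)  # equals sign(x) away from 0, linear in x near 0
```

The smoothed value never exceeds |x| and is at most μ/2 below it, so a smaller μ gives a tighter but less smooth approximation.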
Convergence Rate
Theorem: If we require and set , the number of iterations is upper bounded by:
Remarks: state-of-the-art IPM method for SOCP converges at a rate
Multi-Task Time Complexity
Pre-compute:
Per-iteration Complexity (computing gradient)
Tree: IPM for SOCP vs. Proximal-Gradient
Graph: IPM for SOCP vs. Proximal-Gradient
Proximal-Gradient: Independent of Sample Size, Linear in # of Tasks
Experiments
Multi-task Graph-Structured Sparse Learning (GFlasso)
Experiments
Multi-task Tree-Structured Sparse Learning (TreeLasso)
Conclusions
Novel statistical methods for joint association analysis to correlated phenotypes
• Graph-structured phenome: graph-guided fused lasso
• Tree-structured phenome: tree-guided group lasso
Advantages
• Greater power to detect weak association signals
• Fewer false positives
• Joint association to multiple correlated phenotypes