Selection of the Regularization Parameter in Graphical Models using Network Characteristics
Natalia Bochkina
University of Edinburgh, Maxwell Institute and the Alan Turing Institute
Joint work with Adria Caballe Mestres (University of Edinburgh and BioSS) and Claus Mayer (BioMathematics and Statistics Scotland)
27 July 2016
Natalia Bochkina (University of Edinburgh) 27 July 2016 1 / 35
Outline
1 Sparse High Dimensional Gaussian Graphical Models
2 Network-based estimation of hyperparameter
3 Simulated data
4 Tumour gene expression data
5 Summary
Sparse High Dimensional Gaussian Graphical Models Model
Gaussian graphical models
Suppose we observe $n$ replicates of $p$ variables:
$$Y_i = (Y_{1,i}, \dots, Y_{p,i}) \sim N(\mu, \Omega^{-1}) \quad \text{independently for } i = 1, \dots, n,$$
where $\Omega$ is the $p \times p$ precision matrix and $\mu$ is the vector of the means (assume $\mu = 0$).
The matrix $\Omega$ represents the conditional dependence structure among the variables, with zero values representing conditional independence.
Aim: estimate the underlying graph of the conditional dependence structure determined by $\Omega$.
Applications: networks in genetics and genomics, financial models . . .
Sparse High Dimensional Gaussian Graphical Models Estimation of precision matrix
Gaussian graphical models, p is large
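For $p$ small compared to $n$, the maximum likelihood estimate (MLE) is
$$\hat\Omega_{ML} = \arg\max_{\Omega \succ 0} \left[\log\det\Omega - \mathrm{tr}(S\Omega)\right], \qquad (1)$$
where $S$ is the sample covariance matrix: $S = n^{-1}\sum_{i=1}^{n} Y_i Y_i^T$.
Problem: if $p$ is large such that $S$ is not of full rank, then $\hat\Omega_{ML}$ is not unique.
Additional assumption on $\Omega$: sparsity.
The penalised maximum likelihood estimator (with a convex penalty) is
$$\hat\Omega_{pML} = \arg\max_{\Omega \succ 0} \left[\log\det\Omega - \mathrm{tr}(S\Omega) - \lambda\|\Omega\|_1\right], \qquad (2)$$
where $\|\Omega\|_1 = \sum_{i,j=1}^{p} |\Omega_{ij}|$ is the elementwise $\ell_1$ norm of the matrix $\Omega$.
For $\lambda$ large enough, the estimator $\hat\Omega_{pML}$ is sparse.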
Sparse High Dimensional Gaussian Graphical Models Algorithms and consistency
Methods for estimating hyperparameter λ
Additional penalty term for $\lambda$:
$$\hat\Omega_{ppML} = \arg\max_{\Omega \succ 0,\ \lambda > 0} \left[L(\Omega) - \lambda\|\Omega\|_1 - \mathrm{pen}(\lambda)\right]$$
Methods such as AIC and BIC are suboptimal when $p$ is large.
Bayesian model with a spike-and-slab prior for the elements of $\Omega$: computationally intensive for large $p$ ($\geq 10^3$).
Two-step procedure:
Step 1: use the pMLE estimator $\hat\Omega_{pML} = \hat\Omega_{pML}(\lambda)$ for given $\lambda$.
Step 2: choose $\lambda$ to minimise $R(\lambda, \hat\Omega_{pML}(\lambda))$, e.g. cross-validation for $\Omega$. Overfits when $p$ is large (Liu et al., 2011).
Stability selection by Meinshausen and Bühlmann (2010): controls FDR.
StARS (Stability Approach to Regularization Selection) by Liu et al. (2011): additional tuning parameter; can lead to overfitting in certain graph topologies.
...
Instability of estimated graph structure
Small variation in the penalty ($\|\Omega\|_1$) can lead to a significant change in the estimated graph structure.
[Figure: estimated graphs for λ = 0.83 and λ = 0.85, and a plot of the $\ell_1$ norm against λ over the range 0.70 to 0.95.]
Network-based estimation of hyperparameter
Estimation of the hyperparameter in sparse graphical models using network characteristics
Network-based estimation of hyperparameter Network approach
Network-based approach to estimating λ
We propose to estimate λ using network characteristics of underlying graph.
Notation: graph $G(V, E)$ with nodes $V$, edges $E$ and adjacency matrix $A$.
In graphical models: $A_{ij} = I(\Omega_{ij} \neq 0)$ for $i \neq j$, and $A_{ii} = 0$.
Network-based estimation of λ
Given $\lambda$, estimate $\Omega$ by penalised MLE:
$$\hat\Omega_\lambda = \arg\max_{\Omega \succ 0} \left[\log\det\Omega - \mathrm{tr}(S\Omega) - \lambda\|\Omega\|_1\right]$$
Choose $\hat\lambda = \arg\min_\lambda R(\lambda, \hat A_\lambda)$, where $\hat A_\lambda$ is the adjacency matrix of the conditional dependence graph of $\hat\Omega_\lambda$.
The loss function for estimating $\lambda$ depends only on the adjacency matrix of the underlying conditional dependence graph.
Main a priori assumption: presence of weakly connected clusters.
Network characteristics
Correlation coefficient between nodes $V_i, V_j \in G(V, E)$:
$$\sigma_{ij} = \frac{|\mathrm{nei}(V_i) \cap \mathrm{nei}(V_j)|}{\sqrt{|\mathrm{nei}(V_i)|\,|\mathrm{nei}(V_j)|}},$$
where $\mathrm{nei}(V_i)$ is the set of neighbours of node $V_i$ (Estrada, 2011).
Corresponding dissimilarity measure: $\delta_{ij} = 1 - \sigma_{ij}$.
Mean Geodesic Distance: a measure of connectivity between nodes,
$$H(\lambda) = \frac{1}{p(p-1)} \sum_{i<j} d_{ij}\, I(d_{ij} < \infty),$$
where $d_{ij}$ is the length of the shortest path between nodes $i$ and $j$ (Costa and Rodrigues, 2007).
...
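The two network characteristics above can be computed directly from an adjacency matrix. The following is a minimal sketch in plain Python, not the authors' implementation; the function names are illustrative, and returning a correlation of 0 for a node without neighbours is an assumption, since the slides do not cover that case.

```python
from collections import deque
from math import sqrt

def neighbours(adj, i):
    """Set of neighbours nei(V_i) of node i in a 0/1 adjacency matrix."""
    return {j for j, a in enumerate(adj[i]) if a and j != i}

def graph_correlation(adj, i, j):
    """|nei(i) & nei(j)| / sqrt(|nei(i)| |nei(j)|)."""
    ni, nj = neighbours(adj, i), neighbours(adj, j)
    if not ni or not nj:
        return 0.0  # assumption: nodes without neighbours get correlation 0
    return len(ni & nj) / sqrt(len(ni) * len(nj))

def shortest_path_lengths(adj, source):
    """Breadth-first search: shortest path length to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbours(adj, u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_geodesic_distance(adj):
    """Sum of finite d_ij over pairs i < j, divided by p(p - 1) as on the slide."""
    p = len(adj)
    total = 0
    for i in range(p):
        dist = shortest_path_lengths(adj, i)
        total += sum(d for j, d in dist.items() if j > i)
    return total / (p * (p - 1))

# Toy graph: path 0 - 1 - 2 plus the isolated node 3.
example_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
example_corr = graph_correlation(example_adj, 0, 2)
example_h = mean_geodesic_distance(example_adj)
```

In the toy graph, nodes 0 and 2 share their only neighbour, so their graph correlation is 1, while the unreachable node 3 contributes nothing to the mean geodesic distance.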
Network-based estimation of hyperparameter Novel approaches
General algorithm
Fix a sequence (grid) of values of $\lambda$: $(\lambda_1, \dots, \lambda_N)$.
For each $\lambda_\ell$, estimate $\Omega$ by penalised MLE, and hence the adjacency matrix $\hat A_\ell$ of the corresponding graph.
Choose $\lambda_{\ell^\star}$ with $\ell^\star = \arg\min_\ell R(\lambda_\ell, \hat A_\ell)$.
Can be interpreted as a point estimator of a modularised Bayesian model.
We propose two risk functions $R(\lambda, A)$:
Path Connectivity: uses the $\lambda$ corresponding to the biggest structural change in the complexity of the graph. Complexity of the graph is measured by the Mean Geodesic Distance.
Augmented MSE: mimics a cross-validation approach with the loss depending on the adjacency matrix of the graph.
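The grid search described in the general algorithm can be sketched as a small generic routine. This is an illustration under stated assumptions, not the authors' code: `threshold_adjacency` is a hypothetical stand-in that thresholds a fixed matrix instead of running the penalised MLE, and `toy_risk` is a placeholder for a network-based risk such as Path Connectivity or A-MSE.

```python
def select_lambda(grid, estimate_adjacency, risk):
    """Return the grid value (and its graph) that minimises risk(lam, adjacency)."""
    best = None
    for lam in grid:
        adj = estimate_adjacency(lam)   # in the talk: adjacency of the penalised MLE
        r = risk(lam, adj)              # in the talk: a network-based risk
        if best is None or r < best[0]:
            best = (r, lam, adj)
    return best[1], best[2]

# Hypothetical stand-in data: a fixed symmetric matrix of association strengths.
S = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.4],
    [0.1, 0.4, 1.0],
]

def threshold_adjacency(lam):
    """Stand-in estimator (NOT graphical lasso): keep edges with |S_ij| > lam."""
    p = len(S)
    return [[int(i != j and abs(S[i][j]) > lam) for j in range(p)] for i in range(p)]

def toy_risk(lam, adj):
    """Placeholder risk: distance of the edge count from a target of two edges."""
    n_edges = sum(sum(row) for row in adj) // 2
    return abs(n_edges - 2)

grid = [0.05, 0.2, 0.5, 0.9]
lam_star, adj_star = select_lambda(grid, threshold_adjacency, toy_risk)
```

Passing the estimator and the risk as functions keeps the search loop independent of how the graph is estimated, so either proposed risk can be plugged in without changing it.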
Network-based estimation of hyperparameter Path Connectivity
Path Connectivity
Consider the Mean Geodesic Distance
$$H(\lambda) = \frac{1}{p(p-1)} \sum_{i<j} d_{ij}\, I(d_{ij} < \infty),$$
where $d_{ij}$ is the length of the shortest path between nodes $V_i$ and $V_j$.
Choose $\lambda$ at the largest change in graph structure as measured by $H(\lambda)$.
[Figure: $H(\lambda)$ (axis label "conn") plotted against λ over the range 0.3 to 0.6.]
Use finite differences with bandwidth $h$: $H(\lambda + h) - H(\lambda)$.
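Path Connectivity: motivation
Use the normalised difference between mean geodesic distances:
$$R(\lambda_k, A) = \frac{H(\lambda_k + h) - H(\lambda_k)}{k^{-1}\sum_{j=1}^{k}\left[H(\lambda_j + h) - H(\lambda_j)\right]}, \qquad \lambda_k = \lambda_0 + (k-1)h.$$
[Figure: two estimated graphs. (a) Optimal $\lambda_{k^\star}$: FP = 46, TP = 116. (b) $\lambda_{k^\star - 1}$: FP = 54, TP = 120.]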
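The normalised difference can be computed from values of $H$ on an equally spaced grid, where $H(\lambda_j + h)$ is simply the next grid value. The sketch below is illustrative only: the values of `H_values` are made up, and picking the index with the largest ratio is an assumption about how "largest change" is operationalised.

```python
def path_connectivity_ratios(H):
    """Ratios D_k / mean(D_1..D_k), where D_j = H[j+1] - H[j] on an equally spaced grid."""
    diffs = [H[j + 1] - H[j] for j in range(len(H) - 1)]
    ratios = []
    running_sum = 0.0
    for k, d in enumerate(diffs, start=1):
        running_sum += d
        mean_k = running_sum / k
        # Undefined when the running mean is zero; flag with NaN instead of dividing.
        ratios.append(d / mean_k if mean_k != 0 else float("nan"))
    return ratios

# Made-up mean geodesic distances on the grid 0.30, 0.35, ..., 0.50 (h = 0.05).
lambda_grid = [0.30, 0.35, 0.40, 0.45, 0.50]
H_values = [10.0, 12.0, 13.0, 25.0, 26.0]

ratios = path_connectivity_ratios(H_values)
# Assumption: select the grid point where the normalised change is largest.
k_star = max(range(len(ratios)), key=lambda k: ratios[k])
lambda_star = lambda_grid[k_star]
```

Here the jump from 13 to 25 is 2.4 times the average change up to that point, so the third grid value is selected.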
Path Connectivity: estimators of λ
[Figure: densities of the estimators of λ (horizontal axis from 0.20 to 0.40) for clustered and non-clustered graphs.]
Network-based estimation of hyperparameter Augmented MSE
Augmented MSE
Ideally, we would like to use a cross-validation approach over some characteristic of the conditional dependence graph, which is an unbiased estimator of the corresponding oracle risk.
E.g. the MSE of estimating a characteristic $(d_{ij})$:
$$R(\lambda) = E \sum_{i<j} \left(d_{ij} - \hat d_{ij}(\lambda)\right)^2,$$
where $\hat d_{ij}(\lambda)$ are based on the GLasso estimator $\hat\Omega(\lambda)$, with the corresponding oracle
$$\lambda_{oracle} = \arg\min_\lambda R(\lambda).$$
However, we do not observe the conditional dependence graph, i.e. we do not have unbiased estimators of $d_{ij}$.
A priori information: the network contains clusters (possibly overlapping) ⇒ use an algorithm that estimates global characteristics well (number of clusters, degrees, ...) to produce an original "estimate".
Augmented MSE of graph correlations
Network characteristic: graph correlations
$$\rho_{ij} = \frac{|\mathrm{nei}(V_i) \cap \mathrm{nei}(V_j)|}{\sqrt{|\mathrm{nei}(V_i)|\,|\mathrm{nei}(V_j)|}}$$
Original "estimate": output of a clustering algorithm. AGNES (Kaufman and Rousseeuw, 2009) estimates well global characteristics such as the average degree of the graph, eigenvalues of $A$, etc.
A-MSE estimator of λ
Given $\hat\Omega_\lambda$ from GLasso and its adjacency matrix $\hat A_\lambda$, choose
$$\hat\lambda_{AMSE} = \arg\min_\lambda R(\lambda, \hat A_\lambda) = \arg\min_\lambda E\Big(\sum_{i>j} \left|\tilde\rho_{ij} - \hat\rho^{\lambda}_{ij}\right|^q\Big), \quad q \geq 1,$$
where $E$ is the average over subsamples and $(\tilde\rho_{ij})$ correspond to the graph correlations in the "original graph estimate".
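The A-MSE risk for a single value of λ can be sketched as follows, taking the reference graph and the subsample graphs as given inputs. In the talk these come from AGNES and from GLasso on subsamples respectively; neither step is implemented here. The function names are illustrative, and assigning correlation 0 to nodes without neighbours is an assumption.

```python
from math import sqrt

def graph_correlations(adj):
    """Graph correlation for every pair i > j of a 0/1 adjacency matrix."""
    p = len(adj)
    nei = [{j for j in range(p) if adj[i][j] and j != i} for i in range(p)]
    rho = {}
    for i in range(p):
        for j in range(i):
            if nei[i] and nei[j]:
                rho[(i, j)] = len(nei[i] & nei[j]) / sqrt(len(nei[i]) * len(nei[j]))
            else:
                rho[(i, j)] = 0.0  # assumption: no neighbours means correlation 0
    return rho

def amse_risk(reference_adj, subsample_adjs, q=2):
    """Average over subsamples of sum_{i>j} |rho_ref - rho_est|^q."""
    ref = graph_correlations(reference_adj)
    total = 0.0
    for adj in subsample_adjs:
        est = graph_correlations(adj)
        total += sum(abs(ref[key] - est[key]) ** q for key in ref)
    return total / len(subsample_adjs)

# Toy inputs: the reference graph is the path 0 - 1 - 2; one subsample
# reproduces it exactly, the other returns a triangle.
path_graph = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
example_risk = amse_risk(path_graph, [path_graph, triangle], q=2)
```

In the triangle every pair has graph correlation 0.5, so that subsample contributes 3 × 0.25 = 0.75 and the average over the two subsamples is 0.375. Repeating this for each grid value of λ and taking the minimiser gives the selected value.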
Augmented MSE of graph order 2 connectivity
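Network characteristic: graph order 2 connectivity
$$\delta_{ij} = I(|\mathrm{nei}(V_i) \cap \mathrm{nei}(V_j)| > 0) = I(\rho_{ij} \neq 0),$$
i.e. the indicator of whether nodes $i$ and $j$ share a connection (have at least one common neighbour).
Original "estimate": clustering algorithm (AGNES).
Risk:
$$R(\lambda, \hat A_\lambda) = E \sum_{i>j} \left(\delta_{ij} - \hat\delta^{\lambda}_{ij}\right)^2 = C - E\left(TP(\lambda) - FP(\lambda)\right),$$
where $C = \sum_{i<j} \delta_{ij}$ does not depend on $\lambda$: each squared difference equals 1 exactly for a false positive or a false negative, and the number of false negatives is $C - TP(\lambda)$. Minimising the risk therefore maximises $TP(\lambda) - FP(\lambda)$, a Youden-type index, where
$$FP(\lambda) = \sum_{i<j} I[\delta_{ij} = 0,\ \hat\delta_{ij}(\lambda) = 1], \qquad TP(\lambda) = \sum_{i<j} I[\delta_{ij} = 1,\ \hat\delta_{ij}(\lambda) = 1].$$
Similarly, $\lambda$ can be estimated using this risk with $\delta_{ij}$ replaced by $\tilde\delta_{ij}$ from the original graph estimate.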
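A small numerical check of how the squared loss on order 2 connectivity relates to the TP and FP counts: for 0/1 indicators the squared loss counts false positives plus false negatives, so it equals the number of true positive pairs in the reference graph minus (TP − FP). The sketch below uses two toy graphs chosen purely for illustration; the function names are not from the talk.

```python
def order2_connectivity(adj):
    """delta_ij = 1 if nodes i and j have at least one common neighbour (pairs i > j)."""
    p = len(adj)
    nei = [{j for j in range(p) if adj[i][j] and j != i} for i in range(p)]
    return {(i, j): int(bool(nei[i] & nei[j])) for i in range(p) for j in range(i)}

def confusion_counts(delta_ref, delta_est):
    """True positives and false positives of delta_est against delta_ref."""
    tp = sum(1 for k in delta_ref if delta_ref[k] == 1 and delta_est[k] == 1)
    fp = sum(1 for k in delta_ref if delta_ref[k] == 0 and delta_est[k] == 1)
    return tp, fp

def squared_loss(delta_ref, delta_est):
    """Sum over pairs of (delta_ref - delta_est)^2."""
    return sum((delta_ref[k] - delta_est[k]) ** 2 for k in delta_ref)

# Reference graph: star with hub 0. Estimated graph: path 0 - 1 - 2 - 3.
star = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]

delta_ref = order2_connectivity(star)
delta_est = order2_connectivity(path)
tp, fp = confusion_counts(delta_ref, delta_est)
positives = sum(delta_ref.values())   # the constant C: pairs with delta_ij = 1
loss = squared_loss(delta_ref, delta_est)
```

In this example the three leaves of the star are pairwise order 2 connected (C = 3), the path recovers one of those pairs and adds one spurious pair, and the loss is 3 − (1 − 1) = 3.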
A-MSE and oracle tuning parameter
[Figure: four panels, (c) p = 50, (d) p = 170, (e) p = 290, (f) p = 500, each showing a difference of λ values (vertical axis from −0.1 to 0.1) against n = 50, 100, 200, 500.]
The oracle value of $\lambda$ is within the 95% confidence interval for the median of $\hat\lambda_{AMSE}$.
Simulated data
Comparison on simulated data
Compare 6 approaches:
StARS, AGNES, A-MSE (graph correlations), PC, AIC and BIC
Properties compared (table columns): penalized likelihood; uses network characteristics; subsampling; fully automatic; fast; very sparse graph estimates.
Marks per method (the column positions of the marks were lost in extraction):
PC: X X X X
A-MSE: X X X X
AGNES: X X X
StARS: X X
BIC: X X X X
AIC: X X X
Compare on 3 graph structure scenarios: hubs, power law and random networks.
Simulated data Graph topologies
Graph topologies in biological data
Networks with hubs. Typical in biological networks.
Power-law networks. The distribution of the number of connections $\xi$ of each node is
$$P(\xi = k) = \frac{k^{-\alpha}}{\varsigma(\alpha)}, \quad k \geq 1,$$
for some constant $\alpha$ and the normalizing function $\varsigma(\alpha)$.
Peng et al. (2009): $\alpha = 2.3$ provides a distribution that is close to what is expected in biological networks.
Random networks:
$$P(\xi = k) = \binom{p}{k}\theta^{k}(1-\theta)^{p-k},$$
where the parameter $\theta$ determines the proportion of edges (or sparsity) in the graph.
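Both degree distributions can be evaluated numerically. For the power law to sum to one over $k \geq 1$, the normalizing function must equal $\sum_{k \geq 1} k^{-\alpha}$, the Riemann zeta function. The sketch below approximates it by a partial sum plus an integral tail correction, which is an assumption made to stay within the standard library; the function names are illustrative.

```python
from math import comb

def zeta_approx(alpha, terms=10000):
    """Approximate sum_{k>=1} k^(-alpha) by a partial sum plus an integral tail."""
    partial = sum(k ** -alpha for k in range(1, terms + 1))
    tail = (terms + 0.5) ** (1 - alpha) / (alpha - 1)  # approximates the remaining sum
    return partial + tail

def power_law_pmf(k, alpha, norm):
    """P(xi = k) = k^(-alpha) / norm for k >= 1."""
    return k ** -alpha / norm

def binomial_pmf(k, p, theta):
    """P(xi = k) = C(p, k) theta^k (1 - theta)^(p - k) for the random network."""
    return comb(p, k) * theta ** k * (1 - theta) ** (p - k)

alpha = 2.3  # value quoted from Peng et al. (2009) on the slide
norm = zeta_approx(alpha)
power_law_total = sum(power_law_pmf(k, alpha, norm) for k in range(1, 10001))
binomial_total = sum(binomial_pmf(k, 50, 0.1) for k in range(51))
```

With α = 2.3 the power-law probabilities decay quickly, so most nodes have very few connections while a small number have many.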
Examples of simulated graphs
[Figure: examples of simulated graphs. (g) p = 50, hubs-based; (h) p = 170, hubs-based; (i) p = 290, hubs-based; (j) p = 50, power-law; (k) p = 170, power-law; (l) p = 290, power-law.]
Simulated data Performance
Average ranks for the MSE of the precision matrix
                  Hubs-based                   Power law
n             50    100    200    500      50    100    200    500

dimension p = 50
AGNES        3.05   3.55   4.06   4.40    3.12   3.73   4.40   4.71
A-MSE        4.33   4.90   5.22   5.38    4.92   5.47   5.67   5.78
PC           5.23   5.80   5.58   5.15    4.58   5.13   4.85   4.49
StARS        1.27   1.49   1.18   1.28    1.17   1.43   1.04   1.07
BIC          5.38   3.73   3.14   3.06    5.33   3.66   3.08   3.02
AIC          1.73   1.52   1.82   1.73    1.90   1.58   1.96   1.92

dimension p = 500
AGNES        2.13   3.00   3.92   4.30    2.11   3.00   3.62   4.11
A-MSE        4.28   4.78   5.13   5.35    4.81   5.25   5.27   5.47
PC           4.94   5.97   5.60   4.85    4.63   5.67   5.73   5.39
StARS        1.00   1.01   1.00   1.00    1.00   1.00   1.00   1.00
BIC          5.78   4.25   3.31   3.32    5.55   4.08   3.38   3.03
AIC          2.88   2.00   2.05   2.18    2.90   2.00   2.00   2.00
Average ranks for the MSE of the dissimilarity matrix
                  Hubs-based                   Power law
n             50    100    200    500      50    100    200    500

dimension p = 50
AGNES        2.88   2.70   2.23   2.09    3.60   3.02   2.38   2.09
A-MSE        2.83   2.47   1.65   1.44    2.12   1.65   1.20   1.41
PC           3.52   3.67   2.75   2.68    2.42   2.22   2.53   2.52
StARS        4.16   4.58   5.81   5.72    5.70   5.58   5.97   5.96
BIC          3.83   3.05   3.41   3.79    2.13   3.12   3.88   3.98
AIC          3.77   4.54   5.16   5.28    5.02   5.42   5.03   5.04

dimension p = 170
AGNES        3.52   2.98   2.12   1.73    4.31   3.68   3.06   2.32
A-MSE        2.62   2.04   1.65   1.45    2.14   1.58   1.45   1.40
PC           2.46   2.49   3.11   3.83    2.12   1.62   1.73   2.32
StARS        6.00   5.78   6.00   6.00    6.00   6.00   6.00   6.00
BIC          2.14   2.52   3.14   3.34    1.80   3.12   3.77   3.97
AIC          4.26   5.18   4.98   4.65    4.62   5.00   5.00   5.00

dimension p = 500
AGNES        4.83   3.25   2.06   1.60    4.89   4.00   3.38   2.56
A-MSE        2.51   1.80   1.94   2.00    2.09   1.72   1.33   1.42
PC           2.04   3.12   3.69   3.83    2.32   1.48   1.68   2.06
StARS        6.00   6.00   6.00   6.00    6.00   6.00   6.00   6.00
BIC          1.51   1.95   2.40   2.78    1.60   2.80   3.61   3.97
AIC          4.11   4.88   4.92   4.79    4.10   5.00   5.00   5.00
True discovery rate TDR = TP/(TP + FP)
[Figure: TDR (vertical axis from 0.0 to 1.0) against n = 50, 100, 200, 500, in panels for p = 50, 170, 290 and 500; methods: AGNES, A-MSE, PC, StARS, BIC, AIC.]
TDR increases with n for AGNES, A-MSE and PC, and decreases for AIC and BIC.
ROC curves
[Figure: six panels of ROC curves, TPR (vertical axis from 0.0 to 0.6) against FPR, with a "METHOD" legend.]
Dots: optimal graph selected by the corresponding method.
Simulated data Summary
Summary
AGNES is the best approach to recover global network characteristics (e.g. the proportion of edges, Mean Geodesic Distance) but generally leads to complex graphs that are difficult to interpret.
Augmented MSE gives sparser graphs than AGNES and achieves better results in estimating the adjacency matrix A; its graphs are more interpretable.
Path Connectivity is computationally the fastest method and is only slightly worse than A-MSE in terms of MSE(A). It generally yields simple graph structures which are easier to interpret.
The choice of method depends on the relative cost of False Positives compared to that of True Positives.
Tumour gene expression data
Gene expression data set, colorectal tumour study (Hinoue et al., 2012).
25 patients.
Paired samples: gene expression profiling is obtained in each patient for a colorectal tumour sample and its healthy adjacent colonic tissue.
Total number of genes: 25,000.
7,579 genes were analysed (selected as differentially expressed between the conditions).
Dependence structure for tumour gene expression data: healthy
[Figure: estimated networks with nodes coloured by cluster. Path Connectivity: clusters 1 to 10. A-MSE: clusters 1 to 17.]
Dependence structure for tumour gene expression data: tumour
[Figure: estimated networks with nodes coloured by cluster. Path Connectivity: clusters 1 to 13. A-MSE: clusters 1 to 15.]
PC graph for gene expression data
10 clusters in the healthy samples.
13 clusters in the tumour samples.
Overlap between cluster 4 in the healthy samples (84 genes) and cluster 2 in the tumour samples (88 genes): they share 38 genes.
Overlap expected by chance: ∼ 4.45 genes.
Genes in Cluster 4 (normal) and Cluster 2 (tumour):
P53-signaling pathway (P53 being the classical cancer gene)
DNA replication
adaptive immune system
Summary
Summary and future work
Summary
Propose a network-based method to choose the hyperparameter in Gaussian graphical models.
Estimation of the conditional dependence graph is more stable than with approaches which depend on Ω only via $\|\Omega\|_1$.
Estimated graphs are more interpretable.
Choice of method should be determined by the relative cost of FP vs TP.
Bayesian interpretation? A point estimator under a modularised DAG.
R package: "GMRPS"; the paper is on arXiv:1509.05326.
Current and future work
Asymptotic/non-asymptotic properties. In particular, given n and p, how large is the conditional dependence graph that can be estimated reliably?
Other risk functions, notably based on second (and other) eigenvalues of A.
Test for the difference between the conditional dependence graphs in different groups of samples.
"Differential" network: difference between networks in two conditions.
References
Cai, T., W. Liu, and X. Luo (2011). A Constrained l1 Minimization Approach to Sparse Precision Matrix Estimation. Journal of the American Statistical Association 106(494), 594–607.
Costa, L. and F. Rodrigues (2007). Characterization of complex networks: A survey of measurements. Advances in Physics 56(1), 167–242.
Estrada, E. (2011). The structure of complex networks. New York: Oxford University Press.
Hinoue, T., D. J. Weisenberger, C. P. E. Lange, H. Shen, H.-M. Byun, D. Van Den Berg, S. Malik, F. Pan, H. Noushmehr, C. M. van Dijk, R. a. E. M. Tollenaar, and P. W. Laird (2012, February). Genome-scale analysis of aberrant DNA methylation in colorectal cancer. Genome Research 22(2), 271–82.
Kaufman, L. and P. Rousseeuw (2009). Finding groups in data: an introduction to cluster analysis. New Jersey: John Wiley & Sons.
Liu, H., K. Roeder, and L. Wasserman (2011). Stability approach to regularization selection (StARS) for high dimensional graphical models. Journal of Computational and Graphical Statistics, 1.
Meinshausen, N. and P. Bühlmann (2010). Stability Selection. Journal of the Royal Statistical Society, Series B 72, 417–473.
Peng, J., P. Wang, N. Zhou, and J. Zhu (2009, June). Partial Correlation Estimation by Joint Sparse Regression Models. Journal of the American Statistical Association 104(486), 735–746.
Thank you!
Simulated data
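$$Y_i \sim N_p(0, \Omega^{-1}), \quad i = 1, \dots, n$$
3 graph structure scenarios: hubs, power law and random networks.
Then
$$\Omega = \Omega^{(0)} + \delta I,$$
where the off-diagonal elements of $\Omega$ are (Cai et al., 2011)
$$\Omega^{(0)}_{ij} = \begin{cases} \mathrm{Unif}(0.5, 0.9) & \text{if } A_{ij} = 1 \text{ and } \mathrm{Bern}(0.5) = 1; \\ \mathrm{Unif}(-0.9, -0.5) & \text{if } A_{ij} = 1 \text{ and } \mathrm{Bern}(0.5) = 0; \\ 0 & \text{if } A_{ij} = 0, \end{cases}$$
with $\delta$ such that $\Omega$ is a positive definite matrix.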
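The construction of $\Omega$ from an adjacency matrix can be sketched as below. The slides only require that $\delta$ make $\Omega$ positive definite; choosing $\delta$ as the largest absolute row sum plus a margin (strict diagonal dominance, which guarantees positive definiteness for a symmetric matrix) is an assumption of this sketch, as are the function name, the `margin` parameter and the zero diagonal of $\Omega^{(0)}$.

```python
import random

def simulate_precision(adj, margin=0.1, seed=0):
    """Build Omega = Omega0 + delta * I from a 0/1 adjacency matrix."""
    rng = random.Random(seed)
    p = len(adj)
    omega = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i):
            if adj[i][j]:
                magnitude = rng.uniform(0.5, 0.9)            # |entry| in [0.5, 0.9]
                sign = 1.0 if rng.random() < 0.5 else -1.0   # Bern(0.5) sign
                omega[i][j] = omega[j][i] = sign * magnitude
    # Assumption: delta exceeds every absolute row sum, so Omega is strictly
    # diagonally dominant and hence positive definite.
    delta = max(sum(abs(x) for x in row) for row in omega) + margin
    for i in range(p):
        omega[i][i] = delta
    return omega

# Toy hub graph: node 0 connected to nodes 1, 2 and 3.
hub_adj = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
omega = simulate_precision(hub_adj)
```

Diagonal dominance is a sufficient condition that is easy to verify without an eigenvalue routine; a smaller $\delta$ based on the smallest eigenvalue of $\Omega^{(0)}$ would also satisfy the requirement on the slide.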
Each simulation is repeated 50 times.
Path connectivity and 2nd eigenvalue of A
[Figure: two curves plotted against λ over the range 0.30 to 0.55, with legend entries "H(λ) 100" and "evalue2" (the second eigenvalue of A, second axis from 3.0 to 4.0).]