Identification of Molecular Mechanisms that Drive Interindividual Variability Using
Mediation Analysis
Joshua Millstein, Assistant Professor
Biostatistics, Preventive Medicine
University of Southern California
Example: Food Allergy
rs7192, rs9275596
HLA-DRB1, HLA-DQB1 DNA Methylation
Other Exposures?
Peanuts IgE reactivity
HLA-DRB1, HLA-DQB1 Gene Expression?
Other Exposures?
Hong, et al. 2015.
Example: Food Allergy
Hong, et al. 2015.
DNA : Outcome (GWAS)
DNA : Meth, Meth : Outcome
Causal Inference Test (CIT)
Example: Personalized Diabetes Therapy
rs552668
ADRA2A Expression
Yohimbine
Metformin T2D Control
Tang, et al. 2014.
Other Exposures?
Example: Personalized Diabetes Therapy
Tang, et al. 2014.
DNA : Expr, DNA : Outcome, Expr : Outcome
CIT
Yohimbine
Example: Immune Response to Influenza Vaccination
Mult. SNPs
Expr 20 Genes
Other Exposures?
Infl. Vaccine Antibody Response
Franco, et al. 2013.
Example: Immune Response to Influenza Vaccination
Franco, et al. 2013.
Vaccine : mRNA, DNA : mRNA
CIT
mRNA : Antibody Response
Causality Conditions
1) L causes T: $L \rightarrow T$
2) L explains variation in G not explained by T: $L \sim G \mid T$
3) G explains variation in T not explained by L: $G \sim T \mid L$
4) The predictive power of L on T is explained by G: $L \perp T \mid G$
[Figure: path diagrams of candidate relationships among L, G, and T]
Millstein, et al. 2009. BMC Genetics
Component Tests
A working mathematical definition of ‘causal’ is described by a set of conditions within a linear modeling framework:
Four component hypothesis tests:
Standard F-tests (partial F-tests) can be used for tests 1-3; however, test 4 is an equivalence testing problem.
Equivalence testing requires defining boundaries within which the parameter is sufficiently close to the target.
Non-significance does not equate to significant equivalence.
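The difference between non-significance and equivalence can be illustrated with a generic two one-sided tests (TOST) sketch. This is a swapped-in textbook technique using a normal approximation, not the procedure the CIT uses for test 4; the function names, the margin, and the example estimates are all hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(estimate, std_error):
    """Usual test of H0: beta = 0 against H1: beta != 0 (normal approximation)."""
    z = NormalDist()
    return 2.0 * (1.0 - z.cdf(abs(estimate) / std_error))

def tost_p_value(estimate, std_error, margin):
    """Equivalence test of H0: |beta| >= margin against H1: |beta| < margin.

    Runs two one-sided tests and reports the larger p-value.
    """
    z = NormalDist()
    p_lower = 1.0 - z.cdf((estimate + margin) / std_error)  # H0: beta <= -margin
    p_upper = z.cdf((estimate - margin) / std_error)        # H0: beta >= +margin
    return max(p_lower, p_upper)

# Hypothetical numbers: the same small estimate, measured imprecisely and precisely.
margin = 0.1
imprecise = {"estimate": 0.05, "std_error": 0.2}
precise = {"estimate": 0.05, "std_error": 0.01}

p_null_imprecise = two_sided_p_value(**imprecise)
p_equiv_imprecise = tost_p_value(margin=margin, **imprecise)
p_equiv_precise = tost_p_value(margin=margin, **precise)

print("imprecise: p(beta = 0) =", round(p_null_imprecise, 3),
      "p(equivalence) =", round(p_equiv_imprecise, 3))
print("precise:   p(equivalence) =", p_equiv_precise)
```

With the imprecise estimate, neither a difference from zero nor equivalence can be claimed; only the precise estimate supports equivalence within the chosen margin.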
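$T = \alpha_1 + \beta_1 L + \varepsilon_1$
$G = \alpha_2 + \beta_2 L + \beta_3 T + \varepsilon_2$
$T = \alpha_3 + \beta_4 G + \beta_5 L + \varepsilon_3$
1) $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
2) $H_0: \beta_2 = 0$, $H_1: \beta_2 \neq 0$
3) $H_0: \beta_4 = 0$, $H_1: \beta_4 \neq 0$
4) $H_0: \beta_5 \neq 0$, $H_1: \beta_5 = 0$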
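To see what the three regressions look like when the causal model holds, the sketch below simulates data with L affecting G and G affecting T, then fits each model by ordinary least squares. The sample size, effect sizes, seed, and helper functions are assumptions made for illustration; this is plain Python, not the 'cit' package, and it reports coefficient estimates only, without the F-tests or the equivalence test.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def ols(predictors, response):
    """Least-squares fit with an intercept; returns [intercept, slope_1, ...]."""
    columns = [[1.0] * len(response)] + predictors
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * y for u, y in zip(ci, response)) for ci in columns]
    return solve_linear_system(xtx, xty)

# Simulated causal chain L -> G -> T (hypothetical effect sizes of 1).
rng = random.Random(1)
n = 2000
L = [rng.choice([0, 1, 2]) for _ in range(n)]   # genotype coded 0, 1, 2
G = [l + rng.gauss(0, 1) for l in L]            # mediator depends on L
T = [g + rng.gauss(0, 1) for g in G]            # outcome depends on G only

_, b1 = ols([L], T)          # T = a1 + b1 L + e1
_, b2, b3 = ols([L, T], G)   # G = a2 + b2 L + b3 T + e2
_, b4, b5 = ols([G, L], T)   # T = a3 + b4 G + b5 L + e3

print("b1 =", round(b1, 3), "b2 =", round(b2, 3),
      "b4 =", round(b4, 3), "b5 =", round(b5, 3))
```

Under this simulated causal chain, b1, b2, and b4 should be clearly non-zero while b5 should be close to zero, which is the pattern the four component tests look for.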
Millstein, et al. 2009. BMC Genetics
Causal Inference Test (CIT)
Intersection/union test:
• Union of the acceptance regions
• Intersection of the rejection regions
• Conservative test of the union of null hypotheses
• P-value is the max of component test p-values
Software freely available from CRAN:
• R package, ‘cit’
• https://cran.r-project.org/web/packages/cit/index.html
• Continuous outcome
• Continuous potential mediator
• Single instrumental variable with values, {0, 1, 2}
pCIT = max(p1, p2, p3, p4)
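The intersection/union rule can be sketched in a few lines: the overall p-value is the largest component p-value, so the CIT is significant only when all four component tests are. The function name and the example p-values are hypothetical, and this is not the interface of the 'cit' package.

```python
def cit_p_value(p1, p2, p3, p4):
    """Intersection-union p-value: the largest of the four component p-values."""
    component_p_values = (p1, p2, p3, p4)
    for p in component_p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return max(component_p_values)

# Hypothetical component p-values, for illustration only.
p_cit = cit_p_value(0.001, 0.004, 0.0002, 0.03)
print(p_cit)
```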
Millstein, et al. 2009. BMC Genetics
Casella & Berger, 2002.
CIT False Discovery Rate (qCIT)
Remaining issues:
• How to adjust for multiple testing?
• What if parametric assumptions fail?
• Not obvious how to estimate empirical null distribution using permutation
Solution: novel intersection/union type test with FDR
• Let P[ TD1 ] denote P[ true discovery for component test 1 | FDR = q1 ]
• P[ TD1 ] = 1 – P[ FD1 ] = 1 – q1
• True discovery union = P[ all TDs ] = P[ TD1 ] * P[ TD2 ] *…
• qCIT = P[ any false discovery ] = 1 – P[ all TDs ]
$q_{CIT} = 1 - (1 - q_1)(1 - q_2)(1 - q_3)(1 - q_4) \geq \max(q_1, q_2, q_3, q_4)$
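A minimal sketch of the qCIT combination rule follows, assuming the four component q-values are already available. Multiplying the true-discovery probabilities treats the component tests as if they were independent; the function name and the example q-values are hypothetical.

```python
from math import prod

def q_cit(q1, q2, q3, q4):
    """Combine component FDRs: 1 minus the product of (1 - q_i)."""
    component_q_values = (q1, q2, q3, q4)
    for q in component_q_values:
        if not 0.0 <= q <= 1.0:
            raise ValueError("q-values must lie in [0, 1]")
    return 1.0 - prod(1.0 - q for q in component_q_values)

# Hypothetical component q-values, for illustration only.
q_values = (0.01, 0.02, 0.005, 0.05)
q = q_cit(*q_values)
print(round(q, 6), ">=", max(q_values))
```

The combined value is never smaller than the largest component q-value, matching the inequality in the formula.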
Millstein FDR estimators
Permutation-based (non-parametric):
$\widehat{FDR} = \frac{S^*}{S} \cdot \frac{1 - S/m}{1 - S^*/m}$
Parametric:
$\widehat{FDR} = \frac{m\alpha}{S} \cdot \frac{1 - S/m}{1 - \alpha}$
m: total number of tests conducted
$\alpha$: significance level
S: number of tests with p-value < $\alpha$
S*: number of tests with p-value < $\alpha$ from a permuted replicate dataset
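The two estimators can be written directly from the definitions of m, S, S*, and the significance level. The sketch below uses a single permuted replicate and hypothetical counts; the function names are illustrative and are not the 'fdrci' package interface.

```python
def fdr_permutation(s, s_star, m):
    """Permutation-based estimate: (S*/S) * (1 - S/m) / (1 - S*/m)."""
    if s <= 0 or s_star >= m:
        raise ValueError("need S > 0 and S* < m")
    return (s_star / s) * (1.0 - s / m) / (1.0 - s_star / m)

def fdr_parametric(s, alpha, m):
    """Parametric estimate: (m * alpha / S) * (1 - S/m) / (1 - alpha)."""
    if s <= 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need S > 0 and 0 < alpha < 1")
    return (m * alpha / s) * (1.0 - s / m) / (1.0 - alpha)

# Hypothetical counts: 1000 tests, 100 below alpha in the observed data,
# 10 below alpha in one permuted replicate.
m, s, s_star, alpha = 1000, 100, 10, 0.01
fdr_perm = fdr_permutation(s, s_star, m)
fdr_param = fdr_parametric(s, alpha, m)
print(round(fdr_perm, 4), round(fdr_param, 4))
```

In this example the permuted count S* equals m times the significance level, so the two estimates coincide; they differ whenever the permuted data depart from the nominal rate.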
Advantages:
1. Non-parametric
2. Powerful
3. Confidence Intervals for FDR
4. R package, ‘fdrci’
5. https://cran.r-project.org/web/packages/fdrci/index.html
Millstein & Volfson, 2013. Frontiers in Genetics
Permutation-Based FDR Example
Extensions to the CIT software:
1. Binary as well as continuous outcomes
2. Multiple binary and/or continuous instrumental variables
3. Permutation-based FDR
4. Parametric FDR
5. Adjustment covariates
Millstein & Volfson, 2013. Frontiers in Genetics
Causal Inference Test (CIT)
[Figure: path diagrams of the causal, reactive, and independent models relating L, G, and T. Underlying model: causal]