Generalized Linear Mixed Model (GLMM) & Weighted Sum Test (WST): Detecting Association between Rare Variants and Complex Traits

Qunyuan Zhang, Ingrid Borecki, Michael A. Province

Division of Statistical Genomics, Washington University School of Medicine, St. Louis, Missouri, USA

Page 1: Qunyuan Zhang,  Ingrid Borecki, Michael A. Province  Division of Statistical Genomics

Generalized Linear Mixed Model (GLMM) & Weighted Sum Test (WST)

Detecting Association between Rare Variants and Complex Traits

Qunyuan Zhang, Ingrid Borecki, Michael A. Province

 

Division of Statistical Genomics

Washington University School of Medicine

St. Louis, Missouri, USA


Page 2

Collapsing/Collective Testing Methods

CAST (Morgenthaler and Thilly, 2006)

CMC (Li and Leal, 2008)

WSS (Madsen and Browning, 2009)

VT (Price et al, 2010)

aSum (Han and Pan, 2010)

KBAC (Liu and Leal, 2010)

C-alpha (Neale et al, 2011)

RBT (Ionita-Laza et al, 2011)

PWST (Zhang et al, 2011)

SKAT (Wu et al, 2011) …

Page 3

GLMM & WST

$$Y = \alpha + \beta \sum_{i=1}^{m} w_i g_i + X\tau + Z\gamma + \varepsilon$$

Y : quantitative trait or logit(binary trait)

α : intercept

β : regression coefficient of the weighted sum

m : number of RVs to be collapsed

w_i : weight of variant i

g_i : genotype (recoded) of variant i

Σ w_i g_i : weighted sum (WS)

X : covariate(s), such as population structure variable(s)

τ : fixed effect(s) of X

Z : design matrix corresponding to γ

γ : random polygene effects for individual subjects, γ ~ N(0, G), G = 2σ²K, where K is the kinship matrix and σ² is the additive polygenic variance

ε : residual

Page 4

Weighted Sum

$$\sum_{i=1}^{m} w_i g_i$$

Some special instances:

Morgenthaler and Thilly’s CAST, w_i = 1 for all RVs;

Li and Leal’s CMC, w_i = 1 for all RVs, with the sum capped at 1;

Madsen and Browning’s WSS, w_i based on allele frequency in controls;

Han and Pan’s aSum test, w_i = 1 or -1, according to the direction of the regression coefficient and a p-value cutoff;

Zhang et al’s PWST, w_i defined as a rescaled left-tailed p-value
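To make the special instances concrete, here is a minimal sketch of the weighted sum under three of the weighting schemes. The function names are hypothetical, genotypes are assumed to be coded as minor-allele counts, and the WSS weight follows the frequency-based form described by Madsen and Browning as recalled here (an assumption, not taken from the slides).

```python
import math

def weighted_sum(genotypes, weights):
    """Per-subject weighted sum: sum over variants of w_i * g_i."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

def cast_score(genotypes):
    """CAST-style score: w_i = 1 for every rare variant."""
    return weighted_sum(genotypes, [1.0] * len(genotypes[0]))

def cmc_score(genotypes):
    """CMC-style score: w_i = 1, with the sum capped at 1 (carrier indicator)."""
    return [min(score, 1.0) for score in cast_score(genotypes)]

def wss_weights(genotypes, is_control):
    """Frequency-based weights in the style of WSS (assumed form).

    q_i is a smoothed allele frequency estimated in controls only, and the
    weight is 1 / sqrt(n * q_i * (1 - q_i)), so rarer variants weigh more.
    """
    n = len(genotypes)
    n_controls = sum(1 for flag in is_control if flag)
    weights = []
    for i in range(len(genotypes[0])):
        alleles_in_controls = sum(
            row[i] for row, flag in zip(genotypes, is_control) if flag
        )
        q = (alleles_in_controls + 1) / (2 * n_controls + 2)
        weights.append(1.0 / math.sqrt(n * q * (1.0 - q)))
    return weights

# Hypothetical toy data: 4 subjects x 2 variants; the first three are controls.
toy_genotypes = [[1, 0], [0, 0], [1, 0], [0, 1]]
toy_is_control = [True, True, True, False]
toy_wss = weighted_sum(toy_genotypes, wss_weights(toy_genotypes, toy_is_control))
```

All three scores feed the same regression term; only the weight vector (and, for CMC, the cap) changes.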

Page 5

More Weighting Methods

Based on allele frequency: continuous or binary (0,1) weight, variable threshold;

Based on function annotation/prediction;

Based on sequencing quality (coverage, mapping quality, genotyping quality etc.);

Data-driven: using both genotype and phenotype data, learning weights from the data, permutation test;

Any combination …
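The data-driven option needs a permutation test because the weights are learned from the same trait values they are tested against. The sketch below illustrates the idea under loud assumptions: subjects are unrelated (exchangeable), the weight rule is a hypothetical sign rule (+1 or -1 by the direction of the carrier versus non-carrier mean difference), and the statistic is the absolute cross-product of the weighted sum and the trait. None of these choices is taken from the slides.

```python
import random

def learn_sign_weights(genotypes, trait):
    """Hypothetical data-driven weights: sign of (carrier mean - non-carrier mean)."""
    weights = []
    for i in range(len(genotypes[0])):
        carriers = [y for row, y in zip(genotypes, trait) if row[i] > 0]
        others = [y for row, y in zip(genotypes, trait) if row[i] == 0]
        if not carriers or not others:
            weights.append(0.0)  # direction undefined, variant ignored
            continue
        diff = sum(carriers) / len(carriers) - sum(others) / len(others)
        weights.append(1.0 if diff > 0 else -1.0 if diff < 0 else 0.0)
    return weights

def statistic(genotypes, trait):
    """Learn weights from (genotypes, trait), then score the weighted sum."""
    weights = learn_sign_weights(genotypes, trait)
    ws = [sum(w * g for w, g in zip(weights, row)) for row in genotypes]
    mean_ws = sum(ws) / len(ws)
    mean_y = sum(trait) / len(trait)
    return abs(sum((s - mean_ws) * (y - mean_y) for s, y in zip(ws, trait)))

def permutation_p_value(genotypes, trait, n_perm=999, seed=0):
    """Permute the trait, re-learn the weights each time, compare statistics."""
    rng = random.Random(seed)
    observed = statistic(genotypes, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 6 subjects x 2 variants.
toy_genotypes = [[1, 0], [0, 1], [0, 0], [0, 0], [1, 0], [0, 1]]
toy_trait = [2.9, 1.1, 2.0, 2.1, 3.0, 0.9]
toy_p = permutation_p_value(toy_genotypes, toy_trait, n_perm=199, seed=1)
```

Re-learning the weights inside every permutation is the key design choice: reusing the weights learned from the observed data would make the observed statistic look more extreme than its permuted counterparts.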

Page 6

Application (1)

Using the rescaled left-tailed p-value as the weight incorporates the directionality of effects into the test: the P-value Weighted Sum Test (PWST; Zhang et al, 2011, Genetic Epidemiology).

Page 7

Effect    (+)  (+)  (.)  (.)  (-)  (-)
Subject   V1   V2   V3   V4   V5   V6   Collapsed   Trait
1         1    0    0    0    0    0    1           3.00
2         0    1    0    0    0    0    1           3.10
3         0    0    0    0    0    0    0           1.95
4         0    0    0    0    0    0    0           2.00
5         0    0    0    0    0    0    0           2.05
6         0    0    0    0    0    0    0           2.10
7         0    0    1    0    0    0    1           2.00
8         0    0    0    1    0    0    1           2.10
9         0    0    0    0    1    0    1           0.95
10        0    0    0    0    0    1    1           1.00

When there are causal (+), non-causal (.), and causal (-) variants, the power of the collapsing test drops significantly.

Page 8

P-value Weighted Sum Test (PWST)

Effect      (+)    (+)    (.)    (.)    (-)    (-)
Subject     V1     V2     V3     V4     V5     V6     Collapsed   PWS     Trait
1           1      0      0      0      0      0      1           0.86    3.00
2           0      1      0      0      0      0      1           0.90    3.10
3           0      0      0      0      0      0      0           0.00    1.95
4           0      0      0      0      0      0      0           0.00    2.00
5           0      0      0      0      0      0      0           0.00    2.05
6           0      0      0      0      0      0      0           0.00    2.10
7           0      0      1      0      0      0      1           -0.02   2.00
8           0      0      0      1      0      0      1           0.08    2.10
9           0      0      0      0      1      0      1           -0.90   0.95
10          0      0      0      0      0      1      1           -0.88   1.00
t           1.61   1.84   -0.04  0.11   -1.84  -1.72
p(x≤t)      0.93   0.95   0.49   0.54   0.05   0.06
2*(p-0.5)   0.86   0.90   -0.02  0.08   -0.90  -0.88

Rescaled left-tail p-value, in [-1, 1], is used as weight
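The weight construction on this slide can be traced step by step. The sketch below uses the slide's ten subjects and assumes a pooled two-sample t statistic (carriers versus non-carriers) with n - 2 degrees of freedom; that choice is an assumption about how the t row was obtained, not a statement of the published PWST procedure. To stay within the standard library, the Student t CDF is approximated by Simpson's rule; `scipy.stats.t.cdf` would be the usual choice. It also assumes every variant has at least one carrier.

```python
import math

# Toy data transcribed from the slide: 10 subjects, 6 rare variants (V1..V6).
TRAIT = [3.00, 3.10, 1.95, 2.00, 2.05, 2.10, 2.00, 2.10, 0.95, 1.00]
CARRIER_OF = {0: 0, 1: 1, 6: 2, 7: 3, 8: 4, 9: 5}  # subject index -> variant index
GENOTYPES = [[1 if CARRIER_OF.get(s) == v else 0 for v in range(6)]
             for s in range(10)]

def pooled_t(carriers, others):
    """Two-sample t statistic with pooled variance (carriers minus others)."""
    n1, n0 = len(carriers), len(others)
    m1, m0 = sum(carriers) / n1, sum(others) / n0
    ss = sum((y - m1) ** 2 for y in carriers) + sum((y - m0) ** 2 for y in others)
    pooled_var = ss / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))

def t_cdf(t, df, steps=2000):
    """Left-tail probability P(x <= t) for Student's t, via Simpson's rule."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps  # negative h for negative t gives a negative integral
    total = pdf(0.0) + pdf(t)
    for k in range(1, steps):
        total += pdf(k * h) * (4 if k % 2 else 2)
    return 0.5 + total * h / 3

def pwst_weights(genotypes, trait):
    """Per-variant t statistic, left-tail p-value, and weight 2 * (p - 0.5)."""
    n = len(trait)
    t_stats, p_left, weights = [], [], []
    for v in range(len(genotypes[0])):
        carriers = [y for row, y in zip(genotypes, trait) if row[v] > 0]
        others = [y for row, y in zip(genotypes, trait) if row[v] == 0]
        t = pooled_t(carriers, others)
        p = t_cdf(t, n - 2)
        t_stats.append(t)
        p_left.append(p)
        weights.append(2 * (p - 0.5))  # rescale [0, 1] to [-1, 1]
    return t_stats, p_left, weights

T_STATS, P_LEFT, WEIGHTS = pwst_weights(GENOTYPES, TRAIT)
PWS = [sum(w * g for w, g in zip(WEIGHTS, row)) for row in GENOTYPES]
```

Because the slide rounds p to two decimals before rescaling, weights computed from unrounded p-values can differ from the tabulated ones in the second decimal place.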

Page 9

P-value Weighted Sum Test

The power of the collapsing test is retained even when there are bidirectional effects.

Page 10

Application (2)

Adjusting for relatedness in family data for non-data-driven tests of rare variants.

Unadjusted:

$$Y = \alpha + \beta \sum_{i=1}^{m} w_i g_i + \varepsilon$$

Adjusted:

$$Y = \alpha + \beta \sum_{i=1}^{m} w_i g_i + Z\gamma + \varepsilon, \qquad \gamma \sim N(0, 2\sigma^2 K)$$
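As a sketch of what the random effect changes for a quantitative trait, the adjusted model implies a non-diagonal trait covariance, and the fixed effects can then be estimated by generalized least squares. The symbols W (design matrix with an intercept column and the weighted-sum column), θ = (α, β)ᵀ, and σ_e² (residual variance, assuming ε ~ N(0, σ_e² I)) are notation introduced here for illustration and do not appear on the slides.

$$
\operatorname{Var}(Y) = V = 2\sigma^2 Z K Z^{\top} + \sigma_e^2 I, \qquad
\hat{\theta} = (W^{\top} V^{-1} W)^{-1} W^{\top} V^{-1} Y, \qquad
\operatorname{Var}(\hat{\theta}) = (W^{\top} V^{-1} W)^{-1}
$$

The unadjusted model corresponds to V = σ_e² I. When relatives share both rare variants and polygenic background, dropping the kinship term can understate the variance of the estimate of β, which is one way to read the type-1 error inflation seen when family structure is ignored.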

Page 11

Q-Q Plots of -log10(P) under the Null

Li & Leal’s collapsing test, ignoring family structure: inflation of type-1 error

Li & Leal’s collapsing test, modeling family structure via mixed model: inflation is corrected

(From Zhang et al, 2011, BMC Proc.)

Page 12

Application (3)

MMPT: Mixed Model-based Permutation Test

Adjusting for relatedness in family data for data-driven permutation tests of rare variants.

$$Y = \alpha + \beta \sum_{i=1}^{m} w_i g_i + Z\gamma + \varepsilon, \qquad \gamma \sim N(0, 2\sigma^2 K)$$

Model terms are marked as either permuted or non-permuted (subject IDs fixed).

For more detail, please see poster 37 …

Page 13

Q-Q Plots under the Null

Panels: WSS, SPWST, PWST, aSum

Permutation test, ignoring family structure: inflation of type-1 error

Page 14

Q-Q Plots under the Null

Panels: WSS, SPWST, PWST, aSum

Mixed model-based permutation test (MMPT), modeling family structure: inflation corrected

Page 15

Conclusion

GLMM-WST is powerful, flexible, and useful!