1
Cladistic Clustering of Haplotypes
in Association Analysis
Jung-Ying Tzeng
Aug 27, 2004
Department of Statistics & Bioinformatics Research Center
North Carolina State University
2
Simple Disorder vs. Complex Disorder
Peltonen and McKusick (2001). Science
3
Complex Disorders
Liability genes = genes containing variants that increase disease liability
Goal: look for such genes
• Rely more on epidemiological evidence
Association analysis
• Case-control studies
• Detect liability genes by searching for association between disease status and genetic variants
4
Genetic Markers
Instead of studying whole DNA sequences, we look at a subset of them: genetic markers
SNP: Single Nucleotide Polymorphism
• Pro: dense (about one SNP every 100-300 bp)
• Con: binary variants
Resolved by considering adjacent SNPs jointly
5
Haplotype-based Association Analysis
Haplotype = marker sequence (e.g., TCTC, CACA)
Haplotype-based association analysis
[Figure: table of haplotypes Hap 1, Hap 2, Hap 3, ..., Hap k against Case and Control]
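The haplotype-by-status table above is the input to a test of homogeneity. As a minimal sketch, assuming a plain Pearson chi-square statistic and made-up counts (the function name `chi_square_homogeneity` and all numbers are illustrative, not taken from the slides), the test on k haplotypes uses k - 1 degrees of freedom:

```python
def chi_square_homogeneity(case_counts, control_counts):
    """Pearson chi-square statistic for a k x 2 haplotype-by-status table."""
    if len(case_counts) != len(control_counts):
        raise ValueError("need one case and one control count per haplotype")
    n_case = sum(case_counts)
    n_control = sum(control_counts)
    n_total = n_case + n_control
    stat = 0.0
    for a, b in zip(case_counts, control_counts):
        row_total = a + b
        if row_total == 0:
            continue  # an unobserved haplotype contributes nothing
        for observed, col_total in ((a, n_case), (b, n_control)):
            expected = row_total * col_total / n_total
            stat += (observed - expected) ** 2 / expected
    # degrees of freedom: number of observed haplotypes minus one
    df = sum(1 for a, b in zip(case_counts, control_counts) if a + b > 0) - 1
    return stat, df


# Hypothetical haplotype counts (Hap 1..Hap 4) on 400 case and 400 control chromosomes.
cases = [120, 150, 80, 50]
controls = [150, 140, 70, 40]
stat, df = chi_square_homogeneity(cases, controls)
print(f"chi-square = {stat:.3f} on {df} degrees of freedom")
```

The degrees of freedom grow with the number of distinct haplotypes, which is the dimensionality issue the following slides address.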
6
Haplotype-based Association Analysis
Problem: findings are not replicable
• Under-powered (Lohmueller et al. 2003; Neale and Sham 2004)
Solution:
1. Use large samples (Lohmueller et al. 2003)
2. Reduce the dimension of the parameter space
7
Dimensionality
Haplotype distribution within a block (Daly et al. 2001, Nature Genetics)
Method I: Truncating
[Figure: haplotypes within a block, with tag SNPs marked]
8
Method II: Clustering (Molitor et al. 2003; Durrant et al. 2004)
Evolutionary tree of haplotypes
Minimize the haplotype distance within clusters
[Figure: evolutionary tree over the haplotypes 000000, 100000, 100001, 100011, 100101, 101001, 110000, 010000, 011001, 000100, 011000, 111000]
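To make "minimize the haplotype distance within clusters" concrete, here is a minimal sketch that assigns each haplotype from the tree to its nearest cluster center by Hamming distance. The choice of centers and the nearest-center rule are assumptions for illustration; this is not the algorithm of Molitor et al. or Durrant et al.

```python
def hamming(a, b):
    """Number of marker positions at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def assign_to_nearest(haplotypes, centers):
    """Group haplotypes around the closest center (ties go to the first center)."""
    clusters = {c: [] for c in centers}
    for h in haplotypes:
        nearest = min(centers, key=lambda c: hamming(h, c))
        clusters[nearest].append(h)
    return clusters


# The twelve haplotypes shown in the tree on this slide.
haplotypes = [
    "000000", "100000", "100001", "100011", "100101", "101001",
    "110000", "010000", "011001", "000100", "011000", "111000",
]
# Hypothetical cluster centers, chosen only for illustration.
centers = ["000000", "100000", "011000"]

clusters = assign_to_nearest(haplotypes, centers)
for center, members in clusters.items():
    print(center, "<-", members)
```

A purely distance-based rule like this ignores haplotype frequency, which is one motivation for the frequency-aware approach proposed later in the talk.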
9
Method II: Clustering
[Figure: evolutionary tree over the same twelve haplotypes, 000000 through 111000]
10
Method II: Clustering
[Figure: evolutionary tree over the same twelve haplotypes, 000000 through 111000]
11
Method III: Cladistic Grouping (Templeton 1995; Seltman et al. 2003)
Observed Hap = { 000, 001, 010, 100, 110, 101, 011, 111 }
[Figure: cladogram over the eight observed haplotypes (001, 101, 110, 010, 011, 000, 111, 100), drawn twice]
12
Desired Features
Include all samples
Incorporate both haplotype distance and age
• High frequency: ancient (Crandall & Templeton 1995)
• Low frequency: young
Allow uncertainty in inferring the underlying evolutionary relationship
13
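Proposed Approach: Cladistic Clustering
Possible Hap = { 000, 001, 010, 100, 110, 101, 011, 111 }
Haplotypes grouped by level:
• $\pi^{(0)}$: { 001, 011, 101 }
• $\pi^{(1)}$: { 000, 010, 111, 100 }
• $\pi^{(2)}$: { 110 }
[Figure: cladogram of the eight haplotypes; $B^{(2)}$ links level 2 to level 1 and $B^{(1)}$ links level 1 to level 0; edges labeled p, 1-p and q1, q2, 1-q1-q2]
One-step regrouping, starting from $\pi^{*(2)} = \pi^{(2)}$ at the outermost level:
$$\pi^{*(i)T} = \pi^{(i)T} + \pi^{*(i+1)T} B^{(i+1)}$$
Overall:
$$\pi^{*T} = \pi^{T} B = \begin{pmatrix} \pi^{(0)T} & \pi^{(1)T} & \pi^{(2)T} \end{pmatrix} \begin{pmatrix} I \\ B^{(1)} \\ B^{(2)} B^{(1)} \end{pmatrix}$$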
14
Issues
1. Determine the major nodes $\pi^{(0)}$
2. Construct the conditional allocating matrix $B^{(i)}$
15
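Conditional Allocating Matrix $B^{(i)}$
• $\pi^{(2)}$: { 110 }
• $\pi^{(1)}$: { 000, 010, 100, 111 }
• $\pi^{(0)}$: { 001, 011, 101 }
$$\pi^{*(1)T} = \pi^{(2)T} B^{(2)} + \pi^{(1)T}$$
$B^{(2)}$ has one row (110) and four columns (000, 010, 100, 111); each entry lies in [0, 1] and is the likelihood of a one-step movement, scaled by a constant c
[Figure: cladogram with $B^{(2)}$ moving 110 toward 111, 010, 100 and 000; the individual matrix entries are not recoverable from this transcript]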
16
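Conditional Allocating Matrix $B^{(i)}$
$$\pi^{*T} = \pi^{(0)T} + \pi^{(1)T} B^{(1)} + \pi^{(2)T} B^{(2)} B^{(1)}$$
$B^{(1)}$ has rows 100, 111, 010, 000 (level 1) and columns 101, 011, 001 (level 0)
[Figure: cladogram of the eight haplotypes; the individual entries of $B^{(1)}$ are not recoverable from this transcript]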
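The level-by-level regrouping can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the haplotype frequencies are made up, and the rule used to fill each allocating matrix (a row spreads its mass over one-step neighbors in the next level, in proportion to the neighbors' frequencies) is an assumption; the slides only say that entries reflect the likelihood of a one-step movement.

```python
def hamming(a, b):
    """Number of marker positions at which two haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def allocation_matrix(sources, targets, freq):
    """Assumed rule: row s allocates to one-step neighbors t, proportional to freq[t]."""
    matrix = []
    for s in sources:
        weights = [freq[t] if hamming(s, t) == 1 else 0.0 for t in targets]
        total = sum(weights)
        matrix.append([w / total if total else 0.0 for w in weights])
    return matrix


def vec_mat(vector, matrix):
    """Row vector times matrix, using plain lists."""
    return [sum(vector[i] * matrix[i][j] for i in range(len(vector)))
            for j in range(len(matrix[0]))]


# Levels as on the slides: level 0 (major nodes), level 1, level 2.
levels = [["001", "011", "101"], ["000", "010", "100", "111"], ["110"]]
# Hypothetical haplotype frequencies (sum to 1).
freq = {"001": 0.30, "011": 0.25, "101": 0.20, "000": 0.08,
        "010": 0.06, "100": 0.05, "111": 0.04, "110": 0.02}

# Start at the outermost level and fold its mass inward one step at a time:
# pi*(i) = pi(i) + pi*(i+1) B(i+1).
pi_star = [freq[h] for h in levels[-1]]
for i in range(len(levels) - 2, -1, -1):
    B_next = allocation_matrix(levels[i + 1], levels[i], freq)
    moved = vec_mat(pi_star, B_next)
    pi_star = [freq[h] + m for h, m in zip(levels[i], moved)]

grouped = dict(zip(levels[0], pi_star))
print(grouped)
```

Because every row of each allocating matrix sums to one here, the grouped frequencies over the three major nodes still sum to one; unrolling the loop reproduces the expanded form with the product $B^{(2)} B^{(1)}$.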
17
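Determine $\pi^{(0)}$
Information criteria
• Net Information (Shannon's information content)
[Formula partly garbled in this transcript: the Shannon term $\sum_{i=1}^{k} \pi_i \log_2(1/\pi_i)$ combined with a second term in $\log_2$, $k$ and $n$ whose exact form is not recoverable]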
18
Net Information and $\pi^{(0)}$
19
Association Analysis Based on $\pi^{*}$
Coalescent simulation (Hudson 2002):
• Prevalence = 0.01
• Relative risk = 2
• Frequency of the liability allele = (0.1, 0.3, 0.5)
• Location of the liability allele = (hot spot, blocky, very blocky)
• Draw 200 cases and 200 controls
Test of homogeneity based on $\pi^{*}_{cs}$ (cases) and $\pi^{*}_{cn}$ (controls)
20
Power and Type I error
[Figure: two panels, Gene Pelc and Gene IL01RB]
21
Summary
Provide a mechanism of cladistic clustering by $\pi^{*T} = \pi^{T} B$
• Combines the ideas of truncating and clustering
• Based on the evolutionary relationship, without reconstructing the cladogram
• Incorporates haplotype frequencies and distance in cluster assignment
• One-step conditional regrouping accommodates multiple-step regrouping: self-repeating, algebraically multiplicative
• Reserves $\pi^{(0)}$ based on information criteria
$\pi^{*}$ increases test efficiency
• Increased power even for large samples and haplotypes in block regions
22
End of Slides
23
Approach
Two stages:
• Stage I (Where): identify the susceptible regions across the genome (multiple testing problem)
Approaches based on haplotype similarity
• Stage II (Which): determine and pinpoint the specific liability variants
Study individual effects of groups of haplotypes
24
Strategies of Reducing Degrees of Freedom
I. Haplotype Similarity
• Van Der Meulen and te Meerman 1997; Bourgain et al. 2000-2002; Tzeng et al. 2003ab
• Search for extra haplotype sharing among cases
• Pro: 1 degree of freedom
• Con: does not study individual haplotype effects
• Usage: good for genome screening
25
Strategies of Reducing Degrees of Freedom
II. Haplotype Tagging (Johnson et al. 2001)
Haplotypes over 14 SNPs, with Freq(%) in the last column ("." = same allele as haplotype 1):
1 A C A C C C C C G G G C C G 45
2 . . . . . . . . . . . A . . 20
3 C T T G . T A T T A . . . . 13.25
4 . . . . . . . . . . . . . A 11.25
5 C . T . T . A . . . A A . . 3.75
6 . . . . . . . . . . . . T . 3.50
7 C . . . . . . . . . . . . . 1.50
8 C . T . T . A . . . . . . . 0.50
9 . T T G . T A T T A . . . . 0.50
The same haplotypes at the three tag SNPs (labels in parentheses as shown on the slide):
1 A C G
2 . A .
3 T . .
4 . . A
5 T A .
(1) . . .
(1) . . .
6 T . .
(6) T . .
• Pro: efficiently capture the major diversity
• Con: discard rare haplotypes
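As an illustration of what tag SNPs do, the sketch below searches for a smallest set of SNP columns that still tells the five most frequent haplotypes in the table apart. The brute-force search and the cut-off at five haplotypes are assumptions for illustration; this is not the selection procedure of Johnson et al. (2001), and several equally small sets may exist.

```python
from itertools import combinations

# Haplotype 1 from the table; other rows use "." for "same allele as haplotype 1".
reference = "ACACCCCCGGGCCG"
dotted = {
    2: "." * 11 + "A..",
    3: "CTTG.TATTA....",
    4: "." * 13 + "A",
    5: "C.T.T.A...AA..",
}


def expand(row):
    """Replace each '.' with the allele carried by haplotype 1."""
    return "".join(r if c == "." else c for c, r in zip(row, reference))


# The five most frequent haplotypes (1 through 5), fully spelled out.
common = {1: reference}
common.update({h: expand(row) for h, row in dotted.items()})


def smallest_tag_set(haplotypes):
    """Exhaustive search for a smallest set of columns separating all haplotypes."""
    seqs = list(haplotypes.values())
    n_snps = len(seqs[0])
    for size in range(1, n_snps + 1):
        for cols in combinations(range(n_snps), size):
            patterns = {tuple(s[c] for c in cols) for s in seqs}
            if len(patterns) == len(seqs):
                return cols
    return tuple(range(n_snps))  # fallback: duplicates present, keep every SNP


tags = smallest_tag_set(common)
print("tag SNP positions (1-based):", [c + 1 for c in tags])
```

Three binary SNPs are the minimum needed to separate five haplotypes. Haplotypes 2 and 4 each differ from haplotype 1 at a single SNP, so any valid set must contain those two positions.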
26
Strategies of Reducing Degrees of Freedom
III. Haplotype Clustering
• Molitor et al. 2003; Seltman et al. 2001, 2003; Durrant et al. 2004
• Similar haplotypes induce similar liability effects
• Cluster haplotypes and perform analysis based on clusters of haplotypes
• Pro: incorporates all data
• Con: may cluster two major haplotypes in the same group
28
Haplotype Grouping
Focus on Stage II
Combine the pros of haplotype tagging and clustering