Genome-Wide Association Studies: Hunting for Genes in the New Millennium
description
Transcript of Genome-Wide Association Studies: Hunting for Genes in the New Millennium
Genome-Wide Association Studies: Hunting for
Genes in the New Millennium
National Human Genome Research
Institute
National Institutes of
Health
U.S. Department of Health and
Human Services
U.S. Department of Health and Human Services
National Institutes of HealthNational Human Genome Research
InstituteTeri A. Manolio, M.D., Ph.D.Director, Office of Population GenomicsSenior Advisor to the Director, NHGRI,
for Population GenomicsNovember 20, 2008
We Live in Interesting Times…
“‘May he live in interesting times.’ Like it or not we live in interesting times.”
--Robert Kennedy, June 7, 1966
May you come to the attention of those in authority.
May you find what you are looking for.
Wikipedia, accessed 11Sep07
200520062007 first quarter2007 second quarter2007 third quarter2007 fourth quarter2008 first quarter 2008 second quarter 2008 third quarter
Manolio, Brooks, Collins, J. Clin. Invest., May 2008
Pennisi E, Science 2007; 318:1842-43.
2007: The Year of GWA Studies
Diseases and Traits with Published GWA Studies (n = 76, 11/17/08)
• Macular Degeneration• Exfoliation Glaucoma
• Lung Cancer• Prostate Cancer• Breast Cancer• Colorectal Cancer• Bladder Cancer• Neuroblastoma• Melanoma• TP53 Cancer
Predispos’n• Chr. Lymph. Leukemia
• Inflamm. Bowel Disease• Celiac Disease• Gallstones• Irritable Bowel
Syndrome
• QT Prolongation • Coronary Disease• Coronary Spasm • Atrial
Fibrillation/Flutter• Stroke• Subarachnoid
Hemorrhage• Intracranial Aneurysm • Hypertension• Hypt. Diuretic Response• Peripheral Artery
Disease
• Syst. Lupus Erythematosus
• Sarcoidosis• Pulmonary Fibrosis• Psoriasis • HIV Viral Setpoint• Childhood Asthma
• Type 1 Diabetes • Type 2 Diabetes• Diabetic Nephropathy • End-St. Renal Disease• Obesity, BMI, Waist,
IR• Height• Osteoporosis• Osteoarthritis• Male Pattern Baldness
• F-Cell Distribution• Fetal Hgb Levels• C-Reactive Protein• ICAM-1• Total IgE Levels• Uric Acid Levels, Gout• Protein Levels• Vitamin B12 Levels• Recombination Rate• Pigmentation
• Lipids and Lipoproteins• Warfarin Dosing• Ximelegatran Adv.
Resp.
• Parkinson Disease• Amyotrophic Lat.
Sclerosis• Multiple Sclerosis• MS Interferon-β
Response • Prog. Supranuclear
Palsy• Alzheimer’s Disease in
ε4+• Cognitive Ability• Memory• Hearing• Restless Legs Syndrome • Nicotine Dependence• Methamphetamine
Depend.• Neuroticism• Schizophrenia• Sz. Iloperidone
Response• Bipolar Disorder• Family Chaos• Narcolepsy• Attention Deficit
Hyperactivity• Personality Traits
• Rheumatoid Arthritis• RA Anti-TNF Response
Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.
“There have been few, if any, similar bursts of discovery in the history of medical research…”
What is a Genome-Wide Association Study?
• Method for interrogating all 10 million variable points across human genome
• Variation inherited in groups, or blocks, so not all 10 million points have to be tested
• Blocks are shorter (so need to test more points) the less closely people are related
• Technology now allows studies in unrelated persons, assuming 5,000 – 10,000 base pair lengths in common (300,000 – 1,000,000 markers)
DNA on Chromosome 7GAAATAATTAATGTTTTCCTTCCTTCTCCTATTTTGTCCTTTACTTCAATTTATTTATTTATTATTAATATTATTATTTTTTG
AGACGGAGTTTC/ACTCTTGTTGCCAACCTGGAGTGCAGTGGCGTGATCTCAGCTCACTGCACACTCCGCTTTCCTGGTTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGACTACAGTCACACACCACCACGCCCGGCTAATTTTTGTATTTTTAGTAGAGTTGGGGTTTCACCATGTTGGCCAGACTGGTCTCGAACTCCTGACCTTGTGATCCGCCAGCCTCTGCCTCCCAAAGAGCTGGGATTACAGGCGTGAGCCACCGCGCTCGGCCCTTTGCATCAATTTCTACAGCTTGTTTTCTTTGCCTGGACTTT
ACAAGTCTTACCTTGTTCTGCC/TTCAGATATTTGTGTGGTCTCATTCTGGTGTGCCAGTAGCTAAAAATCCATGATTTGCTCTCATCCCACTCCTGTTGTT
CATCTCCTCTTATCTGGGGTCACA/CTATCTCTTCGTGATTGCATTCTGATCCCCAGTACTTAGCATGTGCGTAACAACTCTGCCTCTGCTTTCCCAGGCTGTTGATGGGGTGCTGTTCATGCCTCAGAAAAATGCATTGTAAGTTAAATTATTAAAGATTTTAAATATAGGAAAAAAGTAAGCAAACATAAGGAACAAAAAGGAAAGAACATGTATTCTAATCCATTATTTATTATACAATTAAGAAATTTGGAAACTTTAGATTACACTGCTTTTAGAGATGGAGATGTAGTAAGTCTTTTACTCTTTACAAAATACATGTGTTAGCAATTTTGGGAA
GAATAGTAACTCACCCGAACAGTG/TAATGTGAATATGTCACTTACTAGAGGAAAGAAGGCACTTGAAAAACATCTCTAAACCGTATAAAAACAATTACATCATAATGATGAAAACCCAAGGAATTTTTTTAGAAAACATTACCAGGGCTAATAACAAAGTAGAGCCACATGTCATTTATCTTCCCTTTGTGTCTGTGTGAGAATTCTAGAGTTATATTTGTACATAGCATGGAAAAATGAGAGGCTAGTTTATCAACTAGTTCATTTTTAAAAGTCTAACACATCCTAGGTATAGGTGAACTGTCCTCCTGCCAATGTATTGCACATTTGTGCCCAGATCCAGCATAGGGTATGTTTGCCATTTACAAACGTTTATGTCTTAAGAGAGGAAATATGAAGAGCAAAACAGTGCATGC
TGGAGAGAGAAAGCTGATACAAATATAAAT/GAAACAATAATTGGAAAAATTGAGAAACTACTCATTTTCTAAATTACTCATGTATTTTCCTAGAATTTAAGTCTTTTAATTTTTGATAAATCCCAATGTGAGACAAGATAAGTATTAGTGATGGTATGAGTAATTAATATCTGTTATATAATATTCATTTTCATAGTGGAAGAAATAAAATAAAGGTTGTGATGATTGTTGATTATTTTTTCTAGAGGGGTTGTCAGGGAAAGAAATTGCTTTTT
SNPs 1 / 300 bases
Christensen and Murray, N Engl J Med 2007; 356:1094-97.
Mapping the Relationships Among SNPs
Chromosome 9p21 Region Associated with MI
Samani N et al, N Engl J Med 2007; 357:443-453.
BostonProvi-dence
New York
Phila-delphi
a
Balti-more
Providence 59
New York 210 152Philadelphia 320 237 86
Baltimore 430 325 173 87Washington 450 358 206 120 34
Distances Among East Coast Cities
BostonProvi-dence
New York
Phila-delphi
a
Balti-more
Providence 59
New York 210 152Philadelphia 320 237 86
Baltimore 430 325 173 87Washington 450 358 206 120 34
Distances Among East Coast Cities
< 100 101-200
201-300
301-400
> 400
BostonProvi-dence
New York
Phila-delphi
a
Balti-more
Providence 59
New York 210 152Philadelphia 320 237 86
Baltimore 430 325 173 87Washington 450 358 206 120 34
Distances Among East Coast Cities
< 100 101-200
201-300
301-400
> 400
Distances Among East Coast Cities
Boston Provi-
dence New
York Phila-
delphia Balti-
more Wash-
ington
Distances Among East Coast Cities
Boston
Provi-
dence
New York
Phila-delph
ia
Balti-more
Wash-
ington
Christensen and Murray, N Engl J Med 2007; 356:1094-97.
Mapping the Relationships Among SNPs
One Tag SNP May Serve as Proxy for Many
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CGGATTGCTGCATGGATCGCATCTGTAAGCAC
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CAGATCGCTGGATGAATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAC
SNP2↓
SNP3↓
SNP4↓
SNP5↓
SNP6↓
SNP1↓
Block 1 Block 2
SNP7↓
SNP8↓
One Tag SNP May Serve as Proxy for Many
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CGGATTGCTGCATGGATCGCATCTGTAAGCAC
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CAGATCGCTGGATGAATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAC
%
SNP2↓
SNP3↓
SNP4↓
SNP5↓
SNP6↓
SNP1↓
Block 1 Block 2
SNP7↓
SNP8↓
One Tag SNP May Serve as Proxy for Many
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CGGATTGCTGCATGGATCGCATCTGTAAGCAC
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CAGATCGCTGGATGAATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAC
%
SNP3↓
SNP5↓
SNP6↓
Block 1 Block 2
SNP7↓
SNP8↓
One Tag SNP May Serve as Proxy for Many
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CGGATTGCTGCATGGATCGCATCTGTAAGCAC
CAGATCGCTGGATGAATCGCATCTGTAAGCAT
CAGATCGCTGGATGAATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAT
CGGATTGCTGCATGGATCCCATCAGTACGCAC
%
SNP3↓
SNP6↓
Block 1 Block 2
SNP8↓
One Tag SNP May Serve as Proxy for Many
GTT 35%
CTC 30%
GTT 10%
GAT 8%
CAT 7%
CAC 6%
other haplotypes 4%
Block 1 Block 2 FrequencySingleton
www.hapmap.org
Nature 2005; 437:1299-320.Nature 2007; 449:851-61.
A HapMap for More Efficient Association Studies: Goals
• Use just the density of SNPs needed to find associations between SNPs and diseases
• Do not miss chromosomal regions with disease association
• Produce a tool to assist in finding genes affecting health and disease
• Use more SNPs for complete genome coverage of populations of recent African ancestry populations due to shorter LD
Progress in Genotyping Technology
1 10 102 103 104 105 106
Nb of SNPs
Cost
per
gen
oty
pe
(Cen
ts,
US
D)10
1
102
ABITaqMan
ABISNPlex
IlluminaGolden
Gate
IlluminaInfinium/
Sentrix Affymetrix
100K/500K
Perlegen
Affymetrix
MegAllele
2001 2005
Affymetrix
10K
Courtesy S. Chanock, NCI
0
300
600
900
1200
1500
1800
Jul-05 Oct-05 Jan-06 Apr-06 Jul-06 Oct-06
Affymetrix 500K
Illumina 317K
Illumina 550K
Illumina 650Y
Continued Progress in Genotyping Technology
Courtesy S. Gabriel, Broad/MIT
July 2005 Oct 2006
Cost
per
pers
on
(U
SD
)
Association of Alleles and Genotypes of rs1333049 with Myocardial
Infarction C
N (%)G
N (%)2
(1df)P-value
Cases2,132 (55.4)
1,716 (44.6)
55.11.2 x 10-
13Controls
2,783 (47.4)
3,089 (52.6)
Allelic Odds Ratio = 1.38
Samani N et al, N Engl J Med 2007; 357:443-53.
Association of Alleles and Genotypes of rs1333049 with Myocardial
Infarction C
N (%)G
N (%)2
(1df)P-value
Cases2,132 (55.4)
1,716 (44.6)
55.11.2 x 10-
13Controls
2,783 (47.4)
3,089 (52.6)
Allelic Odds Ratio = 1.38CC
N (%)CG
N (%)GG
N (%)2
(2df) P-value
Cases586
(30.5) 960 (49.9)
378 (19.6)59.7
1.1 x 10-
14Controls
676 (23.0)
1,431 (48.7)
829 (28.2)
Heterozygote Odds Ratio = 1.47
Homozygote Odds Ratio = 1.90
Samani N et al, N Engl J Med 2007; 357:443-53.
Klein et al, Science 2005; 308:385-389.
P Values of GWA Scan for Age-Related Macular Degeneration
Bierut LJ et al, Hum Molec Genet 2007; 16:24-35.
Nicotine Dependence among Smokers
http://www.broad.mit.edu/diabetes/scandinavs/type2.html
Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort
Libioulle C et al, PLoS Genet; 2007 Apr 20;3(4):e58.
Genome-Wide Scan for Crohn Disease in Belgian Cases and
Controls
Sladek R et al, Nature 2007; 445, 881-885.
Genome-Wide Scan for Type 2 Diabetes in French Case-Control
Study
WTCCC, N ature 2007; 447:661-678.
Wellcome Trust Genome-Wide Association Study of Seven
Common Diseases
Hunter DJ et al, Nat Genet 2007; 39:870-874.
Genome-Wide Scan for Breast Cancer in Postmenopausal Women
-Log10 P Values for SNP Associations with Myocardial
Infarction
Samani N et al., N Engl J Med 2007; 357:443-53.
Association Signal for Coronary Artery Disease on Chromosome 9
Samani N et al., N Engl J Med 2007; 357:443-53.
Region of Chromosome 1 Showing Strong Association with Inflammatory
Bowel Disease
Duerr R et al., Science 2006; 314:1461-63.
Unique Aspects of GWA Studies
• Permit examination of inherited genetic variability at unprecedented level of resolution
• Permit "agnostic" genome-wide evaluation • Once genome measured, can be related to any
trait• Most robust associations in GWA studies have not
been with genes previously suspected of association with the disease
• Some associations in regions not even known to harbor genes
“The chief strength of the new approach also contains its chief problem: with more than 500,000 comparisons per study, the potential for false positive results is unprecedented.”
Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.
Larson, G. The Complete Far Side. 2003.
Number of New, Significant Gene-Disease Associations by Year, 1984 -
2000
Hirschhorn J et al, Genet Med 2002; 4:45-61.
Of 600 Gene-Disease Associations, Only 6 Significant in > 75% of
Identified Studies
Disease/Trait GenePolymorphism
Frequency
DVT F5 Arg506Gln 0.015
Graves’ Disease
CTLA4 Thr17Ala 0.62
Type 1 DM INS 5’ VNTR 0.67
HIV/AIDS CCR5 32 bp Ins/Del 0.05-0.07
Alzheimer’s APOE Epsilon 2/3/4 0.16-0.24
Creutzfeldt-Jakob Disease
PRNP Met129Val 0.37
Hirschhorn J et al., Genet Med 2002; 4:45-61.
POLYMORPHISM PRESENT ABSENT SUMMARY
ACE I/D 13 with D; 1 with I 18 favors none
APOE 8 with ε4, 2 with ε2 9 equivocal
AGT M235T 0 8 none
AGTR1 A1166C 0 7 none
MTHFR 7 with T, 1 with non-T 8 equivocal
PON1 Q192R 3 with R 10 none
PON1 L55M 5 with L (subgroups) 1 weak
NOS3 G894T 1 with T 4 none
MMP3 -1516 5A/6A
4 with 6A0
association
IL-6 G-174C 1 with G 3 none
Reports For and Against Associations of Variants with
Carotid Atherosclerosis
Manolio et al., ATVB 2004; 24:1567-77.
May 1999
J. Hirschhorn and D. Altshuler J Clin Endo Metab 2002
Am J Hum Genet July 2004
Am J Hum Genet July 2004
PLoS Biol Sept 2005
Nat Genet July 2006
Chanock S, Manolio T, et al., Nature 2007; 447:655-60.
Replication, Replication, ReplicationInitial study: Sufficient description to permit
replication• Sources of cases and controls• Participation rates and flow chart of selection• Methods for assessing affected status• Standard “Table 1” including rates of missing data• Assessment of population heterogeneity• Genotyping methods and QC metricsReplication study: • Similar population, similar phenotype• Same genetic model, same SNP, same direction• Adequately powered to detect postulated effect
Chanock S, Manolio T, et al., Nature 2007; 447:655-60.
Replication Study #1 3,000 cases / 3,000 controls
Replication Study #22,400 cases / 2,400 controls
Replication Study #3 2,500 cases / 2,500 controls
Initial Study1,150 cases / 1,150 controls
~24,000 SNPs
~1,500 SNPs
200+ New ht-SNPs
>500,000 Tag SNPs
25-50 Loci
Replication Strategy for Prostate Cancer Study in CGEMS
Hoover R, Epidemiology 2007; 18:13-17.
Replication Strategy in Easton Breast Cancer Study
Stage Cases Controls SNPs
1 408 400 266,722
Easton et al, Nature 2007; 447:1087-93.
Replication Strategy in Easton Breast Cancer Study
Stage Cases Controls SNPs
1 408 400 266,722
2 3,990 3,916 13,023
Easton et al, Nature 2007; 447:1087-93.
Replication Strategy in Easton Breast Cancer Study
Stage Cases Controls SNPs
1 408 400 266,722
2 3,990 3,916 13,023
3 23,734 23,639 31
Easton et al, Nature 2007; 447:1087-93.
Replication Strategy in Easton Breast Cancer Study
• ABCFS• BCST• COPS • GENICA• HBCS• HBCP
Stage Cases Controls SNPs
1 408 400 266,722
2 3,990 3,916 13,023
3 23,734 23,639 31
Final 6
• MEC-W• MEC-J• NHS• PBCS• RBCS• SASBAC
Easton et al, Nature 2007; 447:1087-93.
• SEARCH2• SEARCH3• SBCP• SBCS• CNIOBCS• USRT
• TBCS• KConFab/AOCS• KBCP• LUMCBCS• MCBCS• MCCS
Larson, G. The Complete Far Side. 2003.
Replication Strategy in CGEMS Prostate Cancer Study
Stage Cases Controls SNPs
1 1,172 1,157 527,869
Thomas et al, Nat Genet 2008; 40:310-15.
Replication Strategy in CGEMS Prostate Cancer Study
Stage Cases Controls SNPs
1 1,172 1,157 527,869
2 3,941 3,964 26,958*
Thomas et al, Nat Genet 2008; 40:310-15.
Replication Strategy in CGEMS Prostate Cancer Study
Stage Cases Controls SNPs
1 1,172 1,157 527,869
2 3,941 3,964 26,958*
* Selected for p < 0.068
Thomas et al, Nat Genet 2008; 40:310-15.
Replication Strategy in CGEMS Prostate Cancer Study
Stage Cases Controls SNPs
1 1,172 1,157 527,869
2 3,941 3,964 26,958*
* Selected for p < 0.068
SNP GeneStage 1+2
P-value
rs4962416 MSMB 7 x 10-13
rs10896449 11q13 2 x 10-9
rs10993994 CTBP2 2 x 10-7
rs10486567 JAZF1 2 x 10-6
Thomas et al, Nat Genet 2008; 40:310-15.
Replication Strategy in CGEMS Prostate Cancer Study
Stage Cases Controls SNPs
1 1,172 1,157 527,869
2 3,941 3,964 26,958*
* Selected for p < 0.068
SNP GeneStage 1+2
P-value
Initial Rank
rs4962416 MSMB 7 x 10-13 24,223
rs10896449 11q13 2 x 10-9 2,439
rs10993994 CTBP2 2 x 10-7 319
rs10486567 JAZF1 2 x 10-6 24,407Thomas et al, Nat Genet 2008; 40:310-15.
Replication Strategy in CGEMS Prostate Cancer Study
Stage Cases Controls SNPs
1 1,172 1,157 527,869
2 3,941 3,964 26,958*
* Selected for p < 0.068
SNP GeneStage 1+2
P-valueInitial Rank
Initial P-value
rs4962416 MSMB 7 x 10-13 24,223 0.042
rs10896449 11q13 2 x 10-9 2,439 0.004
rs10993994 CTBP2 2 x 10-7 319 4 x 10-4
rs10486567 JAZF1 2 x 10-6 24,407 0.042
Thomas et al, Nat Genet 2008; 40:310-15.
0
50
100
150
200
2005 2006 2007 2008
Published GWA Reports, 3/2005 - 9/2008
Tota
l N
um
ber
of
Pu
blic
ati
on
s
191
Calendar Quarter
NHGRI Catalog of GWA Studies: http://www.genome.gov/gwastudies/
NHGRI GWA Catalog - Objectives
• Identify and track all GWA publications attempting to assay > 100,000 SNPs
• Extract key information regarding associations
• Provide widely as scientific resource, including downloadable datafile
• Seek commonalities across associations genome-wide rather than disease by disease
• Describe approach clearly so others can replicate or expand upon it
• Maintain consistency in approach• Adapt to evolving technologies: CNVs?
NHGRI GWA Catalog - Methods
• Survey NIH “e-clips” daily, PubMed weekly• Identify all GWA publications attempting to
assay > 100,000 SNPs• Describe top 5 novel associations significant
at p < 10-6 • Expand to all associations < 10-6 • Extract information on:
- Disease/trait - Rs number/risk allele - Sample size - Risk allele frequency- Genomic region - P-value, OR [95%
CI]- Reported genes - Platform, #SNPs
Reports Included in this Analysis
• 180 published papers through 9/18/2008
• 34 did not report SNP
• 1 reported haplotypes without specific SNPs
• 145 reports
– 782 unique (“index”) SNPs
– 3,841 unique perfect LD SNPs (“linked”)
– 4,623 index + linked
• 83 index SNPs reported 2-7 times
• 10 multiple reports in “unrelated” traits
0 10 20 30 40 50 60
Intergenic
3' (0.5kb)
5' (2kb)
miRTS
3' UTR
5' UTR
Intronic
Synonymous
Missense
Functional Classification of 782 Index SNPs Associated with
Complex Traits37
0 10 20 30 40 50 60 Percent
11
340
11
2
6
22
20
354
0 10 20 30 40 50 60
Intergenic
3' (0.5kb)
5' (2kb)
miRTS
3' UTR
5' UTR
Intronic
Synonymous
Missense
Functional Classification of 782 Index SNPs and 4,623 Index +
Linked SNPs
Index SNPs Index + Linked SNPs
0 10 20 30 40 50 60 Percent
Odds Ratios of Discrete Associations
0
20
40
60
80
100
120
1.2 1.4 1.6 1.8 2.0 2.2 2.4 3.04.05.06.0 9.013.020.030.0
Odds Ratio (upper inclusive bound)
Num
ber
of A
ssocia
tions
3 4 5 6 9 13 20 30
// //
Median = 1.28
Percent of Variance in Disease Risk Explained by 32 Established CD Risk Loci
Barrett et al., Nat Genet 2008 Jun 29.
Power to detect risk loci
Odds Ratios of Discrete Associations
0
20
40
60
80
100
120
1.2 1.4 1.6 1.8 2.0 2.2 2.4 3.04.05.06.0 9.013.020.030.0
Odds Ratio (upper inclusive bound)
Num
ber
of A
ssocia
tions
3 4 5 6 9 13 20 30
// //
Reported Risk Allele Frequencies by Odds Ratios for
Discrete Traits
1
3
5
7
9
0.0 0.2 0.4 0.6 0.8 1.0
Risk Allele Frequency (%)
Od
ds
Rat
io
30
25
20
15
10
5
4
3
2
1
Reported Risk Allele Frequencies by Odds Ratios for
Discrete Traits
1
3
5
7
9
0.0 0.2 0.4 0.6 0.8 1.0
Risk Allele Frequency (%)
Od
ds
Rat
io
30
25
20
15
10
5
4
3
2
1
Reported Risk Allele Frequencies by Odds Ratios for
Discrete Traits
1
3
5
7
9
0.0 0.2 0.4 0.6 0.8 1.0
Risk Allele Frequency (%)
Od
ds
Rat
io
30
25
20
15
10
5
4
3
2
1
Reported Risk Allele Frequencies by Odds Ratios for
Discrete Traits
1
3
5
7
9
0.0 0.2 0.4 0.6 0.8 1.0
Risk Allele Frequency (%)
Od
ds
Rat
io
30
25
20
15
10
5
4
3
2
1
Thorlieifsson Exfoliation Glaucoma
SarasqueteOsteonecrosis
HakonarsonType 1 DM
van HeelCeliac Disease
WTCCCType 1 DM
Author Trait # Ca/Co RAF OR P-value
Thorliefsson Exfol’n glaucoma 75/14,747 0.85 20.10 3 x 10-21
Sarasquete Osteonecrosis 21/64 0.12 12.75 1 x 10-6
HakonarsonM Type 1 diabetes 561/1,143 0.13 8.30 1 x 10-16
van HeelM Celiac disease 991/1,489 0.14 7.04 1 x 10-19
Matarin* Stroke 259/269 NR 5.62 6 x 10-6
WTCCCM Type 1 diabetes 1,963/2,938 0.39 5.49 5 x 10-134
Behrens Juvenile arthritis 130/1,952 NR 5.37 2 x 10-10
Fung Parkinson’s dis. 267/270 NR 5.00 7 x 10-6
Klein Macular degen. 96/50 0.70 4.60 4 x 10-8
SEARCH Statin myopathy 85/90 0.13 4.50 2 x 10-9
*2 other SNPs in this study also associated, OR > 2.0.MSNPs in MHC region.
Characteristics of SNPs Associated with
Odds Ratios > 4.5
FST Values: Index SNPs and HapMap SNPs
0.0
0.1
0.2
0.3
0.4
0.5
0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80
Nu
mb
er
of
Lo
ci
HapMap CEU Index SNPsFST Values
0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.70 0.80
Median = 0.069
Phenotype Relationships of SNPs with Highest FST Values
0.0
0.1
0.2
0.3
0.4
0.5
0.6
IR OB Height Cancer
Nu
mb
er
of
Lo
ci
Top 5% Fst values (0.49) Top 1% Fst values (0.69)
Immune Pigment Obesity Neuro. Height BMD Cancer Related Traits Related
Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Macular Degeneration
CFH
Coronary Disease CDKN2A/2BChildhood Asthma ORMDL3Type II Diabetes CDKAL1Crohn’s Disease ATG16L1
Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Macular Degeneration
CFH
Coronary Disease CDKN2A/2BChildhood Asthma ORMDL3Type II Diabetes CDKAL1Crohn’s Disease ATG16L1
Signals in Gene “Deserts”Prostate Cancer 8q24
Crohn’s Disease5p13.1,
1q31.2, 10p21
Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Signals in Common
Macular Degeneration
CFH
Coronary Disease CDKN2A/2BDiabetes, Melanoma
Childhood Asthma ORMDL3 Crohn’s DiseaseType II Diabetes CDKAL1 Prostate CancerCrohn’s Disease ATG16L1
Signals in Gene “Deserts”Prostate Cancer 8q24
Crohn’s Disease5p13.1,
1q31.2, 10p21
Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Signals in Common
Macular Degeneration
CFH
Coronary Disease CDKN2A/2BDiabetes, Melanoma
Childhood Asthma ORMDL3 Crohn’s DiseaseType II Diabetes CDKAL1 Prostate CancerCrohn’s Disease ATG16L1
Signals in Gene “Deserts”Signals in Common
Prostate Cancer 8q24Breast, Colorectal Cancer; Crohn’s
Crohn’s Disease5p13.1,
1q31.2, 10p21
Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Signals in Common
Macular Degeneration
CFH
Coronary Disease CDKN2A/2BDiabetes, Melanoma
Childhood Asthma ORMDL3 Crohn’s DiseaseType II Diabetes CDKAL1 Prostate CancerCrohn’s Disease ATG16L1
Signals in Gene “Deserts”Signals in Common
Prostate Cancer 8q24Breast, Colorectal Cancer; Crohn’s
Crohn’s Disease5p13.1,
1q31.2, 10p21Signals in Common
Multiple Sclerosis IL7R Type 1 DiabetesSarcoidosis C10orf67 Celiac DiseaseRA, T1DM PTPN2, PTPN22 Crohn’s
Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Signals in Common
Macular Degeneration
CFH
Coronary Disease CDKN2A/2BDiabetes, Melanoma
Childhood Asthma ORMDL3 Crohn’s DiseaseType II Diabetes CDKAL1 Prostate CancerCrohn’s Disease ATG16L1
Signals in Gene “Deserts”Signals in Common
Prostate Cancer 8q24Breast, Colorectal Cancer; Crohn’s
Crohn’s Disease5p13.1,
1q31.2, 10p21Signals in Common
Multiple Sclerosis IL7R Type 1 DiabetesSarcoidosis C10orf67 Celiac DiseaseRA, T1DM PTPN2, PTPN22 Crohn’s
Study Crohn’s Disease!
Barrett et al., Nat Genet 2008 Jun 29.
• Nearly half of GWA-identified SNPs are intergenic• Only 8.4% of index SNPs are in coding regions, 5’
or 3’ UTR, or miRTS• Potential selection bias in genotyped SNPs for
excess of missense variants • Most associated odds ratios are < 1.5• Risk allele frequencies do not appear skewed
toward rare alleles or large FST values
• Highly-differentiated SNPs enriched for immune-related, pigmentation, and obesity traits
• Examination of loci at extremes of these characteristics may yield interesting insights
Conclusions
“The more we find, the more we see, the more we come to learn.
The more that we explore, the more we shall return.”
Sir Tim Rice, Aida, 2000