Computational and Statistical Challenges in Association Studies

Eleazar Eskin

University of California, Los Angeles

The Human Genome Project

“What we are announcing today is that we have reached a milestone…that is, covering the genome in…a working draft of the human sequence.”

“I would be willing to make a prediction that within 10 years, we will have the potential of offering any of you the opportunity to find out what particular genetic conditions you may be at increased risk for…”

Washington, DC, June 26, 2000.

Human Genetics

Mother Father

Child

Disease Risk: “Genetic” factors account for 20%-80% of disease risk. Many genes contribute to “complex” diseases.

Personalized Medicine: Treatment decisions influenced by diagnostics.

Understanding Disease Biology: New drug targets. Understanding of the mechanism of disease.

Mother

Child

Risk Factors

Risk Factors

Where are the risk factors? (Genetic Basis of Disease)

Disease Association Studies: The search for genetic factors

Comparing the DNA contents of two populations:

• Cases - individuals carrying the disease.
• Controls - background population.

Differences within a gene between the two populations are evidence that the gene is involved in the disease.

Single Nucleotide Polymorphisms (SNPs)

AGAGCCGTCGACAGGTATAGCCTA
AGAGCCGTCGACATGTATAGTCTA
AGAGCAGTCGACAGGTATAGTCTA
AGAGCAGTCGACAGGTATAGCCTA
AGAGCCGTCGACATGTATAGCCTA
AGAGCAGTCGACATGTATAGCCTA
AGAGCCGTCGACAGGTATAGCCTA
AGAGCCGTCGACAGGTATAGCCTA

Human Variation: Humans differ by 0.1% of their DNA. A significant fraction of this variation is accounted for by SNPs.

Single Nucleotide Polymorphisms: Association Analysis

AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCCAGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCCAGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCCAGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCCAGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTCAGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCC

AGAGCAGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCCAGAGCAGTCGACATGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCCAGAGCAGTCGACATGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCCAGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGTCAGAGCCGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCAACATGATAGCCAGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGCCAGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCCAGAGCCGTCGACAGGTATAGTCTACATGAGATCAACATGAGATCTGTAGAGCAGTGAGATCGACATGATAGTC

Cases: (Individuals with the disease)

Controls: (Healthy individuals) Associated SNP

Correlations between SNPs

Single Nucleotide Polymorphisms Association Analysis

AGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCAGTCGACAGGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCGACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCAGTCGACAGGTATAGCCTACATGAGATCAACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCCGTGAGATCAACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCCGTCGACATGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCCGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCAACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAAGAGCAGTCGACAGGTATAGCCTACATGAGATCGACATGAGATCTGTAGAGCCGTGAGATCGACATGATAGCCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTAGAGCAGTGAGATCGACATGATAGTCAGAGCCGTCGACATGTATAGTCTACATGAGATCGACATGAGATCGGTA


Single Nucleotide Polymorphisms (SNPs)

Cases:

Controls:

Challenges:
• Millions of common SNPs
• Correlations between SNPs
• SNP locations unknown

False Positives

• Successor to the Human Genome Project.
• International consortium that aims to genotype the genomes of 270 individuals from four different populations.
• Launched in 2002. First phase was finished in October (Nature, 2005).
• Collected genotypes for 3.9 million SNPs.
• Location and correlation structure of many common SNPs.

Public Genotype Data Growth

Timeline from 2001 to 2006, in order of appearance:

• Daly et al., Nature Genetics: 103 SNPs, 40,000 genotypes
• Gabriel et al., Science: 3,000 SNPs, 400,000 genotypes
• TSC Data, Nucleic Acids Research: 35,000 SNPs, 4,500,000 genotypes
• Perlegen Data, Science: 1,570,000 SNPs, 100,000,000 genotypes
• NCBI dbSNP, Genome Research: 3,000,000 SNPs, 286,000,000 genotypes
• HapMap Phase 2: 5,000,000+ SNPs, 600,000,000+ genotypes

More SNPs increase genome coverage in association studies.

More genotypes allow for discovery of weaker associations.

Some Computational Challenges

Genetics - identifying disease genes
• Haplotype phasing - preprocessing SNPs
• Association study design
• Association study analysis
• Population stratification
• Inferring evolutionary processes (recombination rates, selection, haplotype ancestry)
• Etc…

Genomics - functions of disease genes
• Predicting functional effect of variation
• Understanding disease effect on gene regulation
• Understanding disease effect on metabolic pathways
• Combining systems biology with genetics
• Etc…

HAP

WHAP

SAT Tagger

Haplotype Phasing using Imperfect Phylogeny

Haplotype Phasing

High-throughput, cost-effective sequencing technology gives genotypes and not haplotypes.

Haplotypes (mother chromosome and father chromosome):
ATCCGA
AGACGC

Possible phases:
ATACGA / AGCCGC
AGACGA / ATCCGC
….

Genotype:
A {T,G} {C,A} C G {A,C}
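As a rough illustration of why phasing is ambiguous, the sketch below enumerates every haplotype pair that is compatible with the genotype on this slide. The function name `possible_phases` and the encoding of a genotype as one set of alleles per site are assumptions made for this example; they are not part of HAP.

```python
from itertools import product


def possible_phases(genotype):
    """Return all unordered haplotype pairs compatible with a genotype.

    `genotype` is a list with one set per site: a single allele for a
    homozygous site, two alleles for a heterozygous site.
    """
    het_sites = [i for i, site in enumerate(genotype) if len(site) == 2]
    phases = set()
    # Each heterozygous site can be oriented in two ways.
    for choice in product([0, 1], repeat=len(het_sites)):
        orientation = dict(zip(het_sites, choice))
        h1, h2 = [], []
        for i, site in enumerate(genotype):
            alleles = sorted(site)
            if len(alleles) == 1:
                h1.append(alleles[0])
                h2.append(alleles[0])
            else:
                first, second = alleles if orientation[i] == 0 else alleles[::-1]
                h1.append(first)
                h2.append(second)
        # A frozenset makes (h1, h2) and (h2, h1) the same phase.
        phases.add(frozenset(("".join(h1), "".join(h2))))
    return phases


# Genotype from the slide: sites 2, 3 and 6 are heterozygous.
genotype = [{"A"}, {"T", "G"}, {"C", "A"}, {"C"}, {"G"}, {"A", "C"}]
phases = possible_phases(genotype)
```

With k heterozygous sites there are 2^(k-1) unordered pairs, so the number of candidate phases grows exponentially with the number of heterozygous SNPs.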

Haplotype Limited Diversity

Previous studies on local haplotype structure: (Daly et al., 2001) chromosome 5q31. (Patil et al., 2001) chromosome 21.

Study findings: The SNPs on each haplotype are correlated. SNPs can be separated into blocks of limited diversity.

Local regions have few haplotypes.

Haplotype Data in a Block

(Daly et al., 2001) Block 6 from Chromosome 5q31

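Example Phasing

Genotypes are coded per site as 0 or 1 (homozygous) and 2 (heterozygous). Each haplotype is followed, in parentheses, by the number of times it is used in that resolution.

Genotype    1st possible resolution        2nd possible resolution
22222222    11111110 (2)  00000001 (7)     11100110 (1)  00011001 (1)
22000001    11000001 (3)  00000001 (7)     10000001 (2)  01000001 (2)
22022002    11011000 (2)  00000001 (7)     01011001 (1)  10000000 (1)
22222222    11111110 (2)  00000001 (7)     10101110 (1)  01010001 (1)
22000001    11000001 (3)  00000001 (7)     11000001 (1)  00000001 (1)
22022002    11011000 (2)  00000001 (7)     11001000 (1)  00010001 (1)
22000001    11000001 (3)  00000001 (7)     01000001 (2)  10000001 (2)

1st or 2nd resolution? Maximum Likelihood Criterion.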
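To make the maximum likelihood criterion concrete, the following sketch scores the two resolutions of the example. It assumes a simple model that the slides do not spell out: haplotype frequencies are estimated by their counts within a resolution, and the two haplotypes of an individual are drawn independently. The function names are illustrative only.

```python
from collections import Counter
from math import log

# Haplotype pairs for the seven genotypes, in slide order.
first_resolution = [
    ("11111110", "00000001"), ("11000001", "00000001"),
    ("11011000", "00000001"), ("11111110", "00000001"),
    ("11000001", "00000001"), ("11011000", "00000001"),
    ("11000001", "00000001"),
]
second_resolution = [
    ("11100110", "00011001"), ("10000001", "01000001"),
    ("01011001", "10000000"), ("10101110", "01010001"),
    ("11000001", "00000001"), ("11001000", "00010001"),
    ("01000001", "10000001"),
]


def genotype_of(h1, h2):
    """Recover the genotype: 0 or 1 if homozygous, 2 if heterozygous."""
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def log_likelihood(resolution):
    """Log-likelihood with frequencies estimated from the resolution itself."""
    counts = Counter(h for pair in resolution for h in pair)
    total = sum(counts.values())
    return sum(
        log(counts[h1] / total) + log(counts[h2] / total)
        for h1, h2 in resolution
    )


distinct_first = len({h for pair in first_resolution for h in pair})
distinct_second = len({h for pair in second_resolution for h in pair})
```

Both resolutions explain the same genotypes, but the first reuses 4 haplotypes while the second needs 12, so the first has the higher likelihood under this model.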

Maximum Likelihood Haplotype Inference is an NP-Hard Problem

Genotype coding: 2 = {1, 0}, 1 = {1, 1}, 0 = {0, 0}

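Narrowing the Search: Perfect Phylogeny

• A directed phylogenetic tree.
• {0,1} alphabet.
• Each site mutates at most once.
• No recombination.

Example tree (each edge is labeled with the site that mutates):
00000 --2--> 01000
01000 --1--> 11000
01000 --5--> 01001
11000 --3--> 11100
11100 --4--> 11110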
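A standard way to check whether a set of binary haplotypes fits a perfect phylogeny rooted at the all-zero haplotype is the pairwise gamete test: no two sites may show the combinations 01, 10 and 11 together. The sketch below applies that test to the haplotypes of the example tree. The slides do not say which check HAP uses internally, so this is only an illustration of the model, and the function name is an assumption.

```python
from itertools import combinations


def admits_rooted_perfect_phylogeny(haplotypes):
    """Pairwise gamete test for a perfect phylogeny rooted at 00...0.

    Returns False as soon as some pair of sites shows all three of the
    gametes 01, 10 and 11, which would require a site to mutate twice
    or a recombination event.
    """
    n_sites = len(haplotypes[0])
    forbidden = {("0", "1"), ("1", "0"), ("1", "1")}
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if forbidden <= gametes:
            return False
    return True


tree_haplotypes = ["00000", "01000", "11000", "01001", "11100", "11110"]
fits = admits_rooted_perfect_phylogeny(tree_haplotypes)

# Adding 10001 breaks the model: sites 1 and 2 now show 01, 10 and 11.
broken = admits_rooted_perfect_phylogeny(tree_haplotypes + ["10001"])
```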

The Perfect Phylogeny Haplotype Problem (PPH)

• Given genotypes over a short region.
• Find compatible haplotypes which correspond to a perfect phylogeny tree (Gusfield, 2002).
• PPH deficiency: the data does not fit the model.

Solving PPH

A very simple O(nm²) algorithm for the PPH problem (also Gusfield, 2002; Bafna et al., 2003).

But – in practice, we do not expect to see perfect phylogeny in biological data.

We extend our algorithms to the case where the data is almost perfect phylogeny.

Eskin, Halperin, Karp. “Large Scale Reconstruction of Haplotypes from Genotype Data.” RECOMB 2003.

HAP Algorithm

HAP Local Predictions: http://research.calit2.net/hap/
Over 6,000 users of the webserver.

Main Ideas:
• Imperfect Phylogeny
• Maximum Likelihood Criterion

Extremely efficient. Orders of magnitude faster than other algorithms.

Eskin, Halperin, Karp. “Large Scale Reconstruction of Haplotypes from Genotype Data.” RECOMB 2003.


HAP Timeline (2001 to 2006): Eskin, Halperin, Karp, RECOMB 2003

Phasing Methods

HAP is one of many phasing algorithms:
• Clark, 1990
• Excoffier and Slatkin, 1995
• PHASE - Stephens et al., 2001
• HAPLOTYPER - Niu et al., 2002
• Gusfield, 2000
• Lancia et al., 2001
• Many more…

How do we phase entire chromosomes?

Algorithms were designed for only 4-12 SNPs!

Scaling to Whole Genomes: HAP-TILE

HAP “tiling” extension: phasing for long regions.

Leverages the speed of HAP.

• For each window we compute the haplotypes using HAP.
• We tile the windows using dynamic programming.

(Figure: genotypes and local predictions.)

Haplotype Tiling Problem (ignoring homozygous positions)

0010000011011011111001

001000/110111  010000/101111  011111/100000  000101/111010  000011/111100  100110/011001

(minimum number of conflicts)

• NP-Hard Problem
• Dynamic Programming Solution

(Eskin et al. 2004.)

Phasing Running Time Comparison (Phaseoff Competition)

Marchini et al. American Journal of Human Genetics, 2006.

HAP is over 1000x faster than PHASE.


HAP Timeline (2001 to 2006):

• RECOMB 2003 Submission
• Eskin, Halperin, Karp, RECOMB 2003
• Perlegen collaboration (12 hours)
• NCBI dbSNP collaboration (24 hours) (48 hours)

Only 103 SNPs, 0.02% of the genome!

Weighted Haplotype Association

Association Statistics

Assume we are given N/2 cases and N/2 control individuals.

Since each individual has 2 chromosomes, we have a total of N case chromosomes and N control chromosomes.

At SNP A, let $\hat{p}_A^+$ and $\hat{p}_A^-$ be the observed case and control frequencies respectively.

We know that:

$$\hat{p}_A^+ \sim N\left(p_A^+,\; p_A^+(1-p_A^+)/N\right)$$

$$\hat{p}_A^- \sim N\left(p_A^-,\; p_A^-(1-p_A^-)/N\right)$$

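Association Statistics

$$\hat{p}_A^+ \sim N\left(p_A^+,\; p_A^+(1-p_A^+)/N\right)$$

$$\hat{p}_A^- \sim N\left(p_A^-,\; p_A^-(1-p_A^-)/N\right)$$

$$\hat{p}_A^+ - \hat{p}_A^- \sim N\left(p_A^+ - p_A^-,\; \left(p_A^+(1-p_A^+) + p_A^-(1-p_A^-)\right)/N\right)$$

We approximate

$$p_A^+(1-p_A^+) + p_A^-(1-p_A^-) \approx 2\hat{p}_A(1-\hat{p}_A)$$

then if $p_A^+ = p_A^-$

$$S_A = \frac{\hat{p}_A^+ - \hat{p}_A^-}{\sqrt{2/N}\,\sqrt{\hat{p}_A(1-\hat{p}_A)}} \sim N(0,1)$$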

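Association Statistic

Under the null hypothesis, $p_A^+ - p_A^- = 0$.

We compute the statistic $S_A$.

If $S_A < \Phi^{-1}(\alpha/2)$ or $S_A > -\Phi^{-1}(\alpha/2)$ then the association is significant at level $\alpha$.

$$S_A = \frac{\hat{p}_A^+ - \hat{p}_A^-}{\sqrt{2/N}\,\sqrt{\hat{p}_A(1-\hat{p}_A)}} \sim N(0,1)$$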
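The test above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the pooled frequency estimate is the average of the case and control frequencies; the function names and the allele counts in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def association_statistic(case_count, control_count, n):
    """Compute S_A for one SNP.

    case_count, control_count: allele counts among the n case and
    n control chromosomes. The SNP must be polymorphic overall,
    otherwise the denominator is zero.
    """
    p_case = case_count / n
    p_control = control_count / n
    p_pooled = (p_case + p_control) / 2  # assumed pooled estimate
    return (p_case - p_control) / sqrt(2 / n * p_pooled * (1 - p_pooled))


def is_significant(s, alpha):
    """Two-sided test: reject if S_A falls in either tail."""
    z = NormalDist().inv_cdf(alpha / 2)  # negative quantile
    return s < z or s > -z


# Hypothetical counts: 350 of 1000 case chromosomes and 250 of 1000
# control chromosomes carry the allele.
s_a = association_statistic(350, 250, 1000)
significant = is_significant(s_a, alpha=0.05)
```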

Association Power

Let's assume that SNP A is causal and $p_A^+ \neq p_A^-$.

Given the true $p_A^+$ and $p_A^-$, if we collect N individuals and compute the statistic $S_A$, the probability that $S_A$ is significant at level $\alpha$ is the power.

Power is the chance of detecting an association of a certain strength with a certain number of individuals.

Association Statistic

Let's assume that $p_A^+ \neq p_A^-$; then

$$S_A = \frac{\hat{p}_A^+ - \hat{p}_A^-}{\sqrt{2/N}\,\sqrt{\hat{p}_A(1-\hat{p}_A)}} \sim N\left(\frac{p_A^+ - p_A^-}{\sqrt{2/N}\,\sqrt{p_A(1-p_A)}},\; 1\right)$$

$$S_A \sim N\left(\frac{(p_A^+ - p_A^-)\sqrt{N}}{\sqrt{2p_A(1-p_A)}},\; 1\right)$$

$$S_A \sim N\left(\lambda_A\sqrt{N},\; 1\right)$$

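Association Power

$$S_A = \frac{\hat{p}_A^+ - \hat{p}_A^-}{\sqrt{2/N}\,\sqrt{\hat{p}_A(1-\hat{p}_A)}} \sim N\left(\lambda_A\sqrt{N},\; 1\right)$$

(Figure: power of the association test, showing the threshold for significance and the non-centrality parameter $\lambda_A\sqrt{N}$.)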

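Association Power

The statistical power of an association with N individuals, non-centrality parameter $\lambda\sqrt{N}$ and significance threshold $\alpha$ is

$$P(\alpha, \lambda\sqrt{N}) = \Phi\left(\Phi^{-1}(\alpha/2) + \lambda\sqrt{N}\right) + 1 - \Phi\left(-\Phi^{-1}(\alpha/2) + \lambda\sqrt{N}\right)$$

Note that if $\lambda = 0$, the power is always $\alpha$.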
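The power formula translates directly into code. The sketch below uses the standard normal distribution from Python's `statistics` module; the function name `power` and the example values of λ and N are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power(alpha, lam, n):
    """Power of the two-sided test with non-centrality lam * sqrt(n)."""
    phi = NormalDist()
    z = phi.inv_cdf(alpha / 2)
    mu = lam * sqrt(n)
    return phi.cdf(z + mu) + 1 - phi.cdf(-z + mu)


# With no effect, the test rejects with probability alpha.
null_power = power(0.05, 0.0, 1000)

# A hypothetical effect of lam = 0.1 at increasing sample sizes.
powers = [power(0.05, 0.1, n) for n in (250, 500, 1000, 2000)]
```

Because the non-centrality parameter scales with the square root of N, halving λ requires four times as many individuals to reach the same power.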

Indirect Association

Now let's assume that we have 2 markers, A and B. Let us assume that marker B is the causal mutation, but we are observing marker A.

If we observed marker B directly, our statistic would be

$$S_B \sim N\left(\lambda_B\sqrt{N},\; 1\right), \qquad \lambda_B = \frac{p_B^+ - p_B^-}{\sqrt{2p_B(1-p_B)}}$$

However, we are observing A, where our statistic is

$$S_A \sim N\left(\lambda_A\sqrt{N},\; 1\right), \qquad \lambda_A = \frac{p_A^+ - p_A^-}{\sqrt{2p_A(1-p_A)}}$$

What is the relation between $S_A$ and $S_B$? We want to relate $\lambda_A$ to $\lambda_B$.

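Indirect Association

Assume that the conditional probabilities of A given the allele at B, $p_{A|B}$ and $p_{A|b}$, are the same in cases and controls, so that $p_A^{\pm} = p_B^{\pm}\,p_{A|B} + (1 - p_B^{\pm})\,p_{A|b}$. Then

$$
\begin{aligned}
\lambda_A &= \frac{p_A^+ - p_A^-}{\sqrt{2p_A(1-p_A)}} = \frac{(p_B^+ - p_B^-)(p_{A|B} - p_{A|b})}{\sqrt{2p_A(1-p_A)}} \\
&= \frac{(p_B^+ - p_B^-)(p_{A|B} - p_{A|b})}{\sqrt{2p_A(1-p_A)}} \cdot \frac{\sqrt{2p_B(1-p_B)}}{\sqrt{2p_B(1-p_B)}} \\
&= \frac{p_B^+ - p_B^-}{\sqrt{2p_B(1-p_B)}} \cdot \frac{(p_{A|B} - p_{A|b})\sqrt{2p_B(1-p_B)}}{\sqrt{2p_A(1-p_A)}} \\
&= \lambda_B\,\frac{(p_{A|B} - p_{A|b})\sqrt{2p_B(1-p_B)}}{\sqrt{2p_A(1-p_A)}}
\end{aligned}
$$

Note that

$$
\begin{aligned}
\lambda_A &= \lambda_B\,\frac{(p_{A|B} - p_{A|b})\sqrt{2p_B(1-p_B)}}{\sqrt{2p_A(1-p_A)}} \\
&= \lambda_B \left(\frac{p_{AB}}{p_B} - \frac{p_{Ab}}{1-p_B}\right) \frac{\sqrt{p_B(1-p_B)}}{\sqrt{p_A(1-p_A)}} \\
&= \lambda_B \left(\frac{p_{AB} - p_{AB}\,p_B - p_{Ab}\,p_B}{p_B(1-p_B)}\right) \frac{\sqrt{p_B(1-p_B)}}{\sqrt{p_A(1-p_A)}} \\
&= \lambda_B\,\frac{p_{AB} - p_A p_B}{\sqrt{p_A(1-p_A)\,p_B(1-p_B)}} = \lambda_B\sqrt{r^2}
\end{aligned}
$$

$$\lambda_A = \lambda_B\sqrt{r^2}$$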


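How many individuals, $N_A$, do we need to collect at marker A to achieve the same power as if we collected $N_B$ individuals at marker B?

$$S_A \sim N\left(\lambda_A\sqrt{N_A},\; 1\right), \qquad S_B \sim N\left(\lambda_B\sqrt{N_B},\; 1\right)$$

Using $\lambda_A = \lambda_B\sqrt{r^2}$:

$$
\begin{aligned}
\lambda_A\sqrt{N_A} &= \lambda_B\sqrt{N_B} \\
\lambda_B\sqrt{r^2}\,\sqrt{N_A} &= \lambda_B\sqrt{N_B} \\
N_A &= \frac{N_B}{r^2}
\end{aligned}
$$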
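The relation between the two non-centrality parameters and the resulting sample size inflation can be checked numerically. In the sketch below all frequencies are hypothetical, the pooled frequencies are taken as the average of case and control frequencies, and the conditional probabilities of A given the allele at B are assumed equal in cases and controls.

```python
from math import isclose, sqrt


def non_centrality(p_case, p_control):
    """lambda = (p+ - p-) / sqrt(2 p (1 - p)) with p the pooled frequency."""
    p = (p_case + p_control) / 2
    return (p_case - p_control) / sqrt(2 * p * (1 - p))


def r_coefficient(p_ab, p_a, p_b):
    """Correlation coefficient r between two biallelic markers."""
    return (p_ab - p_a * p_b) / sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical causal marker B and observed marker A.
p_b_case, p_b_control = 0.30, 0.20
p_a_given_B, p_a_given_b = 0.90, 0.10


def marker_a_frequency(p_b):
    return p_b * p_a_given_B + (1 - p_b) * p_a_given_b


p_a_case = marker_a_frequency(p_b_case)
p_a_control = marker_a_frequency(p_b_control)

p_b = (p_b_case + p_b_control) / 2
p_a = (p_a_case + p_a_control) / 2
p_ab = p_b * p_a_given_B  # frequency of the haplotype carrying A and B

r = r_coefficient(p_ab, p_a, p_b)
lambda_a = non_centrality(p_a_case, p_a_control)
lambda_b = non_centrality(p_b_case, p_b_control)

# Individuals needed at marker A to match 1000 individuals at marker B.
n_b = 1000
n_a = n_b / r ** 2
```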

Visualization in terms of Power

(Figure: power of the association test, showing the threshold for significance and the non-centrality parameters $\lambda_B\sqrt{N}$ and $\lambda_A\sqrt{N}$, with $\lambda_A = \lambda_B\sqrt{r^2}$.)

Correlating Haplotypes with the Disease

The disease may be correlated with a SNP not in the panel.

The disease may be more correlated with a haplotype (group of SNPs) than with any single SNP in the panel.

Haplotype tests: Which haplotypes should we test? Which blocks should we pick?

Key Problem: Indirect Association

We have the HapMap. Information on 4,000,000 SNPs.

Affymetrix gene chip collects information on 500,000 SNPs. What about the remaining 3,500,000 SNPs?

So far, we have designed studies by picking tag SNPs with high r².

Can we use the HapMap when performing association? Multi-Tag methods.

Haplotypes as Proxies for Hidden SNPs (de Bakker 2005)

Haplotypes    Freq.
1 2 3 4 5
A A A A A     .25
A G A G G     .15
A G A G A     .10
G A G G G     .25
G G G G G     .25


WHAP - Weighted Haplotypes

Haplotypes    Freq.
1 2 3 4 5
A A A A A     .25
A G A G G     .15
A G A G A     .10
G A G G G     .25
G G G G G     .25

A: 0.71 AA + 0.29 AG

Basic MultiMarker Method

For each SNP in HapMap, find the haplotype among the genotyped SNPs that has the highest r² to the SNP.

Perform association at each SNP and each added haplotype.

Now instead of performing 500,000 tests, we perform 4,000,000 tests.

Weighted Haplotype Test

For each haplotype h, we assign a weight $w_h$.

We use a “weighted” allele frequency statistic:

$$W_h = \sum_h w_h\left(p_h^+ - p_h^-\right)$$

This statistic is the weighted numerator in $S_A$. What is the variance of this statistic?

Complication: Haplotype frequencies are not independent!

Weighted Haplotype Example

Assume we have 4 haplotypes AB, Ab, aB and ab.

If we set the weights so that $w_{AB} = w_{Ab} = 1$ and $w_{aB} = w_{ab} = 0$, this is equivalent to looking at the single SNP A.

If we set the weights so that $w_{AB} = 1$ and $w_{Ab} = w_{aB} = w_{ab} = 0$, this is equivalent to looking at the single haplotype AB.

Other weights can be something in between.

The -test

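The test statistic, as a function of the weights w, is

$$\frac{N\left(\sum_{h=1}^{k} w_h\left(p_h^{\text{case}} - p_h^{\text{control}}\right)\right)^2}{2\left(\sum_{h=1}^{k} w_h^2\,p_h - \left(\sum_{h=1}^{k} w_h\,p_h\right)^2\right)}$$

• Each haplotype h is assigned a weight $w_h$.
• N is the number of individuals.
• $p_h^{\text{case}}$, $p_h^{\text{control}}$ and $p_h$ are the probabilities of h in cases, in controls, and on average.
• Under the null, the test statistic is $\chi^2$ distributed.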
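A minimal sketch of the weighted statistic follows. It takes the pooled haplotype frequency to be the average of the case and control frequencies, which is one of the two options mentioned on the slide; the function name and the haplotype frequencies are hypothetical.

```python
def weighted_haplotype_statistic(weights, p_case, p_control, n):
    """Weighted haplotype test statistic for k haplotypes.

    weights, p_case, p_control: lists of length k, in the same
    haplotype order. n: number of individuals.
    """
    p_avg = [(a + b) / 2 for a, b in zip(p_case, p_control)]
    numerator = sum(w * (a - b) for w, a, b in zip(weights, p_case, p_control))
    mean = sum(w * p for w, p in zip(weights, p_avg))
    variance = sum(w * w * p for w, p in zip(weights, p_avg)) - mean ** 2
    return n * numerator ** 2 / (2 * variance)


# Hypothetical frequencies for the haplotypes AB, Ab, aB, ab.
p_case = [0.20, 0.15, 0.30, 0.35]
p_control = [0.15, 0.10, 0.35, 0.40]

# Weights (1, 1, 0, 0) collapse the test to the single SNP A, where
# p_A = 0.35 in cases and 0.25 in controls.
single_snp = weighted_haplotype_statistic([1, 1, 0, 0], p_case, p_control, 1000)

# Weights (1, 0, 0, 0) test the single haplotype AB.
single_haplotype = weighted_haplotype_statistic([1, 0, 0, 0], p_case, p_control, 1000)
```

With the single-SNP weights the result equals the square of the statistic $S_A$ computed from the same allele frequencies, which is a useful sanity check on the variance term.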

Non-Centrality Parameter

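Under weights $w_1, w_2, w_3, w_4$ and true case/control probabilities $p_1^+, p_2^+, p_3^+, p_4^+$ and $p_1^-, p_2^-, p_3^-, p_4^-$, $W_h$ is expected to be

$$W_h = \sum_{i=1}^{4} w_i\left(p_i^+ - p_i^-\right)$$

When normalizing for the variance, the non-centrality parameter is

$$\lambda_h\sqrt{N} = \frac{\sum_{i=1}^{4} w_i\left(p_i^+ - p_i^-\right)}{\sqrt{\dfrac{2}{N}\left(\sum_{i=1}^{4} w_i^2\,p_i - \left(\sum_{i=1}^{4} w_i\,p_i\right)^2\right)}}$$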

Wh and indirect association

Let us assume that SNP C is causal with non-centrality parameter λC.

If we perform weighted haplotype association, the noncentrality parameter is λh.

How are they related? (i.e. What is the power of the weighted haplotype association test).

Using the same technique, we can show that $\lambda_h = r_h\,\lambda_C$, where $r_h$ is the conceptual equivalent of r in the 2 SNP case.

The Relation to Power

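$$r_h^2 = \frac{\left(\sum_{h=1}^{k} w_h\left(q_{hC} - q_{hc}\right)\right)^2 p_C(1-p_C)}{\sum_h w_h^2\,p_h - \left(\sum_h w_h\,p_h\right)^2}$$

$$q_{hC} = P(h \mid C), \qquad q_{hc} = P(h \mid c), \qquad p_C = P(C)$$

The power of detecting the SNP with N individuals is the same as using the tag SNPs with $N/r_h^2$ individuals.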

Choosing the Weights

Haplotypes
1 2 3 4 5
A A A A A     .05
A G A G G     .15
A G A G A     .10
G A G G G     .25
G G G G G     .25

Optimal weights:

$$w_h(s_5) = P(s_5 = \text{‘A’} \mid h) = q_{Ah}$$

The Relation to Power

$$r_h^2 = \frac{\sum_{h=1}^{k} q_{Ch}\left(p_{Ch} - p_C\,p_h\right)}{p_C(1-p_C)}$$

This is exactly r² in the case of one tag SNP.
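The two expressions for $r_h^2$ can be compared numerically. The sketch below implements the general formula for arbitrary weights and the simplified formula for the weights $w_h = P(C \mid h)$, then checks the one-tag-SNP case. All frequencies and function names are hypothetical and serve only to illustrate the formulas.

```python
def haplotype_frequencies(q_h_given_C, q_h_given_c, p_C):
    """p_h = P(C) P(h|C) + P(c) P(h|c)."""
    return [p_C * qC + (1 - p_C) * qc for qC, qc in zip(q_h_given_C, q_h_given_c)]


def optimal_weights(q_h_given_C, q_h_given_c, p_C):
    """w_h = P(C | h), obtained with Bayes' rule."""
    p_h = haplotype_frequencies(q_h_given_C, q_h_given_c, p_C)
    return [p_C * qC / p for qC, p in zip(q_h_given_C, p_h)]


def r_h_squared(weights, q_h_given_C, q_h_given_c, p_C):
    """General r_h^2 for arbitrary weights."""
    p_h = haplotype_frequencies(q_h_given_C, q_h_given_c, p_C)
    effect = sum(w * (qC - qc) for w, qC, qc in zip(weights, q_h_given_C, q_h_given_c))
    mean = sum(w * p for w, p in zip(weights, p_h))
    variance = sum(w * w * p for w, p in zip(weights, p_h)) - mean ** 2
    return effect ** 2 * p_C * (1 - p_C) / variance


def r_h_squared_optimal(q_h_given_C, q_h_given_c, p_C):
    """Simplified r_h^2 when the weights are w_h = P(C | h)."""
    p_h = haplotype_frequencies(q_h_given_C, q_h_given_c, p_C)
    w = optimal_weights(q_h_given_C, q_h_given_c, p_C)
    # p_Ch = P(C and h) = P(C) P(h | C)
    total = sum(wh * (p_C * qC - p_C * p) for wh, qC, p in zip(w, q_h_given_C, p_h))
    return total / (p_C * (1 - p_C))


# One tag SNP with alleles A and a acting as the two "haplotypes".
one_tag = dict(q_h_given_C=[0.9, 0.1], q_h_given_c=[0.1, 0.9], p_C=0.25)
r2_one_tag = r_h_squared(optimal_weights(**one_tag), **one_tag)

# Four hypothetical haplotypes over two tag SNPs.
four = dict(q_h_given_C=[0.5, 0.3, 0.1, 0.1], q_h_given_c=[0.1, 0.2, 0.3, 0.4], p_C=0.3)
r2_weighted = r_h_squared(optimal_weights(**four), **four)
r2_single_snp = r_h_squared([1, 1, 0, 0], **four)
```

In the four-haplotype example the weights $P(C \mid h)$ give a larger $r_h^2$ than the single-SNP weights, in line with the claim that WHAP is at least as powerful as the single SNP test.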

WHAP always has at least as much power as:
• single SNP test
• single haplotype test
• haplotype group test
• χ² with k degrees of freedom.

• HapMap (4M SNPs): use as training dataset to get the weights. Tests: T1,…,T4M.
• Cases (0.5M SNPs) and Controls (0.5M SNPs): apply tests T1,…,T4M.

Positive results give evidence for a causal SNP - can be verified by a follow up/two stage study.

How Many SNPs are Captured?

Tag Set     Pop    SNP     HAP     WHAP
Affy500     CEU    0.61    0.77    0.84
Affy500     CHB    0.62    0.76    0.83
Affy500     JPT    0.59    0.73    0.81
Affy500     YRI    0.37    0.61    0.74
Illumina    CEU    0.88    0.97    0.98
Illumina    CHB    0.80    0.91    0.94
Illumina    JPT    0.78    0.90    0.95
Illumina    YRI    0.52    0.83    0.92

Power Simulations

Pop    SNP     HAP     WHAP
CEU    0.92    0.94    0.96
CHB    0.90    0.94    0.95
JPT    0.90    0.93    0.95
YRI    0.77    0.88    0.92

- Relative power to using all SNPs. - Tested on the ENCODE regions, Affy 500k tag SNPs.

Practical Issues

We assume we have the haplotype frequencies in the HapMap (not the phase).

We assume the case/control populations are coming from the same population as the HapMap.

Over-fitting: Train with half of the data, test the other half. No correlation between the haps and random SNPs.

WHAP r² in a region. Red lines are collected SNPs. Blue lines are $r_h^2$ values.

Associations using WHAP. Red lines are associations at collected SNPs. Blue lines are associations at uncollected SNPs inferred by WHAP.


Optimal Genome Wide Tagging by Reduction to SAT

Correlation Structure


Example r2 Matrix





Graph Representation

Satisfiability and SAT Solvers

• Boolean variables; a variable or its negation is called a literal
• Logical operators: AND ∧, OR ∨, NOT ¬
• Example: (s1 ∨ ¬s2) ∧ (s2 ∨ s3 ∨ s1), satisfied by s1 = false; s2 = false; s3 = true
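The example formula is small enough to check by brute force. The sketch below represents each clause as a list of literals, where a literal is a hypothetical `(variable, is_positive)` pair, and enumerates all assignments. Real SAT solvers are far more sophisticated; this only illustrates what satisfying a CNF formula means.

```python
from itertools import product

# (s1 OR NOT s2) AND (s2 OR s3 OR s1)
cnf = [
    [("s1", True), ("s2", False)],
    [("s2", True), ("s3", True), ("s1", True)],
]


def satisfies(cnf, assignment):
    """True if every clause contains at least one true literal."""
    return all(
        any(assignment[var] == is_positive for var, is_positive in clause)
        for clause in cnf
    )


def all_models(cnf):
    """Enumerate every satisfying assignment by exhaustive search."""
    variables = sorted({var for clause in cnf for var, _ in clause})
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment):
            models.append(assignment)
    return models


slide_assignment = {"s1": False, "s2": False, "s3": True}
slide_assignment_ok = satisfies(cnf, slide_assignment)
models = all_models(cnf)
```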

A. Darwiche

Negation Normal Form

(Figure: a rooted DAG (circuit) with literals over A, B, C and D at the leaves and alternating layers of "and" and "or" gates.)

CNF Form and Logical Solutions







NNF Form of Solutions



Local Single SNP r2 Tagging

• Generate a clause for each SNP
• Clause for SNP si contains all covers
• Input CNF as conjunction of all clauses
• Compile with minSAT solver
• Find solutions by traversal of NNF
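The clause construction above can be sketched as follows. Each SNP contributes one clause listing the SNPs that cover it, and a tag set is valid when it satisfies every clause. This example finds the minimum tag sets by exhaustive search rather than by compiling the CNF as the slides describe, so it is only practical for tiny inputs. The r² matrix, the 0.8 threshold, and the function names are all hypothetical.

```python
from itertools import combinations


def cover_clauses(r2, threshold):
    """One clause per SNP: the SNPs (including itself) that cover it."""
    n = len(r2)
    return [
        [j for j in range(n) if i == j or r2[i][j] >= threshold]
        for i in range(n)
    ]


def minimum_tag_sets(r2, threshold=0.8):
    """Return all smallest tag sets that satisfy every cover clause."""
    clauses = cover_clauses(r2, threshold)
    n = len(r2)
    for size in range(1, n + 1):
        solutions = [
            set(tags)
            for tags in combinations(range(n), size)
            if all(any(j in tags for j in clause) for clause in clauses)
        ]
        if solutions:
            return solutions
    return []


# Hypothetical symmetric r^2 matrix for four SNPs.
r2 = [
    [1.00, 0.90, 0.50, 0.10],
    [0.90, 1.00, 0.85, 0.10],
    [0.50, 0.85, 1.00, 0.10],
    [0.10, 0.10, 0.10, 1.00],
]
clauses = cover_clauses(r2, 0.8)
tag_sets = minimum_tag_sets(r2, 0.8)
```

Here SNP 1 covers SNPs 0, 1 and 2, and SNP 3 is covered only by itself, so the single minimum tag set is {1, 3}.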

Optimal Tagging



Whole Genome Tagging



MultiMarker Example





MultiMarker Tagging





UCLA: Adnan Darwiche, Arthur Choi, Knot Pipatswisawat

ICSI: Eran Halperin, Richard Karp

Perlegen Sciences: David Hinds, David Cox

Ph.D. Students: Buhm Han, Nils Homer, Hyun Min Kang, Sean O’Rourke, Jimmie Ye, Noah Zaitlen

Webserver Hosted By:
