Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Use of Mixture Model in a genome-wide DNA microarray-based genetic screen for components of the NHEJ Pathway in Yeast. Rafael A. Irizarry, Department of Biostatistics, JHU [email protected]


Page 1: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Use of Mixture Model in a genome-wide DNA microarray-based genetic screen for components of the NHEJ Pathway in Yeast

Rafael A. Irizarry
Department of Biostatistics, JHU

[email protected]

Page 3: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Damaged DNA

• Yku70p/Yku80p (DNA-PK): DNA end binding
• Rad50p/Mre11p/Xrs2p: Nucleolytic processing
• Lig4p/Lif1p: Ligation

Repaired DNA

Page 4: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

[Figure, panels A and B: experimental design. Labels: UPTAG, kanR, DOWNTAG; Circular pRS416; EcoRI linearized pRS416; MCS; URA3; CEN/ARS; ttaaaatt; NHEJ Defective.]

• Transformation into deletion pool
• Select for Ura+ transformants
• Genomic DNA preparation
• PCR
• Cy5 labeled PCR products, Cy3 labeled PCR products
• Oligonucleotide array hybridization

Page 5: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Data

• 5718 mutants
• 3 replicates on each slide
• 5 Haploid slides, 4 Diploid slides
• Haploids are divided into 2 downtags and 3 uptags (2 of which are replicate uptags)
• Diploids are divided into 3 uptags (2 of which are replicates) and 2 downtags

Page 6: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Which mutants are NHEJ defective?

• Find mutants defective for transformation with linear DNA
• Dead in linear transformation (green)
• Alive in circular transformation (red)
• Look for spots with large log(R/G)
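The "large log(R/G)" rule can be sketched in a few lines. This is only an illustration with made-up intensities: the base-2 log, the averaging over replicate spots, the function name `log_ratio_scores`, and the mutant labels are all assumptions, not details from the slides.

```python
import math

def log_ratio_scores(red, green):
    """Average log2(R/G) over replicate spots for each mutant.

    red, green: dicts mapping mutant name -> list of positive intensities,
    one per replicate spot. Base 2 is an assumption; the slides only
    say log(R/G).
    """
    scores = {}
    for mutant in red:
        ratios = [math.log2(r / g) for r, g in zip(red[mutant], green[mutant])]
        scores[mutant] = sum(ratios) / len(ratios)
    return scores

# Made-up intensities: mutantA is alive with circular DNA (high red)
# but dead with linear DNA (low green); mutantB grows in both.
red = {"mutantA": [800.0, 1600.0, 800.0], "mutantB": [400.0, 400.0, 400.0]}
green = {"mutantA": [100.0, 100.0, 200.0], "mutantB": [400.0, 200.0, 400.0]}

scores = log_ratio_scores(red, green)
# Candidates for NHEJ defects are the mutants with the largest scores.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Ranking by the averaged log ratio is the "usual approach" that the later slides try to improve on.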

Page 12: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Improvement to usual approach

• Take into account that some mutants are dead and some alive
• Use a statistical model to represent this
• Mixture model?
• With ratios we lose information about R and G separately
• Look at them separately (absolute analysis)

Page 18: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Warning

• Absolute analyses can be dangerous for competitive hybridization slides
• We must be careful about “spot effect”
• Big R or G may only mean the spot they were on had large amounts of cDNA
• Look at some facts that make us feel safer

Page 19: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Correlation between replicates

      R1    R2    R3    G1    G2    G3
R1   1.00  0.95  0.95  0.94  0.90  0.90
R2   0.95  1.00  0.96  0.90  0.95  0.91
R3   0.95  0.96  1.00  0.91  0.92  0.95
G1   0.94  0.90  0.91  1.00  0.96  0.96
G2   0.90  0.95  0.92  0.96  1.00  0.97
G3   0.90  0.91  0.95  0.96  0.97  1.00
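A table like this one can be produced from the per-mutant intensities with a plain Pearson correlation. The sketch below uses made-up numbers and hypothetical helper names (`pearson`, `correlation_matrix`); whether the original table was computed on raw or log intensities is not stated on the slide.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """columns: dict mapping a label (e.g. 'R1') to one value per mutant."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up values for four mutants; real columns would have 5718 entries.
columns = {
    "R1": [1.0, 2.0, 3.0, 4.0],
    "R2": [1.1, 1.9, 3.2, 3.8],
    "G1": [4.0, 3.0, 2.0, 1.0],
}
corr = correlation_matrix(columns)
```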

Page 20: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Correlation between red, green, haploid, diploid, uptag, downtag

Labels: R/G = red/green, H/D = haploid/diploid, D/U = downtag/uptag.

       RHD   RHU   RDD   RDU   GHD   GHU   GDD   GDU
RHD   1.00  0.59  0.56  0.32  0.95  0.58  0.54  0.37
RHU   0.59  1.00  0.38  0.56  0.58  0.95  0.40  0.58
RDD   0.56  0.38  1.00  0.58  0.54  0.39  0.92  0.64
RDU   0.32  0.56  0.58  1.00  0.33  0.53  0.58  0.89
GHD   0.95  0.58  0.54  0.33  1.00  0.62  0.56  0.39
GHU   0.58  0.95  0.39  0.53  0.62  1.00  0.41  0.58
GDD   0.54  0.40  0.92  0.58  0.56  0.41  1.00  0.73
GDU   0.37  0.58  0.64  0.89  0.39  0.58  0.73  1.00

Page 21: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

BTW

The mean squared error across slides is about 3 times bigger than the mean squared error within slides
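One way to make this comparison concrete, for a single mutant and one channel, is sketched below. The definitions are assumptions: "within" is taken as the average squared deviation of replicate spots from their own slide mean, and "across" as the average squared deviation of slide means from the grand mean. The slide does not say how the two quantities were actually computed, and the numbers are made up.

```python
def within_and_across_mse(slides):
    """slides[s][k] is the value of replicate spot k on slide s."""
    slide_means = [sum(s) / len(s) for s in slides]
    grand_mean = sum(slide_means) / len(slide_means)
    n_spots = sum(len(s) for s in slides)
    # Spread of replicate spots around their own slide mean.
    within = sum((v - m) ** 2
                 for s, m in zip(slides, slide_means) for v in s) / n_spots
    # Spread of slide means around the grand mean.
    across = sum((m - grand_mean) ** 2 for m in slide_means) / len(slide_means)
    return within, across

# Three slides with three replicate spots each (made-up values).
slides = [[9.0, 10.0, 11.0], [12.0, 13.0, 14.0], [6.0, 7.0, 8.0]]
within, across = within_and_across_mse(slides)
```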

Page 22: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Mixture Model

We use a mixture model that assumes:

• There are three classes:
– Dead
– Marginal
– Alive
• Normally distributed with same correlation structure from gene to gene
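To show what fitting a three-class normal mixture involves, here is a deliberately simplified sketch: a one-dimensional EM fit with a single shared variance. The model on the slide is multivariate with a correlation structure, so this is not the fit used in the talk. The function names, the starting values, the made-up data, and the convention that the lowest-mean class is "dead" are all assumptions.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_three_class_mixture(values, n_iter=50):
    """EM for a 1-D mixture of three normals sharing one variance."""
    xs = sorted(values)
    n = len(xs)
    # Starting means at the 1/6, 1/2 and 5/6 quantiles (an arbitrary choice).
    means = [xs[n // 6], xs[n // 2], xs[(5 * n) // 6]]
    weights = [1.0 / 3.0] * 3
    # Starting variance: mean squared distance to the nearest starting mean
    # (assumed positive, i.e. not all values sit exactly on a starting mean).
    var = sum(min((x - m) ** 2 for m in means) for x in xs) / n
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each value.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, var) for w, m in zip(weights, means)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and the shared variance.
        for k in range(3):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
        var = sum(r[k] * (x - means[k]) ** 2
                  for r, x in zip(resp, xs) for k in range(3)) / n
    return weights, means, var

def classify(x, weights, means, var):
    """Most probable class for x; the lowest-mean class is called 'dead'."""
    labels = ("dead", "marginal", "alive")
    post = [w * normal_pdf(x, m, var) for w, m in zip(weights, means)]
    return labels[post.index(max(post))]

# Made-up log intensities clustered around 0, 5 and 10.
values = [c + d for c in (0.0, 5.0, 10.0) for d in (-0.5, -0.25, 0.0, 0.25, 0.5)]
weights, means, var = fit_three_class_mixture(values)
```

A call such as `classify(0.2, weights, means, var)` then assigns a new value to its most probable class.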

Page 23: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Random effect justification

Each x = (r1,…,r5,g1,…,g5) will have the following effects:

• Individual effect: same mutant, same expression (replicates are alike)

• Genetic effect: same genetics, same expression

• PCR effect: expect a difference between uptag and downtag
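One way to write these three effects down is an additive random-effects decomposition. The symbols below are illustrative assumptions, not notation from the slides: a is the individual (mutant) effect, b is a genetic effect shared by measurements with the same genetic background p(j), c is a PCR effect shared by measurements with the same tag type t(j), and all terms are taken to be independent with mean zero.

```latex
x_j = \mu + a + b_{p(j)} + c_{t(j)} + \varepsilon_j,
\qquad
\operatorname{Cov}(x_j, x_k) = \sigma_a^2
  + \sigma_b^2 \,\mathbf{1}\{p(j) = p(k)\}
  + \sigma_c^2 \,\mathbf{1}\{t(j) = t(k)\}
\quad (j \neq k)
```

Under these assumptions, two measurements are more strongly correlated the more effects they share, which is the kind of pattern a common correlation structure from gene to gene would encode.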

Page 24: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Does it fit?

Page 25: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Does it fit?

Page 26: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

What can we do now that we couldn’t do before?

• Define a t-test that takes into account whether mutants are dead or not when computing the variance

• For each gene, compute likelihood ratios comparing two hypotheses:

alive/dead vs. dead/dead or alive/alive
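The likelihood-ratio comparison can be sketched as follows. This is a simplified illustration, not the computation from the talk: it treats the red and green measurements as independent normals given the class, ignores the "marginal" class, reads "alive/dead" as alive in the circular (red) channel and dead in the linear (green) channel, and uses made-up class parameters. The function names are hypothetical.

```python
import math

def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (x - mean) ** 2 / (2.0 * sd * sd)

def log_likelihood_ratio(red, green, dead, alive):
    """Log LR of alive/dead against the better of dead/dead and alive/alive.

    red, green: lists of log intensities for one gene.
    dead, alive: (mean, sd) tuples for each class (hypothetical values).
    """
    def loglik(values, params):
        return sum(log_normal_pdf(v, params[0], params[1]) for v in values)

    alive_dead = loglik(red, alive) + loglik(green, dead)
    dead_dead = loglik(red, dead) + loglik(green, dead)
    alive_alive = loglik(red, alive) + loglik(green, alive)
    return alive_dead - max(dead_dead, alive_alive)

dead, alive = (6.0, 1.0), (10.0, 1.0)
# High red, low green: looks like an NHEJ-defective mutant.
candidate = log_likelihood_ratio([10.2, 9.8, 10.1], [6.1, 5.9, 6.3], dead, alive)
# High in both channels: grows with circular and with linear DNA.
control = log_likelihood_ratio([10.0, 9.9, 10.3], [10.1, 9.7, 10.2], dead, alive)
```

A positive value favors alive/dead; a negative value favors one of the two "no difference" hypotheses.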

Page 27: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

QQ-plot for new t-test

Page 28: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Better looking than others

Page 32: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

 1  YMR106C   9.5  47    69.2  a  a  100
 2  YOR005C  19.7  35    44.9  a  d  100
 3  YLR265C   6.1  32    35.8  a  m  100
 4  YDL041W  10.4  32    35.6  a  m  100
 5  YIL012W  12.2  31    21.7  a  a  100
 6  YIL093C   4.8  29    30.8  a  a  100
 7  YIL009W   5.6  29   -23.5  a  a  100
 8  YDL042C  12.9  29    32.1  a  d  100
 9  YIL154C   1.8  28    91.3  m  m   82
10  YNL149C   1.7  27    93.4  m  d   71
11  YBR085W   2.5  26   -15.8  a  a   84
12  YBR234C   1.7  26    87.5  m  d   75
13  YLR442C   6.1  26  -100.0  a  a  100

Page 33: Rafael A. Irizarry Department of Biostatistics, JHU rafa@jhu

Acknowledgements

• Siew Loon Ooi
• Jef Boeke
• Forrest Spencer
• Jean Yang