An Expert System for Scoring DNA Database Profiles Dr. Mark W. Perlin Cybergenetics Pittsburgh, PA.
DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD,...
![Page 1: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/1.jpg)
DNA Identification: Mixture Weight & Inference
Cybergenetics © 2003-2010
Mark W Perlin, PhD, MD, PhD
Cybergenetics, Pittsburgh, PA
TrueAllele® Lectures
Fall, 2010
![Page 2: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/2.jpg)
Mixture Weight: Uncertain Quantity
Infer mixture weight from STR experiments:
• quantitative peak data
• contributor genotypes
Pr(W=w | data, G1=g1, G2=g2, …)
hierarchical Bayesian model
Perlin MW, Legler MM, Spencer CE, Smith JL, Allan WP, Belrose JL, Duceman BW. Validating TrueAllele® DNA mixture interpretation. Journal of Forensic Sciences. 2011;56(November):in press.
![Page 3: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/3.jpg)
Mixture Weight Model
prior probability: W0
template weight (kth contributor): Wk
locus weights: Wk,1, Wk,2, …, Wk,N
experiment data: dk,1, dk,2, …, dk,N
![Page 4: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/4.jpg)
Experiment Estimate

$$w_k = \frac{\text{sum of peak heights from the } k\text{th contributor}}{\text{sum of peak heights from all contributors}}$$
![Page 5: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/5.jpg)
D16S539
![Page 6: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/6.jpg)
Three Alleles

| Allele | Quantity |
|--------|----------|
| 11 | 500 |
| 12 | 6,750 |
| 13 | 250 |
![Page 7: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/7.jpg)
Experiment Estimate

| Allele | Quantity |
|--------|----------|
| 11 | 500 |
| 12 | 6,750 |
| 13 | 250 |

$$\frac{500 + 250}{500 + 6{,}750 + 250} = \frac{750}{7{,}500} = 10\%$$
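The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch, not the lecture's software: the function name `mixture_weight_estimate` is hypothetical, and the assignment of alleles 11 and 13 to the minor contributor (with allele 12 belonging to the major contributor) is an assumption read off the numerator of the worked fraction.

```python
def mixture_weight_estimate(peak_heights, contributor_alleles):
    """Fraction of the total peak height attributed to one contributor.

    peak_heights: dict mapping allele label -> peak height (quantity)
    contributor_alleles: alleles assumed to come only from that contributor
    """
    contributor_sum = sum(peak_heights[a] for a in contributor_alleles)
    total = sum(peak_heights.values())
    return contributor_sum / total


# D16S539 quantities from the slide
d16s539 = {11: 500, 12: 6750, 13: 250}

# Assumption: alleles 11 and 13 belong to the minor contributor
w_minor = mixture_weight_estimate(d16s539, [11, 13])
print(f"{w_minor:.1%}")  # 10.0%
```

The simple ratio only applies when the contributors share no alleles at the locus, so that each peak can be assigned to a single contributor.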
![Page 8: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/8.jpg)
D7S820
![Page 9: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/9.jpg)
Four Alleles

| Allele | Quantity |
|--------|----------|
| 8 | 4,000 |
| 10 | 250 |
| 12 | 3,000 |
| 13 | 250 |
![Page 10: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/10.jpg)
Experiment Estimate

| Allele | Quantity |
|--------|----------|
| 8 | 4,000 |
| 10 | 250 |
| 12 | 3,000 |
| 13 | 250 |

$$\frac{250 + 250}{4{,}000 + 250 + 3{,}000 + 250} = \frac{500}{7{,}500} = 6.7\%$$
![Page 11: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/11.jpg)
Overlapping Alleles
![Page 12: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/12.jpg)
Template Average
mean
variance
![Page 13: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/13.jpg)
Template Mixture Weight Probability Distribution
mean = 6.7%
standard deviation = 0.9%
![Page 14: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/14.jpg)
Central Limit Theorem
• more experiments on a template provide greater mixture weight precision
• double the precision by doing four times the number of experiments
• combine evidence from multiple experiments to obtain a more informative result
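The "four times the experiments" rule follows from the standard error of a mean, which shrinks as one over the square root of the number of experiments. The sketch below illustrates this under the assumption of independent experiments with equal spread; the function name and the per-experiment standard deviation of 0.03 are hypothetical values chosen for illustration.

```python
import math


def standard_error(per_experiment_sd, n_experiments):
    """Standard error of the mean of n independent, equally variable estimates."""
    return per_experiment_sd / math.sqrt(n_experiments)


sd = 0.03  # hypothetical per-experiment standard deviation of the weight estimate

se_4 = standard_error(sd, 4)
se_16 = standard_error(sd, 16)  # four times as many experiments

# Quadrupling the experiments halves the standard error (doubles the precision)
print(se_4 / se_16)  # 2.0
```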
![Page 15: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/15.jpg)
Probability Solution
interacting random variables:
w | d, g1, g2, …
g1 | d, g2, w, …
g2 | d, g1, w, …
zi | d, g1, g2, w, …
find probability distributions by iterative sampling
Gelfand AE, Smith AFM. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association. 1990;85:398-409.
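Iterative sampling of interacting variables can be illustrated with a toy Gibbs sampler, where each variable is drawn in turn from its distribution conditional on the current value of the other. The sketch below is not the mixture model from the lecture: it uses a standard bivariate normal with correlation `rho` as a stand-in target, and the function name, seed, and burn-in length are assumptions made for illustration.

```python
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each step draws x given y, then y given x, from the exact conditionals
    x | y ~ N(rho * y, 1 - rho^2) and y | x ~ N(rho * x, 1 - rho^2).
    """
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_sd)  # draw x conditional on current y
        y = rng.gauss(rho * x, cond_sd)  # draw y conditional on new x
        if step >= burn_in:
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)

# Recover the correlation from the sampled pairs
n = len(samples)
mean_x = sum(x for x, _ in samples) / n
mean_y = sum(y for _, y in samples) / n
cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
corr = cov / (var_x * var_y) ** 0.5
print(round(corr, 2))
```

The sampled correlation should land close to the chosen `rho`, even though the joint distribution is never sampled directly, only the conditionals.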
![Page 16: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/16.jpg)
Markov Chain Monte Carlo
![Page 17: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/17.jpg)
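Prior Probability

genotype:
$$g_{k,l} \sim \begin{cases} f_i^2, & i = j \\ 2 f_i f_j, & i \neq j \end{cases}$$

mixture weight:
$$w \sim \mathrm{Dir}(1)$$

DNA quantity:
$$m_l \sim N_+\left(5000,\ 5000^2\right)$$

variance parameters:
$$\sigma^{-2} \sim \mathrm{Gam}(10, 20), \qquad \tau^{-2} \sim \mathrm{Gam}(10, 500), \qquad \psi^{-2} \sim \mathrm{Gam}(12, 1200)$$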
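The genotype prior uses population allele frequencies: a homozygote with allele i has probability f_i squared, and a heterozygote with alleles i and j has probability 2 f_i f_j. A minimal sketch of that calculation follows; the function name and the allele frequencies are hypothetical, chosen only so the numbers are easy to check.

```python
from itertools import combinations_with_replacement


def genotype_prior(allele_freqs):
    """Prior probability of each unordered allele pair from allele frequencies."""
    prior = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        fa, fb = allele_freqs[a], allele_freqs[b]
        # f_i^2 for a homozygote, 2 * f_i * f_j for a heterozygote
        prior[(a, b)] = fa * fa if a == b else 2.0 * fa * fb
    return prior


# Hypothetical allele frequencies that sum to 1 (not real population data)
freqs = {11: 0.3, 12: 0.5, 13: 0.2}
prior = genotype_prior(freqs)

for genotype, prob in prior.items():
    print(genotype, round(prob, 3))
```

When the allele frequencies sum to one, the genotype probabilities also sum to one.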
![Page 18: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/18.jpg)
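Joint Likelihood Function

data:
$$d_l \sim N_+\left(\mu_l,\ \Sigma_l\right)$$

pattern:
$$\mu_l = m_l \cdot \sum_{k=1}^{K} w_{k,l} \cdot g_{k,l}$$

variation:
$$\Sigma_l = \sigma^2 \cdot V_l + \tau^2$$

locus weights:
$$w_l \sim N_{[0,1]^{K-1}}\left(w,\ \psi^2 \cdot I\right)$$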
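The pattern term is a weighted sum of contributor genotypes scaled by DNA quantity. The sketch below evaluates it for the D16S539 example under several labeled assumptions: genotypes are encoded as allele copy counts over alleles 11, 12, 13; the major contributor is taken to be 12,12 and the minor 11,13; the weights are 90% and 10%; and the quantity of 3,750 per allele copy is chosen so the pattern totals 7,500. The function name `expected_pattern` is hypothetical.

```python
def expected_pattern(quantity, weights, genotypes):
    """Expected peak pattern: quantity * sum over contributors of weight * genotype.

    genotypes: one vector of allele copy counts per contributor (assumed encoding)
    """
    n_alleles = len(genotypes[0])
    return [
        quantity * sum(w * g[a] for w, g in zip(weights, genotypes))
        for a in range(n_alleles)
    ]


# Allele order: 11, 12, 13 (copy counts are an assumed genotype encoding)
major = [0, 2, 0]  # assumed genotype 12,12
minor = [1, 0, 1]  # assumed genotype 11,13

mu = expected_pattern(3750, [0.9, 0.1], [major, minor])
print(mu)  # approximately [375, 6750, 375]
```

Under these assumptions the expected pattern matches the observed 6,750 at allele 12 and predicts equal minor peaks of 375, against the observed 500 and 250. The variation term in the likelihood accounts for that kind of difference between pattern and data.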
![Page 19: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/19.jpg)
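Posterior Probability

genotype:
$$\Pr\{Q = x \mid d_{l,1}, d_{l,2}, \ldots, d_{l,i}, \ldots\} \propto \Pr\{Q = x\} \cdot \prod_{i=1}^{I} \Pr\{d_{l,i} \mid Q = x, \ldots\}$$

mixture weight:
$$\Pr\{W = w \mid d_1, d_2, \ldots, d_J, \ldots\} \propto \Pr\{W = w\} \cdot \prod_{j=1}^{J} \Pr\{d_j \mid W = w, \ldots\}$$

data variation:
$$\Pr\{\sigma^2 = s^2 \mid d_1, d_2, \ldots, d_j, \ldots\} \propto \Pr\{\sigma^2 = s^2\} \cdot \prod_{j=1}^{J} \Pr\{d_j \mid \sigma^2 = s^2, \ldots\}$$

$$\Pr\{\tau^2 = t^2 \mid d_1, d_2, \ldots, d_j, \ldots\} \propto \Pr\{\tau^2 = t^2\} \cdot \prod_{j=1}^{J} \Pr\{d_j \mid \tau^2 = t^2, \ldots\}$$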
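Each posterior has the same shape: a prior multiplied by a product of per-experiment likelihoods, then normalized. The sketch below applies that recipe to the mixture weight on a grid. It is a simplified stand-in, not the lecture's likelihood: it assumes a uniform prior on the weight, treats the two locus estimates from the earlier slides (10% and 6.7%) as observations, and uses a normal likelihood with a hypothetical standard deviation of 0.02. The function name `grid_posterior` is also hypothetical.

```python
import math


def grid_posterior(observations, obs_sd, grid_size=1000):
    """Posterior over a weight in (0, 1): prior times product of likelihoods."""
    grid = [(i + 0.5) / grid_size for i in range(grid_size)]
    unnormalized = []
    for w in grid:
        prior = 1.0  # assumed uniform prior on the mixture weight
        likelihood = 1.0
        for d in observations:
            # assumed normal likelihood for each observed estimate
            likelihood *= math.exp(-0.5 * ((d - w) / obs_sd) ** 2)
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]


# Locus estimates from the D16S539 and D7S820 slides; obs_sd is hypothetical
grid, post = grid_posterior([0.10, 0.067], obs_sd=0.02)

post_mean = sum(w * p for w, p in zip(grid, post))
post_sd = sum((w - post_mean) ** 2 * p for w, p in zip(grid, post)) ** 0.5
print(round(post_mean, 3), round(post_sd, 3))
```

With two observations the posterior is narrower than the assumed spread of a single observation, consistent with combining evidence from multiple experiments.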
![Page 20: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/20.jpg)
Generally Accepted Method
genotype
mixture weight
data variation
Curran J. A MCMC method for resolving two person mixtures. Science & Justice. 2008;48(4):168-77.
![Page 21: DNA Identification: Mixture Weight & Inference Cybergenetics © 2003-2010 Mark W Perlin, PhD, MD, PhD Cybergenetics, Pittsburgh, PA TrueAllele ® Lectures.](https://reader033.fdocuments.us/reader033/viewer/2022061305/5513ef4055034646298b5f72/html5/thumbnails/21.jpg)
Hierarchical Bayesian Model with MCMC Solution
• standard approach in modern science
• describes uncertainty using probability
• the "new calculus"
• replaces hard calculus with easy computing
• can solve virtually any problem
• well-suited to interpreting DNA evidence