Yfiler® Plus Sensitivity Study (early release kit)

Sonja Klein Mark Timken

Martin Buoncristiani

Cal DOJ Jan Bashinski DNA Lab

Presentation Outline

• What is a sensitivity study?
• Pre-PCR sampling statistics
• Results of the empirical study compared to predicted results based on sampling statistics
• How the results might help with interpretation and provide guidance with addressing the SWGDAM Y STR guidelines

Yfiler® Plus from Applied Biosystems

• 27 locus Y STR kit
  – 3 known multi-copy loci: DYS389, DYS385, DYF387S1
• 25 uL reaction volume
• 30 cycles
• 1 ng target template
• 6 dye system (5 plus LIZ)
• 3500
  – 1.2 kV, 16 sec
  – 175 RFU Analytical Threshold

Yfiler® Plus Plot- 1 ng 10 new loci

What is a Sensitivity Study?

• Validation requirement
• Typically:
  – Replicate amps in a dilution series
    • Example: 2 ng to 16 pg (for a 1 ng target system)
  – Positive control DNA or other known single source samples (extracted, purified DNA)

Some Sensitivity Study Goals

• Is the target template appropriate for system?
  – Cycle number, reaction volume, instrument/injection settings
  – Mid to low-mid range of CCD camera detection
• Over what range of input is signal linear?
• Over what range of input are all peaks likely to be detected?
• What stochastic effects are observed?
  – PHRs, dropout, (stutter)
  – Stochastic Threshold
    • Multi-copy loci for Y STRs
  – Probability of dropout (for probabilistic genotyping)
    • Null allele vs. drop-out for Y STRs

Sensitivity Study

• DNA Dilution Series

Template (pg)   # of replicates   Loci   # expected alleles
1000            2                 27     54
500             2                 27     54
250             4                 27     108
125             8                 27     216
62.5            14                27     378
31.25           14                27     378
15.6            14                27     378
7.8             14                27     378
Total           72 amps                  1944 alleles (loci)

Linearity

[Figure: empirical avg PH (RFU, 0 to 12000) vs. template (pg, 0 to 1000)]

Peak Height Dispersion

[Figure: empirical PHs and empirical avg PH (RFU, 0 to 12000) vs. template (pg, 0 to 1000)]

Peak Height Relative Stdev

[Figure: Stdev PH / Avg PH × 100 (%CV, 0 to 120) vs. template (pg: 7.8, 15.6, 31.25, 62.5, 125, 250, 500, 1000)]

Sources of Peak-Height Variation

• Stochastic:
  – Pre-PCR allelic sampling
  – PCR synthesis
• Systematic:
  – Inter-locus imbalance
  – Preferential amplification
  – Cap-to-cap
  – Injection-to-injection
  – Degradation
  – Inhibition

Pre-PCR Stochastic Sampling Effects

• Publications identifying pre-PCR sampling as the main source of peak height variance at low template amounts (for extracted DNA, i.e. dissociated alleles)
  – Walsh, et al., CSH Genome Res. 1992
  – Taberlet, et al., NAR 1996
  – Stenman and Orpana, Nature Biotechnology 2001
  – Gill, et al., NAR 2005
  – Timken, et al., FSIG 2014
    • This paper compared empirical results of 2372 with ID+ and MF to predicted results based on pre-PCR sampling alone

2001

“Assuming that the template molecules are evenly distributed in a solution of known concentration, the probability of a certain number of molecules to be present in an aliquot pipetted from this solution can be calculated according to the Poisson distribution”

Uniform Sampling (what we imagine) vs. Stochastic (Poisson) Sampling (what we get)

Assume a homogeneous DNA solution at 105.6 pg per 10 uL, i.e., 16 copies per 10 uL. This situation is depicted below as a uniform distribution of copies (black balls) in squares representing 10 uL volumes.

If we transfer 10 uL from this solution into a PCR tube, at a single locus with an average of 16 copies, what do stochastic sampling statistics (using the Poisson distribution) say that we’ll get?

22 copies
11 copies
18 copies
13 copies

Avg = 16 = λ, Stdev = √λ = 4

[Figure: four squares representing 10 uL volumes, each holding 16 evenly spaced copies (uniform sampling), shown beside four Poisson draws of 22, 11, 18, and 13 copies]

Poisson Distribution

Poisson is a discrete probability distribution:

$$P(X \mid \lambda) = \frac{\lambda^{X} e^{-\lambda}}{X!}$$

Poisson assumptions:
1. Alleles have the same average concentration (λ)
2. Are sampled equivalently
3. Are sampled independently

Poisson properties:
1. The mean of the distribution is equal to λ
2. The variance is also equal to λ

• C.M. Grinstead, J.L. Snell, Introduction to Probability: Second Revised Edition, American Mathematical Society, Rhode Island, 1997.

The probability that we actually sample exactly 16 copies, according to the Poisson distribution, is:

$$P(16 \mid 16) = \frac{16^{16} \times e^{-16}}{16!} = 0.0992$$

(~9.9% of the time exactly 16 copies will be sampled)

We can also calculate a cumulative probability. For example, the probability that we sample 16 or fewer copies with an average of 16 (calculated in Excel) is:

P(16 or fewer) = POISSON(16,16,TRUE) = 0.566
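The two Excel-style values above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library. The function names `poisson_pmf` and `poisson_cdf` are illustrative and not from the presentation.

```python
import math

def poisson_pmf(k, lam):
    """Probability of sampling exactly k copies when the average is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def poisson_cdf(k, lam):
    """Probability of sampling k or fewer copies (like POISSON(k, lam, TRUE))."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

# Average of 16 copies per 10 uL aliquot, as in the slide
print(round(poisson_pmf(16, 16), 4))  # probability of exactly 16 copies
print(round(poisson_cdf(16, 16), 3))  # probability of 16 or fewer copies
```

Summing the probability mass term by term is adequate for the small copy numbers discussed here. A statistics library would be the better choice for large λ.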

Pre-PCR Sampling Statistics and Sensitivity Study with ID+/MF

Applying pre-PCR Sampling Statistics to Yfiler® Plus Sensitivity Study

• Test if Poisson sampling statistics can predict Yfiler® Plus peak height variance
• Need:
  – Accurate average starting copy number (λ)
  – Signal proportional to template (linear)
  – Dissociated alleles (extracted DNA)

NIST SRM® 2372A

• NIST Standard Reference Material (SRM® 2372A)
  – Single source male (white blood cell) extract that was developed as a standard for Human DNA Quantitation*
  – 57 ng/uL concentration (known λ)
  – Most accurate template available

*Kline, et al., Production and certification of NIST Standard Reference Material 2372 Human DNA Quantitation Standard, Anal. Bioanal. Chem. 394 (2009) 1183-1192.

Sensitivity Study 2372A

• DNA Dilution Series

Template (pg)   # of replicates   Loci   # expected alleles   Avg starting copy # (λ)
1000            2                 27     54                   151.5
500             2                 27     54                   75.8
250             4                 27     108                  37.9
125             8                 27     216                  18.9
62.5            14                27     378                  9.5
31.25           14                27     378                  4.7
15.6            14                27     378                  2.4
7.8             14                27     378                  1.2

Assuming 6.6 pg per diploid cell (one copy of each Y locus per male cell), λ = template (pg) / 6.6 pg.
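As a quick check on the "Avg starting copy #" column, the conversion from template mass to average Y-locus copy number can be sketched as below. It assumes, as the slide does, 6.6 pg per diploid cell, with one Y-chromosome copy per cell. The names `PG_PER_DIPLOID_CELL` and `avg_copies` are illustrative.

```python
PG_PER_DIPLOID_CELL = 6.6  # assumption stated on the slide

def avg_copies(template_pg):
    """Average starting copy number (lambda) for a single-copy Y locus."""
    return template_pg / PG_PER_DIPLOID_CELL

dilution_series_pg = [1000, 500, 250, 125, 62.5, 31.25, 15.6, 7.8]
for pg in dilution_series_pg:
    print(f"{pg:>7} pg -> lambda = {avg_copies(pg):.1f} copies")
```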

Linearity

[Figure: empirical avg PH (RFU, 0 to 12000) vs. average copy number (0 to 160), with linear fit y = 47.66x - 33.196]

RFU/copy = ~48

Signal per starting copy #

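[Figure: average signal per starting copy (RFU, 0 to 60) vs. template (pg: 7.8, 15.6, 31.25, 62.5, 125, 250, 500, 1000). Data labels, in the order extracted (their mapping to template amounts is not recoverable): 49.8, 46.8, 50.2, 50.9, 41.2, 46.4, 41.2, 48.9]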

Detection Sensitivity Analytical Threshold (in copies)

$$AT_c = \frac{AT_{RFU}}{RFU/\text{copy}} = \frac{175}{48.9} = 3.6$$

For reference, 2372 with ID+ (25 uL volume, 28 cycles) on the 3500 at an AT of 175 RFU gave an ATc of 4.1 (175/42.4 = 4.1).

$$AT_{pg} = 3.6\ \text{copies} \times 6.6\ \text{pg/copy} \approx 24\ \text{pg}$$

For repeated 24 pg amps, on average, half the alleles will be above the AT and half will be below.
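The threshold conversions on this slide are simple arithmetic. A minimal sketch follows, using the slide's values (175 RFU, 48.9 RFU per copy, 6.6 pg per copy). The function names are illustrative and not from the presentation.

```python
def at_in_copies(at_rfu, rfu_per_copy):
    """Analytical threshold expressed in starting copies (ATc)."""
    return at_rfu / rfu_per_copy

def copies_to_pg(copies, pg_per_copy=6.6):
    """Template mass (pg) whose average copy number equals `copies`."""
    return copies * pg_per_copy

atc_yfp = at_in_copies(175, 48.9)   # Yfiler Plus: ~3.6 copies
atc_idp = at_in_copies(175, 42.4)   # ID+ reference value: ~4.1 copies
print(round(atc_yfp, 1), round(atc_idp, 1))
print(round(copies_to_pg(round(atc_yfp, 1))))  # ~24 pg
```

When the average copy number equals the ATc, a symmetric sampling model puts half of the draws below the threshold. That matches the half-above, half-below statement for repeated 24 pg amps.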

STRBase http://www.cstl.nist.gov/strbase/ystrpos1.htm

Alleles on the Y chromosome are associated in terms of inheritance, but dissociated in terms of sampling from extracted DNA.

Sequences searched by using BLAT Human Genome Assembly (Feb. 2009)

STR          Y Location   Distance to next locus
DYS393       3.131E+06    1.140E+06
DYS456       4.271E+06    2.590E+06
DYS570       6.861E+06    1.920E+05
DYS576       7.053E+06    8.148E+05
DYS458       7.868E+06    3.502E+05
DYS449       8.218E+06    2.080E+05
DYS481       8.426E+06    2.240E+05
DYS627       8.650E+06    8.719E+05
DYS19        9.522E+06    4.581E+06
DYS391       1.410E+07    2.772E+05
DYS635       1.438E+07    8.711E+04
DYS437       1.447E+07    4.826E+04
DYS439       1.452E+07    9.680E+04
DYS389 I     1.461E+07
DYS389 II    1.461E+07    3.257E+05
DYS438       1.494E+07    2.337E+06
DYS390       1.727E+07    4.512E+04
DYS518       1.732E+07    1.073E+06
DYS533       1.839E+07    3.506E+05
Y GATA H4    1.874E+07    2.058E+06
DYS385 a     2.080E+07    4.023E+04
DYS385 b     2.084E+07    2.090E+05
DYS460       2.105E+07    1.583E+06
DYS392       2.263E+07    1.731E+06
DYS448       2.436E+07    1.566E+06
DYF387S1     2.593E+07    2.100E+06
DYF387S1     2.803E+07

The minimum distance between the 27 Yfiler Plus loci is approximately 40,000 bases (except for DYS389I/II). High quality extracted DNA is ~10,000-25,000 bases so each Y locus should be dissociated and follow sampling statistics (with the exception of DYS389I/II).

Empirical vs. Predicted

• Results:
  – Comparison of the Yfiler® Plus empirical dropout rate to the dropout rate predicted by pre-PCR sampling statistics

[Figure: empirical PHs and empirical avg PH (RFU, 0 to 2000) vs. template (pg, 0 to 125), with the 175 RFU analytical threshold marked]

Empirical Frequency of Dropout at 175 RFU

Empirical Dropout:

pg      # alleles < 175 RFU   Total expected alleles   Fr(D) < 175 RFU
7.8     357                   378                      0.94
15.6    295                   378                      0.78
31.25   139                   378                      0.37
62.5    28                    378                      0.074
125     0                     216                      0

Probability of Dropout at ATc of 3.6

POISSON: A discrete probability distribution (e.g. 1, 2, 3, etc.), so to simulate the continuous peak height data we use a Normal approximation to the Poisson:

NORM.DIST(x, mean, stdev, TRUE)

pg      copies (λ)   NORM.DIST(ATc, λ, sqrt(λ), TRUE)
7.8     1.18         0.99
15.6    2.36         0.79
31.25   4.73         0.30
62.5    9.47         0.028
125     18.94        2.11E-04

For example, starting with 31.25 pg, P(D) = NORM.DIST(ATc, λ, sqrt(λ), TRUE) = NORM.DIST(3.6,4.7,2.18,TRUE) = 0.306924
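The NORM.DIST calculation can be sketched in Python with only the standard library. It is a Normal approximation to the Poisson with mean λ and standard deviation √λ, evaluated at the ATc. The names `normal_cdf` and `p_dropout` are illustrative. The default ATc of 3.6 copies comes from the slides.

```python
import math

def normal_cdf(x, mean, stdev):
    """Cumulative Normal probability, equivalent to NORM.DIST(x, mean, stdev, TRUE)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (stdev * math.sqrt(2.0))))

def p_dropout(lam, atc=3.6):
    """Predicted dropout probability: chance the sampled copy number falls below ATc."""
    return normal_cdf(atc, lam, math.sqrt(lam))

# Average copy numbers (lambda) from the dilution series
for pg, lam in [(7.8, 1.18), (15.6, 2.36), (31.25, 4.73), (62.5, 9.47), (125, 18.94)]:
    print(f"{pg:>6} pg  lambda={lam:<6} P(D)={p_dropout(lam):.3g}")

# The slide's worked example uses rounded inputs (4.7 and 2.18)
print(round(normal_cdf(3.6, 4.7, 2.18), 6))
```

The worked example and the table row for 31.25 pg differ slightly in the third decimal. The example rounds λ to 4.7, while the table uses 4.73.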

Fr(D) from Sensitivity Study vs. P(D) using λ and Sampling Statistics

[Figure: P(D) or Fr(D) for Yfiler® Plus (0 to 1) vs. template (pg) in the PCR (0 to 140). Series: P(D) Normal at ATc 3.6; P(D) LR Analysis (1-18 cells) at ATc 3.6; Fr(D) at AT 175 RFU]

Logistic Regression:

$$P(D) = \frac{1}{1 + e^{-(a + bx)}}$$

For ATc 3.6, a = 3.029319, b = −0.11497
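A minimal sketch of evaluating the fitted logistic curve follows. The slide does not define x explicitly. The code assumes x is the template amount in pg, matching the plot's x-axis. That is an interpretation, not something the source states. The function name `p_dropout_lr` is illustrative.

```python
import math

A = 3.029319   # intercept from the slide (ATc 3.6)
B = -0.11497   # slope from the slide (ATc 3.6)

def p_dropout_lr(x, a=A, b=B):
    """Logistic-regression dropout probability, P(D) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# Assumption: x is template in pg
for pg in [7.8, 15.6, 31.25, 62.5, 125]:
    print(f"{pg:>6} pg  P(D)={p_dropout_lr(pg):.3g}")
```

Under that assumption the curve gives roughly 0.36 at 31.25 pg, close to the empirical Fr(D) of 0.37 at that input.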

SWGDAM Interpretation Guidelines for Y-Chromosome STR Typing by Forensic DNA Laboratories (Jan. 9, 2014)

• 3.2.1 The laboratory should establish guidelines for the identification of such null alleles. As an example, this guideline may be based on experimental studies designed to distinguish a null allele from an undetected allele resulting from low template amounts, DNA degradation or inhibition.

• 5.2 The laboratory should establish a stochastic threshold for known multi-copy Y-STR loci based on empirical data derived within the laboratory and specific to the quantitation and amplification systems (e.g., kits) and the detection instrumentation used. It is noted that a stochastic threshold may be established by assessing peak height ratios across any multi-copy locus in a dilution series of DNA amplified in replicate. The RFU value above which it is reasonable to assume that, at a given locus, allelic dropout of a sister allele has not occurred constitutes a stochastic threshold.

Yfiler® Plus Plot - 31 pg

[Figure: Yfiler® Plus plot at 31 pg, with two "?" annotations]

Probability of Dropout vs. 2372 Average Peak Height Could Aid Null Allele Assessment

[Figure: P(D) or Fr(D) for Yfiler® Plus (0 to 1) vs. 2372 average peak height (RFU, 0 to 1000). Series: P(D) LR Analysis (1-18 cells) at ATc 3.6; Fr(D) at AT 175 RFU]

PHRs for Multi-copy Loci

• PHRs for Y kits
  – Very few multi-copy loci to measure PHRs from
    • For example, 10 amps with YFP yields only 10*2 = 20 PHRs as compared to ~120 PHRs for 10 amps with ID+ (~12 het loci*10)
  – Can use Poisson distribution to predict PHR dispersion
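One way to generate PHRs from the Poisson model is sketched below, as a rough illustration rather than the authors' procedure. Each of the two alleles at a multi-copy locus is drawn independently from a Poisson with the same average copy number. The PHR is the smaller count over the larger, and pairs where either allele falls below the ATc are skipped. The helper names, the seed, and the use of Knuth's sampling method are all choices made for this example.

```python
import math
import random

def poisson_draw(lam, rng):
    """One Poisson sample via Knuth's multiplication method (fine for modest lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_phrs(lam, n_pairs, atc=3.6, seed=0):
    """Simulate peak height ratios for allele pairs that both clear the ATc."""
    rng = random.Random(seed)
    phrs = []
    for _ in range(n_pairs):
        a, b = poisson_draw(lam, rng), poisson_draw(lam, rng)
        if min(a, b) >= atc:          # both alleles detected
            phrs.append(min(a, b) / max(a, b))
    return phrs

# lambda values taken from the dilution series (1000 pg and 62.5 pg)
for lam in (151.5, 9.5):
    phrs = simulate_phrs(lam, 1000)
    print(lam, len(phrs), round(sum(phrs) / len(phrs), 2), round(min(phrs), 2))
```

Because the sketch models only pre-PCR sampling, it leaves out PCR and injection variation. The simulated dispersion should therefore be read as a lower bound on what a laboratory would observe.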

Poisson Generated PHRs

Stochastic Threshold for Multi-copy Loci

• Stochastic Threshold
  – Can set an ST relative to P(D) risk using logistic regression curve at the ATc

Logistic Regression (Semi-Log Plot) (Normal approx. of Poisson) for multi-copy ST

[Figure: semi-log plot of the logistic regression of P(D), with 17 copies marked]

A “peak height” of 17 copies has a “sister allele” dropout probability of 1 in 1000 (using ATc of 3.6). 17 copies x 48.9 RFU/copy = 831 RFU
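The unit conversion behind the 831 RFU figure can be sketched as follows. The semi-log regression curve itself is not reproduced in the extracted text, so the second function only evaluates the plain Normal approximation at 17 copies as a rough cross-check. It need not match the regression value exactly. Function names are illustrative.

```python
import math

RFU_PER_COPY = 48.9  # signal per starting copy, from the slides

def copies_to_rfu(copies, rfu_per_copy=RFU_PER_COPY):
    """Convert a threshold in starting copies to an RFU value."""
    return copies * rfu_per_copy

def p_dropout_normal(lam, atc=3.6):
    """Normal approximation to Poisson: P(sampled copies < ATc) with mean lam."""
    z = (atc - lam) / math.sqrt(lam)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(round(copies_to_rfu(17)))          # stochastic threshold in RFU
print(f"{p_dropout_normal(17):.1e}")     # rough sister-allele dropout estimate
```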

Conclusions

A sensitivity study was conducted with NIST SRM® 2372A and Yfiler® Plus on the 3500. Average peak heights were shown to be proportional to input template amount. Pre-PCR sampling statistics predicted empirical dropout rates at the analytical threshold (175 RFU, or an ATc of 3.6 copies). Amplification of an accurate standard can serve as a good starting point for characterizing a system’s sensitivity and estimating dropout probabilities, ST, and PHR ranges.

Limitations

• Repeat in the presence of large amounts of female DNA to verify RFU per copy values
• Systematic signal differences
  – Run-to-run
  – Inter-color or inter-locus signal differences
• Predictions assume extracted, diluted DNA (100% dissociated model)
  – Poisson sampling will overestimate the variance if performing direct amps (associated alleles)

Thank you!

Questions?
