Biology of STRs - University of Vermontbiology/Classes/296D/6_Biology.pdf · Stuttering •...


Biology of STRs

Artifacts in Genotyping STRs

• A number of artifacts are possible:
– Stuttering
– Non-template additions
– Microvariants
– Three peaks
– Allele dropouts
– Mutations

• All interfere with reading a DNA profile accurately and consistently

Stuttering

• Stuttering is caused by the very structure of STRs that makes them good markers
• They are repeats
• That are highly polymorphic
• A stutter product is a band that has the wrong number of repeats
• Either one repeat more or one less
• Caused by strand slippage

Strand-slippage

[Figure: strand slippage in a GT dinucleotide repeat during DNA replication or PCR. Panels show elongation along the 5’ to 3’ template and misalignment, with a GT repeat unit looped out.]

Strand Slippage

• Occurs during the extension step of PCR
• The newly formed strand of DNA skips one repeat unit and resumes complementary base pairing at the next repeat
• This pushes out a non-base-paired loop from the template strand of DNA
• Usually causes a deletion of one repeat unit, so the band will be one unit smaller than the true genotype
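The size consequence of a one-unit slip can be sketched in a few lines of Python. This is only an illustration: the function name `slip_one_repeat` and the flanking sequences are hypothetical, not taken from any real STR locus.

```python
def slip_one_repeat(repeat_unit, n_repeats, flank5="", flank3=""):
    """Return (true_product, stutter_product) for a simple STR amplicon.

    The stutter product models the usual outcome of strand slippage:
    one repeat unit is skipped, so the product is one unit shorter.
    """
    if n_repeats < 2:
        raise ValueError("need at least two repeats to lose one")
    true_product = flank5 + repeat_unit * n_repeats + flank3
    stutter_product = flank5 + repeat_unit * (n_repeats - 1) + flank3
    return true_product, stutter_product


# Hypothetical GT repeat with made-up flanking sequence
true_seq, stutter_seq = slip_one_repeat("GT", 6, flank5="ATGCGGCGGC", flank3="GGCG")
size_difference = len(true_seq) - len(stutter_seq)
print(len(true_seq), len(stutter_seq), size_difference)
```

The size difference always equals the repeat unit length, which is why a stutter band lands exactly where a neighbouring allele would.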

Strand Slippage

• In nature, this is the mechanism that makes repeats polymorphic
• When it happens during PCR it can produce a band that is not real:
– Genotype will be wrong
– One repeat unit lower or higher than reality
• Rarer in tetranucleotides than in shorter (di- and trinucleotide) repeats, which is why tetranucleotides are used

Amount of Stutter Product

• Stutter is usually rare
• Therefore it may show up as a small bump that can usually be differentiated from a true band
• The earlier in the PCR reaction the strand slips:
– The more stutter product will be produced
• Or if the genotyping protocol doesn’t work well, the true band may be very low
– Difficult to separate stutter band from true band

Stutter Products

[Figure: example genotype traces with three peaks labeled “Stutter” and one peak marked “?”.]

Call these genotypes:

Calling Alleles

• Biggest problem with stutter bands:
– They are the same size as a real allele!
• Especially difficult if you know the DNA sample is mixed
• Or you are unsure whether sample has been contaminated
• Difficult to distinguish:
– Stutter band
– Minor allele (because less DNA)

13 CODIS STR Loci

• All produce some stutter products
• Longer alleles produce more stuttering
– Why does this make sense?
• Stutter percentages for tetranucleotides:
– From less than 1%
– Up to 15% of the true allele’s peak height
– Therefore always calculate the small band’s peak height as a percentage of the large band’s
– Be sure it is < 15% of the height of the large band
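The percentage check described above can be written as a short Python sketch. The function names and the peak heights (in RFU) are hypothetical; the 15% ceiling is the figure quoted on the slide, and real laboratories validate their own locus-specific thresholds.

```python
def stutter_percent(minor_height, major_height):
    """Height of the small peak as a percentage of the large peak."""
    if major_height <= 0:
        raise ValueError("major peak height must be positive")
    return 100.0 * minor_height / major_height


def could_be_stutter(minor_height, major_height,
                     minor_repeats, major_repeats, max_percent=15.0):
    """True if the small peak is one repeat away and under the threshold."""
    one_unit_apart = abs(minor_repeats - major_repeats) == 1
    return one_unit_apart and stutter_percent(minor_height, major_height) < max_percent


# Hypothetical peak heights in RFU
small_peak = stutter_percent(120, 1500)                 # 8% of the main peak
plausible_stutter = could_be_stutter(120, 1500, 11, 12)
too_tall = could_be_stutter(400, 1500, 11, 12)          # about 27%, too high
print(small_peak, plausible_stutter, too_tall)
```

A peak that fails this check is not automatically a true allele; it only means stutter alone is an unlikely explanation.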

Reducing Stuttering Products

• Changing PCR conditions
• Faster DNA polymerase
– The faster it works, the less chance for slippage
• STRs with longer repeats (> 4 bp)
– More difficult to “skip” past repeat
• STRs with imperfect repeat units
– Complex and compound repeats
– More difficult to skip past a repeat if the next repeat unit sequence is different

Summary of Stutter Products

• One repeat unit more or less than real allele peaks
• Less than 15% of real allele height
• Quantity of stutter band depends on:
– When in PCR reaction first slippage occurs
– Allele size (bigger alleles, more stutter)
– PCR conditions
– Polymerase used
– Repeat length and sequence

Non-Template Additions

• Polymerase often adds an extra adenosine to the 3’ end of the newly formed strand
• Not a part of the template sequence
• Makes PCR product one base longer than actual sequence
• If your PCR reaction forms both +A and -A products then your band will be wide

Non-Template Additions

• Want to have peaks as clear as possible
• Therefore want all PCR products to be identical
• Either all +A or all -A
• Imagine a case where you were genotyping a dinucleotide, with stutter, and half the products were +A and half were -A
• Impossible to separate genotypes
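The dinucleotide case above can be made concrete by listing the product sizes. This is a minimal sketch: `product_sizes` is a hypothetical helper and the 150 bp amplicon size is made up.

```python
def product_sizes(true_size_bp, repeat_len):
    """Sizes (bp) of the true allele and its -1 repeat stutter, each with and without +A."""
    sizes = {}
    products = (
        ("true allele", true_size_bp),
        ("stutter (-1 repeat)", true_size_bp - repeat_len),
    )
    for label, base_size in products:
        sizes[label + ", -A"] = base_size
        sizes[label + ", +A"] = base_size + 1  # non-template adenosine adds one base
    return sizes


# Hypothetical dinucleotide allele whose -A product is 150 bp
dinucleotide = product_sizes(150, repeat_len=2)
peak_positions = sorted(dinucleotide.values())
print(peak_positions)
```

With a 2 bp repeat the four products sit at consecutive base positions, so stutter and partial adenylation cannot be told apart by size. With a 4 bp repeat the same function leaves a gap between the stutter pair and the true-allele pair.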

Non-Template Additions

• Set up PCR conditions so that every product will be +A

• Conditions:
– Final extension for 10 mins
– Allows all products to be fully adenylated
– Primer ends in a guanosine

• Commercially available kits turn every allele (and ladder) into +A

Overloading Sample

• Signal on gel is too strong – will be difficult to call

• May result in a split peak
• Or a peak that is off scale
• Caused by:
– Too much DNA sample in PCR reaction
– Primer concentrations too high

• Why DNA quantification is so important

Non-Template Additions and Overloading Samples

[Figure: electropherograms for D3S1358, VWA, and FGA comparing 10 ng template (overloaded) with 2 ng template (suggested level). Labels mark -A and +A peaks and an off-scale peak. x-axis: DNA Size (bp); y-axis: Relative Fluorescence (RFUs).]

Figure 6.5, J.M. Butler (2005) Forensic DNA Typing, 2nd Edition © 2005 Elsevier Academic Press

Microvariants

• Remember these are variants of the repeat that are not a full repeat unit

• Example: TH01 9.3 allele
• Unlike stutter products, microvariants are not the same size as an expected allele
• Problem is determining whether there is a true microvariant in the person
• Or are you seeing a normal band being shifted over for some genotyping reason?

Microvariants

1. True microvariants must be validated by appearing in multiple samples
• Even if a variant is rare it must show up in more than one individual to be considered a true microvariant
2. Exact distance in base pairs should be calculated
• 9.3 means 9 repeats plus 3 bases
• Always calculate in bases exactly how “off” the microvariant is
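The “9 repeats plus 3 bases” arithmetic can be sketched as a small Python helper. The function name `allele_to_bases` is hypothetical, and it counts only the repeat region, not the flanking sequence of the amplicon.

```python
def allele_to_bases(designation, repeat_len=4):
    """Convert an allele name such as '9.3' into bases of repeat region.

    '9.3' at a tetranucleotide locus means 9 full repeats plus 3 bases.
    """
    whole, _, partial = designation.partition(".")
    extra_bases = int(partial) if partial else 0
    if extra_bases >= repeat_len:
        raise ValueError("partial repeat must be shorter than a full repeat")
    return int(whole) * repeat_len + extra_bases


bases_9_3 = allele_to_bases("9.3")   # 9 * 4 + 3
bases_10 = allele_to_bases("10")     # 10 * 4
offset = bases_10 - bases_9_3
print(bases_9_3, bases_10, offset)
```

The 9.3 and 10 alleles differ by a single base, which is why the sizing has to be exact before a peak is called a microvariant.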

Sequence Microvariants

• Sometimes there are also sequence differences in these polymorphisms as well as length differences

• The only way to genotype a sequence variant is to sequence the PCR product

• Not necessary for Forensics because you are simply matching genotypes

• These variants are not important for Forensics analysis

Peaks outside of the Ladder

• Sometimes you will see a peak that is outside of the expected range for any marker (between markers?)
• What could cause this?
– Unsuccessful PCR product
– Primer dimers, etc.
– Person really has a new allele
• Check with different set of primers
• Sequence new allele and region

Three Peaks

• Sometimes three bands may be seen
• What could cause three bands?
– Stuttering
– Mixed or contaminated samples
– Genotyping error
– True duplication or extra chromosome in the individual
• Need to validate what is seen in gel

Three Peaks

1. Check other markers in panel:
• Is there evidence of mixed or contaminated samples in any other markers?
2. Check database information for this marker:
• More than 50 tri-allelic patterns have been reported across the 13 CODIS loci
3. Sequence or genotype this region:
• Is there truly a duplication or extra chromosome in this person?

Allele Dropout

• Most worrisome problem
• May call a person homozygous when really they are heterozygous
• What can this be caused by?
– Larger allele is not amplified successfully
– Primer site mutation
• Rare with chosen tetranucleotides:
– Alleles are very similar in size
– Primers have been optimized and chosen in regions that are very stable

Avoiding Allele Dropout

• Choose primers carefully
• Work with polymorphisms that have alleles of similar size
• Always check genotypes with the Hardy-Weinberg equation
– Make sure you see the expected number of heterozygotes population wide
• Most commercial kits have taken care of all these issues
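The Hardy-Weinberg check on heterozygotes can be sketched as follows. The function name and the eight genotypes are hypothetical, chosen only to show what a heterozygote deficit looks like; a real check would use a large population sample and a formal test.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for a list of (allele, allele) pairs.

    Expected heterozygosity under Hardy-Weinberg is 1 minus the sum of
    squared allele frequencies.
    """
    n_individuals = len(genotypes)
    if n_individuals == 0:
        raise ValueError("need at least one genotype")
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n_individuals
    expected = 1.0 - sum((count / total_alleles) ** 2 for count in allele_counts.values())
    observed = sum(1 for a, b in genotypes if a != b) / n_individuals
    return observed, expected


# Hypothetical sample: mostly homozygous calls at a two-allele marker
sample = [(12, 12), (12, 14), (14, 14), (12, 12),
          (14, 14), (12, 12), (14, 14), (12, 14)]
observed, expected = heterozygosity(sample)
print(observed, expected)
```

Here the observed heterozygosity (0.25) is well below the expected value (0.5). A shortfall like that in a large sample is the pattern allele dropout would produce, since heterozygotes get miscalled as homozygotes.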

“Fixing” Allele Dropouts

• Add a “degenerate” primer
– Extra primer with known polymorphism
– Three primers total will be added
• Lower annealing temperature
– Reduce the stringency of primer binding
• Remember that with Forensics what matters is matching genotypes
– As long as the allele always drops out, you don’t have a problem

Mutations

• STRs do mutate at an expected mutation rate over time

• Mutation may cause:
– New alleles
– Changes in primer binding regions
– Sequence changes (less important)
• Very rare events
• Can be validated by examining families

Mendelization of Alleles

• Using family members to determine which alleles are possible

• If you know the parents’ alleles then there are only so many genotypes possible for children
• Mendel’s law of segregation
• All STRs have been genotyped on CEPH families, huge family sets from Utah

Mendelization of Alleles

[Figure: pedigree diagram labeled with STR genotypes. Labels as extracted, in order: 3/14, 8/12; 3/8, 8/14, 3/12; 2/9, 5/11; 10/11, 2/11.]

• As always, must validate mutation
• By sequencing or regenotyping
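A Mendelian consistency check is simple to sketch in Python. The genotypes below are read from the pedigree slide above; grouping them into two families of two parents and their children is an assumption about the figure's layout, and `is_mendelian` is a hypothetical helper name.

```python
def is_mendelian(child, parent1, parent2):
    """True if the child could have inherited one allele from each parent."""
    a, b = child
    return (a in parent1 and b in parent2) or (b in parent1 and a in parent2)


# Assumed reading of the pedigree: (parent1, parent2, children)
families = [
    ((3, 14), (8, 12), [(3, 8), (8, 14), (3, 12)]),
    ((2, 9), (5, 11), [(10, 11), (2, 11)]),
]

inconsistent = []
for parent1, parent2, children in families:
    for child in children:
        if not is_mendelian(child, parent1, parent2):
            inconsistent.append(child)

print(inconsistent)
```

Under this reading, only the 10/11 child fails the check, because neither parent carries a 10. A call like that is a candidate mutation (or a genotyping error) and should be confirmed by sequencing or regenotyping.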

Mutation Rates

• Mutation rates of the 13 CODIS loci have been calculated over thousands of meioses
• All 13 are between 1 and 5 per 1,000 generational events
• Highest mutation rates:
– Markers that are most polymorphic
• Lowest mutation rates:
– Markers that are least informative
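To get a feel for these rates, the chance of seeing at least one mutation across a full 13-locus profile in a single parent-to-child transmission can be estimated. This is a rough sketch that assumes the loci mutate independently and all share the same rate, which is a simplification; `prob_any_mutation` is a hypothetical name.

```python
def prob_any_mutation(rate_per_locus, n_loci=13):
    """Probability that at least one of n_loci mutates in one transmission.

    Assumes independent loci with an identical per-locus mutation rate.
    """
    if not 0.0 <= rate_per_locus <= 1.0:
        raise ValueError("rate must be a probability")
    return 1.0 - (1.0 - rate_per_locus) ** n_loci


p_low = prob_any_mutation(1 / 1000)   # lower end of the quoted range
p_high = prob_any_mutation(5 / 1000)  # upper end of the quoted range
print(round(p_low, 4), round(p_high, 4))
```

Under these assumptions the answer falls between roughly 1% and 6% per transmission, which is why a single mismatching locus is not enough on its own to rule out a parent.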

Impact of Mutations

• Paternity testing
– Can cause problems
– Because father may not match true child if genotype has changed in child
– Compare many STR loci
• Identity matching
– Will not cause a problem
– Because mutation will be consistent over a person’s lifetime and in all tissues

Genotyping Errors

• All the previous were artifacts that can be explained
• However the problems you really worry about are unexplained errors
• Especially if sample may be:
– Contaminated
– Mixed samples
• Need to always validate any artifact
• Be sure it’s not genotyping error

Any Questions?

• Review Chapters 1 – 6

• Email me at least 2 questions you have about the first 6 chapters

• Next class will be review for Exam

• Exam One – February 5th