Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework...

24
Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA) avid Coffman (Florida Department of Law Enforcement, Tallahassee, FL lia A. Crouse & Felipe Konotop (Palm Beach County Sheriff’s Office, Jeffrey D. Ban (Division of Forensic Science, Richmond, VA) NIJ grants 2000-IJ-CX-K005 & 2001-IJ-CX-K003

Transcript of Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework...

Page 1: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Automated STR Data Analysis:Validation Studies

Automated AnalysisDatabasing ValidationCasework Studies

Mark W. Perlin (Cybergenetics, Pittsburgh, PA)David Coffman (Florida Department of Law Enforcement, Tallahassee, FL)

Cecelia A. Crouse & Felipe Konotop (Palm Beach County Sheriff’s Office, FL) Jeffrey D. Ban (Division of Forensic Science, Richmond, VA)

NIJ grants 2000-IJ-CX-K005 & 2001-IJ-CX-K003

Page 2: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Reviewing STR DataReviewing STR DataHuman data review bottleneckHuman data review bottleneck

Computer AutomationComputer AutomationQuality AssuranceQuality AssuranceDatabase IntegrityDatabase IntegrityCasework & MixturesCasework & Mixtures

Key goals:Key goals:• • no errorno error• • high throughputhigh throughput• • small staffsmall staff

Page 3: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

TrueAllele™ TechnologyTrueAllele™ TechnologyEliminates STR human review bottleneckEliminates STR human review bottleneck

Gel-based, orGel-based, orSequencer, orSequencer, orCapillaryCapillary

Raw STRRaw STRData Data INPUTINPUT

QualityQualityAssuredAssuredProfilesProfiles

DatabaseDatabaseOUTPUTOUTPUT

Fully AutomatedFully Automated(on Mac/PC/Unix)(on Mac/PC/Unix)Color SeparationColor SeparationImage ProcessingImage ProcessingLane TrackingLane TrackingSignal AnalysisSignal AnalysisLadder BuildingLadder BuildingPeak QuantificationPeak QuantificationAllele DesignationAllele DesignationQuality CheckingQuality CheckingCODIS ReportingCODIS Reporting

Protected by US patents 5,541,067 & 5,580,728 & 5,876,933 & 6,054,268Protected by US patents 5,541,067 & 5,580,728 & 5,876,933 & 6,054,268

Page 4: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Automated ProcessingAutomated Processing1.Input1.Input 2.Gel/CE2.Gel/CE 3.Allele3.Allele 4.Output4.Output

Page 5: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Quality AssuranceQuality Assurance

Page 6: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Rule SystemRule System

Good data. “Low” optical density?

User sets criteria.No need to review.No rules fired.

Page 7: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

ABI/3100 plate MegaBACE

Multi-Platform EngineMulti-Platform Engine

Page 8: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Validation MethodsValidation Methods

1. Obtain original data1. Obtain original data2. Process data in TrueAllele ES2. Process data in TrueAllele ES (auto-setup, process run, Q/A, (auto-setup, process run, Q/A, call alleles, apply rules, check)call alleles, apply rules, check) computer: accept/reject/editcomputer: accept/reject/edit3. Review all data3. Review all data one person, many computersone person, many computers human: accept/reject/edithuman: accept/reject/edit4. Generate results & stats4. Generate results & stats

Page 9: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

ExtractExtract

AmplifyAmplify

SeparateSeparate

OtherOther

Rule SettingsRule Settings

Page 10: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Hitachi + PowerPlexHitachi + PowerPlexHitachi FM/Bio2 & Promega PowerPlex 1.2Hitachi FM/Bio2 & Promega PowerPlex 1.2

Computer: ~75%* data, no review needed Computer: ~75%* data, no review needed Human: All these designations correctHuman: All these designations correct

~8,000 PBSO genotypes reviewed~8,000 PBSO genotypes reviewedTrueAllele performed all gel & allele processingTrueAllele performed all gel & allele processing

TrueAllele expert system can eliminateTrueAllele expert system can eliminatemost human review of gel STR datamost human review of gel STR data

Page 11: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Hitachi ResultsHitachi Results

Accept Edit RejectAccept Edit RejectR

ejec

t

E

dit

A

ccep

tR

ejec

t

E

dit

A

ccep

tHuman ReviewHuman Review

Co

mp

ute

r P

roce

ssC

om

pu

ter

Pro

cess

non-automation datanon-automation data

Page 12: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

ABI/310 + Pro/CoFilerABI/310 + Pro/CoFilerABI 310 & ProfilerPlus/CofilerABI 310 & ProfilerPlus/Cofiler

Computer: ~85% data, no review needed Computer: ~85% data, no review needed Human: All proper designations correctHuman: All proper designations correct

~24,000 FDLE genotypes reviewed~24,000 FDLE genotypes reviewedTrueAllele performed all CE & allele processingTrueAllele performed all CE & allele processing

TrueAllele expert system can eliminateTrueAllele expert system can eliminatemost human review of CE STR datamost human review of CE STR data

Page 13: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

310 Results310 Results

Accept Edit RejectAccept Edit RejectR

ejec

t

E

dit

A

ccep

tR

ejec

t

E

dit

A

ccep

tHuman ReviewHuman Review

Co

mp

ute

r P

roce

ssC

om

pu

ter

Pro

cess

94 genos/min94 genos/min

Page 14: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

ABI/3700 + Pro/CoFilerABI/3700 + Pro/CoFilerABI 3700 & ProfilerPlus/CofilerABI 3700 & ProfilerPlus/Cofiler

Computer: ~85% data, no review needed Computer: ~85% data, no review needed Human: All TA/ES designations correctHuman: All TA/ES designations correct

~17,000 FDLE genotypes reviewed~17,000 FDLE genotypes reviewedTrueAllele performed all CE & allele processingTrueAllele performed all CE & allele processing

TrueAllele expert system can eliminateTrueAllele expert system can eliminatemost human review of CE-array STR datamost human review of CE-array STR data

Page 15: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

3700 Results3700 Results

Accept Edit RejectAccept Edit RejectR

ejec

t

E

dit

A

ccep

tR

ejec

t

E

dit

A

ccep

tHuman ReviewHuman Review

Co

mp

ute

r P

roce

ssC

om

pu

ter

Pro

cess

76 genos/min76 genos/min

Page 16: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

The UK FSS ExperienceThe UK FSS ExperienceGenerate STR DataGenerate STR Data

UK National DNA DatabaseUK National DNA Database

Person reviewsPerson reviewsa fractiona fraction

of the dataof the data

TrueAllele expert systemTrueAllele expert systemscores all STR data and scores all STR data and assesses data qualityassesses data quality

Page 17: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

FSS ABI/377 ValidationFSS ABI/377 ValidationResourcesResources• • Data: 22,000 genotypes (SGMplus)Data: 22,000 genotypes (SGMplus)• • People: 6 reviewers + 6 managers People: 6 reviewers + 6 managers • • Time: 8 weeks work + 4 weeks reportTime: 8 weeks work + 4 weeks report

ComponentsComponents• • Peak height correlation (GS vs TA)Peak height correlation (GS vs TA)• • Establish baseline height (error-free)Establish baseline height (error-free)• • Designation accuracy (human vs TA)Designation accuracy (human vs TA)• • Network/computer environmentNetwork/computer environment• • QMS documentationQMS documentation

ResultsResults• • Greater yield with TAGreater yield with TA• • No errors on quality dataNo errors on quality data

Page 18: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Casework StudiesCasework Studies

NonmixtureNonmixtureMixtureMixtureRape KitsRape KitsDisastersDisastersLCN, SNPsLCN, SNPs

M.W. Perlin and B. Szabady, M.W. Perlin and B. Szabady, ““Linear mixture analysis: Linear mixture analysis: a mathematical approach to resolving mixed DNA samples,a mathematical approach to resolving mixed DNA samples,””

Journal of Forensic SciencesJournal of Forensic Sciences, November, 2001., November, 2001.

Page 19: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Statistical InformationStatistical Information

Degenerate SGM+ alleles (6x6x6x10x...x6)Degenerate SGM+ alleles (6x6x6x10x...x6) 100,000,000 feasible profiles100,000,000 feasible profilesCompute unknown minor contrib profilesCompute unknown minor contrib profiles @30% @30% 1 feasible profile1 feasible profile @10% @10% 100 feasible profiles100 feasible profiles

LMA increases identification power LMA increases identification power a million-fold; a million-fold; CODIS matchCODIS match

Prob{Prob{STR profile STR profile }}& peak quants& peak quants}}

Page 20: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Validation Data SetsValidation Data Sets

Collaborators:Collaborators: Florida, Virginia, Florida, Virginia, New York, FBI, UK, Private LabsNew York, FBI, UK, Private Labs

Data:Data: synthetic lab mixtures, synthetic lab mixtures, casework, rape kits, disasterscasework, rape kits, disasters

Studies:Studies: comparison, concordance, comparison, concordance,automated lab processesautomated lab processes

Input:Input: TrueAllele quantitated peaks TrueAllele quantitated peaks

Page 21: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Lab: 70% victim, Lab: 70% victim, 30% unknown30% unknownComputer: 71%, 29%Computer: 71%, 29%

Rape Kit: Unknown SuspectRape Kit: Unknown Suspect

Known victimKnown victimUnknown suspectUnknown suspectFlorida data (FDLE, PBSO)Florida data (FDLE, PBSO)9:1, 7:3, 5:5, 3:7, 1:9 ratios9:1, 7:3, 5:5, 3:7, 1:9 ratiosSix pairs of samplesSix pairs of samples

11//1010 22//66

Page 22: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Other Mixing ProportionsOther Mixing Proportions

Lab: 90% victim, 10% unknownComputer: 90%, 10%

Lab: 50% victim, 50% unknownComputer: 52%, 48% Lab & review automation

Page 23: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

Disaster Data ReviewDisaster Data Review

Page 24: Automated STR Data Analysis: Validation Studies Automated Analysis Databasing Validation Casework Studies Mark W. Perlin (Cybergenetics, Pittsburgh, PA)

ConclusionsConclusions

• • TrueAllele TrueAllele databasing validationdatabasing validation• • Reduce time, error, staff & costReduce time, error, staff & cost

• • Ongoing Ongoing casework validationcasework validation• • Automate: data review & lab workAutomate: data review & lab work• • Serve: police, courts, societyServe: police, courts, society• • Objective, comprehensiveObjective, comprehensive