Tests utilizing read data- Andrew, Yu Under development Use number of reads and proportion of...


Page 1:

Tests utilizing read data- Andrew, Yu

• Under development
• Use number of reads and proportion of variant reads at a site directly
• Case-control burden and collapsing tests
• Rare variant transmission distortion tests
• May provide an effective solution to differential coverage problem

• Wiki: ATAV
• ATAV: yes with new system
• Regulated: ATAV

Page 2:

Three-group analysis (gene-level)- Andrew, Janice

• Analyzes the effect of variants across three groups (as opposed to just case-control)
  – Accounting for the fact that only some of the possible patterns of allele frequencies across these groups are biologically plausible
• Can take covariates
• Single locus analysis is up and running (in R)
• Gene-level analysis under development

• Wiki: yes if widely usable and not in ATAV
• ATAV: yes
• Regulated: yes/ATAV

Page 3:

De-novo-Poisson-Tester- Andrew, Yu, Yujun

• Identifies whether there is an enrichment of de novo mutations
• Weighted version under development
  – Incorporate functional data (e.g., Polyphen2) directly into the test statistic

• Wiki: maybe
  – Distribution probably not needed unless it will be run many times on smaller datasets
• ATAV: no
• Regulated: maybe
  – If only being run as a final analysis on large datasets, precise methods can be determined each time
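
The enrichment idea can be sketched as a one-sided Poisson test. This is only an illustration, not the De-novo-Poisson-Tester itself. It assumes the expected count is 2 x (number of trios) x (per-chromosome mutation rate for the gene), and the function names, trio count, rate and observed count below are all hypothetical. The weighted version is not covered.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected)."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    if observed <= 0:
        return 1.0
    # Start from P(X = observed), computed in log space for stability.
    log_term = -expected + observed * math.log(expected) - math.lgamma(observed + 1)
    term = math.exp(log_term)
    total = 0.0
    k = observed
    # Add successive terms until they no longer change the sum.
    while term > 0.0 and (term > total * 1e-16 or k <= expected):
        total += term
        k += 1
        term *= expected / k
    return min(1.0, total)


def expected_de_novos(n_trios: int, per_chrom_rate: float) -> float:
    """Assumed model: each child carries two copies of the gene."""
    return 2 * n_trios * per_chrom_rate


# Hypothetical numbers, for illustration only.
expected = expected_de_novos(n_trios=1000, per_chrom_rate=1e-5)
p_value = poisson_upper_tail(observed=3, expected=expected)
print(f"expected={expected:.4f}  p={p_value:.3g}")
```

Summing the upper tail directly, rather than taking one minus the lower tail, keeps very small p-values from being lost to rounding.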

Page 4:

eQTL analyses- Andrew, Chuck

• Under development
• Score test for tissue-specific expression quantitative trait loci
• What projects will use this?

• Wiki: if widely usable
• ATAV: no
• Regulated: probably

Page 5:

Fisher’s Exact Test Permutation Tool- Quanli

• Hundreds of thousands of times faster than R
• Should be kept in mind when tools need permutation

• Wiki: no
• ATAV: no
• Regulated: no
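
For reference, a plain version of what is being permuted might look like the sketch below. It is not Quanli's tool and makes no attempt at speed. The 2x2 layout (rows are case/control, columns are carrier/non-carrier), the function names and the example data are all assumptions for illustration.

```python
import math
import random


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = math.comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def permutation_p(carrier, is_case, n_perm=1000, seed=0):
    """Shuffle case/control labels and return (observed p, empirical p)."""
    rng = random.Random(seed)

    def table_p(labels):
        a = sum(1 for x, y in zip(carrier, labels) if x and y)      # case carriers
        b = sum(1 for x, y in zip(carrier, labels) if not x and y)  # case non-carriers
        c = sum(1 for x, y in zip(carrier, labels) if x and not y)  # control carriers
        d = len(labels) - a - b - c                                 # control non-carriers
        return fisher_exact_two_sided(a, b, c, d)

    observed = table_p(is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if table_p(labels) <= observed * (1 + 1e-7):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 8 of 20 cases and 1 of 20 controls carry a qualifying variant.
carrier = [True] * 8 + [False] * 12 + [True] * 1 + [False] * 19
is_case = [True] * 20 + [False] * 20
observed_p, empirical_p = permutation_p(carrier, is_case, n_perm=200)
print(observed_p, empirical_p)
```

Every permutation recomputes the exact test, which is why a fast implementation matters once this is repeated across many genes.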

Page 6:

SV-Simu-Viewer- Yujun

• Under development
• Simulates and then creates pictures of SVs

• Wiki: probably not
  – Is it used for multiple projects?
• ATAV: no
• Regulated: no

Page 7:

Somatic-Mutation-Rater- Andrew, Yujun

• Under development
• Calculates and compares somatic mutation rates

• Wiki: yes if widely usable
• ATAV: no (maybe if somaticannoDB)
• Regulated: yes if widely used

Page 8:

Novel-Seq-Finder- Yujun

• Under development
• A pipeline for acquiring novel sequences that are not in the reference

• Wiki: no
• ATAV: no
• Regulated: no

Page 9:

Regulatory RVIS- Ayal, Quanli, Slave, Andrew

• Under development
• CHGV-based non-coding measures of evolutionary constraint

• Wiki: yes
• ATAV: yes
• Regulated: yes

Page 10:

Artifact flagging- Slave

• Continues to be under development; new analysis done around Christmas time (he said?)
• Putative artifacts and sites of preferential alignment/alternative error are flagged for suggested exclusion of variants
  – Through comparison with EVS so far
• Warnings provided for genes that have excess artifacts
• Now extending to flagging sites of repetitive high-confidence de novo mutation calls across multiple trios

• Wiki: ATAV
• ATAV: already uses
• Regulated: ATAV
  – Careful consideration and announcement when changing artifact file

Page 11:

DNM Filter- Yongzhuang and Xiaolin

• They are going to test this on the malformations and see how it does; may be better for genomes
• Uses machine learning to filter candidate DNM calls from GATK or other trio-aware callers
  – Does it require trio-aware callers, or can it use multi-sample calls?
• Training data built from 264 Epi4K trios and used to predict true DNMs from candidate DNM call set
• Selects highly confident DNM calls by generating a score for each candidate DNM
  – Can specify ranking and thresholding methods for scoring
• Designed to expedite obtaining highly confident DNM calls from WGS trio analysis
• Has it been tested against current de novo pipeline?

• Wiki: yes if outperforms current method
• ATAV: no
• Regulated: yes if outperforms current method
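
The ranking and thresholding step might look roughly like the sketch below. It does not use the DNM Filter's actual interface. It assumes each candidate already has a classifier score where higher means more likely a true DNM, and the `Candidate` fields, function names and scores are all hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    trio_id: str   # hypothetical trio identifier
    site: str      # hypothetical "chrom:pos" label
    score: float   # assumed: higher means more likely a true DNM


def select_by_threshold(candidates: List[Candidate], min_score: float) -> List[Candidate]:
    """Keep candidates scoring at or above min_score, best first."""
    kept = [c for c in candidates if c.score >= min_score]
    return sorted(kept, key=lambda c: c.score, reverse=True)


def select_top_k(candidates: List[Candidate], k: int) -> List[Candidate]:
    """Keep the k highest-scoring candidates, whatever their scores."""
    return sorted(candidates, key=lambda c: c.score, reverse=True)[:k]


# Made-up scores, for illustration only.
candidates = [
    Candidate("trio1", "chr1:1000", 0.97),
    Candidate("trio1", "chr2:2000", 0.42),
    Candidate("trio2", "chr3:3000", 0.88),
]
confident = select_by_threshold(candidates, min_score=0.8)
print([c.site for c in confident])
```

A fixed threshold keeps the confidence level comparable across call sets, while top-k keeps the number of calls to validate fixed.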