DAS for ENCODE data coordination. Felix Kokocinski, WTSI.
Project Overview
Goal:
Annotate all evidence-based gene features at a high accuracy across the human genome
– protein-coding loci with isoforms
– nc loci with transcript evidence
– pseudogenes
Partners:
– HAVANA & EnsEMBL, Sanger Institute, UK
– University of Lausanne, CH
– Centre for Genomic Regulation, ES
– Spanish Nat. Cancer Res. Centre, ES
– University of California Santa Cruz, USA
– Washington University St. Louis, USA
– Broad Inst. of MIT and Harvard, USA
– Yale University, USA
Manual Genome Annotation
• ~20 annotators working according to HAVANA guidelines
• computational pipeline for alignments
• Otterlace software
• input from partner groups, import of data source via DAS
• verification with RT-PCR, RACE & sequencing
Data Exchange using DAS
[Diagram. Labels: Distributed Annotation Sources; WWW interface; GenTrack (tracking system); Otterlace (annotation software); high-priority issues; experimental verification issues; Perl API; Source Adaptors; Update Scripts]
GenTrack Annotation Tracking
• extension of open-source RoR ticketing system Redmine (www.redmine.org)
• data import via DAS
• modules for analyzing and flagging data
• www.sanger.ac.uk/gentrack
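As a rough illustration of what "data import via DAS" involves on the wire, the sketch below builds a DAS `features` request URL for a genomic segment. The server address and source name are placeholders, and the helper name `das_features_url` is hypothetical; this is not GenTrack's actual import code, only the general shape of a DAS 1.53-style request.

```python
from urllib.parse import urlencode


def das_features_url(server, source, segment, start=None, stop=None, types=()):
    """Build a DAS 'features' request URL (hypothetical helper).

    server and source are placeholders for a real DAS server and data source.
    """
    if (start is None) != (stop is None):
        raise ValueError("start and stop must be given together")
    # A segment is either a whole reference sequence or 'id:start,stop'.
    seg = segment if start is None else f"{segment}:{start},{stop}"
    params = [("segment", seg)] + [("type", t) for t in types]
    # Keep ':' and ',' literal so the segment stays readable in the URL.
    query = urlencode(params, safe=":,")
    return f"{server.rstrip('/')}/das/{source}/features?{query}"


if __name__ == "__main__":
    # Example: request exon features (SO:0000147) in a region of chromosome 7.
    print(das_features_url("http://example.org", "my_source", "7",
                           1000, 2000, types=["SO:0000147"]))
```

The returned URL can be fetched with any HTTP client; the response is an XML document listing the features in the requested segment.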
GenTrack: Workflow
• Entry points:
– List of all genes & transcripts in region
– High-priority loci
– Loci with specific tags
• Identify problem, compare in Otterlace
• Resolve by
– Changing annotation or
– Disbelieving other source
– Note decision
DAS Specifics
Format: Specialized 1.53E
• <type-id>: from Sequence Ontology (exon: SO:0000147)
• <method>: (havana_manual_annotation)
• <type-category>: evidence code describing the type of method (inferred from RT-PCR experiment (ECO:0000109))
• <note>: key=value pairs
– parent, lastmod [req] (LASTMOD=2006-04-07T15:15:58+0100)
– transcripttype, etc. [opt]
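To make these conventions concrete, here is a minimal sketch of reading such a feature in Python. The sample XML is invented for illustration and follows the general DAS 1.53 `features` layout. It assumes the type id and category are carried as attributes of `TYPE` and that each key=value pair sits in its own `NOTE` element; the feature id, coordinates and parent name are placeholders, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Invented sample response; ids, coordinates and parent name are placeholders.
SAMPLE = """<DASGFF>
  <GFF version="1.0" href="http://example.org/das/my_source/features">
    <SEGMENT id="7" start="1000" stop="2000">
      <FEATURE id="exon_1" label="exon_1">
        <TYPE id="SO:0000147" category="inferred from RT-PCR experiment (ECO:0000109)">exon</TYPE>
        <METHOD id="havana_manual_annotation">havana_manual_annotation</METHOD>
        <START>1200</START>
        <END>1350</END>
        <ORIENTATION>+</ORIENTATION>
        <NOTE>PARENT=transcript_1</NOTE>
        <NOTE>LASTMOD=2006-04-07T15:15:58+0100</NOTE>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""

REQUIRED_NOTE_KEYS = ("parent", "lastmod")  # marked [req] on the slide


def parse_notes(feature):
    """Collect key=value NOTE entries into a dict with lower-cased keys."""
    notes = {}
    for note in feature.findall("NOTE"):
        key, sep, value = (note.text or "").partition("=")
        if sep:  # skip free-text notes that are not key=value pairs
            notes[key.strip().lower()] = value.strip()
    return notes


def parse_features(xml_text):
    """Return one dict per FEATURE, checking the required note keys."""
    features = []
    for feat in ET.fromstring(xml_text).iter("FEATURE"):
        notes = parse_notes(feat)
        missing = [k for k in REQUIRED_NOTE_KEYS if k not in notes]
        if missing:
            raise ValueError(f"feature {feat.get('id')} lacks notes: {missing}")
        ftype = feat.find("TYPE")
        features.append({
            "id": feat.get("id"),
            "type_id": ftype.get("id"),
            "type_category": ftype.get("category"),
            "method": feat.find("METHOD").get("id"),
            "start": int(feat.findtext("START")),
            "end": int(feat.findtext("END")),
            "notes": notes,
            # Timestamp layout taken from the LASTMOD example on the slide.
            "lastmod": datetime.strptime(notes["lastmod"], "%Y-%m-%dT%H:%M:%S%z"),
        })
    return features


if __name__ == "__main__":
    for f in parse_features(SAMPLE):
        print(f["id"], f["type_id"], f["method"], f["lastmod"].isoformat())
```

Parsing `lastmod` into a timezone-aware timestamp lets a consumer compare it against the time of its last import and pick up only features that changed since then.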
Thanks
Tim Hubbard
ENCODE partners
Andy Jenkinson
Jonathan Warren
Paul Bevan
Jody Clements
Steve Trevanion
James Gilbert
Anacode
Adam Frankish
Toby Hunt
Bronwen Aken
Steve Searle
Jennifer Harrow
Redmine.org