Peter Rice and Mahmut Uludag
EMBOSS as an Efficient DAS Annotation Source
Peter Rice, EBI ([email protected])
Mahmut Uludag, EBI ([email protected])
10th March 2009
EMBOSS: History
• European Molecular Biology Open Software Suite
• 1996: Started at Sanger Centre
• 2000: Release 1.0.0 and moved to HGMP
• 2005: Moved to EBI (HGMP closed)
• 2008: Release 6.0.0
http://emboss.sourceforge.net
EMBOSS: Status
• Open source package
• Sequence analysis
• 200 applications
• 100 third-party applications
• Reads 40 sequence formats
• Writes 40 sequence formats
• Reads 6 feature formats
• Writes 10 feature formats
EMBOSS: Interfaces
• Over 100 interfaces / packages containing EMBOSS
• Command line
• Web interfaces
• GUIs
• SOAP Web services (EMBRACE)
• Taverna workflows
• Galaxy
Overview
• EMBOSS produces annotations in DASGFF format
  • Protein sequence referencing using UniProt protein identifiers
  • Nucleotide sequence referencing using Ensembl gene identifiers
• MyDAS based annotation server
  • Executes EMBOSS programs based on the incoming requests
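As a rough illustration of how an incoming request might be turned into an EMBOSS run, the sketch below parses a DAS `features` request URL and builds a command line for the matching program. It is not the MyDAS-based server described here. The assumptions are: the DAS source name equals the EMBOSS program name, the local EMBOSS installation has a database named `uniprot`, the report format name `dasgff` is available, and the example host and output filename are made up. The helper names are hypothetical, and the qualifiers should be checked against each program's `-help` output.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: each DAS source is named after the EMBOSS program it wraps.
# Only programs whose sequence qualifier is "-sequence" are listed here;
# other programs (for example digest) may use a different qualifier.
SEQUENCE_QUALIFIER = {
    "pepcoil": "-sequence",
    "antigenic": "-sequence",
    "sigcleave": "-sequence",
    "garnier": "-sequence",
}


def parse_features_request(url):
    """Split a DAS 'features' request into (source, segment id, start, stop)."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Expected tail of the path: .../das/<source>/features
    if len(parts) < 3 or parts[-1] != "features" or parts[-3] != "das":
        raise ValueError("not a DAS features request: " + url)
    source = parts[-2]
    segment = parse_qs(parsed.query)["segment"][0]
    seg_id, _, seg_range = segment.partition(":")
    start = stop = None
    if seg_range:
        first, last = seg_range.split(",")
        start, stop = int(first), int(last)
    return source, seg_id, start, stop


def emboss_command(source, seg_id, outfile, start=None, stop=None):
    """Build (but do not run) an EMBOSS command line for one request."""
    if source not in SEQUENCE_QUALIFIER:
        raise ValueError("no EMBOSS program configured for source " + source)
    cmd = [
        source,
        SEQUENCE_QUALIFIER[source], "uniprot:" + seg_id,  # assumed database name
        "-outfile", outfile,
        "-rformat", "dasgff",  # assumed report format name
        "-auto",               # accept defaults, never prompt
    ]
    if start is not None:
        cmd += ["-sbegin", str(start), "-send", str(stop)]
    return cmd


request = "http://example.org/das/pepcoil/features?segment=P12345:1,100"
source, seg_id, start, stop = parse_features_request(request)
command = emboss_command(source, seg_id, "P12345.dasgff", start, stop)
print(" ".join(command))
```

A server would pass the resulting list to something like `subprocess.run` and return the contents of the output file as the DAS response.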
Protein sequence annotation, EMBOSS programs used so far
• pepcoil; predicted coiled coil regions in protein sequences
• patmatmotifs; motifs from the PROSITE database
• helixturnhelix; nucleic acid-binding motifs in protein sequences
• garnier; predicted protein secondary structures using GOR method
• sigcleave; predicted signal cleavage sites in protein sequences
• digest; protein proteolytic enzyme or reagent cleavage sites
• antigenic; predicted antigenic regions in protein sequences
Nucleotide sequence annotation, EMBOSS programs used so far
• equicktandem, etandem; tandem repeats in nucleotide sequences
• silent; restriction enzyme sites in a nucleotide sequence which can be inserted (mutated) without changing the translation
• jaspscan; transcription factor binding sites from the JASPAR database
• marscan; matrix/scaffold recognition (MRS) signatures in DNA sequences
• restrict; restriction enzyme cleavage sites in nucleotide sequences
• tcode; protein-coding regions identified using Fickett TESTCODE statistic
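To make the target format concrete, here is a minimal sketch of wrapping feature records in a DASGFF document, using the element names from the DAS 1.5 `features` response (`DASGFF`, `GFF`, `SEGMENT`, `FEATURE`, `TYPE`, `METHOD`, `START`, `END`, `SCORE`, `ORIENTATION`, `PHASE`). EMBOSS writes this format itself, so this is not its implementation. The function name, the dictionary keys, and the example feature (identifier, coordinates, score) are invented for illustration.

```python
import xml.etree.ElementTree as ET


def features_to_dasgff(segment_id, start, stop, features, href=""):
    """Serialise feature dictionaries as a DASGFF XML string."""
    root = ET.Element("DASGFF")
    gff = ET.SubElement(root, "GFF", version="1.0", href=href)
    segment = ET.SubElement(
        gff, "SEGMENT", id=segment_id, start=str(start), stop=str(stop)
    )
    for f in features:
        feature = ET.SubElement(
            segment, "FEATURE", id=f["id"], label=f.get("label", f["id"])
        )
        ET.SubElement(feature, "TYPE", id=f["type"]).text = f["type"]
        ET.SubElement(feature, "METHOD", id=f["method"]).text = f["method"]
        ET.SubElement(feature, "START").text = str(f["start"])
        ET.SubElement(feature, "END").text = str(f["end"])
        # "-" marks a missing score; "0" orientation means "not applicable".
        ET.SubElement(feature, "SCORE").text = str(f.get("score", "-"))
        ET.SubElement(feature, "ORIENTATION").text = f.get("orientation", "0")
        ET.SubElement(feature, "PHASE").text = f.get("phase", "-")
    return ET.tostring(root, encoding="unicode")


# Invented example: one tandem repeat on a gene-level segment.
doc = features_to_dasgff(
    "ENSG00000139618", 1, 5000,
    [{"id": "etandem.1", "type": "tandem_repeat", "method": "etandem",
      "start": 120, "end": 179, "score": 42, "orientation": "+"}],
)
print(doc)
```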
Other EMBOSS programs that can be used for annotation
• 26 EMBOSS programs producing graphical outputs
  • Possibly using stylesheet support in Ensembl & DAS
• 13 EMBOSS alignment programs
  • DAS 1.53E has alignment extension
Test clients used
• Dasty2; for protein annotations
  • Good at displaying individual features
  • Useful links for further exploration
    • Links to ontology terms used
    • Links to original DAS responses
• Ensembl; for gene and protein annotations
  • Displays features in genomic context
  • Possible to use DAS resources that are not in the registry
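For readers without one of these clients to hand, the following sketch shows the client side in a few lines: building a `features` request URL and pulling the features out of a DASGFF response. It assumes the experimental server's base URL is the one given in this talk and that a source named `pepcoil` exists under it, which is a guess. The sample response is hand-written for the example and not real server output, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def features_url(base, source, segment_id, start=None, stop=None):
    """Build a DAS 'features' request URL for one segment."""
    segment = segment_id if start is None else f"{segment_id}:{start},{stop}"
    query = urlencode({"segment": segment}, safe=":,")
    return f"{base.rstrip('/')}/{source}/features?{query}"


def parse_dasgff(xml_text):
    """Return one dictionary per FEATURE element in a DASGFF document."""
    root = ET.fromstring(xml_text)
    found = []
    for segment in root.iter("SEGMENT"):
        for feature in segment.iter("FEATURE"):
            found.append({
                "segment": segment.get("id"),
                "id": feature.get("id"),
                "type": feature.findtext("TYPE"),
                "method": feature.findtext("METHOD"),
                "start": int(feature.findtext("START")),
                "end": int(feature.findtext("END")),
            })
    return found


# Source name "pepcoil" under this base URL is an assumption.
url = features_url("http://wwwdev.ebi.ac.uk/soaplab/das", "pepcoil", "P12345", 1, 100)

# Hand-written sample response, for illustration only.
sample = """
<DASGFF>
  <GFF version="1.0" href="">
    <SEGMENT id="P12345" start="1" stop="100">
      <FEATURE id="pepcoil.1" label="coiled coil">
        <TYPE id="coiled_coil">coiled_coil</TYPE>
        <METHOD id="pepcoil">pepcoil</METHOD>
        <START>12</START>
        <END>45</END>
        <SCORE>1.5</SCORE>
        <ORIENTATION>0</ORIENTATION>
        <PHASE>-</PHASE>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>
"""
features = parse_dasgff(sample)
print(url)
print(features)
```

In practice the response text would come from fetching `url`, for example with `urllib.request.urlopen`, instead of from the inline sample.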
Work in progress
• Need to register on dasregistry.org
• Experimental DAS server available at http://wwwdev.ebi.ac.uk/soaplab/das
• DAS servers as data sources
• Common coordinate systems