Marie-Adèle Rajandream The Pathogen Sequencing Unit The Sanger Institute The Wellcome Trust Genome Campus Hinxton Cambridge United Kingdom


Page 1

Marie-Adèle Rajandream
The Pathogen Sequencing Unit
The Sanger Institute
The Wellcome Trust Genome Campus
Hinxton
Cambridge
United Kingdom

Page 2

The Sanger Institute

Principally funded by Wellcome Trust (about 96 %)

60,000,000 bases per day of raw data

600 employees

Sequencing of human, mouse, zebrafish and pathogen genomes

Manual and automatic genome annotation (Ensembl, Artemis)

Identification of cancer-causing mutations (recently, mutations in the BRAF gene)

Sequence variation and disease association

Page 3

The Pathogen Sequencing Unit

Sequencing

• Small genomes (bacterial and model organisms)

• 60-70 projects

• Current capacity 4 M reads p/a, sufficient for 100 Mb of finished sequence

• Mainly whole genome/chromosome shotguns, including finishing

• Many are international collaborations

• Larger, more complex genomes (35-100 Mb) on the horizon

Informatics

• Automatic analysis

• Manual annotation by expert biologists

• Tools: finishing (Cyclops), annotation (Artemis), comparative analysis (ACT)

• Data dissemination

• Database resources

Functional Genomics

• S. pombe

• Bacterial Genomes

• D. discoideum
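
As a rough sanity check on the capacity figure quoted for the unit, 4 M reads per year against 100 Mb of finished sequence works out to about 40,000 reads per finished megabase. The sketch below does that arithmetic. The 500 bp average read length is an assumption made purely for illustration and does not come from the slides, so the implied redundancy should be read as an order-of-magnitude estimate that lumps together shotgun, failed and finishing reads.

```python
# Figures quoted on the slide.
READS_PER_YEAR = 4_000_000
FINISHED_MB_PER_YEAR = 100

# Assumption (not from the slides): average usable read length in bases.
ASSUMED_READ_LENGTH_BP = 500


def reads_per_finished_mb(reads, finished_mb):
    """Number of reads spent per megabase of finished sequence."""
    return reads / finished_mb


def implied_redundancy(reads, read_length_bp, finished_mb):
    """Raw bases produced divided by finished bases delivered."""
    raw_bases = reads * read_length_bp
    finished_bases = finished_mb * 1_000_000
    return raw_bases / finished_bases


if __name__ == "__main__":
    print(reads_per_finished_mb(READS_PER_YEAR, FINISHED_MB_PER_YEAR))
    print(implied_redundancy(READS_PER_YEAR, ASSUMED_READ_LENGTH_BP,
                             FINISHED_MB_PER_YEAR))
```

Changing `ASSUMED_READ_LENGTH_BP` shows how sensitive the redundancy estimate is to that single assumption; the reads-per-megabase figure depends only on the two numbers from the slide.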

Page 4

GeneDB

http://www.genedb.org

Page 5

[Diagram of GeneDB (http://www.genedb.org); labels: project pages, annotation, sequences, analysis, FTP site, BLAST, curation]

Page 6

What is GeneDB?

• a generic organism database

• annotated sequences as well as functional data

• visualisation in a user-friendly environment

• annotation and analysis of data by biologists

• flexible enough to incorporate new data types

• linked to external databases

• fully curated

Page 7

The GeneDB project

• Started in 2001

• Funded by the Wellcome Trust for a period of 5 years

• Initially for 3 organisms: S. pombe, Leishmania & Trypanosome

• 2 full-time programmers, 1 part-time programmer

• One curator for each organism

• One helpdesk person / programmer

• Prototype now done and in use

Page 20

Technical Outline: Prototype

“Java”: biojava, data, gui, minelet, mining, test, utils, web

Web: jsp, cgi, blast, ominblast, asp, common, cerevisiae, pombe, malaria, leish, tryp

Data: asp (images, serialise, indices), cerevisiae (images, serialise, indices), pombe, malaria, tryp, leish, EMBL

Page 21

Broad specifications for production version

• Relational database

• Curator / annotator interface incorporating functionality of Artemis (MESS)

• Facility for doing more complex queries

For comprehensive, detailed specs see our Functional Specifications document
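
To illustrate the kind of "more complex query" a relational database makes easy, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names and rows are hypothetical placeholders invented for this example; they are not the GeneDB schema or real GeneDB data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE organism (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE gene (
            id            INTEGER PRIMARY KEY,
            organism_id   INTEGER NOT NULL REFERENCES organism(id),
            systematic_id TEXT NOT NULL,
            product       TEXT
        );
    """)
    # Placeholder rows, made up for the demonstration.
    conn.executemany("INSERT INTO organism VALUES (?, ?)",
                     [(1, "S. pombe"), (2, "Leishmania")])
    conn.executemany("INSERT INTO gene VALUES (?, ?, ?, ?)", [
        (1, 1, "demo_gene_1", "protein kinase"),
        (2, 1, "demo_gene_2", "hypothetical protein"),
        (3, 2, "demo_gene_3", "protein kinase"),
    ])
    return conn


def count_products_by_organism(conn, keyword):
    """Count genes per organism whose product description contains keyword."""
    return conn.execute(
        """
        SELECT o.name, COUNT(*)
        FROM gene AS g
        JOIN organism AS o ON o.id = g.organism_id
        WHERE g.product LIKE ?
        GROUP BY o.name
        ORDER BY o.name
        """,
        (f"%{keyword}%",),
    ).fetchall()


if __name__ == "__main__":
    print(count_products_by_organism(build_demo_db(), "kinase"))
```

A cross-organism question like this needs a join and a grouping step, which is awkward over per-organism flat files but is a single statement once the data sits in related tables.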

Page 22

P. falciparum chr. 14

Page 27

“biotin carboxylase”: Inferred by Sequence Similarity with a yeast sequence, SGD:S0005299 (which was originally annotated based on a published mutant phenotype)
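
The annotation on this slide can be pictured as a small record: a product name, an evidence statement, and the identifier of the sequence the inference was made with. The sketch below is one hypothetical way to model it. The class and field names are invented for illustration, and the use of the abbreviation "ISS" assumes the slide is referring to the Gene Ontology evidence code of that name.

```python
from dataclasses import dataclass

# Evidence abbreviation mapped to the wording used on the slide.
# Assumption: "ISS" is the evidence code the slide refers to.
EVIDENCE_LABELS = {"ISS": "Inferred by Sequence Similarity"}


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record for one evidence-backed product annotation."""
    product: str
    evidence_code: str
    with_id: str  # "DATABASE:accession" of the sequence used as support

    def with_database(self):
        """Return the database prefix of the supporting identifier."""
        return self.with_id.split(":", 1)[0]

    def describe(self):
        """Return a one-line, human-readable summary of the annotation."""
        label = EVIDENCE_LABELS.get(self.evidence_code, self.evidence_code)
        return f"{self.product}: {label} with {self.with_id}"


if __name__ == "__main__":
    ann = Annotation("biotin carboxylase", "ISS", "SGD:S0005299")
    print(ann.with_database())
    print(ann.describe())
```

Keeping the supporting identifier alongside the evidence code is what lets a reader trace the annotation back to the yeast entry, and from there to the published mutant phenotype it was originally based on.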

Page 28

Pathogen Sequencing Unit

Analysis: Martin Aslett, Steven Bentley, Matthew Berriman, Ana Cerdeno, Christiane Hertz-Fowler, Matthew Holden, Keith James, Rachel Lyne, Arnab Pain, Chris Peacock, Mohammed Sebaihia, Nick Thomson, Valerie Wood

Project Management: Bart Barrell, Julian Parkhill, Marie-Adèle Rajandream, Al Ivens, Neil Hall

Programming: Rob Davies, David Harper, Arnaud Kerhornou, Paul Mooney, Kim Rutherford, Adrian Tivey, Ed Zuiderwijk

Karen Mungall, Theresa Feltwell, Ian Goodhead, Zahra Hance, Heidi Hauser, Mandy Sanders, Mark Simmonds, Danielle Walker

Barbara Harris, Becky Atkin, Andrew Barron, Carol Chillingworth, Louise Clarke, Craig Corton, Jonathan Doggett, Nicola Lennard, Alexandra Line, Doug Ormand

David Harris, Matthew Collins, Nigel Fosker, Arlette Goble, Lee Murphy, Susan O’Neil, Simon Rutter, David Saunders, Kathy Seeger, Robert Squares, Steven Squares

Carol Churcher, Karen Brooks, Inna Cherevach, Tracey Chillingworth, Kay Clarke, Paul Davies, Nancy Hamlin, Kay Jagels, Sharon Moule, Brian White, Sally Whitehead

Subcloning: Ann Cronin, Audrey Fraser, David Johnson, Mike Quail, Claire Price, Ester Rabbinowitsch, Sarah Sharp

Mapping: Maria Fookes, John Woodward

Sequencing

Administration: Yvonne Shaw

Wellcome Trust Sanger Institute