GENBANK, SWISSPROT AND OTHERS

As Problem Sources for CSE 549. Andriy Tovkach, Genetics.


Page 1: GENBANK,  SWISSPROT AND OTHERS

GENBANK, SWISSPROT AND OTHERS

As Problem Sources for CSE 549

Andriy Tovkach

Genetics

Page 2: GENBANK,  SWISSPROT AND OTHERS

GENBANK OVERVIEW

Maintained by NCBI; exchanges data with EMBL and DDBJ
Started 10 years ago
Exponential growth (graph)
On Saturday, the 7th: 20.2 billion bases

Page 3: GENBANK,  SWISSPROT AND OTHERS

FILE FORMAT

Header
Features
Sequence (see files)
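The three parts of a record can be pulled apart with very little code. The following is a minimal sketch, not a full parser: it assumes one record per string, relies on the `FEATURES` and `ORIGIN` keywords that open the feature table and the sequence block, and stops at the `//` terminator. The function name `split_genbank_record` and the sample record are made up for illustration.

```python
# A made-up record laid out like a GenBank flat file (illustration only).
SAMPLE_RECORD = """\
LOCUS       DEMO0001                  24 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Made-up record for illustration only.
ACCESSION   DEMO0001
FEATURES             Location/Qualifiers
     source          1..24
                     /organism="synthetic construct"
ORIGIN
        1 atggctagct agctagctag ctaa
//
"""


def split_genbank_record(text):
    """Split one flat-file record into header, features and sequence."""
    header, features, sequence = [], [], []
    section = header  # everything before FEATURES belongs to the header
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if line.startswith("FEATURES"):
            section = features
            continue
        if line.startswith("ORIGIN"):
            section = sequence
            continue
        section.append(line)
    # Sequence lines carry position numbers and spaces; keep the letters only.
    residues = "".join(ch for line in sequence for ch in line if ch.isalpha())
    return {"header": header, "features": features, "sequence": residues}


parts = split_genbank_record(SAMPLE_RECORD)
```

Real records have more header fields and multi-line feature qualifiers, so a production tool would use an established parser rather than this sketch.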

Page 4: GENBANK,  SWISSPROT AND OTHERS

FASTA FORMAT

Single-line description beginning with >
Followed by sequence data
Can be either protein or DNA
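Because the format is this simple, a reader fits in a few lines. The sketch below assumes the whole file is already in a string and that sequence data may wrap over several lines; `parse_fasta` and the example sequences are hypothetical names and made-up data.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) pairs from FASTA text."""
    records = []
    description, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description, chunks = line[1:].strip(), []
        elif description is not None:
            chunks.append(line)  # sequence data may span several lines
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Made-up sequences: one DNA record and one protein record.
EXAMPLE = """\
>demo_dna made-up DNA sequence
ATGGCTAGCT
AGCTAA
>demo_protein made-up protein sequence
MKVLAAGIV
"""
records = parse_fasta(EXAMPLE)
```

The parser does not try to tell DNA from protein; as the slide notes, the same layout is used for both.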

Page 5: GENBANK,  SWISSPROT AND OTHERS

ENTREZ as RETRIEVAL SYSTEM

PubMed: 12 million citations from life science journals
Nucleotide: collection of DNA sequences
Protein: protein sequences from SwissProt and other databases
Genome: genomes of over 800 organisms
Also: Structure, PopSet, Taxonomy, OMIM
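Entrez can also be queried from a program. The slides only cover the website, so the following is an assumption-laden sketch: it uses NCBI's E-utilities `efetch` endpoint, which is not mentioned in the talk, and only builds the request URL without contacting the server. The helper name `build_efetch_url` is hypothetical and the accession is chosen purely for illustration.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch service (not part of the slides).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(db, accession, rettype="fasta"):
    """Build a URL asking one Entrez database for one record as plain text."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


# Request a record from the Nucleotide database in FASTA format.
url = build_efetch_url("nucleotide", "U49845")
```

Opening the resulting URL (for example with `urllib.request.urlopen`) should return the record as text; NCBI's usage guidelines on request rates apply to any real use.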

Page 6: GENBANK,  SWISSPROT AND OTHERS

PROTEIN DATABASES

SWISS-PROT
EBI: TREMBL
NCBI: GENPEPT (already in history)

Page 7: GENBANK,  SWISSPROT AND OTHERS

GENOME DATABASES

SGD: homepage, example 1.1, example 1.2
Wormbase
Ensembl
Human Genome Browser

Page 8: GENBANK,  SWISSPROT AND OTHERS

CONCLUSIONS

Sequencing projects produce a lot of data
At a minimum, these data have to be structured in databases
Ideally, all sequences need high-quality human annotation
That's why computer scientists are welcome in biology

Page 9: GENBANK,  SWISSPROT AND OTHERS

LITERATURE

GenBank presentation by Manpreet Katari (CSE 549, Fall 2000)
Thomas Lengauer (Ed.), Bioinformatics: From Genomes to Drugs
Entrez website
Google