BIO-TRAC 25 (Proteomics: Principles and Methods)
October 10, 2003
NIH, Bethesda, MD

Zhang-Zhi Hu, M.D.
Senior Bioinformatics Scientist, Protein Information Resource
National Biomedical Research Foundation, GUMC

Tutorial: Bioinformatics Resources

What is Bioinformatics?

NIH Biomedical Information Science and Technology Initiative (BISTI) Working Definition (2002): Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.

Bioinformatics is the application of information technology to the analysis, organization and distribution of biological data in order to answer complex biological questions.

Bioinformatics Resources

The Molecular Biology Database Collection: An Online Compilation of Relevant Database Resources
  2003 update: http://www3.oup.co.uk/nar/database/
  Nucleic Acids Research Database Issues (annually, in January)
  (2003: http://nar.oupjournals.org/content/vol31/issue1/)

DBcat: A Catalog of > 500 Biological Databases
  http://www.infobiogen.fr/services/dbcat/

Molecular Biology Database Collection

(http://nar.oupjournals.org/cgi/content/full/31/1/1#GKG120TB1)

The Molecular Biology Database Collection: 2003 update (Baxevanis, A.D.)

An online resource of 386 key databases in 18 categories:

  Major Sequence Repositories
  Comparative Genomics
  Gene Expression
  Gene Identification and Structure
  Genetic and Physical Maps
  Genomic Databases
  Intermolecular Interactions
  Metabolic Pathways and Cellular Regulation
  Mutation Databases
  Pathology
  Protein Sequence Motifs
  Proteome Resources
  Retrieval Systems and Database Structure
  RNA Sequences
  Structure
  Transgenics
  Varied Biomedical Content

Overview

Protein Sequence Analysis
  I. Sequence Similarity Search and Alignment
  II. Family Classification Methods
  III. Structure Prediction Methods

Molecular Biology Databases
  IV. Protein Family Databases
  V. Databases of Protein Functions
  VI. Databases of Protein Structures

Proteomic Resources
  VII. 2D-gel databases
  VIII. Proteomic analyses

I. Sequence Similarity Search

Find a protein sequence: text search

Based on Pair-Wise Comparisons
  BLOSUM scoring matrix
  PAM scoring matrix

Dynamic Programming Algorithms
  Global Similarity: Needleman-Wunsch (GAP/BestFit)
  Local Similarity: Smith-Waterman (SSEARCH)

Heuristic Algorithms (Sequence Database Searching)
  FASTA: Based on K-Tuples (2-Amino Acid)
  BLAST: Triples of Conserved Amino Acids
  Gapped-BLAST: Allow Gaps in Segment Pairs (NREF)
  PHI-BLAST: Pattern-Hit Initiated Search (NCBI)
  PSI-BLAST: Iterative Search (NCBI)
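
The k-tuple idea listed for FASTA can be illustrated with a short Python sketch. This is not the FASTA program: it only indexes the 2-residue words of a target sequence and counts, per diagonal, how many words the query shares with it. The function names and the two toy sequences are invented for illustration.

```python
from collections import defaultdict

def ktuple_index(seq, k=2):
    """Map each k-tuple (word of length k) to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def diagonal_hits(query, target, k=2):
    """Count shared k-tuples per diagonal (offset = query pos - target pos)."""
    index = ktuple_index(target, k)
    diagonals = defaultdict(int)
    for i in range(len(query) - k + 1):
        # .get() avoids creating empty entries for words absent from the target
        for j in index.get(query[i:i + k], []):
            diagonals[i - j] += 1
    return dict(diagonals)

# Toy sequences (hypothetical): the query matches the target shifted by two residues.
hits = diagonal_hits("MKVLAAGIV", "GGMKVLAAG", k=2)
best_offset = max(hits, key=hits.get)
print(hits, best_offset)
```

A diagonal that collects many word hits marks a region worth aligning in detail, which is why a heuristic search of this kind can skip most of the dynamic-programming matrix.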

Sequence Search by Text or Unique ID

Entrez (http://www.ncbi.nlm.nih.gov/Entrez/)

(http://pir.georgetown.edu/pirwww/search/textsearch.html)

Pair-Wise Comparisons

Scoring matrix

Global and local similarity: Dynamic Programming (Needleman-Wunsch, Smith-Waterman)

(http://www.ebi.ac.uk/emboss/align/)
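
As a rough illustration of the two dynamic-programming variants, the following Python sketch computes an alignment score only (no traceback). It makes two simplifying assumptions that are not from the slides: a flat match/mismatch score stands in for a BLOSUM or PAM matrix, and gaps carry a linear penalty. The function name and parameter values are hypothetical.

```python
def align_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Alignment score by dynamic programming with a linear gap penalty.

    local=False follows the Needleman-Wunsch (global) recurrence;
    local=True follows the Smith-Waterman (local) recurrence.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(score[i - 1][j - 1] + s,   # align the two residues
                       score[i - 1][j] + gap,     # gap in b
                       score[i][j - 1] + gap)     # gap in a
            if local:
                cell = max(cell, 0)               # local alignments may restart
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]

print(align_score("XXACDXX", "ACD"))              # global score
print(align_score("XXACDXX", "ACD", local=True))  # local score
```

The same pair of sequences scores very differently under the two modes: the global score pays for every unmatched flanking residue, while the local score reports only the best-matching segment.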

FASTA Search

(http://www.ebi.ac.uk/fasta33/)

(http://pir.georgetown.edu/pirwww/search/fasta.html)

Gapped-BLAST Search

(http://pir.georgetown.edu/pirwww/search/pirnref.shtml)

(http://www.ncbi.nlm.nih.gov/BLAST/)

A BLAST Result

PSI-BLAST Iterative Search

(http://www.ncbi.nlm.nih.gov/BLAST/)

PSI-BLAST

II. Family Classification Methods

Multiple Sequence Alignment and Phylogenetic Analysis
  ClustalW Multiple Sequence Alignment
  Alignment Editor & Phylogenetic Trees

Searches Based on Family Information
  PROSITE Pattern Search
  Motif and Profile Search
  Hidden Markov Models (HMMs)

Multiple Sequence Alignment

ClustalW (http://pir.georgetown.edu/pirwww/search/multaln.html)

Alignment Editor (Jalview)

(http://www.ebi.ac.uk/clustalw/)

Alignment Editor (GeneDoc)

(http://www.psc.edu/biomed/genedoc/)

Phylogenetic Analysis

Tree Programs: (http://evolution.genetics.washington.edu/phylip.html)

Tree Searches: (http://pauling.mbu.iisc.ernet.in/~pali/index.html)

Phylogenetic Trees (IGFBP Superfamily)

(Radial Tree)

(Phylogram)

PROSITE Pattern Search

(http://pir.georgetown.edu/pirwww/search/patmatch.html)
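
A pattern search of this kind can be approximated locally with regular expressions. The Python sketch below translates the common elements of PROSITE pattern syntax (x for any residue, [..] for allowed residues, {..} for excluded residues, (n) or (n,m) for repeats, < and > for the termini) into a regular expression. The helper name and the test sequence are hypothetical, rarer syntax such as anchors inside brackets is not handled, and the example pattern is the N-glycosylation site pattern N-{P}-[ST]-{P}.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P} -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                     # x -> any residue
        element = element.replace("<", "^").replace(">", "$")   # terminus anchors
        parts.append(element)
    return "".join(parts)

# N-glycosylation site pattern applied to an invented sequence.
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
starts = [m.start() for m in re.finditer(regex, "AANGSAANPSA")]
print(regex, starts)
```

Note that re.finditer reports non-overlapping matches only, so a dedicated pattern scanner may report more hits than this sketch on sequences with overlapping sites.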

Profile Search

(http://bmerc-www.bu.edu/bioinformatics/profile_request.html)
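
The general idea of a profile can be sketched in a few lines of Python: each column of an ungapped alignment becomes a set of log-odds scores, and the profile is slid along a sequence to find the best-scoring window. This is only an illustration under stated assumptions, not the method used by the server above: the background frequencies are uniform, a pseudocount of 1 is added per residue, gaps and non-standard residues are not handled, and the alignment and query sequence are invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Per-column log-odds scores against a uniform background (an assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2((counts.get(aa, 0) + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return profile

def best_window(profile, sequence):
    """Slide the profile along the sequence; return (best score, start index)."""
    width = len(profile)
    scored = [
        (sum(col[aa] for col, aa in zip(profile, sequence[i:i + width])), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(scored)

# Invented ungapped alignment of a 4-residue motif and an invented query.
profile = build_profile(["GKST", "GKSS", "GKTT"])
score, start = best_window(profile, "AAGKSTAA")
print(round(score, 2), start)
```

Unlike a fixed pattern, a profile scores every residue at every position, so a sequence with one unusual residue can still score well if the rest of the window fits the family.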

Hidden Markov Model Search

(http://www.sanger.ac.uk/Software/Pfam/search.shtml)

(http://smart.embl-heidelberg.de)

III. Structural Prediction Methods

Signal Peptide: SIGFIND, SignalP

Transmembrane Helix: TMHMM, TMAP

2D Prediction (α-helix, β-sheet, Coiled-coils): PHD, JPred

3D Modeling: Homology Modeling (Modeller, SWISS-MODEL), Threading, Ab-initio Prediction

Structure Prediction: A Guide

(http://speedy.embl-heidelberg.de/gtsp/flowchart2.html)

Protein Prediction Server

(http://www.cbs.dtu.dk/services/)

Signal Peptide Prediction

(http://www.stepc.gr/~synaptic/sigfind.html)

(http://www.cbs.dtu.dk/services/SignalP-2.0)

Transmembrane Helix

(http://www.cbs.dtu.dk/services/TMHMM/)

Protein Structure Prediction

(http://cmgm.stanford.edu/WWW/www_predict.html)

(http://restools.sdsc.edu/biotools/biotools9.html)

Structure Prediction Server

(http://cubic.bioc.columbia.edu/predictprotein/)

(http://www.compbio.dundee.ac.uk/WWW_Servers/JPred/jpred.html)

3D-Modelling

(http://www.salilab.org/modeller/modeller.html)

(http://www.expasy.ch/swissmod/SWISS-MODEL.html)

IV. Protein Family Databases

Whole Proteins
  PIR: Superfamilies and Families
  COG (Clusters of Orthologous Groups) of Complete Genomes
  ProtoNet: Automated Hierarchical Classification of Proteins

Protein Domains
  Pfam: Alignments and HMM Models of Protein Domains
  SMART: Protein Domain Families

Protein Motifs
  PROSITE: Protein Patterns and Profiles
  BLOCKS: Protein Sequence Motifs and Alignments
  PRINTS: Protein Sequence Motifs and Signatures

Integrated Family Databases
  iProClass: Superfamilies/Families, Domains, Motifs, Rich Links
  InterPro: Integrates Pfam, PRINTS, PROSITE, ProDom, SMART

Protein Clustering

(http://www.ncbi.nlm.nih.gov/COG/)

Protein Domains

Pfam (http://www.sanger.ac.uk/Software/Pfam/)

SMART (http://smart.embl-heidelberg.de/smart/show_motifs.pl)

Protein Motifs

PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles. (http://www.expasy.ch/prosite/)

Integrated Family Classification

InterPro: An integrated resource unifying PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, and PIRSF. (http://www.ebi.ac.uk/interpro/search.html)

V. Databases of Protein Functions

Metabolic Pathways, Enzymes, and Compounds
  Enzyme Classification: Classification and Nomenclature of Enzyme-Catalysed Reactions (EC-IUBMB)
  KEGG (Kyoto Encyclopedia of Genes and Genomes): Metabolic Pathways
  LIGAND (at KEGG): Chemical Compounds, Reactions and Enzymes
  EcoCyc: Encyclopedia of E. coli Genes and Metabolism
  MetaCyc: Metabolic Encyclopedia (Metabolic Pathways)
  WIT: Functional Curation and Metabolic Models
  BRENDA: Enzyme Database
  UM-BBD: Microbial Biocatalytic Reactions and Biodegradation Pathways
  Klotho: Collection and Categorization of Biological Compounds

Cellular Regulation and Gene Networks
  EpoDB: Genes Expressed during Human Erythropoiesis
  BIND: Descriptions of interactions, molecular complexes and pathways
  DIP: Catalogs experimentally determined interactions between proteins
  RegulonDB: Escherichia coli Pathways and Regulation

KEGG Metabolic & Regulatory Pathways

(http://www.genome.ad.jp/dbget-bin/show_pathway?hsa00590+874)

KEGG is a suite of databases and associated software, integrating our current knowledge on molecular interaction networks, the information of genes and proteins, and of chemical compounds and reactions. (http://www.genome.ad.jp/kegg/kegg2.html)

BioCyc (EcoCyc/MetaCyc Metabolic Pathways)

The BioCyc Knowledge Library is a collection of Pathway/Genome Databases (http://biocyc.org/)

Protein-Protein Interactions: DIP

(http://dip.doe-mbi.ucla.edu/)

Protein-Protein Interaction: BIND

(http://www.bind.ca/)

BioCarta Cellular Pathways

(http://www.biocarta.com/index.asp)

VI. Databases of Protein Structures

Protein Structure and Classification
  PDB: Structure Determined by X-ray Crystallography and NMR
  CATH: Hierarchical Classification of Protein Domain Structures
  SCOP: Familial and Structural Protein Relationships
  FSSP: Protein Fold Family Database

Protein Sequence-Structure Relationship
  PIR-NRL3D: Protein Sequence-Structure Database
  PIR-RESID: Protein Structure/Post-Translational Modifications
  HSSP: Families and Alignments of Structurally-Conserved Regions

PDB Structure Data

(http://www.rcsb.org/pdb/)

PDBsum: Summary and Analysis

(http://www.biochem.ucl.ac.uk/bsm/pdbsum)

Protein Structural Classification

CATH: Hierarchical domain classification of protein structures (http://www.biochem.ucl.ac.uk/bsm/cath_new/)

Protein Structural Classification

(http://scop.mrc-lmb.cam.ac.uk/scop/)

The SCOP database aims to provide a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known, including all entries in the PDB.

VII. Proteomic Resources

GELBANK (http://gelbank.anl.gov): 2D-gel patterns from completed genomes; SWISS-2DPAGE (http://www.expasy.org/ch2d/)

PEP: Predictions for Entire Proteomes (http://cubic.bioc.columbia.edu/pep/): Summarized analyses of protein sequences

Proteome BioKnowledge Library (http://www.proteome.com): Detailed information on human, mouse and rat proteomes

Proteome Analysis Database (http://www.ebi.ac.uk/proteome/): Online application of InterPro and CluSTr for the functional classification of proteins in whole genomes

Expression Profiling databases: GNF (http://expression.gnf.org/cgi-bin/index.cgi, human and mouse transcriptome), SMD (http://genome-www5.stanford.edu/MicroArray/SMD/, Stanford microarray data analysis), EBI Microarray Informatics (http://www.ebi.ac.uk/microarray/index.html, managing, storing and analyzing microarray data)

2D-Gel Image Databases (1)

(http://gelbank.anl.gov/2dgels/index.asp)

2D-Gel Image Databases (2)

(http://us.expasy.org/ch2d/2d-index.html)

(http://us.expasy.org/cgi-bin/nice2dpage.pl?P06493)

VIII. Proteome Analysis

(http://www.ebi.ac.uk/proteome)

Expression Profiling

Human and Mouse Transcriptome (http://expression.gnf.org/cgi-bin/index.cgi)

(http://genome-www.stanford.edu/serum/)

Lab: Visit selected websites and analyze some protein sequences of your own choice.

A list of the bioinformatics resources in this tutorial is available at: http://pir.georgetown.edu/~huz/bioinfo_resource.html

Try some of the following sequences for analysis:
  1) Well-characterized proteins: PIR:A26366 (CYP17), JS0747 (Sp1)
  2) Less-characterized proteins: PIR:A59000 (MATER), TrEMBL:Q9QY16 (GRTH)
  3) Hypothetical proteins: PIR:T12515, T00338, T47130; SWISS-PROT:Q9BWT7