Structure Databases: The Protein Data Bank
Swanand Gore & Gerard Kleywegt, PDBe – EBI
May 7th 2010, 9-10 am
Macromolecular Crystallography Course
Outline
• Structural Biology and Bioinformatics
• Databases in Structural Bioinformatics
• Protein Data Bank
• PDBe
Promise of Structural Biology
• Basic research
– Insights into the biophysics of folding
– Insights into evolution
– Insights into enzymatic catalysis
• Applications
– Design of drugs / antibodies / epitopes / pesticides / enzymes
– Design of new materials
– Understanding disease
• Structural bioinformatics
– Big computational and informatics toolbox
– Full of techniques to translate insights into applications
– Databases are a vital aspect
Sequence-Structure-Function
[Diagram: Sequence, Structure and Function, connected by the tasks Prediction, Modelling, Determination, Archival / Retrieval, Classification, Searching, Mining, Comparison, Alignment, Design and Engineering]
A rich toolbox
[Diagram: structural bio-info-computing toolbox: Structure Refinement, Databases, Annotation, Classification, Comparison, Analysis, Mining, Prediction]
Databases are central to the structural bioinformatics pipeline
[Diagram: Primary Structural Databases; Determine, Annotate; Align, Compare; Mine, Classify; Model, Predict; Secondary Structural Databases]
Databases help in Structure Determination
• Dihedral preferences
– Ramachandran contours
– Sidechain rotamer libraries
– RNA backbone and puckers
• Likely ring conformations
– Small molecules (CCDC)
• Molecular replacement
– Choice of probe using homology
– Fragment-based MR
• Validation
– Electron density server and PrEDS
• Dunbrack, R.L., Jr. Rotamer libraries in the 21st century. Curr. Opin. Struct. Biol. 12:431-440, 2002.
• Jane S. Richardson et al. (2008) "RNA Backbone: Consensus All-angle Conformers and Modular String Nomenclature (an RNA Ontology Consortium contribution)" RNA 14:465-481
• The Cambridge Structural Database: a quarter of a million crystal structures and rising, F. H. Allen, Acta Cryst. B58, 380-388, 2002
• S.C. Lovell et al. (2003) "Structure Validation by Cα Geometry: φ,ψ and Cβ Deviation." Proteins: Structure, Function and Genetics 50, 437-450.
• Claude et al. CaspR: a web server for automated molecular replacement using homology modelling. Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W606-9.
• McCoy, A.J., Grosse-Kunstleve, R.W., Adams, P.D., Winn, M.D., Storoni, L.C. and Read, R.J. (2007). Phaser crystallographic software. J. Appl. Cryst. 40: 658-674.
• Gubbi et al. (2007) Solving Protein Structures Using Molecular Replacement Via Protein Fragments, Lecture Notes In Artificial Intelligence, Vol. 4578, 627.
• GJ Kleywegt et al. (2004) "The Uppsala Electron-Density Server", Acta Crystallographica, D60, 2240-2249
Databases are vital to archiving structures!
• Structures represent invaluable scientific insights
• But it is costly to solve a structure
– Time, effort, money
• Organize and safe-keep painstakingly determined data
– Formal mechanisms of arranging, searching, backing up
• Wide-ranging access to an invaluable repository without compromising data integrity
• Very low cost of maintenance in comparison with the cost of content!
Databases are vital to archiving structures
• “A database is a structured collection of data held in computer storage, often incorporating software to make it accessible in various ways”
• Databases
– Provide accessibility with safety and persistence
– Provide context for your data against other data
– Facilitate comparisons and data-mining
• Primary structural databases
– Experimental data and model coordinates
• NDB, wwPDB, BMRB, CSD, EMDB
• Secondary structural databases
– Classification, function annotation
• SCOP, EC2PDB, PALI, and many many more!
Databases / Archival / Retrieval
• Formats of databases
– Flat files (csv, tsv, columnar), supporting scripts
– Relational (MySQL, Oracle): professional, indexed
• Access
– Modes: read, write, edit, delete (PDB provides entry deposition mechanisms)
– Means: download (wwPDB ftp), command-line or GUI (SQL queries, Oracle desktop client), web-based interfaces (PDBeDatabase service)
– Access frequency
• Schema design
– Tables, primary keys, foreign keys, views…
– Normal forms: avoid data repetition, inconsistencies
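The schema-design points above (tables, primary keys, foreign keys, avoiding repetition) can be illustrated with a minimal relational sketch. The table and column names are hypothetical and are not the actual PDBe schema; SQLite is used only because it ships with Python.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with two normalized, linked tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE entry (
            entry_id TEXT PRIMARY KEY,   -- hypothetical: one row per deposited entry
            method   TEXT NOT NULL       -- stored once here, not repeated per chain
        );
        CREATE TABLE chain (
            entry_id TEXT NOT NULL REFERENCES entry(entry_id),
            chain_id TEXT NOT NULL,
            kind     TEXT NOT NULL,
            PRIMARY KEY (entry_id, chain_id)
        );
    """)
    return conn

def chains_per_entry(conn):
    """Join the two tables and count the chains belonging to each entry."""
    cursor = conn.execute(
        "SELECT e.entry_id, e.method, COUNT(c.chain_id) "
        "FROM entry e LEFT JOIN chain c ON c.entry_id = e.entry_id "
        "GROUP BY e.entry_id ORDER BY e.entry_id"
    )
    return cursor.fetchall()
```

Because the experimental method lives only in the `entry` table, it cannot become inconsistent between chains of the same entry, and the foreign key rejects chains that point to an entry that does not exist.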
Databases for Classification
• Structural hierarchy
– CATH
• Class, Architecture, Topology, Homology
– SCOP
• Class, Fold, Superfamily, Family
• Enzyme hierarchy
– EC-PDB
• Oxidoreductase, ligase, lyase, isomerase, hydrolase, transferase.
• Functional ontology
– GOA
• Gene Ontology: Cellular component, Biological process, Molecular Function
• Linked to structures via SIFTS
• Christos A. Ouzounis et al. (2005) Classification schemes for protein structure and function. Nature Reviews Genetics 4, 508-519.
• Andreeva et al. (2007) Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res. 36:D419
• Gene Ontology: tool for the unification of biology. The Gene Ontology Consortium (2000) Nature Genet. 25: 25-29
• Barrell D. et al. (2009) The GOA database in 2009: an integrated Gene Ontology Annotation resource. Nucleic Acids Research 2009 37: D396-D403.
Databases for Comparison
• Structural and structure-sequence alignments
• Phylogeny
– Evolutionary trace
• Evolutionarily important residues
• Mapping onto structure
• Mizuguchi K, Deane CM, Blundell TL, Overington JP. (1998) HOMSTRAD: a database of protein structure alignments for homologous families. Protein Science 7:2469-2471.
• SISYPHUS: structural alignments for proteins with non-trivial relationships. Andreeva et al., Nucleic Acids Research Database Issue 2007, 35, D253-D259
• Gowri, V. S. et al. (2003). Integration of related sequences with protein three-dimensional structural families in an updated version of PALI database. Nucleic Acids Res. 2003 31: 486-488.
• Bhaduri A, Pugalenthi G, Sowdhamini R. PASS2: an automated database of protein alignments organised as structural superfamilies. BMC Bioinformatics. 2004, 5:35
• DBAli tools: mining the protein structure space. Marc A. Marti-Renom et al. Nucleic Acids Research, doi:10.1093/nar/gkm236
• Whelan, S., P.I.W. de Bakker, & N. Goldman. (2003). Pandit: a database of protein and associated nucleotide domains with inferred trees. Bioinformatics 19:1556-1563
• The Pfam protein families database. R.D. Finn et al., Nucleic Acids Research (2010) Database Issue 38:D211-222
• Morgan, D.H., D.M. Kristensen, D. Mittleman, and O. Lichtarge. ET Viewer: An Application for Predicting and Visualizing Functional Sites in Protein Structures. Bioinformatics. 2006 Aug 15;22(16):2049-50
Databases for Annotation
• SNPs
• Domains
• Active / allosteric sites
• Servant F. et al. (2002) ProDom: Automated clustering of homologous domains. Briefings in Bioinformatics, vol 3, no 3:246-251
• Marchler-Bauer A. et al. CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 2009 Jan;37(Database issue):D205-10
• Hulo N., Bairoch A., Bulliard V., Cerutti L., Cuche B., De Castro E., Lachaize C., Langendijk-Genevaux P.S., Sigrist C.J.A. The 20 years of PROSITE. Nucleic Acids Res. 2007
• SitesBase: a database for structure-based protein–ligand binding site comparisons. Nicola D. Gold and Richard M. Jackson, Nucleic Acids Research, 2006, Vol. 34, Database issue D231-D234
• sc-PDB: an Annotated Database of Druggable Binding Sites from the Protein Data Bank. Esther Kellenberger et al., J. Chem. Inf. Model., 2006, 46 (2), pp 717–727
• Binding MOAD, a high-quality protein–ligand database. Mark L. Benson et al., Nucleic Acids Research 2008 36(Database issue):D674-D678
• SNPeffect v2.0: a new step in investigating the molecular phenotypic effects of human non-synonymous SNPs. Joke Reumers et al., Bioinformatics 2006 22(17):2183-2185
Databases for Annotation
• Binding partners
– Small molecule: TIMBAL, CREDO
– Protein, DNA: PiBase, JAIL, BIPA
• Residues critical to enzyme mechanism
• Surface properties, cavities: Voronoia,
• CREDO: A Protein-Ligand Interaction Database for Drug Discovery. Adrian Schreyer, Tom Blundell. Chemical Biology & Drug Design, Vol. 73, No. 2. (February 2009), pp. 157-167
• BIPA: a database for protein–nucleic acid interaction in 3D structures. Semin Lee and Tom L Blundell, Bioinformatics 2009 25(12):1559-1560
• PIBASE: a comprehensive database of structurally defined protein interfaces. Davis FP and Sali A, Bioinformatics. 2005 May 1;21(9):1901-7.
• JAIL: a structure-based interface library for macromolecules. Stefan Günther et al. Nucleic Acids Res. 2009 January; 37(Database issue): D338–D341
• Elke Michalsky et al., SuperLigands: a database of ligand structures derived from the Protein Data Bank, BMC Bioinformatics 2005, 6:122
• Voronoia: analyzing packing in protein structures. Rother K et al. Nucleic Acids Res. 2009 Jan;37(Database issue):D393-5.
• CASTp: Computed Atlas of Surface Topography of proteins. Binkowski et al. Nucleic Acids Res. 2003 Jul 1;31(13):3352-5.
• The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Craig T. Porter, Gail J. Bartlett, and Janet M. Thornton (2004) Nucl. Acids. Res. 32: D129-D133.
Databases of Analysis / Mining
• Secondary structure: SSEP
• Active sites
• Protein-peptide interactions
• Loop databases
– Protein Coil Library
– Protein Loop Classification
– Loops in Proteins
– Protein Topology Graph Library
• Frequent structural motifs
• Oliva et al. (1997) An automated classification of the structure of protein loops. J Mol Biol 266 (4): 814-830.
• SSEP: secondary structural elements of proteins. V. Shanthi, P. Selvarani, Ch. Kiran Kumar, C. S. Mohire and K. Sekar, Nucleic Acids Research, 2003, Vol. 31, No. 13, 3404-3405
• PepX: a structural database of non-redundant protein-peptide complexes. Vanhee F et al., Nucleic Acids Res. 2010 Jan;38(Database issue):D545-51.
• Baeten L, et al. (2008) Reconstruction of Protein Backbones from the BriX Collection of Canonical Protein Fragments. PLoS Comput Biol 4(5): e1000083. doi:10.1371/journal.pcbi.1000083
• Bystroff C & Baker D. (1998). Prediction of local structure in proteins using a library of sequence-structure motifs. J Mol Biol 281, 565-77.
• LigBase: a database of families of aligned ligand binding sites in known protein sequences and structures. Stuart AC et al., Bioinformatics. 2002 Jan;18(1):200-1.
• PTGL - a web-based database application for protein topologies. Patrick May et al. Bioinformatics 2004 20(17):3277-3279; doi:10.1093/bioinformatics/bth367
• Fitzkee, N. C., Fleming, P. J, Rose G. D. (2005) The Protein Coil Library: a structural database of nonhelix, nonstrand fragments derived from the PDB. Proteins. 58 (4): 852-4.
Databases in Prediction
• Oligomeric state
– PISA at PDBe
• 3D coordinates
– ab-initio folding
– homology models
• Possible binding partners and binding modes
– small-molecule (PRECISE)
– protein-protein (ADAN)
• Dynamics, conformational changes
– MolMovDB
• Cellular location
• LOC3D: annotate sub-cellular localization for protein structures. Nair R, Rost B., Nucleic Acids Res. 2003 Jul 1;31(13):3337-40.
• MolMovDB: analysis and visualization of conformational change and structural flexibility. Echols N et al., Nucleic Acids Res. 2003 Jan 1;31(1):478-82.
• ADAN: a database for prediction of protein-protein interaction of modular domains mediated by linear motifs. Encinar JA et al., Bioinformatics. 2009 Sep 15;25(18):2418-24. Epub 2009 Jul 14.
• PRECISE: a Database of Predicted and Consensus Interaction Sites in Enzymes. Shu-Hsien Sheu et al., Nucleic Acids Research, 2005, Vol. 33, Database issue D206-D211
• MODBASE, a database of annotated comparative protein structure models and associated resources. Ursula Pieper et al., Nucleic Acids Research 37, D347-D354, 2009.
• Krissinel E, Henrick K. Inference of macromolecular assemblies from crystalline state. J. Mol. Biol. (2007) 372:774–797.
• S. M. Larson. Folding@Home and Genome@Home: Using distributed computing to tackle previously intractable problems in computational biology. Mod Meth Comp Biol, R. Grant, ed, Horizon Press (2003)
Specialized databases with structures
• MCSIS (GPCRs, Prions etc)
• Carbohydrates
– KEGG Glycans
• Antibodies (Abysis)
• Lysozymes
• Abysis: http://www.bioinf.org.uk/abysis/
• Horn F., Vriend G., Cohen FE. Collecting and harvesting biological data: the GPCRDB and NucleaRDB information systems. Nucleic Acids Res. 29:346-349 (2001)
• LySDB - Lysozyme Structural DataBase. Mohan KS et al., Acta Crystallogr D Biol Crystallogr. 2004 Mar;60(Pt 3):597-600.
The Protein Data Bank
• Unique primary database
– Single archive of experimentally determined macromolecular (biopolymer) structures
– ~ 65000 entries
– Distributed online
– Updated weekly
– Numerous databases derived and enriched with PDB data
– Many frontends: RCSB, PDBe, PDBsum, OCA, MMDB, Jena, SIB
• “The PDB” is a flat-file archive
– PDB formatted coordinate files
– any experimental data when submitted
The Protein Data Bank
• International Effort
– Curated by RCSB, PDBe, PDBj, BMRB
– ftp archive currently operated by RCSB
FTP traffic at PDB sites
RCSB PDB: 200 million data downloads
PDBe: 37 million data downloads
PDBj: 14 million data downloads
The Protein Data Bank
• When is a biopolymer PDB-worthy?
– Polypeptides
• Gene products
• Non-ribosomal
• Synthetic peptides > 23 residues
– Unless clearly biologically significant
– Polynucleotides
• > 3 residues
– Sugars
• > 3 sugar residues
– Fibers
• Only repeating unit deposited
Annual Growth of PDB
Primary databases differ by orders of magnitude in size.
UniProtKB: 10^7 protein sequences
GenBank: 10^11 base pairs, 10^8 gene sequences
PDB: < 10^5 structures
http://www.rcsb.org/pdb/statistics/contentGrowthChart.do?content=total&seqid=100
http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html
http://www.ebi.ac.uk/uniprot/TrEMBLstats/
Annual Growth of PDB
[Chart: annual growth, 1972 to 2009; vertical axis 0 to 9000; series: X-ray, NMR, EM]
Dominated by X-ray!
EM rising…
Redundancy in PDB (as of Nov ’08)
• Entries > 54,000
• Chains > 120,000
– Copies of a chain in same entry
• Homo-oligomers
– Same chains in different entries
• Determined by multiple labs
• Determined under different conditions
• Complexed with different partners
• Mutants
• Chains < 8700 at seq. id < 30%
– Orthologs, paralogs are very similar
• Using non-redundant chains from PDB
– PISCES server
– WHATIF, CATH, SCOP, DALI sets
• G. Wang and R. L. Dunbrack, Jr. PISCES: a protein sequence culling server. Bioinformatics, 19:1589-1591, 2003.
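The simplest kind of redundancy listed above, identical copies of a chain within or across entries, can be removed by grouping chains on their exact sequence. This is only a sketch of that one step: it does not reproduce PISCES, which culls by sequence-identity thresholds and therefore needs alignments. The function names and the input layout are assumptions made for illustration.

```python
from collections import defaultdict

def group_identical_chains(chains):
    """Map each distinct sequence to the (entry_id, chain_id) pairs that carry it.

    `chains` is an iterable of (entry_id, chain_id, sequence) tuples.
    """
    groups = defaultdict(list)
    for entry_id, chain_id, sequence in chains:
        # Compare sequences case-insensitively.
        groups[sequence.upper()].append((entry_id, chain_id))
    return dict(groups)

def representatives(chains):
    """Keep the first-seen chain for every distinct sequence."""
    return [ids[0] for ids in group_identical_chains(chains).values()]
```

Picking the first-seen chain is an arbitrary choice; a real culling procedure would normally prefer the copy with the best resolution or model quality.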
File formats at PDB
• The .pdb format
– Header
• Remarks
– experimental setup
– refinement details
– oligomeric state
– deviations from expected geometry
• Biochemical entities
– Biopolymers, het groups
– Coordinates
• 3D model of the entity
• Multiple coordinates for the same entity can exist
– MODELs, altloc identifiers
• Structure factors
– .cif file
File formats at PDB
XML
mmCIF
The PDB format: header
123456789+123456789+123456789+123456789+123456789+123456789+123456789+123456789+
HEADER    RETINOIC-ACID TRANSPORT                 28-SEP-94   1CBS      1CBS   2
COMPND    CELLULAR RETINOIC-ACID-BINDING PROTEIN TYPE II COMPLEXED      1CBS   3
COMPND   2 WITH ALL-TRANS-RETINOIC ACID (THE PRESUMED PHYSIOLOGICAL     1CBS   4
COMPND   3 LIGAND)                                                      1CBS   5
SOURCE    HUMAN (HOMO SAPIENS)                                          1CBS   6
SOURCE   2 EXPRESSION SYSTEM: (ESCHERICHIA COLI) BL21 (DE3)             1CBS   7
SOURCE   3 PLASMID: PET-3A                                              1CBS   8
SOURCE   4 GENE: HUMAN CRABP-II                                         1CBS   9
AUTHOR    G.J.KLEYWEGT,T.BERGFORS,T.A.JONES                             1CBS  10
REVDAT   1   26-JAN-95 1CBS    0                                        1CBS  11
Columns 1-6: record type
Columns 7-72: human-readable, mostly textual information
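The column layout quoted above (record type in columns 1-6, text in columns 7-72) maps directly onto fixed-width slicing. The sketch below assumes an old-style file like the 1CBS example, in which columns 73-80 hold the entry code and a line number; the function name and the returned keys are illustrative, not part of any library.

```python
def split_header_line(line):
    """Split one PDB header line into its fixed-width fields."""
    # Pad short lines so that slicing never runs past the end.
    padded = line.rstrip("\n").ljust(80)
    return {
        "record_type": padded[0:6].strip(),  # columns 1-6
        "text": padded[6:72].strip(),        # columns 7-72
        "trailer": padded[72:80].strip(),    # columns 73-80 (assumed old-style layout)
    }
```

Python slices are zero-based and exclude the end index, so columns 1-6 become `[0:6]` and columns 7-72 become `[6:72]`.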
The PDB format: coordinates
HETATM    1  C   ACE A   0       4.279  14.829  14.190  1.00 19.08           C
HETATM    2  O   ACE A   0       3.706  14.098  15.038  1.00 20.62           O
HETATM    3  CH3 ACE A   0       3.827  16.236  14.001  1.00 20.22           C
ATOM      4  N   MET A   1       5.514  14.621  13.695  1.00 17.77           N
ATOM      5  CA  MET A   1       6.269  13.401  13.959  1.00 16.51           C
ATOM      6  C   MET A   1       6.702  13.319  15.400  1.00 16.41           C
ATOM      7  O   MET A   1       7.036  12.248  15.870  1.00 15.38           O
ATOM      8  CB  MET A   1       7.529  13.301  13.085  1.00 16.52           C
ATOM      9  CG  MET A   1       7.292  12.805  11.676  1.00 16.48           C
Fields, left to right: atom nr, atom name, residue type, chain name, residue nr, X, Y, Z coordinates, occupancy, “B-factor”
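The coordinate fields named on the slide sit in fixed columns, so a record can be read by slicing rather than by splitting on whitespace (which fails when fields touch). This is a minimal sketch using the standard ATOM/HETATM column positions; the class and function names are made up for the example, and alternate-location and insertion-code columns are ignored.

```python
from dataclasses import dataclass

@dataclass
class AtomRecord:
    record: str
    serial: int
    name: str
    res_name: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float

def parse_atom_line(line):
    """Parse one fixed-width ATOM/HETATM line; return None for other records."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None
    return AtomRecord(
        record=record,
        serial=int(line[6:11]),         # columns 7-11: atom nr
        name=line[12:16].strip(),       # columns 13-16: atom name
        res_name=line[17:20].strip(),   # columns 18-20: residue type
        chain=line[21],                 # column 22: chain name
        res_seq=int(line[22:26]),       # columns 23-26: residue nr
        x=float(line[30:38]),           # columns 31-38
        y=float(line[38:46]),           # columns 39-46
        z=float(line[46:54]),           # columns 47-54
        occupancy=float(line[54:60]),   # columns 55-60
        b_factor=float(line[60:66]),    # columns 61-66
    )
```

For real work an established parser is the safer choice; the point here is only how the labelled fields map onto column ranges.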
Protein Data Bank in Europe
• PDBe
– European node of wwPDB
– Started 1996 as MSD at EBI
– Deposition site since 1999
– Started EMDB in 2002
• PDBe operations
– Handle deposition and annotation of PDB and EMDB entries
– Build advanced structure databases
– Build services for search, browsing, analysis
– Liaise with broader structural biology community
– Coordinate with other databases, e.g. Uniprot
• Funding
• PDBe: Protein Data Bank in Europe. S. Velankar et al., Nucleic Acids Research, doi:10.1093/nar/gkp916
PDBe Deposition and Annotation
• Checks
– Is the format correct?
– Are biopolymer sequences in biochemical entities consistent with 3D models?
– Are hetero groups named correctly?
– Where does the model deviate from expected geometry?
• Record various types of information
– Experiment: method, conditions, data resolution, spacegroup, completeness etc.
– Sample: source, expression system, engineered etc.
– Refinement: program, target
AutoDep Deposition Tool
AutoDep provides valuable information to depositors
• Validation of structure factors
– EDS criteria
• http://www.ebi.ac.uk/pdbe-xdep/autodep/index.jsp
AutoDep provides valuable information to depositors
Heterogen summary and Validation against ideal representations of ligands
AutoDep provides valuable information to depositors
Oligomeric state: PQS
Sequence-structure alignment: Uniprot, Pfam, Interpro
AutoDep provides valuable information to depositors
• Revisions, withdrawal, release
– Release sequence-only immediately
– Release coordinates immediately
– Hold for 1 year
– Release after publication
• Communication with depositors
– Help depositors understand and conform to PDB standards
– Discussing errors
PDBe Services
PISA, SSM / PDBeFold, PDBeMotif, PDBeChem, SIFTS, PDBeStatistics, PDBeSearch, PDBeView
PDBe Services
PDBe Services
PDBeView – the Atlas pages
• http://www.ebi.ac.uk/pdbe-srv/view/
PDBe Services
PDBeFold (SSM): has my fold been seen before, or is it novel?
[Figure labels: PDB, ???]
• E. Krissinel and K. Henrick, Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Cryst. (2004). D60, 2256-2268.
PDBe Services
PDBeFold (SSM)
• Why compare structures?
– Reveal conformational changes
• Ligands, mutations, crystal packing, pH…
– Judge structural variability
• NMR ensembles, structure families
– Discover common structural motifs
– Identify fold
– Infer function
• Sequence alignments do not work well for distant evolutionary relationships
• Structures diverge much more slowly than sequences
• Structure improves the quality of alignment
• Better inference of function, e.g. when active sites match well
• The relation between the divergence of sequence and structure in proteins. Chothia C, Lesk AM. EMBO J. 1986 Apr;5(4):823-6.
PDBe Services
PDBeFold (SSM) algorithm
[Diagram: two graphs of secondary-structure elements (SSEs), one with helices H1-H2 and strands S1-S4, the other with helices H1-H6 and strands S1-S7]
Match SSE graphs to get initial alignment
Iterative expansion of Cα-alignment
PDBe Services
PDBeFold (SSM)
SSM can carry out genuine multiple structure alignment to reveal a motif common to a family of structures
PDBe Services
PDBePISA
• What is the likely biological assembly of a given structure?
• Can I learn about it from crystal-packing of chains?
[Diagram: PDB file (ASU) + crystal symmetry → PISA → biological unit]
PISA: generate possible assemblies, rank according to free energy
PDBe Services
PDBePISA
PDB entry 1P30: a monomer?
Biological unit 1P30: homotrimer!
PDBe Services
PDBePISA
PDB entry 2TBV: a trimer?
Biological unit 2TBV: 180-mer!
PDBe Services
PDBePISA
PDB entry 1E94
2 biological units in 1E94: a dodecamer and a hexamer!
PDBe Services
PDBeMotif
• A very powerful engine to search PDB
• Structure-sequence general searches
• Chemical substructure
• Predefined frequent motifs
• Arbitrary secondary structure patterns
• φ/ψ patterns
• Protein sequences
• Prosite motif, Uniprot, CSA accessions
• Raw sequence
• Regular expression
• Interactions between ligands, protein
• Seq-distance between protein motifs
• PDB header searches
• Specialized searches
• Environment around an interaction
• Motif binding
• Occurrence of a motif inside another
• MSDmotif: exploring protein sites and motifs. Adel Golovin and Kim Henrick. BMC Bioinformatics 2008, 9:312
PDBe Services
PDBeMotif: which motif does my substructure bind often?
[Figure label: Staurosporine, kinase inhibitor]
PDBe Services
PDBeMotif: which ligands and chemical fragments does my sequence motif bind?
Tyrosine protein kinase-specific active-site signature:
[LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)
Motif binding statistics
Chemical fragments
PDBe Services
PDBeMotif: what does a sequence motif look like in 3D?
Tyrosine protein kinase-specific active-site signature:
[LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)
Sequence hits 3D alignment
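The signature above is written in PROSITE pattern syntax, which translates almost mechanically into a regular expression for finding sequence hits. The converter below is a minimal sketch: it handles `x`, `[...]`, `{...}` and repeat counts such as `(3)` or `(2,4)`, but not the `<` and `>` terminus anchors. The function name is made up, and this is not how PDBeMotif itself is implemented.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        # Separate an optional repeat suffix such as "(3)" or "(2,4)".
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = match.group(1), match.group(2)
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        # "[ABC]" sets and single letters pass through unchanged.
        parts.append(core + ("{" + repeat + "}" if repeat else ""))
    return "".join(parts)

# The tyrosine kinase active-site signature quoted on the slide.
SIGNATURE = "[LIVMFYC]-{A}-[HY]-x-D-[LIVMFY]-[RSTAC]-{D}-{PF}-N-[LIVMFYC](3)"
SIGNATURE_REGEX = prosite_to_regex(SIGNATURE)
```

`re.finditer(SIGNATURE_REGEX, sequence)` then yields every hit of the motif in a protein sequence given as one-letter codes.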
PDBe Services
PDBeMotif: which sequences often host a Ramachandran path?
3D fragment
φ/ψ sequence
-156/-155,-103/17,-134/161
Search Sequence pattern
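A φ/ψ sequence such as the one above can be searched for in a backbone by sliding a window along the per-residue angles and comparing them with a tolerance. This sketch assumes the angles have already been computed; the 30° default tolerance and the function names are arbitrary illustrative choices, not PDBeMotif's actual search parameters.

```python
def parse_phi_psi(text):
    """Parse 'phi/psi,phi/psi,...' into a list of (phi, psi) float pairs."""
    return [tuple(float(v) for v in pair.split("/")) for pair in text.split(",")]

def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (periodic)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def find_paths(backbone, query, tolerance=30.0):
    """Return start indices where `query` matches consecutive residues of `backbone`."""
    hits = []
    for start in range(len(backbone) - len(query) + 1):
        window = backbone[start:start + len(query)]
        if all(angle_diff(phi, q_phi) <= tolerance and angle_diff(psi, q_psi) <= tolerance
               for (phi, psi), (q_phi, q_psi) in zip(window, query)):
            hits.append(start)
    return hits
```

Comparing angles periodically matters because values near +180° and -180° are close neighbours on the Ramachandran plot. The sequences of the matched residues can then be collected to see which sequence patterns host the path.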
PDBe Services
PDBeAnalysis: selections and statistics
• Structure Statistics
• Frequency plots on 1 or 2 properties of entries
• Residue Statistics
• Choose residues and make frequency plots of a property
• Choose residues in an entry meeting certain filters, and plot their property
• Atom Statistics
• Choose atom-sets in entries and plot distances, angles, dihedrals between them
• Structure Selection
• Create a subset of entries using various filters
• Database Browser
• Web-based SQL query page to internal database
• Geometric Validation coupled with 3D viewer
• http://www.ebi.ac.uk/pdbe-as/pdbevalidate/
PDBe Services
PDBeAnalysis: selections and statistics
Resolution vs R-factor
CA1-CA2-CA3-CA4 torsion distribution
Low res
High res
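A torsion such as CA1-CA2-CA3-CA4 is the dihedral angle defined by four consecutive points. The sketch below uses the standard atan2 formulation in plain Python; the function names are illustrative, and the points are expected as (x, y, z) tuples.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion(p1, p2, p3, p4):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)   # normal of the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Applying `torsion` to every run of four consecutive Cα positions in a chain gives the values that such a distribution plot would bin. Using atan2 rather than an arccosine keeps the sign of the angle and stays numerically stable near 0° and 180°.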
PDBe Services
PDBeAnalysis: geometric validation
Table and plot of geometric checks: phi-psi, chi, omega, B-value, bonds, angles, chiralities
AstexViewer coordinated with plots
PDBe Community Work
• X-ray
– CCP4 software: MMDB, PISA, SSM, harvesting
– Validation Task Force
• NMR
– CCP-NMR software
– Validation Task Force
• EM
– Validation and standards
– Ongoing software development
• SIFTS - coordinating with other biodatabases
• CAPRI - Provide infrastructure for submission and maintenance of entries
• PiMS – Information management system for protein crystallography experiments
PDBe Community Work
• EuroCarbDB
– Databases and bioinformatic tools in glycobiology and glycomics
• BIObar
– A toolbar for browsing biological data and databases, a Mozilla plugin for your browser
• Outreach and training
– Roadshows: invite us!
– Tutorials
PDBe Services: Future Emphasis
• To go from being a historic structural archive to a valuable resource for structural biomedicine
• PDBeXplore
– Provide relevant, interesting avenues to access structural information
– Ligands, Assemblies, Enzymes, GO, CATH, Sequences, Publications, Pathways
• PDBe Validation Resource
– Provide a comprehensive battery of validation tools during deposition and to the end-user
– Migrate and enhance EDS server
– Partner with CCDC to bring cutting-edge ligand validation
Summary
• Structural Bioinformatics and Biocomputing are essential to fulfilling the promise of structural biology
• Databases are indispensable to all aspects of structural bioinformatics
• PDB is the primary repository of structures, and numerous databases are developed based on PDB.
• PDBe provides high-quality services to depositors and end-users, and is an active member of the structure-determination community.
• PDBe is open to all suggestions to make our services better and more relevant to your work.
Acknowledgements
• Alejandro and organizers at IPMont
• PDBe group
– Sameer Velankar, Jawahar Swaminathan
• Designers, developers, maintainers of various structural databases at PDBe and elsewhere