The Protein Data Bank: Evolution of a key resource in biology

Helen M. Berman, September 9, 2010


Page 1

The Protein Data Bank: Evolution of a key resource in biology

Helen M. Berman

September 9, 2010

Page 2

What is the Protein Data Bank?

Single international archive for all information about the structure of large biological molecules (>67,000 entries)

Archival database with hundreds of thousands of users who depend on the data

Used by structural biologists, computational biologists, biophysicists, biochemists, geneticists, cell biologists, molecular biologists, educators, students, general public

Page 3

Early structures

1960s: Protein crystallography begins to take off

Emerging interest in protein folding

Use of computer graphics to represent structure

Nobel Prize awarded for the first 3D protein structures: myoglobin and hemoglobin

[Figure: structures of lysozyme, hemoglobin, ribonuclease, and myoglobin]

Myoglobin: Kendrew, Bodo, Dintzis, Parrish, Wyckoff, Phillips (1958) Nature 181, 662-666

Hemoglobin: Perutz (1962) Proc. R. Soc. A265, 161-187

Lysozyme: Blake, Koenig, Mair, North, Phillips, Sarma (1965) Nature 206, 757

Ribonuclease: Kartha, Bello, Harker (1967) Nature 213, 862-865; Wyckoff, Hardman, Allewell, Inagami, Johnson, Richards (1967) J. Biol. Chem. 242, 3753-3757

Page 5

PDB Depositors

PDB Access: PDB FTP & RSYNC Traffic (July 2009 – June 2010)

RCSB PDB: 173,416,704 data downloads

PDBe: 32,344,547 data downloads

PDBj: 14,053,071 data downloads

Page 6

PDB History

1970s: Community discussions about a protein structure archive

Cold Spring Harbor meeting on protein crystallography

PDB established at Brookhaven (Oct 1971; 7 structures)

1980s: Number of structures increases as technology improves

Community discussions about requiring depositions

IUCr guidelines established

Number of structures deposited increases

Page 7

PDB History

1990s: mmCIF standard created

Structural genomics begins

PDB moves to RCSB PDB

2000s: wwPDB formed

New methods for structure determination

Demand for new validation standards

Page 8

wwPDB

Formalization of current working practice

MOU signed July 1, 2003

Announced in Nature Structural Biology November 21, 2003

Page 9

wwPDB guidelines and responsibilities

All members issue PDB IDs and serve as distribution sites for data

One member is the archive keeper (RCSB PDB)

All format documentation publicly available

Strict rules for redistribution of PDB files

All sites can create their own websites

Page 10

Community involvement at every step

Formation of the resource

Guidelines for deposition

Standards for the data

Global cooperation

Page 11

Contributing factors for success

The science that is being archived must be important enough for people to want to access results

The technology for data archiving must be continually evaluated and changed as IT changes

The creation of an international organization recognizes the fact that science is global

Understanding sociological issues of both the data users and the data producers

Attribution of the work of data producers

Page 12

Wellcome Trust, EU, CCP4, BBSRC, MRC, EMBL

NLM

BIRD-JST, MEXT

NSF, NIGMS, DOE, NLM, NCI, NINDS, NIDDK