Increasing the Value of Crystallographic Databases
• Derived knowledge bases
• Knowledge-based applications programs
• Data mining tools for protein-ligand complexes
Mogul
• Knowledge base of molecular geometry information taken from CSD
• Bond length, valence angle and torsion angle distributions
• Aim: click on a molecular parameter of interest and get observed distribution with no intervening steps
Mogul - Search Setup
User loads a molecule, then specifies a bond length, bond angle or torsion angle of interest
Mogul - Search Algorithm
Substructures are stored in a hierarchical tree. For a fragment A-B-C-D, the tree is keyed on:
• Properties of B, C
• Properties of A-B & C-D bonds
• Properties of atoms bound to B and C
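The tree lookup can be pictured as a nested dictionary that is descended one level at a time. The following is a minimal sketch only: the three key levels follow the slide, but the key strings, function names and torsion values are hypothetical and do not represent Mogul's actual data structure.

```python
from collections import defaultdict


def make_tree():
    # Three key levels (assumed): central atoms -> flanking bonds -> neighbour atoms.
    return defaultdict(lambda: defaultdict(lambda: defaultdict(list)))


def add_observation(tree, central, bonds, neighbours, value):
    """Store one observed parameter value under its three-level key."""
    tree[central][bonds][neighbours].append(value)


def lookup(tree, central, bonds, neighbours):
    """Return the observed values for an exact match at all three levels.

    Uses .get() so that a failed lookup does not create empty branches.
    """
    level1 = tree.get(central, {})
    level2 = level1.get(bonds, {})
    return list(level2.get(neighbours, []))


# Hypothetical keys and made-up torsion values, for illustration only.
tree = make_tree()
add_observation(tree, "C.sp3-C.sp3", "single,single", "H,H|H,H", 60.0)
add_observation(tree, "C.sp3-C.sp3", "single,single", "H,H|H,H", 180.0)
add_observation(tree, "C.ar-C.sp3", "aromatic,single", "C,C|H,H", 90.0)
```

Descending by increasingly specific keys means a query touches only the branch that matches its own chemistry, which is one plausible reason for choosing a tree here.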
IsoStar and SuperStar
• IsoStar - knowledge base of information about intermolecular interactions
• SuperStar - program for predicting binding points in an enzyme active site
• SuperStar predictions based solely on IsoStar data
CSD vs. PDB scatterplots
[Figure: Similarity index distribution for 72 comparisons. x-axis: Carbo index (0.1 to 1); y-axis: number of density plots (0 to 30). Series: any aromatic CH, any aliphatic CH, water, any C=O, any OH, any NH.]
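For reference, the Carbo index between two maps A and B sampled on the same grid is commonly defined as the sum of a_i·b_i divided by the square root of (sum of a_i²)·(sum of b_i²). The sketch below assumes the slide uses this standard definition and that each density map is a flat sequence of grid values; the function name is illustrative.

```python
import math


def carbo_index(a, b):
    """Carbo similarity index between two maps given as equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("maps must have the same number of grid points")
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if denominator == 0.0:
        raise ValueError("at least one map is all zeros")
    return numerator / denominator
```

The index depends only on the shape of the maps, not their overall scale, so a CSD map and a PDB map with different numbers of contacts can still be compared directly.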
Scaling of IsoStar Surfaces
• Densities d_i of grid point i are converted to propensities p_i by:
  $p_i = \frac{d_i}{d_{av}}$
• Average density d_av is the density of contacts expected by random chance:
  $d_{av} = \sum_{i=1}^{n} \frac{n_{contact,i}}{n_{central,i}\, V_{cell,i}}$
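The density-to-propensity scaling is a single division per grid point. A minimal sketch, assuming the densities are held in a plain list and the average density has already been computed; the function name is illustrative.

```python
def propensities(densities, d_av):
    """Convert grid-point densities to propensities by dividing by the average density."""
    if d_av <= 0:
        raise ValueError("average density must be positive")
    return [d / d_av for d in densities]


# Made-up densities: a value above 1 means the contact occurs more
# often at that grid point than expected by random chance.
example = propensities([0.0, 0.5, 2.0], d_av=0.5)
```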
SuperStar
• Calculate binding positions for specific probe atoms in protein active sites
• Identify functional groups in binding-site
• Look up relevant IsoStar scatterplots and overlay on functional groups
• Contour - combining by taking products
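Combining by taking products can be sketched as a pointwise multiplication of the individual propensity maps. This sketch assumes every map has already been resampled onto the same grid and flattened to a list; how SuperStar treats grid points covered by only some of the maps is not stated in the slides and is not modelled here.

```python
import math


def combine_maps(maps):
    """Pointwise product of propensity maps defined on the same grid."""
    if not maps:
        raise ValueError("need at least one map")
    n_points = len(maps[0])
    if any(len(m) != n_points for m in maps):
        raise ValueError("all maps must have the same number of grid points")
    # zip(*maps) yields, for each grid point, the values from every map.
    return [math.prod(values) for values in zip(*maps)]


# Two made-up maps from neighbouring functional groups.
combined = combine_maps([[1.0, 2.0, 0.5], [3.0, 0.5, 2.0]])
```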
SuperStar Features
• Cavity detection
• Surface or pharmacophore point display
• Metal coordination
• Hyperlinking to IsoStar scatterplots
• Choice of CSD- or PDB-based maps
• Gaussian fits
SuperStar Validation
• 265 PDB complexes
• Generate four maps (Me, C=O, NH, OH)
• See whether maps discriminate correctly, e.g. does Me have highest propensity where a ligand Me group is observed?
• Compute percentage success rate
• CSD 74%
• PDB 75%
• Gaussian CSD 70 - 74%
• PDB maps fuzzier, fewer probes possible
• Gaussian 4-5 times faster
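The success-rate test can be sketched as follows: for each observed ligand group, read the propensity of each of the four probe maps at that position and count a success when the map for the observed group type scores highest. The data layout and the propensity values are hypothetical, and ties are resolved in favour of the first probe listed, which is an assumption of this sketch.

```python
def success_rate(observations):
    """Percentage of observed groups whose own probe map has the highest propensity.

    observations: list of (observed_probe, {probe_name: propensity}) pairs.
    """
    if not observations:
        raise ValueError("no observations")
    hits = 0
    for observed, scores in observations:
        best = max(scores, key=scores.get)  # probe with the highest propensity
        if best == observed:
            hits += 1
    return 100.0 * hits / len(observations)


# Made-up propensities at four observed ligand-group positions.
obs = [
    ("Me",  {"Me": 2.1, "C=O": 0.4, "NH": 0.9, "OH": 1.0}),
    ("NH",  {"Me": 1.5, "C=O": 0.2, "NH": 1.1, "OH": 0.3}),  # Me wins: a miss
    ("OH",  {"Me": 0.3, "C=O": 0.8, "NH": 1.2, "OH": 2.5}),
    ("C=O", {"Me": 0.6, "C=O": 3.0, "NH": 0.1, "OH": 0.2}),
]
rate = success_rate(obs)
```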
Relibase+
• Protein-ligand database system
• Based on original software developed by Manfred Hendlich and colleagues at Merck and Marburg University
• Enables searching of PDB and of in-house proprietary databases
Some Relibase+ Options
• Text searching
• Sequence searching
• 2D substructure and similarity searching
• 3D substructure searching
• Logical combination of hit lists
• Searching for intermolecular interactions
• Auto-superposition of similar binding sites
• Scripting facility based on Python
Analysis of 3D Queries
Distance Distribution
Torsion Distribution
Benzamidine-Carboxylate Interactions
Example Python Script

    # Find all benzamidines
    # and check contacts to ASP under 3Å
    relibase.load('dbase1')
    ba = relibase.Hitlist({'smiles':'c1ccccc1C(=N)N'})
    new = relibase.Hitlist()
    for ligand in ba:
        for chain in ligand.contacts():
            for residue in chain.residues():
                if residue.name() == 'ASP':
                    ligatoms = ligand.atoms()
                    resatoms = residue.atoms()
                    d = mindist(ligatoms,resatoms)
                    if d < 3.0:
                        new.append(ligand)
    new.saveas('contact')
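The script relies on a mindist helper that the slide does not define. A minimal stand-alone sketch of such a helper is given below, assuming each atom is represented simply as an (x, y, z) tuple in Å; the real Relibase+ atom objects and the real helper may well differ.

```python
import math


def mindist(atoms_a, atoms_b):
    """Smallest Euclidean distance between any atom in atoms_a and any atom in atoms_b.

    Each atom is assumed to be an (x, y, z) coordinate tuple.
    """
    if not atoms_a or not atoms_b:
        raise ValueError("both atom lists must be non-empty")
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


# Made-up coordinates: the closest pair is 2.5 Å apart, under the 3 Å cutoff.
ligand_atoms = [(0.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
residue_atoms = [(0.0, 0.0, 2.5), (10.0, 10.0, 10.0)]
d = mindist(ligand_atoms, residue_atoms)
```

This brute-force version compares every pair of atoms, which is adequate for a single ligand and residue.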