Ligand Building with ARP/wARP
Automated Model Building
Given the native X-ray diffraction data and a phase set
To rapidly deliver a complete, accurate and error-free model
Building Ligands from Dummy Atoms / Seed Points
Back to about 2000: a side project for a PhD student
Nearest Neighbour Distance Distribution
Shake
Given a coordinate error, the inter-atomic distances in a protein model change:

$$
f(d_{jk}^{obs}) = \frac{1}{\sigma_m \sqrt{\pi}} \, \frac{d_{jk}^{obs}}{d_{jk}^{tar}} \, \exp\!\left( -\frac{(d_{jk}^{obs})^2 + (d_{jk}^{tar})^2}{4\sigma_m^2} \right) \sinh\!\left( \frac{d_{jk}^{obs} \, d_{jk}^{tar}}{2\sigma_m^2} \right)
$$

[Plot: f(d_obs) against d_obs from 0 to 4 Å; vertical axis from 0 to 0.6]
Error-free distance d_tar is 1.5 Å
Expected rmsd is 1.0 Å

$$
N(d_{ij}^{tar}, 2\sigma_m^2)
$$
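The nearest-neighbour distance distribution on this slide can be checked numerically. The sketch below is only an illustration: the function names, the value sigma_m = 0.5 Å and the integration grid are assumptions, not values from the slides. The product of the exponential and sinh terms is rewritten as a difference of two Gaussians, which is algebraically identical and avoids overflow for small sigma_m.

```python
import math

def distance_pdf(d_obs, d_tar, sigma_m):
    """Density of an observed distance d_obs given an error-free distance d_tar.

    f = d_obs / (d_tar * sigma_m * sqrt(pi))
        * exp(-(d_obs**2 + d_tar**2) / (4 * sigma_m**2))
        * sinh(d_obs * d_tar / (2 * sigma_m**2))

    The exp * sinh product equals
    0.5 * (exp(-(d_obs - d_tar)**2 / (4 s^2)) - exp(-(d_obs + d_tar)**2 / (4 s^2))).
    """
    four_s2 = 4.0 * sigma_m ** 2
    gauss_diff = (math.exp(-(d_obs - d_tar) ** 2 / four_s2)
                  - math.exp(-(d_obs + d_tar) ** 2 / four_s2))
    return d_obs / (d_tar * sigma_m * math.sqrt(math.pi)) * 0.5 * gauss_diff

def integrate_pdf(d_tar, sigma_m, upper=20.0, steps=20000):
    """Trapezoidal integral of distance_pdf over [0, upper]."""
    h = upper / steps
    total = 0.5 * (distance_pdf(0.0, d_tar, sigma_m)
                   + distance_pdf(upper, d_tar, sigma_m))
    total += sum(distance_pdf(i * h, d_tar, sigma_m) for i in range(1, steps))
    return total * h

if __name__ == "__main__":
    # d_tar = 1.5 Å is taken from the slide; sigma_m = 0.5 Å is an assumed value.
    print("integral of f:", integrate_pdf(1.5, 0.5))
```

If the reconstruction of the formula is right, the integral should come out close to 1 for any positive d_tar and sigma_m, which is a convenient sanity check on the prefactor.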
Fit that into that!
Building a Ligand into a Difference Map
Imagine:
a ligand consisting of N atoms
a density map containing M points
the only thing to do is to correctly select N out of M!
A Simple Example: Select 3 out of 4
• The task is to find an equilateral triangle
• Prior knowledge: edges should have a length of 1.0 Å
• Reliability: error on data (distances) is 0.01 Å

[Figure: four points labelled a, b, c, d]

Observed distances in Å (upper triangle) and their deviations from 1.0 Å in units of the 0.01 Å error (lower triangle):

      a      b      c      d
a     0      1.07   0.98   1.01
b     7      0      0.85   2.10
c     2      15     0      0.95
d     1      110    5      0

Triangle   Log likelihood   Probability
abc        -278             2.0 × 10^-108
abd        -12150           0
bcd        -12350           0
acd        -30              0.9999
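The numbers in the four-point example can be reproduced with a few lines of Python. This is a sketch under one assumption inferred from the table rather than stated on the slide: the log likelihood of a triangle is taken as minus the sum of squared deviations of its three edges, in units of the 0.01 Å error (without a factor of 1/2), because that is what matches the listed scores. All names are illustrative.

```python
import math
from itertools import combinations

TARGET = 1.0   # prior knowledge: edge length in Å
SIGMA = 0.01   # error on the observed distances in Å

# Observed distances (Å), copied from the upper triangle of the table.
D_OBS = {
    ("a", "b"): 1.07, ("a", "c"): 0.98, ("a", "d"): 1.01,
    ("b", "c"): 0.85, ("b", "d"): 2.10, ("c", "d"): 0.95,
}

def log_likelihood(triangle):
    """Minus the sum of squared edge deviations, in units of SIGMA (assumed form)."""
    return -sum(((D_OBS[edge] - TARGET) / SIGMA) ** 2
                for edge in combinations(triangle, 2))

def rank_triangles(points="abcd"):
    """Return {triangle: (log likelihood, normalised probability)}."""
    scores = {"".join(t): log_likelihood(t) for t in combinations(points, 3)}
    # Subtract the best score before exponentiating so the sum does not underflow.
    top = max(scores.values())
    norm = sum(math.exp(s - top) for s in scores.values())
    return {name: (s, math.exp(s - top) / norm) for name, s in scores.items()}

if __name__ == "__main__":
    for name, (score, prob) in rank_triangles().items():
        print(f"{name}  {score:10.1f}  {prob:.3g}")
```

Up to floating-point rounding, the scores come out as -278, -12150, -12350 and -30, and the relative probability of abc is about 2 × 10^-108, so acd is the only plausible choice.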
Ligand Building as a Label Swapping Problem
N atoms in the ligand molecule
M points in a density map
[Figure: labels W X Y Z and A B C D]

$$
Q_{assignment} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \log\left[ P\left( d_{ij}^{obs} \mid d_{ij}^{assigned}, \mathrm{error\_model} \right) \right]
$$

• Sources of possible prior information:
  – Chemical composition of a ligand
  – Bonding distances
  – Angle-bonded distances
  – Chirality
  – VdW interactions
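A minimal sketch of how an assignment score of this kind could be evaluated, and how swapping two labels changes it. Several things here are assumptions made for illustration only: a Gaussian error model with a single sigma stands in for the unspecified error_model, atoms are called A to D and map points W to Z, and the coordinates describe a hypothetical linear chain with 1.5 Å bonds.

```python
import math

def q_assignment(assignment, points, d_target, sigma):
    """Sum of log P(d_obs | d_assigned) over the atom pairs listed in d_target.

    assignment: dict, atom label -> map point label
    points:     dict, map point label -> (x, y, z) in Å
    d_target:   dict, (atom_i, atom_j) -> expected distance in Å
    sigma:      standard deviation of the assumed Gaussian error model
    """
    q = 0.0
    for (atom_i, atom_j), d_tar in d_target.items():
        d_obs = math.dist(points[assignment[atom_i]], points[assignment[atom_j]])
        q += (-0.5 * ((d_obs - d_tar) / sigma) ** 2
              - math.log(sigma * math.sqrt(2.0 * math.pi)))
    return q

def swap_labels(assignment, atom_i, atom_j):
    """Return a copy of the assignment with the points of two atoms exchanged."""
    swapped = dict(assignment)
    swapped[atom_i], swapped[atom_j] = assignment[atom_j], assignment[atom_i]
    return swapped

# Hypothetical toy data: four collinear map points, 1.5 Å apart.
POINTS = {"W": (0.0, 0.0, 0.0), "X": (1.5, 0.0, 0.0),
          "Y": (3.0, 0.0, 0.0), "Z": (4.5, 0.0, 0.0)}
D_TARGET = {("A", "B"): 1.5, ("B", "C"): 1.5, ("C", "D"): 1.5,
            ("A", "C"): 3.0, ("B", "D"): 3.0, ("A", "D"): 4.5}
GOOD = {"A": "W", "B": "X", "C": "Y", "D": "Z"}
SWAPPED = swap_labels(GOOD, "B", "C")

if __name__ == "__main__":
    print("Q good:   ", q_assignment(GOOD, POINTS, D_TARGET, 0.1))
    print("Q swapped:", q_assignment(SWAPPED, POINTS, D_TARGET, 0.1))
```

A label-swapping search would accept or reject such swaps depending on whether Q improves. Further prior information (chirality, VdW terms) would enter as extra terms in the sum.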
Combinatorial Explosion
$$
\frac{N_{points}!}{\left( N_{points} - N_{atoms} \right)!}
$$
Label Swapping
Initial map: 349 grid points, complexity 10^59
Sparse map: 58 grid points, complexity 10^37
22-atom molecule of retinoic acid
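The number of ways to place N_atoms distinct labels on N_points map points, N_points! / (N_points - N_atoms)!, can be evaluated exactly with integer arithmetic. The helper name below is an assumption for illustration; `math.perm` is part of the Python standard library (3.8 and later).

```python
import math

def n_assignments(n_points, n_atoms):
    """Ordered selections of n_atoms points out of n_points: n_points! / (n_points - n_atoms)!"""
    return math.perm(n_points, n_atoms)

if __name__ == "__main__":
    # Sparse map from the slide: 58 grid points, 22-atom ligand.
    sparse = n_assignments(58, 22)
    print(f"sparse map: about 10^{math.log10(sparse):.1f} assignments")
```

For the sparse map this gives a value on the order of 10^37, in line with the complexity quoted on the slide, which is why exhaustive enumeration is not an option.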
Topological Extension (a branch and bound approach)
Retinoic acid - topological extension
Topology of the sparse map Topology of the ligand
[Figure: five panels, each showing the points a, b, c, d]
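To illustrate the branch and bound idea in general terms, the sketch below assigns ligand atoms to map points one at a time and abandons a partial assignment as soon as its cost can no longer beat the best complete one. It is a generic textbook-style search, not the ARP/wARP topological extension itself. The cost function (sum of squared distance deviations in units of sigma) and all names are assumptions, and the data are the four points a to d (indices 0 to 3) from the triangle example.

```python
def branch_and_bound(d_obs, d_tar, sigma):
    """Assign each atom to a distinct point, minimising the sum of squared deviations.

    d_obs: square matrix of observed point-point distances (Å)
    d_tar: square matrix of target atom-atom distances (Å)
    Returns (tuple of point indices, one per atom; cost in sigma^2 units).
    """
    n_atoms, n_points = len(d_tar), len(d_obs)
    best = {"cost": float("inf"), "assignment": None}

    def extend(assignment, cost):
        # Bound: costs only grow, so this branch cannot improve on the best found.
        if cost >= best["cost"]:
            return
        atom = len(assignment)
        if atom == n_atoms:
            best["cost"], best["assignment"] = cost, tuple(assignment)
            return
        # Branch: try every unused point for the next atom.
        for point in range(n_points):
            if point in assignment:
                continue
            extra = sum(((d_obs[point][assignment[k]] - d_tar[atom][k]) / sigma) ** 2
                        for k in range(atom))
            assignment.append(point)
            extend(assignment, cost + extra)
            assignment.pop()

    extend([], 0.0)
    return best["assignment"], best["cost"]

# Distances between points a, b, c, d (Å) and an equilateral target triangle.
D_OBS = [
    [0.00, 1.07, 0.98, 1.01],
    [1.07, 0.00, 0.85, 2.10],
    [0.98, 0.85, 0.00, 0.95],
    [1.01, 2.10, 0.95, 0.00],
]
D_TAR = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]

if __name__ == "__main__":
    assignment, cost = branch_and_bound(D_OBS, D_TAR, 0.01)
    print(sorted(assignment), cost)
```

Because the target triangle is symmetric, several orderings of the same three points score equally, so only the set of selected points is meaningful here. Any pair involving both b and d is pruned almost immediately because of the 2.10 Å distance.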
Real Space Fit for Final Selection of the Model
22-atom molecule of retinoic acid, among 100 “top” models:
21 are less than 0.5 Å r.m.s.d. from the final model
the “best” model is 0.14 Å r.m.s.d. from the final model
Ligand Building Module in ARP/wARP 6.1
MTZ file
Protein without ligand
Ligand
Take the largest object in the difference map
Build the ligand there (label assignment)
Real space refinement of the ligand
Ligand Building Module in ARP/wARP 6.1

                                              Location unknown       Location known
Single known ligand                           Yes (if the largest)   No
A ligand out of the list of expected ligands  No                     No
Partially ordered ligand                      No                     No
Working sample
Ligand building
Performance Assessment
Run with default parameters
- PDB and MTZ from the EDS
- Ligand PDB from HICUP
- Exclude DNA
- Exclude ligands covalently bound to the chain
- Exclude ligands with partial occupancies
(3821 structures)
Large-Scale Test
[Figure: points numbered 1 to 9; panels: Name-by-name, Nearest neighbour]
Assume the PDB structure to be correct
Atomic scale (correctly built ligand into correct site)
Ligand scale (correct site, incorrectly built ligand)
Protein scale (incorrect site)
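The two comparison modes named on this slide can be sketched as two r.m.s.d. calculations between a built ligand and the deposited one. This is an assumed reading of "name-by-name" and "nearest neighbour", with illustrative function names and made-up coordinates. The slides do not give the cut-offs that separate the atomic, ligand and protein scales, so none are used here.

```python
import math

def rmsd_by_name(built, reference):
    """R.m.s.d. over atoms paired by identical atom name."""
    sq = [math.dist(built[name], reference[name]) ** 2 for name in reference]
    return math.sqrt(sum(sq) / len(sq))

def rmsd_nearest_neighbour(built, reference):
    """R.m.s.d. with each reference atom paired to the closest built atom, ignoring names."""
    sq = [min(math.dist(xyz, b) for b in built.values()) ** 2
          for xyz in reference.values()]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical three-atom fragment; the built model has O1 and O2 exchanged.
REFERENCE = {"C1": (0.0, 0.0, 0.0), "O1": (1.2, 0.0, 0.0), "O2": (-0.6, 1.0, 0.0)}
BUILT = {"C1": (0.0, 0.0, 0.0), "O1": (-0.6, 1.0, 0.0), "O2": (1.2, 0.0, 0.0)}

if __name__ == "__main__":
    print("name-by-name:     ", rmsd_by_name(BUILT, REFERENCE))
    print("nearest neighbour:", rmsd_nearest_neighbour(BUILT, REFERENCE))
```

With swapped labels the nearest-neighbour value is zero while the name-by-name value is large, which is one way to tell a ligand placed in the right site but labelled differently from one built in the wrong place.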
Accuracy of Ligand Building Process
Size of the Largest Ligand in the Working Sample
2981 structures with ligand size 7
3821 structures
Dependence on Resolution of the Data
Dependence on Ligand Disorder: B factors
Dependence on Ligand Disorder: R.m.s.d. (Ligand_Bfactors)
Dependence on Ligand Size
What is the Ligand Site / Largest Object?
Typically, it is the largest set (cluster) of connected map points where the density is above a threshold.
In most cases, however, different thresholds give different (and even non-overlapping) clusters.
At each density threshold, count the number of clusters.
The count typically reaches a maximum at a density level of about 1.5 sigma.
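Counting clusters at a series of thresholds amounts to connected-component labelling of the grid points above each threshold. The sketch below uses a plain breadth-first search with 6-connectivity on a toy density. The function names, the connectivity choice and the density values are assumptions, and crystallographic symmetry and periodic boundaries are ignored.

```python
from collections import deque

NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def clusters_above(density, threshold):
    """Connected clusters of grid points whose density exceeds the threshold.

    density: dict, (i, j, k) grid index -> density in sigma units
    Returns a list of sets of grid indices, largest cluster first.
    """
    selected = {p for p, rho in density.items() if rho > threshold}
    clusters = []
    while selected:
        seed = selected.pop()
        cluster, queue = {seed}, deque([seed])
        while queue:
            i, j, k = queue.popleft()
            for di, dj, dk in NEIGHBOURS:
                q = (i + di, j + dj, k + dk)
                if q in selected:
                    selected.remove(q)
                    cluster.add(q)
                    queue.append(q)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

def cluster_counts(density, thresholds):
    """Number of clusters at each threshold."""
    return {t: len(clusters_above(density, t)) for t in thresholds}

# Toy density: two strong peaks joined by a weak bridge, plus one isolated point.
TOY = {(0, 0, 0): 3.0, (1, 0, 0): 1.0, (2, 0, 0): 3.0, (5, 5, 5): 2.0}

if __name__ == "__main__":
    print(cluster_counts(TOY, [0.5, 1.5, 2.5, 3.5]))
```

In the toy map the bridge disappears at the second threshold and one cluster splits in two, so the count peaks at an intermediate level before the weaker features vanish. Recording which cluster each fragment came from is what gives a fragmentation tree.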
Density Clusters and a Fragmentation Tree
1ED5 (nitric oxide synthase), 1.8 Å resolution, R-factor 21% (with CNS)
Ligands: 2 x HEM and NGR (N-omega-nitro-L-arginine)
Fragmentation Tree: an Example
Looking for HEM, finding HEM
Scoring of Density Clusters
Looking for NGR, finding NGR
Looking for NGR, finding HEM
Looking for HEM, finding NGR
Selection of Correct Density Cluster
Other Lessons ?
Ligand Building: ARP/wARP 6.1 and perspectives

                                              Location unknown             Location known
                                              6.1 → perspective            6.1 → perspective
Single known ligand                           Yes (if the largest) → Yes   No → Yes
A ligand out of the list of expected ligands  No → Yes                     No → Yes
Partially ordered ligand                      No → No                      No → Maybe
ARP/wARP - the people
Developers
EMBL Hamburg: Guillaume Evrard, Johan Hattne, Gerrit Langer, Venkat Parthasarathy, Tilo Strutz, Victor Lamzin and many in-house friends
NKI Amsterdam: Serge Cohen, Diederick De Vries, Marouane Jelloul, Krista Joosten, Tassos Perrakis
Former members and collaborators
Richard Morris, Peter Zwart, Francisco Fernandez, Olga Kirillova, Matheos Kakaris, Gleb Bourenkov, Garib Murshudov, Alexei Vagin, Andrey Lebedev, Peter Briggs, Eleanor Dodson, Keith Wilson, Zbyszek Dauter, Gerard Kleywegt