Page 1: Interactive fitting of high resolution models into low ... fileFitting high resolution models ... XRay crystallography need crystals hard for large complexes NMR hard > 50kDa many

Interactive fitting of high resolution models into low resolution maps.

Latest developments in the UROX software : http://mem.ibs.fr/UROX

Xavier Siebert and Jorge Navaza

Methods in Electron Microscopy

Institut de Biologie Structurale

CNRS, Grenoble, France

Leiden, May 16, 2008

Fitting high resolution models . . .

◮ XRay crystallography
  ◮ need crystals
  ◮ hard for large complexes

◮ NMR
  ◮ hard > 50kDa
  ◮ many peaks
  ◮ broader peaks

. . . into Electron Microscopy maps

Rotavirus, 25 Å (Jean Lepault)

. . . or Small Angle Scattering envelopes

Methods for Fitting

◮ “by hand”, with a graphics software
  ◮ subjective (Wang et al. 1992 ; Stewart et al. 1993)

◮ with a fitting algorithm
  ◮ real space : 3SOM, ADP_EM, CHIMERA, DOCKEM, EMFIT, FOLDHUNTER, MOLREP, SITUS
  ◮ reciprocal space : COAN, URO, UROX

◮ “force feedback 3D devices” : SENSITUS

Fitting in real space or reciprocal space ?

◮ minimize mismatch / maximize correlation (Q = 1 − CC²)
  ◮ . . . in real space : between densities

$$Q = \frac{\int |\rho^{em}(r) - \lambda \rho^{mod}(r)|^2 \, d^3r}{\int |\rho^{em}(r)|^2 \, d^3r}$$

  ◮ . . . in reciprocal space : between Fourier coefficients

$$Q = \frac{\int |F^{em}(s) - \lambda F^{mod}(s)|^2 \, d^3s}{\int |F^{em}(s)|^2 \, d^3s}$$

◮ equivalent formulations (Parseval)
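
The relation Q = 1 − CC² and the Parseval equivalence can be checked numerically. The sketch below is not UROX code. It assumes toy random densities on a small grid, the least-squares optimal scale factor λ, and an uncentred correlation coefficient (no mean subtraction); the function names `misfit` and `correlation` are illustrative only.

```python
import numpy as np

def misfit(a, b):
    """Normalized quadratic misfit Q between a and lambda*b,
    with lambda chosen by least squares (real scale factor)."""
    lam = np.vdot(b, a).real / np.vdot(b, b).real
    return np.sum(np.abs(a - lam * b) ** 2) / np.sum(np.abs(a) ** 2)

def correlation(a, b):
    """Uncentred correlation coefficient between two (possibly complex) arrays."""
    num = np.vdot(b, a).real
    return num / np.sqrt(np.vdot(a, a).real * np.vdot(b, b).real)

# Toy data: an "EM map" and a noisy "model" density on an 8x8x8 grid (assumption).
rng = np.random.default_rng(0)
rho_em = rng.normal(size=(8, 8, 8))
rho_mod = rho_em + 0.5 * rng.normal(size=(8, 8, 8))

q_real = misfit(rho_em, rho_mod)                             # real space
q_recip = misfit(np.fft.fftn(rho_em), np.fft.fftn(rho_mod))  # reciprocal space
cc = correlation(rho_em, rho_mod)

print(q_real, q_recip, 1.0 - cc ** 2)   # the three numbers agree to rounding error
```

The three printed values coincide, which is the sense in which the two formulations are equivalent: the choice of space is a matter of computational convenience, not of the target being optimized.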

Reciprocal-space fitting with UROX

Application of UROX : Rotavirus (J. Virol, March 2008)
Fitting in the whole reconstruction, using symmetry

Application of UROX : Rotavirus
Channel shrinks (right), inhibiting transcription

Reciprocal-space formulation with symmetry

◮ Goal : maximize the correlation between map and models:

$$CC = \frac{\int F^{em}(s)\, \overline{F^{mod}(s)}\, d^3s}{\sqrt{\int |F^{em}(s)|^2 \, d^3s}\; \sqrt{\int |F^{mod}(s)|^2 \, d^3s}} \qquad (1)$$

◮ where $F^{mod}$ are functions of the positional variables of the independent molecules :

$$F^{mod}(s) = \sum_{m \in M} \sum_{g \in G} f_m(s M_g R_m) \exp[2\pi i\, s (M_g X_m + T_g)] \qquad (2)$$

◮ m = one of the M independent molecules, located at the position $X_m$ in the orientation $R_m$ with respect to a reference position

◮ g = symmetry operator represented by the translation $T_g$ and the rotation $M_g$.
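
Equation (2) can be illustrated with a toy molecule. The sketch below is an assumption-laden illustration, not the URO/UROX implementation: the molecule is a handful of unit point atoms (so that $f_m$ has a closed form), there is a single independent molecule, and the symmetry is a toy two-fold axis. All function names are hypothetical. The result is checked against a direct sum over explicitly placed atoms.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z axis (angle in radians)."""
    c, sn = np.cos(angle), np.sin(angle)
    return np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])

def molecular_transform(s, atoms):
    """f_m(s) for a toy molecule of unit point atoms in its reference position.
    s: (n, 3) row vectors; atoms: (k, 3) coordinates."""
    return np.exp(2j * np.pi * (s @ atoms.T)).sum(axis=1)

def model_transform(s, atoms, R, X, sym_ops):
    """Equation (2) for one independent molecule and a list of (M_g, T_g)."""
    F = np.zeros(len(s), dtype=complex)
    for Mg, Tg in sym_ops:
        # rotate the reciprocal vector instead of the molecule: f_m(s M_g R_m)
        f = molecular_transform(s @ Mg @ R, atoms)
        # position enters only through a phase factor
        F += f * np.exp(2j * np.pi * (s @ (Mg @ X + Tg)))
    return F

rng = np.random.default_rng(0)
atoms = rng.normal(size=(5, 3))             # toy reference coordinates
s = rng.normal(scale=0.05, size=(20, 3))    # toy reciprocal-space points
R, X = rot_z(0.3), np.array([10.0, 0.0, 2.0])
sym_ops = [(rot_z(0.0), np.zeros(3)), (rot_z(np.pi), np.zeros(3))]  # toy C2

F_model = model_transform(s, atoms, R, X, sym_ops)

# Direct check: place every symmetry copy explicitly and sum over its atoms.
F_direct = np.zeros(len(s), dtype=complex)
for Mg, Tg in sym_ops:
    placed = (Mg @ (R @ atoms.T + X[:, None])).T + Tg
    F_direct += np.exp(2j * np.pi * (s @ placed.T)).sum(axis=1)

print(np.allclose(F_model, F_direct))
```

The point of the formulation is visible in `model_transform`: the molecular transform is evaluated at rotated reciprocal vectors, and a change of position only changes a phase, so moving a molecule does not require recomputing its density.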

UROX design

◮ core calculations : Fortran77 code (adapted from URO)
◮ graphical libraries : VTK (Visualization Toolkit)
◮ Python wrapper
  ◮ Tkinter : graphical user interface
  ◮ F2PY : import Fortran from Python

VTK (Visualization Toolkit) www.vtk.org

◮ powerful libraries for medical and scientific applications

Why reciprocal space ?

Pros:

1. speed allows real-time calculations

2. choose fitting resolution without extra computations

3. use whole EM reconstruction (no masking)

4. symmetry (if any) incorporated in the formulation

Cons:

◮ hard to impose real-space restraints

1. Speed of UROX
Interactive graphics . . .

◮ fast calculation of correlation coefficient (CC):
  ◮ 10⁻⁷ s / symmetry operation / Fourier coefficient

                   symmetry ops   resolution   # coefficients   time
GroEL              14             30 Å         ≈ 5000           3.5 ms
DLP (Rotavirus)    60             20 Å         ≈ 650,000        4 s

◮ interactive graphics
  ◮ CC computed in real-time as a model is moved in the map

◮ speed up
  ◮ recuperate asymmetric unit in reciprocal space
  ◮ decimate Fourier coefficients (only 6N unknowns !)
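
Why the CC can follow a moving model in real time can be sketched with the Fourier shift theorem: the model transform is computed once, and each new position costs only a phase multiplication. The example below is a toy illustration on a small periodic grid, not UROX code; the grid size, the random "molecule" and the function names are assumptions. Note that NumPy's forward FFT uses the exp(−2πi…) sign convention, hence the minus sign in the phase.

```python
import numpy as np

def cc(F_a, F_b):
    """Uncentred correlation coefficient between two sets of Fourier coefficients."""
    num = np.vdot(F_b, F_a).real
    return num / np.sqrt(np.vdot(F_a, F_a).real * np.vdot(F_b, F_b).real)

n = 16
rng = np.random.default_rng(1)
model = np.zeros((n, n, n))
model[2:6, 3:7, 1:5] = rng.random((4, 4, 4))           # toy "molecule" density
em_map = np.roll(model, (0, 0, 5), axis=(0, 1, 2))     # same molecule, moved 5 voxels in z

F_em = np.fft.fftn(em_map)
F_mod = np.fft.fftn(model)        # computed once, reused for every trial position
kz = np.fft.fftfreq(n)            # frequencies along z, in cycles per voxel

def cc_at_z(dz):
    """CC after translating the model by dz voxels along z (phase factor only)."""
    phase = np.exp(-2j * np.pi * kz * dz)
    return cc(F_em, F_mod * phase[None, None, :])

profile = [cc_at_z(dz) for dz in range(n)]
best_dz = int(np.argmax(profile))
print(best_dz, round(profile[best_dz], 6))
```

Scanning `dz` in this way produces a correlation profile along Z; the maximum sits at the true displacement because there the shifted model coefficients coincide with those of the map.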

1. Speed of UROX
Speedup Graphics (for oldish graphics cards like mine . . . )

◮ VTK decimation (wireframe mode)

1. Speed of UROX
Speedup Graphics (for oldish graphics cards like mine . . . )

◮ VTK decimation (surface mode)

1. Speed of UROX
Speedup Graphics

◮ VTK BoxWidget
  ◮ Analyse local parts of the map (and speed up)

2. Change fitting resolution
Correlation profile at 1/20 Å⁻¹

[Plot: CC versus Z [Angstroms], curve labelled 20A]

2. Change fitting resolution
Correlation profile at 1/40 Å⁻¹

[Plot: CC versus Z [Angstroms], curve labelled 40A]

2. Change fitting resolution
Correlation profile at 1/60 Å⁻¹

[Plot: CC versus Z [Angstroms], curve labelled 60A]

2. Change fitting resolution
Strategy

◮ to avoid local extrema :
  1. low resolution
  2. high resolution

◮ two modes :
  ◮ interactive with least-squares optimization
  ◮ exhaustive 3D or 6D searches
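
The low-then-high resolution strategy can be sketched as follows. In reciprocal space, changing the fitting resolution amounts to keeping only the Fourier coefficients with |s| ≤ 1/d_min, so no new transform is needed. The code is a toy one-dimensional translation search on a small periodic grid; the voxel size, the cutoffs (40 Å then 10 Å) and the helper `best_shift` are assumptions for illustration and do not reproduce the UROX search.

```python
import numpy as np

def cc(F_a, F_b):
    """Uncentred correlation coefficient between two sets of Fourier coefficients."""
    num = np.vdot(F_b, F_a).real
    return num / np.sqrt(np.vdot(F_a, F_a).real * np.vdot(F_b, F_b).real)

n, voxel = 16, 5.0                                  # grid size, voxel size in Å (toy values)
rng = np.random.default_rng(2)
model = np.zeros((n, n, n))
model[2:6, 3:7, 1:5] = rng.random((4, 4, 4))        # toy "molecule" density
em_map = np.roll(model, 5, axis=2)                  # true displacement: 5 voxels along z

F_em, F_mod = np.fft.fftn(em_map), np.fft.fftn(model)
freq = np.fft.fftfreq(n, d=voxel)                   # reciprocal coordinates in Å^-1
sx, sy, sz = np.meshgrid(freq, freq, freq, indexing="ij")
s_norm = np.sqrt(sx ** 2 + sy ** 2 + sz ** 2)

def best_shift(d_min, candidates):
    """Best z shift (in voxels) using only coefficients with |s| <= 1/d_min."""
    keep = s_norm <= 1.0 / d_min                    # resolution cutoff = a mask, nothing else
    scores = []
    for dz in candidates:
        phase = np.exp(-2j * np.pi * sz * dz * voxel)
        scores.append(cc(F_em[keep], (F_mod * phase)[keep]))
    return candidates[int(np.argmax(scores))]

# Step 1: scan every position at low resolution (smooth profile).
coarse = best_shift(40.0, list(range(n)))
# Step 2: re-examine only the neighbourhood of that optimum at high resolution.
fine = best_shift(10.0, [(coarse + d) % n for d in (-2, -1, 0, 1, 2)])
print(coarse, fine)
```

In this idealized case both stages agree; with real data the low-resolution stage mainly serves to land in the right basin before the sharper high-resolution profile is used.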

3. and 4. Use whole EM map and symmetry
Illustrated by a benchmark comparison of several fitting programs

◮ test case : GroEL (cryo-stain, Dubochet, JSB 2002)
  ◮ D7 symmetry

3. and 4. - Benchmark : GroEL
Warning : subject to my mishandling of other people’s software

3. and 4. - Benchmark : GroEL
Analysis of Benchmark for other software

◮ most programs struggle because of "extra" density
◮ alternative : mask around putative solution (but bias . . . )
  ◮ in that case most programs find the solution

3. and 4. - Benchmark : GroEL
Analysis of Benchmark for UROX (exhaustive search mode)

◮ without symmetry (C1) : difficult (requires tweaking)
◮ with symmetry (D7) : easy (10 min)
◮ conclusions :
  ◮ symmetry matters, no mask necessary
  ◮ could use interactive mode instead of exhaustive search

Other features of UROX

◮ refine electron microscope magnification (5% error)

Latest developments (UROX 2.0)

◮ (re-writing of the Python classes . . . )
◮ flexible fitting : normal modes
◮ fit map in map
◮ applications to tomography

UROX 2.0 - flexible fitting
Normal modes (with K. Suhre and Y-H. Sanejouand)

◮ low frequency motion of proteins
◮ harmonic approximation

$$r_j(t) = r_j^0 + \sum_k A_{jk}\, \alpha_k \cos(\omega_k t + \phi_k) \qquad (3)$$

◮ use with care (extra degrees of freedom will always give a better fit !)
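
The warning about flexible fitting can be made concrete with a small experiment. The sketch below assumes a static snapshot of equation (3), r = r⁰ + Aα, in which the cosine factor is absorbed into the amplitudes; the "modes" are random orthonormal vectors rather than real normal modes, and the target is the reference structure plus pure noise. None of this is UROX code.

```python
import numpy as np

rng = np.random.default_rng(3)
n_atoms, n_modes = 30, 10
r0 = rng.normal(size=3 * n_atoms)                  # flattened reference coordinates
# Toy orthonormal "modes" (columns of A); real normal modes would come from a Hessian.
A, _ = np.linalg.qr(rng.normal(size=(3 * n_atoms, n_modes)))

def displaced(alpha):
    """Coordinates displaced along the first len(alpha) modes: r = r0 + A alpha."""
    return r0 + A[:, :len(alpha)] @ alpha

# Target contains no genuine collective motion, only noise.
target = r0 + rng.normal(scale=0.5, size=r0.shape)

residuals = []
for k in range(n_modes + 1):
    if k == 0:
        alpha = np.zeros(0)                        # rigid model, no modes
    else:
        alpha, *_ = np.linalg.lstsq(A[:, :k], target - r0, rcond=None)
    residuals.append(float(np.sum((target - displaced(alpha)) ** 2)))

print([round(q, 3) for q in residuals])            # never increases as modes are added
```

The misfit goes down with every added mode even though there is nothing real to fit, which is why an improved score alone does not validate a flexible model.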

UROX 2.0 - fit map in map

UROX 2.0 - Tomography and missing wedge
Presentation of the problem

UROX 2.0 - Tomography and missing wedge
Visualize the Fourier transform (and select reflections)

UROX 2.0 - Tomography and missing wedge

◮ Detect missing wedge
◮ remove it from fitting (don’t align missing wedges !)

Thank you

. The Organizers . . .

. Jorge Navaza (IBS, Grenoble)

. Jean Lepault and Sonia Libersou (LVMS, Gif-sur-Yvette)

. Karsten Suhre (Neuherberg, Germany)

. Yves-Henri Sanejouand (ENS, Lyon)

. Leandro F. Estrozi (EMBL, Grenoble)

. Stefano Trapani (CBS, Montpellier)

. James Conway (Pittsburgh, USA)

. Irina Gutsche and Ambroise Desfosses (EMBL, Grenoble)

+ http://mem.ibs.fr/UROX

Error Estimates
My map has a resolution of x Å. What is the error on the fit ?

◮ in UROX:
  1. R-factor ↔ quality of the map

$$R = \frac{\sum_h \big|\, |F^{em}_h| - |F^{mod}_h| \,\big|}{\sum_h |F^{em}_h|}$$

  2. Q = quadratic misfit

◮ rule of thumb : error ≈ 10% of the resolution (Rossmann, Acta Crys. 2001)

◮ empirically : VP6 of the rotavirus (25 Å map)
  ◮ fit with a trimer
  ◮ fit with 3 monomers
  ◮ RMSD (trimer, 3 monomers) ≈ 3 Å
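
As a small worked example of the R-factor above, the sketch below evaluates it on three made-up Fourier coefficients. It assumes the model amplitudes are already on the same scale as the map amplitudes (no scale factor is refined here), and `r_factor` is an illustrative name, not a UROX function.

```python
import numpy as np

def r_factor(F_em, F_mod):
    """R = sum | |F_em| - |F_mod| | / sum |F_em|, computed on amplitudes only."""
    a, b = np.abs(F_em), np.abs(F_mod)
    return float(np.sum(np.abs(a - b)) / np.sum(a))

# Made-up coefficients with amplitudes (5, 2, 1) and (5, 1, 1).
F_em = np.array([3 + 4j, 0 - 2j, 1 + 0j])
F_mod = np.array([5 + 0j, 0 + 1j, 0 + 1j])

r = r_factor(F_em, F_mod)
print(r)   # 0.125  (= 1 / 8)

# R only sees amplitudes: a pure phase change leaves it at zero.
r_phase_only = r_factor(F_em, F_em * np.exp(1j * 0.7))
```

Because R ignores phases, it is reported alongside the quadratic misfit Q, which does depend on them.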

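Error Estimate by Least Squares
Borel p. 204

◮ let us suppose that the errors are distributed as a gaussian:

$$P(\epsilon) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\epsilon^2}{2\sigma^2}\right) \qquad (4)$$

◮ if σ is the same for all N reflections :

$$P(\{F^{mod}_H, F^{em}_H, \sigma\}) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^N \exp\left(-\sum_H \frac{|F^{em}_H - F^{mod}_H|^2}{2\sigma^2}\right) \qquad (5)$$

$$\sigma \approx \sqrt{\frac{Q_{min}}{N - M}} \qquad (6)$$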
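
Equation (6) can be tried on synthetic data. The sketch below makes several assumptions that are not stated on the slide: M is taken to be the number of fitted parameters, Q_min is the unnormalized sum of squared residuals at the optimum, the observations are real numbers, and a toy linear model stands in for the rigid-body fit.

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, sigma_true = 2000, 6, 0.3                 # observations, parameters, noise level (toy)

design = rng.normal(size=(N, M))                # toy linear model with M parameters
params_true = rng.normal(size=M)
observed = design @ params_true + rng.normal(scale=sigma_true, size=N)

# Least-squares fit, then the residual sum of squares at the optimum.
params_fit, *_ = np.linalg.lstsq(design, observed, rcond=None)
q_min = float(np.sum((observed - design @ params_fit) ** 2))

sigma_hat = np.sqrt(q_min / (N - M))            # equation (6)
print(sigma_true, sigma_hat)                    # the estimate lands close to the true value
```

Dividing by N − M rather than N accounts for the degrees of freedom consumed by the fitted parameters; with N much larger than M the difference is small.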