
Interactive fitting of high resolution models into low resolution maps.

Latest developments in the UROX software: http://mem.ibs.fr/UROX

Xavier Siebert and Jorge Navaza

Methods in Electron Microscopy, Institut de Biologie Structurale

CNRS, Grenoble, France

Leiden, May 16, 2008

Fitting high resolution models . . .

◮ X-ray crystallography
    ◮ needs crystals
    ◮ hard for large complexes

◮ NMR
    ◮ hard > 50 kDa
    ◮ many peaks
    ◮ broader peaks

. . . into Electron Microscopy maps

Rotavirus, 25 Å (Jean Lepault)

. . . or Small Angle Scattering envelopes

Methods for Fitting

◮ “by hand”, with graphics software
    ◮ subjective (Wang et al. 1992; Stewart et al. 1993)

◮ with a fitting algorithm
    ◮ real space: 3SOM, ADP_EM, CHIMERA, DOCKEM, EMFIT, FOLDHUNTER, MOLREP, SITUS
    ◮ reciprocal space: COAN, URO, UROX

◮ “force feedback 3D devices”: SENSITUS

Fitting in real space or reciprocal space ?

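◮ minimize mismatch / maximize correlation (Q = 1 − CC²)

◮ . . . in real space: between densities

$$ Q = \frac{\int |\rho^{em}(\mathbf{r}) - \lambda \rho^{mod}(\mathbf{r})|^2 \, d^3r}{\int |\rho^{em}(\mathbf{r})|^2 \, d^3r} $$

◮ . . . in reciprocal space: between Fourier coefficients

$$ Q = \frac{\int |F^{em}(\mathbf{s}) - \lambda F^{mod}(\mathbf{s})|^2 \, d^3s}{\int |F^{em}(\mathbf{s})|^2 \, d^3s} $$

◮ equivalent formulations (Parseval)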
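The equivalence of the two formulations can be checked numerically. The following is a minimal sketch on a toy periodic grid, not UROX code: the random densities, the helper names `misfit` and `correlation`, and the use of NumPy's FFT are all assumptions made for illustration. It also checks that, for the least-squares scale factor λ, the misfit reduces to Q = 1 − CC² (with CC computed without mean subtraction).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model" and "EM map" densities on a small periodic grid (illustrative data only).
rho_mod = rng.random((8, 8, 8))
rho_em = 2.0 * rho_mod + 0.1 * rng.standard_normal((8, 8, 8))

def misfit(a, b, lam):
    """Discrete version of Q = sum |a - lam*b|^2 / sum |a|^2."""
    return np.sum(np.abs(a - lam * b) ** 2) / np.sum(np.abs(a) ** 2)

def correlation(a, b):
    """Normalised correlation coefficient, without mean subtraction."""
    num = np.real(np.vdot(b, a))
    return num / np.sqrt(np.vdot(a, a).real * np.vdot(b, b).real)

# Fourier coefficients of both densities.
F_em = np.fft.fftn(rho_em)
F_mod = np.fft.fftn(rho_mod)

# Scale factor that minimises Q (linear least squares), computed in real space.
lam = np.sum(rho_em * rho_mod) / np.sum(rho_mod ** 2)

q_real = misfit(rho_em, rho_mod, lam)    # between densities
q_recip = misfit(F_em, F_mod, lam)       # between Fourier coefficients
cc = correlation(rho_em, rho_mod)

print(q_real, q_recip, 1.0 - cc ** 2)    # the three numbers agree to rounding error
```

The FFT normalisation cancels in the ratio, which is why the same λ and the same Q are obtained in both spaces.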

Reciprocal-space fitting with UROX

Application of UROX: Rotavirus (J. Virol, March 2008)
Fitting in the whole reconstruction, using symmetry

Application of UROX: Rotavirus
Channel shrinks (right), inhibiting transcription

Reciprocal-space formulation with symmetry

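◮ Goal: maximize the correlation between the map and the models:

$$ CC = \frac{\int F^{em}(\mathbf{s}) \, \overline{F^{mod}(\mathbf{s})} \, d^3s}{\sqrt{\int |F^{em}(\mathbf{s})|^2 \, d^3s} \; \sqrt{\int |F^{mod}(\mathbf{s})|^2 \, d^3s}} \qquad (1) $$

◮ where the F^{mod} are functions of the positional variables of the independent molecules:

$$ F^{mod}(\mathbf{s}) = \sum_{m \in M} \sum_{g \in G} f_m(\mathbf{s} M_g R_m) \, \exp[2\pi i \, \mathbf{s} (M_g X_m + T_g)] \qquad (2) $$

◮ m = one of the M independent molecules, located at the position X_m in the orientation R_m with respect to a reference position

◮ g = symmetry operator, represented by the translation T_g and the rotation M_g.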
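Eq. (2) says that the transform of a placed, symmetry-expanded molecule can be obtained from the transform f_m of the molecule in its reference position, sampled at rotated reciprocal vectors and multiplied by a phase factor. The sketch below illustrates this for one molecule made of unit point atoms and a C3 point group. It is not the UROX implementation: the function names, the point-atom transform and the random coordinates are assumptions for illustration. The result is compared with a direct sum over explicitly placed copies.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def f_molecule(s_vecs, atoms):
    """Transform of unit point atoms in the reference position:
    f(s) = sum_a exp(2*pi*i s.x_a); s_vecs has shape (n_s, 3)."""
    return np.exp(2j * np.pi * s_vecs @ atoms.T).sum(axis=1)

def f_model(s_vecs, atoms, R, X, sym_ops):
    """Eq. (2) for a single independent molecule:
    sum over g of f(s M_g R) * exp[2*pi*i s.(M_g X + T_g)]."""
    total = np.zeros(len(s_vecs), dtype=complex)
    for M_g, T_g in sym_ops:
        s_rot = s_vecs @ M_g @ R                        # row vectors s M_g R
        phase = np.exp(2j * np.pi * s_vecs @ (M_g @ X + T_g))
        total += f_molecule(s_rot, atoms) * phase
    return total

rng = np.random.default_rng(1)
atoms = rng.normal(size=(5, 3))                 # hypothetical reference coordinates
s_vecs = rng.normal(scale=0.05, size=(10, 3))   # a few reciprocal-space vectors
R = rot_z(0.3)                                  # orientation R_m
X = np.array([4.0, 1.0, -2.0])                  # position X_m
# C3 point group about z: pure rotations, so every T_g is zero.
sym_ops = [(rot_z(2 * np.pi * k / 3), np.zeros(3)) for k in range(3)]

F_fast = f_model(s_vecs, atoms, R, X, sym_ops)

# Check: place every symmetry copy in real space, x -> M_g (R x + X) + T_g,
# and sum the point-atom contributions directly.
F_direct = np.zeros(len(s_vecs), dtype=complex)
for M_g, T_g in sym_ops:
    placed = (atoms @ R.T + X) @ M_g.T + T_g
    F_direct += np.exp(2j * np.pi * s_vecs @ placed.T).sum(axis=1)

print(np.allclose(F_fast, F_direct))
```

The point of the formulation is that f_m only has to be tabulated once; moving or rotating the molecule changes the sampling points and the phase, not the molecular transform itself.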

UROX design

◮ core calculations: Fortran77 code (adapted from URO)
◮ graphical libraries: VTK (Visualization Toolkit)
◮ Python wrapper
    ◮ Tkinter: graphical user interface
    ◮ F2PY: import Fortran from Python

VTK (Visualization Toolkit) www.vtk.org

◮ powerful libraries for medical and scientific applications

Why reciprocal space ?

Pros:

1. speed allows real-time calculations

2. choose fitting resolution without extra computations

3. use whole EM reconstruction (no masking)

4. symmetry (if any) incorporated in the formulation

Cons:

◮ hard to impose real-space restraints

1. Speed of UROX
Interactive graphics . . .

◮ fast calculation of correlation coefficient (CC):
    ◮ 10⁻⁷ s / symmetry operation / Fourier coefficient

                     symmetry   resolution   # coefficients   time
    GroEL            14         30 Å         ≈ 5000           3.5 ms
    DLP (Rotavirus)  60         20 Å         ≈ 650,000        4 s

◮ interactive graphics
    ◮ CC computed in real time as a model is moved in the map

◮ speed-up
    ◮ recover the asymmetric unit in reciprocal space
    ◮ decimate Fourier coefficients (only 6N unknowns!)
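One reason the CC can be updated in real time is that translating a model only multiplies its Fourier coefficients by a phase factor, so nothing has to be resampled. The sketch below illustrates this on a toy periodic grid; it is not UROX code, and the grid size, the shift and the function name `cc_at` are assumptions. NumPy's forward FFT uses the exp(−2πi ...) convention, hence the minus sign in the phase, opposite to the sign written in Eq. (2).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 16
rho_mod = rng.random((n, n, n))                          # toy model density
true_shift = (3, 5, 2)                                   # in grid units, illustrative
rho_em = np.roll(rho_mod, true_shift, axis=(0, 1, 2))    # "map" = shifted model

F_em = np.fft.fftn(rho_em)
F_mod = np.fft.fftn(rho_mod)                             # computed once, then reused

# Frequency coordinates (cycles per grid unit) along each axis.
k = np.fft.fftfreq(n)
kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")

def cc_at(shift):
    """CC between the map and the model translated by `shift`,
    using only a phase factor on the precomputed coefficients."""
    phase = np.exp(-2j * np.pi * (kx * shift[0] + ky * shift[1] + kz * shift[2]))
    F_moved = F_mod * phase
    num = np.real(np.vdot(F_moved, F_em))
    den = np.sqrt(np.vdot(F_em, F_em).real * np.vdot(F_moved, F_moved).real)
    return num / den

cc_right = cc_at(true_shift)    # model moved onto the map
cc_wrong = cc_at((0, 0, 0))     # model left at the origin
print(cc_right, cc_wrong)
```

Each evaluation costs one pass over the retained coefficients, which is consistent with a cost that scales with the number of Fourier coefficients times the number of symmetry operations.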

1. Speed of UROX
Speed up graphics (for oldish graphics cards like mine . . . )

◮ VTK decimation (wireframe mode)

1. Speed of UROX
Speed up graphics (for oldish graphics cards like mine . . . )

◮ VTK decimation (surface mode)

1. Speed of UROX
Speed up graphics

◮ VTK BoxWidget
    ◮ Analyse local parts of the map (and speed up)

2. Change fitting resolution
Correlation profile at 1/20 Å⁻¹

[Plot: CC versus Z (Å), fitting resolution 20 Å]

2. Change fitting resolution
Correlation profile at 1/40 Å⁻¹

[Plot: CC versus Z (Å), fitting resolution 40 Å]

2. Change fitting resolution
Correlation profile at 1/60 Å⁻¹

[Plot: CC versus Z (Å), fitting resolution 60 Å]

2. Change fitting resolution
Strategy

◮ to avoid local extrema:
    1. start at low resolution
    2. then refine at high resolution

◮ two modes:
    ◮ interactive with least-squares optimization
    ◮ exhaustive 3D or 6D searches
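The low-then-high resolution strategy can be illustrated in one dimension. In this sketch (a toy example, not UROX code) the model is a broad envelope carrying a fine periodic detail, so the high-resolution CC has secondary maxima every 16 samples, while the low-resolution CC only sees the envelope and has a single broad peak. The profile, the cutoffs `64.0` and `2.0`, the step of 8 samples and the function name `cc` are all assumptions chosen for illustration.

```python
import numpy as np

n = 256
x = np.arange(n)
# Toy 1D model: broad envelope modulated by a fine periodic detail (period 16 samples).
envelope = np.exp(-0.5 * ((x - 100) / 20.0) ** 2)
model = envelope * (1.0 + 0.8 * np.cos(2 * np.pi * x / 16))

true_shift = 37
rng = np.random.default_rng(3)
em_map = np.roll(model, true_shift) + 0.01 * rng.standard_normal(n)

F_em = np.fft.fft(em_map)
F_mod = np.fft.fft(model)
freq = np.fft.fftfreq(n)                 # cycles per sample

def cc(shift, d_min):
    """CC at a given shift, using only coefficients with |freq| <= 1/d_min."""
    keep = np.abs(freq) <= 1.0 / d_min
    F_moved = F_mod[keep] * np.exp(-2j * np.pi * freq[keep] * shift)
    num = np.real(np.vdot(F_moved, F_em[keep]))
    den = np.sqrt(np.vdot(F_em[keep], F_em[keep]).real
                  * np.vdot(F_moved, F_moved).real)
    return num / den

# Step 1: exhaustive search on a coarse grid at low resolution (64 samples).
coarse = np.arange(0, n, 8)
best_coarse = coarse[np.argmax([cc(s, 64.0) for s in coarse])]

# Step 2: fine search at high resolution (2 samples) around the coarse solution.
fine = np.arange(best_coarse - 8, best_coarse + 9)
best_fine = fine[np.argmax([cc(s, 2.0) for s in fine])]

print(best_coarse, best_fine)
```

Changing the resolution only changes which coefficients are kept (the `keep` mask), which is why no extra computation is needed when the fitting resolution is changed.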

3. and 4. Use whole EM map and symmetry
Illustrated by a benchmark comparison of several fitting programs

◮ test case: GroEL (cryo-stain, Dubochet, JSB 2002)
    ◮ D7 symmetry

3. and 4. - Benchmark: GroEL
Warning: subject to my mishandling of other people’s software

3. and 4. - Benchmark: GroEL
Analysis of benchmark for other programs

◮ most programs struggle because of "extra" density
◮ alternative: mask around putative solution (but bias . . . )
    ◮ in that case most programs find the solution

3. and 4. - Benchmark: GroEL
Analysis of benchmark for UROX (exhaustive search mode)

◮ without symmetry (C1): difficult (requires tweaking)
◮ with symmetry (D7): easy (10 min)
◮ conclusions:
    ◮ symmetry matters, no mask necessary
    ◮ could use interactive mode instead of exhaustive search

Other features of UROX

◮ refine electron microscope magnification (5% error)

Latest developments (UROX 2.0)

◮ (re-writing of the Python classes . . . )
◮ flexible fitting: normal modes
◮ fit map in map
◮ applications to tomography

UROX 2.0 - flexible fitting
Normal modes (with K. Suhre and Y-H. Sanejouand)

◮ low frequency motion of proteins
◮ harmonic approximation

$$ r_j(t) = r_j^0 + \sum_k A_{jk} \, \alpha_k \cos(\omega_k t + \phi_k) \qquad (3) $$

◮ use with care (extra degrees of freedom will always give a better score!)
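Eq. (3) can be evaluated directly once the mode matrix A, the amplitudes α, the frequencies ω and the phases φ are known. The sketch below does so for a toy system: a 1D chain of unit masses and unit springs with free ends, whose modes come from diagonalising its Hessian. The chain, the function name `displaced_coords` and the chosen amplitude are assumptions for illustration and have nothing to do with how UROX 2.0 obtains its modes.

```python
import numpy as np

def displaced_coords(r0, A, alpha, omega, phi, t):
    """Eq. (3): r_j(t) = r0_j + sum_k A[j, k] * alpha[k] * cos(omega[k] * t + phi[k])."""
    return r0 + A @ (alpha * np.cos(omega * t + phi))

# Toy system: chain of 6 unit masses joined by unit springs, free ends.
n = 6
hessian = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
hessian[0, 0] = hessian[-1, -1] = 1.0

# Columns of A are the normal modes, sorted by increasing eigenvalue.
eigvals, A = np.linalg.eigh(hessian)
omega = np.sqrt(np.clip(eigvals, 0.0, None))   # clip guards against tiny negative values

r0 = np.arange(n, dtype=float)                 # equilibrium positions
alpha = np.zeros(n)
alpha[1] = 0.3                                 # excite only the lowest non-rigid mode
phi = np.zeros(n)

r_start = displaced_coords(r0, A, alpha, omega, phi, t=0.0)
r_quarter = displaced_coords(r0, A, alpha, omega, phi, t=np.pi / (2 * omega[1]))

print(r_start - r0)      # displacement along mode 1
print(r_quarter - r0)    # back at equilibrium after a quarter period
```

Mode 0 of this chain is the rigid translation (zero frequency), which is why the example excites mode 1. In a fit, each amplitude α_k is an additional adjustable parameter.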

UROX 2.0 - fit map in map

UROX 2.0 - Tomography and missing wedge
Presentation of the problem

UROX 2.0 - Tomography and missing wedge
Visualize the Fourier transform (and select reflections)

UROX 2.0 - Tomography and missing wedge

◮ Detect missing wedge
    ◮ remove it from fitting (don’t align missing wedges!)
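Removing the missing wedge from the fit amounts to restricting the sums in the CC to the Fourier coefficients that were actually measured. The sketch below assumes a single-axis tilt series with the tilt axis along y and the beam along z, so that a projection at tilt angle θ fills the plane kz = kx·tan θ. This geometry, the ±60° range and the function names are assumptions for illustration; how UROX detects and selects the wedge is not described here.

```python
import numpy as np

def measured_mask(n, max_tilt_deg):
    """True for Fourier coefficients sampled by a single-axis tilt series
    (tilt axis y, beam z): |kz| <= |kx| * tan(max_tilt)."""
    k = np.fft.fftfreq(n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    return np.abs(kz) <= np.abs(kx) * np.tan(np.radians(max_tilt_deg))

def cc_masked(F_a, F_b, mask):
    """Correlation coefficient restricted to the coefficients selected by `mask`."""
    a, b = F_a[mask], F_b[mask]
    num = np.real(np.vdot(b, a))
    return num / np.sqrt(np.vdot(a, a).real * np.vdot(b, b).real)

n = 16
rng = np.random.default_rng(4)
rho_mod = rng.random((n, n, n))              # toy model density
mask = measured_mask(n, 60.0)

F_mod = np.fft.fftn(rho_mod)
F_tomo = np.where(mask, F_mod, 0.0)          # toy tomogram: wedge coefficients missing

cc_all = cc_masked(F_tomo, F_mod, np.ones_like(mask))   # wedge included in the fit
cc_wedge_removed = cc_masked(F_tomo, F_mod, mask)       # wedge excluded from the fit

print(cc_all, cc_wedge_removed)
```

With the wedge included, the correctly placed model is penalised for density the tomogram could never contain; with the wedge excluded the CC of the correct placement returns to 1 in this noise-free toy case.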

Thank you

. The Organizers . . .

. Jorge Navaza (IBS, Grenoble)

. Jean Lepault and Sonia Libersou (LVMS, Gif-sur-Yvette)

. Karsten Suhre (Neuherberg, Germany)

. Yves-Henri Sanejouand (ENS, Lyon)

. Leandro F. Estrozi (EMBL, Grenoble)

. Stefano Trapani (CBS, Montpellier)

. James Conway (Pittsburgh, USA)

. Irina Gutsche and Ambroise Desfosses (EMBL, Grenoble)

+ http://mem.ibs.fr/UROX

Error Estimates
My map has a resolution of x Å. What is the error on the fit?

◮ in UROX:
    1. R-factor ↔ quality of the map

$$ R = \frac{\sum_h \big|\, |F^{em}_h| - |F^{mod}_h| \,\big|}{\sum_h |F^{em}_h|} $$

    2. Q = quadratic misfit

◮ rule of thumb: 10% of the resolution (Rossmann, Acta Crys. 2001)

◮ empirically: VP6 of the rotavirus (25 Å map)
    ◮ fit with a trimer
    ◮ fit with 3 monomers
    ◮ RMSD (trimer, 3 monomers) ≈ 3 Å
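The R-factor above compares amplitudes only. A minimal sketch of the formula as written, with made-up coefficients and no scale factor between the two sets (UROX may scale them before comparison; that is not stated here):

```python
import numpy as np

def r_factor(F_em, F_mod):
    """R = sum_h | |F_em_h| - |F_mod_h| | / sum_h |F_em_h|."""
    a_em, a_mod = np.abs(F_em), np.abs(F_mod)
    return np.sum(np.abs(a_em - a_mod)) / np.sum(a_em)

# Made-up coefficients: amplitudes are (5, 1, 2) for the map and (5, 2, 1) for the model.
F_em = np.array([3 + 4j, 1 + 0j, 2j])
F_mod = np.array([5 + 0j, 2 + 0j, 1j])

print(r_factor(F_em, F_mod))   # (0 + 1 + 1) / (5 + 1 + 2) = 0.25
```

Because phases do not enter, two sets of coefficients with identical amplitudes give R = 0 even if the model is misplaced, which is why R is paired here with the quadratic misfit Q.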

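Error Estimate by Least Squares
Borel p. 204

◮ let us suppose that the errors are distributed as a Gaussian:

$$ P(\epsilon) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\epsilon^2}{2\sigma^2}\right) \qquad (4) $$

◮ if σ is the same for all N reflections:

$$ P(\{F^{mod}_H, F^{em}_H, \sigma\}) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^N \exp\left(-\sum_H \frac{|F^{em}_H - F^{mod}_H|^2}{2\sigma^2}\right) \qquad (5) $$

$$ \sigma \approx \sqrt{\frac{Q_{min}}{N - M}} \qquad (6) $$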
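Eq. (6) can be tried on a synthetic linear least-squares problem. The sketch assumes that M in Eq. (6) denotes the number of fitted parameters (the slides do not define it) and that Q_min is the unnormalised sum of squared residuals at the minimum. The design matrix, parameter values and noise level are made up.

```python
import numpy as np

rng = np.random.default_rng(5)
N, M = 2000, 3            # observations and fitted parameters (interpretation assumed)
sigma_true = 0.5          # standard deviation of the simulated Gaussian errors

# Synthetic linear problem: obs = design @ params + Gaussian noise.
design = rng.standard_normal((N, M))
params_true = np.array([1.0, -2.0, 0.5])
obs = design @ params_true + sigma_true * rng.standard_normal(N)

# Least-squares fit; with a common sigma this is also the maximum of Eq. (5).
params_fit, *_ = np.linalg.lstsq(design, obs, rcond=None)

q_min = np.sum((obs - design @ params_fit) ** 2)   # quadratic misfit at the minimum
sigma_est = np.sqrt(q_min / (N - M))               # Eq. (6)

print(sigma_true, sigma_est)
```

Dividing by N − M rather than N compensates for the M parameters that were adjusted to the data; for N much larger than M the difference is small.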