June 7, 2019

Symmetry considerations, restraints and constraints in structure refinement from single-crystal X-ray diffraction data

B. Guillot

Outline

The Least Squares optimization : general considerations

The Least Squares refinement method in Crystallography

Parameters of the model

Symmetries in Least-Squares

Restraints and constraints … and symmetries

Symmetries in another crystallographic modelling

Symmetries In protein crystallography

Least Squares refinement problems related to symmetries

Examples … tutorials

Mathematics of crystallographic refinement

The Least Squares refinement : general considerations

Crystallographic refinement: optimizing the model parameters against experimental diffraction data

Observations: diffraction intensities $I(hkl)$

Variables: parameters describing the crystal structure

Least Squares optimization method, a very general approach:

$m$ experimental observations $f_1 \dots f_m$ depend linearly on $n$ parameters $x_1 \dots x_n$, with $m \geq n$:

$$
\begin{aligned}
f_1 &= d_{11}x_1 + d_{12}x_2 + \cdots + d_{1n}x_n \\
f_2 &= d_{21}x_1 + d_{22}x_2 + \cdots + d_{2n}x_n \\
&\;\;\vdots \\
f_m &= d_{m1}x_1 + d_{m2}x_2 + \cdots + d_{mn}x_n
\end{aligned}
$$

In matrix form: $\mathbf{D}\mathbf{X} = \mathbf{F}$

where $\mathbf{D}$ is the "design matrix", of elements $d_{ij} = \dfrac{\partial f_i}{\partial x_j}$

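The least-squares procedure finds the best estimate of $\mathbf{X}$ to reproduce $\mathbf{F}$

$\hat{\mathbf{X}}$ = estimate of $\mathbf{X}$, $\hat{\mathbf{F}}$ = estimate of $\mathbf{F}$ computed using $\hat{\mathbf{X}}$. Thus: $\mathbf{D}\hat{\mathbf{X}} = \hat{\mathbf{F}}$

If $\hat{\mathbf{X}}$ is a good estimate, then it minimizes the difference between $\mathbf{F}$ and $\hat{\mathbf{F}}$

Hence, to find the best estimate $\hat{\mathbf{X}}$, Friedrich Gauss proposed to minimize the sum $S$ of the squared, weighted differences between all the $f_i$ and the $\hat{f}_i$:

$$
S = \sum_{i=1}^{m} w_i \left( f_i - \hat{f}_i \right)^2
$$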


$f_i$ = experimental measurements, each with an experimental uncertainty $\sigma_i$

The weights are the inverses of the corresponding variances: $w_i = \dfrac{1}{\sigma_i^2}$

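Defining:

$\mathbf{W}$ = diagonal matrix of experimental variances ($\mathbf{W}^{-1}$ contains the weights)

$\mathbf{R}$ = residual matrix, $\mathbf{R} = \mathbf{F} - \hat{\mathbf{F}} = \mathbf{F} - \mathbf{D}\hat{\mathbf{X}}$

$S$ can be rewritten in matrix form:

$$
S = \mathbf{R}^t\mathbf{W}^{-1}\mathbf{R} = (\mathbf{F} - \mathbf{D}\hat{\mathbf{X}})^t\,\mathbf{W}^{-1}\,(\mathbf{F} - \mathbf{D}\hat{\mathbf{X}})
$$

… and developed:

$$
S = \mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - \mathbf{F}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}
$$

$\mathbf{D}\hat{\mathbf{X}} = \hat{\mathbf{F}}$, hence $\hat{\mathbf{X}}^t\mathbf{D}^t = \hat{\mathbf{F}}^t$, and the two last terms are equal:

$$
-\mathbf{F}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} = -\mathbf{F}^t\mathbf{W}^{-1}\hat{\mathbf{F}} = -\hat{\mathbf{F}}^t\mathbf{W}^{-1}\mathbf{F} = -\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}
$$

Finally:

$$
S = \mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\,\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}
$$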



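Optimal estimates $\hat{\mathbf{X}}$ of $\mathbf{X}$, to minimize $S$

$$
S = \mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\,\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}
$$

$S$ will be minimum when $\dfrac{\partial S}{\partial x_i} = 0$ for $1 \leq i \leq n$, i.e. when the differential of $S$ is zero: $\delta S = \delta\left(\mathbf{R}^t\mathbf{W}^{-1}\mathbf{R}\right) = 0$

$$
\begin{aligned}
\delta S &= \delta\left(\mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\,\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}\right) \\
&= \delta\left(\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\,\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}\right) \\
&= 2\,\delta\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\,\delta\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F} \\
&= 2\,\delta\hat{\mathbf{X}}^t\left(\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}\right) = 0
\end{aligned}
$$

Let's call $\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}$ and $\mathbf{V} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$

$\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}$ is the Normal Matrix

$\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$ are the Normal Equations

The estimate $\hat{\mathbf{X}} = \mathbf{A}^{-1}\mathbf{V}$ will minimize $S$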
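The normal equations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the straight-line model $f = x_1 + x_2 t$, the data values and the helper names `normal_equations` and `solve_2x2` are invented for the example and do not come from the lecture.

```python
# Weighted linear least squares through the normal equations A X = V.
# Hypothetical toy problem: fit f = x1 + x2 * t to four measurements.

def normal_equations(D, F, sigma):
    """Build A = D^t W^-1 D and V = D^t W^-1 F, with weights w_k = 1 / sigma_k^2."""
    m, n = len(D), len(D[0])
    w = [1.0 / s ** 2 for s in sigma]
    A = [[sum(w[k] * D[k][i] * D[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    V = [sum(w[k] * F[k] * D[k][i] for k in range(m)) for i in range(n)]
    return A, V

def solve_2x2(A, V):
    """Solve a 2x2 system A X = V with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if abs(det) < 1e-12:
        raise ValueError("normal matrix is singular")
    x1 = (V[0] * A[1][1] - A[0][1] * V[1]) / det
    x2 = (A[0][0] * V[1] - A[1][0] * V[0]) / det
    return [x1, x2]

t = [0.0, 1.0, 2.0, 3.0]
D = [[1.0, tk] for tk in t]          # design matrix: d_k1 = 1, d_k2 = t_k
F = [1.0, 3.0, 5.0, 7.0]             # invented observations, exactly 1 + 2 t
sigma = [0.1, 0.2, 0.1, 0.3]         # invented experimental uncertainties

A, V = normal_equations(D, F, sigma)
X_hat = solve_2x2(A, V)
print(X_hat)                         # close to [1.0, 2.0] because the data are noise-free
```

Because the invented observations follow the model exactly, the estimate does not depend on the weights; with noisy data the weights would decide how strongly each observation pulls on the solution.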


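A closer look at $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$: the normal matrix $\mathbf{A}$

$$
\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D} =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_1} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_1}{\partial x_n} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix}
\begin{pmatrix}
w_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & w_m
\end{pmatrix}
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix}
$$

Dimensions: ($n \times m$) ($m \times m$ diagonal) ($m \times n$)

… do the maths …

$\mathbf{A}$ is a square $n \times n$ symmetric matrix of elements:

$$
A_{ij} = \sum_{k=1}^{m} w_k \frac{\partial f_k}{\partial x_i}\frac{\partial f_k}{\partial x_j}
$$

$$
\mathbf{A} =
\begin{pmatrix}
\sum_{k=1}^{m} w_k \left(\dfrac{\partial f_k}{\partial x_1}\right)^2 & \cdots & \sum_{k=1}^{m} w_k \dfrac{\partial f_k}{\partial x_1}\dfrac{\partial f_k}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\sum_{k=1}^{m} w_k \dfrac{\partial f_k}{\partial x_n}\dfrac{\partial f_k}{\partial x_1} & \cdots & \sum_{k=1}^{m} w_k \left(\dfrac{\partial f_k}{\partial x_n}\right)^2
\end{pmatrix}
$$

$\mathbf{A}$ must be "positive definite" to be inverted in order to obtain $\hat{\mathbf{X}} = \mathbf{A}^{-1}\mathbf{V}$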


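A closer look at $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$: the vector $\mathbf{V}$

$$
\mathbf{V} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F} =
\begin{pmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_1} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_1}{\partial x_n} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{pmatrix}
\begin{pmatrix}
w_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & w_m
\end{pmatrix}
\begin{pmatrix}
f_1 \\ \vdots \\ f_m
\end{pmatrix}
$$

Dimensions: ($n \times m$) ($m \times m$ diagonal) ($m$)

… do the maths …

$\mathbf{V}$ is a vector of dimension $n$ with elements:

$$
V_j = \sum_{k=1}^{m} w_k f_k \frac{\partial f_k}{\partial x_j}
\qquad\text{i.e.}\qquad
\mathbf{V} =
\begin{pmatrix}
\sum_{k=1}^{m} w_k f_k \dfrac{\partial f_k}{\partial x_1} \\
\vdots \\
\sum_{k=1}^{m} w_k f_k \dfrac{\partial f_k}{\partial x_n}
\end{pmatrix}
$$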



Application to Least Squares refinement in crystallography

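$f_i \to I(hkl)$ and $x_i \to$ the atomic fractional coordinates

The $I(hkl) = |F(hkl)|^2$ are not linear functions of the model parameters $x_i$:

$$
F(hkl) = F(\mathbf{H}) = \sum_{j \,\in\, \text{unit cell}} f_j(\mathbf{H}) \exp\left(2\pi i\, \mathbf{H}^t\mathbf{r}_j\right)
$$

with $\mathbf{H} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ and $\mathbf{r}_j = x_j\mathbf{a} + y_j\mathbf{b} + z_j\mathbf{c}$, so that:

$$
F(hkl) = F(\mathbf{H}) = \sum_{j \,\in\, \text{unit cell}} f_j(\mathbf{H}) \exp\left(2\pi i\,(hx_j + ky_j + lz_j)\right)
$$

In the following:

$Y_{j,obs}$ can refer either to $I_{obs} = F_{obs}^2$ or to $F_{obs}$, for observation $j$ among the $m$ $(hkl)$

$Y_{j,calc}(\mathbf{X_0})$ can refer either to $I_{calc} = F_{calc}^2$ or to $F_{calc}$, computed at the "approximate" value $\mathbf{X_0}$ of the model parameters vector $\mathbf{X}$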

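However, we want to minimize the function:

$$
S = \sum_{j=1}^{m} w_j \left(f_j - \hat{f}_j\right)^2
\;\;\to\;\;
S = \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right)^2
$$

The $k$ parameter = "scale factor", a global parameter of the refinement

Finding parameter values that minimize $S$ means $\dfrac{\partial S}{\partial x_i} = S_i' = 0$

Let's expand $S_i'$ to first order around $\mathbf{X_0}$:

$$
S_i' = S_i'(\mathbf{X_0}) + \sum_{k=1}^{n} \frac{\partial S_i'(\mathbf{X_0})}{\partial x_k}\,\delta x_k + \cdots
$$

with:

$$
S_i'(\mathbf{X_0}) = \frac{\partial S}{\partial x_i} = -2 \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial\, kY_{j,calc}}{\partial x_i}
$$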


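Finally, plugging the expression for $S_i'(\mathbf{X_0})$ into $S_i' = S_i'(\mathbf{X_0}) + \sum_{k=1}^{n} \dfrac{\partial S_i'(\mathbf{X_0})}{\partial x_k}\,\delta x_k + \cdots$ and removing the second-derivative terms:

$$
S_i' = -2 \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial\, kY_{j,calc}}{\partial x_i}
+ 2 \sum_{k=1}^{n} \left[\sum_{j=1}^{m_{hkl}} w_j \frac{\partial\, kY_{j,calc}}{\partial x_i}\frac{\partial\, kY_{j,calc}}{\partial x_k}\right] \delta x_k = 0
$$

Slightly rearranged:

$$
\sum_{k=1}^{n} \left[\sum_{j=1}^{m_{hkl}} w_j \frac{\partial\, kY_{j,calc}}{\partial x_i}\frac{\partial\, kY_{j,calc}}{\partial x_k}\right] \delta x_k
= \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial\, kY_{j,calc}}{\partial x_i}
$$

… and there are $n$ equations of the same type, one for each $1 \leq i \leq n$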


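The $n$ equations:

$$
\sum_{k=1}^{n} \left[\sum_{j=1}^{m_{hkl}} w_j \frac{\partial\, kY_{j,calc}}{\partial x_i}\frac{\partial\, kY_{j,calc}}{\partial x_k}\right] \delta x_k
= \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial\, kY_{j,calc}}{\partial x_i}
$$

… are equivalent to a set of linear LS equations, of type $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$

With $\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}$:
design matrix $\mathbf{D}$ of elements $D_{ji} = \dfrac{\partial\, kY_{j,calc}}{\partial x_i}$
diagonal variances matrix $\mathbf{W}$ of elements $W_{jj} = \sigma_j^2$

And $\mathbf{V} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$:
observation vector $\mathbf{F}$ of elements $F_j = Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})$, which contains the experimental measurements

And $\hat{\mathbf{X}}$ a vector of elements $\hat{X}_i = \delta x_i$

Solutions of the LS equations are SHIFTS applied to the parameters of the model: $x_i' = x_i + \delta x_i$

Various approximations: shifts are under- or over-estimated, so several least-squares cycles are needed

The least-squares refinement of a crystal structure is an ITERATIVE PROCESS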


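Another way

Let's expand $Y_{j,calc}$ to first order around $\mathbf{X_0}$:

$$
Y_{j,calc}(\mathbf{X_0} + \boldsymbol{\delta x}) = Y_{j,calc}(\mathbf{X_0}) + \sum_{i=1}^{n} \frac{\partial Y_{j,calc}}{\partial x_i}\,\delta x_i
$$

$\boldsymbol{\delta x}$ corresponds to "small" modifications of the parameter vector $\mathbf{X}$

Let's assume this modification makes $Y_{j,calc}$ a better estimate of the corresponding $Y_{j,obs}$, so that: $Y_{j,calc}(\mathbf{X_0} + \boldsymbol{\delta x}) = Y_{j,obs}$

We obtain, for each of the $1 \leq j \leq m$ observations:

$$
Y_{j,obs} - Y_{j,calc}(\mathbf{X_0}) = \sum_{i=1}^{n} \frac{\partial Y_{j,calc}}{\partial x_i}\,\delta x_i
$$

Thus the system of $m$ equations:

$$
\begin{aligned}
Y_{1,obs} - Y_{1,calc}(\mathbf{X_0}) &= \frac{\partial Y_{1,calc}}{\partial x_1}\delta x_1 + \frac{\partial Y_{1,calc}}{\partial x_2}\delta x_2 + \cdots + \frac{\partial Y_{1,calc}}{\partial x_n}\delta x_n \\
Y_{2,obs} - Y_{2,calc}(\mathbf{X_0}) &= \frac{\partial Y_{2,calc}}{\partial x_1}\delta x_1 + \frac{\partial Y_{2,calc}}{\partial x_2}\delta x_2 + \cdots + \frac{\partial Y_{2,calc}}{\partial x_n}\delta x_n \\
&\;\;\vdots \\
Y_{m,obs} - Y_{m,calc}(\mathbf{X_0}) &= \frac{\partial Y_{m,calc}}{\partial x_1}\delta x_1 + \frac{\partial Y_{m,calc}}{\partial x_2}\delta x_2 + \cdots + \frac{\partial Y_{m,calc}}{\partial x_n}\delta x_n
\end{aligned}
$$

Which corresponds to: $\mathbf{D}\mathbf{X} = \mathbf{F}$

With $\mathbf{X} \equiv \boldsymbol{\delta x}$, $\mathbf{F} \equiv \mathbf{Y}_{obs} - \mathbf{Y}_{calc}$ and $\mathbf{D}$ the design matrix of partial derivatives of $Y_{j,calc}$ with respect to the model parameters

… plugged into the general LS equations, this leads to the same results ($\mathbf{A}$, $\mathbf{V}$)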


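The Normal Matrix is square, $n \times n$, and symmetric, with $n$ the number of refined variables

$$
\mathbf{A} =
\begin{pmatrix}
\sum_{j=1}^{m_{hkl}} w_j \dfrac{\partial\, kY_{j,calc}}{\partial x_1}\dfrac{\partial\, kY_{j,calc}}{\partial x_1} & \cdots & \sum_{j=1}^{m_{hkl}} w_j \dfrac{\partial\, kY_{j,calc}}{\partial x_1}\dfrac{\partial\, kY_{j,calc}}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\sum_{j=1}^{m_{hkl}} w_j \dfrac{\partial\, kY_{j,calc}}{\partial x_n}\dfrac{\partial\, kY_{j,calc}}{\partial x_1} & \cdots & \sum_{j=1}^{m_{hkl}} w_j \dfrac{\partial\, kY_{j,calc}}{\partial x_n}\dfrac{\partial\, kY_{j,calc}}{\partial x_n}
\end{pmatrix}
$$

Contains the derivatives of $I_{calc}$ or $F_{calc}$

Must be inverted: it should not be singular (i.e. $\det \mathbf{A} \neq 0$ is required) in order to be inverted!

Has an important property: $\mathbf{A}^{-1} = \mathbf{M}$, where $\mathbf{M}$ is the variance-covariance matrix of the refined parameters

$$
\mathbf{M} =
\begin{pmatrix}
\sigma^2_{x_1} & \cdots & cov(x_1, x_n) \\
\vdots & \ddots & \vdots \\
cov(x_n, x_1) & \cdots & \sigma^2_{x_n}
\end{pmatrix}
$$

Gives access (square root of the diagonal values) to the uncertainties of the refined parameters

and to the correlation between parameters:

$$
c_{x_i,x_j} = \frac{cov(x_i, x_j)}{\sigma_i\sigma_j} = \frac{M_{ij}}{\sqrt{M_{ii}M_{jj}}}
$$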
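As a small, hedged illustration of how uncertainties and correlations follow from $\mathbf{M} = \mathbf{A}^{-1}$, the sketch below inverts an invented $2 \times 2$ normal matrix. The numerical values and the function names `invert_2x2` and `sigmas_and_correlation` are assumptions made for this example only.

```python
import math

def invert_2x2(A):
    """Invert a 2x2 matrix; raises if the matrix is singular."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if abs(det) < 1e-12:
        raise ValueError("normal matrix is singular")
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def sigmas_and_correlation(A):
    """Return (sigma_1, sigma_2, c_12) from the variance-covariance matrix M = A^-1."""
    M = invert_2x2(A)
    s1 = math.sqrt(M[0][0])            # uncertainty of parameter 1
    s2 = math.sqrt(M[1][1])            # uncertainty of parameter 2
    c12 = M[0][1] / (s1 * s2)          # correlation coefficient M_12 / sqrt(M_11 M_22)
    return s1, s2, c12

A = [[4.0, 2.0],
     [2.0, 3.0]]                       # invented normal matrix
s1, s2, c12 = sigmas_and_correlation(A)
print(s1, s2, c12)
```

For this matrix the correlation coefficient is $-1/\sqrt{3} \approx -0.58$: a positive off-diagonal term in $\mathbf{A}$ produces a negative correlation between the two parameters.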


Crystallographic agreement factors (most important / used)

Residual relative errors (%) when $Y_{calc}$ are compared to $Y_{obs}$

Should decrease toward a "low" value at convergence of the refinement

Unweighted (SHELXL's "R1"):

$$
R1(F) = \frac{\sum_{hkl} \left|\, |F_{obs}| - |F_{calc}| \,\right|}{\sum_{hkl} |F_{obs}|}
$$

Weighted (SHELXL's "wR2"):

$$
wR2(I) = \sqrt{\frac{\sum_{hkl} w_I \left(F_{obs}^2 - F_{calc}^2\right)^2}{\sum_{hkl} w_I \left(F_{obs}^2\right)^2}}
$$

Possible variations: $wR2(F)$, $R1(I) = \dfrac{\sum_{hkl} |I_{obs} - I_{calc}|}{\sum_{hkl} I_{obs}}$, etc.

Goodness-of-Fit (SHELXL's "GooF"):

$$
GooF = \sqrt{\frac{\sum_{hkl} w_I \left(F_{obs}^2 - F_{calc}^2\right)^2}{m - n}}
$$

Should aim at 1.0 if the model is correct and the $w_I$ (hence the $\sigma_I$) are well estimated

$\chi^2$ distribution with $m - n$ degrees of freedom
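The agreement factors above translate directly into code. The following is a minimal sketch with invented amplitudes and uncertainties; the function names `r1`, `wr2` and `goof` are hypothetical and the weights are simply taken as $w = 1/\sigma_I^2$, which is an assumption (refinement programs generally use more elaborate weighting schemes).

```python
import math

def r1(f_obs, f_calc):
    """Unweighted R1 on amplitudes."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return num / sum(abs(fo) for fo in f_obs)

def wr2(i_obs, i_calc, w):
    """Weighted wR2 on intensities (F^2)."""
    num = sum(wi * (io - ic) ** 2 for wi, io, ic in zip(w, i_obs, i_calc))
    den = sum(wi * io ** 2 for wi, io in zip(w, i_obs))
    return math.sqrt(num / den)

def goof(i_obs, i_calc, w, n_params):
    """Goodness of fit for m observations and n_params refined parameters."""
    num = sum(wi * (io - ic) ** 2 for wi, io, ic in zip(w, i_obs, i_calc))
    return math.sqrt(num / (len(i_obs) - n_params))

# Invented data: four reflections, one refined parameter
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [11.0, 19.0, 30.0, 42.0]
i_obs = [f ** 2 for f in f_obs]
i_calc = [f ** 2 for f in f_calc]
sigma_i = [10.0, 20.0, 30.0, 40.0]
w = [1.0 / s ** 2 for s in sigma_i]

print(r1(f_obs, f_calc))              # 4 / 100 = 0.04
print(wr2(i_obs, i_calc, w))
print(goof(i_obs, i_calc, w, n_params=1))
```

With these invented numbers the GooF is well above 1, which in a real refinement would suggest either an incomplete model or underestimated uncertainties.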


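In an LS refinement program (SHELXL, OLeX, JANA, CRYSTALS, MoPro …) the following loop is used:

Input: model with parameter vector $\mathbf{X}$ and $(hkl)$ data

1. Compute $kY_{calc}(\mathbf{X})$ for each $(hkl)$ and the current R-factors
2. Compute $\dfrac{\partial\, kY_{calc}}{\partial x_i}(\mathbf{X})$ for each $(hkl)$ and each parameter $x_i$
3. Build the normal matrix $\mathbf{A}$ and the vector $\mathbf{V}$
4. Compute $\mathbf{A}^{-1}$
5. Compute $\boldsymbol{\delta x} = \mathbf{A}^{-1}\mathbf{V}$
6. Apply shifts: $\mathbf{X} = \mathbf{X} + \boldsymbol{\delta x}$
7. If $\max\left(\dfrac{\delta x_i}{\sigma_i}\right) > \varepsilon$, go back to step 1; otherwise the refinement has converged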
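The refinement loop above can be sketched on a deliberately tiny problem. Everything in this example is an assumption made for illustration: a one-dimensional centrosymmetric "structure" with two unit scatterers at $+x$ and $-x$, a fixed scale factor $k = 1$, invented noise-free data, and the hypothetical function names `y_calc`, `dy_dx` and `refine`. With a single refined parameter the normal matrix reduces to a scalar.

```python
import math

def y_calc(h, x):
    """Intensity of a toy 1D structure: unit scatterers at +x and -x."""
    return (2.0 * math.cos(2.0 * math.pi * h * x)) ** 2

def dy_dx(h, x):
    """Derivative of y_calc: y = 2 + 2 cos(4 pi h x), so dy/dx = -8 pi h sin(4 pi h x)."""
    return -8.0 * math.pi * h * math.sin(4.0 * math.pi * h * x)

def refine(indices, y_obs, sigma, x0, eps=0.01, max_cycles=50):
    """Iterate least-squares cycles until |shift| / sigma_x <= eps."""
    x = x0
    w = [1.0 / s ** 2 for s in sigma]
    for cycle in range(1, max_cycles + 1):
        derivs = [dy_dx(h, x) for h in indices]
        A = sum(wj * d * d for wj, d in zip(w, derivs))            # 1x1 normal matrix
        V = sum(wj * (yo - y_calc(h, x)) * d
                for wj, yo, h, d in zip(w, y_obs, indices, derivs))
        if A == 0.0:
            raise ValueError("singular normal matrix")
        shift = V / A                                              # delta x = A^-1 V
        sigma_x = math.sqrt(1.0 / A)                               # from M = A^-1
        x += shift
        if abs(shift) / sigma_x <= eps:
            return x, sigma_x, cycle
    raise RuntimeError("refinement did not converge")

indices = [1, 2, 3]
x_true = 0.10
y_obs = [y_calc(h, x_true) for h in indices]    # invented, noise-free observations
sigma = [0.01, 0.01, 0.01]

x_ref, sigma_x, n_cycles = refine(indices, y_obs, sigma, x0=0.11)
print(x_ref, sigma_x, n_cycles)
```

Several cycles are needed even for this one-parameter case because the intensities are not linear in $x$; each cycle only solves the linearized problem around the current estimate.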

Parameters of the model

Global: scale factor, extinction, "overall" thermal displacement tensor, solvent parameters (proteins)

Atomic: fractional coordinates, occupancies, thermal displacement parameters (+ valence populations and deformation density populations for more advanced models)

All written in the structure factor:

$$
F_{calc}(\mathbf{H}) = \sum_{j} n_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_j\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t\mathbf{r}_j\right)
$$

With, for atom $j$:

$n_j$ the occupancy factor

$f_j(\mathbf{H})$ the scattering factor

$\mathbf{U}_j$ the anisotropic thermal displacement tensor

$\mathbf{r}_j$ the fractional coordinates vector, $\mathbf{r}_j = x_j\mathbf{a} + y_j\mathbf{b} + z_j\mathbf{c}$

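Thermal displacement parameters in crystal structures

The atomic Debye-Waller factor is $T_k(\mathbf{H}) = \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_k\mathbf{H}\right)$

Thermal vibrations (and static disorder) are modelled by a 3x3 symmetric tensor:

$$
\mathbf{U} =
\begin{pmatrix}
U_{11} & U_{12} & U_{13} \\
U_{21} & U_{22} & U_{23} \\
U_{31} & U_{32} & U_{33}
\end{pmatrix}
$$

Anisotropic description: 6 parameters / atom

Hydrogen atoms are usually modelled using isotropic displacements:

$$
T_{j=Hyd}(\mathbf{H}) = \exp\left(-2\pi^2 U_{iso} |\mathbf{H}|^2\right)
$$

The "B-factor" is very often used: $B = 8\pi^2 U$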
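As a quick numerical check of the relation between $U$ and $B$, the sketch below evaluates the isotropic Debye-Waller factor in both notations. It assumes $|\mathbf{H}| = 1/d$ in Å⁻¹ and $s = \sin\theta/\lambda = |\mathbf{H}|/2$; the function names and the numerical values are invented for the example.

```python
import math

def b_from_u(u_iso):
    """B-factor (in A^2) from U_iso (in A^2): B = 8 pi^2 U."""
    return 8.0 * math.pi ** 2 * u_iso

def debye_waller_iso(u_iso, h_norm):
    """T = exp(-2 pi^2 U_iso |H|^2), with |H| = 1/d in 1/A."""
    return math.exp(-2.0 * math.pi ** 2 * u_iso * h_norm ** 2)

def debye_waller_from_b(b, s):
    """Same factor written with B and s = sin(theta)/lambda = |H| / 2."""
    return math.exp(-b * s ** 2)

u_iso = 0.05                 # invented value, A^2
h_norm = 0.5                 # reflection at d = 2 A
t_u = debye_waller_iso(u_iso, h_norm)
t_b = debye_waller_from_b(b_from_u(u_iso), h_norm / 2.0)
print(b_from_u(u_iso), t_u, t_b)
```

Both expressions give the same attenuation, which is why $U$ and $B$ can be used interchangeably once the factor $8\pi^2$ is accounted for.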

Thermal displacement parameters in crystal structures

U, B are expressed in Å²: mean square displacement of atoms with respect to their mean position

A second-rank tensor can be plotted using its representation ellipsoid:

$$
\mathbf{x}^t\mathbf{U}^{-1}\mathbf{x} = C
$$

$C$ has been tabulated to represent the probability of finding the atom in the volume enclosed by the ellipsoid. Usually represented at 50% probability

[Figure: displacement ellipsoids drawn at 20%, 50% and 90% probability]

$\mathbf{U}$ must be positive definite (and therefore invertible) for the ellipsoid to have a "positive volume". Otherwise $\mathbf{U}$ is said to be "non positive definite"


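Where are the crystal (space group) symmetries here?

"Hidden" in the (squared) structure factor amplitude used in $S$:

$$
F_{calc}(\mathbf{H}) = \sum_{j} n_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_j\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t\mathbf{r}_j\right)
$$

is actually:

$$
F_{calc}(\mathbf{H}) = \sum_{j \,\in\, \text{asym. unit}} \;\sum_{s=1}^{N_s} n'_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_{j,s}\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t(\mathbf{R}_s\mathbf{r}_j + \mathbf{T}_s)\right)
$$

With:

$N_s$ = multiplicity of the general position in the considered space group

$\mathbf{R}_s$ and $\mathbf{T}_s$ = rotation and translation parts of the symmetry operation $s$

$\mathbf{U}_{j,s}$ = thermal displacement tensor transformed by the symmetry operation

$n'_j$ = atomic occupancy factor defined as $n'_j = \dfrac{n_j m_j}{N_s}$, with $m_j$ the multiplicity of the atom site (Wyckoff position)

In a crystallographic LS refinement, $|F_{calc}(\mathbf{H})|$ is actually used:

$F_{calc}(\mathbf{H}) = A(\mathbf{H}) + iB(\mathbf{H})$, hence $|F_{calc}(\mathbf{H})| = \sqrt{A(\mathbf{H})^2 + B(\mathbf{H})^2}$

Note about $\mathbf{H}^t\mathbf{U}_{j,s}\mathbf{H}$ and $\mathbf{H}^t(\mathbf{R}_s\mathbf{r}_j + \mathbf{T}_s)$:

$\mathbf{U}_j$ is transformed into $\mathbf{U}_{j,s}$ by the rotation matrix $\mathbf{R}_s$: $\mathbf{U}_{j,s} = \mathbf{R}_s\mathbf{U}_j\mathbf{R}_s^t$

It is computationally more efficient to compute once and for all $\mathbf{H}_s^t = \mathbf{H}^t\mathbf{R}_s$, so that:

$$
F_{calc}(\mathbf{H}) = \sum_{j \,\in\, \text{asym. unit}} \;\sum_{s=1}^{N_s} n'_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}_s^t\mathbf{U}_j\mathbf{H}_s\right) \exp\left(2\pi i\,(\mathbf{H}_s^t\mathbf{r}_j + \mathbf{H}^t\mathbf{T}_s)\right)
$$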
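The equivalence between transforming the atoms and transforming the reflection index can be checked numerically. The sketch below is an illustration under several assumptions: the four general-position operations of $P2_1/c$ are used as an example space group, the scattering factors are taken as constants, thermal motion is omitted, and the atoms and function names (`f_calc_direct`, `f_calc_rotated_index`) are invented.

```python
import cmath
import math

# Symmetry operations (R_s, T_s) of P2_1/c (general position), used as an example
OPS = [
    ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0.0, 0.0, 0.0]),     # x, y, z
    ([[-1, 0, 0], [0, 1, 0], [0, 0, -1]], [0.0, 0.5, 0.5]),   # -x, y+1/2, -z+1/2
    ([[-1, 0, 0], [0, -1, 0], [0, 0, -1]], [0.0, 0.0, 0.0]),  # -x, -y, -z
    ([[1, 0, 0], [0, -1, 0], [0, 0, 1]], [0.0, 0.5, 0.5]),    # x, -y+1/2, z+1/2
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def f_calc_direct(H, atoms):
    """Sum over atoms and operations using the transformed positions R_s r_j + T_s."""
    total = 0j
    for occ, f, r in atoms:
        for R, T in OPS:
            r_s = [dot(R[i], r) + T[i] for i in range(3)]
            total += occ * f * cmath.exp(2j * math.pi * dot(H, r_s))
    return total

def f_calc_rotated_index(H, atoms):
    """Same sum, computing H_s^t = H^t R_s and H^t T_s once per operation."""
    total = 0j
    for R, T in OPS:
        H_s = [sum(H[i] * R[i][j] for i in range(3)) for j in range(3)]
        phase_t = dot(H, T)
        for occ, f, r in atoms:
            total += occ * f * cmath.exp(2j * math.pi * (dot(H_s, r) + phase_t))
    return total

# Invented asymmetric unit: (occupancy n'_j, constant scattering factor, fractional coordinates)
ATOMS = [(1.0, 6.0, [0.10, 0.20, 0.30]),
         (1.0, 8.0, [0.35, 0.05, 0.60])]
H = [1, 2, 3]

F1 = f_calc_direct(H, ATOMS)
F2 = f_calc_rotated_index(H, ATOMS)
print(F1, F2)
```

Both routes give the same structure factor, and because the example space group is centrosymmetric with the inversion centre at the origin, the imaginary part $B(\mathbf{H})$ vanishes.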



Constraints / restraints apply to parameters of the model, or to functions of the parameters

Restraints

Most common = "target restraints": preventing a function of the model parameters from deviating "too much" from an ideal target value

Modification of the minimized residual:

$$
S = \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - Y_{j,calc}(\mathbf{X_0})\right)^2 + \sum_{n=1}^{N_{restr.}} w_n \left(R_{target} - R_{calc}(\mathbf{X})\right)^2
$$

$w_n$ = weight of the $n$th restraint

$R_{target}$ = target value

$R_{calc}(\mathbf{X})$ = value calculated from a subset $\mathbf{X}$ of the parameters

Restraints add terms to the normal matrix elements (and to $\mathbf{V}$) related to the parameters $\mathbf{X}$:

$$
A_{ik} = \sum_{j=1}^{m_{hkl}} w_j \frac{\partial Y_{j,calc}}{\partial x_i}\frac{\partial Y_{j,calc}}{\partial x_k} + \sum_{n=1}^{N_{restr.}} w_n \frac{\partial R_{calc}}{\partial x_i}\frac{\partial R_{calc}}{\partial x_k}
$$

Restraints increase correlations between parameters

Adding restraints = adding observations (overdetermined system of normal equations!)

Types of restraints:

Stereochemistry: interatomic distances, valence angles, torsion angles, planarity

Thermal displacements:
Similar & proportional between bonded atoms
Similar projection along covalent bond ("rigid bond restraint")
Limited anisotropy

Similarity restraints, chiral volume, etc.


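Example with a restraint on an interatomic distance

Two atoms of fractional coordinates $\mathbf{r}_1$ and $\mathbf{r}_2$, and $\mathbf{G}$ the metric tensor:

$$
d = \sqrt{(\mathbf{r}_2 - \mathbf{r}_1)^t\,\mathbf{G}\,(\mathbf{r}_2 - \mathbf{r}_1)}
$$

There are derivatives of $d$ with respect to $x_1, y_1, z_1$ and $x_2, y_2, z_2$

[Figure: normal matrix $\mathbf{A}$ with rows and columns labelled $x_1\, y_1\, z_1\, x_2\, y_2\, z_2$. The restraint augments the block-diagonal terms of each atom and the off-diagonal terms coupling the two atoms]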
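A minimal sketch of the distance restraint described above, assuming unit-cell parameters in Å and degrees. The function names (`metric_tensor`, `distance_and_gradient`, `restraint_terms`), the cell, the target value and the weight are all invented for illustration; the derivative used is $\partial d/\partial \mathbf{r}_2 = \mathbf{G}(\mathbf{r}_2 - \mathbf{r}_1)/d$.

```python
import math

def metric_tensor(a, b, c, alpha, beta, gamma):
    """Metric tensor G from cell lengths (A) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(v)) for v in (alpha, beta, gamma))
    return [[a * a, a * b * cg, a * c * cb],
            [a * b * cg, b * b, b * c * ca],
            [a * c * cb, b * c * ca, c * c]]

def distance_and_gradient(r1, r2, G):
    """Distance d and its derivatives with respect to (x1, y1, z1, x2, y2, z2)."""
    delta = [r2[i] - r1[i] for i in range(3)]
    G_delta = [sum(G[i][j] * delta[j] for j in range(3)) for i in range(3)]
    d = math.sqrt(sum(delta[i] * G_delta[i] for i in range(3)))
    grad_r2 = [g / d for g in G_delta]
    grad_r1 = [-g for g in grad_r2]
    return d, grad_r1 + grad_r2

def restraint_terms(r1, r2, G, d_target, weight):
    """6x6 block added to the normal matrix and 6-vector added to V by one restraint."""
    d, grad = distance_and_gradient(r1, r2, G)
    block = [[weight * gi * gk for gk in grad] for gi in grad]
    vec = [weight * (d_target - d) * gi for gi in grad]
    return d, block, vec

G = metric_tensor(10.0, 10.0, 10.0, 90.0, 90.0, 90.0)   # invented cubic cell
r1 = [0.0, 0.0, 0.0]
r2 = [0.1, 0.0, 0.0]
d, block, vec = restraint_terms(r1, r2, G, d_target=1.1, weight=100.0)
print(d, vec)
```

The returned block couples the coordinates of the two atoms, which is why both the diagonal blocks and the off-diagonal terms of the normal matrix are augmented.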


Constraints

Can be seen as infinitely strong restraints: no deviation allowed

Implicit: choice of model (e.g. isotropic atoms, scattering factors)

Explicit: relationships between parameters or fixed values for parameters

Constraints as fixed parameters:

Special positions, example in $P2_1/m$: the fractional coordinates of an atom at Wyckoff position d of $P2_1/m$ cannot be refined. They are fixed at (½ 0 ½)

Constraints as parameter relationships:

Example in $P23$: the fractional coordinates of an atom at Wyckoff position e of $P23$ must be constrained so that:

$x = x'$, $y = x'$, $z = x'$

Only one refined parameter: $x'$

Reparameterization: $Y_{calc}(x, y, z) \to Y_{calc}(x(x'), y(x'), z(x')) = Y_{calc}(x')$

The normal matrix is reduced in size using chain-rule derivatives, as $x(x')$, $y(x')$ and $z(x')$ are known.
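The chain-rule reduction can be written as a matrix product: if $\mathbf{C}$ holds the derivatives of the original parameters with respect to the refined ones, the reduced design matrix is $\mathbf{D}' = \mathbf{D}\mathbf{C}$. The sketch below applies this to the $x = y = z = x'$ case with an invented design matrix and unit weights; the helper names are hypothetical.

```python
def matmul(P, Q):
    """Plain matrix product of two lists of lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

# Constraint matrix C = d(x, y, z) / d(x'): all three coordinates follow x'
C = [[1.0],
     [1.0],
     [1.0]]

def constrain(D):
    """Reduced design matrix D' = D C and reduced normal matrix A' = D'^t D' (unit weights)."""
    D_red = matmul(D, C)
    A_red = matmul(transpose(D_red), D_red)
    return D_red, A_red

# Invented derivatives (dY/dx, dY/dy, dY/dz) for two reflections
D = [[1.0, 2.0, 3.0],
     [0.5, -1.0, 0.0]]
D_red, A_red = constrain(D)
print(D_red, A_red)      # dY/dx' = dY/dx + dY/dy + dY/dz for each reflection
```

The three-parameter normal matrix collapses to a single element, so the singularity that would arise from refining $x$, $y$ and $z$ independently on this site never appears.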


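Anisotropic thermal displacement parameters for atoms on special positions

Site symmetry restriction on the elements of the $U_{ij}$ tensor!

Application of the symmetry operations must leave the representation ellipsoid unchanged

If $\mathbf{R}$ is a rotation matrix of the site symmetry and $\mathbf{U}$ a thermal displacement tensor, then:

$$
\mathbf{R}\mathbf{U}\mathbf{R}^t = \mathbf{U}
$$

Example for an atom on a 4-fold axis along [001]:

$$
\mathbf{U} =
\begin{pmatrix}
U_{11} & U_{12} & U_{13} \\
U_{21} & U_{22} & U_{23} \\
U_{31} & U_{32} & U_{33}
\end{pmatrix}
\qquad
\mathbf{R} =
\begin{pmatrix}
0 & -1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1
\end{pmatrix}
\qquad\to\qquad
\mathbf{U} =
\begin{pmatrix}
U_{11} & 0 & 0 \\
0 & U_{11} & 0 \\
0 & 0 & U_{33}
\end{pmatrix}
$$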
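One way to see which tensor components survive the condition $\mathbf{R}\mathbf{U}\mathbf{R}^t = \mathbf{U}$ is to average an arbitrary tensor over all powers of the rotation. This averaging approach is an illustration chosen here, not necessarily how refinement programs implement the constraint, and the starting tensor and function names are invented.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(P):
    return [list(col) for col in zip(*P)]

def symmetrize(U, R, order):
    """Average R^k U (R^k)^t over k = 0 .. order-1 (cyclic site symmetry of given order)."""
    total = [[0.0] * 3 for _ in range(3)]
    Rk = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(order):
        RU = matmul(matmul(Rk, U), transpose(Rk))
        total = [[total[i][j] + RU[i][j] for j in range(3)] for i in range(3)]
        Rk = matmul(R, Rk)
    return [[total[i][j] / order for j in range(3)] for i in range(3)]

R4 = [[0.0, -1.0, 0.0],
      [1.0, 0.0, 0.0],
      [0.0, 0.0, 1.0]]                 # 4-fold rotation along [001]

U = [[0.020, 0.003, 0.001],
     [0.003, 0.040, 0.002],
     [0.001, 0.002, 0.050]]            # invented, unconstrained tensor (A^2)

U_sym = symmetrize(U, R4, order=4)
for row in U_sym:
    print(row)
```

The averaged tensor has equal $U_{11}$ and $U_{22}$ and vanishing off-diagonal terms, matching the constrained form given above, and it is left unchanged by a further application of the rotation.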



Multipolar Modeling

When the resolution of the diffraction data allows bonding electrons to be observed, extra parameters are included in the refinement:
$P_v$ = atomic valence populations
$P_{lm}$ = deformation density populations

Angular dependence $y_{lm}(\theta, \varphi)$: real spherical harmonics

$l = 0$, $m = 0$

$l = 1$, $m = -1, 0, +1$

$l = 2$ …

The $y_{lm}(\theta, \varphi)$ are oriented with respect to a local frame centered on the atomic nucleus

[Figure panels: residual electron density in an N-H bond; dipolar ($l = 1$) function + local axis system; dipolar ($l = 1$) function in the molecular plane]

Problems with atoms in special positions:
The local frame must follow the point symmetry of the site
The $y_{lm}(\theta, \varphi)$ functions must respect the point symmetry as well


Constraints related to site symmetry: "index picking rules"

Example: atom on a twofold axis, with $z$ parallel to 2

Multipoles allowed: any $l$, $m$ even

Among the functions

$l = 0$, $m = 0$

$l = 1$, $m = -1, 0, +1$

$l = 2$, $m = -2, -1, 0, 1, 2$

only those with even $m$ ($m = 0, \pm 2$) are kept

Crystal symmetry constrains the crystallographic modeling



Proteins:
Quite low quality crystals, limited diffraction data resolution (<d> ~ 2 Å)
Made of a large number of atoms (difficult to crystallize)
Only one enantiomer (mirrors and inversion incompatible): 65 remaining space groups!

Assemble in dimers, trimers, tetramers (biological assemblies)
Monomers related by (pseudo) symmetry operations which do not belong to the space group description
Protein crystallographers call them "Non Crystallographic Symmetries" (NCS)
Better to call them "pseudo symmetries": pseudo-translations, pseudo-rotations

Difficulties in space group assignments

Ca2+-gated potassium channel
Unit cell: 44.035 63.452 63.477 90.03 89.99 89.99 … solved and published in $P1$, then in $P42_12$ (Id = 4HZ3)

True pseudo symmetries

Nucleotide-gated ion channel, PDB: 1q43
Space group $I4$, 2 mol / AU
Pseudo 2-fold axis in the ($a$, $c$) plane
Monomers related by ½ − $y$, ½ − $x$, 0.31 − $z$
rms ~0.2 Å and $R_{sym}(I422)$ = 40%, hence $I4$



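Correlations between parameters

Parameters are correlated when they have the same effect on the model

$\Rightarrow \dfrac{\partial Y_{calc}}{\partial x_i} = \dfrac{\partial Y_{calc}}{\partial x_j}$ if $x_i$ and $x_j$ are (perfectly) correlated

The rows and columns of $\mathbf{A}$ related to $x_i$ and $x_j$ then become identical:

$$
\begin{pmatrix}
\sum w \dfrac{\partial Y_{calc}}{\partial x_i}\dfrac{\partial Y_{calc}}{\partial x_i} & \cdots & \sum w \dfrac{\partial Y_{calc}}{\partial x_i}\dfrac{\partial Y_{calc}}{\partial x_j} \\
\vdots & \ddots & \vdots \\
\sum w \dfrac{\partial Y_{calc}}{\partial x_j}\dfrac{\partial Y_{calc}}{\partial x_i} & \cdots & \sum w \dfrac{\partial Y_{calc}}{\partial x_j}\dfrac{\partial Y_{calc}}{\partial x_j}
\end{pmatrix}
=
\begin{pmatrix}
\sum w \left(\dfrac{\partial Y_{calc}}{\partial x_i}\right)^2 & \cdots & \sum w \left(\dfrac{\partial Y_{calc}}{\partial x_i}\right)^2 \\
\vdots & \ddots & \vdots \\
\sum w \left(\dfrac{\partial Y_{calc}}{\partial x_i}\right)^2 & \cdots & \sum w \left(\dfrac{\partial Y_{calc}}{\partial x_i}\right)^2
\end{pmatrix}
$$

If two columns are equal, $\det \mathbf{A} = 0$ and the matrix cannot be inverted

Singular Normal Matrix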


When parameters have similar effects

Large correlations due to $\dfrac{\partial Y_{calc}}{\partial x_i} \approx \dfrac{\partial Y_{calc}}{\partial x_j}$: ill-conditioned matrix

$\det \mathbf{A} \approx 0$ and $\mathbf{A}^{-1}$ is poorly reliable, which leads to large (huge) shifts and uncertainties

Well-known causes of ill-conditioned refinements:
- Correlations between atomic occupancies and thermal displacement parameters
- Overlapping of atomic sites due to disorder
- Missing symmetry constraint (for coordinates or thermal displacement)
- Refinement in a space group of too low symmetry
- Correlations between parameters of an atom in oblique crystal systems
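The effect of nearly identical derivatives on the normal matrix can be seen on a two-parameter toy case. In the sketch below the design matrices are invented: the second column of derivatives differs from the first by a tunable amount `e`, unit weights are assumed, and the function names are hypothetical.

```python
def normal_matrix(D):
    """A = D^t D (unit weights) for a design matrix given as a list of rows."""
    n = len(D[0])
    return [[sum(row[i] * row[j] for row in D) for j in range(n)] for i in range(n)]

def det_2x2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def variance_of_first_parameter(D):
    """M_11 = (A^-1)_11; raises if the two parameters are perfectly correlated."""
    A = normal_matrix(D)
    det = det_2x2(A)
    if abs(det) < 1e-12:
        raise ValueError("singular normal matrix: parameters perfectly correlated")
    return A[1][1] / det

def design(e):
    """Invented derivatives: column 2 equals column 1 when e = 0."""
    return [[1.0, 1.0 + e],
            [2.0, 2.0],
            [3.0, 3.0 - e]]

print(det_2x2(normal_matrix(design(0.0))))      # identical columns: determinant is zero
print(variance_of_first_parameter(design(1.0)))
print(variance_of_first_parameter(design(0.01)))
```

As `e` shrinks the determinant tends to zero and the variance of each parameter grows without bound, which is the numerical signature of an ill-conditioned refinement.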


Correlations between coordinates of an atom in oblique crystal systems

[Figure: oblique $x$ and $y$ axes at an angle $\gamma$, with the shifts $\Delta x$ and $\Delta y$ leading to the true atomic position]

Searching for the true position of an atom: if $\Delta y$ is the coordinate shift along $y$, the "best" $\Delta x$ shift is $\Delta x = -\Delta y \cos(\gamma)$

If $\gamma \neq 90°$, $x$ and $y$ co-vary and are correlated

Example in Feast et al., Acta C, 2009: refinement of methylene aziridine
$P2_1/n$: a = 13.8593 (3), b = 10.5242 (2), c = 14.8044 (4) Å, $\beta$ = 92.0014 (7)°
$P2_1/c$: a = 13.8594 (2), b = 10.5243 (2), c = 19.9230 (3) Å, $\beta$ = 132.0439 (7)°
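The size of this geometric correlation can be estimated with a simple model. The sketch below assumes an isotropic positional uncertainty in Cartesian space, in which case the covariance of the two fractional coordinates is proportional to the inverse of the 2D metric tensor and their correlation equals $-\cos$ of the cell angle. This model and the function name `oblique_correlation` are assumptions made for illustration; the cell values are those quoted above, applied to the $x$ and $z$ coordinates related by $\beta$.

```python
import math

def oblique_correlation(len1, len2, angle_deg):
    """Correlation of two fractional coordinates whose axes make the given angle,
    assuming an isotropic Cartesian positional error (covariance proportional to G^-1)."""
    c = math.cos(math.radians(angle_deg))
    G = [[len1 * len1, len1 * len2 * c],
         [len1 * len2 * c, len2 * len2]]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    M = [[G[1][1] / det, -G[0][1] / det],
         [-G[1][0] / det, G[0][0] / det]]
    return M[0][1] / math.sqrt(M[0][0] * M[1][1])

corr_p21n = oblique_correlation(13.8593, 14.8044, 92.0014)    # nearly orthogonal setting
corr_p21c = oblique_correlation(13.8594, 19.9230, 132.0439)   # strongly oblique setting
print(corr_p21n, corr_p21c)
```

Under this assumption the nearly orthogonal $P2_1/n$ setting gives a correlation of a few percent, while the oblique $P2_1/c$ setting gives roughly 0.67, which illustrates why the choice of cell setting matters for the conditioning of the refinement.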

[Figure: cell transform and coordinates transform between the two settings]


Parois & Lutz, Acta A, 2011


Ill-conditioned refinement caused by missing inversion (space group of too low symmetry)

Significant bibliography about space group corrections (CSD deposited):
Clemente & Marzotto, Acta B, 2002; Henling & Marsh, Acta C, 2014; Marsh, Acta B, 2004, 2005abc, 2007, 2009 …

In 2005, about 10% of CSD deposited structures were inappropriately assigned to $P1$

Description in a more appropriate space group leads to better geometry

Coordinate shifts needed for added symmetries: ~0.05 Å max


Examples of structures deposited in an inappropriate space group

Missing $\bar{1}$ (inversion centre): CSD Id = BAHHUP, or CSD Id = QEQRUZ

Supercell: CSD Id = GUHTAF
