June 7, 2019
Symmetry considerations, restraints and constraints in structure refinement from single-crystal
X-ray diffraction data
B. Guillot
Outline
The Least Squares optimization : general considerations
The Least Squares refinement method in Crystallography
Parameters of the model
Symmetries in Least-Squares
Restraints and constraints … and symmetries
Symmetries in another crystallographic modelling
Symmetries In protein crystallography
Least Squares refinement problems related to symmetries
Examples … tutorials
Mathematics of crystallographic refinement
The Least Squares refinement : general considerations
Crystallographic refinement: optimizing the model parameters against experimental diffraction data
Observations: diffraction intensities $I(hkl)$
Variables: parameters describing a crystal structure
Least Squares optimization method, a very general approach:
$m$ experimental observations $f_1, \dots, f_m$ depend linearly on $n$ parameters $x_1, \dots, x_n$, with $m \ge n$:
$$f_1 = d_{11}x_1 + d_{12}x_2 + \cdots + d_{1n}x_n$$
$$f_2 = d_{21}x_1 + d_{22}x_2 + \cdots + d_{2n}x_n$$
$$\vdots$$
$$f_m = d_{m1}x_1 + d_{m2}x_2 + \cdots + d_{mn}x_n$$
In matrix form: $\mathbf{D}\mathbf{X} = \mathbf{F}$
where $\mathbf{D}$ is the "design matrix", of elements $d_{ij} = \partial f_i / \partial x_j$
The least-squares procedure finds the best estimate of $\mathbf{X}$ to reproduce $\mathbf{F}$
$\hat{\mathbf{X}}$ = estimate of $\mathbf{X}$, $\hat{\mathbf{F}}$ = estimate of $\mathbf{F}$ using $\hat{\mathbf{X}}$. Thus: $\mathbf{D}\hat{\mathbf{X}} = \hat{\mathbf{F}}$
If $\hat{\mathbf{X}}$ is a good estimate, then it minimizes the difference between $\mathbf{F}$ and $\hat{\mathbf{F}}$
Hence, to find the best estimate $\hat{\mathbf{X}}$, Friedrich Gauss proposed to minimize the sum $S$ of the squared, weighted differences between all the $f_i$ and the $\hat{f}_i$:
$$S = \sum_{i=1}^{m} w_i \left(f_i - \hat{f}_i\right)^2$$
The Least Squares refinement : general considerations
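Minimize
$$S = \sum_{i=1}^{m} w_i \left(f_i - \hat{f}_i\right)^2$$
$f_i$ = experimental measurements, which come with experimental uncertainties $\sigma_i$
The weights = inverse of the corresponding variances: $w_i = 1/\sigma_i^2$
Defining:
$\mathbf{W}$ = diagonal matrix of experimental variances ($\mathbf{W}^{-1}$ contains the weights)
$\mathbf{R}$ = residual (column) matrix, $\mathbf{R} = \mathbf{F} - \hat{\mathbf{F}} = \mathbf{F} - \mathbf{D}\hat{\mathbf{X}}$
$S$ can be rewritten in matrix form:
$$S = \mathbf{R}^t\mathbf{W}^{-1}\mathbf{R} = \left(\mathbf{F} - \mathbf{D}\hat{\mathbf{X}}\right)^t \mathbf{W}^{-1} \left(\mathbf{F} - \mathbf{D}\hat{\mathbf{X}}\right)$$
… and developed:
$$S = \mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - \mathbf{F}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$$
$\mathbf{D}\hat{\mathbf{X}} = \hat{\mathbf{F}}$, hence $\hat{\mathbf{X}}^t\mathbf{D}^t = \hat{\mathbf{F}}^t$, so the two last terms are equal:
$$-\mathbf{F}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} = -\mathbf{F}^t\mathbf{W}^{-1}\hat{\mathbf{F}} = -\hat{\mathbf{F}}^t\mathbf{W}^{-1}\mathbf{F} = -\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$$
Finally:
$$S = \mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$$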
The Least Squares refinement : general considerations
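Optimal estimates $\hat{\mathbf{X}}$ of $\mathbf{X}$, to minimize $S$:
$$S = \mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$$
$S$ will be minimum when $\partial S / \partial x_i = 0$ for $1 \le i \le n$
The differential of $S$ is zero: $\delta S = \delta\left(\mathbf{R}^t\mathbf{W}^{-1}\mathbf{R}\right) = 0$
$$\begin{aligned}
\delta S &= \delta\left(\mathbf{F}^t\mathbf{W}^{-1}\mathbf{F} + \hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}\right) \\
&= \delta\left(\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}\right) \\
&= 2\,\delta\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - 2\,\delta\hat{\mathbf{X}}^t\mathbf{D}^t\mathbf{W}^{-1}\mathbf{F} \\
&= 2\,\delta\hat{\mathbf{X}}^t\left(\mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}\hat{\mathbf{X}} - \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}\right) = 0
\end{aligned}$$
(the term $\mathbf{F}^t\mathbf{W}^{-1}\mathbf{F}$ does not depend on $\hat{\mathbf{X}}$, so its differential vanishes)
Let's call $\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}$ and $\mathbf{V} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$
$\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}$ is the Normal Matrix
$\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$ are the Normal Equations
The estimate $\hat{\mathbf{X}} = \mathbf{A}^{-1}\mathbf{V}$ will minimize $S$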
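The normal equations above can be tried numerically. The following is a minimal sketch, assuming NumPy and a made-up straight-line toy problem (the data, the `sigma` values and the function name `solve_normal_equations` are illustrative, not taken from the slides). It builds $\mathbf{A}$ and $\mathbf{V}$ from a design matrix and solves $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$.

```python
import numpy as np

def solve_normal_equations(D, F, sigma):
    """Weighted linear least squares via the normal equations A X = V."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2   # weights w_i = 1 / sigma_i^2
    A = D.T @ (w[:, None] * D)                      # normal matrix A = D^t W^-1 D
    V = D.T @ (w * F)                               # right-hand side V = D^t W^-1 F
    X_hat = np.linalg.solve(A, V)                   # solves A X = V without forming A^-1
    return X_hat, A, V

# Hypothetical toy problem: m = 5 observations of a line f = x1 + x2 * t (n = 2)
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
D = np.column_stack([np.ones_like(t), t])           # design matrix, d_ij = df_i/dx_j
F = np.array([1.1, 2.9, 5.2, 6.8, 9.1])             # made-up observations
sigma = np.array([0.1, 0.1, 0.2, 0.2, 0.1])         # made-up uncertainties

X_hat, A, V = solve_normal_equations(D, F, sigma)
print("estimated parameters:", X_hat)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical convenience; the inverse itself is still needed when uncertainties on the parameters are wanted.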
The Least Squares refinement : general considerations
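A closer look at $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$:
$$\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D} =
\underbrace{\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_1} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_1}{\partial x_n} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{pmatrix}}_{n \times m}
\underbrace{\begin{pmatrix}
w_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & w_m
\end{pmatrix}}_{m \times m \text{ diagonal}}
\underbrace{\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{pmatrix}}_{m \times n}$$
… do the maths …
$\mathbf{A}$ is a square $n \times n$ symmetric matrix of elements:
$$A_{ij} = \sum_{k=1}^{m} w_k \frac{\partial f_k}{\partial x_i}\frac{\partial f_k}{\partial x_j}$$
$$\mathbf{A} = \begin{pmatrix}
\sum_{k=1}^{m} w_k \left(\frac{\partial f_k}{\partial x_1}\right)^2 & \cdots & \sum_{k=1}^{m} w_k \frac{\partial f_k}{\partial x_1}\frac{\partial f_k}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\sum_{k=1}^{m} w_k \frac{\partial f_k}{\partial x_n}\frac{\partial f_k}{\partial x_1} & \cdots & \sum_{k=1}^{m} w_k \left(\frac{\partial f_k}{\partial x_n}\right)^2
\end{pmatrix}$$
$\mathbf{A}$ must be "positive definite" to be inverted in order to obtain $\hat{\mathbf{X}} = \mathbf{A}^{-1}\mathbf{V}$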
The Least Squares refinement : general considerations
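A closer look at $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$:
$$\mathbf{V} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F} =
\underbrace{\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_1} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_1}{\partial x_n} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{pmatrix}}_{n \times m}
\underbrace{\begin{pmatrix}
w_1 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & w_m
\end{pmatrix}}_{m \times m \text{ diagonal}}
\underbrace{\begin{pmatrix}
f_1 \\ \vdots \\ f_m
\end{pmatrix}}_{m}$$
… do the maths …
$\mathbf{V}$ is a vector of dimension $n$ with elements:
$$V_j = \sum_{k=1}^{m} w_k f_k \frac{\partial f_k}{\partial x_j}$$
$$\mathbf{V} = \begin{pmatrix}
\sum_{k=1}^{m} w_k f_k \frac{\partial f_k}{\partial x_1} \\
\vdots \\
\sum_{k=1}^{m} w_k f_k \frac{\partial f_k}{\partial x_n}
\end{pmatrix}$$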
The Least Squares refinement : general considerations
The Least Squares refinement method in Crystallography
Application to Least Squares refinement in crystallography
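$f_i \rightarrow I(hkl)$ and $x_i \rightarrow$ the atomic fractional coordinates?
$I(hkl) = |F(hkl)|^2$ are not linear functions of the model parameters $x_i$:
$$F(hkl) = F(\mathbf{H}) = \sum_{j,\,UnitCell} f_j(\mathbf{H}) \exp\left(2\pi i\, \mathbf{H}^t\mathbf{r}_j\right)$$
$\mathbf{H} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$ and $\mathbf{r}_j = x_j\mathbf{a} + y_j\mathbf{b} + z_j\mathbf{c}$
$$F(hkl) = F(\mathbf{H}) = \sum_{j,\,UnitCell} f_j(\mathbf{H}) \exp\left(2\pi i\,(hx_j + ky_j + lz_j)\right)$$
In the following:
$Y_{j,obs}$ can refer either to $I_{obs} = F_{obs}^2$ or to $F_{obs}$ of observation $j$ among $m$ $(hkl)$
$Y_{j,calc}(\mathbf{X_0})$ can refer either to $I_{calc} = F_{calc}^2$ or to $F_{calc}$ computed at the "approximate" value $\mathbf{X_0}$ of the model parameters vector $\mathbf{X}$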
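However we want to minimize the function:
$$S = \sum_{j=1}^{m} w_j \left(f_j - \hat{f}_j\right)^2 \;\rightarrow\; S = \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right)^2$$
The $k$ parameter = "Scale Factor", a global parameter of the refinement
Finding a solution for the parameter values to minimize $S$ means $\partial S / \partial x_i = S'_i = 0$
Let's expand $S'_i$ to first order around $\mathbf{X_0}$:
$$S'_i = S'_i(\mathbf{X_0}) + \sum_{k=1}^{n} \frac{\partial S'_i(\mathbf{X_0})}{\partial x_k}\,\delta x_k + \cdots$$
With:
$$S'_i(\mathbf{X_0}) = \frac{\partial S}{\partial x_i} = -2 \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial (kY_{j,calc})}{\partial x_i}$$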
The Least Squares refinement method in Crystallography
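Finally, plugging the expression for $S'_i(\mathbf{X_0})$ into $S'_i = S'_i(\mathbf{X_0}) + \sum_{k=1}^{n} \frac{\partial S'_i(\mathbf{X_0})}{\partial x_k}\,\delta x_k + \cdots$
and removing the second derivatives terms:
$$S'_i = -2 \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial (kY_{j,calc})}{\partial x_i} + 2 \sum_{k=1}^{n} \sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_i}\frac{\partial (kY_{j,calc})}{\partial x_k}\,\delta x_k = 0$$
Slightly rearranged:
$$\sum_{k=1}^{n} \left[\sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_i}\frac{\partial (kY_{j,calc})}{\partial x_k}\right] \delta x_k = \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial (kY_{j,calc})}{\partial x_i}$$
… and there are $n$ equations of the same type, one for each $1 \le i \le n$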
The Least Squares refinement method in Crystallography
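The $n$ equations:
$$\sum_{k=1}^{n} \left[\sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_i}\frac{\partial (kY_{j,calc})}{\partial x_k}\right] \delta x_k = \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})\right) \frac{\partial (kY_{j,calc})}{\partial x_i}$$
… are equivalent to a set of linear LS equations, of type $\mathbf{A}\hat{\mathbf{X}} = \mathbf{V}$
With $\mathbf{A} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{D}$:
design matrix $\mathbf{D}$ of elements $D_{ji} = \partial (kY_{j,calc}) / \partial x_i$
diagonal variances matrix $\mathbf{W}$ of elements $W_{jj} = \sigma_j^2$
And $\mathbf{V} = \mathbf{D}^t\mathbf{W}^{-1}\mathbf{F}$:
observation vector of elements $F_j = Y_{j,obs} - kY_{j,calc}(\mathbf{X_0})$, which contains the experimental measurements
And $\hat{\mathbf{X}}$ a vector of elements $\hat{X}_i = \delta x_i$
Solutions of the LS equations are SHIFTS applied to the parameters of the model: $x'_i = x_i + \delta x_i$
Various approximations: shifts are under- or over-estimated, so several least-squares cycles are needed
The least-squares refinement of a crystal structure is an ITERATIVE PROCESS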
The Least Squares refinement method in Crystallography
Another way
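Let's expand $Y_{j,calc}$ to first order around $\mathbf{X_0}$:
$$Y_{j,calc}(\mathbf{X_0} + \boldsymbol{\delta x}) = Y_{j,calc}(\mathbf{X_0}) + \sum_{i=1}^{n} \frac{\partial Y_{j,calc}}{\partial x_i}\,\delta x_i$$
$\boldsymbol{\delta x}$ corresponds to "small" modifications of the parameter vector
Let's assume this modification will make $Y_{j,calc}$ a better estimate of the corresponding $Y_{j,obs}$
So that: $Y_{j,calc}(\mathbf{X_0} + \boldsymbol{\delta x}) = Y_{j,obs}$
We obtain:
$$Y_{j,obs} - Y_{j,calc}(\mathbf{X_0}) = \sum_{i=1}^{n} \frac{\partial Y_{j,calc}}{\partial x_i}\,\delta x_i$$
for each of the $1 \le j \le m$ observations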
The Least Squares refinement method in Crystallography
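Thus the system of $m$ equations:
$$Y_{1,obs} - Y_{1,calc}(\mathbf{X_0}) = \frac{\partial Y_{1,calc}}{\partial x_1}\delta x_1 + \frac{\partial Y_{1,calc}}{\partial x_2}\delta x_2 + \cdots + \frac{\partial Y_{1,calc}}{\partial x_n}\delta x_n$$
$$Y_{2,obs} - Y_{2,calc}(\mathbf{X_0}) = \frac{\partial Y_{2,calc}}{\partial x_1}\delta x_1 + \frac{\partial Y_{2,calc}}{\partial x_2}\delta x_2 + \cdots + \frac{\partial Y_{2,calc}}{\partial x_n}\delta x_n$$
$$\vdots$$
$$Y_{m,obs} - Y_{m,calc}(\mathbf{X_0}) = \frac{\partial Y_{m,calc}}{\partial x_1}\delta x_1 + \frac{\partial Y_{m,calc}}{\partial x_2}\delta x_2 + \cdots + \frac{\partial Y_{m,calc}}{\partial x_n}\delta x_n$$
Which corresponds to: $\mathbf{D}\mathbf{X} = \mathbf{F}$
With $\mathbf{X} \equiv \boldsymbol{\delta x}$, $\mathbf{F} \equiv \mathbf{Y}_{obs} - \mathbf{Y}_{calc}$ and $\mathbf{D}$ the design matrix of partial derivatives of $Y_{j,calc}$ with respect to the model parameters
… plugged into the general LS equations, this leads to the same results ($\mathbf{A}$, $\mathbf{V}$)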
The Least Squares refinement method in Crystallography
The Normal Matrix is square 𝑛 × 𝑛 symmetric, with 𝑛 the number of refined variables
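$$\mathbf{A} = \begin{pmatrix}
\sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_1}\frac{\partial (kY_{j,calc})}{\partial x_1} & \cdots & \sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_1}\frac{\partial (kY_{j,calc})}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_n}\frac{\partial (kY_{j,calc})}{\partial x_1} & \cdots & \sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_n}\frac{\partial (kY_{j,calc})}{\partial x_n}
\end{pmatrix}$$
Contains the derivatives of $I_{calc}$ or $F_{calc}$
Must be inverted: it should not be singular (i.e. $\det \mathbf{A} \neq 0$ is required) in order to be inverted!
Has an important property: $\mathbf{A}^{-1} = \mathbf{M}$, where $\mathbf{M}$ is the variance-covariance matrix of the refined parameters
$$\mathbf{M} = \begin{pmatrix}
\sigma^2_{x_1} & \cdots & cov(x_1, x_n) \\
\vdots & \ddots & \vdots \\
cov(x_n, x_1) & \cdots & \sigma^2_{x_n}
\end{pmatrix}$$
Gives access (square root of diagonal values) to the uncertainties of the refined parameters
and to the correlation between parameters:
$$c_{x_i,x_j} = \frac{cov(x_i, x_j)}{\sigma_i\sigma_j} = \frac{M_{ij}}{\sqrt{M_{ii}M_{jj}}}$$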
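As an illustration of how uncertainties and correlations are read from $\mathbf{M} = \mathbf{A}^{-1}$, here is a short sketch assuming NumPy. The $2 \times 2$ normal matrix is a made-up example and `esd_and_correlation` is a hypothetical helper name; any extra scaling of $\mathbf{M}$ that a given refinement program may apply is not considered here.

```python
import numpy as np

def esd_and_correlation(A):
    """Standard uncertainties and correlation matrix from a normal matrix A."""
    M = np.linalg.inv(A)                 # variance-covariance matrix M = A^-1
    esd = np.sqrt(np.diag(M))            # sigma_i = sqrt(M_ii)
    corr = M / np.outer(esd, esd)        # c_ij = M_ij / sqrt(M_ii * M_jj)
    return esd, corr

# Made-up symmetric, positive definite normal matrix for two parameters
A = np.array([[4.0, 1.0],
              [1.0, 9.0]])
esd, corr = esd_and_correlation(A)
print("uncertainties:", esd)
print("correlation x1/x2:", corr[0, 1])
```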
The Least Squares refinement method in Crystallography
Crystallographic agreement factors (most important / used)
Unweighted (SHELXL's "R1"):
$$R1(F) = \frac{\sum_{hkl} \left|\,|F_{obs}| - |F_{calc}|\,\right|}{\sum_{hkl} |F_{obs}|}$$
Weighted (SHELXL's "wR2"):
$$wR2(I) = \sqrt{\frac{\sum_{hkl} w_I \left(F_{obs}^2 - F_{calc}^2\right)^2}{\sum_{hkl} w_I \left(F_{obs}^2\right)^2}}$$
Possible variations: $wR2(F)$, $R1(I) = \dfrac{\sum_{hkl} |I_{obs} - I_{calc}|}{\sum_{hkl} I_{obs}}$, etc.
Residual relative errors (%) when $Y_{calc}$ are compared to $Y_{obs}$
Should decrease toward a "low" value at convergence of the refinement
Goodness-of-Fit (SHELXL's "GooF"):
$$GooF = \sqrt{\frac{\sum_{hkl} w_I \left(F_{obs}^2 - F_{calc}^2\right)^2}{m - n}}$$
Should aim at 1.0 if the model is correct and the $w_I$ (hence the $\sigma_I$) are well estimated
$\chi^2$ distribution with $m - n$ degrees of freedom
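The agreement factors above translate directly into a few lines of code. This is a minimal sketch assuming NumPy arrays of observed and calculated values already on the same scale; the function names and the four-reflection data set are invented for illustration.

```python
import numpy as np

def r1(f_obs, f_calc):
    """Unweighted R1 on structure factor amplitudes."""
    return np.sum(np.abs(np.abs(f_obs) - np.abs(f_calc))) / np.sum(np.abs(f_obs))

def wr2(f2_obs, f2_calc, w):
    """Weighted wR2 on intensities (F^2)."""
    return np.sqrt(np.sum(w * (f2_obs - f2_calc) ** 2) / np.sum(w * f2_obs ** 2))

def goof(f2_obs, f2_calc, w, n_params):
    """Goodness of fit for m observations and n_params refined parameters."""
    m = len(f2_obs)
    return np.sqrt(np.sum(w * (f2_obs - f2_calc) ** 2) / (m - n_params))

# Made-up data: four reflections, one refined parameter
f_obs = np.array([10.0, 20.0, 30.0, 40.0])
f_calc = np.array([11.0, 19.0, 30.0, 42.0])
sigma_i = np.array([5.0, 8.0, 10.0, 20.0])      # made-up sigma(I)
w = 1.0 / sigma_i ** 2

r1_value = r1(f_obs, f_calc)
wr2_value = wr2(f_obs ** 2, f_calc ** 2, w)
goof_value = goof(f_obs ** 2, f_calc ** 2, w, n_params=1)
print(r1_value, wr2_value, goof_value)
```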
The Least Squares refinement method in Crystallography
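In a LS refinement program (SHELXL, OLeX, JANA, CRYSTALS, MoPro …) the following cycle is used:
Start: model with parameter vector $\mathbf{X}$ and $(hkl)$ data
Compute $kY_{calc}(\mathbf{X})$ for each $(hkl)$ and the current R-factors
Compute $\partial (kY_{calc}) / \partial x_i$ at $\mathbf{X}$ for each $(hkl)$ and each parameter $x_i$
Build the normal matrix $\mathbf{A}$ and the vector $\mathbf{V}$
Compute $\mathbf{A}^{-1}$
Compute $\boldsymbol{\delta x} = \mathbf{A}^{-1}\mathbf{V}$
Apply shifts $\mathbf{X} = \mathbf{X} + \boldsymbol{\delta x}$
If $\max\left(|\delta x_i| / \sigma_i\right) > \varepsilon$, start a new cycle with the updated $\mathbf{X}$; otherwise the refinement has converged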
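The cycle above can be sketched on a deliberately tiny problem. The example below is not any program's actual implementation: it assumes NumPy, a hypothetical one-dimensional centrosymmetric "structure" with a single point scatterer ($f = 1$) at $x$ and its mate at $-x$, a scale factor fixed at $k = 1$, unit weights, and noise-free "observed" intensities generated from $x = 0.12$.

```python
import numpy as np

h = np.arange(1, 5)                      # toy 1D indices h = 1..4

def y_calc(x):
    """I_calc(h) = (2 cos(2 pi h x))^2 for atoms at x and -x (assumed f = 1, k = 1)."""
    return (2.0 * np.cos(2.0 * np.pi * h * x[0])) ** 2

def design_matrix(x):
    """D_ji = dY_j,calc / dx_i, analytical, a single column in this toy case."""
    phase = 2.0 * np.pi * h * x[0]
    return (-16.0 * np.pi * h * np.cos(phase) * np.sin(phase))[:, None]

def refine(x0, y_obs, w, eps=1e-6, max_cycles=50):
    """Iterate least-squares cycles until max(|dx_i| / sigma_i) <= eps."""
    x = np.array(x0, dtype=float)
    for cycle in range(1, max_cycles + 1):
        D = design_matrix(x)
        A = D.T @ (w[:, None] * D)                # normal matrix
        V = D.T @ (w * (y_obs - y_calc(x)))       # vector V from the residuals
        M = np.linalg.inv(A)                      # A^-1
        dx = M @ V                                # shifts
        x = x + dx                                # apply shifts
        sigma = np.sqrt(np.diag(M))
        if np.max(np.abs(dx) / sigma) <= eps:     # convergence test
            break
    return x, sigma, cycle

y_obs = y_calc(np.array([0.12]))         # synthetic "observations"
w = np.ones_like(y_obs)                  # unit weights (assumption)
x_refined, sigma, n_cycles = refine([0.11], y_obs, w)
print(x_refined, n_cycles)
```

Starting from $x = 0.11$, several cycles are needed before the shift becomes negligible, which is the iterative behaviour described above; a starting value far from the true one may converge elsewhere or not at all.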
Parameters of the model
Global : scale factor, extinction, “overall” thermal displacement tensor, solvent parameters (proteins)
Atomic : fractional coordinates, occupancies, thermal displacement parameters (+valence populations, deformation density populations for more advanced model)
All written in the structure factor :
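$$F_{calc}(\mathbf{H}) = \sum_{j,\,UnitCell} n_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_j\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t\mathbf{r}_j\right)$$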
With, for atom 𝒋 :
𝑛𝑗 the occupancy factor
𝑓𝑗 𝑯 the scattering factor
𝑈𝑗 the anisotropic thermal displacement tensor
𝒓𝑗 the fractional coordinates vector 𝒓𝒋 = 𝑥𝑗𝑎 + 𝑦𝑗𝑏 + 𝑧𝑗𝑐
Thermal displacement parameters in crystal structures
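The atomic Debye-Waller factor is $T_k(\mathbf{H}) = \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_k\mathbf{H}\right)$
Thermal vibrations (and static disorder) are modelled by a 3x3 symmetric tensor:
$$\mathbf{U} = \begin{pmatrix}
U_{11} & U_{12} & U_{13} \\
U_{21} & U_{22} & U_{23} \\
U_{31} & U_{32} & U_{33}
\end{pmatrix}$$
Anisotropic description: 6 parameters / atom
Hydrogen atoms are usually modelled using isotropic displacements:
$$T_{j=Hyd}(\mathbf{H}) = \exp\left(-2\pi^2 U_{iso} |\mathbf{H}|^2\right)$$
The "B-factor" is very often used: $B = 8\pi^2 U$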
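A small numerical check of these expressions, as a sketch assuming NumPy. Here $\mathbf{H}$ is assumed to be given as a Cartesian reciprocal-space vector in Å⁻¹ and $\mathbf{U}$ as a Cartesian tensor in Å² so that $\mathbf{H}^t\mathbf{U}\mathbf{H}$ is dimensionless; the numerical values and function names are illustrative.

```python
import numpy as np

def b_from_u(u):
    """B-factor from a mean square displacement U: B = 8 pi^2 U."""
    return 8.0 * np.pi ** 2 * u

def debye_waller_aniso(H, U):
    """T(H) = exp(-2 pi^2 H^t U H) for a 3x3 symmetric tensor U."""
    H = np.asarray(H, dtype=float)
    return np.exp(-2.0 * np.pi ** 2 * (H @ U @ H))

def debye_waller_iso(H, u_iso):
    """T(H) = exp(-2 pi^2 U_iso |H|^2) for an isotropic displacement."""
    H = np.asarray(H, dtype=float)
    return np.exp(-2.0 * np.pi ** 2 * u_iso * (H @ H))

H = np.array([0.3, 0.1, 0.2])            # assumed Cartesian components, in 1/Angstrom
u_iso = 0.02                             # Angstrom^2, made-up value

T_iso = debye_waller_iso(H, u_iso)
T_aniso = debye_waller_aniso(H, u_iso * np.eye(3))   # isotropic tensor U = U_iso * identity
print(T_iso, T_aniso, b_from_u(u_iso))
```

With an isotropic tensor the anisotropic expression reduces to the isotropic one, which is why a single parameter is enough for atoms such as hydrogen.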
Thermal displacement parameters in crystal structures
U, B are expressed in Å²: mean square displacement of atoms with respect to their mean position
A second rank tensor can be plotted using its representation ellipsoid:
$$\mathbf{x}^t\mathbf{U}^{-1}\mathbf{x} = C$$
$C$ has been tabulated to represent the probability of finding the atom in the volume enclosed by the ellipsoid. Usually represented at 50% probability
[Figure: ellipsoids drawn at 20%, 50% and 90% probability]
$\mathbf{U}$ must be invertible, with all eigenvalues positive, to lead to a "positive volume". Otherwise $\mathbf{U}$ is said to be "non positive definite"
Symmetries in Least-Squares
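Where are the crystal (space group) symmetries here?
"Hidden" in the (squared) structure factor amplitude used in $S$:
$$F_{calc}(\mathbf{H}) = \sum_{j,\,UnitCell} n_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_j\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t\mathbf{r}_j\right)$$
Is actually:
$$F_{calc}(\mathbf{H}) = \sum_{j,\,asym.Unit}\ \sum_{s=1}^{N_s} n'_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_{j,s}\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t(\mathbf{R}_s\mathbf{r}_j + \mathbf{T}_s)\right)$$
With:
$N_s$ = multiplicity of the general position in the considered space group
$\mathbf{R}_s$ and $\mathbf{T}_s$ = rotation and translation parts of the symmetry operation $s$
$\mathbf{U}_{j,s}$ = thermal displacement tensor transformed by the symmetry operation
$n'_j$ = atomic occupancy factor defined as $n'_j = n_j m_j / N_s$, with $m_j$ the multiplicity of the atom site (Wyckoff position)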
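In a crystallographic LS refinement, $|F_{calc}(\mathbf{H})|$ is actually used:
$$F_{calc}(\mathbf{H}) = \sum_{j,\,asym.Unit}\ \sum_{s=1}^{N_s} n'_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}^t\mathbf{U}_{j,s}\mathbf{H}\right) \exp\left(2\pi i\, \mathbf{H}^t(\mathbf{R}_s\mathbf{r}_j + \mathbf{T}_s)\right)$$
$F_{calc}(\mathbf{H}) = A(\mathbf{H}) + iB(\mathbf{H})$ hence $|F_{calc}(\mathbf{H})| = \sqrt{A(\mathbf{H})^2 + B(\mathbf{H})^2}$
Note about $\mathbf{H}^t\mathbf{U}_{j,s}\mathbf{H}$ and $\mathbf{H}^t(\mathbf{R}_s\mathbf{r}_j + \mathbf{T}_s)$:
$\mathbf{U}_j$ is transformed into $\mathbf{U}_{j,s}$ by the rotation matrix $\mathbf{R}_s$:
$$\mathbf{U}_{j,s} = \mathbf{R}_s\mathbf{U}_j\mathbf{R}_s^t$$
It is computationally more efficient to compute once and for all $\mathbf{H}_s^t = \mathbf{H}^t\mathbf{R}_s$, so that:
$$F_{calc}(\mathbf{H}) = \sum_{j,\,asym.Unit}\ \sum_{s=1}^{N_s} n'_j f_j(\mathbf{H}) \exp\left(-2\pi^2 \mathbf{H}_s^t\mathbf{U}_j\mathbf{H}_s\right) \exp\left(2\pi i\,(\mathbf{H}_s^t\mathbf{r}_j + \mathbf{H}^t\mathbf{T}_s)\right)$$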
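The symmetry-expanded sum with the precomputed $\mathbf{H}_s$ can be sketched as follows. This is an illustration assuming NumPy, not a production structure factor routine: the scattering factor is taken as a constant (in reality it depends on $\mathbf{H}$), $\mathbf{U}$ is assumed to be expressed in the basis that makes $\mathbf{H}^t\mathbf{U}\mathbf{H}$ dimensionless for integer $hkl$, and the atom, tensor and operations are made up.

```python
import numpy as np

def f_calc(hkl, atoms, symops):
    """Complex F_calc(H) summed over the asymmetric unit and the symmetry operations.

    atoms  : list of (occupancy n'_j, scattering factor f_j, U_j as 3x3 array, r_j)
    symops : list of (R_s as 3x3 array, T_s as length-3 array)
    """
    H = np.asarray(hkl, dtype=float)
    F = 0.0 + 0.0j
    for R, T in symops:
        Hs = R.T @ H                     # H_s^t = H^t R_s, computed once per operation
        phase_T = H @ T                  # H^t T_s
        for occ, f, U, r in atoms:
            dw = np.exp(-2.0 * np.pi ** 2 * (Hs @ U @ Hs))
            F += occ * f * dw * np.exp(2j * np.pi * (Hs @ r + phase_T))
    return F

identity = (np.eye(3), np.zeros(3))
inversion = (-np.eye(3), np.zeros(3))
screw_21 = (np.diag([-1.0, 1.0, -1.0]), np.array([0.0, 0.5, 0.0]))   # 2_1 along b

U = np.array([[0.004, 0.001, 0.000],
              [0.001, 0.003, 0.000],
              [0.000, 0.000, 0.002]])    # made-up tensor
atoms = [(1.0, 6.0, U, np.array([0.10, 0.20, 0.30]))]   # one made-up atom

F_centro = f_calc([1, 2, 3], atoms, [identity, inversion])
F_screw = f_calc([1, 2, 3], atoms, [identity, screw_21])
print(abs(F_centro), abs(F_screw))       # |F| = sqrt(A^2 + B^2)
```

With an inversion centre at the origin the imaginary part $B(\mathbf{H})$ cancels, which the first call illustrates.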
Symmetries in Least-Squares
Restraints and constraints … and symmetries
Constraints /Restraints apply to parameters of the model, or to functions of the parameters
Restraints
Most common = “target restraints”: disallowing a function of the model parameters to deviate “too much” from ideal target values
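Modification of the minimized residual:
$$S = \sum_{j=1}^{m_{hkl}} w_j \left(Y_{j,obs} - Y_{j,calc}(\mathbf{X_0})\right)^2 + \sum_{n=1}^{N_{restr.}} w_n \left(R_{target} - R_{calc}(\mathbf{X})\right)^2$$
$w_n$ = weight of the $n$th restraint
$R_{target}$ = target value
$R_{calc}(\mathbf{X})$ = calculated value, a function of a subset $\mathbf{X}$ of the parameters
Restraints add terms to the normal matrix elements (and to $\mathbf{V}$) related to the parameters $\mathbf{X}$. For each restraint $n$:
$$A_{ik} = \sum_{j=1}^{m_{hkl}} w_j \frac{\partial Y_{j,calc}}{\partial x_i}\frac{\partial Y_{j,calc}}{\partial x_k} + w_n \frac{\partial R_{calc}}{\partial x_i}\frac{\partial R_{calc}}{\partial x_k}$$
Restraints increase correlations between parameters
Adding restraints = adding observations (overdetermined system of normal equations!)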
Types of restraints:
- Stereochemistry: interatomic distances, valence angles, torsion angles, planarity
- Thermal displacements: similar & proportional between bonded atoms, similar projection along covalent bond ("Rigid bond restraint"), limited anisotropy
- Similarity restraints, chiral volume, etc.
Restraints and constraints … and symmetries
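Example with a restraint on an interatomic distance
Two atoms of fractional coordinates $\mathbf{r}_1$ and $\mathbf{r}_2$, and $\mathbf{G}$ the metric tensor:
$$d = \sqrt{(\mathbf{r}_2 - \mathbf{r}_1)^t\,\mathbf{G}\,(\mathbf{r}_2 - \mathbf{r}_1)}$$
There are derivatives of $d$ with respect to $x_1, y_1, z_1$ and $x_2, y_2, z_2$
[Figure: normal matrix $\mathbf{A}$ with rows and columns labelled $x_1, y_1, z_1, x_2, y_2, z_2$, showing the augmented block-diagonal terms (within each atom) and the augmented off-diagonal terms (between the two atoms)]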
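To make the restraint contribution concrete, the sketch below (assuming NumPy) computes $d$ and its six derivatives from the metric tensor, then the $6 \times 6$ block $w\,\nabla d\,\nabla d^t$ added to the normal matrix and the matching terms added to $\mathbf{V}$. The cubic 10 Å cell, the target distance, the restraint sigma and all function names are hypothetical.

```python
import numpy as np

def metric_tensor(a, b, c, alpha, beta, gamma):
    """Metric tensor G from cell lengths (Angstrom) and angles (degrees)."""
    al, be, ga = np.radians([alpha, beta, gamma])
    return np.array([
        [a * a,              a * b * np.cos(ga), a * c * np.cos(be)],
        [a * b * np.cos(ga), b * b,              b * c * np.cos(al)],
        [a * c * np.cos(be), b * c * np.cos(al), c * c],
    ])

def distance_and_gradient(r1, r2, G):
    """d = sqrt(delta^t G delta) and its derivatives w.r.t. (x1, y1, z1, x2, y2, z2)."""
    delta = np.asarray(r2, dtype=float) - np.asarray(r1, dtype=float)
    d = np.sqrt(delta @ G @ delta)
    grad_r2 = G @ delta / d                       # dd/dr2; dd/dr1 is its opposite
    return d, np.concatenate([-grad_r2, grad_r2])

def restraint_terms(r1, r2, G, d_target, sigma):
    """Terms added to the normal matrix block and to V by one distance restraint."""
    d, grad = distance_and_gradient(r1, r2, G)
    w = 1.0 / sigma ** 2
    return w * np.outer(grad, grad), w * (d_target - d) * grad

G = metric_tensor(10.0, 10.0, 10.0, 90.0, 90.0, 90.0)    # made-up cubic cell
r1 = np.array([0.0, 0.0, 0.0])
r2 = np.array([0.1, 0.0, 0.0])
d, grad = distance_and_gradient(r1, r2, G)
A_restr, V_restr = restraint_terms(r1, r2, G, d_target=1.05, sigma=0.02)
print(d, A_restr[0, 3])
```

The non-zero element between $x_1$ and $x_2$ is one of the augmented off-diagonal terms: the restraint couples parameters of the two atoms.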
Restraints and constraints … and symmetries
Constraints
Can be seen as infinitely strong restraints : no deviation allowed
Implicit: choice of model (e.g. isotropic atoms, scattering factors)
Explicit: relationships between parameters or fixed values for parameters
Constraints as fixed parameters:
Special positions, example in $P2_1/m$:
Fractional coordinates of an atom at Wyckoff position $d$ of $P2_1/m$ cannot be refined: they are fixed at (½ 0 ½)
Restraints and constraints … and symmetries
Constraints
Constraints as parameters relationships:
Example in $P23$:
Fractional coordinates of an atom at Wyckoff position $e$ of $P23$ must be constrained so that:
$x = x'$, $y = x'$, $z = x'$
Only one refined parameter: $x'$
Reparameterization: $Y_{calc}(x, y, z) \rightarrow Y_{calc}(x(x'), y(x'), z(x')) = Y_{calc}(x')$
The normal matrix is reduced in size using chain rule derivatives, as $x(x')$, $y(x')$ and $z(x')$ are known.
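One common way to express this chain rule is through a constraint matrix $\mathbf{C}$ of elements $\partial x_i / \partial x'_k$. The sketch below assumes NumPy and uses a random, made-up design matrix for the three coordinates; `reduce_normal_equations` is a hypothetical name, and this is one possible formulation rather than the way any specific program does it.

```python
import numpy as np

# x = x', y = x', z = x'  ->  dx/dx' = dy/dx' = dz/dx' = 1
C = np.array([[1.0],
              [1.0],
              [1.0]])

def reduce_normal_equations(D, w, residual, C):
    """Normal matrix and vector for the constrained (reduced) parameter set."""
    Dc = D @ C                              # chain rule: dY/dx' = sum_i dY/dx_i * dx_i/dx'
    A = Dc.T @ (w[:, None] * Dc)
    V = Dc.T @ (w * residual)
    return A, V

rng = np.random.default_rng(0)
D = rng.normal(size=(8, 3))                 # made-up derivatives w.r.t. x, y, z
w = np.ones(8)                              # unit weights (assumption)
residual = rng.normal(size=8)               # made-up Y_obs - Y_calc

A_full = D.T @ (w[:, None] * D)             # 3 x 3 unconstrained normal matrix
A_red, V_red = reduce_normal_equations(D, w, residual, C)
print(A_full.shape, A_red.shape)
```

The reduced matrix equals $\mathbf{C}^t\mathbf{A}\mathbf{C}$: here the $3 \times 3$ block collapses to a single element for $x'$.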
Restraints and constraints … and symmetries
Anisotropic thermal displacement parameters for atoms on special position
Site symmetry restriction on the elements of the 𝑈𝑖𝑗 tensor!
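Application of the symmetry operations of the site must leave the representation ellipsoid unchanged
If $\mathbf{R}$ is a rotation matrix of the site symmetry and $\mathbf{U}$ a thermal displacement tensor, then:
$$\mathbf{R}\mathbf{U}\mathbf{R}^t = \mathbf{U}$$
Example for an atom on a 4-fold axis along [001]:
$$\mathbf{U} = \begin{pmatrix}
U_{11} & U_{12} & U_{13} \\
U_{21} & U_{22} & U_{23} \\
U_{31} & U_{32} & U_{33}
\end{pmatrix}
\qquad
\mathbf{R} = \begin{pmatrix}
0 & -1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1
\end{pmatrix}$$
$$\rightarrow\ \mathbf{U} = \begin{pmatrix}
U_{11} & 0 & 0 \\
0 & U_{11} & 0 \\
0 & 0 & U_{33}
\end{pmatrix}$$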
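The restriction can be checked numerically by averaging $\mathbf{R}\mathbf{U}\mathbf{R}^t$ over all operations generated by the 4-fold rotation: the averaged tensor satisfies $\mathbf{R}\mathbf{U}\mathbf{R}^t = \mathbf{U}$ by construction. This is a sketch assuming NumPy, with a made-up starting tensor and hypothetical helper names; refinement programs usually impose the restriction as constraints on the $U_{ij}$ rather than by averaging.

```python
import numpy as np

R4 = np.array([[0.0, -1.0, 0.0],
               [1.0,  0.0, 0.0],
               [0.0,  0.0, 1.0]])        # 4-fold rotation along [001]

def cyclic_group(generator, max_order=12):
    """All powers of a rotation matrix, from the identity up to closure."""
    ops = [np.eye(3)]
    R = generator.copy()
    while not np.allclose(R, np.eye(3)):
        if len(ops) >= max_order:
            raise ValueError("generator does not close into a finite group")
        ops.append(R)
        R = generator @ R
    return ops

def symmetrize_u(U, ops):
    """Average of R U R^t over the site-symmetry operations."""
    return sum(R @ U @ R.T for R in ops) / len(ops)

U_general = np.array([[0.020, 0.004, 0.003],
                      [0.004, 0.030, 0.002],
                      [0.003, 0.002, 0.040]])   # made-up unrestricted tensor
ops = cyclic_group(R4)
U_site = symmetrize_u(U_general, ops)
print(np.round(U_site, 6))
```

The result has $U_{11} = U_{22}$ and no off-diagonal terms, leaving two independent parameters instead of six.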
Restraints and constraints … and symmetries
Symmetries in another crystallographic modelling
When the resolution of the diffraction data allows bonding electrons to be observed: Multipolar Modeling
Extra parameters included in the refinement:
$P_v$ = atomic valence populations
$P_{lm}$ = deformation density populations
Angular dependence $y_{lm}(\theta, \varphi)$: real spherical harmonics
$l = 0$, $m = 0$
$l = 1$, $m = -1, 0, +1$
$l = 2$ …
$y_{lm}(\theta, \varphi)$ are oriented with respect to a local frame centered on the atomic nuclei
[Figure panels: residual electron density of an N-H bond; dipolar ($l = 1$) function + local axis system; dipolar ($l = 1$) function in the molecular plane]
Problems with atoms in special positions:
The local frame must follow the point symmetry of the site
The $y_{lm}(\theta, \varphi)$ functions must respect the point symmetry as well
Symmetries in another crystallographic modelling
Constraints related to site symmetry: "Index picking rules"
Example: atom on a twofold axis, with the local $z$ axis parallel to 2
Multipoles allowed: any $l$, $m$ even
𝑙 = 0, 𝑚 = 0
𝑙 = 1, 𝑚 = −1,0, +1
𝑙 = 2, 𝑚 = −2, −1, 0, 1, 2
Crystal symmetry constrains the crystallographic modeling
Symmetries in another crystallographic modelling
Symmetries In protein crystallography
Proteins:
Quite low-quality crystals, limited diffraction data resolution ($\langle d \rangle \sim 2$ Å)
Made of a large number of atoms (difficult to crystallize)
Only one enantiomer (mirrors and inversion incompatible): 65 remaining space groups!
Assemble in dimers, trimers, tetramers (biological assemblies)
Monomers related by (pseudo) symmetry operations which do not belong to the space group description
Protein crystallographers call them "Non Crystallographic Symmetries" (NCS)
Better to call them "pseudo symmetries": pseudo-translations, pseudo-rotations
Difficulties in space group assignments:
Ca2+ gated potassium channel: unit cell 44.035 63.452 63.477 Å, 90.03 89.99 89.99°, … solved and published in $P1$, then in $P42_12$ (Id = 4HZ3)
True pseudo symmetries:
Nucleotide gated ion channel, PDB: 1q43, space group $I4$, 2 mol / AU
Pseudo 2-fold axis in the $(a, c)$ plane, monomers related by ½ − y, ½ − x, 0.31 − z
rms ~0.2 Å and $R_{sym}(I422) = 40\%$, hence $I4$
Symmetries In protein crystallography
Least Squares refinement problems related to symmetries
Correlations between parameters
Parameters are correlated when they have the same effect on the model
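$\Rightarrow \dfrac{\partial Y_{calc}}{\partial x_i} = \dfrac{\partial Y_{calc}}{\partial x_j}$ if $x_i$ and $x_j$ are (perfectly) correlated
The block of the normal matrix for these two parameters is:
$$\mathbf{A} = \begin{pmatrix}
\sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_i}\frac{\partial (kY_{j,calc})}{\partial x_i} & \cdots & \sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_i}\frac{\partial (kY_{j,calc})}{\partial x_j} \\
\vdots & \ddots & \vdots \\
\sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_j}\frac{\partial (kY_{j,calc})}{\partial x_i} & \cdots & \sum_{j=1}^{m_{hkl}} w_j \frac{\partial (kY_{j,calc})}{\partial x_j}\frac{\partial (kY_{j,calc})}{\partial x_j}
\end{pmatrix}$$
If two columns are equal, $\det \mathbf{A} = 0$ and the matrix cannot be inverted
Singular Normal Matrix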
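A quick numerical illustration of both situations, exactly equal and almost equal derivatives, assuming NumPy and random made-up derivative columns (unit weights):

```python
import numpy as np

rng = np.random.default_rng(0)
col = rng.normal(size=6)                 # dY/dx_i for six made-up observations
other = rng.normal(size=6)               # derivatives for an unrelated parameter

# Perfect correlation: dY/dx_i == dY/dx_j for every observation
D_singular = np.column_stack([col, col, other])
A_singular = D_singular.T @ D_singular

# Strong correlation: derivatives differ only by a tiny amount
D_near = np.column_stack([col, col + 1e-6 * rng.normal(size=6), other])
A_near = D_near.T @ D_near

rank_singular = np.linalg.matrix_rank(A_singular)   # lower than the 3 parameters
cond_near = np.linalg.cond(A_near)                  # very large: ill-conditioned
print(rank_singular, cond_near)
```

In the first case the normal matrix has rank 2 for 3 parameters and cannot be inverted; in the second it can, but the huge condition number means the inverse, and therefore the shifts and uncertainties, are unreliable.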
Least Squares refinement problems related to symmetries
When parameters have similar effects
Large correlations due to $\dfrac{\partial Y_{calc}}{\partial x_i} \approx \dfrac{\partial Y_{calc}}{\partial x_j}$: ill-conditioned matrix
$\det \mathbf{A} \approx 0$ and $\mathbf{A}^{-1}$ is poorly reliable, which leads to large (huge) shifts and uncertainties
Well-known causes of ill-conditioned refinements:
- Correlations between atomic occupancies and thermal displacement parameters
- Overlapping of atomic sites due to disorder
- Missing symmetry constraint (for coordinates or thermal displacement)
- Refinement in a space group of too low symmetry
- Correlations between parameters of an atom in oblique crystal systems
Least Squares refinement problems related to symmetries
Correlations between coordinates of an atom in oblique crystal systems
[Figure: oblique axes $x$ and $y$ separated by the angle $\gamma$, with the shifts $\Delta x$ and $\Delta y$ leading to the true atomic position]
Searching for the true position of an atom: if $\Delta y$ is the coordinate shift along $y$, the "best" $\Delta x$ shift is $\Delta x = -\Delta y \cos(\gamma)$
If $\gamma \neq 90°$, $x$ and $y$ are co-varying and are correlated
Example in Feast et al., Acta C, 2009: refinement of methylene aziridine
$P2_1/n$: a = 13.8593 (3), b = 10.5242 (2), c = 14.8044 (4) Å, $\beta$ = 92.0014 (7)°
$P2_1/c$: a = 13.8594 (2), b = 10.5243 (2), c = 19.9230 (3) Å, $\beta$ = 132.0439 (7)°
[Figure: cell transform and coordinates transform between the two settings]
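The effect of the cell angle can be estimated with a rough model. The sketch below assumes NumPy and, as a simplifying assumption not taken from the slides, that the positional information on an atom is isotropic in Cartesian space, so that the $(x, z)$ block of its normal matrix is proportional to the corresponding block of the metric tensor. Under that assumption the correlation between $x$ and $z$ is $-\cos\beta$. The cell parameters are those quoted above for the two settings.

```python
import numpy as np

def xz_correlation(a, c, beta_deg):
    """Correlation between x and z if the normal matrix block is proportional to G."""
    cb = np.cos(np.radians(beta_deg))
    A = np.array([[a * a,      a * c * cb],
                  [a * c * cb, c * c]])          # assumed (x, z) block, up to a factor
    M = np.linalg.inv(A)                         # variance-covariance block
    return M[0, 1] / np.sqrt(M[0, 0] * M[1, 1])

corr_p21n = xz_correlation(13.8593, 14.8044, 92.0014)    # nearly orthogonal setting
corr_p21c = xz_correlation(13.8594, 19.9230, 132.0439)   # strongly oblique setting
print(corr_p21n, corr_p21c)
```

In this simplified picture the nearly orthogonal $P2_1/n$ setting gives an almost uncorrelated pair, while the oblique $P2_1/c$ setting gives a sizeable correlation between the two coordinates of the same atom.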
Least Squares refinement problems related to symmetries
Parois & Lutz, Acta A, 2011
Least Squares refinement problems related to symmetries
Ill-conditioned refinement caused by missing inversion (SG of too low symmetry)
Significant bibliography about space group corrections (CSD deposited)
Clemente & Marzotto, Acta B, 2002 ; Henling & Marsh, Acta C 2014 Marsh, Acta B 2004, 2005abc, 2007, 2009 …
In 2005, about 10% of CSD deposited structures were inappropriately assigned to P1
Description in more appropriate space group leads to better geometry
Coordinates shifts needed for added symmetries ~0.05Å max
Least Squares refinement problems related to symmetries
Examples of structures deposited in an inappropriate space group
Missing $\bar{1}$: CSD Id = BAHHUP, or CSD Id = QEQRUZ
Supercell: CSD Id = GUHTAF