AUTOMATED CALCULATION OF REACTION KINETICS
VIA TRANSITION STATE THEORY
A Dissertation Presented
By
Pierre Lennox Bhoorasingh
to
The Department of Chemical Engineering
In partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the field of
Chemical Engineering
Northeastern University
Boston, Massachusetts
August 2016
Dedication
I dedicate this thesis to AMT.
Acknowledgments
I have been able to complete this thesis work due to the help I have received from those
who have found time in their busy schedules. This is my attempt to express my profound
gratitude to those who have helped me during my thesis work.
Thanks to my advisor, Prof. Richard West, for the guidance over the 5 years. You also
gave me the freedom to explore and that has only enhanced my thesis work, and it has been
a pleasure to be your first graduate student.
I would also like to thank my thesis committee members, Dr. David Budil, Dr. Hicham
Fenniri, Dr. C. Franklin Goldsmith, and Dr. Reza Sheikhi. They made the time to have
engaging discussions that impacted this thesis, and were also very generous with their
professional advice.
Thanks to the Computational Modeling group. Fariba Seyedzadeh Khanshan and Be-
linda Slakman, you were always helpful in our discussions and made the laboratory a fun
working environment. I’d also like to thank Jason Cain for being a super helpful under-
graduate who assumed nothing in pursuit of the right approach. I want to also thank Sean
Troiano, Victor Lambert, Jacob Barlow, and Elliot Nash for their contributions to laboratory
discussions.
Thanks to past and present RMG developers, who do a great job working on a complex
open-source software. I would like to thank Joshua Allen and Amrit Jalan for their scientific
perspectives in the early stages of this thesis work. I’d also like to thank Shamel Merchant
and Enoch Dames for their help with CanTherm.
I would like to thank Greg Landrum and the RDKit developers, for this thesis would be
much more difficult without their work.
Thanks to Pat Rowe, Jessica Smith-Japhet, and Brandon Mennillo for their assistance
over the years. I would like to express my gratitude to the Research Computing team at
Northeastern University, and in particular Dr. Nilay Roy, for their work on the Discovery
cluster. I would also like to thank Bill Sheehan for his help with the now retired Venture
and Opportunity clusters.
I’d like to thank the Combustion Energy Frontier Research Center, especially Prof.
Chung Law and Lilian Tsang, for organizing and hosting the Combustion Summer School,
which I had the opportunity to attend twice (2012 and 2014).
I must thank Prof. David Beck and the organizers of the 2015 Data Science Work-
shop for hosting an enjoyable and intense discussion group on the role of data science in
academia. I would also like to thank Michael Li and the team at the Data Incubator for
running an informative and rigorous data science bootcamp that I had the opportunity to
attend in the Spring of 2016.
Thank you to my classmates, Avinash, Dan, Dinara, Emily, and Nil. Your support has
been important through the years. I want to also thank the friends I made in the Chemical
Engineering Department.
Finally, thanks to my family, for their unending support as I take another step in life.
Abstract
Modeling complex chemical systems often requires knowledge of the elementary reac-
tions involved, such as in combustion kinetics where models routinely contain thousands of
reactions. Automated tools have been developed to construct such models, as manual meth-
ods have proven to be tedious and susceptible to human error. A large number of kinetic
parameters are required to complete the construction of detailed kinetic models, but the
available data are quite sparse. As a result, estimation methods use existing data to predict
the many unknown kinetics, but the accuracy of these kinetics suffers due to insufficient
data to make good kinetic predictions.
Theoretical calculations can be used to improve the kinetics in models, but these cal-
culations require a transition state geometry estimate that is typically provided manually.
Manual geometry estimation is slow and infeasible for automated construction of reaction
mechanisms, so this thesis describes an automated method to estimate transition state
geometries and calculate reaction kinetics. The three-dimensional chemical structure for
unreactive atoms at the transition state can be predicted with existing computational methods,
but the geometry of the reaction center is unknown. The unknown section of the transition
state must be predicted to create the transition state geometry.
The reaction center distances are predicted using data from analogous transition state
structures, and the transition state geometry prediction is constructed using an existing tech-
nique known as distance geometry. The transition state geometry prediction is optimized
using a commercially available computational chemistry software package in order to cal-
culate molecular properties of the transition state, such as bond vibrational frequencies.
The molecular properties of reactants and products are also required to calculate reaction
kinetics, and these are determined using an existing automated method. Molecular prop-
erties of the reactants, products, and transition state are used to calculate the kinetics of a
reaction via classical transition state theory.
The work in this thesis was initially developed for hydrogen abstraction reactions, and
has been extended to β-scission and intra-hydrogen migration reactions. The automatically
determined kinetics and state-of-the-art estimation methods were compared to high accu-
racy theoretical calculations, and the automated calculations were shown to outperform
the estimation methods. This enables improved mechanism generation, where high-fidelity
complex chemical models can be constructed with minimal human intervention.
Contents
1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Automatic mechanism generation . . . . . . . . . . . . . . . . . . 2
1.1.2 Kinetic and thermodynamic parameter estimation . . . . . . . . . . 4
1.1.3 Theoretical rate calculation . . . . . . . . . . . . . . . . . . . . . . 5
1.1.4 Statistical mechanics and quantum chemistry . . . . . . . . . . . . 8
1.1.5 Stable geometry and transition state searches . . . . . . . . . . . . 10
1.1.6 Automated transition state searches . . . . . . . . . . . . . . . . . 11
1.1.7 Kinetic Programs . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 Thesis overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2 Using double-ended methods to automate transition state searches 15
2.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Generating 3-dimensional geometries for double-ended search meth-
ods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.2 Locating transition states with the automatic double-ended search . 22
2.2.3 Electronic Structure calculations . . . . . . . . . . . . . . . . . . . 23
2.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.1 Semi-empirical methods are insufficient for transition state searches 26
2.5.2 Consider more robust double-ended search methods . . . . . . . . 27
3 Automatic transition state geometry estimation using group contributions 28
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2.1 Geometry estimation and optimization . . . . . . . . . . . . . . . . 31
3.2.2 Method evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.1 Transition state geometries were successfully estimated using the
distance estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.2 Increasing training data improves the group value predictions . . . 38
3.3.3 Geometry estimation needs improvement to make best use of pre-
dicted values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.3.4 Algorithm optimization . . . . . . . . . . . . . . . . . . . . . . . . 43
3.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.5 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5.1 Conformer recognition . . . . . . . . . . . . . . . . . . . . . . . . 44
4 Improving the group contribution transition state search method 45
4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.2.1 Modifying the transition state geometry prediction . . . . . . . . . 46
4.2.2 Modifying the transition state optimization sequence . . . . . . . . 49
4.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.3.1 Tree structure and data diversity affect prediction accuracy . . . . . 49
4.3.2 Manipulating distance limits and force constants can improve UFF
optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3.3 Replacing the UFF optimization with more robust calculations may
improve transition state prediction . . . . . . . . . . . . . . . . . . 54
4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.5 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.5.1 Efficient calculation of the molecular group contributions . . . . . . 57
4.5.2 UFF optimization with constrained optimization . . . . . . . . . . 58
5 Method extension to new reaction families and automated kinetic parameter
calculation 59
5.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2.1 Computational chemistry . . . . . . . . . . . . . . . . . . . . . . . 61
5.2.2 Automated geometry searches . . . . . . . . . . . . . . . . . . . . 61
5.2.3 Kinetic calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.2.4 Comparison of Automated TST calculations and Rate Rules . . . . 63
5.2.5 Comparison to benchmark calculations . . . . . . . . . . . . . . . 64
5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.3.1 Comparison of automated TST calculations and rate rules . . . . . 65
5.3.2 Comparing predictions to benchmark calculations . . . . . . . . . . 65
5.3.3 Sources of error in the automated calculations . . . . . . . . . . . . 66
5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.6 Recommendations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.6.1 Improve symmetry number calculation . . . . . . . . . . . . . . . 73
5.6.2 Automate hindered rotor calculations . . . . . . . . . . . . . . . . 73
6 Summary 75
Appendices 90
Appendix A Double-ended method 91
Appendix B Group contribution method 100
B.1 Group Training Regression Details . . . . . . . . . . . . . . . . . . . . . . 100
B.2 Predicted vs Optimized distances . . . . . . . . . . . . . . . . . . . . . . . 102
B.3 Group Naming Convention . . . . . . . . . . . . . . . . . . . . . . . . . . 103
B.4 Group values for original tree . . . . . . . . . . . . . . . . . . . . . . . . . 104
B.5 Group values for new tree . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
B.6 List of test reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
B.7 Effect of increasing force constants and reducing the difference between
upper and lower limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Appendix C Kinetic calculations 150
C.1 Molecular group trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
C.1.1 Hydrogen Abstraction . . . . . . . . . . . . . . . . . . . . . . . . 150
C.1.2 Intra-hydrogen migration . . . . . . . . . . . . . . . . . . . . . . . 160
C.1.3 β-scission . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
List of Figures
1.1 Beta-scission reaction template from RMG. . . . . . . . . . . . . . . . . . 3
1.2 Potential energy profile for a typical reaction. . . . . . . . . . . . . . . . . 6
2.1 The molecular bounds matrix. (A) Bonded atom limits are set by bond
length rules, while connectivity limits non-bonded atoms in the same molecule.
Van der Waals radii set the lower limits for atoms on separate molecules,
while there are no upper limits (set to 1000 Å). (B) By editing these distance
limits, we can position molecules relative to one another, as well as
stretch or shrink bond distances. . . . . . . . . . . . . . . . . . . . . . . . 19
2.2 Potential energy surface showing minimum energy pathway (dotted line)
from reactants (R) to products (P). A double-ended search between 1 and 2
will start closer to the TS (grey circle) than a search between R and P. . . . 20
2.3 Definition of the 3 key distances for editing the transition state geometries.
H represents the abstracted hydrogen, X the atom bonded to the hydrogen,
and Y the radical abstracting the hydrogen. . . . . . . . . . . . . . . . . . 21
2.4 Automatic double-ended enabled transition state search procedure. . . . . . 23
2.5 Points of failure along the automatic double-ended algorithm with SAD-
DLE/PM7 calculations. The arrow widths are in proportion with the num-
ber of associated reactions. 194 succeeded with 128 reactions failing at the
reaction path validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 dXH transition state distances in Angstrom calculated with M06-2X/6-
31+G(d,p). Abstracting radicals (Y) are on the left, and the hydrogen and
the carbon it is abstracted from (XH) are on top. Similar trends were ob-
served for dHY, and dXY. . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1 Manipulating the molecular bounds matrix to create transition state geom-
etry estimates. (A) The matrix generated for a pair of stable species. (B)
Editing the matrix with the group contribution predictions for transition
state distances. (C) Conflicting lower limit distances are corrected, creat-
ing a valid transition state distance bounds matrix. . . . . . . . . . . . . . 34
3.2 The automated transition state search algorithm. . . . . . . . . . . . . . . . 36
3.3 Distances from 907 validated transition states found at B3LYP/6-31+G(d,p)
were compared to predictions derived from molecular group values. The
solid line represents parity with the optimized distances, and the dashed
lines represent the root mean squared error of the estimates from parity.
The predictions improved as the training set used to calculate the group
values was expanded. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.4 422 transition states found in one trial of the algorithm were unsuccessful
in another. Comparing optimized distances against the failed estimation at-
tempt showed: 1. poorly estimated distances that were improved when the
training set was expanded 2. the conversion from prediction to geometry
estimate introduced additional error. . . . . . . . . . . . . . . . . . . . . . 41
3.5 Probability of a failed TS search as a function of RMS error in reactive
distances of starting geometry. For each point the vertical bar shows the
Clopper–Pearson [122] 95% confidence interval of the lower bound and
the horizontal bar shows the range of RMS errors used to calculate it. . . . 42
4.1 The RMS error for the distance estimates compared to the optimized tran-
sition state distances. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 Probability distribution for the root-mean-squared error of the reaction cen-
ter distances when training the groups with 44 transition state distances, for
the Original and New tree structures. . . . . . . . . . . . . . . . . . . . . . 52
4.3 Decreasing the distance and increasing the force constants for the reaction
center each minimized error in the dXH distance introduced during the con-
struction of the 3-dimensional transition state estimate. The error reduction
is additive as seen when combining the modifications. . . . . . . . . . . . . 53
4.4 Mean and standard deviation of the absolute error in dXH distances from
the final optimized transition state at each stage of the transition state pre-
diction process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.5 Mean and standard deviation of the absolute error in dHY distances from
the final optimized transition state at each stage of the transition state pre-
diction process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.6 Mean and standard deviation of the absolute error in dXY distances from
the final optimized transition state at each stage of the transition state pre-
diction process. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.1 Automated transition state search algorithm as described in ref. 136. The
steps with bold borders, adapted from the AARON software [81], are devi-
ations from the original algorithm. . . . . . . . . . . . . . . . . . . . . . . 62
5.2 The automated kinetic calculations involve an automated transition state
search (Figure 5.1), automated search for reactant and product geometries
[8], and automatically calculating kinetics using CanTherm [84]. . . . . . . 63
5.3 Rate rule estimates (y-axis) plotted against automated algorithm TST cal-
culations (x-axis) at 1000 K. . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.4 Comparison of kinetic estimates for hydrogen abstraction reactions. . . . . 67
5.5 Comparison of kinetic estimates for intramolecular hydrogen migration re-
actions R3 (a) and R4 (b). . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.6 Comparison of kinetic estimates for β-scission reactions R5 (a) and R6 (b). 69
5.7 Magnitude of the sources of error in each automated algorithm compared
to its respective benchmark calculation. The summation of all errors is
represented by the algorithm calculation. . . . . . . . . . . . . . . . . . . . 70
B.1 Comparison of distances from validated transition states to predictions from
molecular group values calculated from different sized data sets. The solid
line represents parity, the dashed lines represent the root mean squared er-
ror of the estimates from parity. The predictions derived from the original
and new trees are represented by the black and red circles respectively. . . . 102
B.2 Decreasing the distance and increasing the force constants for the reaction
center each minimized error in the dHY distance introduced during the con-
struction of the 3-dimensional transition state estimate. The error reduction
is additive as seen when combining the modifications. . . . . . . . . . . . . 148
B.3 Decreasing the distance and increasing the force constants for the reaction
center each minimized error in the dXY distance introduced during the con-
struction of the 3-dimensional transition state estimate. The error reduction
is additive as seen when combining the modifications. . . . . . . . . . . . . 149
List of Tables
3.1 Part of the hierarchical molecular group tree for transition state distances
trained using 1071 transition state distances calculated using B3LYP/6-
31+G(d,p). The full tree is provided in the appendices. . . . . . . . . . . . 35
3.2 Training set information. As the training set was expanded, the RMS error
from the validated transition state distances decreases. . . . . . . . . . . . . 38
4.1 Training set information. As the training set was expanded, the RMS er-
ror from the validated transition state distances decreases. The new tree
structure performed better when training data was sparse. . . . . . . . . . . 51
5.1 Number of reactions for each family contained in the combustion model,
and success of the AutoTST algorithm. . . . . . . . . . . . . . . . . . . . . 65
5.2 Reactions compared to benchmark calculations. . . . . . . . . . . . . . . . 65
5.3 Difference in the activation energy (kJ/mol) compared to the benchmark
calculations. Kinetics fitted to Arrhenius form between 600K and 2000K. . 68
5.4 Difference in the log10 of the A factor compared to the benchmark calcula-
tions. Kinetics fitted to Arrhenius form between 600K and 2000K. R3 and
R4 are in [s⁻¹] and the rest are in [cm³/(mol s)] . . . . . . . . . . . . . . . 69
A.1 Transition states determined at M06-2X/6-31+G(d,p) showed trends in the
distances (in Angstroms) with changes to molecular groups. The distances
dXH, dHY, and dXY are defined in Figure 2.3. . . . . . . . . . . . . . . . 91
A.2 334 hydrogen abstraction reactions used to test automated transition state
algorithms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
B.1 Original tree structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
B.2 Modified tree structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
B.3 1393 hydrogen abstraction reactions used to test the group estimates cou-
pled with the automated transition state algorithm. The reactants and prod-
ucts are provided as SMILES strings. Transition states that were found and
validated are available in CML format. . . . . . . . . . . . . . . . . . . . . 118
C.1 HAbs tree structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
C.2 intraH tree structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
C.3 β-scission tree structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
1. Introduction
Detailed kinetic modeling allows researchers to gain a deeper understanding of the
chemistry contributing to phenomena observed in complex chemical systems [1]. Such
complex systems include the combustion of novel renewable fuels [2], and the formation
of silicon nanoparticles [3]. Detailed kinetic, or microkinetic, models aim to capture all the
relevant chemistry for a system of interest, including thermodynamic parameters for each
species, the elementary reactions involving these species, and the kinetic parameters for
each reaction contained in the model. Adding these necessary details often leads to models
that are quite large, such as the combustion model developed for 2-methylalkanes that
contained approximately 7200 species and 31400 reactions [4]. Lu and Law also showed
that models have only increased in size over time, as new important reaction pathways are
discovered and included in models [5]. Given the current size and complexity of detailed
models, continuing to construct them by hand is inefficient and error-prone.
Automated software tools, such as Reaction Mechanism Generator (RMG) [6], have been
developed to address the difficulties of large model construction, but the models require
many thousands of parameters, most of which are unknown and must be estimated. Esti-
mation methods are usually derived from group additivity and are attractive due to being
computationally efficient, but the estimates can be quite inaccurate. Inaccurate estimates
degrade not only the accuracy of the model but also the selection of reactions considered
during its construction, because RMG builds a tractable model with a rate-based approach
in which the reaction pathways with the highest flux are explored and low-flux pathways
are ignored [7]. An alternative method to determine thermodynamic parameters was developed
using quantum chemical calculations, and this method has been implemented in RMG
to good effect [8]. An alternative approach is also desired for kinetic parameters to address
the widespread uncertainties of models.
Automating kinetic calculations requires the automation of several steps. The reactant
species geometries must be estimated automatically, but this has been previously achieved
in the automated thermodynamic parameter calculator. A transition state geometry esti-
mate is also required, so an estimation method must be developed since existing conformer
generation tools are not capable of predicting transition states. Transition state validation
methods rely on visual inspection, so an alternate automated approach needs to be de-
veloped. Finally, the automated transition state search method must be integrated with a
kinetic calculation software to provide the kinetic parameters via transition state theory
calculations.
1.1 Background
1.1.1 Automatic mechanism generation
One of the earliest reaction network generators was developed in 1979 by Ugi et al. to
take advantage of “the inherent capabilities of modern computers” in determining chemical
synthesis pathways [9]. Molecules were represented by bond-electron matrices, which
stored bond and free valence electron information for each atom in a molecule, and a trans-
formation matrix would represent a reaction type that would convert reactant molecule
matrices to product matrices. Iteratively applying the transformation matrices would allow
the algorithm to postulate all possible reactions.
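The matrix bookkeeping described above can be illustrated with a minimal sketch. It assumes the usual bond-electron convention (diagonal entries hold free valence electrons, off-diagonal entries hold bond orders) and that a reaction is applied by element-wise addition of the transformation matrix. The function name `apply_reaction` and the hydrogen recombination example are illustrative choices, not taken from ref. [9].

```python
# Sketch of the bond-electron (BE) matrix idea.
# Assumed convention: diagonal = free valence electrons on each atom,
# off-diagonal = bond order between a pair of atoms.

def apply_reaction(be_matrix, reaction_matrix):
    """Add a transformation matrix to a reactant BE matrix, element by element."""
    return [
        [b + r for b, r in zip(be_row, r_row)]
        for be_row, r_row in zip(be_matrix, reaction_matrix)
    ]

# Two separate hydrogen atoms: one unpaired electron each, no bond between them.
two_h_atoms = [[1, 0],
               [0, 1]]

# Hypothetical transformation for radical recombination: each atom gives up
# one free electron and a single bond forms between the pair.
recombination = [[-1, 1],
                 [1, -1]]

h2 = apply_reaction(two_h_atoms, recombination)
```

Applying every transformation matrix to every compatible set of atoms, and repeating on the products, is the iteration that lets such an algorithm enumerate candidate reactions.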
More modern and efficient mechanism generators have been developed [6, 10–17], and
these benefit from the evolution of new methods and advances in computational power.
These modern mechanism generators represent molecules as chemical graphs with atoms
represented as nodes and bonds as edges. Graph theory [18] can efficiently compare these
chemical graphs [19], to determine if a newly made molecule already exists in the model
due to an alternate reaction pathway. Reaction templates are used to convert chemical
graphs from one form to another, and each type of reaction (a reaction family) has its own
reaction template [18].
Figure 1.1: Beta-scission reaction template from RMG.
Mechanism generators restrict model size in order to exclude reactions of little impor-
tance at the conditions of interest, and several model size restriction strategies exist. The
REACTION software allows the user to specify which reaction families to apply for each
stage of the model generation [13]. MAMOX generates a detailed primary mechanism
which is then lumped to reduce the model size [12]. These steps are iterated to produce
a series of highly lumped mechanisms [1]. The approach in EXGAS uses a base mecha-
nism for small molecule chemistry, with the rest of the mechanism consisting of a detailed
primary mechanism and a lumped secondary mechanism [14]. Reaction Mechanism Gen-
erator (RMG) restricts model size using the rate-based screening algorithm of Susnow et al.
[7]. A species is only included in the model if the flux to the species surpasses a cutoff
criterion.
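The flux cutoff can be sketched in a few lines. This is a deliberately simplified illustration, not RMG's implementation: the function name `select_species`, the species labels, and the flux values are all hypothetical, and the cutoff is treated here as a fixed number rather than being derived from the state of the growing model.

```python
def select_species(edge_fluxes, cutoff):
    """Return the candidate species whose flux exceeds the cutoff, sorted by name."""
    return sorted(name for name, flux in edge_fluxes.items() if flux > cutoff)

# Hypothetical fluxes (arbitrary units) to three candidate species.
edge_fluxes = {"species_A": 5.0e-3, "species_B": 2.0e-7, "species_C": 1.0e-4}

# Only candidates with a flux above the cutoff would be added to the model.
accepted = select_species(edge_fluxes, cutoff=1.0e-5)
```

Because the fluxes depend on the estimated rate coefficients, an inaccurate kinetic estimate can change which species pass this test, and therefore which pathways are explored at all.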
Other mechanism generators differ in how molecules are represented. Rule Input Network
Generator (RING) allows users to specify constraints on molecules at the input, such
as the maximum number of heavy atoms [16]. GENESYS treats stereo-isomers as unique
species [17], allowing model construction for pharmaceuticals and other systems.
Detailed kinetic models also require thermodynamic and kinetic parameters to com-
pletely describe a system [20]. These parameters are ideally taken from sources where
they were either experimentally measured or theoretically calculated, but well determined
parameters are sparse when compared to the number of parameters required to complete a
model. When a needed parameter is unknown, estimation methods are used to supplement
the available data.
A sensitivity analysis on a microkinetic model will identify the important parameters,
and these should be targeted for improvement if they were estimated. Typically theoreti-
cal approaches are applied, such as the calculation of reaction kinetics via transition state
theory.
1.1.2 Kinetic and thermodynamic parameter estimation
Estimation methods are used to address the shortage of thermodynamic and kinetic
parameters. Thermodynamic parameters are typically estimated using Benson’s group additivity,
where the contributions of the molecular groups contained in a molecule are summed to
provide an estimate of the thermodynamics of the molecule [21]. The values of the molecular
group contributions are determined from molecules with well known thermodynamic
values, and the method has been shown to work for a variety of systems [22–25].
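As a minimal sketch of the summation that group additivity performs, the example below adds up one contribution per group occurrence. The group labels follow the common Benson-style notation, but the numerical values are illustrative placeholders chosen for easy arithmetic, not fitted or published group values, and `group_additivity` is a hypothetical helper name.

```python
from collections import Counter

def group_additivity(groups, group_values):
    """Sum the contribution of every group occurrence in a molecule."""
    counts = Counter(groups)
    return sum(n * group_values[g] for g, n in counts.items())

# Placeholder contributions to an enthalpy of formation (arbitrary units).
# These numbers are NOT real group values; they only illustrate the sum.
group_values = {"C-(C)(H)3": -10.0, "C-(C)2(H)2": -5.0}

# Propane decomposed into two terminal groups and one central group.
propane_groups = ["C-(C)(H)3", "C-(C)2(H)2", "C-(C)(H)3"]
estimate = group_additivity(propane_groups, group_values)
```

The estimate is only as good as the group values and the group definitions, which is why effects that a simple sum cannot capture require supplementary corrections.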
The accuracy of Benson’s method depends on the data used to determine the molecular
group values, and the ability to account for the important factors affecting the thermodynamics
of a molecule. Difficulties in accounting for the effects of radicals and fused rings
on thermodynamic parameters have motivated work on supplementary estimation methods.
The Hydrogen Bond Increment method has improved radical species thermodynamic estimation
by summing the thermodynamics of a closed-shell molecule and the contribution
of removing a H atom to form the radical [26]. The contribution of the loss of a H atom
is calculated using known thermodynamic parameters for other closed-shell molecules and
their associated radicals. Fused-ring thermodynamics have been addressed in RMG by
automating thermodynamic parameter calculations from first principles [8].
Kinetic estimation methods have also been developed that can make predictions in good
agreement with experimental data. The Evans-Polanyi relationship is the classic method for
estimating kinetic parameters, relating the activation energy of a reaction to the enthalpy
of reaction [27]. This method can be automated and is also computationally efficient
[28], but the relationship is not appropriate for some systems, such as the hydrogen
abstraction from polynuclear aromatics by methyl radicals [29]. In recent years, the most
commonly applied kinetic estimation methods rely on transition state theory calculations
on select reactions in order to develop rate rules to be used for similar reactions [30–35].
The considerable time, effort, and expertise required to develop rate rules have motivated
alternative high throughput means to determine chemical kinetic parameters [20].
The Evans-Polanyi relationship
\[ k(T) = A \exp\left( \frac{-(E_0 + \alpha \Delta H^\circ)}{RT} \right) \tag{1.1} \]
is a simple approach to estimating reaction kinetics [27], requiring just the change in en-
thalpy of the specific reaction (∆H◦), and three parameters for the reaction family (A, E0,
α).
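Equation 1.1 is simple enough to evaluate directly. The sketch below assumes SI units (energies in J/mol, temperature in K); the function name `evans_polanyi_rate` and the parameter values in the example call are hypothetical and are not taken from any reaction family in this work.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)

def evans_polanyi_rate(A, E0, alpha, dH, T):
    """Evaluate Equation 1.1: k(T) = A exp(-(E0 + alpha*dH) / (R T)).

    A     : pre-exponential factor (sets the units of k)
    E0    : intrinsic barrier of the reaction family, J/mol
    alpha : dimensionless Evans-Polanyi slope
    dH    : enthalpy of reaction, J/mol
    T     : temperature, K
    """
    activation_energy = E0 + alpha * dH
    return A * math.exp(-activation_energy / (R * T))

# Hypothetical family parameters and reaction enthalpy, for illustration only.
k_1000 = evans_polanyi_rate(A=1.0e13, E0=50.0e3, alpha=0.5, dH=-20.0e3, T=1000.0)
```

With a positive α, a more exothermic reaction in the same family is assigned a lower activation energy and therefore a larger rate coefficient.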
1.1.3 Theoretical rate calculation
The central aim of theoretical kinetics is to understand why reactions are fast or slow,
but we have progressed to understanding the temperature dependence of a reaction, and can
even calculate reaction rates. One of the earliest approaches to determining reaction rates
was collision theory, which started with the basic knowledge that a reaction can only occur
if the reactants collide. Each reactant is treated as a hard sphere that maintains its
shape during a collision, and there is no interaction between spheres until they collide [36].
A reaction occurs when the valence electrons are disturbed, which requires energy. This
energy requirement (an energy barrier) can only be overcome by collisions that have
sufficient energy at the moment of impact.
\[ k = Z e^{-\Delta G / RT} \tag{1.2} \]
The form of the kinetic temperature dependence is captured by collision theory, but it
produced large overpredictions compared to observed reaction rates [37]. One approach to
refine collision theory introduced a steric factor to represent the fraction of collisions that
led to successful reactions (Equation 1.3) [38], but this arbitrary approach still could not
account for the discrepancies found for some reactions.
$$k = Z \rho e^{-\Delta G / RT} \tag{1.3}$$
Figure 1.2: Potential energy profile for a typical reaction. [Axes: energy versus reaction coordinate, and V as a function of rAB and rBC; the reactants, the products, and the transition state X‡ are marked.]
The activated complex theory, more commonly known as transition state theory, as-
sumes a reaction proceeds from reactants to products via an activated complex [39]. If the
change in molecular structure during the reaction (the reaction coordinate) is plotted against
its potential energy, the reactants and products lie in energy minima and the transition state
is located at the highest point along the reaction coordinate (Figure 1.2). The transition
state is unstable with respect to the reaction coordinate, but it lies in an energy minimum
with respect to all other coordinate axes. Transition state theory assumes the reactants and
transition state are in quasi-equilibrium, and the rate limiting step is the decomposition
of the transition state to products. Based on these assumptions, the Eyring equation de-
scribes the relationship between the rate of an elementary reaction and the thermodynamic
properties of the equilibrium between the reactants and transition state.
$$k(T) = \frac{k_B T}{h} \exp\left(-\frac{\Delta G^\ddagger}{RT}\right) \tag{1.4}$$
kB is the Boltzmann constant, h is Planck’s constant, R is the molar gas constant, and
∆G‡ is the change in Gibbs free energy between the reactants and transition state. Equation
1.5 shows the elementary reaction rate expressed in terms of the partition functions of the
reactants and transition state based on the same transition state theory assumptions.
$$k(T) = \frac{k_B T}{h} \frac{Q^\ddagger}{\prod_i^n Q_i} \exp\left(-\frac{\Delta E_0^\ddagger}{k_B T}\right) \tag{1.5}$$
The above equation accounts for all cases where the reactants collide and form an ac-
tivated complex, but does not account for the effects of quantum tunneling. Tunneling is
a phenomenon of quantum mechanics that describes the finite possibility that a particle
will tunnel through a barrier that it cannot overcome in a classical sense, so some reactants
will form products even though they have less energy than the barrier height [40]. In reac-
tion kinetics, this effect is small at high temperatures, but increases in importance at lower
temperatures. This is represented by κ in the classical transition state theory equation.
$$k(T) = \kappa \frac{k_B T}{h} \frac{Q^\ddagger}{\prod_i^n Q_i} \exp\left(-\frac{\Delta E_0^\ddagger}{k_B T}\right) \tag{1.6}$$
Good reaction rate predictions can be made using classical transition state theory, but
further refinement is sometimes necessary. Variational transition state theory is one such
refinement, where the dividing surface between reactants and products is varied so as to
minimize the reaction rate [41].
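For illustration, the Eyring form of the rate expression (Equation 1.4) can be evaluated in a few lines. This is only a sketch: the function name is hypothetical, the optional factor kappa plays the role of the tunneling correction κ and is simply supplied by the caller, and the activation free energy of 80 kJ/mol is an arbitrary example value.

```python
import math

KB = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s
R = 8.314462618      # molar gas constant, J/(mol K)


def eyring_rate(delta_g_act, T, kappa=1.0):
    """Evaluate k(T) = kappa * (kB*T/h) * exp(-dG_act / (R*T)).

    delta_g_act : Gibbs free energy of activation, J/mol
    T           : temperature, K
    kappa       : tunneling correction factor (1.0 means no correction)
    """
    return kappa * (KB * T / H) * math.exp(-delta_g_act / (R * T))


# Arbitrary example barrier of 80 kJ/mol at two temperatures.
k_300 = eyring_rate(80.0e3, 300.0)
k_600 = eyring_rate(80.0e3, 600.0)
```

The prefactor kB*T/h is about 6.25e12 per second at 300 K, so the exponential term controls most of the temperature dependence for barriers of this size.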
A reaction rate determined using classical transition state theory only considers the
lowest energy conformer for the reactants, products, and transition state. This approach
is appropriate since the lowest energy conformer is usually the dominant conformer in the
species population. This breaks down when reacting species have multiple conformers
of similar energy to the lowest energy conformer [42–44], so the multi-structural varia-
tional transition state theory method was developed for such cases [45, 46]. The multi-path
variational transition state theory method was developed to account for complex reacting
molecules with multiple reactants and multiple transition states [47].
1.1.4 Statistical mechanics and quantum chemistry
The kinetics of a reaction can be determined via transition state theory if the total partition
functions of the reacting molecules and the transition state are known. The rigid-rotor
harmonic-oscillator approximation is often used to calculate the total partition function,
Qtot.
$$Q_{tot} = Q_{trans} Q_{rot} Q_{vib} Q_{elec} \tag{1.7}$$
$$Q_{trans} = V \left(\frac{2 \pi M k_B T}{h^2}\right)^{3/2} \tag{1.8}$$
$$Q_{rot} = \frac{\sqrt{\pi}}{\sigma_{external}} \left(\frac{8 \pi^2 I_m k_B T}{h^2}\right)^{3/2}; \quad I_m = (I_x I_y I_z)^{1/3} \tag{1.9}$$
$$Q_{vib} = \prod_i \left(1 - e^{-h v_i / k_B T}\right)^{-1} \tag{1.10}$$
Qtrans, Qrot, Qvib, and Qelec represent the translational, rotational, vibrational, and electronic
partition functions of the molecule. Aside from temperature (T), Boltzmann's constant
(kB), and Planck's constant (h), the total partition function is related to properties
specific to the molecule. The electronic partition function is taken to be the electronic spin
multiplicity of the molecule, as only the lowest electronic energy state is accessible. The
translational partition function is related to the unit volume, V , and the molecular weight
of the molecule, M . The rotational partition function is related to the external symmetry,
σexternal, and the moments of inertia, Ix, Iy, and Iz. The vibrational partition function is
related to the vibrational frequencies of the normal modes of the molecule, vi.
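The rigid-rotor harmonic-oscillator terms described above could be coded along the following lines. This is a sketch under stated assumptions rather than the implementation used in any kinetics package: the function names are hypothetical, the mass is taken per molecule (in atomic mass units), the rotational term is written in the standard form for a nonlinear molecule with principal moments in kg m^2, and vibrational frequencies are supplied as wavenumbers in cm^-1, as is common in quantum chemistry output.

```python
import math

KB = 1.380649e-23          # Boltzmann constant, J/K
H = 6.62607015e-34         # Planck constant, J s
C_CM = 2.99792458e10       # speed of light, cm/s
AMU = 1.66053906660e-27    # atomic mass unit, kg


def q_trans_per_volume(mass_amu, T):
    """Translational partition function per unit volume (m^-3)."""
    mass = mass_amu * AMU
    return (2.0 * math.pi * mass * KB * T / H**2) ** 1.5


def q_rot(Ix, Iy, Iz, sigma, T):
    """Rotational partition function of a nonlinear rigid rotor.

    Ix, Iy, Iz : principal moments of inertia, kg m^2
    sigma      : external symmetry number
    """
    prefactor = math.sqrt(math.pi) / sigma
    thermal = (8.0 * math.pi**2 * KB * T / H**2) ** 1.5
    return prefactor * thermal * math.sqrt(Ix * Iy * Iz)


def q_vib(wavenumbers_cm, T):
    """Harmonic-oscillator partition function, energies measured from the
    vibrational ground state (the form of Equation 1.10)."""
    q = 1.0
    for wavenumber in wavenumbers_cm:
        q /= 1.0 - math.exp(-H * C_CM * wavenumber / (KB * T))
    return q
```

Low-frequency modes dominate Qvib: a 1000 cm^-1 mode contributes a factor of only about 1.008 at 300 K, which is one reason torsional modes often need the hindered-rotor treatment mentioned in the text.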
The rigid-rotor harmonic-oscillator approximation is not always appropriate as it does
not account for anharmonicity, but can be corrected using the 1-D hindered rotor approxi-
mation [48]. Alternatively, the vibrational and rotational partition functions can be treated
as a coupled conformer partition function, as in the case of the multi-structural anharmonic-
ity method [43, 44].
Molecular properties can be theoretically calculated by solving the electronic Schrödinger
equation, HΨ = EΨ, where H is the Hamiltonian, E is the electronic energy, and Ψ is
the wavefunction. For practical reasons, computational chemistry software solves the
Schrödinger equation approximately to determine the molecular properties. Several
approximations can be made, and typically each approximation offers significant savings
in computational time, but this comes at the risk of decreased accuracy. Semi-empirical
methods, such as Parameterized Model number 3 (PM3) [49], were developed using experimental
data to parameterize the Schrödinger equation solution, making these methods
computationally efficient but only accurate for molecules similar to those used to parameterize
the solution. Hartree-Fock (HF) methods assume the wavefunction can be approximated
by a single expression known as a Slater determinant, and this method is often
used to get a fast and good approximation before more robust methods are applied [50].
Møller-Plesset methods, such as MP2 and MP4 [51, 52], use perturbation theory to improve
on HF. Density functional theory uses an approximate electron density functional to
find the solution, instead of solving for the wavefunction. Coupled cluster theory is a further
improvement on solving for the wavefunction that can produce some of the most accurate
calculations, but at a significant computational cost [53]. Basis sets contain basis functions that
describe molecular orbitals, and these are used in conjunction with the electronic structure
methods described above.
These calculations can only be used to determine the molecular properties if the cor-
rect transition state, reactant, and product geometries are known. This requires the use of
algorithms that search the potential energy surface for the lowest energy conformer of each
reactant and product, and other methods that search for the transition state which rests at
the first-order saddle point between the reactants and products.
1.1.5 Stable geometry and transition state searches
Partition function calculation requires molecular properties such as vibrational frequen-
cies and moments of inertia, all of which can be theoretically calculated if the correct
structures are known. Search algorithms have been developed that require a geometry
prediction, after which the method will automatically search for either a local minimum
representing a stable species, or a saddle point for the transition state.
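The distinction between a local minimum and a first-order saddle point can be illustrated on a model two-dimensional surface by counting negative Hessian eigenvalues. The sketch below is illustrative only: the model surface, the function names, and the finite-difference step are all assumptions, and the classifier presumes the point supplied is already a stationary point.

```python
import math


def model_pes(x, y):
    """Hypothetical double-well surface: minima at (+-1, 0), saddle at (0, 0)."""
    return (x * x - 1.0) ** 2 + y * y


def hessian_2d(f, x, y, h=1e-4):
    """Central finite-difference second derivatives (fxx, fxy, fyy)."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h**2)
    return fxx, fxy, fyy


def classify_stationary_point(f, x, y):
    """Classify a stationary point by the number of negative Hessian eigenvalues."""
    fxx, fxy, fyy = hessian_2d(f, x, y)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = 0.5 * (fxx + fyy)
    spread = math.hypot(0.5 * (fxx - fyy), fxy)
    n_negative = sum(1 for eig in (mean - spread, mean + spread) if eig < 0.0)
    labels = {0: "minimum", 1: "first-order saddle point"}
    return labels.get(n_negative, "maximum")


reactant_like = classify_stationary_point(model_pes, 1.0, 0.0)
ts_like = classify_stationary_point(model_pes, 0.0, 0.0)
```

In a real molecule the same test is applied to the full Hessian in mass-weighted coordinates, where a negative eigenvalue appears as an imaginary vibrational frequency.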
Stable species searches determine if the structure lies in a local minimum. The search
method perturbs the supplied geometry so that the energy of the molecule is decreased,
and the search is terminated when any further change in geometry will result in an increase
in energy [54–59]. This method has also been called a surface-walking method since the
geometry is being slowly perturbed toward its final configuration.
Transition state geometry searches use similar surface-walking algorithms, except that
the geometry now lies at a first-order saddle point on the potential energy surface [60, 61].
The location of the transition state means an estimate quite similar to the real geometry is
required for the search to be successful. It is sometimes difficult to determine the transition
state geometry, so double-ended algorithms have been developed that use structures of the
reactants and products to make the transition state structure prediction. A user starts by
manually providing a transition state structure estimate to initiate the surface-walking tran-
sition state search. If successful, a reaction path analysis calculation, such as an intrinsic
reaction coordinate (IRC) calculation, is necessary to confirm the transition state. If unsuc-
cessful, the user would then use a double-ended search algorithm, where the user would
provide the reactant and product structures.
Double-ended search algorithms typically rely on two strategies: interpolation and path
analysis. The earlier linear and quasi-synchronous transit methods compare the provided
reactant and product structures and interpolate their atom positions to create the transition
state prediction [62, 63]. More advanced methods have improved on interpolation methods,
where several geometries, or images, are interpolated between the reactants and products
[64]. Elastic band methods, such as the nudged (NEB) and double-nudged elastic band
(DNEB) methods [65–68], determine the gradients of the images on the potential energy
surface, and determine spring constants between every image and its two neighbors. These
calculations determine how to move each image so that the spring constants are minimized.
String methods attempt to build the minimum energy pathway by partially modifying
the reactant geometry to move it toward the product [69]. The new reactant is partially
optimized to find the minimum energy pathway, then the product is interpolated toward the
new reactant structure. These steps are iterated until the transition state structure can be
postulated [69]. The various string methods differ subtly in their treatment of the string
as it is formed. The freezing string method maintains the partially optimized structures
[70], while the growing string method allows the entire string to be reevaluated during the
calculation [71, 72]. The DNEB and growing string methods are the most widely used
of the more advanced methods, but there is no definitive advantage of using one over the
other [73]. The advanced methods have a higher associated computational cost due to the
increased number of calculations required, but do not require a reaction path analysis to
validate the transition state since the calculation itself discovers the reaction path.
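The simplest interpolation idea described above, generating intermediate images between the reactant and product structures, can be sketched as follows. This is not the linear synchronous transit method itself, which interpolates internuclear distances; the sketch interpolates Cartesian coordinates directly, and the function name and the three-atom coordinates are hypothetical.

```python
def interpolate_geometries(reactant, product, n_images):
    """Return n_images geometries evenly spaced between two end points.

    reactant, product : lists of (x, y, z) tuples with atoms in the same order
    n_images          : number of intermediate images (end points excluded)
    """
    if len(reactant) != len(product):
        raise ValueError("reactant and product must list the same atoms")
    images = []
    for k in range(1, n_images + 1):
        fraction = k / (n_images + 1)
        image = [
            tuple(r + fraction * (p - r) for r, p in zip(atom_r, atom_p))
            for atom_r, atom_p in zip(reactant, product)
        ]
        images.append(image)
    return images


# Hypothetical collinear X-H...Y arrangement (coordinates in angstrom):
# the hydrogen moves from X toward Y while X and Y stay fixed.
reactant = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
product = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
midpoint = interpolate_geometries(reactant, product, 1)[0]
```

The requirement that both lists share one atom ordering is the same consistency requirement that double-ended searches place on their input structures.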
1.1.6 Automated transition state searches
Semi-automated search methods have been developed to locate transition state struc-
tures on a potential energy surface. Surface-walking algorithms require a user to provide a
good transition state estimate to initiate the search for a first order saddle point, the location
of a transition state. The small margin of error required for the transition state estimate led
to the development of double-ended search algorithms, from the linear and quasi-synchronous
transit methods to the more robust elastic band and string methods. These
still require starting structures from the user, namely structures of the reactant and product
species, often in their reactive form. High-throughput kinetic calculation methods require
fully automated approaches to find and validate transition state geometries. A variety of ap-
proaches have been developed, with some building on the semi-automated transition state
approaches.
Maeda and Morokuma used an artificial force to push reacting molecules together, to
probe the potential energy surface around atoms, predicting reactions and finding their tran-
sition states [74, 75]. This artificial force induced reaction (AFIR) method has the potential
to discover new reaction pathways, but also requires many random starting orientations
which leads to a high computational cost.
Zimmerman used the growing string double-ended search method [71] to find possi-
ble transition states [76], and while there is a high computational cost in the transition
state search, the string method negates the need for a transition state validation step. This
method has been extended to the construction of detailed mechanisms for systems contain-
ing approximately 100 reactions, where the user limits path exploration with restrictions
such as barrier height limits [77]. A similar method was also developed using the freezing
string double-ended method [78]. The approach is limited by access to software with the
more advanced string methods. The method has been further developed to create a single-
ended transition state search, which uses driving coordinates from reactants to find several
products, from which the transition states can be found using the growing string method
[79].
A rule-based approach, KinBot, was developed by Zádor and Najm that starts with a reactant
structure and directs atoms toward a product, using gradient and energy calculations at
each step to determine the location of the transition state [80]. The method can reliably
determine transition states if the reaction type is known but is not computationally efficient
due to the number of gradient and energy calculations. It is best suited to smaller reaction
systems, such as the exploration of a pressure dependent reaction network.
The Automated Alkylation Reaction Optimizer for N-oxides (AARON) code was de-
veloped to automate the screening of potential organocatalysts [81]. A catalyst structure is
provided by the user and mapped onto a parent catalyst structure for which the transition
state geometry is already known, then a series of partially constrained semi-empirical and
DFT optimization steps allow the new transition state to be found. This approach is limited
to catalyst systems and is not fully automated as it requires the user to provide all catalyst
structures and a starting known transition state geometry.
The methods described above apply different strategies to explore the potential energy
surface, and have been successful at finding transition states for tested applications. AFIR,
KinBot, and the double-ended enabled approach of Zimmerman are fully automated but
their use is restricted to smaller systems due to the computational cost. They are also better
suited to finding various transition states for a given set of reactants. These methods are not
ideal for automated mechanism generation, where a method is required to efficiently find
transition states for various reactions belonging to the same reaction family.
1.1.7 Kinetic programs
Automated methods to determine reactant, product, and transition state structures have
been previously discussed. These structures can be used to theoretically calculate molec-
ular properties, such as bond vibrational frequencies and molecular moments of inertia.
Reaction kinetics can be calculated via transition state theory using the calculated molecular
properties of the reactants and transition state.
Kinetic programs have been developed to automatically apply transition state theory if
provided the relevant molecular properties. They accept output files, containing the neces-
sary molecular properties, from various quantum chemistry packages.
The POLYRATE kinetic program was developed to apply variational transition state
theory and semi-classical tunneling methods [82]. Other kinetic programs have been de-
veloped to be more flexible, employing classical transition state theory and allowing the
user to selectively apply corrections, such as the 1-D hindered rotor approximation. The programs
may differ in the application of features, such as in the way corrections are applied.
For example, Variflex [83] and CanTherm [84] calculate the reduced moment of inertia
based on the axis of rotation and the identity of all the atoms on each side of the rotated
bond for the most stable species. MultiWell [85] treats the moment of inertia as a function
of the dihedral angle for an internal rotation, which is fitted with several cosine terms. Despite
the differences, the reaction rates from these kinetic programs have been found to be in
close agreement for reactions important in the combustion of alcohols [86].
1.2 Thesis overview
This thesis describes methods to automate kinetic calculations by applying transition
state theory. The methods have been developed for the purposes of improving the kinetics
used in the construction of detailed chemical mechanisms, but the algorithm is not limited
to this application.
An algorithm to calculate reaction kinetics needs to find and validate stable species and
transition state structures. Automated methods exist to determine stable species structures
and have been discussed above. Chapter 2 describes efforts to use double-ended search algorithms
to fully automate the search for transition state structures. Chapter 3 details a group-additive
method that bypasses double-ended search methods to predict transition state geometries,
developed based on insights gleaned in Chapter 2. Chapter 4 discusses optimization of
the group-additive transition state prediction method. Chapter 5 describes the integration
of the automated transition state geometry search method with other software packages to
calculate reaction kinetics, and the comparison of those kinetics with the state-of-the-art
prediction methods.
2. Using double-ended methods to automate transition state searches
2.1 Background
Industrial and environmental applications, such as combustion for energy production,
require detailed kinetic models to sufficiently describe the reaction mechanisms of inter-
est. As these models can contain thousands of reactions [87], manual generation of these
mechanisms is an error-prone and laborious process. Automatic network generators, such
as Reaction Mechanism Generator (RMG) [88], represent molecules as chemical graphs
[18] to apply reaction rules to find all possible reaction pathways. As new species are
added to the model, the number of possible pathways grows exponentially [5]. RMG uses
a rate-based approach to screen reactions in order to build a tractable model, running a
series of simulations as it builds the model and exploring the reaction pathways that have
the highest flux [89]. This rate-based sorting places further importance on the accuracy of
thermodynamic and kinetic parameter estimates: an important pathway could be omitted
from a model if its rate is estimated too inaccurately. Experimentally determined or theo-
retically calculated parameters are preferred for their accuracy, but too few are known to
describe an entire model. Unknown kinetic parameters must be estimated, typically using
a functional-group database trained from other reactions of the same family [31, 90].
The group-based rate estimates can often be inaccurate, especially when derived us-
ing insufficient data. Theoretical calculations can improve the accuracy of these reaction
rates when experimental measurements are not feasible. Theoretical calculations require
properties of the reactants, products, and transition state [91], which can be determined
using ab initio calculations if each 3-dimensional structure is known. Efficient algorithms
have been developed to find the geometries of the reactant and product species [55, 57],
and these are used in the automated calculation of thermodynamic parameters for use in
mechanism generation [8]. Transition state geometries can be found using surface-walking
algorithms [60, 61], but require very good initial guess structures to be successful. Estimat-
ing the geometry becomes increasingly difficult for more complex chemical systems, and
an inaccurate estimate can lead to either the wrong saddle point or no saddle point being
located.
Double-ended search algorithms have been developed in an attempt to simplify the tran-
sition state search. These methods use geometries of the reactant and product structures as
inputs to create a transition state (TS) estimate. A common difficulty in manually initiating these algorithms
is the requirement to consistently order the atoms on both reactants and products. Also,
the reactants and products need to be positioned such that active atoms in the reaction
are approaching their transition points (where bonds are formed or broken). Interpolation
methods were first developed where an intermediate guess structure is produced from the
provided reactant and product geometries [62, 63, 92]. These methods are simple to implement
but are not reliably successful in determining the transition state as the potential energy surface (PES) is often
complex. Path optimization strategies improve on interpolation methods to determine the
reaction path [66, 68, 70, 72, 93]. Several strategies have been developed, but these meth-
ods are not as simple to implement as they contain parameters that need to be optimized for
efficient use [93]. The various strategies adopted have improved on simple interpolation
methods, but no definite advantage has yet been found for any strategy [73].
Other methods have been developed to locate transition states by exploring the potential
energy surface (PES) starting from the reactant valley. PES scanning perturbs the reactant
geometry, calculating points along the PES [94]. Isopotential searching methods explore
the PES at a given energy looking for exits (transition state) from the reactant valley [95].
Artificial Force Induced Reaction (AFIR) uses a bias potential approach, modifying the
PES to push reacting molecules together when performing a geometry optimization [74].
These methods are useful when searching for previously unknown pathways, but require a
large number of calculations and sometimes many repetitions. This makes them unsuited
for automatic mechanism generation where the reaction pathways are known, and thou-
sands of reactions must be evaluated.
This chapter describes an algorithm that automates the prediction of the reactant and
product geometries required for double-ended search algorithms. Reactions from RMG
were used to test the algorithm, and the chemical graphs of the reactants and products
from RMG were converted to 3-dimensional structures using distance geometry [96]. The
geometries were used to start quasi-synchronous transit [62, 63] and SADDLE [92] double-
ended searches. These methods were chosen for their ease of accessibility and as they are
simple methods to implement. Semi-empirical electronic structure methods were used to
conduct the double-ended transition state searches in an effort to reduce computational
cost, and the SADDLE method proved more successful. Transition states found with the
SADDLE method were then used for more robust transition state searches using density
functional theory. The algorithm has been tested on hydrogen abstraction reactions from
RMG, with more than 50% of the transition states found.
2.2 Methods
2.2.1 Generating 3-dimensional geometries for double-ended search methods
Molecules in RMG are represented as 2-dimensional graphs. These can be converted
into 3-dimensional structures using distance geometry via RDKit [8, 97]. RDKit creates
stable 3-dimensional structures, but double-ended search methods require reacting atoms
to be positioned within range of each other, where their bonds are broken or formed. This
requires modification to the automated geometry prediction implemented in RMG [8].
Distance geometry
The open-source cheminformatics toolkit RDKit [97] was chosen for its speed and
accuracy as a conformer generation tool [98]. The distance geometry approach used in
RDKit is described by Blaney and Dixon [96]. This approach uses a molecular bounds
matrix containing upper and lower bounds on distances separating each atom pair.
RDKit predicts molecular geometries using distance geometry. A molecular bounds
matrix is created when a molecule is passed to RDKit. The molecular bounds matrix is
a construct containing two distances per atom pair, representing upper and lower limits
for the distance between the atoms (Figure 2.1). These are set using atom types and the
hybridization of the atom pairs. For bonded atoms, the limits are set based on typical bond
distances for stable molecules; e.g. the H-H bond limits are set to 0.71 to 0.72 Å. Distance
limits are calculated based on trigonometry and typical bond angles for non-bonded atom
pairs in the same molecule. For atom pairs where the atoms are on separate molecules,
lower limits are set as the sum of the atomic van der Waals radii, and the upper limits are
set to 1000 Å (intended to be infinitely large).
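The inter-molecular part of such a bounds matrix can be sketched in plain Python. This is not the RDKit implementation: the function name is hypothetical, and the radii are assumed values chosen so that their sums reproduce the inter-molecular lower limits shown in Figure 2.1. Upper limits are stored above the diagonal and lower limits below it, following the layout of that figure.

```python
# Assumed van der Waals radii in angstrom, chosen to match Figure 2.1.
VDW_RADIUS = {"C": 1.95, "H": 1.20, "O": 1.70}
NO_UPPER_LIMIT = 1000.0  # stands in for "infinitely large"


def intermolecular_bounds(symbols, molecule_of):
    """Fill distance limits for atom pairs that sit on separate molecules.

    symbols     : element symbol of each atom
    molecule_of : index of the molecule each atom belongs to
    Pairs within the same molecule are left as None, since their limits
    come from bond-length and bond-angle rules not covered here.
    """
    n = len(symbols)
    bounds = [[None] * n for _ in range(n)]
    for i in range(n):
        bounds[i][i] = 0.0
        for j in range(i + 1, n):
            if molecule_of[i] != molecule_of[j]:
                bounds[i][j] = NO_UPPER_LIMIT  # upper limit, above diagonal
                bounds[j][i] = VDW_RADIUS[symbols[i]] + VDW_RADIUS[symbols[j]]
    return bounds


# Methane (atoms 0-4) and a hydroperoxyl radical (atoms 5-7), as in Figure 2.1.
symbols = ["C", "H", "H", "H", "H", "O", "O", "H"]
molecule_of = [0, 0, 0, 0, 0, 1, 1, 1]
bounds = intermolecular_bounds(symbols, molecule_of)
```

For this example the carbon to oxygen lower limit is 3.65 Å and the upper limit is 1000 Å, matching the corresponding entries of panel A.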
RDKit positions atoms in 3-dimensional space using the distance constraints of the
bounds matrix by the process of "embedding" [99]. The atoms are positioned in a random
arrangement that satisfies the matrix. The embedded geometry can then be optimized
using the universal force field (UFF) [100], to give a refined geometry. The magnitude of the
spring constants for each atom pair used in the UFF optimization are determined using the
distance limits in the molecular bounds matrix. Repeated embeddings ensure a variety of
conformers are considered, and the minimum energy conformer estimate is selected.
Methods to generate the bounds matrix, then embed and optimize molecules based on a
bounds matrix, are all available in the open source cheminformatics software RDKit [97].
Positioning reactive atoms
Reactant and product positioning is important in the success of double-ended search
algorithms [101]. The gray area in Figure 2.2 represents a range of geometries close to
the TS that would lead to a successful surface-walking TS search. The dashed lines rep-
resent what a simple interpolation strategy may generate. Double-ended search methods
are always more successful if the reactants and products are placed closer to the transition
state (1 and 2) than using the minimum energy structures (R and P). We therefore need a
(A)
      C     H     H     H     H     O     O     H
C   0.0   1.12  1.12  1.12  1.12  1000  1000  1000
H   1.10  0.0   1.86  1.86  1.86  1000  1000  1000
H   1.10  1.78  0.0   1.86  1.86  1000  1000  1000
H   1.10  1.78  1.78  0.0   1.86  1000  1000  1000
H   1.10  1.78  1.78  1.78  0.0   1000  1000  1000
O   3.65  2.9   2.9   2.9   2.9   0.0   1.33  1.04
O   3.65  2.9   2.9   2.9   2.9   1.31  0.0   1.97
H   3.15  2.4   2.4   2.4   2.4   1.02  1.89  0.0

(B)
      C     H     H     H     H     O     O     H
C   0.0   1.12  1.12  1.12  1.12  2.60  1000  1000
H   1.10  0.0   1.86  1.86  1.86  1000  1000  1000
H   1.10  1.78  0.0   1.86  1.86  2.10  1000  1000
H   1.10  1.78  1.78  0.0   1.86  1000  1000  1000
H   1.10  1.78  1.78  1.78  0.0   1000  1000  1000
O   3.65  2.9   2.9   2.9   2.9   0.0   1.33  1.04
O   2.50  2.9   2.00  2.9   2.9   1.31  0.0   1.97
H   3.15  2.4   2.4   2.4   2.4   1.02  1.89  0.0

[The original figure also carries the annotations 2.55 and 2.05.]

Figure 2.1: The molecular bounds matrix. (A) Bonded atom limits are set by bond length rules, while connectivity limits non-bonded atoms in the same molecule. Van der Waals radii set the lower limits for atoms on separate molecules, while there are no upper limits (set to 1000 Å). (B) By editing these distance limits, we can position molecules relative to one another, as well as stretch or shrink bond distances.
bounds matrix that predicts geometries closer to the transition state. This can be achieved
by positioning reactive atoms almost close enough that bonds are formed and/or broken.
For hydrogen abstraction reactions, 3 distances (Figure 2.3) were considered for modi-
fication to create the necessary starting geometries for double-ended methods: the distance
between the abstracted hydrogen atom and the heavy atom it is bonded to (dXH), the dis-
tance between the abstracted hydrogen and the abstracting radical (dHY), and the distance
between the heavy atom and the abstracting radical (dXY). These key distances undergo
significant change when the reaction proceeds from reactant to product as they occur where
bonds are broken and formed. The algorithm sets distance limits of 2.0 – 2.1 Å for dHY, and
2.5 – 2.6 Å for dXY. These distances were the most successful when tested on a test set of
50 hydrogen abstraction reactions (provided in Appendix A) involving various molecular
Figure 2.2: Potential energy surface showing minimum energy pathway (dotted line) from reactants (R) to products (P). A double-ended search between 1 and 2 will start closer to the TS (grey circle) than a search between R and P.
types. The same distance limits were used for the product geometries, but the X and Y
atoms for the reactants are now the respective Y and X atoms for the products. Embedding
and optimizing the reactants and products with their respective bounds matrices produced
starting geometries required for the QST2 method.
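Editing the bounds matrix for a hydrogen abstraction might look like the sketch below. The function names and the three-atom example matrix are hypothetical; only the distance limits (2.0 – 2.1 Å for dHY and 2.5 – 2.6 Å for dXY) come from the text. Upper limits are assumed to sit above the diagonal and lower limits below it, as in Figure 2.1.

```python
def set_distance_limits(bounds, i, j, lower, upper):
    """Overwrite the limits for atom pair (i, j) in place."""
    if lower > upper:
        raise ValueError("lower limit exceeds upper limit")
    first, second = min(i, j), max(i, j)
    bounds[first][second] = upper   # upper limit lives above the diagonal
    bounds[second][first] = lower   # lower limit lives below the diagonal


def position_for_abstraction(bounds, x, h, y):
    """Pull the abstracting radical Y toward the X-H bond.

    x, h, y are the matrix indices of the heavy atom, the abstracted
    hydrogen, and the abstracting radical centre.
    """
    set_distance_limits(bounds, h, y, 2.0, 2.1)  # dHY
    set_distance_limits(bounds, x, y, 2.5, 2.6)  # dXY


# Hypothetical 3-atom excerpt (X, H, Y) with Y on a separate molecule.
bounds = [
    [0.0, 1.12, 1000.0],
    [1.10, 0.0, 1000.0],
    [3.65, 2.9, 0.0],
]
position_for_abstraction(bounds, x=0, h=1, y=2)
```

For the product geometry the same call would be made with the roles of X and Y exchanged, in line with the description above.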
Transition state validation
An algorithm was created to control the transition state refinement and validation. The
algorithm interfaces with computational chemistry software to perform the transition state
optimization using electronic structure methods such as density functional theory. The
Figure 2.3: Definition of the 3 key distances (dXH, dHY, and dXY) for editing the transition state geometries. H represents the abstracted hydrogen, X the atom bonded to the hydrogen, and Y the radical abstracting the hydrogen.
calculation is checked for an absence of errors, and the presence of a single imaginary
frequency. The optimized geometry is then used for an intrinsic reaction coordinate calcu-
lation (IRC) [102].
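The frequency check described above can be expressed as a short test. This sketch assumes the vibrational frequencies have already been parsed from the output file into a list, and that imaginary frequencies are reported as negative numbers, as quantum chemistry packages commonly do; the function name is hypothetical.

```python
def has_single_imaginary_frequency(frequencies_cm):
    """Return True if exactly one frequency is imaginary (listed as negative).

    frequencies_cm : vibrational frequencies in cm^-1
    """
    n_imaginary = sum(1 for frequency in frequencies_cm if frequency < 0.0)
    return n_imaginary == 1


# Hypothetical frequency lists for a saddle point and a stable species.
saddle_ok = has_single_imaginary_frequency([-1500.0, 300.0, 1200.0])
minimum_found = has_single_imaginary_frequency([300.0, 1200.0, 3000.0])
```

A structure that passes this test is a first-order saddle point, but only the subsequent IRC calculation shows whether it connects the intended reactants and products.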
The IRC result should connect the original reactants and products for a successful tran-
sition state. The result is typically inspected visually for comparison, but this is not possible
for an automatic procedure. In our algorithm, the IRC geometries are extracted and con-
verted into chemical graphs using a simplified version of the ConnectTheDots method in
Open Babel [103]. The atoms are sorted along the z-coordinate, with the method starting
with the lowest atom, continuing along the axis, and terminating with the highest. A bond
is made between this first atom, A, and its nearest neighbor, B, if all the following are true:
1. No bond currently exists between A and B.
2. The number of other bonds to A and B is less than their respective valencies.
3. The distance between A and B is less than the sum of their covalent radii + 0.2 Å.
The process is repeated with atom A being compared each time to the next-nearest
atom from the previous iteration, until either there are no more atoms to be compared or
the number of bonds on A equals its valency. The method then proceeds on to the next
atom along the z-axis.
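The bonding procedure above can be sketched in Python as follows. This is an illustrative reading of the three rules, not the Open Babel code: the function name is hypothetical, the covalent radii and valencies are assumed values for H, C, and O only, and bond orders are ignored so that only connectivity is recovered.

```python
import math

# Assumed covalent radii (angstrom) and maximum valencies, for illustration.
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "O": 0.66}
MAX_VALENCE = {"H": 1, "C": 4, "O": 2}
TOLERANCE = 0.2  # angstrom, added to the sum of covalent radii


def perceive_bonds(atoms):
    """Return the set of bonded index pairs for a list of (symbol, (x, y, z))."""
    n_bonds = [0] * len(atoms)
    bonds = set()
    # Visit atoms from lowest to highest z-coordinate.
    order = sorted(range(len(atoms)), key=lambda i: atoms[i][1][2])
    for a in order:
        symbol_a, position_a = atoms[a]
        # Compare A with the other atoms, nearest first.
        neighbours = sorted(
            (b for b in range(len(atoms)) if b != a),
            key=lambda b: math.dist(position_a, atoms[b][1]),
        )
        for b in neighbours:
            if n_bonds[a] >= MAX_VALENCE[symbol_a]:
                break  # A is saturated; move on to the next atom
            symbol_b, position_b = atoms[b]
            pair = frozenset((a, b))
            cutoff = COVALENT_RADIUS[symbol_a] + COVALENT_RADIUS[symbol_b] + TOLERANCE
            if (pair not in bonds
                    and n_bonds[b] < MAX_VALENCE[symbol_b]
                    and math.dist(position_a, position_b) < cutoff):
                bonds.add(pair)
                n_bonds[a] += 1
                n_bonds[b] += 1
    return bonds


# Hypothetical geometry: an O-H pair with a second hydrogen far away.
atoms = [("O", (0.0, 0.0, 0.0)), ("H", (0.0, 0.0, 0.97)), ("H", (0.0, 0.0, 3.0))]
bonds = perceive_bonds(atoms)
```

The resulting set of index pairs is the connectivity that would then be turned into a chemical graph for comparison with the expected reactants and products.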
With the bonding complete, the chemical graphs of the IRC molecules are compared to
the starting reactants and products using a graph isomorphism algorithm [19]. The transi-
tion state search is successful if the chemical graphs are isomorphic.
2.2.2 Locating transition states with the automatic double-ended search
Double-ended searches require the atoms in the reactants and products to be
provided in corresponding order, e.g. the X atom in the reactant and the Y atom in the
product must be listed in the same place for both geometries. This can sometimes be tedious
for a human starting multiple double-ended searches, but is relatively straightforward in the
context of RMG, which generates the product molecules directly from the reactants.
Semi-empirical methods were used to conduct the double-ended searches to minimize
the computational cost, given the number of reactions being tested and the required number
of gradient calculations per reaction. The resulting transition state estimates were then op-
timized at the same semi-empirical level of theory. The algorithm interfaced with commer-
cially available computational chemistry packages to conduct the transition state searches,
and also validated the outputs to ensure they had converged. All successfully optimized
transition state geometries were validated via intrinsic reaction coordinate (IRC) calcula-
tions [102] to confirm that the saddle points connected with the expected reactants and
products. The successful transition state geometries found using semi-empirical calcula-
tions were used as initial estimates for optimizations using density functional theory. These
geometries were also validated via IRC calculations at the same level of theory as the tran-
sition state search. The overall workflow is shown in Figure 2.4.
334 unique hydrogen abstraction reactions were used to test the algorithm. These re-
actions were collated from hydrogen abstraction reactions in the NIST Chemical Kinetics
Database [104] involving species containing only C, H, and O atoms. They are listed in
Appendix A.
[Figure 2.4 flowchart. Box labels: Reaction from RMG; Reactants and Products, each with
Generate Bounds Matrix, Edit Bounds Matrix close to TS, and Embed Matrix in 3D;
Double-ended Search; Optimize TS geometry; IRC Calculation.]
Figure 2.4: Automatic double-ended enabled transition state search procedure.
2.2.3 Electronic Structure calculations
After the initial embedding in 3D, geometries are refined by constrained optimization
using Universal Force Field (UFF) calculations [100] in RDKit [97]. For the double-ended
searches, Gaussian 09 [105] was used to conduct quadratic synchronous transit (QST2) [63]
searches at the semi-empirical PM6 [49] level of theory, and MOPAC [106] was used to
conduct SADDLE calculations [92] at the semi-empirical PM7 level of theory (a modified
PM6). The resulting transition states were used to start density functional theory calcula-
tions and IRC calculations were performed as noted in the text, using the M06-2X density
functional [107], recently recommended for transition state geometries [108], with the 6-
31+G(d,p) [109, 110] basis set.
[Figure 2.5 flow diagram. Stages: Get bounds matrix; Embed geometry either side of TS;
TS search and refinement; Reaction path analysis; Compare to desired reactants & products.
Flows start from 334 reactions and end in Succeed or Fail, with failures grouped by abstracting
radical (H·, ·OH, other radical). Flow labels legible in the extracted text: 4, 49, 41, 8, 33.]
Figure 2.5: Points of failure along the automatic double-ended algorithm with SADDLE/PM7
calculations. The arrow widths are in proportion with the number of associated reactions.
194 succeeded with 128 reactions failing at the reaction path validation.
2.3 Results and Discussion
The double-ended algorithm was tested with the set of 334 hydrogen abstraction reactions
using Gaussian’s QST2 method at the PM6 level of theory (QST2/PM6) and MOPAC’s
SADDLE method at the PM7 level of theory (SADDLE/PM7), with the resulting geometries
optimized at the respective levels of theory. 160 of the reactions were successful
with QST2/PM6, and 194 with SADDLE/PM7 (Figure 2.5). The higher failure rate of the
QST2/PM6 approach led to the SADDLE/PM7 approach being used for further calcula-
tions.
For the SADDLE/PM7 calculations, 322 reactions successfully converged to a saddle
point, but 128 of these were invalidated by the reaction path analysis: although a saddle
point was found, it represented a hindered rotor in the reactants, not a formation of
products. 90% of these failures comprised reactions with small radicals (H, OH, CH3, and 3O2)
abstracting the hydrogen. The high failure rate of small radical reactions, especially those
involving H radical as the abstracting radical, motivated additional calculations for the H
radical reactions. Transition states for the tested reactions could be found using the
M06-2X/6-31+G(d,p) level of theory, but could not be found with either PM6 or PM7. This suggests
the semi-empirical level of theory was insufficient for these transition state searches. The
geometries for H radical abstractions (Y· is H·) were also observed to be dissimilar to the
geometries found automatically: the distance dHY was on average 0.2 Å smaller than for
other radicals. This may also have contributed to the failure with small radicals.
[Figure 2.6, panels A and B: grids of dXH values (Å) for combinations of abstracting radical
and XH group. The structure labels could not be recovered from the extracted text. Values
as extracted, row by row:
1.241 1.237 1.241 1.241 1.214 1.194
1.373 1.376 1.378 1.390 1.365 1.345
1.390 1.395 1.395 1.446 1.419 1.402]
Figure 2.6: dXH transition state distances in Angstrom calculated with M06-2X/6-31+G(d,p).
Abstracting radicals (Y) are on the left, and the hydrogen and the carbon it is abstracted
from (XH) are on top. Similar trends were observed for dHY and dXY.
The successful transition state geometries were used as initial estimates for M06-2X/6-
31+G(d,p) optimizations. The results were largely successful with less than a 10% failure
rate. This means the transition states found using semi-empirical methods with low com-
putational effort are suitable starting points for more accurate calculations.
The 3 key reaction center distances (dXH, dHY, dXY) were collated from successfully
determined transition states found at M06-2X/6-31+G(d,p). Trends were observed in the
distances at the transition state, demonstrating that modifying a single reacting group had
a quantifiable effect on the reaction center distances (Figure 2.6). The trends shown in
Figure 2.6 suggest the effects of the reacting groups on dXH distances are separable and
consistent. Similar trends were observed for dHY and dXY (see Appendix A).
2.4 Conclusion
An algorithm has been developed to automatically determine transition state geometries
for hydrogen abstraction reactions using double-ended search methods. The algorithm uses
distance geometry to provide the starting geometries for double-ended searches. Transi-
tion states were successfully found for over 50% of tested hydrogen abstraction reactions,
using a combination of semi-empirical PM7 calculations and calculations at M06-2X/6-
31+G(d,p). Most failures were saddle points that cannot be found at PM7, and a higher
success rate is expected if DFT electronic structure methods are used when running the
double-ended transition state searches, though this would incur additional computational
expense. Trends were also observed in the reaction center distances, and these suggest
that unknown inter-atomic transition state distances can be estimated via a group-based
approach.
2.5 Recommendations
2.5.1 Semi-empirical methods are insufficient for transition state searches
The automated double-ended search algorithm used semi-empirical methods to con-
duct transition state searches, and the semi-empirical transition states were used to run
more robust density functional theory calculations. The semi-empirical calculations are
computationally inexpensive, but the poor accuracy of the potential energy surface means
many transition states cannot be determined. The computational savings are meaningless
if a large quantity of transition states cannot be found, so more robust electronic structure
methods should be used. Density functional theory provides a good balance between cost
and accuracy, and should be the minimum used for further transition state studies.
2.5.2 Consider more robust double-ended search methods
The QST2 and SADDLE methods used in this chapter are interpolation-based double-ended
methods, but more robust path optimization methods have been developed. The path
optimization methods discover the reaction path from the same reactant and product
geometries provided to interpolation methods, each using a different strategy. Path optimization
methods have an increased computational cost due to the number of gradient calculations
required to find the transition state [71]. The additional computational cost of double-ended
methods is somewhat mitigated by not requiring a path analysis calculation once the tran-
sition state is found, since the reaction path is determined during the transition state search.
It should be determined if this increased computational cost makes these calculations feasi-
ble for implementation with automated mechanism generation. Implementing more robust
double-ended approaches may not be feasible for mechanism generators, but could be use-
ful for other applications such as solving a small pressure-dependent reaction network. For
example, other automated transition state searches enabled by double-ended approaches
have been developed to discover new reaction pathways [76, 78].
It has been shown that there is no clear advantage to using a given path optimization
method [73]: one method may be more successful for some reactions, while another
is more successful for others. These methods also contain parameters that need to be
optimized [93], which further complicates their automated implementation. Several methods
could be implemented, but many of these more robust double-ended methods are imple-
mented in different commercially available software, providing a further barrier to their
widespread implementation.
3. Automatic transition state geometry estimation using group
contributions
3.1 Background
Complex chemical systems, such as the combustion of novel renewable fuels, can be
better understood with detailed kinetic models. The required detail means a model can
contain thousands of species and reactions [5], making their construction laborious and
prone to human error. Automated mechanism generators have been developed to construct
detailed kinetic models while avoiding the pitfalls of manual construction [1]. Thermody-
namic and kinetic parameters are preferentially sourced from experimental measurements
or high fidelity theoretical calculations to complete a kinetic model, but estimates are also
used as many of the required parameters are unknown [20].
Parameter estimation methods are computationally efficient strategies to provide ther-
modynamic and kinetic values [28]. Most parameter estimation methods are based on Ben-
son’s group additivity [111], in which the thermodynamics of a molecule are estimated by
summing the contributions from the molecular groups present in the molecule, these group
values having first been calculated from molecules with known thermodynamic parameters
[22, 112]. Such group contribution methods have been shown to work well for thermo-
chemistry of hydrocarbon species, and the concept has been extended to kinetic parameter
estimation [113–116]. Group contribution methods become less accurate when parameters
are estimated using group values that have not been well determined, due to insufficient
training data. For example, group values have been difficult to extend to thermodynamics
of fused rings leading to inaccuracies in their estimates [8].
Such inaccuracies in group-based estimation methods have motivated high-throughput
electronic structure calculations for thermodynamics and kinetics [117, 118]. Such a
procedure was recently developed to calculate thermodynamic parameters within the
framework of the automatic Reaction Mechanism Generator (RMG) [6, 8]. In that procedure,
3-dimensional structures were created via distance geometry [96], with the structures op-
timized using force-fields and semi-empirical electronic structure calculations to provide
molecular parameters, allowing thermodynamic parameters to be calculated. Thermody-
namic error was greatly reduced for fused-ring species compared to estimates derived from
Benson’s group additivity.
In a similar manner, kinetic parameters currently estimated from poorly trained group
values could be improved by applying electronic structure calculations and transition state
theory, but this requires a high-throughput approach for finding transition state geometries.
A transition state geometry estimate, which is typically provided manually, must be quite
similar to the correct transition state geometry for the optimization to converge. Manual
estimation of transition states is not compatible with the context of automated mechanism
generation, which requires thousands or even millions of reaction rates. With continuing
advances in computing power, it has become feasible to automate these searches.
One approach used the growing string double-ended method [71] to search for possible
transition states [76]. While there is an increased computational cost associated with the
transition state search, the use of the string method negates the need for a path analysis
step to validate the transition state. This method has been extended to the construction of
detailed mechanisms, where the user controls the mechanism generation with restrictions
such as barrier height limits [77]. Adoption of this method is limited to those with access
to software with reliable double-ended methods. Zimmerman has further developed these
methods to create a single-ended transition state search [79]. This makes use of driving
coordinates from reactants to find intermediates, from which the transition state can be
found using the growing string method.
Zádor and Najm instead use a rule-based approach to direct atoms from a reactant
configuration towards the product, using energy calculations at each step to determine the
location of the transition state [80]. This method is best suited to reaction systems with a
small number of atoms, such as the exploration of a pressure dependent reaction network.
The Automated Alkylation Reaction Optimizer for N-oxides (AARON) code automates
transition state searches to screen potential organocatalysts [81]. A catalyst structure is
provided by the user and mapped onto a parent catalyst structure for which the transition
state geometry is already known, then a series of partially constrained semiempirical and
DFT optimizations allow the new transition state to be found.
Maeda and Morokuma used an artificial force to push reacting molecules together, to
probe the potential energy surface around atoms, predicting reactions and finding their
transition states [74, 75]. This Artificial Force Induced Reaction (AFIR) method requires
many random starting orientations.
The methods highlighted above explore the potential energy surface for a given set of
atoms, finding many reaction pathways for a few reactants. These are not well suited to
automated mechanism generation where it is routine to have many reactions of the same
type but with varying reactants. For such applications, this chapter describes an alterna-
tive method to estimate transition state geometries. Trends in reaction center distances at
the transition state were observed in Chapter 2, and have also been observed in previous
studies [31]. These insights inspired the development of molecular group contributions to
predict the inter-atomic distances in the reaction center of transition states, enabling fully
automated prediction of transition state structures.
Estimated 3D geometries are constructed from the predicted distances using distance
geometry. Optimization and validation of the transition state estimates have also been
automated. Hydrogen abstraction reactions from a diisopropyl ketone combustion model
[2], previously developed using RMG, were used to test the method, with transition states
found for over 65% of the 1393 reactions.
3.2 Methods
3.2.1 Geometry estimation and optimization
Distance geometry
Further details on the implementation of distance geometry can be found in section
2.2.1.
The open-source cheminformatics toolkit RDKit [97] was chosen for its speed and
accuracy as a conformer generation tool [98]. The distance geometry approach used in
RDKit is described by Blaney and Dixon [96]. This approach uses a molecular bounds
matrix containing upper and lower bounds on distances separating each atom pair.
Distances separating reactive atoms undergo significant change during a reaction, but
the rest of the molecule remains relatively unaffected. As a result, distances between the
reactive atoms are unknown at the transition state, but existing methods can be used to
determine the remaining distances.
For hydrogen abstraction reactions, three atoms lie in the reaction center: the abstracted
hydrogen (H), the atom bonded to the abstracted hydrogen (X), and the radical abstracting
the hydrogen (Y). The three distances separating each reactive atom pair are denoted as
dXH, dHY, and dXY. Estimating these distances allows the entire transition state geometry
to be created using distance geometry. Typically the geometry is specified manually, but
we demonstrate here a group contribution method to estimate the required reaction center
distances.
Molecular group organization
Molecular groups were used to predict distances separating reactive atoms of transition
states. The molecular groups were organized in a hierarchical tree structure, so that distance
predictions were made using the most relevant available data. The tree was limited to
reactions with only atom types (elements) of C, H, and O, but can be expanded to include
other atom types by adding the appropriate groups. Two trees were used as hydrogen
abstraction reactions are bimolecular and the reaction center distances are dependent on
both reactants. The head nodes (top groups) for the trees were X_H_or_Xanyrad_H and
Y_anyrad. The X_H_or_Xanyrad_H tree described the reactant where X is a wildcard atom
of any atom type, with zero or more radical electrons, bonded to an H atom (the hydrogen
to be abstracted), and the Y_anyrad tree described the abstracting radical of any atom type,
with one or more radical electrons. Child nodes were added to be more detailed than the
parent nodes, for example, a child of the X_H_or_Xanyrad_H node is X_H (here X is any
element but has no radical electrons), itself having a child H2.
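Descending such a tree to the most specific matching group can be sketched as follows. This is a minimal illustration only: the GroupNode class, the dictionary used to describe the reactive site, and the matching predicates are all hypothetical simplifications (RMG matches molecular subgraphs), and the group names are written with underscores on the assumption that this is how they appear in the RMG database.

```python
class GroupNode:
    """A node in a hierarchical group tree (hypothetical, simplified)."""

    def __init__(self, name, matches, children=()):
        self.name = name
        self.matches = matches          # predicate on a description of the reactive site
        self.children = list(children)  # more specific groups


def most_specific_group(root, site):
    """Descend from the head node, following the first child that matches."""
    if not root.matches(site):
        return None
    node = root
    while True:
        for child in node.children:
            if child.matches(site):
                node = child
                break
        else:
            return node  # no child matches, so this node is the most specific


# A small fragment of the X_H_or_Xanyrad_H tree with illustrative predicates.
tree = GroupNode("X_H_or_Xanyrad_H", lambda s: True, [
    GroupNode("X_H", lambda s: s["radicals"] == 0, [
        GroupNode("H2", lambda s: s["element"] == "H"),
        GroupNode("Cs_H", lambda s: s["element"] == "C", [
            GroupNode("C_methane", lambda s: s["hydrogens"] == 4),
        ]),
    ]),
    GroupNode("Xrad_H", lambda s: s["radicals"] > 0),
])

methane_site = {"element": "C", "radicals": 0, "hydrogens": 4}
print(most_specific_group(tree, methane_site).name)
```

If no child of a node matches, the node itself is returned, so a prediction falls back to the closest available ancestor when a specific group is missing.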
The structure of the molecular group tree was taken from the kinetics database of the
RMG software. This tree structure was developed to make efficient use of sparse data for
estimating kinetic parameters relevant to hydrocarbon combustion. The development of this
tree involved several researchers making independent modifications over a number of years
to provide improved kinetic estimates for specific fuels. Sometimes modifications were
made with the aim of minimizing disruption of the existing tree, rather than of optimizing
the overall tree structure.
The tree structure and associated group values are available in Appendix B.
Group additive distance estimation
Reaction center distances were collated from previously optimized transition state ge-
ometries, creating a training set. Values for molecular groups, organized in a hierarchical
tree, were calculated using values from the training set by linear least squares regression,
using the distances for every reaction in the training set that match the molecular group.
The base value is stored in the top level node, and the value for a descendant is stored as a
correction to the top level node value. This means the value of a given node is calculated
as the sum of the base value and the node’s correction.
The linear least squares regression calculates group values by finding the best fit to
the available training data. For each set of distances in the training set, the reactants are
matched to groups in the group tree. All groups that match the X_H_or_Xanyrad_H reactant
are paired with the groups that match the Y_anyrad reactant, and the sum of each pair and
a base value is set equal to the training distances. This creates a system of equations where
the variables are the group values and the known values are the training data. The regression
is conducted using the linear algebra package in numpy, finding the group values that best
fit the data [119]. A detailed description of the least squares regression is available in the
appendices.
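The regression can be sketched as follows, assuming each training entry lists the groups matched by each reactant together with one optimized distance. The function name fit_group_values and the data layout are illustrative assumptions rather than the code used in this work; the only library call relied on is numpy.linalg.lstsq.

```python
import itertools

import numpy as np


def fit_group_values(training, groups):
    """Fit a base value and one correction per group by linear least squares.

    training: list of (x_groups, y_groups, distance) tuples
    groups:   ordered list of every group name appearing in the training data
    """
    column = {name: i + 1 for i, name in enumerate(groups)}  # column 0 holds the base value
    rows, targets = [], []
    for x_groups, y_groups, distance in training:
        # One equation per pairing of a matching X-H group with a matching Y group.
        for x_group, y_group in itertools.product(x_groups, y_groups):
            row = np.zeros(len(groups) + 1)
            row[0] = 1.0
            row[column[x_group]] = 1.0
            row[column[y_group]] = 1.0
            rows.append(row)
            targets.append(distance)
    solution, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    values = {"Base": float(solution[0])}
    values.update({name: float(solution[column[name]]) for name in groups})
    return values


# Synthetic distances (Å), for illustration only.
demo_groups = ["Cs_H", "C_methane", "Cs_rad", "C_methyl"]
demo_training = [
    (["Cs_H", "C_methane"], ["Cs_rad", "C_methyl"], 1.36),
    (["Cs_H"], ["Cs_rad", "C_methyl"], 1.33),
    (["Cs_H", "C_methane"], ["Cs_rad"], 1.40),
]
print(fit_group_values(demo_training, demo_groups))
```

Because the base value and the group corrections are not independent, the system is rank deficient; numpy.linalg.lstsq then returns the minimum-norm solution among all best fits.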
The reaction CH4 + C2H5 is used as an example. Table 3.1 shows the sections of
the molecular group tree relevant to this reaction. The most specific group that matches
each reactant is found by descending the tree. CH4 matches the C_methane group in the
X_H_or_Xanyrad_H tree, while C2H5 matches the C_rad/H2/Cs\H3 group in the Y_anyrad
tree. An explanation of the naming convention, and complete tree definitions, are provided
in the appendices. The distance estimates are calculated by summing the top node value
and the group correction for each reactant, predicting respective values for dXH, dHY, and
dXY as 1.388 Å, 1.331 Å, and 2.721 Å.
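The arithmetic of this example can be reproduced directly from the values in Table 3.1. The sketch below is illustrative only; the function name estimate_distances and the dictionary layout are assumptions.

```python
# Values from Table 3.1 (Å): base value, then the correction for each matched group.
BASE = {"dXH": 1.336010, "dHY": 1.336330, "dXY": 2.667560}
C_METHANE = {"dXH": 0.076680, "dHY": -0.051468, "dXY": 0.028801}      # matches CH4
C_RAD_H2_CS_H3 = {"dXH": -0.024753, "dHY": 0.045959, "dXY": 0.024509}  # matches C2H5


def estimate_distances(base, x_group, y_group):
    """Sum the base value and the two group corrections for each distance."""
    return {key: base[key] + x_group[key] + y_group[key] for key in base}


estimate = estimate_distances(BASE, C_METHANE, C_RAD_H2_CS_H3)
for name, value in estimate.items():
    print(f"{name} = {value:.3f} Å")
```

For dXH, for example, the sum is 1.336010 + 0.076680 - 0.024753 = 1.387937, which rounds to 1.388 Å.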
Transition state geometry estimation
With the distances between atoms at the reaction center estimated using molecular
group values as described in the previous section, transition state geometry estimates can
be created via distance geometry (Figure 3.1). For a pair of reactants, a bounds matrix
is first generated in RDKit for the stable species, comprising upper and lower limits on
the distances between each pair of atoms. For the distances dXH, dXY, and dHY, the values
in the bounds matrix are updated to be the distance prediction as described earlier,
± 0.05 Å. Some combinations of upper limits from these edits may conflict with previously
set lower limits, particularly lower limits between a reactive atom (X, H, or Y) and some
non-reacting atoms, forming an inconsistent bounds matrix. In these cases the conflicting
lower limits are reduced to be in agreement with the previous edits. Finally, a triangle
inequality algorithm is used to smooth the bounds matrix.

(A)
      C     H     H     H     H     C     C     H     H     H     H     H
C  0.00  1.12  1.12  1.12  1.12  1000  1000  1000  1000  1000  1000  1000
H  1.10  0.00  1.86  1.86  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  0.00  1.86  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  1.78  0.00  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  1.78  1.78  0.00  1000  1000  1000  1000  1000  1000  1000
C  3.90  3.15  3.15  3.15  3.15  0.00  1.52  1.12  1.12  1.12  2.20  2.20
C  3.90  3.15  3.15  3.15  3.15  1.50  0.00  2.20  2.20  2.20  1.12  1.12
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  0.00  1.86  1.86  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  1.78  0.00  1.86  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  1.78  1.78  0.00  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  2.12  1.10  2.26  2.26  2.26  0.00  1.86
H  3.15  2.40  2.40  2.40  2.40  2.12  1.10  2.26  2.26  2.26  1.78  0.00

(B)
      C     H     H     H     H     C     C     H     H     H     H     H
C  0.00  1.40  1.12  1.12  1.12  1000  2.73  1000  1000  1000  1000  1000
H  1.38  0.00  1.86  1.86  1.86  1000  1.34  1000  1000  1000  1000  1000
H  1.10  1.78  0.00  1.86  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  1.78  0.00  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  1.78  1.78  0.00  1000  1000  1000  1000  1000  1000  1000
C  3.90  3.15  3.15  3.15  3.15  0.00  1.52  1.12  1.12  1.12  2.20  2.20
C  2.71  1.32  3.15  3.15  3.15  1.50  0.00  2.20  2.20  2.20  1.12  1.12
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  0.00  1.86  1.86  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  1.78  0.00  1.86  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  1.78  1.78  0.00  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  2.12  1.10  2.26  2.26  2.26  0.00  1.86
H  3.15  2.40  2.40  2.40  2.40  2.12  1.10  2.26  2.26  2.26  1.78  0.00

(C)
      C     H     H     H     H     C     C     H     H     H     H     H
C  0.00  1.40  1.12  1.12  1.12  1000  2.73  1000  1000  1000  1000  1000
H  1.38  0.00  1.86  1.86  1.86  1000  1.34  1000  1000  1000  1000  1000
H  1.10  1.78  0.00  1.86  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  1.78  0.00  1.86  1000  1000  1000  1000  1000  1000  1000
H  1.10  1.78  1.78  1.78  0.00  1000  1000  1000  1000  1000  1000  1000
C  3.90  2.76  3.15  3.15  3.15  0.00  1.52  1.12  1.12  1.12  2.20  2.20
C  2.71  1.32  3.10  3.10  3.10  1.50  0.00  2.20  2.20  2.20  1.12  1.12
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  0.00  1.86  1.86  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  1.78  0.00  1.86  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  1.10  2.12  1.78  1.78  0.00  3.08  3.08
H  3.15  2.40  2.40  2.40  2.40  2.12  1.10  2.26  2.26  2.26  0.00  1.86
H  3.15  2.40  2.40  2.40  2.40  2.12  1.10  2.26  2.26  2.26  1.78  0.00

[Annotations in the original figure mark distances of 1.33 Å and 3.15 Å.]
Figure 3.1: Manipulating the molecular bounds matrix to create transition state geometry
estimates. (A) The matrix generated for a pair of stable species. (B) Editing the matrix
with the group contribution predictions for transition state distances. (C) Conflicting
lower limit distances are corrected, creating a valid transition state distance bounds matrix.

Table 3.1: Part of the hierarchical molecular group tree for transition state distances
trained using 1071 transition state distances calculated using B3LYP/6-31+G(d,p). The full
tree is provided in the appendices.

Group                              dXH         dHY         dXY
Base                          1.336010    1.336330    2.667560
L1: X_H_or_Xanyrad_H
  L2: X_H                    -0.002556    0.002864    0.000227
    L3: H2                   -0.327434   -0.045046   -0.369886
    ...
    L3: Cs_H                  0.007461    0.023642    0.032296
      L4: C_methane           0.076680   -0.051468    0.028801
      L4: C_pri               0.025511   -0.002230    0.025031
        L5: etc.
      L4: C_sec              -0.026003    0.069757    0.044341
      L4: C_ter              -0.025676    0.062321    0.034956
        L5: etc.
  L2: Xrad_H                  0.094987   -0.106435   -0.008430
  etc.
L1: Y_anyrad
  ...
  L2: Y_rad                   0.002857   -0.002500    0.000277
    L3: H_rad                -0.044160   -0.330263   -0.371926
    ...
    L3: Cs_rad                0.024200    0.007289    0.032625
      L4: C_methyl           -0.050813    0.075919    0.028607
      L4: C_pri_rad          -0.001792    0.025273    0.025176
        L5: C_rad/H2/Cs      -0.032772    0.051719    0.021617
          L6: C_rad/H2/Cs\H3 -0.024753    0.045959    0.024509
          L6: C_rad/H2/Cs\Cs2\O -0.125966  0.025305   -0.097425
  etc.
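The bounds matrix edits described above can be illustrated with plain numpy. This is a sketch under stated assumptions, not the implementation used in this work: upper bounds are assumed to be stored above the diagonal and lower bounds below it, the helper names are hypothetical, and the rule used to reduce a conflicting lower limit (capping it at the sum of two upper limits through a reaction center atom) is one possible choice.

```python
import numpy as np


def set_bounds(bounds, i, j, lower, upper):
    """Store the upper bound above the diagonal and the lower bound below it."""
    a, b = min(i, j), max(i, j)
    bounds[a, b] = upper
    bounds[b, a] = lower


def apply_predictions(bounds, predictions, tolerance=0.05):
    """Overwrite reaction center bounds with predicted distance +/- tolerance (Å)."""
    for (i, j), distance in predictions.items():
        set_bounds(bounds, i, j, distance - tolerance, distance + tolerance)


def relax_conflicting_lower_limits(bounds, center_atoms):
    """Reduce any lower limit that exceeds the upper limit implied by a path
    through a reaction center atom (triangle inequality)."""
    n = bounds.shape[0]
    upper = np.triu(bounds, 1)
    upper = upper + upper.T  # symmetric matrix of upper bounds
    for j in center_atoms:
        for i in range(n):
            for k in range(i + 1, n):
                if j in (i, k):
                    continue
                implied_upper = upper[i, j] + upper[j, k]
                if bounds[k, i] > implied_upper:
                    bounds[k, i] = implied_upper


# Toy system: 0 = abstracted H, 1 = radical center Y, 2 = an atom bonded to Y.
bounds = np.zeros((3, 3))
set_bounds(bounds, 0, 1, 2.40, 1000.0)  # H and Y start in separate molecules
set_bounds(bounds, 0, 2, 3.15, 1000.0)
set_bounds(bounds, 1, 2, 1.50, 1.52)    # an ordinary bond within the radical
apply_predictions(bounds, {(0, 1): 1.33})
relax_conflicting_lower_limits(bounds, center_atoms=[0, 1])
print(bounds)
```

In the toy system, bringing H to within 1.38 Å of Y means H can be at most 1.38 + 1.52 = 2.90 Å from the atom bonded to Y, so the original 3.15 Å lower limit for that pair is reduced to 2.90 Å.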
Transition state estimates are created by randomly “embedding” the atoms in 3D space
such that they satisfy the bounds matrix. Repeating this process allows multiple conform-
ers to be created. The conformer geometries are then optimized using a UFF force field
calculation constrained by the bounds matrix. The lowest energy conformer according to
the UFF calculation is selected as the transition state estimate. While the accuracy of the
force field energy calculation is low, it is sufficient for conformer selection.
Figure 3.2: The automated transition state search algorithm.
Transition state validation
An algorithm was created to control the transition state refinement and validation. The
geometry estimate resulting from the constrained force field optimization is used as the
initial guess for a transition state optimization using electronic structure methods such as
density functional theory. The calculation is checked for an absence of errors, and the
presence of a single imaginary frequency. The optimized geometry is then used for an
intrinsic reaction coordinate calculation (IRC) [102]. Full details of the transition state
validation steps are provided in section 2.2.1. The full automated algorithm is outlined in
Figure 3.2.
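The check applied after the transition state optimization can be sketched as below. The function name and its inputs are hypothetical, and the sketch assumes that imaginary frequencies are reported as negative numbers, as in Gaussian output.

```python
def is_candidate_transition_state(frequencies, terminated_normally):
    """Return True if the job finished without error and the geometry has
    exactly one imaginary frequency (reported here as a negative value)."""
    imaginary = [f for f in frequencies if f < 0.0]
    return terminated_normally and len(imaginary) == 1


# Frequencies in cm^-1, for illustration only.
print(is_candidate_transition_state([-1450.2, 310.5, 880.1], terminated_normally=True))
```

A geometry that passes this check is still only a candidate: the IRC calculation is what confirms that the saddle point connects the expected reactants and products.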
Training the molecular group values
Molecular group values were trained with known values taken from transition state
geometries that were optimized and validated with the B3LYP electronic structure method
and a 6-31+G(d,p) basis set. All data added to a training set came from transition states
found and validated using the same electronic structure method and basis set. Transition
states found and validated with the automated algorithm were also added to the training set
at the end of each test of the automated algorithm. Before each run of the algorithm, the
molecular group values were retrained using the training set expanded from the previous
run.
3.2.2 Method evaluation
H abstraction reactions from a DIPK combustion model.
1,393 hydrogen abstraction reactions from a diisopropyl ketone (DIPK) combustion
model (total of 4,027 reactions) [2] were used to test the automated algorithm. Reactions
were passed to the transition state search algorithm, which created transition state estimates,
then optimized and validated them.
First, a preliminary training set was created from 44 unique hydrogen abstraction tran-
sition states, and was used to train the molecular group tree. As few groups were trained,
we found the distance estimates to be insufficient for reliably predicting transition states.
As a result, the training set was expanded to contain data from a total of 230 transition
states. This expansion of the training set was done with geometries found both manually
and using the automated algorithm. The reactions from the DIPK model were then passed
to the automated algorithm, with data from the successfully found transition states added
to the training set. The groups were retrained, and the method was tested again on the same
reactions from the DIPK model. This led to the expansion of the training set from data for
230 transition states to 827 and then 1,071 transition states. Characteristics of the group
contribution method were investigated using 4 training sets (Table 3.2).
Computational Chemistry
Estimated geometries were refined in RDKit using the Universal Force Field (UFF) [100].
Geometry optimization and path analysis calculations were run using B3LYP [120, 121]
with the 6-31+G(d,p) [109, 110] basis set in the Gaussian 09 [105] quantum chemistry
package.

Table 3.2: Training set information. As the training set was expanded, the RMS error
from the validated transition state distances decreases.

Training set name   Transition states in training set   Geometries found   RMS Error (Å)
44TS                44                                  not run            0.181
230TS               230                                 658                0.102
827TS               827                                 734                0.040
1071TS              1071                                not run            0.036
3.3 Results and Discussion
3.3.1 Transition state geometries were successfully estimated using the distance estimates
The algorithm was tested on the DIPK reactions with the groups trained with the train-
ing set named ‘230TS’, and found 658 of the 1,393 transition state geometries. 597 of the
resulting geometries were not already in 230TS, making a set 827TS when added to the
training set. The set 827TS was used to retrain the group values, with the algorithm again
tested on the DIPK reactions, where 734 transition states were found, of which 244 were
new to the training data. The additional 244 transition states allowed the creation of the
1071TS set. Over the 2 test runs, 907 transition states of the 1,393 reactions were found
and validated, expanding the training data from 230 to 1,071 transition states.
3.3.2 Increasing training data improves the group value predictions
The reaction center distances from the 907 transition states found using the algorithm
were compared to distances estimated by molecular group values at differing training set
sizes (Figure 3.3). The root-mean-squared (RMS) error for each of the 3 distances de-
creased when the training set containing transition state data was increased from 44 up
to 1,071 entries. There was little improvement in the estimated values when the training
set expanded from 827 geometries to 1,071 in comparison to the earlier expansions of the
training set.
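For reference, the RMS error between predicted and optimized distances can be computed as in the following sketch. The function name and the example distances are illustrative assumptions and are not data from this work.

```python
import numpy as np


def rms_error(predicted, optimized):
    """Root-mean-squared difference between predicted and optimized distances."""
    predicted = np.asarray(predicted, dtype=float)
    optimized = np.asarray(optimized, dtype=float)
    return float(np.sqrt(np.mean((predicted - optimized) ** 2)))


# Hypothetical dXH values in Å for three reactions.
print(rms_error([1.39, 1.33, 1.25], [1.37, 1.36, 1.24]))
```

Applying the function separately to dXH, dHY, and dXY gives one RMS error per distance for a given training set.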
The observed improvement in the distance predictions as the groups were trained with
more data was consistent with our hypothesis. With a larger training set, some untrained
groups now have data and some trained groups have more data, improving their accu-
racy. If the group was newly trained, the algorithm would use more relevant and specific
group values, improving the predicted distances. This was observed in the improvement in
the distance predictions moving from 44TS to 230TS. With new training data, previously
trained groups improve as more data are used to train the group values, as seen when com-
paring the groups trained using 230TS and 827TS. Little improvement in the RMS error
for predictions made with 827TS and 1071TS shows that the 827TS groups were relatively
well trained so the extra data from 244 transition state geometries had little effect on group
value predictions.
The observations show certain data are more desirable when expanding a training set
for molecular group values. For example, if the reactions of interest are hydrogen abstrac-
tions from the OH group of an alcohol, the training set should contain such reactions with
different types of radicals abstracting the hydrogen. If the training set contains data from a
large number of transition states for hydrogen abstractions from alkanes by an alkyl radi-
cal, little will be gained by adding a transition state for ethyl abstracting a hydrogen from
methane. Both the reactions of interest and the available data should be considered when
adding new data to a training set.
3.3.3 Geometry estimation needs improvement to make best use of predicted values
As described earlier, two attempts were made to find all the TSs in the DIPK model:
first with the original group tree trained with the 230TS training set, and second with the
tree retrained with the 827TS training set. Of the 907 geometries found over these two
iterations, 422 were found during one iteration but not the other. This allowed comparison
of estimates that were unsuccessful, against the true optimized values from the successful
attempts (Figure 3.4). One cluster of failures, with RMS errors greater than 0.15 Å, came
from the 230TS iteration, and were mostly successful at the 827TS iteration. For distance
estimates with RMS errors below 0.05 Å, the conversion from a predicted value into a
UFF-optimized 3D geometry using the current algorithm resulted in additional error being
introduced into the distances, possibly causing the failure. This suggests that while the
group additive method can make accurate distance predictions, further optimization of the
algorithm for converting these distances into 3D geometries is necessary.

[Figure 3.3 plot: parity plots of Predicted Distance (Å) against Optimized Distance (Å).
Columns: dXH, dHY, dXY. Rows: 44TS, 230TS, 827TS, 1071TS.]
Figure 3.3: Distances from 907 validated transition states found at B3LYP/6-31+G(d,p)
were compared to predictions derived from molecular group values. The solid line represents
parity with the optimized distances, and the dashed lines represent the root mean squared
error of the estimates from parity. The predictions improved as the training set used to
calculate the group values was expanded.
Figure 3.4: 422 transition states found in one trial of the algorithm were unsuccessful in another. Comparing optimized distances against the failed estimation attempt showed: (1) poorly estimated distances that were improved when the training set was expanded, and (2) that the conversion from prediction to geometry estimate introduced additional error. [Histogram: number of failed geometries vs. RMS error (Å); series: group additive prediction, UFF optimized distances.]
Figure 3.5 shows that the probability of a failed transition state search increases with
increasing root mean squared (RMS) error in the three reacting distances of the starting
geometry. The lower bound probabilities are calculated from trials using the 230TS training
set. It is a lower bound of P(failure) because only the 249 failures that later succeeded
with the 827TS training set were included; for the 486 reactions that continued to fail, the
true distances are not known and the RMS error could not be calculated. Because few of
our starting geometries were worse than 0.2 Å, we do not have many trials in this region and
our estimate of the failure probability is quite uncertain, hence the wide Clopper–Pearson
[122] 95% confidence interval of P(failure) (the vertical bars in Figure 3.5). To estimate
the upper bound of the failure probabilities, we distributed the 486 additional failures
using a variety of assumptions, each giving a different estimate of the P(failure) curve; the
upper bound in the figure encompasses all these curves.
Although uncertain, the shapes of these bounds are informative, and they support the
need for good starting geometries for a transition state search: embedded geometries with
an RMS error greater than 0.15 Å have a high failure rate. Other reaction families,
optimization algorithms, and software packages may behave differently.
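The Clopper–Pearson interval used for the vertical bars can be illustrated with a short sketch. The version below finds the exact binomial limits by bisection using only the Python standard library; the function names and the example counts are hypothetical and are not taken from the thesis data.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _root_of_decreasing(f, lo=0.0, hi=1.0, iterations=100):
    """Bisection for a function that decreases from f(lo) > 0 to f(hi) < 0."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(failures, trials, confidence=0.95):
    """Exact two-sided binomial confidence interval for the failure probability."""
    tail = (1.0 - confidence) / 2.0
    if failures == 0:
        lower = 0.0
    else:
        # Lower limit: the p for which P(X >= failures | p) equals the tail area.
        lower = _root_of_decreasing(
            lambda p: tail - (1.0 - binom_cdf(failures - 1, trials, p)))
    if failures == trials:
        upper = 1.0
    else:
        # Upper limit: the p for which P(X <= failures | p) equals the tail area.
        upper = _root_of_decreasing(
            lambda p: binom_cdf(failures, trials, p) - tail)
    return lower, upper


# Hypothetical bin: 3 failed searches out of 4 trials with a poor starting geometry.
low, high = clopper_pearson(3, 4)
print(f"P(failure) = {3 / 4:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
```

With so few trials in a bin the interval spans most of the range from 0 to 1, which is the behavior described above for starting geometries worse than 0.2 Å.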
Figure 3.5: Probability of a failed TS search as a function of RMS error in reactive distances of starting geometry. For each point the vertical bar shows the Clopper–Pearson [122] 95% confidence interval of the lower bound and the horizontal bar shows the range of RMS errors used to calculate it. [Axes: probability of failure vs. RMS error in starting geometry distances (Å); series: upper bound, lower bound.]
3.3.4 Algorithm optimization
The automated algorithm takes advantage of the molecular group estimates to predict,
optimize, and validate transition state geometries, but it does not make best use of the
group-based distance estimates, and could be improved in future work. In the algorithm
tested here, after the atoms are positioned in the 3D space, a constrained UFF refinement
step is done in RDKit before the transition state search at DFT. This is designed to improve
the geometry of the non-reacting atoms, but the refinement can alter the reaction center
distances, dragging them away from their well-predicted values. This could be addressed
by tightening the constraint spring constants before the UFF refinement or replacing the
refinement step with a DFT optimization with the reaction center distances frozen as is
done in the AARON code [81].
The difference between the upper and lower bounds for the reaction center distance
estimates is currently set to 0.1 Å, which can be as much as 10% of some distances. This
range should be related to the uncertainty calculated when determining the group values by
linear regression, allowing well-determined values to have tight restrictions.
3.4 Conclusion
Automated transition state searches have previously been described as an important
challenge for studying complex chemical systems, helping to move mechanism genera-
tion closer to being predictive. A group contribution method has been developed to take
advantage of available chemical data to make predictions of transition state geometries.
The group contribution method performs best with well trained groups, as seen when pre-
dictions improved by adding more training data. The group contributions were used in
a novel, fully automated algorithm to create a transition state estimate using distance ge-
ometry methods, with the estimate then optimized and validated to find the true transition
state structure. The validation step makes it a self-improving machine learning algorithm,
as new transition state data are used to improve group values. That a simple sum of con-
tributions from the abstracting and donating groups can so fully determine the transition
state geometry offers new physical insight into these reactions. Although the algorithm
for generating 3D geometries from distances is a first generation and could be improved,
the simple method for predicting the inter-atomic distances is already remarkably accurate
with typical root-mean-squared errors of 0.04 Å.
3.5 Recommendations
3.5.1 Conformer recognition
The lowest energy conformers are used when calculating thermodynamic and kinetic
parameters using quantum mechanics in its simplest form. Ignoring other conformers has
been shown to introduce error in the calculated parameters [43–46]. Many
conformational isomers can be generated using the algorithm described above by repeated
embedding using a single molecular bounds matrix. It would be beneficial to develop
an algorithm to identify conformational isomers that are unique so they can be used in
statistical treatments that account for multiple conformers.
Each transition state conformer would need to be optimized and validated as described
above, adding computation time to the process. The gains in kinetic accuracy need to be
considered if this approach is implemented.
4. Improving the group contribution transition state search method
4.1 Background
Detailed mechanisms help understand complex chemical systems. Such detailed mech-
anisms can contain over 10,000 unique reactions, making their manual construction a falli-
ble and laborious process. Automated mechanism generation can resolve these difficulties,
where published sources are preferentially used to predict reaction rates, with estimation
methods deputizing when data are unavailable. Reaction rate estimates can be quite approx-
imate, but they can be improved using theoretical calculations via transition state theory.
Theoretical calculations are also laborious so a high throughput kinetic calculation method
is required for transition state theory to be used with automated mechanism generators, and
this requires an automated method to predict and find transition state structures. Chapter 3
describes the automated geometry search method.
Several steps are involved in the automatic transition state search method, starting with
the estimation of distances in the reaction center of the transition state using a molecular
group contribution method. The predicted transition state distances are used to create a
3-dimensional transition state estimate by applying distance geometry. The transition state
estimate is used to start a transition state search in order to find the true transition state
structure. Each transition state is validated using a reaction path analysis calculation.
Options exist for some of the steps involved in automating the transition state search,
and it is important to explore these options as they may improve the automated search
method. It was shown in Chapter 3 that increasing training data improves the accuracy
of the molecular group values, but additional data can only improve group values that are
poorly determined. This chapter discusses modifications to the algorithm and their effects
on transition state prediction accuracy. The modifications discussed are changes to the
group contribution prediction method and additions to the sequence of optimization steps
performed on the geometry estimate during the transition state search.
4.2 Methods
4.2.1 Modifying the transition state geometry prediction
Reshaping the molecular group tree
The automated algorithm uses molecular groups to predict distances separating reactive
atoms of transition states. The molecular groups were organized in a hierarchical tree
structure, so that distance predictions were made using the most relevant available data.
Details about the group tree structure were previously discussed in Section 3.2.1.
The structure of the molecular group tree used in Chapter 3 was taken from the kinetics
database of the RMG software. The development of this tree involved several researchers
making independent modifications over a number of years to provide improved kinetic es-
timates for specific fuels. Sometimes modifications were made with the aim of minimizing
disruption of the existing tree, rather than of optimizing the overall tree structure. The un-
coordinated nature of the modifications led to a tree structure that is hierarchical, but lacks
obvious logic in its structure, and was not optimized for best results for transition state
distances. For example, the O H group has descendants that are peroxides except for the
peroxyradical group (Orad O H), which is instead a sibling group.
A new tree structure was developed for comparison to the RMG designed structure, and
to better understand the effect of the tree structure on the predictions of the transition state
reaction center distances. The same top nodes were used for the new tree as they described
all possible reacting molecules for the hydrogen abstraction family. Each descendant gen-
eration in the molecular group tree had a single characteristic defined that was not in the
ancestor generations. Characteristics were also defined earlier (higher in the tree) when
they were thought to have a greater impact on the reaction center distances than other un-
defined characteristics. For example, the children of the head nodes specified the elements
of the wildcard atoms (X and Y), but no bonding or radical electrons were specified be-
cause, while important, they are less critical than the wildcard atom types. This meant that
child nodes to X H or Xanyrad H were H2, C H, and O H (the X is defined as H, C, or O),
while the children of Y anyrad were Hrad, Orad, and Crad. The following two generations
defined the radicals and bonding. For the X branch of the tree the bonding was defined first,
then the radicals; on the Y branch the radicals were defined first, then the bonding. This
convention was continued until the bonding on the nearest neighbor atoms was defined
(the R groups in R X H and R Y rad).
The tree structure from Chapter 3 is referred to as the "original tree" and the structure
described in this section is referred to as the "new tree". Both the original and new tree
structures are provided in Appendix B.
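The way a hierarchical tree lets a prediction use the most relevant available data can be sketched as a walk from a specific group towards the root until a trained value is found. This is a minimal illustration, not the RMG implementation; the class, the labels, and the numerical values are all hypothetical.

```python
class GroupNode:
    """One node of a hierarchical molecular group tree."""

    def __init__(self, label, parent=None, value=None):
        self.label = label
        self.parent = parent
        self.value = value  # trained contribution in Å, or None if untrained
        self.children = []
        if parent is not None:
            parent.children.append(self)


def most_specific_trained(node):
    """Walk towards the root and return the first node that has a trained value."""
    while node is not None:
        if node.value is not None:
            return node
        node = node.parent
    raise LookupError("no trained group between this node and the root")


# Hypothetical branch of the X-H tree: element first, then bonding on the neighbor.
x_h = GroupNode("X-H", value=1.30)             # general top node (made-up value)
o_h = GroupNode("O-H", parent=x_h, value=1.20)
alcohol = GroupNode("C-O-H", parent=o_h)       # specific, but no training data yet
c_h = GroupNode("C-H", parent=x_h)             # untrained, falls back to X-H

for leaf in (alcohol, c_h):
    used = most_specific_trained(leaf)
    print(f"{leaf.label}: using value from {used.label} ({used.value:.2f} Å)")
```

In this picture, defining the most influential characteristics high in the tree means that a fallback to an ancestor still carries the information that matters most for the reaction center distances.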
Tree structure comparison
The original molecular group tree structure was used to automatically find transition
states for hydrogen abstraction reactions in the DIPK combustion model (Chapter 3). The
new group tree was used to estimate the reaction center distances of the transition states
previously found using the original group tree structure. This allowed comparison of the
reaction center distance predictions made with either tree for a given training set, without
repeating all the electronic structure calculations.
Further comparison tested the performance of the molecular group trees for small train-
ing sets. The largest training set from Chapter 3 (1071TS) was randomly sampled to create
many smaller training sets containing data from 44 transition states. With each of the
smaller training sets, group values were trained and distances were predicted then com-
pared to known distances from validated transition states (all contained in 1071TS). This
was done using both the original and new tree structures.
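The sampling procedure can be sketched as follows. The training and prediction steps are left out because they depend on the group tree code; the function names are hypothetical, and integer identifiers stand in for the transition state data.

```python
import math
import random


def sample_training_sets(pool, n_sets=1000, set_size=44, seed=0):
    """Draw n_sets random subsets (without replacement) from the full pool."""
    rng = random.Random(seed)
    return [rng.sample(pool, set_size) for _ in range(n_sets)]


def rms_error(predicted, reference):
    """Root-mean-squared difference between two equal-length sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def fraction_new_tree_better(errors_original, errors_new):
    """Fraction of training sets for which the new tree gave the lower RMS error."""
    wins = sum(new < old for old, new in zip(errors_original, errors_new))
    return wins / len(errors_original)


# Stand-in for the 1071TS pool: integer identifiers instead of real geometries.
pool = list(range(1071))
training_sets = sample_training_sets(pool)
print(len(training_sets), len(training_sets[0]))
```

For each sampled set, both trees would be trained and scored with `rms_error` against the validated distances, and `fraction_new_tree_better` would then summarize the paired results.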
Modifying distance limits and the UFF force field optimization
The use of the predicted transition state distances to create 3-dimensional transition
state estimates was fully described in Section 3.2.1. The transition state estimate was
created based on a molecular bounds matrix, which contains upper and lower distance limits
for each atom pair. For transition state geometries, the distances separating the atoms in
the reaction center of the transition state are unknown. The reaction center distance limits
in the bounds matrix are set as the group contribution distance prediction ± 0.05 Å, creating
a transition state bounds matrix. The transition state estimate is created by positioning
atoms in 3-dimensional space to satisfy the bounds matrix. The initial 3-dimensional
geometry is then optimized using a universal force field (UFF), where the force constant
between every atom pair is calculated based on the distance bounds matrix (default value of
1,200 kcal/(mol Å)).
The group contribution method was developed to provide good reaction center distances,
but setting the distance limits to be the predicted distance ± 0.05 Å could cause the
reaction center distances for two geometries created from the same bounds matrix to differ
by up to 0.1 Å. The difference between the distance limits was reduced to a total of
0.05 Å (± 0.025 Å from the group contribution prediction) to determine the effect of the
distance range.
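Setting the reaction center limits amounts to overwriting a few entries of the bounds matrix. The sketch below assumes a square matrix stored as a list of lists, with upper limits above the diagonal and lower limits below it; that storage convention, the function name, and the example distances are assumptions made for illustration.

```python
def set_reaction_center_bounds(bounds, predicted, half_width=0.05):
    """Write distance windows for the reaction center into a bounds matrix.

    bounds     -- square list of lists; assumed convention: upper limits above
                  the diagonal (i < j), lower limits below it (i > j)
    predicted  -- {(i, j): predicted distance in Å} for reaction center atom pairs
    half_width -- half of the allowed range in Å (0.05 gives the 0.1 Å window)
    """
    for (i, j), distance in predicted.items():
        a, b = min(i, j), max(i, j)
        bounds[a][b] = distance + half_width  # upper limit
        bounds[b][a] = distance - half_width  # lower limit
    return bounds


# Hypothetical reaction center: atom 0 = X, atom 1 = H, atom 2 = Y (made-up distances).
predicted = {(0, 1): 1.20, (1, 2): 1.35, (0, 2): 2.50}
for half_width in (0.05, 0.025):
    bounds = [[0.0] * 3 for _ in range(3)]
    set_reaction_center_bounds(bounds, predicted, half_width)
    print(half_width, [round(bounds[a][b] - bounds[b][a], 3) for a, b in predicted])
```

Halving `half_width` halves the largest possible difference between two geometries embedded from the same matrix, which is the effect tested in this section.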
For the UFF optimization step, the force constants for every atom pair are set based
on the distances in the molecular bounds matrix. This procedure can also modify the
reaction center distances, which is undesirable given the accuracy of the predicted distances.
Increased force constants in the reaction center should restrict the reaction center atoms,
so reaction center force constants were tested at 12,000 kcal/(mol Å) as well as 100,000
kcal/(mol Å). The coupled effects of the distance limit range and force constants were also
tested.
4.2.2 Modifying the transition state optimization sequence
The transition state estimate created by positioning atoms in 3-dimensional space
to satisfy distance limits is then refined via a UFF optimization step. The algorithm uses
the UFF refined transition state estimate to start a surface-walking transition state search.
An alternative approach, inspired by the Automated Alkylation Reaction Optimizer for
N-oxides (AARON) software [81], replaces the UFF optimization step with two partial
refinements. The first freezes the reaction center distances and refines the rest of the geom-
etry, and the second freezes all distances except the reaction center which is refined. The
two refinement steps were run using density functional theory.
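One way the first partial refinement could be set up is with the ModRedundant section of a Gaussian optimization input, where a line of the form `B i j F` freezes the distance between atoms i and j. The text does not state that this mechanism was the one used, so the sketch below is only an assumption about how the frozen distances might be written; the function name and atom indices are hypothetical.

```python
from itertools import combinations


def frozen_distance_lines(reaction_center):
    """Return ModRedundant lines that freeze every distance within the reaction center.

    reaction_center -- 0-based atom indices, e.g. (X, H, Y); Gaussian counts from 1.
    """
    lines = []
    for i, j in combinations(sorted(reaction_center), 2):
        lines.append(f"B {i + 1} {j + 1} F")
    return "\n".join(lines)


# Hypothetical indices for X, H and Y in the estimated geometry.
print(frozen_distance_lines((0, 4, 5)))
```

The second partial refinement, in which only the reaction center is relaxed, would need the complementary set of constraints and is not sketched here.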
Computational Chemistry
Estimated geometries were refined in RDKit using universal force fields (UFF) [100].
Geometry optimization and path analysis calculations were run using M06-2X [107, 123]
with the MG3S [124] basis set in the Gaussian 09 [105] quantum chemistry package.
MG3S is equivalent to 6-311+G(2df,2p) for systems containing H, C, and O atoms [110,
125].
4.3 Results and Discussion
4.3.1 Tree structure and data diversity affect prediction accuracy
The new group tree was trained using the same four training sets used in Chapter 3, and
distance predictions were made for comparison to the 907 optimized transition states from
that chapter (Figure 4.1). Predictions made with the new tree structure showed the same
trends previously reported: the error decreased as the training sets were expanded, but
the change from 827TS to 1071TS was minimal. The new tree structure produced better
estimates than the original tree for small data sets, where the errors were largest.
The original tree provides marginally better estimates when trained using large data sets.
The original tree contained much more detailed groups than the new structure, so the new
tree structure is expected to at least match the accuracy for large data sets if more detailed
groups (more branches in the tree) are added.
Figure 4.1: The RMS error for the distance estimates compared to the optimized transition state distances. [Axes: RMS error (Å) vs. number of TS geometries in the training set (44, 230, 827, 1071); series: original tree, new tree.]
The differences in error observed with the two trees show the importance of the
structure to the distance predictions. While the new tree structure improves the distance
predictions for smaller training sets, other tree structures might be able to further improve
the predictions. The best tree structure may depend on the training data, so to test this 1,000
new training sets containing data from 44 transition states were created by randomly
selecting training data from 1071TS. Each of the 1,000 training sets was used to train both
the original and new molecular group trees, and reaction center distances were predicted
for comparison with the 907 known transition states previously found. In over 85% of the
1,000 cases, the new tree had a lower RMS error than the original tree. The probability
distribution of the RMS errors (Figure 4.2) shows that the predictions are expected to be
more accurate if made using the new tree instead of the original tree structure, given
the small size of the training set. Even though cases where the original structure
outperformed the new tree were in the minority, these cases indicate that tree structures
cannot be selected without considering the training data that will be used to train the group
values in the tree.
Table 4.1: Training set information. As the training set was expanded, the RMS error from the validated transition state distances decreased. The new tree structure performed better when training data were sparse.
Training set   Transition states   Geometries   RMS error (Å)   RMS error (Å)
name           in training set     found        original tree   new tree
44TS           44                  not run      0.181           0.124
230TS          230                 658          0.102           0.088
827TS          827                 734          0.040           0.042
1071TS         1071                not run      0.036           0.041
The RMS error attained using the 44TS training set was 0.181 Å with the original tree
and 0.124 Å with the new tree (Table 4.1). Comparing with the probability
distributions in Figure 4.2, which peak around 0.09 Å, shows that the probability of randomly
selecting from 1071TS the 44 transition states used in 44TS is very low, i.e. they are
strongly correlated and non-random. This lack of variety in the 44TS set is what leads
to the large RMS errors: some specific groups were well trained, but the overall tree was
poorly trained. This shows that the value of each transition state in a training set decreases
when a similar transition state already exists in that training set, i.e. it is important to have
a variety of structures in the training data, distributed evenly across the tree.
Figure 4.2: Probability distribution for the root-mean-squared error of the reaction center distances when training the groups with 44 transition state distances, for the Original and New tree structures. [Axes: probability vs. RMS error (Å); series: P(Original), P(New).]
4.3.2 Manipulating distance limits and force constants can improve UFF optimization
The molecular group contributions have been shown to make good predictions of the
reaction center distances of transition states given the groups are appropriately trained.
The subsequent steps converting the predicted distances into 3-dimensional structures can
modify the well predicted distances. The force constants applied during the UFF optimiza-
tion and the distance limit ranges were modified to minimize the change in the predicted
distances when producing the 3-dimensional estimate.
Figure 4.3: Decreasing the distance limit range and increasing the force constants for the reaction center each reduced the error in the dXH distance introduced during the construction of the 3-dimensional transition state estimate. The error reduction is additive, as seen when combining the modifications.
Reducing the distance limit range from 0.1 Å to 0.05 Å reduced the difference from
the predicted distances in the 3-D structure (Figure 4.3). Further reduction in the distance
limit range is expected to help maintain the reaction center distance predictions when
creating the 3-dimensional geometry. Further testing is required, as it is possible that a
smaller range could prevent successful 3-dimensional geometry construction.
The force constants between reaction center atoms were set to 12,000 kcal/(mol Å),
an order of magnitude increase over the default of 1,200 kcal/(mol Å). This also reduced
the error introduced in the reaction center when creating the 3-dimensional structure.
Applying the increased force constants with the reduction in the distance limit range led to
further restriction of the reaction center distances.
The strategies discussed show ways to modify the algorithm in order to ensure the
predicted reaction center distances are maintained when constructing the transition state
estimate. This is ideal when the molecular groups used are well trained, but there could be
cases where a relevant group value is unknown or poorly determined. It could be advan-
tageous to allow the reaction center distances to be partially modified when there is low
confidence in the accuracy of the molecular group values used to make the predictions.
4.3.3 Replacing the UFF optimization with more robust calculations may improve
transition state prediction
The transition state estimate is created by positioning atoms in 3-dimensional space in
a manner that satisfies a distance bounds matrix, then refined using a UFF optimization.
An alternative approach would replace the UFF optimization with a series of partial den-
sity functional theory optimizations. The partial optimizations were tested on 50 reactions
and the reaction center distances were compared to the validated transition state distance.
Figures 4.4, 4.5, and 4.6 show the sequence of partial optimizations led to an overall im-
provement in the reaction center distances from the molecular group predictions. There was
an increase in error when converting the molecular group prediction into a 3-dimensional
geometry, and this has been discussed in Section 4.3.2.
The average improvement in the reaction center distances due to the sequence of partial
optimizations is promising, but the standard deviation of the dHY and dXY distances raises
questions about the consistency of the method. Despite this, the partial optimizations are
still expected to improve the automated transition state geometry searches, as the remaining
geometry is optimized at the same level of theory used for the transition state search. This
means the transition state geometry estimate provided to the surface walking algorithm is
closer to the true transition state if the partial optimization steps are used instead of the
UFF optimization. The UFF optimization should be used when computational resources
are restricted.
Figure 4.4: Mean and standard deviation of the absolute error in dXH distances from the final optimized transition state at each stage of the transition state prediction process.
Figure 4.5: Mean and standard deviation of the absolute error in dHY distances from the final optimized transition state at each stage of the transition state prediction process.
Figure 4.6: Mean and standard deviation of the absolute error in dXY distances from the final optimized transition state at each stage of the transition state prediction process.
4.4 Conclusion
The group contribution method performs best with well trained groups, but evidence
suggests it can perform reasonably with sparse data if the group tree is designed
thoughtfully. Aside from tree design, predictions can be improved by adding more training
data, and new data are most valuable when they differ from the existing training data.
Modifications to the reaction center distance limits and force constants can
improve the prediction of the transition state structure. The UFF optimization can also be
replaced with partial optimization steps using density functional theory to improve
transition state structure prediction. These approaches provide alternatives to future users
of the automated transition state search method, but further study is required to best apply
these methods.
4.5 Recommendations
4.5.1 Efficient calculation of the molecular group contributions
The molecular group contribution approach classifies transition states as similar based
on the reacting molecular groups. The group contribution is calculated by linear regression
on the reaction center distances of transition states that were classified together. This is a
form of machine learning, but the methods have been hard-coded in the RMG framework,
so a developer must update the software whenever a change is desired. These changes
can be as simple as adding a new molecular group or modifying the structure of the molecular
group tree, which has been shown in this chapter to affect prediction accuracy. The current
method of optimizing the molecular group prediction method is inefficient and does not
scale well, but can be improved by using the scikit-learn package.
The scikit-learn package was developed to allow simple integration of state-of-the-art
machine-learning algorithms in Python [126]. The package contains the necessary
classification and regression tools to determine the molecular group contributions as done in
this work, and can also use alternative techniques to improve the group value predictions.
The package includes methods to test and score the models applied to ensure the best
combination of parameters is used. The scikit-learn package contains methods to automatically
determine the best tree structure for the given data, and does not require a developer to modify
the tree structure whenever a new type of molecule is added, since it will add the necessary
group if a new feature is added to the training data. The scikit-learn package is also
designed for efficiency, and can be implemented within a few lines of
code replacing the few hundred lines hard-coded into the RMG framework.
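The model being fitted is small enough to sketch without any framework. The example below treats a reaction center distance as the sum of one contribution from the X-H group and one from the Y radical group, and fits the contributions by alternating averages, which is a simple least-squares procedure swapped in here for illustration rather than the regression code used in RMG. The group labels and distances are made up. In scikit-learn the same model could presumably be expressed with a one-hot encoding followed by a linear regressor.

```python
from collections import defaultdict


def fit_additive_groups(data, iterations=200):
    """Least-squares fit of distance ~ cx[group_x] + cy[group_y] by alternating averages.

    data -- list of (group_x, group_y, distance in Å) tuples
    """
    cx, cy = defaultdict(float), defaultdict(float)
    for _ in range(iterations):
        # Update the X contributions with the Y contributions held fixed...
        sums, counts = defaultdict(float), defaultdict(int)
        for gx, gy, d in data:
            sums[gx] += d - cy[gy]
            counts[gx] += 1
        for g in sums:
            cx[g] = sums[g] / counts[g]
        # ...then the Y contributions with the X contributions held fixed.
        sums, counts = defaultdict(float), defaultdict(int)
        for gx, gy, d in data:
            sums[gy] += d - cx[gx]
            counts[gy] += 1
        for g in sums:
            cy[g] = sums[g] / counts[g]
    return dict(cx), dict(cy)


def predict(cx, cy, group_x, group_y):
    """Predicted distance for a pair of groups that both appear in the training data."""
    return cx[group_x] + cy[group_y]


# Made-up training distances (Å) for three of the four group combinations.
training = [
    ("O-H", "C_rad", 1.20),
    ("O-H", "O_rad", 1.30),
    ("C-H", "C_rad", 1.40),
]
cx, cy = fit_additive_groups(training)
print(f"{predict(cx, cy, 'C-H', 'O_rad'):.3f}")  # combination absent from the training data
```

The individual contributions are only defined up to a constant shifted between the two sets, but the predicted sums are unique, which is why a combination missing from the training data can still be estimated.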
4.5.2 UFF optimization with constrained optimization
This chapter showed two methods to improve the transition state predictions once
created using distance geometry. The first applied UFF optimization using higher spring
constants for the reaction center distances, and the second bypassed the UFF optimization and
used a series of constrained optimizations to create a good transition state estimate. Their
respective advantages and disadvantages have been discussed previously in this chapter.
Various parameters were used to improve the UFF optimization, and these should be
further explored to ensure the best combination is used. The constrained optimizations can
be applied in addition to the UFF optimization, with the potential of creating a good
transition state estimate at a reduced computational cost. The UFF optimization would provide a
better transition state estimate to the constrained optimizations at a low computational cost,
and the constrained optimizations would require fewer steps when provided a better starting
point.
5. Method extension to new reaction families and automated kinetic
parameter calculation
5.1 Background
Detailed chemical kinetic modeling of complex systems has been aided by software
for automated reaction mechanism generation [1]. One example of such software, Re-
action Mechanism Generator (RMG) [6], uses a rate-based approach to construct kinetic
models [7]. RMG has been applied to systems such as the pyrolysis and combustion of
isobutanol [127], the fast pyrolysis of bio-oil [128], and the auto-oxidation of a biofuel
surrogate [129]. Mechanism generators require thermodynamic and kinetic parameters to
complete the model construction; these parameters are preferentially sourced from exper-
imental measurements or accurate theoretical calculations, but more commonly estimates
are used as most of the required parameters are unknown [20].
Parameter estimation methods provide thermodynamic and kinetic values in a computa-
tionally efficient manner [28]. Estimation methods are typically based on Benson’s group
additivity method for thermochemistry [21], in which group values are first determined
from molecules with known thermodynamics, then used to estimate the thermodynamics
of other molecules. Benson’s group contributions have been used to make adequate ther-
mochemistry predictions for a variety of systems, including hydrocarbons [22, 112] and
silicon hydrides [23, 24]. Despite these successes, group contribution methods have been
difficult to extend to some cases, such as predicting thermodynamics for polycyclic species,
where the ring strain causes the molecule to be poorly described by the sum of its parts.
The RMG software addresses this deficiency in the group additive approach by perform-
ing semi-empirical or quantum mechanical calculations of thermodynamic parameters for
polycyclic species [8].
For estimating reaction kinetics, the Evans-Polanyi relationship is a simple approach in
which the change in enthalpy is used to predict the kinetics of the specific reaction [27]. It
is not always appropriate to apply the Evans-Polanyi relationship, such as in the hydrogen
abstraction by methyl radicals from polynuclear aromatics [29]. An alternative approach
extends group contribution methods to predict kinetic parameters [113–116, 130–132].
Group estimation methods can be automated efficiently, making them useful for mecha-
nism generators when specific reaction rates are unavailable [28]. Group-based predictions
can be further improved using reaction rate rules for increasingly specific reacting groups
[32, 35, 133, 134], but appropriate rate rules are rarely available when studying new sys-
tems. In these situations more general (less specific) rules are used, but the accuracy of the
estimates suffers.
Continuing advances in computing power have made it feasible to try to calculate un-
known kinetic parameters via transition state theory (TST) instead of relying on estimates,
motivating the automation of TST calculations. Reactant and product structures can already
be found using the automated software integrated in RMG to calculate species thermo-
chemistry [8]. The artificial force induced reaction (AFIR) method [75, 135], KinBot [80],
and other methods [76, 77, 79, 81, 136] use computational chemistry software to automati-
cally locate the necessary transition state geometries. Kinetic programs such as CanTherm
[84], VariFlex [83], MultiWell [85], and POLYRATE [137] have been developed to cal-
culate reaction kinetics if provided the quantum chemistry outputs. Integrating geometry
search software with kinetic calculators is a promising route to enable high-throughput
kinetics calculations.
The present chapter describes automated algorithms to locate reactants, products, and
transition states based on distance geometry [8, 136] and their integration with the Can-
Therm [84] code to calculate reaction rate expressions. The automated transition state
geometry search method has been described in the previous chapters of this thesis. The
integrated algorithms are referred to as the Automated Transition State Theory (AutoTST)
calculator. Kinetics calculated with the integrated algorithm were compared to two sets
of rate rule predictions. The first set of rate rules was used in the construction of a butanol
combustion model from the Lawrence Livermore National Laboratory (LLNL) [138], and
the second set of rate rule predictions was taken from RMG. Rate rule predictions and
AutoTST calculated reaction rates were compared to benchmark theoretical calculations,
which showed that the integrated algorithm improved the kinetics when no existing rate rule
is similar to the reaction in question.
5.2 Methods
5.2.1 Computational chemistry
Geometry optimization and path analysis calculations used the M06-2X DFT functional
[107, 123] with the MG3S basis set [124] (equivalent to 6-311+G(2df,2p) for systems
containing C, H, and O) [110, 125] in the Gaussian 09 quantum chemistry package [105]. For
benchmark calculations, electronic energies were computed using the CCSD(T)-F12/RI
method with the cc-pVTZ-F12 [139] and cc-pVTZ-F12-CABS [140] basis sets in ORCA
[141].
5.2.2 Automated geometry searches
Reactant and product structures were located using the automated algorithm developed
in RMG and described by Magoon and Green [8]. Transition state structures were located
using a group contribution method that predicts transition state reaction center distances
using training data of known transition states, and has been described in the previous
chapters of this thesis [136]. The transition state training data used in this study were
optimized and validated at M06-2X with the MG3S basis set, so that predictions were made for
the same electronic structure method used in this study. M06-2X/MG3S provides sufficiently
accurate kinetic parameters at a reasonable computational cost, and is widely available in
computational chemistry packages [141–143]. The method, previously demonstrated for hydrogen
abstraction reactions, was extended to the intramolecular hydrogen migration and
β-scission reaction families. The modifications to the transition state geometry prediction
algorithm are discussed in the following section.
Modifications to the group contribution transition state search
Figure 5.1: Automated transition state search algorithm as described in ref. 136. The steps with bold borders, adapted from the AARON software [81], are deviations from the original algorithm.
The group contribution method for predicting transition state geometries described in
ref. 136 has been modified to improve its performance (Figure 5.1). The distance geometry
algorithm requires upper and lower limits for the distance between every pair of atoms. The
difference between the upper and lower limits for the reaction center distances was previously
set to 0.05 Å, but this was decreased to 0.025 Å due to increased confidence in the
reaction center predictions. Three-dimensional conformers were constructed to satisfy the
distance limits for every atom pair.
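The effect of tightening the reaction center limits can be illustrated with a short sketch. It builds lower and upper distance limits around predicted reaction center distances and checks whether a set of distances satisfies them. Centering the window on the prediction, the atom indices, and the predicted distances are all assumptions made for illustration; the text above only states the width of the window.

```python
def reaction_center_bounds(predicted, window=0.025):
    """Return {atom_pair: (lower, upper)} in angstroms.

    `predicted` maps an atom-index pair to a predicted distance. Centering
    the window on the prediction is an assumption of this sketch.
    """
    half = window / 2.0
    return {pair: (d - half, d + half) for pair, d in predicted.items()}

def satisfies_bounds(distances, bounds, tol=1e-9):
    """Check that every bounded pair distance lies within its limits."""
    return all(lo - tol <= distances[pair] <= hi + tol
               for pair, (lo, hi) in bounds.items())

# Hypothetical hydrogen abstraction reaction center:
# atom 0 (donor), atom 1 (migrating H), atom 2 (acceptor)
predicted = {(0, 1): 1.30, (1, 2): 1.25, (0, 2): 2.54}
bounds = reaction_center_bounds(predicted)                   # 0.025 angstrom window
old_bounds = reaction_center_bounds(predicted, window=0.05)  # previous, looser window

# A trial geometry whose donor-H distance is 0.02 angstrom too long
trial = {(0, 1): 1.32, (1, 2): 1.25, (0, 2): 2.54}
```

In this sketch the trial geometry passes the looser window but fails the tighter one, so embedded conformers are held closer to the predicted reaction center.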
The optimization protocol was also modified, with the transition state geometry pre-
diction algorithm no longer using a universal force field optimization, instead adopting a
protocol similar to that used in the AARON code [81]. The geometry estimate undergoes a
constrained optimization to an energy minimum with the reaction center distances frozen,
followed by a transition state (saddle point) search with all distances frozen except the re-
action center. The resulting geometry is then used for a Berny transition state optimization.
5.2.3 Kinetic calculations
The CanTherm software package was used to determine kinetic parameters using clas-
sical transition state theory [84]. Symmetry numbers for the rate calculations were deter-
mined via point group using the SYMMETRY software [144]. SYMMETRY takes as input
the optimized 3-dimensional geometry and a tolerance to allow for small deviations, and
calculates the point group. The point group is used to determine the symmetry number
[145], and a chirality contribution of +R ln 2 is added for point groups that lack a superpos-
able mirror image. Product geometries and energies were also found for these calculations
so the Eckart model could be applied to determine tunneling corrections [146]. Figure 5.2
provides an overview of the automated kinetic calculation method.
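The symmetry treatment described above can be sketched as a lookup from point group to rotational symmetry number, followed by the entropy correction -R ln(sigma) and the +R ln 2 chirality term. The table below covers only a few common point groups, and the function and dictionary names are hypothetical; this is not the interface of SYMMETRY or CanTherm.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)

# Rotational symmetry numbers for a few common point groups
SYMMETRY_NUMBER = {
    "C1": 1, "Cs": 1, "Ci": 1,
    "C2": 2, "C2v": 2, "C2h": 2,
    "C3": 3, "C3v": 3,
    "D2": 4, "D2h": 4,
    "D3h": 6, "D3d": 6,
    "Td": 12, "Oh": 24,
}

# Point groups in the table that contain only proper rotations, so the
# structure cannot be superposed on its mirror image
CHIRAL_POINT_GROUPS = {"C1", "C2", "C3", "D2"}

def symmetry_entropy(point_group):
    """Entropy correction in J/(mol K): -R ln(sigma), plus R ln 2 if chiral."""
    sigma = SYMMETRY_NUMBER[point_group]
    correction = -R * math.log(sigma)
    if point_group in CHIRAL_POINT_GROUPS:
        correction += R * math.log(2)
    return correction

s_c2 = symmetry_entropy("C2")  # the two terms cancel
s_td = symmetry_entropy("Td")  # -R ln 12
```

A misassigned point group changes sigma directly, which is why the symmetry step can shift the pre-exponential factor without affecting the activation energy.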
Figure 5.2: The automated kinetic calculations involve an automated transition state search (Figure 5.1), automated search for reactant and product geometries [8], and automatically calculating kinetics using CanTherm [84].
5.2.4 Comparison of Automated TST calculations and Rate Rules
Rate rule predictions of hydrogen abstraction, intramolecular hydrogen migration, and
β-scission reactions from a butanol combustion model [138] were compared to the Au-
toTST calculations and the automated rate rule implementation in RMG. Kinetics were
compared at 1000K, since the rate rules were determined for a combustion model in that
temperature range. High pressure limit reaction rates were used for pressure-dependent
rate predictions in the butanol combustion model.
5.2.5 Comparison to benchmark calculations
In some cases there were large differences between rate rule predictions and AutoTST
calculations. For two of these cases from each reaction family, more accurate theoretical
calculations were applied by accounting for anharmonic rotations and improving barrier
heights using coupled-cluster theory. Other inaccuracies were addressed for cases in which
the automated algorithm did not find the lowest energy conformer or incorrectly deter-
mined the symmetry number. These benchmark calculations were compared to the rate
rule predictions and the AutoTST rates.
The geometries for the benchmark calculations were determined using the same DFT
functional and basis set as the automatically calculated rates, but the benchmark calcula-
tions used an ultrafine grid. For the benchmark calculations, the 1-D hindered rotor ap-
proximation was applied [48]. AutoTST did not always find the lowest energy conformer;
when the hindered rotor scans revealed a lower-energy conformer, this was re-optimized
and adopted for the benchmark calculations. Barrier heights were also improved using
single point coupled-cluster calculations (see the ‘Computational chemistry’ section for
details). Symmetry numbers were manually checked and corrected when the AutoTST
approach was incorrect.
These improvements allowed comparison between AutoTST and the benchmark calcu-
lations to identify the sources of error in the AutoTST calculations.
5.3 Results
The butanol combustion model contained 855 hydrogen abstraction, 78 intramolecular
hydrogen migration, and 184 β-scission reactions. For each reaction family, AutoTST
calculated kinetics for approximately 70% of the reactions (Table 5.1).
Table 5.1: Number of reactions for each family contained in the combustion model, and success of the AutoTST algorithm.
Reaction Family                      Number of    Kinetics successfully    Percentage
                                     Reactions    calculated               calculated
Hydrogen abstraction                 855          598                      69.9
Intramolecular hydrogen migration    78           52                       66.7
β-scission                           184          131                      71.2
Total                                1117         781                      69.9
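The percentages in Table 5.1 follow directly from the reaction counts. The short check below recomputes them; the variable and function names are illustrative only.

```python
# (reactions in the model, reactions with kinetics successfully calculated)
counts = {
    "Hydrogen abstraction": (855, 598),
    "Intramolecular hydrogen migration": (78, 52),
    "β-scission": (184, 131),
}

def percentage(total, calculated):
    """Success rate in percent, rounded to one decimal place."""
    return round(100.0 * calculated / total, 1)

rates = {family: percentage(*pair) for family, pair in counts.items()}
total_reactions = sum(total for total, _ in counts.values())
total_calculated = sum(done for _, done in counts.values())
overall = percentage(total_reactions, total_calculated)
```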
5.3.1 Comparison of automated TST calculations and rate rules
Kinetic parameters calculated using AutoTST were compared to parameters predicted
using rate rules, applied both manually and automatically. The AutoTST kinetics
corresponded with the rate rule predictions, with most rate rule estimates falling within an
order of magnitude (10¹) of the AutoTST values (Figure 5.3). Despite the overall trend, a
number of reactions had significant discrepancies between AutoTST rates and the rate rule
predictions. Six of the reactions with significant discrepancies, two from each reaction
family, were selected for benchmark calculations to determine the accuracy of the three
prediction sources.
5.3.2 Comparing predictions to benchmark calculations
Table 5.2: Reactions compared to benchmark calculations.
Label   Family                        Reaction
R1      H abstraction                 C2H5OO· + C2H6 ⇌ C2H5OOH + ·CH2CH3
R2      H abstraction                 ·OOH + CH3C(=O)C2H5 ⇌ H2O2 + ·CH2C(=O)C2H5
R3      Intramolecular H migration    O=CHCH2OO· ⇌ O=C·CH2OOH
R4      Intramolecular H migration    CH3C(CH3)(C=O)OO· ⇌ CH3C(CH3)(·C=O)OOH
R5      β-scission                    CO2 + ·CH3 ⇌ CH3C(=O)O·
R6      β-scission                    CH2C(CH3)CH=O + HO2· ⇌ ·CH2C(CH3)(CH=O)OOH
Figure 5.3: Rate rule estimates (y-axis) plotted against automated algorithm TST calculations (x-axis) at 1000 K. [Panel: Hydrogen Abstraction. x-axis: k(T=1000 K) [cm³/mol/s] by AutoTST; y-axis: k(T=1000 K) [cm³/mol/s] by LLNL or RMG; both axes logarithmic, 10⁵ to 10¹⁴. Legend: LLNL, RMG-Py, Parity, 1 Order of Magnitude.]
Six reactions were selected for comparison to benchmark calculations, with two
selected from each reaction family (Table 5.2). The reactions were selected if there was a
10² (two orders of magnitude) discrepancy between the automatically calculated rate and both
rate rule predictions at 1000 K. For reaction 5, the rate from the LLNL model was provided in
the reverse direction, so the rate shown was calculated using the provided rate and
thermodynamics from the model. The rate rule predictions better replicated the benchmark
calculations than AutoTST for reaction 1 (Figure 5.4a), but the AutoTST calculations showed
better agreement than the rate rules for reactions 2 through 6 (Figures 5.4b, 5.5, and 5.6).
This is addressed in the discussion section.
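The selection criterion above can be sketched as a filter on the base-10 logarithm of the ratio between rate coefficients. The reaction labels and rate coefficients below are hypothetical, and treating the criterion as "at least two orders of magnitude" is an assumption of this sketch.

```python
import math

def log10_discrepancy(k_reference, k_estimate):
    """Absolute difference in log10(k) between two rate coefficients."""
    return abs(math.log10(k_estimate / k_reference))

def flag_for_benchmark(k_autotst, k_llnl, k_rmg, threshold=2.0):
    """True if both rate rule estimates differ from AutoTST by >= 10**threshold."""
    return (log10_discrepancy(k_autotst, k_llnl) >= threshold
            and log10_discrepancy(k_autotst, k_rmg) >= threshold)

# Hypothetical rate coefficients at 1000 K in cm³/mol/s: (AutoTST, LLNL, RMG)
candidates = {
    "rxn_a": (1.0e9, 2.0e11, 5.0e11),  # both rate rules more than 10² away
    "rxn_b": (1.0e9, 3.0e9, 4.0e11),   # only one rate rule far away
}
selected = [name for name, ks in candidates.items() if flag_for_benchmark(*ks)]
```

Requiring both rate rule sources to disagree with AutoTST keeps the benchmark set focused on cases where the automated calculation, rather than one particular rate rule, is the outlier.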
5.3.3 Sources of error in the automated calculations
Discrepancies between the AutoTST calculations and the benchmark calculations
presented an opportunity to identify sources of error in the AutoTST method. Geometries
were determined using the same DFT electronic structure method, but in some cases the
automated geometry search methods had not found the lowest energy conformer, and the
automated method to determine symmetry was not always correct. Other differences were
the use of coupled-cluster calculations to improve barrier height calculation, and addressing
anharmonic rotations using the 1-D hindered rotor treatment, all done for the benchmark
calculations. Each correction was individually removed from the benchmark calculations
and replaced with the equivalent used for the automated calculations.
Figure 5.4: Comparison of kinetic estimates for hydrogen abstraction reactions. [Panels (a) and (b): log₁₀(k / (cm³/mol/s)) versus 1000 K / T. Series: Benchmark, Algorithm, LLNL, RMG-Py.]
Table 5.3 displays the difference in activation energy relative to the benchmark calculation
for each source of error, and Table 5.4 shows the changes to the A factor due to the
same effects. Figure 5.7 shows the magnitude of this difference for the rate coefficient at
1000 K. The major source of error for AutoTST calculations was the lack of treatment of
anharmonic rotors, but this was not consistent for all reactions since some contained few
rotors (e.g. R5). Symmetry was also a major source of error when the automated method
incorrectly determined symmetry numbers. This was not consistent for all tested reactions
as the automated method correctly determined symmetry for some cases. As expected, the
activation energy is unaffected by correcting the symmetry number (Table 5.3). AutoTST
was not always successful in finding the lowest energy conformer for all structures, which
contributed to errors of varying degrees. The intramolecular hydrogen migration reactions
were most affected by this, where a single wrong conformer would contribute
significantly to an error in the barrier height. Correcting the DFT energy had little effect on the
rate calculations in the combustion temperature range, but at lower temperatures the DFT
energy led to rates that were approximately an order of magnitude (10¹) away from the
benchmark calculations.
Figure 5.5: Comparison of kinetic estimates for intramolecular hydrogen migration reactions R3 (a) and R4 (b). [Axes: log₁₀(k / s⁻¹) versus 1000 K / T. Series: Benchmark, Algorithm, LLNL, RMG-Py.]
Table 5.3: Difference in the activation energy (kJ/mol) compared to the benchmark calculations. Kinetics fitted to Arrhenius form between 600 K and 2000 K.
Reaction   Benchmark   Inaccurate   Hindered   Incorrect   Incorrect   Overall
           Ea          Energy       Rotors     Symmetry    Conformer   Discrepancy
R1         111.98      -5.46        -0.34      0.00        -0.91       -6.01
R2         119.38      -9.92        -20.70     0.00        +4.57       -24.53
R3         88.54       +6.16        +0.28      0.00        -11.05      -4.06
R4         85.19       +7.03        +1.08      0.00        -12.52      -3.35
R5         97.24       -7.73        0.00       0.00        -2.52       -7.68
R6         84.89       -6.74        -8.35      0.00        +0.01       +1.91
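The Arrhenius parameters compared in these tables come from fitting rate coefficients between 600 K and 2000 K. As a minimal sketch of such a fit, the snippet below performs a linear least-squares regression of ln k against 1/T. It assumes the simple two-parameter Arrhenius form and uses synthetic data; the actual fitting in this work was done by the kinetics software, and the function name is hypothetical.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def fit_arrhenius(temperatures, rate_coefficients):
    """Least-squares fit of ln k = ln A - Ea / (R T); returns (A, Ea in kJ/mol)."""
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(k) for k in rate_coefficients]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope * R

# Temperature grid from 600 K to 2000 K in 100 K steps
temperatures = [600.0 + 100.0 * i for i in range(15)]

# Synthetic rate coefficients generated from known (hypothetical) parameters
true_a, true_ea = 1.0e13, 100.0
rates = [true_a * math.exp(-true_ea / (R * t)) for t in temperatures]

a_fit, ea_fit = fit_arrhenius(temperatures, rates)
```

Because the fit trades off A against Ea over the chosen temperature window, a change in one source of error can move both parameters at once, which is why Tables 5.3 and 5.4 are best read together.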
5.4 Discussion
AutoTST found transition states and calculated kinetics for 70% of the 1117 reactions
tested. The percentage successfully calculated was consistent across all reaction families,
so the AutoTST success rate is independent of the reaction type.
Figure 5.6: Comparison of kinetic estimates for β-scission reactions R5 (a) and R6 (b). [Axes: log₁₀(k / (cm³/mol/s)) versus 1000 K / T. Series: Benchmark, Algorithm, LLNL, RMG-Py.]
Table 5.4: Difference in the log10 of the A factor compared to the benchmark calculations. Kinetics fitted to Arrhenius form between 600 K and 2000 K. R3 and R4 are in [s⁻¹] and the rest are in [cm³/(mol s)].
Reaction   Benchmark   Inaccurate   Hindered   Incorrect   Incorrect   Overall
           log10 A     Energy       Rotors     Symmetry    Conformer   Discrepancy
R1         13.96       0.020        -0.772     -1.079      0.000       -2.337
R2         13.27       0.010        -2.659     0.000       -0.016      -1.854
R3         12.00       -0.001       0.416      0.000       0.004       0.620
R4         11.64       -0.002       1.253      0.000       0.004       1.369
R5         12.83       0.000        0.000      -1.079      0.000       -1.169
R6         11.53       0.000        -0.483     0.000       0.000       0.141
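For a two-parameter Arrhenius expression, a shift in log10 A and a shift in Ea combine into a shift in log10 k at a given temperature as Δlog10 k = Δlog10 A - ΔEa / (R T ln 10). The sketch below applies this relation to the "Overall Discrepancy" columns of Tables 5.3 and 5.4 at 1000 K. The function name is hypothetical, and the result carries whatever sign convention the tables use for "difference compared to the benchmark".

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def delta_log10_k(delta_log10_a, delta_ea, temperature):
    """Shift in log10(k) implied by shifts in log10(A) and Ea (kJ/mol)."""
    return delta_log10_a - delta_ea / (R * temperature * math.log(10))

# (overall Δlog10 A from Table 5.4, overall ΔEa in kJ/mol from Table 5.3)
overall = {
    "R1": (-2.337, -6.01),
    "R2": (-1.854, -24.53),
    "R3": (0.620, -4.06),
    "R4": (1.369, -3.35),
    "R5": (-1.169, -7.68),
    "R6": (0.141, 1.91),
}

shift_1000K = {label: delta_log10_k(d_log_a, d_ea, 1000.0)
               for label, (d_log_a, d_ea) in overall.items()}
```

At 1000 K, 1 kJ/mol of activation energy corresponds to roughly 0.05 in log10 k, so the large Ea discrepancy for R2 is partly offset by its A factor discrepancy. As an arithmetic aside, the -1.079 entries in the symmetry column equal -log10(12), which would be consistent with a factor of 12 difference in the symmetry factor, although the underlying symmetry numbers are not listed here.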
Reaction rate rules, when used appropriately, can provide good rate predictions. The
strong correlation between the rate rules and AutoTST calculations is evidence for the
continued use of rate rules. Rate rule use is computationally efficient, so rate rules should be
applied for reactions where the structure of the reacting molecular groups is similar to that
of the rate rule. When such rate rules are unavailable, AutoTST now provides an alternative
method to determine kinetics.
The comparison of the 6 reactions with large discrepancies between rate rule predictions
and AutoTST shows the automated method performs well for all tested reaction families.
This is particularly true when considering the performance of the kinetics across a wide
temperature range, where the kinetics calculated with AutoTST trend well with the high
accuracy calculations, but the rate rules perform best in the temperature range for which
they were developed.
Figure 5.7: Magnitude of the sources of error in each automated algorithm compared to its respective benchmark calculation. The summation of all errors is represented by the algorithm calculation. [x-axis: Δlog(k(T=1000K)), -2.0 to 2.0; rows: R1 to R6. Series: DFT Energy, No Hindered Rotor Correction, Automated Symmetry, Automated Conformer, Fully Automated Algorithm.]
For one case (R1), both rate rule estimates outperformed the AutoTST calculation. The
rate rule used in the LLNL model was developed for RO2· + C2H6 ⇌ ROOH + C2H5 [147], and the
RMG rate rule was developed for HO2· + C2H6 ⇌ H2O2 + C2H5 [148]. The accuracy of
the rate predictions should be expected since the rate rules were developed for reactions quite
similar to R1, and in such cases AutoTST should not be used since the rate rules can
provide a good rate prediction at a far lower computational cost. AutoTST outperformed
the rate rule predictions for all the other reactions.
The RMG rate prediction for R3 was made from a generalized rate rule, so the kinetic
data used to make the prediction were very unlike the reaction, leading to the large
discrepancy in the reaction rate. The value used in the LLNL model was not an estimate but a
theoretically calculated value for the specific reaction [149]. Comparison to the benchmark
calculations shows it is sufficiently accurate to describe this reaction. In other such cases,
when specific reaction rates are available, the available data should be relied on, bypassing
AutoTST.
Not treating anharmonic rotations, determining symmetry incorrectly, and not finding
the lowest energy conformer were the major sources of error for AutoTST. Improving bar-
rier heights with coupled-cluster calculations showed that using DFT energies also con-
tributed to errors, but was less significant. While all need to be addressed, automating
hindered rotors and providing a more robust symmetry determining algorithm should be
targeted. Automating hindered rotor calculations will also help identify the lowest energy
conformer.
5.5 Conclusion
AutoTST has calculated kinetics for approximately 70% of all tested reactions. The
method is extensible, and has now been applied to hydrogen abstraction, intramolecular
hydrogen migration and β-scission reactions. The successful extension of AutoTST moti-
vates further work to include other reaction families with a reaction barrier.
Good kinetic estimates can be calculated using the automated algorithm, and the proto-
col should be used for mechanism generation. It has also been shown to outperform other
kinetic estimation methods when specific rate rules are unavailable for a reaction. The cur-
rent estimation methods should not be abandoned, as they can still provide good kinetic
predictions if used appropriately (R1) and in a computationally efficient manner. AutoTST
should be used to target reactions where the kinetics are estimated with more generalized
rate rules.
The current major sources of error in the automated kinetics are improperly determin-
ing symmetry, and not accounting for internal rotor contributions. Based on these errors,
the automated algorithm can be further improved to provide even better parameters. Im-
proved automated methods for determining symmetry numbers would reduce the uncer-
tainty in AutoTST calculations. Internal rotor contributions could be included in the calcu-
lations by automating hindered rotor calculations. The hindered rotor calculations would
also help identify an existing lower energy conformer, correcting cases where the lowest
energy conformer was not automatically found. The additional computational cost of au-
tomating coupled-cluster calculations to improve barrier heights would have to be balanced
against available computational resources, as using the DFT energy was not a major source
of discrepancy. Despite these sources of error, the method can provide improved kinetic
parameters for many reactions in microkinetic models, reducing the uncertainty of these
models.
5.6 Recommendations
5.6.1 Improve symmetry number calculation
Symmetry numbers for the rate calculations were determined using the SYMMETRY
software [144], developed in 2003. The software takes optimized geometries and tries to
determine their respective point groups. A tolerance is also used to allow small deviations
from the exact symmetry. The symmetry number can then be determined from the point
group. The calculations in this chapter show that the automated symmetry calculations are
imperfect, and can have a large effect on the calculated reaction kinetics when they are
incorrect.
It is often difficult to correctly determine symmetry numbers manually, and it has been
shown to be the source of discrepancies between kinetics calculated by different experi-
enced research groups [150]. Efforts have been made to standardize the application of
symmetry numbers in transition state theory [151], and recently a new automated approach
has been developed to determine symmetry numbers from an augmented chemical graph
[152].
These methods could potentially resolve issues with symmetry numbers, or at least
reduce the error associated with symmetry number calculation in the automated method
described in this chapter. This should provide an overall improvement in the performance
of the automated transition state theory method.
5.6.2 Automate hindered rotor calculations
Hindered rotor calculations help account for anharmonic effects that are neglected when
applying classical transition state theory. Comparison of the benchmark calculations with
the automated kinetics in this chapter showed the importance of accounting for hindered ro-
tors when dealing with larger molecules. These calculations, like transition state searches,
are computationally and labor intensive. Automating these calculations would be beneficial,
as it would remove the need to set them up manually and allow the computationally
intensive calculations to be moved to distributed computing resources for more
efficient calculations.
The 1-D hindered rotor approximation is the simplest method to account for hindered rotors
[48]. This method was used for the benchmark calculations described in this chapter. An
automated approach would need to reliably determine the rotors, but this can be achieved
by using the chemical graph used to construct the 3-dimensional transition state estimate.
The computational cost added to the kinetic calculations would vary depending on the size
of the reacting molecules.
Automating 1-D hindered rotor calculations will have an added benefit of identifying
if the conformer used is not the lowest energy conformer. The calculation explores all
rotations in a conformer and will identify if another conformation has a lower energy. The
lower energy conformer can be extracted and used in the subsequent kinetic calculation.
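The conformer check described above can be sketched as a scan over one rotor: if any scan point lies below the starting geometry by more than a small tolerance, that point is returned as a candidate for re-optimization. The scan energies, step size, and tolerance below are hypothetical, and the function name is illustrative.

```python
def find_lower_conformer(scan, tolerance=0.5):
    """Return (angle, energy) of the lowest scan point if it lies more than
    `tolerance` kJ/mol below the starting geometry at angle 0.0, else None."""
    start_energy = scan[0.0]
    angle, energy = min(scan.items(), key=lambda item: item[1])
    if energy < start_energy - tolerance:
        return angle, energy
    return None

# Hypothetical relative energies (kJ/mol) from a 1-D scan in 60 degree steps,
# with the starting conformer at 0 degrees defining the zero of energy
scan = {0.0: 0.0, 60.0: 9.5, 120.0: -3.2, 180.0: 11.0, 240.0: 1.4, 300.0: 8.7}
lower = find_lower_conformer(scan)
```

In a full workflow the returned geometry would be re-optimized and the scans repeated from the new minimum, since a rotation about one bond can change the preferred orientation of the others.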
Accounting for hindered rotors will also benefit automated thermodynamic calcula-
tions, further improving mechanism generation and allowing its expansion to new chemical
systems.
6. Summary
This thesis describes an automated method to calculate chemical kinetic parameters
using ab initio quantum chemistry methods. The method is referred to as AutoTST, and
requires only a 2-dimensional representation of each reaction, making it ideal for
high-throughput kinetics.
AutoTST uses distance geometry to convert the 2-dimensional molecular representations
into 3-dimensional geometries. The distance geometry approach cannot provide a
full description of the transition state geometry, so a machine learning approach is used to
predict the unknown distances of the transition state. The reactant, product, and transition
state geometries are refined using ab initio quantum chemistry programs, so that molecular
properties can be calculated. The kinetic parameters of a reaction are calculated using the
molecular properties via transition state theory.
The transition state geometry prediction method has been shown to have a 70% success
rate, with room for improvement. It has been applied to hydrogen abstraction,
intramolecular hydrogen migration, and β-scission reactions, which suggests that the transition
state geometry search is extensible to other reaction types with an energy barrier. This
represents a significant increase in efficiency, where thousands of reaction kinetic parameters
are required to construct complex chemical reaction networks.
The transition state search method was integrated with two other automated methods,
creating the AutoTST method. The first automatically determines geometries for stable
species, which are the reactants and products of a reaction. The second applies transition
state theory when provided the optimized reactant, product, and transition state structures
from computational chemistry programs.
Kinetics calculated with AutoTST were compared to rate rule approximations, the best
available existing methods to estimate unknown reaction kinetic parameters. Rate rules
performed well when used appropriately, but were outperformed by AutoTST when they
were applied beyond their intended scope. Data scarcity means rate rules are often applied
outside their intended use, but now AutoTST offers an alternative approach to determine
chemical kinetic parameters that are unknown.
While AutoTST can be used in its current form to provide kinetics with reasonable ac-
curacy, comparison of its kinetics to high fidelity theoretical calculations showed AutoTST
kinetics contained sources of error that can be addressed. Symmetry numbers for each
reacting species and the transition state are used to calculate reaction kinetics via transition
state theory, and these numbers are determined automatically in AutoTST. The algorithm
used to determine symmetry numbers is imperfect, and improved methods to determine
symmetry numbers are an active area of research.
Classical transition state theory calculations should be conducted using the lowest en-
ergy conformer. The AutoTST algorithm sometimes used a higher energy conformer,
adding error to the kinetic parameters. The method also did not account for anharmonicity
in the potential energy surface. These sources of error can be addressed by automating
hindered rotor calculations. The 1-dimensional hindered rotor approximation is sufficient
to address anharmonicity and to identify cases where a lower energy conformer exists.
The 1-dimensional hindered rotor approximation method can be automated by using the
geometries found using AutoTST and specifying rotation axes.
This thesis describes the AutoTST method, an automated approach to provide kinetic
calculations for use in mechanism generation. AutoTST can currently provide useful ki-
netics for mechanism generation, allowing rate rules to be applied only when they are
appropriate. This would represent a reduction in the uncertainty of detailed kinetic models.
References
[1] Edward S Blurock, Frederique Battin-Leclerc, Tiziano Faravelli, and William H Green. Automatic Generation of Detailed Mechanisms. In Cleaner Combustion, pages 59–92. Springer London, London, September 2013.
[2] Joshua W Allen, Adam M Scheer, Connie W Gao, Shamel S Merchant, Subith S Vasu, Oliver Welz, John D Savee, David L Osborn, Changyoul Lee, Stijn Vranckx, Zhandong Wang, Fei Qi, Ravi X Fernandes, William H Green, Masood Z Hadi, and Craig A Taatjes. A coordinated investigation of the combustion chemistry of diisopropyl ketone, a prototype for biofuels produced by endophytic fungi. Combust. Flame, 161(3):711–724, March 2014.
[3] Hsi-Wu Wong, Xuegeng Li, Mark T Swihart, and Linda J Broadbelt. Detailed Kinetic Modeling of Silicon Nanoparticle Formation Chemistry via Automated Mechanism Generation. J. Phys. Chem. A, 108(46):10122–10132, November 2004.
[4] S M Sarathy, Charles K Westbrook, M Mehl, W J Pitz, C Togbe, Philippe Dagaut, H Wang, M A Oehlschlaeger, U Niemann, K Seshadri, P S Veloo, C Ji, F N Egolfopoulos, and T Lu. Comprehensive chemical kinetic modeling of the oxidation of 2-methylalkanes from C7 to C20. Combust. Flame, 158(12):2338–2357, December 2011.
[5] Tianfeng Lu and Chung K Law. Toward accommodating realistic fuel chemistry in large-scale computations. Prog. Energ. Combust., 35(2):192–215, April 2009.
[6] Connie W Gao, Joshua W Allen, William H Green, and Richard H West. Reaction Mechanism Generator: Automatic construction of chemical kinetic mechanisms. Comput. Phys. Commun., February 2016.
[7] Roberta G Susnow, Anthony M Dean, William H Green, P Peczak, and Linda J Broadbelt. Rate-Based Construction of Kinetic Models for Complex Systems. J. Phys. Chem. A, 101(20):3731–3740, May 1997.
[8] Gregory R Magoon and William H Green. Design and implementation of a next-generation software interface for on-the-fly quantum and force field calculations in automated reaction mechanism generation. Comput. Chem. Eng., 52:35–45, December 2012.
[9] Ivar Ugi, Johannes Bauer, Josef Brandt, Josef Friedrich, Johann Gasteiger, Clemens Jochum, and Wolfgang Schubert. New Applications of Computers in Chemistry. Angew. Chem. Int. Edit., 18(2):111–123, February 1979.
[10] F P Di Maio and P G Lignola. KING, a KInetic Network Generator. Chem. Eng. Sci., 47(9-11):2713–2718, June 1992.
[11] Linda J Broadbelt, Scott M Stark, and Michael T Klein. Computer generated reaction networks: on-the-fly calculation of species properties using computational quantum chemistry. Chem. Eng. Sci., 49(24):4991–5010, December 1994.
[12] Eliseo Ranzi, Tiziano Faravelli, Paolo Gaffuri, and Angelo Sogaro. Low-temperature combustion: Automatic generation of primary oxidation reactions and lumping procedures. Combust. Flame, 102(1-2):179–192, July 1995.
[13] Edward S Blurock. Reaction: System for Modeling Chemical Reactions. J. Chem. Inf. Model., 35(3):607–616, May 1995.
[14] G M Come, V Warth, Pierre-Alexandre Glaude, R Fournet, Frederique Battin-Leclerc, and G Scacchi. Computer-aided design of gas-phase oxidation mechanisms: Application to the modeling of n-heptane and iso-octane oxidation. Symposium (International) on Combustion, 26(1):755–762, January 1996.
[15] Andras Nemeth, Tamas Vidoczy, Karoly Heberger, Zsolt Kuti, and Janos Wagner. MECHGEN: Computer Aided Generation and Reduction of Reaction Mechanisms. J. Chem. Inf. Model., 42(2):208–214, February 2002.
[16] Srinivas Rangarajan, Aditya Bhan, and Prodromos Daoutidis. Rule-based generation of thermochemical routes to biomass conversion. Ind. Eng. Chem. Res., 49(21):10459–10470, 2010.
[17] Nick M Vandewiele, Kevin M Van Geem, Marie-Francoise Reyniers, and Guy B Marin. Genesys: Kinetic model construction using chemo-informatics. Chemical Engineering Journal, 207-208:526–538, October 2012.
[18] Nenad Trinajstic. Computational chemical graph theory: characterization, enumeration, and generation of chemical structures by computer methods. E. Horwood, 1991.
[19] L P Cordella, P Foggia, C Sansone, and M Vento. A (sub)graph isomorphism algorithm for matching large graphs. IEEE Trans Pattern Anal Mach Intell, 26(10):1367–1372, October 2004.
[20] Linda J Broadbelt and Jim Pfaendtner. Lexicography of kinetic modeling of complex reaction networks. AIChE J., 51(8):2112–2121, 2005.
[21] Sidney W Benson, F R Cruickshank, D M Golden, Gilbert R Haugen, H E O’Neal, A S Rodgers, Robert Shaw, and R Walsh. Additivity rules for the estimation of thermochemical properties. Chem. Rev., 69(3):279–324, June 1969.
[22] Raman Sumathi and William H Green. Thermodynamic Properties of Ketenes: Group Additivity Values from Quantum Chemical Calculations. J. Phys. Chem. A, 106(34):7937–7949, August 2002.
[23] Hsi-Wu Wong, Juan Carlos Alva Nieto, Mark T Swihart, and Linda J Broadbelt. Thermochemistry of Silicon−Hydrogen Compounds Generalized from Quantum Chemical Calculations. J. Phys. Chem. A, 108(5):874–897, February 2004.
[24] Andrew J Adamczyk and Linda J Broadbelt. Thermochemical Property Estimation of Hydrogenated Silicon Clusters. J. Phys. Chem. A, 115(32):8969–8982, July 2011.
[25] Suarwee Snitsiriwat, Rubik Asatryan, and Joseph W Bozzelli. Thermochemical Properties for n-Propyl, iso-Propyl, and tert-Butyl Nitroalkanes, Alkyl Nitrites, and Their Carbon-Centered Radicals. Int. J. Chem. Kinet., 42(3):181–199, March 2010.
[26] Tsan H Lay, Joseph W Bozzelli, Anthony M Dean, and Edward R Ritter. Hydrogen Atom Bond Increments for Calculation of Thermodynamic Properties of Hydrocarbon Radical Species. J. Phys. Chem., 99(39):14514–14527, September 1995.
[27] M G Evans and M Polanyi. Further considerations on the thermodynamics of chemical equilibria and reaction rates. Trans. Faraday Soc., 1936.
[28] J Yu, R Sumathi, and William H Green. Accurate and efficient method for predicting thermochemistry of polycyclic aromatic hydrocarbons bond-centered group additivity. J. Am. Chem. Soc., 126(39):12685–12700, 2004.
[29] K Hemelsoet, V Van Speybroeck, Guy B Marin, Paul Geerlings, Michel Waroquier, and Frank De Proft. Reactivity indices for radical reactions involving polyaromatics. The Journal of . . . , 2004.
[30] Mark T Swihart and Steven L Girshick. Thermochemistry and Kinetics of Silicon Hydride Cluster Formation during Thermal Decomposition of Silane. J. Phys. Chem. B, 103(1):64–76, January 1999.
[31] R Sumathi, Hans-Heinrich Carstensen, and William H Green. Reaction Rate Prediction via Group Additivity, Part 2: H-Abstraction from Alkenes, Alkynes, Alcohols, Aldehydes, and Acids by H Atoms. J. Phys. Chem. A, 105(39):8969–8984, October 2001.
[32] Hans-Heinrich Carstensen and Anthony M Dean. Rate Constant Rules for the Automated Generation of Gas-Phase Reaction Mechanisms. J. Phys. Chem. A, 113(2):367–380, January 2009.
[33] Sumathy Raman and Hans-Heinrich Carstensen. Tree structure for intermolecular hydrogen abstraction from hydrocarbons (C/H) and generic rate constant rules for abstraction by vinyl radical. Int. J. Chem. Kinet., 44(5):327–349, May 2012.
[34] Stephanie M Villano, Lam K Huynh, Hans-Heinrich Carstensen, and Anthony M Dean. High-pressure rate rules for alkyl + O2 reactions. 2. The isomerization, cyclic ether formation, and β-scission reactions of hydroperoxy alkyl radicals. J. Phys. Chem. A, 116(21):5068–5089, May 2012.
[35] Stephanie M Villano, Hans-Heinrich Carstensen, and Anthony M Dean. Rate Rules, Branching Ratios, and Pressure Dependence of the HO2 + Olefin Addition Channels. J. Phys. Chem. A, 117(30):6458–6473, August 2013.
[36] Max Trautz. Das Gesetz der Reaktionsgeschwindigkeit und der Gleichgewichte in Gasen. Bestätigung der Additivität von Cv-3/2R. Neue Bestimmung der Integrationskonstanten und der Moleküldurchmesser. Zeitschrift für anorganische und allgemeine Chemie, 96(1):1–28, June 1916.
[37] Keith J Laidler and M Christine King. Development of transition-state theory. J. Phys. Chem., 87(15):2657–2664, July 1983.
[38] Cyril Norman Hinshelwood. Kinetics of chemical change in gaseous systems. 1926.
[39] Henry Eyring. The Activated Complex and the Absolute Rate of Chemical Reactions. Chem. Rev., 17(1):65–77, August 1935.
[40] E Wigner. On the penetration of potential energy barriers in chemical reactions. Z Phys Chem Abt B, 1932.
[41] Donald G Truhlar and Bruce C Garrett. Variational Transition State Theory. Annu. Rev. Phys. Chem., 35(1):159–189, October 1984.
[42] Jingjing Zheng, Tao Yu, Ewa Papajak, Ionut M Alecu, Steven L Mielke, and Donald G Truhlar. Practical methods for including torsional anharmonicity in thermochemical calculations on complex molecules: the internal-coordinate multi-structural approximation. Phys. Chem. Chem. Phys., 13(23):10885–10907, June 2011.
[43] Tao Yu, Jingjing Zheng, and Donald G Truhlar. Statistical thermodynamics of the isomerization reaction between n-heptane and isoheptane. Phys. Chem. Chem. Phys., 14(2):482–494, January 2012.
[44] Prasenjit P Seal, Ewa E Papajak, Tao Yu, and Donald G Truhlar. Statistical thermodynamics of 1-butanol, 2-methyl-1-propanol, and butanal. J. Chem. Phys., 136(3):034306–034306, January 2012.
[45] T Yu, Jingjing Zheng, and Donald G Truhlar. Multi-structural variational transition state theory. Kinetics of the 1,4-hydrogen shift isomerization of the pentyl radical with torsional anharmonicity. Chem. Sci., 2:2199–2213, 2011.
[46] Xuefei Xu, Ewa E Papajak, Jingjing Zheng, and Donald G Truhlar. Multi-structural variational transition state theory: kinetics of the 1,5-hydrogen shift isomerization of the 1-butoxyl radical including all structures and torsional anharmonicity. Phys. Chem. Chem. Phys., 14(12):4204–4216, March 2012.
[47] Jingjing Zheng and Donald G Truhlar. Multi-path variational transition state theory for chemical reaction rates of complex polyatomic species: ethanol + OH reactions. Faraday Discuss., 157(0):59–88, 2012.
[48] Jim Pfaendtner, Xinrui Yu, and Linda J Broadbelt. The 1-D hindered rotor approximation. Theor. Chem. Acc., 118(5-6):881–898, July 2007.
[49] James J P Stewart. Optimization of parameters for semiempirical methods V: modification of NDDO approximations and application to 70 elements. J Mol Model, 13(12):1173–1213, May 2007.
[50] Charlotte Froese Fischer. General Hartree-Fock program. Comput. Phys. Commun., 43(3):355–365, February 1987.
[51] Martin Head-Gordon, John A Pople, and Michael J Frisch. MP2 energy evaluation by direct methods. Chem. Phys. Lett., 153(6):503–506, December 1988.
[52] R Krishnan and John A Pople. Approximate fourth-order perturbation theory of the electron correlation energy. Int. J. Quantum Chem., 14(1):91–100, July 1978.
[53] H G Kummel. A biography of the coupled cluster method. International Journal of Modern Physics B, 17(28):5311–5325, 2003.
[54] H Bernhard Schlegel. Optimization of equilibrium geometries and transition structures. J. Comput. Chem., 3(2):214–218, June 1982.
[55] Jack Simons, Poul Joergensen, Hugh Taylor, and Judy Ozment. Walking on potential energy surfaces. J. Phys. Chem., 87(15):2745–2753, July 1983.
[56] H Bernhard Schlegel. Estimating the hessian for gradient-type geometry optimizations. Theoret. Chim. Acta, 66(5):333–340, 1984.
[57] Jay W Ponder and Frederic M Richards. An efficient newton-like method for molecular mechanics energy minimization of large molecules. J. Comput. Chem., 8(7):1016–1024, October 1987.
[58] Chunyang Peng, Philippe Y Ayala, H Bernhard Schlegel, and Michael J Frisch. Using redundant internal coordinates to optimize equilibrium geometries and transition states. J. Comput. Chem., 17(1):49–56, January 1996.
[59] Xiaosong Li and Michael J Frisch. Energy-Represented Direct Inversion in the Iterative Subspace within a Hybrid Geometry Optimization Method. J. Chem. Theory Comput., 2(3):835–839, May 2006.
[60] Charles J Cerjan and William H Miller. On finding transition states. J. Chem. Phys., 75(6):2800, 1981.
[61] Ajit Banerjee, Noah Adams, Jack Simons, and Ron Shepard. Search for stationary points on surfaces. J. Phys. Chem., 89(1):52–57, January 1985.
[62] Thomas A Halgren and William N Lipscomb. The synchronous-transit method for determining reaction pathways and locating molecular transition states. Chem. Phys. Lett., 49(2):225–232, July 1977.
[63] C Y Peng and H Bernhard Schlegel. Combining synchronous transit and quasi-newton methods to find transition states. Isr. J. Chem., 33:449–454, 1993.
[64] R Elber and M Karplus. A method for determining reaction paths in large molecules: Application to myoglobin. Chem. Phys. Lett., 139(5):375–380, January 1987.
[65] G Mills, H Jonsson, and G K Schenter. Reversible Work Transition-State Theory - Application to Dissociative Adsorption of Hydrogen. Surf. Sci., 324:305–337, February 1995.
[66] Graeme Henkelman, Blas P Uberuaga, and Hannes Jonsson. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J. Chem. Phys., 113(22):9901, 2000.
[67] P Maragakis, Stefan A Andreev, Yisroel Brumer, David R Reichman, and Efthimios Kaxiras. Adaptive nudged elastic band approach for transition state calculation. J. Chem. Phys., 117(10):4651–4658, September 2002.
[68] Semen A Trygubenko and David J Wales. A doubly nudged elastic band method for finding transition states. J. Chem. Phys., 120(5):2082–2094, 2004.
[69] Weinan E, Weiqing Ren, and Eric Vanden-Eijnden. String method for the study of rare events. Physical Review B, 66(5):052301, August 2002.
[70] Andrew Behn, Paul M Zimmerman, Alexis T Bell, and Martin Head-Gordon. Efficient exploration of reaction paths via a freezing string method. J. Chem. Phys., 135(22):224108, 2011.
[71] Baron Peters, Andreas Heyden, Alexis T Bell, and Arup Chakraborty. A growing string method for determining transition states: comparison to the nudged elastic band and string methods. J. Chem. Phys., 120(17):7877–7886, May 2004.
[72] Anthony Goodrow, Alexis T Bell, and Martin Head-Gordon. Transition state-finding strategies for use with the growing string method. J. Chem. Phys., 130(24):244108, 2009.
[73] Elena F Koslover and David J Wales. Comparison of double-ended transition state search methods. J. Chem. Phys., 127(13):134102, 2007.
[74] Satoshi Maeda and Keiji Morokuma. Finding Reaction Pathways of Type A+B -> X: Toward Systematic Prediction of Reaction Mechanisms. J. Chem. Theory Comput., 7(8):2335–2345, August 2011.
[75] Satoshi Maeda, Tetsuya Taketsugu, and Keiji Morokuma. Exploring transition state structures for intramolecular pathways by the artificial force induced reaction method. J. Comput. Chem., 35(2):166–173, January 2014.
[76] Paul M Zimmerman. Automated discovery of chemically reasonable elementary reaction steps. J. Comput. Chem., 34(16):1385–1392, 2013.
[77] Paul M Zimmerman. Navigating molecular space for reaction mechanisms: an efficient, automated procedure. Molecular Simulation, 41(1-3):43–54, December 2014.
[78] Yury V Suleimanov and William H Green. Automated Discovery of Elementary Chemical Reaction Steps Using Freezing String and Berny Optimization Methods. J. Chem. Theory Comput., 11(9):4248–4259, September 2015.
[79] Paul M Zimmerman. Single-Ended Transition State Finding with the Growing String Method. J. Comput. Chem., 36(9):601–611, 2015.
[80] Judit Zador and Habib N Najm. Automated exploration of the mechanism of elementary reactions. Technical Report SAND2012-8095, Sandia National Laboratories, September 2012.
[81] Benjamin J Rooks, Madison R Haas, Diana Sepulveda, Tongxiang Lu, and Steven E Wheeler. Prospects for the Computational Design of Bipyridine N,N´-Dioxide Catalysts for Asymmetric Propargylation Reactions. ACS Catal., 5(1):272–280, December 2014.
[82] Alan D Isaacson, Donald G Truhlar, Sachchida N Rai, Rozeanne Steckler, Gene C Hancock, Bruce C Garrett, and Michael J Redmon. POLYRATE: A general computer program for variational transition state theory and semiclassical tunneling calculations of chemical reaction rates. Comput. Phys. Commun., 47(1):91–102, October 1987.
[83] S J Klippenstein, A F Wagner, R C Dunbar, D M Wardlaw, S H Robertson, and J A Miller. VARIFLEX: VERSION 2.02 m; Argonne National Laboratory: Argonne, IL, 2010.
[84] S Sharma, Michael R Harper, and William H Green. CanTherm: Open-source software for thermodynamics and kinetics, 2010.
[85] J R Barker, N F Ortiz, J M Preses, L L Lohr, A Maranzana, P J Stimac, T L Nguyen, and T J Dhilip Kumar. MultiWell-2012.1 Software. University of Michigan: Ann Arbor, MI, 2012.
[86] D Katsikadakos, C W Zhou, John M Simmie, Henry J Curran, P A Hunt, Y Hardalupas, and A M K P Taylor. Rate constants of hydrogen abstraction by methyl radical from n-butanol and a comparison of CanTherm, MultiWell and Variflex. Proceedings of the Combustion Institute, 34(1):483–491, 2013.
[87] E Ranzi, A Frassoldati, R Grana, Alberto Cuoci, Tiziano Faravelli, A P Kelley, and Chung K Law. Hierarchical and comparative kinetic modeling of laminar flame speeds of hydrocarbon and oxygenated fuels. Prog. Energ. Combust., 38(4):468–501, August 2012.
[88] William H Green, Joshua W Allen, Beat A Buesser, Robert W Ashcraft, Gregory J O Beran, Caleb A Class, Connie W Gao, C Franklin Goldsmith, Michael R Harper, Amrit Jalan, Murat Keceli, Gregory R Magoon, David M Matheu, Shamel S Merchant, Jeffrey D Mo, Sarah Petway, Sumathy Raman, Sandeep Sharma, Jing Song, Yury V Suleimanov, Kevin M Van Geem, John Wen, Richard H West, Andrew Wong, Hsi-Wu Wong, Paul E Yelvington, Nathan Yee, and Joanna Yu. RMG — Reaction Mechanism Generator. rmg.sourceforge.net, 2013.
[89] Jing Song, George Stephanopoulos, and William H Green. Valid parameter range analyses for chemical reaction kinetic models. Chem. Eng. Sci., 57(21):4475–4491, November 2002.
[90] R Sumathi, Hans-Heinrich Carstensen, and William H Green. Reaction Rate Prediction via Group Additivity Part 1: H Abstraction from Alkanes by H and CH3. J. Phys. Chem. A, 105(28):6910–6925, July 2001.
[91] E Wigner. The transition state method. Trans. Faraday Soc., 34(0):29–41, 1938.
[92] Michael J S Dewar, Eamonn F Healy, and James J P Stewart. Location of transition states in reaction mechanisms. J. Chem. Soc., Faraday Trans. 2, 80(3):227–233, 1984.
[93] Nuria Gonzalez-Garcia, Jingzhi Pu, Angels Gonzalez-Lafont, Jose M Lluch, and Donald G Truhlar. Searching for Saddle Points by Using the Nudged Elastic Band Method: An Implementation for Gas-Phase Systems. J. Chem. Theory Comput., 2(4):895–904, July 2006.
[94] James B Foresman and Aeleen Frisch. Exploring Chemistry with Electronic Structure Methods. Gaussian, Pittsburgh, PA, 2nd edition, 1996.
[95] K K Irikura and R D Johnson. Predicting unexpected chemical reactions by isopotential searching. J. Phys. Chem. A, 104(11):2191–2194, 2000.
[96] Jeffrey M Blaney and J Scott Dixon. Distance Geometry in Molecular Modeling. In Reviews in Computational Chemistry, pages 299–335. John Wiley & Sons, Inc., Hoboken, NJ, USA, January 1994.
[97] G Landrum. RDKit: Open-source cheminformatics.
[98] Jean-Paul Ebejer, Garrett M Morris, and Charlotte M Deane. Freely available conformer generation methods: how good are they? J. Chem. Inf. Model., 52(5):1146–1158, May 2012.
[99] G M Crippen and T F Havel. Distance geometry and molecular conformation. Research Studies Press, 1988.
[100] A K Rappe, C J Casewit, K S Colwell, W A Goddard, and W M Skiff. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. J. Am. Chem. Soc., 114(25):10024–10035, December 1992.
[101] H Bernhard Schlegel. Exploring potential energy surfaces for chemical reactions: An overview of some practical methods. J. Comput. Chem., 24(12):1514–1527, July 2003.
[102] Kenichi Fukui. The path of chemical reactions - the IRC approach. Acc. Chem. Res., 14(12):363–368, 1981.
[103] Noel M O’Boyle, Michael Banck, Craig A James, Chris Morley, Tim Vandermeersch, and Geoffrey R Hutchison. Open Babel: An open chemical toolbox. Journal of Cheminformatics, 3(1):33, October 2011.
[104] R E Huie, Donald R Burgess Jr, W Tsang, W S McGivern, J W Hudgens, E Chai, A M Tereza, F Westley, J T Herron, and D H Frizzell. NIST Chemical Kinetics Database. Technical report, 2013.
[105] Michael J Frisch, G W Trucks, H Bernhard Schlegel, G E Scuseria, M A Robb, J R Cheeseman, G Scalmani, V Barone, B Mennucci, G A Petersson, H Nakatsuji, M Caricato, X Li, H P Hratchian, A F Izmaylov, J Bloino, G Zheng, J L Sonnenberg, M Hada, M Ehara, K Toyota, R Fukuda, J Hasegawa, M Ishida, T Nakajima, Y Honda, O Kitao, H Nakai, T Vreven, J A Montgomery Jr, J E Peralta, F Ogliaro, M Bearpark, J J Heyd, E Brothers, K N Kudin, V N Staroverov, R Kobayashi, J Normand, K Raghavachari, A Rendell, J C Burant, S S Iyengar, J Tomasi, M Cossi, N Rega, J M Millam, M Klene, J E Knox, J B Cross, V Bakken, C Adamo, J Jaramillo, R Gomperts, R E Stratmann, O Yazyev, A J Austin, R Cammi, C Pomelli, J W Ochterski, R L Martin, K Morokuma, V G Zakrzewski, G A Voth, P Salvador, J J Dannenberg, S Dapprich, A D Daniels, O Farkas, James B Foresman, J V Ortiz, J Cioslowski, and D J Fox. Gaussian 09. Gaussian, Inc., Wallingford, CT, 2009.
[106] James J P Stewart. MOPAC2012. openmopac.net, 2012.
[107] Yan Zhao and Donald G Truhlar. The M06 suite of density functionals for main group thermochemistry, thermochemical kinetics, noncovalent interactions, excited states, and transition elements: two new functionals and systematic testing of four M06-class functionals and 12 other functionals. Theor. Chem. Acc., 120(1-3):215–241, May 2008.
[108] Xuefei Xu, Ionut M Alecu, and Donald G Truhlar. How Well Can Modern Density Functionals Predict Internuclear Distances at Transition States? J. Chem. Theory Comput., 2011.
[109] Vitaly A Rassolov, Mark A Ratner, John A Pople, Paul C Redfern, and Larry A Curtiss. 6-31G* basis set for third-row atoms. J. Comput. Chem., 22(9):976–984, 2001.
[110] Timothy Clark, Jayaraman Chandrasekhar, Günther W Spitznagel, and Paul Von Ragué Schleyer. Efficient diffuse function-augmented basis sets for anion calculations. III. The 3-21+G basis set for first-row elements, Li-F. J. Comput. Chem., 4(3):294–301, 1983.
[111] Sidney W Benson. Thermochemical kinetics: methods for the estimation of thermochemical data and rate parameters. Wiley, New York, 2nd edition, 1976.
[112] Nadia Sebbar, Henning Bockhorn, and Joseph W Bozzelli. Thermodynamic properties (S298, Cp(T), internal rotations and group additivity parameters) in vinyl and phenyl hydroperoxides. Phys. Chem. Chem. Phys., 5(2):300–307, January 2003.
[113] Mark Saeys, Marie-Francoise Reyniers, Guy B Marin, Veronique Van Speybroeck, and Michel Waroquier. Ab Initio Calculations for Hydrocarbons: Enthalpy of Formation, Transition State Geometry, and Activation Energy for Radical Reactions. J. Phys. Chem. A, 107(43):9147–9159, October 2003.
[114] Mark Saeys, Marie-Francoise Reyniers, Guy B Marin, Veronique Van Speybroeck, and Michel Waroquier. Ab initio group contribution method for activation energies for radical additions. AIChE J., 50(2):426–444, 2004.
[115] Mark Saeys, Marie-Francoise Reyniers, Veronique Van Speybroeck, Michel Waroquier, and Guy B Marin. Ab Initio Group Contribution Method for Activation Energies of Hydrogen Abstraction Reactions. ChemPhysChem, 7(1):188–199, January 2006.
[116] Aaron G Vandeputte, Maarten K Sabbe, Marie-Francoise Reyniers, and Guy B Marin. Kinetics of α hydrogen abstractions from thiols, sulfides and thiocarbonyl compounds. Phys. Chem. Chem. Phys., 14(37):12773–12793, August 2012.
[117] A McIlroy, G McRae, V Sick, D L Siebers, Charles K Westbrook, P J Smith, Craig A Taatjes, A Trouve, A F Wagner, E Rohlfing, D Manley, F Tully, R Hilderbrandt, William H Green, D Marceau, J O’Neal, M Lyday, F Cebulski, T R Garcia, and D Strong. Basic Research Needs for Clean and Efficient Combustion of 21st Century Transportation Fuels. Technical report, USDOE Office of Science (SC) (United States), November 2006.
[118] Chung K Law, Emily A Carter, J H Chen, Frederick L Dryer, F N Egolfopoulos, William H Green, Nils Hansen, Ronald K Hanson, Yiguang Ju, Stephen J Klippenstein, S B Pope, C J Sung, Donald G Truhlar, and Hai Wang. First Annual Conference of the Combustion Energy Frontier Research Center (CEFRC). In First Annual Conference of the Combustion Energy Frontier Research Center (CEFRC), 2010.
[119] Stefan van der Walt, S Chris Colbert, and Gael Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation. Computing in Science & Engineering, 13(2):22–30, March 2011.
[120] Axel D Becke. Density-functional thermochemistry. III. The role of exact exchange. J. Chem. Phys., 98(7):5648–5652, 1993.
[121] P J Stephens, F J Devlin, C F Chabalowski, and Michael J Frisch. Ab initio calculation of vibrational absorption and circular dichroism spectra using density functional force fields. J. Phys. Chem., 98(45):11623–11627, 1994.
[122] C J Clopper and E S Pearson. The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial. Biometrika, 26(4):404, December 1934.
[123] Yan Zhao and Donald G Truhlar. A new local density functional for main-group thermochemistry, transition metal bonding, thermochemical kinetics, and noncovalent interactions. J. Chem. Phys., 125(19):194101, 2006.
[124] B J Lynch, Yan Zhao, and Donald G Truhlar. Effectiveness of diffuse basis functions for calculating relative energies by density functional theory. J. Phys. Chem., 2003.
[125] Michael J Frisch, John A Pople, and J Stephen Binkley. Self-consistent molecular orbital methods 25. Supplementary functions for Gaussian basis sets. J. Chem. Phys., 80(7):3265–3269, April 1984.
[126] Fabian Pedregosa, Gael Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Edouard Duchesnay. Scikit-learn: Machine Learning in Python. The Journal of Machine Learning Research, 12:2825–2830, February 2011.
[127] Shamel S Merchant, Everton Fernando Zanoelo, Raymond L Speth, Michael R Harper, Kevin M Van Geem, and William H Green. Combustion and pyrolysis of iso-butanol: Experimental and chemical kinetic modeling study. Combust. Flame, 160(10):1907–1929, October 2013.
[128] Fariba Seyedzadeh Khanshan and Richard H West. Developing detailed kinetic models of syngas production from bio-oil gasification using Reaction Mechanism Generator (RMG). Fuel, 163:25–33, January 2016.
[129] Arij Ben Amara, Andre Nicolle, Maira Alves-Fortunato, and Nicolas Jeuland. Toward Predictive Modeling of Petroleum and Biobased Fuel Stability: Kinetics of Methyl Oleate/n-Dodecane Autoxidation. Energy Fuels, 27(10):6125–6133, October 2013.
[130] Raman Sumathi, Hans-Heinrich Carstensen, and William H Green. Reaction Rate Predictions Via Group Additivity. Part 3: Effect of Substituents with CH2 as the Mediator. J. Phys. Chem. A, 106(22):5474–5489, June 2002.
[131] Andrew J Adamczyk, Marie-Francoise Reyniers, Guy B Marin, and Linda J Broadbelt. Exploring 1,2-Hydrogen Shift in Silicon Nanoparticles: Reaction Kinetics from Quantum Chemical Calculations and Derivation of Transition State Group Additivity Database. J. Phys. Chem. A, 113(41):10933–10946, October 2009.
[132] Andrew J Adamczyk, Marie-Francoise Reyniers, Guy B Marin, and Linda J Broadbelt. Kinetic correlations for H2 addition and elimination reaction mechanisms during silicon hydride pyrolysis. Phys. Chem. Chem. Phys., 12(39):12676–12696, September 2010.
[133] Henry J Curran, P Gaffuri, W J Pitz, and Charles K Westbrook. A Comprehensive Modeling Study of n-Heptane Oxidation. Combust. Flame, 114(1-2):149–177, July 1998.
[134] H Curran. A comprehensive modeling study of iso-octane oxidation. Combust. Flame, 129(3):253–280, May 2002.
[135] Satoshi Maeda, Koichi Ohno, and Keiji Morokuma. An Automated and Systematic Transition Structure Explorer in Large Flexible Molecular Systems Based on Combined Global Reaction Route Mapping and Microiteration Methods. J. Chem. Theory Comput., 5(10):2734–2743, October 2009.
[136] Pierre L Bhoorasingh and Richard H West. Transition state geometry prediction using molecular group contributions. Phys. Chem. Chem. Phys., 17(48):32173–32182, December 2015.
[137] D H Lu, T N Truong, V S Melissas, G C Lynch, Y P Liu, Bruce C Garrett, R Steckler, Alan D Isaacson, S N Rai, G C Hancock, J G Lauderdale, T Joseph, and Donald G Truhlar. Polyrate-4 - a New Version of a Computer-Program for the Calculation of Chemical-Reaction Rates for Polyatomics. Comput. Phys. Commun., 71(3):235–262, September 1992.
[138] S Mani Sarathy, Stijn Vranckx, Kenji Yasunaga, Marco Mehl, Patrick Oßwald, Wayne K Metcalfe, Charles K Westbrook, William J Pitz, Katharina Kohse-Hoinghaus, Ravi X Fernandes, and Henry J Curran. A comprehensive chemical kinetic combustion model for the four butanol isomers. Combust. Flame, 159(6):2028–2055, June 2012.
[139] Kirk A Peterson, Thomas B Adler, and Hans-Joachim Werner. Systematically convergent basis sets for explicitly correlated wavefunctions: The atoms H, He, B–Ne, and Al–Ar. J. Chem. Phys., 128(8):084102, February 2008.
[140] Kazim E Yousaf and Kirk A Peterson. Optimized auxiliary basis sets for explicitly correlated methods. J. Chem. Phys., 129(18):184108, 2008.
[141] Frank Neese. The ORCA program system. Wiley Interdiscip. Rev. Comput. Mol. Sci., 2(1):73–78, January 2012.
[142] Michael J Frisch, G W Trucks, H Bernhard Schlegel, G E Scuseria, M A Robb, J R Cheeseman, G Scalmani, V Barone, B Mennucci, G A Petersson, H Nakatsuji, M Caricato, X Li, H P Hratchian, A F Izmaylov, J Bloino, G Zheng, J L Sonnenberg, M Hada, M Ehara, K Toyota, R Fukuda, J Hasegawa, M Ishida, T Nakajima, Y Honda, O Kitao, H Nakai, T Vreven, J A Montgomery Jr, J E Peralta, F Ogliaro, M Bearpark, J J Heyd, E Brothers, K N Kudin, V N Staroverov, R Kobayashi, J Normand, K Raghavachari, A Rendell, J C Burant, S S Iyengar, J Tomasi, M Cossi, N Rega, J M Millam, M Klene, J E Knox, J B Cross, V Bakken, C Adamo, J Jaramillo, R Gomperts, R E Stratmann, O Yazyev, A J Austin, R Cammi, C Pomelli, J W Ochterski, R L Martin, K Morokuma, V G Zakrzewski, G A Voth, P Salvador, J J Dannenberg, S Dapprich, A D Daniels, O Farkas, James B Foresman, J V Ortiz, J Cioslowski, and D J Fox. Gaussian 09, Revision A.02. Gaussian Inc, December 2009.
[143] M Valiev, E J Bylaska, N Govind, K Kowalski, T P Straatsma, H J J Van Dam, D Wang, J Nieplocha, E Apra, T L Windus, and W A de Jong. NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations. Comput. Phys. Commun., 181(9):1477–1489, September 2010.
[144] Serguei Patchkovskii. SYMMETRY. 2003.
[145] Karl K Irikura and David J Frurip. Computational thermochemistry. American Chemical Society, 1998.
[146] Carl Eckart. The Penetration of a Potential Barrier by Electrons. Phys. Rev., 35(11):1303–1309, June 1930.
[147] Hans-Heinrich Carstensen, Anthony M Dean, and Olaf Deutschmann. Rate constants for the H abstraction from alkanes (R–H) by R´O2 radicals: A systematic study on the impact of R and R´. Proceedings of the Combustion Institute, 31(1):149–157, January 2007.
[148] R W Walker. Reactions of HO2 radicals in combustion chemistry. Symposium (International) on Combustion, 22(1):883–892, January 1989.
[149] J Lee and J W Bozzelli. Thermochemical and kinetic analysis of the formyl methyl radical + O2 reaction system. J. Phys. Chem. A, 2003.
[150] Ionut M Alecu and Donald G Truhlar. Computational Study of the Reactions of Methanol with the Hydroperoxyl and Methyl Radicals. 2. Accurate Thermal Rate Constants. J. Phys. Chem. A, 115(51):14599–14611, December 2011.
[151] Antonio Fernandez-Ramos, Benjamin A Ellingson, Ruben Meana-Paneda, Jorge M C Marques, and Donald G Truhlar. Symmetry numbers and chemical reaction rates. Theor. Chem. Acc., 118(4):813–826, July 2007.
[152] N M Vandewiele, R Van de Vijver, Kevin M Van Geem, Marie-Francoise Reyniers, and Guy B Marin. Symmetry calculation for molecules and transition states. J. Comput. Chem., 2015.
Appendices
A. Double-ended method
Table A.1: Transition states determined at M06-2X/6-31+G(d,p) showed trends in the distances (in Angstroms) with changes to molecular groups. The distances dXH, dHY, and dXY are defined in Figure 2.3.

Reaction                                              dXH     dHY     dXY
CH3CH3 + CH3O ↔ CH3CH2 + CH3OH                        1.241   1.274   2.509
CH3CH3 + CH3CHCH3 ↔ CH3CH2 + CH3CH2CH3                1.373   1.319   2.689
CH3CH3 + CH2OH ↔ CH3CH2 + CH3OH                       1.390   1.316   2.689
CH3CH3 + CH=O ↔ CH3CH2 + CH2=O                        1.446   1.293   2.720
CH3CH2CH3 + CH3O ↔ CH3CH2CH2 + CH3OH                  1.237   1.275   2.510
CH3CH2CH3 + CH3CHCH3 ↔ CH3CH2CH2 + CH3CH2CH3          1.376   1.319   2.689
CH3CH2CH3 + CH2OH ↔ CH3CH2CH2 + CH3OH                 1.395   1.315   2.701
CH3CH(CH3)CH3 + CH3O ↔ CH3CH(CH3)CH2 + CH3OH          1.241   1.275   2.509
CH3CH(CH3)CH3 + CH3CHCH3 ↔ CH3CH(CH3)CH2 + CH3CH2CH3  1.378   1.317   2.690
CH3CH(CH3)CH3 + CH2OH ↔ CH3CH(CH3)CH2 + CH3OH         1.395   1.316   2.707
CH3CH2CH3 + CH3O ↔ CH3CHCH3 + CH3OH                   1.214   1.322   2.531
CH3CH2CH3 + CH2OH ↔ CH3CHCH3 + CH3OH                  1.365   1.340   2.696
CH3CH2CH3 + CH=O ↔ CH3CHCH3 + CH2=O                   1.419   1.315   2.703
CH3CH(CH3)CH3 + CH3O ↔ CH3C(CH3)CH3 + CH3OH           1.194   1.368   2.559
CH3CH(CH3)CH3 + CH2OH ↔ CH3C(CH3)CH3 + CH3OH          1.345   1.360   2.695
CH3CH(CH3)CH3 + CH=O ↔ CH3C(CH3)CH3 + CH2=O           1.402   1.330   2.714
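
The three distances tabulated in Table A.1 fix the triangle formed by the transferred hydrogen and its two neighbors, so the angle at the hydrogen can be recovered with the law of cosines. The minimal sketch below assumes that dXH and dHY are the distances from the migrating hydrogen to the two heavy atoms and that dXY is the distance between those heavy atoms; the authoritative definitions are in Figure 2.3. The function name `xhy_angle_deg` is hypothetical and is not part of the thesis software.

```python
import math


def xhy_angle_deg(d_xh, d_hy, d_xy):
    """Return the X-H-Y angle in degrees from three interatomic distances.

    Assumption (illustrative reading of Table A.1): d_xh and d_hy are the
    hydrogen-to-heavy-atom distances and d_xy is the heavy-atom separation.
    """
    # Law of cosines, solved for the angle at the hydrogen atom.
    cos_angle = (d_xh**2 + d_hy**2 - d_xy**2) / (2.0 * d_xh * d_hy)
    # Clamp to [-1, 1] to guard against rounding in the tabulated values.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


# First row of Table A.1: CH3CH3 + CH3O <-> CH3CH2 + CH3OH
angle = xhy_angle_deg(1.241, 1.274, 2.509)
print(f"X-H-Y angle: {angle:.1f} degrees")
```

In every row of Table A.1, dXY is only slightly smaller than dXH + dHY, so under this assumption the X, H, and Y atoms are close to collinear.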
Table A.2: 334 hydrogen abstraction reactions used to test automated transition state algorithms.

Reactions    Success/Failure
[CH3] + [H][H] ↔ C + [H]    Success
O + [H] ↔ [H][H] + [OH]    Success
C#C + [H] ↔ [C]#C + [H][H]    QST3 needed
C[CH2] + [H][H] ↔ CC + [H]    Success
C[O] + [H][H] ↔ CO + [H]    Success
O[CH2] + [H][H] ↔ CO + [H]    Success
O=[CH] + [H][H] ↔ C=O + [H]    Success
C=[CH] + [H][H] ↔ C=C + [H]    Transition state failed
O[O] + [H][H] ↔ OO + [H]    IRC failed
O[O] + [H] ↔ [H][H] + [O][O]    Transition state failed
O + [CH3] ↔ C + [OH]    Success
CCO + [H] ↔ CC[O] + [H][H]    Success
CCO + [H] ↔ C[CH]O + [H][H]    Transition state failed
CCO + [H] ↔ [CH2]CO + [H][H]    Success
C + [CH2]C ↔ CC + [CH3]    Success
C + C[O] ↔ CO + [CH3]    Success
C + [CH2]O ↔ CO + [CH3]    Success
C + [CH]=O ↔ C=O + [CH3]    Success
C + [CH]=C ↔ C=C + [CH3]    QST3 needed
C + [O]O ↔ OO + [CH3]    Success
C + [O][O] ↔ [CH3] + [O]O    Transition state failed
C#C + [CH3] ↔ C + [C]#C    Transition state failed
CCC + [H] ↔ [CH2]CC + [H][H]    Success
C1CC1 + [H] ↔ [CH]1CC1 + [H][H]    Success
C1CO1 + [H] ↔ [CH]1CO1 + [H][H]    IRC failed
C=CC + [H] ↔ [CH2]C=C + [H][H]    Success
COC + [H] ↔ [CH2]OC + [H][H]    Success
C[CH]C + [H][H] ↔ CCC + [H]    Success
CO[O] + [H][H] ↔ COO + [H]    Success
C[C]=O + [H][H] ↔ CC=O + [H]    Success
O + [C]#C ↔ C#C + [OH]    Success
C[O] + O ↔ CO + [OH]    Success
O + [CH2]O ↔ CO + [OH]    IRC failed
O + [CH]=O ↔ C=O + [OH]    IRC failed
O + [CH]=C ↔ C=C + [OH]    Success
O + [O]O ↔ OO + [OH]    IRC failed
O + [CH2]C ↔ CC + [OH]    IRC failed
O + [O][O] ↔ [OH] + [O]O    Transition state failed
C=CC + [H] ↔ [CH]=CC + [H][H]    Success
C=CC + [H] ↔ C=[C]C + [H][H]    Success
C=O + C[O] ↔ CO + [CH]=O    Success
C=O + [CH]=C ↔ C=C + [CH]=O    QST3 needed
C=O + [O][O] ↔ [CH]=O + [O]O    IRC failed
CCO + [CH3] ↔ C + CC[O]    Success
CCO + [CH3] ↔ C + C[CH]O    Success
CCO + [OH] ↔ C[CH]O + O    IRC failed
CO + [CH2]O ↔ CO + C[O]    Success
CO + [CH]=O ↔ C=O + [CH2]O    Success
CO + [CH]=C ↔ C=C + C[O]    Success
CO + [CH]=C ↔ C=C + [CH2]O    Success
CO + [O][O] ↔ [CH2]O + [O]O    IRC failed
CC(C)O + [H] ↔ C[C](C)O + [H][H]    IRC failed
CC(C)=O + [H] ↔ [CH2]C(C)=O + [H][H]    Success
C + CO[O] ↔ COO + [CH3]    Success
CC + C[O] ↔ CO + [CH2]C    Success
CC + [CH2]O ↔ CO + [CH2]C    Success
CC + [CH]=O ↔ C=O + [CH2]C    IRC failed
CC + [CH]=C ↔ C=C + [CH2]C    IRC failed
CC + [O][O] ↔ [CH2]C + [O]O    IRC failed
C=C + [O][O] ↔ [CH]=C + [O]O    IRC failed
C#C + [CH2]C ↔ CC + [C]#C    IRC failed
CCC + [CH3] ↔ C + [CH2]CC    Success
CCC + [OH] ↔ C[CH]C + O    IRC failed
CCC + [OH] ↔ O + [CH2]CC    IRC failed
CC=O + [CH3] ↔ C + C[C]=O    Transition state failed
CC=O + [OH] ↔ C[C]=O + O    IRC failed
CC=O + [OH] ↔ O + [CH2]C=O    Success
C1CC1 + [OH] ↔ O + [CH]1CC1    Success
C1CO1 + [CH3] ↔ C + [CH]1CO1    Success
C1CO1 + [OH] ↔ O + [CH]1CO1    Success
CCCC + [H] ↔ [CH2]CCC + [H][H]    Success
C=CC=C + [H] ↔ [CH]=CC=C + [H][H]    Success
C=CC + [CH3] ↔ C + [CH2]C=C    Success
C=CC + [OH] ↔ O + [CH2]C=C    IRC failed
COC + [CH3] ↔ C + [CH2]OC    Success
COC + [OH] ↔ O + [CH2]OC    IRC failed
C[C](C)C + [H][H] ↔ CC(C)C + [H]    Success
C[CH]CC + [H][H] ↔ CCCC + [H]    Success
CC(C)C + [H] ↔ [CH2]C(C)C + [H][H]    Success
C=C(C)C + [H] ↔ [CH2]C(=C)C + [H][H]    Success
C#C + [O][O] ↔ [C]#C + [O]O    QST3 needed
OO + [CH2]O ↔ CO + [O]O    Success
OO + [CH]=O ↔ C=O + [O]O    Success
OO + [CH]=C ↔ C=C + [O]O    Success
COO + [OH] ↔ CO[O] + O    Transition state failed
OO + [CH2]C ↔ CC + [O]O    Success
OO + [O][O] ↔ [O]O + [O]O    Success
CCCO + [H] ↔ [CH2]CCO + [H][H]    Success
COC=O + [H] ↔ [CH2]OC=O + [H][H]    Success
COC=O + [H] ↔ CO[C]=O + [H][H]    Success
C=CC + [OH] ↔ O + [CH]=CC    Success
C=CC + [OH] ↔ C=[C]C + O    Success
C[C]=CC + [H][H] ↔ CC=CC + [H]    Transition state failed
COO + [OH] ↔ O + [CH2]OO    Success
C=O + CO[O] ↔ COO + [CH]=O    Success
C=O + C[C]=O ↔ CC=O + [CH]=O    QST3 needed
CO + CO[O] ↔ COO + [CH2]O    Success
CO + C[C]=O ↔ CC=O + [CH2]O    Transition state failed
CC(C)=O + [CH3] ↔ C + [CH2]C(C)=O    Success
CC(C)=O + [OH] ↔ O + [CH2]C(C)=O    Success
CCCCO + [H] ↔ [CH2]CCCO + [H][H]    Success
CC + C[CH]C ↔ CCC + [CH2]C    Success
CC + CO[O] ↔ COO + [CH2]C    Transition state failed
CCC + [CH2]C ↔ CC + [CH2]CC    Transition state failed
CCC + C[O] ↔ CO + C[CH]C    Transition state failed
CCC + C[O] ↔ CO + [CH2]CC    Success
CCC + [CH2]O ↔ CO + C[CH]C    Success
CCC + [CH2]O ↔ CO + [CH2]CC    Success
CCC + [CH]=O ↔ C=O + C[CH]C    Transition state failed
CCC + [CH]=O ↔ C=O + [CH2]CC    Success
CCC + [CH]=C ↔ C=C + C[CH]C    Success
CCC + [CH]=C ↔ C=C + [CH2]CC    Success
CCC + [O]O ↔ C[CH]C + OO    Success
CCC + [O][O] ↔ C[CH]C + [O]O    IRC failed
CCC + [O][O] ↔ [CH2]CC + [O]O    IRC failed
CC=O + [CH2]C ↔ CC + C[C]=O    Success
CC=O + [CH]=C ↔ C=C + C[C]=O    Success
CC=O + [O][O] ↔ C[C]=O + [O]O    IRC failed
CC(C)C + [CH3] ↔ C + C[C](C)C    Transition state failed
CC(C)C + [CH3] ↔ C + [CH2]C(C)C    Success
CC(C)C + [OH] ↔ C[C](C)C + O    IRC failed
CC(C)C + [OH] ↔ O + [CH2]C(C)C    IRC failed
CC(C)(C)O + [H] ↔ [CH2]C(C)(C)O + [H][H]    Success
CCC(C)C + [H] ↔ [CH2]CC(C)C + [H][H]    Success
CCC(C)C + [H] ↔ C[CH]C(C)C + [H][H]    Success
CCC(C)C + [H] ↔ CC[C](C)C + [H][H]    Success
CCCC + [CH3] ↔ C + C[CH]CC    Success
CCCC + [CH3] ↔ C + [CH2]CCC    Success
CCCC + [OH] ↔ C[CH]CC + O    IRC failed
CCCC + [OH] ↔ O + [CH2]CCC    IRC failed
C=CC=C + [CH3] ↔ C + [CH]=CC=C    IRC failed
CCCCC + [H] ↔ C[CH]CCC + [H][H]    Transition state failed
CCCCC + [H] ↔ [CH2]CCCC + [H][H]    Success
CCCCC + [H] ↔ CC[CH]CC + [H][H]    Success
C=CC + [CH2]C ↔ CC + [CH2]C=C    Success
C=CC + C[O] ↔ CO + [CH2]C=C    Success
C=CC + [CH2]O ↔ CO + [CH2]C=C    Success
C=CC + [CH]=O ↔ C=O + [CH2]C=C    Success
C=CC + [CH]=C ↔ C=C + [CH2]C=C    Transition state failed
C=C(C)C + [CH3] ↔ C + [CH2]C(=C)C    Success
C1CCC1 + [OH] ↔ O + [CH]1CCC1    IRC failed
C1CCCC1 + [H] ↔ [CH]1CCCC1 + [H][H]    Success
CC(C)(C)C + [H] ↔ [CH2]C(C)(C)C + [H][H]    Success
C=CC + [O][O] ↔ [CH2]C=C + [O]O    IRC failed
OO + [CH2]C=C ↔ C=CC + [O]O    Success
CO[O] + [O]O ↔ COO + [O][O]    Success
CO[O] + OO ↔ COO + [O]O    Success
OO + [CH2]CC ↔ CCC + [O]O    Success
C[C]=O + OO ↔ CC=O + [O]O    Success
CCC(C)C + [H] ↔ [CH2]C(C)CC + [H][H]    Success
C=CCC + [CH3] ↔ C + C=C[CH]C    Success
COC=O + [CH3] ↔ C + CO[C]=O    Success
COC=O + [OH] ↔ O + [CH2]OC=O    IRC failed
COC=O + [OH] ↔ CO[C]=O + O    Success
COC=O + [CH3] ↔ C + [CH2]OC=O    Success
CC=CC + [CH3] ↔ C + [CH2]C=CC    Success
CCOCC + [CH3] ↔ C + C[CH]OCC    Transition state failed
CC(C)=O + [CH]=C ↔ C=C + [CH2]C(C)=O    Success
c1ccccc1 + [H] ↔ [H][H] + [c]1ccccc1    Success
CC + [CH2]CCC ↔ CCCC + [CH2]C    IRC failed
CCC + C[CH]C ↔ CCC + [CH2]CC    IRC failed
CCC + CO[O] ↔ COO + C[CH]C    Success
CCC + CO[O] ↔ COO + [CH2]CC    Success
CCC + C[C]=O ↔ CC=O + C[CH]C    Success
CCC + C[C]=O ↔ CC=O + [CH2]CC    Success
CC=O + [CH2]C=C ↔ C=CC + C[C]=O    Success
CC(C)C + [CH2]C ↔ CC + C[C](C)C    Transition state failed
CC(C)C + [CH2]C ↔ CC + [CH2]C(C)C    Success
CC(C)C + C[O] ↔ CO + C[C](C)C    Transition state failed
CC(C)C + C[O] ↔ CO + [CH2]C(C)C    Success
CC(C)C + [CH2]O ↔ CO + C[C](C)C    Transition state failed
CC(C)C + [CH2]O ↔ CO + [CH2]C(C)C    Success
CC(C)C + [CH]=O ↔ C=O + C[C](C)C    Transition state failed
CC(C)C + [CH]=O ↔ C=O + [CH2]C(C)C    Transition state failed
CC(C)C + [CH]=C ↔ C=C + C[C](C)C    Transition state failed
CC(C)C + [CH]=C ↔ C=C + [CH2]C(C)C    QST3 needed
CC(C)C + [O]O ↔ C[C](C)C + OO    Success
CC(C)C + [O][O] ↔ C[C](C)C + [O]O    Transition state failed
CC(C)C + [O][O] ↔ [CH2]C(C)C + [O]O    Transition state failed
CCC(C)C + [OH] ↔ C[CH]C(C)C + O    IRC failed
Table A.2 – continued from previous pageReactions Success/FailureCCC(C)C + [OH]↔ CC[C](C)C + O IRC failedCC(C)C=O + [CH3]↔ C + CC(C)[C]=O SuccessCC(=O)C=O + [OH]↔ CC(=O)[C]=O + O IRC failedCCCC + [O]O↔ C[CH]CC + OO SuccessCCCC + [O]O↔ OO + [CH2]CCC SuccessC=CCO + [CH2]C↔ C=C[CH]O + CC IRC failedCOC=O + C[O]↔ CO + CO[C]=O SuccessCCCCC + [OH]↔ C[CH]CCC + O Transition state failedCCCCC + [OH]↔ O + [CH2]CCCC IRC failedCCCCC + [OH]↔ CC[CH]CC + O Transition state failedCCOC=O + [CH3]↔ C + CCO[C]=O SuccessC=CC + C[CH]C↔ CCC + [CH2]C=C SuccessC=CC + CO[O]↔ COO + [CH2]C=C SuccessC=CC + [CH2]CC↔ CCC + [CH2]C=C SuccessC=C(C)C + [O][O]↔ [CH2]C(=C)C + [O]O SuccessCCC=O + [CH2]C↔ CC + CC[C]=O SuccessCCCC=O + [CH3]↔ C + CCC[C]=O SuccessC1CCCC1 + [CH3]↔ C + [CH]1CCCC1 IRC failedC1CCCC1 + [OH]↔ O + [CH]1CCCC1 Transition state failedCC(C)(C)C + [CH3]↔ C + [CH2]C(C)(C)C SuccessCC(C)(C)C + [OH]↔ O + [CH2]C(C)(C)C SuccessO=CC1CC1 + [CH3]↔ C + O=[C]C1CC1 IRC failedCC=CC=O + [CH3]↔ C + CC=C[C]=O SuccessCCO[O] + [O]O↔ CCOO + [O][O] SuccessOO + [CH2]C(C)C↔ CC(C)C + [O]O SuccessCOC(C)=O + [CH3]↔ C + [CH2]OC(C)=O SuccessCOC(C)=O + [CH3]↔ C + [CH2]C(=O)OC SuccessCCC=O + [CH2]C↔ CC + C[CH]C=O SuccessCCC=O + [CH2]C↔ CC + [CH2]CC=O SuccessC=O + CC(C)(C)[O]↔ CC(C)(C)O + [CH]=O SuccessCCOCC + [CH2]C↔ CC + C[CH]OCC SuccessC + [c]1ccccc1↔ [CH3] + c1ccccc1 SuccessCC(C)C + C[CH]C↔ CCC + C[C](C)C SuccessCC(C)C + C[CH]C↔ CCC + [CH2]C(C)C SuccessCC(C)C + CO[O]↔ COO + C[C](C)C Transition state failedCC(C)C + CO[O]↔ COO + [CH2]C(C)C SuccessCC(C)C + [CH2]CC↔ CCC + C[C](C)C Transition state failedCC(C)C + [CH2]CC↔ CCC + [CH2]C(C)C SuccessCC(C)C + C[C]=O↔ CC=O + C[C](C)C SuccessCC(C)C + C[C]=O↔ CC=O + [CH2]C(C)C SuccessCC(C)(C)CO + [OH]↔ CC(C)(C)[CH]O + O IRC failedCC(C)(C)OO + [CH3]↔ C + CC(C)(C)O[O] SuccessContinued on next page
Table A.2 – continued from previous pageReactions Success/FailureCCC(C)CC + [OH]↔ CC[C](C)CC + O IRC failedC=CC=O + C[CH]C↔ C=C[C]=O + CCC IRC failedOc1ccccc1 + [H]↔ [H][H] + [O]c1ccccc1 SuccessCCOC=O + [CH2]C↔ CC + CCO[C]=O SuccessCCCCCC + [OH]↔ C[CH]CCCC + O Transition state failedCCCCCC + [OH]↔ O + [CH2]CCCCC SuccessCCCCC=O + [CH3]↔ C + CCCC[C]=O SuccessCCCOC=O + [CH3]↔ C + CCCO[C]=O SuccessC1CCCCC1 + [CH3]↔ C + [CH]1CCCCC1 SuccessC1CCCCC1 + [OH]↔ O + [CH]1CCCCC1 Transition state failedC1OCOCO1 + [OH]↔ O + [CH]1OCOCO1 IRC failedC=CC + C[C](C)C↔ CC(C)C + [CH2]C=C Transition state failedC=CC + [CH2]C(C)=O↔ CC(C)=O + [CH2]C=C SuccessC=CC + [CH2]C(C)C↔ CC(C)C + [CH2]C=C SuccessC1CCCC1 + [CH2]C↔ CC + [CH]1CCCC1 SuccessC1CCCC1 + [O]O↔ OO + [CH]1CCCC1 SuccessCC(C)(C)C + [C]#C↔ C#C + [CH2]C(C)(C)C IRC failedCC(C)=C(C)C + [CH3]↔ C + [CH2]C(C)=C(C)C QST3 neededCC(C)CC=O + [CH3]↔ C + CC(C)C[C]=O SuccessCOC(=O)OC + [CH3]↔ C + [CH2]OC(=O)OC SuccessCOC(=O)OC + [OH]↔ O + [CH2]OC(=O)OC SuccessCC(C)OC=O + [CH3]↔ C + CC(C)O[C]=O Transition state failedCC(C)(C)C=O + [CH3]↔ C + CC(C)(C)[C]=O IRC failedO + [c]1ccccc1↔ [OH] + c1ccccc1 SuccessCC(=O)O[O] + [O]O↔ CC(=O)OO + [O][O] SuccessCCC(C)(C)C + [OH]↔ C[CH]C(C)(C)C + O Transition state failedCCC(C)(C)C + [OH]↔ O + [CH2]C(C)(C)CC SuccessCCC(C)(C)C + [OH]↔ O + [CH2]CC(C)(C)C Transition state failedCC(C)C(C)C + [OH]↔ C[C](C)C(C)C + O Transition state failedC=CC=C + [CH]=C=C↔ C=C=C + [CH]=CC=C IRC failedCCCCCC + [OH]↔ CC[CH]CCC + O IRC failedc1ccccc1 + [O][O]↔ [O]O + [c]1ccccc1 Transition state failedCC + [c]1ccccc1↔ [CH2]C + c1ccccc1 Transition state failedCC(C)(C)[O] + CC=O↔ CC(C)(C)O + C[C]=O Transition state failedCC(C)(C)[O] + CC=O↔ CC(C)(C)O + [CH2]C=O SuccessCC(C)C + [CH2]C(C)C↔ CC(C)C + C[C](C)C SuccessCC(C)C=O + C[CH]C↔ CC(C)[C]=O + CCC SuccessCCc1ccccc1 + [H]↔ [CH2]Cc1ccccc1 + [H][H] SuccessOc1ccccc1 + [CH3]↔ C + [O]c1ccccc1 Transition state failedOc1ccccc1 + [OH]↔ O + [O]c1ccccc1 Transition state failedC1CCCCC1 + [O]O↔ OO + 
[CH]1CCCCC1 SuccessC1=CCCCC1 + [CH2]C↔ CC + [CH]1CC=CCC1 SuccessContinued on next page
Table A.2 – continued from previous pageReactions Success/FailureC=CC + CC(C)(C)[O]↔ CC(C)(C)O + [CH2]C=C SuccessCCCCCCC + [OH]↔ O + [CH2]CCCCCC IRC failedCCCCCCC + [OH]↔ C[CH]CCCCC + O Transition state failedC1CCCCCC1 + [OH]↔ O + [CH]1CCCCCC1 IRC failedCC(C)(C)C + C[CH]C↔ CCC + [CH2]C(C)(C)C SuccessC1=CCC=C1 + [CH2]C=C↔ C=CC + [CH]1C=CC=C1 Transition state failedC#C[CH2] + C1=CCC=C1↔ C#CC + [CH]1C=CC=C1 SuccessC1=CCCC=C1 + [O][O]↔ [CH]1C=CC=CC1 + [O]O SuccessCOC(=O)OC + C[O]↔ CO + [CH2]OC(=O)OC IRC failedO=CC1CC1 + [CH]1CC1↔ C1CC1 + O=[C]C1CC1 QST3 neededCCCC=O + [CH2]CC↔ CCC + CCC[C]=O SuccessCCCC=O + [CH2]CC↔ CCC + CC[CH]C=O SuccessCCCCCCC + [OH]↔ CC[CH]CCCC + O Transition state failedCCCCCCC + [OH]↔ CCC[CH]CCC + O IRC failedCC(C)C(C)(C)C + [OH]↔ C[C](C)C(C)(C)C + O IRC failedCC(C)C(C)(C)C + [OH]↔ O + [CH2]C(C)C(C)(C)C SuccessCC=CC + C[CH]CC↔ CCCC + C[C]=CC Transition state failedC=C[CH]C + CC=CC↔ C=CCC + [CH2]C=CC SuccessCC1C=CC(=O)C=C1 + [CH3]↔ C + C[C]1C=CC(=O)C=C1 SuccessCC(C)(C)[O] + CC(C)=O↔ CC(C)(C)O + [CH2]C(C)=O SuccessC1CC1 + [c]1ccccc1↔ [CH]1CC1 + c1ccccc1 SuccessCC(C)(C)[O] + CC(C)C↔ CC(C)(C)O + C[C](C)C Transition state failedCC(C)(C)[O] + CC(C)C↔ CC(C)(C)O + [CH2]C(C)C SuccessCOc1ccccc1 + [CH3]↔ C + [CH2]Oc1ccccc1 SuccessCOc1ccccc1 + [OH]↔ O + [CH2]Oc1ccccc1 IRC failedCC(C)(C)[O] + CCCC↔ CC(C)(C)O + C[CH]CC Transition state failedCC(C)(C)[O] + CCCC↔ CC(C)(C)O + [CH2]CCC Transition state failedCCCOC=O + [CH2]CC↔ CCC + CCCO[C]=O SuccessCCCCCCCC + [OH]↔ O + [CH2]CCCCCCC IRC failedC=C(C)C + CC(C)(C)[O]↔ CC(C)(C)O + [CH2]C(=C)C SuccessC1CCCCCCC1 + [OH]↔ O + [CH]1CCCCCCC1 IRC failedC=C=C + [c]1ccccc1↔ [CH]=C=C + c1ccccc1 SuccessCC(C)=C(C)C + C[CH]C↔ CCC + [CH2]C(C)=C(C)C Transition state failedCC(C)(C)C(C)(C)C + [CH3]↔ C + [CH2]C(C)(C)C(C)(C)C SuccessCC(C)(C)C(C)(C)C + [OH]↔ O + [CH2]C(C)(C)C(C)(C)C SuccessC1=CC1 + [c]1ccccc1↔ [CH]1C=C1 + c1ccccc1 IRC failedOOC1CCCC1 + [O][O]↔ [O]O + [O]OC1CCCC1 Transition state failedCC(C)(C)CO[O] + [O]O↔ CC(C)(C)COO + 
[O][O] SuccessCCCCCCCC + [OH]↔ C[CH]CCCCCC + O IRC failedCCCCCCCC + [OH]↔ CCC[CH]CCCC + O Transition state failedCCCCCCCC + [OH]↔ CC[CH]CCCCC + O IRC failedCC(C)CC(C)(C)C + [OH]↔ C[C](C)CC(C)(C)C + O Transition state failedContinued on next page
Table A.2 – continued from previous pageReactions Success/FailureCC(C)CC(C)(C)C + [OH]↔ O + [CH2]C(C)CC(C)(C)C Transition state failedCC(C)(C)[O] + CC=CC↔ CC(C)(C)O + [CH2]C=CC SuccessC1=CC1 + [c]1ccccc1↔ [C]1=CC1 + c1ccccc1 SuccessCC(C)(O)CO[O] + [O]O↔ CC(C)(O)COO + [O][O] IRC failedCCOCC + [CH2]COCC↔ CCOCC + C[CH]OCC Transition state failedCC(C)=O + [c]1ccccc1↔ [CH2]C(C)=O + c1ccccc1 Transition state failedCC(C)C + [c]1ccccc1↔ C[C](C)C + c1ccccc1 SuccessCCc1ccccc1 + [O]O↔ OO + [CH2]Cc1ccccc1 SuccessCCCCC=O + [CH2]CCC↔ CCCC + CCCC[C]=O SuccessCCCCCCCCC + [OH]↔ O + [CH2]CCCCCCCC SuccessCC(C)(C)C + CC(C)(C)[O]↔ CC(C)(C)O + [CH2]C(C)(C)C Transition state failedCC(C)(C)[O] + CC=C(C)C↔ CC(C)(C)O + [CH2]C(C)=CC Transition state failedCC(C)CC=O + [CH2]C(C)C↔ CC(C)C + CC(C)C[C]=O SuccessCC(C)(C)C=O + C[C](C)C↔ CC(C)(C)[C]=O + CC(C)C Transition state failedOOC1CCCCC1 + [O][O]↔ [O]O + [O]OC1CCCCC1 SuccessCC(C)(C)OOC(C)(C)C + [CH3]↔ C + [CH2]C(C)(C)OOC(C)(C)C SuccessC1CCCCC1 + CC(C)(C)[O]↔ CC(C)(C)O + [CH]1CCCCC1 SuccessC1CCCC1 + [c]1ccccc1↔ [CH]1CCCC1 + c1ccccc1 Transition state failedC1=CCC=C1 + [c]1ccccc1↔ [CH]1C=CC=C1 + c1ccccc1 SuccessCC(C)(C)[O] + CC(C)=C(C)C↔ CC(C)(C)O + [CH2]C(C)=C(C)C Transition state failedOOCc1ccccc1 + [O][O]↔ [O]O + [O]OCc1ccccc1 IRC failedCCCCCCCCCC + [OH]↔ O + [CH2]CCCCCCCCC SuccessCCCCCCCCCC + [OH]↔ C[CH]CCCCCCCC + O Transition state failedCCCCCCCCCC + [OH]↔ CC[CH]CCCCCCC + O Transition state failedCCCCCCCCCC + [OH]↔ CCCC[CH]CCCCC + O Transition state failedOc1ccccc1 + [CH]1C=CC=C1↔ C1=CCC=C1 + [O]c1ccccc1 Transition state failedC1CCCCC1 + [c]1ccccc1↔ [CH]1CCCCC1 + c1ccccc1 Transition state failedCC(C)C(C)C(C)C + [c]1ccccc1↔ C[C](C)C(C)C(C)C + c1ccccc1 Transition state failed
B. Group contribution method
B.1 Group Training Regression Details
Consider the trees

XH
    X1
        X11
        X12
    X2
        X21
        X22
        X23

and

Yrad
    Y1
        Y11
        Y12
    Y2
        Y21
        Y22

and a known transition state that matches nodes X11 and Y22. This example will deal with a single distance $d$, but in reality it is done for each of the three distances $d_{\mathrm{XH}}$, $d_{\mathrm{HY}}$, and $d_{\mathrm{XY}}$. The distance $d_{11,22}$ from the known transition state can be used to train the groups (X11,Y22) and all combinations of the ancestors of these groups, namely (X11,Y2), (X11,Yrad), (X1,Y22), (X1,Y2), (X1,Yrad), (XH,Y22), (XH,Y2), and (XH,Yrad). If a second known transition state matches nodes (X23,Y2) and has distance $d_{23,2}$, then it provides data for those groups (X23,Y2) and all the pairs of ancestors: (X23,Yrad), (X2,Y2), (X2,Yrad), (XH,Y2), and (XH,Yrad).
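The enumeration of group pairs described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in this work: the dictionaries `X_PARENT` and `Y_PARENT` and the helper names are hypothetical, and simply encode the two example trees.

```python
from itertools import product

# Parent of each node in the two example trees (the root maps to None).
# These dictionaries are hypothetical stand-ins for the real group trees.
X_PARENT = {
    "XH": None,
    "X1": "XH", "X2": "XH",
    "X11": "X1", "X12": "X1",
    "X21": "X2", "X22": "X2", "X23": "X2",
}
Y_PARENT = {
    "Yrad": None,
    "Y1": "Yrad", "Y2": "Yrad",
    "Y11": "Y1", "Y12": "Y1",
    "Y21": "Y2", "Y22": "Y2",
}


def with_ancestors(node, parent):
    """Return the node followed by all of its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parent[node]
    return chain


def trained_pairs(x_node, y_node):
    """All (X group, Y group) pairs that one known transition state trains."""
    return list(product(with_ancestors(x_node, X_PARENT),
                        with_ancestors(y_node, Y_PARENT)))


pairs_first = trained_pairs("X11", "Y22")   # 3 x 3 = 9 pairs
pairs_second = trained_pairs("X23", "Y2")   # 3 x 2 = 6 pairs
```

Together the two transition states give 9 + 6 = 15 pairs, one per linear equation in the regression.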
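From the two training transition states with distances $d_{11,22}$ and $d_{23,2}$ we construct a set of 15 linear equations,

\begin{align}
X_{11} + Y_{22} + d_0 &= d_{11,22} \tag{B.1} \\
X_{11} + Y_{2} + d_0 &= d_{11,22} \tag{B.2} \\
X_{11} + Y_{\mathrm{rad}} + d_0 &= d_{11,22} \tag{B.3} \\
X_{1} + Y_{22} + d_0 &= d_{11,22} \tag{B.4} \\
X_{1} + Y_{2} + d_0 &= d_{11,22} \tag{B.5} \\
X_{1} + Y_{\mathrm{rad}} + d_0 &= d_{11,22} \tag{B.6} \\
X_{\mathrm{H}} + Y_{22} + d_0 &= d_{11,22} \tag{B.7} \\
X_{\mathrm{H}} + Y_{2} + d_0 &= d_{11,22} \tag{B.8} \\
X_{\mathrm{H}} + Y_{\mathrm{rad}} + d_0 &= d_{11,22} \tag{B.9} \\
X_{23} + Y_{2} + d_0 &= d_{23,2} \tag{B.10} \\
X_{23} + Y_{\mathrm{rad}} + d_0 &= d_{23,2} \tag{B.11} \\
X_{2} + Y_{2} + d_0 &= d_{23,2} \tag{B.12} \\
X_{2} + Y_{\mathrm{rad}} + d_0 &= d_{23,2} \tag{B.13} \\
X_{\mathrm{H}} + Y_{2} + d_0 &= d_{23,2} \tag{B.14} \\
X_{\mathrm{H}} + Y_{\mathrm{rad}} + d_0 &= d_{23,2} \tag{B.15}
\end{align}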
where $d_0$ is the base distance common to all transition states, so that the final group values contain only deviations from the base value.

This set of 15 equations in 9 unknowns is over-specified; for example, equations (B.8) and (B.14) have the same left-hand side, and equations (B.9) and (B.15) are also duplicates. Indeed, every known transition state will lead to an expression like (B.9) and (B.15) for $X_{\mathrm{H}} + Y_{\mathrm{rad}} + d_0$. Although there is no set of group values ($X_{11}$, $X_{1}$, $X_{\mathrm{H}}$, $Y_{22}$, etc.) that will precisely solve the above set of linear equations, we can find the values that minimize the error in the equations in the least-squares sense. This is the form of the linear least-squares regression used to train the group values.
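Writing the above set of equations in matrix form:

\begin{equation}
\begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
X_{\mathrm{H}} \\ X_{1} \\ X_{2} \\ X_{11} \\ X_{23} \\ Y_{\mathrm{rad}} \\ Y_{2} \\ Y_{22} \\ d_0
\end{bmatrix}
=
\begin{bmatrix}
d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{11,22} \\ d_{23,2} \\ d_{23,2} \\ d_{23,2} \\ d_{23,2} \\ d_{23,2} \\ d_{23,2}
\end{bmatrix}
\tag{B.16}
\end{equation}

we can use the notation

\begin{equation}
\mathbf{X} \cdot \boldsymbol{\beta} = \mathbf{y} \tag{B.17}
\end{equation}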
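The least-squares fitted group values in the vector $\boldsymbol{\beta}$ can be found by

\begin{equation}
\boldsymbol{\beta} = \left(\mathbf{X}^{T}\mathbf{X}\right)^{-1}\mathbf{X}^{T}\mathbf{y} \tag{B.18}
\end{equation}

or using NumPy's linear algebra library in Python: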
beta, residues, rank, s = numpy.linalg.lstsq(X, y)
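which computes the vector $\boldsymbol{\beta}$ that minimizes the Euclidean 2-norm

\begin{equation}
\left\| \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \right\|_2 \tag{B.19}
\end{equation}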
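As a concrete illustration, the system (B.16) can be assembled and solved as in the sketch below. The two distances are placeholder values chosen only for the example, and the variable names are hypothetical. Note that in this small system the columns are linearly dependent (every row contains exactly one X group, one Y group, and $d_0$), so $\mathbf{X}^{T}\mathbf{X}$ is singular and (B.18) cannot be evaluated literally; `numpy.linalg.lstsq` instead returns the minimum-norm solution that minimizes (B.19).

```python
import numpy as np

# Column order follows the vector of unknowns in (B.16).
unknowns = ["XH", "X1", "X2", "X11", "X23", "Yrad", "Y2", "Y22", "d0"]
col = {name: i for i, name in enumerate(unknowns)}

# Placeholder distances in angstroms (hypothetical values, not thesis data).
d_11_22 = 1.30
d_23_2 = 1.40

# (X group, Y group, distance) for equations (B.1)-(B.15), in order.
equations = (
    [(x, y, d_11_22) for x in ("X11", "X1", "XH") for y in ("Y22", "Y2", "Yrad")]
    + [(x, y, d_23_2) for x in ("X23", "X2", "XH") for y in ("Y2", "Yrad")]
)

# Each row has a 1 in the X-group, Y-group, and d0 columns.
X = np.zeros((len(equations), len(unknowns)))
y = np.zeros(len(equations))
for row, (x_group, y_group, distance) in enumerate(equations):
    X[row, col[x_group]] = 1.0
    X[row, col[y_group]] = 1.0
    X[row, col["d0"]] = 1.0
    y[row] = distance

# rcond=None selects the machine-precision cutoff for small singular values.
beta, residues, rank, s = np.linalg.lstsq(X, y, rcond=None)

# Distances reproduced by the fitted group values, one per equation.
fitted = X @ beta
```

Because the matrix is rank-deficient, the individual entries of `beta` are not unique, but the fitted distances `X @ beta` are; duplicated rows such as (B.9) and (B.15) necessarily receive the same fitted value.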
B.2 Predicted vs Optimized distances
[Figure B.1: parity plots with panels for dXH, dHY, and dXY, labeled 44 TS, 230 TS, 827 TS, and 1071 TS. Horizontal axis: Optimized Distance (Å); vertical axis: Predicted Distance (Å).]

Figure B.1: Comparison of distances from validated transition states to predictions from molecular group values calculated from different sized data sets. The solid line represents parity; the dashed lines represent the root mean squared error of the estimates from parity. The predictions derived from the original and new trees are represented by the black and red circles, respectively.
B.3 Group Naming Convention
While much detail is included in molecular group names, the full description of a group should be checked in the TS groups.py file. Atoms whose element is undefined are typically labeled R, unless the atom is reactive, in which case it is labeled either X or Y.

Molecular group names assume a radical count of zero if none is specified: if radicals are not included in the name, the atom has no radical electrons. So the group X H describes a molecular group where a hydrogen atom is bonded to an atom X which does not have any unpaired (radical) electrons.

Atom bonding is described by the letters s, d, dd, t, and b: s means the atom has only single bonds, d means the atom has exactly one double bond, dd means the atom has 2 double bonds, t means the atom has a triple bond, and b means the atom is part of an aromatic ring. Thus Cs signifies a carbon atom with only single bonds (not a caesium atom).

A forward slash denotes atoms bonded to the atom specified before the first forward slash. For example, C/H3/Cs describes a carbon atom bonded to 3 hydrogen atoms and to a carbon with only single bonds.

A backslash has the same meaning as the forward slash, but applies to the atom specified immediately before the first backslash. For example, C/H3/Cs\H2\Cs describes a carbon atom bonded to 3 hydrogen atoms and to a carbon which has single bonds to 2 hydrogen atoms and a third carbon (which only has single bonds).
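The slash and backslash convention can be illustrated with a small parser. This is a hypothetical sketch written for this appendix, not part of RMG or of the group definition file; it handles only simple names such as C/H3/Cs\H2\Cs and does not attempt the more elaborate names that appear in the tables below.

```python
import re

# Meaning of the bonding letters described above.
BOND_LETTERS = {
    "s": "only single bonds",
    "d": "exactly one double bond",
    "dd": "two double bonds",
    "t": "a triple bond",
    "b": "part of an aromatic ring",
}


def parse_group_name(name):
    """Split a simple group name into (central atom, neighbors).

    Each neighbor is a tuple (atom, atoms bonded to that neighbor).
    Forward slashes separate atoms bonded to the central atom; backslashes
    separate atoms bonded to the neighbor written before them.
    """
    center, *branches = name.split("/")
    neighbors = []
    for branch in branches:
        atom, *second_shell = branch.split("\\")
        neighbors.append((atom, second_shell))
    return center, neighbors


def describe_bonding(token):
    """Return (element, bonding description) for tokens like 'Cs' or 'Cdd'.

    Returns None for tokens without a bonding letter, such as 'H3'.
    """
    match = re.fullmatch(r"([A-Z])(dd|s|d|t|b)", token)
    if match is None:
        return None
    element, letters = match.groups()
    return element, BOND_LETTERS[letters]


# A raw string keeps the backslashes in the group name literal.
center, neighbors = parse_group_name(r"C/H3/Cs\H2\Cs")
```

Here `center` is the carbon being described, and `neighbors` records that it is bonded to H3 and to a Cs which is itself bonded to H2 and Cs.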
B.4 Group values for original tree
Table B.1: Original tree structure, with distance group data in Å.
Groups TS count dXH (Å) dHY (Å) dXY (Å)
L1: X H or Xanyrad H 2142 1.336010 1.336330 2.667560
L2: X H 2089 -0.002556 0.002864 0.000227
L3: H2 70 -0.327434 -0.045046 -0.369886
L3: Ct H 12 0.426157 -0.214467 0.120589
L4: Ct/H/NonDeC 12 0.426157 -0.214467 0.120589
L3: O H 203 -0.110714 -0.034542 -0.149127
L4: O pri 14 0.075716 -0.199700 -0.158817
L4: O sec 189 -0.122296 -0.024281 -0.148525
L5: O/H/NonDeC 51 -0.061770 -0.091698 -0.156862
L5: O/H/NonDeO 137 -0.143000 -0.000215 -0.144650
L6: H2O2 37 -0.172464 0.030481 -0.143076
L6: ROOH pri 31 -0.164228 0.023011 -0.141922
L6: ROOH sec 1 -0.023183 -0.146971 -0.174787
L6: ROOH ter 1 -0.216642 0.096657 -0.117164
L5: O/H/OneDe 1 -0.241135 -0.138830 -0.383406
L6: O/H/OneDeC 1 -0.241135 -0.138830 -0.383406
L3: Orad O H 22 -0.244636 0.181811 -0.064615
L3: Cd H 319 0.085747 -0.068103 0.017934
L4: Cd pri 210 0.080833 -0.065760 0.015103
L5: Cd/H2/NonDeC 210 0.080833 -0.065760 0.015103
L4: Cd sec 109 0.095340 -0.072680 0.023461
L5: Cd/H/NonDeC 30 0.072086 -0.053192 0.015788
L5: Cd/H/NonDeO
L5: Cd/H/OneDe 79 0.104163 -0.080074 0.026373
L6: Cd/H/Ct 26 0.033884 -0.029006 0.006386
L6: Cd/H/Cb
L6: Cd/H/CO
L6: Cd/H/Cd 53 0.139835 -0.105994 0.036518
L3: Cb H 44 0.124436 -0.092331 0.033127
L3: CO H 153 -0.001372 0.059389 0.057711
L4: CO pri 50 0.000467 0.049726 0.050841
L4: CO sec 103 -0.002287 0.064195 0.061127
L5: CO/H/NonDe 60 -0.009196 0.071787 0.062334
L6: CO/H/Cs 55 -0.009637 0.077298 0.067136
L7: CO/H/Cs\Cs—Cs 6 -0.039990 0.127041 0.088998
L5: CO/H/OneDe 43 0.006808 0.054200 0.059539
L3: Cs H 1266 0.007461 0.023642 0.032296
L4: C methane 69 0.076680 -0.051468 0.028801
L4: C pri 684 0.025511 -0.002230 0.025031
L5: C/H3/Cs 357 0.051887 -0.033063 0.021528
L6: C/H3/Cs\H3 65 0.045880 -0.025075 0.024181
L6: C/H3/Cs\OneNonDe 107 0.053158 -0.037890 0.017928
L7: C/H3/Cs\H2\Cs 59 0.049403 -0.042001 0.010665
L8: C/H3/Cs\H2\Cs—O 1 0.045526 -0.296175 -0.246802
L7: C/H3/Cs\H2\O 48 0.057673 -0.032947 0.026660
L6: C/H3/Cs\TwoNonDe 44 0.049381 -0.031488 0.021282
L7: C/H3/Cs\H\Cs\O
L7: C/H3/Cs\H\Cs\Cs—O
L5: C/H3/O 53 0.025187 0.000259 0.020403
L5: C/H3/OneDe 274 -0.007707 0.036215 0.030303
L6: C/H3/Ct 36 -0.003456 0.024269 0.023000
L6: C/H3/Cb
L6: C/H3/CO 84 0.012344 0.003831 0.016124
L6: C/H3/Cd 154 -0.019271 0.056043 0.039454
L7: C/H3/Cd\H Cd\H2 46 -0.022590 0.051658 0.031290
L7: C/H3/Cd\H Cd\H\Cs 3 -0.047979 0.075930 0.031408
L7: C/H3/Cd\Cs Cd\H2 3 -0.023917 0.029891 0.010253
L4: C sec 409 -0.026003 0.069757 0.044341
L5: C/H2/NonDeC 73 0.027438 -0.019733 0.010886
L6: C/H2/Cs/Cs\O
L6: C/H2/Cs/Cs\Cs—O
L6: C/H2/NonDeC 5ring 5 -0.018316 -0.013568 -0.027701
L7: C/H2/NonDeC 5ring fused6 1
L7: C/H2/NonDeC 5ring fused6 2
L7: C/H2/NonDeC 5ring alpha6ring
L7: C/H2/NonDeC 5ring beta6ring
L6: C/H2/Cs\H3/Cs\H3 53 0.037209 -0.011011 0.029214
L5: C/H2/NonDeO 52 0.011534 0.020231 0.027527
L6: C/H2/CsO 50 0.010769 0.021611 0.027966
L7: C/H2/Cs\Cs2/O
L6: C/H2/O2 2 0.031818 -0.016339 0.015885
L5: C/H2/OneDe 185 -0.037737 0.086334 0.048927
L6: C/H2/OneDeC 184 -0.037603 0.086091 0.048826
L7: C/H2/CtCs
L7: C/H2/CbCs
L7: C/H2/COCs 41 -0.014812 0.053991 0.038405
L8: C/H2/CO\H/Cs\H3 1 -0.050355 0.095460 0.047057
L7: C/H2/CdCs 143 -0.043987 0.095083 0.051745
L8: C/H2/Cd\H Cd\H2/Cs\H3 33 -0.029127 0.076789 0.049531
L6: C/H2/OneDeO 1 -0.059511 0.125879 0.065513
L5: C/H2/TwoDe 99 -0.060370 0.126094 0.067502
L6: C/H2/CtCt
L6: C/H2/CtCb
L6: C/H2/CtCO
L6: C/H2/CbCb
L6: C/H2/CbCO
L6: C/H2/COCO
L6: C/H2/CdCt
L6: C/H2/CdCb
L6: C/H2/CdCO
L6: C/H2/CdCd 99 -0.060370 0.126094 0.067502
L5: C/H2/Cb
L4: C ter 104 -0.025676 0.062321 0.034956
L5: C/H/NonDe 16 0.001573 0.003478 0.006838
L6: C/H/Cs3 15 0.006985 0.005657 0.014344
L7: C/H/Cs3 5ring
L8: C/H/Cs3 5ring fused6
L8: C/H/Cs3 5ring adj5
L7: C/H/Cs2/Cs\O
L6: C/H/NDMustO 1 -0.128313 -0.048800 -0.173313
L7: C/H/Cs2O 1 -0.128313 -0.048800 -0.173313
L7: C/H/CsO2
L7: C/H/O3
L5: C/H/OneDe 87 -0.029273 0.070037 0.038474
L6: C/H/Cs2 87 -0.029273 0.070037 0.038474
L7: C/H/Cs2Ct
L7: C/H/Cs2Cb
L7: C/H/Cs2CO 87 -0.029273 0.070037 0.038474
L7: C/H/Cs2Cd
L6: C/H/CsO
L6: C/H/OO
L5: C/H/TwoDe 1 -0.119337 0.270561 0.154144
L6: C/H/Cs 1 -0.119337 0.270561 0.154144
L7: C/H/CtCt
L7: C/H/CtCb
L7: C/H/CtCO
L7: C/H/CbCb
L7: C/H/CbCO
L7: C/H/COCO
L7: C/H/CdCt
L7: C/H/CdCb
L7: C/H/CdCO
L7: C/H/CdCd 1 -0.119337 0.270561 0.154144
L6: C/H/TDMustO
L5: C/H/ThreeDe
L5: C/H/Cb
L2: Xrad H 53 0.094987 -0.106435 -0.008430
L3: C rad H 38 0.155959 -0.117206 0.041719
L4: CH3 rad H 38 0.155959 -0.117206 0.041719
L3: OH rad H 15 -0.074639 -0.076468 -0.147944
L2: Xbirad H
L3: CH2 triplet H
L3: CH2 singlet H
L2: Xtrirad H
L3: C quartet H
L3: C doublet H
L1: Y anyrad
L2: Y 1centerquadrad
L3: C quintet
L3: C triplet
L3: C singlet
L2: Y 1centertrirad
L3: CH quartet
L3: CH doublet
L2: Y 1centerbirad 53 -0.106387 0.093103 -0.010316
L3: O atom triplet 15 -0.078856 -0.073302 -0.149113
L3: O atom singlet
L3: CH2 triplet 38 -0.116742 0.155689 0.041886
L3: CH2 singlet
L2: Y rad 2089 0.002857 -0.002500 0.000277
L3: H rad 70 -0.044160 -0.330263 -0.371926
L3: Y 2centeradjbirad 22 0.183054 -0.244770 -0.063414
L4: O2b 22 0.183054 -0.244770 -0.063414
L4: C2b
L3: Ct rad 12 -0.215601 0.427396 0.121746
L4: Ct rad/Ct 12 -0.215601 0.427396 0.121746
L3: O rad 203 -0.035877 -0.108766 -0.148471
L4: O pri rad 14 -0.200121 0.080838 -0.153831
L4: O sec rad 189 -0.024990 -0.121335 -0.148115
L5: O rad/NonDeC 51 -0.091751 -0.061514 -0.156592
L6: O rad/Cs\H2\Cs—H—Cs2
L5: O rad/NonDeO 137 0.000232 -0.143066 -0.144048
L6: OOC 100 -0.009950 -0.133086 -0.144316
L5: O rad/OneDe 1 -0.139436 -0.241161 -0.384069
L6: O rad/OneDeC 1 -0.139436 -0.241161 -0.384069
L7: O rad/Cd
L8: O rad/Cd\H Cd\H2
L8: O rad/Cd\H Cd\H\Cs
L8: O rad/Cd\H Cd\Cs2
L8: O rad/Cd\Cs Cd\H2
L8: O rad/Cd\Cs Cd\H\Cs
L8: O rad/Cd\Cs Cd\Cs2
L3: Cd rad 319 -0.068341 0.085680 0.017720
L4: Cd pri rad 210 -0.065901 0.080755 0.014975
L5: Cd Cd\H2 pri rad 37 -0.090673 0.111803 0.024431
L5: Cd Cd\H\Cs pri rad 25 -0.084518 0.116254 0.030054
L6: Cd Cd\H\Cs—H2—Cs pri rad
L5: Cd Cd\Cs2 pri rad
L4: Cd sec rad 109 -0.073073 0.095230 0.023044
L5: Cd rad/NonDeC 30 -0.053245 0.071823 0.015883
L6: Cd Cd\H2 rad/Cs 29 -0.047865 0.071398 0.020717
L6: Cd Cd\H\Cs rad/Cs 1 -0.338387 0.094328 -0.240337
L5: Cd rad/NonDeO
L5: Cd rad/OneDe 79 -0.080561 0.104069 0.025748
L6: Cd rad/Ct 26 -0.028430 0.033281 0.006395
L6: Cd rad/Cb
L6: Cd rad/CO
L6: Cd rad/Cd 53 -0.106627 0.139463 0.035425
L3: Cb rad 44 -0.092544 0.124820 0.033389
L3: CO rad 153 0.060690 -0.001930 0.058483
L4: CO pri rad 50 0.051903 -0.000963 0.051657
L4: CO sec rad 103 0.064999 -0.002404 0.061831
L5: CO rad/NonDe 60 0.072193 -0.008661 0.063342
L5: CO rad/OneDe 43 0.055533 0.005830 0.059842
L3: Cs rad 1266 0.024200 0.007289 0.032625
L4: C methyl 69 -0.050813 0.075919 0.028607
L4: C pri rad 684 -0.001792 0.025273 0.025176
L5: C rad/H2/Cs 357 -0.032772 0.051719 0.021617
L6: C rad/H2/Cs\H3 65 -0.024753 0.045959 0.024509
L6: C rad/H2/Cs\Cs2\O 2 -0.125966 0.025305 -0.097425
L6: C rad/H2/Cs\H\Cs\Cs—O 31 -0.033915 0.057010 0.024538
L6: C rad/H2/Cs\H\Cs—Cs\O
L6: C rad/H2/Cs\H2\Cs—Cs—O
L6: C rad/H2/Cs\H2\Cs—Cs#O 1 -0.296450 0.035505 -0.257321
L5: C rad/H2/Ct 36 0.026030 -0.004893 0.023289
L5: C rad/H2/Cb
L5: C rad/H2/CO 84 0.004886 0.011458 0.016257
L5: C rad/H2/O 53 0.000911 0.024796 0.020406
L5: C rad/H2/Cd 154 0.056969 -0.019724 0.039836
L6: C rad/H2/Cd\H Cd\H2 118 0.058876 -0.021124 0.040298
L6: C rad/H2/Cd\Cs Cd\H2 3 0.030025 -0.024190 0.010083
L4: C sec rad 409 0.070550 -0.026112 0.044984
L5: C rad/H/NonDeC 73 -0.018369 0.026681 0.011442
L6: C rad/H/NonDeC 5ring fused6 1
L6: C rad/H/NonDeC 5ring fused6 2
L6: C rad/H/Cs\H3/Cs\H3 53 -0.010088 0.036102 0.028957
L6: C rad/H/NonDeC 5ring alpha6ring
L6: C rad/H/NonDeC 5ring beta6ring
L6: C rad/H/Cs\H2\Cs/Cs\H2\O
L6: C rad/H/Cs\H\Cs\O/Cs
L6: C rad/H/Cs\H2\Cs—O/Cs
L5: C rad/H/NonDeO 52 0.020310 0.011913 0.027769
L6: C rad/H/CsO 50 0.021612 0.011204 0.028190
L7: C rad/H/Cs\H2\Cs/O
L8: C rad/H/Cs\H2\Cs—H2—Cs/O
L7: C rad/H/Cs\H\Cs2/O
L6: C rad/H/O2 2 -0.016152 0.031780 0.015998
L5: C rad/H/OneDe 185 0.086671 -0.037595 0.049410
L6: C rad/H/OneDeC 184 0.086442 -0.037467 0.049316
L7: C rad/H/CtCs
L7: C rad/H/CbCs
L7: C rad/H/CO/Cs 41 0.053933 -0.014722 0.038508
L8: C rad/H/CO\H/Cs\H3 1 0.095705 -0.050407 0.047214
L7: C rad/H/CdCs 143 0.095805 -0.044019 0.052429
L7: C rad/H/CSCs
L6: C rad/H/OneDeO 1 0.126124 -0.059563 0.065670
L5: C rad/H/TwoDe 99 0.126910 -0.060266 0.068401
L6: C rad/H/CtCt
L6: C rad/H/CtCb
L6: C rad/H/CtCO
L6: C rad/H/CbCb
L6: C rad/H/CbCO
L6: C rad/H/COCO
L6: C rad/H/CdCt
L6: C rad/H/CdCb
L6: C rad/H/CdCO
L6: C rad/H/CdCd 99 0.126910 -0.060266 0.068401
L4: C ter rad 104 0.062942 -0.025701 0.035402
L5: C rad/NonDe 16 0.005813 0.000990 0.008314
L6: C rad/Cs3 15 0.007964 0.006028 0.015416
L7: C rad/Cs2/Cs\O
L7: C rad/Cs3 5ring fused6
L7: C rad/Cs3 5ring adj5
L6: C rad/NDMustO 1 -0.049406 -0.128339 -0.173976
L7: C rad/Cs2O 1 -0.049406 -0.128339 -0.173976
L8: C rad/OOH/Cs/Cs
L8: C rad/O/Cs/Cs\Cs
L7: C rad/CsO2
L7: C rad/O3
L5: C rad/OneDe 87 0.070669 -0.029339 0.038898
L6: C rad/Cs2 87 0.070669 -0.029339 0.038898
L7: C rad/CtCs2
L7: C rad/CbCs2
L7: C rad/COCs2 87 0.070669 -0.029339 0.038898
L7: C rad/CdCs2
L6: C rad/CsO
L6: C rad/O2
L5: C rad/TwoDe 1 0.270660 -0.119354 0.154191
L6: C rad/Cs 1 0.270660 -0.119354 0.154191
L7: C rad/CtCtCs
L7: C rad/CtCbCs
L7: C rad/CtCOCs
L7: C rad/CbCbCs
L7: C rad/CbCOCs
L7: C rad/COCOCs
L7: C rad/CdCtCs
L7: C rad/CdCbCs
L7: C rad/CdCOCs
L7: C rad/CdCdCs 1 0.270660 -0.119354 0.154191
L6: C rad/TDMustO
L5: C rad/ThreeDe
B.5 Group values for new tree
Table B.2: Modified tree structure, with distance group data in Å.
Groups TS count dXH (Å) dHY (Å) dXY (Å)
L1: X H or Xanyrad H 2142 1.335530 1.335350 2.666040
L2: H2 70 -0.342288 -0.039783 -0.379376
L2: C H 1832 0.029362 0.004085 0.033720
L3: Cs H 1304 0.013039 0.020558 0.034846
L4: Csnorad H 1266 0.008578 0.024975 0.034745
L5: C methane 69 0.076977 -0.054808 0.025695
L5: CsRHHH 684 0.025708 -0.000824 0.026630
L6: CsCHHH 631 0.025688 -0.001134 0.026803
L7: C/H3/Cs 357 0.050417 -0.030813 0.022318
L7: C/H3/Cd 154 -0.016229 0.054933 0.041314
L7: C/H3/Ct 36 -0.002891 0.027628 0.026874
L7: C/H3/Cb
L6: CsOHHH 53 0.025956 0.002967 0.024508
L5: CsRRHH 409 -0.022947 0.070377 0.048092
L6: CsCCHH 356 -0.028156 0.077681 0.050784
L7: C/H2/Cs/Cs 73 0.027797 -0.011348 0.019668
L7: C/H2/Cs/Cd 143 -0.039272 0.093747 0.055160
L7: C/H2/Cs/Ct
L7: C/H2/Cs/Cb
L7: C/H2/Cd/Cd 99 -0.054862 0.123563 0.070441
L7: C/H2/Cd/Ct
L7: C/H2/Cd/Cb
L7: C/H2/Ct/Ct
L7: C/H2/Ct/Cb
L7: C/H2/Cb/Cb
L6: CsCOHH 51 0.011230 0.023228 0.030972
L7: C/H2/Cs/O 50 0.013016 0.020855 0.030315
L7: C/H2/Cd/O 1 -0.064825 0.124233 0.058927
L7: C/H2/Ct/O
L7: C/H2/Cb/O
L6: CsOOHH 2 0.027312 -0.019710 0.008337
L5: CsRRRH 104 -0.024248 0.067372 0.041141
L6: CsCCCH 103 -0.023913 0.067560 0.041645
L7: C/H/Cs/Cs/Cs 15 0.013473 0.013732 0.029444
L7: C/H/Cs/Cs/Cd
L7: C/H/Cs/Cs/Ct
L7: C/H/Cs/Cs/Cb
L7: C/H/Cs/Cd/Cd 1 -0.121224 0.263641 0.145531
L7: C/H/Cs/Cd/Ct
L7: C/H/Cs/Cd/Cb
L7: C/H/Cs/Ct/Ct
L7: C/H/Cs/Ct/Cb
L7: C/H/Cs/Cb/Cb
L7: C/H/Cd/Cd/Cd
L7: C/H/Cd/Cd/Ct
L7: C/H/Cd/Cd/Cb
L7: C/H/Cd/Ct/Ct
L7: C/H/Cd/Ct/Cb
L7: C/H/Cd/Cb/Cb
L7: C/H/Ct/Ct/Ct
L7: C/H/Ct/Ct/Cb
L7: C/H/Ct/Cb/Cb
L7: C/H/Cb/Cb/Cb
L6: CsCCOH 1 -0.121857 0.012401 -0.106079
L7: C/H/Cs/Cs/O 1 -0.121857 0.012401 -0.106079
L7: C/H/Cs/Cd/O
L7: C/H/Cs/Ct/O
L7: C/H/Cs/Cb/O
L7: C/H/Cd/Cd/O
L7: C/H/Cd/Ct/O
L7: C/H/Cd/Cb/O
L7: C/H/Ct/Ct/O
L7: C/H/Ct/Cb/O
L7: C/H/Cb/Cb/O
L6: CsCOOH
L7: C/H/Cs/O/O
L7: C/H/Cd/O/O
L7: C/H/Ct/O/O
L7: C/H/Cb/O/O
L6: CsOOOH
L4: Csrad H 38 0.154287 -0.119281 0.038063
L5: C methyl 38 0.154287 -0.119281 0.038063
L5: CsradRH2
L6: CsradCHH
L7: Csrad/H/Cs/H
L7: Csrad/H/Cd/H
L7: Csrad/H/Ct/H
L7: Csrad/H/Cb/H
L6: CsradOH2
L5: CsradRRH
L6: CsradCCH
L7: Csrad/Cs/Cs/H
L7: Csrad/Cs/Cd/H
L7: Csrad/Cs/Ct/H
L7: Csrad/Cs/Cb/H
L7: Csrad/Cd/Cd/H
L7: Csrad/Cd/Ct/H
L7: Csrad/Cd/Cb/H
L7: Csrad/Ct/Ct/H
L7: Csrad/Ct/Cb/H
L7: Csrad/Cb/Cb/H
L6: CsradCOH
L7: Csrad/Cs/O/H
L7: Csrad/Cd/O/H
L7: Csrad/Ct/O/H
L7: Csrad/Cb/O/H
L6: CsradOOH
L4: CsbiradH
L5: Cs singletH
L6: Cs singletHH
L6: Cs singletRH
L7: C singletCH
L8: C singlet/Cs/H
L8: C singlet/Cd/H
L8: C singlet/Ct/H
L8: C singlet/Cb/H
L7: C singletOH
L5: Cs tripletH
L6: Cs tripletHH
L6: Cs tripletRH
L7: Cs tripletCH
L8: C triplet/Cs/H
L8: C triplet/Cd/H
L8: C triplet/Ct/H
L8: C triplet/Cb/H
L7: Cs tripletOH
L4: CstriradH
L5: Cdoublet H
L5: Cquartet H
L3: Cd H 319 0.080461 -0.066374 0.014535
L4: Cdnorad H 319 0.080461 -0.066374 0.014535
L5: Cd C/R/H 319 0.080461 -0.066374 0.014535
L6: Cd C/H2 210 0.075312 -0.063805 0.011806
L7: Cd Cds/H2 121 0.112224 -0.091421 0.021864
L7: Cd Cdd/H2 89 0.024935 -0.026114 -0.001921
L6: Cd C/C/H 109 0.090623 -0.071444 0.019920
L7: Cd Cds/Cs/H 30 0.065391 -0.049501 0.012283
L7: Cd Cds/Cd/H 53 0.133020 -0.104743 0.030985
L7: Cd Cds/Ct/H 26 0.031752 -0.027696 0.005788
L7: Cd Cds/Cb/H
L7: Cd Cdd/Cs/H
L7: Cd Cdd/Cd/H
L7: Cd Cdd/Ct/H
L7: Cd Cdd/Cb/H
L6: Cd C/O/H
L7: Cd Cds/O/H
L7: Cd Cdd/O/H
L5: Cd O/R/H
L6: Cd O/H2
L6: Cd O/C/H
L7: Cd O/Cs/H
L7: Cd O/Cd/H
L7: Cd O/Ct/H
L7: Cd O/Cb/H
L6: Cd O/O/H
L4: Cdrad H
L5: Cdrad C/H
L6: Cdrad Cds/H
L6: Cdrad Cdd/H
L5: Cdrad O/H
L3: Ct H 12 0.431401 -0.229422 0.108762
L3: Cb H 44 0.117925 -0.092504 0.026137
L2: O H 240 -0.118125 -0.018878 -0.139931
L3: OradH 15 -0.075951 -0.082791 -0.155556
L3: ORH 225 -0.121281 -0.014095 -0.138762
L4: OHH 14 0.070355 -0.189952 -0.151526
L4: OCH 52 -0.066073 -0.091363 -0.160225
L5: O/Cs/H 51 -0.064968 -0.091454 -0.159202
L5: O/Cd/H
L5: O/Ct/H
L5: O/Cb/H 1 -0.234679 -0.077629 -0.316172
L4: OOH 159 -0.156010 0.026510 -0.130609
L1: Y anyrad
L2: Hrad 70 -0.039495 -0.342287 -0.379190
L2: Orad 240 -0.017829 -0.118668 -0.139585
L3: OjR 225 -0.012844 -0.122031 -0.138432
L4: OjH 14 -0.193300 0.076164 -0.149391
L4: OjC 52 -0.089279 -0.068097 -0.160291
L5: OjCs 51 -0.089356 -0.066985 -0.159249
L5: OjCd
L5: OjCt
L5: OjCb 1 -0.077813 -0.234351 -0.316079
L4: OjO 159 0.027899 -0.156949 -0.130319
L3: O atom triplet 15 -0.083286 -0.074511 -0.154718
L2: Crad 1832 0.003965 0.029654 0.033918
L3: Cj 1794 0.006693 0.026902 0.033836
L4: Csj 1266 0.025234 0.008380 0.034738
L5: Cs methyl 69 -0.053769 0.075442 0.025074
L5: CsjRH2 684 -0.000760 0.025831 0.026698
L6: CsjCH2 631 -0.001101 0.025831 0.026859
L7: Csj/Cs/H2 357 -0.031137 0.050803 0.022282
L7: Csj/Cd/H2 154 0.055474 -0.016532 0.041386
L7: Csj/Ct/H2 36 0.028053 -0.002911 0.027161
L7: Csj/Cb/H2
L6: CsjOH2 53 0.003401 0.025833 0.024741
L5: CsjRRH 409 0.070627 -0.023160 0.048156
L6: CsjCCH 356 0.077902 -0.028338 0.050848
L7: Csj/Cs/Cs/H 73 -0.011458 0.028261 0.019955
L7: Csj/Cs/Cd/H 143 0.094158 -0.039761 0.055281
L7: Csj/Cs/Ct/H
L7: Csj/Cs/Cb/H
L7: Csj/Cd/Cd/H 99 0.124292 -0.055524 0.070403
L7: Csj/Cd/Ct/H
L7: Csj/Cd/Cb/H
L7: Csj/Ct/Ct/H
L7: Csj/Ct/Cb/H
L7: Csj/Cb/Cb/H
L6: CsjCOH 51 0.024012 0.010600 0.031199
L7: Csj/Cs/O/H 50 0.021572 0.012445 0.030539
L7: Csj/Cd/O/H 1 0.126136 -0.066634 0.058822
L7: Csj/Ct/O/H
L7: Csj/Cb/O/H
L6: CsjOOH 2 -0.017641 0.025276 0.008156
L5: CsjRRR 104 0.068516 -0.025639 0.040796
L6: CsjCCC 103 0.068715 -0.025301 0.041315
L7: Csj/Cs/Cs/Cs 15 0.013659 0.012220 0.028184
L7: Csj/Cs/Cs/Cd
L7: Csj/Cs/Cs/Ct
L7: Csj/Cs/Cs/Cb
L7: Csj/Cs/Cd/Cd 1 0.265941 -0.123578 0.145242
L7: Csj/Cs/Cd/Ct
L7: Csj/Cs/Cd/Cb
L7: Csj/Cs/Ct/Ct
L7: Csj/Cs/Ct/Cb
L7: Csj/Cs/Cb/Cb
L7: Csj/Cd/Cd/Cd
L7: Csj/Cd/Cd/Ct
L7: Csj/Cd/Cd/Cb
L7: Csj/Cd/Ct/Ct
L7: Csj/Cd/Ct/Cb
L7: Csj/Cd/Cb/Cb
L7: Csj/Ct/Ct/Ct
L7: Csj/Ct/Ct/Cb
L7: Csj/Ct/Cb/Cb
L7: Csj/Cb/Cb/Cb
L6: CsjCCO 1 0.012218 -0.121529 -0.105986
L7: Csj/Cs/Cs/O 1 0.012218 -0.121529 -0.105986
L7: Csj/Cs/Cd/O
L7: Csj/Cs/Ct/O
L7: Csj/Cs/Cb/O
L7: Csj/Cd/Cd/O
L7: Csj/Cd/Ct/O
L7: Csj/Cd/Cb/O
L7: Csj/Ct/Ct/O
L7: Csj/Ct/Cb/O
L7: Csj/Cb/Cb/O
L6: CsjCOO
L7: Csj/Cs/O/O
L7: Csj/Cd/O/O
L7: Csj/Ct/O/O
L7: Csj/Cb/O/O
L6: CsjOOO
L4: Cdj 319 -0.067745 0.082294 0.015358
L5: Cdj CR 319 -0.067745 0.082294 0.015358
L6: Cdj CH 210 -0.065277 0.077335 0.012609
L7: Cdj CdsH 121 -0.093854 0.115196 0.022945
L7: Cdj CddH 89 -0.026782 0.026331 -0.001313
L6: Cdj CC 109 -0.072645 0.092142 0.020817
L7: Cdj CdsCs 30 -0.052748 0.067639 0.013440
L7: Cdj CdsCd 53 -0.104889 0.134137 0.031903
L7: Cdj CdsCt 26 -0.028231 0.032656 0.006171
L7: Cdj CdsCb
L7: Cdj CddCs
L7: Cdj CddCd
L7: Cdj CddCt
L7: Cdj CddCb
L6: Cdj CO
L7: Cdj CdsO
L7: Cdj CddO
L5: Cdj OR
L6: Cdj OH
L6: Cdj OC
L7: Cdj OCs
L7: Cdj OCd
L7: Cdj OCt
L7: Cdj OCb
L6: Cdj OO
L4: Ctj 12 -0.230464 0.433568 0.110987
L5: CtjC 12 -0.230464 0.433568 0.110987
L4: Cbj 44 -0.095175 0.120179 0.026648
L3: Cjj 38 -0.118072 0.152750 0.037583
L4: Csjj 38 -0.118072 0.152750 0.037583
L5: Cs sing
L6: Cs singH2
L6: Cs singRH
L7: Cs singCH
L8: Cs sing/Cs/H
L8: Cs sing/Cd/H
L8: Cs sing/Ct/H
L8: Cs sing/Cb/H
L7: Cs singOH
L6: Cs singRR
L7: Cs singCC
L8: Cs sing/Cs/Cs
L8: Cs sing/Cs/Cd
L8: Cs sing/Cs/Ct
L8: Cs sing/Cs/Cb
L8: Cs sing/Cd/Cd
L8: Cs sing/Cd/Ct
L8: Cs sing/Cd/Cb
L8: Cs sing/Ct/Ct
L8: Cs sing/Ct/Cb
L8: Cs sing/Cb/Cb
L7: Cs singCO
L8: Cs sing/Cs/O
L8: Cs sing/Cd/O
L8: Cs sing/Ct/O
L8: Cs sing/Cb/O
L7: Cs singOO
L5: Cs trip 38 -0.118072 0.152750 0.037583
L6: Cs tripH2 38 -0.118072 0.152750 0.037583
L6: Cs tripRH
L7: Cs tripCH
L8: Cs trip/Cs/H
L8: Cs trip/Cd/H
L8: Cs trip/Ct/H
L8: Cs trip/Cb/H
L7: Cs tripOH
L6: Cs tripRR
L7: Cs tripCC
L8: Cs trip/Cs/Cs
L8: Cs trip/Cs/Cd
L8: Cs trip/Cs/Ct
L8: Cs trip/Cs/Cb
L8: Cs trip/Cd/Cd
L8: Cs trip/Cd/Ct
L8: Cs trip/Cd/Cb
L8: Cs trip/Ct/Ct
L8: Cs trip/Ct/Cb
L8: Cs trip/Cb/Cb
L7: Cs tripCO
L8: Cs trip/Cs/O
L8: Cs trip/Cd/O
L8: Cs trip/Ct/O
L8: Cs trip/Cb/O
L7: Cs tripOO
L4: Cdjj
L5: Cd singletR
L6: Cd singletC
L6: Cd singletO
L5: Cd tripletR
L6: Cd tripletC
L6: Cd tripletO
L3: Cjjj
L4: C doubletR
L4: C quartetR
L3: Cjjjj
L4: C quintet
L4: C triplet
B.6 List of test reactions
Table B.3: 1393 hydrogen abstraction reactions used to test the group estimates cou-pled with the automated transition state algorithm. The reactants andproducts are provided as SMILES strings. Transition states that werefound and validated are available in CML format.
Reactions Found
[CH2]C(C)C(=O)C(C)C + [O]O↔ CC(C)C(=O)C(C)C + [O][O] No
C[C](C)C(=O)C(C)C + [O]O↔ CC(C)C(=O)C(C)C + [O][O] No
[CH2]C(C)C(=O)C(C)C + CC(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C[C](C)C(=O)C(C)C No
[CH2]C(C)C(=O)C(C)C + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)C + CC(C)C(=O)C(C)(C)O[O] Yes
C[C](C)C(=O)C(C)C + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)C + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C(C)C(=O)C(C)C + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)C + CC(C)C(=O)C(C)CO[O] Yes
C[C](C)C(=O)C(C)C + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)C + CC(C)C(=O)C(C)CO[O] Yes
CC(C)C(=O)C(C)(C)O[O] + [O]O↔ CC(C)C(=O)C(C)(C)OO + [O][O] Yes
CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)CO[O]↔ CC(C)C(=O)C(C)COO + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)CO[O] + [O]O↔ CC(C)C(=O)C(C)COO + [O][O] Yes
CC(C)C(=O)C(C)(C)OO + [H]↔ [H][H] + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [H]↔ [H][H] + CC(C)C(=O)C(C)CO[O] Yes
CC(C)C(=O)C(C)C + [O]↔ [OH] + [CH2]C(C)C(=O)C(C)C No
CC(C)C(=O)C(C)C + [O]↔ [OH] + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + [O]↔ [OH] + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [O]↔ [OH] + CC(C)C(=O)C(C)CO[O] No
CC(C)C(=O)C(C)(C)OO + [OH]↔ O + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [OH]↔ O + CC(C)C(=O)C(C)CO[O] No
CC(C)C(=O)C(C)(C)OO + [O]O↔ OO + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [O]O↔ OO + CC(C)C(=O)C(C)CO[O] No
[CH2]C(C)C(=O)C(C)C + C#CC↔ CC(C)C(=O)C(C)C + C#C[CH2] No
CC(C)C(=O)C(C)C + C#C[CH2]↔ C#CC + C[C](C)C(=O)C(C)C No
CC(C)C(=O)C(C)(C)OO + C#C[CH2]↔ C#CC + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + C#C[CH2]↔ C#CC + CC(C)C(=O)C(C)CO[O] Yes
[H] + C#CC↔ [H][H] + C#C[CH2] Yes
[OH] + C#CC↔ O + C#C[CH2] Yes
[O]O + C#C[CH2]↔ C#CC + [O][O] No
OO + C#C[CH2]↔ C#CC + [O]O Yes
[CH2]C(C)C(=O)C(C)C + C=C=C↔ CC(C)C(=O)C(C)C + [CH]=C=C No
CC(C)C(=O)C(C)C + [CH]=C=C↔ C=C=C + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + [CH]=C=C↔ C=C=C + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [CH]=C=C↔ C=C=C + CC(C)C(=O)C(C)CO[O] No
[H] + C=C=C↔ [H][H] + [CH]=C=C No
[OH] + C=C=C↔ O + [CH]=C=C No
[O]O + [CH]=C=C↔ C=C=C + [O][O] No
OO + [CH]=C=C↔ C=C=C + [O]O No
[c]1ccccc1 + [O]O↔ c1ccccc1 + [O][O] No
[c]1ccccc1 + CC(C)C(=O)C(C)C↔ c1ccccc1 + C[C](C)C(=O)C(C)C No
[c]1ccccc1 + CC(C)C(=O)C(C)C↔ c1ccccc1 + [CH2]C(C)C(=O)C(C)C Yes
[c]1ccccc1 + CC(C)C(=O)C(C)(C)OO↔ c1ccccc1 + CC(C)C(=O)C(C)(C)O[O] Yes
[c]1ccccc1 + CC(C)C(=O)C(C)COO↔ c1ccccc1 + CC(C)C(=O)C(C)CO[O] No
[c]1ccccc1 + [H][H]↔ c1ccccc1 + [H] Yes
c1ccccc1 + [OH]↔ O + [c]1ccccc1 Yes
[c]1ccccc1 + OO↔ c1ccccc1 + [O]O No
[c]1ccccc1 + C#CC↔ c1ccccc1 + C#C[CH2] Yes
[c]1ccccc1 + C=C=C↔ c1ccccc1 + [CH]=C=C Yes
[CH2]C(C)C(=O)C(C)C + C=CC↔ CC(C)C(=O)C(C)C + [CH2]C=C No
C[C](C)C(=O)C(C)C + C=CC↔ CC(C)C(=O)C(C)C + [CH2]C=C Yes
CC(C)C(=O)C(C)(C)OO + [CH2]C=C↔ C=CC + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + [CH2]C=C↔ C=CC + CC(C)C(=O)C(C)CO[O] Yes
OO + [CH2]C=C↔ C=CC + [O]O Yes
[c]1ccccc1 + C=CC↔ c1ccccc1 + [CH2]C=C No
C=C1C=C[CH]C1 + [O]O↔ C=C1C=CCC1 + [O][O] Yes
C=C1[CH]CC=C1 + [O]O↔ C=C1C=CCC1 + [O][O] Yes
C=C1C=CCC1 + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C1[CH]CC=C1 No
C=C1C=CCC1 + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C1[CH]CC=C1 Yes
C=C1C=CCC1 + CC(C)C(=O)C(C)(C)O[O]↔ CC(C)C(=O)C(C)(C)OO + C=C1C=C[CH]C1 Yes
C=C1[CH]CC=C1 + CC(C)C(=O)C(C)(C)OO↔ C=C1C=CCC1 + CC(C)C(=O)C(C)(C)O[O] Yes
C=C1C=CCC1 + CC(C)C(=O)C(C)CO[O]↔ CC(C)C(=O)C(C)COO + C=C1C=C[CH]C1 Yes
C=C1[CH]CC=C1 + CC(C)C(=O)C(C)COO↔ C=C1C=CCC1 + CC(C)C(=O)C(C)CO[O] Yes
C=C1C=CCC1 + [H]↔ [H][H] + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [H]↔ [H][H] + C=C1[CH]CC=C1 Yes
C=C1C=CCC1 + [O]↔ [OH] + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [O]↔ [OH] + C=C1[CH]CC=C1 No
C=C1C=CCC1 + [OH]↔ O + C=C1C=C[CH]C1 No
C=C1C=CCC1 + [OH]↔ O + C=C1[CH]CC=C1 No
C=C1C=CCC1 + [O]O↔ OO + C=C1C=C[CH]C1 No
C=C1C=CCC1 + [O]O↔ OO + C=C1[CH]CC=C1 No
C=C1C=CCC1 + C#C[CH2]↔ C#CC + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + C#C[CH2]↔ C#CC + C=C1[CH]CC=C1 Yes
C=C1C=CCC1 + [CH]=C=C↔ C=C=C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH]=C=C↔ C=C=C + C=C1[CH]CC=C1 No
C=C1C=CCC1 + [c]1ccccc1↔ c1ccccc1 + C=C1C=C[CH]C1 No
C=C1C=CCC1 + [c]1ccccc1↔ c1ccccc1 + C=C1[CH]CC=C1 Yes
C=C1C=CCC1 + [CH2]C=C↔ C=CC + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH2]C=C↔ C=CC + C=C1[CH]CC=C1 Yes
C=C1[CH]C=CC1 + [O]O↔ C=C1CC=CC1 + [O][O] Yes
C=C1CC=CC1 + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C1[CH]C=CC1 Yes
C=C1CC=CC1 + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C1[CH]C=CC1 No
C=C1CC=CC1 + CC(C)C(=O)C(C)(C)O[O]↔ CC(C)C(=O)C(C)(C)OO + C=C1[CH]C=CC1 Yes
C=C1CC=CC1 + CC(C)C(=O)C(C)CO[O]↔ CC(C)C(=O)C(C)COO + C=C1[CH]C=CC1 No
C=C1CC=CC1 + [H]↔ [H][H] + C=C1[CH]C=CC1 No
C=C1CC=CC1 + [O]↔ [OH] + C=C1[CH]C=CC1 Yes
C=C1CC=CC1 + [OH]↔ O + C=C1[CH]C=CC1 No
C=C1CC=CC1 + [O]O↔ OO + C=C1[CH]C=CC1 No
C=C1CC=CC1 + C#C[CH2]↔ C#CC + C=C1[CH]C=CC1 No
C=C1CC=CC1 + [CH]=C=C↔ C=C=C + C=C1[CH]C=CC1 Yes
C=C1CC=CC1 + [c]1ccccc1↔ c1ccccc1 + C=C1[CH]C=CC1 No
C=C1CC=CC1 + [CH2]C=C↔ C=CC + C=C1[CH]C=CC1 Yes
C=C1C=CCC1 + C=C1[CH]CC=C1↔ C=C1C=CCC1 + C=C1C=C[CH]C1 No
C=C1CC=CC1 + C=C1[CH]CC=C1↔ C=C1C=CCC1 + C=C1[CH]C=CC1 No
C=C1C=C[CH]C1 + C=C1CC=CC1↔ C=C1C=CCC1 + C=C1[CH]C=CC1 No
CC(C)C(=O)C(C)C + [CH]=C↔ C=C + [CH2]C(C)C(=O)C(C)C No
CC(C)C(=O)C(C)C + [CH]=C↔ C=C + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + [CH]=C↔ C=C + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + [CH]=C↔ C=C + CC(C)C(=O)C(C)CO[O] No
[H][H] + [CH]=C↔ C=C + [H] Yes
[OH] + C=C↔ O + [CH]=C Yes
[O]O + [CH]=C↔ C=C + [O][O] Yes
OO + [CH]=C↔ C=C + [O]O No
[c]1ccccc1 + C=C↔ c1ccccc1 + [CH]=C No
C=C1C=CCC1 + [CH]=C↔ C=C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH]=C↔ C=C + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + [CH]=C↔ C=C + C=C1[CH]C=CC1 Yes
C=C=C + [O]↔ [OH] + [CH]=C=C Yes
C=C=C + C#C[CH2]↔ C#CC + [CH]=C=C Yes
[CH]=C=C + C=CC↔ C=C=C + [CH2]C=C No
C=C=C + [CH]=C↔ C=C + [CH]=C=C No
C#CC + [O]↔ [OH] + C#C[CH2] Yes
C#C[CH2] + C=CC↔ C#CC + [CH2]C=C Yes
C#CC + [CH]=C↔ C=C + C#C[CH2] No
[C]#C + [O]O↔ C#C + [O][O] Yes
[C]#C + CC(C)C(=O)C(C)C↔ C#C + C[C](C)C(=O)C(C)C No
[C]#C + CC(C)C(=O)C(C)C↔ C#C + [CH2]C(C)C(=O)C(C)C No
[C]#C + CC(C)C(=O)C(C)(C)OO↔ C#C + CC(C)C(=O)C(C)(C)O[O] No
[C]#C + CC(C)C(=O)C(C)COO↔ C#C + CC(C)C(=O)C(C)CO[O] No
[C]#C + [H][H]↔ C#C + [H] No
[C]#C + O↔ C#C + [OH] No
[C]#C + OO↔ C#C + [O]O No
[C]#C + C#CC↔ C#C + C#C[CH2] No
[C]#C + C=C=C↔ C#C + [CH]=C=C No
[C]#C + c1ccccc1↔ C#C + [c]1ccccc1 No
[C]#C + C=CC↔ C#C + [CH2]C=C Yes
[C]#C + C=C1C=CCC1↔ C#C + C=C1[CH]CC=C1 No
[C]#C + C=C1CC=CC1↔ C#C + C=C1[CH]C=CC1 Yes
[C]#C + C=C1C=CCC1↔ C#C + C=C1C=C[CH]C1 No
[C]#C + C=C↔ C#C + [CH]=C No
CC(C)C(=O)C(C)C + [CH]=CC↔ C=CC + [CH2]C(C)C(=O)C(C)C No
CC(C)C(=O)C(C)C + [CH]=CC↔ C=CC + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + [CH]=CC↔ C=CC + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [CH]=CC↔ C=CC + CC(C)C(=O)C(C)CO[O] Yes
[O]O + [CH]=CC↔ C=CC + [O][O] No
OO + [CH]=CC↔ C=CC + [O]O No
[c]1ccccc1 + C=CC↔ c1ccccc1 + [CH]=CC No
C=C1C=CCC1 + [CH]=CC↔ C=CC + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH]=CC↔ C=CC + C=C1[CH]CC=C1 No
C=C1CC=CC1 + [CH]=CC↔ C=CC + C=C1[CH]C=CC1 No
C=C=C + [CH]=CC↔ C=CC + [CH]=C=C No
C#CC + [CH]=CC↔ C=CC + C#C[CH2] Yes
[C]#C + C=CC↔ C#C + [CH]=CC No
CC(C)C(=O)C(C)C + C=[C]C↔ C=CC + [CH2]C(C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)C + C=[C]C↔ C=CC + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + C=[C]C↔ C=CC + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + C=[C]C↔ C=CC + CC(C)C(=O)C(C)CO[O] No
[O]O + C=[C]C↔ C=CC + [O][O] No
OO + C=[C]C↔ C=CC + [O]O No
[c]1ccccc1 + C=CC↔ c1ccccc1 + C=[C]C No
C=C1C=CCC1 + C=[C]C↔ C=CC + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + C=[C]C↔ C=CC + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + C=[C]C↔ C=CC + C=C1[CH]C=CC1 No
C=C=C + C=[C]C↔ C=CC + [CH]=C=C Yes
C#CC + C=[C]C↔ C=CC + C#C[CH2] Yes
[C]#C + C=CC↔ C#C + C=[C]C No
CC(C)C(=O)C(C)(C)OO + [CH3]↔ C + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + [CH3]↔ C + CC(C)C(=O)C(C)CO[O] Yes
[H] + C↔ [H][H] + [CH3] No
[OH] + C↔ O + [CH3] Yes
[O]O + [CH3]↔ C + [O][O] No
OO + [CH3]↔ C + [O]O No
[c]1ccccc1 + C↔ c1ccccc1 + [CH3] No
C=C1C=CCC1 + [CH3]↔ C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH3]↔ C + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + [CH3]↔ C + C=C1[CH]C=CC1 Yes
[C]#C + C↔ C#C + [CH3] Yes
CC(C)C(=O)C(C)C + [CH2]CC↔ CCC + [CH2]C(C)C(=O)C(C)C No
CC(C)C(=O)C(C)C + [CH2]CC↔ CCC + C[C](C)C(=O)C(C)C No
CC(C)C(=O)C(C)(C)OO + [CH2]CC↔ CCC + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + [CH2]CC↔ CCC + CC(C)C(=O)C(C)CO[O] Yes
OO + [CH2]CC↔ CCC + [O]O Yes
[c]1ccccc1 + CCC↔ c1ccccc1 + [CH2]CC No
C=C1C=CCC1 + [CH2]CC↔ CCC + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH2]CC↔ CCC + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + [CH2]CC↔ CCC + C=C1[CH]C=CC1 Yes
C=C=C + [CH2]CC↔ CCC + [CH]=C=C No
C#CC + [CH2]CC↔ CCC + C#C[CH2] Yes
[C]#C + CCC↔ C#C + [CH2]CC Yes
[CH2]C(C)C(=O)C(C)C + CCC↔ CC(C)C(=O)C(C)C + C[CH]C No
CC(C)C(=O)C(C)C + C[CH]C↔ CCC + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + C[CH]C↔ CCC + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + C[CH]C↔ CCC + CC(C)C(=O)C(C)CO[O] Yes
OO + C[CH]C↔ CCC + [O]O Yes
[c]1ccccc1 + CCC↔ c1ccccc1 + C[CH]C No
C=C1C=CCC1 + C[CH]C↔ CCC + C=C1C=C[CH]C1 No
C=C1C=CCC1 + C[CH]C↔ CCC + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + C[CH]C↔ CCC + C=C1[CH]C=CC1 Yes
C=C=C + C[CH]C↔ CCC + [CH]=C=C Yes
C#CC + C[CH]C↔ CCC + C#C[CH2] Yes
[C]#C + CCC↔ C#C + C[CH]C Yes
C=CC + [O]↔ [OH] + [CH2]C=C No
[CH]=CC + C=CC↔ C=CC + [CH2]C=C No
C=[C]C + C=CC↔ C=CC + [CH2]C=C No
[CH]=CC + C=C↔ C=CC + [CH]=C No
C=CC + [CH]=C↔ C=C + C=[C]C Yes
C=CC + [CH]=C↔ C=C + [CH2]C=C Yes
C=CC + [CH]=CC↔ C=CC + C=[C]C Yes
[CH]=CC + CCC↔ C=CC + [CH2]CC Yes
C=[C]C + CCC↔ C=CC + [CH2]CC Yes
C=CC + [CH2]CC↔ CCC + [CH2]C=C Yes
[CH]=CC + CCC↔ C=CC + C[CH]C Yes
C=[C]C + CCC↔ C=CC + C[CH]C Yes
C=CC + C[CH]C↔ CCC + [CH2]C=C Yes
[CH]=C + C↔ C=C + [CH3] Yes
[CH]=C + CCC↔ C=C + [CH2]CC Yes
[CH]=C + CCC↔ C=C + C[CH]C Yes
[CH]=C=O + [O]O↔ C=C=O + [O][O] Yes
[CH]=C=O + CC(C)C(=O)C(C)C↔ C=C=O + C[C](C)C(=O)C(C)C No
[CH]=C=O + CC(C)C(=O)C(C)C↔ C=C=O + [CH2]C(C)C(=O)C(C)C Yes
[CH]=C=O + CC(C)C(=O)C(C)(C)OO↔ C=C=O + CC(C)C(=O)C(C)(C)O[O] Yes
[CH]=C=O + CC(C)C(=O)C(C)COO↔ C=C=O + CC(C)C(=O)C(C)CO[O] Yes
[CH]=C=O + [H][H]↔ C=C=O + [H] Yes
C=C=O + [OH]↔ O + [CH]=C=O Yes
[CH]=C=O + OO↔ C=C=O + [O]O Yes
[CH]=C=O + C#CC↔ C=C=O + C#C[CH2] Yes
[CH]=C=O + C=C=C↔ C=C=O + [CH]=C=C Yes
C=C=O + [c]1ccccc1↔ c1ccccc1 + [CH]=C=O No
[CH]=C=O + C=CC↔ C=C=O + [CH2]C=C Yes
[CH]=C=O + C=C1C=CCC1↔ C=C=O + C=C1[CH]CC=C1 Yes
[CH]=C=O + C=C1CC=CC1↔ C=C=O + C=C1[CH]C=CC1 Yes
[CH]=C=O + C=C1C=CCC1↔ C=C=O + C=C1C=C[CH]C1 Yes
C=C=O + [CH]=C↔ C=C + [CH]=C=O Yes
C=C=O + [CH]=CC↔ C=CC + [CH]=C=O Yes
C=C=O + C=[C]C↔ C=CC + [CH]=C=O Yes
[CH]=C=O + C↔ C=C=O + [CH3] Yes
[CH]=C=O + CCC↔ C=C=O + [CH2]CC Yes
[CH]=C=O + CCC↔ C=C=O + C[CH]C Yes
[CH]=O + [O]O↔ C=O + [O][O] Yes
C=O + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH]=O Yes
C=O + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH]=O Yes
[CH]=O + CC(C)C(=O)C(C)(C)OO↔ C=O + CC(C)C(=O)C(C)(C)O[O] Yes
[CH]=O + CC(C)C(=O)C(C)COO↔ C=O + CC(C)C(=O)C(C)CO[O] Yes
C=O + [H]↔ [H][H] + [CH]=O Yes
C=O + [O]↔ [OH] + [CH]=O Yes
C=O + [OH]↔ O + [CH]=O No
[CH]=O + OO↔ C=O + [O]O No
C=O + C#C[CH2]↔ C#CC + [CH]=O Yes
C=O + [CH]=C=C↔ C=C=C + [CH]=O Yes
C=O + [c]1ccccc1↔ c1ccccc1 + [CH]=O No
C=O + [CH2]C=C↔ C=CC + [CH]=O Yes
[CH]=O + C=C1C=CCC1↔ C=O + C=C1[CH]CC=C1 Yes
[CH]=O + C=C1CC=CC1↔ C=O + C=C1[CH]C=CC1 Yes
[CH]=O + C=C1C=CCC1↔ C=O + C=C1C=C[CH]C1 Yes
C=O + [CH]=C↔ C=C + [CH]=O Yes
C=O + [CH]=CC↔ C=CC + [CH]=O Yes
C=O + C=[C]C↔ C=CC + [CH]=O Yes
C=O + [CH3]↔ C + [CH]=O Yes
C=O + [CH2]CC↔ CCC + [CH]=O Yes
C=O + C[CH]C↔ CCC + [CH]=O Yes
[CH2]O + [O]O↔ CO + [O][O] Yes
[CH2]O + CC(C)C(=O)C(C)C↔ CO + C[C](C)C(=O)C(C)C No
CO + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]O Yes
[CH2]O + CC(C)C(=O)C(C)(C)OO↔ CO + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]O + CC(C)C(=O)C(C)COO↔ CO + CC(C)C(=O)C(C)CO[O] Yes
CO + [H]↔ [H][H] + [CH2]O Yes
CO + [O]↔ [OH] + [CH2]O Yes
CO + [OH]↔ O + [CH2]O Yes
[CH2]O + C#CC↔ CO + C#C[CH2] No
[CH2]O + C=C=C↔ CO + [CH]=C=C Yes
CO + [c]1ccccc1↔ c1ccccc1 + [CH2]O Yes
[CH2]O + C=CC↔ CO + [CH2]C=C Yes
[CH2]O + C=C1C=CCC1↔ CO + C=C1[CH]CC=C1 Yes
[CH2]O + C=C1CC=CC1↔ CO + C=C1[CH]C=CC1 Yes
[CH2]O + C=C1C=CCC1↔ CO + C=C1C=C[CH]C1 Yes
CO + [CH]=C↔ C=C + [CH2]O Yes
CO + [CH]=CC↔ C=CC + [CH2]O Yes
CO + C=[C]C↔ C=CC + [CH2]O Yes
CO + [CH3]↔ C + [CH2]O Yes
CO + [CH2]CC↔ CCC + [CH2]O Yes
CO + C[CH]C↔ CCC + [CH2]O Yes
CC(C)C(=O)C(C)C + [CH2]↔ [CH3] + [CH2]C(C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)C + [CH2]↔ [CH3] + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + [CH2]↔ [CH3] + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + [CH2]↔ [CH3] + CC(C)C(=O)C(C)CO[O] Yes
[H][H] + [CH2]↔ [CH3] + [H] Yes
[O]O + [CH2]↔ [CH3] + [O][O] Yes
OO + [CH2]↔ [CH3] + [O]O No
C=C1C=CCC1 + [CH2]↔ [CH3] + C=C1C=C[CH]C1 No
C=C1C=CCC1 + [CH2]↔ [CH3] + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + [CH2]↔ [CH3] + C=C1[CH]C=CC1 Yes
C=C=C + [CH2]↔ [CH3] + [CH]=C=C Yes
C#CC + [CH2]↔ [CH3] + C#C[CH2] Yes
C=CC + [CH2]↔ [CH3] + C=[C]C Yes
C=CC + [CH2]↔ [CH3] + [CH2]C=C Yes
C=C + [CH2]↔ [CH3] + [CH]=C Yes
C=C=O + [CH2]↔ [CH3] + [CH]=C=O Yes
C=O + [CH2]↔ [CH3] + [CH]=O No
CO + [CH2]↔ [CH3] + [CH2]O Yes
C + [O]↔ [OH] + [CH3] No
C + [CH2]↔ [CH3] + [CH3] Yes
[CH2]CO + [O]O↔ CCO + [O][O] No
C[CH]O + [O]O↔ CCO + [O][O] No
CC[O] + [O]O↔ CCO + [O][O] No
[CH2]CO + CC(C)C(=O)C(C)C↔ CCO + C[C](C)C(=O)C(C)C No
C[CH]O + CC(C)C(=O)C(C)C↔ CCO + C[C](C)C(=O)C(C)C Yes
CC[O] + CC(C)C(=O)C(C)C↔ CCO + C[C](C)C(=O)C(C)C No
[CH2]CO + CC(C)C(=O)C(C)C↔ CCO + [CH2]C(C)C(=O)C(C)C No
CCO + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C[CH]O Yes
CC[O] + CC(C)C(=O)C(C)C↔ CCO + [CH2]C(C)C(=O)C(C)C Yes
[CH2]CO + CC(C)C(=O)C(C)(C)OO↔ CCO + CC(C)C(=O)C(C)(C)O[O] Yes
C[CH]O + CC(C)C(=O)C(C)(C)OO↔ CCO + CC(C)C(=O)C(C)(C)O[O] Yes
CC[O] + CC(C)C(=O)C(C)(C)OO↔ CCO + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]CO + CC(C)C(=O)C(C)COO↔ CCO + CC(C)C(=O)C(C)CO[O] No
C[CH]O + CC(C)C(=O)C(C)COO↔ CCO + CC(C)C(=O)C(C)CO[O] Yes
CC[O] + CC(C)C(=O)C(C)COO↔ CCO + CC(C)C(=O)C(C)CO[O] Yes
CCO + [O]↔ [OH] + [CH2]CO Yes
CCO + [O]↔ [OH] + C[CH]O Yes
CCO + [OH]↔ O + [CH2]CO No
CCO + [OH]↔ O + CC[O] No
[CH2]CO + OO↔ CCO + [O]O No
CC[O] + OO↔ CCO + [O]O Yes
[CH2]CO + C#CC↔ CCO + C#C[CH2] Yes
C[CH]O + C#CC↔ CCO + C#C[CH2] Yes
CC[O] + C#CC↔ CCO + C#C[CH2] Yes
[CH2]CO + C=C=C↔ CCO + [CH]=C=C Yes
C[CH]O + C=C=C↔ CCO + [CH]=C=C Yes
CC[O] + C=C=C↔ CCO + [CH]=C=C Yes
CCO + [c]1ccccc1↔ c1ccccc1 + [CH2]CO Yes
CCO + [c]1ccccc1↔ c1ccccc1 + C[CH]O Yes
CCO + [c]1ccccc1↔ c1ccccc1 + CC[O] Yes
[CH2]CO + C=CC↔ CCO + [CH2]C=C No
C[CH]O + C=CC↔ CCO + [CH2]C=C Yes
CC[O] + C=CC↔ CCO + [CH2]C=C Yes
[CH2]CO + C=C1C=CCC1↔ CCO + C=C1[CH]CC=C1 No
C[CH]O + C=C1C=CCC1↔ CCO + C=C1[CH]CC=C1 Yes
CC[O] + C=C1C=CCC1↔ CCO + C=C1[CH]CC=C1 Yes
[CH2]CO + C=C1CC=CC1↔ CCO + C=C1[CH]C=CC1 No
C[CH]O + C=C1CC=CC1↔ CCO + C=C1[CH]C=CC1 Yes
CC[O] + C=C1CC=CC1↔ CCO + C=C1[CH]C=CC1 Yes
[CH2]CO + C=C1C=CCC1↔ CCO + C=C1C=C[CH]C1 Yes
C[CH]O + C=C1C=CCC1↔ CCO + C=C1C=C[CH]C1 Yes
CC[O] + C=C1C=CCC1↔ CCO + C=C1C=C[CH]C1 Yes
CCO + [CH]=C↔ C=C + [CH2]CO Yes
CCO + [CH]=C↔ C=C + C[CH]O Yes
CCO + [CH]=C↔ C=C + CC[O] No
CCO + [CH]=CC↔ C=CC + [CH2]CO Yes
CCO + [CH]=CC↔ C=CC + C[CH]O Yes
CCO + [CH]=CC↔ C=CC + CC[O] No
CCO + C=[C]C↔ C=CC + [CH2]CO No
CCO + C=[C]C↔ C=CC + C[CH]O Yes
CCO + C=[C]C↔ C=CC + CC[O] Yes
[CH2]CO + CCC↔ CCO + [CH2]CC No
CCO + [CH2]CC↔ CCC + C[CH]O Yes
CC[O] + CCC↔ CCO + [CH2]CC Yes
[CH2]CO + CCC↔ CCO + C[CH]C Yes
CCO + C[CH]C↔ CCC + C[CH]O Yes
CC[O] + CCC↔ CCO + C[CH]C Yes
CCO + [CH2]↔ [CH3] + [CH2]CO No
CCO + [CH2]↔ [CH3] + C[CH]O Yes
CCO + [CH2]↔ [CH3] + CC[O] Yes
[C]#C + CCO↔ C#C + C[CH]O No
[CH]=C=O + CCO↔ C=C=O + C[CH]O No
C=O + C[CH]O↔ CCO + [CH]=O No
[CH2]O + CCO↔ CO + C[CH]O Yes
[CH2]CO + CCO↔ CCO + C[CH]O Yes
CC[O] + CCO↔ CCO + C[CH]O Yes
[C]#C + CCO↔ C#C + [CH2]CO Yes
[CH]=C=O + CCO↔ C=C=O + [CH2]CO Yes
C=O + [CH2]CO↔ CCO + [CH]=O Yes
CO + [CH2]CO↔ CCO + [CH2]O Yes
CC[O] + CCO↔ CCO + [CH2]CO Yes
[C]#C + CCO↔ C#C + CC[O] Yes
[CH]=C=O + CCO↔ C=C=O + CC[O] No
C=O + CC[O]↔ CCO + [CH]=O Yes
CO + CC[O]↔ CCO + [CH2]O Yes
[CH2]C=O + CC(C)C(=O)C(C)C↔ CC=O + C[C](C)C(=O)C(C)C No
C[C]=O + CC(C)C(=O)C(C)C↔ CC=O + C[C](C)C(=O)C(C)C Yes
CC=O + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C=O Yes
CC=O + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C[C]=O Yes
[CH2]C=O + CC(C)C(=O)C(C)(C)OO↔ CC=O + CC(C)C(=O)C(C)(C)O[O] Yes
C[C]=O + CC(C)C(=O)C(C)(C)OO↔ CC=O + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C=O + CC(C)C(=O)C(C)COO↔ CC=O + CC(C)C(=O)C(C)CO[O] Yes
C[C]=O + CC(C)C(=O)C(C)COO↔ CC=O + CC(C)C(=O)C(C)CO[O] Yes
CC=O + [H]↔ [H][H] + [CH2]C=O No
CC=O + [O]↔ [OH] + [CH2]C=O Yes
[CH2]C=O + C#CC↔ CC=O + C#C[CH2] Yes
CC=O + C#C[CH2]↔ C#CC + C[C]=O Yes
[CH2]C=O + C=C=C↔ CC=O + [CH]=C=C Yes
CC=O + [CH]=C=C↔ C=C=C + C[C]=O No
CC=O + [c]1ccccc1↔ c1ccccc1 + [CH2]C=O No
CC=O + [c]1ccccc1↔ c1ccccc1 + C[C]=O Yes
[CH2]C=O + C=CC↔ CC=O + [CH2]C=C Yes
C[C]=O + C=CC↔ CC=O + [CH2]C=C Yes
[CH2]C=O + C=C1C=CCC1↔ CC=O + C=C1[CH]CC=C1 Yes
C[C]=O + C=C1C=CCC1↔ CC=O + C=C1[CH]CC=C1 Yes
[CH2]C=O + C=C1CC=CC1↔ CC=O + C=C1[CH]C=CC1 Yes
C[C]=O + C=C1CC=CC1↔ CC=O + C=C1[CH]C=CC1 Yes
[CH2]C=O + C=C1C=CCC1↔ CC=O + C=C1C=C[CH]C1 Yes
C[C]=O + C=C1C=CCC1↔ CC=O + C=C1C=C[CH]C1 Yes
CC=O + [CH]=C↔ C=C + [CH2]C=O Yes
CC=O + [CH]=C↔ C=C + C[C]=O No
CC=O + [CH]=CC↔ C=CC + [CH2]C=O Yes
CC=O + [CH]=CC↔ C=CC + C[C]=O Yes
CC=O + C=[C]C↔ C=CC + [CH2]C=O Yes
CC=O + C=[C]C↔ C=CC + C[C]=O Yes
CC=O + [CH2]CC↔ CCC + [CH2]C=O Yes
CC=O + [CH2]CC↔ CCC + C[C]=O Yes
CC=O + C[CH]C↔ CCC + [CH2]C=O Yes
CC=O + C[CH]C↔ CCC + C[C]=O Yes
CC=O + [CH2]↔ [CH3] + [CH2]C=O Yes
CC=O + [CH2]↔ [CH3] + C[C]=O Yes
[CH2]C=O + CCO↔ CC=O + C[CH]O No
CC=O + C[CH]O↔ CCO + C[C]=O Yes
CC=O + [CH2]CO↔ CCO + [CH2]C=O Yes
CC=O + [CH2]CO↔ CCO + C[C]=O Yes
CC=O + CC[O]↔ CCO + [CH2]C=O Yes
CC=O + CC[O]↔ CCO + C[C]=O Yes
[C]#C + CC=O↔ C#C + C[C]=O Yes
[CH]=C=O + CC=O↔ C=C=O + C[C]=O No
C=O + C[C]=O↔ CC=O + [CH]=O No
[CH2]O + CC=O↔ CO + C[C]=O Yes
[CH2]C=O + CC=O↔ CC=O + C[C]=O Yes
[C]#C + CC=O↔ C#C + [CH2]C=O Yes
[CH]=C=O + CC=O↔ C=C=O + [CH2]C=O No
C=O + [CH2]C=O↔ CC=O + [CH]=O Yes
CO + [CH2]C=O↔ CC=O + [CH2]O Yes
[CH2]C(C)C(=O)C(C)C + CC↔ CC(C)C(=O)C(C)C + C[CH2] Yes
CC(C)C(=O)C(C)C + C[CH2]↔ CC + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + C[CH2]↔ CC + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + C[CH2]↔ CC + CC(C)C(=O)C(C)CO[O] Yes
[H] + CC↔ [H][H] + C[CH2] Yes
[OH] + CC↔ O + C[CH2] Yes
OO + C[CH2]↔ CC + [O]O No
[c]1ccccc1 + CC↔ c1ccccc1 + C[CH2] No
C=C1C=CCC1 + C[CH2]↔ CC + C=C1C=C[CH]C1 No
C=C1C=CCC1 + C[CH2]↔ CC + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + C[CH2]↔ CC + C=C1[CH]C=CC1 Yes
C=C=C + C[CH2]↔ CC + [CH]=C=C Yes
C#CC + C[CH2]↔ CC + C#C[CH2] Yes
[C]#C + CC↔ C#C + C[CH2] Yes
[CH]=CC + CC↔ C=CC + C[CH2] No
C=[C]C + CC↔ C=CC + C[CH2] Yes
C=CC + C[CH2]↔ CC + [CH2]C=C Yes
[CH]=C + CC↔ C=C + C[CH2] Yes
[CH]=C=O + CC↔ C=C=O + C[CH2] Yes
C=O + C[CH2]↔ CC + [CH]=O No
CO + C[CH2]↔ CC + [CH2]O Yes
[CH3] + CC↔ C + C[CH2] Yes
[CH2]CO + CC↔ CCO + C[CH2] Yes
CCO + C[CH2]↔ CC + C[CH]O Yes
CC[O] + CC↔ CCO + C[CH2] Yes
CC=O + C[CH2]↔ CC + [CH2]C=O Yes
CC=O + C[CH2]↔ CC + C[C]=O Yes
CC + [O]↔ [OH] + C[CH2] Yes
CC + [CH2]CC↔ CCC + C[CH2] Yes
C[CH2] + CCC↔ CC + C[CH]C Yes
CC + [CH2]↔ [CH3] + C[CH2] Yes
[C]#C + C=C=O↔ C#C + [CH]=C=O Yes
C=O + [CH]=C=O↔ C=C=O + [CH]=O No
CO + [CH]=C=O↔ C=C=O + [CH2]O Yes
[C]#C + CO↔ C#C + [CH2]O No
C=O + [CH2]O↔ CO + [CH]=O No
C=O + [C]#C↔ C#C + [CH]=O No
CCC + [O]↔ [OH] + [CH2]CC No
CCC + [O]↔ [OH] + C[CH]C Yes
CCC + [CH2]CC↔ CCC + C[CH]C Yes
CCC + [CH2]↔ [CH3] + [CH2]CC Yes
CCC + [CH2]↔ [CH3] + C[CH]C Yes
[CH2]C=C=O + [O]O↔ CC=C=O + [O][O] Yes
[CH2]C=C=O + CC(C)C(=O)C(C)C↔ CC=C=O + C[C](C)C(=O)C(C)C Yes
CC=C=O + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C=C=O Yes
[CH2]C=C=O + CC(C)C(=O)C(C)(C)OO↔ CC=C=O + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C=C=O + CC(C)C(=O)C(C)COO↔ CC=C=O + CC(C)C(=O)C(C)CO[O] Yes
[CH2]C=C=O + OO↔ CC=C=O + [O]O Yes
CC=C=O + C#C[CH2]↔ C#CC + [CH2]C=C=O Yes
[CH2]C=C=O + C=C=C↔ CC=C=O + [CH]=C=C Yes
CC=C=O + [c]1ccccc1↔ c1ccccc1 + [CH2]C=C=O Yes
[CH2]C=C=O + C=CC↔ CC=C=O + [CH2]C=C Yes
[CH2]C=C=O + C=C1C=CCC1↔ CC=C=O + C=C1[CH]CC=C1 Yes
[CH2]C=C=O + C=C1CC=CC1↔ CC=C=O + C=C1[CH]C=CC1 Yes
[CH2]C=C=O + C=C1C=CCC1↔ CC=C=O + C=C1C=C[CH]C1 Yes
CC=C=O + [CH]=C↔ C=C + [CH2]C=C=O Yes
CC=C=O + [CH]=CC↔ C=CC + [CH2]C=C=O No
CC=C=O + C=[C]C↔ C=CC + [CH2]C=C=O No
CC=C=O + [CH3]↔ C + [CH2]C=C=O Yes
CC=C=O + [CH2]CC↔ CCC + [CH2]C=C=O No
CC=C=O + C[CH]C↔ CCC + [CH2]C=C=O Yes
CC=C=O + [CH2]↔ [CH3] + [CH2]C=C=O Yes
CC=C=O + C[CH]O↔ CCO + [CH2]C=C=O No
CC=C=O + [CH2]CO↔ CCO + [CH2]C=C=O Yes
CC=C=O + CC[O]↔ CCO + [CH2]C=C=O Yes
[CH2]C=C=O + CC=O↔ CC=C=O + C[C]=O No
CC=C=O + [CH2]C=O↔ CC=O + [CH2]C=C=O Yes
CC=C=O + C[CH2]↔ CC + [CH2]C=C=O Yes
CC=C=O + [CH]=C=O↔ C=C=O + [CH2]C=C=O Yes
CC=C=O + [CH2]O↔ CO + [CH2]C=C=O No
CC=C=O + [C]#C↔ C#C + [CH2]C=C=O Yes
[CH2]C=C=O + C=O↔ CC=C=O + [CH]=O No
[CH2]C(C)C(=O)C(C)C + C=CC=O↔ CC(C)C(=O)C(C)C + C=C[C]=O No
CC(C)C(=O)C(C)C + C=C[C]=O↔ C=CC=O + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + C=C[C]=O↔ C=CC=O + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + C=C[C]=O↔ C=CC=O + CC(C)C(=O)C(C)CO[O] Yes
OO + C=C[C]=O↔ C=CC=O + [O]O Yes
[c]1ccccc1 + C=CC=O↔ c1ccccc1 + C=C[C]=O No
C=C1C=CCC1 + C=C[C]=O↔ C=CC=O + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + C=C[C]=O↔ C=CC=O + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + C=C[C]=O↔ C=CC=O + C=C1[CH]C=CC1 Yes
[CH]=C=C + C=CC=O↔ C=C=C + C=C[C]=O Yes
C#C[CH2] + C=CC=O↔ C#CC + C=C[C]=O No
[C]#C + C=CC=O↔ C#C + C=C[C]=O No
[CH]=CC + C=CC=O↔ C=CC + C=C[C]=O No
C=[C]C + C=CC=O↔ C=CC + C=C[C]=O No
C=CC + C=C[C]=O↔ C=CC=O + [CH2]C=C Yes
[CH]=C + C=CC=O↔ C=C + C=C[C]=O Yes
[CH]=C=O + C=CC=O↔ C=C=O + C=C[C]=O No
C=O + C=C[C]=O↔ C=CC=O + [CH]=O Yes
[CH2]O + C=CC=O↔ CO + C=C[C]=O Yes
[CH3] + C=CC=O↔ C + C=C[C]=O Yes
[CH2]CO + C=CC=O↔ CCO + C=C[C]=O Yes
C[CH]O + C=CC=O↔ CCO + C=C[C]=O Yes
CC[O] + C=CC=O↔ CCO + C=C[C]=O Yes
[CH2]C=O + C=CC=O↔ CC=O + C=C[C]=O Yes
CC=O + C=C[C]=O↔ C=CC=O + C[C]=O Yes
C[CH2] + C=CC=O↔ CC + C=C[C]=O Yes
[CH2]CC + C=CC=O↔ CCC + C=C[C]=O No
C[CH]C + C=CC=O↔ CCC + C=C[C]=O Yes
[CH2]C=C=O + C=CC=O↔ CC=C=O + C=C[C]=O Yes
C=CC=O + [CH2]↔ [CH3] + C=C[C]=O Yes
CC(C)C(=O)C(C)C + [CH]=CC#C↔ C#CC=C + [CH2]C(C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)C + [CH]=CC#C↔ C#CC=C + C[C](C)C(=O)C(C)C No
CC(C)C(=O)C(C)(C)OO + [CH]=CC#C↔ C#CC=C + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + [CH]=CC#C↔ C#CC=C + CC(C)C(=O)C(C)CO[O] Yes
[H][H] + [CH]=CC#C↔ C#CC=C + [H] Yes
[OH] + C#CC=C↔ O + [CH]=CC#C Yes
[O]O + [CH]=CC#C↔ C#CC=C + [O][O] No
OO + [CH]=CC#C↔ C#CC=C + [O]O No
c1ccccc1 + [CH]=CC#C↔ C#CC=C + [c]1ccccc1 No
C=C1C=CCC1 + [CH]=CC#C↔ C#CC=C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + [CH]=CC#C↔ C#CC=C + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + [CH]=CC#C↔ C#CC=C + C=C1[CH]C=CC1 No
C=C=C + [CH]=CC#C↔ C#CC=C + [CH]=C=C Yes
C#CC + [CH]=CC#C↔ C#CC=C + C#C[CH2] Yes
[C]#C + C#CC=C↔ C#C + [CH]=CC#C Yes
C=CC + [CH]=CC#C↔ C#CC=C + [CH]=CC Yes
C=CC + [CH]=CC#C↔ C#CC=C + C=[C]C Yes
C=CC + [CH]=CC#C↔ C#CC=C + [CH2]C=C Yes
C=C + [CH]=CC#C↔ C#CC=C + [CH]=C No
C=C=O + [CH]=CC#C↔ C#CC=C + [CH]=C=O Yes
C=O + [CH]=CC#C↔ C#CC=C + [CH]=O Yes
CO + [CH]=CC#C↔ C#CC=C + [CH2]O No
C + [CH]=CC#C↔ C#CC=C + [CH3] No
CCO + [CH]=CC#C↔ C#CC=C + [CH2]CO Yes
CCO + [CH]=CC#C↔ C#CC=C + C[CH]O Yes
CCO + [CH]=CC#C↔ C#CC=C + CC[O] Yes
CC=O + [CH]=CC#C↔ C#CC=C + [CH2]C=O Yes
CC=O + [CH]=CC#C↔ C#CC=C + C[C]=O Yes
CC + [CH]=CC#C↔ C#CC=C + C[CH2] Yes
CCC + [CH]=CC#C↔ C#CC=C + [CH2]CC No
CCC + [CH]=CC#C↔ C#CC=C + C[CH]C Yes
CC=C=O + [CH]=CC#C↔ C#CC=C + [CH2]C=C=O Yes
C=CC=O + [CH]=CC#C↔ C#CC=C + C=C[C]=O Yes
CC(C)C(=O)C(C)C + C#C[C]=C↔ C#CC=C + [CH2]C(C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)C + C#C[C]=C↔ C#CC=C + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + C#C[C]=C↔ C#CC=C + CC(C)C(=O)C(C)(C)O[O] No
CC(C)C(=O)C(C)COO + C#C[C]=C↔ C#CC=C + CC(C)C(=O)C(C)CO[O] Yes
[H] + C#CC=C↔ [H][H] + C#C[C]=C Yes
[OH] + C#CC=C↔ O + C#C[C]=C Yes
[O]O + C#C[C]=C↔ C#CC=C + [O][O] No
OO + C#C[C]=C↔ C#CC=C + [O]O No
[c]1ccccc1 + C#CC=C↔ c1ccccc1 + C#C[C]=C No
C=C1C=CCC1 + C#C[C]=C↔ C#CC=C + C=C1C=C[CH]C1 Yes
C=C1C=CCC1 + C#C[C]=C↔ C#CC=C + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + C#C[C]=C↔ C#CC=C + C=C1[CH]C=CC1 Yes
C=C=C + C#C[C]=C↔ C#CC=C + [CH]=C=C Yes
C#CC + C#C[C]=C↔ C#CC=C + C#C[CH2] No
[C]#C + C#CC=C↔ C#C + C#C[C]=C No
[CH]=CC + C#CC=C↔ C=CC + C#C[C]=C No
C=[C]C + C#CC=C↔ C=CC + C#C[C]=C No
C=CC + C#C[C]=C↔ C#CC=C + [CH2]C=C Yes
[CH]=C + C#CC=C↔ C=C + C#C[C]=C Yes
[CH]=C=O + C#CC=C↔ C=C=O + C#C[C]=C Yes
C=O + C#C[C]=C↔ C#CC=C + [CH]=O Yes
CO + C#C[C]=C↔ C#CC=C + [CH2]O Yes
[CH3] + C#CC=C↔ C + C#C[C]=C Yes
[CH2]CO + C#CC=C↔ CCO + C#C[C]=C No
CCO + C#C[C]=C↔ C#CC=C + C[CH]O No
CC[O] + C#CC=C↔ CCO + C#C[C]=C Yes
CC=O + C#C[C]=C↔ C#CC=C + [CH2]C=O Yes
CC=O + C#C[C]=C↔ C#CC=C + C[C]=O Yes
CC + C#C[C]=C↔ C#CC=C + C[CH2] Yes
CCC + C#C[C]=C↔ C#CC=C + [CH2]CC No
CCC + C#C[C]=C↔ C#CC=C + C[CH]C Yes
CC=C=O + C#C[C]=C↔ C#CC=C + [CH2]C=C=O Yes
C=CC=O + C#C[C]=C↔ C#CC=C + C=C[C]=O No
C#CC=C + [O]↔ [OH] + C#C[C]=C Yes
C#CC=C + [CH2]↔ [CH3] + C#C[C]=C Yes
C#CC=C + [CH]=CC#C↔ C#CC=C + C#C[C]=C Yes
C[C](C)C(=O)C(C)(C)OO + [O]O↔ CC(C)C(=O)C(C)(C)OO + [O][O] Yes
C[C](C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)(C)OO + C[C](C)C(=O)C(C)C Yes
CC(C)C(=O)C(C)(C)OO + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C[C](C)C(=O)C(C)(C)OO Yes
C[C](C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)(C)O[O] Yes
C[C](C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)CO[O] No
CC(C)C(=O)C(C)(C)OO + [H]↔ [H][H] + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [O]↔ [OH] + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [OH]↔ O + C[C](C)C(=O)C(C)(C)OO No
C[C](C)C(=O)C(C)(C)OO + OO↔ CC(C)C(=O)C(C)(C)OO + [O]O No
CC(C)C(=O)C(C)(C)OO + C#C[CH2]↔ C#CC + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [CH]=C=C↔ C=C=C + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [c]1ccccc1↔ c1ccccc1 + C[C](C)C(=O)C(C)(C)OO No
C[C](C)C(=O)C(C)(C)OO + C=CC↔ CC(C)C(=O)C(C)(C)OO + [CH2]C=C No
C[C](C)C(=O)C(C)(C)OO + C=C1C=CCC1↔ CC(C)C(=O)C(C)(C)OO + C=C1[CH]CC=C1 Yes
C[C](C)C(=O)C(C)(C)OO + C=C1CC=CC1↔ CC(C)C(=O)C(C)(C)OO + C=C1[CH]C=CC1 Yes
C[C](C)C(=O)C(C)(C)OO + C=C1C=CCC1↔ CC(C)C(=O)C(C)(C)OO + C=C1C=C[CH]C1 Yes
CC(C)C(=O)C(C)(C)OO + [CH]=C↔ C=C + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [CH]=CC↔ C=CC + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + C=[C]C↔ C=CC + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + [CH3]↔ C + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [CH2]CC↔ CCC + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + C[CH]C↔ CCC + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [CH2]↔ [CH3] + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + C[CH]O↔ CCO + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + [CH2]CO↔ CCO + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + CC[O]↔ CCO + C[C](C)C(=O)C(C)(C)OO Yes
C[C](C)C(=O)C(C)(C)OO + CC=O↔ CC(C)C(=O)C(C)(C)OO + C[C]=O No
CC(C)C(=O)C(C)(C)OO + [CH2]C=O↔ CC=O + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + C[CH2]↔ CC + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + [CH]=C=O↔ C=C=O + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [CH2]O↔ CO + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + [C]#C↔ C#C + C[C](C)C(=O)C(C)(C)OO Yes
C[C](C)C(=O)C(C)(C)OO + C=O↔ CC(C)C(=O)C(C)(C)OO + [CH]=O No
CC(C)C(=O)C(C)(C)OO + [CH2]C=C=O↔ CC=C=O + C[C](C)C(=O)C(C)(C)OO No
C[C](C)C(=O)C(C)(C)OO + C=CC=O↔ CC(C)C(=O)C(C)(C)OO + C=C[C]=O Yes
CC(C)C(=O)C(C)(C)OO + [CH]=CC#C↔ C#CC=C + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + C#C[C]=C↔ C#CC=C + C[C](C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + [O]O↔ CC(C)C(=O)C(C)(C)OO + [O][O] No
[CH2]C(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)(C)OO + C[C](C)C(=O)C(C)C No
CC(C)C(=O)C(C)(C)OO + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C(C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)CO[O] Yes
CC(C)C(=O)C(C)(C)OO + [H]↔ [H][H] + [CH2]C(C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [O]↔ [OH] + [CH2]C(C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [OH]↔ O + [CH2]C(C)C(=O)C(C)(C)OO No
[CH2]C(C)C(=O)C(C)(C)OO + OO↔ CC(C)C(=O)C(C)(C)OO + [O]O No
[CH2]C(C)C(=O)C(C)(C)OO + C#CC↔ CC(C)C(=O)C(C)(C)OO + C#C[CH2] Yes
[CH2]C(C)C(=O)C(C)(C)OO + C=C=C↔ CC(C)C(=O)C(C)(C)OO + [CH]=C=C Yes
CC(C)C(=O)C(C)(C)OO + [c]1ccccc1↔ c1ccccc1 + [CH2]C(C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + C=CC↔ CC(C)C(=O)C(C)(C)OO + [CH2]C=C Yes
[CH2]C(C)C(=O)C(C)(C)OO + C=C1C=CCC1↔ CC(C)C(=O)C(C)(C)OO + C=C1[CH]CC=C1 No
[CH2]C(C)C(=O)C(C)(C)OO + C=C1CC=CC1↔ CC(C)C(=O)C(C)(C)OO + C=C1[CH]C=CC1 Yes
[CH2]C(C)C(=O)C(C)(C)OO + C=C1C=CCC1↔ CC(C)C(=O)C(C)(C)OO + C=C1C=C[CH]C1 Yes
CC(C)C(=O)C(C)(C)OO + [CH]=C↔ C=C + [CH2]C(C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + [CH]=CC↔ C=CC + [CH2]C(C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + C=[C]C↔ C=CC + [CH2]C(C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + [CH3]↔ C + [CH2]C(C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + [CH2]CC↔ CCC + [CH2]C(C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + CCC↔ CC(C)C(=O)C(C)(C)OO + C[CH]C Yes
CC(C)C(=O)C(C)(C)OO + [CH2]↔ [CH3] + [CH2]C(C)C(=O)C(C)(C)OO No
[CH2]C(C)C(=O)C(C)(C)OO + CCO↔ CC(C)C(=O)C(C)(C)OO + C[CH]O No
CC(C)C(=O)C(C)(C)OO + [CH2]CO↔ CCO + [CH2]C(C)C(=O)C(C)(C)OO Yes
CC(C)C(=O)C(C)(C)OO + CC[O]↔ CCO + [CH2]C(C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + CC=O↔ CC(C)C(=O)C(C)(C)OO + C[C]=O No
[CH2]C(C)C(=O)C(C)(C)OO + CC=O↔ CC(C)C(=O)C(C)(C)OO + [CH2]C=O No
[CH2]C(C)C(=O)C(C)(C)OO + CC↔ CC(C)C(=O)C(C)(C)OO + C[CH2] Yes
CC(C)C(=O)C(C)(C)OO + [CH]=C=O↔ C=C=O + [CH2]C(C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + CO↔ CC(C)C(=O)C(C)(C)OO + [CH2]O Yes
CC(C)C(=O)C(C)(C)OO + [C]#C↔ C#C + [CH2]C(C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)(C)OO + C=O↔ CC(C)C(=O)C(C)(C)OO + [CH]=O Yes
[CH2]C(C)C(=O)C(C)(C)OO + CC=C=O↔ CC(C)C(=O)C(C)(C)OO + [CH2]C=C=O No
[CH2]C(C)C(=O)C(C)(C)OO + C=CC=O↔ CC(C)C(=O)C(C)(C)OO + C=C[C]=O No
CC(C)C(=O)C(C)(C)OO + [CH]=CC#C↔ C#CC=C + [CH2]C(C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + C#C[C]=C↔ C#CC=C + [CH2]C(C)C(=O)C(C)(C)OO No
[CH2]C(C)C(=O)C(C)(C)OO + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + C[C](C)C(=O)C(C)(C)OO Yes
[CH2]C(C)C(=O)C(C)C + CC(C)C(=O)OO↔ CC(C)C(=O)C(C)C + CC(C)C(=O)O[O] Yes
CC(C)C(=O)C(C)C + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C[C](C)C(=O)C(C)C No
[CH2]C(C)C(=O)C(C)(C)OO + CC(C)C(=O)OO↔ CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)O[O] Yes
CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + CC(C)C(=O)C(C)CO[O] Yes
[H] + CC(C)C(=O)OO↔ [H][H] + CC(C)C(=O)O[O] Yes
[OH] + CC(C)C(=O)OO↔ O + CC(C)C(=O)O[O] No
[O]O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [O][O] No
OO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [O]O No
[c]1ccccc1 + CC(C)C(=O)OO↔ c1ccccc1 + CC(C)C(=O)O[O] No
C=C1C=CCC1 + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C=C1C=C[CH]C1 No
C=C1C=CCC1 + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C=C1[CH]C=CC1 Yes
C=C=C + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH]=C=C No
C#CC + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C#C[CH2] Yes
[C]#C + CC(C)C(=O)OO↔ C#C + CC(C)C(=O)O[O] No
[CH]=CC + CC(C)C(=O)OO↔ C=CC + CC(C)C(=O)O[O] No
C=[C]C + CC(C)C(=O)OO↔ C=CC + CC(C)C(=O)O[O] No
C=CC + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH2]C=C No
[CH]=C + CC(C)C(=O)OO↔ C=C + CC(C)C(=O)O[O] Yes
[CH]=C=O + CC(C)C(=O)OO↔ C=C=O + CC(C)C(=O)O[O] No
C=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH]=O No
CO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH2]O No
[CH3] + CC(C)C(=O)OO↔ C + CC(C)C(=O)O[O] No
[CH2]CO + CC(C)C(=O)OO↔ CCO + CC(C)C(=O)O[O] No
CCO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C[CH]O No
CC[O] + CC(C)C(=O)OO↔ CCO + CC(C)C(=O)O[O] Yes
CC=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH2]C=O Yes
CC=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C[C]=O Yes
C[CH2] + CC(C)C(=O)OO↔ CC + CC(C)C(=O)O[O] Yes
[CH2]CC + CC(C)C(=O)OO↔ CCC + CC(C)C(=O)O[O] No
CCC + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C[CH]C No
CC=C=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH2]C=C=O No
C=CC=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C=C[C]=O No
C#C[C]=C + CC(C)C(=O)OO↔ C#CC=C + CC(C)C(=O)O[O] No
[CH]=CC#C + CC(C)C(=O)OO↔ C#CC=C + CC(C)C(=O)O[O] No
CC(C)C(=O)OO + [CH2]↔ [CH3] + CC(C)C(=O)O[O] No
C[C](C)C(=O)C(C)COO + [O]O↔ CC(C)C(=O)C(C)COO + [O][O] Yes
C[C](C)C(=O)C(C)COO + CC(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)COO + C[C](C)C(=O)C(C)C No
CC(C)C(=O)C(C)COO + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C[C](C)C(=O)C(C)COO No
C[C](C)C(=O)C(C)COO + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)COO + CC(C)C(=O)C(C)(C)O[O] Yes
C[C](C)C(=O)C(C)COO + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)COO + CC(C)C(=O)C(C)CO[O] Yes
CC(C)C(=O)C(C)COO + [H]↔ [H][H] + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + [O]↔ [OH] + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [OH]↔ O + C[C](C)C(=O)C(C)COO No
C[C](C)C(=O)C(C)COO + OO↔ CC(C)C(=O)C(C)COO + [O]O No
CC(C)C(=O)C(C)COO + C#C[CH2]↔ C#CC + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [CH]=C=C↔ C=C=C + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [c]1ccccc1↔ c1ccccc1 + C[C](C)C(=O)C(C)COO No
C[C](C)C(=O)C(C)COO + C=CC↔ CC(C)C(=O)C(C)COO + [CH2]C=C No
C[C](C)C(=O)C(C)COO + C=C1C=CCC1↔ CC(C)C(=O)C(C)COO + C=C1[CH]CC=C1 Yes
C[C](C)C(=O)C(C)COO + C=C1CC=CC1↔ CC(C)C(=O)C(C)COO + C=C1[CH]C=CC1 Yes
C[C](C)C(=O)C(C)COO + C=C1C=CCC1↔ CC(C)C(=O)C(C)COO + C=C1C=C[CH]C1 Yes
CC(C)C(=O)C(C)COO + [CH]=C↔ C=C + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [CH]=CC↔ C=CC + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + C=[C]C↔ C=CC + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + [CH3]↔ C + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + [CH2]CC↔ CCC + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + C[CH]C↔ CCC + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [CH2]↔ [CH3] + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + C[CH]O↔ CCO + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + [CH2]CO↔ CCO + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + CC[O]↔ CCO + C[C](C)C(=O)C(C)COO Yes
C[C](C)C(=O)C(C)COO + CC=O↔ CC(C)C(=O)C(C)COO + C[C]=O Yes
CC(C)C(=O)C(C)COO + [CH2]C=O↔ CC=O + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + C[CH2]↔ CC + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [CH]=C=O↔ C=C=O + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [CH2]O↔ CO + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + [C]#C↔ C#C + C[C](C)C(=O)C(C)COO No
C[C](C)C(=O)C(C)COO + C=O↔ CC(C)C(=O)C(C)COO + [CH]=O Yes
CC(C)C(=O)C(C)COO + [CH2]C=C=O↔ CC=C=O + C[C](C)C(=O)C(C)COO No
C[C](C)C(=O)C(C)COO + C=CC=O↔ CC(C)C(=O)C(C)COO + C=C[C]=O Yes
CC(C)C(=O)C(C)COO + [CH]=CC#C↔ C#CC=C + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + C#C[C]=C↔ C#CC=C + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + C[C](C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + C[C](C)C(=O)C(C)COO No
CC(C)C(=O)C(C)COO + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + [CH2]C(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + C[C](C)C(=O)C(C)COO Yes
[CH2]C(C)=C=O + [O]O↔ CC(C)=C=O + [O][O] Yes
CC(C)=C=O + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C(C)=C=O No
CC(C)=C=O + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C(C)=C=O Yes
[CH2]C(C)=C=O + CC(C)C(=O)C(C)(C)OO↔ CC(C)=C=O + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C(C)=C=O + CC(C)C(=O)C(C)COO↔ CC(C)=C=O + CC(C)C(=O)C(C)CO[O] No
CC(C)=C=O + [H]↔ [H][H] + [CH2]C(C)=C=O Yes
CC(C)=C=O + [O]↔ [OH] + [CH2]C(C)=C=O Yes
CC(C)=C=O + [OH]↔ O + [CH2]C(C)=C=O No
[CH2]C(C)=C=O + OO↔ CC(C)=C=O + [O]O No
CC(C)=C=O + C#C[CH2]↔ C#CC + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH]=C=C↔ C=C=C + [CH2]C(C)=C=O Yes
CC(C)=C=O + [c]1ccccc1↔ c1ccccc1 + [CH2]C(C)=C=O No
CC(C)=C=O + [CH2]C=C↔ C=CC + [CH2]C(C)=C=O Yes
[CH2]C(C)=C=O + C=C1C=CCC1↔ CC(C)=C=O + C=C1[CH]CC=C1 Yes
[CH2]C(C)=C=O + C=C1CC=CC1↔ CC(C)=C=O + C=C1[CH]C=CC1 Yes
[CH2]C(C)=C=O + C=C1C=CCC1↔ CC(C)=C=O + C=C1C=C[CH]C1 Yes
CC(C)=C=O + [CH]=C↔ C=C + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH]=CC↔ C=CC + [CH2]C(C)=C=O No
CC(C)=C=O + C=[C]C↔ C=CC + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH3]↔ C + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH2]CC↔ CCC + [CH2]C(C)=C=O Yes
CC(C)=C=O + C[CH]C↔ CCC + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH2]↔ [CH3] + [CH2]C(C)=C=O Yes
CC(C)=C=O + C[CH]O↔ CCO + [CH2]C(C)=C=O No
CC(C)=C=O + [CH2]CO↔ CCO + [CH2]C(C)=C=O Yes
CC(C)=C=O + CC[O]↔ CCO + [CH2]C(C)=C=O No
CC(C)=C=O + C[C]=O↔ CC=O + [CH2]C(C)=C=O No
CC(C)=C=O + [CH2]C=O↔ CC=O + [CH2]C(C)=C=O Yes
CC(C)=C=O + C[CH2]↔ CC + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH]=C=O↔ C=C=O + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH2]O↔ CO + [CH2]C(C)=C=O Yes
CC(C)=C=O + [C]#C↔ C#C + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH]=O↔ C=O + [CH2]C(C)=C=O No
CC(C)=C=O + [CH2]C=C=O↔ CC=C=O + [CH2]C(C)=C=O No
CC(C)=C=O + C=C[C]=O↔ C=CC=O + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH]=CC#C↔ C#CC=C + [CH2]C(C)=C=O Yes
CC(C)=C=O + C#C[C]=C↔ C#CC=C + [CH2]C(C)=C=O No
CC(C)=C=O + C[C](C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + [CH2]C(C)=C=O Yes
CC(C)=C=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH2]C(C)=C=O Yes
CC(C)=C=O + [CH2]C(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + [CH2]C(C)=C=O No
CC(C)=C=O + C[C](C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)COO + [CH2]C(C)=C=O Yes
[CH2]C(C)C(=O)C(C)C + C=C(C)C(=O)OO↔ CC(C)C(=O)C(C)C + C=C(C)C(=O)O[O] Yes
CC(C)C(=O)C(C)C + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C[C](C)C(=O)C(C)C No
[CH2]C(C)C(=O)C(C)(C)OO + C=C(C)C(=O)OO↔ CC(C)C(=O)C(C)(C)OO + C=C(C)C(=O)O[O] Yes
CC(C)C(=O)C(C)(C)OO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C[C](C)C(=O)C(C)(C)OO No
CC(C)C(=O)C(C)(C)OO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + CC(C)C(=O)C(C)(C)O[O] Yes
CC(C)C(=O)C(C)COO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C[C](C)C(=O)C(C)COO Yes
CC(C)C(=O)C(C)COO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + CC(C)C(=O)C(C)CO[O] Yes
[H] + C=C(C)C(=O)OO↔ [H][H] + C=C(C)C(=O)O[O] Yes
[OH] + C=C(C)C(=O)OO↔ O + C=C(C)C(=O)O[O] Yes
[O]O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [O][O] No
OO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [O]O Yes
[c]1ccccc1 + C=C(C)C(=O)OO↔ c1ccccc1 + C=C(C)C(=O)O[O] No
C=C1C=CCC1 + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C=C1C=C[CH]C1 No
C=C1C=CCC1 + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C=C1[CH]CC=C1 Yes
C=C1CC=CC1 + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C=C1[CH]C=CC1 Yes
C=C=C + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH]=C=C Yes
C#CC + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C#C[CH2] No
[C]#C + C=C(C)C(=O)OO↔ C#C + C=C(C)C(=O)O[O] Yes
[CH]=CC + C=C(C)C(=O)OO↔ C=CC + C=C(C)C(=O)O[O] No
C=[C]C + C=C(C)C(=O)OO↔ C=CC + C=C(C)C(=O)O[O] No
C=CC + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH2]C=C No
[CH]=C + C=C(C)C(=O)OO↔ C=C + C=C(C)C(=O)O[O] Yes
[CH]=C=O + C=C(C)C(=O)OO↔ C=C=O + C=C(C)C(=O)O[O] No
C=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH]=O No
CO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH2]O Yes
[CH3] + C=C(C)C(=O)OO↔ C + C=C(C)C(=O)O[O] Yes
[CH2]CO + C=C(C)C(=O)OO↔ CCO + C=C(C)C(=O)O[O] No
CCO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C[CH]O No
CC[O] + C=C(C)C(=O)OO↔ CCO + C=C(C)C(=O)O[O] Yes
CC=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH2]C=O No
CC=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C[C]=O Yes
C[CH2] + C=C(C)C(=O)OO↔ CC + C=C(C)C(=O)O[O] No
[CH2]CC + C=C(C)C(=O)OO↔ CCC + C=C(C)C(=O)O[O] No
CCC + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C[CH]C No
CC=C=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH2]C=C=O Yes
C=CC=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + C=C[C]=O No
C#C[C]=C + C=C(C)C(=O)OO↔ C#CC=C + C=C(C)C(=O)O[O] No
[CH]=CC#C + C=C(C)C(=O)OO↔ C#CC=C + C=C(C)C(=O)O[O] No
CC(C)=C=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH2]C(C)=C=O No
CC(C)C(=O)OO + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + CC(C)C(=O)O[O] No
C=C(C)C(=O)OO + [CH2]↔ [CH3] + C=C(C)C(=O)O[O] Yes
[CH2]C(COO)C(=O)C(C)C + [O]O↔ CC(C)C(=O)C(C)COO + [O][O] No
[CH2]C(COO)C(=O)C(C)C + CC(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)COO + C[C](C)C(=O)C(C)C No
CC(C)C(=O)C(C)COO + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)COO + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C(COO)C(=O)C(C)C + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)COO + CC(C)C(=O)C(C)CO[O] Yes
CC(C)C(=O)C(C)COO + [H]↔ [H][H] + [CH2]C(COO)C(=O)C(C)C Yes
CC(C)C(=O)C(C)COO + [O]↔ [OH] + [CH2]C(COO)C(=O)C(C)C Yes
CC(C)C(=O)C(C)COO + [OH]↔ O + [CH2]C(COO)C(=O)C(C)C No
[CH2]C(COO)C(=O)C(C)C + OO↔ CC(C)C(=O)C(C)COO + [O]O No
[CH2]C(COO)C(=O)C(C)C + C#CC↔ CC(C)C(=O)C(C)COO + C#C[CH2] Yes
[CH2]C(COO)C(=O)C(C)C + C=C=C↔ CC(C)C(=O)C(C)COO + [CH]=C=C Yes
CC(C)C(=O)C(C)COO + [c]1ccccc1↔ c1ccccc1 + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + C=CC↔ CC(C)C(=O)C(C)COO + [CH2]C=C No
[CH2]C(COO)C(=O)C(C)C + C=C1C=CCC1↔ CC(C)C(=O)C(C)COO + C=C1[CH]CC=C1 Yes
[CH2]C(COO)C(=O)C(C)C + C=C1CC=CC1↔ CC(C)C(=O)C(C)COO + C=C1[CH]C=CC1 No
[CH2]C(COO)C(=O)C(C)C + C=C1C=CCC1↔ CC(C)C(=O)C(C)COO + C=C1C=C[CH]C1 Yes
CC(C)C(=O)C(C)COO + [CH]=C↔ C=C + [CH2]C(COO)C(=O)C(C)C No
CC(C)C(=O)C(C)COO + [CH]=CC↔ C=CC + [CH2]C(COO)C(=O)C(C)C Yes
CC(C)C(=O)C(C)COO + C=[C]C↔ C=CC + [CH2]C(COO)C(=O)C(C)C No
CC(C)C(=O)C(C)COO + [CH3]↔ C + [CH2]C(COO)C(=O)C(C)C Yes
CC(C)C(=O)C(C)COO + [CH2]CC↔ CCC + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + CCC↔ CC(C)C(=O)C(C)COO + C[CH]C Yes
CC(C)C(=O)C(C)COO + [CH2]↔ [CH3] + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + CCO↔ CC(C)C(=O)C(C)COO + C[CH]O No
CC(C)C(=O)C(C)COO + [CH2]CO↔ CCO + [CH2]C(COO)C(=O)C(C)C Yes
CC(C)C(=O)C(C)COO + CC[O]↔ CCO + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + CC=O↔ CC(C)C(=O)C(C)COO + C[C]=O Yes
[CH2]C(COO)C(=O)C(C)C + CC=O↔ CC(C)C(=O)C(C)COO + [CH2]C=O No
[CH2]C(COO)C(=O)C(C)C + CC↔ CC(C)C(=O)C(C)COO + C[CH2] Yes
CC(C)C(=O)C(C)COO + [CH]=C=O↔ C=C=O + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + CO↔ CC(C)C(=O)C(C)COO + [CH2]O No
CC(C)C(=O)C(C)COO + [C]#C↔ C#C + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + C=O↔ CC(C)C(=O)C(C)COO + [CH]=O Yes
[CH2]C(COO)C(=O)C(C)C + CC=C=O↔ CC(C)C(=O)C(C)COO + [CH2]C=C=O No
[CH2]C(COO)C(=O)C(C)C + C=CC=O↔ CC(C)C(=O)C(C)COO + C=C[C]=O Yes
CC(C)C(=O)C(C)COO + [CH]=CC#C↔ C#CC=C + [CH2]C(COO)C(=O)C(C)C Yes
CC(C)C(=O)C(C)COO + C#C[C]=C↔ C#CC=C + [CH2]C(COO)C(=O)C(C)C Yes
[CH2]C(COO)C(=O)C(C)C + CC(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)COO + C[C](C)C(=O)C(C)(C)OO Yes
[CH2]C(COO)C(=O)C(C)C + CC(C)C(=O)OO↔ CC(C)C(=O)C(C)COO + CC(C)C(=O)O[O] Yes
CC(C)C(=O)C(C)COO + [CH2]C(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + [CH2]C(COO)C(=O)C(C)C No
[CH2]C(COO)C(=O)C(C)C + CC(C)C(=O)C(C)COO↔ CC(C)C(=O)C(C)COO + C[C](C)C(=O)C(C)COO Yes
[CH2]C(COO)C(=O)C(C)C + CC(C)=C=O↔ CC(C)C(=O)C(C)COO + [CH2]C(C)=C=O Yes
[CH2]C(COO)C(=O)C(C)C + C=C(C)C(=O)OO↔ CC(C)C(=O)C(C)COO + C=C(C)C(=O)O[O] No
[CH2]C(C)=O + [O]O↔ CC(C)=O + [O][O] Yes
[CH2]C(C)=O + CC(C)C(=O)C(C)C↔ CC(C)=O + C[C](C)C(=O)C(C)C Yes
CC(C)=O + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C(C)=O Yes
[CH2]C(C)=O + CC(C)C(=O)C(C)(C)OO↔ CC(C)=O + CC(C)C(=O)C(C)(C)O[O] Yes
[CH2]C(C)=O + CC(C)C(=O)C(C)COO↔ CC(C)=O + CC(C)C(=O)C(C)CO[O] Yes
CC(C)=O + [H]↔ [H][H] + [CH2]C(C)=O Yes
CC(C)=O + [O]↔ [OH] + [CH2]C(C)=O Yes
CC(C)=O + [OH]↔ O + [CH2]C(C)=O Yes
[CH2]C(C)=O + OO↔ CC(C)=O + [O]O No
[CH2]C(C)=O + C#CC↔ CC(C)=O + C#C[CH2] Yes
[CH2]C(C)=O + C=C=C↔ CC(C)=O + [CH]=C=C Yes
CC(C)=O + [c]1ccccc1↔ c1ccccc1 + [CH2]C(C)=O Yes
[CH2]C(C)=O + C=CC↔ CC(C)=O + [CH2]C=C Yes
[CH2]C(C)=O + C=C1C=CCC1↔ CC(C)=O + C=C1[CH]CC=C1 Yes
[CH2]C(C)=O + C=C1CC=CC1↔ CC(C)=O + C=C1[CH]C=CC1 Yes
[CH2]C(C)=O + C=C1C=CCC1↔ CC(C)=O + C=C1C=C[CH]C1 Yes
CC(C)=O + [CH]=C↔ C=C + [CH2]C(C)=O Yes
CC(C)=O + [CH]=CC↔ C=CC + [CH2]C(C)=O Yes
CC(C)=O + C=[C]C↔ C=CC + [CH2]C(C)=O Yes
CC(C)=O + [CH3]↔ C + [CH2]C(C)=O Yes
CC(C)=O + [CH2]CC↔ CCC + [CH2]C(C)=O Yes
CC(C)=O + C[CH]C↔ CCC + [CH2]C(C)=O Yes
CC(C)=O + [CH2]↔ [CH3] + [CH2]C(C)=O Yes
[CH2]C(C)=O + CCO↔ CC(C)=O + C[CH]O No
CC(C)=O + [CH2]CO↔ CCO + [CH2]C(C)=O Yes
CC(C)=O + CC[O]↔ CCO + [CH2]C(C)=O No
[CH2]C(C)=O + CC=O↔ CC(C)=O + C[C]=O Yes
[CH2]C(C)=O + CC=O↔ CC(C)=O + [CH2]C=O Yes
CC(C)=O + C[CH2]↔ CC + [CH2]C(C)=O Yes
CC(C)=O + [CH]=C=O↔ C=C=O + [CH2]C(C)=O Yes
[CH2]C(C)=O + CO↔ CC(C)=O + [CH2]O No
CC(C)=O + [C]#C↔ C#C + [CH2]C(C)=O Yes
[CH2]C(C)=O + C=O↔ CC(C)=O + [CH]=O No
[CH2]C(C)=O + CC=C=O↔ CC(C)=O + [CH2]C=C=O No
[CH2]C(C)=O + C=CC=O↔ CC(C)=O + C=C[C]=O Yes
CC(C)=O + [CH]=CC#C↔ C#CC=C + [CH2]C(C)=O Yes
CC(C)=O + C#C[C]=C↔ C#CC=C + [CH2]C(C)=O No
[CH2]C(C)=O + CC(C)C(=O)C(C)(C)OO↔ CC(C)=O + C[C](C)C(=O)C(C)(C)OO Yes
CC(C)=O + CC(C)C(=O)O[O]↔ CC(C)C(=O)OO + [CH2]C(C)=O Yes
CC(C)=O + [CH2]C(C)C(=O)C(C)(C)OO↔ CC(C)C(=O)C(C)(C)OO + [CH2]C(C)=O No
[CH2]C(C)=O + CC(C)C(=O)C(C)COO↔ CC(C)=O + C[C](C)C(=O)C(C)COO Yes
[CH2]C(C)=O + CC(C)=C=O↔ CC(C)=O + [CH2]C(C)=C=O No
CC(C)=O + C=C(C)C(=O)O[O]↔ C=C(C)C(=O)OO + [CH2]C(C)=O Yes
CC(C)=O + [CH2]C(COO)C(=O)C(C)C↔ CC(C)C(=O)C(C)COO + [CH2]C(C)=O No
[CH]=CC=C + C↔ C=CC=C + [CH3] Yes
C=CC=C + [CH3]↔ C + C=[C]C=C No
[CH]=CC=C + C=CC↔ C=CC=C + [CH2]C=C Yes
C=[C]C=C + C=CC↔ C=CC=C + [CH2]C=C No
C=CC=CC + [CH3]↔ C + [CH2]C=CC=C Yes
C=CC=CC + [CH2]C=C↔ C=CC + [CH2]C=CC=C No
C1=CCCC=C1 + [CH3]↔ C + [CH]1C=CC=CC1 Yes
C1=CCCC=C1 + [CH2]C=C↔ C=CC + [CH]1C=CC=CC1 Yes
[CH]=CC=C + CC(C)C(=O)C(C)C↔ C=CC=C + C[C](C)C(=O)C(C)C Yes
C=[C]C=C + CC(C)C(=O)C(C)C↔ C=CC=C + C[C](C)C(=O)C(C)C Yes
C=CC=CC + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C=CC=C No
C1=CCCC=C1 + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH]1C=CC=CC1 Yes
[CH]=CC=C + CC(C)C(=O)C(C)C↔ C=CC=C + [CH2]C(C)C(=O)C(C)C Yes
C=[C]C=C + CC(C)C(=O)C(C)C↔ C=CC=C + [CH2]C(C)C(=O)C(C)C Yes
C=CC=CC + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH2]C=CC=C Yes
C1=CCCC=C1 + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH]1C=CC=CC1 Yes
[H] + C=CC=CC↔ [H][H] + [CH2]C=CC=C Yes
[C]#C + C=CC=CC↔ C#C + [CH2]C=CC=C No
[CH]=C + C=CC=CC↔ C=C + [CH2]C=CC=C Yes
[CH]=CC + C=CC=CC↔ C=CC + [CH2]C=CC=C No
C=[C]C + C=CC=CC↔ C=CC + [CH2]C=CC=C Yes
[CH]=C=C + C=CC=CC↔ C=C=C + [CH2]C=CC=C Yes
C#C[CH2] + C=CC=CC↔ C#CC + [CH2]C=CC=C Yes
[CH]=CC#C + C=CC=CC↔ C#CC=C + [CH2]C=CC=C Yes
C#C[C]=C + C=CC=CC↔ C#CC=C + [CH2]C=CC=C No
[CH]=CC=C + C=CC=CC↔ C=CC=C + [CH2]C=CC=C Yes
C=[C]C=C + C=CC=CC↔ C=CC=C + [CH2]C=CC=C Yes
[CH2]C=C=O + C=CC=CC↔ CC=C=O + [CH2]C=CC=C No
[c]1ccccc1 + C=CC=CC↔ c1ccccc1 + [CH2]C=CC=C Yes
C1=CCCC=C1 + [CH2]C=CC=C↔ C=CC=CC + [CH]1C=CC=CC1 No
[H] + C1=CCCC=C1↔ [H][H] + [CH]1C=CC=CC1 Yes
[C]#C + C1=CCCC=C1↔ C#C + [CH]1C=CC=CC1 Yes
[CH]=C + C1=CCCC=C1↔ C=C + [CH]1C=CC=CC1 No
[CH]=CC + C1=CCCC=C1↔ C=CC + [CH]1C=CC=CC1 Yes
C=[C]C + C1=CCCC=C1↔ C=CC + [CH]1C=CC=CC1 Yes
[CH]=C=C + C1=CCCC=C1↔ C=C=C + [CH]1C=CC=CC1 No
C#C[CH2] + C1=CCCC=C1↔ C#CC + [CH]1C=CC=CC1 No
[CH]=CC#C + C1=CCCC=C1↔ C#CC=C + [CH]1C=CC=CC1 No
C#C[C]=C + C1=CCCC=C1↔ C#CC=C + [CH]1C=CC=CC1 Yes
[CH]=CC=C + C1=CCCC=C1↔ C=CC=C + [CH]1C=CC=CC1 No
C=[C]C=C + C1=CCCC=C1↔ C=CC=C + [CH]1C=CC=CC1 Yes
[CH2]C=C=O + C1=CCCC=C1↔ CC=C=O + [CH]1C=CC=CC1 No
[c]1ccccc1 + C1=CCCC=C1↔ c1ccccc1 + [CH]1C=CC=CC1 Yes
[CH]=CC=C + [H][H]↔ C=CC=C + [H] No
C=CC=C + [H]↔ [H][H] + C=[C]C=C Yes
C=CC=C + [O]↔ [OH] + C=[C]C=C Yes
C=CC=CC + [O]↔ [OH] + [CH2]C=CC=C Yes
C1=CCCC=C1 + [O]↔ [OH] + [CH]1C=CC=CC1 No
C=CC=C + [OH]↔ O + [CH]=CC=C No
C=CC=C + [OH]↔ O + C=[C]C=C Yes
C=CC=CC + [OH]↔ O + [CH2]C=CC=C No
C1=CCCC=C1 + [OH]↔ O + [CH]1C=CC=CC1 No
[CH]=CC=C + [O]O↔ C=CC=C + [O][O] No
C=[C]C=C + [O]O↔ C=CC=C + [O][O] No
[CH2]C=CC=C + [O]O↔ C=CC=CC + [O][O] Yes
[CH]1C=CC=CC1 + [O]O↔ C1=CCCC=C1 + [O][O] Yes
[CH]=CC=C + OO↔ C=CC=C + [O]O Yes
C=[C]C=C + OO↔ C=CC=C + [O]O No
C=CC=CC + [O]O↔ OO + [CH2]C=CC=C No
C1=CCCC=C1 + [O]O↔ OO + [CH]1C=CC=CC1 No
[CH]=CC=C + C#CC↔ C=CC=C + C#C[CH2] No
C=[C]C=C + C#CC↔ C=CC=C + C#C[CH2] Yes
[CH]=CC=C + C=C=C↔ C=CC=C + [CH]=C=C No
C=[C]C=C + C=C=C↔ C=CC=C + [CH]=C=C Yes
C=CC=C + [c]1ccccc1↔ c1ccccc1 + [CH]=CC=C No
C=CC=C + [c]1ccccc1↔ c1ccccc1 + C=[C]C=C Yes
C=C1C=CCC1 + [CH2]C=CC=C↔ C=CC=CC + C=C1C=C[CH]C1 Yes
C=C1[CH]CC=C1 + C=CC=CC↔ C=C1C=CCC1 + [CH2]C=CC=C Yes
C=C1C=C[CH]C1 + C1=CCCC=C1↔ C=C1C=CCC1 + [CH]1C=CC=CC1 Yes
C=C1[CH]CC=C1 + C1=CCCC=C1↔ C=C1C=CCC1 + [CH]1C=CC=CC1 Yes
C=C1CC=CC1 + [CH2]C=CC=C↔ C=CC=CC + C=C1[CH]C=CC1 Yes
C=C1CC=CC1 + [CH]1C=CC=CC1↔ C1=CCCC=C1 + C=C1[CH]C=CC1 Yes
[C]#C + C=CC=C↔ C#C + C=[C]C=C Yes
[CH]=C + C=CC=C↔ C=C + C=[C]C=C No
[CH]=CC + C=CC=C↔ C=CC + C=[C]C=C No
C=[C]C + C=CC=C↔ C=CC + C=[C]C=C Yes
[CH]=CC#C + C=CC=C↔ C#CC=C + C=[C]C=C Yes
C#CC=C + C=[C]C=C↔ C=CC=C + C#C[C]=C Yes
[CH]=CC=C + C=CC=C↔ C=CC=C + C=[C]C=C No
CC=C=O + C=[C]C=C↔ C=CC=C + [CH2]C=C=O No
C=C1C=CCC1 + C=[C]C=C↔ C=CC=C + C=C1C=C[CH]C1 No
C=C1C=CCC1 + C=[C]C=C↔ C=CC=C + C=C1[CH]CC=C1 No
C=C1CC=CC1 + C=[C]C=C↔ C=CC=C + C=C1[CH]C=CC1 No
[CH]=CC=C + C=C1C=CCC1↔ C=CC=C + C=C1[CH]CC=C1 No
[CH]=CC=C + C=C1CC=CC1↔ C=CC=C + C=C1[CH]C=CC1 Yes
[CH]=CC=C + C=C1C=CCC1↔ C=CC=C + C=C1C=C[CH]C1 No
[CH]=CC=C + C=C↔ C=CC=C + [CH]=C No
[CH]=CC=C + C=CC↔ C=CC=C + [CH]=CC Yes
[CH]=CC=C + C=CC↔ C=CC=C + C=[C]C Yes
[CH]=CC=C + CCC↔ C=CC=C + [CH2]CC Yes
C=[C]C=C + CCC↔ C=CC=C + [CH2]CC Yes
C=CC=CC + [CH2]CC↔ CCC + [CH2]C=CC=C Yes
C1=CCCC=C1 + [CH2]CC↔ CCC + [CH]1C=CC=CC1 Yes
[CH]=CC=C + CCC↔ C=CC=C + C[CH]C Yes
C=[C]C=C + CCC↔ C=CC=C + C[CH]C Yes
C=CC=CC + C[CH]C↔ CCC + [CH2]C=CC=C Yes
C1=CCCC=C1 + C[CH]C↔ CCC + [CH]1C=CC=CC1 Yes
[CH]=C=O + C=CC=CC↔ C=C=O + [CH2]C=CC=C Yes
[CH]=C=O + C1=CCCC=C1↔ C=C=O + [CH]1C=CC=CC1 Yes
[CH]=C=O + C=CC=C↔ C=C=O + C=[C]C=C Yes
[CH]=O + C=CC=CC↔ C=O + [CH2]C=CC=C Yes
[CH]=O + C1=CCCC=C1↔ C=O + [CH]1C=CC=CC1 Yes
C=O + C=[C]C=C↔ C=CC=C + [CH]=O Yes
[CH2]O + C=CC=CC↔ CO + [CH2]C=CC=C No
[CH2]O + C1=CCCC=C1↔ CO + [CH]1C=CC=CC1 Yes
CO + C=[C]C=C↔ C=CC=C + [CH2]O Yes
C=CC=C + [CH2]↔ [CH3] + C=[C]C=C No
C=CC=CC + [CH2]↔ [CH3] + [CH2]C=CC=C Yes
C1=CCCC=C1 + [CH2]↔ [CH3] + [CH]1C=CC=CC1 Yes
[CH2]CO + C=CC=CC↔ CCO + [CH2]C=CC=C No
C[CH]O + C=CC=CC↔ CCO + [CH2]C=CC=C Yes
CC[O] + C=CC=CC↔ CCO + [CH2]C=CC=C Yes
[CH2]CO + C1=CCCC=C1↔ CCO + [CH]1C=CC=CC1 No
C[CH]O + C1=CCCC=C1↔ CCO + [CH]1C=CC=CC1 No
CC[O] + C1=CCCC=C1↔ CCO + [CH]1C=CC=CC1 Yes
[CH2]CO + C=CC=C↔ CCO + C=[C]C=C No
CCO + C=[C]C=C↔ C=CC=C + C[CH]O Yes
CC[O] + C=CC=C↔ CCO + C=[C]C=C Yes
[CH]=CC=C + CCO↔ C=CC=C + C[CH]O Yes
[CH]=CC=C + CCO↔ C=CC=C + [CH2]CO No
[CH]=CC=C + CCO↔ C=CC=C + CC[O] Yes
[CH2]C=O + C=CC=CC↔ CC=O + [CH2]C=CC=C No
C[C]=O + C=CC=CC↔ CC=O + [CH2]C=CC=C Yes
[CH2]C=O + C1=CCCC=C1↔ CC=O + [CH]1C=CC=CC1 Yes
C[C]=O + C1=CCCC=C1↔ CC=O + [CH]1C=CC=CC1 Yes
CC=O + C=[C]C=C↔ C=CC=C + [CH2]C=O Yes
CC=O + C=[C]C=C↔ C=CC=C + C[C]=O No
[CH]=CC=C + CC=O↔ C=CC=C + C[C]=O No
[CH]=CC=C + CC=O↔ C=CC=C + [CH2]C=O Yes
[CH]=CC=C + CC↔ C=CC=C + C[CH2] Yes
C=[C]C=C + CC↔ C=CC=C + C[CH2] Yes
C=CC=CC + C[CH2]↔ CC + [CH2]C=CC=C No
C1=CCCC=C1 + C[CH2]↔ CC + [CH]1C=CC=CC1 Yes
[CH]=CC=C + C=C=O↔ C=CC=C + [CH]=C=O Yes
[CH]=CC=C + CO↔ C=CC=C + [CH2]O No
C=CC=C + [C]#C↔ C#C + [CH]=CC=C Yes
[CH]=CC=C + C=O↔ C=CC=C + [CH]=O Yes
[CH]=CC=C + CC=C=O↔ C=CC=C + [CH2]C=C=O No
[CH]=CC=C + C=CC=O↔ C=CC=C + C=C[C]=O No
C=[C]C=C + C=CC=O↔ C=CC=C + C=C[C]=O Yes
C=CC=CC + C=C[C]=O↔ C=CC=O + [CH2]C=CC=C No
C1=CCCC=C1 + C=C[C]=O↔ C=CC=O + [CH]1C=CC=CC1 Yes
C=CC=C + [CH]=CC#C↔ C#CC=C + [CH]=CC=C Yes
[CH]=CC=C + C#CC=C↔ C=CC=C + C#C[C]=C Yes
CCC(=O)C(C)C + [CH3]↔ C + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + C=CC↔ CCC(=O)C(C)C + [CH2]C=C Yes
C[CH]C(=O)C(C)C + CC(C)C(=O)C(C)C↔ CCC(=O)C(C)C + C[C](C)C(=O)C(C)C Yes
CCC(=O)C(C)C + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + C=CC=CC↔ CCC(=O)C(C)C + [CH2]C=CC=C Yes
C[CH]C(=O)C(C)C + C1=CCCC=C1↔ CCC(=O)C(C)C + [CH]1C=CC=CC1 Yes
CCC(=O)C(C)C + [H]↔ [H][H] + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + [O]↔ [OH] + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + [OH]↔ O + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + [O]O↔ CCC(=O)C(C)C + [O][O] No
C[CH]C(=O)C(C)C + OO↔ CCC(=O)C(C)C + [O]O No
C[CH]C(=O)C(C)C + C#CC↔ CCC(=O)C(C)C + C#C[CH2] Yes
C[CH]C(=O)C(C)C + C=C=C↔ CCC(=O)C(C)C + [CH]=C=C Yes
CCC(=O)C(C)C + [c]1ccccc1↔ c1ccccc1 + C[CH]C(=O)C(C)C No
CCC(=O)C(C)C + C=[C]C=C↔ C=CC=C + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + C=C1C=CCC1↔ CCC(=O)C(C)C + C=C1[CH]CC=C1 Yes
C[CH]C(=O)C(C)C + C=C1CC=CC1↔ CCC(=O)C(C)C + C=C1[CH]C=CC1 Yes
C[CH]C(=O)C(C)C + C=C1C=CCC1↔ CCC(=O)C(C)C + C=C1C=C[CH]C1 Yes
CCC(=O)C(C)C + [CH]=C↔ C=C + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + [CH]=CC↔ C=CC + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + C=[C]C↔ C=CC + C[CH]C(=O)C(C)C No
CCC(=O)C(C)C + [CH2]CC↔ CCC + C[CH]C(=O)C(C)C No
CCC(=O)C(C)C + C[CH]C↔ CCC + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + [CH2]↔ [CH3] + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + CCO↔ CCC(=O)C(C)C + C[CH]O No
CCC(=O)C(C)C + [CH2]CO↔ CCO + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + CC[O]↔ CCO + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + CC=O↔ CCC(=O)C(C)C + C[C]=O No
C[CH]C(=O)C(C)C + CC=O↔ CCC(=O)C(C)C + [CH2]C=O Yes
CCC(=O)C(C)C + C[CH2]↔ CC + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + [CH]=C=O↔ C=C=O + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + CO↔ CCC(=O)C(C)C + [CH2]O Yes
CCC(=O)C(C)C + [C]#C↔ C#C + C[CH]C(=O)C(C)C Yes
C[CH]C(=O)C(C)C + C=O↔ CCC(=O)C(C)C + [CH]=O Yes
C[CH]C(=O)C(C)C + CC=C=O↔ CCC(=O)C(C)C + [CH2]C=C=O No
C[CH]C(=O)C(C)C + C=CC=O↔ CCC(=O)C(C)C + C=C[C]=O Yes
CCC(=O)C(C)C + [CH]=CC#C↔ C#CC=C + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + C#C[C]=C↔ C#CC=C + C[CH]C(=O)C(C)C Yes
CCC(=O)C(C)C + [CH]=CC=C↔ C=CC=C + C[CH]C(=O)C(C)C Yes
C=CCC + [CH3]↔ C + C=C[CH]C No
C=CCC + [CH2]C=C↔ C=CC + C=C[CH]C Yes
C=CCC + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C[CH]C Yes
C=CCC + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + C=C[CH]C Yes
C=C[CH]C + C=CC=CC↔ C=CCC + [CH2]C=CC=C Yes
C=C[CH]C + C1=CCCC=C1↔ C=CCC + [CH]1C=CC=CC1 Yes
C=CCC + [H]↔ [H][H] + C=C[CH]C Yes
C=CCC + [O]↔ [OH] + C=C[CH]C No
C=CCC + [OH]↔ O + C=C[CH]C No
C=C[CH]C + [O]O↔ C=CCC + [O][O] No
C=CCC + [O]O↔ OO + C=C[CH]C Yes
C=CCC + C#C[CH2]↔ C#CC + C=C[CH]C No
C=CCC + [CH]=C=C↔ C=C=C + C=C[CH]C Yes
C=CCC + [c]1ccccc1↔ c1ccccc1 + C=C[CH]C No
C=CCC + C=[C]C=C↔ C=CC=C + C=C[CH]C Yes
C=CCC + C=C1[CH]CC=C1↔ C=C1C=CCC1 + C=C[CH]C No
C=C[CH]C + C=C1CC=CC1↔ C=CCC + C=C1[CH]C=CC1 Yes
C=C[CH]C + C=C1C=CCC1↔ C=CCC + C=C1C=C[CH]C1 Yes
C=CCC + [CH]=C↔ C=C + C=C[CH]C Yes
C=CCC + [CH]=CC↔ C=CC + C=C[CH]C Yes
C=CCC + C=[C]C↔ C=CC + C=C[CH]C Yes
C=CCC + [CH2]CC↔ CCC + C=C[CH]C Yes
C=CCC + C[CH]C↔ CCC + C=C[CH]C No
C=CCC + [CH2]↔ [CH3] + C=C[CH]C Yes
C=CCC + C[CH]O↔ CCO + C=C[CH]C No
C=CCC + [CH2]CO↔ CCO + C=C[CH]C Yes
C=CCC + CC[O]↔ CCO + C=C[CH]C Yes
C=CCC + C[C]=O↔ CC=O + C=C[CH]C No
C=CCC + [CH2]C=O↔ CC=O + C=C[CH]C Yes
C=CCC + C[CH2]↔ CC + C=C[CH]C Yes
C=CCC + [CH]=C=O↔ C=C=O + C=C[CH]C Yes
C=CCC + [CH2]O↔ CO + C=C[CH]C Yes
C=CCC + [C]#C↔ C#C + C=C[CH]C Yes
C=CCC + [CH]=O↔ C=O + C=C[CH]C No
C=CCC + [CH2]C=C=O↔ CC=C=O + C=C[CH]C No
C=CCC + C=C[C]=O↔ C=CC=O + C=C[CH]C No
C=CCC + [CH]=CC#C↔ C#CC=C + C=C[CH]C Yes
C=CCC + C#C[C]=C↔ C#CC=C + C=C[CH]C Yes
C=CCC + [CH]=CC=C↔ C=CC=C + C=C[CH]C No
C=CCC + C[CH]C(=O)C(C)C↔ CCC(=O)C(C)C + C=C[CH]C Yes
C1=CCC=C1 + [CH3]↔ C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH2]C=C↔ C=CC + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C[C](C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH2]C(C)C(=O)C(C)C↔ CC(C)C(=O)C(C)C + [CH]1C=CC=C1 Yes
[CH]1C=CC=C1 + C=CC=CC↔ C1=CCC=C1 + [CH2]C=CC=C No
[CH]1C=CC=C1 + C1=CCCC=C1↔ C1=CCC=C1 + [CH]1C=CC=CC1 Yes
C1=CCC=C1 + [H]↔ [H][H] + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [O]↔ [OH] + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [OH]↔ O + [CH]1C=CC=C1 No
[CH]1C=CC=C1 + [O]O↔ C1=CCC=C1 + [O][O] No
C1=CCC=C1 + [O]O↔ OO + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C#C[CH2]↔ C#CC + [CH]1C=CC=C1 No
C1=CCC=C1 + [CH]=C=C↔ C=C=C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [c]1ccccc1↔ c1ccccc1 + [CH]1C=CC=C1 No
C1=CCC=C1 + C=[C]C=C↔ C=CC=C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C=C1[CH]CC=C1↔ C=C1C=CCC1 + [CH]1C=CC=C1 No
[CH]1C=CC=C1 + C=C1CC=CC1↔ C1=CCC=C1 + C=C1[CH]C=CC1 Yes
[CH]1C=CC=C1 + C=C1C=CCC1↔ C1=CCC=C1 + C=C1C=C[CH]C1 Yes
C1=CCC=C1 + [CH]=C↔ C=C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH]=CC↔ C=CC + [CH]1C=CC=C1 No
C1=CCC=C1 + C=[C]C↔ C=CC + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH2]CC↔ CCC + [CH]1C=CC=C1 No
C1=CCC=C1 + C[CH]C↔ CCC + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH2]↔ [CH3] + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C[CH]O↔ CCO + [CH]1C=CC=C1 No
C1=CCC=C1 + [CH2]CO↔ CCO + [CH]1C=CC=C1 Yes
C1=CCC=C1 + CC[O]↔ CCO + [CH]1C=CC=C1 No
C1=CCC=C1 + C[C]=O↔ CC=O + [CH]1C=CC=C1 No
C1=CCC=C1 + [CH2]C=O↔ CC=O + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C[CH2]↔ CC + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH]=C=O↔ C=C=O + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH2]O↔ CO + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [C]#C↔ C#C + [CH]1C=CC=C1 No
C1=CCC=C1 + [CH]=O↔ C=O + [CH]1C=CC=C1 No
C1=CCC=C1 + [CH2]C=C=O↔ CC=C=O + [CH]1C=CC=C1 No
C1=CCC=C1 + C=C[C]=O↔ C=CC=O + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH]=CC#C↔ C#CC=C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C#C[C]=C↔ C#CC=C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH]=CC=C↔ C=CC=C + [CH]1C=CC=C1 No
C1=CCC=C1 + C[CH]C(=O)C(C)C↔ CCC(=O)C(C)C + [CH]1C=CC=C1 Yes
C1=CCC=C1 + C=C[CH]C↔ C=CCC + [CH]1C=CC=C1 Yes
C1=CCC=C1 + [CH3]↔ C + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + C=CC↔ C1=CCC=C1 + [CH2]C=C Yes
[C]1=CCC=C1 + CC(C)C(=O)C(C)C↔ C1=CCC=C1 + C[C](C)C(=O)C(C)C Yes
[C]1=CCC=C1 + CC(C)C(=O)C(C)C↔ C1=CCC=C1 + [CH2]C(C)C(=O)C(C)C Yes
[C]1=CCC=C1 + C=CC=CC↔ C1=CCC=C1 + [CH2]C=CC=C Yes
[C]1=CCC=C1 + C1=CCCC=C1↔ C1=CCC=C1 + [CH]1C=CC=CC1 Yes
C1=CCC=C1 + [H]↔ [H][H] + [C]1=CCC=C1 Yes
C1=CCC=C1 + [O]↔ [OH] + [C]1=CCC=C1 Yes
C1=CCC=C1 + [OH]↔ O + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + [O]O↔ C1=CCC=C1 + [O][O] No
[C]1=CCC=C1 + OO↔ C1=CCC=C1 + [O]O No
[C]1=CCC=C1 + C#CC↔ C1=CCC=C1 + C#C[CH2] Yes
[C]1=CCC=C1 + C=C=C↔ C1=CCC=C1 + [CH]=C=C Yes
C1=CCC=C1 + [c]1ccccc1↔ c1ccccc1 + [C]1=CCC=C1 No
[C]1=CCC=C1 + C=CC=C↔ C1=CCC=C1 + C=[C]C=C Yes
[C]1=CCC=C1 + C=C1C=CCC1↔ C1=CCC=C1 + C=C1[CH]CC=C1 Yes
[C]1=CCC=C1 + C=C1CC=CC1↔ C1=CCC=C1 + C=C1[CH]C=CC1 Yes
[C]1=CCC=C1 + C=C1C=CCC1↔ C1=CCC=C1 + C=C1C=C[CH]C1 Yes
C1=CCC=C1 + [CH]=C↔ C=C + [C]1=CCC=C1 Yes
C1=CCC=C1 + [CH]=CC↔ C=CC + [C]1=CCC=C1 Yes
C1=CCC=C1 + C=[C]C↔ C=CC + [C]1=CCC=C1 No
[C]1=CCC=C1 + CCC↔ C1=CCC=C1 + [CH2]CC No
[C]1=CCC=C1 + CCC↔ C1=CCC=C1 + C[CH]C Yes
C1=CCC=C1 + [CH2]↔ [CH3] + [C]1=CCC=C1 No
[C]1=CCC=C1 + CCO↔ C1=CCC=C1 + C[CH]O No
C1=CCC=C1 + [CH2]CO↔ CCO + [C]1=CCC=C1 Yes
C1=CCC=C1 + CC[O]↔ CCO + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + CC=O↔ C1=CCC=C1 + C[C]=O Yes
[C]1=CCC=C1 + CC=O↔ C1=CCC=C1 + [CH2]C=O Yes
[C]1=CCC=C1 + CC↔ C1=CCC=C1 + C[CH2] Yes
C1=CCC=C1 + [CH]=C=O↔ C=C=O + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + CO↔ C1=CCC=C1 + [CH2]O Yes
C1=CCC=C1 + [C]#C↔ C#C + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + C=O↔ C1=CCC=C1 + [CH]=O Yes
[C]1=CCC=C1 + CC=C=O↔ C1=CCC=C1 + [CH2]C=C=O No
[C]1=CCC=C1 + C=CC=O↔ C1=CCC=C1 + C=C[C]=O Yes
C1=CCC=C1 + [CH]=CC#C↔ C#CC=C + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + C#CC=C↔ C1=CCC=C1 + C#C[C]=C Yes
C1=CCC=C1 + [CH]=CC=C↔ C=CC=C + [C]1=CCC=C1 Yes
[C]1=CCC=C1 + CCC(=O)C(C)C↔ C1=CCC=C1 + C[CH]C(=O)C(C)C Yes
[C]1=CCC=C1 + C=CCC↔ C1=CCC=C1 + C=C[CH]C Yes
[C]1=CCC=C1 + C1=CCC=C1↔ C1=CCC=C1 + [CH]1C=CC=C1 Yes
[CH2]C(C)C(=O)C(C)C + C=C=C=C↔ CC(C)C(=O)C(C)C + [CH]=C=C=C Yes
CC(C)C(=O)C(C)C + [CH]=C=C=C↔ C=C=C=C + C[C](C)C(=O)C(C)C Yes
[H] + C=C=C=C↔ [H][H] + [CH]=C=C=C No
[CH3] + C=C=C=C↔ C + [CH]=C=C=C Yes
[C]#C + C=C=C=C↔ C#C + [CH]=C=C=C Yes
[CH]=C + C=C=C=C↔ C=C + [CH]=C=C=C No
[CH]=CC + C=C=C=C↔ C=CC + [CH]=C=C=C Yes
C=[C]C + C=C=C=C↔ C=CC + [CH]=C=C=C Yes
C=CC + [CH]=C=C=C↔ C=C=C=C + [CH2]C=C Yes
C=C=C + [CH]=C=C=C↔ C=C=C=C + [CH]=C=C No
C#CC + [CH]=C=C=C↔ C=C=C=C + C#C[CH2] No
[CH]=CC#C + C=C=C=C↔ C#CC=C + [CH]=C=C=C No
C#C[C]=C + C=C=C=C↔ C#CC=C + [CH]=C=C=C Yes
[CH]=CC=C + C=C=C=C↔ C=CC=C + [CH]=C=C=C Yes
C=[C]C=C + C=C=C=C↔ C=CC=C + [CH]=C=C=C Yes
CC=C=O + [CH]=C=C=C↔ C=C=C=C + [CH2]C=C=O Yes
C=CC=CC + [CH]=C=C=C↔ C=C=C=C + [CH2]C=CC=C No
[C]1=CCC=C1 + C=C=C=C↔ C1=CCC=C1 + [CH]=C=C=C No
C1=CCC=C1 + [CH]=C=C=C↔ C=C=C=C + [CH]1C=CC=C1 Yes
[c]1ccccc1 + C=C=C=C↔ c1ccccc1 + [CH]=C=C=C No
C1=CCCC=C1 + [CH]=C=C=C↔ C=C=C=C + [CH]1C=CC=CC1 Yes
C[CH]C(=O)C(C)C + C=C=C=C↔ CCC(=O)C(C)C + [CH]=C=C=C No
[OH] + C=C=C=C↔ O + [CH]=C=C=C Yes
[O]O + [CH]=C=C=C↔ C=C=C=C + [O][O] Yes
OO + [CH]=C=C=C↔ C=C=C=C + [O]O No
C=C1C=CCC1 + [CH]=C=C=C↔ C=C=C=C + C=C1C=C[CH]C1 No
C=C1C=CCC1 + [CH]=C=C=C↔ C=C=C=C + C=C1[CH]CC=C1 No
C=C1CC=CC1 + [CH]=C=C=C↔ C=C=C=C + C=C1[CH]C=CC1 No
[CH]=C=O + C=C=C=C↔ C=C=O + [CH]=C=C=C Yes
C=O + [CH]=C=C=C↔ C=C=C=C + [CH]=O Yes
[CH2]O + C=C=C=C↔ CO + [CH]=C=C=C No
[CH2]CO + C=C=C=C↔ CCO + [CH]=C=C=C Yes
C[CH]O + C=C=C=C↔ CCO + [CH]=C=C=C Yes
CC[O] + C=C=C=C↔ CCO + [CH]=C=C=C Yes
[CH2]C=O + C=C=C=C↔ CC=O + [CH]=C=C=C Yes
CC=O + [CH]=C=C=C↔ C=C=C=C + C[C]=O Yes
C[CH2] + C=C=C=C↔ CC + [CH]=C=C=C Yes
[CH2]CC + C=C=C=C↔ CCC + [CH]=C=C=C No
C[CH]C + C=C=C=C↔ CCC + [CH]=C=C=C Yes
C=CC=O + [CH]=C=C=C↔ C=C=C=C + C=C[C]=O Yes
C=CCC + [CH]=C=C=C↔ C=C=C=C + C=C[CH]C Yes
C=C=C=C + [O]↔ [OH] + [CH]=C=C=C Yes
C=C=C=C + [CH2]↔ [CH3] + [CH]=C=C=C Yes
C=CCC + [CH3]↔ C + [CH2]CC=C Yes
[CH2]CC=C + C=CC↔ C=CCC + [CH2]C=C Yes
[CH2]CC=C + CC(C)C(=O)C(C)C↔ C=CCC + C[C](C)C(=O)C(C)C Yes
[CH2]CC=C + CC(C)C(=O)C(C)C↔ C=CCC + [CH2]C(C)C(=O)C(C)C No
[CH2]CC=C + C=CC=CC↔ C=CCC + [CH2]C=CC=C Yes
[CH2]CC=C + C1=CCCC=C1↔ C=CCC + [CH]1C=CC=CC1 Yes
C=CCC + [H]↔ [H][H] + [CH2]CC=C No
C=CCC + [O]↔ [OH] + [CH2]CC=C Yes
C=CCC + [OH]↔ O + [CH2]CC=C Yes
[CH2]CC=C + [O]O↔ C=CCC + [O][O] No
[CH2]CC=C + OO↔ C=CCC + [O]O No
[CH2]CC=C + C#CC↔ C=CCC + C#C[CH2] No
[CH2]CC=C + C=C=C↔ C=CCC + [CH]=C=C Yes
C=CCC + [c]1ccccc1↔ c1ccccc1 + [CH2]CC=C Yes
[CH2]CC=C + C=CC=C↔ C=CCC + C=[C]C=C Yes
[CH2]CC=C + C=C1C=CCC1↔ C=CCC + C=C1[CH]CC=C1 No
[CH2]CC=C + C=C1CC=CC1↔ C=CCC + C=C1[CH]C=CC1 No
[CH2]CC=C + C=C1C=CCC1↔ C=CCC + C=C1C=C[CH]C1 No
C=CCC + [CH]=C↔ C=C + [CH2]CC=C Yes
C=CCC + [CH]=CC↔ C=CC + [CH2]CC=C Yes
C=CCC + C=[C]C↔ C=CC + [CH2]CC=C Yes
[CH2]CC=C + CCC↔ C=CCC + [CH2]CC Yes
[CH2]CC=C + CCC↔ C=CCC + C[CH]C Yes
C=CCC + [CH2]↔ [CH3] + [CH2]CC=C Yes
[CH2]CC=C + CCO↔ C=CCC + C[CH]O No
C=CCC + [CH2]CO↔ CCO + [CH2]CC=C Yes
C=CCC + CC[O]↔ CCO + [CH2]CC=C Yes
[CH2]CC=C + CC=O↔ C=CCC + C[C]=O Yes
[CH2]CC=C + CC=O↔ C=CCC + [CH2]C=O Yes
[CH2]CC=C + CC↔ C=CCC + C[CH2] Yes
C=CCC + [CH]=C=O↔ C=C=O + [CH2]CC=C Yes
[CH2]CC=C + CO↔ C=CCC + [CH2]O Yes
C=CCC + [C]#C↔ C#C + [CH2]CC=C Yes
[CH2]CC=C + C=O↔ C=CCC + [CH]=O No
[CH2]CC=C + CC=C=O↔ C=CCC + [CH2]C=C=O No
[CH2]CC=C + C=CC=O↔ C=CCC + C=C[C]=O Yes
C=CCC + [CH]=CC#C↔ C#CC=C + [CH2]CC=C Yes
[CH2]CC=C + C#CC=C↔ C=CCC + C#C[C]=C Yes
[CH2]CC=C + C=C=C=C↔ C=CCC + [CH]=C=C=C Yes
C=CCC + [CH]=CC=C↔ C=CC=C + [CH2]CC=C Yes
[CH2]CC=C + CCC(=O)C(C)C↔ C=CCC + C[CH]C(=O)C(C)C Yes
[CH2]CC=C + C=CCC↔ C=CCC + C=C[CH]C Yes
[CH2]CC=C + C1=CCC=C1↔ C=CCC + [CH]1C=CC=C1 Yes
C=CCC + [C]1=CCC=C1↔ C1=CCC=C1 + [CH2]CC=C Yes
CC(C)C + [CH3]↔ C + [CH2]C(C)C Yes
[CH2]C(C)C + C=CC↔ CC(C)C + [CH2]C=C Yes
[CH2]C(C)C + CC(C)C(=O)C(C)C↔ CC(C)C + C[C](C)C(=O)C(C)C Yes
[CH2]C(C)C + CC(C)C(=O)C(C)C↔ CC(C)C + [CH2]C(C)C(=O)C(C)C No
[CH2]C(C)C + C=CC=CC↔ CC(C)C + [CH2]C=CC=C Yes
[CH2]C(C)C + C1=CCCC=C1↔ CC(C)C + [CH]1C=CC=CC1 Yes
CC(C)C + [H]↔ [H][H] + [CH2]C(C)C Yes
CC(C)C + [O]↔ [OH] + [CH2]C(C)C Yes
CC(C)C + [OH]↔ O + [CH2]C(C)C Yes
[CH2]C(C)C + [O]O↔ CC(C)C + [O][O] No
[CH2]C(C)C + OO↔ CC(C)C + [O]O No
[CH2]C(C)C + C#CC↔ CC(C)C + C#C[CH2] Yes
[CH2]C(C)C + C=C=C↔ CC(C)C + [CH]=C=C Yes
CC(C)C + [c]1ccccc1↔ c1ccccc1 + [CH2]C(C)C Yes
CC(C)C + C=[C]C=C↔ C=CC=C + [CH2]C(C)C No
[CH2]C(C)C + C=C1C=CCC1↔ CC(C)C + C=C1[CH]CC=C1 No
[CH2]C(C)C + C=C1CC=CC1↔ CC(C)C + C=C1[CH]C=CC1 Yes
[CH2]C(C)C + C=C1C=CCC1↔ CC(C)C + C=C1C=C[CH]C1 Yes
CC(C)C + [CH]=C↔ C=C + [CH2]C(C)C No
CC(C)C + [CH]=CC↔ C=CC + [CH2]C(C)C Yes
CC(C)C + C=[C]C↔ C=CC + [CH2]C(C)C Yes
CC(C)C + [CH2]CC↔ CCC + [CH2]C(C)C Yes
[CH2]C(C)C + CCC↔ CC(C)C + C[CH]C Yes
CC(C)C + [CH2]↔ [CH3] + [CH2]C(C)C Yes
[CH2]C(C)C + CCO↔ CC(C)C + C[CH]O No
CC(C)C + [CH2]CO↔ CCO + [CH2]C(C)C Yes
CC(C)C + CC[O]↔ CCO + [CH2]C(C)C Yes
[CH2]C(C)C + CC=O↔ CC(C)C + C[C]=O Yes
[CH2]C(C)C + CC=O↔ CC(C)C + [CH2]C=O Yes
[CH2]C(C)C + CC↔ CC(C)C + C[CH2] Yes
CC(C)C + [CH]=C=O↔ C=C=O + [CH2]C(C)C Yes
[CH2]C(C)C + CO↔ CC(C)C + [CH2]O Yes
CC(C)C + [C]#C↔ C#C + [CH2]C(C)C Yes
[CH2]C(C)C + C=O↔ CC(C)C + [CH]=O No
[CH2]C(C)C + CC=C=O↔ CC(C)C + [CH2]C=C=O No
[CH2]C(C)C + C=CC=O↔ CC(C)C + C=C[C]=O Yes
CC(C)C + [CH]=CC#C↔ C#CC=C + [CH2]C(C)C Yes
CC(C)C + C#C[C]=C↔ C#CC=C + [CH2]C(C)C Yes
[CH2]C(C)C + C=C=C=C↔ CC(C)C + [CH]=C=C=C Yes
CC(C)C + [CH]=CC=C↔ C=CC=C + [CH2]C(C)C Yes
[CH2]C(C)C + CCC(=O)C(C)C↔ CC(C)C + C[CH]C(=O)C(C)C Yes
[CH2]C(C)C + C=CCC↔ CC(C)C + C=C[CH]C Yes
[CH2]C(C)C + C1=CCC=C1↔ CC(C)C + [CH]1C=CC=C1 Yes
CC(C)C + [C]1=CCC=C1↔ C1=CCC=C1 + [CH2]C(C)C Yes
CC(C)C + [CH2]CC=C↔ C=CCC + [CH2]C(C)C Yes
[CH2]C(C)C(=O)C(C)C + C=CCC=C↔ CC(C)C(=O)C(C)C + C=C[CH]C=C Yes
C[C](C)C(=O)C(C)C + C=CCC=C↔ CC(C)C(=O)C(C)C + C=C[CH]C=C Yes
[H] + C=CCC=C↔ [H][H] + C=C[CH]C=C Yes
[CH3] + C=CCC=C↔ C + C=C[CH]C=C Yes
[C]#C + C=CCC=C↔ C#C + C=C[CH]C=C Yes
[CH]=C + C=CCC=C↔ C=C + C=C[CH]C=C No
[CH]=CC + C=CCC=C↔ C=CC + C=C[CH]C=C No
C=[C]C + C=CCC=C↔ C=CC + C=C[CH]C=C Yes
[CH2]C=C + C=CCC=C↔ C=CC + C=C[CH]C=C No
[CH]=C=C + C=CCC=C↔ C=C=C + C=C[CH]C=C Yes
C#C[CH2] + C=CCC=C↔ C#CC + C=C[CH]C=C No
[CH]=CC#C + C=CCC=C↔ C#CC=C + C=C[CH]C=C Yes
C#C[C]=C + C=CCC=C↔ C#CC=C + C=C[CH]C=C Yes
[CH]=CC=C + C=CCC=C↔ C=CC=C + C=C[CH]C=C No
C=[C]C=C + C=CCC=C↔ C=CC=C + C=C[CH]C=C No
[CH2]C=C=O + C=CCC=C↔ CC=C=O + C=C[CH]C=C No
[CH2]C=CC=C + C=CCC=C↔ C=CC=CC + C=C[CH]C=C Yes
[C]1=CCC=C1 + C=CCC=C↔ C1=CCC=C1 + C=C[CH]C=C Yes
[CH]1C=CC=C1 + C=CCC=C↔ C1=CCC=C1 + C=C[CH]C=C Yes
[c]1ccccc1 + C=CCC=C↔ c1ccccc1 + C=C[CH]C=C Yes
[CH]1C=CC=CC1 + C=CCC=C↔ C1=CCCC=C1 + C=C[CH]C=C No
C[CH]C(=O)C(C)C + C=CCC=C↔ CCC(=O)C(C)C + C=C[CH]C=C Yes
[OH] + C=CCC=C↔ O + C=C[CH]C=C Yes
[O]O + C=C[CH]C=C↔ C=CCC=C + [O][O] No
[O]O + C=CCC=C↔ OO + C=C[CH]C=C No
C=C1C=C[CH]C1 + C=CCC=C↔ C=C1C=CCC1 + C=C[CH]C=C No
C=C1[CH]CC=C1 + C=CCC=C↔ C=C1C=CCC1 + C=C[CH]C=C Yes
C=C1CC=CC1 + C=C[CH]C=C↔ C=CCC=C + C=C1[CH]C=CC1 Yes
[CH]=C=O + C=CCC=C↔ C=C=O + C=C[CH]C=C Yes
[CH]=O + C=CCC=C↔ C=O + C=C[CH]C=C Yes
[CH2]O + C=CCC=C↔ CO + C=C[CH]C=C Yes
[CH2]CO + C=CCC=C↔ CCO + C=C[CH]C=C Yes
C[CH]O + C=CCC=C↔ CCO + C=C[CH]C=C Yes
CC[O] + C=CCC=C↔ CCO + C=C[CH]C=C Yes
[CH2]C=O + C=CCC=C↔ CC=O + C=C[CH]C=C Yes
C[C]=O + C=CCC=C↔ CC=O + C=C[CH]C=C No
C[CH2] + C=CCC=C↔ CC + C=C[CH]C=C Yes
[CH2]CC + C=CCC=C↔ CCC + C=C[CH]C=C No
C[CH]C + C=CCC=C↔ CCC + C=C[CH]C=C Yes
C=C[C]=O + C=CCC=C↔ C=CC=O + C=C[CH]C=C Yes
[CH2]C(C)C + C=CCC=C↔ CC(C)C + C=C[CH]C=C Yes
C=C[CH]C + C=CCC=C↔ C=CCC + C=C[CH]C=C Yes
[CH2]CC=C + C=CCC=C↔ C=CCC + C=C[CH]C=C Yes
[CH]=C=C=C + C=CCC=C↔ C=C=C=C + C=C[CH]C=C Yes
C=CCC=C + [O]↔ [OH] + C=C[CH]C=C No
C=CCC=C + [CH2]↔ [CH3] + C=C[CH]C=C No
B.7 Effect of increasing force constants and reducing the difference between upper and lower limits
Figure B.2: Reducing the difference between the upper and lower distance limits and increasing the force constants for the reaction center each reduced the error in the dHY distance introduced during construction of the three-dimensional transition state estimate. The error reductions are additive, as seen when the modifications are combined.
Figure B.3: Reducing the difference between the upper and lower distance limits and increasing the force constants for the reaction center each reduced the error in the dXY distance introduced during construction of the three-dimensional transition state estimate. The error reductions are additive, as seen when the modifications are combined.
C. Kinetic calculations
C.1 Molecular group trees
C.1.1 Hydrogen Abstraction
Table C.1: Molecular group tree with distance group data in Å for the hydrogen abstraction reaction family.
Groups TS count dXH (Å) dHY (Å) dXY (Å)
L1: X H or Xrad H Xbirad H Xtrirad H 2160 1.302340 1.301790 2.587830
L2: H2 99 -0.334232 -0.026237 -0.348259
L2: C H 1531 0.042361 0.013802 0.056955
L3: Cs H 983 0.021965 0.030462 0.055401
L4: Csnorad H 949 0.021077 0.032448 0.056390
L5: C methane 63 0.084181 -0.043153 0.053187
L5: CsRHHH 542 0.032564 0.016179 0.053140
L6: CsCHHH 469 0.034758 0.013193 0.054628
L7: C/H3/Cs 257 0.056230 -0.015089 0.049210
L7: C/H3/Cd 128 -0.005007 0.066501 0.066401
L7: C/H3/Ct 20 0.033807 0.014522 0.055796
L7: C/H3/Cb
L6: CsOHHH 73 0.018498 0.035323 0.043597
L6: CsClHHH
L6: CsNHHH
L6: CsSiHHH
L6: CsSHHH
L5: CsRRHH 308 -0.011445 0.077412 0.063968
L6: CsCCHH 249 -0.016637 0.083646 0.068467
L7: C/H2/Cs/Cs 71 0.024854 0.014441 0.043693
L7: C/H2/Cs/Cd 101 -0.029327 0.104563 0.074517
L7: C/H2/Cs/Ct
L7: C/H2/Cs/Cb
L7: C/H2/Cd/Cd 60 -0.041748 0.129523 0.089601
L7: C/H2/Cd/Ct
L7: C/H2/Cd/Cb
L7: C/H2/Ct/Ct
L7: C/H2/Ct/Cb
L7: C/H2/Cb/Cb
L6: CsCOHH 59 0.010319 0.051278 0.045111
L7: C/H2/Cs/O 59 0.010319 0.051278 0.045111
L7: C/H2/Cd/O
L7: C/H2/Ct/O
L7: C/H2/Cb/O
L6: CsCClHH
L6: CsCNHH
L6: CsCSiHH
L6: CsCSHH
L6: CsOOHH
L5: CsRRRH 36 0.002993 0.038881 0.045616
L6: CsCCCH 29 0.000142 0.038829 0.043770
L7: C/H/Cs/Cs/Cs 12 -0.003029 0.034764 0.040281
L7: C/H/Cs/Cs/Cd
L7: C/H/Cs/Cs/Ct
L7: C/H/Cs/Cs/Cb
L7: C/H/Cs/Cd/Cd
L7: C/H/Cs/Cd/Ct
L7: C/H/Cs/Cd/Cb
L7: C/H/Cs/Ct/Ct
L7: C/H/Cs/Ct/Cb
L7: C/H/Cs/Cb/Cb
L7: C/H/Cd/Cd/Cd
L7: C/H/Cd/Cd/Ct
L7: C/H/Cd/Cd/Cb
L7: C/H/Cd/Ct/Ct
L7: C/H/Cd/Ct/Cb
L7: C/H/Cd/Cb/Cb
L7: C/H/Ct/Ct/Ct
L7: C/H/Ct/Ct/Cb
L7: C/H/Ct/Cb/Cb
L7: C/H/Cb/Cb/Cb
L6: CsCCOH 7 0.015687 0.039114 0.053833
L7: C/H/Cs/Cs/O 7 0.015687 0.039114 0.053833
L7: C/H/Cs/Cd/O
L7: C/H/Cs/Ct/O
L7: C/H/Cs/Cb/O
L7: C/H/Cd/Cd/O
L7: C/H/Cd/Ct/O
L7: C/H/Cd/Cb/O
L7: C/H/Ct/Ct/O
L7: C/H/Ct/Cb/O
L7: C/H/Cb/Cb/O
L6: CsCCClH
L6: CsCCNH
L6: CsCCSiH
L6: CsCCSH
L6: CsCOOH
L7: C/H/Cs/O/O
L7: C/H/Cd/O/O
L7: C/H/Ct/O/O
L7: C/H/Cb/O/O
L6: CsOOOH
L4: Csrad H 32 0.046940 -0.021165 0.033333
L5: C methyl 17 0.090667 -0.061808 0.037283
L5: CsradRH2 15 -0.008411 0.030282 0.028332
L6: CsradCHH 15 -0.008411 0.030282 0.028332
L7: Csrad/H/Cs/H
L7: Csrad/H/Cd/H
L7: Csrad/H/Ct/H 15 -0.008411 0.030282 0.028332
L7: Csrad/H/Cb/H
L6: CsradOH2
L5: CsradRRH
L6: CsradCCH
L7: Csrad/Cs/Cs/H
L7: Csrad/Cs/Cd/H
L7: Csrad/Cs/Ct/H
L7: Csrad/Cs/Cb/H
L7: Csrad/Cd/Cd/H
L7: Csrad/Cd/Ct/H
L7: Csrad/Cd/Cb/H
L7: Csrad/Ct/Ct/H
L7: Csrad/Ct/Cb/H
L7: Csrad/Cb/Cb/H
L6: CsradCOH
L7: Csrad/Cs/O/H
L7: Csrad/Cd/O/H
L7: Csrad/Ct/O/H
L7: Csrad/Cb/O/H
L6: CsradOOH
L4: CsbiradH 2 0.046729 -0.150684 -0.143516
L5: Cs singletH
L6: Cs singletHH
L6: Cs singletRH
L7: C singletCH
L8: C singlet/Cs/H
L8: C singlet/Cd/H
L8: C singlet/Ct/H
L8: C singlet/Cb/H
L7: C singletOH
L5: Cs tripletH 2 0.046729 -0.150684 -0.143516
L6: Cs tripletHH 2 0.046729 -0.150684 -0.143516
L6: Cs tripletRH
L7: Cs tripletCH
L8: C triplet/Cs/H
L8: C triplet/Cd/H
L8: C triplet/Ct/H
L8: C triplet/Cb/H
L7: Cs tripletOH
L4: CstriradH
L5: Cdoublet H
L5: Cquartet H
L3: Cd H 310 0.089459 -0.051663 0.040317
L4: Cdnorad H 284 0.097443 -0.057949 0.043368
L5: Cd C/R/H 284 0.097443 -0.057949 0.043368
L6: Cd C/H2 180 0.098347 -0.058882 0.042131
L7: Cd Cds/H2 113 0.120638 -0.076649 0.048701
L7: Cd Cdd/H2 67 0.062945 -0.030666 0.031698
L6: Cd C/C/H 104 0.095841 -0.056294 0.045563
L7: Cd Cds/Cs/H 38 0.089965 -0.048448 0.047802
L7: Cd Cds/Cd/H 51 0.105603 -0.066739 0.044881
L7: Cd Cds/Ct/H 15 0.077044 -0.039990 0.042408
L7: Cd Cds/Cb/H
L7: Cd Cdd/Cs/H
L7: Cd Cdd/Cd/H
L7: Cd Cdd/Ct/H
L7: Cd Cdd/Cb/H
L6: Cd C/O/H
L7: Cd Cds/O/H
L7: Cd Cdd/O/H
L5: Cd O/R/H
L6: Cd O/H2
L6: Cd O/C/H
L7: Cd O/Cs/H
L7: Cd O/Cd/H
L7: Cd O/Ct/H
L7: Cd O/Cb/H
L6: Cd O/O/H
L4: Cdrad H 26 0.006512 0.013635 0.008624
L5: Cdrad C/H 26 0.006512 0.013635 0.008624
L6: Cdrad Cds/H
L6: Cdrad Cdd/H 26 0.006512 0.013635 0.008624
L5: Cdrad O/H
L3: Ct H 20 0.459597 -0.204392 0.207500
L3: Cb H 30 0.119688 -0.081353 0.044020
L2: O H 520 -0.065043 -0.037997 -0.107342
L3: OradH 30 -0.035310 -0.089621 -0.121745
L3: ORH 490 -0.067028 -0.034550 -0.106381
L4: OHH 70 0.116491 -0.170432 -0.075411
L4: OCH 105 -0.013657 -0.113478 -0.132370
L5: O/Cs/H 99 -0.014675 -0.114277 -0.133992
L5: O/Cd/H 4 -0.069779 -0.048202 -0.119096
L5: O/Ct/H
L5: O/Cb/H
L4: OOH 315 -0.129641 0.024288 -0.105867
L2: Cl H 1 0.395488 -0.184160 0.069450
L2: Si H 7 0.304476 0.226924 0.533363
L2: N H
L3: N3 H
L4: N3s H
L5: NH3
L5: N3s/H2/R
L6: N3s/H2/C
L6: N3s/H2/N
L6: N3s/H2/O
L6: N3s/H2/Si
L6: N3s/H2/S
L6: N3s/H2/Cl
L5: N3s/H/R/R
L6: N3s/H/C/C
L6: N3s/H/C/N
L6: N3s/H/C/O
L6: N3s/H/C/Si
L6: N3s/H/C/S
L6: N3s/H/C/Cl
L6: N3s/H/N/N
L6: N3s/H/N/O
L6: N3s/H/N/Si
L6: N3s/H/N/S
L6: N3s/H/N/Cl
L6: N3s/H/O/O
L6: N3s/H/O/Si
L6: N3s/H/O/S
L6: N3s/H/O/Cl
L6: N3s/H/Si/Si
L6: N3s/H/Si/S
L6: N3s/H/Si/Cl
L6: N3s/H/S/S
L6: N3s/H/S/Cl
L6: N3s/H/Cl/Cl
L4: N3d H
L3: N5 H
L4: N5s H
L4: N5d H
L2: S H 2 0.225939 0.128426 0.248627
L3: SradH
L3: SRH 2 0.225939 0.128426 0.248627
L4: SHH 1 0.151101 0.201654 0.199974
L4: SClH
L4: SOH
L4: SCH 1 0.285809 0.069844 0.287550
L5: S/Cs/H 1 0.285809 0.069844 0.287550
L5: S/Cd/H
L5: S/Ct/H
L5: S/Cb/H
L4: SSH
L4: SNH
L4: SSiH
L1: Y rad birad trirad quadrad
L2: Hrad 99 -0.024278 -0.336325 -0.348540
L2: Orad 520 -0.036287 -0.067237 -0.107779
L3: OjR 490 -0.032816 -0.069208 -0.106764
L4: OjH 70 -0.168565 0.112385 -0.077781
L4: OjC 105 -0.111642 -0.016796 -0.133448
L5: OjCs 99 -0.112816 -0.017241 -0.134989
L5: OjCd 4 -0.043477 -0.072532 -0.114891
L5: OjCt
L5: OjCb
L4: OjO 315 0.026408 -0.131559 -0.105565
L3: O atom triplet 30 -0.087969 -0.037898 -0.122902
L2: Crad 1531 0.013032 0.043069 0.056882
L3: Cj 1466 0.013231 0.043776 0.058116
L4: Csj 949 0.031271 0.021748 0.056109
L5: Cs methyl 63 -0.041935 0.082581 0.052634
L5: CsjRH2 542 0.015359 0.033128 0.052856
L6: CsjCH2 469 0.012334 0.035364 0.054236
L7: Csj/Cs/H2 257 -0.016078 0.056620 0.048624
L7: Csj/Cd/H2 128 0.065052 -0.003387 0.066373
L7: Csj/Ct/H2 20 0.015649 0.032400 0.055232
L7: Csj/Cb/H2
L6: CsjOH2 73 0.034532 0.018962 0.044109
L5: CsjRRH 308 0.074972 -0.009945 0.063748
L6: CsjCCH 249 0.081137 -0.015097 0.068088
L7: Csj/Cs/Cs/H 71 0.013072 0.025416 0.043575
L7: Csj/Cs/Cd/H 101 0.101037 -0.027099 0.074047
L7: Csj/Cs/Ct/H
L7: Csj/Cs/Cb/H
L7: Csj/Cd/Cd/H 60 0.127932 -0.040702 0.089349
L7: Csj/Cd/Ct/H
L7: Csj/Cd/Cb/H
L7: Csj/Ct/Ct/H
L7: Csj/Ct/Cb/H
L7: Csj/Cb/Cb/H
L6: CsjCOH 59 0.049303 0.011508 0.045677
L7: Csj/Cs/O/H 59 0.049303 0.011508 0.045677
L7: Csj/Cd/O/H
L7: Csj/Ct/O/H
L7: Csj/Cb/O/H
L6: CsjOOH
L5: CsjRRR 36 0.038420 0.002291 0.045182
L6: CsjCCC 29 0.037866 -0.000153 0.042817
L7: Csj/Cs/Cs/Cs 12 0.037019 -0.005761 0.040111
L7: Csj/Cs/Cs/Cd
L7: Csj/Cs/Cs/Ct
L7: Csj/Cs/Cs/Cb
L7: Csj/Cs/Cd/Cd
L7: Csj/Cs/Cd/Ct
L7: Csj/Cs/Cd/Cb
L7: Csj/Cs/Ct/Ct
L7: Csj/Cs/Ct/Cb
L7: Csj/Cs/Cb/Cb
L7: Csj/Cd/Cd/Cd
L7: Csj/Cd/Cd/Ct
L7: Csj/Cd/Cd/Cb
L7: Csj/Cd/Ct/Ct
L7: Csj/Cd/Ct/Cb
L7: Csj/Cd/Cb/Cb
L7: Csj/Ct/Ct/Ct
L7: Csj/Ct/Ct/Cb
L7: Csj/Ct/Cb/Cb
L7: Csj/Cb/Cb/Cb
L6: CsjCCO 7 0.040874 0.013126 0.055667
L7: Csj/Cs/Cs/O 7 0.040874 0.013126 0.055667
L7: Csj/Cs/Cd/O
L7: Csj/Cs/Ct/O
L7: Csj/Cs/Cb/O
L7: Csj/Cd/Cd/O
L7: Csj/Cd/Ct/O
L7: Csj/Cd/Cb/O
L7: Csj/Ct/Ct/O
L7: Csj/Ct/Cb/O
L7: Csj/Cb/Cb/O
L6: CsjCOO
L7: Csj/Cs/O/O
L7: Csj/Cd/O/O
L7: Csj/Ct/O/O
L7: Csj/Cb/O/O
L6: CsjOOO
L4: Cdj 284 -0.058313 0.098243 0.043562
L5: Cdj CR 284 -0.058313 0.098243 0.043562
L6: Cdj CH 180 -0.059457 0.099451 0.042356
L7: Cdj CdsH 113 -0.077882 0.122527 0.049057
L7: Cdj CddH 67 -0.030443 0.063113 0.031804
L6: Cdj CC 104 -0.056290 0.096106 0.045695
L7: Cdj CdsCs 38 -0.048294 0.089904 0.047720
L7: Cdj CdsCd 51 -0.067373 0.106593 0.045123
L7: Cdj CdsCt 15 -0.038068 0.075533 0.042704
L7: Cdj CdsCb
L7: Cdj CddCs
L7: Cdj CddCd
L7: Cdj CddCt
L7: Cdj CddCb
L6: Cdj CO
L7: Cdj CdsO
L7: Cdj CddO
L5: Cdj OR
L6: Cdj OH
L6: Cdj OC
L7: Cdj OCs
L7: Cdj OCd
L7: Cdj OCt
L7: Cdj OCb
L6: Cdj OO
L4: Ctj 20 -0.202539 0.459642 0.208324
L5: CtjC 20 -0.202539 0.459642 0.208324
L4: Cbj 30 -0.083989 0.123085 0.044324
L3: Cjj 58 -0.004950 0.028166 0.021371
L4: Csjj 32 -0.022049 0.047291 0.032235
L5: Cs sing
L6: Cs singH2
L6: Cs singRH
L7: Cs singCH
L8: Cs sing/Cs/H
L8: Cs sing/Cd/H
L8: Cs sing/Ct/H
L8: Cs sing/Cb/H
L7: Cs singOH
L6: Cs singRR
L7: Cs singCC
L8: Cs sing/Cs/Cs
L8: Cs sing/Cs/Cd
L8: Cs sing/Cs/Ct
L8: Cs sing/Cs/Cb
L8: Cs sing/Cd/Cd
L8: Cs sing/Cd/Ct
L8: Cs sing/Cd/Cb
L8: Cs sing/Ct/Ct
L8: Cs sing/Ct/Cb
L8: Cs sing/Cb/Cb
L7: Cs singCO
L8: Cs sing/Cs/O
L8: Cs sing/Cd/O
L8: Cs sing/Ct/O
L8: Cs sing/Cb/O
L7: Cs singOO
L5: Cs trip 32 -0.022049 0.047291 0.032235
L6: Cs tripH2 17 -0.059322 0.087916 0.036802
L6: Cs tripRH 15 0.027153 -0.006333 0.026207
L7: Cs tripCH 15 0.027153 -0.006333 0.026207
L8: Cs trip/Cs/H
L8: Cs trip/Cd/H
L8: Cs trip/Ct/H 15 0.027153 -0.006333 0.026207
L8: Cs trip/Cb/H
L7: Cs tripOH
L6: Cs tripRR
L7: Cs tripCC
L8: Cs trip/Cs/Cs
L8: Cs trip/Cs/Cd
L8: Cs trip/Cs/Ct
L8: Cs trip/Cs/Cb
L8: Cs trip/Cd/Cd
L8: Cs trip/Cd/Ct
L8: Cs trip/Cd/Cb
L8: Cs trip/Ct/Ct
L8: Cs trip/Ct/Cb
L8: Cs trip/Cb/Cb
L7: Cs tripCO
L8: Cs trip/Cs/O
L8: Cs trip/Cd/O
L8: Cs trip/Ct/O
L8: Cs trip/Cb/O
L7: Cs tripOO
L4: Cdjj 26 0.013999 0.006970 0.009330
L5: Cd singletR
L6: Cd singletC
L6: Cd singletO
L5: Cd tripletR 26 0.013999 0.006970 0.009330
L6: Cd tripletC 26 0.013999 0.006970 0.009330
L6: Cd tripletO
L3: Cjjj 2 -0.152991 0.048485 -0.144094
L4: C doubletR
L4: C quartetR 2 -0.152991 0.048485 -0.144094
L3: Cjjjj
L4: C quintet
L4: C triplet
L2: Clrad 1 -0.186277 0.397424 0.069321
L2: Sirad 7 0.226142 0.305217 0.533278
L2: Srad 2 0.125799 0.228185 0.248529
L3: Srad H 1 0.198989 0.152974 0.199177
L3: Srad R 1 0.067246 0.288355 0.288011
L4: Srad C 1 0.067246 0.288355 0.288011
L4: Srad N
L4: Srad O
L4: Srad Si
L4: Srad S
L4: Srad Cl
L2: Nrad
L3: N3 rad
L4: N3s rad
L4: NH2
L4: N3s/H/R
L4: N3s/R/R
L4: N3d rad
L3: N5 rad
L4: N5s rad
L4: N5d rad
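The rows of Table C.1 can be read as a base value plus corrections: the populated L1 row holds distances close to the overall average, and deeper rows hold small signed offsets. The sketch below shows one way such a table could be used to estimate transition state distances. It assumes, as a labeled assumption rather than a statement of the thesis implementation, that the estimate is the L1 base value plus the correction of the most specific populated group matched in each of the two trees, and that a group without data falls back to its nearest populated ancestor. The dictionaries `XH_TREE` and `Y_TREE` and the functions `correction` and `estimate` are hypothetical names; the numbers are copied from a few rows of the table.

```python
from typing import Dict, Optional, Tuple

Distances = Tuple[float, float, float]  # (dXH, dHY, dXY) in angstroms
Node = Tuple[Optional[str], Optional[Distances]]  # (parent name, data or None)

# Base distances taken from the populated L1 row of Table C.1.
BASE: Distances = (1.302340, 1.301790, 2.587830)

# A few rows copied from the X-H tree of Table C.1 (illustrative subset).
XH_TREE: Dict[str, Node] = {
    "C H": (None, (0.042361, 0.013802, 0.056955)),
    "Cs H": ("C H", (0.021965, 0.030462, 0.055401)),
    "Csnorad H": ("Cs H", (0.021077, 0.032448, 0.056390)),
    "C methane": ("Csnorad H", (0.084181, -0.043153, 0.053187)),
    "CsRHHH": ("Csnorad H", (0.032564, 0.016179, 0.053140)),
    "CsCHHH": ("CsRHHH", (0.034758, 0.013193, 0.054628)),
    "C/H3/Cb": ("CsCHHH", None),  # listed in the table without data
}

# A few rows copied from the Y-radical tree of Table C.1 (illustrative subset).
Y_TREE: Dict[str, Node] = {
    "Crad": (None, (0.013032, 0.043069, 0.056882)),
    "Cj": ("Crad", (0.013231, 0.043776, 0.058116)),
    "Csj": ("Cj", (0.031271, 0.021748, 0.056109)),
    "Cs methyl": ("Csj", (-0.041935, 0.082581, 0.052634)),
}


def correction(tree: Dict[str, Node], group: str) -> Distances:
    """Return the data of `group`, or of its nearest ancestor that has data."""
    name: Optional[str] = group
    while name is not None:
        parent, data = tree[name]
        if data is not None:
            return data
        name = parent
    return (0.0, 0.0, 0.0)  # nothing populated on this branch


def estimate(x_group: str, y_group: str) -> Distances:
    """Assumed additive estimate: base value plus one correction per tree."""
    cx = correction(XH_TREE, x_group)
    cy = correction(Y_TREE, y_group)
    dxh, dhy, dxy = (b + x + y for b, x, y in zip(BASE, cx, cy))
    return (dxh, dhy, dxy)
```

Under these assumptions, `estimate("C methane", "Cs methyl")` gives approximately (1.3446, 1.3412, 2.6937) Å, and a group with no data such as `"C/H3/Cb"` inherits the correction of its parent `"CsCHHH"`.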
C.1.2 Intra-hydrogen migration
Table C.2: Molecular group tree with distance group data in Å for the intra-hydrogen migration reaction family.
Groups TS count dXH (Å) dHY (Å) dXY (Å)
L1: RnH 140 2.312110 1.285790 1.285570
L2: R2Hall 14 -0.423185 -0.002694 -0.002561
L3: R2H s 14 -0.423185 -0.002694 -0.002561
L3: R2H r
L4: R2H d
L4: R2H t
L4: R2H b
L2: R3Hall 6 -0.210247 0.038629 0.042609
L3: R3H ss 6 -0.210247 0.038629 0.042609
L3: R3H sr
L4: R3H sd
L4: R3H st
L4: R3H sb
L3: R3H rs
L4: R3H ds
L4: R3H ts
L4: R3H bs
L3: R3H rr
L4: R3H bb
L2: R4Hall 50 0.096514 0.018029 0.017229
L3: R4H sss 46 0.081186 0.013837 0.012979
L3: R4H ssr
L4: R4H ssd
L4: R4H sst
L4: R4H ssb
L3: R4H srs 4 0.220103 0.051828 0.051493
L4: R4H sds 4 0.220103 0.051828 0.051493
L4: R4H sts
L4: R4H sbs
L3: R4H rss
L4: R4H dss
L4: R4H tss
L4: R4H bss
L3: R4H srr
L3: R4H rrs
L3: R4H rsr
L4: R4H dsd
L4: R4H tsd
L4: R4H bsd
L4: R4H dst
L4: R4H tst
L4: R4H bst
L4: R4H dsb
L4: R4H tsb
L4: R4H bsb
L3: R4H rrr
L2: R5Hall 58 0.134017 -0.016606 -0.016403
L3: R5H ssss 56 0.126439 -0.018537 -0.018273
L3: R5H sssr
L4: R5H sssd
L4: R5H ssst
L4: R5H sssb
L3: R5H ssrs 1 0.259051 0.013097 0.016608
L4: R5H ssds 1 0.259051 0.013097 0.016608
L4: R5H ssts
L4: R5H ssbs
L3: R5H srss 1 0.259051 0.017417 0.012288
L4: R5H sdss 1 0.259051 0.017417 0.012288
L4: R5H stss
L4: R5H sbss
L3: R5H rsss
L4: R5H dsss
L4: R5H tsss
L4: R5H bsss
L3: R5H ssrr
L3: R5H srsr
L4: R5H sdsd
L4: R5H stsd
L4: R5H sbsd
L4: R5H sdst
L4: R5H stst
L4: R5H sbst
L4: R5H sdsb
L4: R5H stsb
L4: R5H sbsb
L3: R5H rssr
L4: R5H dssd
L4: R5H tssd
L4: R5H bssd
L4: R5H dsst
L4: R5H tsst
L4: R5H bsst
L4: R5H dssb
L4: R5H tssb
L4: R5H bssb
L3: R5H srrs
L3: R5H rsrs
L4: R5H dsds
L4: R5H tsds
L4: R5H bsds
L4: R5H dsts
L4: R5H tsts
L4: R5H bsts
L4: R5H dsbs
L4: R5H tsbs
L4: R5H bsbs
L3: R5H rrss
L3: R5H srrr
L3: R5H rsrr
L3: R5H rrsr
L3: R5H rrrs
L3: R5H rrrr
L2: R6Hall 10 0.158073 -0.030703 -0.030480
L3: R6H sssss 10 0.158073 -0.030703 -0.030480
L3: R6H ssssr
L4: R6H ssssd
L4: R6H sssst
L4: R6H ssssb
L3: R6H sssrs
L4: R6H sssds
L4: R6H sssts
L4: R6H sssbs
L3: R6H ssrss
L4: R6H ssdss
L4: R6H sstss
L4: R6H ssbss
L3: R6H srsss
L4: R6H sdsss
L4: R6H stsss
L4: R6H sbsss
L3: R6H rssss
L4: R6H dssss
L4: R6H tssss
L4: R6H bssss
L3: R6H sssrr
L3: R6H ssrsr
L4: R6H ssdsd
L4: R6H sstsd
L4: R6H ssbsd
L4: R6H ssdst
L4: R6H sstst
L4: R6H ssbst
L4: R6H ssdsb
L4: R6H sstsb
L4: R6H ssbsb
L3: R6H srssr
L4: R6H sdssd
L4: R6H stssd
L4: R6H sbssd
L4: R6H sdsst
L4: R6H stsst
L4: R6H sbsst
L4: R6H sdssb
L4: R6H stssb
L4: R6H sbssb
L3: R6H rsssr
L4: R6H dsssd
L4: R6H tsssd
L4: R6H bsssd
L4: R6H dssst
L4: R6H tssst
L4: R6H bssst
L4: R6H dsssb
L4: R6H tsssb
L4: R6H bsssb
L3: R6H ssrrs
L3: R6H srsrs
L4: R6H sdsds
L4: R6H stsds
L4: R6H sbsds
L4: R6H sdsts
L4: R6H ststs
L4: R6H sbsts
L4: R6H sdsbs
L4: R6H stsbs
L4: R6H sbsbs
L3: R6H rssrs
L4: R6H dssds
L4: R6H tssds
L4: R6H bssds
L4: R6H dssts
L4: R6H tssts
L4: R6H bssts
L4: R6H dssbs
L4: R6H tssbs
L4: R6H bssbs
L3: R6H srrss
L3: R6H rsrss
L4: R6H dsdss
L4: R6H tsdss
L4: R6H bsdss
L4: R6H dstss
L4: R6H tstss
L4: R6H bstss
L4: R6H dsbss
L4: R6H tsbss
L4: R6H bsbss
L3: R6H rrsss
L3: R6H ssrrr
L3: R6H srsrr
L3: R6H rssrr
L3: R6H srrsr
L3: R6H rsrsr
L4: R6H dsdsd
L4: R6H tsdsd
L4: R6H bsdsd
L4: R6H dstsd
L4: R6H tstsd
L4: R6H bstsd
L4: R6H dsbsd
L4: R6H tsbsd
L4: R6H bsbsd
L4: R6H dsdst
L4: R6H tsdst
L4: R6H bsdst
L4: R6H dstst
L4: R6H tstst
L4: R6H bstst
L4: R6H dsbst
L4: R6H tsbst
L4: R6H bsbst
L4: R6H dsdsb
L4: R6H tsdsb
L4: R6H bsdsb
L4: R6H dstsb
L4: R6H tstsb
L4: R6H bstsb
L4: R6H dsbsb
L4: R6H tsbsb
L4: R6H bsbsb
L3: R6H rrssr
L3: R6H srrrs
L3: R6H rsrrs
L3: R6H rrsrs
L3: R6H rrrss
L3: R6H srrrr
L3: R6H rsrrr
L3: R6H rrsrr
L3: R6H rrrsr
L3: R6H rrrrs
L3: R6H rrrrr
L2: R7Hall 2 0.179079 -0.032439 -0.032230
L1: Y rad out
L2: Cj out 83 -0.005563 -0.003776 0.048410
L3: Csj out 77 0.007296 -0.008351 0.050820
L4: Csj out H2 44 0.020405 -0.023450 0.067691
L4: Csj out RH 14 0.181348 0.008883 0.038264
L5: Csj out CH 9 0.246944 -0.003708 0.040665
L6: Csj out CsH 9 0.246944 -0.003708 0.040665
L6: Csj out CdH
L6: Csj out CtH
L6: Csj out CbH
L5: Csj out OH 5 0.061090 0.031967 0.033862
L4: Csj out RR 19 -0.138239 0.020487 0.014328
L5: Csj out CC 15 -0.166212 0.021962 0.013029
L6: Csj out CsCs 12 -0.026739 0.017774 0.029485
L6: Csj out CsCd 1 -0.543722 -0.002171 -0.010414
L6: Csj out CsCt
L6: Csj out CsCb
L6: Csj out CdCd
L6: Csj out CdCt
L6: Csj out CdCb
L6: Csj out CtCt
L6: Csj out CtCb
L6: Csj out CbCb
L5: Csj out CO 4 0.020270 0.012129 0.021685
L6: Csj out CsO 4 0.020270 0.012129 0.021685
L6: Csj out CdO
L6: Csj out CtO
L6: Csj out CbO
L5: Csj out OO
L3: Cdj out 6 -0.188803 0.061424 0.014066
L4: Cdj out C 1 -0.618172 0.007499 0.010916
L5: Cdj out Cd 1 -0.618172 0.007499 0.010916
L5: Cdj out Cdd
L4: Cdj out O 5 -0.045681 0.079398 0.015116
L3: Ctj out
L3: Cbj out
L2: Oj out 57 0.005262 0.003571 -0.045783
L1: XH out
L2: C H out 83 -0.007593 0.048254 -0.002939
L3: Cs H out 77 0.005050 0.050600 -0.007410
L4: Cs H out H2 44 0.016883 0.067164 -0.022003
L4: Cs H out RH 14 0.181230 0.037898 0.009553
L5: Cs H out CH 9 0.246795 0.040402 -0.003138
L6: Cs H out CsH 9 0.246795 0.040402 -0.003138
L6: Cs H out CdH
L6: Cs H out CtH
L6: Cs H out CbH
L5: Cs H out OH 5 0.061027 0.033306 0.032818
L4: Cs H out RR 19 -0.138915 0.014388 0.020814
L5: Cs H out CC 15 -0.167162 0.013250 0.022171
L6: Cs H out CsCs 12 -0.027330 0.029543 0.018090
L6: Cs H out CsCd 1 -0.545671 -0.009742 -0.002262
L6: Cs H out CsCt
L6: Cs H out CsCb
L6: Cs H out CdCd
L6: Cs H out CdCt
L6: Cs H out CdCb
L6: Cs H out CtCt
L6: Cs H out CtCb
L6: Cs H out CbCb
L5: Cs H out CO 4 0.021151 0.020838 0.013129
L6: Cs H out CsO 4 0.021151 0.020838 0.013129
L6: Cs H out CdO
L6: Cs H out CtO
L6: Cs H out CbO
L5: Cs H out OO
L3: Cd H out 1 -0.620121 0.011588 0.007408
L4: Cd H out C 1 -0.620121 0.011588 0.007408
L5: Cd H out Cd 1 -0.620121 0.011588 0.007408
L6: Cd H out CdH
L6: Cd H out CdC
L7: Cd H out CdCs
L7: Cd H out CdCd
L7: Cd H out CdCt
L7: Cd H out CdCb
L6: Cd H out CdO
L5: Cd H out Cdd
L6: Cd H out CddH
L6: Cd H out CddC
L7: Cd H out CddCs
L7: Cd H out CddCd
L7: Cd H out CddCt
L7: Cd H out CddCb
L6: Cd H out CddO
L4: Cd H out O
L5: Cd H out OdH
L5: Cd H out OdC
L6: Cd H out OdCs
L6: Cd H out OdCd
L6: Cd H out OdCt
L6: Cd H out OdCb
L5: Cd H out OdO
L3: Ct H out
L3: Cb H out
L2: O H out 57 0.007022 -0.044626 0.002718
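Each row in these group trees has the form of a level marker (L1, L2, ...), a group name, and, where training data exist, a transition state count followed by three distance values. The sketch below is one possible way to turn such rows back into a parent/child structure; it is a reader's utility written for this listing, not code from the thesis, and the names `parse_row` and `build_parents` are hypothetical. It assumes a populated row always ends in one integer followed by three decimal numbers, and that group names are unique within a tree.

```python
import re
from typing import Dict, List, Optional, Tuple

Distances = Tuple[float, float, float]
Row = Tuple[int, str, Optional[int], Optional[Distances]]

ROW_PATTERN = re.compile(r"^L(\d+):\s*(.+)$")


def parse_row(line: str) -> Row:
    """Split one table row into (level, group name, TS count, distances)."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a group-tree row: {line!r}")
    level = int(match.group(1))
    tokens = match.group(2).split()
    # A populated row ends with a count and three distance values.
    if len(tokens) >= 5:
        try:
            count = int(tokens[-4])
            d1, d2, d3 = (float(t) for t in tokens[-3:])
            return level, " ".join(tokens[:-4]), count, (d1, d2, d3)
        except ValueError:
            pass  # trailing tokens belong to the group name
    return level, " ".join(tokens), None, None


def build_parents(lines: List[str]) -> Dict[str, Optional[str]]:
    """Map each group name to its parent using the level markers."""
    stack: List[Tuple[int, str]] = []  # (level, name) of open ancestors
    parents: Dict[str, Optional[str]] = {}
    for line in lines:
        level, name, _count, _data = parse_row(line)
        # Close every branch at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        parents[name] = stack[-1][1] if stack else None
        stack.append((level, name))
    return parents
```

Applied to the first rows of Table C.2, this would place `R2H s` and `R2H r` under `R2Hall`, and `R2H d` under `R2H r`.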
C.1.3 β-scission
Table C.3: Molecular group tree with distance group data in Å for the β-scission reaction family.
Groups TS count dXH (Å) dHY (Å) dXY (Å)
L1: R R 158 1.307210 2.612790 2.004480
L2: Cn R 144 0.004967 0.011510 0.023505
L3: Cd R 134 0.011442 0.006411 0.022726
L4: Cds R 90 0.046130 0.077838 0.059501
L5: Cds Cd H2 34 0.045291 0.146565 0.087210
L6: Cds Cds/H2 30 0.046912 0.147575 0.095151
L6: Cds Cdd/H2 4 0.028197 0.135915 0.003470
L5: Cds Cd RH 37 0.046434 0.055242 0.054617
L6: Cds Cds/Cs/H 24 0.044214 0.038586 0.053166
L6: Cds Cds/Cd/H 1 0.043742 0.053180 -0.012341
L6: Cds Cds/Ct/H
L6: Cds Cds/Cb/H
L6: Cds Cdd/Cs/H 1 0.038510 0.209643 0.106540
L6: Cds Cdd/Cd/H
L6: Cds Cdd/Ct/H
L6: Cds Cdd/Cb/H
L5: Cds Cd RR 19 0.047561 -0.044953 -0.000354
L6: Cds Cds/Cs/Cs 9 0.044092 -0.069790 0.005843
L6: Cds Cds/Cs/Cd
L6: Cds Cds/Cs/Ct
L6: Cds Cds/Cs/Cb
L6: Cds Cds/Cd/Cd
L6: Cds Cds/Cd/Ct
L6: Cds Cds/Cd/Cb
L6: Cds Cds/Ct/Ct
L6: Cds Cds/Ct/Cb
L6: Cds Cds/Cb/Cb
L6: Cds Cdd/Cs/Cs 1 0.025502 -0.043910 -0.051901
L6: Cds Cdd/Cs/Cd
L6: Cds Cdd/Cs/Ct
L6: Cds Cdd/Cs/Cb
L6: Cds Cdd/Cd/Cd
L6: Cds Cdd/Cd/Ct
L6: Cds Cdd/Cd/Cb
L6: Cds Cdd/Ct/Ct
L6: Cds Cdd/Ct/Cb
L6: Cds Cdd/Cb/Cb
L5: Cds Od H2
L5: Cds Od RH
L6: Cds Od/Cs/H
L6: Cds Od/Cd/H
L6: Cds Od/Ct/H
L6: Cds Od/Cb/H
L5: Cds Od RR
L6: Cds Od/Cs/Cs
L6: Cds Od/Cs/Cd
L6: Cds Od/Cs/Ct
L6: Cds Od/Cs/Cb
L6: Cds Od/Cd/Cd
L6: Cds Od/Cd/Ct
L6: Cds Od/Cd/Cb
L6: Cds Od/Ct/Ct
L6: Cds Od/Ct/Cb
L6: Cds Od/Cb/Cb
L4: Cdd RR 10 0.006662 -0.087249 -0.017622
L5: Cdd CC 4 0.009779 0.023941 0.118288
L6: Cdd Cds/Cds 4 0.009779 0.023941 0.118288
L6: Cdd Cds/Cdd
L6: Cdd Cdd/Cdd
L5: Cdd CO 5 0.027151 -0.090485 -0.038746
L6: Cdd Cds/O 5 0.027151 -0.090485 -0.038746
L6: Cdd Cdd/O
L5: CO2 1 -0.085430 -0.435677 -0.374830
L3: Ct R 10 -0.086860 0.083830 0.034556
L4: Ct Ct/H 7 -0.086533 0.126234 0.065931
L4: Ct Ct/R 3 -0.088331 -0.106990 -0.106634
L5: Ct Ct/C 3 -0.088331 -0.106990 -0.106634
L6: Ct Ct/Cs 1 -0.090958 -0.119910 -0.099841
L6: Ct Ct/Cd 1 -0.088088 -0.112460 -0.102411
L6: Ct Ct/Ct 1 -0.085948 -0.088600 -0.117651
L6: Ct Ct/Cb
L5: Ct Ct/O
L2: Od R 14 -0.067256 -0.155852 -0.318274
L3: Od C 14 -0.067256 -0.155852 -0.318274
L4: Od Cds 12 -0.059610 -0.162289 -0.354429
L4: Od Cdd 2 -0.091044 -0.135826 -0.205789
L3: O2
L1: YJ
L2: Hj 44 -0.013882 -0.203432 -0.187527
L2: Cj 77 -0.009370 0.166583 0.143727
L3: Csj 54 -0.011070 0.196126 0.155730
L4: Csj methyl 29 -0.002435 0.216852 0.168683
L4: Csj RH2 18 -0.013060 0.191287 0.155259
L5: Csj CH2 11 -0.017731 0.197404 0.143925
L6: Csj Cs/H2 10 -0.023349 0.197492 0.141981
L6: Csj Cd/H2
L6: Csj Ct/H2
L6: Csj Cb/H2
L5: Csj OH2 7 -0.006466 0.182651 0.171261
L4: Csj RRH 7 -0.055722 0.089168 0.080161
L5: Csj CCH 4 -0.080798 0.001713 0.027160
L6: Csj Cs/Cs/H 3 -0.083531 0.019894 0.035911
L6: Csj Cs/Cd/H
L6: Csj Cs/Ct/H
L6: Csj Cs/Cb/H
L6: Csj Cd/Cd/H
L6: Csj Cd/Ct/H
L6: Csj Cd/Cb/H
L6: Csj Ct/Ct/H
L6: Csj Ct/Cb/H
L6: Csj Cb/Cb/H
L5: Csj COH 3 -0.030646 0.176623 0.133162
L6: Csj Cs/O/H 3 -0.030646 0.176623 0.133162
L6: Csj Cd/O/H
L6: Csj Ct/O/H
L6: Csj Cb/O/H
L5: Csj OOH
L4: Csj RRR
L5: Csj CCC
L6: Csj Cs/Cs/Cs
L6: Csj Cs/Cs/Cd
L6: Csj Cs/Cs/Ct
L6: Csj Cs/Cs/Cb
L6: Csj Cs/Cd/Cd
L6: Csj Cs/Cd/Ct
L6: Csj Cs/Cd/Cb
L6: Csj Cs/Ct/Ct
L6: Csj Cs/Ct/Cb
L6: Csj Cs/Cb/Cb
L6: Csj Cd/Cd/Cd
L6: Csj Cd/Cd/Ct
L6: Csj Cd/Cd/Cb
L6: Csj Cd/Ct/Ct
L6: Csj Cd/Ct/Cb
L6: Csj Cd/Cb/Cb
L6: Csj Ct/Ct/Ct
L6: Csj Ct/Ct/Cb
L6: Csj Ct/Cb/Cb
L6: Csj Cb/Cb/Cb
L5: Csj CCO
L6: Csj Cs/Cs/O
L6: Csj Cs/Cd/O
L6: Csj Cs/Ct/O
L6: Csj Cs/Cb/O
L6: Csj Cd/Cd/O
L6: Csj Cd/Ct/O
L6: Csj Cd/Cb/O
L6: Csj Ct/Ct/O
L6: Csj Ct/Cb/O
L6: Csj Cb/Cb/O
L5: Csj COO
L6: Csj Cs/O/O
L6: Csj Cd/O/O
L6: Csj Ct/O/O
L6: Csj Cb/O/O
L5: Csj OOO
L3: Cdj 9 -0.011836 0.214972 0.175425
L4: Cdj Cd 9 -0.011836 0.214972 0.175425
L5: Cdj CdH 4 -0.043276 0.220166 0.200290
L6: Cdj CdsH 4 -0.043276 0.220166 0.200290
L6: Cdj CddH
L5: Cdj CdC 5 0.006795 0.211895 0.160691
L6: Cdj Cds/Cs 2 -0.027209 0.232747 0.174633
L6: Cdj Cds/Cd 1 0.012251 0.293781 0.262610
L6: Cdj Cds/Ct
L6: Cdj Cds/Cb
L6: Cdj Cdd/Cs 2 0.029571 0.155313 0.099274
L6: Cdj Cdd/Cd
L6: Cdj Cdd/Ct
L6: Cdj Cdd/Cb
L5: Cdj CdO
L6: Cdj CdsO
L6: Cdj CddO
L4: Cdj Od
L5: Cdj OdH
L5: Cdj OdC
L6: Cdj OdCs
L6: Cdj OdCd
L6: Cdj OdCt
L6: Cdj OdCb
L5: Cdj OdO
L3: Ctj
L4: Ctj Ct
L3: Cbj
L2: Oj 37 0.029544 -0.079245 -0.056477
L3: OH 11 0.013221 -0.105878 0.056824
L3: OjC
L4: OjCs
L4: OjCd
L4: OjCt
L4: OjCb
L3: OjO 26 0.035945 -0.068801 -0.100909