Use of crystallographic data in searching for isosteric replacements: Composite crystal-field...



Page 1: Use of crystallographic data in searching for isosteric replacements: Composite crystal-field environments of nitro and carbonyl groups

Pestic. Sci. 1990, 29, 197-213

Use of Crystallographic Data in Searching for Isosteric Replacements: Composite Crystal-Field Environments of Nitro and Carbonyl Groups*

Robin Taylor, Anne Mullaley & Graham W. Mullier

ICI Agrochemicals, Jealott’s Hill Research Station, Bracknell, Berkshire RG12 6EY, UK

(Revised manuscript received 6 September 1989; accepted 22 September 1989)

ABSTRACT

Over 70 000 organo-carbon crystal structures are available in the public domain. A common functional group may occur in several hundred of these structures. Examination of the immediate intermolecular environment of a functional group in each of the structures in which it occurs enables a composite crystal-field environment to be determined. Although influenced by random packing forces, the composite environment may show strong systematic features which reflect the preferred interactions of the functional group. This is shown by test calculations on structures containing nitro and carbonyl groups, using in-house software. Strong systematic features exist in the composite environments of both groups, showing their tendency to hydrogen bond in approximately the directions of the oxygen lone pairs, and to participate in strong electrostatic interactions in the direction normal to the plane of the group. Such results are of value in aiding the search for bioisosteres.

1 INTRODUCTION

A common situation in pesticide research is that a compound has high intrinsic biological activity but is not commercially viable for some other reason, e.g. because of undesirable physical properties, toxicity problems, patent considerations, etc. It is then necessary to modify the compound so that the unwanted feature is removed but intrinsic activity is retained. In many cases, the obvious tactic is to replace a key functional group with a ‘mimic’, i.e. another group that will interact with the protein binding site in the same way as the original group. It is therefore of interest to develop methods for investigating and comparing the preferred intermolecular interactions of functional groups.

* Paper presented at the Symposium ‘Isosteric Replacements in Drug and Pesticide Chemistry’ organised by the Pesticides Group of the Society of Chemical Industry and held in London on 20 June, 1989.

Pestic. Sci. 0031-613X/90/$03.50 © 1990 Society of Chemical Industry. Printed in Great Britain

A variety of computational procedures have been used for this purpose. Primary amongst these is the calculation of electrostatic potentials, which can be displayed on graphics terminals as isopotential contour plots. Visual comparison of these plots aids identification of molecules or functional groups with similar electrostatic ‘signatures’. However, the electrostatic potential is only an indirect indicator of the types of non-bonded interactions in which a functional group is likely to participate. Thus, although the electrostatic surfaces of enzymes and their inhibitors are often complementary, this is by no means always the case.1

A more direct insight into the preferred intermolecular interactions of functional groups and molecules can be obtained by using computer programs such as ‘GRID’.2 Here, the interaction energy between a molecule and a small chemical ‘probe’ (e.g. -CH3, >C=O, -OH) is calculated at various points in space by means of an empirical molecular-mechanics type energy function. The results can then be contoured and displayed in exactly the same way as for electrostatic potentials. It has been found (Taylor, R., unpublished) that valuable insights can thus be obtained into the preferred interactions of the molecule and its component groups. However, the reliability of the results is necessarily limited by the accuracy of the empirical parameters used in the energy calculations. This is likely to create problems, e.g. for interactions involving metal ions. It is therefore desirable to complement the theoretical results with reliable experimental data.

The most detailed experimental information about intermolecular interactions is obtainable from X-ray and neutron diffraction. Over 70 000 organic and organometallic crystal structures have been published and are available in database form in the Cambridge Structural Database (CSD).3 A common functional group may occur in several hundred of the structures in CSD. This paper describes how computer and graphics manipulation of these data enables a composite, or ‘average’, crystal-field environment of the functional group to be built up.4 Using two functional groups as examples (nitro and carbonyl), it is demonstrated that the results show strong systematic features despite the random perturbations due to crystal-packing effects. Analysis of these features shows that there are some similarities in the preferred intermolecular environments of nitro and carbonyl groups.

2 METHODS

2.1 Overall strategy

The method involves the following steps:

(a) CSD is searched for crystal structures containing the functional group of interest.

(b) Each crystal structure is considered in turn. All distances between atoms of the functional group and atoms of neighbouring molecules in the crystal lattice are calculated. Hence, the ‘contact atoms’ of the functional group are identified (i.e. those atoms involved in short, intermolecular contacts with the functional group). The coordinates of the atoms (functional group plus contact atoms) are subjected to a rigid-body rotation-translation operation in order to bring the functional group into a standard orientation.

(c) A 3-D colour graphics terminal is used to display simultaneously the functional group and contact atoms from all crystal structures in the data set. The functional groups from the various structures will be exactly superimposed (apart from small discrepancies due to variations in the intramolecular geometry of the group) because of the standard rotation-translation operation performed at step (b). Thus, the contact atoms from different structures will be correctly oriented with respect to one another, and the graphics display will represent a composite picture of the crystal-field environments of the functional group as observed in the crystal structures of the data set.

(d) The composite picture is manipulated, e.g. by colouring contact atoms according to their atomic number, or by displaying only contact atoms of a particular type. The investigator is therefore able to study the number and distribution of various types of contact atoms around the functional group.

The details of the method are given below.

2.2 Search for relevant crystal structures and retrieval of crystallographic data

The 1988 version of CSD was used throughout this study. Crystal structures containing the functional group of interest were found by connectivity-searching, using the QUEST88 program of CSD version 3.1.5 Only organic structures were used (CSD chemical classes 1-65 or 70,5 no B-metals, transition metals, lanthanides or actinides), and structures that were flagged in CSD as being in error or being disordered were rejected. If several structure determinations of the same compound were found, only one was used in the analysis. However, multiple occurrences of the functional group in a single structure were regarded as valid independent observations and were all accepted.

The ‘SAVE 3’ option of QUEST88 was used to generate a crystallographic data file in CSD ‘FDAT’ format.6 This file contained cell dimensions, symmetry operations, and atomic symbols and fractional coordinates for all atoms in the asymmetric unit of each structure. The file was then input to the CSD GSTAT88 program,6 where the FRAGment option was used to identify the atoms in each structure which comprised the functional group of interest. This information was written out to a separate file which was then appended to the original FDAT file.

The hydrogen atom coordinates in the modified FDAT file were ‘normalised’, i.e. each hydrogen atom was moved along its observed valence bond direction so that the distance between the H atom and the atom to which it was bonded was equal to a standard value (C-H = 1.08 Å, N-H = 1.01 Å, O-H = 0.99 Å). This procedure corrects for the abnormally short hydrogen valence bond distances typically determined by X-ray diffraction.7
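The normalisation step can be illustrated with a minimal sketch. It assumes Cartesian coordinates in ångströms (fractional coordinates from an FDAT file would first have to be converted), and the function name `normalise_hydrogen` is hypothetical; it is not part of any CSD program.

```python
import math

# Standard X-H bond lengths in angstroms, as quoted in the text.
STANDARD_XH = {"C": 1.08, "N": 1.01, "O": 0.99}


def normalise_hydrogen(parent_xyz, h_xyz, parent_element):
    """Return H moved along the observed X-H direction to the standard length.

    Hypothetical helper: coordinates are assumed Cartesian, in angstroms.
    """
    bond = [h - p for h, p in zip(h_xyz, parent_xyz)]
    length = math.sqrt(sum(c * c for c in bond))
    if length == 0.0:
        raise ValueError("hydrogen coincides with its parent atom")
    scale = STANDARD_XH[parent_element] / length
    # Keep the bond direction, change only the bond length.
    return tuple(p + c * scale for p, c in zip(parent_xyz, bond))


# Example: a C-H bond observed at 0.95 A along x is stretched to 1.08 A.
new_h = normalise_hydrogen((0.0, 0.0, 0.0), (0.95, 0.0, 0.0), "C")
```

Only the bond length changes; the observed bond direction is retained, as the text describes.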


2.3 Identification and re-orientation of contact atoms

The modified FDAT file was read into an in-house program, ENVIRON. For each structure in the data set, this program was used to find all short contacts formed by atoms of the functional group of interest to atoms of neighbouring molecules in the crystal lattice. The latter are referred to as ‘contact atoms’ throughout this paper. Contact distances were deemed to be short if they were less than the sum of the Bondi van der Waals radii of the atoms involved.8 When all such contacts had been found, the combined assembly of functional group and contact atoms was translated as a rigid body so that one particular (user-specified) atom of the functional group was at the origin of a Cartesian axial system. Two consecutive rigid-body rotations were then performed on the assembly so that the normal to the least-squares mean-plane of the functional group lay along the z direction and a user-specified bond vector in the functional group lay along the positive y direction and in the xy plane. The Cartesian coordinates of all atoms in the re-oriented assembly were stored on disk.
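The contact test and the re-orientation can be sketched as follows. This is not the ENVIRON code, which is not given in the paper; all function names are hypothetical. Two simplifications are assumed: the plane normal is taken from the cross product of two in-plane vectors rather than from a least-squares mean plane, and the two rotations are combined into a single change of basis. Only a few Bondi radii are listed.

```python
import math

# Bondi van der Waals radii (angstroms) for a few common elements.
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)


def is_short_contact(elem_a, xyz_a, elem_b, xyz_b):
    """True if the distance is below the sum of the two Bondi radii."""
    return math.dist(xyz_a, xyz_b) < BONDI_RADII[elem_a] + BONDI_RADII[elem_b]


def reorient(points, origin, bonded, third):
    """Express points in a frame with `origin` at (0, 0, 0), the
    origin->bonded vector along +y and the group plane as the xy plane.

    Simplification: the normal is the cross product of two in-plane
    vectors, so its sign depends on which `third` atom is supplied.
    """
    bond = sub(bonded, origin)
    z = unit(cross(bond, sub(third, origin)))
    y = unit(bond)          # perpendicular to z by construction
    x = cross(y, z)         # completes a right-handed frame
    shifted = [sub(p, origin) for p in points]
    return [(dot(p, x), dot(p, y), dot(p, z)) for p in shifted]
```

Because every functional group is mapped into the same frame, contact atoms from different structures can be overlaid directly, which is the basis of the composite picture.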

2.4 Determination of contact-atom types

Since the original FDAT file contained element symbols, the atomic number of each contact atom was known. However, it was desired to divide the more common types of contact atoms (carbon, hydrogen, nitrogen, oxygen) into a number of subcategories, e.g. amide oxygen, hydroxyl oxygen, ether oxygen, etc. The FRAGment option of the CSD GSTAT88 program was therefore used to search the original FDAT file for common substructures, e.g. amide linkages, aromatic rings, carboxylate groups, etc. Contact atoms found in such substructures could therefore be assigned to the appropriate subcategory. At the end of the procedure, some contact atoms remained unclassified (i.e. only their atomic number known) because they had not been found in any FRAGment search. The possibility that some contact atoms were misclassified cannot be excluded, since the procedure described above relies on the inference of chemical connectivity from crystallographic atomic coordinates: for example, it was theoretically possible for an un-ionised -COOH group to be confused with a carboxylate ion if the hydrogen atom had not been located in the diffraction study and was therefore absent from the FDAT file. However, a variety of distance tests was used to resolve ambiguities such as this and it is confidently believed that the number of misclassifications was insignificantly small.

2.5 Graphical display and numerical analysis

The atom types and Cartesian coordinates for all atom assemblies (functional group plus contact atoms) in the data set were read into an in-house program, PSDISPLAY (Farrington, J. A., unpublished work). This program was used to display 3-D colour scatterplots on an Evans & Sutherland PS390 graphics terminal, showing the distribution of contact atoms around the functional group. Contact atoms from different crystal structures were correctly oriented with respect to one another because of the standard orientation used for the functional group (see Section 2.3). Hence, the scatterplot showing all contact atoms was effectively a composite picture of the crystal-field environments of the functional group in all of the structures. Using PSDISPLAY, it was possible to display subsets of the contact atoms (e.g. only nitrogen or hydroxyl-oxygen atoms) and to produce hard copy plots showing 2-D orthogonal projections of the scatterplot. The figures in this paper were produced in this way.

For one of the test cases (nitro group), our visual examination of the scatterplots was supplemented by a series of numerical calculations detailed in Sections 4.7 and 4.8. These were performed with in-house software and the semi-empirical molecular orbital program AMPAC.9

3 TRIAL DATA SETS

Two functional groups were examined:

[Structure diagrams: I, the nitro group (C-NO2); II, the carbonyl group (A)(B)C=O]

Subject to the constraints listed in Section 2.2 above, a total of 1765 crystallographically independent nitro groups were found, excluding those which formed no short intermolecular contacts and were therefore irrelevant to the study. Since carbonyl groups are extremely common in CSD, the search for II was restricted to amino acid and peptide crystal structures (CSD class 48; the other restrictions listed in Section 2.2 were also applied). Of 2544 crystallographically independent carbonyl groups thus found, some 1950 (77%) formed short intermolecular contacts and were used in the analysis. These include several carboxylate anions (i.e. A = O- in II), for which both C-O bonds were regarded as independent observations of carbonyl groups and therefore used in the analysis.

4 RESULTS FOR THE NITRO GROUP

4.1 Overall distribution of contact atom types

In total, the 1765 nitro groups formed short contacts to 5558 atoms. Table 1 lists the distribution of atomic numbers for these 5558 atoms. As expected, C, H, O and N predominate. The discrepancy between the number of contacts to potassium and sodium, and the absence of contacts to other Group I and II metal ions, almost certainly indicates that inappropriate radii were used for ions other than potassium (values were taken from Ref. 8). However, the spatial (i.e. angular) distribution of potassium ions around the nitro group is likely to be similar to that of other alkali metal and alkaline earth ions, and will therefore suffice for this study.

TABLE 1 Distribution of Contact Atom Types for the Nitro Group

Element    C    H     Br  Cl  F   I  K    N    Na  O    S   Se  Te  Total
Contacts   907  3153  23  33  16  4  139  341  1   909  23  3   6   5558

4.2 Nitro . . . nitro contacts

The observed crystal-field environments represent a biased sample, in that many of the short contacts are due to the close approach of nitro groups from neighbouring molecules in the crystal structure (see Section 4.7). The composite crystal-field environment is therefore not an accurate picture of the types of contacts formed by nitro groups in, say, a biological environment. Nevertheless, the nitro . . . nitro contacts show some interesting features, from which conclusions of a general nature can be drawn.

The observed spatial distributions of the nitro . . . nitro contacts are illustrated in Figs 1 and 2, which show the distributions of oxygen atoms from neighbouring nitro groups (III), and nitrogen atoms from neighbouring nitro groups (IV), respectively.

[Structure diagrams III and IV]

These figures, and all others in this paper, show two orthogonal views (V), one looking down the normal to the plane of the group (view on right) and one looking 'edge-on' to the group (view on left). Since the -NO2 group is symmetrical, all contact atoms have been reflected into one quadrant.

[Structure diagram V]

Fig. 1. Distribution around reference nitro group of oxygen atoms from neighbouring nitro groups.

Fig. 2. Distribution around reference nitro group of nitrogen atoms from neighbouring nitro groups.
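The reflection into one quadrant and the two orthogonal views can be sketched as below. The sketch assumes the standard orientation of Section 2.3 with the nitro group in the xy plane and its C-N bond along y, so that the two mirror planes of the group are x = 0 and z = 0. The choice of x as the viewing direction for the edge-on view is also an assumption, and both function names are hypothetical.

```python
def fold_into_quadrant(point):
    """Apply both assumed mirror planes (x = 0 and z = 0) to one contact atom."""
    x, y, z = point
    return (abs(x), y, abs(z))


def two_views(points):
    """Return 2-D coordinates for the two orthogonal projections."""
    folded = [fold_into_quadrant(p) for p in points]
    down_normal = [(x, y) for x, y, _ in folded]  # looking along z
    edge_on = [(z, y) for _, y, z in folded]      # looking along x (assumed)
    return down_normal, edge_on


down, edge = two_views([(-1.0, 2.0, -3.0)])
```

Folding quadruples the apparent density of points in the retained quadrant, which makes weak clustering easier to see in a scatterplot.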

Figure 1 shows a pronounced clustering of oxygen atoms above the nitrogen atom of the reference nitro group, whilst Fig. 2 shows a rather smaller cluster of nitrogen atoms above the oxygen atom of the reference group. In Fig. 1, there is a relative paucity of contacts in or near to the nitro-group plane. From these observations, it is concluded that packing arrangements such as VI and VII are favourable. Presumably, such arrangements are stabilised by electrostatic attraction between the N and O atoms. Whilst of no direct relevance to biological situations, it may be inferred that electrostatic interactions between nitro groups and other polar, unsaturated groups (e.g. carbonyl) may be similarly favoured (Section 4.4).

[Structure diagrams VI and VII]

4.3 Hydrogen bonding

Figure 3 shows the distribution of hydrogen-bonding H atoms (VIII, IX). The plot shows a marked preference for hydrogen bonds to occur in, or close to, the plane of the nitro group and, to some extent, in the directions of the idealised oxygen sp2 lone pairs. There seems to be some preference for H-bonding in direction X rather than XI but the preference is not strong. The corresponding distribution of the proton-donor atoms (i.e. oxygen atoms of hydroxyl groups and water molecules, nitrogen atoms of amine groups, etc.) shows the same general features as Fig. 3.

RNO2....H-O (VIII)    RNO2....H-N (IX)

[Structure diagrams X and XI]

Fig. 3. Distribution around nitro group of hydrogen-bonding (O-H, N-H) hydrogen atoms.

Contacts to hydrogen atoms covalently bonded to carbon (XII) were also examined. There are too many of these atoms for the entire distribution to be plotted but a representative subset (contacts of type XIII) is shown in Fig. 4. The distribution shows a relative paucity of contacts above the plane of the nitro group. This may indicate an energetic advantage for the hydrogen atoms to lie close to the nitro-group plane, perhaps even implying C-H . . . O hydrogen bonding.10 Alternatively, it is possible that the region above the plane of the nitro group is usually occupied by polar groups capable of forming strong electrostatic interactions with the N and O atoms (see Sections 4.2 and 4.4), thereby making it sterically inaccessible to other functional groups.

RNO2.....H-C (XII)    RNO2.....H-C-N (XIII)

4.4 Interactions with carbonyl groups

Figure 5 illustrates the distribution of carbonyl oxygen atoms (XIV); the corresponding distribution of carbonyl carbon atoms (XV) is shown in Fig. 6. The carbonyl groups tend to lie above the plane of the nitro group, with the carbonyl oxygen positioned directly over the nitro nitrogen. More detailed analysis (Section 4.7) shows that this tendency is particularly pronounced for carbonyl groups from -COOH and -COO- systems. Interestingly, sp3 hybridised oxygen atoms (e.g. from ethers) also appear to cluster above the nitro nitrogen. This suggests that the favourable interaction energy of arrangements such as XVI is largely due to O . . . N coulombic attraction rather than π-stacking.

[Structure diagrams XIV, XV and XVI]

Fig. 4. Distribution around nitro group of hydrogen atoms from N-C-H linkages.

Fig. 5. Distribution around nitro group of carbonyl oxygen atoms.

Fig. 6. Distribution around nitro group of carbonyl carbon atoms.

4.5 Contacts to carbon atoms

Contacts to unsaturated (olefinic, aromatic) carbon atoms were compared with those to other carbons (most of which are sp3 hybridised). The distributions show small but interesting differences. In particular, there are few saturated carbon atoms positioned above the plane of the nitro group, whereas several of the unsaturated carbons lie in this region. Possibly, the region above the nitro group is sterically inaccessible to tetrahedral carbons. Alternatively, there may be some energetic stabilisation associated with stacking interactions between the nitro group and non-polar π-systems such as olefins (XVII).

[Structure diagram XVII]

4.6 Interactions with metal ions

Figure 7 shows the distribution of potassium ions around the nitro group. The distribution is similar to that of hydrogen-bonding H atoms (Fig. 3) in that the metal ions tend to lie close to the plane of the nitro group and in approximately the directions of the oxygen lone pairs.

4.7 Numerical analysis of contact-atom distributions

The distribution of contact atoms was analysed in more detail by dividing the space around the nitro group into regions. Three planes (a, b, c) were defined, each normal to the plane of the nitro group, plane a bisecting the C-N bond and planes b and c bisecting the N-O bonds. Each contact atom was then assigned to one of three regions: atoms lying 'behind' plane a (i.e. opposite side to nitro-group N and O atoms) were assigned to region (i); atoms lying within the prism enclosed by planes a, b, c were assigned to region (ii); remaining atoms were assigned to region (iii). The latter were further subdivided into regions (iii, in) and (iii, out), according to whether they fell within 30° of the plane of the nitro group (Fig. 8).

Fig. 7. Distribution around nitro group of potassium ions.

Fig. 8. Definition of spatial regions (i)-(iii) used in analysis of nitro-group composite crystal-field environment.
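One possible reading of this region assignment is sketched below. Several points are assumptions, not statements from the paper: each 'bisecting' plane is taken as the perpendicular bisector of its bond, the nitro group is taken to lie in the z = 0 plane of the standard orientation, and the 30° limit is measured as the elevation of the N-to-contact vector above that plane. The function name and the example geometry are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def midpoint(a, b):
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


def assign_region(contact, c, n, o1, o2, in_plane_limit_deg=30.0):
    """Classify a contact atom as 'i', 'ii', 'iii,in' or 'iii,out'.

    Assumes the nitro-group atoms c, n, o1, o2 all lie in the z = 0 plane.
    """
    def beyond(start, end):
        # True if contact is on the `end` side of the plane that
        # perpendicularly bisects the start-end bond.
        return dot(sub(contact, midpoint(start, end)), sub(end, start)) > 0.0

    if beyond(n, c):
        return "i"      # behind plane a, on the carbon side
    if not beyond(n, o1) and not beyond(n, o2):
        return "ii"     # inside the prism around the nitrogen atom
    v = sub(contact, n)
    elevation = math.degrees(math.asin(abs(v[2]) / math.sqrt(dot(v, v))))
    return "iii,in" if elevation <= in_plane_limit_deg else "iii,out"


# Approximate, illustrative nitro geometry (angstroms), C-N bond along +y.
C, N = (0.0, 1.5, 0.0), (0.0, 0.0, 0.0)
O1, O2 = (1.06, -0.61, 0.0), (-1.06, -0.61, 0.0)
```

Under this reading, region (ii) contains the points that are closer to the nitrogen atom than to any of its three bonded neighbours, which is consistent with the later interpretation of region (ii) contacts as contacts to the nitro nitrogen.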

Only 50 of the 5558 contact atoms fell into region (i); they are not considered further. The distribution of the remaining atoms between the various regions is given in Table 2, which also gives a more detailed breakdown for C, N and O atoms.

TABLE 2 Spatial Distribution of Contact Atoms

Contact atom type                        Spatial region                  Total
                                  (ii)    (iii, in)   (iii, out)

(a) By atomic number
C                                   19       440         441          900
(C-)H                                7      1626        1250         2883
(N-)H                                0       118          48          166
(O-)H                                0        58          23           81
N                                    6       196         137          339
O                                  248       294         352          894
Halogen                              4        45          27           76
S/Se/Te                              1        19          12           32
Na/K                                 1        74          62          137
Total                              286      2870        2352         5508

(b) Carbon contact atoms
Olefinic/aromatic                   15       157         168          340
Carbonyl                             2        30          54           86
Nitro (i.e. C-NO2)                   2        66          85          153
Unclassified/others                  0       187         134          321
Total                               19       440         441          900

(c) Nitrogen contact atoms
N-H                                  0        52          16           68
Nitro                                0       116         109          225
Unclassified/others                  6        28          12           46
Total                                6       196         137          339

(d) Oxygen contact atoms
O-H                                  4        40          10           54
Acid/ester/acid anion carbonyl      18         2           5           25
Other carbonyl                       8         9           8           25
Ester/ether sp3                      9         7           5           21
Nitro                              184       217         309          710
Unclassified/others                 25        19          15           59
Total                              248       294         352          894

A χ² test shows that the distribution of contact atoms between the regions varies significantly with atomic number (χ² = 1174 with 16 degrees of freedom, significant at < 0.001 level). In fact, it is hardly necessary to perform the statistical test since there are some remarkable variations in the spatial distributions. In particular, of 286 atoms falling in region (ii), no fewer than 248 (87%) are oxygen atoms; also, of 3130 hydrogen atoms, only 7 (0.2%) lie in region (ii). This suggests that electronegative atoms have a strong tendency to form short contacts with the nitro-group nitrogen atom, whereas electropositive atoms tend to avoid such contacts (see Section 4.8). The fact that this tendency is so clearly revealed establishes that random crystal packing forces are insufficient to obscure the systematic effects due to the electronic character of the nitro group.

Carbon atoms are evenly distributed between regions (iii, in) and (iii, out), but electropositive atoms such as hydrogen (including C-H) and K+ tend to lie close to the nitro group plane, i.e. in region (iii, in).

The distributions of various types of carbon, nitrogen and oxygen atoms (Table 2) also show several discernible trends. As expected from Section 4.5 above, sp2 hybridised carbon atoms (olefinic and aromatic, plus carbonyl) show a relatively greater tendency to fall into regions (ii) and (iii, out) (i.e. above the nitro-group plane) than do the unclassified carbons (mainly sp3). A χ² test established that this trend is significant at the < 0.001 level (test performed on the 2 x 2 table generated by eliminating the third row of Table 2(b) and combining the first and third columns and first and second rows; χ² = 15.1 with one degree of freedom). There is a marked difference between the distributions of H-bond donor oxygen atoms, which tend to lie in region (iii, in), and carbonyl and nitro oxygens, which show a relatively greater preference for region (ii). Of the carbonyl group oxygens, there is some evidence that region (ii) is particularly preferred by those from -COOH, -COOR and -COO- groups (χ² = 8.0 with one degree of freedom, significant at < 0.005 level).
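The χ² statistics quoted in this section can be checked with a short sketch of the Pearson test, using the counts of Table 2. The first 2 x 2 table follows the construction described in the text. The second (carbonyl oxygens, region (ii) against the other two regions) is an assumed construction, since the paper does not state how that table was formed. No continuity correction is applied, which is also an assumption. The function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for row, r_tot in zip(table, row_totals):
        for observed, c_tot in zip(row, col_totals):
            expected = r_tot * c_tot / grand
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Table 2(a): rows C, (C-)H, (N-)H, (O-)H, N, O, halogen, S/Se/Te, Na/K;
# columns (ii), (iii, in), (iii, out).
by_atomic_number = [
    [19, 440, 441], [7, 1626, 1250], [0, 118, 48], [0, 58, 23],
    [6, 196, 137], [248, 294, 352], [4, 45, 27], [1, 19, 12], [1, 74, 62],
]

# Table 2(b) without the nitro row: sp2 carbons vs unclassified carbons;
# columns are (ii) + (iii, out) vs (iii, in).
carbon_2x2 = [[15 + 2 + 168 + 54, 157 + 30], [0 + 134, 187]]

# Table 2(d), assumed construction: acid/ester/acid-anion carbonyl O vs
# other carbonyl O; columns are (ii) vs (iii, in) + (iii, out).
carbonyl_2x2 = [[18, 2 + 5], [8, 9 + 8]]

for name, table in [("atomic number", by_atomic_number),
                    ("carbon", carbon_2x2),
                    ("carbonyl oxygen", carbonyl_2x2)]:
    stat, dof = chi_square(table)
    print(f"{name}: chi2 = {stat:.1f}, dof = {dof}")
```

With these inputs the three statistics come out close to the quoted values of 1174, 15.1 and 8.0, with 16, 1 and 1 degrees of freedom respectively. Most of the first statistic comes from the oxygen row, in line with the observation that oxygen dominates region (ii).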

4.8 Distribution of contact-atom partial charges

Calculation of partial atomic charges for nitromethane by the MNDO method predicts charges of +0.45 for the nitrogen and -0.33 for each oxygen atom. It was desired to investigate whether these charges could be correlated with those of the contact atoms in the composite crystal-field environment. A random subset of about 7% (383) of the contact atoms was therefore chosen. The molecule containing each selected atom was retrieved from CSD and submitted to an MNDO calculation. The partial charges of the chosen contact atoms were thus obtained. Results are summarised in Table 3. As expected, there is an overwhelming preference for contact atoms in region (ii) to bear negative partial atomic charges (15 out of 19 atoms, two-tailed significance by binomial test = 0.019). The mean partial charge of atoms in this region is -0.14, which a two-tailed t-test shows to be significantly different from zero (P = 0.027). The mean value would be appreciably more negative but for an outlier with a large partial positive charge (+0.66). This is the sulphur atom in XVIII,11 which MNDO predicts to be highly polarised. However, this atom has a lone pair which possibly stabilises the short S . . . N contact despite the large net positive charge on the sulphur.

TABLE 3 Distribution of Contact-Atom Partial Atomic Charges

                                        Spatial region
                                 (ii)    (iii, in)   (iii, out)
Number of atoms                   19       188         176
Number with positive charge        4       137         112
Number with negative charge       15        51          64
Mean                            -0.14      0.04       -0.01
Standard deviation               0.25      0.18        0.20

[Structure diagram XVIII]

The mean partial charge for atoms in region (iii, in) is +0.04, which, although small, is significantly different from zero (two-tailed t-test, P = 0.003); the relative numbers of atoms bearing partial positive and negative charges (137 and 51, respectively) are also statistically significant (binomial test, P < 0.001). It was concluded that there is a definite preference for electropositive atoms to fall into this region. The situation is less straightforward for region (iii, out). The mean partial charge (-0.01) is not significantly different from zero (t-test). However, the ratio of electropositive to electronegative atoms (112:64, Table 3) is significant (two-tailed binomial, P < 0.001) and, on balance, it is concluded that electropositive atoms are favoured in this region.
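The binomial significance levels quoted for Table 3 can be reproduced with an exact two-tailed sign test. The sketch assumes a null hypothesis of equal numbers of positive and negative charges; the function name is hypothetical.

```python
from math import comb


def two_tailed_sign_test(n_positive, n_negative):
    """Exact two-tailed binomial P value under a 50:50 null hypothesis."""
    n = n_positive + n_negative
    k = max(n_positive, n_negative)
    # Probability of a split at least this uneven in one direction.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so doubling gives both tails.
    return min(1.0, 2.0 * upper_tail)


p_region_ii = two_tailed_sign_test(4, 15)        # region (ii)
p_region_iii_in = two_tailed_sign_test(137, 51)  # region (iii, in)
p_region_iii_out = two_tailed_sign_test(112, 64) # region (iii, out), Table 3
```

For region (ii) this gives P ≈ 0.019, matching the value in the text; both region (iii) results fall below 0.001.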

The results in Table 3 were recalculated, omitting proton donor atoms of hydrogen bonds (XIX, D = O, N). In this type of interaction, the hydrogen atom invariably carries a net positive atomic partial charge whilst the proton donor atom is negative. Typically, both the O...H and O...D distances are shorter than the sum of the appropriate van der Waals radii. Hence, both are regarded as ‘contact atoms’ of the nitro group (see Section 2.3), although the nitro-group oxygen is presumably shielded from the negative potential of D by the positive hydrogen atom. However, the recalculated results were not appreciably different from those shown in Table 3, and are therefore not given here.

[Structure XIX: R-NO2 ... H-D hydrogen bond]
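The way in which both H and D can qualify as contact atoms is illustrated by the following sketch. It assumes that the radii are those of Bondi (ref. 8) and that a contact is any interatomic distance shorter than the sum of the two radii; the exact criterion is defined in Section 2.3 and may differ, so the optional tolerance parameter and the distances used are hypothetical.

```python
# Bondi van der Waals radii in angstroms (assumed to be the values of ref. 8)
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

def is_contact(elem_a, elem_b, distance, tolerance=0.0):
    """Return True if the distance is shorter than the sum of the two radii.

    tolerance is a hypothetical extra margin in angstroms; 0.0 gives the
    plain sum-of-radii test.
    """
    return distance < BONDI_RADII[elem_a] + BONDI_RADII[elem_b] + tolerance

# Hypothetical hydrogen-bond geometry: nitro O...H-D with D = O
o_to_h = 1.95   # O...H distance, angstroms (illustrative value)
o_to_d = 2.85   # O...D distance, angstroms (illustrative value)

print("H is a contact atom:", is_contact("O", "H", o_to_h))
print("D is a contact atom:", is_contact("O", "O", o_to_d))
```

In this illustrative geometry both distances fall below the respective sums of radii (2.72 and 3.04 angstroms), so both atoms would be counted.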

It is concluded that the MNDO partial atomic charges of the contact atoms tend to complement those of the nitro-group atoms.

5 RESULTS FOR THE CARBONYL GROUP

Results for the carbonyl group will not be described in detail, since the purpose is merely to show that comparison of the composite crystal-field environments of different functional groups can highlight similarities in their intermolecular interactions. Discussion is therefore focused on those details of the carbonyl environments that are similar to features already noted above for nitro groups.

Figure 9 shows the distribution around the carbonyl oxygen of hydroxyl and water hydrogen atoms (in view of the symmetry of the carbonyl group, all contact atoms have been reflected into one quadrant in this and subsequent figures). As for nitro groups, the H-atom distribution reveals a clear tendency for hydrogen bonds to occur in or near to the plane of the carbonyl group. An overall preference for the idealised oxygen sp2 lone-pair direction within this plane is also evident, in agreement with earlier studies.12,13 Distributions of amine and amide hydrogen atoms, and of the proton donor atoms in hydrogen bonds, are similar to that shown in Fig. 9.

Fig. 9. Distribution around carbonyl group of hydrogen atoms from hydroxyl groups and water molecules.

Fig. 10. Distribution around carbonyl group of lithium, sodium, magnesium and potassium ions.

The oxygen atoms of nitro and carbonyl groups also show similarities in their interactions with metal ions. Figure 10 plots the distribution of K+, Li+, Mg2+ and Na+ ions around the carbonyl oxygen. As with nitro-group oxygen atoms (Fig. 7), there is a preference for contacts to occur in or near to the idealised sp2 lone-pair direction.
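The reflection of contact atoms into one quadrant, used for the carbonyl figures, can be sketched as follows. The local axis system is an assumption made for illustration (origin at the carbonyl oxygen, y along C=O, z normal to the plane of the group, x in the plane); the in-house software may use a different convention, and all function names and coordinates are hypothetical.

```python
from math import sqrt

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def local_coords(c, o, r, contact):
    """Coordinates of a contact atom in the assumed carbonyl frame.

    c, o: carbonyl carbon and oxygen; r: one substituent atom bonded to C.
    """
    y = unit(sub(o, c))             # along the C=O bond
    z = unit(cross(y, sub(r, c)))   # normal to the plane of the group
    x = cross(y, z)                 # in-plane, perpendicular to C=O
    d = sub(contact, o)             # contact position relative to oxygen
    return (dot(d, x), dot(d, y), dot(d, z))

def reflect_into_quadrant(coords):
    """Apply both mirror planes of the group: fold x and z onto positive values."""
    x, y, z = coords
    return (abs(x), y, abs(z))

# Hypothetical carbonyl group and two symmetry-related contact atoms
c, o, r = (0.0, 0.0, 0.0), (0.0, 1.2, 0.0), (1.3, -0.7, 0.0)
for contact in [(-1.5, 2.5, -0.8), (1.5, 2.5, 0.8)]:
    print(reflect_into_quadrant(local_coords(c, o, r, contact)))
```

Both contact atoms in the example map onto the same point, which is what allows the composite distribution to be plotted in a single quadrant.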

Many of the short contacts formed by carbonyl groups are to neighbouring carbonyl groups in the crystal structure. The observed spatial distributions of the carbonyl...carbonyl contacts are plotted in Figs 11 and 12, which show the distribution of oxygen atoms from neighbouring carbonyl groups (XX) and of carbon atoms from neighbouring carbonyl groups (XXI), respectively. Again, the distributions show clear similarities with the corresponding nitro group distributions (Figs 5 and 6); it is inferred that packing motifs such as XXII are favourable for carbonyl groups, as are arrangements such as XVI for nitro groups.

Fig. 11. Distribution around reference carbonyl group of oxygen atoms from neighbouring carbonyl groups.

Fig. 12. Distribution around reference carbonyl group of carbon atoms from neighbouring carbonyl groups.

[Structures XX and XXI]

[Structure XXII]


6 SUMMARY AND CONCLUSIONS

The technique presented here offers an alternative to theoretically based measures of functional-group similarity such as the molecular electrostatic potential. It has the advantage of being based entirely on experimental data, and its obvious drawback (perturbations due to random crystal packing effects) proves, in practice, to be of little importance. Thus, results described above show that the preferred intermolecular interactions of nitro and carbonyl groups are clearly revealed by systematic features in their composite crystal-field environments. Comparison of the composite environments suggests that -NO2 and >C=O groups are similar in the types of intermolecular contacts in which they participate, e.g. strong electrostatic interactions in directions approximately normal to the plane of the group.

It is concluded that analysis of crystal-field environments is likely to be of value in aiding the search for acceptable functional-group replacements in drug and pesticide molecules. Apart from visual and numerical analyses of the type described above, it is possible to envisage using composite crystal-field environments as the basis for calculating functional group 'similarity coefficients'.14 Furthermore, independent work suggests that investigation of crystal-field environments is also of value in aiding protein structure prediction.15-17
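No formula for such a similarity coefficient is given here. Purely as an illustration of what might be envisaged, the sketch below compares two composite environments that have been binned onto the same spatial grid, using a cosine-type (Carbó-style) index and the Hodgkin variant. The binning, the function names and the contact counts are all hypothetical and are not results from this work.

```python
from math import sqrt

def carbo_index(a, b):
    """Cosine-type similarity: sum(a*b) / sqrt(sum(a*a) * sum(b*b))."""
    overlap = sum(x * y for x, y in zip(a, b))
    return overlap / sqrt(sum(x * x for x in a) * sum(y * y for y in b))

def hodgkin_index(a, b):
    """Hodgkin-type similarity: 2*sum(a*b) / (sum(a*a) + sum(b*b)).

    Unlike the cosine-type index, this is sensitive to the overall
    magnitude of the two distributions as well as to their shape.
    """
    overlap = sum(x * y for x, y in zip(a, b))
    return 2 * overlap / (sum(x * x for x in a) + sum(y * y for y in b))

# Hypothetical contact counts in six matching spatial bins around each group
nitro_env = [12, 30, 7, 2, 9, 15]
carbonyl_env = [10, 34, 5, 3, 11, 12]

print(f"Carbo-type index:   {carbo_index(nitro_env, carbonyl_env):.3f}")
print(f"Hodgkin-type index: {hodgkin_index(nitro_env, carbonyl_env):.3f}")
```

Either index equals 1 for identical distributions, so values close to 1 would indicate that two functional groups have similar composite environments under the chosen binning.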

REFERENCES

1. Nakamura, H., Komatsu, K., Nakagawa, S. & Umeyama, H., J. Mol. Graphics, 3 (1985) 2-11.
2. Goodford, P. J., J. Med. Chem., 28 (1985) 849-57.
3. Allen, F. H., Kennard, O. & Taylor, R., Acc. Chem. Res., 16 (1983) 146-53.
4. Rosenfield, R. E. Jr, Swanson, S. M., Meyer, E. F. Jr, Carrell, H. L. & Murray-Rust, P., J. Mol. Graphics, 2 (1984) 43-6.
5. Cambridge Crystallographic Data Centre, QUEST88 Users Manual (1988).
6. Cambridge Crystallographic Data Centre, GSTAT88 Users Manual (1988).
7. Taylor, R. & Kennard, O., Acta Cryst., B39 (1983) 133-8.
8. Bondi, A., J. Phys. Chem., 68 (1964) 441-51.
9. Quantum Chemistry Program Exchange No. 506, Bloomington, Indiana, USA.
10. Taylor, R. & Kennard, O., J. Am. Chem. Soc., 104 (1982) 5063-70.
11. Tsuchiya, S., Mitomo, S.-I., Seno, M. & Miyamae, H., J. Org. Chem., 49 (1984) 3556-9.
12. Taylor, R., Kennard, O. & Versichel, W., J. Am. Chem. Soc., 105 (1983) 5761-6.
13. Murray-Rust, P. & Glusker, J., J. Am. Chem. Soc., 106 (1984) 1018-25.
14. Richards, W. G. & Hodgkin, E. E., Chem. Brit., 24 (1988) 1141-4.
15. Thomas, K. A., Smith, G. M., Thomas, T. B. & Feldmann, R. J., Proc. Nat. Acad. Sci. USA, 79 (1982) 4843-7.
16. Singh, J., Thornton, J. M., Snarey, M. & Campbell, S. F., FEBS Lett., 224 (1987) 161-71.
17. Rowlands, R. S., Carson, M. & Bugg, C. E., ACS Symposium Ser. 2, 16 (1988) 43.