Paola Gramatica, Elena Bonfanti, Manuela Pavan and Federica Consolaro QSAR Research Unit, Department of Structural and Functional Biology, University of Insubria, Varese, Italy.
E-mail: [email protected] Web: http://fisio.varbio2.unimi.it/dbsf/home.html
INTRODUCTION
Phenols are widespread in the environment and widely used as precursors for many products. They are well known to affect human health at concentrations commonly encountered in the environment. For this reason their toxicity has been studied extensively on different end points, but data are not available for all phenols and all organisms. Reliable estimation methods are therefore required, and QSAR studies offer a simple and fast way to predict such data.
DATA SET
The compounds used in this work are the 109 phenols described by Schultz [2]. Toxicity data, available for only 103 chemicals, are expressed in mM/l and on a logarithmic scale as the log of the inverse of the IGC50 (50% inhibitory growth concentration) for Tetrahymena pyriformis. Three phenols (2-aminophenol, catechol and 4-nitrophenol), shown to be outliers by several models, were excluded from the data set, leaving 100 compounds for modelling.
[2] T.W. Schultz et al., Quantitative structure-activity relationships for the Tetrahymena pyriformis population growth end-point: a mechanism of action approach. Practical Applications of Quantitative Structure-Activity Relationships (QSAR) in Environmental Chemistry and Toxicology, 241-262 (1990).
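The response transformation described above can be sketched in a few lines. This is only an illustration: it assumes a base-10 logarithm and concentrations in mmol/l, the function name is hypothetical, and the example concentrations are invented rather than taken from the data set.

```python
import math


def log_inverse_igc50(igc50_mmol_per_l: float) -> float:
    """Return log10(1 / IGC50) for a concentration given in mmol/l.

    Assumption: the "log" of the text is a base-10 logarithm.
    """
    if igc50_mmol_per_l <= 0:
        raise ValueError("IGC50 must be a positive concentration")
    return math.log10(1.0 / igc50_mmol_per_l)


# Hypothetical concentrations, not values from the phenol data set.
for igc50 in (10.0, 0.5, 0.01):
    value = log_inverse_igc50(igc50)
    print(f"IGC50 = {igc50:g} mmol/l -> log(1/IGC50) = {value:.2f}")
```

On this scale a more toxic compound (lower IGC50) gets a higher value, which is why the response increases with toxicity.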
CHEMOMETRIC METHODS
Several chemometric analyses were applied to the compounds (represented by molecular descriptors) for the selection of an optimal training set for the QSAR models. The analyses performed are:
Principal Component Analysis (PCA): this analysis was used to condense a large number of variables into a few components. These components highlight the distribution of the compounds according to their structure; only the significant components were used in Cluster Analysis and Kohonen Maps, to avoid redundant information.
Hierarchical Cluster Analysis: hierarchical clustering was performed using the significant components of the molecular descriptors as variables. Different distance metrics (Euclidean and Manhattan) and different linkages (complete, average, etc.) were compared to find the best way to cluster these compounds.
Kohonen Maps: an additional way of mapping similar compounds, using so-called “self-organised topological feature maps”, which preserve the topology of a multidimensional representation within a new two-dimensional representation. The position of the compounds in the cells of the map shows how similar the structures of the studied phenols are. The centroids of each cell were selected as the most representative compounds, to create a training set made up of the most diverse phenols.
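The last step, picking one representative compound per cell, can be sketched as follows. This is a minimal illustration and not the procedure actually used for the poster: it assumes the component scores and the cell (or cluster) assignment of every compound are already available, and all names, scores and cell labels below are hypothetical. Both distance metrics mentioned above are included so that their effect on the selection can be compared.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (city-block) distance between two score vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a list of score vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def select_representatives(scores, cell_of, distance=euclidean):
    """For every cell, pick the compound closest to the cell centroid.

    scores:  {compound name: list of component scores}
    cell_of: {compound name: cell (or cluster) label}
    """
    members = defaultdict(list)
    for name, cell in cell_of.items():
        members[cell].append(name)

    chosen = {}
    for cell, names in members.items():
        centre = centroid([scores[name] for name in names])
        chosen[cell] = min(names, key=lambda name: distance(scores[name], centre))
    return chosen


# Hypothetical component scores and map cells for six compounds.
scores = {
    "A": [0.0, 0.0], "B": [1.0, 0.0], "C": [0.4, 0.1],
    "D": [5.0, 5.0], "E": [6.0, 5.0], "F": [5.4, 5.2],
}
cell_of = {
    "A": (0, 0), "B": (0, 0), "C": (0, 0),
    "D": (1, 2), "E": (1, 2), "F": (1, 2),
}

training_set = select_representatives(scores, cell_of)
print(training_set)
```

Compounds not chosen as representatives would then form the test set.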
CONCLUSION
The present investigation confirms that the toxic response of phenols in the Tetrahymena system can be modelled by a logKow-dependent QSAR. The models, developed from a wide set of molecular descriptors, identify hydrophobicity as the single most important variable: logKow alone gives a reasonably good prediction model, with Q2(LOO) = 72.14. Other structural parameters, such as electronic and connectivity descriptors, play a secondary but useful role, at least for this set of compounds.
Moreover, this study demonstrates that theoretical molecular descriptors are an effective and useful alternative to logKow.
The internal and external validation procedures have confirmed the high predictive capability of the models developed.
[Figure: Phenols toxicity distribution. Histogram of the number of observations (0 to 15) in each toxicity class: <= -1, (-1;-.5], (-.5;0], (0;.5], (.5;1], (1;1.5], (1.5;2], > 2. Series: Training, Test.]
$\log(\mathrm{IGC}_{50}^{-1}) = -1.66 + 0.34\,\mathrm{nX} - 0.74\,\mathrm{nHD} + 0.65\,\mathrm{CHI1v} + 2.07\,\mathrm{Gs}$
[Figure: calculated response versus experimental response for this model, both axes from -1.5 to 3.0. Legend: training set (56 compounds), test set. Labelled points: 31, 65, 67, 104.]
$\log(\mathrm{IGC}_{50}^{-1}) = -1.06 + 0.2\,\mathrm{nX} - 1.46\,\mathrm{ACMO2m} + 0.58\,\log K_{ow}$
[Figure: calculated response versus experimental response for this model, both axes from -1.5 to 2.5. Legend: training set (56 compounds), test set. Labelled points: 17, 58, 65, 66, 67.]
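As a sketch of how the two regression equations reported for the 56-compound training set would be applied, the snippet below evaluates them for a single compound. The coefficients are copied from the equations; the descriptor values, the dictionary layout and the function name are hypothetical and serve only to show the arithmetic.

```python
# Coefficients copied from the two equations for the 56-compound training set.
MODEL_THEORETICAL = {
    "intercept": -1.66, "nX": 0.34, "nHD": -0.74, "CHI1v": 0.65, "Gs": 2.07,
}
MODEL_LOGKOW = {
    "intercept": -1.06, "nX": 0.2, "ACMO2m": -1.46, "logKow": 0.58,
}


def predict(model, descriptors):
    """Return the calculated log(1/IGC50) of one compound for a linear model."""
    value = model["intercept"]
    for name, coefficient in model.items():
        if name != "intercept":
            value += coefficient * descriptors[name]
    return value


# Hypothetical descriptor values, not taken from the phenol data set.
compound = {
    "nX": 1, "nHD": 1, "CHI1v": 2.0, "Gs": 0.5, "ACMO2m": 0.1, "logKow": 2.5,
}

print(f"theoretical descriptors: {predict(MODEL_THEORETICAL, compound):.3f}")
print(f"logKow-based model:      {predict(MODEL_LOGKOW, compound):.3f}")
```

Plotting such calculated values against the experimental ones gives the kind of scatter plot shown for each model.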
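| OBJ N. | VAR N. | VARIABLES | Q2LOO | Q2LMO | Q2ext | R2 | SDEP | SDEC |
|---|---|---|---|---|---|---|---|---|
| 100 | 4 | nX nHD CHI1v ACMO1p | 82.8 | 82.6 | - | 84.7 | 0.31 | 0.29 |
| 100 | 4 | nOH Mor04u ACMB2v LogKow | 83.0 | 82.7 | - | 85.2 | 0.31 | 0.29 |
| 100 | 1 | LogKow | 70.8 | 71.1 | - | 72.3 | 0.41 | 0.39 |
| 56 | 4 | nX nHD CHI1v Gs | 84.4 | 84.1 | 83.8 | 88.0 | 0.32 | 0.28 |
| 56 | 3 | nX ACMO2m LogKow | 82.7 | 82.6 | 82.2 | 84.8 | 0.34 | 0.32 |
| 56 | 1 | LogKow | 72.1 | 71.6 | 78.3 | 74.4 | 0.43 | 0.41 |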
REGRESSION MODELS
The best subset of variables for modelling toxicity was selected by a Genetic Algorithm (GA-VSS) approach, in which the response is obtained by ordinary least squares (OLS) regression. All models were validated by the leave-one-out (LOO) and leave-more-out (LMO) procedures and by scrambling of the responses.
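The validation statistics can be illustrated with a one-descriptor OLS model. The sketch below is not the software used for the poster: it uses the common definitions R2 = 1 - RSS/TSS, Q2(LOO) = 1 - PRESS/TSS, SDEC = sqrt(RSS/n) and SDEP = sqrt(PRESS/n), returns fractions rather than percentages, leaves out the genetic algorithm and LMO, and runs on invented data. All function names are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def model_statistics(x, y):
    """Return R2, Q2(LOO), SDEC and SDEP for a one-descriptor OLS model."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((yi - mean_y) ** 2 for yi in y)

    # Fitting statistics: residuals of the model built on all compounds.
    a, b = fit_line(x, y)
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    # Leave-one-out: refit without compound i, then predict compound i.
    press = 0.0
    for i in range(n):
        a_i, b_i = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a_i + b_i * x[i])) ** 2

    return {
        "R2": 1 - rss / tss,
        "Q2_LOO": 1 - press / tss,
        "SDEC": math.sqrt(rss / n),
        "SDEP": math.sqrt(press / n),
    }


def scrambled_q2(x, y, rounds=50, seed=0):
    """Mean Q2(LOO) after randomly permuting the responses (y-scrambling)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        shuffled = y[:]
        rng.shuffle(shuffled)
        total += model_statistics(x, shuffled)["Q2_LOO"]
    return total / rounds


# Hypothetical logKow values and responses, not the phenol data.
log_kow = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
toxicity = [-0.9, -0.4, -0.2, 0.3, 0.4, 1.0, 1.1, 1.7]

stats = model_statistics(log_kow, toxicity)
scrambled = scrambled_q2(log_kow, toxicity)
print(stats)
print(f"mean Q2(LOO) with scrambled responses: {scrambled:.2f}")
```

A model that is not a chance correlation should lose its predictive power when the responses are scrambled, so the scrambled Q2 is expected to fall well below the Q2 of the real model.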
QSAR MODELLING AND PREDICTION OF PHENOL TOXICITY
MOLECULAR DESCRIPTORS
The molecular structures of the studied compounds were described by several molecular descriptors, calculated with software developed by R. Todeschini ([email protected]; http://www.disat.unimib.it/chm):
Sum of atomic properties descriptors (6)
Count descriptors (45)
Empirical descriptors (2)
Information indices (16)
Autocorrelation descriptors (252)
Directional WHIM descriptors (66) [1]
Non-directional WHIM descriptors (33) [1]
Topological descriptors (58)
Topographic descriptors (7)
Geometric descriptors (170)
Quantum-chemical descriptors (6)
[1] R. Todeschini and P. Gramatica, 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of the WHIM descriptors, Quant. Struct.-Act. Relat., 16 (1997) 113-119.
[Figure: Selection of training set. Legend: test set, training set.]
The numbered compounds in the plots are outliers.