On-line version ISSN 0717-9707
J. Chil. Chem. Soc. vol.57 no.1 Concepción Mar. 2012
J. Chil. Chem. Soc., 57, No. 1 (2012), pp. 955-963
AN IMPROVED TOPOLOGICAL DESCRIPTOR EDm AND ITS APPLICATION
CHANGMING NIE1*, YAXIN WU1, RONGYAN WU2 and SONGNIAN WEN2
1School of Chemistry and Chemical Engineering, University of South China, Hengyang, Hunan Province 421001, P. R. China. * e-mail: firstname.lastname@example.org
2College of Electrical Engineering, University of South China, Hengyang, Hunan Province 421001, P. R. China
In this paper, building on the topological index EDm, which is derived from the ionicity index matrix, the improved distance matrix and the branching degree matrix, we propose a new topological descriptor EDm' by introducing the bond angle into the hidden hydrogen graph of molecules and using the geometric distance between two vertices instead of the sum of the bond lengths between them. EDm' describes the molecular structure more accurately and characterizes cis-trans isomers uniquely. Quantitative structure-property relationship (QSPR) models with correlation coefficients (R) in the range 0.984-1.000 were then developed with EDm' for the boiling point (b.p.), the standard enthalpy of formation (ΔfHmθ), the molar refraction (Rm) and the molar volume (Vm) of cis-trans isomers of alkenes. Moreover, the leave-one-out (LOO) method and the random sampling prediction (RSP) method demonstrated the good stability and predictive ability of the models, which indicates that EDm' has high potential for wide application in QSPR studies.
Key words: cis/trans isomers, ionicity index matrix, improved distance matrix, branching degree matrix
1. INTRODUCTION
Since the Wiener index W was introduced in 1947 [1], many chemists, such as Balaban A.T. [2-5], Randic M. [6-10], Gutman I. [11-14], Trinajstic N. [15-18] and Estrada E. [19-21], have devoted considerable effort to quantitative structure-property/activity relationship (QSPR/QSAR) studies and have achieved notable results. These studies have played significant roles in fields such as drug design, the prediction of physicochemical properties and biological activity, and the assessment of environmental pollutants [22-30]. Recently, our group proposed several new topological descriptors, such as the EDm index [31], the PE index [32] and the Nt index [33], which have been applied to QSPR studies of several series of compounds.
Stereochemistry is a crucial part of organic chemistry, and cis-trans isomerism, which belongs to the diastereoisomeric category, is of practical importance in stereochemistry [34]. Cis-trans isomers differ in their chemical and physical properties, such as biological activity and inhibiting ability. Many experimental results have demonstrated that stereostructural character has a great influence on chemical and biological processes. For instance, the cis-trans configuration of an alkene often determines whether a reaction proceeds smoothly or not, and the chiral features of drug molecules usually have a decisive influence on drug efficacy [35]. It is therefore necessary to explore QSPR/QSAR models for cis/trans isomers.
Traditional topological indices, although they describe molecular structural features effectively, cannot differentiate the stereostructural characteristics of molecules [35]. Increasing attention has therefore been given to developing new topological descriptors suited to cis/trans isomers [35-39]. Chenzhong Cao and Hua Yuan presented the modified vertex degree-distance index (MVDI) and the modified edge degree-distance index (MEDI), and used them together with the odd-even index (OEI) and the number of carbon atoms (N) to characterize the molecular structural information of the cis- and trans-isomers of alkenes [35]. Golbraikh, A. et al. introduced several series of novel ZE-isomerism descriptors using a ZE-isomerism correction, which is added to the vertex degrees of atoms connected by double bonds in Z and E configurations [39].
Based on our previous work [31-33, 40-45], we propose the index EDm' by revising the EDm index [31]. The new topological descriptor EDm' distinguishes well between the molecular structures of cis-trans isomers. In contrast to the reported methods, which all rely on a correction factor, EDm' breaks with this tradition.
2. THEORIES AND METHODS
In an alkene, the two doubly bonded carbons and the four atoms attached to them lie in one plane. The C=C bond is made up of a σ bond and a π bond. The π bond is formed by sideways overlap of p orbitals and is not symmetric about the bond axis. Since rotation about the double bond is restricted, two alkenes with the same substituents in different spatial arrangements are different compounds, known as cis-trans isomers.
In modern quantum mechanics theory, the doubly bonded carbons in ethylene are sp2-hybridized and have three equivalent orbitals, which lie in a plane at angles of 120° to one another. In this paper, we take the C-C-C angle as 120° for sp2 hybridization and approximate it as 110° for sp3 hybridization, and we treat all saturated carbon chains as stable zigzag staggered conformations. We introduce the bond angle into the hidden hydrogen graph of molecules, use the geometric distance instead of the sum of the bond lengths between two atoms, and put forward the topological index EDm' based on the branching degree matrix G, the improved distance matrix S' and the ionicity index matrix Q.
Substituting Eq. (2)-(4) into Eq. (1), we obtain
2.1 Branching degree matrix
In Eq. (5), the branching degree g [40] is calculated as
where Σk is the total number of single bonds from atom i to non-hydrogen atoms. In this count, a double bond is regarded as two single bonds and a triple bond as three single bonds. For primary, secondary, tertiary and quaternary carbons, the gi values are therefore 1, 0.7071, 0.5774 and 0.5, respectively.
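The four values quoted above are consistent with gi = (Σk)^(-1/2), that is, 1/√1, 1/√2, 1/√3 and 1/√4. The following minimal sketch assumes that form, which is inferred from the quoted numbers rather than taken from the equation itself. The function name `branching_degree` and the atom numbering of the 2-pentene skeleton are illustrative assumptions.

```python
import math

def branching_degree(bond_orders):
    """Assumed form g_i = (sum of bond orders to non-hydrogen neighbours)^(-1/2).

    bond_orders lists the order of each bond from atom i to a non-hydrogen
    atom: 1 for a single bond, 2 for a double bond, 3 for a triple bond.
    """
    total = sum(bond_orders)  # this is the quantity written as Σk in the text
    return 1.0 / math.sqrt(total)

# Carbon skeleton of 2-pentene, C1-C2=C3-C4-C5, hydrogens hidden.
# The numbering is an assumption made for this example only.
pentene_bonds = {1: [1], 2: [1, 2], 3: [2, 1], 4: [1, 1], 5: [1]}
g = {atom: round(branching_degree(b), 4) for atom, b in pentene_bonds.items()}
print(g)
```

Because the double bond counts twice, each sp2 carbon of 2-pentene gets the same value as a tertiary carbon under this assumed form.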
2.2 Improved distance matrix
For a molecular graph with n vertices, the distance matrix is an n × n symmetric matrix. We define the improved distance matrix as
In Eq. (7), (lm)ij is the ratio of the geometric distance between the two vertices i and j to the C-C bond length, and is defined as the relative distance. The actual bond lengths of C-C, C=C and C≡C are 154, 134 and 120 pm, respectively [46]. For example, Fig. 1 shows the hidden hydrogen graph of cis-2-pentene.
(lm)ij in Figure 1
The value of (l3)25 is calculated as follows (see Fig.1.b2):
The improved distance matrix S' of cis-2-pentene
For trans-2-pentene, the calculations are as follows.
(lm)ij in Figure 2
The values of S1' and S2' of trans-2-pentene are identical to those of cis-2-pentene, so we only list the S3' of trans-2-pentene.
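The geometric-distance idea of this subsection can be illustrated with a small planar model. This is a sketch under stated assumptions, not the authors' procedure: the carbon chain is placed in a plane using the bond lengths and angles quoted in the text (154 and 134 pm, 120° at sp2 carbons, 110° at sp3 carbons), and the chain is numbered C1-C2=C3-C4-C5, which need not match the numbering in Fig. 1. The names `planar_chain` and `relative_distance` are hypothetical.

```python
import math

CC_SINGLE = 154.0  # pm, C-C bond length quoted in the text
CC_DOUBLE = 134.0  # pm, C=C bond length quoted in the text

def planar_chain(lengths, angles_deg, turns):
    """Place the atoms of an unbranched chain in a plane.

    lengths    : bond lengths along the chain (n - 1 values)
    angles_deg : bond angles at the interior atoms (n - 2 values)
    turns      : +1 or -1 per interior atom, the side to which the chain bends;
                 equal signs at both ends of a bond give a cis (U-shaped)
                 arrangement, alternating signs give a zigzag (trans) one.
    """
    points = [(0.0, 0.0), (lengths[0], 0.0)]
    heading = 0.0
    for length, angle, turn in zip(lengths[1:], angles_deg, turns):
        heading += turn * math.radians(180.0 - angle)
        x, y = points[-1]
        points.append((x + length * math.cos(heading),
                       y + length * math.sin(heading)))
    return points

def relative_distance(points, i, j):
    """Straight-line distance between atoms i and j in units of the C-C bond length."""
    return math.dist(points[i], points[j]) / CC_SINGLE

lengths = [CC_SINGLE, CC_DOUBLE, CC_SINGLE, CC_SINGLE]  # C1-C2=C3-C4-C5
angles = [120.0, 120.0, 110.0]                          # at C2, C3, C4
cis_pts = planar_chain(lengths, angles, [+1, +1, -1])
trans_pts = planar_chain(lengths, angles, [+1, -1, +1])

# Indices are zero-based: index 0 is C1, index 3 is C4, and so on.
print("C1-C4 cis  :", relative_distance(cis_pts, 0, 3))
print("C1-C4 trans:", relative_distance(trans_pts, 0, 3))
print("C2-C5 cis  :", relative_distance(cis_pts, 1, 4))
print("C2-C5 trans:", relative_distance(trans_pts, 1, 4))
```

In this model the one-bond and two-bond distances are the same for both isomers, and only three-bond distances that span the double bond (here C1 to C4) differ, which agrees with the statement that S1' and S2' are identical for the two isomers while S3' is not.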
2.3 Ionicity index matrix
In Eq. (5), qi is defined as
Xi,INI is the ionicity index [43] of atom i in the molecule and is defined as
where XiA is the Pauling electronegativity of atom i, and XiE is the equilibrium electronegativity [42] of atom i in the molecule. The equilibrium electronegativity XiE is calculated as
In Eq. (10), Σl is the total number of atoms or branching groups directly attached to atom i, and ΣXiG is the sum of the electronegativities of the atoms or groups directly attached to atom i. The electronegativity of a group, XG [44], is given by
As shown in Figure 3, the group can be divided into k layers, and the atoms to the left of the dotted lines labeled 1, 2, 3, ..., k are known as the ground atoms. The right subscript l of an atom is the number of its ground atom. n1l, n2l, ..., nkl are the numbers of atoms or groups in layers 1, 2, 3, ..., k, counting the ground atom l and the other atoms or groups directly attached to the ground atoms. The corresponding electronegativity terms are the sums of the electronegativities of the ground atom l and of the other atoms or branching groups directly attached to the ground atoms in layers 1, 2, 3, ..., k.
For instance, Figure 4 is the structural graph of cis-2-pentene.
where XC and XH are the Pauling electronegativities of the carbon atom and the hydrogen atom.
XE of carbon atoms labeled 1, 2, 3, 4 and 5 in Fig. 4
XINI of carbon atoms labeled 1, 2, 3, 4 and 5 in Fig. 4
Q of cis-2-pentene
Some ionicity index matrices Q are given in Table 2.
2.4 Some EDm' values
We calculated the ED1', ED2' and ED3' values of 44 cis-trans isomers of alkenes by Eqs. (5)-(11) and list them in Table 3.
The descriptor EDm' possesses all the traits of EDm and improves on it: (1) EDm' reflects the size of the molecule: the more atoms, the larger the value of EDm'; (2) EDm' reveals the branching character of the molecule: the higher the degree of branching among isomers, the larger ΣEDm'; (3) EDm' shows better structural selectivity than EDm for cis-trans isomers, because EDm cannot distinguish between cis and trans isomers.
3. RESULTS AND DISCUSSION
3.1 Model Development
In this paper, we considered the interactions of the α-, β- and γ-carbon atoms and used ED1', ED2' and ED3' to build a multiple linear regression (MLR) model:
where a, b, c and d are constants and P represents the molecular property. The correlation coefficient (R), the Fisher ratio (F) and the standard deviation (S) are used to assess the quality of the models.
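As an illustration of this kind of model, the sketch below fits a three-descriptor MLR by ordinary least squares and reports R, F and S. It assumes the form P = a + b·ED1' + c·ED2' + d·ED3' with a as the intercept, which is one natural reading of the text. The data are synthetic numbers generated in the code, not values from Table 3, and the function names are hypothetical.

```python
import math
import random

def fit_mlr(X, y):
    """Least-squares coefficients [a, b, c, d] via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    # Augmented normal-equation matrix [X^T X | X^T y].
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [p - factor * q for p, q in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def regression_stats(X, y, coef):
    """Return (R, F, S) for a fitted model with p descriptors and n samples."""
    n, p = len(y), len(coef) - 1
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - predict(coef, x)) ** 2 for x, v in zip(X, y))
    r = math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))
    s = math.sqrt(ss_res / (n - p - 1))
    f = ((ss_tot - ss_res) / p) / (ss_res / (n - p - 1))
    return r, f, s

# Synthetic stand-ins for (ED1', ED2', ED3') and a property P; illustration only.
rng = random.Random(1)
X = [(rng.uniform(1, 5), rng.uniform(0.5, 3), rng.uniform(0.1, 2)) for _ in range(12)]
P = [10 + 20 * u + 5 * v - 3 * w + rng.gauss(0, 0.3) for u, v, w in X]

coef = fit_mlr(X, P)
r, f, s = regression_stats(X, P, coef)
print("coefficients:", coef)
print("R =", r, "F =", f, "S =", s)
```

Solving the normal equations directly is adequate for a handful of descriptors; for larger or strongly correlated descriptor sets a QR-based solver would be the more robust choice.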
3.2 Relationships between the indices EDm' and properties for alkenes
We regressed the boiling point (b.p.) [36,47], the standard enthalpy of formation (ΔfHmθ) [46], the molar refraction (Rm) [48] and the molar volume (Vm) [46] of 44 cis-trans isomers of alkenes against their EDm' values according to Eq. (12); the statistical results of the regressions are listed in Table 4.
All the values of R in Table 4 are larger than 0.984, and the adjusted R² values are above 96.12%. The average absolute relative errors between the calculated and experimental values for b.p., ΔfHmθ, Rm and Vm are 1.84%, 2.55%, 1.39% and 0.57%, respectively. The calculated values agree well with the experimental values (Figure 5), and the models capture most of the factors that govern the property changes of cis-trans isomers. The calculated and experimental values for b.p., ΔfHmθ, Rm and Vm are listed in Table 5 and Table 6.
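The two summary statistics quoted here can be computed as in the following sketch. It assumes the standard definitions (adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), and the mean of |calculated - experimental| / |experimental| expressed in percent); the text does not state which definitions were used. The sample values in the usage lines are made up.

```python
def adjusted_r2(r, n, p):
    """Adjusted R^2 for n samples and p descriptors (standard definition, assumed)."""
    return 1.0 - (1.0 - r * r) * (n - 1) / (n - p - 1)

def mean_abs_relative_error(calculated, experimental):
    """Average absolute relative error in percent."""
    errors = [abs(c - e) / abs(e) for c, e in zip(calculated, experimental)]
    return 100.0 * sum(errors) / len(errors)

# R = 0.984 with 44 samples and 3 descriptors, the sizes used in this study.
print(adjusted_r2(0.984, 44, 3))

# Made-up calculated and experimental values, for illustration only.
print(mean_abs_relative_error([101.0, 198.0], [100.0, 200.0]))
```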
3.3 Model validation
3.3.1 LOO method
In this method [40], one sample is removed from the N samples, a regression equation is fitted to the experimental values of the remaining N-1 samples, and the value of the removed sample is predicted with that equation. Repeating this for every sample gives N predicted values. Finally, the N predicted values are regressed against their experimental values, and the regression results, such as the correlation coefficient Rcv, the Fisher ratio Fcv and the standard error Scv, are used to test the stability and validity of the model. The results of the LOO method are shown in Table 7.
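The leave-one-out procedure described above can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares MLR with an intercept and using the Pearson correlation between predicted and experimental values as Rcv. The data are synthetic and the function names are hypothetical.

```python
import math
import random

def fit_mlr(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [p - factor * q for p, q in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def pearson_r(u, v):
    """Pearson correlation coefficient between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))

def loo_predictions(X, y):
    """Predict each sample from a model fitted to the other N - 1 samples."""
    preds = []
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(coef[0] + sum(c * v for c, v in zip(coef[1:], X[i])))
    return preds

# Synthetic descriptors and property values; illustration only.
rng = random.Random(0)
X = [(rng.uniform(1, 5), rng.uniform(0.5, 3), rng.uniform(0.1, 2)) for _ in range(15)]
y = [10 + 20 * u + 5 * v - 3 * w + rng.gauss(0, 0.3) for u, v, w in X]

preds = loo_predictions(X, y)
r_cv = pearson_r(preds, y)
print("Rcv =", r_cv)
```

Each sample is predicted by a model that never saw it, so an Rcv close to the fitted R suggests that no single compound dominates the regression.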
The values of Rcv obtained by the LOO method are close to the R values in Table 4 (see Figure 6). The multiple correlation coefficients of the LOO models for b.p., ΔfHmθ, Rm and Vm lie in the ranges 0.9949-0.9958, 0.9796-0.9893, 0.99997-0.99998 and 0.9993-0.9997, respectively. Most of the multiple correlation coefficients fluctuate within a narrow range (see Fig. 7), which demonstrates the high stability and good reliability of the index.
Figure 7. The distribution plot of multi-correlation coefficient R in LOO method versus frequency of R for four properties of alkenes.
3.3.2 RSP method
In this method [31], M samples are randomly taken from the N samples as the training set, and the remaining N-M samples form the test set. First, a model is built with the training set; then the N-M values of the test set are predicted with this model; finally, the N-M predicted values are regressed against their experimental values. The correlation coefficient (Rp), the Fisher ratio (Fp) and the standard error (Sp) are used to test the predictive ability of the model. The results are shown in Table 8.
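The random-sampling procedure of this subsection can be sketched as follows, assuming an ordinary least-squares MLR and the Pearson correlation between predicted and experimental test values as Rp. The split sizes, the random seed and the data are arbitrary choices made for the example, and the function names are hypothetical.

```python
import math
import random

def fit_mlr(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(rows, y))] for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [p - factor * q for p, q in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def pearson_r(u, v):
    """Pearson correlation coefficient between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))

def rsp_split(n, m, seed=0):
    """Randomly pick m of n sample indices for training; the rest form the test set."""
    train = sorted(random.Random(seed).sample(range(n), m))
    chosen = set(train)
    test = [i for i in range(n) if i not in chosen]
    return train, test

def rsp_predict(X, y, train, test):
    """Fit on the training samples only, then predict the test samples."""
    coef = fit_mlr([X[i] for i in train], [y[i] for i in train])
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], X[i])) for i in test]

# Synthetic descriptors and property values; illustration only.
rng = random.Random(2)
X = [(rng.uniform(1, 5), rng.uniform(0.5, 3), rng.uniform(0.1, 2)) for _ in range(20)]
y = [10 + 20 * u + 5 * v - 3 * w + rng.gauss(0, 0.3) for u, v, w in X]

train, test = rsp_split(len(y), 14, seed=0)
test_preds = rsp_predict(X, y, train, test)
r_p = pearson_r(test_preds, [y[i] for i in test])
print("test set:", test)
print("Rp =", r_p)
```

The model is fitted on the training set only, so Rp measures how well it predicts compounds it has not seen.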
All the values of R for b.p., ΔfHmθ, Rm and Vm in Table 8 are larger than 0.98. Moreover, the plots (Fig. 8) of the predicted values versus the experimental ones for the test set and the training set provide evidence of the good quality of the models. All the experimental, calculated and predicted values mentioned in this paper are listed in Table 5 and Table 6.
Figure 8. The experimental values vs. the predicted values based on the RSP method.
3.4 The contributions of the different EDm'
We studied the effects that the different EDm' have on the calculated properties of the compounds. The relative and fraction contributions of each index were estimated. The relative contribution (Ψr) and the fraction contribution (Ψf) of the indices are defined as follows [49]:
where Ψr and Ψf are, respectively, the relative contribution and the fraction contribution of the ith topological index, computed from its average value over the data set. The square of the correlation coefficient, R², is the coefficient of determination. The sum is over all the indices in the model. The results of the contribution analysis are summarized in Table 9.
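A minimal sketch of such a contribution analysis is given below. It assumes the form that is commonly used with reference [49]: Ψr(i) = ai·X̄i, where ai is the regression coefficient and X̄i the average value of the ith index, and Ψf(i) = 100%·R²·|Ψr(i)| / Σ|Ψr(i)|. These expressions are an assumption made for the example, since the equations themselves are not reproduced in this text. The numbers in the usage lines are made up.

```python
def contributions(coefficients, mean_values, r_squared):
    """Relative and fraction contributions under the assumed definitions.

    coefficients : regression coefficient of each index (intercept excluded)
    mean_values  : average value of each index over the data set
    r_squared    : coefficient of determination of the model
    """
    psi_r = [a * m for a, m in zip(coefficients, mean_values)]
    total = sum(abs(v) for v in psi_r)
    psi_f = [100.0 * r_squared * abs(v) / total for v in psi_r]
    return psi_r, psi_f

# Made-up coefficients and mean index values; illustration only.
psi_r, psi_f = contributions([2.0, -1.0], [3.0, 2.0], 0.9)
print(psi_r)
print(psi_f)
```

Under these definitions the fraction contributions add up to 100·R² percent, so whatever is missing from 100% is the variance the model leaves unexplained.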
From Table 9, one can observe that the contributions of the individual EDm' to the properties cover a wide range of values. ED1' plays the most important role in these models, whereas the contributions of ED2' and ED3' are less significant. From Fig. 9, it is clear that the Ψf values of ED1' for alkenes range between 70% and 90%, which indicates that the interaction of the α-carbons is the strongest. Meanwhile, the Ψf values of ED2' and ED3' are in the ranges 2%-30% and 1%-9%, respectively.
4. CONCLUSIONS
In this paper, we introduced the bond angle into the hidden hydrogen graph of molecules, used the straight-line distance instead of the sum of the bond lengths between two atoms, and proposed the new topological descriptor EDm'. High-quality QSPR models for cis-trans isomers were established on the basis of the index EDm'. With these models, the b.p., ΔfHmθ, Rm and Vm values were calculated and predicted, and the results agreed well with the experimental values. Moreover, the LOO method and the RSP method demonstrated that the models have statistical significance and good stability.
The project was supported by Science and Technology Projects Foundation of Hunan province (No. 06FJ4104).
(Received: June 30, 2010 - Accepted: January 26, 2011)
(1) Wiener, H. Structural Determination of Paraffin Boiling Points. J. Am. Chem. Soc. 1947, 69, 17-20.
(2) Balaban, A. T. Highly Discriminating Distance-Based Topological Index. Chem. Phys. Lett. 1982, 89, 399-404.
(3) Balaban, A. T.; Mills, D.; Basak, S. C. Correlation between Structure and Normal Boiling Points of Acyclic Carbonyl Compounds. J. Chem. Inf. Comput. Sci. 1999, 39, 758-764.
(4) Balaban, A. T.; Rücker, C. Using Protochirons for Three-Dimensional Coding of Certain Chemical Structures. J. Chem. Inf. Comput. Sci. 2001, 41, 1145-1149.
(5) Balaban, A. T. Mathematical Chemistry: (3, g)-Cages with Girth g, Topological Indices, and Other Graph-Theoretical Problems. Fund. Inform. 2005, 64, 1-16.
(6) Randic, M. On characterization of molecular branching. J. Am. Chem. Soc. 1975, 97, 6609-6615.
(7) Randic, M.; Zupan, J. On Interpretation of Well-Known Topological Indices. J. Chem. Inf. Comput. Sci. 2001, 41, 550-560.
(8) Randic, M.; Balaban, A. T.; Basak, S. C. On Structural Interpretation of Several Distance Related Topological Indices. J. Chem. Inf. Comput. Sci. 2001, 41, 593-601.
(9) Randic, M. Wiener-Hosoya Index - A Novel Graph Theoretical Molecular Descriptor. J. Chem. Inf. Model. 2004, 44, 373-377.
(10) Randic, M.; Zupan, J.; Vikic-Topic, D.; Plavsic, D. A Novel Unexpected Use of a Graphical Representation of DNA: Graphical Alignment of DNA Sequences. Chem. Phys. Lett. 2006, 431, 375-379.
(11) (a) Gutman, I.; Trinajstic, N. Graph Theory and Molecular Orbitals. Topics Curr. Chem. 1973, 42, 49-93.
(11) (b) Gutman, I.; Ruscic, B.; Trinajstic, N.; Wilcox, Jr. C. F. Graph Theory and Molecular Orbitals. XII. Acyclic Polyenes. J. Chem. Phys. 1975, 62, 3399-3405.
(11) (c) Gutman, I.; Milun, M.; Trinajstic, N. Graph Theory and Molecular Orbitals. XIX. Nonparametric Resonance Energies of Arbitrary Conjugated Systems. J. Amer. Chem. Soc. 1977, 99, 1692-1704.
(12) Gutman, I.; Bosanac, S. Topological studies on heteroconjugated molecules. The stability of alternant systems with one heteroatom. Chem. Phys. Lett. 1976, 43, 371-373.
(13) Gutman, I.; Rouvray, D. H. New theorem for the wiener molecular branching index of trees with perfect matchings. Comput. Chem. 1990, 14, 29-32.
(14) Gutman, I. Selected properties of the Schultz molecular topological index. J. Chem. Inf. Comput. Sci. 1994, 34, 1037-1039.
(15) Trinajstic, N.; Nikolic, S.; Lucic, B.; Amic, D.; Mihalic, Z. The Detour Matrix in Chemistry. J. Chem. Inf. Comput. Sci. 1997, 37, 631-638.
(16) Trinajstic, N. Chemical Nomenclatures and the Computer. Comput. Chem. 1994, 18, 435-436.
(17) Trinajstic, N.; Babic, D.; Nikolic, S.; Plavsic, D.; Amic, D.; Mihalic, Z. The Laplacian matrix in chemistry. J. Chem. Inf. Comput. Sci. 1994, 34, 368-376.
(18) Trinajstic, N. Chemical Graph Theory; CRC Press: Boca Raton, Florida, 1992.
(19) Estrada, E. Edge Adjacency Relationships and a Novel Topological Index Related to Molecular Volume. J. Chem. Inf. Comput. Sci. 1995, 35, 31-33.
(20) Estrada, E.; Vilar, S.; Uriarte, E.; Gutierrez, Y. In Silico Studies toward the Discovery of New Anti-HIV Nucleoside Compounds with the Use of TOPS-MODE and 2D/3D Connectivity Indices, 1. Pyrimidyl Derivatives. J. Chem. Inf. Comput. Sci. 2002, 42, 1194-1203.
(21) Estrada, E. Application of a novel graph-theoretic folding degree index to the study of steroid-DB3 antibody binding affinity. Comput. Biol. Chem. 2003, 27, 305-313.
(22) Kier, L. B.; Hall, L. H. Molecular Connectivity in Structure-Activity Analysis; RSP-Wiley: Chichester, UK, 1986.
(23) Devillers, J.; Balaban, A. T. Eds. Topological Indices and Related Descriptors in QSAR and Drug Design; Gordon & Breach: Amsterdam, The Netherlands, 2000.
(24) Xu, L.; Hu, C. Y. The Applications of Graph Theory in Chemistry; Science Press: Beijing, 2000. (in Chinese)
(25) Willighagen, E. L.; Denissen, H. M. G. W.; Wehrens, R.; Buydens, L. M. C. On the Use of 1H and 13C 1D NMR Spectra as QSPR Descriptors. J. Chem. Inf. Model. 2006, 46, 487-494.
(26) Saçan, M. T.; Erdem, S. S.; Özpinar, G. A.; Balcioglu, I. A. QSPR Study on the Bioconcentration Factors of Nonionic Organic Compounds in Fish by Characteristic Root Index and Semiempirical Molecular Descriptors. J. Chem. Inf. Model. 2004, 44, 985-992.
(27) Modarresi, H.; Dearden, J. C.; Modarress, H. QSPR Correlation of Melting Point for Drug Compounds Based on Different Sources of Molecular Descriptors. J. Chem. Inf. Model. 2006, 46, 930-936.
(28) Kahn, I.; Fara, D.; Karelson, M.; Maran, U.; Andersson, P. L. QSPR Treatment of the Soil Sorption Coefficients of Organic Pollutants. J. Chem. Inf. Model. 2005, 45, 94-105.
(29) Lather, V.; Madan, A. K. Topological models for the prediction of HIV-protease inhibitory activity of tetrahydropyrimidin-2-ones. J. Mol. Graph. Model. 2005, 23, 339-345.
(30) Narasimhan, B.; Kumari, M.; Jain, N.; Dhake, A.; Sundaravelan, C. Correlation of antibacterial activity of some N-[5-(2-furanyl)-2-methyl-4-oxo-4H-thieno[2,3-d]pyrimidin-3-yl]-carboxamide and 3-substituted-5-(2-furanyl)-2-methyl-3H-thieno[2,3-d] pyrimidin-4-ones with topological indices using Hansch analysis. Bioorg. Med. Chem. Lett. 2006, 16, 4951-4958.
(31) Nie, C. M.; Wu, Y. X.; Wu, R. Y.; Jiang, S. H.; Zhou, C. Y. Applications Of A New Topological Index EDm In Some Aliphatic Hydrocarbons. J. Theor. Comput. Chem. 2009, 8, 19-45.
(32) Zhou, C. Y.; Nie, C. M. Molecular descriptors of topology and a study on quantitative structure and property relationships. Bull. Chem. Soc. Jpn. 2007, 80, 1504-1510.
(33) Zhou, C. Y.; Nie, C. M.; Li, S.; Li, Z. H. A novel semi-empirical topological descriptor Nt and the application to study on QSPR/QSAR. J. Comput. Chem. 2007, 28, 2413-2423.
(34) Crombie, K. Cis-trans isomerism was formerly called geometrical isomerism. Q. Rev. Chem. Soc. 1952, 6, 101-140.
(35) Cao, C. Z.; Yuan, H. Study on the Cis/trans-isomerism of Alkenes by Topological Approach. Acta Phys-Chim. Sin. 2005, 21, 360-366. (in Chinese)
(36) Yuan, H.; Cao, C. Z. Topological indices based on vertex, edge, ring, and distance: Application to various physicochemical properties of diverse hydrocarbons. J. Chem. Inf. Comput. Sci. 2003, 43, 501-512.
(37) Randic, M. Graph Theoretical Descriptors of Two-Dimensional Chirality with Possible Extension to Three-Dimensional Chirality. J. Chem. Inf. Comput. Sci. 2001, 41, 639-649.
(38) Golbraikh, A.; Bonchev, D.; Tropsha, A. Novel chirality descriptors derived from molecular topology. J. Chem. Inf. Comput. Sci. 2001, 41, 147-158.
(39) Golbraikh, A.; Bonchev, D.; Tropsha, A. Novel ZE-isomerism descriptors derived from molecular topology and their application to QSAR analysis. J. Chem. Inf. Comput. Sci. 2002, 42, 769-787.
(40) Nie, C. M.; Dai, Y. M.; Wen, S. N.; Li, Z. H.; Zhou, C. Y.; Peng, G. W. Topological homologous regularity for additive property of alkanes. Acta Chim. Sinica. 2005, 63, 1449-1455. (in Chinese)
(41) Zhou, C. Y.; Nie, C. M.; Li, S.; Wen, S. N.; Peng, G. W.; Li, Z. H. Chemical behavior of topology and its application to QSPR of lanthanide and actinide. Chin. J. Inorg. Chem. 2007, 23, 25-33. (in Chinese)
(42) Nie, C. M.; Peng, G. W.; Xiao, F. Z.; Li, S.; He, X. M.; Li, Z. H.; Zhou, C. Y. Study on Topological Chemistry of Gas Chromatography Retention Index for Sulfides. Chin. J. Anal. Chem. 2006, 34, 1560-1564. (in Chinese)
(43) Nie, C. M.; Li, Z. H.; Wen, S. N. A Study for the Relationship between 13C NMR Chemical Shifts of Alkanes and the Atomic Ionicity index, Polarizability Effect Index. Chin. J. Org. Chem. 2002, 22, 46-51. (in Chinese)
(44) Nie, C. M.; Wen, S. N. Equilibrium Electronegativity and 13C NMR Chemical Shifts of Alkanes. Chin. J. Magn. Reson. 2001, 18, 45-50. (in Chinese)
(45) Nie, C. M. Group Electronegativity. J. Wuhan Univ. (Nat. Sci. Ed.) 2000, 46, 176-180. (in Chinese)
(46) Yao, Y. B.; Xie, T.; Gao, Y. M. Handbook of Physical Chemistry; Shanghai Science Technology Press: Shanghai, 1985. (in Chinese)
(47) David, R.L. CRC Handbook of chemistry and physics; 81st ed.; CRC Press: Florida, 2001.
(48) Schultz, H.P.; Schultz, E. B.; Schultz, T. P. Topological Organic Chemistry, 9. Graph Theory and Molecular Topological Indices of Stereoisomeric Organic Compounds. J. Chem. Inf. Comput. Sci. 1995, 35, 864-870.
(49) Needham, D. E.; Wei, I. C.; Seybold, P. G. Molecular modeling of the physical properties of alkanes. J. Am. Chem. Soc. 1988, 110, 4186-4194.