Journal of the Chilean Chemical Society

On-line version ISSN 0717-9707

J. Chil. Chem. Soc. vol.51 no.1 Concepción Mar. 2006

http://dx.doi.org/10.4067/S0717-97072006000100001 


NEW IDEA FOR THE TOPOLOGICAL INDEX EVALUATION AND TREATISE MULTIPLE REGRESSION WITH THREE INDEPENDENT VARIABLES. SATURATED HYDROCARBONS USED LIKE A MODEL

E. CORNWELL

Departamento de Química Inorgánica y Analítica, Facultad de Ciencias Químicas y Farmacéuticas, Universidad de Chile, Casilla 233, Santiago, Chile

E-mail: ecornwel@ciq.uchile.cl


ABSTRACT

In the QSRR field, a new and easily computed parameter (Vc) was designed to evaluate four classical topological indices (W, ¹χ, Z, MTI) and two newer-generation ones (Xu, ¹χh). The regression between Vc and ¹χh gave a correlation coefficient (r) of 0.9992, a surprisingly high value compared with those commonly found in QSPR/QSAR studies. Through the Vc parameter, a way of treating multiple regression with three independent variables is presented. A set of 35 saturated hydrocarbons was used as a model.


 

INTRODUCTION

A major part of current research in mathematical chemistry, chemical graph theory and quantitative structure-activity-property relationship studies involves topological indices. Topological indices (TIs) are numerical graph invariants that quantitatively characterize molecular structure.

A graph G = (V, E) is an ordered pair of two sets V and E, the former being a nonempty set and the latter a set of unordered pairs of elements of V. When the elements of V represent the atoms of a molecule and the elements of E symbolize covalent bonds between pairs of atoms, G becomes a molecular graph. Such a graph depicts the topology of the chemical species. A graph is characterized using graph invariants; an invariant may be a polynomial, a sequence of numbers, or a single number, as in the present article. A single-number graph invariant that characterizes the molecular structure is called a topological index.
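As an illustration of these definitions, the following minimal sketch computes two of the classical indices discussed in this article, the Wiener index W and the Randić index ¹χ, from the hydrogen-suppressed graph of a small alkane. It uses the standard textbook definitions (W as the sum of shortest-path distances over all pairs of vertices, ¹χ as the sum over bonds of 1/sqrt(di·dj), where d is the vertex degree). The function names and the adjacency-list representation are assumptions made for this example and are not taken from the article.

```python
from collections import deque
from math import sqrt


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered pairs of vertices."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


def randic_index(adjacency):
    """Sum over edges (u, v) of 1 / sqrt(degree(u) * degree(v))."""
    total = 0.0
    for u, neighbours in adjacency.items():
        for v in neighbours:
            if u < v:  # visit each edge once
                total += 1.0 / sqrt(len(neighbours) * len(adjacency[v]))
    return total


# Carbon skeletons (hydrogens suppressed); vertex labels are arbitrary.
n_butane = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
isobutane = {1: [2], 2: [1, 3, 4], 3: [2], 4: [2]}

print(wiener_index(n_butane), round(randic_index(n_butane), 4))    # 10 1.9142
print(wiener_index(isobutane), round(randic_index(isobutane), 4))  # 9 1.7321
```

Both indices take a smaller value for the branched isomer, which is the kind of structural information a topological index is meant to encode.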

The application of graph theory to chemistry and to structure-property-activity relationships (QSPR/QSAR) has led to the emergence of several key graph-theoretical indices.

The first application of graph-theoretical invariants to structure-property relationship (QSPR) studies was proposed by Wiener1) [Wiener index (W)]. However, it was only after Randić2) proposed a topological index for the characterization of molecular branching [Randić index (¹χ)] that the dramatic expansion of studies in this area began. These two topological indices, together with the Hosoya3) (Z), Schultz4) (MTI), Ren5) (Xu) and C. Yang and C. Zhong6) (¹χh) indices, are evaluated here using a new idea based on three physicochemical properties of the molecules: the molar refraction (MR), the critical pressure (Pcr) and the critical volume (Vcr). In a previous report7) I showed that these physicochemical properties correlate well, through a linear relation (y = mx + n), with the logarithm of the GLC retention time relative to n-hexane (log trr).

The new parameter proposed by the author (Vc) is a general idea and was tested on 35 saturated hydrocarbons8) taken as a model. Each hydrocarbon i is characterized by an ordered set (xi, yi, zi), where xi is the molar refraction7) (MR), yi the critical pressure7) (Pcr) and zi the critical volume7) (Vcr). Vc is the Euclidean distance between the ordered set (xi, yi, zi) of a hydrocarbon i belonging to the set of 34 hydrocarbons and the ordered set (xo, yo, zo) of ethane. Choosing any other reference hydrocarbon, methane included, gives unsatisfactory results; the cause is perhaps the difference in molecular structure between each of the 34 hydrocarbons and ethane.

All correlations treated in this article (y = mx + n) are linear regressions between Vc and the topological indices cited, or between log trr and each of the other variables studied. Other physicochemical properties cited in my previous article7) were also tried in the three-element set used to define Vc, but the results were not as good as those obtained with the three proposed ones (MR, Pcr, Vcr).

Linear regression of Vc against each of the proposed topological indices allows the indices to be ordered by the magnitude of the correlation coefficient (r). The order is the same as that obtained when the logarithm of the GLC retention time relative to hexane (log trr)8) is correlated with the same set of indices. This indicates that the idea behind Vc (defining Vc from appropriate physicochemical properties) is of interest for evaluating topological indices: the second ordering is established automatically by the first.

All the regression functions of log trr vs. (xi, yi, zi), (xi, yi), (xi, zi) and (yi, zi) gave similar R2 values, which were also similar to the R2 of the regression of log trr vs. Vc.

It is necessary to point out that the interpretation of the parameters of a multivariable regression is valid only if these parameters are in orthogonal form9), and that the number of independent variables must be in keeping with the number of cases treated; otherwise the R2 value is overestimated10). These limitations do not arise in the regressions used in the present study (y = mx + n), in which Vc, log trr and the topological indices were used for the model of 35 saturated hydrocarbons.

The results obtained in this work indicate that the Vc parameter can be used to evaluate topological indices applied to other organic homologous series, and to reduce a multiple regression with up to three independent variables to a linear regression of the type y = mx + n.

PROCEDURE

The Vc parameter is the distance (D) between the ordered set (xi, yi, zi) of a particular saturated hydrocarbon and the ordered set (xo, yo, zo) corresponding to ethane, namely (11.48, 50.299, 147.5). The distance D is obtained from the Euclidean formula, equation (1):

$$V_c = D = \left[(x_i - x_o)^2 + (y_i - y_o)^2 + (z_i - z_o)^2\right]^{0.5} \qquad (1)$$
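Equation (1) can be sketched in a few lines of code. The ethane reference point (11.48, 50.299, 147.5) is the one quoted in the text; the function name `vc` and the second point used below are assumptions made for this example (the second point is an arbitrary placeholder, not a value from Table 1).

```python
from math import sqrt

# (MR, Pcr, Vcr) of ethane, as quoted in the text.
ETHANE = (11.48, 50.299, 147.5)


def vc(properties, reference=ETHANE):
    """Euclidean distance of equation (1) between (xi, yi, zi) and (xo, yo, zo)."""
    return sqrt(sum((p - r) ** 2 for p, r in zip(properties, reference)))


# The reference compound lies at distance zero from itself.
print(vc(ETHANE))  # 0.0

# Arbitrary placeholder point, NOT a hydrocarbon from Table 1.
placeholder = (20.0, 38.0, 255.0)
print(round(vc(placeholder), 3))
```

Passing a different `reference` tuple is how one would try another reference hydrocarbon, which the text reports to give less satisfactory results than ethane.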

Equation (1) was applied to the 35 saturated hydrocarbons using the values given in columns 5-7 of Table 1 (for ethane, the reference compound, equation (1) gives a distance of 0). The resulting Vc values are given in Table 2, column 9. Column 10 gives the Vc values calculated from equation (2), and column 11 the absolute percentage error of Vc with respect to the Vc calculated by equation (3).

The regression of the Vc parameter on ¹χh is given by equation (2):

$$V_c = -94.732\,(\pm 2.416) + 149.532\,(\pm 1.010)\,{}^{1}\chi h \qquad (2)$$

R2 = 99.84%
r = 0.9992
s.d. = 3.3994
F = 21896.8



Table 2. Logarithm of the retention time, the various topological indices and the parametric index (Vc)

The meanings of the column titles are defined below; * indicates the reference substance.

In equation (2), r is the correlation coefficient, s.d. is the standard error of the estimate and F is the Fisher ratio. The analysis of variance of this correlation is given in Table 3.


Table 3. Analysis of variance of the correlation of equation (2)

In this correlation, since the p-value in the ANOVA (Table 3) is less than 0.01, there is a statistically significant relationship between the two variables at the 99% confidence level. The R-squared statistic indicates that the model explains 99.84% of the variability in Vc, r indicates a strong relationship between the variables, and s.d. shows the standard deviation of the residuals to be 3.399.
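The statistics quoted for equation (2) can be reproduced for any pair of variables with a short least-squares routine. The sketch below is a generic illustration, not the Statgraphics procedure used in this work; the function names and the five demonstration points are invented for the example. It also uses the standard identity for a one-variable regression, F = R2·(N − 2)/(1 − R2), where N is the number of points, to relate the reported F-ratio to R2, assuming that all 35 hydrocarbons enter the fit.

```python
from math import sqrt


def linear_fit(x, y):
    """Least-squares fit y = m*x + n with r, R2, s.d. and the F-ratio."""
    count = len(x)
    mean_x = sum(x) / count
    mean_y = sum(y) / count
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    n = mean_y - m * mean_x
    ss_res = sum((yi - (m * xi + n)) ** 2 for xi, yi in zip(x, y))
    dof = count - 2  # residual degrees of freedom
    return {
        "m": m,
        "n": n,
        "r": sxy / sqrt(sxx * syy),
        "R2": 1.0 - ss_res / syy,
        "sd": sqrt(ss_res / dof),           # standard error of the estimate
        "F": (syy - ss_res) / (ss_res / dof),
    }


def r2_from_f(f_ratio, count):
    """Invert F = R2 * (count - 2) / (1 - R2) for a one-variable regression."""
    return f_ratio / (f_ratio + count - 2)


# Invented demonstration data, NOT values from Table 2.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
stats = linear_fit(x_demo, y_demo)
print({name: round(value, 4) for name, value in stats.items()})

# Reported F-ratio of equation (2), assuming 35 points in the fit.
print(round(r2_from_f(21896.8, 35), 4))
```

Under that assumption, the reported F = 21896.8 corresponds to an R2 of about 0.9985 and r of about 0.9992, in line with the values quoted above.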

The relation between the Vc calculated by means of equation (2) and the Vc values is given by equation (3):

$$V_{c,\mathrm{calculated}} = 0.39046\,(\pm 1.798) + 0.9985\,(\pm 0.0067)\,V_c \qquad (3)$$

R2 = 0.9985
r = 0.9993
s.d. = 3.39
F = 21896.12

This correlation is analysed in the same way as equation (2), but without an ANOVA, which is unnecessary because the percentage errors between the calculated Vc and Vc are small.


Table 4. Correlation matrix of the topological indices and the parametric index

Table 4 presents the matrix of all possible regression combinations. Each matrix term aij is a correlation coefficient (r); the linear relation Vc = f(¹χh) has the largest one (0.9992). From the r values for Vc = f(TIs) (terms a8,2 to a8,7), all the TIs considered can be ordered: Z < W < MTI < ¹χ < Xu < ¹χh. This is the same order as that obtained from log trr = f(TIs) (terms a2,1 to a7,1). This transitivity property is useful for assessing the correlation of an experimental quantity (log trr) with topological indices when the matrix of r values for Vc = f(TIs) is known a priori.
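The ordering procedure described here amounts to ranking each index by its correlation coefficient with Vc, ranking it again by its correlation with log trr, and comparing the two orders. A minimal sketch follows; the function names, the descriptor labels "A", "B" and "C", and all numbers are invented placeholders, since the values of Tables 2 and 4 are not reproduced in this text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient r between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rank_by_r(target, descriptors):
    """Descriptor names sorted by increasing |r| against the target variable."""
    return sorted(descriptors, key=lambda name: abs(pearson(descriptors[name], target)))


# Invented placeholder data, NOT values from Table 2.
vc_demo = [10.0, 20.0, 30.0, 40.0, 50.0]
log_trr_demo = [0.10, 0.35, 0.60, 0.80, 1.05]
descriptors_demo = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [1.0, 4.0, 9.0, 16.0, 25.0],
    "C": [5.0, 1.0, 4.0, 2.0, 6.0],
}

order_vc = rank_by_r(vc_demo, descriptors_demo)
order_log = rank_by_r(log_trr_demo, descriptors_demo)
print(order_vc)               # ['C', 'B', 'A']
print(order_log)              # ['C', 'B', 'A']
print(order_vc == order_log)  # True
```

With real data, the dictionary would hold the six index columns of Table 2, and the equality of the two orders is the property the text calls transitivity.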

The linear regression (y = mx + n) of log trr versus Vc is statistically very similar to the multiple regression (log trr = a + bx + cy + dz) of log trr versus the independent variables MR, Pcr and Vcr, but the larger F-ratio indicates a greater predictive capacity for the linear model (see Table 5).

Table 5. Statistical results of the linear and multiple regression models.

These results indicate that, using the concept of Euclidean distance in the space E3, it is possible to reduce a multiple regression with three independent variables to a linear regression of the form y = mx + n. In this way the problems mentioned in the introduction, namely the orthogonalization of the factors and the dependence on the number of points analysed9, 10), are avoided.
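The comparison between the three-variable regression and its reduction through Vc can be sketched as follows. This is an illustrative least-squares implementation under stated assumptions, not the Statgraphics analysis behind Table 5: the helper names are hypothetical, and the seven data rows are invented placeholders rather than values from Tables 1 and 2. Only the ethane reference point is taken from the text.

```python
from math import sqrt


def solve(a, b):
    """Solve the square system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R2 of a least-squares fit of y on the columns plus an intercept.

    Centring every variable removes the intercept from the normal equations.
    """
    n = len(y)
    centred = [[v - sum(col) / n for v in col] for col in columns]
    mean_y = sum(y) / n
    yc = [v - mean_y for v in y]
    k = len(centred)
    xtx = [[sum(p * q for p, q in zip(centred[i], centred[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * q for p, q in zip(centred[i], yc)) for i in range(k)]
    beta = solve(xtx, xty)
    ss_res = sum((yc[r] - sum(beta[i] * centred[i][r] for i in range(k))) ** 2
                 for r in range(n))
    return 1.0 - ss_res / sum(v * v for v in yc)


def f_ratio(r2, n_points, n_vars):
    """F-ratio of a model with an intercept and n_vars independent variables."""
    return (r2 / n_vars) / ((1.0 - r2) / (n_points - n_vars - 1))


ETHANE = (11.48, 50.299, 147.5)  # reference point quoted in the text

# Invented placeholder rows (MR, Pcr, Vcr, log trr); NOT data from the tables.
rows = [
    (16.0, 42.0, 200.0, -1.00),
    (20.6, 37.5, 255.0, -0.60),
    (25.3, 33.3, 304.0, -0.30),
    (29.9, 29.9, 370.0, 0.00),
    (29.8, 30.5, 360.0, -0.08),
    (34.5, 27.0, 432.0, 0.30),
    (34.4, 27.6, 420.0, 0.22),
]
mr, pcr, vcr, log_trr = (list(col) for col in zip(*rows))
vc_values = [sqrt(sum((p - q) ** 2 for p, q in zip(row[:3], ETHANE))) for row in rows]

r2_multiple = r_squared([mr, pcr, vcr], log_trr)  # log trr = a + b*x + c*y + d*z
r2_reduced = r_squared([vc_values], log_trr)      # log trr = m*Vc + n
print(round(r2_multiple, 4), round(f_ratio(r2_multiple, len(rows), 3), 1))
print(round(r2_reduced, 4), round(f_ratio(r2_reduced, len(rows), 1), 1))
```

One point worth keeping in mind when reading such a comparison: for the same R2 and the same number of points, the F-ratio of a one-variable model is larger than that of a three-variable model simply because fewer degrees of freedom are spent on the model, so part of a larger F can come from the smaller number of independent variables.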

Note. Df: degrees of freedom; f denotes a function. Regressions were performed with the Statgraphics Plus 4 software.

CONCLUSIONS

1. Vc is a useful parameter for ordering topological indices (TIs) according to the r values of their regressions against GLC relative retention times.

2. A multiple regression can be reduced to a linear expression by means of the Vc parameter idea; a maximum of three independent variables is permitted.

3. A very significant linear correlation exists between Vc and ¹χh. This implies a strong dependence of ¹χh on the critical pressure and critical volume of the hydrocarbons. In fact, it implies that very sound topological criteria were used to define ¹χh.

REFERENCES

1. Z. Mihalić, N. Trinajstić. J. Chem. Educ. 69, 701-712 (1992).

2. M. Randić. J. Amer. Chem. Soc. 97, 6609-6615 (1975).

3. H. Hosoya. Bull. Chem. Soc. Japan 44, 2332-2339 (1971).

4. H. P. Schultz. J. Chem. Inform. Comput. Sci. 29, 227-228 (1989).

5. B. Ren. J. Chem. Inform. Comput. Sci. 39, 139-143 (1999).

6. C. Yang, C. Zhong. J. Chem. Inform. Comput. Sci. 43, 1998-2004 (2003).

7. E. Cornwell. J. Chil. Chem. Soc. 50, 483-487 (2005).

8. G. Zweig, J. Sherma. "Handbook of Chromatography", CRC Press (1976), page 50.

9. M. Randić. J. Chem. Inform. Comput. Sci. 37, 672-687 (1997).

10. J. G. Topliss, R. P. Edwards. J. Med. Chem. 22, 1238-1244 (1979).

Creative Commons License All the contents of this journal, except where otherwise noted, is licensed under a Creative Commons Attribution License