How Good Can the Characteristic Polynomial Be for Correlations?

Bolboaca, Sorana Daniela; Jantschi, Lorentz

doi:10.3390/i8040335


by Sorana Daniela Bolboaca ¹,* and Lorentz Jantschi ²

¹ “Iuliu Haţieganu” University of Medicine and Pharmacy, 13 Emil Isac, 400023 Cluj-Napoca, Romania
² Technical University of Cluj-Napoca, 15 Constantin Daicoviciu, 400020 Cluj-Napoca, Romania
* Author to whom correspondence should be addressed.

Int. J. Mol. Sci. 2007, 8(4), 335-345; https://0-doi-org.brum.beds.ac.uk/10.3390/i8040335

Submission received: 14 January 2007 / Revised: 27 March 2007 / Accepted: 12 April 2007 / Published: 30 April 2007

(This article belongs to the Special Issue Interaction of Biological Molecules)


Abstract

The aim of this study was to investigate the characteristic polynomials resulting from the molecular graphs used as molecular descriptors in the characterization of the properties of chemical compounds. A formal calculus method is proposed in order to identify the value of the characteristic polynomial parameters for which the extremum values of the squared correlation coefficient are obtained in univariate regression models. The developed calculation algorithm was applied to a sample of nonane isomers. The obtained results revealed that the proposed method produced an accurate and unique solution for the best relationship between the characteristic polynomial as molecular descriptor and the property of interest.

Keywords: Characteristic polynomial; Graph theory; Structure-Property Relationships; Nonane isomers; Henry’s law constant (solubility)

1. Introduction

Polynomials derived from molecular graphs and matrices find applications in chemistry for the construction of structural descriptors and topological indices [1] and in QSPR (quantitative structure-property relationships) and QSAR (quantitative structure-activity relationships) models [2,3]. The characteristic polynomial of a molecular graph is a structural invariant, defined as [4,5]:

ChP (G, X) = det [XI - A (G)]

(1)

where A(G) is the adjacency matrix (A being a square matrix) of the molecular graph G, and I is the identity matrix.
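As an illustration of Eq. (1), the following minimal sketch computes the characteristic polynomial of the hydrogen-suppressed graph of n-nonane (a path on nine carbon atoms). The Faddeev-LeVerrier recurrence used here is only one convenient way to expand the determinant and is an assumption of this example, not the authors' procedure; the helper names (`mat_mul`, `char_poly`, `path_graph`) are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(adj):
    """Coefficients of det(X*I - A), highest power first (Faddeev-LeVerrier)."""
    n = len(adj)
    coeffs = [1]                           # leading coefficient of X^n
    m = [[0] * n for _ in range(n)]        # M_0 is the zero matrix
    for k in range(1, n + 1):
        am = mat_mul(adj, m)
        # M_k = A*M_{k-1} + c_{n-k+1}*I
        m = [[am[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        trace = sum(row[i] for i, row in enumerate(mat_mul(adj, m)))
        coeffs.append(-trace // k)         # exact: the coefficient is an integer
    return coeffs


def path_graph(n):
    """Adjacency matrix of a chain of n atoms (assumed hydrogen-suppressed)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        adj[i][i + 1] = adj[i + 1][i] = 1
    return adj


nonane = char_poly(path_graph(9))
print(nonane)  # [1, 0, -8, 0, 21, 0, -20, 0, 5, 0]
```

The printed coefficients correspond to X⁹ − 8·X⁷ + 21·X⁵ − 20·X³ + 5·X, the entry listed for nonane (c₆) in Table 1.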

Crum-Brown and Fraser published in 1868 the observation that the physiological action of ammonium salts is a function of their chemical composition and structure [6]. Since then, many indices have been introduced and used in the characterization of compounds’ properties, such as the Wiener index [7,8], Hosoya index [9,10], Zagreb index [11,12], Wiener-Hosoya index [13,14], Randić index [15,16], Narumi-Katayama index [17,18], Pogliani index [19], Schultz index [20], Gutman index [21], Harary H index [22], Cluj index [23], Balaban index [24], Xu index [25], and others.

In 1971, Hosoya first reported the use of the absolute values of the coefficients of the characteristic polynomial of an acyclic chemical compound [9], leading to what is known today as the Hosoya index Z. Since then, the correlation between Z and many thermodynamic properties has been thoroughly studied [26–31]. However, a polynomial is a more general treatment than an index. The characteristic polynomial is just one of the polynomials that can be calculated on a molecular structure [5]. An advantage of polynomials over single-valued indices is the reduction of degeneration, that is, fewer distinct structures share the same descriptor value. Our goal was to create a procedure for building and using a polynomial formula to correlate the structure with a given property through the value of the polynomials at a point. This concept generalizes, to some extent, the use of polynomials in regression analysis. Moreover, the desired functionality of our application is to find all singularities of the polynomial derivatives, in order to answer our proposed question: How Good Can the Characteristic Polynomial Be for Correlations?

Starting with the characteristic polynomials as molecular descriptors in characterization of structure-property relationships, the aim of the research was to develop a formal calculation algorithm able to identify the value of the characteristic polynomial parameter for which the extremum values of squared correlation coefficients are obtained in univariate regression models.

2. Statement of the Problem and Mathematical Solution

Consider a sample of n compounds. Each molecule is denoted c_i (written mol_i in the equations below), where i is an integer that takes values from 1 to n.

The characteristic polynomial can be built and calculated based on the compound’s structure by using the following generic functions (where, for simplicity, all polynomials are taken to be of the same degree k):

{mol}_{i}: \quad {ChP}_{i} = a_{0 i} X^{0} + a_{1 i} X^{1} + a_{2 i} X^{2} + \dots + a_{ki} X^{k}

(2)

where a_ki are coefficients of the characteristic polynomial (a_0i = the constant coefficient and a_ki = the leading coefficient, k = the degree of polynomial), ChP_i are the characteristic polynomial functions, and X is a generic variable.

Each chemical compound from the sample (c_i) has a molecular structure (S_i) and an associated property of interest (Y_i). These can be written as:

{mol}_{i}: \quad (Y_{i}, S_{i})

(3)

Thus, each compound (c_i) has an associated property of interest (Y_i) and, derived from its structure, an associated characteristic polynomial (ChP_i):

{mol}_{i}: \quad (Y_{i}, {ChP}_{i} (X))

(4)

For characterization of the compounds’ property, the abstract function of the characteristic polynomial is not useful; the value associated with the characteristic polynomial function is necessary:

{mol}_{i}: \quad (Y_{i}, {ChP}_{i} (x))

(5)

where ChP_i(x) is the value of the characteristic polynomial function associated with the i-th molecule, evaluated at X = x.

A problem arises at this point: what is the value of X (X = x) for which the correlation between the property of interest and the characteristic polynomial function attains its maximum value?

It is well known that the Pearson product-moment correlation coefficient is the most used correlation coefficient for quantitative variables. In our example this coefficient indicates the strength and direction of the linear relationship between property of interest and characteristic polynomial. Transforming the problem into a formula, the problem becomes:

\begin{array}{c} r (Y, ChP (X)) = \frac{cov (Y, ChP (X))}{σ_{Y} σ_{ChP (X)}} = \frac{M ((Y - μ_{Y}) (ChP (X) - μ_{ChP (X)}))}{σ_{Y} σ_{ChP (X)}} \\ r (Y, ChP (X)) = max \end{array}

(6)

where cov is the covariance; σ_Y and σ_ChP(X) are the standard deviations of the property of interest (Y) and of the characteristic polynomial (ChP(X)); M(·) is the mean (expected value) operator; and μ_Y and μ_ChP(X) are the averages of the two variables.

The above parameters can be written as: μ_Y = M(Y), σ_Y² = M(Y²) − M²(Y), and similarly μ_ChP(X) = M(ChP(X)), σ_ChP(X)² = M(ChP(X)²) − M²(ChP(X)). Under these conditions, the formula of the correlation coefficient is:

\begin{array}{c} r (Y, ChP (X)) = \frac{M (YChP (X)) - M (Y) M (ChP (X))}{\sqrt{M (Y^{2}) - M^{2} (Y)} \sqrt{M (ChP {(X)}^{2}) - M^{2} (ChP (X))}} \\ r (Y, ChP (X)) = max \end{array}

(7)
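The moment form of Eq. (7) translates almost line by line into code. The sketch below is illustrative only: the function names are hypothetical, the evaluation point x = 2 is arbitrary, and the five compounds are a small subset of Table 1 chosen for brevity, so the printed value is not a result of the paper.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean, i.e. the M(.) operator of the text."""
    values = list(values)
    return sum(values) / len(values)


def pearson_r(ys, ps):
    """Correlation coefficient written with raw moments, as in Eq. (7)."""
    m_y, m_p = mean(ys), mean(ps)
    m_yp = mean(y * p for y, p in zip(ys, ps))
    var_y = mean(y * y for y in ys) - m_y ** 2
    var_p = mean(p * p for p in ps) - m_p ** 2
    return (m_yp - m_y * m_p) / sqrt(var_y * var_p)


def chp_value(monomials, x):
    """Value of a polynomial stored as (power, coefficient) pairs at X = x."""
    return sum(coef * x ** power for power, coef in monomials)


# Illustrative subset of Table 1: (k_H in 1e-5 M/atm, characteristic polynomial)
sample = [
    (10, [(9, 1), (7, -8), (5, 20), (3, -17), (1, 3)]),   # c1
    (15, [(9, 1), (7, -8), (5, 16), (3, -8)]),            # c3
    (17, [(9, 1), (7, -8), (5, 21), (3, -20), (1, 5)]),   # c6
    (19, [(9, 1), (7, -8), (5, 15)]),                     # c25
    (21, [(9, 1), (7, -8), (5, 20), (3, -16), (1, 2)]),   # c30
]

x = 2.0  # arbitrary evaluation point, chosen only for the example
ys = [y for y, _ in sample]
ps = [chp_value(poly, x) for _, poly in sample]
r = pearson_r(ys, ps)
print(r, r * r)
```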

To solve the problem it is necessary to solve polynomial equations in X whose degree is not known in advance and to retain their real solutions. The formula:

\partial r / \partial X = 0 \to x_{1}, \dots x_{j}

(8)

where ∂r/∂X = derivative of r(Y, ChP(X)), and j is an integer, gives the solutions for x₁, …, x_j.

Note that it is difficult to work with r from Eq. (7); it is much easier to work with its squared value (r²). Using squared correlation coefficient (r²) instead of correlation coefficient (r), Eq. (8) becomes:

\partial r^{2} / \partial X = 2 r \partial r / \partial X = 0

(9)

So, the roots x₁, …, x_j of ∂r(Y, ChP(X))/∂X = 0 will be among the roots of ∂r²(Y, ChP(X))/∂X = 0. The latter equation is also satisfied by the roots of r = 0 (equivalently r² = 0), not all of which are of interest. Eq. (10) will provide all extremum points (Eq. (11)):

\partial (\cdot) / \partial X = 0

(10)

\partial (\cdot) (Y, ChP (X)) / \partial X ∣_{X = x} = 0 \Leftrightarrow x \text{ is an extremum point of } (\cdot)

(11)

where the dot (·) denotes any function (such as r or r² in our case).

In order to find which among the solutions ({x₁, …, x_k}) of Eq. (11) are global maxima, the values of all r(Y,ChP(x_k)) must be computed and from the obtained values the greatest ones must be selected:

x_{j} is a maximum (positive or negative) \Leftrightarrow r (Y, ChP (x_{j})) = max {r (Y, ChP (x_{k}))}

(12)

Assuming that there is a string of polynomials (as in Eq. (2)) with equal degree k:

P_{j} = a_{0 j} X^{0} + a_{1 j} X^{1} + a_{2 j} X^{2} + \dots + a_{kj} X^{k}

(13)

the proposed implementation of the model uses the following elementary mathematical operations:

- Multiplication by a scalar α:

R = α P_{j} = α a_{0 j} X^{0} + α a_{1 j} X^{1} + α a_{2 j} X^{2} + \dots + α a_{kj} X^{k}

(14)

- Addition:

R = P_{i} + P_{j} = (a_{0 i} + a_{0 j}) X^{0} + (a_{1 i} + a_{1 j}) X^{1} + (a_{2 i} + a_{2 j}) X^{2} + \dots + (a_{ki} + a_{kj}) X^{k}

(15)

- Average:

R = M (P_{j}) = M (a_{0 j}) X^{0} + M (a_{1 j}) X^{1} + M (a_{2 j}) X^{2} + \dots + M (a_{kj}) X^{k}

(16)

- Product:

R = P_{i} P_{j} = (a_{0 i} a_{0 j}) X^{0} + (a_{0 i} a_{1 j} + a_{1 i} a_{0 j}) X^{1} + \dots + (a_{ki} a_{kj}) X^{2 k}

(17)

- Derivative:

R = {P_{j}}^{'} = a_{1 j} X^{0} + 2 a_{2 j} X^{1} + \dots + k a_{kj} X^{k - 1}

(18)

In order to solve Eq. (9), the derivative of a fraction is also necessary:

if R = {(P_{i} / P_{j})}^{'} = 0 then {P_{i}}^{'} P_{j} - P_{i} {P_{j}}^{'} = 0

(19)
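To connect Eq. (9) with Eq. (19), it may help to write r² from Eq. (7) explicitly as a ratio of two polynomials. The names N(X) and D(X) are introduced here for illustration only and do not appear in the original text.

```latex
\begin{aligned}
r^{2}(X) &= \frac{N(X)}{D(X)}, \\
N(X) &= \bigl[M(Y\,ChP(X)) - M(Y)\,M(ChP(X))\bigr]^{2}, \\
D(X) &= \bigl[M(Y^{2}) - M^{2}(Y)\bigr]\,\bigl[M(ChP(X)^{2}) - M^{2}(ChP(X))\bigr], \\
\frac{\partial r^{2}}{\partial X} &= \frac{N'(X)\,D(X) - N(X)\,D'(X)}{D(X)^{2}} = 0
\;\Longleftrightarrow\; N'(X)\,D(X) - N(X)\,D'(X) = 0 \quad (D(X) \neq 0).
\end{aligned}
```

Since N(X) and D(X) are both polynomials in X, the stationarity condition reduces to a single polynomial equation, which is what the operations in Eqs. (14)-(19) are needed for.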

The proposed calculus could be done with pen and paper, but is time consuming, especially when there are many compounds of interest. Thus, a formal computation method could help to find the exact and unique solution of the best relationship between characteristic polynomial and property of interest.
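The elementary operations of Eqs. (14)-(18) can be sketched in a few lines. In this hypothetical illustration a polynomial is stored as a dictionary mapping the power of X to its coefficient; this representation and the function names are assumptions of the example. The two inputs are the reduced polynomials of c₃ and c₄ from the last column of Table 1.

```python
def p_scale(alpha, p):
    """Eq. (14): multiply every coefficient by the scalar alpha."""
    return {power: alpha * coef for power, coef in p.items()}


def p_add(p, q):
    """Eq. (15): add coefficients of equal powers."""
    out = dict(p)
    for power, coef in q.items():
        out[power] = out.get(power, 0) + coef
    return out


def p_mean(polys):
    """Eq. (16): coefficient-wise average of a list of polynomials."""
    total = {}
    for p in polys:
        total = p_add(total, p)
    return p_scale(1.0 / len(polys), total)


def p_mul(p, q):
    """Eq. (17): product of two polynomials (powers add, coefficients multiply)."""
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out


def p_deriv(p):
    """Eq. (18): derivative with respect to X."""
    return {power - 1: power * coef for power, coef in p.items() if power > 0}


p_c3 = {5: 16, 3: -8}   # 16·X^5 − 8·X^3
p_c4 = {5: 15, 3: -6}   # 15·X^5 − 6·X^3

print(p_scale(2, p_c3))      # {5: 32, 3: -16}
print(p_add(p_c3, p_c4))     # {5: 31, 3: -14}
print(p_mean([p_c3, p_c4]))  # {5: 15.5, 3: -7.0}
print(p_mul(p_c3, p_c4))     # {10: 240, 8: -216, 6: 48}
print(p_deriv(p_c3))         # {4: 80, 2: -24}
```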

3. Calculation Algorithm

1. Parse the polynomial formulas for all given molecules (ChP_j, 1 ≤ j ≤ n); parse the measured data values for the given molecules (Y_j, 1 ≤ j ≤ n). Comments:
- The polynomials are stored as sums of monomials;
- Every monomial is in fact a pair of two values: the power of the variable (X) and the coefficient;
- A measured data value is assigned to a polynomial through the j value (where j is an integer and takes values from 1 to n).
2. Search the polynomial formulas and remove the monomials that are identical in all polynomials (as in Table 1). Comments:
- It is safe to remove the repeated monomials (for example X⁹ or − 8·X⁷, see Table 1). At any point x such monomials add the same constant to every ChP_j(x), and the correlation coefficient does not change when a constant is added to a variable; the calculations made by using Eq. (7) confirmed that the values of the correlation coefficients are not affected;
- It is better to remove the identical monomials in order to reduce the calculation complexity, the magnitude of the numbers, and the propagation of errors.
3. Compute the squared correlation coefficient as a pair of two polynomials: numerator and denominator. Comment: the following procedure has been used:
- Compute the mean and dispersion of Y (as numbers): mY = M(Y); and d2Y = M(Y²) − M²(Y);
- Compute the average polynomial (as polynomial): MChP(X) = M(ChP(X));
- Compute the average of the Y·ChP(X) products (as polynomial): MYChP(X) = M(Y·ChP(X));
- Construct the square polynomials ChP_j²(X) and average them (as polynomial): MChP2(X) = M(ChP²(X));
- Square MChP (as polynomial): M2ChP(X) = MChP(X)·MChP(X);
- Change the sign of M2ChP (as polynomial): M2ChP(X) = (−1)·M2ChP(X);
- Add M2ChP to MChP2 (as polynomial): MChP2(X) = MChP2(X) + M2ChP(X);
- Multiply the obtained MChP2 by d2Y: MChP2(X) = (d2Y)·MChP2(X) // Comment: now the MChP2(X) polynomial contains the denominator of r²;
- Multiply MChP by (−mY): MChP(X) = (−mY)·MChP(X);
- Add the obtained MChP(X) polynomial to the MYChP(X) polynomial: MYChP(X) = MYChP(X) + MChP(X) // Comment: now MYChP(X) contains the numerator of r;
- Square the obtained MYChP(X) polynomial: MYChP(X) = MYChP(X)·MYChP(X) // Comment: now the MYChP(X) polynomial contains the numerator of r²;
- Return the pair of polynomials (MYChP(X), MChP2(X)).
4. Calculate the derivative of the numerator of r² (as polynomial): numerator1(X) = ∂numerator(X)/∂X;
5. Calculate the derivative of the denominator of r² (as polynomial): denominator1(X) = ∂denominator(X)/∂X;
6. Calculate the product between numerator1(X) and denominator(X) (as polynomial): product1(X) = numerator1(X)·denominator(X);
7. Calculate the product between numerator(X) and denominator1(X) (as polynomial): product2(X) = numerator(X)·denominator1(X);
8. Change the sign of product2(X): product2(X) = (−1)·product2(X);
9. Add product2(X) to product1(X) and store the result in r2_1_numerator: r2_1_numerator(X) = product1(X) + product2(X);
10. Factorize r2_1_numerator(X) if possible (it is usually easy to factor out a power of X when one is contained in it, so only this factor is extracted); let X^p be the factor; delete the factor; thus r2_1_numerator becomes: r2_1_numerator(X) = r2_1_numerator(X)/X^p;
11. Find the roots of the equation r2_1_numerator(X) = 0 and return them as pairs (x_i, ɛ_i), 1 ≤ i ≤ m, where in fact r2_1_numerator(x_i) = ɛ_i. Comments:
- The root-finding procedure is an approximate one for at least two reasons. First, the M(·) operator is used, so the coefficients are in general not integers. Second, even if the S(·) operator (sum operator) is used instead of the M(·) operator in order to obtain integer coefficients, the degree of the obtained polynomial is too high for non-numeric methods to be applied (for our example the degree of the obtained polynomial equation was 12);
- Returning ɛ_i makes it possible to know how close the result is to the exact solution;
- The root-finding procedure is recursive, and it also calculates and uses all higher-order derivatives of the polynomial in order to find all real roots of the equation.
12. Use the set of roots {x_i}, 1 ≤ i ≤ m, and the pair of polynomials (numerator(X), denominator(X)) to calculate the value of r² at the following points: {x_i} → {r²(x_i)}, 1 ≤ i ≤ m;
13. Display the results: {x_i, ɛ_i, r²(x_i)}, 1 ≤ i ≤ m.
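The construction of the numerator and denominator of r² (the third step of the list above) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: polynomials are dictionaries mapping powers to coefficients, the function names are hypothetical, and only five compounds from Table 1 (with the common monomials already removed) are used.

```python
def p_scale(alpha, p):
    return {k: alpha * c for k, c in p.items()}


def p_add(p, q):
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) + c
    return out


def p_mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out


def p_mean(polys):
    total = {}
    for p in polys:
        total = p_add(total, p)
    return p_scale(1.0 / len(polys), total)


def p_eval(p, x):
    return sum(c * x ** k for k, c in p.items())


def r2_fraction(polys, ys):
    """Return (numerator, denominator) of r^2 as polynomials in X."""
    n = len(ys)
    m_y = sum(ys) / n                                   # mY
    d2_y = sum(y * y for y in ys) / n - m_y ** 2        # d2Y
    m_chp = p_mean(polys)                               # MChP
    m_ychp = p_mean([p_scale(y, p) for y, p in zip(ys, polys)])  # MYChP
    m_chp2 = p_mean([p_mul(p, p) for p in polys])       # MChP2
    var_chp = p_add(m_chp2, p_scale(-1.0, p_mul(m_chp, m_chp)))
    denominator = p_scale(d2_y, var_chp)
    cov = p_add(m_ychp, p_scale(-m_y, m_chp))           # numerator of r
    numerator = p_mul(cov, cov)                         # numerator of r^2
    return numerator, denominator


# Illustrative subset of Table 1 (k_H in 1e-5 M/atm, reduced polynomials)
ys = [10, 15, 17, 19, 21]                               # c1, c3, c6, c25, c30
polys = [{5: 20, 3: -17, 1: 3}, {5: 16, 3: -8}, {5: 21, 3: -20, 1: 5},
         {5: 15}, {5: 20, 3: -16, 1: 2}]

num, den = r2_fraction(polys, ys)
print(sorted(num))  # [2, 4, 6, 8, 10]
print(p_eval(num, 1.3) / p_eval(den, 1.3))  # r^2 at the arbitrary point x = 1.3
```

The powers present in the numerator run from X² to X¹⁰ in even steps, the same pattern as in Eq. (20); the coefficients differ because only a subset of the compounds is used here.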

The algorithm presented above has been implemented in PHP (Hypertext Preprocessor). In order to illustrate its effectiveness, the program was run for a sample of nonane isomers, with the Henry’s law constant (solubility) as the property of interest.
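The original PHP program is not reproduced in the text. As a rough Python sketch of the fourth to twelfth steps, the example below takes a toy fraction X²/(X⁴ + 1) in place of a real r², builds the numerator of its derivative, removes the factor X^p, and locates the real roots. The root finder is a plain sign-change scan followed by bisection, which is a simpler technique than the recursive derivative-based procedure described above and can miss roots of even multiplicity; all names are hypothetical.

```python
def p_eval(p, x):
    return sum(c * x ** k for k, c in p.items())


def p_mul(p, q):
    out = {}
    for i, a in p.items():
        for j, b in q.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return out


def p_sub(p, q):
    out = dict(p)
    for k, c in q.items():
        out[k] = out.get(k, 0) - c
    return out


def p_deriv(p):
    return {k - 1: k * c for k, c in p.items() if k > 0}


def strip_x_factor(p):
    """Drop zero coefficients and divide by the largest common power of X."""
    nonzero = {k: c for k, c in p.items() if c != 0}
    shift = min(nonzero)
    return {k - shift: c for k, c in nonzero.items()}, shift


def bisect(p, a, b, tol=1e-12):
    fa = p_eval(p, a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = p_eval(p, mid)
        if fa * fm <= 0:
            b = mid
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


def real_roots(p, lo, hi, steps=2000):
    """Return (x_i, eps_i) pairs, where eps_i is the residual p(x_i)."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    found = []
    for a, b in zip(xs, xs[1:]):
        fa, fb = p_eval(p, a), p_eval(p, b)
        if fa == 0:
            found.append(a)
        elif fa * fb < 0:
            found.append(bisect(p, a, b))
    if p_eval(p, xs[-1]) == 0:
        found.append(xs[-1])
    return [(x, p_eval(p, x)) for x in found]


numerator = {2: 1}           # toy N(X) = X^2
denominator = {4: 1, 0: 1}   # toy D(X) = X^4 + 1
r2_1_numerator = p_sub(p_mul(p_deriv(numerator), denominator),
                       p_mul(numerator, p_deriv(denominator)))
r2_1_numerator, p = strip_x_factor(r2_1_numerator)   # 2 - 2·X^4, p = 1
results = [(x, eps, p_eval(numerator, x) / p_eval(denominator, x))
           for x, eps in real_roots(r2_1_numerator, -3.0, 3.0)]
for row in results:
    print(row)   # roots close to -1 and 1, each with a function value close to 0.5
```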

4. Henry’s Law Constant of Nonane Isomers: Computational Results and Discussion

Nonane isomers are acyclic saturated hydrocarbon structures with the general chemical formula C₉H₂₀. There are thirty-five compounds in this class: 4-methyloctane (c₁), 3-ethyl-2,3-dimethylpentane (c₂), 3,3-diethylpentane (c₃), 2,2,3,3-tetramethylpentane (c₄), 2,3,3,4-tetramethylpentane (c₅), nonane (c₆), 2,3,3-trimethylhexane (c₇), 3,3,4-trimethylhexane (c₈), 3-ethyl-3-methylhexane (c₉), 2,2,3,4-tetramethylpentane (c₁₀), 3,4-dimethylheptane (c₁₁), 2,3,4-trimethylhexane (c₁₂), 3-ethyl-4-methylhexane (c₁₃), 3-ethyl-2,2-dimethylpentane (c₁₄), 3-ethyl-2,4-dimethylpentane (c₁₅), 2,3-dimethylheptane (c₁₆), 3,3-dimethylheptane (c₁₇), 4,4-dimethylheptane (c₁₈), 3-ethylheptane (c₁₉), 4-ethylheptane (c₂₀), 2,2,3-trimethylhexane (c₂₁), 2,2,5-trimethylhexane (c₂₂), 2,4,4-trimethylhexane (c₂₃), 3-ethyl-2-methylhexane (c₂₄), 2,2,4,4-tetramethylpentane (c₂₅), 3-methyloctane (c₂₆), 2,5-dimethylheptane (c₂₇), 3,5-dimethylheptane (c₂₈), 2,3,5-trimethylhexane (c₂₉), 2-methyloctane (c₃₀), 2,2-dimethylheptane (c₃₁), 2,4-dimethylheptane (c₃₂), 2,6-dimethylheptane (c₃₃), 2,2,4-trimethylhexane (c₃₄), and 4-ethyl-2-methylhexane (c₃₅). The property of interest was the Henry’s law constant (solubility of a gas in water) of these alkanes, expressed as trace gases of potential importance in environmental chemistry. The measured values were taken from previously reported research [32] (k_H, Table 1) and are given in M/atm (M/atm = [mol_aq/dm³_aq]/atm).

In the first step of the calculation algorithm, the polynomial formulas for all thirty-five compounds and the associated measured Henry’s law constants were parsed. In the second step of the algorithm, two monomials common to all polynomials (X⁹ and −8·X⁷) were identified and removed from the polynomials (see Table 1, characteristic polynomials after second step, last column).
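Why removing the common monomials is harmless can be checked numerically: at any fixed x they add the same constant to every descriptor value, and the Pearson coefficient is unchanged by such a shift. The sketch below uses a five-compound subset of Table 1 and an arbitrary point x = 1.3; both choices, and the function names, are assumptions made for illustration.

```python
def poly_value(p, x):
    """Value at x of a polynomial stored as {power: coefficient}."""
    return sum(c * x ** k for k, c in p.items())


def pearson_r2(ys, ps):
    """Squared Pearson correlation from centered sums (form of Eq. (6))."""
    n = len(ys)
    m_y, m_p = sum(ys) / n, sum(ps) / n
    cov = sum((y - m_y) * (p - m_p) for y, p in zip(ys, ps))
    var_y = sum((y - m_y) ** 2 for y in ys)
    var_p = sum((p - m_p) ** 2 for p in ps)
    return cov * cov / (var_y * var_p)


common = {9: 1, 7: -8}                       # monomials shared by all isomers
reduced = [{5: 20, 3: -17, 1: 3},            # c1
           {5: 16, 3: -8},                   # c3
           {5: 21, 3: -20, 1: 5},            # c6
           {5: 15},                          # c25
           {5: 20, 3: -16, 1: 2}]            # c30
ys = [10, 15, 17, 19, 21]                    # k_H in 1e-5 M/atm

x = 1.3                                      # arbitrary evaluation point
shift = poly_value(common, x)                # same constant for every compound
values_reduced = [poly_value(p, x) for p in reduced]
values_full = [v + shift for v in values_reduced]

r2_reduced = pearson_r2(ys, values_reduced)
r2_full = pearson_r2(ys, values_full)
print(r2_reduced, r2_full)                   # the two values agree up to rounding
```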

The polynomial of the squared correlation coefficient resulting from the third step of the calculation algorithm was of the tenth degree:

r^{2} (P (X)) = \frac{X^{2} \cdot 0.55 \dots - X^{4} \cdot 0.94 \dots - X^{6} \cdot 0.39 \dots + X^{8} \cdot 0.66 \dots + X^{10} \cdot 0.27 \dots}{X^{2} \cdot 14.19 \dots - X^{4} \cdot 56.43 \dots + X^{6} \cdot 100.25 \dots - X^{8} \cdot 52.97 \dots + X^{10} \cdot 9.93 \dots}

(20)

The numerator of the derivative of r² led to a polynomial equation of the twelfth degree:

\begin{matrix} {r^{2}}^{'} (P (X)) = 0 \\ \Leftrightarrow \\ (- 0.84 \dots) X^{0} + (5.74 \dots) X^{2} + (- 10.9 \dots) X^{4} + (8.47 \dots) X^{6} + (- 1.26 \dots) X^{8} + (- 2.97 \dots) X^{10} + X^{12} = 0 \end{matrix}

(21)

Note that just the first significant digits were displayed in Eqs. (20) and (21) (the “…” sign was written when more digits were available).

The roots of the derivative of the squared correlation coefficient obtained by the proposed algorithm for the sample of nonane isomers are presented in Table 2, where the ɛ_i parameter shows how close the obtained value is to the exact solution of ∂r²/∂X = 0 at X = x_i. Specifically, r2_1_numerator(x_i) = ɛ_i, where r2_1_numerator is the polynomial from the eleventh step of the proposed algorithm, that is, the numerator of the derivative of r².

As can be observed from Table 2, the proposed algorithm obtained pairs of roots of opposite sign (solutions 1.1 and 1.2, 2.1 and 2.2, and 3.1 and 3.2; see the values in the x_i column). The values of the squared correlation coefficient at these roots are local extremum values (maximum and/or minimum values): one equal to zero (for the pair of roots ± 0.856…) and two positive (one minimum for the ± 0.481… pair of roots and one maximum for the ± 1.656… pair of roots). Pairs of roots are the expected result, taking into account that r2_1_numerator(X) is an even polynomial in X (it contains only even powers of X, see Eq. (21)).
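The symmetry of the roots can be traced back to the polynomials in Table 1, all of which contain only odd powers of X. A short derivation, offered as a sketch:

```latex
\begin{aligned}
ChP_i(-x) &= -\,ChP_i(x) \quad \text{for every } i \\
\Rightarrow\; r(Y, ChP(-x)) &= -\,r(Y, ChP(x)) \\
\Rightarrow\; r^{2}(Y, ChP(-x)) &= r^{2}(Y, ChP(x)).
\end{aligned}
```

Hence r² is an even function of X, its derivative is an odd function, and every nonzero root x_i is accompanied by −x_i with the same value of r², as seen in Table 2.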

Analyzing the results presented in Table 2, it can be observed that, for the identified roots, the numerical errors were in all cases less than 0.0001. These results support the ability of the model to identify the required solutions. Looking at the values of the obtained squared correlation coefficients, it can be observed that the proposed method identified one maximum value (for roots ± 1.656…) and two minimum values (for roots ± 0.856… and ± 0.481…) (note that these are local extremum values). The maximum value of the squared correlation coefficient is 0.296, which, from the statistical point of view, reveals a weak linear relationship between the characteristic polynomial and the Henry’s law constant for the studied alkanes. It must be noted that the aim of the paper was not to obtain a significant correlation coefficient; it was to develop and implement a formal algorithm able to identify the characteristic polynomial parameter for which the extremum values (maximum and minimum) of the correlation coefficient are obtained in univariate regression models, and this aim was accomplished.

Regarding the proposed method, one question can arise: why use the proposed method when the Hosoya Z index [9] can be used in QSPR without using a computer? First, the use of characteristic polynomials instead of the Z index reduces the degeneration. Second, the proposed model is able to find all singularities of the polynomial derivatives. Last but not least, the proposed computer-based method is able to work with small as well as large sample sizes without requiring human time or skill, which avoids human calculation errors.

It is well known that the squared correlation coefficient increases with the number of variables used by a linear regression model [33]. Starting from this observation, it will be interesting to analyze the applicability of the proposed model to multivariate regression models. The next step of our research concerns the implementation of a similar computational algorithm for multivariate models in which characteristic polynomials are used as molecular descriptors. Another question that needs to be answered concerns the usefulness of the method for the characterization of relationships between a compound’s activity and its structure, an approach that will be investigated in future research.

5. Concluding Remarks

The proposed calculation algorithm is able to obtain unique and reproducible solutions. The solutions are unique in the sense that, for a sample of compounds with a property of interest, the maximum value of the squared correlation coefficient between the property and the characteristic polynomials is always given by a single pair of roots. The computation algorithm is applicable to any class of compounds when characteristic polynomials are used as descriptors in the analysis of the relationship between the compounds’ structure and their properties.

Table 1. Nonane isomers: Henry’s law constant and characteristic polynomials.

Comp. Abbrev.	k_H(·10⁻⁵) [M/atm]*	Characteristic polynomial	After second step of calculation algorithm
c₁	10	X⁹ − 8·X⁷ + 20·X⁵ − 17·X³ + 3·X	20·X⁵ − 17·X³ + 3·X
c₂	15	X⁹ − 8·X⁷ + 17·X⁵ − 12·X³ + 2·X	17·X⁵ − 12·X³ + 2·X
c₃	15	X⁹ − 8·X⁷ + 16·X⁵ − 8·X³	16·X⁵ − 8·X³
c₄	16	X⁹ − 8·X⁷ + 15·X⁵ − 6·X³	15·X⁵ − 6·X³
c₅	16	X⁹ − 8·X⁷ + 18·X⁵ − 16·X³ + 5·X	18·X⁵ − 16·X³ + 5·X
c₆	17	X⁹ − 8·X⁷ + 21·X⁵ − 20·X³ + 5·X	21·X⁵ − 20·X³ + 5·X
c₇	17	X⁹ − 8·X⁷ + 17·X⁵ − 10·X³	17·X⁵ − 10·X³
c₈	17	X⁹ − 8·X⁷ + 17·X⁵ − 11·X³ + 2·X	17·X⁵ − 11·X³ + 2·X
c₉	17	X⁹ − 8·X⁷ + 18·X⁵ − 14·X³ + 3·X	18·X⁵ − 14·X³ + 3·X
c₁₀	17	X⁹ − 8·X⁷ + 16·X⁵ − 6·X³	16·X⁵ − 6·X³
c₁₁	18	X⁹ − 8·X⁷ + 19·X⁵ − 15·X³ + 3·X	19·X⁵ − 15·X³ + 3·X
c₁₂	18	X⁹ − 8·X⁷ + 18·X⁵ − 12·X³ + 2·X	18·X⁵ − 12·X³ + 2·X
c₁₃	18	X⁹ − 8·X⁷ + 19·X⁵ − 16·X³ + 4·X	19·X⁵ − 16·X³ + 4·X
c₁₄	18	X⁹ − 8·X⁷ + 17·X⁵ − 10·X³	17·X⁵ − 10·X³
c₁₅	18	X⁹ − 8·X⁷ + 18·X⁵ − 12·X³	18·X⁵ − 12·X³
c₁₆	19	X⁹ − 8·X⁷ + 19·X⁵ − 14·X³ + 2·X	19·X⁵ − 14·X³ + 2·X
c₁₇	19	X⁹ − 8·X⁷ + 18·X⁵ − 12·X³ + 2·X	18·X⁵ − 12·X³ + 2·X
c₁₈	19	X⁹ − 8·X⁷ + 18·X⁵ − 12·X³	18·X⁵ − 12·X³
c₁₉	19	X⁹ − 8·X⁷ + 20·X⁵ − 18·X³ + 5·X	20·X⁵ − 18·X³ + 5·X
c₂₀	19	X⁹ − 8·X⁷ + 20·X⁵ − 18·X³ + 4·X	20·X⁵ − 18·X³ + 4·X
c₂₁	19	X⁹ − 8·X⁷ + 17·X⁵ − 9·X³	17·X⁵ − 9·X³
c₂₂	19	X⁹ − 8·X⁷ + 17·X⁵ − 6·X³	17·X⁵ − 6·X³
c₂₃	19	X⁹ − 8·X⁷ + 17·X⁵ − 8·X³	17·X⁵ − 8·X³
c₂₄	19	X⁹ − 8·X⁷ + 19·X⁵ − 15·X³ + 2·X	19·X⁵ − 15·X³ + 2·X
c₂₅	19	X⁹ − 8·X⁷ + 15·X⁵	15·X⁵
c₂₆	20	X⁹ − 8·X⁷ + 20·X⁵ − 17·X³ + 4·X	20·X⁵ − 17·X³ + 4·X
c₂₇	20	X⁹ − 8·X⁷ + 19·X⁵ − 13·X³ + 2·X	19·X⁵ − 13·X³ + 2·X
c₂₈	20	X⁹ − 8·X⁷ + 19·X⁵ − 14·X³ + 3·X	19·X⁵ − 14·X³ + 3·X
c₂₉	20	X⁹ − 8·X⁷ + 18·X⁵ − 10·X³	18·X⁵ − 10·X³
c₃₀	21	X⁹ − 8·X⁷ + 20·X⁵ − 16·X³ + 2·X	20·X⁵ − 16·X³ + 2·X
c₃₁	21	X⁹ − 8·X⁷ + 18·X⁵ − 10·X³	18·X⁵ − 10·X³
c₃₂	21	X⁹ − 8·X⁷ + 19·X⁵ − 13·X³	19·X⁵ − 13·X³
c₃₃	21	X⁹ − 8·X⁷ + 19·X⁵ − 12·X³	19·X⁵ − 12·X³
c₃₄	21	X⁹ − 8·X⁷ + 17·X⁵ − 7·X³	17·X⁵ − 7·X³
c₃₅	21	X⁹ − 8·X⁷ + 19·X⁵ − 14·X³ + 2·X	19·X⁵ − 14·X³ + 2·X

* M/atm = (mol_aq/dm³_aq)/atm

Table 2. Algorithm of calculation: solutions for nonane isomers.

Solution	x_i	ɛ_i	r²(x_i)
1.1	− 1.656…	− 5.5…·10⁻¹¹	0.296…
2.1	− 0.856…	1.1…·10⁻¹³	0
3.1	− 0.481…	2.7…·10⁻¹³	0.055…
3.2	0.481…	2.7…·10⁻¹³	0.055…
2.2	0.856…	1.1…·10⁻¹³	0
1.2	1.656…	− 5.5…·10⁻¹¹	0.296…

x_i = root; r²(x_i) = squared correlation coefficient; ɛ_i = numerical error;

… = for all numbers only first significant digits were presented

Acknowledgement

The research was partly supported by UEFISCSU Romania through project ET46/2006.

References

1. Balaban, A.T.; Ivanciuc, O. Topological Indices and Related Descriptors in QSAR and QSPR; Devillers, J., Balaban, A.T., Eds.; Gordon & Breach: Amsterdam, 1999; Chapter 2, pp. 21–57.
2. Bonchev, D. Information Theoretic Indices for Characterization of Chemical Structure; Research Studies Press – Wiley: Chichester, UK, 1983.
3. Kier, L.B.; Hall, L.H. Molecular Connectivity in Structure-Activity Analysis; Research Studies Press: Letchworth, 1986.
4. Schwenk, A.J. Computing the characteristic polynomial of a graph. In Graphs and Combinatorics; Bari, R., Harary, F., Eds.; Springer: Berlin, 1974; pp. 153–172.
5. Diudea, M.V.; Gutman, I.; Jäntschi, L. Molecular Topology; Nova Science: Huntington, New York, 2002; Chapter 3, pp. 53–100.
6. Crum-Brown, A.; Fraser, T.R. On the connection between chemical constitution and physiological action. Part 1. On the physiological action of the salts of the ammonium bases, derived from Strychnia, Brucia, Thebia, Codeia, Morphia, and Nicotia. T. Roy. Soc. Edin. 1868, 25, 151–203.
7. Wiener, H. Structural Determination of Paraffin Boiling Points. J. Am. Chem. Soc. 1947, 69(1), 17–20.
8. Wang, H.; Yu, G. All but 49 numbers are Wiener indices of trees. Acta Appl. Math. 2006, 92(1), 15–20.
9. Hosoya, H. Topological index, a newly proposed quantity characterizing the topological nature of structural isomers of saturated hydrocarbons. Bull. Chem. Soc. Jpn. 1971, 44, 2332–2339.
10. Hosoya, H.; Kawasaki, K.; Mizutani, K. Topological index and thermodynamic properties. I. Empirical rules on the boiling points of saturated hydrocarbons. Bull. Chem. Soc. Jpn. 1972, 45, 3415–3421.
11. Gutman, I.; Trinajstić, N. Graph theory and molecular orbitals. Total π-electron energy of alternant hydrocarbons. Chem. Phys. Lett. 1972, 17, 535–538.
12. Nikolić, S.; Kovacević, G.; Milicević, A.; Trinajstić, N. The Zagreb indices 30 years after. Croat. Chem. Acta 2003, 76, 113–124.
13. Randić, M. Wiener-Hosoya index - A novel graph theoretical molecular descriptor. J. Chem. Inf. Comput. Sci. 2004, 44, 373–377.
14. Westerberg, T.M.; Dawson, K.J.; McLaughlin, K.W. The Hosoya Index, Lucas Numbers, and QSPR. Endeavor 2005, 1, 1–15.
15. Randić, M. On characterization of molecular branching. J. Am. Chem. Soc. 1975, 97, 6609–6615.
16. Taherpour, A.; Shafiei, F. The structural relationship between Randić indices, adjacency matrixes, distance matrixes and maximum wave length of linear simple conjugated polyene compounds. J. Mol. Struct. THEOCHEM 2005, 726, 183–188.
17. Narumi, H. New Topological Indices for Finite and Infinite Systems. MATCH Commun. Math. Comput. Chem. 1987, 22, 195–207.
18. Tomovic, Z.; Gutman, I. Narumi-Katayama index of phenylenes. J. Serb. Chem. Soc. 2001, 66, 243–247.
19. Pogliani, L. Modeling with Special Descriptors Derived from a Medium-Sized Set of Connectivity Indices. J. Phys. Chem. 1996, 100, 18065–18077.
20. Schultz, H.P. Topological organic chemistry. 1. Graph theory and topological indices of alkanes. J. Chem. Inf. Comput. Sci. 1989, 29, 227–228.
21. Gutman, I. Selected properties of the Schultz molecular topological index. J. Chem. Inf. Comput. Sci. 1994, 34, 1037–1039.
22. Plavšić, D.; Nikolić, S.; Trinajstić, N.; Mihalić, Z. On the Harary index for the characterization of chemical graphs. J. Math. Chem. 1993, 12, 235–250.
23. Jäntschi, L.; Katona, G.; Diudea, M.V. Modeling Molecular Properties by Cluj Indices. MATCH Commun. Math. Comput. Chem. 2000, 41, 151–188.
24. Balaban, A.T. Highly discriminating distance-based topological index. Chem. Phys. Lett. 1982, 89, 399–404.
25. Ren, B. A New Topological Index for QSPR of Alkanes. J. Chem. Inf. Comput. Sci. 1999, 39, 139–143.
26. Hosoya, H.; Gotoh, M.; Murakami, M.; Ikeda, S. Topological Index and Thermodynamic Properties. 5. How Can We Explain the Topological Dependency of Thermodynamic Properties of Alkanes with the Topology of Graphs? J. Chem. Inf. Comput. Sci. 1999, 39, 192–196.
27. Gao, Y.D.; Hosoya, H. Topological Index and Thermodynamic Properties. IV. Size Dependency of the Structure-Activity Correlation of Alkanes. Bull. Chem. Soc. Jpn. 1988, 61, 3093–3102.
28. Narumi, H.; Hosoya, H. Topological Index and Thermodynamic Properties. III. Classification of Various Topological Aspects of Properties of Acyclic Saturated Hydrocarbons. Bull. Chem. Soc. Jpn. 1985, 58, 1778–1786.
29. Narumi, H.; Hosoya, H. Topological Index and Thermodynamic Properties. II. Analysis of the Topological Factors on the Absolute Entropy of Acyclic Saturated Hydrocarbons. Bull. Chem. Soc. Jpn. 1980, 53, 1228–1237.
30. Hosoya, H.; Kawasaki, K.; Mizutani, K. Topological Index and Thermodynamic Properties. I. Empirical Rules on the Boiling Point of Saturated Hydrocarbons. Bull. Chem. Soc. Jpn. 1972, 45, 3415–3421.
31. Mekenyan, O.; Bonchev, D.; Trinajstić, N. Chemical graph theory: Modeling the thermodynamic properties of molecules. Int. J. Quantum Chem. 2004, 18, 369–380.
32. Yaws, C.L.; Yang, H.C. Thermodynamic and Physical Property Data; Yaws, C.L., Ed.; Gulf Publishing Company: Houston, TX, USA, 1992; pp. 181–206.
33. Hawkins, D.M. The Problem of Overfitting. J. Chem. Inf. Comput. Sci. 2004, 44, 1–12.
