Article

Computer-Aided Methods for Molecular Classification

by Alina Bărbulescu 1, Lucica Barbeș 2,* and Cristian Ștefan Dumitriu 3,*
1 Department of Civil Engineering, Transilvania University of Brașov, 5 Turnului Street, 900152 Brașov, Romania
2 Department of Chemistry and Chemical Engineering, Faculty of Applied Sciences and Engineering, Ovidius University of Constanța, 124 Mamaia Bd., 900527 Constanta, Romania
3 SC Utilnavorep SA, 55 Aurel Vlaicu Av., 900055 Constanta, Romania
* Authors to whom correspondence should be addressed.
Submission received: 31 March 2022 / Revised: 29 April 2022 / Accepted: 2 May 2022 / Published: 4 May 2022
(This article belongs to the Special Issue Numerical Analysis and Scientific Computing II)

Abstract

The study aims to analyze the degree of similarity of some molecules belonging to two subgroups of Aminoalkylindoles. After extracting the molecules’ characteristics using Cheminformatics methods, and the computation of the Tanimoto coefficients, dendrograms and heatmaps were built to reveal the degree of similarity of the analyzed drugs. Some atom-pair similarities between the molecules in the same group were detected. The clusters determined by the k-means method divided the Benzoylindoles into two subgroups but kept all the Phenylacetylindoles together in the same set. The activity spectrum of the elements in each group was also analyzed, and similarities have been emphasized. The clustering has been validated using the Kruskal–Wallis test on the series of computed probabilities of the main effects.
MSC:
92E10; 92C99; 65D99

1. Introduction

The consumption of drugs or psychotropic substances continues to be one of the leading causes of global health problems and mortality among young people and adults [1]. In Europe, the number of drug users has risen alarmingly over the last 10 years, especially among young people in the 14–18 age group [2]. Drug use and addiction produce adverse effects that are emotional (depression, anxiety, or suicide), behavioral (especially aggression), health-related (e.g., hepatitis B and C), educational (profound impairment of long- and short-term memory), and neurological (brain contraction leading to impaired thinking, perception, and intuition, with severe impairment of the central nervous system); they also cause road accidents [3].
Drugs are marketed as “party pills”, “legal highs”, “herbal highs”, “bath salts”, “laboratory reagents”, “designer drugs”, “research chemicals”, or new psychoactive substances (NPS). They represent a real challenge for public health because of their variety and multiplication speed [4,5].
The United Nations Office on Drugs and Crime (UNODC) [6] uses the term “new psychoactive substances” (NPSs) for “substances of abuse, in pure form or in the form of preparations, which are not controlled by the Single Convention on Narcotic Drugs or by the United Nations Convention”. NPSs include recent drugs as well as other substances that have been on the market since the 1960s and are challenging to manage. According to the World Drug Report (2019), prepared by UNODC, approximately 271,000,000 people aged 15–64 have used drugs at least once, representing 5.5% of the world’s population. In other words, 1 in 18 people uses drugs, and from 2009 to 2017 there was an alarming increase in drug use (about 30% worldwide) [6].
Psychoactive substances belong to different classes of chemical compounds; from a scientific point of view, classification by composition is the most rigorous criterion. Drugs are classified by source, legal or medical status, chemical structure, and psychoactive effect [7]. Because the psychoactive compounds on the NPS list change constantly in response to control measures that differ across legislations worldwide [8], the number of possible combinations is huge, which calls for a simplified classification obtained using Cheminformatics [9].
In Romania, the following substances with psychoactive potential have been identified and are under national control: synthetic cannabinoids (SCs), amphetamines, barbiturates, cocaine analogs, benzodiazepines, synthetic cathinones, phenethylamines, piperazines, and tryptamines. SCs (also known as cannabimimetics or synthetic cannabinoid receptor agonists) are substances similar to Δ9-tetrahydrocannabinol (Δ9-THC), the active ingredient specific to cannabis, whose intoxication is slow, affecting perception, reflexes, and body coordination [10].
SCs and designer drugs were created to analyze different receptors and neurotransmitters to find other alternatives to traditional medicine [7]. SCs are sold to be smoked in e-cigarettes in a liquid form, known as “herbal liquid” [1] or “spice-like” herbal mixtures [11]. Some SC derivatives (e.g., JWH series) are well-known and commercialized in many European countries [12].
For many years, scientists have aimed to quickly identify and establish the correlations between drug composition and consumption results and the possible ways to cure addiction and overdose [13,14,15].
SCs are structurally complex compounds with a high binding affinity and efficacy at the CB1 and CB2 receptors [16,17]. In general, a drug must be metabolized in a specific way to result in an appropriate chemical structure that matches these receptors [18].
SCs can be grouped into the following categories: (a) Classical cannabinoids (with a structural relationship with Δ9-THC); (b) Non-classical cannabinoids; (c) Hybrid-forms (different combinations of classical and non-classical cannabinoids); (d) Aminoalkylindoles (AIs) or cannabinoid receptor agonists (with five structural chemical groups: Benzoylindoles, Phenylacetylindoles, Naphthylmethylindoles, Naphthoylindoles, and cannabinergic compounds); (e) Eicosanoids (endocannabinoids) [13].
AIs represent the largest group of SCs that can create derivative compounds by adding different substituents, such as alkyl, alkoxy, halogen, etc., to the aromatic ring systems, among other relatively simple alterations. The structure of the Aminoalkylindoles group with the first four subgroups is presented in Figure 1.
In drug discovery, virtual screening (VS) has become a powerful computational approach for screening libraries of molecules to find those with desired characteristics, which are then subjected to laboratory tests. VS is intended to speed up the discovery of candidates and to reduce the number of those that must be tested experimentally. Its main advantage is the reduction of resources, cost, labor, and time.
The quantitative structure–activity relationship (QSAR) is one of the most powerful approaches to VS due to its excellent hit rate and fast throughput. After collecting the relevant data, QSAR computes chemical descriptors at different levels of the molecular structure representation to determine the similarities/dissimilarities of the investigated structures [19]. Here, we use QSAR for precisely this purpose: to emphasize the similarities/dissimilarities of the studied molecules.
QSAR relies on the hypothesis that the chemical structure is responsible for the activity, so similar molecules are expected to have similar properties [20]. Still, activity cliffs (ACs) can be noticed. ACs represent groups of molecules that have similar structures and are active against the same target but exhibit high differences in potency. Since ACs capture chemical modifications that strongly influence biological activity, they are of particular interest in QSAR analysis [21].
Fingerprints are representations of specific molecular structures and may represent a structural key within a molecule; for example, computed properties of a molecule (LogP, Polar Surface area, Hydrogen Bond donor). Being more abstract than a structural key, fingerprints are more general because they do not represent pre-defined patterns [22]. They encode various descriptors of the molecular structure [23].
In recent years, different artificial intelligence approaches have been used for data analysis in various domains. Cheminformatics is a tool used to examine statistical data related to chemical structures. It has an essential role in accumulating, grouping, and analyzing chemical data. It is successfully used for determining new entities that are the base of other structures utilized to construct active molecules [24].
Utilizing an in silico method, one can predict pharmacokinetic parameters [25]. It has been shown [26,27] that each computational procedure employed in drug discovery has advantages and disadvantages. The rcdk, ChemmineR, and rpubchem packages of R or RDKit in Python (www.rdkit.org, accessed on 15 May 2021) are powerful tools in Cheminformatics [28,29,30,31,32,33,34], helping scientists to group the information efficiently. The Chemistry Development Kit (CDK) (https://cdk.github.io/, accessed on 15 May 2021) has also been employed for the prediction of organic reactions, bioactivities of compounds, or finding the maximally bridging rings in chemical structures [35,36,37,38].
This research was carried out using the R software and its specific packages for characterizing 14 cannabinoids belonging to the Benzoylindoles and Phenylacetylindoles [39] and detecting similarities between them. Hierarchical clustering and the k-means algorithm were used to group the drugs according to the computed descriptors. The activity spectrum of the elements in each group has also been analyzed, and similarities emphasized. The results validate the grouping of the molecules in clusters.

2. Materials and Methods

Data on which the study relies have been retrieved as .sdf files from PubChem [40]. The molecules belong to the Benzoylindole and Phenylacetylindole subgroups of the Aminoalkylindoles class.
Figure 2 contains the study flowchart. After importing the molecules (step 1), their structures are drawn (step 2). The molecular formula (MF) and weights (MW), number and types of atoms, and functional groups are determined (step 3). The descriptors computed at the fourth stage, using the ChemmineOB package, are the Hydrogen Bond Acceptors (HBA1, HBA2) and Donors (HBD), log P, the molar refractivity, and topological polar surface area (TPSA) [39]. The reader may refer to [41,42,43,44,45,46,47] for details on these descriptors. The descriptors are utilized to group the molecules into clusters.
At the fifth stage, the atom-pairs (AP) are determined with the help of ChemmineR. AP is formed by a pair of atoms and the shortest bond path length from one to the other [48,49].
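As an illustration of the atom-pair idea, the sketch below enumerates pairs of atoms together with their shortest bond-path length, using a breadth-first search on a small molecular graph. It is a simplified Python analogue, not the ChemmineR implementation: each atom is described only by its element symbol, and the input format (a list of element symbols plus a list of bonded index pairs) and the function names are assumptions made for this example.

```python
from collections import Counter, deque


def shortest_path_lengths(adjacency, start):
    """Breadth-first search: bond-path length from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def atom_pairs(elements, bonds):
    """Count simplified atom pairs (element, element, shortest path length).

    elements: element symbols indexed by atom number (hypothetical input format).
    bonds: (i, j) index pairs of bonded atoms.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    pairs = Counter()
    for i in range(len(elements)):
        for j, d in shortest_path_lengths(adjacency, i).items():
            if j > i:  # count each unordered pair of atoms once
                a, b = sorted((elements[i], elements[j]))
                pairs[(a, b, d)] += 1
    return pairs


# Heavy-atom skeleton of ethanol, C-C-O (hydrogens omitted for brevity)
ethanol = atom_pairs(["C", "C", "O"], [(0, 1), (1, 2)])
```

The resulting multiset of atom pairs is the kind of fragment collection on which the similarity coefficients are then computed.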
Computation of the compounds’ similarity provides the sizes of the query and target molecules, the Tanimoto [49,50] and overlap coefficients, indicating the degree of overlapping of the pair of molecules (step 6).
The first form of the Tanimoto coefficient is:
$$S_{A,B} = \frac{\sum_{i=1}^{m} n_{A,i}\, n_{B,i}}{\sum_{i=1}^{n} n_{A,i}^{2} + \sum_{i=1}^{n} n_{B,i}^{2} - \sum_{i=1}^{m} n_{A,i}\, n_{B,i}} \tag{1}$$
and the second one is:
$$S_{A,B} = \frac{\sum_{i=1}^{m} \min(n_{A,i}, n_{B,i})}{\sum_{i=1}^{n} n_{A,i} + \sum_{i=1}^{n} n_{B,i} - \sum_{i=1}^{m} \min(n_{A,i}, n_{B,i})}, \tag{2}$$
$n_{A,i}$ ($n_{B,i}$) being the number of occurrences of the $i$th fragment in A (B).
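The two count-based forms of the Tanimoto coefficient can be sketched as follows. This is a minimal Python illustration, assuming that each molecule is represented by a dictionary mapping a fragment to its number of occurrences; the function names and the convention of returning 1.0 for two empty molecules are choices made for this example only.

```python
from collections import Counter


def tanimoto_product(counts_a, counts_b):
    """First form: sum of products over (sum of squares minus sum of products)."""
    common = set(counts_a) & set(counts_b)
    dot = sum(counts_a[f] * counts_b[f] for f in common)
    squares_a = sum(n * n for n in counts_a.values())
    squares_b = sum(n * n for n in counts_b.values())
    denominator = squares_a + squares_b - dot
    return dot / denominator if denominator else 1.0  # convention for empty inputs


def tanimoto_min(counts_a, counts_b):
    """Second form: sum of minima over (total counts minus sum of minima)."""
    common = set(counts_a) & set(counts_b)
    shared = sum(min(counts_a[f], counts_b[f]) for f in common)
    denominator = sum(counts_a.values()) + sum(counts_b.values()) - shared
    return shared / denominator if denominator else 1.0  # convention for empty inputs


mol_a = Counter({"x": 2, "y": 1})  # fragment "x" occurs twice, "y" once
mol_b = Counter({"x": 1, "z": 3})
s_product = tanimoto_product(mol_a, mol_b)  # 2 / (5 + 10 - 2)
s_min = tanimoto_min(mol_a, mol_b)          # 1 / (3 + 4 - 1)
```

When every count is 0 or 1, both functions return the same value, which is the number of common fragments divided by the number of distinct fragments in the two molecules.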
If one is interested only in the absence/existence of unique fragments, both approaches lead to the binary form [48,49]:
$$S_{A,B} = \frac{c}{a + b + c}, \tag{3}$$
where a (b) is the number of fragments contained only by A (B), and c is the number of fragments common to A and B.
Formula (3) is used in our study, together with the corresponding distance:
$$D_{A,B} = (a + b)^{1/2}. \tag{4}$$
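For readers who prefer code, the binary Tanimoto coefficient and the corresponding distance can be computed from two sets of fragments as in the following Python sketch. The function name and the set-based representation are assumptions of this example; a, b, and c have the meaning given in the text.

```python
import math


def binary_tanimoto(fragments_a, fragments_b):
    """Return (similarity, distance) for two collections of unique fragments."""
    set_a, set_b = set(fragments_a), set(fragments_b)
    c = len(set_a & set_b)  # fragments common to A and B
    a = len(set_a - set_b)  # fragments contained only by A
    b = len(set_b - set_a)  # fragments contained only by B
    total = a + b + c
    similarity = c / total if total else 1.0  # convention for two empty sets
    distance = math.sqrt(a + b)
    return similarity, distance


similarity, distance = binary_tanimoto({1, 2, 3, 4}, {3, 4, 5})
# c = 2, a = 2, b = 1, so similarity = 2/5 and distance = sqrt(3)
```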
Generally, given two structures, A and B, the overlap coefficient is computed by:
$$c_{op} = \frac{|A \cap B|}{\min\{|A|, |B|\}}, \tag{5}$$
where $|A|$ and $|B|$ are, respectively, the numbers of elements of A and B, and $A \cap B$ is the intersection of A and B.
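A minimal Python sketch of the overlap coefficient is given below, again assuming that the structures are represented as sets of fragments. Returning 0.0 when one of the sets is empty is a convention chosen for this example.

```python
def overlap_coefficient(fragments_a, fragments_b):
    """Size of the intersection divided by the size of the smaller set."""
    set_a, set_b = set(fragments_a), set(fragments_b)
    smaller = min(len(set_a), len(set_b))
    if smaller == 0:
        return 0.0  # convention chosen here for an empty structure
    return len(set_a & set_b) / smaller


c_op = overlap_coefficient({1, 2, 3, 4}, {3, 4, 5})  # 2 common fragments / 3
c_subset = overlap_coefficient({1, 2, 3, 4}, {3, 4})  # smaller set fully contained
```

The coefficient reaches 1 whenever the smaller structure is entirely contained in the larger one, so it is never lower than the binary Tanimoto coefficient of the same pair.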
To compute the Tanimoto index, the following fingerprint sections have been utilized: hierarchical element counts, rings in a canonic Extended Smallest Set of Smallest Rings (ESSSR) ring set, simple atom pairs, simple atom nearest neighbors, detailed atom neighborhoods, and simple SMARTS patterns.
The seventh stage aimed to group the molecules using binning [50,51], the Jarvis–Patrick procedure [52], and hierarchical clustering. The Ward 2 algorithm [53,54] has been chosen for hierarchical clustering because it minimizes the variance inside the clusters. The k-means algorithm has also been run for clustering the molecules.
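To make the nearest-neighbor idea behind the Jarvis–Patrick procedure concrete, the following Python sketch implements one common formulation: two items are linked when each appears in the other's list of nearest neighbors and the two lists share at least a given number of entries, and clusters are the connected groups of linked items. The function name, the parameter names, and the tie handling are assumptions of this example, and the ChemmineR implementation may differ in such details.

```python
def jarvis_patrick(distance, n_neighbors, min_shared):
    """Cluster items given a square, symmetric distance matrix (list of lists)."""
    n = len(distance)
    neighbors = []
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: (distance[i][j], j))  # ties broken by index
        neighbors.append(set(others[:n_neighbors]))

    parent = list(range(n))  # union-find forest used to merge linked items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            mutual = i in neighbors[j] and j in neighbors[i]
            shared = len(neighbors[i] & neighbors[j])
            if mutual and shared >= min_shared:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data: six points on a line forming two well-separated triples
positions = [0, 1, 2, 10, 11, 12]
matrix = [[abs(p - q) for q in positions] for p in positions]
clusters = jarvis_patrick(matrix, n_neighbors=2, min_shared=1)
```

In such a formulation, the number of clusters depends on both the length of the neighbor lists and the required number of shared neighbors.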
The last step was to predict the biological activities spectrum, reflecting the substance’s interaction effects with physical entities [55]. For this aim, the algorithm proposed by Lagunin et al. [55], implemented in PASS [56], has been used. It computes the probability of each activity based on the structure descriptors. It returns a table that contains the biological activities and the corresponding probabilities (the likelihood of activity to exist ($P_{aj}$) or not ($P_{ij}$)).
When Pa is greater than 0.7, the probability that the substance has the specified activity in experimental conditions and is analogous to a pharmaceutical substance already studied is high.
When Pa is between 0.5 and 0.7, the substance may present the specified activity in experimental conditions. However, the substance is different from the substance already studied.
When Pa is less than 0.5, the probability that the substance has the specified activity is low. In the case when this activity is experimentally observed, it might be a new chemical entity [56].
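The three interpretation ranges for Pa can be summarized by a small helper, sketched below in Python. The function name and the returned labels are hypothetical, and assigning the boundary values 0.5 and 0.7 to the middle range is an assumption, since the text does not specify how the boundaries are treated.

```python
def interpret_pa(pa):
    """Map a PASS probability of activity to one of the three ranges in the text."""
    if not 0.0 <= pa <= 1.0:
        raise ValueError("Pa must be a probability between 0 and 1")
    if pa > 0.7:
        return "high"      # activity likely; analogous to a studied substance
    if pa >= 0.5:          # boundary handling is an assumption of this sketch
        return "moderate"  # activity possible; differs from studied substances
    return "low"           # activity unlikely; possibly a new chemical entity


labels = [interpret_pa(p) for p in (0.775, 0.6, 0.448)]
```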
The most important activities (those with probabilities greater than 0.5) exhibited by each molecule in the groups have been selected, and a table containing these probabilities and those of the corresponding activities for all the molecules in a group has been built. If a molecule does not have a certain activity, the assigned probability is zero. Using these newly built series, the Kruskal–Wallis test [57] has been performed to test the null hypothesis (H0) that the series come from the same distribution, at a significance level of 0.05. The same test has been performed for the series issued from both groups together. These tests will confirm or reject the clustering from step 7. If the null hypothesis was rejected, the test was performed for sub-groups to determine where the difference is.
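The Kruskal–Wallis statistic can be computed by hand as in the Python sketch below, which ranks the pooled observations, applies the usual correction for ties, and obtains the p-value from the chi-square approximation with k − 1 degrees of freedom (k being the number of series). It is a from-scratch illustration using only the standard library, not the code used in the study; all function names are assumptions of this example.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties share their mean rank) and the tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        tie_sizes.append(end - start + 1)
        start = end + 1
    return ranks, tie_sizes


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * term
        term *= half / (i + 0.5)
    return total


def kruskal_wallis(*groups):
    """Return (H, p-value) for two or more series of observations."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    h, pos = 0.0, 0
    for g in groups:
        h += sum(ranks[pos:pos + len(g)]) ** 2 / len(g)
        pos += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction  # tie correction; equals 1 when there are no ties
    return h, chi2_sf(h, len(groups) - 1)


h_stat, p_value = kruskal_wallis([1, 2, 3], [4, 5, 6], [7, 8, 9])
reject_h0 = p_value < 0.05  # significance level used in the study
```

Because the series built from the activity spectra contain zeros for absent activities, many observations are tied, so the tie correction matters in this setting.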

3. Results and Discussion

The structures of the molecules from the Benzoylindole and Phenylacetylindole groups (Group 1 and 2, respectively) are represented in Figure 3. They are accompanied by the CID (compound ID) in PubChem. The CID, MF, and MW, the atoms’ species and functional groups (present in at least one molecule), and their numbers, retrieved using ChemmineR, are presented in Table 1 and Table 2.
The molecular weights in Group 1 are between 307.3862 (for C20H21NO2) and 458.3353 (for C22H23IN2O). Only one molecule contains F and Cl, and three contain I. The molecular weights in Group 2 are between 335.4394 (for C22H25NO2) and 376.4913 (for C24H28N2O2). No molecule contains F, and one contains I. All contain Cl.
Rings, most of them aromatic, are present in all structures of the studied molecules.
The computed descriptors are given in Table 3. The values of HBA1 are lower for the first group than for the second, and HBD is absent for both groups. logP is generally lower for Benzoylindoles (the highest value is 5.8860) than for Phenylacetylindoles (the highest value is 6.0457). The molecule with CID 53394099 has the highest hydrophilicity. TPSA varies over a wider range for Group 1 (22.00 to 43.70) than for Group 2 (22.00 to 34.47). The higher the TPSA, the lower the drug transport.
Table 4 displays the values of the Tanimoto coefficients, indicating the similarities of the atoms belonging to pairs of structures. The highest values were computed for the couples (9889172, 117587582) and (57507911, 57507905) (with the coefficients 0.8497 and 0.8462, respectively) in Group 1, and (44397540, 11616723) and (44397500, 11616723) in Group 2 (with the coefficients 0.8526 and 0.8467, respectively).
Table 5 shows the similarities of pairs of atoms belonging to pairs of molecules from the Benzoylindoles and Phenylacetylindoles. The highest value of the Tanimoto coefficient corresponds to (9889172, 44397500). It is larger than that corresponding to the couple (57507905, 56463), whose molecules both belong to the first group.
The similarities of the molecules’ couples, one belonging to Benzoylindoles and the other to Phenylacetylindoles, indicated by the Tanimoto coefficient, are shown in Table 6. The rank of the similarity is given in brackets. Minus (−) signifies that the similarity rank is higher than eight. The molecule with the ID 53394099, absent from the table, has a similarity rank higher than eight, along with all the molecules in the first group.
After the similarity analysis, the molecules were grouped in clusters using different algorithms. For Group 1, the binning provided various numbers of clusters (1, 2, 7), depending on the cutoff. The Jarvis–Patrick algorithm provided two clusters when four neighbors were taken into account and one cluster when five or six were. The elbow method (Figure 4) selected the number of clusters (three) in the k-means algorithm.
Running the mentioned algorithm, we found two clusters with three elements and one with one—ID 56463. Similar results were found for the second group.
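The k-means iteration and the quantity inspected by the elbow method (the total within-cluster sum of squares for each candidate k) can be sketched in Python as follows. This is an illustrative implementation with explicitly supplied initial centroids and toy two-dimensional data; the function names and the data are assumptions of this example and do not reproduce the descriptors of the studied molecules.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids, within-cluster sum of squares)."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    wss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wss


# Toy data: two compact groups of three points each
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = {
    1: [points[0]],
    2: [points[0], points[3]],
    3: [points[0], points[3], points[5]],
}
wss_by_k = {k: k_means(points, init)[2] for k, init in initial.items()}
labels_k2 = k_means(points, initial[2])[0]
```

In this toy example, the within-cluster sum of squares drops sharply from k = 1 to k = 2 and only slightly afterwards, which is the kind of bend the elbow method looks for.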
Figure 5 contains the results of the hierarchical clustering for both groups. In the heatmaps, the darker the color the higher the similarity of the compounds is. The squares in dark blue are associated with the similarity 1—meaning a compound with itself. The dendrograms indicate the similarity strength. The higher the branch between the two compounds is, the lower the similarity.
Figure 5a shows that the most similar molecules in Group 1 are those in the couples (9889172, 117587582) and (56841530, 57507911), for which the Tanimoto coefficient is equal to 0.900 and the overlap coefficient is 0.9583. The second highest similarity is between (10226340, 117587582) and (10226340, 9889172). Both have an overlap coefficient of 0.9583 and a Tanimoto coefficient equal to 0.8519.
Figure 5c shows that the highest similarity is that of the pairs (11616723, 44397500), (44397500, 44397641), and (11616723, 44397641). The corresponding Tanimoto and the overlap coefficients of the last two pairs are both 0.8846. The overlap coefficient (Tanimoto) of the first couple is 0.9011 (0.9152).
Figure 5b,d show that the distances between the elements in the Phenylacetylindoles group are smaller than those between the molecules in the Benzoylindoles group, in concordance with the results on the molecules’ similarity. It is also emphasized on the scale from Figure 6, where the branches of the molecules in Group 1 are in black.
An analogous approach was followed for the 14 molecules, regardless of their group membership. The heatmap and the dendrogram are shown in Figure 6. It follows that molecule 56463 presents the highest dissimilarities to the others. The most similar Benzoylindoles are those with CIDs 9889172, 117587582 and those with CIDs 56841530, 57507905. Among the Phenylacetylindoles, the highest similarities are those of the molecules with the second, third, and fourth CIDs in Table 6.
Figure 7 displays the clusters determined by the k-means algorithm, with k = 3.
The Phenylacetylindoles belong to the first cluster, confirming the previous findings. Benzoylindoles with CIDs 9889172, 117587582, and 10226340 belong to one cluster, whereas the other Benzoylindoles belong to another. Note the agreement of this classification with Figure 5b. At first sight, this disagrees with the IUPAC classification of the molecules into only two classes: Benzoylindoles and Phenylacetylindoles. This is not the case, because the dendrogram (Figure 6) provides a classification based on the distances between the molecules, putting together the dendrograms in Figure 5d and Figure 6 and taking into account the branches’ lengths, indicated under the dendrograms.
Moreover, the clustering provided in Figure 7 confirms the homogeneity of the elements in Group 2. The existence of two different clusters for the elements in Group 1 results from applying the k-means algorithm with k = 3.
Table 7 presents the positive and negative effects of the molecules in Group 1 with probabilities of occurrence greater than or equal to 0.5.
The effects that appear with probabilities between 0.3 and 0.5 are presented in Table S1 (Supplementary material) for the molecules in Group 1. The molecule 10226340 is likely to act as a Nicotinic alpha4beta4 receptor agonist and Analgesic (Pa = 0.775, and Pa = 0.731, respectively), the molecules 57507911, 56841530, and 57507905 are likely to act as an Antineurotic and Gluconate 2-dehydrogenase (acceptor) inhibitor, and the molecule 56463 is expected to act as an Antineurotic. The molecules 9889172 (and 10226340) have a Pa > 0.5 (0.301) associated with the Antineurogenic pain and Nicotinic alpha4beta2 receptor antagonist effects. Pa is greater than 0.5 for the Lymphocytopoiesis inhibitor effect for the molecule 117587582, and 0.5 > Pa > 0.3 for the same effect for 9889172 and 10226340. The molecule 10226340 has an analgesic effect whose Pa = 0.731, whereas the same effect has an associated probability of 0.448 (0.336) for the molecule 9889172 (117587582).
Effects such as Analgesic, Antineoplastic alkaloid, Glycosylphosphatidylinositol phospholipase D inhibitor, Peptide agonist, Thromboxane B2 antagonist, and NADPH- cytochrome-c2 reductase inhibitor have probabilities between 0.3 and 0.5 for the molecules 9889172 and 117587582. Given that the molecules 9889172, 117587582, and 10226340 are in the same cluster, the presence of these activities confirmed in experiments for one of the molecules may indicate the same effect for the other molecules.
The adverse effects of 9889172 are not well known. Based on current knowledge, only 10 such effects have been identified, such as Photoallergy dermatitis, Allergic contact dermatitis (with 0.7 > Pa > 0.5), Cyanosis, Nail discoloration, and Torsades de pointes (0.5 > Pa > 0.3). These effects are noted with probabilities between 0.3 and 0.5 for at least one molecule in the same cluster. Effects related to the postural position damage and respiratory issues have probabilities less than 0.5 for the molecules 117587582 and 10226340. The confirmation by experiments of such effects for one of the molecules in the first cluster will represent a warning for using the other two molecules in Cluster 1.
The analysis of the effects of the molecules in the second cluster emphasizes a high concordance between their effects. All the positive effects are common. Some of them, which do not appear in Table 8, appear in Table S1 from the Supplementary Materials, with probabilities close to 0.5; for example, Saccharopepsin inhibitor, Chymosin inhibitor, and Acrocylindropepsin inhibitor, with Pa = 0.491, Amine dehydrogenase inhibitor with Pa = 0.49 for molecule 56841530, or Gastrin inhibitor (Pa = 0.484) for 57507905.
The adverse effects listed for the molecule 56841530 (with Pa ≥ 0.5) are shared with those listed for 57507911 and 57507905. Fibrosis interstitial, Delirium, Dystonia, Dysphoria, Hematuria, Hypothermic, Cyanosis, and Conjunctivitis are common to 57507911 and 57507905, with Pa > 0.5, and appear for 56841530 (Table S1, Supplementary material) with probabilities between 0.424 and 0.468.
The molecule 56463 has Antineurotic (Pa > 0.7), Gluconate 2-dehydrogenase (acceptor) inhibitor, Calcium channel (voltage-sensitive) activator, Aspulvinone dimethylallyl transferase inhibitor, and Gastrin inhibitor (Pa > 0.54) positive effects, and Twitching, Hepatitis, Dystonia, and Nephritis adverse effects (Pa > 0.512). These effects are common to the other molecules in Group 2, with probabilities over 0.5. Still, there are common effects with the other molecules in Group 2, with smaller probabilities, which explain the presence of 56463 in a separate cluster in Figure 7.
An analogous procedure has been applied to the Phenylacetylindoles. Table 8 contains the positive and negative effects of the molecules in this group, with probabilities of occurrence greater than or equal to 0.5. All molecules but 53494930 have antineurotic effects, all but 11616723 are Gluconate 2-dehydrogenase (acceptor) inhibitors, and most of them act as Taurine dehydrogenase inhibitors, Thromboxane B2 antagonists, and antiallergics, with probabilities greater than 0.5. Some molecules have the same properties with probabilities between 0.3 and 0.5. For example, the molecules are Chlordecone reductase inhibitors, but the probability for 53494930 is 0.434 (Table S2 in Supplementary material).
The main negative effects with probabilities above 0.5 for almost all Phenylacetylindoles are shivering, twitching, sweating, and hypothermic. Still, the following probabilities have been computed: 0.478—44397500, 0.417—44397540, 0.439—53494950, 0.499—53394099. For CID 11616723, the probability to act as hypothermic is 0.423. For 53394099, the following probabilities have been computed: 0.387—shivering, 0.447—twitching, 0.286—hypothermic. The effects that appear with probabilities between 0.3 and 0.5 are presented in Table S2 (Supplementary Materials).
To validate the clustering from Figure 7, the Kruskal–Wallis test has been performed to the series of probabilities corresponding to the most significant effects of the elements in both groups (presented in Table 9 and Table 10).
The main effects were considered to be those whose probabilities are higher than 0.5 for at least one molecule. If another molecule has the same effect, the associated probability is entered in Table 9 or Table 10, depending on the group to which it belongs. If the molecule does not have a certain effect, the probability entered in the table is zero.
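The construction of the series can be sketched in Python as follows. The input format (a dictionary of predicted activities per molecule) and the function name are assumptions of this example. The sample values are the Pa values quoted above for three molecules of Group 1; the dictionaries are deliberately incomplete, so the zero entries only demonstrate the filling rule and are not statements about the actual tables.

```python
def build_series(predictions, threshold=0.5):
    """Build one probability series per molecule over the main effects.

    predictions: {molecule_id: {activity_name: Pa}} (hypothetical input format).
    A main effect has Pa above `threshold` for at least one molecule;
    a molecule lacking that effect receives the probability zero.
    """
    main_effects = sorted({
        activity
        for activities in predictions.values()
        for activity, pa in activities.items()
        if pa > threshold
    })
    series = {
        molecule: [activities.get(activity, 0.0) for activity in main_effects]
        for molecule, activities in predictions.items()
    }
    return main_effects, series


predictions = {
    "10226340": {"Analgesic": 0.731, "Nicotinic alpha4beta4 receptor agonist": 0.775},
    "9889172": {"Analgesic": 0.448},
    "117587582": {"Analgesic": 0.336},
}
main_effects, series = build_series(predictions)
```

Each resulting list is one of the series to which the Kruskal–Wallis test is applied.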
The p-value for the test performed with all 14 series (putting together the effects from Table 9 and Table 10) is 0.000, so the null hypothesis can be rejected, meaning that there are significant differences in the series distributions.
To distinguish the series resulting from the same distribution, the same test has been performed for the first three molecules in Group 1 (belonging to the second cluster), the last four molecules in Group 1 (belonging to the third cluster), and Group 2 (the first cluster), respectively. The corresponding p-values are 0.4362, 0.1128, and 0.1004, respectively. Since all are higher than 0.05, the series in each of the three clusters are not significantly different from the viewpoint of their positive effects.
Similar tests, performed for the negative effects, lead to the same results. So, the clustering is validated.

4. Conclusions

In this research, the authors applied Cheminformatics methods to analyze the Benzoylindole and Phenylacetylindole groups of drugs, complementing the existing knowledge [39] about them. Similarity indices and clustering techniques have emphasized the structural similarities and differences between these groups. The highest similarities exist between the molecules in the second group. These are emphasized by the second group’s dendrograms (the length of the highest branches being 0.3, the others being lower than 0.22). By comparison, the branches in the dendrogram of Group 1 are generally longer than those of Group 2.
The comparisons of the biological activities spectra show that the most similar activities of the molecules in the first group are those of 57507911, 56841530, and 57507905, confirming the grouping provided by the dendrogram (Figure 5b). Analogous conclusions can be drawn from the dendrogram for Group 2 (Figure 5d).
Performing the k-means algorithm for k = 3 results in three clusters, one containing all the molecules in Group 2, while the other two are formed by three and four molecules from Group 1. Performing the same analysis for k = 2 results in two clusters (superposed on Groups 1 and 2). Still, the best clustering is obtained for k = 3 because the ratio of the between-cluster sum of squares to the total sum of squares of the distances is 52.0%, compared to only 31.6% for k = 2.
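The ratio used to compare the two values of k can be computed from the data and the cluster labels as in the Python sketch below, which relies on the decomposition of the total sum of squares into a within-cluster and a between-cluster part. The function name and the toy data are assumptions of this example and are unrelated to the values reported above.

```python
def between_to_total_ratio(points, labels):
    """Between-cluster sum of squares divided by the total sum of squares."""

    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sum_of_squares(rows, center):
        return sum(sum((x - c) ** 2 for x, c in zip(row, center)) for row in rows)

    total_ss = sum_of_squares(points, mean(points))
    if total_ss == 0:
        raise ValueError("all points coincide; the ratio is undefined")
    within_ss = 0.0
    for label in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == label]
        within_ss += sum_of_squares(members, mean(members))
    return (total_ss - within_ss) / total_ss  # between_SS = total_SS - within_SS


toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
ratio_two_clusters = between_to_total_ratio(toy_points, [0, 0, 0, 1, 1, 1])
ratio_one_cluster = between_to_total_ratio(toy_points, [0] * 6)
```

A ratio closer to 1 means that the partition explains a larger share of the total variability, which is the criterion invoked for preferring k = 3 over k = 2.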
To validate the clustering results, the probabilities of the main effects of the activity spectra have been utilized to build series to which the Kruskal–Wallis test has been applied. The test results are in concordance with the grouping issued by the k-means algorithm.
Given that the activity spectra have been determined with certain probabilities, future experimental studies should confirm the findings related to particular actions of the molecules of interest and the clustering validation. Since this kind of experiment involves human subjects, it is challenging and time-consuming to conduct, given the necessary infrastructure, the protocols that must be defined and followed, and the approvals that must be obtained. Therefore, our study may be considered a first step in larger research on these two groups of drugs.

Supplementary Materials

The following supporting information can be downloaded at: https://0-www-mdpi-com.brum.beds.ac.uk/article/10.3390/math10091543/s1, Table S1: Positive and negative effects of the molecules in Group 1, and the probabilities of their activities; Table S2: Positive and negative effects of the molecules in Group 2, and the probabilities of their activities.

Author Contributions

Conceptualization, L.B. and C.Ș.D.; methodology, A.B.; software, A.B. and C.Ș.D.; validation, A.B., L.B. and C.Ș.D.; formal analysis, L.B.; investigation, A.B., L.B. and C.Ș.D.; resources, A.B.; data curation, L.B.; writing—original draft preparation, A.B., L.B. and C.Ș.D.; writing—review and editing, C.Ș.D.; supervision, A.B.; project administration, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data has been downloaded from PubChem: https://pubchem.ncbi.nlm.nih.gov (accessed on 15 May 2021).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hardon, A. Chemical Youth. In Critical Studies in Risk and Uncertainty; Palgrave Macmillan: Cham, Switzerland, 2021; pp. 81–111.
  2. EMCDDA. European Monitoring Centre for Drugs and Drug Addiction, Drug-Related Deaths and Mortality in Europe. 2019. Available online: https://www.emcdda.europa.eu/system/files/publications/11485/20193286_TD0319444ENN_PDF.pdf (accessed on 10 February 2022).
  3. O’Mahony Carey, S. Psychoactive Substances. A Guide to Ethnobotanical Plants and Herbs, Synthetic Chemicals, Compounds and Products, Health Service Executive South (Edition 1.1). Available online: http://lab.bnn.go.id/nps_alert_system/publikasi%20web/Psychoactive%20plant/Psychoactive_plant.pdf (accessed on 15 May 2021).
  4. EMCDDA. European Monitoring Centre for Drugs and Drug Addiction, Perspectives on Drugs Health Responses to New Psychoactive Substances. 2016. Available online: https://www.emcdda.europa.eu/system/files/publications/2933/NPS%20health%20responses_POD2016.pdf (accessed on 15 May 2021).
  5. EMCDDA. European Monitoring Centre for Drugs and Drug Addiction, Perspectives on Drugs Synthetic Cannabinoids in Europe. 2017. Available online: https://www.emcdda.europa.eu/system/files/publications/2753/POD_Synthetic%20cannabinoids0.pdf (accessed on 15 May 2021).
  6. UNODC. United Nations Office on Drugs and Crime, Global Overview of Drug Demand and Supply. 2019. Available online: https://wdr.unodc.org/wdr2019/prelaunch/WDR19_Booklet_2_DRUG_DEMAND.pdf (accessed on 10 February 2022).
  7. Shafi, A.; Berry, A.J.; Sumnall, H.; Wood, D.M.; Tracy, D.K. New psychoactive substances: A review and updates. Ther. Adv. Psychopharmacol. 2020, 10, e2045125320967197. [Google Scholar] [CrossRef] [PubMed]
  8. van Amsterdam, J.; Nutt, D.; van den Brink, W. Generic legislation of new psychoactive drugs. J. Psychopharmacol. 2013, 27, 317–324. [Google Scholar] [CrossRef] [PubMed]
  9. Elliott, L.; Haddock, C.K.; Campos, S.; Benoit, E. Polysubstance use patterns and novel synthetics: A cluster analysis from three U.S. cities. PLoS ONE 2019, 14, e0225273. [Google Scholar] [CrossRef] [PubMed]
  10. Vlădescu, C.; Scîntee, S.G.; Olsavszky, V.; Hernández-Quevedo, C.; Sagan, A. Romania: Health System Review. Health Syst. Trans. 2016, 18, 1–170. [Google Scholar]
  11. Ernst, L.; Langer, N.; Bockelmann, A.; Salkhordeh, E.; Beuerle, T. Identification and quantification of synthetic cannabinoids in ‘spice-like’ herbal mixtures: Update of the German situation in summer 2018. Forensic Sci. Int. 2019, 294, 96–102. [Google Scholar] [CrossRef] [PubMed]
  12. Zapata, F.; Matey, J.M.; Montalvo, G.; García-Ruiz, C. Chemical classification of new psychoactive substances (NPS). Microchem. J. 2021, 163, 105877. [Google Scholar] [CrossRef]
  13. Lesiak, A.D.; Shepard, J.R. Recent advances in forensic drug analysis by DART-MS. Bioanalysis 2014, 6, 819–842. [Google Scholar] [CrossRef]
  14. Mignani, S.; Rodrigues, J.; Tomas, H.; Jalal, R.; Pal Singh, P.; Majoral, J.P.; Vishwakarma, R.A. Present drug-likeness filters in medicinal chemistry during the hit and lead optimization process: How far can they be simplified? Drug Discov. Today 2018, 23, 605–615. [Google Scholar] [CrossRef]
  15. Rogers, P.J. Food and drug addictions: Similarities and differences. Pharmacol. Biochem. Behav. 2017, 153, 182–190. [Google Scholar] [CrossRef] [Green Version]
  16. Alves, V.L.; Gonçalves, J.L.; Aguiar, J.; Teixeira, H.M.; Câmara, J.S. The synthetic cannabinoids phenomenon: From structure to toxicological properties. A review. Crit. Rev. Toxicol. 2020, 50, 359–382. [Google Scholar] [CrossRef]
  17. Soltaninejad, K. Clinical and Forensic Toxicological Aspects of Synthetic Cannabinoids: A Review and Update. Asia Pac. J. Med. Toxicol. 2020, 9, 108–118. [Google Scholar] [CrossRef]
  18. Potts, A.J.; Cano, C.; Thomas, S.H.L.; Hill, S.L. Synthetic cannabinoid receptor agonists: Classification and nomenclature. Clin. Toxicol. 2020, 58, 82–98. [Google Scholar] [CrossRef] [PubMed]
  19. Kwon, S.; Bae, H.; Jo, J. Comprehensive ensemble in QSAR prediction for drug discovery. BMC Bioinformatics 2019, 20, 521. [Google Scholar] [CrossRef]
  20. Gini, G. The QSAR similarity principle in the deep learning era: Confirmation or revision? Found Chem. 2020, 22, 383–402. [Google Scholar] [CrossRef]
  21. Stumpfe, D.; Hu, H.; Bajorath, J. Evolving Concept of Activity Cliffs. ACS Omega 2019, 4, 14360–14368. [Google Scholar] [CrossRef]
  22. Examples of Fingerprint and Descriptors. Available online: https://www.cambridgemedchemconsulting.com/resources/hit_identification/examples_descriptors.php (accessed on 11 February 2021).
  23. Godden, J.W.; Stahura, F.L.; Bajorath, J. Anatomy of fingerprint search calculations on structurally diverse sets of active compounds. J. Chem. Inf. Model. 2005, 45, 1812–1819. [Google Scholar] [CrossRef] [PubMed]
  24. Voicu, A.; Duteanu, N.; Voicu, M.; Daliborca, V.; Dumitrascu, V. The rcdk and cluster R packages applied to drug candidate selection. J. Cheminformatics 2020, 12, 3. [Google Scholar] [CrossRef] [Green Version]
  25. Swandana, R.; Aisyah, P.; Syahdi, R.R. Prediction analysis of pharmacokinetic parameters of several oral systemic drugs using in silico method. Int. J. Appl. Pharm. 2020, 12, 260–263. [Google Scholar] [CrossRef]
  26. Leelananda, S.P.; Lindert, S. Computational Methods in Drug Discovery. Beilstein J. Org. Chem. 2016, 12, 2694–2718. [Google Scholar] [CrossRef] [Green Version]
  27. Sliwoski, G.; Kothiwale, S.; Meiler, J.; Lowe, E.W., Jr. Computational Methods in Drug Discovery. Pharmacol. Rev. 2014, 66, 334–395. [Google Scholar] [CrossRef] [Green Version]
  28. Willett, P. Similarity Searching Using 2D Structural Fingerprints. In Chemoinformatics and Computational Chemical Biology; Bajorath, J., Ed.; Humana Press: Totowa, NJ, USA, 2011; pp. 133–158. [Google Scholar] [CrossRef] [Green Version]
  29. Guha, R.; Gilbert, K.; Fox, G.; Pierce, M.; Wild, D.; Yuan, H. Advances in cheminformatics methodologies and infrastructure to support the data mining of large, heterogeneous chemical datasets. Cur. Comput.-Aid. Drug 2010, 6, 50–67. [Google Scholar] [CrossRef] [PubMed]
  30. Cao, Y.; Charisi, A.; Cheng, L.C.; Jiang, T.; Girke, T. ChemmineR: A compound mining framework for R. Bioinformatics 2008, 24, 1733–1734. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  31. Wang, Y.; Backman, T.W.; Horan, K.; Girke, T. fmcsR: Mismatch tolerant maximum common substructure searching in R. Bioinformatics 2013, 29, 2792–2794. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Guha, R. Chemical Informatics Functionality in R. J. Stat. Softw. 2007, 18, 1–16. [Google Scholar] [CrossRef] [Green Version]
  33. Guha, R.; Cherto, M.R. rcdk: Integrating the CDK with R. Available online: https://mran.microsoft.com/snapshot/2017-02-04/web/packages/rcdk/vignettes/rcdk.pdf (accessed on 15 May 2021).
  34. Mente, S.; Kuhn, M. The use of the R language for medicinal chemistry applications. Curr. Top. Med. Chem. 2012, 12, 1957–1964. [Google Scholar] [CrossRef] [PubMed]
  35. Alvarsson, J.; Lampa, S.; Schaal, W.; Andersson, C.; Wikberg, J.E.S.; Spjuth, O. Large-scale ligand-based predictive modelling using support vector machines. J. Cheminform. 2016, 8, 39. [Google Scholar] [CrossRef] [Green Version]
  36. Marth, C.J.; Gallego, G.M.; Lee, J.C.; Lebold, T.P.; Kulyk, S.; Kou, K.G.M.; Qin, J.; Lilien, R.; Sarpong, R. Network-analysis-guided synthesis of weisaconitine D and liljestrandinine. Nature 2015, 528, 493–498. [Google Scholar] [CrossRef] [Green Version]
  37. Segler, M.H.S.; Waller, M.P. Modelling chemical reasoning to predict and invent reactions. Chem. Eur. J. 2017, 23, 6118–6128. [Google Scholar] [CrossRef] [Green Version]
  38. Willighagen, E.L.; Mayfield, J.W.; Alvarsson, J.; Berg, A.; Carlson, L.; Jeliazkova, N.; Kuhn, S.; Pluskal, T.; Rojas-Chertó, M.; Spjuth, O.; et al. The Chemistry Development Kit (CDK) v2.0: Atom typing, depiction, molecular formulas, and substructure searching. J. Cheminform. 2017, 9, 33. [Google Scholar] [CrossRef] [Green Version]
  39. Bărbulescu, A.; Barbeș, L.; Dumitriu, C.-Ş. Computer-Aided Classification of New Psychoactive Substances. J. Chem. 2021, 2021, 4816970. [Google Scholar] [CrossRef]
  40. PubChem. Available online: https://pubchem.ncbi.nlm.nih.gov (accessed on 5 May 2021).
  41. Kubinyi, H. Hydrogen Bonding: The Last Mystery in Drug Design. In Pharmacokinetic Optimization in Drug Research: Biological, Physicochemical, and Computational Strategies; Testa, B., van de Waterbeemd, H., Folkers, G., Guy, R., Eds.; Verlag Helvetica Chimica Acta: Zürich, Switzerland, 2001; pp. 513–521. [Google Scholar]
  42. Caron, G.; Vallaro, M.; Ermondi, G. Log P as a tool in intramolecular hydrogen bond considerations. Drug Discov. Today 2018, 27, 65–70. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  43. Patrick, G.L. An Introduction to Médicinal Chemistry; Oxford University Press: Oxford, UK, 1995. [Google Scholar]
  44. Cuesta, S.A.; Mora, J.R.; Márquez, E.A. In Silico Screening of the DrugBank Database to Search for Possible Drugs against SARS-CoV-2. Molecules 2021, 26, 1100. [Google Scholar] [CrossRef] [PubMed]
  45. Ertl, P.; Rohde, B.; Selzer, P. Fast Calculation of Molecular Polar Surface Area as a Sum of Fragment-Based Contributions and Its Application to the Prediction of Drug Transport Properties. J. Med. Chem. 2000, 43, 3714–3717. [Google Scholar] [CrossRef] [PubMed]
  46. Vistoli, G.; Pedretti, A. Molecular Fields to Assess Recognition Forces and Property Spaces. Comp. Med. Chem. II 2007, 5, 577–602. [Google Scholar]
  47. Turner, J.V.; Agatonovic-Kustrin, S. In Silico Prediction of Oral Bioavailability. Comp. Med. Chem. II 2007, 5, 699–724. [Google Scholar]
  48. Chen, X.; Reynolds, C.H. Performance of Similarity Measures in 2D Fragment-Based Similarity Searching: Comparison of Structural Descriptors and Similarity Coefficients. J. Chem. Inf. Comput. Sci. 2002, 42, 1407–1414. [Google Scholar] [CrossRef]
  49. Monev, V. Introduction to Similarity Searching in Chemistry. Match-Commun. Math. Comp. Chem. 2004, 51, 7–38. [Google Scholar]
  50. Bajusz, D.; Rácz, A.; Héberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminformatics 2015, 7, 20. [Google Scholar] [CrossRef] [Green Version]
  51. Tversky, A. Features of Similarity. Psychol. Rev. 1977, 84, 327–352. [Google Scholar] [CrossRef]
  52. Jarvis, R.A.; Patrick, E.A. Clustering Using a Similarity Measure Based on Shared Near Neighbors. IEEE Trans. Comput. 1973, 22, 1025–1034. [Google Scholar] [CrossRef]
  53. Murtagh, F.; Contreras, P. Algorithms for hierarchical clustering: An overview. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2021, 2, 86–97. [Google Scholar] [CrossRef]
  54. Ward, J.H., Jr. Hierarchical Grouping to Optimize an Objective Function. J. Am. Stat. Assoc. 1963, 58, 236–244. [Google Scholar] [CrossRef]
  55. Lagunin, A.; Stepanchikova, A.; Filimonov, D.; Poroikov, V. PASS: Prediction of activity spectra for biologically active substances. Bioinformatics 2000, 16, 747–748. [Google Scholar] [CrossRef] [PubMed]
  56. Filimonov, D.A.; Lagunin, A.A.; Gloriozova, T.A.; Rudik, A.V.; Druzhilovskii, D.S.; Pogodin, P.V.; Poroikov, V.V. Prediction of the biological activity spectra of organic compounds using the PASS online web resource. Chem. Heterocyclic Comp. 2014, 50, 444–457. [Google Scholar] [CrossRef]
  57. Kruskal, W.H.; Wallis, W.A. Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 1952, 47, 583–621. [Google Scholar] [CrossRef]
Figure 1. Classification of Aminoalkylindoles—four subgroups.
Figure 2. The flowchart of the study.
Figure 3. The structures of the studied molecules from (a) Group 1 and (b) Group 2.
Figure 4. The elbow method for determining the optimal number of clusters for Group 1 in the k-means algorithm.
Figure 5. (a) The heatmap for Benzoylindoles; (b) The dendrogram for Benzoylindoles; (c) The heatmap for Phenylacetylindoles; (d) The dendrogram for Phenylacetylindoles.
Figure 6. The heatmap and dendrogram for all molecules. The black branches correspond to the molecules in the first group.
Figure 7. The results of grouping the molecules by k-means algorithm. The Ox and Oy axes are the first two PCs.
Table 1. The CID, molecular formula, and MW of the studied drugs.
| Benzoylindoles: CID | Formula | MW | Phenylacetylindoles: CID | Formula | MW |
|---|---|---|---|---|---|
| 9889172 | C20H19FINO | 435.2738 | 44397641 | C22H25NO2 | 335.4394 |
| 117587582 | C20H19ClINO | 451.7284 | 44397500 | C21H22ClNO | 339.8585 |
| 10226340 | C22H23IN2O | 458.3353 | 44397540 | C22H25NO2 | 335.4394 |
| 56841530 | C21H23NO2 | 321.4128 | 53494930 | C24H28N2O2 | 376.4913 |
| 57507911 | C21H23NO2 | 321.4128 | 11616723 | C22H25NO | 319.4400 |
| 57507905 | C20H21NO2 | 307.3862 | 11493740 | C22H25NO2 | 335.4394 |
| 56463 | C23H26N2O3 | 378.4641 | 53394099 | C25H29NO2 | 375.5033 |
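The MW column of Table 1 can be cross-checked directly from the molecular formulas. The sketch below is only an illustration and not the authors' procedure: the function name `molecular_weight` and the rounded atomic weights in `ATOMIC_WEIGHT` are assumptions introduced here.

```python
import re

# Rounded atomic weights (g/mol) for the elements occurring in Table 1.
# Assumption of this sketch; other reference values shift the last decimals.
ATOMIC_WEIGHT = {
    "C": 12.0107, "H": 1.00794, "N": 14.0067, "O": 15.9994,
    "F": 18.9984032, "Cl": 35.453, "I": 126.90447,
}


def molecular_weight(formula: str) -> float:
    """Sum atomic weights over a simple formula such as 'C20H19FINO'."""
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total


for cid, formula in [(9889172, "C20H19FINO"), (44397641, "C22H25NO2")]:
    print(cid, formula, round(molecular_weight(formula), 4))
```

With these weights, the sums for C20H19FINO and C22H25NO2 agree with the tabulated 435.2738 and 335.4394 to about three decimals.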
Table 2. The atoms’ species and functional groups and their numbers.
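| Group | CID | C | H | N | O | F | Cl | I | R3N | RCOR | ROR | Rings | Aromatic |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Benzoylindoles | 9889172 | 20 | 19 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 3 | 3 |
| | 117587582 | 20 | 19 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 3 | 3 |
| | 10226340 | 22 | 23 | 2 | 1 | 0 | 0 | 1 | 2 | 1 | 0 | 4 | 3 |
| | 56841530 | 21 | 23 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 3 | 3 |
| | 57507911 | 21 | 23 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 3 | 3 |
| | 57507905 | 20 | 21 | 1 | 2 | 0 | 0 | 0 | 1 | 1 | 1 | 3 | 3 |
| | 56463 | 23 | 26 | 2 | 3 | 0 | 0 | 0 | 2 | 1 | 2 | 4 | 3 |
| Phenylacetylindoles | 44397641 | 22 | 25 | 1 | 2 | 0 | 2 | 0 | 1 | 1 | 1 | 3 | 3 |
| | 44397500 | 21 | 22 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 3 | 3 |
| | 44397540 | 22 | 25 | 1 | 2 | 0 | 2 | 0 | 1 | 1 | 1 | 3 | 3 |
| | 53494930 | 24 | 28 | 2 | 2 | 0 | 2 | 0 | 2 | 1 | 1 | 4 | 3 |
| | 11616723 | 22 | 25 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 3 | 3 |
| | 11493740 | 22 | 25 | 1 | 2 | 0 | 2 | 0 | 1 | 1 | 1 | 3 | 3 |
| | 53394099 | 25 | 29 | 1 | 2 | 0 | 2 | 0 | 1 | 1 | 1 | 4 | 3 |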
Table 3. Molecules’ descriptors.
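| Group | CID | HBA1 | HBA2 | HBD | logP | MR | TPSA |
|---|---|---|---|---|---|---|---|
| Benzoylindoles | 9889172 | 20 | 2 | 0 | 5.6167 | 105.0705 | 22.00 |
| | 117587582 | 20 | 2 | 0 | 5.8860 | 109.8155 | 22.00 |
| | 10226340 | 25 | 3 | 0 | 4.8991 | 119.3305 | 25.24 |
| | 56841530 | 25 | 3 | 0 | 5.0711 | 98.7945 | 31.23 |
| | 57507911 | 25 | 3 | 0 | 5.0711 | 98.7945 | 31.23 |
| | 57507905 | 23 | 3 | 0 | 4.6810 | 93.9875 | 31.23 |
| | 56463 | 30 | 4 | 0 | 3.4594 | 114.3495 | 43.70 |
| Phenylacetylindoles | 44397641 | 27 | 3 | 0 | 5.2655 | 103.6015 | 31.23 |
| | 44397500 | 23 | 2 | 0 | 5.9103 | 102.1195 | 22.00 |
| | 44397540 | 27 | 3 | 0 | 5.2655 | 103.6015 | 31.23 |
| | 53494930 | 31 | 4 | 0 | 4.4975 | 117.9125 | 34.47 |
| | 11616723 | 26 | 2 | 0 | 5.5653 | 102.0755 | 22.00 |
| | 11493740 | 27 | 3 | 0 | 5.2655 | 103.6015 | 31.23 |
| | 53394099 | 31 | 3 | 0 | 6.0457 | 115.9085 | 31.23 |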
Table 4. Tanimoto coefficients for the atoms’ similarities.
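| Benzoylindoles: CID | 9889172 | 117587582 | 10226340 | 57507911 | 56841530 | 57507905 | 56463 |
|---|---|---|---|---|---|---|---|
| 9889172 | 1.0000 | | | | | | |
| 117587582 | 0.8462 | 1.0000 | | | | | |
| 10226340 | 0.6788 | 0.6788 | 1.0000 | | | | |
| 57507911 | 0.6140 | 0.5376 | 0.5139 | 1.0000 | | | |
| 56841530 | 0.5376 | 0.6140 | 0.5774 | 0.7250 | 1.0000 | | |
| 57507905 | 0.4901 | 0.4901 | 0.4859 | 0.8497 | 0.6635 | 1.0000 | |
| 56463 | 0.4469 | 0.4469 | 0.5054 | 0.5425 | 0.5069 | 0.4953 | 1.0000 |

| Phenylacetylindoles: CID | 44397641 | 44397500 | 44397540 | 53494930 | 11616723 | 11493740 | 53394099 |
|---|---|---|---|---|---|---|---|
| 44397641 | 1.0000 | | | | | | |
| 44397500 | 0.6609 | 1.0000 | | | | | |
| 44397540 | 0.7249 | 0.7840 | 1.0000 | | | | |
| 53494930 | 0.5777 | 0.5731 | 0.6873 | 1.0000 | | | |
| 11616723 | 0.7000 | 0.8467 | 0.8526 | 0.6158 | 1.0000 | | |
| 11493740 | 0.7654 | 0.6802 | 0.7654 | 0.5888 | 0.7357 | 1.0000 | |
| 53394099 | 0.5455 | 0.5545 | 0.6545 | 0.6882 | 0.5769 | 0.5490 | 1.0000 |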
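For two molecules described by feature sets A and B, the Tanimoto coefficient is T(A, B) = c / (a + b − c), where a and b are the numbers of features in each set and c is the number they share. A minimal sketch follows, assuming set-valued descriptors; the function name `tanimoto` and the two toy atom-pair sets are hypothetical and are not the descriptors used in the study.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) coefficient: shared features / all distinct features."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


# Hypothetical atom-pair descriptors written as (atom, atom, bond distance).
mol_a = {("C", "N", 1), ("C", "O", 2), ("C", "C", 1), ("C", "I", 3)}
mol_b = {("C", "N", 1), ("C", "O", 2), ("C", "C", 1), ("C", "Cl", 3)}

similarity = tanimoto(mol_a, mol_b)  # 3 shared features out of 5 distinct
distance = 1.0 - similarity          # a dissimilarity usable for clustering
print(f"T = {similarity:.4f}, 1 - T = {distance:.4f}")
```

The complement 1 − T is a common choice of distance when a similarity matrix such as the one in Table 4 is passed to a hierarchical clustering routine.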
Table 5. Similarities of pairs of atoms belonging to pairs of molecules from Benzoylindoles and Phenylacetylindoles.
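| CID | 9889172 | 117587582 | 10226340 | 57507911 | 56841530 | 57507905 | 56463 |
|---|---|---|---|---|---|---|---|
| 44397641 | 0.4883 | 0.4883 | 0.4706 | 0.6364 | 0.6649 | 0.6029 | 0.4803 |
| 44397500 | 0.5248 | 0.5462 | 0.4913 | 0.5817 | 0.5249 | 0.4901 | 0.3915 |
| 44397540 | 0.4922 | 0.4922 | 0.4810 | 0.6842 | 0.6134 | 0.5937 | 0.4456 |
| 53494930 | 0.4566 | 0.4566 | 0.6087 | 0.5646 | 0.5139 | 0.4847 | 0.4595 |
| 11616723 | 0.5248 | 0.5248 | 0.5062 | 0.6677 | 0.5726 | 0.5333 | 0.42173 |
| 11493740 | 0.4884 | 0.4884 | 0.4741 | 0.6600 | 0.7194 | 0.6361 | 0.4934 |
| 53394099 | 0.4598 | 0.4598 | 0.4894 | 0.5000 | 0.4696 | 0.4244 | 0.4766 |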
Table 6. The similarity of molecules’ couples, one of them belonging to Benzoylindoles and the other to Phenylacetylindoles (Tanimoto coefficient). The number in the brackets represents the similarity rank. Minus (−) signifies that the rank of similarity is higher than eight.
CID 9889172 117587582 10226340 57507911 56841530 57507905 56463
44397641 0.6657 (5) 0.6040 (5) 0.4815 (7)
44397500 0.5262 (6) 0.6374 (7)
44397540 0.6851 (3) 0.6145 (6) 0.5948 (6)
53494930 0.6096 (4)
11616723 0.5262 (7) 0.5262 (7) 0.5075 (7) 0.5347 (7)
11493740 0.6609 (6) 0.7202 (4) 0.6371 (4) 0.4945 (6)
Table 7. Positive and negative effects of the Benzoylindoles that appear with probabilities greater than 0.5.
| CID | Pa | Positive effect | Pa | Negative effect |
|---|---|---|---|---|
| 9889172 | 0.538 | Antineurogenic pain | 0.587 | Photoallergy dermatitis |
| | 0.503 | Nicotinic alpha4beta2 receptor antagonist | 0.534 | Allergic contact dermatitis |
| 117587582 | 0.598 | Lymphocytopoiesis inhibitor | 0.664 | Cyanosis |
| | 0.515 | Oxidoreductase inhibitor | 0.604 | Tremor |
| | | | 0.604 | Sleep disturbance |
| | | | 0.556 | Edema |
| | | | 0.551 | Drowsiness |
| | | | 0.548 | Weight loss |
| | | | 0.533 | Fibrosis, interstitial |
| | | | 0.509 | Conjunctivitis |
| | | | 0.504 | Sensory disturbance |
| | | | 0.503 | Dizziness |
| 10226340 | 0.775 | Nicotinic alpha4beta4 receptor agonist | 0.596 | Twitching |
| | 0.731 | Analgesic | 0.535 | Cyanosis |
| | 0.672 | Antineurogenic pain | 0.500 | Sneezing |
| | 0.636 | Analgesic, non-opioid | | |
| | 0.589 | Nicotinic alpha6beta3beta4alpha5 receptor antagonist | | |
| | 0.544 | Nicotinic alpha2beta2 receptor antagonist | | |
| | 0.538 | Histamine antagonist | | |
| | 0.522 | Antihistaminic | | |
| 57507911 | 0.790 | Antineurotic | 0.754 | Allergic contact dermatitis |
| | 0.744 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.743 | Shivering |
| | 0.695 | 5 Hydroxytryptamine release stimulant | 0.693 | Twitching |
| | 0.645 | Aspulvinone dimethylallyltransferase inhibitor | 0.639 | Photoallergy dermatitis |
| | 0.636 | Taurine dehydrogenase inhibitor | 0.625 | Myoclonus |
| | 0.627 | Fibrinolytic | 0.594 | Torsades de pointes |
| | 0.618 | Thromboxane B2 antagonist | 0.582 | Fibrosis, interstitial |
| | 0.614 | Chlordecone reductase inhibitor | 0.572 | Delirium |
| | 0.606 | Antieczematic | 0.568 | Gastrointestinal hemorrhage |
| | 0.563 | Acrocylindropepsin inhibitor | 0.561 | Xerostomia |
| | 0.563 | Calcium channel (voltage-sensitive) activator | 0.522 | Conjunctivitis |
| | 0.563 | Chymosin inhibitor | 0.516 | Hypothermic |
| | 0.563 | Saccharopepsin inhibitor | 0.515 | Dystonia |
| | 0.558 | Amine dehydrogenase inhibitor | 0.515 | Cyanosis |
| | 0.551 | Preneoplastic conditions treatment | 0.512 | Pseudoporphyria |
| | 0.540 | Platelet aggregation inhibitor | 0.506 | Hematuria |
| | 0.537 | Aldehyde oxidase inhibitor | 0.504 | Fibrillation, atrial |
| | 0.510 | Gastrin inhibitor | 0.503 | Dysphoria |
| 56841530 | 0.735 | Antineurotic | 0.743 | Shivering |
| | 0.744 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.693 | Twitching |
| | 0.620 | 5 Hydroxytryptamine release stimulant | 0.640 | Allergic contact dermatitis |
| | 0.606 | Antieczematic | 0.565 | Gastrointestinal hemorrhage |
| | 0.606 | Fibrinolytic | 0.562 | Photoallergy dermatitis |
| | 0.586 | Thromboxane B2 antagonist | 0.549 | Torsades de pointes |
| | 0.582 | Taurine dehydrogenase inhibitor | 0.533 | Myoclonus |
| | 0.577 | Aspulvinone dimethylallyltransferase inhibitor | 0.500 | Xerostomia |
| | 0.551 | Preneoplastic conditions treatment | | |
| | 0.542 | Chlordecone reductase inhibitor | | |
| | 0.538 | Platelet aggregation inhibitor | | |
| | 0.526 | Calcium channel (voltage-sensitive) activator | | |
| | 0.516 | Gastrin inhibitor | | |
| | 0.506 | Membrane permeability inhibitor | | |
| 57507905 | 0.802 | Antineurotic | 0.744 | Allergic contact dermatitis |
| | 0.757 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.733 | Twitching |
| | 0.684 | Aspulvinone dimethylallyltransferase inhibitor | 0.723 | Shivering |
| | 0.678 | 5 Hydroxytryptamine release stimulant | 0.667 | Photoallergy dermatitis |
| | 0.661 | Chlordecone reductase inhibitor | 0.636 | Myoclonus |
| | 0.620 | Amine dehydrogenase inhibitor | 0.619 | Fibrosis, interstitial |
| | 0.615 | Fibrinolytic | 0.614 | Gastrointestinal hemorrhage |
| | 0.607 | Taurine dehydrogenase inhibitor | 0.594 | Torsades de pointes |
| | 0.600 | Thromboxane B2 antagonist | 0.588 | Delirium |
| | 0.596 | Aldehyde oxidase inhibitor | 0.577 | Xerostomia |
| | 0.568 | Antieczematic | 0.546 | Pseudoporphyria |
| | 0.545 | Calcium channel (voltage-sensitive) activator | 0.528 | Dystonia |
| | 0.538 | Platelet aggregation inhibitor | 0.52 | Dysphoria |
| | 0.534 | Preneoplastic conditions treatment | 0.519 | Hypotonia |
| | 0.530 | Acrocylindropepsin inhibitor | 0.518 | Nephritis |
| | 0.530 | Chymosin inhibitor | 0.517 | Postural hypotension |
| | 0.530 | Saccharopepsin inhibitor | 0.515 | Choreoathetosis |
| | 0.502 | Acetylcholine neuromuscular blocking agent | 0.510 | Urinary retention |
| | | | 0.509 | Hematuria |
| | | | 0.508 | Hypothermic |
| | | | 0.502 | Cyanosis |
| | | | 0.501 | Conjunctivitis |
| | | | 0.499 | Hepatitis |
| 56463 | 0.862 | Antineurotic | 0.673 | Twitching |
| | 0.685 | Phobic disorders treatment | 0.562 | Galactorrhea |
| | 0.676 | Chlordecone reductase inhibitor | 0.554 | Hepatitis |
| | 0.665 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.524 | Toxic, respiration |
| | 0.574 | Insulysin inhibitor | 0.523 | Dystonia |
| | 0.563 | Calcium channel (voltage-sensitive) activator | 0.512 | Nephritis |
| | 0.556 | Aspulvinone dimethylallyltransferase inhibitor | | |
| | 0.546 | Gastrin inhibitor | | |
Table 8. Positive and negative effects of Phenylacetylindoles with probabilities greater than 0.5.
| CID | Pa | Positive effect | Pa | Negative effect |
|---|---|---|---|---|
| 44397641 | 0.792 | Antineurotic | 0.820 | Shivering |
| | 0.762 | 5 Hydroxytryptamine release stimulant | 0.636 | Twitching |
| | 0.717 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.551 | Sweating |
| | 0.680 | Taurine dehydrogenase inhibitor | 0.537 | Hypothermic |
| | 0.655 | Chlordecone reductase inhibitor | 0.536 | Pseudoporphyria |
| | 0.629 | Antieczematic | 0.526 | Excitability |
| | 0.610 | Thromboxane B2 antagonist | 0.511 | Torsades de pointes |
| | 0.568 | Antiallergic | | |
| | 0.565 | Preneoplastic conditions treatment | | |
| | 0.560 | Aspulvinone dimethylallyltransferase inhibitor | | |
| | 0.529 | General pump inhibitor | | |
| | 0.517 | Mediator release inhibitor | | |
| 44397500 | 0.720 | Antineurotic | 0.812 | Twitching |
| | 0.605 | Taurine dehydrogenase inhibitor | 0.733 | Shivering |
| | 0.601 | Chlordecone reductase inhibitor | 0.676 | Akathisia |
| | 0.601 | CYP2J substrate | 0.666 | Excitability |
| | 0.589 | CYP2J2 substrate | 0.665 | Dysarthria |
| | 0.589 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.626 | Weight gain |
| | 0.579 | Glycosylphosphatidylinositol phospholipase D inhibitor | 0.612 | Myoclonus |
| | | | 0.591 | Hypomania |
| | 0.561 | Antiallergic | 0.591 | Multiple organ failure |
| | 0.535 | Phobic disorders treatment | 0.582 | Fibrillation, atrial |
| | 0.516 | Thromboxane B2 antagonist | 0.580 | Choreoathetosis |
| | | | 0.563 | Dystonia |
| | | | 0.560 | Delirium |
| | | | 0.557 | Hypothermic |
| | | | 0.533 | Reproductive dysfunction |
| | | | 0.517 | Weakness |
| 44397540 | 0.766 | Antineurotic | 0.780 | Shivering |
| | 0.682 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.636 | Twitching |
| | 0.600 | Antieczematic | 0.516 | Hypothermic |
| | 0.589 | Chlordecone reductase inhibitor | | |
| | 0.581 | Antiallergic | | |
| | 0.574 | 5 Hydroxytryptamine release stimulant | | |
| | 0.561 | Thromboxane B2 antagonist | | |
| | 0.551 | Taurine dehydrogenase inhibitor | | |
| | 0.521 | Preneoplastic conditions treatment | | |
| | 0.509 | Mediator release inhibitor | | |
| 53494930 | 0.797 | Nicotinic alpha4beta4 receptor agonist | 0.524 | Extrapyramidal effect |
| | 0.713 | Nicotinic alpha6beta3beta4alpha5 receptor antagonist | | |
| | 0.698 | Nicotinic alpha2beta2 receptor antagonist | | |
| | 0.587 | Gluconate 2-dehydrogenase (acceptor) inhibitor | | |
| | 0.539 | General pump inhibitor | | |
| | 0.510 | CYP2H substrate | | |
| 11616723 | 0.657 | Antineurotic | 0.802 | Twitching |
| | 0.611 | Antieczematic | 0.755 | Shivering |
| | 0.583 | CYP2J substrate | 0.609 | Sweating |
| | 0.580 | Mediator release inhibitor | 0.596 | Acidosis |
| | 0.578 | Taurine dehydrogenase inhibitor | 0.551 | Weakness |
| | 0.559 | Endopeptidase So inhibitor | 0.532 | Excitability |
| | 0.557 | Kidney function stimulant | 0.515 | Sneezing |
| | 0.554 | Antiallergic | 0.505 | Muscle weakness |
| | 0.537 | Thromboxane B2 antagonist | 0.504 | Euphoria |
| | 0.506 | Carboxypeptidase Taq inhibitor | | |
| | 0.499 | Gastrin inhibitor | | |
| | 0.497 | CYP2C19 inhibitor | | |
| 11493740 | 0.776 | Antineurotic | 0.793 | Shivering |
| | 0.702 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.569 | Twitching |
| | 0.699 | 5 Hydroxytryptamine release stimulant | 0.560 | Hypothermic |
| | 0.643 | Antieczematic | 0.554 | Sweating |
| | 0.632 | Taurine dehydrogenase inhibitor | 0.519 | Excitability |
| | 0.608 | Chlordecone reductase inhibitor | 0.511 | Torsades de pointes |
| | 0.589 | Preneoplastic conditions treatment | 0.502 | Pseudoporphyria |
| | 0.587 | Thromboxane B2 antagonist | 0.499 | Euphoria |
| | 0.571 | Antiallergic | | |
| | 0.532 | Mediator release inhibitor | | |
| | 0.531 | General pump inhibitor | | |
| | 0.516 | Aspulvinone dimethylallyltransferase inhibitor | | |
| 53394099 | 0.753 | Antineurotic | 0.506 | Hypercholesterolemic |
| | 0.661 | Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.499 | Sweating |
| | 0.562 | Antidyskinetic | | |
| | 0.559 | Nicotinic alpha6beta3beta4alpha5 receptor antagonist | | |
| | 0.545 | Antiallergic | | |
| | 0.545 | Antiasthmatic | | |
| | 0.507 | Acetylcholine neuromuscular blocking agent | | |
| | 0.500 | Nicotinic alpha2beta2 receptor antagonist | | |
| | 0.493 | 5 Hydroxytryptamine antagonist | | |
Table 9. The series of probabilities used to perform the Kruskal–Wallis test on Group 1. The bold numbers are probabilities less than 0.5.
| Effect / CID | 9889172 | 117587582 | 10226340 | 57507911 | 56841530 | 57507905 | 56463 |
|---|---|---|---|---|---|---|---|
| 5 Hydroxytryptamine release stimulant | 0.000 | 0.000 | 0.000 | 0.695 | 0.620 | 0.678 | 0.483 |
| Acetylcholine neuromuscular blocking agent | 0.000 | 0.000 | 0.000 | 0.468 | 0.415 | 0.502 | 0.469 |
| Acrocylindropepsin inhibitor | 0.000 | 0.000 | 0.000 | 0.563 | 0.491 | 0.503 | 0.280 |
| Aldehyde oxidase inhibitor | 0.000 | 0.000 | 0.000 | 0.537 | 0.330 | 0.596 | 0.256 |
| Amine dehydrogenase inhibitor | 0.000 | 0.000 | 0.000 | 0.558 | 0.490 | 0.620 | 0.305 |
| Analgesic | 0.448 | 0.366 | 0.731 | 0.000 | 0.000 | 0.000 | 0.260 |
| Analgesic, non-opioid | 0.473 | 0.214 | 0.636 | 0.000 | 0.000 | 0.000 | 0.000 |
| Antieczematic | 0.000 | 0.000 | 0.000 | 0.563 | 0.606 | 0.568 | 0.000 |
| Antihistaminic | 0.000 | 0.118 | 0.522 | 0.161 | 0.118 | 0.161 | 0.000 |
| Antineurogenic pain | 0.538 | 0.270 | 0.672 | 0.367 | 0.314 | 0.383 | 0.315 |
| Antineurotic | 0.000 | 0.000 | 0.000 | 0.790 | 0.802 | 0.802 | 0.862 |
| Aspulvinone dimethylallyltransferase inhibitor | 0.000 | 0.000 | 0.000 | 0.645 | 0.577 | 0.684 | 0.556 |
| Calcium channel (voltage-sensitive) activator | 0.000 | 0.000 | 0.000 | 0.563 | 0.526 | 0.545 | 0.563 |
| Chlordecone reductase inhibitor | 0.000 | 0.000 | 0.000 | 0.563 | 0.542 | 0.661 | 0.676 |
| Chymosin inhibitor | 0.000 | 0.000 | 0.000 | 0.618 | 0.491 | 0.530 | 0.280 |
| Fibrinolytic | 0.000 | 0.000 | 0.000 | 0.614 | 0.606 | 0.615 | 0.314 |
| Gastrin inhibitor | 0.000 | 0.000 | 0.000 | 0.510 | 0.516 | 0.484 | 0.546 |
| Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.000 | 0.000 | 0.000 | 0.744 | 0.744 | 0.757 | 0.665 |
| Histamine antagonist | 0.097 | 0.101 | 0.538 | 0.136 | 0.000 | 0.135 | 0.000 |
| Insulysin inhibitor | 0.000 | 0.000 | 0.000 | 0.387 | 0.274 | 0.409 | 0.574 |
| Lymphocytopoiesis inhibitor | 0.319 | 0.598 | 0.303 | 0.000 | 0.000 | 0.000 | 0.000 |
| Membrane permeability inhibitor | 0.000 | 0.000 | 0.000 | 0.429 | 0.505 | 0.398 | 0.000 |
| Nicotinic alpha2beta2 receptor antagonist | 0.000 | 0.000 | 0.544 | 0.000 | 0.000 | 0.000 | 0.000 |
| Nicotinic alpha4beta2 receptor antagonist | 0.503 | 0.000 | 0.301 | 0.000 | 0.000 | 0.000 | 0.000 |
| Nicotinic alpha4beta4 receptor agonist | 0.000 | 0.000 | 0.755 | 0.000 | 0.000 | 0.000 | 0.000 |
| Nicotinic alpha6beta3beta4alpha5 receptor antagonist | 0.000 | 0.000 | 0.589 | 0.000 | 0.000 | 0.315 | 0.000 |
| Oxidoreductase inhibitor | 0.000 | 0.515 | 0.000 | 0.454 | 0.376 | 0.418 | 0.000 |
| Platelet aggregation inhibitor | 0.201 | 0.245 | 0.240 | 0.540 | 0.538 | 0.538 | 0.362 |
| Phobic disorders treatment | 0.000 | 0.000 | 0.000 | 0.000 | 0.374 | 0.342 | 0.685 |
| Preneoplastic conditions treatment | 0.000 | 0.000 | 0.000 | 0.551 | 0.551 | 0.534 | 0.246 |
| Saccharopepsin inhibitor | 0.000 | 0.000 | 0.000 | 0.614 | 0.491 | 0.530 | 0.280 |
| Taurine dehydrogenase inhibitor | 0.000 | 0.000 | 0.000 | 0.618 | 0.582 | 0.607 | 0.310 |
| Thromboxane B2 antagonist | 0.318 | 0.321 | 0.000 | 0.606 | 0.586 | 0.600 | 0.457 |
Table 10. The series of probabilities used to perform the Kruskal–Wallis test on Group 2. The bold numbers are probabilities less than 0.5.
| Effect / CID | 44397641 | 44397500 | 44397540 | 53494930 | 11616723 | 11493740 | 53394099 |
|---|---|---|---|---|---|---|---|
| 5 Hydroxytryptamine release stimulant | 0.762 | 0.468 | 0.574 | 0.497 | 0.383 | 0.699 | 0.493 |
| Acetylcholine neuromuscular blocking agent | 0.422 | 0.259 | 0.377 | 0.431 | 0.279 | 0.431 | 0.507 |
| Antiallergic | 0.568 | 0.561 | 0.581 | 0.425 | 0.554 | 0.571 | 0.545 |
| Antiasthmatic | 0.485 | 0.481 | 0.489 | 0.310 | 0.462 | 0.492 | 0.545 |
| Antidyskinetic | 0.416 | 0.397 | 0.405 | 0.000 | 0.395 | 0.413 | 0.562 |
| Antieczematic | 0.629 | 0.462 | 0.600 | 0.293 | 0.611 | 0.643 | 0.450 |
| Antineurotic | 0.792 | 0.720 | 0.766 | 0.428 | 0.657 | 0.776 | 0.753 |
| Aspulvinone dimethylallyltransferase inhibitor | 0.560 | 0.000 | 0.489 | 0.000 | 0.319 | 0.516 | 0.348 |
| Carboxypeptidase Taq inhibitor | 0.369 | 0.449 | 0.369 | 0.000 | 0.506 | 0.333 | 0.000 |
| Chlordecone reductase inhibitor | 0.655 | 0.601 | 0.589 | 0.000 | 0.434 | 0.608 | 0.349 |
| CYP2H substrate | 0.000 | 0.000 | 0.394 | 0.510 | 0.000 | 0.446 | 0.461 |
| CYP2J substrate | 0.288 | 0.601 | 0.288 | 0.000 | 0.583 | 0.000 | 0.000 |
| CYP2J2 substrate | 0.340 | 0.589 | 0.340 | 0.000 | 0.497 | 0.308 | 0.000 |
| Endopeptidase So inhibitor | 0.355 | 0.429 | 0.308 | 0.000 | 0.559 | 0.318 | 0.000 |
| General pump inhibitor | 0.529 | 0.405 | 0.471 | 0.539 | 0.437 | 0.531 | 0.460 |
| Gluconate 2-dehydrogenase (acceptor) inhibitor | 0.717 | 0.589 | 0.682 | 0.587 | 0.434 | 0.702 | 0.661 |
| Glycosylphosphatidylinositol phospholipase D inhibitor | 0.292 | 0.579 | 0.292 | 0.000 | 0.408 | 0.260 | 0.000 |
| Kidney function stimulant | 0.000 | 0.000 | 0.000 | 0.000 | 0.557 | 0.000 | 0.000 |
| Mediator release inhibitor | 0.517 | 0.402 | 0.509 | 0.189 | 0.580 | 0.532 | 0.397 |
| Nicotinic alpha2beta2 receptor antagonist | 0.303 | 0.350 | 0.303 | 0.698 | 0.376 | 0.272 | 0.500 |
| Nicotinic alpha4beta4 receptor agonist | 0.000 | 0.000 | 0.000 | 0.797 | 0.000 | 0.000 | 0.326 |
| Nicotinic alpha6beta3beta4alpha5 receptor antagonist | 0.387 | 0.430 | 0.387 | 0.713 | 0.450 | 0.353 | 0.559 |
| Phobic disorders treatment | 0.000 | 0.535 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Preneoplastic conditions treatment | 0.565 | 0.420 | 0.521 | 0.000 | 0.365 | 0.589 | 0.298 |
| Taurine dehydrogenase inhibitor | 0.680 | 0.605 | 0.551 | 0.000 | 0.578 | 0.632 | 0.000 |
| Thromboxane B2 antagonist | 0.610 | 0.516 | 0.561 | 0.000 | 0.537 | 0.587 | 0.303 |
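The Kruskal–Wallis test named in the captions of Tables 9 and 10 compares several samples through their pooled ranks. The sketch below is a minimal pure-Python illustration under stated assumptions, not the authors' computation: it treats each molecule's column as one sample, uses only the first six effects of Table 10 for three molecules, and the helper names `kruskal_wallis` and `chi2_sf_even_df` are hypothetical. The closed-form p-value is valid only when the degrees of freedom are even.

```python
import math


def kruskal_wallis(samples):
    """Return (H, degrees of freedom), with the usual correction for ties."""
    pooled = sorted(v for s in samples for v in s)
    n = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in s) ** 2 / len(s) for s in samples
    ) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:  # every observation identical: no evidence of a difference
        return 0.0, len(samples) - 1
    return h / correction, len(samples) - 1


def chi2_sf_even_df(x, df):
    """Chi-square upper-tail probability; closed form for even df only."""
    if df % 2:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# First six rows of Table 10 for three molecules (a subset, for illustration only).
columns = {
    44397641: [0.762, 0.422, 0.568, 0.485, 0.416, 0.629],
    44397500: [0.468, 0.259, 0.561, 0.481, 0.397, 0.462],
    53494930: [0.497, 0.431, 0.425, 0.310, 0.000, 0.293],
}
h_stat, dof = kruskal_wallis(list(columns.values()))
print(f"H = {h_stat:.3f}, df = {dof}, p = {chi2_sf_even_df(h_stat, dof):.3f}")
```

For the full tables, a statistics library routine would normally be preferable, since it handles an odd number of degrees of freedom as well.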
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Bărbulescu, A.; Barbeș, L.; Dumitriu, C.Ș. Computer-Aided Methods for Molecular Classification. Mathematics 2022, 10, 1543. https://0-doi-org.brum.beds.ac.uk/10.3390/math10091543

