In this section, we present and discuss the results of the three stages of the research. First, we describe the development of the Brazilian dataset based on those made available in the repository. Next, we train a model for each of the five cases, which differ in the data used for training and testing. Finally, we provide a detailed analysis of the two best cases.
3.1. Step 1—Obtaining the Aligned Brazilian Dataset
In this stage, data selection, pre-processing, transformation, mining, analysis, and results assimilation were performed. Case 1 was obtained by training and testing solely on the Brazilian dataset (DATA_BR2023), with the sample split in 20–80% proportions between the two sets.
The UC Irvine Machine Learning Repository dataset was used in Yeh’s study [13], which serves as a reference for many studies on concrete strength prediction. In that study, Yeh used 1080 instances to demonstrate the adaptability of ANNs in predicting this property in high-performance concrete. A set of concrete mixes was produced in the laboratory, and two main conclusions were drawn: an ANN-based strength model is more accurate than a linear regression-based model [9,19], and ANN models are convenient and easy to use in numerical experiments that review the effect of each variable's proportion in the concrete mix. This dataset was used for Case 2 in this paper, with the sample split in the same 20–80% proportions for comparison with Case 1.
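The hold-out split used in Cases 1 and 2 can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name `holdout_split`, the random seed, and the reading of "20–80%" as 20% for testing and 80% for training are all assumptions.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and return (train, test) lists.

    Assumption: 20% of the rows go to the test set and 80% to training.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical stand-in for a dataset: 100 row identifiers.
rows = list(range(100))
train, test = holdout_split(rows)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, which matters when the same partition must be reused to compare cases.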
The construction of the Brazilian dataset began after analyzing the conclusions from training using only the repository dataset. To align the Brazilian dataset with the repository dataset and quantify the cementitious materials, input data were obtained from the first three instances. The data were organized in table form, with each column representing a component.
The complete Brazilian dataset is exemplified in Table 3.
When analyzing the two databases, it is important to evaluate Pearson’s correlation between each pair of variables, as shown in Figure 3 and Figure 4, because it helps with feature selection. Concerning the target variable, cement was the input with the highest Pearson’s correlation, with values of 0.50 for YEH98 and 0.51 for BR2023. Although these values are moderate rather than close to 1 in absolute value, they indicate the strongest linear relationship with compressive strength among the inputs, which justifies its selection. Cement is the primary component of concrete, with binding properties directly related to its hardened state characteristics.
Regarding the correlation between inputs, the pair water and admixture stands out, with a correlation of −0.66 in YEH98. In neither database was a correlation higher than 0.9 observed among the inputs, which is a positive aspect, because such a value would indicate multicollinearity and could interfere with the model.
The correlation matrices also show that, in both databases, the input pozzolan has the lowest correlation with compressive strength. For this reason, this variable was not selected, and it is the only one from the dataset not used in the models.
This can be explained primarily by the way the quantities of pozzolan and blast furnace slag were estimated in BR2023: the calculation, based on the average cement consumption prescribed by the Brazilian standards, was not sufficient to reflect the actual amount of material in the cement samples collected from the literature, unlike YEH98, where the inputs were relevant. From knowledge in the materials field, these supplementary cementitious materials are known to influence compressive strength directly. Thus, all other inputs, except for pozzolan, were used in the models.
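The correlation screening described above can be sketched in a few lines. The example below is only illustrative: the helper names `pearson` and `collinear_pairs` are hypothetical, the toy values are made up and do not come from either dataset, and the 0.9 threshold is the one mentioned in the text.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for input pairs with |r| above the threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged

# Made-up toy values, for illustration only.
toy_inputs = {
    "cement": [300, 350, 400, 450, 500],
    "water": [200, 190, 185, 170, 160],
    "age": [7, 28, 7, 28, 7],
}
flagged = collinear_pairs(toy_inputs)
for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.2f}")
```

In this toy example only the cement and water columns are flagged; in the two databases analyzed here, no pair of inputs exceeded the threshold.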
3.2. Step 2—Training of the ML Technique Using ANN
Table 4 shows the maximum (Max), average (μ), minimum (Min), and standard deviation (σ) of the predicted strength values obtained for each case.
In comparison to Case 2, Case 1 (TR_TE.BR) shows a low variation in the mean between the training and testing samples, with standard deviations of 10.99 and 11.50, respectively. This can be explained by the fact that the database in Case 2 contains data from various countries, whereas the Brazilian one contains only national data.
In Cases 3 and 4, however, larger differences can be observed for the same dataset when it is used for training and for testing. In these two cases, the two datasets were swapped between training and testing. The difference between the mean values was 12.34% for DATA_YEH1998 and 9.24% for DATA_BR2023. Here, the use of different databases for the training and test sets may have been responsible for the higher variation.
In all the presented cases, the predicted mean values were higher than 25 MPa and lower than 40 MPa, indicating that the analyzed concretes belong to Class I, which comprises concretes with a strength between 25 MPa and 50 MPa. Case 5 showed the highest variance, with an average amplitude of 69.59 MPa and σ = 16.06 in the test results.
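The descriptive statistics reported for each case can be computed as in the sketch below. It rests on two assumptions not stated in the text: the standard deviation is the sample standard deviation, and "amplitude" means the difference between the maximum and minimum values. The function name `summarize` and the input values are hypothetical.

```python
import statistics

def summarize(strengths):
    """Max, mean, min, standard deviation, and amplitude of strengths in MPa."""
    return {
        "max": max(strengths),
        "mean": statistics.mean(strengths),
        "min": min(strengths),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(strengths),
        # Assumption: amplitude taken as the range, max minus min.
        "amplitude": max(strengths) - min(strengths),
    }

# Made-up predicted strengths in MPa, for illustration only.
summary = summarize([20.0, 30.0, 40.0])
print(summary)
```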
3.3. Step 3—Statistical Analysis of the Technique according to the Presented Cases
Table 5 summarizes the results of the statistical parameters for each case.
It can be observed that the training results, ranked from worst to best, were: Case 4, Case 3, Case 1, Case 5, and Case 2. For testing, in the same ranking: Case 3, Case 4, Case 1, Case 5, and Case 2.
Cases 1, 2, and 5 were satisfactory for modeling, as they had a more linear fit and behavior. On the other hand, Cases 3 and 4 varied in a way that made the fit more difficult. Thus, using Brazilian data to train a model and testing it with data from the repository is not suitable, and vice versa.
Concrete is a material with nonlinear characteristics, and explaining its behavior is a complex task. In this sense, using an ANN is a justified choice according to several authors [13,14,15,16,17,18,19,20,21,22,23], although the choice depends on the dataset used. In Yeh’s study [13], using four distinct models with varying inputs, RMSE values between 2 MPa and 4.5 MPa were obtained with the augment-neuron networks for both testing and training. This parameter assesses the accuracy of the network; by this measure, Cases 3 and 4 show low accuracy.
In another study, Yeh [17] also used an ANN to predict the strength of high- and low-strength concretes based on the variation of fly ash, obtaining, in the best case, an RMSE of 3.96 MPa (R² = 0.890) and 8.82 MPa (R² = 0.791) for training and testing data, respectively. By comparison, the testing of Case 3 alone yielded an RMSE of 9 MPa (R² = 0.43), once again reinforcing that the model did not perform well.
Therefore, the best case for both training and testing was Case 2, in which the RMSE was only 1.83 MPa and 4.20 MPa, respectively, with only the training value being acceptable in terms of statistical errors. This means that, in this case, the training predictions may deviate by about 1.83 MPa in either direction, with an R² of 0.99.
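For reference, the error measures discussed in this step can be computed as sketched below. The sketch assumes that R² is the coefficient of determination, 1 − SS_res/SS_tot; the text does not state which definition was used. The function name `regression_metrics` and the sample values are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, and R² for experimental vs. predicted strengths."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),  # same unit as the target, here MPa
        # Assumption: R² as the coefficient of determination.
        "r2": 1.0 - ss_res / ss_tot,
    }

# Made-up experimental and predicted strengths in MPa.
metrics = regression_metrics([30.0, 40.0, 50.0], [32.0, 38.0, 50.0])
print(metrics)
```

Because the RMSE is expressed in the unit of the target, it can be read directly as a typical prediction error in MPa, which is how the values above are interpreted.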
Figure 5 and Figure 6 show the histograms of each input for (a) DATA_BR2023 and (b) DATA_YEH98, with the measurement units of Table 3 on the y-axis and the frequency on the x-axis.
The histograms of the input data show a wide range of values, which were collected only at the ages of 7 and 28 days. Blast furnace slag and pozzolan exhibited similar behavior, likely because both were quantified approximately and because various types of cement were used. Similarly, the frequency curves of fine aggregate and water also show some resemblance.
In the DATA_YEH98 dataset, taken as the reference database, a greater variety was observed in the ages at which the specimens were tested, with the majority concentrated within 50 days. The compressive strengths achieved, reaching approximately 200 MPa, are twice as high as the highest values analyzed in the DATA_BR2023 dataset.
The experimental and predicted strengths for each of the cases are plotted in Figure 7.
An equality line is plotted as a dashed line in the graphs. It is important to note that the predicted values are close to the linear fit, confirming agreement with the experimental concrete compressive strength values, especially in Case 2, which uses only the DATA_YEH1998 dataset, and in Case 5, which uses all databases.
The charts in Figure 7 demonstrate that Case 2 is the best fit when compared to the others; the model analysis therefore concludes that it is the ideal one. However, this case only considers the Yeh (1998) database. Case 5 presents the second-best performance and considers both the DATA_BR2023 and DATA_YEH98 samples, with an RMSE of 2.62 in training and 6.01 in testing.
It is worth noting that Case 4, during the adjustments, was subdivided into three others to improve the fitting, justified by the variation in the cases, as 100% of the samples were used. To achieve this, the hyperparameters were manually modified: the number of neurons in the layer and the alpha factor, which was moved in the underfitting direction so that the model generalizes better to the test samples. The smaller the alpha, the more the model tends to overfit; the larger the alpha, the more it tends to underfit. These modifications explain the differences in the results of the R², MSE, and RMSE parameters, as well as the differences in the samples between the cases.
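The effect of alpha can be illustrated with a deliberately simplified stand-in. The sketch assumes that alpha is the strength of an L2 penalty on the weights, as in common ANN implementations; the text does not say so explicitly. Instead of a network, it fits a single linear weight, for which the penalized solution has a closed form. The function name `penalized_weight` and the data are hypothetical.

```python
def penalized_weight(x, y, alpha):
    """Weight w that minimizes sum((y - w*x)^2) + alpha * w^2 (closed form).

    A one-weight linear stand-in for a network's L2 penalty, not an ANN.
    """
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = sum(a * a for a in x) + alpha
    return numerator / denominator

# Made-up data that roughly follows y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

alphas = (0.0, 1.0, 10.0, 100.0)
weights = [penalized_weight(x, y, a) for a in alphas]
for a, w in zip(alphas, weights):
    print(f"alpha = {a:6.1f} -> w = {w:.3f}")
```

With alpha equal to zero the weight follows the data as closely as possible; as alpha grows, the weight is pulled toward zero and the fit becomes stiffer. This is the same trade-off, in miniature, as the one described for the network.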
Finally, the permutation importance analysis was performed for Cases 2 and 5, as they showed the best overall results. A total of 500 permutations were conducted for each input variable individually. This tool measures how much the RMSE increases when the values of a variable are shuffled, allowing us to understand which variables were more relevant for the models. The results are shown in Table 6 and Figure 8.
In Case 2, the order of relevance, from most to least relevant, was: age, cement, water, fine aggregate, blast furnace slag, coarse aggregate, and admixture. In Case 5, age and cement remained the two most relevant variables, but in a different order; as illustrated in Figure 8, the sequence was: cement, age, blast furnace slag, water, admixture, fine aggregate, and coarse aggregate.
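The permutation importance procedure can be sketched as follows. This is a generic illustration under stated assumptions, not the implementation used in the study: the model is a hypothetical toy function rather than the trained ANN, the data are random, and importance is taken as the mean increase in RMSE over the repeated shuffles of one input column.

```python
import math
import random

def rmse(model, X, y):
    """Root mean squared error of model predictions on rows X."""
    return math.sqrt(sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y))

def permutation_importance(model, X, y, n_repeats=500, seed=0):
    """Mean increase in RMSE when each input column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = rmse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        total_increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by shuffled values.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total_increase += rmse(model, X_perm, y) - baseline
        importances.append(total_increase / n_repeats)
    return importances

# Hypothetical toy model: strong effect of input 0, weak effect of input 1,
# no effect of input 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]

data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(50)]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
print([round(s, 3) for s in scores])
```

An input that the model relies on produces a large increase in error when shuffled, whereas an input the model ignores produces none, which is the basis for the rankings reported above.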