Article

Predicting Ice Phenomena in a River Using the Artificial Neural Network and Extreme Gradient Boosting

1
Department of Hydrology and Water Management, Institute of Physical Geography and Environmental Planning, Adam Mickiewicz University, 61-680 Poznań, Poland
2
Faculty of Civil and Environmental Engineering, Gdańsk University of Technology, 80-233 Gdańsk, Poland
3
College of Hydraulic Science and Engineering, Yangzhou University, Yangzhou 225009, China
*
Author to whom correspondence should be addressed.
Submission received: 22 November 2021 / Revised: 23 January 2022 / Accepted: 24 January 2022 / Published: 26 January 2022

Abstract

Forecasting ice phenomena in river systems is of great importance because these phenomena are a fundamental part of the hydrological regime. Due to the stochasticity of ice phenomena, their prediction is a difficult process, especially when data sets are sparse or incomplete. In this study, two machine learning models—Multilayer Perceptron Neural Network (MLPNN) and Extreme Gradient Boosting (XGBoost)—were developed to predict ice phenomena in the Warta River in Poland in a temperate climate zone. Observational data from eight river gauges during the period 1983–2013 were used. The performance of the model was evaluated using four model fit measures. The results showed that the choice of input variables influenced the accuracy of the developed models. The most important predictors were the nature of phenomena on the day before an observation, as well as water and air temperatures; river flow and water level were less important for predicting the formation of ice phenomena. The modeling results showed that both MLPNN and XGBoost provided promising results for the prediction of ice phenomena. The research results of the present study could also be useful for predicting ice phenomena in other regions.

1. Introduction

Prediction of ice phenomena in rivers is an important element of hydrological regime analysis [1] and the assessment of the risk of ice jam type floods [2]. The changing thermal conditions of river waters during the winter season and the nature of river ice may significantly change the hydro-ecological and socio-economic aspects of the functioning of the river ecosystem.
Due to the stochastic nature of ice phenomena, their prediction is difficult, especially when data sets for rivers are sparse or incomplete. An additional complication is the scale of the event (local and regional) and the influence of numerous factors on the process of river freezing: meteorological (e.g., air temperature, solar radiation, wind velocity) [3,4], hydrological (e.g., flow rate, inflow and outflow conditions) [5,6], hydraulic (e.g., channel cross-section geometry, river bathymetry, water surface slope) [9], and thermodynamic (e.g., water temperature and thermal conductivity) [10,11], together with the complexity of interactions between hydroclimatic factors [7,8]. Relations between river freezing and features of the hydrological regime, including flow, water level, and water temperature, are usually complex and non-linear, and are also spatially heterogeneous due to the variability of environmental conditions. In addition to the process that determines the number of occurrences of a given phenomenon, there is also a dichotomous process determining whether it has a chance of occurring in a given period [12]. The task is further complicated by the fact that ice phenomena occur in three phases: freezing of the river (first symptom of ice), permanent ice cover, and the disappearance phase, when ice floes are formed and related phenomena appear, such as ice jams, which often lead to winter floods. However, the full freezing cycle is not always recorded for rivers.
The analysis of time series relating to ice phenomena allows for the determination of the frequency and duration of their occurrence and the tendency of changes over time, and also for an assessment of the ice phases, which provides a good background for the characteristics of the freezing process in many regions. However, it is not sufficient for their prediction and forecasting [13]. Although Shulyakovskii [14] has developed a manual for forecasting the freezing of inland rivers and lakes, there are few studies related to this topic, especially works dealing with the prediction of ice phenomena at various stages of their occurrence. The problem most frequently discussed is the prediction of ice jams on rivers and their consequences in the form of ice jam floods. The theoretical model of river ice jams was developed by Uzuner and Kennedy [15]. Existing forecasts of ice extent are most often based on the location of the 0 °C isotherm [16]. Good results in this regard have also been obtained from observations of river ice ranges carried out with the use of satellites. Remote sensing is useful for the monitoring of ice characteristics, such as different types and thicknesses of ice or ice cover, and for tracking the progress of the breakup of ice jams, which can help predict the location and timing of ice blockages [17,18]. However, the results of field studies and analyses of satellite images do not always provide accurate data for forecasting ice and scenarios of changes in ice dynamics [19].
Prediction models for ice phenomena are usually limited to the empirical or the stochastic due to the difficulties in applying deterministic models. The methods used to predict ice phenomena (e.g., ice jams) include empirical single-variable threshold analyses, logistic regression [2,20], and discriminant function analysis [21]. Many numerical models have been developed to simulate ice formation on rivers [22,23]. According to Wang et al. [24] and Beltaos [25], a better understanding of physical processes has increased the possibility of developing more accurate numerical models of ice jams and ice jam floods in rivers, e.g., the public-domain river-ice RIVICE model [10], the DynaRice model, a two-dimensional coupled hydrodynamic and ice dynamic model [23], and hydraulic models [19]. An interesting ice jam flood forecasting system that considers requirements for the real-time predictions of water, ice, and sediment transport, was developed for the lower Odra River [26]. The prediction of ice phenomena was also carried out using teleconnection indices, as presented by Sutyrina [27] in relation to spring ice phenomena in lakes and reservoirs (including for Lake Baikal).
In the prediction of ice phenomena, machine learning methods are used less frequently, although they have already been utilized widely in forecasting time series of hydrometeorological data [28,29,30]. Artificial neural networks (ANNs) have been used to forecast freezing conditions in rivers [31,32] and to predict ice jams [4,33]. For example, Chokmani et al. [34] estimated the thickness of ice using ANNs, while Hu et al. [35] predicted the disappearance of ice phenomena using hybrid artificial neural networks. Furthermore, fuzzy logic systems have provided favorable results in forecasting ice phenomena and their effects [20,36]. Zhao [3] predicted the breakup date of flood ice using a wavelet neural network (WNN) model, whereas Yan and Ding [37] proposed a predictive model of ice formation based on a dynamic fuzzy neural network (D-FNN) in combination with a particle swarm optimization (PSO) algorithm. The significant advantage of artificial neural networks over standard statistical classification methods lies in their ability to adapt to data of different formats and configurations [32,34]. Ensemble machine learning methodologies, including resampling methods (bagging, boosting, and dagging), model averaging, and stacking, are used to solve problems related to simulation and prediction in hydrology [38,39].
The increase in the amount of hydrological and meteorological data makes it more difficult not only to select the methods for their analysis but also to choose predictive and prognostic models so as to maintain both their legibility and accuracy [40]. For the integrated management of an aquatic ecosystem, it is necessary to determine how the thermal–ice regime of the river will develop and change in the future, considering global climate change and local conditions in particular [41]. The identification of the most important hydrological and thermal variables influencing the course of ice phenomena in a river may result in more accurate forecasts of the freezing process.
The main goal of the present study is to predict ice phenomena in a river with the use of the Multilayer Perceptron Neural Network (MLPNN) and Extreme Gradient Boosting (XGBoost) algorithms, which belong to the group of machine learning methods. MLPNN is one of the most widely used ANN models in the field of hydrology [4,7,8]. According to Zounemat-Kermani et al. [42], the boosting methods (e.g., boosting, AdaBoost, and Extreme Gradient Boosting) are becoming more and more effective for modeling and forecasting water quality, runoff, sediment transport, groundwater, flooding, and drought. One of the advantages of XGBoost compared to neural networks is the ability to assess the importance of predictors in the model, and in this study, by employing the XGBoost model, we can assess the dominant factors controlling the dynamics of ice phenomena in the studied river. The objective of the present research was to show the predictability of the selected models and explain spatial differences in terms of the predictors: air temperature (Ta), water temperature (Tw), water level (H), and river flow (Q), as well as the ice phenomenon of the previous day and of the month of occurrence of the phenomenon. The predictions will be carried out using the example of the Warta River in Poland (Central Europe), which is a river of great economic significance and considerable natural value. The results of the study are important for determining the range of intensification of thermal and hydrological ice phenomena variables and the conditions under which their reduction will occur.

2. Study Area

The Warta River is a tributary of the Odra River and the third longest (808 km) river in Poland (Figure 1). Its catchment (54,500 km²) is characterized by a significant diversity of topography and terrain as well as of climatic and hydrological conditions [43]. Within the Warta Water Region there are three main types of relief: old-glacial in the southern part, young-glacial in the northern and central parts, and upland in the area south of Wielun.
The catchment belongs to nine out of 28 climatic regions designated in Poland by Woś [44]. The average annual air temperature ranges from 7.5 °C in the north to 8.5 °C in the west. In the coldest month, January, the average temperature ranges from −1.2 °C (in the west) to −2.5 °C (in the southeast). Annual rainfall totals in the study area are diverse and range from 520 mm in the Kujawy region (in the northeast) to 675 mm in the south. A regional differentiation of features of the hydrological regime has been observed along the analyzed section of the Warta [45]: from a medium-developed nival regime (the upper and lower course of the river) to a highly developed nival regime (along the section from Nowa Wies (Nowa Wieś Podgórna) to Poznan) (Figure 1). The average dates of appearance of ice phenomena on the Warta River, as well as the dates of their disappearance, vary. Research by Graf et al. [46] for the 1991–2010 observation series showed that the earliest ice phenomena occurred in the third decade (ten-day period) of December (about 45% of the total number of observations) and the latest in the first decade of January. The disappearance of ice phenomena is usually observed from the end of January to the end of March [47], with about 30% of observations falling in the third decade of February. Most days with ice phenomena on the Warta River are in January (41% of observations), and the most common forms of ice are frazil ice (46% of phenomena) and ice cover (30%).

3. Materials and Methods

Predictions of ice phenomena and their numerical descriptions were performed based on daily data on the number of occurrences (the number of days on which the phenomena were observed) and the nature of ice phenomena, and on air temperature (Ta), water temperature (Tw), water levels (H), and river flow (Q) for the years 1983–2013 from the Central Database of Historical Data of the Institute of Meteorology and Water Management, National Research Institute (IMGW-PIB) in Warsaw, Poland (Figure 1). The observation series covers the period after 1980, when changes in water temperature in rivers and further consequences, including the lower incidence of ice phenomena, were revealed in response to the sudden climate change associated with a climate regime shift (CRS). The regime shift of the late 1980s is a well-documented example of a CRS in Poland [48].
Use was made of data from eight water gauges on the Warta River (Bobry, Sieradz, Uniejow, Nowa Wies (Nowa Wieś Podgórna), Srem, Poznan, Skwierzyna, and Gorzow Wielkopolski) and seven meteorological stations (Wielun, Sieradz, Koło, Słupca, Kornik, Poznan, and Gorzow Wielkopolski). Data have been presented in relation to the hydrological year, which in Poland lasts from 1 November until 31 October.

3.1. Classification of Ice Phenomena

The full ice cycle of the river includes the forms of ice phenomena observed within the IMGW-PIB water gauge network: frazil ice, border ice, border ice and frazil ice, frazil ice jam, ice cover, ice floes, ice floes and border ice, ice floes and frazil ice, and ice jams. For modeling and predicting ice phenomena, these were grouped into three basic categories: (1) river freeze-up, (2) stable ice cover, and (3) breakup of ice cover, i.e., the disappearance of ice (Table 1). The grouping into classes is not arbitrary; it follows from the order in which ice phenomena appear on the river depending on the thermal and hydrological conditions of the winter season. Each observed ice phenomenon was assigned to its date of occurrence (month and year).
In each of these phases, characteristic fluvial processes occur as a result of the appearance of various forms of ice. Strictly defined forms, such as frazil ice jams or ice jams, are ephemeral. According to data from the IMGW-PIB, in the analyzed period the Warta River had only a few days with frazil ice jams and ice jams, which accounted for 0.1% of all observed ice phenomena. Frazil ice jams occurred on the Warta River in Skwierzyna (three days with jams) and Poznan (one day with jams). Ice jams on the Warta occurred in Srem (three days with jams) and Uniejow (two days with jams).
It should be emphasized that ice jams are not a common event in the Warta River. They are among the characteristic features of the river’s morphology, which makes the reaches of the river susceptible to the formation of frazil ice [46]. In addition, climatic change significantly decreases the intensity of ice jams by increasing the temperature in the vicinity of supercooled water and thereby inhibiting the formation of ice jams. Furthermore, the Warta River is strongly impacted by anthropogenic activity. Regarding its morphology, the Warta River has bed slopes ranging from mild to steep, which follow each other in a way that affects the pattern of ice formation. Surface ice is observed mainly over the milder sloped beds. Less surface ice is observed over the steeper slopes, while more suspended frazil particles are present. These ice particles may accumulate under the ice cover of the following flat sections, forming hanging dams at the inlet of the low-sloped sector. There are no specific data on hanging dams for the Warta River that would allow us to distinguish between the ice cover itself and frazil depositions with a greater degree of certainty.

3.2. Data Preparation

Predictions of ice phenomena were performed based on daily data on the number of occurrences (number of days with the phenomena) and the nature of ice phenomena and on air temperature (Ta), water temperature (Tw), water levels (H), and river flow (Q), as well as ice phenomena of the previous day (the day before occurrences of ice phenomena from classes 1, 2, or 3, or from class “none”) and the month of individual phenomena (the six months of the hydrological winter half-year, November–April). The choice of input variables was not accidental: the hydroclimatic factors and thermal conditions are important predictors for the process of ice phenomena formation.
To improve the predictability of the tested models and accelerate the process of simulation convergence (in particular as regards artificial neural networks (ANNs)), before inputting the data into the Multilayer Perceptron Neural Network (MLPNN) the variables were normalized by converting their values into standardized values (so-called Z-scores) [29]:
$$z_i = \frac{x_i - \bar{x}}{sd_x}$$
where $x_i$ is the i-th value of x, $\bar{x}$ is the mean of x, and $sd_x$ is the standard deviation of x. The resulting variable z has a mean of zero and a standard deviation of one while retaining all the properties of the original variable. The following variables were transformed: Ta, Tw, H, Q, the day of the month (mon.), and the year (Y).
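The standardization above can be illustrated with a minimal sketch. The study itself was carried out in R; Python is used here only for illustration. The function name `z_scores`, the sample values, and the use of the sample standard deviation (n − 1 denominator) are assumptions, not details given in the text.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert values to Z-scores: subtract the mean, divide by the standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # assumption: sample standard deviation (n - 1 denominator)
    if spread == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - centre) / spread for v in values]

# Hypothetical daily water temperatures in deg C (not data from the study)
tw = [0.2, 0.8, 1.5, 3.0, 4.5]
tw_z = z_scores(tw)
```

After the transformation, `tw_z` has a mean of zero and a standard deviation of one, while the ordering and relative spacing of the original values are preserved.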
Additionally, ice phenomena of the previous day, encoded in four columns with the one-hot method (zero for the absence of a given phenomenon, and one when it occurred), and the month of individual phenomena, also encoded with the one-hot method, were introduced into the models. Encoding ice phenomena using the one-hot method with the addition of labels allows individual characteristics to be assigned to specific phenomena, showing the similarities between phenomena or the features that make them different. The ice phenomena data used are treated as categorical variables (also called nominal variables), that is, they represent types of data that can be broken down into groups. In the tested example, three classes were distinguished (Table 1). However, the categories cannot be ordered from highest to lowest. In classification methods, such variables, when used as target variables (the ones we want to predict), are usually converted to numerical form using one-hot encoding.
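A minimal sketch of the one-hot encoding described above, written in Python for illustration only. The helper names `one_hot` and `encode_day`, the numeric class labels 0–3, and the ordering of the months (November first) are assumptions.

```python
def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]

ICE_CLASSES = [0, 1, 2, 3]              # 0 = no ice, 1-3 = classes from Table 1
WINTER_MONTHS = [11, 12, 1, 2, 3, 4]    # assumed order: November to April

def encode_day(prev_class, month):
    """Four columns for the previous-day class followed by six month columns."""
    return one_hot(prev_class, ICE_CLASSES) + one_hot(month, WINTER_MONTHS)

# Previous day had stable ice cover (class 2); the observation is from January
row = encode_day(prev_class=2, month=1)
```

Each categorical value becomes its own 0/1 column, so no artificial ordering between the classes or months is implied.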

3.3. Descriptive Statistics of the Frequency of Ice Phenomena

The research methodology included several stages. The development of predictive models for ice phenomena on the river was preceded by a statistical description of ice phenomena and their changes in the studied period.
The statistical description of ice phenomena included an analysis of the frequency of ice phenomena in the set of analyzed data, which was determined separately for each measuring station, assuming the classification of ice phenomena into three classes presented in Table 1. The next stage concerned the analysis of ice phenomena as sequential phenomena. For this purpose, cross-tables were made comparing ice phenomena from the current day with phenomena from the previous day. In the last step, the relationships between the classes of ice phenomena and air temperature, water temperature, water level, and river flow were analyzed. For this purpose, box and violin plots were made for the distribution of these parameters for each class of ice phenomenon. The violin plot is a combination of a box plot and a density plot, and thus shows more details of the data distribution, especially the kernel density distribution [49]. This avoids the problem of overlapping curves in a traditional density plot, which are difficult to tell apart. Wider sections of the graph signify a higher probability of occurrence of certain values, while narrower sections denote a lower probability. According to Hintze and Nelson [50], the violin plot is used to visualize quantitative and qualitative data, including data that do not conform to the normal distribution, and to define the data structure. Like box plots, violin plots are used to compare the distribution of a variable (or sample) across different categories.
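The cross-tables of current-day against previous-day phenomena mentioned above can be sketched as follows. This is an illustrative Python sketch, not the authors' R code; the function name `transition_table`, the class labels 0–3, and the example series are hypothetical, and the series is assumed to contain consecutive days without gaps.

```python
from collections import Counter

def transition_table(daily_classes, classes=(0, 1, 2, 3)):
    """Count how often each previous-day class is followed by each current-day class."""
    pairs = Counter(zip(daily_classes[:-1], daily_classes[1:]))
    return {prev: {cur: pairs[(prev, cur)] for cur in classes} for prev in classes}

# Hypothetical sequence of daily ice classes (0 = no ice)
series = [0, 0, 1, 1, 2, 2, 2, 3, 0]
table = transition_table(series)
```

Here `table[prev][cur]` is the number of days of class `cur` that were preceded by a day of class `prev`, so large diagonal entries indicate that a class tends to follow itself.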

3.4. Prediction Models

Figure 2 depicts the stages of the research in brief. The research was carried out in three stages. In the first stage, the data were cleaned, standardized, and adapted to the needs of the models; the result is referred to as prepared data. The second stage was to use R tools to test model predictions using both the XGBoost and MLPNN methods. In the prediction of ice phenomena, the following formula was used (according to the choice of the MLPNN and XGBoost algorithms):
ice_0 + ice_1 + ice_2 + ice_3 ~
Ta + Tw + Q + H + D + Y +
day_before0 + day_before1 + day_before2 + day_before3 +
mo1 + mo2 + mo3 + mo4 + mo5 + mo6
where ice_0 means no ice phenomena, ice_1–3 are the classes of ice phenomena (classes 1–3 adopted on the basis of the classification and grouping of ice phenomena presented in Table 1), Ta is air temperature, Tw is water temperature, Q is river flow, H is water level, D is day of the month, Y is day of the year, day_before0 indicates that the previous day had no ice phenomena (class “none”), day_before1–3 indicate that the previous day had ice phenomena from classes 1, 2, or 3, and mo1–6 are the months of the winter half-year (November–April, according to the hydrological year).
The training and test sets were created using the stratified sampling algorithm, with the year and month variables functioning as strata. The process of determining the datasets is detailed in Section 3.5, Evaluating the Predictions. The confusion matrices were produced in the second stage and were then used as inputs for the third stage, which involved evaluating the performance of the XGBoost and MLPNN methods.
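One way such a stratified split by year and month might look is sketched below in Python, for illustration only. The function name `stratified_split`, the record layout, the fixed seed, and the rounding rule for the cut point are assumptions; the text does not specify how the authors implemented the sampling in R.

```python
import random
from collections import defaultdict

def stratified_split(records, train_fraction, seed=0):
    """Split records so every (year, month) stratum contributes the same share to training."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[(rec["year"], rec["month"])].append(rec)
    train, test = [], []
    for key in sorted(strata):
        group = strata[key][:]      # copy so the caller's data is not reordered
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test

# Hypothetical records: 10 days in each of 4 (year, month) strata
records = [{"year": y, "month": m, "day": d}
           for y in (1983, 1984) for m in (12, 1) for d in range(1, 11)]
train, test = stratified_split(records, train_fraction=0.7)
```

Because the split is made inside each stratum, every year and month is represented in both sets in roughly the same proportion.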

3.4.1. The Multilayer Perceptron Neural Network (MLPNN)

The most commonly used type of neural network is the multilayer perceptron. In this network, the signal is passed in one direction only, from input to output, without loops, and no neuron acts on itself. This architecture is referred to as feed-forward, and consists of multiple inputs, hidden layers, and an output, as shown in Figure 3.
The first model used to predict ice phenomena was the Multilayer Perceptron Neural Network (MLPNN), which included an input layer, one hidden layer, and an output layer, and is one of the most widely used ANN models in the field of hydrology [4,7,8]. The input layer, which comprises the predictors, does not perform any calculations. The hidden layer is made of artificial neurons. A single hidden neuron ‘collects’ activations from each neuron of the input layer and calculates the weighted sum of the input variables. Each hidden layer neuron is connected to each input layer neuron. The hidden layer neurons then perform a non-linear transformation of the weighted sums using an activating function and pass the results to the output layer, which in this application is represented by ice phenomena. A neural network of this type with an output variable Y and containing n neurons in the hidden layer can be expressed as follows [51]:
$$Y = f_2\left(\sum_{j=1}^{n} w_{jk}\, f_1\left(\sum_{i} x_i w_{ij} + \delta_j\right) + \delta_0\right)$$
where $x_i$ is the value of the input variable i, $w_{ij}$ is the weight (synapse) between the input variable i and the hidden neuron j, $\delta_j$ is the bias of the hidden neuron j, $f_1$ is the sigmoidal function constituting the activation function for hidden neurons, $w_{jk}$ is the synapse between the hidden neuron j and the output neuron k (here k = 4), $f_2$ is also the sigmoid activation function, and $\delta_0$ is the bias of the output layer neuron. The inner sum runs over the input variables i and the outer sum over the n hidden neurons j. The use of the sigmoidal function as the activation function for the neurons of the output layer ensures that each prediction obtained from the model lies between 0 and 1.
To estimate the weights and biases, the neuralnet package [52] and its implementation of resilient backpropagation [53] were used. Cross entropy was used as the cost function. Models with three, four, five, and six neurons in the hidden layer were calculated for each station.
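The forward pass of such a network can be sketched as follows. This is an illustrative Python sketch, not the R package used in the study: the function names, the layer sizes, the zero-valued weights, and the use of a separate bias per output neuron are assumptions, and the cross-entropy form shown (binary cross entropy summed over the outputs) is one common definition rather than a confirmed detail of the authors' setup. Weight estimation itself is not shown.

```python
import math

def sigmoid(a):
    """Logistic activation, used here for both the hidden and the output layer."""
    return 1.0 / (1.0 + math.exp(-a))

def mlp_forward(x, w_in, bias_hidden, w_out, bias_out):
    """Forward pass of a perceptron with one hidden layer.

    w_in[i][j]  : weight between input i and hidden neuron j
    w_out[j][k] : weight between hidden neuron j and output neuron k
    """
    hidden = [
        sigmoid(sum(x[i] * w_in[i][j] for i in range(len(x))) + bias_hidden[j])
        for j in range(len(bias_hidden))
    ]
    return [
        sigmoid(sum(hidden[j] * w_out[j][k] for j in range(len(hidden))) + bias_out[k])
        for k in range(len(bias_out))
    ]

def cross_entropy(target, output):
    """Binary cross entropy summed over the output neurons (assumed form)."""
    return -sum(t * math.log(o) + (1 - t) * math.log(1 - o)
                for t, o in zip(target, output))

# Hypothetical sizes: 2 standardized inputs, 3 hidden neurons, 4 outputs (ice_0..ice_3)
x = [-1.2, 0.4]
w_in = [[0.0] * 3 for _ in range(2)]
bias_hidden = [0.0] * 3
w_out = [[0.0] * 4 for _ in range(3)]
bias_out = [0.0] * 4
outputs = mlp_forward(x, w_in, bias_hidden, w_out, bias_out)
loss = cross_entropy([0, 1, 0, 0], outputs)
```

With all weights and biases set to zero, every output equals 0.5, which makes the untrained network's indifference between the classes easy to see; training adjusts the weights to lower the loss.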

3.4.2. The Extreme Gradient Boosting (XGBoost) Model

The second model tested was Extreme Gradient Boosting (XGBoost) as implemented by Chen et al. [54], here in the form of the XGBoost library for the R platform. The gradient boosting machine is an ensemble learning technique based on decision trees. A decision tree generates an estimate of the output variable based on optimized predictor thresholds that divide the data into multiple groups. In each subsequent step, the gradient boosting algorithm aims to reduce the prediction error of the previous step. Technically, in each subsequent step the algorithm estimates the parameters of a model whose purpose is to predict the residuals (prediction errors) of the model estimated in the previous step. The objective function (J) in round t (step t) is given by Equation (4) [54]:
$$J_t = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right)$$
where l is the training loss, Ω is the regularization term, and $f_k$ is the function of the k-th of the K trees. In this study, $y_i$ is the observed ice phenomenon and $\hat{y}_i$ is the final predicted value.
In the present study, decision trees with a maximum depth of five nodes were used. Formally, a tree is any connected acyclic graph, i.e., a connected graph that does not contain cycles. The multi-class log loss function was used as the cost function. Predictions of $Y_t$ from the model for iteration t are obtained from Equation (5) [39,54]:
$$Y_t = \sum_{k=1}^{t} f_k(X) = Y_{t-1} + f_t(X)$$
where X is the predictor or set of predictors and $f_k$ is the function that returns predicted values from the predictors. The second part of the equation shows explicitly that the prediction of the algorithm in iteration t is the sum of the predictions from iteration t − 1 and the new predictions from iteration t. In XGBoost, the function $f_k$ consists of classification and regression trees that enable the modeling of arbitrary nonlinear relations and the prediction of variables of any nature (Figure 4).
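The additive update in Equation (5) can be illustrated with a deliberately simplified sketch in Python. It is not XGBoost: it fits one-split trees (stumps) to the residuals of a squared-error loss on a single predictor, without regularization. The function names, the learning rate (treated here as part of each $f_t$), and the toy data are assumptions.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:   # splits that leave both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds, learning_rate=0.5):
    """Additive model: Y_t = Y_{t-1} + f_t(X), each f_t fitted to current residuals."""
    preds = [0.0] * len(xs)
    trees = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
    return trees, preds

def sse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds))

# Hypothetical data: air temperature (deg C) and an ice indicator (1 = ice observed)
ta = [-6, -3, -1, 0, 2, 5]
ice = [1, 1, 1, 0, 0, 0]
_, preds_1 = boost(ta, ice, rounds=1)
_, preds_3 = boost(ta, ice, rounds=3)
```

Each round adds a new tree's output to the previous predictions, so the error on the training data shrinks as rounds accumulate; XGBoost follows the same additive scheme but uses gradient information from the chosen loss and adds the regularization term Ω.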
One of the advantages of XGBoost compared to neural networks is the ability to assess the importance of predictors in the model. The importance of a predictor for regression and classification trees in the gradient boosting algorithm is defined as the gain that the predictor contributes to the entire model when it is used to create successive branches of the trees. In this study, by employing the XGBoost model, we can assess the dominant factors controlling the dynamics of ice phenomena in the studied river.

3.5. Evaluating the Predictions

To assess the predictive power of the tested models, cross-validation and four goodness-of-fit metrics were used. Cross-validation was performed by training the models on part of the available data (training data) and then calculating predictions and goodness-of-fit metrics for the data on which the algorithms were not trained (test data). The XGBoost model was trained on 70% of the data and tested on the remaining 30%. The ANN model was trained on the first 50% of the sample and tested on the remaining 50%. The test and training sets were created using the stratified sampling algorithm, with the year and month variables used as strata. This was done so that the training and test sets had a comparable number of observations within each year and month included in the analysis. Such divisions are in line with the general practice of evaluating machine learning algorithms [51].
Four metrics were used as goodness-of-fit measures, calculated separately for each class of ice phenomena: sensitivity, specificity, precision, and balanced accuracy [55]. For ease of interpretation of these statistics, consider the cross table (Table 2), where the letters A–D represent the counts; as the formulas below imply, A is the number of true positives, B of false positives, C of false negatives, and D of true negatives.
The statistics used are defined by the formulas:
Sensitivity = A/(A + C)
Specificity = D/(D + B)
Precision = A/(A + B)
Balanced Accuracy = (Sensitivity + Specificity)/2
Sensitivity, or the True-Positive Rate (TPR), is a measure of coverage (“reach”) that indicates the percentage of the positive class that has been covered by a positive prediction [56]. Specificity, or the True-Negative Rate (TNR), is the corresponding measure for the negative class: the percentage of the negative class covered by a negative prediction. Theoretically, sensitivity and specificity are independent measures; in practice, however, increasing sensitivity often leads to a decrease in specificity [55]. Precision, also referred to as the Positive Predictive Value (PPV), indicates how far positive predictions can be trusted, i.e., the percentage of positive predictions that are truly positive. The confidence interval for these three measures is built using the Clopper–Pearson method for a single proportion. Accuracy is the proportion of correct predictions on the test data, i.e., the ratio of the number of correct predictions to the total number of input samples. Balanced Accuracy, in turn, is the arithmetic mean of sensitivity and specificity, that is, of the recall of the positive and the negative class. The closer the value is to 1, the better the prediction. However, a value of exactly 1 may indicate a problem typically labeled as over-fitting. For highly unbalanced classification problems, as in the case of the analyzed data, balanced accuracy is particularly useful, because this statistic depends on both the level of correct prediction of a phenomenon and the level of correct prediction of its absence.
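The four statistics can be computed per class from a multi-class confusion matrix by treating one class as positive and all others as negative. The sketch below is in Python for illustration; the function names, the matrix orientation (rows are predictions, columns are observations), and the counts are hypothetical and not results from the study. Confidence intervals are not computed, and a class with no positive cases or no positive predictions would cause a division by zero.

```python
def one_vs_rest_counts(confusion, cls):
    """Collapse a multi-class confusion matrix into the counts A, B, C, D for one class.

    confusion[p][o] = number of days predicted as class p and observed as class o.
    """
    total = sum(sum(row) for row in confusion)
    a = confusion[cls][cls]                      # predicted cls, observed cls
    b = sum(confusion[cls]) - a                  # predicted cls, observed other
    c = sum(row[cls] for row in confusion) - a   # predicted other, observed cls
    d = total - a - b - c                        # predicted other, observed other
    return a, b, c, d

def class_metrics(a, b, c, d):
    """Sensitivity, specificity, precision, and balanced accuracy from A-D."""
    sensitivity = a / (a + c)
    specificity = d / (d + b)
    precision = a / (a + b)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }

# Hypothetical 4-class confusion matrix (classes 0-3), not results from the study
confusion = [
    [80, 5, 1, 1],
    [4, 12, 2, 1],
    [1, 3, 8, 0],
    [0, 0, 1, 1],
]
counts = one_vs_rest_counts(confusion, cls=1)
metrics = class_metrics(*counts)
```

For class 1 in this toy matrix the negative class is much larger than the positive one, which is why balanced accuracy, rather than plain accuracy, is the more informative summary.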
Data analyses and operations were performed using the R 4.0.2 statistical environment [57]. The analyses and the necessary data restructuring, as well as the visualization of the data and of the results, were performed using the basic functions of the R environment and libraries dedicated to a given type of algorithm. The libraries used are cited in the corresponding analysis.

4. Results

4.1. Probability of Occurrence of Ice Phenomena

The frequency of ice phenomena on the Warta River in the analyzed period is presented in Table 3. At the majority of measuring stations, ice phenomena from class 1 were observed on slightly more than 10% of days, while the frequency of occurrence of phenomena from class 2 varied from about 1.5% to over 8%. Ice phenomena from class 3 (breakup of ice cover, disappearance of freezing) were the least frequently observed. At each measuring station, this class was observed on less than 1% of days.
The probability of occurrence of ice phenomena in specific months of the year (in the cold half of the hydrological year) is presented in Figure 5. The probability of occurrence of ice phenomena from class 1 is highest in December and January. Class 2 events are most likely to occur in January and February, whereas the greatest probability of the breakup of ice cover and disappearance of freezing (class 3) is associated with January, except in Sieradz, where it falls in February, and in Poznan, where it falls in March.
The results of the analysis of ice phenomena as sequential phenomena are presented in Figure 6. The cross tables compare ice phenomena from the current day with the phenomena of the previous day. Each class was most often preceded by a phenomenon from the same class. Additionally, there was often no ice at all at the river stations on the day before the occurrence of ice phenomena from class 1. On a small percentage of days, ice phenomena from class 1 preceded class 2 events. Class 3 occurrences were regularly preceded by phenomena from classes 1 and 2 (Figure 6).

4.2. The Relationship between Ice Phenomena and Hydrological Conditions and Thermal Variables

The assessment of the relationship between various classes of ice phenomena and thermal conditions and hydrological factors has been presented in the form of violin plots of the distribution of these parameters for each class of the phenomenon. Figure 7 shows the differentiation of the variables with respect to the water gauges. For each water gauge station on the Warta River, the differentiation of the occurrence of ice phenomena in relation to air (Ta) and water (Tw) temperatures as well as water level (H) and flow (Q) was presented. As drawn, the graphs indicate certain regularities of occurrence of ice phenomena on the river.
The phenomena from the first stage of ice (border ice, frazil ice) are characteristic of the conditions of poor cooling of the water and mild flow, i.e., for the months of November and December. Although frazil ice requires a significant subcooling of the water and an effective dissipation of the heat of solidification, it can form particularly abundantly during strong, cold winds, even if the air temperature drop is insignificant (even at a few degrees below 0 °C). The analysis of the data showed that the ice phenomena from the first phase occur even at a water temperature of the Warta River of 0.2–0.8 °C, and ice cover is maintained at a water temperature of 0.2 °C and at negative air temperatures, which is understandable. In the case of lowland rivers, which also include the Warta River, the ice cover expansion phase, due to low flows and falls, lasts the longest, and its formation is favored by the persistence of negative air temperatures for a long time [46]. The period of ice cover disappearance as a result of an increase in air temperature occurs on the river in stages, as a result of which an ice floe is created that moves downstream (ice procession). The flow of ice floes in the river usually accelerates the cracking of the ice cover caused by the rise in the water level in the spring.
The distribution of ice phenomena with respect to water temperature has a distinct character. In this case, the distribution for phenomena classes 1–3 is unimodal and has one high “peak” at very low water temperatures, which indicates the typical regularity of the occurrence of the first ice on the river. For the ice cover, the distribution is partly skewed with a long tail, which can be seen in the graph for the water gauges of Uniejow, Nowa Wies, and Srem (Figure 7).
As regards the relationship between ice phenomena and air temperature, distribution becomes more diverse depending on the class of the phenomenon and the location of the observation post. For class 1, the distribution is predominantly unimodal. For the majority of measuring stations, distribution is asymmetric and has features of skewed distribution. At the Sieradz and Skwierzyna stations, this distribution shows a tendency to bimodality, which would suggest the presence of two characteristic periods of air temperature and thus favor an increase in the probability of occurrence of ice phenomena from the first phase in these locations. In the case of class 2 (permanent ice cover), distributions at almost all stations are unimodal with a clear skew towards very low air temperatures, which strongly suggests that the probability of ice cover is related to the accumulation of days with negative air temperature. The exception is the Poznan water gauge, for which distribution has features similar to the bimodal distribution (Figure 7). For class 3 (breakdown of the ice cover), the distribution has features typical of unimodal distribution and is focused on an air temperature ranging from 0 °C to a few degrees above zero. This form of relationship is typical of most water gauges on the Warta River. Finally, as regards the water gauges in Nowa Wies and Skwierzyna, the skewness of the distribution increases, and this points to an increase in outliers.
The distribution of ice phenomena from class 1 (formation of ice phenomena) with respect to the water level displayed predominantly bimodal features (Figure 7). In the case of the Bobry and Sieradz stations, the distribution is asymmetric and skewed. In the case of distributions with two or more modes, the widest sections of the violin diagram indicate the greatest probability of observing ice phenomena on the river at a low and medium water level. However, additional periods with a specific water level on the Warta River (states above the average) at which ice phenomena will occur under favorable river thermal conditions are not excluded. The bimodal distribution indicates that the distribution of ice phenomena in this relationship is unstable or very variable. The distribution shows similar features as regards the relation between the ice cover (class 2) and the water level, which is also bimodal at the majority of water gauges (Figure 7). The exceptions here are Sieradz and Poznan, for which a unimodal distribution with a specific skewness has been identified. For ice phenomena from class 3, the relationship with the water level shows different types of distributions: unimodal (Poznan and Gorzow Wlkp.) and bimodal (Nowa Wies and Skwierzyna). In the case of Sieradz, the distribution is flat, while in Gorzow Wlkp. it is strongly skewed.
The distribution of the relationship between ice phenomena from class 1 and river flow is bimodal at Nowa Wies, Poznan, and Skwierzyna and unimodal at other stations (Figure 7). In the case of Uniejow, Srem, and Gorzow Wlkp., distribution is also strongly skewed. The greatest probability of occurrence of ice phenomena in the initial period of the Warta River’s freezing is associated with the low flow of the river. In the case of permanent ice cover (class 2), the distribution is unimodal at all water gauges, except for Nowa Wies, where it exhibits features of bimodality. This means that the distribution of ice in this relationship is relatively stable along the entire river. However, as regards ice phenomena from class 3, the distribution at certain stations has unimodal (Uniejow, Srem, and Gorzow Wlkp.) or bimodal (Sieradz, Nowa Wies, and Skwierzyna) features.

4.3. Predicting Ice Phenomena

The results of predictive modeling are presented for three sections of the Warta River: the upper course (Bobry, Sieradz, and Uniejow water gauges; Table 4), the middle course (Nowa Wies, Srem, and Poznan water gauges; Table 5), and the lower course (Skwierzyna and Gorzow Wlkp. water gauges; Table 6). In most of the analyzed cases, the predictive power of the tested models was comparable, and the differences in metrics between models were negligible.
In the upper section of the Warta River (Bobry station), the MLPNN with four hidden units (NN4) was the best among the models, as indicated by the highest values of “balanced accuracy” (BA) statistics for ice phenomena from class 2 (BA = 0.971), and for the “no ice phenomena” class (BA = 0.933), and the second-highest value of statistics for class 1 (BA = 0.913) in the test set (Table 4). The XGBoost model predicted ice phenomena to a comparable extent. It exhibited a similar “balanced accuracy” profile, but one slightly inferior to NN models. Class 3 was too small in terms of abundance for the model to successfully learn the relationship between the class and the predictors in this dataset. For the Sieradz station, it is difficult to identify the model with the highest predictive power (Table 4). The XGBoost model and the NN3–NN5 models successfully predicted each class of ice phenomena. From the NN models, the model with four hidden units was the most sensitive to the rarely occurring class 3 (balanced accuracy BA = 0.76), while predictions for class 1 were less accurate (BA = 0.834). Among all the models used in the work, the XGBoost model best predicted ice phenomena from class 1 (BA = 0.923) but demonstrated the weakest prediction of phenomena from classes 2 (BA = 0.954) and 3 (BA = 0.64) (Table 4). The predictive power of the tested models for the Uniejow station was different depending on the class of phenomena (Table 4). At this location, the XGBoost model achieved the highest values of balanced accuracy in the test set for ice phenomena from classes 1 (BA = 0.951) and 2 (BA = 0.984). At the same time, the NN5 model was the only one to correctly predict ice phenomena from class 3 (BA = 0.998).
In the middle course of the Warta River, for the Nowa Wies water gauge, the NN5 model turned out to be the best at predicting ice phenomena (Table 5). This model performed well for ice phenomena from classes 1 (BA = 0.906) and 2 (BA = 0.949), comparable with other models, while at the same time being the most sensitive for class 3 (BA = 0.623). However, the best performance in predicting class 1 events was achieved by the NN6 model (BA = 0.936), and the best performance for class 2 by the NN3 model (BA = 0.965). The XGBoost model achieved similar performance to the NN5 model as regards the prediction of phenomena from class 2 (BA = 0.945). In the case of the Srem water gauge, it is difficult to indicate the best model (Table 5). A neural network model NN5 best predicted the ice phenomena from class 3 (BA = 0.749). The NN6 model showed the best prediction for class 2 (BA = 0.993), and class 1 events were best predicted by XGBoost (BA = 0.946). Nevertheless, it is the neural network model with five hidden units (NN5) that seems to have the most balanced prediction profile for all classes of ice phenomena. For the Poznan water gauge (Table 5), the predictive power of the tested models was comparable. The NN3 model can be viewed as the best for predicting ice phenomena at this location because it predicted classes 1 (BA = 0.933) and 2 (BA = 0.958) best and was the third most effective in predicting the level of class 3 events (BA = 0.706). At the same time, the NN5 model turned out to be the best for ice phenomena from class 3 (BA = 0.759).
For the lower course of the Warta River, in the Skwierzyna profile, the predictive power of the tested models was comparable (Table 6). The NN3 model appears to present the most balanced predictive profile. This model best predicted the occurrence of ice phenomena from class 1 (BA = 0.958) and also the absence of river freezing (BA = 0.973). The best prediction of ice phenomena from class 2 was achieved by the NN6 model (BA = 0.982), with the results of the XGBoost model being comparable (BA = 0.98). The NN3 model also displayed good predictability of phenomena from classes 2 (BA = 0.965) and 3 (BA = 0.692). Its class 3 prediction performance is comparable to that of the NN4 model, for which BA = 0.694 was determined. For the Gorzow Wlkp. water gauge, one of the better predictive models for ice phenomena was the NN4 model (BA = 0.932 for class 1, BA = 0.98 for class 2) (Table 6). The NN5 model predicted classes 1 and 2 comparably to the other models, and at the same time was the most sensitive in terms of predicting ice phenomena from class 3 (BA = 0.665). The XGBoost model predicted the phenomena from class 2 best (BA = 0.987), similarly to the NN3 model (BA = 0.983).
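Since every comparison above rests on per-class balanced accuracy, a short worked example may help in reading the BA values. The sketch assumes the common one-vs-rest definition, in which BA for a class is the mean of sensitivity and specificity; the function name `balanced_accuracy` and the toy labels are hypothetical and are not taken from the study's data.

```python
def balanced_accuracy(y_true, y_pred, positive):
    """One-vs-rest balanced accuracy for the class `positive`.

    Computed as (sensitivity + specificity) / 2, where every other class
    is treated as the negative class.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("class must be both present and absent in y_true")
    sensitivity = tp / (tp + fn)  # share of true class days that were detected
    specificity = tn / (tn + fp)  # share of other days correctly not flagged
    return (sensitivity + specificity) / 2


# Invented labels: 0 = no ice, 1-3 = ice phenomena classes.
y_true = [0, 0, 0, 0, 1, 1, 2, 3]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
ba_by_class = {c: balanced_accuracy(y_true, y_pred, c) for c in (0, 1, 2, 3)}
print(ba_by_class)
```

In this toy case the single class 3 day is missed entirely, so its BA is 0.5 even though six of the eight days are classified correctly overall. This is why a rare class can score poorly while plain accuracy stays high.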

4.3.1. Spatial Differences in Model Performance

Among the NN models used, the best predictions were given by the NN5 (eight-fold confirmation of the best prediction) and NN4 models (seven-fold) (Table 7). The XGBoost model also has high predictive power, and the model turned out to be the best in predicting ice phenomena from classes 1 and 2. In three cases, its performance was comparable with those of the NN models. The phenomena from the initial stage of freezing (class 1) were best predicted by the XGBoost model. On the other hand, the disintegration of the ice cover and accompanying ice phenomena were best predicted by the NN5 model (at five water gauge stations). No dependence of the models’ performance on the location of water gauges (Table 7) was observed, although as regards predictions of ice phenomena in the upper section of the Warta River (Bobry, Sieradz, Uniejow stations), the XGBoost model and the NN4 and NN5 models proved to be superior.
Ice phenomena predictions for the river along its middle section (stations in Nowa Wies, Srem, and Poznan) were made most reliably by the XGBoost and NN5 models (Nowa Wies and Srem) and the NN3–NN5 models (Poznan) (Table 7). For the prediction of ice phenomena along the lower section of the Warta, superior performance was demonstrated by the NN models, taking into account the lower efficiency of the XGBoost model.
The most difficult prediction was that for ice phenomena in the decay phase and the formation of ice floes and, consequently, ice jams. Because this class was observed least frequently, its prediction was problematic at the Bobry station, where the models could not establish a relationship between this class of ice phenomena and the predictors.

4.3.2. Evaluation of the Importance of Predictors in the Models

The use of XGBoost, as opposed to ANNs, made it possible to assess the importance of predictors in the model. The selected predictor variables were ranked according to the normalized reduction in model error, also known as “variable importance”. Figure 8 shows the most important predictor variables in the final model: water and air temperature, hydrological conditions (water level and river flow), and data for the “day before”, month, and year. The results of this analysis indicate that for each measuring station the most important predictor of ice phenomena is the type of ice phenomenon the day before the identification of a given event, with water temperature and air temperature coming next. In the case of the stations in Uniejow and Srem, water temperature is the second most important predictor of the occurrence of ice phenomena.
These results suggest that when looking for a balance between the complexity of the model and its predictive power, the two most important predictors for the occurrence of ice phenomena on the Warta River should be taken into account, i.e., the nature of the ice phenomenon on the day preceding the observation (especially for class 2 or class 1 events), and water temperature.
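The ranking by normalized reduction in model error can be illustrated as follows. This is a sketch of the normalization step only, and the raw values are invented: they do not reproduce Figure 8. The function name `normalize_importance` and the predictor labels are assumptions made for the example.

```python
def normalize_importance(raw_gain):
    """Scale raw per-predictor error reductions so that they sum to 1.

    Returns (predictor, share) pairs sorted from most to least important.
    """
    total = sum(raw_gain.values())
    if total <= 0:
        raise ValueError("total error reduction must be positive")
    ranked = sorted(raw_gain.items(), key=lambda item: item[1], reverse=True)
    return [(name, gain / total) for name, gain in ranked]


# Hypothetical raw error reductions; NOT values from the study.
raw_gain = {
    "ice_prev_day": 60.0,
    "water_temp": 20.0,
    "air_temp": 12.0,
    "flow": 4.0,
    "water_level": 3.0,
    "month": 1.0,
}
ranking = normalize_importance(raw_gain)
for name, share in ranking:
    print(f"{name:>12s}  {share:.2f}")
```

Because the shares sum to 1, they can be compared across stations even when the absolute error reductions differ.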

5. Discussion

5.1. Selection of Predictors as Input Variables

The predictive modeling of ice phenomena carried out on the example of the Warta River showed that the prediction of their occurrence in different phases and spatial locations gives different results. In this case, the prediction was a difficult process, mainly due to the complexity of interactions between hydroclimatic factors and thermal conditions that contribute to the occurrence of freezing.
In the research conducted on the Warta River, an important assumption was the selection of input variables that affect the accuracy of predictions in the neural network models and XGBoost. A set of daily data were used, these including thermal and hydrological variables, the type of ice phenomenon (group of phenomena) on the day preceding water gauge observations, and the month of their occurrence. The premises confirming the correctness of their choice are the results of studies of the ice regimes of rivers in Poland, including those conducted on the Vistula River [13,58], Oder River [59], on the rivers of the Baltic coastal zone [60,61], Bug River [62], and Warta River [46,63,64]. The selection of input variables significantly affects the performance of ANN models [7], however, it is often arbitrary [8].
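As an illustration of how the previous-day ice class can be turned into an input variable, the sketch below attaches each day's predecessor class to its record. It assumes gap-free consecutive daily records; the field names (`Ta`, `Tw`, `H`, `Q`, `ice_class`, `ice_prev`), the function name `add_previous_day_class`, and the values are hypothetical, not the study's data.

```python
def add_previous_day_class(days):
    """Return copies of the daily records with the previous day's ice class.

    The first record is dropped because it has no predecessor. Input
    records are left unmodified.
    """
    rows = []
    for prev, cur in zip(days[:-1], days[1:]):
        row = dict(cur)  # shallow copy so the caller's data is untouched
        row["ice_prev"] = prev["ice_class"]
        rows.append(row)
    return rows


# Invented consecutive daily records for one gauge.
days = [
    {"month": 12, "Ta": -1.5, "Tw": 0.8, "H": 210, "Q": 95.0, "ice_class": 0},
    {"month": 12, "Ta": -4.0, "Tw": 0.3, "H": 208, "Q": 92.0, "ice_class": 1},
    {"month": 12, "Ta": -6.5, "Tw": 0.2, "H": 207, "Q": 90.0, "ice_class": 2},
]
rows = add_previous_day_class(days)
print(rows)
```

With real records, any gap in the daily series would need to be handled first, since pairing non-consecutive days would assign the wrong predecessor class.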

5.2. The Most Important Predictor Variables in the Final Model

The results of the predictive models that we developed for the Warta River showed that all the input parameters (predictors) that were taken into account had some significance for the formation of ice phenomena from different classification groups. However, under the thermal conditions established for the reference period (research period 1984–2013), hydrological parameters—river flow and water level—were less important for the process of ice phenomena formation. The research established that ice phenomena occurred irregularly and periodically in the studied period and that the structure of freezing along the river course was diversified. The phenomena from class 1 were predominant, i.e., from the freezing phase of the river, represented by frazil ice and border ice, which is now a typical feature of the ice regime of most rivers in Poland [47].
The results of modeling confirmed that the most important predictors in the analyzed case were the nature of the phenomenon on the day preceding the observation (most often class 2 or class 1), as well as water temperature, and then air temperature (Figure 8). Graf [12] examined the dependencies of the trends of ice phenomena in the Noteć River, in western Poland (a tributary of the Warta River), on air and water temperature using regression models for count data and the Zero-Inflated Negative Binomial Model; results showed that the temperature values are the best predictors. In some locations, however, the model predicting the number of ice phenomena—taking into account the relationship with temperature—turned out to be statistically insignificant. Graf and Tomczyk [11] determined that for the Noteć River, a faster increase in accumulated sequences of negative air temperature contributes to an increase in the probability of a permanent ice cover, and the average degree day increase by one degree increases the chance of ice cover on the river in the range of 1.5–6.0% in different water gauges.
The period of intense changes in thermal conditions in the Warta catchment area, e.g., the cold period or sudden spring river supply, can be represented by changes in the models. However, it is also visible in the types of distributions illustrating the relationship between the classes of ice phenomena on the Warta River and hydrological factors and thermal conditions, which has been presented in the violin plots. The violin plots show diverse and complicated relations resulting from the differing variability of hydroclimatic factors and thermal conditions, which determines the nature of the distribution (Figure 7). Conditions conducive to the emergence of ice phenomena are not always the same in every location on the river, which is the result of local conditions, including, e.g., channel morphology and the influence of anthropogenic pressure.

5.3. The Performance of Predictive Models

The models performed promisingly in predicting the occurrence of ice phenomena on the Warta River; together with their low demand for computational resources, speed of operation, and ease of use, this makes them particularly attractive. Further, it was found that the ANN approach served its purpose. By using networks with more hidden units (NN5, NN6), the ability to learn and predict the non-linear behavior of ice phenomena was increased for classes 2 and 3, which were characterized by a lower frequency of occurrence in the Warta River.
In ANN predictive modeling, the use of the sigmoidal function as an activation function for output layer neurons ensured that predictions obtained with the model would be the probabilities of occurrence of a given class, since this function maps real values to the range 0–1. As a result, models with 3, 4, 5, and 6 neurons in the hidden layer were developed, and this made it possible to compare their performance in predicting ice phenomena. Guo et al. [4], basing their work on the ANN theory, used the sigmoidal function as the activation function in the hidden layer for the forecasting of ice jams during river ice breakup. Their results were promising, as they predicted the annual occurrence of ice blockages with an accuracy of 85%, while the projected decay date with the projected ten-day period showed a maximum error of two days.
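The mapping of real values to the range 0–1 by the sigmoid function can be made concrete with a small sketch. The four output units (one per class, including "no ice"), the pre-activation values, and the rule of picking the class with the highest output are assumptions made for illustration; they are not a description of the fitted networks.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z < 0, so exp(z) lies in [0, 1)
    return e / (1.0 + e)


def output_layer(pre_activations):
    """Apply the sigmoid to each output unit's weighted input."""
    return [sigmoid(z) for z in pre_activations]


# Hypothetical weighted inputs to four output units (classes 0-3).
scores = output_layer([2.0, 0.0, -1.0, -4.0])
predicted_class = max(range(len(scores)), key=scores.__getitem__)
print([round(s, 3) for s in scores], predicted_class)
```

Each output can be read as a score between 0 and 1 for its class; note that independent sigmoid outputs are not constrained to sum to 1 across classes.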
Concerning the prediction of class 1 and class 2 (permanent ice cover) phenomena on the Warta River and their non-occurrence, ANNs require further improvement, although the present results indicate that they are comparable to the XGBoost algorithm for predicting class 2 phenomena. The performance of the NN models and the XGBoost algorithm is also comparable for the different water gauge locations on the Warta River, although an overall better fit of XGBoost and NN4/NN5 models was observed for the upper course of the river; XGBoost and NN5/NN3 were most successful for the middle course, while NN models predominated for the lower course. The results of a comparison of both types of models in terms of their suitability for predicting ice phenomena on the Warta River showed a high accuracy of prediction for the XGBoost method, which has not been used on a larger scale in this regard so far.
XGBoost models variable interactions and handles the multicollinearity common to ecological datasets seamlessly [65]. Moreover, XGBoost works faster than many other gradient boosting algorithms due to the regularization factor and the parallel computing functionality. Among the advantages of this method are its resistance to outliers and its tolerance of missing values, which eliminates the need to supplement missing data; in the case of the Warta River, this allows efficient prediction of a given phenomenon without prior removal of outliers. The obtained results were considered satisfactory, which was confirmed by four model fit measures.
The comparison of results obtained for the Warta River with the ANN models used to predict ice phenomena on other rivers shows their considerable similarity. Most of the predictive and prognostic models developed confirm that the results of ice condition forecasts made with the use of ANNs are satisfactory and consistent with the measured data [37,66,67]. Moreover, the high accuracy of forecasts is indicated, which takes into account factors influencing the formation and disappearance of ice phenomena. Too many simplifications, as made in some models, may lower their prognostic accuracy and limit their usefulness for other rivers [36]. According to Massie [33], neural network classifiers, just like in the case of empirical methods, are most likely location-specific, but it is possible to transfer ANN models to other locations with minimal modifications. However, there are still no solutions for the prediction of phenomena in individual phases of their occurrence using the XGBoost algorithm. In Poland to date, models from the ANN group and the XGBoost algorithm have not been used to predict ice phenomena.
A review of the literature shows that numerous parameters are needed to support models developed for forecasting ice phenomena, most commonly ice jams and the resultant floods, but obtaining this data is sometimes difficult or even impossible. Despite the progress made in forecasting ice processes on rivers, this field still has great research potential; however, it also requires comprehensive observations, the collection and testing of data from stationary measurement networks, and direct field studies [8].

6. Conclusions

In the present study, MLPNN and XGBoost models were developed to forecast ice phenomena on the Warta River in Poland. The results obtained lead to the following conclusions:
(1)
Both the MLPNN and XGBoost models produced promising results for the forecasting of ice phenomena, which are presented using the four model fit measures.
(2)
For highly unbalanced classification problems, as in the case of the analyzed data, the “Balanced Accuracy” is particularly useful, since this statistic depends on both the level of correct prediction of a phenomenon and the level of prediction of the absence of a phenomenon.
(3)
The XGBoost turned out to be the best for predicting freeze-up (class 1) and ice cover (class 2 of ice phenomena), and at three water gauges its performance was comparable with that of the NN models, whereas breakup and ice deterioration (class 3) were best predicted by the NN5 model (at five water gauge stations). No dependence of the performance of individual models on the location of water gauges was observed.
(4)
The choice of input variables impacts the accuracy of the models developed. The nature of ice phenomenon on the day preceding the observation, as well as water and air temperature values, are important predictors, while river flow and water level were less important for the process of ice phenomena formation. This information was provided by the XGBoost algorithm.
(5)
The forecasting of ice phenomena is complicated due to the complex interactions between their determinants. This is confirmed by the types of distribution (unimodal, bimodal), illustrating the relationship between classes of phenomena on the river and hydroclimatic factors and thermal conditions.
The results of the research conducted here have important implications for forecasting ice phenomena, specifically as regards the application of XGBoost. Preliminary results seem to indicate that XGBoost, as an ensemble machine learning model, works well as a forecasting tool in hydrological research. Though the MLPNN and XGBoost models performed competently, there is still scope for further improvement through additional studies and the construction of hybrid models. Other factors influencing the occurrence of ice phenomena on rivers that would additionally help to improve the accuracy of these models should also be looked at (e.g., channel morphology, the accumulated degree days of frost and thaw, and the rates of change in water level and flow during the freeze-up and breakup periods). Since the present results concern only one river, future research will focus on applying models to rivers in different geographic locations and hydrological regimes to more accurately test the suitability and effectiveness of models.

Author Contributions

Conceptualization, R.G.; funding acquisition, R.G.; methodology, R.G.; project administration, R.G.; investigation, R.G.; software, R.G.; formal analysis, R.G.; validation, R.G.; visualization, R.G.; writing—original draft preparation, R.G.; writing—review and editing, R.G., T.K., and S.Z.; supervision, R.G., T.K., and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from an internal grant “Research subsidy for scientific research activities of the Faculty of Geographical and Geological Sciences of the Adam Mickiewicz University in Poznan, Poland”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Hydrological and meteorological measurement data are available directly from https://hydro.imgw.pl (accessed on 10 July 2021) and https://danepubliczne.imgw.pl (accessed on 15 July 2021) as operational data. The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank the Faculty of Geographical and Geological Sciences, Adam Mickiewicz University, Poznan, Poland for their support. The authors would also like to acknowledge the Institute of Meteorology and Water Management—National Research Institute (IMWM-NRI, Warsaw, Poland) for the release of the input database.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Magnuson, J.J.; Robertson, D.; Benson, B.; Wynne, R.; Livingstone, D.; Arai, T.; Assel, R.; Barry, R.; Card, V.; Kuusisto, E.; et al. Historical trends in lake and river ice cover in the northern hemisphere. Science 2000, 289, 1743–1746.
  2. Hicks, F.; Beltaos, S. River Ice. In Cold Region Atmospheric and Hydrologic Studies. The Mackenzie GEWEX Experience; Woo, M., Ed.; Springer: Berlin/Heidelberg, Germany, 2008.
  3. Zhao, L. River Ice Breakup Forecasting Using Artificial Neural Networks and Fuzzy Logic Systems. Ph.D. Thesis, Department of Civil and Environmental Engineering, University of Alberta, Edmonton, AB, Canada, 2012.
  4. Guo, X.; Wang, T.; Fu, H.; Guo, Y.; Li, J. Ice-jam forecasting during river breakup based on neural network theory. J. Cold Reg. Eng. 2018, 32, 04018010.
  5. Beltaos, S.; Prowse, T. River-ice hydrology in a shrinking cryosphere. Hydrol. Proc. 2009, 23, 122–144.
  6. Lindenschmidt, K.-E.; Huokuna, M.; Burrell, B.C.; Beltaos, S. Lessons learned from past ice-jam floods concerning the challenges of flood mapping. Int. J. River Basin Manag. 2018, 16, 457–468.
  7. Massie, D.D.; White, K.D.; Daly, S.F. Application of neural networks to predict ice jam occurrence. Cold Reg. Sci. Technol. 2002, 35, 115–122.
  8. Madaeni, F.; Lhissou, R.; Chokmani, K.; Raymond, S.; Gauthier, Y. Ice jam formation, breakup and prediction methods based on hydroclimatic data using artificial intelligence: A review. Cold Reg. Sci. Technol. 2020, 174, 103032.
  9. Nafziger, J.; She, Y.; Hicks, F. Anchor ice formation and release in small regulated and unregulated streams. J. Cold Reg. Sci. Technol. 2017, 141, 66–77.
  10. Lindenschmidt, K.-E. RIVICE—A non-proprietary, open-source, one-dimensional river-ice and water-quality model. Water 2017, 9, 314.
  11. Graf, R.; Tomczyk, A.M. The Impact of cumulative negative air temperature degree-days on the appearance of ice cover on a river in relation to atmospheric circulation. Atmosphere 2018, 9, 204.
  12. Graf, R. Estimation of the dependence of ice phenomena trends on air and water temperature in river. Water 2020, 12, 3494.
  13. Kolerski, T. Modeling of ice phenomena in the mouth of the Vistula River. Acta Geophys. 2014, 62, 893–914.
  14. Shulyakovskii, L.G. Manual of Forecasting Ice-Formation for Rivers and Inland Lakes. Manual of Hydrological Forecasting No. 4; Translated from Russian, Israel Program for Scientific Translations; Central Forecasting Institute of USSR: Jerusalem, Israel, 1963.
  15. Uzuner, M.S.; Kennedy, J.F. Theoretical model of river ice jams. J. Hydraul. Eng. Div. 1976, 102, 1365–1383.
  16. Yang, X.; Pavelsky, T.M.; Allen, G.H. The past and future of global river ice. Nature 2020, 577, 69–73.
  17. Weber, F.; Nixon, D.; Hurley, J. Semi-automated classification of river ice types on the Peace River using RADARSAT-1 synthetic aperture radar (SAR) imagery. Can. J. Civil Eng. 2003, 30, 11–27.
  18. Lindenschmidt, K.; Syrenne, G.; Harrison, R. Measuring ice thicknesses along the Red River in Canada using RADARSAT-2 satellite imagery. J. Water Resour. Prot. 2010, 2, 923–933.
  19. Kolerski, T. Mathematical modeling of ice dynamics as a decision support tool in river engineering. Water 2018, 10, 1241.
  20. Mahabir, C.; Hicks, F.; Fayek, A.R. Transferability of a neuro-fuzzy river ice jam flood forecasting model. Cold Reg. Sci. Technol. 2007, 48, 188–201.
  21. White, K.D.; Daly, S.F. Predicting Ice Jams with Discriminant Function Analysis. In Proceedings of the ASME 2002 21st International Conference on Offshore Mechanics and Arctic Engineering, Oslo, Norway, 23–28 June 2002; Volume 3, pp. 683–690.
  22. Beltaos, S. Numerical modelling of ice-jam flooding on the Peace-Athabasca Delta. Hydrol. Proc. 2003, 17, 3685–3702.
  23. Shen, H.T. Mathematical modelling of river ice processes. Cold Reg. Sci. Technol. 2010, 621, 3–13.
  24. Wang, J.; Sui, J.; Guo, L.; Karney, B.W.; Jüpner, R. Forecast of water level and ice jam thickness using the back propagation neural network and support vector machine methods. Int. J. Environ. Sci. Technol. 2010, 7, 215–224.
  25. Beltaos, S. Progress in the study and management of river ice jams. Cold Reg. Sci. Technol. 2008, 51, 2–19.
  26. Lindenschmidt, K.-E.; Carstensen, D.; Fröhlich, W.; Hentschel, B.; Iwicki, S.; Kögel, K.; Kubicki, M.; Kundzewicz, Z.W.; Lauschke, C.; Łazarów, A.; et al. Development of an ice-jam flood forecasting system for the lower Oder River—Requirements for real-time predictions of water, ice and sediment transport. Water 2019, 11, 95.
  27. Sutyrina, E.N. Prediction of spring ice phenomena on lakes and reservoirs using teleconnection indices. Limnol. Freshw. Biol. 2020, 4, 946–947.
  28. Piotrowski, A.P.; Napiórkowski, M.J.; Napiórkowski, J.J.; Osuch, M. Comparing various artificial neural network types for water temperature prediction in rivers. J. Hydrol. 2015, 529, 302–315.
  29. Graf, R.; Zhu, S.; Sivakumar, B. Forecasting river water temperature time series using a wavelet–neural network hybrid modelling approach. J. Hydrol. 2019, 578, 124115.
  30. Zhu, S.; Heddam, S.; Nyarko, E.K.; Hadzima-Nyarko, M.; Piccolroaz, S.; Wu, S. Modeling daily water temperature for rivers: Comparison between adaptive neurofuzzy inference systems and artificial neural networks models. Environ. Sci. Pollut. Res. 2019, 26, 402–420.
  31. Morse, B.; Hessami, M.; Bourel, C. Mapping environmental conditions in the St. Lawrence River onto ice parameters using artificial neural networks to predict ice jams. Can. J. Civ. Eng. 2003, 30, 758–765.
  32. Wang, T.; Yang, K.L.; Guo, Y.X. Application of artificial neural networks to forecasting ice conditions of the Yellow River in the Inner Mongolia reach. J. Hydrol. Eng. 2008, 13, 811–816. [Google Scholar]
  33. Massie, D.D. Neural-Network Fundamentals for Scientists and Engineers. In Proceedings of the Conference on Efficiency, Costs, Optimization, Simulation and Environmental Impact of Energy Systems (ECOS 01), Istanbul, Turkey, 4–6 July 2001. [Google Scholar]
  34. Chokmani, K.; Khalil, B.; Ouarda, T.B.M.J.; Bourdages, R. Estimation of river ice thickness using artificial neural networks. In Proceedings of the 14th Workshop Hydraulics Ice Covered Rivers, Quebec, QC, Canada, 19–22 June 2007; CGU HS/CRIPE: Quebec, QC, Canada, 2007. [Google Scholar]
  35. Hu, J.; Liu, L.; Huang, Z.; You, Y.; Rao, S. Ice Breakup Date Forecast with Hybrid Artificial Neural Networks. In Proceedings of the 4th International Conference on Natural Computation, ICNC, Jinan, China, 25–27 August 2007; Volume 2, pp. 414–418. [Google Scholar]
  36. Chen, S.Y.; Ji, H.L. Fuzzy optimization neural network approach for ice forecast in the Inner Mongolia reach of the Yellow River. Hydrol. Sci. J. 2005, 50, 319–330. [Google Scholar]
  37. Yan, Q.; Ding, M. Using Dynamic Fuzzy Neural Networks Approach to Predict Ice Formation. In Proceedings of the 2011 MSEC International Conference on Multimedia, Software Engineering and Computing, Wuhan, China, 26–27 November 2011; Jin, D., Lin, S., Eds.; Advances in Multimedia, Software Engineering and Computing Vol.1. Springer: Berlin/Heidelberg, Germany, 2011; Volume 128. [Google Scholar]
  38. Liu, H.; Jiang, Q.; Ma, Y.; Yang, Q.; Shi, P.; Zhang, S.; Tan, Y.; Xi, J.; Zhang, Y.; Liu, B.; et al. Object-Based Multigrained Cascade Forest Method For Wetland Classification Using Sentinel-2 and radarsat-2 imagery. Water 2022, 14, 82. [Google Scholar] [CrossRef]
  39. Shin, Y.; Kim, T.; Hong, S.; Lee, S.; Lee, E.; Hong, S.; Lee, C.; Kim, T.; Park, M.S.; Park, J.; et al. Prediction of chlorophyll-a concentrations in the Nakdong River using machine learning methods. Water 2020, 12, 1822. [Google Scholar] [CrossRef]
  40. Flach, P.A. Machine Learning: The Art and Science of Algorithms that Make Sense of Data; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
  41. Prowse, T.D.; Bonsal, B.R. Historical trends in river-ice break-up: A review. Nordic Hydrol. 2004, 35, 281–293. [Google Scholar] [CrossRef]
  42. Zounemat-Kermani, M.; Batelaan, O.; Fadaee, M.; Hinkelmann, R. Ensemble machine learning paradigms in hydrology: A review. J. Hydrol. 2021, 598, 126266. [Google Scholar] [CrossRef]
  43. Graf, R.; Aghelpour, P. Daily river water temperature prediction: A comparison between neural network and stochastic techniques. Atmosphere 2021, 12, 1154. [Google Scholar] [CrossRef]
  44. Woś, A. The Climate of Poland in the Second Half of the 20th Century; Scientific Publishing House UAM: Poznan, Poland, 2010; 490p. (In Polish) [Google Scholar]
  45. Wrzesiński, D.; Perz, A. The features of the runoff regime in the basin of the Warta River. Bad. Fizjogr. R. VII Ser. A Geogr. Fiz. 2016, A67, 289–304. (In Polish) [Google Scholar]
  46. Graf, R.; Łukaszewicz, J.T.; Jawgiel, K. The analysis of the structure and duration of ice phenomena on the Warta River in relation to thermic conditions in the years 1991–2010. Woda-Sr.-Obsz. Wiej. 2018, 18, 5–28. (In Polish) [Google Scholar]
  47. Pawłowski, B.; Gorączko, M.; Szczerbińska, A. Ice Phenomena on the Rivers of Poland. In Hydrology of the Poland; Jokiel, P., Marszelewski, W., Pociask-Karteczka, J., Eds.; PWN: Warsaw, Poland, 2017; pp. 195–200. (In Polish) [Google Scholar]
  48. Graf, R.; Wrzesiński, D. Detecting patterns of changes in river water temperature in Poland. Water 2020, 12, 1327. [Google Scholar] [CrossRef]
  49. Lan, T.; Lin, K.; Tan, X.; Xu, C.-Y.; Chen, X. Dynamics of hydrological model parameters: Calibration and Reliability. Hydrol. Earth Syst. Sci. 2020, 24, 1347–1366. [Google Scholar] [CrossRef] [Green Version]
  50. Hintze, J.L.; Nelson, R.D. Violin plots: A box plot-density trace synergism. Am. Stat. 1998, 52, 181–184. [Google Scholar]
  51. Brownlee, J. Train-Test Split for Evaluating Machine Learning Algorithms. 2020. Available online: https://machinelearningmastery.com/train-test-split-for-evaluating-machine-learning-algorithms (accessed on 15 September 2021).
  52. Fritsch, S.; Guenther, F.; Wright, M.N. “Neuralnet: Training of Neural Networks.” R package version 1.44.2. 2019. Available online: https://CRAN.R-project.org/package=neuralnet (accessed on 22 November 2021).
  53. Riedmiller, M. Advanced supervised learning in multi-layer perceptrons—From backpropagation to adaptive learning algorithms. Comput. Stand. Interfaces 1994, 16, 265–278. [Google Scholar] [CrossRef]
  54. Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T.; et al. Xgboost: Extreme Gradient Boosting. 2020. Available online: https://cran.r-project.org/package=xgboost (accessed on 10 July 2021).
  55. Loong, T.W. Understanding sensitivity and specificity with the right side of the brain. BMJ 2003, 327, 716–719. [Google Scholar] [CrossRef] [PubMed]
  56. Parikh, R.; Mathai, A.; Parikh, S.; Chandra Sekhar, G.; Thomas, R. Understanding and using sensitivity, specificity and predictive values. Indian J. Ophthalmol. 2008, 56, 45–50. [Google Scholar] [CrossRef] [PubMed]
  57. R Core Team, R. A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019; Available online: http://www.R-project.org (accessed on 15 July 2021).
  58. Pawłowski, B. Course of Ice Phenomena on the Lower Vistula River in 1960–2014; Nicholas Copernicus University Toruń: Torun, Poland, 2017. (In Polish) [Google Scholar]
  59. Marszelewski, W.; Pawłowski, W. Long-term changes in the course of ice phenomena on the oder river along the Polish–German border. Water Resour. Manag. 2019, 33, 5107–5120. [Google Scholar] [CrossRef]
  60. Ptak, M.; Choiński, A. Ice phenomena in rivers of the coastal zone (southern Baltic) in the years 1956–2015. Baltic Coastal Zone. J. Ecol. Prot. Coastline 2016, 20, 73–83. [Google Scholar]
  61. Łukaszewicz, J.T.; Graf, R. The variability of ice phenomena on the rivers of the Baltic coastal zone in the Northern Poland. J. Hydrol. Hydromech. 2020, 68, 38–50. [Google Scholar] [CrossRef] [Green Version]
  62. Bączyk, A.; Suchożebrski, J. Variability of ice phenomena on the Bug River (1903–2012). Inżynieria Ekol. 2018, 49, 136–142. (In Polish) [Google Scholar] [CrossRef] [Green Version]
  63. Gorączko, M.; Pawłowski, B. Changing of ice phenomena on Warta River in vicinity of Uniejów. Biul. Uniejowski 2014, 3, 23–33. (In Polish) [Google Scholar]
  64. Kornaś, M. Ice phenomena in the Warta River in Poznań in 1961–2010. Quaest. Geogr. 2014, 33, 51–59. [Google Scholar] [CrossRef] [Green Version]
  65. Smith, E.H. Using extreme gradient boosting (XGBoost) to evaluate the importance of a suite of environmental variables and to predict recruitment of young-of-the-year spotted seatrout in Florida. bioRxiv 2019, 543181. [Google Scholar] [CrossRef]
  66. Wang, Z.; Li, C. River Ice Forecasting Based on Genetic Neural Network. In Proceedings of the International Conference on Information Engineering and Computer Science (ICIECS2009), Wuhan, China, 19–20 December 2009; pp. 1–4. [Google Scholar]
  67. Li, S.; Qin, J.; He, M.; Paoli, R. Fast evaluation of aircraft icing severity using machine learning based on XGBoost. Aerospace 2020, 7, 36. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Study area and the locations of water gauges and meteorological stations of the Institute of Meteorology and Water Management—National Research Institute (IMGW-PIB, Warsaw, Poland).
Figure 2. Stages of research activities.
Figure 3. Feed-forward multilayer perceptron architecture.
Figure 4. A general architecture for XGBoost.
Figure 5. Probability of occurrence of ice phenomena (classes 1–3 and none) as a function of the month for the water gauge stations on the Warta River. Note: Water gauge stations are labeled (a–h), according to their location on the river (from upper to lower course).
Figure 6. Probability of the order of occurrence of ice phenomena classes (none, C1, C2, C3) as a function of the ice phenomenon of the previous day for the water gauge stations on the Warta River. Note: Water gauge stations are labeled (a–h), according to their location on the river (from upper to lower course).
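The quantity plotted in Figure 6 is a conditional frequency: how often each class follows a given class observed on the previous day. The following is a minimal sketch of how such frequencies could be estimated from a daily sequence of class labels. The function name `transition_probabilities` and the short `daily` record are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter, defaultdict


def transition_probabilities(classes):
    """Estimate P(class today | class on the previous day) from a daily sequence."""
    counts = defaultdict(Counter)
    # Pair every day with the day before it and count the observed transitions.
    for previous, current in zip(classes, classes[1:]):
        counts[previous][current] += 1
    # Normalise each row so the probabilities for a given previous class sum to 1.
    return {
        previous: {current: n / sum(row.values()) for current, n in row.items()}
        for previous, row in counts.items()
    }


# Hypothetical ten-day record (labels follow the caption: none, C1, C2, C3).
daily = ["none", "none", "C1", "C1", "C2", "C2", "C2", "C3", "none", "none"]
probs = transition_probabilities(daily)

for previous, row in probs.items():
    print(previous, {current: round(p, 3) for current, p in row.items()})
```

Applied per gauge to the full observation record, a table of this kind would give one panel of the figure; transitions that never occur simply do not appear in the result.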
Figure 7. Distributions of the relationship between classes of ice phenomena (none, C1, C2, C3) and air and water temperatures (Ta, Tw), the water level (H), and river flow (Q).
Figure 8. Relative importance of predictors in the XGBoost model (profit values).
Table 1. Classification and grouping of ice phenomena.
| Class | Ice Phenomena | Ice Phase of the River |
|---|---|---|
| Class 1 | Frazil ice; border ice; border ice and frazil ice; frazil ice jam | Phase I: Freeze-up |
| Class 2 | Ice cover | Phase II: Ice cover |
| Class 3 | Ice floe; ice floe and border ice; ice floe and frazil ice; ice jam | Phase III: Breakup and ice deterioration |
Table 2. Goodness-of-fit metrics.
| Prediction | Observation: Phenomenon | Observation: No Phenomenon |
|---|---|---|
| Phenomenon | A (TP) | B (FP) |
| No phenomenon | C (FN) | D (TN) |
Explanation: A is TP (true positive), the number of examples of the selected class that were classified correctly. B is FP (false positive), the number of examples assigned to the selected class although they do not belong to it (a false alarm). C is FN (false negative), the number of examples of the selected class that were not assigned to it, i.e., a negative decision for a positive example (a miss). D is TN (true negative), the number of examples correctly not assigned to the selected class (a correct rejection).
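The four measures reported in Tables 4–6 (sensitivity, specificity, precision, balanced accuracy) can be derived from the A–D counts of Table 2. The sketch below uses the standard definitions and assumes they match the authors' usage; the tabulated values are consistent with balanced accuracy being the mean of sensitivity and specificity. The function name `class_metrics` and the counts are hypothetical, and an undefined ratio (zero denominator) is returned as `None`, corresponding to the "-" entries in the tables.

```python
from typing import Dict, Optional


def class_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, Optional[float]]:
    """Per-class measures from the confusion-matrix counts A (TP), B (FP), C (FN), D (TN)."""

    def ratio(numerator: int, denominator: int) -> Optional[float]:
        # A zero denominator means the measure is undefined for this class.
        return numerator / denominator if denominator else None

    sensitivity = ratio(tp, tp + fn)  # share of observed phenomena that were predicted
    specificity = ratio(tn, tn + fp)  # share of observed non-phenomena that were rejected
    precision = ratio(tp, tp + fp)    # share of predicted phenomena that were observed
    if sensitivity is None or specificity is None:
        balanced_accuracy = None
    else:
        balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": balanced_accuracy,
    }


# Hypothetical counts for a reasonably frequent class.
example = class_metrics(tp=40, fp=5, fn=10, tn=945)
# Hypothetical rare class that the model never predicts: precision is undefined.
no_hits = class_metrics(tp=0, fp=0, fn=3, tn=997)

print(example)
print(no_hits)
```

The second call reproduces the pattern seen for class 3 at several gauges: sensitivity 0, specificity 1, no precision, and a balanced accuracy of 0.5, which indicates that the class was never predicted rather than that half of the cases were correct.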
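Table 3. The frequency of ice phenomena.
| Class of Ice Phenomena | Measure | Bobry | Sieradz | Uniejow | Nowa Wies | Srem | Poznan | Skwierzyna | Gorzow Wlkp. |
|---|---|---|---|---|---|---|---|---|---|
| 1 | No. of days | 518 | 275 | 287 | 417 | 404 | 735 | 449 | 626 |
| 1 | (%) | 10.99 | 5.8 | 12.18 | 10.00 | 11.73 | 15.60 | 10.33 | 13.29 |
| 2 | No. of days | 82 | 393 | 130 | 309 | 259 | 69 | 354 | 278 |
| 2 | (%) | 1.74 | 8.34 | 5.52 | 7.41 | 7.52 | 1.46 | 8.14 | 5.90 |
| 3 | No. of days | 2 | 45 | 4 | 12 | 4 | 45 | 42 | 15 |
| 3 | (%) | 0.04 | 0.96 | 0.17 | 0.29 | 0.1 | 0.95 | 0.97 | 0.32 |
| No * | No. of days | 4110 | 3997 | 1935 | 3431 | 2777 | 3864 | 3503 | 3793 |
| No * | (%) | 87.22 | 84.86 | 82.13 | 82.30 | 80.63 | 81.99 | 80.57 | 80.50 |
* No—no ice phenomena.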
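Table 4. Results of predictive modeling of ice phenomena for the Warta River (upper course).
| Water Gauge | Model | Class | Training Sensitivity | Training Specificity | Training Precision | Training Balanced Accuracy | Test Sensitivity | Test Specificity | Test Precision | Test Balanced Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| Bobry | XGBoost | No * | 0.986 | 0.892 | 0.985 | 0.939 | 0.984 | 0.852 | 0.978 | 0.918 |
| Bobry | XGBoost | 1 | 0.87 | 0.987 | 0.888 | 0.928 | 0.83 | 0.982 | 0.852 | 0.906 |
| Bobry | XGBoost | 2 | 0.93 | 0.997 | 0.87 | 0.964 | 0.872 | 1 | 0.971 | 0.936 |
| Bobry | XGBoost | 3 | - | 1 | - | - | 0 | 1 | - | 0.5 |
| Bobry | NN3 | No * | 0.988 | 0.907 | 0.987 | 0.947 | 0.982 | 0.861 | 0.979 | 0.922 |
| Bobry | NN3 | 1 | 0.874 | 0.987 | 0.892 | 0.931 | 0.811 | 0.983 | 0.863 | 0.897 |
| Bobry | NN3 | 2 | 0.978 | 0.998 | 0.917 | 0.988 | 0.919 | 0.994 | 0.723 | 0.957 |
| Bobry | NN3 | 3 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0.5 |
| Bobry | NN4 | No * | 0.988 | 0.928 | 0.989 | 0.958 | 0.98 | 0.886 | 0.984 | 0.933 |
| Bobry | NN4 | 1 | 0.905 | 0.988 | 0.905 | 0.947 | 0.844 | 0.982 | 0.844 | 0.913 |
| Bobry | NN4 | 2 | 0.933 | 0.998 | 0.913 | 0.966 | 0.946 | 0.995 | 0.761 | 0.971 |
| Bobry | NN4 | 3 | 1 | 1 | 1 | 1 | 0 | 1 | - | 0.5 |
| Bobry | NN5 | No * | 0.988 | 0.921 | 0.988 | 0.954 | 0.98 | 0.876 | 0.982 | 0.928 |
| Bobry | NN5 | 1 | 0.909 | 0.987 | 0.898 | 0.948 | 0.863 | 0.979 | 0.833 | 0.921 |
| Bobry | NN5 | 2 | 1 | 1 | 1 | 1 | 0.791 | 0.998 | 0.872 | 0.894 |
| Bobry | NN5 | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Bobry | NN6 | No * | 0.998 | 0.934 | 0.99 | 0.966 | 0.982 | 0.855 | 0.979 | 0.918 |
| Bobry | NN6 | 1 | 0.928 | 0.997 | 0.972 | 0.962 | 0.831 | 0.981 | 0.844 | 0.906 |
| Bobry | NN6 | 2 | 0.927 | 1 | 1 | 0.963 | 0.829 | 0.997 | 0.829 | 0.913 |
| Bobry | NN6 | 3 | - | 1 | - | - | 0 | 1 | - | 0.5 |
| Sieradz | XGBoost | No * | 0.992 | 0.919 | 0.985 | 0.955 | 0.983 | 0.936 | 0.989 | 0.96 |
| Sieradz | XGBoost | 1 | 0.765 | 0.987 | 0.807 | 0.876 | 0.869 | 0.977 | 0.675 | 0.923 |
| Sieradz | XGBoost | 2 | 0.934 | 0.989 | 0.895 | 0.962 | 0.912 | 0.995 | 0.943 | 0.954 |
| Sieradz | XGBoost | 3 | 0.2 | 1 | 0.8 | 0.6 | 0.28 | 1 | 0.875 | 0.64 |
| Sieradz | NN3 | No * | 0.983 | 0.965 | 0.993 | 0.974 | 0.973 | 0.917 | 0.986 | 0.945 |
| Sieradz | NN3 | 1 | 0.88 | 0.983 | 0.767 | 0.932 | 0.774 | 0.974 | 0.644 | 0.874 |
| Sieradz | NN3 | 2 | 0.961 | 0.992 | 0.917 | 0.977 | 0.957 | 0.993 | 0.922 | 0.975 |
| Sieradz | NN3 | 3 | 0.37 | 0.997 | 0.556 | 0.683 | 0.444 | 0.999 | 0.8 | 0.722 |
| Sieradz | NN4 | No * | 0.991 | 0.928 | 0.987 | 0.959 | 0.98 | 0.915 | 0.985 | 0.948 |
| Sieradz | NN4 | 1 | 0.739 | 0.991 | 0.836 | 0.865 | 0.686 | 0.982 | 0.701 | 0.834 |
| Sieradz | NN4 | 2 | 0.99 | 0.994 | 0.934 | 0.992 | 0.959 | 0.987 | 0.869 | 0.973 |
| Sieradz | NN4 | 3 | 0.591 | 0.997 | 0.684 | 0.794 | 0.522 | 0.999 | 0.8 | 0.76 |
| Sieradz | NN5 | No * | 0.997 | 0.929 | 0.987 | 0.963 | 0.984 | 0.852 | 0.976 | 0.918 |
| Sieradz | NN5 | 1 | 0.77 | 0.996 | 0.919 | 0.883 | 0.559 | 0.986 | 0.696 | 0.772 |
| Sieradz | NN5 | 2 | 0.976 | 0.992 | 0.917 | 0.984 | 0.952 | 0.991 | 0.904 | 0.972 |
| Sieradz | NN5 | 3 | 0.643 | 1 | 1 | 0.821 | 0.412 | 0.997 | 0.467 | 0.704 |
| Sieradz | NN6 | No * | 0.992 | 0.947 | 0.991 | 0.969 | 0.982 | 0.888 | 0.98 | 0.935 |
| Sieradz | NN6 | 1 | 0.871 | 0.993 | 0.89 | 0.932 | 0.684 | 0.983 | 0.715 | 0.833 |
| Sieradz | NN6 | 2 | 0.99 | 0.992 | 0.919 | 0.991 | 0.954 | 0.984 | 0.851 | 0.969 |
| Sieradz | NN6 | 3 | 0.273 | 0.999 | 0.75 | 0.636 | 0 | 1 | - | 0.5 |
| Uniejów | XGBoost | No * | 0.986 | 0.922 | 0.983 | 0.954 | 0.993 | 0.941 | 0.988 | 0.967 |
| Uniejów | XGBoost | 1 | 0.873 | 0.983 | 0.879 | 0.928 | 0.912 | 0.99 | 0.926 | 0.951 |
| Uniejów | XGBoost | 2 | 0.938 | 0.996 | 0.938 | 0.967 | 0.969 | 0.999 | 0.984 | 0.984 |
| Uniejów | XGBoost | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Uniejów | NN3 | No * | 0.994 | 0.971 | 0.994 | 0.983 | 0.981 | 0.892 | 0.976 | 0.936 |
| Uniejów | NN3 | 1 | 0.951 | 0.992 | 0.945 | 0.972 | 0.818 | 0.98 | 0.854 | 0.899 |
| Uniejów | NN3 | 2 | 0.969 | 0.999 | 0.984 | 0.984 | 0.97 | 0.995 | 0.914 | 0.982 |
| Uniejów | NN3 | 3 | 1 | 1 | 1 | 1 | 0 | 1 | - | 0.5 |
| Uniejów | NN4 | No * | 0.993 | 0.977 | 0.995 | 0.985 | 0.979 | 0.927 | 0.984 | 0.953 |
| Uniejów | NN4 | 1 | 0.974 | 0.989 | 0.931 | 0.982 | 0.881 | 0.978 | 0.838 | 0.93 |
| Uniejów | NN4 | 2 | 0.919 | 1 | 1 | 0.96 | 0.956 | 0.996 | 0.942 | 0.976 |
| Uniejów | NN4 | 3 | 1 | 1 | 1 | 1 | 0 | 1 | - | 0.5 |
| Uniejów | NN5 | No * | 0.998 | 0.958 | 0.991 | 0.978 | 0.98 | 0.835 | 0.965 | 0.908 |
| Uniejów | NN5 | 1 | 0.954 | 0.997 | 0.98 | 0.976 | 0.756 | 0.978 | 0.816 | 0.867 |
| Uniejów | NN5 | 2 | 0.951 | 1 | 1 | 0.975 | 0.855 | 0.999 | 0.983 | 0.927 |
| Uniejów | NN5 | 3 | 1 | 1 | 1 | 1 | 1 | 0.997 | 0.333 | 0.998 |
| Uniejów | NN6 | No * | 0.999 | 0.986 | 0.997 | 0.993 | 0.99 | 0.888 | 0.976 | 0.939 |
| Uniejów | NN6 | 1 | 0.98 | 0.999 | 0.993 | 0.989 | 0.804 | 0.989 | 0.91 | 0.897 |
| Uniejów | NN6 | 2 | 1 | 1 | 1 | 1 | 0.938 | 0.993 | 0.884 | 0.966 |
| Uniejów | NN6 | 3 | 1 | 1 | 1 | 1 | 0 | 0.999 | 0 | 0.5 |
* No means no ice phenomena.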
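Table 5. Results of predictive modeling of ice phenomena for the Warta River (middle course).
| Water Gauge | Model | Class | Training Sensitivity | Training Specificity | Training Precision | Training Balanced Accuracy | Test Sensitivity | Test Specificity | Test Precision | Test Balanced Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| Nowa Wies | XGBoost | No * | 0.988 | 0.904 | 0.979 | 0.946 | 0.984 | 0.91 | 0.981 | 0.947 |
| Nowa Wies | XGBoost | 1 | 0.855 | 0.988 | 0.896 | 0.922 | 0.867 | 0.983 | 0.846 | 0.925 |
| Nowa Wies | XGBoost | 2 | 0.948 | 0.995 | 0.936 | 0.972 | 0.896 | 0.994 | 0.926 | 0.945 |
| Nowa Wies | XGBoost | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Nowa Wies | NN3 | No * | 0.989 | 0.917 | 0.982 | 0.953 | 0.98 | 0.89 | 0.976 | 0.935 |
| Nowa Wies | NN3 | 1 | 0.86 | 0.99 | 0.899 | 0.925 | 0.819 | 0.985 | 0.864 | 0.902 |
| Nowa Wies | NN3 | 2 | 0.956 | 0.994 | 0.933 | 0.975 | 0.94 | 0.991 | 0.887 | 0.965 |
| Nowa Wies | NN3 | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Nowa Wies | NN4 | No * | 0.984 | 0.933 | 0.986 | 0.958 | 0.986 | 0.842 | 0.966 | 0.914 |
| Nowa Wies | NN4 | 1 | 0.901 | 0.985 | 0.872 | 0.943 | 0.805 | 0.986 | 0.868 | 0.896 |
| Nowa Wies | NN4 | 2 | 0.929 | 0.997 | 0.96 | 0.963 | 0.838 | 0.994 | 0.921 | 0.916 |
| Nowa Wies | NN4 | 3 | 0.4 | 0.998 | 0.333 | 0.699 | 0.143 | 1 | 0.5 | 0.571 |
| Nowa Wies | NN5 | No * | 0.979 | 0.95 | 0.989 | 0.965 | 0.981 | 0.91 | 0.981 | 0.946 |
| Nowa Wies | NN5 | 1 | 0.93 | 0.981 | 0.844 | 0.955 | 0.827 | 0.985 | 0.861 | 0.906 |
| Nowa Wies | NN5 | 2 | 0.956 | 0.997 | 0.968 | 0.977 | 0.906 | 0.991 | 0.888 | 0.949 |
| Nowa Wies | NN5 | 3 | 0.625 | 1 | 1 | 0.812 | 0.25 | 0.996 | 0.111 | 0.623 |
| Nowa Wies | NN6 | No * | 0.987 | 0.966 | 0.992 | 0.976 | 0.972 | 0.919 | 0.983 | 0.946 |
| Nowa Wies | NN6 | 1 | 0.967 | 0.989 | 0.911 | 0.978 | 0.898 | 0.973 | 0.791 | 0.936 |
| Nowa Wies | NN6 | 2 | 0.957 | 0.998 | 0.975 | 0.977 | 0.865 | 0.994 | 0.921 | 0.93 |
| Nowa Wies | NN6 | 3 | 1 | 1 | 1 | 1 | 0.167 | 0.998 | 0.2 | 0.582 |
| Srem | XGBoost | No * | 0.99 | 0.936 | 0.984 | 0.963 | 0.991 | 0.935 | 0.985 | 0.963 |
| Srem | XGBoost | 1 | 0.895 | 0.99 | 0.921 | 0.942 | 0.903 | 0.989 | 0.917 | 0.946 |
| Srem | XGBoost | 2 | 0.969 | 0.997 | 0.962 | 0.983 | 0.945 | 0.999 | 0.984 | 0.972 |
| Srem | XGBoost | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Srem | NN3 | No * | 0.992 | 0.982 | 0.996 | 0.987 | 0.976 | 0.927 | 0.982 | 0.951 |
| Srem | NN3 | 1 | 0.967 | 0.993 | 0.949 | 0.98 | 0.876 | 0.977 | 0.829 | 0.926 |
| Srem | NN3 | 2 | 0.984 | 0.998 | 0.969 | 0.991 | 0.955 | 0.998 | 0.977 | 0.977 |
| Srem | NN3 | 3 | 0.667 | 1 | 1 | 0.833 | 0 | 0.999 | 0 | 0.499 |
| Srem | NN4 | No * | 0.997 | 0.962 | 0.991 | 0.98 | 0.99 | 0.894 | 0.976 | 0.942 |
| Srem | NN4 | 1 | 0.94 | 0.997 | 0.981 | 0.968 | 0.82 | 0.99 | 0.912 | 0.905 |
| Srem | NN4 | 2 | 1 | 1 | 1 | 1 | 0.931 | 0.996 | 0.945 | 0.963 |
| Srem | NN4 | 3 | 1 | 1 | 1 | 1 | 0.333 | 0.998 | 0.25 | 0.666 |
| Srem | NN5 | No * | 0.999 | 0.957 | 0.99 | 0.978 | 0.988 | 0.905 | 0.977 | 0.947 |
| Srem | NN5 | 1 | 0.929 | 0.997 | 0.979 | 0.963 | 0.874 | 0.986 | 0.896 | 0.93 |
| Srem | NN5 | 2 | 0.977 | 1 | 1 | 0.988 | 0.877 | 0.998 | 0.974 | 0.938 |
| Srem | NN5 | 3 | 1 | 1 | 1 | 1 | 0.5 | 0.998 | 0.25 | 0.749 |
| Srem | NN6 | No * | 0.997 | 0.976 | 0.994 | 0.987 | 0.98 | 0.912 | 0.979 | 0.946 |
| Srem | NN6 | 1 | 0.95 | 0.997 | 0.975 | 0.974 | 0.822 | 0.983 | 0.869 | 0.903 |
| Srem | NN6 | 2 | 0.993 | 0.999 | 0.985 | 0.996 | 0.992 | 0.994 | 0.925 | 0.993 |
| Srem | NN6 | 3 | 1 | 1 | 1 | 1 | 0.333 | 0.999 | 0.333 | 0.666 |
| Poznan | XGBoost | No * | 0.987 | 0.922 | 0.983 | 0.955 | 0.984 | 0.889 | 0.975 | 0.937 |
| Poznan | XGBoost | 1 | 0.906 | 0.983 | 0.906 | 0.945 | 0.874 | 0.98 | 0.893 | 0.927 |
| Poznan | XGBoost | 2 | 0.935 | 0.999 | 0.906 | 0.967 | 0.895 | 0.999 | 0.919 | 0.947 |
| Poznan | XGBoost | 3 | 0.567 | 0.998 | 0.81 | 0.782 | 0.267 | 0.998 | 0.5 | 0.632 |
| Poznan | NN3 | No * | 0.985 | 0.935 | 0.986 | 0.96 | 0.981 | 0.919 | 0.981 | 0.95 |
| Poznan | NN3 | 1 | 0.914 | 0.985 | 0.914 | 0.949 | 0.89 | 0.976 | 0.876 | 0.933 |
| Poznan | NN3 | 2 | 0.909 | 0.998 | 0.882 | 0.954 | 0.917 | 0.999 | 0.943 | 0.958 |
| Poznan | NN3 | 3 | 0.81 | 0.998 | 0.773 | 0.904 | 0.417 | 0.996 | 0.5 | 0.706 |
| Poznan | NN4 | No * | 0.984 | 0.93 | 0.983 | 0.957 | 0.98 | 0.956 | 0.991 | 0.968 |
| Poznan | NN4 | 1 | 0.919 | 0.981 | 0.905 | 0.95 | 0.918 | 0.974 | 0.86 | 0.946 |
| Poznan | NN4 | 2 | 0.829 | 0.999 | 0.935 | 0.914 | 0.853 | 1 | 0.967 | 0.926 |
| Poznan | NN4 | 3 | 0.867 | 1 | 1 | 0.933 | 0.4 | 0.995 | 0.333 | 0.697 |
| Poznan | NN5 | No * | 0.989 | 0.962 | 0.992 | 0.976 | 0.981 | 0.903 | 0.978 | 0.942 |
| Poznan | NN5 | 1 | 0.946 | 0.989 | 0.941 | 0.967 | 0.864 | 0.976 | 0.871 | 0.92 |
| Poznan | NN5 | 2 | 0.97 | 0.999 | 0.941 | 0.984 | 0.806 | 0.999 | 0.906 | 0.902 |
| Poznan | NN5 | 3 | 0.833 | 0.997 | 0.741 | 0.915 | 0.524 | 0.995 | 0.478 | 0.759 |
| Poznan | NN6 | No * | 0.99 | 0.956 | 0.99 | 0.973 | 0.98 | 0.901 | 0.979 | 0.94 |
| Poznan | NN6 | 1 | 0.949 | 0.988 | 0.937 | 0.969 | 0.886 | 0.976 | 0.869 | 0.931 |
| Poznan | NN6 | 2 | 0.912 | 0.999 | 0.939 | 0.955 | 0.857 | 1 | 1 | 0.929 |
| Poznan | NN6 | 3 | 0.75 | 0.999 | 0.9 | 0.875 | 0.429 | 0.997 | 0.529 | 0.713 |
* No means no ice phenomena.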
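Table 6. Results of predictive modeling of ice phenomena for the Warta River (lower course).
| Water Gauge | Model | Class | Training Sensitivity | Training Specificity | Training Precision | Training Balanced Accuracy | Test Sensitivity | Test Specificity | Test Precision | Test Balanced Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| Skwierzyna | XGBoost | No * | 0.987 | 0.941 | 0.985 | 0.964 | 0.987 | 0.908 | 0.979 | 0.948 |
| Skwierzyna | XGBoost | 1 | 0.891 | 0.985 | 0.872 | 0.938 | 0.863 | 0.982 | 0.844 | 0.922 |
| Skwierzyna | XGBoost | 2 | 0.969 | 0.996 | 0.959 | 0.982 | 0.963 | 0.998 | 0.975 | 0.98 |
| Skwierzyna | XGBoost | 3 | 0.381 | 0.999 | 0.727 | 0.69 | 0.143 | 1 | 1 | 0.571 |
| Skwierzyna | NN3 | No * | 0.983 | 0.956 | 0.989 | 0.97 | 0.983 | 0.964 | 0.991 | 0.973 |
| Skwierzyna | NN3 | 1 | 0.913 | 0.98 | 0.84 | 0.946 | 0.936 | 0.98 | 0.84 | 0.958 |
| Skwierzyna | NN3 | 2 | 0.95 | 0.999 | 0.983 | 0.974 | 0.931 | 0.998 | 0.982 | 0.965 |
| Skwierzyna | NN3 | 3 | 0.542 | 0.996 | 0.619 | 0.769 | 0.389 | 0.996 | 0.438 | 0.692 |
| Skwierzyna | NN4 | No * | 0.99 | 0.977 | 0.994 | 0.984 | 0.983 | 0.917 | 0.98 | 0.95 |
| Skwierzyna | NN4 | 1 | 0.952 | 0.99 | 0.92 | 0.971 | 0.832 | 0.978 | 0.813 | 0.905 |
| Skwierzyna | NN4 | 2 | 0.979 | 1 | 0.995 | 0.989 | 0.952 | 0.997 | 0.963 | 0.974 |
| Skwierzyna | NN4 | 3 | 0.842 | 0.998 | 0.762 | 0.92 | 0.391 | 0.997 | 0.562 | 0.694 |
| Skwierzyna | NN5 | No * | 0.993 | 0.968 | 0.993 | 0.981 | 0.99 | 0.887 | 0.972 | 0.938 |
| Skwierzyna | NN5 | 1 | 0.925 | 0.991 | 0.921 | 0.958 | 0.796 | 0.987 | 0.882 | 0.891 |
| Skwierzyna | NN5 | 2 | 0.984 | 0.999 | 0.989 | 0.991 | 0.942 | 0.995 | 0.947 | 0.969 |
| Skwierzyna | NN5 | 3 | 0.625 | 0.997 | 0.625 | 0.811 | 0.385 | 0.996 | 0.556 | 0.69 |
| Skwierzyna | NN6 | No * | 0.995 | 0.986 | 0.997 | 0.99 | 0.98 | 0.934 | 0.984 | 0.957 |
| Skwierzyna | NN6 | 1 | 0.981 | 0.991 | 0.925 | 0.986 | 0.903 | 0.979 | 0.842 | 0.941 |
| Skwierzyna | NN6 | 2 | 0.995 | 1 | 1 | 0.997 | 0.97 | 0.995 | 0.941 | 0.982 |
| Skwierzyna | NN6 | 3 | 0.55 | 1 | 1 | 0.775 | 0.182 | 0.999 | 0.571 | 0.59 |
| Gorzow Wlkp. | XGBoost | No * | 0.984 | 0.938 | 0.985 | 0.961 | 0.982 | 0.9 | 0.976 | 0.941 |
| Gorzow Wlkp. | XGBoost | 1 | 0.911 | 0.981 | 0.877 | 0.946 | 0.866 | 0.982 | 0.883 | 0.924 |
| Gorzow Wlkp. | XGBoost | 2 | 0.973 | 0.999 | 0.986 | 0.986 | 0.977 | 0.998 | 0.963 | 0.987 |
| Gorzow Wlkp. | XGBoost | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Gorzow Wlkp. | NN3 | No * | 0.986 | 0.929 | 0.982 | 0.958 | 0.982 | 0.907 | 0.978 | 0.944 |
| Gorzow Wlkp. | NN3 | 1 | 0.907 | 0.985 | 0.904 | 0.946 | 0.875 | 0.98 | 0.869 | 0.927 |
| Gorzow Wlkp. | NN3 | 2 | 0.98 | 1 | 0.993 | 0.99 | 0.969 | 0.998 | 0.969 | 0.983 |
| Gorzow Wlkp. | NN3 | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Gorzow Wlkp. | NN4 | No * | 0.981 | 0.946 | 0.987 | 0.963 | 0.98 | 0.926 | 0.982 | 0.953 |
| Gorzow Wlkp. | NN4 | 1 | 0.92 | 0.978 | 0.865 | 0.949 | 0.885 | 0.978 | 0.863 | 0.932 |
| Gorzow Wlkp. | NN4 | 2 | 0.986 | 1 | 0.993 | 0.993 | 0.964 | 0.996 | 0.937 | 0.98 |
| Gorzow Wlkp. | NN4 | 3 | 0 | 1 | - | 0.5 | 0 | 1 | - | 0.5 |
| Gorzow Wlkp. | NN5 | No * | 0.987 | 0.959 | 0.989 | 0.973 | 0.972 | 0.908 | 0.979 | 0.94 |
| Gorzow Wlkp. | NN5 | 1 | 0.937 | 0.986 | 0.914 | 0.961 | 0.858 | 0.974 | 0.83 | 0.916 |
| Gorzow Wlkp. | NN5 | 2 | 0.986 | 1 | 0.993 | 0.993 | 0.912 | 0.995 | 0.912 | 0.953 |
| Gorzow Wlkp. | NN5 | 3 | 0.444 | 0.999 | 0.571 | 0.722 | 0.333 | 0.997 | 0.222 | 0.665 |
| Gorzow Wlkp. | NN6 | No * | 0.989 | 0.947 | 0.987 | 0.968 | 0.973 | 0.913 | 0.979 | 0.943 |
| Gorzow Wlkp. | NN6 | 1 | 0.929 | 0.986 | 0.914 | 0.958 | 0.882 | 0.974 | 0.838 | 0.928 |
| Gorzow Wlkp. | NN6 | 2 | 0.986 | 1 | 1 | 0.993 | 0.949 | 0.998 | 0.97 | 0.974 |
| Gorzow Wlkp. | NN6 | 3 | 0.333 | 1 | 1 | 0.667 | 0.167 | 0.999 | 0.25 | 0.583 |
* No means no ice phenomena.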
Table 7. Models with the best prediction of ice phenomena on the Warta River.
| Water Gauge | No Ice | Class 1 | Class 2 | Class 3 |
|---|---|---|---|---|
| Bobry | NN4 | NN5 | NN4 | No results * |
| XGBoost |
| Sieradz | XGBoost | XGBoost | NN3 | NN4 |
| Uniejow | XGBoost | XGBoost | XGBoost | NN5 |
| Nowa Wies | XGBoost | NN5 | NN5 | NN5 |
| XGBoost |
| Srem | XGBoost | XGBoost | NN6 | NN5 |
| Poznan | NN4 | NN3 | NN3 | NN5 |
| Skwierzyna | NN3 | NN3 | NN6 | NN4 |
| XGBoost |
| Gorzow Wlkp. | NN4 | NN4 | XGBoost | NN5 |
* No results: no results from the learned relations between the class of the phenomenon and predictors.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Graf, R.; Kolerski, T.; Zhu, S. Predicting Ice Phenomena in a River Using the Artificial Neural Network and Extreme Gradient Boosting. Resources 2022, 11, 12. https://0-doi-org.brum.beds.ac.uk/10.3390/resources11020012