Article

CLAP: Gas Saturation Prediction in Shale Gas Reservoir Using a Cascaded Convolutional Neural Network–Long Short-Term Memory Model with Attention Mechanism

1 Shale Gas Institute of PetroChina Southwest Oil & Gasfield Company, Chengdu 610051, China
2 Research Institute of Petroleum Exploration and Development, PetroChina, Beijing 100083, China
* Author to whom correspondence should be addressed.
Submission received: 9 August 2023 / Revised: 31 August 2023 / Accepted: 1 September 2023 / Published: 4 September 2023

Abstract

Gas saturation prediction is a crucial area of research regarding shale gas reservoirs, as it plays a vital role in optimizing development strategies and improving the efficiency of exploration efforts. Despite the advancements in deep learning techniques, accurately modeling the complex nonlinear relationships involved in gas saturation prediction remains a challenge. To address this issue, we propose a novel cascaded model, CLAP, combining convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) with an attention mechanism. It effectively captures and visualizes the intricate nonlinear relationships, enabling accurate gas saturation prediction in shale gas reservoirs. In this study, nine logging curves from 27 shale gas wells in the Changning area of the Sichuan Basin were used to train the CLAP model for predicting the gas saturation of the Wufeng-Longmaxi Formation shale. Compared to the Archie and random forest models, the CLAP model exhibited enhanced accuracy in predicting shale gas saturation. Promisingly, the CLAP model demonstrates outstanding statistical performance in gas saturation prediction, achieving an impressive R2 score of 0.762 and a mean square error (MSE) score of 0.934. These positive results highlight the effectiveness and potential utility of our proposed CLAP model in accurately predicting gas saturation in shale gas reservoirs. The application of deep learning techniques, such as CNNs, LSTM, and attention mechanisms, presents a promising avenue for further advancements in this field.

1. Introduction

Shale gas reservoirs are a distinctive type of reservoir, and predicting their gas saturation is essential for evaluating the gas reserves and development potential of the reservoir. However, because of the complexity and particularity of shale gas reservoirs, gas saturation prediction faces several challenges. Firstly, the porosity of shale gas reservoirs is very low, and the gas mainly resides in the micropores and nanopores of the rock in an adsorbed state. The release and flow mechanism of this adsorbed gas differs greatly from that in conventional reservoirs, so traditional gas saturation models may not be suitable for shale gas reservoirs. Secondly, the rock properties and components of shale gas reservoirs are complex and diverse, including organic matter content, general rock texture (e.g., the interlocking pattern, shapes, and habits of primary mineral grains), mineral composition, and pore structure. The influence of these factors on gas saturation is complicated, and traditional models often fail to account for them accurately. Researchers have therefore proposed new methods and models to predict the gas saturation of shale gas reservoirs, including the electrical resistivity method, non-electrical logging techniques, and deep learning methods. These methods attempt to comprehensively consider factors such as rock properties, geological conditions, and stratigraphic characteristics to improve the prediction accuracy of gas saturation.
The electrical resistivity method originated in the 1920s and was mainly used for the identification of oil and gas reservoirs, lithology classification, and the quantitative interpretation and evaluation of reservoirs [1,2,3,4,5]. However, water saturation estimation made little progress until Archie proposed his water saturation equation in 1942 [2]. Although the Archie formula has been widely utilized, it comes with stringent prerequisites and demanding formation conditions [3,4]. A succession of resistivity–porosity–saturation models for water saturation measurement has been developed, with several earning recognition in shale applications. Literature reviews indicate that the primary factors affecting water saturation estimation include negatively charged clays, clay swelling, conductive kerogen, pyrite, adsorbed water, pore structure, and low permeability [1,2,3,4,5,6,7]. Many studies have examined the relationship between resistivity logging data and water saturation in sandstone reservoirs, focusing in particular on the Simandoux model [3], the improved Simandoux model [4], and the Indonesian model [5]. Researchers have refined these models to analyze the relationship between water saturation and different types of clay minerals in more depth, such as the dual-water model [6], the Clay model, and the Waxman–Smits–Thomas (WST) model [7,8,9]. More recently, researchers have also conducted many studies on water saturation prediction in organic-rich shales [10,11,12]. The advantages of the electrical resistivity method are that it is simple to use, highly reliable, and suitable for saturation prediction under general geological conditions. However, shale reservoirs differ from conventional sandstone reservoirs: factors such as the graphitization of organic matter, high pyrite content, and the added conductivity of clay minerals have a pronounced impact on resistivity. In the resistivity (electrical) method, the more variable parameters the response equation includes, the greater the iterative solution error. Traditional electrical logging saturation calculation methods are therefore not particularly suitable for low-resistivity shale gas reservoirs.
In recent years, numerous scholars have investigated non-electrical saturation evaluation of shale gas reservoirs using both conventional and specialized logging data [5,6,7,8,9,10]. At present, some researchers relate water saturation to shale skeleton minerals, such as quartz, clay minerals, feldspar, and carbonates (including calcite, dolomite, and siderite), through statistical regression to obtain empirical formulas [10,12,13,14,15,16]. Although these models may be applicable only to specific areas, they perform well there [14,15,16]. The electrical method of predicting water saturation is common in other countries, whereas the non-electrical method is common under China's more complex and unique geological conditions [12,13,14,15,16]. Zhang proposed a model that considers different mineral compositions through statistical regression [12]. Yan, noting that the water saturation of shale reservoirs in the Sichuan Basin decreases with decreasing clay mineral content and increasing organic matter, indicated that a dual-factor calculation model is more appropriate [13]. Shi proposed a method based on the ratio of the organic matter background value to the measured and calculated TOC, but did not consider how bound water in clay and TOC bias the accuracy of the calculated water saturation [14]. Li found that the bound water saturation of shale reservoirs is closely related to the clay mineral content [15,16]. In summary, there is an urgent need to develop a gas saturation prediction model that is widely applicable to most shale reservoirs. Yet, traditional methodologies exhibit distinct limitations [8,9,10,11,12,13,14]. The non-electrical logging method faces challenges such as inaccuracies in the indirect calculation of key reservoir parameters from logging data and the limited amount of experimental data available for correcting these computed parameters [12,13,14,15,16]. Additionally, for shales with diverse mineral combinations, mineral structures, and reservoir properties, it is difficult to apply a single non-electrical model to predict gas saturation across different types of low-resistivity shales. New gas saturation prediction models must therefore be developed based on the mineral composition, organic maturity, and unique logging data characteristics of each specific type of low-resistivity shale [12,13,14,15,16].
In the field of petroleum science and engineering, machine learning and deep learning algorithms have become common tools for logging data analysis and prediction. Aifa successfully predicted the permeability and porosity of the Hassi R’Mel gas field in Algeria using a neuro-fuzzy system [17]. Al-Mudhafar integrated advanced machine learning algorithms with log interpretation to accurately model lithofacies classification and permeability by deriving complex relationships in logging data [18]. Wood proposed a TOB learning network algorithm as an optimized data-matching algorithm with high accuracy and interpretability, which provides an ideal tool for deeper data mining [19]. Otchere introduced a novel ensemble model, combining random forest and Lasso regularization techniques for feature selection, which enhanced reservoir representation using the Extreme Gradient Boosting (XGBoost) regression model for permeability and water saturation prediction [20]. Drawing on the classical capillary pressure formula, Xu established a linear regression model linking rock capillary force inversely to the square of porosity and correlated it with trap height in the pure gas zone [21]. Huang introduced a shale gas reservoir saturation evaluation model using the random forest regression algorithm, offering a highly adaptable tool with strong generalization for shale gas development [22]. However, most machine learning models rely heavily on data preprocessing methods and often struggle to handle complex nonlinear relationships [23,24,25,26,27,28,29,30,31]. This limitation holds true for gas saturation prediction in shale gas reservoirs, where accurately capturing the intricate nonlinear connections across various reservoir indicators is crucial [32,33,34,35,36,37].
Historically, gas saturation estimation of the Wufeng-Longmaxi shale in the Southern Sichuan basin was primarily based on the resistivity model [38,39,40,41,42]. However, this method’s limited precision compromised the accuracy of reservoir evaluations [38,39,40,41,42]. In response, this study integrates well logs with experimentally measured gas saturation data, utilizing a cascaded deep learning framework that combines convolutional neural networks (CNNs), Long Short-Term Memory (LSTM), and attention mechanisms (ATTs) to predict the gas saturation of a shale gas reservoir. This approach seeks to harness the strengths of deep learning methodologies in tandem with geological rock and gas reservoir characteristics and logging data, aiming to bolster the precision and reliability of gas saturation predictions.

2. Saturation Calculation Methods

2.1. Processing of Data Set

2.1.1. Source of Data

In this study, we utilized a dataset consisting of nine logging curves, namely natural gamma (GR), uranium-free gamma (KTH), uranium (U), potassium (K), acoustic time difference (AC), neutron (CNL), density (DEN), deep lateral resistivity (RT), and shallow lateral resistivity (RXO), collected from Schlumberger. These curves correspond to the Wufeng-Longmaxi Formation shale in 27 shale gas wells in the Changning area, at the southern edge of the Sichuan Basin. A total of 1436 gas saturation measurements from these wells served as the experimental data for training the model, allowing it to learn the complex relationships between the logging curves and gas saturation levels. Following the training phase, we applied the trained model to predict gas saturation in an additional set of five shale gas wells that were not included in training.
All of these samples were analyzed for water saturation, and gas saturation was determined by subtracting the water saturation from 100%. Dry shale samples were placed in an environment with constant humidity. When the humidity within the shale is lower than that of the external environment, water vapor diffuses into and is absorbed by the shale, increasing its water content. Water saturation was then calculated from the mass change of the shale samples over this period [25,26].

2.1.2. Analysis of Feature Correlation and Importance

Correlation analysis is a statistical method for measuring and evaluating the strength and direction of the relationships between variables. It reveals intrinsic connections, supports predictive modeling and data exploration, assists feature selection, and uncovers patterns, trends, and anomalies, thereby improving model efficiency and accuracy.
Feature correlations are visualized in Figure 1, where the numerical magnitude represents the degree of correlation (with a maximum value of 1) and the sign indicates whether the correlation is positive or negative. Since the main task of this experiment is to predict water saturation, we focus on the correlation between water saturation and the other curves so that features with high correlation can be selected for prediction. In the second column of the figure, the correlation of every column is greater than 0.1, indicating that all of the logging curves are relevant to water saturation prediction.
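As a rough illustration of this screening step, the short sketch below computes a Pearson correlation matrix with pandas and keeps the curves whose absolute correlation with water saturation exceeds 0.1. The file name and column labels are hypothetical stand-ins for the actual data set, which is not publicly available.

```python
import pandas as pd

# Hypothetical file holding the logging curves and core water saturation (SW);
# the name and columns are assumptions for illustration only.
df = pd.read_csv("changning_logs.csv")

curves = ["GR", "KTH", "U", "K", "AC", "CNL", "DEN", "RT", "RXO"]
corr = df[curves + ["SW"]].corr()          # Pearson correlation matrix

# Curves whose absolute correlation with SW exceeds the 0.1 threshold
relevance = corr["SW"].drop("SW").abs()
selected = relevance[relevance > 0.1].sort_values(ascending=False)
print(selected)
```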
To reveal the degree of association between each feature and the target variable and to identify which features play a key role in its prediction, we exploited the ability of the random forest model to quantify feature importance [23]. A random forest model was therefore constructed, and the importance of the input parameters, that is, the contribution of each logging curve type to the predictions of the random forest regression algorithm, was analyzed. Ten logging curves, namely natural gamma (GR), uranium-free gamma (KTH), uranium (U), thorium (TH), potassium (K), acoustic time difference (AC), neutron (CNL), density (DEN), deep lateral resistivity (RT), and shallow lateral resistivity (RXO), were selected as the input curves of the sample, and core water saturation (SW) was used as the output label (Figure 1). Analysis of the relationship between the logging curves and core water saturation showed that DEN, CNL, RT, RXO, K, AC, GR, U, and KTH are of high importance for predicting water saturation in ultra-low-resistivity shale gas reservoirs, while TH is of only moderate importance. Therefore, the nine logging curves with high importance were selected as the input curves of the prediction model.
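A minimal sketch of this importance ranking is shown below, assuming the same hypothetical data file; scikit-learn's impurity-based importances and the chosen number of trees stand in for whatever settings were actually used.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical file with the ten candidate logging curves and core water saturation (SW)
df = pd.read_csv("changning_logs.csv")
features = ["GR", "KTH", "U", "TH", "K", "AC", "CNL", "DEN", "RT", "RXO"]

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(df[features], df["SW"])

# Impurity-based contribution of each curve to the regression
for name, score in sorted(zip(features, rf.feature_importances_),
                          key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {score:.3f}")
```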

2.2. Water Saturation Prediction Method

  • A brief description of the experimental procedure.
The experimental procedure involved several steps. Firstly, the original dataset was obtained and subjected to various analyses, including feature correlation and importance analysis, in order to explore the relationships within the data. The experimental model employed in this study is the CLAP model, which combines convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) with an attention mechanism. The model was trained on the prepared dataset and its performance was evaluated; the trained model was then used for gas saturation prediction. Figure 2 visually represents the complete pipeline for predicting water saturation using the CLAP model, providing a clear illustration of the step-by-step process and the specific role played by each component within the pipeline. The CLAP model is a pluggable module bridging the split data to the gas saturation prediction.
Furthermore, Table 1, Table 2 and Table 3 offer comprehensive comparisons and detailed analyses of the gas saturation prediction results, as well as the metrics among the three aforementioned models.

2.2.1. CNN Model

CNNs are particularly advantageous when dealing with data that exhibit a grid-like structure, such as time series and image data. Their benefits include parameter sharing and sparse connections, which reduce the number of parameters to be learned. Consequently, CNNs can be trained effectively on smaller datasets, mitigating the risk of overfitting. The CNN architecture primarily comprises several modules, with the convolutional layer being the central component that sets it apart from other neural networks. In this layer, filter parameters are used to perform convolution operations on the input layer (the experimental data), enabling the extraction of fundamental features. These filter parameters are initially randomly initialized, and the defined loss function is then used for backpropagation to obtain the filter parameters best suited for feature extraction [24]. Taking the convolution of two-dimensional data as an example, the convolution operation is:
S(i, j) = (X ∗ K)(i, j) = Σ_m Σ_n X(m, n) K(i − m, j − n)
where ∗ is the convolution operation.
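For concreteness, a minimal NumPy sketch of this discrete convolution (restricted to fully overlapping, "valid" output positions) is given below; the toy input and averaging kernel are arbitrary.

```python
import numpy as np

def conv2d(X: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Valid 2-D convolution S(i, j) = sum_m sum_n X(m, n) K(i - m, j - n).
    Flipping the kernel and sliding it over X realizes the index reversal in K."""
    kh, kw = K.shape
    K_flipped = K[::-1, ::-1]
    out_h, out_w = X.shape[0] - kh + 1, X.shape[1] - kw + 1
    S = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            S[i, j] = np.sum(X[i:i + kh, j:j + kw] * K_flipped)
    return S

X = np.arange(16, dtype=float).reshape(4, 4)   # toy 4 x 4 input
K = np.ones((3, 3)) / 9.0                      # simple 3 x 3 averaging filter
print(conv2d(X, K))                            # 2 x 2 feature map
```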
The second module in a convolutional neural network is the pooling layer, which computes aggregate statistics over a specific layer of the network. Its primary purpose is to reduce model size, improve computation speed, and enhance the robustness of the extracted features to mitigate overfitting. The pooling layer has a significant advantage over the convolutional layer: it has no parameters to learn, which relieves parameter pressure within the network. Two commonly used types of pooling are max pooling and average pooling (Table 1, Table 2 and Table 3). Taking two-dimensional data as an example, max pooling compresses the input data by selecting the maximum element value within the corresponding area, whereas average pooling takes the average value within the area.
The third module is the fully connected layer, in which every neuron is connected to all neurons of the adjacent layer. It applies an appropriate activation function to generate the output activation values, which represent the features extracted by the convolutional neural network [25].
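To make the three modules concrete, the sketch below stacks a 1-D convolution, max pooling, and a fully connected layer in PyTorch for short depth windows of logging curves; the channel counts and window length are illustrative assumptions, not the configuration used in the paper.

```python
import torch
import torch.nn as nn

class CNNBlock(nn.Module):
    """Convolution -> ReLU -> max pooling -> fully connected feature head."""
    def __init__(self, in_channels: int = 9, seq_len: int = 32):
        super().__init__()
        self.conv = nn.Conv1d(in_channels, 32, kernel_size=3, padding=1)  # local feature extraction
        self.pool = nn.MaxPool1d(kernel_size=2)                           # aggregate statistics, halve length
        self.fc = nn.Linear(32 * (seq_len // 2), 64)                      # fully connected output features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, logging curves, depth samples)
        h = torch.relu(self.conv(x))
        h = self.pool(h)
        return torch.relu(self.fc(h.flatten(start_dim=1)))

features = CNNBlock()(torch.randn(4, 9, 32))   # -> tensor of shape (4, 64)
```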

2.2.2. LSTM Model

Long Short-Term Memory (LSTM) models are a variant of recurrent neural networks (RNNs), which are commonly used for modeling and prediction tasks that deal with sequence data (Figure 3). Compared with the traditional RNN model, the LSTM model introduces a gating mechanism, which can better capture and remember the long-term dependencies in the input sequence [26].
The core idea of the LSTM model is to use memory units (cells) to store and transfer information and to control the flow of information through gates. Specifically, the LSTM model consists of an input gate, a forget gate, an output gate, and a memory cell [27].
The input gate determines how much input information will be passed to the memory cell, the forget gate determines whether to delete the previous memory, and the output gate determines the output of the hidden state. The memory unit is responsible for storing and transferring long-term information, which is controlled and updated by the calculations of the gating unit.
In Figure 3, x_t is the input at time t, C_(t−1) is the memory value at time t − 1, and h_(t−1) is the LSTM output at time t − 1; these three quantities, x_t, C_(t−1), and h_(t−1), constitute the inputs of the model. C_t is the memory value at time t and h_t is the LSTM output at time t; C_t and h_t constitute the outputs of the model [28].
The control functions of the forget gate, input gate, and output gate are:
f_i^(t) = σ( b_i^f + Σ_j U_{i,j}^f x_j^(t) + Σ_j W_{i,j}^f h_j^(t−1) )
g_i^(t) = σ( b_i^g + Σ_j U_{i,j}^g x_j^(t) + Σ_j W_{i,j}^g h_j^(t−1) )
q_i^(t) = σ( b_i^o + Σ_j U_{i,j}^o x_j^(t) + Σ_j W_{i,j}^o h_j^(t−1) )
where b, U, and W are the bias, the input weights, and the cyclic (recurrent) weights of each gate, with the superscripts f, g, and o referring to the forget, input, and output gates, respectively, and σ is the sigmoid function. In these variants, the cell state can optionally be used as an additional input to the three gates of the ith cell (Table 1, Table 2 and Table 3).
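A toy PyTorch sketch of a single LSTM time step following these gate equations is given below (f: forget gate, g: input gate, o: output gate, plus the candidate memory term); all dimensions and random weights are illustrative.

```python
import torch

def lstm_step(x_t, h_prev, c_prev, U, W, b):
    """One LSTM step: gates computed from the current input and previous hidden state."""
    f = torch.sigmoid(x_t @ U["f"] + h_prev @ W["f"] + b["f"])     # forget gate
    g = torch.sigmoid(x_t @ U["g"] + h_prev @ W["g"] + b["g"])     # input gate
    o = torch.sigmoid(x_t @ U["o"] + h_prev @ W["o"] + b["o"])     # output gate
    c_tilde = torch.tanh(x_t @ U["c"] + h_prev @ W["c"] + b["c"])  # candidate memory
    c_t = f * c_prev + g * c_tilde                                 # updated cell state C_t
    h_t = o * torch.tanh(c_t)                                      # new hidden state h_t
    return h_t, c_t

n_in, n_hid = 9, 16                                    # 9 logging curves, 16 hidden units (toy sizes)
U = {k: torch.randn(n_in, n_hid) for k in "fgoc"}      # input weights
W = {k: torch.randn(n_hid, n_hid) for k in "fgoc"}     # cyclic (recurrent) weights
b = {k: torch.zeros(n_hid) for k in "fgoc"}            # biases
h, c = lstm_step(torch.randn(1, n_in), torch.zeros(1, n_hid), torch.zeros(1, n_hid), U, W, b)
```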

2.2.3. Attention Mechanism

The attention mechanism is an important component of neural network models (Figure 4). Its principle is to assign different attention weights to different parts of the input data, thereby reducing the role of irrelevant parts [29,30]. This enables the model to focus more on important information during processing and learning tasks, ultimately enhancing performance. As shown in Figure 4, the attention mechanism focuses on relevant parts of the input sequence while generating an output. It involves computing attention weights that determine the importance of each element in the input sequence with respect to a query. These attention weights are used to calculate a context vector, which captures the most relevant information from the input. The context vector is then combined with the query to produce the final output.
The relevant formulas [29,30] are as follows:
α_i = exp( s(h_i, h_t) ) / Σ_{j=1}^{N} exp( s(h_j, h_t) )
α = Σ_{i=1}^{N} α_i h_i
where s(h_i, h_t) is the score of the ith input feature (hidden state h_i) with respect to the query h_t, and a higher score indicates greater attention. α_i is the attention weight of the ith input feature, obtained as the ratio of the exponential of its score to the sum over all scores. All hidden states are then weighted and summed to obtain the final context vector α (Figure 4).
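The sketch below implements this weighting step in PyTorch, assuming a simple dot product for the scoring function s(h_i, h_t), which the text does not specify; the shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def attention(H: torch.Tensor, query: torch.Tensor):
    """H: (N, d) hidden states h_i; query: (d,) vector h_t."""
    scores = H @ query                               # s(h_i, h_t) for every i (dot-product assumption)
    alpha = F.softmax(scores, dim=0)                 # normalized attention weights alpha_i
    context = (alpha.unsqueeze(1) * H).sum(dim=0)    # weighted sum of hidden states
    return context, alpha

H = torch.randn(10, 16)                              # e.g., 10 LSTM outputs of dimension 16
context, alpha = attention(H, H[-1])                 # last hidden state used as the query
```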

2.2.4. CLAP Model

The CNN-LSTM model with an attention mechanism is applied to the prediction of shale gas saturation. This study utilizes data from 27 shale gas wells located in the Changning area, on the southern margin of the Sichuan Basin, specifically targeting the Wufeng Formation and Longmaxi Formation shale. The initial step of the model involves performing convolution operations on the input data, effectively extracting local features (depicted in Figure 5). The convolutional neural network (CNN) plays a crucial role in this extraction process. The extracted features obtained from the CNN are then fed into the Long Short-Term Memory (LSTM) network for sequential encoding. The LSTM network enables the modeling of long-term dependencies within the sequence data, generating a contextual representation. Following the LSTM output, attention weights are calculated to discern and emphasize significant segments of the input sequence. These attention weights serve as indicators of the relative importance of different parts of the sequence. By employing these weights, a weighted summation of the LSTM outputs is performed, yielding an integrated context representation that better captures the essential information. This integrated context representation subsequently serves as input to further structures, including the fully connected layer, facilitating the prediction of gas saturation.
The CLAP model offers several advantages. Firstly, the CNN component excels at extracting local features from the gas saturation experimental data, enabling effective capture of feature information at various scales (as demonstrated in Figure 6). This capability allows the model to extract multiple levels of abstract representation from the input sequence, thereby expressing the intrinsic characteristics of the experimental data more accurately. Secondly, the attention mechanism plays a pivotal role in the CLAP model. It selectively focuses on important segments of the input experimental data. Through self-learning, the model can autonomously determine which parts are crucial for predicting gas saturation, effectively introducing attention into the decision-making process. Lastly, due to the cohesive integration of the CNN, LSTM, and attention mechanism, the CLAP model exhibits strong expressive power and generalization ability [31]. It can effectively capture the information embedded within the input experimental data, leading to more precise predictions of gas saturation (as illustrated in Table 4 and Table 5).
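Putting the three components together, a compact PyTorch sketch of a cascaded CNN-LSTM-attention regressor of this kind is shown below. The layer sizes, the dot-product attention with the last hidden state as query, and the depth-window input format are assumptions for illustration, not the published CLAP configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadedRegressor(nn.Module):
    """CNN feature extraction -> LSTM encoding -> attention pooling -> fully connected output."""
    def __init__(self, n_curves: int = 9, hidden: int = 64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_curves, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.query = nn.Linear(hidden, hidden)     # projection used to score the LSTM outputs
        self.head = nn.Linear(hidden, 1)           # gas saturation regression head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, logging curves, depth window)
        feats = self.cnn(x).transpose(1, 2)        # -> (batch, time steps, channels)
        out, _ = self.lstm(feats)                  # sequential encoding of local CNN features
        q = self.query(out[:, -1])                 # query built from the last time step
        scores = torch.bmm(out, q.unsqueeze(2)).squeeze(2)
        alpha = F.softmax(scores, dim=1)           # attention weights over time steps
        context = torch.bmm(alpha.unsqueeze(1), out).squeeze(1)  # weighted summation
        return self.head(context).squeeze(1)       # predicted gas saturation

pred = CascadedRegressor()(torch.randn(8, 9, 32))  # -> tensor of shape (8,)
```

Using the last hidden state as the attention query is only one common choice; any scoring function that produces a weight per time step would fit the same cascade.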

2.3. Performance Evaluation

2.3.1. Evaluation Index of the Experiment

The mean absolute error (MAE) is a common regression loss function that indicates the average magnitude of the error in the predicted values regardless of its direction. Its formula is as follows:
MAE = (1/N) Σ_{i=1}^{N} |y_i − ŷ_i|
The mean square error (MSE) is used in regression prediction tasks to measure the average squared difference between the predicted values of a model and the true values. Its formula is as follows:
MSE = (1/N) Σ_{i=1}^{N} (y_i − ŷ_i)^2
The R2 score, also known as the coefficient of determination, is a statistical measure of the goodness of fit of a regression model. It indicates the proportion of the variance in the target variable that the model is able to explain, and its formula is as follows:
R2 = 1 − [ Σ_{i=1}^{n} (y_i − ŷ_i)^2 ] / [ Σ_{i=1}^{n} (y_i − ȳ)^2 ]
where y_i is the measured value, ŷ_i is the predicted value, ȳ is the mean of the measured values, and N (or n) is the number of samples.
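The three metrics can be computed directly from these definitions; a small NumPy sketch with arbitrary toy values follows.

```python
import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def r2(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)          # residual sum of squares
    ss_tot = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = np.array([57.2, 50.4, 45.5, 52.3])    # arbitrary toy values
y_pred = np.array([55.8, 47.7, 46.6, 52.3])
print(mae(y_true, y_pred), mse(y_true, y_pred), r2(y_true, y_pred))
```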

2.3.2. Presentation and Analysis of Experimental Results

The CLAP combined model was implemented and trained in Python based on the PyTorch framework. The dataset was processed and divided into two parts: 70% of the data were used to train the experimental model, while the remaining 30% were used for model validation; the input features include depth, AC, GR, CNL, and DEN.
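A schematic training setup under these settings (70/30 split, MSE loss, the 0.0002 learning rate from Table 4, and 500 epochs as reported in Section 2.3.3) might look as follows; the random toy tensors and the simple linear stand-in model are placeholders for the real preprocessing pipeline and the CLAP network.

```python
import torch

# Toy stand-ins for the real data set: 1436 depth-window samples of 9 logging curves
# with gas saturation labels; the actual preprocessing is not reproduced here.
X = torch.randn(1436, 9, 32)
y = torch.rand(1436) * 100.0

# 70/30 split into training and validation subsets
perm = torch.randperm(len(X))
n_train = int(0.7 * len(X))
train_idx, val_idx = perm[:n_train], perm[n_train:]
X_train, y_train, X_val, y_val = X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Simple linear placeholder; the cascaded CNN-LSTM-attention sketch from
# Section 2.2.4 could be substituted here.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(9 * 32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)
loss_fn = torch.nn.MSELoss()

for epoch in range(500):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(X_train).squeeze(1), y_train)
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 100 == 0:
        model.eval()
        with torch.no_grad():
            val_loss = loss_fn(model(X_val).squeeze(1), y_val).item()
        print(f"epoch {epoch + 1}: train {loss.item():.4f}, val {val_loss:.4f}")
```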

2.3.3. Characteristics of Error Function Variation and Model Convergence

The CLAP model was trained using 1436 samples over 500 epochs. The training error variation throughout the training process is illustrated in Figure 7. An epoch, in this context, refers to the complete forward and backward propagation of all data through the network [13,14,15]. The loss plots indicated convergence after approximately 100 epochs. As the training progressed, the training error decreased to 0.014 (Figure 7a), while the R2 value steadily increased to 0.843 (Figure 7b). This suggests that the model’s regression analysis error was minimal, indicating good performance.

3. Result and Discussion

In this study, nine logging curves—including natural gamma (GR), uranium-free gamma (KTH), uranium (U), potassium (K), acoustic time difference (AC), neutron (CNL), density (DEN), deep lateral resistivity (RT), and shallow lateral resistivity (RXO)—were sourced from 27 shale gas wells in the Changning area at the southern edge of the Sichuan Basin. These curves, related to the Wufeng-Longmaxi Formation shale, were utilized to train the CLAP model. This trained model was then applied to predict gas saturation in five other shale gas wells not used in the training phase. Upon comparison with the Archie model (methods and details referenced in [13]) and the random forest model, the CLAP model demonstrated superior accuracy in predicting shale gas saturation.
Firstly, the prediction accuracy of the three models was compared using the gas saturation prediction results for a single well (Table 6, Figure 8). Well A is a shale gas well in the west of the Changning area. The lithology of the Wufeng-Longmaxi Formation is mainly black siliceous shale, gray calcareous shale, and gray clay shale, and the shale gas shows are good. The GR curve of the Wufeng-Longmaxi Formation shows high values at 1313–1322 m, and the deep and shallow lateral resistivity curves are low, ranging from 14 Ω·m to 77 Ω·m; in some intervals the resistivity is less than 20 Ω·m, which gives the curve a jagged shape overall. Comparing the gas saturation predictions of the Archie model, the random forest model, and the CLAP model in Well A shows that the CLAP model results are clearly the most consistent with the gas saturation measured experimentally on core. The gas saturation calculated by the Archie model is significantly lower than the experimental gas saturation data. For the random forest model, the difference between the predicted and experimental gas saturation is significantly larger than that of the CLAP model. To further compare the accuracy of the three models, a correlation analysis between the predicted gas saturation of each model and the core analysis of Well A was carried out (Figure 9). The predictions of the CLAP model have the strongest correlation with the experimental results, with an R2 of 0.97, whereas the predictions of the random forest model have the weakest correlation, with an R2 of only 0.57.
The comparison of the gas saturation predictions of the three models with the experimental data for the five shale gas wells is shown in Table 7. The mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2) between the CLAP model predictions and the experimental data were 1.44–2.01, 0.89–1.87, and 0.74–0.98, respectively, indicating a strong correlation between the predicted results and the experimental data. For the Archie model, the MAE, RMSE, and R2 were 12.34–27.43, 12.34–16.93, and 0.46–0.78, respectively, indicating a poor correlation. For the random forest model, the MAE, RMSE, and R2 were 8.45–16.45, 13.46–27.47, and 0.26–0.67, respectively, indicating the worst correlation. From the perspective of the three error evaluation parameters, the CLAP model has the smallest error between the prediction results and the experimental data. Comparing the calculation results for these wells shows that the Archie model results are generally lower than the experimental results, with large differences in some data, while the random forest model predictions differ greatly from the experimental results. Therefore, neither the Archie model nor the random forest model is suitable for calculating shale gas saturation in the study area. The error between the CLAP model predictions and the experimental results is relatively small, so the CLAP model is more suitable for predicting shale gas saturation in the Changning area (Figure 9).
Comparing the above three models shows that the CLAP model is superior to the traditional Archie model and the random forest model in predicting shale gas saturation (Figure 9). Although the CLAP model does not have an equation representing physical properties, it can establish a more accurate water saturation prediction model through multivariate network training on the logging parameters related to water saturation, and thereby obtain a more accurate gas saturation. The advantage of the CLAP model is that it can make full use of the diversity of logging data: it combines the characteristics of convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) networks, has a good memory capability for time series, and uses an attention mechanism (ATT) to weight features according to their importance and relevance [43,44,45]. This enables the model to capture the time series information in the input data and automatically select the most relevant features for prediction [43,44,45]. The superior precision of the CLAP prediction model stems from two fundamental components: the incorporation of the attention mechanism and the synergistic combination of the CNN and LSTM neural network architectures [46,47,48,49]. The attention mechanism plays a pivotal role by assigning varying attention weights to different segments of the input data. This enables the model to diminish the influence of irrelevant information and instead prioritize relevant and important details during processing and learning [46,47,48,49]. Integrated into CLAP, it allows the model to selectively concentrate on significant aspects of the input experimental data and to identify and focus on the most informative features, leading to enhanced predictive capability [46,47,48,49,50,51]. Moreover, the CNN and LSTM structures are combined to model the implicit information in the input data. The CNN is mainly responsible for extracting local features from the gas saturation experimental data, enabling it to capture feature information at different scales effectively [50,51,52,53]. The LSTM component addresses the challenge of modeling long-term dependencies, which are crucial in sequence data [50,51,52,53]. Combining these two neural network structures yields a more comprehensive understanding of the input experimental data and an improved prediction model [50,51,52,53].
Compared to the CLAP model, the problem with the random forest model is that it is relatively weak at handling high-dimensional data and time series data; it is better suited to static, low-dimensional data. Therefore, the random forest model performs poorly in predicting shale gas saturation. The applicability of the CLAP model for gas saturation prediction rests on four conditions: first, the logging data must be of good quality, with non-formation effects corrected before prediction; second, the core depth positioning must be accurate; third, the sensitive input variables must be selected appropriately, since only by selecting sensitive logging curves of higher importance can the prediction results be optimized; and fourth, a large number of training samples must be used, with more than 100 training samples per well.
In future endeavors, we aim to enhance the CLAP model by refining its attention mechanism, which, despite effectively discerning complex nonlinear relationships, still offers potential for optimization, especially across varied shale gas contexts. Additionally, while the model has shown notable efficacy with the Sichuan Basin dataset, its broader generalization across different shale gas reservoirs necessitates evaluation using data from diverse geographical regions. Subsequent studies should investigate the CLAP model’s applicability as it extends beyond its current domain, making it a potential tool for predicting various time series data. This includes predicting EUR, TOC, and porosity in shale reservoir research, as well as applications in agriculture, soil analysis, weather forecasting, traffic data interpretation, and power load data analysis. The predictive insights garnered from the CLAP model can offer valuable technical guidance for expert decision-making across these sectors.

4. Conclusions

Traditional logging evaluation methods, often based on empirical relationships or statistical regression, struggle to delineate complex nonlinear relationships within logging data. In response, we introduce CLAP, a model that harnesses deep neural networks, particularly integrating convolutional neural networks (CNNs) and Long Short-Term Memory (LSTM) with an attention mechanism, to discern and extract concealed features from extensive logging data. This innovative approach notably enhances prediction accuracy, yielding outcomes of substantial practical value.
The high accuracy of the CLAP prediction model can be attributed to two primary factors: the introduction of the attention mechanism and the integration of CNN and LSTM neural network structures. The attention mechanism assigns differential weights to segments of input data, effectively emphasizing relevant details and diminishing irrelevant information. This prioritization, facilitated by the attention mechanism, empowers the model to hone in on the most informative features, thereby augmenting its predictive capabilities. In CLAP, the CNN captures local features from gas saturation data, while the LSTM models long-term sequence dependencies. Their combined synergy enhances the accuracy of gas saturation predictions.
Nine logging curves from 27 shale gas wells in the Changning area of the Southern Sichuan Basin were employed to train the CLAP model. This model was subsequently applied to predict gas saturation in five additional shale gas wells not included in the training phase. When compared to both the Archie model and the random forest model, the CLAP model showcased superior accuracy in predicting shale gas saturation. Notably, the CLAP model achieved an R2 score of 0.762 and a mean square error (MSE) score of 0.934, underscoring its outstanding statistical performance and potential utility in gas saturation prediction.

Author Contributions

Conceptualization, T.Z. and X.Y.; methodology, C.Z. and S.Z.; software, T.Z. and S.Z.; validation, T.Z.; formal analysis, D.Z. and Z.S.; investigation, X.Y. and C.Z.; resources, T.Z.; data curation, D.Z.; writing—original draft preparation, T.Z. and R.J.; writing—review and editing, S.L. and Y.Z.; supervision, R.J. and Y.Z.; project administration, S.L. and G.W.; funding acquisition, M.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is unavailable due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Burnaman, M.D.; Xia, W.W.; Shelton, J. Shale gas play screening and evaluation criteria. China Pet. Explor. 2009, 14, 51–64. [Google Scholar]
  2. Kaleris, V.K.; Ziogas, A.I. Using electrical resistivity logs and short duration pumping tests to estimate hydraulic conductivity profiles. J. Hydrol. 2020, 590, 125277. [Google Scholar] [CrossRef]
  3. Simandoux, P. Dielectric Measurements on Porous Media Application to the Measurement of Water Saturations: Study of the Behaviour of Argillaceous Formations; Supplementary Issue; Institut Francais du Petrole: Rueil-Malmaison, France, 1963; Volume 18, pp. 193–215. [Google Scholar]
  4. Mahdi, Z.A.; Farman, G.M. A Review on Models for Evaluating Rock Petrophysical Properties. Iraqi J. Chem. Pet. Eng. 2023, 24, 125–136. [Google Scholar] [CrossRef]
  5. Duan, X.; Wu, Y.; Jiang, Z.; Hu, Z.; Tang, X.; Zhang, Y.; Wang, X.; Chen, W. A New Method for Predicting the Gas Content of Low-Resistivity Shale: A Case Study of Longmaxi Shale in Southern Sichuan Basin, China. Energies 2023, 16, 6169. [Google Scholar] [CrossRef]
  6. Clavier, C.; Coates, G.; Dumanoir, J. Theoretical and Experimental Bases for the Dual-Water Model for Interpretation of Shaly Sands. Soc. Pet. Eng. J. 1984, 24, 153–168. [Google Scholar] [CrossRef]
  7. Schlumberger. Schlumberger Log Interpretation Principles/Applications, 8th ed.; Schlumberger: Sugar Land, TX, USA, 1989. [Google Scholar]
  8. Waxman, M.H.; Thomas, E.C. Electrical conductivities in shaly sands–II. The temperature coefficient of electrical conductivity. J. Petrol. Technol. Trans. AIME 1974, 257, 218. [Google Scholar]
  9. Waxman, M.H.; Smits, L.J.M. Electrical conductivities in oil-bearing shaly sands. Soc. Petrol. Eng. J. 1968, 8, 107–122. [Google Scholar] [CrossRef]
  10. Archie, G.E. The electrical resistivity log as an aid in determining some reservoir characteristics. Trans. AIME 1942, 146, 54–62. [Google Scholar] [CrossRef]
  11. Yong, S.H.; Zhang, C.M. Logging Data Processing and Comprehensive Interpretation; China University of Petroleum Press: Qingdao, China, 2007. [Google Scholar]
  12. Zhang, J.Y.; Sun, J.M. Log evaluation on shale hydrocarbon reservoir. Well Logging Technol. 2012, 36, 146–153. [Google Scholar]
  13. Yan, L.; Zhou, W.; Fan, J.Y.; Wu, J.; Wang, X. Log evaluation method for gas content of deep shale gas reservoirs in southern Sichuan Basin. Well Logging Technol. 2019, 43, 149–154. (In Chinese) [Google Scholar]
  14. Shi, W.R.; Zhang, C.M.; Zhang, Z.S.; Xiao, S.H.; Shi, Y.H.; Ren, Y. Log evaluation of gas content from Jiaoshiba shale gas reservoir in fuling gas field. Well Logging Technol. 2015, 39, 357–362. (In Chinese) [Google Scholar]
  15. Li, J.; Wu, Q.Z.; Jin, W.J.; Jing, L.; Nan, Z.Y. Logging evaluation of free-gas saturation and volume content in Wufeng-Longmaxi organic-rich shales in the Upper Yangtze Platform. China Mar. Petrol. Geol. 2019, 100, 530–539. [Google Scholar] [CrossRef]
  16. Li, P.J.; Chen, A.Q.; Fu, Y.L.; Han, W.S. Study on determining Nuclear Magnetic Resonance (NMR) T2 cutoff combined with the oil displacing water experiments. Prog. Geophys. 2019, 34, 1050–1054. (In Chinese) [Google Scholar]
  17. Aifa, T.; Baouche, R.; Baddari, K. Neuro-fuzzy system to predict permeability and porosity from well log data: A case study of Hassi R’Mel gas field, Algeria. J. Pet. Sci. Eng. 2014, 123, 217–229. [Google Scholar] [CrossRef]
  18. Al-Mudhafar, W.J. Integrating well log interpretations for lithofacies classification and permeability modeling through advanced machine learning algorithms. J. Pet. Explor. Prod. Technol. 2017, 7, 1023–1033. [Google Scholar] [CrossRef]
  19. Wood, D.A. Prediction of gas saturation in shale gas reservoirs: A review. J. Pet. Sci. Eng. 2020, 178, 106587. [Google Scholar] [CrossRef]
  20. Almpanis, A.; Tsourlos, P.; Vargemezis, G.; Papazachos, C. Application of crosshole electrical resistivity tomography measurements under the influence of horizontally slotted plastic cased boreholes. Near Surf. Geophys. 2022, 20, 46–63. [Google Scholar] [CrossRef]
  21. Xu, S.L.; Zhang, Y.; Li, H.; Zhang, Z.H.; Luo, Q.G.; Li, W.Y. Quantitative prediction of gas saturation in low permeability tight sandstone reservoirs with multiple parameters. Nat. Gas Explor. Dev. 2020, 45, 92–98. [Google Scholar]
  22. Huang, L.S.; Yan, J.P.; Guo, W.; Zheng, M.J.; Zhong, G.H.; Huang, Y. Saturation evaluation of low resistivity shale gas reservoir based on random forest regression algorithm. Well Logging Technol. 2023, 47, 22–28. [Google Scholar]
  23. Nembrini, S. On what to permute in test-based approaches for variable importance measures in Random Forests. Bioinformatics 2019, 15, 2701–2705. [Google Scholar] [CrossRef]
  24. Li, W. Research on Loan Overdue Prediction Method Based on LSTM-CNN. Inf. Technol. Inform. 2020, 248, 40–41. [Google Scholar]
  25. Wei, J.; Zhao, H.; Liu, D.; Jia, H.; Wang, X.; Zhang, H.; Liu, N. CNN-LSTM Short-Term Power Load Forecasting Method Based on Attention Mechanism. J. North China Electr. Power Univ. Nat. Sci. Ed. 2021, 48, 42–47. [Google Scholar]
  26. Liu, Y.Y. Air quality index prediction based on CNN-LSTM and attention mechanism. Comput. Age 2022, 355, 58–60. [Google Scholar]
  27. Wang, J.; Gao, Z.; Zhu, Y.-M. Water quality prediction of the Yellow River based on CNN-LSTM model. Yellow River 2021, 43, 96–99. [Google Scholar]
  28. Ma, F.; Tu, Z.Y.; Zhu, S.T.; Xiang, M.Y.; Sun, Y.F.; Fang, Q. Research on LSTM water level prediction model based on improved attention mechanism. Jiangxi Water Resour. Sci. Technol. 2023, 49, 162–166. [Google Scholar]
  29. Mu, J.; He, H.; Li, L.; Pang, S.; Liu, C. A Hybrid Network Intrusion Detection Model Based on CNN-LSTM and Attention Mechanism; Springer: Singapore, 2022. [Google Scholar] [CrossRef]
  30. Huang Fu, X.; Qian, H.; Huang, M. Review of deep neural networks combined with Attention mechanism. Comput. Mod. 2023, 330, 40–49. [Google Scholar]
  31. Chen, X.; Zhu, K. Lithium battery health state assessment based on CNN-LSTM and attention mechanism. Ind. Control Comput. 2012, 35, 80–82. [Google Scholar]
  32. Oluwole, A.G.; Nosa, E.J. Morphological Analysis of an Organic-rich Shale: Implication for Potential Gas Energy Generation, Witbank Coalfield, South Africa. Microsc. Microanal. 2023, 29 (Suppl. S1), 766–767. [Google Scholar]
  33. Niu, W.; Sun, Y.; Yang, X.; Lu, J.; Zhao, S.; Yu, R.; Liang, P.; Zhang, J. Toward Production Forecasting for Shale Gas Wells Using Transfer Learning. Energy Fuels 2023, 37, 5130–5142. [Google Scholar] [CrossRef]
  34. Kang, D.; Ma, J.; Zhao, Y.P. Perspectives of Machine Learning Development on Kerogen Molecular Model Reconstruction and Shale Oil/Gas Exploitation. Energy Fuels 2023, 37, 98–117. [Google Scholar] [CrossRef]
  35. Saporetti, C.M.; Fonseca, D.L.; Oliveira, L.C. Hybrid machine learning models for estimating total organic carbon from mineral constituents in core samples of shale gas fields. Mar. Petrol. Geol. 2022, 143, 105783. [Google Scholar] [CrossRef]
  36. Jiang, Z.; Qi, Q.; Jiang, X.; Meng, J.; Wang, X.J. An efficient rock physics scheme for estimating crack density and fluid saturation of shale gas reservoir. Front. Earth Sci. 2022, 9, 829244. [Google Scholar] [CrossRef]
  37. Zhai, S.; Geng, S.; Li, C.; Gong, Y.; Jing, M.; Li, Y. Prediction of gas production potential based on machine learning in shale gas field: A case study. Energy Sources Part A Recovery Util. Environ. Eff. 2022, 44, 6581–6601. [Google Scholar] [CrossRef]
  38. Tang, L.; Song, Y.; Li, Q.; Pang, X.; Jiang, Z.; Li, Z. A Quantitative Evaluation of Shale Gas Content in Different Occurrence States of the Longmaxi Formation: A New Insight from Well JY-A in the Fuling Shale Gas Field, Sichuan Basin. Acta Geol. Sin.-Engl. Ed. 2019, 93, 400–419. [Google Scholar] [CrossRef]
  39. Li, Y.F.; Sun, W.; Liu, X.W.; Zhang, D.W.; Wang, Y.C.; Liu, Z.Y. Study of the relationship between fractures and highly productive shale gas zones, Longmaxi Formation, Jiaoshiba area in eastern Sichuan. Petrol. Sci. 2018, 15, 498–509. [Google Scholar] [CrossRef]
  40. Li, P.; Jiang, Z.; Zheng, M.; Bi, H.; Chen, L. Estimation of shale gas adsorption capacity of the Longmaxi Formation in the Upper Yangtze Platform, China. J. Nat. Gas Sci. Eng. 2016, 34, 1034–1043. [Google Scholar] [CrossRef]
  41. Zhao, Z.; Su, S.; Shan, X.; Li, X.; Zhang, J.; Jing, C.; Ren, H.; Li, A.; Yang, Q.; Xing, J. Lithofacies identification of shale reservoirs using a tree augmented Bayesian network: A case study of the lower Silurian Longmaxi formation in the changning block, South Sichuan basin, China. Geo Sci. Eng. 2023, 221, 211385. [Google Scholar] [CrossRef]
  42. Chen, Z.Q. Quantitative seismic prediction technique of marine shale TOC and its application: A case from the Longmaxi Shale Play in the Jiaoshiba area, Sichuan Basin. Nat. Gas Ind. 2014, 34, 24–29. [Google Scholar]
  43. Kuyumani, E.M.; Hasan, A.N.; Shongwe, T. A Hybrid Model Based on CNN-LSTM to Detect and Forecast Harmonics: A Case Study of an Eskom Substation in South Africa. Electr. Power Compon. Syst. 2023, 51, 746–760. [Google Scholar] [CrossRef]
  44. Chaudhary, M.; Gastli, M.S.; Nassar, L.; Karray, F. Deep Learning Approaches for Forecasting Strawberry Yields and Prices Using Satellite Images and Station-Based Soil Parameters. arXiv 2021, arXiv:2102.09024. [Google Scholar]
  45. Wu, P.W.; Huang, Z.H.; Pian, Y.P.; Xu, L.X.; Li, J.L.; Chen, K.C. A Combined Deep Learning Method with Attention-Based LSTM Model for Short-Term Traffic Speed Forecasting. J. Adv. Transport. 2020, 2020, 8863724. [Google Scholar] [CrossRef]
  46. Ai, X.; Li, S.; Xu, H. Short-term wind speed forecasting based on two-stage preprocessing method, sparrow search algorithm and long short-term memory neural network. Energy Rep. 2022, 8, 14997–15010. [Google Scholar] [CrossRef]
  47. Gupta, A.; Sharma, M.; Srivastava, A. Intelligent Software Bug Prediction Framework with Parameter-Tuned LSTM with Attention Mechanism Using Adaptive Target-Based Pooling Deep Features. Int. J. Reliab. Qual. Saf. Eng. 2023, 30, 2350005. [Google Scholar] [CrossRef]
  48. Shi, Z. Graph neural networks and attention-based CNN-LSTM for protein classification. arXiv 2022, arXiv:2204.09486. [Google Scholar]
  49. Lin, J.; Ma, J.; Zhu, J.; Cui, Y. Short-term load forecasting based on LSTM networks considering attention mechanism. Int. J. Electr. Power 2022, 137, 107818. [Google Scholar] [CrossRef]
  50. Kota, V.R.; Munisamy, S.D. High accuracy offering attention mechanisms based deep learning approach using CNN/bi-LSTM for sentiment analysis. Int. J. Intell. Comput. 2022, 15, 61–74. [Google Scholar] [CrossRef]
  51. Zhang, D.; Wang, S. A protein succinylation sites prediction method based on the hybrid architecture of LSTM network and CNN. J. Bioinf. Comput. Biol. 2022, 20, 2250003. [Google Scholar] [CrossRef]
  52. Liang, Y.; Lin, Y.; Lu, Q. Forecasting gold price using a novel hybrid model with ICEEMDAN and LSTM-CNN-CBAM. Expert Syst. Appl. 2022, 206, 117847. [Google Scholar] [CrossRef]
  53. Kanwal, A.; Lau, M.F.; Ng, S.P.H.; Sim, K.Y.; Chandrasekaran, S. BiCuDNNLSTM-1dCNN—A hybrid deep learning-based predictive model for stock price prediction. Expert Syst. Appl. 2022, 202, 117123. [Google Scholar] [CrossRef]
Figure 1. Correlation analysis of 10 logging curve data with gas saturation (based on R-squared parameter characterization).
Figure 2. Experimental flow chart of water saturation prediction method in this study.
Figure 3. The memory cell structure of an LSTM. The dashed box encompasses the core section of the LSTM.
Figure 4. Attention structure diagram in this study.
Figure 5. Attention-based CNN-LSTM model for water saturation prediction.
Figure 6. CLAP architecture in this study.
Figure 7. Loss plot for CLAP model with 1436 samples, 500 epochs. (a) Loss values vs. epoch, (b) R2 vs. epoch.
Figure 8. Comparison of gas saturation in Well A predicted using three models (CLAP model, random forest model, Archie model) with gas saturation obtained from experimental testing.
Figure 9. Scatter plot of gas saturation in Well A predicted by three models (CLAP model, random forest model, Archie model) versus gas saturation obtained from experiments. (a) Gas saturation calculated by CLAP model vs. experimental values of gas saturation, (b) gas saturation calculated by Archie model vs. experimental values of gas saturation, (c) gas saturation calculated by random forest model vs. experimental values of gas saturation.
Table 1. Training parameters in this study.
Parameters | Value
Convolution kernel size | 3 × 3
Number of convolutional layers | 3
Pooling operation | Max pooling
Activation functions | ReLU
Table 2. CNN model structural parameters.
Parameters | Value
Number of LSTM layers | 3
Number of hidden units | 256
Gated unit activation function | tanh
Gated unit activation function | Sigmoid
Table 3. Structural parameters of LSTM models.
Parameters | Value
Number of hidden units in fully connected layers | 128
Weight calculation method | 5000
Table 4. ATT parameters.
Parameters | Value
Learning rate | 0.0002
Training batch | 5000
Number of samples | 64
Table 5. Evaluation indicators of model results.
Evaluation Metrics | Numerical Values
R2 score | 0.762
Mean square error | 0.934
Mean absolute error | 0.735
Table 6. Comparison of the prediction results of three models (CLAP model, random forest model and Archie model) in Well A and gas saturation data from the experiments. Sg_Core is the gas saturation measured on core; Sg_ATT, Sg_Archie, and Sg_RF are the gas saturations predicted by the CLAP, Archie, and random forest models, respectively; depth is in m and gas saturations are in %. The table is split into two side-by-side blocks of columns.
Depth | Sg_Core | Sg_ATT | Relative Error | Sg_Archie | Relative Error | Sg_RF | Relative Error || Depth | Sg_Core | Sg_ATT | Relative Error | Sg_Archie | Relative Error | Sg_RF | Relative Error
1259.59 | 57.18 | 55.77 | 1.41 | 42.07 | 15.11 | 45.29 | 11.89 || 1276.66 | 51.60 | 51.20 | 0.40 | 16.34 | 35.26 | 20.97 | 30.63
1261.47 | 50.40 | 47.65 | 2.75 | 33.23 | 17.17 | 54.99 | 4.59 || 1277.72 | 41.23 | 44.50 | 3.27 | 24.46 | 16.77 | 29.88 | 11.35
1262.09 | 57.18 | 57.10 | 0.08 | 45.43 | 11.75 | 45.25 | 11.93 || 1278.50 | 56.31 | 53.63 | 2.68 | 34.43 | 21.88 | 47.85 | 8.46
1263.03 | 2.01 | 1.23 | 0.78 | 4.17 | 6.18 | 14.59 | 12.58 || 1279.51 | 45.49 | 46.61 | 1.12 | 32.80 | 12.69 | 44.60 | 0.89
1263.97 | 50.40 | 50.64 | 0.24 | 35.14 | 15.26 | 39.76 | 10.64 || 1280.04 | 50.41 | 48.99 | 1.42 | 34.82 | 15.59 | 44.22 | 6.19
1265.14 | 31.02 | 32.33 | 1.31 | 9.59 | 21.43 | 18.59 | 12.43 || 1280.51 | 52.25 | 54.02 | 1.77 | 41.17 | 11.08 | 56.56 | 4.31
1265.96 | 54.81 | 53.28 | 1.53 | 38.05 | 16.76 | 22.46 | 32.35 || 1281.52 | 53.56 | 51.04 | 2.52 | 22.34 | 31.22 | 46.72 | 6.84
1266.49 | 52.38 | 52.30 | 0.08 | 25.35 | 27.04 | 46.06 | 6.32 || 1282.01 | 45.49 | 43.54 | 1.95 | 28.90 | 16.59 | 42.02 | 3.47
1267.48 | 47.02 | 46.66 | 0.36 | 32.34 | 14.68 | 35.07 | 11.95 || 1282.52 | 32.99 | 29.20 | 3.79 | 20.30 | 12.69 | 36.91 | 3.92
1267.91 | 50.60 | 49.65 | 0.95 | 36.60 | 14.00 | 41.58 | 9.02 || 1283.01 | 52.25 | 51.95 | 0.30 | 40.65 | 11.60 | 57.38 | 5.13
1268.18 | 28.80 | 27.83 | 0.97 | 11.36 | 17.44 | 20.89 | 7.91 || 1283.52 | 46.72 | 43.41 | 3.31 | 29.32 | 17.40 | 77.62 | 30.90
1268.99 | 52.38 | 49.57 | 2.81 | 37.82 | 14.56 | 54.98 | 2.60 || 1284.50 | 48.13 | 46.18 | 1.95 | 30.78 | 17.35 | 59.94 | 11.81
1269.20 | 50.99 | 49.27 | 1.72 | 30.92 | 20.07 | 46.22 | 4.77 || 1285.02 | 32.99 | 34.32 | 1.33 | 29.15 | 3.84 | 35.15 | 2.16
1270.68 | 28.80 | 28.38 | 0.42 | 22.34 | 6.46 | 17.65 | 11.15 || 1285.56 | 47.58 | 45.66 | 1.92 | 43.92 | 3.66 | 56.70 | 9.12
1270.76 | 19.44 | 16.03 | 3.41 | 6.93 | 12.51 | 15.35 | 4.09 || 1286.68 | 49.78 | 49.90 | 0.12 | 46.39 | 3.39 | 55.62 | 5.84
1271.00 | 49.84 | 48.15 | 1.69 | 23.34 | 26.50 | 62.67 | 12.83 || 1287.00 | 48.13 | 46.01 | 2.12 | 31.16 | 16.97 | 40.16 | 7.97
1272.12 | 49.32 | 46.48 | 2.84 | 34.17 | 15.15 | 24.81 | 24.51 || 1288.31 | 45.28 | 45.34 | 0.06 | 32.76 | 12.52 | 62.57 | 17.29
1273.17 | 58.08 | 58.67 | 0.59 | 32.32 | 25.76 | 49.55 | 8.53 || 1288.60 | 97.43 | 96.26 | 1.17 | 81.52 | 15.91 | 86.28 | 11.15
1273.26 | 19.44 | 18.83 | 0.61 | 6.94 | 12.50 | 12.80 | 6.64 || 1289.18 | 49.78 | 46.79 | 2.99 | 38.18 | 11.60 | 46.62 | 3.16
1274.16 | 51.60 | 52.13 | 0.53 | 38.98 | 12.62 | 26.72 | 24.88 || 1290.58 | 59.86 | 57.88 | 1.98 | 22.19 | 37.67 | 64.61 | 4.75
1275.22 | 41.23 | 41.88 | 0.65 | 22.80 | 18.43 | 12.25 | 28.98 || 1290.81 | 45.28 | 46.69 | 1.41 | 2.01 | 43.27 | 78.47 | 33.19
1275.67 | 58.08 | 57.53 | 0.55 | 55.58 | 2.50 | 27.79 | 30.29 || 1291.10 | 97.43 | 95.46 | 1.97 | 59.90 | 37.53 | 51.44 | 45.99
1276.51 | 40.10 | 39.04 | 1.06 | 18.79 | 21.31 | 17.75 | 22.35 || 1292.00 | 57.79 | 54.82 | 2.97 | 43.37 | 14.42 | 66.60 | 8.81
1293.08 | 59.86 | 55.75 | 4.11 | 45.63 | 14.23 | 63.39 | 3.53 || 1306.59 | 76.55 | 77.47 | 0.92 | 64.16 | 12.39 | 71.34 | 5.21
1293.11 | 49.47 | 47.53 | 1.94 | 32.09 | 17.38 | 71.99 | 22.52 || 1307.04 | 72.69 | 74.18 | 1.49 | 58.52 | 14.17 | 82.18 | 9.49
1294.46 | 51.05 | 53.13 | 2.08 | 32.46 | 18.59 | 61.11 | 10.06 || 1308.02 | 70.43 | 70.03 | 0.40 | 56.35 | 14.08 | 84.11 | 13.68
1295.43 | 18.73 | 17.30 | 1.43 | 3.00 | 15.73 | 29.66 | 10.93 || 1308.62 | 70.27 | 70.15 | 0.12 | 58.03 | 12.24 | 82.88 | 12.61
1295.61 | 49.47 | 49.98 | 0.51 | 33.40 | 16.07 | 41.73 | 7.74 || 1309.09 | 76.55 | 77.52 | 0.97 | 63.55 | 13.00 | 85.34 | 8.79
1295.81 | 55.48 | 57.65 | 2.17 | 54.83 | 0.65 | 44.08 | 11.40 || 1309.64 | 83.80 | 83.21 | 0.59 | 68.34 | 15.46 | 94.49 | 10.69
1296.46 | 53.13 | 56.52 | 3.39 | 48.26 | 4.87 | 43.17 | 9.96 || 1310.09 | 76.52 | 78.51 | 1.99 | 58.10 | 18.42 | 68.67 | 7.85
1296.96 | 51.05 | 53.48 | 2.43 | 45.47 | 5.58 | 39.22 | 11.83 || 1310.74 | 72.38 | 69.23 | 3.15 | 56.59 | 15.79 | 65.38 | 7.00
1297.93 | 18.73 | 16.91 | 1.82 | 11.92 | 6.81 | 13.32 | 5.41 || 1311.12 | 70.27 | 67.45 | 2.82 | 56.99 | 13.28 | 59.38 | 10.89
1298.31 | 55.48 | 55.51 | 0.03 | 48.53 | 6.95 | 44.77 | 10.71 || 1311.61 | 78.29 | 74.86 | 3.43 | 63.69 | 14.60 | 75.39 | 2.90
1298.44 | 44.42 | 42.76 | 1.66 | 39.51 | 4.91 | 47.19 | 2.77 || 1312.38 | 74.92 | 71.88 | 3.04 | 65.84 | 9.08 | 65.65 | 9.27
1298.96 | 53.13 | 54.75 | 1.62 | 38.16 | 14.97 | 75.58 | 22.45 || 1312.51 | 46.69 | 43.70 | 2.99 | 36.60 | 10.09 | 42.23 | 4.46
1299.47 | 51.24 | 50.14 | 1.10 | 33.94 | 17.30 | 58.03 | 6.79 || 1313.24 | 72.38 | 73.61 | 1.23 | 57.03 | 15.35 | 82.49 | 10.11
1300.49 | 41.56 | 40.83 | 0.73 | 27.69 | 13.87 | 53.14 | 11.58 || 1313.56 | 67.73 | 65.52 | 2.21 | 51.48 | 16.25 | 68.24 | 0.51
1300.60 | 52.21 | 51.57 | 0.64 | 35.33 | 16.88 | 43.28 | 8.93 || 1314.11 | 78.29 | 76.62 | 1.67 | 59.28 | 19.01 | 92.39 | 14.10
1301.42 | 48.12 | 47.28 | 0.84 | 33.60 | 14.52 | 39.44 | 8.68 || 1314.88 | 74.92 | 77.61 | 2.69 | 69.46 | 5.46 | 82.39 | 7.47
1301.97 | 51.24 | 48.64 | 2.60 | 33.54 | 17.70 | 41.08 | 10.16 || 1315.01 | 46.69 | 45.70 | 0.99 | 34.65 | 12.04 | 51.94 | 5.25
1302.44 | 54.17 | 52.72 | 1.45 | 53.14 | 1.03 | 47.20 | 6.97 || 1316.06 | 67.73 | 66.73 | 1.00 | 52.85 | 14.88 | 56.88 | 10.85
1303.87 | 65.40 | 63.37 | 2.03 | 59.84 | 5.56 | 60.30 | 5.10 || 1316.45 | 73.84 | 72.31 | 1.53 | 62.12 | 11.72 | 64.07 | 9.77
1303.92 | 48.12 | 45.08 | 3.04 | 23.16 | 24.96 | 49.12 | 1.00 || 1317.64 | 57.13 | 57.08 | 0.05 | 39.35 | 17.78 | 52.03 | 5.10
1304.54 | 72.69 | 72.50 | 0.19 | 47.48 | 25.21 | 75.37 | 2.68 || 1318.14 | 69.39 | 69.03 | 0.36 | 53.78 | 15.61 | 62.17 | 7.22
1305.52 | 70.43 | 68.91 | 1.52 | 43.46 | 26.97 | 85.13 | 14.70 || 1319.34 | 62.69 | 64.41 | 1.72 | 48.08 | 14.61 | 62.74 | 0.05
1306.37 | 65.40 | 67.29 | 1.89 | 63.65 | 1.75 | 58.48 | 6.92 || 1320.14 | 57.13 | 56.07 | 1.06 | 56.24 | 0.89 | 67.47 | 10.34
Table 7. Comparison of the differences between the calculation results from three saturation models and core analysis in 5 wells.
Well Name | Number of Experimental Samples | CLAP R2 | CLAP RMSE | CLAP MAE | Archie R2 | Archie RMSE | Archie MAE | RF R2 | RF RMSE | RF MAE
Well A | 92 | 0.97 | 1.87 | 1.57 | 0.78 | 16.93 | 14.96 | 0.57 | 13.46 | 10.67
Well B | 122 | 0.98 | 1.02 | 1.52 | 0.72 | 12.34 | 17.34 | 0.45 | 17.45 | 8.45
Well C | 132 | 0.96 | 0.89 | 1.44 | 0.76 | 13.26 | 12.34 | 0.67 | 19.22 | 12.66
Well D | 62 | 0.84 | 1.63 | 1.88 | 0.46 | 15.74 | 25.34 | 0.43 | 22.42 | 16.45
Well E | 78 | 0.74 | 1.72 | 2.01 | 0.55 | 14.66 | 27.43 | 0.26 | 27.47 | 13.24
