Dual Path Attention Network (DPANet) for Intelligent Identification of Wenchuan Landslides

Wang, Xiao; Wang, Di; Sun, Tiegang; Dong, Jianhui; Xu, Luting; Li, Weile; Li, Shaoda; Ran, Peilian; Ao, Jinxi; Zou, Yulan; Wang, Jing; Zeng, Xinnian

doi:10.3390/rs15215213


by Xiao Wang ¹, Di Wang ², Tiegang Sun ³, Jianhui Dong ¹, Luting Xu ¹, Weile Li ², Shaoda Li ²,*, Peilian Ran ², Jinxi Ao ¹, Yulan Zou ¹, Jing Wang ¹ and Xinnian Zeng ¹

¹ School of Architecture and Civil Engineering, Chengdu University, Chengdu 610106, China

² College of Earth Sciences, Chengdu University of Technology, Chengdu 610059, China

³ China Building Materials Southwest Survey and Design Co., Ltd., Chengdu 610052, China

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(21), 5213; https://0-doi-org.brum.beds.ac.uk/10.3390/rs15215213

Submission received: 29 August 2023 / Revised: 31 October 2023 / Accepted: 1 November 2023 / Published: 2 November 2023

(This article belongs to the Special Issue Advanced Integration of Remote Sensing Techniques with AI on Geo-Environmental Hazards Detection)


Abstract:

Currently, the application of remote sensing technology in landslide identification and investigation is an important research direction in the field of landslides. To address the errors arising from the inaccurate extraction of texture and location information in landslide intelligent recognition, we developed a new network, the dual path attention network (DPANet), and performed experiments in a typical alpine canyon area (Wenchuan County). The results show that the new network recognizes landslide areas with an overall accuracy (OA) and pixel accuracy (PA) of 0.93 and 0.87, respectively, constituting an overall improvement of 4% and 18% compared to the base pyramid scene parsing network (PSPNet). We applied our knowledge of the landslide image features to other areas in the upper reaches of the Minjiang River to enrich the landslide database for this region. Our evaluation of the results shows that the proposed network framework has good robustness and can accurately identify some complex landslides, providing an excellent contribution to the intelligent recognition of landslides.

Keywords:

landslide identification; alpine canyon area; attention module; PSPNet

1. Introduction

Landslides are a natural phenomenon in which the soil or rock on a slope is affected by factors such as water flow, earthquakes, and human engineering activities [1,2,3,4,5] that completely disrupt the mechanical equilibrium conditions and cause the entire rock/soil mass to slide downward along a certain sliding surface.

Every year, landslides cause a large number of casualties and extensive property damage worldwide, and the trend is worsening. China is one of the countries most severely affected by landslides, both in Asia and across the globe. According to the landslide statistics from 1990 to 2021 in the International Emergency Disaster Database [6], historical landslides in China have caused the most severe economic losses (Figure 1). To reduce the hazards caused by landslides, governmental disaster reduction departments have organized censuses and detailed surveys of landslides at different scales and with different accuracies. These efforts have identified a large number of landslide sites and provided an important basis for landslide mitigation [7,8,9,10,11].

At present, landslide censuses and detailed investigation are mainly conducted through the interpretation of remote sensing data and field investigations [12,13]. The identification of landslides through fieldwork is not only time consuming and laborious but also limited by the vegetation on a slope and the professional ability of the inspectors, and the phenomenon of neglecting details upon inspection is common. Although the identification of landslides via the remote sensing interpretation method has greatly improved the efficiency of this process [14,15,16], the current landslide remote sensing interpretation method basically consists of visual interpretation (i.e., the interpreters carry out a manual visual interpretation process on aerial and satellite remote sensing images with the help of some auxiliary equipment), which is not only inefficient in terms of interpretation but also has a high miss rate [17,18,19]. This is one of the reasons why a large number of unrecorded landslides still exist despite numerous landslide surveys. Therefore, using image-processing technology that has been rapidly developed in recent years to construct an automatic landslide identification model based on remote sensing images in order to improve the efficiency and accuracy of landslide identification [20,21] could provide a more effective method for landslide identification and a basis for landslide mitigation.

The identification of landslides using remote sensing technology has been in a period of vigorous development since the 1970s [22,23]. In the past, aerial photographs were mostly used to determine the specific locations of landslides. Through decades of unremitting efforts, many different research methods have been proposed for the identification of landslide remote sensing information [24,25]. These methods include visual interpretation methods, pixel-based classification methods, object-oriented classification methods, and deep learning methods.

Human–computer interaction technology [26,27,28] can reduce the interpretation errors caused by subjective factors and improve the efficiency of visual interpretation, but it is time consuming and labor-intensive. Pixel-based classification [29,30] has yielded some good results in some areas, but there are inevitably some problems in the process. For example, when using image classification to identify landslides, the image features of land use types such as bare soil and residential sites are similar to those of landslides, rendering misclassification and omission much more likely. It is difficult to achieve high recognition accuracy. In addition, the prerequisite for change detection is the possession of remote sensing images of the different phases in the same area, but the quality of remote sensing images can be affected by conditions such as weather and solar altitude. For most areas, remote sensing images cannot always meet the requirements of change detection. The object-oriented classification method can reduce the inherent noise in pixel-based analysis and facilitate subsequent multiscale analysis; however, repeated trials are usually required to determine the optimal segmentation scale, which makes this method considerably time consuming and inefficient for processing large-scale remote sensing images with complex structures.

Research on landslide identification has undergone tremendous changes in the past decade due to the rapid development of artificial intelligence algorithms [31,32,33,34]. Deep learning methods, especially deep convolutional neural networks, have advantages in landslide identification [35,36] owing to their strong capacity for feature representation as well as learning and differentiation [37,38]. Deep learning methods require a large number of samples to help a computer identify target objects more accurately and efficiently, and a model can be continuously refined as images are added. Researchers have made some attempts to apply deep learning to landslide identification. Mondini et al. (2021) [39] used convolutional neural networks for landslide feature extraction to identify areas where landslides are likely to occur; then, they performed change detection analysis on the texture features of the suspected areas to finally identify the landslide areas accurately. Hong et al. (2017) [40] combined a deep convolutional neural network with an improved region-growing algorithm for intelligent landslide detection to retrieve areas susceptible to landslides in the test images. Joseph et al. (2018) [41] proposed a new method for automatic landslide description in a deep learning framework using multispectral remote sensing images covering northeast India taken by the Resourcesat-2 linear imaging self-scanner four (LISS-IV) satellite sensor as input data. Their method greatly improved the sensitivity of landslide identification by fusing the advanced semantic information in the images and enabled the mapping of smaller-scale landslides. Ghorbanzadeh et al. (2019) [42] used different machine learning models and deep learning models for landslide identification in a strong earthquake zone in Nepal and found that the deep learning models were more accurate than traditional machine learning models in the study area and were able to effectively distinguish between human settlements and landslide hazards after the incorporation of digital elevation models. Meena et al. (2021) [43] used PlanetScope images and convolutional neural networks (CNNs) to combine slope data and spectral information and extract features to map landslide hazards induced by rainfall in the Kodagu district, Karnataka, Western India, in 2018. The results showed that incorporating slope data significantly improved landslide identification accuracy (from 65.5% to 78%).

Most previous studies were based on the use of existing convolutional neural networks [42,44,45] to achieve improved landslide recognition accuracy using the high-level semantic information in the studied images [46]. However, the low-level texture information extracted by a convolutional neural network is often not effectively utilized, which may lead to problems such as failed recognition and false recognition.

In this study, we selected Wenchuan County as the study area and constructed a landslide sample dataset. To address the problems of incomplete access to low-level texture information and high-level semantic information embedded in images and the ineffective use of information in the intelligent extraction of landslide information, we studied the intelligent recognition of landslides within Wenchuan County by designing new modules, adjusting the network structure, reconfiguring the deep learning model, and improving the network’s ability to make use of multi-level features. In addition, the feature knowledge that the trained model obtained from the Wenchuan County landslides was applied to similar regions for cross-regional landslide identification.

The main contributions of this study are as follows:

(1): We constructed a landslide identification database for Wenchuan County.
(2): To address the problems of the confusion of bare ground with landslides and the difficulty in identifying landslides due to dark remote sensing images, we developed the dual path attention network (DPANet), which can effectively solve the above problems and improve the accuracy of landslide identification.
(3): Through transfer learning, we transferred the knowledge of the landslide image features learned by the deep network for Wenchuan County to other areas in the upper reaches of the Minjiang River. In addition, we interpreted the landslides identified by the model in detail and completed field validation to enrich the landslide database for the upper Minjiang River.

2. Materials and Methods

2.1. Study Area

Wenchuan County is located in the upper reaches of the Minjiang River at the northeast edge of the Qinghai–Tibet Plateau and is also the main part of the western Sichuan Plateau. Due to the strong uplift and denudation of the Tibetan Plateau over a long period, the county has large elevation differences and complex topography. The elevation of Wenchuan County ranges from 318 to 6125 m (Figure 2) and is high in the northwest and low in the southeast. Its topography is dominated by high mountain valleys with steep slopes on both sides of the river valleys as well as numerous mountain ranges and interlocking rivers.

Due to the special geological environmental conditions and induced factors such as earthquakes and rainfall, the number of landslides in Wenchuan County is very high. The landslides are mainly distributed along the main stream of the Minjiang River and its main tributaries. The causative factors leading to landslides in Wenchuan County are mainly heavy rainfall and earthquakes and are partially due to excavation at the foot of the slope caused by human engineering activities.

2.2. Production of Sample Database

2.2.1. Production of Landslide Interpretation Database

We selected historical landslide points from 2010 to 2022, including those in Wenchuan County, in the WGS-84 coordinate system as the base study information. We also downloaded all Google Earth imagery of the area from 2010 to the present and combined it with the spatial distribution of landslide points to manually map the landslide boundaries one by one, creating a database of landslide interpretations. The specific research workflow is shown in Figure 3: the detailed interpretation of the landslides in Wenchuan County was completed based on the field landslide points.

Visual interpretation of landslides is conducted by analyzing morphology, color tone, texture, and other factors. The interpretation signs of the landslides in Wenchuan County are summarized in Table 1.

Alongside the aforementioned signs, there were individual landslides among the landslide points that could not be identified from the images. To ensure that the subsequent deep learning models could acquire as many image features of landslides as possible, we chose not to include such landslides in the production of the subsequent interpretation database. Finally, the location distribution and boundary information for 1367 landslides within Wenchuan County were obtained (Figure 4).

2.2.2. Production of a Landslide Sample Database for Intelligent Identification

It is well known that landslide identification methods based on deep learning networks require the use of already-labeled sample data for network training. Because each landslide does not have the same size and dimensions, it is not possible to find a fixed training scale that could satisfy all landslides. Therefore, we consulted a large number of domestic and foreign references related to landslide identification and deep learning [47,48,49]. According to this literature, when it is necessary to traverse the entire study area of an image to produce samples, a fixed size of 512 × 512 pixels is usually chosen for cropping. Therefore, we randomly selected 512 × 512-pixel-size samples with non-landslide features as negative samples in the remote sensing images. The non-landslide samples included water, roads, buildings, and forest land.

An optical image of the entirety of Wenchuan County was cropped into 512 × 512-pixel tiles in row and column order, and the interpretation database was overlaid to create the positive sample images and labels for Wenchuan County. Because the images were cropped sequentially in row order, a landslide was highly likely to be divided into several parts even if it was not large (Table 2).
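The sequential cropping described above can be sketched as follows. This is a minimal illustration rather than the authors' tooling: the function names `iter_tiles` and `tiles_touched` are hypothetical, partial tiles at the image edges are simply dropped (an assumption), and coordinates are plain pixel indices.

```python
from typing import Iterator, Set, Tuple


def iter_tiles(width: int, height: int, tile: int = 512) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (row, col, x_offset, y_offset) for every full tile, in row-major order.

    Assumption: partial tiles at the right and bottom edges are skipped.
    """
    for row in range(height // tile):
        for col in range(width // tile):
            yield row, col, col * tile, row * tile


def tiles_touched(x_min: int, y_min: int, x_max: int, y_max: int, tile: int = 512) -> Set[Tuple[int, int]]:
    """Return the (row, col) tiles overlapped by an inclusive pixel bounding box."""
    return {
        (r, c)
        for r in range(y_min // tile, y_max // tile + 1)
        for c in range(x_min // tile, x_max // tile + 1)
    }


# A landslide only 31 pixels wide is still split across four tiles
# when it happens to straddle a tile corner.
print(sorted(tiles_touched(500, 500, 530, 530)))
```

The second helper shows why even a small landslide can end up in several samples: what matters is where it sits relative to the tile grid, not its size.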

Because the sample data were cropped sequentially in row and column order, neighboring samples in the dataset have relatively similar features. Therefore, before the experiment began, the original sample data were shuffled and randomly recombined, and the sample dataset was then divided into a training set and a testing set. Shuffling the sample dataset reduces the correlation between consecutive samples, thus improving the ability of the network to learn the image features during training.
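A shuffle-then-split step of this kind might look like the sketch below. The split ratio is not stated in the text, so the 0.8 training fraction, the fixed seed, and the function name `shuffle_split` are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def shuffle_split(
    samples: Sequence[str], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle sample identifiers and split them into training and testing lists.

    train_fraction and seed are illustrative values, not taken from the paper.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    cut = int(len(items) * train_fraction)
    return items[:cut], items[cut:]


tile_names = [f"tile_{i:04d}" for i in range(10)]  # hypothetical sample names
train_set, test_set = shuffle_split(tile_names)
print(len(train_set), len(test_set))
```

Fixing the seed keeps the split repeatable across runs, which makes the later comparison between networks easier to reproduce.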

2.3. Methodology

The main design of the experimental study in this paper is shown in Figure 5. First, as described in Section 2.2, a landslide sample database was constructed based on remote sensing images and interpreted data so that it could be used for landslide identification. UNet, DeepLab_V3+, and pyramid scene parsing network (PSPNet) were used as a basis [50,51,52], and the PyTorch framework was used to implement these networks for the training and validation of the sample set. The optimal baseline network was selected, and the problems it revealed were addressed by constructing a new network to improve recognition accuracy. We analyzed, tested, and identified the satellite remote sensing images of landslides and then compared the experimental results with the visual interpretation results of experts.

2.3.1. Dual Path Attention Network (DPANet)

When using remote sensing images for landslide interpretation, the quality of the remote sensing images and the texture information in the images is important to ensure the correct interpretation of landslides. However, in the image acquisition process, the images often appear darker due to factors such as weather and the altitude angle of the sun at the time of acquisition. In deep learning, the attention mechanism can improve a model’s attention to a landslide region via weighting, which, in turn, enhances the learning and expression ability of the model. Therefore, the construction of the DPANet effectively solved the above problems.

We propose a new landslide identification network, DPANet, based on the optimal baseline network. The specific process of the implementation of DPANet is illustrated in Figure 6.

DPANet takes the red–green–blue (RGB) three-channel landslide image as the input of the entire network and outputs a segmented image with semantic information. In the sub-path, we extracted the texture information of the image using two convolution kernels with dimensions of 7 × 7 and then increased the channel dimension of the extracted landslide information using a 1 × 1 convolution. In the main path, ResNet was used as the backbone network for the extraction of high-level semantic information. For the extracted high-level semantic information, the improved pyramid-pooling module and spatial attention module (SAM) were used to extract the global information and location information, and the extracted information was stacked in the channel dimension. Because the landslide feature maps obtained from the main path and the sub-path were of different sizes, to better fuse the two features without losing information, we up-sampled the landslide feature maps obtained from the main path to the same size as the sub-path feature maps and then combined them in the spatial-efficient channel attention module (SECAM) for information fusion, as this process can better enhance the response of the feature map channels to the landslide identification task.

2.3.2. Sub-Paths of DPANet

A detailed analysis of the PSPNet structure shows that PSPNet selects ResNet50 as the backbone network. As the ResNet convolutional layers are stacked, the shallow layers extract low-level semantic information, including texture and shape, whereas the deep layers extract high-level semantic information that is closer to human comprehension. However, PSPNet only applies the spatial pyramid-pooling module to the features from the deep layers, which does not allow for the effective utilization of the low-level semantic information extracted by the shallow layers. Therefore, we propose the texture path coding structure, which is a lightweight convolutional network that fully extracts and utilizes the texture features in the remote sensing images of landslides by constructing three different convolutional layers. The output of the texture path can be expressed according to Equation (1):

T(X) = T_{3}(T_{2}(T_{1}(X)))

(1)

where each of T₁, T₂, and T₃ is a combined function consisting of a convolutional layer, a batch normalization operation, and a rectified linear unit (ReLU) activation function. The T₁ convolutional layer has convolutional kernel dimensions of 7 × 7, a step size of 2, and a padding of 2, expanding the number of channel dimensions from 3 to 64. The T₂ convolutional layer has convolutional kernel dimensions of 7 × 7, a step size of 2, and a padding of 2, expanding the number of channel dimensions from 64 to 128. The T₃ convolutional layer has convolutional kernel dimensions of 1 × 1, a step size of 1, and a padding of 0, expanding the number of channel dimensions from 128 to 256.

The specific structure is shown in Figure 7: the texture path encoding structure is implemented through three convolutional operations.

Texture is a regional concept and must be characterized by spatial consistency. The larger the convolution kernel, the better the same landslide texture features can be detected; the smaller the convolution kernel, the more accurately boundaries can be localized. Therefore, this structure uses a 7 × 7 convolutional kernel in the first two convolutional layers to effectively extract the landslide texture features from the input images, and the third convolutional operation uses a 1 × 1 convolution to increase the channel dimension of the extracted feature maps so that they can subsequently be fused with the landslide features extracted by the backbone network.
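To make the layer parameters of the texture path concrete, the sketch below traces the feature map shape through the three stages using the standard output-size formula for a convolution, floor((size + 2 × padding − kernel) / stride) + 1. The kernel sizes, step sizes, paddings, and channel counts are those stated above; the 512 × 512 input size is taken from the sample database, and the names `conv_out`, `TEXTURE_PATH`, and `texture_path_shapes` are hypothetical.

```python
from typing import List, Tuple


def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


# (in_channels, out_channels, kernel, stride, padding) for T1, T2, T3 as stated in the text.
TEXTURE_PATH = [
    (3, 64, 7, 2, 2),
    (64, 128, 7, 2, 2),
    (128, 256, 1, 1, 0),
]


def texture_path_shapes(size: int = 512) -> List[Tuple[int, int, int]]:
    """Return the (channels, height, width) shape after the input and after each stage."""
    shapes = [(3, size, size)]
    for _in_ch, out_ch, kernel, stride, padding in TEXTURE_PATH:
        size = conv_out(size, kernel, stride, padding)
        shapes.append((out_ch, size, size))
    return shapes


for shape in texture_path_shapes():
    print(shape)
```

Under these stated parameters, a 512 × 512 input leaves the texture path as a 256-channel map of roughly one quarter of the input side length, which is the size that the up-sampled main-path features would need to match before fusion.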

2.3.3. Spatial-Efficient Channel Attention Module (SECAM)

A new attention module focusing on channel and location information was designed to address the problem wherein a network cannot accurately extract the location of landslides due to the quality of the remote sensing images and the tendency of landslides to be confused with other features. A new attention module, SECAM, was developed by concatenating the location attention mechanism [53] and the channel attention mechanism (ECAM) [54]. This module improves the attention of the network to the image space and channel dimensions, aiding the extraction and expression of landslide feature information. The specific structure is illustrated in Figure 8.

For a feature map F with scale dimensions of C × H × W, it is first subjected to global maximum pooling and global average pooling along the channel dimension to generate two feature maps, F1 ∈ 1 × H × W and F2 ∈ 1 × H × W, that describe the spatial information of the landslide. F1 and F2 are of the same scale size and can be directly stitched in the channel dimension to obtain feature map F3 (∈ 2 × H × W). F3 combines the maximum pooling information and the average pooling information of feature map F, so it can also retain the landslide feature information and background information in F. Therefore, using a convolution operation (convolution kernel dimensions of 7 × 7, a step size of 1, and padding equal to 3), it can simultaneously adjust the interdependence of the landslide feature information and background information in F3. The spatial attention map F4 ∈ 1 × H × W of F is generated using the sigmoid activation function; then, the generated F4 is multiplied by F. Finally, the landslide feature map FS with location-relative attention can be obtained.

\begin{aligned} F_{S} &= F \times \sigma(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))) \\ &= F \times \sigma(W_{1}(W_{0}(F_{1})) + W_{1}(W_{0}(F_{2}))) \end{aligned}

(2)
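Read literally, the prose description of the spatial attention step (pooling along the channel dimension, concatenation, a 7 × 7 convolution, a sigmoid, and a final multiplication) corresponds to the expression sketched below. The symbol f^{7×7} for the convolution, the bracket notation for channel-wise concatenation, and the assignment of F1 to the max-pooled map and F2 to the average-pooled map (following the order in which the prose lists them) are notational assumptions, not notation from the original.

```latex
F_{1} = \mathrm{MaxPool}_{c}(F), \quad F_{2} = \mathrm{AvgPool}_{c}(F), \quad F_{3} = [F_{1}; F_{2}] \in 2 \times H \times W
F_{4} = \sigma\left(f^{7 \times 7}(F_{3})\right) \in 1 \times H \times W, \quad F_{S} = F \otimes F_{4}
```

Here the subscript c indicates pooling over the channel axis, so each pooled map keeps the full H × W spatial extent.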

The landslide feature map FS with location-relational attention is combined with the pooled multiscale information and the texture information of the dual path to obtain feature map FS1 with location-relational attention, texture information, and local information. Each channel in FS1 is a response corresponding to a category. We should focus not only on the focal location and texture features in an image but also on which channels are more meaningful for the landslide identification task. Therefore, in the SECAM, we selected the FS1 obtained in the previous step as the input and performed a global average pooling operation along the spatial direction to generate a feature map (F5 ∈ C × 1 × 1) describing the channel information. A one-dimensional convolution (a convolution kernel of 5 and padding of 2) was used for channel modeling, thereby avoiding the requirement for dimensionality reduction and more effectively capturing the information interaction across channels. A sigmoid activation function was used to generate a channel relation attention map (F6 ∈ C × 1 × 1) for FS1; then, the generated F6 was multiplied by FS1 to obtain a landslide feature map (FSC) with location and channel relations. The entire calculation process can be expressed as follows:

F_{SC} = F_{S1} ⊙ σ(\mathrm{Conv1d}(\mathrm{GAP}(F_{S1})))

(3)
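The channel attention step (global average pooling, a one-dimensional convolution of size 5 with padding 2 across channels, a sigmoid, and channel-wise rescaling) can be illustrated in plain Python on nested lists. This is a didactic sketch only: in the real module the kernel weights are learned, whereas the kernel passed in here is a hypothetical fixed value, and the function names are not from the original.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as feature[channel][row][col]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_weights(feature: FeatureMap, kernel: List[float]) -> List[float]:
    """Global average pooling per channel, 1D convolution across channels, then sigmoid."""
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]
    pad = len(kernel) // 2  # kernel of 5 gives padding of 2, as in the text
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(pooled))
    ]
    return [sigmoid(m) for m in mixed]


def apply_channel_attention(feature: FeatureMap, kernel: List[float]) -> FeatureMap:
    """Scale every channel of the feature map by its attention weight."""
    weights = channel_weights(feature, kernel)
    return [[[v * w for v in row] for row in ch] for ch, w in zip(feature, weights)]


# Two channels of size 1 x 1 and a hypothetical all-zero kernel:
# every weight is sigmoid(0) = 0.5, so each value is halved.
print(apply_channel_attention([[[2.0]], [[4.0]]], [0.0, 0.0, 0.0, 0.0, 0.0]))
```

Because the convolution slides over the pooled channel vector itself, the number of channels is preserved and no dimensionality reduction is needed, which is the property the text highlights.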

2.3.4. Accuracy Evaluation Index

To evaluate the experimental results objectively, four evaluation metrics, including pixel accuracy (PA), overall classification accuracy (OA), recall (Recall), and F1 score [55,56], are selected in this paper to measure the performance of the different baseline models in the landslide recognition task to select the most suitable recognition model. PA is the proportion of the number of pixels that are true landslides among all of the pixels that were predicted to be landslides.

PA = \frac{TP}{TP + FP}

(4)

OA is the ratio of the number of pixels correctly predicted for landslides and non-landslides to the total number of predicted pixels.

OA = \frac{TP + TN}{P + N}

(5)

Recall is the proportion of correctly predicted landslide pixels in comparison to the total number of landslide sample pixels.

Recall = \frac{TP}{TP + FN}

(6)

F1 score is calculated by considering the pixel accuracy and recall rate together.

F1\text{-}score = 2 \times \frac{Recall \times PA}{Recall + PA}

(7)

In the above equations, P is the number of landslide pixels; N is the number of non-landslide pixels; TP indicates that the true category of a pixel is landslide and that the predicted category is also landslide; TN indicates that the true category of a pixel is non-landslide and that the predicted category is also non-landslide; FP indicates that the true category of a pixel is non-landslide but that its predicted category is landslide; and FN indicates that the true category of a pixel is landslide but that its predicted category is non-landslide.
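The four metrics follow directly from the pixel counts defined above, with P = TP + FN and N = TN + FP. The short sketch below computes them from the four counts; the function name `landslide_metrics` and the example counts are illustrative, and the sketch does not guard against zero denominators.

```python
from typing import Dict


def landslide_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute PA, OA, Recall, and F1 score from pixel-level confusion counts."""
    pa = tp / (tp + fp)                    # Equation (4)
    oa = (tp + tn) / (tp + fn + tn + fp)   # Equation (5), with P = TP + FN and N = TN + FP
    recall = tp / (tp + fn)                # Equation (6)
    f1 = 2 * recall * pa / (recall + pa)   # Equation (7)
    return {"PA": pa, "OA": oa, "Recall": recall, "F1": f1}


# Hypothetical counts for a 100-pixel example.
print(landslide_metrics(tp=6, tn=90, fp=2, fn=2))
```

Because non-landslide pixels usually dominate a scene, OA can stay high even when PA and Recall are modest, which is why the four metrics are reported together.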

3. Experimental Analysis and Discussion

Here, we focus on the demonstrated superiority of the DPANet over other deep learning networks, and we use the transfer learning theory [15,57] to complete the extrapolation recognition of the surrounding districts and counties.

3.1. Experimental Platform

In order to satisfy the requirements of all the deep learning experiments involved in this study, the GPU chosen was an NVIDIA GeForce RTX 2080Ti (NVIDIA Corporation, Santa Clara, CA, USA) with 12 GB of video memory and 4352 computational units, which was sufficient to run the different convolutional neural networks in the study. The CPU was an Intel Xeon E5-2680 v3 with 12 cores, 24 threads, and a clock speed of 2.5 GHz. In addition to the hardware configuration, the software platform is equally important. All the experiments were run on the Ubuntu 18.04 operating system, which is a public distribution of Linux. To use the GPU as a data-parallel computing device, the corresponding versions of CUDA and cuDNN needed to be configured. We chose the PyTorch framework to implement the proposed networks, and all networks were implemented in Python. The detailed hardware and software platform setups are shown in Table 3.

3.2. Training Details

3.2.1. Optimizer

The network was trained by adjusting the model parameters until the loss value (Loss), which measures the degree of deviation between the predicted value and the true value, was minimized; this is the principle of network optimization for convolutional neural networks. The optimization strategy minimizes the error between the predicted and actual target values by calculating the first-order gradient of the loss with respect to each parameter. We chose stochastic gradient descent (SGD) as the optimizer; it randomly selects a subset of samples when updating the parameters and moves them in the direction opposite to the gradient, with a step set by the learning rate, until convergence occurs.
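The update rule described above can be shown on a toy problem. The sketch below fits a single slope parameter with plain mini-batch SGD on a squared-error loss; the data, learning rate, batch size, and the function name `sgd_fit_slope` are illustrative assumptions and are unrelated to the settings used for the landslide networks.

```python
import random
from typing import Sequence


def sgd_fit_slope(
    xs: Sequence[float],
    ys: Sequence[float],
    lr: float = 0.05,
    batch_size: int = 2,
    steps: int = 200,
    seed: int = 0,
) -> float:
    """Fit y = w * x by mini-batch SGD on the mean squared error."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(steps):
        batch = rng.sample(indices, batch_size)  # randomly selected samples
        # Gradient of mean((w * x - y)^2) over the batch with respect to w.
        grad = sum(2 * xs[i] * (w * xs[i] - ys[i]) for i in batch) / batch_size
        w -= lr * grad  # step against the gradient, scaled by the learning rate
    return w


xs = [0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]  # toy data generated with a true slope of 3
print(round(sgd_fit_slope(xs, ys), 4))
```

Each step uses only a random subset of the samples, so the gradient is a noisy estimate of the full-data gradient; the learning rate determines how far each step moves.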

3.2.2. Loss Function

A landslide training model requires a function for evaluating the difference between the predicted and actual values, which is often called the loss function. In the semantic segmentation task, the lower the loss function value, the smaller the difference between the predicted and actual values, and the better the performance of the model. We chose to use the cross-entropy loss function as the loss function for the training of the landslide model. It can measure the degree of difference between two different probability distributions in the same random variable, and its specific expression is shown in Equation (8)

CE = -\frac{1}{n} \sum_{i=1}^{n} \left( y_{i} \log y_{i}^{\prime} + (1 - y_{i}) \log (1 - y_{i}^{\prime}) \right)

(8)

where y_i is the actual value of the ith image pixel, y′_i is the predicted value of the ith image pixel trained by the model, the value range is [0, 1], and n is the number of pixels.
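As a small numerical check of the cross-entropy definition, the sketch below evaluates it for a handful of pixels. The clipping constant `eps`, which avoids taking the logarithm of zero, is an implementation assumption and is not part of the equation; the function name is likewise illustrative.

```python
import math
from typing import Sequence


def binary_cross_entropy(
    y_true: Sequence[float], y_pred: Sequence[float], eps: float = 1e-12
) -> float:
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0 (assumption, see lead-in)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


labels = [1, 0, 1, 0]  # 1 = landslide pixel, 0 = non-landslide pixel
confident = [0.9, 0.1, 0.8, 0.2]
uncertain = [0.6, 0.4, 0.5, 0.5]
print(binary_cross_entropy(labels, confident) < binary_cross_entropy(labels, uncertain))
```

Predictions that are closer to the true labels give a lower loss, which is the behavior the training procedure relies on.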

3.2.3. Learning Rate

The learning rate controls how quickly the neural network weights are adjusted based on the loss gradient, and it is the most important determinant of the convergence of the loss function. In this case, the ReduceLROnPlateau learning rate scheduler in PyTorch was used. It automatically adjusts the learning rate during training, facilitating the search for a balance between model training speed and training efficiency. Specifically, the learning rate is reduced when the monitored training metric has not improved for N training rounds. Different learning rates were tested in this study, including 1 × 10⁻¹, 1 × 10⁻², 1 × 10⁻³, 1 × 10⁻⁴, and 1 × 10⁻⁵, and the training results confirm that the learning rate of 1 × 10⁻⁴ functioned best. Therefore, we set the initial learning rate to 0.0001 and N to 10; i.e., if the monitored training metric does not improve after 10 training rounds, a new learning rate is introduced into the model for training. The newly set learning rate is the initial learning rate reduced by 20%. The learning rate update process is shown in Figure 9.
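The plateau-based schedule can be mimicked with a few lines of plain Python. The class below is a simplified stand-in written for illustration, not PyTorch's scheduler: it assumes that the monitored quantity is a loss (lower is better), that a reduction is triggered once 10 consecutive rounds bring no improvement, and that repeated reductions compound by a factor of 0.8. The class name `PlateauDecay` is hypothetical.

```python
class PlateauDecay:
    """Reduce the learning rate by a fixed factor when the loss stops improving."""

    def __init__(self, lr: float = 1e-4, patience: int = 10, factor: float = 0.8) -> None:
        self.lr = lr
        self.patience = patience
        self.factor = factor            # 0.8 corresponds to a 20% reduction
        self.best = float("inf")
        self.bad_rounds = 0

    def step(self, loss: float) -> float:
        """Record one training round and return the learning rate to use next."""
        if loss < self.best:
            self.best = loss
            self.bad_rounds = 0
        else:
            self.bad_rounds += 1
            if self.bad_rounds >= self.patience:
                self.lr *= self.factor
                self.bad_rounds = 0
        return self.lr


schedule = PlateauDecay()
rates = [schedule.step(1.0) for _ in range(11)]  # a loss that never improves after round 1
print(rates[9], rates[10])
```

The exact counting rule and the available options of the real PyTorch scheduler may differ from this sketch, so its documentation should be consulted before relying on specific round numbers.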

3.3. Accuracy Evaluation of the Baseline Models

3.3.1. Selection of Backbone Networks

We selected networks with different depths from the residual neural network (ResNet) family, i.e., ResNet18, ResNet34, ResNet50, ResNet101, and ResNet152, embedded each of them as the backbone in the selected benchmark networks, and trained the resulting models. The test set performance is shown in Table 4. Compared with DeepLab_V3+ and PSPNet, the overall test set performance of UNet lacks obvious competitiveness, and the values of all of its evaluation metrics are much lower than those of the other two benchmark networks. Within UNet, ResNet152 performs better than ResNet50 in terms of pixel accuracy and overall classification accuracy. Nevertheless, UNet can connect low-level features to high-level features and is able to retain more of the high-resolution detail information embedded in low-level feature maps. DeepLab_V3+ expands the receptive field by using atrous (dilated) convolutions, so that convolution operations can cover a larger area with the same convolution kernel without losing information. DANet is a dual-attention network in which two types of attention modules attached to a conventional FCN are used to capture global feature dependencies in the spatial and channel dimensions. Adding these two modules can further enhance the feature representation of a model and improve its segmentation performance. The test set accuracy obtained after embedding the different ResNet backbone networks in DeepLab_V3+ and DANet is lower than that of PSPNet. ResNet50 embedded in PSPNet performs excellently on the testing set according to all of the accuracy evaluation indices. The success of PSPNet mainly lies in its ability to better understand the contextual information in a feature map by aggregating the global and local information of an image through different pooling methods. As a result, ResNet50 was ultimately chosen as the backbone network and embedded in UNet, DeepLab_V3+, DANet, and PSPNet.

3.3.2. Selection of the Baseline Networks

In general, a smaller training loss accompanies a higher training accuracy and indicates that the network performs better. As shown in Figure 10, the training accuracies of the four benchmark networks stabilized after 200 epochs. PSPNet converged fastest: its loss dropped to its minimum within the fewest epochs and then remained flat. The training accuracy curves of DeepLab_V3+ and UNet are broadly similar to that of PSPNet, but the training loss of DeepLab_V3+ does not converge as quickly as that of PSPNet. Compared with the other three models, the training loss of DANet fluctuates strongly.

Figure 11 shows the prediction results of the different models for landslide samples in the testing set. The first column shows the original images in the testing set (the images to be predicted), the second column shows the true values in the testing set (the sample set), and the third, fourth, fifth, and sixth columns show the predicted values obtained using UNet, DeepLab_V3+, DANet, and PSPNet, respectively.

Among the four convolutional neural networks, PSPNet performs best on the test set. The recognition behavior of the four models is as follows. UNet uses ResNet as its backbone network and is limited by the size of the convolution kernel: the image region that the network can see is restricted, so UNet can neither obtain the global information in an image nor establish correlations between neighboring pixels. Its results are therefore usually scattered, intermittent patches. DANet uses a position attention module to learn the spatial interdependence of features and a channel attention module to model channel interdependencies. By building rich contextual dependencies on top of local features, it recognizes the test set better than UNet. DeepLab_V3+ and PSPNet obtain multi-scale and global information from deep features through atrous spatial pyramid pooling and pyramid pooling, respectively, and use this multi-scale information to establish correlations between neighboring pixels. PSPNet has the best performance on the testing set, with continuous and accurate recognition results and fewer cases in which exposed, bright features are mistaken for landslide hazards. In addition, among the four convolutional neural networks, the predictions of PSPNet are the closest to the interpreted labels. Based on this experimental comparison, PSPNet was selected as the benchmark model in this study, and the network improvements described in the following sections aim to raise its recognition accuracy.
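The receptive-field argument can be made concrete. A dilated (atrous) convolution with kernel size k and dilation rate d spans k + (k − 1)(d − 1) pixels per side while keeping the same number of weights. The Python sketch below evaluates this for a 3 × 3 kernel. The dilation rates 6, 12, and 18 are assumed here as typical atrous spatial pyramid pooling settings; this paper does not state the rates that were used.

```python
def effective_kernel_size(kernel_size, dilation):
    """Side length (in pixels) spanned by a dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Dilation 1 is an ordinary convolution; the other rates are assumed values.
for rate in (1, 6, 12, 18):
    side = effective_kernel_size(3, rate)
    print(f"dilation {rate:2d}: spans {side} x {side} pixels with 9 weights")
```

A plain 3 × 3 kernel therefore sees only its immediate neighborhood, whereas stacking several dilation rates lets a network sample context at several scales from the same feature map.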

3.4. Ablation Experiments with DPANet

3.4.1. Sub-Paths of DPANet

Some landslides in the testing set differ markedly from the surrounding features, yet PSPNet did not identify them accurately, and some of them were not identified at all. The reason is that the key texture information, which is used to determine the differences between features, is not effectively used after being extracted by the network. We therefore took PSPNet, the best-performing network in the previous experiments, and added the previously proposed Sub-paths to it.

The network with the Sub-paths of DPANet was trained on the samples and evaluated on the testing dataset, and the accuracy results are presented in Table 5. The testing revealed that the accuracy of the landslide recognition model was significantly improved. Compared with the original PSPNet, the landslide recognition accuracy (PA) increased by 7%, the recall rate increased by 9%, the F1 score increased by 8%, and the overall prediction accuracy (OA) increased by 2%. The recognition enhancement effect is shown in Figure 12.

3.4.2. SECAM of DPANet

Due to the influences of weather and the sun altitude angle, some of the data used in this task are too dark, and some of the landslide samples are surrounded by other features that are easily confused with them; thus, PSPNet was unable to effectively extract the landslide feature information from such images to achieve sample prediction.

The network with the SECAM of DPANet was trained on the samples and evaluated on the testing dataset, and the accuracy results are presented in Table 6. The test results show an improvement in the accuracy of landslide recognition. Compared with the original PSPNet, the landslide recognition accuracy (PA) increased by 5%, the F1 score increased by 2%, and the overall prediction accuracy (OA) increased by 1% (from 0.89 to 0.90), while the recall rate was unchanged. The recognition enhancement effect is shown in Figure 13.

3.4.3. DPANet

By introducing the texture path coding and attention mechanism modules into the optimal benchmark network (PSPNet), a new network, DPANet, was obtained. The testing (Table 7) revealed that the accuracy of the landslide recognition model had been significantly improved. Compared with the original PSPNet, the landslide recognition accuracy (PA) increased by 18%, the recall rate increased by 16%, the F1 score increased by 17%, and the overall prediction accuracy (OA) increased by 4%.
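The improvements quoted in the ablation experiments are absolute differences between table rows, i.e., percentage points rather than relative changes. The following Python sketch recomputes them from Tables 5 to 7; the variable and function names are illustrative and not part of the original work.

```python
# (PA, Recall, F1-score, OA) values copied from Tables 5-7.
METRICS = ("PA", "Recall", "F1-score", "OA")
BASELINE = (0.69, 0.69, 0.69, 0.89)  # PSPNet
VARIANTS = {
    "Sub-paths of DPANet": (0.76, 0.78, 0.77, 0.91),
    "SECAM of DPANet": (0.74, 0.69, 0.71, 0.90),
    "DPANet": (0.87, 0.85, 0.86, 0.93),
}


def gains_in_points(variant, baseline=BASELINE):
    """Absolute gain over the baseline for each metric, in percentage points."""
    return tuple(round((v - b) * 100) for v, b in zip(variant, baseline))


for name, scores in VARIANTS.items():
    gains = gains_in_points(scores)
    summary = ", ".join(f"{m} +{g}" for m, g in zip(METRICS, gains))
    print(f"{name}: {summary}")
```

Read this way, the full DPANet gain in each metric exceeds the sum of the gains from the two individual modules, which suggests that the texture sub-path and the attention module complement each other.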

The prediction results of DPANet for the testing set are shown in Figure 14. After the benchmark network was reconstructed, the earlier problems of false recognition, incomplete recognition, and complete failure to recognize landslides were largely resolved. The overall recognition accuracy on the testing set improved substantially, indicating that DPANet makes fuller use of the feature-learning capacity of a deep learning model.

3.5. Validation of DPANet Transfer Effectiveness

3.5.1. Design for Validation

To verify whether our proposed DPANet could effectively identify the landslides in the images and to test its applicability to regions with similar natural conditions, we used Mao County, Li County, Songpan County, and Heishui County, which are adjacent to Wenchuan County, as the testing areas and employed the reconstructed convolutional neural network to identify landslide hazards in these regions. We also performed detailed interpretation (some of the landslide identification results are incomplete) and field validation of the identification results. The process is illustrated in Figure 15.

3.5.2. Results of DPANet Transfer

Network training was completed using the landslide sample database for Wenchuan County. DPANet was obtained by reconfiguring the benchmark PSPNet model with a texture path encoder and the spatial and channel attention mechanisms, and the weight file containing the learned landslide identification features was output. Considering the seasonal differences in the remote sensing images of Mao, Li, Songpan, and Heishui counties in the upper reaches of the Minjiang River, we used high-resolution historical optical images so that the images complemented one another and landslides could be identified as exhaustively as possible. The test area images were cropped into 512 × 512 tiles, and the cropped images were input into the trained network. Running the network completed the automatic landslide identification for the upper reaches of the Minjiang River. The model sporadically marked bright ice/snow, roads, and water bodies, which are easily confused with landslides, as suspected landslides. We defined these as misidentified landslides, and these scattered identifications inflated the final overall number of identifications. Since landslides occupy far fewer pixels in the regional remote sensing images than the other feature types, we considered this result reasonable. The misidentified results were easy to recognize and reject without greatly increasing the workload.
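The cropping step can be sketched as follows. The paper states only that the images were cut with a 512 × 512 box; the non-overlapping stride, the row-by-row order, and the expectation that edge tiles are padded are assumptions of this Python sketch, and `tile_boxes` is a hypothetical helper name.

```python
def tile_boxes(width, height, tile=512):
    """Yield (left, top, right, bottom) boxes covering an image row by row.

    Boxes on the right and bottom edges may extend past the image border;
    the caller is expected to pad those patches (for example with zeros)
    so that every patch fed to the network is tile x tile pixels.
    """
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, left + tile, top + tile)


# Example: a 1300 x 1024 pixel image needs 3 columns and 2 rows of tiles.
boxes = list(tile_boxes(1300, 1024))
print(len(boxes))
print(boxes[0], boxes[-1])
```

Keeping the box coordinates alongside each patch makes it straightforward to mosaic the per-tile predictions back into a regional identification map.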

After manually eliminating the misidentified landslide areas, 1210 suspected landslide areas remained. Indoor detailed interpretation of these areas yielded 825 landslides (a single interpreted landslide may contain multiple model-identified areas), and 792 landslide hazards in Mao, Li, Songpan, and Heishui counties were finally confirmed through field verification. The detailed interpretation results for the target areas identified by deep learning are shown in Figure 16 for selected key areas in the upper reaches of the Minjiang River.
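The counts above give a simple measure of how well the interpreted landslides held up in the field. The Python sketch below computes the share of interpreted landslides that were confirmed. The ratio of 825 to 1210 is deliberately not treated as a precision, because several model-identified areas can belong to one interpreted landslide. The variable names are illustrative.

```python
suspected_areas = 1210   # model-identified areas left after manual screening
interpreted = 825        # landslides after indoor detailed interpretation
field_confirmed = 792    # landslides confirmed through field verification

# Share of interpreted landslides that passed field verification.
field_confirmation_rate = field_confirmed / interpreted
print(f"field confirmation rate: {field_confirmation_rate:.0%}")

# Average number of model-identified areas per interpreted landslide.
areas_per_landslide = suspected_areas / interpreted
print(f"areas per interpreted landslide: {areas_per_landslide:.2f}")
```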

The field validation of the detailed interpretation results is shown in Figures 17–20.

4. Conclusions

We selected Wenchuan County, the area with the most severe landslide development in the typical high mountain valley region of northwestern Sichuan, as the study area and combined landslide survey data with detailed interpretation. Because landslides differ in scale and size, no single fixed training scale suits all of them. In this study, we reviewed the literature, compiled the high-resolution historical images and interpretation results for the entire region, cropped the data for the whole region, and created a sample library for landslide identification using a fixed tile size of 512 × 512 pixels (rows × columns).

We tested and compared different deep learning models on the intelligent landslide recognition task and selected the optimal benchmark network and training strategy. We introduced a texture path coding method and spatial and channel attention mechanisms to reconstruct the benchmark network and obtained our new model, DPANet. The F1 score of DPANet was 17% higher than that of the benchmark PSPNet, with gains of 18% in PA and 4% in OA (Table 7). Through transfer learning, the knowledge of the landslide features acquired by the deep network in Wenchuan County was transferred to other areas in the upper reaches of the Minjiang River, and the identified landslides were interpreted in detail. Field validation was completed, and the landslide data that passed field validation were added to the original interpreted landslide library.

This study’s results show that the reconstructed convolutional neural network proposed in this paper can provide geological disaster experts with target areas for landslide hazard identification and effectively improve the efficiency and accuracy of regional landslide hazard investigations.

Author Contributions

Conceptualization, X.W. and D.W.; methodology, X.W.; software, D.W.; validation, X.W., T.S. and J.D.; formal analysis, L.X., J.A., Y.Z., J.W., X.Z. and Z.L; investigation, X.W.; resources, W.L.; data curation, S.L.; writing—original draft preparation, X.W. and D.W.; writing—review and editing, S.L.; visualization, P.R.; supervision, S.L.; project administration, X.W.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, grant number 2021YFC3000401.

Conflicts of Interest

The authors declare no conflict of interest.

References

Hampton, M.A.; Lee, H.J.; Locat, J. Submarine Landslides. Rev. Geophys. 1996, 34, 33–59. [Google Scholar] [CrossRef]
Fell, R.; Ho, K.K.S.; Lacasse, S.; Leroi, E. A Framework for Landslide Risk Assessment and Management. In Landslide Risk Management; CRC Press: Boca Raton, FL, USA, 2005; pp. 13–36. ISBN 0429151357. [Google Scholar]
Xu, C.; Xu, X.; Yu, G. Landslides Triggered by Slipping-Fault-Generated Earthquake on a Plateau: An Example of the 14 April 2010, Ms 7.1, Yushu, China Earthquake. Landslides 2013, 10, 421–431. [Google Scholar] [CrossRef]
Ibrahim, M.B.; Salisu, S.A.; Musa, A.A.; Abussalam, B.; Hamza, S.M. Framework for the Identification of Shallow Ground Movement in Modified Slopes (an Expert Opinion). In Proceedings of the IOP Conference Series: Earth and Environmental Science, Depok, Indonesia, 27–28 August 2022; IOP Publishing: Bristol, UK, 2022; Volume 1064, p. 12055. [Google Scholar]
Fang, K.; Zhang, J.; Tang, H.; Hu, X.; Yuan, H.; Wang, X.; An, P.; Ding, B. A Quick and Low-Cost Smartphone Photogrammetry Method for Obtaining 3D Particle Size and Shape. Eng. Geol. 2023, 322, 107170. [Google Scholar] [CrossRef]
Cuthbertson, J.; Archer, F.; Robertson, A.; Rodriguez-Llanes, J.M. Improving Disaster Data Systems to Inform Disaster Risk Reduction and Resilience Building in Australia: A Comparison of Databases. Prehosp. Disaster Med. 2021, 36, 511–518. [Google Scholar] [CrossRef] [PubMed]
Hervás, J.; Bobrowsky, P. Mapping: Inventories, Susceptibility, Hazard and Risk. Landslides–Disaster Risk Reduct. 2009, 321–349. [Google Scholar]
Ahmed, B.; Rahman, M.S.; Sammonds, P.; Islam, R.; Uddin, K. Application of Geospatial Technologies in Developing a Dynamic Landslide Early Warning System in a Humanitarian Context: The Rohingya Refugee Crisis in Cox’s Bazar, Bangladesh. Geomat. Nat. Hazards Risk 2020, 11, 446–468. [Google Scholar] [CrossRef]
Garcia-Delgado, H.; Petley, D.N.; Bermúdez, M.A.; Sepúlveda, S.A. Fatal Landslides in Colombia (from Historical Times to 2020) and Their Socio-Economic Impacts. Landslides 2022, 19, 1689–1716. [Google Scholar] [CrossRef]
Fang, K.; Tang, H.; Li, C.; Su, X.; An, P.; Sun, S. Centrifuge Modelling of Landslides and Landslide Hazard Mitigation: A Review. Geosci. Front. 2022, 14, 101493. [Google Scholar] [CrossRef]
Wang, C.; Wang, H.; Qin, W.; Wei, S.; Tian, H.; Fang, K. Behaviour of Pile-Anchor Reinforced Landslides under Varying Water Level, Rainfall, and Thrust Load: Insight from Physical Modelling. Eng. Geol. 2023, 325, 107293. [Google Scholar] [CrossRef]
Huabin, W.; Gangjun, L.; Weiya, X.; Gonghui, W. GIS-Based Landslide Hazard Assessment: An Overview. Prog. Phys. Geogr. Earth Environ. 2005, 29, 548–567. [Google Scholar] [CrossRef]
Liu, J.; Wu, Y.; Gao, X.; Zhang, X. A Simple Method of Mapping Landslides Runout Zones Considering Kinematic Uncertainties. Remote Sens. 2022, 14, 668. [Google Scholar] [CrossRef]
Martha, T.R.; Kerle, N.; Van Westen, C.J.; Jetten, V.; Kumar, K.V. Segment Optimization and Data-Driven Thresholding for Knowledge-Based Landslide Detection by Object-Based Image Analysis. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4928–4943. [Google Scholar] [CrossRef]
Shaha, M.; Pawar, M. Transfer Learning for Image Classification. In Proceedings of the 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 29–31 March 2018; IEEE: Piscataway, MJ, USA, 2018; pp. 656–660. [Google Scholar]
Wang, X.; Fan, X.; Xu, Q.; Du, P. Change Detection-Based Co-Seismic Landslide Mapping through Extended Morphological Profiles and Ensemble Strategy. ISPRS J. Photogramm. Remote Sens. 2022, 187, 225–239. [Google Scholar] [CrossRef]
Castilla, G.; Hay, G.J.; Ruiz-Gallardo, J.R. Size-Constrained Region Merging (SCRM): An Automated Delineation Tool for Assisted Photointerpretation. Photogramm. Eng. Remote Sens. 2008, 74, 409. [Google Scholar] [CrossRef]
Huang, Y.; Chen, Z.; Tao, Y.U.; Huang, X.; Gu, X. Agricultural Remote Sensing Big Data: Management and Applications. J. Integr. Agric. 2018, 17, 1915–1931. [Google Scholar] [CrossRef]
Wasowski, J.; Bovenga, F. Remote Sensing of Landslide Motion with Emphasis on Satellite Multi-Temporal Interferometry Applications: An Overview. Landslide Hazards Risks Disasters 2022, 365–438. [Google Scholar] [CrossRef]
Casagli, N.; Frodella, W.; Morelli, S.; Tofani, V.; Ciampalini, A.; Intrieri, E.; Raspini, F.; Rossi, G.; Tanteri, L.; Lu, P. Spaceborne, UAV and Ground-Based Remote Sensing Techniques for Landslide Mapping, Monitoring and Early Warning. Geoenviron. Disasters 2017, 4, 9. [Google Scholar] [CrossRef]
Mohan, A.; Singh, A.K.; Kumar, B.; Dwivedi, R. Review on Remote Sensing Methods for Landslide Detection Using Machine and Deep Learning. Trans. Emerg. Telecommun. Technol. 2021, 32, e3998. [Google Scholar] [CrossRef]
Patino, J.E.; Duque, J.C. A Review of Regional Science Applications of Satellite Remote Sensing in Urban Settings. Comput. Environ. Urban Syst. 2013, 37, 1–17. [Google Scholar] [CrossRef]
Taylor, L.S.; Quincey, D.J.; Smith, M.W.; Baumhoer, C.A.; McMillan, M.; Mansell, D.T. Remote Sensing of the Mountain Cryosphere: Current Capabilities and Future Opportunities for Research. Prog. Phys. Geogr. Earth Environ. 2021, 45, 931–964. [Google Scholar] [CrossRef]
Hölbling, D.; Friedl, B.; Eisank, C. An Object-Based Approach for Semi-Automated Landslide Change Detection and Attribution of Changes to Landslide Classes in Northern Taiwan. Earth Sci. Inform. 2015, 8, 327–335. [Google Scholar] [CrossRef]
Qi, J.; Chen, H.; Chen, F. Extraction of Landslide Features in UAV Remote Sensing Images Based on Machine Vision and Image Enhancement Technology. Neural Comput. Appl. 2021, 34, 1–15. [Google Scholar] [CrossRef]
Gong, J.; Wang, D.; Li, Y.; Zhang, L.; Yue, Y.; Zhou, J.; Song, Y. Earthquake-Induced Geological Hazards Detection under Hierarchical Stripping Classification Framework in the Beichuan Area. Landslides 2010, 7, 181–189. [Google Scholar] [CrossRef]
Fang, H.; Tong, B.; Du, X.; Li, Y.; Yang, X. Semi-Automatic Terrain Slope Unit Division Method Based on Human–Computer Interaction. In Proceedings of the IOP Conference Series: Earth and Environmental Science, Vienna, Austria, 18–21 May 2020; IOP Publishing: Bristol, UK, 2020; Volume 570, p. 42016. [Google Scholar]
Zheng, X.; He, G.; Wang, S.; Wang, Y.; Wang, G.; Yang, Z.; Yu, J.; Wang, N. Comparison of Machine Learning Methods for Potential Active Landslide Hazards Identification with Multi-Source Data. ISPRS Int. J. Geo-Inf. 2021, 10, 253. [Google Scholar] [CrossRef]
Moosavi, V.; Talebi, A.; Shirmohammadi, B. Producing a Landslide Inventory Map Using Pixel-Based and Object-Oriented Approaches Optimized by Taguchi Method. Geomorphology 2014, 204, 646–656. [Google Scholar] [CrossRef]
Keyport, R.N.; Oommen, T.; Martha, T.R.; Sajinkumar, K.S.; Gierke, J.S. A Comparative Analysis of Pixel- and Object-Based Detection of Landslides from Very High-Resolution Images. Int. J. Appl. Earth Obs. Geoinf. 2018, 64, 1–11. [Google Scholar] [CrossRef]
Dikshit, A.; Pradhan, B.; Alamri, A.M. Pathways and Challenges of the Application of Artificial Intelligence to Geohazards Modelling. Gondwana Res. 2021, 100, 290–301. [Google Scholar] [CrossRef]
Guha, S.; Jana, R.K.; Sanyal, M.K. Artificial Neural Network Approaches for Disaster Management: A Literature Review (2010–2021). Int. J. Disaster Risk Reduct. 2022, 81, 103276. [Google Scholar] [CrossRef]
Li, H.; He, Y.; Xu, Q.; Deng, J.; Li, W.; Wei, Y. Detection and Segmentation of Loess Landslides via Satellite Images: A Two-Phase Framework. Landslides 2022, 19, 673–686. [Google Scholar] [CrossRef]
Li, H.; He, Y.; Xu, Q.; Deng, J.; Li, W.; Wei, Y.; Zhou, J. Sematic Segmentation of Loess Landslides with STAPLE Mask and Fully Connected Conditional Random Field. Landslides 2023, 20, 367–380. [Google Scholar] [CrossRef]
Hacıefendioğlu, K.; Demir, G.; Başağa, H.B. Landslide Detection Using Visualization Techniques for Deep Convolutional Neural Network Models. Nat. Hazards 2021, 109, 329–350. [Google Scholar] [CrossRef]
Su, Z.; Chow, J.K.; Tan, P.S.; Wu, J.; Ho, Y.K.; Wang, Y.-H. Deep Convolutional Neural Network–Based Pixel-Wise Landslide Inventory Mapping. Landslides 2021, 18, 1421–1443. [Google Scholar] [CrossRef]
Höffler, T.N. Spatial Ability: Its Influence on Learning with Visualizations—A Meta-Analytic Review. Educ. Psychol. Rev. 2010, 22, 245–269. [Google Scholar] [CrossRef]
Lu, M.Y.; Williamson, D.F.K.; Chen, T.Y.; Chen, R.J.; Barbieri, M.; Mahmood, F. Data-Efficient and Weakly Supervised Computational Pathology on Whole-Slide Images. Nat. Biomed. Eng. 2021, 5, 555–570. [Google Scholar] [CrossRef]
Mondini, A.C.; Guzzetti, F.; Chang, K.-T.; Monserrat, O.; Martha, T.R.; Manconi, A. Landslide Failures Detection and Mapping Using Synthetic Aperture Radar: Past, Present and Future. Earth-Sci. Rev. 2021, 216, 103574. [Google Scholar] [CrossRef]
Hong, H.; Chen, W.; Xu, C.; Youssef, A.M.; Pradhan, B.; Tien Bui, D. Rainfall-Induced Landslide Susceptibility Assessment at the Chongren Area (China) Using Frequency Ratio, Certainty Factor, and Index of Entropy. Geocarto Int. 2017, 32, 139–154. [Google Scholar] [CrossRef]
Joseph, J.; Mishra, D.; Martha, T.R.; Nidamanuri, R.R. A Deep Learning Framework for Automatic Landslide Inventory Mapping (DLF-ALM). In Proceedings of the 38th INCA International Congress on Emerging Technologies in Cartography, Hyderabad, India, 29 October 2018. [Google Scholar]
Ghorbanzadeh, O.; Blaschke, T.; Gholamnia, K.; Meena, S.R.; Tiede, D.; Aryal, J. Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. Remote Sens. 2019, 11, 196. [Google Scholar] [CrossRef]
Meena, S.R.; Ghorbanzadeh, O.; van Westen, C.J.; Nachappa, T.G.; Blaschke, T.; Singh, R.P.; Sarkar, R. Rapid Mapping of Landslides in the Western Ghats (India) Triggered by 2018 Extreme Monsoon Rainfall Using a Deep Learning Approach. Landslides 2021, 18, 1937–1950. [Google Scholar] [CrossRef]
Ju, Y.; Xu, Q.; Jin, S.; Li, W.; DONG, X.; GUO, Q. Automatic Object Detection of Loess Landslide Based on Deep Learning. Geomat. Inf. Sci. Wuhan Univ. 2020, 45, 1747–1755. [Google Scholar]
Ju, Y.; Xu, Q.; Jin, S.; Li, W.; Su, Y.; Dong, X.; Guo, Q. Loess Landslide Detection Using Object Detection Algorithms in Northwest China. Remote Sens. 2022, 14, 1182. [Google Scholar] [CrossRef]
Liu, P.; Wei, Y.; Wang, Q.; Xie, J.; Chen, Y.; Li, Z.; Zhou, H. A Research on Landslides Automatic Extraction Model Based on the Improved Mask R-CNN. ISPRS Int. J. Geo-Inf. 2021, 10, 168. [Google Scholar] [CrossRef]
Tao, C.; Qi, J.; Li, Y.; Wang, H.; Li, H. Spatial Information Inference Net: Road Extraction Using Road-Specific Contextual Information. ISPRS J. Photogramm. Remote Sens. 2019, 158, 155–166. [Google Scholar] [CrossRef]
Zhou, M.; Sui, H.; Chen, S.; Wang, J.; Chen, X. BT-RoadNet: A Boundary and Topologically-Aware Neural Network for Road Extraction from High-Resolution Remote Sensing Imagery. ISPRS J. Photogramm. Remote Sens. 2020, 168, 288–306. [Google Scholar] [CrossRef]
Li, R.; Zheng, S.; Zhang, C.; Duan, C.; Wang, L.; Atkinson, P.M. ABCNet: Attentive Bilateral Contextual Network for Efficient Semantic Segmentation of Fine-Resolution Remotely Sensed Imagery. ISPRS J. Photogramm. Remote Sens. 2021, 181, 84–98. [Google Scholar] [CrossRef]
Yu, B.; Chen, F.; Xu, C. Landslide Detection Based on Contour-Based Deep Learning Framework in Case of National Scale of Nepal in 2015. Comput. Geosci. 2020, 135, 104388. [Google Scholar] [CrossRef]
Akcay, O.; Kinaci, A.C.; Avsar, E.O.; Aydar, U. Semantic Segmentation of High-Resolution Airborne Images with Dual-Stream DeepLabV3+. ISPRS Int. J. Geo-Inf. 2022, 11, 23. [Google Scholar] [CrossRef]
Schönfeldt, E.; Winocur, D.; Pánek, T.; Korup, O. Deep Learning Reveals One of Earth’s Largest Landslide Terrain in Patagonia. Earth Planet. Sci. Lett. 2022, 593, 117642. [Google Scholar] [CrossRef]
Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542. [Google Scholar]
Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. Cbam: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
Kalantar, B.; Ueda, N.; Al-Najjar, H.A.H.; Halin, A.A. Assessment of Convolutional Neural Network Architectures for Earthquake-Induced Building Damage Detection Based on Pre-and Post-Event Orthophoto Images. Remote Sens. 2020, 12, 3529. [Google Scholar] [CrossRef]
Ouma, Y.; Nkwae, B.; Moalafhi, D.; Odirile, P.; Parida, B.; Anderson, G.; Qi, J. Comparison of Machine Learning Classifiers for Multitemporal and Multisensor Mapping of Urban Lulc Features. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2022, XLIII-B3-2022, 681–689. [Google Scholar] [CrossRef]
Caroppo, A.; Leone, A.; Siciliano, P. Deep Transfer Learning Approaches for Bleeding Detection in Endoscopy Images. Comput. Med. Imaging Graph. 2021, 88, 101852. [Google Scholar] [CrossRef]

Figure 1. Total estimated damages (in USD) caused by landslide disasters from 1900 to 2023.

Figure 2. Map of the study area: (a) digital elevation model; (b) administrative boundaries.

Figure 3. Production process of landslide interpretation database.

Figure 4. Results of landslide interpretation.

Figure 5. Design of experimental study.

Figure 6. DPANet structure for landslide identification.

Figure 7. Texture path encoding structure.

Figure 8. Spatial-Efficient Channel Attention module (SECAM).

Figure 9. Learning rate update process.

Figure 10. Training curves for the different models: (a) training accuracy; (b) training loss.

Figure 11. Partial test set recognition results in different baseline networks (ResNet50): the red part is the landslide identification area, and the black part is the non-landslide identification area.

Figure 12. Partial test set recognition results comparison between PSPNet and Sub-paths of DPANet: the red part is the landslide identification area, and the black part is the non-landslide identification area.

Figure 13. Partial test set recognition results comparison between PSPNet and SECAM of DPANet: the red part is the landslide identification area, and the black part is the non-landslide identification area.

Figure 14. Comparison of recognition results for partial test sets of PSPNet and DPANet: the red part is the landslide identification area, and the black part is the non-landslide identification area.

Figure 15. Validation process for DPANet landslide identification using transfer learning.

Figure 16. Detailed interpretation results for the target areas identified by DPANet in selected key areas in the upper reaches of the Minjiang River.

Figure 17. Mao County landslide: (a) detailed interpretation; (b) field verification.

Figure 18. Li County landslide: (a) detailed interpretation; (b) field verification.

Figure 19. Heishui County landslide: (a) detailed interpretation; (b) field verification.

Figure 20. Songpan County landslide: (a) detailed interpretation; (b) field verification.

Table 1. Landslide interpretation signs of Wenchuan County.

Category	Interpretation Criterion	Major Features
Direct interpretation of signs	Shape	Typical landslide bodies are usually dustpan-, strap-, or ellipse-shaped in imaging data; most earthquake-caused landslides feature a strip- or spoon-shape.
	Color tone	Landslide locations are toned differently with respect to the surrounding stable landforms.
	Texture	Fragmented terrain and uneven surfaces cause spectra to reflect differently on each part of a landslide slope, leading to roughly textured images.
Indirect interpretation of signs	Vegetation	Disturbed vegetation on a slope’s surface is low in coverage, distributed in irregular and scattered manners as sheets, points, or clusters. Exposed rocky soil always covers a large portion of the surface.
	Hydrology	Landslides formed on riverbanks will cause arc-shaped protrusions or variations in the river at a given location; squeezes from the slope to the channel will create abnormalities of water flow; the river will also erode the landslide slope at the front edge.
	Slope terrain	Indications of landslides include ridges, terraces, and hills in a canyon that are staggered or interrupted; a series of terraces that have changed or been buried under gentle hillsides; grooves or diversions in hillside ravines; ravine captured; obviously narrower or shallower cross-sections; etc.

Table 2. Production of positive samples of a partial landslide.

Landslide Boundaries	Cropped Images (512 × 512 Pixels)	Cropped Labels (512 × 512 Pixels)

①–④ are the numbers of the final positive samples obtained by row and column cropping.

Table 3. Hardware and software details.

Hardware and Software	Parameters
CPU	Intel Xeon E5-2680 v3
GPU	NVIDIA GeForce RTX 2080Ti
Operating memory	256 GB
Total video memory	60 GB
Operating system	Ubuntu 18.04
Python	Python 3.6
IDE	PyCharm 2020.1 (Professional Edition)
CUDA	CUDA 10.0
CUDNN	CUDNN 7.6.5
Deep learning architecture	PyTorch 1.2.0

Table 4. Accuracy comparison of different ResNet-based convolutional neural network test sets.

Baseline Model	Backbone Network	PA	Recall	F1-Score	OA
U-Net	ResNet18	0.54	0.41	0.47	0.83
	ResNet34	0.53	0.40	0.46	0.83
	ResNet50	0.52	0.40	0.45	0.83
	ResNet101	0.48	0.49	0.48	0.83
	ResNet152	0.55	0.35	0.43	0.84
DeepLab_V3+	ResNet18	0.55	0.38	0.45	0.84
	ResNet34	0.49	0.48	0.49	0.83
	ResNet50	0.69	0.68	0.68	0.87
	ResNet101	0.65	0.57	0.61	0.86
	ResNet152	0.67	0.49	0.57	0.86
PSPNet	ResNet18	0.54	0.37	0.44	0.84
	ResNet34	0.52	0.47	0.49	0.84
	ResNet50	0.69	0.69	0.69	0.89
	ResNet101	0.67	0.68	0.67	0.88
	ResNet152	0.67	0.67	0.67	0.88
DANet	ResNet18	0.67	0.63	0.65	0.87
	ResNet34	0.64	0.63	0.64	0.87
	ResNet50	0.67	0.64	0.65	0.88
	ResNet101	0.65	0.64	0.64	0.88
	ResNet152	0.66	0.66	0.66	0.88

Table 5. Comparison of results of sub-path ablation experiments.

Models	PA	Recall	F1-Score	OA
PSPNet	0.69	0.69	0.69	0.89
Sub-paths of DPANet	0.76	0.78	0.77	0.91

Table 6. Comparison of results of SECAM ablation experiments.

Models	PA	Recall	F1-Score	OA
PSPNet	0.69	0.69	0.69	0.89
SECAM of DPANet	0.74	0.69	0.71	0.90

Table 7. Comparison of results of DPANet ablation experiments.

Models	PA	Recall	F1-Score	OA
PSPNet	0.69	0.69	0.69	0.89
DPANet	0.87	0.85	0.86	0.93

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
