Article

Using Machine Learning to Assess Site Suitability for Afforestation with Particular Species

1 School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China
2 Beijing Research Center for Information Technology in Agriculture, Beijing 100097, China
3 China Agricultural University, International College Beijing, Beijing 100083, China
* Author to whom correspondence should be addressed.
Submission received: 5 June 2019 / Revised: 7 August 2019 / Accepted: 25 August 2019 / Published: 27 August 2019

Abstract

Judging and predicting tree suitability is of great significance in the cultivation and management of forests. Background and Objectives: Because of the diversity of afforestation tree species in China and the shortage of experts or the limits of expert knowledge, the site rules of tree species in some regions are lacking or incomplete, so the small number of empirical tree suitability site rules can hardly meet the diverse needs of an afforestation expert system. Research Highlights: This paper explores an intelligent method to automatically extract rules for selecting favorable site conditions (tree suitability site rules) from a large amount of data, in order to solve the problem of acquiring, updating and maintaining suitable forest site rules in the expert system. Materials and Methods: Based on the method of site quality evaluation and the theory of the decision tree in knowledge discovery and machine learning, the dominant species Chinese fir and Masson pine in the forest resources subcompartment data (FRSD) of Jinping County, Guizhou Province were taken as examples to select the important site factors affecting forest quality. Based on the assessment of site quality by potential productivity, a method was proposed to determine the tree suitability of a stand site by nonlinear quantile regression, and decision trees were constructed with the ID3, C5.0 and CART algorithms. Results: The best-performing CART algorithm was selected to construct the model, and the extractor of the afforestation rules was built. After the rules for selecting favorable site conditions for Chinese fir and Masson pine were validated, the production rule representation method was used to construct the relational model of the knowledge base. Conclusions: Intelligent extraction of tree suitability rules for afforestation design in an expert system was realized, providing a theoretical basis and technical support for afforestation land planning and design.

1. Introduction

Tree suitability involves adapting the afforestation characteristics of a tree species to the site conditions to give full play to the productive potential of forests and achieve a higher level of productivity of afforestation tree species under the current technical and economic conditions of the site. Tree suitability is the embodiment of the principle of suitability to local conditions. To improve the survival rate of afforestation and the growth of trees, it is necessary to select suitable tree species scientifically and reasonably, which is the key part to promote the afforestation quality [1,2].
Basically, three aspects determine the growth of a tree species: climatic water, energy (i.e., either solar radiation or temperature) and soil nutrient availability. These aspects are in turn described by topography factors (e.g., slope and altitude), soil factors (e.g., soil type and soil layer thickness), meteorological factors (e.g., precipitation, radiation and temperature) and biological factors (e.g., plant biodiversity, composition and stand factors) [3,4,5]. For afforestation, site conditions, such as topography factors, soil factors and stand factors, may be important factors that determine the growth of a tree species. Tree suitability judgment and prediction are highly important links in forest production and management. Currently, there are three primary kinds of quantitative criteria for evaluating afforestation in the relevant literature: site index (SI), average volume growth and site rules. Site index is normally used to measure the site quality under various site conditions to better reflect the relationship between site performance and tree species growth [6,7,8]. The dominant height model [9,10,11,12,13,14,15] and the site index curve [14,16,17,18,19,20,21] are the two main forms of site index. The disadvantage of the site index is that it is difficult to directly explain the productivity level of a site (i.e., tree suitability performance) through the SI value. Because the planting density and the relationship between tree height and diameter at breast height (DBH, measured at 1.3 m) depend on the tree species, the relationship between SI value and yield also differs among tree species [1,2]. Average volume growth is another index of tree suitability performance, which measures tree suitability by the average volume growth of a tree species when it reaches maturity.
Because the average volume growth is affected not only by the site conditions but also by the stand density and management level [1,2,22], complex conditions (different regions, site conditions, tree species and management measures) must be considered when using it as an evaluation index. Because of this complexity, it is not practical to use the average volume growth as a measure of tree suitability [23]. Site rules can also be used to judge tree suitability performance in the expert system of forest cultivation, and several studies on matching afforestation tree species with site rules can be found in the literature. Schröder [24] developed the Multipurpose Tree and Shrub Database, which contains first-hand, site-specific information about multipurpose tree species. This information can provide decision support when candidate species for specific sites or end-uses are required. Hu Bo [25] used accumulated empirical site rules to establish a forestry knowledge base by tree structure representation and constructed a reasoning afforestation expert system. Ding Quanlong [26] used the tree structure method to build the site rules knowledge base and used the fact base to control the repository, which enhanced the universality of usage and reduced the regional restrictions of the expert system. Wu Baoguo [27,28] implemented a web-based afforestation decision-making consultation system. The knowledge base of afforestation site rules was represented by the production rule method, and the forest site factors provided by users were inferred and analyzed by imitating afforestation experts to select suitable afforestation tree species.
Ma Chi [29,30] expressed afforestation site rules knowledge through the combination of the production rule method and the frame method, realized the separation of the inference machine and the knowledge base and improved the practicability of the afforestation expert system. Helton Nonato de Souza [31] selected suitable trees in agroforestry systems based on market accessibility and environmental needs (e.g., soil fertility). Han Yanyun [32] used the credibility of the rules to deal with the uncertainty of forest cultivation knowledge and realized the inference engine algorithm of uncertainty reasoning in the forest cultivation expert system. Prabakaran [33] proposed a fuzzy system structure that integrates expert knowledge. Vásquez Ruben Purroy [34] proposed a fuzzy multicriteria decision support system founded on logic-based decision rules. With the wide application of forestry-service-oriented expert systems for afforestation consultation, site rules have proved to be a more intuitive, simple and practical quantitative criterion for evaluating afforestation.
However, one practical problem and several theoretical problems have arisen after almost a decade of practical experience with the site rules of the forest cultivation expert system. The practical problem is the absence and incompleteness of site rules. The current site rules, commonly referred to as empirical site rules, are summarized by experienced experts through long-term afforestation practice. Because of the diversity of afforestation tree species in China and the shortage of experts or the limits of expert knowledge, the site rules of tree species in some regions are lacking or incomplete, so the small number of empirical tree suitability site rules can hardly meet the diverse needs. Two of the theoretical problems are as follows: (i) The empirical site rules of a given tree species depend on the judgment of experts, which is highly subjective, so the tree suitability evaluation lacks scientific accuracy. (ii) Since the site conditions are changing and the rules of the expert system are relatively fixed, the expert system faces great difficulty in maintaining the knowledge of the site rules.
In order to remedy these practical and theoretical problems, it is necessary to explore a method of automatically extracting appropriate site rules. The machine learning algorithm features excellent self-organization, self-learning and self-adaptability, and can acquire the implicit knowledge from the data through massive data learning. The long-term-accumulated survey data of forest resources and the statistical data contain a large amount of explicit information (e.g., dominant tree species, landform, gradient, slope direction, slope position, soil type and soil thickness) and, more importantly, the relationships and rules among them [2]. By mining the hidden knowledge behind the forest resource data, the bottleneck problem of obtaining afforestation site rules has been overcome to some extent. The decision tree algorithm in the machine learning algorithm has been widely used for its advantages of easily extracting rules [35,36], displaying important decision attributes [37] and high classification accuracy [38,39,40]. Currently, the decision tree algorithm has rarely been used to extract afforestation suitability rules from a large amount of forest resource survey data. Compared with the site rules obtained from the experience summary, this algorithm has some advantages. First, the algorithm extracts rules from the overall data. The rules are relatively comprehensive which, to some extent, resolves the bottleneck problem of acquiring rules for afforestation sites. Second, extracting rules from the data can overcome the limitation and subjectivity of expert judgment and enhance confidence. Third, the more data, the more accurate the extraction rules. Thus, the extraction rules can be updated and maintained.
The purpose of this study was to solve the problem of acquiring and updating the site rules in the afforestation expert system. The objectives were therefore as follows: (i) to explore an intelligent method to automatically extract the afforestation site rules; and (ii) to represent the knowledge of site rules by the production rule method and apply it in the afforestation expert system.

2. Materials and Methods

2.1. Data Source and Processing

The data were obtained from the 2005 and 2015 forest resources subcompartment database in Jinping County, Guizhou Province. Jinping County is located in the eastern part of Guizhou Province, with terrain gradually declining from west to east. The western part is dominated by low mountain and low-middle mountain landforms, with an elevation of 800–1300 m. The eastern part encompasses low mountains, hills, valleys and basins with an elevation between 500 and 700 m. Yellow soil is the dominant soil type, followed by red soil, yellow brown soil and rice soil. The county lies in the subtropical evergreen broad-leaved forest area, and its tree species are mainly Chinese fir, followed by Masson pine and broad-leaved tree species, such as Quercus and Liquidambar. The dominant tree species Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) and Masson pine (Pinus massoniana Lamb.), which form the main fast-growing timber forests in the area, were selected, and the following forest factors were sorted out: topographic factors, such as landform (DM), slope direction (PX), slope position (PW), slope gradient (PD) and elevation (HB); soil factors, such as soil type (TRMC), soil parent material (TRMZ) and soil layer thickness (TCHD); and stand factors, such as average age (t), volume per hectare of dominant tree species (YSSZGQXJ), number of trees per hectare of dominant tree species (YSSZGQZS), average height of dominant tree species (YSSZPJG) and average DBH of dominant tree species (YSSZPJXJ). Among these records, there were 7971 effective subcompartments with Chinese fir as the dominant tree species and 263 effective subcompartments with Masson pine as the dominant tree species. Eight attributes (landform, slope direction, slope position, slope gradient, altitude, soil type, soil parent material and soil thickness) were extracted from the forest resources subcompartment database, forming the growth site information for Chinese fir and Masson pine.
Ideally, model validation should involve the use of an independent data set [41]. In the present study, the validation data sets were gathered from the 2015 forest resources subcompartment database in the region. After processing, the validation data sets contained 1224 subcompartments of Chinese fir and 171 subcompartments of Masson pine (Table 1). Moreover, variations in stand factors and environmental site factors were included in the data set.

2.2. Extraction of Tree Suitability Site Rules Based on the Decision Tree

The decision tree is an instance-based learning algorithm similar to the tree structure classification of flowcharts. The decision tree is composed of three main parts: the decision node, branch and leaf node. The classification rules represented by the decision tree are inferred from the multiple irregular and disorderly data tuples. The whole decision-making process starts with the root decision node. From top to bottom, each decision node represents a data category or attribute to be classified, and each leaf node represents the result. Each path corresponds to a classification rule, and the set of classification rules constitutes a complete set of decision tree expressions. In this paper, the decision tree algorithm was selected to discover the knowledge of tree suitability, and the decision tree model was used to extract the implicit classification rules between a large number of site factors and tree suitability. Figure 1 is a schematic diagram of the decision tree being transformed into a decision rule.
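The path-to-rule correspondence described above can be illustrated with a minimal Python sketch. The toy tree, its branch values and its outcomes are hypothetical and are not taken from Figure 1 or from the fitted models; only the factor codes (TRMZ for soil parent material, PW for slope position) follow the paper.

```python
# Hypothetical toy tree: an internal node tests one site factor,
# a leaf node holds a tree suitability class.
toy_tree = {
    "factor": "TRMZ",
    "branches": {
        "shale": {"leaf": "most suitable"},
        "other": {
            "factor": "PW",
            "branches": {
                "lower slope": {"leaf": "suitable"},
                "ridge": {"leaf": "unsuitable"},
            },
        },
    },
}


def extract_rules(node, conditions=()):
    """Walk every root-to-leaf path and return one IF-THEN rule per leaf."""
    if "leaf" in node:
        premise = " AND ".join(f"{f} = {v}" for f, v in conditions) or "TRUE"
        return [f"IF {premise} THEN suitability = {node['leaf']}"]
    rules = []
    for value, child in node["branches"].items():
        # Extend the path with the test made at this decision node.
        rules.extend(extract_rules(child, conditions + ((node["factor"], value),)))
    return rules


rules = extract_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each leaf yields exactly one rule, so the number of extracted rules equals the number of leaf nodes.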
Based on the above principles, first, the site factors and grading index were constructed to determine the input of the decision tree model. Second, the tree suitability was evaluated, and the output of the decision tree model was determined. Third, the training results of decision tree models trained by different algorithms were compared. The decision tree model of the optimal algorithm was selected as the knowledge rule extraction model of tree suitability, and the extracted rules were the output.

2.2.1. Site Factors and Grading Index

In the actual afforestation operation plan of Guizhou Province, the site conditions mainly involve 8 site factors: landform, slope direction, slope position, slope gradient, elevation, soil type, soil parent material and soil layer thickness. Therefore, these 8 site factors served as input variables of the decision tree model. The site factors were graded according to the “Detailed Rules for the Implementation of Forest Resources Planning and Design Survey in Guizhou Province”, and the meanings of the site factors and their grading indicators in the study area are presented in Table 2.

2.2.2. Determining Tree Suitability Based on Quantile Regression

The decision tree is a classification algorithm, so the output variable corresponded to the classification of tree suitability (most suitable, suitable and unsuitable). The essence of determining tree suitability is to evaluate the site quality. To ensure the accuracy and flexibility of the decision tree model, that is, to make the extracted tree suitability site rules accurate and flexible when applied to natural, uneven-aged and mixed forests of different species in different regions, this paper used the theory of site quality assessment based on potential growth to determine tree suitability. This method assumes that, at the same site and with the same stand type, stands with an approximate structure and an approximate density follow an approximate growth process, including height growth, area growth and volume growth [42,43]. Since quantile regression can comprehensively describe the relationship between independent variables and dependent variables at different quantiles [44,45], this hypothesis could be quantified by quantile regression, which can capture the tail characteristics of the distribution. When the independent variables affect different parts of the distribution of the dependent variable differently, quantile regression describes the distribution characteristics more comprehensively [46]. The growth process was essentially the joint distribution of the age variable and the growth variable. The theoretical growth equations (Table 3) were used to fit the relationship between age and growth at the one-third and two-thirds quantile positions. The optimal equations were screened by the Akaike Information Criterion (AIC). Therefore, for forests of the same age, the two quantile regression lines divided the growth into three types: the most suitable, the suitable and the unsuitable.
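A minimal Python sketch of how two fitted quantile curves could partition stands into the three classes is given here. It assumes the common textbook forms of the logistic and Richards growth equations, because Table 3 is not reproduced in this section, and all parameter values are hypothetical placeholders rather than fitted results. How stands lying exactly on a curve are assigned is also an assumption.

```python
from math import exp

# Hypothetical parameters; the fitted values are not reproduced here.
LOWER = {"a": 14.0, "b": 12.0, "c": 0.20}  # assumed logistic curve, one-third quantile
UPPER = {"a": 18.0, "b": 1.5, "c": 0.08}   # assumed Richards curve, two-thirds quantile


def logistic_height(age, a, b, c):
    """Logistic growth curve H = a / (1 + b * exp(-c * age))."""
    return a / (1.0 + b * exp(-c * age))


def richards_height(age, a, b, c):
    """Richards growth curve H = a * (1 - exp(-c * age)) ** b."""
    return a * (1.0 - exp(-c * age)) ** b


def classify_suitability(age, mean_height):
    """Compare a stand's mean height with both quantile curves at its age."""
    lower = logistic_height(age, **LOWER)
    upper = richards_height(age, **UPPER)
    if upper < lower:
        # Curves from different equation families can cross at some ages.
        raise ValueError("quantile curves cross at this age")
    if mean_height >= upper:
        return "most suitable"
    if mean_height >= lower:
        return "suitable"
    return "unsuitable"


print(classify_suitability(10, 8.0))
print(classify_suitability(10, 6.0))
print(classify_suitability(10, 4.0))
```

The explicit crossing check is a design choice for this sketch: separately fitted quantile curves are not guaranteed to stay ordered over the whole age range.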
When quantile regression of forest resource subcompartment survey data was used to determine tree suitability, there were three forms of growth in the growth process: (1) The determination of the average volume growth under the same age forest condition (AGE-V-Quantile method); (2) the determination of the average height growth under the same age forest condition (AGE-H-Quantile method); and (3) the determination of the average diameter at breast height growth under the same age forest condition (AGE-DBH-Quantile method). Since height growth was less affected by density, the AGE-H-Quantile method was selected in the experiment. Although the average tree height of the dominant trees can more accurately reflect the site quality of the sample plot, in the actual forest resources subcompartment survey, the forest stand survey factors only included the average height, but not the average height of the dominant trees. In terms of determining tree suitability, because the average height was smaller than the average height of the dominant trees, the average tree height of the dominant trees must be “suitable” if the results of judging by average tree height were “suitable”. Therefore, determining the tree suitability by the average tree height ensured the appropriate accuracy of the tree suitability rules to some extent.

2.2.3. Decision Tree Algorithms Modeling

The decision tree is a data mining technology that can realize functions such as data classification, association rules extraction and regression prediction [36]. Common algorithms include the ID3 algorithm based on entropy theory and information gain, the C5.0 algorithm, which selects features by information gain ratio (an improved algorithm based on C4.5), and the CART algorithm, which uses the Gini index as the splitting criterion. This paper used the ID3, C5.0 and CART decision tree algorithms to construct the model and implemented them in the R language. All statistical analyses were performed using the C50 and rpart R packages. Generally, decision tree construction includes three processes: feature selection (information gain for ID3, information gain ratio for C5.0 and the Gini index for CART), decision tree construction and pruning. The decision tree model was constructed in R as follows:
Step 1. The training data were pretreated: the input variables were discretized by establishing the site factor and grading index, and the output variables were discretized by determining tree suitability.
Step 2. By adjusting the parameters, the decision tree model was generated and the tree suitability site rules were produced.
Step 3. The decision tree was pruned, and the tree suitability site rules were the output.
Step 4. The importance of site factors in the final decision tree model was analyzed.
Step 5. The accuracy of the decision tree classification was evaluated.
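The feature selection criteria named for ID3 and C5.0 can be sketched in plain Python. This is an illustration on a hypothetical four-record data set, not the C50 or rpart implementation used in the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting on one site factor (ID3 criterion)."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """Information gain divided by the split information (C4.5/C5.0-style criterion)."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical records: two graded site factors and the suitability class.
soil_thickness = ["thick", "thick", "thin", "thin"]
slope_direction = ["sunny", "shady", "sunny", "shady"]
suitability = ["suitable", "suitable", "unsuitable", "unsuitable"]

print(information_gain(soil_thickness, suitability))
print(information_gain(slope_direction, suitability))
print(gain_ratio(soil_thickness, suitability))
```

In this toy data set soil thickness separates the classes perfectly while slope direction carries no information, so a tree would split on soil thickness first.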

2.3. Rules Validation

The accuracy of the tree suitability rules extracted from the model was verified based on validation data. Table 4 summarizes the experience rules of Chinese fir and Masson pine in Guizhou Province Forest Management Information Collection in 1989. Table 5 shows the empirical rules of Masson pine collected from the table of afforestation investigation and design in Guizhou province in actual afforestation operations. The incompleteness and obsolescence of the empirical rules affected the accuracy of judgment in Table 4. In addition, the empirical rules can only roughly determine the suitability of the tree species and cannot further determine the degree of suitability (most suitable, suitable and unsuitable) in Table 5. The existing site index tables of Chinese fir and Masson pine in Guizhou were divided into three grades according to SI grade, that is, 8–12 index grade is unsuitable, 14–18 index grade is suitable and 20–22 index grade is most suitable. In this paper, the extracted site rules for tree suitability evaluation were compared with the existing site-specific site index tables of Chinese fir and Masson pine respectively, which were used to test the consistency of the judgment results of the tree suitability.
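The grade thresholds quoted above can be turned into a simple consistency check. The sketch below assumes that a subcompartment counts as consistent when the class given by the extracted rules equals the class derived from its SI grade. The sample values are hypothetical.

```python
def si_grade_to_suitability(si_grade):
    """Map a site index grade to a suitability class using the thresholds in the text."""
    if 8 <= si_grade <= 12:
        return "unsuitable"
    if 14 <= si_grade <= 18:
        return "suitable"
    if 20 <= si_grade <= 22:
        return "most suitable"
    raise ValueError(f"SI grade {si_grade} is outside the tabulated range")


def consistency(rule_classes, si_grades):
    """Share of subcompartments where both methods give the same class."""
    matches = sum(
        rule_class == si_grade_to_suitability(grade)
        for rule_class, grade in zip(rule_classes, si_grades)
    )
    return matches / len(rule_classes)


# Hypothetical validation records: class from the extracted rules and SI grade.
rule_classes = ["suitable", "most suitable", "unsuitable", "suitable"]
si_grades = [16, 20, 14, 18]
print(consistency(rule_classes, si_grades))
```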

2.4. Rules Application

Based on the rules extracted by the decision tree algorithm, an expert-assisted decision support system for afforestation could be constructed with five parts. Figure 2 shows the relationship between the rule extraction and the knowledge base of the expert system for afforestation. A database was used to store the original data and intermediate results. In this study, the database stored the forest resource subcompartment data. The Site Rule Intelligent Extractor is the procedure of extracting site rules for afforestation based on decision tree algorithm, including the determination of tree suitability by the quantile method, decision tree rule extraction and rule verification, which is the core module for data-to-knowledge conversion. The Site Rule Knowledge Base was used to store the extracted knowledge of the site rules. The knowledge of site rules was represented by the production rule method in this paper. The Inference Machine transformed site rules into IF-THEN form and deduced the conclusion based on the user input conditions. The Human-computer Interaction Interface was the interface between the system and the user for communication. Through this interface, users input basic information to answer the relevant questions raised by the system, and the system outputs reasoning results and relevant explanations.
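How an inference machine might match user input against IF-THEN site rules can be sketched as follows. The rule contents, the rule identifiers and the first-match strategy are all hypothetical. Only the factor codes TRMZ and TCHD come from the paper.

```python
# Hypothetical knowledge base: each rule lists the allowed values per site factor.
RULES = [
    {"id": "demo-1", "species": "Chinese fir",
     "if": {"TRMZ": {"shale"}}, "then": "most suitable"},
    {"id": "demo-2", "species": "Chinese fir",
     "if": {"TRMZ": {"sandstone"}, "TCHD": {"thick", "medium"}}, "then": "suitable"},
]


def infer(site, species, rules=RULES):
    """Return the conclusion of the first rule whose premises all hold, with an explanation."""
    for rule in rules:
        if rule["species"] != species:
            continue
        # A rule fires only if every factor it mentions has an allowed value.
        if all(site.get(factor) in allowed for factor, allowed in rule["if"].items()):
            return rule["then"], f"rule {rule['id']} matched"
    return None, "no rule matched"


site = {"TRMZ": "sandstone", "TCHD": "thick", "PW": "lower slope"}
print(infer(site, "Chinese fir"))
print(infer({"TRMZ": "limestone"}, "Chinese fir"))
```

Returning the matched rule identifier alongside the conclusion mirrors the explanation output that the interface is described as providing.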

3. Results

3.1. Determining Tree Suitability Based on the Quantile Method

The fitting results at the one-third and two-thirds quantile positions by the AGE-H-Quantile method are shown in Table 6. For both Chinese fir and Masson pine, the logistic regression model (Equation 1) was the best at the one-third quantile and the Richards regression model (Equation 5) was the best at the two-thirds quantile. The corresponding stand site tree suitability results for Chinese fir and Masson pine are shown in Figure 3.

3.2. Model Evaluation

Using the ID3, C5.0 and CART decision tree algorithms, the eight site factors in the subcompartment data were taken as input data according to Table 2, and the tree suitability of Chinese fir and Masson pine, determined by quantile regression, was used as output data. Decision tree models for classifying the suitability of Chinese fir and Masson pine were established. The models built with the three algorithms were used to predict the tree suitability of the subcompartments, and the resulting accuracy is shown in Table 7. For both the Chinese fir and Masson pine decision tree models, the CART algorithm was slightly superior to the other two algorithms.
Further analysis of the coincidence matrices of the three algorithms (Table 8) showed that, with the CART algorithm, the number of correct judgments for the suitable growth of Chinese fir was 3308 and the number of wrong judgments was 2039, giving a correct rate of 61.87%. The number of correct judgments for the suitable growth of Masson pine was 123 and the number of wrong judgments was 44, giving a correct rate of 73.65%. With the ID3 algorithm, the correct rates for the suitable growth of Chinese fir and Masson pine were 59.40% and 75.45%, respectively. With the C5.0 algorithm, they were 59.89% and 69.46%, respectively. For Chinese fir, the prediction performance of the three algorithms was similar: CART performed best, followed by C5.0 and ID3. For Masson pine, the numbers of correct judgments for suitable growth with ID3, C5.0 and CART were 126, 116 and 123, respectively, so ID3 and CART performed slightly better. Therefore, considering both species, the CART algorithm was used to construct the decision tree model.
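The correct rates quoted above follow directly from the counts of correct and wrong judgments, as this short arithmetic check shows. For ID3 and C5.0 on Masson pine, it assumes the same total of 167 suitable subcompartments (123 + 44) as for CART.

```python
def correct_rate(correct, wrong):
    """Percentage of correct judgments, rounded to two decimals."""
    return round(100.0 * correct / (correct + wrong), 2)


print(correct_rate(3308, 2039))     # CART, Chinese fir
print(correct_rate(123, 44))        # CART, Masson pine
print(correct_rate(126, 167 - 126)) # ID3, Masson pine (total of 167 assumed)
print(correct_rate(116, 167 - 116)) # C5.0, Masson pine (total of 167 assumed)
```

The four values reproduce the 61.87%, 73.65%, 75.45% and 69.46% reported in the text.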
Based on the decision tree model of the CART algorithm, the decision trees for the classification of tree suitability of Chinese fir and Masson pine were established, as shown in Figure 4 and Figure 5. Figure 4 shows that the decision tree established for Chinese fir had 10 layers after pruning, and 23 leaf nodes were generated, among which 6 were most suitable and 10 were suitable. Table 9 shows the importance of each input site factor of the corresponding decision tree. The slope position, soil parent material, soil layer thickness, elevation and landform were important factors affecting the growth of Chinese fir in the Jinping area of Guizhou Province, corresponding to the main classification factors in the decision tree. For the first layer, the soil parent material factor, (shale) was better for tree suitability. For the second layer, the slope position factor, (whole slope) was more suitable. For the third layer, the soil thickness factor, (thickness) was more suitable, and (low elevation) was selected for tree suitability. In the region concerned, the three factors of slope gradient, soil type and slope direction had little influence on tree suitability.
Similarly, the decision tree established for Masson pine had nine layers after pruning, with 14 leaf nodes generated, among which four were most suitable and three were suitable. Table 10 shows the importance of each site factor of the corresponding decision tree (Figure 5). The important factors affecting the growth of Masson pine in the Jinping area of Guizhou Province were slope position, soil parent material, soil layer thickness and slope gradient, corresponding to the main classification factors in the decision tree. For the first layer, the slope position factor, (flat land and valley) were better for tree suitability. For the second layer, the soil parent material factor, (shale) was more suitable. For the third layer, the soil thickness factor, (thickness) was more suitable. For slope direction, (sunny slope) was the more suitable choice. Because the sample of suitable Masson pine sites in this area was relatively small, some site factor classes were missing, which had a certain impact on the model and will be improved in future studies.

3.3. Rules Validation

Both the site rules and the site index were used to judge the tree suitability. The rule verification results (Table 11) indicated that, for Chinese fir, 920 of 1224 subcompartments were consistent and 304 were inconsistent, a consistency of 75.16%. For Masson pine, 121 of 171 subcompartments were consistent and 50 were inconsistent, a consistency of 70.76%. Therefore, for Chinese fir and Masson pine in Guizhou, the two methods agreed in more than 70% of the tree suitability judgments, indicating that the rules extracted by this method can be applied to site suitability judgment to some extent, especially for species without a site index. The remaining inconsistencies suggest that the lack of appropriate samples affected the model, which should be improved in future studies.

3.4. Rules Application

The production rule method can merge the afforestation reasoning knowledge represented by multiple rules into one rule, greatly reducing the number of rules recorded in the Site Rule Knowledge Base and facilitating the maintenance of the knowledge base by afforestation experts. Each production rule is essentially a description of a final leaf node. To illustrate this method, suppose a tree suitability site rule involves m values of the landform condition, n values of the soil type condition and p values of the soil thickness condition. The tree structure representation method must then store m * n * p * ... rules, whereas a single production expression is sufficient. The relation schema of the afforestation site rules table designed according to this idea is as follows:

Tree Suitability Site Rule Table

(Rule Number, region, DM, PX, PW, PD, HB, TRMC, TRMZ, TCHD, Tree species, Tree Suitability)
Each record combines several tree suitability site rules: when a site condition factor takes multiple values, the values are separated by ‘*’. The tree suitability site rules extracted for Chinese fir and Masson pine in the Jinping area of Guizhou Province were saved in this form (see Table A1 in Appendix A for the tree suitability site rule table for Chinese fir and Masson pine in Jinping County, Guizhou Province).
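The saving achieved by the ‘*’-separated representation can be illustrated with a short sketch that expands one compact record back into the single-valued rules a tree structure representation would have to store. The record contents are hypothetical. Only the column codes (DM, TRMC, TCHD) follow the relation schema.

```python
from itertools import product

# Hypothetical compact record: multiple values of one factor are joined by '*'.
compact_rule = {
    "DM": "low mountain*hill",
    "TRMC": "yellow soil*red soil",
    "TCHD": "thick*medium",
}


def expand(rule):
    """Expand one compact production rule into all single-valued combinations."""
    factors = list(rule)
    value_lists = [rule[factor].split("*") for factor in factors]
    return [dict(zip(factors, combo)) for combo in product(*value_lists)]


expanded = expand(compact_rule)
print(len(expanded))  # m * n * p single-valued rules replaced by one record
print(expanded[0])
```

With m = n = p = 2, one stored record stands for eight single-valued rules.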
The above classification rules could intuitively express the relationship between tree suitability and site factors. At the same time, the judgment of tree suitability was influenced by many site factors. The effect of each factor depended on whether it is larger or smaller than a threshold value. Only a few threshold values of variables were needed to determine the final suitability or unsuitability of the site factors of great importance. For example, in rule number 2, the suitability of Chinese fir could be judged as the most suitable by the soil parent material being shale, while some rules required more site factors to be judged repeatedly before obtaining the results of tree suitability.

4. Discussion

To construct the decision tree model, tree suitability must first be determined as the output variable of the model. This step was based on the theory of site quality evaluation of potential productivity. The site index is the most commonly used method for site quality evaluation, but it applies mainly to older, even-aged, well-stocked, free-growing, undisturbed and pure or single species-dominated stands [47,48]. This paper quantified the theory using quantile regression and proposed a quantile-based method for determining tree suitability. The proposed method extends the application scope of the decision tree model in three ways. (1) It can be applied to different types of forest resource data. Average age and average tree height are generally included in forest resources subcompartment data, permanent sample plot data and parsed tree data. The site index method, by contrast, needs the average height of dominant trees, a variable that is usually missing from forest resource surveys, which limits its use. (2) It is suitable for mixed forest types of different origins. In this paper, Chinese fir and Masson pine were selected as the dominant tree species for screening data, and their origins were not limited to natural or artificial. Because the rules are extracted by dominant tree species, further study could also analyze the most suitable mixed forest types, since productivity is quite likely to be affected by the mixing of tree species. For example, if two rules extracted for Chinese fir gave the tree suitability as “most suitable” and “suitable”, with corresponding species compositions of “8 fir 2 broad” and “7 fir 2 broad”, respectively, we would have reason to think that the “8 fir 2 broad” mixed forest type is better than the “7 fir 2 broad” type. (3) It is suitable for unevenly aged forests. The quantile method converts the growth of unevenly aged forests into the growth of evenly aged forests according to the distribution of different quantiles.
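The quantile idea can be sketched as follows, under several stated assumptions. The parameter values are the starred Chinese fir fits as read from Table 6 (Logistic for the 1/3 quantile, Richards for the 2/3 quantile). The equation forms are the standard Logistic and Richards growth curves. The mapping of the three height bands to "most suitable", "suitable" and "unsuitable" is an illustrative assumption rather than a rule stated in this section. The function names are hypothetical.

```python
import math

# Chinese fir parameters as read from the starred rows of Table 6.
LOWER = {"A": 12.15988, "m": 6.0329, "r": 0.17977}   # 1/3 quantile, Logistic
UPPER = {"A": 16.23922, "m": 0.84568, "r": 0.05428}  # 2/3 quantile, Richards


def logistic(t, A, m, r):
    """Standard Logistic growth curve: A / (1 + m * exp(-r t))."""
    return A / (1.0 + m * math.exp(-r * t))


def richards(t, A, m, r):
    """Standard Richards growth curve: A * (1 - exp(-r t)) ** m."""
    return A * (1.0 - math.exp(-r * t)) ** m


def classify(age, height):
    """Place a stand in one of three bands by its mean height at a given age.

    Assumption: at or above the 2/3 curve is "most suitable", between the
    two curves is "suitable", below the 1/3 curve is "unsuitable".
    """
    low = logistic(age, **LOWER)
    high = richards(age, **UPPER)
    if height >= high:
        return "most suitable"
    if height >= low:
        return "suitable"
    return "unsuitable"


for h in (10.0, 12.0, 14.0):
    print(26, h, classify(26, h))
```

Because the two curves depend only on average age and average height, the same comparison works for any stand age, which is the sense in which the method does not require even-aged stands.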
After the tree suitability of a stand site has been determined, the large amount of explicit information in forest resource data (e.g., dominant tree species, landform, slope, slope direction, slope position, soil species and soil thickness) can be used to judge afforestation suitability (most suitable, suitable or unsuitable). However, the same combination of site factors very likely corresponds to inconsistent suitability judgments in the raw data. The significance of rule extraction by a decision tree algorithm is that it mines the relationships hidden in a large amount of data in a probabilistic way and summarizes them as a set of rules, which makes the rules more reliable. These rules can then be used with existing data to predict and evaluate tree suitability effectively. Compared with the empirical rules, the extracted rules not only judge whether a tree species is suitable but also clarify the degree of suitability, which addresses the incomplete and fuzzy knowledge in the empirical rules.
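A minimal sketch of the extraction step: every root-to-leaf path of a fitted tree becomes one IF ... THEN rule. The toy tree below is hypothetical and is not the fitted model from this paper. Only the factor codes (TRMZ, TCHD) and class names follow Table 2, and `extract_rules` and `format_rule` are illustrative names.

```python
# Hypothetical toy tree; the splits are illustrative only.
TOY_TREE = {
    "factor": "TRMZ",
    "branches": [
        (("Shale",), {"label": "MostSuitable"}),
        (("Sandstone", "SandstoneShale", "Slate"), {
            "factor": "TCHD",
            "branches": [
                (("Thick",), {"label": "Suitable"}),
                (("Middle", "Thin"), {"label": "Unsuitable"}),
            ],
        }),
    ],
}


def extract_rules(node, path=()):
    """Walk the tree depth-first and return one (conditions, label) per leaf."""
    if "label" in node:
        return [(path, node["label"])]
    rules = []
    for values, child in node["branches"]:
        condition = (node["factor"], values)
        rules.extend(extract_rules(child, path + (condition,)))
    return rules


def format_rule(conditions, label):
    """Render a rule in production form: IF ... AND ... THEN label."""
    tests = " AND ".join(
        f"{factor} in ({', '.join(values)})" for factor, values in conditions
    )
    return f"IF {tests} THEN {label}"


for conditions, label in extract_rules(TOY_TREE):
    print(format_rule(conditions, label))
```

Shallow leaves give short rules with a single condition, while deeper leaves accumulate one condition per split along the path.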
Machine learning algorithms and expert systems, two branches of applied artificial intelligence, are becoming research hotspots in forestry, but they have mostly been applied separately. Both have significant limitations, and few studies have combined the two technologies so that they complement each other. Feng Yuqiang [49,50] developed an integrated neural network and expert system for stock analysis, which used the neural network to reduce the difficulty of knowledge acquisition in the expert system. Han Yanyun [32] noted that a neural network could address the knowledge acquisition bottleneck in the forest cultivation expert system, but because of problems such as black-box operation without an explanation mechanism and slow training, neural networks have not been applied to afforestation knowledge acquisition. In this paper, the ability of a decision tree algorithm to extract classification rules was used to tackle the difficulty of acquiring afforestation site rules for an afforestation expert system. At the same time, the knowledge base and reasoning mechanism of the expert system were used to make up for the weaknesses of knowledge expression in the decision tree model. Combining the decision tree algorithm with the afforestation expert system brought together the advantages of both and enhanced the decision support ability of the system. The essence of the CART algorithm is the binary partition of feature space, and it can split both nominal and continuous attributes. Its core idea is to use the Gini index minimization criterion to select features and generate a binary tree. The larger the Gini index, the greater the uncertainty of the sample set, which is similar to entropy.
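The Gini criterion can be illustrated with a short sketch. The eight labelled samples below are made up for illustration and are not taken from the study data, and `best_binary_split` is a hypothetical helper that tries every two-way grouping of a nominal attribute and keeps the one with the lowest weighted Gini index.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini index of a label list: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Weighted Gini index of a binary partition."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_binary_split(values, labels):
    """Search all binary groupings of a nominal attribute (CART-style)."""
    categories = sorted(set(values))
    best = None
    for size in range(1, len(categories) // 2 + 1):
        for subset in combinations(categories, size):
            left = [y for x, y in zip(values, labels) if x in subset]
            right = [y for x, y in zip(values, labels) if x not in subset]
            score = split_gini(left, right)
            if best is None or score < best[0]:
                best = (score, set(subset))
    return best


# Made-up samples: soil parent material and a two-class suitability label.
parent_material = ["Shale"] * 3 + ["Slate"] * 2 + ["Sandstone"] * 3
suitability = ["suitable"] * 4 + ["unsuitable"] * 4

print(gini(suitability))
print(best_binary_split(parent_material, suitability))
```

In this toy data, isolating either Sandstone or Shale gives a pure child node and lowers the weighted Gini index from 0.5 to about 0.2, whereas isolating Slate leaves it at 0.5.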
Table 7 and Table 8 show that the CART algorithm was slightly superior to the other two algorithms for both the Chinese fir and Masson pine decision tree models. The main difference among the three decision tree algorithms lies in the criterion used for attribute splitting: the ID3 algorithm uses information gain, which is based on entropy; the C5.0 algorithm (an improved version of C4.5) uses the information gain ratio; and the CART algorithm uses the Gini index. In the binary classification problem, the curves of the Gini index and of half the entropy are very close, and both approximate the classification error rate [51]. Although various classification algorithms continue to emerge, the CART algorithm is the base classifier of many ensemble learning algorithms and is among the most widely used classification techniques.
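For reference, the three impurity measures compared above can be written for a binary problem in which p is the proportion of one class. This is the standard textbook formulation, given here as a reading aid rather than as notation taken from this paper:

```latex
\begin{aligned}
\mathrm{Gini}(p) &= 2p(1-p), \\
\tfrac{1}{2}H(p) &= -\tfrac{1}{2}\bigl[p\log_2 p + (1-p)\log_2(1-p)\bigr], \\
\mathrm{Err}(p) &= 1 - \max(p,\, 1-p).
\end{aligned}
```

All three are zero at p = 0 and p = 1 and reach their common maximum of 1/2 at p = 1/2, which is why the Gini index and half the entropy track the classification error rate so closely.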

5. Conclusions

To address incomplete expert knowledge and the difficulty of acquiring it, this paper proposed determining forest site suitability by quantile regression, constructed decision trees with the ID3, C5.0 and CART algorithms, and selected the best-performing CART algorithm to build the final model. An extractor of afforestation rules was constructed based on site quality evaluation of potential productivity, knowledge discovery and decision tree theory from machine learning, taking Chinese fir and Masson pine in Jinping County, Guizhou Province, as examples. After the site rules for Chinese fir and Masson pine were validated, the production representation method was used to construct the relationship model of the knowledge base. The highlights of the paper are as follows:
  • Tree suitability site rules were automatically extracted by decision tree algorithms to solve the problem of acquiring and updating the knowledge in the expert system.
  • The knowledge of site rules was represented by the production rule method and then applied in the afforestation expert system.
  • Site quality of potential productivity was quantified by quantile regression.
  • The consistency between the extracted rules and the stand index exceeded 70% for both Chinese fir and Masson pine.
Given the limitations of the data, a consistency of more than 70% between the extracted rules and the stand index means that the expectation has been generally realized. However, the rule verification results (Table 11) show that, because the sample plot data on suitable growing conditions for Masson pine were insufficient, the rules extracted for this species have lower prediction accuracy and poorer stability. Further research should improve the prediction accuracy and stability.
It should be pointed out that the decision tree model and rule extractor need to be embedded in the system to realize the dynamic updating of expert knowledge. The next research goal is to develop an online prototype of the afforestation expert assistant decision support system. Its main functions are to realize the transformation from data to knowledge and to provide users with decision support based on the extracted rules. Further research can explore the inference engine algorithm of the expert system, so that the order of the questions asked is decided dynamically from the answers of the users and the suitable afforestation tree species are then recommended on the basis of those answers.

Author Contributions

Conceptualization, Y.C., D.C. and B.W.; methodology, Y.C. and D.C.; software, D.C.; validation, Y.C.; formal analysis, Y.C.; investigation, Y.C.; resources, B.W.; data curation, Y.C.; writing—original draft preparation, Y.C. and D.C.; writing—review and editing, Y.C.; visualization, Y.C.; supervision, B.W. and Y.Q.; project administration, Y.C., D.C. and B.W.; funding acquisition, B.W.

Funding

This research was funded by the Key National Research and Development Program of China, grant number 2017YFD0600906.

Acknowledgments

We are grateful to the Guizhou Forestry Department and the Guizhou Forestry Investigation and Planning Institute for supplying valuable modeling data.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Tree suitability site rule table for Chinese fir and Masson Pine in Jinping County, Guizhou Province.
| Rule Number | Region | DM | PD | PX | PW | HB | TRMZ | TRMC | TCHD | Tree Species | Tree Suitability |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | Guizhou Jinping | | | | NoSlope | | Sandstone*SandstoneSha*Slate | | | Chinese fir | MostSuitable |
| 1856 | Guizhou Jinping | | GentleSlope*Incline | HalfsunnySlo*ShadySlope | downhill*MidSlope | Low | SandstoneSha | YellowSoil | Thick | Chinese fir | MostSuitable |
| 2 | Guizhou Jinping | | | | | | Shale | | | Chinese fir | MostSuitable |
| 450 | Guizhou Jinping | | AbruptSlope*FlatSlope*GentleSlope*Incline | HalfsunnySlo | downhill*MidSlope*Ridge*valley | | Slate | | Thick | Chinese fir | MostSuitable |
| 902 | Guizhou Jinping | | AbruptSlope*FlatSlope*GentleSlope*Incline | SunnySlope | MidSlope*Ridge*valley | | Slate | | Thick | Chinese fir | MostSuitable |
| 932 | Guizhou Jinping | | AbruptSlope*DangerousSlo*SteepSlope | | MidSlope | Low | Sandstone*SandstoneSha | YellowSoil | Thick | Chinese fir | MostSuitable |
| 224 | Guizhou Jinping | | AbruptSlope*FlatSlope*GentleSlope*Incline | ShadySlope | downhill*MidSlope*Ridge*valley | | Slate | | Thick | Chinese fir | Suitable |
| 903 | Guizhou Jinping | | AbruptSlope*FlatSlope*GentleSlope*Incline | SunnySlope | downhill | | Slate | | Thick | Chinese fir | Suitable |
| 243 | Guizhou Jinping | | | HalfsunnySlo | FlatLand*Ridge*Uphill*valley | Low | SandstoneSha | | Middl*Thin | Chinese fir | Unsuitable |
| 242 | Guizhou Jinping | | | ShadySlope*SunnySlope | FlatLand*Ridge*Uphill*valley | Low | SandstoneSha | | Middl*Thin | Chinese fir | Suitable |
| 120 | Guizhou Jinping | | | | downhill*MidSlope | Low | SandstoneSha | | Middl*Thin | Chinese fir | Suitable |
| 1857 | Guizhou Jinping | | GentleSlope*Incline | SunnySlope | downhill*MidSlope | Low | SandstoneSha | YellowSoil | Thick | Chinese fir | Suitable |
| 933 | Guizhou Jinping | | AbruptSlope*DangerousSlo*SteepSlope | | downhill | Low | Sandstone*SandstoneSha | YellowSoil | Thick | Chinese fir | Unsuitable |
| 929 | Guizhou Jinping | | GentleSlope*Incline | | downhill*MidSlope | Low | Sandstone | YellowSoil | Thick | Chinese fir | Suitable |
| 57 | Guizhou Jinping | | | | Uphill | | Slate | | Thick | Chinese fir | Suitable |
| 117 | Guizhou Jinping | | | | Uphill | Low | Sandstone*SandstoneSha | | Thick | Chinese fir | Unsuitable |
| 61 | Guizhou Jinping | | | | downhill*FlatLand*MidSlope*Ridge*Uphill*valley | Low | Sandstone*Slate | | Middl*Thin | Chinese fir | Unsuitable |
| 465 | Guizhou Jinping | | GentleSlope*Incline | | downhill*MidSlope | Low | Sandstone*SandstoneSha | RedSoil*YellowBrownS | Thick | Chinese fir | Unsuitable |
| 467 | Guizhou Jinping | | AbruptSlope*DangerousSlo*SteepSlope | | downhill*MidSlope | Low | Sandstone*SandstoneSha | RedSoil*YellowBrownS | Thick | Chinese fir | Suitable |
| 31 | Guizhou Jinping | | | | downhill*FlatLand*MidSlope*Ridge*Uphill*valley | Medi | Sandstone*SandstoneSha*Slate | | Middl*Thin | Chinese fir | Unsuitable |
| 119 | Guizhou Jinping | | | | MidSlope | Medi | Sandstone*SandstoneSha | | Thick | Chinese fir | Unsuitable |
| 118 | Guizhou Jinping | | | | Ridge*Uphill | Medi | Sandstone*SandstoneSha | | Thick | Chinese fir | Suitable |
| 113 | Guizhou Jinping | | SteepSlope | | downhill*MidSlope*Ridge*valley | | Slate | | Thick | Chinese fir | Suitable |
| 20 | Guizhou Jinping | | | | MidSlope*Ridge | | | | Thick*Thin | Masson Pine | MostSuitable |
| 8 | Guizhou Jinping | | | | downhill*FlatLand | | Sandstone*Shale*Slate | | | Masson Pine | MostSuitable |
| 12 | Guizhou Jinping | | | ShadySlope | Uphill | | Shale*Slate | | | Masson Pine | MostSuitable |
| 180 | Guizhou Jinping | | AbruptSlope*Incline | HalfsunnySlo | MidSlope*Ridge | | Shale*Slate | | Middl | Masson Pine | MostSuitable |
| 9 | Guizhou Jinping | | | | downhill*FlatLand | | SandstoneSha | | | Masson Pine | Unsuitable |
| 44 | Guizhou Jinping | | | | NoSlope | | Shale*Slate | | Middl | Masson Pine | Suitable |
| 363 | Guizhou Jinping | | Incline | ShadySlope*SunnySlope | MidSlope*Ridge | | Shale*Slate | | Middl | Masson Pine | Unsuitable |
| 362 | Guizhou Jinping | | AbruptSlope | ShadySlope*SunnySlope | MidSlope*Ridge | | Shale*Slate | | Middl | Masson Pine | Suitable |
| 21 | Guizhou Jinping | | | | NoSlope | | | | Thick*Thin | Masson Pine | Unsuitable |
| 26 | Guizhou Jinping | | | HalfsunnySlo*SunnySlope | Uphill | | Shale*Slate | | Middl | Masson Pine | Suitable |
| 91 | Guizhou Jinping | | GentleSlope*SteepSlope | | MidSlope*Ridge | | Shale*Slate | | Middl | Masson Pine | Unsuitable |
| 23 | Guizhou Jinping | | | | MidSlope*NoSlope*Ridge | | Sandstone*SandstoneSha | | Middl | Masson Pine | Unsuitable |
| 7 | Guizhou Jinping | | | | Uphill | | Sandstone*SandstoneSha | | | Masson Pine | Unsuitable |
| 27 | Guizhou Jinping | | | HalfsunnySlo*SunnySlope | Uphill | | Shale*Slate | | Thick | Masson Pine | Unsuitable |

References

  1. Shen, G.; Zhai, M. Silviculture, 2nd ed.; Chinese Forestry Press: Beijing, China, 2011.
  2. Gong, Y. Study of Site Knowledge Discovery Based on Multivariate Forestry Information; Beijing Forestry University: Beijing, China, 2013.
  3. Gillman, L.N.; Wright, S.D.; Cusens, J.; McBride, P.D.; Malhi, Y.; Whittaker, R.J. Latitude, productivity and species richness. Glob. Ecol. Biogeogr. 2015, 24, 107–117.
  4. Poorter, L.; Van der Sande, M.T.; Arets, E.J.; Ascarrunz, N.; Enquist, B.J.; Finegan, B.; Licona, J.C.; Martínez-Ramos, M.; Mazzei, L.; Meave, J.A.; et al. Biodiversity and climate determine the functioning of Neotropical forests. Glob. Ecol. Biogeogr. 2017, 26, 1423–1434.
  5. Ali, A.; Lin, S.L.; He, J.K.; Kong, F.M.; Yu, J.H.; Jiang, H.S. Climatic water availability is the main limiting factor of biotic attributes across large-scale elevational gradients in tropical forests. Sci. Total Environ. 2019, 647, 1211–1221.
  6. Albrektson, A. Needle litterfall in stands of Pinus sylvestris L. in Sweden, in relation to site quality, stand age and latitude. Scand. J. For. Res. 1988, 3, 333–342.
  7. Green, R.N.; Marshall, P.L.; Klinka, K. Estimating site index of Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) from ecological variables in southwestern British Columbia. For. Sci. 1989, 35, 50–63.
  8. Karlsson, A.; Albrektson, A.; Sonesson, J. Site index and productivity of artificially regenerated Betula pendula and Betula pubescens stands on former farmland in southern and central Sweden. Scand. J. For. Res. 1997, 12, 256–263.
  9. Carmean, W.H. Forest Site Quality Evaluation in the United States. Adv. Agron. 1975, 27, 209–269.
  10. Zeide, B.; Zakrzewski, W.T. Selection of site trees: The combined method and its application. Can. J. For. Res. 1993, 23, 1019–1025.
  11. Palahí, M.; Pukkala, T.; Kasimiadis, D.; Poirazidis, K.; Papageorgiou, A.C. Modelling site quality and individual-tree growth in pure and mixed Pinus brutia stands in north-east Greece. Ann. For. Sci. 2008, 65, 501.
  12. Socha, J. Effect of topography and geology on the site index of Picea abies in the West Carpathian, Poland. Scand. J. For. Res. 2008, 23, 203–213.
  13. Fu, L.; Lei, X.; Sharma, R.P.; Li, H.; Zhu, G.; Hong, L.; You, L.; Duan, G.; Guo, H.; Lei, Y.; et al. Comparing height–age and height–diameter modelling approaches for estimating site productivity of natural uneven-aged forests. For. Int. J. For. Res. 2018, 91, 419–433.
  14. Borders, B.E.; Bailey, R.L.; Clutter, M.L. Forest growth models: Parameter estimation using real growth series. In Forest Growth Modelling and Prediction, Proceedings of the IUFRO Conference, Minneapolis, MN, USA, 23–27 August 1987; Ek, A.R., Shifley, S.R., Burk, T.E., Eds.; U.S. Department of Agriculture, Forest Service, North Central Forest Experiment Station: St. Paul, MN, USA, 1988.
  15. Zhao, L.; Ni, C.; Gordon, N. Generalized Algebraic Difference Site Index Model for Ponderosa Pine in British Columbia, Canada. Sci. Silvae Sin. 2012, 48, 74–81.
  16. Kimberley, M.O.; Ledgard, N.J. Site index curves for grown in the South Island high country, New Zealand. N. Z. J. For. Sci. 1998, 28, 389–399.
  17. Hui, G.; Zhang, L.; Hu, Y.; Zhao, Z. A new Method for Establishing Richards Polymorphic Site Index Model: Parameter Replacement. For. Res. 2010, 23, 481–486.
  18. Pienaar, L.V.; Shiver, B.D. Dominant height growth and site index curves for loblolly pine plantations in the Carolina flatwoods. South. J. Appl. For. 1980, 4, 54–59.
  19. Upadhyay, A.; Eid, T.; Sankhayan, P.L. Construction of site index equations for even aged stands of Tectona grandis (teak) from permanent plot data in India. For. Ecol. Manag. 2005, 212, 14–22.
  20. Rivas, J.J.; González, J.G.; González, A.D.; Von Gadow, K. Compatible height and site index models for five pine species in El Salto, Durango (Mexico). For. Ecol. Manag. 2004, 201, 145–160.
  21. Duan, A.; Zhang, J. Modeling of Dominant Height Growth and Building of Polymorphic Site Index Equations of Chinese Fir Plantation. Sci. Silvae Sin. 2004, 40, 13–19.
  22. Wang, C. Establishment of Plantation Site Quality Evaluation System; Beijing Forestry University: Beijing, China, 2013.
  23. Moisen, G.G.; Frescino, T.S. Comparing five modelling techniques for predicting forest characteristics. Ecol. Model. 2002, 157, 209–225.
  24. Schröder, J.M.; Jaenicke, H. A computerized database as decision support tool for the selection of agroforestry tree species. Agrofor. Syst. 1994, 26, 65–70.
  25. Hu, B.; Wu, B.; Lu, D. System of Experts on Afforestation Based on ASP. NET. For. Inventory Plan. 2005, 30, 20–23.
  26. Ding, Q.; Wu, B. The design and application of an expert system based on generative rule. Agric. Netw. Inf. 2006, 8, 16–18.
  27. Wu, B.; Ding, Q.; Hu, B. Study on Consultation System for Experts in Afforestation Based on Web. Sci. Silvae Sin. 2006, 42, 85–89.
  28. Wu, B.; Ding, Q.; Wang, L. A forestation planning expert decision advisory system. N. Z. J. Agric. Res. 2007, 50, 1399–1404.
  29. Ma, C.; Wu, B. Forestation expert system based on the production and framework knowledge representation. Agric. Netw. Inf. 2009, 5, 22–24.
  30. Ma, C. Establishment of Fast-Growing and High-Yielding Forest Cultivation Expert System; Beijing Forestry University: Beijing, China, 2009.
  31. Souza, H.N.D.; Graaff, J.D.; Pulleman, M.M. Strategies and economics of farming systems with coffee in the Atlantic Rainforest Biome. Agrofor. Syst. 2012, 84, 227–242.
  32. Han, Y.; Wu, B.; Liu, J.; Guo, Y.; Dong, C. Application of uncertainty inference in the forest cultivation expert system. J. Beijing For. Univ. 2014, 36, 88–93.
  33. Prabakaran, G.; Vaithiyanathan, D.; Ganesan, M. Fuzzy decision support system for improving the crop productivity and efficient use of fertilizers. Comput. Electron. Agric. 2018, 150, 88–97.
  34. Vásquez, R.P.; Aguilar-Lasserre, A.A.; López-Segura, M.V.; Rivero, L.C.; Rodríguez-Duran, A.A.; Rojas-Luna, M.A. Expert system based on a fuzzy logic model for the analysis of the sustainable livestock production dynamic system. Comput. Electron. Agric. 2019, 161, 104–120.
  35. Chi, Q. Classification Algorithm Study and Application based on Decision Tree; Shandong Normal University: Jinan, China, 2005.
  36. Miao, J.; Gong, Y. Data mining of environmental factors affecting the value of forest assets. Land Resour. North China 2013, 2, 50–54.
  37. Torresan, C.; Corona, P.; Scrinzi, G.; Marsal, J.V. Using classification trees to predict forest structure types from LiDAR data. Ann. For. Res. 2016, 59, 281–298.
  38. Isaac-Renton, M.G.; Roberts, D.R.; Hamann, A.; Spiecker, H. Douglas-fir plantations in Europe: A retrospective test of assisted migration to address climate change. Glob. Chang. Biol. 2014, 20, 2607–2617.
  39. Wang, T.; Campbell, E.M.; O’Neill, G.A.; Aitken, S.N. Projecting future distributions of ecosystem climate niches: Uncertainties and management applications. For. Ecol. Manag. 2012, 279, 128–140.
  40. Marchi, M.; Ducci, F. Some refinements on species distribution models using tree-level National Forest Inventories for supporting forest management and marginal forest population detection. iForest Biogeosciences For. 2018, 11, 291.
  41. De-Miguel, S.; Mehtätalo, L.; Shater, Z.; Kraid, B.; Pukkala, T. Evaluating marginal and conditional predictions of taper models in the absence of calibration data. Can. J. For. Res. 2012, 42, 1383–1394.
  42. Meng, X. Forest Mensuration, 3rd ed.; China Forestry Publishing House: Beijing, China, 2006.
  43. Lei, X.; Fu, L.; Li, H.; Li, Y.; Tang, S. Methodology and Applications of Site Quality Assessment Based on Potential Mean Annual Increment. Sci. Silvae Sin. 2018, 54, 116–126.
  44. Hao, L.; Naiman, D.Q. Quantile Regression Model; Shanghai People’s Publishing House: Shanghai, China, 2012.
  45. Zhang, L. The Change Point Problem in Quantile Regression and Its Application; Economic Management Press: Beijing, China, 2017.
  46. Su, Y.; Wan, Y. The Idea and Application of Quantile Regression. Stat. Thinktank 2009, 10, 58–61.
  47. Carmean, W.H.; Lenthall, D.J. Height-growth and site-index curves for jack pine in north central Ont. Can. J. For. Res. 1989, 19, 215–224.
  48. Huang, S.; Titus, S.J. An index of site productivity for uneven-aged or mixed-species stands. Can. J. For. Res. 1993, 23, 558–562.
  49. Feng, Y.; Huang, T.; Hou, C. Design of Expert System and Neural Network Integration System. J. Manag. Sci. China 1999, 1, 82–88.
  50. Feng, Y. Research on Intelligent Decision Support System Based on Integration of Neural Network and Expert System; Harbin University of Technology: Harbin, China, 1999.
  51. Li, H. Statistical Learning Method, 2nd ed.; Tsinghua University Press: Beijing, China, 2019.
Figure 1. Principle of decision rules extraction based on the decision tree algorithm. Where Input1 and Input2 outputs are variables of the decision tree and a1, a2, b1, b2, Y1 and Y2 are the decision tree sets.
Figure 2. The architecture diagram of the afforestation expert assistant decision support system.
Figure 3. Distribution of stand site tree suitability of Chinese fir and Masson Pine.
Figure 4. Classification decision tree model of tree suitability for certain forest site conditions of Chinese fir. Where each box contains the following information according to the order of top to bottom and left to right: Rule number, tree suitability judgment, three tree suitability (most suitable, suitable and unsuitable) percentage and the probability of the rule.
Figure 5. Classification decision tree model of tree suitability for certain forest site conditions of Masson pine.
Table 1. Growth information of the forest resource subcompartment data in Jinping County of Guizhou Province in 2015.
HB (m)DMPWPD (°)PXTRMZTRMCTC HD (cm)YSSZtYSSZPJXJ (cm)YSSZPJG (m)YSSZGQXJ (m3/ha)YSSZGQZS (N/ha)SZZCQY
770Low MountainMidSlope27SoutheastSandstoneYellow Soil80Chinese fir2616.510123.741005.510firPlantation
640Low MountainMidSlope38NorthwestSandstoneYellow Soil70Chinese fir352514205.26550.0410firPlantation
760Low MountainUphill14WestSandstoneYellow Soil50Chinese fir321810112.23766.39fir1MPlantation
830Low MountainMidSlope20SoutheastSandstoneRed Soil60Chinese fir322213158.53565.598fir2MPlantation
760Low MountainUphill6SouthSandstoneYellow Soil80Chinese fir3519.510.5128.79719.9110firPlantation
1090Middle MountainMidSlope25EastSlateYellow Soil50Chinese fir2614.5121601453.410firPlantation
620Low Mountaindownhill28EastSandstoneYellow Soil60Masson pine2623.51388.41299.726M4firPlantation
500Low Mountaindownhill26WestSandstoneYellow Soil70Masson pine79516.89785.957M3firPlantation
530Low MountainMidSlope37NorthwestSandstoneYellow Soil60Masson pine302813116.32276.086M4firPlantation
790Low Mountaindownhill42EastSandstoneYellow Soil50Masson pine212413118.58397.8910MPlantation
670Low MountainMidSlope35SoutheastSlateYellow Soil30Masson pine26251487.54254.657M3BPlantation
1150Middle MountainUphill19SoutheastSlateYellow Soil50Masson pine25208.523.26159.158M2firPlantation
560Low MountainMidSlope43NorthwestSandstoneYellow Soil60Masson pine105.2419.512825.2310MPlantation
770Low Mountaindownhill30WestSandstoneShaleYellow Soil60Masson pine3520.511.4127.55651.3910Mnatural-forests
Note: SZZC, tree species composition; QY, origin.
Table 2. Site factors and their classification indexes.
| No. | Site Factor | Grading Index |
|---|---|---|
| 1 | DM | MiddleMountain; LowMountain; Hills |
| 2 | PX | SunnySlope (south, southeast, southwest); ShadySlope (north, northwest, northeast); HalfsunnySlope (east, west) |
| 3 | PW | Ridge; Uphill; Middle Slope; Downhill; Valley; FlatLand; NoSlope |
| 4 | PD | FlatSlope (0°–5°); GentleSlope (6°–15°); Incline (16°–25°); AbruptSlope (26°–35°); SteepSlope (36°–45°); DangerousSlope (≥46°) |
| 5 | HB | Low (≤1000 m); Medium (1000–3500 m); High (>3500 m) |
| 6 | TRMC | YellowSoil; RedSoil; YellowBrownSoil; RiceSoil |
| 7 | TRMZ | Sandstone; Shale; SandstoneShale; Slate |
| 8 | TCHD | Thick (≥80 cm); Middle (40–79 cm); Thin (<40 cm) |
Table 3. Theoretical growth equations.
| Equation | Model | Expression |
|---|---|---|
| 1 | Logistic | $y = \dfrac{A}{1 + m\exp(-rt)}$ |
| 2 | Mitscherlich | $y = A\,(1 - \exp(-rt))$ |
| 3 | Gompertz (1825) | $y = A\exp(-m\exp(-rt))$ |
| 4 | Korf (1939) | $y = A\exp(-m\,t^{-r})$ |
| 5 | Richards (1959) | $y = A\,(1 - \exp(-rt))^{m}$ |

Note: y, predictive factors, such as DBH, tree height, volume, etc.; t, independent variables, generally expressed as age; A, m, r are the regression coefficients.
Table 4. Summary of empirical site rules from Guizhou Province Forest Management Information Collection in 1989.
Site Type DistrictSite Type GroupSite TypeSuitable Tree SpecieEvaluate
Low mountainMetamorphic rocksUpper humusMasson pineTree suitability is moderate, and the average site index of Masson pine is 12–14.
middle humusChinese firTree suitability is good, and site index of Chinese fir is 14–18.
Lower humusChinese firTree suitability is best, and site index of Chinese fir is 18–20.
Sandstone ShaleUpper thin soil layerMasson pineTree suitability is slightly worse, and site index of Masson pine is 10–12.
Middle thin soil layerChinese fir
Masson pine
Tree suitability is slightly better, and site index of Chinese fir is 10–12.
Lower thin soil layerChinese firTree suitability is good, and site index of Chinese fir is 14–16.
GraniteUpper thin soil layerMasson pineTree suitability is slightly worse, and site index of Masson pine is 10–12.
Middle thin soil layerMasson pineTree suitability is slightly better, and site index of Chinese fir is 10–12.
Lower thin soil layerChinese firTree suitability is good, and site index of Chinese fir is 14–16.
Hilly in front of hillmetamorphic rockMid-upper Mid-thin humusMasson pineTree suitability is moderate, and site index of Masson pine is 12–14.
Mid-lower Mid-thick humusChinese firTree suitability is good, and site index of Chinese fir is 14–18.
Sandstone ShaleMid-upper thin soil layerMasson pineTree suitability is slightly better, and site index of Masson pine is 10–12.
Mid-lower thick soil layerChinese firTree suitability is good, and site index of Chinese fir is 14–16.
hillsSandstone ShaleMid-upper thin soil layerMasson pineTree suitability is slightly worse, and site index of Masson pine is 10–12.
Mid-upper middle soil layerMasson pineTree suitability is slightly better, and site index of Masson pine is 12–14.
Mid-lower thick soil layerChinese firTree suitability is good, and site index of Chinese fir is 12–14.
Note: The site region is the mountainous and hilly areas of southeast Guizhou; the site sub-region is Jinping metamorphic rocks, clasolite, low mountains and hills.
Table 5. Summary of empirical site rules from Guizhou Plantation Survey and Design.
Type RegionType DistrictType GroupSite TypeSuitable Tree Specie
Landform ElevationLithologySlopeSoil Thickness
Middle mountains (>1000 m)Sandstone and shaleFlat-gentle slope (≤15°)Thick (≥80 cm)Masson pine
Middle (40–80 cm)Masson pine
Thin (≤40 cm)Masson pine
Incline (15°–25°)Thick (≥80 cm)Masson pine
Middle (40–80 cm)Masson pine
Thin (≤40 cm)Masson pine
Abrupt-dangerous slope (≥25°)Thick (≥80 cm)Masson pine
Middle (40–80 cm)Masson pine
Thin (≤40 cm)Masson pine
Low mountains and hilly (≤1000 m)Sandstone and shaleFlat-gentle slope (≤15°)Thick (≥80 cm)Masson pine
Middle (40–80 cm)Masson pine
Thin (≤40 cm)Masson pine
Incline (15°–25°)Thick (≥80 cm)Masson pine
Middle (40–80 cm)Masson pine
Thin (≤40 cm)Masson pine
Abrupt-dangerous slope (≥25°)Thick (≥80 cm)Masson pine
Middle (40–80 cm)Masson pine
Thin (≤40 cm)Masson pine
Table 6. Fitting results of AGE-H-Quantile method.
Tree Species Chinese FirMasson Pine
QuantileEquationParameterAICParameterAIC
AmrAmr
1/31 *12.159886.03290.1797730332.0913.588899.86640.158621154.531 *
23.47527 0.0024661230.453.47527 0.002462042.726
312.426552.707570.1360330339.1114.010583.588740.115051156.298
414.6142616.257061.2906930400.2617.0651127.753591.31851162.853
512.584291.936280.1183530349.5814.209972.504350.099921157.933
2/3114.16818−24.28120.103594367.3414.72274−2.875470.168811912.02
212.4428 0.6370234809.3413.88931 0.633721366.349
315.156941.609020.0877331064.1315.269323.370980.122791174.222
418.5413565.222921.7770439694.4517.8489925.281941.36041173.953
5 *16.239220.845680.0542831051.7415.460452.314230.105771173.915 *
Note: * marks the best result.
Table 7. Comparison results of three decision tree models.
| Tree Species | Algorithm | Accuracy (%) |
|---|---|---|
| Chinese fir | ID3 | 50.51 |
| Chinese fir | C5.0 | 50.11 |
| Chinese fir | CART | 50.88 |
| Masson pine | ID3 | 51.71 |
| Masson pine | C5.0 | 53.99 |
| Masson pine | CART | 54.75 |
Table 8. The coincidence matrix of three algorithms for Chinese fir and Masson Pine.
AlgorithmPredictedChinese Fir (Predicted Value)Masson Pine (Predicted Value)
ID3 Most SuitableSuitableUnsuitableMost SuitableSuitableUnsuitable
Most Suitable1389478827492019
Suitable6087011344203722
Unsuitable2674781879142656
C50 Most SuitableSuitableUnsuitableMost SuitableSuitableUnsuitable
Most Suitable138948182564915
Suitable6087411331311236
Unsuitable267485187239849
CART Most SuitableSuitableUnsuitableMost SuitableSuitableUnsuitable
Most Suitable1391525778472021
Suitable5778151261193723
Unsuitable2705041850112560
Note: The number of judgments for the suitable growth (in bold number) = the number of most suitable + the number of suitable.
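As a cross-check, overall accuracy can be recomputed from a coincidence matrix as the diagonal sum divided by the grand total. The sketch below uses the CART rows of Table 8, with the fused digits split into three counts per species. That split is a reading of the flattened table rather than something the table states, and `accuracy` is a hypothetical helper name. Under this reading, the results reproduce the CART accuracies reported in Table 7.

```python
def accuracy(matrix):
    """Overall accuracy: correctly classified cases / all cases."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# CART rows of Table 8; class order: most suitable, suitable, unsuitable.
CART_CHINESE_FIR = [
    [1391, 525, 778],
    [577, 815, 1261],
    [270, 504, 1850],
]
CART_MASSON_PINE = [
    [47, 20, 21],
    [19, 37, 23],
    [11, 25, 60],
]

print(f"Chinese fir: {accuracy(CART_CHINESE_FIR):.2%}")  # 50.88%
print(f"Masson pine: {accuracy(CART_MASSON_PINE):.2%}")  # 54.75%
```

Accuracy does not depend on whether rows hold the observed or the predicted classes, so the calculation is unaffected by the orientation of the matrix.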
Table 9. Importance of variables for Chinese fir.
| Variable | PW | TRMZ | TCHD | HB | DM | PD | TRMC | PX |
|---|---|---|---|---|---|---|---|---|
| Importance | 39 | 37 | 11 | 6 | 5 | 1 | 1 | 1 |
Table 10. Importance of variables of Masson pine.
| Variable | PW | TRMZ | TCHD | PD | PX | DM | HB | TRMC |
|---|---|---|---|---|---|---|---|---|
| Importance | 37 | 28 | 10 | 9 | 7 | 4 | 4 | 1 |
Table 11. Results of rule verification.
| Tree Species | Consistency | Inconsistency | Total |
|---|---|---|---|
| Chinese fir | 920 (75.16%) | 304 (24.84%) | 1224 |
| Masson pine | 121 (70.76%) | 50 (29.24%) | 171 |

Share and Cite

MDPI and ACS Style

Chen, Y.; Wu, B.; Chen, D.; Qi, Y. Using Machine Learning to Assess Site Suitability for Afforestation with Particular Species. Forests 2019, 10, 739. https://0-doi-org.brum.beds.ac.uk/10.3390/f10090739
