  • Original Paper

Evaluation of the use of low-density LiDAR data to estimate structural attributes and biomass yield in a short-rotation willow coppice: an example in a field trial

Abstract

Key message

LiDAR data (low-density data, 0.5 pulses m−2) represent an excellent management resource as they can be used to estimate forest stand characteristics in short-rotation willow coppice (SRWC) with reasonable accuracy. The technology is also a useful, practical tool for carrying out inventories in these types of stands.

Context

This study evaluated the use of very low-density airborne LiDAR (light detection and ranging) data (0.5 pulses m−2), which can be accessed free of charge, in an SRWC established in degraded mining land.

Aims

This work aimed to determine the utility of low-density LiDAR data for estimating main forest structural attributes and biomass productivity and for comparing the estimates with field measurements carried out in an SRWC planted in marginal land.

Methods

The SRWC was established following a randomized complete block design with three clones, planted at two densities and with three fertilization levels. Use of parametric (multiple regression) and non-parametric (classification and regression trees, CART) fitting techniques yielded models with good predictive power and reliability. Both fitting methods were used for comprehensive analysis of the data and provide complementary information.

Results

The results of multiple regression analysis indicated close relationships (\( R^2_{\mathrm{fit}} \) = 0.63–0.97) between LiDAR-derived metrics and the field-measured data for the variables studied (H, D20, D130, FW, and DW). High \( R^2 \) values were obtained for models fitted using the CART technique (\( R^2 \) = 0.73–0.94).

Conclusion

Low-density LiDAR data can be used to model structural attributes and biomass yield in SRWC with reasonable accuracy. The models developed can be used to improve and optimize follow-up decisions about the management of these crops.

1 Introduction

The prospects of successfully achieving and maintaining sustainable energy production worldwide depend on the increased use of renewable resources in general and biomass in particular (Edenhofer et al. 2011). One of the best ways of ensuring the long-term availability of biomass for producing renewable energy is to establish and grow new perennial energy crops, which can also add value to marginal land (Rosso et al. 2013) or can be used for bioremediation purposes.

Biomass plantations are an attractive source of renewable energy (González-Ferreiro et al. 2013) and also have many other advantages. Some of the reasons why crops are grown for bioenergy purposes include the recovery of economic activities in rural areas, provision of a neutral CO2 balance, and restoration of degraded land.

Depending on the final destination (heat and/or electricity), three types of biomass can be produced: oilseed, alcohol, and lignocellulose (IDAE 2007). Crops that produce lignocellulosic biomass (fiber crops) can be used to produce both heat and electricity and can be grown as short-rotation coppice (SRC). Producing these so-called energy crops is considered one of the most energy-efficient methods of carbon conversion, as growing the crops is considered an efficient means of reducing greenhouse gas emissions (Styles and Jones 2007).

Short-rotation plantations can be established on various types of land, including marginal land (Broeckx et al. 2012). Zurba et al. (2013) recommended planting SRC on marginal land and brownfields, in parallel with other sustainable land management options. Planting SRC on this type of land may also contribute in the long term to improving soil quality and biodiversity, protecting groundwater, and preventing soil erosion (Kuzovkina and Quigley 2005; Zurba et al. 2013). The use of Salicaceae (Salix and Populus spp.) provides several advantages: the ease of propagating plants from cuttings (low production cost and easy to establish), the wide range of improved genetic material available, production of high biomass yields in a short time, and vigorous coppicing regrowth after cutting (Keoleian and Volk 2005). Taking into account the wide adaptability of members of the genus Salix to extreme conditions and to nutrient-impoverished and polluted soils (Kuzovkina and Quigley 2005), SRC willow can be established on marginal land or in soils that are not suitable for agricultural exploitation (Jama and Nowak 2012). Indeed, short-rotation willow coppice (SRWC), together with Populus spp., Eucalyptus spp., and Robinia pseudoacacia L., is one of the most promising bioenergy cropping systems for use in temperate regions of Europe (Venturi et al. 1999) as well as in Canada and the USA (Tahvanainen and Rytko 1999; Weih 2004).

The region of Asturias (north-western Spain) was a major coal-producing region during the past century. Although coal mining continues to be one of the most important sources of employment in the region (Paredes-Sánchez et al. 2016; Suárez-Antuña 2005), the sector is currently in recession, and large areas of mining land have been abandoned. The mining company Grupo Hunosa currently owns up to 700 ha of former mining land that is suitable for machine-based establishment of forest energy crops. This is currently considered the best option for use of this land, despite the difficulties in establishing energy crops (unfavorable soil structure/properties in these degraded areas). To date, the only trials involving SRC energy crops in Asturias are those associated with research projects (7 ha). At present, a commercial plantation (for bioenergy purposes) is being established in 20 ha of abandoned mine land of similar characteristics to those considered in the present study.

In 2008, an experimental trial with willow energy crops was established in abandoned mining land in Asturias. The aims of this experimental trial were to obtain information about structural attributes and biomass production in an SRWC crop established in a restored coal mining area and to evaluate the effects of clone, fertilization, and planting density on crop yield. For this purpose, detailed and comprehensive field inventories were conducted in order to obtain as much information as possible about the development of the energy crop. Forest inventories were used to estimate multiple parameters at plot level, including structure and biomass production.

For some decades now, remote sensing has enabled information about forest biomass (particularly in extensive forest areas) to be obtained at a wide range of spatial and temporal scales, thus greatly reducing costs and the amount of fieldwork required (Montealegre Gracia et al. 2015). The correlation between the spectral response of vegetation and structural attributes or biomass production has been investigated in numerous studies in which active sensors were used (Estornell et al. 2011; González-Ferreiro et al. 2012; Næsset 2002; Næsset and Gobakken 2008).

The interest shown by the aforementioned company in developing and applying new procedures and the possibility of obtaining data (forest structure and other forest variables) from airborne sensors provides a valuable opportunity to quantify the resources obtained directly from SRC. This represents a breakthrough in this field, as carrying out the field inventories necessary for adequate planning and monitoring of the forest energy plantations (characterized by high tree densities of 5000–20,000 plants ha−1, high number of shoots per stool, etc.) is tedious and time consuming.

The use of free light detection and ranging (LiDAR)-derived data provides certain advantages such as smaller estimation errors, a reduction in the duration of field inventories and the ability to cover larger areas of land.

LiDAR is one of the most important technologies developed in this field in recent years. This technique is already being used successfully to evaluate total forest area, improving the accuracy of forest inventories and reducing the cost and time spent on these (Eid et al. 2004; González-Ferreiro et al. 2012; Wehr and Lohr 1999). LiDAR is an active remote sensing system based on the use of a laser sensor and the application of various techniques to determine the distance from a laser transmitter to an object (Sánchez Martínez et al. 2011). This distance is established by measuring the time delay between emission of a signal and detection of its reflection (Tanarro 2010). The method is therefore a combination of three different technologies: laser telemetry, the global positioning system (GPS), and inertial measurement units (IMUs). The laser beam, emitted at a frequency of thousands of energy pulses per second toward the earth, creates a dense strip of 3D points (Manue 2007). This 3D point cloud is based on accurate measurement by a plane-mounted pulse sensor, which calculates the distance separating it from the earth’s surface and objects existing on it (Magdaleno-Mas and Martínez-Romero 2006). As the position and orientation of the sensor are known for each pulse emitted, each return signal has unique three-dimensional coordinates. LiDAR data have been captured for the entire Spanish territory under the National Aerial Orthophotography Plan (PNOA). Data were collected during 2012 in the region of Asturias.
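The time-delay principle described above can be illustrated with a minimal sketch. The function name and the example delay are hypothetical, and the speed of light in vacuum is used as an approximation (atmospheric effects are ignored); this is not the processing chain of any particular sensor.

```python
# Speed of light in vacuum (m/s); the slight slowdown in air is ignored here.
SPEED_OF_LIGHT = 299_792_458.0


def range_from_delay(delay_s: float) -> float:
    """One-way sensor-to-target distance (m) from a round-trip pulse delay (s)."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    # The pulse travels to the target and back, hence the division by two.
    return SPEED_OF_LIGHT * delay_s / 2.0


# Hypothetical example: a round-trip delay of 10 microseconds.
distance_m = range_from_delay(10e-6)
```

With this hypothetical delay the target lies roughly 1.5 km from the sensor. Combining such a range with the GPS position and IMU orientation of the sensor is what gives each return its three-dimensional coordinates.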

LiDAR technology has been used successfully to characterize numerous types of forest stands (Hayashi et al. 2014; Lefsky et al. 2002; Means et al. 1999; Means and Acker 2000; Næsset et al. 2004). However, in forest inventories, the technique has been found to underestimate height (Clark et al. 2004; Næsset 1997; Zimble et al. 2003), and some authors (Falkowski et al. 2006) have suggested that higher density data (6–8 pulses m−2) are required for forest monitoring. However, other studies have shown that a low pulse density is sufficient for establishing strong correlations with the main attributes measured in forest inventories (Hawbaker et al. 2010; Means and Acker 2000; Thomas et al. 2006). Although studies based on the use of low-density discrete-return LiDAR to determine forest structure have been reported (Coops et al. 2007; Hall et al. 2005), to date only basal area has been estimated in short-rotation coppice (SRC) (Seidel and Ammer 2014). Nonetheless, the structure of SRCs facilitates the application of LiDAR technology as these are dense, rather uniform stands with little or no accompanying vegetation. These features favor good correlations between variables measured in forest inventories and those measured using LiDAR technology.

The main objective of this study was to assess the usefulness of low-resolution discrete return LiDAR (0.5 pulses m−2) data to estimate structural attributes and biomass production with the aim of facilitating management of an SRC plantation. For the purposes of this study, we developed statistical models that relate the information provided by the LiDAR to the data obtained in detailed field inventories conducted in the study area (Montealegre Gracia et al. 2015). The methods were evaluated by complementary techniques: parametric multiple linear regression, which enabled us to develop predictive models, evaluated by \( R^2_{\mathrm{fit}} \) and RMSE in order to indicate the accuracy of the fits, and non-parametric classification and regression trees (CART), which provided more detailed, descriptive information about the variables. The main aims of the study were (i) to estimate the forest structure and the productivity of SRWC and (ii) to apply and compare the use of different types of model fitting methods (multiple linear regression and CART).

2 Material and methods

2.1 Study area

2.1.1 Location of the study area

The experimental trial included three commercial willow clones and covered an area of 2 ha in the region of Asturias (north-western Spain) (Fig. 1). The Salix energy crop trial was established in May 2008 in restored land surrounding an abandoned opencast coal mine, denominated Mozquita (ETRS89 UTM 30 N, N: 4,794,443, E: 280,981). The study area is characterized by an average annual temperature of 13 °C and an average annual precipitation of 1,115 mm, of which 345 mm falls during the growing season (May–September). The climate is oceanic with high annual precipitation and, although summer precipitation is relatively low in some areas, physiological drought does not occur in any part of the region, which is located entirely within the European Biogeographic Atlantic Region (EEA 2011).

Fig. 1 Distribution of the sampling plots used for estimation of structural attributes and biomass yield. The photographic insert shows the experimental layout of the three commercial willow clones under study (green, Bjor clone; red, Inger clone; and blue, Olof clone)

The clay loam substrate (with a high presence of coarse elements, approximately 30%) was dumped and ameliorated in 2003. Soil formation is at an early stage and the soil structure is still unstable. The steep slopes of the terrain minimize groundwater effects. The physiography of the plots was characterized by a mean slope of 19% and an elevation ranging from 508 to 597 m above sea level.

2.1.2 Experimental design

In the winter of 2008, the surface was subsoiled, plowed to a depth of 30–40 cm, and harrowed before the willow cuttings were planted. Three commercially available willow clones were chosen for the study because of their adaptability to extreme soil conditions (e.g., nutrient poor and polluted soils) (Kuzovkina and Quigley 2005) and because they display good structural attributes and yield capacities for biomass production in SRC (Keoleian and Volk 2005). The cuttings were planted according to a double row planting design, leaving a distance of 0.75 m between each set of double rows, a distance of 1.5 m to the next set of double rows, and a distance between plants of 0.9 m (10,000 plants ha−1) or 0.6 m (15,000 plants ha−1) to provide two stocking levels (Fig. 2).
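The two nominal stocking levels can be checked against the spacings given above with a short calculation. This is a sketch that assumes the double-row pattern repeats uniformly across the plot (no headlands or gaps); the function name is illustrative.

```python
def plants_per_hectare(within_pair_m: float, to_next_pair_m: float, in_row_m: float) -> float:
    """Stocking density for a repeating double-row layout."""
    # One double row plus the gap to the next double row forms a repeating strip.
    strip_width_m = within_pair_m + to_next_pair_m
    # Each strip carries two rows, so every in-row step holds two plants.
    area_per_plant_m2 = strip_width_m * in_row_m / 2.0
    return 10_000.0 / area_per_plant_m2


density_n1 = plants_per_hectare(0.75, 1.5, 0.9)  # roughly 9,877 plants per ha
density_n2 = plants_per_hectare(0.75, 1.5, 0.6)  # roughly 14,815 plants per ha
```

Under this assumption the spacings give about 9,900 and 14,800 plants ha−1, which agree with the nominal values of 10,000 and 15,000 plants ha−1 after rounding.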

Fig. 2 Diagram of the planting designs used in the trial

The experiment was established following a randomized complete block design (three blocks), in which three qualitative factors were considered for analysis: clone (three levels), planting density (two levels), and fertilization treatment (three levels), as outlined in Table 1.

Table 1 Main characteristics of the experimental design

Finally, the basic design was repeated in three blocks, with a total of 54 square plots each with an area of 400 m2 (20 × 20 m) and constituted by 9 double rows with 22 or 33 cuttings per row (depending on the stocking density). Irrigation and pest/disease control were not performed during the cultivation period throughout the study area.

2.2 Field data collection

The experimental plots were measured in the autumn of 2012 according to the protocol described by the UK Forest Research (Forest Research 2003) for collecting data in short-rotation willow plantations (first rotation, stand age 5 years). Several variables were measured after the vegetative period with the aim of assessing the performance of each clone. Measurements were made in rectangular subplots of 27 m2 (9 × 3 m for density N1) and 18 m2 (6 × 3 m for density N2) located in the center of each plot, to avoid the edge effect. A total of 40 stools (live or dead) were measured in each of the 54 subplots in this study. Within each of the subplots, shoot diameters were measured at 20 and 130 cm above ground level (D20 and D130) with a digital caliper, and the total mean heights (H) were measured with a Vertex III hypsometer. The survival rate was also recorded at the end of each vegetative season. Before the trial crops were harvested (in autumn 2012; stand age 5 years), 5 stools were randomly selected in each of the abovementioned subplots and cut. A total of 270 stools were harvested manually with pruning shears (Electrocoup F3010) or a chainsaw. The fresh weight (FW) of each stool was measured with an electronic balance (precision ± 10 g). A representative subsample (300–500 g) of each stool was taken to the laboratory and weighed immediately with a precision balance (precision ± 1 g). The subsample was subsequently dried to constant weight at 70 °C; the dry weight (DW) was recorded, and the dry biomass of the plot was calculated by multiplying the FW by the ratio of dry to fresh weight of the subsample.
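The conversion from fresh to dry weight described above amounts to scaling the field-measured fresh weight by the dry fraction of the laboratory subsample. A minimal sketch follows; the function name and the example weights are hypothetical and are not taken from the trial data.

```python
def dry_weight(fresh_weight_g: float, sub_fresh_g: float, sub_dry_g: float) -> float:
    """Dry weight estimated from fresh weight and the subsample dry-to-fresh ratio."""
    if sub_fresh_g <= 0:
        raise ValueError("subsample fresh weight must be positive")
    # Ratio of oven-dry to fresh weight measured on the subsample.
    dry_fraction = sub_dry_g / sub_fresh_g
    return fresh_weight_g * dry_fraction


# Hypothetical stool: 4200 g fresh; 400 g subsample weighing 180 g after drying.
dw_g = dry_weight(4200.0, 400.0, 180.0)
```

In this made-up case the dry fraction is 0.45, so the stool holds about 1.9 kg of dry biomass.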

The mean values and standard deviations for these variables are shown in Table 2. The biomass and structural attributes differed depending on the treatment applied (F0, F1, and F2) (Table 1), thus explaining the high standard deviations. The trial involved a dense plantation (10,000 and 15,000 plants ha−1) with a very homogeneous stand structure in each plot and scarcely any companion vegetation. The plot location data (four vertices) were collected with GPS submeter precision (Trimble Geoexplorer 2008 series).

Table 2 Mean values and standard deviations of the test variables (H, D20, D130, FW, and DW) for the three clones (data corresponding to a field inventory carried out in 2012)

2.3 LiDAR procedure

2.3.1 LiDAR data

The LiDAR data (Table 3) were acquired in July 2012 with ALS60 (Leica) and LMS-Q680 (Riegl) sensors. The beam divergence was 0.3 mrad and the pulsing frequency, 45 kHz; the scan frequency was 70 Hz, and the maximum scan angle, 50°. The first and last return pulses were recorded. Flights were conducted across the whole study area, and one flight was conducted for each strip, yielding an average measurement density of about 0.5 pulses m−2. The LiDAR data provided by the PNOA (official web page, 2016) included information about return type (first and last); X, Y, and Z coordinates; and intensity of the returned pulse by the sensor.

Table 3 Specifications of the LiDAR flight. Source: PNOA

2.3.2 Extraction of LiDAR variables

For the low-density data acquired (0.5 pulses m−2), FUSION software (McGaughey 2010) was used to filter, interpolate, and generate the digital terrain model (DTM)/digital crown model (DCM) and also to compute the following variables related to the metrics of heights and return intensity distributions within the limits of the 54 field plots: mean, maximum and minimum values, mode, standard deviation, variance, interquartile distance, coefficients of skewness and kurtosis, average absolute deviation, and percentiles. The proportion of returns (%) above a specific height threshold was also estimated in order to separate tree crop canopy returns from other vegetation.

The following steps were carried out with several processing programs (algorithms) implemented in the Fusion LiDAR Toolkit (McGaughey 2010).

  • GROUNDFILTER. Ground returns were extracted from the LiDAR point cloud by using the GroundFilter tool, which implements a filtering algorithm adapted from Kraus and Pfeifer (1998) based on linear prediction (Kraus and Mikhail 1972).

  • GRIDSURFACE CREATE. These returns were used to generate a DTM (1 m2 resolution) grid with the GridSurfaceCreate tool, which computes the elevation of each grid cell by using the average elevation of all points within the cell; if the cell does not contain ground return points, its elevation is generated by interpolation from the neighboring cells. The metrics were generated for the exact size of each 20 × 20 m plot.

  • CLIPDATA. The normalized LiDAR point cloud was obtained by subtraction of the ellipsoidal height of the DTM from the Z coordinate of each LiDAR return with the ClipData tool; this tool was used also to exclude returns below a normalized height of 0.5 m, considered as not belonging to tree crowns (e.g., hits on shrubs, rocks and logs).

  • POLYCLIPDATA. The normalized LiDAR point cloud was clipped with the limits of each field plot—which were stored as polygons in vector format—by using the PolyClipData tool; one independent file was created per plot.

  • CLOUDMETRICS. The metrics of heights and return intensity distributions of the 54 clipped and normalized point clouds were computed with the CloudMetrics tool.
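The normalization and metric steps listed above can be sketched in plain Python. This is an illustrative approximation only: it does not call FUSION, the metric names are hypothetical, and the percentile rule (linear interpolation) and the exact definitions of the percentage metrics are assumptions that may differ from those used by CloudMetrics.

```python
def percentile(sorted_vals, p):
    """Percentile (p in 0..100) of a sorted list, by linear interpolation."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def plot_metrics(z, dtm, cutoff=0.5):
    """Height metrics for one plot from return elevations and DTM elevations."""
    # Normalize: height above ground = return elevation minus terrain elevation.
    heights = [zi - gi for zi, gi in zip(z, dtm)]
    # Exclude returns below the cutoff (treated as not belonging to the crop canopy).
    canopy = sorted(h for h in heights if h >= cutoff)
    if not canopy:
        raise ValueError("no returns at or above the height cutoff")
    mean_h = sum(canopy) / len(canopy)
    n_above_mean = sum(1 for h in canopy if h > mean_h)
    return {
        "elev_mean": mean_h,
        "elev_max": canopy[-1],
        "elev_p60": percentile(canopy, 60),
        "pct_returns_above_cutoff": 100.0 * len(canopy) / len(heights),
        "pct_returns_above_mean": 100.0 * n_above_mean / len(heights),
    }


# Hypothetical returns for one plot: elevations (m) and DTM elevation under each return.
z = [510.2, 511.0, 512.5, 513.1, 510.1]
dtm = [510.0, 510.0, 510.0, 510.0, 510.0]
metrics = plot_metrics(z, dtm)
```

In this toy plot, two of the five returns fall below 0.5 m and are discarded, and the remaining three define the height distribution from which the mean, maximum, and percentile metrics are derived.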

Table 4 shows the complete set of metrics and the corresponding abbreviations used in this study.

Table 4 Statistics extracted from the heights and intensities of LiDAR flights and used as independent variables for the regression models

2.4 Statistical analysis

Several statistical methods can be used in remote sensing prediction studies. In this study, parametric and non-parametric fitting methods were used to predict the main structural attributes and biomass production variables from LiDAR data and field measurements. Both fitting methods were used for comprehensive analysis of the data and because they provide important complementary information.

2.4.1 Parametric methods

Multiple linear regression (MLR) was used to model the relationships between field measurements (H, D20, D130, FW, and DW) and the LiDAR variables in order to produce general models (for all clones together) and models classified by clone (Bjor, Inger, and Olof).

Candidate predictor variables were required to have an entering F-statistic with a significance level of 0.05 or less for inclusion in the model, and no predictor was left in the model with a partial F-statistic with a significance level greater than 0.05. Dependent variables derived from field data and predicted in regressions were mean height (H), basal diameter (D20), diameter at breast height (D130), fresh weight (FW), and dry weight (DW).

Comparison of the model estimates was based on the following two statistics: the adjusted coefficient of determination (\( R^2_{\mathrm{fit}} \)) and the root mean square error (RMSE). The \( R^2_{\mathrm{fit}} \) compares the descriptive power of regression models that include different numbers of predictors, and RMSE is a quadratic scoring rule that measures the average magnitude of the error (the square root of the average of squared differences between predicted and actual observations), which was calculated to provide additional information. Finally, residual plots were checked in order to validate the model fit. The variance inflation factor (VIF) was also taken into account. This factor quantifies the severity of multicollinearity in ordinary least squares regression analysis and provides an index of the extent to which the variance of an estimated regression coefficient increases due to collinearity. Only models in which all parameters were significant at the 5% level and with a VIF < 5 were included, thus ensuring that predictors were not highly correlated (Belsley et al. 2005; Mandeville 2008).
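The three diagnostics described above can be written out as a short sketch. The function names and the example numbers are hypothetical. RMSE follows the definition in the text (division by the number of observations), and the VIF is shown only for the two-predictor case, where it reduces to 1 / (1 − r²) with r the Pearson correlation between the predictors; models with more predictors require regressing each predictor on all the others.

```python
import math


def rmse(observed, predicted):
    """Square root of the mean squared difference between predictions and observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def adjusted_r2(observed, predicted, n_predictors):
    """Coefficient of determination penalized for the number of predictors."""
    n = len(observed)
    if n - n_predictors - 1 <= 0:
        raise ValueError("too few observations for this number of predictors")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    var1 = sum((a - m1) ** 2 for a in x1)
    var2 = sum((b - m2) ** 2 for b in x2)
    r_squared = cov * cov / (var1 * var2)
    return 1.0 / (1.0 - r_squared)


# Hypothetical observed and predicted values from a one-predictor model.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
fit_rmse = rmse(obs, pred)
fit_r2 = adjusted_r2(obs, pred, n_predictors=1)
# Hypothetical pair of predictors with a correlation of 0.8.
vif = vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

In the last example the predictors have a correlation of 0.8, giving a VIF of about 2.8, which would pass the VIF < 5 screen applied in this study.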

In this study, all of the data were used to construct the general and clone-specific models, as according to Myers (1990, p. 170) and Hirsch (1991), final estimation of model parameters using the entire dataset is more precise than when a model is fitted using only one portion of the data, especially with relatively small sample numbers.

2.4.2 Non-parametric method

In a preliminary analysis carried out to determine the most appropriate non-parametric statistical method, random forest (RF) and classification and regression trees (CART) were compared. The CART method was chosen because it provided better fits to the data, with larger \( R^2 \) values and lower RMSE, than yielded by RF (see Online Resource 4). It is also a good exploratory technique that aims to determine classification and prediction rules.

The main advantages of the CART method are as follows (Gordon 2013; Timofeev 2004): (i) it does not require specification of any functional form, (ii) it does not require variables to be selected in advance, (iii) it can easily handle outliers, (iv) it does not require the assumptions of statistical models and is computationally fast, (v) it is flexible and can deal with missing data, (vi) it works better than RF with relatively small databases, and (vii) it is easy to interpret (unlike random forest).

The objective of CART is usually to classify a dataset into several groups by use of a rule that displays the groups in the form of a binary tree (Breiman et al. 1984), which is determined by a procedure known as recursive partitioning. In this study, the CART method was used to classify the variables considered (H, D20, D130, FW, and DW) in relation to the LiDAR data available.

Each tree branch is described by the value of one descriptor, chosen so that all objects in a daughter group have more similar response variable values. The split for continuous variables is defined by \( x_i < a_j \), where \( x_i \) is the selected descriptor or explanatory variable, and \( a_j \) is its split value. To choose the most appropriate descriptor \( x_i \) and value of \( a_j \), CART uses an algorithm in which all descriptors and all split values are considered, selecting those giving the best reduction in impurity between the mother group (\( t_p \)) and the daughter groups (\( t_L \) and \( t_R \)) (Deconinck et al. 2005). This process is repeated for each daughter group until the maximal tree height is reached. Mathematically, this is expressed as follows:

$$ \varDelta i\left(s,{t}_p\right)=i\left({t}_p\right)-{p}_Li\left({t}_L\right)-{p}_Ri\left({t}_R\right) $$
(1)

where \( i(t) \) is the impurity of group \( t \), \( s \) is the candidate split value, and \( p_L \) and \( p_R \) are the fractions of the objects in the left and right daughter groups, respectively.

The impurity is defined as the total sum of squares of the deviations of the individual responses from the mean response of the group and is expressed as follows:

$$ i(t)=\sum_{n}{\left({y}_n-\overline{y}(t)\right)}^2 $$
(2)

where \( i(t) \) is the impurity of group \( t \), \( y_n \) is the value of the response variable for object \( x_n \), and \( \overline{y}(t) \) is the mean of the response variable in group \( t \).
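Equations 1 and 2 can be illustrated with a small sketch that searches for the best split of a single descriptor. The function names and data are hypothetical, and placing candidate split values midway between consecutive distinct descriptor values is an assumed convention, not something stated in this study.

```python
def impurity(values):
    """Eq. 2: sum of squared deviations from the group mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (split value, impurity reduction) maximizing Eq. 1 for the rule x < a."""
    n = len(y)
    parent = impurity(y)
    best_a, best_gain = None, 0.0
    distinct = sorted(set(x))
    # Candidate split values: midpoints between consecutive distinct descriptor values.
    for lo, hi in zip(distinct, distinct[1:]):
        a = (lo + hi) / 2.0
        left = [yi for xi, yi in zip(x, y) if xi < a]
        right = [yi for xi, yi in zip(x, y) if xi >= a]
        p_left, p_right = len(left) / n, len(right) / n
        # Eq. 1: parent impurity minus the weighted impurities of the daughter groups.
        gain = parent - p_left * impurity(left) - p_right * impurity(right)
        if gain > best_gain:
            best_a, best_gain = a, gain
    return best_a, best_gain


# Hypothetical descriptor (e.g., a height percentile) and response values.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.2, 3.0, 3.2]
split_value, reduction = best_split(x, y)
```

Here the search selects the split at 2.5, which separates the two low responses from the two high ones and removes almost all of the parent impurity. In a full CART fit, the same search is run over every descriptor and then repeated within each daughter group.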

CART methods are not required to conform to probability distribution restrictions, and there is no assumption of linearity or any need to pre-specify a probability distribution for the errors (Bell 1999).

Complexity and robustness are competing characteristics that must be considered simultaneously during construction of statistical models. The more complex a model is, the less reliable it will be for purposes of prediction. Stopping rules must therefore be applied while a decision tree is being developed, to prevent the model from becoming overly complex. Common parameters used in stopping rules include (a) the minimum number of observations in a leaf, (b) the minimum number of observations in a node prior to splitting, and (c) the depth (i.e., number of levels) of any leaf from the root node (Song and Lu 2015). On this occasion, no pruning was necessary because a maximum of 2 levels was considered for tree depth (or a maximum of 6 nodes). The risk estimate, which is a measure of the within-node variance, was used as an indicator of model performance (IBM Corp. 2015).
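The effect of the stopping rules can be seen in a minimal recursive sketch with a single descriptor. The depth limit of 2 levels mirrors the setting reported above, while the minimum leaf size of 2, the midpoint split candidates, and all names and data are assumptions made for illustration; this is not the software implementation used in the study.

```python
def impurity(values):
    """Sum of squared deviations from the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def grow(x, y, depth=0, max_depth=2, min_leaf=2):
    """Grow a regression tree on one descriptor; returns nested dictionaries."""
    node = {"n": len(y), "mean": sum(y) / len(y)}
    # Stopping rules: depth limit reached, or too few observations to split.
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return node
    parent, n = impurity(y), len(y)
    best = None
    best_gain = 0.0
    distinct = sorted(set(x))
    for lo, hi in zip(distinct, distinct[1:]):
        a = (lo + hi) / 2.0
        left = [(xi, yi) for xi, yi in zip(x, y) if xi < a]
        right = [(xi, yi) for xi, yi in zip(x, y) if xi >= a]
        # Stopping rule: every leaf must keep at least min_leaf observations.
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = (parent
                - len(left) / n * impurity([yi for _, yi in left])
                - len(right) / n * impurity([yi for _, yi in right]))
        if gain > best_gain:
            best, best_gain = (a, left, right), gain
    if best is None:
        return node
    a, left, right = best
    node["split"] = a
    node["left"] = grow([p[0] for p in left], [p[1] for p in left],
                        depth + 1, max_depth, min_leaf)
    node["right"] = grow([p[0] for p in right], [p[1] for p in right],
                         depth + 1, max_depth, min_leaf)
    return node


def count_leaves(node):
    """Number of terminal nodes in the tree."""
    if "split" not in node:
        return 1
    return count_leaves(node["left"]) + count_leaves(node["right"])


# Hypothetical data with four response levels.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 6.0, 6.0]
tree = grow(x, y)
n_leaves = count_leaves(tree)
```

With a depth limit of 2, the tree can contain at most two first-level and four second-level nodes below the root, which matches the maximum of 6 nodes mentioned above.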

3 Results

We used two well-defined and complementary parametric and non-parametric fitting procedures to analyze and estimate the best response. In this case, the structural attributes (H, D20, and D130) and biomass yield variables (FW and DW) were strongly and positively correlated (\( R^2 \) > 0.87, additional data are given in Online Resource 5).

The main results obtained with both fitting methods are shown in Tables 5, 6, and 7. All the dependent variables (H, D20, D130, FW, and DW) were analyzed and interpreted at trial level and separately for each clone (see Table 7 for data on the Olof clone and additional data are given in Online Resources 1 and 2 for data on the Bjor and Inger clones). The parameter estimates and goodness-of-fit statistics for the best parametric model (multiple linear regression) are summarized in Table 5. Finally, the best non-parametric (CART) models are included in Table 6 (trial level) and Table 7 (Olof clone). The scatter plots generated by MLR and CART (Online Resource 3) show the generally close relationship between the predicted values and the field measured values.

Table 5 Results of multiple regression showing the best models obtained for H, D20, D130, FW, and DW, for each trial and each clone
Table 6 Results of the non-parametric fitting (CART) of the models obtained for H, D20, D130, FW, and DW, at the trial level
Table 7 Results of non-parametric fitting of the models obtained for H, D20, D130, FW, and DW, for the Olof clone

3.1 Parametric methods

At trial level, the models for the structural attributes (H, D20, and D130) provided good fits, with more than 76% of the variance explained, and the LiDAR variables with greatest influence were those related to elevation percentile, total returns, and the first returns on 0.5. The highest \( R^2 \) value was obtained for the mean height variable (\( R^2_{\mathrm{fit}} \) = 89%, Table 5). When modeling the biomass variables (FW and DW), the \( R^2_{\mathrm{fit}} \) values were higher than 75% (Table 5), and in both cases, the most influential variables in the model were elevation percentile and percentage of all returns above mean.

Regarding the fit of the models for each clone, the Bjor clone model provided the best results with H and D20 (91 and 97%, respectively; Table 5). However, the Olof clone model produced the highest \( R^2_{\mathrm{fit}} \) using D130 (95%; Table 5). In the case of the biomass variables (FW and DW), the amount of variance explained was sometimes lower, with \( R^2_{\mathrm{fit}} \) values above 66%. The Bjor clone model produced the highest \( R^2_{\mathrm{fit}} \) values (90%) for both the FW and DW variables (Table 5).

The homogeneity of variance was evaluated using SPSS graphs obtained with the ZRESID and ZPRED commands (standardized residuals and standardized predicted values, respectively). Plots of residuals against predicted values showed no evidence of heterogeneous variance and no systematic pattern. The results indicated the absence of atypical or scattered sample data and, furthermore, bell-shaped histograms indicated that all datasets were normally distributed.

3.2 Non-parametric methods

The proportion of variance explained by the mean height variable-dependent model for the whole trial produced an \( R^2 \) value of 84.6%. In this case, the most important LiDAR variable was the mean elevation. It is important to highlight the influence of this variable on the other structural attributes and production variables, apart from D20, with the most significant independent variable related to the elevation of the percentiles, namely the variable Elev. P60 (Table 6).

The independent variables that best define the decision trees (CART) per clone were more varied. The fitting provided good results for all three clones studied. For the Olof clone, which yielded the highest values for structural attributes and biomass production in the field, the fit was good, with \( R^2 \) > 92% for all study variables (Table 7). However, in plots with lower values for mean height, diameter, and biomass, i.e., plots with the Bjor and Inger clones, the independent explanatory variables were those related to different percentiles of elevation and % returns (Online Resources 1 and 2). In the CART analysis for each clone, we observed that, with some exceptions, most of the models produced a second level of classification, with variables related to elevation and % returns.

Once the fitting was completed, we were able to verify that the LiDAR variables associated with mean elevation and elevation percentile (tree height) were the most important in the fitting process, as these were included in all models produced by the parametric and non-parametric fitting methods used.

4 Discussion

The parametric and non-parametric model fitting conducted in this study has revealed acceptable results that indicate the usefulness of LiDAR data for estimating structural attributes and biomass production in forest energy crops grown on degraded mining land.

The data collected in the field inventory and LiDAR data (available for free) captured in the same period were highly correlated. One of the objectives of the study was to determine differences in the performance of the different models studied (H, D20, D130, FW, and DW) at trial level (without differentiating clones) and separately for each clone.

The results indicate that free LiDAR data can be used to estimate the variables with acceptable precision (Rfit 2 > 63% in multiple regression and R 2 > 73.5% for CART analysis) and that satisfactory results were obtained despite the low density of LiDAR points. At the trial level, MLR produced higher R 2 values than CART, except for the D20 and FW fits, although the values were generally very similar. However, when the fits were carried out at the clone level (i.e., on a more uniform sample), the CART method produced larger R 2 values (except for D130 in Inger and Olof and H and D20 in Bjor). Finally, CART and MLR produced very similar RMSE values at trial level, but CART generally yielded smaller RMSE values at the clone level (except for Bjor and Olof clones with D20 and D130, respectively). The goodness-of-fit levels yielded by the models are comparable to those obtained in other studies using LiDAR to characterize forest attributes (Dalponte et al. 2011; García-Gutierrez et al. 2014; Seidel and Ammer 2014).
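The two statistics used in this comparison can be sketched in a few lines of Python. The definitions below are the standard ones (R 2 as one minus the ratio of residual to total sum of squares, RMSE with a divisor of n); the exact formulas applied in the study are not restated in this section, so treating them as identical is an assumption, and the observed and predicted values are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, using n as the divisor."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))


# Hypothetical field heights (m) and predictions from two competing models.
observed = [2.0, 3.0, 4.0, 5.0]
step_like = [2.5, 2.5, 4.5, 4.5]   # piecewise-constant, as a small tree would give
line_like = [2.1, 2.9, 4.2, 4.8]   # continuous, as a regression line would give

comparison = {
    "step_like": (r_squared(observed, step_like), rmse(observed, step_like)),
    "line_like": (r_squared(observed, line_like), rmse(observed, line_like)),
}
```

Reporting both statistics is useful because R 2 is relative to the spread of the observations, whereas RMSE is expressed in the units of the variable, so the two can rank models differently when samples differ in variability.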

The MLR and CART scatter plots (additional data are given in Online Resource 3) show that both methods provided acceptably good fits to the data. However, the models produced better predictions for the structural variables (H, D20, D130) than for the biomass variables (FW and DW), for which the data were more widely dispersed.

To verify and compare the models, for both parametric and non-parametric methods, a newly collected dataset (additional inventory) should ideally be used for validation (Hirsch 1991; Kozak and Kozak 2003; Myers 1990), because the only universally acceptable method for validating a model and assessing its goodness of fit after model selection is to use an independent sample (Lever et al. 2016). However, independent validation was not possible in this study: setting data aside for validation would have required more data than were available, and the CART method becomes more unstable when fewer data are used for fitting (Gordon 2013), especially for the models developed for each clone. Nonetheless, the models fitted in this study can be considered sufficiently robust for estimating structural attributes and productivity of SRWC. In this case, the CART analysis of the relationships between field-measured variables and LiDAR data produced reasonable results. Moreover, the errors were acceptable, considering the high degree of variability between the trial plots.
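For readers who wish to gauge predictive error when no independent sample exists, one commonly used option is leave-one-out cross-validation. This was not performed in the study and does not replace validation with an independent inventory; the Python sketch below only illustrates the idea for a one-predictor linear model, with hypothetical function names and data.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def loo_rmse(x, y):
    """RMSE of predictions made for each plot by a model fitted without it."""
    squared = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        squared.append((y[i] - (a + b * x[i])) ** 2)
    return (sum(squared) / len(squared)) ** 0.5


# Hypothetical plot-level data: LiDAR mean elevation (m) and field height (m).
elev = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
height = [1.4, 1.9, 2.3, 3.0, 3.4, 4.1]
cv_error = loo_rmse(elev, height)
```

Because every plot is used for fitting in all but one iteration, this approach loses little information on small samples, although the resulting error estimate can itself be quite variable.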

The trial-level models produced by the MLR method show that the most important variables, for trial and clone, were those related to elevation and returns, as also shown by the CART method. However, the models generated by MLR for each clone were defined by more diverse LiDAR variables, which also included (apart from those already mentioned) variables related to intensity (see Table 5).

Good fits were obtained for the structural attributes studied, mean height (H) and both diameters (D20 and D130), probably because these are closely related to the LiDAR-derived elevation variables (tree height). However, good-fit models were also obtained for FW and particularly DW biomass, as expected, because these variables are strongly correlated with structural variables (see Online Resource 5). Some variables such as mean elevation (Elev mean) and elevation percentile (Elev %) were included in most of the models. The data for the trial plots planted with the Olof clone were uniformly distributed, and the independent variable that provided the best predictions was the mean elevation (Elev mean).

The mean height variable was closely correlated with the LiDAR variables associated with the height of the trees (i.e., elevation) and with the returns, as also observed by Montealegre Gracia et al. (2015) in a study of a Pinus halepensis Mill. plantation in Spain. Another study in plantations of Pinus radiata D. Don in northern Spain (González-Ferreiro et al. 2012) indicated that mean height can be accurately modeled from low-density laser data (0.5 pulses m−2), yielding an R 2 value of 0.75 (in comparison with R 2 values of 0.70–0.93 obtained by parametric methods and R 2 values of 0.85–0.94 by non-parametric (CART) methods in the present study). In a study of an olive plantation in Spain, with low-density data (0.5 pulses m−2), good results were also obtained for height (R 2 = 0.67) (Estornell et al. 2014).

The results obtained for the models of the diameter variables (D20 and D130) were better than expected, taking into account the reports in the relevant scientific literature. Thus, the models used to estimate diameters performed similarly to those reported by Gonçalves-Seco et al. (2011) for Eucalyptus globulus Labill plantations in Spain (R 2 = 0.71) and by Dalponte et al. (2011) for a mixed stand (R 2 = 0.63). Similar findings were also reported by Graham (2008) for the D130 model and a pine plantation (R 2 = 0.82). However, in the same study, a much lower R 2 value was obtained for a natural pine stand (R 2 = 0.20). In a study conducted by Estornell et al. (2014) in a plantation of small trees (olives) in the Mediterranean area of Spain, with low-density LiDAR data (0.5 pulses m−2), the findings indicated good model performance (R 2 = 0.70) for estimating volume, which is directly related to diameter. The findings thus seem similar to those of the present study.

Several studies carried out worldwide with low-point-density data have reported similar results to those obtained in the present study. For example, a study carried out in Europe to evaluate the aboveground biomass in the boreal forest zone with an average point density of between 0.7 and 1.2 m−2 reported an R 2 value of 0.88 (Næsset and Gobakken 2008). A study carried out in Canada using low-density data (0.5 pulses m−2) obtained an excellent fit for the biomass, with an overall R 2 of 0.93 (Treitz et al. 2010). Li et al. (2008) also observed a significant relationship between aboveground biomass estimates based on field measurements and those based on LiDAR measurements for three study sites located in the USA and Canada, with different forest species. However, a study by Næsset (2011) in small areas of forest land in Norway where the main tree species were Norway spruce (Picea abies (L.) Karst.) and Scots pine (Pinus sylvestris L.) showed that these species are subject to substantial inherent canopy height variation, leading to highly variable predictions when estimating aboveground biomass in young forests (Magnussen and Boudewyn 1998).

In the present study, no comparison was made for different LiDAR point densities because of the scarcity of the LiDAR data. Previous studies did not find any evidence indicating that a reduction in point density affects the model accuracy, and Treitz et al. (2010) considered that data captured at 0.5 pulses m−2 may be an excellent source of information for forest management. The same was also concluded by González-Ferreiro et al. (2013), who showed that low-density LiDAR data (0.5 pulses m−2) can be used without significant loss of information. This suggests that data captured at 0.5 pulses m−2 yield good estimates, without excessive loss of model quality.

According to the findings reported by García et al. (2010), the intensity variables are more strongly related to biomass than to mean height, as also shown in the present study. Likewise, González-Ferreiro et al. (2013) found that the independent variables associated with the return intensity and related canopy measurements can add some valuable information for predicting biomass in eucalyptus plantations in northern Spain (R 2 = 0.75; 4 pulses m−2). LiDAR studies carried out in small areas have shown that the height percentiles are usually closely correlated with biomass (González-Ferreiro et al. 2012); in the present study, the elevation percentiles (tree height) were found to be the most important LiDAR variables to include in multiple linear regression models for estimating biomass (FW and DW) at trial level. The results of a study by Zhao et al. (2009) showed that the models can accurately predict forest biomass and that the predictive performance was consistent across a range of scales, with R 2 ranging from 0.80 to 0.95 across all fitted models. However, in a study conducted by Van Aardt et al. (2006) in a coniferous plantation, the R 2 values for volume (0.66) and aboveground biomass (0.59) were low, which was attributed to variability in the volume. Condés et al. (2013) noted that better fits can be achieved with multiple linear regression models by inclusion of a larger number of variables. This was also observed in the present study, and the R 2 value increased as the number of independent variables that make up the model increased, e.g., in the multiple linear regression, the best result was obtained for D20 and the Bjor clone, with six variables included in the model (R 2 = 0.98).
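Because R 2 cannot decrease when further predictors are added to a least-squares model, it is common to inspect the adjusted R 2 alongside it, which penalises the number of predictors relative to the number of observations. The Python sketch below applies the standard adjustment to the R 2 of 0.98 and the six predictors mentioned above; the numbers of observations are hypothetical, as the number of plots per clone is not stated in this section, and whether the study reported an adjusted statistic is not assumed here.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("n_obs must exceed n_predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R2 = 0.98 with six predictors comes from the text; the sample sizes are
# hypothetical and only show how the penalty shrinks as n grows.
penalised = {n_obs: adjusted_r2(0.98, n_obs, 6) for n_obs in (10, 20, 40)}
```

With few observations per predictor, the gap between the raw and adjusted values widens, which is one reason why high R 2 values from many-variable models fitted to small samples are usually interpreted with caution.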

Different studies in conifer plantations worldwide also indicate that height can be accurately modeled from medium- and low-density laser data. In the abovementioned study on P. radiata, González-Ferreiro et al. (2012) indicated differences in the model fit for height for different data point densities (0.5 and 4 pulses m−2), although the difference in the goodness of fit was only 8%. Thus, although better fits are obtained by using higher point densities, the extra costs involved may not be justified by the final result.

In addition to the importance of reducing the cost of the inventory, the LiDAR information obtained is very useful for forest management purposes. Several studies indicate that the use of LiDAR data generates more accurate inventories than traditional inventory methods based on field measurements (Maltamo et al. 2004; Næsset 2002). The combined use of LiDAR technology and advanced statistical techniques has led to a number of different studies exploring their potential for producing accurate results in forest biomass research (Cho et al. 2012; Gleason and Im 2012; Lefsky et al. 2001, 2005; Weishampel et al. 1996; Wulder 1998). In this respect, selection of a suitable statistical approach is essential considering that the models will be used for predictive purposes. Besides being easy to use and interpret, the non-parametric method (CART) includes classification rules and is thus a very useful tool for LiDAR-based inventories.

As mentioned earlier, the methods used in this study enable more accurate inventories to be carried out at a fraction of the cost of traditional methods (assuming the timely availability of suitable data) (Means and Acker 2000). The forest variables estimated from the LiDAR data in this study are of great interest to the timber industry and represent information that is expensive to collect in the field. The results indicate a good relationship between LiDAR data and the various forest variables considered. This is of great value in forest management, providing a tool to determine the best time for harvesting the energy crop, as well as for monitoring and managing plots (Lim et al. 2003).

The results obtained in this study, in terms of the models fitted, indicate that forest energy crops can be accurately modeled from low-density laser data and that the models are similar to those reported in the international scientific literature.

5 Conclusions

The study findings show that low-density LiDAR data (0.5 pulses m−2) can be used to construct models to estimate the main variables (structural attributes and biomass yield) of interest in the management of short-rotation (willow) energy crops. Given the moderately to highly accurate estimates obtained, the models developed on the basis of LiDAR data can be used to produce good predictions and estimates for an SRC crop, and would serve as a management tool for improving and optimizing follow-up decisions related to a commercial crop. In view of the results obtained, acquisition or purchase of low-density LiDAR information to facilitate the monitoring of the energy plantations can be considered from a commercial point of view. The parametric and non-parametric statistical methods tested in the present study (MLR and CART) provided complementary and robust information for predicting stand variables in an SRC energy crop from low-density LiDAR data (0.5 pulses m−2) and therefore can be considered suitable for developing models for accurate estimation of forest variables. The predictive power of both methods was generally high (particularly when limited data were available for fitting), although MLR models are easier to interpret and apply.

Acknowledgements

This work was funded by the Hunosa Group coal mining company. The authors acknowledge the helpful co-operation of staff from Hunosa Group in this study.

Funding

The research was supported by the Hunosa Chair at the University of Oviedo.

Author information

Corresponding author

Correspondence to María Castaño-Díaz.

Additional information

Handling Editor: Aaron R. Weiskittel

Contribution of the co-authors

- María CASTAÑO-DÍAZ = co-writing the paper and analyzing the data.

- Pedro ÁLVAREZ-ÁLVAREZ = co-writing the paper and analyzing the data.

- Brian TOBIN = co-writing the paper and supervising the work.

- Maarten NIEUWENHUIS = co-writing the paper and supervising the work.

- Elías AFIF-KHOURI = supervising the work.

- Asunción CÁMARA-OBREGÓN = supervising the work and coordinating the research project.

Electronic supplementary material

Online Resource 1 (DOCX 22.2 kb)

Online Resource 2 (DOCX 17.6 kb)

Online Resource 3 (PDF 454 kb)

Online Resource 4 (DOCX 16.1 kb)

Online Resource 5 (DOCX 12.8 kb)

About this article

Cite this article

Castaño-Díaz, M., Álvarez-Álvarez, P., Tobin, B. et al. Evaluation of the use of low-density LiDAR data to estimate structural attributes and biomass yield in a short-rotation willow coppice: an example in a field trial. Annals of Forest Science 74, 69 (2017). https://0-doi-org.brum.beds.ac.uk/10.1007/s13595-017-0665-7

  • DOI: https://0-doi-org.brum.beds.ac.uk/10.1007/s13595-017-0665-7

Keywords