Mobile laser scanning (MLS), which can quickly collect a high-resolution and high-precision point cloud of the surroundings of a vehicle, is an appealing technology for three-dimensional (3D) urban scene analysis. In this regard, the classification of MLS point clouds is a common and core task. We focus on pointwise classification, in which each individual point is categorized into a specific class by applying a binary classifier to a set of local features derived from the neighborhoods of the point. To speed up the neighbor search and enhance feature distinctiveness for pointwise classification, we exploit the topological and semantic information in the raw data acquired by light detection and ranging (LiDAR) and recorded in scan order. First, a two-dimensional (2D) scan grid for data indexing is recovered, and the relative 3D coordinates with respect to the LiDAR position are calculated. Subsequently, a set of local features is extracted using an efficient neighbor search method whose computational complexity is independent of the number of points in the point cloud. These features are then merged into a set of binary classifiers for specific classes via the GentleBoost supervised learning algorithm, which combines decision trees. The experimental results on the Paris-rue-Cassette database demonstrate that the proposed approach outperforms the state-of-the-art methods by approximately 10% in the F1 score, although it uses simpler geometric features derived from a spherical neighborhood with a radius of 0.5 m.
1. Introduction

With the development of light detection and ranging (LiDAR) technology, mobile laser scanning (MLS) systems, which deploy one or multiple LiDARs on a ground-based vehicle,1 can quickly collect a high-resolution and high-precision point cloud of the surroundings of the vehicle and have gained increasing attention in three-dimensional (3D) urban scene analysis,2 including urban 3D modeling3 and automated urban driving.4 Classification of MLS point clouds, in which each point in an MLS point cloud is assigned to a specific class, e.g., ground,5 road,6 road markings,7 vehicles,8 power lines,9 and street trees,10,11 is a common and core task for various applications of 3D urban scene analysis.12 Weinmann et al.13 proposed a pointwise classification framework, whereby each individual point is classified by a binary classifier involving a set of local geometric features derived from the neighborhoods of the point. The framework does not require expert knowledge in a specific domain12 and thus can be applied to label a variety of urban object classes. However, the main challenges of pointwise classification are the low distinctiveness of local geometric features and the high computational complexity of the neighbor search. The commonly used neighbor search approaches for MLS point clouds are based on the k-d tree algorithm, in which a k-dimensional index tree is constructed and the average complexity of a nearest-neighbor query is O(log N) for a point cloud with N points. To enhance the discrimination of local low-level geometric features, multiple neighborhood scales14–16 or a selected optimal neighborhood scale13,17,18 are recovered via the k-d tree, resulting in a higher computational cost. To address these issues, Hackel et al.14 downsampled the point cloud and built a multiscale pyramid of k-d trees to improve the efficiency of neighbor searching.
An MLS point cloud is usually georeferenced by merging data from the LiDAR and other sensors, such as inertial measurement units and global positioning systems.1 However, more semantic information exists in the raw data acquired by the LiDAR, i.e., the relative positions of the measured points with respect to the LiDAR, which can be viewed as the relative positions of these points with respect to the road surface. This information can help to detect most constructed objects, since the spatial distributions of these objects along the road are generally regular. In addition, the raw data recorded in scan order are beneficial for organizing the point cloud.19 To reduce the neighbor search time and enhance the distinctiveness of local geometric features for pointwise classification, this study fully exploits the contextual and topological information in the raw data of the MLS point clouds, as shown in Fig. 1. First, a two-dimensional (2D) scan grid for data indexing is recovered, and the relative 3D coordinates with respect to the LiDAR position are calculated. Subsequently, a set of local features is extracted using an efficient neighbor search method whose computational complexity is independent of the number of points in the point cloud. These features are then merged into a set of binary classifiers for specific classes via a boosting supervised learning algorithm that combines decision trees.

2. Methods

This study considers an MLS system with a single 2D LiDAR sensor used in push-broom mode;20 i.e., the scan plane of the sensor is orthogonal to the direction of vehicle movement.

2.1. Scan Grid Construction

A scan line is defined as a 2D profile acquired by a single rotation of the 2D LiDAR mirror.19,21 Thus, an MLS point cloud can be organized by constructing a scan grid in which each row represents a scan line, as shown in Fig. 2(a). In the grid, a point measured by the j'th beam on the i'th scan line can be indexed by (i, j).
The scan grid provides a compact representation of the point cloud with a size of N_l × N_b, where N_l is the number of scan lines and N_b is the number of laser beams per scan line. To construct a scan grid from the MLS data, the scan angles of the point cloud should be recorded. Some LiDARs do not record a point if the laser beam does not return. In this case, we can segment the point cloud into scan lines to obtain the scan line index i for point p by finding a significant jump between the scan angles of adjacent points. Then, the beam index j for point p can be calculated as

j = round((θ_p − θ_0) / Δθ),

where θ_p is the scan angle of point p, θ_0 is the start scan angle of a scan line, and Δθ is the scan resolution, i.e., the nominal angular increment between adjacent beams on a scan line. If the scan resolution is not provided, it can be estimated using the mean or median of the differences between the scan angles of adjacent points on the same scan line.

2.2. Relative Coordinate Calculation

To obtain more contextual information from the LiDAR data, a relative 3D coordinate system (x, y, z), where x represents the moving distance from the origin along the trajectory of the LiDAR, y represents the horizontal displacement with respect to the LiDAR position, and z represents the vertical displacement with respect to the LiDAR position, is constructed for the MLS point cloud. The relative coordinates of point p on the i'th scan line are calculated as

x_p = Σ_{k=1}^{i} v_k Δt,  y_p = r_p sin θ_p,  z_p = r_p cos θ_p,

where v_k is the vehicle speed at the time when the k'th scan line is acquired by the LiDAR, Δt is the time interval between adjacent scan lines, r_p is the radial distance of point p, and θ_p is the scan angle of point p. Since the attitude of the LiDAR (vehicle) over a short period of time can be regarded as constant, the relative coordinates can describe the local spatial distribution of points as well as the georeferenced coordinates can, as shown in Figs. 2(b) and 2(c).

2.3. Neighbor Search

The neighborhood of each point should be recovered for local feature extraction.
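As a concrete sketch of the indexing and relative-coordinate computations of Secs. 2.1 and 2.2: the function and array names below are ours, angles are assumed to be in degrees, and the zero scan angle is assumed to point upward (so the vertical displacement uses the cosine), a convention the dataset description supports but the equations do not strictly fix.

```python
import numpy as np

def beam_index(theta, theta0, dtheta):
    # j = round((theta_p - theta_0) / dtheta); all angles in degrees
    return int(round((theta - theta0) / dtheta))

def relative_coords(i, theta_deg, r, speeds, dt):
    # x: distance travelled along the trajectory up to scan line i,
    # accumulated from the per-line vehicle speeds and line interval dt
    x = dt * float(np.sum(speeds[: i + 1]))
    t = np.radians(theta_deg)
    y = r * np.sin(t)   # horizontal displacement w.r.t. the LiDAR
    z = r * np.cos(t)   # vertical displacement (0 deg assumed upward)
    return x, y, z
```

For instance, with a scan resolution of 0.5 deg, a point at scan angle 1.0 deg relative to a 0 deg start falls on beam 2.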
A spherical neighborhood S_r(p) of point p is defined as the set of points within a sphere centered at p with a radius of r. We propose a fast procedure for searching S_r(p) with a computational complexity independent of the number of points in the point cloud. Consider the measurement resolutions δ_b and δ_l of the scan grid at point p. The resolution δ_b, the minimum distance between the point and its adjacent points on the i'th scan line, can be estimated by the distance between the point and the (j ± 1)'th beams on the i'th scan line, as shown in Fig. 3. The resolution δ_l, the distance between the point and its adjacent points on the neighboring scan lines, can be estimated by the distance between the i'th scan line and the (i ± 1)'th scan lines along the trajectory of the LiDAR. Thus, the measurement resolutions δ_b and δ_l are calculated as

δ_b = min(‖p_{i,j} − p_{i,j−1}‖, ‖p_{i,j} − p_{i,j+1}‖),  δ_l = v_i Δt.

Then, a candidate neighborhood C_r(p) can be derived from the scan grid:

C_r(p) = {p_{i',j'} : |i' − i| ≤ ⌈r/δ_l⌉, |j' − j| ≤ ⌈r/δ_b⌉}.

The spherical neighborhood is finally determined as

S_r(p) = {q ∈ C_r(p) : ‖q − p‖ ≤ r}.

The computational complexity of the proposed neighbor search method can be measured by the number of points in C_r(p):

|C_r(p)| = (2⌈r/δ_l⌉ + 1)(2⌈r/δ_b⌉ + 1),

which depends on the measurement resolutions of the scan grid at point p and is independent of the number of points in the whole point cloud. Note that, owing to the variation of point density, the computational complexity of the proposed method becomes high for points with a small range r_p.

2.4. Feature Extraction

To demonstrate the effectiveness of the relative 3D coordinates, a set of intuitive geometric features is extracted from the spherical neighborhood with a radius of r, as shown in Table 1. The density feature is corrected with the measurement resolutions to eliminate the influences of varying vehicle speeds and measurement ranges:

D = n δ_b δ_l,

where n is the number of points in the spherical neighborhood.

Table 1. Local features extracted from the spherical neighborhood.
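The grid-window neighbor search described above can be sketched as follows; this is a minimal illustration, assuming the point cloud is stored as an (N_l, N_b, 3) array of relative coordinates with NaN rows where the laser beam did not return (the storage layout and function names are our assumptions).

```python
import numpy as np

def sphere_neighbors(grid, i, j, r, d_line, d_beam):
    """grid: (N_l, N_b, 3) coordinates, NaN where no return.
    d_line, d_beam: measurement resolutions at point (i, j)."""
    wi = int(np.ceil(r / d_line))   # half-window across scan lines
    wj = int(np.ceil(r / d_beam))   # half-window along a scan line
    # candidate neighborhood: a fixed window of the scan grid
    cand = grid[max(i - wi, 0): i + wi + 1,
                max(j - wj, 0): j + wj + 1].reshape(-1, 3)
    cand = cand[~np.isnan(cand).any(axis=1)]
    # keep only candidates inside the sphere of radius r
    d = np.linalg.norm(cand - grid[i, j], axis=1)
    return cand[d <= r]
```

The window size, and hence the cost per query, depends only on r and the local resolutions, never on the total point count.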
In addition, radiometric and penetrating features are derived from the intensities and numbers of echoes measured by the LiDAR, providing further distinctive properties not covered by the geometric features.

2.5. Classifier Learning

The local features should be merged into a series of binary classifiers for a variety of specific classes by using supervised learning algorithms. The decision tree adopts a divide-and-conquer strategy by partitioning the feature space into subregions with a high class purity on the training set.22 A top-down tree structure is constructed, and each node chooses the best feature from the feature set to split the training data. Since a single decision tree has a very low bias but an extremely high variance, we use the boosting framework to ensemble multiple decision trees to improve the classification accuracy and generalization ability. We chose the GentleBoost algorithm, in which the total exponential loss on the training set is minimized using a functional Newton-like numerical optimization method.23 The ensemble classifier is

F(x) = Σ_{m=1}^{M} f_m(x),

where f_m is a decision tree generated as an incremental function in the m'th iteration and M is the number of iterations.

3. Results

3.1. Dataset

To compare our method with prior state-of-the-art methods, we use a publicly available and labeled database, namely Paris-rue-Cassette.20 It contains 12 million points recorded on a street section in Paris. The 2D LiDAR sweeps up to 180 deg at a fixed time interval between scan lines, and the start scan angle corresponds to the upward direction. All coordinates are georeferenced (E, N, U) in the Lambert 93 and altitude IGN1969 (grid RAF09) reference system. For the points with a laser beam return, the range r, scan angle θ, intensity, number of echoes, and georeferenced LiDAR position are recorded in scan order in addition to the georeferenced coordinates. The object classes for the experiments include façade, ground, cars, two-wheelers, road inventory, pedestrians, and vegetation.
See Table 2 for details.

Table 2. Object classes in the Paris-rue-Cassette database.
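The GentleBoost learner of Sec. 2.5 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the weak learner is a scikit-learn regression tree fitted by weighted least squares, the weight update follows Gentle AdaBoost (Friedman et al., Ref. 23), and the parameter names and depths are our assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gentleboost_fit(X, y, M=50, depth=2):
    """GentleBoost: each round fits a regression tree to labels y in
    {-1, +1} by weighted least squares, adds it to the ensemble, and
    re-weights the samples."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    trees = []
    for _ in range(M):
        t = DecisionTreeRegressor(max_depth=depth)
        t.fit(X, y, sample_weight=w)
        f = t.predict(X)
        w *= np.exp(-y * f)   # Gentle AdaBoost weight update
        w /= w.sum()
        trees.append(t)
    return trees

def gentleboost_predict(trees, X):
    # F(x) = sum_m f_m(x); the class decision is the sign of F
    F = np.sum([t.predict(X) for t in trees], axis=0)
    return np.sign(F)
```

One binary ensemble of this form is trained per object class, with points of that class labeled +1 and all others −1.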
3.2. Recovery of the Scan Grid

The point cloud was first divided into scan lines at the positions where the sign of the scan angle changed from positive to negative. Then, the scan resolution was estimated by analyzing the distribution of the differences between the scan angles of adjacent points on the same scan line. Finally, a scan grid was constructed, as shown in Fig. 4. Given the real-time speeds of the LiDAR estimated from the georeferenced LiDAR positions, the relative 3D coordinates were computed using Eq. (2).

3.3. Neighbor Search Efficiency

To compare the computational complexities of the k-d tree algorithm and the proposed neighbor search method in searching the spherical neighborhood, we ran the two algorithms on a laptop PC with an AMD Ryzen 5 4600H CPU (hexa-core, 3.0 GHz) and 16 GB of RAM. The k-d tree algorithm was performed over the raw Paris-rue-Cassette dataset, while the proposed neighbor search method was performed over the scan grid of the Paris-rue-Cassette dataset. The Point Cloud Library was used for the k-d tree neighbor search. The search radius was set to 0.2, 0.5, and 0.8 m. As shown in Table 3, our method is much faster.

Table 3. Computational complexity of the neighbor search.
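The scan-line segmentation rule of Sec. 3.2 (a new line starts where the scan angle jumps from positive back to negative) can be sketched as follows; the sign convention and function name are our assumptions for illustration.

```python
import numpy as np

def split_scan_lines(theta):
    """Assign a scan line index to each point: a new line starts where
    the scan angle changes sign from positive to negative."""
    theta = np.asarray(theta, dtype=float)
    starts = np.zeros(len(theta), dtype=bool)
    # True at positions where a new scan line begins
    starts[1:] = (theta[:-1] > 0) & (theta[1:] < 0)
    return np.cumsum(starts)   # scan line index i for every point
```

Running this over the recorded scan angles yields the row index of the scan grid for every point in a single pass.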
Figure 5 shows the frequency distribution of the search rate on the Paris-rue-Cassette dataset, with an average search rate of 49.49%, demonstrating the efficiency of the proposed neighbor search method.

3.4. Effectiveness of the Relative 3D Coordinates

To demonstrate the effectiveness of the relative 3D coordinates, we compare the distinctiveness of the geometric features derived from the georeferenced and relative 3D coordinates. The geometric features in Table 1 are divided into three kinds for comparison: (a) single coordinates of the central point; (b) basic features derived from the neighborhood using a single coordinate, i.e., the means, standard deviations, and ranges within the neighborhood; and (c) shape features derived from the structure tensor of the neighborhood using all three coordinates, i.e., the linearity, planarity, scattering, and omnivariance. The Bayes error22 for each feature is numerically estimated as

ε = (1/2) Σ_{k=1}^{K} min(h_k^obj, h_k^non),

where h_k^obj and h_k^non are the k'th bins of the probability histograms with K bins for a single feature, corresponding to an object class and a nonobject class, respectively. All features are mapped to the interval [0, 1]. The search radius is set to 0.5 m. Figure 6 shows the average Bayes error for each kind of geometric feature on the seven classes. Figures 6(a) and 6(b) show that single relative coordinates are more distinctive than single georeferenced coordinates, implying that the relative coordinates introduce more semantic information. Figure 6(c) shows that shape features using the three relative coordinates have a distinctiveness similar to those using the three georeferenced coordinates, implying that the relative 3D coordinates maintain the local spatial distribution of points as well as the georeferenced 3D coordinates do.

3.5. Pointwise Classification Accuracy

The best results of pointwise classification we are aware of are those of Refs. 14, 24, and 25. In Ref.
25, a hierarchical framework composed of ground filtering, structural segmentation, and contextual classification was proposed. In Refs. 14 and 24, geometric features are derived at multiple scales or at an optimal scale and are then combined into pointwise object classifiers. To improve the classification accuracy, statistical features derived from the 2D projection of the point cloud, the 3D shape context, and the signature of histograms of orientations features are also utilized in Refs. 14 and 24 in addition to the fundamental geometric features. The experiments use the same training and test sets as in Refs. 14 and 24; i.e., 1000 points per class are randomly selected as training samples, and the remaining data are used as test samples. Table 4 shows the classification results. Results of a variety of pointwise classification approaches are provided in Ref. 22, and the best result for each class is used for comparison. Compared with the prior state-of-the-art methods, our approach achieves an approximately 10% improvement in terms of the F1 score. The improvement increases to 15% when the radiometric and penetrating features mentioned in Sec. 2.4 are added to the classifiers.

Table 4. Pointwise classification results.
Note: F, façade; G, ground; C, cars; 2W, two-wheelers; RI, road inventory; P, pedestrians; V, vegetation. Bold values indicate our experimental results, which are better than the state of the art (the second, third, and fourth columns).

The F1 score depends on the decision threshold: when the threshold changes, the score changes. The area under the receiver operating characteristic (ROC) curve (AUC) does not depend on the decision threshold and is therefore better suited than the F1 score for evaluating the performance of a classifier. Figure 7 shows the ROC curves for the proposed approach with all features. The AUC summarizes the relationship between the true- and false-positive rates of a binary classifier over all decision thresholds; hence, we also use the AUC to evaluate our approach in Table 4.

4. Discussion and Conclusion

This study aims to speed up the neighbor search and enhance feature distinctiveness for pointwise classification by exploiting the topological and contextual information in the raw data. Considering an MLS system with a single 2D LiDAR sensor, the cores of our approach are: (i) to construct a scan grid according to the scan pattern to organize an MLS point cloud; (ii) to compute the relative 3D coordinates with respect to the LiDAR position; and (iii) to recover the neighborhood by a fast search method. The computational complexity of the proposed neighbor search strategy is independent of the number of points in the point cloud, with an average search rate of 49.49%. In terms of the Bayes error, geometric features using the relative coordinates are more distinctive than the same features using the georeferenced coordinates. Compared with the state-of-the-art methods, the proposed pointwise classification achieves an approximately 10% improvement in terms of the F1 score, although it uses simpler geometric features derived for a search radius of 0.5 m.
Furthermore, our approach is straightforward to parallelize and would be even faster with parallel programming. The proposed approach has several limitations: (i) the proposed neighbor search approach only works on an MLS system with a single 2D LiDAR sensor used in push-broom mode; (ii) to construct or recover a scan grid from the MLS data, the scan angles of the point cloud should be recorded, and the MLS data need to be recorded in scan order or with time stamps; and (iii) because the proposed approach uses local features derived from several adjacent scan lines, it will not work well if the vehicle makes multiple drive-runs in different directions (e.g., driving forward and backward along the same road). In future work, the proposed approach will be extended to MLS systems with a single 3D LiDAR or multiple 2D/3D LiDARs by exploring the measurement geometry of 3D LiDARs as well as the spatial relationship of multiple LiDARs. Furthermore, postprocessing, such as soft labeling,24 will be considered to improve the classification accuracy.

Acknowledgments

This research was supported by the National Natural Science Foundation of China under Grant No. 31901239.

References

1. I. Puente et al., “Land-based mobile laser scanning systems: a review,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XXXVIII-5/W12, 163–168 (2011). https://doi.org/10.5194/isprsarchives-XXXVIII-5-W12-163-2011
2. Y. Wang et al., “A survey of mobile laser scanning applications and key techniques over urban areas,” Remote Sens., 11(13), 1540 (2019). https://doi.org/10.3390/rs11131540
3. C. Wang et al., “Urban 3D modeling with mobile laser scanning: a review,” Virtual Real. Intell. Hardware, 2, 175–212 (2020). https://doi.org/10.1016/j.vrih.2020.05.003
4. R. W. Wolcott and R. M. Eustice, “Visual localization within LIDAR maps for automated urban driving,” in IEEE Int. Conf. Intell. Robots and Syst. (2014). https://doi.org/10.1109/IROS.2014.6942558
5. H. Zhao et al., “Ground surface recognition at voxel scale from mobile laser scanning data in urban environment,” IEEE Geosci. Remote Sens. Lett., 17(2), 317–321 (2020). https://doi.org/10.1109/LGRS.2019.2919297
6. C. Ye et al., “Robust lane extraction from MLS point clouds towards HD maps especially in curve road,” IEEE Trans. Intell. Transp. Syst., 1–14 (2020). https://doi.org/10.1109/TITS.2020.3028033
7. Y. Li et al., “Localization and extraction of road poles in urban areas from mobile laser scanning data,” Remote Sens., 11(4), 401 (2019). https://doi.org/10.3390/rs11040401
8. J. Zhang et al., “Vehicle tracking and speed estimation from roadside Lidar,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 13, 5597–5608 (2020). https://doi.org/10.1109/JSTARS.2020.3024921
9. S. Xu and R. Wang, “Power line extraction from mobile LiDAR point clouds,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., 12, 734–743 (2019). https://doi.org/10.1109/JSTARS.2019.2893967
10. Y. Q. Li et al., “Street tree information extraction and dynamics analysis from mobile lidar point cloud,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLIII-B2-2020, 271–277 (2020). https://doi.org/10.5194/isprs-archives-XLIII-B2-2020-271-2020
11. S. Xu et al., “Automatic extraction of street trees’ nonphotosynthetic components from MLS data,” Int. J. Appl. Earth Obs. Geoinf., 69, 64–77 (2018). https://doi.org/10.1016/j.jag.2018.02.016
12. E. Che, J. Jung and M. J. Olsen, “Object recognition, segmentation, and classification of mobile laser scanning point clouds: a state of the art review,” Sensors, 19(4), 810 (2019). https://doi.org/10.3390/s19040810
13. M. Weinmann et al., “Semantic point cloud interpretation based on optimal neighborhoods, relevant features and efficient classifiers,” ISPRS J. Photogramm. Remote Sens., 105, 286–304 (2015). https://doi.org/10.1016/j.isprsjprs.2015.01.016
14. T. Hackel, J. D. Wegner and K. Schindler, “Fast semantic segmentation of 3D point clouds with strongly varying density,” ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., III-3, 177–184 (2016). https://doi.org/10.5194/isprs-annals-III-3-177-2016
15. M. Weinmann et al., “Distinctive 2D and 3D features for automated large-scale scene analysis in urban areas,” Comput. Graphics, 49, 47–57 (2015). https://doi.org/10.1016/j.cag.2015.01.006
16. T. Hackel, J. D. Wegner and K. Schindler, “Joint classification and contour extraction of large 3D point clouds,” ISPRS J. Photogramm. Remote Sens., 130, 231–245 (2017). https://doi.org/10.1016/j.isprsjprs.2017.05.012
17. M. Weinmann et al., “Geometric features and their relevance for 3D point cloud classification,” ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., IV-1/W1, 157–164 (2017). https://doi.org/10.5194/isprs-annals-IV-1-W1-157-2017
18. J. Demantké et al., “Dimensionality based scale selection in 3D lidar point clouds,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XXXVIII-5/W12, 97–102 (2011). https://doi.org/10.5194/isprsarchives-XXXVIII-5-W12-97-2011
19. E. Che and M. J. Olsen, “An efficient framework for mobile lidar trajectory reconstruction and Mo-norvana segmentation,” Remote Sens., 11(7), 836 (2019). https://doi.org/10.3390/rs11070836
20. B. Vallet et al., “TerraMobilita/iQmulus urban point cloud analysis benchmark,” Comput. Graphics, 49, 126–133 (2015). https://doi.org/10.1016/j.cag.2015.03.004
21. Y. Zhou et al., “A fast and accurate segmentation method for ordered LiDAR point cloud of large-scale scenes,” IEEE Geosci. Remote Sens. Lett., 11(11), 1981–1985 (2014). https://doi.org/10.1109/LGRS.2014.2316009
22. R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons, New York (1998).
23. J. Friedman, T. Hastie and R. Tibshirani, “Additive logistic regression: a statistical view of boosting,” Ann. Stat., 28(2), 337–407 (2000). https://doi.org/10.1214/aos/1016218223
24. L. Landrieu et al., “A structured regularization framework for spatially smoothing semantic labelings of 3D point clouds,” ISPRS J. Photogramm. Remote Sens., 132, 102–118 (2017). https://doi.org/10.1016/j.isprsjprs.2017.08.010
25. Y. Li, B. Wu and X. Ge, “Structural segmentation and classification of mobile laser scanning point clouds with large variations in point density,” ISPRS J. Photogramm. Remote Sens., 153, 151–165 (2019). https://doi.org/10.1016/j.isprsjprs.2019.05.007