COMPARATIVE STUDY OF ROAD AND URBAN OBJECT CLASSIFICATION BASED ON MOBILE LASER SCANNERS

Recently, the rapid development of new laser technologies has led to the continuous evolution of mobile laser scanning (MLS) systems, resulting in even greater capabilities for transport infrastructure. However, the market offers numerous MLS systems with varying specifications for global navigation satellite systems (GNSS), inertial measurement units (IMU), and laser scanners, which can result in different accuracies, resolutions, and densities. In this regard, this paper aims to compare two different MLS systems, integrated with different GNSS and IMU, for mapping in road and urban environments. The study evaluates the performance of these sensors using different classifiers and neighborhood sizes to determine which sensor produces better results. Random Forest was found to be the most suitable classifier, with an overall accuracy of 91.81% for Optech and 94.38% for Riegl in the road environment and 86.39% for Optech and 84.21% for Riegl in the urban environment. In terms of MLS, Riegl achieved the highest accuracy in the road environment, while Optech obtained the highest accuracy in the urban environment. This study provides valuable insights into the most effective MLS systems and approaches for accurate mapping in road and urban infrastructure.


INTRODUCTION
Rapid advancements in mobile laser scanners (MLS) have attracted considerable interest in transport infrastructure, as they allow a 3D digital representation of complex environments to be established and make it efficient to capture data with high accuracy and point density. An MLS system comprises a laser scanning sensor, a global navigation satellite system (GNSS), an inertial measurement unit (IMU), and additional components such as cameras and distance measurement instruments (DMIs). Although there is growing research into the use of MLS point clouds to classify road features, many challenges remain. MLS generates large amounts of data with high densities. The density and intensity of MLS point clouds often fluctuate in space with the distance between the laser scanner, mounted on a moving platform, and the target (Wen et al., 2019). Noise, variations in density, and occlusions therefore make it difficult to extract useful information. Consequently, applying 3D MLS point cloud data in urban environments is challenging in terms of processing and automatic classification (Xiang et al., 2018).
Supervised machine learning has shown great promise in the field of road object classification. This approach involves training a machine learning model on labeled data, allowing it to learn to recognize different road objects based on their features. The model can then be used to classify objects in new data, providing quick and accurate analysis of the road environment.
Recent studies have demonstrated the effectiveness of this approach in accurately classifying road objects, highlighting its importance for improving road safety and maintenance. For example, (Yadav et al., 2022) proposed a machine learning-based approach using Random Forest to identify pole-like objects (PLOs) in MLS data of roadway scenes. The approach was tested on two MLS datasets with simple and complex PLOs, achieving an average correctness and completeness of 97.67% and 97.79%, respectively, indicating its potential for use in roadway inventory-related studies.
Moreover, (Mohamed et al., 2021) employed machine learning algorithms for mobile LiDAR data classification. The method used a cylindrical neighbourhood selection approach to determine the contextual surroundings of each point, followed by the derivation of a set of point features that included geometric, moment, and height features. Three machine learning algorithms, namely Random Forest, Gaussian Naïve Bayes, and Quadratic Discriminant Analysis, were used for classification. Another study by the same authors (Mohamed et al., 2022) extended this approach by introducing a novel point feature which, along with other features, was used as input for a Random Forest classifier. The method achieved an accuracy of 95.23% and demonstrated the effectiveness of the new point feature. A recent study (Balado et al., 2023) tested machine learning classification of point cloud data by modifying the search criteria. The authors used the Klemperer Rosette to extract 14 features from 3D shapes and tested them with a Random Forest classifier on MLS data. The results suggest that feature extraction based on a fixed radius of 25 cm performs better than the Rosette, with a 25% better F-score and a shorter processing time.
The state of the art in machine learning for point clouds focuses on extracting meaningful features and patterns from raw point cloud data. One important aspect of this approach is determining the appropriate neighborhood size for processing points in a point cloud. Some studies (Demantké et al., 2012; M. Weinmann et al., 2015) have explored the impact of neighborhood size on the performance of machine learning algorithms for point clouds. For instance, (M. Weinmann et al., 2015) proposed an effective method to collect multiple neighborhoods of optimal size at varying scales to allow for multi-scale feature representations. Another study (Atik et al., 2021) assessed various machine learning techniques to classify data at different scales, using a spherical neighborhood approach to create areas with different radii at each point.
A review of the literature shows a lack of studies that specifically compare commonly used mobile laser scanners in different environments. Despite the increasing popularity and widespread use of these devices in applications such as mapping, surveying, and environmental monitoring, there has not been enough research that directly compares their performance and capabilities. Mobile laser scanners depend on sensor orientation, which is obtained through a combination of GNSS, IMU, and occasionally odometry information (Jende et al., 2018). Variations in these sensors can lead to differences in accuracy, resolution, and point cloud density. Thus, understanding how these different sensor characteristics affect the final accuracy of the point clouds is crucial.
Secondly, the selection of a study area is an important factor to consider when conducting a mobile laser scanning (MLS) study. It is widely acknowledged that MLS systems can perform differently depending on the environment in which they are used. Specifically, MLS systems may exhibit varying performance in urban and road environments due to differences in the surrounding structures and traffic patterns. Thus, it is important to carefully consider the study area when designing an MLS study. This gap in the literature is significant, as it limits the ability of researchers and practitioners to make informed decisions when selecting a mobile laser scanner for their specific application or project. As such, there is a need for further studies that focus on comparing the commonly used mobile laser scanners to provide a comprehensive evaluation of their strengths, weaknesses, and suitability for different types of applications.
In this regard, the main objective of this paper is to conduct a comparative analysis of the effectiveness of two commonly used MLS sensors in road and urban environments. Specifically, the study aims to evaluate the performance of these sensors based on different classifiers and varying neighborhood sizes, with the ultimate goal of determining which sensor can produce superior results for mapping in such environments. By rigorously assessing the performance of these sensors, the study aims to provide valuable insights into the most effective sensor and approach to employ for accurate mapping in road and urban infrastructure.

MATERIALS
This study involved collecting data from two different environments, namely the road and urban environments, using two mobile laser scanning (MLS) systems, Optech Lynx (Home | Teledyne Geospatial, n.d.) and Riegl VUX-1HA (RIEGL - RIEGL Laser Measurement Systems, n.d.). The study was conducted in Vigo, Spain, with the data being collected at normal speeds of 50 km/h for the road environment and 30 km/h for the urban environment. The road environment comprised guardrails, marking lines, and surrounding vegetation, while the urban environment encompassed roads and predominantly built-up areas.
The Riegl VUX-1HA and Optech Lynx are two commonly used MLS systems in the road infrastructure industry. These scanners differ in their specifications and performance, which can impact the density and accuracy of the collected data. The Riegl VUX-1HA is a lightweight scanner with high accuracy and a maximum range of up to 420 meters, capable of acquiring data at a maximum rate of 1,000,000 points per second with a range precision of up to 3 mm. In this experiment, the Riegl VUX-1HA is integrated with the Trimble Zephyr 3 Rover GNSS. The Optech Lynx, on the other hand, is a medium-range scanner with a maximum range of up to 200 meters, a range precision of up to 8 mm, and a maximum data acquisition rate of 500,000 points per second. It is integrated with the Applanix POS LV 520, which consists of an IMU with a 2-antenna heading measurement system (GAMS). The details of the sensors used are tabulated in Table 1.
Table 1. Technical specifications of sensors used.

METHODS
The methodology, shown in Figure 1, is divided into three stages. First, to understand how the size of the local neighborhood affects the classification results, eight distinct k values for the K-nearest neighbor (KNN) neighborhood are evaluated. Second, different features are estimated following (Martin Weinmann et al., 2013), including linearity, planarity, scattering, omnivariance, anisotropy, eigenentropy, and change of curvature. Local point density is also calculated for each neighborhood. Lastly, three supervised machine learning classifiers are analyzed: Support Vector Machine (SVM), Random Forest (RF), and Neural Network (NN).

Neighborhood determination
The neighborhood search method defines a predetermined scale around each point. The neighborhood size refers to the radius or distance around each point that is used to gather information about its surroundings, and different neighborhood sizes can be used depending on the specific task or application. The K-nearest neighbor (KNN) method defines the neighborhood as the k points nearest to the point of interest x according to the Euclidean distance. To understand how the size of the local neighborhood affects the classification results, eight distinct k values (25, 50, 75, 100, 125, 150, 175, and 200) are evaluated.
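As a minimal illustration of a KNN neighborhood, the following Python sketch finds the k nearest points by brute force using the Euclidean distance. The function name `knn_indices` and the toy coordinates are hypothetical; for real MLS point clouds a spatial index such as a k-d tree would normally replace the brute-force search, and k would take the values from 25 to 200 evaluated in this study.

```python
import heapq
import math


def knn_indices(points, query_idx, k):
    """Return the indices of the k points nearest to points[query_idx]."""
    q = points[query_idx]
    # Pair every other point with its Euclidean distance to the query point.
    candidates = (
        (math.dist(q, p), i) for i, p in enumerate(points) if i != query_idx
    )
    # nsmallest returns the k closest pairs, ordered by increasing distance.
    return [i for _, i in heapq.nsmallest(k, candidates)]


# Toy 3D point cloud (hypothetical coordinates, in metres).
cloud = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.2, 0.0),
    (1.0, 1.0, 0.0), (0.0, 0.0, 0.3), (2.0, 0.0, 0.0),
]

neighbors = knn_indices(cloud, query_idx=0, k=3)
print(neighbors)
```

A larger k smooths the local geometry over a wider area, while a smaller k keeps the description local but is more sensitive to noise, which is why several values are compared.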

Feature extraction
In order to distinguish the different classes in the road and urban environments, eight different features were estimated. Following (Martin Weinmann et al., 2013), these include linearity, planarity, scattering, omnivariance, anisotropy, eigenentropy, and change of curvature. Local point density was also calculated for each neighborhood.
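The sketch below shows how such features can be computed from the three eigenvalues of a neighborhood's 3D covariance matrix. It assumes the commonly used eigenvalue-based definitions (eigenvalues normalised to sum to one, with l1 >= l2 >= l3 > 0) and a density defined as k + 1 points inside a sphere whose radius is the distance to the k-th neighbor. These formulas, the function names, and the example eigenvalues are assumptions and may differ in detail from those used in the study.

```python
import math


def eigen_features(l1, l2, l3):
    """Eigenvalue-based features, assuming l1 >= l2 >= l3 > 0."""
    total = l1 + l2 + l3
    e1, e2, e3 = l1 / total, l2 / total, l3 / total  # normalise to sum to 1
    return {
        "linearity": (e1 - e2) / e1,
        "planarity": (e2 - e3) / e1,
        "scattering": e3 / e1,
        "omnivariance": (e1 * e2 * e3) ** (1.0 / 3.0),
        "anisotropy": (e1 - e3) / e1,
        "eigenentropy": -sum(e * math.log(e) for e in (e1, e2, e3)),
        "change_of_curvature": e3 / (e1 + e2 + e3),
    }


def local_point_density(k, r_k):
    """k neighbors plus the point itself, divided by the sphere volume."""
    return (k + 1) / ((4.0 / 3.0) * math.pi * r_k ** 3)


# Hypothetical eigenvalues of a fairly linear neighborhood.
features = eigen_features(0.6, 0.3, 0.1)
density = local_point_density(k=3, r_k=1.0)
```

With these definitions, linearity, planarity, and scattering sum to one, so they can be read as the degree to which a neighborhood looks like a line, a plane, or a volume.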

Classifiers
Three supervised machine learning classifiers are analyzed in this paper, namely: Support vector machine (SVM), Random Forest (RF), and Neural Network (NN).
Support vector machine:
Support vector machine (SVM) (Yang & Dong, 2013) is based on finding a hyperplane that separates the data points into different classes. The hyperplane is chosen such that it maximizes the margin, which is the distance between the hyperplane and the closest points from each class. These closest points are called support vectors.

Random Forest:
Random Forest (Breiman, 2001) is an algorithm that can learn from data and is useful for both classification and regression problems. The algorithm works by creating multiple decision trees, each trained on a different subset of the data, and then combining the predictions from all of these trees to generate a final prediction. Each tree is trained using a random selection of the available features, which helps to prevent overfitting and improve the accuracy of the final predictions.

Neural Network:
In this study, we applied a bilayered neural network, a type of neural network that consists of two fully connected layers. In this architecture, the first and second layers each have 10 neurons. The activation function is the Rectified Linear Unit (ReLU), a popular choice in neural networks. The network is trained iteratively: the weights of the neurons are adjusted through backpropagation based on the error between the predicted output and the actual output. The iteration limit for training is set to 1000.
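To make the architecture concrete, the following Python sketch runs a forward pass through two fully connected layers of 10 ReLU neurons each. The softmax output layer, the random untrained weights, the input size of 8 features, and the 5 output classes are illustrative assumptions. Training by backpropagation is not shown, and this is not the implementation used in the study.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    shift = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
n_features, n_hidden, n_classes = 8, 10, 5  # assumed input and output sizes
layers = [
    init_layer(n_features, n_hidden, rng),  # first layer, 10 neurons
    init_layer(n_hidden, n_hidden, rng),    # second layer, 10 neurons
    init_layer(n_hidden, n_classes, rng),   # assumed output layer
]


def forward(x):
    h = relu(dense(x, *layers[0]))
    h = relu(dense(h, *layers[1]))
    return softmax(dense(h, *layers[2]))


probs = forward([0.5] * n_features)
```

The predicted class of a point would be the index of the largest value in `probs`.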

EXPERIMENTAL EVALUATION
The use of supervised classification methods necessitates the presence of labeled data, which was obtained by manually labeling the point clouds in CloudCompare. The labeling was carried out by choosing relevant classes based on their usefulness and the number of points present in the road or urban environment. For the road environment, five classes were selected: road, traffic marks, guardrails, vegetation, and others (which encompassed cars, bus stops, and traffic signs). Six classes were chosen for the urban environment: road, traffic marks, roadside features (such as curbs, sidewalks, and median strips), buildings, cars, and others (which comprised waste containers, pedestrians, pole-like objects, and vegetation). Following this, 1000 samples per category were randomly selected for training, while the remaining points were reserved for testing. The experiments were conducted using five-fold cross-validation.
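The per-class sampling described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `split_per_class`, the fixed random seed, and the toy label list are hypothetical, and the example draws 3 points per class instead of the 1000 used in the study.

```python
import random
from collections import defaultdict


def split_per_class(labels, n_train, seed=0):
    """Pick n_train random indices per class for training; the rest are test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train = []
    for indices in by_class.values():
        # Guard against classes with fewer points than requested.
        train.extend(rng.sample(indices, min(n_train, len(indices))))
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test


# Toy labels for 15 points (hypothetical).
labels = ["road"] * 6 + ["vegetation"] * 4 + ["guardrails"] * 5
train_idx, test_idx = split_per_class(labels, n_train=3)
```

Drawing the same number of training points from every class keeps the training set balanced even when some classes, such as road, contain far more points than others.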
Two accuracy metrics, namely Global accuracy and IoU, were employed to assess the effectiveness of the suggested approach.
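A minimal Python sketch of the two metrics is given below. It assumes the usual definitions: global accuracy is the share of correctly labeled points, and the IoU of a class is TP / (TP + FP + FN), where TP, FP, and FN are the true positives, false positives, and false negatives of that class. The function names and the toy labels are hypothetical.

```python
def global_accuracy(y_true, y_pred):
    """Share of points whose predicted label matches the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def iou_per_class(y_true, y_pred):
    """Intersection over union for each class: TP / (TP + FP + FN)."""
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        scores[c] = tp / (tp + fp + fn)
    return scores


# Toy reference and predicted labels for six points (hypothetical).
y_true = ["road", "road", "road", "mark", "mark", "veg"]
y_pred = ["road", "road", "mark", "mark", "veg", "veg"]

accuracy = global_accuracy(y_true, y_pred)
iou = iou_per_class(y_true, y_pred)
```

In this toy case four of six points are correct, yet the IoU of the "mark" class is only 1/3, which illustrates why a per-class metric is reported alongside the global accuracy.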

Effect of neighborhood on global classification
Table 2 shows the global accuracy results in the road environment for three different classifiers (Support Vector Machine, Random Forest, and Neural Network) using two MLS sensors (Optech and Riegl) at varying neighborhood sizes (25, 50, 75, 100, 125, 150, 175, and 200). The results give the accuracy achieved by each classifier for a specific neighborhood size and MLS sensor.
When comparing the accuracy of the different models, it is important to consider the performance of each algorithm for each LiDAR sensor. Riegl outperforms Optech for all three classifiers, with Random Forest achieving the highest global accuracy for both sensors (91.81% for Optech and 94.38% for Riegl).
In terms of classifier performance, Random Forest consistently outperforms SVM and Neural Network for both sensors, achieving the highest global accuracy in most cases. However, Neural Network also shows competitive performance, which suggests that it can also capture the complex relationships within the data in the road environment.
The results for the urban environment are reported in Table 3.
In terms of MLS comparison in road environment, Riegl outperformed Optech across various classes such as road, traffic marks, roadsides, buildings, and cars, primarily due to Riegl's capability to capture higher density point clouds than Optech. Furthermore, as illustrated in Figure 4, certain sections of the road were misclassified as "others" due to sparse point clouds in those areas, resulting in lower accuracy for Optech.
As seen from Figure 3, Random Forest (RF) attained the highest scores for all classes in the urban environment. This finding highlights the effectiveness of the RF classifier in accurately predicting the classes. Optech demonstrates superior performance across all classes in the urban environment, with the exception of the "others" class, where Riegl outperforms Optech. The main factor affecting the quality of Riegl data in urban areas is occlusion caused by vehicles and buildings, which makes GNSS positioning unreliable. In contrast, Optech is integrated with high-precision orientation information, facilitating the device's prompt responsiveness to environmental changes.

CONCLUSION
This research compared different neighborhood sizes based on the KNN method for the Riegl and Optech datasets in two environments, using three machine learning classifiers: Support Vector Machine, Random Forest, and Neural Network. RF was found to be the most suitable classifier, with an overall accuracy of 91.81% for Optech and 94.38% for Riegl in the road environment and 86.39% for Optech and 84.21% for Riegl in the urban environment. RF achieved high scores for each class, including classes with a low number of samples that were considered non-existent by the other two classifiers in the road environment. Regarding MLS, Riegl demonstrated the highest level of accuracy in the road environment, whereas Optech achieved the highest accuracy in the urban environment. This suggests that the performance of MLS systems may vary depending on the type of environment they are used in and the GNSS/IMU integrated with each MLS system.
By focusing on these aspects, this paper provides valuable information about different MLS systems and helps readers make informed decisions when selecting a system for their specific needs. Careful attention to these aspects is essential when studying the impact of different sensor characteristics on the final accuracy of MLS point clouds.
In conclusion, the results demonstrate that the choice of LiDAR sensor, GNSS, IMU, and classifier is critical in achieving accurate classification results. The appropriate classifier and neighborhood must be selected based on the case study and dataset type. The findings of this study provide important insights into the selection of LiDAR sensors and classification methods for accurate mapping in road and urban infrastructure.

Figure 1. Proposed methodology of the study.

Equations 9-10 depict the computation of the global accuracy and IoU metrics, which involve information on true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). Global accuracy provides the overall accuracy of the model, whereas IoU gives an indication of the model's performance for each class.

\[ \text{Global accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{9} \]

\[ \text{IoU} = \frac{TP}{TP + FP + FN} \tag{10} \]

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-1/W1-2023. 12th International Symposium on Mobile Mapping Technology (MMT 2023), 24-26 May 2023, Padua, Italy.

4.2
Figure 2 and Figure 3 illustrate the IoU results of each class using three different classifiers and neighbourhood sizes. The predicted results for each environment are shown in Figure 4 and Figure 5, respectively.
The IoU results in the road environment (Figure 2) show that the Support Vector Machine (SVM) achieved the highest accuracy for the classification of traffic marks. This suggests that SVM is effective in identifying and distinguishing different types of traffic marks, such as road signs and lane markings, from other features in the environment. On the other hand, for the classification of vegetation, Random Forest (RF) and Neural Network (NN) demonstrated relatively similar accuracy scores, with RF slightly outperforming NN. This finding suggests that RF and NN are both viable options for the classification of vegetation. RF is a powerful ensemble learning method that can handle complex datasets with high-dimensional feature spaces, making it well-suited for classifying vegetation with multiple features. NN, on the other hand, is a powerful machine learning model that can learn complex relationships between features, making it effective for handling non-linear and highly correlated datasets.
The slight difference in accuracy between RF and NN could be due to several factors. One possible explanation is that RF is better able to handle noise in the data, which is common in vegetation classification tasks. Additionally, RF can identify important features and minimize the effects of irrelevant or redundant ones, which can improve classification accuracy. Another possible explanation is that the structure of the data may favor RF over NN. RF is known to perform well on structured datasets, while NN is more suitable for unstructured data.

Figure 2. IoU results of road environment with different classes and classifiers.

Figure 3. IoU results of urban environment with different classes and classifiers.

Figure 4. Predicted results of each classifier from Optech and Riegl in road environment.

Figure 5. Predicted results of each classifier from Optech and Riegl in urban environment.

Table 2. Global accuracy of each neighborhood with different classifiers of Riegl and Optech in road environment.

The results for the urban environment in Table 3 indicate that Random Forest outperformed the other classifiers, consistently achieving the highest global accuracy for both MLS sensors at neighborhood size 200 (86.39% for Optech and 84.21% for Riegl). Neural Network showed the second-best performance, followed by SVM, which consistently had the lowest accuracy scores. In terms of the MLS sensors, the Optech sensor generally outperformed the Riegl sensor, particularly at the larger neighborhood sizes.

Table 3. Global accuracy of each neighborhood with different classifiers of Riegl and Optech in urban environment.