AN EFFICIENT METHOD FOR AUTOMATIC ROAD EXTRACTION BASED ON MULTIPLE FEATURES FROM LiDAR DATA

Road extraction in urban areas is a difficult task because of complicated road patterns and many contextual objects. LiDAR data directly provide three-dimensional (3D) points with fewer occlusions and smaller shadows. Elevation information and surface roughness are distinguishing features for separating roads. However, LiDAR data also have disadvantages for object extraction, such as the irregular distribution of point clouds and the lack of clear road edges. To address these problems, this paper proposes an automatic road centerline extraction method with three major steps: (1) road center point detection based on multiple-feature spatial clustering, which separates road points from ground points; (2) local principal component analysis with least squares fitting, which extracts the primitives of road centerlines; and (3) hierarchical grouping, which connects the primitives into a complete road network. Compared with MTH (consisting of the Mean shift algorithm, Tensor voting, and the Hough transform), proposed in our previous article, this method greatly reduces the computational cost. To evaluate the proposed method, we selected the Vaihingen data set, a benchmark data set provided by ISPRS for the “Urban Classification and 3D Building Reconstruction” project. The experimental results show that our method achieves comparable performance in less time for road extraction from LiDAR data.


INTRODUCTION
Automatic road extraction from remotely sensed data has been an active research area during the last few decades. Many strategies and algorithms have been proposed, with varying degrees of success. Generally, road extraction is based on knowledge or models of the geometric, spatial, spectral, radiometric, and topological properties of roads. Imagery, especially high-resolution imagery, is the main source for road detection because it contains rich texture and spectral information. Das et al. (2011) extracted road networks from high-resolution imagery using a dominant singular measure, a probabilistic SVM classifier, and a constraint satisfaction neural network. Poullis (2014) proposed a novel framework for road extraction and classification from satellite images, denoted Tensor-Cuts. The framework combines the strengths of tensor encoding, feature extraction using Gabor Jets, and global optimization using Graph-Cuts. Poz et al. (2012) proposed a semiautomatic 3D road extraction method for rural areas based on the dynamic programming algorithm.
Although many studies have achieved impressive results, road extraction from high-resolution images alone in urban built-up areas is still a difficult task because of the complexity of the scene and the complicated patterns of roads. First, contextual objects, such as moving vehicles and lane markers on the road, decrease the radiometric homogeneity of the road. Second, occlusions and shadows cast on the road surface by tall objects, such as buildings and trees, increase the spectral variance of the road. Third, man-made features, such as buildings and parking lots, have reflectance properties similar to those of roads, which may increase misclassification errors.
LiDAR (Light Detection and Ranging) delivers an accurately georeferenced set of dense point clouds together with the intensity of the returned signals. Compared to aerial imagery, LiDAR has several advantages for automatic object extraction (Shan, 2008): (1) The 3D points and elevation information provided by LiDAR make it much easier to separate tall objects from roads, even when they have similar radiometry. (2) Surface roughness, which is a distinguishing feature of man-made objects, is easily obtained from the 3D coordinates. (3) LiDAR data have fewer occlusions and smaller shadows because of the narrow scanning angle and active sensing technology, so road features are more complete in LiDAR data than in imagery. (4) The intensity of LiDAR points can be used as an additional, useful feature for road extraction because road surfaces have similar reflectance. (5) Rivers, which are shaped like roads, are easy to detect because water absorbs the laser light and appears as no-data areas.
However, LiDAR data also have drawbacks for object recognition. Two of the most obvious are as follows. First, LiDAR data lack texture and spectral information. To improve the reliability of road detection, many studies have integrated LiDAR data with imagery (Hu, 2004; Zhu, 2004; Wang, 2011) or with existing geo-databases (Hatger, 2003; Boyko, 2011). However, precise co-registration of different data sources is difficult, and the cost of data acquisition is higher. Therefore, road extraction from LiDAR data alone is worth exploring.
Second, LiDAR points are distributed irregularly, with non-uniform density. The point density in the overlapping areas between flight strips is greater than in non-overlapping regions. Moreover, there are no points in areas occluded by tall objects. To solve this problem, points are often interpolated into an intensity image (Zhao, 2012; Zhu, 2009), a binary image (Clode, 2007), or other representations (Samadzadegan, 2009; Jiangui, 2011) before road lines are extracted. In these methods, the resampling interval is an important parameter that affects both processing efficiency and accuracy. In addition, a series of image processing and parameter selection methods is required to improve image quality. Therefore, in this paper, we propose a non-rasterization method for road line extraction from LiDAR data.
LiDAR points of tall objects, such as buildings and trees, are relatively easy to remove with the filtering algorithms in the literature (Masaharu, 2002; Zhang, 2004; Sithole, 2004). The remaining problem is the non-road points on the ground, such as parking lots and bare ground, which have height and intensity characteristics similar to those of road points. How can non-road points be recognized and separated from road points using multiple features? How can the method be made more robust to the irregular point distribution and to variations in road pattern and width? The MTH method proposed in our previous paper (Hu, 2014) addressed these questions. Building on that study, this paper proposes a more efficient method for road line extraction, denoted MPL (consisting of the Mean shift algorithm, Principal component analysis, and Least squares fitting). Compared with MTH, this method greatly reduces the computational cost while achieving comparable performance.

Overview
Because road points are mainly distributed on the earth's surface, the initial task is to classify the point cloud into ground and non-ground classes. The subsequent processing has three major steps: (1) spatial clustering using adaptive mean shift for road center point detection, (2) local principal component analysis and least squares line fitting for extracting the primitives of road centerlines, and (3) hierarchical primitive grouping for connecting road segments into a complete road network. Figure 1 shows the workflow of the proposed method.

Road center point detection by spatial clustering
Ground points mainly comprise roads, parking lots, bare ground, and low grass. Compared to roads, parking lots and bare ground have different shape features, and grass has different intensity and surface roughness. These shape and reflectance differences help to distinguish road points from other ground points. Mean shift is a clustering algorithm for multi-dimensional feature spaces: a window size is set for each feature, and every sample point is moved iteratively in the direction of increasing kernel density. Following this principle, we integrate a variety of features for road detection, such as intensity and surface roughness. We also found that, when the road width is close to the size of the clustering window, the road points, which are distributed in a ribbon shape, move toward the centerline of the road and form a linear distribution. A linearly distributed object is easy to detect with computer vision algorithms. Therefore, we use the mean shift algorithm to separate non-road points from ground points and then to gather road points onto the road centerlines. To deal with urban roads of various patterns and widths, we developed a spatial clustering algorithm with adaptive window size based on mean shift, described in our previous paper (Hu, 2014). This method is robust to LiDAR points with non-uniform and irregular distribution and requires neither a prior road model nor rasterization. In addition, the method can readily use multiple features, such as both intensity and surface smoothness, to detect road points.
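The clustering principle described above can be illustrated with a minimal sketch. It is not the adaptive-window algorithm of Hu (2014): it assumes a flat kernel, a fixed window size per feature, and a fixed number of iterations, and the function name `mean_shift`, the feature layout (x, y, intensity), and all numeric values are hypothetical choices for illustration only.

```python
import numpy as np

def mean_shift(features, bandwidth, n_iter=20):
    """Flat-kernel mean shift with one fixed window size per feature (assumption)."""
    bandwidth = np.asarray(bandwidth, dtype=float)
    # Scale every feature by its window size so that one unit ball is the window.
    samples = np.asarray(features, dtype=float) / bandwidth
    shifted = samples.copy()
    for _ in range(n_iter):
        # Distances from each moving point to all fixed input samples.
        diff = shifted[:, None, :] - samples[None, :, :]
        inside = np.linalg.norm(diff, axis=2) <= 1.0
        counts = inside.sum(axis=1, keepdims=True)
        # Move each point to the mean of the samples inside its window.
        means = (inside.astype(float) @ samples) / np.maximum(counts, 1)
        shifted = np.where(counts > 0, means, shifted)
    return shifted * bandwidth

# Synthetic ribbon: a straight road 8 m wide with nearly uniform intensity.
rng = np.random.default_rng(0)
ribbon = np.column_stack([
    rng.uniform(0.0, 100.0, 400),   # x along the road
    rng.uniform(-4.0, 4.0, 400),    # y across the road
    rng.normal(30.0, 1.0, 400),     # intensity
])
shifted = mean_shift(ribbon, bandwidth=(8.0, 8.0, 10.0))
print(ribbon[:, 1].std(), shifted[:, 1].std())
```

With the spatial window close to the road width, the across-road spread of the shifted points should shrink markedly, which is the ribbon-to-centerline behavior the text describes. The pairwise distance array grows quadratically with the number of points, so a spatial index would be needed for real point clouds.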
Figure 1. Workflow of the proposed road centerline extraction method.

Road centerlines extraction
After spatial clustering, ground points with different characteristics are separated. Besides road points, the detected ground points include parking lots, bare ground, and low grass. Whereas points from parking lots, bare ground, and low grass move together into clumps or remain scattered, road points are clustered into linear features. Therefore, the subsequent process is to recognize road points by their line structure and to extract the primitives of road centerlines.
To improve the efficiency of road centerline detection, this study uses Principal Component Analysis (PCA) and Least Squares Fitting (LSF) instead of the TH stage (Tensor voting and Hough transform) of the MTH method (Hu, 2014). This is a two-step strategy. The first step is point cloud gridding, followed by analysis of the point distribution structure in each grid by PCA. In the literature (Demantke, 2011), point distributions are classified into three categories: one-dimensional (1D) linear, two-dimensional (2D) planar, and three-dimensional (3D) volumetric. To indicate the type of point distribution, the three eigenvalues of the covariance matrix of the points in each grid are calculated. The classification criteria are shown in Equation 1.
In Equation 1, $a_{1D}$, $a_{2D}$, and $a_{3D}$ denote the linear, planar, and volumetric features derived from the eigenvalues. If $a_{1D} > a_{2D}$ and $a_{1D} > a_{3D}$, the points are classified as a 1D linear distribution and are mostly road points; otherwise, they are classified as 2D planar and are probably parking lots, bare ground, or other non-road points. After filtering and clustering, the elevation of the points varies little, so the 3D volumetric class does not occur in this case. The second step is the extraction of road primitives. The points in each grid that are classified as road points are fitted to a straight line by LSF.
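The two steps above can be sketched as follows. The sketch assumes the dimensionality features as defined by Demantke (2011), $a_{1D} = (\sigma_1 - \sigma_2)/\sigma_1$, $a_{2D} = (\sigma_2 - \sigma_3)/\sigma_1$, $a_{3D} = \sigma_3/\sigma_1$ with $\sigma_j = \sqrt{\lambda_j}$ and $\lambda_1 \ge \lambda_2 \ge \lambda_3$; whether the present method uses exactly this form is an assumption. The line fit is an orthogonal (total) least squares fit, chosen here because it also handles roads parallel to the y axis; the function names, the cell size, and the minimum point count are hypothetical.

```python
import numpy as np

def dimensionality_features(points):
    """Return (a1D, a2D, a3D) from the covariance eigenvalues (assumed definition)."""
    cov = np.cov(np.asarray(points, dtype=float).T)
    eigvals = np.linalg.eigvalsh(cov)[::-1]           # descending order
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))       # guard against tiny negatives
    if sigma[0] == 0.0:
        return 0.0, 0.0, 0.0
    return ((sigma[0] - sigma[1]) / sigma[0],
            (sigma[1] - sigma[2]) / sigma[0],
            sigma[2] / sigma[0])

def fit_segment(xy):
    """Orthogonal least squares line through 2D points, clipped to their extent."""
    centroid = xy.mean(axis=0)
    _, _, vt = np.linalg.svd(xy - centroid, full_matrices=False)
    direction = vt[0]                                  # dominant direction
    t = (xy - centroid) @ direction                    # positions along the line
    return np.array([centroid + t.min() * direction,
                     centroid + t.max() * direction])

def extract_primitives(points, cell_size, min_points=5):
    """Grid the points, keep cells with a 1D linear distribution, fit one segment each."""
    points = np.asarray(points, dtype=float)
    cells = np.floor(points[:, :2] / cell_size).astype(int)
    primitives = []
    for key in np.unique(cells, axis=0):
        cell_points = points[np.all(cells == key, axis=1)]
        if len(cell_points) < min_points:
            continue
        a1d, a2d, a3d = dimensionality_features(cell_points)
        if a1d > a2d and a1d > a3d:                    # 1D linear: likely road points
            primitives.append(fit_segment(cell_points[:, :2]))
    return primitives

# Synthetic example: one clustered road line and one planar patch (e.g. a parking lot).
rng = np.random.default_rng(1)
road = np.column_stack([np.linspace(0.5, 9.5, 50),
                        5.0 + 0.05 * rng.normal(size=50),
                        np.zeros(50)])
lot = np.column_stack([rng.uniform(20.0, 30.0, size=(200, 2)), np.zeros(200)])
primitives = extract_primitives(np.vstack([road, lot]), cell_size=10.0)
print(len(primitives))
```

In this example only the cell containing the line-like points should yield a primitive, while the planar patch is rejected by the eigenvalue test.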

Road centerlines grouping and network building
To link primitives into continuous road lines and then form a complete network, a hierarchical grouping method is adopted. The gaps and the collinearity between primitives are the essential factors for grouping. There are two steps: (1) Local primitive connection, performed in sequence for every pair of adjacent primitives with small collinearity thresholds. Adjacent primitives that were broken by gridding can be connected into longer road segments in this pre-grouping. Furthermore, the number of primitives is reduced, which improves the efficiency of the subsequent grouping. (2) Global primitive grouping, performed by establishing a 2D connectivity matrix for the primitives. The matrix elements, which represent the connectivity probability between primitives, are calculated iteratively until no road segments remain to be connected.
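The grouping idea can be sketched as follows. The connectivity probability used by the method is not specified here, so the score below is a hypothetical stand-in that decreases with the endpoint gap and with the angular deviation between each primitive and the merged segment; the thresholds `max_gap` and `max_angle_deg` and all function names are likewise assumptions.

```python
import numpy as np

def angle_deg(seg_a, seg_b):
    """Unsigned angle in degrees between the directions of two segments."""
    da = (seg_a[1] - seg_a[0]) / np.linalg.norm(seg_a[1] - seg_a[0])
    db = (seg_b[1] - seg_b[0]) / np.linalg.norm(seg_b[1] - seg_b[0])
    return np.degrees(np.arccos(np.clip(abs(da @ db), 0.0, 1.0)))

def merge(seg_a, seg_b):
    """Replace two segments by the segment joining their two farthest endpoints."""
    pts = np.vstack([seg_a, seg_b])
    dists = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    i, j = np.unravel_index(np.argmax(dists), dists.shape)
    return np.array([pts[i], pts[j]])

def connectivity(seg_a, seg_b, max_gap, max_angle_deg):
    """Hypothetical score in [0, 1]; 0 means the pair should not be connected."""
    gap = min(np.linalg.norm(p - q) for p in seg_a for q in seg_b)
    merged = merge(seg_a, seg_b)
    deviation = max(angle_deg(seg_a, merged), angle_deg(seg_b, merged))
    if gap > max_gap or deviation > max_angle_deg:
        return 0.0
    return (1.0 - gap / max_gap) * (1.0 - deviation / max_angle_deg)

def group_primitives(segments, max_gap=5.0, max_angle_deg=15.0):
    """Merge the best-scoring pair repeatedly until no pair can be connected."""
    segments = [np.asarray(s, dtype=float) for s in segments]
    while len(segments) > 1:
        n = len(segments)
        matrix = np.zeros((n, n))                      # 2D connectivity matrix
        for i in range(n):
            for j in range(i + 1, n):
                matrix[i, j] = connectivity(segments[i], segments[j],
                                            max_gap, max_angle_deg)
        i, j = np.unravel_index(np.argmax(matrix), matrix.shape)
        if matrix[i, j] == 0.0:
            break
        merged = merge(segments[i], segments[j])
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)
    return segments

# Three nearly collinear primitives broken by gridding, plus one distant primitive.
primitives = [
    [(0.0, 0.0), (4.0, 0.0)],
    [(5.0, 0.0), (9.0, 0.2)],
    [(10.0, 0.2), (14.0, 0.3)],
    [(20.0, 20.0), (20.0, 30.0)],
]
roads = group_primitives(primitives)
print(len(roads))
```

Recomputing the full matrix after every merge keeps the sketch short; a practical implementation would update only the rows and columns affected by the merged pair.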
Finally, the extracted road lines are smoothed, and the road network is established by connecting and intersecting these smoothed road lines, followed by the removal of short lines that are unlikely to be meaningful roads.

EXPERIMENTS
We tested the algorithm using the Vaihingen data set from the ISPRS Test Project on Urban Classification and 3D Building Reconstruction. The Vaihingen data were captured by a Leica ALS50 system, and the point density is 4 points/m². The data set covers a complex urban scene, which makes it a good sample for testing road extraction methods. The scene includes various types of roads with different widths, as well as many other man-made and natural objects near the roads or covering the road surface. The extraction results are shown in Figure 3 and Figure 4. Figure 3 shows the primitive detection results. In our method, the road centerlines are artificially broken by the gridding of the point cloud, which leads to short and disconnected road primitives.

CONCLUSION AND DISCUSSIONS
Road extraction in urban areas is a difficult task because of complicated road patterns and many contextual objects. This paper proposed an automatic method for road extraction from LiDAR data in complex scenes. A three-step approach was presented. First, based on multiple-feature spatial clustering, road center points were detected to separate road points from other ground points. Second, the primitives of road centerlines were extracted by local principal component analysis and least squares multiple-line fitting. To improve the efficiency of the proposed method, gridding was used, and road segments were then extracted in every grid. Finally, hierarchical primitive grouping was used to connect road segments into a complete road network. We applied the method to the Vaihingen data. The experimental results showed that, compared to MTH, our method achieved similar road extraction performance at a lower computational cost.
Future work will focus on the following aspects: (1) For each step, the selection of parameters and thresholds will be made adaptive. (2) The integration of LiDAR and high-resolution imagery is still worth studying. Fusing contextual information into our method might improve the quality of road extraction.

Figure 2. Road extraction results. (a) Ground points (black); (b) road center points detected by mean shift (MS).
These road primitives are less complete and less smooth than the primitives extracted by MTH. Curved roads are only roughly fitted by multi-segment straight primitives. Furthermore, there are more wrong primitives (false positives), because the grids used for line analysis and extraction break the structural integrity of objects. Compared to MTH, our method produces more false positives but consumes much less time.