AN IMPROVED SNAKE MODEL FOR REFINEMENT OF LIDAR-DERIVED BUILDING ROOF CONTOURS USING AERIAL IMAGES

Building roof contours are considered as very important geometric data, which have been widely applied in many fields, including but not limited to urban planning, land investigation, change detection and military reconnaissance. Currently, the demand on building contours at a finer scale (especially in urban areas) has been raised in a growing number of studies such as urban environment quality assessment, urban sprawl monitoring and urban air pollution modelling. LiDAR is known as an effective means of acquiring 3D roof points with high elevation accuracy. However, the precision of the building contour obtained from LiDAR data is restricted by its relatively low scanning resolution. With the use of the texture information from high-resolution imagery, the precision can be improved. In this study, an improved snake model is proposed to refine the initial building contours extracted from LiDAR. First, an improved snake model is constructed with the constraints of the deviation angle, image gradient, and area. Then, the nodes of the contour are moved in a certain range to find the best optimized result using greedy algorithm. Considering both precision and efficiency, the candidate shift positions of the contour nodes are constrained, and the searching strategy for the candidate nodes is explicitly designed. The experiments on three datasets indicate that the proposed method for building contour refinement is effective and feasible. The average quality index is improved from 91.66% to 93.34%. The statistics of the evaluation results for every single building demonstrated that 77.0% of the total number of contours is updated with higher quality index.


INTRODUCTION
Buildings are the most significant component of the urban scence.The location information of buildings is widely applied in many fields, such as urban planning, land investigation, change detection and military reconnaissance.As an effective means to acquire high-resolution elevation data, airborne light detection and ranging (LiDAR) is increasingly frequently used in building detection.Considering that this technique has an obvious advantage on height precision ， building detection based on LiDAR usually has a higher degree of automation, which makes these types of detection methods able to perform well in many applications (Dorninger and Pfeifer, 2008;Awrangjeb and Fraser, 2014;Mongus et al., 2014;Niemeyer et al., 2014).
Currently, the demand on building contours at a finer scale (especially in urban areas) has been raised in a growing number of studies such as urban environment quality assessment, urban sprawl monitoring and urban air pollution modelling.However, the quality of the building contours extracted from LiDAR are always restricted by the scan resolution of the point cloud.In spite of the design level of the airborne LiDAR equipment is continuously improved in recent years, the ground resolution of the point cloud still falls behind the aerial images when acquiring data at a similar height.This difference means that even if the building detection results are completely correct in LiDAR, there is still some room for the accuracy improvement of the contours, which makes the refinement of LiDAR-derived building contours a problem of concern.
Using the edge information from images with higher resolution is a feasible way to perform the building contour refinement.The * Corresponding author refinement based on image straight-line features is one of the traditional methods (Cheng et al., 2011;Awrangjeb et al., 2013;Li et al., 2013;Dal Poz, 2013).By updating the initial building contours with the building edge line segments extracted from high-resolution images, more accurate contours can be obtained through these methods.However, these methods are generally too dependent on the results of line feature extraction, the false features and difficult-extracted features can both pose challenges for the refinement process.
In general, the LiDAR-derived building contours should be simplified or regularized before being refined.Some assumptions including rectangularity criterion (Galvanin and Dal Poz, 2012) and principal orientation constraint (Cheng et al., 2011;Zhao et al., 2016) are frequently used for the contour simplification and regularization.However, since the increasing complexity of buildings in various applications, these assumptions may not be completely suitable for many buildings (as shown in Figure 1).Refining the building contour with an inappropriate constraint would probably lead to erroneous results.
Figure 1.Building contours that contradict against the commonly-used regularization assumptions Compared with the abovementioned building contour processing methods, the advantage of the Snake model lies in the ability of integrating the input data, initial estimation, optimal contour generation and knowledge-based constraint into a single process.It makes this model applied widely in building contour extraction and optimization.The Snake model is more often employed for building contour extraction with high-resolution images as a single data source (Peng et al., 2005;Li et al., 2012;Fazan and Dal Poz 2013).However, considering that this model largely relies on a good initial contour, refined contour with high quality may be difficult to obtain when only using images.Kabolizade et al. (2010) proposed an improved Snake model to refine the LiDAR-derived building contours with aerial images.In their approach, better results were achieved with the improved model than the traditional Snake model.But it should be noted that they only used single aerial image to perform the refinement, and the test objects are mainly buildings with rectangular corners.
In this paper, we modified the classic Snake model to make it adaptable to more complicated building contours extracted from LiDAR.Assuming that high-precision building detection results have been obtained from LiDAR point cloud, the target of this study is to further improve the quality of building contours using multi-view aerial images.Our research will partly reflect the potential of automatically generating high-quality building contour based on combined utilization of airborne LiDAR and aerial imagery.
The remainder of this paper is organized as follows.Section 2 describes the proposed method.The experimental and evaluation results are presented and discussed in Section 3. Conclusions are given in Section 5.

Traditional Snake Model
The basic principle of Snake model is to find an optimal contour that satisfies the minimization constraint of internal and external energy, where the internal energy ensures the contour to be smooth and continuous, and the external energy drives the contour toward the edge of the target object.In the actual digital image processing, the contour in Snake model is discretized into a series of control nodes (v 1 , v 2 , v 3 , ⋯ , v  ), the energy function of the contour is expressed as the sum of the internal energy   and the external energy   of each node: where   is composed of the elastic energy E  and the bending energy E  , which can be expressed as: where α and  are weight values for E  and E  , respectively.The elastic energy aims to prevent the contour from being stretched, while the bending energy gives the contour a "restitution" force when bent or twisted, driving it to be a smooth curve or a straight line.

Improved Snake Model
Since most of the buildings in reality have distinct corners, which actually contradicts the smooth and continuous constraint of the Snake model.In order to better meet the requirements of building contour refinement, some improvements are made for the traditional Snake model in our approach.

Internal Energy Term:
As shown in Figure 2(a), a number of redundant nodes often exist on the initial building contour derived from LiDAR point cloud.Line simplification methods such as Douglas-Peucker algorithm (Douglas and Peucker, 1973) are generally adopted to reduce the redundant nodes.But for complex buildings that have many detailed edge information on the roof, a perfect threshold may not exist for the simplification: a large threshold leads to loss of the original details, while a small value would have little practical effect of simplification.In order to preserve the details of the initial LiDAR-derived contour as much as possible, a relatively small threshold (0.5 m) is used to perform the contour simplification.Meanwhile, in this approach, the contour will be simplified in each iteration during the contour optimization process until the optimal result is obtained.To constrain the redundant nodes to a reasonable position, a measurement named deviation angle θ is introduced to redefine the internal energy term.Figure 2(b) shows three consecutive nodes of a contour, which are P −1 ( −1 ,  −1 ,  −1 ), P  (  ,   ,   ) and  +1 ( +1 ,  +1 ,  +1 ).The deviation angle θ  can be calculated using the two vectors formed by these three nodes.Considering that the elevation difference of the roof may easily create an obvious angle between the adjacent nodes, only the 2D coordinates are used for calculating the deviation angle: where   ⃗⃗ and  ⃗ +1 are the 2D vectors calculated.
The irregularity of the initial contour is mainly caused by the redundant nodes, the reason that these redundant nodes cannot be simplified is the existence of the abovementioned deviation angle.
When the contour becomes more regular, the accumulated value of the deviation angles will tend to decrease gradually and converge to a stable value eventually.Thus, the internal energy term of the building contour can be represented as follows: (4)

External Energy Term:
The internal energy term only provides a constraint that makes the contour become regular when being moved, while the definition of the external energy term aims at improving the contour accuracy with the use of external data such as aerial imagery.In our approach, the external energy term includes gradient energy E  and area energy E  : In general, the building contour accuracy can be improved based on the image edge information.The image gradient is the primary form of the edge information, based on which edge-point or straight-line features can be extracted.However, compared with point or line features, the continuity of gradient information makes it much easier to be applied in the Snake model.
Since the aerial images may have large overlapping area, a same building would probably appear on multiple images.Due to the different imaging angle, building edges that cannot be completely captured from a single image, may be largely complemented by the overlapped images.Figure 5 shows the original and gradient images of a building on two overlapped images.It can be seen that there is a significant complementary effect of edge information between the two images.Therefore, multi-view images will be adopted for the contour refinement in this study.In order to calculate the gradient energy, the 3D contour should be projected onto the images using collinearity equation, and then the line segments of the contour should be rasterized.In our approach, the rasterization is performed using Bresenham algorithm (Bresenham, 1977), and the image gradient is extracted with Sobel operator.Gaussian blurring is performed to reduce the image noise before gradient calculation.The gradient energy will be calculated with the following formulas: where E  () is the total gradient energy of the contour calculated from the jth image, N is the number of images used for gradient extraction, C is the horizontal projected perimeter of the 3D contour.∇  (, ) is the gradient value of the kth pixel of the rasterized 2D contour on the image,  is the number of pixels in the rasterized contour.
Owing to the existence of detailed structures such as skylights and chimneys on the roof, there are also some pixels inside the building contour having strong gradient.As a result, the contour may still move toward these kind of pixels when only using gradient as the external constraint.Using the fact that the area of the LiDAR-derived initial contour is smaller than that of the actual contour in most cases, in our approach, an approximate method is proposed to estimate the area difference between the initial contour and the actual contour.As shown in Figure 4, the red lines represent the LiDAR derived contour, the green lines represent the actual contour, ∆GSD is the difference between the average scanning resolution of LiDAR and the image resolution.
Assuming that the LiDAR points collected on the roof are symmetrically distributed, and considering that ∆GSD is a rather small value compared with the roof edge, the shadow area can be ignored.Therefore, the area difference between the initial contour and the actual contour can be estimated by multiplying ∆GSD and the perimeter of the initial contour.Base on the abovementioned assumptions, it can be considered that the area increment of the contour during the refinement process should be close to the estimated area difference.Thus, the area energy term is defined as follows: where  0 is the horizontal projected area of the LiDAR-derived 3D contour,  ̂ is the estimated area of the actual contour,  is the area of the current contour being refined.It can be noted that larger difference between the current contour area and the estimated area generates a greater area energy.
To sum up, the total energy E  of a building contour can be represented as: where α , β and  are weights assigned to internal energy, gradient energy and area energy, respectively.In our approach, these values are set to 1, 10 and 1, respectively.

Convergence Criterion:
Since the input data in this study are 3D vector contours rather than 2D rasterized contours, the process flow of building contour refinement is different from many other related studies based on Snake model.The key processes are as follows: 1) The initial building contours are extracted from the classification results of LiDAR point cloud.
2) The internal energy and area energy of the initial contour are calculated according to their definition, the gradient energy is calculated by projecting the contour onto multi-view images, and then the total energy can be calculated.
3) The nodes of the 3D contour are moved, and the shapes of the corresponding 2D contours on images are updated simultaneously.The total energy of the current contour is calculated.4) Step 3) is repeated, and by minimizing the total energy of the contour, the optimal refined contour is generated.
The convergent iteration of the Snake model can be considered as a process of seeking the best solution based on dynamic optimization.In this approach, the greedy algorithm, which is efficient and easy to implement, is chosen as the optimization method.The main principle of this method is that the contour nodes are moved one by one, and the total energy is minimized in a local area in each move.After several iterations, a local optimal solution can be obtained.In order to reduce the number of iterations and define a reasonable search range, the image resolution is adopted as the search step to move the nodes, and the search direction is determined with certain restrictions.As shown in Figure 5, the local optimal position of a contour node can be determined using the following steps: 1) One of the two segments that are connected to the node is chosen as the reference segment.
2) The local optimal position on this reference segment can be determined by first searching along the reference segment, and then searching along the direction perpendicular to the reference segment.The position that minimizes the total energy is selected as the updated position for the node.3) Step 2) is repeated by choosing the other segment as the reference segment, and the plane coordinates of the local optimal position can be determined.4) By searching along the vertical direction at the determined plane position, the optimal elevation for the node can be selected with energy minimization constraint.

EXPERIMENT
Three datasets located in vaihingen, Germany, which are provided by III/4 working group of ISPRS (Rottensteiner 2013), are used in the experiments to evaluate the performance of the proposed approach.Each dataset contains LiDAR point clouds and aerial images that are accurately registered.Figure 6 shows the experimental areas and the corresponding ground truths that are extracted by manual operation.The average density of LiDAR point cloud is 6.7 points / m 2 , the resolution of aerial images is 8 cm.A classification method based on support vector machine (Zhang et al., 2013) is adopted to automatically detect the building points from LiDAR point cloud, and the errors in the classification results are modified by manual editing using TerraScan software.The initial building contours are extracted using alpha-shapes algorithm (Edelsbrunner and Mücke, 1994).Figure 7 shows the overlying views of Area 1 between the aerial images and the building contours before and after refinement, in which (a) shows the overall results, and (b) shows the enlarged views of the rectangular areas in (a). Figure 8-9 show the refinement results of Area 2 and 3, respectively.It can be seen that the building contours are obviously simplified and regularized, and fit better with the edge of buildings after refinement.In addition, it should be noted that some of contours are not strictly composed of rectangles, the proposed method achieves good results for building contours with certain complexity and non-rectangular corners.
In order to quantify the accuracy of the building contours before and after the refinement, by counting the number of true positive, false positive and false negative pixels with the ground truths, three kinds of indexes defined in (Lee et al., 2003), namely, completeness, correctness and quality are evaluated, respectively.Table 1 lists the statistics of these indexes for the three datasets.
In overall, the quality indexes of the three datasets are both improved, which proves the effectiveness of the proposed method.The statistics also indicate that while the completeness indexes increase, the correctness indexes both decrease.This is largely because most of the refined building contours move outward from their initial positions.To find out whether the quality of a single building contour is improved or not, the evaluation for each individual building is performed in our approach.The statistics show that the quality indexes for most of the building contours are improved, the percentages of the building contours that have improved quality in three datasets are 84.0%，70.0%and 74.4%, respectively.In total, 77.0% of the building contours are improved after refinement.The processing times are 16, 10 and 17 s for three datasets, respectively.It can be considered that the efficiency of the proposed method has great application potential in the study of building contour extraction.

CONCLUSION
An improved Snake model is proposed to refine the LiDARderived building contours with the use of multi-view aerial images.In order to better meet the requirements of building contour refinement, the internal and external energy of Snake model are redefined, and the search method and the convergence criterion for the improved model are determined.Experiments on three datasets demonstrate that the quality indexes are both improved after refinement.The average quality index of the three datasets is improved from 91.66% to 93.34%.The evaluation results for every single building indicate that 77.0% of the total number of the building contours are updated with higher quality indexes.However, it should be pointed out that the interference from vegetation, shadow and the edge features inside the building contour may still result in failure of the proposed method.A subsequent study will investigate these problems in the future.
Determination of the internal energy item.(a) The redundant nodes of the initial LiDAR-derived building contour.(b) Definition of the deviation angle.

Figure 3 .
Figure 3.The complementary effect of edge information between the overlapped images

Figure 4 .
Figure 4. Area difference estimation between the LiDAR-derived contour and the actual contour

Figure 5 .
Figure 5.The search method for optimal position based on the reference segments.The black lines represent the initial contour, the red lines represent the actual contour.The red circle is the current position of the node, the green dashed lines are the search directions.The light red circles are the candidate positions when searching along the reference segment, the light green circles are the candidate positions when searching along the direction perpendicular to the reference segment.

Figure 6 .
Figure 6.The experimental datasets and their corresponding ground truths

Figure 7 .Figure 9 .
Figure 7.The results of building contour refinement of Area 1

Table 1 .
The evaluation results before and after the refinement