SEMI-AUTOMATED DETECTION OF SURFACE DEGRADATION ON BRIDGES BASED ON A LEVEL SET METHOD

Due to the effects of climate factors, natural phenomena and human usage, buildings and infrastructures are subject to progressive degradation. The deterioration of these structures has to be monitored in order to avoid hazards for human beings and for the natural environment in their neighborhood. Hence, on the one hand, monitoring such infrastructures is of primary importance. On the other hand, this monitoring is nowadays mostly carried out by expert and skilled personnel, who follow the whole data acquisition, analysis and result reporting process, making the monitoring procedure quite expensive for public (and private) agencies. This paper proposes a partially user-assisted procedure in order to reduce the monitoring cost and to make the obtained results less subjective. The developed method relies on images acquired with standard cameras, even by inexperienced personnel. The deterioration on the infrastructure surface is detected by image segmentation based on a level set method. The results of the semi-automated analysis procedure are remapped on a 3D model of the infrastructure obtained by means of a terrestrial laser scanning acquisition. The proposed method has been successfully tested on a portion of a road bridge in Perarolo di Cadore (BL), Italy.


INTRODUCTION
Surfaces of human infrastructures are subject to degradation, mostly due to the effect of (natural) climate elements (sun, rain, wind), to human dependent factors (e.g. increase in the volume of traffic) and to aging.
Nowadays, the state of these infrastructures is mostly monitored by visual inspection performed by expert staff: these highly skilled personnel are involved in periodic surveys, recording and reporting the updated infrastructure conditions. It is worth noting that this procedure typically provides subjective results (conditions are typically reported subjectively on questionnaires) at a quite high cost, related to the relatively frequent use of expert staff. This paper proposes a different approach in order to reduce the cost of infrastructure monitoring.
The goal of surface infrastructure monitoring is that of periodically updating the detected conditions on a digital representation of the infrastructure, in order to ease the control of the degradation temporal evolution by specific human operators (Costantino and Angelini, 2013a, Costantino and Angelini, 2015, Camarda et al., 2010, Guarnieri et al., 2013).
In this work, the digital 3D representation of the infrastructure is obtained by means of terrestrial laser scanning (TLS). Notice that the use of TLS devices is usually reserved to skilled personnel, hence this is usually a quite expensive operation. For this reason, the use of TLS in this work is limited to a one-off acquisition, typically done at the beginning of the monitoring. Nevertheless, further TLS acquisitions can be carried out later, when required by the occurrence of specific conditions.
Instead, the degradation detections and condition updates are obtained by the analysis of photos acquired by standard digital cameras (not necessarily professional, e.g. cameras embedded in smartphones). Then, the degradation results are mapped onto their corresponding positions on the TLS-based 3D model of the infrastructure.
The main advantages of this integrated approach are as follows:
• The use of digital cameras is commonly considered a much simpler operation than TLS acquisition, hence it can be performed by personnel who are not highly skilled. This allows a significant reduction of the monitoring cost, and/or more frequent degradation observations.
• Although some work has been done to exploit the luminance of laser scanner radiation in order to classify materials (Costantino and Angelini, 2013b), the analysis of the surface degradation status is usually simpler on digital images than on laser scanning data.
• The reprojection of the detected deterioration results on the 3D model of the infrastructure helps the operator to assess the overall conditions of the infrastructure (e.g. the areas where deterioration is more marked) and to follow the temporal evolution of the degradation.
A quite common choice is to exploit level set methods as an efficient image segmentation tool (Sethian, 1999, Sethian, 2001, Osher and Fedkiw, 2003): in accordance with previous works (Cerimele and Cossu, 2007, Cerimele and Cossu, 2009), this paper considers the use of level set methods applied to the L*, a*, b* color space. As shown in Section 2, the use of L*, a*, b* instead of the RGB color space makes it possible to reduce the correlation between different color channels.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-3/W3, 2015. ISPRS Geospatial Week 2015, 28 Sep - 03 Oct 2015, La Grande Motte, France. This contribution has been peer-reviewed. Editors: S. Oude Elberink, A. Velizhev, R. Lindenbergh, S. Kaasalainen, and F. Pirotti. doi:10.5194/isprsarchives-XL-3-W3-15-2015
Details on the whole segmentation procedure are provided in the next section, while the experimental results on a portion of a road bridge in Perarolo di Cadore (BL), Italy, are shown in Section 3. Finally, some conclusions are drawn in Section 4.

ASSESSMENT OF DEGRADED SURFACE AREAS
The data analysis procedure works as follows: once the images are acquired, they are separately processed by an ad hoc algorithm. Notice that if the number of images is sufficiently large and their overlap is adequate, then a 3D photogrammetric reconstruction would be possible as well. Although this case would be more informative, it typically requires a higher level of accuracy during the image acquisition process. Therefore, in order to keep the image capture task as simple as possible, in the proposed method the data analysis is performed on each single image.
The image analysis procedure is user-assisted in its first step (step (a) of the following procedure), whereas the following ones are performed automatically:
(a) The user pinpoints the matches between a set of points in the image and in the 3D infrastructure representation: these correspondences are used for determining the infrastructure surfaces to be controlled and to assess the local map between the image points and their 3D coordinates.
(b) Automatic modal decomposition of the image values (according to a convenient L*, a*, b* representation) on the region of interest.
(c) Region segmentation (i.e. determination of deteriorated areas) based on a level set method.
(d) Remapping of the segmentation curves on the 3D representation.
The above steps will be detailed in the following.

Identification of the regions of interest and of the camera pose
The user manually selects a set of matching points on the image and on the 3D model. Let $\{M_i\}_{i=1,\dots,n}$ be the set of (non-coplanar) 3D points and $\{m_i\}_{i=1,\dots,n}$ the corresponding 2D coordinates on the image plane. The distortion caused by the lens of the acquisition camera is assumed to be negligible or already estimated and properly corrected. Hence, without loss of generality, hereafter the coordinates $\{m_i\}_{i=1,\dots,n}$ on the image plane are assumed to be distortion free.
A subset of the $\{m_i\}$ points is used by the user to outline closed polygonal curves delimiting the areas to be analyzed. Each of these areas is assumed to be well approximated by a planar surface. The $j$-th area is delimited by the points $\{m_{ij}\}_{i=1,\dots,n_j}$, with $n_j \ge 3$, for each $j$.
Camera position and orientation with respect to the reference system used to express the $\{M_i\}$ points can be computed in closed form as described in the following (Ma et al., 2003).
According to the pinhole camera model, undistorted camera measurements can be modeled as follows:
$$ m_i = \frac{P_{12} \begin{bmatrix} M_i \\ 1 \end{bmatrix}}{P_3 \begin{bmatrix} M_i \\ 1 \end{bmatrix}} \tag{1} $$
where $P_{12}$ and $P_3$ are the first two rows and the third row, respectively, of the camera projection matrix $P$. The projection matrix $P$ can be expressed in terms of the matrix of inner parameters $K$ and of the camera position $t$ and orientation matrix $R$ with respect to the global reference system:
$$ P = K \, [R \mid -Rt] \tag{2} $$
By simple matrix manipulations of (1), it immediately follows that the value of the matrix $P$ can be estimated (in closed form) by solving a simple linear system:
$$ \begin{bmatrix} [M_i^\top \;\; 1] & 0_{1\times 4} & -u_i \, [M_i^\top \;\; 1] \\ 0_{1\times 4} & [M_i^\top \;\; 1] & -v_i \, [M_i^\top \;\; 1] \end{bmatrix} p = 0 \tag{3} $$
for $i = 1, \dots, n$, where $p$ is a unit vector containing the normalized values of $P$ (stacked row by row), and $m_i = [u_i \;\; v_i]^\top$. A scaled version of $P$ can be obtained by simply rearranging the terms in $p$.
Once $P$ has been computed (up to a scale factor), the matrices $K$ and $[R \mid -Rt]$ can be computed as the results of the QR factorization (Golub and Loan, 1989) of the first three columns of $P$.
The scale factor of $P$ can be obtained by imposing that the term on the last column and last row of $K$ is equal to one. Finally, $t$ can be computed by pre-multiplying the fourth column of $[R \mid -Rt]$, here denoted by $r_4$, by $-R^\top$:
$$ t = -R^\top r_4 \tag{4} $$
The above estimation can be improved by using bundle adjustment-like procedures in order to obtain more accurate estimates (Ma et al., 2003), if needed.
Points $\{m_i\}_{i=1,\dots,n}$ and $\{M_i\}_{i=1,\dots,n}$ will also be used in subsection 2.5 to map the 2D segmentation results onto the 3D bridge points.
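As an illustration, the closed-form pose estimation described above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the implementation used in this work: the function names `estimate_projection_matrix` and `decompose_projection_matrix` are hypothetical, the linear system is solved through an SVD, and the triangular-orthogonal factorization is obtained by applying `np.linalg.qr` to a row-reversed matrix (one common way of getting an upper triangular factor on the left).

```python
import numpy as np

def estimate_projection_matrix(M, m):
    """Estimate the 3x4 projection matrix P (up to scale) from 3D-2D matches.

    M: (n, 3) non-coplanar 3D points, m: (n, 2) distortion-free image points.
    At least 6 matches are needed (P has 11 degrees of freedom).
    """
    M = np.asarray(M, dtype=float)
    m = np.asarray(m, dtype=float)
    n = M.shape[0]
    Mh = np.hstack([M, np.ones((n, 1))])       # homogeneous 3D points
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Mh                           # rows from the u coordinate
    A[0::2, 8:12] = -m[:, [0]] * Mh
    A[1::2, 4:8] = Mh                           # rows from the v coordinate
    A[1::2, 8:12] = -m[:, [1]] * Mh
    # The unit vector p minimizing ||A p|| is the last right singular vector.
    p = np.linalg.svd(A)[2][-1]
    return p.reshape(3, 4)                      # rows of P stacked in p

def decompose_projection_matrix(P):
    """Split P (known up to scale and sign) into K, R, t with P ~ K [R | -R t]."""
    P = np.asarray(P, dtype=float)
    if np.linalg.det(P[:, :3]) < 0:             # remove the sign ambiguity
        P = -P
    J = np.fliplr(np.eye(3))                    # exchange (row-reversal) matrix
    Q, T = np.linalg.qr((J @ P[:, :3]).T)
    K = J @ T.T @ J                             # upper triangular factor
    R = J @ Q.T                                 # orthogonal factor
    D = np.diag(np.sign(np.diag(K)))            # force a positive diagonal in K
    K, R = K @ D, D @ R
    scale = K[2, 2]                             # impose K[2, 2] = 1
    K = K / scale
    r4 = np.linalg.solve(K, P[:, 3] / scale)    # fourth column of [R | -R t]
    t = -R.T @ r4                               # camera position
    return K, R, t
```

With noise-free matches the two functions recover K, R and t up to numerical precision; with real measurements the result would only be an initial estimate, to be refined if needed as mentioned above.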
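
L*, a*, b* representation
Images acquired by standard cameras are usually represented in the RGB (red, green, blue) color space. Although convenient for visualization on standard displays, this space is not the most suitable for segmentation purposes. Indeed, in the RGB representation each color channel is typically highly correlated with the others, making the analysis of the three channels mostly redundant.
In order to tackle this issue, the L*, a*, b* representation of the image colors is considered, as it yields much less correlated color channels.
The conversion between the RGB and the L*, a*, b* color spaces is usually described by passing through the XYZ representation:
$$ \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = A \begin{bmatrix} R \\ G \\ B \end{bmatrix} \tag{5} $$
and
$$ L^* = 116\, f\!\left(\frac{Y}{Y_n}\right) - 16, \qquad a^* = 500 \left[ f\!\left(\frac{X}{X_n}\right) - f\!\left(\frac{Y}{Y_n}\right) \right], \qquad b^* = 200 \left[ f\!\left(\frac{Y}{Y_n}\right) - f\!\left(\frac{Z}{Z_n}\right) \right] \tag{6} $$
where $A$ is the $3 \times 3$ matrix of the linear map from RGB to XYZ (its entries depend on the RGB primaries in use), $X_n$, $Y_n$, $Z_n$ represent the (normalized) values of the white color in the XYZ representation, and the function $f(\cdot)$ is defined as follows:
$$ f(s) = \begin{cases} s^{1/3} & \text{if } s > \left(\tfrac{6}{29}\right)^3 \\ \tfrac{1}{3}\left(\tfrac{29}{6}\right)^2 s + \tfrac{4}{29} & \text{otherwise} \end{cases} \tag{7} $$
Figure 1: Example of region to be segmented.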
In order to validate the above considerations on the correlations among the RGB and the L*, a*, b* channels, the Pearson correlation coefficient has been computed as well. In the case of the image shown in Fig. 1, this coefficient is 0.97 for the red and green color channels (Fig. 2), whereas it is 0.03 for the L* and a* channels (Fig. 3). The correlation between the red and green channels is also apparent in Fig. 2, while L* and a* are weakly correlated, as shown in Fig. 3. Similar considerations hold for the other possible pairs of channels. To simplify the notation, hereafter a one-dimensional signal will be considered (i.e. the lightness L*). Nevertheless, the approach can be extended to the three-dimensional signal case.
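The color conversion and the correlation check discussed above can be sketched as follows. This is only an illustrative sketch: the RGB to XYZ matrix below is the one commonly quoted for sRGB primaries with a D65 white point, which is an assumption (the matrix actually used in this work is not specified here), the input is assumed to contain linear RGB values in [0, 1] (gamma decoding is omitted), and the names `rgb_to_lab` and `pearson` are hypothetical.

```python
import numpy as np

# Assumption: sRGB primaries, D65 white point (not stated in the text).
RGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                       [0.2126, 0.7152, 0.0722],
                       [0.0193, 0.1192, 0.9505]])
# XYZ coordinates of white, i.e. the image of R = G = B = 1.
WHITE_XYZ = RGB_TO_XYZ @ np.ones(3)

def f(s):
    """Piecewise cube-root function used by the L*, a*, b* conversion."""
    delta = 6.0 / 29.0
    return np.where(s > delta ** 3, np.cbrt(s), s / (3.0 * delta ** 2) + 4.0 / 29.0)

def rgb_to_lab(rgb):
    """Convert an (..., 3) array of linear RGB values in [0, 1] to L*, a*, b*."""
    xyz = np.asarray(rgb, dtype=float) @ RGB_TO_XYZ.T
    fx, fy, fz = np.moveaxis(f(xyz / WHITE_XYZ), -1, 0)
    L = 116.0 * fy - 16.0
    a = 500.0 * (fx - fy)
    b = 200.0 * (fy - fz)
    return np.stack([L, a, b], axis=-1)

def pearson(x, y):
    """Pearson correlation coefficient between two channels (any shape)."""
    return np.corrcoef(np.ravel(x), np.ravel(y))[0, 1]
```

For an image array `img` of shape (H, W, 3), `pearson(img[..., 0], img[..., 1])` gives the red-green correlation, and `lab = rgb_to_lab(img)` followed by `pearson(lab[..., 0], lab[..., 1])` gives the one between L* and a*.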

Lightness multimodal density
The goal of this step is to estimate the probability density of the lightness L* values in the region of interest: such density is usually multimodal, where each mode is typically associated with a region with different characteristics. Hence, the rationale is that of obtaining a rough segmentation of the region into different areas according to the distribution of the lightness values in different modes.
The histogram of the lightness values will be considered as a sample approximation of the probability density. Fig. 4 shows the histogram corresponding to the multimodal lightness density of the image in Fig. 1. It is assumed that the mode with the largest number of associated pixels corresponds to the non-deteriorated region. This way, it is possible to easily distinguish between non-deteriorated and deteriorated regions (those corresponding to the largest mode and to the others, respectively). It is worth noting that under this assumption it is not necessary to determine the total number of significant modes in the image (everything outside the largest mode is considered as a deteriorated region). Analogous conclusions can be drawn if the algorithm can associate the most significant mode with the deteriorated regions.
Otherwise, when neither of the above cases applies, by associating a different region with each mode (i.e. with each local maximum in the probability density) it is possible to segment regions characterized by different statistics, but without automatically distinguishing which of them correspond to the deteriorated ones. Notice that the above assumption and/or restriction will be removed in the future development of this project: assuming that the same infrastructure is monitored for a certain amount of time, several temporal samples will be available. The analysis of the current temporal sample will take advantage of the previous ones: this way, limitations on the analysis procedure will be removed thanks to the use of previously acquired information.
Once the number of modes $m$ has been determined (for instance by using local maxima detection), the separation between different modes is done on the histogram by means of Otsu's $m$-thresholding method (Otsu, 1975, Otsu, 1979). Otsu's method provides the threshold $\tau_i$ separating the histogram bins associated with the $i$-th detected mode from those associated with the $(i+1)$-th, for $i = 1, \dots, m-1$.
In order to simplify the presentation, without loss of generality hereafter only two regions are considered, i.e. $m = 2$, separable by means of the threshold $\tau$. The generalization to a generic value of $m$ is immediate.
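For the two-mode case considered above, the threshold selection and the resulting rough labelling can be sketched as follows. This is a minimal NumPy sketch, assuming a 256-bin histogram and the standard between-class variance criterion of Otsu's method; the function names `otsu_threshold` and `rough_segmentation` are hypothetical, and the minority class is labelled as deteriorated in line with the largest-mode assumption stated above.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Two-class Otsu threshold computed on the histogram of `values`."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    p = hist / hist.sum()                       # sample probability of each bin
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                           # weight of the class below the cut
    w1 = 1.0 - w0                               # weight of the class above the cut
    mu = np.cumsum(p * centers)                 # cumulative first moment
    mu_total = mu[-1]
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    sigma_b = np.zeros(bins)                    # between-class variance per cut
    sigma_b[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    k = int(np.argmax(sigma_b))
    return edges[k + 1]                         # upper edge of the last "low" bin

def rough_segmentation(L, tau):
    """Boolean mask of pixels assumed deteriorated (the smaller of the two classes)."""
    above = np.asarray(L) > tau
    n_above = int(above.sum())
    # The class with more pixels is assumed to be the non-deteriorated one.
    return above if n_above < above.size - n_above else ~above
```

The mask returned by `rough_segmentation` is typically noisy, which is why it is only used as a starting point for the level set step described in the next subsection.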

Region segmentation
The region segmentation considered in this subsection is the final step needed to separate deteriorated areas from non-deteriorated ones. The approach considered here is based on level set segmentation (Sethian, 1999, Sethian, 2001, Osher and Fedkiw, 2003).
First, a rough segmentation is obtained by means of the modal decomposition presented in the previous subsection: the number of segmented areas to be considered is set equal to the number of detected modes $m$. Then, a pixel is assigned to the $i$-th segmented area if its lightness value has been assigned in the previous subsection to the $i$-th mode, for $i = 1, \dots, m$.
Unfortunately, the above rough segmentation is usually very noisy, hence the following level set method is applied:
• The closed curve $\Gamma$ separating the two regions of interest on the image is implicitly described as
$$ \Gamma = \{ (x, y) : \phi(x, y) = 0 \} \tag{8} $$
where $(x, y)$ are the coordinates on the image domain and the level set function $\phi(x, y)$ will be described in the following.
• The level set function is initialized as $L^*(x, y) - \tau$, then it is evolved by considering both the positions where edges are more probable in the image (i.e. where it is more likely to have the border between the two regions) and the regularity (i.e. smoothness) of the curve:
$$ \frac{\partial \phi}{\partial t} = S \cdot \nabla \phi + k \, |\nabla \phi| \tag{9} $$
where the first term on the right side of the above equation, $S \cdot \nabla \phi$, corresponds to the introduction of an external vector field related to the image edges, whereas the second term, $k |\nabla \phi|$, aims at increasing the curve smoothness.
• In this work the external vector field $S$ corresponds to an edge detection function. Let $I_G$ be the original image $I$ filtered by a Gaussian and LoG be the Laplacian of Gaussian filter:
$$ I_G = G * I \tag{10} $$
and
$$ \mathrm{LoG} = \nabla^2 G = \frac{\partial^2 G}{\partial x^2} + \frac{\partial^2 G}{\partial y^2} \tag{11} $$
where $*$ stands for the 2-dimensional convolution and the Gaussian filter $G$, with standard deviation $\sigma$, is defined as follows
$$ G(x, y) = \frac{1}{2 \pi \sigma^2} \exp\!\left( -\frac{x^2 + y^2}{2 \sigma^2} \right) \tag{12} $$
Let $I_L = \mathrm{LoG} * I$; then $S$ at the point $(x, y)$ is defined as follows:
$$ S(x, y) = I_L(x, y) \, \frac{\nabla I_G(x, y)}{|\nabla I_G(x, y)|} \tag{13} $$
that is, the magnitude of the external field in $(x, y)$ is determined by $I_L(x, y)$ (i.e. the value of the original image $I$ filtered with the LoG operator), whereas the direction of the external field is determined by the filtered gradient of the image, $\nabla I_G(x, y)$, normalized by its length. Where the gradient is equal to 0, $S(x, y)$ is set to be a zero vector as well.
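The level set step outlined in the bullets above can be illustrated with a small NumPy sketch. It is not the implementation used in this work and rests on several assumptions that should be checked against the original formulation: `k` is taken to be the curvature of the level sets of the level set function, derivatives are plain central differences (`np.gradient`) with explicit Euler time steps and no reinitialization, the LoG-filtered image is computed as the Laplacian of the Gaussian-filtered image, and the parameter `edge_sign` (hypothetical, like all function names below) selects the sign with which the edge term enters the update. Its default of -1 is the choice that pulls the zero level set towards the zero crossings of the LoG-filtered image in this discretization.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian filtering (I_G = G * I) with edge replication."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    g /= g.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, g, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, g, mode="valid"), 0, rows)

def external_field(img, sigma):
    """Edge-related field S = I_L * grad(I_G) / |grad(I_G)| (zero where grad is zero)."""
    I_G = gaussian_blur(img, sigma)
    gy, gx = np.gradient(I_G)                   # derivatives along rows, columns
    # LoG * I equals the Laplacian of G * I; central differences are used here.
    I_L = np.gradient(gy, axis=0) + np.gradient(gx, axis=1)
    norm = np.hypot(gx, gy)
    safe = np.where(norm > 0, norm, 1.0)
    Sx = np.where(norm > 0, I_L * gx / safe, 0.0)
    Sy = np.where(norm > 0, I_L * gy / safe, 0.0)
    return Sx, Sy

def curvature(phi, eps=1e-8):
    """Curvature of the level sets of phi: div(grad(phi) / |grad(phi)|)."""
    py, px = np.gradient(phi)
    norm = np.hypot(px, py) + eps
    return np.gradient(px / norm, axis=1) + np.gradient(py / norm, axis=0)

def evolve_level_set(L, tau, sigma=2.0, dt=0.05, steps=20, smooth=1.0, edge_sign=-1.0):
    """Evolve phi, initialized as L - tau, with an edge term and a curvature term."""
    L = np.asarray(L, dtype=float)
    phi = L - tau
    Sx, Sy = external_field(L, sigma)
    for _ in range(steps):
        py, px = np.gradient(phi)
        edge_term = edge_sign * (Sx * px + Sy * py)
        smooth_term = smooth * curvature(phi) * np.hypot(px, py)
        phi = phi + dt * (edge_term + smooth_term)
    return phi
```

The sign of the returned array gives the two regions (for instance `phi < 0` for the darker one). A more careful implementation would use upwind differences and periodic reinitialization of the level set function.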

Remapping of the segmented regions
Once the regions have been segmented, the resulting boundaries have to be remapped on the 3D space in order to spatially position them on the infrastructure.
By assumption, each of the areas to be segmented can be well approximated as a planar surface, hence a local two-dimensional coordinate system can be defined on such surface.
Consider the $j$-th analyzed area, delimited by the points $\{m_{ij}\}_{i=1,\dots,n_j}$, with $n_j \ge 3$, for each $j$. Since the considered area can be approximated as a planar surface, each of its points can be expressed as a linear combination of 3 non-collinear points among the 3D points matched to $\{m_{ij}\}_{i=1,\dots,n_j}$. Without loss of generality, let $M_{i_1}$, $M_{i_2}$, $M_{i_3}$ be three such non-collinear 3D points.
Furthermore, let $u = M_{i_1}$ and let $U$ be the result of the Gram-Schmidt orthogonalization of the matrix $[M_{i_2} - M_{i_1} \;\; M_{i_3} - M_{i_1}]$.
Then, each point $M^*$ of the planar surface of interest can be expressed as follows:
$$ M^* = u + U x, \qquad x \in \mathbb{R}^2 \tag{14} $$
where, with a slight abuse of notation, hereafter $x$ corresponds to the point coordinates according to the new reference system.
Let $M^*$ be the 3D point associated with a 2D point $m^*$ on the segmentation boundary. According to (1), the 3D position $M^*$ can be obtained from the following:
$$ (P_{12} - m^* P_3) \begin{bmatrix} M^* \\ 1 \end{bmatrix} = 0 \tag{15} $$
Substituting (14) in the above equation,
$$ (P_{12} - m^* P_3) \begin{bmatrix} u + U x \\ 1 \end{bmatrix} = 0 \tag{16} $$
and, after simple matrix manipulations of the above equation:
$$ x = -\left( \bar{P}_{1:3} \, U \right)^{-1} \left( \bar{P}_{1:3} \, u + \bar{p}_4 \right) \tag{17} $$
where $\bar{P}_{1:3}$ and $\bar{p}_4$ denote the first three columns and the fourth column, respectively, of the $2 \times 4$ matrix $P_{12} - m^* P_3$. Finally, $M^*$ can be obtained from (14).
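The remapping of a boundary point onto the planar surface can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions: the plane is defined by three non-collinear 3D points taken among the user-selected matches, the orthonormal basis is obtained with `np.linalg.qr` (equivalent to Gram-Schmidt up to the sign of the columns), and the function names `plane_basis` and `remap_to_plane` are hypothetical.

```python
import numpy as np

def plane_basis(M1, M2, M3):
    """Origin u and orthonormal basis U (3x2) of the plane through three 3D points."""
    M1, M2, M3 = (np.asarray(v, dtype=float) for v in (M1, M2, M3))
    A = np.column_stack([M2 - M1, M3 - M1])
    U, _ = np.linalg.qr(A)                      # orthonormal columns spanning the plane
    return M1, U

def remap_to_plane(P, m_star, u, U):
    """Map the image point m_star to local plane coordinates x and 3D point M*.

    P is the 3x4 projection matrix; the pinhole relation is rewritten as
    (P12 - m* P3) [M*; 1] = 0 and solved for x with M* = u + U x.
    """
    P = np.asarray(P, dtype=float)
    m_star = np.asarray(m_star, dtype=float)
    B = P[:2] - np.outer(m_star, P[2])          # 2x4 matrix P12 - m* P3
    B1, b2 = B[:, :3], B[:, 3]
    x = -np.linalg.solve(B1 @ U, B1 @ u + b2)   # local 2D coordinates
    return x, u + U @ x                         # and the corresponding 3D point
```

Applying `remap_to_plane` to every point of a segmentation boundary yields both the local 2D coordinates and the 3D positions of the boundary on the considered planar area.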

RESULTS
The proposed method has been tested on a road bridge located in Perarolo di Cadore, a small village in the Italian Alps. The considered bridge is shown in Fig. 5, whereas the point cloud of a portion of the bridge (acquired by means of TLS) is shown in Fig. 6. Once the 2D regions have been segmented, their boundaries can be remapped on the 3D space according to the procedure described in subsection 2.5. For instance, the segmentation boundaries in Fig. 7 can be remapped as shown in Fig. 10 (in order to improve the readability of the figure, the $x$ points, computed as in (17), are shown instead of $M^*$).
The segmentation boundaries can be remapped on the 3D model as well: Fig. 11 shows the curve formed by the segmentation boundary points $\{M^*\}$ (obtained from (14)) superimposed on the point cloud data. In order to ease the visualization of the results, the figure shows a zoom on a specific (and significant) portion of the original image, viewed from an observation direction approximately orthogonal to the analyzed surface.
Similarly, Fig. 12 shows the segmentation boundaries for Fig. 8 remapped on the 3D-model.

DISCUSSION AND CONCLUSIONS
This paper proposed a semi-automated method that exploits level set image segmentation in order to estimate the boundaries of degraded regions on complex infrastructures, e.g. the bridge considered in this work.
As shown by the obtained results, the method yields appropriate 2D region boundaries (Fig. 7 and 8), which can be remapped on the 3D model as well (Fig. 10, 11 and 12).
Although our results are quite encouraging, some work still has to be done in order to make the whole procedure as simple as possible for the user. In particular, the case of the analysis of periodic recordings of the same infrastructure will be considered in the future. The rationale in such case is that of properly exploiting information already available from previously analyzed data to reduce the interaction with the user, i.e. to make the procedure more autonomous.
Furthermore, this work relies on differences in the lightness values in order to distinguish different areas. Although this is quite intuitive, it might be subject to errors in certain cases. Future investigation will be dedicated to the use of different image analysis techniques (e.g. Geladi and Grahn, 1996, Facco et al., 2011, Facco et al., 2013) in order to provide more robust results from this point of view. Different level set segmentation methods can also be considered for the same purpose: for instance, the method proposed in (Li et al., 2011) makes it possible to partially compensate for intensity inhomogeneities in the image (e.g. due to lightness changes in the image).
It is also worth noting that one might consider the possibility of computing the segmentation of degraded areas directly on the intensities of TLS data. This is possible; however, such intensities are influenced by several factors, which typically make them not so reliable. For instance, while in Fig. 11 the intensities of TLS data are quite reliable (they make it possible to approximately distinguish the degraded areas), the TLS data shown in Fig. 12 are not useful for our aim. According to this consideration, the use of TLS data to determine the accuracy of the positions of the computed boundaries might not be so reliable. Instead, TLS data can be used to evaluate the distance between the remapped planar surface on which the $M^*$ points lie and the corresponding 3D TLS point positions: this distance, which in our simulations is typically of a few centimeters, clearly depends on the accuracy of the method, on the measurement error of the TLS data points and on how close the real surface is to a plane.
Our future investigations will also consider the evaluation of possible issues related to the use of a reconstructed 3D model (Tucci et al., 2012). This method has also been tested in relation to a didactic activity which will take place at the Joint Summer School of Alpine Research 2015, applied to terrestrial laser scanning data sets, and further results in this sense will make it possible to assess its suitability in various fields of application.
(b) Automatic modal decomposition of the image values (according to a convenient L*, a*, b* representation) on the region of interest.

Figure 2: Red (top) and green (bottom) color channels for the image in Fig. 1. Brighter pixels are associated with higher values of the considered variables.

Figure 3: L* (top) and a* (bottom) channels for the image in Fig. 1. Brighter pixels are associated with higher values of the considered variables.

Figure 4: Example of histogram of lightness values, corresponding to the image in Fig. 1.

Figure 5: Road bridge located in Perarolo di Cadore, considered as test case in this work.

Figure 6: Point cloud of a portion of the road bridge of Fig. 5.
The image segmentation results obtained (by means of the level set method described in the previous section) for the image in Fig. 1 are shown in Fig. 7. The region considered in the analysis of this image is a portion of the planar surface under the bridge, delimited by green lines in the figure. Segmented regions are instead delimited by red lines: these regions correspond to degraded areas, where degradation has probably been caused by percolation issues. It is worth noting that the segmentation shown in the figure corresponds to the separation between the two largest modes in Fig. 4: this is sufficient to distinguish the areas where degradation is more evident, whereas, if needed, a segmentation based on the other modes with lower intensity in Fig. 4 can be used to distinguish other areas in the figure with other (i.e. lower) levels of degradation. Similarly, the results for other portions of the bridge are shown in Fig. 8 and 9 (Fig. 8 and 9 are composed of two sub-figures: the original image is shown at the top, whereas the obtained results are shown at the bottom).

Figure 7: Example of segmentation results, corresponding to the image in Fig. 1. The region considered for the analysis is delimited by the green lines, whereas boundaries of the segmented areas are indicated by red lines.

Figure 8: Top: original image. Bottom: example of segmentation results.

Figure 9: Top: original image. Bottom: example of segmentation results.

Figure 10: Example of segmentation results: boundaries of detected degradation areas of Fig. 7 remapped in the 3D space (in order to improve the readability of the figure, the x points are shown instead of M*).

Figure 11: Example of segmentation results: a portion of the segmentation boundaries (red lines) for Fig. 1 remapped in the 3D space. Boundary lines are superimposed on the point cloud acquired by means of TLS. Observation direction is approximately orthogonal to the analyzed surface.

Figure 12: Example of segmentation results: a portion of the segmentation boundaries (red lines) for Fig. 8 remapped in the 3D space. Boundary lines are superimposed on the point cloud acquired by means of TLS. Observation direction is approximately orthogonal to the analyzed surface.