SEMI-AUTOMATIC BUILDING MODELS AND FAÇADE TEXTURE MAPPING FROM MOBILE PHONE IMAGES

Research on 3D urban modelling has been actively carried out for a long time. Recently, the need for 3D urban modelling research has increased rapidly due to improved geo-web services and the spread of smart devices. Current 3D urban models, such as those provided by Google Earth, are built from aerial photos, which brings some limitations: building models are difficult to update immediately when buildings change, many buildings have no 3D model or texture, and maintaining and updating the models requires large resources. To resolve these limitations, we propose a method for semi-automatic building modelling and façade texture mapping from mobile phone images, and we analyze the modelling results against actual measurements. Our method consists of a camera geometry estimation step, an image matching step, and a façade mapping step. Models generated by this method were compared with measurements of real buildings by comparing the ratios of edge lengths of the models with those of the measurements. The results showed a 5.8% average error in length ratio. With this method, we could generate a simple building model with fine façade textures without expensive dedicated tools or datasets.


INTRODUCTION
3D urban modelling has been studied for a long time in photogrammetry. It has been used for various applications such as urban planning, environment, construction and tourism. These days, demand is rapidly moving to many public fields. With this change, some problems have arisen in current methodologies. Currently, urban modelling relies heavily on photogrammetric mapping and needs long working time, highly qualified experts and expensive equipment. To overcome these weaknesses, much research has been carried out on 3D urban mapping. Suveg and Vosselman (2004) developed an automatic urban modelling method using aerial photos and maps; they reconstructed over 75% of the buildings in the photos, and the accuracy of the reconstruction was good enough for mapping purposes. Schwalbe et al. (2005) developed an automatic urban modelling method using aerial laser scanner data and 2D GIS data. They reconstructed 40-50% of the buildings in a complex city and 100% of the buildings in a newly built residential area. Verma et al. (2006) developed automatic building detection and modelling from aerial LIDAR data. These methods produce precise building models automatically, but some limitations remain. First, they still need expensive source data such as aerial photos. This leads to long update periods and an insufficient number of building models. Second, they do not provide enough building façade texture, even though public demand is focused on façade information. Therefore, in this paper we propose a new semi-automatic method to make 3D urban models. Our goals are as follows.

1. Easy and low cost
2. Sufficient façade information
3. Appropriate visible quality

To achieve these goals, we have to select low-cost source data that provide enough façade data and enough parameter specification to apply photogrammetric theory. We also have to select proper modelling and processing algorithms. We select smartphones, which are widely commercialized and can easily acquire high-resolution images. Our previous research (Ahn et al., 2014) verified the photogrammetric potential of smartphone images. We try to design our algorithm to suit the current specification of smartphones: their camera quality, processing power and orientation accuracy. We obtain a 3D point cloud with a stereo matching algorithm developed in-house. We rely on user input to define the boundary of the building façades and restrict building models to a simple cube shape. Building models generated by this method are compared with the actual length ratios of the real buildings.

METHODOLOGY
To model a building from a smartphone camera, a user must take two images of the building of interest. The user needs to position the camera so that each image contains a full view of two building façades. The algorithm then proceeds in the following steps: camera geometry estimation, epipolar resampling, ROI input from the user, 3D point cloud generation, and 3D building modelling. Each step is explained below.

Camera geometry estimation
To estimate the camera geometry between the two images, we use relative orientation with the coplanarity constraint. As shown in Fig. 1, this constraint states that the two focal points of the stereo images and the tie points from each image lie on one plane in the model space. To generate tie points in the stereo images, we use the SIFT algorithm, a well-known feature extraction and feature matching algorithm developed in computer vision. With (1), we can calculate the relative position (x, y, z) and rotation (ω, φ, κ) from the linearized form of the coplanarity constraint by least squares estimation.
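The coplanarity constraint can be illustrated with a small numerical sketch. This is not the paper's equation (1), which is not reproduced in this text. It only evaluates the triple product of the base vector and the two image rays, which should vanish for a correct tie point. The rotation convention, the camera looking along its -z axis, the function names and the `project` helper are all assumptions made for the illustration; the three rotation angles are named omega, phi and kappa here.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rotation_matrix(omega, phi, kappa):
    # Assumed convention: R = Rz(kappa) * Ry(phi) * Rx(omega),
    # mapping camera-2 coordinates into the model frame.
    co, so = math.cos(omega), math.sin(omega)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    rx = [[1.0, 0.0, 0.0], [0.0, co, -so], [0.0, so, co]]
    ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    rz = [[ck, -sk, 0.0], [sk, ck, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))

def project(point, center, rot, f):
    # Hypothetical helper for synthetic data: image coordinates of a
    # model-space point seen by a camera at `center` with rotation `rot`.
    d = [p - c for p, c in zip(point, center)]
    rot_t = [[rot[j][i] for j in range(3)] for i in range(3)]
    dc = matvec(rot_t, d)
    s = f / -dc[2]  # the camera is assumed to look along its -z axis
    return (dc[0] * s, dc[1] * s)

def coplanarity_residual(base, rot, p1, p2, f):
    # Triple product base . (ray1 x ray2); zero when the base vector and
    # both image rays lie in one plane.
    ray1 = [p1[0], p1[1], -f]               # camera 1 defines the model frame
    ray2 = matvec(rot, [p2[0], p2[1], -f])  # camera-2 ray in the model frame
    normal = cross(ray1, ray2)
    return sum(b * n for b, n in zip(base, normal))
```

In a least squares adjustment, residuals of this kind, one per tie point, would be linearized with respect to the orientation parameters and driven towards zero.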

Epipolar resampling
We generate the coordinates of a dense 3D point cloud with modified relaxation matching. Before matching, we eliminate Y parallax by epipolar resampling so that candidate points for matching can be found easily. We adopt epipolar resampling based on the two rotations in (2). With these processes, we obtain a stereo image pair without Y parallax.
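The effect of epipolar resampling can be checked with a short sketch. It does not reproduce the two rotations of (2), which are not available in this text. Instead it assumes one common rectifying rotation whose x-axis lies along the baseline, applied to rays already expressed in the model frame. Under that assumption, a point seen from both camera positions receives the same y coordinate. All names are hypothetical.

```python
import math

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def rectifying_rotation(base):
    # Rows are the axes of the rectified frame.
    e1 = normalize(base)                        # x-axis along the baseline
    e2 = normalize(cross([0.0, 0.0, 1.0], e1))  # y-axis perpendicular to baseline and model z
    e3 = cross(e1, e2)                          # z-axis completes the right-handed frame
    return [e1, e2, e3]

def rectified_coords(point, center, r_rect, f):
    # Image coordinates of a model-space point in the rectified frame,
    # for a camera at `center` looking along the rectified -z axis.
    d = [p - c for p, c in zip(point, center)]
    dx, dy, dz = (sum(row[k] * d[k] for k in range(3)) for row in r_rect)
    return (-f * dx / dz, -f * dy / dz)
```

Because the baseline has no y or z component in the rectified frame, the two rays to any point differ only in x. The remaining x difference is the disparity that the matching step searches along the epipolar line.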

Get ROI from user
As mentioned above, our model is constrained to a simple cube shape. This constraint makes the process efficient. In line with this constraint, we take user input on the model's boundary in the image to reduce processing time. With this information we can make the façade image for the 3D model, limit the bounds of relaxation matching, and estimate building planes from the 3D point cloud after matching.

Point Cloud Generation
At this point we have epipolar images with ROI information. We can generate the 3D point cloud efficiently by confining the match region to the ROI. In this paper, we use modified relaxation matching, developed in-house and named MDR (Multi-Dimensional Relaxation). In this matching, we use a number (p in Fig. 3) of patches with different sizes and shapes instead of one correlation patch. We calculate the correlation value p times for each of the n candidates along an epipolar line. We accept a candidate as a match if it has the largest number of patches with the maximum correlation value. We therefore compute p*n correlations for every point to be matched, which adds computational overhead. To offset this overhead, we adopt a pyramid matching scheme, which reduces match processing time by constraining the candidates' range in the smaller-scale images. After matching each candidate, we calculate the model coordinates of the 3D point cloud.

Figure 3. MDR matching
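The voting rule described for MDR can be sketched as follows. This is a minimal illustration rather than the in-house implementation. It assumes normalized cross-correlation as the correlation measure, rectangular patches given as (half-height, half-width) pairs, and images stored as lists of rows; the function names are hypothetical. The pyramid scheme is omitted.

```python
import math

def ncc(left, right, row, col_left, col_right, hh, hw):
    # Normalized cross-correlation of two (2*hh+1) x (2*hw+1) windows
    # centred on the same row of the epipolar image pair.
    offsets = [(i, j) for i in range(-hh, hh + 1) for j in range(-hw, hw + 1)]
    a = [left[row + i][col_left + j] for i, j in offsets]
    b = [right[row + i][col_right + j] for i, j in offsets]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0

def mdr_match(left, right, row, col_left, candidates, patches):
    # Each of the p patches votes for the candidate column where its
    # correlation is highest; the candidate with the most votes wins.
    votes = {c: 0 for c in candidates}
    for hh, hw in patches:
        best = max(candidates,
                   key=lambda c: ncc(left, right, row, col_left, c, hh, hw))
        votes[best] += 1
    return max(candidates, key=lambda c: votes[c])
```

With p patches and n candidates, `ncc` is evaluated p*n times per point, which is the overhead that the pyramid scheme is meant to reduce by narrowing the candidate list.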

3D building modelling
To build the 3D building model, we estimate a 3D plane equation for each façade defined by the user's input. We fit the 3D façade planes to the 3D point cloud generated by MDR, using the RANSAC algorithm to remove outliers in the match results. We then adjust the two fitted planes to a cube shape. This step compensates for boundary misalignment between the two planes and for angular errors. After that, we can make a cube model in the model space. Finally, we attach the façade textures obtained from the user's ROI input. After these processes we still lack data for the opposite sides of the building. We handle this problem by simply attaching the same façade to the opposite side.
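The outlier-rejecting plane fit can be sketched with a basic RANSAC loop. This is an illustration under stated assumptions, not the paper's implementation: the distance threshold, iteration count, fixed seed and function names are hypothetical, and the plane is taken directly from the best three-point sample rather than refitted to all inliers.

```python
import math
import random

def plane_from_points(p, q, r):
    # Plane through three points as (unit normal n, offset d) with n.x + d = 0.
    # Returns None when the points are (nearly) collinear.
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    norm = math.sqrt(sum(c * c for c in n))
    if norm < 1e-12:
        return None
    n = [c / norm for c in n]
    return n, -sum(n[i] * p[i] for i in range(3))

def ransac_plane(points, threshold, iterations=200, seed=0):
    # Keep the sampled plane that has the most points within `threshold`.
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = [p for p in points
                   if abs(sum(n[i] * p[i] for i in range(3)) + d) < threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers
```

Running this once per façade ROI would give two planes. The adjustment to a cube shape, for example by forcing the two normals to be perpendicular, is a separate step not shown here.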

Experimental design
For the experiments described in this paper, we took two stereo pairs, one of an artificial cube-shaped box in our laboratory and one of a real building outdoors. We selected the first case to check for distortion in the model generated by the proposed method using an ideal box shape. We selected the second case to check the applicability of the proposed method in terms of cost, information and visible quality. The two cases are shown in Figure 4.

To estimate the accuracy of the models generated by our method, we compared the ratios of our models' length, width and height against the real ratios. Note that the 3D building models are generated only in the model space, so we cannot estimate the real dimensions of the buildings. Table 2 shows the accuracy of the proposed method for the two cases. The accuracy for the box was better than that for the real building. In all cases, building models with a relative scale accuracy within 10% were achieved. In both cases, we observe accuracy degradation in the Y scale. We are currently studying the cause of this degradation. Qualitatively, using smartphone images and a semi-automatic method, we could generate building models easily and at low cost. Existing photogrammetric models are far more sophisticated, but the models generated here can have finer texture maps that are observable from a street-level viewpoint.
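The length-ratio comparison can be written out as a short sketch. The text does not state exactly how the ratios were formed, so this assumes that each set of dimensions is normalized by one reference edge before the percentage error is taken. The function name and the example numbers are hypothetical and are not the measurements behind Table 2.

```python
def ratio_errors(model_dims, real_dims, reference=0):
    # Percent error of each model edge ratio relative to the measured ratio.
    # Both sets of dimensions are divided by the edge at index `reference`,
    # because the model space carries no absolute scale.
    model_ratios = [m / model_dims[reference] for m in model_dims]
    real_ratios = [r / real_dims[reference] for r in real_dims]
    return [100.0 * abs(m - r) / r for m, r in zip(model_ratios, real_ratios)]

# Hypothetical example: model edges in model units, real edges in metres.
errors = ratio_errors([2.0, 1.1, 3.0], [10.0, 5.0, 15.0])
```

The reference edge always has zero error by construction, so only the remaining edges carry information about the shape distortion of the model.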

CONCLUSION AND FUTURE WORK
With the proposed method, we could make 3D building models more easily and cheaply than with existing methods. We could also obtain enough façade data from street-level views. As Table 2 shows, our models have length-ratio errors within 10%.
For future research, we have two goals. First, we will make more precise models that do not show accuracy degradation in a particular direction, and models more complicated than a cube, with more than two planes. Second, we will make the whole process run in real time. Currently we can generate a model within 2 minutes of the user's ROI input. We will optimize the whole process and reduce the 3D point cloud density to the minimum needed for proper plane fitting, so that users do not notice any processing delay.

Figure 4. Artificial and Real buildings used in experiments

Figure 8. Model extracted from stereo image

Table 2. Errors in length ratio