AUTOMATIC RECTIFICATION OF BUILDING FAÇADES
Keywords: Inter-image homography, rectification, fundamental matrix, SURF detector, Canny operator, Hough transform
Abstract. Focusing mainly on the case of (near-)planar building façades, a methodology for their automatic projective rectification is described and evaluated. It relies on a suitably configured, calibrated stereo pair of an object expected to contain a minimum of vertical and/or horizontal lines for the purposes of levelling. The SURF operator is used for extracting and matching interest points. Coplanar points are then separated by either of two alternative methods. In the first, the fundamental matrix of the stereo pair, computed using robust estimation, allows estimating the relative orientation of the calibrated pair; initial parameter values, if needed, may be estimated via the essential matrix. Intersection of valid points creates a 3D point set in model space, to which a plane is robustly fitted. In the second, all initial point matches are used directly for robustly estimating the inter-image homography of the pair, thus directly selecting all image matches referring to coplanar points; initial values for the relative orientation parameters, if needed, may be estimated from a decomposition of the inter-image homography. In both cases, the intersected coplanar model points finally yield the object-to-image homography which allows image rectification. The in-plane rotation required to finalize the transformation is found by assuming that rectified images contain sufficient straight line segments to form a dominant pair of orthogonal directions, which correspond to horizontality/verticality in 3D space. In our implementation, image edges extracted with the Canny detector are passed to a linear Hough Transform (HT), resulting in a 2D array over (ρ, θ) whose values equal the number of edge pixels belonging to the particular line. Quantization parameter values are chosen to absorb possible slight deviations from collinearity due to thinning or uncorrected lens distortions.
After imposing a threshold expressing the minimum acceptable number of edge pixels per line, the resulting HT is accumulated along the ρ-dimension to give a single vector, whose values represent the number of lines in each direction. Since the dominant pair of orthogonal directions has to be found here, each vector value is added to its π/2-shifted counterpart. The resulting function is then convolved with a 1D Gaussian function; the optimal angle of in-plane rotation is at the maximum value of the result. The described approach has been successfully evaluated on several building façades of varying morphology by assessing remaining line convergence (projectivity), skewness and deviations from horizontality/verticality. The mean estimated deviation from a metric result was 0.2°. Open questions are also discussed.
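The search for the in-plane rotation angle described above can be illustrated with a minimal sketch. It assumes that the Hough accumulator is already available as a 2D array indexed (ρ, θ), with an even number of θ bins uniformly covering [0, π). The function name, the parameter names and the default Gaussian width are illustrative assumptions and are not taken from the implementation evaluated here.

```python
import math


def dominant_orthogonal_angle(accumulator, min_votes, sigma_bins=2.0):
    """Estimate the dominant pair of orthogonal directions.

    accumulator[r][t] holds the number of edge pixels voting for the line
    (rho_r, theta_t); theta bins are assumed uniform over [0, pi).
    Returns the lower bin edge of the best direction, in radians,
    within [0, pi/2).
    """
    n_theta = len(accumulator[0])
    if n_theta % 2 != 0:
        raise ValueError("an even number of theta bins is assumed")

    # 1. Threshold the accumulator and sum along rho: each entry counts
    #    the accepted lines of that direction.
    direction = [0] * n_theta
    for row in accumulator:
        for t, votes in enumerate(row):
            if votes >= min_votes:
                direction[t] += 1

    # 2. Add every value to its pi/2-shifted counterpart, so that a
    #    horizontal/vertical pair reinforces a single bin.
    half = n_theta // 2
    folded = [direction[t] + direction[t + half] for t in range(half)]

    # 3. Convolve with a 1D Gaussian. The folded vector is periodic with
    #    period pi/2, hence the circular indexing.
    radius = max(1, math.ceil(3 * sigma_bins))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in offsets]
    norm = sum(kernel)
    smoothed = [
        sum(w * folded[(t + k) % half] for w, k in zip(kernel, offsets)) / norm
        for t in range(half)
    ]

    # 4. The rotation angle is taken at the maximum of the smoothed vector.
    best = max(range(half), key=smoothed.__getitem__)
    return best * math.pi / n_theta
```

With 180 θ bins the returned angle is resolved to 1°; a finer θ quantization or interpolation around the maximum would be needed to approach the sub-degree deviations reported above.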