IMAGE ORIENTATION BY EMBEDDING IN A GAN LATENT SPACE
Keywords: exterior orientation, deep learning, generative models
Abstract. Estimation of an image's exterior orientation is required in multiple tasks, including 3D reconstruction, autonomous driving and navigation, and single-photo 3D reconstruction. The problem can be easily solved if some reference points (keypoints) with known coordinates in the reference frame are detected in the image. While multiple robust keypoint detectors have been developed, estimation of exterior orientation from a single image remains challenging in many cases. For example, repeating structures in the scene or the absence of textures can reduce the performance of keypoint detectors. In this paper, we propose an algorithm for estimation of an image's exterior orientation that leverages the latent space of a Generative Adversarial Network (GAN). We propose a modification of the StyleGAN2 model that we term ExteriorGAN. Unlike StyleGAN2, which generates random images from a random noise vector z, we aim to train a mapping G : (z, p) → A from a random vector z and a given image exterior orientation p to an image A. For a constant exterior orientation p and random z, our model generates random images that have constant geometry but differ in scene appearance (e.g., different light direction or intensity). We embed the image A into the latent space w and reconstruct the input noise vector z and the exterior orientation parameters p using stochastic gradient descent. We developed a dedicated dataset of 50k images with corresponding orientation parameters to train and validate our ExteriorGAN model. The evaluation results demonstrate that our algorithm allows estimation of the exterior orientation of an image with respect to a known 3D scene. The accuracy of the estimated exterior orientation is comparable with modern state-of-the-art methods. The camera pose can be recovered with a mean error of 50 mm for a working space of 5 by 5 meters.
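The embedding step can be illustrated with a minimal sketch: given an image A, search for the pair (z, p) that minimizes the squared reconstruction error ||G(z, p) - A||^2 by gradient descent. The code below is an illustration under strong assumptions, not the ExteriorGAN implementation. The generator is a hypothetical fixed linear map (`WEIGHTS`) with a 2-D appearance code z and a 2-D pose code p, the "image" is a vector of six values, and plain full-batch gradient descent with a hand-written gradient stands in for the stochastic gradient descent used with the real network. The names `generate` and `embed` are likewise illustrative.

```python
# Toy linear "generator": each row maps the latent (z1, z2, p1, p2) to one pixel.
# Hypothetical stand-in for illustration only, not the ExteriorGAN network.
WEIGHTS = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
Z_DIM = 2  # first two latent entries are z, the remaining two are p


def generate(z, p):
    """Toy G(z, p): render a 6-pixel 'image' from appearance z and pose p."""
    latent = list(z) + list(p)
    return [sum(w * x for w, x in zip(row, latent)) for row in WEIGHTS]


def embed(image, steps=500, lr=0.1):
    """Recover (z, p) by gradient descent on 0.5 * ||G(z, p) - image||^2."""
    n_latent = len(WEIGHTS[0])
    latent = [0.0] * n_latent  # start the search from the origin
    for _ in range(steps):
        rendered = generate(latent[:Z_DIM], latent[Z_DIM:])
        residual = [r - a for r, a in zip(rendered, image)]
        # For a linear generator the gradient is W^T * residual.
        grad = [
            sum(WEIGHTS[i][j] * residual[i] for i in range(len(WEIGHTS)))
            for j in range(n_latent)
        ]
        latent = [x - lr * g for x, g in zip(latent, grad)]
    return latent[:Z_DIM], latent[Z_DIM:]


# Render a target image from known codes, then recover them from the image alone.
true_z, true_p = [0.3, -0.2], [1.5, 2.0]
target = generate(true_z, true_p)
est_z, est_p = embed(target)
print("estimated z:", est_z)
print("estimated p:", est_p)
```

In this toy setting the problem is an ordinary least-squares fit, so the descent converges to the true codes. With a deep generator the objective is non-convex, and the quality of the recovered pose would depend on the initialization and the optimizer settings.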