Investigating Visual Localization Using Geospatial Meshes
Keywords: Visual Localization, Image Orientation, Mesh, Aerial Image, Smartphone Image, Image Descriptors, GNSS, Navigation
Abstract. This paper investigates the use of geospatial mesh data for visual localization, focusing on city-scale aerial meshes as map representations for locating ground-level query images captured by smartphones. Visual localization, which is essential for applications such as robotics and augmented reality, traditionally relies on Structure-from-Motion (SfM) reconstructions or image collections as maps. Meshes, by contrast, offer a dense spatial representation, memory efficiency, and real-time rendering capabilities. In this work, we evaluate initialization strategies, image matching techniques, and pose refinement methods for mesh-based localization pipelines, and we compare traditional and deep-learning-based techniques for matching real images against synthetic views rendered from the mesh. To test cross-modal localization, we created a dataset that combines nadir and oblique aerial imagery with accurately georeferenced smartphone images. Our findings demonstrate that combining global feature retrieval with GNSS-based spatial filtering significantly improves both accuracy and efficiency, achieving submeter positional and subdegree rotational errors. This study advances scalable visual localization using meshes and highlights the potential of integrating smartphone GNSS data for improved performance in urban environments.
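The combination of global feature retrieval with GNSS-based spatial filtering named in the abstract can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper. The function names, the 30 m search radius, the local metric east/north coordinates, the toy three-dimensional descriptors, and the cosine-similarity ranking are all assumptions chosen for illustration. The sketch first discards rendered views that lie farther from the query's GNSS position than the radius, then ranks the remaining views by descriptor similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_candidates(query_desc, query_xy, views, radius_m=30.0, top_k=3):
    """Return ids of the top_k most similar views within radius_m of the query.

    query_xy and each view's "xy" are hypothetical (east, north) positions in
    metres in a shared local frame; radius_m is an assumed GNSS uncertainty.
    """
    # Step 1: spatial filter using the (noisy) GNSS position of the query.
    nearby = [v for v in views if math.dist(query_xy, v["xy"]) <= radius_m]
    # Step 2: rank the surviving views by global descriptor similarity.
    ranked = sorted(
        nearby,
        key=lambda v: cosine_similarity(query_desc, v["desc"]),
        reverse=True,
    )
    return [v["id"] for v in ranked[:top_k]]


# Toy data: view_c has the same descriptor as view_b but lies far away,
# so the spatial filter removes it before descriptors are compared.
views = [
    {"id": "view_a", "xy": (0.0, 0.0), "desc": [1.0, 0.0, 0.0]},
    {"id": "view_b", "xy": (10.0, 5.0), "desc": [0.6, 0.8, 0.0]},
    {"id": "view_c", "xy": (500.0, 500.0), "desc": [0.6, 0.8, 0.0]},
]
result = retrieve_candidates([0.5, 0.85, 0.0], (8.0, 4.0), views, top_k=2)
print(result)
```

In this toy setup the distant view is never scored, which reflects the efficiency argument: the spatial filter shrinks the candidate set before any descriptor comparison, and it also removes visually similar but geographically implausible views.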