The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Copernicus Publications
Articles | Volume XLVIII-2/W8-2024
https://doi.org/10.5194/isprs-archives-XLVIII-2-W8-2024-31-2024
14 Dec 2024

A handheld stereo vision and LiDAR system for outdoor dense RGB-D mapping using depth map completion based on learned priors

Michael Bleier, Yijun Yuan, and Andreas Nüchter

Keywords: 3D Mapping, LiDAR, Handheld Mapping System, Depth Map Completion, Dense RGB-D, Learned Depth Covariance Function

Abstract. This paper proposes a handheld mapping system that combines a stereo camera setup with a low-cost automotive LiDAR. The prototype supports the collection and processing of mobile monocular, stereo vision, and LiDAR data. Capturing dense RGB-D data outdoors with low-cost sensors is challenging, especially when low latency is required. Readily available commercial RGB-D sensors are typically limited to a range of less than 10 m, which is too short to capture large outdoor structures. Currently available low-cost automotive LiDAR scanners offer a suitable range but provide only sparse data. To enable low-latency dense RGB-D scans, we augment the sparse LiDAR data with the RGB data stream using learned models. We estimate depth from a single monocular image and correct its scale using learned priors and sparse automotive LiDAR scans. Using the laser scan data, accurate metric information is incorporated directly into the scale estimation stage. For validation, the learning-based depth map completion is compared to traditional LiDAR mapping using scan matching on an outdoor data set acquired with the proposed handheld system. Although the model-based regression of the sparse LiDAR data produces significantly less accurate results in our experiments, it computes dense RGB-D data from a single sparse 3D scan and a monocular RGB image with low latency.
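
As a rough illustration of how sparse metric LiDAR depths can fix the scale of a monocular depth prediction, the sketch below fits a single global scale and offset by least squares. This is a deliberate simplification and not the method of the paper, which relies on learned priors (a learned depth covariance function). The function name `align_depth_to_lidar`, the array layout, and the validity mask are assumptions made for this example only.

```python
import numpy as np


def align_depth_to_lidar(mono_depth, lidar_depth, valid_mask):
    """Fit scale s and offset t so that s * mono_depth + t matches the LiDAR.

    mono_depth  : (H, W) relative depth predicted from a single RGB image
    lidar_depth : (H, W) metric LiDAR depth projected into the image,
                  only meaningful where valid_mask is True
    valid_mask  : (H, W) boolean array marking pixels hit by a LiDAR return

    Hypothetical helper for illustration; a global affine fit is assumed
    here in place of the learned priors described in the abstract.
    """
    d = mono_depth[valid_mask].astype(np.float64)
    z = lidar_depth[valid_mask].astype(np.float64)
    if d.size < 2:
        raise ValueError("need at least two LiDAR samples to fit scale and offset")

    # Solve min over (s, t) of sum_i (s * d_i + t - z_i)^2.
    design = np.stack([d, np.ones_like(d)], axis=1)
    (scale, offset), *_ = np.linalg.lstsq(design, z, rcond=None)

    # Apply the fitted correction to every pixel to obtain a dense metric map.
    return scale * mono_depth + offset, scale, offset
```

A global fit like this cannot correct depth errors that vary across the image, which is one reason a spatially varying, learned model is attractive for depth map completion.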