ISPRS-Archives

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences

Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.

2194-9034

Copernicus Publications

Göttingen, Germany

10.5194/isprs-archives-XLVIII-1-W5-2025-1-2025

A novel CAD-aided coarse-to-fine framework of RGBD-to-point clouds registration

Mengchi

Mohamed Elhabiby

Naser El-Sheimy

Micro Engineering Tech. Inc., Calgary, AB, Canada

Dept. of Geomatics Engineering, University of Calgary, 2500 University Dr NW, Calgary, AB T2N 1N4, Canada

04 11 2025

Volume XLVIII-1/W5-2025, pp. 1-6

2025

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

This article is available from https://isprs-archives.copernicus.org/articles/XLVIII-1-W5-2025/1/2025/isprs-archives-XLVIII-1-W5-2025-1-2025.html

The full text article is available as a PDF file from https://isprs-archives.copernicus.org/articles/XLVIII-1-W5-2025/1/2025/isprs-archives-XLVIII-1-W5-2025-1-2025.pdf

Accurate registration between RGB-D images and point clouds is a critical task for various indoor applications. Estimating the relative pose by aligning the sensor frame with indoor 3D point clouds significantly enhances environmental perception and scene understanding. Existing research primarily focuses on cross-modal feature association through traditional unsupervised methods or supervised learning-based approaches. However, these methods often rely on strong assumptions, such as the availability of an initial pose or substantial overlap between the RGB-D images and the target point clouds. Moreover, registration quality is highly sensitive to the density and completeness of the point clouds. To address these limitations, this paper presents a novel coarse-to-fine registration framework aided by CAD models. First, a data enhancement step uses the Scan2CAD method to replace functional objects (e.g., chairs and tables) with CAD models, improving semantic and quality consistency. Second, geometry-aware graph matching identifies regions of interest (ROIs) within the point cloud map and estimates the initial pose of the RGB-D sensor. Finally, an iterative cross-modal fine matching step refines the initial pose estimate. Experimental validation on the ScanNet dataset demonstrates that the proposed framework achieves robust and accurate registration between RGB-D images and 3D point clouds.
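To make the coarse-to-fine idea concrete, the following is a minimal sketch of how an initial pose can be refined iteratively once a coarse estimate is available. It is not the authors' cross-modal matching: it substitutes plain point-to-point ICP with a Kabsch (SVD) alignment, and the function names `backproject_depth`, `kabsch`, and `icp_refine`, as well as the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, are assumptions introduced for illustration only.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Turn a depth image into an N x 3 point array in the camera frame.

    Assumes a pinhole camera model; pixels with depth <= 0 are skipped.
    """
    v, u = np.nonzero(depth > 0)          # row (v) and column (u) indices
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t.

    src and dst are N x 3 arrays with row-wise correspondences.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)   # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp_refine(src, dst, R, t, iterations=20):
    """Refine a coarse pose (R, t) with point-to-point ICP.

    Uses brute-force nearest neighbours, so it is only suitable for
    small point sets; a k-d tree would be needed for real scans.
    """
    for _ in range(iterations):
        moved = src @ R.T + t
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = dst[d2.argmin(axis=1)]  # closest map point per source point
        R, t = kabsch(src, nearest)
    return R, t
```

In this sketch the coarse stage is represented only by the initial `R` and `t` passed to `icp_refine`; in the framework described above that estimate would come from the graph matching step, and the refinement would operate on cross-modal correspondences rather than raw nearest neighbours.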