Querying 3D point clouds by exploiting open-vocabulary semantic segmentation of images
Keywords: deep learning, point cloud, query, open-vocabulary
Abstract. While deep models have advanced 3D data analysis and demonstrated impressive results, they often struggle to generalize to classes that are absent from the training dataset. Recently, open-vocabulary and zero-shot models have addressed this problem. However, these models still rely on training data and task-specific fine-tuning, a requirement that limits their applicability in real-world scenarios. In this research, we propose an open-vocabulary method for point cloud segmentation that requires no training data beyond the images and point cloud of the surveyed scene. By combining the capabilities of 2D open-vocabulary models and geometric features of the 3D data with an XGBoost-guided region growing algorithm, our approach segments the queried objects directly in 3D scenes. We evaluate our method on the Replica and ScanNet 3D benchmark datasets, demonstrating its practicality and scalability to real-world scenarios with limited data.
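To make the core idea concrete, below is a minimal, hypothetical sketch of XGBoost-guided region growing, assuming per-point geometric features and seed points obtained by back-projecting 2D open-vocabulary masks into the point cloud. All names (`grow_region`, the feature layout, the synthetic pseudo-labels) are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: region growing where an XGBoost classifier,
# trained on pseudo-labels from back-projected 2D open-vocabulary masks,
# decides whether each neighboring point joins the queried region.
import numpy as np
from scipy.spatial import cKDTree
from xgboost import XGBClassifier


def grow_region(points, features, seed_idx, clf, k=8, threshold=0.5):
    """Expand from seed points; the classifier accepts or rejects neighbors."""
    tree = cKDTree(points)
    in_region = np.zeros(len(points), dtype=bool)
    in_region[seed_idx] = True
    frontier = list(seed_idx)
    while frontier:
        # Gather k nearest neighbors of the current frontier points.
        _, nbrs = tree.query(points[frontier], k=k)
        candidates = np.unique(nbrs.ravel())
        candidates = candidates[~in_region[candidates]]
        if candidates.size == 0:
            break
        # XGBoost scores each candidate on its geometric features.
        prob = clf.predict_proba(features[candidates])[:, 1]
        accepted = candidates[prob >= threshold]
        in_region[accepted] = True
        frontier = accepted.tolist()
    return in_region


# Toy usage with synthetic data: 1000 points with 6 geometric features
# each (e.g., normals and curvature). In the actual pipeline, the seeds
# and pseudo-labels would come from 2D open-vocabulary segmentation masks.
rng = np.random.default_rng(0)
points = rng.random((1000, 3))
features = rng.random((1000, 6))
pseudo_labels = (points[:, 0] < 0.3).astype(int)  # stand-in for mask labels
clf = XGBClassifier(n_estimators=50, max_depth=4, eval_metric="logloss")
clf.fit(features, pseudo_labels)
seeds = np.flatnonzero(pseudo_labels)[:10]
mask = grow_region(points, features, seeds, clf)
print(f"segmented {mask.sum()} of {len(points)} points")
```

The design choice sketched here is that the 2D model supplies only sparse, view-dependent evidence (the seeds), while the learned classifier on 3D geometric features propagates the query to points never confidently covered by any image, which is what removes the need for additional training data.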