Semantic-Consistent 3D Reconstruction via Gaussian Splatting and SAM-Guided Annotation
Keywords: Semantic Segmentation, Point Cloud, Segment Anything Model, 3D Gaussian Splatting, 2D-3D projection
Abstract. 3D Gaussian Splatting (3DGS) provides a novel paradigm for multi-view semantic scene reconstruction, offering explicit and fine-grained geometric representations. However, accurate semantic expression in reconstructed scenes remains challenging: existing methods that rely on multi-view 2D semantic projections inherently suffer from cross-view ambiguities at semantic boundaries and from label inconsistencies. These limitations arise from occlusions, illumination variations, and object deformations, which degrade the semantic fidelity of 3D reconstructions by introducing conflicting label assignments across viewpoints. This paper proposes a geometry-verified multi-view semantic scene reconstruction framework. First, a depth-aware projection aligns 2D semantic masks with the 3DGS-reconstructed point cloud and filters semantic ambiguities in multi-view annotations via a conflict-locking mechanism. Second, a geometry-aware semantic propagation model diffuses semantic labels globally by leveraging the local geometric continuity of the point cloud. Experiments demonstrate that the proposed framework achieves significantly better reconstruction consistency in occlusion-heavy scenes than conventional methods: it outperforms multi-view voting strategies by 12.39% in Overall Accuracy (OA) and 17.03% in mean Intersection over Union (mIoU) on the BeDOI-GB dataset. Project page: https://bigbigman233.github.io/SC3D.github.io/
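The depth-aware projection step described above can be illustrated with a minimal sketch: 3D points are projected into a camera view with a standard pinhole model, and a 2D mask label is kept only when the point's depth agrees with the view's depth map, so labels from occluded (and hence ambiguous) projections are discarded. This is an illustrative assumption about the mechanism, not the paper's actual implementation; all function and variable names here are hypothetical.

```python
import numpy as np

def project_labels(points, K, R, t, depth_map, mask, depth_tol=0.05):
    """Assign 2D mask labels to 3D points that pass a depth-consistency
    (visibility) check. Hypothetical sketch of depth-aware projection.

    points: (N, 3) world coordinates; K: (3, 3) intrinsics;
    R, t: world-to-camera rotation and translation;
    depth_map, mask: (H, W) per-view rendered depth and semantic labels.
    """
    # Transform world points into the camera frame.
    cam = (R @ points.T + t[:, None]).T          # (N, 3)
    z = cam[:, 2]
    uv = (K @ cam.T).T                           # homogeneous pixel coords
    u = uv[:, 0] / z
    v = uv[:, 1] / z

    h, w = depth_map.shape
    labels = np.full(len(points), -1, dtype=int)  # -1 = unlabeled
    ui = np.round(u).astype(int)
    vi = np.round(v).astype(int)
    in_view = (z > 0) & (ui >= 0) & (ui < w) & (vi >= 0) & (vi < h)
    idx = np.where(in_view)[0]

    # Keep a label only when the point's depth agrees with the depth map,
    # filtering occluded projections that would inject conflicting labels.
    consistent = np.abs(z[idx] - depth_map[vi[idx], ui[idx]]) < depth_tol
    keep = idx[consistent]
    labels[keep] = mask[vi[keep], ui[keep]]
    return labels
```

Under this sketch, a conflict-locking mechanism would then compare the per-view labels gathered for each point and lock or reject points whose views disagree, before geometry-aware propagation fills in the remaining unlabeled points.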
