Pushing the limit to near real-time indoor LiDAR-based semantic segmentation
Keywords: Semantic Segmentation, Transformers, Door and Window Detection, 3D Building Understanding, UAV, LiDAR
Abstract. Semantic segmentation of indoor 3D point clouds is a critical technology for understanding three-dimensional indoor environments, with significant applications in indoor navigation, positioning, and intelligent robotics. While real-time semantic segmentation is already a reality for images, existing classification pipelines for LiDAR point clouds assume a pre-existing map, which relies on data collected from accurate but heavy sensors. This assumption is impractical for high-level task planning and autonomous exploration, which benefit from a rapid understanding of the 3D structure of the environment. Furthermore, while RGB cameras remain a popular choice in good visibility conditions, they are ineffective in environments where visibility is hindered. In such circumstances, LiDAR point clouds emerge as a more reliable source of environmental information. In this paper, we adapt an existing semantic segmentation model, Superpoint Transformer, to a LiDAR-only setting in which RGB inputs are not available and near real-time processing is targeted. To this end, we simulated our robot’s trajectory and applied Hidden Point Removal to the open-source S3DIS dataset to train the model. We investigated various strategies, such as modifying the prediction interval, and thoroughly studied their influence on the predictions. Our model improves the mean Intersection over Union (mIoU) from 40 to 67.6 compared to the baseline on simple (floor, ceiling, walls) and complex (doors, windows) classes.
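For readers unfamiliar with the reported metric, the following is a minimal sketch of how a per-class IoU and its mean can be computed from point-wise labels. It is not the evaluation code used in this work: the function name `mean_iou`, the integer class ids, the toy labels, and the choice to report the score in percent and to skip classes absent from both ground truth and prediction are all assumptions made for illustration.

```python
from typing import Sequence


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int], num_classes: int) -> float:
    """Return the mean IoU (in percent) over classes present in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    intersection = [0] * num_classes  # points where prediction and label agree
    true_count = [0] * num_classes    # points labelled as class c
    pred_count = [0] * num_classes    # points predicted as class c

    for t, p in zip(y_true, y_pred):
        true_count[t] += 1
        pred_count[p] += 1
        if t == p:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = true_count[c] + pred_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in either input")
    return 100.0 * sum(ious) / len(ious)


# Hypothetical class ids: 0=floor, 1=ceiling, 2=wall, 3=door, 4=window
y_true = [0, 0, 1, 1, 2, 2, 3, 4]
y_pred = [0, 0, 1, 2, 2, 2, 3, 3]
score = mean_iou(y_true, y_pred, num_classes=5)
print(f"mIoU = {score:.2f}")
```

In this toy example the window class is never predicted, so its IoU of zero pulls the mean down, which illustrates why the complex classes (doors, windows) weigh heavily on the overall score.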