AI-Based 3D Vision System for Inspection and Monitoring
Keywords: Crack Detection, Monocular Depth Estimation, Stereo Matching, Visual-SLAM, YOLO, Inspection, Monitoring
Abstract. Simultaneous Localization and Mapping (SLAM) has become a fundamental technology in applications such as robotics, autonomous navigation, geographic information systems (GIS), and infrastructure inspection. This paper presents a new version of GuPho, a low-cost, lightweight, and portable visual SLAM-based system equipped with AI-driven capabilities for real-time mapping, object detection, and defect analysis. The system integrates stereo vision and deep learning (DL) methods to enhance spatial understanding and enable accurate real-time scene interpretation. In particular, we explore DL-based semantic segmentation, monocular depth estimation (MDE), and stereo depth estimation to improve 3D reconstruction and the size measurement of cracks for infrastructure monitoring. We implement state-of-the-art neural networks: RF-DETR and YOLO for real-time crack and window segmentation, and Depth Anything V2, Depth Pro, and Unimatch for depth estimation. Our results demonstrate the potential of GuPho as an affordable and efficient system for real-time mobile mapping and defect assessment. The real-time and AI capabilities of our in-house solution are showcased here: https://youtu.be/ATIwn4zOSFw
