Using the Segment Anything Model to Develop a Pallet Loading Control System
Keywords: Segmentation, Automatic Machine Learning, Computer Vision, Pallet Loading, Segment Anything Model
Abstract. Modern warehouse complexes need efficient and accurate pallet loading control under highly dynamic conditions and with a wide variety of objects. This paper proposes an approach to this problem based on the Segment Anything Model (SAM) for automatic image annotation and the YOLOv8 model for subsequent accurate segmentation. This combination provides both high processing speed and adaptability to changing lighting conditions, partial occlusions, and complex object geometry. The proposed algorithm tracks changes in the area of the segmented regions to detect the addition of new cargo. The experiments show that YOLOv8 provides the best balance between accuracy and performance (Dice = 0.88), outperforming Mask R-CNN and the newer YOLOv12. The paper also analyzes the models' robustness to noise and visual distortions. The presented solution has potential for integration into next-generation industrial logistics systems, reducing the need for manual annotation and increasing the autonomy of loading control.
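The two quantities named in the abstract, the Dice coefficient and the change in segmented area, can be illustrated with a minimal sketch. It assumes binary masks stored as NumPy arrays; the function names `dice` and `cargo_added` and the 5% relative growth threshold are hypothetical choices made for illustration and are not taken from the paper's implementation.

```python
import numpy as np


def dice(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        # Both masks are empty: treat them as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / total


def cargo_added(prev_mask, curr_mask, min_rel_growth=0.05):
    """Flag new cargo when the segmented area grows by at least
    min_rel_growth (an assumed threshold) relative to the previous frame."""
    prev_area = int(np.count_nonzero(prev_mask))
    curr_area = int(np.count_nonzero(curr_mask))
    if prev_area == 0:
        # Nothing was segmented before: any segmented pixel counts as new cargo.
        return curr_area > 0
    return (curr_area - prev_area) / prev_area >= min_rel_growth
```

In practice the masks would come from the segmentation model's output for consecutive frames, and the threshold would need tuning against segmentation noise so that small frame-to-frame fluctuations in the mask are not reported as new cargo.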