EVALUATING CONVNET AND TRANSFORMER BASED SELF-SUPERVISED ALGORITHMS FOR BUILDING ROOF FORM CLASSIFICATION
Keywords: Roof-form classification, Self-supervised learning, SimCLR, MoCo, ConvNets, Vision transformers, BYOL, BEiT
Abstract. This research paper presents a comprehensive evaluation of various self-supervised learning models for building roof type classification. We conduct linear evaluation experiments for the models pretrained on both the ImageNet1K dataset and a custom building roof type dataset to assess the models’ performance for the roof type classification task. The results demonstrate the effectiveness of the ViT-based BEiTV2 model, which outperforms other models on both datasets, achieving an accuracy of 96.8% from the model pretrained on ImageNet1K dataset and 92.67% on the model pretrained on building roof type dataset. The class activation maps further validate the strong performance of MoCoV3, BarlowTwins, and DenseCL models. These findings emphasize the potential of self-supervised learning for accurate building roof type classification, with the ViT-based BEiTV2 model showcasing state-of-the-art results.