The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume XLVIII-3-2024
https://doi.org/10.5194/isprs-archives-XLVIII-3-2024-519-2024
07 Nov 2024

Assessing the Generalization Capacity of Convolutional Neural Networks and Vision Transformers for Deforestation Detection in Tropical Biomes

Pedro J. Soto Vega, Daliana Lobo Torres, Gustavo X. Andrade-Miranda, Gilson A. O. P. da Costa, and Raul Queiroz Feitosa

Keywords: Deforestation Detection, Deep Learning, Convolutions, Transformers, Domain Shift

Abstract. Deep Learning (DL) models, such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have become popular for change detection tasks, including deforestation mapping. However, the domain shift problem has received too little attention: classification performance is affected when pre-trained models are applied to areas with different forest cover and deforestation practices. This study compares DL methods for deforestation detection, focusing on how well CNNs and ViTs cope with domain shift. Two models, DeepLabv3+ and UNETR, were trained on remote sensing images and reference data from one location and then tested at other sites to simulate real-world scenarios. The ViT-based architecture performed better when trained and tested in the same region but showed lower generalization capacity in cross-domain scenarios. We consider this a work in progress: confirming these findings requires further research that evaluates additional architectures on a wider range of domains.
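
The train-on-one-site, test-on-the-others protocol described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not taken from the paper: the site names, the synthetic one-dimensional features, the threshold "model" standing in for DeepLabv3+ or UNETR, and the choice of the F1 score as the metric.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Labels = Sequence[int]  # 1 = deforestation, 0 = no deforestation
Predictor = Callable[[Features], Labels]
Site = Tuple[Features, Labels]


def f1_score(y_true: Labels, y_pred: Labels) -> float:
    """F1 for the positive (deforestation) class; 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cross_domain_scores(
    sites: Dict[str, Site],
    fit: Callable[[Features, Labels], Predictor],
) -> Dict[Tuple[str, str], float]:
    """Train on each site, then score the model on every site."""
    scores = {}
    for source, (x_src, y_src) in sites.items():
        predict = fit(x_src, y_src)
        for target, (x_tgt, y_tgt) in sites.items():
            scores[(source, target)] = f1_score(y_tgt, predict(x_tgt))
    return scores


def fit_threshold(x: Features, y: Labels) -> Predictor:
    """Hypothetical stand-in model: threshold midway between class means."""
    pos = [v for v, label in zip(x, y) if label == 1]
    neg = [v for v, label in zip(x, y) if label == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda xs: [1 if v > threshold else 0 for v in xs]


# Synthetic toy data (not real imagery): the two sites have shifted
# feature distributions, which is what "domain shift" refers to.
toy_sites = {
    "site_A": ([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]),
    "site_B": ([0.3, 0.4, 0.45, 0.6], [0, 0, 1, 1]),
}

scores = cross_domain_scores(toy_sites, fit_threshold)
for (source, target), value in sorted(scores.items()):
    print(f"train={source} test={target} F1={value:.3f}")
```

Entries where the source and target sites match correspond to the same-region setting, and the remaining entries correspond to the cross-domain setting. In this toy data the model fitted on site_A loses accuracy on site_B because the threshold it learned does not transfer to the shifted feature distribution.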