How Effective Are Foundation Models for Crop Type Mapping Using Hyperspectral Imaging? A Comparative Study of Machine Learning, Deep Learning, and Geospatial Foundation Models
Keywords: Hyperspectral Imaging, Crop-Type Mapping, Machine Learning, Deep Learning, Geospatial Foundation Models
Abstract. Accurate and precise information on cultivated crop types is essential for studies related to food security, crop yield prediction, and yield gap analysis. Crop type mapping using remote sensing plays a crucial role in these applications, with multi-spectral imagery (MSI) widely employed alongside machine learning (ML) and deep learning (DL) methods. However, multi-spectral sensors often fail to differentiate crops with similar spectral signatures, whereas hyperspectral imaging (HSI) enables more precise discrimination thanks to its high spectral resolution. Additionally, ML and DL algorithms often struggle to generalize in data-scarce scenarios due to their reliance on extensive labeled ground truth data. To address these challenges, geospatial foundation models (GFMs), i.e., very large deep learning models trained on large-scale datasets, have emerged as a promising alternative, using self-supervised learning (SSL) to improve classification in low-label settings. This study evaluates the performance of traditional machine learning algorithms, including Support Vector Machines (SVM) and Random Forests (RF); deep learning models such as Convolutional Neural Networks (CNN) and HybridSN; and GFMs, specifically HyperSIGMA and Prithvi-EO-1.0, on the Indian Pines benchmark dataset, a widely used hyperspectral dataset for agricultural land cover classification. The key novelty of this work is the adaptation of Prithvi-EO-1.0, a multi-spectral foundation model, to HSI. The models were tested across four scenarios with progressively reduced training data, and their performance was evaluated using Overall Accuracy (OA) and the Kappa (K) coefficient to analyze their generalization capabilities. The results indicate that HybridSN achieved the highest accuracy in most scenarios, with OA reaching up to 99.8%, demonstrating its ability to capture spatial-spectral relationships.
HyperSIGMA, a vision transformer-based foundation model for HSI analysis, outperformed all models when trained on only 1% of the labeled data, highlighting the advantage of self-supervised learning in low-label scenarios. Furthermore, the adaptation of Prithvi-EO-1.0 to hyperspectral data achieved an OA of up to 97%, demonstrating that multi-spectral foundation models can be successfully adapted to HSI with appropriate fine-tuning and optimization techniques. These findings offer key insights into the conditions under which GFMs outperform traditional ML and DL approaches, particularly in overcoming data limitations for agricultural applications. This research paves the way for advancing large-scale crop-type mapping using HSI through the application of GFMs.
