Automatic Training Sample Selection for a Multi-Evidence Based Crop Classification Approach
Keywords: Crop, Classification, Training, Vector, Neural, Satellite, Imagery, Learning
Abstract. An approach to use the available agricultural parcel information to automatically select training samples for crop classification is investigated. Previous research addressed the multi-evidence crop classification approach using an ensemble classifier. This first produced confidence measures using three Multi-Layer Perceptron (MLP) neural networks trained separately with spectral, texture and vegetation indices; classification labels were then assigned based on Endorsement Theory. The present study proposes an approach to feed this ensemble classifier with automatically selected training samples. The available vector data representing crop boundaries with corresponding crop codes are used as a source for training samples. These vector data are created by farmers to support subsidy claims and are, therefore, prone to errors such as mislabeling of crop codes and boundary digitization errors. The proposed approach is named as ECRA (Ensemble based Cluster Refinement Approach). ECRA first automatically removes mislabeled samples and then selects the refined training samples in an iterative training-reclassification scheme. Mislabel removal is based on the expectation that mislabels in each class will be far from cluster centroid. However, this must be a soft constraint, especially when working with a hypothesis space that does not contain a good approximation of the targets classes. Difficulty in finding a good approximation often exists either due to less informative data or a large hypothesis space. Thus this approach uses the spectral, texture and indices domains in an ensemble framework to iteratively remove the mislabeled pixels from the crop clusters declared by the farmers. Once the clusters are refined, the selected border samples are used for final learning and the unknown samples are classified using the multi-evidence approach. The study is implemented with WorldView-2 multispectral imagery acquired for a study area containing 10 crop classes. The proposed approach is compared with the multi-evidence approach based on training samples selected randomly and border samples based on initial cluster centroids within agricultural parcels without any refinement. The results clarify the improvement in overall classification accuracy to 82.3 % based on the proposed approach from 74.9 % based on random selection and 71.4% on non-refined border samples.