REGIONAL SCALE CROP MAPPING USING MULTI-TEMPORAL SATELLITE IMAGERY

One of the problems in dealing with optical images for large territories (more than 10,000 sq. km) is the presence of clouds and shadows that result in having missing values in data sets. In this paper, a new approach to classification of multi-temporal optical satellite imagery with missing data due to clouds and shadows is proposed. First, self-organizing Kohonen maps (SOMs) are used to restore missing pixel values in a time series of satellite imagery. SOMs are trained for each spectral band separately using nonmissing values. Missing values are restored through a special procedure that substitutes input sample's missing components with neuron's weight coefficients. After missing data restoration, a supervised classification is performed for multi-temporal satellite images. An ensemble of neural networks, in particular multilayer perceptrons (MLPs), is proposed. Ensembling of neural networks is done by the technique of average committee, i.e. to calculate the average class probability over classifiers and select the class with the highest average posterior probability for the given input sample. The proposed approach is applied for regional scale crop classification using multi temporal Landsat-8 images for the JECAM test site in Ukraine in 2013. It is shown that ensemble of MLPs provides better performance than a single neural network in terms of overall classification accuracy, kappa coefficient, and producer's and user's accuracies for separate classes. The overall accuracy more than 85% is achieved. The obtained classification map is also validated through estimated crop areas and comparison to official statistics. * Corresponding author.


INTRODUCTION
Geographical location and distribution of crops at global, national and regional scale is an extremely valuable source of information for many applications.Reliable crop maps can be used for more accurate agriculture statistics estimation (Gallego et al., 2010(Gallego et al., , 2013(Gallego et al., , 2014)), stratification purposes (Boryan and Zhengwei, 2013), better crop yield prediction (Becker-Reshef et al., 2010;Kogan et al., 2013aKogan et al., , 2013b)).
Remote sensing images from space have always been an obvious and promising source of information for deriving crop maps.This is mainly due capabilities to timely acquire images and provide repeatable, continuous, human independent measurements for large territories.Yet, there are no globally available satellite-derived crop specific maps at present moment.Only coarse-resolution imagery (at least 250 m spatial resolution) has been utilized to derive global cropland extent (e.g.GlobCover, MODIS).Nevertheless, even these maps provide variable quality and reliability in capturing cropland (Fritz et al., 2013).With availability of Landsat-8 and Sentinel-2 images and their synergic exploitation (Roy et al., 2014), it becomes possible to generate crop specific maps at high spatial resolution scale for main agriculture regions.
It should be however noted that most studies on crop mapping using high and medium resolution satellite imagery (e.g.Landsat-5/7, SPOT, AWiFS) have been carried out at local scale (Conrad et al., 2010;Peña-Barragán et al., 2011;Yang et al., 2011).One of the exceptions is the creation of the Cropland Data Layer (CDL) of the US Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) (Boryan et al., 2011).The CDL product provides crop maps for 47 states at 56 m spatial resolution.Another effort to create a global cropland product based on Landsat TM and ETM+ is performed by Yu et al. (2013aYu et al. ( , 2013b)).Yu et al. (2013b) created a 30 m global land cover product called FROM-GLC (Fine Resolution Observation and Monitoring of Global Land Cover).Producer's accuracy (PA) and user's accuracy (UA) for cropland class were 75.25% and 55.62%, respectively, which is below the target of 85% for agriculture applications (McNairn et al., 2009).
One of the main issues in utilizing optical imagery is the presence of clouds and shadows that introduce missing values.At local scale, it is usually possible to acquire cloud-free images in the crucial period of vegetation cycle.However, this is not the case for large territories.That is why, most of the existing studies on large scale crop mapping use high-and mediumresolution cloud-free optical images coupled with weatherindependent synthetic-aperture radar (SAR) (McNairn et al., 2009) or use coarse-resolution imagery at high temporal resolution (Pittman et al., 2010;Wardlow and Egbert, 2008).In order to deal with missing data in optical satellite imagery, a number of approaches have been proposed.On of the most popular approach is compositing.Yan and Roy (2014) utilize a 30 m Web Enabled Landsat data (WELD) time series to derive cropland and agriculture crop field boundaries.The WELD is based on compositing Landsat ETM+ images with cloud cover <80% within 150 × 150 km tiles on weekly, monthly, seasonal, and annual basis.However, missing value can still happen in composite products.Another popular approach is related to fill in missing data with different techniques such as multi-spectral and multi-temporal.Roy et al. (2014) utilize course resolution MODIS data for filling gaps and predicting Landsat data.Yu et al. (2013b) improve the 30 m FROM-GLC global land cover map based on Landsat TM and ETM+ imagery by adding coarse resolution MODIS imagery.It allows them to increase overall accuracy from 64.89% to 67.08%.Chen et al. (2011) propose a neighbourhood similar pixel interpolator (NSPI) for filling gaps in Landsat ETM+ SLC-off images.Latif et al. (2008) propose self-organizing Kohonen maps (SOMs) for reconstructing missing values in a time-series of low-resolution satellite imagery.It should be however noted that only few studies on filling in techniques assessed their efficiency on generating dedicated products, for example land cover maps (Chen et al., 2011).Moré et al. (2006) propose a hybrid classifier to dealing with missing data in a time-series of Landsat imagery.First, unsupervised classification for different combinations of input data is performed based on clustering algorithm IsoMM.Then, an algorithm called ClsMix is run to assign every spectral class to a thematic class through training areas defined by the user.The proposed approach achieves overall accuracy of 88.6% comparing to 67.2% obtained by the maximum likelihood (ML) classifier.
No previous studies used restored missing data from high-and medium resolution satellite imagery (such as Landsat-8) to provide crop classification and mapping for large areas.In this paper, a new approach to classification of multi-temporal Landsat-8 imagery with missing data due to clouds and shadows is presented.The approach combines different neural networks (NNs) architectures to restore missing values in a time-series of satellite imagery and provide supervised classification for crop discrimination.Results are presented for the Joint Experiment of Crop Assessment and Monitoring (JECAM) test site in Ukraine with the area of more than 28,000 km 2 (Gallego et al., 2014;Shelestov et al., 2013).The resulting classification map from Landsat-8 imagery is produced, and derived crop area estimates are compared to official statistics.To our best knowledge, the obtained crop map is one of the first ones produced at regional scale using new Landsat-8 images.

Restoration of missing data in satellite images
SOM is a type of artificial neural network that is trained using unsupervised learning to produce a discretised representation of the input space of the training samples, called a map (Kohonen, 1995).The map seeks to preserve the topological properties of the input space.SOM is formed of the neurons located on a regular, usually one-or two-dimensional grid.Neurons compete with each other in order to pass to the "excited" state.The output of the map is, so called, neuron-winner or best-matching unit (BMU) whose weight vector has the greatest similarity with the input sample where i(x) = SOM output, i.e. the number of BMU x = an input vector L = a number of neurons in the output grid w l is a vector of weight coefficients for neuron l • means metric (e.g.Euclidean) It should be noted that dimension of weight vectors w l is identical to dimension of the input vectors x. Figure 1 shows a general procedure for restoration of missing values in a timeseries of data sets.The reconstruction of satellite images is performed for each spectral band separately, i.e. a separate SOM is trained for each spectral band.Pixels that have no missing values in the time-series are selected for training.
Selecting the number of training pixels represents a trade-off, in particular increasing the number of training samples will lead to the increased time of SOM training while increasing the quality of restoration.Also, training data sets should be selected automatically.As such, we propose to select training samples on a regular grid of pixels.Therefore, the SOM seeks to project a large number of non-missing data to the subspace vectors in the map.
Figure 1.A procedure to restore missing values in input data using SOM Restoration of missing values is performed in the following way (Figure 1).The multi-temporal pixel values with missing components are input to the SOM.A neuron-winner in the SOM is selected following Eq.( 1).It is worth noting, however, that missing values are omitted from metric estimation when selecting BMU, i.e.only components with valid values in the input vector are used.When the BMU is selected, missing values are substituted by corresponding components of the BMU weight values.Detailed description of the algorithm and its performance evaluation is described in (Skakun and Basarab, 2014).

Committee of neural networks for image classification
Support vector machine (SVM), decision tree (DT) and RF classifiers have been probably the most popular ones for remote sensing image classification in the past years (Boryan et al., 2011;McNairn et al., 2009;Pittman et al., 2010;Shao and Lunetta, 2012;Wardlow and Egbert, 2008).Many papers report better performance of SVM, DT and RF comparing to other techniques, including MLP (McNairn et al., 2009).However, some other studies show MLP to outperform SVM and DT (Gallego et al., 2012(Gallego et al., , 2014)).Though the MLP training phase might be resource and time consuming (but this is becoming less problematic with the use of high-performance computations (Kravchenko et al., 2008;Kussul et al., 2009Kussul et al., , 2010aKussul et al., , 2010bKussul et al., , 2012;;Shelestov et al., 2006;Shelestov and Kussul, 2008)), and might require experience from the user, it has several advantages over SVM and DT.In particular, MLP is fast at processing new data which can be critical to the processing of large volumes of satellite data, and can produce probabilistic outputs which can be used for indicating reliability of the map.In many cases, in our opinion, not a full potential of MLP has been explored.In particular, cost function for MLP training is usually considered square (e.g.root mean square error -RMSE) in remote sensing literature while it had been shown that cross-entropy (CE) error function provides better performance in terms of speed of training and classification accuracy (Bishop, 2006;Meier et al., 2011;Simard et al. 2003).Another potential is to explore a committee of neural networks since the committee of classifiers tends to outperform the single classifier (Zhang and Xie, 2014).
Therefore, an MLP classifier is used as a basic one in this study for classification of restored multi-temporal satellite imagery.
The MLP classifier has a hyperbolic tangent activation function for neurons in the hidden layer and logistic activation function in the output layer.The CE error function is defined using the following equation (Bishop, 2006) where E(w) = CE error function that depends on the neurons' weight coefficients w T = set of vectors of target outputs in the training set composed of N samples K = number of classes t nk and y nk = target and MLP outputs, respectively In the target output for class k, all components of vector t n are set to 0, except for the k-th component which is set to 1.The CE error E(w) is minimized by means of the scaled conjugate gradient algorithm by varying weight coefficients w (Bishop, 2006).
A committee of MLPs is used to increase performance of individual classifiers.Two approaches to forming the committee are evaluated in this study.Both these approaches are modifications of the bagging technique (Bishop, 2006).Within the first approach, committee is formed using MLPs trained on different data sets.Within the second approach, committee is formed using MLPs with different parameters trained on the same training data.These approaches are quite simple, noncomputation intensive and proved to be efficient for other applications (Meier et al., 2011).
Outputs from different MLPs are integrated using the technique of average committee (Meier et al., 2011).Under this technique the average class probability over classifiers is calculated, and the class with the highest average posterior probability for the given input sample is selected (Figure 3).The following equation formalizes this procedure where k* = class to which the committee of classifiers assigns the input sample The average committee procedure has advantage over majority voting technique in two aspects: (i) it gives probabilistic output which can be used as an indicator of reliability for mapping particular pixel or area; (ii) it does not have ambiguity when two or more classes give the same number of "votes".
Classification of satellite images is performed on a per-pixel basis.Though, it was previously reported that per-field classification often outperforms per-pixel classification, it requires availability of accurate field boundaries.Unfortunately, field boundaries for most regions of Ukraine are not available at present moment, and therefore, it complicates the use of perfield classification in the operational context (McNairn et al., 2009).

STUDY AREA DESCRIPTION
The proposed methodology is evaluated for the JECAM test site in Ukraine.The JECAM test site in Ukraine was established in 2011 and covers the administrative region of Kyiv oblast with the geographic area of 28,100 km 2 and almost 1.0 M ha of cropland.Northern part of the region is dominated by forests and grasslands, while central and southern parts are agriculture intensive areas.Land cover classes are quite heterogeneous including croplands, forests, grassland, rivers, lakes and wetlands.The climate in the region is humid continental with approximately 709 mm of annual precipitations.Landscape is mostly flat terrain with slopes ranging from 0% to 2%; near 10% of the territory is hilly with slopes about 2-5%.The crop calendar is September-July for winter crops, and April-October for spring and summer crops.Major crop types include maize (25.1% of total cropland area in 2013), winter wheat (16.1%), soybeans (12.6%), vegetables (10.3%), sunflower (9.3%), spring barley (6.8%), winter rapeseed (4.0%), and sugar beet (1.3%).A remark should be made considering vegetables.In the region, vegetables are mainly (approximately 96%) produced by small farmers and people living in villages for self-consumption purposes (so called family gardens (Gallego et al., 2014)).The fields are mainly located next to the houses and, as a rule, are very small in size (less than 0.1 ha).This requires special techniques and the use of very high-resolution satellite data that were not available for the test site at large scale.Therefore, vegetables are not considered among major crops types within this study.Due to relatively large number of major crops and other factors there is no a typical simple crop rotation scheme in this region.Most farmers use different crop rotations depending on specialization.Fields in the region are quite large (except family gardens) with size generally ranging up to 250 ha.

Ground measurements
Ground surveys were conducted in June 2013 to collect data on crop types and other land cover classes.European Land Use and Cover Area frame Survey (LUCAS) nomenclature is used in this study as a basis for land cover / land use types.In total, 386 polygons are collected covering the area of 22,700 ha (Table 1).Data are collected along the roads using mobile devices with built-in GPS.

Landsat-8 satellite imagery
Remote sensing images acquired by Operational Land Imager (OLI) sensor aboard Landsat-8 satellite are used for crop mapping over the study region.Landsat-8/OLI acquires images in eight spectral bands (bands 1-7, 9) at 30 m spatial resolution and in panchromatic band 8 at 15 m resolution (Roy et al., 2014)  Conversion from the TOA reflectance to the surface reflectance (SR) using the Simplified Model for Atmospheric Correction (SMAC) (Rahman and Dedieu, 1994).The source code for the model is acquired from http://www.cesbio.upstlse.fr/multitemp/?p=2956.Parameters of the atmosphere to run the model (in particular, aerosol optical depth) are acquired from the Aeronet network's station in Kyiv (geographic coordinates +50.374N and +30.497E).(3).Detection of clouds and shadows using Fmask algorithm proposed by Zhu and Woodcock (2012).

Preparation of satellite images for classification
Multi-temporal Landsat-8 images acquired in bands 2 through 7 are reconstructed using SOMs and used for classification of satellite imagery.Bands 1 and 9 are not used due to the strong atmospheric influence.Panchromatic band and thermal bands by Thermal Infrared Sensor (TIRS) are not utilized as well.Multi-temporal SR values in six spectral bands form a feature vector that is input to the classifier.Therefore, a total amount of 36 variables have been introduced in the classification.All variables are normalized to have mean 0 and standard deviation 1. Feature vectors of SR values are derived for fields collected during ground survey.All surveyed fields are randomly divided into training set (50%) to train the classifier and testing set (50%) for testing purposes.Fields are selected in such a way so there is no overlap between training and testing sets.All classification results, in particular overall accuracy (OA), kappa coefficient, user's (UA) and producer's (PA) accuracies are reported for testing set.The input features are classified into one of the 13 classes (Table 1).

Restoration of missing values in time-series of Landsat-8 images
The results of restoration show that relative root mean square error (RRMSE) are dependent on the number of missing data, and increase when the number of missing values increases (Skakun and Basarab, 2014).RRMSE values are dependant on the Landsat-8 spectral bands with minimum value being for Band 5 (11.4%) and maximum value being for Band 4 (19.7%).Quality of reconstruction of vegetated areas is higher than for artificial surface.The example of missing data restoration for images acquired on the 5 th of July 2013 is shown in Figure 2.

Landsat-8 images classification
Three different classification schemes are compared in the study.The first scheme (Scheme 1) utilizes a single MLP classifier that is trained on all training data.For this, the number of hidden neurons in MLP is varied (from 20 to 80) in order to select the MLP classifier that yields the largest OA.The second scheme (Scheme 2) utilizes a committee of MLPs that are trained on different training data sets that are randomly divided into five disjoint subsets.For each subset, a number of MLPs are trained and the best MLP in terms of OA is selected into the committee.Therefore, the committee is composed of five MLP classifiers.The third scheme (Scheme 3) utilizes a committee of seven MLPs that are trained on all training data and have different number of hidden neurons, in particular 20, 30, 40, 50, 60, 70, and 80.The obtained classification metrics, in particular OA, Kappa, PA and UA, are summarized in Table 2.The use of multi-temporal Landsat-8 imagery and a committee of MLP classifiers allow us to achieve overall accuracy of slightly over 85% which is considered as target accuracy for agriculture applications (McNairn et al., 2009).The use of committee of MLP classifiers comparing to the single MLP classifier is essential, and it is statistically confirmed by using z-test (Foody, 2004).In particular, z value is equal to 5.36 when comparing Scheme 3 to Scheme 1 which is larger than the threshold value of |z|>1.96.It means that hypothesis of no significant difference between two classifiers would be rejected at the widely used 5 percent level of significance.This contribution has been peer-reviewed.doi:10.5194/isprsarchives-XL-7-W3-45-2015-maize (class 5, PA=90.5%,UA=86.8%):main confusion with soybeans (class 8); in particular almost 88% of commission error and 75% of omission error for maize is due to confusion with soybeans.
sugar beet (class 6, PA=94.9%,UA=89.6%):main confusion with soybeans (class 8) and maize (class 5); in particular, almost 55% of commission error is due to confusion with maize, and almost 95% of omission error is due to confusion with soybeans.
For the following agriculture classes the accuracy of 85% is not obtained: spring crops (class 4, PA=40.6%,UA=34.6%):classification using available set of satellite imagery fail to produce reasonable performance for spring crops.The main confusion of this class is with winter wheat (class 2) and other cereals (class 9).The reasons for this are as follows.When collecting ground data, it was impossible to discriminate winter crops from spring crops in the fields.Therefore, all wheat samples are assigned winter wheat class (since proportion of spring wheat is small), and all barley samples are assigned spring crops class (since proportion of winter barley is small All non-agriculture classes including forest and grassland yield PA and UA of more than 85%.The final classification map is shown in Figure 3.

Comparison to official statistics
The derived crop map over the Kyiv oblast is used to estimate crop statistics and compare it to the official one.The official statistics on crops for the region was released only in January 2014, while the crop map was produced using the remote sensing images acquired until the 6 th of August 2013.Therefore, within operational context, the map could be potentially produced within August-September 2013 which is 4-5 months in advance of the official statistics report.
A simple pixel counting procedure is applied for crop area estimation.Pixel counting is known to be biased (Gallego et al., 2010), and the bias can be approximated as Bias = Commission error -omission error. (4) Using commission and omission errors from the confusion matrix, this bias is used to correct pixel counting estimates and provide final crop area values.The results are given in Table 3

Discussion of results
The results achieved in this study show the efficiency of different neural networks architectures for classification of multi-temporal satellite imagery with missing data.The use of SOMs makes possible to restore missing data by training the neural network in an unsupervised fashion.Only data with all valid components are used for SOMs training.In such a way, the neural network projects data from training set into the subspace of neurons weight coefficients which are further used for restoration of missing values.The restoration is not perfect and introduces the error.It is found that the error is dependent on the spectral band: in particular, the relative error of restoration is 11.4% to 19.7% for Landsat-8 bands 2-7.However, the error shows small variations when varying training data size for SOM training.It should be also noted that data for SOM training and SOM size are selected automatically.It is very important when processing large volumes of data, and is one of the advantages of the proposed approach.
After all missing values are restored a supervised classification procedure is performed.For this, a committee of MLP classifiers is used.Two approaches to compose a committee are evaluated with both showing better performance over a single MLP classifier.The use of MLPs committee allows us to achieve overall accuracy of 85.32% and Kappa coefficient of 0.8235 when classifying multi-temporal Landsat-8 images over the JECAM test site in Ukraine.Accuracy of 85% is usually considered as a target for space-based agriculture applications (McNairn et al., 2009).Analysis of user's and producer's accuracies shows that some crop-specific classes achieve the target accuracy (such as winter wheat, winter rapeseed, maize and sugar beet) while others do not (spring crops, sunflower and soybeans).This is due to similar vegetation cycle of summer crops which requires much better temporal resolution.Another way to improve discrimination of summer crops is to utilize SAR imagery.These activities are ongoing and will be reported in future papers.
The derived crop map is used for crop area estimation for the Kyiv oblast.The estimates are compared to the official statistics and show good correspondence.Relative error for major crops is within ±28%.It should be emphasised that the latest image that is used to produce a crop map was acquired on the 6th of August 2013.Therefore, classification could be performed and crop map could be made available within August-September of the same vegetation year that is extremely important within operational context.For comparison, preliminary official statistics was only available in January 2014.

CONCLUSIONS
Knowledge on the area and distribution of crops is extremely important for many applications.To enable crop mapping at large scale, remote sensing images from space present the only source of reliable, continuous and human independent information.Optical images are contaminated by the presence of clouds and shadows that introduce missing values in the datasets.These missing values need to be properly processed to enable further classification of satellite imagery.This paper provides an integrated use of unsupervised and supervised neural networks in order to classify multi-temporal optical satellite images with the presence of missing data.First, SOMs are trained on available datasets with non-missing components, and are used to restore missing values.This restoration technique is universal, computationally effective and could be used for multiple scenes and satellite sensors.In the case of Landsat-8 multi-temporal images that are used in this study, it is possible to restore spectral bands with up to 19.7% relative RMSE error.Afterwards, a supervised classification is performed with the use of committee of MLP classifiers.This approach is applied for the JECAM test site in Ukraine for large area crop mapping (more than 28,000 km2).The committee of MLPs outperforms the best single MLP classifier and reaches a threshold of 85% of overall classification accuracy (OA=85.32%and Kappa 0.8235).For the following agriculture classes an 85% threshold of producer's and user's accuracies is achieved: winter wheat, winter rapeseed, maize, and sugar beet.Such crops as sunflower, soybeans, and spring crops show worse performance.The resulting crop map is used to derive crop area estimates that are compared to the official statistics.
Results show good correspondence with 28% of relative error.
posterior probability of the committee l i p = posterior probability of each MLP L = number of classifiers in the committee, and K is the number of classes

Figure 2 .
Figure 2. Example of restoration of missing data in Landsat-8 images acquired on the 5th of July 2013.Original image with identified clouds and shadows as missing data is show in (a).Result of restoration is shown in (b).For both images true colour composite of bands 4-3-2 is shown.SR reflectance values are scaled from 0 to 0.15

Figure 3 .
Figure 3. Final map obtained by classifying multi-temporal Landsat-8 imagery using a committee of MLP classifiers

Table 1 .
. Three scenes with path/row coordinates 181/24, 181/25 and 181/26 cover the test site region.Dates of acquisition are April 16, May 02, May 18, June 19, July 05, and August 06.Number of polygons and total area of crops and land cover types collected during the ground survey

Table 2 .
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-7/W3, 2015 36th International Symposium on Remote Sensing of Environment, 11-15 May 2015, Berlin, Germany Classification results of using different neural network approaches

Table 3 .
. In general, there is a good correspondence between satellite derived crop area estimates and official statistics except winter rapeseed and sugar beet.The former crop class is overestimated +28% while the latter crop is underestimated -28%.Comparison of official statistics and crop area estimates derived from Landsat-8 imagery for Kyiv region Spring crops class (mostly, barley) is the least discriminated class due to difficulties in discriminating winter and spring classes in the field during summer ground surveys, and confusion with other spring and summer cereals such as rye and oats.If spring crops and other cereals classes are combined The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-7/W3, 2015 36th International Symposium on Remote Sensing of Environment, 11-15 May 2015, Berlin, Germany together, accuracies considerably increase: from 40.6% to 79.93% of PA and from 34.6% to 83.53% of UA.Winter crops (wheat and rapeseed) yield very good performance with PA and UA both more that 91%.There is a mixed performance for summer crops.In particular, maize and sugar beet exceeded the threshold of 85% while sunflower (almost exceeded with 84.1% of PA and 85.4% of UA) and soybeans did not.Soybeans class is least discriminated summer crop far below the 85% threshold: PA=69.7%,UA=77.1%.Main confusion of soybeans is with other summer crops, namely maize, sunflower, and sugar beet.