ACCURACY ASSESSMENT OF LANDSAT-DERIVED CONTINUOUS FIELDS OF TREE COVER PRODUCTS USING AIRBORNE LIDAR DATA IN THE EASTERN UNITED STATES

Knowing the detailed error structure of a land cover map is crucial for area estimation. Facilitated by the opening of the Landsat archive, global land cover mapping at 30-m resolution has become possible in recent years. Two global Landsat-based continuous fields of tree cover maps have been generated by Sexton et al. (2013) and Hansen et al. (2013), but their accuracy has not been comprehensively evaluated. Here we used canopy cover derived from airborne small-footprint Lidar data as a reference to evaluate the accuracy of these two datasets, as well as the National Land Cover Database 2001 canopy cover layer (Homer et al. 2004), over two entire counties in Maryland, United States. Our results showed that all three Landsat datasets captured the spatial variations of tree cover in the study area well, with an r² ranging between 0.54 and 0.58, a mean bias error ranging between -15% and 5% tree cover, and a root mean square error ranging between 27% and 29% tree cover. When the continuous tree cover maps were converted to binary forest/nonforest maps, all three products had an overall accuracy >= 80% but differed significantly in producer's accuracy and user's accuracy. Data users should therefore be aware of these accuracy patterns when selecting the most appropriate dataset for their specific applications.


INTRODUCTION
Changes in forest cover significantly affect the global carbon cycle, the hydrological cycle and biodiversity richness (Foley et al., 2005). Satellite observations, owing to their synoptic and repetitive nature, are commonly used for characterizing forest cover and monitoring forest cover change, especially in remote regions. Among the various types of satellite data, optical imagery is often the primary data source for characterizing forest cover and detecting forest cover change owing to its wide availability. In particular, the Landsat series of satellites has been providing consistent moderate spatial resolution data since 1972. The recent opening of the Landsat archive and the distribution of standardized radiometric images by the United States Geological Survey (USGS) have ignited the use of Landsat data in a wide range of scientific applications, including global land cover mapping at 30-m resolution (Wulder et al., 2012).
Global land cover mapping has historically relied on coarse spatial resolution data. Since the first satellite-based land cover product debuted in the mid-1990s (Defries and Townshend, 1994), many global land cover maps have been produced at resolutions from 300 m to 1 km (Bartholomé and Belward, 2005; Bicheron et al., 2008; Friedl et al., 2002; Hansen et al., 2000; Loveland et al., 2000). The proliferation of available datasets gives users rich alternatives, yet simultaneously creates some confusion as to which data to choose for their specific applications. This confusion is mainly caused by the lack of a comprehensive accuracy assessment of each available product. Many datasets have simply not been comprehensively evaluated. For those maps that have been validated, the accuracy numbers are often generated using diverse reference data and thus are not directly comparable (Fritz and See, 2008; Pflugmacher et al., 2011; Zhao et al., 2014). Many studies have compared different products to identify the relative strengths and weaknesses of each and, in rare cases, have integrated different datasets to improve land cover characterization (Jung et al., 2006; Schepaschenko et al., 2015; Song et al., 2014a). However, absolute error estimation of a land cover map is still needed because accuracy information is a crucial input for subsequent applications, such as area and associated uncertainty estimation and land cover change detection (Olofsson et al., 2013; Sexton et al., 2015; Song et al., 2014b).
The last few years have witnessed the production of global land cover maps at 30-m spatial resolution using Landsat data (Chen et al., 2014; Gong et al., 2013; Townshend et al., 2012). Two global, Landsat-derived continuous fields of tree cover products have been generated for free public access (Hansen et al., 2013; Sexton et al., 2013). These Landsat-based products reveal 100-1000 times more spatial detail than the coarse-resolution maps. However, users may still face the same confusion over which product to choose as with the coarse-resolution counterparts, because neither product has been comprehensively validated (Pengra et al., 2015), even though Sexton et al. (2013) estimated the error of their map at four selected forest sites. To date, no direct comparison that could inform users about the agreement and discrepancies between the two datasets has been conducted.
The availability of high-quality reference data is a major constraint to global land cover validation (Strahler et al., 2006). Reference data can be collected from ground surveys, which are often unavailable because of their high cost. They can also be derived from higher-resolution imagery or other data sources which depict land cover reliably. A conceptually different and potentially more reliable way of characterizing tree cover is to use light detection and ranging (Lidar) data. Lidar is a newly developed active remote sensing technology. It can accurately determine the position of objects in three-dimensional space by measuring the round-trip time of an emitted laser pulse between sensor and target (Lefsky et al., 2002). In studies of terrestrial ecology, the laser is typically operated at a green or near-infrared wavelength with a very fine footprint (0.5 m or less) and a high scanning frequency (typically 50 kHz to 100 kHz). The high density of Lidar shots can depict the detailed structure of a forest and thus allows a highly accurate virtual reconstruction of individual trees (Dubayah and Drake, 2000). This advantage makes Lidar a powerful and popular tool for measuring different forest attributes, including canopy height, aboveground biomass, leaf area index, and canopy cover (Falkowski et al., 2008; Korhonen et al., 2011; Lovell et al., 2003; Morsdorf et al., 2006; Tang et al., 2014; Tang et al., 2012).
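The ranging principle described above can be illustrated with a minimal sketch. It assumes the standard time-of-flight relation, in which the one-way range is half the round-trip time multiplied by the speed of light, and it ignores the small slowing of light in air. The function name and the example timing are hypothetical and are not taken from the Lidar system used in this study.

```python
# Minimal time-of-flight sketch (illustrative assumption, not the
# processing chain of any particular Lidar instrument).

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # vacuum value; air effects ignored


def range_from_round_trip(round_trip_seconds: float) -> float:
    """Return the one-way sensor-to-target distance in metres.

    The pulse travels to the target and back, so the one-way range is
    half of the total distance travelled during the round trip.
    """
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # Hypothetical pulse that returns after 10 microseconds.
    print(f"{range_from_round_trip(10e-6):.2f} m")
```

Under these assumptions, a timing resolution of one nanosecond corresponds to roughly 0.15 m in range, which indicates why precise timing is needed to resolve canopy structure.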
The objective of this paper is to demonstrate the applicability of airborne Lidar data as a reference for evaluating land cover products generated from optical satellite data. Two unique advantages make small-footprint Lidar an excellent source of reference data: (1) a high spatial resolution and (2) an explicit characterization of canopy height as well as canopy cover. Here we use wall-to-wall Lidar-derived canopy cover above 2.5-m height to evaluate the accuracy of three Landsat-based continuous fields of tree cover maps in the eastern United States.

Study Area
The study area encompasses two counties, Howard County and Anne Arundel County, in the State of Maryland, United States (Figure 1). Bordering the Chesapeake Bay, the study area has a seasonal climate and flat terrain. Located in the corridor between two metropolises, Washington, D.C. and Baltimore, it has a typical North American suburban landscape consisting of residential lands, agricultural fields, and fragmented forests. Large patches of forest are mainly located in state and local park reserves and are largely dominated by broad-leaf deciduous trees mixed with some needle-leaf evergreen trees. Because the region is well developed, land cover change in this area is relatively rare and mainly takes the form of housing development on forested or agricultural lands.

Landsat-derived continuous fields of tree cover maps
Three Landsat-derived vegetation continuous fields (VCF) of tree cover products were evaluated in this study. The first dataset (hereafter referred to as DS1) was developed by Sexton et al. (2013). The basic input data were Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) images from the Global Land Survey (GLS) collection circa 2000 (Gutman et al., 2013). Leaf-off images were replaced with leaf-on images selected from the USGS Landsat archive based on phenological information from the Moderate Resolution Imaging Spectroradiometer (MODIS) (Channan et al., 2015). All images were converted to surface reflectance (SR) using the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) (Masek et al., 2006). Water, cloud and shadow pixels were identified using methods reported in Huang et al. (2008) and Huang et al. (2010). The 30-m Landsat SR data were first spatially aggregated to 250-m resolution. Regression tree models were trained for each Landsat Worldwide Reference System (WRS)-2 tile using reference data derived from the spatiotemporally collocated 250-m MODIS VCF product and then applied to the 30-m Landsat SR data to predict percent tree canopy cover per pixel. This dataset is available at http://glcf.umd.edu/data/landsatTreecover/.
The second dataset (hereafter referred to as DS2) was developed by Hansen et al. (2013). The basic input data were all Landsat ETM+ images of year 2000 in the USGS archive. Landsat data were converted to top-of-atmosphere (TOA) reflectance, normalized to surface reflectance according to MODIS SR, and corrected for surface anisotropy (Hansen et al., 2008; Potapov et al., 2012). After screening for cloud, shadow and water, per-pixel compositing was carried out to create a series of phenological metrics (Hansen et al., 2013; Potapov et al., 2015). Regression tree models were trained using reference data derived from high-resolution imagery and then applied to the Landsat metrics to predict percent tree canopy cover per pixel. This dataset is available at http://earthenginepartners.appspot.com/science-2013-globalforest/download_v1.1.html.
The third dataset (hereafter referred to as DS3) was the National Land Cover Database 2001 (NLCD2001) tree canopy cover layer developed by Homer et al. (2004). The Advanced Very High Resolution Radiometer (AVHRR)-derived Normalized Difference Vegetation Index (NDVI) was used to select Landsat TM or ETM+ images acquired in the early, peak and late stages of the vegetation growing season. Selected Landsat images were converted to at-satellite reflectance for the six reflective bands and to at-satellite temperature for the thermal band, and subsequently transformed to brightness, greenness and wetness indices through a Tasseled Cap Transformation (Huang et al., 2002; Kauth and Thomas, 1976). Similar to the other two global products, regression tree classifiers were trained using reference data obtained from aerial photographs, fieldwork and the Forest Inventory and Analysis (FIA) database, and were applied to multi-season Landsat image triplets to predict percent tree canopy cover per 30-m pixel. This dataset is available at http://www.mrlc.gov/nlcd01_data.php.

Reference tree canopy cover derived from Lidar
The Lidar data were obtained from the Maryland Department of Natural Resources in primary support of shore erosion studies along the Chesapeake Bay.

Evaluation metrics
The accuracy of the Landsat-based continuous tree cover was evaluated against Lidar-derived percent canopy cover using four metrics: mean bias error (MBE), mean absolute error (MAE), root mean square error (RMSE) and r² (Willmott, 1982):

\[ \mathrm{MBE} = \frac{1}{n}\sum_{i=1}^{n}\left(tc_i - r_i\right) \]
\[ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|tc_i - r_i\right| \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(tc_i - r_i\right)^2} \]
\[ r^2 = 1 - \frac{\sum_{i=1}^{n}\left(tc_i - r_i\right)^2}{\sum_{i=1}^{n}\left(r_i - \bar{r}\right)^2} \]

where
i = pixel index
tc_i = percent tree cover of each product
r_i = reference percent tree cover from Lidar
\bar{r} = mean of reference percent tree cover
n = sample size

In addition to the above continuous error metrics, we also converted the continuous tree cover maps to discrete forest/nonforest classification maps and constructed traditional confusion matrices for each classification product. Following the International Geosphere-Biosphere Programme (IGBP) definition of open forest, a 30% tree cover threshold was applied to categorize tree cover pixels into either the forest or the nonforest class (Belward, 1996). Producer's accuracy (1 - omission error), user's accuracy (1 - commission error) and overall accuracy for both classes were then summarized from the confusion matrices for each product.
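The evaluation metrics described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used in this study: r² is computed as one minus the ratio of the residual sum of squares to the total sum of squares about the reference mean, a pixel is labelled forest when its tree cover is greater than or equal to the 30% threshold (whether the threshold is inclusive is an assumption), and the function names `continuous_metrics` and `forest_accuracy` are hypothetical.

```python
import math
from typing import Dict, Sequence


def continuous_metrics(tc: Sequence[float], ref: Sequence[float]) -> Dict[str, float]:
    """MBE, MAE, RMSE and r2 of product tree cover (tc) against reference (ref)."""
    n = len(tc)
    if n == 0 or n != len(ref):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [t - r for t, r in zip(tc, ref)]      # product minus reference
    sse = sum(d * d for d in diffs)               # residual sum of squares
    ref_mean = sum(ref) / n
    sst = sum((r - ref_mean) ** 2 for r in ref)   # total sum of squares
    return {
        "MBE": sum(diffs) / n,
        "MAE": sum(abs(d) for d in diffs) / n,
        "RMSE": math.sqrt(sse / n),
        "r2": 1.0 - sse / sst if sst > 0 else float("nan"),
    }


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns NaN for an empty class."""
    return numerator / denominator if denominator else float("nan")


def forest_accuracy(tc: Sequence[float], ref: Sequence[float],
                    threshold: float = 30.0) -> Dict[str, float]:
    """Confusion-matrix accuracies after thresholding both maps."""
    tp = fp = fn = tn = 0
    for t, r in zip(tc, ref):
        mapped, truth = t >= threshold, r >= threshold  # inclusive: assumption
        if mapped and truth:
            tp += 1
        elif mapped and not truth:
            fp += 1   # commission error for forest
        elif truth:
            fn += 1   # omission error for forest
        else:
            tn += 1
    return {
        "overall": _ratio(tp + tn, tp + fp + fn + tn),
        "producer_forest": _ratio(tp, tp + fn),
        "user_forest": _ratio(tp, tp + fp),
        "producer_nonforest": _ratio(tn, tn + fp),
        "user_nonforest": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    product = [0.0, 20.0, 40.0, 80.0, 100.0]     # hypothetical pixel values
    reference = [10.0, 30.0, 30.0, 90.0, 90.0]   # hypothetical Lidar values
    print(continuous_metrics(product, reference))
    print(forest_accuracy(product, reference))
```

With this sign convention, a negative MBE indicates that a product underestimates tree cover relative to the Lidar reference.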

Qualitative Assessment
All three products depict the spatial variations of tree cover over the study area well (Figure 2). Dense tree cover patches of riparian forests and the Patuxent Research Refuge (center of the study area) are clearly shown on all maps. A close visual examination indicates that DS1 underestimates high-end tree cover (i.e. it has fewer pixels with high tree cover than the reference), DS2 overestimates high-end tree cover, whereas DS3 overestimates low-end tree cover.
The frequency distributions of tree cover from the four datasets confirm the conclusions drawn from the visual examination (Figure 3). All three datasets show a bimodal distribution. Both DS1 and DS2 agree closely with the reference at low canopy cover, but DS1 saturates at about 80% with its peak located at around 60% canopy cover. DS2 reaches a maximum of 100% canopy cover but has significantly more 100% tree cover pixels than the reference. The high-end peak of DS3 (~90% canopy cover) has the closest agreement with the reference, but DS3 has significantly more 0% tree cover pixels than the reference.

Figure 4 (partial caption). b. DS2 (Hansen et al. 2013). c. DS3 (Homer et al. 2004). Colours in the legends indicate scatter density.

CONCLUSIONS
We have demonstrated the applicability of small-footprint airborne Lidar data as a reference for evaluating the accuracy of land cover products generated from optical satellite data. Using wall-to-wall Lidar-derived canopy cover as reference, we estimated the accuracy of three Landsat-based continuous fields of tree cover datasets (Sexton et al. 2013; Hansen et al. 2013; Homer et al. 2004) in Howard County and Anne Arundel County, Maryland, USA. The results showed different error patterns among the three percent tree cover datasets, although they were generated from similar input data with similar machine learning algorithms. All three datasets captured the spatial variations of tree canopy cover well, with an r² ranging between 0.54 and 0.58, a mean bias error ranging between -15% and 5%, and a root mean square error ranging between 27% and 29%. When the continuous tree cover maps were converted to binary forest/nonforest maps, all three products had an overall accuracy >= 80%, with varying producer's accuracy and user's accuracy for the forest and nonforest classes. Future research will expand the study area to include more study sites in other major forest biomes in the United States.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-7/W4, 2015. 2015 International Workshop on Image and Data Fusion, 21-23 July 2015, Kona, Hawaii, USA.

The Lidar data were later made available in support of the National Aeronautics and Space Administration (NASA) Carbon Monitoring System (CMS) projects. The original Lidar point cloud was collected from April to May 2004 over two entire counties of Maryland (Anne Arundel and Howard). Canopy cover was calculated as the percentage of Lidar points above 2.5-m height among all Lidar points within each 30-m grid cell. The Lidar point cloud data were also processed to derive a digital elevation model (DEM) and a canopy height model (DSM) at 2-m spatial resolution.
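The canopy cover calculation described above, the share of Lidar points above 2.5 m within each 30-m grid cell, can be sketched as follows. This is a minimal illustration rather than the actual processing chain: it assumes that point heights have already been normalized to height above ground, that all returns are counted, and that "above" means strictly greater than the threshold. The function name `canopy_cover_grid` and the sample points are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float, float]  # (x, y, height above ground), in metres


def canopy_cover_grid(points: Iterable[Point],
                      cell_size: float = 30.0,
                      height_threshold: float = 2.5) -> Dict[Tuple[int, int], float]:
    """Return percent canopy cover per grid cell, keyed by (column, row)."""
    total = defaultdict(int)   # all points falling in each cell
    above = defaultdict(int)   # points higher than the threshold in each cell
    for x, y, height in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        total[cell] += 1
        if height > height_threshold:  # strict comparison: assumption
            above[cell] += 1
    # Cells without any points are omitted rather than reported as 0%.
    return {cell: 100.0 * above[cell] / total[cell] for cell in total}


if __name__ == "__main__":
    sample = [
        (1.0, 1.0, 0.2),    # ground return
        (5.0, 5.0, 10.0),   # canopy return
        (10.0, 20.0, 3.0),  # canopy return
        (29.0, 29.0, 1.0),  # low vegetation
        (31.0, 1.0, 5.0),   # canopy return in the neighbouring cell
    ]
    for cell, cover in sorted(canopy_cover_grid(sample).items()):
        print(cell, f"{cover:.1f}%")
```

In practice the grid origin would be aligned with the 30-m Landsat pixel grid so that each Lidar-derived value can be compared directly with the corresponding product pixel.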

Figure 3. Frequency distributions of percent tree canopy cover in the study area from Lidar reference and three Landsat-based products.

Table 1 lists four error metrics calculated for the three datasets. The MBE values suggest that both DS1 and DS3 underestimate canopy cover while DS2 overestimates canopy cover. All three products are comparable in terms of absolute error, with MAE ranging between 20% and 22% and RMSE ranging between 27% and 29%. Additionally, both DS1 and DS2 explain 58% of the variation of tree cover captured by the Lidar reference, while DS3 explains a slightly lower portion (54%) of the variation.

Table 1. Error metrics of the three Landsat-based tree cover datasets. The units of MBE, MAE and RMSE are all percent tree canopy cover.

Table 2. Summary of accuracy numbers of the re-classified forest/nonforest maps derived from the three Landsat-based continuous tree cover datasets.