Mapping Land Cover in the Taita Hills, SE Kenya, Using Airborne Laser Scanning and Imaging Spectroscopy Data Fusion

The Taita Hills, located in southeastern Kenya, is one of the world's biodiversity hotspots. Despite the recognized ecological importance of this region, the landscape has been heavily fragmented by hundreds of years of human activity. Most of the natural vegetation has been converted to agroforestry, croplands and exotic forest plantations, resulting in a very heterogeneous landscape. Given this complex agro-ecological context, characterizing land cover using traditional remote sensing methods is extremely challenging. The objective of this study was to map land cover in a selected area of the Taita Hills using data fusion of airborne laser scanning (ALS) and imaging spectroscopy (IS) data. The Land Cover Classification System (LCCS) was used to derive the land cover nomenclature, while its height and percentage cover classifiers were used to create objective definitions for the classes. Simultaneous ALS and IS data were acquired over a 10 km × 10 km area in February 2013, of which a 1 km × 8 km test site was selected. The ALS data had a mean pulse density of 9.6 pulses/m², while the IS data had a spatial resolution of 1 m and a spectral resolution of 4.5–5 nm in the 400–1000 nm spectral range. The IS and ALS data were geometrically co-registered and the IS data were processed to at-surface reflectance. While IS data are suitable for determining land cover types based on their spectral properties, the advantage of ALS data is the derivation of vegetation structural parameters, such as tree height and crown cover, which are crucial in the LCCS nomenclature. Geographic object-based image analysis (GEOBIA) was used for segmentation and classification at two scales. The benefits of GEOBIA and ALS/IS data fusion for characterizing a heterogeneous landscape were assessed, and ALS and IS data were considered complementary. GEOBIA was found useful in implementing the LCCS-based classification, which would be difficult to map using pixel-based methods.


INTRODUCTION
Land cover has changed rapidly in the Taita Hills in southeastern Kenya. Large areas of forests, woodlands and shrublands have been converted to agricultural use (Clark and Pellikka, 2009; Pellikka et al., 2009). However, mapping these changes using remote sensing (RS) is a challenging task, as even the class definitions are based on heuristic views of a given classification system. The land cover in the area is very heterogeneous and consists mostly of mixed classes of trees, crops and other vegetation.
Mapping heterogeneous classes using L-resolution satellite data is difficult, since the individual components (e.g. single trees) that form agroforestry or woodland classes cannot be distinguished (Zomer et al., 2009; Blinn et al., 2013). On the other hand, H-resolution data allow a clear distinction of individual trees (e.g. Eysn et al., 2012; Dalponte et al., 2014). However, linking individual trees to a certain minimum mapping unit (MMU) of agricultural land that would represent agroforestry is challenging (Zomer et al., 2009; Blinn et al., 2013). L-resolution data refers to situations where the scene objects are smaller than the pixel size of the data, while in H-resolution data the scene objects are larger than the pixel size (Strahler et al., 1986).
Mapping mixed land cover classes accurately is important because agroforestry has high potential for carbon sequestration in developing countries (Negash and Kanninen, 2015).
Agroforestry could be a direct target in REDD+ programs, depending on the country's forest definition (Minang et al., 2014). In REDD+ programs, countries are rewarded for keeping forests and reducing emissions from deforestation. Agroforestry has, however, not been mentioned in REDD+ programs, despite its proven climate change mitigation and adaptation benefits (Minang et al., 2014). Trees outside forests (TOF) are also important to local and regional economies. For example, in India TOFs constitute an estimated 49% of the annual fuelwood and 48% of the annual timber consumed (Panday, 2002). RS could, therefore, greatly benefit carbon sequestration plans by improving the identification and characterization of agroforestry systems. However, several bottlenecks remain for obtaining reliable results from H-resolution imagery. For instance, when H-resolution data are used with pixel-based methods, there are problems with salt-and-pepper effects and mixing of classes within targets (Blaschke et al., 2014; Piiroinen et al., 2015). This problem can be tackled using the geographic object-based image analysis (GEOBIA) approach (Blaschke et al., 2014; Benz et al., 2004), where pixels are segmented into objects before the classification. Since, for example, agroforestry consists of crops with trees, a multiscale segmentation and classification can be applied, in which the target area is first segmented into larger areas that consist of different fractions of finer scale land cover types. A finer scale segmentation and classification can then be done inside these larger segments. The larger areas can then be classified based on the fractions of finer scale land cover types (e.g. crops and trees) they contain.
Furthermore, combining optical data with Light Detection and Ranging (LiDAR) technology could greatly improve the accuracy of land cover classification in agroforestry areas. Unfortunately, this approach has not yet been fully explored, in particular in tropical mountain environments.
In this article, H-resolution airborne laser scanning (ALS) and imaging spectroscopy (IS) data fusion (Torabzadeh et al., 2014) was used together with GEOBIA to characterize the mixed land cover classes based on biophysical definitions derived from the Land Cover Classification System (LCCS) (Di Gregorio, 2005). ALS enables the use of accurate three-dimensional information on the land cover, while IS enables the detection of different materials based on their spectral characteristics. In this study our objectives were to:
(i) create clear definitions for the land cover classes based on LCCS and measurable biophysical attributes (tree height, crown cover) that are non-overlapping and measurable using remote sensing methods;
(ii) test multiscale segmentation for defining mapping units for characterizing a heterogeneous landscape, with emphasis on forests and agroforestry;
(iii) assess the benefits of data fusion in creating the land cover classes.

Study area
The study area (8 km × 1 km) is located in the elevation range of 1100-1800 m a.s.l. in the highlands of the Taita Hills (3° 25′ S, 38° 19′ E) in south-eastern Kenya (Figure 1). There are two rainy seasons, with long rains occurring in March-June and short rains in October-December (Jaetzold and Schmidt, 1983).
Figure 1. Location of the Taita Hills in south-eastern Kenya.

Airborne data collection
The flight campaign was conducted on 3-8 February 2013, during the dry season. Two sensors were used to collect ALS and IS data from a mean flying height of 750 m. The Optech ALTM 3100 (Optech, Canada) is an oscillating mirror laser scanner capable of recording up to four echoes. The sensor was operated with a pulse rate of 100 kHz and a scan rate of 36 Hz. The scan angle was ±16°. The achieved pulse density was 9.6 pulses m⁻² and the mean footprint diameter was 23 cm. The AisaEAGLE (Spectral Imaging Ltd., Finland) is a pushbroom scanner with an instantaneous field of view of 0.648 mrad and a field of view of 36.04°. The sensor was used in four times spectral binning mode, which produces output images with 129 bands and a full width at half maximum of 4.5-5.0 nm in the spectral range of 400-1000 nm. The output pixel size was 1.0 m.

Preprocessing of the data
The ALS data were pre-processed by the data vendor (Topscan GmbH) and delivered as a georeferenced point cloud in the UTM/WGS84 coordinate system with ellipsoidal heights. TerraScan software (Terrasolid Ltd., Finland) was used to create a digital terrain model (DTM) and a digital surface model (DSM) from the ALS data, which were classified into ground and other returns. Some of the very steep slopes were falsely classified and had to be manually corrected. The first returns were used to create the DSM and the returns classified as ground to create the DTM, both at 1 m resolution. DTM values were subtracted from DSM values to create a canopy height model (CHM), which represents the height of vegetation and buildings above ground level (CHMbuildings). Another CHM was created using the same approach, but the power lines were first removed manually and the returns classified as buildings were removed (CHMnobuildings). The DTM was also used for calculating slope.
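The raster arithmetic described above can be illustrated with a minimal sketch. It assumes the DSM and DTM are already gridded to the same 1 m cells and stored as nested lists; the function names and the central-difference slope estimate are illustrative assumptions, not the TerraScan workflow used in the study.

```python
import math

def canopy_height_model(dsm, dtm):
    """Cell-by-cell CHM = DSM - DTM for two rasters of identical shape."""
    return [[s - t for s, t in zip(dsm_row, dtm_row)]
            for dsm_row, dtm_row in zip(dsm, dtm)]

def slope_degrees(dtm, cell_size=1.0):
    """Slope in degrees for interior cells (assumed central differences).

    Border cells are left as None because they lack two neighbours.
    """
    rows, cols = len(dtm), len(dtm[0])
    slope = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dtm[r][c + 1] - dtm[r][c - 1]) / (2 * cell_size)
            dz_dy = (dtm[r + 1][c] - dtm[r - 1][c]) / (2 * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope

# Toy elevations in metres: a flat terrain with objects on top.
dsm = [[1510.0, 1512.0], [1511.0, 1515.0]]
dtm = [[1510.0, 1510.0], [1510.0, 1510.0]]
chm = canopy_height_model(dsm, dtm)
```

Running the same function on a DSM with building and power line returns removed would give the CHMnobuildings variant.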
The raw data produced by the AisaEAGLE were radiometrically corrected and georectified with CaliGeoPro 2.2 (Spectral Imaging Ltd., Finland). The DSM derived from the ALS data was interpolated to a pixel size of 3 m and used in the georectification process. Atmospheric correction was applied using ATCOR-4 (Schläpfer and Richter, 2002).
After the initial georectification of the AisaEAGLE data, geometric mismatches were noted between the AisaEAGLE and ALS data. The geometric accuracy of the ALS data was assumed to be better, and thus CHMbuildings was used as reference data for manual co-registration of the AisaEAGLE data. The processed flight lines were first subsetted so that the side overlap between images was minimized. Next, 50-100 control points were collected from CHMbuildings and the AisaEAGLE data for each flight line. Then, a first order polynomial transformation was applied to co-register the AisaEAGLE flight lines to CHMbuildings. After co-registration, the RMSE for an example flight line was 1.06 pixels (meters), which was considered accurate enough for data fusion (Valbuena, 2014).
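A first order polynomial transformation is an affine mapping with six coefficients, which can be estimated from control point pairs by least squares. The sketch below shows one way to fit it and to compute the control point RMSE. The function names, the normal-equation solver and the toy control points are assumptions made for illustration; the study does not state which software performed this step.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [x - factor * y for x, y in zip(m[r], m[i])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_first_order(src, dst):
    """Least-squares fit of x' = a0 + a1*x + a2*y and y' = b0 + b1*x + b2*y."""
    design = [(1.0, x, y) for x, y in src]
    ata = [[sum(d[i] * d[j] for d in design) for j in range(3)] for i in range(3)]

    def coefficients(targets):
        atb = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(3)]
        return solve3(ata, atb)

    return coefficients([p[0] for p in dst]), coefficients([p[1] for p in dst])

def transform(coef, point):
    """Apply the fitted transformation to one (x, y) point."""
    (a0, a1, a2), (b0, b1, b2) = coef
    x, y = point
    return (a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y)

def rmse(coef, src, dst):
    """Root mean square distance between transformed and reference points."""
    squared = []
    for s, (rx, ry) in zip(src, dst):
        tx, ty = transform(coef, s)
        squared.append((tx - rx) ** 2 + (ty - ry) ** 2)
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical control points: image coordinates and reference (CHM) coordinates.
image_pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
chm_pts = [(2.0, -1.0), (12.0, -1.0), (2.0, 9.0), (12.0, 9.0)]
coef = fit_first_order(image_pts, chm_pts)
```

With real control points the residuals would not vanish, and the RMSE in pixel units is the quantity compared against the acceptance threshold.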

LCCS class definitions
To map mixed classes of trees and other land cover types, the first step is to define tree and forest, which is a complex issue as numerous definitions exist (Magdon and Klein, 2013). LCCS (Di Gregorio, 2005) defines all woody life forms taller than 5 m as trees, while 3-5 m tall plants can be trees if they have clear physiognomic aspects of a tree. Mapping physiognomic aspects using RS is challenging, and thus all plants taller than 3 m were considered trees. Forest is typically defined based on a certain tree crown cover (CC) within an MMU (Magdon and Klein, 2013; Eysn et al., 2013). CC refers to the proportion of the forest floor covered by the vertical projection of the tree crowns (Jennings et al., 1999; Korhonen et al., 2006). For example, the UNFCCC, as part of the Kyoto Protocol, states that: 'forest is a minimum area of land of 0.05-1.0 hectares with tree crown cover (or equivalent stocking level) of more than 10-30 per cent with trees with the potential to reach a minimum height of 2-5 metres at maturity in situ' (Minang et al., 2014). FAO uses a 10% CC threshold and a 0.5 ha MMU (FAO, 2000). LCCS does not define a specific MMU for forests or any other land cover type. ICRAF (1993) defines agroforestry as follows: "Agroforestry is a collective name for land-use systems and technologies, where woody perennials are deliberately used on the same land management unit as agricultural crops and/or animals, either in some form of spatial arrangement or temporal sequence. In agroforestry systems there are both ecological and economical interactions between the different components". Panday (2002) defines it to include all forms of tree-growing in agroecosystems. Mapping agroforestry according to these definitions is difficult using RS methods, and thus in this article agroforestry refers to trees on farmland. The CC thresholds for defining the agroforestry and woodland classes were derived from the LCCS percentage cover classifier.
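As an illustration of how these definitions become measurable, crown cover can be approximated as the share of CHM cells within a mapping unit that exceed the 3 m tree threshold. The sketch below assumes a 1 m CHM and uses the FAO figures quoted above (10% CC, 0.5 ha MMU); the function names and the choice of strict versus inclusive comparisons are assumptions, not part of the study's workflow.

```python
TREE_HEIGHT_M = 3.0  # plants taller than 3 m are treated as trees in this study

def crown_cover_percent(chm_cells, tree_height=TREE_HEIGHT_M):
    """Percentage of CHM cells in one mapping unit that are taller than tree_height."""
    if not chm_cells:
        raise ValueError("mapping unit contains no cells")
    tree_cells = sum(1 for height in chm_cells if height > tree_height)
    return 100.0 * tree_cells / len(chm_cells)

def meets_fao_forest_thresholds(cc_percent, area_ha, min_cc=10.0, min_area_ha=0.5):
    """Check the FAO (2000) figures quoted in the text.

    Assumption: cover must exceed min_cc and area must be at least min_area_ha.
    """
    return cc_percent > min_cc and area_ha >= min_area_ha

# Toy mapping unit of four 1 m cells (heights in metres).
unit = [0.5, 2.0, 4.0, 10.0]
cc = crown_cover_percent(unit)
```

A cell-count estimate of this kind measures canopy cover on the raster grid, so its value depends on the CHM resolution and on the height threshold chosen.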

Segmentation and classification
Segmentation and classification were done in eCognition Developer (Trimble Navigation Ltd.). First, multiresolution segmentation (MRS) was used to create the level-0 segmentation based on the normalized difference vegetation index (NDVI; Tucker, 1979) and CHM, using a scale parameter of 275, shape of 0.5 and compactness of 0.7. Next, MRS was used for the level-1 segmentation inside the level-0 segments based on CHM, using a scale parameter of 25, shape of 0.2 and compactness of 0.6.
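For reference, NDVI is the normalized difference of near-infrared and red reflectance, NDVI = (NIR − Red) / (NIR + Red). The sketch below computes it for single reflectance values. Which AisaEAGLE bands were used for the red and near-infrared terms is not stated in the text, so the inputs here are hypothetical, and returning 0.0 for a zero denominator is an assumption made to keep the function total.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from two reflectance values."""
    denominator = nir + red
    if denominator == 0:
        return 0.0  # assumption: treat no-signal pixels as neutral
    return (nir - red) / denominator

# Hypothetical reflectances: dense vegetation versus bare soil.
vegetation = ndvi(nir=0.5, red=0.1)
bare_soil = ndvi(nir=0.3, red=0.25)
```

Because the index is a ratio, a pixel and its shaded neighbour of the same material tend to give similar values, which is consistent with the later remark that NDVI was preferred over reflectance to limit the influence of shadows.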
The classification was done first on level-1. The first step was to separate all trees. This was done based on CHMnobuildings, where all targets taller than 3 m were classified as trees. Next, CHMbuildings was used to classify the remaining targets that were taller than 3 m as buildings. A temporary class was created for objects with heights of 1-3 m, and training samples were collected for buildings and shrubs. Training samples were collected visually using the AisaEAGLE data. Support Vector Machine (SVM) classification (Vapnik, 1998; Mountrakis et al., 2011) was applied using the first 12 minimum noise fraction (MNF) transformed (Green et al., 1988) AisaEAGLE bands to classify all 1-3 m objects into buildings and shrubs. Twelve MNF bands were shown to yield the highest classification accuracies with AisaEAGLE data in a previous study (Piiroinen et al., 2015).
The remaining objects were classified into a temporary class and merged to create one segment containing all objects of 0-1 m height. This object was segmented based only on NDVI for better separation of bare soil, water and low vegetation targets. SVM classification was applied using median NDVI as input to separate these classes. The level-0 classification was based on the level-1 classes and the LCCS class definitions.
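The height-based part of the level-1 rule set can be summarized as a short decision sequence. The sketch below is a simplified stand-in for the eCognition rules, not a reproduction of them: it takes one object's height from each CHM and either assigns a class or names the spectral step that would resolve it. The label strings, the use of CHMbuildings for the 1-3 m test and the handling of heights exactly at 1 m and 3 m are assumptions.

```python
def level1_height_rule(chm_nobuildings, chm_buildings):
    """Assign a level-1 label from object heights (metres) in the two CHMs.

    Order follows the text: trees first, then buildings, then the 1-3 m
    temporary class, then the 0-1 m temporary class.
    """
    if chm_nobuildings > 3.0:
        return "tree"
    if chm_buildings > 3.0:
        return "building"
    if chm_buildings >= 1.0:
        # Resolved spectrally in the study (SVM on the first 12 MNF bands).
        return "building_or_shrub"
    # Resolved spectrally in the study (SVM on median NDVI):
    # bare soil, water or low vegetation.
    return "low_target"

# Hypothetical objects: (height in CHMnobuildings, height in CHMbuildings).
labels = [level1_height_rule(h_nb, h_b)
          for h_nb, h_b in [(12.0, 12.0), (0.0, 6.0), (2.0, 2.0), (0.2, 0.2)]]
```

Testing CHMnobuildings before CHMbuildings is what lets tall buildings fall through the tree rule: their height is present only in the second model.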

LCCS class definitions and their implementation based on GEOBIA approach
We created eight level-0 classes based on LCCS (Table 1). These classes were composed from the finer scale segmentation and classification results, based on the six land cover classes described in Table 2. As agricultural land and other low vegetation targets were not separated at level-1, the presence of buildings was used as a criterion to separate the agroforestry and woodland classes. This approach is based on the knowledge that in the study area agriculture is practiced mainly on small scale family farms, in which cultivated areas are located next to buildings. One of the strengths of the GEOBIA approach is that it can take advantage of such context based rules to separate classes that are hard to separate based only on their biophysical characteristics (Blaschke et al., 2014).
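The composition step can be sketched as follows: each level-0 segment is summarized by the area share of the level-1 classes it contains, and the presence of buildings acts as the context rule. The cover thresholds of Table 1 are not reproduced here, so the sketch stops at the fractions and the agroforestry/woodland split; the function names, label strings and toy areas are assumptions.

```python
def level1_fractions(objects):
    """Percent of a level-0 segment's area covered by each level-1 class.

    objects: iterable of (class_name, area_m2) for the level-1 objects
    inside one level-0 segment.
    """
    objects = list(objects)
    total = sum(area for _, area in objects)
    if total <= 0:
        raise ValueError("segment has no area")
    fractions = {}
    for name, area in objects:
        fractions[name] = fractions.get(name, 0.0) + 100.0 * area / total
    return fractions

def mixed_class_family(fractions):
    """Context rule from the text: buildings in the segment indicate farmland."""
    return "agroforestry" if fractions.get("building", 0.0) > 0.0 else "woodland"

# Hypothetical level-0 segment of 1000 m2.
segment = [("tree", 200.0), ("tree", 100.0),
           ("low vegetation", 600.0), ("building", 100.0)]
fractions = level1_fractions(segment)
family = mixed_class_family(fractions)
```

The tree percentage returned here is the quantity that a percentage cover classifier would then compare against class thresholds.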
Further visual analysis showed that some agroforestry and agricultural land without nearby buildings was still present in the level-0 segments. These remaining agricultural areas were separated based on the presence of terraces. Agricultural terraces were detected based on the standard deviation (STD) of the terrain slope: terraced farms are combinations of very steep slopes and levelled terraces, which makes the STD very high compared to natural areas, where slope angles change more gradually.
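The terrace rule can be illustrated with a few lines of code: the spread of slope values inside a segment is computed and compared with a threshold. The study does not report the threshold it used, so `std_threshold` below is a hypothetical parameter, as are the function names and the toy slope values.

```python
import statistics

def slope_std(slopes_deg):
    """Population standard deviation of the slope values inside one segment."""
    return statistics.pstdev(slopes_deg)

def looks_terraced(slopes_deg, std_threshold):
    """Flag a segment whose slope STD exceeds a user-chosen threshold."""
    return slope_std(slopes_deg) > std_threshold

# Toy slope samples in degrees.
terraced = [2.0, 40.0, 3.0, 42.0, 1.0, 41.0]    # flat benches and steep risers
natural = [18.0, 20.0, 22.0, 21.0, 19.0, 20.0]  # gradually changing hillside

HYPOTHETICAL_THRESHOLD = 10.0
```

Both toy segments have a similar mean slope of about 20°, which shows why the mean alone would not separate them while the STD does.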
Currently, there is a large demand for a global land cover classification system that would cover all variation in land cover. LCCS (Di Gregorio, 2005) is so far the most ambitious attempt to create one. However, many decisions are still left to the user when the classification system is applied to a certain area. For example, one of the problems with LCCS is that it does not define an agroforestry class explicitly. Agroforestry would belong to cultivated and managed lands, while the cover and height classifiers are included only for the natural and semi-natural terrestrial class. This prevents using LCCS as such to characterize agroforestry systems, and further interpretations of the system are left to the user.
ALS data make it possible to use the LCCS height classifier in land cover characterization, which in turn allows unambiguous and objective threshold values for trees, shrubs and low vegetation. The percentage cover classifier can be used to define mixed land cover classes. However, there is very little discussion in the LCCS manual on how the MMUs should be selected, although this is a key element in forming the classes.
MRS is one possibility for creating MMUs that would represent different levels of heterogeneity in the landscape. However, the segmentation process itself is subjective and the segmentation parameters are difficult to define objectively (Arvor et al., 2013; Belgiu and Dragut, 2014; Hay et al., 2005). Even if objective definitions for the classes are created, local knowledge of land cover is still needed as, for example, some agroforestry systems may have a CC of up to 88% (Panday, 2002; Bisseleua et al., 2009), which makes them overlap with many forest definitions (e.g., Magdon and Klein, 2013; Minang et al., 2014; FAO, 2000).
Table 2. Land cover classes at level-1.

Data fusion and multi-scale segmentation results
CHMbuildings shows the height of targets above ground (Figure 2a), while the IS data show the reflectance (Figure 2b). The results of the level-0 segmentation based on NDVI and CHM are presented in Figure 2c. The shape value was set to 0.5 so that the segments would include heterogeneous objects of trees and low vegetation, while forest patches with sharp edges were still separated into their own segments. Both data sources were shown to be important for creating meaningful segments, as segments based only on CHM would follow individual trees too closely. NDVI was used instead of reflectance to minimize the influence of shadows. The level-1 segmentation, which was based only on CHM and a smaller scale parameter, separated trees and buildings into their own segments (Figure 2d). Our results show that in highly heterogeneous landscapes the fusion of ALS and optical data was essential for successful separation of targets that could not be distinguished using either ALS or IS data alone. For instance, ALS data created segments that followed trees and buildings very closely (Figure 3a), but could not separate bare soil from low vegetation (Figure 3b). The classification based on the level-1 segmentation was successful in separating the main land cover components (Figure 4a). However, this evaluation is based on visual interpretation, and further validation is needed for a more comprehensive accuracy assessment. The validation will be possible when the classification is extended to the complete study area of 10 km × 10 km, where field data collected in 2012, 2013 and 2014 are available.
The level-0 classification (Figure 4b) describes the mixed classes based on the level-1 classification. This reveals how many trees there are in each segment in relation to agriculture or other low vegetation. The most common class was sparse agriculture, which indicates that in the study area it is common practice to grow trees on farmland. On the other hand, dense agroforestry areas were not commonly observed. Woodlands were located mainly on steep slopes and further away from roads and buildings. The benefits of data fusion in the segmentation and classification process were evident as, for example, CHM was very suitable for segmenting and classifying trees, while the AisaEAGLE data were suitable for segmenting and classifying bare soil, water and low vegetation. These results are in agreement with Torabzadeh et al. (2014), who concluded that, at the moment, land cover maps are the RS products that benefit the most from ALS and IS data fusion.

CONCLUSIONS AND FURTHER DEVELOPMENTS
LCCS was successfully modified to characterize mixed land cover classes such as agroforestry. The height and percentage cover classifiers are useful when these classes are to be defined as objectively as possible. Multiresolution segmentation and the multiscale approach created meaningful mapping units for the classes, although further validation of the results is needed before more detailed conclusions can be made. The benefit of this approach over pixel based methods is that clear threshold values can be used when the classes are defined and mapped. ALS and IS data complemented each other well, although the full potential of the IS data was not used at this stage. The importance of IS increases as the classification is extended to species level. Species level information can then be used for more detailed definitions of the level-0 classes. Information on tree types is important for agroforestry classification, as farmers are known to plant specific species on their farmland for specific purposes. The strength of the GEOBIA approach is that such context based rules can be included in the classification process. Future efforts will extend the classification system to our entire study area, and the results will be validated based on field surveys. These results, together with further assessments of above ground biomass, will be used for assessing the carbon stocks and biodiversity in different land cover classes. This information will be highly beneficial for researchers and policy makers aiming to promote agroforestry practices and to design biodiversity conservation plans.

Figure 3. Segmentation based on CHM (a) creates segments based on their height and thus does not differentiate between very low vegetation and bare soil. NDVI based segmentation (b) separates the bare soil from the very low vegetation.

Figure 4. Example of the level-1 classification (a) and level-0 classification based on the fractional cover of level-1 classes (b).