AN OBJECT-BASED METHOD FOR CHINESE LANDFORM TYPES CLASSIFICATION

Landform classification is a necessary task in many fields of landscape and regional planning, for example landscape evaluation, erosion studies and hazard prediction. This study proposes an improved object-based classification of Chinese landform types using the factor importance analysis of random forest and the gray-level co-occurrence matrix (GLCM). Based on the 1 km DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. A random forest classifier is applied to evaluate the importance of the terrain factors, and the importance values are used as multi-scale segmentation thresholds. GLCM texture features are then used to build the knowledge base for classification. The classification result was checked against the 1:4,000,000 Chinese Geomorphological Map. The overall classification accuracy of the proposed method is 5.7% higher than that of ISODATA unsupervised classification and 15.7% higher than that of the traditional object-based classification method.


INTRODUCTION
Landform is one of the most important elements of the physical environment (Shen et al., 1982). Landform classification is the basis of digital terrain analysis, which provides evidence for research on geomorphological mapping and landform distribution, and thus has great value in both production practice and scientific research (Li et al., 2013; Gao, 2004; Zhou and Liu, 2008; Gerçek et al., 2011). In recent years, object-based image analysis (OBIA) has become a widespread concept in DEM-based landform classification studies (Drăguţ and Eisank, 2012; Whiteside and Ahmad, 2005; Manakos et al., 2000; Blaschke, 2010; Van Niekerk, 2010). Objects obtained by image segmentation are treated as the analysis units. Compared with pixel-based classification, this approach avoids salt-and-pepper noise and improves execution efficiency by using feature geometry and structural information (Manakos et al., 2000; Blaschke, 2010). In addition, object-oriented multi-scale segmentation is sensitive to discontinuities in the DEM, so the image can be segmented by maximizing the internal homogeneity of ground features and the heterogeneity between entities.

Lucian Drăguţ (Drăguţ and Blaschke, 2006) presented an automated object-based classification method for landform elements that is reproducible and transferable. The test area was classified into nine classes using flexible fuzzy membership functions built from elevation, profile curvature, plan curvature and slope gradient. In a comparison experiment on high-resolution remote sensing images of test areas, Soe W. Myint (Myint et al., 2011) found that the accuracy of the object-based classification method was 22.8% higher than that of the pixel-based maximum likelihood classification method. Lucian Drăguţ and Clemens Eisank (Drăguţ and Eisank, 2012) automatically classified the 1 km global SRTM DEM into eight topographic classes with an OBIA approach, setting the thresholds from the average elevation and the standard deviation of elevation. The results reasonably resemble the patterns of existing global and regional classifications. Using an object-based strategy, Ciaran Robb (Robb et al., 2015) classified glacial landforms from LIDAR data with an overall accuracy of 77%, and Rajesh B. V. Shruthi (Benz et al., 2004) showed that object-based analysis has advantages over traditional approaches in studies of gully erosion.

However, previous studies have shown that the accuracy of current object-based landform classification is still relatively low. Moreover, determining the thresholds of the multi-scale segmentation algorithm involves subjectivity. Based on the 1:1,000,000 DEM of China, this paper therefore aims to explore an improved method for automatic object-based landform classification, which will help deepen the understanding of the geomorphological characteristics and spatial heterogeneity of China from a macroscopic and quantitative view.

TEST DATA
The 1:1,000,000 DEM data used in this study were produced by the National Geographic Information Centre. The DEM is a digital elevation matrix with a grid resolution of 1 × 1 km, constructed by reading the elevation at the intersections of a square grid with an interval of 28.125″ (longitude difference) × 18.750″ (latitude difference) on the Chinese 1:50,000 and 1:100,000 topographic maps (Song, 2006). The data have a high sampling accuracy and reflect the terrain relief of China well (Liu and Tang, 2012). Elevation ranges from 0 m to 8848 m, with a standard deviation of 1147.1 m. The 1:1,000,000 DEM of China and the landform type map of China (1:4,000,000) are shown in Figure 1 and Figure 2, respectively.

Overview of the method
Firstly, relief amplitude, surface roughness, elevation, elevation variation coefficient, slope of slope, hillshade and composite curvature were filtered and extracted as the factor combination for landform classification. Then multi-scale segmentation was performed on the combined image, with thresholds based on the factor importance calculated by the random forest classifier. Next, the gray-level co-occurrence matrix (GLCM) was used to construct the classification knowledge base, and automatic landform classification was carried out with the nearest-neighbour classification method. Finally, the accuracy was assessed against the 1:4,000,000 geomorphological map of China and adjacent areas. The technical flow chart is shown in Figure 3.

Determination of landform classification factors
Terrain factors extracted from the DEM are the most effective tools for describing landforms quantitatively, and a combination of several terrain factors can comprehensively reflect the morphological characteristics of landscape entities (Tang, 2014; Wang et al., 2012). To depict the regional landform features of China, elevation, relief amplitude, surface roughness, surface incision and elevation variation coefficient were first selected as macro topographic factors, and slope, slope of slope and composite curvature as micro topographic factors. In addition, the hillshade image, which produces a shading effect on the terrain surface and thus visually enhances the terrain undulation, was also selected. Different terrain factors describe the morphological and spatial features of landforms from different angles. However, because some factors are highly correlated, high-dimensional factor sets introduce redundant information, so it is necessary to filter out a combination of weakly correlated terrain factors for classification. To avoid the influence of different units, all terrain factors were normalized to the range 0-255. The correlations among the nine terrain factors are shown in Table 1. As Table 1 shows, high correlation exists among slope, relief amplitude and surface incision. Sheffield's entropy method was used to choose among these three factors, computed as in formula (1),
where S is the entropy of the n-dimensional data subset, n is the dimension, and |M_S| is the determinant of the covariance matrix of the n-dimensional data subset. The higher the value, the larger the entropy of the image and the more beneficial the combination is for image classification. The entropy values of the three candidate combinations are 26.89, 26.86 and 27.20, respectively; the combination containing relief amplitude has the maximum entropy value. Therefore, relief amplitude, surface roughness, elevation, elevation variation coefficient, slope of slope, hillshade and composite curvature were determined to be the best factor combination.
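The entropy comparison can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: it assumes that formula (1) is the commonly cited form of Sheffield's entropy, S = n/2 + (n/2) ln(2π) + (1/2) ln|M_S|, and the function names and toy pixel values are hypothetical.

```python
import math

def covariance_matrix(columns):
    """Sample covariance matrix; `columns` holds one value list per factor layer."""
    n_obs = len(columns[0])
    means = [sum(col) / n_obs for col in columns]
    return [[sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n_obs - 1)
             for cb, mb in zip(columns, means)]
            for ca, ma in zip(columns, means)]

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for i in range(size):
        pivot = max(range(i, size), key=lambda r: abs(m[r][i]))
        if abs(m[pivot][i]) < 1e-12:
            return 0.0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det  # a row swap flips the sign
        det *= m[i][i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            for c in range(i, size):
                m[r][c] -= factor * m[i][c]
    return det

def sheffield_entropy(columns):
    """Assumed form of formula (1): S = n/2 + (n/2) ln(2*pi) + (1/2) ln|M_S|."""
    n = len(columns)
    det = determinant(covariance_matrix(columns))
    if det <= 0:
        raise ValueError("covariance matrix is singular; factors are fully redundant")
    return n / 2 + n / 2 * math.log(2 * math.pi) + 0.5 * math.log(det)

# Hypothetical toy layers (flattened pixel values), not data from the study.
layer_a = [0, 2, 0, 2]
layer_b = [0, 0, 2, 2]       # uncorrelated with layer_a
layer_c = [0, 2, 0.5, 2.5]   # strongly correlated with layer_a
print(sheffield_entropy([layer_a, layer_b]))
print(sheffield_entropy([layer_a, layer_c]))
```

The weakly correlated pair yields the larger entropy, which is the property the selection step relies on: of several candidate combinations, the one with the highest S carries the least redundant information.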

Calculation of factor importance
Seven terrain factors were selected to classify the landform types, and each factor was treated as a single layer to compose a 7-band image. However, the factors differ in their ability to characterize and separate different landforms, so important factors should be given a higher weight. This study proposes a method in which the thresholds of the multi-scale segmentation layers are determined from the variable importance measure provided by the random forest classifier. Random forest is a general term for ensemble methods using tree-type classifiers {h(x, Θk), k=1, …}, where the {Θk} are independent identically distributed random vectors and x is an input pattern. For classification, each tree casts a vote for a class at input x, and the output of the classifier is determined by a majority vote of multiple CART-like trees (Breiman, 2001). The trees are created by drawing a subset of training samples with replacement (a bagging approach). The variable importance is estimated from the error on the samples left out of each bootstrap sample, known as the out-of-bag (OOB) error: the importance of a factor is measured by the increase in RF classification error after the factor is replaced with random noise (Lei, 2012; Song, 2012). The importance of every terrain factor is shown in Figure 4.
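The out-of-bag importance idea can be illustrated with a small, self-contained Python sketch. It is a simplification under stated assumptions: depth-one trees (stumps) stand in for full CART trees, permuting a factor's OOB values plays the role of replacing it with random noise, and all names and the synthetic data are hypothetical rather than taken from the study.

```python
import random

def majority(labels):
    """Most frequent label; ties go to the smallest label for determinism."""
    return max(sorted(set(labels)), key=labels.count)

def fit_stump(X, y, features):
    """Depth-one tree: best (feature, threshold) split by training error."""
    best = (len(y) + 1, None, None, majority(y), majority(y))
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            err = (sum(lab != l_lab for lab in left)
                   + sum(lab != r_lab for lab in right))
            if err < best[0]:
                best = (err, f, t, l_lab, r_lab)
    return best[1:]

def predict(stump, row):
    f, t, l_lab, r_lab = stump
    if f is None:
        return l_lab
    return l_lab if row[f] <= t else r_lab

def error_rate(stump, X, y):
    return sum(predict(stump, row) != lab for row, lab in zip(X, y)) / len(y)

def oob_permutation_importance(X, y, n_trees=25, max_features=2, seed=0):
    """Mean increase in OOB error after permuting each feature."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    totals, used = [0.0] * p, 0
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]   # bagging: draw with replacement
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        if not oob:
            continue
        stump = fit_stump([X[i] for i in boot], [y[i] for i in boot],
                          rng.sample(range(p), max_features))
        X_oob, y_oob = [X[i] for i in oob], [y[i] for i in oob]
        base = error_rate(stump, X_oob, y_oob)
        for j in range(p):
            col = [row[j] for row in X_oob]
            rng.shuffle(col)                          # destroy the feature's signal
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X_oob, col)]
            totals[j] += error_rate(stump, X_perm, y_oob) - base
        used += 1
    return [total / used for total in totals]

# Synthetic data: only feature 0 determines the class; features 1 and 2 are noise.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(120)]
y = [1 if row[0] > 0.5 else 0 for row in X]
importance = oob_permutation_importance(X, y)
print(importance)
```

In this toy setting the informative feature should receive a clearly larger score than the noise features. In the method described above, the analogous scores for the seven terrain layers are rounded and used as per-layer settings for the multi-scale segmentation.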

Object-based classification based on GLCM
In this paper, the nearest-neighbour classifier is used to classify the landform types. Seven landform classes (plain, hill, low mountain, low-mid mountain, high-mid mountain, high mountain and extreme high mountain) were inserted into the class hierarchy. GLCM texture features were used to build the classification feature space for each landform class, and samples were selected as well. A DEM, as a digital representation of the terrain surface, is naturally a spatial field model. Because of the interpolation technique, the elevation values of the DEM raster are highly spatially autocorrelated. Based on an estimate of the second-order joint conditional probability density function, the GLCM describes the probability of occurrence of a pair of pixels with gray values i and j that are separated by distance d in direction θ. The spatial relationship between pixels, described by distance, angle and other parameters, is thereby introduced into the texture analysis model to reveal image texture features. Because this model takes the spatial autocorrelation between pixels into account, it is a feasible approach to the texture analysis of a DEM (Liu et al., 2013; Liu and Kuang, 2009). To take full advantage of the directionality of the GLCM, eight texture measures (Homogeneity, Contrast, Dissimilarity, Entropy, Angular 2nd Moment, Mean, Standard Deviation and Correlation) in four directions for the seven terrain factor layers were selected to build the feature space, and then the samples were extracted. During the first classification pass, it is appropriate to choose fewer samples. As shown in Figure 5, the number of sample objects in the three plain areas is relatively small because of their large area and complete patches. In contrast, the number is larger for high mountain and extreme high mountain, which are located near the Hengduan Mountains and the Himalayas, because of their fragmented patches. Following the feature space optimization (FSO) strategy, the Euclidean distance in feature space between the samples used for classification was calculated, and the feature combination with the best separation distance was selected as the optimal classification feature combination (Laliberte et al., 2012). The separation distance of the feature space at different dimensions is shown in Figure 5. As shown in Figure 7, the maximum separation distance, 0.176, occurs when the dimension is 9. In this case, nine features (Entropy, Homogeneity and Contrast of hillshade; Correlation of slope of slope; Correlation and Entropy of surface roughness; Entropy and Homogeneity of relief amplitude; and Correlation of elevation variation coefficient) were selected as the optimal feature space combination.
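To make the GLCM description concrete, the following Python sketch builds a symmetric, normalized co-occurrence matrix for one direction and distance and derives several of the texture measures named above. It uses common Haralick-style definitions, which may differ in detail from the software used in the study; Mean, Standard Deviation and Correlation are omitted for brevity, and the function names and the small test image are illustrative assumptions.

```python
import math

# (row step, column step) for each direction θ at distance d = 1
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, levels, angle=0, distance=1):
    """Symmetric, normalized GLCM of an integer image with gray values < levels."""
    dr, dc = OFFSETS[angle]
    dr, dc = dr * distance, dc * distance
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1   # count the pair in both orders (symmetric)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Haralick-style measures computed from a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "asm": sum(v * v for _, _, v in cells),
    }

# Small 4-level test image (illustrative values, not DEM data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = {angle: texture_features(glcm(image, levels=4, angle=angle))
            for angle in OFFSETS}
print(features[0])
```

Computing the measures for each of the four directions and each terrain factor layer yields the candidate feature space from which the optimal combination is then selected.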

Overall accuracy assessment
In addition, the classification accuracy of the method in this study was compared with the results obtained by the ISODATA unsupervised classification method (Shruthi et al., 2015) and by the common object-based classification described in Wang et al. (2012). The results are presented in Table 3. As Table 3 shows, the accuracy of the method put forward in this study is 5.7% higher than that of ISODATA unsupervised classification and 15.7% higher than that of the common object-based method, which indicates a clear advantage in classification accuracy.
Further, an area in southwest China with complex landform features (the test area is shown in Figure 9) was selected to test the applicability of this approach to different DEM sources. SRTM (Shuttle Radar Topography Mission) data with a resolution of 90 m were selected as the data source and were also resampled to 500 m and 1000 m resolution DEMs. The landforms of the test area were classified, and the accuracy is shown in Table 4. The table shows that classification based on the 90 m resolution SRTM has the highest accuracy and Kappa coefficient, and that the accuracy declines as the resolution becomes coarser. Moreover, the classification accuracy based on the 1000 m resolution SRTM is higher than that based on the DEM of the same resolution produced by the China National Geographic Information Centre. However, because of its large data volume, the 90 m resolution SRTM requires considerable time for multi-scale segmentation and texture-based landform classification, so future research will focus on optimizing the classification method to address this difficulty.
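For reference, overall accuracy and the Kappa coefficient are both derived from a confusion matrix that cross-tabulates classified against reference classes. The sketch below uses the standard definitions; the matrix values are hypothetical and are not results from this study.

```python
def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix.

    confusion[i][j] = number of samples of reference class i assigned to class j.
    """
    total = sum(map(sum, confusion))
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix for illustration only.
example = [
    [40, 10],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(example)
print(accuracy, kappa)
```

Kappa discounts the agreement that would occur by chance, which is why it is reported alongside overall accuracy when results at different resolutions are compared.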

CONCLUSION AND FUTURE DEVELOPMENT
This study proposes an improved object-based classification of Chinese landform types. Based on the 1000 m resolution DEM of China, the combination of terrain factors extracted from the DEM is selected by correlation analysis and Sheffield's entropy method. The random forest classifier is then applied to measure the importance of the terrain factors, and the importance values are used as multi-scale segmentation thresholds. GLCM texture features are used to build the knowledge base for classification. Finally, the objects obtained by multi-scale segmentation are classified with the filtered optimal feature combination. The experimental results show that this method produces a complete classification with high accuracy and reduces the subjectivity of the classification process. Compared with the common object-based classification method, the method in this study has a clear advantage in classification accuracy. However, determining the segmentation scale requires multiple tests, which leads to low efficiency. Further research should therefore focus on how to determine the segmentation scale adaptively for different combinations of terrain factors, and on whether this approach is suitable for micro-landform extraction and classification.

Figure 1. 1:1,000,000 DEM of China

Figure 3. The technology flow chart

Figure 4. Importance measure of each terrain factor

Those values are rounded and then set as the multi-scale segmentation threshold for each layer. The segmentation threshold was set to 90 after repeated experiments.

Figure 5. The separation distance map of the feature combination dimension

Initial classification of the image was performed using this combination, and the final results were obtained after three or

Figure 6. Result of China landform classification

Table 3. Comparison of classification accuracy