EFFECT OF THE TRAINING SET CONFIGURATION ON SENTINEL-2-BASED URBAN LOCAL CLIMATE ZONE CLASSIFICATION
Keywords: Label Noise, Canonical Correlation Forests (CCFs), Local Climate Zones (LCZs), Sentinel-2, Spectral features, Classification, Open Street Map (OSM)
Abstract. Like any supervised classification procedure, Local Climate Zone (LCZ) mapping requires reliable reference data. These data are usually created manually and inevitably contain label noise, caused by the complexity of the LCZ class scheme as well as by variations in cultural and physical environmental factors. This study evaluates the impact of the training set configuration, i.e. the number of training samples and the class imbalance, on the performance of Canonical Correlation Forests (CCFs) in classifying the 11 urban LCZ classes. Experiments are carried out on globally available Sentinel-2 imagery. Besides the multi-spectral observations, different index measures extracted from the images as well as the Global Urban Footprint (GUF) and Open Street Map (OSM) layers are fed into the CCF classifier. The results show that different LCZs favor different configurations in terms of training sample number and balance. Based on these findings, majority voting over the predictions obtained from different configurations is proposed and applied, which yields a significant accuracy improvement.
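
The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each training set configuration yields one label per sample (e.g. per pixel or patch) and that ties are resolved in favor of the smallest label; the function name `majority_vote`, the tie rule, and the example labels are hypothetical and not taken from the study.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse label predictions from several configurations by majority vote.

    predictions: list of equally long label sequences, one per training
    set configuration (hypothetical input layout).
    Ties go to the smallest label, which is an arbitrary assumption.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all prediction sequences must have the same length")

    fused = []
    # zip(*predictions) groups the votes cast for each sample
    for votes in zip(*predictions):
        counts = Counter(votes)
        best = max(counts.values())
        # keep the most frequent label; break ties by the smallest label
        fused.append(min(label for label, n in counts.items() if n == best))
    return fused


# Made-up LCZ labels predicted for five samples under three configurations
preds = [
    [1, 2, 2, 5, 0],
    [1, 2, 3, 5, 4],
    [1, 3, 3, 6, 4],
]
fused = majority_vote(preds)
```

With an odd number of configurations and mostly agreeing predictions, ties are rare; a different tie rule (for example, preferring the configuration with the highest validation accuracy) would be an equally plausible choice.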