P-LINKNET: LINKNET WITH SPATIAL PYRAMID POOLING FOR HIGH-RESOLUTION SATELLITE IMAGERY
Keywords: Deep learning, high-resolution remote sensing imagery, building extraction, semantic segmentation, encoder-decoder, fully convolutional networks
Abstract. Automatic extraction of buildings from high-resolution remote sensing imagery is useful in many applications such as city management, mapping, urban planning and geographic information updating. Although the problem has been studied extensively in recent years, high-precision building segmentation from high-resolution remote sensing images remains challenging due to the varied textures of buildings and the complexity of image backgrounds. Moreover, the repeated pooling and striding operations used in convolutional neural networks reduce feature resolution and cause the loss of detail information. To address this problem, we propose P-LinkNet, a deep learning model based on LinkNet that adds a spatial pyramid pooling module to capture and aggregate multi-scale contextual information. We evaluated it on the Inria building dataset. Experimental results show that the proposed P-LinkNet is superior to LinkNet.
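To illustrate the idea of inserting a spatial pyramid pooling module between an encoder and a decoder, the sketch below shows a PSPNet-style pooling block in PyTorch. This is a minimal illustration, not the authors' released code: the class name, pool sizes, channel counts and the assumption of a ResNet34-based LinkNet encoder with 512-channel output features are all illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' implementation) of a spatial
# pyramid pooling block that could sit between a LinkNet encoder and decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialPyramidPooling(nn.Module):
    def __init__(self, in_channels, pool_sizes=(1, 2, 3, 6)):
        super().__init__()
        branch_channels = in_channels // len(pool_sizes)
        # One branch per pyramid level: adaptive pooling, then 1x1 conv.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(size),
                nn.Conv2d(in_channels, branch_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            )
            for size in pool_sizes
        ])
        # Fuse the original features with the upsampled pyramid branches.
        self.project = nn.Sequential(
            nn.Conv2d(in_channels + branch_channels * len(pool_sizes),
                      in_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(in_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        h, w = x.shape[2:]
        # Pool at each scale, then upsample back to the input resolution.
        pyramid = [x] + [
            F.interpolate(branch(x), size=(h, w), mode='bilinear',
                          align_corners=False)
            for branch in self.branches
        ]
        return self.project(torch.cat(pyramid, dim=1))


# Example: encoder features assumed to be 512 channels at 1/32 resolution.
features = torch.randn(1, 512, 16, 16)
context = SpatialPyramidPooling(512)(features)
print(context.shape)  # torch.Size([1, 512, 16, 16])
```

Because each branch pools the feature map to a different fixed size before being upsampled and concatenated, the fused output aggregates contextual information at multiple scales while keeping the spatial resolution of the encoder features unchanged.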