SEMANTIC SEGMENTATION OF AIRBORNE IMAGES AND CORRESPONDING DIGITAL SURFACE MODELS – ADDITIONAL INPUT DATA OR ADDITIONAL TASK?
Keywords: Convolutional Network, Semantic Segmentation, Multi-Task Learning, Height Estimation
Abstract. We analyze the effect of additional height data on the semantic segmentation of aerial images with a convolutional encoder-decoder network. In addition to a purely image-based semantic segmentation, we trained the same network with height as an additional input. Furthermore, we defined a multi-task model in which the network is trained to estimate the relative height of objects in parallel to semantic segmentation, using the image data only. We find that excellent results are possible with image data alone and that additional height information has no significant effect, neither when employed as an extra input nor when used for multi-task training, even with differently weighted losses. Based on these results, we hypothesize that a strong encoder-decoder network implicitly learns the correlation between object categories and relative heights.
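To make the notion of "differently weighted losses" in the multi-task setting concrete, the following is a minimal sketch of how a segmentation loss and a height-regression loss might be combined into one training objective. The abstract does not specify the individual loss functions or the weighting scheme, so the cross-entropy term, the L1 height term, and the names `multitask_loss`, `w_seg`, and `w_height` are illustrative assumptions, not the authors' implementation.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood over pixels.

    probs:  list of per-pixel class-probability lists (each sums to 1)
    labels: list of ground-truth class indices, one per pixel
    """
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(probs, labels)) / len(labels)


def l1_height_loss(h_pred, h_true):
    """Mean absolute error between predicted and reference relative heights.

    Assumption: L1 is used here only as an example regression loss.
    """
    return sum(abs(a - b) for a, b in zip(h_pred, h_true)) / len(h_true)


def multitask_loss(probs, labels, h_pred, h_true, w_seg=1.0, w_height=1.0):
    """Weighted sum of the two task losses (hypothetical weighting scheme).

    Setting w_height=0.0 recovers the purely image-based segmentation loss.
    """
    return (w_seg * cross_entropy(probs, labels)
            + w_height * l1_height_loss(h_pred, h_true))


# Usage: two pixels, two classes, heights in metres (toy values).
probs = [[0.5, 0.5], [0.5, 0.5]]
labels = [0, 1]
h_pred = [1.0, 3.0]
h_true = [2.0, 3.0]
loss_equal = multitask_loss(probs, labels, h_pred, h_true)
loss_seg_only = multitask_loss(probs, labels, h_pred, h_true, w_height=0.0)
```

Varying `w_seg` and `w_height` changes how strongly each task drives the shared encoder-decoder. In this sketch, both tasks would use the image as the only input, with height serving solely as a training target.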