High-Quality Road Detection Using U-Net-Based Semantic Segmentation with High-Resolution Orthophotos and DSM Data in Urban Environments
Keywords: Road Detection, U-net, Orthophotos, DSM, Urban Environments
Abstract. Road detection and recognition from high-resolution geospatial data in urban environments is critical for numerous applications, including urban planning, navigation systems, and automated driving technologies. This study explores the potential of deep learning methodologies, specifically U-Net-based semantic segmentation, for high-quality road detection using open-access datasets. The input data consists of high-resolution digital orthophotos of a city region and corresponding Digital Surface Models (DSMs), allowing for a comprehensive analysis of two scenarios: (A) semantic segmentation using only imagery data and (B) segmentation utilizing both imagery and DSM data. Building on prior works by the authors, which include digital surface modelling and satellite image classification using U-Net and other neural network architectures, this research applies state-of-the-art techniques to leverage the spatial richness of orthophotos and the vertical information embedded in DSMs. Preliminary results indicate a significant improvement in road detection accuracy when integrating DSM data, highlighting the synergistic value of multi-source data in geospatial analysis. To validate the segmentation outputs, independent satellite imagery data are employed as a benchmark, enabling quantitative assessments of positional accuracy. The integrated orthophoto-DSM strategy achieved completeness, correctness, quality, and overall accuracy of 99.76%, 77.35%, 77.21%, and 95.95%, respectively, surpassing the solely orthophoto-based model, which achieved 73.76%, 51.10%, 42.52%, and 86.29% in these metrics. Experiments ensure that the proposed methodology achieves a robust delineation of road networks in urban environments, with scenario B outperforming scenario A. This research contributes to the growing body of literature on deep learning applications in photogrammetry and remote sensing, aligning with related studies on the semantic segmentation of urban features. The results underscore the importance of multimodal data fusion in geospatial analysis and its implications for enhancing road detection frameworks. Extracted road maps provide an accurate baseline for urban development planning and transportation management applications.