The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Publications Copernicus
Articles | Volume XLVIII-1-2024
https://doi.org/10.5194/isprs-archives-XLVIII-1-2024-605-2024
10 May 2024

Building footprint extraction from aerial images using an edge-aware YOLO-v8 network

Ziyu Song, Weixi Wang, Zhenyu Hao, Xiaoming Li, Shengjun Tang, and Linfu Xie

Keywords: Building footprint extraction, Remote sensing images, YOLO-v8, Instance segmentation, Prewitt model

Abstract. Building footprints are a critical indicator for assessing urban infrastructure, and extracting them from remote sensing imagery has significant practical applications. However, rapid and accurate extraction of building footprints remains highly challenging, especially in scenarios with complex scenes, dense building distributions, and small targets. The instance segmentation models of the YOLO series offer strong real-time performance, saving considerable time and effort in practical applications. We therefore propose a building footprint extraction method based on an enhanced YOLO-v8 network, with three enhancements aimed at improving extraction accuracy. First, building upon the YOLO-v8 framework, we incorporate the Feature Pyramid Network (FPN) module into feature maps at all scales to efficiently propagate high-level semantic information. Second, we introduce the Triple Feature Encoder (TFE) module, which integrates spatial detail information from feature maps at three different scales to enhance the network's ability to extract multi-scale information. Finally, we explore the integration of the Prewitt model, a conventional edge detection operator, to assist in extracting edge features in target regions of feature maps. This integration aims to reduce the jagged edges frequently seen in the outputs of the original YOLO-v8. Furthermore, the Prewitt operator's noise suppression capability helps mitigate the influence of non-target areas in the feature maps. The proposed framework achieves instance segmentation accuracy of 84.6% mAP50 and 51.4% mAP50:95 on public datasets, outperforming the original YOLO-v8 network.
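
As an illustration of the edge operator named in the abstract, the following is a minimal sketch of the standard 3x3 Prewitt gradient applied to a single 2D feature-map channel. The abstract does not describe how the operator is wired into the network, so this is not the authors' implementation: the function names `filter3x3` and `prewitt_magnitude`, the edge-replicated padding, and the toy input are all assumptions made for illustration. Pure Python is used only to keep the example dependency-free.

```python
import math

# Standard 3x3 Prewitt kernels: each one differentiates along one axis
# and averages along the other, which gives the mild smoothing effect.
PREWITT_X = [[-1, 0, 1],
             [-1, 0, 1],
             [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1],
             [0, 0, 0],
             [1, 1, 1]]


def filter3x3(channel, kernel):
    """Cross-correlate a 2D list with a 3x3 kernel (hypothetical helper).

    Borders are handled by replicating the nearest edge value,
    which is an assumption of this sketch.
    """
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    yy = min(max(y + i - 1, 0), h - 1)
                    xx = min(max(x + j - 1, 0), w - 1)
                    acc += kernel[i][j] * channel[yy][xx]
            out[y][x] = acc
    return out


def prewitt_magnitude(channel):
    """Gradient magnitude sqrt(gx^2 + gy^2) for one feature-map channel."""
    gx = filter3x3(channel, PREWITT_X)
    gy = filter3x3(channel, PREWITT_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


# Toy 6x6 "feature map": a 3x3 block of ones standing in for a building.
channel = [[1.0 if 2 <= y <= 4 and 2 <= x <= 4 else 0.0 for x in range(6)]
           for y in range(6)]
edges = prewitt_magnitude(channel)
```

In this toy case the response is zero in the flat interior of the block and in the empty background, and non-zero along the block boundary, which is the behaviour an edge-aware branch would exploit to sharpen mask outlines.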