The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume XLVIII-2/W9-2025
https://doi.org/10.5194/isprs-archives-XLVIII-2-W9-2025-161-2025
04 Sep 2025

Hierarchical Scene Graph Generation and Vectorization of Aerial Images

Vladimir A. Knyaz, Vladimir V. Kniaz, Sergey Yu. Zheltov, Anton V. Emelyanov, and Egor R. Smirnov

Keywords: image vectorization, scene graph generation, hierarchical representation, maps updating, convolutional neural networks

Abstract. Vector representation of geodata is widely used in various applications because of its high information density and the advanced level of information representation introduced by the human operator when creating a map. A map can be regarded as a vector representation of the understanding of a scene derived from its image. Scene understanding can be considered at different levels of depth, beginning with image classification and semantic segmentation and ending with the recovery of rich semantic relationships between objects and their hierarchy. With progress in machine learning methods and in tools for obtaining and processing large amounts of data, a set of neural network models has been developed that demonstrate state-of-the-art performance (human-level or better) in image classification and semantic segmentation tasks. After object detection and recognition, the next step in scene understanding is retrieving the relations between objects and their hierarchy. This problem is known as scene graph generation, and it has recently received notable attention from the scientific community. The developed approach incorporates information about the structural and functional relationships between objects in the image, which, on the one hand, improves the quality of segmentation through the use of new a priori data and, on the other hand, reduces the time the operator spends on subsequent processing of the neural network output. To train and evaluate the developed framework, a dedicated dataset was collected and annotated. It contains more than 10k aerial photographs representing various types of objects, taken in different years and seasons. The evaluation results on the created dataset demonstrate the state-of-the-art performance of the developed framework.
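
To make the notion of a hierarchical scene graph concrete, the following is a minimal illustrative sketch of one possible data structure: objects carry a vectorized outline, a parent-child hierarchy, and pairwise relations. It is not the authors' implementation; the class names (SceneNode, SceneGraph), the field names, the labels, and the "adjacent_to" predicate are all assumptions introduced here for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SceneNode:
    """One object in the scene (hypothetical schema, not from the paper)."""
    node_id: int
    label: str
    # Vectorized outline as (x, y) vertices in image coordinates.
    polygon: list[tuple[float, float]]
    # Child objects: this list encodes the hierarchy.
    children: list[SceneNode] = field(default_factory=list)


@dataclass
class SceneGraph:
    """Hierarchy root plus relations between objects."""
    root: SceneNode
    # Relations stored as (subject_id, predicate, object_id) triples.
    relations: list[tuple[int, str, int]] = field(default_factory=list)

    def walk(self, node: SceneNode | None = None, depth: int = 0):
        """Yield (depth, node) pairs in depth-first order."""
        if node is None:
            node = self.root
        yield depth, node
        for child in node.children:
            yield from self.walk(child, depth + 1)


# Toy example: a block that contains a building and a road segment.
building = SceneNode(2, "building", [(0, 0), (10, 0), (10, 8), (0, 8)])
road = SceneNode(3, "road", [(0, 9), (30, 9), (30, 12), (0, 12)])
block = SceneNode(1, "residential_block",
                  [(0, 0), (30, 0), (30, 12), (0, 12)],
                  children=[building, road])
graph = SceneGraph(root=block, relations=[(2, "adjacent_to", 3)])

# Flatten the hierarchy into (depth, label) pairs.
labels = [(depth, node.label) for depth, node in graph.walk()]
```

In this sketch the hierarchy lives in the children lists while functional or spatial relations are kept as separate triples, so the same object can take part in several relations without being duplicated in the tree.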
