GRAPH NEURAL NETWORK BASED MULTI-FEATURE FUSION FOR BUILDING CHANGE DETECTION
Keywords: Graph Neural Network, Feature extraction, Multi-feature fusion, Dense Image Matching, Building Change Detection, Node aggregation
Abstract. Building Change Detection (BCD) via multi-temporal remote sensing images is essential for various applications such as urban monitoring, urban planning, and disaster assessment. However, most building change detection approaches only extract features from different kinds of remote sensing images for change index determination, which can not determine the insignificant changes of small buildings. Given co-registered multi-temporal remote sensing images, the illumination variations and misregistration errors always lead to inaccurate change detection results. This study investigates the applicability of multi-feature fusion from both directly extract 2D features from remote sensing images and 3D features extracted by the dense image matching (DIM) generated 3D point cloud for accurate building change index generation. This paper introduces a graph neural network (GNN) based end-to-end learning framework for building change detection. The proposed framework includes feature extraction, feature fusion, and change index prediction. It starts with a pre-trained VGG-16 network as a backend and uses U-net architecture with five layers for feature map extraction. The extracted 2D features and 3D features are utilized as input into GNN based feature fusion parts. In the GNN parts, we introduce a flexible context aggregation mechanism based on attention to address the illumination variations and misregistration errors, enabling the framework to reason about the image-based texture information and depth information introduced by DIM generated 3D point cloud jointly. After that, the GNN generated affinity matrix is utilized for change index determination through a Hungarian algorithm. The experiment conducted on a dataset that covered Setagaya-Ku, Tokyo area, shows that the proposed method generated change map achieved the precision of 0.762 and the F1-score of 0.68 at pixel-level. Compared to traditional image-based change detection methods, our approach learns prior over geometrical structure information from the real 3D world, which robust to the misregistration errors. Compared to CNN based methods, the proposed method learns to fuse 2D and 3D features together to represent more comprehensive information for building change index determination. The experimental comparison results demonstrated that the proposed approach outperforms the traditional methods and CNN based methods.