The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Volume XLVIII-4/W10-2024
https://doi.org/10.5194/isprs-archives-XLVIII-4-W10-2024-199-2024
31 May 2024

Global Structure-From-Motion Enhanced Neural Radiance Fields 3D Reconstruction

Tong Ye, He Huang, Yucheng Liu, and Junxing Yang

Keywords: Urban 3D reconstruction, Neural Radiance Fields, sparse point cloud, SfM algorithm

Abstract. Urban three-dimensional modeling digitizes geographical elements such as terrain and buildings, and profoundly impacts urban management, planning, and development. In recent years such models have demonstrated clear benefits, yet acquiring comprehensive geographical information remains difficult with traditional surveying. The emergence of unmanned aerial vehicle (UAV) oblique photography offers a viable solution, enabling cost-effective acquisition of geographical information and supporting the construction of three-dimensional urban models. However, UAV-based 3D reconstruction still encounters several issues.
Conventional 3D reconstruction typically begins with data collection, acquiring two-dimensional images with cameras or point clouds with LiDAR. Subsequent steps include data preprocessing, feature extraction and matching, surface model construction, texture mapping, and rendering to generate the three-dimensional model. However, traditional methods exhibit limitations such as poor adaptation to complex scenes, heavy computational demands for large-scale scenes, and difficulty with dynamic scenes.
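As a point of reference for the feature extraction and matching stage mentioned above, the following Python sketch uses OpenCV's SIFT detector with a ratio-test matcher. It is an illustrative fragment, not code from the paper; the image paths and the ratio threshold are placeholders.

```python
# Minimal sketch of the feature extraction / matching stage of a conventional
# photogrammetric pipeline. Later stages (sparse/dense reconstruction, meshing,
# texturing) would build on these correspondences.
import cv2

def match_image_pair(path_a, path_b, ratio=0.75):
    """Detect SIFT keypoints in two images and keep ratio-test matches."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)

    # k-nearest-neighbour matching with Lowe's ratio test to reject
    # ambiguous correspondences before pose estimation.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    knn = matcher.knnMatch(des_a, des_b, k=2)
    good = [m for m, n in knn if m.distance < ratio * n.distance]
    return kp_a, kp_b, good
```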
Deep learning-based 3D reconstruction methods such as MVSNet employ deep neural networks to infer scene depth from multi-view images, yielding high-quality reconstruction results. However, they rely heavily on prior training data and auxiliary information, which limits their generalization.
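For illustration, the core depth-regression idea behind MVSNet-style networks can be sketched as a softmax-weighted expectation over sampled depth hypotheses (the "soft argmin"). The tensor names and shapes below are assumptions for this sketch, not the reference implementation.

```python
# Hedged sketch of MVSNet-style depth regression: a cost volume over D depth
# hypotheses is converted into a per-pixel depth map via a softmax-weighted
# expectation along the depth dimension.
import torch
import torch.nn.functional as F

def soft_argmin_depth(cost_volume, depth_hypotheses):
    """
    cost_volume:       (B, D, H, W) matching cost per depth hypothesis
    depth_hypotheses:  (D,) sampled depths along the reference camera ray
    returns:           (B, H, W) expected depth per pixel
    """
    # Lower cost should mean higher probability, hence the negation.
    prob = F.softmax(-cost_volume, dim=1)                      # (B, D, H, W)
    depth = (prob * depth_hypotheses.view(1, -1, 1, 1)).sum(dim=1)
    return depth
```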
Neural Radiance Fields (NeRF) combine neural networks with volumetric rendering, excelling at reconstructing objects with high precision and detail, handling dynamic scenes, and coping with sparsely observed regions. However, NeRF requires camera poses and a sparse point cloud as input, commonly obtained with COLMAP's incremental Structure-from-Motion (SfM), which has its own limitations. To address them, pairing NeRF with global SfM has emerged as a promising solution. Global SfM processes the entire image set simultaneously, estimating camera trajectories and scene structure from features shared among multiple images, which yields more accurate sparse point clouds and significantly improves rendered image quality. Experimental validation on open-source and self-collected datasets demonstrates that the algorithm markedly enhances surface texture quality compared with traditional 3D reconstruction and with NeRF driven by incremental SfM. In summary, the algorithm improves 3D reconstruction quality and exhibits superior robustness, scalability, and accuracy compared to conventional methods.
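The volumetric rendering that NeRF couples with its neural network can be summarized by the standard compositing quadrature along each camera ray. The PyTorch sketch below is illustrative only, with hypothetical variable names; in the pipeline described here, the camera poses and sparse point cloud that initialize such a model come from global SfM rather than COLMAP's incremental SfM.

```python
# Minimal sketch of NeRF's volume-rendering quadrature: densities and colours
# sampled along a camera ray are composited into a single pixel colour.
import torch

def composite_ray(sigmas, rgbs, deltas):
    """
    sigmas: (N,)   volume densities at N samples along the ray
    rgbs:   (N, 3) colours predicted at those samples
    deltas: (N,)   distances between consecutive samples
    returns (3,)   rendered pixel colour
    """
    alpha = 1.0 - torch.exp(-sigmas * deltas)                 # per-sample opacity
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0
    )                                                         # accumulated transmittance
    weights = alpha * trans
    return (weights.unsqueeze(-1) * rgbs).sum(dim=0)
```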