HANDCRAFTED AND LEARNING-BASED TIE POINT FEATURES – COMPARISON USING THE EUROSDR RPAS BENCHMARK DATASET
Keywords: Aerial triangulation, Benchmark, CNN, Deep learning, EuroSDR, RPAS, SfM photogrammetry, Tie points
Abstract. The identification of accurate and reliable image correspondences is fundamental to Structure-from-Motion (SfM) photogrammetry. Alongside handcrafted detectors and descriptors, recent machine learning-based approaches have shown promising results for tie point extraction, demonstrating successful matching under strong perspective and illumination changes and a general increase in tie point multiplicity. Several methods based on convolutional neural networks (CNNs) have recently been proposed, but few tests have yet been performed in real photogrammetric applications and, in particular, on full-resolution aerial and RPAS image blocks, which require rotationally invariant features. The research reported here compares two handcrafted methods (Metashape local features and RootSIFT) and two learning-based methods (LFNet and Key.Net) using the previously unused EuroSDR RPAS benchmark dataset. Analysis is conducted with DJI Zenmuse P1 imagery acquired at Wards Hill quarry in Northumberland, UK. The research first extracts keypoints with each of the aforementioned methods and then imports them into COLMAP for incremental reconstruction. The image coordinates of signalised ground control points (GCPs) and independent checkpoints (CPs) are automatically detected with an OpenCV algorithm and triangulated for comparison with accurate geometric ground truth. The tests show that learning-based local features can outperform traditional methods in terms of geometric accuracy, but several issues remain: few deep learning local features are trained to be rotation invariant, large-format imagery demands significant computational resources, and performance degrades on repetitive patterns.
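As an illustration of the RootSIFT variant evaluated here, the following minimal sketch shows how SIFT descriptors can be converted to RootSIFT with OpenCV: each descriptor is L1-normalised and square-rooted so that Euclidean distances between descriptors approximate the Hellinger kernel on the original gradient histograms. This is a generic example of the technique, not the exact extraction pipeline used in the study; the image path and feature budget are placeholders.

```python
import cv2
import numpy as np

def extract_rootsift(image_path, n_features=8000):
    """Detect SIFT keypoints and convert their descriptors to RootSIFT.

    RootSIFT L1-normalises each SIFT descriptor and takes the
    element-wise square root, so that comparing descriptors with the
    Euclidean distance approximates the Hellinger kernel.
    """
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create(nfeatures=n_features)
    keypoints, descriptors = sift.detectAndCompute(image, None)
    if descriptors is None:  # no features detected in this image
        return keypoints, None
    descriptors /= (descriptors.sum(axis=1, keepdims=True) + 1e-7)  # L1-normalise
    descriptors = np.sqrt(descriptors)                              # Hellinger map
    return keypoints, descriptors

# Placeholder usage: extract RootSIFT features from one image of the block
kps, descs = extract_rootsift("frame_0001.jpg")
```

Keypoints and descriptors produced in this way can then be written to a COLMAP database for matching and incremental reconstruction, mirroring the workflow described in the abstract.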