<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Archives</journal-id>
<journal-title-group>
<journal-title>The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Archives</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9034</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-archives-XLVIII-4-W11-2024-183-2024</article-id>
<title-group>
<article-title>Stereo Vision SLAM with SuperPoint and SuperGlue</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Yoon</surname>
<given-names>Si-Won</given-names>
<ext-link>https://orcid.org/0009-0004-8842-607X</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Park</surname>
<given-names>Soon-Yong</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, South Korea</addr-line>
</aff>
<pub-date pub-type="epub">
<day>27</day>
<month>06</month>
<year>2024</year>
</pub-date>
<volume>XLVIII-4/W11-2024</volume>
<fpage>183</fpage>
<lpage>188</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2024 Si-Won Yoon</copyright-statement>
<copyright-year>2024</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri"  xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-archives.copernicus.org/articles/XLVIII-4-W11-2024/183/2024/isprs-archives-XLVIII-4-W11-2024-183-2024.html">This article is available from https://isprs-archives.copernicus.org/articles/XLVIII-4-W11-2024/183/2024/isprs-archives-XLVIII-4-W11-2024-183-2024.html</self-uri>
<self-uri xlink:href="https://isprs-archives.copernicus.org/articles/XLVIII-4-W11-2024/183/2024/isprs-archives-XLVIII-4-W11-2024-183-2024.pdf">The full text article is available as a PDF file from https://isprs-archives.copernicus.org/articles/XLVIII-4-W11-2024/183/2024/isprs-archives-XLVIII-4-W11-2024-183-2024.pdf</self-uri>
<abstract>
<p>This paper presents a method for stereo visual odometry and mapping that integrates VINS-Fusion-based visual odometry estimation with deep learning techniques for camera pose tracking and stereo image matching. VINS-Fusion relies on classical methods for feature extraction and matching, which often lead to inaccuracies in triangulation-based 3D position estimation. These inaccuracies can be mitigated by incorporating IMU-based position estimation, which yields more accurate odometry estimates in three-dimensional space than using the stereo camera alone. Consequently, the original VINS-stereo algorithm requires a tightly coupled integration of IMU sensor measurements with the estimated visual odometry.</p>
<p>To address these challenges, we propose replacing the feature extraction method used in VINS-Fusion, the Shi-Tomasi (Good Features to Track) technique, with feature extraction via the SuperPoint deep network. This approach has demonstrated promising experimental results. We also apply a deep learning model to the matching of feature points that are projections of the same three-dimensional point onto pixel coordinates in different images. Instead of the KLT optical flow algorithm employed by VINS-Fusion, the proposed method uses SuperGlue, a deep graph neural network for graph matching, to improve image tracking and stereo image matching performance. The proposed algorithm is evaluated on the publicly available EuRoC dataset and compared with existing algorithms.</p>
</abstract>
<counts><page-count count="6"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>