<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Archives</journal-id>
<journal-title-group>
<journal-title>The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Archives</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9034</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-archives-XLVI-4-W4-2021-101-2021</article-id>
<title-group>
<article-title>ASSESSING LIDAR TRAINING DATA QUANTITIES FOR CLASSIFICATION MODELS</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Majgaonkar</surname>
<given-names>O.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Panchal</surname>
<given-names>K.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Laefer</surname>
<given-names>D.</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Stanley</surname>
<given-names>M.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zaki</surname>
<given-names>Y.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>New York University Abu Dhabi, Abu Dhabi, UAE</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>New York University, New York, NY, USA</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Center for Urban Science and Progress; Department of Civil and Urban Engineering, New York University, New York, NY, USA</addr-line>
</aff>
<pub-date pub-type="epub">
<day>07</day>
<month>10</month>
<year>2021</year>
</pub-date>
<volume>XLVI-4/W4-2021</volume>
<fpage>101</fpage>
<lpage>106</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2021 O. Majgaonkar et al.</copyright-statement>
<copyright-year>2021</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-archives.copernicus.org/articles/XLVI-4-W4-2021/101/2021/isprs-archives-XLVI-4-W4-2021-101-2021.html">This article is available from https://isprs-archives.copernicus.org/articles/XLVI-4-W4-2021/101/2021/isprs-archives-XLVI-4-W4-2021-101-2021.html</self-uri>
<self-uri xlink:href="https://isprs-archives.copernicus.org/articles/XLVI-4-W4-2021/101/2021/isprs-archives-XLVI-4-W4-2021-101-2021.pdf">The full text article is available as a PDF file from https://isprs-archives.copernicus.org/articles/XLVI-4-W4-2021/101/2021/isprs-archives-XLVI-4-W4-2021-101-2021.pdf</self-uri>
<abstract>
<p>Classifying objects within aerial Light Detection and Ranging (LiDAR) data is an essential task to which machine learning (ML) is increasingly applied. ML has been shown to be more effective on LiDAR than on imagery for classification, but most efforts have focused on imagery because of the challenges presented by LiDAR data. LiDAR datasets are of higher dimensionality, discontinuous, heterogeneous, spatially incomplete, and often scarce. As such, there has been little examination of the fundamental properties of the training data required for acceptable performance of classification models tailored to LiDAR data. The quantity of training data is one such crucial property, because training on datasets of different sizes provides insight into a model’s performance across differing amounts of data. This paper assesses the impact of training data size on the accuracy of PointNet, a widely used ML approach for point cloud classification. Models trained on subsets of ModelNet ranging from 40 to 9,843 objects were evaluated on a test set of 400 objects. Accuracy improved logarithmically: gains decelerated from 45 objects onwards and slowed significantly at a training size of 2,000 objects, corresponding to 20,000,000 points. This work contributes to the theoretical foundation for the development of LiDAR-focused models by establishing a learning curve, suggesting the minimum quantity of manually labelled data necessary for satisfactory classification performance, and providing a path for further analysis of the effects of modifying training data characteristics.</p>
</abstract>
<counts><page-count count="6"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>