<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Archives</journal-id>
<journal-title-group>
<journal-title>The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Archives</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9034</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-archives-XLVIII-4-W20-2025-33-2026</article-id>
<title-group>
<article-title>Transformer-LSTM Improve Maize-Yield Estimation in Smallholder Fields of Malawi</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Jere</surname>
<given-names>Mathews</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Munthali</surname>
<given-names>Kondwani</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>University of Malawi, Malawi</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>University of Malawi, Malawi</addr-line>
</aff>
<pub-date pub-type="epub">
<day>29</day>
<month>04</month>
<year>2026</year>
</pub-date>
<volume>XLVIII-4/W20-2025</volume>
<fpage>33</fpage>
<lpage>38</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Mathews Jere</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-archives.copernicus.org/articles/XLVIII-4-W20-2025/33/2026/isprs-archives-XLVIII-4-W20-2025-33-2026.html">This article is available from https://isprs-archives.copernicus.org/articles/XLVIII-4-W20-2025/33/2026/isprs-archives-XLVIII-4-W20-2025-33-2026.html</self-uri>
<self-uri xlink:href="https://isprs-archives.copernicus.org/articles/XLVIII-4-W20-2025/33/2026/isprs-archives-XLVIII-4-W20-2025-33-2026.pdf">The full text article is available as a PDF file from https://isprs-archives.copernicus.org/articles/XLVIII-4-W20-2025/33/2026/isprs-archives-XLVIII-4-W20-2025-33-2026.pdf</self-uri>
<abstract>
<p>Accurate in-season yield estimation is essential for Malawi’s food-security planning, yet conventional crop-cut surveys cover fewer than 1% of the nation’s approximately 1.8 million sub-hectare maize plots. In this study, we use Sentinel-2 time-series imagery to benchmark five modelling paradigms: spectral-index linear regression, XGBoost, CNN-LSTM, a frozen Vision Transformer (ViT) and a ViT-LSTM hybrid. We evaluate them on eight rain-fed maize fields (0.2–0.9 ha) in Zomba District, Malawi, under a strict nested leave-one-field-out cross-validation design. The recurrent architectures significantly outperform the tabular baselines (<italic>p</italic> ≤ 0.02, exact paired-permutation test). The ViT-LSTM hybrid achieves the lowest error (RMSE = 0.022 t ha<sup>−1</sup>; MAE = 0.019 t ha<sup>−1</sup>), an improvement of approximately 80% over the best CNN-LSTM comparator that is statistically significant (<italic>p</italic> = 0.031). Inference remains practical at ≈ 35 ms per 32×32-pixel patch, or about 3 ha s<sup>−1</sup>, on a low-end Quadro P1000 GPU, enabling national-scale yield mosaics within a week. These findings align with emerging evidence that transformer–recurrent hybrid architectures represent the current state of the art for crop-yield prediction (see, e.g., ViT-based studies) and highlight the enduring trade-off between accuracy and throughput in operational contexts. Moreover, our open-source pipeline, the first validated on data-scarce, intercropped smallholder plots in sub-Saharan Africa, provides a reproducible blueprint for operational yield monitoring across similar agro-ecologies. The experiment scripts are available at <ext-link ext-link-type="uri" xlink:href="https://github.com/jahnical/yield-pred-models-comp">https://github.com/jahnical/yield-pred-models-comp</ext-link>.</p>
</abstract>
<counts><page-count count="6"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>