Comparison of manual and semi-automated synthetic training data creation for individual tree crown delineation

Steier, Janik; Iwaszczuk, Dorota

doi:10.5194/isprs-archives-XLVIII-1-W6-2025-227-2025

Articles | Volume XLVIII-1/W6-2025

https://doi.org/10.5194/isprs-archives-XLVIII-1-W6-2025-227-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

31 Dec 2025

Keywords: Manual Labeling, Synthetic Training Data, Individual Tree Crown Delineation

Abstract. Deep learning models for individual tree detection and crown delineation (ITDCD) rely on large, high-quality annotation datasets to produce accurate predictions. In most ITDCD studies, the training data or annotations are collected through manual labeling. Manual labeling, especially of complex structures like tree crowns, is time-consuming and often yields error-prone annotations, which in turn can lead to significant errors in the predictions of deep learning models. Semi- or fully automated training data creation has the potential to make the creation process more efficient and to ensure a high-quality training dataset. In this work, we present a methodology for generating semi-automated synthetic training data for deep learning-based ITDCD applications. Furthermore, we conduct a systematic comparison of the manual and synthetic training data creation methods based on four criteria (validity, efficiency, variety, and scalability) to illustrate, structurally and practically, the advantages and disadvantages of the two approaches. Overall, the semi-automated synthetic data approach outperforms manual labeling in terms of validity, efficiency, and scalability; once the algorithm is implemented, it rapidly generates arbitrarily large, high-quality, reproducible tree crown annotation datasets. In contrast, manual creation is the more efficient way to produce small, high-quality datasets (e.g., for fine-tuning a pre-trained model) when the alternative is developing a semi-automated method from scratch.
