The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Publications Copernicus
Articles | Volume XLVIII-2/W8-2024
https://doi.org/10.5194/isprs-archives-XLVIII-2-W8-2024-327-2024
14 Dec 2024

Deep Learning-Based AI-Assisted Visual Inspection Systems for Historic Buildings and their Comparative Performance with ChatGPT-4O

Mayank Mishra, Kai Zhang, Chiara Mea, Luigi Barazzetti, Francesco Fassi, Fausta Fiorillo, and Mattia Previtali

Keywords: Artificial intelligence, ChatGPT, Computer Vision, Damage identification, Deep Learning

Abstract. Historical buildings and monuments typically degrade over time through constant exposure to external agents. The use of artificial intelligence (AI) to support conservation and restoration specialists in identifying surface decay is currently a research topic of considerable interest. This study presents two approaches: ChatGPT and an object detection architecture (YOLOv5). Specifically, the investigation evaluated ChatGPT's ability to identify and describe surface degradation pathologies by exploiting its pre-trained models for image analysis. The ICOMOS-ISCS: Illustrated Glossary on Stone Deterioration Patterns (2008) was provided as a reference to guide the use of specific terminology. In the first test phase, to verify the accuracy of the ChatGPT results, benchmark images depicting different types of damage were used; these were extracted from the UNI 11182 (2006) standard, which defines the degradation types. Images from literature studies and other photographic datasets were used only later. In general, the results of the analysis were validated against the conclusions of professionals and of other AI techniques, as well as against the descriptions provided by reference manuals in the literature. In particular, the decay annotations predicted by the pre-trained object detection model were compared with those made by human experts. The capabilities and limitations of both approaches as tools for identifying deterioration pathologies are illustrated.
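
As an illustration of the comparison between predicted and expert decay annotations, the sketch below matches bounding boxes by intersection over union (IoU) and counts agreements and disagreements. The abstract does not state which matching rule or metric was used, so the IoU criterion, the 0.5 threshold, the greedy matching, the function names, and the example labels and coordinates are all assumptions made for illustration only.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates (assumed convention).
Box = Tuple[float, float, float, float]
# An annotation pairs a decay label with a box; labels here are illustrative.
Annotation = Tuple[str, Box]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_annotations(
    predicted: List[Annotation],
    expert: List[Annotation],
    threshold: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> Tuple[int, int, int]:
    """Greedy one-to-one matching of predictions to expert annotations.

    Returns (true_positives, false_positives, false_negatives).
    A prediction counts as correct when it shares a label with a
    still-unmatched expert box and their IoU reaches the threshold.
    """
    unmatched = list(range(len(expert)))
    tp = 0
    for p_label, p_box in predicted:
        best_j, best_score = None, threshold
        for j in unmatched:
            e_label, e_box = expert[j]
            if e_label != p_label:
                continue
            score = iou(p_box, e_box)
            if score >= best_score:
                best_j, best_score = j, score
        if best_j is not None:
            unmatched.remove(best_j)
            tp += 1
    fp = len(predicted) - tp
    fn = len(unmatched)
    return tp, fp, fn


# Made-up example: two predictions, two expert annotations.
predicted = [
    ("crack", (0, 0, 10, 10)),
    ("biological colonization", (20, 20, 30, 30)),
]
expert = [
    ("crack", (1, 1, 10, 10)),
    ("detachment", (50, 50, 60, 60)),
]
tp, fp, fn = match_annotations(predicted, expert)
print(tp, fp, fn)
```

With counts of this kind, precision is tp / (tp + fp) and recall is tp / (tp + fn); whether the study reports these particular quantities is not stated in the abstract.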