Exploring Point Transformers on 3D Semantic Segmentation of Javanese Architectures
Keywords: 3D Semantic Segmentation, Javanese Architecture, Point Clouds, Deep Learning, Cultural Heritage
Abstract. The complex geometry of Javanese architecture poses significant challenges for 3D semantic segmentation in cultural heritage documentation. This study evaluates state-of-the-art Point Transformers, namely PTv1, PTv2, PTv3, and LitePT, on the Sewu temple dataset, focusing on robustness and efficiency. PTv1 and PTv2 achieve the highest mean Intersection-over-Union (mIoU) of 0.71, but they incur high computational costs. LitePT, by contrast, offers a favorable trade-off, delivering competitive accuracy (0.69 mIoU) while running considerably faster. Furthermore, experiments with limited data reveal significant benefits from transfer learning on European heritage datasets. We conclude that efficient Point Transformer architectures are promising for the automated understanding of complex non-European monuments.
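
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how mIoU is commonly computed from per-point labels. It is an illustration under stated assumptions, not the evaluation code used in this study: the function name `mean_iou`, its arguments, and the choice to skip classes that appear in neither the prediction nor the ground truth are all assumptions made for the example.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union over per-point class labels.

    Hypothetical helper for illustration; classes absent from both
    `pred` and `target` are skipped (an assumption, conventions vary).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    flat = target * num_classes + pred
    cm = np.bincount(flat, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)

    tp = np.diag(cm)                              # correctly labeled points per class
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp  # predicted + actual - overlap

    present = union > 0
    if not present.any():
        return float("nan")
    return float((tp[present] / union[present]).mean())


# Toy example with three classes and six points.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Whether absent classes are skipped or counted as zero changes the reported value, which is worth checking when comparing mIoU figures across studies.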
