PCINet: a Prototype- and Concept-based Interpretable Network for Multi-scene Recognition

Hua, Yuansheng; Zhu, Jiasong; Li, Qingquan

doi:https://doi.org/10.5194/isprs-archives-XLVIII-1-2024-265-2024

Articles | Volume XLVIII-1-2024

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

10 May 2024

Keywords: Aerial image interpretation, Multi-scene recognition, Network interpretability, Concept bottleneck

Abstract. With the development of remote sensing techniques, a large number of high-resolution aerial images are now available and benefit many applications. Multi-scene recognition, which refers to predicting the multiple scenes that coexist in an aerial image, plays a key role in applying remote sensing images to these applications and has attracted increasing attention. Recently, most researchers have focused on devising deep learning-based recognition models and have achieved great success. However, few efforts have been devoted to explaining the success of deep neural networks in multi-scene recognition. To address this, we introduce the concept bottleneck model (CBM) to interpret model performance and propose a novel network, namely the Prototype- and Concept-based Interpretable Network (PCINet), that projects aerial imagery into a prototype-concept memory bank and encodes their correlations to explain how a network identifies coexisting scenes in an aerial image. Specifically, the proposed network mainly consists of two branches: a prototype matching branch that measures similarity scores between image features and scene prototypes, and a concept bottleneck branch that aligns image features to textual embeddings and computes their relations with concept embeddings. Afterwards, the outputs of the two branches are integrated to infer scene categories. Experimental results show that the model enhances interpretability, providing valuable insights for urban planning and resource management, thereby bridging the gap between deep learning models and practical applications.
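
The two-branch idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the form of the concept-to-scene mapping, or how the branch outputs are integrated. The sketch assumes cosine similarity, a linear concept-to-scene weight matrix, a weighted average controlled by a hypothetical `alpha`, and a fixed decision `threshold`. It also assumes the image feature has already been aligned to the same space as the textual concept embeddings. All function names and the toy vectors are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def prototype_branch(feature, prototypes):
    """One similarity score per scene: image feature vs. that scene's prototype."""
    return [cosine(feature, p) for p in prototypes]


def concept_branch(feature, concepts, concept_to_scene):
    """Concept bottleneck: scene scores are computed only from concept activations.

    Assumption: `feature` already lives in the same space as the concept
    embeddings, and `concept_to_scene` is a (scenes x concepts) weight matrix.
    """
    activations = [cosine(feature, c) for c in concepts]
    return [sum(w * a for w, a in zip(row, activations)) for row in concept_to_scene]


def predict_scenes(feature, prototypes, concepts, concept_to_scene,
                   alpha=0.5, threshold=0.5):
    """Integrate both branches (assumed: weighted average) and threshold per scene."""
    proto = prototype_branch(feature, prototypes)
    conc = concept_branch(feature, concepts, concept_to_scene)
    scores = [alpha * p + (1 - alpha) * c for p, c in zip(proto, conc)]
    labels = [s >= threshold for s in scores]  # multi-label: each scene decided independently
    return scores, labels


# Toy data (entirely made up): 2 scenes, 3 concepts, 2-dimensional features.
feature = [1.0, 0.0]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
concepts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
concept_to_scene = [[1.0, 0.0, 0.5],
                    [0.0, 1.0, 0.5]]

scores, labels = predict_scenes(feature, prototypes, concepts, concept_to_scene)
print(scores, labels)
```

Because every scene score in the concept branch is a weighted sum of named concept activations, the weights in `concept_to_scene` can be read directly as an explanation of which concepts support which scene, which is the general appeal of a concept bottleneck design.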
