The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume XLVIII-2/W9-2025
https://doi.org/10.5194/isprs-archives-XLVIII-2-W9-2025-213-2025
04 Sep 2025

Adaptive Hybrid Transformer Network with Frequency Attention for All-in-One Image Restoration

Cong Thang Pham, An Hung Nguyen, Quoc Cuong Nguyen, Minh Nhat Phan, and Thanh Than Nguyen

Keywords: Image Restoration, Transformer, Frequency Attention, Hybrid Convolution and Attention, All-in-One model

Abstract. Image restoration is a critical task in computer vision, aiming to reconstruct high-quality images from inputs degraded by environmental factors or sensor limitations. Traditional restoration methods are typically designed around prior knowledge of a specific degradation type, such as Gaussian noise, rain streaks, or haze. This specificity limits their flexibility and their effectiveness in real-world scenarios, where degradations are diverse and unpredictable. To address this limitation, this study proposes a unified image restoration framework capable of handling multiple degradation types without requiring explicit prior knowledge. Specifically, the proposed approach targets three common and challenging degradation scenarios: Gaussian noise, rain, and haze, which are known to exhibit distinct patterns in the frequency domain. The core of the framework is a Hybrid Convolution and Attention (HCA) mechanism, which integrates the localized feature extraction of convolutional neural networks with the global context modeling of attention, allowing the network to adaptively capture both fine spatial detail and long-range dependencies. Additionally, a Frequency Attention (FA) module is introduced to enhance the model's sensitivity to frequency-domain features, enabling more effective discrimination of degraded image structures and improving restoration accuracy across tasks. To further improve convergence and perceptual quality, training is guided by a composite loss function combining Multi-Scale Structural Similarity (MS-SSIM) and ℓ1 losses. Experimental evaluations on benchmark datasets demonstrate that the proposed method consistently outperforms existing approaches, achieving a PSNR of 34.97 dB and an SSIM of 0.950 when trained jointly across all degradation types. Notably, the model attains a PSNR of 39.19 dB and an SSIM of 0.990 on the SOTS dehazing dataset, highlighting its strong generalization and robustness across diverse restoration scenarios.
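
The abstract does not detail how the two HCA branches are fused, so the following PyTorch sketch is illustrative only: a depthwise-convolution branch for local detail runs in parallel with a multi-head self-attention branch for global context, and the outputs are merged by residual summation. The class and argument names are hypothetical, not the authors' implementation.

```python
# Minimal sketch of a Hybrid Convolution and Attention (HCA) block.
# The fusion strategy (residual summation) is an assumption for illustration.
import torch
import torch.nn as nn

class HCABlock(nn.Module):
    """Parallel local (conv) and global (self-attention) branches."""
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        # Local branch: depthwise 3x3 conv captures fine spatial detail.
        self.local = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim),
            nn.GELU(),
            nn.Conv2d(dim, dim, kernel_size=1),
        )
        # Global branch: multi-head self-attention over flattened pixels
        # models long-range dependencies.
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        tokens = x.flatten(2).transpose(1, 2)      # (B, H*W, C)
        tokens = self.norm(tokens)
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return x + local + glob                    # residual fusion (assumed)
```

Running the branches in parallel rather than in series lets the block balance local texture recovery against global structure modeling, which matches the abstract's stated motivation for combining the two mechanisms.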
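Likewise, the exact form of the Frequency Attention module is not given in the abstract. A minimal interpretation, assuming a channel-attention design driven by the FFT amplitude spectrum (the module and parameter names below are hypothetical):

```python
# Sketch of a Frequency Attention (FA) module: channels are reweighted
# according to a descriptor of their FFT amplitude spectrum. This is an
# assumed design, not the paper's exact architecture.
import torch
import torch.nn as nn

class FrequencyAttention(nn.Module):
    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rain streaks, haze, and Gaussian noise occupy distinct frequency
        # bands, so the mean amplitude spectrum is a cheap per-channel
        # frequency descriptor.
        amp = torch.fft.rfft2(x, norm="ortho").abs()  # (B, C, H, W//2+1)
        desc = amp.mean(dim=(2, 3))                   # (B, C)
        weight = self.mlp(desc).unsqueeze(-1).unsqueeze(-1)
        return x * weight                             # channel-wise reweighting
```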
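The composite objective can be sketched as a convex mix of an MS-SSIM term and an ℓ1 term. The weight `alpha` below is a placeholder, not a value reported by the authors, and the use of the pytorch-msssim package is an assumption for illustration:

```python
# Sketch of the composite training objective: MS-SSIM combined with L1.
# alpha is hypothetical; the abstract does not report the mixing weight.
# Assumes inputs are normalized to [0, 1] and pip install pytorch-msssim.
import torch
import torch.nn.functional as F
from pytorch_msssim import ms_ssim

def restoration_loss(pred: torch.Tensor, target: torch.Tensor,
                     alpha: float = 0.84) -> torch.Tensor:
    # The MS-SSIM term rewards structural/perceptual fidelity, while the
    # L1 term anchors absolute pixel values and stabilizes training.
    msssim_term = 1.0 - ms_ssim(pred, target, data_range=1.0)
    l1_term = F.l1_loss(pred, target)
    return alpha * msssim_term + (1.0 - alpha) * l1_term
```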
