The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Publications Copernicus
Articles | Volume XLVIII-2/W9-2025
https://doi.org/10.5194/isprs-archives-XLVIII-2-W9-2025-219-2025
04 Sep 2025

RSB-MedNeXt: An attempt at beating the STU-Net through Robust Stem and Bottleneck Design

Cong Thang Pham, Minh Toan Dinh, and Thi Thu Thao Tran

Keywords: Medical image segmentation, STU-Net, MedNeXt, U-Net

Abstract. Medical image segmentation is a crucial task that supports clinical diagnosis and treatment planning. Deep learning, specifically U-Net and its variants, has revolutionized the field in both theory and practice. Recently, STU-Net and similar works were released to improve scalability and transferability, two weaknesses of U-Net. These works led to significant practical advances in medical applications. However, STU-Net trades a disproportionate amount of efficiency for performance, so fine-tuning it to improve over training from scratch is very costly. In this paper, we systematically analyze the architectural strengths of STU-Net and MedNeXt, as well as the limitations that hinder optimal feature learning. Based on this analysis, we propose RSB-MedNeXt, a more robust CNN architecture designed to surpass STU-Net while maintaining efficiency. Our architecture introduces two key innovations: (1) a robust stem module with three parallel branches that extract information at multiple scales, and (2) a hybrid bottleneck that combines CNN-based feature extraction with self-attention mechanisms to capture both fine-grained details and global context. We integrate our network into the nnU-Net framework and conduct comprehensive experiments on multiple segmentation tasks against STU-Net and MedNeXt. Results demonstrate that RSB-MedNeXt achieves superior performance while requiring fewer computational resources than STU-Net. We hope that our approach effectively addresses the trade-off between performance and efficiency in medical image segmentation and offers a promising method for resource-constrained clinical applications.
