Intelligent System for Automatic Bidirectional Sign Language Translation Based on Recognition and Synthesis of Audiovisual and Sign Speech
Keywords: Sign Language Translation, Gesture Recognition, Audio-Visual Speech Recognition, Speech Synthesis, Multimodal Communication
Abstract. This paper presents an intelligent system for automatic bidirectional translation between sign language and spoken language, aimed at enabling inclusive communication between deaf or hard-of-hearing individuals and hearing people. The proposed system integrates four core modules: sign language recognition, audiovisual speech recognition, sign language synthesis, and speech synthesis. The system operates on a constrained vocabulary of 84 phrases relevant to medical consultations and supports translation in both directions: sign-to-speech and speech-to-sign. The architecture leverages state-of-the-art deep learning techniques, including transformer-based models, neural vocoders, and avatar-driven gesture synthesis. Experimental evaluations demonstrate high accuracy in both gesture and speech recognition, along with strong subjective ratings for the naturalness and intelligibility of the synthesized outputs. This work contributes to the advancement of accessible communication technologies and lays the groundwork for future expansion into broader domains and unconstrained dialog scenarios.