Improving Gesture Recognition Efficiency with MediaPipe and YOLO-Pose
Keywords: Gesture Recognition, Keypoint Detection, Performance Algorithms, Computer Vision, MediaPipe, YOLO-Pose
Abstract. This paper presents an improved hybrid approach to gesture recognition that combines a fast, lightweight keypoint detection stage based on MediaPipe with a highly accurate YOLO-Pose model (keypoint estimation integrated into the YOLO pipeline). This combination drastically reduces the computational load compared to traditional convolutional networks while maintaining or even improving recognition accuracy. As part of an extended study, in addition to the original experiment comparing different models on the HaGRID dataset, an additional experiment was conducted to evaluate the robustness of the system to changes in camera angle and gesture execution speed. The results show that the proposed method provides stable gesture recognition with a mean Average Precision above 0.80 even under extreme conditions, which opens up prospects for its integration into mobile and embedded systems. We also tested several artificial-intelligence ensembles for gesture detection and classification, but these traditional methods performed worse than the combination of YOLO-Pose with MediaPipe.
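Pipelines of this kind typically feed detected hand keypoints into a lightweight gesture classifier. As a minimal illustrative sketch (not the paper's exact implementation), the following shows one common preprocessing step for such a classifier: normalizing 2-D keypoints by translating them to the wrist and scaling by hand size, so the classifier becomes invariant to position and scale. The function name and the use of NumPy arrays are assumptions for illustration; MediaPipe's 21-point hand model, where index 0 is the wrist, is the assumed keypoint layout.

```python
import numpy as np

def normalize_keypoints(kpts: np.ndarray) -> np.ndarray:
    """Make 2-D hand keypoints translation- and scale-invariant.

    kpts: (N, 2) array of (x, y) pixel coordinates; index 0 is
    assumed to be the wrist, as in MediaPipe's 21-point hand model.
    """
    centered = kpts - kpts[0]                       # wrist becomes the origin
    scale = np.linalg.norm(centered, axis=1).max()  # largest wrist-to-point distance
    if scale == 0:
        return centered                             # degenerate case: all points coincide
    return centered / scale                         # coordinates now lie in [-1, 1]
```

With this normalization, the same gesture produces identical feature vectors regardless of where the hand appears in the frame or how close it is to the camera, which is what allows a small classifier to stay accurate across viewpoints.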