The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Download
Publications Copernicus
Download
Citation
Articles | Volume XL-5/W6
https://doi.org/10.5194/isprsarchives-XL-5-W6-83-2015
https://doi.org/10.5194/isprsarchives-XL-5-W6-83-2015
18 May 2015
 | 18 May 2015

DACTYL ALPHABET GESTURE RECOGNITION IN A VIDEO SEQUENCE USING MICROSOFT KINECT

S. G. Artyukhin and L. M. Mestetskiy

Keywords: Gesture recongniton, Kinect, Morphological skeleton, Dactyl sign, Segmentation

Abstract. This paper presents an efficient framework for solving the problem of static gesture recognition based on data obtained from the web cameras and depth sensor Kinect (RGB-D - data). Each gesture given by a pair of images: color image and depth map. The database store gestures by it features description, genereated by frame for each gesture of the alphabet. Recognition algorithm takes as input a video sequence (a sequence of frames) for marking, put in correspondence with each frame sequence gesture from the database, or decide that there is no suitable gesture in the database. First, classification of the frame of the video sequence is done separately without interframe information. Then, a sequence of successful marked frames in equal gesture is grouped into a single static gesture. We propose a method combined segmentation of frame by depth map and RGB-image. The primary segmentation is based on the depth map. It gives information about the position and allows to get hands rough border. Then, based on the color image border is specified and performed analysis of the shape of the hand. Method of continuous skeleton is used to generate features. We propose a method of skeleton terminal branches, which gives the opportunity to determine the position of the fingers and wrist. Classification features for gesture is description of the position of the fingers relative to the wrist. The experiments were carried out with the developed algorithm on the example of the American Sign Language. American Sign Language gesture has several components, including the shape of the hand, its orientation in space and the type of movement. The accuracy of the proposed method is evaluated on the base of collected gestures consisting of 2700 frames.