In the quest for robust hand segmentation, we evaluated the performance of state-of-the-art semantic segmentation methods, both off-the-shelf and fine-tuned, on existing datasets.
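As a hedged illustration of what such an evaluation involves, the following sketch fine-tunes an off-the-shelf torchvision model for binary hand segmentation; the model choice (DeepLabV3-ResNet50), learning rate, and data shapes are assumptions for illustration, not details from the work summarized above.

```python
# A minimal sketch of fine-tuning an off-the-shelf segmentation model for
# binary hand segmentation. Model choice, learning rate, and shapes are
# illustrative assumptions.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

model = deeplabv3_resnet50(weights="DEFAULT")
# Swap the 21-class COCO/VOC head for a 2-class one (background vs. hand).
model.classifier[4] = nn.Conv2d(256, 2, kernel_size=1)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, masks):
    # images: (N, 3, H, W) float tensor; masks: (N, H, W) long in {0, 1}
    model.train()
    optimizer.zero_grad()
    logits = model(images)["out"]   # (N, 2, H, W) per-pixel class scores
    loss = criterion(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```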
Our model is built on the observation that egocentric activities are strongly characterized by the objects involved and their locations in the video.
We propose a two-stage convolutional neural network (CNN) architecture, HGR-Net, for robust hand gesture recognition: the first stage performs accurate semantic segmentation to determine hand regions, and the second stage identifies the gesture.
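A minimal sketch of such a two-stage pipeline is given below. The layer sizes and the fusion scheme (concatenating the RGB frame with the predicted hand map) are illustrative assumptions, not the published HGR-Net architecture.

```python
# A minimal two-stage sketch: stage 1 predicts a per-pixel hand map,
# stage 2 classifies the gesture from the image fused with that map.
# All layer sizes are illustrative, not the published architecture.
import torch
import torch.nn as nn

class SegStage(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),            # per-pixel hand logit
        )

    def forward(self, x):
        return torch.sigmoid(self.body(x))  # (N, 1, H, W) hand probability

class GestureStage(nn.Module):
    def __init__(self, num_gestures):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, num_gestures)

    def forward(self, image, hand_map):
        x = torch.cat([image, hand_map], dim=1)  # fuse RGB with hand map
        return self.head(self.features(x).flatten(1))

seg, cls = SegStage(), GestureStage(num_gestures=10)
img = torch.randn(2, 3, 128, 128)
logits = cls(img, seg(img))                      # (2, 10) gesture scores
```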
To overcome this challenge, we develop a neural network that adapts its receptive field not only per layer but also per neuron at each spatial location.
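One plausible realization of such a spatially adaptive receptive field, sketched below, is per-pixel soft selection over parallel dilated branches; the branch count and dilation rates are assumptions, and the paper's actual mechanism may differ.

```python
# A sketch of a spatially adaptive receptive field: a 1x1 gate produces
# per-pixel weights over parallel 3x3 branches with different dilation
# rates, so each output neuron mixes its own effective receptive field.
import torch
import torch.nn as nn

class AdaptiveRFConv(nn.Module):
    def __init__(self, channels, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.gate = nn.Conv2d(channels, len(rates), 1)  # one logit per branch

    def forward(self, x):
        w = torch.softmax(self.gate(x), dim=1)                    # (N, B, H, W)
        out = torch.stack([b(x) for b in self.branches], dim=1)   # (N, B, C, H, W)
        return (w.unsqueeze(2) * out).sum(dim=1)                  # per-pixel mixture

x = torch.randn(1, 16, 64, 64)
y = AdaptiveRFConv(16)(x)  # same shape; receptive field varies per pixel
```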
Thus, we propose a hand segmentation method for hand-object interaction that uses only a depth map.
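To make concrete what depth-only input provides before any learning, the following is a naive baseline, explicitly not the paper's method: it keeps pixels within a depth band of the closest surface, which in egocentric views is often the hand.

```python
# Not the paper's method: a naive depth-only baseline that keeps pixels
# within `band` metres of the closest valid surface. The band width and
# the metres-with-zero-invalid depth convention are assumptions.
import numpy as np

def nearest_band_mask(depth, band=0.15):
    # depth: (H, W) array in metres, 0 = invalid. Returns a bool mask.
    valid = depth > 0
    near = depth[valid].min()             # closest surface, often the hand
    return valid & (depth < near + band)  # keep points within the band

depth = np.random.uniform(0.3, 2.0, size=(240, 320))  # synthetic frame
mask = nearest_band_mask(depth)
```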
Hand segmentation and fingertip detection play an indispensable role in hand gesture-based human-machine interaction systems.
We propose an automatic method for generating high-quality annotations for depth-based hand segmentation, and introduce a large-scale hand segmentation dataset.