A Study of Convolutional Architectures for Handshape Recognition applied to Sign Language

Convolutional Neural Networks (CNNs) have delivered performance gains in many areas in recent years, but their effectiveness for handshape recognition in the context of Sign Language Recognition has not been thoroughly studied. We evaluated several convolutional architectures to determine their suitability for this problem. Using the LSA16 and RWTH-PHOENIX-Weather handshape datasets, we ran experiments with the LeNet, VGG16, ResNet-34, and All Convolutional architectures, as well as Inception trained both from scratch and via transfer learning, and compared them to the state of the art on these datasets. We included a feedforward neural network as a baseline, and also explored various preprocessing schemes to analyze their impact on recognition. We found that while all models perform reasonably well on both datasets (with accuracy comparable to hand-engineered methods), VGG16 produced the best results, closely followed by the traditional LeNet architecture. In addition, pre-segmenting the hands from the background substantially improved accuracy.

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Hand Gesture Recognition | LSA16 | VGG16 | Accuracy | 95.92 | #2 |
| Hand Gesture Recognition | RWTH-PHOENIX Handshapes (dev set) | VGG16 | Accuracy | 82.88 | #2 |
