Sign Language Translation
10 papers with code • 2 benchmarks • 7 datasets
Given a video containing sign language, the task is to predict the translation into (written) spoken language.
Most implemented papers
Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison
Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performance in large-scale scenarios.
TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation
Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.
Neural Sign Language Translation
SLR seeks to recognize a sequence of continuous signs but neglects the underlying rich grammatical and linguistic structures of sign language that differ from spoken language.
Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation
We report state-of-the-art sign language recognition and translation results achieved by our Sign Language Transformers.
Better Sign Language Translation with STMC-Transformer
This contradicts previous claims that GT gloss translation acts as an upper bound for SLT performance and reveals that glosses are an inefficient representation of sign language.
ASL Recognition with Metric-Learning based Lightweight Network
In the past decades, the set of human tasks solved by machines has expanded dramatically.
Frozen Pretrained Transformers for Neural Sign Language Translation
Our results show that pretrained language models can be used to improve sign language translation performance and that the self-attention patterns in BERT transfer in zero-shot to the encoder and decoder of sign language translation models.
Stochastic Transformer Networks with Linear Competing Units: Application to end-to-end SL Translation
In this paper, we attenuate this need by introducing an end-to-end SLT model that does not entail explicit use of glosses; the model only needs text ground truth.
Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation
Sign language recognition and translation first uses a recognition module to generate glosses from sign language videos and then employs a translation module to translate glosses into spoken sentences.