Sign Language Translation

38 papers with code • 6 benchmarks • 14 datasets

Given a video containing sign language, the task is to predict its translation into the written form of a spoken language.

Most implemented papers

A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation

FangyunWei/SLRT CVPR 2022

Concretely, we pretrain the sign-to-gloss visual network on the general domain of human actions and the within-domain of a sign-to-gloss dataset, and pretrain the gloss-to-text translation network on the general domain of a multilingual corpus and the within-domain of a gloss-to-text corpus.

Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison

dxli94/WLASL 24 Oct 2019

Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performances in large scale scenarios.

TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

verashira/TSPNet NeurIPS 2020

Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.

Neural Sign Language Translation

neccam/nslt CVPR 2018

Sign Language Recognition (SLR) seeks to recognize a sequence of continuous signs but neglects the rich underlying grammatical and linguistic structures of sign language that differ from spoken language.

Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation

neccam/slt CVPR 2020

We report state-of-the-art sign language recognition and translation results achieved by our Sign Language Transformers.

Better Sign Language Translation with STMC-Transformer

kayoyin/transformer-slt COLING 2020

This contradicts previous claims that ground-truth (GT) gloss translation acts as an upper bound for SLT performance and reveals that glosses are an inefficient representation of sign language.

ASL Recognition with Metric-Learning based Lightweight Network

openvinotoolkit/training_extensions 2020

Over the past decades, the set of human tasks solved by machines has expanded dramatically.

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

how2sign/ CVPR 2021

Towards this end, we introduce How2Sign, a multimodal and multiview continuous American Sign Language (ASL) dataset, consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth.

Frozen Pretrained Transformers for Neural Sign Language Translation

m-decoster/fpt4slt International Workshop on Automatic Translation for Signed and Spoken Languages (AT4SSL) 2021

Our results show that pretrained language models can be used to improve sign language translation performance and that the self-attention patterns in BERT transfer zero-shot to the encoder and decoder of sign language translation models.

Stochastic Transformer Networks with Linear Competing Units: Application to end-to-end SL Translation

avoskou/Stochastic-Transformer-Networks-with-Linear-Competing-Units-Application-to-end-to-end-SL-Translatio ICCV 2021

In this paper, we attenuate this need by introducing an end-to-end SLT model that does not entail explicit use of glosses; the model only needs text ground truth.