Back-translation augments parallel data by translating monolingual sentences on the target side into the source language.
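To make the augmentation step concrete, here is a minimal sketch; the function name and the `translate_tgt_to_src` callable are illustrative placeholders (e.g. a trained reverse NMT model), not the system's actual code:

```python
from typing import Callable, List, Tuple

def back_translate(
    mono_tgt: List[str],
    translate_tgt_to_src: Callable[[List[str]], List[str]],
) -> List[Tuple[str, str]]:
    """Create synthetic parallel pairs from target-side monolingual text.

    The synthetic source sentences produced by the reverse (target->source)
    model are paired with the original target sentences.
    """
    synthetic_src = translate_tgt_to_src(mono_tgt)
    return list(zip(synthetic_src, mono_tgt))

# Usage: the synthetic pairs are simply appended to the real parallel corpus
# before training the source->target model.
# parallel_data += back_translate(monolingual_tgt_corpus, reverse_model_fn)
```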
We use a state-of-the-art face reenactment network to detect key points in the non-pivot frames and transmit them to the receiver.
The identity-aware generator takes the source image and the warped motion features as input to generate a high-quality output with fine-grained details.
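As a rough illustration of the dataflow described above, the receiver reconstructs each non-pivot frame from the pivot source image and the transmitted keypoints; every function below is a hypothetical placeholder standing in for the trained keypoint detector, motion warper, and identity-aware generator:

```python
import numpy as np

def detect_keypoints(frame: np.ndarray) -> np.ndarray:
    # Placeholder: a real detector returns (K, 2) facial keypoints per frame.
    return np.zeros((10, 2), dtype=np.float32)

def warp_motion_features(src_kp: np.ndarray, drv_kp: np.ndarray) -> np.ndarray:
    # Placeholder: a toy motion representation from keypoint displacement.
    return drv_kp - src_kp

def identity_aware_generator(source_image: np.ndarray,
                             motion: np.ndarray) -> np.ndarray:
    # Placeholder: the real generator synthesizes the frame from the source
    # image and the warped motion features.
    return source_image

def reconstruct_frame(source_image, source_kp, received_kp):
    # Sender transmits only the keypoints of non-pivot frames; the receiver
    # reconstructs the frame from the pivot image plus warped motion features.
    motion = warp_motion_features(source_kp, received_kp)
    return identity_aware_generator(source_image, motion)
```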
Tables in unstructured business documents are difficult to parse due to the high diversity of layouts, varying content alignment, and the presence of empty cells.
Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment who rely extensively on lip movements to communicate.
We propose a novel adaptation strategy in which we iteratively prune the redundant parameters of an MNMT model and retrain it, improving bilingual representations while retaining its multilinguality.
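A minimal sketch of such an iterative prune-and-retrain loop, assuming standard magnitude pruning from `torch.nn.utils.prune` and a caller-supplied fine-tuning routine (both are illustrative choices, not necessarily the exact procedure used here):

```python
import torch
import torch.nn.utils.prune as prune

def iterative_prune_and_retrain(model, finetune_fn, rounds=3, amount=0.1):
    """Iteratively remove low-magnitude weights and fine-tune the survivors.

    `model` is any torch.nn.Module (e.g. a multilingual NMT transformer) and
    `finetune_fn` retrains the remaining parameters on the bilingual data of
    interest; both names are placeholders, not the paper's training code.
    """
    for _ in range(rounds):
        for module in model.modules():
            if isinstance(module, torch.nn.Linear):
                # L1 magnitude pruning zeroes the lowest `amount` fraction
                # of weights in each linear layer.
                prune.l1_unstructured(module, name="weight", amount=amount)
        finetune_fn(model)  # retrain with the pruning masks held fixed
    return model
```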
Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos.
As Deep Neural Network models for face processing tasks approach human-like performance, their deployment in critical applications such as law enforcement and access control, where any failure may have far-reaching consequences, has seen an upswing.
In this paper, we address the task of improving pairwise machine translation for specific low-resource Indian languages.