Search Results for author: C V Jawahar

Found 10 papers, 5 papers with code

IndicSTR12: A Dataset for Indic Scene Text Recognition

no code implementations • 12 Mar 2024 • Harsh Lunia, Ajoy Mondal, C V Jawahar

Several benchmark datasets and substantial work on deep learning models are available for Latin languages to meet this need.

Benchmarking Scene Text Recognition +1

Compressing Video Calls using Synthetic Talking Heads

1 code implementation • 7 Oct 2022 • Madhav Agarwal, Anchit Gupta, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C V Jawahar

We use a state-of-the-art face reenactment network to detect key points in the non-pivot frames and transmit them to the receiver.

Face Reenactment Talking Head Generation +1

Audio-Visual Face Reenactment

1 code implementation • 6 Oct 2022 • Madhav Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

The identity-aware generator takes the source image and the warped motion features as input to generate a high-quality output with fine-grained details.

Face Reenactment

Visual Understanding of Complex Table Structures from Document Images

no code implementations • 13 Nov 2021 • Sachin Raja, Ajoy Mondal, C V Jawahar

Tables in unstructured business documents are tough to parse due to the high diversity of layouts, varying alignments of contents, and the presence of empty cells.

Novel Object Detection object-detection +1

Personalized One-Shot Lipreading for an ALS Patient

no code implementations • 2 Nov 2021 • Bipasha Sen, Aditya Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment relying extensively on lip movements to communicate.

Domain Adaptation Lipreading

More Parameters? No Thanks!

1 code implementation • Findings (ACL) 2021 • Zeeshan Khan, Kartheek Akella, Vinay P. Namboodiri, C V Jawahar

We propose a novel adaptation strategy, where we iteratively prune and retrain the redundant parameters of an MNMT to improve bilingual representations while retaining the multilinguality.

Learning Language specific models Machine Translation +1

Towards Automatic Speech to Sign Language Generation

1 code implementation • 24 Jun 2021 • Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B Hegde, Vinay Namboodiri, C V Jawahar

Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos.

Text Generation

Canonical Saliency Maps: Decoding Deep Face Models

1 code implementation • 4 May 2021 • Thrupthi Ann John, Vineeth N Balasubramanian, C V Jawahar

As Deep Neural Network models for face processing tasks approach human-like performance, their deployment in critical applications such as law enforcement and access control has seen an upswing, where any failure may have far-reaching consequences.

Face Model Object Recognition
