Search Results for author: Pankaj Wasnik

Found 6 papers, 0 papers with code

Efficient infusion of self-supervised representations in Automatic Speech Recognition

no code implementations • 19 Apr 2024 • Darshan Prabhu, Sai Ganesh Mirishkar, Pankaj Wasnik

Self-supervised learned (SSL) models such as Wav2vec and HuBERT yield state-of-the-art results on speech-related tasks.

Isometric Neural Machine Translation using Phoneme Count Ratio Reward-based Reinforcement Learning

no code implementations • 20 Mar 2024 • Shivam Ratnakant Mhaskar, Nirmesh J. Shah, Mohammadi Zaki, Ashishkumar P. Gudmalwar, Pankaj Wasnik, Rajiv Ratn Shah

In this paper, we present the development of an isometric NMT system using Reinforcement Learning (RL), with a focus on optimizing the alignment of phoneme counts in the source and target language sentence pairs.

Automatic Speech Recognition (ASR) +6

Fiducial Focus Augmentation for Facial Landmark Detection

no code implementations • 23 Feb 2024 • Purbayan Kar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik, Vineeth Balasubramanian

To effectively utilize the newly proposed augmentation technique, we employ a Siamese architecture-based training mechanism with a Deep Canonical Correlation Analysis (DCCA)-based loss to achieve collective learning of high-level feature representations from two different views of the input images.

Ranked #1 on Facial Landmark Detection on WFLW

Face Alignment, Facial Landmark Detection +1

Revisiting Class Imbalance for End-to-end Semi-Supervised Object Detection

no code implementations • 4 Jun 2023 • Purbayan Kar, Vishal Chudasama, Naoyuki Onoe, Pankaj Wasnik

Semi-supervised object detection (SSOD) has made significant progress with the development of pseudo-label-based end-to-end methods.

Ranked #6 on Semi-Supervised Object Detection on COCO 10% labeled data

Object Detection +2

M2FNet: Multi-modal Fusion Network for Emotion Recognition in Conversation

no code implementations • 5 Jun 2022 • Vishal Chudasama, Purbayan Kar, Ashish Gudmalwar, Nirmesh Shah, Pankaj Wasnik, Naoyuki Onoe

We introduce a new feature extractor to extract latent features from the audio and visual modality.

Ranked #10 on Emotion Recognition in Conversation on MELD

Emotion Recognition in Conversation

Smartphone Multi-modal Biometric Authentication: Database and Evaluation

no code implementations • 5 Dec 2019 • Raghavendra Ramachandra, Martin Stokkenes, Amir Mohammadi, Sushma Venkatesh, Kiran Raja, Pankaj Wasnik, Eric Poiret, Sébastien Marcel, Christoph Busch

One of the unique features of this dataset is that it is collected in four different geographic locations representing a diverse population and ethnicity.
