Search Results for author: Saurabh Sahu

Found 9 papers, 0 papers with code

Leveraging Local Temporal Information for Multimodal Scene Classification

no code implementations • 26 Oct 2021 • Saurabh Sahu, Palash Goyal

In this paper, we propose a novel self-attention block that leverages both local and global temporal relationships between the video frames to obtain better contextualized representations for the individual frames.

Classification Scene Classification +1

Can't Fool Me: Adversarially Robust Transformer for Video Understanding

no code implementations • 26 Oct 2021 • Divya Choudhary, Palash Goyal, Saurabh Sahu

To address this, several techniques have been proposed to increase robustness of a model for image classification tasks.

Image Classification Video Understanding

Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training

no code implementations • 18 Mar 2021 • Saurabh Sahu, Palash Goyal

GAT uses a multi-level attention gate to model the relevance of a frame based on local and global contexts.

Video Understanding

Cross-modal Learning for Multi-modal Video Categorization

no code implementations • 7 Mar 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee

Multi-modal machine learning (ML) models can process data in multiple modalities (e.g., video, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding, activity recognition).

Activity Recognition Object Detection +2

Exploiting Temporal Coherence for Multi-modal Video Categorization

no code implementations • 7 Feb 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee

Multimodal ML models can process data in multiple modalities (e.g., video, images, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding).

Object Detection +1

Modeling Feature Representations for Affective Speech using Generative Adversarial Networks

no code implementations • 31 Oct 2019 • Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson

In this work, we experiment with variants of GAN architectures to generate feature vectors corresponding to an emotion in two ways: (i) A generator is trained with samples from a mixture prior.

Cross-corpus Emotion Recognition +1

On Enhancing Speech Emotion Recognition using Generative Adversarial Networks

no code implementations • 18 Jun 2018 • Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson

GANs consist of a discriminator and a generator that work in tandem, playing a min-max game to learn a target underlying data distribution when fed with data points sampled from a simpler distribution (such as a uniform or Gaussian distribution).

Cross-corpus Speech Emotion Recognition

Semi-supervised and Transfer learning approaches for low resource sentiment classification

no code implementations • 7 Jun 2018 • Rahul Gupta, Saurabh Sahu, Carol Espy-Wilson, Shrikanth Narayanan

Sentiment classification involves quantifying the affective reaction of a human to a document, media item or an event.

Classification General Classification +3

Adversarial Auto-encoders for Speech Based Emotion Recognition

no code implementations • 6 Jun 2018 • Saurabh Sahu, Rahul Gupta, Ganesh Sivaraman, Wael Abd-Almageed, Carol Espy-Wilson

Recently, generative adversarial networks and adversarial autoencoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition.

Emotion Recognition Face Recognition
