Search Results for author: Saurabh Sahu

Found 9 papers, 0 papers with code

Leveraging Local Temporal Information for Multimodal Scene Classification

no code implementations • 26 Oct 2021 • Saurabh Sahu, Palash Goyal

In this paper, we propose a novel self-attention block that leverages both local and global temporal relationships between the video frames to obtain better contextualized representations for the individual frames.

Classification Scene Classification +1

Can't Fool Me: Adversarially Robust Transformer for Video Understanding

no code implementations • 26 Oct 2021 • Divya Choudhary, Palash Goyal, Saurabh Sahu

To address this, several techniques have been proposed to increase the robustness of a model for image classification tasks.

Image Classification Video Understanding

Enhancing Transformer for Video Understanding Using Gated Multi-Level Attention and Temporal Adversarial Training

no code implementations • 18 Mar 2021 • Saurabh Sahu, Palash Goyal

The Gated Adversarial Transformer (GAT) uses a multi-level attention gate to model the relevance of a frame based on local and global contexts.

Video Understanding

Cross-modal Learning for Multi-modal Video Categorization

no code implementations • 7 Mar 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee

Multi-modal machine learning (ML) models can process data in multiple modalities (e.g., video, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding, activity recognition).

Activity Recognition Object Detection +2

Exploiting Temporal Coherence for Multi-modal Video Categorization

no code implementations • 7 Feb 2020 • Palash Goyal, Saurabh Sahu, Shalini Ghosh, Chul Lee

Multimodal ML models can process data in multiple modalities (e.g., video, images, audio, text) and are useful for video content analysis in a variety of problems (e.g., object detection, scene understanding).

Object Detection +1

Modeling Feature Representations for Affective Speech using Generative Adversarial Networks

no code implementations • 31 Oct 2019 • Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson

In this work, we experiment with variants of GAN architectures to generate feature vectors corresponding to an emotion in two ways: (i) A generator is trained with samples from a mixture prior.

Cross-corpus Emotion Recognition +1

On Enhancing Speech Emotion Recognition using Generative Adversarial Networks

no code implementations • 18 Jun 2018 • Saurabh Sahu, Rahul Gupta, Carol Espy-Wilson

GANs consist of a discriminator and a generator that work in tandem, playing a min-max game to learn a target underlying data distribution when fed with data points sampled from a simpler distribution (such as a uniform or Gaussian distribution).

Cross-corpus Speech Emotion Recognition

Adversarial Auto-encoders for Speech Based Emotion Recognition

no code implementations • 6 Jun 2018 • Saurabh Sahu, Rahul Gupta, Ganesh Sivaraman, Wael Abd-Almageed, Carol Espy-Wilson

Recently, generative adversarial networks and adversarial autoencoders have gained a lot of attention in the machine learning community due to their exceptional performance in tasks such as digit classification and face recognition.

Emotion Recognition Face Recognition
