Search Results for author: Moitreya Chatterjee

Found 13 papers, 1 paper with code

Tensor Factorization for Leveraging Cross-Modal Knowledge in Data-Constrained Infrared Object Detection

no code implementations28 Sep 2023 Manish Sharma, Moitreya Chatterjee, Kuan-Chuan Peng, Suhas Lohit, Michael Jones

We first pretrain these factor matrices on the RGB modality, for which plenty of training data are assumed to exist, and then augment only a few trainable parameters for training on the IR modality to avoid over-fitting, while encouraging them to capture complementary cues from those trained only on the RGB modality.

Object Detection +1
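The abstract above describes pretraining factor matrices on RGB data and then training only a small number of added parameters on the IR modality. Below is a minimal PyTorch sketch of that general idea, assuming a low-rank factorization of a convolution's weight with frozen RGB-pretrained factors and a small trainable core; the module names, shapes, and exact parameterization are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

class FactorizedConv2d(nn.Module):
    """Conv layer whose weight is built from low-rank factor matrices.

    Illustrative sketch: the factors (U, V) stand in for the parts pretrained
    on RGB and frozen; only a small per-modality core is trained on IR data.
    """
    def __init__(self, in_ch, out_ch, k=3, rank=8):
        super().__init__()
        # Factor matrices shared across modalities (pretrained on RGB, frozen).
        self.U = nn.Parameter(torch.randn(out_ch, rank), requires_grad=False)
        self.V = nn.Parameter(torch.randn(rank, in_ch * k * k), requires_grad=False)
        # Small modality-specific core: the only part updated when training on IR.
        self.core = nn.Parameter(torch.eye(rank))
        self.shape = (out_ch, in_ch, k, k)

    def forward(self, x):
        w = (self.U @ self.core @ self.V).reshape(self.shape)
        return nn.functional.conv2d(x, w, padding=self.shape[-1] // 2)

layer = FactorizedConv2d(in_ch=3, out_ch=16)
print(layer(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```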

CAVEN: An Embodied Conversational Agent for Efficient Audio-Visual Navigation in Noisy Environments

no code implementations6 Jun 2023 Xiulong Liu, Sudipta Paul, Moitreya Chatterjee, Anoop Cherian

Audio-visual navigation of an agent towards locating an audio goal is a challenging task, especially when the audio is sporadic or the environment is noisy.

Hierarchical Reinforcement Learning Navigate +5

Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation

no code implementations29 Oct 2022 Moitreya Chatterjee, Narendra Ahuja, Anoop Cherian

In this paper, we propose to use this connection between audio and visual dynamics for solving two challenging tasks simultaneously, namely: (i) separating audio sources from a mixture using visual cues, and (ii) predicting the 3D visual motion of a sounding source using its separated audio.

Audio Source Separation
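The entry above pairs visually conditioned source separation with 3D motion prediction from the separated audio. The toy PyTorch sketch below illustrates one way such a two-task model could be wired; the architecture, feature dimensions, and heads are hypothetical and not the paper's.

```python
import torch
import torch.nn as nn

class AudioVisualDynamicsSketch(nn.Module):
    """Toy two-task model: (i) visually conditioned source separation,
    (ii) 3D motion prediction from the separated audio."""
    def __init__(self, freq_bins=256, vis_dim=128, hid=256):
        super().__init__()
        self.audio_enc = nn.Linear(freq_bins, hid)
        self.vis_enc = nn.Linear(vis_dim, hid)
        self.mask_head = nn.Sequential(nn.Linear(2 * hid, freq_bins), nn.Sigmoid())
        self.motion_head = nn.Linear(freq_bins, 3)  # (dx, dy, dz) per frame

    def forward(self, mixture_spec, visual_feat):
        # mixture_spec: (B, T, F) spectrogram frames; visual_feat: (B, vis_dim)
        a = self.audio_enc(mixture_spec)                        # (B, T, hid)
        v = self.vis_enc(visual_feat).unsqueeze(1).expand_as(a)
        mask = self.mask_head(torch.cat([a, v], dim=-1))        # separation mask
        separated = mask * mixture_spec
        motion = self.motion_head(separated)                    # (B, T, 3)
        return separated, motion

model = AudioVisualDynamicsSketch()
sep, mot = model(torch.randn(2, 10, 256), torch.randn(2, 128))
print(sep.shape, mot.shape)  # torch.Size([2, 10, 256]) torch.Size([2, 10, 3])
```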

Visual Scene Graphs for Audio Source Separation

no code implementations ICCV 2021 Moitreya Chatterjee, Jonathan Le Roux, Narendra Ahuja, Anoop Cherian

At its core, AVSGS uses a recursive neural network that emits mutually-orthogonal sub-graph embeddings of the visual graph using multi-head attention.

AudioCaps Audio Source Separation
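The snippet above mentions a network that emits mutually orthogonal sub-graph embeddings of the visual graph using multi-head attention. The sketch below shows a generic way to pool node features into K embeddings with an orthogonality penalty, assuming standard PyTorch multi-head attention; it is not the AVSGS model itself.

```python
import torch
import torch.nn as nn

class SubgraphPooler(nn.Module):
    """Pools a visual graph's node features into K sub-graph embeddings via
    multi-head attention, with a penalty that pushes the embeddings towards
    mutual orthogonality."""
    def __init__(self, dim=128, num_subgraphs=4, heads=4):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_subgraphs, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, node_feats):
        # node_feats: (B, N, dim) node embeddings of the visual scene graph
        q = self.queries.unsqueeze(0).expand(node_feats.size(0), -1, -1)
        emb, _ = self.attn(q, node_feats, node_feats)   # (B, K, dim)
        e = nn.functional.normalize(emb, dim=-1)
        gram = e @ e.transpose(1, 2)                    # (B, K, K)
        eye = torch.eye(e.size(1), device=e.device)
        ortho_loss = ((gram - eye) ** 2).mean()         # orthogonality penalty
        return emb, ortho_loss

pooler = SubgraphPooler()
emb, loss = pooler(torch.randn(2, 12, 128))
print(emb.shape, loss.item())
```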

Learning to Generate Videos Using Neural Uncertainty Priors

no code implementations1 Jan 2021 Moitreya Chatterjee, Anoop Cherian, Narendra Ahuja

Predicting the future frames of a video is a challenging task, in part due to the underlying stochastic real-world phenomena.

Video Generation

Sound2Sight: Generating Visual Dynamics from Sound and Context

no code implementations ECCV 2020 Anoop Cherian, Moitreya Chatterjee, Narendra Ahuja

To tackle this problem, we present Sound2Sight, a deep variational framework, that is trained to learn a per frame stochastic prior conditioned on a joint embedding of audio and past frames.

Multimodal Reasoning
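The description above centers on a per-frame stochastic prior conditioned on a joint embedding of audio and past frames. The sketch below is a generic conditional-Gaussian prior of that flavour; the encoders, dimensions, and decoder are placeholders rather than the Sound2Sight architecture.

```python
import torch
import torch.nn as nn

class StochasticFramePrior(nn.Module):
    """Per-frame Gaussian prior conditioned on a joint audio + past-frame
    embedding; shapes and modules are illustrative placeholders."""
    def __init__(self, audio_dim=64, frame_dim=256, z_dim=32):
        super().__init__()
        self.joint = nn.Sequential(nn.Linear(audio_dim + frame_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, z_dim)
        self.to_logvar = nn.Linear(256, z_dim)
        self.decoder = nn.Linear(z_dim + frame_dim, frame_dim)  # next-frame features

    def forward(self, audio_feat, past_frame_feat):
        h = self.joint(torch.cat([audio_feat, past_frame_feat], dim=-1))
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized sample
        next_frame = self.decoder(torch.cat([z, past_frame_feat], dim=-1))
        return next_frame, mu, logvar

prior = StochasticFramePrior()
frame, mu, logvar = prior(torch.randn(2, 64), torch.randn(2, 256))
print(frame.shape)  # torch.Size([2, 256])
```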

Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformers

no code implementations8 Jul 2020 Shijie Geng, Peng Gao, Moitreya Chatterjee, Chiori Hori, Jonathan Le Roux, Yongfeng Zhang, Hongsheng Li, Anoop Cherian

Given an input video, its associated audio, and a brief caption, the audio-visual scene aware dialog (AVSD) task requires an agent to indulge in a question-answer dialog with a human about the audio-visual content.

Answer Generation Graph Representation Learning

Diverse and Coherent Paragraph Generation from Images

no code implementations ECCV 2018 Moitreya Chatterjee, Alexander G. Schwing

Paragraph generation from images, which has gained popularity recently, is an important task for video summarization, editing, and support of the disabled.

Image Captioning Image Paragraph Captioning +1

Coreset-Based Neural Network Compression

no code implementations ECCV 2018 Abhimanyu Dubey, Moitreya Chatterjee, Narendra Ahuja

We propose a novel Convolutional Neural Network (CNN) compression algorithm based on coreset representations of filters.

Neural Network Compression Quantization
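The abstract above compresses CNNs using coreset representations of filters. As a rough illustration of filter-level compression, the sketch below replaces a filter bank with a small codebook plus an index map via a k-means-style selection; the paper's actual coreset construction differs.

```python
import torch

def compress_filters(weight, k=8, iters=10):
    """Replace a conv layer's filters with k representative filters plus an
    index map (a crude, k-means-style stand-in for a coreset of filters).
    weight: (out_ch, in_ch, kh, kw) -> codebook (k, in_ch*kh*kw), assignment (out_ch,)
    """
    flat = weight.reshape(weight.size(0), -1)            # one row per filter
    codebook = flat[torch.randperm(flat.size(0))[:k]].clone()
    for _ in range(iters):                               # Lloyd iterations
        dists = torch.cdist(flat, codebook)              # (out_ch, k)
        assign = dists.argmin(dim=1)
        for j in range(k):                               # update centroids
            members = flat[assign == j]
            if len(members) > 0:
                codebook[j] = members.mean(dim=0)
    return codebook, assign

w = torch.randn(64, 16, 3, 3)
codebook, assign = compress_filters(w)
approx = codebook[assign].reshape_as(w)                  # reconstructed filter bank
print(codebook.shape, float((w - approx).pow(2).mean()))
```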

An Active Learning Based Approach For Effective Video Annotation And Retrieval

no code implementations27 Apr 2015 Moitreya Chatterjee, Anton Leuski

Conventional multimedia annotation/retrieval systems such as the Normalized Continuous Relevance Model (NormCRM) [16] require fully labeled training data for good performance.

Active Learning Clustering +1
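The entry above contrasts fully supervised annotation with an active learning approach that reduces labeling effort. The sketch below shows a generic uncertainty-sampling loop on stand-in features; the classifier and selection criterion are illustrative and unrelated to the NormCRM-based method in the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Generic uncertainty-sampling active learning loop (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))                  # stand-in video features
y = (X[:, 0] > 0).astype(int)                   # stand-in annotations

# Small initial labeled pool containing both classes.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
unlabeled = [i for i in range(len(X)) if i not in labeled]

for round_ in range(5):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[unlabeled])
    uncertainty = 1.0 - probs.max(axis=1)       # least-confident sampling
    query = [unlabeled[i] for i in np.argsort(-uncertainty)[:10]]
    labeled += query                            # "ask the annotator" for these
    unlabeled = [i for i in unlabeled if i not in query]
    print(f"round {round_}: {len(labeled)} labeled, acc={clf.score(X, y):.2f}")
```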
