Search Results for author: Sagnik Majumder

Found 14 papers, 5 papers with code

Unified Probabilistic Deep Continual Learning through Generative Replay and Open Set Recognition

3 code implementations • 28 May 2019 Martin Mundt, Iuliia Pliushch, Sagnik Majumder, Yongwon Hong, Visvanathan Ramesh

Modern deep neural networks are well known to be brittle in the face of unknown data instances, and recognizing such instances remains a challenge.

Audio Classification · Bayesian Inference · +3

Open Set Recognition Through Deep Neural Network Uncertainty: Does Out-of-Distribution Detection Require Generative Classifiers?

no code implementations • 26 Aug 2019 Martin Mundt, Iuliia Pliushch, Sagnik Majumder, Visvanathan Ramesh

We present an analysis of predictive-uncertainty-based out-of-distribution detection for different approaches to estimating a model's epistemic uncertainty, and contrast it with extreme-value-theory-based open set recognition.

Open Set Learning · Out-of-Distribution Detection
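
The entry above contrasts uncertainty-based out-of-distribution detection with extreme-value-theory-based open set recognition. As a loose illustration of the general idea (not this paper's exact protocol), the sketch below scores inputs by the predictive entropy of a Monte Carlo dropout ensemble, one common proxy for epistemic uncertainty; the model and threshold names are hypothetical.

```python
import torch
import torch.nn.functional as F

def mc_dropout_entropy(model, x, n_samples=20):
    """Predictive entropy under Monte Carlo dropout.

    Keeps dropout active at inference time, averages the softmax outputs
    of `n_samples` stochastic forward passes, and returns the entropy of
    that mean distribution (higher entropy suggests out-of-distribution data).
    """
    model.train()  # simplification: keeps dropout layers stochastic
    with torch.no_grad():
        probs = torch.stack(
            [F.softmax(model(x), dim=-1) for _ in range(n_samples)]
        ).mean(dim=0)
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

# Usage sketch: flag inputs whose entropy exceeds a threshold chosen on
# held-out in-distribution data.
# is_ood = mc_dropout_entropy(classifier, batch) > threshold
```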

Learning to Set Waypoints for Audio-Visual Navigation

1 code implementation • ICLR 2021 Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh Kumar Ramakrishnan, Kristen Grauman

In audio-visual navigation, an agent intelligently travels through a complex, unmapped 3D environment using both sights and sounds to find a sound source (e.g., a phone ringing in another room).

Visual Navigation

Model Agnostic Answer Reranking System for Adversarial Question Answering

no code implementations • EACL 2021 Sagnik Majumder, Chinmoy Samant, Greg Durrett

While numerous methods have been proposed as defenses against adversarial examples in question answering (QA), these techniques are often model specific, require retraining of the model, and give only marginal improvements in performance over vanilla models.

Question Answering

Move2Hear: Active Audio-Visual Source Separation

no code implementations • ICCV 2021 Sagnik Majumder, Ziad Al-Halah, Kristen Grauman

We introduce the active audio-visual source separation problem, where an agent must move intelligently in order to better isolate the sounds coming from an object of interest in its environment.

Audio Source Separation · Object

Active Audio-Visual Separation of Dynamic Sound Sources

1 code implementation • 2 Feb 2022 Sagnik Majumder, Kristen Grauman

We explore active audio-visual separation for dynamic sound sources, where an embodied agent moves intelligently in a 3D environment to continuously isolate the time-varying audio stream being emitted by an object of interest.

Few-Shot Audio-Visual Learning of Environment Acoustics

no code implementations • 8 Jun 2022 Sagnik Majumder, Changan Chen, Ziad Al-Halah, Kristen Grauman

Room impulse response (RIR) functions capture how the surrounding physical environment transforms the sounds heard by a listener, with implications for various applications in AR, VR, and robotics.

Audio-Visual Learning · Room Impulse Response (RIR)
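
The entry above notes that an RIR captures how the surrounding environment transforms a sound before it reaches the listener. A minimal sketch of that standard rendering model (separate from the paper's few-shot RIR estimation method): the received "wet" signal is the dry source signal convolved with the RIR measured at the listener's position.

```python
import numpy as np
from scipy.signal import fftconvolve

def apply_rir(dry_signal: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Render what a listener hears by convolving the anechoic (dry)
    source with the room impulse response, trimmed to the input length."""
    wet = fftconvolve(dry_signal, rir, mode="full")[: len(dry_signal)]
    # Peak-normalize to avoid clipping when the result is written to audio.
    peak = np.max(np.abs(wet))
    return wet / peak if peak > 0 else wet

# Usage sketch with a toy RIR (direct path plus one 50 ms echo), assuming
# a 16 kHz sample rate:
# fs = 16000
# rir = np.zeros(fs // 2); rir[0] = 1.0; rir[int(0.05 * fs)] = 0.4
# heard = apply_rir(dry_signal, rir)
```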

Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations

no code implementations CVPR 2023 Sagnik Majumder, Hao Jiang, Pierre Moulon, Ethan Henderson, Paul Calamia, Kristen Grauman, Vamsi Krishna Ithapu

Can conversational videos captured from multiple egocentric viewpoints reveal the map of a scene in a cost-efficient way?

Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos

no code implementations • 10 Jul 2023 Sagnik Majumder, Ziad Al-Halah, Kristen Grauman

We propose a self-supervised method for learning representations based on spatial audio-visual correspondences in egocentric videos.

Audio Denoising · Denoising
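
The entry above describes self-supervised learning from spatial audio-visual correspondence. As a generic sketch of correspondence-style self-supervision (not this paper's architecture), one can train audio and visual encoders jointly with a binary head that predicts whether an audio clip and a set of video frames come from the same moment; all module and parameter names below are hypothetical.

```python
import torch
import torch.nn as nn

class CorrespondenceHead(nn.Module):
    """Binary classifier over paired audio/visual embeddings: predicts
    whether the audio and the frames were sampled from the same clip."""
    def __init__(self, audio_encoder: nn.Module, visual_encoder: nn.Module, dim: int = 512):
        super().__init__()
        self.audio_encoder = audio_encoder
        self.visual_encoder = visual_encoder
        self.classifier = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, audio, frames):
        a = self.audio_encoder(audio)    # (B, dim) audio embedding
        v = self.visual_encoder(frames)  # (B, dim) visual embedding
        return self.classifier(torch.cat([a, v], dim=-1)).squeeze(-1)

# Training sketch: positives are temporally aligned pairs, negatives pair
# audio with frames from a different clip; optimize with binary cross-entropy.
# loss = nn.functional.binary_cross_entropy_with_logits(logits, labels)
```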

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

no code implementations • 30 Nov 2023 Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray

We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.

Video Understanding