no code implementations • 4 Sep 2024 • Owais Iqbal, Omprakash Chakraborty, Aftab Hussain, Rameswar Panda, Abir Das
Specifically, we use a 2D image transformer to generate representations and apply a contrastive loss function to minimize the similarity between representations from different videos while maximizing the similarity between representations of the same video.
no code implementations • 13 May 2024 • Kaushik Dey, Satheesh K. Perepu, Abir Das, Pallab Dasgupta
In recent years, there has been some work on AI-based IMFs that can handle conflicting intents and prioritize the global objective based on an a priori definition of the utility function and the priorities accorded to competing intents.
no code implementations • CVPR 2024 • Anurag Roy, Riddhiman Moulick, Vinay K. Verma, Saptarshi Ghosh, Abir Das
Continual Learning (CL) enables machine learning models to learn from continuously shifting new training data in the absence of data from old tasks.
no code implementations • 26 Oct 2023 • Kaushik Dey, Satheesh K. Perepu, Abir Das
Often there exists a hierarchical structure of intent fulfilment where multiple pre-trained, self-interested agents may need to be further orchestrated by a supervisor or controller agent.
no code implementations • ICCV 2023 • Anurag Roy, Vinay Kumar Verma, Sravan Voonna, Kripabandhu Ghosh, Saptarshi Ghosh, Abir Das
Although there have been some recent CL approaches for vision transformers, they either store training instances of previous tasks or require a task identifier during test time, which can be limiting.
no code implementations • 2 Mar 2023 • Kaushik Dey, Satheesh K. Perepu, Pallab Dasgupta, Abir Das
The dynamic and evolutionary nature of service requirements in wireless networks has motivated the telecom industry to consider intelligent self-adapting Reinforcement Learning (RL) agents for controlling the growing portfolio of network services.
no code implementations • 13 Oct 2022 • Anurag Roy, David Johnson Ekka, Saptarshi Ghosh, Abir Das
In this paper, we propose a new and challenging Few-Shot Visual Question Generation (FS-VQG) task and provide a comprehensive benchmark for it.
no code implementations • 26 Nov 2021 • Siddhant Agarwal, Owais Iqbal, Sree Aditya Buridi, Madda Manjusha, Abir Das
Black-box methods for generating saliency maps are particularly interesting because they do not use the internals of the model to explain its decision.
no code implementations • NeurIPS 2021 • Aadarsh Sahoo, Rutav Shah, Rameswar Panda, Kate Saenko, Abir Das
Unsupervised domain adaptation, which aims to adapt models trained on a labeled source domain to a completely unlabeled target domain, has attracted much attention in recent years.
Ranked #2 on Unsupervised Domain Adaptation on UCF-HMDB
1 code implementation • CVPR 2021 • Ankit Singh, Omprakash Chakraborty, Ashutosh Varshney, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das
We approach this problem by learning a two-pathway temporal contrastive model using unlabeled videos at two different speeds, leveraging the fact that changing video speed does not change an action.
no code implementations • 6 Dec 2020 • Aadarsh Sahoo, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das
Partial domain adaptation, which assumes that the unknown target label space is a subset of the source label space, has attracted much attention in computer vision.
Ranked #1 on Partial Domain Adaptation on Office-31
1 code implementation • 12 Aug 2020 • Aadarsh Sahoo, Ankit Singh, Rameswar Panda, Rogerio Feris, Abir Das
In this work, we address these questions from the perspective of dataset imbalance resulting from severe under-representation of annotated training data for certain classes, and its effect on both deep classification and generation methods.
no code implementations • 31 Mar 2020 • Huijuan Xu, Ximeng Sun, Eric Tzeng, Abir Das, Kate Saenko, Trevor Darrell
In this paper, we present a conceptually simple, general, yet novel framework for few-shot temporal activity detection based on proposal regression, which detects the start and end times of activities in untrimmed videos.
no code implementations • 5 Jun 2019 • Huijuan Xu, Abir Das, Kate Saenko
We address the problem of temporal activity detection in continuous, untrimmed video streams.
Ranked #4 on Action Recognition on THUMOS’14
12 code implementations • 19 Jun 2018 • Vitali Petsiuk, Abir Das, Kate Saenko
We compare our approach to state-of-the-art importance extraction methods using both an automatic deletion/insertion metric and a pointing metric based on human-annotated object segments.
Explainable Artificial Intelligence (XAI), Feature Importance, +5 more
no code implementations • ICCV 2017 • Rameswar Panda, Abir Das, Ziyan Wu, Jan Ernst, Amit K. Roy-Chowdhury
Casting the problem as a weakly supervised learning problem, we propose a flexible deep 3D CNN architecture to learn the notion of importance using only video-level annotation, and without any human-crafted training data.
3 code implementations • ICCV 2017 • Huijuan Xu, Abir Das, Kate Saenko
We address the problem of activity detection in continuous, untrimmed video streams.
Ranked #1 on Action Recognition In Videos on THUMOS’14
6 code implementations • CVPR 2017 • Vasili Ramanishka, Abir Das, Jianming Zhang, Kate Saenko
Neural image/video captioning models can generate accurate descriptions, but their internal process of mapping regions to words is a black box and therefore difficult to explain.
no code implementations • 1 Aug 2016 • Rameswar Panda, Abir Das, Amit K. Roy-Chowdhury
While most existing video summarization approaches aim to extract an informative summary of a single video, we propose a novel framework for summarizing multi-view videos by exploiting both intra- and inter-view content correlations in a joint embedding space.
no code implementations • 25 Jul 2016 • Niki Martinel, Abir Das, Christian Micheloni, Amit K. Roy-Chowdhury
Person re-identification is an open and challenging problem in computer vision.
no code implementations • 1 Jul 2016 • Abir Das, Rameswar Panda, Amit K. Roy-Chowdhury
We demonstrate the effectiveness of our approach on multi-camera person re-identification datasets, showing the feasibility of learning online classification models in multi-camera big data applications.