Search Results for author: Mausoom Sarkar

Found 20 papers, 6 papers with code

LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training

1 code implementation 22 Aug 2023 Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Balaji Krishnamurthy

We demonstrate the effectiveness of our approach, named LOCATE, on multiple standard video object segmentation, image saliency detection, and object segmentation benchmarks, achieving results on par with, and in many cases surpassing, state-of-the-art methods.

Object, Object Discovery +5

FODVid: Flow-guided Object Discovery in Videos

no code implementations 10 Jul 2023 Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Rishabh Jain, Mayur Hemani, Balaji Krishnamurthy

Segmentation of objects in a video is challenging due to nuances such as motion blur, parallax, occlusions, and changes in illumination.

Object, Object Discovery +5

SARC: Soft Actor Retrospective Critic

1 code implementation 28 Jun 2023 Sukriti Verma, Ayush Chopra, Jayakumar Subramanian, Mausoom Sarkar, Nikaash Puri, Piyush Gupta, Balaji Krishnamurthy

The two-time scale nature of SAC, which is an actor-critic algorithm, is characterised by the fact that the critic estimate has not converged for the actor at any given time, but since the critic learns faster than the actor, it ensures eventual consistency between the two.

Towards Estimating Transferability using Hard Subsets

no code implementations 17 Jan 2023 Tarun Ram Menta, Surgan Jandial, Akash Patil, Vimal KB, Saketh Bachu, Balaji Krishnamurthy, Vineeth N. Balasubramanian, Chirag Agarwal, Mausoom Sarkar

As transfer learning techniques are increasingly used to transfer knowledge from a source model to a target task, it becomes important to quantify which source models are suitable for a given target task without performing computationally expensive fine-tuning.

Transfer Learning

One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text

no code implementations 12 Sep 2022 Abhinav Java, Shripad Deshmukh, Milan Aggarwal, Surgan Jandial, Mausoom Sarkar, Balaji Krishnamurthy

MONOMER fuses context from the visual, textual, and spatial modalities of snippets and documents to find the query snippet in target documents.

document understanding, object-detection +3

Video2Skill: Adapting Events in Demonstration Videos to Skills in an Environment using Cyclic MDP Homomorphisms

no code implementations 8 Sep 2021 Sumedh A Sontakke, Sumegh Roychowdhury, Mausoom Sarkar, Nikaash Puri, Balaji Krishnamurthy, Laurent Itti

Humans excel at learning long-horizon tasks from demonstrations augmented with textual commentary, as evidenced by the burgeoning popularity of tutorial videos online.

Decision Making

Form2Seq: A Framework for Higher-Order Form Structure Extraction

1 code implementation EMNLP 2020 Milan Aggarwal, Hiresh Gupta, Mausoom Sarkar, Balaji Krishnamurthy

To mitigate this, we propose Form2Seq, a novel sequence-to-sequence (Seq2Seq) inspired framework for structure extraction using text, with a specific focus on forms, which leverages relative spatial arrangement of structures.

Semantic Segmentation

Multi-Modal Association based Grouping for Form Structure Extraction

1 code implementation 9 Jul 2021 Milan Aggarwal, Mausoom Sarkar, Hiresh Gupta, Balaji Krishnamurthy

Experimental results show the effectiveness of our approach, achieving a recall of 90.29%, 73.80%, 83.12%, and 52.72% for the above structures, respectively, outperforming semantic segmentation baselines significantly.

Semantic Segmentation

Retrospective Loss: Looking Back to Improve Training of Deep Neural Networks

no code implementations 24 Jun 2020 Surgan Jandial, Ayush Chopra, Mausoom Sarkar, Piyush Gupta, Balaji Krishnamurthy, Vineeth Balasubramanian

Deep neural networks (DNNs) are powerful learning machines that have enabled breakthroughs in several domains.

Document Structure Extraction using Prior based High Resolution Hierarchical Semantic Segmentation

no code implementations ECCV 2020 Mausoom Sarkar, Milan Aggarwal, Arneh Jain, Hiresh Gupta, Balaji Krishnamurthy

We introduce our new human-annotated forms dataset and show that our method significantly outperforms different segmentation baselines on this dataset in extracting hierarchical structures.

Segmentation, Semantic Segmentation +2

Attention Based Natural Language Grounding by Navigating Virtual Environment

1 code implementation 23 Apr 2018 Akilesh B, Abhishek Sinha, Mausoom Sarkar, Balaji Krishnamurthy

We develop an attention mechanism for multi-modal fusion of visual and textual modalities that allows the agent to learn to complete the task and achieve language grounding.

Navigate, Zero-shot Generalization

Learning to navigate by distilling visual information and natural language instructions

no code implementations ICLR 2018 Abhishek Sinha, Akilesh B, Mausoom Sarkar, Balaji Krishnamurthy

In this work, we focus on the problem of grounding language by training an agent to follow a set of natural language instructions and navigate to a target object in a 2D grid environment.

Navigate, Zero-shot Generalization

Introspection: Accelerating Neural Network Training By Learning Weight Evolution

no code implementations 17 Apr 2017 Abhishek Sinha, Mausoom Sarkar, Aahitagni Mukherjee, Balaji Krishnamurthy

In this paper, we explore the idea of learning the weight evolution pattern from a simple network to accelerate the training of novel neural networks.

General Classification
