Search Results for author: Badri N. Patro

Found 18 papers, 9 papers with code

SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series

1 code implementation • 22 Mar 2024 • Badri N. Patro, Vijay S. Agneeswaran

Transformers have widely adopted attention networks for sequence mixing and MLPs for channel mixing, playing a pivotal role in achieving breakthroughs across domains.

Inductive Bias Time Series +1

Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator

2 code implementations • COLING 2018 • Badri N. Patro, Vinod K. Kurmi, Sandeep Kumar, Vinay P. Namboodiri

One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far.

Paraphrase Generation Sentence +3
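
The constraint in the abstract above (true paraphrase embeddings close, unrelated candidate embeddings far) can be illustrated with a generic margin-based hinge loss. This is only a minimal sketch under stated assumptions: the function names, the Euclidean distance, and the margin value are hypothetical choices for illustration and are not taken from the paper, whose sequential pair-wise discriminator may differ.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pairwise_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss (illustrative, not the paper's objective).

    Returns zero once the negative embedding is at least `margin`
    farther from the anchor than the positive embedding is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)

# Toy 2-D embeddings (made-up values for illustration only).
anchor = [1.0, 0.0]
paraphrase = [0.9, 0.1]   # close to the anchor
unrelated = [-1.0, 0.0]   # far from the anchor

loss_good = pairwise_margin_loss(anchor, paraphrase, unrelated)
loss_bad = pairwise_margin_loss(anchor, unrelated, paraphrase)
```

Here `loss_good` is zero because the constraint is already satisfied, while `loss_bad` is positive because the roles of the close and far sentences are swapped.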

Revisiting Paraphrase Question Generator using Pairwise Discriminator

1 code implementation • 31 Dec 2019 • Badri N. Patro, Dev Chauhan, Vinod K. Kurmi, Vinay P. Namboodiri

One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far.

Paraphrase Generation Sentence +3

SpectFormer: Frequency and Attention is what you need in a Vision Transformer

1 code implementation • 13 Apr 2023 • Badri N. Patro, Vinay P. Namboodiri, Vijay Srinivas Agneeswaran

Vision transformers have been applied successfully for image recognition tasks.

Instance Segmentation object-detection +3

Efficiency 360: Efficient Vision Transformers

1 code implementation • 16 Feb 2023 • Badri N. Patro, Vijay Srinivas Agneeswaran

Transformers are widely used for solving tasks in natural language processing, computer vision, speech, and music domains.

Continual Learning Fairness +1

Multimodal Differential Network for Visual Question Generation

1 code implementation • EMNLP 2018 • Badri N. Patro, Sandeep Kumar, Vinod K. Kurmi, Vinay P. Namboodiri

Generating natural questions from an image is a semantic task that requires using visual and language modality to learn multimodal representations.

Ranked #1 on Question Generation on COCO Visual Question Answering (VQA) real images 1.0 open ended

Natural Questions Question Generation +1

Robust Explanations for Visual Question Answering

1 code implementation • 23 Jan 2020 • Badri N. Patro, Shivansh Patel, Vinay P. Namboodiri

Our model explains the answers obtained through a VQA model by providing visual and textual explanations.

Question Answering Visual Question Answering

Barlow constrained optimization for Visual Question Answering

1 code implementation • 7 Mar 2022 • Abhishek Jha, Badri N. Patro, Luc van Gool, Tinne Tuytelaars

In this paper, we propose a novel regularization for VQA models, Constrained Optimization using Barlow's theory (COB), that improves the information content of the joint space by minimizing the redundancy.

Question Answering Visual Question Answering
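
As a rough illustration of what "minimizing the redundancy" of an embedding space can mean, the sketch below computes a Barlow Twins-style penalty: the sum of squared off-diagonal entries of the correlation matrix between feature dimensions. It is an assumption-laden toy, not the COB formulation; the function names and data are hypothetical, and the abstract does not specify how the constraint is actually imposed.

```python
import math

def standardize(columns):
    """Zero-mean, unit-variance version of each feature column."""
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        var = sum((x - mean) ** 2 for x in col) / len(col)
        std = math.sqrt(var) or 1.0  # guard against constant columns
        out.append([(x - mean) / std for x in col])
    return out

def redundancy_penalty(columns):
    """Sum of squared off-diagonal correlations between feature columns.

    `columns[i]` holds the values of feature dimension i across a batch.
    A value of zero means the dimensions are pairwise uncorrelated.
    """
    z = standardize(columns)
    n = len(z[0])
    penalty = 0.0
    for i in range(len(z)):
        for j in range(len(z)):
            if i != j:
                corr = sum(a * b for a, b in zip(z[i], z[j])) / n
                penalty += corr ** 2
    return penalty

# Toy batches (made-up values): one with duplicated information, one without.
redundant = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
decorrelated = [[1.0, -1.0, 1.0, -1.0], [1.0, 1.0, -1.0, -1.0]]

penalty_redundant = redundancy_penalty(redundant)
penalty_decorrelated = redundancy_penalty(decorrelated)
```

The second feature in `redundant` is a scaled copy of the first, so its penalty is high; the two features in `decorrelated` carry independent information, so the penalty vanishes.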

Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer

1 code implementation • 26 Mar 2024 • Badri N. Patro, Vinay P. Namboodiri, Vijay S. Agneeswaran

Transformers used in vision have been investigated through diverse architectures - ViT, PVT, and Swin.

Instance Segmentation Semantic Segmentation +1

U-CAM: Visual Explanation using Uncertainty based Class Activation Maps

no code implementations • ICCV 2019 • Badri N. Patro, Mayank Lunayach, Shivansh Patel, Vinay P. Namboodiri

These have two-fold benefits: a) improvement in obtaining the certainty estimates that correlate better with misclassified samples and b) improved attention maps that provide state-of-the-art results in terms of correlation with human attention regions.

Probabilistic Deep Learning Question Answering +1

Probabilistic framework for solving Visual Dialog

no code implementations • 11 Sep 2019 • Badri N. Patro, Anupriy, Vinay P. Namboodiri

In this paper, we propose a probabilistic framework for solving the task of 'Visual Dialog'.

Ranked #1 on Common Sense Reasoning on Visual Dialog v0.9

Common Sense Reasoning Visual Dialog

Dynamic Attention Networks for Task Oriented Grounding

no code implementations • 14 Oct 2019 • Soumik Dasgupta, Badri N. Patro, Vinay P. Namboodiri

In this work, we show that Dynamic Attention helps in achieving grounding and also aids in the policy learning objective.

Granular Multimodal Attention Networks for Visual Dialog

no code implementations • 13 Oct 2019 • Badri N. Patro, Shivansh Patel, Vinay P. Namboodiri

In this work, we propose a new method, Granular Multi-modal Attention, which addresses the question of the right granularity at which one needs to attend while solving the Visual Dialog task.

Visual Dialog

Explanation vs Attention: A Two-Player Game to Obtain Attention for VQA

no code implementations • 19 Nov 2019 • Badri N. Patro, Anupriy, Vinay P. Namboodiri

It also results in a good improvement in rank correlation metric on the VQA task.

Question Answering Visual Question Answering

Deep Exemplar Networks for VQA and VQG

no code implementations • 19 Dec 2019 • Badri N. Patro, Vinay P. Namboodiri

Specifically, we incorporate exemplar based approaches and show that an exemplar based module can be incorporated in almost any of the deep learning architectures proposed in the literature and the addition of such a block results in improved performance for solving these tasks.

Question Answering Question Generation +2