Search Results for author: Anand Mishra

Found 15 papers, 5 papers with code

Towards Scene-Text to Scene-Text Translation

no code implementations • 6 Aug 2023 • Onkar Susladkar, Prajwal Gatti, Anand Mishra

In this work, we study the task of "visually" translating scene text from a source language (e.g., English) to a target language (e.g., Chinese).

Scene Text Editing, Translation

Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering

no code implementations • 29 Jun 2023 • Abhirama Subramanyam Penamakuri, Manish Gupta, Mithun Das Gupta, Anand Mishra

We study visual question answering in a setting where the answer has to be mined from a pool of relevant and irrelevant images given as a context.

Answer Generation, Question Answering, +2

Query-guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch

no code implementations • 15 Mar 2023 • Aditay Tripathi, Anand Mishra, Anirban Chakraborty

and Sketchy datasets, respectively, and a $12.2\%$ improvement in AP@50 for large objects that are "unseen" during training.

Object, object-detection, +2

Few-Shot Referring Relationships in Videos

1 code implementation • CVPR 2023 • Yogesh Kumar, Anand Mishra

Given a query visual relationship as <subject, predicate, object> and a test video, our objective is to localize the subject and object that are connected via the predicate.

Object, Relation Network, +1

Contrastive Multi-View Textual-Visual Encoding: Towards One Hundred Thousand-Scale One-Shot Logo Identification

no code implementations • 23 Nov 2022 • Nakul Sharma, Abhirama S. Penamakuri, Anand Mishra

To fill this gap in the literature, we introduce the Wikidata Reference Logo Dataset (WiRLD), containing logos for 100K business brands harvested from Wikidata.

Logo Recognition

Look, Read and Ask: Learning to Ask Questions by Reading Text in Images

no code implementations • 23 Nov 2022 • Soumya Jahagirdar, Shankar Gangisetty, Anand Mishra

However, it is challenging as it requires an in-depth understanding of the scene and the ability to semantically bridge the visual content with the text present in the image.

Optical Character Recognition (OCR), Question Answering, +4

Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing

no code implementations • 3 Nov 2022 • Aditay Tripathi, Anand Mishra, Anirban Chakraborty

In VL-MPAG Net, we first construct a directed graph with object proposals as nodes and an edge between a pair of nodes representing a plausible relation between them.

Object, Object Localization

COFAR: Commonsense and Factual Reasoning in Image Search

no code implementations • 16 Oct 2022 • Prajwal Gatti, Abhirama Subramanyam Penamakuri, Revant Teotia, Anand Mishra, Shubhashis Sengupta, Roshni Ramnani

To enable both commonsense and factual reasoning in image search, we present a unified framework, namely Knowledge Retrieval-Augmented Multimodal Transformer (KRAMT), that treats the named visual entities in an image as a gateway to encyclopedic knowledge and leverages them, along with the natural language query, to ground relevant knowledge.

Image Retrieval, Retrieval, +1

Few-shot Visual Relationship Co-localization

1 code implementation • ICCV 2021 • Revant Teotia, Vaibhav Mishra, Mayank Maheshwari, Anand Mishra

In this paper, given a small bag of images, each containing a common but latent predicate, we are interested in localizing visual subject-object pairs connected via the common predicate in each of the images.

Meta-Learning, Object

Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues

no code implementations • 13 Jan 2016 • Anand Mishra, Karteek Alahari, C. V. Jawahar

We build a conditional random field model on these detections to jointly model the strength of the detections and the interactions between them.

Scene Text Recognition
