Search Results for author: Ajay Divakaran

Found 42 papers, 6 papers with code

FoodX-251: A Dataset for Fine-grained Food Classification

1 code implementation 14 Jul 2019 Parneet Kaur, Karan Sikka, Weijun Wang, Serge Belongie, Ajay Divakaran

Food classification is a challenging problem due to the large number of categories, high visual similarity between different foods, as well as the lack of datasets for training state-of-the-art deep models.

Classification Fine-Grained Visual Categorization +1

Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models

1 code implementation 8 Sep 2023 Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran

Based on this pipeline and the existing coarse-grained annotated dataset, we build the CURE benchmark to measure both the zero-shot reasoning performance and consistency of VLMs.

Visual Reasoning

Integrating Text and Image: Determining Multimodal Document Intent in Instagram Posts

1 code implementation IJCNLP 2019 Julia Kruk, Jonah Lubin, Karan Sikka, Xiao Lin, Dan Jurafsky, Ajay Divakaran

Computing author intent from multimodal data like Instagram posts requires modeling a complex relationship between text and image.

Intent Detection

Modular Adaptation for Cross-Domain Few-Shot Learning

1 code implementation 1 Apr 2021 Xiao Lin, Meng Ye, Yunye Gong, Giedrius Buracas, Nikoletta Basiou, Ajay Divakaran, Yi Yao

Adapting pre-trained representations has become the go-to recipe for learning new downstream tasks with limited examples.

cross-domain few-shot learning Representation Learning

Probing Conceptual Understanding of Large Visual-Language Models

1 code implementation 7 Apr 2023 Madeline Chantry Schiappa, Michael Cogswell, Ajay Divakaran, Yogesh Singh Rawat

In recent years large visual-language (V+L) models have achieved great success in various downstream tasks.

Benchmarking

Zero-Shot Object Detection

no code implementations ECCV 2018 Ankan Bansal, Karan Sikka, Gaurav Sharma, Rama Chellappa, Ajay Divakaran

We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training.

Object object-detection +2

Human Social Interaction Modeling Using Temporal Deep Networks

no code implementations 6 May 2015 Mohamed R. Amer, Behjat Siddiquie, Amir Tamrakar, David A. Salter, Brian Lande, Darius Mehri, Ajay Divakaran

We present a novel approach to computational modeling of social interactions based on modeling of essential social interaction predicates (ESIPs) such as joint attention and entrainment.

Understanding Visual Ads by Aligning Symbols and Objects using Co-Attention

no code implementations 4 Jul 2018 Karuna Ahuja, Karan Sikka, Anirban Roy, Ajay Divakaran

We show that our model outperforms other baselines on the benchmark Ad dataset and also show qualitative results to highlight the advantages of using multihop co-attention.

Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation

no code implementations 26 Nov 2018 Pallabi Ghosh, Yi Yao, Larry S. Davis, Ajay Divakaran

We show results on CAD120 (which provides pre-computed node features and edge weights for fair performance comparison across algorithms) as well as a more complex real-world activity dataset, Charades.

Action Recognition Action Segmentation +2

Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval

no code implementations 5 Apr 2019 Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas

Our experiments, therefore, demonstrate that ExAG is an effective means to evaluate the efficacy of AI-generated explanations on a human-AI collaborative task.

Image Retrieval Question Answering +2

Data-Efficient Mutual Information Neural Estimator

no code implementations 8 May 2019 Xiao Lin, Indranil Sur, Samuel A. Nastase, Ajay Divakaran, Uri Hasson, Mohamed R. Amer

We demonstrate the effectiveness of our estimators on synthetic benchmarks and real-world fMRI data, with application to inter-subject correlation analysis.

Meta-Learning

Deep Unified Multimodal Embeddings for Understanding both Content and Users in Social Media Networks

no code implementations 17 May 2019 Karan Sikka, Lucas Van Bramer, Ajay Divakaran

We also show that the user embeddings learned within our joint multimodal embedding model are better at predicting user interests compared to those learned with unimodal content on Instagram data.

Cross-Modal Retrieval Retrieval

Progressive Growing of Neural ODEs

no code implementations ICLR Workshop DeepDiffEq 2019 Hammad A. Ayyubi, Yi Yao, Ajay Divakaran

Neural Ordinary Differential Equations (NODEs) have proven to be a powerful modeling tool for approximating (interpolation) and forecasting (extrapolation) irregularly sampled time series data.

Time Series Time Series Forecasting

Deep Adaptive Semantic Logic (DASL): Compiling Declarative Knowledge into Deep Neural Networks

no code implementations 16 Mar 2020 Karan Sikka, Andrew Silberfarb, John Byrnes, Indranil Sur, Ed Chow, Ajay Divakaran, Richard Rohwer

We introduce Deep Adaptive Semantic Logic (DASL), a novel framework for automating the generation of deep neural networks that incorporates user-provided formal knowledge to improve learning from data.

Image Classification Relationship Detection +1

Lifelong Learning using Eigentasks: Task Separation, Skill Acquisition, and Selective Transfer

no code implementations 14 Jul 2020 Aswin Raghavan, Jesse Hostetler, Indranil Sur, Abrar Rahman, Ajay Divakaran

We propose a wake-sleep cycle of alternating task learning and knowledge consolidation for learning in our framework, and instantiate it for lifelong supervised learning and lifelong RL.

Continual Learning Transfer Learning

Towards Solving Multimodal Comprehension

no code implementations 20 Apr 2021 Pritish Sahu, Karan Sikka, Ajay Divakaran

We then evaluate M3C using a textual cloze style question-answering task and highlight an inherent bias in the question answer generation method from [35] that enables a naive baseline to cheat by learning from only answer choices.

16k Answer Generation +3

Challenges in Procedural Multimodal Machine Comprehension: A Novel Way To Benchmark

no code implementations 22 Oct 2021 Pritish Sahu, Karan Sikka, Ajay Divakaran

We also observe a drop in performance across all the models when testing on RecipeQA and proposed Meta-RecipeQA (e.g. 83.6% versus 67.1% for HTRN), which shows that the proposed dataset is relatively less biased.

Answer Generation Machine Reading Comprehension +2

A Data-Efficient Mutual Information Neural Estimator for Statistical Dependency Testing

no code implementations 25 Sep 2019 Xiao Lin, Indranil Sur, Samuel A. Nastase, Uri Hasson, Ajay Divakaran, Mohamed R. Amer

Measuring Mutual Information (MI) between high-dimensional, continuous, random variables from observed samples has wide theoretical and practical applications.

Meta-Learning

Lifelong Learning using Eigentasks: Task Separation, Skill Acquisition and Selective Transfer

no code implementations ICML Workshop LifelongML 2020 Aswin Raghavan, Jesse Hostetler, Indranil Sur, Abrar Rahman, Ajay Divakaran

We propose a wake-sleep cycle of alternating task learning and knowledge consolidation for learning in our framework, and instantiate it for lifelong supervised learning and lifelong RL.

Continual Learning Starcraft +1

Detecting out-of-context objects using contextual cues

no code implementations 11 Feb 2022 Manoj Acharya, Anirban Roy, Kaushik Koneripalli, Susmit Jha, Christopher Kanan, Ajay Divakaran

GCRN consists of two separate graphs to predict object labels based on the contextual cues in the image: 1) a representation graph to learn object features based on the neighboring objects and 2) a context graph to explicitly capture contextual cues from the neighboring objects.

Anomaly Detection Object

Towards Understanding Confusion and Affective States Under Communication Failures in Voice-Based Human-Machine Interaction

no code implementations 15 Jul 2022 Sujeong Kim, Abhinav Garlapati, Jonah Lubin, Amir Tamrakar, Ajay Divakaran

We present a series of two studies conducted to understand users' affective states during voice-based human-machine interactions.

Model-Free Generative Replay for Lifelong Reinforcement Learning: Application to Starcraft-2

no code implementations 9 Aug 2022 Zachary Daniels, Aswin Raghavan, Jesse Hostetler, Abrar Rahman, Indranil Sur, Michael Piacentino, Ajay Divakaran

We present a version of GR for LRL that satisfies two desiderata: (a) Introspective density modelling of the latent representations of policies learned using deep RL, and (b) Model-free end-to-end learning.

Management reinforcement-learning +3

Unpacking Large Language Models with Conceptual Consistency

no code implementations 29 Sep 2022 Pritish Sahu, Michael Cogswell, Yunye Gong, Ajay Divakaran

The success of Large Language Models (LLMs) indicates they are increasingly able to answer queries like these accurately, but that ability does not necessarily imply a general understanding of concepts relevant to the anchor query.

Language Modelling Large Language Model

System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games

no code implementations 8 Dec 2022 Indranil Sur, Zachary Daniels, Abrar Rahman, Kamil Faber, Gianmarco J. Gallardo, Tyler L. Hayes, Cameron E. Taylor, Mustafa Burak Gurbuz, James Smith, Sahana Joshi, Nathalie Japkowicz, Michael Baron, Zsolt Kira, Christopher Kanan, Roberto Corizzo, Ajay Divakaran, Michael Piacentino, Jesse Hostetler, Aswin Raghavan

In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system.

Continual Learning reinforcement-learning +2

Confidence Calibration for Systems with Cascaded Predictive Modules

no code implementations 21 Sep 2023 Yunye Gong, Yi Yao, Xiao Lin, Ajay Divakaran, Melinda Gervasio

Existing conformal prediction algorithms estimate prediction intervals at target confidence levels to characterize the performance of a regression model on new test samples.

Conformal Prediction Prediction Intervals +1

Demonstrations Are All You Need: Advancing Offensive Content Paraphrasing using In-Context Learning

no code implementations 16 Oct 2023 Anirudh Som, Karan Sikka, Helen Gent, Ajay Divakaran, Andreas Kathol, Dimitra Vergyri

Paraphrasing of offensive content is a better alternative to content removal and helps improve civility in a communication environment.

In-Context Learning

DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

no code implementations 16 Nov 2023 Yangyi Chen, Karan Sikka, Michael Cogswell, Heng Ji, Ajay Divakaran

The critique NLF identifies the strengths and weaknesses of the responses and is used to align the LVLMs with human preferences.

Language Modelling

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval

no code implementations 30 Nov 2023 Matthew Gwilliam, Michael Cogswell, Meng Ye, Karan Sikka, Abhinav Shrivastava, Ajay Divakaran

To provide a more thorough evaluation of the capabilities of long video retrieval systems, we propose a pipeline that leverages state-of-the-art large language models to carefully generate a diverse set of synthetic captions for long videos.

Benchmarking Retrieval +2

BloomVQA: Assessing Hierarchical Multi-modal Comprehension

no code implementations 20 Dec 2023 Yunye Gong, Robik Shrestha, Jared Claypoole, Michael Cogswell, Arijit Ray, Christopher Kanan, Ajay Divakaran

We propose a novel VQA dataset, BloomVQA, to facilitate comprehensive evaluation of large vision-language models on comprehension tasks.

Data Augmentation Memorization +2
