Search Results for author: Vishaal Udandarao

Found 13 papers, 9 papers with code

A Practitioner's Guide to Continual Multimodal Pretraining

1 code implementation · 26 Aug 2024 · Karsten Roth, Vishaal Udandarao, Sebastian Dziadzio, Ameya Prabhu, Mehdi Cherti, Oriol Vinyals, Olivier Hénaff, Samuel Albanie, Matthias Bethge, Zeynep Akata

In this work, we complement current perspectives on continual pretraining through a research test bed as well as provide comprehensive guidance for effective continual model updates in such scenarios.

Continual Learning, Continual Pretraining (+1 more)

CiteME: Can Language Models Accurately Cite Scientific Claims?

no code implementations · 10 Jul 2024 · Ori Press, Andreas Hochlehnert, Ameya Prabhu, Vishaal Udandarao, Ofir Press, Matthias Bethge

We pose the following research question: Given a text excerpt referencing a paper, could an LM act as a research assistant to correctly identify the referenced paper?

Attribute

No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance

1 code implementation · 4 Apr 2024 · Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H. S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge

Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation performance of multimodal models, such as CLIP for classification/retrieval and Stable-Diffusion for image generation.

Benchmarking, Image Generation (+1 more)

Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress

1 code implementation · 29 Feb 2024 · Ameya Prabhu, Vishaal Udandarao, Philip Torr, Matthias Bethge, Adel Bibi, Samuel Albanie

However, with repeated testing, the risk of overfitting grows as algorithms over-exploit benchmark idiosyncrasies.

Benchmarking

Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models

1 code implementation · 12 Oct 2023 · Vishaal Udandarao, Max F. Burg, Samuel Albanie, Matthias Bethge

This finding points to a blind spot in current frontier VLMs: they excel in recognizing semantic content but fail to acquire an understanding of visual data-types through scaling.

SuS-X: Training-Free Name-Only Transfer of Vision-Language Models

2 code implementations · ICCV 2023 · Vishaal Udandarao, Ankush Gupta, Samuel Albanie

Contrastive Language-Image Pre-training (CLIP) has emerged as a simple yet effective way to train large-scale vision-language models.

Retrieval, Zero-Shot Learning

It's LeVAsa not LevioSA! Latent Encodings for Valence-Arousal Structure Alignment

1 code implementation · 20 Jul 2020 · Surabhi S. Nath, Vishaal Udandarao, Jainendra Shukla

We build a novel algorithm for mapping categorical and dimensional model labels using annotation transfer across affective facial image datasets.

DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors

1 code implementation · 10 Jun 2020 · Sarthak Bhagat, Vishaal Udandarao, Shagun Uppal

Disentangling the underlying feature attributes within an image with no prior supervision is a challenging task.

Attribute, Contrastive Learning (+1 more)

COBRA: Contrastive Bi-Modal Representation Algorithm

1 code implementation · 7 May 2020 · Vishaal Udandarao, Abhishek Maiti, Deepak Srivatsav, Suryatej Reddy Vyalla, Yifang Yin, Rajiv Ratn Shah

In this paper, we present a novel framework COBRA that aims to train two modalities (image and text) in a joint fashion inspired by the Contrastive Predictive Coding (CPC) and Noise Contrastive Estimation (NCE) paradigms which preserve both inter and intra-class relationships.

Cross-Modal Retrieval, Image Captioning (+3 more)

EDUQA: Educational Domain Question Answering System using Conceptual Network Mapping

no code implementations · 12 Nov 2019 · Abhishek Agarwal, Nikhil Sachdeva, Raj Kamal Yadav, Vishaal Udandarao, Vrinda Mittal, Anubha Gupta, Abhinav Mathur

Most existing question answering models fall largely into two categories: (i) open-domain question answering models, which answer generic questions and use a large-scale knowledge base along with targeted web-corpus retrieval, and (ii) closed-domain question answering models, which address a focused question area and use complex deep learning models.

Answer Generation, Open-Domain Question Answering (+1 more)
