Search Results for author: Amanpreet Singh

Found 23 papers, 9 papers with code

Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality

no code implementations 7 Apr 2022 Tristan Thrush, Ryan Jiang, Max Bartolo, Amanpreet Singh, Adina Williams, Douwe Kiela, Candace Ross

We present a novel task and dataset for evaluating the ability of vision and language models to conduct visio-linguistic compositional reasoning, which we call Winoground.

FLAVA: A Foundational Language And Vision Alignment Model

no code implementations 8 Dec 2021 Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela

State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic pretraining for obtaining good performance on a variety of downstream tasks.

Zero-shot Image Retrieval, Zero-shot Text Retrieval

Human-Adversarial Visual Question Answering

no code implementations NeurIPS 2021 Sasha Sheng, Amanpreet Singh, Vedanuj Goswami, Jose Alberto Lopez Magana, Wojciech Galuba, Devi Parikh, Douwe Kiela

Human subjects interact with a state-of-the-art VQA model, and for each image in the dataset, attempt to find a question where the model's predicted answer is incorrect.

Question Answering, Visual Question Answering

TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text

no code implementations CVPR 2021 Amanpreet Singh, Guan Pang, Mandy Toh, Jing Huang, Wojciech Galuba, Tal Hassner

A crucial component for the scene text based reasoning required for TextVQA and TextCaps datasets involves detecting and recognizing text present in the images using an optical character recognition (OCR) system.

Optical Character Recognition, Scene Text Detection

Physics Informed Convex Artificial Neural Networks (PICANNs) for Optimal Transport based Density Estimation

no code implementations 2 Apr 2021 Amanpreet Singh, Martin Bauer, Sarang Joshi

Optimal Mass Transport (OMT) is a well studied problem with a variety of applications in a diverse set of fields ranging from Physics to Computer Vision and in particular Statistics and Data Science.

Density Estimation

UniT: Multimodal Multitask Learning with a Unified Transformer

2 code implementations ICCV 2021 Ronghang Hu, Amanpreet Singh

We propose UniT, a Unified Transformer model to simultaneously learn the most prominent tasks across different domains, ranging from object detection to natural language understanding and multimodal reasoning.

Multi-Task Learning, Natural Language Understanding

Open4Business(O4B): An Open Access Dataset for Summarizing Business Documents

1 code implementation 15 Nov 2020 Amanpreet Singh, Niranjan Balasubramanian

The dataset introduces a new challenge for summarization in the business domain, requiring highly abstractive and more concise summaries as compared to other existing datasets.

Are we pretraining it right? Digging deeper into visio-linguistic pretraining

no code implementations 19 Apr 2020 Amanpreet Singh, Vedanuj Goswami, Devi Parikh

Surprisingly, we show that automatically generated data in a domain closer to the downstream task (e.g., VQA v2) is a better choice for pretraining than "natural" data but of a slightly different domain (e.g., Conceptual Captions).

Visual Question Answering, VQA

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

3 code implementations NeurIPS 2019 Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks.

Transfer Learning

CanvasGAN: A simple baseline for text to image generation by incrementally patching a canvas

no code implementations 5 Oct 2018 Amanpreet Singh, Sharan Agrawal

We propose a new recurrent generative model for generating images from text captions while attending on specific parts of text captions.

Sentence Embeddings, Text to image generation

Neural Network Acceptability Judgments

1 code implementation TACL 2019 Alex Warstadt, Amanpreet Singh, Samuel R. Bowman

This paper investigates the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence.

General Classification, Language Acquisition

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

8 code implementations WS 2018 Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

For natural language understanding (NLU) technology to be maximally useful, both practically and as a scientific object of study, it must be general: it must be able to process language in a way that is not exclusively tailored to any one specific task or dataset.

Natural Language Inference, Natural Language Understanding
