no code implementations • EMNLP 2021 • Arjun Akula, Soravit Changpinyo, Boqing Gong, Piyush Sharma, Song-Chun Zhu, Radu Soricut
One challenge in evaluating visual question answering (VQA) models in the cross-dataset adaptation setting is that the distribution shifts are multi-modal, making it difficult to identify whether shifts in visual or language features play the key role.
no code implementations • 13 Oct 2022 • Ayush Maheshwari, Piyush Sharma, Preethi Jyothi, Ganesh Ramakrishnan
In this work, we present DictDis, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries.
2 code implementations • Findings (EMNLP) 2021 • Mert İnan, Piyush Sharma, Baber Khalid, Radu Soricut, Matthew Stone, Malihe Alikhani
Developers of text generation models rely on automated evaluation metrics as a stand-in for slow and expensive manual evaluations.
2 code implementations • CVPR 2021 • Soravit Changpinyo, Piyush Sharma, Nan Ding, Radu Soricut
The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training.
Ranked #9 on Image Captioning on nocaps-val-out-domain
1 code implementation • CoNLL (EMNLP) 2021 • Edwin G. Ng, Bo Pang, Piyush Sharma, Radu Soricut
Image captioning models generally lack the capability to take into account user interest, and usually default to global descriptions that try to balance readability, informativeness, and information overload.
no code implementations • COLING 2022 • Khyathi Raghavi Chandu, Piyush Sharma, Soravit Changpinyo, Ashish Thapliyal, Radu Soricut
Training large-scale image captioning (IC) models demands access to a rich and diverse set of training examples, gathered from the wild, often from noisy alt-text data.
no code implementations • ACL 2020 • Malihe Alikhani, Piyush Sharma, Shengjie Li, Radu Soricut, Matthew Stone
We use coherence relations inspired by computational models of discourse to study the information needs and goals of image captioning.
no code implementations • 21 Nov 2019 • Paul Hongsuck Seo, Piyush Sharma, Tomer Levinboim, Bohyung Han, Radu Soricut
Human ratings are currently the most accurate way to assess the quality of an image captioning model, yet often the only outcome used from an expensive human rating evaluation is a few overall statistics over the evaluation dataset.
46 code implementations • ICLR 2020 • Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks.
Ranked #1 on Natural Language Inference on QNLI
no code implementations • IJCNLP 2019 • Maxwell Forbes, Christine Kaeser-Chen, Piyush Sharma, Serge Belongie
We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds.
1 code implementation • NAACL 2021 • Tomer Levinboim, Ashish V. Thapliyal, Piyush Sharma, Radu Soricut
Automatic image captioning has improved significantly over the last few years, but the problem is far from solved, with state-of-the-art models still often producing low-quality captions when used in the wild.
no code implementations • IJCNLP 2019 • Soravit Changpinyo, Bo Pang, Piyush Sharma, Radu Soricut
Object detection plays an important role in current solutions to vision and language tasks like image captioning and visual question answering.
Ranked #4 on Visual Question Answering (VQA) on VizWiz 2018
no code implementations • ACL 2019 • Sanqiang Zhao, Piyush Sharma, Tomer Levinboim, Radu Soricut
An image caption should fluently present the essential information in a given image, including informative, fine-grained entity mentions and the manner in which these entities interact.
1 code implementation • ACL 2018 • Piyush Sharma, Nan Ding, Sebastian Goodman, Radu Soricut
We present a new dataset of image caption annotations, Conceptual Captions, which contains an order of magnitude more images than the MS-COCO dataset (Lin et al., 2014) and represents a wider variety of both images and image caption styles.
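The Conceptual Captions release pairs each caption with an image URL in tab-separated files. As a minimal sketch, assuming a simple "caption<TAB>url" line format (the exact column layout should be verified against the actual release files), such a file can be parsed as follows:

```python
import csv
import io

def load_caption_pairs(tsv_text):
    """Parse (caption, url) pairs from a Conceptual Captions-style TSV string.

    Assumes each row is "caption<TAB>image_url"; this layout is an
    assumption for illustration, not a guarantee about the release format.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [(caption, url) for caption, url in reader]

# Hypothetical sample row for demonstration only.
sample = "a dog runs on the beach\thttp://example.com/dog.jpg\n"
pairs = load_caption_pairs(sample)
print(pairs)
```

Since the dataset ships URLs rather than image files, a training pipeline would typically fetch and filter the images in a separate download step.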