no code implementations • ECCV 2020 • Dipanjan Das, Sandika Biswas, Sanjana Sinha, Brojeshwar Bhowmick
Current state-of-the-art methods fail to generate realistic animation from arbitrary speech on unknown faces due to their poor generalization across different facial characteristics, languages, and accents.
no code implementations • 5 Sep 2023 • Priyanka Bose, Dipanjan Das, Fabio Gritti, Nicola Ruaro, Christopher Kruegel, Giovanni Vigna
Yet, there are sophisticated actors who turn their domain knowledge and market inefficiencies to their strategic advantage, extracting value from trades that are not accessible to others.
no code implementations • 22 May 2023 • Elizabeth Clark, Shruti Rijhwani, Sebastian Gehrmann, Joshua Maynez, Roee Aharoni, Vitaly Nikolaev, Thibault Sellam, Aditya Siddhant, Dipanjan Das, Ankur P. Parikh
In this work, we introduce SEAHORSE, a dataset for multilingual, multifaceted summarization evaluation.
no code implementations • 28 Apr 2023 • Fantine Huot, Joshua Maynez, Shashi Narayan, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Anders Sandholm, Dipanjan Das, Mirella Lapata
While conditional generation models can now generate natural language well enough to create fluent text, it is still difficult to control the generation process, leading to irrelevant, repetitive, and hallucinated content.
1 code implementation • 15 Dec 2022 • Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development.
1 code implementation • 15 Nov 2022 • Priyanka Agrawal, Chris Alberti, Fantine Huot, Joshua Maynez, Ji Ma, Sebastian Ruder, Kuzman Ganchev, Dipanjan Das, Mirella Lapata
The availability of large, high-quality datasets has been one of the main drivers of recent progress in question answering (QA).
no code implementations • 31 Oct 2022 • Reinald Kim Amplayo, Kellie Webster, Michael Collins, Dipanjan Das, Shashi Narayan
Large language models (LLMs) have been shown to perform well in answering questions and in producing long-form texts, both in few-shot closed-book settings.
2 code implementations • 6 Oct 2022 • Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei
Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.
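For context, a minimal sketch of how a few-shot chain-of-thought prompt is assembled; the exemplar below is illustrative only and not drawn from the benchmark:

```python
# Sketch of few-shot chain-of-thought prompting: worked examples (question
# plus step-by-step answer) are concatenated before the target question so
# the model continues with its own reasoning. Exemplars are made up here.

def build_cot_prompt(exemplars, question):
    parts = []
    for q, reasoning in exemplars:
        parts.append(f"Question: {q}\nStep-by-step answer: {reasoning}\n")
    parts.append(f"Question: {question}\nStep-by-step answer:")
    return "\n".join(parts)

exemplars = [
    ("Roger has 5 balls and buys 2 cans of 3 balls each. How many balls?",
     "Roger starts with 5 balls. 2 cans of 3 balls is 6 balls. "
     "5 + 6 = 11. The answer is 11."),
]
print(build_cot_prompt(exemplars, "A shop sells 4 boxes of 6 eggs. How many eggs?"))
```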
1 code implementation • 1 Jul 2022 • Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Anders Sandholm, Dipanjan Das, Mirella Lapata
The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details.
1 code implementation • ACL 2022 • Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies.
1 code implementation • 23 Dec 2021 • Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter
With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output is only sharing verifiable information about the external world.
no code implementations • ACL 2021 • Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das
At training time, additional inputs based on these evaluation measures are given to the dialogue model.
2 code implementations • ICLR 2022 • Thibault Sellam, Steve Yadlowsky, Jason Wei, Naomi Saphra, Alexander D'Amour, Tal Linzen, Jasmijn Bastings, Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick
Experiments with pre-trained models such as BERT are often based on a single checkpoint.
no code implementations • 9 Feb 2021 • Eunsol Choi, Jennimaria Palomaki, Matthew Lamm, Tom Kwiatkowski, Dipanjan Das, Michael Collins
Models for question answering, dialogue agents, and summarization often interpret the meaning of a sentence in a rich context and use that meaning in a new context.
no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Extreme Summarization on GEM-XSum. Tasks: Abstractive Text Summarization, Cross-Lingual Abstractive Summarization, +5 more.
no code implementations • WMT (EMNLP) 2020 • Thibault Sellam, Amy Pu, Hyung Won Chung, Sebastian Gehrmann, Qijun Tan, Markus Freitag, Dipanjan Das, Ankur P. Parikh
The quality of machine translation systems has dramatically improved over the last decade, and as a result, evaluation has become an increasingly challenging problem.
no code implementations • 10 May 2020 • Vignesh Prasad, Dipanjan Das, Brojeshwar Bhowmick
Since we wish to efficiently discriminate between different clusters in the data, we propose a method based on VAEs where we use a Gaussian Mixture prior to help cluster the images accurately.
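A minimal sketch of the core ingredient, a Gaussian-mixture prior over the latent space with a Monte Carlo KL estimate; the mixture parameters and encoder outputs below are placeholders, not the paper's architecture:

```python
import torch
from torch.distributions import Categorical, Independent, MixtureSameFamily, Normal

K, D = 10, 32                                    # mixture components, latent dim
prior = MixtureSameFamily(
    Categorical(logits=torch.zeros(K)),          # uniform mixture weights
    Independent(Normal(torch.randn(K, D), torch.ones(K, D)), 1),
)

def kl_to_mixture(mu, logvar, n_samples=8):
    """Monte Carlo estimate of KL(q(z|x) || p(z)) under the mixture prior,
    computed as the average of log q(z|x) - log p(z) over samples."""
    q = Independent(Normal(mu, (0.5 * logvar).exp()), 1)
    z = q.rsample((n_samples,))                  # reparameterized samples
    return (q.log_prob(z) - prior.log_prob(z)).mean(0)

mu, logvar = torch.zeros(4, D), torch.zeros(4, D)  # stand-in encoder outputs
print(kl_to_mixture(mu, logvar).mean())            # KL part of the ELBO
```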
1 code implementation • EMNLP 2020 • Ankur P. Parikh, Xuezhi Wang, Sebastian Gehrmann, Manaal Faruqui, Bhuwan Dhingra, Diyi Yang, Dipanjan Das
We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.
Ranked #3 on Data-to-Text Generation on ToTTo.
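For reference, the public ToTTo release stores each example as a JSON object carrying the table, the page and section titles, and the highlighted cell indices; a small sketch of reading one example (the file path is hypothetical):

```python
import json

def highlighted_values(example):
    """example["table"] is a list of rows; each cell is a dict with keys such
    as "value", "is_header", "row_span", and "column_span".
    example["highlighted_cells"] lists [row, column] index pairs."""
    table = example["table"]
    return [table[r][c]["value"] for r, c in example["highlighted_cells"]]

with open("totto_dev_data.jsonl") as f:  # hypothetical path
    example = json.loads(f.readline())
print(example["table_page_title"], "/", example["table_section_title"])
print(highlighted_values(example))
print(example["sentence_annotations"][0]["final_sentence"])
```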
1 code implementation • ACL 2020 • Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen
Pretrained neural models such as BERT, when fine-tuned to perform natural language inference (NLI), often show high accuracy on standard datasets, but display a surprising lack of sensitivity to word order on controlled challenge sets.
3 code implementations • ACL 2020 • Thibault Sellam, Dipanjan Das, Ankur P. Parikh
We propose BLEURT, a learned evaluation metric based on BERT that can model human judgments with a few thousand possibly biased training examples.
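The released scorer can be used directly; a minimal usage sketch of the open-source implementation (github.com/google-research/bleurt), assuming a downloaded checkpoint (the "BLEURT-20" path below is a placeholder):

```python
from bleurt import score

scorer = score.BleurtScorer("BLEURT-20")  # path to a downloaded checkpoint
scores = scorer.score(
    references=["The cat sat on the mat."],
    candidates=["A cat was sitting on the mat."],
)
print(scores)  # one quality score per candidate, higher is better
```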
1 code implementation • ACL 2019 • Bhuwan Dhingra, Manaal Faruqui, Ankur Parikh, Ming-Wei Chang, Dipanjan Das, William W. Cohen
Automatically constructed datasets for generating text from semi-structured data (tables), such as WikiBio, often contain reference texts that diverge from the information in the corresponding semi-structured data.
2 code implementations • ICLR 2019 • Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick
Code: the jiant toolkit for general-purpose text understanding models.
1 code implementation • ACL 2019 • Ian Tenney, Dipanjan Das, Ellie Pavlick
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks.
no code implementations • NAACL 2019 • Hao Peng, Ankur P. Parikh, Manaal Faruqui, Bhuwan Dhingra, Dipanjan Das
We propose a novel conditioned text generation model.
no code implementations • 19 Jan 2019 • Dipanjan Das, Ratul Ghosh, Brojeshwar Bhowmick
Despite significant advances in clustering methods in recent years, the outcome of clustering on natural image datasets is still unsatisfactory due to two important drawbacks.
no code implementations • 23 Dec 2018 • Vignesh Prasad, Dipanjan Das, Brojeshwar Bhowmick
The proposed method results in better depth images and pose estimates, which capture the scene structure and motion in a better way.
1 code implementation • EMNLP 2018 • Manaal Faruqui, Dipanjan Das
Understanding search queries is a hard problem as it involves dealing with "word salad" text ubiquitously issued by users.
Ranked #1 on Query Wellformedness on Query Wellformedness.
1 code implementation • EMNLP 2018 • Jan A. Botha, Manaal Faruqui, John Alex, Jason Baldridge, Dipanjan Das
Split and rephrase is the task of breaking down a sentence into shorter ones that together convey the same meaning.
no code implementations • EMNLP 2018 • Manaal Faruqui, Ellie Pavlick, Ian Tenney, Dipanjan Das
We release a corpus of 43 million atomic edits across 8 languages.
no code implementations • WS 2017 • Gaurav Singh Tomar, Thyago Duque, Oscar Täckström, Jakob Uszkoreit, Dipanjan Das
We present a solution to the problem of paraphrase identification of questions.
1 code implementation • 4 Nov 2016 • Kenton Lee, Shimi Salant, Tom Kwiatkowski, Ankur Parikh, Dipanjan Das, Jonathan Berant
In this paper, we focus on this answer extraction task, presenting a novel model architecture that efficiently builds fixed-length representations of all spans in the evidence document with a recurrent network.
Ranked #43 on Question Answering on SQuAD1.1 dev.
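A toy sketch of the general idea, building a fixed-length vector for every span from the endpoint states of a bidirectional recurrent encoder; module names and dimensions are illustrative, not the paper's exact model:

```python
import torch
import torch.nn as nn

class SpanEnumerator(nn.Module):
    def __init__(self, dim=64, max_width=10):
        super().__init__()
        self.encoder = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.max_width = max_width

    def forward(self, token_embeddings):
        h, _ = self.encoder(token_embeddings)      # (batch, seq, 2*dim)
        batch, seq, _ = h.shape
        spans, reps = [], []
        for i in range(seq):
            for j in range(i, min(i + self.max_width, seq)):
                spans.append((i, j))
                # fixed-length span vector: [start state; end state]
                reps.append(torch.cat([h[:, i], h[:, j]], dim=-1))
        return spans, torch.stack(reps, dim=1)     # (batch, n_spans, 4*dim)

enc = SpanEnumerator()
spans, reps = enc(torch.randn(2, 12, 64))
print(len(spans), reps.shape)
```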
10 code implementations • EMNLP 2016 • Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
We propose a simple neural architecture for natural language inference.
Ranked #48 on Natural Language Inference on SNLI.
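The paper's attend-compare-aggregate decomposition is compact enough to sketch; the following is a minimal PyTorch rendering with illustrative dimensions and MLPs, not the authors' exact configuration (which also has an optional intra-sentence attention variant):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(d_in, d_out):
    return nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())

class DecomposableAttention(nn.Module):
    def __init__(self, dim=100, n_classes=3):
        super().__init__()
        self.attend = mlp(dim, dim)                    # F in the paper
        self.compare = mlp(2 * dim, dim)               # G
        self.classify = nn.Linear(2 * dim, n_classes)  # H

    def forward(self, a, b):              # a: (B, La, d), b: (B, Lb, d)
        e = self.attend(a) @ self.attend(b).transpose(1, 2)   # (B, La, Lb)
        beta = F.softmax(e, dim=2) @ b    # soft alignment of b to each a_i
        alpha = F.softmax(e, dim=1).transpose(1, 2) @ a       # and vice versa
        v1 = self.compare(torch.cat([a, beta], dim=-1)).sum(1)
        v2 = self.compare(torch.cat([b, alpha], dim=-1)).sum(1)
        return self.classify(torch.cat([v1, v2], dim=-1))

model = DecomposableAttention()
logits = model(torch.randn(4, 7, 100), torch.randn(4, 9, 100))
print(logits.shape)  # (4, 3): entailment / contradiction / neutral scores
```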
1 code implementation • TACL 2016 • Siva Reddy, Oscar Täckström, Michael Collins, Tom Kwiatkowski, Dipanjan Das, Mark Steedman, Mirella Lapata
In contrast, partly due to the lack of a strong type system, dependency structures are easy to annotate and have become a widely used form of syntactic analysis for many languages.
no code implementations • TACL 2015 • Oscar Täckström, Kuzman Ganchev, Dipanjan Das
We present a dynamic programming algorithm for efficient constrained inference in semantic role labeling.
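As a toy illustration of constrained Viterbi-style inference (token-level and simplified, not the paper's span-based algorithm), the dynamic-programming state below carries a bitmask of core roles already used, enforcing that each core role is assigned at most once:

```python
import numpy as np

ROLES = ["O", "A0", "A1"]        # "O" = non-argument; A0/A1 = core roles
CORE_BIT = {1: 0b01, 2: 0b10}    # role index -> bitmask bit

def constrained_decode(scores):
    """scores: (n_tokens, n_roles) log-potentials -> (score, role sequence)."""
    best = {0: (0.0, [])}        # bitmask of used core roles -> best prefix
    for t in range(scores.shape[0]):
        nxt = {}
        for mask, (s, seq) in best.items():
            for role in range(scores.shape[1]):
                bit = CORE_BIT.get(role, 0)
                if bit & mask:
                    continue     # this core role was already assigned: prune
                cand = (s + scores[t, role], seq + [ROLES[role]])
                key = mask | bit
                if key not in nxt or cand[0] > nxt[key][0]:
                    nxt[key] = cand
        best = nxt
    return max(best.values(), key=lambda x: x[0])

score, labels = constrained_decode(np.log(np.random.rand(5, 3)))
print(score, labels)
```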
no code implementations • ACL 2013 • Ryan McDonald, Joakim Nivre, Yvonne Quirmbach-Brundage, Yoav Goldberg, Dipanjan Das, Kuzman Ganchev, Keith Hall, Slav Petrov, Hao Zhang, Oscar Täckström, Claudia Bedini, Núria Bertomeu Castelló, Jungmee Lee
no code implementations • TACL 2013 • Oscar Täckström, Dipanjan Das, Slav Petrov, Ryan McDonald, Joakim Nivre
We consider the construction of part-of-speech taggers for resource-poor languages.
1 code implementation • LREC 2012 • Slav Petrov, Dipanjan Das, Ryan McDonald
To facilitate future research in unsupervised induction of syntactic structure and to standardize best-practices, we propose a tagset that consists of twelve universal part-of-speech categories.
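The twelve categories are NOUN, VERB, ADJ, ADV, PRON, DET, ADP, NUM, CONJ, PRT, "." (punctuation), and X; a small sketch of mapping fine-grained Penn Treebank tags onto them, with only a few representative entries shown (the full per-treebank mappings are distributed with the paper's resources):

```python
UNIVERSAL_TAGS = ["NOUN", "VERB", "ADJ", "ADV", "PRON", "DET",
                  "ADP", "NUM", "CONJ", "PRT", ".", "X"]

PTB_TO_UNIVERSAL = {   # a few representative entries
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ", "RB": "ADV", "PRP": "PRON",
    "DT": "DET", "IN": "ADP", "CD": "NUM",
    "CC": "CONJ", "RP": "PRT", ",": ".", "FW": "X",
}

def to_universal(fine_tag):
    """Map a fine-grained tag to its universal category ("X" if unknown)."""
    return PTB_TO_UNIVERSAL.get(fine_tag, "X")

print([to_universal(t) for t in ["NNP", "VBZ", "JJ", "IN"]])
```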