1 code implementation • EMNLP 2021 • Ramakanth Pasunuru, Veselin Stoyanov, Mohit Bansal
In this work, we propose a continual few-shot learning (CFL) task, in which a system is challenged with a difficult phenomenon and asked to learn to correct mistakes with only a few (10 to 15) training examples.
1 code implementation • NAACL 2022 • Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Ido Dagan, Yael Amsterdamer
Interactive summarization is a task that facilitates user-guided exploration of information within a document set.
no code implementations • ACL (RepL4NLP) 2021 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
Many recalibration methods have been proposed in the literature for quantifying predictive uncertainty and calibrating model outputs, with varying degrees of complexity.
no code implementations • NAACL (sdp) 2021 • Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, Preethi Jyothi
Large pretrained models have seen enormous success in extractive summarization tasks.
no code implementations • 30 Jan 2024 • Silin Gao, Jane Dwivedi-Yu, Ping Yu, Xiaoqing Ellen Tan, Ramakanth Pasunuru, Olga Golovneva, Koustuv Sinha, Asli Celikyilmaz, Antoine Bosselut, Tianlu Wang
LLM agents trained with our method also show more efficient tool use, with inference on average ~1.4x faster than baseline tool-augmented LLMs.
no code implementations • 8 Dec 2023 • Olga Golovneva, Sean O'Brien, Ramakanth Pasunuru, Tianlu Wang, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz
Using constrained reasoning, PathFinder integrates novel quality constraints, pruning, and exploration methods to enhance both the efficiency and quality of generation.
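A minimal sketch of the tree-search-with-pruning idea described above, assuming illustrative `generate_step` and `score` functions rather than PathFinder's actual components:

```python
# Hypothetical sketch of tree-search decoding over reasoning paths;
# `generate_step` and `score` are assumed stand-ins, not the paper's API.
import heapq

def tree_search(prompt, generate_step, score, branch=4, keep=2, max_depth=5):
    """Expand each partial reasoning path `branch` ways, then prune to `keep`."""
    paths = [(0.0, prompt)]                                # (quality, partial path)
    for _ in range(max_depth):
        candidates = []
        for _, path in paths:
            for step in generate_step(path, n=branch):     # sample candidate next steps
                new_path = path + step
                candidates.append((score(new_path), new_path))
        # Pruning: keep only the highest-scoring partial paths.
        paths = heapq.nlargest(keep, candidates, key=lambda c: c[0])
    return max(paths, key=lambda c: c[0])[1]
```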
no code implementations • 8 Oct 2023 • Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz
Large language models (LLMs) have made great strides due to the effectiveness of the self-attention mechanism, which processes and compares all tokens at once.
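As background for the snippet above, the "compares all tokens at once" property is visible in the standard scaled dot-product self-attention formulation; a minimal NumPy sketch (not this paper's contribution):

```python
# Standard scaled dot-product self-attention: the QK^T product compares
# every token against every other token in one matrix multiplication.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv         # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # all-pairs token comparison, O(n^2)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                        # each token aggregates all others
```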
1 code implementation • 7 Oct 2023 • Jiacheng Liu, Ramakanth Pasunuru, Hannaneh Hajishirzi, Yejin Choi, Asli Celikyilmaz
Extensive work has shown that the performance and interpretability of commonsense reasoning can be improved via knowledge-augmented reasoning methods, where the knowledge that underpins the reasoning process is explicitly verbalized and utilized.
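A hedged sketch of the generic knowledge-augmented reasoning loop the snippet describes, where `llm` is an assumed text-completion callable and the prompts are illustrative only:

```python
# Hypothetical two-stage knowledge-augmented reasoning pipeline; `llm` is
# an assumed text-completion function, and the prompts are illustrative.
def answer_with_knowledge(llm, question, n_knowledge=3):
    # Stage 1: explicitly verbalize relevant commonsense knowledge.
    statements = [
        llm(f"Generate a fact relevant to the question: {question}\nFact:")
        for _ in range(n_knowledge)
    ]
    # Stage 2: condition the answer on the verbalized knowledge.
    context = "\n".join(statements)
    return llm(f"Knowledge:\n{context}\n\nQuestion: {question}\nAnswer:")
```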
no code implementations • 26 Sep 2023 • Jiacheng Liu, Andrew Cohen, Ramakanth Pasunuru, Yejin Choi, Hannaneh Hajishirzi, Asli Celikyilmaz
The key idea is not to throw out the value network, a byproduct of PPO training for evaluating partial output sequences, when decoding text out of the policy network.
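A hypothetical decoding step along these lines, where `policy_logprob` and `value_estimate` are assumed stand-ins for the trained policy and the retained PPO value head:

```python
# Hypothetical value-guided decoding step: instead of discarding the PPO
# value network after training, use it to rescore candidate continuations.
def value_guided_step(prefix, candidates, policy_logprob, value_estimate, alpha=0.5):
    """Pick the next token by mixing policy likelihood with the value
    network's estimate of the partial sequence's eventual reward."""
    def mixed_score(tok):
        seq = prefix + [tok]
        return (1 - alpha) * policy_logprob(prefix, tok) + alpha * value_estimate(seq)
    return max(candidates, key=mixed_score)
```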
1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan
It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs (sketched below).
Ranked #2 on Text-to-Image Generation on COCO
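One common form of contrastive decoding scores each candidate token by the gap between a stronger (e.g., conditioned) and a weaker (e.g., unconditioned) scoring pass; a minimal sketch under that assumption, not CM3Leon's exact method:

```python
# Minimal contrastive decoding step: rank next tokens by the gap between a
# stronger and a weaker model's log-probabilities, restricted to tokens the
# strong model finds plausible. `log_probs_*` are assumed stand-ins.
import numpy as np

def contrastive_step(log_probs_strong, log_probs_weak, tau=0.1):
    """Both inputs are log-prob vectors over the vocabulary for one step."""
    plausible = log_probs_strong >= np.log(tau) + log_probs_strong.max()
    scores = np.where(plausible, log_probs_strong - log_probs_weak, -np.inf)
    return int(scores.argmax())   # token with the largest strong/weak gap
```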
1 code implementation • 8 Aug 2023 • Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, Sean O'Brien, Ramakanth Pasunuru, Jane Dwivedi-Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz
As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs.
1 code implementation • 22 Dec 2022 • Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov
To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.
Ranked #26 on Natural Language Inference on RTE
1 code implementation • 19 Dec 2022 • Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov
Why do larger language models demonstrate more desirable behaviors?
no code implementations • 16 Dec 2022 • Swarnadeep Saha, Xinyan Velocity Yu, Mohit Bansal, Ramakanth Pasunuru, Asli Celikyilmaz
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
1 code implementation • 25 Nov 2022 • Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, Ves Stoyanov, Greg Durrett, Ramakanth Pasunuru
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective.
no code implementations • NAACL 2022 • Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva
Self-supervised pretraining has made few-shot learning possible for many NLP tasks.
2 code implementations • 20 Dec 2021 • Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li
Large-scale generative language models such as GPT-3 are competitive few-shot learners.
no code implementations • 20 Dec 2021 • Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov
This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.
2 code implementations • NAACL 2022 • Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan
Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.
1 code implementation • 3 Oct 2021 • Ori Shapira, Ramakanth Pasunuru, Ido Dagan, Yael Amsterdamer
Keyphrase extraction has been extensively researched within the single-document setting, with an abundance of methods, datasets and applications.
1 code implementation • EMNLP (ACL) 2021 • Eran Hirsch, Alon Eirew, Ori Shapira, Avi Caciularu, Arie Cattan, Ori Ernst, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Ido Dagan
We introduce iFacetSum, a web application for exploring topical document sets.
1 code implementation • NAACL 2021 • Ramakanth Pasunuru, Mengwen Liu, Mohit Bansal, Sujith Ravi, Markus Dreyer
We also show improvements in a transfer-only setup on the DUC-2004 dataset.
1 code implementation • NAACL 2021 • Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan
In this paper, we develop an end-to-end evaluation framework for interactive summarization, focusing on expansion-based interaction, which considers the accumulating information along a user session.
1 code implementation • 2 Mar 2021 • Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao
The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient large-scale, high-quality training datasets.
no code implementations • 2 Mar 2021 • Ramakanth Pasunuru, David Rosenberg, Gideon Mann, Mohit Bansal
Since these are sequence models, we must choose an ordering of the objects in the graphics programs for likelihood training.
no code implementations • EMNLP 2020 • Ramakanth Pasunuru, Han Guo, Mohit Bansal
Further, it is important to consider a dynamic combination and curriculum of metric rewards that flexibly change over time.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Ramakanth Pasunuru, Mohit Bansal
Architecture search is the automatic process of designing the model or cell structure that is optimal for the given dataset or task.
no code implementations • 17 Sep 2020 • Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan
Allowing users to interact with multi-document summarizers is a promising direction towards improving and customizing summary results.
1 code implementation • CoNLL (EMNLP) 2021 • Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal, Ido Dagan
Aligning sentences in a reference summary with their counterparts in source documents was shown as a useful auxiliary summarization task, notably for generating training data for salience detection.
no code implementations • 13 Jan 2020 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
Next, we develop a DistanceNet model which uses these distance measures, or a mixture of these distance measures, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation.
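A minimal sketch of the joint objective, using a simple mean-feature gap as an illustrative stand-in for the paper's menu of distance measures:

```python
# Hypothetical "task loss + domain distance" objective; the distance here is
# a simple mean-feature (MMD-style) gap, an illustrative stand-in only.
import torch

def total_loss(task_loss, src_feats, tgt_feats, lam=0.1):
    """Jointly minimize the task loss and a source/target feature distance."""
    distance = (src_feats.mean(dim=0) - tgt_feats.mean(dim=0)).pow(2).sum()
    return task_loss + lam * distance   # gradients also shrink the domain gap
```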
1 code implementation • ACL 2019 • Ramakanth Pasunuru, Mohit Bansal
Architecture search is the process of automatically learning the neural model or cell structure that best suits the given task.
1 code implementation • NAACL 2019 • Ori Shapira, David Gabay, Yang Gao, Hadar Ronen, Ramakanth Pasunuru, Mohit Bansal, Yael Amsterdamer, Ido Dagan
Conducting a manual evaluation is considered an essential part of summary evaluation methodology.
no code implementations • NAACL 2019 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
To address these issues, we present AutoSeM, a two-stage MTL pipeline, where the first stage automatically selects the most useful auxiliary tasks via a Beta-Bernoulli multi-armed bandit with Thompson Sampling, and the second stage learns the training mixing ratio of these selected auxiliary tasks via a Gaussian Process based Bayesian optimization framework.
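A minimal sketch of the first stage's Beta-Bernoulli bandit with Thompson Sampling, where the binary reward signal (e.g., "did validation performance improve after training on this task?") is an assumed stand-in:

```python
# Minimal Beta-Bernoulli Thompson-sampling bandit for auxiliary-task
# selection, in the spirit of AutoSeM's first stage; `reward_fn` returning
# 0/1 for "did this task help?" is an assumed stand-in.
import random

def select_tasks(tasks, reward_fn, rounds=200):
    alpha = {t: 1.0 for t in tasks}   # Beta prior: successes + 1
    beta = {t: 1.0 for t in tasks}    # Beta prior: failures + 1
    for _ in range(rounds):
        # Thompson sampling: draw a utility per arm, play the argmax.
        arm = max(tasks, key=lambda t: random.betavariate(alpha[t], beta[t]))
        if reward_fn(arm):
            alpha[arm] += 1.0
        else:
            beta[arm] += 1.0
    # Rank tasks by posterior mean usefulness.
    return sorted(tasks, key=lambda t: alpha[t] / (alpha[t] + beta[t]), reverse=True)
```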
1 code implementation • EMNLP 2018 • Ramakanth Pasunuru, Mohit Bansal
Current dialogue systems focus mostly on textual and speech context knowledge and are usually limited to two speakers.
no code implementations • COLING 2018 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
In this work, we first present a strong pointer-copy mechanism based sequence-to-sequence sentence simplification model (its copy step is sketched below), and then improve its entailment and paraphrasing capabilities via multi-task learning with related auxiliary tasks of entailment and paraphrase generation.
Ranked #2 on Text Simplification on Newsela
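The pointer-copy step referenced above is conventionally a pointer-generator-style mixture; a minimal sketch under that assumption:

```python
# Sketch of a pointer-generator-style copy step: a learned gate p_gen mixes
# the vocabulary distribution with attention-based copying from the source.
import numpy as np

def copy_step(p_vocab, attention, src_token_ids, p_gen):
    """p_vocab: length-V generation probs; attention: weights over source tokens."""
    p_final = p_gen * np.asarray(p_vocab, dtype=float)
    for attn, tok in zip(attention, src_token_ids):
        p_final[tok] += (1.0 - p_gen) * attn   # route attention mass to copying
    return p_final / p_final.sum()
```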
no code implementations • ACL 2018 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
An accurate abstractive summary of a document should contain all its salient information and should be logically entailed by the input document.
Ranked #33 on Text Summarization on GigaWord
no code implementations • NAACL 2018 • Ramakanth Pasunuru, Mohit Bansal
Abstractive text summarization is the task of compressing and rewriting a long document into a short summary while maintaining saliency, directed logical entailment, and non-redundancy.
Ranked #41 on Abstractive Text Summarization on CNN / Daily Mail
no code implementations • WS 2017 • Ramakanth Pasunuru, Han Guo, Mohit Bansal
Abstractive summarization, the task of rewriting and compressing a document into a short summary, has achieved considerable success with neural sequence-to-sequence models.
no code implementations • EMNLP 2017 • Ramakanth Pasunuru, Mohit Bansal
Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training.
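The standard remedy for this train/test mismatch, and the direction this line of work takes, is to optimize a sequence-level reward directly; a hedged self-critical policy-gradient sketch, with `sample_caption`, `greedy_caption`, and `reward` as assumed stand-ins:

```python
# Hedged sketch of a self-critical policy-gradient loss, the standard way
# to optimize a sequence-level metric instead of word-level cross-entropy;
# all helper functions here are assumed stand-ins, not the paper's code.
def self_critical_loss(model, video, reward, sample_caption, greedy_caption):
    sampled, log_prob = sample_caption(model, video)   # draw y^s, keep log p(y^s)
    baseline = greedy_caption(model, video)            # greedy decode as baseline
    advantage = reward(sampled) - reward(baseline)     # e.g., a CIDEr-style score
    return -advantage * log_prob   # reinforce samples that beat the baseline
```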
no code implementations • ACL 2017 • Ramakanth Pasunuru, Mohit Bansal
Video captioning, the task of describing the content of a video, has seen some promising improvements in recent years with sequence-to-sequence models, but accurately learning the temporal and logical dynamics involved in the task still remains a challenge, especially given the lack of sufficient annotated data.