Search Results for author: Ramakanth Pasunuru

Found 40 papers, 19 papers with code

The Effect of Pretraining on Extractive Summarization for Scientific Documents

no code implementations • NAACL (sdp) 2021 • Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, Preethi Jyothi

Large pretrained models have seen enormous success in extractive summarization tasks.

Extractive Summarization Word Embeddings

Paper
Add Code

Continual Few-Shot Learning for Text Classification

1 code implementation • EMNLP 2021 • Ramakanth Pasunuru, Veselin Stoyanov, Mohit Bansal

In this work, we propose a continual few-shot learning (CFL) task, in which a system is challenged with a difficult phenomenon and asked to learn to correct mistakes with only a few (10 to 15) training examples.

continual few-shot learning Few-Shot Learning +4

Paper
Code

Interactive Query-Assisted Summarization via Deep Reinforcement Learning

1 code implementation • NAACL 2022 • Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Ido Dagan, Yael Amsterdamer

Interactive summarization is a task that facilitates user-guided exploration of information within a document set.

Informativeness reinforcement-learning +1

Paper
Code

An Overview of Uncertainty Calibration for Text Classification and the Role of Distillation

no code implementations • ACL (RepL4NLP) 2021 • Han Guo, Ramakanth Pasunuru, Mohit Bansal

Many recalibration methods have been proposed in the literature for quantifying predictive uncertainty and calibrating model outputs, with varying degrees of complexity.

text-classification Text Classification

Paper
Add Code

Efficient Tool Use with Chain-of-Abstraction Reasoning

no code implementations • 30 Jan 2024 • Silin Gao, Jane Dwivedi-Yu, Ping Yu, Xiaoqing Ellen Tan, Ramakanth Pasunuru, Olga Golovneva, Koustuv Sinha, Asli Celikyilmaz, Antoine Bosselut, Tianlu Wang

LLM agents trained with our method also show more efficient tool use, with inference speed being on average ~1. 4x faster than baseline tool-augmented LLMs.

Math Mathematical Reasoning +1

Paper
Add Code

PathFinder: Guided Search over Multi-Step Reasoning Paths

no code implementations • 8 Dec 2023 • Olga Golovneva, Sean O'Brien, Ramakanth Pasunuru, Tianlu Wang, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

Using constrained reasoning, PathFinder integrates novel quality constraints, pruning, and exploration methods to enhance the efficiency and the quality of generation.

Pathfinder

Paper
Add Code

Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading

no code implementations • 8 Oct 2023 • Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz

Large language models (LLMs) have advanced in large strides due to the effectiveness of the self-attention mechanism that processes and compares all tokens at once.

Question Answering Retrieval

Paper
Add Code

Crystal: Introspective Reasoners Reinforced with Self-Feedback

1 code implementation • 7 Oct 2023 • Jiacheng Liu, Ramakanth Pasunuru, Hannaneh Hajishirzi, Yejin Choi, Asli Celikyilmaz

Extensive work has shown that the performance and interpretability of commonsense reasoning can be improved via knowledge-augmented reasoning methods, where the knowledge that underpins the reasoning process is explicitly verbalized and utilized.

Paper
Code

Don't throw away your value model! Generating more preferable text with Value-Guided Monte-Carlo Tree Search decoding

no code implementations • 26 Sep 2023 • Jiacheng Liu, Andrew Cohen, Ramakanth Pasunuru, Yejin Choi, Hannaneh Hajishirzi, Asli Celikyilmaz

The key idea is not to throw out the value network, a byproduct of PPO training for evaluating partial output sequences, when decoding text out of the policy network.

Text Generation

Paper
Add Code

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

1 code implementation • 5 Sep 2023 • Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan

It is also a general-purpose model that can do both text-to-image and image-to-text generation, allowing us to introduce self-contained contrastive decoding methods that produce high-quality outputs.

Ranked #2 on Text-to-Image Generation on MS COCO

Language Modelling Retrieval +2

318

Paper
Code

Shepherd: A Critic for Language Model Generation

1 code implementation • 8 Aug 2023 • Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, Sean O'Brien, Ramakanth Pasunuru, Jane Dwivedi-Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs.

Language Modelling

196

Paper
Code

OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization

1 code implementation • 22 Dec 2022 • Srinivasan Iyer, Xi Victoria Lin, Ramakanth Pasunuru, Todor Mihaylov, Daniel Simig, Ping Yu, Kurt Shuster, Tianlu Wang, Qing Liu, Punit Singh Koura, Xian Li, Brian O'Horo, Gabriel Pereyra, Jeff Wang, Christopher Dewan, Asli Celikyilmaz, Luke Zettlemoyer, Ves Stoyanov

To this end, we create OPT-IML Bench: a large benchmark for Instruction Meta-Learning (IML) of 2000 NLP tasks consolidated into task categories from 8 existing benchmarks, and prepare an evaluation framework to measure three types of model generalizations: to tasks from fully held-out categories, to held-out tasks from seen categories, and to held-out instances from seen tasks.

Ranked #26 on Natural Language Inference on RTE

Language Modelling Meta-Learning +2

Paper
Code

Training Trajectories of Language Models Across Scales

1 code implementation • 19 Dec 2022 • Mengzhou Xia, Mikel Artetxe, Chunting Zhou, Xi Victoria Lin, Ramakanth Pasunuru, Danqi Chen, Luke Zettlemoyer, Ves Stoyanov

Why do larger language models demonstrate more desirable behaviors?

In-Context Learning Multiple-choice

Paper
Code

MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation

no code implementations • 16 Dec 2022 • Swarnadeep Saha, Xinyan Velocity Yu, Mohit Bansal, Ramakanth Pasunuru, Asli Celikyilmaz

We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.

Data-to-Text Generation Hallucination +1

Paper
Add Code

Complementary Explanations for Effective In-Context Learning

1 code implementation • 25 Nov 2022 • Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, Ves Stoyanov, Greg Durrett, Ramakanth Pasunuru

Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts, but there has been limited understanding of exactly how these explanations function or why they are effective.

In-Context Learning

Paper
Code

Improving In-Context Few-Shot Learning via Self-Supervised Training

no code implementations • NAACL 2022 • Mingda Chen, Jingfei Du, Ramakanth Pasunuru, Todor Mihaylov, Srini Iyer, Veselin Stoyanov, Zornitsa Kozareva

Self-supervised pretraining has made few-shot learning possible for many NLP tasks.

Few-Shot Learning

Paper
Add Code

Efficient Large Scale Language Modeling with Mixtures of Experts

no code implementations • 20 Dec 2021 • Mikel Artetxe, Shruti Bhosale, Naman Goyal, Todor Mihaylov, Myle Ott, Sam Shleifer, Xi Victoria Lin, Jingfei Du, Srinivasan Iyer, Ramakanth Pasunuru, Giri Anantharaman, Xian Li, Shuohui Chen, Halil Akin, Mandeep Baines, Louis Martin, Xing Zhou, Punit Singh Koura, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Mona Diab, Zornitsa Kozareva, Ves Stoyanov

This paper presents a detailed empirical study of how autoregressive MoE language models scale in comparison with dense models in a wide range of settings: in- and out-of-domain language modeling, zero- and few-shot priming, and full-shot fine-tuning.

Language Modelling

Paper
Add Code

Few-shot Learning with Multilingual Language Models

2 code implementations • 20 Dec 2021 • Xi Victoria Lin, Todor Mihaylov, Mikel Artetxe, Tianlu Wang, Shuohui Chen, Daniel Simig, Myle Ott, Naman Goyal, Shruti Bhosale, Jingfei Du, Ramakanth Pasunuru, Sam Shleifer, Punit Singh Koura, Vishrav Chaudhary, Brian O'Horo, Jeff Wang, Luke Zettlemoyer, Zornitsa Kozareva, Mona Diab, Veselin Stoyanov, Xian Li

Large-scale generative language models such as GPT-3 are competitive few-shot learners.

Cross-Lingual Transfer Few-Shot Learning +5

29,187

Paper
Code

Proposition-Level Clustering for Multi-Document Summarization

2 code implementations • NAACL 2022 • Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

Clustering Document Summarization +3

Paper
Code

Multi-Document Keyphrase Extraction: Dataset, Baselines and Review

1 code implementation • 3 Oct 2021 • Ori Shapira, Ramakanth Pasunuru, Ido Dagan, Yael Amsterdamer

Keyphrase extraction has been extensively researched within the single-document setting, with an abundance of methods, datasets and applications.

Keyphrase Extraction

Paper
Code

iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration

1 code implementation • EMNLP (ACL) 2021 • Eran Hirsch, Alon Eirew, Ori Shapira, Avi Caciularu, Arie Cattan, Ori Ernst, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Ido Dagan

We introduce iFacetSum, a web application for exploring topical document sets.

Paper
Code

Efficiently Summarizing Text and Graph Encodings of Multi-Document Clusters

1 code implementation • NAACL 2021 • Ramakanth Pasunuru, Mengwen Liu, Mohit Bansal, Sujith Ravi, Markus Dreyer

We also show improvements in a transfer-only setup on the DUC-2004 dataset.

Document Summarization Multi-Document Summarization

Paper
Code

Extending Multi-Document Summarization Evaluation to the Interactive Setting

1 code implementation • NAACL 2021 • Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan

In this paper, we develop an end-to-end evaluation framework for interactive summarization, focusing on expansion-based interaction, which considers the accumulating information along a user session.

Document Summarization Multi-Document Summarization

Paper
Code

Data Augmentation for Abstractive Query-Focused Multi-Document Summarization

1 code implementation • 2 Mar 2021 • Ramakanth Pasunuru, Asli Celikyilmaz, Michel Galley, Chenyan Xiong, Yizhe Zhang, Mohit Bansal, Jianfeng Gao

The progress in Query-focused Multi-Document Summarization (QMDS) has been limited by the lack of sufficient largescale high-quality training datasets.

Data Augmentation Document Summarization +1

Paper
Code

Dual Reinforcement-Based Specification Generation for Image De-Rendering

no code implementations • 2 Mar 2021 • Ramakanth Pasunuru, David Rosenberg, Gideon Mann, Mohit Bansal

Since these are sequence models, we must choose an ordering of the objects in the graphics programs for likelihood training.

Inductive Bias

Paper
Add Code

DORB: Dynamically Optimizing Multiple Rewards with Bandits

no code implementations • EMNLP 2020 • Ramakanth Pasunuru, Han Guo, Mohit Bansal

Further, it is important to consider using a dynamic combination and curriculum of metric rewards that flexibly changes over time.

Data-to-Text Generation Question Generation +1

Paper
Add Code

FENAS: Flexible and Expressive Neural Architecture Search

no code implementations • Findings of the Association for Computational Linguistics 2020 • Ramakanth Pasunuru, Mohit Bansal

Architecture search is the automatic process of designing the model or cell structure that is optimal for the given dataset or task.

Image Classification Language Modelling +3

Paper
Add Code

Evaluating Interactive Summarization: an Expansion-Based Framework

no code implementations • 17 Sep 2020 • Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan

Allowing users to interact with multi-document summarizers is a promising direction towards improving and customizing summary results.

Paper
Add Code

Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline

1 code implementation • CoNLL (EMNLP) 2021 • Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal, Ido Dagan

Aligning sentences in a reference summary with their counterparts in source documents was shown as a useful auxiliary summarization task, notably for generating training data for salience detection.

Clustering Document Summarization +1

Paper
Code

Multi-Source Domain Adaptation for Text Classification via DistanceNet-Bandits

no code implementations • 13 Jan 2020 • Han Guo, Ramakanth Pasunuru, Mohit Bansal

Next, we develop a DistanceNet model which uses these distance measures, or a mixture of these distance measures, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation.

General Classification Sentiment Analysis +3

Paper
Add Code

Continual and Multi-Task Architecture Search

1 code implementation • ACL 2019 • Ramakanth Pasunuru, Mohit Bansal

Architecture search is the process of automatically learning the neural model or cell structure that best suits the given task.

Continual Learning General Classification +8

Paper
Code

Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation

1 code implementation • NAACL 2019 • Ori Shapira, David Gabay, Yang Gao, Hadar Ronen, Ramakanth Pasunuru, Mohit Bansal, Yael Amsterdamer, Ido Dagan

Conducting a manual evaluation is considered an essential part of summary evaluation methodology.

Paper
Code

AutoSeM: Automatic Task Selection and Mixing in Multi-Task Learning

no code implementations • NAACL 2019 • Han Guo, Ramakanth Pasunuru, Mohit Bansal

To address these issues, we present AutoSeM, a two-stage MTL pipeline, where the first stage automatically selects the most useful auxiliary tasks via a Beta-Bernoulli multi-armed bandit with Thompson Sampling, and the second stage learns the training mixing ratio of these selected auxiliary tasks via a Gaussian Process based Bayesian optimization framework.

Bayesian Optimization Inductive Bias +2

Paper
Add Code

Game-Based Video-Context Dialogue

1 code implementation • EMNLP 2018 • Ramakanth Pasunuru, Mohit Bansal

Current dialogue systems focus more on textual and speech context knowledge and are usually based on two speakers.

Retrieval

Paper
Code

Dynamic Multi-Level Multi-Task Learning for Sentence Simplification

no code implementations • COLING 2018 • Han Guo, Ramakanth Pasunuru, Mohit Bansal

In this work, we first present a strong pointer-copy mechanism based sequence-to-sequence sentence simplification model, and then improve its entailment and paraphrasing capabilities via multi-task learning with related auxiliary tasks of entailment and paraphrase generation.

Ranked #2 on Text Simplification on Newsela

Multi-Task Learning Paraphrase Generation +3

Paper
Add Code

Soft Layer-Specific Multi-Task Summarization with Entailment and Question Generation

no code implementations • ACL 2018 • Han Guo, Ramakanth Pasunuru, Mohit Bansal

An accurate abstractive summary of a document should contain all its salient information and should be logically entailed by the input document.

Ranked #33 on Text Summarization on GigaWord

Abstractive Text Summarization Multi-Task Learning +2

Paper
Add Code

Multi-Reward Reinforced Summarization with Saliency and Entailment

no code implementations • NAACL 2018 • Ramakanth Pasunuru, Mohit Bansal

Abstractive text summarization is the task of compressing and rewriting a long document into a short summary while maintaining saliency, directed logical entailment, and non-redundancy.

Ranked #41 on Abstractive Text Summarization on CNN / Daily Mail

Abstractive Text Summarization

Paper
Add Code

Towards Improving Abstractive Summarization via Entailment Generation

no code implementations • WS 2017 • Ramakanth Pasunuru, Han Guo, Mohit Bansal

Abstractive summarization, the task of rewriting and compressing a document into a short summary, has achieved considerable success with neural sequence-to-sequence models.

Abstractive Text Summarization Machine Translation +2

Paper
Add Code

Reinforced Video Captioning with Entailment Rewards

no code implementations • EMNLP 2017 • Ramakanth Pasunuru, Mohit Bansal

Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training.

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Add Code

Multi-Task Video Captioning with Video and Entailment Generation

no code implementations • ACL 2017 • Ramakanth Pasunuru, Mohit Bansal

Video captioning, the task of describing the content of a video, has seen some promising improvements in recent years with sequence-to-sequence models, but accurately learning the temporal and logical dynamics involved in the task still remains a challenge, especially given the lack of sufficient annotated data.

Multi-Task Learning Video Captioning +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.