Search Results for author: Tanya Goyal

Found 20 papers, 15 papers with code

Contemporary NLP Modeling in Six Comprehensive Programming Assignments

no code implementations • NAACL (TeachingNLP) 2021 • Greg Durrett, Jifan Chen, Shrey Desai, Tanya Goyal, Lucas Kabela, Yasumasa Onoe, Jiacheng Xu

We present a series of programming assignments, adaptable to a range of experience levels from advanced undergraduate to PhD, to teach students design and implementation of modern NLP systems.

FABLES: Evaluating faithfulness and content selection in book-length summarization

3 code implementations • 1 Apr 2024 • Yekyung Kim, Yapei Chang, Marzena Karpinska, Aparna Garimella, Varun Manjunatha, Kyle Lo, Tanya Goyal, Mohit Iyyer

While LLM-based auto-raters have proven reliable for factuality and coherence in other settings, we implement several LLM raters of faithfulness and find that none correlates strongly with human annotations, especially with regard to detecting unfaithful claims.

Long-Context Understanding
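For intuition, here is a minimal sketch of the claim-level evaluation loop the abstract describes: score each claim against the source, then correlate the scores with human labels. The rater below is a toy lexical-overlap stand-in of my own, not one of the paper's LLM raters; it only assumes scipy.

```python
# Toy claim-level faithfulness evaluation: rate each claim against the source
# text, then correlate with human annotations. The overlap rater is a
# placeholder, NOT one of the paper's LLM raters.
from scipy.stats import pearsonr

def overlap_rater(claim: str, source: str) -> float:
    """Fraction of claim tokens that appear anywhere in the source."""
    claim_toks = set(claim.lower().split())
    source_toks = set(source.lower().split())
    return len(claim_toks & source_toks) / max(len(claim_toks), 1)

def correlate_with_humans(claims, source, human_labels):
    auto_scores = [overlap_rater(c, source) for c in claims]
    r, p = pearsonr(auto_scores, human_labels)  # weak r mirrors the finding above
    return r, p
```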

Evaluating Large Language Models at Evaluating Instruction Following

1 code implementation • 11 Oct 2023 • Zhiyuan Zeng, Jiatong Yu, Tianyu Gao, Yu Meng, Tanya Goyal, Danqi Chen

As research in large language models (LLMs) continues to accelerate, LLM-based evaluation has emerged as a scalable and cost-effective alternative to human evaluations for comparing the ever-increasing list of models.

Instruction Following
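A minimal pairwise LLM-as-judge sketch in the spirit of this setup follows; the prompt wording and model name are assumptions rather than the paper's protocol, and it presumes the openai>=1.0 Python client with an API key in the environment.

```python
# Pairwise LLM-as-judge sketch: ask a model which of two outputs better
# follows an instruction. Prompt wording and model choice are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Instruction: {instruction}\n\n"
    "Output (a): {a}\n\n"
    "Output (b): {b}\n\n"
    'Which output follows the instruction better? Answer "a" or "b" only.'
)

def judge(instruction: str, a: str, b: str, model: str = "gpt-4o") -> str:
    resp = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{"role": "user", "content": PROMPT.format(
            instruction=instruction, a=a, b=b)}],
    )
    return resp.choices[0].message.content.strip().lower()
```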

A Long Way to Go: Investigating Length Correlations in RLHF

1 code implementation • 5 Oct 2023 • Prasann Singhal, Tanya Goyal, Jiacheng Xu, Greg Durrett

Furthermore, we find that even running RLHF with a reward based solely on length can reproduce most of the downstream improvements over the initial policy model, showing that reward models in these settings have a long way to go.

Question Answering
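The length-only reward ablation mentioned above fits in a few lines; the whitespace tokenization and cap below are assumptions, not the paper's exact setup.

```python
# Sketch of a length-only reward: score a response purely by its length,
# ignoring content entirely. Tokenization and the cap are assumptions.
def length_reward(response: str, cap: int = 256) -> float:
    """Monotone in token count up to `cap`, then flat; returns a value in [0, 1]."""
    return min(len(response.split()), cap) / cap
```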

BooookScore: A systematic exploration of book-length summarization in the era of LLMs

2 code implementations • 1 Oct 2023 • Yapei Chang, Kyle Lo, Tanya Goyal, Mohit Iyyer

We find that closed-source LLMs such as GPT-4 and Claude 2 produce summaries with higher BooookScore than those generated by open-source models.
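For reference, a rough sketch of hierarchical merging, one of the book-length summarization workflows the paper studies; the chunking scheme and the placeholder summarize() call are assumptions.

```python
# Rough sketch of hierarchical merging for book-length summarization:
# summarize fixed-size chunks, then repeatedly merge adjacent summary pairs.
def summarize(text: str) -> str:
    # toy stand-in: truncation; replace with an actual LLM call
    return text[:300]

def hierarchical_summary(book: str, chunk_chars: int = 8000) -> str:
    chunks = [book[i:i + chunk_chars] for i in range(0, len(book), chunk_chars)]
    level = [summarize(c) for c in chunks]
    while len(level) > 1:  # merge until a single summary remains
        level = [summarize("\n\n".join(level[i:i + 2]))
                 for i in range(0, len(level), 2)]
    return level[0]
```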

WiCE: Real-World Entailment for Claims in Wikipedia

1 code implementation • 2 Mar 2023 • Ryo Kamoi, Tanya Goyal, Juan Diego Rodriguez, Greg Durrett

Textual entailment models are increasingly applied in settings like fact-checking, presupposition verification in question answering, or summary evaluation.

Fact Checking • Natural Language Inference • +3
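A simple baseline for this setting runs an off-the-shelf NLI model over (evidence, claim) pairs; the model choice below is an assumption, not the paper's method.

```python
# Baseline sketch: does cited evidence entail a Wikipedia claim? Uses an
# off-the-shelf NLI model (model choice is an assumption, not the paper's).
# Requires: pip install transformers torch
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

def entails(evidence: str, claim: str) -> bool:
    pred = nli({"text": evidence, "text_pair": claim})[0]
    return pred["label"] == "ENTAILMENT"
```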

News Summarization and Evaluation in the Era of GPT-3

1 code implementation • 26 Sep 2022 • Tanya Goyal, Junyi Jessy Li, Greg Durrett

Finally, we evaluate models on a setting beyond generic summarization, specifically keyword-based summarization, and show how dominant fine-tuning approaches compare to prompting.

News Summarization • Text Summarization
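The keyword-based summarization setting contrasted with fine-tuning above can be prompted roughly as follows; the prompt template is an assumption, not the paper's.

```python
# Sketch of a keyword-based summarization prompt (template is an assumption).
def keyword_summary_prompt(article: str, keyword: str) -> str:
    return (
        f"Article: {article}\n\n"
        f"Write a short summary of the article focused on: {keyword}\n"
        "Summary:"
    )
```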

Understanding Factual Errors in Summarization: Errors, Summarizers, Datasets, Error Detectors

1 code implementation • 25 May 2022 • Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett

We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models.

Abstractive Text Summarization
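The stratified comparison the abstract describes amounts to scoring each metric separately within each summarizer-type stratum; the sketch below uses balanced accuracy, with field names as assumptions.

```python
# Sketch of stratified metric evaluation: score a factuality metric's binary
# predictions within each summarizer-type stratum. Field names are assumptions.
# Requires scikit-learn.
from collections import defaultdict
from sklearn.metrics import balanced_accuracy_score

def per_stratum_scores(examples):
    """examples: dicts with 'stratum', 'human_label' (0/1), 'metric_pred' (0/1)."""
    by_stratum = defaultdict(lambda: ([], []))
    for ex in examples:
        labels, preds = by_stratum[ex["stratum"]]
        labels.append(ex["human_label"])
        preds.append(ex["metric_pred"])
    return {s: balanced_accuracy_score(y, p) for s, (y, p) in by_stratum.items()}
```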

SNaC: Coherence Error Detection for Narrative Summarization

1 code implementation • 19 May 2022 • Tanya Goyal, Junyi Jessy Li, Greg Durrett

In this work, we introduce SNaC, a narrative coherence evaluation framework rooted in fine-grained annotations for long summaries.

Benchmarking • Coherence Evaluation • +1

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

2 code implementations • 6 Dec 2021 • Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.

Data Augmentation
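A toy perturbation in the spirit of NL-Augmenter's transformations; the real framework packages these as operation classes, so this standalone function is only an illustrative sketch.

```python
# Toy robustness perturbation in the spirit of NL-Augmenter: swap one random
# pair of adjacent words. The framework wraps such logic in operation classes.
import random

def swap_adjacent_words(sentence: str, seed: int = 0) -> str:
    rng = random.Random(seed)
    words = sentence.split()
    if len(words) < 2:
        return sentence
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)
```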

Training Dynamics for Text Summarization Models

no code implementations • Findings (ACL) 2022 • Tanya Goyal, Jiacheng Xu, Junyi Jessy Li, Greg Durrett

Across different datasets (CNN/DM, XSum, MediaSum) and summary properties, such as abstractiveness and hallucination, we study what the model learns at different stages of its fine-tuning process.

Hallucination • News Summarization • +1
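One property tracked across checkpoints, abstractiveness, can be probed with a novel-n-gram measure like the sketch below; the bigram definition is an assumption, not necessarily the paper's exact metric.

```python
# Sketch of an abstractiveness probe: fraction of summary bigrams that do not
# appear in the source (exact metric definition is an assumption).
def novel_bigram_ratio(source: str, summary: str) -> float:
    def bigrams(text: str) -> set:
        toks = text.lower().split()
        return set(zip(toks, toks[1:]))
    summ = bigrams(summary)
    return len(summ - bigrams(source)) / len(summ) if summ else 0.0
```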

HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

1 code implementation • 8 Oct 2021 • Tanya Goyal, Nazneen Fatema Rajani, Wenhao Liu, Wojciech Kryściński

Summarization systems make numerous "decisions" about summary properties during inference, e.g., degree of copying, specificity, and length of outputs.

Abstractive Text Summarization • Specificity
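At inference time, the multi-decoder idea reduces to gating between decoders' next-token distributions; the two-decoder, scalar-gate form below is a simplified sketch, not HydraSum's exact learned parameterization.

```python
# Simplified sketch of multi-decoder mixing: interpolate next-token
# distributions from two decoder heads with a scalar gate (HydraSum's gating
# is learned; the scalar form here is an assumption). Requires torch.
import torch

def mixed_next_token_probs(logits_a: torch.Tensor,
                           logits_b: torch.Tensor,
                           gate: float) -> torch.Tensor:
    """gate=1.0 uses decoder A only; gate=0.0 uses decoder B only."""
    p_a = torch.softmax(logits_a, dim=-1)
    p_b = torch.softmax(logits_b, dim=-1)
    return gate * p_a + (1.0 - gate) * p_b
```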

HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models

no code implementations • 29 Sep 2021 • Tanya Goyal, Nazneen Rajani, Wenhao Liu, Wojciech Maciej Kryscinski

Existing abstractive summarization models lack explicit control mechanisms that would allow users to influence the stylistic features of the model outputs.

Abstractive Text Summarization • Specificity

Annotating and Modeling Fine-grained Factuality in Summarization

2 code implementations • NAACL 2021 • Tanya Goyal, Greg Durrett

Recent pre-trained abstractive summarization systems have started to achieve credible performance, but a major barrier to their use in practice is their propensity to output summaries that are not faithful to the input and that contain factual errors.

Abstractive Text Summarization • Sentence

Evaluating Factuality in Generation with Dependency-level Entailment

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Tanya Goyal, Greg Durrett

Experiments show that our dependency arc entailment model trained on this data can identify factual inconsistencies in paraphrasing and summarization better than sentence-level methods or those based on question generation, while additionally localizing the erroneous parts of the generation.

Natural Language Inference • Question Generation • +3
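The dependency-arc decomposition underlying the model can be illustrated with spaCy; the arc-level entailment classifier itself is omitted, and the parser choice is an assumption.

```python
# Illustration of the dependency-arc decomposition: split a generated sentence
# into (head, relation, child) arcs, the units an arc-level entailment model
# scores. Requires spaCy and: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def dependency_arcs(sentence: str):
    doc = nlp(sentence)
    return [(tok.head.text, tok.dep_, tok.text)
            for tok in doc if tok.dep_ != "ROOT"]
```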

Neural Syntactic Preordering for Controlled Paraphrase Generation

2 code implementations • ACL 2020 • Tanya Goyal, Greg Durrett

Paraphrasing natural language sentences is a multifaceted process: it might involve replacing individual words or short phrases, local rearrangement of content, or high-level restructuring like topicalization or passivization.

Machine Translation • Paraphrase Generation • +2

Embedding time expressions for deep temporal ordering models

3 code implementations • ACL 2019 • Tanya Goyal, Greg Durrett

Data-driven models have demonstrated state-of-the-art performance in inferring the temporal ordering of events in text.
