1 code implementation • 31 Oct 2024 • David Wan, Jesse Vig, Mohit Bansal, Shafiq Joty
Large Language Models (LLMs) often exhibit positional bias in long-context settings, under-attending to information in the middle of inputs.
no code implementations • 27 Sep 2023 • Philippe Laban, Jesse Vig, Marti A. Hearst, Caiming Xiong, Chien-Sheng Wu
Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing.
1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context.
1 code implementation • 1 Jun 2023 • Fan Yin, Jesse Vig, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Jason Wu
Large language models (LLMs) have shown impressive performance in following natural language instructions to solve unseen tasks.
1 code implementation • 30 May 2023 • Philippe Laban, Jesse Vig, Wojciech Kryściński, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
Text simplification research has mostly focused on sentence-level simplification, even though many desirable edits - such as adding relevant background information or reordering content - may require document-level context.
1 code implementation • 11 Nov 2022 • Alexander R. Fabbri, Prafulla Kumar Choubey, Jesse Vig, Chien-Sheng Wu, Caiming Xiong
We propose to use sentence-compression data to train the post-editing model to take a summary with extrinsic entity errors marked with special tokens and output a compressed, well-formed summary with those errors removed.
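A minimal sketch of this setup, not the paper's released code: hypothetical marker tokens (`<err>`, `</err>`) wrap the extrinsic entity in the source summary, and a standard seq2seq model (BART is assumed here purely for illustration) is fine-tuned to emit the compressed summary with the marked span removed.

```python
# Hypothetical sketch: marking extrinsic entities with special tokens and
# fine-tuning a seq2seq model to remove them. The marker tokens and base
# model are illustrative assumptions, not the paper's exact implementation.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "facebook/bart-base"        # assumed base model
ERR_START, ERR_END = "<err>", "</err>"   # hypothetical marker tokens

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.add_special_tokens({"additional_special_tokens": [ERR_START, ERR_END]})
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
model.resize_token_embeddings(len(tokenizer))

# Source: summary with an extrinsic entity wrapped in marker tokens.
# Target: compressed, well-formed summary with the marked span removed.
source = f"The storm hit {ERR_START}Florida{ERR_END} on Tuesday, officials said."
target = "The storm hit on Tuesday, officials said."

batch = tokenizer(source, text_target=target, return_tensors="pt")
loss = model(**batch).loss  # standard cross-entropy training objective
loss.backward()
```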
no code implementations • 5 May 2022 • Anamaria Crisan, Margaret Drouhard, Jesse Vig, Nazneen Rajani
Deep learning models for natural language processing (NLP) are increasingly adopted and deployed by analysts without formal training in NLP or machine learning (ML).
1 code implementation • 8 Mar 2022 • Jun Yuan, Jesse Vig, Nazneen Rajani
Error analysis in NLP models is essential to successful model development and deployment.
1 code implementation • Findings (NAACL) 2022 • Jesse Vig, Alexander R. Fabbri, Wojciech Kryściński, Chien-Sheng Wu, Wenhao Liu
Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.
no code implementations • 14 Oct 2021 • Prafulla Kumar Choubey, Alexander R. Fabbri, Jesse Vig, Chien-Sheng Wu, Wenhao Liu, Nazneen Fatema Rajani
Then, we fine-tune a base summarization model, which is trained on all training samples, on the clean (noisy) subset to obtain an expert (anti-expert) model.
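A hedged sketch of this step, assuming a generic factuality filter and a standard fine-tuning loop (both placeholders, not the paper's configuration): the training set is partitioned into clean and noisy subsets, and two copies of the base model are fine-tuned separately on each.

```python
# Illustrative sketch of the expert / anti-expert fine-tuning step.
# The factuality check and fine_tune() loop are placeholders for a
# standard summarization training setup, not the paper's exact code.
import copy

def split_by_factuality(samples, is_clean):
    """Partition training samples into clean and noisy subsets."""
    clean = [s for s in samples if is_clean(s)]
    noisy = [s for s in samples if not is_clean(s)]
    return clean, noisy

def fine_tune(model, subset):
    """Placeholder for a standard seq2seq fine-tuning loop."""
    ...

def build_experts(base_model, train_samples, is_clean):
    clean, noisy = split_by_factuality(train_samples, is_clean)
    expert = copy.deepcopy(base_model)       # fine-tuned on clean samples
    anti_expert = copy.deepcopy(base_model)  # fine-tuned on noisy samples
    fine_tune(expert, clean)
    fine_tune(anti_expert, noisy)
    return expert, anti_expert
```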
no code implementations • NAACL 2021 • Karan Goel, Laurel Orr, Nazneen Fatema Rajani, Jesse Vig, Christopher Ré
If not, how easily can such a system be repurposed for their use case?
2 code implementations • ACL 2021 • Jesse Vig, Wojciech Kryściński, Karan Goel, Nazneen Fatema Rajani
Novel neural architectures, training strategies, and the availability of large-scale corpora have been the driving force behind recent progress in abstractive text summarization.
2 code implementations • NAACL 2021 • Karan Goel, Nazneen Rajani, Jesse Vig, Samson Tan, Jason Wu, Stephan Zheng, Caiming Xiong, Mohit Bansal, Christopher Ré
Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems.
no code implementations • NeurIPS 2020 • Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, Stuart Shieber
As a case study, we apply this methodology to analyzing gender bias in pre-trained Transformer language models.
no code implementations • 1 Dec 2020 • Pascal Sturmfels, Jesse Vig, Ali Madani, Nazneen Fatema Rajani
Recent deep-learning approaches to protein prediction have shown that pre-training on unlabeled data can yield useful representations for downstream tasks.
2 code implementations • ICLR 2021 • Jesse Vig, Ali Madani, Lav R. Varshney, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani
Transformer architectures have proven to learn useful representations for protein classification and generation tasks.
1 code implementation • 26 Apr 2020 • Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Simas Sakenis, Jason Huang, Yaron Singer, Stuart Shieber
Common methods for interpreting neural models in natural language processing typically examine either their structure or their behavior, but not both.
3 code implementations • ACL 2019 • Jesse Vig
The Transformer is a sequence model that forgoes traditional recurrent architectures in favor of a fully attention-based approach.
no code implementations • WS 2019 • Jesse Vig, Yonatan Belinkov
The Transformer is a fully attention-based alternative to recurrent networks that has achieved state-of-the-art results across a range of NLP tasks.
no code implementations • 4 Apr 2019 • Jesse Vig
We present an open-source tool for visualizing multi-head self-attention in Transformer-based language representation models.
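For context, a minimal sketch of extracting the per-layer, per-head self-attention weights that such a tool visualizes, using the Hugging Face transformers API; the model checkpoint and input sentence are illustrative, and this is not the tool's own code.

```python
# Minimal sketch: extracting the multi-head self-attention weights that an
# attention-visualization tool would render. Model and input are illustrative.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
outputs = model(**inputs)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
# outputs.attentions: tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len)
for layer_idx, layer_attn in enumerate(outputs.attentions):
    print(f"layer {layer_idx}: {tuple(layer_attn.shape)}")
```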