1 code implementation • 1 Jul 2024 • Philippe Laban, Alexander R. Fabbri, Caiming Xiong, Chien-Sheng Wu
The "Summary of a Haystack" (SummHay) task requires a system to process the Haystack and generate, given a query, a summary that identifies the relevant insights and precisely cites the source documents.
no code implementations • 24 Apr 2024 • Divyansh Agarwal, Alexander R. Fabbri, Ben Risher, Philippe Laban, Shafiq Joty, Chien-Sheng Wu
We measure the mitigation effect of 7 black-box defense strategies, along with finetuning an open-source model to defend against leakage attempts.
2 code implementations • 16 Apr 2024 • Liyan Tang, Philippe Laban, Greg Durrett
We release LLM-AggreFact, code for data synthesis, and models.
no code implementations • 14 Nov 2023 • Philippe Laban, Lidiya Murakhovs'ka, Caiming Xiong, Chien-Sheng Wu
The interactive nature of Large Language Models (LLMs) theoretically allows models to refine and improve their answers, yet systematic analysis of the multi-turn behavior of LLMs remains limited.
1 code implementation • 26 Oct 2023 • Lidiya Murakhovs'ka, Philippe Laban, Tian Xie, Caiming Xiong, Chien-Sheng Wu
Making big purchases requires consumers to research or consult a salesperson to gain domain expertise.
no code implementations • 5 Oct 2023 • Yao Dou, Philippe Laban, Claire Gardent, Wei Xu
In this tutorial, we focus on text-to-text generation, a class of natural language generation (NLG) tasks that takes a piece of text as input and then generates a revision that is improved according to some specific criteria (e.g., readability or linguistic styles), while largely retaining the original meaning and the length of the text.
no code implementations • 27 Sep 2023 • Philippe Laban, Jesse Vig, Marti A. Hearst, Caiming Xiong, Chien-Sheng Wu
Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing.
no code implementations • 25 Sep 2023 • Tuhin Chakrabarty, Philippe Laban, Divyansh Agarwal, Smaranda Muresan, Chien-Sheng Wu
Inspired by the Torrance Test of Creative Thinking (TTCT), which measures creativity as a process, we use the Consensual Assessment Technique [3] and propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product.
1 code implementation • 17 Sep 2023 • Kung-Hsiang Huang, Philippe Laban, Alexander R. Fabbri, Prafulla Kumar Choubey, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
In this paper, we propose a new task of summarizing diverse information encountered in multiple news articles encompassing the same event.
1 code implementation • 7 Sep 2023 • Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryściński, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Joty, Caiming Xiong
Most open-source LLMs, on the other hand, are limited in their ability to support longer sequence lengths, which is a key requirement for many tasks that require inference over an input context.
no code implementations • 27 Jun 2023 • Xiang 'Anthony' Chen, Jeff Burke, Ruofei Du, Matthew K. Hong, Jennifer Jacobs, Philippe Laban, Dingzeyu Li, Nanyun Peng, Karl D. D. Willis, Chien-Sheng Wu, Bolei Zhou
Through iterative, cross-disciplinary discussions, we define and propose next steps for Human-centered Generative AI (HGAI).
1 code implementation • 1 Jun 2023 • Fan Yin, Jesse Vig, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Jason Wu
Large language models (LLMs) have shown impressive performance in following natural language instructions to solve unseen tasks.
1 code implementation • 30 May 2023 • Philippe Laban, Jesse Vig, Wojciech Kryscinski, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
Text simplification research has mostly focused on sentence-level simplification, even though many desirable edits, such as adding relevant background information or reordering content, may require document-level context.
1 code implementation • 23 May 2023 • Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu
To address this, we propose a new protocol for inconsistency detection benchmark creation and implement it in a 10-domain benchmark called SummEdits.
no code implementations • 17 Feb 2023 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong
In a second usability study, we developed and implemented a reading exercise with 95 novice news readers to measure exposure to coverage diversity.
1 code implementation • 9 Nov 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Xiang 'Anthony' Chen, Caiming Xiong
There are many potential benefits to news readers accessing diverse sources.
1 code implementation • 25 May 2022 • Liyan Tang, Tanya Goyal, Alexander R. Fabbri, Philippe Laban, Jiacheng Xu, Semih Yavuz, Wojciech Kryściński, Justin F. Rousseau, Greg Durrett
We compare performance of state-of-the-art factuality metrics, including recent ChatGPT-based metrics, on this stratified benchmark and show that their performance varies significantly across different types of summarization models.
1 code implementation • 13 May 2022 • Philippe Laban, Chien-Sheng Wu, Wenhao Liu, Caiming Xiong
Precisely assessing progress in natural language generation (NLG) tasks is challenging, and human evaluation to establish a preference for one model's output over another's is often necessary.
no code implementations • Findings (NAACL) 2022 • Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Wenhao Liu, Caiming Xiong
Question generation (QGen) models are often evaluated with standardized NLG metrics that are based on n-gram overlap.
no code implementations • 15 Feb 2022 • Philippe Laban, Elicia Ye, Srujay Korlakunta, John Canny, Marti A. Hearst
News podcasts are a popular medium to stay informed and dive deep into news topics.
3 code implementations • 18 Nov 2021 • Philippe Laban, Tobias Schnabel, Paul N. Bennett, Marti A. Hearst
In this work, we revisit the use of NLI for inconsistency detection, finding that past work suffered from a mismatch in input granularity between NLI datasets (sentence-level) and inconsistency detection (document-level).
1 code implementation • Findings (NAACL) 2022 • Lidiya Murakhovs'ka, Chien-Sheng Wu, Philippe Laban, Tong Niu, Wenhao Liu, Caiming Xiong
Asking good questions is an essential ability for both human and machine intelligence.
1 code implementation • ACL 2021 • Philippe Laban, Luke Dai, Lucas Bandarkar, Marti A. Hearst
The Shuffle Test is the most common task to evaluate whether NLP models can measure coherence in text.
1 code implementation • ACL 2021 • Philippe Laban, Tobias Schnabel, Paul Bennett, Marti A. Hearst
This work presents Keep it Simple (KiS), a new approach to unsupervised text simplification which learns to balance a reward across three properties: fluency, salience and simplicity.
no code implementations • ACL 2020 • Philippe Laban, John Canny, Marti A. Hearst
This work describes an automatic news chatbot that draws content from a diverse set of news articles and creates conversations with a user about the news.
1 code implementation • NAACL 2021 • Philippe Laban, Lucas Bandarkar, Marti A. Hearst
Recent progress in Natural Language Understanding (NLU) has seen the latest models surpass human performance on many standard tasks.
1 code implementation • ACL 2020 • Philippe Laban, Andrew Hsi, John Canny, Marti A. Hearst
This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint.
Ranked #51 on Abstractive Text Summarization on CNN / Daily Mail
no code implementations • WS 2017 • Philippe Laban, Marti Hearst
We propose a method to aggregate and organize a large, multi-source dataset of news articles into a collection of major stories, and automatically name and visualize these stories in a working system.