no code implementations • EMNLP 2020 • Chris Kedzie, Kathleen McKeown
We study the degree to which neural sequence-to-sequence models exhibit fine-grained controllability when performing natural language generation from a meaning representation.
no code implementations • COLING 2022 • Fei-Tzin Lee, Miguel Ballesteros, Feng Nan, Kathleen McKeown
Large pretrained language models offer powerful generation capabilities, but cannot be reliably controlled at a sub-sentential level.
no code implementations • COLING 2022 • Elsbeth Turcan, David Wan, Faisal Ladhak, Petra Galuscakova, Sukanta Sen, Svetlana Tchistiakova, Weijia Xu, Marine Carpuat, Kenneth Heafield, Douglas Oard, Kathleen McKeown
Query-focused summaries of foreign-language, retrieved documents can help a user understand whether a document is actually relevant to the query term.
no code implementations • NAACL (ACL) 2022 • Kasturi Bhattacharjee, Rashmi Gangadharaiah, Kathleen McKeown, Dan Roth
Users often leave feedback on a myriad of aspects of a product which, if leveraged successfully, can help yield useful insights that can lead to further improvements down the line.
no code implementations • VarDial (COLING) 2020 • Alyssa Hwang, William R. Frey, Kathleen McKeown
Researchers in natural language processing have developed large, robust resources for understanding formal Standard American English (SAE), but we lack similar resources for variations of English, such as slang and African American English (AAE).
1 code implementation • EMNLP 2021 • Manling Li, Tengfei Ma, Mo Yu, Lingfei Wu, Tian Gao, Heng Ji, Kathleen McKeown
Timeline Summarization identifies major events from a news collection and describes them following temporal order, with key dates tagged.
1 code implementation • 6 Mar 2023 • David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, Mohit Bansal
We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization.
1 code implementation • 31 Jan 2023 • Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto
Large language models (LLMs) have shown promise for automatic summarization but the reasons behind their successes are poorly understood.
no code implementations • 31 Jan 2023 • Melanie Subbiah, Amrita Bhattacharjee, Bobby Yilun Hua, Tharindu Kumarage, Huan Liu, Kathleen McKeown
Manipulated news online is a growing problem which necessitates the use of automated systems to curtail its spread.
1 code implementation • 25 Jan 2023 • Kung-Hsiang Huang, Siffi Singh, Xiaofei Ma, Wei Xiao, Feng Nan, Nicholas Dingwall, William Yang Wang, Kathleen McKeown
Missing information is a common issue of dialogue summarization where some information in the reference summaries is not covered in the generated summaries.
no code implementations • 20 Dec 2022 • Yukun Huang, Yanda Chen, Zhou Yu, Kathleen McKeown
We propose to combine in-context learning objectives with language modeling objectives to distill both the ability to read in-context examples and task knowledge to the smaller models.
1 code implementation • 21 Nov 2022 • Noah Bergam, Emily Allaway, Kathleen McKeown
As a natural extension of this political stance detection, we propose the more specialized task of legal stance detection with our new dataset SC-stance, which matches written opinions to legal questions.
no code implementations • COLING (CreativeSumm) 2022 • Divyansh Agarwal, Alexander R. Fabbri, Simeng Han, Wojciech Kryściński, Faisal Ladhak, Bryan Li, Kathleen McKeown, Dragomir Radev, Tianyi Zhang, Sam Wiseman
We detail the process of curating these datasets for the task, as well as the metrics used for the evaluation of the submissions.
no code implementations • 9 Nov 2022 • Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown
Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter.
no code implementations • 18 Oct 2022 • Sharon Levy, Emily Allaway, Melanie Subbiah, Lydia Chilton, Desmond Patton, Kathleen McKeown, William Yang Wang
Understanding what constitutes safe text is an important issue in natural language processing and can often prevent the deployment of models deemed harmful and unsafe.
no code implementations • 17 Oct 2022 • Alex Mei, Anisha Kabir, Sharon Levy, Melanie Subbiah, Emily Allaway, John Judge, Desmond Patton, Bruce Bimber, Kathleen McKeown, William Yang Wang
An increasingly prevalent problem for intelligent technologies is text safety, as uncontrolled systems may generate recommendations to their users that lead to injury or life-threatening consequences.
no code implementations • 16 Sep 2022 • Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
In-context learning (ICL) suffers from oversensitivity to the prompt, making it unreliable in real-world scenarios.
no code implementations • 23 May 2022 • Emily Allaway, Jena D. Hwang, Chandra Bhagavatula, Kathleen McKeown, Doug Downey, Yejin Choi
Generics express generalizations about the world (e.g., birds can fly) that are not universally true (e.g., newborn birds and penguins cannot fly).
no code implementations • 23 May 2022 • Anish Saha, Amith Ananthram, Emily Allaway, Heng Ji, Kathleen McKeown
Practitioners from many disciplines (e.g., political science) use expert-crafted taxonomies to make sense of large, unlabeled corpora.
1 code implementation • 13 Apr 2022 • Griffin Adams, Han-Chin Shing, Qing Sun, Christopher Winestock, Kathleen McKeown, Noémie Elhadad
In real-world scenarios with naturally occurring datasets, reference summaries are noisy and may contain information that cannot be inferred from the source text.
1 code implementation • Findings (ACL) 2022 • Chao Zhao, Tenghao Huang, Somnath Basu Roy Chowdhury, Muthu Kumar Chandrasekaran, Kathleen McKeown, Snigdha Chaturvedi
A common method for extractive multi-document news summarization is to re-formulate it as a single-document summarization problem by concatenating all documents as a single meta-document.
no code implementations • 10 Mar 2022 • Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, Heng Ji
While there has been a lot of research and many recent advances in neural fake news detection, defending against human-written disinformation remains underexplored.
no code implementations • 27 Nov 2021 • Fei-Tzin Lee, Chris Kedzie, Nakul Verma, Kathleen McKeown
Prior work in AMR-based summarization has automatically merged the individual sentence graphs into a document graph, but the method of merging and its effects on summary content selection have not been independently evaluated.
no code implementations • EMNLP 2021 • Muhammad Khalifa, Miguel Ballesteros, Kathleen McKeown
Dialogue summarization comes with its own peculiar challenges as opposed to news or scientific article summarization.
1 code implementation • EMNLP (sustainlp) 2021 • Gengyu Wang, Xiaochen Hou, Diyi Yang, Kathleen McKeown, Jing Huang
Large pre-trained language models (PLMs) have led to great success on various commonsense question answering (QA) tasks in an end-to-end fashion.
1 code implementation • ACL 2022 • Faisal Ladhak, Esin Durmus, He He, Claire Cardie, Kathleen McKeown
Despite recent progress in abstractive summarization, systems still suffer from faithfulness errors.
no code implementations • ACL 2021 • Muhao Chen, Hongming Zhang, Qiang Ning, Manling Li, Heng Ji, Kathleen McKeown, Dan Roth
This tutorial targets researchers and practitioners who are interested in AI technologies that help machines understand natural language text, particularly real-world events described in the text.
no code implementations • ACL 2021 • Yi Fung, Christopher Thomas, Revanth Gangi Reddy, Sandeep Polisetty, Heng Ji, Shih-Fu Chang, Kathleen McKeown, Mohit Bansal, Avi Sil
To defend against machine-generated fake news, an effective mechanism is urgently needed.
no code implementations • ACL 2021 • Yanda Chen, Chris Kedzie, Suraj Nair, Petra Galuščáková, Rui Zhang, Douglas W. Oard, Kathleen McKeown
This paper proposes an approach to cross-language sentence selection in a low-resource setting.
1 code implementation • NAACL 2021 • Elsbeth Turcan, Smaranda Muresan, Kathleen McKeown
The problem of detecting psychological stress in online posts, and more broadly, of detecting people in distress or in need of help, is a sensitive application for which the ability to interpret models is vital.
1 code implementation • NAACL 2021 • Emily Allaway, Malavika Srikanth, Kathleen McKeown
Stance detection on social media can help to identify and understand slanted news or commentary in everyday life.
1 code implementation • ACL 2021 • Feng Nan, Cicero Nogueira dos Santos, Henghui Zhu, Patrick Ng, Kathleen McKeown, Ramesh Nallapati, Dejiao Zhang, Zhiguo Wang, Andrew O. Arnold, Bing Xiang
A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents.
no code implementations • EACL 2021 • David Wan, Chris Kedzie, Faisal Ladhak, Elsbeth Turcan, Petra Galuščáková, Elena Zotkina, Zhengping Jiang, Peter Bell, Kathleen McKeown
Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation.
2 code implementations • NAACL 2021 • Dejiao Zhang, Feng Nan, Xiaokai Wei, Shangwen Li, Henghui Zhu, Kathleen McKeown, Ramesh Nallapati, Andrew Arnold, Bing Xiang
Unsupervised clustering aims at discovering the semantic categories of data according to some distance measured in the representation space.
Ranked #1 on Short Text Clustering on Tweet
1 code implementation • EACL 2021 • Feng Nan, Ramesh Nallapati, Zhiguo Wang, Cicero Nogueira dos Santos, Henghui Zhu, Dejiao Zhang, Kathleen McKeown, Bing Xiang
A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document.
no code implementations • EACL 2021 • Kailash Karthik Saravanakumar, Miguel Ballesteros, Muthu Kumar Chandrasekaran, Kathleen McKeown
We propose a method for online news stream clustering that is a variant of the non-parametric streaming K-means algorithm.
no code implementations • 4 Dec 2020 • Amith Ananthram, Emily Allaway, Kathleen McKeown
General purpose relation extraction has recently seen considerable gains in part due to a massively data-intensive distant supervision technique from Soares et al. (2019) that produces state-of-the-art results across many benchmarks.
no code implementations • COLING 2020 • Amith Ananthram, Emily Allaway, Kathleen McKeown
General purpose relation extraction has recently seen considerable gains in part due to a massively data-intensive distant supervision technique from Soares et al. (2019) that produces state-of-the-art results across many benchmarks.
1 code implementation • COLING 2020 • Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu, Mona Diab, Kathleen McKeown
We adopt cross-lingual embeddings constructed using different methods to extract features of the tweets, including a few state-of-the-art contextual embeddings such as BERT, RoBERTa and XLM-R. We train classifiers of different architectures on the extracted features.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Dejiao Zhang, Ramesh Nallapati, Henghui Zhu, Feng Nan, Cicero Nogueira dos Santos, Kathleen McKeown, Bing Xiang
Unsupervised domain adaptation addresses the problem of leveraging labeled data in a source domain to learn a well-performing model in a target domain where labels are unavailable.
Cross-Lingual Document Classification
Document Classification
1 code implementation • WMT (EMNLP) 2020 • David Wan, Chris Kedzie, Faisal Ladhak, Marine Carpuat, Kathleen McKeown
In this paper, we present both autoregressive and non-autoregressive models for lexically constrained APE, demonstrating that our approach enables preservation of 95% of the terminologies and also improves translation quality on English-German benchmarks.
no code implementations • 19 Oct 2020 • David Wan, Zhengping Jiang, Chris Kedzie, Elsbeth Turcan, Peter Bell, Kathleen McKeown
In this work, we focus on improving ASR output segmentation in the context of low-resource language speech-to-text translation.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Faisal Ladhak, Esin Durmus, Claire Cardie, Kathleen McKeown
As a set of baselines for further studies, we evaluate the performance of existing cross-lingual abstractive summarization methods on our dataset.
Abstractive Text Summarization
Cross-Lingual Abstractive Summarization
1 code implementation • EMNLP 2020 • Emily Allaway, Kathleen McKeown
Stance detection is an important component of understanding hidden influences in everyday life.
no code implementations • EACL 2021 • Emily Allaway, Kathleen McKeown
Ideological attitudes and stance are often expressed through subtle meanings of words and phrases.
1 code implementation • ACL 2020 • Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, Kathleen McKeown
We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides.
no code implementations • EMNLP 2020 • Miguel Ballesteros, Rishita Anubhai, Shuai Wang, Nima Pourdamghani, Yogarshi Vyas, Jie Ma, Parminder Bhatia, Kathleen McKeown, Yaser Al-Onaizan
In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations.
1 code implementation • WS 2019 • Chris Kedzie, Kathleen McKeown
Deep neural networks (DNN) are quickly becoming the de facto standard modeling method for many natural language generation (NLG) tasks.
1 code implementation • WS 2019 • Elsbeth Turcan, Kathleen McKeown
Stress is a nigh-universal human experience, particularly in the online world.
no code implementations • IJCNLP 2019 • Serina Chang, Kathleen McKeown
In this paper, we pose the question: do people talk about women and men in different ways?
no code implementations • 19 Aug 2019 • Ruiqi Zhong, Steven Shao, Kathleen McKeown
While the general task of textual sentiment classification has been widely studied, much less research looks specifically at sentiment between a specified source and target.
1 code implementation • NAACL 2019 • Tuhin Chakrabarty, Christopher Hidey, Kathleen McKeown
Claims are the central component of an argument.
no code implementations • 24 Feb 2019 • Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W. Black, Jaime Carbonell, Graham V. Horwood, Shabnam Tafreshi, Mona Diab, Efsun S. Kayi, Noura Farra, Kathleen McKeown
This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).
2 code implementations • EMNLP 2018 • Chris Kedzie, Kathleen McKeown, Hal Daume III
We carry out experiments with deep learning models of summarization across the domains of news, personal stories, meetings, and medical articles in order to understand how content selection is performed.
no code implementations • WS 2018 • Rohan Kshirsagar, Tyus Cukuvac, Kathleen McKeown, Susan McGregor
We present a neural-network based approach to classifying online hate speech in general, as well as racist and sexist speech in particular.
1 code implementation • EMNLP 2018 • Serina Chang, Ruiqi Zhong, Ethan Adams, Fei-Tzin Lee, Siddharth Varia, Desmond Patton, William Frey, Chris Kedzie, Kathleen McKeown
Gang-involved youth in cities such as Chicago have increasingly turned to social media to post about their experiences and intents online.
no code implementations • 23 Jul 2018 • Philipp Blandfort, Desmond Patton, William R. Frey, Svebor Karaman, Surabhi Bhargava, Fei-Tzin Lee, Siddharth Varia, Chris Kedzie, Michael B. Gaskell, Rossano Schifanella, Kathleen McKeown, Shih-Fu Chang
In this paper we partnered computer scientists with social work researchers, who have domain expertise in gang violence, to analyze how public tweets with images posted by youth who mention gang associations on Twitter can be leveraged to automatically detect psychosocial factors and conditions that could potentially assist social workers and violence outreach workers in prevention and early intervention programs.
no code implementations • IJCNLP 2017 • Or Biran, Kathleen McKeown
RDF ontologies provide structured data on entities in many domains and continue to grow in size and diversity.
no code implementations • 13 Aug 2017 • Tao Yu, Christopher Hidey, Owen Rambow, Kathleen McKeown
This model outperforms many deep learning models and achieves comparable results to other deep learning models with complex architectures on sentiment analysis datasets.
no code implementations • EACL 2017 • Noura Farra, Kathleen McKeown
We consider entity-level sentiment analysis in Arabic, a morphologically rich language with increasing resources.
no code implementations • COLING 2016 • Terra Blevins, Robert Kwiatkowski, Jamie MacBeth, Kathleen McKeown, Desmond Patton, Owen Rambow
Violence is a serious problem for cities like Chicago and has been exacerbated by the use of social media by gang-involved youths for taunting rival gangs.
no code implementations • 28 Sep 2016 • Desmond Upton Patton, Kathleen McKeown, Owen Rambow, Jamie MacBeth
The U.S. has the highest rate of firearm-related deaths when compared to other industrialized countries.
no code implementations • 12 May 2016 • Chris Kedzie, Fernando Diaz, Kathleen McKeown
We present a system based on sequential decision making for the online summarization of massive document streams, such as those found on the web.
no code implementations • LREC 2012 • Jacob Andreas, Sara Rosenthal, Kathleen McKeown
We introduce a new corpus of sentence-level agreement and disagreement annotations over LiveJournal and Wikipedia threads.