no code implementations • NAACL 2022 • Puneet Mathur, Vlad Morariu, Verena Kaynig-Fittkau, Jiuxiang Gu, Franck Dernoncourt, Quan Tran, Ani Nenkova, Dinesh Manocha, Rajiv Jain
We introduce DocTime, a novel temporal dependency graph (TDG) parser that takes a text document as input and produces a temporal dependency graph.
no code implementations • Findings (ACL) 2022 • Zihan Wang, Jiuxiang Gu, Jason Kuen, Handong Zhao, Vlad Morariu, Ruiyi Zhang, Ani Nenkova, Tong Sun, Jingbo Shang
We present a comprehensive study of sparse attention patterns in Transformer models.
no code implementations • 18 Apr 2024 • Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang
Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation, marking a significant milestone towards high-quality open-world entity segmentation in the absence of human-annotated masks.
no code implementations • 1 Mar 2024 • Chantal Shaib, Joe Barrow, Jiuding Sun, Alexa F. Siu, Byron C. Wallace, Ani Nenkova
The applicability of scores extends beyond analysis of generative models; for example, we highlight applications on instruction-tuning datasets and human-produced texts.
no code implementations • 28 Feb 2024 • Chantal Shaib, Joe Barrow, Alexa F. Siu, Byron C. Wallace, Ani Nenkova
Modern instruction-tuned models have become highly capable in text generation tasks such as summarization, and are expected to be released at a steady pace.
no code implementations • 25 Oct 2023 • Zhendong Chu, Ruiyi Zhang, Tong Yu, Rajiv Jain, Vlad I Morariu, Jiuxiang Gu, Ani Nenkova
To achieve state-of-the-art performance, one still needs to train NER models on large-scale, high-quality annotated data, an asset that is both costly and time-intensive to accumulate.
1 code implementation • 23 Oct 2023 • Sicheng Zhu, Ruiyi Zhang, Bang An, Gang Wu, Joe Barrow, Zichao Wang, Furong Huang, Ani Nenkova, Tong Sun
Safety alignment of Large Language Models (LLMs) can be compromised with manual jailbreak attacks and (automatic) adversarial attacks.
no code implementations • 16 Sep 2023 • Jon Saad-Falcon, Joe Barrow, Alexa Siu, Ani Nenkova, David Seunghyun Yoon, Ryan A. Rossi, Franck Dernoncourt
Representing such structured documents as plain text is incongruous with the user's mental model of these documents with rich structure.
no code implementations • 18 Jun 2023 • David Demeter, Oshin Agarwal, Simon Ben Igeri, Marko Sterbentz, Neil Molino, John M. Conroy, Ani Nenkova
Academic literature does not give much guidance on how to build the best possible customer-facing summarization system from existing research components.
no code implementations • 20 May 2023 • Kaige Xie, Tong Yu, Haoliang Wang, Junda Wu, Handong Zhao, Ruiyi Zhang, Kanak Mahadik, Ani Nenkova, Mark Riedl
In this paper, we focus on improving the prompt transfer from dialogue state tracking to dialogue summarization and propose Skeleton-Assisted Prompt Transfer (SAPT), which leverages skeleton generation as extra supervision that functions as a medium connecting the distinct source and target tasks, resulting in the model's better consumption of dialogue state information.
no code implementations • 11 May 2023 • Gaurav Verma, Ryan A. Rossi, Christopher Tensmeyer, Jiuxiang Gu, Ani Nenkova
Visual text evokes an image in a person's mind, while non-visual text fails to do so.
no code implementations • 23 Feb 2023 • Chieh-Yang Huang, Ting-Yao Hsu, Ryan Rossi, Ani Nenkova, Sungchul Kim, Gromit Yeuk-Yin Chan, Eunyee Koh, Clyde Lee Giles, Ting-Hao 'Kenneth' Huang
Prior work often treated figure caption generation as a vision-to-language task.
no code implementations • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023 • Puneet Mathur, Rajiv Jain, Ashutosh Mehra, Jiuxiang Gu, Franck Dernoncourt, Anandhavelu N, Quan Tran, Verena Kaynig-Fittkau, Ani Nenkova, Dinesh Manocha, Vlad I. Morariu
Experiments show that our approach outperforms competitive baselines by 10-15% on three diverse datasets of forms and mobile app screen layouts for the tasks of spatial region classification, higher-order group identification, layout hierarchy extraction, reading order detection, and word grouping.
no code implementations • 27 Nov 2022 • Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu
In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features.
1 code implementation • 25 Oct 2022 • Sarthak Jain, Varun Manjunatha, Byron C. Wallace, Ani Nenkova
We show the practical utility of segment influence by using the method to identify systematic annotation errors in two named entity recognition corpora.
no code implementations • 14 Oct 2022 • Nikita Salkar, Thomas Trikalinos, Byron C. Wallace, Ani Nenkova
In a regression analysis, we find that the three architectures have different propensities for repeating content across output summaries for inputs, with BART being particularly prone to self-repetition.
no code implementations • 22 Apr 2022 • Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, Rajiv Jain, Ani Nenkova, Tong Sun
Document intelligence automates the extraction of information from documents and supports many business applications.
Ranked #8 on Document Layout Analysis on PubLayNet val
no code implementations • NeurIPS 2021 • Jiuxiang Gu, Jason Kuen, Vlad Morariu, Handong Zhao, Rajiv Jain, Nikolaos Barmpalios, Ani Nenkova, Tong Sun
Document intelligence automates the extraction of information from documents and supports many business applications.
1 code implementation • 24 Nov 2021 • Oshin Agarwal, Ani Nenkova
Keeping the performance of language technologies optimal as time passes is of great practical interest.
no code implementations • EACL 2021 • Anushree Hede, Oshin Agarwal, Linda Lu, Diana C. Mutz, Ani Nenkova
The ability to quantify incivility online, in news and in congressional debates, is of great interest to political scientists.
no code implementations • 7 Oct 2020 • Benjamin E. Nye, Jay DeYoung, Eric Lehman, Ani Nenkova, Iain J. Marshall, Byron C. Wallace
Here we consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describing clinical trials (entity identification) and (b) inferring the reported results for the former with respect to the latter (relation extraction).
1 code implementation • ACL 2020 • Benjamin E. Nye, Ani Nenkova, Iain J. Marshall, Byron C. Wallace
We apply the system at scale to all reports of randomized controlled trials indexed in MEDLINE, powering the automatic generation of evidence maps, which provide a global view of the efficacy of different interventions combining data from all relevant clinical trials on a topic.
no code implementations • CL (ACL) 2021 • Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova
We examine these questions by contrasting the performance of several variants of LSTM-CRF architectures for named entity recognition, with some provided only representations of the context as features.
1 code implementation • 8 Apr 2020 • Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova
We propose a method for auditing the in-domain robustness of systems, focusing specifically on differences in performance due to the national origin of entities.
no code implementations • IJCNLP 2019 • Simeng Sun, Ani Nenkova
ROUGE is widely used to automatically evaluate summarization systems.
no code implementations • WS 2019 • Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova
We show that plain ROUGE F1 scores are not ideal for comparing current neural systems which on average produce different lengths.
no code implementations • NAACL 2019 • Rushab Munot, Ani Nenkova
It has been established that the performance of speech recognition systems depends on multiple factors including the lexical content, speaker identity and dialect.
1 code implementation • SEMEVAL 2019 • Oshin Agarwal, Funda Durupınar, Norman I. Badler, Ani Nenkova
Word representations trained on text reproduce human implicit bias related to gender, race and age.
no code implementations • WS 2019 • Soham Parikh, Elizabeth Conrad, Oshin Agarwal, Iain Marshall, Byron Wallace, Ani Nenkova
Typical information needs, such as retrieving a full list of medical interventions for a given condition, or finding the reported efficacy of a particular treatment with respect to a specific outcome of interest, cannot be straightforwardly posed in typical text-box search.
no code implementations • WS 2019 • Oshin Agarwal, Sanjay Subramanian, Ani Nenkova, Dan Roth
It is therefore important that coreference resolution systems are able to link these different types of mentions to the correct entity name.
no code implementations • NAACL 2019 • Yinfei Yang, Oshin Agarwal, Chris Tar, Byron C. Wallace, Ani Nenkova
Experiments on a complex biomedical information extraction task using expert and lay annotators show that: (i) simply excluding from the training data instances predicted to be difficult yields a small boost in performance; (ii) using difficulty scores to weight instances during training provides further, consistent gains; (iii) assigning instances predicted to be difficult to domain experts is an effective strategy for task routing.
no code implementations • 26 Oct 2018 • Oshin Agarwal, Sanjay Subramanian, Ani Nenkova, Dan Roth
Here, we evaluate two state-of-the-art coreference resolution systems on the subtask of Named Person Coreference, in which we are interested in identifying a person mentioned by name, along with all other mentions of the person, by pronoun or generic noun phrase.
no code implementations • EMNLP 2018 • Ori Shapira, David Gabay, Hadar Ronen, Judit Bar-Ilan, Yael Amsterdamer, Ani Nenkova, Ido Dagan
Practical summarization systems are expected to produce summaries of varying lengths, per user needs.
2 code implementations • ACL 2018 • Benjamin Nye, Junyi Jessy Li, Roma Patel, Yinfei Yang, Iain J. Marshall, Ani Nenkova, Byron C. Wallace
We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials.
no code implementations • NAACL 2018 • Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron Wallace
Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest.
1 code implementation • ACL 2017 • An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease
Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text.
no code implementations • 3 Apr 2017 • Yinfei Yang, Ani Nenkova
On manually annotated data, we compare the performance of domain-specific classifiers, trained on data only from a given news domain, and a general classifier in which data from all four domains is pooled together.
no code implementations • EACL 2017 • Yinfei Yang, Forrest Sheng Bao, Ani Nenkova
We present a robust approach for detecting intrinsic sentence importance in news, by training on two corpora of document-summary pairs.
no code implementations • LREC 2016 • Junyi Jessy Li, Bridget O'Daniel, Yi Wu, Wenli Zhao, Ani Nenkova
We found that the lack of specificity distributes evenly among immediate prior context, long distance prior context and no prior context.
no code implementations • LREC 2014 • Kai Hong, John Conroy, Benoit Favre, Alex Kulesza, Hui Lin, Ani Nenkova
In the period since 2004, many novel sophisticated approaches for generic multi-document summarization have been developed.
no code implementations • TACL 2013 • Annie Louis, Ani Nenkova
We show that the distinction between great and typical articles can be detected fairly accurately, and that the entire spectrum of our features contributes to the distinction.
no code implementations • LREC 2012 • Annie Louis, Ani Nenkova
We present a corpus of sentences from news articles that are annotated as general or specific.