Search Results for author: Ani Nenkova

Found 60 papers, 8 papers with code

What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain

no code implementations TACL 2013 Annie Louis, Ani Nenkova

We show that the distinction between great and typical articles can be detected fairly accurately, and that the entire spectrum of our features contributes to the distinction.

Information Retrieval Recommendation Systems +1

Improving the Annotation of Sentence Specificity

no code implementations LREC 2016 Junyi Jessy Li, Bridget O'Daniel, Yi Wu, Wenli Zhao, Ani Nenkova

We found that the lack of specificity is distributed evenly among immediate prior context, long-distance prior context, and no prior context.

Sentence Specificity

Detecting (Un)Important Content for Single-Document News Summarization

no code implementations EACL 2017 Yinfei Yang, Forrest Sheng Bao, Ani Nenkova

We present a robust approach for detecting intrinsic sentence importance in news, by training on two corpora of document-summary pairs.

Document Summarization News Summarization +1

Combining Lexical and Syntactic Features for Detecting Content-dense Texts in News

no code implementations 3 Apr 2017 Yinfei Yang, Ani Nenkova

On manually annotated data, we compare the performance of domain-specific classifiers, trained on data only from a given news domain, and a general classifier in which data from all four domains is pooled together.

Question Answering
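
A minimal sketch of the comparison described in this entry, with synthetic stand-ins for the features, labels, and the four news domains (none of this is the paper's data or feature set):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
domains = {}
for name in ["business", "science", "sports", "politics"]:
    X = rng.normal(size=(300, 10))                               # placeholder features
    y = (X[:, 0] + 0.1 * rng.normal(size=300) > 0).astype(int)   # placeholder labels
    domains[name] = train_test_split(X, y, test_size=0.3, random_state=0)

# Pool the training portions of all four domains for the general classifier.
pool_X = np.vstack([split[0] for split in domains.values()])
pool_y = np.concatenate([split[2] for split in domains.values()])
general = LogisticRegression(max_iter=1000).fit(pool_X, pool_y)

# Compare against a classifier trained only on each domain's own data.
for name, (X_tr, X_te, y_tr, y_te) in domains.items():
    specific = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print(f"{name}: domain-specific={specific.score(X_te, y_te):.2f} "
          f"pooled={general.score(X_te, y_te):.2f}")
```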

Aggregating and Predicting Sequence Labels from Crowd Annotations

1 code implementation ACL 2017 An Thanh Nguyen, Byron Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease

Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text.

Named Entity Recognition +2
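
A minimal sketch of the problem setup, using token-level majority voting as a naive aggregation baseline; the paper itself proposes more sophisticated aggregation models, so this is only an illustration of the input/output format:

```python
# Naive baseline: aggregate noisy sequence labels by per-token majority vote.
from collections import Counter

def majority_vote(annotations):
    """annotations: list of label sequences, one per annotator,
    all aligned to the same tokens. Ties are broken arbitrarily."""
    length = len(annotations[0])
    assert all(len(seq) == length for seq in annotations)
    aggregated = []
    for position in range(length):
        votes = Counter(seq[position] for seq in annotations)
        aggregated.append(votes.most_common(1)[0][0])
    return aggregated

# Three annotators labeling the same 4-token sentence with BIO tags.
crowd = [
    ["B-PER", "I-PER", "O", "O"],
    ["B-PER", "O",     "O", "O"],
    ["B-PER", "I-PER", "O", "B-LOC"],
]
print(majority_vote(crowd))  # ['B-PER', 'I-PER', 'O', 'O']
```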

Syntactic Patterns Improve Information Extraction for Medical Search

no code implementations NAACL 2018 Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron Wallace

Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest.

Named Person Coreference in English News

no code implementations 26 Oct 2018 Oshin Agarwal, Sanjay Subramanian, Ani Nenkova, Dan Roth

Here, we evaluate two state-of-the-art coreference resolution systems on the subtask of Named Person Coreference, in which we are interested in identifying a person mentioned by name, along with all other mentions of the person, by pronoun or generic noun phrase.

Coreference Resolution Named Entity Recognition +2

Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction

no code implementations NAACL 2019 Yinfei Yang, Oshin Agarwal, Chris Tar, Byron C. Wallace, Ani Nenkova

Experiments on a complex biomedical information extraction task using expert and lay annotators show that: (i) simply excluding from the training data instances predicted to be difficult yields a small boost in performance; (ii) using difficulty scores to weight instances during training provides further, consistent gains; (iii) assigning instances predicted to be difficult to domain experts is an effective strategy for task routing.
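
A minimal sketch of strategy (ii) above, weighting training instances by predicted difficulty; the features, labels, and difficulty scores below are invented placeholders, not the paper's model or data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))           # placeholder features
y = (X[:, 0] > 0).astype(int)           # placeholder labels
difficulty = rng.uniform(size=200)      # hypothetical per-instance difficulty in [0, 1]

# Easier instances get larger weights; instances predicted to be difficult
# contribute less to the training objective.
weights = 1.0 - 0.8 * difficulty

clf = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=weights)
print(clf.score(X, y))
```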

Browsing Health: Information Extraction to Support New Interfaces for Accessing Medical Evidence

no code implementations WS 2019 Soham Parikh, Elizabeth Conrad, Oshin Agarwal, Iain Marshall, Byron Wallace, Ani Nenkova

Typical information needs, such as retrieving a full list of medical interventions for a given condition or finding the reported efficacy of a particular treatment with respect to a specific outcome of interest, cannot be straightforwardly posed in typical text-box search.

Emotion Impacts Speech Recognition Performance

no code implementations NAACL 2019 Rushab Munot, Ani Nenkova

It has been established that the performance of speech recognition systems depends on multiple factors including the lexical content, speaker identity and dialect.

Speech Recognition

Evaluation of named entity coreference

no code implementations WS 2019 Oshin Agarwal, Sanjay Subramanian, Ani Nenkova, Dan Roth

It is therefore important that coreference resolution systems are able to link these different types of mentions to the correct entity name.

Coreference Resolution

How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature

no code implementations WS 2019 Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova

We show that plain ROUGE F1 scores are not ideal for comparing current neural systems, which on average produce outputs of different lengths.
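
A minimal sketch of the length confound, computing ROUGE-1 F1 with the rouge-score package for two invented outputs of different lengths, both at their native lengths and truncated to a common word budget; truncation is shown only as one possible remedy, not necessarily the paper's exact recommendation:

```python
from rouge_score import rouge_scorer

reference = "the council approved the new budget after a long debate"
short_out = "the council approved the budget"
long_out = ("the council approved the new budget after a long debate "
            "and members discussed several unrelated amendments as well")

scorer = rouge_scorer.RougeScorer(["rouge1"], use_stemmer=True)

def rouge1_f(candidate, budget=None):
    # Optionally truncate the candidate to a fixed word budget before scoring.
    if budget is not None:
        candidate = " ".join(candidate.split()[:budget])
    return scorer.score(reference, candidate)["rouge1"].fmeasure

print(rouge1_f(short_out), rouge1_f(long_out))        # F1 at each system's native length
print(rouge1_f(short_out, 6), rouge1_f(long_out, 6))  # F1 at the same 6-word budget
```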

Entity-Switched Datasets: An Approach to Auditing the In-Domain Robustness of Named Entity Recognition Models

1 code implementation 8 Apr 2020 Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova

We propose a method for auditing the in-domain robustness of systems, focusing specifically on differences in performance due to the national origin of entities.

Fairness Named Entity Recognition +2
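
A minimal sketch of the entity-switching idea, replacing a gold PER span with names of different national origin so a tagger can be probed on each variant; the names and the helper function are illustrative placeholders, not the paper's resources:

```python
def switch_person(tokens, labels, new_name):
    """Replace the first B-PER/I-PER span with `new_name` (a token list),
    keeping the BIO annotation consistent."""
    out_toks, out_labs, replaced = [], [], False
    i = 0
    while i < len(tokens):
        if labels[i] == "B-PER" and not replaced:
            out_toks.extend(new_name)
            out_labs.extend(["B-PER"] + ["I-PER"] * (len(new_name) - 1))
            i += 1
            while i < len(tokens) and labels[i] == "I-PER":
                i += 1  # skip the rest of the original PER span
            replaced = True
        else:
            out_toks.append(tokens[i])
            out_labs.append(labels[i])
            i += 1
    return out_toks, out_labs

sent = ["John", "Smith", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
for name in (["Nnamdi", "Okafor"], ["Thandiwe", "Dlamini"]):
    print(switch_person(sent, tags, name))
```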

Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve

no code implementations CL (ACL) 2021 Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova

We examine these questions by contrasting the performance of several variants of LSTM-CRF architectures for named entity recognition, some of which are provided only representations of the context as features.

Named Entity Recognition +1

Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time

1 code implementation ACL 2020 Benjamin E. Nye, Ani Nenkova, Iain J. Marshall, Byron C. Wallace

We apply the system at scale to all reports of randomized controlled trials indexed in MEDLINE, powering the automatic generation of evidence maps, which provide a global view of the efficacy of different interventions by combining data from all relevant clinical trials on a topic.

Understanding Clinical Trial Reports: Extracting Medical Entities and Their Relations

no code implementations 7 Oct 2020 Benjamin E. Nye, Jay DeYoung, Eric Lehman, Ani Nenkova, Iain J. Marshall, Byron C. Wallace

Here we consider the end-to-end task of both (a) extracting treatments and outcomes from full-text articles describing clinical trials (entity identification) and, (b) inferring the reported results for the former with respect to the latter (relation extraction).

Decision Making Relation Extraction

From Toxicity in Online Comments to Incivility in American News: Proceed with Caution

no code implementations EACL 2021 Anushree Hede, Oshin Agarwal, Linda Lu, Diana C. Mutz, Ani Nenkova

The ability to quantify incivility online, in news and in congressional debates, is of great interest to political scientists.

Temporal Effects on Pre-trained Models for Language Processing Tasks

1 code implementation 24 Nov 2021 Oshin Agarwal, Ani Nenkova

Keeping the performance of language technologies optimal as time passes is of great practical interest.

Domain Adaptation Experimental Design +3

Self-Repetition in Abstractive Neural Summarizers

no code implementations 14 Oct 2022 Nikita Salkar, Thomas Trikalinos, Byron C. Wallace, Ani Nenkova

In a regression analysis, we find that the three architectures have different propensities for repeating content across the output summaries they produce for different inputs, with BART being particularly prone to self-repetition.
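
A minimal sketch of one way to quantify self-repetition, counting n-grams that recur across summaries generated for different inputs; this is an illustrative statistic, not necessarily the paper's exact measure:

```python
from collections import Counter

def ngrams(text, n=4):
    # Set of word n-grams in a single summary.
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def cross_summary_repeats(summaries, n=4):
    # n-grams that appear in more than one summary (each summary counted once).
    counts = Counter()
    for s in summaries:
        counts.update(ngrams(s, n))
    return {gram: c for gram, c in counts.items() if c > 1}

outputs = [
    "officials said the investigation is ongoing and no arrests were made",
    "the company said the investigation is ongoing and declined to comment",
]
print(cross_summary_repeats(outputs))
```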

Influence Functions for Sequence Tagging Models

1 code implementation 25 Oct 2022 Sarthak Jain, Varun Manjunatha, Byron C. Wallace, Ani Nenkova

We show the practical utility of segment influence by using the method to identify systematic annotation errors in two named entity recognition corpora.

Named Entity Recognition +3

MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding

no code implementations 27 Nov 2022 Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, Ani Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu

In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features.

LayerDoc: Layer-wise Extraction of Spatial Hierarchical Structure in Visually-Rich Documents

no code implementations IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023 Puneet Mathur, Rajiv Jain, Ashutosh Mehra, Jiuxiang Gu, Franck Dernoncourt, Anandhavelu N, Quan Tran, Verena Kaynig-Fittkau, Ani Nenkova, Dinesh Manocha, Vlad I. Morariu

Experiments show that our approach outperforms competitive baselines by 10-15% on three diverse datasets of forms and mobile app screen layouts for the tasks of spatial region classification, higher-order group identification, layout hierarchy extraction, reading order detection, and word grouping.

Reading Order Detection

Few-Shot Dialogue Summarization via Skeleton-Assisted Prompt Transfer in Prompt Tuning

no code implementations 20 May 2023 Kaige Xie, Tong Yu, Haoliang Wang, Junda Wu, Handong Zhao, Ruiyi Zhang, Kanak Mahadik, Ani Nenkova, Mark Riedl

In this paper, we focus on improving prompt transfer from dialogue state tracking to dialogue summarization and propose Skeleton-Assisted Prompt Transfer (SAPT), which leverages skeleton generation as extra supervision that functions as a medium connecting the distinct source and target tasks, resulting in the model's better consumption of dialogue state information.

Dialogue State Tracking Transfer Learning

Summarization from Leaderboards to Practice: Choosing A Representation Backbone and Ensuring Robustness

no code implementations 18 Jun 2023 David Demeter, Oshin Agarwal, Simon Ben Igeri, Marko Sterbentz, Neil Molino, John M. Conroy, Ani Nenkova

Academic literature does not give much guidance on how to build the best possible customer-facing summarization system from existing research components.

PDFTriage: Question Answering over Long, Structured Documents

no code implementations 16 Sep 2023 Jon Saad-Falcon, Joe Barrow, Alexa Siu, Ani Nenkova, David Seunghyun Yoon, Ryan A. Rossi, Franck Dernoncourt

Representing such structured documents as plain text is incongruous with the user's mental model of these documents with rich structure.

Question Answering Retrieval

AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models

1 code implementation 23 Oct 2023 Sicheng Zhu, Ruiyi Zhang, Bang An, Gang Wu, Joe Barrow, Zichao Wang, Furong Huang, Ani Nenkova, Tong Sun

Safety alignment of Large Language Models (LLMs) can be compromised with manual jailbreak attacks and (automatic) adversarial attacks.

Adversarial Attack Blocking

Improving a Named Entity Recognizer Trained on Noisy Data with a Few Clean Instances

no code implementations 25 Oct 2023 Zhendong Chu, Ruiyi Zhang, Tong Yu, Rajiv Jain, Vlad I Morariu, Jiuxiang Gu, Ani Nenkova

To achieve state-of-the-art performance, one still needs to train NER models on large-scale, high-quality annotated data, an asset that is both costly and time-intensive to accumulate.

NER

How Much Annotation is Needed to Compare Summarization Models?

no code implementations 28 Feb 2024 Chantal Shaib, Joe Barrow, Alexa F. Siu, Byron C. Wallace, Ani Nenkova

Modern instruction-tuned models have become highly capable in text generation tasks such as summarization, and are expected to be released at a steady pace.

News Summarization Text Generation

Standardizing the Measurement of Text Diversity: A Tool and a Comparative Analysis of Scores

no code implementations 1 Mar 2024 Chantal Shaib, Joe Barrow, Jiuding Sun, Alexa F. Siu, Byron C. Wallace, Ani Nenkova

The applicability of scores extends beyond analysis of generative models; for example, we highlight applications on instruction-tuning datasets and human-produced texts.
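
A minimal sketch of one commonly used diversity score, distinct-n (unique n-grams over total n-grams), shown only to illustrate the kind of measure the paper standardizes; it is not the paper's tool:

```python
def distinct_n(texts, n=2):
    # Ratio of unique word n-grams to total n-grams across a collection of texts.
    total, unique = 0, set()
    for t in texts:
        toks = t.lower().split()
        grams = [tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)]
        total += len(grams)
        unique.update(grams)
    return len(unique) / total if total else 0.0

print(distinct_n(["the cat sat on the mat", "the dog sat on the rug"]))
```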

DocTime: A Document-level Temporal Dependency Graph Parser

no code implementations NAACL 2022 Puneet Mathur, Vlad Morariu, Verena Kaynig-Fittkau, Jiuxiang Gu, Franck Dernoncourt, Quan Tran, Ani Nenkova, Dinesh Manocha, Rajiv Jain

We introduce DocTime - a novel temporal dependency graph (TDG) parser that takes as input a text document and produces a temporal dependency graph.
