Search Results for author: Nanyun Peng

Found 85 papers, 43 papers with code

Improving Pre-trained Vision-and-Language Embeddings for Phrase Grounding

no code implementations EMNLP 2021 Zi-Yi Dou, Nanyun Peng

Phrase grounding aims to map textual phrases to their associated image regions, which can be a prerequisite for multimodal reasoning and can benefit tasks requiring identifying objects based on language.

Fine-tuning Phrase Grounding

ESTER: A Machine Reading Comprehension Dataset for Reasoning about Event Semantic Relations

no code implementations EMNLP 2021 Rujun Han, I-Hung Hsu, Jiao Sun, Julia Baylon, Qiang Ning, Dan Roth, Nanyun Peng

While these tasks partially evaluate machines’ ability of narrative understanding, human-like reading comprehension requires the capability to process event-based information beyond arguments and temporal reasoning.

Machine Reading Comprehension

AESOP: Paraphrase Generation with Adaptive Syntactic Control

1 code implementation EMNLP 2021 Jiao Sun, Xuezhe Ma, Nanyun Peng

We propose to control paraphrase generation through carefully chosen target syntactic structures to generate more proper and higher quality paraphrases.

Data Augmentation Language Modelling +1

Understanding Procedural Knowledge by Sequencing Multimodal Instructional Manuals

no code implementations 16 Oct 2021 Te-Lin Wu, Alex Spangher, Pegah Alipoormolabashi, Marjorie Freedman, Ralph Weischedel, Nanyun Peng

The ability to sequence unordered events is an essential skill to comprehend and reason about real world task procedures, which often requires thorough understanding of temporal common sense and multimodal information, as these procedures are often communicated through a combination of texts and images.

Common Sense Reasoning

On the Safety of Conversational Models: Taxonomy, Dataset, and Benchmark

no code implementations 16 Oct 2021 Hao Sun, Guangxuan Xu, Jiawen Deng, Jiale Cheng, Chujie Zheng, Hao Zhou, Nanyun Peng, Xiaoyan Zhu, Minlie Huang

We propose a taxonomy for dialogue safety specifically designed to capture unsafe behaviors that are unique to the human-bot dialogue setting, with a focus on context-sensitive unsafety, which is under-explored in prior work.

Socially Aware Bias Measurements for Hindi Language Representations

no code implementations 15 Oct 2021 Vijit Malik, Sunipa Dev, Akihiro Nishi, Nanyun Peng, Kai-Wei Chang

Language representations are an efficient tool used across NLP, but they are rife with encoded societal biases.

HyperExpan: Taxonomy Expansion with Hyperbolic Representation Learning

no code implementations Findings (EMNLP) 2021 Mingyu Derek Ma, Muhao Chen, Te-Lin Wu, Nanyun Peng

Taxonomies are valuable resources for many applications, but the limited coverage due to the expensive manual curation process hinders their general applicability.

Representation Learning

Document-level Entity-based Extraction as Template Generation

1 code implementation EMNLP 2021 Kung-Hsiang Huang, Sam Tang, Nanyun Peng

Document-level entity-based extraction (EE), aiming at extracting entity-centric information such as entity roles and entity relations, is key to automatic knowledge acquisition from text corpora for various domains.

4-ary Relation Extraction Binary Relation Extraction +2

Paraphrase Generation as Unsupervised Machine Translation

no code implementations 7 Sep 2021 Chun Fan, Yufei Tian, Yuxian Meng, Nanyun Peng, Xiaofei Sun, Fei Wu, Jiwei Li

Then based on the paraphrase pairs produced by these UMT models, a unified surrogate model can be trained to serve as the final Seq2Seq model to generate paraphrases, which can be directly used for test in the unsupervised setup, or be finetuned on labeled datasets in the supervised setup.

Paraphrase Generation Translation +1

What do Bias Measures Measure?

no code implementations 7 Aug 2021 Sunipa Dev, Emily Sheng, Jieyu Zhao, Jiao Sun, Yu Hou, Mattie Sanseverino, Jiin Kim, Nanyun Peng, Kai-Wei Chang

To address this gap, this work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms.

Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia

1 code implementation ACL 2021 Jiao Sun, Nanyun Peng

Human activities can be seen as sequences of events, which are crucial to understanding societies.

Event Detection

Metaphor Generation with Conceptual Mappings

1 code implementation ACL 2021 Kevin Stowe, Tuhin Chakrabarty, Nanyun Peng, Smaranda Muresan, Iryna Gurevych

Guided by conceptual metaphor theory, we propose to control the generation process by encoding conceptual mappings between cognitive domains to generate meaningful metaphoric expressions.

COM2SENSE: A Commonsense Reasoning Benchmark with Complementary Sentences

1 code implementation Findings (ACL) 2021 Shikhar Singh, Nuan Wen, Yu Hou, Pegah Alipoormolabashi, Te-Lin Wu, Xuezhe Ma, Nanyun Peng

To this end, we introduce a new commonsense reasoning benchmark dataset comprising natural language true/false statements, with each sample paired with its complementary counterpart, resulting in 4k sentence pairs.

Fine-tuning

"Nice Try, Kiddo": Investigating Ad Hominems in Dialogue Responses

no code implementations NAACL 2021 Emily Sheng, Kai-Wei Chang, Prem Natarajan, Nanyun Peng

Ad hominem attacks are those that target some feature of a person's character instead of the position the person is maintaining.

Abusive Language

Societal Biases in Language Generation: Progress and Challenges

1 code implementation ACL 2021 Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng

Technology for language generation has advanced rapidly, spurred by advancements in pre-training large models on massive amounts of data and the need for intelligent agents to communicate in a natural manner.

Fairness Text Generation

"Don't quote me on that": Finding Mixtures of Sources in News Articles

no code implementations 19 Apr 2021 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Journalists publish statements provided by people, or sources, to contextualize current events, help voters make informed decisions, and hold powerful individuals accountable.

Modeling "Newsworthiness" for Lead-Generation Across Corpora

no code implementations 19 Apr 2021 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Journalists obtain "leads", or story ideas, by reading large corpora of government records: court cases, proposed bills, etc.

Revealing Persona Biases in Dialogue Systems

1 code implementation 18 Apr 2021 Emily Sheng, Josh Arnold, Zhou Yu, Kai-Wei Chang, Nanyun Peng

Dialogue systems in the form of chatbots and personal assistants are being increasingly integrated into people's lives.

Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

1 code implementation EMNLP 2021 Kuan-Hao Huang, Wasi Uddin Ahmad, Nanyun Peng, Kai-Wei Chang

Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer.

Fine-tuning Text Classification +2

ESTER: A Machine Reading Comprehension Dataset for Event Semantic Relation Reasoning

1 code implementation 16 Apr 2021 Rujun Han, I-Hung Hsu, Jiao Sun, Julia Baylon, Qiang Ning, Dan Roth, Nanyun Peng

While these tasks partially evaluate machines' ability of narrative understanding, human-like reading comprehension requires the capability to process event-based information beyond arguments and temporal reasoning.

Machine Reading Comprehension Question Answering

Plot-guided Adversarial Example Construction for Evaluating Open-domain Story Generation

1 code implementation NAACL 2021 Sarik Ghazarian, Zixi Liu, Akash SM, Ralph Weischedel, Aram Galstyan, Nanyun Peng

We propose to tackle these issues by generating a more comprehensive set of implausible stories using plots, which are structured representations of controllable factors used to generate stories.

Story Generation

MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding

1 code implementation NAACL 2021 Tuhin Chakrabarty, Xurui Zhang, Smaranda Muresan, Nanyun Peng

Generating metaphors is a challenging task as it requires a proper understanding of abstract concepts, making connections between unrelated concepts, and deviating from the literal meaning.

Language Modelling

On Efficient Training, Controllability and Compositional Generalization of Insertion-based Language Generators

no code implementations 12 Feb 2021 Sidi Lu, Nanyun Peng

Auto-regressive language models with the left-to-right generation order have been a predominant paradigm for language generation.

Story Generation

EventPlus: A Temporal Event Understanding Pipeline

no code implementations NAACL 2021 Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, Nanyun Peng

We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction.

Common Sense Reasoning Relation Extraction

Discourse-level Relation Extraction via Graph Pooling

no code implementations 1 Jan 2021 I-Hung Hsu, Xiao Guo, Premkumar Natarajan, Nanyun Peng

The ability to capture complex linguistic structures and long-term dependencies among words in the passage is essential for discourse-level relation extraction (DRE) tasks.

Natural Language Understanding Relation Extraction

ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning

2 code implementations EMNLP 2021 Rujun Han, Xiang Ren, Nanyun Peng

While pre-trained language models (PTLMs) have achieved noticeable success on many NLP tasks, they still struggle with tasks that require event temporal reasoning, which is essential for event-centric applications.

Fine-tuning Language Modelling +4

A Paragraph-level Multi-task Learning Model for Scientific Fact-Verification

1 code implementation 28 Dec 2020 Xiangci Li, Gully Burns, Nanyun Peng

Even for domain experts, it is a non-trivial task to verify a scientific claim by providing supporting or refuting evidence rationales.

Fact Verification Misinformation +2

Detecting Social Media Manipulation in Low-Resource Languages

no code implementations 10 Nov 2020 Samar Haider, Luca Luceri, Ashok Deb, Adam Badawy, Nanyun Peng, Emilio Ferrara

Social media have been deliberately used for malicious purposes, including political manipulation and disinformation.

Transfer Learning

"Nice Try, Kiddo": Investigating Ad Hominems in Dialogue Responses

1 code implementation 24 Oct 2020 Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng

Ad hominem attacks are those that target some feature of a person's character instead of the position the person is maintaining.

Abusive Language

GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction

1 code implementation 6 Oct 2020 Wasi Uddin Ahmad, Nanyun Peng, Kai-Wei Chang

Recent progress in cross-lingual relation and event extraction uses graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations such that models trained on one language can be applied to other languages.

Event Extraction Graph Attention

Content Planning for Neural Story Generation with Aristotelian Rescoring

1 code implementation EMNLP 2020 Seraphina Goldfarb-Tarrant, Tuhin Chakrabarty, Ralph Weischedel, Nanyun Peng

Long-form narrative text generated from large language models manages a fluent impersonation of human writing, but only at the local sentence level, and lacks structure or global cohesion.

Language Modelling Story Generation

Biomedical Event Extraction with Hierarchical Knowledge Graphs

1 code implementation Findings of the Association for Computational Linguistics 2020 Kung-Hsiang Huang, Mu Yang, Nanyun Peng

To better recognize the trigger words, each sentence is first grounded to a sentence graph based on a jointly modeled hierarchical knowledge graph from UMLS.

Event Extraction

Generating similes effortlessly like a Pro: A Style Transfer Approach for Simile Generation

1 code implementation EMNLP 2020 Tuhin Chakrabarty, Smaranda Muresan, Nanyun Peng

We also show how replacing literal sentences with similes from our best model in machine generated stories improves evocativeness and leads to better acceptance by human judges.

Common Sense Reasoning Style Transfer

TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions

no code implementations EMNLP 2020 Qiang Ning, Hao Wu, Rujun Han, Nanyun Peng, Matt Gardner, Dan Roth

A critical part of reading is being able to understand the temporal relationships between events described in a passage of text, even when those relationships are not explicitly stated.

Machine Reading Comprehension

Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems

2 code implementations 4 Nov 2019 Sarik Ghazarian, Ralph Weischedel, Aram Galstyan, Nanyun Peng

In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, predictive engagement, for automatic evaluation of open-domain dialogue systems.

Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects

no code implementations IJCNLP 2019 James Mullenbach, Jonathan Gordon, Nanyun Peng, Jonathan May

This provides evidence that the amount of commonsense knowledge encoded in these language models does not extend far beyond that already baked into the word embeddings.

Word Embeddings

Man is to Person as Woman is to Location: Measuring Gender Bias in Named Entity Recognition

1 code implementation 24 Oct 2019 Ninareh Mehrabi, Thamme Gowda, Fred Morstatter, Nanyun Peng, Aram Galstyan

We study the bias in several state-of-the-art named entity recognition (NER) models, specifically a difference in the ability to recognize male and female names as PERSON entity types.

Named Entity Recognition NER

Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages

1 code implementation CONLL 2019 Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, Nanyun Peng

We conduct experiments on cross-lingual dependency parsing where we train a dependency parser on a source language and transfer it to a wide range of target languages.

Cross-Lingual Transfer Dependency Parsing +2

Espresso: A Fast End-to-end Neural Speech Recognition Toolkit

1 code implementation 18 Sep 2019 Yiming Wang, Tongfei Chen, Hainan Xu, Shuoyang Ding, Hang Lv, Yiwen Shao, Nanyun Peng, Lei Xie, Shinji Watanabe, Sanjeev Khudanpur

We present Espresso, an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning library PyTorch and the popular neural machine translation toolkit fairseq.

Ranked #1 on Speech Recognition on Hub5'00 SwitchBoard (Eval2000 metric)

automatic-speech-recognition Data Augmentation +4

Scientific Discourse Tagging for Evidence Extraction

1 code implementation EACL 2021 Xiangci Li, Gully Burns, Nanyun Peng

We apply richly contextualized deep representation learning pre-trained on biomedical domain corpus to the analysis of scientific discourse structures and the extraction of "evidence fragments" (i.e., the text in the results section describing data presented in a specified subfigure) from a set of biomedical experimental research articles.

Representation Learning

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

no code implementations IJCNLP 2019 Xiaolei Huang, Jonathan May, Nanyun Peng

While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred.

Cross-Lingual NER Named Entity Recognition +2

Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing

1 code implementation IJCNLP 2019 Tao Meng, Nanyun Peng, Kai-Wei Chang

Experiments show that the Lagrangian relaxation and posterior regularization inference improve the performances on 15 and 17 out of 19 target languages, respectively.

Dependency Parsing

The Woman Worked as a Babysitter: On Biases in Language Generation

1 code implementation IJCNLP 2019 Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng

We present a systematic study of biases in natural language generation (NLG) by analyzing text generated from prompts that contain mentions of different demographic groups.

Language Modelling Text Generation +1

Contextualized Word Embeddings Enhanced Event Temporal Relation Extraction for Story Understanding

no code implementations 26 Apr 2019 Rujun Han, Mengyue Liang, Bashar Alhafni, Nanyun Peng

In this work, we establish strong baselines for event temporal relation extraction on two under-explored story narrative datasets: Richer Event Description (RED) and Causal and Temporal Relation Scheme (CaTeRS).

Relation Extraction Word Embeddings

Pun Generation with Surprise

2 code implementations NAACL 2019 He He, Nanyun Peng, Percy Liang

We tackle the problem of generating a pun sentence given a pair of homophones (e.g., "died" and "dyed").

Language Modelling Text Generation

Plan, Write, and Revise: an Interactive System for Open-Domain Story Generation

1 code implementation NAACL 2019 Seraphina Goldfarb-Tarrant, Haining Feng, Nanyun Peng

We compare different varieties of interaction in story-writing, story-planning, and diversity controls under time constraints, and show that increasing the types of human collaboration at both planning and writing stages results in a 10-50% improvement in story quality as compared to less interactive baselines.

Story Generation

Plan-And-Write: Towards Better Automatic Storytelling

1 code implementation 14 Nov 2018 Lili Yao, Nanyun Peng, Ralph Weischedel, Kevin Knight, Dongyan Zhao, Rui Yan

Automatic storytelling is challenging since it requires generating long, coherent natural language that describes a sensible sequence of events.

Story Generation

Scalable Construction and Reasoning of Massive Knowledge Bases

no code implementations NAACL 2018 Xiang Ren, Nanyun Peng, William Yang Wang

In today's information-based society, abundant knowledge is carried in natural language texts (e.g., news articles, social media posts, scientific publications) that span various domains (e.g., corporate documents, advertisements, legal acts, medical reports) and grow at an astonishing rate.

Towards Controllable Story Generation

no code implementations WS 2018 Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight

We present a general framework of analyzing existing story corpora to generate controllable and creative new stories.

Story Generation

Stack-Pointer Networks for Dependency Parsing

3 code implementations ACL 2018 Xuezhe Ma, Zecong Hu, Jingzhou Liu, Nanyun Peng, Graham Neubig, Eduard Hovy

Combining pointer networks (Vinyals et al., 2015) with an internal stack, the proposed model first reads and encodes the whole sentence, then builds the dependency tree top-down (from root-to-leaf) in a depth-first fashion.

Dependency Parsing

Style Transfer in Text: Exploration and Evaluation

2 code implementations 18 Nov 2017 Zhenxin Fu, Xiaoye Tan, Nanyun Peng, Dongyan Zhao, Rui Yan

Results show that the proposed content preservation metric is highly correlated with human judgments, and that the proposed models are able to generate sentences with higher style transfer strength and a similar content preservation score compared to an auto-encoder.

Style Transfer Text Style Transfer

A Multi-task Learning Approach to Adapting Bilingual Word Embeddings for Cross-lingual Named Entity Recognition

no code implementations IJCNLP 2017 Dingquan Wang, Nanyun Peng, Kevin Duh

We show how to adapt bilingual word embeddings (BWEs) to bootstrap a cross-lingual name-entity recognition (NER) system in a language with no labeled data.

Cross-Lingual Transfer Multi-Task Learning +3

Multi-task Domain Adaptation for Sequence Tagging

no code implementations WS 2017 Nanyun Peng, Mark Dredze

Many domain adaptation approaches rely on learning cross domain shared representations to transfer the knowledge learned in one domain to other domains.

Chinese Word Segmentation Domain Adaptation +2

Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning

no code implementations ACL 2016 Nanyun Peng, Mark Dredze

Named entity recognition, and other information extraction tasks, frequently use linguistic features such as part of speech tags or chunkings.

Named Entity Recognition NER +1

Modeling Word Forms Using Latent Underlying Morphs and Phonology

no code implementations TACL 2015 Ryan Cotterell, Nanyun Peng, Jason Eisner

Given some surface word types of a concatenative language along with the abstract morpheme sequences that they express, we show how to recover consistent underlying forms for these morphemes, together with the (stochastic) phonology that maps each concatenation of underlying forms to a surface form.
