Search Results for author: Jonathan May

Found 84 papers, 31 papers with code

Mega: Moving Average Equipped Gated Attention

5 code implementations 21 Sep 2022 Xuezhe Ma, Chunting Zhou, Xiang Kong, Junxian He, Liangke Gui, Graham Neubig, Jonathan May, Luke Zettlemoyer

The design choices in the Transformer attention mechanism, including weak inductive bias and quadratic computational complexity, have limited its application for modeling long sequences.

Image Classification Inductive Bias +3

Know Thy Strengths: Comprehensive Dialogue State Tracking Diagnostics

2 code implementations 15 Dec 2021 Hyundong Cho, Chinnadhurai Sankar, Christopher Lin, Kaushik Ram Sadagopan, Shahin Shayandeh, Asli Celikyilmaz, Jonathan May, Ahmad Beirami

Recent works that revealed the vulnerability of dialogue state tracking (DST) models to distributional shifts have made holistic comparisons on robustness and qualitative analyses increasingly important for understanding their relative performance.

Ranked #4 on Multi-domain Dialogue State Tracking on MULTIWOZ 2.1 (using extra training data)

Dialogue State Tracking Multi-domain Dialogue State Tracking +1

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

1 code implementation 12 Apr 2024 Xuezhe Ma, Xiaomeng Yang, Wenhan Xiong, Beidi Chen, Lili Yu, Hao Zhang, Jonathan May, Luke Zettlemoyer, Omer Levy, Chunting Zhou

The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and state space models exist, they empirically underperform Transformers in pretraining efficiency and downstream task accuracy.

Transfer Learning for Low-Resource Neural Machine Translation

1 code implementation EMNLP 2016 Barret Zoph, Deniz Yuret, Jonathan May, Kevin Knight

Ensembling and unknown word replacement add another 2 BLEU, which brings the NMT performance on low-resource machine translation close to a strong syntax-based machine translation (SBMT) system, exceeding its performance on one language pair.

Low-Resource Neural Machine Translation NMT +2

WARP: Word-level Adversarial ReProgramming

1 code implementation ACL 2021 Karen Hambardzumyan, Hrant Khachatrian, Jonathan May

Transfer learning from pretrained language models recently became the dominant approach for solving many NLP tasks.

Language Modelling Transfer Learning +1

Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation

1 code implementation EMNLP 2021 Mozhdeh Gheini, Xiang Ren, Jonathan May

We study the power of cross-attention in the Transformer architecture within the context of transfer learning for machine translation, and extend the findings of studies into cross-attention when training from scratch.

Machine Translation Transfer Learning +1

NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge

1 code implementation 14 Jun 2022 Alexander Spangher, Xiang Ren, Jonathan May, Nanyun Peng

News article revision histories provide clues to narrative and factual evolution in news articles.

RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation

1 code implementation 12 Jun 2023 Shuai Liu, Hyundong J. Cho, Marjorie Freedman, Xuezhe Ma, Jonathan May

Endowing chatbots with a consistent persona is essential to an engaging conversation, yet it remains an unresolved challenge.

Response Generation Retrieval

Grounding Conversations with Improvised Dialogues

1 code implementation ACL 2020 Hyundong Cho, Jonathan May

Effective dialogue involves grounding, the process of establishing mutual knowledge that is essential for communication between people.

Many-to-English Machine Translation Tools, Data, and Pretrained Models

2 code implementations ACL 2021 Thamme Gowda, Zhao Zhang, Chris A Mattmann, Jonathan May

While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages.

Machine Translation Transfer Learning +1

A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages

1 code implementation NAACL 2019 Ronald Cardenas, Ying Lin, Heng Ji, Jonathan May

We also show extrinsically that incorporating our POS tagger into a name tagger leads to state-of-the-art tagging performance in Sinhalese and Kinyarwanda, two languages with nearly no labeled POS data available.

Clustering Decipherment +4

WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models

1 code implementation 26 Jun 2023 Virginia K. Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May

We present WinoQueer: a benchmark specifically designed to measure whether large language models (LLMs) encode biases that are harmful to the LGBTQ+ community.

Finding the Optimal Vocabulary Size for Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Thamme Gowda, Jonathan May

We cast neural machine translation (NMT) as a classification task in an autoregressive setting and analyze the limitations of both classification and autoregression components.

Classification General Classification +3

Opponent Modeling in Negotiation Dialogues by Related Data Adaptation

1 code implementation Findings (NAACL) 2022 Kushal Chawla, Gale M. Lucas, Jonathan May, Jonathan Gratch

A practical model for this task needs to infer these priorities of the opponent on the fly based on partial dialogues as input, without needing additional annotations for training.

Cross-lingual Lifelong Learning

1 code implementation 23 May 2022 Meryem M'hamdi, Xiang Ren, Jonathan May

The longstanding goal of multi-lingual learning has been to develop a universal cross-lingual model that can withstand the changes in multi-lingual data distributions.

Continual Learning Transfer Learning

Continual Dialogue State Tracking via Example-Guided Question Answering

1 code implementation 23 May 2023 Hyundong Cho, Andrea Madotto, Zhaojiang Lin, Khyathi Raghavi Chandu, Satwik Kottur, Jing Xu, Jonathan May, Chinnadhurai Sankar

Dialogue systems are frequently updated to accommodate new services, but naively updating them by continually training with data for new services results in diminishing performance on previously learnt services.

Continual Learning Dialogue State Tracking +3

Comprehensible Context-driven Text Game Playing

2 code implementations 6 May 2019 Xusen Yin, Jonathan May

As such, an LSTM-based DQN can take tens of days to finish the training process.

Q-Learning

Learn How to Cook a New Recipe in a New House: Using Map Familiarization, Curriculum Learning, and Bandit Feedback to Learn Families of Text-Based Adventure Games

1 code implementation 13 Aug 2019 Xusen Yin, Jonathan May

We consider the task of learning to play families of text-based computer adventure games, i.e., fully textual environments with a common theme (e.g., cooking) and goal (e.g., prepare a meal from a recipe) but with different specifics; new instances of such games are relatively straightforward for humans to master after a brief exposure to the genre but have been curiously difficult for computer agents to learn.

Common Sense Reasoning Q-Learning

Challenges in Context-Aware Neural Machine Translation

1 code implementation 23 May 2023 Linghao Jin, Jacqueline He, Jonathan May, Xuezhe Ma

Context-aware neural machine translation involves leveraging information beyond sentence-level context to resolve inter-sentential discourse dependencies and improve document-level translation quality, and has given rise to a number of recent techniques.

Machine Translation Sentence +1

Experience Grounds Language

2 code implementations EMNLP 2020 Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph Turian

Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.

Representation Learning

Macro-Average: Rare Types Are Important Too

1 code implementation NAACL 2021 Thamme Gowda, Weiqiu You, Constantine Lignos, Jonathan May

While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy.

Cross-Lingual Information Retrieval Machine Translation +2

"Don't quote me on that": Finding Mixtures of Sources in News Articles

1 code implementation 19 Apr 2021 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Journalists publish statements provided by people, or sources, to contextualize current events, help voters make informed decisions, and hold powerful individuals accountable.

Clustering

Identifying Informational Sources in News Articles

1 code implementation 24 May 2023 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

News articles are driven by the informational sources journalists use in reporting.

Text Generation

Learning to Generalize for Sequential Decision Making

1 code implementation Findings of the Association for Computational Linguistics 2020 Xusen Yin, Ralph Weischedel, Jonathan May

However, the large amount of computation necessary to adequately train and explore the search space of sequential decision making, under a reinforcement learning paradigm, precludes the inclusion of large contextualized language models, which might otherwise enable the desired generalization ability.

Imitation Learning Natural Language Understanding +2

Machine Translation Robustness to Natural Asemantic Variation

1 code implementation 25 May 2022 Jacob Bremerman, Xiang Ren, Jonathan May

We find that existing MT models fail when presented with NAV data, but we demonstrate strategies to improve performance on NAV by fine-tuning them with human-generated variations.

Machine Translation Translation

Authorship Style Transfer with Policy Optimization

1 code implementation 12 Mar 2024 Shuai Liu, Shantanu Agarwal, Jonathan May

Authorship style transfer aims to rewrite a given text into a specified target while preserving the original meaning in the source.

Style Transfer Transfer Learning

Recurrent Neural Networks as Weighted Language Recognizers

no code implementations NAACL 2018 Yining Chen, Sorcha Gilroy, Andreas Maletti, Jonathan May, Kevin Knight

We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages.

Out-of-the-box Universal Romanization Tool uroman

no code implementations ACL 2018 Ulf Hermjakob, Jonathan May, Kevin Knight

We present uroman, a tool for converting text in myriads of languages and scripts such as Chinese, Arabic and Cyrillic into a common Latin-script representation.

Machine Translation

Translating a Language You Don't Know In the Chinese Room

no code implementations ACL 2018 Ulf Hermjakob, Jonathan May, Michael Pust, Kevin Knight

In a corruption of John Searle's famous AI thought experiment, the Chinese Room (Searle, 1980), we twist its original intent by enabling humans to translate text, e.g., from Uyghur to English, even if they don't have any prior knowledge of the source language.

Domain Adaptation Language Modelling +3

Cross-lingual Name Tagging and Linking for 282 Languages

no code implementations ACL 2017 Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji

The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia.

Translation Word Translation

ELISA-EDL: A Cross-lingual Entity Extraction, Linking and Localization System

no code implementations NAACL 2018 Boliang Zhang, Ying Lin, Xiaoman Pan, Di Lu, Jonathan May, Kevin Knight, Heng Ji

We demonstrate ELISA-EDL, a state-of-the-art re-trainable system to extract entity mentions from low-resource languages, link them to external English knowledge bases, and visualize locations related to disaster topics on a world heatmap.

Entity Extraction using GAN Entity Linking +1

SemEval-2017 Task 9: Abstract Meaning Representation Parsing and Generation

no code implementations SEMEVAL 2017 Jonathan May, Jay Priyadarshi

In the generation subtask, participants were asked to generate English sentences given AMR graphs in the news/forum domain.

AMR Parsing Machine Translation

Towards Controllable Story Generation

no code implementations WS 2018 Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight

We present a general framework of analyzing existing story corpora to generate controllable and creative new stories.

Story Generation

An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output

no code implementations LREC 2012 Daniele Pighin, Lluís Màrquez, Jonathan May

We present an annotated resource consisting of open-domain translation requests, automatic translations and user-provided corrections collected from casual users of the translation portal http://reverso.net.

Machine Translation Translation

Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation

no code implementations ACL 2019 Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, Jonathan May

Given a rough, word-by-word gloss of a source language sentence, target language natives can uncover the latent, fully-fluent rendering of the translation.

Sentence Translation +2

SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage

no code implementations ACL 2019 Elizabeth Boschee, Joel Barry, Jayadev Billa, Marjorie Freedman, Thamme Gowda, Constantine Lignos, Chester Palen-Michel, Michael Pust, Banriskhem Kayang Khonglah, Srikanth Madikeri, Jonathan May, Scott Miller

In this paper we present an end-to-end cross-lingual information retrieval (CLIR) and summarization system for low-resource languages that 1) enables English speakers to search foreign language repositories of text and audio using English queries, 2) summarizes the retrieved documents in English with respect to a particular information need, and 3) provides complete transcriptions and translations as needed.

Cross-Lingual Information Retrieval Machine Translation +2

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

no code implementations IJCNLP 2019 Xiaolei Huang, Jonathan May, Nanyun Peng

While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred.

Cross-Lingual NER named-entity-recognition +3

A Universal Parent Model for Low-Resource Neural Machine Translation Transfer

no code implementations 14 Sep 2019 Mozhdeh Gheini, Jonathan May

In this work, we present a `universal' pre-trained neural parent model with constant vocabulary that can be used as a starting point for training practically any new low-resource language to a fixed target language.

Humanitarian Low-Resource Neural Machine Translation +2

Cross-lingual Joint Entity and Word Embedding to Improve Entity Linking and Parallel Sentence Mining

no code implementations WS 2019 Xiaoman Pan, Thamme Gowda, Heng Ji, Jonathan May, Scott Miller

Because this multilingual common space directly relates the semantics of contextual words in the source language to that of entities in the target language, we leverage it for unsupervised cross-lingual entity linking.

Cross-Lingual Entity Linking Entity Linking +1

Contextualized Cross-Lingual Event Trigger Extraction with Minimal Resources

no code implementations CONLL 2019 Meryem M'hamdi, Marjorie Freedman, Jonathan May

Our work is the first to experiment with two event architecture variants in a cross-lingual setting, to show the effectiveness of contextualized embeddings obtained using BERT, and to explore and analyze its performance on Arabic.

Event Extraction Transfer Learning

Cross-lingual Structure Transfer for Relation and Event Extraction

no code implementations IJCNLP 2019 Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, Clare Voss

The identification of complex semantic structures such as events and entity relations, already a challenging Information Extraction task, is doubly difficult from sources written in under-resourced and under-annotated languages.

Event Extraction Relation +1

Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects

no code implementations IJCNLP 2019 James Mullenbach, Jonathan Gordon, Nanyun Peng, Jonathan May

This provides evidence that the amount of commonsense knowledge encoded in these language models does not extend far beyond that already baked into the word embeddings.

Word Embeddings

Zero-Shot Learning of Text Adventure Games with Sentence-Level Semantics

no code implementations 6 Apr 2020 Xusen Yin, Jonathan May

Reinforcement learning algorithms such as Q-learning have shown great promise in training models to learn the optimal action to take for a given system state, a goal in applications with an exploratory or adversarial nature such as task-oriented dialogues or games.

Clustering Q-Learning +2

Multitask Learning for Class-Imbalanced Discourse Classification

no code implementations 2 Jan 2021 Alexander Spangher, Jonathan May, Sz-Rung Shiang, Lingjia Deng

Small class-imbalanced datasets, common in many high-level semantic tasks like discourse analysis, present a particular challenge to current deep-learning architectures.

Classification General Classification +1

NewsEdits: A Dataset of Revision Histories for News Articles (Technical Report: Data Processing)

no code implementations 19 Apr 2021 Alexander Spangher, Jonathan May

In this work, we present, to our knowledge, the first publicly available dataset of news article revision histories, or NewsEdits.

Modeling "Newsworthiness" for Lead-Generation Across Corpora

no code implementations 19 Apr 2021 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Journalists obtain "leads", or story ideas, by reading large corpora of government records: court cases, proposed bills, etc.

StateCensusLaws.org: A Web Application for Consuming and Annotating Legal Discourse Learning

no code implementations 20 Apr 2021 Alexander Spangher, Jonathan May

In this work, we create a web application to highlight the output of NLP models trained to parse and label discourse segments in law text.

Viola: A Topic Agnostic Generate-and-Rank Dialogue System

no code implementations 25 Aug 2021 Hyundong Cho, Basel Shbita, Kartik Shenoy, Shuai Liu, Nikhil Patel, Hitesh Pindikanti, Jennifer Lee, Jonathan May

We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach.

Salience-Aware Event Chain Modeling for Narrative Understanding

no code implementations EMNLP 2021 Xiyang Zhang, Muhao Chen, Jonathan May

Storytelling, whether via fables, news reports, documentaries, or memoirs, can be thought of as the communication of interesting and related events that, taken together, form a concrete process.

Question Answering

Segmenting Numerical Substitution Ciphers

no code implementations 25 May 2022 Nada Aldarrab, Jonathan May

In this work, we propose the first automatic methods to segment those ciphers using Byte Pair Encoding (BPE) and unigram language models.

Language Modelling Segmentation

Know Where You're Going: Meta-Learning for Parameter-Efficient Fine-Tuning

no code implementations 25 May 2022 Mozhdeh Gheini, Xuezhe Ma, Jonathan May

A recent family of techniques, dubbed lightweight fine-tuning methods, facilitates parameter-efficient transfer learning by updating only a small set of additional parameters while keeping the parameters of the pretrained language model frozen.

Cross-Lingual NER Language Modelling +3

Investigating the Benefits of Free-Form Rationales

no code implementations 25 May 2022 Jiao Sun, Swabha Swayamdipta, Jonathan May, Xuezhe Ma

After controlling for instances where rationales leak the correct answer while not providing additional background knowledge, we find that incorporating only 5% of rationales during training can boost model performance by 47.22% for CoS-E and 57.14% for ECQA during inference.

Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models

no code implementations 23 Jun 2022 Virginia K. Felkner, Ho-Chun Herbert Chang, Eugene Jang, Jonathan May

This paper presents exploratory work on whether and to what extent biases against queer and trans people are encoded in large language models (LLMs) such as BERT.

Bias Detection

Augmenting Training Data for Massive Semantic Matching Models in Low-Traffic E-commerce Stores

no code implementations NAACL (ACL) 2022 Ashutosh Joshi, Shankar Vishwanath, Choon Teo, Vaclav Petricek, Vishy Vishwanathan, Rahul Bhagat, Jonathan May

We use the Aggregated Label eXtreme Multi-label Classification (AL-XMC) system (Shen et al., 2020) as an example semantic matching model and show via crowd-sourced human judgments that, when the training data is augmented through query reformulations, the quality of AL-XMC improves over a baseline that does not use query reformulation.

Extreme Multi-Label Classification

Checks and Strategies for Enabling Code-Switched Machine Translation

no code implementations 11 Oct 2022 Thamme Gowda, Mozhdeh Gheini, Jonathan May

Code-switching is a common phenomenon among multilingual speakers, where alternation between two or more languages occurs within the context of a single conversation.

Data Augmentation Machine Translation +2

Anger Breeds Controversy: Analyzing Controversy and Emotions on Reddit

no code implementations 1 Dec 2022 Kai Chen, Zihao He, Rong-Ching Chang, Jonathan May, Kristina Lerman

We collect discussions from a wide variety of topical forums and use emotion detection to recognize a range of emotions from text, including anger, fear, joy, admiration, etc.

CPL-NoViD: Context-Aware Prompt-based Learning for Norm Violation Detection in Online Communities

1 code implementation 16 May 2023 Zihao He, Jonathan May, Kristina Lerman

Detecting norm violations in online communities is critical to maintaining healthy and safe spaces for online discussions.

Few-Shot Learning

Analyzing Norm Violations in Live-Stream Chat

no code implementations 18 May 2023 Jihyung Moon, Dong-Ho Lee, Hyundong Cho, Woojeong Jin, Chan Young Park, Minwoo Kim, Jonathan May, Jay Pujara, Sungjoon Park

Previous approaches to detecting toxic language and norm violations have been primarily concerned with conversations from online forums and social media, such as Reddit and Twitter.

Multilingual Sentence-Level Semantic Search using Meta-Distillation Learning

no code implementations 15 Sep 2023 Meryem M'hamdi, Jonathan May, Franck Dernoncourt, Trung Bui, Seunghyun Yoon

Our approach leverages meta-distillation learning based on MAML, an optimization-based Model-Agnostic Meta-Learner.

Sentence

Tracking the Newsworthiness of Public Documents

no code implementations 16 Nov 2023 Alexander Spangher, Emilio Ferrara, Ben Welsh, Nanyun Peng, Serdar Tumgoren, Jonathan May

Journalists must find stories in huge amounts of textual data (e.g., leaks, bills, press releases) as part of their jobs: determining when and why text becomes news can help us understand coverage patterns and help us build assistive tools.

Retrieval

Can Language Model Moderators Improve the Health of Online Discourse?

no code implementations 16 Nov 2023 Hyundong Cho, Shuai Liu, Taiwei Shi, Darpan Jain, Basem Rizk, YuYang Huang, Zixun Lu, Nuan Wen, Jonathan Gratch, Emilio Ferrara, Jonathan May

Human moderation of online conversation is essential to maintaining civility and focus in a dialogue, but is challenging to scale and harmful to moderators.

Language Modelling Text Generation
