Search Results for author: Jonathan May

Found 56 papers, 15 papers with code

Salience-Aware Event Chain Modeling for Narrative Understanding

no code implementations22 Sep 2021 Xiyang Zhang, Muhao Chen, Jonathan May

Storytelling, whether via fables, news reports, documentaries, or memoirs, can be thought of as the communication of interesting and related events that, taken together, form a concrete process.

Question Answering

Viola: A Topic Agnostic Generate-and-Rank Dialogue System

no code implementations25 Aug 2021 Hyundong Cho, Basel Shbita, Kartik Shenoy, Shuai Liu, Nikhil Patel, Hitesh Pindikanti, Jennifer Lee, Jonathan May

We present Viola, an open-domain dialogue system for spoken conversation that uses a topic-agnostic dialogue manager based on a simple generate-and-rank approach.

Luna: Linear Unified Nested Attention

2 code implementations3 Jun 2021 Xuezhe Ma, Xiang Kong, Sinong Wang, Chunting Zhou, Jonathan May, Hao Ma, Luke Zettlemoyer

Specifically, with the first attention function, Luna packs the input sequence into a sequence of fixed length.

Language Modelling Machine Translation
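The snippet above describes Luna's first (pack) attention step; the model then lets the input attend back over the packed sequence. As a hedged illustration only (this is not the authors' code; all names and dimensions here are invented for the sketch), the two nested attention steps can be sketched in numpy:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    # standard scaled dot-product attention
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

# toy dimensions: sequence length n, model dim d, fixed pack length l
n, d, l = 128, 16, 8
rng = np.random.default_rng(0)
X = rng.normal(size=(n, d))   # input sequence
P = rng.normal(size=(l, d))   # extra sequence of fixed length l

# step 1 (pack): P attends over X, compressing it to length l
packed = attend(P, X, X)      # shape (l, d)

# step 2 (unpack): X attends over the short packed sequence
out = attend(X, packed, packed)  # shape (n, d)

print(packed.shape, out.shape)
```

Because both attention calls score against a sequence of fixed length l (or produce one), the cost is O(n·l) rather than the O(n²) of full self-attention.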

A Web Application for Consuming and Annotating Legal Discourse Learning

no code implementations20 Apr 2021 Alexander Spangher, Jonathan May

In this work, we create a web application to highlight the output of NLP models trained to parse and label discourse segments in law text.

NewsEdits: A Dataset of Revision Histories for News Articles (Technical Report: Data Processing)

no code implementations19 Apr 2021 Alexander Spangher, Jonathan May

In this work, we present, to our knowledge, the first publicly available dataset of news article revision histories, or NewsEdits.

Modeling "Newsworthiness" for Lead-Generation Across Corpora

no code implementations19 Apr 2021 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Journalists obtain "leads", or story ideas, by reading large corpora of government records: court cases, proposed bills, etc.

"Don't quote me on that": Finding Mixtures of Sources in News Articles

no code implementations19 Apr 2021 Alexander Spangher, Nanyun Peng, Jonathan May, Emilio Ferrara

Journalists publish statements provided by people, or sources, to contextualize current events, help voters make informed decisions, and hold powerful individuals accountable.

Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation

1 code implementation18 Apr 2021 Mozhdeh Gheini, Xiang Ren, Jonathan May

We study the power of cross-attention in the Transformer architecture within the context of transfer learning for machine translation, and extend the findings of studies into cross-attention when training from scratch.

Machine Translation Transfer Learning

Macro-Average: Rare Types Are Important Too

1 code implementation NAACL 2021 Thamme Gowda, Weiqiu You, Constantine Lignos, Jonathan May

While traditional corpus-level evaluation metrics for machine translation (MT) correlate well with fluency, they struggle to reflect adequacy.

Information Retrieval Machine Translation
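The entry above contrasts corpus-level MT metrics with macro-averaging over word types, which gives rare types equal weight. As a simplified sketch in that spirit (not the paper's exact metric; the clipped-unigram matching here is an assumption of this illustration), a macro-averaged F1 over types could look like:

```python
from collections import Counter

def f1(p, r):
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def macro_f1(hyp_tokens, ref_tokens):
    # per-type precision/recall from clipped counts, averaged
    # uniformly over types so rare types count as much as frequent ones
    h, r = Counter(hyp_tokens), Counter(ref_tokens)
    types = set(h) | set(r)
    scores = []
    for t in types:
        match = min(h[t], r[t])
        prec = match / h[t] if h[t] else 0.0
        rec = match / r[t] if r[t] else 0.0
        scores.append(f1(prec, rec))
    return sum(scores) / len(scores)

print(round(macro_f1("the cat sat".split(),
                     "the cat sat on the mat".split()), 3))
```

A frequency-weighted (micro) average would let the frequent type "the" dominate; the uniform average over types is what makes rare, content-bearing types matter.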

Many-to-English Machine Translation Tools, Data, and Pretrained Models

2 code implementations ACL 2021 Thamme Gowda, Zhao Zhang, Chris A Mattmann, Jonathan May

While there are more than 7000 languages in the world, most translation research efforts have targeted a few high-resource languages.

Machine Translation Transfer Learning

Multitask Learning for Class-Imbalanced Discourse Classification

no code implementations2 Jan 2021 Alexander Spangher, Jonathan May, Sz-Rung Shiang, Lingjia Deng

Small class-imbalanced datasets, common in many high-level semantic tasks like discourse analysis, present a particular challenge to current deep-learning architectures.

Classification General Classification

WARP: Word-level Adversarial ReProgramming

1 code implementation ACL 2021 Karen Hambardzumyan, Hrant Khachatrian, Jonathan May

Transfer learning from pretrained language models recently became the dominant approach for solving many NLP tasks.

Language Modelling Transfer Learning +1

Learning to Generalize for Sequential Decision Making

1 code implementation Findings of the Association for Computational Linguistics 2020 Xusen Yin, Ralph Weischedel, Jonathan May

However, the large amount of computation necessary to adequately train and explore the search space of sequential decision making, under a reinforcement learning paradigm, precludes the inclusion of large contextualized language models, which might otherwise enable the desired generalization ability.

Decision Making Imitation Learning +1

Experience Grounds Language

2 code implementations EMNLP 2020 Yonatan Bisk, Ari Holtzman, Jesse Thomason, Jacob Andreas, Yoshua Bengio, Joyce Chai, Mirella Lapata, Angeliki Lazaridou, Jonathan May, Aleksandr Nisnevich, Nicolas Pinto, Joseph Turian

Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.

Representation Learning

Grounding Conversations with Improvised Dialogues

1 code implementation ACL 2020 Hyundong Cho, Jonathan May

Effective dialogue involves grounding, the process of establishing mutual knowledge that is essential for communication between people.

Exploring Early Prediction of Buyer-Seller Negotiation Outcomes

no code implementations6 Apr 2020 Kushal Chawla, Gale Lucas, Jonathan May, Jonathan Gratch

Agents that negotiate with humans find broad applications in pedagogy and conversational AI.

Language Modelling

Zero-Shot Learning of Text Adventure Games with Sentence-Level Semantics

no code implementations6 Apr 2020 Xusen Yin, Jonathan May

Reinforcement learning algorithms such as Q-learning have shown great promise in training models to learn the optimal action to take for a given system state, a goal in applications with an exploratory or adversarial nature such as task-oriented dialogues or games.

Q-Learning Zero-Shot Learning

Finding the Optimal Vocabulary Size for Neural Machine Translation

1 code implementation Findings of the Association for Computational Linguistics 2020 Thamme Gowda, Jonathan May

We cast neural machine translation (NMT) as a classification task in an autoregressive setting and analyze the limitations of both classification and autoregression components.

Classification General Classification +1

Cross-lingual Structure Transfer for Relation and Event Extraction

no code implementations IJCNLP 2019 Ananya Subburathinam, Di Lu, Heng Ji, Jonathan May, Shih-Fu Chang, Avirup Sil, Clare Voss

The identification of complex semantic structures such as events and entity relations, already a challenging Information Extraction task, is doubly difficult from sources written in under-resourced and under-annotated languages.

Event Extraction Relation Extraction

Contextualized Cross-Lingual Event Trigger Extraction with Minimal Resources

no code implementations CoNLL 2019 Meryem M'hamdi, Marjorie Freedman, Jonathan May

Our work is the first to experiment with two event architecture variants in a cross-lingual setting, to show the effectiveness of contextualized embeddings obtained using BERT, and to explore and analyze its performance on Arabic.

Event Extraction Transfer Learning

Cross-lingual Joint Entity and Word Embedding to Improve Entity Linking and Parallel Sentence Mining

no code implementations WS 2019 Xiaoman Pan, Thamme Gowda, Heng Ji, Jonathan May, Scott Miller

Because this multilingual common space directly relates the semantics of contextual words in the source language to that of entities in the target language, we leverage it for unsupervised cross-lingual entity linking.

Cross-Lingual Entity Linking Entity Linking

Do Nuclear Submarines Have Nuclear Captains? A Challenge Dataset for Commonsense Reasoning over Adjectives and Objects

no code implementations IJCNLP 2019 James Mullenbach, Jonathan Gordon, Nanyun Peng, Jonathan May

This provides evidence that the amount of commonsense knowledge encoded in these language models does not extend far beyond that already baked into the word embeddings.

Word Embeddings

A Universal Parent Model for Low-Resource Neural Machine Translation Transfer

no code implementations14 Sep 2019 Mozhdeh Gheini, Jonathan May

In this work, we present a "universal" pre-trained neural parent model with a constant vocabulary that can be used as a starting point for translating practically any new low-resource language into a fixed target language.

Low-Resource Neural Machine Translation Transfer Learning

What Matters for Neural Cross-Lingual Named Entity Recognition: An Empirical Analysis

no code implementations IJCNLP 2019 Xiaolei Huang, Jonathan May, Nanyun Peng

While recent work has shown promising results on cross-lingual transfer from high-resource languages to low-resource languages, it is unclear what knowledge is transferred.

Cross-Lingual NER Named Entity Recognition +2

Learn How to Cook a New Recipe in a New House: Using Map Familiarization, Curriculum Learning, and Bandit Feedback to Learn Families of Text-Based Adventure Games

1 code implementation13 Aug 2019 Xusen Yin, Jonathan May

We consider the task of learning to play families of text-based computer adventure games, i.e., fully textual environments with a common theme (e.g., cooking) and goal (e.g., prepare a meal from a recipe) but with different specifics; new instances of such games are relatively straightforward for humans to master after a brief exposure to the genre but have been curiously difficult for computer agents to learn.

Common Sense Reasoning Curriculum Learning +1

SARAL: A Low-Resource Cross-Lingual Domain-Focused Information Retrieval System for Effective Rapid Document Triage

no code implementations ACL 2019 Elizabeth Boschee, Joel Barry, Jayadev Billa, Marjorie Freedman, Thamme Gowda, Constantine Lignos, Chester Palen-Michel, Michael Pust, Banriskhem Kayang Khonglah, Srikanth Madikeri, Jonathan May, Scott Miller

In this paper we present an end-to-end cross-lingual information retrieval (CLIR) and summarization system for low-resource languages that 1) enables English speakers to search foreign language repositories of text and audio using English queries, 2) summarizes the retrieved documents in English with respect to a particular information need, and 3) provides complete transcriptions and translations as needed.

Information Retrieval Machine Translation

Translating Translationese: A Two-Step Approach to Unsupervised Machine Translation

no code implementations ACL 2019 Nima Pourdamghani, Nada Aldarrab, Marjan Ghazvininejad, Kevin Knight, Jonathan May

Given a rough, word-by-word gloss of a source language sentence, target language natives can uncover the latent, fully-fluent rendering of the translation.

Unsupervised Machine Translation

Comprehensible Context-driven Text Game Playing

2 code implementations6 May 2019 Xusen Yin, Jonathan May

As such, an LSTM-based DQN can take tens of days to finish the training process.


A Grounded Unsupervised Universal Part-of-Speech Tagger for Low-Resource Languages

1 code implementation NAACL 2019 Ronald Cardenas, Ying Lin, Heng Ji, Jonathan May

We also show extrinsically that incorporating our POS tagger into a name tagger leads to state-of-the-art tagging performance in Sinhalese and Kinyarwanda, two languages with nearly no labeled POS data available.

Decipherment Part-Of-Speech Tagging +1

Translating a Language You Don't Know In the Chinese Room

no code implementations ACL 2018 Ulf Hermjakob, Jonathan May, Michael Pust, Kevin Knight

In a corruption of John Searle's famous AI thought experiment, the Chinese Room (Searle, 1980), we twist its original intent by enabling humans to translate text, e.g., from Uyghur to English, even if they don't have any prior knowledge of the source language.

Domain Adaptation Language Modelling +1

Out-of-the-box Universal Romanization Tool uroman

no code implementations ACL 2018 Ulf Hermjakob, Jonathan May, Kevin Knight

We present uroman, a tool for converting text in myriads of languages and scripts such as Chinese, Arabic and Cyrillic into a common Latin-script representation.

Machine Translation
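uroman, described above, maps many scripts onto a common Latin-script representation. As a toy illustration of the idea only (this is not uroman's implementation, which relies on comprehensive per-script data tables; the tiny mapping below is invented for the sketch), character-level romanization amounts to a lookup:

```python
# toy character table for a handful of Cyrillic letters;
# the real tool covers a much wider range of scripts
CYRILLIC_TO_LATIN = {
    "П": "P", "п": "p", "р": "r", "и": "i",
    "в": "v", "е": "e", "т": "t",
}

def toy_romanize(text):
    # characters without an entry (e.g., Latin letters) pass through
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text)

print(toy_romanize("Привет"))  # -> Privet
```

The common Latin representation lets downstream tools treat names written in different scripts as comparable strings.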

ELISA-EDL: A Cross-lingual Entity Extraction, Linking and Localization System

no code implementations NAACL 2018 Boliang Zhang, Ying Lin, Xiaoman Pan, Di Lu, Jonathan May, Kevin Knight, Heng Ji

We demonstrate ELISA-EDL, a state-of-the-art re-trainable system to extract entity mentions from low-resource languages, link them to external English knowledge bases, and visualize locations related to disaster topics on a world heatmap.

Entity Extraction using GAN Entity Linking +1

Towards Controllable Story Generation

no code implementations WS 2018 Nanyun Peng, Marjan Ghazvininejad, Jonathan May, Kevin Knight

We present a general framework of analyzing existing story corpora to generate controllable and creative new stories.

Story Generation

Recurrent Neural Networks as Weighted Language Recognizers

no code implementations NAACL 2018 Yining Chen, Sorcha Gilroy, Andreas Maletti, Jonathan May, Kevin Knight

We investigate the computational complexity of various problems for simple recurrent neural networks (RNNs) as formal models for recognizing weighted languages.

SemEval-2017 Task 9: Abstract Meaning Representation Parsing and Generation

no code implementations SEMEVAL 2017 Jonathan May, Jay Priyadarshi

In the generation subtask, participants were asked to generate English sentences given AMR graphs in the news/forum domain.

AMR Parsing Machine Translation

Cross-lingual Name Tagging and Linking for 282 Languages

no code implementations ACL 2017 Xiaoman Pan, Boliang Zhang, Jonathan May, Joel Nothman, Kevin Knight, Heng Ji

The ambitious goal of this work is to develop a cross-lingual name tagging and linking framework for 282 languages that exist in Wikipedia.

Extracting Structured Scholarly Information from the Machine Translation Literature

no code implementations LREC 2016 Eunsol Choi, Matic Horvat, Jonathan May, Kevin Knight, Daniel Marcu

Understanding the experimental results of a scientific paper is crucial to understanding its contribution and to comparing it with related work.

Machine Translation Reading Comprehension

Transfer Learning for Low-Resource Neural Machine Translation

1 code implementation EMNLP 2016 Barret Zoph, Deniz Yuret, Jonathan May, Kevin Knight

Ensembling and unknown word replacement add another 2 Bleu which brings the NMT performance on low-resource machine translation close to a strong syntax based machine translation (SBMT) system, exceeding its performance on one language pair.

Low-Resource Neural Machine Translation Transfer Learning

An Analysis (and an Annotated Corpus) of User Responses to Machine Translation Output

no code implementations LREC 2012 Daniele Pighin, Lluís Màrquez, Jonathan May

We present an annotated resource consisting of open-domain translation requests, automatic translations and user-provided corrections collected from casual users of the translation portal http://reverso.net.

Machine Translation
