Intuitively, such models can more easily output canonical utterances because these are closer to the natural language used for pre-training.
Our experiments show CCO substantially boosts the performance of neural symbolic methods on real images.
2 code implementations • Mahsa Yarmohammadi, Shijie Wu, Marc Marone, Haoran Xu, Seth Ebner, Guanghui Qin, Yunmo Chen, Jialiang Guo, Craig Harman, Kenton Murray, Aaron Steven White, Mark Dredze, Benjamin Van Durme
Zero-shot cross-lingual information extraction (IE) describes the construction of an IE model for some target language, given existing annotations exclusively in some other language, typically English.
The success of bidirectional encoders using masked language models, such as BERT, on numerous natural language processing tasks has prompted researchers to attempt to incorporate these pre-trained models into neural machine translation (NMT) systems.
Ranked #1 on Machine Translation on IWSLT2014 German-English
We present a conditional text generation framework that posits sentential expressions of possible causes and effects.
Statutory reasoning is the task of determining whether a legal statute, stated in natural language, applies to the text description of a case.
Event schemas are structured knowledge sources defining typical real-world scenarios (e.g., going to an airport).
We explore the use of large pretrained language models as few-shot semantic parsers.
Academic neural models for coreference resolution (coref) are typically trained on a single dataset, OntoNotes, and model improvements are benchmarked on that same dataset.
While numerous attempts have been made to jointly parse syntax and semantics, high performance in one domain typically comes at the price of performance in the other.
We propose a structured extension to bidirectional-context conditional language generation, or "infilling," inspired by Frame Semantic theory (Fillmore, 1976).
Fine-tuning is known to improve NLP models by adapting an initial model trained on more plentiful but less domain-salient examples to data in a target domain.
We present LOME, a system for performing multilingual information extraction.
We recognize the task of event argument linking in documents as similar to that of intent slot resolution in dialogue, and provide a Transformer-based model that extends a recently proposed solution for resolving references to slots.
Copy mechanisms are employed in sequence-to-sequence (seq2seq) models to generate reproductions of words from the input to the output.
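To make the mechanism concrete, here is a minimal numpy sketch of the common pointer-generator formulation, in which a scalar p_gen interpolates between generating from the vocabulary and copying source tokens via attention. All names here are illustrative; individual copy mechanisms differ in how p_gen and the attention weights are computed.

```python
import numpy as np

def copy_mixture(p_vocab, attention, src_token_ids, p_gen):
    """Mix generation and copy distributions (pointer-generator style).

    p_vocab: (vocab_size,) decoder softmax over the output vocabulary.
    attention: (src_len,) attention weights over source positions.
    src_token_ids: vocabulary id of each source token.
    p_gen: scalar in [0, 1], probability of generating vs. copying.
    If attention sums to 1, the returned distribution also sums to 1.
    """
    out = p_gen * p_vocab
    for pos, tok in enumerate(src_token_ids):
        out[tok] += (1.0 - p_gen) * attention[pos]
    return out
```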
We present COD3S, a novel method for generating semantically diverse sentences using neural sequence-to-sequence (seq2seq) models.
We show that the count-based Script Induction models of Chambers and Jurafsky (2008) and Jans et al. (2012) can be unified in a general framework of narrative chain likelihood maximization.
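As a rough illustration of the shared statistics behind such count-based models, the sketch below estimates smoothed next-event probabilities from adjacent event pairs in observed chains. The actual models differ in their conditioning (skip-bigrams, PMI over events sharing a protagonist), and all names and defaults here are illustrative.

```python
from collections import Counter

def train_bigram_counts(chains):
    """Count events and adjacent event pairs over narrative chains.

    Each chain is a list of event strings, e.g.
    ["arrest", "charge", "convict"].
    """
    unigrams, bigrams = Counter(), Counter()
    for chain in chains:
        unigrams.update(chain)
        bigrams.update(zip(chain, chain[1:]))
    return unigrams, bigrams

def next_event_score(prev, cand, unigrams, bigrams,
                     alpha=1.0, vocab_size=1000):
    """Add-alpha smoothed P(cand | prev); vocab_size is the
    (hypothetical) size of the event vocabulary."""
    return (bigrams[(prev, cand)] + alpha) / (unigrams[prev] + alpha * vocab_size)
```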
We introduce a novel paraphrastic augmentation strategy based on sentence-level lexically constrained paraphrasing and discriminative span alignment.
Legislation can be viewed as a body of prescriptive rules expressed in natural language.
We investigate modeling coreference resolution under a fixed memory constraint by extending an incremental clustering algorithm to utilize contextualized encoders and neural components.
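A minimal sketch of the underlying incremental clustering loop follows, assuming a given mention-cluster compatibility function `score`. The paper's contribution lies in the contextualized encoders and learned scoring components layered on top; the eviction policy shown is only one way to enforce a fixed memory budget.

```python
def incremental_coref(mentions, score, threshold=0.5, max_clusters=100):
    """Bounded-memory incremental clustering for coreference (sketch).

    mentions: iterable of mention representations (e.g. span vectors).
    score(mention, cluster): compatibility score, assumed given.
    Each mention attaches to its best-scoring cluster above the
    threshold, or starts a new cluster; evicting the oldest cluster
    keeps the number of stored clusters fixed.
    """
    clusters = []  # each cluster is a list of mentions
    for m in mentions:
        best, best_score = None, threshold
        for c in clusters:
            s = score(m, c)
            if s > best_score:
                best, best_score = c, s
        if best is not None:
            best.append(m)
        else:
            if len(clusters) == max_clusters:
                clusters.pop(0)  # evict under the fixed memory budget
            clusters.append([m])
    return clusters
```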
This paper presents CLEAR, a retrieval model that seeks to complement classical lexical exact-match models such as BM25 with semantic matching signals from a neural embedding matching model.
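The generic shape of such lexical-plus-semantic scoring is easy to sketch. Note that CLEAR specifically trains its embedding model to complement the lexical model's residual errors, which the illustrative interpolation below (with a hypothetical `weight` parameter) does not capture.

```python
import numpy as np

def hybrid_score(bm25_score: float,
                 query_emb: np.ndarray,
                 doc_emb: np.ndarray,
                 weight: float = 0.5) -> float:
    """Combine an exact-match score with embedding similarity.

    bm25_score: lexical relevance score for (query, doc).
    weight: illustrative mixing parameter, not from the paper.
    """
    cos = query_emb @ doc_emb / (
        np.linalg.norm(query_emb) * np.linalg.norm(doc_emb))
    return bm25_score + weight * float(cos)
```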
We propose a novel method for hierarchical entity classification that embraces ontological structure during both training and prediction.
We ask whether text understanding has progressed to where we may extract event information through incremental refinement of bleached statements derived from annotation manuals.
We present a novel document-level model for finding argument spans that fill an event's roles, connecting related ideas in sentence-level semantic role labeling and coreference resolution.
Cross-lingual word embeddings transfer knowledge between languages: models trained on high-resource languages can predict in low-resource languages.
Many architectures for multi-task learning (MTL) have been proposed to take advantage of transfer among tasks, often involving complex models and training procedures.
We introduce a transductive model for parsing into Universal Decompositional Semantics (UDS) representations, which jointly learns to map natural language utterances into UDS graph structures and annotate the graph with decompositional semantic attribute scores.
Prior methods for retrieval of nearest neighbors in high dimensions are either fast and approximate, providing probabilistic guarantees of returning the correct answer, or slow and exact, performing an exhaustive search.
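For reference, the slow-but-exact alternative is simply an exhaustive scan, as in this numpy sketch; its O(n * d) per-query cost is what approximate indexes (LSH, graph-based methods, and the like) trade exactness to avoid.

```python
import numpy as np

def exact_nearest_neighbor(query: np.ndarray, data: np.ndarray) -> int:
    """Exhaustive exact nearest-neighbor search.

    data: (n, d) array of n points in d dimensions.
    Returns the index of the point closest to `query` in
    Euclidean distance, scanning every point.
    """
    dists = np.linalg.norm(data - query, axis=1)
    return int(np.argmin(dists))
```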
1 code implementation • Aaron Steven White, Elias Stengel-Eskin, Siddharth Vashishtha, Venkata Govindarajan, Dee Ann Reisinger, Tim Vieira, Keisuke Sakaguchi, Sheng Zhang, Francis Ferraro, Rachel Rudinger, Kyle Rawlins, Benjamin Van Durme
We present the Universal Decompositional Semantics (UDS) dataset (v1.0), which is bundled with the Decomp toolkit (v0.1).
We introduce Uncertain Natural Language Inference (UNLI), a refinement of Natural Language Inference (NLI) that shifts away from categorical labels, targeting instead the direct prediction of subjective probability assessments.
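One natural way to realize this shift from categorical labels to scalar targets is a sigmoid regression head over a sentence-pair encoder, sketched below in PyTorch. The class and dimension are assumptions for illustration, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class ProbabilityHead(nn.Module):
    """Hypothetical regression head for UNLI-style prediction.

    Instead of a 3-way softmax over {entailment, neutral,
    contradiction}, a single sigmoid output predicts a subjective
    probability in [0, 1] that the hypothesis holds given the
    premise. encoder_dim matches whatever sentence-pair encoder
    sits underneath (768 assumed here).
    """
    def __init__(self, encoder_dim: int = 768):
        super().__init__()
        self.linear = nn.Linear(encoder_dim, 1)

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.linear(pooled)).squeeze(-1)
```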
We unify different broad-coverage semantic parsing tasks under a transduction paradigm, and propose an attention-based neural framework that incrementally builds a meaning representation via a sequence of semantic relations.
Ranked #2 on UCCA Parsing on SemEval 2019 Task 1
We introduce a novel discriminative word alignment model, which we integrate into a Transformer-based machine translation model.
In contrast to standard approaches to NLI, our methods predict the probability of a premise given a hypothesis and NLI label, discouraging models from ignoring the premise.
Popular Natural Language Inference (NLI) datasets have been shown to be tainted by hypothesis-only biases.
Researchers illustrate improvements in contextual encoding strategies via resultant performance on a battery of shared Natural Language Understanding (NLU) tasks.
Lexically-constrained sequence decoding allows for explicit positive or negative phrase-based constraints to be placed on target output strings in generation tasks such as machine translation or monolingual text rewriting.
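As a small illustration of the negative-constraint side, the check below prunes beam hypotheses containing a banned phrase. Real constrained decoders (e.g., dynamic beam allocation) also track progress toward positive constraints, which this sketch omits; the function name is illustrative.

```python
def violates_negative_constraints(tokens, banned_phrases):
    """Return True if the hypothesis contains any banned phrase.

    tokens: list of generated token strings so far.
    banned_phrases: list of token lists that must NOT appear.
    A beam-search decoder would call this when scoring candidate
    extensions and discard any hypothesis that returns True.
    """
    for phrase in banned_phrases:
        k = len(phrase)
        if any(tokens[i:i + k] == phrase
               for i in range(len(tokens) - k + 1)):
            return True
    return False
```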
Our parser surpasses all previously reported SMATCH scores, on both AMR 2.0 (76.3% F1 on LDC2017T10) and AMR 1.0 (70.2% F1 on LDC2014T12).
Ranked #1 on AMR Parsing on LDC2014T12
The jiant toolkit for general-purpose text understanding models
no code implementations • Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen
Work on the problem of contextualized word representation (the development of reusable neural network components for sentence understanding) has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).
Our results show that pretraining on language modeling performs best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models; CCG supertagging and NLI pretraining perform comparably.
We present a novel semantic framework for modeling temporal relations and event durations that maps pairs of events to real-valued scales.
We present a novel semantic framework for modeling linguistic expressions of generalization (generic, habitual, and episodic statements) as combinations of simple, real-valued referential properties of predicates and their arguments.
We present ParaBank, a large-scale English paraphrase dataset that surpasses prior work in both quantity and quality.
no code implementations • Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman
Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling.
We present a large-scale dataset, ReCoRD, for machine reading comprehension requiring commonsense reasoning.
We introduce the task of cross-lingual decompositional semantic parsing: mapping content provided in a source language into a decompositional semantic analysis based on a target language.
Distinguishing between arguments and adjuncts of a verb is a longstanding, nontrivial problem.
We use this dataset, which we make publicly available, to probe the behavior of current state-of-the-art neural systems, showing that these systems make certain systematic errors that are clearly visible through the lens of factuality prediction.
Cross-lingual information extraction (CLIE) is an important and challenging task, especially in low resource scenarios.
We propose a process for investigating the extent to which sentence representations arising from neural machine translation (NMT) systems encode distinct semantic phenomena.
We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning.
Fine-grained entity typing is the task of assigning fine-grained semantic types to entity mentions.
We present a model for semantic proto-role labeling (SPRL) using an adapted bidirectional LSTM encoding strategy that we call "Neural-Davidsonian": predicate-argument structure is represented as pairs of hidden states corresponding to predicate and argument head tokens of the input sequence.
We introduce the task of cross-lingual semantic parsing: mapping content provided in a source language into a meaning representation based on a target language.
We present two neural models for event factuality prediction, which yield significant performance gains over previous models on three event factuality datasets: FactBank, UW, and MEANTIME.
We propose to unify a variety of existing semantic classification tasks, such as semantic role labeling, anaphora resolution, and paraphrase detection, under the heading of Recognizing Textual Entailment (RTE).
no code implementations • Benjamin Van Durme, Tom Lippincott, Kevin Duh, Deana Burchfield, Adam Poliak, Cash Costello, Tim Finin, Scott Miller, James Mayfield, Philipp Koehn, Craig Harman, Dawn Lawrie, Chandler May, Max Thomas, Annabelle Carrell, Julianne Chaloux, Tongfei Chen, Alex Comerford, Mark Dredze, Benjamin Glass, Shudong Hao, Patrick Martin, Pushpendre Rastogi, Rashmi Sankepally, Travis Wolfe, Ying-Ying Tran, Ted Zhang
It combines a multitude of analytics with a flexible environment for customizing the workflow for different users.
Cross-lingual open information extraction is the task of distilling facts from the source language into representations in the target language.
We propose a neural encoder-decoder model with reinforcement learning (NRL) for grammatical error correction (GEC).
Practically, this means that we may treat the lexical resources as observations under the proposed generative model.
Existing Knowledge Base Population methods extract relations from a closed relational schema with limited coverage leading to sparse KBs.
We propose a new dependency parsing scheme which jointly parses a sentence and repairs grammatical errors by extending the non-directional transition-based formalism of Goldberg and Elhadad (2010) with three additional actions: SUBSTITUTE, DELETE, INSERT.
We study how different frame annotations complement one another when learning continuous lexical semantics.
We develop a streaming (one-pass, bounded-memory) word embedding algorithm based on the canonical skip-gram with negative sampling algorithm implemented in word2vec.
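The per-pair update such a one-pass algorithm performs is the standard SGNS stochastic gradient step, sketched below. The paper's actual contribution, the bounded-memory vocabulary bookkeeping, is omitted here, and the learning rate is an illustrative default.

```python
import numpy as np

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + np.exp(-x))

def sgns_update(w_vec, c_vec, neg_vecs, lr=0.025):
    """One in-place stochastic update of skip-gram with negative
    sampling. In a one-pass streaming setting, each (word, context)
    pair is seen once and discarded after this update.
    """
    # Positive (observed) pair: increase w . c.
    g_pos = 1.0 - sigmoid(w_vec @ c_vec)
    grad_w = g_pos * c_vec
    c_vec += lr * g_pos * w_vec

    # Negative (sampled) pairs: decrease w . n.
    for n_vec in neg_vecs:
        g_neg = -sigmoid(w_vec @ n_vec)
        grad_w += g_neg * n_vec
        n_vec += lr * g_neg * w_vec

    w_vec += lr * grad_w

# Toy usage with random 8-dimensional vectors:
rng = np.random.default_rng(0)
w, c = rng.normal(size=8), rng.normal(size=8)
negs = [rng.normal(size=8) for _ in range(5)]
sgns_update(w, c, negs)
```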
The popular skip-gram model induces word embeddings by exploiting the signal from word-context co-occurrence.
We analyze the Stanford Natural Language Inference (SNLI) corpus in an investigation of bias and stereotyping in NLP data.
We propose the semantic proto-role linking model, which jointly induces both predicate-specific semantic roles and predicate-general semantic proto-roles based on semantic proto-role property likelihood judgments.
Conventional pipeline solutions decompose the task as machine translation followed by information extraction (or vice versa).
We propose ECO: a new way to generate embeddings for phrases that is Efficient, Compositional, and Order-sensitive.
We propose a framework for discriminative IR atop linguistic features, trained to improve the recall of answer candidate passage retrieval, the initial step in text-based question answering.
Hand-engineered feature sets are a well understood method for creating robust NLP models, but they require a lot of expertise and effort to create.
Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly.
A linking theory explains how verbs' semantic arguments are mapped to their syntactic arguments, the inverse of the Semantic Role Labeling task from the shallow semantic parsing literature.
Inspired by the findings from the Cmabrigde Uinervtisy (Cambridge University) effect, we propose a word recognition model based on a semi-character level recurrent neural network (scRNN).
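The input encoding behind this is easy to sketch: each word is represented by its first character, a bag of its internal characters, and its last character, so scrambling the internal characters leaves the vector unchanged. A minimal version, assuming lowercase alphabetic input:

```python
import string

import numpy as np

ALPHABET = string.ascii_lowercase
IDX = {c: i for i, c in enumerate(ALPHABET)}

def semi_character_vector(word: str) -> np.ndarray:
    """Encode a word as [first-char one-hot; bag of internal
    chars; last-char one-hot]; non-alphabetic characters are
    ignored in this sketch."""
    n = len(ALPHABET)
    first, internal, last = np.zeros(n), np.zeros(n), np.zeros(n)
    word = word.lower()
    if word and word[0] in IDX:
        first[IDX[word[0]]] = 1.0
    if word and word[-1] in IDX:
        last[IDX[word[-1]]] = 1.0
    for ch in word[1:-1]:
        if ch in IDX:
            internal[IDX[ch]] += 1.0
    return np.concatenate([first, internal, last])

# Internal scrambling leaves the encoding unchanged:
assert np.array_equal(semi_character_vector("Cmabrigde"),
                      semi_character_vector("Cambridge"))
```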
Link prediction in large knowledge graphs has received a lot of attention recently because of its importance for inferring missing relations and for completing and improving noisily extracted knowledge graphs.
The output scores of a neural network classifier are converted to probabilities via normalizing over the scores of all competing categories.
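This normalization is the standard softmax; a minimal numpy version with the usual max-subtraction for numerical stability:

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Convert raw classifier scores (logits) to probabilities.

    Subtracting the max before exponentiating avoids overflow
    and does not change the result.
    """
    shifted = scores - scores.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))  # approx. [0.659 0.242 0.099], sums to 1
```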
Most work on building knowledge bases has focused on collecting entities and facts from as large a collection of documents as possible.
We describe a corpus for target-contextualized machine translation (MT), where the task is to improve the translation of source documents using language models built over presumably related documents in the target language.