Understanding linguistic modality is widely seen as important for downstream tasks such as Question Answering and Knowledge Graph Population.
We present a novel method for injecting temporality into entailment graphs to address the problem of spurious entailments, which may arise from similar but temporally distinct events involving the same pair of entities.
In this paper, we propose Mention Flags (MF), which traces whether lexical constraints are satisfied in the generated outputs in an S2S decoder.
Sequence-to-Sequence (S2S) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks.
OpenKi addresses this task by extracting named entities and predicates with OpenIE tools and then learning relation embeddings from the resulting entity-relation graph for relation prediction, outperforming previous approaches.
Drawing inferences between open-domain natural language predicates is a necessity for true language understanding.
In this work, we aim to address the dense correspondence estimation problem in a way that generalizes to more than one spectrum.
In this paper, we focus on this challenge and propose the ECOL-R model (Encouraging Copying of Object Labels with Reinforced Learning), a copy-augmented transformer model that is encouraged to accurately describe the novel object labels.
Indeed, self-supervised language models trained on "positive" examples of English text generalize in desirable ways to many natural language tasks.
Disfluency detection is usually an intermediate step between an automatic speech recognition (ASR) system and a downstream task.
no code implementations • 5 Jun 2020 • Ryan M. Corey, Evan M. Widloski, David Null, Brian Ricconi, Mark Johnson, Karen White, Jennifer R. Amos, Alex Pagano, Michael Oelze, Rachel Switzky, Matthew B. Wheeler, Eliot Bethke, Clifford Shipley, Andrew C. Singer
In response to the shortage of ventilators caused by the COVID-19 pandemic, many organizations have designed low-cost emergency ventilators.
However, we show that self-training (a semi-supervised technique for incorporating unlabeled data) sets a new state-of-the-art for the self-attentive parser on disfluency detection, demonstrating that self-training provides benefits orthogonal to the pre-trained contextualized word representations.
The new entailment score outperforms prior state-of-the-art results on a standard entailment dataset, and the new link prediction scores show improvements over the raw link prediction scores.
This paper describes a spoken-language end-to-end task-oriented dialogue system for small embedded devices such as home appliances.
To encourage the development of image captioning models that can learn visual concepts from alternative data sources, such as object detection datasets, we present the first large-scale benchmark for this task.
Probabilistic topic models are widely used to discover latent topics in document collections, while latent feature vector representations of words have been used to obtain high performance in many NLP tasks.
In recent years, the natural language processing community has moved away from task-specific feature engineering, i.e., researchers discovering ad-hoc feature representations for various tasks, in favor of general-purpose methods that learn the input representation by themselves.
Because obtaining training data is often the most difficult part of an NLP or ML project, we develop methods for predicting how much data is required to achieve a desired test accuracy by extrapolating results from models trained on a small pilot training dataset.
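The extrapolation idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that test error follows an inverse power law, error(n) = a * n^(-b), in the training set size n; this is one common learning-curve form and not necessarily the fitting procedure used in the paper. The function names and the pilot numbers are hypothetical.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by least squares in log-log space.

    Assumption: an unweighted linear fit of log(error) against log(n).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def required_size(a, b, target_error):
    """Invert error = a * n**(-b) to get the n that reaches target_error."""
    return (a / target_error) ** (1.0 / b)


# Made-up pilot results: error rates of models trained on small subsets.
pilot_sizes = [100, 200, 400, 800]
pilot_errors = [0.40, 0.32, 0.256, 0.2048]

a, b = fit_power_law(pilot_sizes, pilot_errors)
needed = required_size(a, b, target_error=0.10)
print(f"a={a:.3f}, b={b:.3f}, estimated examples needed: {needed:.0f}")
```

A fit like this is only as good as the assumed curve shape, so the estimate should be treated as a rough planning figure rather than a guarantee.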
We present a semantic parser for Abstract Meaning Representations which learns to parse strings into tree representations of the compositional structure of an AMR graph.
We present an easy-to-use and fast toolkit, namely VnCoreNLP, a Java NLP annotation pipeline for Vietnamese.
We instead propose a scalable method that learns globally consistent similarity scores based on new soft constraints that consider both the structures across typed entailment graphs and inside each graph.
This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering.
Ranked #3 on Visual Navigation on R2R
This paper presents an empirical comparison of two strategies for Vietnamese Part-of-Speech (POS) tagging from unsegmented text: (i) a pipeline strategy where we consider the output of a word segmenter as the input of a POS tagger, and (ii) a joint strategy where we predict a combined segmentation and POS tag for each syllable.
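The joint strategy can be sketched as a per-syllable labelling problem. The example below assumes a `B-`/`I-` prefix scheme for combining word boundaries with POS tags; that encoding, the helper names, and the toy sentence are illustrative assumptions, not details taken from the paper.

```python
def encode(tagged_words):
    """Turn (word, POS) pairs into per-syllable combined labels.

    Syllables of a word are assumed to be separated by spaces. The first
    syllable gets "B-<tag>", the following ones "I-<tag>".
    """
    syllables, labels = [], []
    for word, tag in tagged_words:
        for i, syllable in enumerate(word.split()):
            syllables.append(syllable)
            labels.append(("B-" if i == 0 else "I-") + tag)
    return syllables, labels


def decode(syllables, labels):
    """Recover (word, POS) pairs from per-syllable combined labels."""
    words = []
    for syllable, label in zip(syllables, labels):
        prefix, tag = label.split("-", 1)
        if prefix == "B" or not words:
            words.append(([syllable], tag))  # start a new word
        else:
            words[-1][0].append(syllable)    # extend the current word
    return [(" ".join(parts), tag) for parts, tag in words]


# Toy sentence: "sinh viên" is a single two-syllable word.
sentence = [("sinh viên", "N"), ("đọc", "V"), ("sách", "N")]
syllables, labels = encode(sentence)
print(list(zip(syllables, labels)))
print(decode(syllables, labels))
```

Under this encoding a single sequence labeller predicts segmentation and tags together, whereas the pipeline strategy would first group syllables into words and only then assign tags.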
Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning.
Ranked #47 on Visual Question Answering on VQA v2 test-std
Most work on segmenting text does so on the basis of topic changes, but it can be of interest to segment by other, stylistically expressed characteristics such as change of authorship or native language.
Idea density (ID) has been used in two different versions: propositional idea density (PID) counts the expressed ideas and can be applied to any text, while semantic idea density (SID) counts pre-defined information content units and is naturally more applicable to normative domains, such as picture description tasks.
We present a novel neural network model that learns POS tagging and graph-based dependency parsing jointly.
Ranked #5 on Part-Of-Speech Tagging on UD
Recent research has shown that the performance of search personalization depends on the richness of user profiles which normally represent the user's topical interests.
Existing image captioning models do not generalize well to out-of-domain images containing novel scenes or objects.
We show that grammar induction from words alone is in fact feasible when the model is provided with sufficient training data, and present two new streaming or mini-batch algorithms for PCFG inference that can learn from millions of words of training data.
This paper presents an empirical comparison of different dependency parsers for Vietnamese, which has some unusual characteristics such as copula drop and verb serialization.
Knowledge bases of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks.
Knowledge bases are useful resources for many natural language processing tasks; however, they are far from complete.
The unsupervised discovery of linguistic terms from either continuous phoneme transcriptions or raw speech has attracted increasing interest in recent years, from both a theoretical and a practical standpoint.
Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years.
This paper presents Bayesian non-parametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information.