Existing knowledge graph embedding approaches concentrate on modeling relation patterns such as symmetry/asymmetry, inversion, and composition, but overlook the hierarchical nature of relations.
In this work, we propose to leverage the prior information embedded in pretrained language models (LMs) to improve generalization for intent classification and slot labeling tasks with limited training data.
A commonly observed problem with state-of-the-art abstractive summarization models is that the generated summaries can be factually inconsistent with the input documents.
Compositional reasoning tasks such as multi-hop question answering require making latent decisions to arrive at the final answer, given a question.
Experiments show that: (1) our IR-based retrieval method is able to collect high-quality candidates efficiently, thus enabling our method to adapt easily to large-scale KBs; (2) the BERT model improves the accuracy across all three sub-tasks; and (3) benefiting from multi-task learning, the unified model obtains further improvements with only 1/3 of the original parameters.
A key challenge for abstractive summarization is ensuring factual consistency of the generated summary with respect to the original document.
We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking.
Most recently, there has been significant interest in learning contextual representations for various NLP tasks, by leveraging large-scale text corpora to train large neural language models with self-supervised learning objectives, such as Masked Language Modeling (MLM).
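As a concrete illustration of the MLM objective mentioned above, the sketch below applies the standard 80/10/10 corruption scheme to a sequence of token IDs; the mask-token ID, vocabulary size, and masking rate are illustrative assumptions, not values from any particular model.

```python
import random

def mask_tokens(token_ids, mask_id, vocab_size, mask_prob=0.15):
    """Illustrative MLM corruption: select ~15% of tokens as prediction
    targets; of those, 80% become [MASK], 10% a random token, 10% unchanged.
    Unselected positions get label -100 (ignored by the loss)."""
    inputs, labels = [], []
    for tok in token_ids:
        if random.random() < mask_prob:
            labels.append(tok)                       # model must recover this token
            r = random.random()
            if r < 0.8:
                inputs.append(mask_id)               # 80%: replace with [MASK]
            elif r < 0.9:
                inputs.append(random.randrange(vocab_size))  # 10%: random token
            else:
                inputs.append(tok)                   # 10%: keep unchanged
        else:
            inputs.append(tok)
            labels.append(-100)                      # not predicted
    return inputs, labels

# toy usage with made-up token IDs
inp, lab = mask_tokens([5, 17, 42, 8, 99], mask_id=103, vocab_size=30522)
```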
Ranked #1 on Semantic Parsing on Spider
When multiple plausible answers are found, the system should rewrite the question for each answer to resolve the ambiguity.
Unsupervised domain adaptation addresses the problem of leveraging labeled data in a source domain to learn a well-performing model in a target domain where labels are unavailable.
Generative models for Information Retrieval, where ranking of documents is viewed as the task of generating a query from a document's language model, were very successful in various IR tasks in the past.
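As a minimal illustration of this query-likelihood view of retrieval, the sketch below scores a document by the log-probability that its unigram language model generates the query, with Dirichlet smoothing against a background collection model; the function name and the mu default are assumptions, not taken from the cited work.

```python
import math
from collections import Counter

def query_likelihood(query_terms, doc_terms, collection_tf, collection_len, mu=2000):
    """Score a document by how likely its unigram language model is to
    generate the query, using Dirichlet smoothing toward the collection model.
    Terms unseen in both document and collection are skipped (a simplification)."""
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for t in query_terms:
        p_coll = collection_tf.get(t, 0) / collection_len   # background probability
        p = (doc_tf[t] + mu * p_coll) / (doc_len + mu)      # smoothed doc probability
        if p > 0:
            score += math.log(p)
    return score
```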
In this paper, we first review absolute position embeddings and existing methods for relative position embeddings.
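For reference, the sketch below shows the two families in their simplest form: sinusoidal absolute position embeddings, and a clipped relative-offset matrix of the kind used to index a learned relative position embedding table; the dimensions and clipping distance are illustrative only.

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    """Absolute sinusoidal position embeddings: each position gets a fixed
    vector of interleaved sines and cosines at geometrically spaced frequencies."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))  # (seq_len, d_model)

def relative_position_matrix(seq_len, max_dist=4):
    """Relative offsets j - i, clipped to [-max_dist, max_dist] and shifted to
    non-negative indices, suitable for looking up a learned relative embedding."""
    idx = np.arange(seq_len)
    rel = idx[None, :] - idx[:, None]
    return np.clip(rel, -max_dist, max_dist) + max_dist        # values in [0, 2*max_dist]
```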
In some cases, our model trained on synthetic data can even outperform the same model trained on real data.
We set a new state-of-the-art for few-shot slot labeling, improving substantially upon the previous 5-shot ($75.0\% \rightarrow 90.9\%$) and 1-shot ($70.4\% \rightarrow 81.0\%$) state-of-the-art results.
Training a QA model on this data yields a relative F1 improvement of about 14% over a previous unsupervised model on the SQuAD dataset, and of 20% when the answer is a named entity, achieving state-of-the-art performance on SQuAD for unsupervised QA.
Prior to the transformer era, the bidirectional Long Short-Term Memory (BLSTM) was the dominant modeling architecture for neural machine translation and question answering.
In this work, we define the problem of conversation structure modeling as identifying the parent utterance(s) to which each utterance in the conversation responds.
We present a systematic investigation of layer-wise BERT activations for general-purpose text representations to understand what linguistic information they capture and how transferable they are across different tasks.
To tackle this issue, we propose a multi-passage BERT model to globally normalize answer scores across all passages of the same question, and this change enables our QA model to find better answers by utilizing more passages.
Ranked #3 on Open-Domain Question Answering on SearchQA
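The global normalization idea above can be sketched as a single softmax over the candidate answer scores of all passages retrieved for one question, rather than a separate softmax per passage; the function below is a simplified illustration of that idea, not the authors' implementation.

```python
import torch

def globally_normalized_span_probs(passage_logits):
    """passage_logits: list of 1-D tensors, one per passage, holding
    unnormalized scores of that passage's candidate answer spans.
    Normalizing once over the concatenation makes scores of spans from
    different passages directly comparable."""
    flat = torch.cat(passage_logits)                 # all candidates of one question
    probs = torch.softmax(flat, dim=0)
    out, start = [], 0
    for logits in passage_logits:
        out.append(probs[start:start + logits.numel()])
        start += logits.numel()
    return out

# toy usage: three passages with different numbers of candidate spans
probs = globally_normalized_span_probs([torch.randn(4), torch.randn(2), torch.randn(5)])
```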
The key contribution of our work is our proposal to explicitly constrain the latent space to exclusively represent the given class.
Ranked #17 on Anomaly Detection on One-class CIFAR-10
Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity.
Question classification is an important task with wide applications.
In order to exploit the potential benefits of their correlation, we propose a jointly trained model for learning the two tasks simultaneously via Long Short-Term Memory (LSTM) networks.
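A minimal sketch of such joint training, assuming a shared BiLSTM encoder feeding two task-specific classification heads whose losses would simply be summed; the layer sizes, label counts, and task assignments in the comments are placeholders, not details from the paper.

```python
import torch
import torch.nn as nn

class JointLSTM(nn.Module):
    """Shared BiLSTM encoder with two task heads, trained jointly."""
    def __init__(self, vocab_size, emb_dim=100, hidden=128, n_labels_a=6, n_labels_b=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.head_a = nn.Linear(2 * hidden, n_labels_a)   # e.g., first task's labels
        self.head_b = nn.Linear(2 * hidden, n_labels_b)   # e.g., second task's labels

    def forward(self, token_ids):                         # (batch, seq_len)
        h, _ = self.lstm(self.emb(token_ids))             # (batch, seq_len, 2*hidden)
        pooled = h.mean(dim=1)                            # simple mean pooling over time
        return self.head_a(pooled), self.head_b(pooled)
```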
Relation detection is a core component for many NLP applications including Knowledge Base Question Answering (KBQA).
Many natural language understanding (NLU) tasks, such as shallow parsing (i.e., text chunking) and semantic slot filling, require the assignment of representative labels to the meaningful chunks in a sentence.
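A small example of how chunk-level labels are commonly cast as per-token sequence labels via BIO encoding; the tokens, span indices, and slot name below are made up purely for illustration.

```python
def chunks_to_bio(tokens, chunks):
    """Encode labeled chunks as per-token BIO tags.
    chunks: list of (start, end_exclusive, label) spans."""
    tags = ["O"] * len(tokens)
    for start, end, label in chunks:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags

# toy slot-filling example
print(chunks_to_bio(["book", "a", "flight", "to", "new", "york"],
                    [(4, 6, "destination")]))
# ['O', 'O', 'O', 'O', 'B-destination', 'I-destination']
```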
By evaluating the NLC workloads, we show that only the conservative hyper-parameter setup (e.g., small mini-batch size and small learning rate) can guarantee acceptable model accuracy for a wide range of customers.
This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer questions.
Ranked #41 on Question Answering on SQuAD1.1 dev
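Structurally, the extract-and-rank step described above can be sketched as enumerating all spans up to a maximum length as answer candidates and sorting them by a question-aware score; the scoring function is abstracted away here, so this is only a skeleton of candidate ranking, not the DCR model itself.

```python
def enumerate_and_rank_spans(doc_tokens, score_fn, max_span_len=10, top_k=5):
    """Enumerate every document span up to max_span_len tokens and rank the
    candidates by score_fn(start, end) -> float (assumed to encode the question)."""
    candidates = []
    for start in range(len(doc_tokens)):
        for end in range(start + 1, min(start + max_span_len, len(doc_tokens)) + 1):
            candidates.append((score_fn(start, end), start, end))
    candidates.sort(reverse=True)                         # highest score first
    return [(doc_tokens[s:e], score) for score, s, e in candidates[:top_k]]
```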
Passage-level question-answer matching is a challenging task, since it requires effective representations that capture the complex semantic relations between questions and answers.
In fact selection, we match the subject entity in a fact candidate with the entity mention in the question by a character-level convolutional neural network (char-CNN), and match the predicate in that fact with the question by a word-level CNN (word-CNN).
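A minimal sketch of the two matching components, assuming a single 1-D convolution + max-pooling encoder that is applied once over character IDs (in the role of the char-CNN) and once over word IDs (in the role of the word-CNN), with cosine similarity as the matching score; all sizes are placeholders and the real models likely differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvEncoder(nn.Module):
    """1-D conv + max-pool encoder over an ID sequence (characters or words)."""
    def __init__(self, vocab_size, emb_dim=50, n_filters=100, width=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, width, padding=1)

    def forward(self, ids):                               # (batch, seq_len)
        x = self.emb(ids).transpose(1, 2)                 # (batch, emb_dim, seq_len)
        return F.relu(self.conv(x)).max(dim=2).values     # (batch, n_filters)

def match_score(encoder, ids_a, ids_b):
    """Cosine similarity between the two encoded sequences,
    e.g., subject entity vs. entity mention, or predicate vs. question."""
    return F.cosine_similarity(encoder(ids_a), encoder(ids_b), dim=1)
```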
In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora.
Ranked #10 on Text Summarization on DUC 2004 Task 1
Recurrent Neural Network (RNN) and one of its specific architectures, Long Short-Term Memory (LSTM), have been widely used for sequence labeling.
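For context, a minimal BiLSTM sequence labeler looks like the sketch below, producing one tag logit vector per token; the hyper-parameters are placeholders rather than values from any specific system.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Minimal BiLSTM sequence labeling model: one tag distribution per token."""
    def __init__(self, vocab_size, n_tags, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, token_ids):                         # (batch, seq_len)
        h, _ = self.lstm(self.emb(token_ids))             # (batch, seq_len, 2*hidden)
        return self.out(h)                                # (batch, seq_len, n_tags)
```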
(ii) We propose three attention schemes that integrate mutual influence between sentences into CNN; thus, the representation of each sentence takes into consideration its counterpart.
We propose two methods of learning vector representations of words and phrases that each combine sentence context with structural features extracted from dependency trees.
One direction is to define a more composite representation for questions and answers by combining a convolutional neural network with the basic framework.
In this paper we explore deep learning models with a memory component or attention mechanism for the question answering task.
We apply a general deep learning framework to address the non-factoid question answering task.
In sentence modeling and classification, convolutional neural network approaches have recently achieved state-of-the-art results, but all such efforts process word vectors sequentially and neglect long-distance dependencies.
Relation classification is an important semantic processing task for which state-of-the-art systems still rely on costly handcrafted features.
Ranked #15 on Relation Extraction on SemEval-2010 Task 8