We annotate a multimodal product attribute value dataset containing 87,194 instances. Experimental results on this dataset demonstrate that explicitly modeling the relationship between attributes and values helps our method establish the correspondence between them, and that selectively utilizing visual product information is necessary for the task.
The copy module has been widely adopted in recent abstractive summarization models, enabling the decoder to copy words from the source into the summary.
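The copy mechanism described above is commonly realized by mixing a generation distribution with an attention-derived copy distribution, as in pointer-generator models. Below is a minimal numpy sketch of that mixing step; the function name and toy numbers are illustrative, not taken from any specific paper.

```python
import numpy as np

def final_distribution(p_vocab, attn, src_ids, p_gen):
    """Mix generation and copy distributions, pointer-generator style.

    p_vocab : (vocab_size,) softmax over the fixed vocabulary
    attn    : (src_len,)   attention weights over source tokens
    src_ids : (src_len,)   vocabulary ids of the source tokens
    p_gen   : scalar in [0, 1], probability of generating vs. copying
    """
    p_final = p_gen * p_vocab
    # Scatter-add the copy mass onto the vocabulary ids of source words
    # (duplicate source ids accumulate, hence np.add.at).
    np.add.at(p_final, src_ids, (1.0 - p_gen) * attn)
    return p_final

# Toy example: vocabulary of 5 words, source sentence with ids [2, 3, 2].
p_vocab = np.array([0.1, 0.2, 0.3, 0.2, 0.2])
attn = np.array([0.5, 0.3, 0.2])
p = final_distribution(p_vocab, attn, np.array([2, 3, 2]), p_gen=0.6)
```

Because both components are normalized and the gate is a convex weight, the mixed distribution still sums to one.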
Human conversations are complicated and building a human-like dialogue agent is an extremely challenging task.
Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE.
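The two endpoints of that progression have simple scoring functions: TransE treats a relation as a translation (h + r ≈ t), while RotatE treats it as a rotation in complex space with unit-modulus relation entries. A small numpy sketch of both scores, with a toy sanity check:

```python
import numpy as np

def transe_score(h, r, t):
    # TransE: the relation is a translation, so h + r should be close to t.
    return -np.linalg.norm(h + r - t, ord=1)

def rotate_score(h, r_phase, t):
    # RotatE: each dimension is a complex number and the relation is an
    # element-wise rotation, |r_i| = 1 (r is parameterized by its phase).
    r = np.exp(1j * r_phase)
    return -np.linalg.norm(h * r - t, ord=1)

# Toy check: a relation whose phases rotate h exactly onto t scores 0 (best).
h = np.array([1 + 0j, 0 + 1j])
r_phase = np.array([np.pi / 2, np.pi / 2])
t = h * np.exp(1j * r_phase)
```

Higher (less negative) scores indicate more plausible triples; link prediction ranks candidate tails t by this score.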
Ranked #8 on Link Prediction on WN18RR
We test the relation module on the SQuAD 2.0 dataset using both the BiDAF and BERT models as baseline readers.
Interpretable multi-hop reading comprehension (RC) over multiple documents is a challenging problem because it demands reasoning over multiple information sources and explaining the answer prediction by providing supporting evidence.
In this paper, we aim to improve an MRC model's ability to determine whether a question has an answer in a given context (e.g., the recently proposed SQuAD 2.0 task).
Recent years have seen great success in the use of neural seq2seq models on the text-to-SQL task.
In this paper, we propose a new end-to-end graph neural network (GNN) based algorithm for MIL: we treat each bag as a graph and use GNN to learn the bag embedding, in order to explore the useful structural information among instances in bags.
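The bag-as-graph idea above can be sketched in a few lines: one round of neighbor aggregation over the instance graph, followed by a readout that pools instance embeddings into a single bag embedding. This is a generic mean-aggregation sketch, not the specific GNN architecture of the paper.

```python
import numpy as np

def bag_embedding(X, A):
    """Mean-neighbor message passing over a bag's instance graph,
    then a mean readout to a single bag vector.

    X : (n_instances, d) instance features in one bag
    A : (n, n) adjacency among instances (e.g. similarity-thresholded)
    """
    A_hat = A + np.eye(len(A))                # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)
    H = A_hat @ X / deg                       # aggregate neighbor features
    H = np.maximum(H, 0.0)                    # ReLU nonlinearity
    return H.mean(axis=0)                     # readout: one vector per bag

# Toy bag with two connected instances.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
A = np.array([[0.0, 1.0], [1.0, 0.0]])
z = bag_embedding(X, A)
```

The bag vector z can then be fed to a classifier, so the whole pipeline trains end to end.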
In this paper, we improve the robustness of DNNs by utilizing techniques of Distance Metric Learning.
We introduce a heterogeneous graph with different types of nodes and edges, which we name the Heterogeneous Document-Entity (HDE) graph.
We guide the optimization of label quality with a small amount of validation data, ensuring safe performance while maximizing the performance gain.
This paper aims to improve the widely used deep speaker embedding x-vector model.
Our meta metric learning approach consists of task-specific learners, which exploit metric learning to handle flexible labels, and a meta learner, which discovers good parameters and gradient descent updates to specify the metrics in the task-specific learners.
The recent graph convolutional network (GCN) provides another way of learning graph node embedding by successfully utilizing graph connectivity structure.
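A standard GCN layer makes the use of connectivity structure concrete: node features are propagated through a symmetrically normalized adjacency matrix before a learned linear map. A minimal numpy sketch (function name and toy inputs are illustrative):

```python
import numpy as np

def gcn_layer(H, A, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(len(A))                      # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^{-1/2}
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Sanity check: with no edges and an identity weight matrix, each node
# embedding is just its own (ReLU'd) feature vector.
H = np.array([[1.0, 2.0], [3.0, 4.0]])
A = np.zeros((2, 2))
out = gcn_layer(H, A, np.eye(2))
```

Stacking such layers lets each node's embedding absorb information from progressively larger graph neighborhoods.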
Ranked #22 on Link Prediction on FB15k-237
For example, there is still a lack of theories of convergence for SGD and its variants that use stagewise step size and return an averaged solution in practice.
We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs.
Question classification is an important task with wide applications.
To exploit the potential benefits of their correlations, we propose a jointly trained model that learns the two tasks simultaneously via Long Short-Term Memory (LSTM) networks.
Learning to remember long sequences remains a challenging task for recurrent neural networks.
Second, we propose a novel method that jointly trains the Ranker along with an answer-generation Reader model, based on reinforcement learning.
Ranked #4 on Open-Domain Question Answering on Quasar
We present a new topic model that generates documents by sampling a topic for one whole sentence at a time, and generating the words in the sentence using an RNN decoder that is conditioned on the topic of the sentence.
We propose discriminative adversarial networks (DAN) for semi-supervised learning and loss function learning.
Relation detection is a core component for many NLP applications including Knowledge Base Question Answering (KBQA).
Many natural language understanding (NLU) tasks, such as shallow parsing (i.e., text chunking) and semantic slot filling, require the assignment of representative labels to the meaningful chunks in a sentence.
By evaluating the NLC workloads, we show that only the conservative hyper-parameter setup (e.g., small mini-batch size and small learning rate) can guarantee acceptable model accuracy for a wide range of customers.
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to state-of-the-art.
Ranked #8 on Text Summarization on CNN / Daily Mail (Anonymized)
This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer questions.
Ranked #41 on Question Answering on SQuAD1.1 dev
In fact selection, we match the subject entity in a fact candidate with the entity mention in the question by a character-level convolutional neural network (char-CNN), and match the predicate in that fact with the question by a word-level CNN (word-CNN).
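The char-CNN matching step above can be sketched as follows: embed the characters of a string, run a one-dimensional convolution, max-pool over time, and compare the resulting fixed-size vectors with cosine similarity. This is a generic illustration; the shapes, names, and the single filter bank are assumptions, not the paper's exact configuration.

```python
import numpy as np

def char_cnn_encode(ids, emb, filt):
    """Embed characters, apply one 1-D conv filter bank, max-pool over time.

    ids  : (T,) character ids        emb  : (n_chars, d) char embeddings
    filt : (k, d, f) conv filters of width k producing f feature maps
    """
    x = emb[ids]                                         # (T, d)
    k = filt.shape[0]
    conv = np.stack([np.einsum('kd,kdf->f', x[i:i + k], filt)
                     for i in range(len(ids) - k + 1)])  # (T-k+1, f)
    return np.maximum(conv, 0.0).max(axis=0)             # (f,) pooled vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

# Deterministic toy parameters so the behavior is easy to check.
emb = np.ones((10, 4))
filt = np.ones((2, 4, 3))
v = char_cnn_encode(np.array([1, 2, 3]), emb, filt)
```

A subject mention and a candidate entity string are each encoded this way, and their cosine similarity serves as the match score.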
We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens.
Ranked #1 on Dialogue Generation on Ubuntu Dialogue (Activity)
At each time-step, the decision of which softmax layer to use is made adaptively by an MLP conditioned on the context. We motivate our work with psychological evidence that humans naturally tend to point toward objects in the context or the environment when the name of an object is not known. Using our proposed model, we observe improvements on two tasks: neural machine translation on the Europarl English-to-French parallel corpora and text summarization on the Gigaword dataset.
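The adaptive choice between softmax layers can be sketched as a gated mixture of a shortlist (vocabulary) softmax and a pointer (location) softmax. For brevity this sketch replaces the MLP with a single linear-plus-sigmoid gate; all names and toy values are illustrative.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def pointer_softmax(context, w_gate, shortlist_logits, location_logits):
    """Mix a shortlist softmax with a pointer (location) softmax.

    A sigmoid gate computed from the decoder context decides how much
    probability mass goes to generating from the vocabulary versus
    pointing at a source position.
    """
    g = 1.0 / (1.0 + np.exp(-(context @ w_gate)))   # switch in (0, 1)
    return g * softmax(shortlist_logits), (1.0 - g) * softmax(location_logits)

ctx = np.array([0.2, -0.1])
w = np.array([1.0, 1.0])
p_vocab, p_loc = pointer_softmax(ctx, w, np.zeros(5), np.zeros(3))
```

Together the two scaled distributions form one valid distribution over "generate a word" and "point at a source token" actions.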
In this work, we model abstractive text summarization using Attentional Encoder-Decoder Recurrent Neural Networks, and show that they achieve state-of-the-art performance on two different corpora.
Ranked #10 on Text Summarization on DUC 2004 Task 1
Recurrent Neural Network (RNN) and one of its specific architectures, Long Short-Term Memory (LSTM), have been widely used for sequence labeling.
(ii) We propose three attention schemes that integrate mutual influence between sentences into CNN; thus, the representation of each sentence takes into consideration its counterpart.
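One simple way to let a sentence representation take its counterpart into account is to weight each word by how strongly it aligns with the other sentence before pooling. The sketch below shows that idea with a dot-product match matrix and attentive pooling; it is a generic illustration, not one of the paper's three specific attention schemes.

```python
import numpy as np

def attentive_pool(S1, S2):
    """Attention-weighted pooling of sentence S1 against counterpart S2.

    S1 : (m, d), S2 : (n, d) word vectors. Each word in S1 is weighted by
    its best match in S2, so the pooled representation of S1 reflects
    its counterpart sentence.
    """
    M = S1 @ S2.T                        # (m, n) word-pair match scores
    w = M.max(axis=1)                    # best match per word of S1
    w = np.exp(w - w.max())
    w /= w.sum()                         # softmax attention over S1's words
    return w @ S1                        # (d,) counterpart-aware vector

# Toy case: the first word of S1 matches S2, so it dominates the pooling.
S1 = np.eye(2)
S2 = np.array([[1.0, 0.0]])
v = attentive_pool(S1, S2)
```

In a CNN pipeline, the same weighting can be applied to convolutional feature maps rather than raw word vectors.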
We propose two methods of learning vector representations of words and phrases that each combine sentence context with structural features extracted from dependency trees.
One direction is to define a richer, composite representation for questions and answers by combining a convolutional neural network with the basic framework.
In this paper we explore deep learning models with a memory component or attention mechanism for the question answering task.
We apply a general deep learning framework to address the non-factoid question answering task.
In sentence modeling and classification, convolutional neural network approaches have recently achieved state-of-the-art results, but all such efforts process word vectors sequentially and neglect long-distance dependencies.
In this paper, we present a novel approach for medical synonym extraction.
Relation classification is an important semantic processing task for which state-of-the-art systems still rely on costly handcrafted features.
Ranked #15 on Relation Extraction on SemEval-2010 Task 8