Detecting what emotions are expressed in text is a well-studied problem in natural language processing.
Recently deep learning has dominated many machine learning areas, including spoken language understanding (SLU).
Leveraging large amounts of unlabeled data with Transformer-based architectures such as BERT has recently gained popularity, owing to their effectiveness in learning general representations that can then be fine-tuned for downstream tasks with considerable success.
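As a concrete illustration of this pre-train/fine-tune recipe, here is a minimal sketch using the Hugging Face transformers library; the example texts, labels, and single optimization step are invented placeholders, not the setup of any particular paper.

    # Minimal fine-tuning sketch: a pre-trained BERT encoder adapted
    # to binary sentence classification (toy data, one update step).
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    texts = ["I loved this movie.", "The plot made no sense."]  # hypothetical
    labels = torch.tensor([1, 0])

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    outputs = model(**batch, labels=labels)  # loss computed internally
    outputs.loss.backward()
    optimizer.step()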
A second challenge is capturing long-range dependencies, specifically the connection between an event trigger and a distant event argument.
We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides.
To measure the true progress of existing models, we split the test set into two subsets: one that requires reasoning over linguistic structure and one that does not.
Neural Machine Translation (NMT) models are sensitive to small perturbations in the input.
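To make "small perturbation" concrete, the sketch below applies a single adjacent-character swap to an input sentence; such noise is typically harmless to a human reader yet can noticeably change an NMT system's output. The function and example sentence are illustrative, not drawn from the paper.

    import random

    def swap_adjacent_chars(sentence: str, seed: int = 0) -> str:
        """Return the sentence with one pair of adjacent characters
        swapped, a common synthetic noise model (illustrative)."""
        rng = random.Random(seed)
        chars = list(sentence)
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
        return "".join(chars)

    print(swap_adjacent_chars("The agreement was signed yesterday."))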
A variety of natural language tasks require processing of textual data which contains a mix of natural language and formal languages such as mathematical expressions.
In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations.
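One plausible reading of that pipeline, sketched below under the assumption that a pairwise classifier has already labeled event pairs with BEFORE relations, is to turn the predicted relations into a graph and order events with a topological sort; the classifier itself is stubbed out and the event pairs are invented.

    from graphlib import TopologicalSorter

    # Hypothetical output of a pairwise temporal-relation classifier:
    # (earlier_event, later_event) pairs predicted as BEFORE.
    predicted_before = [("boarded", "departed"), ("departed", "landed")]

    graph = {}
    for earlier, later in predicted_before:
        graph.setdefault(later, set()).add(earlier)  # later depends on earlier

    order = list(TopologicalSorter(graph).static_order())
    print(order)  # ['boarded', 'departed', 'landed']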
Robustness to capitalization errors is a highly desirable characteristic of named entity recognizers, yet we find standard models for the task are surprisingly brittle to such noise.
User-generated text on social media often exhibits many undesirable characteristics, including hate speech, abusive language, and insults.
Relation Extraction is the task of identifying entity mention spans in raw text and then classifying the relations that hold between pairs of those mentions.
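To make the two-step structure concrete, here is a toy example of the kind of output such a system produces; the character offsets, type labels, and relation label are illustrative, not a real annotation scheme.

    text = "Barack Obama was born in Honolulu."

    # Step 1: entity mention spans (character offsets, illustrative labels).
    mentions = [
        {"span": (0, 12), "type": "PER"},   # "Barack Obama"
        {"span": (25, 33), "type": "LOC"},  # "Honolulu"
    ]

    # Step 2: relations between pairs of mentions (indices into `mentions`).
    relations = [
        {"head": 0, "tail": 1, "label": "born_in"},
    ]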
This paper proposes a novel method to inject custom terminology into neural machine translation at run time.
Usually, the candidate lists combine the output of an external word-to-word aligner, phrase-table entries, and the most frequent words.
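As an illustration of such a combination, the sketch below merges three hypothetical candidate sources for one source word, with earlier sources taking priority; the lexicon entries are invented and the merging policy is an assumption, not the paper's method.

    def candidates_for(src_word, aligner_lex, phrase_table, freq_words, k=3):
        """Merge candidate translations from an aligner lexicon, a phrase
        table, and a most-frequent-words fallback, preserving priority."""
        merged = []
        for source in (aligner_lex.get(src_word, []),
                       phrase_table.get(src_word, []),
                       freq_words):
            for cand in source:
                if cand not in merged:
                    merged.append(cand)
        return merged[:k]

    aligner_lex = {"Haus": ["house"]}          # hypothetical aligner output
    phrase_table = {"Haus": ["home", "house"]} # hypothetical phrase table
    freq_words = ["the", "of"]                 # most frequent target words
    print(candidates_for("Haus", aligner_lex, phrase_table, freq_words))
    # ['house', 'home', 'the']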
Knowledge distillation describes a method for training a student network to perform better by learning from a stronger teacher network.
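A minimal sketch of the standard distillation loss in the style of Hinton et al., blending a temperature-softened KL term against the teacher with a hard-label cross-entropy term; the random logits stand in for real student and teacher networks.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        """Blend soft-target KL loss (teacher) with hard-label CE loss."""
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)  # scale gradients back up after temperature softening
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1.0 - alpha) * hard

    student_logits = torch.randn(4, 10, requires_grad=True)  # toy stand-in
    teacher_logits = torch.randn(4, 10)                      # toy stand-in
    labels = torch.randint(0, 10, (4,))
    print(distillation_loss(student_logits, teacher_logits, labels))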
In this paper, we concentrate on speeding up the decoder by applying a more flexible beam search strategy whose candidate size may vary at each time step depending on the candidate scores.
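A hedged sketch of the idea: at each step, instead of always keeping a fixed beam of size k, keep only candidates whose score falls within a margin of the best one. The margin criterion and all scores below are illustrative; the paper's actual pruning rule may differ.

    def adaptive_beam(scored_candidates, max_beam=5, margin=1.5):
        """Keep up to max_beam candidates whose log-probability is
        within `margin` of the best candidate at this time step."""
        ranked = sorted(scored_candidates, key=lambda c: c[1], reverse=True)
        best = ranked[0][1]
        return [(tok, s) for tok, s in ranked[:max_beam] if best - s <= margin]

    step_scores = [("cat", -0.4), ("dog", -0.6), ("car", -2.5), ("sky", -4.0)]
    print(adaptive_beam(step_scores))  # [('cat', -0.4), ('dog', -0.6)]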
The basic concept in NMT is to train a large neural network that maximizes translation performance on a given parallel corpus.
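In standard notation, this objective amounts to maximizing the corpus-level log-likelihood of target sentences given their sources (a sketch of the usual formulation, not taken from any one paper):

    \theta^{*} = \arg\max_{\theta} \sum_{(x,y) \in D} \sum_{t=1}^{|y|} \log p_{\theta}\!\left(y_t \mid y_{<t}, x\right)

where D is the parallel corpus and y_{<t} denotes the target-side prefix generated so far.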
Attention-based Neural Machine Translation (NMT) models suffer from attention deficiency issues, as recent research has observed.
In this paper, we propose a novel fine-tuning algorithm for the recently introduced multi-way, multilingual neural machine translation model that enables zero-resource machine translation.