Although grammatical error correction (GEC) has achieved good performance on texts written by learners of English as a second language, performance on low error density domains where texts are written by English speakers of varying levels of proficiency can still be improved.
The success of a natural language processing (NLP) system on a task does not amount to fully understanding the complexity of the task, typified by many deep learning models.
In conversational question answering (CQA), the task of question rewriting (QR) in context aims to rewrite a context-dependent question into an equivalent self-contained question that gives the same answer.
Current state-of-the-art supervised word sense disambiguation (WSD) systems (such as GlossBERT and the bi-encoder model) yield surprisingly good results by purely leveraging pre-trained language models and short dictionary definitions (or glosses) of the different word senses.
Ranked #2 on Word Sense Disambiguation on Supervised:
We analyze the causes and effects of the overwhelming false negative problem in the DocRED dataset.
Our model consistently outperforms strong baselines and its performance exceeds the previous SOTA by 1.36 F1 and 1.46 Ign_F1 score on the DocRED leaderboard.
Ranked #1 on Relation Extraction on DocRED
We leverage unlabeled data to improve classification in student training where we employ two teachers to refine the labeling of unlabeled data through teacher-student learning in a bootstrapping manner.
In this paper, we propose a system combination method for grammatical error correction (GEC), based on nonlinear integer programming (IP).
However, most existing state-of-the-art GEC approaches are based on similar sequence-to-sequence neural networks, so the gains from combining the outputs of component systems similar to one another are limited.
Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction, and they cover very few relations.
One of the major challenges is that a dialogue system may generate an undesired utterance leading to a dialogue breakdown, which degrades the overall interaction quality.
Experimental results show that our proposed method achieves significant performance improvements over the state-of-the-art pretrained cross-lingual language model in the CLCD setting.
To improve the robustness of self-training, in this paper we present class-aware feature self-distillation (CFd) to learn discriminative features from PrLMs, in which PrLM features are self-distilled into a feature adaptation module and the features from the same class are more tightly clustered.
Despite recent progress in conversational question answering, most prior work does not focus on follow-up questions.
Multi-hop question answering (QA) requires a model to retrieve and integrate information from different parts of a long text to answer a question.
A relation tuple consists of two entities and the relation between them, and often such tuples are found in unstructured text.
Ranked #1 on Relation Extraction on NYT24
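The relation tuple described in the entry above (two entities plus the relation between them) can be pictured as a small record type. The following is only a minimal sketch: the sentence, the relation label `place_of_birth`, and the helper `entities_in_text` are hypothetical illustrations, not taken from the paper or from the NYT24 dataset.

```python
from typing import NamedTuple


class RelationTuple(NamedTuple):
    """One extracted fact: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


# Hypothetical sentence and hand-written tuple, purely illustrative.
sentence = "Barack Obama was born in Honolulu."
fact = RelationTuple(head="Barack Obama", relation="place_of_birth", tail="Honolulu")


def entities_in_text(t: RelationTuple, text: str) -> bool:
    """Check that both entity mentions literally occur in the text."""
    return t.head in text and t.tail in text
```

A check such as `entities_in_text(fact, sentence)` only confirms that both mentions appear in the text; deciding which relation holds between them is the actual extraction problem.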
Contextualized word representations are able to give different representations for the same word in different contexts, and they have been shown to be effective in downstream natural language processing tasks, such as question answering, named entity recognition, and sentiment analysis.
Ranked #14 on Word Sense Disambiguation on Supervised:
The objective of this work is to develop an automated diagnosis system that is able to predict the probability of appendicitis given a free-text emergency department (ED) note and additional structured information (e.g., lab test results).
Despite the advancement of question answering (QA) systems and rapid improvements on held-out test sets, their generalizability is a topic of concern.
Aspect-based sentiment analysis produces a list of aspect terms and their corresponding sentiments for a natural language sentence.
However, current approaches suffer from an impractical assumption that every question has a valid answer in the associated passage.
We also show that a state-of-the-art GEC system can be improved when quality scores are used as features for re-ranking the N-best candidates.
Ranked #2 on Grammatical Error Correction on Restricted
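Re-ranking N-best candidates with a quality score as an extra feature, as mentioned in the entry above, can be sketched as a weighted sum over per-candidate features. This is a rough illustration under stated assumptions: the `Candidate` fields, the weights, and the example scores are all hypothetical, and the paper's actual feature set and tuning procedure are not given in this listing.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    model_score: float    # assumed: score from the GEC system, higher is better
    quality_score: float  # assumed: separate quality estimate, higher is better


def rerank(candidates, w_model=1.0, w_quality=1.0):
    """Sort candidates by a weighted sum of the two features, best first."""
    def combined(c):
        return w_model * c.model_score + w_quality * c.quality_score
    return sorted(candidates, key=combined, reverse=True)


# Hypothetical 2-best list with made-up scores.
n_best = [
    Candidate("He go to school .", model_score=-1.0, quality_score=0.2),
    Candidate("He goes to school .", model_score=-1.2, quality_score=0.9),
]
best = rerank(n_best, w_quality=2.0)[0]
```

With the quality weight set to zero the ordering falls back to the system's own scores, so the quality feature only changes the output when it disagrees with the original ranking.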
We consider the cross-domain sentiment classification problem, where a sentiment classifier is to be learned from a source domain and to be generalized to a target domain.
Previous studies of the correlation of these metrics with human quality judgments were inconclusive, due to the lack of appropriate significance tests, discrepancies in the methods, and choice of datasets used.
First, we propose a method for target representation that better captures the semantic meaning of the opinion target.
Attention-based long short-term memory (LSTM) networks have proven to be useful in aspect-level sentiment classification.
Our goal in this paper is to propose a benchmark evaluation setup for Chinese-to-English machine translation, such that the effectiveness of a newly proposed MT approach can be directly compared to previous approaches.
We improve automatic correction of grammatical, orthographic, and collocation errors in text using a multilayer convolutional encoder-decoder neural network.
Ranked #1 on Grammatical Error Correction on Restricted
Neural network models recently proposed for question answering (QA) primarily focus on capturing the passage-question relation.
Ranked #4 on Question Answering on NewsQA
We build a grammatical error correction (GEC) system primarily based on the state-of-the-art statistical machine translation (SMT) approach, using task-specific features and tuning, and further enhance it with the modeling power of neural network joint models.
Unlike topic models which typically assume independently generated words, word embedding models encourage words that appear in similar contexts to be located close to each other in the embedding space.
In machine translation (MT) that involves translating between two languages with significant differences in word order, determining the correct word order of translated words is a major challenge.
Reordering poses a major challenge in machine translation (MT) between two languages with significant differences in word order.
Grammatical error correction (GEC) is the task of detecting and correcting grammatical errors in texts written by second language learners.
Phrase-based statistical machine translation (SMT) systems have previously been used for the task of grammatical error correction (GEC) to achieve state-of-the-art accuracy.
We propose a novel language-independent approach for improving machine translation for resource-poor languages by exploiting their similarity to resource-rich ones.