|TREND||DATASET||BEST METHOD||PAPER TITLE||PAPER||CODE||COMPARE|
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
SOTA for Common Sense Reasoning on SWAG
Convolutional Neural Networks (CNNs) have recently achieved remarkably strong performance on the practically important task of sentence classification (kim 2014, kalchbrenner 2014, johnson 2014).
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks.
First, the majority of datasets for sequential short-text classification (i. e., classification of short texts that appear in sequences) are small: we hope that releasing a new large dataset will help develop more accurate algorithms for this task.
Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing.
Biomedical text mining is becoming increasingly important as the number of biomedical documents rapidly grows.
We present NMT-Keras, a flexible toolkit for training deep learning models, which puts a particular emphasis on the development of advanced applications of neural machine translation systems, such as interactive-predictive translation protocols and long-term adaptation of the translation system via continuous learning.
Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive.
SOTA for Sentence Classification on Paper Field (using extra training data)
CITATION INTENT CLASSIFICATION DEPENDENCY PARSING LANGUAGE MODELLING MEDICAL NAMED ENTITY RECOGNITION PARTICIPANT INTERVENTION COMPARISON OUTCOME EXTRACTION RELATION EXTRACTION SENTENCE CLASSIFICATION
We present a memory augmented neural network for natural language understanding: Neural Semantic Encoders.
#11 best model for Question Answering on WikiQA
However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found.
CHINESE WORD SEGMENTATION DEPENDENCY PARSING DOCUMENT CLASSIFICATION IMAGE CLASSIFICATION LANGUAGE MODELLING MACHINE TRANSLATION MULTI-TASK LEARNING PART-OF-SPEECH TAGGING SEMANTIC ROLE LABELING SEMANTIC TEXTUAL SIMILARITY SENTENCE CLASSIFICATION SENTIMENT ANALYSIS