All-but-the-Top: Simple and Effective Postprocessing for Word Representations

Real-valued word representations have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities. In this paper, we demonstrate a very simple, and yet counter-intuitive, postprocessing technique -- eliminate the common mean vector and a few top dominating directions from the word vectors -- that renders off-the-shelf representations even stronger.
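
The postprocessing described above reduces to three steps: subtract the common mean vector, compute the top D principal directions of the centered vectors, and remove each vector's projection onto those directions. Below is a minimal numpy sketch of that idea; the function name is illustrative, SVD is one of several ways to obtain the principal directions, and setting D to roughly dim/100 follows the heuristic suggested in the paper.

```python
import numpy as np

def all_but_the_top(embeddings: np.ndarray, n_components: int) -> np.ndarray:
    """Remove the mean and the top dominating directions from word vectors.

    embeddings:   (vocab_size, dim) matrix of word vectors.
    n_components: number D of top directions to remove (roughly dim / 100).
    """
    # 1. Subtract the common mean vector.
    mu = embeddings.mean(axis=0)
    centered = embeddings - mu

    # 2. Top principal directions of the centered vectors via SVD
    #    (rows of vt are principal axes, ordered by singular value).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top_directions = vt[:n_components]            # (n_components, dim)

    # 3. Remove the projection onto those directions.
    projections = centered @ top_directions.T     # (vocab_size, n_components)
    return centered - projections @ top_directions

# Usage sketch: 300-dimensional vectors, remove the top 3 directions.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vectors = rng.normal(size=(10000, 300)).astype(np.float32)
    processed = all_but_the_top(vectors, n_components=3)
    print(processed.shape)  # (10000, 300)
```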

ICLR 2018

Results from the Paper


TASK                  | DATASET                             | MODEL            | METRIC   | VALUE | GLOBAL RANK
Sentiment Analysis    | MR                                  | GRU-RNN-WORD2VEC | Accuracy | 78.26 | #9
Sentiment Analysis    | SST-5 (fine-grained classification) | GRU-RNN-WORD2VEC | Accuracy | 45.02 | #22
Subjectivity Analysis | SUBJ                                | GRU-RNN-GLOVE    | Accuracy | 91.85 | #11
Text Classification   | TREC-6                              | GRU-RNN-GLOVE    | Error    | 7     | #11

Methods used in the Paper


METHOD | TYPE
GloVe  | Word Embeddings