FRAGE: Frequency-Agnostic Word Representation

NeurIPS 2018 · Chengyue Gong, Di He, Xu Tan, Tao Qin, Liwei Wang, Tie-Yan Liu

Continuous word representation (a.k.a. word embedding) is a basic building block in many neural network-based models used in natural language processing tasks. Although it is widely accepted that words with similar semantics should be close to each other in the embedding space, we find that word embeddings learned in several tasks are biased towards word frequency: the embeddings of high-frequency and low-frequency words lie in different subregions of the embedding space, and the embedding of a rare word and a popular word can be far from each other even if they are semantically similar…
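To make the frequency-bias idea concrete, below is a toy NumPy sketch (not the paper's implementation) of FRAGE-style adversarial training: a logistic discriminator tries to predict whether an embedding belongs to a frequent or a rare word, while the embeddings are updated to fool it, so the two frequency groups become less separable. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n = 2, 100

# Frequent-word embeddings start clustered away from rare-word ones,
# mimicking the frequency bias described in the abstract.
emb = np.vstack([rng.normal(+1.0, 0.3, (n, dim)),   # frequent words
                 rng.normal(-1.0, 0.3, (n, dim))])  # rare words
freq_label = np.concatenate([np.ones(n), np.zeros(n)])  # 1 = frequent

w, b = np.zeros(dim), 0.0   # logistic frequency discriminator
lr_d, lr_e = 0.5, 0.1       # illustrative learning rates

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def disc_accuracy():
    return ((sigmoid(emb @ w + b) > 0.5) == freq_label).mean()

for step in range(500):
    p = sigmoid(emb @ w + b)
    err = p - freq_label                     # d(BCE loss)/d(logit)
    # 1) discriminator step: descend the classification loss
    w -= lr_d * emb.T @ err / len(emb)
    b -= lr_d * err.mean()
    # 2) adversarial step: embeddings ascend the discriminator's loss,
    #    pulling the frequent and rare clusters toward each other
    emb += lr_e * err[:, None] * w

gap = np.linalg.norm(emb[:n].mean(0) - emb[n:].mean(0))
print(f"discriminator accuracy: {disc_accuracy():.2f}, cluster gap: {gap:.2f}")
```

In the paper this min-max game is played jointly with the downstream task loss (e.g. translation or language modelling), so embeddings stay useful for the task while becoming uninformative about frequency; the toy above isolates only the adversarial part.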

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Machine Translation | IWSLT2015 German-English | Transformer with FRAGE | BLEU score | 33.97 | #5 |
| Language Modelling | Penn Treebank (Word Level) | FRAGE + AWD-LSTM-MoS + dynamic eval | Validation perplexity | 47.38 | #2 |
| | | | Test perplexity | 46.54 | #3 |
| | | | Params | 22M | #1 |
| Language Modelling | WikiText-2 | FRAGE + AWD-LSTM-MoS + dynamic eval | Validation perplexity | 40.85 | #2 |
| | | | Test perplexity | 39.14 | #3 |
| | | | Params | 35M | #1 |
| Machine Translation | WMT2014 English-German | Transformer Big with FRAGE | BLEU score | 29.11 | #13 |