Learn Interpretable Word Embeddings Efficiently with von Mises-Fisher Distribution

25 Sep 2019  ·  Minghong Yao, Liansheng Zhuang, Houqiang Li, Jian Yang, Shafei Wang

Word embeddings play a key role in many natural language processing tasks. However, the dominant word embedding models do not explain what information the resulting embeddings carry. To obtain interpretable word embeddings, we replace each word vector with a probability density distribution. The key insight is that if we regularize the mixture distribution of all words to be uniform, we can prove that the inner product between word embeddings represents the point-wise mutual information between the corresponding words. Moreover, our model can handle polysemy: each word's probability density distribution generates different vectors for its different senses. We evaluate our model on several word similarity tasks, and the results show that it consistently outperforms the dominant models.
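A minimal sketch of the representation described in the abstract (not the authors' code): each word is modeled as a von Mises-Fisher density on the unit sphere with a mean direction and a concentration, multiple sense vectors are drawn from that density, and the inner product between mean embeddings is read as a point-wise mutual information estimate under the paper's claim. The vocabulary, dimensionality, and parameter values below are hypothetical placeholders, and the example assumes SciPy >= 1.11 for scipy.stats.vonmises_fisher.

```python
# Sketch only: learned parameters are replaced by random placeholders.
import numpy as np
from scipy.stats import vonmises_fisher

rng = np.random.default_rng(0)
dim = 50  # embedding dimensionality (hypothetical)

def random_unit_vector(d):
    v = rng.normal(size=d)
    return v / np.linalg.norm(v)

# Hypothetical vocabulary: word -> (mean direction mu, concentration kappa).
vocab = {w: (random_unit_vector(dim), 50.0) for w in ["bank", "river", "money"]}

# Polysemy: draw several sense vectors for one word from its density.
mu, kappa = vocab["bank"]
sense_vectors = vonmises_fisher(mu, kappa).rvs(3, random_state=rng)

# Under the paper's claim, the inner product between word embeddings
# approximates the point-wise mutual information between the two words.
pmi_estimate = vocab["river"][0] @ vocab["money"][0]
print(sense_vectors.shape, pmi_estimate)
```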
