Implicit Subjective and Sentimental Usages in Multi-sense Word Embeddings

WS 2018 · Yuqi Sun, Haoyue Shi, Junfeng Hu

In multi-sense word embeddings, contextual variation in a corpus may cause a univocal word to be embedded into several distinct sense vectors. Shi et al. (2016) show that such \textit{pseudo multi-senses} can be eliminated by linear transformations. In this paper, we show that \textit{pseudo multi-senses}, though seemingly redundant, may arise from a uniform and meaningful phenomenon such as subjective and sentimental usage. We present an unsupervised algorithm that finds a linear transformation minimizing the transformed distance of a group of sense pairs. The major shrinking direction of this transformation is found to be related to subjective shift. Therefore, we can not only eliminate \textit{pseudo multi-senses} in multi-sense embeddings, but also identify these subjective senses and automatically tag the subjective and sentimental usage of words in the corpus.
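To make the idea of a "major shrinking direction" concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes that sense pairs are given as two aligned NumPy arrays, takes the dominant eigenvector of the scatter matrix of pair differences as a stand-in for the shrinking direction, and builds a linear map that contracts along it. The function names `shrinking_direction` and `shrink_transform`, the `strength` parameter, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def shrinking_direction(senses_a, senses_b):
    """Unit vector along which the paired sense vectors differ most.

    Assumption: the top eigenvector of the difference scatter matrix is used
    as a proxy for the 'major shrinking direction'.
    """
    diffs = senses_a - senses_b                  # shape (n_pairs, dim)
    scatter = diffs.T @ diffs                    # symmetric, positive semidefinite
    _, eigvecs = np.linalg.eigh(scatter)         # eigenvalues in ascending order
    return eigvecs[:, -1]                        # eigenvector of the largest one

def shrink_transform(direction, strength=1.0):
    """Linear map W = I - strength * d d^T that contracts along d."""
    d = direction / np.linalg.norm(direction)
    return np.eye(d.shape[0]) - strength * np.outer(d, d)

# Synthetic sense pairs (illustrative only): each pair differs mainly by a
# fixed shift along the first axis, plus a little noise.
rng = np.random.default_rng(0)
dim, n_pairs = 8, 50
shift = np.zeros(dim)
shift[0] = 1.0
senses_a = rng.normal(size=(n_pairs, dim))
senses_b = senses_a + 2.0 * shift + 0.05 * rng.normal(size=(n_pairs, dim))

direction = shrinking_direction(senses_a, senses_b)
W = shrink_transform(direction)

# Mean pair distance before and after applying W to the difference vectors.
before = np.linalg.norm(senses_a - senses_b, axis=1).mean()
after = np.linalg.norm((senses_a - senses_b) @ W.T, axis=1).mean()
print(f"mean pair distance: {before:.3f} -> {after:.3f}")
```

In this toy setting the recovered direction aligns with the planted shift, and projecting it out brings the paired vectors close together. Under the same assumption, the size of a sense pair's difference along `direction` could serve as a rough score for how strongly that pair exhibits the shift.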
