Word Sense Induction
17 papers with code • 2 benchmarks • 2 datasets
Word sense induction (WSI) is widely known as the “unsupervised version” of word sense disambiguation (WSD). The problem is stated as follows: given a target word (e.g., “cold”) and a collection of sentences that use it (e.g., “I caught a cold”, “The weather is cold”), cluster the sentences according to the sense in which the target word is used. We do not need to know which sense each cluster represents, but all sentences within a cluster should use the target word in the same sense.
Description from NLP Progress
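As a rough illustration of the task setup described above, the following minimal Python sketch groups sentences that share at least one context word with each other. The stopword list, the helper names (context_words, induce_senses), and the shared-context-word heuristic are assumptions made for this example only; this is not the method of any paper listed on this page, and real systems typically rely on embeddings or substitute distributions rather than raw word overlap.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list (an assumption for this toy example).
STOPWORDS = {"i", "a", "the", "is", "and", "very", "in"}


def context_words(sentence, target):
    """Return the lowercased non-stopword tokens of a sentence, excluding the target word."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    return {t for t in tokens if t and t != target and t not in STOPWORDS}


def induce_senses(sentences, target):
    """Cluster sentences into connected components linked by shared context words."""
    parent = list(range(len(sentences)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # context word -> index of the first sentence containing it
    for i, sentence in enumerate(sentences):
        for word in context_words(sentence, target):
            if word in first_seen:
                parent[find(i)] = find(first_seen[word])
            else:
                first_seen[word] = i

    clusters = defaultdict(list)
    for i, sentence in enumerate(sentences):
        clusters[find(i)].append(sentence)
    return list(clusters.values())


sentences = [
    "I caught a cold",
    "The weather is cold",
    "She caught a bad cold",
    "Cold weather is coming",
]
clusters = induce_senses(sentences, "cold")
for cluster in clusters:
    print(cluster)
```

Note that the output is a set of unlabeled clusters: nothing in the result says which group is the "illness" sense and which is the "temperature" sense, which matches the task definition.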
Most implemented papers
Breaking Sticks and Ambiguities with Adaptive Skip-gram
The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words.
A Simple Approach to Learn Polysemous Word Embeddings
Evaluating these methods is also problematic, as rigorous quantitative evaluations in this space are limited, especially when compared with single-sense embeddings.
Towards better substitution-based word sense induction
Word sense induction (WSI) is the task of unsupervised clustering of word usages within a sentence to distinguish senses.
RuDSI: graph-based word sense induction dataset for Russian
We present RuDSI, a new benchmark for word sense induction (WSI) in Russian.
Automated WordNet Construction Using Word Embeddings
To evaluate our method, we construct two 600-word test sets for word-to-synset matching in French and Russian using native speakers, and we compare its performance with that of several other recent approaches.
Watset: Automatic Induction of Synsets from a Graph of Synonyms
This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings.
Improved Word Representation Learning with Sememes
The key idea is to utilize word sememes to capture exact meanings of a word within specific contexts accurately.
Russian word sense induction by clustering averaged word embeddings
The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018).