Word Sense Induction

19 papers with code • 1 benchmark • 1 dataset

Word sense induction (WSI) is widely known as the “unsupervised version” of word sense disambiguation (WSD). The problem is stated as follows: given a target word (e.g., “cold”) and a collection of sentences that use it (e.g., “I caught a cold”, “The weather is cold”), cluster the sentences according to the sense in which the word is used. The sense of each cluster does not need to be identified, but all sentences within a cluster should use the target word in the same sense.

Description from NLP Progress
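
The task setup described above can be illustrated with a toy sketch. It is not the method of any paper listed on this page: it simply groups sentences that share at least one context word around the target, whereas real systems typically rely on learned embeddings or language-model substitutes. The function names (`context_words`, `induce_senses`), the tiny stopword list, and the two extra example sentences are all assumptions made up for illustration.

```python
import re

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"i", "a", "the", "is", "and", "she"}


def context_words(sentence, target):
    """Return the lowercased content words of a sentence, minus the target."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {t for t in tokens if t != target and t not in STOPWORDS}


def induce_senses(sentences, target):
    """Assign a cluster id to each sentence.

    Two sentences end up in the same cluster if they are linked, directly
    or through other sentences, by at least one shared context word.
    """
    contexts = [context_words(s, target) for s in sentences]
    labels = list(range(len(sentences)))  # start with one cluster per sentence
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if contexts[i] & contexts[j] and labels[i] != labels[j]:
                # Merge the cluster of sentence j into the cluster of sentence i.
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    # Renumber cluster ids as 0, 1, 2, ... in order of first appearance.
    remap = {}
    return [remap.setdefault(lab, len(remap)) for lab in labels]


sentences = [
    "I caught a cold last week",             # illness sense
    "The weather is cold today",             # temperature sense
    "She caught a bad cold and stayed home", # illness sense
    "Cold weather is expected today",        # temperature sense
]
print(induce_senses(sentences, "cold"))  # → [0, 1, 0, 1]
```

The cluster ids carry no meaning of their own; as in the task definition, only the grouping matters. Exact word overlap is a brittle signal and is used here only to keep the sketch dependency-free.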

Benchmarks

These leaderboards track progress in word sense induction.

Dataset	Best Model
SemEval 2010 WSI	BERT+DP

Datasets

BRWAC

Most implemented papers

Breaking Sticks and Ambiguities with Adaptive Skip-gram

sbos/AdaGram.jl • 25 Feb 2015

The recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words.

A Simple Approach to Learn Polysemous Word Embeddings

dingwc/multisense • 6 Jul 2017

Evaluating these methods is also problematic, as rigorous quantitative evaluations in this space are limited, especially when compared with single-sense embeddings.

Towards better substitution-based word sense induction

asafamr/bertwsi • 29 May 2019

Word sense induction (WSI) is the task of unsupervised clustering of word usages within a sentence to distinguish senses.

RuDSI: graph-based word sense induction dataset for Russian

kategavrishina/rudsi • COLING (TextGraphs) 2022

We present RuDSI, a new benchmark for word sense induction (WSI) in Russian.

Exploring Topic Coherence over Many Models and Many Topics

fozziethebeat/TopicModelComparison • EMNLP 2012

unimelb: Topic Modelling-based Word Sense Induction

jhlau/hdp-wsi • SEMEVAL 2013

Automated WordNet Construction Using Word Embeddings

mkhodak/pawn • WS 2017

To evaluate our method we construct two 600-word testsets for word-to-synset matching in French and Russian using native speakers and evaluate the performance of our method along with several other recent approaches.
