Word Sense Disambiguation using a Bidirectional LSTM

WS 2016  ·  Mikael Kågebäck, Hans Salomonsson

In this paper we present a clean, yet effective, model for word sense disambiguation. Our approach leverages a bidirectional long short-term memory network which is shared between all words. This enables the model to share statistical strength and to scale well with vocabulary size. The model is trained end-to-end, directly from raw text to sense labels, and makes effective use of word order. We evaluate our approach on two standard datasets, using identical hyperparameter settings, which are in turn tuned on a third set of held-out data. We employ no external resources (e.g. knowledge graphs, part-of-speech tagging, etc.), language-specific features, or hand-crafted rules, but still achieve statistically equivalent results to the best state-of-the-art systems, which operate under no such constraints.
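The core idea above can be sketched in code: a single bidirectional LSTM encodes the context around a target word (a forward pass over the left context, a backward pass over the right context), and the concatenated hidden states feed a word-specific softmax over that word's senses. The following is a minimal pure-Python sketch of that architecture, not the paper's implementation; all dimensions, weights, and the toy sentence are hypothetical random values, and biases are omitted for brevity.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    # M: list of rows; v: vector
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lstm_step(x, h, c, params):
    """One LSTM step. Each gate weight matrix is applied to the
    concatenated [x; h] vector (biases omitted for brevity)."""
    z = x + h  # list concatenation = [x; h]
    i = [sigmoid(a) for a in matvec(params["Wi"], z)]  # input gate
    f = [sigmoid(a) for a in matvec(params["Wf"], z)]  # forget gate
    o = [sigmoid(a) for a in matvec(params["Wo"], z)]  # output gate
    g = [math.tanh(a) for a in matvec(params["Wg"], z)]  # candidate cell
    c_new = [fi * ci + ii * gi for fi, ci, ii, gi in zip(f, c, i, g)]
    h_new = [oi * math.tanh(ci) for oi, ci in zip(o, c_new)]
    return h_new, c_new

def random_params(d_in, d_h):
    # Toy random weights; a real model would learn these end-to-end.
    def mat():
        return [[random.uniform(-0.1, 0.1) for _ in range(d_in + d_h)]
                for _ in range(d_h)]
    return {"Wi": mat(), "Wf": mat(), "Wo": mat(), "Wg": mat()}

def bilstm_encode(embeddings, target_idx, d_h, params_fwd, params_bwd):
    """Run a forward LSTM over the words left of the target and a backward
    LSTM over the words right of it; the concatenated final hidden states
    represent the target word's context."""
    h, c = [0.0] * d_h, [0.0] * d_h
    for x in embeddings[:target_idx]:              # left context, left-to-right
        h, c = lstm_step(x, h, c, params_fwd)
    h_fwd = h
    h, c = [0.0] * d_h, [0.0] * d_h
    for x in reversed(embeddings[target_idx + 1:]):  # right context, right-to-left
        h, c = lstm_step(x, h, c, params_bwd)
    return h_fwd + h  # concatenated context vector, length 2 * d_h

def classify_senses(context_vec, W_word):
    """Word-specific softmax over that word's senses (hypothetical weights);
    the BiLSTM itself is shared between all words."""
    scores = matvec(W_word, context_vec)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy usage: a 5-word sentence, target word at position 2 with 3 senses.
d_emb, d_h = 4, 6
sentence = [[random.uniform(-1, 1) for _ in range(d_emb)] for _ in range(5)]
pf, pb = random_params(d_emb, d_h), random_params(d_emb, d_h)
ctx = bilstm_encode(sentence, 2, d_h, pf, pb)
W_target = [[random.uniform(-0.1, 0.1) for _ in range(2 * d_h)] for _ in range(3)]
probs = classify_senses(ctx, W_target)
```

Because the BiLSTM parameters are shared across the vocabulary while only the final softmax is word-specific, every training example improves the shared encoder, which is what lets the model scale with vocabulary size.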


Datasets


| Task                      | Dataset                   | Model             | Metric | Value | Global Rank |
|---------------------------|---------------------------|-------------------|--------|-------|-------------|
| Word Sense Disambiguation | SensEval 2 Lexical Sample | BiLSTM with GloVe | F1     | 66.9  | # 2         |
| Word Sense Disambiguation | SensEval 3 Lexical Sample | BiLSTM with GloVe | F1     | 73.4  | # 2         |

Methods


No methods listed for this paper.