Distributional Hypernym Generation by Jointly Learning Clusters and Projections

We propose a novel word embedding-based hypernym generation model that jointly learns clusters of hyponym-hypernym relations, i.e., hypernymy, and projections from hyponym to hypernym embeddings. Most recent hypernym detection models focus on a hypernymy classification problem: determining whether a given pair of words stands in a hypernymy relation. These models do not directly address the hypernym generation problem, in which a model generates hypernyms for a given word. Unlike previous studies, our model jointly learns the clusters and projections while adjusting the number of clusters, so that the number of clusters can be determined depending on the learned projections and vice versa. Our model further boosts performance by incorporating inner product-based similarity measures and negative examples, i.e., sampled non-hypernyms, into its learning objectives. We evaluated our joint learning models on Japanese and English hypernym generation tasks and showed a significant improvement over an existing pipeline model. Our model also compared favorably to existing distributed hypernym detection models on the English hypernym classification task.
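To make the cluster-and-project idea concrete, below is a minimal NumPy sketch of the general approach the abstract describes: hyponym-hypernym pairs are alternately assigned to clusters and each cluster's linear projection is refit, and at generation time a hyponym is projected and scored against the vocabulary by inner product. This is an illustration under simplifying assumptions, not the paper's exact joint objective (the paper additionally adapts the number of clusters during learning and trains with sampled negative examples); the function names (fit_cluster_projections, generate_hypernym) and the toy data are hypothetical.

```python
import numpy as np

def fit_cluster_projections(X, Y, n_clusters=3, n_iters=20, seed=0):
    """Alternate between hard cluster assignment of hyponym-hypernym pairs
    and per-cluster least-squares projection fitting.

    X: (n_pairs, dim) hyponym embeddings
    Y: (n_pairs, dim) hypernym embeddings
    Returns projection matrices W of shape (n_clusters, dim, dim) and
    the final cluster assignment of each pair.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assign = rng.integers(n_clusters, size=n)          # random initial clusters
    W = np.stack([np.eye(d) for _ in range(n_clusters)])

    for _ in range(n_iters):
        # Refit each cluster's projection on its currently assigned pairs.
        for k in range(n_clusters):
            idx = np.where(assign == k)[0]
            if len(idx) == 0:
                continue
            W[k], *_ = np.linalg.lstsq(X[idx], Y[idx], rcond=None)
        # Reassign each pair to the cluster whose projection maps the
        # hyponym embedding closest to the hypernym embedding.
        errors = np.stack([np.linalg.norm(X @ W[k] - Y, axis=1)
                           for k in range(n_clusters)])  # (n_clusters, n)
        assign = errors.argmin(axis=0)
    return W, assign

def generate_hypernym(x, W, vocab_vecs, vocab_words):
    """Project a hyponym embedding with every cluster projection and return
    the vocabulary word with the highest inner-product score."""
    best_word, best_score = None, -np.inf
    for Wk in W:
        scores = vocab_vecs @ (x @ Wk)                  # inner-product similarity
        i = int(scores.argmax())
        if scores[i] > best_score:
            best_word, best_score = vocab_words[i], float(scores[i])
    return best_word

if __name__ == "__main__":
    # Toy demo with random vectors standing in for trained word embeddings.
    rng = np.random.default_rng(1)
    dim, n_pairs = 50, 200
    X = rng.normal(size=(n_pairs, dim))
    true_W = rng.normal(size=(dim, dim)) / np.sqrt(dim)
    Y = X @ true_W + 0.01 * rng.normal(size=(n_pairs, dim))
    W, assign = fit_cluster_projections(X, Y, n_clusters=2)
    vocab_vecs = np.vstack([Y[:5], rng.normal(size=(5, dim))])
    vocab_words = [f"word{i}" for i in range(10)]
    print(generate_hypernym(X[0], W, vocab_vecs, vocab_words))
```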

COLING 2016 · PDF · Abstract
