Improved Word Embeddings with Implicit Structure Information

COLING 2016  ·  Jie Shen, Cong Liu ·

Distributed word representation is an efficient method for capturing semantic and syntactic word relations. In this work, we introduce an extension to the continuous bag-of-words model for learning word representations efficiently by using implicit structure information. Instead of relying on a syntactic parser which might be noisy and slow to build, we compute weights representing probabilities of syntactic relations based on the Huffman softmax tree in an efficient heuristic. The constructed {``}implicit graphs{''} from these weights show that these weights contain useful implicit structure information. Extensive experiments performed on several word similarity and word analogy tasks show gains compared to the basic continuous bag-of-words model.

PDF Abstract COLING 2016 PDF COLING 2016 Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods