Learning Bilingual Word Embeddings Using Lexical Definitions

WS 2019  ·  Weijia Shi, Muhao Chen, Yingtao Tian, Kai-Wei Chang ·

Bilingual word embeddings, which representlexicons of different languages in a shared em-bedding space, are essential for supporting se-mantic and knowledge transfers in a variety ofcross-lingual NLP tasks. Existing approachesto training bilingual word embeddings requireoften require pre-defined seed lexicons that areexpensive to obtain, or parallel sentences thatcomprise coarse and noisy alignment. In con-trast, we propose BilLex that leverages pub-licly available lexical definitions for bilingualword embedding learning. Without the needof predefined seed lexicons, BilLex comprisesa novel word pairing strategy to automati-cally identify and propagate the precise fine-grained word alignment from lexical defini-tions. We evaluate BilLex in word-level andsentence-level translation tasks, which seek tofind the cross-lingual counterparts of wordsand sentences respectively.BilLex signifi-cantly outperforms previous embedding meth-ods on both tasks.

PDF Abstract WS 2019 PDF WS 2019 Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here