Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces

Work on projection-based induction of cross-lingual word embedding spaces (CLWEs) predominantly focuses on the improvement of the projection (i.e., mapping) mechanisms. In this work, in contrast, we show that a simple method for post-processing monolingual embedding spaces facilitates learning of the cross-lingual alignment and, in turn, substantially improves bilingual lexicon induction (BLI). The post-processing method we examine is grounded in the generalisation of first- and second-order monolingual similarities to the nth-order similarity. By post-processing monolingual spaces before the cross-lingual alignment, the method can be coupled with any projection-based method for inducing CLWE spaces. We demonstrate the effectiveness of this simple monolingual post-processing across a set of 15 typologically diverse languages (i.e., 15*14 BLI setups), and in combination with two different projection methods.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here