Improving Bilingual Terminology Extraction from Comparable Corpora via Multiple Word-Space Models

LREC 2016  ·  Amir Hazem, Emmanuel Morin ·

There is a rich flora of word space models that have proven their efficiency in many different applications including information retrieval (Dumais, 1988), word sense disambiguation (Schutze, 1992), various semantic knowledge tests (Lund et al., 1995; Karlgren, 2001), and text categorization (Sahlgren, 2005). Based on the assumption that each model captures some aspects of word meanings and provides its own empirical evidence, we present in this paper a systematic exploration of the principal corpus-based word space models for bilingual terminology extraction from comparable corpora. We find that, once we have identified the best procedures, a very simple combination approach leads to significant improvements compared to individual models.

PDF Abstract LREC 2016 PDF LREC 2016 Abstract


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here