no code implementations • CL (ACL) 2021 • Olga Majewska, Diana McCarthy, Jasper J. F. van den Bosch, Nikolaus Kriegeskorte, Ivan Vulić, Anna Korhonen
We demonstrate how the resultant data set can be used for fine-grained analyses and evaluation of representation learning models on the intrinsic tasks of semantic clustering and semantic similarity.
no code implementations • ACL 2022 • Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo Ponti, Ivan Vulić
In this tutorial, we will thus discuss and demonstrate the importance of (building) multilingual ToD systems, and then provide a systematic overview of current research gaps, challenges and initiatives related to multilingual ToD systems, with a particular focus on their connections to current research and challenges in multilingual and low-resource NLP.
no code implementations • Findings (ACL) 2022 • Sebastian Ruder, Ivan Vulić, Anders Søgaard
Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on.
no code implementations • Findings (ACL) 2022 • Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen
Scaling dialogue systems to a multitude of domains, tasks and languages relies on costly and time-consuming data annotation for different domain-task-language configurations.
no code implementations • Findings (EMNLP) 2021 • Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen
While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.
no code implementations • CL (ACL) 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).
no code implementations • 20 May 2022 • Chia-Chien Hung, Anne Lauscher, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.
no code implementations • 30 Apr 2022 • Ivan Vulić, Goran Glavaš, Fangyu Liu, Nigel Collier, Edoardo Maria Ponti, Anna Korhonen
Pretrained multilingual language models (LMs) can be successfully transformed into multilingual sentence encoders (SEs; e.g., LaBSE, xMPNET) via additional fine-tuning or model distillation on parallel data.
no code implementations • 28 Apr 2022 • Georgios P. Spithourakis, Ivan Vulić, Michał Lis, Iñigo Casanueva, Paweł Budzianowski
Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services.
1 code implementation • 27 Apr 2022 • Iñigo Casanueva, Ivan Vulić, Georgios P. Spithourakis, Paweł Budzianowski
2) The ontology is divided into domain-specific and generic (i.e., domain-universal) intent modules that overlap across domains, promoting cross-domain reusability of annotated examples.
1 code implementation • 5 Apr 2022 • Robert Litschko, Ivan Vulić, Goran Glavaš
Current approaches therefore typically transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders: they fine-tune all the parameters of a pretrained massively multilingual Transformer (MMT, e.g., multilingual BERT) on English relevance judgments and then deploy it in the target language.
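A minimal sketch of this transfer recipe (our own code, using a standard Hugging Face cross-encoder setup rather than the paper's exact training pipeline): fine-tune a pointwise ranker on English relevance judgments, then score query-document pairs in any of the MMT's languages with the very same weights.

```python
# Sketch, not the paper's code: pointwise cross-encoder ranking with an MMT.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
ranker = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=1)  # one relevance logit per pair

def score(query, doc):
    enc = tok(query, doc, return_tensors="pt", truncation=True)
    return ranker(**enc).logits.squeeze()  # higher = more relevant

# Fine-tune `ranker` on English judgments, then call score() on
# target-language pairs for zero-shot cross-lingual retrieval.
```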
no code implementations • 5 Apr 2022 • Gabor Fuisz, Ivan Vulić, Samuel Gibbons, Inigo Casanueva, Paweł Budzianowski
In particular, we focus on modeling and studying slot labeling (SL), a crucial component of NLU for dialog, through the lens of QA, aiming to improve both its performance and efficiency and to make it more effective and resilient when working with limited task data.
1 code implementation • ACL 2022 • Yaoyiran Li, Fangyu Liu, Nigel Collier, Anna Korhonen, Ivan Vulić
At Stage C1, we propose to refine standard cross-lingual linear maps between static word embeddings (WEs) via a contrastive learning objective; we also show how to integrate it into the self-learning procedure for even more refined cross-lingual maps.
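A minimal sketch (ours, not the released code) of the Stage C1 idea: treat row-aligned seed-dictionary embeddings as positive pairs and refine a linear map with an InfoNCE-style contrastive loss, using the other targets in the batch as negatives.

```python
# Sketch of contrastive refinement of a cross-lingual linear map.
# `src_emb`, `tgt_emb`: row-aligned (n, d) tensors for seed translation pairs.
import torch
import torch.nn.functional as F

def contrastive_map_refinement(src_emb, tgt_emb, epochs=10, lr=1e-3, tau=0.05):
    d = src_emb.size(1)
    W = torch.nn.Parameter(torch.eye(d))          # initialize at identity map
    opt = torch.optim.Adam([W], lr=lr)
    for _ in range(epochs):
        z_src = F.normalize(src_emb @ W, dim=-1)  # mapped source word embeddings
        z_tgt = F.normalize(tgt_emb, dim=-1)
        logits = z_src @ z_tgt.t() / tau          # pairwise cosine similarities
        labels = torch.arange(len(src_emb))       # i-th source matches i-th target
        loss = F.cross_entropy(logits, labels)    # in-batch negatives (InfoNCE)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return W.detach()
```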
no code implementations • 31 Jan 2022 • Olga Majewska, Evgeniia Razumovskaia, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Through this process we annotate a new large-scale dataset for training and evaluation of multilingual and cross-lingual ToD systems.
2 code implementations • 27 Jan 2022 • Emanuele Bugliarello, Fangyu Liu, Jonas Pfeiffer, Siva Reddy, Desmond Elliott, Edoardo Maria Ponti, Ivan Vulić
Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups.
1 code implementation • 21 Dec 2021 • Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
In this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a number of diverse language pairs.
1 code implementation • ACL 2022 • Wenxuan Zhou, Fangyu Liu, Ivan Vulić, Nigel Collier, Muhao Chen
To achieve this, it is crucial to represent multilingual knowledge in a shared/unified space.
1 code implementation • ACL 2022 • Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić
Based on an in-depth analysis, we additionally find that sparsity is crucial to prevent both 1) interference between the fine-tunings to be composed and 2) overfitting.
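A minimal sketch of the composition step as we read it (the released Lottery Ticket Sparse Fine-Tuning code differs in detail): each fine-tuning is stored as a sparse tensor of differences from the pretrained weights, and composing a language fine-tuning with a task fine-tuning amounts to elementwise addition.

```python
# Sketch: compose two sparse fine-tunings onto a pretrained model.
import torch

def apply_sparse_fine_tunings(model, language_sft, task_sft):
    """Each *_sft maps parameter names to delta tensors that are mostly zeros."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in language_sft:
                param.add_(language_sft[name])   # language-specific deltas
            if name in task_sft:
                param.add_(task_sft[name])       # task-specific deltas
    return model
```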
no code implementations • EMNLP 2021 • Ivan Vulić, Pei-Hao Su, Sam Coope, Daniela Gerz, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Tsung-Hsien Wen
Transformer-based language models (LMs) pretrained on large text collections have been shown to store a wealth of semantic knowledge.
1 code implementation • CoNLL (EMNLP) 2021 • Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen, Ivan Vulić
Recent work indicated that pretrained language models (PLMs) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self-supervised techniques.
1 code implementation • Findings (ACL) 2022 • Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, Iryna Gurevych
In this work, we address this gap and provide xGQA, a new multilingual evaluation benchmark for the visual question answering task.
no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen
Motivated by this question, we aim at constructing an informative prior over neural weights, in order to adapt quickly to held-out languages in the task of character-level language modeling.
1 code implementation • 23 Jul 2021 • Edoardo Maria Ponti, Julia Kreutzer, Ivan Vulić, Siva Reddy
To remedy this, we propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model, by treating the intermediate translations as a latent random variable.
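A minimal sketch of the idea at prediction time (with hypothetical `translator` and `classifier` helpers; the paper trains both components jointly): instead of classifying a single 1-best translation, sample several translations as draws of the latent variable and marginalize the classifier's predictions over them.

```python
# Sketch: Monte Carlo marginalization over latent translations.
import math
import torch

def latent_translation_predict(x_src, translator, classifier, k=5):
    log_probs = []
    for _ in range(k):
        x_en = translator.sample(x_src)              # z ~ p(z | x): one latent translation
        log_probs.append(classifier.log_prob(x_en))  # log p(y | z)
    # p(y | x) ≈ (1/K) Σ_z p(y | z), computed in log space
    return torch.logsumexp(torch.stack(log_probs), dim=0) - math.log(k)
```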
1 code implementation • ACL 2021 • Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
We use the evaluation framework to benchmark the widely used conversational DialoGPT model along with the adaptations of four debiasing methods.
1 code implementation • ACL 2021 • Fangyu Liu, Ivan Vulić, Anna Korhonen, Nigel Collier
To this end, we propose and evaluate a series of cross-lingual transfer methods for the XL-BEL task, and demonstrate that general-domain bitext helps propagate the available English knowledge to languages with little to no in-domain data.
no code implementations • EMNLP 2021 • Daniela Gerz, Pei-Hao Su, Razvan Kusztos, Avishek Mondal, Michał Lis, Eshan Singhal, Nikola Mrkšić, Tsung-Hsien Wen, Ivan Vulić
We present a systematic study on multilingual and cross-lingual intent detection from spoken data.
no code implementations • 17 Apr 2021 • Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo M. Ponti, Anna Korhonen, Ivan Vulić
We find that the most critical factor preventing the creation of truly multilingual ToD systems is the lack of datasets in most languages for both training and evaluation.
1 code implementation • EMNLP 2021 • Qianchu Liu, Edoardo M. Ponti, Diana McCarthy, Ivan Vulić, Anna Korhonen
In order to address these gaps, we present AM2iCo (Adversarial and Multilingual Meaning in Context), a wide-coverage cross-lingual and multilingual evaluation set; it aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts for 14 language pairs.
1 code implementation • EMNLP 2021 • Fangyu Liu, Ivan Vulić, Anna Korhonen, Nigel Collier
In this work, we demonstrate that it is possible to turn MLMs into effective universal lexical and sentence encoders even without any additional data and without any supervision.
Ranked #9 on Semantic Textual Similarity on STS16
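The self-supervised recipe behind the entry above is compact enough to sketch; a minimal, assumption-laden version (ours): encode each string twice so that dropout produces two different views, then train with an InfoNCE loss that treats the two views of the same string as positives and other in-batch strings as negatives.

```python
# Sketch of a dropout-based contrastive loss for turning an MLM into an encoder.
# Assumes `encoder` is a Hugging Face BERT-like model kept in train() mode
# so that dropout is active and the two passes differ.
import torch
import torch.nn.functional as F

def mirror_style_loss(encoder, tokenizer, batch_texts, tau=0.04):
    enc = tokenizer(batch_texts + batch_texts, return_tensors="pt",
                    padding=True, truncation=True)
    out = encoder(**enc).last_hidden_state[:, 0]  # [CLS] vectors for both passes
    z = F.normalize(out, dim=-1)
    n = len(batch_texts)
    logits = z[:n] @ z[n:].t() / tau              # view-1 vs. view-2 similarities
    labels = torch.arange(n)                      # matching views are positives
    return F.cross_entropy(logits, labels)
```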
1 code implementation • 22 Mar 2021 • Gregor Geigle, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, Iryna Gurevych
Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.
1 code implementation • 21 Jan 2021 • Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
Therefore, in this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs.
no code implementations • ACL 2021 • Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze
Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT.
no code implementations • ACL 2021 • Olga Majewska, Ivan Vulić, Goran Glavaš, Edoardo M. Ponti, Anna Korhonen
We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers in event extraction tasks, i.e., downstream tasks for which accurate verb processing is paramount.
2 code implementations • EMNLP 2021 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.
1 code implementation • ACL 2021 • Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.
no code implementations • 11 Dec 2020 • Marko Vidoni, Ivan Vulić, Goran Glavaš
Adapter modules, additional trainable parameters that enable efficient fine-tuning of pretrained transformers, have recently been used for language specialization of multilingual transformers, improving downstream zero-shot cross-lingual transfer.
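For reference, the adapter idea itself is easy to sketch; below is a minimal Houlsby-style bottleneck module (our simplification; the paper's orthogonal language and task adapters add more structure on top of this):

```python
# Sketch: bottleneck adapter inserted into each Transformer layer while the
# pretrained weights stay frozen; only the small down/up projections train.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size=768, bottleneck=48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Residual connection keeps the pretrained representation intact by default.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```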
1 code implementation • COLING 2020 • Yaoyiran Li, Edoardo M. Ponti, Ivan Vulić, Anna Korhonen
On the other hand, this also provides an extrinsic evaluation protocol to probe the properties of emergent languages ex vitro.
no code implementations • NAACL 2021 • Matthew Henderson, Ivan Vulić
We propose ConVEx (Conversational Value Extractor), an efficient pretraining and fine-tuning neural approach for slot-labeling dialog tasks.
no code implementations • EMNLP 2020 • Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture.
1 code implementation • 15 Aug 2020 • Goran Glavaš, Ivan Vulić
Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).
4 code implementations • EMNLP 2020 • Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych
We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
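A usage sketch following the AdapterHub quickstart of this period (the adapter-transformers API has evolved since, so treat these identifiers as illustrative rather than current): download a pretrained task adapter from the hub and activate it.

```python
# Sketch based on the adapter-transformers fork's documented quickstart.
from transformers import AutoModelWithHeads  # provided by adapter-transformers

model = AutoModelWithHeads.from_pretrained("bert-base-uncased")
adapter_name = model.load_adapter("sentiment/sst-2@ukp")  # "stitch in" a hub adapter
model.set_active_adapters(adapter_name)                   # route the forward pass through it
```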
1 code implementation • ACL 2020 • Sam Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, Matthew Henderson
We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as a turn-based span extraction task.
1 code implementation • ACL 2020 • Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart, Anna Korhonen
We present a neural framework for learning associations between interrelated groups of words such as the ones found in Subject-Verb-Object (SVO) structures.
no code implementations • 1 May 2020 • Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance.
1 code implementation • EMNLP 2020 • Edoardo Maria Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen
In order to simulate human language capacity, natural language processing systems must be able to reason about the dynamics of everyday situations, including their possible causes and effects.
Ranked #1 on Cross-Lingual Transfer on XCOPA (using extra training data)
3 code implementations • EMNLP 2020 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.
Ranked #2 on Cross-Lingual Transfer on XCOPA (using extra training data)
no code implementations • COLING 2020 • Robert Litschko, Ivan Vulić, Željko Agić, Goran Glavaš
Current methods of cross-lingual parser transfer focus on predicting the best parser for a low-resource target language globally, that is, "at treebank level".
1 code implementation • EMNLP 2020 • Ivan Vulić, Sebastian Ruder, Anders Søgaard
Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic.
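The standard alignment method whose isomorphism assumption the paper examines is the orthogonal Procrustes solution; a minimal sketch:

```python
# Sketch: orthogonal Procrustes alignment over a seed dictionary, the step
# where the (approximate-)isomorphism assumption enters.
import numpy as np

def procrustes_align(X, Y):
    """X, Y: row-aligned (n, d) source/target vectors for seed translation pairs."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt  # orthogonal W minimizing ||XW - Y||_F, so X @ W ≈ Y
```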
no code implementations • 10 Mar 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).
4 code implementations • WS 2020 • Iñigo Casanueva, Tadas Temčinas, Daniela Gerz, Matthew Henderson, Ivan Vulić
Building conversational systems in new domains and with added functionality requires resource-efficient models that work under low-data regimes (i.e., in few-shot setups).
no code implementations • EMNLP 2020 • Haim Dubossarsky, Ivan Vulić, Roi Reichart, Anna Korhonen
Performance in cross-lingual NLP tasks is impacted by the (dis)similarity of languages at hand: e.g., previous work has suggested there is a connection between the expected success of bilingual lexicon induction (BLI) and the assumption of (approximate) isomorphism between monolingual embedding spaces.
1 code implementation • 30 Jan 2020 • Edoardo M. Ponti, Ivan Vulić, Ryan Cotterell, Marinela Parovic, Roi Reichart, Anna Korhonen
In this work, we propose a Bayesian generative model for the space of neural parameters.
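A minimal sketch of one practical consequence (our reading; the paper's generative model factorizes the parameter space and is richer than this): under a Gaussian prior over neural weights, adapting to a new language amounts to MAP fine-tuning, i.e., the task loss plus an L2 penalty pulling weights toward the prior mean.

```python
# Sketch: MAP adaptation with a Gaussian prior over model parameters.
import torch

def map_loss(task_loss, model, prior_mean, prior_precision=1e-2):
    """prior_mean: dict mapping parameter names to prior-mean tensors."""
    reg = sum(((p - prior_mean[n]) ** 2).sum()
              for n, p in model.named_parameters())
    return task_loss + 0.5 * prior_precision * reg  # negative log Gaussian prior
```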
3 code implementations • Findings of the Association for Computational Linguistics 2020 • Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien Wen, Ivan Vulić
General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train.
Ranked #1 on Conversational Response Selection on PolyAI AmazonQA
no code implementations • CONLL 2019 • Yi Zhu, Benjamin Heinzerling, Ivan Vulić, Michael Strube, Roi Reichart, Anna Korhonen
Recent work has validated the importance of subword information for word representation learning.
3 code implementations • 13 Sep 2019 • Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić
Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.
1 code implementation • COLING 2020 • Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš
In this work, we complement such distributional knowledge with external lexical knowledge, that is, we integrate the discrete knowledge on word-level semantic similarity into pretraining.
1 code implementation • IJCNLP 2019 • Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen
A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) show that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs).
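For concreteness, the BLI metric behind such results is typically Precision@1; a minimal sketch using plain cosine nearest-neighbour retrieval (CSLS is a common alternative):

```python
# Sketch: BLI Precision@1 over a gold test dictionary.
import numpy as np

def bli_precision_at_1(X_src, W, Y_tgt, gold_idx):
    """X_src: (n, d) test source vectors; Y_tgt: (m, d) target vocabulary vectors;
    gold_idx: (n,) gold target indices. Returns P@1 in [0, 1]."""
    Z = X_src @ W                                     # map sources into target space
    Z /= np.linalg.norm(Z, axis=1, keepdims=True)
    Yn = Y_tgt / np.linalg.norm(Y_tgt, axis=1, keepdims=True)
    nn = (Z @ Yn.T).argmax(axis=1)                    # cosine nearest neighbour
    return float((nn == gold_idx).mean())
```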
no code implementations • IJCNLP 2019 • Matthew Henderson, Ivan Vulić, Iñigo Casanueva, Paweł Budzianowski, Daniela Gerz, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su
We present PolyResponse, a conversational search engine that supports task-oriented dialogue.
no code implementations • 12 Jul 2019 • Paweł Budzianowski, Ivan Vulić
Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data.
1 code implementation • ACL 2019 • Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su
Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks.
1 code implementation • NAACL 2019 • Yi Zhu, Ivan Vulić, Anna Korhonen
The use of subword-level information (e.g., characters, character n-grams, morphemes) has become ubiquitous in modern word representation learning.
5 code implementations • WS 2019 • Matthew Henderson, Paweł Budzianowski, Iñigo Casanueva, Sam Coope, Daniela Gerz, Girish Kumar, Nikola Mrkšić, Georgios Spithourakis, Pei-Hao Su, Ivan Vulić, Tsung-Hsien Wen
Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches.
1 code implementation • EMNLP 2018 • Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen
Our adversarial post-specialization method propagates the external lexical knowledge to the full distributional space.
no code implementations • CL 2019 • Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen
Linguistic typology aims to capture structural and semantic variation across the world's languages.
1 code implementation • 29 May 2018 • Nikola Mrkšić, Ivan Vulić
This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST).
1 code implementation • ACL 2018 • Marek Rei, Daniela Gerz, Ivan Vulić
Experiments show excellent performance on scoring graded lexical entailment, raising the state-of-the-art on the HyperLex dataset by approximately 25%.
no code implementations • ACL 2018 • Anders Søgaard, Sebastian Ruder, Ivan Vulić
Unsupervised machine translation (i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora) seems impossible; nevertheless, Lample et al. (2018) recently proposed a fully unsupervised machine translation (MT) model.
1 code implementation • NAACL 2018 • Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen
Word vector specialisation (also known as retrofitting) is a portable, light-weight approach to fine-tuning arbitrary distributional word vector spaces by injecting external knowledge from rich lexical resources such as WordNet.
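Retrofitting itself is compact enough to sketch; below is a minimal version of the classic Faruqui et al. update that specialisation methods build on (our simplification):

```python
# Sketch: retrofit word vectors toward lexicon neighbours while staying
# close to the original distributional vectors.
import numpy as np

def retrofit(vectors, lexicon, iters=10, alpha=1.0, beta=1.0):
    """vectors: dict word -> np.ndarray; lexicon: dict word -> list of neighbours."""
    new = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iters):
        for w, nbrs in lexicon.items():
            nbrs = [n for n in nbrs if n in new]
            if w not in new or not nbrs:
                continue
            # Closed-form coordinate update: weighted mean of the original
            # vector and the current neighbour vectors.
            new[w] = (alpha * vectors[w] + beta * sum(new[n] for n in nbrs)) \
                     / (alpha + beta * len(nbrs))
    return new
```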
1 code implementation • 2 May 2018 • Robert Litschko, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić
We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.
1 code implementation • 17 Oct 2017 • Ivan Vulić, Nikola Mrkšić
We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation.
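A sketch of the asymmetric scoring this transformation enables (one variant in the spirit of the paper; the exact weighting is a tunable choice): cosine distance plus a norm-difference term, so that a low-norm hyponym pointing at a high-norm hypernym scores as a plausible IS-A pair.

```python
# Sketch: asymmetric lexical entailment distance over LEAR-style vectors.
import numpy as np

def le_distance(x, y):
    """Lower means 'x IS-A y' is more plausible."""
    cos_dist = 1.0 - x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    norm_term = (np.linalg.norm(x) - np.linalg.norm(y)) / \
                (np.linalg.norm(x) + np.linalg.norm(y))  # negative if x is the hyponym
    return cos_dist + norm_term
```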
no code implementations • EMNLP 2017 • Ivan Vulić, Nikola Mrkšić, Anna Korhonen
Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines.
no code implementations • 15 Jun 2017 • Sebastian Ruder, Ivan Vulić, Anders Søgaard
Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages.
2 code implementations • 1 Jun 2017 • Nikola Mrkšić, Ivan Vulić, Diarmuid Ó Séaghdha, Ira Leviant, Roi Reichart, Milica Gašić, Anna Korhonen, Steve Young
We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources.
no code implementations • ACL 2017 • Ivan Vulić, Nikola Mrkšić, Roi Reichart, Diarmuid Ó Séaghdha, Steve Young, Anna Korhonen
Morphologically rich languages accentuate two properties of distributional vector space models: 1) the difficulty of inducing accurate representations for low-frequency word forms; and 2) insensitivity to distinct lexical relations that have similar distributional signatures.
no code implementations • SEMEVAL 2017 • Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Distributed representations of sentences have been developed recently to represent their meaning as real-valued vectors.
no code implementations • COLING 2016 • Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen
In recent years linguistic typology, which classifies the world's languages according to their functional and structural properties, has been widely used to support multilingual NLP.
no code implementations • CONLL 2017 • Ivan Vulić, Roy Schwartz, Ari Rappoport, Roi Reichart, Anna Korhonen
With our selected context configurations, we train on only 14% (A), 26.2% (V), and 33.6% (N) of all dependency-based contexts, resulting in a reduced training time.
no code implementations • CL 2017 • Ivan Vulić, Daniela Gerz, Douwe Kiela, Felix Hill, Anna Korhonen
We introduce HyperLex, a dataset and evaluation resource that quantifies the extent of semantic category membership, that is, the type-of relation (also known as the hyponymy-hypernymy or lexical entailment (LE) relation) between 2,616 concept pairs.
1 code implementation • EMNLP 2016 • Daniela Gerz, Ivan Vulić, Felix Hill, Roi Reichart, Anna Korhonen
Verbs play a critical role in the meaning of sentences, but these ubiquitous words have received little attention in recent distributional semantics research.
no code implementations • 24 Sep 2015 • Ivan Vulić, Marie-Francine Moens
We propose a new model for learning bilingual word representations from non-parallel document-aligned data.