no code implementations • EMNLP 2020 • Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
Massively multilingual transformers (MMTs) pretrained via language modeling (e.g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance.
no code implementations • ACL 2021 • Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš
Inspired by prior work on semantic specialization of static word embedding (WE) models, we show that it is possible to expose and enrich lexical knowledge from pretrained language models, that is, to specialize them to serve as effective and universal "decontextualized" word encoders even when fed input words "in isolation" (i.e., without any context).
no code implementations • EACL 2021 • Goran Glavaš, Ivan Vulić
Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).
no code implementations • SEMEVAL 2020 • Carlos Santos Armendariz, Matthew Purver, Senja Pollak, Nikola Ljubešić, Matej Ulčar, Ivan Vulić, Mohammad Taher Pilehvar
This paper presents the Graded Word Similarity in Context (GWSC) task which asked participants to predict the effects of context on human perception of similarity in English, Croatian, Slovene and Finnish.
no code implementations • SEMEVAL 2020 • Goran Glavaš, Ivan Vulić, Anna Korhonen, Simone Paolo Ponzetto
The shared task spans three dimensions: (1) monolingual vs. cross-lingual LE, (2) binary vs. graded LE, and (3) a set of 6 diverse languages (and 15 corresponding language pairs).
no code implementations • COLING 2020 • Goran Glavaš, Mladen Karan, Ivan Vulić
We present XHate-999, a multi-domain and multilingual evaluation data set for abusive language detection.
1 code implementation • COLING 2020 • Olga Majewska, Ivan Vulić, Diana McCarthy, Anna Korhonen
We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on verb semantics.
no code implementations • WS 2020 • Ivan Vulić, Anna Korhonen, Goran Glavaš
Work on projection-based induction of cross-lingual word embedding spaces (CLWEs) predominantly focuses on the improvement of the projection (i.e., mapping) mechanisms.
no code implementations • ACL 2020 • Mladen Karan, Ivan Vulić, Anna Korhonen, Goran Glavaš
Effective projection-based cross-lingual word embedding (CLWE) induction critically relies on the iterative self-learning procedure.
no code implementations • ACL 2020 • Goran Glavaš, Ivan Vulić
We present InstaMap, an instance-based method for learning projection-based cross-lingual word embeddings.
Tasks: Bilingual Lexicon Induction, Cross-Lingual Word Embeddings (+2)
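Several of the entries above concern projection-based CLWE induction, in which a linear map between two monolingual embedding spaces is learned from a seed dictionary of aligned word pairs. As a point of reference, here is a minimal sketch of the standard closed-form step (orthogonal Procrustes), assuming numpy; this is the common baseline these papers build on, not InstaMap itself, and the toy data is hypothetical:

```python
import numpy as np

def procrustes_map(X, Y):
    """Closed-form orthogonal Procrustes solution: find the orthogonal
    matrix W minimising ||X W^T - Y||_F, where row i of X (source
    language) and row i of Y (target language) are the embeddings of
    one translation pair from the seed dictionary."""
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt   # X @ W.T maps source vectors into the target space

# toy "dictionary": 5 aligned word pairs in a 4-dimensional space,
# related by a hidden rotation R (hypothetical data)
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))
R, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Y = X @ R.T
W = procrustes_map(X, Y)
assert np.allclose(X @ W.T, Y, atol=1e-8)   # the hidden rotation is recovered
```

Constraining W to be orthogonal preserves distances and angles in the source space, which is why this step is the usual workhorse of projection-based CLWE methods.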
no code implementations • LREC 2020 • Olga Majewska, Diana McCarthy, Jasper van den Bosch, Nikolaus Kriegeskorte, Ivan Vulić, Anna Korhonen
We present a novel methodology for fast bottom-up creation of large-scale semantic similarity resources to support development and evaluation of NLP systems.
no code implementations • CONLL 2019 • Qianchu Liu, Diana McCarthy, Ivan Vulić, Anna Korhonen
In this paper, we present a thorough investigation of methods that align pre-trained contextualized embeddings into a shared cross-lingual context-aware embedding space, providing strong reference benchmarks for future context-aware cross-lingual models.
no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen
Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints.
no code implementations • WS 2019 • Paweł Budzianowski, Ivan Vulić
Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data.
no code implementations • WS 2019 • Aishwarya Kamath, Jonas Pfeiffer, Edoardo Maria Ponti, Goran Glavaš, Ivan Vulić
Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words.
no code implementations • ACL 2019 • Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
Starting from HyperLex, the only available graded lexical entailment (GR-LE) dataset in English, we construct new monolingual GR-LE datasets for three other languages, and combine those to create a set of six cross-lingual GR-LE datasets termed CL-HYPERLEX.
no code implementations • ACL 2019 • Sebastian Ruder, Anders Søgaard, Ivan Vulić
In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations.
no code implementations • ACL 2019 • Željko Agić, Ivan Vulić
Viable cross-lingual transfer critically depends on the availability of parallel texts.
1 code implementation • ACL 2019 • Goran Glavaš, Ivan Vulić
Lexical entailment (LE; also known as hyponymy-hypernymy or is-a relation) is a core asymmetric lexical relation that supports tasks like taxonomy induction and text generation.
no code implementations • NAACL 2019 • Ehsan Shareghi, Daniela Gerz, Ivan Vulić, Anna Korhonen
In recent years, neural language models (LMs) have set state-of-the-art performance on several benchmark datasets.
no code implementations • NAACL 2019 • Geert Heyman, Bregt Verreet, Ivan Vulić, Marie-Francine Moens
We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space.
Tasks: Bilingual Lexicon Induction, Cross-Lingual Word Embeddings (+4)
no code implementations • EMNLP 2018 • Daniela Gerz, Ivan Vulić, Edoardo Maria Ponti, Roi Reichart, Anna Korhonen
A key challenge in cross-lingual NLP is developing general language-independent architectures that are equally applicable to any language.
no code implementations • WS 2018 • Ivan Vulić
Word vector space specialisation models offer a portable, light-weight approach to fine-tuning arbitrary distributional vector spaces to discern between synonymy and antonymy.
no code implementations • ACL 2018 • Edoardo Maria Ponti, Roi Reichart, Anna Korhonen, Ivan Vulić
The transfer or sharing of knowledge between languages is a potential solution to resource scarcity in NLP.
1 code implementation • ACL 2018 • Guy Rotman, Ivan Vulić, Roi Reichart
We present a deep neural network that leverages images to improve bilingual text embeddings.
no code implementations • ACL 2018 • Goran Glavaš, Ivan Vulić
The ER (explicit retrofitting) model allows us to learn a global specialization function and to specialize the vectors of words unobserved in the training data as well.
1 code implementation • ACL 2018 • Nikola Mrkšić, Ivan Vulić
This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST).
no code implementations • NAACL 2018 • Pei-Hao Su, Nikola Mrkšić, Iñigo Casanueva, Ivan Vulić
The main purpose of this tutorial is to encourage dialogue research in the NLP community by providing the research background, a survey of available resources, and key insights into applying state-of-the-art SDS methodology in industry-scale conversational AI systems.
1 code implementation • NAACL 2018 • Goran Glavaš, Ivan Vulić
We present a simple and effective feed-forward neural architecture for discriminating between lexico-semantic relations (synonymy, antonymy, hypernymy, and meronymy).
1 code implementation • NAACL 2018 • Ivan Vulić, Nikola Mrkšić
We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation.
Ranked #1 on Lexical Entailment on HyperLex
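The core of LEAR is an asymmetric distance: cosine similarity captures semantic relatedness, while vector norms (rescaled during specialisation so that hypernyms end up with larger norms) capture the direction of the IS-A relation. A simplified illustration of such a score, assuming numpy; `le_score` and the toy vectors are hypothetical, in the spirit of the method rather than its exact formulation:

```python
import numpy as np

def le_score(x, y):
    """Hypothetical asymmetric lexical-entailment score in the spirit
    of LEAR: the symmetric part is cosine distance, the asymmetric part
    compares vector norms (after LEAR-style specialisation, hyponyms
    have smaller norms than their hypernyms).  Lower score means
    stronger evidence that x IS-A y.  Not the paper's exact objective."""
    cos_dist = 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    norm_asym = (np.linalg.norm(x) - np.linalg.norm(y)) / (
        np.linalg.norm(x) + np.linalg.norm(y))
    return cos_dist + norm_asym

dog = np.array([1.0, 1.0])      # hyponym: small norm
animal = np.array([3.0, 3.0])   # hypernym: same direction, larger norm
print(le_score(dog, animal) < le_score(animal, dog))  # True: dog IS-A animal
```

Because the norm term flips sign when the arguments are swapped, the score distinguishes "dog IS-A animal" from the reverse even though cosine similarity alone is symmetric.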
no code implementations • TACL 2018 • Daniela Gerz, Ivan Vulić, Edoardo Ponti, Jason Naradowsky, Roi Reichart, Anna Korhonen
Neural architectures are prominent in the construction of language models (LMs).
no code implementations • EMNLP 2017 • Manaal Faruqui, Anders Søgaard, Ivan Vulić
With the increasing use of monolingual word vectors, there is a need for word vectors that can be used as efficiently across multiple languages as monolingually.
no code implementations • EACL 2017 • Ivan Vulić
We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space.
no code implementations • EACL 2017 • Ivan Vulić, Douwe Kiela, Anna Korhonen
Recent work on evaluating representation learning architectures in NLP has established a need for evaluation protocols based on subconscious cognitive measures rather than manually tailored intrinsic similarity and relatedness tasks.
no code implementations • EACL 2017 • Ivan Vulić, Nikola Mrkšić, Mohammad Taher Pilehvar
Specialising vector spaces to maximise their content with respect to one key property of vector space models (e.g., semantic similarity vs. relatedness or lexical entailment) while mitigating others has become an active and attractive research topic in representation learning.
no code implementations • EACL 2017 • Geert Heyman, Ivan Vulić, Marie-Francine Moens
We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology.
no code implementations • TACL 2017 • Nikola Mrkšić, Ivan Vulić, Diarmuid Ó Séaghdha, Ira Leviant, Roi Reichart, Milica Gašić, Anna Korhonen, Steve Young
We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources.
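As a rough illustration of the constraint-injection idea behind Attract-Repel, the toy sketch below pulls synonym pairs together and pushes antonym pairs apart on the unit sphere; `specialise` and the mini-lexicon are hypothetical, and the paper's actual margin-based objective with mini-batching and regularisation is not reproduced here:

```python
import numpy as np

def specialise(vectors, synonyms, antonyms, lr=0.1, epochs=50):
    """Toy attract-repel-style post-processing (illustrative only):
    pull synonym pairs together and push antonym pairs apart,
    re-normalising every vector to unit length after each epoch."""
    V = {w: v / np.linalg.norm(v) for w, v in vectors.items()}
    for _ in range(epochs):
        for a, b in synonyms:            # ATTRACT: raise cos(a, b)
            V[a] = V[a] + lr * V[b]
            V[b] = V[b] + lr * V[a]
        for a, b in antonyms:            # REPEL: lower cos(a, b)
            V[a] = V[a] - lr * V[b]
            V[b] = V[b] - lr * V[a]
        for w in V:                      # project back onto the unit sphere
            V[w] = V[w] / np.linalg.norm(V[w])
    return V

# hypothetical mini-lexicon with random "distributional" vectors
rng = np.random.default_rng(1)
vecs = {w: rng.standard_normal(10)
        for w in ("cheap", "inexpensive", "expensive")}
spec = specialise(vecs,
                  synonyms=[("cheap", "inexpensive")],
                  antonyms=[("cheap", "expensive")])
# after specialisation, cos(cheap, inexpensive) is high
# and cos(cheap, expensive) is low
```

The point the sketch makes is that specialisation operates directly on the vectors using pairwise constraints, leaving words that appear in no constraint untouched, which is exactly the limitation that motivates the post-specialisation work listed above.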
no code implementations • LREC 2014 • Kris Heylen, Stephen Bond, Dirk De Hertog, Ivan Vulić, Hendrik Kockaert
In this paper, we report on the TermWise project, a collaboration between terminologists, corpus linguists and computer scientists that aims to leverage big online translation data for terminological support to legal translators at the Belgian Federal Ministry of Justice.