Search Results for author: Ivan Vulić

Found 54 papers, 6 papers with code

From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers

no code implementations EMNLP 2020 Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

Massively multilingual transformers (MMTs) pretrained via language modeling (e.g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance.

Cross-Lingual Word Embeddings Dependency Parsing +4

LexFit: Lexical Fine-Tuning of Pretrained Language Models

no code implementations ACL 2021 Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

Inspired by prior work on semantic specialization of static word embedding (WE) models, we show that it is possible to expose and enrich lexical knowledge from the LMs, that is, to specialize them to serve as effective and universal “decontextualized” word encoders even when fed input words “in isolation” (i.e., without any context).

Cross-Lingual Transfer Pretrained Language Models

SemEval-2020 Task 2: Predicting Multilingual and Cross-Lingual (Graded) Lexical Entailment

no code implementations SEMEVAL 2020 Goran Glavaš, Ivan Vulić, Anna Korhonen, Simone Paolo Ponzetto

The shared task spans three dimensions: (1) monolingual vs. cross-lingual LE, (2) binary vs. graded LE, and (3) a set of 6 diverse languages (and 15 corresponding language pairs).

Lexical Entailment Natural Language Inference

SemEval-2020 Task 3: Graded Word Similarity in Context

no code implementations SEMEVAL 2020 Carlos Santos Armendariz, Matthew Purver, Senja Pollak, Nikola Ljubešić, Matej Ulčar, Ivan Vulić, Mohammad Taher Pilehvar

This paper presents the Graded Word Similarity in Context (GWSC) task which asked participants to predict the effects of context on human perception of similarity in English, Croatian, Slovene and Finnish.

Translation Word Similarity

Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis

1 code implementation COLING 2020 Olga Majewska, Ivan Vulić, Diana McCarthy, Anna Korhonen

We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on verb semantics.

Multilingual NLP

Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces

no code implementations WS 2020 Ivan Vulić, Anna Korhonen, Goran Glavaš

Work on projection-based induction of cross-lingual word embedding spaces (CLWEs) predominantly focuses on the improvement of the projection (i.e., mapping) mechanisms.

Bilingual Lexicon Induction

Hello, It's GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

no code implementations WS 2019 Paweł Budzianowski, Ivan Vulić

Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data.

Decision Making Language Modelling +4

Investigating Cross-Lingual Alignment Methods for Contextualized Embeddings with Token-Level Evaluation

no code implementations CONLL 2019 Qianchu Liu, Diana McCarthy, Ivan Vulić, Anna Korhonen

In this paper, we present a thorough investigation on methods that align pre-trained contextualized embeddings into shared cross-lingual context-aware embedding space, providing strong reference benchmarks for future context-aware crosslingual models.

Word Similarity

Cross-lingual Semantic Specialization via Lexical Relation Induction

no code implementations IJCNLP 2019 Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen

Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints.

Lexical Simplification Semantic Textual Similarity +2

Specializing Distributional Vectors of All Words for Lexical Entailment

no code implementations WS 2019 Aishwarya Kamath, Jonas Pfeiffer, Edoardo Maria Ponti, Goran Glavaš, Ivan Vulić

Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words.

Cross-Lingual Transfer Lexical Entailment +2

Unsupervised Cross-Lingual Representation Learning

no code implementations ACL 2019 Sebastian Ruder, Anders Søgaard, Ivan Vulić

In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations.

Representation Learning Structured Prediction

Multilingual and Cross-Lingual Graded Lexical Entailment

no code implementations ACL 2019 Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

Starting from HyperLex, the only available GR-LE dataset in English, we construct new monolingual GR-LE datasets for three other languages, and combine those to create a set of six cross-lingual GR-LE datasets termed CL-HYPERLEX.

Lexical Entailment

Generalized Tuning of Distributional Word Vectors for Monolingual and Cross-Lingual Lexical Entailment

1 code implementation ACL 2019 Goran Glavaš, Ivan Vulić

Lexical entailment (LE; also known as hyponymy-hypernymy or is-a relation) is a core asymmetric lexical relation that supports tasks like taxonomy induction and text generation.

Lexical Entailment Text Generation

Learning Unsupervised Multilingual Word Embeddings with Incremental Multilingual Hubs

no code implementations NAACL 2019 Geert Heyman, Bregt Verreet, Ivan Vulić, Marie-Francine Moens

We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space.

Bilingual Lexicon Induction Cross-Lingual Word Embeddings +4

Fully Statistical Neural Belief Tracking

1 code implementation ACL 2018 Nikola Mrkšić, Ivan Vulić

This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST).

Dialogue Management Dialogue State Tracking +2

Injecting Lexical Contrast into Word Vectors by Guiding Vector Space Specialisation

no code implementations WS 2018 Ivan Vulić

Word vector space specialisation models offer a portable, light-weight approach to fine-tuning arbitrary distributional vector spaces to discern between synonymy and antonymy.

Dialogue State Tracking Representation Learning +4

Explicit Retrofitting of Distributional Word Vectors

no code implementations ACL 2018 Goran Glavaš, Ivan Vulić

The ER model allows us to learn a global specialization function and specialize the vectors of words unobserved in the training data as well.

Lexical Simplification Semantic Textual Similarity +2

Discriminating between Lexico-Semantic Relations with the Specialization Tensor Model

1 code implementation NAACL 2018 Goran Glavaš, Ivan Vulić

We present a simple and effective feed-forward neural architecture for discriminating between lexico-semantic relations (synonymy, antonymy, hypernymy, and meronymy).

Natural Language Inference Paraphrase Generation +2

Deep Learning for Conversational AI

no code implementations NAACL 2018 Pei-Hao Su, Nikola Mrkšić, Iñigo Casanueva, Ivan Vulić

The main purpose of this tutorial is to encourage dialogue research in the NLP community by providing the research background, a survey of available resources, and giving key insights to application of state-of-the-art SDS methodology into industry-scale conversational AI systems.

Decision Making Dialogue Management +4

Specialising Word Vectors for Lexical Entailment

1 code implementation NAACL 2018 Ivan Vulić, Nikola Mrkšić

We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation.

Dialogue State Tracking Lexical Entailment +6

Cross-Lingual Word Representations: Induction and Evaluation

no code implementations EMNLP 2017 Manaal Faruqui, Anders Søgaard, Ivan Vulić

With the increasing use of monolingual word vectors, there is a need for word vectors that can be used as efficiently across multiple languages as monolingually.

Multilingual Word Embeddings

Word Vector Space Specialisation

no code implementations EACL 2017 Ivan Vulić, Nikola Mrkšić, Mohammad Taher Pilehvar

Specialising vector spaces to maximise their content with respect to one key property of vector space models (e.g., semantic similarity vs. relatedness or lexical entailment) while mitigating others has become an active and attractive research topic in representation learning.

Lexical Entailment Representation Learning +2

Cross-Lingual Syntactically Informed Distributed Word Representations

no code implementations EACL 2017 Ivan Vulić

We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space.

Bilingual Lexicon Induction Entity Linking +8

Evaluation by Association: A Systematic Study of Quantitative Word Association Evaluation

no code implementations EACL 2017 Ivan Vulić, Douwe Kiela, Anna Korhonen

Recent work on evaluating representation learning architectures in NLP has established a need for evaluation protocols based on subconscious cognitive measures rather than manually tailored intrinsic similarity and relatedness tasks.

Information Retrieval Representation Learning +1

Bilingual Lexicon Induction by Learning to Combine Word-Level and Character-Level Representations

no code implementations EACL 2017 Geert Heyman, Ivan Vulić, Marie-Francine Moens

We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology.

Bilingual Lexicon Induction Classification +10

TermWise: A CAT-tool with Context-Sensitive Terminological Support.

no code implementations LREC 2014 Kris Heylen, Stephen Bond, Dirk De Hertog, Ivan Vulić, Hendrik Kockaert

In this paper, we report on the TermWise project, a cooperation of terminologists, corpus linguists and computer scientists, that aims to leverage big online translation data for terminological support to legal translators at the Belgian Federal Ministry of Justice.

Translation
