Search Results for author: Ivan Vulić

Found 81 papers, 44 papers with code

Semantic Data Set Construction from Human Clustering and Spatial Arrangement

no code implementations CL (ACL) 2021 Olga Majewska, Diana McCarthy, Jasper J. F. van den Bosch, Nikolaus Kriegeskorte, Ivan Vulić, Anna Korhonen

We demonstrate how the resultant data set can be used for fine-grained analyses and evaluation of representation learning models on the intrinsic tasks of semantic clustering and semantic similarity.

Representation Learning Semantic Similarity +2

Natural Language Processing for Multilingual Task-Oriented Dialogue

no code implementations ACL 2022 Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo Ponti, Ivan Vulić

In this tutorial, we will thus discuss and demonstrate the importance of (building) multilingual ToD systems, and then provide a systematic overview of current research gaps, challenges and initiatives related to multilingual ToD systems, with a particular focus on their connections to current research and challenges in multilingual and low-resource NLP.

Square One Bias in NLP: Towards a Multi-Dimensional Exploration of the Research Manifold

no code implementations Findings (ACL) 2022 Sebastian Ruder, Ivan Vulić, Anders Søgaard

Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on.

Fairness

Data Augmentation and Learned Layer Aggregation for Improved Multilingual Language Understanding in Dialogue

no code implementations Findings (ACL) 2022 Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen

Scaling dialogue systems to a multitude of domains, tasks and languages relies on costly and time-consuming data annotation for different domain-task-language configurations.

Data Augmentation Natural Language Understanding

MAD-G: Multilingual Adapter Generation for Efficient Cross-Lingual Transfer

no code implementations Findings (EMNLP) 2021 Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen

While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.

Dependency Parsing Named Entity Recognition +3

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Crosslingual Lexical Semantic Similarity

no code implementations CL (ACL) 2020 Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).

Semantic Similarity Semantic Textual Similarity +1
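
Benchmarks like Multi-SimLex are typically scored with Spearman's rank correlation between model similarity scores and human ratings. A minimal sketch (ignoring tied ranks, with made-up numbers for both score lists):

```python
import numpy as np

def spearman(a, b):
    """Spearman correlation as Pearson correlation of ranks (no tie handling)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

# model similarity scores vs. hypothetical human ratings for five word pairs
model_scores = [0.9, 0.1, 0.5, 0.7, 0.3]
human_ratings = [5.8, 0.4, 2.9, 4.1, 1.7]
print(spearman(model_scores, human_ratings))  # 1.0: identical ranking
```

Real evaluation code (e.g., SciPy's `spearmanr`) also handles tied ranks, which this toy version does not.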

Multi2WOZ: A Robust Multilingual Dataset and Conversational Pretraining for Task-Oriented Dialog

no code implementations20 May 2022 Chia-Chien Hung, Anne Lauscher, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.

Cross-Lingual Transfer Pretrained Language Models

Exposing Cross-Lingual Lexical Knowledge from Multilingual Sentence Encoders

no code implementations30 Apr 2022 Ivan Vulić, Goran Glavaš, Fangyu Liu, Nigel Collier, Edoardo Maria Ponti, Anna Korhonen

Pretrained multilingual language models (LMs) can be successfully transformed into multilingual sentence encoders (SEs; e.g., LaBSE, xMPNET) via additional fine-tuning or model distillation on parallel data.

Contrastive Learning Cross-Lingual Entity Linking +5

NLU++: A Multi-Label, Slot-Rich, Generalisable Dataset for Natural Language Understanding in Task-Oriented Dialogue

1 code implementation27 Apr 2022 Iñigo Casanueva, Ivan Vulić, Georgios P. Spithourakis, Paweł Budzianowski

2) The ontology is divided into domain-specific and generic (i.e., domain-universal) intent modules that overlap across domains, promoting cross-domain reusability of annotated examples.

Natural Language Understanding

Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval

1 code implementation5 Apr 2022 Robert Litschko, Ivan Vulić, Goran Glavaš

Current approaches therefore typically transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders: they fine-tune all the parameters of a pretrained massively multilingual Transformer (MMT, e.g., multilingual BERT) on English relevance judgments and then deploy it in the target language.

Cross-Lingual Transfer Language Modelling +1

Improved and Efficient Conversational Slot Labeling through Question Answering

no code implementations5 Apr 2022 Gabor Fuisz, Ivan Vulić, Samuel Gibbons, Inigo Casanueva, Paweł Budzianowski

In particular, we focus on modeling and studying slot labeling (SL), a crucial component of NLU for dialog, through the lens of QA, aiming to improve both its performance and efficiency and to make it more effective and resilient when working with limited task data.

Natural Language Understanding Pretrained Language Models +1

Improving Word Translation via Two-Stage Contrastive Learning

1 code implementation ACL 2022 Yaoyiran Li, Fangyu Liu, Nigel Collier, Anna Korhonen, Ivan Vulić

At Stage C1, we propose to refine standard cross-lingual linear maps between static word embeddings (WEs) via a contrastive learning objective; we also show how to integrate it into the self-learning procedure for even more refined cross-lingual maps.

Bilingual Lexicon Induction Contrastive Learning +7
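
The cross-lingual linear maps in question are commonly initialised by solving the orthogonal Procrustes problem over a seed translation dictionary. A minimal sketch of that initialisation step (not the paper's contrastive refinement), with synthetic embeddings:

```python
import numpy as np

def procrustes_map(X, Y):
    """Orthogonal map W minimising ||X @ W.T - Y||_F over seed translation pairs.

    X, Y: (n, d) arrays of source/target embeddings for n dictionary entries.
    """
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
W_true, _ = np.linalg.qr(rng.normal(size=(50, 50)))  # a known orthogonal map
Y = X @ W_true.T                                     # synthetic "target" space
W = procrustes_map(X, Y)
print(np.allclose(X @ W.T, Y))  # True: the map is recovered exactly
```

With noisy, real bilingual dictionaries the recovered map is only approximate, which is exactly what refinement stages (contrastive or self-learning) aim to improve.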

On Cross-Lingual Retrieval with Multilingual Text Encoders

1 code implementation21 Dec 2021 Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

In this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a number of diverse language pairs.

Re-Ranking Zero-Shot Cross-Lingual Transfer

Composable Sparse Fine-Tuning for Cross-Lingual Transfer

1 code implementation ACL 2022 Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić

Based on an in-depth analysis, we additionally find that sparsity is crucial to prevent both 1) interference between the fine-tunings to be composed and 2) overfitting.

Language Modelling Masked Language Modeling +2
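
Schematically, each sparse fine-tuning can be stored as a top-k diff over the base parameters, and two such diffs (say, one for a language and one for a task) composed by addition. A toy sketch with hypothetical parameter vectors:

```python
import numpy as np

def sparse_diff(base, tuned, k):
    """Keep only the k largest-magnitude parameter changes; zero the rest."""
    diff = tuned - base
    flat = diff.ravel()
    flat[np.argsort(np.abs(flat))[:-k]] = 0.0  # zero all but the top-k entries
    return diff

base = np.zeros(10)
lang = sparse_diff(base, np.array([1.0, 0, 0, 2, 0, 0, 0, 0, 0, 0]), k=2)
task = sparse_diff(base, np.array([0.0, 0, 3, 0, 0, 0, 0, 0, 0, 4]), k=2)
composed = base + lang + task  # compose two fine-tunings on one base model
print(composed)  # [1. 0. 3. 2. 0. 0. 0. 0. 0. 4.]
```

When the two masks do not overlap, as here, composition is interference-free; the abstract's point is that sparsity makes such overlap (and hence interference) rare.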

MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models

1 code implementation CoNLL (EMNLP) 2021 Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen, Ivan Vulić

Recent work indicated that pretrained language models (PLMs) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self-supervised techniques.

Contextualised Word Representations Contrastive Learning +1

Towards Zero-shot Language Modeling

no code implementations IJCNLP 2019 Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen

Motivated by this question, we aim at constructing an informative prior over neural weights, in order to adapt quickly to held-out languages in the task of character-level language modeling.

Language Modelling

Modelling Latent Translations for Cross-Lingual Transfer

1 code implementation23 Jul 2021 Edoardo Maria Ponti, Julia Kreutzer, Ivan Vulić, Siva Reddy

To remedy this, we propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model, by treating the intermediate translations as a latent random variable.

Cross-Lingual Transfer Few-Shot Learning +4
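
Treating the intermediate translation as a latent variable means the final prediction marginalises over candidate translations, p(y|x) = Σ_t p(y|t) p(t|x). A toy numeric sketch (all probabilities made up for illustration):

```python
# candidate translations of a source sentence x, with translation probabilities p(t|x)
translations = [("t1", 0.6), ("t2", 0.3), ("t3", 0.1)]

# hypothetical classifier confidences p(y = positive | t) for each candidate
p_label_given_t = {"t1": 0.9, "t2": 0.7, "t3": 0.2}

# marginalise the latent translation out: p(y|x) = sum_t p(y|t) * p(t|x)
p_label = sum(p_t * p_label_given_t[t] for t, p_t in translations)
print(round(p_label, 2))  # 0.77
```

In the actual model the sum over all translations is intractable, so it is approximated by sampling from the translator; the toy above just enumerates three candidates.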

Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking

1 code implementation ACL 2021 Fangyu Liu, Ivan Vulić, Anna Korhonen, Nigel Collier

To this end, we propose and evaluate a series of cross-lingual transfer methods for the XL-BEL task, and demonstrate that general-domain bitext helps propagate the available English knowledge to languages with little to no in-domain data.

Cross-Lingual Transfer Entity Linking +1

Crossing the Conversational Chasm: A Primer on Natural Language Processing for Multilingual Task-Oriented Dialogue Systems

no code implementations17 Apr 2021 Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo M. Ponti, Anna Korhonen, Ivan Vulić

We find that the most critical factor preventing the creation of truly multilingual ToD systems is the lack of datasets in most languages for both training and evaluation.

Cross-Lingual Transfer Machine Translation +2

AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples

1 code implementation EMNLP 2021 Qianchu Liu, Edoardo M. Ponti, Diana McCarthy, Ivan Vulić, Anna Korhonen

In order to address these gaps, we present AM2iCo (Adversarial and Multilingual Meaning in Context), a wide-coverage cross-lingual and multilingual evaluation set; it aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts for 14 language pairs.

Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval

1 code implementation22 Mar 2021 Gregor Geigle, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, Iryna Gurevych

Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.

Cross-Modal Retrieval

Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval

1 code implementation21 Jan 2021 Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

Therefore, in this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs.

Cross-Lingual Word Embeddings Word Embeddings

A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters

no code implementations ACL 2021 Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze

Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT.

Few-Shot Learning

Verb Knowledge Injection for Multilingual Event Processing

no code implementations ACL 2021 Olga Majewska, Ivan Vulić, Goran Glavaš, Edoardo M. Ponti, Anna Korhonen

We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers in event extraction tasks -- downstream tasks for which accurate verb processing is paramount.

Event Extraction Language Modelling

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts

2 code implementations EMNLP 2021 Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder

The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.

Cross-Lingual Transfer

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

1 code implementation ACL 2021 Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych

In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.

Pretrained Multilingual Language Models

Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer

no code implementations11 Dec 2020 Marko Vidoni, Ivan Vulić, Goran Glavaš

Adapter modules, additional trainable parameters that enable efficient fine-tuning of pretrained transformers, have recently been used for language specialization of multilingual transformers, improving downstream zero-shot cross-lingual transfer.

NER POS +1
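
For illustration, a bottleneck adapter is just a small down-projection/up-projection pair inserted into a Transformer layer with a residual connection, so only the adapter weights need training. A minimal NumPy sketch (untrained, random weights):

```python
import numpy as np

class Adapter:
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""

    def __init__(self, d_model, d_bottleneck, rng):
        self.W_down = rng.normal(0.0, 0.02, (d_model, d_bottleneck))
        self.W_up = rng.normal(0.0, 0.02, (d_bottleneck, d_model))

    def __call__(self, h):
        z = np.maximum(h @ self.W_down, 0.0)  # ReLU in the bottleneck
        return h + z @ self.W_up              # residual connection

rng = np.random.default_rng(0)
adapter = Adapter(d_model=16, d_bottleneck=4, rng=rng)
h = rng.normal(size=(2, 16))     # two token representations
print(adapter(h).shape)          # (2, 16): same shape, so it slots into any layer
```

Because input and output shapes match, adapters can be stacked or swapped per language/task, which is what makes them attractive for cross-lingual specialization.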

Emergent Communication Pretraining for Few-Shot Machine Translation

1 code implementation COLING 2020 Yaoyiran Li, Edoardo M. Ponti, Ivan Vulić, Anna Korhonen

On the other hand, this also provides an extrinsic evaluation protocol to probe the properties of emergent languages ex vitro.

Machine Translation Transfer Learning +1

ConVEx: Data-Efficient and Few-Shot Slot Labeling

no code implementations NAACL 2021 Matthew Henderson, Ivan Vulić

We propose ConVEx (Conversational Value Extractor), an efficient pretraining and fine-tuning neural approach for slot-labeling dialog tasks.

Language Modelling

Probing Pretrained Language Models for Lexical Semantics

no code implementations EMNLP 2020 Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen

The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture.

Pretrained Language Models

Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation

1 code implementation15 Aug 2020 Goran Glavaš, Ivan Vulić

Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).

Language Modelling Natural Language Understanding

AdapterHub: A Framework for Adapting Transformers

4 code implementations EMNLP 2020 Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych

We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.

Multidirectional Associative Optimization of Function-Specific Word Representations

1 code implementation ACL 2020 Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart, Anna Korhonen

We present a neural framework for learning associations between interrelated groups of words such as the ones found in Subject-Verb-Object (SVO) structures.

From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers

no code implementations1 May 2020 Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance.

Cross-Lingual Word Embeddings Dependency Parsing +5

XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning

1 code implementation EMNLP 2020 Edoardo Maria Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen

In order to simulate human language capacity, natural language processing systems must be able to reason about the dynamics of everyday situations, including their possible causes and effects.

 Ranked #1 on Cross-Lingual Transfer on XCOPA (using extra training data)

Cross-Lingual Transfer Translation

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer

3 code implementations EMNLP 2020 Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder

The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.

Ranked #2 on Cross-Lingual Transfer on XCOPA (using extra training data)

Cross-Lingual Transfer Named Entity Recognition +1

Towards Instance-Level Parser Selection for Cross-Lingual Transfer of Dependency Parsers

no code implementations COLING 2020 Robert Litschko, Ivan Vulić, Željko Agić, Goran Glavaš

Current methods of cross-lingual parser transfer focus on predicting the best parser for a low-resource target language globally, that is, "at treebank level".

Cross-Lingual Transfer POS

Are All Good Word Vector Spaces Isomorphic?

1 code implementation EMNLP 2020 Ivan Vulić, Sebastian Ruder, Anders Søgaard

Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic.

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity

no code implementations10 Mar 2020 Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).

Cross-Lingual Word Embeddings Semantic Similarity +2

Efficient Intent Detection with Dual Sentence Encoders

4 code implementations WS 2020 Iñigo Casanueva, Tadas Temčinas, Daniela Gerz, Matthew Henderson, Ivan Vulić

Building conversational systems in new domains and with added functionality requires resource-efficient models that work under low-data regimes (i.e., in few-shot setups).

Intent Detection
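
On top of fixed sentence encodings, even a nearest-centroid classifier makes a workable few-shot intent detector. A toy sketch with hypothetical 2-D "encodings" and intent names:

```python
import numpy as np

def nearest_centroid(train_vecs, train_labels, query_vecs):
    """Assign each query to the intent whose mean training vector is most cosine-similar."""
    intents = sorted(set(train_labels))
    cents = np.stack([train_vecs[np.array(train_labels) == c].mean(0) for c in intents])
    cents = cents / np.linalg.norm(cents, axis=1, keepdims=True)
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    return [intents[i] for i in (q @ cents.T).argmax(1)]

train = np.array([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.9]])
labels = ["book_flight", "book_flight", "cancel", "cancel"]
queries = np.array([[0.8, 0.2], [0.05, 1.0]])
print(nearest_centroid(train, labels, queries))  # ['book_flight', 'cancel']
```

The paper trains a proper classifier head on top of dual sentence encoders; the point of the sketch is only that fixed encodings carry most of the signal in low-data regimes.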

The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures

no code implementations EMNLP 2020 Haim Dubossarsky, Ivan Vulić, Roi Reichart, Anna Korhonen

Performance in cross-lingual NLP tasks is impacted by the (dis)similarity of languages at hand: e.g., previous work has suggested there is a connection between the expected success of bilingual lexicon induction (BLI) and the assumption of (approximate) isomorphism between monolingual embedding spaces.

Bilingual Lexicon Induction POS
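
A simple instance of a spectral similarity measure compares the singular-value spectra of two monolingual embedding matrices; the sketch below is a simplified proxy for illustration, not the paper's exact measures:

```python
import numpy as np

def spectral_distance(X, Y, k=10):
    """L1 distance between normalised top-k singular-value spectra."""
    sx = np.linalg.svd(X - X.mean(0), compute_uv=False)[:k]
    sy = np.linalg.svd(Y - Y.mean(0), compute_uv=False)[:k]
    return float(np.abs(sx / sx.sum() - sy / sy.sum()).sum())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
stretch = np.diag(rng.uniform(0.5, 2.0, 50))  # distort one "language's" space
print(spectral_distance(X, X))                # 0.0: identical spectra
print(spectral_distance(X, X @ stretch) > 0)  # True: distortion changes the spectrum
```

The intuition carries over: the more two embedding spaces' spectra diverge, the harder cross-lingual alignment is expected to be.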

A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces

3 code implementations13 Sep 2019 Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić

Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.

Word Embeddings

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

1 code implementation COLING 2020 Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

In this work, we complement such distributional knowledge with external lexical knowledge, that is, we integrate the discrete knowledge on word-level semantic similarity into pretraining.

Language Modelling Lexical Simplification +6

Do We Really Need Fully Unsupervised Cross-Lingual Embeddings?

1 code implementation IJCNLP 2019 Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen

A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) show that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs).

Bilingual Lexicon Induction Self-Learning
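
Given (imperfectly) aligned spaces, BLI itself reduces to nearest-neighbour retrieval under cosine similarity. A toy sketch with made-up 2-D embeddings and a hypothetical English-German pair:

```python
import numpy as np

def induce_lexicon(src_vecs, src_words, tgt_vecs, tgt_words):
    """Map each source word to its cosine nearest neighbour in the target space."""
    s = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    t = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    nn = (s @ t.T).argmax(axis=1)
    return {w: tgt_words[i] for w, i in zip(src_words, nn)}

src = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt = np.array([[0.0, 0.9], [0.9, 0.1]])
print(induce_lexicon(src, ["dog", "house"], tgt, ["Haus", "Hund"]))
# {'dog': 'Hund', 'house': 'Haus'}
```

The abstract's failure mode corresponds to the alignment step producing spaces where these nearest neighbours are essentially random; plain cosine retrieval is also known to suffer from hubness, which refinements like CSLS address.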

Hello, It's GPT-2 -- How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

no code implementations12 Jul 2019 Paweł Budzianowski, Ivan Vulić

Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data.

Decision Making Language Modelling +4

Training Neural Response Selection for Task-Oriented Dialogue Systems

1 code implementation ACL 2019 Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su

Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks.

Chatbot Language Modelling +1

A Systematic Study of Leveraging Subword Information for Learning Word Representations

1 code implementation NAACL 2019 Yi Zhu, Ivan Vulić, Anna Korhonen

The use of subword-level information (e.g., characters, character n-grams, morphemes) has become ubiquitous in modern word representation learning.

Dependency Parsing Entity Typing +2

Fully Statistical Neural Belief Tracking

1 code implementation29 May 2018 Nikola Mrkšić, Ivan Vulić

This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST).

Dialogue State Tracking

Scoring Lexical Entailment with a Supervised Directional Similarity Network

1 code implementation ACL 2018 Marek Rei, Daniela Gerz, Ivan Vulić

Experiments show excellent performance on scoring graded lexical entailment, raising the state-of-the-art on the HyperLex dataset by approximately 25%.

Lexical Entailment Word Embeddings

On the Limitations of Unsupervised Bilingual Dictionary Induction

no code implementations ACL 2018 Anders Søgaard, Sebastian Ruder, Ivan Vulić

Unsupervised machine translation (i.e., translation without any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora) seems impossible, but Lample et al. (2018) recently proposed a fully unsupervised machine translation (MT) model.

Graph Similarity Translation +1

Post-Specialisation: Retrofitting Vectors of Words Unseen in Lexical Resources

1 code implementation NAACL 2018 Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen

Word vector specialisation (also known as retrofitting) is a portable, light-weight approach to fine-tuning arbitrary distributional word vector spaces by injecting external knowledge from rich lexical resources such as WordNet.

Dialogue State Tracking Text Simplification +1

Unsupervised Cross-Lingual Information Retrieval using Monolingual Data Only

1 code implementation2 May 2018 Robert Litschko, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić

We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.

Information Retrieval

Specialising Word Vectors for Lexical Entailment

1 code implementation17 Oct 2017 Ivan Vulić, Nikola Mrkšić

We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation.

Lexical Entailment Semantic Similarity +1
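
The attract-repel idea behind such post-processing can be caricatured as iteratively pulling related word pairs together and pushing contrasting pairs apart. A schematic single update step (not LEAR's actual asymmetric objective, and with made-up vectors):

```python
import numpy as np

def attract_repel_step(vecs, attract, repel, lr=0.1):
    """One gradient-like update: shrink ATTRACT distances, grow REPEL distances."""
    v = vecs.copy()
    for i, j in attract:
        d = v[j] - v[i]
        v[i] += lr * d   # move the pair toward each other
        v[j] -= lr * d
    for i, j in repel:
        d = v[j] - v[i]
        v[i] -= lr * d   # move the pair away from each other
        v[j] += lr * d
    return v

vecs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
out = attract_repel_step(vecs, attract=[(0, 1)], repel=[(0, 2)])
print(np.linalg.norm(out[1] - out[0]) < 1.0)  # True: attract pair moved closer
print(np.linalg.norm(out[2] - out[0]) > 1.0)  # True: repel pair moved apart
```

LEAR additionally rescales vector norms so that hypernyms end up with larger norms than hyponyms, encoding the asymmetry of entailment, which this symmetric sketch omits.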

Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

no code implementations EMNLP 2017 Ivan Vulić, Nikola Mrkšić, Anna Korhonen

Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines.

Cross-Lingual Transfer Feature Engineering +3

A Survey Of Cross-lingual Word Embedding Models

no code implementations15 Jun 2017 Sebastian Ruder, Ivan Vulić, Anders Søgaard

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages.

Cross-Lingual Transfer Cross-Lingual Word Embeddings +1

Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules

no code implementations ACL 2017 Ivan Vulić, Nikola Mrkšić, Roi Reichart, Diarmuid Ó Séaghdha, Steve Young, Anna Korhonen

Morphologically rich languages accentuate two properties of distributional vector space models: 1) the difficulty of inducing accurate representations for low-frequency word forms; and 2) insensitivity to distinct lexical relations that have similar distributional signatures.

Dialogue State Tracking

Decoding Sentiment from Distributed Representations of Sentences

no code implementations SEMEVAL 2017 Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen

Distributed representations of sentences have been developed recently to represent their meaning as real-valued vectors.

Survey on the Use of Typological Information in Natural Language Processing

no code implementations COLING 2016 Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen

In recent years linguistic typology, which classifies the world's languages according to their functional and structural properties, has been widely used to support multilingual NLP.

Multilingual NLP

Automatic Selection of Context Configurations for Improved Class-Specific Word Representations

no code implementations CONLL 2017 Ivan Vulić, Roy Schwartz, Ari Rappoport, Roi Reichart, Anna Korhonen

With our selected context configurations, we train on only 14% (A), 26.2% (V), and 33.6% (N) of all dependency-based contexts, resulting in a reduced training time.

Word Similarity

HyperLex: A Large-Scale Evaluation of Graded Lexical Entailment

no code implementations CL 2017 Ivan Vulić, Daniela Gerz, Douwe Kiela, Felix Hill, Anna Korhonen

We introduce HyperLex, a dataset and evaluation resource that quantifies the extent of semantic category membership, that is, the type-of relation, also known as the hyponymy-hypernymy or lexical entailment (LE) relation, between 2,616 concept pairs.

Lexical Entailment Representation Learning

SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity

1 code implementation EMNLP 2016 Daniela Gerz, Ivan Vulić, Felix Hill, Roi Reichart, Anna Korhonen

Verbs play a critical role in the meaning of sentences, but these ubiquitous words have received little attention in recent distributional semantics research.

Representation Learning

Bilingual Distributed Word Representations from Document-Aligned Comparable Data

no code implementations24 Sep 2015 Ivan Vulić, Marie-Francine Moens

We propose a new model for learning bilingual word representations from non-parallel document-aligned data.

Translation Word Embeddings
