Search Results for author: Nikolay Arefyev

Found 27 papers, 7 papers with code

DeepMistake at LSCDiscovery: Can a Multilingual Word-in-Context Model Replace Human Annotators?

no code implementations • LChange (ACL) 2022 • Daniil Homskiy, Nikolay Arefyev

In this paper we describe our solution of the LSCDiscovery shared task on Lexical Semantic Change Discovery (LSCD) in Spanish.

Change Detection

NB-MLM: Efficient Domain Adaptation of Masked Language Models for Sentiment Analysis

1 code implementation • EMNLP 2021 • Nikolay Arefyev, Dmitrii Kharchev, Artem Shelmanov

While Masked Language Models (MLM) are pre-trained on massive datasets, the additional training with the MLM objective on domain or task-specific data before fine-tuning for the final task is known to improve the final performance.

Domain Adaptation Sentiment Analysis

BOS at LSCDiscovery: Lexical Substitution for Interpretable Lexical Semantic Change Detection

no code implementations • LChange (ACL) 2022 • Artem Kudisov, Nikolay Arefyev

We propose a solution for the LSCDiscovery shared task on Lexical Semantic Change Detection in Spanish.

Change Detection

GlossReader at LSCDiscovery: Train to Select a Proper Gloss in English – Discover Lexical Semantic Change in Spanish

no code implementations • LChange (ACL) 2022 • Maxim Rachinskiy, Nikolay Arefyev

In order to conclude if there are any differences between senses of a particular word in two corpora, a human annotator or a system shall analyze many examples containing this word from both corpora.

Change Detection XLM-R

The LSCD Benchmark: a Testbed for Diachronic Word Meaning Tasks

no code implementations • 29 Mar 2024 • Dominik Schlechtweg, Shafqat Mumtaz Virk, Nikolay Arefyev

The repository reflects the task's modularity by allowing model evaluation for WiC, WSI and LSCD.

Change Detection LEMMA +2

Enriching Word Usage Graphs with Cluster Definitions

1 code implementation • 26 Mar 2024 • Mariia Fedorova, Andrey Kutuzov, Nikolay Arefyev, Dominik Schlechtweg

We present a dataset of word usage graphs (WUGs), where the existing WUGs for multiple languages are enriched with cluster labels functioning as sense definitions.

A New Massive Multilingual Dataset for High-Performance Language Technologies

no code implementations • 20 Mar 2024 • Ona de Gibert, Graeme Nail, Nikolay Arefyev, Marta Bañón, Jelmer Van der Linde, Shaoxiong Ji, Jaume Zaragoza-Bernabeu, Mikko Aulamo, Gema Ramírez-Sánchez, Andrey Kutuzov, Sampo Pyysalo, Stephan Oepen, Jörg Tiedemann

We present the HPLT (High Performance Language Technologies) language resources, a new massive multilingual dataset including both monolingual and bilingual corpora extracted from CommonCrawl and previously unused web crawls from the Internet Archive.

Language Modelling Machine Translation +2

BOS at LSCDiscovery: Lexical Substitution for Interpretable Lexical Semantic Change Detection

no code implementations • 7 Jun 2022 • Artem Kudisov, Nikolay Arefyev

We propose a solution for the LSCDiscovery shared task on Lexical Semantic Change Detection in Spanish.

Change Detection

Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

1 code implementation • COLING 2020 • Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

Lexical substitution, i.e. generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word sense induction and disambiguation, lexical relation extraction, data augmentation, etc.

Data Augmentation Relation Extraction +1

The Document Vectors Using Cosine Similarity Revisited

1 code implementation • insights (ACL) 2022 • Zhang Bingyu, Nikolay Arefyev

The results show that while RoBERTa has a clear advantage for larger training sets, the DV-ngrams-cosine performs better than RoBERTa when the labelled training set is very small (10 or 20 documents).

Ranked #7 on Sentiment Analysis on IMDb

Sentiment Analysis

LIORI at SemEval-2021 Task 2: Span Prediction and Binary Classification approaches to Word-in-Context Disambiguation

no code implementations • SEMEVAL 2021 • Adis Davletov, Nikolay Arefyev, Denis Gordeev, Alexey Rey

This paper presents our approaches to SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context Disambiguation task.

Binary Classification Data Augmentation +3

SkoltechNLP at SemEval-2021 Task 2: Generating Cross-Lingual Training Data for the Word-in-Context Task

no code implementations • SEMEVAL 2021 • Anton Razzhigaev, Nikolay Arefyev, Alexander Panchenko

In our experiments, we used a neural system based on the XLM-R, a pre-trained transformer-based masked language model, as a baseline.

Machine Translation Task 2 +2

GlossReader at SemEval-2021 Task 2: Reading Definitions Improves Contextualized Word Embeddings

no code implementations • SEMEVAL 2021 • Maxim Rachinskiy, Nikolay Arefyev

To verify this hypothesis we developed a solution for the Multilingual and Cross-lingual Word-in-Context (MCL-WiC) task, that does not use any of the shared task data or other WiC data for training.

Task 2 Word Embeddings +3

LIORI at SemEval-2021 Task 8: Ask Transformer for measurements

no code implementations • SEMEVAL 2021 • Adis Davletov, Denis Gordeev, Nikolay Arefyev, Emil Davletov

This work describes our approach for subtasks of SemEval-2021 Task 8: MeasEval: Counts and Measurements which took the official first place in the competition.

Multi-Task Learning Question Answering

Gorynych Transformer at SemEval-2020 Task 6: Multi-task Learning for Definition Extraction

no code implementations • SEMEVAL 2020 • Adis Davletov, Nikolay Arefyev, Alexander Shatilov, Denis Gordeev, Alexey Rey

This paper describes our approach to "DeftEval: Extracting Definitions from Free Text in Textbooks" competition held as a part of Semeval 2020.

Classification Definition Extraction +7

BOS at SemEval-2020 Task 1: Word Sense Induction via Lexical Substitution for Lexical Semantic Change Detection

no code implementations • SEMEVAL 2020 • Nikolay Arefyev, Vasily Zhikov

The first solution performs word sense induction (WSI) first, then makes the decision based on the induced word senses.

Change Detection Clustering +1

Combining Neural Language Models for WordSense Induction

no code implementations • 23 Jun 2020 • Nikolay Arefyev, Boris Sheludko, Tatiana Aleksashina

Word sense induction (WSI) is the problem of grouping occurrences of an ambiguous word according to the expressed sense of this word.

Word Sense Induction

A Comparative Study of Lexical Substitution Approaches based on Neural Language Models

no code implementations • 29 May 2020 • Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

Lexical substitution in context is an extremely powerful technology that can be used as a backbone of various NLP applications, such as word sense induction, lexical relation extraction, data augmentation, etc.

Data Augmentation Relation Extraction +1

Cross-lingual Named Entity List Search via Transliteration

no code implementations • LREC 2020 • Aleksandr Khakhmovich, Svetlana Pavlova, Kira Kirillova, Nikolay Arefyev, Ekaterina Savilova

Out-of-vocabulary words are still a challenge in cross-lingual Natural Language Processing tasks, for which transliteration from source to target language or script is one of the solutions.

Transliteration

Combining Lexical Substitutes in Neural Word Sense Induction

no code implementations • RANLP 2019 • Nikolay Arefyev, Boris Sheludko, Alexander Panchenko

Word Sense Induction (WSI) is the task of grouping of occurrences of an ambiguous word according to their meaning.

Clustering Word Sense Induction

Neural GRANNy at SemEval-2019 Task 2: A combined approach for better modeling of semantic relationships in semantic frame induction

no code implementations • SEMEVAL 2019 • Nikolay Arefyev, Boris Sheludko, Adis Davletov, Dmitry Kharchev, Alex Nevidomsky, Alexander Panchenko

We describe our solutions for semantic frame and role induction subtasks of SemEval 2019 Task 2.

Clustering Language Modelling +1

HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings

1 code implementation • SEMEVAL 2019 • Saba Anwar, Dmitry Ustalov, Nikolay Arefyev, Simone Paolo Ponzetto, Chris Biemann, Alexander Panchenko

We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019).

Clustering Task 2 +1

How much does a word weigh? Weighting word embeddings for word sense induction

no code implementations • 23 May 2018 • Nikolay Arefyev, Pavel Ermolaev, Alexander Panchenko

The paper describes our participation in the first shared task on word sense induction and disambiguation for the Russian language RUSSE'2018 (Panchenko et al., 2018).

Clustering Machine Translation +3