Search Results for author: Ekaterina Artemova

Found 46 papers, 28 papers with code

A Dataset for Noun Compositionality Detection for a Slavic Language

1 code implementation WS 2019 Dmitry Puzyrev, Artem Shelmanov, Alexander Panchenko, Ekaterina Artemova

This paper presents the first gold-standard resource for Russian annotated with compositionality information of noun compounds.

Sentence Embeddings for Russian NLU

1 code implementation 29 Oct 2019 Dmitry Popov, Alexander Pugachev, Polina Svyatokum, Elizaveta Svitanko, Ekaterina Artemova

We investigate the performance of sentence embedding models on several tasks for the Russian language.

Multiple-choice Paraphrase Identification +3
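
As a hedged illustration of the kind of baseline such evaluations typically include, the sketch below builds sentence vectors by mean-pooling pre-trained word vectors; the vector file name and whitespace tokenization are assumptions, not the paper's exact setup.

```python
# A minimal mean-pooled word-embedding baseline for sentence similarity,
# assuming pre-trained Russian word vectors are available locally.
import numpy as np
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("ru_vectors.vec")  # hypothetical path

def embed(sentence: str) -> np.ndarray:
    # Average the vectors of in-vocabulary tokens; zeros if none are known.
    tokens = [t for t in sentence.lower().split() if t in vectors]
    if not tokens:
        return np.zeros(vectors.vector_size)
    return np.mean([vectors[t] for t in tokens], axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))
```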

Char-RNN and Active Learning for Hashtag Segmentation

no code implementations 8 Nov 2019 Taisiya Glushkova, Ekaterina Artemova

We explore the abilities of a character-level recurrent neural network (char-RNN) for hashtag segmentation.

Active Learning
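
A minimal sketch of how hashtag segmentation can be framed as per-character boundary tagging with a char-RNN; the architecture and sizes below are illustrative assumptions, not the paper's exact model.

```python
# Per-character boundary tagging: predict, after each character, whether a
# word boundary should be inserted.
import torch
import torch.nn as nn

class CharSegmenter(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)  # boundary / no boundary after char

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(self.emb(char_ids))
        return self.out(h)  # logits per character position

# e.g. "#nowplaying" -> predict a boundary after "now" to recover "now playing"
```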

Word Sense Disambiguation for 158 Languages using Word Embeddings Only

no code implementations LREC 2020 Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages.

Word Embeddings Word Sense Disambiguation
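
For intuition, here is a rough sketch of sense induction from pre-trained word vectors: cluster a word's nearest neighbours into sense groups, then pick the cluster closest to the context. The clustering method, the fixed sense count, and the vector file name are assumptions; the paper's actual pipeline (sense inventories induced across 158 languages) is more involved.

```python
# Induce senses by clustering a word's fastText neighbours, then
# disambiguate an occurrence by matching its context against the clusters.
import numpy as np
from gensim.models import KeyedVectors
from sklearn.cluster import AgglomerativeClustering

vectors = KeyedVectors.load_word2vec_format("cc.en.300.vec")  # assumed file

def induce_senses(word: str, topn: int = 50, n_senses: int = 2) -> dict:
    neighbours = [w for w, _ in vectors.most_similar(word, topn=topn)]
    X = np.stack([vectors[w] for w in neighbours])
    labels = AgglomerativeClustering(n_clusters=n_senses).fit_predict(X)
    return {s: [w for w, l in zip(neighbours, labels) if l == s]
            for s in range(n_senses)}

def disambiguate(word: str, context: list[str], senses: dict) -> int:
    # Pick the sense whose cluster centroid is closest to the context vector.
    ctx = np.mean([vectors[t] for t in context if t in vectors], axis=0)
    centroids = {s: np.mean([vectors[w] for w in ws], axis=0)
                 for s, ws in senses.items()}
    return max(centroids, key=lambda s: float(ctx @ centroids[s]) /
               (np.linalg.norm(ctx) * np.linalg.norm(centroids[s]) + 1e-9))
```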

A Joint Approach to Compound Splitting and Idiomatic Compound Detection

no code implementations LREC 2020 Irina Krotova, Sergey Aksenov, Ekaterina Artemova

Applications such as machine translation, speech recognition, and information retrieval require efficient handling of noun compounds as they are one of the possible sources for out-of-vocabulary (OOV) words.

Information Retrieval Machine Translation +4

Data-driven models and computational tools for neurolinguistics: a language technology perspective

1 code implementation 23 Mar 2020 Ekaterina Artemova, Amir Bakarov, Aleksey Artemov, Evgeny Burnaev, Maxim Sharaev

In this paper, we focus on how language technologies connect to and influence research in neurolinguistics.

Word Embeddings

NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing

1 code implementation 12 Jun 2020 Nikita Klyuchnikov, Ilya Trofimov, Ekaterina Artemova, Mikhail Salnikov, Maxim Fedorov, Evgeny Burnaev

In this work, we step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP).

Language Modelling Neural Architecture Search

DaNetQA: a yes/no Question Answering Dataset for the Russian Language

no code implementations 6 Oct 2020 Taisia Glushkova, Alexey Machnev, Alena Fenogenova, Tatiana Shavrina, Ekaterina Artemova, Dmitry I. Ignatov

The task is to take both the question and a paragraph as input and come up with a yes/no answer, i.e. to produce a binary output.

Question Answering Sentence +2
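
A minimal sketch of the task format, assuming a generic multilingual encoder rather than the paper's exact models: encode the question-paragraph pair and predict a binary label.

```python
# Yes/no QA as pair classification with a sequence classifier.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "bert-base-multilingual-cased"  # assumed stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

def answer(question: str, paragraph: str) -> bool:
    inputs = tokenizer(question, paragraph, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return bool(logits.argmax(-1).item())  # True = "yes", False = "no"
```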

ELMo and BERT in semantic change detection for Russian

no code implementations 7 Oct 2020 Julia Rodina, Yuliya Trofimova, Andrey Kutuzov, Ekaterina Artemova

We study the effectiveness of contextualized embeddings for the task of diachronic semantic change detection for Russian language data.

Change Detection
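
One common recipe this line of work builds on can be sketched as follows: average a word's contextualized vectors over its usages in two time periods and measure the distance between the averages. The checkpoint and pooling below are assumptions, not the paper's exact configuration.

```python
# Diachronic change score from averaged contextualized embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

name = "bert-base-multilingual-cased"  # assumed stand-in for a Russian model
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

def word_vector(sentences: list[str], word: str) -> torch.Tensor:
    # Assumes `word` actually occurs in the sentences.
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    vecs = []
    for s in sentences:
        enc = tokenizer(s, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        ids = enc["input_ids"][0]
        for i in range(len(ids) - len(word_ids) + 1):
            if ids[i:i + len(word_ids)].tolist() == word_ids:
                vecs.append(hidden[i:i + len(word_ids)].mean(0))
    return torch.stack(vecs).mean(0)

# change score = 1 - cosine(word_vector(old_corpus, w), word_vector(new_corpus, w))
```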

RuREBus: a Case Study of Joint Named Entity Recognition and Relation Extraction from e-Government Domain

no code implementations 29 Oct 2020 Vitaly Ivanin, Ekaterina Artemova, Tatiana Batura, Vladimir Ivanov, Veronika Sarkisyan, Elena Tutubalina, Ivan Smurov

We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency.

Named Entity Recognition +4

RuSentEval: Linguistic Source, Encoder Force!

2 code implementations EACL (BSNLP) 2021 Vladislav Mikhailov, Ekaterina Taktasheva, Elina Sigdel, Ekaterina Artemova

The success of pre-trained transformer language models has sparked a great deal of interest in how these models work and what they learn about language.

Morph Call: Probing Morphosyntactic Content of Multilingual Transformers

1 code implementation NAACL (SIGTYP) 2021 Vladislav Mikhailov, Oleg Serikov, Ekaterina Artemova

The outstanding performance of transformer-based language models on a great variety of NLP and NLU tasks has stimulated interest in exploring their inner workings.

Common Sense Reasoning MORPH +4

MOROCCO: Model Resource Comparison Framework

3 code implementations 29 Apr 2021 Valentin Malykh, Alexander Kukushkin, Ekaterina Artemova, Vladislav Mikhailov, Maria Tikhonova, Tatiana Shavrina

The new generation of pre-trained NLP models pushes the SOTA to new limits, but at the cost of computational resources, to the point that their use in real production environments is often prohibitively expensive.
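
To make the resource angle concrete, here is a minimal sketch of the kind of measurement such a comparison framework performs: wall-clock latency for one model on one batch. The model and batch are placeholders, not MOROCCO's actual interface.

```python
# Measure inference latency for a fixed batch; on GPU one would also track
# peak memory via torch.cuda.max_memory_allocated().
import time
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased").eval()

batch = tokenizer(["a test sentence"] * 32, padding=True, return_tensors="pt")
start = time.perf_counter()
with torch.no_grad():
    model(**batch)
print(f"latency: {time.perf_counter() - start:.3f}s for a batch of 32")
```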

A Single Example Can Improve Zero-Shot Data Generation

no code implementations 16 Aug 2021 Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

In the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.

Intent Classification +1
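
A hedged sketch of the zero-shot setup described above: a generator is conditioned on an intent name and can then be prompted with intents never seen in training. The prompt format and checkpoint are illustrative assumptions.

```python
# Intent-conditioned utterance generation with a seq2seq model.
from transformers import pipeline

generator = pipeline("text2text-generation", model="t5-small")  # placeholder model

def generate_utterances(intent: str, n: int = 3) -> list[str]:
    prompt = f"generate utterance for intent: {intent}"
    outputs = generator(prompt, num_return_sequences=n, do_sample=True)
    return [o["generated_text"] for o in outputs]

# e.g. generate_utterances("book_flight") for an intent unseen during training
```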

Artificial Text Detection via Examining the Topology of Attention Maps

2 code implementations EMNLP 2021 Laida Kushnareva, Daniil Cherniavskii, Vladislav Mikhailov, Ekaterina Artemova, Serguei Barannikov, Alexander Bernstein, Irina Piontkovskaya, Dmitri Piontkovski, Evgeny Burnaev

The impressive capabilities of recent generative models to create texts that are challenging to distinguish from human-written ones can be misused for generating fake news, product reviews, and even abusive content.

Text Detection Topological Data Analysis
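
As a loose illustration of the attention-graph framing (not the paper's actual TDA features), the sketch below thresholds one attention map into a token graph and counts its connected components, one of the simplest topological statistics.

```python
# Extract one attention head's map and compute a coarse graph feature.
import torch
from scipy.sparse.csgraph import connected_components
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

def attention_components(text: str, layer: int = 0, head: int = 0,
                         threshold: float = 0.1) -> int:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        attn = model(**enc).attentions[layer][0, head]  # (seq, seq)
    graph = (attn > threshold).numpy()
    n, _ = connected_components(graph, directed=False)
    return n  # one feature; a detector would aggregate many such features
```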

Shaking Syntactic Trees on the Sesame Street: Multilingual Probing with Controllable Perturbations

1 code implementation EMNLP (MRL) 2021 Ekaterina Taktasheva, Vladislav Mikhailov, Ekaterina Artemova

Recent research has adopted a new experimental field centered around the concept of text perturbations, which has revealed that shuffled word order has little to no impact on the downstream performance of Transformer-based language models across many NLP tasks.
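
A minimal sketch of the word-order perturbation in question:

```python
# Shuffle word order within a sentence before probing a model.
import random

def shuffle_words(sentence: str, seed: int = 0) -> str:
    words = sentence.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

print(shuffle_words("the cat sat on the mat"))
```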

Call Larisa Ivanovna: Code-Switching Fools Multilingual NLU Models

2 code implementations 29 Sep 2021 Alexey Birshert, Ekaterina Artemova

This is in line with the common understanding of how multilingual models transfer knowledge between languages.

Cross-Lingual Transfer Intent Recognition +3

Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

no code implementations 15 Feb 2022 Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Tatiana Shavrina, Anton Emelyanov, Denis Shevelev, Alexandr Kukushkin, Valentin Malykh, Ekaterina Artemova

In the last year, new neural architectures and multilingual pre-trained models have been released for Russian, which led to performance evaluation problems across a range of language understanding tasks.

Common Sense Reasoning Reading Comprehension

Template-based Approach to Zero-shot Intent Recognition

no code implementations 22 Jun 2022 Dmitry Lamanov, Pavel Burnyshev, Ekaterina Artemova, Valentin Malykh, Andrey Bout, Irina Piontkovskaya

We outperform the previous state-of-the-art F1 score by up to 16% for unseen intents, using only intent labels and user utterances, without accessing external sources (such as knowledge bases).

Intent Recognition Natural Language Inference +6
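
A hedged sketch of the general template-plus-NLI recipe: cast each intent label into a hypothesis template and let an entailment model score it against the utterance. The template wording and checkpoint are assumptions, not the paper's exact setup.

```python
# Zero-shot intent recognition via NLI-based classification.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

utterance = "I want to move money to my savings account"
intents = ["transfer", "balance", "card_lost"]  # label names feed the template
result = classifier(utterance, candidate_labels=intents,
                    hypothesis_template="The user wants to {}.")
print(result["labels"][0])  # top-ranked intent, seen or unseen alike
```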

Vote'n'Rank: Revision of Benchmarking with Social Choice Theory

1 code implementation 11 Oct 2022 Mark Rofin, Vladislav Mikhailov, Mikhail Florinskiy, Andrey Kravchenko, Elena Tutubalina, Tatiana Shavrina, Daniel Karabekyan, Ekaterina Artemova

The development of state-of-the-art systems in different applied areas of machine learning (ML) is driven by benchmarks, which have shaped the paradigm of evaluating generalisation capabilities from multiple perspectives.

Benchmarking Result aggregation +1
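
For intuition, here is one social-choice rule (Borda count) applied to per-task benchmark scores; Vote'n'Rank studies a family of such rules, and the systems and scores below are made up for illustration.

```python
# Aggregate per-task rankings into one winner with Borda count.
scores = {  # task -> {system: score}
    "task_a": {"sys1": 0.81, "sys2": 0.85, "sys3": 0.79},
    "task_b": {"sys1": 0.90, "sys2": 0.70, "sys3": 0.88},
}

def borda(scores: dict) -> dict:
    points = {s: 0 for s in next(iter(scores.values()))}
    for task_scores in scores.values():
        ranking = sorted(task_scores, key=task_scores.get, reverse=True)
        for rank, system in enumerate(ranking):
            points[system] += len(ranking) - 1 - rank
    return points

totals = borda(scores)
print(max(totals, key=totals.get))  # overall winner
```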

RuCoLA: Russian Corpus of Linguistic Acceptability

1 code implementation 23 Oct 2022 Vladislav Mikhailov, Tatiana Shamardina, Max Ryabinin, Alena Pestova, Ivan Smurov, Ekaterina Artemova

Linguistic acceptability (LA) attracts the attention of the research community due to its many uses, such as testing the grammatical knowledge of language models and filtering implausible texts with acceptability classifiers.

Linguistic Acceptability Text Generation

Can BERT eat RuCoLA? Topological Data Analysis to Explain

2 code implementations 4 Apr 2023 Irina Proskurina, Irina Piontkovskaya, Ekaterina Artemova

Our results contribute to understanding the behavior of monolingual LMs in the acceptability classification task, provide insights into the functional roles of attention heads, and highlight the advantages of TDA-based approaches for analyzing LMs.

CoLA Linguistic Acceptability +2

Low-resource Bilingual Dialect Lexicon Induction with Large Language Models

1 code implementation 19 Apr 2023 Ekaterina Artemova, Barbara Plank

Bilingual word lexicons are crucial tools for multilingual natural language understanding and machine translation tasks, as they facilitate the mapping of words in one language to their synonyms in another language.

Bilingual Lexicon Induction Natural Language Understanding +4

Boosting Zero-shot Cross-lingual Retrieval by Training on Artificially Code-Switched Data

1 code implementation 9 May 2023 Robert Litschko, Ekaterina Artemova, Barbara Plank

Transferring information retrieval (IR) models from a high-resource language (typically English) to other languages in a zero-shot fashion has become a widely adopted approach.

Cross-Lingual Word Embeddings Information Retrieval +2
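
A minimal sketch of artificial code-switching as a data transformation: replace a fraction of words with translations from a bilingual lexicon before training. The tiny lexicon and replacement rate are purely illustrative.

```python
# Code-switch a query by swapping in lexicon translations at random.
import random

lexicon = {"city": "Stadt", "museum": "Museum", "opening": "Öffnung"}  # en -> de

def code_switch(text: str, rate: float = 0.5, seed: int = 0) -> str:
    rng = random.Random(seed)
    return " ".join(
        lexicon[w] if w in lexicon and rng.random() < rate else w
        for w in text.split()
    )

print(code_switch("museum opening hours in the city"))
```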

Donkii: Can Annotation Error Detection Methods Find Errors in Instruction-Tuning Datasets?

1 code implementation 4 Sep 2023 Leon Weber-Genzel, Robert Litschko, Ekaterina Artemova, Barbara Plank

Our results show that the choice of the right AED method and model size is indeed crucial; we derive practical recommendations for how to use AED methods to clean instruction-tuning data.

Text Generation

LUNA: A Framework for Language Understanding and Naturalness Assessment

1 code implementation 9 Jan 2024 Marat Saidov, Aleksandra Bakalova, Ekaterina Taktasheva, Vladislav Mikhailov, Ekaterina Artemova

The evaluation of Natural Language Generation (NLG) models has gained increased attention, urging the development of metrics that evaluate various aspects of generated text.

NLG Evaluation Text Generation

Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties

1 code implementation 3 Feb 2024 Ekaterina Artemova, Verena Blaschke, Barbara Plank

Inspired by prior work on English varieties, we craft and manually evaluate perturbation rules that transform German sentences into colloquial forms and use them to synthesize test sets in four ToD datasets.

Intent Recognition slot-filling +3
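
A hedged sketch of what such a rule-based perturbation might look like; the two contractions below are common German colloquialisms chosen for illustration, not the paper's curated and manually evaluated rule set.

```python
# Apply rewrite rules that push standard German toward colloquial forms.
import re

RULES = [
    (r"\bhaben wir\b", "hamwa"),  # "haben wir" -> colloquial "hamwa"
    (r"\bso ein\b", "so'n"),      # "so ein" -> colloquial "so'n"
]

def to_colloquial(sentence: str) -> str:
    for pattern, replacement in RULES:
        sentence = re.sub(pattern, replacement, sentence)
    return sentence

print(to_colloquial("Wann haben wir so ein Meeting?"))
```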

Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data

1 code implementation 19 Mar 2024 Siyao Peng, Zihang Sun, Huangyan Shan, Marie Kolm, Verena Blaschke, Ekaterina Artemova, Barbara Plank

Named Entity Recognition (NER) is a fundamental task to extract key information from texts, but annotated resources are scarce for dialects.

Dialect Identification Multi-Task Learning +3

RuBia: A Russian Language Bias Detection Dataset

no code implementations 26 Mar 2024 Veronika Grigoreva, Anastasiia Ivanova, Ilseyar Alimova, Ekaterina Artemova

To illustrate the dataset's purpose, we conduct a diagnostic evaluation of state-of-the-art or near-state-of-the-art LLMs and discuss the LLMs' predisposition to social biases.

Bias Detection Sentence

Single Example Can Improve Zero-Shot Data Generation

no code implementations INLG (ACL) 2021 Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

We explore two approaches to the generation of task-oriented utterances: in the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.

Intent Classification +1

Supervised and Unsupervised Evaluation of Synthetic Code-Switching

no code implementations COLING (WNUT) 2022 Evgeny Orlov, Ekaterina Artemova

Code-switching (CS) is a phenomenon of mixing words and phrases from multiple languages within a single sentence or conversation.

Sentence
