Search Results for author: Varvara Logacheva

Found 35 papers, 11 papers with code

RuPAWS: A Russian Adversarial Dataset for Paraphrase Identification

1 code implementation LREC 2022 Nikita Martynov, Irina Krotova, Varvara Logacheva, Alexander Panchenko, Olga Kozlova, Nikita Semenov

We compare it to ParaPhraser, the largest available dataset for Russian, and show that the best available paraphrase identifiers for the Russian language fail on the RuPAWS dataset.

Paraphrase Identification

ParaDetox: Detoxification with Parallel Data

1 code implementation ACL 2022 Varvara Logacheva, Daryna Dementieva, Sergey Ustyantsev, Daniil Moskovskiy, David Dale, Irina Krotova, Nikita Semenov, Alexander Panchenko

To the best of our knowledge, these are the first parallel datasets for this task. We describe our pipeline in detail to make it fast to set up for a new language or domain, thus contributing to faster and easier development of new parallel resources. We train several detoxification models on the collected data and compare them with several baselines and state-of-the-art unsupervised approaches.

Sentence

Evaluation of Taxonomy Enrichment on Diachronic WordNet Versions

no code implementations EACL (GWC) 2021 Irina Nikishina, Natalia Loukachevitch, Varvara Logacheva, Alexander Panchenko

The vast majority of the existing approaches for taxonomy enrichment apply word embeddings as they have proven to accumulate contexts (in a broad sense) extracted from texts which are sufficient for attaching orphan words to the taxonomy.

Word Embeddings

Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company’s Reputation

no code implementations EACL (BSNLP) 2021 Nikolay Babakov, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labelling a dataset for appropriateness.

Studying the role of named entities for content preservation in text style transfer

2 code implementations 20 Jun 2022 Nikolay Babakov, David Dale, Varvara Logacheva, Irina Krotova, Alexander Panchenko

Text style transfer techniques are gaining popularity in Natural Language Processing, finding various applications such as text detoxification, sentiment, or formality transfer.

Style Transfer Text Style Transfer

Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language

no code implementations 4 Mar 2022 Nikolay Babakov, Varvara Logacheva, Alexander Panchenko

Toxicity on the Internet, such as hate speech, offenses towards particular users or groups of people, or the use of obscene words, is an acknowledged problem.

Chatbot Cultural Vocal Bursts Intensity Prediction

Taxonomy Enrichment with Text and Graph Vector Representations

no code implementations 21 Jan 2022 Irina Nikishina, Mikhail Tikhomirov, Varvara Logacheva, Yuriy Nazarov, Alexander Panchenko, Natalia Loukachevitch

With the rapid growth of lexical resources for specific domains, the problem of automatic extension of the existing knowledge bases with new words is becoming more and more widespread.

Knowledge Graphs Word Embeddings

Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation

1 code implementation 9 Mar 2021 Nikolay Babakov, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.

RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language

no code implementations 22 May 2020 Irina Nikishina, Varvara Logacheva, Alexander Panchenko, Natalia Loukachevitch

This paper describes the results of the first shared task on taxonomy enrichment for the Russian language.

Word Sense Disambiguation for 158 Languages using Word Embeddings Only

no code implementations LREC 2020 Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages.

Word Embeddings Word Sense Disambiguation

MIPT System for World-Level Quality Estimation

no code implementations WS 2019 Mikhail Mosyagin, Varvara Logacheva

We explore different model architectures for the WMT 19 shared task on word-level quality estimation of automatic translation.

Translation

Findings of the WMT 2018 Shared Task on Quality Estimation

no code implementations WS 2018 Lucia Specia, Frédéric Blain, Varvara Logacheva, Ramón Astudillo, André F. T. Martins

We report the results of the WMT18 shared task on Quality Estimation, i.e., the task of predicting the quality of the output of machine translation systems at various granularity levels: word, phrase, sentence, and document.

Machine Translation Sentence +1

Phrase Level Segmentation and Labelling of Machine Translation Errors

no code implementations LREC 2016 Frédéric Blain, Varvara Logacheva, Lucia Specia

This paper presents our work towards a novel approach for Quality Estimation (QE) of machine translation based on sequences of adjacent words, the so-called phrases.

Machine Translation Sentence +1

A Quality-based Active Sample Selection Strategy for Statistical Machine Translation

no code implementations LREC 2014 Varvara Logacheva, Lucia Specia

Our approach is based on a quality estimation technique which involves a wider range of features of the source text, automatic translation, and machine translation system compared to previous work.

Active Learning Machine Translation +3
