Search Results for author: Alexander Panchenko

Found 78 papers, 36 papers with code

Methods for Detoxification of Texts for the Russian Language

3 code implementations • 19 May 2021 • Daryna Dementieva, Daniil Moskovskiy, Varvara Logacheva, David Dale, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We introduce the first study of automatic detoxification of Russian texts to combat offensive language.

Style Transfer

Making Sense of Word Embeddings

1 code implementation • WS 2016 • Maria Pelevina, Nikolay Arefyev, Chris Biemann, Alexander Panchenko

We present a simple yet effective approach for learning word sense embeddings.

Clustering Word Embeddings

Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion

1 code implementation • 5 Oct 2023 • Anton Razzhigaev, Arseniy Shakhmatov, Anastasia Maltseva, Vladimir Arkhipkin, Igor Pavlov, Ilya Ryabov, Angelina Kuts, Alexander Panchenko, Andrey Kuznetsov, Denis Dimitrov

Text-to-image generation is a significant domain in modern computer vision and has achieved substantial improvements through the evolution of generative architectures.

Ranked #22 on Text-to-Image Generation on MS COCO

Text-to-Image Generation

MERA: A Comprehensive LLM Evaluation in Russian

1 code implementation • 9 Jan 2024 • Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Denis Dimitrov, Alexander Panchenko, Sergei Markov

To address these issues, we introduce an open Multimodal Evaluation of Russian-language Architectures (MERA), a new instruction benchmark for evaluating foundation models oriented towards the Russian language.

Uncertainty Estimation of Transformer Predictions for Misclassification Detection

1 code implementation • ACL 2022 • Artem Vazhentsev, Gleb Kuzmin, Artem Shelmanov, Akim Tsvigun, Evgenii Tsymbalov, Kirill Fedyanin, Maxim Panov, Alexander Panchenko, Gleb Gusev, Mikhail Burtsev, Manvel Avetisian, Leonid Zhukov

Uncertainty estimation (UE) of model predictions is a crucial step for a variety of tasks such as active learning, misclassification detection, adversarial attack detection, out-of-distribution detection, etc.

Active Learning Adversarial Attack Detection +7

Text Detoxification using Large Pre-trained Neural Models

1 code implementation • EMNLP 2021 • David Dale, Anton Voronov, Daryna Dementieva, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We compare our models with a number of methods for style transfer.

Style Transfer

Making Fast Graph-based Algorithms with Graph Metric Embeddings

1 code implementation • ACL 2019 • Andrey Kutuzov, Mohammad Dorgham, Oleksiy Oliynyk, Chris Biemann, Alexander Panchenko

The computation of distance measures between nodes in graphs is inefficient and does not scale to large graphs.

Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation

1 code implementation • 9 Mar 2021 • Nikolay Babakov, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.

Watset: Local-Global Graph Clustering with Applications in Sense and Frame Induction

2 code implementations • CL 2019 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto

We present a detailed theoretical and computational analysis of the Watset meta-algorithm for fuzzy graph clustering, which has been found to be widely applicable in a variety of domains.

Clustering Graph Clustering

Every child should have parents: a taxonomy refinement algorithm based on hyperbolic term embeddings

1 code implementation • ACL 2019 • Rami Aly, Shantanu Acharya, Alexander Ossa, Arne Köhn, Chris Biemann, Alexander Panchenko

We introduce the use of Poincaré embeddings to improve existing state-of-the-art approaches to domain-specific taxonomy induction from text as a signal for both relocating wrong hyponym terms within a (pre-induced) taxonomy as well as for attaching disconnected terms in a taxonomy.

Negative Sampling Improves Hypernymy Extraction Based on Projection Learning

1 code implementation • EACL 2017 • Dmitry Ustalov, Nikolay Arefyev, Chris Biemann, Alexander Panchenko

We present a new approach to extraction of hypernyms based on projection learning and word embeddings.

General Classification Relation Extraction +1

How Certain is Your Transformer?

1 code implementation • EACL 2021 • Artem Shelmanov, Evgenii Tsymbalov, Dmitri Puzyrev, Kirill Fedyanin, Alexander Panchenko, Maxim Panov

In this work, we consider the problem of uncertainty estimation for Transformer-based models.

Natural Language Understanding Point Processes

ParaDetox: Detoxification with Parallel Data

1 code implementation • ACL 2022 • Varvara Logacheva, Daryna Dementieva, Sergey Ustyantsev, Daniil Moskovskiy, David Dale, Irina Krotova, Nikita Semenov, Alexander Panchenko

To the best of our knowledge, these are the first parallel datasets for this task. We describe our pipeline in detail to make it fast to set up for a new language or domain, thus contributing to faster and easier development of new parallel resources. We train several detoxification models on the collected data and compare them with several baselines and state-of-the-art unsupervised approaches.

Sentence

Unsupervised, Knowledge-Free, and Interpretable Word Sense Disambiguation

1 code implementation • EMNLP 2017 • Alexander Panchenko, Fide Marten, Eugen Ruppert, Stefano Faralli, Dmitry Ustalov, Simone Paolo Ponzetto, Chris Biemann

In word sense disambiguation (WSD), knowledge-based systems tend to be much more interpretable than knowledge-free counterparts as they rely on the wealth of manually-encoded elements representing word senses, such as hypernyms, usage examples, and images.

Word Sense Disambiguation

Watset: Automatic Induction of Synsets from a Graph of Synonyms

1 code implementation • ACL 2017 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann

This paper presents a new graph-based approach that induces synsets using synonymy dictionaries and word embeddings.

Clustering Word Embeddings +1

An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages

1 code implementation • LREC 2018 • Dmitry Ustalov, Denis Teslenko, Alexander Panchenko, Mikhail Chernoskutov, Chris Biemann, Simone Paolo Ponzetto

The sparse mode uses the traditional vector space model to estimate the most similar word sense corresponding to its context.

A large-scale computational study of content preservation measures for text style transfer and paraphrase generation

1 code implementation • ACL 2022 • Nikolay Babakov, David Dale, Varvara Logacheva, Alexander Panchenko

In both tasks, the system is supposed to generate a text which should be semantically similar to the input text.

Paraphrase Generation Semantic Similarity +3

Cross-lingual Evidence Improves Monolingual Fake News Detection

1 code implementation • ACL 2021 • Daryna Dementieva, Alexander Panchenko

Misleading information spreads on the Internet at an incredible speed, which can lead to irreparable consequences in some cases.

Fake News Detection News Classification

Multiverse: Multilingual Evidence for Fake News Detection

1 code implementation • 25 Nov 2022 • Daryna Dementieva, Mikhail Kuimov, Alexander Panchenko

In this work, we propose Multiverse -- a new feature based on multilingual evidence that can be used for fake news detection and improve existing approaches.

Fake News Detection News Classification

Unsupervised Semantic Frame Induction using Triclustering

1 code implementation • ACL 2018 • Dmitry Ustalov, Alexander Panchenko, Andrei Kutuzov, Chris Biemann, Simone Paolo Ponzetto

We use dependency triples automatically extracted from a Web-scale corpus to perform unsupervised semantic frame induction.

Clustering

Categorizing Comparative Sentences

3 code implementations • WS 2019 • Alexander Panchenko, Alexander Bondarenko, Mirco Franzek, Matthias Hagen, Chris Biemann

We tackle the tasks of automatically identifying comparative sentences and categorizing the intended preference (e.g., "Python has better NLP libraries than MATLAB" => (Python, better, MATLAB)).

Argument Mining Sentence +1

Improving Hypernymy Extraction with Distributional Semantic Classes

1 code implementation • LREC 2018 • Alexander Panchenko, Dmitry Ustalov, Stefano Faralli, Simone P. Ponzetto, Chris Biemann

In this paper, we show how distributionally-induced semantic classes can be helpful for extracting hypernyms.

Denoising

Unsupervised Sense-Aware Hypernymy Extraction

1 code implementation • 17 Sep 2018 • Dmitry Ustalov, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto

In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction.

Exploring Cross-lingual Text Detoxification with Large Multilingual Language Models

1 code implementation • ACL 2022 • Daniil Moskovskiy, Daryna Dementieva, Alexander Panchenko

This work investigates multilingual and cross-lingual detoxification and the behavior of large multilingual models in this setting.

Style Transfer

Exploring Cross-lingual Textual Style Transfer with Large Multilingual Language Models

1 code implementation • 5 Jun 2022 • Daniil Moskovskiy, Daryna Dementieva, Alexander Panchenko

However, models are not able to perform cross-lingual detoxification, and direct fine-tuning on the exact language is inevitable.

Style Transfer

Active Learning for Abstractive Text Summarization

1 code implementation • 9 Jan 2023 • Akim Tsvigun, Ivan Lysenko, Danila Sedashov, Ivan Lazichny, Eldar Damirov, Vladimir Karlov, Artemy Belousov, Leonid Sanochkin, Maxim Panov, Alexander Panchenko, Mikhail Burtsev, Artem Shelmanov

Active Learning (AL) is a technique developed to reduce the amount of annotation required to achieve a certain level of machine learning model performance.

Abstractive Text Summarization Active Learning +3

Studying Taxonomy Enrichment on Diachronic WordNet Versions

1 code implementation • COLING 2020 • Irina Nikishina, Alexander Panchenko, Varvara Logacheva, Natalia Loukachevitch

Ontologies, taxonomies, and thesauri are used in many NLP tasks.

TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks

1 code implementation • 14 Mar 2024 • Viktor Moskvoretskii, Ekaterina Neminova, Alina Lobanova, Alexander Panchenko, Irina Nikishina

It achieves 11 SotA results and 4 top-2 results out of 16 tasks for the Taxonomy Enrichment, Hypernym Discovery, Taxonomy Construction, and Lexical Entailment tasks.

Domain Adaptation Few-Shot Learning +3

Studying the role of named entities for content preservation in text style transfer

2 code implementations • 20 Jun 2022 • Nikolay Babakov, David Dale, Varvara Logacheva, Irina Krotova, Alexander Panchenko

Text style transfer techniques are gaining popularity in Natural Language Processing, finding various applications such as text detoxification, sentiment, or formality transfer.

Style Transfer Text Style Transfer

Don't lose the message while paraphrasing: A study on content preserving style transfer

1 code implementation • 17 Aug 2023 • Nikolay Babakov, David Dale, Ilya Gusev, Irina Krotova, Alexander Panchenko

Text style transfer techniques are gaining popularity in natural language processing, allowing text to be paraphrased into the required form: from toxic to neutral, from formal to informal, from old to modern English, etc.

Style Transfer Text Style Transfer

RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language

no code implementations • 15 Mar 2018 • Alexander Panchenko, Anastasiya Lopukhina, Dmitry Ustalov, Konstantin Lopukhin, Nikolay Arefyev, Alexey Leontyev, Natalia Loukachevitch

The paper describes the results of the first shared task on word sense induction (WSI) for the Russian language.

Word Sense Induction

How much does a word weigh? Weighting word embeddings for word sense induction

no code implementations • 23 May 2018 • Nikolay Arefyev, Pavel Ermolaev, Alexander Panchenko

The paper describes our participation in the first shared task on word sense induction and disambiguation for the Russian language RUSSE'2018 (Panchenko et al., 2018).

Clustering Machine Translation +3

Neologisms on Facebook

no code implementations • 13 Apr 2018 • Nikita Muravyev, Alexander Panchenko, Sergei Obiedkov

In this paper, we present a study of neologisms and loan words frequently occurring in Facebook user posts.

Marketing

Enriching Frame Representations with Distributionally Induced Senses

no code implementations • LREC 2018 • Stefano Faralli, Alexander Panchenko, Chris Biemann, Simone Paolo Ponzetto

We introduce a new lexical resource that enriches the Framester knowledge graph, which links FrameNet, WordNet, VerbNet and other resources, with semantic features from text corpora.

RUSSE: The First Workshop on Russian Semantic Similarity

no code implementations • 15 Mar 2018 • Alexander Panchenko, Natalia Loukachevitch, Dmitry Ustalov, Denis Paperno, Christian Meyer, Natalia Konstantinova

The paper gives an overview of the Russian Semantic Similarity Evaluation (RUSSE) shared task held in conjunction with the Dialogue 2015 conference.

Building a Web-Scale Dependency-Parsed Corpus from CommonCrawl

no code implementations • LREC 2018 • Alexander Panchenko, Eugen Ruppert, Stefano Faralli, Simone Paolo Ponzetto, Chris Biemann

We present DepCC, the largest-to-date linguistically analyzed corpus in English including 365 million documents, composed of 252 billion tokens and 7.5 billion named entity occurrences in 14.3 billion sentences from a web-scale crawl of the Common Crawl project.

Open Information Extraction Question Answering +1

A Framework for Enriching Lexical Semantic Resources with Distributional Semantics

no code implementations • 23 Dec 2017 • Chris Biemann, Stefano Faralli, Alexander Panchenko, Simone Paolo Ponzetto

While both kinds of semantic resources are available with high lexical coverage, our aligned resource combines the domain specificity and availability of contextual information from distributional models with the conciseness and high quality of manually crafted lexical networks.

Specificity Word Sense Disambiguation

Human and Machine Judgements for Russian Semantic Relatedness

no code implementations • 31 Aug 2017 • Alexander Panchenko, Dmitry Ustalov, Nikolay Arefyev, Denis Paperno, Natalia Konstantinova, Natalia Loukachevitch, Chris Biemann

On the one hand, humans easily make judgments about semantic relatedness.

Fighting with the Sparsity of Synonymy Dictionaries

no code implementations • 30 Aug 2017 • Dmitry Ustalov, Mikhail Chernoskutov, Chris Biemann, Alexander Panchenko

Graph-based synset induction methods, such as MaxMax and Watset, induce synsets by performing a global clustering of a synonymy graph.

Clustering

Learning Graph Embeddings from WordNet-based Similarity Measures

no code implementations • SEMEVAL 2019 • Andrey Kutuzov, Mohammad Dorgham, Oleksiy Oliynyk, Chris Biemann, Alexander Panchenko

We present path2vec, a new approach for learning graph embeddings that relies on structural measures of pairwise node similarities.

Graph Embedding Semantic Similarity +2

Sentiment Index of the Russian Speaking Facebook

no code implementations • 23 Aug 2018 • Alexander Panchenko

A sentiment index measures the average emotional level in a corpus.

Answering Comparative Questions: Better than Ten-Blue-Links?

no code implementations • 15 Jan 2019 • Matthias Schildwächter, Alexander Bondarenko, Julian Zenker, Matthias Hagen, Chris Biemann, Alexander Panchenko

We present CAM (comparative argumentative machine), a novel open-domain IR system to argumentatively compare objects with respect to information extracted from the Common Crawl.

HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings

1 code implementation • SEMEVAL 2019 • Saba Anwar, Dmitry Ustalov, Nikolay Arefyev, Simone Paolo Ponzetto, Chris Biemann, Alexander Panchenko

We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019).

Clustering Task 2 +1

On the Compositionality Prediction of Noun Phrases using Poincaré Embeddings

no code implementations • 7 Jun 2019 • Abhik Jana, Dmitry Puzyrev, Alexander Panchenko, Pawan Goyal, Chris Biemann, Animesh Mukherjee

In particular, we use hypernymy information of the multiword and its constituents encoded in the form of the recently introduced Poincaré embeddings in addition to the distributional information to detect compositionality for noun phrases.

Word Sense Disambiguation for 158 Languages using Word Embeddings Only

no code implementations • LREC 2020 • Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages.

Word Embeddings Word Sense Disambiguation

RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language

no code implementations • 22 May 2020 • Irina Nikishina, Varvara Logacheva, Alexander Panchenko, Natalia Loukachevitch

This paper describes the results of the first shared task on taxonomy enrichment for the Russian language.

A Comparative Study of Lexical Substitution Approaches based on Neural Language Models

no code implementations • 29 May 2020 • Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

Lexical substitution in context is an extremely powerful technology that can be used as a backbone of various NLP applications, such as word sense induction, lexical relation extraction, data augmentation, etc.

Data Augmentation Relation Extraction +1

Neural Entity Linking: A Survey of Models Based on Deep Learning

no code implementations • 31 May 2020 • Ozge Sevgili, Artem Shelmanov, Mikhail Arkhipov, Alexander Panchenko, Chris Biemann

This survey presents a comprehensive description of recent neural entity linking (EL) systems developed since 2015 as a result of the "deep learning revolution" in natural language processing.

Entity Embeddings Entity Linking

Active Learning for Sequence Tagging with Deep Pre-trained Models and Bayesian Uncertainty Estimates

no code implementations • EACL 2021 • Artem Shelmanov, Dmitri Puzyrev, Lyubov Kupriyanova, Denis Belyakov, Daniil Larionov, Nikita Khromov, Olga Kozlova, Ekaterina Artemova, Dmitry V. Dylov, Alexander Panchenko

Annotating training data for sequence tagging of texts is usually very time-consuming.

Active Learning Transfer Learning

Which is Better for Deep Learning: Python or MATLAB? Answering Comparative Questions in Natural Language

no code implementations • EACL 2021 • Viktoriia Chekalina, Alexander Bondarenko, Chris Biemann, Meriem Beloucif, Varvara Logacheva, Alexander Panchenko

in natural language.

SkoltechNLP at SemEval-2020 Task 11: Exploring Unsupervised Text Augmentation for Propaganda Detection

no code implementations • SEMEVAL 2020 • Daryna Dementieva, Igor Markov, Alexander Panchenko

This paper presents a solution for the Span Identification (SI) task in the "Detection of Propaganda Techniques in News Articles" competition at SemEval-2020.

Propaganda detection Text Augmentation

SkoltechNLP at SemEval-2021 Task 2: Generating Cross-Lingual Training Data for the Word-in-Context Task

no code implementations • SEMEVAL 2021 • Anton Razzhigaev, Nikolay Arefyev, Alexander Panchenko

In our experiments, we used a neural system based on XLM-R, a pre-trained transformer-based masked language model, as a baseline.

Machine Translation Task 2 +2

SkoltechNLP at SemEval-2021 Task 5: Leveraging Sentence-level Pre-training for Toxic Span Detection

no code implementations • SEMEVAL 2021 • David Dale, Igor Markov, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We show that fine-tuning a RoBERTa model for this problem is a strong baseline.

Sentence Toxic Spans Detection

Evaluation of Taxonomy Enrichment on Diachronic WordNet Versions

no code implementations • EACL (GWC) 2021 • Irina Nikishina, Natalia Loukachevitch, Varvara Logacheva, Alexander Panchenko

The vast majority of the existing approaches for taxonomy enrichment apply word embeddings as they have proven to accumulate contexts (in a broad sense) extracted from texts which are sufficient for attaching orphan words to the taxonomy.

Word Embeddings

Documents Representation via Generalized Coupled Tensor Chain with the Rotation Group constraint

1 code implementation • Findings (ACL) 2021 • Igor Vorona, Anh-Huy Phan, Alexander Panchenko, Andrzej Cichocki

Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company’s Reputation

no code implementations • EACL (BSNLP) 2021 • Nikolay Babakov, Varvara Logacheva, Olga Kozlova, Nikita Semenov, Alexander Panchenko

We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labelling a dataset for appropriateness.

Generating Lexical Representations of Frames using Lexical Substitution

no code implementations • PaM 2020 • Saba Anwar, Artem Shelmanov, Alexander Panchenko, Chris Biemann

We investigate a simple yet effective method, lexical substitution with word representation models, to automatically expand a small set of frame-annotated sentences with new words for their respective roles and LUs.

Taxonomy Enrichment with Text and Graph Vector Representations

no code implementations • 21 Jan 2022 • Irina Nikishina, Mikhail Tikhomirov, Varvara Logacheva, Yuriy Nazarov, Alexander Panchenko, Natalia Loukachevitch

With the rapid growth of lexical resources for specific domains, the problem of automatic extension of the existing knowledge bases with new words is becoming more and more widespread.

Knowledge Graphs Word Embeddings

Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language

no code implementations • 4 Mar 2022 • Nikolay Babakov, Varvara Logacheva, Alexander Panchenko

Toxicity on the Internet, such as hate speech, offenses towards particular users or groups of people, or the use of obscene words, is an acknowledged problem.

Chatbot Cultural Vocal Bursts Intensity Prediction

Detecting Text Formality: A Study of Text Classification Approaches

2 code implementations • 19 Apr 2022 • Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

Formality is one of the important characteristics of text documents.

Retrieval Style Transfer +3

MEKER: Memory Efficient Knowledge Embedding Representation for Link Prediction and Question Answering

no code implementations • ACL 2022 • Viktoriia Chekalina, Anton Razzhigaev, Albert Sayapin, Evgeny Frolov, Alexander Panchenko

Knowledge Graphs (KGs) are symbolically structured storages of facts.

Knowledge Graphs Link Prediction +1

A Study on Manual and Automatic Evaluation for Text Style Transfer: The Case of Detoxification

no code implementations • HumEval (ACL) 2022 • Varvara Logacheva, Daryna Dementieva, Irina Krotova, Alena Fenogenova, Irina Nikishina, Tatiana Shavrina, Alexander Panchenko

It is often difficult to reliably evaluate models which generate text.

Style Transfer Text Style Transfer

RuArg-2022: Argument Mining Evaluation

no code implementations • 18 Jun 2022 • Evgeny Kotelnikov, Natalia Loukachevitch, Irina Nikishina, Alexander Panchenko

Argumentation analysis is a field of computational linguistics that studies methods for extracting arguments from texts and the relationships between them, as well as building argumentation structure of texts.

Argument Mining Natural Language Inference +1

Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

1 code implementation • COLING 2020 • Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

Lexical substitution, i.e., generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word sense induction and disambiguation, lexical relation extraction, data augmentation, etc.

Data Augmentation Relation Extraction +1

SkoltechNLP at SemEval-2022 Task 8: Multilingual News Article Similarity via Exploration of News Texts to Vector Representations

1 code implementation • SemEval (NAACL) 2022 • Mikhail Kuimov, Daryna Dementieva, Alexander Panchenko

This paper describes our contribution to SemEval 2022 Task 8: Multilingual News Article Similarity.

NER

RuPAWS: A Russian Adversarial Dataset for Paraphrase Identification

1 code implementation • LREC 2022 • Nikita Martynov, Irina Krotova, Varvara Logacheva, Alexander Panchenko, Olga Kozlova, Nikita Semenov

We compare it to the largest available dataset for Russian, ParaPhraser, and show that the best available paraphrase identifiers for the Russian language fail on the RuPAWS dataset.

Paraphrase Identification

Cross-Modal Contextualized Hidden State Projection Method for Expanding of Taxonomic Graphs

no code implementations • COLING (TextGraphs) 2022 • Irina Nikishina, Alsu Vakhitova, Elena Tutubalina, Alexander Panchenko

We propose a method that combines graph- and text-based contextualized representations from transformer networks to predict new entries to the taxonomy.

graph construction Graph Embedding

Pixel-Level BPE for Auto-Regressive Image Generation

no code implementations • MMMPIE (COLING) 2022 • Anton Razzhigaev, Anton Voronov, Andrey Kaznacheev, Andrey Kuznetsov, Denis Dimitrov, Alexander Panchenko

Pixel-level autoregression with Transformer models (Image GPT or iGPT) is one of the recent approaches to image generation that has not received massive attention and elaboration due to quadratic complexity of attention as it imposes huge memory requirements and thus restricts the resolution of the generated images.

Image Generation

Error syntax aware augmentation of feedback comment generation dataset

no code implementations • 29 Dec 2022 • Nikolay Babakov, Maria Lysyuk, Alexander Shvets, Lilya Kazakova, Alexander Panchenko

This paper presents a solution to the GenChal 2022 shared task dedicated to feedback comment generation for writing learning.

Comment Generation

Retrieving Comparative Arguments using Ensemble Methods and Neural Information Retrieval

no code implementations • 1 May 2023 • Viktoriia Chekalina, Alexander Panchenko

In this paper, we present a submission to the Touché lab's Task 2 on Argument Retrieval for Comparative Questions.

Argument Retrieval Information Retrieval +3

Efficient GPT Model Pre-training using Tensor Train Matrix Representation

no code implementations • 5 Jun 2023 • Viktoriia Chekalina, Georgii Novikov, Julia Gusak, Ivan Oseledets, Alexander Panchenko

On the downstream tasks, including language understanding and text summarization, the model performs similarly to the original GPT-2 model.

Language Modelling Text Summarization

Large Language Models Meet Knowledge Graphs to Answer Factoid Questions

no code implementations • 3 Oct 2023 • Mikhail Salnikov, Hai Le, Prateek Rajput, Irina Nikishina, Pavel Braslavski, Valentin Malykh, Alexander Panchenko

Recently, it has been shown that the incorporation of structured knowledge into Large Language Models significantly improves the results for a variety of NLP tasks.

Knowledge Graphs Re-Ranking

Answer Candidate Type Selection: Text-to-Text Language Model for Closed Book Question Answering Meets Knowledge Graphs

no code implementations • 10 Oct 2023 • Mikhail Salnikov, Maria Lysyuk, Pavel Braslavski, Anton Razzhigaev, Valentin Malykh, Alexander Panchenko

Pre-trained Text-to-Text Language Models (LMs), such as T5 or BART yield promising results in the Knowledge Graph Question Answering (KGQA) task.

Graph Question Answering Knowledge Graphs +3

LM-Polygraph: Uncertainty Estimation for Language Models

no code implementations • 13 Nov 2023 • Ekaterina Fadeeva, Roman Vashurin, Akim Tsvigun, Artem Vazhentsev, Sergey Petrakov, Kirill Fedyanin, Daniil Vasilev, Elizaveta Goncharova, Alexander Panchenko, Maxim Panov, Timothy Baldwin, Artem Shelmanov

Recent advancements in the capabilities of large language models (LLMs) have paved the way for a myriad of groundbreaking applications in various fields.

Text Generation

Exploring Methods for Cross-lingual Text Style Transfer: The Case of Text Detoxification

no code implementations • 23 Nov 2023 • Daryna Dementieva, Daniil Moskovskiy, David Dale, Alexander Panchenko

Text detoxification is the task of transferring the style of text from toxic to neutral.

Cross-Lingual Transfer Style Transfer +1

Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification

no code implementations • 7 Mar 2024 • Ekaterina Fadeeva, Aleksandr Rubashevskii, Artem Shelmanov, Sergey Petrakov, Haonan Li, Hamdy Mubarak, Evgenii Tsymbalov, Gleb Kuzmin, Alexander Panchenko, Timothy Baldwin, Preslav Nakov, Maxim Panov

Uncertainty scores leverage information encapsulated in the output of a neural network or its layers to detect unreliable predictions, and we show that they can be used to fact-check the atomic claims in the LLM output.

Fact Checking Hallucination +1

MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages

no code implementations • 2 Apr 2024 • Daryna Dementieva, Nikolay Babakov, Alexander Panchenko

Text detoxification is a textual style transfer (TST) task where a text is paraphrased from a toxic surface form, e.g., featuring rude words, to the neutral register.

Style Transfer

SmurfCat at SemEval-2024 Task 6: Leveraging Synthetic Data for Hallucination Detection

no code implementations • 9 Apr 2024 • Elisei Rykov, Yana Shishkina, Kseniia Petrushina, Kseniia Titova, Sergey Petrakov, Alexander Panchenko

In this paper, we present our novel systems developed for the SemEval-2024 hallucination detection task.

Hallucination
