Search Results for author: Valentin Malykh

Found 36 papers, 10 papers with code

Single Example Can Improve Zero-Shot Data Generation

no code implementations INLG (ACL) 2021 Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

We explore two approaches to the generation of task-oriented utterances: in the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.

intent-classification Intent Classification +1

Ask Me Anything in Your Native Language

no code implementations NAACL 2022 Nikita Sorokin, Dmitry Abulkhanov, Irina Piontkovskaya, Valentin Malykh

Cross-lingual question answering is a thriving field in the modern world, helping people to search information on the web more efficiently.

Cross-Lingual Question Answering Retrieval

CIDRe: A Reference-Free Multi-Aspect Criterion for Code Comment Quality Measurement

no code implementations26 May 2025 Maria Dziuba, Valentin Malykh

Effective generation of structured code comments requires robust quality metrics for dataset curation, yet existing approaches (SIDE, MIDQ, STASIS) suffer from limited code-comment analysis.

Informativeness

StRuCom: A Novel Dataset of Structured Code Comments in Russian

no code implementations16 May 2025 Maria Dziuba, Valentin Malykh

Structured code comments in docstring format are essential for code comprehension and maintenance, but existing machine learning models for their generation perform poorly for Russian compared to English.

ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations

1 code implementation5 May 2025 Dmitriy Shopkhoev, Ammar Ali, Magauiya Zhussip, Valentin Malykh, Stamatios Lefkimmiatis, Nikos Komodakis, Sergey Zagoruyko

In contrast to conventional pruning approaches that require additional training or fine-tuning, our approach requires only a small calibration dataset that is used to estimate a linear transformation to approximate the pruned blocks.

Network Pruning
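The abstract describes estimating a linear transformation from a small calibration set to stand in for the pruned blocks. A minimal sketch of that idea, using a closed-form least-squares fit on toy calibration activations (the paper's exact estimation procedure may differ; all names here are illustrative):

```python
import numpy as np

def estimate_replacement(x_in, x_out):
    """Least-squares fit of a linear map T so that x_in @ T approximates x_out.

    x_in:  (n_samples, d) calibration activations entering the pruned block
    x_out: (n_samples, d) calibration activations leaving the pruned block
    """
    T, *_ = np.linalg.lstsq(x_in, x_out, rcond=None)
    return T

# Toy calibration data: here the "pruned block" is itself a linear map,
# so the fit should recover it almost exactly.
rng = np.random.default_rng(0)
x_in = rng.normal(size=(256, 16))
true_map = rng.normal(size=(16, 16))
x_out = x_in @ true_map

T = estimate_replacement(x_in, x_out)
err = np.abs(x_in @ T - x_out).max()
```

At inference time, the fitted matrix `T` would replace the pruned block; in a real transformer the target block is nonlinear, so the fit is only an approximation and the calibration set must be representative.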

Iterative Self-Training for Code Generation via Reinforced Re-Ranking

no code implementations13 Apr 2025 Nikita Sorokin, Ivan Sedykh, Valentin Malykh

One effective way to enhance code generation is by pairing a code generation model with a reranker model, which selects the best solution from the generated samples.

Code Generation Reranking +1
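The generate-then-rerank setup described above can be sketched in a few lines: sample several candidate solutions, score each with a reranker, and keep the argmax. The scorer below is a toy stand-in (a real system would use a trained reranker model over sampled LLM outputs):

```python
def rerank(candidates, score_fn):
    """Select the candidate with the highest reranker score."""
    return max(candidates, key=score_fn)

# Hypothetical sampled candidates for the prompt "write add(a, b)".
candidates = [
    "def add(a, b): return a - b",   # buggy sample
    "def add(a, b): return a + b",   # correct sample
    "def add(a, b): pass",           # empty sample
]

def toy_score(code):
    # Toy scorer standing in for a learned reranker: reward real addition.
    return 1.0 if "a + b" in code else 0.0

best = rerank(candidates, toy_score)
```

The value of the pairing comes from the reranker seeing complete solutions, which the generator cannot judge while sampling token by token.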

SumHiS: Extractive Summarization Exploiting Hidden Structure

no code implementations12 Jun 2024 Pavel Tikhonov, Anastasiya Ianina, Valentin Malykh

We introduce a new approach to extractive summarization task using hidden clustering structure of the text.

Clustering Extractive Summarization
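One common way to exploit a clustering structure for extractive summarization is to pick, for each cluster of sentences, the sentence closest to the cluster centroid. A minimal sketch with toy 2-d "embeddings" and precomputed cluster labels (the paper's hidden-structure model is more involved; everything here is illustrative):

```python
import numpy as np

def cluster_representatives(sent_vecs, labels, sentences):
    """For each cluster, return the sentence nearest the cluster centroid."""
    picks = []
    for c in sorted(set(labels)):
        idx = np.flatnonzero(labels == c)
        centroid = sent_vecs[idx].mean(axis=0)
        # Index of the cluster member closest to the centroid.
        best = idx[np.linalg.norm(sent_vecs[idx] - centroid, axis=1).argmin()]
        picks.append(best)
    return [sentences[i] for i in sorted(picks)]  # keep document order

sentences = [
    "Cats purr when content.", "A cat's purr is calming.", "Purring may aid healing.",
    "Dogs bark at strangers.", "Barking warns the pack.", "A bark can startle.",
]
# Toy vectors; a real system would use learned sentence embeddings.
sent_vecs = np.array([
    [0.0, 1.0], [0.1, 0.95], [0.2, 1.0],
    [1.0, 0.0], [0.95, 0.05], [1.0, 0.15],
])
labels = np.array([0, 0, 0, 1, 1, 1])

summary = cluster_representatives(sent_vecs, labels, sentences)
```

Selecting one representative per cluster keeps the summary short while covering each topical group of the text.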

Large Language Models Meet Knowledge Graphs to Answer Factoid Questions

no code implementations3 Oct 2023 Mikhail Salnikov, Hai Le, Prateek Rajput, Irina Nikishina, Pavel Braslavski, Valentin Malykh, Alexander Panchenko

Recently, it has been shown that the incorporation of structured knowledge into Large Language Models significantly improves the results for a variety of NLP tasks.

Knowledge Graphs Re-Ranking

DetIE: Multilingual Open Information Extraction Inspired by Object Detection

1 code implementation24 Jun 2022 Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey Nikolenko

Our model sets a new state-of-the-art performance of 67.7% F1 on CaRB (evaluated as OIE2016) while being 3.35x faster at inference than the previous state of the art.

Multilingual NLP Object +2

Template-based Approach to Zero-shot Intent Recognition

no code implementations22 Jun 2022 Dmitry Lamanov, Pavel Burnyshev, Ekaterina Artemova, Valentin Malykh, Andrey Bout, Irina Piontkovskaya

We outperform the previous state-of-the-art F1-measure by up to 16% for unseen intents, using only intent labels and user utterances and without accessing external sources (such as knowledge bases).


Intent Recognition Natural Language Inference +6

WikiMulti: a Corpus for Cross-Lingual Summarization

1 code implementation23 Apr 2022 Pavel Tikhonov, Valentin Malykh

Cross-lingual summarization (CLS) is the task to produce a summary in one particular language for a source document in a different language.

Abstractive Text Summarization Cross-Lingual Abstractive Summarization

Russian SuperGLUE 1.1: Revising the Lessons not Learned by Russian NLP models

no code implementations15 Feb 2022 Alena Fenogenova, Maria Tikhonova, Vladislav Mikhailov, Tatiana Shavrina, Anton Emelyanov, Denis Shevelev, Alexandr Kukushkin, Valentin Malykh, Ekaterina Artemova

In the last year, new neural architectures and multilingual pre-trained models have been released for Russian, which led to performance evaluation problems across a range of language understanding tasks.

Common Sense Reasoning Reading Comprehension

A Single Example Can Improve Zero-Shot Data Generation

no code implementations16 Aug 2021 Pavel Burnyshev, Valentin Malykh, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

In the zero-shot approach, the model is trained to generate utterances from seen intents and is further used to generate utterances for intents unseen during training.

intent-classification Intent Classification +1

How not to Lie with a Benchmark: Rearranging NLP Leaderboards

no code implementations NeurIPS 2021 Tatiana Shavrina, Valentin Malykh

Proper model ranking and comparison with a human level is an essential requirement for every benchmark to be a reliable measurement of the model quality.

MOROCCO: Model Resource Comparison Framework

3 code implementations29 Apr 2021 Valentin Malykh, Alexander Kukushkin, Ekaterina Artemova, Vladislav Mikhailov, Maria Tikhonova, Tatiana Shavrina

The new generation of pre-trained NLP models pushes the SOTA to new limits, but at the cost of computational resources, to the point that their use in real production environments is often prohibitively expensive.

model

Improving unsupervised neural aspect extraction for online discussions using out-of-domain classification

no code implementations17 Jun 2020 Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey Nikolenko

Deep learning architectures based on self-attention have recently achieved and surpassed state of the art results in the task of unsupervised aspect extraction and topic modeling.

Aspect Extraction domain classification +2

Humans Keep It One Hundred: an Overview of AI Journey

1 code implementation LREC 2020 Tatiana Shavrina, Anton Emelyanov, Alena Fenogenova, Vadim Fomin, Vladislav Mikhailov, Andrey Evlampiev, Valentin Malykh, Vladimir Larin, Alex Natekin, Aleksandr Vatulin, Peter Romov, Daniil Anastasiev, Nikolai Zinov, Andrey Chertok

Artificial General Intelligence (AGI) is showing growing performance in numerous applications: beating human performance in Chess and Go, using knowledge bases and text sources to answer questions (SQuAD), and even passing human examinations (the Aristo project).

Text Generation

AspeRa: Aspect-Based Rating Prediction Based on User Reviews

no code implementations WS 2019 Elena Tutubalina, Valentin Malykh, Sergey Nikolenko, Anton Alekseev, Ilya Shenbin

We propose a novel Aspect-based Rating Prediction model (AspeRa) that estimates user rating based on review texts for the items.

Aspect Extraction Prediction

AspeRa: Aspect-based Rating Prediction Model

no code implementations23 Jan 2019 Sergey I. Nikolenko, Elena Tutubalina, Valentin Malykh, Ilya Shenbin, Anton Alekseev

We propose a novel end-to-end Aspect-based Rating Prediction model (AspeRa) that estimates user rating based on review texts for the items and at the same time discovers coherent aspects of reviews that can be used to explain predictions or profile users.

model Prediction +1

Sequence Learning with RNNs for Medical Concept Normalization in User-Generated Texts

no code implementations28 Nov 2018 Elena Tutubalina, Zulfat Miftahutdinov, Sergey Nikolenko, Valentin Malykh

In this work, we consider the medical concept normalization problem, i.e., the problem of mapping a disease mention in free-form text to a concept in a controlled vocabulary, usually to the standard thesaurus in the Unified Medical Language System (UMLS).

Medical Concept Normalization Semantic Similarity +1
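Concept normalization as described above amounts to matching a free-form mention against vocabulary entries and returning the best-scoring concept. A minimal sketch using token-overlap (Jaccard) similarity as a toy stand-in for the paper's learned sequence encoder; the mini-thesaurus below is hypothetical:

```python
def normalize_mention(mention, vocabulary):
    """Map a free-text mention to the vocabulary concept whose preferred
    name has the highest token-overlap (Jaccard) score with the mention."""
    m = set(mention.lower().split())

    def jaccard(name):
        v = set(name.lower().split())
        return len(m & v) / len(m | v)

    return max(vocabulary, key=lambda concept: jaccard(concept[1]))

# Hypothetical UMLS-like entries: (concept_id, preferred name) pairs.
umls_like = [
    ("C0018681", "headache"),
    ("C0027497", "nausea"),
    ("C0015672", "fatigue"),
]

concept_id, name = normalize_mention("really bad headache", umls_like)
```

Surface overlap fails on paraphrases ("pounding head" vs. "headache"), which is exactly why the paper trains a neural encoder instead of relying on string matching.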
