Search Results for author: Elena Tutubalina

Found 55 papers, 27 papers with code

Entity Linking over Nested Named Entities for Russian

1 code implementation LREC 2022 Natalia Loukachevitch, Pavel Braslavski, Vladimir Ivanov, Tatiana Batura, Suresh Manandhar, Artem Shelmanov, Elena Tutubalina

In this paper, we describe entity linking annotation over nested named entities in the recently released Russian NEREL dataset for information extraction.

Entity Linking

Overview of the Seventh Social Media Mining for Health Applications (#SMM4H) Shared Tasks at COLING 2022

no code implementations SMM4H (COLING) 2022 Davy Weissenbacher, Juan Banda, Vera Davydova, Darryl Estrada Zavala, Luis Gasco Sánchez, Yao Ge, Yuting Guo, Ari Klein, Martin Krallinger, Mathias Leddin, Arjun Magge, Raul Rodriguez-Esteban, Abeed Sarker, Lucia Schmidt, Elena Tutubalina, Graciela Gonzalez-Hernandez

For the past seven years, the Social Media Mining for Health Applications (#SMM4H) shared tasks have promoted the community-driven development and evaluation of advanced natural language processing systems to detect, extract, and normalize health-related information in public, user-generated content.

KFU NLP Team at SMM4H 2020 Tasks: Cross-lingual Transfer Learning with Pretrained Language Models for Drug Reactions

1 code implementation SMM4H (COLING) 2020 Zulfat Miftahutdinov, Andrey Sakhovskiy, Elena Tutubalina

The BERT-based multilingual model for classification of English and Russian tweets that report adverse reactions ranked second among 16 and 7 teams in the first two subtasks of the SMM4H 2019 Task 2 and obtained a relaxed F1 of 58% on English tweets and 51% on Russian tweets.

Cross-Lingual Transfer Task 2 +1

Cross-lingual Transfer Learning for Semantic Role Labeling in Russian

no code implementations CLIB 2020 Ilseyar Alimova, Elena Tutubalina, Alexander Kirillovich

As source data for transfer learning, we experimented with the full version of FrameNet and the reduced dataset with a smaller number of semantic roles identical to FrameBank.

Cross-Lingual Transfer Semantic Role Labeling +2

Bridging the Gap Between Open-Source and Proprietary LLMs in Table QA

1 code implementation 11 Jun 2025 Nikolas Evkarpidi, Elena Tutubalina

This paper presents a system developed for SemEval 2025 Task 8: Question Answering (QA) over tabular data.

Code Generation Language Modeling +7

Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models

no code implementations 7 Jun 2025 Mikhail Salnikov, Dmitrii Korzh, Ivan Lazichny, Elvir Karimov, Artyom Iudin, Ivan Oseledets, Oleg Y. Rogov, Natalia Loukachevitch, Alexander Panchenko, Elena Tutubalina

This paper evaluates geopolitical biases in LLMs with respect to various countries through an analysis of their interpretation of historical events with conflicting national perspectives (USA, UK, USSR, and China).

One Task Vector is not Enough: A Large-Scale Study for In-Context Learning

no code implementations 29 May 2025 Pavel Tikhonov, Ivan Oseledets, Elena Tutubalina

In-context learning (ICL) enables Large Language Models (LLMs) to adapt to new tasks using few examples, with task vectors - specific hidden state activations - hypothesized to encode task information.

In-Context Learning

Prompt to Polyp: Medical Text-Conditioned Image Synthesis with Diffusion Models

1 code implementation 8 May 2025 Mikhail Chaichuk, Sushant Gautam, Steven Hicks, Elena Tutubalina

The generation of realistic medical images from text descriptions has significant potential to address data scarcity challenges in healthcare AI while preserving patient privacy.

Image Generation

RuCCoD: Towards Automated ICD Coding in Russian

1 code implementation 28 Feb 2025 Aleksandr Nesterov, Andrey Sakhovskiy, Ivan Sviridov, Airat Valiev, Vladimir Makharev, Petr Anokhin, Galina Zubkova, Elena Tutubalina

This study investigates the feasibility of automating clinical coding in Russian, a language with limited biomedical resources.

Medical Diagnosis RAG +1

Confidence Estimation for Error Detection in Text-to-SQL Systems

1 code implementation 16 Jan 2025 Oleg Somov, Elena Tutubalina

Text-to-SQL enables users to interact with databases through natural language, simplifying the retrieval and synthesis of information.

Decoder In-Context Learning +2

CLEAR: Character Unlearning in Textual and Visual Modalities

1 code implementation 23 Oct 2024 Alexey Dontsov, Dmitrii Korzh, Alexey Zhavoronkin, Boris Mikheev, Denis Bobkov, Aibek Alanov, Oleg Y. Rogov, Ivan Oseledets, Elena Tutubalina

Machine Unlearning (MU) is critical for enhancing privacy and security in deep learning models, particularly in large multimodal language models (MLLMs), by removing specific private or hazardous information.

Machine Unlearning

nach0: Multimodal Natural and Chemical Languages Foundation Model

1 code implementation 21 Nov 2023 Micha Livne, Zulfat Miftahutdinov, Elena Tutubalina, Maksim Kuznetsov, Daniil Polykovskiy, Annika Brundyn, Aastha Jhunjhunwala, Anthony Costa, Alex Aliper, Alán Aspuru-Guzik, Alex Zhavoronkov

Large Language Models (LLMs) have substantially driven scientific progress in various domains, and many papers have demonstrated their ability to tackle complex problems with creative solutions.

Decoder model +3

Data and models for stance and premise detection in COVID-19 tweets: insights from the Social Media Mining for Health (SMM4H) 2022 shared task

1 code implementation 14 Nov 2023 Vera Davydova, Huabin Yang, Elena Tutubalina

The COVID-19 pandemic has sparked numerous discussions on social media platforms, with users sharing their views on topics such as mask-wearing and vaccination.

Argument Mining Stance Detection +1

Gradual Optimization Learning for Conformational Energy Minimization

1 code implementation 5 Nov 2023 Artem Tsypin, Leonid Ugadiarov, Kuzma Khrabrov, Alexander Telepov, Egor Rumiantsev, Alexey Skrynnik, Aleksandr I. Panov, Dmitry Vetrov, Elena Tutubalina, Artur Kadurin

Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules using 50x less additional data.

Drug Discovery

Multimodal Model with Text and Drug Embeddings for Adverse Drug Reaction Classification

1 code implementation 21 Oct 2022 Andrey Sakhovskiy, Elena Tutubalina

These components are state-of-the-art BERT-based models for language understanding and molecular property prediction.

Classification Molecular Property Prediction +1

Vote'n'Rank: Revision of Benchmarking with Social Choice Theory

1 code implementation 11 Oct 2022 Mark Rofin, Vladislav Mikhailov, Mikhail Florinskiy, Andrey Kravchenko, Elena Tutubalina, Tatiana Shavrina, Daniel Karabekyan, Ekaterina Artemova

The development of state-of-the-art systems in different applied areas of machine learning (ML) is driven by benchmarks, which have shaped the paradigm of evaluating generalisation capabilities from multiple perspectives.

Benchmarking Result aggregation +1

DetIE: Multilingual Open Information Extraction Inspired by Object Detection

1 code implementation 24 Jun 2022 Michael Vasilkovsky, Anton Alekseev, Valentin Malykh, Ilya Shenbin, Elena Tutubalina, Dmitriy Salikhov, Mikhail Stepnov, Andrey Chertok, Sergey Nikolenko

Our model sets the new state-of-the-art performance of 67.7% F1 on CaRB evaluated as OIE2016 while being 3.35x faster at inference than the previous state of the art.

Multilingual NLP Object +2

Near-Zero-Shot Suggestion Mining with a Little Help from WordNet

no code implementations 25 Nov 2021 Anton Alekseev, Elena Tutubalina, Sejeong Kwon, Sergey Nikolenko

In this work, we explore the constructive side of online reviews: advice, tips, requests, and suggestions that users provide about goods, venues, services, and other items of interest.

Suggestion mining

Selection of pseudo-annotated data for adverse drug reaction classification across drug groups

no code implementations 24 Nov 2021 Ilseyar Alimova, Elena Tutubalina

Automatic monitoring of adverse drug events (ADEs) or reactions (ADRs) is currently receiving significant attention from the biomedical community.

Text Classification

Many Heads but One Brain: Fusion Brain -- a Competition and a Single Multimodal Multitask Architecture

1 code implementation 22 Nov 2021 Daria Bakshandaeva, Denis Dimitrov, Vladimir Arkhipkin, Alex Shonenkov, Mark Potanin, Denis Karachev, Andrey Kuznetsov, Anton Voronov, Vera Davydova, Elena Tutubalina, Aleksandr Petiushko

Supporting the current trend in the AI community, we present the AI Journey 2021 Challenge called Fusion Brain, the first competition aimed at building a universal architecture that can process different modalities (in this case, images, texts, and code) and solve multiple tasks for vision and language.

Handwritten Text Recognition object-detection +4

Drug and Disease Interpretation Learning with Biomedical Entity Representation Transformer

1 code implementation 22 Jan 2021 Zulfat Miftahutdinov, Artur Kadurin, Roman Kudrin, Elena Tutubalina

We investigate the effectiveness of transferring concept normalization from the general biomedical domain to the clinical trials domain in a zero-shot setting with an absence of labeled data.

Drug Discovery Metric Learning +2

Fair Evaluation in Concept Normalization: a Large-scale Comparative Analysis for BERT-based Models

1 code implementation COLING 2020 Elena Tutubalina, Artur Kadurin, Zulfat Miftahutdinov

Linking of biomedical entity mentions to various terminologies of chemicals, diseases, genes, and adverse drug reactions is a challenging task, often requiring non-syntactic interpretation.

RuREBus: a Case Study of Joint Named Entity Recognition and Relation Extraction from e-Government Domain

no code implementations 29 Oct 2020 Vitaly Ivanin, Ekaterina Artemova, Tatiana Batura, Vladimir Ivanov, Veronika Sarkisyan, Elena Tutubalina, Ivan Smurov

We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency.

Named Entity Recognition +4

Improving unsupervised neural aspect extraction for online discussions using out-of-domain classification

no code implementations 17 Jun 2020 Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey Nikolenko

Deep learning architectures based on self-attention have recently achieved and surpassed state of the art results in the task of unsupervised aspect extraction and topic modeling.

Articles Aspect Extraction +3

A large-scale COVID-19 Twitter chatter dataset for open scientific research -- an international collaboration

1 code implementation 7 Apr 2020 Juan M. Banda, Ramya Tekumalla, Guanyu Wang, Jingyuan Yu, Tuo Liu, Yuning Ding, Katya Artemova, Elena Tutubalina, Gerardo Chowell

As the COVID-19 pandemic continues its march around the world, an unprecedented amount of open data is being generated for genetics and epidemiological research.

Misinformation

RecVAE: a New Variational Autoencoder for Top-N Recommendations with Implicit Feedback

3 code implementations 24 Dec 2019 Ilya Shenbin, Anton Alekseev, Elena Tutubalina, Valentin Malykh, Sergey I. Nikolenko

Recent research has shown the advantages of using autoencoders based on deep neural networks for collaborative filtering.

 Ranked #1 on Recommendation Systems on MovieLens 20M (Recall@50 metric)

Collaborative Filtering

Distant Supervision for Sentiment Attitude Extraction

no code implementations RANLP 2019 Nicolay Rusnachenko, Natalia Loukachevitch, Elena Tutubalina

News articles often convey attitudes between the mentioned subjects, which is essential for understanding the described situation.

Articles

CommentsRadar: Dive into Unique Data on All Comments on the Web

no code implementations 16 Aug 2019 Sergey Nikolenko, Elena Tutubalina, Zulfat Miftahutdinov, Eugene Beloded

We introduce an entity-centric search engine, CommentsRadar, that pairs entity queries with articles and user opinions covering a wide range of topics from top commented sites.

All Articles

AspeRa: Aspect-Based Rating Prediction Based on User Reviews

no code implementations WS 2019 Elena Tutubalina, Valentin Malykh, Sergey Nikolenko, Anton Alekseev, Ilya Shenbin

We propose a novel Aspect-based Rating Prediction model (AspeRa) that estimates user rating based on review texts for the items.

Aspect Extraction Prediction

Entity-level Classification of Adverse Drug Reactions: a Comparison of Neural Network Models

no code implementations WS 2019 Ilseyar Alimova, Elena Tutubalina

This paper presents our experimental work on exploring the potential of neural network models developed for aspect-based sentiment analysis for entity-level adverse drug reaction (ADR) classification.

Articles Aspect-Based Sentiment Analysis +2

Deep Neural Models for Medical Concept Normalization in User-Generated Texts

no code implementations ACL 2019 Zulfat Miftahutdinov, Elena Tutubalina

In this work, we consider the medical concept normalization problem, i.e., the problem of mapping a health-related entity mention in a free-form text to a concept in a controlled vocabulary, usually to the standard thesaurus in the Unified Medical Language System (UMLS).

Form Medical Concept Normalization

AspeRa: Aspect-based Rating Prediction Model

no code implementations 23 Jan 2019 Sergey I. Nikolenko, Elena Tutubalina, Valentin Malykh, Ilya Shenbin, Anton Alekseev

We propose a novel end-to-end Aspect-based Rating Prediction model (AspeRa) that estimates user rating based on review texts for the items and at the same time discovers coherent aspects of reviews that can be used to explain predictions or profile users.

model Prediction +1

Sequence Learning with RNNs for Medical Concept Normalization in User-Generated Texts

no code implementations 28 Nov 2018 Elena Tutubalina, Zulfat Miftahutdinov, Sergey Nikolenko, Valentin Malykh

In this work, we consider the medical concept normalization problem, i.e., the problem of mapping a disease mention in free-form text to a concept in a controlled vocabulary, usually to the standard thesaurus in the Unified Medical Language System (UMLS).

Medical Concept Normalization Semantic Similarity +1

An Encoder-Decoder Model for ICD-10 Coding of Death Certificates

no code implementations 4 Dec 2017 Elena Tutubalina, Zulfat Miftahutdinov

Information extraction from textual documents such as hospital records and health-related user discussions has become a topic of intense interest.

Decoder Translation
