no code implementations • 27 Jan 2025 • Nadezhda Chirkova, Thibault Formal, Vassilina Nikoulina, Stéphane Clinchant
In this work, we close this gap and introduce Provence (Pruning and Reranking Of retrieVEd relevaNt ContExts), an efficient and robust context pruner for Question Answering, which dynamically detects the needed amount of pruning for a given context and can be used out-of-the-box for various domains.
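The idea of query-conditioned context pruning can be illustrated with a minimal sketch (a toy stand-in, not the Provence model): score each sentence of a retrieved context against the question and keep only sentences above a relevance threshold, so the amount of pruning adapts to each context. The lexical-overlap scorer below is a placeholder for a trained relevance model.

```python
# Toy sketch of dynamic context pruning (illustrative only, NOT Provence):
# keep only sentences whose relevance to the question exceeds a threshold.

def relevance(question: str, sentence: str) -> float:
    """Toy lexical-overlap score; a real pruner would use a trained model."""
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / max(len(q), 1)

def prune_context(question: str, context: str, threshold: float = 0.2) -> str:
    sentences = [s.strip() for s in context.split(".") if s.strip()]
    kept = [s for s in sentences if relevance(question, s) >= threshold]
    return ". ".join(kept) + ("." if kept else "")

question = "Where was Marie Curie born?"
context = ("Marie Curie was born in Warsaw. She studied physics in Paris. "
           "The weather in Paris is mild.")
print(prune_context(question, context))  # keeps only the relevant sentence(s)
```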
1 code implementation • 1 Jul 2024 • David Rau, Hervé Déjean, Nadezhda Chirkova, Thibault Formal, Shuai Wang, Vassilina Nikoulina, Stéphane Clinchant
In response to the recent popularity of generative LLMs, many RAG approaches have been proposed, which involve an intricate set of configurations such as evaluation datasets, collections, metrics, retrievers, and LLMs.
no code implementations • 1 Jul 2024 • Nadezhda Chirkova, Vassilina Nikoulina, Jean-Luc Meunier, Alexandre Bérard
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
1 code implementation • 1 Jul 2024 • Nadezhda Chirkova, David Rau, Hervé Déjean, Thibault Formal, Stéphane Clinchant, Vassilina Nikoulina
Retrieval-augmented generation (RAG) has recently emerged as a promising solution for incorporating up-to-date or domain-specific knowledge into large language models (LLMs) and improving LLM factuality, but is predominantly studied in English-only settings.
no code implementations • 22 Feb 2024 • Nadezhda Chirkova, Vassilina Nikoulina
Instruction tuning (IT) is widely used to teach pretrained large language models (LLMs) to follow arbitrary instructions, but is under-studied in multilingual settings.
no code implementations • 19 Feb 2024 • Nadezhda Chirkova, Vassilina Nikoulina
Previous works note a frequent problem of generation in the wrong language and propose approaches to address it, usually using mT5 as a backbone model.
no code implementations • 15 Oct 2023 • Nadezhda Chirkova, Sheng Liang, Vassilina Nikoulina
Zero-shot cross-lingual knowledge transfer enables a multilingual pretrained language model (mPLM), finetuned on a task in one language, to make predictions for this task in other languages.
no code implementations • 1 Aug 2023 • Nadezhda Chirkova, Sergey Troshin
Recent works have widely adopted large language model pretraining for source code, proposed source-code-specific pretraining objectives, and investigated the applicability of various Transformer-based language model architectures to source code.
1 code implementation • 30 Jun 2023 • Nadezhda Chirkova, Germán Kruszewski, Jos Rozen, Marc Dymetman
Autoregressive language models (LMs) map token sequences to probabilities.
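For context, the autoregressive factorization referred to here computes a sequence probability as a product of next-token conditionals, log p(x) = sum_t log p(x_t | x_<t). A minimal sketch, using a hypothetical `next_token_probs` function as the model interface:

```python
import math

def sequence_log_prob(tokens, next_token_probs):
    """Log-probability of a token sequence under an autoregressive LM:
    log p(x) = sum_t log p(x_t | x_<t).
    `next_token_probs(prefix)` is assumed to return a dict mapping each
    candidate token to its conditional probability given the prefix."""
    log_p = 0.0
    for t, token in enumerate(tokens):
        probs = next_token_probs(tokens[:t])
        log_p += math.log(probs[token])
    return log_p

# Toy uniform model over a 3-token vocabulary, for illustration only.
vocab = ["a", "b", "c"]
uniform = lambda prefix: {tok: 1.0 / len(vocab) for tok in vocab}
print(sequence_log_prob(["a", "b", "a"], uniform))  # 3 * log(1/3)
```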
1 code implementation • 12 Dec 2022 • Shamil Ayupov, Nadezhda Chirkova
Pretrained Transformers achieve state-of-the-art performance in various code-processing tasks but may be too large to be deployed.
1 code implementation • 16 Feb 2022 • Sergey Troshin, Nadezhda Chirkova
Deep learning models are widely used for solving challenging code processing tasks, such as code generation or code summarization.
no code implementations • 29 Dec 2021 • Evgeny Bobrov, Sergey Troshin, Nadezhda Chirkova, Ekaterina Lobacheva, Sviatoslav Panchenko, Dmitry Vetrov, Dmitry Kropotov
Channel decoding, channel detection, channel assessment, and resource management for wireless multiple-input multiple-output (MIMO) systems are all examples of problems where machine learning (ML) can be successfully applied.
no code implementations • 21 Jul 2021 • Ildus Sadrtdinov, Nadezhda Chirkova, Ekaterina Lobacheva
Memorization studies of deep neural networks (DNNs) help to understand what patterns DNNs learn and how they learn them, and motivate improvements to DNN training approaches.
1 code implementation • NeurIPS 2021 • Ekaterina Lobacheva, Maxim Kodryan, Nadezhda Chirkova, Andrey Malinin, Dmitry Vetrov
Training neural networks with batch normalization and weight decay has become a common practice in recent years.
1 code implementation • NAACL 2021 • Nadezhda Chirkova, Sergey Troshin
There is an emerging interest in the application of natural language processing models to source code processing tasks.
1 code implementation • NAACL 2021 • Nadezhda Chirkova
In this work, we develop dynamic embeddings, a recurrent mechanism that adjusts the learned semantics of a variable as the model obtains more information about the variable's role in the program.
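A minimal, hypothetical sketch of such a recurrent embedding update (illustrative only; the paper's mechanism may differ): each variable keeps its own embedding, refreshed by a GRU cell every time the variable is encountered, using the model's current hidden state.

```python
import torch
import torch.nn as nn

class DynamicVariableEmbeddings(nn.Module):
    """Per-variable embeddings updated recurrently as the program is read."""
    def __init__(self, hidden_dim: int, emb_dim: int):
        super().__init__()
        self.update = nn.GRUCell(input_size=hidden_dim, hidden_size=emb_dim)
        self.init_emb = nn.Parameter(torch.zeros(emb_dim))

    def forward(self, occurrences, hidden_states):
        """occurrences: list of variable ids, one per time step.
        hidden_states: (T, hidden_dim) tensor of encoder states."""
        embs = {}
        for t, var_id in enumerate(occurrences):
            prev = embs.get(var_id, self.init_emb).unsqueeze(0)
            new = self.update(hidden_states[t].unsqueeze(0), prev)
            embs[var_id] = new.squeeze(0)  # semantics refined with more context
        return embs
```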
1 code implementation • 15 Oct 2020 • Nadezhda Chirkova, Sergey Troshin
In this work, we conduct a thorough empirical study of the capabilities of Transformers to utilize syntactic information in different tasks.
1 code implementation • NeurIPS 2020 • Ekaterina Lobacheva, Nadezhda Chirkova, Maxim Kodryan, Dmitry Vetrov
Ensembles of deep neural networks are known to achieve state-of-the-art performance in uncertainty estimation and lead to accuracy improvement.
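As a generic reference point (not the paper's specific method), ensemble prediction typically averages the member models' predictive distributions, and the entropy of the average serves as a simple uncertainty proxy:

```python
import numpy as np

def ensemble_predict(prob_list):
    """prob_list: list of (N, C) arrays of per-model class probabilities.
    Returns the averaged predictive distribution and its entropy,
    a common uncertainty proxy for deep ensembles."""
    mean_probs = np.mean(np.stack(prob_list), axis=0)                   # (N, C)
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)  # (N,)
    return mean_probs, entropy

# Three toy "models" that disagree on the second of two examples:
p1 = np.array([[0.90, 0.10], [0.6, 0.4]])
p2 = np.array([[0.85, 0.15], [0.3, 0.7]])
p3 = np.array([[0.95, 0.05], [0.5, 0.5]])
probs, unc = ensemble_predict([p1, p2, p3])
print(unc)  # higher entropy on the contested second example
```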
no code implementations • 14 May 2020 • Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
In this work, we consider a fixed memory budget setting and investigate what is more effective: to train a single wide network, or to perform a memory split -- to train an ensemble of several thinner networks with the same total number of parameters?
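To make the fixed memory budget concrete, here is a small illustrative calculation (assumed MLP sizes, not the paper's architectures) of how many thinner networks fit into the budget of one wide network:

```python
def mlp_params(width: int, depth: int, in_dim: int = 784, out_dim: int = 10) -> int:
    """Rough parameter count of an MLP with `depth` hidden layers (weights + biases)."""
    sizes = [in_dim] + [width] * depth + [out_dim]
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))

wide = mlp_params(width=1024, depth=3)   # one wide network
thin = mlp_params(width=512, depth=3)    # one thinner network
print(wide, thin, wide // thin)          # how many thin nets fit in the same budget
# Here the memory of a single width-1024 net covers roughly three width-512 nets,
# so the question is whether that 3-network ensemble beats the single wide model.
```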
no code implementations • 13 Nov 2019 • Ekaterina Lobacheva, Nadezhda Chirkova, Alexander Markovich, Dmitry Vetrov
Recently, many techniques have been developed to sparsify the weights of neural networks and to remove networks' structural units, e.g. neurons.
1 code implementation • NIPS Workshop CDNNRIA 2018 • Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov
Bayesian methods have been successfully applied to sparsify weights of neural networks and to remove structural units from the networks, e.g. neurons.
3 code implementations • EMNLP 2018 • Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters.
2 code implementations • 31 Jul 2017 • Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov
Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights.