no code implementations • 15 Oct 2023 • Nadezhda Chirkova, Sheng Liang, Vassilina Nikoulina
Zero-shot cross-lingual generation involves finetuning a multilingual pretrained language model (mPLM) on a generation task in one language and then using it to make predictions for the same task in other languages.
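The setup can be sketched as follows with Hugging Face transformers; the mT5 checkpoint, the summarization task, and the toy data are illustrative assumptions, not details fixed by the paper.

```python
# Minimal sketch of zero-shot cross-lingual generation: finetune a
# multilingual seq2seq LM on an English task, then apply it to another
# language with no further training. Model/task choices are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Toy English training pair (real finetuning uses a full dataset).
inputs = tokenizer("summarize: The cat sat on the mat all day.",
                   return_tensors="pt")
labels = tokenizer("A cat rested on a mat.", return_tensors="pt").input_ids

model.train()
optimizer.zero_grad()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Zero-shot transfer: the same task, now with French input.
model.eval()
fr = tokenizer("summarize: Le chat est resté sur le tapis toute la journée.",
               return_tensors="pt")
out = model.generate(**fr, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```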
no code implementations • 1 Aug 2023 • Nadezhda Chirkova, Sergey Troshin
Recent works have widely adopted large language model pretraining for source code, proposed source-code-specific pretraining objectives, and investigated the applicability of various Transformer-based language model architectures to source code.
1 code implementation • 30 Jun 2023 • Nadezhda Chirkova, Germán Kruszewski, Jos Rozen, Marc Dymetman
Autoregressive language models (LMs) map token sequences to probabilities.
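Concretely, an autoregressive LM factorizes p(x) = prod_t p(x_t | x_<t), so the probability of a sequence is the product of per-token conditionals. A minimal scoring sketch, assuming a GPT-2 checkpoint as the LM:

```python
# Sketch: an autoregressive LM assigns a sequence the probability
# p(x) = prod_t p(x_t | x_<t); here we sum per-token log-probs under GPT-2.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tokenizer("Language models assign probabilities.",
                return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits  # (1, seq_len, vocab)

# Log-prob of each token given its prefix (logits are shifted by one).
log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
token_lp = log_probs.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
print("log p(x) =", token_lp.sum().item())
```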
1 code implementation • 12 Dec 2022 • Shamil Ayupov, Nadezhda Chirkova
Pretrained Transformers achieve state-of-the-art performance in various code-processing tasks but may be too large to be deployed.
1 code implementation • 16 Feb 2022 • Sergey Troshin, Nadezhda Chirkova
Deep learning models are widely used for solving challenging code processing tasks, such as code generation or code summarization.
no code implementations • 29 Dec 2021 • Evgeny Bobrov, Sergey Troshin, Nadezhda Chirkova, Ekaterina Lobacheva, Sviatoslav Panchenko, Dmitry Vetrov, Dmitry Kropotov
Channel decoding, channel detection, channel estimation, and resource management for wireless multiple-input multiple-output (MIMO) systems are all examples of problems where machine learning (ML) can be successfully applied.
no code implementations • 21 Jul 2021 • Ildus Sadrtdinov, Nadezhda Chirkova, Ekaterina Lobacheva
Memorization studies of deep neural networks (DNNs) help to understand which patterns DNNs learn and how they learn them, and motivate improvements to DNN training approaches.
1 code implementation • NeurIPS 2021 • Ekaterina Lobacheva, Maxim Kodryan, Nadezhda Chirkova, Andrey Malinin, Dmitry Vetrov
Training neural networks with batch normalization and weight decay has become a common practice in recent years.
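For reference, a minimal sketch of that practice in PyTorch; the architecture and hyperparameters below are placeholders:

```python
# Sketch of the common practice: batch normalization inside the network,
# weight decay in the optimizer. Architecture/hyperparameters are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),          # normalizes activations per channel
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(16, 10),
)
# Weight decay adds an L2 penalty at every update; for weights feeding
# into a BN layer, the weight scale does not change the network's function,
# which is what makes this combination's training dynamics interesting.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
```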
1 code implementation • NAACL 2021 • Nadezhda Chirkova
In this work, we develop dynamic embeddings, a recurrent mechanism that adjusts the learned semantics of a variable as more information about the variable's role in the program becomes available.
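An illustrative sketch of such a recurrent update, assuming a GRU cell and placeholder context encodings; this is not the exact published architecture:

```python
# Illustrative sketch (not the exact published architecture): keep a
# per-variable state and update it recurrently whenever the variable
# is observed in a new context within the program.
import torch
import torch.nn as nn

dim = 64
update_cell = nn.GRUCell(input_size=dim, hidden_size=dim)

var_state = torch.zeros(1, dim)  # initial embedding of a variable
context_encodings = [torch.randn(1, dim) for _ in range(3)]  # placeholders

for ctx in context_encodings:
    # The embedding is refined as the variable's role becomes clearer.
    var_state = update_cell(ctx, var_state)
```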
1 code implementation • NAACL 2021 • Nadezhda Chirkova, Sergey Troshin
There is an emerging interest in the application of natural language processing models to source code processing tasks.
1 code implementation • 15 Oct 2020 • Nadezhda Chirkova, Sergey Troshin
In this work, we conduct a thorough empirical study of the capabilities of Transformers to utilize syntactic information in different tasks.
1 code implementation • NeurIPS 2020 • Ekaterina Lobacheva, Nadezhda Chirkova, Maxim Kodryan, Dmitry Vetrov
Ensembles of deep neural networks are known to achieve state-of-the-art performance in uncertainty estimation and to improve accuracy.
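A generic sketch of the underlying recipe, not this paper's specific method: average the softmax outputs of independently trained members and use the predictive entropy of the averaged distribution as an uncertainty score.

```python
# Generic deep-ensemble sketch (not specific to this paper's method).
import torch
import torch.nn as nn

def ensemble_predict(members, x):
    # Average softmax outputs over independently trained members.
    probs = torch.stack([torch.softmax(m(x), dim=-1) for m in members])
    mean_probs = probs.mean(dim=0)
    # Predictive entropy of the averaged distribution as uncertainty.
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)
    return mean_probs, entropy

members = [nn.Linear(8, 3) for _ in range(5)]  # stand-ins for trained nets
mean_probs, uncertainty = ensemble_predict(members, torch.randn(2, 8))
```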
no code implementations • 14 May 2020 • Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
In this work, we consider a fixed memory budget setting and investigate which is more effective: training a single wide network, or performing a memory split, i.e., training an ensemble of several thinner networks with the same total number of parameters.
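A quick worked example of the budget constraint, with made-up layer sizes: for an MLP with two hidden layers, the parameter count is not linear in the width, so the number of thinner members that fit the wide network's budget has to be computed rather than guessed.

```python
# Worked example with made-up sizes: one wide two-hidden-layer MLP vs.
# an ensemble of thinner MLPs under the same parameter budget.
def mlp_params(d_in, h, d_out):
    # layers d_in->h, h->h, h->d_out, with biases
    return (d_in * h + h) + (h * h + h) + (h * d_out + d_out)

wide = mlp_params(784, 512, 10)  # 669,706 params
thin = mlp_params(784, 256, 10)  # 269,322 params: halving the width does
                                 # not halve the count (the h*h term is quadratic)
k = wide // thin                 # members fitting the budget: k = 2
print(wide, thin, k)
```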
no code implementations • 13 Nov 2019 • Ekaterina Lobacheva, Nadezhda Chirkova, Alexander Markovich, Dmitry Vetrov
Recently, many techniques have been developed to sparsify the weights of neural networks and to remove their structural units, e.g. neurons.
1 code implementation • NIPS Workshop CDNNRIA 2018 • Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov
Bayesian methods have been successfully applied to sparsify the weights of neural networks and to remove structural units, e.g. neurons, from the networks.
3 code implementations • EMNLP 2018 • Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov
In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters.
2 code implementations • 31 Jul 2017 • Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov
Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights.