Search Results for author: Nadezhda Chirkova

Found 19 papers, 11 papers with code

Zero-shot cross-lingual transfer in instruction tuning of large language model

no code implementations • 22 Feb 2024 • Nadezhda Chirkova, Vassilina Nikoulina

Instruction tuning (IT) is widely used to teach pretrained large language models (LLMs) to follow arbitrary instructions, but is under-studied in multilingual settings.

Instruction Following • Language Modelling • +2

Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks

no code implementations • 19 Feb 2024 • Nadezhda Chirkova, Vassilina Nikoulina

Previous works have noted the frequent problem of generation in the wrong language and proposed approaches to address it, usually using mT5 as a backbone model.

Language Modelling • Transfer Learning

Empirical study of pretrained multilingual language models for zero-shot cross-lingual generation

no code implementations • 15 Oct 2023 • Nadezhda Chirkova, Sheng Liang, Vassilina Nikoulina

Zero-shot cross-lingual generation assumes finetuning the multilingual pretrained language model (mPLM) on a generation task in one language and then using it to make predictions for this task in other languages.

Language Modelling • Pretrained Multilingual Language Models
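
As a hedged illustration of the setup described in this abstract, the sketch below finetunes an mPLM (mT5, via the HuggingFace transformers library) on a single toy English example of a generation task and then applies it zero-shot to a French input. The checkpoint name is real; the task data and hyperparameters are illustrative placeholders, not the paper's experimental setup.

```python
# Sketch: finetune mT5 on a generation task in one language, then use it
# zero-shot in another language. Toy data, one gradient step only.
import torch
from transformers import AutoTokenizer, MT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One finetuning step on an English (source-language) example.
inputs = tokenizer("summarize: The cat sat on the mat all day.", return_tensors="pt")
labels = tokenizer("A cat rested on a mat.", return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Zero-shot application of the finetuned model to another language (French).
fr_inputs = tokenizer("summarize: Le chat est resté sur le tapis toute la journée.",
                      return_tensors="pt")
output_ids = model.generate(**fr_inputs, max_new_tokens=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```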

CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code

no code implementations • 1 Aug 2023 • Nadezhda Chirkova, Sergey Troshin

Recent works have widely adopted large language model pretraining for source code, suggested source code-specific pretraining objectives and investigated the applicability of various Transformer-based language model architectures for source code.

Language Modelling • Large Language Model
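
One subtokenization option in this space is training a byte-level BPE vocabulary directly on source code. The sketch below does this with the HuggingFace tokenizers library on a tiny in-memory corpus; the corpus and vocabulary size are illustrative assumptions, not the paper's configuration.

```python
# Sketch: train a byte-level BPE subtokenizer on a toy source-code corpus.
from tokenizers import ByteLevelBPETokenizer

code_corpus = [
    "def add(a, b):\n    return a + b",
    "for i in range(10):\n    print(i)",
    "class Point:\n    def __init__(self, x, y):\n        self.x, self.y = x, y",
]

tokenizer = ByteLevelBPETokenizer()
tokenizer.train_from_iterator(code_corpus, vocab_size=500, min_frequency=1)

# Inspect how an unseen snippet is split into subtokens.
print(tokenizer.encode("def mul(a, b):\n    return a * b").tokens)
```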

Should you marginalize over possible tokenizations?

1 code implementation • 30 Jun 2023 • Nadezhda Chirkova, Germán Kruszewski, Jos Rozen, Marc Dymetman

Autoregressive language models (LMs) map token sequences to probabilities.
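
The question hinges on the fact that a string usually admits several tokenizations, so its true probability under the LM is a sum over all of them, while standard practice scores only the canonical tokenization. The toy sketch below makes this concrete with a hypothetical unigram "LM" and exhaustive enumeration; it is not the paper's estimation algorithm.

```python
# Toy illustration: probability of the canonical tokenization vs. the
# marginal probability summed over all tokenizations of the same string.
vocab_prob = {"a": 0.2, "b": 0.3, "ab": 0.4, "</s>": 0.1}  # hypothetical unigram LM

def seq_prob(tokens):
    # Probability of a token sequence (plus end-of-sequence) under the toy LM.
    p = 1.0
    for t in tokens + ["</s>"]:
        p *= vocab_prob[t]
    return p

def tokenizations(s):
    # Enumerate all segmentations of s into vocabulary tokens (exponential; toy only).
    if not s:
        yield []
        return
    for i in range(1, len(s) + 1):
        if s[:i] in vocab_prob:
            for rest in tokenizations(s[i:]):
                yield [s[:i]] + rest

s = "ab"
canonical = ["ab"]                          # e.g. what a greedy/BPE tokenizer returns
marginal = sum(seq_prob(t) for t in tokenizations(s))
print("canonical:", seq_prob(canonical))    # 0.4 * 0.1 = 0.04
print("marginal: ", marginal)               # 0.04 + 0.2 * 0.3 * 0.1 = 0.046
```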

Parameter-Efficient Finetuning of Transformers for Source Code

1 code implementation • 12 Dec 2022 • Shamil Ayupov, Nadezhda Chirkova

Pretrained Transformers achieve state-of-the-art performance in various code-processing tasks but may be too large to be deployed.
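
A common parameter-efficient alternative to full finetuning is to freeze the pretrained backbone and train only small inserted modules. The sketch below shows a minimal bottleneck adapter in PyTorch; the dimensions are illustrative, and this is a generic adapter, not necessarily the exact variant evaluated in the paper.

```python
# Sketch: a bottleneck adapter with a residual connection; only the adapter
# would be trained while the pretrained Transformer weights stay frozen.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 32):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))  # residual bottleneck

hidden = torch.randn(2, 16, 768)     # (batch, seq_len, hidden_dim) from a frozen backbone
adapter = Adapter(hidden_dim=768)
out = adapter(hidden)

trainable = sum(p.numel() for p in adapter.parameters())
print(f"trainable adapter parameters: {trainable}")  # ~50k vs. millions in the backbone
```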

Probing Pretrained Models of Source Code

1 code implementation • 16 Feb 2022 • Sergey Troshin, Nadezhda Chirkova

Deep learning models are widely used for solving challenging code processing tasks, such as code generation or code summarization.

Classification • regression • +1
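
Probing in this sense usually means freezing the pretrained model and fitting a simple classifier on its representations to test for a property of interest. The sketch below follows that generic recipe with scikit-learn; the embeddings and labels are random stand-ins for features extracted from a real pretrained code model.

```python
# Sketch: fit a linear probe on frozen representations to predict a code property.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 768))   # stand-in for frozen model representations
labels = rng.integers(0, 2, size=1000)      # e.g. "is this token an identifier?"

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, labels, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))  # ~0.5 here, since labels are random
```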

Machine Learning Methods for Spectral Efficiency Prediction in Massive MIMO Systems

no code implementations • 29 Dec 2021 • Evgeny Bobrov, Sergey Troshin, Nadezhda Chirkova, Ekaterina Lobacheva, Sviatoslav Panchenko, Dmitry Vetrov, Dmitry Kropotov

Channel decoding, channel detection, channel assessment, and resource management for wireless multiple-input multiple-output (MIMO) systems are all examples of problems where machine learning (ML) can be successfully applied.

BIG-bench Machine Learning • Management

On the Memorization Properties of Contrastive Learning

no code implementations • 21 Jul 2021 • Ildus Sadrtdinov, Nadezhda Chirkova, Ekaterina Lobacheva

Memorization studies of deep neural networks (DNNs) help to understand what patterns DNNs learn and how they learn them, and motivate improvements to DNN training approaches.

Contrastive Learning • Memorization • +1

On the Embeddings of Variables in Recurrent Neural Networks for Source Code

1 code implementation • NAACL 2021 • Nadezhda Chirkova

In this work, we develop dynamic embeddings, a recurrent mechanism that adjusts the learned semantics of the variable when it obtains more information about the variable's role in the program.

Bug fixing • Code Completion • +1
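
The sketch below conveys the general idea of a recurrently updated variable embedding: the vector for a variable is refined each time the variable is seen in a new context. It is an illustrative GRU-based update in PyTorch, not the authors' exact mechanism.

```python
# Sketch: a variable's embedding is updated recurrently as more of its
# occurrences (and their contexts) are observed in the program.
import torch
import torch.nn as nn

class DynamicVariableEmbedding(nn.Module):
    def __init__(self, context_dim: int, embed_dim: int):
        super().__init__()
        self.init_embed = nn.Parameter(torch.zeros(embed_dim))
        self.update = nn.GRUCell(context_dim, embed_dim)

    def forward(self, contexts: torch.Tensor) -> torch.Tensor:
        # contexts: (num_occurrences, context_dim) for one variable, in program order
        state = self.init_embed.unsqueeze(0)              # (1, embed_dim)
        for ctx in contexts:
            state = self.update(ctx.unsqueeze(0), state)  # refine after each occurrence
        return state.squeeze(0)                           # final embedding of the variable

occurrences = torch.randn(5, 64)   # 5 usages of a variable, each with a 64-d context vector
emb = DynamicVariableEmbedding(context_dim=64, embed_dim=128)(occurrences)
print(emb.shape)                   # torch.Size([128])
```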

Empirical Study of Transformers for Source Code

1 code implementation • 15 Oct 2020 • Nadezhda Chirkova, Sergey Troshin

In this work, we conduct a thorough empirical study of the capabilities of Transformers to utilize syntactic information in different tasks.

Bug fixing • Code Completion

On Power Laws in Deep Ensembles

1 code implementation • NeurIPS 2020 • Ekaterina Lobacheva, Nadezhda Chirkova, Maxim Kodryan, Dmitry Vetrov

Ensembles of deep neural networks are known to achieve state-of-the-art performance in uncertainty estimation and lead to accuracy improvement.

Deep Ensembles on a Fixed Memory Budget: One Wide Network or Several Thinner Ones?

no code implementations • 14 May 2020 • Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov

In this work, we consider a fixed memory budget setting and investigate what is more effective: to train a single wide network, or to perform a memory split -- to train an ensemble of several thinner networks with the same total number of parameters?
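
A quick back-of-the-envelope comparison clarifies what a memory split means: under a fixed parameter budget, one wide network is traded for several thinner ones. The sketch below counts parameters for illustrative MLP widths; the architectures and sizes are assumptions, not the paper's models.

```python
# Sketch: compare the parameter count of one wide MLP against an ensemble of
# thinner MLPs whose total budget roughly matches the wide network.
import torch.nn as nn

def mlp(width: int) -> nn.Module:
    return nn.Sequential(nn.Linear(784, width), nn.ReLU(),
                         nn.Linear(width, width), nn.ReLU(),
                         nn.Linear(width, 10))

def n_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

wide = mlp(width=1024)
# Width 392 chosen so that four thin networks roughly match the wide net's budget.
thin_ensemble = [mlp(width=392) for _ in range(4)]

print("single wide net:", n_params(wide))
print("4-net ensemble: ", sum(n_params(m) for m in thin_ensemble))
```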

Structured Sparsification of Gated Recurrent Neural Networks

no code implementations • 13 Nov 2019 • Ekaterina Lobacheva, Nadezhda Chirkova, Alexander Markovich, Dmitry Vetrov

Recently, many techniques have been developed to sparsify the weights of neural networks and to remove structural units of the networks, e.g. neurons.

Language Modelling • text-classification • +1
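
For context, the sketch below is a deliberately simplified, non-Bayesian stand-in for structured sparsification of a gated RNN: whole GRU hidden units are zeroed out, ranked by a plain weight-norm heuristic. The papers above use Bayesian sparsification; the magnitude-based ranking and the layer sizes here are illustrative assumptions only.

```python
# Sketch: remove (zero out) whole GRU hidden units, chosen by a weight-norm heuristic.
import torch
import torch.nn as nn

hidden_size, prune_k = 64, 16
gru = nn.GRU(input_size=32, hidden_size=hidden_size, batch_first=True)

with torch.no_grad():
    w_ih, w_hh = gru.weight_ih_l0, gru.weight_hh_l0   # shapes: (3H, input), (3H, H)
    # Rows of the r, z, n gates that produce hidden unit h.
    rows = lambda h: [h, h + hidden_size, h + 2 * hidden_size]
    # Importance of unit h: total norm of the weights feeding its gates.
    importance = torch.stack([
        w_ih[rows(h)].norm() + w_hh[rows(h)].norm() for h in range(hidden_size)
    ])
    pruned_units = importance.argsort()[:prune_k]      # the weakest units

    for h in pruned_units.tolist():
        w_ih[rows(h)] = 0.0   # unit h no longer reads the input
        w_hh[rows(h)] = 0.0   # ... nor the previous hidden state
        w_hh[:, h] = 0.0      # other units no longer read unit h
        # In full structured pruning, downstream layers reading unit h would also shrink.

print(f"zeroed {prune_k}/{hidden_size} hidden units")
```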

Bayesian Sparsification of Gated Recurrent Neural Networks

1 code implementation • NIPS Workshop CDNNRIA 2018 • Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov

Bayesian methods have been successfully applied to sparsify the weights of neural networks and to remove structural units from the networks, e.g. neurons.

Bayesian Compression for Natural Language Processing

3 code implementations • EMNLP 2018 • Nadezhda Chirkova, Ekaterina Lobacheva, Dmitry Vetrov

In natural language processing, many tasks are successfully solved with recurrent neural networks, but such models have a huge number of parameters.

Bayesian Sparsification of Recurrent Neural Networks

2 code implementations • 31 Jul 2017 • Ekaterina Lobacheva, Nadezhda Chirkova, Dmitry Vetrov

Recurrent neural networks show state-of-the-art results in many text analysis tasks but often require a lot of memory to store their weights.

Language Modelling • Sentiment Analysis
