Search Results for author: Timofey Bryksin

Found 16 papers, 9 papers with code

Dynamic Retrieval-Augmented Generation

no code implementations 14 Dec 2023 Anton Shapkin, Denis Litvinov, Yaroslav Zharov, Egor Bogomolov, Timur Galimzyanov, Timofey Bryksin

Our approach achieves several targets: (1) lifting the length limitations of the context window, saving on the prompt size; (2) allowing huge expansion of the number of retrieval entities available for the context; (3) alleviating the problem of misspelling or failing to find relevant entity names.

abstractive question answering, Code Generation

From Commit Message Generation to History-Aware Commit Message Completion

1 code implementation 15 Aug 2023 Aleksandra Eliseeva, Yaroslav Sokolov, Egor Bogomolov, Yaroslav Golubev, Danny Dig, Timofey Bryksin

We use this dataset to evaluate the completion setting and the usefulness of the historical context for state-of-the-art CMG models and GPT-3.5-turbo.

Judging Adam: Studying the Performance of Optimization Methods on ML4SE Tasks

no code implementations 6 Mar 2023 Dmitry Pasechnyuk, Anton Prazdnichnykh, Mikhail Evtikhiev, Timofey Bryksin

In this work, we test the performance of various optimizers on deep learning models for source code and find that the choice of an optimizer can have a significant impact on the model quality, with up to two-fold score differences between some of the relatively well-performing optimizers.

Out of the BLEU: how should we assess quality of the Code Generation models?

1 code implementation 5 Aug 2022 Mikhail Evtikhiev, Egor Bogomolov, Yaroslav Sokolov, Timofey Bryksin

Despite all that, minimal differences in the metric scores have been used in recent papers to claim superiority of some code generation models over the others.

Code Generation, Machine Translation

Evaluation of Contrastive Learning with Various Code Representations for Code Clone Detection

no code implementations 17 Jun 2022 Maksim Zubkov, Egor Spirin, Egor Bogomolov, Timofey Bryksin

The first task is code clone detection, which we evaluate on the POJ-104 dataset containing implementations of 104 algorithms.

Clone Detection, Code Summarization

Evaluating the Impact of Source Code Parsers on ML4SE Models

no code implementations 17 Jun 2022 Ilya Utkin, Egor Spirin, Egor Bogomolov, Timofey Bryksin

Even though the process of extracting ASTs from code can be done with different parsers, the impact of choosing a parser on the final model quality remains unstudied.

Method name prediction

Assessing Project-Level Fine-Tuning of ML4SE Models

2 code implementations 7 Jun 2022 Egor Bogomolov, Sergey Zhuravlev, Egor Spirin, Timofey Bryksin

We evaluate three models of different complexity and compare their quality in three settings: trained on a large dataset of Java projects, further fine-tuned on the data from a particular project, and trained from scratch on this data.

Method name prediction

On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages

no code implementations 5 Apr 2022 Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin

Furthermore, some programming languages are inherently different and code written in one language usually cannot be interchanged with the others, i.e., Ruby and Java code possess very different structure.

Code Search, Code Summarization

DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation

1 code implementation 14 Jan 2022 Denis Sushentsev, Aleksandr Khvorov, Roman Vasiliev, Yaroslav Golubev, Timofey Bryksin

In this work, we explore the applicability of existing solutions for the bug triage problem when stack traces are used as the main data source of bug reports.

Unsupervised Learning of General-Purpose Embeddings for Code Changes

no code implementations 3 Jun 2021 Mikhail Pravilov, Egor Bogomolov, Yaroslav Golubev, Timofey Bryksin

As for the commit message generation, our model demonstrated the same results as supervised models trained for this specific task, which indicates that it can encode code changes well and can be improved in the future by pre-training on a larger dataset of easily gathered code changes.

PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code

1 code implementation 23 Mar 2021 Egor Spirin, Egor Bogomolov, Vladimir Kovalenko, Timofey Bryksin

PSI trees contain code syntax trees as well as functions to work with them, and therefore can be used to enrich code representation using static analysis algorithms of modern IDEs.

Method name prediction

TaskTracker-tool: a Toolkit for Tracking of Code Snapshots and Activity Data During Solution of Programming Tasks

1 code implementation 9 Dec 2020 Elena Lyulina, Anastasiia Birillo, Vladimir Kovalenko, Timofey Bryksin

To validate and showcase the toolkit, we present a dataset collected by our tools.

Software Engineering D.2.2; K.3.2

Sosed: a tool for finding similar software projects

2 code implementations 6 Jul 2020 Egor Bogomolov, Yaroslav Golubev, Artyom Lobanov, Vladimir Kovalenko, Timofey Bryksin

We use a dataset of 9 million GitHub projects as a reference search base.

Software Engineering

Using Large-Scale Anomaly Detection on Code to Improve Kotlin Compiler

1 code implementation 3 Apr 2020 Timofey Bryksin, Victor Petukhov, Ilya Alexin, Stanislav Prikhodko, Alexey Shpilman, Vladimir Kovalenko, Nikita Povarov

In this work, we apply anomaly detection to source code and bytecode to facilitate the development of a programming language and its compiler.

Anomaly Detection

Building Implicit Vector Representations of Individual Coding Style

2 code implementations 10 Feb 2020 Vladimir Kovalenko, Egor Bogomolov, Timofey Bryksin, Alberto Bacchelli

With the goal of facilitating team collaboration, we propose a new approach to building vector representations of individual developers by capturing their individual contribution style, or coding style.

Software Engineering, Social and Information Networks
