no code implementations • 14 Dec 2023 • Anton Shapkin, Denis Litvinov, Yaroslav Zharov, Egor Bogomolov, Timur Galimzyanov, Timofey Bryksin
Our approach achieves several targets: (1) lifting the length limitations of the context window, saving on the prompt size; (2) allowing huge expansion of the number of retrieval entities available for the context; (3) alleviating the problem of misspelling or failing to find relevant entity names.
1 code implementation • 15 Aug 2023 • Aleksandra Eliseeva, Yaroslav Sokolov, Egor Bogomolov, Yaroslav Golubev, Danny Dig, Timofey Bryksin
We use this dataset to evaluate the completion setting and the usefulness of the historical context for state-of-the-art CMG models and GPT-3.5-turbo.
1 code implementation • 5 Aug 2022 • Mikhail Evtikhiev, Egor Bogomolov, Yaroslav Sokolov, Timofey Bryksin
Despite all that, minimal differences in the metric scores have been used in recent papers to claim superiority of some code generation models over the others.
no code implementations • 17 Jun 2022 • Maksim Zubkov, Egor Spirin, Egor Bogomolov, Timofey Bryksin
The first task is code clone detection, which we evaluate on the POJ-104 dataset containing implementations of 104 algorithms.
no code implementations • 17 Jun 2022 • Ilya Utkin, Egor Spirin, Egor Bogomolov, Timofey Bryksin
Even though the process of extracting ASTs from code can be done with different parsers, the impact of choosing a parser on the final model quality remains unstudied.
2 code implementations • 7 Jun 2022 • Egor Bogomolov, Sergey Zhuravlev, Egor Spirin, Timofey Bryksin
We evaluate three models of different complexity and compare their quality in three settings: trained on a large dataset of Java projects, further fine-tuned on the data from a particular project, and trained from scratch on this data.
no code implementations • 3 Jun 2021 • Mikhail Pravilov, Egor Bogomolov, Yaroslav Golubev, Timofey Bryksin
As for the commit message generation, our model demonstrated the same results as supervised models trained for this specific task, which indicates that it can encode code changes well and can be improved in the future by pre-training on a larger dataset of easily gathered code changes.
1 code implementation • 23 Mar 2021 • Egor Spirin, Egor Bogomolov, Vladimir Kovalenko, Timofey Bryksin
PSI trees contain code syntax trees as well as functions to work with them, and therefore can be used to enrich code representation using static analysis algorithms of modern IDEs.
2 code implementations • 6 Jul 2020 • Egor Bogomolov, Yaroslav Golubev, Artyom Lobanov, Vladimir Kovalenko, Timofey Bryksin
We use a dataset of 9 million GitHub projects as a reference search base.
Software Engineering
2 code implementations • 10 Feb 2020 • Vladimir Kovalenko, Egor Bogomolov, Timofey Bryksin, Alberto Bacchelli
With the goal of facilitating team collaboration, we propose a new approach to building vector representations of individual developers by capturing their individual contribution style, or coding style.
Software Engineering • Social and Information Networks