no code implementations • 26 Sep 2023 • Zimin Chen, Sen Fang, Martin Monperrus
Software optimization refines programs for resource efficiency while preserving functionality.
1 code implementation • NeurIPS 2021 • Zimin Chen, Vincent Hellendoorn, Pascal Lamblin, Petros Maniatis, Pierre-Antoine Manzagol, Daniel Tarlow, Subhodeep Moitra
Machine learning for understanding and editing source code has recently attracted significant interest, with many developments in new models, new code representations, and new tasks. This proliferation can appear disparate and disconnected, making each approach seemingly unique and incompatible, thus obscuring the core machine learning challenges and contributions. In this work, we demonstrate that the landscape can be significantly simplified by taking a general approach of mapping a graph to a sequence of tokens and pointers. Our main result is to show that 16 recently published tasks of different shapes can be cast in this form, based on which a single model architecture achieves near or above state-of-the-art results on nearly all tasks, outperforming custom models like code2seq and alternative generic models like Transformers. This unification further enables multi-task learning and a series of cross-cutting experiments about the importance of different modeling choices for code understanding and repair tasks. The full framework, called PLUR, is easily extensible to more tasks, and will be open-sourced (https://github.com/google-research/plur).
1 code implementation • 2 Jul 2021 • Jian Gu, Zimin Chen, Martin Monperrus
In this paper, to improve the vector space, we introduce tree-serialization methods on a simplified form of AST and build the multimodal representation for the code data.
Ranked #1 on Code Search on CodeSearchNet - Ruby
2 code implementations • 16 Apr 2021 • Zimin Chen, Steve Kommrusch, Martin Monperrus
To sum up, this paper shows that transfer learning works well for repairing security vulnerabilities in C compared to learning on a small dataset.
no code implementations • 4 Dec 2019 • Zimin Chen, Steve Kommrusch, Martin Monperrus
Software vulnerabilities affect all businesses and research is being done to avoid, detect or repair them.
no code implementations • 4 Nov 2019 • Daniel Tarlow, Subhodeep Moitra, Andrew Rice, Zimin Chen, Pierre-Antoine Manzagol, Charles Sutton, Edward Aftandilian
A diff specifies how to modify the code's abstract syntax tree, represented in the neural network as a sequence of tokens and of pointers to code locations.
1 code implementation • 22 Jul 2019 • Zhongxing Yu, Matias Martinez, Zimin Chen, Tegawendé F. Bissyandé, Martin Monperrus
We instantiate our approach in the context of repair transform prediction for Java programs.
1 code implementation • 5 Apr 2019 • Zimin Chen, Martin Monperrus
In this survey, we aim to collect and discuss the usage of word embedding techniques on programs and source code.
2 code implementations • 24 Dec 2018 • Zimin Chen, Steve Kommrusch, Michele Tufano, Louis-Noël Pouchet, Denys Poshyvanyk, Martin Monperrus
This paper presents a novel end-to-end approach to program repair based on sequence-to-sequence learning.
no code implementations • 14 Nov 2018 • Zimin Chen, Martin Monperrus
Recently, there have been original attempts to use the concept of "code similarity" in program repair, suggesting that similarity analysis has an important role in the repair process.
1 code implementation • 6 Jul 2018 • Zimin Chen, Martin Monperrus
CodRep is a machine learning competition on source code data.