1 code implementation • 27 Mar 2024 • Yangruibo Ding, Marcus J. Min, Gail Kaiser, Baishakhi Ray
Pre-trained code language models have achieved promising performance in code generation and improved the programming efficiency of human developers.
no code implementations • 27 Mar 2024 • Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David Wagner, Baishakhi Ray, Yizheng Chen
Evaluating code LMs on PrimeVul reveals that existing benchmarks significantly overestimate the performance of these models.
1 code implementation • 21 Oct 2023 • Marcus J. Min, Yangruibo Ding, Luca Buratti, Saurabh Pujar, Gail Kaiser, Suman Jana, Baishakhi Ray
In this paper, we first formally define the self-consistency of Code LLMs and then design a framework, IdentityChain, which effectively and efficiently evaluates the self-consistency and conventional accuracy of a model at the same time.
no code implementations • 20 Dec 2022 • Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang
While pre-trained language models (LMs) for code have achieved great success in code completion, they generate code conditioned only on the contents within the file, i.e., in-file context, but ignore the rich semantics in other files within the same project, i.e., cross-file context, a critical source of information that is especially useful in modern modular software development.
1 code implementation • 15 Jun 2022 • Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray
Pre-trained generative language models (e.g., PLBART, CodeT5, SPT-Code) for source code yielded strong results on several tasks in the past few years, including code generation and translation.
1 code implementation • 20 Dec 2021 • Yangruibo Ding, Sahil Suneja, Yunhui Zheng, Jim Laredo, Alessandro Morari, Gail Kaiser, Baishakhi Ray
Automatically locating vulnerable statements in source code is crucial to assure software security and alleviate developers' debugging efforts.
no code implementations • ACL 2022 • Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty
We pre-train our model with a much smaller dataset, the size of which is only 5% of the state-of-the-art models' training datasets, to illustrate the effectiveness of our data augmentation and the pre-training approach.
1 code implementation • 3 Sep 2020 • Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, Baishakhi Ray
In this paper, we ask, "How well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario?"
1 code implementation • 24 Aug 2020 • Yangruibo Ding, Baishakhi Ray, Premkumar Devanbu, Vincent J. Hellendoorn
Given these findings, we demonstrate how a more principled approach to model design, based on our empirical findings and general knowledge of software development, can lead to better solutions.