no code implementations • 23 Feb 2024 • Toufique Ahmed, Christian Bird, Premkumar Devanbu, Saikat Chakraborty
We find that performance for C# changes little from OSS to proprietary code, but drops significantly for C++; we find that this difference is attributable to differences in identifiers.
no code implementations • 14 Nov 2023 • Adharsh Kamath, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K. Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, Rahul Sharma
Finally, we explore the effectiveness of using an efficient combination of a symbolic tool and an LLM on our dataset and compare it against a purely symbolic baseline.
1 code implementation • 13 Oct 2023 • Saikat Chakraborty, Shuvendu K. Lahiri, Sarah Fakhoury, Madanlal Musuvathi, Akash Lal, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, Nikhil Swamy
In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants.
no code implementations • 12 Oct 2023 • Md Mahbubur Rahman, Ira Ceka, Chengzhi Mao, Saikat Chakraborty, Baishakhi Ray, Wei Le
Our results show that CausalVul consistently improved the model accuracy, robustness, and OOD performance for all the state-of-the-art models and datasets we experimented with.
no code implementations • 3 Oct 2023 • Madeline Endres, Sarah Fakhoury, Saikat Chakraborty, Shuvendu K. Lahiri
Informal natural language that describes code functionality, such as code comments or function documentation, may contain substantial information about a program's intent.
no code implementations • 23 May 2023 • Priyanshu Gupta, Avishree Khare, Yasharth Bajpai, Saikat Chakraborty, Sumit Gulwani, Aditya Kanade, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari
In our experiments with two datasets, the knowledge of prior edits boosts the performance of the LLMs significantly and enables them to generate 29% and 54% more correctly edited code in top-1 suggestions relative to the current state-of-the-art symbolic and neural approaches, respectively.
1 code implementation • 5 May 2023 • Anthony Saieva, Saikat Chakraborty, Gail Kaiser
This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during training.
no code implementations • 7 Apr 2023 • Sarah Fakhoury, Saikat Chakraborty, Madan Musuvathi, Shuvendu K. Lahiri
Several benchmarks have recently emerged to evaluate the ability of LLMs to generate functionally correct code from natural language intent with respect to a set of hidden test cases.
1 code implementation • 21 Feb 2023 • Aniketh Malyala, Katelyn Zhou, Baishakhi Ray, Saikat Chakraborty
In the future, we envision an end-to-end program translation tool where programming domain knowledge can be embedded into an ML-based translation pipeline using pre- and post-processing steps.
no code implementations • 11 Aug 2022 • Shuvendu K. Lahiri, Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, Madanlal Musuvathi, Piali Choudhury, Curtis von Veh, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao
Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent.
1 code implementation • 15 Jun 2022 • Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray
Pre-trained Generative Language models (e.g., PLBART, CodeT5, SPT-Code) for source code yielded strong results on several tasks in the past few years, including code generation and translation.
1 code implementation • 23 May 2022 • Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang
In code generation, the model learns to do the opposite.
no code implementations • ACL 2022 • Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty
We pre-train our model with a much smaller dataset, the size of which is only 5% of the state-of-the-art models' training datasets, to illustrate the effectiveness of our data augmentation and the pre-training approach.
1 code implementation • 26 Aug 2021 • Wasi Uddin Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, Kai-Wei Chang
Automating program translation is of paramount importance in software migration, and recently researchers explored unsupervised approaches due to the unavailability of parallel corpora.
1 code implementation • Findings (EMNLP) 2021 • Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang
To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models.
Ranked #1 on Code Generation on CodeXGLUE - CodeSearchNet (using extra training data)
1 code implementation • 15 Aug 2021 • Saikat Chakraborty, Baishakhi Ray
With in-depth investigation and analysis, we show that developers' hint as an input modality can narrow the search space for patches and outperform state-of-the-art models to generate correctly patched code in top-1 position.
1 code implementation • NAACL 2021 • Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang
Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models.
1 code implementation • 3 Sep 2020 • Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, Baishakhi Ray
In this paper, we ask, "How well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario?"
Software Engineering
9 code implementations • ACL 2020 • Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang
Generating a readable summary that describes the functionality of a program is known as source code summarization.
no code implementations • 29 Nov 2019 • Saikat Chakraborty
Video abstraction has become one of the efficient approaches to grasp the content of a video without seeing it entirely.
no code implementations • 30 Sep 2018 • Saikat Chakraborty, Miltiadis Allamanis, Baishakhi Ray
Our evaluation shows the effectiveness of CODIT in learning and suggesting abstract change templates.
Software Engineering
no code implementations • 8 Aug 2018 • Md Masudur Rahman, Saikat Chakraborty, Gail Kaiser, Baishakhi Ray
In particular, we analyze two previously proposed tools for project recommendation and bug localization tasks, which leverage diverse software artifacts, and observe that an informed choice of similarity measure indeed leads to improved performance of the existing SE tools.
2 code implementations • ACL 2018 • Md. Rizwan Parvez, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang
Text in many domains involves a significant amount of named entities.
Ranked #1 on Recipe Generation on Now You're Cooking!