Search Results for author: Saikat Chakraborty

Found 23 papers, 12 papers with code

Studying LLM Performance on Closed- and Open-source Data

no code implementations 23 Feb 2024 Toufique Ahmed, Christian Bird, Premkumar Devanbu, Saikat Chakraborty

We find that performance for C# changes little from OSS --> proprietary code, but does significantly reduce for C++; we find that this difference is attributable to differences in identifiers.

In-Context Learning

Finding Inductive Loop Invariants using Large Language Models

no code implementations 14 Nov 2023 Adharsh Kamath, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K. Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, Rahul Sharma

Finally, we explore the effectiveness of using an efficient combination of a symbolic tool and an LLM on our dataset and compare it against a purely symbolic baseline.

Ranking LLM-Generated Loop Invariants for Program Verification

1 code implementation 13 Oct 2023 Saikat Chakraborty, Shuvendu K. Lahiri, Sarah Fakhoury, Madanlal Musuvathi, Akash Lal, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, Nikhil Swamy

In this work, we observe that Large Language Models (such as gpt-3.5 or gpt-4) are capable of synthesizing loop invariants for a class of programs in a 0-shot setting, yet require several samples to generate the correct invariants.

Re-Ranking

Towards Causal Deep Learning for Vulnerability Detection

no code implementations 12 Oct 2023 Md Mahbubur Rahman, Ira Ceka, Chengzhi Mao, Saikat Chakraborty, Baishakhi Ray, Wei Le

Our results show that CausalVul consistently improved the model accuracy, robustness and OOD performance for all the state-of-the-art models and datasets we experimented with.

Vulnerability Detection

Formalizing Natural Language Intent into Program Specifications via Large Language Models

no code implementations 3 Oct 2023 Madeline Endres, Sarah Fakhoury, Saikat Chakraborty, Shuvendu K. Lahiri

Informal natural language that describes code functionality, such as code comments or function documentation, may contain substantial information about a program's intent.

Fault localization Translation

GrACE: Generation using Associated Code Edits

no code implementations 23 May 2023 Priyanshu Gupta, Avishree Khare, Yasharth Bajpai, Saikat Chakraborty, Sumit Gulwani, Aditya Kanade, Arjun Radhakrishna, Gustavo Soares, Ashish Tiwari

In our experiments with two datasets, the knowledge of prior edits boosts the performance of the LLMs significantly and enables them to generate 29% and 54% more correctly edited code in top-1 suggestions relative to the current state-of-the-art symbolic and neural approaches, respectively.

Bug fixing Code Generation

On Contrastive Learning of Semantic Similarity for Code to Code Search

1 code implementation 5 May 2023 Anthony Saieva, Saikat Chakraborty, Gail Kaiser

This paper introduces a novel code-to-code search technique that enhances the performance of Large Language Models (LLMs) by including both static and dynamic features as well as utilizing both similar and dissimilar examples during training.

Code Search Contrastive Learning +2

Towards Generating Functionally Correct Code Edits from Natural Language Issue Descriptions

no code implementations 7 Apr 2023 Sarah Fakhoury, Saikat Chakraborty, Madan Musuvathi, Shuvendu K. Lahiri

Several benchmarks have recently emerged to evaluate the ability of LLMs to generate functionally correct code from natural language intent with respect to a set of hidden test cases.

On ML-Based Program Translation: Perils and Promises

1 code implementation 21 Feb 2023 Aniketh Malyala, Katelyn Zhou, Baishakhi Ray, Saikat Chakraborty

In the future, we envision an end-to-end program translation tool where programming domain knowledge can be embedded into an ML-based translation pipeline using pre- and post-processing steps.

Translation

Interactive Code Generation via Test-Driven User-Intent Formalization

no code implementations 11 Aug 2022 Shuvendu K. Lahiri, Sarah Fakhoury, Aaditya Naik, Georgios Sakkas, Saikat Chakraborty, Madanlal Musuvathi, Piali Choudhury, Curtis von Veh, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao

Large language models (LLMs) have shown great potential in automating significant aspects of coding by producing natural code from informal natural language (NL) intent.

Code Generation

NatGen: Generative pre-training by "Naturalizing" source code

1 code implementation 15 Jun 2022 Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar Devanbu, Baishakhi Ray

Pre-trained Generative Language models (e.g., PLBART, CodeT5, SPT-Code) for source code yielded strong results on several tasks in the past few years, including code generation and translation.

Code Translation Few-Shot Learning +1

Towards Learning (Dis)-Similarity of Source Code from Program Contrasts

no code implementations ACL 2022 Yangruibo Ding, Luca Buratti, Saurabh Pujar, Alessandro Morari, Baishakhi Ray, Saikat Chakraborty

We pre-train our model with a much smaller dataset, the size of which is only 5% of the state-of-the-art models' training datasets, to illustrate the effectiveness of our data augmentation and the pre-training approach.

Clone Detection Contrastive Learning +2

AVATAR: A Parallel Corpus for Java-Python Program Translation

1 code implementation 26 Aug 2021 Wasi Uddin Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, Kai-Wei Chang

Automating program translation is of paramount importance in software migration, and recently researchers explored unsupervised approaches due to the unavailability of parallel corpora.

Translation

Retrieval Augmented Code Generation and Summarization

1 code implementation Findings (EMNLP) 2021 Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models.

 Ranked #1 on Code Generation on CodeXGLUE - CodeSearchNet (using extra training data)

Code Generation Code Summarization +1

On Multi-Modal Learning of Editing Source Code

1 code implementation 15 Aug 2021 Saikat Chakraborty, Baishakhi Ray

With in-depth investigation and analysis, we show that developers' hint as an input modality can narrow the search space for patches and outperform state-of-the-art models to generate correctly patched code in top-1 position.

NMT

Unified Pre-training for Program Understanding and Generation

1 code implementation NAACL 2021 Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models.

Clone Detection Code Summarization +6

Deep Learning based Vulnerability Detection: Are We There Yet?

1 code implementation 3 Sep 2020 Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, Baishakhi Ray

In this paper, we ask, "how well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario?".

Software Engineering

A Graph-based Ranking Approach to Extract Key-frames for Static Video Summarization

no code implementations 29 Nov 2019 Saikat Chakraborty

Video abstraction has become one of the efficient approaches to grasp the content of a video without seeing it entirely.

Video Summarization

Tree2Tree Neural Translation Model for Learning Source Code Changes

no code implementations 30 Sep 2018 Saikat Chakraborty, Miltiadis Allamanis, Baishakhi Ray

Our evaluation shows the effectiveness of CODIT in learning and suggesting abstract change templates.

Software Engineering

A Case Study on the Impact of Similarity Measure on Information Retrieval based Software Engineering Tasks

no code implementations 8 Aug 2018 Md Masudur Rahman, Saikat Chakraborty, Gail Kaiser, Baishakhi Ray

In particular, we analyze two previously proposed tools for project recommendation and bug localization tasks, which leverage diverse software artifacts, and observe that an informed choice of similarity measure indeed leads to improved performance of the existing SE tools.

Information Retrieval Retrieval
