Search Results for author: Neel Sundaresan

Found 35 papers, 10 papers with code

Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension

no code implementations • 13 Apr 2024 • MengNan Qi, Yufan Huang, Yongqiang Yao, Maoquan Wang, Bin Gu, Neel Sundaresan

Our experimental results reveal that following this pretraining, both Code Llama and StarCoder, the prevalent code domain pretraining models, display significant improvements on our logically equivalent code selection task and the code completion task.

Code Completion Sentence +2

AutoDev: Automated AI-Driven Development

no code implementations • 13 Mar 2024 • Michele Tufano, Anisha Agarwal, Jinu Jang, Roshanak Zilouchian Moghaddam, Neel Sundaresan

This enables the AI Agents to execute tasks in a fully automated manner with a comprehensive understanding of the contextual information required.

Code Generation

Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming

no code implementations • 22 Feb 2024 • Anisha Agarwal, Aaron Chan, Shubham Chandel, Jinu Jang, Shaun Miller, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Neel Sundaresan, Michele Tufano

The integration of Large Language Models (LLMs) into Integrated Development Environments (IDEs) has become a focal point in modern software development.

Bug fixing Code Generation

Rethinking the Instruction Quality: LIFT is What You Need

no code implementations • 12 Dec 2023 • Yang Xu, Yongqiang Yao, Yufan Huang, MengNan Qi, Maoquan Wang, Bin Gu, Neel Sundaresan

Instruction tuning, a specialized technique to enhance large language model (LLM) performance via instruction datasets, relies heavily on the quality of employed data.

Code Generation Instruction Following +3

SUT: Active Defects Probing for Transcompiler Models

no code implementations • 22 Oct 2023 • MengNan Qi, Yufan Huang, Maoquan Wang, Yongqiang Yao, Zihan Liu, Bin Gu, Colin Clement, Neel Sundaresan

In this paper we introduce new metrics for programming language translation that address these basic syntax errors.

Translation

Program Translation via Code Distillation

no code implementations • 17 Oct 2023 • Yufan Huang, MengNan Qi, Yongqiang Yao, Maoquan Wang, Bin Gu, Colin Clement, Neel Sundaresan

Distilled code serves as a translation pivot for any programming language, leading by construction to parallel corpora which scale to all available source code by simply applying the distillation compiler.

Machine Translation Translation

Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation

no code implementations • 3 Oct 2023 • Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, Alexey Svyatkovskiy

Software testing is a crucial aspect of software development, and the creation of high-quality tests that adhere to best practices is essential for effective maintenance.

Code Generation reinforcement-learning

Predicting Code Coverage without Execution

1 code implementation • 25 Jul 2023 • Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement

Using Machine Learning to amortize this expensive process could lower the cost of code coverage by requiring only the source code context, and the task of code coverage prediction can be a novel benchmark for judging the ability of models to understand code.

RAPGen: An Approach for Fixing Code Inefficiencies in Zero-Shot

no code implementations • 29 Jun 2023 • Spandan Garg, Roshanak Zilouchian Moghaddam, Neel Sundaresan

We compare our approach with various prompt variations and state-of-the-art methods on the task of performance bug fixing.

Bug fixing Language Modelling +2

Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

no code implementations • 23 May 2023 • Aaron Chan, Anant Kharkar, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Alec Helyar, Eslam Kamal, Mohamed Elkamhawy, Neel Sundaresan

We recognize that the current advances in machine learning can be used to detect vulnerable code patterns on syntactically incomplete code snippets as the developer is writing the code at EditTime.

Vulnerability Detection

Code Execution with Pre-trained Language Models

1 code implementation • 8 May 2023 • Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan

Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code.

Code Generation Code Search +2

Exploring and Evaluating Personalized Models for Code Generation

no code implementations • 29 Aug 2022 • Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano

Large Transformer models have achieved state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code.

Code Generation Natural Language Understanding +1

DeepPERF: A Deep Learning-Based Approach For Improving Software Performance

no code implementations • 27 Jun 2022 • Spandan Garg, Roshanak Zilouchian Moghaddam, Colin B. Clement, Neel Sundaresan, Chen Wu

Additionally, we evaluate DeepPERF on 50 open source C# repositories on GitHub using both benchmark and unit tests and find that our model is able to suggest valid performance improvements that can improve both CPU usage and Memory allocations.

valid

AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations

no code implementations • 23 May 2022 • Xiaoyu Liu, Jinu Jang, Neel Sundaresan, Miltiadis Allamanis, Alexey Svyatkovskiy

This scenario motivates the code adaptation task -- a variant of program repair which aims to adapt variable identifiers in a pasted snippet of code to the surrounding, preexisting source code.

Program Repair

Generating Examples From CLI Usage: Can Transformers Help?

no code implementations • 27 Apr 2022 • Roshanak Zilouchian Moghaddam, Spandan Garg, Colin B. Clement, Yevhen Mohylevskyy, Neel Sundaresan

Continuous evolution in modern software often causes documentation, tutorials, and examples to be out of sync with changing interfaces and frameworks.

BIG-bench Machine Learning

Automating Code Review Activities by Large-Scale Pre-training

2 code implementations • 17 Mar 2022 • Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan

In this research, we focus on utilizing pre-training techniques for the tasks in the code review scenario.

Comment Generation

Learning to Reduce False Positives in Analytic Bug Detectors

no code implementations • 8 Mar 2022 • Anant Kharkar, Roshanak Zilouchian Moghaddam, Matthew Jin, Xiaoyu Liu, Xin Shi, Colin Clement, Neel Sundaresan

Due to increasingly complex software design and rapid iterative development, code defects and security vulnerabilities are prevalent in modern software.

Training and Evaluating a Jupyter Notebook Data Science Assistant

1 code implementation • 30 Jan 2022 • Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan

We study the feasibility of a Data Science assistant powered by a sequence-to-sequence transformer by training a new model JuPyT5 on all publicly available Jupyter Notebook GitHub repositories and developing a new metric: Data Science Problems (DSP).

Math

Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy

no code implementations • EMNLP 2021 • Colin B. Clement, Shuai Lu, Xiaoyu Liu, Michele Tufano, Dawn Drain, Nan Duan, Neel Sundaresan, Alexey Svyatkovskiy

While there are many efforts to extend the context window, we introduce an architecture-independent approach for leveraging the syntactic hierarchies of source code for incorporating entire file-level context into a fixed-length window.

Code Completion Code Generation +3

Program Merge Conflict Resolution via Neural Transformers

1 code implementation • 31 Aug 2021 • Alexey Svyatkovskiy, Sarah Fakhoury, Negar Ghorbani, Todd Mytkowicz, Elizabeth Dinella, Christian Bird, Jinu Jang, Neel Sundaresan, Shuvendu Lahiri

Our model achieves 63-68% accuracy for merge resolution synthesis, yielding nearly a 3x performance improvement over existing semi-structured tools and a 2x improvement over neural program merge tools.

Distilling Transformers for Neural Cross-Domain Search

no code implementations • 6 Aug 2021 • Colin B. Clement, Chen Wu, Dawn Drain, Neel Sundaresan

Pre-trained transformers have recently clinched top spots in the gamut of natural language tasks and pioneered solutions to software engineering tasks.

Code Search Data Augmentation +3

DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons

no code implementations • 19 May 2021 • Dawn Drain, Colin B. Clement, Guillermo Serrato, Neel Sundaresan

The joint task of bug localization and program repair is an integral part of the software development process.

Program Repair

Generating Bug-Fixes Using Pretrained Transformers

no code implementations • 16 Apr 2021 • Dawn Drain, Chen Wu, Alexey Svyatkovskiy, Neel Sundaresan

In this work we introduce DeepDebug: a data-driven program repair approach which learns to detect and fix bugs in Java methods mined from real-world GitHub repositories.

Denoising Program Repair

Generating Code with the Help of Retrieved Template Functions and Stack Overflow Answers

no code implementations • 12 Apr 2021 • Dawn Drain, Changran Hu, Chen Wu, Mikhail Breslav, Neel Sundaresan

To demonstrate the effectiveness of our model designs, we perform extensive experiments with CodeSearchNet, which contains template functions, and CoNaLa, which contains Stack Overflow intent-snippet pairs.

Code Search Retrieval

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

4 code implementations • 9 Feb 2021 • Shuai Lu, Daya Guo, Shuo Ren, JunJie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu

Benchmark datasets have a significant impact on accelerating research in programming language tasks.

Ranked #1 on Cloze Test on CodeXGLUE - CT-maxmin

BIG-bench Machine Learning Clone Detection +9

PyMT5: multi-mode translation of natural language and Python code with transformers

no code implementations • EMNLP 2020 • Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan

Simultaneously modeling source code and natural language has many exciting applications in automated software development and understanding.

Translation

CodeBLEU: a Method for Automatic Evaluation of Code Synthesis

2 code implementations • 22 Sep 2020 • Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, Shuai Ma

Evaluation metrics play a vital role in the growth of an area, as they define the standard for distinguishing between good and bad models.

Code Translation Translation

GraphCodeBERT: Pre-training Code Representations with Data Flow

1 code implementation • ICLR 2021 • Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou

Instead of using a syntactic-level structure of code such as the abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.

Ranked #3 on Type prediction on ManyTypes4TypeScript

Clone Detection Code Completion +7

Unit Test Case Generation with Transformers and Focal Context

1 code implementation • 11 Sep 2020 • Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, Neel Sundaresan

We execute the test cases, collect test coverage information, and compare them with test cases generated by EvoSuite and GPT-3, finding that our approach outperforms GPT-3 and has comparable coverage w. r. t.

Denoising

Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

no code implementations • 11 Sep 2020 • Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Neel Sundaresan

In this paper we present an approach to support developers in writing unit test cases by generating accurate and useful assert statements.

IntelliCode Compose: Code Generation Using Transformer

no code implementations • 16 May 2020 • Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan

In software development through integrated development environments (IDEs), code completion is one of the most widely used features.

Code Completion Code Generation

Pythia: AI-assisted Code Completion System

1 code implementation • 29 Nov 2019 • Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan

In this paper, we propose a novel end-to-end approach for AI-assisted code completion called Pythia.

Code Completion Language Modelling

Fast Approximate Matching of Cell-Phone Videos for Robust Background Subtraction

no code implementations • 22 Apr 2014 • Raffay Hamid, Atish Das Sarma, Dennis Decoste, Neel Sundaresan

We identify a novel instance of the background subtraction problem that focuses on extracting near-field foreground objects captured using handheld cameras.

Object

Large Scale Visual Recommendations From Street Fashion Images

no code implementations • 8 Jan 2014 • Vignesh Jagadeesh, Robinson Piramuthu, Anurag Bhardwaj, Wei Di, Neel Sundaresan

We describe a completely automated large scale visual recommendation system for fashion.

Recommendation Systems Retrieval

Large-Scale Video Summarization Using Web-Image Priors

no code implementations • CVPR 2013 • Aditya Khosla, Raffay Hamid, Chih-Jen Lin, Neel Sundaresan

Given the enormous growth in user-generated videos, it is becoming increasingly important to be able to navigate them efficiently.

Navigate Video Summarization
