Search Results for author: Michele Tufano

Found 16 papers, 6 papers with code

AutoDev: Automated AI-Driven Development

no code implementations · 13 Mar 2024 · Michele Tufano, Anisha Agarwal, Jinu Jang, Roshanak Zilouchian Moghaddam, Neel Sundaresan

This enables the AI agents to execute tasks fully autonomously, with a comprehensive understanding of the required contextual information.

Code Generation

Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation

no code implementations · 3 Oct 2023 · Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, Alexey Svyatkovskiy

Software testing is a crucial aspect of software development, and the creation of high-quality tests that adhere to best practices is essential for effective maintenance.

Code Generation · Reinforcement Learning

Predicting Code Coverage without Execution

1 code implementation · 25 Jul 2023 · Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement

Using Machine Learning to amortize this expensive process could lower the cost of code coverage by requiring only the source code context, and the task of code coverage prediction can be a novel benchmark for judging the ability of models to understand code.
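The ground truth for this task is cheap to define but expensive to collect at scale, since it normally requires building and executing the code. A minimal sketch of how per-line coverage labels can be gathered dynamically in Python, the kind of label a model would learn to predict from source context alone (the function and helper names here are illustrative, not from the paper):

```python
import sys
import inspect

def branchy(x):
    if x > 0:
        y = x * 2   # runs only when x > 0
    else:
        y = -x      # runs only when x <= 0
    return y

def trace_coverage(fn, *args):
    """Record which line numbers of fn execute during one call."""
    executed = set()
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is fn.__code__:
            executed.add(frame.f_lineno)
        return tracer
    sys.settrace(tracer)
    try:
        fn(*args)
    finally:
        sys.settrace(None)
    return executed

# Label each source line of branchy as covered (>) or not (!) for x = 3
start = branchy.__code__.co_firstlineno
source, _ = inspect.getsourcelines(branchy)
covered = trace_coverage(branchy, 3)
labels = [(">" if start + i in covered else "!") + " " + line.rstrip()
          for i, line in enumerate(source)]
print("\n".join(labels))
```

A coverage-prediction model would be asked to produce these per-line labels from the source and its context, with no execution at all.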

An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation

no code implementations · 3 Jan 2023 · Kevin Moran, Ali Yachnes, George Purnell, Junayed Mahmud, Michele Tufano, Carlos Bernal-Cárdenas, Denys Poshyvanyk, Zach H'Doubler

This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software.

Image Captioning · Machine Translation

Exploring and Evaluating Personalized Models for Code Generation

no code implementations · 29 Aug 2022 · Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano

Large Transformer models achieved the state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code.

Code Generation · Natural Language Understanding +1

Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy

no code implementations · EMNLP 2021 · Colin B. Clement, Shuai Lu, Xiaoyu Liu, Michele Tufano, Dawn Drain, Nan Duan, Neel Sundaresan, Alexey Svyatkovskiy

While there are many efforts to extend the context window, we introduce an architecture-independent approach for leveraging the syntactic hierarchies of source code for incorporating entire file-level context into a fixed-length window.

Code Completion · Code Generation +3
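The core idea, packing the most relevant syntactic elements of a file into a fixed token budget, can be sketched as a greedy priority fill. The element names, priority ordering, and token counts below are illustrative only, not the paper's exact hierarchy:

```python
def pack_context(elements, budget):
    """Greedily fill a fixed-length window with the highest-priority
    syntactic elements of a source file. Each element is a tuple of
    (priority, name, token_count); lower priority values win."""
    chosen, used = [], 0
    for prio, name, tokens in sorted(elements, key=lambda e: e[0]):
        if used + tokens <= budget:
            chosen.append(name)
            used += tokens
    return chosen

# Hypothetical breakdown of one file into syntactic elements
file_elements = [
    (0, "focal_function_body", 40),        # the code being completed
    (1, "imports", 10),
    (2, "class_signatures", 15),
    (3, "sibling_method_signatures", 20),
    (4, "sibling_method_bodies", 80),      # lowest priority, largest
]
print(pack_context(file_elements, budget=100))
# → ['focal_function_body', 'imports', 'class_signatures', 'sibling_method_signatures']
```

The sibling method bodies overflow the window and are dropped, while their signatures survive, so the model still sees file-level structure within a fixed-length input.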

GraphCodeBERT: Pre-training Code Representations with Data Flow

1 code implementation · ICLR 2021 · Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou

Instead of taking the syntactic-level structure of code, such as the abstract syntax tree (AST), we use data flow in the pre-training stage: a semantic-level structure of code that encodes the "where-the-value-comes-from" relation between variables.

Clone Detection · Code Completion +7
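The "where-the-value-comes-from" relation can be illustrated on straight-line code in a few lines of Python. This toy extractor is my own sketch, not GraphCodeBERT's graph construction: it ignores control flow, attributes, and reassignment subtleties that the real pre-training graph handles, and simply links each variable read to the latest assignment of that name:

```python
import ast

def data_flow_edges(src):
    """Toy 'where-the-value-comes-from' edges for straight-line code:
    each variable read on a right-hand side points back at the line of
    the most recent assignment to that name."""
    last_def = {}   # variable name -> line of its latest assignment
    edges = []      # (name, line where read, line where defined)
    for stmt in ast.parse(src).body:
        if isinstance(stmt, ast.Assign):
            # reads on the right-hand side come from earlier definitions
            for node in ast.walk(stmt.value):
                if isinstance(node, ast.Name) and node.id in last_def:
                    edges.append((node.id, stmt.lineno, last_def[node.id]))
            # the left-hand side then becomes the newest definition
            for tgt in stmt.targets:
                if isinstance(tgt, ast.Name):
                    last_def[tgt.id] = stmt.lineno
    return edges

snippet = "a = 1\nb = a + 2\nc = a + b\n"
print(data_flow_edges(snippet))
# → [('a', 2, 1), ('a', 3, 1), ('b', 3, 2)]
```

These variable-to-definition edges are the kind of semantic relation the model attends over during pre-training, alongside the source tokens.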

Unit Test Case Generation with Transformers and Focal Context

1 code implementation · 11 Sep 2020 · Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, Neel Sundaresan

We execute the test cases, collect test coverage information, and compare them with test cases generated by EvoSuite and GPT-3, finding that our approach outperforms GPT-3 and has comparable coverage w.r.t.

Denoising

Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

no code implementations · 11 Sep 2020 · Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Neel Sundaresan

In this paper we present an approach to support developers in writing unit test cases by generating accurate and useful assert statements.

DeepMutation: A Neural Mutation Tool

no code implementations · 12 Feb 2020 · Michele Tufano, Jason Kimko, Shiya Wang, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Denys Poshyvanyk

To this aim, two characteristics of mutation testing frameworks are of paramount importance: (i) they should generate mutants that are representative of real faults; and (ii) they should provide a complete tool chain able to automatically generate, inject, and test the mutants.

Fault Detection
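For contrast with the learned mutations the tool produces, a traditional hand-written mutation operator looks like the sketch below. This is a deliberately naive stand-in of my own; DeepMutation instead learns its mutants from real bug-fixes so they resemble actual faults:

```python
def mutate_relational(stmt):
    """Produce mutants of a statement by flipping relational operators,
    the classic hand-coded style of mutant that learned approaches aim
    to improve on by imitating real faults."""
    swaps = [(" > ", " <= "), (" < ", " >= "), (" == ", " != ")]
    return [stmt.replace(old, new) for old, new in swaps if old in stmt]

print(mutate_relational("if (x > 0) return y;"))
# → ['if (x <= 0) return y;']
```

A full mutation-testing tool chain then injects each mutant into the codebase and reruns the test suite to see which mutants are killed.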

On Learning Meaningful Code Changes via Neural Machine Translation

no code implementations · 25 Jan 2019 · Michele Tufano, Jevgenija Pantiuchina, Cody Watson, Gabriele Bavota, Denys Poshyvanyk

We show that, when applied in a narrow enough context (i.e., small/medium-sized pairs of methods before/after the pull request changes), NMT can automatically replicate the changes implemented by developers during pull requests in up to 36% of the cases.

Bug Fixing · Machine Translation +2

Learning How to Mutate Source Code from Bug-Fixes

no code implementations · 27 Dec 2018 · Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, Denys Poshyvanyk

Starting from code fixed by developers in the context of a bug-fix, our empirical evaluation showed that our models are able to predict mutants that resemble original fixed bugs in between 9% and 45% of the cases (depending on the model).

Software Engineering

Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities

1 code implementation · 15 Jul 2017 · Martin White, Michele Tufano, Matias Martinez, Martin Monperrus, Denys Poshyvanyk

We aim to reason about the repair ingredients by using code similarities to prioritize and transform statements in a codebase for patch generation.

Software Engineering
