Search Results for author: Vincent J. Hellendoorn

Found 12 papers, 7 papers with code

Learning Defect Prediction from Unrealistic Data

no code implementations · 2 Nov 2023 · Kamel Alrashedy, Vincent J. Hellendoorn, Alessandro Orso

To investigate this conjecture, we propose an approach for identifying the subsets of these large yet unrealistic datasets that are most similar to examples in real-world datasets based on their learned representations.

Synthetic Data Generation
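The abstract above describes selecting the synthetic examples whose learned representations are closest to real-world examples. A minimal sketch of that idea, using cosine similarity over embedding matrices (the function name and k-selection strategy are illustrative assumptions, not the paper's actual code):

```python
import numpy as np

def select_realistic_subset(synthetic_emb, real_emb, k):
    """Pick the k synthetic examples whose learned representations are
    closest (by cosine similarity) to any real-world example.
    Illustrative sketch only, not the paper's implementation."""
    # Normalize rows so dot products become cosine similarities.
    s = synthetic_emb / np.linalg.norm(synthetic_emb, axis=1, keepdims=True)
    r = real_emb / np.linalg.norm(real_emb, axis=1, keepdims=True)
    # For each synthetic example, similarity to its nearest real example.
    nearest_sim = (s @ r.T).max(axis=1)
    # Indices of the k most realistic-looking synthetic examples.
    return np.argsort(-nearest_sim)[:k]
```

A synthetic example identical to a real one would rank first under this measure; dissimilar ones fall to the bottom and can be dropped from training.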

Large Language Models for Test-Free Fault Localization

1 code implementation · 3 Oct 2023 · Aidan Z. H. Yang, Ruben Martins, Claire Le Goues, Vincent J. Hellendoorn

Specifically, we propose to overcome the left-to-right nature of LLMs by fine-tuning a small set of bidirectional adapter layers on top of the representations learned by LLMs. The result is LLMAO, the first language-model-based fault localization approach that locates buggy lines of code without any test coverage information.

Fault localization · Language Modelling
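The core idea in the snippet above — a small bidirectional layer stacked on frozen LLM states, with a per-line score head — can be sketched with plain numpy. This is a hypothetical illustration (class name, single-layer design, and sigmoid head are assumptions), not LLMAO's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class BidirectionalAdapter:
    """One bidirectional self-attention layer over frozen LLM hidden
    states, followed by a per-line sigmoid head scoring bugginess.
    Illustrative sketch, not the paper's implementation."""
    def __init__(self, hidden, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(hidden)
        self.wq = rng.normal(0, scale, (hidden, hidden))
        self.wk = rng.normal(0, scale, (hidden, hidden))
        self.wv = rng.normal(0, scale, (hidden, hidden))
        self.w_out = rng.normal(0, scale, (hidden, 1))

    def __call__(self, states):
        # states: (num_lines, hidden) representations from a frozen LLM.
        q, k, v = states @ self.wq, states @ self.wk, states @ self.wv
        # No causal mask: every line attends to every other line,
        # which is what makes the adapter bidirectional.
        attn = softmax(q @ k.T / np.sqrt(states.shape[1]))
        mixed = attn @ v
        # Per-line probability that the line is buggy.
        return 1.0 / (1.0 + np.exp(-(mixed @ self.w_out).squeeze(-1)))
```

Because only the adapter weights are trained, the underlying LLM stays frozen, keeping fine-tuning cheap relative to the model's size.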

CAT-LM: Training Language Models on Aligned Code And Tests

2 code implementations · 2 Oct 2023 · Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn

We also drastically increase the maximum sequence length of inputs to 8,192 tokens, 4x more than typical code generation models, to ensure that the code context is available to the model when generating test code.

Code Generation · Language Modelling
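Training on aligned code and tests within an 8,192-token window implies packing each (code file, test file) pair into a single sequence. A hedged sketch of one way to do that — the function name, truncation policy, and separator token are assumptions, not CAT-LM's actual preprocessing:

```python
def pack_code_and_test(code_tokens, test_tokens, max_len=8192,
                       sep="<|codetestpair|>"):
    """Pack an aligned (code file, test file) pair into one training
    example of at most max_len tokens. When the pair is too long,
    leading code tokens are dropped so the test file, the generation
    target, always survives. Illustrative sketch only."""
    budget = max_len - len(test_tokens) - 1  # reserve the separator slot
    code = code_tokens[-budget:] if budget > 0 else []
    return code + [sep] + test_tokens
```

Truncating the code from the front keeps the tokens nearest the separator — typically the functions the test exercises — inside the context window.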

AI for Low-Code for AI

no code implementations · 31 May 2023 · Nikitha Rao, Jason Tsay, Kiran Kate, Vincent J. Hellendoorn, Martin Hirzel

We task 20 developers, with varying levels of AI expertise, with implementing four ML pipelines using LowCoder, replacing the LowCoder_NL component with a simple keyword search in half of the tasks.

DiffusER: Discrete Diffusion via Edit-based Reconstruction

no code implementations · 30 Oct 2022 · Machel Reid, Vincent J. Hellendoorn, Graham Neubig

In text generation, models that generate text from scratch one token at a time are currently the dominant paradigm.

Denoising · Machine Translation · +2

A Systematic Evaluation of Large Language Models of Code

3 code implementations · 26 Feb 2022 · Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn

We aim to fill in some of these blanks through a systematic evaluation of the largest existing models: Codex, GPT-J, GPT-Neo, GPT-NeoX-20B, and CodeParrot, across various programming languages.

Language Modelling

Capturing Structural Locality in Non-parametric Language Models

no code implementations · ICLR 2022 · Frank F. Xu, Junxian He, Graham Neubig, Vincent J. Hellendoorn

Structural locality is a ubiquitous feature of real-world datasets, wherein data points are organized into local hierarchies.
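For source code, one concrete locality measure is how much of a directory path two files share. A tiny illustration of that notion (the helper name and shared-prefix metric are assumptions for exposition, not the paper's feature definition):

```python
def locality_level(path_a, path_b):
    """Depth of the shared directory prefix between two file paths: a
    simple proxy for the structural locality of two code contexts.
    Illustrative sketch only."""
    dirs_a = path_a.split("/")[:-1]  # drop the file name
    dirs_b = path_b.split("/")[:-1]
    level = 0
    for x, y in zip(dirs_a, dirs_b):
        if x != y:
            break
        level += 1
    return level
```

Under such a measure, two files in the same package score higher than two files that merely share a repository root, giving a non-parametric model a signal for weighting retrieved neighbors.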

Memorization and Generalization in Neural Code Intelligence Models

2 code implementations · 16 Jun 2021 · Md Rafiqul Islam Rabin, Aftab Hussain, Mohammad Amin Alipour, Vincent J. Hellendoorn

The goal of this paper is to evaluate and compare the extent of memorization and generalization in neural code intelligence models.

Code Documentation Generation · Code Search · +3

Understanding Neural Code Intelligence Through Program Simplification

2 code implementations · 7 Jun 2021 · Md Rafiqul Islam Rabin, Vincent J. Hellendoorn, Mohammad Amin Alipour

Our approach, SIVAND, uses simplification techniques that reduce the size of input programs of a CI model while preserving the predictions of the model.

Method name prediction · Variable misuse
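Simplifying an input while preserving the model's prediction, as described above, is in the spirit of delta debugging. A minimal greedy variant (the function name and one-token-at-a-time deletion policy are illustrative assumptions; SIVAND's actual reduction strategy differs):

```python
def simplify(tokens, predict):
    """Greedy 1-minimal reduction: repeatedly try deleting each token
    and keep the deletion whenever the model's prediction on the
    reduced program is unchanged. Illustrative sketch only."""
    target = predict(tokens)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(tokens):
            candidate = tokens[:i] + tokens[i + 1:]
            if candidate and predict(candidate) == target:
                tokens = candidate  # deletion preserved the prediction
                changed = True
            else:
                i += 1
    return tokens
```

Whatever survives the reduction is, by construction, sufficient for the model to produce its original prediction — which is what makes the residue useful for understanding what the model attends to.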

Patching as Translation: the Data and the Metaphor

1 code implementation · 24 Aug 2020 · Yangruibo Ding, Baishakhi Ray, Premkumar Devanbu, Vincent J. Hellendoorn

Given these findings, we demonstrate how a more principled approach to model design, based on our empirical findings and general knowledge of software development, can lead to better solutions.

General Knowledge · Program Repair · +1

Global Relational Models of Source Code

1 code implementation · ICLR 2020 · Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, David Bieber

By studying a popular, non-trivial program repair task, variable-misuse identification, we explore the relative merits of traditional and hybrid model families for code representation.

Inductive Bias · Variable misuse
