Search Results for author: Colin Clement

Found 8 papers, 4 papers with code

SUT: Active Defects Probing for Transcompiler Models

no code implementations • 22 Oct 2023 • MengNan Qi, Yufan Huang, Maoquan Wang, Yongqiang Yao, Zihan Liu, Bin Gu, Colin Clement, Neel Sundaresan

In this paper we introduce a new metrics for programming language translation and these metrics address these basic syntax errors.

Translation

Paper
Add Code

Program Translation via Code Distillation

no code implementations • 17 Oct 2023 • Yufan Huang, MengNan Qi, Yongqiang Yao, Maoquan Wang, Bin Gu, Colin Clement, Neel Sundaresan

Distilled code serves as a translation pivot for any programming language, leading by construction to parallel corpora which scale to all available source code by simply applying the distillation compiler.

Machine Translation Translation

Paper
Add Code

Predicting Code Coverage without Execution

1 code implementation • 25 Jul 2023 • Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement

Using Machine Learning to amortize this expensive process could lower the cost of code coverage by requiring only the source code context, and the task of code coverage prediction can be a novel benchmark for judging the ability of models to understand code.

Paper
Code

Execution-based Evaluation for Data Science Code Generation Models

1 code implementation • 17 Nov 2022 • JunJie Huang, Chenglong Wang, Jipeng Zhang, Cong Yan, Haotian Cui, Jeevana Priya Inala, Colin Clement, Nan Duan, Jianfeng Gao

Code generation models can benefit data scientists' productivity by automatically generating code from context and text descriptions.

Code Generation Model Selection

Paper
Code

Exploring and Evaluating Personalized Models for Code Generation

no code implementations • 29 Aug 2022 • Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano

Large Transformer models achieved the state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code.

Code Generation Natural Language Understanding +1

Paper
Add Code

Learning to Reduce False Positives in Analytic Bug Detectors

no code implementations • 8 Mar 2022 • Anant Kharkar, Roshanak Zilouchian Moghaddam, Matthew Jin, Xiaoyu Liu, Xin Shi, Colin Clement, Neel Sundaresan

Due to increasingly complex software design and rapid iterative development, code defects and security vulnerabilities are prevalent in modern software.

Paper
Add Code

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

4 code implementations • 9 Feb 2021 • Shuai Lu, Daya Guo, Shuo Ren, JunJie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu

Benchmark datasets have a significant impact on accelerating research in programming language tasks.

Ranked #1 on Cloze Test on CodeXGLUE - CT-maxmin

BIG-bench Machine Learning Clone Detection +10

1,452

Paper
Code

GraphCodeBERT: Pre-training Code Representations with Data Flow

1 code implementation • ICLR 2021 • Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou

Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of "where-the-value-comes-from" between variables.

Ranked #3 on Type prediction on ManyTypes4TypeScript

Clone Detection Code Completion +7

2,023

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.