Search Results for author: Kensen Shi

Found 13 papers, 7 papers with code

Grounding Data Science Code Generation with Input-Output Specifications

no code implementations 12 Feb 2024 Yeming Wen, Pengcheng Yin, Kensen Shi, Henryk Michalewski, Swarat Chaudhuri, Alex Polozov

Specifically, we propose GIFT4Code, a novel approach for the instruction fine-tuning of LLMs with respect to I/O specifications.

Code Generation

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

no code implementations 26 Jul 2023 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks.

Program Synthesis

Natural Language to Code Generation in Interactive Data Science Notebooks

no code implementations 19 Dec 2022 Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton

To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks.

Code Generation · Language Modelling
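To illustrate the kind of problem ARCADE contains, here is a toy NL-to-code pair in the same spirit; the data, intent, and solution below are invented for illustration and are not taken from the benchmark itself.

```python
import pandas as pd

# Hypothetical notebook context (not an actual ARCADE problem).
df = pd.DataFrame({
    "city": ["Tokyo", "Paris", "Tokyo", "Paris"],
    "sales": [100, 80, 120, 90],
})

# NL intent: "total sales per city, highest first"
# A synthesized solution might look like:
result = df.groupby("city")["sales"].sum().sort_values(ascending=False)
print(result.to_dict())  # {'Tokyo': 220, 'Paris': 170}
```

The benchmark's actual problems are drawn from realistic data science notebooks, where the surrounding notebook cells supply additional context for the model.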

A Library for Representing Python Programs as Graphs for Machine Learning

1 code implementation 15 Aug 2022 David Bieber, Kensen Shi, Petros Maniatis, Charles Sutton, Vincent Hellendoorn, Daniel Johnson, Daniel Tarlow

Graph representations of programs are commonly a central element of machine learning for code research.
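As a rough sketch of the idea (using only Python's standard `ast` module, not the paper's library or its API), one simple program graph connects each syntax-tree node to its children; the library described in the paper provides richer encodings on top of this, such as control-flow and data-flow edges.

```python
import ast

def ast_graph(source):
    """Build a minimal program graph: nodes are AST node types,
    edges link each node to its children. (One of many possible
    encodings; shown only to illustrate programs-as-graphs.)"""
    tree = ast.parse(source)
    nodes, edges = [], []

    def visit(node):
        idx = len(nodes)
        nodes.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            edges.append((idx, visit(child)))
        return idx

    visit(tree)
    return nodes, edges

nodes, edges = ast_graph("x = 1 + 2")
print(nodes[0])  # 'Module'
```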

Compositional Generalization and Decomposition in Neural Program Synthesis

no code implementations 7 Apr 2022 Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton

We first characterize several axes along which program synthesis methods should generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data.

Program Synthesis

CrossBeam: Learning to Search in Bottom-Up Program Synthesis

1 code implementation ICLR 2022 Kensen Shi, Hanjun Dai, Kevin Ellis, Charles Sutton

Many approaches to program synthesis perform a search within an enormous space of programs to find one that satisfies a given specification.

Program Synthesis · Structured Prediction
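The baseline that CrossBeam builds on is plain bottom-up enumerative search: combine known values with DSL operations, check each candidate against the examples, and keep novel values. The sketch below shows only that baseline over a tiny invented integer DSL (`add`, `mul`, the input `x`, and the constant `1`); it is not the paper's learned search.

```python
import itertools

def bottom_up_search(inputs, target_outputs, max_size=3):
    """Bottom-up enumerative synthesis: values are keyed by their
    outputs on all example inputs, so semantically equivalent
    expressions are deduplicated. (CrossBeam replaces this
    exhaustive combination step with a learned policy.)"""
    ops = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
    values = {tuple(inputs): "x", (1,) * len(inputs): "1"}
    for _ in range(max_size):
        new = {}
        for (va, ea), (vb, eb) in itertools.product(values.items(), repeat=2):
            for name, fn in ops.items():
                out = tuple(fn(a, b) for a, b in zip(va, vb))
                expr = f"{name}({ea}, {eb})"
                if out == tuple(target_outputs):
                    return expr  # candidate satisfies all examples
                new.setdefault(out, expr)
        for out, expr in new.items():
            values.setdefault(out, expr)
    return None

# Find a program mapping x -> 2*x + 1 on the examples x = 2 and x = 5:
expr = bottom_up_search([2, 5], [5, 11])
print(expr)
```

Even on this toy DSL the candidate pool grows quickly with program size, which is the scaling problem a learned search policy targets.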

TF-Coder: Program Synthesis for Tensor Manipulations

2 code implementations NeurIPS Workshop CAP 2020 Kensen Shi, David Bieber, Rishabh Singh

The success and popularity of deep learning is on the rise, partially due to powerful deep learning frameworks such as TensorFlow and PyTorch that make it easier to develop deep learning models.

Enumerative Search

Incremental Sampling Without Replacement for Sequence Models

1 code implementation ICML 2020 Kensen Shi, David Bieber, Charles Sutton

Sampling is a fundamental technique, and sampling without replacement is often desirable when duplicate samples are not beneficial.

Combinatorial Optimization · Program Synthesis
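For context, the classic *batch* way to sample without replacement from a categorical distribution is the Gumbel-top-k trick: perturb each log-probability with independent Gumbel noise and take the k largest. The sketch below shows only that standard trick, not the paper's incremental scheme for sequence models.

```python
import math
import random

def gumbel_top_k(logprobs, k, seed=0):
    """Sample k distinct indices without replacement by adding
    Gumbel(0, 1) noise to each log-probability and taking the
    top k. (Classic batch technique; the paper's contribution
    is doing this incrementally for sequence models.)"""
    rng = random.Random(seed)
    keys = []
    for i, lp in enumerate(logprobs):
        g = -math.log(-math.log(rng.random()))  # Gumbel(0, 1) noise
        keys.append((lp + g, i))
    keys.sort(reverse=True)
    return [i for _, i in keys[:k]]

probs = [0.5, 0.3, 0.1, 0.1]
picked = gumbel_top_k([math.log(p) for p in probs], k=2)
print(picked)  # two distinct indices, higher-probability ones more likely
```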

Learning and Evaluating Contextual Embedding of Source Code

2 code implementations ICML 2020 Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi

We fine-tune CuBERT on our benchmark tasks, and compare the resulting models to different variants of Word2Vec token embeddings, BiLSTM and Transformer models, as well as published state-of-the-art models, showing that CuBERT outperforms them all, even with shorter training, and with fewer labeled examples.

Contextual Embedding for Source Code · Exception type · +5

Pre-trained Contextual Embedding of Source Code

no code implementations 25 Sep 2019 Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi

A major advancement in natural-language understanding has been the use of pre-trained token embeddings; BERT and other works have further shown that pre-trained contextual embeddings can be extremely powerful and can be finetuned effectively for a variety of downstream supervised tasks.

Natural Language Understanding

FrAngel: Component-Based Synthesis with Control Structures

2 code implementations 13 Nov 2018 Kensen Shi, Jacob Steinhardt, Percy Liang

We present FrAngel, a new approach to component-based synthesis that can synthesize short Java functions with control structures when given a desired signature, a set of input-output examples, and a collection of libraries (without formal specifications).

Programming Languages
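The component-based core of this setting can be sketched in a few lines: randomly compose library components and keep any composition consistent with the input-output examples. The Python toy below (components, examples, and function names are all invented here) shows only that core; FrAngel itself synthesizes Java functions and additionally inserts control structures such as loops and conditionals.

```python
import random

def components_search(components, examples, max_tries=20000, seed=0):
    """Random search over short chains of library components,
    checked against input-output examples. (Toy sketch of
    component-based synthesis; not FrAngel's actual algorithm.)"""
    rng = random.Random(seed)
    names = list(components)
    for _ in range(max_tries):
        chain = [rng.choice(names) for _ in range(rng.randint(1, 3))]
        try:
            if all(run_chain(chain, components, x) == y for x, y in examples):
                return chain
        except Exception:
            continue  # a component may not apply to every value
    return None

def run_chain(chain, components, value):
    for name in chain:
        value = components[name](value)
    return value

components = {
    "sorted": sorted,
    "reversed": lambda xs: list(reversed(xs)),
    "dedup": lambda xs: list(dict.fromkeys(xs)),
}
# Goal: sort descending with duplicates removed.
examples = [([3, 1, 3, 2], [3, 2, 1])]
solution = components_search(components, examples)
print(solution)
```

A single input-output example underconstrains the task, which is why FrAngel accepts a set of examples plus a desired signature.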
