2 code implementations • ICML 2020 • Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
We fine-tune CuBERT on our benchmark tasks and compare the resulting models to several variants of Word2Vec token embeddings, BiLSTM and Transformer models, as well as published state-of-the-art models, showing that CuBERT outperforms them all, even with shorter training and fewer labeled examples.
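As a rough illustration of this fine-tuning setup (a minimal sketch, not the paper's code), the snippet below fine-tunes a generic BERT-style encoder on a toy code-classification batch. `bert-base-uncased` is a stand-in: CuBERT ships its own vocabulary and checkpoints, which are not assumed here.

```python
# Sketch: fine-tune a BERT-style encoder for binary code classification,
# in the spirit of CuBERT fine-tuning. Model name and labels are illustrative.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

snippets = ["def f(x): return x + 1", "def g(y): return y - 1"]
labels = torch.tensor([0, 1])  # e.g., buggy vs. correct (hypothetical task)

batch = tokenizer(snippets, padding=True, truncation=True, return_tensors="pt")
loss = model(**batch, labels=labels).loss  # one training step
loss.backward()
optimizer.step()
```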
5 code implementations • Google Research 2022 • Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel
To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion-parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM).
Ranked #1 on Coreference Resolution (Winograd Schema Challenge)
1 code implementation • 15 Aug 2022 • David Bieber, Kensen Shi, Petros Maniatis, Charles Sutton, Vincent Hellendoorn, Daniel Johnson, Daniel Tarlow
Graph representations of programs are commonly used as a central element of machine learning research on code.
2 code implementations • NeurIPS Workshop CAP 2020 • Kensen Shi, David Bieber, Rishabh Singh
The success and popularity of deep learning are on the rise, thanks in part to powerful frameworks such as TensorFlow and PyTorch that make it easier to develop deep learning models.
1 code implementation • ICLR 2022 • Kensen Shi, Hanjun Dai, Kevin Ellis, Charles Sutton
Many approaches to program synthesis perform a search within an enormous space of programs to find one that satisfies a given specification.
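For context, here is a minimal sketch of the kind of enumerative search such methods improve upon: bottom-up enumeration over arithmetic expressions, pruned by value equivalence and checked against input-output examples. All identifiers are illustrative, not from the paper.

```python
# Sketch: naive bottom-up enumerative program search over {x, 1, +, *},
# deduplicating programs by their values on the example inputs.
import itertools

def bottom_up_search(inputs, outputs, max_size=3):
    # programs[size] maps a value signature (outputs on all inputs) -> program
    programs = {1: {tuple(inputs): "x", tuple([1] * len(inputs)): "1"}}
    for size in range(2, max_size + 1):
        programs[size] = {}
        for ls, rs in ((a, size - 1 - a) for a in range(1, size - 1)):
            for (lv, lp), (rv, rp) in itertools.product(
                    programs[ls].items(), programs[rs].items()):
                for op, fn in (("+", lambda a, b: a + b),
                               ("*", lambda a, b: a * b)):
                    vals = tuple(fn(a, b) for a, b in zip(lv, rv))
                    expr = f"({lp} {op} {rp})"
                    if vals == tuple(outputs):
                        return expr  # satisfies all examples
                    programs[size].setdefault(vals, expr)
    return None

print(bottom_up_search([1, 2, 3], [2, 4, 6]))  # -> "(x + x)"
```

Even this tiny grammar makes the exponential growth of the search space apparent, which is what motivates learned guidance.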
2 code implementations • 13 Nov 2018 • Kensen Shi, Jacob Steinhardt, Percy Liang
We present FrAngel, a new approach to component-based synthesis that can synthesize short Java functions with control structures when given a desired signature, a set of input-output examples, and a collection of libraries (without formal specifications).
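To make that specification concrete, here is a hypothetical FrAngel-style task description (an illustrative format only, not the tool's actual input syntax):

```python
# Sketch: a component-based synthesis specification consisting of a desired
# signature, input-output examples, and usable library components.
spec = {
    "signature": "List<Integer> dedup(List<Integer> xs)",  # desired Java signature
    "examples": [                                          # input-output pairs
        {"input": [1, 1, 2, 3, 3], "output": [1, 2, 3]},
        {"input": [], "output": []},
    ],
    "components": [                                        # no formal specs needed
        "java.util.ArrayList",
        "java.util.List.contains",
        "java.util.List.add",
    ],
}
```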
1 code implementation • ICML 2020 • Kensen Shi, David Bieber, Charles Sutton
Sampling is a fundamental technique, and sampling without replacement is often desirable when duplicate samples are not beneficial.
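As one standard way to realize this (a minimal sketch, not the paper's incremental algorithm), the snippet below draws k distinct samples from a categorical distribution via the Gumbel-top-k trick:

```python
# Sketch: sampling k items without replacement from a categorical
# distribution by perturbing log-probabilities with Gumbel noise and
# taking the top k.
import math, random

def sample_without_replacement(log_probs, k):
    keys = [lp - math.log(-math.log(random.random())) for lp in log_probs]
    return sorted(range(len(log_probs)), key=lambda i: -keys[i])[:k]

log_probs = [math.log(p) for p in (0.5, 0.3, 0.2)]
print(sample_without_replacement(log_probs, 2))  # e.g., [0, 2]
```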
no code implementations • ICLR 2021 • Augustus Odena, Kensen Shi, David Bieber, Rishabh Singh, Charles Sutton, Hanjun Dai
Program synthesis is challenging largely because of the difficulty of search in a large space of programs.
no code implementations • 25 Sep 2019 • Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
A major advancement in natural-language understanding has been the use of pre-trained token embeddings; BERT and other works have further shown that pre-trained contextual embeddings can be extremely powerful and can be fine-tuned effectively for a variety of downstream supervised tasks.
no code implementations • 7 Apr 2022 • Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton
We first characterize several axes along which program synthesis methods should generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data.
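As a concrete example of one such axis, a length-generalization split trains only on short programs and tests on strictly longer ones; the sketch below assumes a simple line-count notion of program length (an illustrative choice, not the paper's):

```python
# Sketch: construct a length-generalization split, where training programs
# are short and test programs are strictly longer.
def length_split(programs, max_train_len=5):
    def program_length(p):
        return len(p.splitlines())
    train = [p for p in programs if program_length(p) <= max_train_len]
    test = [p for p in programs if program_length(p) > max_train_len]
    return train, test
```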
no code implementations • 19 Dec 2022 • Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton
To measure the performance of AI pair programmers that automatically synthesize programs for such tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks.
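The example below illustrates the flavor of such a task (a hypothetical item, not drawn from the benchmark): a natural-language intent over a notebook dataframe, paired with the pandas program a model should produce.

```python
# Sketch: an ARCADE-style NL-to-pandas problem in a notebook setting.
import pandas as pd

df = pd.DataFrame({"city": ["NYC", "NYC", "LA"], "sales": [10, 20, 30]})

# NL intent: "Get total sales per city, sorted from highest to lowest."
answer = df.groupby("city")["sales"].sum().sort_values(ascending=False)
print(answer)  # NYC: 30, LA: 30 ordering per the data
```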
no code implementations • 26 Jul 2023 • Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton
When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks.
no code implementations • 12 Feb 2024 • Yeming Wen, Pengcheng Yin, Kensen Shi, Henryk Michalewski, Swarat Chaudhuri, Alex Polozov
We propose GIFT4Code, a novel approach to instruction fine-tuning of LLMs with respect to I/O specifications.
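To illustrate what conditioning on I/O specifications might look like, here is a hypothetical training record (an illustrative schema, not GIFT4Code's actual data format):

```python
# Sketch: an instruction-tuning example augmented with an I/O specification
# that the fine-tuned LLM should respect. All fields are hypothetical.
example = {
    "instruction": "Return the top-3 rows of df by the 'score' column.",
    "io_spec": {
        "inputs": {"df": "pandas.DataFrame with columns ['name', 'score']"},
        "output": "pandas.DataFrame with 3 rows, sorted by 'score' descending",
    },
    "target_code": "df.nlargest(3, 'score')",
}
```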