no code implementations • 26 Jul 2023 • Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton
When writing programs, people have the ability to tackle a new complex task by decomposing it into smaller and more familiar subtasks.
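As a hypothetical illustration of this kind of decomposition (ours, not the paper's algorithm), a string-transformation task can be solved by composing small, familiar subtasks:

```python
# Hypothetical illustration: solving "john smith" -> "J.S." by composing
# smaller, familiar subtasks rather than one monolithic program.

def split_words(s):          # subtask 1: tokenize the input
    return s.split(" ")

def take_initial(word):      # subtask 2: first character, uppercased
    return word[0].upper()

def join_with_dots(parts):   # subtask 3: recombine the pieces
    return ".".join(parts) + "."

def solve(s):
    return join_with_dots([take_initial(w) for w in split_words(s)])

assert solve("john smith") == "J.S."
```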
no code implementations • 26 May 2023 • Ruoxi Sun, Sercan O. Arik, Hootan Nakhost, Hanjun Dai, Rajarishi Sinha, Pengcheng Yin, Tomas Pfister
One impressive emergent capability of large language models (LLMs) is generation of code, including Structured Query Language (SQL) for databases.
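As a hedged sketch of how such text-to-SQL generation is commonly prompted (the prompt format and the `call_llm` helper below are illustrative assumptions, not the paper's exact setup):

```python
# Sketch of schema-aware text-to-SQL prompting. `call_llm` is a hypothetical
# stand-in for whatever LLM API is available; it is not from the paper.

def build_prompt(schema: str, question: str) -> str:
    return (
        "Given the database schema below, write a SQL query that answers "
        "the question.\n\n"
        f"Schema:\n{schema}\n\n"
        f"Question: {question}\nSQL:"
    )

schema = "CREATE TABLE singer (singer_id INT, name TEXT, country TEXT);"
prompt = build_prompt(schema, "How many singers are from France?")
# sql = call_llm(prompt)  # expected output, e.g.:
# SELECT COUNT(*) FROM singer WHERE country = 'France';
```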
no code implementations • 17 May 2023 • Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu
Through extensive evaluations on English and multilingual language and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.
Ranked #1 on Question Answering on TriviaQA (using extra training data)
no code implementations • 19 Dec 2022 • Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton
To measure the performance of AI pair programmers that automatically synthesize programs for such tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks.
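A hypothetical problem in the spirit of the benchmark, pairing an NL intent with pandas code in a notebook context (this specific task is not drawn from ARCADE):

```python
# Given a dataframe already in the notebook context, turn an NL intent
# into pandas code.
import pandas as pd

df = pd.DataFrame({
    "city": ["Paris", "Lyon", "Paris", "Nice"],
    "sales": [120, 80, 200, 50],
})

# Intent: "total sales per city, highest first"
answer = df.groupby("city")["sales"].sum().sort_values(ascending=False)
print(answer)
```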
1 code implementation • 19 Jul 2022 • Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, Pengcheng Yin
Recent advances in data processing have stimulated the demand for learning graphs of very large scales.
no code implementations • 7 Apr 2022 • Kensen Shi, Joey Hong, Manzil Zaheer, Pengcheng Yin, Charles Sutton
We first characterize several different axes along which program synthesis methods would be desired to generalize, e.g., length generalization, or the ability to combine known subroutines in new ways that do not occur in the training data.
5 code implementations • Google Research 2022 • Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel
To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated Transformer language model, which we call the Pathways Language Model (PaLM).
Ranked #1 on Natural Language Inference on RTE
1 code implementation • ACL 2022 • Shuyan Zhou, Li Zhang, Yue Yang, Qing Lyu, Pengcheng Yin, Chris Callison-Burch, Graham Neubig
To this end, we develop a simple and efficient method that links steps (e.g., "purchase a camera") in an article to other articles with similar goals (e.g., "how to choose a camera"), recursively constructing the KB.
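A minimal sketch of this recursive construction, where `get_steps` and `find_article_with_goal` are hypothetical stand-ins for the paper's extraction and retrieval components:

```python
# Minimal sketch of recursive KB construction by step-to-article linking.
def build_kb(root_article, get_steps, find_article_with_goal, kb=None):
    kb = kb if kb is not None else {}
    if root_article in kb:
        return kb                             # already expanded; avoid cycles
    kb[root_article] = []
    for step in get_steps(root_article):      # e.g., "purchase a camera"
        linked = find_article_with_goal(step) # e.g., "how to choose a camera"
        if linked is not None:
            kb[root_article].append((step, linked))
            build_kb(linked, get_steps, find_article_with_goal, kb)
    return kb
```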
1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu
Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.
Ranked #1 on Task-Oriented Dialogue Systems on KVRET
no code implementations • ACL 2022 • Pengcheng Yin, John Wieting, Avirup Sil, Graham Neubig
Semantic parsers map natural language utterances into meaning representations (e.g., programs).
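An illustrative utterance-to-program pair of the kind such parsers target (this example is ours, not from the paper):

```python
# Illustrative semantic parse: NL utterance -> executable meaning representation.
utterance = "show me flights from Pittsburgh to Seattle"
program = 'SELECT * FROM flight WHERE origin = "PIT" AND dest = "SEA"'
```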
no code implementations • 28 Sep 2021 • Alex Shypula, Pengcheng Yin, Jeremy Lacomis, Claire Le Goues, Edward Schwartz, Graham Neubig
We also report that SILO's rate of superoptimization on our test set is over five times that of a standard policy gradient approach and a model pre-trained on compiler optimization demonstrations.
no code implementations • NAACL (SUKI) 2022 • Shuyan Zhou, Pengcheng Yin, Graham Neubig
When humans conceive how to perform a particular task, they do so hierarchically: splitting higher-level tasks into smaller sub-tasks.
no code implementations • NAACL 2021 • Pengcheng Yin, Hao Fang, Graham Neubig, Adam Pauls, Emmanouil Antonios Platanios, Yu Su, Sam Thomson, Jacob Andreas
We describe a span-level supervised attention loss that improves compositional generalization in semantic parsers.
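A minimal PyTorch-style sketch of one way a span-level attention loss could be written, assuming the supervision is a binary mask over a source span (the paper's exact formulation may differ):

```python
# Sketch: push the decoder's attention mass for a target token onto an
# annotated source span. Assumption-laden illustration, not the paper's loss.
import torch

attn = torch.softmax(torch.randn(1, 6), dim=-1)        # attention over 6 source tokens
span_mask = torch.tensor([[0., 1., 1., 0., 0., 0.]])   # supervised source span

mass_on_span = (attn * span_mask).sum(dim=-1)          # attention mass inside the span
loss = -torch.log(mass_on_span + 1e-9).mean()          # maximize mass on the span
```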
1 code implementation • ICLR 2021 • Ziyu Yao, Frank F. Xu, Pengcheng Yin, Huan Sun, Graham Neubig
To show the unique benefits of modeling tree edits directly, we further propose a novel edit encoder for learning to represent edits, as well as an imitation learning method that allows the editor to be more robust.
1 code implementation • ACL 2020 • Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
Recent years have witnessed the burgeoning of pretrained language models (LMs) for text-based natural language (NL) understanding tasks.
Ranked #6 on Text-To-SQL on Spider
2 code implementations • ACL 2020 • Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
Open-domain code generation aims to generate code in a general-purpose programming language (such as Python) from natural language (NL) intents.
Ranked #3 on Code Generation on CoNaLa-Ext
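An illustrative (intent, snippet) pair in the style of such open-domain code generation data (this example is ours, not from the dataset):

```python
# Illustrative CoNaLa-style pair: NL intent and the Python snippet it maps to.
intent = "sort a dictionary d by its values in descending order"
snippet = "sorted(d.items(), key=lambda kv: kv[1], reverse=True)"

d = {"a": 1, "b": 3, "c": 2}
print(eval(snippet))   # [('b', 3), ('c', 2), ('a', 1)]
```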
1 code implementation • 29 Nov 2019 • Ansong Ni, Pengcheng Yin, Graham Neubig
Experiments on WikiTableQuestions with human annotators show that our method can improve the performance with only 100 active queries, especially for weakly-supervised parsers learnt from a cold start.
no code implementations • ACL 2019 • Pengcheng Yin, Graham Neubig
Semantic parsing considers the task of transducing natural language (NL) utterances into machine executable meaning representations (MRs).
Ranked #4 on
Code Generation
on Django
1 code implementation • ACL 2019 • Zhengbao Jiang, Pengcheng Yin, Graham Neubig
We found that the extraction likelihood, a confidence measure used by current supervised open IE systems, is not well calibrated when comparing the quality of assertions extracted from different sentences.
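One standard way to quantify such miscalibration is expected calibration error; the sketch below is a generic check, not necessarily the paper's exact measure:

```python
# Generic calibration check: bin assertions by confidence and compare each
# bin's average confidence against its empirical accuracy.
import numpy as np

def expected_calibration_error(conf, correct, n_bins=10):
    conf, correct = np.asarray(conf), np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            gap = abs(conf[in_bin].mean() - correct[in_bin].mean())
            ece += in_bin.mean() * gap   # weight by fraction of points in bin
    return ece
```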
2 code implementations • ICLR 2019 • Pengcheng Yin, Graham Neubig, Miltiadis Allamanis, Marc Brockschmidt, Alexander L. Gaunt
We introduce the problem of learning distributed representations of edits.
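Before learning a distributed representation, an edit must be serialized somehow; a simple, non-neural illustration of the edit-as-sequence view, using Python's standard difflib (the paper learns neural encoders over such edits):

```python
# Serialize an edit as a token-level diff tag sequence.
import difflib

def edit_sequence(before, after):
    sm = difflib.SequenceMatcher(a=before, b=after)
    return [(tag, before[i1:i2], after[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes()]

print(edit_sequence("x = a + b".split(), "y = a - b".split()))
# [('replace', ['x'], ['y']), ('equal', ['=', 'a'], ['=', 'a']),
#  ('replace', ['+'], ['-']), ('equal', ['b'], ['b'])]
```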
4 code implementations • EMNLP 2018 • Pengcheng Yin, Graham Neubig
We present TRANX, a transition-based neural semantic parser that maps natural language (NL) utterances into formal meaning representations (MRs).
Ranked #2 on Semantic Parsing on ATIS
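A heavily simplified sketch of the transition-based view: a derivation is a sequence of actions that either apply an AST constructor or emit a token. The real system is driven by an ASDL grammar; the action names below are illustrative:

```python
# Executing these actions left-to-right reconstructs the AST for
# `sorted(my_list)`. Simplified illustration of TRANX-style transitions.
actions = [
    ("ApplyConstr", "Call"),   # expand the frontier node into a Call node
    ("ApplyConstr", "Name"),   # the function being called
    ("GenToken", "sorted"),    # terminal token
    ("ApplyConstr", "Name"),   # the first argument
    ("GenToken", "my_list"),
]
```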
1 code implementation • EMNLP 2018 • Shirley Anugrah Hayati, Raphael Olivier, Pravalika Avvaru, Pengcheng Yin, Anthony Tomasic, Graham Neubig
In models that generate program source code from natural language, representing the code as a tree structure has been a common approach.
1 code implementation • EMNLP 2018 • Xinyi Wang, Hieu Pham, Pengcheng Yin, Graham Neubig
Recent advances in Neural Machine Translation (NMT) show that adding syntactic information to NMT systems can improve the quality of their translations.
6 code implementations • ACL 2018 • Pengcheng Yin, Chunting Zhou, Junxian He, Graham Neubig
Semantic parsing is the task of transducing natural language (NL) utterances into formal meaning representations (MRs), commonly represented as tree structures.
no code implementations • 23 May 2018 • Pengcheng Yin, Bowen Deng, Edgar Chen, Bogdan Vasilescu, Graham Neubig
For tasks like code synthesis from natural language, code retrieval, and code summarization, data-driven models have shown great promise.
no code implementations • ICLR 2018 • Xuezhe Ma, Pengcheng Yin, Jingzhou Liu, Graham Neubig, Eduard Hovy
Reward augmented maximum likelihood (RAML), a simple and effective learning framework to directly optimize towards the reward function in structured prediction tasks, has led to a number of impressive empirical successes.
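For reference, the RAML objective (Norouzi et al., 2016) trains the model on samples from an exponentiated payoff distribution rather than only the ground truth:

```latex
\mathcal{L}_{\mathrm{RAML}}(\theta)
  = - \sum_{(x,\, y^*)} \sum_{y \in \mathcal{Y}}
      q(y \mid y^*; \tau)\, \log p_\theta(y \mid x),
\qquad
q(y \mid y^*; \tau)
  = \frac{\exp\{ r(y, y^*) / \tau \}}{\sum_{y'} \exp\{ r(y', y^*) / \tau \}}
```

Here r(y, y*) is the task reward and τ a temperature; as τ → 0 the objective recovers ordinary maximum likelihood training.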
6 code implementations • ACL 2017 • Pengcheng Yin, Graham Neubig
We consider the problem of parsing natural language descriptions into source code written in a general-purpose programming language like Python.
4 code implementations • 15 Jan 2017 • Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin
In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.
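By contrast, under dynamic declaration a fresh graph is built per example by running ordinary Python, so control flow can reshape the network on the fly. A sketch in the DyNet flavor (exact API names may vary across versions; treat this as an assumption-laden illustration):

```python
# Dynamic declaration: the computation graph is (re)defined by running code,
# once per example, rather than compiled ahead of time.
import dynet as dy

m = dy.ParameterCollection()
trainer = dy.SimpleSGDTrainer(m)
W = m.add_parameters((3, 4))
b = m.add_parameters(3)

for x_val, y_idx in [([0.1, 0.2, 0.3, 0.4], 1)]:
    dy.renew_cg()                          # new graph for this example
    x = dy.inputVector(x_val)
    logits = W * x + b                     # graph structure follows the code path
    loss = dy.pickneglogsoftmax(logits, y_idx)
    loss.value()                           # run forward
    loss.backward()                        # then backward
    trainer.update()
```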
no code implementations • 3 Dec 2015 • Pengcheng Yin, Zhengdong Lu, Hang Li, Ben Kao
Neural Enquirer can be trained end-to-end with gradient descent, learning from scratch not only the parameters of the controlling and semantic parsing components but also the embeddings of the tables and query words.