no code implementations • 19 Dec 2022 • Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton
To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks.
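As an illustration of the kind of problem ARCADE targets, a hypothetical NL intent paired with a pandas solution might look like the following (the dataframe, intent, and solution are invented for illustration and are not taken from the benchmark):

```python
import pandas as pd

# Illustrative notebook context: a small dataframe of sales records.
df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "sales":  [100, 150, 200, 50],
})

# NL intent: "What is the total sales per region, sorted descending?"
result = df.groupby("region")["sales"].sum().sort_values(ascending=False)
print(result.index.tolist())  # ['north', 'south']
```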
no code implementations • 31 Oct 2022 • Spyridon Mouselinos, Mateusz Malinowski, Henryk Michalewski
This work shows that current code generation systems exhibit undesired biases inherited from their large language model backbones, which can reduce the quality of the generated code under specific circumstances.
1 code implementation • 21 Jul 2022 • David Dohan, Winnie Xu, Aitor Lewkowycz, Jacob Austin, David Bieber, Raphael Gontijo Lopes, Yuhuai Wu, Henryk Michalewski, Rif A. Saurous, Jascha Sohl-Dickstein, Kevin Murphy, Charles Sutton
Prompted models have demonstrated impressive few-shot learning abilities.
no code implementations • 29 Jun 2022 • Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski, Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur, Guy Gur-Ari, Vedant Misra
Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding.
Ranked #1 on Multi-task Language Understanding on MMLU (STEM metric)
1 code implementation • 30 May 2022 • Kuang-Huei Lee, Ofir Nachum, Mengjiao Yang, Lisa Lee, Daniel Freeman, Winnie Xu, Sergio Guadarrama, Ian Fischer, Eric Jang, Henryk Michalewski, Igor Mordatch
Specifically, we show that a single transformer-based model - with a single set of weights - trained purely offline can play a suite of up to 46 Atari games simultaneously at close-to-human performance.
5 code implementations • Google Research 2022 • Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel
To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call the Pathways Language Model (PaLM).
Ranked #1 on Common Sense Reasoning on BIG-bench (Known Unknowns)
no code implementations • 24 Feb 2022 • Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski
Visual question answering provides a convenient framework for testing a model's abilities by interrogating it with questions about a scene.
no code implementations • 30 Nov 2021 • Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton, Augustus Odena
Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs.
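The contrast between one-pass answers and step-by-step "scratchpad" outputs can be illustrated with a multi-step computation that emits its intermediate steps (the trace format below is a sketch, not the paper's exact one):

```python
# Scratchpad-style trace for a multi-step computation: instead of
# producing the answer "in one pass", intermediate steps are written
# out explicitly, digit by digit with carries.
def scratchpad_add(digits_a, digits_b):
    steps, carry, out = [], 0, []
    for a, b in zip(reversed(digits_a), reversed(digits_b)):
        s = a + b + carry
        carry, d = divmod(s, 10)
        out.append(d)
        steps.append(f"{a}+{b}+carry -> digit {d}, carry {carry}")
    if carry:
        out.append(carry)
    return list(reversed(out)), steps

result, trace = scratchpad_add([2, 9], [1, 3])  # 29 + 13
print(result)  # [4, 2]
```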
no code implementations • NeurIPS 2021 • Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Łukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva
We study sparse variants for all layers in the Transformer and propose Scaling Transformers, a family of next generation Transformer models that use sparse layers to scale efficiently and perform unbatched decoding much faster than the standard Transformer as we scale up the model size.
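The block-sparse feedforward idea can be sketched as routing each token to a single block of the FFN so that only that block is computed (the routing below is a random stand-in for the learned controller, and all sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_blocks = 8, 32, 4
block = d_ff // n_blocks

W1 = rng.standard_normal((d_model, d_ff))
W2 = rng.standard_normal((d_ff, d_model))
router = rng.standard_normal((d_model, n_blocks))  # toy controller weights

def sparse_ffn(x):
    # Pick one FFN block per token (the paper learns this routing;
    # here a fixed random projection stands in for the controller).
    idx = int(np.argmax(x @ router))
    lo, hi = idx * block, (idx + 1) * block
    h = np.maximum(x @ W1[:, lo:hi], 0.0)  # compute only the chosen block
    return h @ W2[lo:hi, :]                # project back to d_model

y = sparse_ffn(rng.standard_normal(d_model))
print(y.shape)  # (8,)
```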
1 code implementation • 22 Nov 2021 • Michał Zawalski, Błażej Osiński, Henryk Michalewski, Piotr Miłoś
The key advantage of our algorithm is its high scalability in a multi-worker setting.
Multi-agent Reinforcement Learning • Reinforcement Learning • +2
3 code implementations • Findings (NAACL) 2022 • Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski
Transformer models yield impressive results on many NLP and sequence modeling tasks.
Ranked #4 on Image Generation on ImageNet 32x32 (bpd metric)
no code implementations • ICLR 2022 • Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski
To answer such a question, we extend the visual question answering framework and propose the following behavioral test in the form of a two-player game.
1 code implementation • 16 Aug 2021 • Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton
Our largest models, even without finetuning on a code dataset, can synthesize solutions to 59.6 percent of the problems from MBPP using few-shot learning with a well-designed prompt.
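Few-shot prompting for program synthesis amounts to concatenating worked problem/solution pairs with the new problem into a single prompt (the template below is illustrative, not the paper's exact format):

```python
# Illustrative few-shot prompt assembly for program synthesis.
examples = [
    ("Write a function to add two numbers.",
     "def add(a, b):\n    return a + b"),
    ("Write a function to check if a number is even.",
     "def is_even(n):\n    return n % 2 == 0"),
]
new_task = "Write a function to reverse a string."

parts = []
for desc, solution in examples:
    parts.append(f"# Task: {desc}\n{solution}")
parts.append(f"# Task: {new_task}")  # the model completes this part
prompt = "\n\n".join(parts)
print(prompt.count("# Task:"))  # 3
```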
no code implementations • ACL 2021 • Piotr Piękos, Mateusz Malinowski, Henryk Michalewski
How many fruits do you have in total?
1 code implementation • 12 Feb 2021 • Piotr Kozakowski, Łukasz Kaiser, Henryk Michalewski, Afroz Mohiuddin, Katarzyna Kańska
QWR is an extension of Advantage Weighted Regression (AWR), an off-policy actor-critic algorithm that performs very well on continuous control tasks, including in the offline setting, but has low sample efficiency and struggles with high-dimensional observation spaces.
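The core of AWR can be sketched as a regression toward observed actions weighted by exponentiated advantages (a simplified sketch: `beta` is the usual temperature hyperparameter, and the advantage values are made up):

```python
import numpy as np

# Advantage-weighted regression, reduced to its core: actions with
# higher advantage get exponentially larger weight in the policy fit.
advantages = np.array([1.0, -0.5, 2.0])
beta = 1.0  # temperature controlling how greedy the weighting is

weights = np.exp(advantages / beta)
weights /= weights.sum()  # normalize for a weighted regression loss

# The highest-advantage action receives the largest weight.
assert np.argmax(weights) == np.argmax(advantages)
```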
no code implementations • 16 Dec 2020 • Błażej Osiński, Piotr Miłoś, Adam Jakubowski, Paweł Zięcina, Michał Martyniak, Christopher Galias, Antonia Breuer, Silviu Homoceanu, Henryk Michalewski
This work introduces interactive traffic scenarios in the CARLA simulator, which are based on real-world traffic.
no code implementations • 27 May 2020 • Sebastian Jaszczur, Michał Łuszczyk, Henryk Michalewski
We use neural graph networks with a message-passing architecture and an attention mechanism to enhance the branching heuristic in two SAT-solving algorithms.
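One round of message passing on a clause-variable graph can be sketched as follows (the incidence matrix encodes a toy two-clause formula, and a single fixed matrix stands in for the learned networks and the attention mechanism):

```python
import numpy as np

# Toy message-passing round for the formula (x1 or x2) and (not x1 or x2):
# 2 clauses over 2 variables, so a 2x2 clause-variable incidence matrix.
incidence = np.array([[1.0, 1.0],   # clause 1 touches x1, x2
                      [1.0, 1.0]])  # clause 2 touches x1, x2

var_emb = np.ones((2, 4))            # one embedding per variable
W = np.full((4, 4), 0.25)            # stand-in for the learned transform

clause_msg = incidence @ (var_emb @ W)           # variables -> clauses
var_emb_new = np.tanh(incidence.T @ clause_msg)  # clauses -> variables

print(var_emb_new.shape)  # (2, 4)
```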
no code implementations • ICLR 2020 • Łukasz Kaiser, Mohammad Babaeizadeh, Piotr Miłos, Błażej Osiński, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.
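The SimPLe-style loop alternates between collecting real experience, fitting a world model on it, and improving the policy inside that model; schematically (with trivial stand-ins for every component, including the video-prediction model):

```python
import random

real_data = []

def collect(policy, n=10):
    # Stand-in for interacting with the real environment.
    return [(random.random(), policy(None)) for _ in range(n)]

def fit_world_model(data):
    # Stand-in "model": replays the mean observation for any action.
    mean_obs = sum(obs for obs, _ in data) / len(data)
    return lambda action: mean_obs

def train_policy_in_model(model, policy):
    # Stand-in policy-improvement step inside the learned model.
    return policy

policy = lambda obs: 0
for iteration in range(3):
    real_data += collect(policy)
    model = fit_world_model(real_data)
    policy = train_policy_in_model(model, policy)

print(len(real_data))  # 30
```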
1 code implementation • 29 Nov 2019 • Błażej Osiński, Adam Jakubowski, Piotr Miłoś, Paweł Zięcina, Christopher Galias, Silviu Homoceanu, Henryk Michalewski
Using reinforcement learning in simulation and synthetic data is motivated by lowering costs and engineering effort.
1 code implementation • 30 May 2019 • Zsolt Zombori, Adrián Csiszárik, Henryk Michalewski, Cezary Kaliszyk, Josef Urban
We present a reinforcement learning (RL) based guidance system for automated theorem proving geared towards Finding Longer Proofs (FLoP).
2 code implementations • 1 Mar 2019 • Łukasz Kaiser, Mohammad Babaeizadeh, Piotr Miłoś, Błażej Osiński, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
Ranked #8 on Atari Games 100k on Atari 100k
2 code implementations • 10 Sep 2018 • Michał Garmulewicz, Henryk Michalewski, Piotr Miłoś
We propose an expert-augmented actor-critic algorithm, which we evaluate on two environments with sparse rewards: Montezuma's Revenge and a demanding maze from the ViZDoom suite.
no code implementations • NeurIPS 2018 • Cezary Kaliszyk, Josef Urban, Henryk Michalewski, Mirek Olšák
The strongest version of the system is trained on a large corpus of mathematical problems and evaluated on previously unseen problems.
2 code implementations • 2 Apr 2018 • Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, Jennifer Hicks, Sergey Levine, Marcel Salathé, Scott Delp
In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course.
1 code implementation • 9 Jan 2018 • Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, Henryk Michalewski
We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C).
Ranked #20 on Atari Games on Atari 2600 Boxing
no code implementations • 19 May 2017 • Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski
The asynchronous nature of state-of-the-art reinforcement learning algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes them exceptionally suitable for CPU computations.
3 code implementations • 4 May 2016 • Jakub Sygnowski, Henryk Michalewski
We train a number of neural networks to play the games Bowling, Breakout, and Seaquest using information stored in the memory of an Atari 2600 video game console.
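Learning from console memory means the observation is the Atari 2600's 128 bytes of RAM rather than pixels; a minimal policy over that input can be sketched as follows (the network and the full action-set size of 18 are illustrative choices, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
RAM_BYTES = 128   # the Atari 2600 exposes 128 bytes of RAM
N_ACTIONS = 18    # full Atari action set

W1 = rng.standard_normal((RAM_BYTES, 64)) * 0.01
W2 = rng.standard_normal((64, N_ACTIONS)) * 0.01

def policy(ram_state):
    # ram_state: raw byte values, scaled to [0, 1] before the MLP.
    h = np.maximum(ram_state / 255.0 @ W1, 0.0)
    return int(np.argmax(h @ W2))

action = policy(rng.integers(0, 256, size=RAM_BYTES))
assert 0 <= action < N_ACTIONS
```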