Search Results for author: Henryk Michalewski

Found 28 papers, 14 papers with code

Natural Language to Code Generation in Interactive Data Science Notebooks

no code implementations19 Dec 2022 Pengcheng Yin, Wen-Ding Li, Kefan Xiao, Abhishek Rao, Yeming Wen, Kensen Shi, Joshua Howland, Paige Bailey, Michele Catasta, Henryk Michalewski, Alex Polozov, Charles Sutton

To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks.

Code Generation Language Modelling
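
To illustrate the kind of problem ARCADE targets, here is a hypothetical task of the same shape (not taken from the benchmark): a natural-language intent paired with notebook context and one acceptable pandas solution.

```python
import pandas as pd

# Hypothetical ARCADE-style task: notebook context, an NL intent, and a
# reference pandas solution that a code-generation model should reproduce.
context = pd.DataFrame({
    "city": ["Warsaw", "Krakow", "Warsaw", "Gdansk"],
    "sales": [120, 80, 200, 50],
})

intent = "Total sales per city, sorted from highest to lowest."

# Reference solution (one of many acceptable pandas programs).
reference = (
    context.groupby("city")["sales"]
    .sum()
    .sort_values(ascending=False)
)

print(reference)
```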

A Simple, Yet Effective Approach to Finding Biases in Code Generation

no code implementations31 Oct 2022 Spyridon Mouselinos, Mateusz Malinowski, Henryk Michalewski

This work shows that current code generation systems exhibit undesired biases inherited from their large language model backbones, which can reduce the quality of the generated code under specific circumstances.

Causal Language Modeling Code Generation +1

Multi-Game Decision Transformers

1 code implementation30 May 2022 Kuang-Huei Lee, Ofir Nachum, Mengjiao Yang, Lisa Lee, Daniel Freeman, Winnie Xu, Sergio Guadarrama, Ian Fischer, Eric Jang, Henryk Michalewski, Igor Mordatch

Specifically, we show that a single transformer-based model - with a single set of weights - trained purely offline can play a suite of up to 46 Atari games simultaneously at close-to-human performance.

Atari Games Offline RL
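
A minimal sketch of the decision-transformer-style data format such a model is trained on; the exact tokenization and field names in the paper differ, so treat the ones below as assumptions.

```python
import numpy as np

# Hypothetical offline trajectory from one Atari game: per-step observations,
# actions, and rewards.
T = 4
observations = [np.zeros((84, 84), dtype=np.uint8) for _ in range(T)]
actions = [2, 3, 3, 0]
rewards = [0.0, 1.0, 0.0, 1.0]

# Decision-transformer-style conditioning: each step is preceded by the
# return-to-go (sum of future rewards), so the model can be prompted with a
# high target return at inference time.
returns_to_go = np.cumsum(rewards[::-1])[::-1]

sequence = []
for rtg, obs, act in zip(returns_to_go, observations, actions):
    sequence.append({"return_to_go": float(rtg), "observation": obs, "action": act})

# A causal transformer is then trained to predict each action given the
# interleaved (return-to-go, observation, action) history.
print([(s["return_to_go"], s["action"]) for s in sequence])
```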

Measuring CLEVRness: Blackbox testing of Visual Reasoning Models

no code implementations24 Feb 2022 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

Visual question answering provides a convenient framework for testing a model's abilities by interrogating it with questions about the scene.

Benchmarking Question Answering +2

Show Your Work: Scratchpads for Intermediate Computation with Language Models

no code implementations30 Nov 2021 Maxwell Nye, Anders Johan Andreassen, Guy Gur-Ari, Henryk Michalewski, Jacob Austin, David Bieber, David Dohan, Aitor Lewkowycz, Maarten Bosma, David Luan, Charles Sutton, Augustus Odena

Large pre-trained language models perform remarkably well on tasks that can be done "in one pass", such as generating realistic text or synthesizing computer programs.
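
A hedged sketch of the scratchpad idea: instead of asking for the answer directly, the prompt asks the model to emit intermediate computation first and only then the final result. The prompt format below is illustrative, not the exact one used in the paper.

```python
# Illustrative scratchpad-style prompt for multi-digit addition. The model is
# trained or prompted to write out intermediate steps before the final answer,
# turning a "one pass" prediction into an explicit multi-step computation.
prompt = """\
Input: 29 + 57
Scratchpad:
  9 + 7 = 16, write 6, carry 1
  2 + 5 + 1 = 8, write 8
Target: 86

Input: 48 + 76
Scratchpad:
"""

# At inference time the language model continues the prompt, producing the
# carry steps and then the line "Target: 124".
print(prompt)
```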

Sparse is Enough in Scaling Transformers

no code implementations NeurIPS 2021 Sebastian Jaszczur, Aakanksha Chowdhery, Afroz Mohiuddin, Łukasz Kaiser, Wojciech Gajewski, Henryk Michalewski, Jonni Kanerva

We study sparse variants for all layers in the Transformer and propose Scaling Transformers, a family of next generation Transformer models that use sparse layers to scale efficiently and perform unbatched decoding much faster than the standard Transformer as we scale up the model size.

Text Summarization
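
A minimal sketch of a sparse feed-forward layer in this spirit: within each block of hidden units only the strongest activation is kept, so most of the output projection is never touched. The learned controller from the paper is replaced here by a simple arg-max, purely for illustration.

```python
import numpy as np

def sparse_ff_block(x, w_in, w_out, block_size=4):
    """Sketch of a sparse feed-forward layer: group hidden units into blocks
    and keep only one nonzero activation per block before projecting back."""
    hidden = np.maximum(x @ w_in, 0.0)           # ReLU activations, shape (d_ff,)
    blocks = hidden.reshape(-1, block_size)      # group hidden units into blocks
    mask = np.zeros_like(blocks)
    mask[np.arange(blocks.shape[0]), blocks.argmax(axis=1)] = 1.0
    sparse_hidden = (blocks * mask).reshape(-1)  # one nonzero unit per block
    return sparse_hidden @ w_out

rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
x = rng.normal(size=d_model)
w_in = rng.normal(size=(d_model, d_ff))
w_out = rng.normal(size=(d_ff, d_model))
print(sparse_ff_block(x, w_in, w_out).shape)     # (8,)
```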

Measuring CLEVRness: Black-box Testing of Visual Reasoning Models

no code implementations ICLR 2022 Spyridon Mouselinos, Henryk Michalewski, Mateusz Malinowski

To answer such a question, we extend the visual question answering framework and propose the following behavioral test in the form of a two-player game.

Benchmarking Question Answering +2

Program Synthesis with Large Language Models

1 code implementation16 Aug 2021 Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton

Our largest models, even without finetuning on a code dataset, can synthesize solutions to 59.6 percent of the problems from MBPP using few-shot learning with a well-designed prompt.

Few-Shot Learning Program Synthesis
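
An illustrative few-shot prompt in the style used for MBPP-like program synthesis: a handful of (problem, solution) pairs followed by a new problem. The exact prompt wording in the paper differs; this only shows the shape.

```python
examples = [
    ("Write a function that returns the square of a number.",
     "def square(x):\n    return x * x"),
    ("Write a function that reverses a string.",
     "def reverse(s):\n    return s[::-1]"),
]

new_problem = "Write a function that checks whether a number is even."

prompt = ""
for problem, solution in examples:
    prompt += f"# {problem}\n{solution}\n\n"
prompt += f"# {new_problem}\n"

# A large language model completes `prompt` with a candidate solution,
# which is then checked against the benchmark's unit tests.
print(prompt)
```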

Q-Value Weighted Regression: Reinforcement Learning with Limited Data

1 code implementation12 Feb 2021 Piotr Kozakowski, Łukasz Kaiser, Henryk Michalewski, Afroz Mohiuddin, Katarzyna Kańska

QWR is an extension of Advantage Weighted Regression (AWR), an off-policy actor-critic algorithm that performs very well on continuous control tasks, including in the offline setting, but has low sample efficiency and struggles with high-dimensional observation spaces.

Atari Games Continuous Control +4
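
A minimal sketch of the advantage-weighted regression actor update that AWR uses and that QWR builds on: actions with higher advantage get exponentially larger weight in a supervised-style log-likelihood loss. QWR's replacement of the value estimates with learned Q-values is omitted here.

```python
import numpy as np

def awr_actor_loss(log_probs, advantages, beta=1.0):
    """Advantage-weighted regression actor loss (illustrative sketch)."""
    weights = np.exp(np.clip(advantages / beta, -10.0, 10.0))  # clip for stability
    return -np.mean(weights * log_probs)

# Toy batch: log pi(a|s) for sampled actions and their advantage estimates.
log_probs = np.array([-0.2, -1.5, -0.7])
advantages = np.array([0.5, -0.3, 1.2])
print(awr_actor_loss(log_probs, advantages))
```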

Neural heuristics for SAT solving

no code implementations27 May 2020 Sebastian Jaszczur, Michał Łuszczyk, Henryk Michalewski

We use neural graph networks with a message-passing architecture and an attention mechanism to enhance the branching heuristic in two SAT-solving algorithms.
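
A minimal sketch of message passing over a literal-clause graph, where literal scores could rank branching candidates. The real system uses learned graph networks with attention; here the message functions are fixed random projections purely for illustration.

```python
import numpy as np

def literal_scores(clauses, n_vars, rounds=4, dim=8, seed=0):
    """Toy message passing: literal and clause embeddings exchange messages
    for a few rounds, then each literal gets a scalar branching score."""
    rng = np.random.default_rng(seed)
    n_lits = 2 * n_vars                              # literal v and its negation
    lit = rng.normal(size=(n_lits, dim))
    cls = rng.normal(size=(len(clauses), dim))
    w_lc = rng.normal(size=(dim, dim))
    w_cl = rng.normal(size=(dim, dim))

    def lit_index(l):                                # map literal +/-v to a row
        return 2 * (abs(l) - 1) + (0 if l > 0 else 1)

    for _ in range(rounds):
        new_cls = np.zeros_like(cls)
        for c, clause in enumerate(clauses):         # literals -> clauses
            new_cls[c] = np.tanh(sum(lit[lit_index(l)] for l in clause) @ w_lc)
        new_lit = np.zeros_like(lit)
        for c, clause in enumerate(clauses):         # clauses -> literals
            for l in clause:
                new_lit[lit_index(l)] += new_cls[c]
        lit, cls = np.tanh(new_lit @ w_cl), new_cls

    return lit.sum(axis=1)                           # one score per literal

# Toy CNF formula: (x1 or not x2) and (x2 or x3).
print(literal_scores([[1, -2], [2, 3]], n_vars=3))
```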

Model Based Reinforcement Learning for Atari

no code implementations ICLR 2020 Łukasz Kaiser, Mohammad Babaeizadeh, Piotr Miłos, Błażej Osiński, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Model-based Reinforcement Learning +3
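
A purely illustrative sketch of the training loop described above: alternate between collecting real experience, fitting a world model to it, and improving the policy inside the learned model. All components are trivial stand-ins, not the paper's video-prediction model or policy learner.

```python
import random

class WorldModel:
    def __init__(self):
        self.data = []
    def fit(self, transitions):
        self.data.extend(transitions)               # "training" = memorising
    def rollout(self, policy, length=10):
        # Replay remembered transitions as a crude simulated trajectory.
        return random.choices(self.data, k=min(length, len(self.data)))

class Policy:
    def act(self, obs):
        return random.randint(0, 3)
    def update(self, simulated_batch):
        pass                                        # policy improvement goes here

def collect_real_rollout(policy, steps=20):
    obs, transitions = 0, []
    for _ in range(steps):
        action = policy.act(obs)
        next_obs, reward = obs + 1, random.random()
        transitions.append((obs, action, reward, next_obs))
        obs = next_obs
    return transitions

policy, world_model = Policy(), WorldModel()
for iteration in range(3):
    world_model.fit(collect_real_rollout(policy))   # 1. gather real data
    for _ in range(5):                              # 2. learn inside the model
        policy.update(world_model.rollout(policy))
```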

Towards Finding Longer Proofs

1 code implementation30 May 2019 Zsolt Zombori, Adrián Csiszárik, Henryk Michalewski, Cezary Kaliszyk, Josef Urban

We present a reinforcement learning (RL) based guidance system for automated theorem proving geared towards Finding Longer Proofs (FLoP).

Automated Theorem Proving reinforcement-learning +1

Model-Based Reinforcement Learning for Atari

2 code implementations1 Mar 2019 Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski

We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.

Atari Games Atari Games 100k +4

Expert-augmented actor-critic for ViZDoom and Montezuma's Revenge

2 code implementations10 Sep 2018 Michał Garmulewicz, Henryk Michalewski, Piotr Miłoś

We propose an expert-augmented actor-critic algorithm, which we evaluate on two environments with sparse rewards: Montezuma's Revenge and a demanding maze from the ViZDoom suite.
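
A minimal sketch of the general idea of augmenting an actor objective with an imitation term on expert transitions; the exact objective and weighting used in the paper may differ.

```python
import numpy as np

def expert_augmented_actor_loss(log_probs, advantages,
                                expert_log_probs, expert_weight=0.1):
    """Illustrative sketch: a standard policy-gradient actor term plus an
    imitation term that pushes the policy towards the expert's actions,
    which helps in sparse-reward environments."""
    actor_term = -np.mean(log_probs * advantages)   # on-policy term
    imitation_term = -np.mean(expert_log_probs)     # expert log-likelihood
    return actor_term + expert_weight * imitation_term

log_probs = np.array([-0.3, -1.1])         # log pi(a|s) for agent transitions
advantages = np.array([0.7, -0.2])
expert_log_probs = np.array([-0.5, -0.9])  # log pi(a|s) for expert transitions
print(expert_augmented_actor_loss(log_probs, advantages, expert_log_probs))
```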

Reinforcement Learning of Theorem Proving

no code implementations NeurIPS 2018 Cezary Kaliszyk, Josef Urban, Henryk Michalewski, Mirek Olšák

The strongest version of the system is trained on a large corpus of mathematical problems and evaluated on previously unseen problems.

Automated Theorem Proving reinforcement-learning +1

Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes

1 code implementation9 Jan 2018 Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, Henryk Michalewski

We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C).

Atari Games Playing the Game of 2048 +2

Atari games and Intel processors

no code implementations19 May 2017 Robert Adamski, Tomasz Grel, Maciej Klimek, Henryk Michalewski

The asynchronous nature of state-of-the-art reinforcement learning algorithms, such as the Asynchronous Advantage Actor-Critic algorithm, makes them exceptionally well suited to CPU computations.

Atari Games BIG-bench Machine Learning +2

Learning from the memory of Atari 2600

3 code implementations4 May 2016 Jakub Sygnowski, Henryk Michalewski

We train a number of neural networks to play the games Bowling, Breakout, and Seaquest using information stored in the memory of the Atari 2600 video game console.

Atari Games
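
A hypothetical sketch of the setting: instead of screen pixels, the agent's network consumes the console's 128-byte RAM, so a small dense network suffices. The architecture below is illustrative, not the one from the paper.

```python
import numpy as np

RAM_BYTES = 128
N_ACTIONS = 6          # e.g. a small Atari action set

rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.1, size=(RAM_BYTES, 64))
w2 = rng.normal(scale=0.1, size=(64, N_ACTIONS))

def q_values(ram_state):
    """Map a 128-byte RAM snapshot (scaled to [0, 1]) to per-action values."""
    hidden = np.maximum((ram_state / 255.0) @ w1, 0.0)
    return hidden @ w2

ram_state = rng.integers(0, 256, size=RAM_BYTES).astype(np.float64)
print(q_values(ram_state).shape)   # (6,)
```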
