Search Results for author: Piotr Miłoś

Found 30 papers, 18 papers with code

tsGT: Stochastic Time Series Modeling With Transformer

no code implementations • 8 Mar 2024 • Łukasz Kuciński, Witold Drzewakowski, Mateusz Olko, Piotr Kozakowski, Łukasz Maziarka, Marta Emilia Nowakowska, Łukasz Kaiser, Piotr Miłoś

Time series methods are of fundamental importance in virtually any field of science that deals with temporally structured data.

Time Series

Paper
Add Code

Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning

no code implementations • 1 Mar 2024 • Michal Nauman, Michał Bortkiewicz, Mateusz Ostaszewski, Piotr Miłoś, Tomasz Trzciński, Marek Cygan

We tested these agents across 14 diverse tasks from 2 simulation benchmarks.

Reinforcement Learning (RL)

Paper
Add Code

Analysing The Impact of Sequence Composition on Language Model Pre-Training

1 code implementation • 21 Feb 2024 • Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini

In this work, we find that applying causal masking can lead to the inclusion of distracting information from previous documents during pre-training, which negatively impacts the performance of the models on language modelling and downstream tasks.

In-Context Learning Language Modelling +1

Paper
Code

Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem

no code implementations • 5 Feb 2024 • Maciej Wołczyk, Bartłomiej Cupiał, Mateusz Ostaszewski, Michał Bortkiewicz, Michał Zając, Razvan Pascanu, Łukasz Kuciński, Piotr Miłoś

Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models.

Montezuma's Revenge NetHack +2

Paper
Add Code

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

1 code implementation • 8 Jan 2024 • Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Michał Krutul, Jakub Krajewski, Szymon Antoniak, Piotr Miłoś, Marek Cygan, Sebastian Jaszczur

State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers.

145

Paper
Code

Structured Packing in LLM Training Improves Long Context Utilization

no code implementations • 28 Dec 2023 • Konrad Staniszewski, Szymon Tworkowski, Yu Zhao, Sebastian Jaszczur, Henryk Michalewski, Łukasz Kuciński, Piotr Miłoś

Recent developments in long-context large language models have attracted considerable attention.

Information Retrieval Retrieval

Paper
Add Code

Focused Transformer: Contrastive Training for Context Scaling

1 code implementation • NeurIPS 2023 • Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś

This novel approach enhances the structure of the (key, value) space, enabling an extension of the context length.

Contrastive Learning

1,426

Paper
Code

Exploring Continual Learning of Diffusion Models

no code implementations • 27 Mar 2023 • Michał Zając, Kamil Deja, Anna Kuzina, Jakub M. Tomczak, Tomasz Trzciński, Florian Shkurti, Piotr Miłoś

Diffusion models have achieved remarkable success in generating high-quality images thanks to their novel training procedures applied to unprecedented amounts of data.

Benchmarking Continual Learning +1

Paper
Add Code

Magnushammer: A Transformer-Based Approach to Premise Selection

no code implementations • 8 Mar 2023 • Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

By combining \method with a language-model-based automated theorem prover, we further improve the state-of-the-art proof success rate from $57. 0\%$ to $71. 0\%$ on the PISA benchmark using $4$x fewer parameters.

Automated Theorem Proving Language Modelling +1

Paper
Add Code

The Effectiveness of World Models for Continual Reinforcement Learning

2 code implementations • 29 Nov 2022 • Samuel Kessler, Mateusz Ostaszewski, Michał Bortkiewicz, Mateusz Żarski, Maciej Wołczyk, Jack Parker-Holder, Stephen J. Roberts, Piotr Miłoś

World models power some of the most efficient reinforcement learning algorithms.

Continual Learning Model-based Reinforcement Learning +2

448

Paper
Code

Trust Your $\nabla$: Gradient-based Intervention Targeting for Causal Discovery

no code implementations • NeurIPS 2023 • Mateusz Olko, Michał Zając, Aleksandra Nowak, Nino Scherrer, Yashas Annadani, Stefan Bauer, Łukasz Kuciński, Piotr Miłoś

In this work, we propose a novel Gradient-based Intervention Targeting method, abbreviated GIT, that 'trusts' the gradient estimator of a gradient-based causal discovery framework to provide signals for the intervention acquisition function.

Causal Discovery Experimental Design

Paper
Add Code

Disentangling Transfer in Continual Reinforcement Learning

no code implementations • 28 Sep 2022 • Maciej Wołczyk, Michał Zając, Razvan Pascanu, Łukasz Kuciński, Piotr Miłoś

The ability of continual learning systems to transfer knowledge from previously seen tasks in order to maximize performance on new tasks is a significant challenge for the field, limiting the applicability of continual learning solutions to realistic scenarios.

Continual Learning Continuous Control +2

Paper
Add Code

Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search

1 code implementation • 1 Jun 2022 • Michał Zawalski, Michał Tyrolski, Konrad Czechowski, Tomasz Odrzygóźdź, Damian Stachura, Piotr Piękos, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

Complex reasoning problems contain states that vary in the computational cost required to determine a good action plan.

Rubik's Cube

Paper
Code

Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

no code implementations • 22 May 2022 • Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8. 2\%$ of problems neither language models nor automated theorem provers are able to solve on their own.

Ranked #2 on Automated Theorem Proving on miniF2F-test

Automated Theorem Proving

Paper
Add Code

Continuous Control With Ensemble Deep Deterministic Policy Gradients

1 code implementation • 30 Nov 2021 • Piotr Januszewski, Mateusz Olko, Michał Królikowski, Jakub Świątkowski, Marcin Andrychowicz, Łukasz Kuciński, Piotr Miłoś

The growth of deep reinforcement learning (RL) has brought multiple exciting tools and methods to the field.

Continuous Control OpenAI Gym +2

Paper
Code

Off-Policy Correction For Multi-Agent Reinforcement Learning

1 code implementation • 22 Nov 2021 • Michał Zawalski, Błażej Osiński, Henryk Michalewski, Piotr Miłoś

The key advantage of our algorithm is its high scalability in a multi-worker setting.

Multi-agent Reinforcement Learning reinforcement-learning +2

Paper
Code

Catalytic Role Of Noise And Necessity Of Inductive Biases In The Emergence Of Compositional Communication

no code implementations • NeurIPS 2021 • Łukasz Kuciński, Tomasz Korbak, Paweł Kołodziej, Piotr Miłoś

Communication is compositional if complex signals can be represented as a combination of simpler subparts.

Paper
Add Code

Subgoal Search For Complex Reasoning Tasks

1 code implementation • NeurIPS 2021 • Konrad Czechowski, Tomasz Odrzygóźdź, Marek Zbysiński, Michał Zawalski, Krzysztof Olejnik, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework.

Rubik's Cube

Paper
Code

Continual World: A Robotic Benchmark For Continual Reinforcement Learning

1 code implementation • NeurIPS 2021 • Maciej Wołczyk, Michał Zając, Razvan Pascanu, Łukasz Kuciński, Piotr Miłoś

Continual learning (CL) -- the ability to continuously learn, building on previously acquired knowledge -- is a natural requirement for long-lived autonomous reinforcement learning (RL) agents.

Continual Learning reinforcement-learning +1

Paper
Code

Planning and Learning Using Adaptive Entropy Tree Search

1 code implementation • 12 Feb 2021 • Piotr Kozakowski, Mikołaj Pacek, Piotr Miłoś

We present Adaptive Entropy Tree Search (ANTS) - a novel algorithm combining planning and learning in the maximum entropy paradigm.

Paper
Code

Structure and randomness in planning and reinforcement learning

1 code implementation • NeurIPS Workshop LMCA 2020 • Piotr Kozakowski, Piotr Januszewski, Konrad Czechowski, Łukasz Kuciński, Piotr Miłoś

Planning in large state spaces inevitably needs to balance depth and breadth of the search.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Trust, but verify: model-based exploration in sparse reward environments

1 code implementation • NeurIPS Workshop LMCA 2020 • Konrad Czechowski, Tomasz Odrzygóźdź, Michał Izworski, Marek Zbysiński, Łukasz Kuciński, Piotr Miłoś

We propose $\textit{trust-but-verify}$ (TBV) mechanism, a new method which uses model uncertainty estimates to guide exploration.

Model-based Reinforcement Learning reinforcement-learning +1

Paper
Code

CARLA Real Traffic Scenarios -- novel training ground and benchmark for autonomous driving

no code implementations • 16 Dec 2020 • Błażej Osiński, Piotr Miłoś, Adam Jakubowski, Paweł Zięcina, Michał Martyniak, Christopher Galias, Antonia Breuer, Silviu Homoceanu, Henryk Michalewski

This work introduces interactive traffic scenarios in the CARLA simulator, which are based on real-world traffic.

Autonomous Driving reinforcement-learning +1

Paper
Add Code

Emergence of compositional language in communication through noisy channel

no code implementations • ICML Workshop LaReL 2020 • Łukasz Kuciński, Paweł Kołodziej, Piotr Miłoś

In this paper, we investigate how communication through a noisy channel can lead to the emergence of compositional language.

Paper
Add Code

Uncertainty-sensitive Learning and Planning with Ensembles

1 code implementation • 19 Dec 2019 • Piotr Miłoś, Łukasz Kuciński, Konrad Czechowski, Piotr Kozakowski, Maciek Klimek

The former manifests itself through the use of value function, while the latter is powered by a tree search planner.

Montezuma's Revenge

Paper
Code

Simulation-based reinforcement learning for real-world autonomous driving

1 code implementation • 29 Nov 2019 • Błażej Osiński, Adam Jakubowski, Piotr Miłoś, Paweł Zięcina, Christopher Galias, Silviu Homoceanu, Henryk Michalewski

Using reinforcement learning in simulation and synthetic data is motivated by lowering costs and engineering effort.

Autonomous Driving reinforcement-learning +3

161

Paper
Code

Developmentally motivated emergence of compositional communication via template transfer

1 code implementation • 4 Oct 2019 • Tomasz Korbak, Julian Zubek, Łukasz Kuciński, Piotr Miłoś, Joanna Rączaszek-Leonardi

This paper explores a novel approach to achieving emergent compositional communication in multi-agent systems.

Zero-shot Generalization

Paper
Code

Uncertainty - sensitive learning and planning with ensembles

1 code implementation • 25 Sep 2019 • Piotr Miłoś, Łukasz Kuciński, Konrad Czechowski, Piotr Kozakowski, Maciej Klimek

Notably, our method performs well in environments with sparse rewards where standard $TD(1)$ backups fail.

Montezuma's Revenge

Paper
Code

Expert-augmented actor-critic for ViZDoom and Montezumas Revenge

2 code implementations • 10 Sep 2018 • Michał Garmulewicz, Henryk Michalewski, Piotr Miłoś

We propose an expert-augmented actor-critic algorithm, which we evaluate on two environments with sparse rewards: Montezumas Revenge and a demanding maze from the ViZDoom suite.

Paper
Code

Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments

2 code implementations • 2 Apr 2018 • Łukasz Kidziński, Sharada Prasanna Mohanty, Carmichael Ong, Zhewei Huang, Shuchang Zhou, Anton Pechenko, Adam Stelmaszczyk, Piotr Jarosik, Mikhail Pavlov, Sergey Kolesnikov, Sergey Plis, Zhibo Chen, Zhizheng Zhang, Jiale Chen, Jun Shi, Zhuobin Zheng, Chun Yuan, Zhihui Lin, Henryk Michalewski, Piotr Miłoś, Błażej Osiński, Andrew Melnik, Malte Schilling, Helge Ritter, Sean Carroll, Jennifer Hicks, Sergey Levine, Marcel Salathé, Scott Delp

In the NIPS 2017 Learning to Run challenge, participants were tasked with building a controller for a musculoskeletal model to make it run as fast as possible through an obstacle course.

reinforcement-learning Reinforcement Learning (RL)

124

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.