Search Results for author: Matthieu Zimmer

Found 15 papers, 8 papers with code

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

no code implementations • ICML 2020 • Umer Siddique, Paul Weng, Matthieu Zimmer

During this analysis, we notably derive a new result in the standard RL setting, which is of independent interest: a novel bound on the approximation error between the optimal average reward and the average reward of a policy that is optimal for the discounted reward.

Fairness, Reinforcement Learning (RL)

Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control

no code implementations • 9 Feb 2024 • Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson

Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies.

Zero-shot Generalization

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

no code implementations • 22 Dec 2023 • Filippos Christianos, Georgios Papoudakis, Matthieu Zimmer, Thomas Coste, Zhihao Wu, Jingxuan Chen, Khyati Khandelwal, James Doran, Xidong Feng, Jiacheng Liu, Zheng Xiong, Yicheng Luo, Jianye Hao, Kun Shao, Haitham Bou-Ammar, Jun Wang

This paper presents a general framework model for integrating and learning structured reasoning into AI agents' policies.

Reinforcement Learning (RL)

Automatic Unit Test Data Generation and Actor-Critic Reinforcement Learning for Code Synthesis

1 code implementation • 20 Oct 2023 • Philip John Gorinski, Matthieu Zimmer, Gerasimos Lampouras, Derrick Goh Xin Deik, Ignacio Iacobacci

Large pre-trained language models have shown remarkable performance on various Code Synthesis benchmarks, treating the problem of Code Generation in a fashion similar to Natural Language Generation and training with a Language Modelling (LM) objective.

Code Generation, Language Modelling +2

End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes

2 code implementations • NeurIPS 2023 • Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Haitham Bou Ammar

We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.

Bayesian Optimisation, Inductive Bias +2

Sample-Efficient Optimisation with Probabilistic Transformer Surrogates

no code implementations • 27 May 2022 • Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Rasul Tutunov, Jun Wang, Haitham Bou Ammar

First, we notice that these models are trained on uniformly distributed inputs, which impairs predictive accuracy on non-uniform data - a setting arising from any typical BO loop due to exploration-exploitation trade-offs.

Bayesian Optimisation, Gaussian Processes

Neuro-Symbolic Hierarchical Rule Induction

no code implementations • 26 Dec 2021 • Claire Glanois, Xuening Feng, Zhaohui Jiang, Paul Weng, Matthieu Zimmer, Dong Li, Wulong Liu

We propose an efficient interpretable neuro-symbolic model to solve Inductive Logic Programming (ILP) problems.

Inductive logic programming, reinforcement-learning +1

A Survey on Interpretable Reinforcement Learning

no code implementations • 24 Dec 2021 • Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu

To that aim, we distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.

Autonomous Driving, Decision Making +2

Differentiable Logic Machines

no code implementations • 23 Feb 2021 • Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu

As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic program.

Decision Making, Inductive logic programming +1

Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning

3 code implementations • 17 Dec 2020 • Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng

As a solution method, we propose a novel neural network architecture, which is composed of two sub-networks specifically designed for taking into account the two aspects of fairness.

Fairness, Multi-agent Reinforcement Learning +2

Hyperparameter Auto-tuning in Self-Supervised Robotic Learning

2 code implementations • 16 Oct 2020 • Jiancong Huang, Juan Rojas, Matthieu Zimmer, Hongmin Wu, Yisheng Guan, Paul Weng

Insufficient learning (due to convergence to local optima) results in under-performing policies whilst redundant learning wastes time and resources.

Multi-Task Learning, reinforcement-learning +1

Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards

1 code implementation • 18 Aug 2020 • Umer Siddique, Paul Weng, Matthieu Zimmer

Since learning with discounted rewards is generally easier, this discussion further justifies finding a fair policy for the average reward by learning a fair policy for the discounted reward.

Fairness, reinforcement-learning +1

Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation

1 code implementation • 19 Oct 2019 • Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Juan Rojas, Paul Weng

Deep reinforcement learning (DRL) is a promising approach for adaptive robot control, but its application to robotics is currently hindered by high sample requirements.

Data Augmentation, reinforcement-learning +1

Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning

1 code implementation • 24 Sep 2019 • Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Yisheng Guan, Juan Rojas, Paul Weng

Our work demonstrates that invariant transformations on RL trajectories are a promising methodology to speed up learning in deep RL.

Data Augmentation, OpenAI Gym +2

Exploiting the Sign of the Advantage Function to Learn Deterministic Policies in Continuous Domains

1 code implementation • 10 Jun 2019 • Matthieu Zimmer, Paul Weng

In the context of learning deterministic policies in continuous domains, we revisit an approach that was first proposed in Continuous Actor Critic Learning Automaton (CACLA) and later extended in Neural Fitted Actor Critic (NFAC).
