Search Results for author: Stefano V. Albrecht

Found 55 papers, 25 papers with code

Multi-Agent Reinforcement Learning for Energy Networks: Computational Challenges, Progress and Open Problems

no code implementations24 Apr 2024 Sarah Keren, Chaimaa Essayeh, Stefano V. Albrecht, Thomas Morstyn

The rapidly changing architecture and functionality of electrical networks and the increasing penetration of renewable and distributed energy resources have resulted in various technological and managerial challenges.

Multi-agent Reinforcement Learning

Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras

1 code implementation22 Apr 2024 Mhairi Dunion, Stefano V. Albrecht

To overcome these hardware constraints, we propose Multi-View Disentanglement (MVD), which uses multiple cameras to learn a policy that achieves zero-shot generalisation to any single camera from the training set.

Disentanglement reinforcement-learning +1

ICED: Zero-Shot Transfer in Reinforcement Learning via In-Context Environment Design

no code implementations5 Feb 2024 Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

ICED generates levels using a variational autoencoder trained over an initial set of level parameters, reducing distributional shift, and achieves significant improvements in ZSG over adaptive level sampling strategies and UED methods.

Reinforcement Learning (RL)

Sample Relationship from Learning Dynamics Matters for Generalisation

no code implementations16 Jan 2024 Shangmin Guo, Yi Ren, Stefano V. Albrecht, Kenny Smith

Although much research has been done on proposing new models or loss functions to improve the generalisation of artificial neural networks (ANNs), less attention has been directed to the impact of the training data on generalisation.

Is Feedback All You Need? Leveraging Natural Language Feedback in Goal-Conditioned Reinforcement Learning

1 code implementation7 Dec 2023 Sabrina McCallum, Max Taylor-Davies, Stefano V. Albrecht, Alessandro Suglia

Despite numerous successes, the field of reinforcement learning (RL) remains far from matching the impressive generalisation power of human behaviour learning.

Reinforcement Learning (RL)

Planning to Go Out-of-Distribution in Offline-to-Online Reinforcement Learning

no code implementations9 Oct 2023 Trevor McInroe, Adam Jelley, Stefano V. Albrecht, Amos Storkey

Offline pretraining with a static dataset followed by online fine-tuning (offline-to-online, or OtO) is a paradigm well matched to a real-world RL deployment process.

Continuous Control Offline RL

How the level sampling process impacts zero-shot generalisation in deep reinforcement learning

no code implementations5 Oct 2023 Samuel Garcin, James Doran, Shangmin Guo, Christopher G. Lucas, Stefano V. Albrecht

A key limitation preventing the wider adoption of autonomous agents trained via deep reinforcement learning (RL) is their limited ability to generalise to new environments, even when these share similar characteristics with environments encountered during training.

Reinforcement Learning (RL)

Contextual Pre-planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning

1 code implementation11 Jul 2023 Guy Azran, Mohamad H. Danesh, Stefano V. Albrecht, Sarah Keren

Recent studies show that deep reinforcement learning (DRL) agents tend to overfit to the task on which they were trained and fail to adapt to minor environment changes.

SMAClite: A Lightweight Environment for Multi-Agent Reinforcement Learning

1 code implementation9 May 2023 Adam Michalski, Filippos Christianos, Stefano V. Albrecht

The Starcraft Multi-Agent Challenge (SMAC) has been widely used in MARL research, but is built on top of a heavy, closed-source computer game, StarCraft II.

reinforcement-learning Reinforcement Learning (RL) +3

Using Offline Data to Speed-up Reinforcement Learning in Procedurally Generated Environments

no code implementations18 Apr 2023 Alain Andres, Lukas Schäfer, Esther Villar-Rodriguez, Stefano V. Albrecht, Javier Del Ser

Motivated by the recent success of Offline RL and Imitation Learning (IL), we conduct a study to investigate whether agents can leverage offline data in the form of trajectories to improve the sample-efficiency in procedurally generated environments.

Imitation Learning Offline RL +2

Revisiting the Gumbel-Softmax in MADDPG

1 code implementation23 Feb 2023 Callum Rhys Tilbury, Filippos Christianos, Stefano V. Albrecht

This method, however, is statistically biased, and a recent MARL benchmarking paper suggests that this bias makes MADDPG perform poorly in grid-world situations, where the action space is discrete.

Benchmarking Multi-agent Reinforcement Learning
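
The Gumbel-Softmax relaxation whose bias this paper revisits can be sketched generically as follows; this is a standard textbook formulation with illustrative parameter names, not the paper's own implementation.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Draw a relaxed one-hot sample from a categorical distribution.

    Gumbel noise is added to the logits and a temperature-scaled softmax
    replaces the non-differentiable argmax, giving a differentiable but
    statistically biased sample; this bias is what the paper discusses.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    u = rng.uniform(1e-12, 1.0, size=np.shape(logits))
    gumbel = -np.log(-np.log(u))              # Gumbel(0, 1) noise
    y = (np.asarray(logits) + gumbel) / tau   # temperature tau > 0
    y = np.exp(y - y.max())                   # numerically stable softmax
    return y / y.sum()

sample = gumbel_softmax_sample([2.0, 0.5, -1.0], tau=0.5)
```

As tau approaches 0 the sample approaches a one-hot vector; for any finite tau the expected sample deviates from the true categorical probabilities.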

Causal Explanations for Sequential Decision-Making in Multi-Agent Systems

1 code implementation21 Feb 2023 Balint Gyevnar, Cheng Wang, Christopher G. Lucas, Shay B. Cohen, Stefano V. Albrecht

We present CEMA: Causal Explanations in Multi-Agent systems; a framework for creating causal natural language explanations of an agent's decisions in dynamic sequential multi-agent systems to build more trustworthy autonomous agents.

Autonomous Driving counterfactual +2

Learning Complex Teamwork Tasks Using a Given Sub-task Decomposition

no code implementations9 Feb 2023 Elliot Fosong, Arrasy Rahman, Ignacio Carlucho, Stefano V. Albrecht

Training a team to complete a complex task via multi-agent reinforcement learning can be difficult due to challenges such as policy search in a large joint policy space, and non-stationarity caused by mutually adapting agents.

Multi-agent Reinforcement Learning reinforcement-learning +1

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

no code implementations7 Feb 2023 Lukas Schäfer, Oliver Slumbers, Stephen McAleer, Yali Du, Stefano V. Albrecht, David Mguni

In this work, we propose ensemble value functions for multi-agent exploration (EMAX), a general framework to seamlessly extend value-based MARL algorithms with ensembles of value functions.

Efficient Exploration Multi-agent Reinforcement Learning +2
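
The general idea of ensembles of value functions can be illustrated with a minimal sketch: average the members' Q-estimates for action selection and use their disagreement as an exploration signal. The aggregation rule below is a common generic choice, not necessarily the one EMAX uses.

```python
import numpy as np

def ensemble_action(q_tables, state):
    """Greedy action from the mean of an ensemble of Q-value estimates.

    q_tables: list of arrays of shape (n_states, n_actions).
    The std across members is returned as a candidate exploration
    bonus; EMAX itself may aggregate differently.
    """
    qs = np.stack([q[state] for q in q_tables])  # (n_members, n_actions)
    mean_q = qs.mean(axis=0)
    disagreement = qs.std(axis=0)
    return int(np.argmax(mean_q)), disagreement

rng = np.random.default_rng(0)
ensemble = [rng.normal(size=(4, 2)) for _ in range(5)]  # toy Q-tables
action, bonus = ensemble_action(ensemble, state=0)
```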

A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning

1 code implementation11 Oct 2022 Arrasy Rahman, Ignacio Carlucho, Niklas Höpner, Stefano V. Albrecht

These belief estimates are combined with our solution for the fully observable case to compute an agent's optimal policy under partial observability in open ad hoc teamwork.

Pareto Actor-Critic for Equilibrium Selection in Multi-Agent Reinforcement Learning

1 code implementation28 Sep 2022 Filippos Christianos, Georgios Papoudakis, Stefano V. Albrecht

This work focuses on equilibrium selection in no-conflict multi-agent games, where we specifically study the problem of selecting a Pareto-optimal Nash equilibrium among several existing equilibria.

Multi-agent Reinforcement Learning reinforcement-learning +1

Generating Teammates for Training Robust Ad Hoc Teamwork Agents via Best-Response Diversity

no code implementations28 Jul 2022 Arrasy Rahman, Elliot Fosong, Ignacio Carlucho, Stefano V. Albrecht

Early approaches address the AHT challenge by training the learner with a diverse set of handcrafted teammate policies, usually designed based on an expert's domain knowledge about the policies the learner may encounter.

Few-Shot Teamwork

no code implementations19 Jul 2022 Elliot Fosong, Arrasy Rahman, Ignacio Carlucho, Stefano V. Albrecht

We propose the novel few-shot teamwork (FST) problem, where skilled agents trained in a team to complete one task are combined with skilled agents from different tasks, and together must learn to adapt to an unseen but related task.

Multi-agent Reinforcement Learning reinforcement-learning +1

Temporal Disentanglement of Representations for Improved Generalisation in Reinforcement Learning

1 code implementation12 Jul 2022 Mhairi Dunion, Trevor McInroe, Kevin Sebastian Luck, Josiah P. Hanna, Stefano V. Albrecht

Reinforcement Learning (RL) agents are often unable to generalise well to environment variations in the state space that were not observed during training.

Disentanglement reinforcement-learning +1

Verifiable Goal Recognition for Autonomous Driving with Occlusions

1 code implementation28 Jun 2022 Cillian Brewitt, Massimiliano Tamborski, Cheng Wang, Stefano V. Albrecht

Goal recognition (GR) involves inferring the goals of other vehicles, such as a certain junction exit, which can enable more accurate prediction of their future behaviour.

Autonomous Driving

Multi-Horizon Representations with Hierarchical Forward Models for Reinforcement Learning

1 code implementation22 Jun 2022 Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

Learning control from pixels is difficult for reinforcement learning (RL) agents because representation learning and policy learning are intertwined.

reinforcement-learning Reinforcement Learning (RL) +1

A Survey of Ad Hoc Teamwork Research

no code implementations16 Feb 2022 Reuth Mirsky, Ignacio Carlucho, Arrasy Rahman, Elliot Fosong, William Macke, Mohan Sridharan, Peter Stone, Stefano V. Albrecht

Ad hoc teamwork is the research problem of designing agents that can collaborate with new teammates without prior coordination.

Robust On-Policy Sampling for Data-Efficient Policy Evaluation in Reinforcement Learning

1 code implementation29 Nov 2021 Rujie Zhong, Duohan Zhang, Lukas Schäfer, Stefano V. Albrecht, Josiah P. Hanna

Reinforcement learning (RL) algorithms are often categorized as either on-policy or off-policy depending on whether they use data from a target policy of interest or from a different behavior policy.

Offline RL reinforcement-learning +1

Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning

2 code implementations11 Oct 2021 Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht

Deep reinforcement learning (RL) agents that exist in high-dimensional state spaces, such as those composed of images, have interconnected learning burdens.

reinforcement-learning Reinforcement Learning (RL) +1

Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration

1 code implementation ICML Workshop URL 2021 Lukas Schäfer, Filippos Christianos, Josiah P. Hanna, Stefano V. Albrecht

Intrinsic rewards can improve exploration in reinforcement learning, but the exploration process may suffer from instability caused by non-stationary reward shaping and strong dependency on hyperparameters.

reinforcement-learning Reinforcement Learning (RL)
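
The non-stationary reward shaping referred to above can be illustrated with a simple count-based intrinsic bonus, a common generic scheme chosen here for illustration only (not the paper's method): the shaped reward changes as visit counts grow, which is one source of the instability the paper addresses.

```python
import numpy as np
from collections import defaultdict

class CountBonus:
    """Count-based intrinsic reward: the bonus decays as a state is revisited.

    A generic illustration of non-stationary reward shaping; the
    coefficient `beta` and the 1/sqrt(count) decay are assumptions.
    """
    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = defaultdict(int)

    def shaped_reward(self, state, extrinsic):
        self.counts[state] += 1
        intrinsic = self.beta / np.sqrt(self.counts[state])
        return extrinsic + intrinsic

shaper = CountBonus(beta=0.1)
r1 = shaper.shaped_reward("s0", extrinsic=0.0)  # first visit: full bonus
r2 = shaper.shaped_reward("s0", extrinsic=0.0)  # revisit: bonus shrinks
```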

Expressivity of Emergent Language is a Trade-off between Contextual Complexity and Unpredictability

1 code implementation7 Jun 2021 Shangmin Guo, Yi Ren, Kory Mathewson, Simon Kirby, Stefano V. Albrecht, Kenny Smith

Researchers are using deep learning models to explore the emergence of language in various language games, where agents interact and develop an emergent language to solve tasks.

GRIT: Fast, Interpretable, and Verifiable Goal Recognition with Learned Decision Trees for Autonomous Driving

2 code implementations10 Mar 2021 Cillian Brewitt, Balint Gyevnar, Stefano V. Albrecht

As autonomous driving is safety-critical, it is important to have methods which are human interpretable and for which safety can be formally verified.

Autonomous Driving Robotics Multiagent Systems

Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing

1 code implementation15 Feb 2021 Filippos Christianos, Georgios Papoudakis, Arrasy Rahman, Stefano V. Albrecht

Sharing parameters in multi-agent deep reinforcement learning has played an essential role in allowing algorithms to scale to a large number of agents.

Multi-agent Reinforcement Learning reinforcement-learning +1

PILOT: Efficient Planning by Imitation Learning and Optimisation for Safe Autonomous Driving

no code implementations1 Nov 2020 Henry Pulver, Francisco Eiras, Ludovico Carozza, Majd Hawasly, Stefano V. Albrecht, Subramanian Ramamoorthy

In this paper, we present PILOT -- a planning framework that comprises an imitation neural network followed by an efficient optimiser that actively rectifies the network's plan, guaranteeing fulfilment of safety and comfort requirements.

Autonomous Driving Imitation Learning

Towards Quantum-Secure Authentication and Key Agreement via Abstract Multi-Agent Interaction

1 code implementation18 Jul 2020 Ibrahim H. Ahmed, Josiah P. Hanna, Elliot Fosong, Stefano V. Albrecht

Authentication and key agreement are decided based on the agents' observed behaviors during the interaction.

Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning

1 code implementation18 Jun 2020 Arrasy Rahman, Niklas Höpner, Filippos Christianos, Stefano V. Albrecht

Ad hoc teamwork is the challenging problem of designing an autonomous agent which can adapt quickly to collaborate with teammates without prior coordination mechanisms, including joint training.

Agent Modelling under Partial Observability for Deep Reinforcement Learning

1 code implementation NeurIPS 2021 Georgios Papoudakis, Filippos Christianos, Stefano V. Albrecht

Existing methods for agent modelling commonly assume knowledge of the local observations and chosen actions of the modelled agents during execution.

reinforcement-learning Reinforcement Learning (RL)

Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks

8 code implementations14 Jun 2020 Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, Stefano V. Albrecht

Multi-agent deep reinforcement learning (MARL) suffers from a lack of commonly-used evaluation tasks and criteria, making comparisons between approaches difficult.

Benchmarking reinforcement-learning +1

Integrating Planning and Interpretable Goal Recognition for Autonomous Driving

2 code implementations6 Feb 2020 Stefano V. Albrecht, Cillian Brewitt, John Wilhelm, Francisco Eiras, Mihai Dobre, Subramanian Ramamoorthy

The ability to predict the intentions and driving trajectories of other vehicles is a key problem for autonomous driving.

Robotics

Variational Autoencoders for Opponent Modeling in Multi-Agent Systems

no code implementations29 Jan 2020 Georgios Papoudakis, Stefano V. Albrecht

Modeling the behavior of other agents (opponents) is essential in understanding the interactions of the agents in the system.

Stabilizing Generative Adversarial Networks: A Survey

no code implementations30 Sep 2019 Maciej Wiatrak, Stefano V. Albrecht, Andrew Nystrom

Generative Adversarial Networks (GANs) are a type of generative model which have received much attention due to their ability to model complex real-world data.

E-HBA: Using Action Policies for Expert Advice and Agent Typification

no code implementations23 Jul 2019 Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy

Past research has studied two approaches to utilise predefined policy sets in repeated interactions: as experts, to dictate our own actions, and as types, to characterise the behaviour of other agents.

Comparative Evaluation of Multiagent Learning Algorithms in a Diverse Set of Ad Hoc Team Problems

no code implementations22 Jul 2019 Stefano V. Albrecht, Subramanian Ramamoorthy

In this work, we empirically evaluate five MAL algorithms, representing major approaches to multiagent learning but originally developed with the homogeneous setting in mind, to understand their behaviour in a set of ad hoc team problems.

Fairness

On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems

no code implementations15 Jul 2019 Stefano V. Albrecht, Subramanian Ramamoorthy

In this paper, we provide theoretical guidance on two central design parameters of this method: Firstly, it is important that the user choose a posterior which can learn the true distribution of latent types, as otherwise suboptimal actions may be chosen.

Exploiting Causality for Selective Belief Filtering in Dynamic Bayesian Networks (Extended Abstract)

no code implementations10 Jul 2019 Stefano V. Albrecht, Subramanian Ramamoorthy

Belief filtering in DBNs is the task of inferring the belief state (i.e., the probability distribution over process states) based on incomplete and uncertain observations.
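
The exact belief filtering task the paper builds on is the standard Bayes filter: predict the next state distribution through the transition model, then correct it with the observation likelihood. A minimal sketch with illustrative two-state matrices:

```python
import numpy as np

def belief_update(belief, transition, likelihood):
    """One step of exact belief filtering: predict, then correct.

    belief:      (n_states,) current distribution over process states
    transition:  (n_states, n_states) where T[s, s'] = P(s' | s)
    likelihood:  (n_states,) P(observation | s') for the new observation
    """
    predicted = belief @ transition       # prediction step
    posterior = predicted * likelihood    # incorporate the observation
    return posterior / posterior.sum()    # normalise

T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
b = np.array([0.5, 0.5])
b = belief_update(b, T, likelihood=np.array([0.7, 0.1]))
```

Selective filtering methods such as the PSBF approach above exploit structure (e.g. passive variables) to avoid computing this full update over every variable.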

An Empirical Study on the Practical Impact of Prior Beliefs over Policy Types

no code implementations10 Jul 2019 Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy

To address this problem, researchers have studied learning algorithms which compute posterior beliefs over a hypothesised set of policies, based on the observed actions of the other agents.
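The posterior computation described above follows a standard Bayesian update: multiply a prior over hypothesised policies by the likelihood each policy assigns to the observed actions. A minimal sketch, assuming fixed (state-independent) action probabilities per type purely for illustration:

```python
import numpy as np

def type_posterior(prior, action_probs, observed_actions):
    """Bayesian posterior over hypothesised policy types.

    prior:            (n_types,) prior belief over types
    action_probs:     (n_types, n_actions) with P(action | type)
    observed_actions: sequence of observed action indices
    """
    log_post = np.log(prior)
    for a in observed_actions:
        log_post += np.log(action_probs[:, a])  # accumulate log-likelihood
    post = np.exp(log_post - log_post.max())    # stable normalisation
    return post / post.sum()

probs = np.array([[0.8, 0.2],    # type 0 mostly plays action 0
                  [0.3, 0.7]])   # type 1 mostly plays action 1
belief = type_posterior(np.array([0.5, 0.5]), probs, [1, 1, 1])
```

After repeatedly observing action 1, the belief concentrates on type 1; the paper studies how the choice of prior affects this process in practice.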

Are You Doing What I Think You Are Doing? Criticising Uncertain Agent Models

no code implementations2 Jul 2019 Stefano V. Albrecht, Subramanian Ramamoorthy

The key for effective interaction in many multiagent applications is to reason explicitly about the behaviour of other agents, in the form of a hypothesised behaviour.

Dealing with Non-Stationarity in Multi-Agent Deep Reinforcement Learning

no code implementations11 Jun 2019 Georgios Papoudakis, Filippos Christianos, Arrasy Rahman, Stefano V. Albrecht

Recent developments in deep reinforcement learning are concerned with creating decision-making agents which can perform well in various complex domains.

Decision Making Meta-Learning +3

Reasoning about Unforeseen Possibilities During Policy Learning

no code implementations10 Jan 2018 Craig Innes, Alex Lascarides, Stefano V. Albrecht, Subramanian Ramamoorthy, Benjamin Rosman

Methods for learning optimal policies in autonomous agents often assume that the way the domain is conceptualised---its possible states and actions and their causal structure---is known in advance and does not change during learning.

Autonomous Agents Modelling Other Agents: A Comprehensive Survey and Open Problems

no code implementations23 Sep 2017 Stefano V. Albrecht, Peter Stone

Much research in artificial intelligence is concerned with the development of autonomous agents that can interact effectively with other agents.

Belief and Truth in Hypothesised Behaviours

no code implementations28 Jul 2015 Stefano V. Albrecht, Jacob W. Crandall, Subramanian Ramamoorthy

The idea is to hypothesise a set of types, each specifying a possible behaviour for the other agents, and to plan our own actions with respect to those types which we believe are most likely, given the observed actions of the agents.

A Game-Theoretic Model and Best-Response Learning Method for Ad Hoc Coordination in Multiagent Systems

no code implementations3 Jun 2015 Stefano V. Albrecht, Subramanian Ramamoorthy

Based on this model, we derive a solution, called Harsanyi-Bellman Ad Hoc Coordination (HBA), which utilises the concept of Bayesian Nash equilibrium in a planning procedure to find optimal actions in the sense of Bellman optimal control.

Exploiting Causality for Selective Belief Filtering in Dynamic Bayesian Networks

no code implementations30 Jan 2014 Stefano V. Albrecht, Subramanian Ramamoorthy

Furthermore, we demonstrate how passivity occurs naturally in a complex system such as a multi-robot warehouse, and how PSBF can exploit this to accelerate the filtering task.
