Search Results for author: Paul Duckworth

Found 10 papers, 4 papers with code

SMX: Sequential Monte Carlo Planning for Expert Iteration

no code implementations • 12 Feb 2024 • Matthew V Macfarlane, Edan Toledo, Donal Byrne, Siddarth Singh, Paul Duckworth, Alexandre Laterre

SMX shows a statistically significant performance improvement over AlphaZero and also works as an improvement operator for a model-free policy, matching or exceeding top model-free methods across both continuous and discrete environments.

Self-Learning

Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs

1 code implementation • 29 Nov 2023 • Andries Smit, Paul Duckworth, Nathan Grinsztajn, Thomas D. Barrett, Arnu Pretorius

In this context, multi-agent debate (MAD) has emerged as a promising strategy for enhancing the truthfulness of LLMs.

Benchmarking

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

1 code implementation • 16 Jun 2023 • Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence I. Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphael Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre

Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms.

Decision Making reinforcement-learning +1

DITTO: Offline Imitation Learning with World Models

no code implementations • 6 Feb 2023 • Branton DeMoss, Paul Duckworth, Nick Hawes, Ingmar Posner

We propose DITTO, an offline imitation learning algorithm which uses world models and on-policy reinforcement learning to address the problem of covariate shift, without access to an oracle or any additional online interactions.

Imitation Learning reinforcement-learning +1

Invariant Risk Minimisation for Cross-Organism Inference: Substituting Mouse Data for Human Data in Human Risk Factor Discovery

no code implementations • 14 Nov 2021 • Odhran O'Donoghue, Paul Duckworth, Giuseppe Ughi, Linus Scheibenreif, Kia Khezeli, Adrienne Hoarfrost, Samuel Budd, Patrick Foley, Nicholas Chia, John Kalantari, Graham Mackintosh, Frank Soboczenski, Lauren Sanders

In this work, we augment small human medical datasets with in-vitro data and animal models.

Planning for Risk-Aversion and Expected Value in MDPs

1 code implementation • 25 Oct 2021 • Marc Rigter, Paul Duckworth, Bruno Lacerda, Nick Hawes

This motivates us to propose a lexicographic approach which minimises the expected cost subject to the constraint that the CVaR of the total cost is optimal.

On Solving a Stochastic Shortest-Path Markov Decision Process as Probabilistic Inference

no code implementations • 13 Sep 2021 • Mohamed Baioumy, Bruno Lacerda, Paul Duckworth, Nick Hawes

Previous work on planning as active inference addresses finite horizon problems and solutions valid for online planning.

Active Inference for Integrated State-Estimation, Control, and Learning

1 code implementation • 12 May 2020 • Mohamed Baioumy, Paul Duckworth, Bruno Lacerda, Nick Hawes

This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators.

Robotics

Towards better healthcare: What could and should be automated?

no code implementations • 21 Oct 2019 • Wolfgang Frühwirt, Paul Duckworth

While artificial intelligence (AI) and other automation technologies might lead to enormous progress in healthcare, they may also have undesired consequences for people working in the field.

Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands

no code implementations • WS 2017 • Muhannad Alomari, Paul Duckworth, Majd Hawasly, David C. Hogg, Anthony G. Cohn

This is achieved by first learning a set of visual 'concepts' that abstract the visual feature spaces into concepts that have human-level meaning.
