Search Results for author: Paul Duckworth

Found 10 papers, 4 papers with code

SMX: Sequential Monte Carlo Planning for Expert Iteration

no code implementations · 12 Feb 2024 · Matthew V Macfarlane, Edan Toledo, Donal Byrne, Siddarth Singh, Paul Duckworth, Alexandre Laterre

SMX demonstrates a statistically significant improvement in performance over AlphaZero and also serves as an improvement operator for a model-free policy, matching or exceeding top model-free methods across both continuous and discrete environments.

Self-Learning

Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs

1 code implementation · 29 Nov 2023 · Andries Smit, Paul Duckworth, Nathan Grinsztajn, Thomas D. Barrett, Arnu Pretorius

In this context, multi-agent debate (MAD) has emerged as a promising strategy for enhancing the truthfulness of LLMs.

Benchmarking

DITTO: Offline Imitation Learning with World Models

no code implementations · 6 Feb 2023 · Branton DeMoss, Paul Duckworth, Nick Hawes, Ingmar Posner

We propose DITTO, an offline imitation learning algorithm which uses world models and on-policy reinforcement learning to address the problem of covariate shift, without access to an oracle or any additional online interactions.

Imitation Learning reinforcement-learning +1

Planning for Risk-Aversion and Expected Value in MDPs

1 code implementation · 25 Oct 2021 · Marc Rigter, Paul Duckworth, Bruno Lacerda, Nick Hawes

This motivates us to propose a lexicographic approach which minimises the expected cost subject to the constraint that the CVaR of the total cost is optimal.

On Solving a Stochastic Shortest-Path Markov Decision Process as Probabilistic Inference

no code implementations · 13 Sep 2021 · Mohamed Baioumy, Bruno Lacerda, Paul Duckworth, Nick Hawes

Previous work on planning as active inference addresses finite-horizon problems and solutions valid for online planning.

valid

Active Inference for Integrated State-Estimation, Control, and Learning

1 code implementation · 12 May 2020 · Mohamed Baioumy, Paul Duckworth, Bruno Lacerda, Nick Hawes

This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators.

Robotics

Towards better healthcare: What could and should be automated?

no code implementations · 21 Oct 2019 · Wolfgang Frühwirt, Paul Duckworth

While artificial intelligence (AI) and other automation technologies might lead to enormous progress in healthcare, they may also have undesired consequences for people working in the field.

Natural Language Grounding and Grammar Induction for Robotic Manipulation Commands

no code implementations · WS 2017 · Muhannad Alomari, Paul Duckworth, Majd Hawasly, David C. Hogg, Anthony G. Cohn

This is achieved by first learning a set of visual 'concepts' that abstract the visual feature spaces into concepts that have human-level meaning.
