no code implementations • 12 Feb 2024 • Matthew V Macfarlane, Edan Toledo, Donal Byrne, Siddarth Singh, Paul Duckworth, Alexandre Laterre
SMX demonstrates a statistically significant improvement in performance over AlphaZero and also performs well as an improvement operator for a model-free policy, matching or exceeding top model-free methods across both continuous and discrete environments.
1 code implementation • 29 Nov 2023 • Andries Smit, Paul Duckworth, Nathan Grinsztajn, Thomas D. Barrett, Arnu Pretorius
In this context, multi-agent debate (MAD) has emerged as a promising strategy for enhancing the truthfulness of LLMs.
1 code implementation • 16 Jun 2023 • Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence I. Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphael Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre
Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms.
no code implementations • 6 Feb 2023 • Branton DeMoss, Paul Duckworth, Nick Hawes, Ingmar Posner
We propose DITTO, an offline imitation learning algorithm which uses world models and on-policy reinforcement learning to address the problem of covariate shift, without access to an oracle or any additional online interactions.
no code implementations • 14 Nov 2021 • Odhran O'Donoghue, Paul Duckworth, Giuseppe Ughi, Linus Scheibenreif, Kia Khezeli, Adrienne Hoarfrost, Samuel Budd, Patrick Foley, Nicholas Chia, John Kalantari, Graham Mackintosh, Frank Soboczenski, Lauren Sanders
In this work, we augment small human medical datasets with in-vitro data and animal models.
1 code implementation • 25 Oct 2021 • Marc Rigter, Paul Duckworth, Bruno Lacerda, Nick Hawes
This motivates us to propose a lexicographic approach which minimises the expected cost subject to the constraint that the CVaR of the total cost is optimal.
no code implementations • 13 Sep 2021 • Mohamed Baioumy, Bruno Lacerda, Paul Duckworth, Nick Hawes
Previous work on planning as active inference addresses finite-horizon problems and solutions valid for online planning.
1 code implementation • 12 May 2020 • Mohamed Baioumy, Paul Duckworth, Bruno Lacerda, Nick Hawes
This work presents an approach for control, state-estimation and learning model (hyper)parameters for robotic manipulators.
Robotics
no code implementations • 21 Oct 2019 • Wolfgang Frühwirt, Paul Duckworth
While artificial intelligence (AI) and other automation technologies might lead to enormous progress in healthcare, they may also have undesired consequences for people working in the field.
no code implementations • WS 2017 • Muhannad Alomari, Paul Duckworth, Majd Hawasly, David C. Hogg, Anthony G. Cohn
This is achieved by first learning a set of visual 'concepts' that abstract the visual feature spaces into concepts that have human-level meaning.