no code implementations • 3 Feb 2025 • Christopher E. Mower, Haitham Bou-Ammar
Inferring physical laws from data is a central challenge in science and engineering, including but not limited to healthcare, physical sciences, biosciences, social sciences, sustainability, climate, and robotics.
no code implementations • 2 Jan 2025 • Rasul Tutunov, Antoine Grosnit, Haitham Bou-Ammar
However, the vast number of DPO variants in the literature has made it increasingly difficult for researchers to navigate and fully grasp the connections between these approaches.
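For orientation, the objective from which most of these variants depart is the original DPO loss, written here in standard notation ($\pi_\theta$ the policy being tuned, $\pi_{\mathrm{ref}}$ a frozen reference model, $(x, y_w, y_l)$ a prompt with preferred and dispreferred responses):

\[
\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)}\!\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right].
\]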
no code implementations • 8 Nov 2024 • Puze Liu, Jonas Günster, Niklas Funk, Simon Gröger, Dong Chen, Haitham Bou-Ammar, Julius Jankowski, Ante Marić, Sylvain Calinon, Andrej Orsula, Miguel Olivares-Mendez, Hongyi Zhou, Rudolf Lioutikov, Gerhard Neumann, Amarildo Likmeta, Amirhossein Zhalehmehrabi, Thomas Bonenfant, Marcello Restelli, Davide Tateo, Ziyuan Liu, Jan Peters
Despite the many challenges associated with combining machine learning technology with robotics, robot learning remains one of the most promising directions for enhancing the capabilities of robots.
no code implementations • 5 Nov 2024 • Antoine Grosnit, Alexandre Maraval, James Doran, Giuseppe Paolo, Albert Thomas, Refinath Shahul Hameed Nabeezath Beevi, Jonas Gonzalez, Khyati Khandelwal, Ignacio Iacobacci, Abdelhakim Benechehab, Hamza Cherkaoui, Youssef Attia El-Hili, Kun Shao, Jianye Hao, Jun Yao, Balazs Kegl, Haitham Bou-Ammar, Jun Wang
We introduce Agent K v1.0, an end-to-end autonomous data science agent designed to automate, optimise, and generalise across diverse data science tasks.
no code implementations • 7 Oct 2024 • Fenia Christopoulou, Ronald Cardenas, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang
Preference Optimization (PO) has proven an effective step for aligning language models to human-desired behaviors.
no code implementations • 19 Aug 2024 • Dimitrios Tsaras, Antoine Grosnit, Lei Chen, Zhiyao Xie, Haitham Bou-Ammar, Mingxuan Yuan
Chip design relies heavily on generating Boolean circuits, such as AND-Inverter Graphs (AIGs), from functional descriptions like truth tables.
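As a purely illustrative example (not the paper's generation method), a two-input XOR can be expressed as an AIG using only two-input ANDs and inverters; the short Python sketch below checks that construction against the XOR truth table:

```python
from itertools import product

# AIG primitives: two-input AND plus inversion (NOT), on bits 0/1.
def AND(x, y):
    return x & y

def NOT(x):
    return 1 - x

def xor_aig(a, b):
    # XOR(a, b) = NOT(AND(NOT(AND(a, NOT b)), NOT(AND(NOT a, b))))
    return NOT(AND(NOT(AND(a, NOT(b))), NOT(AND(NOT(a), b))))

# Verify the AIG against the XOR truth table.
for a, b in product([0, 1], repeat=2):
    assert xor_aig(a, b) == (a ^ b)
print("XOR AIG matches its truth table")
```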
1 code implementation • 12 Jul 2024 • Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang
Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences.
1 code implementation • 28 Jun 2024 • Christopher E. Mower, Yuhui Wan, Hongzhan Yu, Antoine Grosnit, Jonas Gonzalez-Billandon, Matthieu Zimmer, Jinlong Wang, Xinyu Zhang, Yao Zhao, Anbang Zhai, Puze Liu, Daniel Palenicek, Davide Tateo, Cesar Cadena, Marco Hutter, Jan Peters, Guangjian Tian, Yuzheng Zhuang, Kun Shao, Xingyue Quan, Jianye Hao, Jun Wang, Haitham Bou-Ammar
Key features of the framework include: integration of ROS with an AI agent connected to a plethora of open-source and commercial LLMs, automatic extraction of a behavior from the LLM output and execution of ROS actions/services, support for three behavior modes (sequence, behavior tree, state machine), imitation learning for adding new robot actions to the library of possible actions, and LLM reflection via human and environment feedback.
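A minimal sketch of the "sequence" behavior mode described above, showing how an LLM reply could be parsed into calls from a fixed action library. All names here (the JSON reply format, ACTION_LIBRARY, parse_actions) are hypothetical stand-ins for illustration, not the framework's actual API:

```python
import json

# Hypothetical library of robot actions the agent is allowed to call.
ACTION_LIBRARY = {
    "move_to": lambda target: print(f"moving to {target}"),
    "grasp":   lambda obj:    print(f"grasping {obj}"),
}

def parse_actions(llm_reply: str):
    """Extract a sequence of {"action": ..., "arg": ...} items from an LLM reply."""
    return json.loads(llm_reply)

def run_sequence(llm_reply: str):
    for step in parse_actions(llm_reply):
        name, arg = step["action"], step["arg"]
        if name not in ACTION_LIBRARY:
            raise ValueError(f"LLM requested unknown action: {name}")
        ACTION_LIBRARY[name](arg)  # in the real system this would trigger a ROS action/service

run_sequence('[{"action": "move_to", "arg": "table"}, {"action": "grasp", "arg": "cup"}]')
```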
no code implementations • 13 Apr 2024 • Puze Liu, Haitham Bou-Ammar, Jan Peters, Davide Tateo
Indeed, safety specifications, often represented as constraints, can be complex and non-linear, making safety challenging to guarantee in learning systems.
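One standard way to formalise such specifications is the constrained MDP objective, in which the policy maximises return subject to bounds on expected discounted costs (a generic formulation given for context, not necessarily the one adopted in this work):

\[
\max_{\pi}\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i, \qquad i = 1, \dots, m.
\]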
no code implementations • 20 Feb 2024 • Adam X. Yang, Maxime Robeyns, Thomas Coste, Zhengyan Shi, Jun Wang, Haitham Bou-Ammar, Laurence Aitchison
To ensure that large language model (LLM) responses are helpful and non-toxic, a reward model trained on human preference data is typically used.
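A minimal sketch of how such a reward model is commonly trained: a Bradley-Terry pairwise loss on (chosen, rejected) response pairs, where the tensors below stand in for scalar rewards produced by a reward head on top of the LLM (illustrative only, not this paper's training code):

```python
import torch
import torch.nn.functional as F

def pairwise_reward_loss(reward_chosen: torch.Tensor,
                         reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: push rewards of preferred responses above rejected ones."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: rewards the model assigned to three preference pairs.
r_chosen = torch.tensor([1.2, 0.3, 2.1])
r_rejected = torch.tensor([0.4, 0.5, 1.0])
print(pairwise_reward_loss(r_chosen, r_rejected))  # scalar loss to backpropagate
```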
no code implementations • 22 Dec 2023 • Filippos Christianos, Georgios Papoudakis, Matthieu Zimmer, Thomas Coste, Zhihao Wu, Jingxuan Chen, Khyati Khandelwal, James Doran, Xidong Feng, Jiacheng Liu, Zheng Xiong, Yicheng Luo, Jianye Hao, Kun Shao, Haitham Bou-Ammar, Jun Wang
This paper presents a general framework model for integrating and learning structured reasoning in AI agents' policies.
no code implementations • 20 Oct 2023 • Rasul Tutunov, Antoine Grosnit, Juliusz Ziomek, Jun Wang, Haitham Bou-Ammar
This paper delves into the capabilities of large language models (LLMs), specifically focusing on advancing the theoretical comprehension of chain-of-thought prompting.
2 code implementations • 30 Jan 2023 • Juliusz Ziomek, Haitham Bou-Ammar
Learning decompositions of expensive-to-evaluate black-box functions promises to scale Bayesian optimisation (BO) to high-dimensional problems.
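The decompositions in question are typically additive, modelling the objective as a sum of low-dimensional components over groups of input variables (a standard form, shown here for orientation):

\[
f(\mathbf{x}) \approx \sum_{i=1}^{M} f_i\big(\mathbf{x}_{G_i}\big), \qquad G_i \subseteq \{1, \dots, D\},
\]

so that each component $f_i$ can be modelled by a low-dimensional Gaussian process.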
no code implementations • 29 Jan 2023 • Vahan Arsenyan, Antoine Grosnit, Haitham Bou-Ammar
Causal Bayesian optimisation (CaBO) combines causality with Bayesian optimisation (BO) and shows that there are situations where the optimal reward is not achievable if causal knowledge is ignored.
1 code implementation • 11 Jan 2023 • Piotr Kicki, Puze Liu, Davide Tateo, Haitham Bou-Ammar, Krzysztof Walas, Piotr Skrzypczyński, Jan Peters
Motion planning is a mature area of research in robotics, with many well-established methods based on optimization or on sampling the state space that are suitable for solving kinematic motion planning problems.
no code implementations • ICLR 2022 • Hang Ren, Aivar Sootla, Taher Jafferjee, Junxiao Shen, Jun Wang, Haitham Bou-Ammar
We consider a context-dependent Reinforcement Learning (RL) setting, which is characterized by: a) an unknown finite number of not directly observable contexts; b) abrupt (discontinuous) context changes occurring during an episode; and c) Markovian context evolution.
1 code implementation • 14 Feb 2022 • Aivar Sootla, Alexander I. Cowen-Rivers, Taher Jafferjee, Ziyan Wang, David Mguni, Jun Wang, Haitham Bou-Ammar
Satisfying safety constraints almost surely (or with probability one) can be critical for the deployment of Reinforcement Learning (RL) in real-life applications.
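One common construction for enforcing constraints along entire trajectories is to augment the state with a remaining safety budget and stop the episode once it is exhausted. The wrapper below is an illustrative sketch of that idea in a Gymnasium-style interface, not the paper's implementation; it assumes the wrapped environment reports a per-step cost in info['cost']:

```python
import numpy as np
import gymnasium as gym

class SafetyBudgetWrapper(gym.Wrapper):
    """Augment observations with the remaining safety budget; terminate when it hits zero."""

    def __init__(self, env: gym.Env, budget: float):
        super().__init__(env)
        self.budget = budget
        self.remaining = budget
        # A full implementation would also extend self.observation_space accordingly.

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        self.remaining = self.budget
        return np.append(obs, self.remaining / self.budget), info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.remaining -= info.get("cost", 0.0)  # assumed per-step safety cost
        if self.remaining <= 0.0:                # budget exhausted: stop the episode
            terminated = True
        obs = np.append(obs, max(self.remaining, 0.0) / self.budget)
        return obs, reward, terminated, truncated, info
```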
no code implementations • 3 Feb 2022 • Xihan Li, Xiang Chen, Rasul Tutunov, Haitham Bou-Ammar, Lei Wang, Jun Wang
The Schr\"odinger equation is at the heart of modern quantum mechanics.
1 code implementation • 29 Jan 2022 • Asif Khan, Alexander I. Cowen-Rivers, Antoine Grosnit, Derrick-Goh-Xin Deik, Philippe A. Robert, Victor Greiff, Eva Smorodina, Puneet Rawat, Kamil Dreczkowski, Rahmad Akbar, Rasul Tutunov, Dany Bou-Ammar, Jun Wang, Amos Storkey, Haitham Bou-Ammar
software suite as a black-box oracle to score the target specificity and affinity of designed antibodies in silico in an unconstrained fashion (Robert et al., 2021).
2 code implementations • 7 Jun 2021 • Antoine Grosnit, Rasul Tutunov, Alexandre Max Maraval, Ryan-Rhys Griffiths, Alexander I. Cowen-Rivers, Lin Yang, Lin Zhu, Wenlong Lyu, Zhitang Chen, Jun Wang, Jan Peters, Haitham Bou-Ammar
We introduce a method combining variational autoencoders (VAEs) and deep metric learning to perform Bayesian optimisation (BO) over high-dimensional and structured input spaces.
Ranked #1 on Molecular Graph Generation on ZINC
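A sketch of the deep-metric-learning ingredient of the method above: a triplet loss on latent codes that pulls together inputs with similar black-box values and pushes apart dissimilar ones, making the latent space easier for a GP surrogate to model. The tiny encoder below is a stand-in, not the paper's architecture, and in practice this loss would be combined with the usual VAE reconstruction and KL terms:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 8))  # toy encoder head
triplet = nn.TripletMarginLoss(margin=1.0)

# Anchor/positive share similar objective values; negative has a very different one.
x_anchor, x_pos, x_neg = torch.randn(16, 64), torch.randn(16, 64), torch.randn(16, 64)
loss = triplet(encoder(x_anchor), encoder(x_pos), encoder(x_neg))
loss.backward()  # shapes the latent space alongside the VAE training objective
```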
1 code implementation • 15 Dec 2020 • Antoine Grosnit, Alexander I. Cowen-Rivers, Rasul Tutunov, Ryan-Rhys Griffiths, Jun Wang, Haitham Bou-Ammar
Bayesian optimisation presents a sample-efficient methodology for global optimisation.
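That sample efficiency comes from optimising an acquisition function over a probabilistic surrogate rather than the expensive black box itself; the canonical example is Expected Improvement which, for a Gaussian-process posterior with mean $\mu(x)$, standard deviation $\sigma(x)$ and incumbent best value $f^{*}$ (maximisation), is

\[
\mathrm{EI}(x) = \mathbb{E}\big[\max\!\big(f(x) - f^{*},\, 0\big)\big]
= \big(\mu(x) - f^{*}\big)\,\Phi(z) + \sigma(x)\,\varphi(z),
\qquad z = \frac{\mu(x) - f^{*}}{\sigma(x)},
\]

with $\Phi$ and $\varphi$ the standard normal CDF and PDF.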
no code implementations • 10 Feb 2020 • Rasul Tutunov, Minne Li, Alexander I. Cowen-Rivers, Jun Wang, Haitham Bou-Ammar
In this paper, we present C-ADAM, the first adaptive solver for compositional problems involving a non-linear functional nesting of expected values.
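Concretely, the compositional problems in question take the nested form

\[
\min_{x}\; \mathbb{E}_{v}\Big[ f_{v}\big( \mathbb{E}_{w}\big[\, g_{w}(x) \,\big] \big) \Big],
\]

where an outer expectation is taken over a non-linear function of an inner expected value, a nesting that standard stochastic solvers such as Adam do not handle directly.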
no code implementations • 19 Feb 2018 • Garrett Andersen, Peter Vrancx, Haitham Bou-Ammar
A common approach to HL is to provide the agent with a number of high-level skills that solve small parts of the overall problem.
no code implementations • 9 Feb 2018 • Jordi Grau-Moya, Felix Leibfried, Haitham Bou-Ammar
Within the context of video games, the notion of perfectly rational agents can be undesirable, as it leads to uninteresting situations where humans face tough adversarial decision makers.
no code implementations • 6 Aug 2017 • Felix Leibfried, Jordi Grau-Moya, Haitham Bou-Ammar
Different learning outcomes can be demonstrated by tuning a Lagrange multiplier accordingly.