no code implementations • 10 Oct 2024 • Xue Yan, Yan Song, Xidong Feng, Mengyue Yang, Haifeng Zhang, Haitham Bou Ammar, Jun Wang
In sequential decision-making (SDM) tasks, methods like reinforcement learning (RL) and heuristic search have made notable advances in specific cases.
1 code implementation • 4 Oct 2024 • Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras, Haitham Bou Ammar, Jun Wang
The growth in the number of parameters of Large Language Models (LLMs) has led to a significant surge in computational requirements, making them challenging and costly to deploy.
2 code implementations • 30 May 2024 • Shyam Sundhar Ramesh, Yifan Hu, Iason Chaimalas, Viraj Mehta, Pier Giuseppe Sessa, Haitham Bou Ammar, Ilija Bogunovic
Our approach builds upon reward-free direct preference optimization methods, but unlike previous approaches, it seeks a robust policy which maximizes the worst-case group performance.
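The worst-case-group objective described above can be sketched in a few lines: compute the reward-free (DPO-style) loss per preference pair, average within each group, and take the maximum over groups. This is an illustrative stand-in under assumed log-probability inputs, not the paper's implementation:

```python
import math

def dpo_loss(beta, logp_w, logp_l, ref_logp_w, ref_logp_l):
    # Standard DPO loss for one preference pair: -log sigmoid(beta * margin),
    # where the margin compares policy vs. reference log-prob gaps.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

def worst_case_group_loss(groups, beta=0.1):
    # Robust objective: minimising this maximises worst-case group
    # performance. `groups` maps a group id to a list of preference
    # tuples (logp_w, logp_l, ref_logp_w, ref_logp_l).
    group_means = []
    for pairs in groups.values():
        losses = [dpo_loss(beta, *p) for p in pairs]
        group_means.append(sum(losses) / len(losses))
    return max(group_means)
```

Minimising the max over group averages (rather than the pooled average) prevents a well-served majority group from masking poor alignment on a minority group.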
1 code implementation • NeurIPS 2023 • Kamil Dreczkowski, Antoine Grosnit, Haitham Bou Ammar
This paper introduces a modular framework for Mixed-variable and Combinatorial Bayesian Optimization (MCBO) to address the lack of systematic benchmarking and standardized evaluation in the field.
2 code implementations • NeurIPS 2023 • Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Haitham Bou Ammar
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
no code implementations • 16 May 2023 • Desong Du, Shaohang Han, Naiming Qi, Haitham Bou Ammar, Jun Wang, Wei Pan
Reinforcement learning (RL) has shown impressive performance on complicated robot control tasks.
no code implementations • 10 Sep 2022 • Alexander I. Cowen-Rivers, Philip John Gorinski, Aivar Sootla, Asif Khan, Liu Furui, Jun Wang, Jan Peters, Haitham Bou Ammar
Optimizing combinatorial structures is core to many real-world problems, such as those encountered in life sciences.
1 code implementation • 6 Jun 2022 • Aivar Sootla, Alexander I. Cowen-Rivers, Jun Wang, Haitham Bou Ammar
We further show that Simmer can stabilize training and improve the performance of safe RL with average constraints.
no code implementations • 27 May 2022 • Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Rasul Tutunov, Jun Wang, Haitham Bou Ammar
First, we notice that these models are trained on uniformly distributed inputs, which impairs predictive accuracy on non-uniform data, a setting that arises in any typical BO loop due to exploration-exploitation trade-offs.
no code implementations • 11 Nov 2021 • Antoine Grosnit, Cedric Malherbe, Rasul Tutunov, Xingchen Wan, Jun Wang, Haitham Bou Ammar
Optimising the quality-of-results (QoR) of circuits during logic synthesis is a formidable challenge necessitating the exploration of exponentially sized search spaces.
no code implementations • 6 Jul 2021 • Vincent Moens, Aivar Sootla, Haitham Bou Ammar, Jun Wang
We present a method for conditional sampling for pre-trained normalizing flows when only part of an observation is available.
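As a point of reference for the conditional-sampling problem, a naive rejection baseline (not the paper's method) draws unconditional samples and keeps those whose observed coordinate lands near the conditioning value. The two-variable map below is a hypothetical toy standing in for a pre-trained flow:

```python
import random

def flow_sample(rng):
    # Stand-in for a pre-trained normalizing flow: push Gaussian noise
    # through a fixed invertible linear map to get a correlated pair.
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    x1 = z1
    x2 = 0.8 * z1 + 0.6 * z2  # correlated with x1
    return x1, x2

def conditional_rejection(observed_x1, eps, n, seed=0):
    # Keep samples whose observed coordinate is within eps of the
    # conditioning value; the kept x2 values approximate p(x2 | x1).
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x1, x2 = flow_sample(rng)
        if abs(x1 - observed_x1) < eps:
            kept.append(x2)
    return kept
```

Rejection scales poorly as the observed part grows in dimension or eps shrinks, which is precisely the gap that a dedicated conditional-sampling method addresses.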
1 code implementation • 13 Mar 2021 • Le Cong Dinh, Yaodong Yang, Stephen Mcaleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang
Solving strategic games with huge action spaces is a critical yet under-explored topic in economics, operations research, and artificial intelligence.
no code implementations • 15 Feb 2021 • Yaodong Yang, Jun Luo, Ying Wen, Oliver Slumbers, Daniel Graves, Haitham Bou Ammar, Jun Wang, Matthew E. Taylor
Multiagent reinforcement learning (MARL) has achieved remarkable success in solving various types of video games.
3 code implementations • 7 Dec 2020 • Alexander I. Cowen-Rivers, Wenlong Lyu, Rasul Tutunov, Zhi Wang, Antoine Grosnit, Ryan Rhys Griffiths, Alexandre Max Maraval, Hao Jianye, Jun Wang, Jan Peters, Haitham Bou Ammar
Our results on the Bayesmark benchmark indicate that heteroscedasticity and non-stationarity pose significant challenges for black-box optimisers.
Ranked #1 on Hyperparameter Optimization on Bayesmark
5 code implementations • 19 Oct 2020 • Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham Bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving.
1 code implementation • NeurIPS 2019 • Minne Li, Lisheng Wu, Haitham Bou Ammar, Jun Wang
This paper is concerned with multi-view reinforcement learning (MVRL), which allows for decision making when agents share common dynamics but adhere to different observation models.
no code implementations • 9 Oct 2019 • Victor Gabillon, Rasul Tutunov, Michal Valko, Haitham Bou Ammar
In this paper, we formalise order-robust optimisation as an instance of online learning that minimises simple regret, and we propose Vroom, a zeroth-order optimisation algorithm that achieves vanishing regret in non-stationary environments while recovering favourable rates under stochastic reward-generating processes.
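To make the objective concrete: simple regret is the gap between the best value found so far and the true optimum, and a zeroth-order method may only query function values, never gradients. The random-search loop below is a toy sketch of that setting, not the Vroom algorithm:

```python
import random

def zeroth_order_search(f, lo, hi, budget, seed=0):
    # Toy zeroth-order optimiser: queries f only (no gradients)
    # and tracks the best point seen within the query budget.
    rng = random.Random(seed)
    best_x, best_y = None, float("inf")
    for _ in range(budget):
        x = rng.uniform(lo, hi)
        y = f(x)
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y

# Hypothetical objective with known optimum f* = 0 at x = 0.3.
f = lambda x: (x - 0.3) ** 2
best_x, best_y = zeroth_order_search(f, 0.0, 1.0, 2000)
simple_regret = best_y - 0.0  # best value found minus the true optimum
```

As the budget grows, the best sample approaches the optimum and the simple regret vanishes; the non-stationary case studied in the paper is harder because f may change between queries.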
no code implementations • 25 Sep 2019 • Yaodong Yang, Rasul Tutunov, Phu Sakulwongtana, Haitham Bou Ammar
Furthermore, we also show successful results on large joint strategy profiles with a maximum size in the order of $\mathcal{O}(2^{25})$ ($\approx 33$ million joint strategies) -- a setting not evaluable using $\alpha$-Rank with reasonable computational budget.
no code implementations • 30 Jul 2019 • Mohammed Amin Abdullah, Hang Ren, Haitham Bou Ammar, Vladimir Milenkovic, Rui Luo, Mingtian Zhang, Jun Wang
Reinforcement learning algorithms, though successful, tend to overfit to their training environments, hampering their application in the real world.
no code implementations • NeurIPS 2018 • Rasul Tutunov, Dongho Kim, Haitham Bou Ammar
Multitask reinforcement learning (MTRL) suffers from scalability issues when the number of tasks or trajectories grows large.
no code implementations • 10 Oct 2018 • Zheng Tian, Shihao Zou, Ian Davies, Tim Warr, Lisheng Wu, Haitham Bou Ammar, Jun Wang
The auxiliary reward for communication is integrated into the learning of the policy module.
no code implementations • 20 Apr 2016 • Decebal Constantin Mocanu, Haitham Bou Ammar, Luis Puig, Eric Eaton, Antonio Liotta
Estimation, recognition, and near-future prediction of 3D trajectories from their two-dimensional projections in a single camera source is an exceptionally difficult problem, owing to uncertainty in the trajectories and environment, the high dimensionality of the trajectory states, and the scarcity of labeled data.
no code implementations • 13 Apr 2016 • Yusen Zhan, Haitham Bou Ammar, Matthew E. Taylor
This paper formally defines a setting where multiple teacher agents can provide advice to a student and introduces an algorithm to leverage both autonomous exploration and teachers' advice.
no code implementations • 21 May 2015 • Haitham Bou Ammar, Rasul Tutunov, Eric Eaton
Lifelong reinforcement learning provides a promising framework for developing versatile agents that can accumulate knowledge over a lifetime of experience and rapidly learn new tasks by building upon prior knowledge.