Search Results for author: Rasul Tutunov

Found 15 papers, 4 papers with code

Why Can Large Language Models Generate Correct Chain-of-Thoughts?

no code implementations • 20 Oct 2023 • Rasul Tutunov, Antoine Grosnit, Juliusz Ziomek, Jun Wang, Haitham Bou-Ammar

This paper delves into the capabilities of large language models (LLMs), specifically focusing on advancing the theoretical comprehension of chain-of-thought prompting.

Text Generation

Sample-Efficient Optimisation with Probabilistic Transformer Surrogates

no code implementations • 27 May 2022 • Alexandre Maraval, Matthieu Zimmer, Antoine Grosnit, Rasul Tutunov, Jun Wang, Haitham Bou Ammar

First, we notice that these models are trained on uniformly distributed inputs, which impairs predictive accuracy on non-uniform data - a setting arising from any typical BO loop due to exploration-exploitation trade-offs.

Bayesian Optimisation • Gaussian Processes

BOiLS: Bayesian Optimisation for Logic Synthesis

no code implementations • 11 Nov 2021 • Antoine Grosnit, Cedric Malherbe, Rasul Tutunov, Xingchen Wan, Jun Wang, Haitham Bou Ammar

Optimising the quality-of-results (QoR) of circuits during logic synthesis is a formidable challenge necessitating the exploration of exponentially sized search spaces.

Bayesian Optimisation • Navigate

Efficient Semi-Implicit Variational Inference

no code implementations • 15 Jan 2021 • Vincent Moens, Hang Ren, Alexandre Maraval, Rasul Tutunov, Jun Wang, Haitham Ammar

In this paper, we propose CI-VI, an efficient and scalable solver for semi-implicit variational inference (SIVI).

Variational Inference

Compositional ADAM: An Adaptive Compositional Solver

no code implementations • 10 Feb 2020 • Rasul Tutunov, Minne Li, Alexander I. Cowen-Rivers, Jun Wang, Haitham Bou-Ammar

In this paper, we present C-ADAM, the first adaptive solver for compositional problems involving a non-linear functional nesting of expected values.
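As an illustration of what "a non-linear functional nesting of expected values" means (a generic sketch, not the paper's C-ADAM implementation; the objective and function names below are hypothetical), a compositional objective has the form f(E[g(x; ξ)]), where the outer function is applied to an inner expectation that can only be estimated by sampling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical compositional objective L(x) = f(E[g(x; xi)]),
# with inner map g(x; xi) = x + xi (noise xi) and outer f(y) = ||y||^2.
def inner_g(x, xi):
    return x + xi  # E[g(x; xi)] = x when xi has zero mean

def outer_f(y):
    return float(y @ y)

def compositional_loss(x, n_samples=10_000):
    xi = rng.normal(scale=0.1, size=(n_samples, x.size))
    inner_estimate = inner_g(x, xi).mean(axis=0)  # Monte Carlo estimate of E[g]
    return outer_f(inner_estimate)

x = np.array([1.0, -2.0])
print(compositional_loss(x))  # close to ||x||^2 = 5.0
```

Because the non-linear outer function wraps a sampled estimate, naive plug-in gradients of such objectives are biased, which is the difficulty specialised compositional solvers address.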

Derivative-Free & Order-Robust Optimisation

no code implementations • 9 Oct 2019 • Victor Gabillon, Rasul Tutunov, Michal Valko, Haitham Bou Ammar

In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments, while recovering favorable rates under stochastic reward-generating processes.
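Vroom itself has no code release here. As a generic illustration of zeroth-order (derivative-free) optimisation, a descent direction can be estimated from function evaluations alone via a random-direction finite difference (a minimal sketch; the step sizes and toy objective are assumptions, not the paper's algorithm):

```python
import numpy as np

def zeroth_order_step(f, x, step=0.1, eps=1e-3, rng=None):
    """One random-direction finite-difference descent step (generic sketch,
    not the Vroom algorithm)."""
    rng = rng or np.random.default_rng()
    u = rng.normal(size=x.shape)
    u /= np.linalg.norm(u)  # random unit direction
    # Directional-derivative estimate from just two function evaluations.
    g = (f(x + eps * u) - f(x - eps * u)) / (2 * eps)
    return x - step * g * u

f = lambda x: float(np.sum(x ** 2))  # toy smooth objective
x = np.array([3.0, -4.0])
rng = np.random.default_rng(0)
for _ in range(500):
    x = zeroth_order_step(f, x, rng=rng)
print(f(x))  # driven close to the minimum at 0
```

Each step queries the objective twice and never touches gradients, which is what makes such methods applicable when only (possibly noisy) function values are available.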

$α^α$-Rank: Practically Scaling $α$-Rank through Stochastic Optimisation

no code implementations • 25 Sep 2019 • Yaodong Yang, Rasul Tutunov, Phu Sakulwongtana, Haitham Bou Ammar

Furthermore, we also show successful results on large joint strategy profiles with a maximum size in the order of $\mathcal{O}(2^{25})$ ($\approx 33$ million joint strategies) -- a setting not evaluable using $\alpha$-Rank with a reasonable computational budget.

Stochastic Optimization
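The quoted scale is easy to check: one way a joint strategy space reaches $2^{25}$ profiles is 25 players each choosing between two strategies (a hypothetical reading used only to verify the arithmetic):

```python
# Number of joint profiles for 25 players with 2 strategies each.
n_players, n_strategies = 25, 2
n_profiles = n_strategies ** n_players
print(n_profiles)  # 33554432, i.e. ~33.5 million joint strategies
```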

Graph Attention Memory for Visual Navigation

no code implementations • 11 May 2019 • Dong Li, Qichao Zhang, Dongbin Zhao, Yuzheng Zhuang, Bin Wang, Wulong Liu, Rasul Tutunov, Jun Wang

To address the long-term memory issue, this paper proposes a graph attention memory (GAM) architecture consisting of a memory construction module, a graph attention module, and a control module.

Graph Attention • Reinforcement Learning (RL) • +1

Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret

no code implementations • 21 May 2015 • Haitham Bou Ammar, Rasul Tutunov, Eric Eaton

Lifelong reinforcement learning provides a promising framework for developing versatile agents that can accumulate knowledge over a lifetime of experience and rapidly learn new tasks by building upon prior knowledge.

reinforcement-learning • Reinforcement Learning (RL)
