Search Results for author: Sattar Vakili

Found 23 papers, 3 papers with code

Sample Complexity of Kernel-Based Q-Learning

no code implementations · 1 Feb 2023 Sing-Yuan Yeh, Fu-Chieh Chang, Chang-Wei Yueh, Pei-Yuan Wu, Alberto Bernacchia, Sattar Vakili

To the best of our knowledge, this is the first result showing a finite sample complexity under such a general model.

Q-Learning · Reinforcement Learning (RL)

Delayed Feedback in Kernel Bandits

no code implementations · 1 Feb 2023 Sattar Vakili, Danyal Ahmed, Alberto Bernacchia, Ciara Pike-Burke

An abstraction of the problem can be formulated as a kernel-based bandit problem (also known as Bayesian optimisation), where a learner aims at optimising a kernelized function through sequential noisy observations.

Bayesian Optimisation · Recommendation Systems
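The kernelized-bandit setting described above can be sketched with a plain GP-UCB loop, a standard baseline for this problem class (not the paper's delayed-feedback algorithm). The RBF kernel, lengthscale, noise level, and confidence width `beta` below are illustrative assumptions:

```python
import numpy as np

def rbf(X, Y, lengthscale=0.2):
    # Squared-exponential kernel matrix between two 1-D point sets.
    d = X[:, None] - Y[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=0.01):
    # Standard GP regression: posterior mean and standard deviation
    # at the query points, given noisy observations.
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k = rbf(x_query, x_train)
    mu = k @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k.T).T, axis=1)
    return mu, np.sqrt(np.clip(var, 1e-12, None))

def gp_ucb(f, T=40, beta=2.0, seed=0):
    # Sequentially query the maximiser of the upper confidence bound
    # mu + beta * sd over a grid, then report the best observed point.
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 200)
    xs = [rng.uniform()]
    ys = [f(xs[0]) + 0.01 * rng.standard_normal()]
    for _ in range(T - 1):
        mu, sd = gp_posterior(np.array(xs), np.array(ys), grid)
        x_next = grid[int(np.argmax(mu + beta * sd))]
        xs.append(x_next)
        ys.append(f(x_next) + 0.01 * rng.standard_normal())
    return xs[int(np.argmax(ys))]
```

Each round trades off exploitation (high posterior mean) against exploration (high posterior uncertainty), which is the sequential structure that delayed feedback complicates.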

Collaborative Learning in Kernel-based Bandits for Distributed Users

no code implementations · 16 Jul 2022 Sudeep Salgia, Sattar Vakili, Qing Zhao

We study collaborative learning among distributed clients facilitated by a central server.

Federated Learning

Provably and Practically Efficient Neural Contextual Bandits

no code implementations · 31 May 2022 Sudeep Salgia, Sattar Vakili, Qing Zhao

The non-asymptotic error bounds may be of broader interest as a tool to establish the relation between the smoothness of the activation functions in neural contextual bandits and the smoothness of the kernels in kernel bandits.

Multi-Armed Bandits

Near-Optimal Collaborative Learning in Bandits

1 code implementation · 31 May 2022 Clémence Réda, Sattar Vakili, Emilie Kaufmann

In this paper, we provide new lower bounds on the sample complexity of pure exploration and on the regret.

Federated Learning

Improved Convergence Rates for Sparse Approximation Methods in Kernel-Based Learning

no code implementations · 8 Feb 2022 Sattar Vakili, Jonathan Scarlett, Da-Shan Shiu, Alberto Bernacchia

Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization.

Gaussian Processes · regression

Open Problem: Tight Online Confidence Intervals for RKHS Elements

no code implementations · 28 Oct 2021 Sattar Vakili, Jonathan Scarlett, Tara Javidi

Confidence intervals are a crucial building block in the analysis of various online learning problems.

Reinforcement Learning (RL)

Uniform Generalization Bounds for Overparameterized Neural Networks

no code implementations · 13 Sep 2021 Sattar Vakili, Michael Bromberg, Jezabel Garcia, Da-Shan Shiu, Alberto Bernacchia

As a byproduct of our results, we show the equivalence between the RKHS corresponding to the NT kernel and its counterpart corresponding to the Matérn family of kernels, showing that NT kernels induce a very general class of models.

Generalization Bounds

Optimal Order Simple Regret for Gaussian Process Bandits

no code implementations · NeurIPS 2021 Sattar Vakili, Nacime Bouziani, Sepehr Jalali, Alberto Bernacchia, Da-Shan Shiu

Consider the sequential optimization of a continuous, possibly non-convex, and expensive-to-evaluate objective function $f$.

Art Analysis

On Information Gain and Regret Bounds in Gaussian Process Bandits

no code implementations · 15 Sep 2020 Sattar Vakili, Kia Khezeli, Victor Picheny

For the Matérn family of kernels, where the lower bounds on $\gamma_T$ and on regret under the frequentist setting are known, our results close a gap between the upper and lower bounds that is polynomial in $T$ (up to factors logarithmic in $T$).

Scalable Thompson Sampling using Sparse Gaussian Process Models

no code implementations · NeurIPS 2021 Sattar Vakili, Henry Moss, Artem Artemev, Vincent Dutordoir, Victor Picheny

We provide theoretical guarantees and show that the drastic reduction in computational complexity of scalable TS can be enjoyed without loss in the regret performance over the standard TS.

Thompson Sampling
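Thompson sampling, in its simplest form, pulls each arm with the probability that it is optimal under the current posterior. The generic Beta-Bernoulli sketch below illustrates the principle only; the paper's contribution concerns the sparse-GP continuum-armed variant, and the arm probabilities and horizon here are arbitrary assumptions:

```python
import numpy as np

def thompson_sampling(success_probs, T=5000, seed=0):
    # Beta-Bernoulli Thompson sampling: draw a mean from each arm's
    # Beta posterior, pull the arm with the largest draw, and update
    # that arm's success/failure counts.
    rng = np.random.default_rng(seed)
    k = len(success_probs)
    a, b = np.ones(k), np.ones(k)          # Beta(1, 1) priors
    pulls = np.zeros(k, dtype=int)
    for _ in range(T):
        arm = int(np.argmax(rng.beta(a, b)))
        reward = float(rng.uniform() < success_probs[arm])
        a[arm] += reward
        b[arm] += 1.0 - reward
        pulls[arm] += 1
    return pulls
```

In the GP setting the posterior draw becomes a sampled function rather than a vector of Beta draws, which is where the computational cost, and the motivation for sparse approximations, comes from.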

Amortized variance reduction for doubly stochastic objectives

no code implementations · 9 Mar 2020 Ayman Boustati, Sattar Vakili, James Hensman, ST John

Approximate inference in complex probabilistic models such as deep Gaussian processes requires the optimisation of doubly stochastic objective functions.

Gaussian Processes
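A common device for reducing the variance of such stochastic estimators is a control variate: subtract a correlated quantity with known mean, scaled by the estimated optimal coefficient. The snippet below is a generic sketch of this idea, not the paper's amortized scheme:

```python
import numpy as np

def control_variate_estimate(f_samples, h_samples, h_mean):
    # Monte Carlo estimate of E[f] with reduced variance: subtract a
    # correlated control h whose mean h_mean is known, scaled by the
    # estimated optimal coefficient c* = Cov(f, h) / Var(h).
    f = np.asarray(f_samples, dtype=float)
    h = np.asarray(h_samples, dtype=float)
    c = np.cov(f, h, ddof=1)[0, 1] / np.var(h, ddof=1)
    return float(np.mean(f - c * (h - h_mean)))
```

The estimator stays unbiased for any fixed coefficient, since the subtracted term has expectation zero; choosing the coefficient well is what shrinks the variance.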

Regret Bounds for Noise-Free Kernel-Based Bandits

no code implementations · 12 Feb 2020 Sattar Vakili

The kernel-based bandit problem is an extensively studied black-box optimization problem, in which the objective function is assumed to live in a known reproducing kernel Hilbert space.

Bayesian Optimisation

Ordinal Bayesian Optimisation

no code implementations · 5 Dec 2019 Victor Picheny, Sattar Vakili, Artem Artemev

Bayesian optimisation is a powerful tool for solving expensive black-box problems, but it fails when the stationarity assumption on the objective function is strongly violated, as is the case in particular for ill-conditioned or discontinuous objectives.

Bayesian Optimisation · Thompson Sampling

Adaptive Sensor Placement for Continuous Spaces

no code implementations · 16 May 2019 James A. Grant, Alexis Boukouvalas, Ryan-Rhys Griffiths, David S. Leslie, Sattar Vakili, Enrique Munoz de Cote

We consider the problem of adaptively placing sensors along an interval to detect stochastically-generated events.

Thompson Sampling

Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization

no code implementations · 17 Jan 2019 Sattar Vakili, Sudeep Salgia, Qing Zhao

Online minimization of an unknown convex function over the interval $[0, 1]$ is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point.
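The feedback model above is classical projected stochastic gradient descent on an interval; a minimal sketch (not the paper's tree-based adaptive method) looks like this, with the step-size schedule and averaging as illustrative choices:

```python
import numpy as np

def projected_sgd(grad_oracle, T=3000, x0=0.5, seed=0):
    # Minimise an unknown convex function over [0, 1] from noisy
    # first-order feedback: take a gradient step with rate 1/sqrt(t),
    # project back onto the interval, and return the averaged iterate.
    rng = np.random.default_rng(seed)
    x, avg = x0, 0.0
    for t in range(1, T + 1):
        g = grad_oracle(x, rng)                  # noisy gradient at x
        x = min(1.0, max(0.0, x - g / np.sqrt(t)))
        avg += (x - avg) / t                     # running average
    return avg
```

Averaging the iterates, rather than returning the last one, is the standard way to control the error of SGD under a decaying step size.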

Decision Variance in Online Learning

no code implementations · 24 Jul 2018 Sattar Vakili, Alexis Boukouvalas, Qing Zhao

In this paper, a risk-averse online learning problem under the performance measure of the mean-variance of the rewards is studied.

Multi-Armed Bandits on Partially Revealed Unit Interval Graphs

no code implementations · 12 Feb 2018 Xiao Xu, Sattar Vakili, Qing Zhao, Ananthram Swami

Two settings, with complete and partial side information depending on whether the UIG is fully revealed, are studied. To fully exploit the topological structure of the side information, a general two-step learning structure is proposed: an offline reduction of the action space, followed by online aggregation of reward observations from similar arms.

Multi-Armed Bandits

Anomaly Detection in Hierarchical Data Streams under Unknown Models

no code implementations · 11 Sep 2017 Sattar Vakili, Qing Zhao, Chang Liu, Chen-Nee Chuah

We consider the problem of detecting a few targets among a large number of hierarchical data streams.

Active Learning · Anomaly Detection +1

Risk-Averse Multi-Armed Bandit Problems under Mean-Variance Measure

no code implementations · 18 Apr 2016 Sattar Vakili, Qing Zhao

We show that the model-specific regret and the model-independent regret in terms of the mean-variance of the reward process are lower bounded by $\Omega(\log T)$ and $\Omega(T^{2/3})$, respectively.
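The mean-variance measure penalises risk in the Markowitz style: for a reward sequence with mean $\mu$ and variance $\sigma^2$, the quantity $\sigma^2 - \rho\mu$ is minimised, where $\rho$ trades off return against risk. A minimal empirical sketch (the coefficient name and sample values are illustrative):

```python
import numpy as np

def mean_variance(rewards, rho=1.0):
    # Empirical mean-variance of a reward sequence: variance minus
    # rho times the mean, so lower values are better for a
    # risk-averse learner; rho trades return against risk.
    r = np.asarray(rewards, dtype=float)
    return float(r.var() - rho * r.mean())
```

Under this measure an arm with slightly lower mean but much lower variance can be preferred, which is what separates the regret rates here from the classical risk-neutral ones.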
