no code implementations • 28 Feb 2025 • Sharan Vaswani, Reza Babanezhad
For smooth functions, Armijo-LS alleviates the need to know the global smoothness constant $L$ and adapts to the local smoothness, enabling GD to converge faster.
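As a rough illustration of the mechanism described above, the sketch below runs gradient descent with Armijo backtracking line-search; the sufficient-decrease constant, backtracking factor, and toy quadratic are illustrative assumptions, not the paper's choices.

```python
# Minimal sketch: gradient descent with Armijo backtracking line-search, which
# adapts the step-size to local smoothness instead of using a known 1/L.
import numpy as np

def gd_armijo(f, grad_f, x0, init_step=1.0, c=0.5, beta=0.8, max_iter=100):
    x = x0.astype(float)
    for _ in range(max_iter):
        g = grad_f(x)
        eta = init_step
        # Backtrack until the Armijo sufficient-decrease condition holds:
        # f(x - eta * g) <= f(x) - c * eta * ||g||^2
        while f(x - eta * g) > f(x) - c * eta * np.dot(g, g):
            eta *= beta
        x = x - eta * g
    return x

# Example: minimize a simple quadratic without knowing its smoothness constant.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
x_star = gd_armijo(f, grad_f, np.array([5.0, -3.0]))
```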
no code implementations • 11 Feb 2025 • Jincheng Mei, Bo Dai, Alekh Agarwal, Sharan Vaswani, Anant Raj, Csaba Szepesvari, Dale Schuurmans
The proofs are based on novel findings about action sampling rates and the relationship between cumulative progress and noise, and extend the current understanding of how simple stochastic gradient methods behave in bandit settings.
no code implementations • 20 Nov 2024 • Shuman Peng, Arash Khoeini, Sharan Vaswani, Martin Ester
The quality of self-supervised pre-trained embeddings on out-of-distribution (OOD) data is poor without fine-tuning.
no code implementations • 18 Nov 2024 • Reza Asad, Reza Babanezhad, Issam Laradji, Nicolas Le Roux, Sharan Vaswani
Natural policy gradient (NPG) is a common policy optimization algorithm and can be viewed as mirror ascent in the space of probabilities.
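To make the mirror-ascent view concrete, here is a minimal sketch (not the paper's algorithm) of the tabular softmax NPG update, where the policy is updated multiplicatively with exponentiated action values and renormalized; the Q-values and toy dimensions are assumed given.

```python
# Tabular softmax NPG as mirror ascent: exponentiated-gradient step + renormalization.
import numpy as np

def npg_mirror_ascent_step(pi, Q, eta):
    """pi, Q: arrays of shape (num_states, num_actions); eta: step-size."""
    new_pi = pi * np.exp(eta * Q)                       # multiplicative (mirror ascent) update
    return new_pi / new_pi.sum(axis=1, keepdims=True)   # renormalize per state

# Toy usage with a 2-state, 3-action problem and arbitrary Q-values.
pi = np.full((2, 3), 1.0 / 3.0)
Q = np.array([[1.0, 0.0, -1.0], [0.5, 0.2, 0.1]])
pi = npg_mirror_ascent_step(pi, Q, eta=0.1)
```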
1 code implementation • 21 May 2024 • Michael Lu, Matin Aghaei, Anant Raj, Sharan Vaswani
We show that the proposed algorithm offers theoretical guarantees similar to state-of-the-art results, but does not require knowledge of oracle-like quantities.
1 code implementation • 27 Feb 2024 • Saurabh Mishra, Anant Raj, Sharan Vaswani
Inverse optimization involves inferring unknown parameters of an optimization problem from known solutions and is widely used in fields such as transportation, power systems, and healthcare.
1 code implementation • 12 Jan 2024 • Anh Dang, Reza Babanezhad, Sharan Vaswani
Specifically, we prove that with the same step-size and momentum parameters as in the deterministic setting, SHB with a sufficiently large mini-batch size results in an $O\left(\exp(-\frac{T}{\sqrt{\kappa}}) + \sigma \right)$ convergence, where $T$ is the number of iterations and $\sigma^2$ is the variance in the stochastic gradients.
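For reference, a minimal sketch of the stochastic heavy ball (SHB) update with a fixed step-size, fixed momentum, and a mini-batch gradient appears below; the quadratic objective, batch size, and parameter values are placeholders rather than the constants from the analysis.

```python
# Sketch of stochastic heavy ball (SHB) with constant step-size and momentum.
import numpy as np

def shb(grad_batch, x0, step, momentum, batch_size, num_iters, rng):
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(num_iters):
        g = grad_batch(x, batch_size, rng)                # mini-batch stochastic gradient
        x_next = x - step * g + momentum * (x - x_prev)   # heavy-ball update
        x_prev, x = x, x_next
    return x

# Toy least-squares problem: gradient of 0.5*||Ax - b||^2 estimated from sampled rows.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((1000, 10)), rng.standard_normal(1000)
def grad_batch(x, m, rng):
    idx = rng.integers(0, A.shape[0], size=m)
    return A[idx].T @ (A[idx] @ x - b[idx]) / m

x = shb(grad_batch, np.zeros(10), step=0.1, momentum=0.5,
        batch_size=128, num_iters=200, rng=rng)
```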
1 code implementation • NeurIPS 2023 • Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux
Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective.
1 code implementation • 6 Feb 2023 • Jonathan Wilder Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux
Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a \emph{target space} (e.g. the logits output by a linear model for classification) that can be minimized efficiently.
1 code implementation • 29 Jul 2022 • Jonathan Wilder Lavington, Sharan Vaswani, Mark Schmidt
Specifically, if the class of policies is sufficiently expressive to contain the expert policy, we prove that DAGGER achieves constant regret.
no code implementations • 13 Jun 2022 • Sharan Vaswani, Lin F. Yang, Csaba Szepesvári
In particular, we design a model-based algorithm that addresses two settings: (i) relaxed feasibility, where small constraint violations are allowed, and (ii) strict feasibility, where the output policy is required to satisfy the constraint.
1 code implementation • 11 Apr 2022 • Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup
We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms in terms of their primal and dual regret on online linear optimization problems.
no code implementations • 21 Oct 2021 • Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad
In order to be adaptive to the smoothness, we use a stochastic line-search (SLS) and show (via upper and lower-bounds) that SGD with SLS converges at the desired rate, but only to a neighbourhood of the solution.
1 code implementation • 12 Aug 2021 • Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux
Common policy gradient methods rely on the maximization of a sequence of surrogate functions.
no code implementations • 18 Feb 2021 • Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Simon Lacoste-Julien
Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate.
no code implementations • 28 Sep 2020 • Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien
Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.
1 code implementation • 11 Jun 2020 • Sharan Vaswani, Issam Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien
In this setting, we prove that AMSGrad with constant step-size and momentum converges to the minimizer at a faster $O(1/T)$ rate.
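To make the update being analyzed concrete, here is a minimal AMSGrad sketch with a constant step-size and momentum; the hyperparameter values and toy objective are illustrative only.

```python
# AMSGrad with constant step-size: Adam-style moments plus a running max of v.
import numpy as np

def amsgrad(grad, x0, step=0.1, beta1=0.9, beta2=0.99, eps=1e-8, num_iters=500):
    x = x0.copy()
    m = np.zeros_like(x)      # first-moment estimate (momentum)
    v = np.zeros_like(x)      # second-moment estimate
    v_hat = np.zeros_like(x)  # running maximum of v (the AMSGrad modification)
    for _ in range(num_iters):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        v_hat = np.maximum(v_hat, v)
        x = x - step * m / (np.sqrt(v_hat) + eps)
    return x

# Toy smooth convex example: minimize 0.5*||x||^2.
x_min = amsgrad(lambda x: x, np.array([3.0, -2.0]))
```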
no code implementations • 11 Jun 2020 • Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux
For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum $P$-margin solution.
1 code implementation • 24 Feb 2020 • Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien
Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate for stochastic gradient descent (SGD).
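The sketch below illustrates SGD with a stochastic Polyak step-size on a finite sum, assuming each per-example optimal value is zero (as under interpolation); the cap and the constant $c$ are illustrative choices rather than the paper's tuning.

```python
# SGD with the stochastic Polyak step-size (SPS), capped at eta_max.
import numpy as np

def sgd_sps(f_i, grad_i, x0, n, c=0.5, eta_max=10.0, num_iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(num_iters):
        i = rng.integers(n)
        g = grad_i(x, i)
        # Polyak step-size on the sampled loss (assuming f_i^* = 0), capped at eta_max.
        eta = min(f_i(x, i) / (c * np.dot(g, g) + 1e-12), eta_max)
        x = x - eta * g
    return x

# Toy interpolated least-squares problem: b = A @ x_true exactly.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5)); x_true = rng.standard_normal(5); b = A @ x_true
f_i = lambda x, i: 0.5 * (A[i] @ x - b[i])**2
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_hat = sgd_sps(f_i, grad_i, np.zeros(5), n=50)
```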
1 code implementation • 11 Oct 2019 • Sharan Vaswani, Abbas Mehrabian, Audrey Durand, Branislav Kveton
We propose $\tt RandUCB$, a bandit strategy that builds on theoretically derived confidence intervals similar to upper confidence bound (UCB) algorithms, but akin to Thompson sampling (TS), it uses randomization to trade off exploration and exploitation.
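A rough sketch of the idea for multi-armed bandits follows: the usual UCB confidence width is scaled by a random multiplier drawn each round, so exploration is randomized rather than deterministic. The sampling distribution used here (uniform on $[0, \beta]$) and the Bernoulli reward model are simplifying assumptions, not necessarily the paper's choices.

```python
# Sketch of a RandUCB-style strategy: UCB index with a randomized confidence multiplier.
import numpy as np

def randucb(means_true, horizon=5000, beta=2.0, seed=0):
    rng = np.random.default_rng(seed)
    k = len(means_true)
    counts = np.zeros(k)
    sums = np.zeros(k)
    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once to initialize
        else:
            width = np.sqrt(2 * np.log(t + 1) / counts)
            z = rng.uniform(0.0, beta)                    # random confidence multiplier
            arm = int(np.argmax(sums / counts + z * width))
        reward = rng.binomial(1, means_true[arm])          # Bernoulli rewards
        counts[arm] += 1
        sums[arm] += reward
    return sums.sum()

total_reward = randucb(np.array([0.2, 0.5, 0.7]))
```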
1 code implementation • 11 Oct 2019 • Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien
Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.
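An illustrative regularized subsampled Newton step is sketched below: the gradient and Hessian are estimated on a sampled mini-batch, a ridge term is added to the Hessian, and the resulting linear system is solved. The batch size, regularization strength, and fixed step-size are placeholders; the paper's adaptive step-size is not reproduced here.

```python
# Sketch of a regularized subsampled Newton (R-SSN) iteration for a finite sum.
import numpy as np

def r_ssn(grad_batch, hess_batch, x0, n, batch_size=32, lam=1e-2,
          step=1.0, num_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(num_iters):
        idx = rng.integers(0, n, size=batch_size)
        g = grad_batch(x, idx)
        H = hess_batch(x, idx) + lam * np.eye(x.size)  # regularized subsampled Hessian
        x = x - step * np.linalg.solve(H, g)           # Newton-type step
    return x

# Toy least-squares objective: f(x) = 0.5/n * ||Ax - b||^2.
rng = np.random.default_rng(2)
A = rng.standard_normal((200, 8)); b = rng.standard_normal(200)
grad_batch = lambda x, idx: A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)
hess_batch = lambda x, idx: A[idx].T @ A[idx] / len(idx)
x_hat = r_ssn(grad_batch, hess_batch, np.zeros(8), n=200)
```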
1 code implementation • NeurIPS 2019 • Sharan Vaswani, Aaron Mishkin, Issam Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien
To improve the proposed methods' practical performance, we give heuristics for using larger step-sizes and acceleration.
no code implementations • 13 Nov 2018 • Branislav Kveton, Csaba Szepesvari, Sharan Vaswani, Zheng Wen, Mohammad Ghavamzadeh, Tor Lattimore
Specifically, it pulls the arm with the highest mean reward in a non-parametric bootstrap sample of its history, augmented with pseudo rewards.
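A rough sketch of this mechanism for Bernoulli bandits appears below: each arm's observed history is augmented with pseudo rewards (one 0 and one 1 per pull here, a simplifying assumption), a bootstrap resample of that history is drawn, and the arm with the highest bootstrap mean is pulled.

```python
# Sketch of bootstrapped exploration with pseudo rewards in a Bernoulli bandit.
import numpy as np

def bootstrap_bandit(means_true, horizon=2000, seed=0):
    rng = np.random.default_rng(seed)
    k = len(means_true)
    history = [[] for _ in range(k)]  # augmented reward histories, one per arm
    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once to initialize
        else:
            boot_means = []
            for h in history:
                sample = rng.choice(h, size=len(h), replace=True)  # bootstrap resample
                boot_means.append(sample.mean())
            arm = int(np.argmax(boot_means))
        reward = rng.binomial(1, means_true[arm])
        history[arm].extend([reward, 0.0, 1.0])  # observed reward + pseudo rewards
    return history

history = bootstrap_bandit(np.array([0.3, 0.6, 0.8]))
```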
no code implementations • 16 Oct 2018 • Sharan Vaswani, Francis Bach, Mark Schmidt
Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov acceleration matches the convergence rate of the deterministic accelerated method for both convex and strongly-convex functions.
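A minimal sketch of constant step-size SGD with Nesterov acceleration on a finite sum is given below; the step-size, momentum schedule, and interpolated toy problem are illustrative assumptions rather than the constants from the analysis.

```python
# Sketch of SGD with Nesterov acceleration and a constant step-size.
import numpy as np

def sgd_nesterov(grad_i, x0, n, step=0.05, num_iters=500, seed=0):
    rng = np.random.default_rng(seed)
    x_prev = x0.copy()
    x = x0.copy()
    for t in range(1, num_iters + 1):
        momentum = (t - 1) / (t + 2)        # standard convex momentum schedule
        y = x + momentum * (x - x_prev)     # look-ahead (extrapolation) point
        g = grad_i(y, rng.integers(n))      # stochastic gradient at the look-ahead point
        x_prev, x = x, y - step * g
    return x

# Toy interpolated least-squares problem (b = A @ x_true), where the assumption holds.
rng = np.random.default_rng(3)
A = rng.standard_normal((100, 10)); x_true = rng.standard_normal(10); b = A @ x_true
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]
x_hat = sgd_nesterov(grad_i, np.zeros(10), n=100)
```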
no code implementations • 10 Oct 2018 • Mohamed Osama Ahmed, Sharan Vaswani, Mark Schmidt
Indeed, in a particular setting, we prove that using the Lipschitz information yields a regret bound that is the same as or better than using Bayesian optimization alone.
no code implementations • 24 May 2018 • Sharan Vaswani, Branislav Kveton, Zheng Wen, Anup Rao, Mark Schmidt, Yasin Abbasi-Yadkori
We investigate the use of bootstrapping in the bandit setting.
no code implementations • 7 Mar 2017 • Sharan Vaswani, Mark Schmidt, Laks. V. S. Lakshmanan
The gang of bandits (GOB) model \cite{cesa2013gang} is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph.
no code implementations • ICML 2017 • Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks Lakshmanan, Mark Schmidt
We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of "seed" users to expose the product to.
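For intuition about the objective (not the paper's online algorithm), a simple greedy sketch follows: seeds are added one at a time, each time picking the node whose addition most increases the expected spread, estimated by Monte Carlo simulation of the independent-cascade model. The toy graph, edge probability, and budget are illustrative.

```python
# Greedy influence maximization with Monte Carlo independent-cascade simulation.
import numpy as np

def simulate_ic(adj, seeds, p, rng):
    """One independent-cascade simulation; returns the number of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        new_frontier = []
        for u in frontier:
            for v in adj[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    new_frontier.append(v)
        frontier = new_frontier
    return len(active)

def greedy_im(adj, budget, p=0.1, num_sims=200, seed=0):
    rng = np.random.default_rng(seed)
    seeds = []
    spread = lambda s: np.mean([simulate_ic(adj, s, p, rng) for _ in range(num_sims)])
    for _ in range(budget):
        candidates = [v for v in adj if v not in seeds]
        best = max(candidates, key=lambda v: spread(seeds + [v]))  # marginal-gain greedy
        seeds.append(best)
    return seeds

# Toy directed graph as an adjacency dictionary.
adj = {0: [1, 2], 1: [2, 3], 2: [3], 3: [4], 4: []}
seeds = greedy_im(adj, budget=2)
```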
1 code implementation • NeurIPS 2017 • Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani
Specifically, we aim to learn the set of "best influencers" in a social network online while repeatedly interacting with it.
no code implementations • 27 Apr 2016 • Sharan Vaswani, Laks. V. S. Lakshmanan
A disadvantage of this setting is that the marketer is forced to select all the seeds based solely on a diffusion model.
no code implementations • 27 Feb 2015 • Sharan Vaswani, Laks. V. S. Lakshmanan, Mark Schmidt
We consider \emph{influence maximization}: the problem of maximizing the number of people who become aware of a product by finding the `best' set of `seed' users to expose the product to.
no code implementations • 24 Oct 2013 • Rahul Thota, Sharan Vaswani, Amit Kale, Nagavijayalakshmi Vydyanathan
This allows us to initialize a sparse seed-point grid as the set of tentative salient-region centers and iteratively converge to the local entropy maxima, thereby reducing the computational complexity compared to the Kadir-Brady approach of performing this computation at every point in the image.