Search Results for author: Sharan Vaswani

Found 27 papers, 11 papers with code

From Inverse Optimization to Feasibility to ERM

no code implementations27 Feb 2024 Saurabh Mishra, Anant Raj, Sharan Vaswani

For a linear prediction model, we reduce CILP to a convex feasibility problem allowing the use of standard algorithms such as alternating projections.
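
As a point of reference, here is a minimal sketch of alternating projections between two convex sets, the kind of standard routine the reduction above enables. The sets, projections, and data are illustrative assumptions, not the paper's CILP-specific construction.

```python
import numpy as np

# Alternating projections onto two convex sets (illustrative choice:
# a halfspace {x : a^T x <= b} and the box [-2, 2]^d).

def project_halfspace(x, a, b):
    # Euclidean projection onto {x : a^T x <= b}
    viol = a @ x - b
    if viol <= 0:
        return x
    return x - (viol / (a @ a)) * a

def project_box(x, lo, hi):
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
d = 5
a, b = rng.normal(size=d), 1.0
x = rng.normal(size=d)
for _ in range(100):
    x = project_box(project_halfspace(x, a, b), -2.0, 2.0)

print("constraint violation:", max(a @ x - b, 0.0))
```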

Noise-adaptive (Accelerated) Stochastic Heavy-Ball Momentum

no code implementations12 Jan 2024 Anh Dang, Reza Babanezhad, Sharan Vaswani

In particular, for strongly-convex quadratics with condition number $\kappa$, we prove that SHB with the standard step-size and momentum parameters results in an $O\left(\exp(-\frac{T}{\sqrt{\kappa}}) + \sigma \right)$ convergence rate, where $T$ is the number of iterations and $\sigma^2$ is the variance in the stochastic gradients.
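
A minimal sketch of the stochastic heavy-ball (SHB) update on a strongly-convex quadratic follows; the quadratic, the noise level, and the Polyak-style step-size and momentum values are illustrative choices and are not claimed to match the paper's analysis exactly.

```python
import numpy as np

# Stochastic heavy-ball (SHB) on f(x) = 0.5 x^T A x with noisy gradients.
rng = np.random.default_rng(0)
d = 20
A = np.diag(np.linspace(1.0, 100.0, d))   # L = 100, mu = 1, kappa = 100
L, mu, sigma = 100.0, 1.0, 0.1

alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2                           # step-size
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2   # momentum

x_prev = x = rng.normal(size=d)
for t in range(2000):
    grad = A @ x + sigma * rng.normal(size=d)             # stochastic gradient
    x, x_prev = x - alpha * grad + beta * (x - x_prev), x # SHB update

print("final ||x|| (should settle near the noise level):", np.linalg.norm(x))
```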

Decision-Aware Actor-Critic with Function Approximation and Theoretical Guarantees

1 code implementation NeurIPS 2023 Sharan Vaswani, Amirreza Kazemi, Reza Babanezhad, Nicolas Le Roux

Instantiating the generic algorithm results in an actor that involves maximizing a sequence of surrogate functions (similar to TRPO, PPO) and a critic that involves minimizing a closely connected objective.

Reinforcement Learning (RL)
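
For context, here is a minimal sketch of the kind of clipped policy surrogate the abstract compares against (a PPO-style objective). This is the standard clipped surrogate, not the decision-aware surrogate derived in the paper, and the inputs are assumed per-sample quantities supplied by the user.

```python
import numpy as np

# Standard PPO-style clipped surrogate for a batch of (state, action) pairs.
# log_probs_new / log_probs_old are per-sample action log-probabilities under
# the current and behaviour policies; advantages are advantage estimates.
def clipped_surrogate(log_probs_new, log_probs_old, advantages, eps=0.2):
    ratio = np.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # The actor maximizes the mean of the element-wise minimum.
    return np.mean(np.minimum(unclipped, clipped))
```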

Target-based Surrogates for Stochastic Optimization

1 code implementation6 Feb 2023 Jonathan Wilder Lavington, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Nicolas Le Roux

Our target optimization framework uses the (expensive) gradient computation to construct surrogate functions in a \emph{target space} (e.g. the logits output by a linear model for classification) that can be minimized efficiently.

Imitation Learning, Stochastic Optimization

Improved Policy Optimization for Online Imitation Learning

1 code implementation29 Jul 2022 Jonathan Wilder Lavington, Sharan Vaswani, Mark Schmidt

Specifically, if the class of policies is sufficiently expressive to contain the expert policy, we prove that DAGGER achieves constant regret.

Imitation Learning
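
A minimal sketch of the DAGGER loop referenced above (dataset aggregation with expert relabelling) is shown below; the `env`, `expert`, and `fit_classifier` interfaces are placeholder assumptions, and the constant-regret analysis from the paper is not reproduced here.

```python
# DAGGER skeleton: roll out the current policy, ask the expert to label the
# visited states, aggregate, and retrain. `env.reset()` is assumed to return a
# state and `env.step(action)` a (state, done) pair; `fit_classifier` returns
# a callable policy trained on the aggregated (state, action) pairs.
def dagger(env, expert, fit_classifier, n_iters=10, horizon=100):
    states, actions = [], []
    policy = expert                                # iteration 0: follow the expert
    for _ in range(n_iters):
        s = env.reset()
        for _ in range(horizon):
            states.append(s)
            actions.append(expert(s))              # expert label on the visited state
            s, done = env.step(policy(s))          # but follow the learner's action
            if done:
                break
        policy = fit_classifier(states, actions)   # retrain on the aggregated data
    return policy
```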

Near-Optimal Sample Complexity Bounds for Constrained MDPs

no code implementations13 Jun 2022 Sharan Vaswani, Lin F. Yang, Csaba Szepesvári

In particular, we design a model-based algorithm that addresses two settings: (i) relaxed feasibility, where small constraint violations are allowed, and (ii) strict feasibility, where the output policy is required to satisfy the constraint.

Towards Painless Policy Optimization for Constrained MDPs

1 code implementation11 Apr 2022 Arushi Jain, Sharan Vaswani, Reza Babanezhad, Csaba Szepesvari, Doina Precup

We propose a generic primal-dual framework that allows us to bound the reward sub-optimality and constraint violation for arbitrary algorithms in terms of their primal and dual regret on online linear optimization problems.

Towards Noise-adaptive, Problem-adaptive (Accelerated) Stochastic Gradient Descent

no code implementations21 Oct 2021 Sharan Vaswani, Benjamin Dubois-Taine, Reza Babanezhad

In order to be adaptive to the smoothness, we use a stochastic line-search (SLS) and show (via upper and lower-bounds) that SGD with SLS converges at the desired rate, but only to a neighbourhood of the solution.
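
A minimal sketch of SGD with a backtracking Armijo line-search evaluated on the sampled mini-batch, in the spirit of the SLS rule mentioned above; the Armijo constant, backtracking factor, and optimistic reset are illustrative choices.

```python
import numpy as np

# SGD with a stochastic (mini-batch) Armijo line-search: backtrack until the
# mini-batch loss decreases sufficiently along the negative mini-batch gradient.
def sgd_with_sls(x0, sample_batch, loss, grad, n_steps=100,
                 eta_max=1.0, c=0.5, beta=0.7):
    x = x0
    for _ in range(n_steps):
        batch = sample_batch()
        g = grad(x, batch)
        f0 = loss(x, batch)
        eta = eta_max                      # optimistic reset each iteration
        # Armijo condition checked on the same mini-batch (with a safety floor)
        while loss(x - eta * g, batch) > f0 - c * eta * (g @ g) and eta > 1e-10:
            eta *= beta                    # backtrack
        x = x - eta * g
    return x
```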

SVRG Meets AdaGrad: Painless Variance Reduction

no code implementations18 Feb 2021 Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Simon Lacoste-Julien

Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate.
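
A minimal sketch of an SVRG inner loop whose step-size is set by an AdaGrad-style scalar accumulator instead of a hand-tuned constant; the accumulator form and constants are illustrative and not claimed to be the paper's exact scheme.

```python
import numpy as np

# SVRG with an AdaGrad-norm step-size: the variance-reduced gradients v are
# used both for the update and to grow a scalar accumulator that sets the
# step-size, avoiding knowledge of the smoothness constant.
def ada_svrg(grad_i, n, x0, n_outer=20, n_inner=100, eta0=1.0, eps=1e-8):
    # grad_i(x, i) returns the gradient of the i-th component function.
    x_snap = x0.copy()
    accum = 0.0
    rng = np.random.default_rng(0)
    for _ in range(n_outer):
        full_grad = np.mean([grad_i(x_snap, i) for i in range(n)], axis=0)
        x = x_snap.copy()
        for _ in range(n_inner):
            i = rng.integers(n)
            v = grad_i(x, i) - grad_i(x_snap, i) + full_grad   # VR gradient
            accum += v @ v
            x = x - (eta0 / np.sqrt(accum + eps)) * v
        x_snap = x
    return x_snap
```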

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)

no code implementations28 Sep 2020 Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien

Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.

Binary Classification
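
A minimal sketch of the AMSGrad update with a constant step-size, the method analysed above; the defaults are the usual Adam-style constants and the gradient oracle is assumed to be supplied by the user.

```python
import numpy as np

# AMSGrad with a constant step-size: like Adam, but the second-moment estimate
# is replaced by its running maximum, so effective step-sizes never increase.
def amsgrad(grad, x0, n_steps=1000, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    x = x0.copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    v_hat = np.zeros_like(x)
    for t in range(n_steps):
        g = grad(x)                     # stochastic gradient oracle
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)    # the AMSGrad "max" step
        x = x - alpha * m / (np.sqrt(v_hat) + eps)
    return x
```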

To Each Optimizer a Norm, To Each Norm its Generalization

no code implementations11 Jun 2020 Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux

For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum $P$-margin solution.

Classification, General Classification
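
For reference, one standard way to write the maximum $P$-margin problem alluded to above; this is the usual definition of margin in a quadratic norm, stated as an assumption about notation rather than quoted from the paper.

```latex
% Maximum P-margin direction for separable data {(x_i, y_i)},
% with quadratic norm ||w||_P = sqrt(w^T P w), P symmetric positive definite:
\max_{\|w\|_P \le 1} \; \min_{i} \; y_i \, \langle w, x_i \rangle .
```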

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence

1 code implementation24 Feb 2020 Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien

Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate for stochastic gradient descent (SGD).
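
A minimal sketch of SGD with a stochastic Polyak step-size follows; the cap $\gamma_{\max}$ and the constant $c$ follow the usual SPS$_{\max}$ form, and taking $f_i^* = 0$ (non-negative losses under interpolation) is an illustrative simplification.

```python
import numpy as np

# SGD with a stochastic Polyak step-size (SPS_max style):
#   eta_k = min( (f_i(x_k) - f_i^*) / (c * ||grad f_i(x_k)||^2), gamma_max )
# Here f_i^* is taken to be 0, as for non-negative losses under interpolation.
def sgd_sps(loss_i, grad_i, sample, x0, n_steps=1000, c=0.5, gamma_max=10.0):
    x = x0.copy()
    for _ in range(n_steps):
        i = sample()                  # index of the sampled example
        g = grad_i(x, i)
        g_norm2 = g @ g
        if g_norm2 == 0.0:
            continue                  # zero gradient: nothing to do on this sample
        eta = min(loss_i(x, i) / (c * g_norm2), gamma_max)
        x = x - eta * g
    return x
```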

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

1 code implementation11 Oct 2019 Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien

Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.

Binary Classification, Second-order methods
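
A minimal sketch of a regularized subsampled Newton step of the kind described above; the damping parameter, the batch choices, and the direct linear solve are illustrative, not the paper's exact R-SSN configuration.

```python
import numpy as np

# One regularized subsampled Newton step:
#   x+ = x - eta * (H_S(x) + lam * I)^(-1) g_B(x)
# where H_S is a Hessian estimate on a subsample S and g_B a mini-batch gradient.
def rssn_step(x, hessian_subsample, grad_batch, eta=1.0, lam=1e-3):
    H = hessian_subsample(x)                    # (d, d) subsampled Hessian
    g = grad_batch(x)                           # (d,) mini-batch gradient
    d = x.shape[0]
    direction = np.linalg.solve(H + lam * np.eye(d), g)
    return x - eta * direction
```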

Old Dog Learns New Tricks: Randomized UCB for Bandit Problems

1 code implementation11 Oct 2019 Sharan Vaswani, Abbas Mehrabian, Audrey Durand, Branislav Kveton

We propose $\tt RandUCB$, a bandit strategy that builds on theoretically derived confidence intervals similar to upper confidence bound (UCB) algorithms, but akin to Thompson sampling (TS), it uses randomization to trade off exploration and exploitation.

Thompson Sampling
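
A minimal sketch of a RandUCB-style selection rule for Bernoulli bandits: the usual UCB confidence width is scaled by a random multiplier shared across arms. The uniform grid for the multiplier and the width constant are simplifying assumptions; the paper studies the choice of this distribution in detail.

```python
import numpy as np

# RandUCB-style arm selection: argmax of  mean_i + Z_t * width_i,
# where Z_t is drawn at random each round instead of being fixed as in UCB.
def randucb_choose(counts, sums, t, rng, alpha=3.0, n_grid=20):
    # counts[i]: pulls of arm i, sums[i]: total reward of arm i, t: round index.
    if np.any(counts == 0):
        return int(np.argmin(counts))               # pull each arm once first
    means = sums / counts
    widths = np.sqrt(2.0 * np.log(t + 1) / counts)
    z = rng.choice(np.linspace(0.0, alpha, n_grid)) # shared random multiplier
    return int(np.argmax(means + z * widths))
```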

Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits

no code implementations13 Nov 2018 Branislav Kveton, Csaba Szepesvari, Sharan Vaswani, Zheng Wen, Mohammad Ghavamzadeh, Tor Lattimore

Specifically, it pulls the arm with the highest mean reward in a non-parametric bootstrap sample of its history with pseudo rewards.

Multi-Armed Bandits
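
A minimal sketch of the bootstrap rule described above for Bernoulli bandits: each arm's history is padded with pseudo rewards, bootstrap-resampled, and the arm with the highest resampled mean is pulled. The amount of padding per observation is an illustrative choice.

```python
import numpy as np

# Bootstrap exploration for Bernoulli bandits: pull the arm whose bootstrap
# sample of (history + pseudo rewards) has the highest mean.
def bootstrap_choose(histories, rng, pseudo_per_obs=1):
    best_arm, best_mean = None, -np.inf
    for arm, hist in enumerate(histories):   # hist: list of observed 0/1 rewards
        if not hist:
            return arm                       # pull unexplored arms first
        # pad with pseudo_per_obs zeros and ones per real observation
        padded = np.array(hist + [0, 1] * (pseudo_per_obs * len(hist)), dtype=float)
        sample = rng.choice(padded, size=padded.size, replace=True)   # bootstrap
        if sample.mean() > best_mean:
            best_arm, best_mean = arm, sample.mean()
    return best_arm
```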

Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron

no code implementations16 Oct 2018 Sharan Vaswani, Francis Bach, Mark Schmidt

Under this condition, we prove that constant step-size stochastic gradient descent (SGD) with Nesterov acceleration matches the convergence rate of the deterministic accelerated method for both convex and strongly-convex functions.
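
A minimal sketch of constant step-size SGD with Nesterov acceleration, the scheme analysed above under interpolation; the $(k-1)/(k+2)$ momentum schedule is the textbook convex choice and is used here only for illustration.

```python
import numpy as np

# Constant step-size SGD with Nesterov acceleration: gradients are evaluated
# at an extrapolated point y_k rather than at the current iterate.
def accelerated_sgd(stoch_grad, x0, n_steps=1000, alpha=0.1):
    x_prev = x = x0.copy()
    for k in range(1, n_steps + 1):
        y = x + ((k - 1) / (k + 2)) * (x - x_prev)   # extrapolation step
        g = stoch_grad(y)                            # stochastic gradient at y
        x_prev, x = x, y - alpha * g
    return x
```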

Combining Bayesian Optimization and Lipschitz Optimization

no code implementations10 Oct 2018 Mohamed Osama Ahmed, Sharan Vaswani, Mark Schmidt

Indeed, in a particular setting, we prove that using the Lipschitz information yields the same or a better bound on the regret compared to using Bayesian optimization on its own.

Bayesian Optimization, Thompson Sampling
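
A minimal sketch of the Lipschitz side of the combination: with a known Lipschitz constant L, already-evaluated points give pointwise upper and lower bounds on f, which can be used to discard candidates before running a BO acquisition on them. The pruning rule here is a generic illustration, not the paper's specific acquisition.

```python
import numpy as np

# Lipschitz bounds from evaluated points (X, y) with constant L:
#   f(x) <= min_i ( y_i + L * ||x - x_i|| )   (upper bound)
#   f(x) >= max_i ( y_i - L * ||x - x_i|| )   (lower bound)
# For maximization, candidates whose upper bound falls below the best
# observed value cannot contain the maximizer and can be pruned.
def lipschitz_prune(candidates, X, y, L):
    dists = np.linalg.norm(candidates[:, None, :] - X[None, :, :], axis=-1)
    upper = np.min(y[None, :] + L * dists, axis=1)
    keep = upper >= np.max(y)
    return candidates[keep]
```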

Horde of Bandits using Gaussian Markov Random Fields

no code implementations7 Mar 2017 Sharan Vaswani, Mark Schmidt, Laks. V. S. Lakshmanan

The gang of bandits (GOB) model (Cesa-Bianchi et al., 2013) is a recent contextual bandits framework that shares information between a set of bandit problems, related by a known (possibly noisy) graph.

Clustering, Multi-Armed Bandits (+2 more)

Model-Independent Online Learning for Influence Maximization

no code implementations ICML 2017 Sharan Vaswani, Branislav Kveton, Zheng Wen, Mohammad Ghavamzadeh, Laks Lakshmanan, Mark Schmidt

We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of "seed" users to expose the product to.

Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback

1 code implementation NeurIPS 2017 Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani

Specifically, we aim to learn the set of "best influencers" in a social network online while repeatedly interacting with it.

Adaptive Influence Maximization in Social Networks: Why Commit when You can Adapt?

no code implementations27 Apr 2016 Sharan Vaswani, Laks. V. S. Lakshmanan

A disadvantage of this setting is that the marketer is forced to select all the seeds based solely on a diffusion model.

Social and Information Networks

Influence Maximization with Bandits

no code implementations27 Feb 2015 Sharan Vaswani, Laks. V. S. Lakshmanan, Mark Schmidt

We consider \emph{influence maximization}: the problem of maximizing the number of people that become aware of a product by finding the `best' set of `seed' users to expose the product to.

Fast 3D Salient Region Detection in Medical Images using GPUs

no code implementations24 Oct 2013 Rahul Thota, Sharan Vaswani, Amit Kale, Nagavijayalakshmi Vydyanathan

This allows us to initialize a sparse seed-point grid as the set of tentative salient region centers and iteratively converge to the local entropy maxima, thereby reducing the computational complexity compared to the Kadir-Brady approach of performing this computation at every point in the image.
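
A rough, hedged sketch of the seed-based strategy described above: compute the Shannon entropy of the local intensity histogram around each seed and hill-climb toward a local entropy maximum. The neighbourhood radius, histogram binning, and 6-connected moves are assumptions for illustration, not the paper's GPU implementation.

```python
import numpy as np

# Hill-climb a seed point toward a local entropy maximum on a 3D volume `vol`.
def local_entropy(vol, p, r=4, bins=16):
    z, y, x = p
    patch = vol[max(z - r, 0):z + r + 1, max(y - r, 0):y + r + 1, max(x - r, 0):x + r + 1]
    hist, _ = np.histogram(patch, bins=bins, range=(vol.min(), vol.max()))
    q = hist / hist.sum()
    q = q[q > 0]
    return -np.sum(q * np.log2(q))          # Shannon entropy of the local histogram

def climb_to_entropy_maximum(vol, seed, max_steps=100):
    moves = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    p = tuple(seed)
    for _ in range(max_steps):
        best = max(
            (tuple(np.clip(np.add(p, m), 0, np.array(vol.shape) - 1)) for m in moves),
            key=lambda q: local_entropy(vol, q),
        )
        if local_entropy(vol, best) <= local_entropy(vol, p):
            return p                         # local entropy maximum reached
        p = best
    return p
```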
