Search Results for author: Rahul Kidambi

Found 23 papers, 10 papers with code

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

1 code implementation · NeurIPS 2019 · Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli

First, this work shows that even if the time horizon $T$ (i.e., the number of iterations SGD is run for) is known in advance, SGD's final iterate behavior with any polynomially decaying learning rate scheme is highly sub-optimal compared to the minimax rate (by a condition number factor in the strongly convex case and a factor of $\sqrt{T}$ in the non-strongly convex case).

Stochastic Optimization
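
To make the contrast concrete: a minimal sketch, in plain Python, of the geometric step-decay schedule the paper analyzes next to the classical polynomial decay; the horizon, phase count, and constants are illustrative assumptions, not the paper's tuned values.

```python
def step_decay_lr(t, lr0=0.1, horizon=10_000, num_phases=10):
    """Geometric step decay: halve the learning rate after each of
    num_phases equal-length phases of the horizon."""
    phase = int(t * num_phases / horizon)   # phase index in [0, num_phases)
    return lr0 * 0.5 ** phase

def poly_decay_lr(t, lr0=0.1, power=0.5):
    """Polynomial decay lr0 / t^power, the classical
    stochastic-approximation prescription."""
    return lr0 / (1.0 + t) ** power

# Step decay holds the rate constant within a phase and then drops it
# geometrically; polynomial decay shrinks smoothly at every step.
for t in [0, 2_500, 5_000, 9_999]:
    print(t, step_decay_lr(t), poly_decay_lr(t))
```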

Parallelizing Stochastic Gradient Descent for Least Squares Regression: mini-batching, averaging, and model misspecification

1 code implementation · 12 Oct 2016 · Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford

In particular, this work provides a sharp analysis of: (1) mini-batching, a method of averaging many samples of a stochastic gradient to both reduce the variance of the stochastic gradient estimate and parallelize SGD, and (2) tail-averaging, a method of averaging the final few iterates of SGD to decrease the variance in SGD's final iterate.

regression
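
A minimal numpy sketch of the two techniques the abstract names, mini-batching and tail-averaging, on a synthetic least-squares problem; the batch size, step size, and tail fraction are illustrative assumptions rather than the paper's prescribed settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20_000, 10
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star + 0.1 * rng.normal(size=n)   # noisy, well-specified least squares

def minibatch_tail_averaged_sgd(batch_size=32, lr=0.05, steps=2_000, tail_frac=0.5):
    x = np.zeros(d)
    iterates = []
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch_size)
        # Mini-batching: average batch_size stochastic gradients, which
        # reduces gradient variance and is embarrassingly parallel.
        grad = A[idx].T @ (A[idx] @ x - b[idx]) / batch_size
        x = x - lr * grad
        iterates.append(x)
    # Tail-averaging: average only the final fraction of the iterates
    # to decrease the variance of the returned solution.
    tail = iterates[int(steps * (1 - tail_frac)):]
    return np.mean(tail, axis=0)

x_hat = minibatch_tail_averaged_sgd()
print("parameter error:", np.linalg.norm(x_hat - x_star))
```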

Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage

1 code implementation · NeurIPS 2021 · Jonathan D. Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

Instead, the learner is presented with a static offline dataset of state-action-next state transition triples from a potentially less proficient behavior policy.

Continuous Control · Imitation Learning

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

1 code implementation · NeurIPS 2021 · Jonathan Daniel Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

Instead, the learner is presented with a static offline dataset of state-action-next state triples from a potentially less proficient behavior policy.

Continuous Control · Imitation Learning

Making Paper Reviewing Robust to Bid Manipulation Attacks

1 code implementation · 9 Feb 2021 · Ruihan Wu, Chuan Guo, Felix Wu, Rahul Kidambi, Laurens van der Maaten, Kilian Q. Weinberger

We develop a novel approach for paper bidding and assignment that is much more robust against such attacks.

Top-$k$ eXtreme Contextual Bandits with Arm Hierarchy

1 code implementation · 15 Feb 2021 · Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon

We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.

Computational Efficiency · Extreme Multi-Label Classification +2

MobILE: Model-Based Imitation Learning From Observation Alone

1 code implementation · NeurIPS 2021 · Rahul Kidambi, Jonathan Chang, Wen Sun

This paper studies Imitation Learning from Observations alone (ILFO) where the learner is presented with expert demonstrations that consist only of states visited by an expert (without access to actions taken by the expert).

Imitation Learning · OpenAI Gym

Enhancing Group Fairness in Online Settings Using Oblique Decision Forests

2 code implementations · 17 Oct 2023 · Somnath Basu Roy Chowdhury, Nicholas Monath, Ahmad Beirami, Rahul Kidambi, Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi

In the online setting, where the algorithm has access to a single instance at a time, estimating the group fairness objective requires additional storage and significantly more computation (e.g., forward/backward passes) than the task-specific objective at every time step.

Fairness
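
To see where the extra storage and computation come from, here is a generic running estimator of a demographic-parity gap, updated one instance at a time. This is only an illustrative sketch (the class, group labels, and two-group setup are assumptions), not the paper's oblique-decision-forest method.

```python
from collections import defaultdict

class OnlineParityGap:
    """Running demographic-parity estimate: the gap between groups'
    positive-prediction rates, maintained one instance at a time.
    Note the per-group counters, state that the task-specific loss
    alone would not need."""
    def __init__(self):
        self.count = defaultdict(int)   # instances seen per group
        self.pos = defaultdict(int)     # positive predictions per group

    def update(self, group, prediction):
        self.count[group] += 1
        self.pos[group] += int(prediction == 1)

    def gap(self, g0, g1):
        r0 = self.pos[g0] / max(self.count[g0], 1)
        r1 = self.pos[g1] / max(self.count[g1], 1)
        return abs(r0 - r1)

tracker = OnlineParityGap()
for group, pred in [("a", 1), ("b", 0), ("a", 1), ("b", 1)]:
    tracker.update(group, pred)
print(tracker.gap("a", "b"))   # rates 1.0 vs 0.5 -> gap 0.5
```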

Leverage Score Sampling for Faster Accelerated Regression and ERM

no code implementations · 22 Nov 2017 · Naman Agarwal, Sham Kakade, Rahul Kidambi, Yin Tat Lee, Praneeth Netrapalli, Aaron Sidford

Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b \in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $ \min_{x\in\mathbb{R}^{d}}\frac{1}{2} \|\mathbf{A} x - b\|_{2}^{2} $ in time $ \tilde{O} ((n+\sqrt{d\cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1}) $ where $\kappa_{\text{sum}}=\mathrm{tr}\left(\mathbf{A}^{\top}\mathbf{A}\right)/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$.

regression
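
The running time above hinges on sampling rows by statistical leverage. Below is a minimal numpy sketch of computing exact leverage scores via a thin QR factorization and sampling rows proportionally; it illustrates the sampling primitive only, not the paper's accelerated solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1_000, 20
A = rng.normal(size=(n, d))

# Leverage score of row i: l_i = a_i^T (A^T A)^{-1} a_i. Computed stably
# from a thin QR factorization A = QR as the squared row norms of Q;
# the scores always sum to d.
Q, _ = np.linalg.qr(A, mode='reduced')
leverage = np.sum(Q ** 2, axis=1)
assert np.isclose(leverage.sum(), d)

# Sample m rows with probability proportional to leverage and reweight,
# so S^T S is an unbiased (and typically sharp) estimate of A^T A.
m = 200
p = leverage / leverage.sum()
idx = rng.choice(n, size=m, p=p)
S = A[idx] / np.sqrt(m * p[idx])[:, None]
print(np.linalg.norm(S.T @ S - A.T @ A, 2) / np.linalg.norm(A.T @ A, 2))
```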

Efficient Estimation of Generalization Error and Bias-Variance Components of Ensembles

no code implementations · 15 Nov 2017 · Dhruv Mahajan, Vivek Gupta, S. Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi

We also demonstrate their usefulness in making design choices, such as the number of classifiers in the ensemble and the size of the training-data subset needed to achieve a target generalization error.

A Markov Chain Theory Approach to Characterizing the Minimax Optimality of Stochastic Gradient Descent (for Least Squares)

no code implementations · 25 Oct 2017 · Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, Aaron Sidford

This work provides a simplified proof of the statistical minimax optimality of (iterate averaged) stochastic gradient descent (SGD), for the special case of least squares.

Accelerating Stochastic Gradient Descent For Least Squares Regression

no code implementations · 26 Apr 2017 · Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford

There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g., Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont 2008 and Devolder, Glineur, and Nesterov 2014.

regression · Stochastic Optimization
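
For the abstract's "fast gradient methods," here is a minimal sketch of stochastic heavy ball on a synthetic least-squares problem. The step size and momentum are illustrative, and this is the generic momentum method whose instability motivates the paper, not the accelerated algorithm the paper proposes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5_000, 10
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star + 0.1 * rng.normal(size=n)

def heavy_ball_sgd(lr=0.005, momentum=0.9, steps=5_000):
    x, v = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        i = rng.integers(0, n)
        grad = A[i] * (A[i] @ x - b[i])   # single-sample stochastic gradient
        v = momentum * v - lr * grad      # heavy-ball momentum buffer
        x = x + v
    return x

print("parameter error:", np.linalg.norm(heavy_ball_sgd() - x_star))
```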

Submodular Hamming Metrics

no code implementations · NeurIPS 2015 · Jennifer Gillenwater, Rishabh Iyer, Bethany Lusch, Rahul Kidambi, Jeff Bilmes

We show that there is a largely unexplored class of functions (positive polymatroids) that can define proper discrete metrics over pairs of binary vectors and that are fairly tractable to optimize over.

Clustering
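
A concrete instance of the construction in the abstract, as a sketch: applying a monotone submodular function to the symmetric difference of two binary vectors gives a proper metric that generalizes Hamming distance. The choice f(S) = sqrt(total weight of S), with strictly positive weights, is one simple positive polymatroid, used here purely for illustration.

```python
import numpy as np

def submodular_hamming(x, y, w):
    """d(x, y) = f(x XOR y) with f(S) = sqrt(sum of weights in S).
    f is a positive polymatroid (monotone, submodular, f(empty set) = 0),
    so d satisfies the triangle inequality; with w > 0 it is a proper
    metric, and with f(S) = w(S) it reduces to weighted Hamming distance."""
    diff = np.logical_xor(x, y)
    return float(np.sqrt(w[diff].sum()))

x = np.array([1, 0, 1, 1, 0], dtype=bool)
y = np.array([1, 1, 0, 1, 0], dtype=bool)
w = np.ones(5)
print(submodular_hamming(x, y, w))   # two coordinates differ -> sqrt(2)
```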

A Structured Prediction Approach for Missing Value Imputation

no code implementations · 9 Nov 2013 · Rahul Kidambi, Vinod Nair, Sundararajan Sellamanickam, S. Sathiya Keerthi

In this paper we propose a structured output approach for missing value imputation that also incorporates domain constraints.

Imputation · Structured Prediction

Rethinking learning rate schedules for stochastic optimization

no code implementations · ICLR 2019 · Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli

One plausible explanation is that non-convex neural network training procedures are better suited to fundamentally different learning rate schedules, such as the "cut the learning rate every constant number of epochs" method (which more closely resembles an exponentially decaying schedule). This widely used schedule stands in stark contrast to the polynomial decay schemes prescribed in the stochastic approximation literature, which are indeed shown to be (worst-case) optimal for classes of convex optimization problems.

Stochastic Optimization

Optimism is All You Need: Model-Based Imitation Learning From Observation Alone

no code implementations · ICLR Workshop SSL-RL 2021 · Rahul Kidambi, Jonathan Daniel Chang, Wen Sun

This paper studies Imitation Learning from Observations alone (ILFO) where the learner is presented with expert demonstrations that only consist of states encountered by an expert (without access to actions taken by the expert).

Imitation Learning · OpenAI Gym

Counterfactual Learning To Rank for Utility-Maximizing Query Autocompletion

no code implementations · 22 Apr 2022 · Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon

A shortcoming of this approach is that users often do not know which query will provide the best retrieval performance on the current information retrieval system, meaning that any query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions.

counterfactual · Information Retrieval +2

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

no code implementations · 8 Jan 2024 · Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal

Our approach is maximalist in that it provably handles non-Markovian, intransitive, and stochastic preferences while being robust to the compounding errors that plague offline approaches to sequential prediction.

Continuous Control · reinforcement-learning

Auctions with LLM Summaries

no code implementations · 11 Apr 2024 · Kumar Avinava Dubey, Zhe Feng, Rahul Kidambi, Aranyak Mehta, Di Wang

We study an auction setting in which bidders bid for placement of their content within a summary generated by a large language model (LLM), e.g., an ad auction in which the display is a summary paragraph of multiple ads.

Language Modelling · Large Language Model +1
