1 code implementation • 25 Mar 2025 • Somnath Basu Roy Chowdhury, Avinava Dubey, Ahmad Beirami, Rahul Kidambi, Nicholas Monath, Amr Ahmed, Snigdha Chaturvedi
Concept erasure is the task of erasing information about a concept (e.g., gender or race) from a set of representations while retaining the maximum possible utility, i.e., the information from the original representations.
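To make the erasure-versus-utility trade-off concrete, here is a minimal NumPy sketch of a generic linear-erasure baseline that projects out the single direction most correlated with the concept; this is an illustrative technique, not the method studied in the paper, and all names are placeholders.

import numpy as np

def erase_linear_concept(X, c):
    # Direction of representation space most correlated with the
    # concept labels, estimated from centered data.
    w = (X - X.mean(axis=0)).T @ (c - c.mean())
    w /= np.linalg.norm(w)
    # Orthogonal projection: remove that direction; the retained
    # component is the "utility" referred to above.
    return X - np.outer(X @ w, w)

# Toy usage with a binary (e.g., gender-like) concept.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 16))
c = rng.integers(0, 2, size=100).astype(float)
X_erased = erase_linear_concept(X, c)
w = (X - X.mean(axis=0)).T @ (c - c.mean())
w /= np.linalg.norm(w)
print(np.max(np.abs(X_erased @ w)))  # ~0: no signal left along the erased direction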
no code implementations • 22 Jul 2024 • Kaiwen Wang, Rahul Kidambi, Ryan Sullivan, Alekh Agarwal, Christoph Dann, Andrea Michi, Marco Gelmi, Yunxuan Li, Raghav Gupta, Avinava Dubey, Alexandre Ramé, Johan Ferret, Geoffrey Cideron, Le Hou, Hongkun Yu, Amr Ahmed, Aranyak Mehta, Léonard Hussenot, Olivier Bachem, Edouard Leurent
Reward-based finetuning is crucial for aligning language policies with intended behaviors (e.g., creativity and safety).
no code implementations • 11 Apr 2024 • Kumar Avinava Dubey, Zhe Feng, Rahul Kidambi, Aranyak Mehta, Di Wang
We study an auction setting in which bidders bid for placement of their content within a summary generated by a large language model (LLM), e.g., an ad auction in which the display is a summary paragraph of multiple ads.
no code implementations • 8 Jan 2024 • Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal
Our approach is maximalist in that it provably handles non-Markovian, intransitive, and stochastic preferences while being robust to the compounding errors that plague offline approaches to sequential prediction.
2 code implementations • 17 Oct 2023 • Somnath Basu Roy Chowdhury, Nicholas Monath, Ahmad Beirami, Rahul Kidambi, Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi
In the online setting, where the algorithm has access to a single instance at a time, estimating the group fairness objective requires additional storage and significantly more computation (e.g., forward/backward passes) than the task-specific objective at every time step.
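To illustrate the extra per-group bookkeeping such an online estimate entails, here is a constant-memory sketch of a running per-group loss gap; this is an illustrative baseline (binary groups, a demographic-parity-style gap), not the paper's algorithm.

from collections import defaultdict

class OnlineGroupGapEstimate:
    # Running estimate of a group fairness gap: the difference between
    # per-group average losses, updated one instance at a time. Note the
    # additional per-group storage relative to the task objective alone.
    def __init__(self):
        self.sum = defaultdict(float)   # per-group cumulative loss
        self.cnt = defaultdict(int)     # per-group instance counts

    def update(self, group, loss):
        self.sum[group] += loss
        self.cnt[group] += 1

    def gap(self, g0=0, g1=1):
        means = [self.sum[g] / max(self.cnt[g], 1) for g in (g0, g1)]
        return abs(means[0] - means[1])

est = OnlineGroupGapEstimate()
for group, loss in [(0, 0.9), (1, 0.2), (0, 0.7), (1, 0.3)]:
    est.update(group, loss)
print(est.gap())  # 0.55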
no code implementations • 22 Apr 2022 • Adam Block, Rahul Kidambi, Daniel N. Hill, Thorsten Joachims, Inderjit S. Dhillon
A shortcoming of this approach is that users often do not know which query will yield the best retrieval performance on the current information retrieval system, so query autocompletion methods trained to mimic user behavior can lead to suboptimal query suggestions.
1 code implementation • NeurIPS 2021 • Jonathan D. Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun
Instead, the learner is presented with a static offline dataset of state-action-next state transition triples from a potentially less proficient behavior policy.
no code implementations • ICLR Workshop SSL-RL 2021 • Rahul Kidambi, Jonathan Daniel Chang, Wen Sun
This paper studies Imitation Learning from Observations alone (ILFO), where the learner is presented with expert demonstrations that consist only of states encountered by the expert (without access to the actions taken by the expert).
1 code implementation • NeurIPS 2021 • Rahul Kidambi, Jonathan Chang, Wen Sun
This paper studies Imitation Learning from Observations alone (ILFO) where the learner is presented with expert demonstrations that consist only of states visited by an expert (without access to actions taken by the expert).
1 code implementation • 15 Feb 2021 • Rajat Sen, Alexander Rakhlin, Lexing Ying, Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon
We show that our algorithm has a regret guarantee of $O(k\sqrt{(A-k+1)T \log (|\mathcal{F}|T)})$, where $A$ is the total number of arms and $\mathcal{F}$ is the class containing the regression function, while only requiring $\tilde{O}(A)$ computation per time step.
Computational Efficiency
Extreme Multi-Label Classification
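To illustrate where the $\tilde{O}(A)$ per-step cost above comes from (one predicted score per arm), here is a simplified epsilon-greedy sketch of top-$k$ selection with a plug-in regression function; it is a stand-in baseline, not the oracle-based algorithm analyzed in the paper, and the toy regressor is a placeholder.

import numpy as np

def topk_epsilon_greedy(predict, context, num_arms, k, eps, rng):
    # With probability 1 - eps, pick the k arms with the highest predicted
    # reward under the learned regression function; otherwise pick a
    # uniformly random k-subset (exploration).
    if rng.random() < eps:
        return rng.choice(num_arms, size=k, replace=False)
    scores = np.array([predict(context, a) for a in range(num_arms)])
    return np.argsort(scores)[-k:]  # O(A log A) here; O(A) with argpartition

rng = np.random.default_rng(0)
predict = lambda x, a: float(x @ np.sin(np.arange(4) + a))  # stand-in regressor
print(topk_epsilon_greedy(predict, np.ones(4), num_arms=10, k=3, eps=0.1, rng=rng))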
1 code implementation • 9 Feb 2021 • Ruihan Wu, Chuan Guo, Felix Wu, Rahul Kidambi, Laurens van der Maaten, Kilian Q. Weinberger
We develop a novel approach for paper bidding and assignment that is much more robust against such bid manipulation attacks.
2 code implementations • 12 May 2020 • Rahul Kidambi, Aravind Rajeswaran, Praneeth Netrapalli, Thorsten Joachims
In this work, we present MOReL, an algorithmic framework for model-based offline RL.
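At its core, MOReL learns an ensemble of dynamics models from the offline data and plans in a pessimistic MDP that diverts transitions with high ensemble disagreement (i.e., outside the data's support) to an absorbing halt state carrying a negative reward. A minimal sketch, with toy models and placeholder threshold/penalty values:

import numpy as np

class PessimisticModel:
    # Where ensemble members disagree, the state-action pair is treated
    # as unknown and the rollout is halted with a large negative reward.
    HALT = None

    def __init__(self, ensemble, reward_fn, threshold, penalty):
        self.ensemble = ensemble        # list of next-state predictors
        self.reward_fn = reward_fn
        self.threshold = threshold      # disagreement threshold (placeholder)
        self.penalty = penalty          # negative reward on HALT (placeholder)

    def step(self, s, a):
        if s is self.HALT:
            return self.HALT, self.penalty
        preds = np.stack([m(s, a) for m in self.ensemble])
        disagreement = np.max(np.linalg.norm(preds - preds.mean(0), axis=1))
        if disagreement > self.threshold:           # unknown region: halt
            return self.HALT, self.penalty
        return preds.mean(0), self.reward_fn(s, a)  # known region: model rollout

# Toy usage with two slightly different linear models.
ensemble = [lambda s, a: s + 0.10 * a, lambda s, a: s + 0.11 * a]
pmdp = PessimisticModel(ensemble, reward_fn=lambda s, a: -float(s @ s),
                        threshold=0.05, penalty=-100.0)
s = np.ones(3)
print(pmdp.step(s, np.ones(3)))        # small disagreement: normal model step
print(pmdp.step(s, 100 * np.ones(3)))  # large disagreement: (HALT, -100.0)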
no code implementations • ICLR 2019 • Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli
One plausible explanation is that non-convex neural network training is better suited to fundamentally different learning rate schedules, such as the "cut the learning rate every constant number of epochs" scheme, which closely resembles an exponentially decaying schedule. Note that this widely used schedule stands in stark contrast to the polynomial decay schemes prescribed by the stochastic approximation literature, which are indeed shown to be (worst-case) optimal for classes of convex optimization problems.
1 code implementation • NeurIPS 2019 • Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli
First, this work shows that even if the time horizon $T$ (i.e., the number of iterations for which SGD is run) is known in advance, SGD's final iterate with any polynomially decaying learning rate scheme is highly sub-optimal compared to the minimax rate (by a condition number factor in the strongly convex case and a factor of $\sqrt{T}$ in the non-strongly convex case).
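The two schedule families being contrasted are simple to state in code; in this sketch the drop factor, drop interval, and decay power are illustrative constants, not prescribed values.

def step_decay(eta0, epoch, drop=0.5, epochs_per_drop=30):
    # "Cut the learning rate every constant number of epochs": a geometric
    # (roughly exponential) decay, as widely used in neural network training.
    return eta0 * drop ** (epoch // epochs_per_drop)

def poly_decay(eta0, t, power=1.0):
    # Polynomial decay eta0 / (1 + t)^power, as prescribed by the
    # stochastic approximation literature for convex problems.
    return eta0 / (1.0 + t) ** power

for epoch in (0, 30, 60, 90):
    print(epoch, step_decay(0.1, epoch), round(poly_decay(0.1, epoch), 5))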
2 code implementations • ICLR 2018 • Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham M. Kakade
Extensive empirical results in this paper show that ASGD has performance gains over HB, NAG, and SGD.
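For reference, the stochastic heavy ball (HB) and Nesterov (NAG) updates being compared take the following form (ASGD's recursion involves additional averaging parameters and is omitted here); a minimal sketch with a generic gradient callback and illustrative hyperparameters:

def hb_step(w, v, grad, eta, mu):
    # Stochastic heavy ball: momentum accumulated over past steps.
    v = mu * v - eta * grad(w)
    return w + v, v

def nag_step(w, v, grad, eta, mu):
    # Stochastic Nesterov: the gradient is evaluated at a lookahead point.
    v = mu * v - eta * grad(w + mu * v)
    return w + v, v

# Toy usage on f(w) = 0.5 * w**2, with a deterministic gradient as a
# stand-in for the stochastic gradient oracle.
w, v = 1.0, 0.0
for _ in range(50):
    w, v = nag_step(w, v, grad=lambda x: x, eta=0.1, mu=0.9)
print(round(w, 4))  # close to 0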
no code implementations • 22 Nov 2017 • Naman Agarwal, Sham Kakade, Rahul Kidambi, Yin Tat Lee, Praneeth Netrapalli, Aaron Sidford
Given a matrix $\mathbf{A}\in\mathbb{R}^{n\times d}$ and a vector $b \in\mathbb{R}^{n}$, we show how to compute an $\epsilon$-approximate solution to the regression problem $ \min_{x\in\mathbb{R}^{d}}\frac{1}{2} \|\mathbf{A} x - b\|_{2}^{2} $ in time $ \tilde{O} ((n+\sqrt{d\cdot\kappa_{\text{sum}}})\cdot s\cdot\log\epsilon^{-1}) $, where $\kappa_{\text{sum}}=\mathrm{tr}\left(\mathbf{A}^{\top}\mathbf{A}\right)/\lambda_{\min}(\mathbf{A}^{\top}\mathbf{A})$ and $s$ is the maximum number of non-zero entries in a row of $\mathbf{A}$.
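The quantity $\kappa_{\text{sum}}$ can be computed directly from its definition; the following NumPy sketch does so densely (at cubic cost, purely to illustrate the definition rather than the paper's fast solver):

import numpy as np

def kappa_sum(A):
    # kappa_sum = tr(A^T A) / lambda_min(A^T A), the condition-number
    # quantity appearing in the running time above.
    G = A.T @ A
    return np.trace(G) / np.linalg.eigvalsh(G)[0]  # eigvalsh sorts ascending

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))
print(kappa_sum(A))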
no code implementations • 15 Nov 2017 • Dhruv Mahajan, Vivek Gupta, S. Sathiya Keerthi, Sellamanickam Sundararajan, Shravan Narayanamurthy, Rahul Kidambi
We also demonstrate their usefulness for making design choices, such as the number of classifiers in an ensemble and the size of the training-data subset needed to achieve a given generalization error.
no code implementations • 25 Oct 2017 • Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Venkata Krishna Pillutla, Aaron Sidford
This work provides a simplified proof of the statistical minimax optimality of (iterate averaged) stochastic gradient descent (SGD), for the special case of least squares.
no code implementations • 26 Apr 2017 • Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford
There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g., Nesterov's acceleration, conjugate gradient, heavy ball) for the purposes of stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont 2008 and Devolder, Glineur, and Nesterov 2014.
4 code implementations • 12 Oct 2016 • Prateek Jain, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli, Aaron Sidford
In particular, this work provides a sharp analysis of: (1) mini-batching, a method of averaging many samples of a stochastic gradient both to reduce the variance of the stochastic gradient estimate and to parallelize SGD, and (2) tail-averaging, a method of averaging the final few iterates of SGD to decrease the variance in SGD's final iterate.
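A minimal NumPy sketch of both techniques on least squares; the step size, batch size, and tail fraction are illustrative choices, not the settings analyzed in the paper.

import numpy as np

def minibatch_tail_averaged_sgd(A, b, eta, batch, steps, tail_frac=0.5, seed=0):
    # SGD on 0.5 * ||A x - b||^2 with (1) mini-batching: each stochastic
    # gradient averages `batch` sampled rows, reducing its variance (and
    # exposing parallelism), and (2) tail-averaging: the returned solution
    # averages the final `tail_frac` fraction of iterates.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    tail_start = int(steps * (1 - tail_frac))
    x_bar, count = np.zeros(d), 0
    for t in range(steps):
        idx = rng.integers(0, n, size=batch)          # sample a mini-batch of rows
        g = A[idx].T @ (A[idx] @ x - b[idx]) / batch  # averaged stochastic gradient
        x -= eta * g
        if t >= tail_start:                           # accumulate the tail average
            x_bar += x
            count += 1
    return x_bar / count

rng = np.random.default_rng(1)
A = rng.normal(size=(200, 5))
x_star = rng.normal(size=5)
b = A @ x_star + 0.1 * rng.normal(size=200)
x_hat = minibatch_tail_averaged_sgd(A, b, eta=0.05, batch=8, steps=2000)
print(np.linalg.norm(x_hat - x_star))  # small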
no code implementations • NeurIPS 2015 • Jennifer Gillenwater, Rishabh Iyer, Bethany Lusch, Rahul Kidambi, Jeff Bilmes
We show that there is a largely unexplored class of functions (positive polymatroids) that can define proper discrete metrics over pairs of binary vectors and that are fairly tractable to optimize over.
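As a worked special case (assuming the metrics take the symmetric-difference form over supports, which is what recovers Hamming distance): for binary vectors $x, y$ with supports $A_x, A_y$ and a positive polymatroid $f$,
\[
  d_f(x, y) \;=\; f\!\left(A_x \,\triangle\, A_y\right),
\]
and the modular choice $f(S) = |S|$ gives $d_f(x, y) = |A_x \,\triangle\, A_y| = \sum_i \mathbf{1}[x_i \neq y_i]$, the ordinary Hamming distance; non-modular positive polymatroids generalize it.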
no code implementations • 10 Nov 2013 • Vinod Nair, Rahul Kidambi, Sundararajan Sellamanickam, S. Sathiya Keerthi, Johannes Gehrke, Vijay Narayanan
We consider the problem of quantitatively evaluating missing value imputation algorithms.
no code implementations • 9 Nov 2013 • Rahul Kidambi, Vinod Nair, Sundararajan Sellamanickam, S. Sathiya Keerthi
In this paper we propose a structured output approach for missing value imputation that also incorporates domain constraints.