Search Results for author: Kush Bhatia

Found 23 papers, 4 papers with code

The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry

no code implementations • 6 Feb 2024 • Michael Zhang, Kush Bhatia, Hermann Kumbong, Christopher Ré

Experiments show Hedgehog recovers over 99% of standard Transformer quality in train-from-scratch and finetuned-conversion settings, outperforming prior linear attentions by up to 6 perplexity points on WikiText-103 with causal GPTs, and by up to 8.7 GLUE score points on finetuned bidirectional BERTs.
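Hedgehog replaces softmax attention with linear attention under a learned feature map trained to mimic softmax. A minimal numpy sketch of the linear-attention computation, using a softmax-mimicking exponential feature map as an illustrative stand-in for the paper's learned MLP map:

```python
import numpy as np

def linear_attention(Q, K, V, feature_map):
    """O(n d^2) attention: phi(Q) (phi(K)^T V), row-normalized so each
    output row is a convex combination of the rows of V."""
    Qf, Kf = feature_map(Q), feature_map(K)            # (n, r)
    out = Qf @ (Kf.T @ V)                              # (n, d), never forms n x n
    norm = Qf @ Kf.sum(axis=0, keepdims=True).T        # (n, 1)
    return out / np.maximum(norm, 1e-6)

def softmax_mimicking_map(X):
    """Illustrative positive feature map, [exp(x), exp(-x)]; the actual
    Hedgehog map is learned, not this fixed choice."""
    return np.concatenate([np.exp(X), np.exp(-X)], axis=-1)

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 4)) for _ in range(3))
out = linear_attention(Q, K, V, softmax_mimicking_map)
```

Because the feature map is positive, the normalized weights sum to one, so outputs stay in the convex hull of V, mirroring softmax attention's behavior.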

SuperHF: Supervised Iterative Learning from Human Feedback

1 code implementation • 25 Oct 2023 • Gabriel Mukobi, Peter Chatain, Su Fong, Robert Windesheim, Gitta Kutyniok, Kush Bhatia, Silas Alberti

Here, we focus on two prevalent methods used to align these models, Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).

Language Modelling

Reward Learning as Doubly Nonparametric Bandits: Optimal Design and Scaling Laws

no code implementations • 23 Feb 2023 • Kush Bhatia, Wenshuo Guo, Jacob Steinhardt

We specifically show that the well-studied problem of Gaussian process (GP) bandit optimization is a special case of our framework, and that our bounds either improve or are competitive with known regret guarantees for the Matérn kernel.

Congested Bandits: Optimal Routing via Short-term Resets

no code implementations • 23 Jan 2023 • Pranjal Awasthi, Kush Bhatia, Sreenivas Gollapudi, Kostas Kollias

For the linear contextual bandit setup, our algorithm, based on an iterative least squares planner, achieves policy regret $\tilde{O}(\sqrt{dT} + \Delta)$.
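The iterative least-squares planner maintains a regularized least-squares estimate of the reward parameters from observed context-arm features. A generic sketch of the rank-one estimate update (illustrative of least-squares planning in linear contextual bandits, not the paper's exact algorithm):

```python
import numpy as np

def ridge_update(A, b, x, r, lam=1.0):
    """Rank-one update of the regularized least-squares estimate
    theta = (A + lam I)^{-1} b after observing context-arm features x
    with reward r. A hypothetical sketch, not the paper's planner."""
    A = A + np.outer(x, x)      # accumulate covariance
    b = b + r * x               # accumulate reward-weighted features
    theta = np.linalg.solve(A + lam * np.eye(len(b)), b)
    return A, b, theta

# One observation: feature e1, reward 2 -> theta = (e1 e1^T + I)^{-1} (2 e1)
A, b, theta = ridge_update(np.zeros((2, 2)), np.zeros(2),
                           np.array([1.0, 0.0]), 2.0)
```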

On the Sensitivity of Reward Inference to Misspecified Human Models

no code implementations • 9 Dec 2022 • Joey Hong, Kush Bhatia, Anca Dragan

This raises the question: how accurate do these models need to be in order for the reward inference to be accurate?

Continuous Control

Ask Me Anything: A simple strategy for prompting language models

3 code implementations • 5 Oct 2022 • Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré

Prompting is a brittle process: small modifications to the prompt can cause large variations in the model's predictions, so significant effort goes into painstakingly designing a "perfect prompt" for a task.

Coreference Resolution • Natural Language Inference • +2
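Ask Me Anything sidesteps the "perfect prompt" problem by collecting answers from several imperfect prompt variants and aggregating them. The paper aggregates with weak supervision; plain majority voting is a simplified stand-in that conveys the idea:

```python
from collections import Counter

def aggregate_prompt_answers(answers):
    """Majority vote over answers produced by different prompt variants.
    AMA itself learns per-prompt weights via weak supervision; unweighted
    voting here is an illustrative simplification."""
    return Counter(answers).most_common(1)[0][0]

# Three prompt variants answer the same yes/no question:
prediction = aggregate_prompt_answers(["yes", "no", "yes"])  # → "yes"
```

Aggregation makes the final prediction robust to any single brittle prompt, which is the point the abstract is making.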

Statistical and Computational Trade-offs in Variational Inference: A Case Study in Inferential Model Selection

no code implementations • 22 Jul 2022 • Kush Bhatia, Nikki Lijing Kuang, Yi-An Ma, Yixin Wang

Focusing on Gaussian inferential models (or variational approximating families) with diagonal plus low-rank precision matrices, we initiate a theoretical study of the trade-offs in two aspects, Bayesian posterior inference error and frequentist uncertainty quantification error.

Bayesian Inference • Computational Efficiency • +4
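The inferential models studied here are Gaussians whose precision matrix is a diagonal plus a low-rank term, which interpolates between cheap mean-field (rank 0) and a full covariance. A small sketch of constructing such a precision matrix (the parametrization shown is a standard choice, assumed for illustration):

```python
import numpy as np

def dlr_precision(d_diag, U):
    """Precision matrix Lambda = diag(d) + U U^T of a Gaussian variational
    family with diagonal-plus-low-rank structure; positive definite
    whenever d > 0, since U U^T is positive semidefinite."""
    return np.diag(d_diag) + U @ U.T

rng = np.random.default_rng(1)
d = np.full(5, 2.0)                     # diagonal part, all entries > 0
U = rng.standard_normal((5, 2))         # rank-2 correction
Lam = dlr_precision(d, U)
eigs = np.linalg.eigvalsh(Lam)          # all eigenvalues >= min(d)
```

The rank of U is the knob trading statistical accuracy against computational cost, which is the trade-off the paper analyzes.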

The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models

1 code implementation • ICLR 2022 • Alexander Pan, Kush Bhatia, Jacob Steinhardt

Reward hacking -- where RL agents exploit gaps in misspecified reward functions -- has been widely observed, but not yet systematically studied.

Anomaly Detection

Preference learning along multiple criteria: A game-theoretic perspective

no code implementations • NeurIPS 2020 • Kush Bhatia, Ashwin Pananjady, Peter L. Bartlett, Anca D. Dragan, Martin J. Wainwright

Finally, we showcase the practical utility of our framework in a user study on autonomous driving, where we find that the Blackwell winner outperforms the von Neumann winner for the overall preferences.

Autonomous Driving

Agnostic learning with unknown utilities

no code implementations • 17 Apr 2021 • Kush Bhatia, Peter L. Bartlett, Anca D. Dragan, Jacob Steinhardt

This raises the interesting question of whether learning is even possible in our setup, given that a generalizable estimate of the utility $u^*$ may not be obtainable from finitely many samples.

Online learning with dynamics: A minimax perspective

no code implementations • NeurIPS 2020 • Kush Bhatia, Karthik Sridharan

In this setting, we study the problem of minimizing policy regret and provide non-constructive upper bounds on the minimax rate for the problem.

Counterfactual

Bayesian Robustness: A Nonasymptotic Viewpoint

no code implementations • 27 Jul 2019 • Kush Bhatia, Yi-An Ma, Anca D. Dragan, Peter L. Bartlett, Michael I. Jordan

We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers.

Binary Classification • Regression

Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression

no code implementations • 19 Mar 2019 • Arun Sai Suggala, Kush Bhatia, Pradeep Ravikumar, Prateek Jain

We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions.

Regression

FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network

1 code implementation • NeurIPS 2018 • Aditya Kusupati, Manish Singh, Kush Bhatia, Ashish Kumar, Prateek Jain, Manik Varma

FastRNN addresses these limitations by adding a residual connection that does not constrain the range of the singular values explicitly and has only two extra scalar parameters.

Action Classification • Language Modelling • +3
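The FastRNN update the abstract describes is a standard RNN step plus a weighted residual connection, with the mixing weights as the two extra scalar parameters. A minimal numpy sketch of one such step:

```python
import numpy as np

def fastrnn_step(h, x, W, U, b, alpha, beta):
    """One FastRNN update, h_t = alpha * tanh(W x + U h + b) + beta * h:
    the residual connection from the abstract, with alpha and beta as the
    two extra scalar parameters. A sketch of the update rule only."""
    return alpha * np.tanh(W @ x + U @ h + b) + beta * h

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))         # input weights (hidden 3, input 4)
U = rng.standard_normal((3, 3))         # recurrent weights
b = np.zeros(3)
h, x = np.zeros(3), rng.standard_normal(4)
h_next = fastrnn_step(h, x, W, U, b, alpha=0.1, beta=0.9)
```

Keeping alpha small and beta near one keeps the state transition close to the identity, which is how the residual connection stabilizes training without explicitly constraining singular values.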

Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems

no code implementations • 20 Dec 2018 • Dhruv Malik, Ashwin Pananjady, Kush Bhatia, Koulik Khamaru, Peter L. Bartlett, Martin J. Wainwright

We focus on characterizing the convergence rate of these methods when applied to linear-quadratic systems, and study various settings of driving noise and reward feedback.

Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation

no code implementations • NeurIPS 2018 • Kush Bhatia, Aldo Pacchiano, Nicolas Flammarion, Peter L. Bartlett, Michael I. Jordan

In this paper, we study the problems of principal Generalized Eigenvector computation and Canonical Correlation Analysis in the stochastic setting.

Gen-Oja: A Two-time-scale approach for Streaming CCA

no code implementations • 20 Nov 2018 • Kush Bhatia, Aldo Pacchiano, Nicolas Flammarion, Peter L. Bartlett, Michael I. Jordan

In this paper, we study the problems of principal Generalized Eigenvector computation and Canonical Correlation Analysis in the stochastic setting.

Vocal Bursts Valence Prediction
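The two-time-scale idea behind Gen-Oja can be sketched as a pair of coupled updates: a fast iterate tracks a linear-system solve, while a slow Oja-style step moves the eigenvector estimate. The step sizes and toy problem below are illustrative, not the paper's schedules:

```python
import numpy as np

def gen_oja_step(v, w, A_t, B_t, alpha, beta):
    """Schematic two-time-scale update for the top generalized eigenvector
    of the pair (A, B): the fast iterate w tracks B^{-1} A v, while a slow
    Oja-style step moves v along w. Illustrative step sizes only."""
    w = w + alpha * (A_t @ v - B_t @ w)   # fast timescale: w -> B^{-1} A v
    v = v + beta * w                      # slow timescale: Oja step
    return v / np.linalg.norm(v), w

# Toy check: A = diag(2, 1), B = I, so the top generalized eigenvector is e1.
A, B = np.diag([2.0, 1.0]), np.eye(2)
v, w = np.array([1.0, 1.0]) / np.sqrt(2.0), np.zeros(2)
for _ in range(2000):
    v, w = gen_oja_step(v, w, A, B, alpha=0.1, beta=0.01)
```

In the streaming setting A_t and B_t would be per-sample stochastic estimates of A and B; the deterministic matrices here just make the convergence visible.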

Establishing Appropriate Trust via Critical States

no code implementations • 18 Oct 2018 • Sandy H. Huang, Kush Bhatia, Pieter Abbeel, Anca D. Dragan

In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts.

Robotics

Consistent Robust Regression

no code implementations • NeurIPS 2017 • Kush Bhatia, Prateek Jain, Parameswaran Kamalaruban, Purushottam Kar

We present the first efficient and provably consistent estimator for the robust regression problem.

Regression

Efficient and Consistent Robust Time Series Analysis

no code implementations • 1 Jul 2016 • Kush Bhatia, Prateek Jain, Parameswaran Kamalaruban, Purushottam Kar

We illustrate our methods on synthetic datasets and show that our methods indeed are able to consistently recover the optimal parameters despite a large fraction of points being corrupted.

Regression • Time Series • +1

Sparse Local Embeddings for Extreme Multi-label Classification

no code implementations • NeurIPS 2015 • Kush Bhatia, Himanshu Jain, Purushottam Kar, Manik Varma, Prateek Jain

The objective in extreme multi-label learning is to train a classifier that can automatically tag a novel data point with the most relevant subset of labels from an extremely large label set.

Classification • Extreme Multi-Label Classification • +3

Locally Non-linear Embeddings for Extreme Multi-label Learning

no code implementations • 9 Jul 2015 • Kush Bhatia, Himanshu Jain, Purushottam Kar, Prateek Jain, Manik Varma

Embedding based approaches make training and prediction tractable by assuming that the training label matrix is low-rank and hence the effective number of labels can be reduced by projecting the high dimensional label vectors onto a low dimensional linear subspace.

Extreme Multi-Label Classification • General Classification • +2
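The low-rank assumption the abstract describes amounts to projecting high-dimensional label vectors onto a small linear subspace. A minimal sketch via truncated SVD (the baseline assumption this line of work relaxes with local, non-linear embeddings):

```python
import numpy as np

def lowrank_label_embedding(Y, r):
    """Project an (n x L) label matrix onto an r-dimensional linear
    subspace via truncated SVD. Returns the embedded labels Z and the
    back-projection to label space; illustrative of the low-rank
    assumption, not any particular method's training procedure."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    Z = U[:, :r] * s[:r]          # n x r embedded label vectors
    decode = Vt[:r]               # r x L map back to label space
    return Z, decode

# A rank-1 toy label matrix is recovered exactly with r = 1.
Y = np.outer([1.0, 1.0, 0.0], [1.0, 0.0, 1.0])
Z, D = lowrank_label_embedding(Y, r=1)
```

When the true label matrix is not low-rank, as is typical in extreme multi-label settings with tail labels, this global projection loses information, which motivates the locally non-linear embeddings studied here.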

Robust Regression via Hard Thresholding

no code implementations • NeurIPS 2015 • Kush Bhatia, Prateek Jain, Purushottam Kar

In this work, we study a simple hard-thresholding algorithm called TORRENT which, under mild conditions on X, can recover w* exactly even if b corrupts the response variables in an adversarial manner, i.e., both the support and entries of b are selected adversarially after observing X and w*.

Regression
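The hard-thresholding idea can be sketched as alternating a least-squares fit with re-selecting the points that fit best. A minimal version in the spirit of TORRENT, under a noiseless model with a known budget of clean points (not the paper's exact variant or guarantees):

```python
import numpy as np

def hard_threshold_regression(X, y, k, iters=20):
    """Alternate: fit least squares on the current active set, then keep
    the k points with smallest absolute residuals as the new active set.
    A minimal sketch in the spirit of TORRENT."""
    S = np.arange(len(y))                 # start with every point active
    for _ in range(iters):
        w, *_ = np.linalg.lstsq(X[S], y[S], rcond=None)
        resid = np.abs(y - X @ w)
        S = np.argsort(resid)[:k]         # hard threshold: keep k smallest
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2))
w_true = np.array([1.0, -2.0])
y = X @ w_true
y[:5] += 10.0                             # corrupt 5 responses (the vector b)
w_hat = hard_threshold_regression(X, y, k=45)
```

Corrupted points incur large residuals under a reasonable fit and get thresholded out, after which the clean points determine w* exactly in this noiseless toy.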
