no code implementations • 10 Oct 2024 • Amrith Setlur, Chirag Nagpal, Adam Fisch, Xinyang Geng, Jacob Eisenstein, Rishabh Agarwal, Alekh Agarwal, Jonathan Berant, Aviral Kumar
Our key insight is that, to be effective, the process reward for a step should measure progress: the change in the likelihood of eventually producing a correct response, measured before and after taking the step, which corresponds to the notion of step-level advantages in RL.
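This notion of progress can be sketched as a Monte-Carlo estimate: roll out completions from a prefix before and after a candidate step, and take the difference in estimated success probability. The function names (`success_prob`, `step_advantage`) and the toy completer below are illustrative assumptions, not the paper's implementation.

```python
import random

def success_prob(complete_fn, prefix, n_rollouts=200, seed=0):
    # Monte-Carlo estimate of P(correct final answer | prefix):
    # complete the prefix n_rollouts times and count successes.
    rng = random.Random(seed)
    wins = sum(complete_fn(prefix, rng) for _ in range(n_rollouts))
    return wins / n_rollouts

def step_advantage(complete_fn, prefix, step, **kw):
    # "Progress" made by `step`: change in the estimated probability
    # of eventually reaching a correct answer, before vs. after it.
    return (success_prob(complete_fn, prefix + [step], **kw)
            - success_prob(complete_fn, prefix, **kw))

# Hypothetical toy completer: each correct step already in the prefix
# (encoded as 1; unhelpful steps as 0) raises the chance that a random
# completion reaches the right answer.
def toy_complete(prefix, rng):
    p = min(1.0, 0.2 + 0.3 * sum(prefix))
    return rng.random() < p

adv_good = step_advantage(toy_complete, [1], 1)  # correct next step
adv_bad = step_advantage(toy_complete, [1], 0)   # unhelpful next step
```

Under this toy model the correct step earns a positive advantage while the unhelpful one earns none, which is exactly the signal a progress-based process reward is meant to provide.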
1 code implementation • 20 Jun 2024 • Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar
With this per-step scheme, we attain consistent gains over training on only positive data, matching the performance of amplifying the amount of synthetic data by $\mathbf{8 \times}$.
1 code implementation • 24 Dec 2023 • Pratiksha Thaker, Amrith Setlur, Zhiwei Steven Wu, Virginia Smith
Public pretraining is a promising approach to improve differentially private model training.
no code implementations • NeurIPS 2023 • Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan
Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning).
1 code implementation • 5 Dec 2023 • Atharva Kulkarni, Lucio Dery, Amrith Setlur, Aditi Raghunathan, Ameet Talwalkar, Graham Neubig
We primarily consider the standard setting of fine-tuning a pre-trained model, where, following recent work \citep{gururangan2020don, dery2023aang}, we multitask the end task with the pre-training objective constructed from the end task data itself.
1 code implementation • 2 Oct 2023 • Katie Kang, Amrith Setlur, Claire Tomlin, Sergey Levine
Rather than extrapolating in arbitrary ways, we observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
no code implementations • 19 Jul 2023 • Gaurav Ghosal, Amrith Setlur, Daniel S. Brown, Anca D. Dragan, Aditi Raghunathan
We formalize a new setting called contextual reliability which accounts for the fact that the "right" features to use may vary depending on the context.
no code implementations • 19 Jun 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
Effective machine learning models learn both robust features that directly determine the outcome of interest (e.g., an object with wheels is more likely to be a car) and shortcut features (e.g., an object on a road is more likely to be a car).
no code implementations • 10 Feb 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
Transfer learning with a small amount of target data is an effective and common approach to adapting a pre-trained model to distribution shifts.
1 code implementation • 6 Feb 2023 • Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine
Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points.
no code implementations • 3 Jun 2022 • Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine
Supervised learning methods trained with maximum likelihood objectives often overfit on training data.
1 code implementation • NeurIPS 2021 • Amrith Setlur, Oscar Li, Virginia Smith
We categorize meta-learning evaluation into two settings: $\textit{in-distribution}$ [ID], in which the train and test tasks are sampled $\textit{iid}$ from the same underlying task distribution, and $\textit{out-of-distribution}$ [OOD], in which they are not.
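The two evaluation settings can be illustrated by how evaluation tasks are drawn relative to the training task distribution; the Gaussian task parameterization below is a hypothetical stand-in for a real task distribution.

```python
import random

rng = random.Random(0)

def sample_task(mean):
    # A "task" here is reduced to a single hypothetical parameter
    # drawn from the underlying task distribution.
    return rng.gauss(mean, 1.0)

# ID evaluation: train and test tasks are sampled iid from the
# same underlying task distribution.
train_tasks = [sample_task(0.0) for _ in range(5)]
id_test_tasks = [sample_task(0.0) for _ in range(5)]

# OOD evaluation: test tasks come from a shifted task distribution,
# so the train/test tasks are no longer identically distributed.
ood_test_tasks = [sample_task(3.0) for _ in range(5)]
```

The distinction matters because a meta-learner can look strong when test tasks resemble training tasks (ID) yet degrade when the task distribution shifts (OOD).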
no code implementations • 28 Nov 2020 • Amrith Setlur, Oscar Li, Virginia Smith
Meta-learning is a popular framework for learning with limited data in which an algorithm is produced by training over multiple few-shot learning tasks.
no code implementations • ICLR 2021 • Divyansh Kaushik, Amrith Setlur, Eduard Hovy, Zachary C. Lipton
In attempts to produce ML models less reliant on spurious patterns in NLP datasets, researchers have recently proposed curating counterfactually augmented data (CAD) via a human-in-the-loop process: given some documents and their (initial) labels, humans must revise the text to make a counterfactual label applicable.
no code implementations • 18 Aug 2020 • Hai Pham, Amrith Setlur, Saket Dingliwal, Tzu-Hsiang Lin, Barnabas Poczos, Kang Huang, Zhuo Li, Jae Lim, Collin McCormack, Tam Vu
Despite the advent of deep learning in computer vision, the general handwriting recognition problem is far from solved.
no code implementations • 25 Jul 2020 • Amrith Setlur, Barnabas Poczos, Alan W. Black
This paper extends recent work on nonlinear Independent Component Analysis (ICA) by introducing a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables.
1 code implementation • ICML Workshop LifelongML 2020 • Amrith Setlur, Saket Dingliwal, Barnabas Poczos
Based on this model we propose a computationally feasible meta-learning algorithm by introducing meaningful relaxations in our final objective.
1 code implementation • ACL 2020 • Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W. Black, Shrimai Prabhumoye
This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning.
no code implementations • 22 Oct 2019 • Amrith Setlur, Barnabás Póczós
Temporal Point Processes (TPP) with partial likelihoods involving a latent structure often entail an intractable marginalization, thus making inference hard.