Search Results for author: Amrith Setlur

Found 19 papers, 8 papers with code

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

no code implementations • 10 Oct 2024 • Amrith Setlur, Chirag Nagpal, Adam Fisch, Xinyang Geng, Jacob Eisenstein, Rishabh Agarwal, Alekh Agarwal, Jonathan Berant, Aviral Kumar

Our key insight is that, to be effective, the process reward for a step should measure progress: a change in the likelihood of producing a correct response in the future, before and after taking the step, corresponding to the notion of step-level advantages in RL.

Reinforcement Learning (RL)
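
A minimal sketch of the "progress" notion described in this entry, assuming a hypothetical `success_prob` estimator (e.g., Monte Carlo rollouts of a prover policy); the paper's actual estimator and training setup differ in details:

```python
from typing import Callable, List

def progress_rewards(
    prefix_states: List[str],
    success_prob: Callable[[str], float],
) -> List[float]:
    """Per-step 'progress' rewards for a reasoning trace.

    prefix_states[i] is the partial solution after i steps; success_prob
    estimates the probability that continuing from a given prefix eventually
    yields a correct final answer (e.g., via Monte Carlo rollouts of a
    prover policy). Both names are illustrative, not from the paper.
    """
    rewards = []
    for before, after in zip(prefix_states[:-1], prefix_states[1:]):
        # Step-level advantage: change in estimated success probability
        # induced by taking this step.
        rewards.append(success_prob(after) - success_prob(before))
    return rewards
```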

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

1 code implementation • 20 Jun 2024 • Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

With this per-step scheme, we attain consistent gains over using only positive data, reaching performance similar to amplifying the amount of synthetic data by 8×.

Math, Reinforcement Learning (RL)

Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

no code implementations • NeurIPS 2023 • Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning).

Contrastive Learning, Unsupervised Domain Adaptation

Multitask Learning Can Improve Worst-Group Outcomes

1 code implementation • 5 Dec 2023 • Atharva Kulkarni, Lucio Dery, Amrith Setlur, Aditi Raghunathan, Ameet Talwalkar, Graham Neubig

We primarily consider the standard setting of fine-tuning a pre-trained model, where, following recent work (Gururangan et al., 2020; Dery et al., 2023), we multitask the end task with the pre-training objective constructed from the end task data itself.

Fairness
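
A rough sketch of the multitasking recipe described in this entry: combine the end-task loss with a pre-training-style (masked-LM) loss computed on the same end-task text. The encoder/head names, batch keys, and weighting below are illustrative assumptions, not the paper's exact setup.

```python
import torch
import torch.nn.functional as F

def multitask_loss(encoder, clf_head, mlm_head, batch, lam=1.0):
    """Combine the end-task loss with a masked-LM loss built from the
    same end-task text (illustrative sketch only).

    Assumes encoder(input_ids, attention_mask) returns per-token hidden
    states of shape [batch, seq_len, hidden]; batch carries the labeled
    inputs plus a masked copy of the same sentences.
    """
    # End-task loss, e.g., classification from the first-token representation.
    h = encoder(batch["input_ids"], batch["attention_mask"])
    task_loss = F.cross_entropy(clf_head(h[:, 0]), batch["labels"])

    # Pre-training-style objective on masked copies of the same text.
    h_mlm = encoder(batch["mlm_input_ids"], batch["attention_mask"])
    mlm_loss = F.cross_entropy(
        mlm_head(h_mlm).flatten(0, 1),   # [batch * seq_len, vocab]
        batch["mlm_labels"].flatten(),   # [batch * seq_len]
        ignore_index=-100,               # only masked positions contribute
    )
    return task_loss + lam * mlm_loss
```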

Deep Neural Networks Tend To Extrapolate Predictably

1 code implementation • 2 Oct 2023 • Katie Kang, Amrith Setlur, Claire Tomlin, Sergey Levine

Rather than extrapolating in arbitrary ways, we observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.

Decision Making
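
One way to make the observation above concrete (a hypothetical diagnostic, not the paper's exact protocol): compare the model's average predicted distribution on OOD inputs against the best constant predictor fit to the training labels, which under cross-entropy is just the marginal label frequency.

```python
import numpy as np

def distance_to_constant(probs_ood: np.ndarray, train_label_freq: np.ndarray) -> float:
    """Compare the average prediction on OOD inputs with the constant
    predictor that best fits the training labels (under cross-entropy,
    simply the marginal label frequencies).

    probs_ood: [n, k] softmax outputs on OOD inputs.
    train_label_freq: [k] empirical label frequencies on the training set.
    """
    mean_pred = probs_ood.mean(axis=0)
    # Total-variation distance; the paper's observation suggests this
    # shrinks as inputs move further out of distribution.
    return 0.5 * float(np.abs(mean_pred - train_label_freq).sum())
```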

Contextual Reliability: When Different Features Matter in Different Contexts

no code implementations • 19 Jul 2023 • Gaurav Ghosal, Amrith Setlur, Daniel S. Brown, Anca D. Dragan, Aditi Raghunathan

We formalize a new setting called contextual reliability which accounts for the fact that the "right" features to use may vary depending on the context.

Confidence-Based Model Selection: When to Take Shortcuts for Subpopulation Shifts

no code implementations • 19 Jun 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn

Effective machine learning models learn both robust features that directly determine the outcome of interest (e.g., an object with wheels is more likely to be a car) and shortcut features (e.g., an object on a road is more likely to be a car).

Model Selection
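
The selection rule suggested by the title can be sketched as follows, with hypothetical names and a simple max-probability confidence score standing in for whatever criterion the paper actually uses:

```python
from typing import Dict
import numpy as np

def select_by_confidence(prob_outputs: Dict[str, np.ndarray]) -> str:
    """Pick the candidate model whose predictions are most confident on
    a batch of unlabeled target inputs.

    prob_outputs maps a model name (e.g., 'robust', 'shortcut') to its
    [n, k] softmax outputs on the target data; confidence here is the
    mean max-probability score, a stand-in for the paper's criterion.
    """
    avg_conf = {name: float(p.max(axis=1).mean()) for name, p in prob_outputs.items()}
    return max(avg_conf, key=avg_conf.get)
```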

Project and Probe: Sample-Efficient Domain Adaptation by Interpolating Orthogonal Features

no code implementations • 10 Feb 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn

Transfer learning with a small amount of target data is an effective and common approach to adapting a pre-trained model to distribution shifts.

Domain Adaptation, Transfer Learning

Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts

1 code implementation • 6 Feb 2023 • Amrith Setlur, Don Dennis, Benjamin Eysenbach, Aditi Raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine

Some robust training algorithms (e.g., Group DRO) specialize to group shifts and require group information on all training points.

Adversarial Unlearning: Reducing Confidence Along Adversarial Directions

no code implementations • 3 Jun 2022 • Amrith Setlur, Benjamin Eysenbach, Virginia Smith, Sergey Levine

Supervised learning methods trained with maximum likelihood objectives often overfit on training data.

Data Augmentation

Two Sides of Meta-Learning Evaluation: In vs. Out of Distribution

1 code implementation • NeurIPS 2021 • Amrith Setlur, Oscar Li, Virginia Smith

We categorize meta-learning evaluation into two settings: in-distribution [ID], in which the train and test tasks are sampled i.i.d. from the same underlying task distribution, and out-of-distribution [OOD], in which they are not.

Few-Shot Learning, Learning Theory, +2

Is Support Set Diversity Necessary for Meta-Learning?

no code implementations • 28 Nov 2020 • Amrith Setlur, Oscar Li, Virginia Smith

Meta-learning is a popular framework for learning with limited data in which an algorithm is produced by training over multiple few-shot learning tasks.

Diversity, Few-Shot Learning

Explaining The Efficacy of Counterfactually Augmented Data

no code implementations • ICLR 2021 • Divyansh Kaushik, Amrith Setlur, Eduard Hovy, Zachary C. Lipton

In attempts to produce ML models less reliant on spurious patterns in NLP datasets, researchers have recently proposed curating counterfactually augmented data (CAD) via a human-in-the-loop process in which, given some documents and their (initial) labels, humans must revise the text to make a counterfactual label applicable.

Counterfactual, Domain Generalization

Nonlinear ISA with Auxiliary Variables for Learning Speech Representations

no code implementations • 25 Jul 2020 • Amrith Setlur, Barnabas Poczos, Alan W. Black

This paper extends recent work on nonlinear Independent Component Analysis (ICA) by introducing a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables.

Speaker Verification

Covariate Distribution Aware Meta-learning

1 code implementation • ICML Workshop LifelongML 2020 • Amrith Setlur, Saket Dingliwal, Barnabas Poczos

Based on this model, we propose a computationally feasible meta-learning algorithm by introducing meaningful relaxations in our final objective.

Few-Shot Learning, Regression

Politeness Transfer: A Tag and Generate Approach

1 code implementation • ACL 2020 • Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W. Black, Shrimai Prabhumoye

This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning.

Sentence, Style Transfer, +1

Better Approximate Inference for Partial Likelihood Models with a Latent Structure

no code implementations • 22 Oct 2019 • Amrith Setlur, Barnabás Póczós

Temporal Point Processes (TPP) with partial likelihoods involving a latent structure often entail an intractable marginalization, thus making inference hard.

Point Processes, Survival Analysis
