1 code implementation • 1 Jun 2023 • Elan Rosenfeld, Saurabh Garg
Estimating the bound requires optimizing one multiclass classifier to disagree with another, for which some prior works have used sub-optimal proxy losses; we devise a "disagreement loss" which is theoretically justified and performs better in practice.
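A minimal sketch of the training signal described here, assuming a fixed reference classifier whose predictions a critic should contradict on target data. The loss below is one natural smooth surrogate (the negative log-probability of predicting anything other than the reference label); the exact form of the paper's disagreement loss may differ.

```python
import torch
import torch.nn.functional as F

def disagreement_loss(critic_logits, ref_labels):
    """Encourage a critic to disagree with a fixed reference classifier.

    critic_logits: (N, C) outputs of the critic being trained.
    ref_labels:    (N,)   argmax predictions of the reference classifier.
    A generic smooth surrogate, -log P(critic predicts a different class);
    the paper's exact disagreement loss may be formulated differently.
    """
    log_probs = F.log_softmax(critic_logits, dim=-1)
    p_agree = log_probs.gather(1, ref_labels[:, None]).exp().squeeze(1)
    return -torch.log1p(-p_agree.clamp(max=1 - 1e-6)).mean()
```

In a bound-estimation setup of this kind, the critic would typically also be trained with ordinary cross-entropy to agree with the reference model on labeled source data.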
1 code implementation • 31 May 2023 • Dheeraj Baby, Saurabh Garg, Tzu-Ching Yen, Sivaraman Balakrishnan, Zachary Chase Lipton, Yu-Xiang Wang
In the supervised setting, we must both learn a classifier and adapt to the dynamically evolving class marginals given only labeled online data.
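One simple baseline for this supervised setting, shown only as an illustration and not as the paper's method: keep an exponentially weighted estimate of the current class marginals from the labels that arrive online, and rescale a fixed base classifier's probabilities by the ratio of the estimated current prior to the training prior.

```python
import numpy as np

class PriorCorrectedClassifier:
    """Generic online prior-correction baseline (not the paper's method).

    Maintains an exponentially weighted estimate of the current class marginals
    from labels observed online and rescales a fixed base model's probabilities
    by the ratio of the estimated current prior to the training prior.
    """

    def __init__(self, base_predict_proba, train_prior, decay=0.99):
        self.base_predict_proba = base_predict_proba   # x -> (N, C) probabilities
        self.train_prior = np.asarray(train_prior)     # class marginals at training time
        self.current_prior = self.train_prior.copy()
        self.decay = decay

    def update(self, labels_t):
        """Update the running class-marginal estimate with labels from round t."""
        counts = np.bincount(labels_t, minlength=len(self.train_prior))
        empirical = counts / max(counts.sum(), 1)
        self.current_prior = self.decay * self.current_prior + (1 - self.decay) * empirical

    def predict_proba(self, x):
        probs = self.base_predict_proba(x) * (self.current_prior / self.train_prior)
        return probs / probs.sum(axis=1, keepdims=True)
```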
1 code implementation • 6 Feb 2023 • Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton
Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously underexplored.
1 code implementation • 6 Feb 2023 • Zachary Novack, Julian McAuley, Zachary C. Lipton, Saurabh Garg
Open vocabulary models (e.g., CLIP) enable zero-shot classification over arbitrary, user-specified label sets.
no code implementations • 18 Jan 2023 • Renjie Li, Chun Yu Lao, Rebecca St. George, Katherine Lawler, Saurabh Garg, Son N. Tran, Quan Bai, Jane Alty
RMT and a range of DLC models were applied to the video data with tapping frequencies up to 8 Hz to extract movement features.
1 code implementation • 29 Nov 2022 • Zachary Novack, Simran Kaur, Tanya Marwah, Saurabh Garg, Zachary C. Lipton
A number of competing hypotheses have been proposed to explain why small-batch Stochastic Gradient Descent (SGD) leads to improved generalization over the full-batch regime, with recent work crediting the implicit regularization of various quantities throughout training.
1 code implementation • 26 Oct 2022 • Pratyush Maini, Saurabh Garg, Zachary C. Lipton, J. Zico Kolter
Popular metrics derived from these dynamics include (i) the epoch at which examples are first correctly classified; (ii) the number of times their predictions flip during training; and (iii) whether their prediction flips if they are held out.
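The first two metrics are straightforward to compute from logged per-epoch predictions; the sketch below assumes a (num_epochs, num_examples) array of recorded predictions and only illustrates the bookkeeping, not the paper's implementation.

```python
import numpy as np

def training_dynamics_metrics(preds, labels):
    """Compute two standard training-dynamics metrics from logged predictions.

    preds:  (num_epochs, num_examples) integer predictions recorded after each epoch.
    labels: (num_examples,) ground-truth labels.
    Returns the epoch at which each example is first predicted correctly (-1 if never)
    and the number of times its prediction changes between consecutive epochs.
    (The third metric in the entry, whether the prediction flips when the example is
    held out, additionally requires a training run without that example.)
    """
    correct = preds == labels[None, :]                          # (E, N) boolean
    first_correct = np.where(correct.any(axis=0), correct.argmax(axis=0), -1)
    num_flips = (preds[1:] != preds[:-1]).sum(axis=0)           # (N,)
    return first_correct, num_flips
```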
1 code implementation • 28 Sep 2022 • Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary C. Lipton
In experiments addressing both ELECTRA and RoBERTa models and 10 distinct downstream classification datasets, we observe that self-pretraining rivals standard pretraining on the BookWiki corpus (despite using around $10\times$--$500\times$ less data), outperforming the latter on $7$ and $5$ datasets, respectively.
2 code implementations • 26 Jul 2022 • Manley Roberts, Pranav Mani, Saurabh Garg, Zachary C. Lipton
Thus motivated, we introduce a practical algorithm that leverages domain-discriminative models as follows: (i) push examples through domain discriminator $p(d|\mathbf{x})$; (ii) discretize the data by clustering examples in $p(d|\mathbf{x})$ space; (iii) perform non-negative matrix factorization on the discretized data; (iv) combine the recovered $p(y|d)$ with the discriminator outputs $p(d|\mathbf{x})$ to compute $p_d(y|\mathbf{x}) \; \forall d$.
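A simplified instantiation of this four-step recipe, assuming the discriminator outputs are already available as an (N, D) matrix of probabilities and using off-the-shelf k-means and NMF; the final Bayes-style combination step is one plausible reconstruction, and the paper's exact estimator may differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import NMF

def latent_label_shift_pipeline(disc_probs, n_classes, n_clusters=50, seed=0):
    """Simplified sketch of the four-step recipe (assumes #domains >= #classes).

    disc_probs: (N, D) matrix of domain-discriminator outputs p(d|x).
    Classes are recovered only up to permutation, as is inherent to NMF.
    """
    n_examples, n_domains = disc_probs.shape

    # (ii) discretize the data by clustering examples in p(d|x) space
    clusters = KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=seed).fit_predict(disc_probs)

    # build the (clusters x domains) matrix, approximately p(cluster, d)
    M = np.zeros((n_clusters, n_domains))
    for c, row in zip(clusters, disc_probs):
        M[c] += row
    M /= M.sum()

    # (iii) non-negative matrix factorization with rank equal to the number of classes
    nmf = NMF(n_components=n_classes, init="nndsvda", max_iter=1000, random_state=seed)
    A = nmf.fit_transform(M)                               # ~ p(cluster | y), up to scale
    B = nmf.components_ * A.sum(axis=0)[:, None]           # absorb scale -> ~ p(y, d)
    p_y_given_d = B / B.sum(axis=0, keepdims=True)         # (K, D), columns sum to 1

    # (iv) combine recovered p(y|d) with discriminator outputs p(d|x):
    # p(d|x)/p(d) is proportional to p(x|d) = sum_y p(x|y) p(y|d), so solve for
    # p(x|y) per example (up to scale) and renormalize p(x|y) p(y|d) over y.
    p_d = disc_probs.mean(axis=0)
    p_x_given_d = disc_probs / p_d                          # (N, D), proportional to p(x|d)
    p_x_given_y, *_ = np.linalg.lstsq(p_y_given_d.T, p_x_given_d.T, rcond=None)
    p_x_given_y = np.clip(p_x_given_y.T, 1e-12, None)       # (N, K)
    scores = p_x_given_y[:, :, None] * p_y_given_d[None, :, :]   # (N, K, D)
    return scores / scores.sum(axis=1, keepdims=True), p_y_given_d
```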
1 code implementation • 26 Jul 2022 • Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton
We introduce the problem of domain adaptation under Open Set Label Shift (OSLS) where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions p(x|y) are domain-invariant.
no code implementations • 6 Jul 2022 • Renjie Li, Xinyi Wang, Guan Huang, Wenli Yang, Kaining Zhang, Xiaotong Gu, Son N. Tran, Saurabh Garg, Jane Alty, Quan Bai
Deep supervision, also known as 'intermediate supervision' or 'auxiliary supervision', adds supervision signals at the hidden layers of a neural network.
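A minimal sketch of the idea, with illustrative layer sizes and loss weights: auxiliary classification heads are attached to hidden layers, and their losses are added to the main objective.

```python
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedMLP(nn.Module):
    """Minimal deep-supervision sketch: auxiliary classifiers on hidden layers."""

    def __init__(self, in_dim=784, hidden=256, n_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_classes)    # main head
        self.aux1 = nn.Linear(hidden, n_classes)    # auxiliary heads on hidden layers
        self.aux2 = nn.Linear(hidden, n_classes)

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        return self.head(h2), self.aux1(h1), self.aux2(h2)

def deep_supervision_loss(outputs, y, aux_weight=0.3):
    """Main loss plus weighted auxiliary losses (the weight is illustrative)."""
    main, aux1, aux2 = outputs
    return (F.cross_entropy(main, y)
            + aux_weight * (F.cross_entropy(aux1, y) + F.cross_entropy(aux2, y)))
```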
1 code implementation • 20 Feb 2022 • Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran
In this work, we propose a new approach: we measure the performance of a collection of models when evaluated on a $\textit{single input point}$.
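The pointwise quantity is easy to state in code: evaluate every model in a collection on one input and record the fraction that get it right (names below are illustrative).

```python
import numpy as np

def pointwise_profile(models, x, y_true):
    """Fraction of a model collection that classifies a single input correctly.

    models: iterable of callables mapping a single input x to a predicted label.
    Returns the per-point accuracy across the collection, the quantity studied
    in place of a single model's average accuracy over a test set.
    """
    preds = np.array([m(x) for m in models])
    return (preds == y_true).mean()
```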
1 code implementation • ICLR 2022 • Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops.
no code implementations • 19 Dec 2021 • Renjie Li, Son Tran, Saurabh Garg, Katherine Lawler, Jane Alty, Quan Bai
Keypoint detection plays an important role in a wide range of applications.
1 code implementation • NeurIPS 2021 • Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton
Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE) -- determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning -- given such an estimate, learning the desired positive-versus-negative classifier.
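For subtask (i), a simple ratio-style estimator can be written directly from the scores of a positive-vs-unlabeled classifier; this is a generic plug-in baseline in the spirit of the MPE subtask, not the paper's own estimation procedure.

```python
import numpy as np

def estimate_mixture_proportion(scores_pos, scores_unl, grid=None, min_pos_mass=0.1):
    """Ratio-style mixture proportion estimate from a positive-vs-unlabeled classifier.

    scores_pos: array of held-out positives' scores under the PvU classifier.
    scores_unl: array of unlabeled examples' scores.
    For a threshold t, q_u(t)/q_p(t) upper-bounds the positive fraction when the region
    above t contains few negatives; taking the minimum over thresholds gives a simple
    plug-in estimate.  This is a generic baseline, not the paper's exact procedure.
    """
    grid = np.quantile(scores_unl, np.linspace(0.0, 0.95, 50)) if grid is None else grid
    estimates = []
    for t in grid:
        q_p = (scores_pos >= t).mean()
        q_u = (scores_unl >= t).mean()
        if q_p >= min_pos_mass:              # avoid unstable ratios in near-empty regions
            estimates.append(q_u / q_p)
    return float(min(estimates)) if estimates else 1.0
```

Given such an estimate, subtask (ii) can proceed by, e.g., reweighting the unlabeled data accordingly when training the positive-versus-negative classifier.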
1 code implementation • 1 May 2021 • Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton
To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.
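For route (ii), the arithmetic of turning a held-out estimate into a high-probability bound is standard; the sketch below is a textbook Hoeffding bound for a fixed classifier under 0-1 loss, included only to make the "plug in the empirical risk" step concrete, and is not the bound developed in the paper.

```python
import math

def hoeffding_risk_upper_bound(empirical_risk, n, delta=0.05):
    """With probability at least 1 - delta, the true risk of a fixed classifier exceeds
    its empirical risk on n i.i.d. held-out points (0-1 loss) by at most
    sqrt(ln(1/delta) / (2n)), by Hoeffding's inequality."""
    return empirical_risk + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

# e.g. 5% held-out error on 10,000 examples gives roughly a 6.2% upper bound at delta=0.05
print(hoeffding_risk_upper_bound(0.05, 10_000))
```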
no code implementations • 29 Apr 2021 • Renjie Li, Xinyi Wang, Katherine Lawler, Saurabh Garg, Quan Bai, Jane Alty
With populations ageing, the number of people with dementia worldwide is expected to triple to 152 million by 2050.
no code implementations • 20 Feb 2021 • Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar
In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function.
no code implementations • NeurIPS 2020 • Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton
Our contributions include (i) consistency conditions for MLLS, which include calibration of the classifier and a confusion matrix invertibility condition that BBSE also requires; (ii) a unified framework, casting BBSE as roughly equivalent to MLLS for a particular choice of calibration method; and (iii) a decomposition of MLLS's finite-sample error into terms reflecting miscalibration and estimation error.
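For reference, the BBSE estimator that the entry contrasts with MLLS reduces to a small linear solve; the sketch below uses hard predictions and empirical frequencies.

```python
import numpy as np

def bbse_weights(src_preds, src_labels, tgt_preds, n_classes):
    """Black Box Shift Estimation of importance weights w(y) = p_t(y) / p_s(y).

    Solves C w = mu, where C[i, j] = p_s(yhat = i, y = j) is the source joint confusion
    matrix of a fixed black-box classifier and mu[i] = p_t(yhat = i) is the distribution
    of its predictions on unlabeled target data.  As noted in the entry, the estimator
    requires C to be invertible; MLLS additionally relies on calibration.
    """
    C = np.zeros((n_classes, n_classes))
    for yhat, y in zip(src_preds, src_labels):
        C[yhat, y] += 1.0
    C /= len(src_labels)
    mu = np.bincount(tgt_preds, minlength=n_classes) / len(tgt_preds)
    w = np.linalg.solve(C, mu)
    return np.clip(w, 0.0, None)    # clip small negative entries caused by sampling noise
```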
no code implementations • EMNLP 2018 • Saurabh Garg, Tanmay Parekh, Preethi Jyothi
This work focuses on building language models (LMs) for code-switched text.
no code implementations • 3 Nov 2017 • Saurabh Garg, Tanmay Parekh, Preethi Jyothi
Since code-switching blends two or more languages, a standard bilingual language model can be improved by exploiting the structure of the constituent monolingual language models.
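The simplest way to reuse monolingual models is linear interpolation of their next-word probabilities, shown below purely as a toy baseline; it is not the dual language model of the paper, only an illustration of why monolingual structure can help.

```python
def interpolated_lm_prob(word, context, lm_a, lm_b, lam=0.5):
    """Toy interpolation baseline combining two monolingual LMs for code-switched text.

    lm_a, lm_b: callables returning P(word | context) under each monolingual LM.
    lam:        interpolation weight (illustrative).
    """
    return lam * lm_a(word, context) + (1.0 - lam) * lm_b(word, context)
```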