no code implementations • 16 Nov 2023 • Yun-Shiuan Chuang, Yi Wu, Dhruv Gupta, Rheeya Uppaal, Ananya Kumar, Luhang Sun, Makesh Narsimhan Sreedhar, Sijia Yang, Timothy T. Rogers, Junjie Hu
Adapting pre-trained language models (PLMs) for time-series text classification amidst evolving domain shifts (EDS) is critical for maintaining accuracy in applications like stance detection.
no code implementations • 15 Nov 2023 • Vaishnavi Shrivastava, Percy Liang, Ananya Kumar
To maintain user trust, large language models (LLMs) should signal low confidence on examples where they are incorrect, instead of misleading the user.
1 code implementation • 8 Jun 2023 • Caroline Choi, Fahim Tajwar, Yoonho Lee, Huaxiu Yao, Ananya Kumar, Chelsea Finn
Taking inspiration from this result, we present data-driven confidence minimization (DCM), which minimizes confidence on an uncertainty dataset containing examples that the model is likely to misclassify at test time.
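The idea of minimizing confidence on an uncertainty dataset can be sketched as a two-term objective. The sketch below is an assumption, not the authors' implementation: it pairs ordinary cross-entropy on labeled training examples with a penalty that pulls predictions on the uncertainty set toward the uniform distribution. The function names and the `weight` parameter are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def cross_entropy(logits, label):
    """Standard cross-entropy for one labeled example."""
    return -log_softmax(logits)[label]


def confidence_penalty(logits):
    """Cross-entropy between a uniform target and the predicted distribution.

    It is smallest (log K for K classes) when the prediction is uniform,
    so minimizing it lowers the model's confidence on this input.
    """
    logp = log_softmax(logits)
    return -sum(logp) / len(logp)


def dcm_style_loss(train_batch, uncertainty_batch, weight=0.5):
    """Hypothetical combined objective: fit the labels, stay unsure elsewhere.

    train_batch: list of (logits, label) pairs.
    uncertainty_batch: list of logits for unlabeled uncertainty examples.
    """
    fit = sum(cross_entropy(z, y) for z, y in train_batch) / len(train_batch)
    penalty = sum(confidence_penalty(z) for z in uncertainty_batch) / len(uncertainty_batch)
    return fit + weight * penalty
```

In a real training loop the logits would come from the model and the loss would be differentiated by an autodiff framework; the plain-Python version here only shows how the two terms combine.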
1 code implementation • 26 Feb 2023 • Michael Sun, Ananya Kumar, Divyam Madaan, Percy Liang
We consider the continual representation learning setting: sequentially pretrain a model $M'$ on tasks $T_1, \ldots, T_T$, and then adapt $M'$ on a small amount of data from each task $T_i$ to check if it has forgotten information from old tasks.
1 code implementation • CVPR 2023 • Sachin Goyal, Ananya Kumar, Sankalp Garg, Zico Kolter, Aditi Raghunathan
In total, these benchmarks establish contrastive finetuning as a simple, intuitive, and state-of-the-art approach for supervised finetuning of image-text models like CLIP.
no code implementations • 25 Nov 2022 • Rishi Bommasani, Kathleen A. Creel, Ananya Kumar, Dan Jurafsky, Percy Liang
As the scope of machine learning broadens, we observe a recurring theme of algorithmic monoculture: the same systems, or systems that share components (e.g. training data), are deployed by multiple decision-makers.
no code implementations • 17 Nov 2022 • Ananya Kumar, Ruoqi Shen, Sebastien Bubeck, Suriya Gunasekar
SGD and AdamW are the two most used optimizers for fine-tuning large neural networks in computer vision.
1 code implementation • 16 Nov 2022 • Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda
We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models.
1 code implementation • 20 Oct 2022 • Yoonho Lee, Annie S. Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task.
no code implementations • 12 Oct 2022 • Nelson F. Liu, Ananya Kumar, Percy Liang, Robin Jia
Recent results in image classification and extractive question answering have observed that pre-trained models trained on less in-distribution data have better out-of-distribution performance.
no code implementations • 18 Jul 2022 • Ananya Kumar, Tengyu Ma, Percy Liang, Aditi Raghunathan
We often see undesirable tradeoffs in robust machine learning where out-of-distribution (OOD) accuracy is at odds with in-distribution (ID) accuracy: a robust classifier obtained via specialized techniques such as removing spurious features often has better OOD but worse ID accuracy compared to a standard classifier trained via ERM.
no code implementations • 6 Apr 2022 • Jeff Z. HaoChen, Colin Wei, Ananya Kumar, Tengyu Ma
In particular, a linear classifier trained to separate the representations on the source domain can also predict classes on the target domain accurately, even though the representations of the two domains are far from each other.
no code implementations • 1 Apr 2022 • Kendrick Shen, Robbie Jones, Ananya Kumar, Sang Michael Xie, Jeff Z. HaoChen, Tengyu Ma, Percy Liang
We consider unsupervised domain adaptation (UDA), where labeled data from a source domain (e.g., photographs) and unlabeled data from a target domain (e.g., sketches) are used to learn a classifier for the target domain.
3 code implementations • 21 Feb 2022 • Ananya Kumar, Aditi Raghunathan, Robbie Jones, Tengyu Ma, Percy Liang
However, in this paper, we find that fine-tuning can achieve worse accuracy than linear probing out-of-distribution (OOD) when the pretrained features are good and the distribution shift is large.
1 code implementation • ICLR 2022 • Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang
Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.
no code implementations • ICLR 2022 • Ananya Kumar, Aditi Raghunathan, Robbie Matthew Jones, Tengyu Ma, Percy Liang
It is well known that fine-tuning leads to better accuracy in-distribution (ID).
no code implementations • 29 Sep 2021 • Ananya Kumar, Aditi Raghunathan, Tengyu Ma, Percy Liang
We often see undesirable tradeoffs in robust machine learning where out-of-distribution (OOD) accuracy is at odds with in-distribution (ID) accuracy.
no code implementations • 29 Sep 2021 • Kendrick Shen, Robbie Matthew Jones, Ananya Kumar, Sang Michael Xie, Percy Liang
We develop a conceptual model for contrastive learning under domain shifts, where data augmentations form connections between classes and domains that can be far apart.
1 code implementation • 12 Sep 2021 • Fahim Tajwar, Ananya Kumar, Sang Michael Xie, Percy Liang
Out-of-distribution detection is an important component of reliable ML systems.
2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
1 code implementation • ICLR 2021 • Sang Michael Xie, Ananya Kumar, Robbie Jones, Fereshte Khani, Tengyu Ma, Percy Liang
To get the best of both worlds, we introduce In-N-Out, which first trains a model with auxiliary inputs and uses it to pseudolabel all the in-distribution inputs, then pre-trains a model on OOD auxiliary outputs and fine-tunes this model with the pseudolabels (self-training).
1 code implementation • ICLR 2021 • Erik Jones, Shiori Sagawa, Pang Wei Koh, Ananya Kumar, Percy Liang
In this paper, we find that while selective classification can improve average accuracies, it can simultaneously magnify existing accuracy disparities between various groups within a population, especially in the presence of spurious correlations.
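How abstaining can raise accuracy for one group while lowering it for another can be illustrated with a small sketch. It assumes the simplest form of selective classification, where the model abstains whenever its confidence falls below a threshold; the function name and the toy numbers are invented for illustration and are not results from the paper.

```python
def selective_accuracy_by_group(confidences, correct, groups, threshold):
    """Accuracy per group over the examples the model does not abstain on.

    An example is kept when its confidence is at least `threshold`.
    Groups with no kept examples are omitted from the result.
    """
    kept = {}
    for conf, is_correct, group in zip(confidences, correct, groups):
        if conf >= threshold:
            hits, total = kept.get(group, (0, 0))
            kept[group] = (hits + int(is_correct), total + 1)
    return {group: hits / total for group, (hits, total) in kept.items()}


# Toy data (made up): both groups start at the same accuracy.
confidences = [0.9, 0.8, 0.6, 0.9, 0.55, 0.5]
correct = [1, 1, 0, 0, 1, 1]
groups = ["A", "A", "A", "B", "B", "B"]

full_coverage = selective_accuracy_by_group(confidences, correct, groups, 0.0)
selective = selective_accuracy_by_group(confidences, correct, groups, 0.7)
```

With the toy inputs, thresholding keeps group A's correct predictions but keeps only group B's confident mistake, so the gap between the groups widens even though it was zero at full coverage.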
no code implementations • NeurIPS 2020 • Yining Chen, Colin Wei, Ananya Kumar, Tengyu Ma
In unsupervised domain adaptation, existing theory focuses on situations where the source and target domains are close.
2 code implementations • ICML 2020 • Ananya Kumar, Tengyu Ma, Percy Liang
Machine learning systems must adapt to data distributions that evolve over time, in applications ranging from sensor networks and self-driving car perception modules to brain-machine interfaces.
3 code implementations • NeurIPS 2019 • Ananya Kumar, Percy Liang, Tengyu Ma
In these experiments, we also estimate the calibration error and ECE more accurately than the commonly used plugin estimators.
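For context, a plugin estimate of ECE is usually computed by binning predictions by confidence and comparing average confidence with accuracy in each bin. The sketch below shows that baseline only, assuming equal-width bins and top-label confidence; it is not the more accurate estimator the sentence above refers to, and the function name is hypothetical.

```python
def plugin_ece(confidences, correct, num_bins=10):
    """Binned plugin estimate of expected calibration error.

    confidences: predicted probabilities in [0, 1] for the chosen label.
    correct: 1 or 0 (or True/False) for whether each prediction was right.
    Returns the sum over bins of (bin fraction) * |accuracy - mean confidence|.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    conf_sums = [0.0] * num_bins
    hit_sums = [0.0] * num_bins
    for conf, is_correct in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        b = min(int(conf * num_bins), num_bins - 1)
        conf_sums[b] += conf
        hit_sums[b] += float(is_correct)

    # (count/n) * |hits/count - conf/count| simplifies to |hits - conf| / n.
    return sum(abs(h - c) for h, c in zip(hit_sums, conf_sums)) / n
```

The estimate depends on the number of bins and on the sample size per bin, which is one reason binned plugin estimates can be inaccurate on small evaluation sets.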
no code implementations • ICLR 2019 • Ananya Kumar, S. M. Ali Eslami, Danilo Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan
These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive.
no code implementations • ICLR 2019 • Avraham Ruderman, Richard Everett, Bristy Sikder, Hubert Soyer, Jonathan Uesato, Ananya Kumar, Charlie Beattie, Pushmeet Kohli
Reinforcement learning agents are typically trained and evaluated according to their performance averaged over some distribution of environment settings.
no code implementations • ICLR 2019 • Jonathan Uesato, Ananya Kumar, Csaba Szepesvari, Tom Erez, Avraham Ruderman, Keith Anderson, Krishnamurthy Dvijotham, Nicolas Heess, Pushmeet Kohli
We demonstrate this is an issue for current agents, where even matching the compute used for training is sometimes insufficient for evaluation.
no code implementations • ICLR 2019 • Ananya Kumar, S. M. Ali Eslami, Danilo J. Rezende, Marta Garnelo, Fabio Viola, Edward Lockhart, Murray Shanahan
These models typically generate future frames in an autoregressive fashion, which is slow and requires the input and output frames to be consecutive.