Search Results for author: Pang Wei Koh

Found 20 papers, 16 papers with code

Overparameterization hurts worst-group accuracy with spurious correlations

no code implementations ICML 2020 Shiori Sagawa, Aditi Raghunathan, Pang Wei Koh, Percy Liang

Increasing model capacity well beyond the point of zero training error has been observed to improve average test accuracy.

Impossibility Theorems for Feature Attribution

no code implementations 22 Dec 2022 Blair Bilodeau, Natasha Jaques, Pang Wei Koh, Been Kim

Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods.

Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time

1 code implementation 25 Nov 2022 Huaxiu Yao, Caroline Choi, Bochuan Cao, Yoonho Lee, Pang Wei Koh, Chelsea Finn

Temporal shifts -- distribution shifts arising from the passage of time -- often occur gradually and have the additional structure of timestamp metadata.

Continual Learning, Domain Generalization, +3

Extending the WILDS Benchmark for Unsupervised Adaptation

1 code implementation ICLR 2022 Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.

On the Opportunities and Risks of Foundation Models

no code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Just Train Twice: Improving Group Robustness without Training Group Information

1 code implementation 19 Jul 2021 Evan Zheran Liu, Behzad Haghgoo, Annie S. Chen, Aditi Raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, Chelsea Finn

Standard training via empirical risk minimization (ERM) can produce models that achieve high accuracy on average but low accuracy on certain groups, especially in the presence of spurious correlations between the input and label.

Image Classification, Out-of-Distribution Generalization

Selective Classification Can Magnify Disparities Across Groups

1 code implementation ICLR 2021 Erik Jones, Shiori Sagawa, Pang Wei Koh, Ananya Kumar, Percy Liang

In this paper, we find that while selective classification can improve average accuracies, it can simultaneously magnify existing accuracy disparities between various groups within a population, especially in the presence of spurious correlations.

Classification, General Classification

Concept Bottleneck Models

3 code implementations ICML 2020 Pang Wei Koh, Thao Nguyen, Yew Siang Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang

We seek to learn models that we can interact with using high-level concepts: if the model did not think there was a bone spur in the x-ray, would it still predict severe arthritis?

An Investigation of Why Overparameterization Exacerbates Spurious Correlations

2 code implementations 9 May 2020 Shiori Sagawa, Aditi Raghunathan, Pang Wei Koh, Percy Liang

We study why overparameterization -- increasing model size well beyond the point of zero training error -- can hurt test error on minority groups despite improving average test error when there are spurious correlations in the data.

Inductive Bias

ExpBERT: Representation Engineering with Natural Language Explanations

2 code implementations ACL 2020 Shikhar Murty, Pang Wei Koh, Percy Liang

Suppose we want to specify the inductive bias that married couples typically go on honeymoons for the task of extracting pairs of spouses from text.

Inductive Bias, Relation Extraction

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

1 code implementation 14 Sep 2019 Sawyer Birnbaum, Volodymyr Kuleshov, Zayd Enam, Pang Wei Koh, Stefano Ermon

Learning representations that accurately capture long-range dependencies in sequential inputs -- including text, audio, and genomic data -- is a key problem in deep learning.

Audio Super-Resolution, Super-Resolution, +2

Stronger Data Poisoning Attacks Break Data Sanitization Defenses

2 code implementations 2 Nov 2018 Pang Wei Koh, Jacob Steinhardt, Percy Liang

In this paper, we develop three attacks that can bypass a broad range of common data sanitization defenses, including anomaly detectors based on nearest neighbors, training loss, and singular-value decomposition.

Data Poisoning, Sentiment Analysis, +1

Inferring Multidimensional Rates of Aging from Cross-Sectional Data

1 code implementation 12 Jul 2018 Emma Pierson, Pang Wei Koh, Tatsunori Hashimoto, Daphne Koller, Jure Leskovec, Nicholas Eriksson, Percy Liang

Motivated by the study of human aging, we present an interpretable latent-variable model that learns temporal dynamics from cross-sectional data.

Human Aging, Time Series

Certified Defenses for Data Poisoning Attacks

2 code implementations NeurIPS 2017 Jacob Steinhardt, Pang Wei Koh, Percy Liang

Machine learning systems trained on user-provided data are susceptible to data poisoning attacks, whereby malicious users inject false training data with the aim of corrupting the learned model.

Data Poisoning
