Just Train Twice: Improving Group Robustness without Training Group Information

Standard training via empirical risk minimization (ERM) can produce models that achieve high accuracy on average but low accuracy on certain groups, especially in the presence of spurious correlations between the input and label. Prior approaches that achieve high worst-group accuracy, like group distributionally robust optimization (group DRO), require expensive group annotations for each training point, whereas approaches that do not use such group annotations typically achieve unsatisfactory worst-group accuracy. In this paper, we propose a simple two-stage approach, JTT, that first trains a standard ERM model for several epochs, and then trains a second model that upweights the training examples that the first model misclassified. Intuitively, this upweights examples from groups on which standard ERM models perform poorly, leading to improved worst-group performance. Averaged over four image classification and natural language processing tasks with spurious correlations, JTT closes 75% of the gap in worst-group accuracy between standard ERM and group DRO, while only requiring group annotations on a small validation set in order to tune hyperparameters.
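The two-stage procedure described above lends itself to a short sketch. Below is a minimal, hypothetical PyTorch implementation of the idea: train an identification model with ERM for a few epochs, collect the training points it misclassifies, then retrain from scratch with those points upweighted by a factor lambda_up. The helper names (train_epochs, find_errors, jtt), the hyperparameter values, and the use of a WeightedRandomSampler to realize the upweighting are illustrative assumptions, not the authors' released code.

```python
# Sketch of JTT's two stages for a generic PyTorch classification setup.
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

def train_epochs(model, loader, epochs, lr=1e-3):
    """Standard ERM training loop (stage-agnostic helper)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

@torch.no_grad()
def find_errors(model, dataset):
    """Indices of the training points the first-stage model misclassifies."""
    model.eval()
    errors = []
    for i in range(len(dataset)):
        x, y = dataset[i]
        pred = model(x.unsqueeze(0)).argmax(dim=1).item()
        if pred != int(y):
            errors.append(i)
    return errors

def jtt(make_model, train_set, T_id=1, T_final=10, lambda_up=50):
    # Stage 1: train a standard ERM model for a few epochs (T_id).
    id_model = make_model()
    train_epochs(id_model, DataLoader(train_set, batch_size=64, shuffle=True), T_id)

    # Build the error set and upweight it by lambda_up via a weighted sampler.
    error_idx = set(find_errors(id_model, train_set))
    weights = [lambda_up if i in error_idx else 1.0 for i in range(len(train_set))]
    sampler = WeightedRandomSampler(weights, num_samples=len(train_set), replacement=True)

    # Stage 2: retrain a fresh model on the upweighted training distribution.
    final_model = make_model()
    train_epochs(final_model, DataLoader(train_set, batch_size=64, sampler=sampler), T_final)
    return final_model
```

In this sketch, the first-stage epoch count T_id and the upweighting factor lambda_up correspond to the hyperparameters that, per the abstract, are tuned on a small group-annotated validation set.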

Task                                Dataset     Model            Metric Name   Metric Value  Global Rank
Out-of-Distribution Generalization  ImageNet-W  JTT (ResNet-50)  IN-W Gap      -15.74        # 1
Out-of-Distribution Generalization  ImageNet-W  JTT (ResNet-50)  Carton Gap    +32           # 1
Out-of-Distribution Generalization  UrbanCars   JTT (E=2)        BG Gap        -23.3         # 1
Out-of-Distribution Generalization  UrbanCars   JTT (E=2)        CoObj Gap     -5.3          # 1
Out-of-Distribution Generalization  UrbanCars   JTT (E=2)        BG+CoObj Gap  -52.1         # 1
Out-of-Distribution Generalization  UrbanCars   JTT (E=1)        BG Gap        -8.1          # 1
Out-of-Distribution Generalization  UrbanCars   JTT (E=1)        CoObj Gap     -13.3         # 1
Out-of-Distribution Generalization  UrbanCars   JTT (E=1)        BG+CoObj Gap  -40.1         # 1
