ImageNet-W(atermark) is a test set for evaluating models' reliance on the watermark shortcut recently found in ImageNet, which models use to predict the carton class. ImageNet-W is created by overlaying transparent watermarks on the ImageNet validation set. Two metrics measure reliance on the watermark shortcut: (1) IN-W Gap, the drop in top-1 accuracy from ImageNet to ImageNet-W, and (2) Carton Gap, the increase in carton class accuracy from ImageNet to ImageNet-W. Combining ImageNet-W with earlier out-of-distribution variants of ImageNet (e.g., Stylized ImageNet, ImageNet-R, ImageNet-9) forms a comprehensive suite for multi-shortcut evaluation on ImageNet.
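The two metrics can be illustrated with a minimal sketch. It assumes you already have a model's predicted labels on the same images from ImageNet and from ImageNet-W. The function names, the toy labels, and the sign convention (each gap computed as the ImageNet-W value minus the ImageNet value, so an accuracy drop shows up as a negative IN-W Gap) are assumptions made for illustration and are not taken from the dataset's official code.

```python
from typing import Sequence, Tuple


def top1_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and equally long")
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def class_accuracy(preds: Sequence[int], labels: Sequence[int], cls: int) -> float:
    """Accuracy restricted to the samples whose true label is `cls`."""
    idx = [i for i, y in enumerate(labels) if y == cls]
    if not idx:
        raise ValueError(f"no samples with label {cls}")
    return sum(preds[i] == cls for i in idx) / len(idx)


def watermark_gaps(
    preds_in: Sequence[int],
    preds_inw: Sequence[int],
    labels: Sequence[int],
    carton_class: int,
) -> Tuple[float, float]:
    """Return (IN-W Gap, Carton Gap).

    Assumed convention: both gaps are "ImageNet-W minus ImageNet", so a
    top-1 accuracy drop gives a negative IN-W Gap and a carton accuracy
    increase gives a positive Carton Gap.
    """
    in_w_gap = top1_accuracy(preds_inw, labels) - top1_accuracy(preds_in, labels)
    carton_gap = class_accuracy(preds_inw, labels, carton_class) - class_accuracy(
        preds_in, labels, carton_class
    )
    return in_w_gap, carton_gap


# Toy example with three classes; class 2 stands in for "carton" (hypothetical index).
labels = [0, 0, 1, 1, 2, 2]
preds_in = [0, 0, 1, 1, 2, 0]   # predictions on the original images
preds_inw = [0, 2, 1, 2, 2, 2]  # predictions after watermarks are overlaid

in_w_gap, carton_gap = watermark_gaps(preds_in, preds_inw, labels, carton_class=2)
print(f"IN-W Gap: {in_w_gap:+.3f}, Carton Gap: {carton_gap:+.3f}")
```

In the toy data, the watermark pushes two non-carton images toward the carton class, which lowers overall top-1 accuracy while raising accuracy on the carton class itself. A model that does not rely on the shortcut would show both gaps close to zero.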
22 PAPERS • 1 BENCHMARK
UrbanCars facilitates the study of multi-shortcut learning in a controlled setting with two shortcuts: background and co-occurring object. The task is to classify car body type into two categories: urban car and country car. The dataset has three splits: training, validation, and testing. In the training set, both shortcuts are spuriously correlated with the car body type. The validation and testing sets are balanced, i.e., they contain no spurious correlations. The validation set is used for model selection, and the testing set evaluates how well both shortcuts are mitigated.
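The difference between a spuriously correlated training split and a balanced evaluation split can be sketched with a small simulation. This is not the UrbanCars construction code. The function names, the tuple layout, and the correlation strength of 0.95 are hypothetical values chosen for illustration, since the description above does not state how strong the correlations are.

```python
import random
from typing import List, Tuple

CLASSES = ("urban", "country")
Row = Tuple[str, str, str]  # (car label, background type, co-occurring object type)


def sample_split(n: int, p_shortcut: float, seed: int = 0) -> List[Row]:
    """Sample metadata for n images.

    Each shortcut attribute matches the car label with probability
    p_shortcut. p_shortcut = 0.5 gives a balanced split; values near
    1.0 give a strongly spuriously correlated split.
    """
    rng = random.Random(seed)
    rows: List[Row] = []
    for _ in range(n):
        label = rng.choice(CLASSES)
        other = CLASSES[1] if label == CLASSES[0] else CLASSES[0]
        background = label if rng.random() < p_shortcut else other
        co_object = label if rng.random() < p_shortcut else other
        rows.append((label, background, co_object))
    return rows


def agreement(rows: List[Row], attr_index: int) -> float:
    """Fraction of rows where the shortcut attribute matches the car label."""
    return sum(r[attr_index] == r[0] for r in rows) / len(rows)


# 0.95 is an illustrative assumption, not a value taken from the description.
train = sample_split(10_000, p_shortcut=0.95, seed=0)
test = sample_split(10_000, p_shortcut=0.5, seed=1)

print("train background agreement:", round(agreement(train, 1), 3))
print("train co-object agreement: ", round(agreement(train, 2), 3))
print("test background agreement: ", round(agreement(test, 1), 3))
print("test co-object agreement:  ", round(agreement(test, 2), 3))
```

On the simulated training split, a classifier could reach high accuracy by reading only the background or only the co-occurring object. On the balanced split, either cue alone is right about half the time, which is why balanced validation and testing sets expose reliance on the shortcuts.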
18 PAPERS • 1 BENCHMARK
An object-centric version of Stylized COCO for benchmarking the texture bias and out-of-distribution robustness of vision models. See the ECCV 2022 paper and supplementary material for details.
1 PAPER • NO BENCHMARKS YET