TokenMixup: Efficient Attention-guided Token-level Data Augmentation for Transformers

14 Oct 2022 · Hyeong Kyu Choi, Joonmyung Choi, Hyunwoo J. Kim

Mixup is a commonly adopted data augmentation technique for image classification. Recent advances in mixup methods primarily focus on mixing based on saliency. However, many saliency detectors require intensive computation and are especially burdensome for parameter-heavy transformer models. To this end, we propose TokenMixup, an efficient attention-guided token-level data augmentation method that aims to maximize the saliency of a mixed set of tokens. TokenMixup provides 15× faster saliency-aware data augmentation compared to gradient-based methods. Moreover, we introduce a variant of TokenMixup which mixes tokens within a single instance, thereby enabling multi-scale feature augmentation. Experiments show that our methods significantly improve the baseline models' performance on CIFAR and ImageNet-1K while being more efficient than previous methods. We also reach state-of-the-art performance on CIFAR-100 among transformer models trained from scratch. Code is available at https://github.com/mlvlab/TokenMixup.
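To make the idea concrete, below is a minimal sketch of attention-guided token-level mixup between two instances. This is not the authors' reference implementation (see the repository for that); the function name `token_mixup`, its signature, the head/query pooling used for saliency, and the `rho` replacement ratio are all illustrative assumptions. The core pattern matches the abstract: use attention maps already computed by the transformer as a cheap saliency signal, then swap low-saliency tokens of one instance for high-saliency tokens of another.

```python
import torch

def token_mixup(x_a, x_b, attn_a, attn_b, rho=0.5):
    """Sketch of attention-guided token-level mixup (hypothetical API).

    x_a, x_b : token embeddings of two instances, shape (B, N, D)
    attn_a, attn_b : self-attention maps, shape (B, H, N, N)
    rho : fraction of tokens in x_a to replace (illustrative choice)
    """
    B, N, _ = x_a.shape
    k = int(rho * N)

    # Per-token saliency: average attention each token receives,
    # pooled over heads (dim=1) and then over query positions.
    sal_a = attn_a.mean(dim=1).mean(dim=1)  # (B, N)
    sal_b = attn_b.mean(dim=1).mean(dim=1)  # (B, N)

    # Replace the k least salient tokens of x_a with the
    # k most salient tokens of x_b.
    idx_a = sal_a.topk(k, dim=1, largest=False).indices  # (B, k)
    idx_b = sal_b.topk(k, dim=1, largest=True).indices   # (B, k)

    mixed = x_a.clone()
    batch = torch.arange(B).unsqueeze(1)  # (B, 1), broadcasts over k
    mixed[batch, idx_a] = x_b[batch, idx_b]

    # Mix labels in proportion to the tokens kept from each instance,
    # e.g. y_mixed = lam * y_a + (1 - lam) * y_b.
    lam = 1.0 - k / N
    return mixed, lam
```

Because the saliency scores are read off attention maps the forward pass produces anyway, no extra backward pass is needed, which is where the speedup over gradient-based saliency detectors comes from.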


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Image Classification | CIFAR-10 | CCT-7/3x1+VTM | Percentage correct | 97.78 | #65 |
| Image Classification | CIFAR-100 | CCT-7/3x1+HTM+VTM | Percentage correct | 83.57 | #86 |
| Image Classification | ImageNet | ViT-B/16-224+HTM | Top 1 Accuracy | 82.37% | #499 |
