Search Results for author: Ross Wightman

Found 4 papers, 4 papers with code

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

15 code implementations • 18 Jun 2021 • Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.

Data Augmentation • Image Classification • +5

ResNet strikes back: An improved training procedure in timm

11 code implementations • NeurIPS Workshop ImageNet_PPF 2021 • Ross Wightman, Hugo Touvron, Hervé Jégou

We share competitive training settings and pre-trained models in the timm open-source library, with the hope that they will serve as better baselines for future work.

Data Augmentation • Domain Generalization • +2

Reproducible scaling laws for contrastive language-image learning

3 code implementations • CVPR 2023 • Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, Jenia Jitsev

To address these limitations, we investigate scaling laws for contrastive language-image pre-training (CLIP) with the public LAION dataset and the open-source OpenCLIP repository.

 Ranked #1 on Zero-Shot Image Classification on Country211 (using extra training data)

Image Classification • Open Vocabulary Attribute Detection • +4
