TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	CIFAR-10	ResNet-18	Percentage correct	95.55	# 119
Stochastic Optimization	CIFAR-10 ResNet-18 - 200 Epochs	SGD - cosine LR schedule	Accuracy	95.55	# 1
Image Classification	SVHN	ResNet-18	Percentage error	2.65	# 38

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/benchopt-reproducible-efficient-and/stochastic-optimization-on-cifar-10-resnet-18)](https://paperswithcode.com/sota/stochastic-optimization-on-cifar-10-resnet-18?p=benchopt-reproducible-efficient-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/benchopt-reproducible-efficient-and/image-classification-on-svhn)](https://paperswithcode.com/sota/image-classification-on-svhn?p=benchopt-reproducible-efficient-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/benchopt-reproducible-efficient-and/image-classification-on-cifar-10)](https://paperswithcode.com/sota/image-classification-on-cifar-10?p=benchopt-reproducible-efficient-and)`

Benchopt: Reproducible, efficient and collaborative optimization benchmarks

27 Jun 2022 · Thomas Moreau, Mathurin Massias, Alexandre Gramfort, Pierre Ablin, Pierre-Antoine Bannier, Benjamin Charlier, Mathieu Dagréou, Tom Dupré La Tour, Ghislain Durif, Cassio F. Dantas, Quentin Klopfenstein, Johan Larsson, En Lai, Tanguy Lefort, Benoit Malézieux, Badr Moufad, Binh T. Nguyen, Alain Rakotomamonjy, Zaccharie Ramzi, Joseph Salmon, Samuel Vaiter

Numerical validation is at the core of machine learning research, as it makes it possible to assess the actual impact of new methods and to confirm the agreement between theory and practice. Yet the rapid development of the field poses several challenges: researchers are confronted with a profusion of methods to compare, limited transparency and consensus on best practices, and tedious re-implementation work. As a result, validation is often very partial, which can lead to wrong conclusions that slow down the progress of research. We propose Benchopt, a collaborative framework to automate, reproduce and publish optimization benchmarks in machine learning across programming languages and hardware architectures. Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments. To demonstrate its broad usability, we showcase benchmarks on three standard learning tasks: $\ell_2$-regularized logistic regression, Lasso, and ResNet18 training for image classification. These benchmarks highlight key practical findings that give a more nuanced view of the state of the art for these problems, showing that for practical evaluation, the devil is in the details. We hope that Benchopt will foster collaborative work in the community, thereby improving the reproducibility of research findings.
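
For readers unfamiliar with the first two tasks named in the abstract, the following is a minimal sketch of the objective functions usually meant by "$\ell_2$-regularized logistic regression" and "Lasso". It is not taken from the Benchopt code base: the function names, the label convention (labels in {-1, +1}), and the scaling of each term (no division by the number of samples, a factor 1/2 on the quadratic terms) are assumptions, and actual benchmarks may use a different normalization.

```python
import numpy as np


def logreg_l2_objective(w, X, y, lam):
    """Illustrative l2-regularized logistic regression objective.

    Assumes labels y in {-1, +1}; the scaling convention is an assumption.
    """
    margins = y * (X @ w)
    # log(1 + exp(-m)) evaluated stably with logaddexp
    data_fit = np.logaddexp(0.0, -margins).sum()
    return data_fit + 0.5 * lam * np.dot(w, w)


def lasso_objective(w, X, y, lam):
    """Illustrative Lasso objective: least squares plus an l1 penalty."""
    residual = y - X @ w
    return 0.5 * np.dot(residual, residual) + lam * np.abs(w).sum()
```

A solver benchmark for these tasks would typically track how quickly each method drives such an objective down for a fixed dataset and regularization strength `lam`.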

Code

Repository	Stars
benchopt/benchopt (official)	208
deepmind/optax	1,460
google-deepmind/optax	1,459

Tasks

Benchmarking

Image Classification

Stochastic Optimization

Datasets

CIFAR-10

MNIST

SVHN

RCV1

Criteo MSD

Results from the Paper

Ranked #1 on Stochastic Optimization on CIFAR-10 ResNet-18 - 200 Epochs
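
The top-ranked entry is listed in the results table as "SGD - cosine LR schedule". As a rough illustration of what such a schedule does, the sketch below uses the common cosine annealing formula without warmup or restarts. The function name, the base learning rate of 0.1, and the minimum of 0.0 are hypothetical values chosen for illustration; only the 200-epoch horizon comes from the benchmark name, and the exact schedule used for the reported result may differ.

```python
import math


def cosine_lr(epoch, total_epochs=200, base_lr=0.1, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch (illustrative values)."""
    progress = epoch / total_epochs  # goes from 0.0 to 1.0 over training
    # Decays smoothly from base_lr at epoch 0 to min_lr at total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for epoch in (0, 50, 100, 150, 200):
        print(epoch, cosine_lr(epoch))
```

The learning rate starts at `base_lr`, passes through the midpoint between `base_lr` and `min_lr` halfway through training, and reaches `min_lr` at the final epoch.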

Methods

1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Global Average Pooling • Kaiming Initialization • Max Pooling • ReLU • Residual Block • Residual Connection • ResNet
