TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (Deep)	60.5	# 1
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (ID-test)	66.9	# 1
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (Wide)	64	# 1
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (Dense)	65.8	# 1
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (BN-free)	36.8	# 1
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (ResNet-50)	58.6	# 1
Parameter Prediction	CIFAR10	GHN-2	Classification Accuracy (ViT)	11.4	# 1

Parameter Prediction for Unseen Deep Architectures

NeurIPS 2021 · Boris Knyazev, Michal Drozdzal, Graham W. Taylor, Adriana Romero-Soriano

Deep learning has been successful in automating the design of features in machine learning pipelines. However, the algorithms optimizing neural network parameters remain largely hand-designed and computationally inefficient. We study whether deep learning can directly predict these parameters by exploiting the past knowledge of training other networks. We introduce a large-scale dataset of diverse computational graphs of neural architectures, DeepNets-1M, and use it to explore parameter prediction on CIFAR-10 and ImageNet. By leveraging advances in graph neural networks, we propose a hypernetwork that can predict performant parameters in a single forward pass taking a fraction of a second, even on a CPU. The proposed model achieves surprisingly good performance on unseen and diverse networks. For example, it is able to predict all 24 million parameters of a ResNet-50, achieving 60% accuracy on CIFAR-10. On ImageNet, the top-5 accuracy of some of our networks approaches 50%. Our task, along with the model and results, can potentially lead to a new, more computationally efficient paradigm of training networks. Our model also learns a strong representation of neural architectures, enabling their analysis.
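
The "24 million parameters" figure for ResNet-50 can be sanity-checked with a short count. The sketch below assumes the standard bottleneck layout (stages of 3, 4, 6 and 3 blocks, expansion factor 4, a 7x7 stem, bias-free convolutions) and a 10-class output layer for CIFAR-10. The exact variant used in the paper may differ, and the helper names are illustrative only.

```python
def conv_params(c_in, c_out, k):
    """Weights of a bias-free k x k convolution."""
    return c_in * c_out * k * k


def bn_params(c):
    """Learnable scale and shift of a batch-norm layer."""
    return 2 * c


def bottleneck_params(c_in, width, downsample):
    """Parameters of one bottleneck block (1x1 -> 3x3 -> 1x1, expansion 4)."""
    c_out = 4 * width
    total = conv_params(c_in, width, 1) + bn_params(width)
    total += conv_params(width, width, 3) + bn_params(width)
    total += conv_params(width, c_out, 1) + bn_params(c_out)
    if downsample:  # projection shortcut on the first block of each stage
        total += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return total


def resnet50_params(num_classes=10, stem_kernel=7):
    """Total learnable parameters under the assumed ResNet-50 layout."""
    total = conv_params(3, 64, stem_kernel) + bn_params(64)  # stem
    c_in = 64
    for width, blocks in [(64, 3), (128, 4), (256, 6), (512, 3)]:
        for i in range(blocks):
            total += bottleneck_params(c_in, width, downsample=(i == 0))
            c_in = 4 * width
    total += c_in * num_classes + num_classes  # final linear layer
    return total


N_PARAMS = resnet50_params(num_classes=10)
print(f"{N_PARAMS:,}")  # 23,528,522
```

Under these assumptions the count is about 23.5 million, which rounds to the 24 million quoted in the abstract.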

Code

facebookresearch/ppuda (official; Colab quickstart available)

Tasks

Parameter Prediction

Datasets

Introduced in the Paper:

DeepNets-1M

Used in the Paper:

CIFAR-10

Results from the Paper

Ranked #1 on Parameter Prediction on CIFAR10

Methods

GGS-NNs • HyperNetwork
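
To make the hypernetwork idea concrete, here is a toy sketch of predicting every layer's weights of a small target network in one pass. It is not GHN-2: the per-layer embeddings are random stand-ins for the features a graph neural network would compute from the architecture's computational graph, the decoder is a single untrained linear map, and all names and shapes are hypothetical.

```python
import random


def make_decoder(embed_dim, max_params, rng):
    """Hypernetwork weights: one shared linear map from embedding to flat parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(embed_dim)] for _ in range(max_params)]


def predict_layer(decoder, embedding, shape):
    """Predict a (rows x cols) weight matrix for one target layer."""
    rows, cols = shape
    n = rows * cols
    if n > len(decoder):
        raise ValueError("layer has more parameters than the decoder can emit")
    flat = [sum(w * e for w, e in zip(decoder[i], embedding)) for i in range(n)]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def predict_all(decoder, embeddings, shapes):
    """Single pass over the architecture: every layer receives predicted weights."""
    return [predict_layer(decoder, e, s) for e, s in zip(embeddings, shapes)]


def forward(weights, x):
    """Run the target MLP with the predicted weights (ReLU between layers)."""
    for idx, w in enumerate(weights):
        x = [sum(wi * xi for wi, xi in zip(row, x)) for row in w]
        if idx < len(weights) - 1:
            x = [max(0.0, v) for v in x]
    return x


rng = random.Random(0)
shapes = [(8, 4), (3, 8)]  # hypothetical target MLP: 4 -> 8 -> 3
# Random stand-ins for per-layer embeddings (assumption, not GNN output).
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in shapes]
decoder = make_decoder(embed_dim=5, max_params=32, rng=rng)
weights = predict_all(decoder, embeddings, shapes)
logits = forward(weights, [1.0, 0.5, -0.5, 2.0])
print(len(logits))
```

Because the decoder here is untrained, the logits are meaningless. The sketch only illustrates the data flow: architecture description in, full set of target-network weights out, with no gradient-based training of the target network itself.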
