The DeepNets-1M dataset consists of neural network architectures represented as graphs, where nodes are operations (convolution, pooling, etc.) and edges correspond to the forward-pass flow of data through the network. DeepNets-1M has 1 million training architectures and 1402 in-distribution (ID) and out-of-distribution (OOD) evaluation architectures: 500 validation and 500 testing ID architectures, 100 wide OOD architectures, 100 deep OOD architectures, 100 dense OOD architectures, 100 OOD architectures without batch normalization, and 2 predefined architectures (ResNet-50 and a 12-layer Visual Transformer).
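As a rough illustration of the graph representation described above, the sketch below encodes a tiny architecture as a list of operation nodes plus directed edges and derives a forward-pass order from them. The `ToyArchitecture` class, the operation names, the example network, and the split key names are assumptions made for illustration; they do not reflect the dataset's actual storage format or loading API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ToyArchitecture:
    """Hypothetical container: nodes are operations, edges follow the forward pass."""
    ops: list    # ops[i] is the operation name of node i
    edges: list  # (src, dst) pairs: the output of src feeds dst

    def forward_order(self):
        """Return node indices in a valid forward-pass (topological) order."""
        indegree = [0] * len(self.ops)
        successors = [[] for _ in self.ops]
        for src, dst in self.edges:
            successors[src].append(dst)
            indegree[dst] += 1
        ready = deque(i for i, d in enumerate(indegree) if d == 0)
        order = []
        while ready:
            node = ready.popleft()
            order.append(node)
            for nxt in successors[node]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(order) != len(self.ops):
            raise ValueError("graph has a cycle; not a valid forward pass")
        return order


# A made-up network with one skip connection (node 0 feeds both 1 and 3).
toy = ToyArchitecture(
    ops=["input", "conv", "batch_norm", "sum", "pooling", "linear"],
    edges=[(0, 1), (1, 2), (2, 3), (0, 3), (3, 4), (4, 5)],
)
print([toy.ops[i] for i in toy.forward_order()])

# Evaluation split sizes as stated in the text (key names are hypothetical).
EVAL_SPLITS = {
    "val_id": 500, "test_id": 500,
    "wide_ood": 100, "deep_ood": 100, "dense_ood": 100, "no_bn_ood": 100,
    "predefined": 2,
}
print(sum(EVAL_SPLITS.values()))  # total number of evaluation architectures
```

The edge list keeps skip connections explicit, which is why a topological sort rather than simple list order is used to recover the forward pass.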
For the 1402 evaluation architectures, DeepNets-1M includes the accuracies of the networks on CIFAR-10 and ImageNet after training them with stochastic gradient descent (SGD). Besides accuracy, other properties of the evaluation architectures are included: accuracy on noisy images, inference time, and convergence time. These properties can be used to train neural architecture search models.
DeepNets-1M is used to train and evaluate parameter prediction models such as Graph HyperNetworks. These models can predict all parameters for a given network (graph) in a single forward pass, and the results can be compared to those obtained by optimizing the parameters with SGD.
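To make the contrast with SGD concrete, here is a deliberately simplified sketch of the interface such a predictor might expose: one call that returns values for every parameter tensor in the network, with no iterative optimization. This is not a Graph HyperNetwork; `predict_all_parameters`, the shape dictionary, and the random placeholder values are all assumptions for illustration only.

```python
import random


def predict_all_parameters(param_shapes, seed=0):
    """Toy stand-in for a parameter predictor (NOT a Graph HyperNetwork).

    Takes the parameter shape required by each node and returns values for
    every node in one call, without any training loop.
    """
    rng = random.Random(seed)
    predicted = {}
    for node, shape in param_shapes.items():
        count = 1
        for dim in shape:
            count *= dim
        # Placeholder values; a trained predictor would compute these from the graph.
        predicted[node] = [rng.gauss(0.0, 0.01) for _ in range(count)]
    return predicted


# Hypothetical shapes for a tiny network: one conv weight and one linear weight.
shapes = {"conv": (8, 3, 3, 3), "linear": (10, 8)}
params = predict_all_parameters(shapes)
print({name: len(values) for name, values in params.items()})
```

A network initialized this way could then be evaluated directly and its accuracy compared against the same architecture trained with SGD.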