We propose the Spherical Channel Network (SCN) to model atomic energies and forces.
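As a rough illustration only (not the SCN architecture itself, which operates on spherical channel representations), one common pattern in neural interatomic potentials is to predict a scalar energy and recover forces as its negative gradient via autograd; the tiny `energy_model` below is a hypothetical stand-in for any such energy network:

```python
import torch

# Hypothetical stand-in for an energy model: any module mapping atomic
# positions (N, 3) to a scalar total energy works in this pattern.
energy_model = torch.nn.Sequential(
    torch.nn.Linear(3, 64), torch.nn.SiLU(), torch.nn.Linear(64, 1)
)

positions = torch.randn(8, 3, requires_grad=True)  # 8 atoms, xyz coordinates
energy = energy_model(positions).sum()             # scalar total energy

# Forces are the negative gradient of energy w.r.t. positions, so an
# energy model yields energy-consistent forces through autograd.
forces = -torch.autograd.grad(energy, positions)[0]  # shape (8, 3)
```

Note that some models, SCN included, can also predict forces directly rather than by differentiating an energy; the gradient route shown here is simply the energy-conserving variant.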
1 code implementation • 17 Jun 2022 • Richard Tran, Janice Lan, Muhammed Shuaibi, Siddharth Goyal, Brandon M. Wood, Abhishek Das, Javier Heras-Domingo, Adeesh Kolluru, Ammar Rizvi, Nima Shoghi, Anuroop Sriram, Zachary Ulissi, C. Lawrence Zitnick
The dataset and baseline models are open-sourced, and a public leaderboard will follow to encourage continued community development on the total energy tasks and data.
However, prior work has implicitly assumed that the training configuration that is best for model performance is also the best configuration for mask discovery.
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities.
Standard gradient descent methods are susceptible to a range of issues that can impede training, such as strong correlations between parameters and widely differing scales across parameter space. These difficulties can be addressed by second-order approaches, which apply a pre-conditioning matrix to the gradient to improve convergence.
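A hedged toy example (an ill-conditioned quadratic with hand-picked step sizes, not any specific published optimizer) shows why pre-conditioning helps: multiplying the gradient by the inverse curvature equalizes the scales that slow plain gradient descent down.

```python
import numpy as np

# Quadratic loss L(theta) = 0.5 * theta^T H theta, with curvature that
# differs by a factor of 100 between the two directions.
H = np.diag([1.0, 100.0])
grad = lambda theta: H @ theta

theta_gd = np.array([1.0, 1.0])
theta_pre = np.array([1.0, 1.0])
P = np.linalg.inv(H)  # pre-conditioner; second-order methods approximate this

for _ in range(10):
    theta_gd = theta_gd - 0.01 * grad(theta_gd)        # plain gradient descent
    theta_pre = theta_pre - 0.9 * P @ grad(theta_pre)  # pre-conditioned step

print(theta_gd)   # still ~0.9 in the low-curvature direction after 10 steps
print(theta_pre)  # ~1e-10 in both directions: the scales are equalized
```

Plain gradient descent must keep its step size below 2/100 to stay stable in the stiff direction, which leaves it crawling in the flat one; the pre-conditioned update faces no such trade-off.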
We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters.
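A minimal first-order sketch of the idea follows; the actual LCA method uses a more careful path integral with higher-order integration, and the model, data, and variable names here are placeholders. Each step's loss change is allocated across parameters as gradient times parameter movement:

```python
import torch

# First-order sketch of Loss Change Allocation: each step's loss change
# is split across parameters as grad_i * delta_theta_i and accumulated.
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 4), torch.randn(32, 1)

lca = [torch.zeros_like(p) for p in model.parameters()]
for _ in range(100):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    before = [p.detach().clone() for p in model.parameters()]
    opt.step()
    # Allocate this step's loss change: grad * (theta_after - theta_before).
    for acc, p, b in zip(lca, model.parameters(), before):
        acc += p.grad * (p.detach() - b)

# Negative entries are parameters credited with reducing the loss; the
# entries sum (to first order) to the total loss change over training.
print(sum(a.sum().item() for a in lca))
```

The allocation is "conservative" in the sense that the per-parameter contributions sum to the overall loss change, so no credit is created or lost.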
The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights.