AtomSets -- A Hierarchical Transfer Learning Framework for Small and Large Materials Datasets

4 Feb 2021 · Chi Chen, Shyue Ping Ong

Predicting materials properties from composition or structure is of great interest to the materials science community. Deep learning has recently attracted considerable attention for materials prediction tasks, achieving low model errors when large materials datasets are available. However, deep learning models perform poorly in the small-data regime that is common in materials science. Here we combine transfer learning with the graph network deep learning framework to develop AtomSets, a machine learning framework that delivers consistently high model accuracy on both small and large materials datasets. The AtomSets models work with both compositional and structural materials data. By incorporating transfer-learned features from graph networks, they achieve state-of-the-art accuracy across dataset sizes ranging from small compositional data (<400) to large structural data (>130,000). The AtomSets models show much lower errors than state-of-the-art graph network models in the small-data limit and than classical machine learning models in the large-data limit. They also transfer better in a simulated materials discovery process in which the target materials have property values outside the range of the training data. The models require minimal domain knowledge as input and no feature engineering. The AtomSets framework opens new routes for machine learning-assisted materials design and discovery.
