iProStruct2D: Identifying protein structural classes by deep learning via 2D representations

11 Jun 2019 · Loris Nanni, Alessandra Lumini, Federica Pasquali, Sheryl Brahnam

In this paper we address the problem of protein classification starting from a multi-view 2D representation of proteins. From each 3D protein structure, a large set of 2D projections is generated using the protein visualization software Jmol. This set includes 13 different types of protein visualization, each emphasizing specific properties of the protein structure (e.g., a backbone visualization that displays the backbone of the protein as a trace of the Cα atoms). Each type of representation is used to train a separate Convolutional Neural Network (CNN), and fusing these CNNs is shown to exploit the diversity of the representations to improve classification performance. In addition, multi-view projections are obtained by uniformly rotating the protein structure around its central X, Y, and Z viewing axes to produce 125 images. This approach can be considered a data augmentation method for improving the performance of the classifier, and it can be used in both the training and the testing phases. Experimental evaluation on two datasets demonstrates the strength of the proposed method relative to other state-of-the-art approaches. The MATLAB code used in this paper is available at https://github.com/LorisNanni.
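
The multi-view and fusion steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code, and it rests on several assumptions that the abstract does not state: that the 125 views come from 5 evenly spaced angles per axis (5 × 5 × 5), that the CNN outputs are fused by averaging class probabilities over all models and views, and that there are 4 structural classes. The function names `rotation_grid` and `fuse_scores` are hypothetical, and random numbers stand in for real CNN outputs.

```python
import itertools

import numpy as np


def rotation_grid(steps_per_axis=5):
    """Return a (steps_per_axis**3, 3) array of (x, y, z) rotation angles in degrees.

    Assumption: views are taken at evenly spaced angles around each axis.
    """
    angles = np.arange(steps_per_axis) * (360.0 / steps_per_axis)
    return np.array(list(itertools.product(angles, repeat=3)))


def fuse_scores(scores):
    """Fuse class probabilities of shape (n_models, n_views, n_classes).

    Assumption: fusion is a plain average over models and views.
    Returns the predicted class index and the fused probability vector.
    """
    scores = np.asarray(scores, dtype=float)
    fused = scores.mean(axis=(0, 1))
    return int(np.argmax(fused)), fused


views = rotation_grid()  # one row of angles per rendered image

# Placeholder scores standing in for 13 CNNs (one per visualization type),
# each evaluated on every view of a single protein. The class count is assumed.
n_models, n_classes = 13, 4
rng = np.random.default_rng(0)
logits = rng.normal(size=(n_models, len(views), n_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

label, fused = fuse_scores(probs)
print(views.shape, label, fused.round(3))
```

Averaging over views at test time corresponds to using the rotations as test-time augmentation. Averaging over models is one simple way to combine the per-representation CNNs; other fusion rules, such as weighted sums, would fit the same interface.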
