Computational biology and bioinformatics provide vast data gold-mines from protein sequences, ideal for Language Models (LMs) taken from Natural Language Processing (NLP).
Ranked #1 on Protein Secondary Structure Prediction on CASP12
We have created the ProteinNet series of data sets to provide a standardized mechanism for training and assessing data-driven models of protein sequence-structure relationships.
Protein secondary structure (SS) prediction is important for studying protein structure and function.
Ranked #1 on Protein Secondary Structure Prediction on CullPDB
Here we present a new supervised generative stochastic network (GSN) based method to predict local secondary structure with deep hierarchical representations.
In spite of this, even the most sophisticated ab initio SS predictors are not able to reach the theoretical limit of three-state prediction accuracy (88–90%), while only a few predict more than the 3 traditional Helix, Strand and Coil classes.
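For reference, the three-state accuracy mentioned above (often written Q3) is the fraction of residues whose Helix, Strand or Coil label is predicted correctly. The following is a minimal sketch, not taken from any of the listed papers: the function names are hypothetical, and the 8-to-3 state reduction shown is one common convention (others exist), so treat it as an assumption.

```python
# Assumed reduction of DSSP 8-state labels to 3 states (H, E, C).
# This mapping is a common convention, not the only one in use.
DSSP8_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def to_three_state(labels: str) -> str:
    """Map labels to H/E/C; anything not listed above (T, S, C, -) becomes coil."""
    return "".join(DSSP8_TO_3.get(ch, "C") for ch in labels)


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose reduced 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    pred3 = to_three_state(predicted)
    obs3 = to_three_state(observed)
    matches = sum(p == o for p, o in zip(pred3, obs3))
    return matches / len(obs3)
```

Calling `q3_accuracy("HHHHEECC", "HHHCEECC")` compares eight residues, seven of which agree. Predictors that output more than three classes are usually scored on the unreduced labels instead.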
Motivation: Although secondary structure predictors have been developed for decades, current ab initio methods still have some way to go to reach their theoretical limits.
In the spirit of reproducible research we make our data, models and code available, aiming to set a gold standard for purity of training and testing sets.
Inspired by the recent successes of deep neural networks, in this paper, we propose an end-to-end deep network that predicts protein secondary structures from integrated local and global contextual features.
This paper proposes a novel and straightforward approach to improving the accuracy of the progressive multiple protein sequence alignment method.
Ranked #1 on Multiple Sequence Alignment on OXBench