Protein Language Model

25 papers with code • 1 benchmark • 1 dataset


Most implemented papers

ESM-NBR: fast and accurate nucleic acid-binding residue prediction via protein language model feature representation and multi-task learning

pengsl-lab/esm-nbr 1 Dec 2023

Meanwhile, ESM-NBR obtains MCC values of 0.427 and 0.391 for DNA-binding residue prediction on two independent test sets, which are 18.61% and 10.45% higher than those of the second-best methods, respectively.
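
For readers unfamiliar with the metric, the Matthews correlation coefficient (MCC) quoted above can be computed from the four confusion-matrix counts of a binary per-residue prediction. The following is a minimal sketch of the standard formula; the function name `mcc` and the example counts are illustrative assumptions and are not taken from the ESM-NBR code or results.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives (e.g. binding vs. non-binding residues).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: return 0.0 when any marginal total is zero,
    # because the coefficient is undefined in that case.
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(mcc(tp=50, tn=50, fp=0, fn=0))    # perfect prediction
    print(mcc(tp=25, tn=25, fp=25, fn=25))  # no better than chance
```

MCC ranges from -1 to 1 and uses all four counts, which is why it is often preferred over accuracy when binding residues are far rarer than non-binding ones.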

MSA Transformer

The-AI-Summer/self-attention-cv 13 Feb 2021

Unsupervised protein language models trained across millions of diverse sequences learn structure and function of proteins.

ECRECer: Enzyme Commission Number Recommendation and Benchmarking based on Multiagent Dual-core Learning

kingstdio/ECRECer 8 Feb 2022

Take the UniProt protein "A0A0U5GJ41" as an example (1.14.-.-): ECRECer annotated it with "1.14.11.38", which was supported by further protein structure analysis based on AlphaFold2.
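
The example above involves a partial EC number (with `-` placeholders) being refined to a full four-level one. A minimal sketch of how such a refinement could be checked follows. The helper names `parse_ec` and `refines` are hypothetical and not part of ECRECer; the sketch only assumes the usual four-field, dot-separated EC notation.

```python
from typing import Optional, Tuple

ECLevels = Tuple[Optional[str], ...]


def parse_ec(ec: str) -> ECLevels:
    """Split an EC number into its four levels; '-' becomes None (unspecified)."""
    parts = ec.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    return tuple(None if p == "-" else p for p in parts)


def refines(specific: str, partial: str) -> bool:
    """True if `specific` agrees with `partial` on every level `partial` defines."""
    s_levels = parse_ec(specific)
    p_levels = parse_ec(partial)
    return all(p is None or p == s for s, p in zip(s_levels, p_levels))


if __name__ == "__main__":
    # The two EC numbers quoted in the excerpt above.
    print(refines("1.14.11.38", "1.14.-.-"))
    # A hypothetical mismatch at the second level.
    print(refines("1.13.11.38", "1.14.-.-"))
```

Levels are kept as strings rather than integers so that non-numeric fields do not break parsing.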

Structure-aware Protein Self-supervised Learning

ggchen1997/steps_bioinformatics 6 Apr 2022

Furthermore, we propose to leverage the available protein language model pretrained on protein sequences to enhance the self-supervised learning.

Generative power of a protein language model trained on multiple sequence alignments

bitbol-lab/iterative_masking 14 Apr 2022

Moreover, for small protein families, our generation method based on MSA Transformer outperforms Potts models.

DistilProtBert: A distilled protein language model used to distinguish between real proteins and their randomly shuffled counterparts

yarongef/DistilProtBert bioRxiv 2022

Here, we adapted this concept to the problem of protein sequence analysis, by developing DistilProtBert, a distilled version of the successful ProtBert model.

HelixFold-Single: MSA-free Protein Structure Prediction by Using Protein Language Model as an Alternative

PaddlePaddle/PaddleHelix 28 Jul 2022

Our proposed method, HelixFold-Single, first pre-trains a large-scale protein language model (PLM) with thousands of millions of primary sequences utilizing the self-supervised learning paradigm, which will be used as an alternative to MSAs for learning the co-evolution information.

Protein language model rescue mutations highlight variant effects and structure in clinically relevant genes

dimenwarper/llm-for-clinical-variants 18 Nov 2022

Despite being self-supervised, protein language models have shown remarkable performance in fundamental biological tasks such as predicting impact of genetic variation on protein structure and function.

Protein Language Models and Structure Prediction: Connection and Progression

bozhenhhu/a-review-of-plms-and-methods-for-protein-structure-prediction 30 Nov 2022

The prediction of protein structures from sequences is an important task for function prediction, drug design, and understanding related biological processes.

Plug & Play Directed Evolution of Proteins with Gradient-based Discrete MCMC

pemami4911/ppde 20 Dec 2022

We introduce a sampling framework for evolving proteins in silico that supports mixing and matching a variety of unsupervised models, such as protein language models, and supervised models that predict protein function from sequence.