Protein Function Prediction

24 papers with code • 3 benchmarks • 2 datasets

For Gene Ontology (GO) term prediction, a model receives a function-prediction instruction and a protein sequence, and characterizes the protein's function using GO terms from three domains: cellular component, biological process, and molecular function.
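As a rough illustration of the task framing above, the sketch below treats a prediction as a set of GO terms, splits it by domain, and scores each domain separately with an F1 measure. The function names, the small `TERM_DOMAIN` lookup, and the choice of a plain set-based F1 are assumptions made for this example; the benchmarks listed on this page may use different metrics, and a real pipeline would read term-to-domain assignments from the ontology itself.

```python
# Minimal sketch, not a benchmark implementation: score predicted GO terms per domain.
DOMAINS = ("cellular_component", "biological_process", "molecular_function")

# Toy lookup (assumption for this example); real code would load it from the GO ontology.
TERM_DOMAIN = {
    "GO:0005634": "cellular_component",   # nucleus
    "GO:0005737": "cellular_component",   # cytoplasm
    "GO:0006355": "biological_process",   # regulation of DNA-templated transcription
    "GO:0006468": "biological_process",   # protein phosphorylation
    "GO:0003677": "molecular_function",   # DNA binding
    "GO:0005524": "molecular_function",   # ATP binding
}


def split_by_domain(terms):
    """Group a collection of GO term IDs into one set per GO domain."""
    grouped = {domain: set() for domain in DOMAINS}
    for term in terms:
        grouped[TERM_DOMAIN[term]].add(term)
    return grouped


def f1_per_domain(predicted, reference):
    """Return a set-based F1 score for each of the three GO domains."""
    pred = split_by_domain(predicted)
    ref = split_by_domain(reference)
    scores = {}
    for domain in DOMAINS:
        true_pos = len(pred[domain] & ref[domain])
        precision = true_pos / len(pred[domain]) if pred[domain] else 0.0
        recall = true_pos / len(ref[domain]) if ref[domain] else 0.0
        total = precision + recall
        scores[domain] = 2 * precision * recall / total if total else 0.0
    return scores


# Hypothetical prediction and reference annotation for a single protein.
predicted = {"GO:0005634", "GO:0006355", "GO:0006468", "GO:0005524"}
reference = {"GO:0005634", "GO:0006355", "GO:0003677"}
scores = f1_per_domain(predicted, reference)
```

Scoring each domain on its own keeps a model that does well on, say, cellular component from masking weak molecular function predictions.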

Most implemented papers

Multi-Scale Representation Learning on Proteins

vsomnath/holoprot NeurIPS 2021

This paper introduces a multi-scale graph construction of a protein -- HoloProt -- connecting surface to structure and sequence.

Robust deep learning based protein sequence design using ProteinMPNN

dauparas/ProteinMPNN bioRxiv 2022

While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta.

PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding

deepgraphlearning/peer_benchmark 5 Jun 2022

However, there is a lack of a standard benchmark to evaluate the performance of different methods, which hinders the progress of deep learning in this field.

Deep learning-based rapid generation of broadly reactive antibodies against SARS-CoV-2 and its Omicron variant

jianqingzheng/XBCR-net Cell Research 2022

The COVID-19 pandemic has been ongoing for nearly two and a half years, and new variants of concern (VOCs) of SARS-CoV-2 continue to emerge, which calls for the development of broadly neutralizing antibodies.

Galactica: A Large Language Model for Science

paperswithcode/galai 16 Nov 2022

We believe these results demonstrate the potential for language models as a new interface for science.

EurNet: Efficient Multi-Range Relational Modeling of Spatial Multi-Relational Data

hirl-team/eurnet-image 23 Nov 2022

We study EurNets in two important domains for image and protein structure modeling.

Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling

agemagician/Ankh 16 Jan 2023

As opposed to scaling up protein language models (PLMs), we seek to improve performance via protein-specific optimization.

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

zjunlp/mol-instructions 13 Jun 2023

Large Language Models (LLMs), with their remarkable task-handling capabilities and innovative outputs, have catalyzed significant advancements across a spectrum of fields.

MD-HIT: Machine learning for materials property prediction with dataset redundancy control

usccolumbia/md-hit 10 Jul 2023

This issue is well known in the field of bioinformatics for protein function prediction, in which a redundancy reduction procedure (CD-Hit) is always applied to reduce the sample redundancy by ensuring no pair of samples has a sequence similarity greater than a given threshold.

Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers

hadi-abdine/Prot2Text 25 Jul 2023

These results highlight the transformative impact of multimodal models, specifically the fusion of GNNs and LLMs, empowering researchers with powerful tools for more accurate function prediction of existing as well as first-to-see proteins.
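The redundancy-reduction idea mentioned in the MD-HIT entry above (keeping no pair of samples whose sequence similarity exceeds a threshold) can be sketched as a greedy filter. This is only an illustration of the idea, not CD-Hit or MD-HIT themselves: the `similarity` function uses Python's `difflib.SequenceMatcher` ratio as a stand-in for sequence identity, and the function names and the toy sequences are assumptions made for this example.

```python
# Minimal sketch of greedy redundancy reduction; not the CD-Hit algorithm itself.
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; real tools compute alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_nonredundant(sequences, threshold):
    """Keep a sequence only if it is at most `threshold` similar to every kept one.

    Sequences are visited longest first, so longer sequences act as representatives.
    """
    kept = []
    for seq in sorted(sequences, key=len, reverse=True):
        if all(similarity(seq, rep) <= threshold for rep in kept):
            kept.append(seq)
    return kept


# Toy sequences (made up for illustration): the first two differ in one residue.
toy_sequences = ["MKTAYIAKQR", "MKTAYIAKQK", "GSHMLEDPVV"]
representatives = greedy_nonredundant(toy_sequences, threshold=0.8)
```

With this filter, the near-duplicate second sequence is dropped and the dissimilar third one is kept, so a later train/test split cannot place near-identical sequences on both sides. The pairwise comparison is quadratic in the number of kept sequences, which is fine for a sketch but not for large datasets.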