Protein Function Prediction

24 papers with code • 3 benchmarks • 2 datasets

In Gene Ontology (GO) term prediction, a model is given a function-prediction instruction and a protein sequence, and it characterizes the protein's functions using GO terms from three domains: cellular component, biological process, and molecular function.
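As a rough illustration of the task's output, the sketch below groups predicted GO terms by the three domains and scores one domain against a reference annotation. The `GOPrediction` container, the `set_f1` helper, and the example sequence are hypothetical names invented for this sketch; they are not taken from any paper listed here, and the plain set-based F1 is a simplification of the threshold-based metrics that benchmarks typically report.

```python
from dataclasses import dataclass, field

# The three GO domains named in the task description.
GO_DOMAINS = ("cellular_component", "biological_process", "molecular_function")


@dataclass
class GOPrediction:
    """Predicted GO terms for one protein, grouped by domain (hypothetical container)."""
    sequence: str
    terms: dict = field(default_factory=lambda: {d: set() for d in GO_DOMAINS})

    def add(self, domain: str, go_id: str) -> None:
        # Reject anything outside the three GO domains.
        if domain not in GO_DOMAINS:
            raise ValueError(f"unknown GO domain: {domain}")
        self.terms[domain].add(go_id)


def set_f1(predicted: set, reference: set) -> float:
    """Plain set-overlap F1 between predicted and reference GO terms."""
    if not predicted and not reference:
        return 1.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Made-up sequence; the GO IDs are real terms used only as examples.
pred = GOPrediction(sequence="MKTAYIAKQRQISFVKSHFSRQ")
pred.add("molecular_function", "GO:0005524")  # ATP binding
pred.add("molecular_function", "GO:0004672")  # protein kinase activity
pred.add("cellular_component", "GO:0005634")  # nucleus

reference_mf = {"GO:0005524"}
score = set_f1(pred.terms["molecular_function"], reference_mf)
```

Keeping the domains separate mirrors how the task is usually evaluated, with one score per domain rather than a single pooled score.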

Structure-Informed Protein Language Model

deepgraphlearning/esm-s 7 Feb 2024

To address this issue, we introduce the integration of remote homology detection to distill structural information into protein language models without requiring explicit protein structures as input.

22

Endowing Protein Language Models with Structural Knowledge

borgwardtlab/pst 26 Jan 2024

Drawing from recent advances in graph transformers, our approach refines the self-attention mechanisms of pretrained language transformers by integrating structural information with structure extractor modules.

18

Insights Into the Inner Workings of Transformer Models for Protein Function Prediction

markuswenzel/xai-proteins 7 Sep 2023

Motivation: We explored how explainable artificial intelligence (XAI) can help shed light on the inner workings of neural networks for protein function prediction. To do so, we extended the widely used XAI method of integrated gradients so that latent representations inside transformer models, which were fine-tuned for Gene Ontology term and Enzyme Commission number prediction, can also be inspected.

10

Biomedical Knowledge Graph Embeddings with Negative Statements

liseda-lab/truewalks 7 Aug 2023

Explicitly considering negative statements has been shown to improve performance on tasks such as entity summarization and question answering or domain-specific tasks such as protein function prediction.

1

Prot2Text: Multimodal Protein's Function Generation with GNNs and Transformers

hadi-abdine/Prot2Text 25 Jul 2023

These results highlight the transformative impact of multimodal models, specifically the fusion of GNNs and LLMs, empowering researchers with powerful tools for more accurate function prediction of existing as well as first-to-see proteins.

11

MD-HIT: Machine learning for materials property prediction with dataset redundancy control

usccolumbia/md-hit 10 Jul 2023

This issue is well known in the field of bioinformatics for protein function prediction, in which a redundancy reduction procedure (CD-Hit) is always applied to reduce the sample redundancy by ensuring no pair of samples has a sequence similarity greater than a given threshold.

7
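The redundancy reduction described in the MD-HIT entry above (keep a set in which no pair exceeds a similarity threshold) can be sketched as a greedy filter. This is a minimal sketch, not the CD-HIT or MD-HIT implementation: `similarity` and `greedy_nonredundant` are hypothetical names, and `difflib.SequenceMatcher` is only a toy stand-in for a proper alignment-based sequence identity.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Toy similarity in [0, 1]; an assumed stand-in for sequence identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_nonredundant(sequences, threshold=0.9):
    """Keep a sequence only if it is at most `threshold` similar to every kept one."""
    kept = []
    # Visit longer sequences first so they become the representatives.
    for seq in sorted(sequences, key=len, reverse=True):
        if all(similarity(seq, rep) <= threshold for rep in kept):
            kept.append(seq)
    return kept


# Made-up sequences: the second is an exact duplicate of the first.
sequences = ["MKTAYIAKQR", "MKTAYIAKQR", "GGGGGGGGGG"]
nonredundant = greedy_nonredundant(sequences, threshold=0.9)
```

By construction, every pair in the returned list has similarity no greater than the threshold. The filter compares each candidate with all kept representatives, so it scales quadratically in the worst case and is meant only to show the idea.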

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

zjunlp/mol-instructions 13 Jun 2023

Large Language Models (LLMs), with their remarkable task-handling capabilities and innovative outputs, have catalyzed significant advancements across a spectrum of fields.

183

A Systematic Study of Joint Representation Learning on Protein Sequences and Structures

deepgraphlearning/gearnet 11 Mar 2023

Recent sequence representation learning methods based on Protein Language Models (PLMs) excel in sequence-based tasks, but their direct adaptation to tasks involving protein structures remains a challenge.

243

Linear-scaling kernels for protein sequences and small molecules outperform deep learning while providing uncertainty quantitation and improved interpretability

jlparki/xgpr 7 Feb 2023

We compare the performance of xGPR with the reported performance of various deep learning models on 20 benchmarks, including small molecule, protein sequence and tabular data.

5

Ankh: Optimized Protein Language Model Unlocks General-Purpose Modelling

agemagician/Ankh 16 Jan 2023

As opposed to scaling-up protein language models (PLMs), we seek improving performance via protein-specific optimization.

192