Protein Design

46 papers with code • 2 benchmarks • 3 datasets

Protein design is the task of generating amino acid sequences that satisfy a set of user-specified design requirements.

Benchmarks

These leaderboards track progress in Protein Design.

Dataset	Best Model
CATH 4.2	Knowledge-Design
CATH 4.3	GVP-large

Most implemented papers

ProGen2: Exploring the Boundaries of Protein Language Models

salesforce/progen • 27 Jun 2022

Attention-based models trained on protein sequences have demonstrated incredible success at classification and generation tasks relevant for artificial intelligence-driven protein design.

Learning from Protein Structure with Geometric Vector Perceptrons

drorlab/gvp-pytorch • ICLR 2021

Learning on 3D structures of large biomolecules is emerging as a distinct area in machine learning, but there has yet to emerge a unifying network architecture that simultaneously leverages the graph-structured and geometric aspects of the problem domain.

RITA: a Study on Scaling Up Generative Protein Sequence Models

lightonai/rita • 11 May 2022

In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database.

Geometry-Complete Diffusion for 3D Molecule Generation and Optimization

bioinfomachinelearning/bio-diffusion • 8 Feb 2023

However, such methods are unable to learn important geometric and physical properties of 3D molecules during molecular graph generation, as they adopt molecule-agnostic and non-geometric GNNs as their 3D graph denoising networks, which negatively impacts their ability to effectively scale to datasets of large 3D molecules.

X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design

ericlbuehler/mistral.rs • 11 Feb 2024

Starting with a set of pre-trained LoRA adapters, our gating strategy uses the hidden states to dynamically mix adapted layers, allowing the resulting X-LoRA model to draw upon different capabilities and create never-before-used deep layer-wise combinations to solve tasks.
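
The gating idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax gate, and the single linear gating matrix `gate_w` are all assumptions made for illustration. The sketch shows only the general pattern of using a hidden state to weight several low-rank adapter updates on top of a frozen base layer.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]

def mixed_adapter_layer(hidden, base_w, adapters, gate_w):
    """Hypothetical layer that mixes low-rank adapters with hidden-state gating.

    hidden:   input hidden state (list of floats)
    base_w:   frozen base weight matrix
    adapters: list of (A, B) low-rank pairs; each update is B @ (A @ hidden)
    gate_w:   gating matrix with one row per adapter (an assumption)
    """
    # Gating weights are computed from the hidden state itself.
    weights = softmax(matvec(gate_w, hidden))
    out = matvec(base_w, hidden)
    for w, (a, b) in zip(weights, adapters):
        delta = matvec(b, matvec(a, hidden))  # low-rank update
        out = [o + w * d for o, d in zip(out, delta)]
    return out, weights
```

With a zero gating matrix the adapters are weighted equally; a trained gate would shift the weights per input, which is the "dynamic mixing" the abstract refers to.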

Variational auto-encoding of protein sequences

samsinai/VAE_protein_function • 9 Dec 2017

Here we present an embedding of natural protein sequences using a Variational Auto-Encoder and use it to predict how mutations affect protein function.

Unsupervisedly Prompting AlphaFold2 for Few-Shot Learning of Accurate Folding Landscape and Protein Structure Prediction

mindspore-ai/mindscience • 20 Aug 2022

Data-driven predictive methods which can efficiently and accurately transform protein sequences into biologically active structures are highly valuable for scientific research and medical development.

TaxDiff: Taxonomic-Guided Diffusion Model for Protein Sequence Generation

linzy19/taxdiff • 27 Feb 2024

In this work, we propose TaxDiff, a taxonomic-guided diffusion model for controllable protein sequence generation that combines biological species information with the generative capabilities of diffusion models to generate structurally stable proteins within the sequence space.

mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion

emmijokinen/mgpfusion • 8 Feb 2018

We introduce a Bayesian data fusion model that re-calibrates the experimental and in silico data sources and then learns a predictive GP model from the combined data.

Conditioning by adaptive sampling for robust design

jacquesboitreaud/optimol • 29 Jan 2019

We assume access to one or more, potentially black box, stochastic "oracle" predictive functions, each of which maps from input (e.g., protein sequences) design space to a distribution over a property of interest (e.g., protein fluorescence).
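
To make the notion of a stochastic oracle concrete, here is a toy sketch. It is not the paper's model: `toy_oracle`, its tryptophan-counting rule, and the fixed standard deviation are hypothetical choices made only to show an oracle that returns a distribution (here a Gaussian) rather than a point estimate, and how that distribution yields the probability that a design meets a property threshold.

```python
import math
from typing import Tuple

def toy_oracle(sequence: str) -> Tuple[float, float]:
    """Hypothetical oracle: maps a sequence to a Gaussian (mean, std)
    over a made-up scalar property. Not a real predictor."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    # Illustrative assumption: the property grows with the fraction of 'W'.
    mean = sequence.count("W") / len(sequence)
    std = 0.1  # fixed predictive uncertainty, chosen arbitrarily
    return mean, std

def prob_at_least(sequence: str, threshold: float) -> float:
    """P(property >= threshold) under the oracle's Gaussian prediction."""
    mean, std = toy_oracle(sequence)
    z = (threshold - mean) / std
    # Gaussian survival function expressed through the error function.
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))
```

A design procedure could rank candidate sequences by `prob_at_least(seq, threshold)`, so the oracle's uncertainty influences the ranking alongside its mean prediction.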