Search Results for author: Pascal Notin

Found 10 papers, 6 papers with code

Protriever: End-to-End Differentiable Protein Homology Search for Fitness Prediction

no code implementations 10 Jun 2025 Ruben Weitzman, Peter Mørch Groth, Lood Van Niekerk, Aoi Otani, Yarin Gal, Debora Marks, Pascal Notin

Retrieving homologous protein sequences is essential for a broad range of protein modeling tasks such as fitness prediction, protein design, structure modeling, and protein-protein interactions.

Protein Design Retrieval

Multi-megabase scale genome interpretation with genetic language models

no code implementations 13 Jan 2025 Frederik Träuble, Lachlan Stuart, Andreas Georgiou, Pascal Notin, Arash Mehrjou, Ron Schwessinger, Mathieu Chevalley, Kim Branson, Bernhard Schölkopf, Cornelia van Duijn, Debora Marks, Patrick Schwab

Here, we present Phenformer, a multi-scale genetic language model that learns to generate mechanistic hypotheses as to how differences in genome sequence lead to disease-relevant changes in expression across cell types and tissues directly from DNA sequences of up to 88 million base pairs.

Language Modeling

DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design

1 code implementation 7 Dec 2023 Clare Lyle, Arash Mehrjou, Pascal Notin, Andrew Jesson, Stefan Bauer, Yarin Gal, Patrick Schwab

The discovery of therapeutics to treat genetically-driven pathologies relies on identifying genes involved in the underlying disease mechanisms.

Experimental Design

Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

2 code implementations 27 May 2022 Pascal Notin, Mafalda Dias, Jonathan Frazer, Javier Marchena-Hurtado, Aidan Gomez, Debora S. Marks, Yarin Gal

The ability to accurately model the fitness landscape of protein sequences is critical to a wide range of applications, from quantifying the effects of human variants on disease likelihood, to predicting immune-escape mutations in viruses and designing novel biotherapeutic proteins.

Retrieval

RITA: a Study on Scaling Up Generative Protein Sequence Models

4 code implementations 11 May 2022 Daniel Hesslow, Niccoló Zanichelli, Pascal Notin, Iacopo Poli, Debora Marks

In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database.

Prediction Protein Design

GeneDisco: A Benchmark for Experimental Design in Drug Discovery

2 code implementations ICLR 2022 Arash Mehrjou, Ashkan Soleymani, Andrew Jesson, Pascal Notin, Yarin Gal, Stefan Bauer, Patrick Schwab

GeneDisco contains a curated set of multiple publicly available experimental data sets as well as open-source implementations of state-of-the-art active learning policies for experimental design and exploration.

Active Learning Drug Discovery

Improving black-box optimization in VAE latent space using decoder uncertainty

1 code implementation NeurIPS 2021 Pascal Notin, José Miguel Hernández-Lobato, Yarin Gal

Optimization in the latent space of variational autoencoders is a promising approach to generate high-dimensional discrete objects that maximize an expensive black-box property (e.g., drug-likeness in molecular generation, function approximation with arithmetic expressions).

Decoder Drug Design

Improving compute efficacy frontiers with SliceOut

no code implementations 21 Jul 2020 Pascal Notin, Aidan N. Gomez, Joanna Yoo, Yarin Gal

Pushing forward the compute efficacy frontier in deep learning is critical for tasks that require frequent model re-training or workloads that entail training a large number of models.

Deep Learning
