Search Results for author: Connor W. Coley

Found 47 papers, 33 papers with code

OpenChemIE: An Information Extraction Toolkit For Chemistry Literature

1 code implementation 1 Apr 2024 Vincent Fan, Yujie Qian, Alex Wang, Amber Wang, Connor W. Coley, Regina Barzilay

Our machine learning models attain state-of-the-art performance when evaluated individually, and we meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole, achieving an F1 score of 69.5%.

Beyond Major Product Prediction: Reproducing Reaction Mechanisms with Machine Learning Models Trained on a Large-Scale Mechanistic Dataset

no code implementations 7 Mar 2024 Joonyoung F. Joung, Mun Hong Fong, Jihye Roh, Zhengkai Tu, John Bradshaw, Connor W. Coley

Mechanistic understanding of organic reactions can facilitate reaction development, impurity prediction, and in principle, reaction discovery.

Substrate Scope Contrastive Learning: Repurposing Human Bias to Learn Atomic Representations

1 code implementation 19 Feb 2024 Wenhao Gao, Priyanka Raghavan, Ron Shprints, Connor W. Coley

In this work, we introduce a novel pre-training strategy, substrate scope contrastive learning, which learns atomic representations tailored to chemical reactivity.

Contrastive Learning molecular representation

Predictive Chemistry Augmented with Text Retrieval

1 code implementation 8 Dec 2023 Yujie Qian, Zhening Li, Zhengkai Tu, Connor W. Coley, Regina Barzilay

Conventionally, chemoinformatics models are trained with extensive structured data manually extracted from the literature.

molecular representation Retrieval +2

An algorithmic framework for synthetic cost-aware decision making in molecular design

1 code implementation 3 Nov 2023 Jenna C. Fromer, Connor W. Coley

Small molecules exhibiting desirable property profiles are often discovered through an iterative process of designing, synthesizing, and testing sets of molecules.

Decision Making Property Prediction

MIST-CF: Chemical formula inference from tandem mass spectra

1 code implementation 17 Jul 2023 Samuel Goldman, Jiayi Xin, Joules Provenzano, Connor W. Coley

Importantly, MIST-CF learns in a data dependent fashion using a Formula Transformer neural network architecture and circumvents the need for fragmentation tree construction.

Learning-To-Rank

RxnScribe: A Sequence Generation Model for Reaction Diagram Parsing

1 code implementation 19 May 2023 Yujie Qian, Jiang Guo, Zhengkai Tu, Connor W. Coley, Regina Barzilay

Reaction diagram parsing is the task of extracting reaction schemes from a diagram in the chemistry literature.

Structured Prediction

Evaluating the roughness of structure-property relationships using pretrained molecular representations

no code implementations 14 May 2023 David E. Graff, Edward O. Pyzer-Knapp, Kirk E. Jordan, Eugene I. Shakhnovich, Connor W. Coley

When the correlation between structure and property weakens, a dataset is described as "rough," but this characteristic is partly a function of the chosen representation.

molecular representation Property Prediction

Generating Molecular Fragmentation Graphs with Autoregressive Neural Networks

1 code implementation 25 Apr 2023 Samuel Goldman, Janet Li, Connor W. Coley

The accurate prediction of tandem mass spectra from molecular structures has the potential to unlock new metabolomic discoveries by augmenting the community's libraries of experimental reference standards.

Prefix-Tree Decoding for Predicting Mass Spectra from Molecules

1 code implementation NeurIPS 2023 Samuel Goldman, John Bradshaw, Jiayi Xin, Connor W. Coley

Computational predictions of mass spectra from molecules have enabled the discovery of clinically relevant metabolites.

Reinforced Genetic Algorithm for Structure-based Drug Design

1 code implementation 28 Nov 2022 Tianfan Fu, Wenhao Gao, Connor W. Coley, Jimeng Sun

The neural models take the 3D structure of the targets and ligands as inputs and are pre-trained using native complex structures to utilize the knowledge of the shared binding physics from different targets and then fine-tuned during optimization.

Combinatorial Optimization Drug Discovery +1

De novo PROTAC design using graph-based deep generative models

1 code implementation 4 Nov 2022 Divya Nori, Connor W. Coley, Rocío Mercado

After fine-tuning, predicted activity against a challenging POI increases from 50% to >80% with near-perfect chemical validity for sampled compounds, suggesting this is a promising approach for the optimization of large, PROTAC-like molecules for targeted protein degradation.

Reinforcement Learning (RL)

Computer-Aided Multi-Objective Optimization in Small Molecule Discovery

no code implementations 13 Oct 2022 Jenna C. Fromer, Connor W. Coley

Molecular discovery is a multi-objective optimization problem that requires identifying a molecule or set of molecules that balance multiple, often competing, properties.

Bayesian Optimization

Equivariant Shape-Conditioned Generation of 3D Molecules for Ligand-Based Drug Design

1 code implementation 6 Oct 2022 Keir Adams, Connor W. Coley

Shape-based virtual screening is widely employed in ligand-based drug design to search chemical libraries for molecules with similar 3D shapes yet novel 2D chemical structures compared to known ligands.

valid

Roughness of molecular property landscapes and its impact on modellability

2 code implementations 19 Jul 2022 Matteo Aldeghi, David E. Graff, Nathan Frey, Joseph A. Morrone, Edward O. Pyzer-Knapp, Kirk E. Jordan, Connor W. Coley

In molecular discovery and drug design, structure-property relationships and activity landscapes are often qualitatively or quantitatively analyzed to guide the navigation of chemical space.

regression

A graph representation of molecular ensembles for polymer property prediction

1 code implementation 17 May 2022 Matteo Aldeghi, Connor W. Coley

While doing so, we built a dataset of simulated electron affinity and ionization potential values for >40k polymers with varying monomer composition, stoichiometry, and chain architecture, which may be used in the development of other tailored machine learning approaches.

BIG-bench Machine Learning Property Prediction

Self-focusing virtual screening with active design space pruning

2 code implementations 3 May 2022 David E. Graff, Matteo Aldeghi, Joseph A. Morrone, Kirk E. Jordan, Edward O. Pyzer-Knapp, Connor W. Coley

In this study, we propose an extension to the framework of model-guided optimization that mitigates inference costs using a technique we refer to as design space pruning (DSP), which irreversibly removes poor-performing candidates from consideration.

pyscreener: A Python Wrapper for Computational Docking Software

1 code implementation 17 Dec 2021 David E. Graff, Connor W. Coley

pyscreener is a Python library that seeks to alleviate the challenges of large-scale structure-based design using computational docking.

Scalable Geometric Deep Learning on Molecular Graphs

1 code implementation NeurIPS Workshop AI4Scien 2021 Nathan C. Frey, Siddharth Samsi, Joseph McDonald, Lin Li, Connor W. Coley, Vijay Gadepally

Deep learning in molecular and materials sciences is limited by the lack of integration between applied science, artificial intelligence, and high-performance computing.

Permutation invariant graph-to-sequence model for template-free retrosynthesis and reaction prediction

1 code implementation 19 Oct 2021 Zhengkai Tu, Connor W. Coley

Synthesis planning and reaction outcome prediction are two fundamental problems in computer-aided organic chemistry for which a variety of data-driven approaches have emerged.

Data Augmentation Graph-to-Sequence +5

Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design

1 code implementation ICLR 2022 Wenhao Gao, Rocío Mercado, Connor W. Coley

Molecular design and synthesis planning are two critical steps in the process of molecular discovery that we propose to formulate as a single shared task of conditional synthetic pathway generation.

Drug Discovery

Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations

1 code implementation ICLR 2022 Keir Adams, Lagnajit Pattanaik, Connor W. Coley

Molecular chirality, a form of stereochemistry most often describing relative spatial arrangements of bonded neighbors around tetrahedral carbon centers, influences the set of 3D conformers accessible to the molecule without changing its 2D graph connectivity.

Contrastive Learning Data Augmentation +3

Differentiable Scaffolding Tree for Molecule Optimization

no code implementations ICLR 2022 Tianfan Fu, Wenhao Gao, Cao Xiao, Jacob Yasonik, Connor W. Coley, Jimeng Sun

The structural design of functional molecules, also called molecular optimization, is an essential chemical science and engineering task with important applications, such as drug discovery.

Combinatorial Optimization Drug Discovery

Differentiable Scaffolding Tree for Molecular Optimization

no code implementations 22 Sep 2021 Tianfan Fu, Wenhao Gao, Cao Xiao, Jacob Yasonik, Connor W. Coley, Jimeng Sun

The structural design of functional molecules, also called molecular optimization, is an essential chemical science and engineering task with important applications, such as drug discovery.

Combinatorial Optimization Drug Discovery

Machine learning modeling of family wide enzyme-substrate specificity screens

1 code implementation 8 Sep 2021 Samuel Goldman, Ria Das, Kevin K. Yang, Connor W. Coley

However, the adoption of biocatalysis is limited by our ability to select enzymes that will catalyze their natural chemical transformation on non-natural substrates.

BIG-bench Machine Learning Drug Discovery +1

Machine learning on DNA-encoded library count data using an uncertainty-aware probabilistic loss function

1 code implementation 27 Aug 2021 Katherine S. Lim, Andrew G. Reidenbach, Bruce K. Hua, Jeremy W. Mason, Christopher J. Gerry, Paul A. Clemons, Connor W. Coley

Further, this approach to uncertainty-aware regression is applicable to other sparse or noisy datasets where the nature of stochasticity is known or can be modeled; in particular, the Poisson enrichment ratio metric we use can apply to other settings that compare sequencing count data between two experimental conditions.

Denoising Drug Discovery +1

Learning Graph Models for Template-Free Retrosynthesis

no code implementations arXiv 2021 Vignesh Ram Somnath, Charlotte Bunne, Connor W. Coley, Andreas Krause, Regina Barzilay

Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule.

Retrosynthesis Single-step retrosynthesis

BioNavi-NP: Biosynthesis Navigator for Natural Products

no code implementations 26 May 2021 Shuangjia Zheng, Tao Zeng, Chengtao Li, Binghong Chen, Connor W. Coley, Yuedong Yang, Ruibo Wu

Nature, a synthetic master, creates more than 300,000 natural products (NPs), which are the major constituents of FDA-approved drugs owing to the vast chemical space of NPs.

Retrosynthesis

Therapeutics Data Commons: Machine Learning Datasets and Tasks for Drug Discovery and Development

2 code implementations 18 Feb 2021 Kexin Huang, Tianfan Fu, Wenhao Gao, Yue Zhao, Yusuf Roohani, Jure Leskovec, Connor W. Coley, Cao Xiao, Jimeng Sun, Marinka Zitnik

Here, we introduce Therapeutics Data Commons (TDC), the first unifying platform to systematically access and evaluate machine learning across the entire range of therapeutics.

BIG-bench Machine Learning Drug Discovery

Accelerating high-throughput virtual screening through molecular pool-based active learning

1 code implementation 13 Dec 2020 David E. Graff, Eugene I. Shakhnovich, Connor W. Coley

Structure-based virtual screening is an important tool in early stage drug discovery that scores the interactions between a target protein and candidate ligands.

Active Learning Bayesian Optimization +2

Message Passing Networks for Molecules with Tetrahedral Chirality

1 code implementation 24 Nov 2020 Lagnajit Pattanaik, Octavian-Eugen Ganea, Ian Coley, Klavs F. Jensen, William H. Green, Connor W. Coley

Molecules with identical graph connectivity can exhibit different physical and biological properties if they exhibit stereochemistry, a spatial structural characteristic.

Drug Discovery

Learning Graph Models for Retrosynthesis Prediction

2 code implementations NeurIPS 2021 Vignesh Ram Somnath, Charlotte Bunne, Connor W. Coley, Andreas Krause, Regina Barzilay

Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule.

Retrosynthesis

Uncertainty Quantification Using Neural Networks for Molecular Property Prediction

1 code implementation 20 May 2020 Lior Hirschfeld, Kyle Swanson, Kevin Yang, Regina Barzilay, Connor W. Coley

While we believe these results show that existing UQ methods are not sufficient for all common use-cases and demonstrate the benefits of further research, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.

Drug Discovery Experimental Design +3

Autonomous discovery in the chemical sciences part I: Progress

no code implementations 30 Mar 2020 Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen

This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences.

Drug Discovery

Autonomous discovery in the chemical sciences part II: Outlook

no code implementations 30 Mar 2020 Connor W. Coley, Natalie S. Eyke, Klavs F. Jensen

This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences.

The Synthesizability of Molecules Proposed by Generative Models

1 code implementation 17 Feb 2020 Wenhao Gao, Connor W. Coley

The discovery of functional molecules is an expensive and time-consuming process, exemplified by the rising costs of small molecule therapeutic discovery.

Drug Discovery

Learning retrosynthetic planning through self-play

no code implementations 19 Jan 2019 John S. Schreck, Connor W. Coley, Kyle J. M. Bishop

The problem of retrosynthetic planning can be framed as a one-player game, in which the chemist (or a computer program) works backwards from a molecular target to simpler starting materials through a series of choices regarding which reactions to perform.

Multi-step retrosynthesis

Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network

1 code implementation NeurIPS 2017 Wengong Jin, Connor W. Coley, Regina Barzilay, Tommi Jaakkola

The prediction of organic reaction outcomes is a fundamental problem in computational chemistry.
