Search Results for author: Regina Barzilay

Found 135 papers, 78 papers with code

Composing Molecules with Multiple Property Constraints

no code implementations • ICML 2020 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

These rationales are identified from molecules as substructures that are likely responsible for each property of interest.

Drug Discovery

CapWAP: Image Captioning with a Purpose

no code implementations • EMNLP 2020 • Adam Fisch, Kenton Lee, Ming-Wei Chang, Jonathan Clark, Regina Barzilay

In this task, we use question-answer (QA) pairs -- a natural expression of information need -- from users, instead of reference captions, for both training and post-inference evaluation.

Image Captioning Question Answering +1

OpenChemIE: An Information Extraction Toolkit For Chemistry Literature

1 code implementation • 1 Apr 2024 • Vincent Fan, Yujie Qian, Alex Wang, Amber Wang, Connor W. Coley, Regina Barzilay

Our machine learning models attain state-of-the-art performance when evaluated individually, and we meticulously annotate a challenging dataset of reaction schemes with R-groups to evaluate our pipeline as a whole, achieving an F1 score of 69.5%.

Deep Confident Steps to New Pockets: Strategies for Docking Generalization

2 code implementations • 28 Feb 2024 • Gabriele Corso, Arthur Deng, Benjamin Fry, Nicholas Polizzi, Regina Barzilay, Tommi Jaakkola

Accurate blind docking has the potential to lead to new biological breakthroughs, but for this promise to be realized, docking methods must generalize well across the proteome.

Blind Docking

905

CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes

no code implementations • 9 Feb 2024 • Peter G. Mikhael, Itamar Chinn, Regina Barzilay

Computational screening of naturally occurring proteins has the potential to identify efficient catalysts among the hundreds of millions of sequences that remain uncharacterized.

Dirichlet Flow Matching with Applications to DNA Sequence Design

1 code implementation • 8 Feb 2024 • Hannes Stärk, Bowen Jing, Chenyu Wang, Gabriele Corso, Bonnie Berger, Regina Barzilay, Tommi Jaakkola

Further, we provide distilled Dirichlet flow matching, which enables one-step sequence generation with minimal performance hits, resulting in $O(L)$ speedups compared to autoregressive models.

Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design

1 code implementation • 7 Feb 2024 • Andrew Campbell, Jason Yim, Regina Barzilay, Tom Rainforth, Tommi Jaakkola

Our approach achieves state-of-the-art co-design performance while allowing the same multimodal model to be used for flexible generation of the sequence or structure.

Sample, estimate, aggregate: A recipe for causal discovery foundation models

1 code implementation • 2 Feb 2024 • Menghua Wu, Yujia Bao, Regina Barzilay, Tommi Jaakkola

Causal discovery, the task of inferring causal structure from data, promises to accelerate scientific research, inform policy making, and more.

Causal Discovery

Improved motif-scaffolding with SE(3) flow matching

1 code implementation • 8 Jan 2024 • Jason Yim, Andrew Campbell, Emile Mathieu, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Frank Noé, Regina Barzilay, Tommi S. Jaakkola

The first is motif amortization, in which FrameFlow is trained with the motif as input using a data augmentation strategy.

Data Augmentation Protein Design

132

Predictive Chemistry Augmented with Text Retrieval

1 code implementation • 8 Dec 2023 • Yujie Qian, Zhening Li, Zhengkai Tu, Connor W. Coley, Regina Barzilay

Conventionally, chemoinformatics models are trained with extensive structured data manually extracted from the literature.

molecular representation Retrieval +2

Fast non-autoregressive inverse folding with discrete diffusion

1 code implementation • 5 Dec 2023 • John J. Yang, Jason Yim, Regina Barzilay, Tommi Jaakkola

Generating protein sequences that fold into an intended 3D structure is a fundamental step in de novo protein design.

Protein Design

Risk-Controlling Model Selection via Guided Bayesian Optimization

no code implementations • 4 Dec 2023 • Bracha Laufer-Goldshtein, Adam Fisch, Regina Barzilay, Tommi Jaakkola

Adjustable hyperparameters of machine learning models typically impact various key trade-offs such as accuracy, fairness, robustness, or inference cost.

Bayesian Optimization Fairness +1

Particle Guidance: non-I.I.D. Diverse Sampling with Diffusion Models

1 code implementation • 19 Oct 2023 • Gabriele Corso, Yilun Xu, Valentin De Bortoli, Regina Barzilay, Tommi Jaakkola

In light of the widespread success of generative models, a significant amount of research has gone into speeding up their sampling time.

Conditional Image Generation

Harmonic Self-Conditioned Flow Matching for Multi-Ligand Docking and Binding Site Design

1 code implementation • 9 Oct 2023 • Hannes Stärk, Bowen Jing, Regina Barzilay, Tommi Jaakkola

A significant amount of protein function requires binding small molecules, including enzymatic catalysis.

Fast protein backbone generation with SE(3) flow matching

1 code implementation • 8 Oct 2023 • Jason Yim, Andrew Campbell, Andrew Y. K. Foong, Michael Gastegger, José Jiménez-Luna, Sarah Lewis, Victor Garcia Satorras, Bastiaan S. Veeling, Regina Barzilay, Tommi Jaakkola, Frank Noé

We present FrameFlow, a method for fast protein backbone generation using SE(3) flow matching.

Protein Design

132

Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

1 code implementation • 17 Jul 2023 • Xuan Zhang, Limei Wang, Jacob Helwig, Youzhi Luo, Cong Fu, Yaochen Xie, Meng Liu, Yuchao Lin, Zhao Xu, Keqiang Yan, Keir Adams, Maurice Weiler, Xiner Li, Tianfan Fu, Yucheng Wang, Haiyang Yu, Yuqing Xie, Xiang Fu, Alex Strasser, Shenglong Xu, Yi Liu, Yuanqi Du, Alexandra Saxton, Hongyi Ling, Hannah Lawrence, Hannes Stärk, Shurui Gui, Carl Edwards, Nicholas Gao, Adriana Ladera, Tailin Wu, Elyssa F. Hofgard, Aria Mansouri Tehrani, Rui Wang, Ameya Daigavane, Montgomery Bohde, Jerry Kurtin, Qian Huang, Tuong Phung, Minkai Xu, Chaitanya K. Joshi, Simon V. Mathis, Kamyar Azizzadenesheli, Ada Fang, Alán Aspuru-Guzik, Erik Bekkers, Michael Bronstein, Marinka Zitnik, Anima Anandkumar, Stefano Ermon, Pietro Liò, Rose Yu, Stephan Günnemann, Jure Leskovec, Heng Ji, Jimeng Sun, Regina Barzilay, Tommi Jaakkola, Connor W. Coley, Xiaoning Qian, Xiaofeng Qian, Tess Smidt, Shuiwang Ji

Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences.

Out-of-Distribution Generalization Transfer Learning +1

402

Improving Protein Optimization with Smoothed Fitness Landscapes

1 code implementation • 2 Jul 2023 • Andrew Kirjner, Jason Yim, Raman Samusevich, Shahar Bracha, Tommi Jaakkola, Regina Barzilay, Ila Fiete

The ability to engineer novel proteins with higher fitness for a desired property would be revolutionary for biotechnology and medicine.

Efficient Exploration

Conformal Language Modeling

1 code implementation • 16 Jun 2023 • Victor Quach, Adam Fisch, Tal Schuster, Adam Yala, Jae Ho Sohn, Tommi S. Jaakkola, Regina Barzilay

Translating this process to conformal prediction, we calibrate a stopping rule for sampling different outputs from the LM that get added to a growing set of candidates until we are confident that the output set is sufficient.

Conformal Prediction Language Modelling +2

RxnScribe: A Sequence Generation Model for Reaction Diagram Parsing

1 code implementation • 19 May 2023 • Yujie Qian, Jiang Guo, Zhengkai Tu, Connor W. Coley, Regina Barzilay

Reaction diagram parsing is the task of extracting reaction schemes from a diagram in the chemistry literature.

Structured Prediction

DiffDock-PP: Rigid Protein-Protein Docking with Diffusion Models

1 code implementation • 8 Apr 2023 • Mohamed Amine Ketata, Cedrik Laue, Ruslan Mammadov, Hannes Stärk, Menghua Wu, Gabriele Corso, Céline Marquet, Regina Barzilay, Tommi S. Jaakkola

Understanding how proteins structurally interact is crucial to modern biology, with applications in drug discovery and protein design.

Drug Discovery Protein Design

167

PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels

no code implementations • 31 Mar 2023 • Homa Esfahanizadeh, Adam Yala, Rafael G. L. D'Oliveira, Andrea J. D. Jaba, Victor Quach, Ken R. Duffy, Tommi S. Jaakkola, Vinod Vaikuntanathan, Manya Ghobadi, Regina Barzilay, Muriel Médard

Allowing organizations to share their data for training of machine learning (ML) models without unintended information leakage is an open problem in practice.

SE(3) diffusion model with application to protein backbone generation

1 code implementation • 5 Feb 2023 • Jason Yim, Brian L. Trippe, Valentin De Bortoli, Emile Mathieu, Arnaud Doucet, Regina Barzilay, Tommi Jaakkola

The design of novel protein structures remains a challenge in protein engineering for applications across biomedicine and chemistry.

Protein Structure Prediction

258

Efficiently Controlling Multiple Risks with Pareto Testing

no code implementations • 14 Oct 2022 • Bracha Laufer-Goldshtein, Adam Fisch, Regina Barzilay, Tommi Jaakkola

Machine learning applications frequently come with multiple diverse objectives and constraints that can change over time.

DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking

2 code implementations • 4 Oct 2022 • Gabriele Corso, Hannes Stärk, Bowen Jing, Regina Barzilay, Tommi Jaakkola

We instead frame molecular docking as a generative modeling problem and develop DiffDock, a diffusion generative model over the non-Euclidean manifold of ligand poses.

Ranked #1 on Blind Docking on PDBbind

Blind Docking

905

Calibrated Selective Classification

no code implementations • 25 Aug 2022 • Adam Fisch, Tommi Jaakkola, Regina Barzilay

Providing calibrated uncertainty estimates alongside predictions -- probabilities that correspond to true frequencies -- can be as important as having predictions that are simply accurate on average.

Classification Image Classification

Antibody-Antigen Docking and Design via Hierarchical Equivariant Refinement

1 code implementation • 14 Jul 2022 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

The binding affinity is governed by the 3D binding interface where antibody residues (paratope) closely interact with antigen residues (epitope).

Atomic Forces

Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem

1 code implementation • 8 Jun 2022 • Brian L. Trippe, Jason Yim, Doug Tischer, David Baker, Tamara Broderick, Regina Barzilay, Tommi Jaakkola

Construction of a scaffold structure that supports a desired motif, conferring protein function, shows promise for the design of vaccines and enzymes.

Torsional Diffusion for Molecular Conformer Generation

1 code implementation • 1 Jun 2022 • Bowen Jing, Gabriele Corso, Jeffrey Chang, Regina Barzilay, Tommi Jaakkola

Molecular conformer generation is a fundamental task in computational chemistry.

BIG-bench Machine Learning

232

MolScribe: Robust Molecular Structure Recognition with Image-To-Graph Generation

1 code implementation • 28 May 2022 • Yujie Qian, Jiang Guo, Zhengkai Tu, Zhening Li, Connor W. Coley, Regina Barzilay

Molecular structure recognition is the task of translating a molecular image into its graph structure.

Data Augmentation Graph Generation

115

Learning to Split for Automatic Bias Detection

1 code implementation • 28 Apr 2022 • Yujia Bao, Regina Barzilay

Classifiers are biased when trained on biased datasets.

Bias Detection Image Classification +3

Conformal Prediction Sets with Limited False Positives

1 code implementation • 15 Feb 2022 • Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

We propose to trade coverage for a notion of precision by enforcing that the presence of incorrect candidates in the predicted conformal sets (i.e., the total number of false positives) is bounded according to a user-specified tolerance.

Conformal Prediction

EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

1 code implementation • 7 Feb 2022 • Hannes Stärk, Octavian-Eugen Ganea, Lagnajit Pattanaik, Regina Barzilay, Tommi Jaakkola

Predicting how a drug-like molecule binds to a specific protein target is a core problem in drug discovery.

Ranked #5 on Blind Docking on PDBBind

Blind Docking Drug Discovery

464

Syfer: Neural Obfuscation for Private Data Release

no code implementations • 28 Jan 2022 • Adam Yala, Victor Quach, Homa Esfahanizadeh, Rafael G. L. D'Oliveira, Ken R. Duffy, Muriel Médard, Tommi S. Jaakkola, Regina Barzilay

We quantify privacy as the number of attacker guesses required to re-identify a single image (guesswork).

Contrastive Learning

Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking

1 code implementation • ICLR 2022 • Octavian-Eugen Ganea, Xinyuan Huang, Charlotte Bunne, Yatao Bian, Regina Barzilay, Tommi Jaakkola, Andreas Krause

Protein complex formation is a central problem in biology, being involved in most of the cell's processes, and essential for applications, e.g., drug design or protein engineering.

Graph Matching Translation

223

Fragment-based Sequential Translation for Molecular Optimization

no code implementations • NeurIPS Workshop AI4Scien 2021 • Benson Chen, Xiang Fu, Regina Barzilay, Tommi Jaakkola

Equipped with the learned fragment vocabulary, we propose Fragment-based Sequential Translation (FaST), which learns a reinforcement learning (RL) policy to iteratively translate model-discovered molecules into increasingly novel molecules while satisfying desired properties.

Drug Discovery Reinforcement Learning (RL) +1

Crystal Diffusion Variational Autoencoder for Periodic Material Generation

4 code implementations • ICLR 2022 • Tian Xie, Xiang Fu, Octavian-Eugen Ganea, Regina Barzilay, Tommi Jaakkola

Generating the periodic structure of stable materials is a long-standing challenge for the material design community.

Inductive Bias Translation +1

198

Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design

1 code implementation • ICLR 2022 • Wengong Jin, Jeremy Wohlwend, Regina Barzilay, Tommi Jaakkola

In this paper, we propose a generative model to automatically design the CDRs of antibodies with enhanced binding specificity or neutralization capabilities.

Protein Design Specificity

122

Text Style Transfer with Confounders

no code implementations • 29 Sep 2021 • Tianxiao Shen, Regina Barzilay, Tommi S. Jaakkola

Existing methods for style transfer operate either with paired sentences or distributionally matched corpora which differ only in the desired style.

Style Transfer Text Style Transfer

Trading Coverage for Precision: Conformal Prediction with Limited False Discoveries

no code implementations • 29 Sep 2021 • Adam Fisch, Tal Schuster, Tommi S. Jaakkola, Regina Barzilay

In this paper, we develop a new approach to conformal prediction in which we aim to output a precise set of promising prediction candidates that is guaranteed to contain a limited number of incorrect answers.

Conformal Prediction Drug Discovery

Mol2Image: Improved Conditional Flow Models for Molecule to Image Synthesis

no code implementations • CVPR 2021 • Karren Yang, Samuel Goldman, Wengong Jin, Alex X. Lu, Regina Barzilay, Tommi Jaakkola, Caroline Uhler

In this paper, we aim to synthesize cell microscopy images under different molecular interventions, motivated by practical applications to drug development.

Contrastive Learning Image Generation

Learning Stable Classifiers by Transferring Unstable Features

1 code implementation • 15 Jun 2021 • Yujia Bao, Shiyu Chang, Regina Barzilay

Empirical results demonstrate that our algorithm is able to maintain robustness on the target task for both synthetically generated environments and real-world environments.

Transfer Learning

GeoMol: Torsional Geometric Generation of Molecular 3D Conformer Ensembles

1 code implementation • NeurIPS 2021 • Octavian-Eugen Ganea, Lagnajit Pattanaik, Connor W. Coley, Regina Barzilay, Klavs F. Jensen, William H. Green, Tommi S. Jaakkola

Prediction of a molecule's 3D conformer ensemble from the molecular graph holds a key role in areas of cheminformatics and drug discovery.

Drug Discovery

150

Learning Graph Models for Template-Free Retrosynthesis

no code implementations • arXiv 2021 • Vignesh Ram Somnath, Charlotte Bunne, Connor W. Coley, Andreas Krause, Regina Barzilay

Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule.

Ranked #6 on Single-step retrosynthesis on USPTO-50k

Retrosynthesis Single-step retrosynthesis

NeuraCrypt: Hiding Private Health Data via Random Neural Networks for Public Training

1 code implementation • 4 Jun 2021 • Adam Yala, Homa Esfahanizadeh, Rafael G. L. D'Oliveira, Ken R. Duffy, Manya Ghobadi, Tommi S. Jaakkola, Vinod Vaikuntanathan, Regina Barzilay, Muriel Médard

We propose to approximate this family of encoding functions through random deep neural networks.

BIG-bench Machine Learning

Nutri-bullets Hybrid: Consensual Multi-document Summarization

no code implementations • NAACL 2021 • Darsh Shah, Lili Yu, Tao Lei, Regina Barzilay

We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.

Document Summarization Language Modelling +3

Predict then Interpolate: A Simple Algorithm to Learn Stable Classifiers

1 code implementation • 26 May 2021 • Yujia Bao, Shiyu Chang, Regina Barzilay

In this work, we prove that by interpolating the distributions of the correct predictions and the wrong predictions, we can uncover an oracle distribution where the unstable correlation vanishes.

Image Classification text-classification +1

Consistent Accelerated Inference via Confident Adaptive Transformers

1 code implementation • EMNLP 2021 • Tal Schuster, Adam Fisch, Tommi Jaakkola, Regina Barzilay

In this work, we present CATs -- Confident Adaptive Transformers -- in which we simultaneously increase computational efficiency, while guaranteeing a specifiable degree of consistency with the original model with high confidence.

Computational Efficiency Conformal Prediction +1

Generating Related Work

no code implementations • 18 Apr 2021 • Darsh J Shah, Regina Barzilay

Communicating new research ideas involves highlighting similarities and differences with past work.

Document Summarization Multi-Document Summarization

Nutribullets Hybrid: Multi-document Health Summarization

2 code implementations • 8 Apr 2021 • Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay

We present a method for generating comparative summaries that highlight similarities and contradictions in input documents.

Language Modelling Nutrition +1

896

Nutri-bullets: Summarizing Health Studies by Composing Segments

1 code implementation • 22 Mar 2021 • Darsh J Shah, Lili Yu, Tao Lei, Regina Barzilay

We introduce Nutri-bullets, a multi-document summarization task for health and nutrition.

Document Summarization Language Modelling +2

Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence

1 code implementation • NAACL 2021 • Tal Schuster, Adam Fisch, Regina Barzilay

Typical fact verification models use retrieved written evidence to verify claims.

Fact Checking Fact Verification +2

Few-shot Conformal Prediction with Auxiliary Tasks

1 code implementation • 17 Feb 2021 • Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

We develop a novel approach to conformal prediction when the target task has limited data available for training.

Conformal Prediction Drug Discovery +1

CapWAP: Captioning with a Purpose

1 code implementation • 9 Nov 2020 • Adam Fisch, Kenton Lee, Ming-Wei Chang, Jonathan H. Clark, Regina Barzilay

In this task, we use question-answer (QA) pairs -- a natural expression of information need -- from users, instead of reference captions, for both training and post-inference evaluation.

Image Captioning Question Answering +1

1,561

Discovering Synergistic Drug Combinations for COVID with Biological Bottleneck Models

no code implementations • 9 Nov 2020 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

Drug combinations play an important role in therapeutics due to their better efficacy and reduced toxicity.

Deciphering Undersegmented Ancient Scripts Using Phonetic Prior

1 code implementation • 21 Oct 2020 • Jiaming Luo, Frederik Hartmann, Enrico Santus, Yuan Cao, Regina Barzilay

We evaluate the model on both deciphered languages (Gothic, Ugaritic) and an undeciphered one (Iberian).

Decipherment

Efficient Conformal Prediction via Cascaded Inference with Expanded Admission

1 code implementation • ICLR 2021 • Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay

This set is guaranteed to contain a correct answer with high probability, and is well-suited for many open-ended classification tasks.

Conformal Prediction Drug Discovery +2

Improved Conditional Flow Models for Molecule to Image Synthesis

1 code implementation • 15 Jun 2020 • Karren Yang, Samuel Goldman, Wengong Jin, Alex Lu, Regina Barzilay, Tommi Jaakkola, Caroline Uhler

In this paper, we aim to synthesize cell microscopy images under different molecular interventions, motivated by practical applications to drug development.

Contrastive Learning Image Generation

Learning Graph Models for Retrosynthesis Prediction

2 code implementations • NeurIPS 2021 • Vignesh Ram Somnath, Charlotte Bunne, Connor W. Coley, Andreas Krause, Regina Barzilay

Retrosynthesis prediction is a fundamental problem in organic synthesis, where the task is to identify precursor molecules that can be used to synthesize a target molecule.

Retrosynthesis

Optimal Transport Graph Neural Networks

2 code implementations • 8 Jun 2020 • Benson Chen, Gary Bécigneul, Octavian-Eugen Ganea, Regina Barzilay, Tommi Jaakkola

Current graph neural network (GNN) architectures naively average or sum node embeddings into an aggregated graph representation -- potentially losing structural or semantic information.

Ranked #1 on Graph Regression on Lipophilicity (using extra training data)

Drug Discovery Graph Regression +2

Enforcing Predictive Invariance across Structured Biomedical Domains

no code implementations • 6 Jun 2020 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

We evaluate our method on multiple applications: molecular property prediction, protein homology and stability prediction and show that RGM significantly outperforms previous state-of-the-art baselines.

Domain Generalization Molecular Property Prediction +1

Uncertainty Quantification Using Neural Networks for Molecular Property Prediction

1 code implementation • 20 May 2020 • Lior Hirschfeld, Kyle Swanson, Kevin Yang, Regina Barzilay, Connor W. Coley

While we believe these results show that existing UQ methods are not sufficient for all common use-cases and demonstrate the benefits of further research, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.

Drug Discovery Experimental Design +3

Adaptive Invariance for Molecule Property Prediction

no code implementations • 5 May 2020 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

Effective property prediction methods can help accelerate the search for COVID-19 antivirals either through accurate in-silico screens or by effectively guiding on-going at-scale experimental efforts.

Property Prediction Transfer Learning

Improving Molecular Design by Stochastic Iterative Target Augmentation

2 code implementations • ICML 2020 • Kevin Yang, Wengong Jin, Kyle Swanson, Regina Barzilay, Tommi Jaakkola

The property predictor is then used as a likelihood model for filtering candidate structures from the generative model.

Program Synthesis

Blank Language Models

1 code implementation • EMNLP 2020 • Tianxiao Shen, Victor Quach, Regina Barzilay, Tommi Jaakkola

We propose Blank Language Model (BLM), a model that generates sequences by dynamically creating and filling in blanks.

Ancient Text Restoration Language Modelling +1

Multi-Objective Molecule Generation using Interpretable Substructures

4 code implementations • 8 Feb 2020 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

These rationales are identified from molecules as substructures that are likely responsible for each property of interest.

Drug Discovery

1,543

Hierarchical Generation of Molecular Graphs using Structural Motifs

2 code implementations • ICML 2020 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

Indeed, as we demonstrate, their performance degrades significantly for larger molecules.

Drug Discovery Graph Generation

341

Generative Models for Graph-Based Protein Design

1 code implementation • ICLR Workshop DeepGenStruct 2019 • John Ingraham, Vikas Garg, Regina Barzilay, Tommi Jaakkola

Engineered proteins offer the potential to solve many problems in biomedicine, energy, and materials science, but creating designs that succeed is difficult in practice.

Protein Design Protein Folding

225

Capturing Greater Context for Question Generation

1 code implementation • 22 Oct 2019 • Luu Anh Tuan, Darsh J Shah, Regina Barzilay

Automatic question generation can benefit many applications ranging from dialogue systems to reading comprehension.

Question Answering Question Generation +3

Learning to Make Generalizable and Diverse Predictions for Retrosynthesis

no code implementations • 21 Oct 2019 • Benson Chen, Tianxiao Shen, Tommi S. Jaakkola, Regina Barzilay

We propose a new model for making generalizable and diverse retrosynthetic reaction predictions.

Retrosynthesis Single-step retrosynthesis

Automatic Fact-guided Sentence Modification

3 code implementations • 30 Sep 2019 • Darsh J Shah, Tal Schuster, Regina Barzilay

This is a challenging constrained generation task, as the output must be consistent with the new information and fit into the rest of the existing document.

Fact Checking Sentence

896

Denoising Improves Latent Space Geometry in Text Autoencoders

no code implementations • 25 Sep 2019 • Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Neural language models have recently shown impressive gains in unconditional text generation, but controllable generation and manipulation of text remain challenging.

Denoising Sentence +1

Iterative Target Augmentation for Effective Conditional Generation

no code implementations • 25 Sep 2019 • Kevin Yang, Wengong Jin, Kyle Swanson, Regina Barzilay, Tommi Jaakkola

Many challenging prediction problems, from molecular optimization to program synthesis, involve creating complex structured objects as outputs.

Program Synthesis

Working Hard or Hardly Working: Challenges of Integrating Typology into Neural Dependency Parsers

1 code implementation • IJCNLP 2019 • Adam Fisch, Jiang Guo, Regina Barzilay

This paper explores the task of leveraging typology in the context of cross-lingual dependency parsing.

Dependency Parsing

The Limitations of Stylometry for Detecting Machine-Generated Fake News

no code implementations • CL 2020 • Tal Schuster, Roei Schuster, Darsh J Shah, Regina Barzilay

Recent developments in neural language models (LMs) have raised concerns about their potential misuse for automatically spreading misinformation.

Fake News Detection Language Modelling +1

Few-shot Text Classification with Distributional Signatures

2 code implementations • ICLR 2020 • Yujia Bao, Menghua Wu, Shiyu Chang, Regina Barzilay

In this paper, we explore meta-learning for few-shot text classification.

Few-Shot Text Classification General Classification +3

252

Towards Debiasing Fact Verification Models

3 code implementations • IJCNLP 2019 • Tal Schuster, Darsh J Shah, Yun Jie Serene Yeo, Daniel Filizzola, Enrico Santus, Regina Barzilay

Fact verification requires validating a claim in the context of evidence.

Fact Verification

Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B

1 code implementation • ACL 2019 • Jiaming Luo, Yuan Cao, Regina Barzilay

In this paper we propose a novel neural approach for automatic decipherment of lost languages.

Decipherment

Hierarchical Graph-to-Graph Translation for Molecules

1 code implementation • 11 Jun 2019 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

The problem of accelerating drug discovery relies heavily on automatic tools to optimize precursor molecules to afford them with better biochemical properties.

Ranked #1 on Drug Discovery on QED

Drug Discovery Graph-To-Graph Translation +1

341

Educating Text Autoencoders: Latent Representation Guidance via Denoising

3 code implementations • ICML 2020 • Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

We prove that this simple modification guides the latent space geometry of the resulting model by encouraging the encoder to map similar texts to similar latent representations.

Denoising Sentence +2

201

Path-Augmented Graph Transformer Network

2 code implementations • 29 May 2019 • Benson Chen, Regina Barzilay, Tommi Jaakkola

Much of the recent work on learning molecular representations has been based on Graph Convolution Networks (GCN).

Molecular Property Prediction Property Prediction

Learning Multimodal Graph-to-Graph Translation for Molecule Optimization

no code implementations • ICLR 2019 • Wengong Jin, Kevin Yang, Regina Barzilay, Tommi Jaakkola

We evaluate our model on multiple molecule optimization tasks and show that our model outperforms previous state-of-the-art baselines by a significant margin.

Graph-To-Graph Translation Translation

Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes

no code implementations • 24 Apr 2019 • Yujia Bao, Zhengyi Deng, Yan Wang, Heeyoon Kim, Victor Diego Armengol, Francisco Acevedo, Nofal Ouardaoui, Cathy Wang, Giovanni Parmigiani, Regina Barzilay, Danielle Braun, Kevin S. Hughes

We developed and evaluated two machine learning models to classify abstracts as relevant to the penetrance (risk of cancer for germline mutation carriers) or prevalence of germline genetic mutations.

Classification General Classification

Analyzing Learned Molecular Representations for Property Prediction

4 code implementations • 2 Apr 2019 • Kevin Yang, Kyle Swanson, Wengong Jin, Connor Coley, Philipp Eiden, Hua Gao, Angel Guzman-Perez, Timothy Hopper, Brian Kelley, Miriam Mathea, Andrew Palmer, Volker Settels, Tommi Jaakkola, Klavs Jensen, Regina Barzilay

In addition, we introduce a graph convolutional model that consistently matches or outperforms models using fixed molecular descriptors as well as previous graph neural architectures on both public and proprietary datasets.

Ranked #3 on Molecular Property Prediction on QM9

Molecular Property Prediction • molecular representation • +1

1,543

Paper
Code

Inferring Which Medical Treatments Work from Reports of Clinical Trials

2 code implementations • NAACL 2019 • Eric Lehman, Jay DeYoung, Regina Barzilay, Byron C. Wallace

In this paper, we present a new task and corpus for making this unstructured evidence actionable.

Paper
Code

Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing

2 code implementations • NAACL 2019 • Tal Schuster, Ori Ram, Regina Barzilay, Amir Globerson

We introduce a novel method for multilingual transfer that utilizes deep contextual embeddings, pretrained in an unsupervised fashion.

Ranked #1 on Cross-lingual zero-shot dependency parsing on Universal Dependency Treebank

Cross-lingual zero-shot dependency parsing • Few-Shot Learning • +1

Paper
Code

Learning Multimodal Graph-to-Graph Translation for Molecular Optimization

5 code implementations • 3 Dec 2018 • Wengong Jin, Kevin Yang, Regina Barzilay, Tommi Jaakkola

We evaluate our model on multiple molecular optimization tasks and show that our model outperforms previous state-of-the-art baselines.

Graph-To-Graph Translation • Translation

147

Paper
Code

A graph-convolutional neural network model for the prediction of chemical reactivity

no code implementations • Chemical Science 2018 • Connor W. Coley, Wengong Jin, Luke Rogers, Timothy F. Jamison, Tommi S. Jaakkola, William H. Green, Regina Barzilay, Klavs F. Jensen

We present a supervised learning approach to predict the products of organic reactions given their reactants, reagents, and solvent(s).

Paper

GraphIE: A Graph-Based Framework for Information Extraction

2 code implementations • NAACL 2019 • Yujie Qian, Enrico Santus, Zhijing Jin, Jiang Guo, Regina Barzilay

Most modern Information Extraction (IE) systems are implemented as sequential taggers and only model local dependencies.

108

Paper
Code

Multi-Source Domain Adaptation with Mixture of Experts

1 code implementation • EMNLP 2018 • Jiang Guo, Darsh J Shah, Regina Barzilay

We propose a mixture-of-experts approach for unsupervised domain adaptation from multiple sources.

Part-Of-Speech Tagging • Sentiment Analysis • +1

Paper
Code

Deriving Machine Attention from Human Rationales

3 code implementations • EMNLP 2018 • Yujia Bao, Shiyu Chang, Mo Yu, Regina Barzilay

Attention-based models are successful when trained on large amounts of data.

Paper
Code

The Three Pillars of Machine Programming

no code implementations • 20 Mar 2018 • Justin Gottschlich, Armando Solar-Lezama, Nesime Tatbul, Michael Carbin, Martin Rinard, Regina Barzilay, Saman Amarasinghe, Joshua B. Tenenbaum, Tim Mattson

In this position paper, we describe our vision of the future of machine programming through a categorical examination of three pillars of research.

BIG-bench Machine Learning • Position

Paper

Junction Tree Variational Autoencoder for Molecular Graph Generation

11 code implementations • ICML 2018 • Wengong Jin, Regina Barzilay, Tommi Jaakkola

We evaluate our model on multiple tasks ranging from molecular generation to optimization.

Ranked #1 on Molecular Graph Generation on InterBioScreen

Drug Discovery • Graph Generation • +1

13,001

Paper
Code

Using Deep Reinforcement Learning to Generate Rationales for Molecules

no code implementations • ICLR 2018 • Benson Chen, Connor Coley, Regina Barzilay, Tommi Jaakkola

Deep learning algorithms are increasingly used in modeling chemical processes.

reinforcement-learning • Reinforcement Learning (RL)

Paper

Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network

1 code implementation • NeurIPS 2017 • Wengong Jin, Connor W. Coley, Regina Barzilay, Tommi Jaakkola

The prediction of organic reaction outcomes is a fundamental problem in computational chemistry.

141

Paper
Code

Grounding Language for Transfer in Deep Reinforcement Learning

1 code implementation • 1 Aug 2017 • Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola

In this paper, we explore the utilization of natural language to drive transfer for reinforcement learning (RL).

reinforcement-learning • Reinforcement Learning (RL)

Paper
Code

Representation Learning for Grounded Spatial Reasoning

1 code implementation • TACL 2018 • Michael Janner, Karthik Narasimhan, Regina Barzilay

The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment.

reinforcement-learning • Reinforcement Learning (RL) • +1

Paper
Code

Style Transfer from Non-Parallel Text by Cross-Alignment

12 code implementations • NeurIPS 2017 • Tianxiao Shen, Tao Lei, Regina Barzilay, Tommi Jaakkola

We demonstrate the effectiveness of this cross-alignment method on three tasks: sentiment modification, decipherment of word substitution ciphers, and recovery of word order.

Ranked #7 on Text Style Transfer on Yelp Review Dataset (Small)

Decipherment • Machine Translation • +3

551

Paper
Code

Deriving Neural Architectures from Sequence and Graph Kernels

no code implementations • ICML 2017 • Tao Lei, Wengong Jin, Regina Barzilay, Tommi Jaakkola

The design of neural architectures for structured objects is typically guided by experimental insights rather than a formal process.

Graph Regression • Language Modelling • +1

Paper

Unsupervised Learning of Morphological Forests

no code implementations • TACL 2017 • Jiaming Luo, Karthik Narasimhan, Regina Barzilay

This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary.

Clustering

Paper

Aspect-augmented Adversarial Networks for Domain Adaptation

1 code implementation • TACL 2017 • Yuan Zhang, Regina Barzilay, Tommi Jaakkola

We introduce a neural method for transfer learning between two (source and target) classification tasks or aspects over the same domain.

Domain Adaptation • General Classification • +2

Paper
Code

Learning to refine text based recommendations

no code implementations • EMNLP 2016 • Youyang Gu, Tao Lei, Regina Barzilay, Tommi Jaakkola

Collaborative Filtering

Paper

Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

2 code implementations • EMNLP 2016 • Nicholas Locascio, Karthik Narasimhan, Eduardo DeLeon, Nate Kushman, Regina Barzilay

This paper explores the task of translating natural language queries into regular expressions which embody their meaning.

Natural Language Queries

425

Paper
Code

sk_p: a neural program corrector for MOOCs

no code implementations • 11 Jul 2016 • Yewen Pu, Karthik Narasimhan, Armando Solar-Lezama, Regina Barzilay

We present a novel technique for automatic program correction in MOOCs, capable of fixing both syntactic and semantic errors without manual, problem-specific correction strategies.

Machine Translation • Translation

Paper

Rationalizing Neural Predictions

3 code implementations • EMNLP 2016 • Tao Lei, Regina Barzilay, Tommi Jaakkola

Our approach combines two modular components, generator and encoder, which are trained to operate well together.

Retrieval • Sentiment Analysis

353

Paper
Code

Ten Pairs to Tag -- Multilingual POS Tagging via Coarse Mapping between Embeddings

no code implementations • NAACL 2016 • Yuan Zhang, David Gaddy, Regina Barzilay, Tommi Jaakkola

Part-Of-Speech Tagging • POS • +3

Paper

Making Dependency Labeling Simple, Fast and Accurate

no code implementations • NAACL 2016 • Tianxiao Shen, Tao Lei, Regina Barzilay

Dependency Parsing • Representation Learning

Paper

Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

1 code implementation • EMNLP 2016 • Karthik Narasimhan, Adam Yala, Regina Barzilay

Most successful information extraction systems operate with access to a large collection of documents.

reinforcement-learning • Reinforcement Learning (RL)

232

Paper
Code

Semi-supervised Question Retrieval with Gated Convolutions

1 code implementation • NAACL 2016 • Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi Jaakkola, Katerina Tymoshenko, Alessandro Moschitti, Lluís Màrquez

Question answering forums are rapidly growing in size with no effective automated ability to refer to and reuse answers already available for previously posted questions.

Question Answering • Retrieval

353

Paper
Code

Hierarchical Low-Rank Tensors for Multilingual Transfer Parsing

no code implementations • EMNLP 2015 • Yuan Zhang, Regina Barzilay

Feature Engineering

Paper

Molding CNNs for text: non-linear, non-consecutive convolutions

2 code implementations • EMNLP 2015 • Tao Lei, Regina Barzilay, Tommi Jaakkola

Moreover, we extend the n-gram convolution to non-consecutive words to recognize patterns with intervening words.

General Classification • Sentiment Analysis • +1

353

Paper
Code

Machine Comprehension with Discourse Relations

no code implementations • IJCNLP 2015 • Karthik Narasimhan, Regina Barzilay

Question Answering • Reading Comprehension

Paper

Language Understanding for Text-based Games Using Deep Reinforcement Learning

3 code implementations • EMNLP 2015 • Karthik Narasimhan, Tejas Kulkarni, Regina Barzilay

We evaluate our approach on two game worlds, comparing against baselines using bag-of-words and bag-of-bigrams for state representations.

reinforcement-learning • Reinforcement Learning (RL) • +1

Paper
Code

High-Order Low-Rank Tensors for Semantic Role Labeling

no code implementations • HLT 2015 • Lluís Màrquez, Alessandro Moschitti, Regina Barzilay, Yuan Zhang, Tao Lei

Dependency Parsing • Dimensionality Reduction • +3

Paper

Randomized Greedy Inference for Joint Segmentation, POS Tagging and Dependency Parsing

no code implementations • HLT 2015 • Regina Barzilay, Kareem Darwish, Yuan Zhang, Chengtao Li

Dependency Parsing • POS • +2

Paper

An Unsupervised Method for Uncovering Morphological Chains

1 code implementation • TACL 2015 • Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola

In contrast, we propose a model for unsupervised morphological analysis that integrates orthographic and semantic views of words.

Morphological Analysis

Paper
Code

Morphological Segmentation for Keyword Spotting

no code implementations • EMNLP 2014 • Karthik Narasimhan, Damianos Karakos, Richard Schwartz, Stavros Tsakalidis, Regina Barzilay

Keyword Spotting • Language Modelling • +4

Paper

Greed is Good if Randomized: New Inference for Dependency Parsing

no code implementations • EMNLP 2014 • Yuan Zhang, Tao Lei, Regina Barzilay, Tommi Jaakkola

Dependency Parsing

Paper

Low-Rank Tensors for Scoring Dependency Structures

no code implementations • ACL 2014 • Tao Lei, Yu Xin, Yuan Zhang, Regina Barzilay, Tommi Jaakkola

Dependency Parsing • Part-Of-Speech Tagging

Paper

Steps to Excellence: Simple Inference with Refined Scoring of Dependency Trees

no code implementations • ACL 2014 • Yuan Zhang, Tao Lei, Regina Barzilay, Tommi Jaakkola, Amir Globerson

Dependency Parsing

Paper

Learning to Automatically Solve Algebra Word Problems

no code implementations • ACL 2014 • Nate Kushman, Yoav Artzi, Luke Zettlemoyer, Regina Barzilay

Ranked #8 on Math Word Problem Solving on ALG514

Math Word Problem Solving

Paper

Automatic Aggregation by Joint Modeling of Aspects and Values

no code implementations • 23 Jan 2014 • Christina Sauper, Regina Barzilay

We test our model on two tasks: joint aspect identification and sentiment analysis on a set of Yelp reviews, and aspect identification alone on a set of medical summaries.

Sentiment Analysis

Paper

Learning to Win by Reading Manuals in a Monte-Carlo Framework

no code implementations • 18 Jan 2014 • S. R. K. Branavan, David Silver, Regina Barzilay

In this paper, we present an approach to language grounding which automatically interprets text in the context of a complex control application, such as a game, and uses domain knowledge extracted from the text to improve control performance.

Paper

Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches

no code implementations • 15 Jan 2014 • Tahira Naseem, Benjamin Snyder, Jacob Eisenstein, Regina Barzilay

We demonstrate the effectiveness of multilingual learning for unsupervised part-of-speech tagging.

TAG • Unsupervised Part-Of-Speech Tagging • +1

Paper

Content Modeling Using Latent Permutations

no code implementations • 15 Jan 2014 • Harr Chen, S. R. K. Branavan, Regina Barzilay, David R. Karger

We present a novel Bayesian topic model for learning discourse-level document structure.

Paper

Learning Document-Level Semantic Properties from Free-Text Annotations

no code implementations • 15 Jan 2014 • S. R. K. Branavan, Harr Chen, Jacob Eisenstein, Regina Barzilay

The paraphrase structure is linked with a latent topic model of the review texts, enabling the system to predict the properties of unannotated documents and to effectively aggregate the semantic properties of multiple reviews.

Clustering