Search Results for author: Yoshua Bengio

Found 575 papers, 294 papers with code

Learning to Navigate in Synthetically Accessible Chemical Space Using Reinforcement Learning

1 code implementation ICML 2020 Sai Krishna Gottipati, Boris Sattarov, Sufeng Niu, Hao-Ran Wei, Yashaswi Pathak, Shengchao Liu, Simon Blackburn, Karam Thomas, Connor Coley, Jian Tang, Sarath Chandar, Yoshua Bengio

In this work, we propose a novel reinforcement learning (RL) setup for drug discovery that addresses this challenge by embedding the concept of synthetic accessibility directly into the de novo compound design system.

Drug Discovery Navigate +3

hBERT + BiasCorp - Fighting Racism on the Web

no code implementations EACL (LTEDI) 2021 Olawale Onabola, Zhuang Ma, Xie Yang, Benjamin Akera, Ibraheem Abdulrahman, Jia Xue, Dianbo Liu, Yoshua Bengio

In this work, we present hBERT, where we modify certain layers of the pretrained BERT model with the new Hopfield Layer.

Ant Colony Sampling with GFlowNets for Combinatorial Optimization

2 code implementations 11 Mar 2024 Minsu Kim, Sanghyeok Choi, Jiwoo Son, Hyeonah Kim, Jinkyoo Park, Yoshua Bengio

This paper introduces the Generative Flow Ant Colony Sampler (GFACS), a novel neural-guided meta-heuristic algorithm for combinatorial optimization.

Combinatorial Optimization

Machine learning and information theory concepts towards an AI Mathematician

no code implementations 7 Mar 2024 Yoshua Bengio, Nikolay Malkin

The current state-of-the-art in artificial intelligence is impressive, especially in terms of mastery of language, but not so much in terms of mathematical reasoning.

Mathematical Reasoning

Discrete Probabilistic Inference as Control in Multi-path Environments

1 code implementation 15 Feb 2024 Tristan Deleu, Padideh Nouri, Nikolay Malkin, Doina Precup, Yoshua Bengio

We consider the problem of sampling from a discrete and structured distribution as a sequential decision problem, where the objective is to find a stochastic policy such that objects are sampled at the end of this sequential process proportionally to some predefined reward.

Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

1 code implementation 9 Feb 2024 Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong

Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science.

Denoising Efficient Exploration

Efficient Causal Graph Discovery Using Large Language Models

1 code implementation 2 Feb 2024 Thomas Jiralerspong, Xiaoyin Chen, Yash More, Vedant Shah, Yoshua Bengio

We propose a novel framework that leverages LLMs for full causal graph discovery.

A Hitchhiker's Guide to Geometric GNNs for 3D Atomic Systems

1 code implementation 12 Dec 2023 Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein

In these graphs, the geometric attributes transform according to the inherent physical symmetries of 3D atomic systems, including rotations and translations in Euclidean space, as well as node permutations.

Protein Structure Prediction Specificity

Improving Gradient-guided Nested Sampling for Posterior Inference

1 code implementation 6 Dec 2023 Pablo Lemos, Nikolay Malkin, Will Handley, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur

We present a performant, general-purpose gradient-guided nested sampling algorithm, ${\tt GGNS}$, combining the state of the art in differentiable programming, Hamiltonian slice sampling, clustering, mode separation, dynamic nested sampling, and parallelization.

Clustering

Unlearning via Sparse Representations

no code implementations 26 Nov 2023 Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Mozer, Sanjeev Arora, Yoshua Bengio, Anirudh Goyal

Machine unlearning, which involves erasing knowledge about a "forget set" from a trained model, is often costly or infeasible with existing techniques.

Knowledge Distillation

Mitigating Biases with Diverse Ensembles and Diffusion Models

no code implementations 23 Nov 2023 Luca Scimeca, Alexander Rubinstein, Damien Teney, Seong Joon Oh, Armand Mihai Nicolicioiu, Yoshua Bengio

Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to a phenomenon known as shortcut learning, where a model relies on erroneous, easy-to-learn cues while ignoring reliable ones.

SatBird: Bird Species Distribution Modeling with Remote Sensing and Citizen Science Data

1 code implementation 2 Nov 2023 Mélisande Teng, Amna Elmustafa, Benjamin Akera, Yoshua Bengio, Hager Radi Abdelwahed, Hugo Larochelle, David Rolnick

The wide availability of remote sensing data and the growing adoption of citizen science tools to collect species observations data at low cost offer an opportunity for improving biodiversity monitoring and enabling the modelling of complex ecosystems.

Object-centric architectures enable efficient causal representation learning

1 code implementation 29 Oct 2023 Amin Mansouri, Jason Hartford, Yan Zhang, Yoshua Bengio

Causal representation learning has demonstrated a variety of settings in which latent variables can be disentangled with identifiability guarantees (up to some reasonable equivalence class).

Disentanglement Object

Causal machine learning for single-cell genomics

no code implementations 23 Oct 2023 Alejandro Tejada-Lapuerta, Paul Bertin, Stefan Bauer, Hananeh Aliee, Yoshua Bengio, Fabian J. Theis

Advances in single-cell omics allow for unprecedented insights into the transcription profiles of individual cells.

Experimental Design

Towards equilibrium molecular conformation generation with GFlowNets

no code implementations 20 Oct 2023 Alexandra Volokhova, Michał Koziarski, Alex Hernández-García, Cheng-Hao Liu, Santiago Miret, Pablo Lemos, Luca Thiede, Zichao Yan, Alán Aspuru-Guzik, Yoshua Bengio

Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule.

A cry for help: Early detection of brain injury in newborns

no code implementations 12 Oct 2023 Charles C. Onu, Samantha Latremouille, Arsenii Gorin, Junhao Wang, Innocent Udeogu, Uchenna Ekwochi, Peter O. Ubuane, Omolara A. Kehinde, Muhammad A. Salisu, Datonye Briggs, Yoshua Bengio, Doina Precup

Since the 1960s, neonatal clinicians have known that newborns suffering from certain neurological conditions exhibit altered crying patterns such as the high-pitched cry in birth asphyxia.

Specificity

PhyloGFN: Phylogenetic inference with generative flow networks

1 code implementation 12 Oct 2023 Mingyang Zhou, Zichao Yan, Elliot Layne, Nikolay Malkin, Dinghuai Zhang, Moksh Jain, Mathieu Blanchette, Yoshua Bengio

Phylogenetics is a branch of computational biology that studies the evolutionary relationships among biological entities.

Variational Inference

On the importance of catalyst-adsorbate 3D interactions for relaxed energy predictions

no code implementations 10 Oct 2023 Alvaro Carbonero, Alexandre Duval, Victor Schmidt, Santiago Miret, Alex Hernandez-Garcia, Yoshua Bengio, David Rolnick

The use of machine learning for material property prediction and discovery has traditionally centered on graph neural networks that incorporate the geometric configuration of all atoms.

Property Prediction

Amortizing intractable inference in large language models

1 code implementation 6 Oct 2023 Edward J. Hu, Moksh Jain, Eric Elmoznino, Younesse Kaddar, Guillaume Lajoie, Yoshua Bengio, Nikolay Malkin

Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions.
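
The phrase "next-token conditional distributions" refers to the autoregressive chain rule, log p(x) = Σ_t log p(x_t | x_<t). As a toy illustration (the two-symbol vocabulary and all probabilities below are invented, and the context is truncated to one token):

```python
import math

# Toy next-token model: p(next | previous token) for a 2-symbol vocabulary.
cond = {
    ("a",): {"a": 0.9, "b": 0.1},
    ("b",): {"a": 0.5, "b": 0.5},
}
start = {"a": 0.6, "b": 0.4}  # distribution of the first token

def sequence_logprob(seq):
    """Autoregressive factorization: log p(x) = sum_t log p(x_t | x_<t).
    Here the context is just the previous token (a Markov simplification)."""
    lp = math.log(start[seq[0]])
    for prev, tok in zip(seq, seq[1:]):
        lp += math.log(cond[(prev,)][tok])
    return lp
```

For example, p("aab") factorizes as p(a) p(a|a) p(b|a) = 0.6 x 0.9 x 0.1.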

Bayesian Inference

Pre-Training and Fine-Tuning Generative Flow Networks

no code implementations 5 Oct 2023 Ling Pan, Moksh Jain, Kanika Madan, Yoshua Bengio

However, as they are typically trained from a given extrinsic reward function, how to leverage the power of pre-training and train GFlowNets in an unsupervised fashion for efficient adaptation to downstream tasks remains an important open challenge.

Unsupervised Pre-training

Causal Inference in Gene Regulatory Networks with GFlowNet: Towards Scalability in Large Systems

no code implementations 5 Oct 2023 Trang Nguyen, Alexander Tong, Kanika Madan, Yoshua Bengio, Dianbo Liu

Understanding causal relationships within Gene Regulatory Networks (GRNs) is essential for unraveling the gene interactions in cellular processes.

Causal Discovery Causal Inference

Local Search GFlowNets

2 code implementations 4 Oct 2023 Minsu Kim, Taeyoung Yun, Emmanuel Bengio, Dinghuai Zhang, Yoshua Bengio, Sungsoo Ahn, Jinkyoo Park

Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards.
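
For intuition about the target distribution described here: a trained GFlowNet samples each object x with probability R(x)/Z, where Z sums the rewards. GFlowNets amortize this with a learned sequential policy; on a toy space small enough to enumerate, the same target can be sampled exactly (the object set and reward below are made up for illustration):

```python
import random

def sample_proportional(objects, reward, rng):
    """Sample an object with probability proportional to its reward, i.e.
    the target distribution a GFlowNet is trained to match (tractable
    only because this toy space is enumerable)."""
    weights = [reward(x) for x in objects]
    z = sum(weights)            # partition function
    r = rng.random() * z
    acc = 0.0
    for x, w in zip(objects, weights):
        acc += w
        if r <= acc:
            return x
    return objects[-1]

# Toy discrete objects: bit-strings of length 3; reward = (#ones + 1).
objects = [format(i, "03b") for i in range(8)]
reward = lambda s: s.count("1") + 1.0   # Z = 20 for this space

rng = random.Random(0)
counts = {x: 0 for x in objects}
for _ in range(20000):
    counts[sample_proportional(objects, reward, rng)] += 1
```

Empirically, "111" (reward 4) should appear about 4/20 = 20% of the time and "000" (reward 1) about 5%.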

Learning to Scale Logits for Temperature-Conditional GFlowNets

1 code implementation 4 Oct 2023 Minsu Kim, Joohwan Ko, Taeyoung Yun, Dinghuai Zhang, Ling Pan, Woochang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio

We find that the challenge is greatly reduced if a learned function of the temperature is used to scale the policy's logits directly.

Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization

2 code implementations 4 Oct 2023 Dinghuai Zhang, Ricky T. Q. Chen, Cheng-Hao Liu, Aaron Courville, Yoshua Bengio

We tackle the problem of sampling from intractable high-dimensional density functions, a fundamental task that often appears in machine learning and statistics.

Discrete, compositional, and symbolic representations through attractor dynamics

no code implementations 3 Oct 2023 Andrew Nam, Eric Elmoznino, Nikolay Malkin, Chen Sun, Yoshua Bengio, Guillaume Lajoie

Compositionality is an important feature of discrete symbolic systems, such as language and programs, as it enables them to have infinite capacity despite a finite symbol set.

Quantization

Delta-AI: Local objectives for amortized inference in sparse graphical models

1 code implementation 3 Oct 2023 Jean-Pierre Falet, Hae Beom Lee, Nikolay Malkin, Chen Sun, Dragos Secrieru, Thomas Jiralerspong, Dinghuai Zhang, Guillaume Lajoie, Yoshua Bengio

We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $\Delta$-amortized inference ($\Delta$-AI).

Leveraging Diffusion Disentangled Representations to Mitigate Shortcuts in Underspecified Visual Tasks

no code implementations 3 Oct 2023 Luca Scimeca, Alexander Rubinstein, Armand Mihai Nicolicioiu, Damien Teney, Yoshua Bengio

Spurious correlations in the data, where multiple cues are predictive of the target labels, often lead to shortcut learning phenomena, where a model may rely on erroneous, easy-to-learn cues while ignoring reliable ones.

Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning

1 code implementation 30 Sep 2023 Mingde Zhao, Safa Alver, Harm van Seijen, Romain Laroche, Doina Precup, Yoshua Bengio

Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations.

Decision Making Model-based Reinforcement Learning +2

Tree Cross Attention

1 code implementation 29 Sep 2023 Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

In this work, we propose Tree Cross Attention (TCA) - a module based on Cross Attention that only retrieves information from a logarithmic $\mathcal{O}(\log(N))$ number of tokens for performing inference.
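
Setting the paper's specific mechanism aside, the O(log N) claim is easy to picture: if the N tokens are organized as a binary tree of summaries and retrieval descends one child per level, only about log2(N) nodes are visited instead of all N. A schematic sketch, not TCA itself (scalar "tokens" and mean summaries are illustrative simplifications):

```python
import math

def build_tree(tokens):
    """Binary tree over tokens; each internal node stores the mean of its
    two children's summaries as a crude subtree summary."""
    nodes = [(t, None, None) for t in tokens]   # (summary, left, right)
    while len(nodes) > 1:
        nxt = []
        for i in range(0, len(nodes) - 1, 2):
            l, r = nodes[i], nodes[i + 1]
            nxt.append(((l[0] + r[0]) / 2.0, l, r))
        if len(nodes) % 2:                      # odd leftover carries up
            nxt.append(nodes[-1])
        nodes = nxt
    return nodes[0]

def retrieve(root, query):
    """Descend toward the child whose summary is closest to the query,
    visiting O(log N) nodes rather than attending over all N tokens."""
    visited, node = 0, root
    while node[1] is not None:                  # until a leaf is reached
        visited += 1
        l, r = node[1], node[2]
        node = l if abs(l[0] - query) <= abs(r[0] - query) else r
    return node[0], visited + 1                 # leaf token, nodes touched

tokens = list(range(1024))
root = build_tree(tokens)
token, visited = retrieve(root, 700.0)
```

With 1024 sorted scalar tokens the descent behaves like binary search, touching 11 nodes (log2(1024) internal levels plus the leaf).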

Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation

no code implementations 11 Jul 2023 Chris Chinenye Emezue, Alexandre Drouin, Tristan Deleu, Stefan Bauer, Yoshua Bengio

Nevertheless, a notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference.

Benchmarking Causal Discovery +2

AI For Global Climate Cooperation 2023 Competition Proceedings

no code implementations 10 Jul 2023 Yoshua Bengio, Prateek Gupta, Lu Li, Soham Phade, Sunil Srinivasa, Andrew Williams, Tianyu Zhang, Yang Zhang, Stephan Zheng

On the other hand, an interdisciplinary panel of human experts in law, policy, sociology, economics and environmental science, evaluated the solutions qualitatively.

Decision Making Ethics +1

Simulation-free Schrödinger bridges via score and flow matching

1 code implementation 7 Jul 2023 Alexander Tong, Nikolay Malkin, Kilian Fatras, Lazar Atanackovic, Yanlei Zhang, Guillaume Huguet, Guy Wolf, Yoshua Bengio

We present simulation-free score and flow matching ([SF]$^2$M), a simulation-free objective for inferring stochastic dynamics given unpaired samples drawn from arbitrary source and target distributions.

Generative Flow Networks: a Markov Chain Perspective

no code implementations 4 Jul 2023 Tristan Deleu, Yoshua Bengio

While Markov chain Monte Carlo methods (MCMC) provide a general framework to sample from a probability distribution defined up to normalization, they often suffer from slow convergence to the target distribution when the latter is highly multi-modal.

Decision Making

Thompson sampling for improved exploration in GFlowNets

no code implementations 30 Jun 2023 Jarrid Rector-Brooks, Kanika Madan, Moksh Jain, Maksym Korablyov, Cheng-Hao Liu, Sarath Chandar, Nikolay Malkin, Yoshua Bengio

Generative flow networks (GFlowNets) are amortized variational inference algorithms that treat sampling from a distribution over compositional objects as a sequential decision-making problem with a learnable action policy.

Active Learning Decision Making +3

HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution

2 code implementations NeurIPS 2023 Eric Nguyen, Michael Poli, Marjan Faizi, Armin Thomas, Callum Birch-Sykes, Michael Wornow, Aman Patel, Clayton Rabideau, Stefano Massaroli, Yoshua Bengio, Stefano Ermon, Stephen A. Baccus, Chris Ré

Leveraging Hyena's new long-range capabilities, we present HyenaDNA, a genomic foundation model pretrained on the human reference genome with context lengths of up to 1 million tokens at the single-nucleotide level - an up to 500x increase over previous dense attention-based models.

4k In-Context Learning +2

BatchGFN: Generative Flow Networks for Batch Active Learning

1 code implementation 26 Jun 2023 Shreshth A. Malik, Salem Lahlou, Andrew Jesson, Moksh Jain, Nikolay Malkin, Tristan Deleu, Yoshua Bengio, Yarin Gal

We introduce BatchGFN -- a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.

Active Learning

Constant Memory Attention Block

no code implementations 21 Jun 2023 Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

Modern foundation model architectures rely on attention mechanisms to effectively capture context.

Point Processes

Multi-Fidelity Active Learning with GFlowNets

2 code implementations 20 Jun 2023 Alex Hernandez-Garcia, Nikita Saxena, Moksh Jain, Cheng-Hao Liu, Yoshua Bengio

For example, in scientific discovery, we are often faced with the problem of exploring very large, high-dimensional spaces, where querying a high-fidelity black-box objective function is very expensive.

Active Learning

GEO-Bench: Toward Foundation Models for Earth Monitoring

1 code implementation NeurIPS 2023 Alexandre Lacoste, Nils Lehmann, Pau Rodriguez, Evan David Sherwin, Hannah Kerner, Björn Lütjens, Jeremy Andrew Irvin, David Dao, Hamed Alemohammad, Alexandre Drouin, Mehmet Gunturkun, Gabriel Huang, David Vazquez, Dava Newman, Yoshua Bengio, Stefano Ermon, Xiao Xiang Zhu

Recent progress in self-supervision has shown that pre-training large neural networks on vast amounts of unsupervised data can lead to substantial increases in generalization to downstream tasks.

Cycle Consistency Driven Object Discovery

no code implementations 3 Jun 2023 Aniket Didolkar, Anirudh Goyal, Yoshua Bengio

To tackle the second limitation, we apply the learned object-centric representations from the proposed method to two downstream reinforcement learning tasks, demonstrating considerable performance enhancements compared to conventional slot-based and monolithic representation learning methods.

Object Object Discovery +2

Attention Schema in Neural Agents

no code implementations 27 May 2023 Dianbo Liu, Samuele Bolotta, He Zhu, Yoshua Bengio, Guillaume Dumas

A strong prediction of this theory is that an agent can use its own AS to also infer the states of other agents' attention and consequently enhance coordination with other agents.

Descriptive Multi-agent Reinforcement Learning

Let the Flows Tell: Solving Graph Combinatorial Optimization Problems with GFlowNets

1 code implementation 26 May 2023 Dinghuai Zhang, Hanjun Dai, Nikolay Malkin, Aaron Courville, Yoshua Bengio, Ling Pan

In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose to train conditional GFlowNets to sample from the solution space.

Combinatorial Optimization

torchgfn: A PyTorch GFlowNet library

2 code implementations 24 May 2023 Salem Lahlou, Joseph D. Viviano, Victor Schmidt, Yoshua Bengio

The growing popularity of generative flow networks (GFlowNets or GFNs) from a range of researchers with diverse backgrounds and areas of expertise necessitates a library which facilitates the testing of new features such as training losses that can be easily compared to standard benchmark implementations, or on a set of common environments.

Memory Efficient Neural Processes via Constant Memory Attention Block

no code implementations 23 May 2023 Leo Feng, Frederick Tung, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

Neural Processes (NPs) are popular meta-learning methods for efficiently modelling predictive uncertainty.

Meta-Learning

FAENet: Frame Averaging Equivariant GNN for Materials Modeling

1 code implementation 28 Apr 2023 Alexandre Duval, Victor Schmidt, Alex Hernandez Garcia, Santiago Miret, Fragkiskos D. Malliaros, Yoshua Bengio, David Rolnick

Applications of machine learning techniques for materials modeling typically involve functions known to be equivariant or invariant to specific symmetries.

Hyena Hierarchy: Towards Larger Convolutional Language Models

5 code implementations 21 Feb 2023 Michael Poli, Stefano Massaroli, Eric Nguyen, Daniel Y. Fu, Tri Dao, Stephen Baccus, Yoshua Bengio, Stefano Ermon, Christopher Ré

Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale.

2k 8k +2

Stochastic Generative Flow Networks

1 code implementation 19 Feb 2023 Ling Pan, Dinghuai Zhang, Moksh Jain, Longbo Huang, Yoshua Bengio

Generative Flow Networks (or GFlowNets for short) are a family of probabilistic agents that learn to sample complex combinatorial structures through the lens of "inference as control".

GFlowNet-EM for learning compositional latent variable models

1 code implementation 13 Feb 2023 Edward J. Hu, Nikolay Malkin, Moksh Jain, Katie Everett, Alexandros Graikos, Yoshua Bengio

Latent variable models (LVMs) with discrete compositional latents are an important but challenging setting due to a combinatorially large number of possible configurations of the latents.

Variational Inference

Sources of Richness and Ineffability for Phenomenally Conscious States

no code implementations 13 Feb 2023 Xu Ji, Eric Elmoznino, George Deane, Axel Constant, Guillaume Dumas, Guillaume Lajoie, Jonathan Simon, Yoshua Bengio

Conscious states (states that there is something it is like to be in) seem both rich or full of detail, and ineffable or hard to fully describe or recall.

Philosophy

Distributional GFlowNets with Quantile Flows

1 code implementation 11 Feb 2023 Dinghuai Zhang, Ling Pan, Ricky T. Q. Chen, Aaron Courville, Yoshua Bengio

Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a stochastic policy for generating complex combinatorial structure through a series of decision-making steps.

Decision Making

DynGFN: Towards Bayesian Inference of Gene Regulatory Networks with GFlowNets

1 code implementation NeurIPS 2023 Lazar Atanackovic, Alexander Tong, Bo Wang, Leo J. Lee, Yoshua Bengio, Jason Hartford

In this paper we leverage the fact that it is possible to estimate the "velocity" of gene expression with RNA velocity techniques to develop an approach that addresses both challenges.

Bayesian Inference Causal Discovery

Better Training of GFlowNets with Local Credit and Incomplete Trajectories

2 code implementations 3 Feb 2023 Ling Pan, Nikolay Malkin, Dinghuai Zhang, Yoshua Bengio

Generative Flow Networks or GFlowNets are related to Monte-Carlo Markov chain methods (as they sample from a distribution specified by an energy function), reinforcement learning (as they learn a policy to sample composed objects through a sequence of steps), generative models (as they learn to represent and sample from a distribution) and amortized variational methods (as they can be used to learn to approximate and sample from an otherwise intractable posterior, given a prior and a likelihood).

Improving and generalizing flow-based generative models with minibatch optimal transport

2 code implementations 1 Feb 2023 Alexander Tong, Kilian Fatras, Nikolay Malkin, Guillaume Huguet, Yanlei Zhang, Jarrid Rector-Brooks, Guy Wolf, Yoshua Bengio

CFM features a stable regression objective like that used to train the stochastic flow in diffusion models but enjoys the efficient inference of deterministic flow models.
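
The "stable regression objective" can be summarized, for the simplest (independent) coupling without minibatch optimal transport: draw a source sample x0, a target sample x1, and a time t, then regress the learned vector field v_theta(t, x_t) onto the straight-line velocity x1 - x0 at the interpolant x_t = (1-t) x0 + t x1. A minimal 1-D sketch under those assumptions (function names are mine):

```python
import random

def cfm_training_example(x0_sampler, x1_sampler, rng):
    """One conditional-flow-matching training tuple: the model would be
    trained with loss (v_theta(t, xt) - target_velocity)**2."""
    x0 = x0_sampler(rng)           # sample from the source distribution
    x1 = x1_sampler(rng)           # sample from the target distribution
    t = rng.random()               # uniform time in [0, 1)
    xt = (1 - t) * x0 + t * x1     # point on the straight-line path
    target_velocity = x1 - x0      # constant velocity of that path
    return t, xt, target_velocity

rng = random.Random(0)
t, xt, v = cfm_training_example(lambda r: 0.0, lambda r: 1.0, rng)
```

With degenerate samplers at 0 and 1, the interpolant is simply x_t = t and the regression target is the constant velocity 1.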

GFlowNets for AI-Driven Scientific Discovery

no code implementations 1 Feb 2023 Moksh Jain, Tristan Deleu, Jason Hartford, Cheng-Hao Liu, Alex Hernandez-Garcia, Yoshua Bengio

However, in order to truly leverage large-scale data sets and high-throughput experimental setups, machine learning methods will need to be further improved and better integrated in the scientific discovery pipeline.

Efficient Exploration Experimental Design

A theory of continuous generative flow networks

1 code implementation 30 Jan 2023 Salem Lahlou, Tristan Deleu, Pablo Lemos, Dinghuai Zhang, Alexandra Volokhova, Alex Hernández-García, Léna Néhale Ezzine, Yoshua Bengio, Nikolay Malkin

Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions over compositional objects.

Variational Inference

Leveraging the Third Dimension in Contrastive Learning

no code implementations 27 Jan 2023 Sumukh Aithal, Anirudh Goyal, Alex Lamb, Yoshua Bengio, Michael Mozer

We evaluate these two approaches on three different SSL methods -- BYOL, SimSiam, and SwAV -- using ImageNette (10 class subset of ImageNet), ImageNet-100 and ImageNet-1k datasets.

Contrastive Learning Depth Estimation +2

Regeneration Learning: A Learning Paradigm for Data Generation

no code implementations 21 Jan 2023 Xu Tan, Tao Qin, Jiang Bian, Tie-Yan Liu, Yoshua Bengio

Regeneration learning extends the concept of representation learning to data generation tasks, and can be regarded as a counterpart of traditional representation learning, since 1) regeneration learning handles the abstraction (Y') of the target data Y for data generation while traditional representation learning handles the abstraction (X') of source data X for data understanding; 2) both the processes of Y'-->Y in regeneration learning and X-->X' in representation learning can be learned in a self-supervised way (e.g., pre-training); 3) both the mappings from X to Y' in regeneration learning and from X' to Y in representation learning are simpler than the direct mapping from X to Y.

Image Generation Representation Learning +6

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

1 code implementation 27 Dec 2022 Yingtian Zou, Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels.
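
The vanilla mixup operation that MixupE builds on is just a convex combination of two inputs and their (one-hot) labels, with the mixing coefficient drawn from a Beta distribution. A minimal sketch of that baseline (the paper's directional-derivative improvement is not shown):

```python
import random

def mixup(x1, y1, x2, y2, rng, alpha=0.2):
    """Vanilla mixup: blend two examples with lam ~ Beta(alpha, alpha).
    x* are feature vectors, y* are one-hot label vectors."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

rng = random.Random(1)
x, y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng)
```

For these two one-hot examples the mixed label is exactly [lam, 1 - lam], so it still sums to 1.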

Data Augmentation

PhAST: Physics-Aware, Scalable, and Task-specific GNNs for Accelerated Catalyst Design

2 code implementations 22 Nov 2022 Alexandre Duval, Victor Schmidt, Santiago Miret, Yoshua Bengio, Alex Hernández-García, David Rolnick

Catalyst materials play a crucial role in the electrochemical reactions involved in numerous industrial processes key to this transition, such as renewable energy storage and electrofuel synthesis.

Computational Efficiency

Latent Bottlenecked Attentive Neural Processes

1 code implementation 15 Nov 2022 Leo Feng, Hossein Hajimirsadeghi, Yoshua Bengio, Mohamed Osama Ahmed

We demonstrate that LBANPs can trade-off the computational cost and performance according to the number of latent vectors.

Meta-Learning Multi-Armed Bandits

Equivariance with Learned Canonicalization Functions

no code implementations 11 Nov 2022 Sékou-Oumar Kaba, Arnab Kumar Mondal, Yan Zhang, Yoshua Bengio, Siamak Ravanbakhsh

Symmetry-based neural networks often constrain the architecture in order to achieve invariance or equivariance to a group of transformations.

Image Classification Point Cloud Classification

Posterior samples of source galaxies in strong gravitational lenses with score-based priors

no code implementations 7 Nov 2022 Alexandre Adam, Adam Coogan, Nikolay Malkin, Ronan Legin, Laurence Perreault-Levasseur, Yashar Hezaveh, Yoshua Bengio

Inferring accurate posteriors for high-dimensional representations of the brightness of gravitationally-lensed sources is a major challenge, in part due to the difficulties of accurately quantifying the priors.

A General Purpose Neural Architecture for Geospatial Systems

no code implementations 4 Nov 2022 Nasim Rahaman, Martin Weiss, Frederik Träuble, Francesco Locatello, Alexandre Lacoste, Yoshua Bengio, Chris Pal, Li Erran Li, Bernhard Schölkopf

Geospatial Information Systems are used by researchers and Humanitarian Assistance and Disaster Response (HADR) practitioners to support a wide variety of important applications.

Disaster Response Earth Observation +2

Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes

no code implementations 4 Nov 2022 Mizu Nishikawa-Toomey, Tristan Deleu, Jithendaraa Subramanian, Yoshua Bengio, Laurent Charlin

We extend the method of Bayesian causal structure learning using GFlowNets to learn not only the posterior distribution over the structure, but also the parameters of a linear-Gaussian model.

Consistent Training via Energy-Based GFlowNets for Modeling Discrete Joint Distributions

no code implementations 1 Nov 2022 Chanakya Ekbote, Moksh Jain, Payel Das, Yoshua Bengio

We hypothesize that this can lead to incompatibility between the inductive optimization biases in training $R$ and in training the GFlowNet, potentially leading to worse samples and slow adaptation to changes in the distribution.

Active Learning

Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning

no code implementations 1 Nov 2022 Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes

Goal-conditioned reinforcement learning (RL) is a promising direction for training agents that are capable of solving multiple tasks and reach a diverse set of objectives.

reinforcement-learning Reinforcement Learning (RL)

FL Games: A Federated Learning Framework for Distribution Shifts

no code implementations 31 Oct 2022 Sharut Gupta, Kartik Ahuja, Mohammad Havaei, Niladri Chatterjee, Yoshua Bengio

Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.

Federated Learning

GFlowOut: Dropout with Generative Flow Networks

no code implementations 24 Oct 2022 Dianbo Liu, Moksh Jain, Bonaventure Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio

These methods face two important challenges: (a) the posterior distribution over masks can be highly multi-modal which can be difficult to approximate with standard variational inference and (b) it is not trivial to fully utilize sample-dependent information and correlation among dropout masks to improve posterior estimation.

Bayesian Inference Variational Inference

Neural Attentive Circuits

no code implementations 14 Oct 2022 Nasim Rahaman, Martin Weiss, Francesco Locatello, Chris Pal, Yoshua Bengio, Bernhard Schölkopf, Li Erran Li, Nicolas Ballas

Recent work has seen the development of general purpose neural architectures that can be trained to perform tasks across diverse data modalities.

Point Cloud Classification text-classification +1

Contrastive Retrospection: honing in on critical steps for rapid learning and generalization in RL

1 code implementation NeurIPS 2023 Chen Sun, Wannan Yang, Thomas Jiralerspong, Dane Malenfant, Benjamin Alsbury-Nealy, Yoshua Bengio, Blake Richards

Distinct from other contemporary RL approaches to credit assignment, ConSpec takes advantage of the fact that it is easier to retrospectively identify the small set of steps that success is contingent upon (while ignoring other states) than it is to prospectively predict reward at every step taken.

Contrastive Learning Out-of-Distribution Generalization +1

MAgNet: Mesh Agnostic Neural PDE Solver

1 code implementation 11 Oct 2022 Oussama Boussif, Dan Assouline, Loubna Benabbou, Yoshua Bengio

The computational complexity of classical numerical methods for solving Partial Differential Equations (PDE) scales significantly as the resolution increases.

Zero-shot Generalization

Generative Augmented Flow Networks

no code implementations 7 Oct 2022 Ling Pan, Dinghuai Zhang, Aaron Courville, Longbo Huang, Yoshua Bengio

We specify intermediate rewards by intrinsic motivation to tackle the exploration problem in sparse reward environments.

Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning

2 code implementations 4 Oct 2022 Dianbo Liu, Vedant Shah, Oussama Boussif, Cristian Meo, Anirudh Goyal, Tianmin Shu, Michael Mozer, Nicolas Heess, Yoshua Bengio

We formalize the notions of coordination level and heterogeneity level of an environment and present HECOGrid, a suite of multi-agent RL environments that facilitates empirical evaluation of different MARL approaches across different levels of coordination and environmental heterogeneity by providing a quantitative control over coordination and heterogeneity levels of the environment.

Multi-agent Reinforcement Learning reinforcement-learning +1

Latent State Marginalization as a Low-cost Approach for Improving Exploration

1 code implementation 3 Oct 2022 Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen

While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity.

Continuous Control Reinforcement Learning (RL) +1

GFlowNets and variational inference

1 code implementation 2 Oct 2022 Nikolay Malkin, Salem Lahlou, Tristan Deleu, Xu Ji, Edward Hu, Katie Everett, Dinghuai Zhang, Yoshua Bengio

This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and generative flow networks (GFlowNets), which have been used for distributions over discrete structures such as graphs.

Reinforcement Learning (RL) Variational Inference

Learning GFlowNets from partial episodes for improved convergence and stability

3 code implementations26 Sep 2022 Kanika Madan, Jarrid Rector-Brooks, Maksym Korablyov, Emmanuel Bengio, Moksh Jain, Andrei Nica, Tom Bosc, Yoshua Bengio, Nikolay Malkin

Generative flow networks (GFlowNets) are a family of algorithms for training a sequential sampler of discrete objects under an unnormalized target density and have been successfully used for various probabilistic modeling tasks.

Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection

no code implementations18 Sep 2022 Bonaventure F. P. Dossou, Dianbo Liu, Xu Ji, Moksh Jain, Almer M. van der Sloot, Roger Palou, Michael Tyers, Yoshua Bengio

As antibiotic-resistant bacterial strains are rapidly spreading worldwide, infections caused by these strains are emerging as a global crisis causing the death of millions of people every year.

Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization

no code implementations13 Sep 2022 Leo Feng, Padideh Nouri, Aneri Muni, Yoshua Bengio, Pierre-Luc Bacon

The problem can be framed as a global optimization problem where the objective is an expensive black-box function that can be queried in large batches, but only over a small number of rounds.


Bayesian Optimization Meta-Learning +3

Unifying Generative Models with GFlowNets and Beyond

no code implementations6 Sep 2022 Dinghuai Zhang, Ricky T. Q. Chen, Nikolay Malkin, Yoshua Bengio

Our framework provides a means for unifying training and inference algorithms, and provides a route to shine a unifying light over many generative models.

Decision Making

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

2 code implementations15 Aug 2022 Tianyu Zhang, Andrew Williams, Soham Phade, Sunil Srinivasa, Yang Zhang, Prateek Gupta, Yoshua Bengio, Stephan Zheng

To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks.

Ethics Multi-agent Reinforcement Learning

Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning

no code implementations10 Aug 2022 Siba Moussa, Michael Kilgour, Clara Jans, Alex Hernandez-Garcia, Miroslava Cuperlovic-Culf, Yoshua Bengio, Lena Simine

Inverse design of short single-stranded RNA and DNA sequences (aptamers) is the task of finding sequences that satisfy a set of desired criteria.

Lookback for Learning to Branch

no code implementations30 Jun 2022 Prateek Gupta, Elias B. Khalil, Didier Chételat, Maxime Gasse, Yoshua Bengio, Andrea Lodi, M. Pawan Kumar

Given that B&B results in a tree of sub-MILPs, we ask (a) whether there are strong dependencies exhibited by the target heuristic among the neighboring nodes of the B&B tree, and (b) if so, whether we can incorporate them in our training procedure.

Model Selection Variable Selection

On the Generalization and Adaption Performance of Causal Models

no code implementations9 Jun 2022 Nino Scherrer, Anirudh Goyal, Stefan Bauer, Yoshua Bengio, Nan Rosemary Ke

Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes and offer robust generalization.

Causal Discovery Out-of-Distribution Generalization

On Neural Architecture Inductive Biases for Relational Tasks

1 code implementation9 Jun 2022 Giancarlo Kerg, Sarthak Mittal, David Rolnick, Yoshua Bengio, Blake Richards, Guillaume Lajoie

Recent work has explored how forcing relational representations to remain distinct from sensory representations, as it seems to be the case in the brain, can help artificial systems.

Inductive Bias Out-of-Distribution Generalization

Building Robust Ensembles via Margin Boosting

1 code implementation7 Jun 2022 Dinghuai Zhang, Hongyang Zhang, Aaron Courville, Yoshua Bengio, Pradeep Ravikumar, Arun Sai Suggala

Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks.

Adversarial Robustness

Is a Modular Architecture Enough?

1 code implementation6 Jun 2022 Sarthak Mittal, Yoshua Bengio, Guillaume Lajoie

Inspired from human cognition, machine learning systems are gradually revealing advantages of sparser and more modular architectures.

Out-of-Distribution Generalization

Weakly Supervised Representation Learning with Sparse Perturbations

1 code implementation2 Jun 2022 Kartik Ahuja, Jason Hartford, Yoshua Bengio

We show that if the perturbations are applied only on mutually exclusive blocks of latents, we identify the latents up to those blocks.

Representation Learning

Agnostic Physics-Driven Deep Learning

no code implementations30 May 2022 Benjamin Scellier, Siddhartha Mishra, Yoshua Bengio, Yann Ollivier

This work establishes that a physical system can perform statistical learning without gradient computations, via an Agnostic Equilibrium Propagation (Aeqprop) procedure that combines energy minimization, homeostatic control, and nudging towards the correct response.

Temporal Latent Bottleneck: Synthesis of Fast and Slow Processing Mechanisms in Sequence Learning

2 code implementations30 May 2022 Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio

A slow stream that is recurrent in nature aims to learn a specialized and compressed representation, by forcing chunks of $K$ time steps into a single representation which is divided into multiple vectors.
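The slow-stream compression described above can be sketched as chunking a sequence into blocks of $K$ steps, pooling each chunk into one representation, and splitting that representation into multiple vectors. Mean-pooling below stands in for the paper's learned recurrence; shapes and names are illustrative:

```python
import numpy as np

def compress_chunks(x, K, n_vectors=2):
    """Toy slow-stream compression: split a (T, d) sequence into
    chunks of K time steps, pool each chunk into a single
    representation, then divide it into `n_vectors` vectors."""
    T, d = x.shape
    assert T % K == 0 and d % n_vectors == 0
    chunks = x.reshape(T // K, K, d)
    pooled = chunks.mean(axis=1)  # (T/K, d): one representation per chunk
    return pooled.reshape(T // K, n_vectors, d // n_vectors)

x = np.arange(24, dtype=float).reshape(6, 4)  # T=6 steps, d=4 features
z = compress_chunks(x, K=3, n_vectors=2)      # 2 chunks, 2 vectors each
```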

Decision Making Inductive Bias

FL Games: A federated learning framework for distribution shifts

no code implementations23 May 2022 Sharut Gupta, Kartik Ahuja, Mohammad Havaei, Niladri Chatterjee, Yoshua Bengio

Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.

Federated Learning

FedILC: Weighted Geometric Mean and Invariant Gradient Covariance for Federated Learning on Non-IID Data

1 code implementation19 May 2022 Mike He Zhu, Léna Néhale Ezzine, Dianbo Liu, Yoshua Bengio

Federated learning is a distributed machine learning approach which enables a shared server model to learn by aggregating the locally-computed parameter updates with the training data from spatially-distributed client silos.

Federated Learning

A Highly Adaptive Acoustic Model for Accurate Multi-Dialect Speech Recognition

no code implementations6 May 2022 Sanghyun Yoo, Inchul Song, Yoshua Bengio

In this paper, we propose a novel acoustic modeling technique for accurate multi-dialect speech recognition with a single AM.

speech-recognition Speech Recognition

Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL

no code implementations21 Mar 2022 Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio

In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward shaping.

Continuous Control Contrastive Learning +1

Continuous-Time Meta-Learning with Forward Mode Differentiation

no code implementations ICLR 2022 Tristan Deleu, David Kanaa, Leo Feng, Giancarlo Kerg, Yoshua Bengio, Guillaume Lajoie, Pierre-Luc Bacon

Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.

Few-Shot Image Classification Meta-Learning

Biological Sequence Design with GFlowNets

1 code implementation2 Mar 2022 Moksh Jain, Emmanuel Bengio, Alex Hernandez-Garcia, Jarrid Rector-Brooks, Bonaventure F. P. Dossou, Chanakya Ekbote, Jie Fu, Tianyu Zhang, Michael Kilgour, Dinghuai Zhang, Lena Simine, Payel Das, Yoshua Bengio

In this work, we propose an active learning algorithm leveraging epistemic uncertainty estimation and the recently proposed GFlowNets as a generator of diverse candidate solutions, with the objective to obtain a diverse batch of useful (as defined by some utility function, for example, the predicted anti-microbial activity of a peptide) and informative candidates after each round.

Active Learning

Combining Modular Skills in Multitask Learning

1 code implementation28 Feb 2022 Edoardo M. Ponti, Alessandro Sordoni, Yoshua Bengio, Siva Reddy

By jointly learning these and a task-skill allocation matrix, the network for each task is instantiated as the average of the parameters of active skills.
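Instantiating a task network as the average of its active skills' parameters can be sketched with a binary task-skill allocation matrix; the shapes and names below are illustrative, not the paper's implementation:

```python
import numpy as np

def task_network_params(skill_params, allocation):
    """Instantiate per-task parameters as the average of the
    parameters of that task's active skills (toy sketch).

    skill_params: (n_skills, n_params) -- one parameter vector per skill
    allocation:   (n_tasks, n_skills)  -- binary task-skill matrix
    """
    allocation = np.asarray(allocation, dtype=float)
    # Normalize rows so each task averages over its active skills.
    weights = allocation / allocation.sum(axis=1, keepdims=True)
    return weights @ skill_params  # (n_tasks, n_params)

skills = np.array([[0.0, 2.0], [4.0, 6.0], [1.0, 1.0]])
alloc = [[1, 1, 0],   # task 0 uses skills 0 and 1
         [0, 0, 1]]   # task 1 uses only skill 2
params = task_network_params(skills, alloc)
```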

Instruction Following reinforcement-learning +1

Bayesian Structure Learning with Generative Flow Networks

1 code implementation28 Feb 2022 Tristan Deleu, António Góis, Chris Emezue, Mansi Rankawat, Simon Lacoste-Julien, Stefan Bauer, Yoshua Bengio

In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks, from data.

Variational Inference

Generative Flow Networks for Discrete Probabilistic Modeling

2 code implementations3 Feb 2022 Dinghuai Zhang, Nikolay Malkin, Zhen Liu, Alexandra Volokhova, Aaron Courville, Yoshua Bengio

We present energy-based generative flow networks (EB-GFN), a novel probabilistic modeling algorithm for high-dimensional discrete data.

Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization

no code implementations2 Feb 2022 Dianbo Liu, Alex Lamb, Xu Ji, Pascal Notsawo, Mike Mozer, Yoshua Bengio, Kenji Kawaguchi

Vector Quantization (VQ) is a method for discretizing latent representations and has become a major part of the deep learning toolkit.

Quantization reinforcement-learning +2

Trajectory balance: Improved credit assignment in GFlowNets

3 code implementations31 Jan 2022 Nikolay Malkin, Moksh Jain, Emmanuel Bengio, Chen Sun, Yoshua Bengio

Generative flow networks (GFlowNets) are a method for learning a stochastic policy for generating compositional objects, such as graphs or strings, from a given unnormalized density by sequences of actions, where many possible action sequences may lead to the same object.
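The trajectory balance objective equates the forward path probability scaled by the partition function with the reward-weighted backward path probability. A minimal sketch of the per-trajectory squared residual (all step probabilities and rewards below are toy values):

```python
import numpy as np

def trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward):
    """Squared trajectory-balance residual for one trajectory:
    (log Z + sum log P_F - log R(x) - sum log P_B)^2,
    where log_pf / log_pb are per-step forward/backward log-probs."""
    resid = log_Z + np.sum(log_pf) - log_reward - np.sum(log_pb)
    return float(resid ** 2)

# Toy trajectory: if the balance condition holds exactly, the loss is 0.
log_pf = np.log([0.5, 0.4])   # forward policy steps
log_pb = np.log([1.0, 0.5])   # backward policy steps
log_reward = np.log(2.0)
log_Z = log_reward + np.sum(log_pb) - np.sum(log_pf)
loss = trajectory_balance_loss(log_Z, log_pf, log_pb, log_reward)
```

Minimizing this residual over sampled trajectories drives the sampler toward generating objects with probability proportional to their reward.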

Towards Scaling Difference Target Propagation by Learning Backprop Targets

1 code implementation31 Jan 2022 Maxence Ernoult, Fabrice Normandin, Abhinav Moudgil, Sean Spinney, Eugene Belilovsky, Irina Rish, Blake Richards, Yoshua Bengio

As such, it is important to explore learning algorithms that come with strong theoretical guarantees and can match the performance of backpropagation (BP) on complex tasks.

Boosting Exploration in Multi-Task Reinforcement Learning using Adversarial Networks

1 code implementation27 Jan 2022 Ramnath Kumar, Tristan Deleu, Yoshua Bengio

Our proposed adversarial training regime for Multi-Task Reinforcement Learning (MT-RL) addresses the limitations of conventional training methods in RL, especially in meta-RL environments where the agent faces new tasks.

Decision Making reinforcement-learning +1

The Effect of Diversity in Meta-Learning

1 code implementation27 Jan 2022 Ramnath Kumar, Tristan Deleu, Yoshua Bengio

Recent studies show that task distribution plays a vital role in the meta-learner's performance.

Few-Shot Learning

Multi-scale Feature Learning Dynamics: Insights for Double Descent

1 code implementation6 Dec 2021 Mohammad Pezeshki, Amartya Mitra, Yoshua Bengio, Guillaume Lajoie

A key challenge in building theoretical foundations for deep learning is the complex optimization dynamics of neural networks, resulting from the high-dimensional interactions between the large number of network parameters.

GFlowNet Foundations

2 code implementations17 Nov 2021 Yoshua Bengio, Salem Lahlou, Tristan Deleu, Edward J. Hu, Mo Tiwari, Emmanuel Bengio

Generative Flow Networks (GFlowNets) have been introduced as a method to sample a diverse set of candidates in an active learning context, with a training objective that makes them approximately sample in proportion to a given reward function.

Active Learning

Properties from Mechanisms: An Equivariance Perspective on Identifiable Representation Learning

no code implementations ICLR 2022 Kartik Ahuja, Jason Hartford, Yoshua Bengio

These results suggest that by exploiting inductive biases on mechanisms, it is possible to design a range of new identifiable representation learning approaches.

Representation Learning

Chunked Autoregressive GAN for Conditional Waveform Synthesis

1 code implementation ICLR 2022 Max Morrison, Rithesh Kumar, Kundan Kumar, Prem Seetharaman, Aaron Courville, Yoshua Bengio

We show that simple pitch and periodicity conditioning is insufficient for reducing this error relative to using autoregression.

Inductive Bias

Compositional Attention: Disentangling Search and Retrieval

3 code implementations ICLR 2022 Sarthak Mittal, Sharath Chandra Raparthy, Irina Rish, Yoshua Bengio, Guillaume Lajoie

Through our qualitative analysis, we demonstrate that Compositional Attention leads to dynamic specialization based on the type of retrieval needed.

Retrieval

Unifying Likelihood-free Inference with Black-box Optimization and Beyond

no code implementations ICLR 2022 Dinghuai Zhang, Jie Fu, Yoshua Bengio, Aaron Courville

Black-box optimization formulations for biological sequence design have drawn recent attention due to their promising potential impact on the pharmaceutical industry.

Drug Discovery

Divide and Explore: Multi-Agent Separate Exploration with Shared Intrinsic Motivations

no code implementations29 Sep 2021 Xiao Jing, Zhenwei Zhu, Hongliang Li, Xin Pei, Yoshua Bengio, Tong Che, Hongyong Song

One of the greatest challenges of reinforcement learning is efficient exploration, especially when training signals are sparse or deceptive.

Distributed Computing Efficient Exploration

Discrete-Valued Neural Communication

no code implementations NeurIPS 2021 Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal, Chen Sun, Michael Curtis Mozer, Yoshua Bengio

Deep learning has advanced from fully connected architectures to structured models organized into components, e.g., the transformer composed of positional elements, modular architectures divided into slots, and graph neural nets made up of nodes.

Quantization Systematic Generalization

The Causal-Neural Connection: Expressiveness, Learnability, and Inference

2 code implementations NeurIPS 2021 Kevin Xia, Kai-Zhan Lee, Yoshua Bengio, Elias Bareinboim

Given this property, one may be tempted to surmise that a collection of neural nets is capable of learning any SCM by training on data generated by that SCM.

Causal Identification Causal Inference +1

Variational Causal Networks: Approximate Bayesian Inference over Causal Structures

1 code implementation14 Jun 2021 Yashas Annadani, Jonas Rothfuss, Alexandre Lacoste, Nino Scherrer, Anirudh Goyal, Yoshua Bengio, Stefan Bauer

However, a crucial aspect to acting intelligently upon the knowledge about causal structure which has been inferred from finite data demands reasoning about its uncertainty.

Bayesian Inference Causal Inference +2

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

4 code implementations NeurIPS 2021 Emmanuel Bengio, Moksh Jain, Maksym Korablyov, Doina Precup, Yoshua Bengio

Using insights from Temporal Difference learning, we propose GFlowNet, based on a view of the generative process as a flow network, making it possible to handle the tricky case where different trajectories can yield the same final state, e.g., there are many ways to sequentially add atoms to generate some molecular graph.

Fast and Slow Learning of Recurrent Independent Mechanisms

no code implementations18 May 2021 Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio

To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks.

Meta-Learning

An End-to-End Framework for Molecular Conformation Generation via Bilevel Programming

1 code implementation15 May 2021 Minkai Xu, Wujie Wang, Shitong Luo, Chence Shi, Yoshua Bengio, Rafael Gomez-Bombarelli, Jian Tang

Specifically, the molecular graph is first encoded in a latent space, and then the 3D structures are generated by solving a principled bilevel optimization program.

Bilevel Optimization

hBERT + BiasCorp -- Fighting Racism on the Web

no code implementations6 Apr 2021 Olawale Onabola, Zhuang Ma, Yang Xie, Benjamin Akera, Abdulrahman Ibraheem, Jia Xue, Dianbo Liu, Yoshua Bengio

In this work, we present hBERT, where we modify certain layers of the pretrained BERT model with the new Hopfield Layer.

Transformers with Competitive Ensembles of Independent Mechanisms

no code implementations27 Feb 2021 Alex Lamb, Di He, Anirudh Goyal, Guolin Ke, Chien-Feng Liao, Mirco Ravanelli, Yoshua Bengio

In this work we explore a way in which the Transformer architecture is deficient: it represents each position with a large monolithic hidden representation and a single set of parameters which are applied over the entire hidden representation.

Speech Enhancement

Learning Neural Generative Dynamics for Molecular Conformation Generation

3 code implementations ICLR 2021 Minkai Xu, Shitong Luo, Yoshua Bengio, Jian Peng, Jian Tang

Inspired by the recent progress in deep generative models, in this paper, we propose a novel probabilistic framework to generate valid and diverse conformations given a molecular graph.


Structured Sparsity Inducing Adaptive Optimizers for Deep Learning

2 code implementations7 Feb 2021 Tristan Deleu, Yoshua Bengio

The parameters of a neural network are naturally organized in groups, some of which might not contribute to its overall performance.

Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing its Gradient Estimator Bias

no code implementations14 Jan 2021 Axel Laborieux, Maxence Ernoult, Benjamin Scellier, Yoshua Bengio, Julie Grollier, Damien Querlioz

Equilibrium Propagation (EP) is a biologically-inspired counterpart of Backpropagation Through Time (BPTT) which, owing to its strong theoretical guarantees and the locality in space of its learning rule, fosters the design of energy-efficient hardware dedicated to learning.

Spatially Structured Recurrent Modules

no code implementations ICLR 2021 Nasim Rahaman, Anirudh Goyal, Muhammad Waleed Gondal, Manuel Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schölkopf

Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalise well and are robust to changes in the input distribution.

Starcraft II Video Prediction

FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters

1 code implementation ICCV 2021 Yuwei Cheng, Jiannan Zhu, Mengxin Jiang, Jie Fu, Changsong Pang, Peidong Wang, Kris Sankaran, Olawale Onabola, Yimin Liu, Dianbo Liu, Yoshua Bengio

To promote the practical application of autonomous floating waste cleaning, we present FloW, the first dataset for floating waste detection in inland water areas.

object-detection Robust Object Detection

Neural Bayes: A Generic Parameterization Method for Unsupervised Learning

no code implementations1 Jan 2021 Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio

Disjoint Manifold Separation: Neural Bayes allows us to formulate an objective which can optimally label samples from disjoint manifolds present in the support of a continuous distribution.

Clustering Representation Learning

Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments

no code implementations ICLR 2021 Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Charles Blundell, Sergey Levine, Yoshua Bengio, Michael Curtis Mozer

To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).

Object

Systematic generalisation with group invariant predictions

no code implementations ICLR 2021 Faruk Ahmed, Yoshua Bengio, Harm van Seijen, Aaron Courville

We consider situations where the presence of dominant simpler correlations with the target variable in a training set can cause an SGD-trained neural network to be less reliant on more persistently-correlating complex features.

Dependency Structure Discovery from Interventions

no code implementations1 Jan 2021 Nan Rosemary Ke, Olexa Bilaniuk, Anirudh Goyal, Stefan Bauer, Bernhard Schölkopf, Michael Curtis Mozer, Hugo Larochelle, Christopher Pal, Yoshua Bengio

Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data.

Inductive Biases for Deep Learning of Higher-Level Cognition

no code implementations30 Nov 2020 Anirudh Goyal, Yoshua Bengio

A fascinating hypothesis is that human and animal intelligence could be explained by a few principles (rather than an encyclopedic list of heuristics).

Systematic Generalization

RetroGNN: Approximating Retrosynthesis by Graph Neural Networks for De Novo Drug Design

no code implementations25 Nov 2020 Cheng-Hao Liu, Maksym Korablyov, Stanisław Jastrzębski, Paweł Włodarczyk-Pruszyński, Yoshua Bengio, Marwin H. S. Segler

A natural idea to mitigate this problem is to bias the search process towards more easily synthesizable molecules using a proxy for synthetic accessibility.

Retrosynthesis

Predicting Infectiousness for Proactive Contact Tracing

1 code implementation ICLR 2021 Yoshua Bengio, Prateek Gupta, Tegan Maharaj, Nasim Rahaman, Martin Weiss, Tristan Deleu, Eilif Muller, Meng Qu, Victor Schmidt, Pierre-Luc St-Charles, Hannah Alsdurf, Olexa Bilaniuk, David Buckeridge, Gaétan Marceau Caron, Pierre-Luc Carrier, Joumana Ghosn, Satya Ortiz-Gagne, Chris Pal, Irina Rish, Bernhard Schölkopf, Abhinav Sharma, Jian Tang, Andrew Williams

Predictions are used to provide personalized recommendations to the individual via an app, as well as to send anonymized messages to the individual's contacts, who use this information to better predict their own infectiousness, an approach we call proactive contact tracing (PCT).

NU-GAN: High resolution neural upsampling with GAN

no code implementations22 Oct 2020 Rithesh Kumar, Kundan Kumar, Vicki Anand, Yoshua Bengio, Aaron Courville

In this paper, we propose NU-GAN, a new method for resampling audio from lower to higher sampling rates (upsampling).

Audio Generation Speech Synthesis +1

Cross-Modal Information Maximization for Medical Imaging: CMIM

no code implementations20 Oct 2020 Tristan Sylvain, Francis Dutil, Tess Berthier, Lisa Di Jorio, Margaux Luck, Devon Hjelm, Yoshua Bengio

In hospitals, data are siloed to specific information systems that make the same information available under different modalities such as the different medical imaging exams the patient undergoes (CT scans, MRI, PET, Ultrasound, etc.)

Image Classification Medical Image Classification

Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers

no code implementations15 Oct 2020 Alex Lamb, Anirudh Goyal, Agnieszka Słowik, Michael Mozer, Philippe Beaudoin, Yoshua Bengio

Feed-forward neural networks consist of a sequence of layers, in which each layer performs some processing on the information from the previous layer.

Domain Generalization

RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs

2 code implementations ICLR 2021 Meng Qu, Junkun Chen, Louis-Pascal Xhonneux, Yoshua Bengio, Jian Tang

Then in the E-step, we select a set of high-quality rules from all generated rules with both the rule generator and reasoning predictor via posterior inference; and in the M-step, the rule generator is updated with the rules selected in the E-step.

Knowledge Graphs

Visual Concept Reasoning Networks

no code implementations26 Aug 2020 Taesup Kim, Sungwoong Kim, Yoshua Bengio

It approximates sparsely connected networks by explicitly defining multiple branches to simultaneously learn representations with different visual concepts or properties.

Action Recognition Image Classification +4

Mastering Rate based Curriculum Learning

1 code implementation14 Aug 2020 Lucas Willems, Salem Lahlou, Yoshua Bengio

Recent automatic curriculum learning algorithms, and in particular Teacher-Student algorithms, rely on the notion of learning progress, assuming that the best next tasks are those on which the learner is progressing or regressing fastest.

Deriving Differential Target Propagation from Iterating Approximate Inverses

no code implementations29 Jul 2020 Yoshua Bengio

We show that a particular form of target propagation, i.e., relying on learned inverses of each layer, which is differential, i.e., where the target is a small perturbation of the forward propagation, gives rise to an update rule which corresponds to an approximate Gauss-Newton gradient-based optimization, without requiring the manipulation or inversion of large matrices.

BabyAI 1.1

3 code implementations24 Jul 2020 David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77% to 90.4%.

Computational Efficiency Imitation Learning +2

Revisiting Fundamentals of Experience Replay

2 code implementations ICML 2020 William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney

Experience replay is central to off-policy algorithms in deep reinforcement learning (RL), but there remain significant gaps in our understanding.

DQN Replay Dataset Q-Learning +1

S2RMs: Spatially Structured Recurrent Modules

no code implementations13 Jul 2020 Nasim Rahaman, Anirudh Goyal, Muhammad Waleed Gondal, Manuel Wuthrich, Stefan Bauer, Yash Sharma, Yoshua Bengio, Bernhard Schölkopf

Capturing the structure of a data-generating process by means of appropriate inductive biases can help in learning models that generalize well and are robust to changes in the input distribution.

Starcraft II Video Prediction

Compositional Generalization by Factorizing Alignment and Translation

no code implementations ACL 2020 Jacob Russin, Jason Jo, Randall O'Reilly, Yoshua Bengio

Standard methods in deep learning for natural language processing fail to capture the compositional structure of human language that allows for systematic generalization outside of the training distribution.

Machine Translation Systematic Generalization +1

Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules

1 code implementation ICML 2020 Sarthak Mittal, Alex Lamb, Anirudh Goyal, Vikram Voleti, Murray Shanahan, Guillaume Lajoie, Michael Mozer, Yoshua Bengio

To effectively utilize the wealth of potential top-down information available, and to prevent the cacophony of intermixed signals in a bidirectional architecture, mechanisms are needed to restrict information flow.

Language Modelling Open-Ended Question Answering +2

Object Files and Schemata: Factorizing Declarative and Procedural Knowledge in Dynamical Systems

no code implementations29 Jun 2020 Anirudh Goyal, Alex Lamb, Phanideep Gampa, Philippe Beaudoin, Sergey Levine, Charles Blundell, Yoshua Bengio, Michael Mozer

To use a video game as an illustration, two enemies of the same type will share schemata but will have separate object files to encode their distinct state (e.g., health, position).

Object

Hybrid Models for Learning to Branch

1 code implementation NeurIPS 2020 Prateek Gupta, Maxime Gasse, Elias B. Khalil, M. Pawan Kumar, Andrea Lodi, Yoshua Bengio

First, in a more realistic setting where only a CPU is available, is the GNN model still competitive?

Rethinking Distributional Matching Based Domain Adaptation

no code implementations23 Jun 2020 Bo Li, Yezhen Wang, Tong Che, Shanghang Zhang, Sicheng Zhao, Pengfei Xu, Wei Zhou, Yoshua Bengio, Kurt Keutzer

In this paper, in order to devise robust DA algorithms, we first systematically analyze the limitations of DM based methods, and then build new benchmarks with more realistic domain shifts to evaluate the well-accepted DM methods.

Domain Adaptation

Image-to-image Mapping with Many Domains by Sparse Attribute Transfer

no code implementations23 Jun 2020 Matthew Amodio, Rim Assouel, Victor Schmidt, Tristan Sylvain, Smita Krishnaswamy, Yoshua Bengio

Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.

Attribute Translation +1

Learning Causal Models Online

1 code implementation12 Jun 2020 Khurram Javed, Martha White, Yoshua Bengio

One solution for achieving strong generalization is to incorporate causal structures in the models; such structures constrain learning by ignoring correlations that contradict them.

Continual Learning

Scaling Equilibrium Propagation to Deep ConvNets by Drastically Reducing its Gradient Estimator Bias

1 code implementation6 Jun 2020 Axel Laborieux, Maxence Ernoult, Benjamin Scellier, Yoshua Bengio, Julie Grollier, Damien Querlioz

In this work, we show that a bias in the gradient estimate of EP, inherent in the use of finite nudging, is responsible for this phenomenon and that cancelling it allows training deep ConvNets by EP.
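The finite-nudging bias discussed above is analogous to the O(beta) error of a one-sided finite difference, and symmetric (two-sided) nudging behaves like a central difference with O(beta^2) error. A toy scalar illustration of that cancellation (the function below merely stands in for the energy term evaluated at the nudged fixed point; it is not the paper's estimator):

```python
import math

def one_sided(f, beta):
    """One-sided finite-nudging estimate of f'(0): bias O(beta)."""
    return (f(beta) - f(0.0)) / beta

def symmetric(f, beta):
    """Symmetric finite-nudging estimate of f'(0): bias O(beta^2)."""
    return (f(beta) - f(-beta)) / (2.0 * beta)

f = math.exp            # true derivative at 0 is exactly 1.0
beta = 0.1
err_one = abs(one_sided(f, beta) - 1.0)   # order beta
err_sym = abs(symmetric(f, beta) - 1.0)   # order beta squared
```

Shrinking beta shrinks both errors, but the symmetric estimate converges an order faster, mirroring why cancelling the one-sided bias helps scale EP to deep ConvNets.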

Training End-to-End Analog Neural Networks with Equilibrium Propagation

no code implementations2 Jun 2020 Jack Kendall, Ross Pantone, Kalpana Manickavasagam, Yoshua Bengio, Benjamin Scellier

We introduce a principled method to train end-to-end analog neural networks by stochastic gradient descent.
