no code implementations • 8 Mar 2024 • Yazhe Li, Jorg Bornschein, Ting Chen
In this paper, we explore a new generative approach for learning visual representations.
no code implementations • 3 Mar 2024 • Jorg Bornschein, Yazhe Li, Amal Rannen-Triki
Inspired by the in-context learning capabilities of transformers and their connection to meta-learning, we propose a method that leverages these strengths for online continual learning.
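A minimal sketch of the general idea (assuming a hypothetical `model.predict(context, x)` in-context interface, not the paper's implementation): the learner adapts to the stream purely by conditioning on recent examples, with no gradient updates at test time.

```python
# Illustrative sketch: online continual learning via in-context prediction.
# The model sees a sliding window of recent (input, label) pairs as context
# and predicts the label of the newest input; `model.predict` is an assumed
# interface, not the paper's actual API.
def online_in_context_eval(model, stream, max_ctx=256):
    history, n_correct = [], 0
    for x, y in stream:
        context = history[-max_ctx:]          # recent (x, y) pairs as context
        y_hat = model.predict(context, x)     # assumed in-context interface
        n_correct += int(y_hat == y)
        history.append((x, y))                # label is revealed after predicting
    return n_correct / max(len(history), 1)
```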
no code implementations • 3 Mar 2024 • Amal Rannen-Triki, Jorg Bornschein, Razvan Pascanu, Marcus Hutter, Andras György, Alexandre Galashov, Yee Whye Teh, Michalis K. Titsias
We consider the problem of online fine-tuning the parameters of a language model at test time, also known as dynamic evaluation.
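In the generic sense (a minimal sketch, not the authors' exact procedure), dynamic evaluation scores each chunk of the test stream under the current weights and then takes a gradient step on that chunk before moving on; the model, optimizer, and chunking below are illustrative assumptions.

```python
# Sketch of generic dynamic evaluation: evaluate each chunk first, then
# fine-tune on it online so later chunks benefit from the adaptation.
import torch

def dynamic_evaluation(model, chunks, lr=1e-4):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss = 0.0
    for inputs, targets in chunks:            # sequential stream of token chunks
        logits = model(inputs)
        loss = torch.nn.functional.cross_entropy(
            logits.view(-1, logits.size(-1)), targets.view(-1))
        total_loss += loss.item()             # score before adapting
        opt.zero_grad()
        loss.backward()                       # then take one online step
        opt.step()
    return total_loss / len(chunks)
```

Because each chunk is scored before the update, the reported loss remains a valid held-out measure even though the weights change over the stream.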
no code implementations • 14 Jun 2023 • Michalis K. Titsias, Alexandre Galashov, Amal Rannen-Triki, Razvan Pascanu, Yee Whye Teh, Jorg Bornschein
Non-stationarity over the linear predictor weights is modelled using a parameter drift transition density, parametrized by a coefficient that quantifies forgetting.
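One standard way to write such a transition density (an illustrative first-order autoregressive form; the paper's exact parametrization may differ) is

```latex
p(w_t \mid w_{t-1}) = \mathcal{N}\!\left(w_t;\; \gamma\, w_{t-1},\; (1-\gamma^2)\,\sigma^2 I\right), \qquad \gamma \in [0, 1],
```

where $\gamma = 1$ keeps the weights static and smaller $\gamma$ discounts older observations more aggressively, i.e. forgets faster.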
no code implementations • 12 Apr 2023 • Nan Rosemary Ke, Sara-Jane Dunn, Jorg Bornschein, Silvia Chiappa, Melanie Rey, Jean-Baptiste Lespiau, Albin Cassirer, Jane Wang, Theophane Weber, David Barrett, Matthew Botvinick, Anirudh Goyal, Mike Mozer, Danilo Rezende
To accurately identify gene regulatory networks (GRNs), perturbational data is required.
no code implementations • 19 Feb 2023 • Yazhe Li, Jorg Bornschein, Marcus Hutter
Although much of the success of Deep Learning builds on learning good representations, a rigorous method to evaluate their quality is lacking.
no code implementations • 15 Dec 2022 • Shivakanth Sujit, Pedro H. M. Braga, Jorg Bornschein, Samira Ebrahimi Kahou
Offline RL algorithms try to address this issue by bootstrapping the learning process from existing logged data without needing to interact with the environment from the very beginning.
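As a generic illustration of learning from a fixed logged buffer (not the specific algorithm studied in this paper), a single fitted Q-learning step over logged transitions might look like this; all names here are assumptions.

```python
# Minimal sketch of an offline RL update: a TD step computed purely from
# logged transitions, with no environment interaction.
import torch

def offline_q_step(q_net, target_net, batch, opt, gamma=0.99):
    s, a, r, s_next, done = batch             # transitions from the logged dataset
    with torch.no_grad():
        target = r + gamma * (1 - done) * target_net(s_next).max(dim=1).values
    pred = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    loss = torch.nn.functional.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```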
1 code implementation • 15 Nov 2022 • Jorg Bornschein, Alexandre Galashov, Ross Hemsley, Amal Rannen-Triki, Yutian Chen, Arslan Chaudhry, Xu Owen He, Arthur Douillard, Massimo Caccia, Qixuan Feng, Jiajun Shen, Sylvestre-Alvise Rebuffi, Kitty Stacpoole, Diego de Las Casas, Will Hawkins, Angeliki Lazaridou, Yee Whye Teh, Andrei A. Rusu, Razvan Pascanu, Marc'Aurelio Ranzato
A shared goal of several machine learning communities like continual learning, meta-learning and transfer learning, is to design algorithms and models that efficiently and robustly adapt to unseen tasks.
no code implementations • 14 Oct 2022 • Jorg Bornschein, Yazhe Li, Marcus Hutter
In the prequential formulation of MDL, the objective is to minimize the cumulative next-step log-loss when sequentially going through the data and using previous observations for parameter estimation.
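In its standard form, the prequential description length of a sequence $x_{1:T}$ is the sum of next-step log-losses, where each prediction may only use the data seen so far:

```latex
\mathcal{L}_{\text{preq}}(x_{1:T}) = \sum_{t=1}^{T} -\log p\!\left(x_t \mid \hat{\theta}(x_{1:t-1})\right),
```

with $\hat{\theta}(x_{1:t-1})$ the parameter estimate fit on the first $t-1$ observations.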
no code implementations • 11 Apr 2022 • Nan Rosemary Ke, Silvia Chiappa, Jane Wang, Anirudh Goyal, Jorg Bornschein, Melanie Rey, Theophane Weber, Matthew Botvinick, Michael Mozer, Danilo Jimenez Rezende
The fundamental challenge in causal induction is to infer the underlying graph structure given observational and/or interventional data.
no code implementations • 2 Jul 2021 • Jorg Bornschein, Silvia Chiappa, Alan Malek, Rosemary Nan Ke
Learning the structure of Bayesian networks and causal relationships from observations is a common goal in several areas of science and technology.
no code implementations • 31 May 2021 • Tudor Berariu, Wojciech Czarnecki, Soham De, Jorg Bornschein, Samuel Smith, Razvan Pascanu, Claudia Clopath
One aim shared by multiple settings, such as continual learning or transfer learning, is to leverage previously acquired knowledge to converge faster on the current task.
no code implementations • ICML 2020 • Jorg Bornschein, Francesco Visin, Simon Osindero
Highly overparametrized neural networks can display curiously strong generalization performance, a phenomenon that has recently attracted a wealth of theoretical and empirical work aimed at understanding it.
1 code implementation • 12 Jun 2015 • Jorg Bornschein, Samira Shabanian, Asja Fischer, Yoshua Bengio
We present a lower-bound for the likelihood of this model and we show that optimizing this bound regularizes the model so that the Bhattacharyya distance between the bottom-up and top-down approximate distributions is minimized.
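For reference, the Bhattacharyya distance between two distributions $p$ and $q$ over the same discrete space is

```latex
D_B(p, q) = -\log \sum_{z} \sqrt{p(z)\, q(z)},
```

which is non-negative and zero exactly when $p = q$, so minimizing it pulls the bottom-up and top-down approximate distributions together.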
no code implementations • 14 Feb 2015 • Yoshua Bengio, Dong-Hyun Lee, Jorg Bornschein, Thomas Mesnard, Zhouhan Lin
Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology.