1 code implementation • 13 May 2022 • Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Jürgen Schmidhuber, Rupesh Kumar Srivastava
Upside-Down Reinforcement Learning (UDRL) is an approach for solving RL problems that does not require value functions and uses only supervised learning, where the targets for given inputs in a dataset do not change over time.
no code implementations • 25 Mar 2022 • Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste
The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems.
1 code implementation • 24 Feb 2022 • Kai Arulkumaran, Dylan R. Ashley, Jürgen Schmidhuber, Rupesh K. Srivastava
Upside-Down Reinforcement Learning (UDRL) flips the conventional use of the return in the RL objective function upside down, by taking returns as input and predicting actions.
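That mechanic lends itself to a compact illustration. Below is a minimal sketch of such a command-conditioned "behaviour function" for a discrete action space; the class and function names, network sizes, and two-element command format are illustrative assumptions, not the paper's released code.

```python
# Sketch of the UDRL idea: a behaviour function trained by plain supervised
# learning to map (state, desired return, desired horizon) to the action
# actually taken in past experience. All names here are illustrative.
import torch
import torch.nn as nn

class BehaviourFunction(nn.Module):
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 2, hidden),  # +2 for the (return, horizon) command
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state, desired_return, desired_horizon):
        command = torch.stack([desired_return, desired_horizon], dim=-1)
        return self.net(torch.cat([state, command], dim=-1))  # action logits

def udrl_step(model, optimizer, states, returns, horizons, actions):
    """One supervised step: imitate actions that achieved the given returns."""
    loss = nn.functional.cross_entropy(model(states, returns, horizons), actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```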
no code implementations • 23 Feb 2022 • Dylan R. Ashley, Kai Arulkumaran, Jürgen Schmidhuber, Rupesh Kumar Srivastava
Lately, there has been a resurgence of interest in using supervised learning to solve reinforcement learning problems.
1 code implementation • 11 Feb 2022 • Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber
Linear layers in neural networks (NNs) trained by gradient descent can be expressed as a key-value memory system which stores all training datapoints and the initial weights, and produces outputs using unnormalised dot attention over the entire training experience.
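This result admits a quick numerical check. The self-contained toy below trains a linear layer online with SGD on squared error and verifies that its output equals the initial layer's output plus unnormalised dot attention over the stored training inputs; it is an illustration of the stated equivalence, not the paper's code.

```python
# Toy check of the dual form: a linear layer trained online by SGD equals
# its initial weights plus unnormalised dot attention over all stored
# training inputs, with learning-rate-scaled error signals as values.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, steps, lr = 4, 3, 50, 0.1

W0 = rng.normal(size=(d_out, d_in))
W = W0.copy()
keys, values = [], []                  # the "training experience"

for _ in range(steps):
    x = rng.normal(size=d_in)
    target = rng.normal(size=d_out)
    err = W @ x - target               # dL/dy for L = 0.5 * ||Wx - t||^2
    W -= lr * np.outer(err, x)         # ordinary SGD update
    keys.append(x)
    values.append(-lr * err)

# Primal output == initial output + attention over the training datapoints.
x_test = rng.normal(size=d_in)
attention = sum(v * (k @ x_test) for k, v in zip(keys, values))
assert np.allclose(W @ x_test, W0 @ x_test + attention)
```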
1 code implementation • 11 Feb 2022 • Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
The weight matrix (WM) of a neural network (NN) is its program.
1 code implementation • 31 Dec 2021 • Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
We share our experience with the recently released WILDS benchmark, a collection of ten datasets dedicated to developing models and training strategies which are robust to domain shifts.
1 code implementation • ICLR Workshop Neural_Compression 2021 • Kazuki Irie, Jürgen Schmidhuber
The inputs and/or outputs of some neural nets are weight matrices of other neural nets.
1 code implementation • 3 Nov 2021 • Dylan R. Ashley, Vincent Herrmann, Zachary Friggstad, Kory W. Mathewson, Jürgen Schmidhuber
We look at how machine learning techniques that derive properties of items in a collection of independent media can be used to automatically embed stories into such collections.
1 code implementation • 14 Oct 2021 • Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Despite progress across a broad range of applications, Transformers have limited success in systematic generalization.
no code implementations • NeurIPS Workshop AIPLANS 2021 • Imanol Schlag, Jürgen Schmidhuber
We augment classic algorithms with learned components to adapt them to domains currently dominated by deep learning models.
no code implementations • NeurIPS Workshop AIPLANS 2021 • Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Despite successes across a broad range of applications, Transformers have limited capability in systematic generalization.
no code implementations • ICLR 2022 • Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Despite successes across a broad range of applications, Transformers have limited capability in systematic generalization.
1 code implementation • EMNLP 2021 • Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Our models improve accuracy from 50% to 85% on the PCFG productivity split, and from 35% to 81% on COGS.
1 code implementation • 19 Jul 2021 • Miroslav Štrupl, Francesco Faccio, Dylan R. Ashley, Rupesh Kumar Srivastava, Jürgen Schmidhuber
Reward-Weighted Regression (RWR) belongs to a family of widely known iterative Reinforcement Learning algorithms based on the Expectation-Maximization framework.
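The basic RWR iteration is easy to state concretely. A toy version for a one-dimensional Gaussian policy with exponential reward weighting follows; the weighting transform and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# RWR in miniature: sample actions (E-step), weight them by a transform of
# their reward, and refit the Gaussian policy by weighted maximum
# likelihood (M-step). A simplified sketch.
import numpy as np

def rwr(reward_fn, mean=0.0, std=1.0, pop=100, beta=5.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        actions = rng.normal(mean, std, size=pop)
        r = reward_fn(actions)
        w = np.exp(beta * (r - r.max()))        # stabilised reward weighting
        w /= w.sum()
        mean = np.sum(w * actions)              # weighted ML fit of the mean
        std = np.sqrt(np.sum(w * (actions - mean) ** 2)) + 1e-3
    return mean, std

# Example: rewards peak at action 2, and the policy mean converges there.
print(rwr(lambda a: -(a - 2.0) ** 2))
```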
no code implementations • 12 Jul 2021 • Noor Sajid, Francesco Faccio, Lancelot Da Costa, Thomas Parr, Jürgen Schmidhuber, Karl Friston
Under the Bayesian brain hypothesis, behavioural variations can be attributed to different priors over generative model parameters.
4 code implementations • NeurIPS 2021 • Kazuki Irie, Imanol Schlag, Róbert Csordás, Jürgen Schmidhuber
Transformers with linearised attention ("linear Transformers") have demonstrated the practical scalability and effectiveness of outer product-based Fast Weight Programmers (FWPs) from the '90s.
no code implementations • 16 Mar 2021 • Lukas Tuggener, Jürgen Schmidhuber, Thilo Stadelmann
We investigate and improve the representativeness of ImageNet as a basis for deriving generally effective convolutional neural network (CNN) architectures that perform well on a diverse set of datasets and application domains.
1 code implementation • ICLR 2021 • Đorđe Miladinović, Aleksandar Stanić, Stefan Bauer, Jürgen Schmidhuber, Joachim M. Buhmann
We show that augmenting the decoder of a hierarchical VAE with spatial dependency layers considerably improves density estimation over baseline convolutional architectures and the state-of-the-art among the models within the same class.
7 code implementations • 22 Feb 2021 • Imanol Schlag, Kazuki Irie, Jürgen Schmidhuber
We show the formal equivalence of linearised self-attention mechanisms and fast weight controllers from the early '90s, where a "slow" neural net learns by gradient descent to program the "fast weights" of another net through sequences of elementary programming instructions which are additive outer products of self-invented activation patterns (today called keys and values).
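The stated equivalence can be verified numerically in a few lines. The sketch below compares causally masked linear attention (no softmax, identity feature map) against a fast weight matrix programmed by additive outer products; it is an illustration under those simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 8
K, V, Q = rng.normal(size=(3, T, d))  # keys, values, queries per step

# (1) Linearised self-attention: out_i = sum_{j<=i} v_j (k_j . q_i)
attn_out = np.array([
    sum(V[j] * (K[j] @ Q[i]) for j in range(i + 1)) for i in range(T)
])

# (2) Fast weight programmer: program the fast weights with additive outer
# products of (value, key) pairs, then query with q_i.
W = np.zeros((d, d))
fwp_out = np.empty((T, d))
for i in range(T):
    W += np.outer(V[i], K[i])   # elementary programming instruction
    fwp_out[i] = W @ Q[i]

assert np.allclose(attn_out, fwp_out)  # identical outputs at every step
```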
no code implementations • NeurIPS 2021 • Louis Kirsch, Jürgen Schmidhuber
Many concepts have been proposed for meta learning with neural networks (NNs), e.g., NNs that learn to reprogram fast weights, Hebbian plasticity, learned learning rules, and meta recurrent NNs.
no code implementations • 9 Dec 2020 • Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber
Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences.
1 code implementation • ICLR 2021 • Anand Gopalakrishnan, Sjoerd van Steenkiste, Jürgen Schmidhuber
We propose PermaKey, a novel approach to representation learning based on object keypoints.
1 code implementation • ICLR 2021 • Imanol Schlag, Tsendsuren Munkhdalai, Jürgen Schmidhuber
Humans can quickly associate stimuli to solve problems in novel contexts.
Ranked #1 on Question Answering on catbAbI LM-mode
no code implementations • 7 Oct 2020 • Aleksandar Stanić, Sjoerd van Steenkiste, Jürgen Schmidhuber
Common-sense physical reasoning in the real world requires learning about the interactions of objects and their dynamics.
1 code implementation • ICLR 2021 • Róbert Csordás, Sjoerd van Steenkiste, Jürgen Schmidhuber
Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, including compositionality through efficient recombination of functional building blocks, interpretability, and the prevention of catastrophic interference.
1 code implementation • 9 Jul 2020 • Aditya Ramesh, Paulo Rauber, Jürgen Schmidhuber
An agent in a non-stationary contextual bandit problem should balance between exploration and the exploitation of (periodic or structured) patterns present in its previous experiences.
1 code implementation • ICLR 2021 • Francesco Faccio, Louis Kirsch, Jürgen Schmidhuber
We introduce a class of value functions called Parameter-Based Value Functions (PBVFs) whose inputs include the policy parameters.
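The defining idea is concrete enough to sketch: the value network receives the flattened parameters of the policy being evaluated as an extra input. The architecture below is an illustrative assumption, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class ParameterBasedValueFunction(nn.Module):
    """V(s, theta): a value function conditioned on policy parameters."""

    def __init__(self, state_dim, n_policy_params, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_policy_params, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, policy_params):
        # policy_params: flat vector of the evaluated policy's weights,
        # so the same network can score states under unseen policies.
        return self.net(torch.cat([state, policy_params], dim=-1))
```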
no code implementations • ICML Workshop LifelongML 2020 • Krsto Proroković, Michael Wand, Jürgen Schmidhuber
An EMG-based upper limb prosthesis relies on a statistical pattern recognition system to map the EMG signal of residual forearm muscles into the appropriate hand movements.
7 code implementations • 5 Dec 2019 • Rupesh Kumar Srivastava, Pranav Shyam, Filipe Mutz, Wojciech Jaśkowski, Jürgen Schmidhuber
Many of the general principles of upside-down reinforcement learning are outlined in a companion report; the goal of this paper is to develop a practical learning algorithm and show that this conceptually simple perspective on agent training can produce a range of rewarding behaviors for multiple episodic environments.
2 code implementations • 15 Oct 2019 • Imanol Schlag, Paul Smolensky, Roland Fernandez, Nebojsa Jojic, Jürgen Schmidhuber, Jianfeng Gao
We incorporate Tensor-Product Representations within the Transformer in order to better support the explicit representation of relation structure.
Ranked #3 on Question Answering on Mathematics Dataset
no code implementations • 11 Oct 2019 • Aleksandar Stanić, Jürgen Schmidhuber
Traditional sequential multi-object attention models rely on a recurrent mechanism to infer object relations.
no code implementations • ICLR 2020 • Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber
Biological evolution has distilled the experiences of many learners into the general learning algorithms of humans.
2 code implementations • 13 Jun 2019 • Timon Willi, Jonathan Masci, Jürgen Schmidhuber, Christian Osendorfer
We extend Neural Processes (NPs) to sequential data through Recurrent NPs or RNPs, a family of conditional state space models.
no code implementations • 3 Jun 2019 • Sjoerd van Steenkiste, Klaus Greff, Jürgen Schmidhuber
In order to meet the diverse challenges in solving many real-world problems, an intelligent agent has to be able to dynamically construct a model of its environment.
no code implementations • NeurIPS 2019 • Sjoerd van Steenkiste, Francesco Locatello, Jürgen Schmidhuber, Olivier Bachem
A disentangled representation encodes information about the salient factors of variation in the data independently.
1 code implementation • 23 Apr 2019 • Róbert Csordás, Jürgen Schmidhuber
The Differentiable Neural Computer (DNC) can learn algorithmic and question answering tasks.
1 code implementation • NeurIPS 2018 • Imanol Schlag, Jürgen Schmidhuber
We combine Recurrent Neural Networks with Tensor Product Representations to learn combinatorial representations of sequential data.
1 code implementation • 29 Nov 2018 • Imanol Schlag, Jürgen Schmidhuber
We combine Recurrent Neural Networks with Tensor Product Representations to learn combinatorial representations of sequential data.
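The binding operation underlying both entries above is simple to demonstrate. A toy Tensor Product Representation with orthonormal role vectors, purely illustrative:

```python
# Tensor Product Representations in a nutshell: bind (filler, role) vector
# pairs with outer products, superimpose the bindings, and unbind by
# contracting with a role vector. Toy case with orthonormal roles.
import numpy as np

fillers = np.array([[1., 0., 2.], [0., 3., 1.]])   # what is stored
roles = np.eye(2)                                   # where/how it is stored

T = sum(np.outer(f, r) for f, r in zip(fillers, roles))  # bind and sum

# Unbinding with role 1 recovers filler 1 exactly (orthonormal roles).
assert np.allclose(T @ roles[1], fillers[1])
```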
no code implementations • ICLR 2019 • Sjoerd van Steenkiste, Karol Kurach, Jürgen Schmidhuber, Sylvain Gelly
We present a minimal modification of a standard generator to incorporate this inductive bias and find that it reliably learns to generate images as compositions of objects.
no code implementations • NeurIPS 2018 • David Ha, Jürgen Schmidhuber
A generative recurrent neural network is quickly trained in an unsupervised manner to model popular reinforcement learning environments through compressed spatio-temporal representations.
2 code implementations • 27 Mar 2018 • Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, Marcello Pelillo, Thilo Stadelmann
We present the DeepScores dataset with the goal of advancing the state of the art in small object recognition, and of placing the question of object recognition in the context of scene understanding.
20 code implementations • 27 Mar 2018 • David Ha, Jürgen Schmidhuber
We explore building generative neural network models of popular reinforcement learning environments.
3 code implementations • ICLR 2018 • Sjoerd van Steenkiste, Michael Chang, Klaus Greff, Jürgen Schmidhuber
Common-sense physical reasoning is an essential ingredient for any intelligent agent operating in the real world.
no code implementations • ICLR 2018 • Imanol Schlag, Jürgen Schmidhuber
We improve previous end-to-end differentiable neural networks (NNs) with fast weight memories.
1 code implementation • NeurIPS 2017 • Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber
Many real-world tasks such as reasoning and physical interaction require identification and manipulation of conceptual entities.
no code implementations • 22 Dec 2016 • Klaus Greff, Rupesh K. Srivastava, Jürgen Schmidhuber
We demonstrate that this viewpoint directly leads to the construction of Highway and Residual networks.
5 code implementations • ICML 2017 • Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber
We introduce a novel theoretical analysis of recurrent networks based on Geršgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell.
Ranked #16 on Language Modelling on Hutter Prize
2 code implementations • NeurIPS 2016 • Klaus Greff, Antti Rasmus, Mathias Berglund, Tele Hotloo Hao, Jürgen Schmidhuber, Harri Valpola
We present a framework for efficient perceptual inference that explicitly reasons about the segmentation of its inputs and features.
no code implementations • 29 Jan 2016 • Michael Wand, Jan Koutník, Jürgen Schmidhuber
Lipreading, i.e., speech recognition from visual-only recordings of a speaker's face, can be achieved with a processing pipeline based solely on neural networks, yielding significantly better accuracy than conventional methods.
1 code implementation • 19 Nov 2015 • Klaus Greff, Rupesh Kumar Srivastava, Jürgen Schmidhuber
Disentangled distributed representations of data are desirable for machine learning, since they are more expressive and can generalize from fewer examples.
4 code implementations • NeurIPS 2015 • Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber
Theoretical and empirical evidence indicates that the depth of neural networks is crucial for their success.
Ranked #35 on Image Classification on MNIST
3 code implementations • 3 May 2015 • Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber
There is plenty of theoretical and empirical evidence that depth of neural networks is a crucial ingredient for their success.
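Both highway-network entries rest on the same gating rule, y = T(x) * H(x) + (1 - T(x)) * x. A minimal PyTorch sketch, with layer sizes and initialisation as illustrative assumptions:

```python
# A highway layer: a transform gate T blends a learned transformation H(x)
# with the untouched input x. Sizes and init values are illustrative.
import torch
import torch.nn as nn

class HighwayLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.H = nn.Linear(dim, dim)   # candidate transformation
        self.T = nn.Linear(dim, dim)   # transform gate
        # A negative gate bias initially favours carrying x through
        # unchanged, which is what makes very deep stacks trainable.
        nn.init.constant_(self.T.bias, -2.0)

    def forward(self, x):
        t = torch.sigmoid(self.T(x))
        return t * torch.relu(self.H(x)) + (1.0 - t) * x
```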
12 code implementations • 13 Mar 2015 • Klaus Greff, Rupesh Kumar Srivastava, Jan Koutník, Bas R. Steunebrink, Jürgen Schmidhuber
Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995.
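For reference, the vanilla LSTM step that these variants modify can be written compactly; the stacked-weight layout below is an illustrative convention, not the paper's code.

```python
# One step of the vanilla LSTM that the surveyed variants modify (gate
# couplings, peepholes, activation choices, etc.). W, U, b stack the
# input, forget, output, and cell-candidate weights.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)                        # four gate blocks
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # update cell state
    h_new = sigmoid(o) * np.tanh(c_new)                # expose hidden state
    return h_new, c_new
```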
no code implementations • 21 Nov 2014 • Mitko Veta, Paul J. van Diest, Stefan M. Willems, Haibo Wang, Anant Madabhushi, Angel Cruz-Roa, Fabio Gonzalez, Anders B. L. Larsen, Jacob S. Vestergaard, Anders B. Dahl, Dan C. Cireşan, Jürgen Schmidhuber, Alessandro Giusti, Luca M. Gambardella, F. Boray Tek, Thomas Walter, Ching-Wei Wang, Satoshi Kondo, Bogdan J. Matuszewski, Frederic Precioso, Violet Snell, Josef Kittler, Teofilo E. de Campos, Adnan M. Khan, Nasir M. Rajpoot, Evdokia Arkoumani, Miangela M. Lacle, Max A. Viergever, Josien P. W. Pluim
The proliferative activity of breast tumors, which is routinely estimated by counting of mitotic figures in hematoxylin and eosin stained histology sections, is considered to be one of the most important prognostic markers.
no code implementations • 5 Oct 2014 • Rupesh Kumar Srivastava, Jonathan Masci, Faustino Gomez, Jürgen Schmidhuber
Recently proposed neural network activation functions such as rectified linear, maxout, and local winner-take-all have allowed for faster and more effective training of deep neural architectures on large and complex datasets.
5 code implementations • 14 Feb 2014 • Jan Koutník, Klaus Greff, Faustino Gomez, Jürgen Schmidhuber
Sequence prediction and classification are ubiquitous and challenging problems in machine learning that can require identifying complex dependencies between temporally distant inputs.
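The Clockwork RNN addresses this by partitioning the hidden state into modules with different clock periods. A simplified sketch of the update rule (the full model also restricts recurrent connections from slower to faster modules):

```python
# Hidden units are grouped into modules with exponentially increasing clock
# periods; a module is updated only on steps divisible by its period.
import numpy as np

periods = [1, 2, 4, 8]     # one clock period per module (illustrative)
module_size = 4

def active_mask(t):
    """1.0 for units whose module ticks at step t, else 0.0."""
    return np.repeat([t % p == 0 for p in periods], module_size).astype(float)

def cwrnn_step(t, x, h, W_h, W_x):
    m = active_mask(t)
    h_new = np.tanh(W_h @ h + W_x @ x)
    return m * h_new + (1.0 - m) * h   # inactive modules hold their state
```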
no code implementations • 19 Dec 2013 • Jürgen Schmidhuber
Deep Learning has attracted significant attention in recent years.
no code implementations • NeurIPS 2013 • Rupesh K. Srivastava, Jonathan Masci, Sohrob Kazerounian, Faustino Gomez, Jürgen Schmidhuber
Local competition among neighboring neurons is common in biological neural networks (NNs).
no code implementations • 1 Sep 2013 • Dan Cireşan, Jürgen Schmidhuber
Our Multi-Column Deep Neural Networks achieve best known recognition rates on Chinese characters from the ICDAR 2011 and 2013 offline handwriting competitions, approaching human performance.
no code implementations • 7 Feb 2013 • Alessandro Giusti, Dan C. Cireşan, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber
Deep Neural Networks now excel at image classification, detection and segmentation.
no code implementations • NeurIPS 2012 • Dan Ciresan, Alessandro Giusti, Luca M. Gambardella, Jürgen Schmidhuber
The input layer maps each window pixel to a neuron.
1 code implementation • 22 Jun 2011 • Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, Jürgen Schmidhuber
This paper presents Natural Evolution Strategies (NES), a recent family of algorithms that constitute a more principled approach to black-box optimization than established evolutionary algorithms.
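The core NES loop fits in a few lines: sample from a Gaussian search distribution, turn fitness ranks into utilities, and follow the resulting search gradient. A bare-bones sketch with illustrative hyperparameters; the full algorithms use natural gradients and also adapt the covariance.

```python
import numpy as np

def nes_minimize(f, mu, sigma=0.5, pop=50, lr=0.05, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    for _ in range(iters):
        eps = rng.normal(size=(pop, mu.size))        # standardised samples
        fitness = np.array([f(mu + sigma * e) for e in eps])
        # Rank-based shaping keeps the update invariant to fitness scaling.
        ranks = fitness.argsort().argsort()
        utility = -(ranks - (pop - 1) / 2) / pop     # best sample -> largest utility
        mu += lr * sigma * (utility @ eps) / pop     # search-gradient step
    return mu

# Example: minimise a quadratic from a random start.
print(nes_minimize(lambda x: np.sum(x ** 2), mu=np.ones(5)))
```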
no code implementations • 1 Feb 2011 • Dan C. Cireşan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber
We present a fast, fully parameterizable GPU implementation of Convolutional Neural Network variants.
no code implementations • NeurIPS 2010 • Yi Sun, Jürgen Schmidhuber, Faustino J. Gomez
We present a new way of converting a reversible finite Markov chain into a non-reversible one, with a theoretical guarantee that the asymptotic variance of the MCMC estimator based on the non-reversible chain is reduced.
no code implementations • NeurIPS 2008 • Alex Graves, Jürgen Schmidhuber
Offline handwriting recognition (the transcription of images of handwritten text) is an interesting task, in that it combines computer vision with sequence learning.
no code implementations • NeurIPS 2007 • Alex Graves, Marcus Liwicki, Horst Bunke, Jürgen Schmidhuber, Santiago Fernández
On-line handwriting recognition is unusual among sequence labelling tasks in that the underlying generator of the observed data, i.e., the movement of the pen, is recorded directly.
1 code implementation • ICML 2006 • Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data.
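CTC handles such data by letting the network emit a label or a blank per frame and collapsing the framewise output with a many-to-one mapping: merge repeats, then remove blanks. A toy sketch of that collapse and greedy decoding; real CTC training sums over all alignments with a forward-backward procedure.

```python
# The CTC collapse function B applied to a framewise (greedy argmax) path.
import itertools

BLANK = "-"

def ctc_collapse(frame_labels):
    merged = [k for k, _ in itertools.groupby(frame_labels)]  # merge repeats
    return [s for s in merged if s != BLANK]                  # drop blanks

# Frame-wise argmax path -> label sequence:
assert ctc_collapse(list("--hh-e-ll-ll--oo-")) == list("hello")
```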