Furthermore, we present a new strong baseline for more consistent predictive confidence in deep models, called fitted ensembles, where overconfident predictions are rectified by transformed versions of the original classification task.
Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples.
The promise of reinforcement learning is to solve complex sequential decision problems autonomously by specifying a high-level reward function only.
Ranked #1 on Atari Games on Atari 2600 Montezuma's Revenge
Recent advances in machine learning are consistently enabled by increasing amounts of computation.
Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning.
We show that neuromodulated plasticity improves the performance of neural networks on both reinforcement learning and supervised learning tasks.
Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it.
Deep reinforcement learning approaches have shown impressive results in a variety of different domains, however, more complex heterogeneous architectures such as world models require the different neural components to be trained separately instead of end-to-end.
This paper introduces GTNs, discusses their potential, and showcases that they can substantially accelerate learning.
Evolutionary-based optimization approaches have recently shown promising results in domains such as Atari and robot locomotion but less so in solving 3D tasks directly from pixels.
Nature's spectacular inventiveness, reflected in the enormous diversity of form and function displayed by the biosphere, is a feature of life that distinguishes living most strongly from nonliving.
Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances.
Instead of the relatively simple architectures employed in most RL experiments, world models rely on multiple different neural components that are responsible for visual information processing, memory, and decision-making.
Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.
Ranked #1 on Atari Games on Atari 2600 Pitfall!
Our results show that POET produces a diverse range of sophisticated behaviors that solve a wide range of environmental challenges, many of which cannot be solved by direct optimization alone, or even through a direct-path curriculum-building control algorithm introduced to highlight the critical role of open-endedness in solving ambitious challenges.
Recent advances in deep neuroevolution have demonstrated that evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), can scale to train deep neural networks to solve difficult reinforcement learning (RL) problems.
no code implementations • 9 Mar 2018 • Joel Lehman, Jeff Clune, Dusan Misevic, Christoph Adami, Lee Altenberg, Julie Beaulieu, Peter J. Bentley, Samuel Bernard, Guillaume Beslon, David M. Bryson, Patryk Chrabaszcz, Nick Cheney, Antoine Cully, Stephane Doncieux, Fred C. Dyer, Kai Olav Ellefsen, Robert Feldt, Stephan Fischer, Stephanie Forrest, Antoine Frénoy, Christian Gagné, Leni Le Goff, Laura M. Grabowski, Babak Hodjat, Frank Hutter, Laurent Keller, Carole Knibbe, Peter Krcah, Richard E. Lenski, Hod Lipson, Robert MacCurdy, Carlos Maestre, Risto Miikkulainen, Sara Mitri, David E. Moriarty, Jean-Baptiste Mouret, Anh Nguyen, Charles Ofria, Marc Parizeau, David Parsons, Robert T. Pennock, William F. Punch, Thomas S. Ray, Marc Schoenauer, Eric Shulte, Karl Sims, Kenneth O. Stanley, François Taddei, Danesh Tarapore, Simon Thibault, Westley Weimer, Richard Watson, Jason Yosinski
Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them.
Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e. g. hours vs. days) because they parallelize better.
Because stochastic gradient descent (SGD) has shown promise optimizing neural networks with millions of parameters and few if any alternatives are known to exist, it has moved to the heart of leading approaches to reinforcement learning (RL).
Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion.
While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks.
However, this ES optimizes for a different gradient than just reward: It optimizes for the average reward of the entire population, thereby seeking parameters that are robust to perturbation.
Inspired by such intricate natural phenomena, Evolved Plastic Artificial Neural Networks (EPANNs) use simulated evolution in-silico to breed plastic neural networks with a large variety of dynamics, architectures, and plasticity rules: these artificial systems are composed of inputs, outputs, and plastic components that change in response to experiences in an environment.
Though deep learning has pushed the boundaries of classification forward, in recent years hints of the limits of standard classification have begun to emerge.
To address the difficulty of creating online collaborative evolutionary systems, this paper presents a new prototype library called Worldwide Infrastructure for Neuroevolution (WIN) and its accompanying site WIN Online (http://winark. org/).
Unlike unsupervised approaches such as autoencoders that learn to reconstruct their inputs, this paper introduces an alternative approach to unsupervised feature learning called divergent discriminative feature accumulation (DDFA) that instead continually accumulates features that make novel discriminations among the training set.
An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights.