Efficient Exploration

155 papers with code • 0 benchmarks • 3 datasets

Efficient exploration is one of the main obstacles to scaling up modern deep reinforcement learning algorithms. The main challenge is balancing exploitation of current estimates against gathering information about poorly understood states and actions.

Source: Randomized Value Functions via Multiplicative Normalizing Flows
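
The trade-off described above can be illustrated on the simplest possible setting, a multi-armed bandit. The sketch below uses UCB1, a standard bandit rule that is not taken from the source paper: each arm's score is its current reward estimate plus a bonus that grows for rarely tried arms. The function name `ucb1`, the Bernoulli rewards, and the arm means are assumptions made for illustration only.

```python
import math
import random


def ucb1(arm_means, steps, seed=0):
    """Run UCB1 on a Bernoulli bandit and return the pull count per arm.

    arm_means are hypothetical success probabilities, unknown to the agent.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms    # how often each arm has been pulled
    totals = [0.0] * n_arms  # summed reward per arm

    for t in range(1, steps + 1):
        if t <= n_arms:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # current estimate (exploitation) + uncertainty bonus (exploration)
            scores = [
                totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=scores.__getitem__)
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward
    return counts


if __name__ == "__main__":
    print(ucb1([0.1, 0.9], steps=2000))
```

The bonus shrinks as an arm is pulled more often, so poorly understood arms keep being revisited until their estimates are reliable enough to rule them out.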


Most implemented papers

Noisy Networks for Exploration

opendilab/DI-engine ICLR 2018

We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent's policy can be used to aid efficient exploration.
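
As a rough illustration of "parametric noise added to its weights", the sketch below implements the forward pass of a single linear layer whose weights and biases are a mean plus a scaled Gaussian perturbation. It is not the authors' code: the class name `NoisyLinear`, the initialisation range, and the default `sigma_init` are assumptions, and training of the mean and scale parameters is omitted entirely.

```python
import random


class NoisyLinear:
    """Linear layer computing y = (w_mu + w_sigma * w_eps) x + (b_mu + b_sigma * b_eps).

    In a trained agent w_mu, w_sigma, b_mu and b_sigma would be learned;
    here they are fixed at illustrative initial values.
    """

    def __init__(self, n_in, n_out, sigma_init=0.017, seed=0):
        self.rng = random.Random(seed)
        self.n_in, self.n_out = n_in, n_out
        bound = (3.0 / n_in) ** 0.5  # assumed initialisation range
        self.w_mu = [[self.rng.uniform(-bound, bound) for _ in range(n_in)]
                     for _ in range(n_out)]
        self.b_mu = [self.rng.uniform(-bound, bound) for _ in range(n_out)]
        self.w_sigma = [[sigma_init] * n_in for _ in range(n_out)]
        self.b_sigma = [sigma_init] * n_out
        self.resample_noise()

    def resample_noise(self):
        """Draw fresh standard-normal noise for every weight and bias."""
        self.w_eps = [[self.rng.gauss(0.0, 1.0) for _ in range(self.n_in)]
                      for _ in range(self.n_out)]
        self.b_eps = [self.rng.gauss(0.0, 1.0) for _ in range(self.n_out)]

    def forward(self, x, noisy=True):
        """Apply the layer; noisy=False uses only the mean parameters."""
        scale = 1.0 if noisy else 0.0
        out = []
        for j in range(self.n_out):
            y = self.b_mu[j] + scale * self.b_sigma[j] * self.b_eps[j]
            for i in range(self.n_in):
                w = self.w_mu[j][i] + scale * self.w_sigma[j][i] * self.w_eps[j][i]
                y += w * x[i]
            out.append(y)
        return out
```

Calling `resample_noise()` between forward passes changes the layer's output for the same input, which is what makes the resulting policy stochastic.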

Automatic chemical design using a data-driven continuous representation of molecules

aspuru-guzik-group/chemical_vae 7 Oct 2016

We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation.

Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables

katerakelly/oyster ICLR Workshop LLD 2019

In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.

Deep Exploration via Bootstrapped DQN

tensorflow/models NeurIPS 2016

Efficient exploration in complex environments remains a major challenge for reinforcement learning.

Stochastic Gradient Hamiltonian Monte Carlo

JavierAntoran/Bayesian-Neural-Networks 17 Feb 2014

Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.
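
For readers unfamiliar with the mechanism, here is a minimal sketch of plain HMC for a one-dimensional target, using leapfrog integration followed by a Metropolis-Hastings accept step. It is the full-gradient baseline described in the sentence above, not the stochastic-gradient variant the paper proposes. The function name `hmc_sample`, the step size, and the number of leapfrog steps are illustrative assumptions.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a 1-D density given its log-density and gradient."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # fresh momentum for each proposal
        x_new, p_new = x, p

        # Leapfrog integration: half momentum step, alternating full steps,
        # then a final half momentum step.
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        for i in range(n_leapfrog):
            x_new += step_size * p_new
            if i < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(x_new)
        p_new += 0.5 * step_size * grad_log_prob(x_new)

        # Hamiltonian = potential energy (-log density) + kinetic energy.
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new

        # Metropolis-Hastings correction for integration error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    # Standard normal target: log p(x) = -x^2 / 2 up to a constant.
    draws = hmc_sample(lambda x: -0.5 * x * x, lambda x: -x, x0=3.0, n_samples=2000)
    print(sum(draws) / len(draws))
```

Because each proposal follows the gradient for several steps, it can land far from the current point while still being accepted with high probability, in contrast to a random-walk proposal of comparable size.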

Data-Efficient Exploration, Optimization, and Modeling of Diverse Designs through Surrogate-Assisted Illumination

agaier/sail_gecco2017 13 Feb 2017

The MAP-Elites algorithm produces a set of high-performing solutions that vary according to features defined by the user.
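
A minimal sketch of the basic MAP-Elites loop may make this concrete. It keeps one best solution (an "elite") per cell of a discretised feature space and generates new candidates by mutating existing elites. This is the plain algorithm, not the surrogate-assisted variant the paper proposes. The function name `map_elites`, the two-gene solutions in [0, 1], and the Gaussian mutation are assumptions chosen for brevity.

```python
import random


def map_elites(fitness, feature, n_cells, n_iters, n_init=20, sigma=0.1, seed=0):
    """Return an archive mapping feature cell -> (fitness, solution).

    feature(x) is assumed to return a value in [0, 1], which is split into
    n_cells equal bins; solutions are lists of two floats in [0, 1].
    """
    rng = random.Random(seed)
    archive = {}
    for i in range(n_iters):
        if i < n_init or not archive:
            # Bootstrap the archive with random solutions.
            x = [rng.random(), rng.random()]
        else:
            # Pick a random elite and mutate it, clipping to the valid range.
            parent = rng.choice(list(archive.values()))[1]
            x = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in parent]
        cell = min(n_cells - 1, int(feature(x) * n_cells))
        f = fitness(x)
        # Keep the candidate only if its cell is empty or it beats the elite.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, x)
    return archive


if __name__ == "__main__":
    # Hypothetical task: the feature is the first gene, fitness peaks when
    # the second gene equals 0.5.
    result = map_elites(
        fitness=lambda x: -(x[1] - 0.5) ** 2,
        feature=lambda x: x[0],
        n_cells=10,
        n_iters=2000,
    )
    for cell in sorted(result):
        print(cell, round(result[cell][0], 4))
```

The returned archive is the "set of high-performing solutions": one per feature cell, so the result varies along the user-defined feature rather than collapsing to a single optimum.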

Neural Contextual Bandits with UCB-based Exploration

sauxpa/neural_exploration ICML 2020

To the best of our knowledge, it is the first neural network-based contextual bandit algorithm with a near-optimal regret guarantee.

Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents

xwhan/walk_the_blocks 16 Jun 2018

We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs.

ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance

ARiSE-Lab/ConEX__Replication_Package 17 Oct 2019

To reduce cost, we developed and experimentally validated two approaches: using scaled-up big data jobs as proxies for the objective function of larger jobs, and using a dynamic job similarity measure to infer that results obtained for one kind of big data problem will work well for similar problems.

Scaling MAP-Elites to Deep Neuroevolution

uber-research/Map-Elites-Evolutionary 3 Mar 2020

Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, have proven very useful for a broad range of applications including enabling real robots to recover quickly from joint damage, solving strongly deceptive maze tasks or evolving robot morphologies to discover new gaits.