Efficient Exploration
121 papers with code • 0 benchmarks • 2 datasets
Efficient Exploration is one of the main obstacles in scaling up modern deep reinforcement learning algorithms. The central challenge is balancing the exploitation of current estimates against gaining information about poorly understood states and actions.
Source: Randomized Value Functions via Multiplicative Normalizing Flows
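To make this trade-off concrete, here is a minimal epsilon-greedy rule in Python, the simplest instance of the exploit-versus-explore balance. It is purely illustrative and not the method of the cited paper.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng=np.random.default_rng()):
    """Explore with probability epsilon, otherwise exploit current estimates."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))  # explore: random action
    return int(np.argmax(q_values))              # exploit: greedy action
```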
Benchmarks
These leaderboards are used to track progress in Efficient Exploration
Libraries
Use these libraries to find Efficient Exploration models and implementations

Most implemented papers
Noisy Networks for Exploration
We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent's policy can be used to aid efficient exploration.
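A minimal PyTorch sketch of the paper's factorized-Gaussian noisy linear layer is shown below; the initialization follows the paper (means uniform in ±1/√p, noise scales σ₀/√p with σ₀ = 0.5), but treat it as an illustration rather than the reference implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer whose weights carry learnable, factorized Gaussian noise."""
    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.w_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.w_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.b_mu = nn.Parameter(torch.empty(out_features))
        self.b_sigma = nn.Parameter(torch.empty(out_features))
        bound = 1.0 / math.sqrt(in_features)
        for mu in (self.w_mu, self.b_mu):
            mu.data.uniform_(-bound, bound)
        for sigma in (self.w_sigma, self.b_sigma):
            sigma.data.fill_(sigma0 * bound)

    @staticmethod
    def _f(x):
        # Factorized-noise transform from the paper: f(x) = sign(x) * sqrt(|x|).
        return x.sign() * x.abs().sqrt()

    def forward(self, x):
        # Fresh noise on each forward pass induces a stochastic policy for exploration.
        eps_in = self._f(torch.randn(self.in_features, device=x.device))
        eps_out = self._f(torch.randn(self.out_features, device=x.device))
        weight = self.w_mu + self.w_sigma * eps_out.outer(eps_in)
        bias = self.b_mu + self.b_sigma * eps_out
        return F.linear(x, weight, bias)
```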
Automatic chemical design using a data-driven continuous representation of molecules
We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation.
Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
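The filtering step can be pictured as a sequence of Gaussian posterior updates over the latent task variable. The toy NumPy sketch below multiplies Gaussian factors together as evidence arrives; in the actual method the per-transition factors come from a learned inference network, so `factor_mu` and `factor_var` here are hypothetical stand-ins.

```python
import numpy as np

def update_posterior(mu, var, factor_mu, factor_var):
    """Fold one Gaussian evidence factor into a Gaussian posterior over the latent task."""
    # Product of Gaussians: precisions add, means combine precision-weighted.
    new_var = 1.0 / (1.0 / var + 1.0 / factor_var)
    new_mu = new_var * (mu / var + factor_mu / factor_var)
    return new_mu, new_var

# Each new transition tightens the belief about which task is being solved.
mu, var = 0.0, 1.0  # prior over a 1-D latent task variable
for f_mu, f_var in [(0.8, 0.5), (1.1, 0.5), (0.9, 0.4)]:
    mu, var = update_posterior(mu, var, f_mu, f_var)
```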
Deep Exploration via Bootstrapped DQN
Efficient exploration in complex environments remains a major challenge for reinforcement learning.
Stochastic Gradient Hamiltonian Monte Carlo
Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.
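The paper's stochastic-gradient variant adds friction and injected noise to the Hamiltonian dynamics so that noisy minibatch gradients still sample from the target. Below is a minimal NumPy sketch of one SGHMC step, with illustrative step sizes (not values from the paper) and the noise-estimate term set to zero.

```python
import numpy as np

def sghmc_step(theta, v, grad_log_prob, eta=1e-3, alpha=0.01, rng=np.random):
    """One SGHMC update: momentum with friction (alpha) plus compensating noise."""
    # -grad_log_prob(theta) is the gradient of the potential energy U(theta).
    v = (v - eta * (-grad_log_prob(theta)) - alpha * v
         + rng.normal(scale=np.sqrt(2.0 * alpha * eta), size=np.shape(theta)))
    return theta + v, v
```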
Neural Contextual Bandits with UCB-based Exploration
To the best of our knowledge, it is the first neural network-based contextual bandit algorithm with a near-optimal regret guarantee.
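The "optimism" driving UCB-based exploration can be illustrated with the classical tabular UCB1 rule: inflate each arm's value estimate by an uncertainty bonus that shrinks as the arm is pulled. NeuralUCB replaces the estimate with a neural network and derives the bonus from its gradients; this simplified sketch only conveys the principle and is not the paper's algorithm.

```python
import numpy as np

def ucb1_select(counts, means, t, c=2.0):
    """Pick the arm maximizing estimated value plus an exploration bonus.

    counts: pulls per arm; means: empirical mean reward per arm;
    t: total pulls so far (>= 1).
    """
    bonus = np.where(counts > 0,
                     np.sqrt(c * np.log(t) / np.maximum(counts, 1)),
                     np.inf)  # untried arms are maximally optimistic
    return int(np.argmax(means + bonus))
```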
Data-Efficient Exploration, Optimization, and Modeling of Diverse Designs through Surrogate-Assisted Illumination
The MAP-Elites algorithm produces a set of high-performing solutions that vary according to features defined by the user.
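A compact sketch of the underlying MAP-Elites loop is given below: an archive keeps the best solution found in each cell of the user-defined feature space. The `evaluate`, `random_solution`, and `mutate` callables are placeholders for a specific domain, and features are assumed to lie in [0, 1).

```python
import random

def map_elites(evaluate, random_solution, mutate, cells, iters=10_000):
    """Illuminate a feature space: archive maps feature cell -> (fitness, solution)."""
    archive = {}
    for _ in range(iters):
        if archive:
            parent = random.choice(list(archive.values()))[1]
            candidate = mutate(parent)
        else:
            candidate = random_solution()
        fitness, features = evaluate(candidate)
        # Discretize the feature descriptor into the archive's grid of cells.
        cell = tuple(min(int(f * c), c - 1) for f, c in zip(features, cells))
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, candidate)  # keep each cell's elite
    return archive
```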
Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents
We investigate the task of learning to follow natural language instructions by jointly reasoning with visual observations and language inputs.
ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance
For cost reduction, we developed and experimentally validated two approaches: using scaled-up big data jobs as proxies for the objective function of larger jobs, and using a dynamic job similarity measure to infer that results obtained for one kind of big data problem will also work well for similar problems.
Scaling MAP-Elites to Deep Neuroevolution
Quality-Diversity (QD) algorithms, and MAP-Elites (ME) in particular, have proven very useful for a broad range of applications including enabling real robots to recover quickly from joint damage, solving strongly deceptive maze tasks or evolving robot morphologies to discover new gaits.