
# Efficient Exploration

24 papers with code · Methodology

No evaluation results yet. Help compare methods by submitting evaluation metrics.

# Deep exploration by novelty-pursuit with maximum state entropy

Efficient exploration is essential to reinforcement learning in huge state spaces.

# Clustered Reinforcement Learning

Exploration strategy design is one of the challenging problems in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards.

# Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies

We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph which describes a set of subtasks and their dependencies that are unknown to the agent.

# Efficient Exploration via State Marginal Matching

We address these shortcomings by learning a single exploration policy that can quickly solve a suite of downstream tasks in a multi-task setting, amortizing the cost of learning to explore.

# Regulatory Focus: Promotion and Prevention Inclinations in Policy Search

# Implicit Generative Modeling for Efficient Exploration

Each random draw from our generative model is a neural network that instantiates the dynamics function; multiple draws thus approximate the posterior, and the variance in future predictions under this posterior is used as an intrinsic reward for exploration.
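The ensemble-as-posterior idea in this abstract can be illustrated with a hedged toy sketch. This is not the paper's implementation: the tiny linear "networks" and all names below are invented here purely to show how disagreement among sampled dynamics models yields an intrinsic reward.

```python
# Toy sketch: intrinsic reward from disagreement among sampled dynamics models.
# All model details here are illustrative assumptions, not the paper's code.
import numpy as np

rng = np.random.default_rng(0)

def make_dynamics_model(dim=4):
    """One random draw: a tiny linear stand-in for a neural dynamics model."""
    W = rng.normal(scale=0.1, size=(dim, dim))
    b = rng.normal(scale=0.1, size=dim)
    return lambda s: W @ s + b

# Multiple draws approximate a posterior over the dynamics function.
ensemble = [make_dynamics_model() for _ in range(8)]

def intrinsic_reward(state):
    """Mean per-dimension variance of next-state predictions across the ensemble."""
    preds = np.stack([f(state) for f in ensemble])  # shape (ensemble_size, dim)
    return preds.var(axis=0).mean()  # higher where the models disagree

print(intrinsic_reward(np.ones(4)))
```

In practice the linear maps would be trained neural networks and the reward would be added to the environment reward during rollout collection.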

# Learning Index Selection with Structured Action Spaces

16 Sep 2019

Configuration spaces for computer systems can be challenging for traditional and automatic tuning strategies.

# Biased Estimates of Advantages over Path Ensembles

15 Sep 2019

The estimation of advantage is crucial for a number of reinforcement learning algorithms, as it directly influences the choices of future paths.
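For context on the advantage estimation this abstract refers to, a common baseline estimator is Generalized Advantage Estimation (GAE, Schulman et al., 2016). The sketch below shows that standard technique, not necessarily the estimator this paper studies.

```python
# Minimal sketch of Generalized Advantage Estimation (GAE).
# A standard advantage estimator; shown for context, not this paper's method.
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Compute advantages; `values` carries one extra bootstrap entry,
    so len(values) == len(rewards) + 1."""
    advantages = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD error
        running = delta + gamma * lam * running  # discounted sum of TD errors
        advantages[t] = running
    return advantages

print(gae([1.0, 0.0, 1.0], [0.5, 0.4, 0.6, 0.0]))
```

The exponentially weighted sum of TD errors trades bias against variance via `lam`; biases in this estimate propagate directly into the policy update, which is why path-ensemble effects matter.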

# $\sqrt{n}$-Regret for Learning in Markov Decision Processes with Function Approximation and Low Bellman Rank

5 Sep 2019

Our learning algorithm, Adaptive Value-function Elimination (AVE), is inspired by the policy elimination algorithm proposed in (Jiang et al., 2017), known as OLIVE.

# Improving a State-of-the-Art Heuristic for the Minimum Latency Problem with Data Mining

28 Aug 2019

Recently, hybrid metaheuristics have become a trend in operations research.