Q-Learning

68 papers with code · Methodology

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.
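As a rough illustration of what learning such a policy can look like, here is a minimal tabular sketch. The 5-state chain environment, the `step`, `train`, and `greedy_policy` helpers, and all hyperparameter values are assumptions made up for this example; none of them come from the papers listed on this page.

```python
import random

# Hypothetical toy environment: a chain of states 0..4 with two actions,
# 0 (move left) and 1 (move right). Reaching state 4 gives reward 1 and
# ends the episode. Everything here is illustrative, not from any paper.
N_STATES, ACTIONS = 5, (0, 1)
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic transition; returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions tie.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next-state value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """The learned policy: the highest-valued action in each non-goal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
```

Calling `greedy_policy(train())` reads the policy off the learned table: for each state, it picks the action with the largest estimated value.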

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Deep Reinforcement Learning with Double Q-learning

22 Sep 2015 tensorpack/tensorpack

The popular Q-learning algorithm is known to overestimate action values under certain conditions.

Q-LEARNING
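The overestimation mentioned in the entry above can be seen with a little arithmetic. The sketch below is a simplified illustration rather than the paper's implementation: the function names and the noisy value estimates are invented for the example. It contrasts the usual target, which takes the maximum over one set of estimates, with a double Q-learning style target that selects the action with one set of estimates and evaluates it with a second, independent set.

```python
def q_learning_target(reward, gamma, q_next):
    """Standard target: the same estimates both select and evaluate the action."""
    return reward + gamma * max(q_next)


def double_q_target(reward, gamma, q_select, q_evaluate):
    """Double Q-learning style target: select with one estimate, evaluate with another."""
    best_action = max(range(len(q_select)), key=lambda a: q_select[a])
    return reward + gamma * q_evaluate[best_action]


# Hypothetical next-state estimates. Assume the true value of every action
# is 0, so any nonzero entry is pure estimation noise.
q_a = [0.3, -0.2, 0.1]
q_b = [-0.1, 0.2, -0.3]

single = q_learning_target(0.0, 0.9, q_a)     # takes the max over the noise in q_a
double = double_q_target(0.0, 0.9, q_a, q_b)  # q_b's noise is independent of the choice
```

With these made-up numbers the standard target is positive even though every true value is zero, because the maximum picks out whichever estimate happens to have the largest positive noise. The decoupled target does not systematically favor positive noise in this way.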

Playing Atari with Deep Reinforcement Learning

19 Dec 2013 tensorpack/tensorpack

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.

ATARI GAMES Q-LEARNING

Increasing the Action Gap: New Operators for Reinforcement Learning

15 Dec 2015 janhuenermann/neurojs

Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator.

ATARI GAMES Q-LEARNING

Continuous control with deep reinforcement learning

9 Sep 2015 facebookresearch/Horizon

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.

CONTINUOUS CONTROL Q-LEARNING

ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning

6 May 2016 NervanaSystems/coach

Here, we propose a novel test-bed platform for reinforcement learning research from raw visual information which employs the first-person perspective in a semi-realistic 3D world.

ATARI GAMES GAME OF DOOM Q-LEARNING

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

NeurIPS 2018 uber-common/deep-neuroevolution

Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better.

POLICY GRADIENT METHODS Q-LEARNING

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

ICLR 2019 uber-common/deep-neuroevolution

Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion.

Q-LEARNING

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

10 Mar 2017 openai/evolution-strategies-starter

We explore the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients.

ATARI GAMES Q-LEARNING

Addressing Function Approximation Error in Actor-Critic Methods

ICML 2018 hill-a/stable-baselines

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.

Q-LEARNING