# Q-Learning

342 papers with code • 0 benchmarks • 2 datasets

The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.

( Image credit: Playing Atari with Deep Reinforcement Learning )

## Benchmarks

These leaderboards are used to track progress in Q-Learning
## Libraries

Use these libraries to find Q-Learning models and implementations## Most implemented papers

# Continuous control with deep reinforcement learning

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.

# Playing Atari with Deep Reinforcement Learning

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.

# Deep Reinforcement Learning with Double Q-learning

The popular Q-learning algorithm is known to overestimate action values under certain conditions.

# Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

We explore deep reinforcement learning methods for multi-agent domains.

# Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

A platform for Applied Reinforcement Learning (Applied RL)

# Addressing Function Approximation Error in Actor-Critic Methods

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.

# Evolution Strategies as a Scalable Alternative to Reinforcement Learning

We explore the use of Evolution Strategies (ES), a class of black box optimization algorithms, as an alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients.

# A disembodied developmental robotic agent called Samu Bátfai

The basic objective of this paper is to reach the same results using reinforcement learning with general function approximators that can be achieved by using the classical Q lookup table on small input samples.

# Conservative Q-Learning for Offline Reinforcement Learning

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.

# Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion.