Q-Learning
390 papers with code • 0 benchmarks • 2 datasets
Q-learning is a model-free reinforcement learning algorithm whose goal is to learn a policy that tells an agent which action to take in each state.
(Image credit: Playing Atari with Deep Reinforcement Learning)
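To make the idea of learning a policy concrete, here is a minimal sketch of tabular Q-learning. It is an illustration only and is not taken from any paper listed on this page. The five-state chain environment, the `step` helper, and all hyperparameter values are assumptions chosen for the example.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (an assumption): reward 1.0 only on reaching the right end."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with the standard one-step Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Bootstrap from the best next-state value unless the episode ended.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Read the policy off the Q-table: best action in each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


q_table = q_learning()
policy = greedy_policy(q_table)
print(policy)
```

The policy is not stored separately: it is obtained by taking the highest-valued action in each row of the Q-table. In this toy chain the learned policy should favor moving right, toward the only rewarding state.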
Libraries
Use these libraries to find Q-Learning models and implementations
Most implemented papers
Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
However, prior methods typically require accurate estimation of the behavior policy or sampling from OOD data points, which can itself be a non-trivial problem.
Deep Reinforcement Learning with a Natural Language Action Space
This paper introduces a novel architecture for reinforcement learning with deep neural networks designed to handle state and action spaces characterized by natural language, as found in text-based games.
Taming the Noise in Reinforcement Learning via Soft Updates
We propose G-learning, a new off-policy learning algorithm that regularizes the value estimates by penalizing deterministic policies at the beginning of the learning process.
Learning to Communicate with Deep Multi-Agent Reinforcement Learning
We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility.
Reinforcement Learning with Deep Energy-Based Policies
We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before.
Mean Field Multi-Agent Reinforcement Learning
Existing multi-agent reinforcement learning methods are typically limited to a small number of agents.
Deep Reinforcement Learning for Traffic Light Control in Vehicular Networks
To dynamically adjust traffic signal durations, existing works either split the traffic signal into equal durations or extract only limited traffic information from real data.
Deep Quality-Value (DQV) Learning
We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Learning.
Adversarial Learning of a Sampler Based on an Unnormalized Distribution
We investigate adversarial learning in the case when only an unnormalized form of the density can be accessed, rather than samples.
Deep Reinforcement Learning for Imbalanced Classification
The agent finally finds an optimal classification policy in imbalanced data under the guidance of a specific reward function and a beneficial learning environment.