Q-learning is a model-free reinforcement learning algorithm. Its goal is to learn a policy that tells an agent which action to take in each state.
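As an illustration of what learning such a policy can look like, here is a minimal sketch of tabular Q-learning in Python. The corridor environment, the function names (`q_learning`, `greedy_policy`), and all hyperparameter values are assumptions made for this example and are not taken from any of the papers listed on this page.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left (staying put at state 0),
    action 1 moves right. Entering the last state gives reward 1 and ends
    the episode; every other transition gives reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])

            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move Q(s, a) toward
            # r + gamma * max_a' Q(s', a').
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

In this toy setting the learned greedy policy should choose "right" (action 1) in every non-terminal state. The terminal state is never updated, so its entry in the policy carries no meaning.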

(Image credit: Playing Atari with Deep Reinforcement Learning)


Latest papers without code

Survey on Multi-Agent Q-Learning frameworks for resource management in wireless sensor network

5 May 2021

In the fourth section, the author surveys the game-theoretic frameworks that researchers have used to address this problem for resource allocation and task scheduling in wireless sensor networks.


HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation

4 May 2021

Second, the overall design space composed of HW/SW partitioning, hardware optimization, and software optimization is huge.


CARL-DTN: Context Adaptive Reinforcement Learning based Routing Algorithm in Delay Tolerant Network

2 May 2021

The result shows that the proposed protocol has better performance in terms of message delivery ratio and overhead.


RP-DQN: An application of Q-Learning to Vehicle Routing Problems

25 Apr 2021

In this paper we present a new approach to tackle complex routing problems with an improved state representation that utilizes the model complexity better than previous methods.


Reinforcement Learning for Traffic Signal Control: Comparison with Commercial Systems

21 Apr 2021

Recently, Intelligent Transportation Systems are leveraging the power of increased sensory coverage and computing power to deliver data-intensive solutions achieving higher levels of performance than traditional systems.


Model-aided Deep Reinforcement Learning for Sample-efficient UAV Trajectory Design in IoT Networks

21 Apr 2021

Deep Reinforcement Learning (DRL) is gaining attention as a potential approach to design trajectories for autonomous unmanned aerial vehicles (UAV) used as flying access points in the context of cellular or Internet of Things (IoT) connectivity.


A Simulated Experiment to Explore Robotic Dialogue Strategies for People with Dementia

18 Apr 2021

People with Alzheimer's disease and related dementias (ADRD) often exhibit repetitive questioning, which places a great burden on persons with ADRD (PwDs) and their caregivers.


Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills

15 Apr 2021

We consider the problem of learning useful robotic skills from previously collected offline data without access to manually specified rewards or additional online exploration, a setting that is becoming increasingly important for scaling robot learning by reusing past robotic data.


Prospect-theoretic Q-learning

12 Apr 2021

We consider a prospect theoretic version of the classical Q-learning algorithm for discounted reward Markov decision processes, wherein the controller perceives a distorted and noisy future reward, modeled by a nonlinearity that accentuates gains and underrepresents losses relative to a reference point.


Group Equivariant Neural Architecture Search via Group Decomposition and Reinforcement Learning

10 Apr 2021

We address these problems by proving a new group-theoretic result in the context of equivariant neural networks that shows that a network is equivariant to a large group if and only if it is equivariant to smaller groups from which it is constructed.