
The goal of Q-learning is to learn a policy, which tells an agent what action to take under what circumstances.
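
To make the idea concrete, below is a minimal tabular Q-learning sketch. The environment interface (`env.reset()`, `env.step(action)`) and all hyperparameters are illustrative assumptions, not taken from any of the papers listed here.

```python
import random
from collections import defaultdict

def q_learning(env, n_actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """Tabular Q-learning with epsilon-greedy exploration (illustrative)."""
    Q = defaultdict(lambda: [0.0] * n_actions)  # Q[state][action]
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy: explore with probability eps, otherwise act greedily
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s2, r, done = env.step(a)  # assumed interface: (next_state, reward, done)
            # Off-policy update: bootstrap from the greedy value at the next state
            target = r + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q
```

The learned policy is then read off greedily, taking `argmax` over `Q[s]` in each state.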


This paper presents the ModelicaGym toolbox, developed to employ reinforcement learning (RL) for solving optimization and control tasks in Modelica models.

To achieve the above goal, we employ reinforcement learning, and in particular Deep Q-learning (DQN), to learn optimal push policies by trial and error.
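
As a rough sketch of what a deep Q-learning update looks like (a generic DQN step, not this paper's actual push-policy implementation), assuming PyTorch, a small fully connected network, and transitions sampled from a replay buffer:

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small Q-network mapping observations to one value per discrete action."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions)
        )

    def forward(self, x):
        return self.net(x)

def dqn_step(qnet, target_net, optimizer, batch, gamma=0.99):
    # batch: float tensors except `act` (int64); `done` is a 0/1 float mask.
    obs, act, rew, next_obs, done = batch
    q = qnet(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrap from a slowly updated target network for stability
        target = rew + gamma * (1 - done) * target_net(next_obs).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```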

The algorithm is of the gradient type (and therefore has good convergence properties even when used in conjunction with function approximators such as neural networks); it is off-policy; and it specifies both the update equations and the strategy to address the exploration-exploitation dilemma.
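
The abstract leaves the concrete exploration strategy to the paper itself; one common illustrative choice for trading off exploration and exploitation is a linearly annealed epsilon-greedy schedule:

```python
def epsilon_schedule(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps.

    Early training explores almost uniformly; later training mostly exploits.
    All constants here are illustrative, not the paper's.
    """
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)
```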

As more and more application providers transition to the cloud and deliver their services on a Software as a Service (SaaS) basis, cloud providers need to make their provisioning systems agile enough to deliver on Service Level Agreements.

The experiments show that learning high-level knowledge in the form of reward machines can lead to fast convergence to optimal policies in RL, while standard RL methods such as Q-learning and hierarchical RL methods fail to converge to optimal policies even after a substantial number of training steps in many tasks.

We introduce a new problem named "grasping the invisible", where a robot is tasked to grasp an initially invisible target object via a sequence of non-prehensile (e.g., pushing) and prehensile (e.g., grasping) actions.

While this was initially proposed for Markov Decision Processes (MDPs) in tabular settings, it was recently shown that a similar principle leads to significant improvements over vanilla SQL in RL for high-dimensional domains with discrete actions and function approximators.

Motivated by the widespread use of temporal-difference (TD) and Q-learning algorithms in reinforcement learning, this paper studies a class of biased stochastic approximation (SA) procedures under a mild "ergodic-like" assumption on the underlying stochastic noise sequence.
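
For intuition, linear TD(0) is a canonical example of such a biased SA procedure: the updates are driven by successive states of a single Markov chain rather than i.i.d. noise. A hypothetical sketch, where the feature map `phi` and the transition sampler are assumed interfaces:

```python
import numpy as np

def td0_linear(sample_transition, phi, dim, gamma=0.99, steps=10_000):
    """SA iteration theta_{k+1} = theta_k + alpha_k * g(theta_k, xi_k),
    where xi_k = (s, r, s') comes from a Markov chain (Markovian, non-i.i.d. noise).
    """
    theta = np.zeros(dim)
    for k in range(1, steps + 1):
        s, r, s2 = sample_transition()           # one step of the underlying chain
        delta = r + gamma * phi(s2) @ theta - phi(s) @ theta  # TD error
        theta += (1.0 / k) * delta * phi(s)      # diminishing step size alpha_k = 1/k
    return theta
```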

In this paper, we use an aerial base station (aerial-BS) to enhance fairness in a dynamic environment with user mobility.

We explore fixed-horizon temporal difference (TD) methods, reinforcement learning algorithms for a new kind of value function that predicts the sum of rewards over a $\textit{fixed}$ number of future time steps.
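
A minimal tabular sketch of the fixed-horizon idea, under an assumed integer-state environment with a `sample_action()` helper (both hypothetical): keep one value table per horizon $h$, where `V[h][s]` estimates the expected sum of the next $h$ rewards, and bootstrap each table from the one below it.

```python
import numpy as np

def fixed_horizon_td(env, n_states, H=5, alpha=0.1, episodes=500):
    V = np.zeros((H + 1, n_states))  # V[0] is identically zero by definition
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            s2, r, done = env.step(env.sample_action())  # assumed interface
            for h in range(1, H + 1):
                # The h-step value bootstraps from the (h-1)-step value at s'
                target = r + (0.0 if done else V[h - 1][s2])
                V[h][s] += alpha * (target - V[h][s])
            s = s2
    return V
```

Note that the bootstrap target `V[h-1][s2]` never refers to the table currently being updated, which distinguishes these targets from standard TD bootstrapping.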