Search Results for author: Miguel Ayala Botto

Found 2 papers, 2 papers with code

Control with adaptive Q-learning

1 code implementation • 3 Nov 2020 • João Pedro Araújo, Mário A. T. Figueiredo, Miguel Ayala Botto

The main difference between AQL and SPAQL is that the latter learns time-invariant policies, where the mapping from states to actions does not depend explicitly on the time step.

OpenAI Gym • Q-Learning • +1
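
The abstract above contrasts policies that depend on the time step with time-invariant ones. As a rough illustration only, the sketch below shows that contrast with plain lookup tables over a small finite state and action set. This is a simplification: AQL and SPAQL work on adaptively partitioned state-action spaces, and the function names, table layout, and default value of 0.0 here are hypothetical, not taken from the papers or their code.

```python
from typing import Dict, Sequence, Tuple

State = int
Action = int


def act_time_dependent(
    q: Dict[Tuple[int, State, Action], float],
    step: int,
    state: State,
    actions: Sequence[Action],
) -> Action:
    """Greedy action when value estimates are indexed by (time step, state, action).

    The same state can map to different actions at different time steps.
    Missing entries default to 0.0 (an assumption made for this sketch).
    """
    return max(actions, key=lambda a: q.get((step, state, a), 0.0))


def act_time_invariant(
    q: Dict[Tuple[State, Action], float],
    state: State,
    actions: Sequence[Action],
) -> Action:
    """Greedy action when value estimates are indexed by (state, action) only.

    The mapping from states to actions does not depend on the time step.
    """
    return max(actions, key=lambda a: q.get((state, a), 0.0))


# Toy tables with made-up values, used only to exercise the two functions.
q_time_dependent = {
    (0, 0, 0): 1.0, (0, 0, 1): 0.5,  # at step 0, action 0 looks better in state 0
    (1, 0, 0): 0.2, (1, 0, 1): 0.9,  # at step 1, action 1 looks better in state 0
}
q_time_invariant = {(0, 0): 0.3, (0, 1): 0.7}

choice_step0 = act_time_dependent(q_time_dependent, 0, 0, [0, 1])
choice_step1 = act_time_dependent(q_time_dependent, 1, 0, [0, 1])
choice_any_step = act_time_invariant(q_time_invariant, 0, [0, 1])
```

A time-invariant table needs one entry per state-action pair rather than one per pair per time step, which is the practical appeal of the second form in this toy setting.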

Single-partition adaptive Q-learning

1 code implementation • 14 Jul 2020 • João Pedro Araújo, Mário Figueiredo, Miguel Ayala Botto

This paper introduces single-partition adaptive Q-learning (SPAQL), an algorithm for model-free episodic reinforcement learning (RL) that adaptively partitions the state-action space of a Markov decision process (MDP) while simultaneously learning a time-invariant policy (i.e., the mapping from states to actions does not depend explicitly on the episode time step) for maximizing the cumulative reward.

Q-Learning • Reinforcement Learning (RL)
