9 papers with code • 0 benchmarks • 0 datasets

The acrobot system consists of two links and two joints, of which only the joint between the two links is actuated. Initially, both links hang downwards, and the goal is to swing the free end of the lower link up to a given height.
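The goal condition can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: it assumes unit link lengths and a target height of 1.0 above the fixed pivot (the convention used by the Gymnasium `Acrobot-v1` environment), with angles measured from the downward vertical. The function names `tip_height` and `reached_goal` are hypothetical.

```python
import math


def tip_height(theta1: float, theta2: float,
               link1: float = 1.0, link2: float = 1.0) -> float:
    """Height of the free end of the lower link relative to the fixed pivot.

    theta1 is the angle of the first link from the downward vertical;
    theta2 is the angle of the second link relative to the first.
    Link lengths default to 1.0 (an assumption, not stated in the text).
    """
    return -link1 * math.cos(theta1) - link2 * math.cos(theta1 + theta2)


def reached_goal(theta1: float, theta2: float, target: float = 1.0) -> bool:
    """True when the free end is above the target height (assumed to be 1.0)."""
    return tip_height(theta1, theta2) > target


if __name__ == "__main__":
    # Both links hanging straight down: the tip is two link lengths below the pivot.
    print(tip_height(0.0, 0.0), reached_goal(0.0, 0.0))
    # Both links pointing straight up: the tip is two link lengths above the pivot.
    print(tip_height(math.pi, 0.0), reached_goal(math.pi, 0.0))
```

With both angles at zero the tip sits at its lowest point, so any controller has to pump energy into the system through the single actuated joint before the condition can become true.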

Most implemented papers

Criticality as It Could Be: organizational invariance as self-organized criticality in embodied agents

MiguelAguilera/Criticality-as-It-Could-Be 18 Apr 2017

This paper outlines a methodological approach for designing adaptive agents driving themselves near points of criticality.

Adaptation to criticality through organizational invariance in embodied agents

MiguelAguilera/Adaptation-to-criticality-through-organizational-invariance 13 Dec 2017

In order to explore how criticality might emerge from general adaptive mechanisms, we propose a simple learning rule that maintains an internal organizational structure from a specific family of systems at criticality.

Meta-learning curiosity algorithms

mfranzs/meta-learning-curiosity-algorithms ICLR 2020

We hypothesize that curiosity is a mechanism found by evolution that encourages meaningful exploration early in an agent's life in order to expose it to experiences that enable it to obtain high rewards over the course of its lifetime.

Learning Synthetic Environments for Reinforcement Learning with Evolution Strategies

automl/learning_environments 24 Jan 2021

This work explores learning agent-agnostic synthetic environments (SEs) for Reinforcement Learning.

Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?

ramp-kits/rl_simulator ICLR 2021

We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models using a fixed (random shooting) control agent.

Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

saodem74/Transfer-Learning-in-Reinforcement-Learning 5 Feb 2022

In this paper, we approach the task of transfer learning between domains that differ in action spaces.

Adaptive Online Value Function Approximation with Wavelets

michael-beukman/waveletrl 22 Apr 2022

We further demonstrate that a fixed wavelet basis set performs comparably against the high-performing Fourier basis on Mountain Car and Acrobot, and that the adaptive methods provide a convenient approach to addressing an oversized initial basis set, while demonstrating performance comparable to, or greater than, the fixed wavelet basis.

Total energy-shaping control for mechanical systems via Control-by-Interconnection

joelferguson/underactuated_mechanical_cbi 10 Jan 2023

In this work, it is shown that total energy-shaping control of under-actuated mechanical systems has a control-by-interconnection interpretation.

Signal Novelty Detection as an Intrinsic Reward for Robotics

markub3327/Dueling-DQN-with-AutoEncoder MDPI Sensors 2023

In advanced robot control, reinforcement learning is a common technique used to transform sensor data into signals for actuators, based on feedback from the robot’s environment.