no code implementations • 2 Apr 2024 • Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Wenhao Lu, Stefan Wermter
Although there has been rapid progress in endowing robots with the ability to solve complex manipulation tasks, generating control policies for bimanual robots to solve tasks involving two hands is still challenging because of the difficulties in effective temporal and spatial coordination.
1 code implementation • 30 Dec 2023 • Wenhao Lu, Xufeng Zhao, Thilo Fryen, Jae Hee Lee, Mengdi Li, Sven Magg, Stefan Wermter
This lack of transparency in RL models has been a long-standing problem, making it difficult for users to grasp the reasons behind an agent's behaviour.
no code implementations • 4 Nov 2023 • Kun Chu, Xufeng Zhao, Cornelius Weber, Mengdi Li, Stefan Wermter
Large Language Models (LLMs) demonstrate remarkable abilities to provide human-like feedback on user inputs in natural language.
1 code implementation • 23 Sep 2023 • Xufeng Zhao, Mengdi Li, Wenhao Lu, Cornelius Weber, Jae Hee Lee, Kun Chu, Stefan Wermter
Recent advancements in large language models have showcased their remarkable generalizability across various domains.
no code implementations • 25 Apr 2023 • Wenhao Lu, Xufeng Zhao, Sven Magg, Martin Gromniak, Mengdi Li, Stefan Wermter
Explaining the behaviour of intelligent agents learned by reinforcement learning (RL) to humans is challenging yet crucial due to their incomprehensible proprioceptive states, variational intermediate goals, and resultant unpredictability.
1 code implementation • 14 Mar 2023 • Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
However, it remains challenging to ground LLMs in multimodal sensory input and continuous action output, while enabling a robot to interact with its environment and acquire novel information as its policies unfold.
1 code implementation • 1 Feb 2023 • Mengdi Li, Xufeng Zhao, Jae Hee Lee, Cornelius Weber, Stefan Wermter
We study a class of reinforcement learning problems where the reward signals for policy learning are generated by an internal reward model that is dependent on and jointly optimized with the policy.
1 code implementation • 4 Aug 2022 • Xufeng Zhao, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
Sound is one of the most informative and abundant modalities in the real world, and it can be sensed robustly and without contact by small, cheap sensors that can be placed on mobile devices.