1 code implementation • 6 Feb 2024 • Yash Shukla, Tanushree Burman, Abhishek Kulkarni, Robert Wright, Alvaro Velasquez, Jivko Sinapov
In this work, we propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS), that learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification, while minimizing the number of environmental interactions.
no code implementations • 14 Oct 2023 • Yash Shukla, Wenchang Gao, Vasanth Sarathy, Alvaro Velasquez, Robert Wright, Jivko Sinapov
In this work, we propose LgTS (LLM-guided Teacher-Student learning), a novel approach that explores the planning abilities of LLMs to provide a graphical representation of the sub-goals to a reinforcement learning (RL) agent that does not have access to the transition dynamics of the environment.
1 code implementation • 13 Oct 2023 • Yash Shukla, Bharat Kesari, Shivam Goel, Robert Wright, Jivko Sinapov
We use Generative Adversarial Networks (GANs) along with a cycle-consistency loss to map the observations between the source and target domains and later use this learned mapping to clone the successful source task behavior policy to the target domain.
1 code implementation • 11 Apr 2023 • Yash Shukla, Abhishek Kulkarni, Robert Wright, Alvaro Velasquez, Jivko Sinapov
Experiments in gridworld and physics-based simulated robotics domains show that the curricula produced by AGCL achieve improved time-to-threshold performance on a complex sequential decision-making problem relative to state-of-the-art curriculum learning (e.g., teacher-student, self-play) and automaton-guided reinforcement learning baselines (e.g., Q-Learning for Reward Machines).
1 code implementation • 24 Jun 2022 • Shivam Goel, Yash Shukla, Vasanth Sarathy, Matthias Scheutz, Jivko Sinapov
We propose RAPid-Learn: Learning to Recover and Plan Again, a hybrid planning and learning method, to tackle the problem of adapting to sudden and unexpected changes in an agent's environment (i.e., novelties).