1 code implementation • 3 Apr 2024 • Renhao Zhang, Haotian Fu, Yilin Miao, George Konidaris
We propose a novel model-based reinforcement learning algorithm -- Dynamics Learning and predictive control with Parameterized Actions (DLPA) -- for Parameterized Action Markov Decision Processes (PAMDPs).
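The entry above does not detail DLPA's planner, but as a rough illustration of predictive control over parameterized actions, here is a minimal random-shooting sketch. The `step_model` and `reward_model` stand in for learned models, and all names are hypothetical; this is not DLPA's actual algorithm.

```python
import random

def shooting_mpc(state, step_model, reward_model, num_samples=64, horizon=5):
    """Random-shooting MPC sketch: sample parameterized-action sequences,
    roll them out through a (learned) model, return the best first action."""
    best_action, best_return = None, float("-inf")
    for _ in range(num_samples):
        s, total, first = state, 0.0, None
        for t in range(horizon):
            # A parameterized action: a discrete type plus a continuous parameter.
            a = (random.randrange(2), random.uniform(-1.0, 1.0))
            if t == 0:
                first = a
            total += reward_model(s, a)
            s = step_model(s, a)
        if total > best_return:
            best_return, best_action = total, first
    return best_action
```

With toy stand-in models, the planner simply returns the highest-scoring sampled first action.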
no code implementations • 26 Feb 2024 • Haotian Fu, Pratyusha Sharma, Elias Stengel-Eskin, George Konidaris, Nicolas Le Roux, Marc-Alexandre Côté, Xingdi Yuan
We present an algorithm for skill discovery from expert demonstrations.
no code implementations • 18 Feb 2024 • Benedict Quartey, Eric Rosen, Stefanie Tellex, George Konidaris
We propose Language Instruction grounding for Motion Planning (LIMP), a system that leverages foundation models and temporal logics to generate instruction-conditioned semantic maps that enable robots to verifiably follow expressive and long-horizon instructions with open vocabulary referents and complex spatiotemporal constraints.
1 code implementation • 20 Dec 2023 • William Hill, Ireton Liu, Anita de Mello Koch, Damion Harvey, George Konidaris, Steven James
We propose a new benchmark for planning tasks based on the Minecraft game.
no code implementations • 3 Oct 2023 • Ifrah Idrees, Tian Yun, Naveen Sharma, Yunxin Deng, Nakul Gopalan, George Konidaris, Stefanie Tellex
We propose a novel framework for plan and goal recognition in partially observable domains -- Dialogue for Goal Recognition (D4GR) -- which enables a robot to rectify its beliefs about human progress by asking clarification questions about noisy sensor data and sub-optimal human actions.

no code implementations • 6 Jul 2023 • Andrew Levy, Sreehari Rammohan, Alessandro Allievi, Scott Niekum, George Konidaris
Our framework makes two specific contributions.
1 code implementation • 5 Jun 2023 • Sam Lobel, Akhil Bagaria, George Konidaris
We propose a new method for count-based exploration in high-dimensional state spaces.
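The entry above does not describe the method itself; for orientation, the classic tabular count-based bonus it scales up adds an intrinsic reward proportional to 1/sqrt(N(s)). A minimal sketch (the paper's contribution is approximating such counts in high-dimensional spaces, which this toy version does not attempt):

```python
from collections import defaultdict
from math import sqrt

class CountBonus:
    """Classic count-based exploration bonus: intrinsic reward beta / sqrt(N(s)).
    Tabular illustration only; novel states earn larger bonuses."""

    def __init__(self, beta=1.0):
        self.beta = beta
        self.counts = defaultdict(int)

    def bonus(self, state):
        # Increment the visit count, then pay out a decaying bonus.
        self.counts[state] += 1
        return self.beta / sqrt(self.counts[state])
```

Each repeat visit shrinks the bonus, steering the agent toward rarely seen states.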
no code implementations • 9 Mar 2023 • Benedict Quartey, Ankit Shah, George Konidaris
We propose an approach that maximizes experience reuse while learning to solve a given task by generating and simultaneously learning useful auxiliary tasks.
no code implementations • 18 Jan 2023 • Megan M. Baker, Alexander New, Mario Aguilar-Simon, Ziad Al-Halah, Sébastien M. R. Arnold, Ese Ben-Iwhiwhu, Andrew P. Brna, Ethan Brooks, Ryan C. Brown, Zachary Daniels, Anurag Daram, Fabien Delattre, Ryan Dellana, Eric Eaton, Haotian Fu, Kristen Grauman, Jesse Hostetler, Shariq Iqbal, Cassandra Kent, Nicholas Ketz, Soheil Kolouri, George Konidaris, Dhireesha Kudithipudi, Erik Learned-Miller, Seungwon Lee, Michael L. Littman, Sandeep Madireddy, Jorge A. Mendez, Eric Q. Nguyen, Christine D. Piatko, Praveen K. Pilly, Aswin Raghavan, Abrar Rahman, Santhosh Kumar Ramakrishnan, Neale Ratzlaff, Andrea Soltoggio, Peter Stone, Indranil Sur, Zhipeng Tang, Saket Tiwari, Kyle Vedder, Felix Wang, Zifan Xu, Angel Yanguas-Gil, Harel Yedidsion, Shangqun Yu, Gautam K. Vallabha
Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events: the input distributions and tasks encountered by deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed.
no code implementations • 29 Dec 2022 • Saket Tiwari, George Konidaris
Deep neural networks can approximate functions on different types of data, from images to graphs, with varied underlying structure.
no code implementations • 29 Dec 2022 • Saket Tiwari, Omer Gottesman, George Konidaris
Central to our work is the idea that the transition dynamics induce a low dimensional manifold of reachable states embedded in the high-dimensional nominal state space.
1 code implementation • 26 Nov 2022 • Charles Lovering, Jessica Zosa Forde, George Konidaris, Ellie Pavlick, Michael L. Littman
AlphaZero, an approach to reinforcement learning that couples neural networks and Monte Carlo tree search (MCTS), has produced state-of-the-art strategies for traditional board games like chess, Go, shogi, and Hex.
2 code implementations • 20 Oct 2022 • Haotian Fu, Shangqun Yu, Michael Littman, George Konidaris
We propose a model-based lifelong reinforcement-learning approach that estimates a hierarchical Bayesian posterior distilling the common structure shared across different tasks.
no code implementations • 28 Sep 2022 • Seiji Shaw, Devesh K. Jha, Arvind Raghunathan, Radu Corcodel, Diego Romeres, George Konidaris, Daniel Nikovski
In this paper, we present constrained dynamic movement primitives (CDMPs), which allow for constraint satisfaction in the robot workspace.
no code implementations • 12 Aug 2022 • Rafael Rodriguez-Sanchez, Benjamin A. Spiegel, Jennifer Wang, Roma Patel, Stefanie Tellex, George Konidaris
We define precise syntax and grounding semantics for RLang, and provide a parser that grounds RLang programs to an algorithm-agnostic partial world model and policy that can be exploited by an RL agent.
1 code implementation • 7 Jun 2022 • Haotian Fu, Shangqun Yu, Saket Tiwari, Michael Littman, George Konidaris
We propose a novel parameterized skill-learning algorithm that aims to learn transferable parameterized skills and synthesize them into a new action space that supports efficient learning in long-horizon tasks.
no code implementations • 11 May 2022 • Zhiyuan Zhou, Cameron Allen, Kavosh Asadi, George Konidaris
We study the action generalization ability of deep Q-learning in discrete action spaces.
no code implementations • 4 May 2022 • Steven James, Benjamin Rosman, George Konidaris
We propose a framework for autonomously learning state abstractions of an agent's environment, given a set of skills.
1 code implementation • 22 Apr 2022 • Michael Beukman, Michael Mitchley, Dean Wookey, Steven James, George Konidaris
We further demonstrate that a fixed wavelet basis set performs comparably to the high-performing Fourier basis on Mountain Car and Acrobot, and that the adaptive methods provide a convenient way to address an oversized initial basis set while achieving performance comparable to, or greater than, the fixed wavelet basis.
no code implementations • 20 Mar 2022 • Shangqun Yu, Sreehari Rammohan, Kaiyu Zheng, George Konidaris
Animals such as rabbits and birds can instantly generate locomotion behavior in reaction to a dynamic, approaching object, such as a person or a rock, despite having possibly never seen the object before and having limited perception of the object's properties.
no code implementations • 23 Oct 2021 • Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman
We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning.
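For reference, the standard Lipschitz condition the entry above generalizes says |Q(s,a) - Q(s',a)| <= L * d(s, s'), so a known Q-value at one state bounds Q at nearby states. A minimal sketch of that pointwise bound (the paper's coarse-grained definition relaxes it to obtain tighter bounds, which this does not show):

```python
def lipschitz_q_bounds(q_known, distance, lipschitz_const):
    """Given Q(s,a) and |Q(s,a) - Q(s',a)| <= L * d(s, s'), return an
    interval [lower, upper] bounding Q(s',a)."""
    slack = lipschitz_const * distance
    return q_known - slack, q_known + slack
```

Smaller Lipschitz constants (or smoother coarse-grained surrogates) mean tighter intervals, which is why tighter smoothness notions translate into improved learning.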
no code implementations • 23 Oct 2021 • Benjamin A. Spiegel, George Konidaris
We present a method for using adverb phrases to adjust skill parameters via learned adverb-skill groundings.
1 code implementation • 19 Oct 2021 • Kaiyu Zheng, Rohan Chitnis, Yoonchang Sung, George Konidaris, Stefanie Tellex
In realistic applications of object search, robots will need to locate target objects in complex environments while coping with unreliable sensors, especially for small or hard-to-detect objects.
no code implementations • 15 Oct 2021 • Hameed Abdul-Rashid, Miles Freeman, Ben Abbatematteo, George Konidaris, Daniel Ritchie
Manipulating an articulated object requires perceiving its kinematic hierarchy: its parts, how each can move, and how those motions are coupled.
no code implementations • 11 Oct 2021 • Eric Hsiung, Hiloni Mehta, Junchi Chu, Xinyu Liu, Roma Patel, Stefanie Tellex, George Konidaris
We compare our method, which maps natural language task specifications to intermediate contextual queries, against state-of-the-art CopyNet models that translate natural language directly to LTL. We evaluate whether each approach outputs correct LTL for manipulation and navigation task specifications, and show that our method outperforms the CopyNet model on unseen object references.
no code implementations • 29 Sep 2021 • Haotian Fu, Shangqun Yu, Michael Littman, George Konidaris
A central question in reinforcement learning (RL) is how to leverage prior knowledge to accelerate learning in new tasks.
no code implementations • 12 Aug 2021 • Willie McClinton, Andrew Levy, George Konidaris
Sparse rewards and long time horizons remain challenging for reinforcement learning algorithms.
no code implementations • 28 Jul 2021 • Sreehari Rammohan, Shangqun Yu, Bowen He, Eric Hsiung, Eric Rosen, Stefanie Tellex, George Konidaris
Learning continuous control in high-dimensional sparse reward settings, such as robotic manipulation, is a challenging problem due to the number of samples often required to obtain accurate optimal value and policy estimates.
1 code implementation • NeurIPS 2021 • Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris
A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov.
no code implementations • 12 Jan 2021 • Ben Abbatematteo, Eric Rosen, Stefanie Tellex, George Konidaris
We propose using kinematic motion planning as a completely autonomous, sample-efficient way to bootstrap motor skill learning for object manipulation.
no code implementations • ICLR 2022 • Steven James, Benjamin Rosman, George Konidaris
Such representations can immediately be transferred between tasks that share the same types of objects, resulting in agents that require fewer samples to learn a model of a new task.
no code implementations • 17 Oct 2020 • Michael Fishman, Nishanth Kumar, Cameron Allen, Natasha Danas, Michael Littman, Stefanie Tellex, George Konidaris
Unfortunately, planning to solve any specific task using an open-scope model is computationally intractable, even for state-of-the-art methods, due to the many states and actions that are necessarily present in the model but irrelevant to that problem.
no code implementations • ICML Workshop LifelongML 2020 • Akhil Bagaria, Jason Crowley, Jing Wei Nicholas Lim, George Konidaris
Temporal abstraction provides an opportunity to drastically lower the decision making burden facing reinforcement learning agents in rich sensorimotor spaces.
no code implementations • ICML Workshop LaReL 2020 • Roma Patel, Rafael Rodriguez-Sanchez, George Konidaris
Human language is distinguished by powerful semantics, rich structure, and incredible flexibility.
1 code implementation • 4 Jun 2020 • Josh Roy, George Konidaris
We introduce Wasserstein Adversarial Proximal Policy Optimization (WAPPO), a novel algorithm for visual transfer in Reinforcement Learning that explicitly learns to align the distributions of extracted features between a source and target task.
1 code implementation • 6 May 2020 • Kaiyu Zheng, Yoonchang Sung, George Konidaris, Stefanie Tellex
Robots operating in households must find objects on shelves, under tables, and in cupboards.
no code implementations • ICLR 2020 • Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Konidaris
While many option discovery methods have been proposed to accelerate exploration in reinforcement learning, they are often heuristic.
1 code implementation • ICLR 2020 • Akhil Bagaria, George Konidaris
Autonomously discovering temporally extended actions, or skills, is a longstanding goal of hierarchical reinforcement learning.
2 code implementations • 28 Apr 2020 • Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro
Focused macros dramatically improve black-box planning efficiency across a wide range of planning domains, sometimes beating even state-of-the-art planners with access to a full domain model.
2 code implementations • 23 Oct 2019 • Jonathan Chang, Nishanth Kumar, Sean Hastings, Aaron Gokaslan, Diego Romeres, Devesh Jha, Daniel Nikovski, George Konidaris, Stefanie Tellex
We demonstrate that our model trained on 33% of the possible goals is able to generalize to more than 90% of the targets in the scene for both simulation and robot experiments.
no code implementations • 25 Sep 2019 • Josh Roy, George Konidaris
In such settings, agents are trained in similar environments, such as simulators, and are then transferred to the original environment.
no code implementations • 6 Jul 2019 • Oliver Kroemer, Scott Niekum, George Konidaris
A key challenge in intelligent robotics is creating robots that are capable of directly interacting with the world around them to achieve their goals.
no code implementations • 30 May 2019 • Vanya Cohen, Benjamin Burchfiel, Thao Nguyen, Nakul Gopalan, Stefanie Tellex, George Konidaris
Our system is able to disambiguate between novel objects, observed via depth images, based on natural language descriptions.
no code implementations • ICML 2020 • Steven James, Benjamin Rosman, George Konidaris
We present a framework for autonomously learning a portable representation that describes a collection of low-level continuous environments.
no code implementations • 28 May 2019 • Benjamin Burchfiel, George Konidaris
We introduce a new method for category-level pose estimation which produces a distribution over predicted poses by integrating 3D shape estimates from a generative object model with segmentation information.
no code implementations • 2 Mar 2019 • Yuu Jinnai, Jee Won Park, David Abel, George Konidaris
One of the main challenges in reinforcement learning is solving tasks with sparse reward.
no code implementations • 16 Oct 2018 • Yuu Jinnai, David Abel, D. Ellis Hershkowitz, Michael Littman, George Konidaris
We formalize the problem of selecting the optimal set of options for planning as that of computing the smallest set of options with which planning converges within a given maximum number of value-iteration passes.
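The quantity being minimized above is the number of value-iteration passes to convergence. A minimal sketch of counting those passes on a deterministic toy MDP (the paper's option-selection machinery is not shown; all names here are illustrative):

```python
def vi_passes(transitions, rewards, gamma=0.9, eps=1e-6):
    """Run value iteration and count passes until the largest Bellman backup
    change falls below eps. `transitions[s][a]` is the deterministic next
    state; `rewards[s][a]` is the immediate reward."""
    n = len(transitions)
    values = [0.0] * n
    passes = 0
    while True:
        passes += 1
        new = [max(rewards[s][a] + gamma * values[transitions[s][a]]
                   for a in range(len(transitions[s])))
               for s in range(n)]
        delta = max(abs(a - b) for a, b in zip(new, values))
        values = new
        if delta < eps:
            return passes, values
```

Adding well-chosen options (temporally extended actions) shortens the effective horizon and so reduces the pass count that this function measures.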
no code implementations • ICML 2018 • David Abel, Yuu Jinnai, Sophie Yue Guo, George Konidaris, Michael Littman
We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution.
no code implementations • 20 Jun 2018 • Benjamin Burchfiel, George Konidaris
We introduce Hybrid Bayesian Eigenobjects (HBEOs), a novel representation for 3D objects designed to allow a robot to jointly estimate the pose, class, and full 3D geometry of a novel object observed from a single viewpoint in a single practical framework.
4 code implementations • 4 Dec 2017 • Andrew Levy, George Konidaris, Robert Platt, Kate Saenko
Hierarchical agents have the potential to solve sequential decision making tasks with greater sample efficiency than their non-hierarchical counterparts because hierarchical agents can break down tasks into sets of subtasks that only require short sequences of decisions.
1 code implementation • NeurIPS 2017 • Taylor W. Killian, Samuel Daulton, George Konidaris, Finale Doshi-Velez
We introduce a new formulation of the Hidden Parameter Markov Decision Process (HiP-MDP), a framework for modeling families of related tasks using low-dimensional latent embeddings.
no code implementations • NeurIPS 2017 • Garrett Andersen, George Konidaris
We introduce an online active exploration algorithm for data-efficiently learning an abstract symbolic model of an environment.
2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.
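MAC's core idea, per the entry above, is to average over all discrete actions rather than a single sampled one. For a softmax policy, the gradient of the expected Q-value with respect to the logits has a closed form, grad_i = pi_i * (Q_i - sum_a pi_a Q_a). A self-contained sketch of that expected-gradient computation (an illustration of the averaging idea, not the paper's full actor-critic):

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def mac_policy_gradient(logits, q_values):
    """Gradient of sum_a pi(a|s) * Q(s,a) with respect to softmax logits:
    grad_i = pi_i * (Q_i - sum_a pi_a Q_a). Summing over all actions
    replaces the single-sample estimator of vanilla policy gradient."""
    pi = softmax(logits)
    baseline = sum(p * q for p, q in zip(pi, q_values))
    return [p * (q - baseline) for p, q in zip(pi, q_values)]
```

Because the expectation over actions is computed exactly, this estimator has lower variance than sampling one action per state.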
no code implementations • 1 Dec 2016 • Taylor Killian, George Konidaris, Finale Doshi-Velez
Due to physiological variation, patients diagnosed with the same condition may exhibit divergent, but related, responses to the same treatments.
no code implementations • NeurIPS 2015 • Philip S. Thomas, Scott Niekum, Georgios Theocharous, George Konidaris
The benefit of the Ω-return is that it accounts for the correlation between returns of different lengths.
no code implementations • 25 Sep 2015 • George Konidaris
We describe a framework for building abstraction hierarchies whereby an agent alternates skill- and representation-acquisition phases to construct a sequence of increasingly abstract Markov decision processes.
3 code implementations • 5 Sep 2015 • Warwick Masson, Pravesh Ranchod, George Konidaris
We introduce a model-free algorithm for learning in Markov decision processes with parameterized actions: discrete actions with continuous parameters.
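To make the action space above concrete, here is a minimal sketch of a parameterized action: a discrete type paired with continuous parameters, in the style of soccer-like PAMDP benchmarks. The specific action names and bounds are hypothetical, not taken from the paper:

```python
from dataclasses import dataclass

@dataclass
class ParameterizedAction:
    """A discrete action type paired with a tuple of continuous parameters."""
    action_type: str
    params: tuple

def sample_space():
    # Hypothetical action space: each discrete action carries its own
    # continuous parameter ranges, given as (low, high) per dimension.
    return {
        "kick": [(0.0, 100.0), (-180.0, 180.0)],  # power, direction
        "move": [(-1.0, 1.0), (-1.0, 1.0)],       # dx, dy
    }

def in_bounds(action, space):
    """Check that every continuous parameter lies inside its range."""
    bounds = space[action.action_type]
    return all(lo <= p <= hi for p, (lo, hi) in zip(action.params, bounds))
```

A learner in such a space must select both the discrete type and a point in that type's continuous parameter region, which is what distinguishes PAMDPs from purely discrete or purely continuous control.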
no code implementations • 15 Aug 2013 • Finale Doshi-Velez, George Konidaris
Control applications often feature tasks with similar, but not identical, dynamics.
no code implementations • NeurIPS 2011 • George Konidaris, Scott Niekum, Philip S. Thomas
We show that the lambda-return target used in the TD(lambda) family of algorithms is the maximum likelihood estimator for a specific model of how the variance of an n-step return estimate increases with n. We then introduce the gamma-return estimator, an alternative target based on a more accurate model of variance, which defines the TD(gamma) family of complex-backup temporal-difference learning algorithms.
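For reference, the lambda-return this entry builds on is the geometrically weighted mixture of n-step returns, G^lambda = (1 - lambda) * sum_n lambda^(n-1) * G_n, with the remaining weight on the full Monte Carlo return. A minimal sketch for a finite episode (the paper's gamma-return reweights the n-step returns by a variance model instead, which is not shown here):

```python
def lambda_return(rewards, values, gamma, lam):
    """Compute the lambda-return at t=0 for a finite episode. `values[i]`
    estimates the value of the state reached after i rewards; the final
    n-step return is the unbootstrapped Monte Carlo return."""
    T = len(rewards)
    # n-step returns: G_n = r_0 + ... + gamma^(n-1) r_{n-1} + gamma^n V(s_n)
    g, discounted = [], 0.0
    for n in range(1, T + 1):
        discounted += gamma ** (n - 1) * rewards[n - 1]
        bootstrap = gamma ** n * values[n] if n < T else 0.0
        g.append(discounted + bootstrap)
    # Geometric weights (1 - lam) * lam^(n-1); the full return gets the rest.
    total = sum((1 - lam) * lam ** (n - 1) * g[n - 1] for n in range(1, T))
    return total + lam ** (T - 1) * g[-1]
```

Setting lam=0 recovers the one-step TD target r_0 + gamma * V(s_1), and lam=1 recovers the Monte Carlo return, matching the usual TD(lambda) endpoints.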
no code implementations • NeurIPS 2010 • George Konidaris, Scott Kuindersma, Roderic Grupen, Andrew G. Barto
We demonstrate that CST constructs an appropriate skill tree that can be further refined through learning in a challenging continuous domain, and that it can be used to segment demonstration trajectories on a mobile manipulator into chains of skills where each skill is assigned an appropriate abstraction.
no code implementations • NeurIPS 2009 • George Konidaris, Andrew G. Barto
We introduce skill chaining, a skill discovery method for reinforcement learning agents in continuous domains, that builds chains of skills leading to an end-of-task reward.