no code implementations • 12 Apr 2024 • Mudit Verma, Katherine Metcalf
Incorporating state importance into reward learning improves the speed of policy learning, overall policy performance, and reward recovery on both locomotion and manipulation tasks.
1 code implementation • 28 Feb 2024 • Katherine Metcalf, Miguel Sarabia, Natalie Mackraz, Barry-John Theobald
Preference-based reinforcement learning (PbRL) aligns robot behavior with human preferences via a reward function learned from binary feedback over agent behaviors.
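The standard recipe behind this kind of preference-based reward learning is to fit a reward model with a Bradley-Terry likelihood over pairs of behavior segments. The sketch below is a minimal, hypothetical illustration of that general idea (not the paper's actual model): segments are summarized as feature vectors, the true reward is linear, and a linear reward is recovered from binary preference labels by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: each behavior segment is summarized by a feature vector,
# and the (hidden) true reward is linear in those features.
dim = 4
w_true = rng.normal(size=dim)

def make_pair():
    a, b = rng.normal(size=dim), rng.normal(size=dim)
    pref = 1.0 if w_true @ a > w_true @ b else 0.0  # binary feedback label
    return a, b, pref

pairs = [make_pair() for _ in range(500)]

# Learn a linear reward w by minimizing the Bradley-Terry negative
# log-likelihood: P(A preferred over B) = sigmoid(r(A) - r(B)).
w = np.zeros(dim)
lr = 0.5
for _ in range(200):
    grad = np.zeros(dim)
    for a, b, pref in pairs:
        p_a = 1.0 / (1.0 + np.exp(-(w @ a - w @ b)))  # P(A preferred)
        grad += (p_a - pref) * (a - b)                 # d(NLL)/dw
    w -= lr * grad / len(pairs)

# The learned reward should rank segments the same way the true reward does.
agree = np.mean([(w @ a > w @ b) == (pref == 1.0) for a, b, pref in pairs])
```

In practice the reward model is a neural network over states or state-action pairs rather than a linear function, but the pairwise logistic loss is the same.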
no code implementations • 26 Oct 2023 • Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev
We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks.
no code implementations • 12 Nov 2022 • Katherine Metcalf, Miguel Sarabia, Barry-John Theobald
In this work, we demonstrate that encoding environment dynamics in the reward function (REED) dramatically reduces the number of preference labels required in state-of-the-art preference-based RL frameworks.
no code implementations • 17 Oct 2022 • Mudit Verma, Katherine Metcalf
Specifying rewards for reinforcement learning (RL) agents is challenging.
no code implementations • 18 Mar 2022 • Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald
Previous research has shown that the traditional metrics used to optimize and assess models for generating lip motion from speech are not good indicators of subjective opinions of animation quality.
no code implementations • 18 Feb 2022 • Andrew Silva, Katherine Metcalf, Nicholas Apostoloff, Barry-John Theobald
Federated learning enables the deployment of machine learning to problems for which centralized data collection is impractical.
no code implementations • 2 Apr 2019 • Katherine Metcalf, Barry-John Theobald, Garrett Weinberg, Robert Lee, Ing-Marie Jonsson, Russ Webb, Nicholas Apostoloff
We describe experiments towards building a conversational digital assistant that considers the preferred conversational style of the user.
no code implementations • 10 Dec 2018 • Katherine Metcalf, Barry-John Theobald, Nicholas Apostoloff
We model the behavior of each agent in an interaction individually, then use a multi-agent fusion model to generate a summary over the expected actions of the group, rendering the model independent of the number of agents.
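One common way to make a group summary independent of the number of agents is to apply a shared per-agent encoder and then pool the encodings with a permutation-invariant operation such as the mean. The sketch below illustrates that general idea under those assumptions; the encoder, pooling choice, and dimensions are hypothetical, not the paper's actual fusion model.

```python
import numpy as np

rng = np.random.default_rng(1)

def encode_agent(behavior, W):
    """Shared per-agent encoder (here a single linear layer + ReLU),
    applied identically to every agent's behavior features."""
    return np.maximum(W @ behavior, 0.0)

def fuse_group(behaviors, W):
    """Mean-pool the per-agent encodings: the group summary has a fixed
    size regardless of how many agents are in the interaction."""
    encodings = [encode_agent(b, W) for b in behaviors]
    return np.mean(encodings, axis=0)

W = rng.normal(size=(8, 5))          # hypothetical encoder weights
group_of_3 = [rng.normal(size=5) for _ in range(3)]
group_of_7 = [rng.normal(size=5) for _ in range(7)]

summary_3 = fuse_group(group_of_3, W)
summary_7 = fuse_group(group_of_7, W)
```

Because the pooled summary has the same shape for 3 agents as for 7, any downstream model consuming it never needs to know the group size; mean pooling also makes the summary invariant to agent ordering.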