1 code implementation • 28 Feb 2024 • Katherine Metcalf, Miguel Sarabia, Natalie Mackraz, Barry-John Theobald
Preference-based reinforcement learning (PbRL) aligns a robot's behavior with human preferences via a reward function learned from binary feedback over agent behaviors.
no code implementations • 26 Oct 2023 • Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev
We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks.