no code implementations • 23 Jun 2024 • Ozan Vardal, Richard Hawkins, Colin Paterson, Chiara Picardi, Daniel Omeiza, Lars Kunze, Ibrahim Habli
A critical part of this is being able to monitor when the model's runtime performance (as a result of changes) poses a safety risk to the system.
no code implementations • 28 Jul 2023 • Kevin Denamganaï, Daniel Hernandez, Ozan Vardal, Sondess Missaoui, James Alfred Walker
We show that the agents in the referential game give rise to an artificial language that is aligned with the natural-like language used to describe goals in the BabyAI benchmark, and that this language is expressive enough to also describe unsuccessful RL trajectories, thereby providing feedback that lets the RL agent leverage the linguistic, structured information contained in all trajectories.