no code implementations • 15 Jun 2023 • Federico Malato, Florian Leopold, Ville Hautamaki, Andrew Melnik
Actions from a selected similar situation can be performed by the agent until representations of the agent's current situation and the selected experience diverge in the latent space.
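The situation-matching loop described above can be sketched as follows. This is a minimal illustration under assumptions of my own, not the paper's implementation: the names `encode`, `observe`, and `step`, the Euclidean distance, and the divergence `threshold` are all hypothetical stand-ins.

```python
import numpy as np

def act_from_similar_experience(encode, expert_latents, expert_actions,
                                observe, step, max_steps=100, threshold=1.0):
    """Hypothetical sketch: select the stored expert situation most similar
    to the agent's current latent state, replay its subsequent actions, and
    re-select once the current and stored latents diverge."""
    t = 0
    while t < max_steps:
        z = encode(observe())                       # current latent state
        dists = np.linalg.norm(expert_latents - z, axis=1)
        i = int(np.argmin(dists))                   # most similar experience
        # follow the selected trajectory while the latents stay close
        while t < max_steps and i < len(expert_actions):
            step(expert_actions[i])                 # execute the expert action
            t += 1
            i += 1
            z = encode(observe())
            if i < len(expert_latents) and \
                    np.linalg.norm(expert_latents[i] - z) > threshold:
                break                               # diverged: re-select
```

The outer loop re-runs the similarity search whenever the replayed trajectory stops matching the agent's situation, so the agent continually switches between stored experiences rather than committing to one.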
no code implementations • 23 Mar 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Josh Miller, Rohin Shah
To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022.
no code implementations • 27 Dec 2022 • Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik
Our approach effectively recovers meaningful demonstration trajectories and produces human-like agent behavior in the Minecraft environment.
no code implementations • 19 Jan 2022 • Federico Malato, Joona Jehkonen, Ville Hautamäki
Behavioural cloning has been extensively used to train agents and is recognized as a fast and reliable approach to teaching general behaviours from expert trajectories.
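As a reference point, behavioural cloning reduces to supervised learning on expert state-action pairs. The sketch below is a minimal illustration of that idea, not the setup used in the paper: it fits a linear softmax policy by gradient descent on the cross-entropy of the demonstrated actions, and all names and hyperparameters are assumptions.

```python
import numpy as np

def behavioural_cloning(states, actions, n_actions, lr=0.1, epochs=200):
    """Minimal behavioural-cloning sketch (illustrative, not the paper's
    method): fit a linear softmax policy to expert state-action pairs by
    maximising the log-likelihood of the demonstrated actions."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(states.shape[1], n_actions))
    onehot = np.eye(n_actions)[actions]             # expert action targets
    for _ in range(epochs):
        logits = states @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # cross-entropy gradient for the linear softmax model
        W -= lr * states.T @ (probs - onehot) / len(states)
    return lambda s: int(np.argmax(s @ W))          # greedy cloned policy
```

In practice the linear model would be replaced by the same network architecture used for the agent's policy, but the training objective stays this plain supervised one.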