no code implementations • 9 Feb 2024 • Anna Madison, Ellen Novoseller, Vinicius G. Goecks, Benjamin T. Files, Nicholas Waytowich, Alfred Yu, Vernon J. Lawhern, Steven Thurman, Christopher Kelshaw, Kaleb McDowell
Future warfare will require Command and Control (C2) personnel to make decisions at shrinking timescales in complex and potentially ill-defined situations.
no code implementations • 1 Feb 2024 • Vinicius G. Goecks, Nicholas Waytowich
The development of Courses of Action (COAs) in military operations is traditionally a time-consuming and intricate process.
no code implementations • 30 Jul 2023 • Devin White, Mingkang Wu, Ellen Novoseller, Vernon J. Lawhern, Nicholas Waytowich, Yongcan Cao
This paper develops a novel rating-based reinforcement learning approach that uses human ratings of agent behavior as the source of guidance during learning.
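At a high level, rating-based methods replace pairwise preference queries with per-segment ratings from which a reward signal can be learned. Below is a minimal sketch of that general idea, assuming a discrete rating scale, a small per-step reward network, and synthetic data; the names, shapes, and loss are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only: learn a per-step reward model from discrete
# human ratings of trajectory segments (assumed 4-point rating scale).
import torch
import torch.nn as nn

STATE_DIM, N_CLASSES, SEG_LEN = 8, 4, 20

class RatingRewardModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.reward = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                    nn.Linear(64, 1))           # per-step reward
        self.head = nn.Linear(1, N_CLASSES)                     # segment rating logits

    def forward(self, segment):                 # segment: (batch, SEG_LEN, STATE_DIM)
        per_step = self.reward(segment)         # (batch, SEG_LEN, 1)
        return self.head(per_step.mean(dim=1))  # pool to a segment-level rating

model = RatingRewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Synthetic stand-ins for logged segments and their human ratings (0..3).
segments = torch.randn(128, SEG_LEN, STATE_DIM)
ratings = torch.randint(0, N_CLASSES, (128,))

for _ in range(50):                             # fit the rating predictor
    opt.zero_grad()
    loss = loss_fn(model(segments), ratings)
    loss.backward()
    opt.step()

# The learned per-step head can then stand in for the missing reward signal.
dense_reward = model.reward(segments[0])        # (SEG_LEN, 1)
```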
no code implementations • 22 Jul 2023 • Ellen Novoseller, Vinicius G. Goecks, David Watkins, Josh Miller, Nicholas Waytowich
In machine learning for sequential decision-making, an algorithmic agent learns to interact with an environment while receiving feedback in the form of a reward signal.
no code implementations • 23 Mar 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Josh Miller, Rohin Shah
To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022.
no code implementations • 16 Oct 2022 • Bharat Prakash, Nicholas Waytowich, Tim Oates, Tinoosh Mohsenin
Learning to solve long-horizon, temporally extended tasks with reinforcement learning has been a longstanding challenge.
no code implementations • 13 Sep 2022 • David Watkins-Valls, Peter Allen, Krzysztof Choromanski, Jacob Varley, Nicholas Waytowich
We propose the Multiple View Performer (MVP), a new architecture for 3D shape completion from a series of temporally sequential views.
no code implementations • 11 May 2022 • Nicholas Waytowich, James Hare, Vinicius G. Goecks, Mark Mittrick, John Richardson, Anjon Basak, Derrik E. Asher
Traditionally, learning from human demonstrations via direct behavior cloning can yield high-performing policies, provided the algorithm has access to large amounts of high-quality data covering the scenarios the agent is most likely to encounter during operation.
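Direct behavior cloning reduces imitation to supervised learning on logged state-action pairs. The following is a generic sketch of that baseline under assumed shapes, a discrete action space, and synthetic demonstration data; it is not the paper's specific setup.

```python
# Generic behavior-cloning baseline: supervised learning on (state, action)
# pairs from human demonstrations. All shapes and data here are placeholders.
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 16, 6
policy = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(),
                       nn.Linear(128, N_ACTIONS))
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for a demonstration dataset of expert states and actions.
demo_states = torch.randn(1024, STATE_DIM)
demo_actions = torch.randint(0, N_ACTIONS, (1024,))

for epoch in range(10):
    opt.zero_grad()
    loss = loss_fn(policy(demo_states), demo_actions)
    loss.backward()
    opt.step()

# At deployment, the cloned policy acts greedily on the current state.
action = policy(torch.randn(1, STATE_DIM)).argmax(dim=-1)
```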
no code implementations • 14 Apr 2022 • Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Chan Jun Shern, Daniel del Castillo, Tom Lieberum
The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks.
1 code implementation • 7 Dec 2021 • Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash
In this work, we present the solution that won first place and the award for most human-like agent in the 2021 NeurIPS MineRL BASALT Challenge: Learning from Human Feedback in Minecraft. The competition challenged participants to use human data to solve four tasks defined only by a natural-language description, with no reward function.
no code implementations • 7 Nov 2021 • Bharat Prakash, Nicholas Waytowich, Tinoosh Mohsenin, Tim Oates
In this work, we propose a method for automatic goal generation using a dynamical distance function (DDF) in a self-supervised fashion.
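A dynamical distance function predicts how many environment steps separate two states along the agent's own trajectories, so it can be trained self-supervised from rollout data and then used to propose goals at a desired difficulty. The sketch below is a generic rendering of that idea, not the paper's architecture; the target-distance heuristic at the end is an assumption.

```python
# Illustrative DDF sketch: regress the number of steps between two states
# visited in the same trajectory, then pick candidate goals whose predicted
# distance from the current state is close to a target "difficulty".
import torch
import torch.nn as nn

STATE_DIM, TRAJ_LEN = 10, 50

ddf = nn.Sequential(nn.Linear(2 * STATE_DIM, 128), nn.ReLU(),
                    nn.Linear(128, 1))                  # predicts step distance
opt = torch.optim.Adam(ddf.parameters(), lr=1e-3)

trajectory = torch.randn(TRAJ_LEN, STATE_DIM)           # stand-in rollout

for _ in range(200):                                    # self-supervised training
    i = torch.randint(0, TRAJ_LEN - 1, (64,))
    j = torch.randint(0, TRAJ_LEN, (64,))
    i, j = torch.minimum(i, j), torch.maximum(i, j)
    pairs = torch.cat([trajectory[i], trajectory[j]], dim=-1)
    target = (j - i).float().unsqueeze(-1)              # steps between the states
    loss = ((ddf(pairs) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Goal selection: among candidate states, pick one roughly `target_dist`
# steps away from the current state according to the learned DDF.
current = trajectory[0].expand(TRAJ_LEN, -1)
dists = ddf(torch.cat([current, trajectory], dim=-1)).squeeze(-1)
target_dist = 15.0
goal = trajectory[(dists - target_dist).abs().argmin()]
```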
no code implementations • 21 Oct 2021 • Vinicius G. Goecks, Nicholas Waytowich, Derrik E. Asher, Song Jun Park, Mark Mittrick, John Richardson, Manuel Vindiola, Anne Logie, Mark Dennison, Theron Trout, Priya Narayanan, Alexander Kott
Games and simulators can be a valuable platform to execute complex multi-agent, multiplayer, imperfect information scenarios with significant parallels to military applications: multiple participants manage resources and make decisions that command assets to secure specific areas of a map or neutralize opposing forces.
no code implementations • 9 Oct 2021 • Bharat Prakash, Nicholas Waytowich, Tim Oates, Tinoosh Mohsenin
The low-level controller executes the sub-tasks based on the language commands.
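In such hierarchical setups, a high-level module emits sub-task commands in language and a low-level policy conditions on both the current observation and an embedding of the command. The snippet below is a generic sketch of that interface under assumed shapes and a bag-of-words command encoder; it is not the paper's model.

```python
# Generic sketch of a language-conditioned low-level controller: the policy
# consumes the current state plus an embedding of the sub-task command.
import torch
import torch.nn as nn

VOCAB = {"go": 0, "to": 1, "the": 2, "door": 3, "key": 4, "pick": 5, "up": 6}
STATE_DIM, EMB_DIM, N_ACTIONS = 12, 32, 5

embed = nn.EmbeddingBag(len(VOCAB), EMB_DIM)            # bag-of-words command encoder
policy = nn.Sequential(nn.Linear(STATE_DIM + EMB_DIM, 128), nn.ReLU(),
                       nn.Linear(128, N_ACTIONS))

def act(state, command):
    tokens = torch.tensor([[VOCAB[w] for w in command.lower().split()]])
    cmd_emb = embed(tokens)                             # (1, EMB_DIM)
    logits = policy(torch.cat([state, cmd_emb], dim=-1))
    return logits.argmax(dim=-1)                        # greedy low-level action

state = torch.randn(1, STATE_DIM)
action = act(state, "go to the door")                   # executes one sub-task step
```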
no code implementations • 31 Oct 2019 • Nicholas Waytowich, Sean L. Barton, Vernon Lawhern, Garrett Warnell
While this problem can be addressed through reward shaping, such approaches typically require a human expert with specialized knowledge.
no code implementations • 29 Sep 2019 • Sunil Gandhi, Tim Oates, Tinoosh Mohsenin, Nicholas Waytowich
In this paper, we present a method for learning from video demonstrations by using human feedback to construct a mapping between the standard representation of the agent and the visual representation of the demonstration.
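One generic way to realize such a mapping is to train an encoder that projects demonstration frames into the agent's state space, supervised by human-provided correspondence labels. The sketch below assumes image-like inputs and paired human annotations purely for illustration; it is not the authors' architecture.

```python
# Illustrative sketch: learn a mapping from the demonstration's visual
# representation (frames) to the agent's state representation, supervised by
# human-labelled (frame, agent-state) correspondences collected as feedback.
import torch
import torch.nn as nn

AGENT_STATE_DIM = 8

encoder = nn.Sequential(                       # maps 3x64x64 frames -> agent state
    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 13 * 13, AGENT_STATE_DIM),  # 13x13 spatial size after two stride-2 convs
)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Stand-ins for human-labelled pairs: a video frame and the agent state a
# human judged it to correspond to.
frames = torch.randn(64, 3, 64, 64)
labelled_states = torch.randn(64, AGENT_STATE_DIM)

for _ in range(20):
    opt.zero_grad()
    loss = ((encoder(frames) - labelled_states) ** 2).mean()
    loss.backward()
    opt.step()

# Demonstration frames can now be translated into the agent's own state space
# and treated like ordinary demonstrations.
mapped = encoder(frames[:1])
```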
no code implementations • 20 Sep 2019 • David Watkins-Valls, Jingxi Xu, Nicholas Waytowich, Peter Allen
We present a robot navigation system that uses an imitation learning framework to successfully navigate in complex environments.
no code implementations • 24 Apr 2019 • Nicholas Waytowich, Sean L. Barton, Vernon Lawhern, Ethan Stump, Garrett Warnell
While deep reinforcement learning techniques have led to agents that can successfully learn to perform a number of previously unlearnable tasks, these techniques are still susceptible to the longstanding problem of reward sparsity.
no code implementations • 22 Mar 2019 • Bharat Prakash, Mohit Khatwani, Nicholas Waytowich, Tinoosh Mohsenin
Recent progress in AI and reinforcement learning has shown great success in solving complex problems with high-dimensional state spaces.
2 code implementations • 28 Sep 2017 • Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, Peter Stone
While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require large amounts of training data.
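This entry corresponds to the Deep TAMER line of work, which learns a model of the human trainer's feedback and acts greedily with respect to it rather than relying on an environment reward. The sketch below compresses that idea into a few lines under assumed shapes, omitting details such as credit assignment for delayed feedback; it is a hedged illustration, not the published algorithm.

```python
# Heavily simplified TAMER-style sketch: fit a model H(s, a) of the human
# trainer's scalar feedback and pick actions that maximize predicted feedback.
# Shapes, data, and the omission of feedback credit assignment are assumptions.
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS = 10, 4

h_model = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                        nn.Linear(64, N_ACTIONS))       # predicted feedback per action
opt = torch.optim.Adam(h_model.parameters(), lr=1e-3)

def update(state, action, feedback):
    """One supervised update from a (state, action, human feedback) sample."""
    pred = h_model(state)[0, action]
    loss = (pred - feedback) ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()

def act(state):
    """Greedy action with respect to the predicted human feedback."""
    with torch.no_grad():
        return h_model(state).argmax(dim=-1).item()

state = torch.randn(1, STATE_DIM)
a = act(state)
update(state, a, feedback=1.0)       # e.g. the human pressed "good"
```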