Search Results for author: Stephanie Milani

Found 22 papers, 5 papers with code

Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels

no code implementations 22 Jul 2024 Zhuorui Ye, Stephanie Milani, Geoffrey J. Gordon, Fei Fang

To overcome this limitation, we introduce a novel training scheme that enables RL algorithms to efficiently learn a concept-based policy by only querying humans to label a small set of data, or in the extreme case, without any human labels.

Decision Making Reinforcement Learning (RL)

Interpretability in Action: Exploratory Analysis of VPT, a Minecraft Agent

no code implementations 16 Jul 2024 Karolis Jucys, George Adamopoulos, Mehrab Hamidi, Stephanie Milani, Mohammad Reza Samsami, Artem Zholus, Sonia Joseph, Blake Richards, Irina Rish, Özgür Şimşek

Understanding the mechanisms behind decisions taken by large foundation models in sequential decision making tasks is critical to ensuring that such systems operate transparently and safely.

Decision Making Minecraft +1

Unifying Interpretability and Explainability for Alzheimer's Disease Progression Prediction

1 code implementation 11 Jun 2024 Raja Farrukh Ali, Stephanie Milani, John Woods, Emmanuel Adenij, Ayesha Farooq, Clayton Mansel, Jeffrey Burns, William Hsu

Our findings show that only one of the RL methods is able to satisfactorily model disease progression, but the post-hoc explanations indicate that all methods fail to properly capture the importance of amyloid accumulation, one of the pathological hallmarks of Alzheimer's disease.

Reinforcement Learning (RL)

PATIENT-Ψ: Using Large Language Models to Simulate Patients for Training Mental Health Professionals

1 code implementation 30 May 2024 Ruiyi Wang, Stephanie Milani, Jamie C. Chiu, Jiayin Zhi, Shaun M. Eack, Travis Labrum, Samuel M. Murphy, Nev Jones, Kate Hardy, Hong Shen, Fei Fang, Zhiyu Zoey Chen

We propose an interactive training scheme, PATIENT-Ψ-TRAINER, for mental health trainees to practice a key skill in CBT -- formulating the cognitive model of the patient -- through role-playing a therapy session with PATIENT-Ψ.

BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks

1 code implementation NeurIPS 2023 Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah

Given the completion of two years of BASALT competitions, we offer to the community a formalized benchmark through the BASALT Evaluation and Demonstrations Dataset (BEDD), which serves as a resource for algorithm development and performance assessment.

Benchmarking Minecraft

MABL: Bi-Level Latent-Variable World Model for Sample-Efficient Multi-Agent Reinforcement Learning

no code implementations 12 Apr 2023 Aravind Venugopal, Stephanie Milani, Fei Fang, Balaraman Ravindran

Unlike existing models, MABL is capable of encoding essential global information into the latent states during training while guaranteeing the decentralized execution of learned policies.

reinforcement-learning SMAC+

Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games

no code implementations 2 Mar 2023 Stephanie Milani, Arthur Juliani, Ida Momennejad, Raluca Georgescu, Jaroslaw Rzepecki, Alison Shaw, Gavin Costello, Fei Fang, Sam Devlin, Katja Hofmann

We aim to understand how people assess human likeness in navigation produced by people and artificially intelligent (AI) agents in a video game.

AI Agent

A Survey of Explainable Reinforcement Learning

no code implementations 17 Feb 2022 Stephanie Milani, Nicholay Topin, Manuela Veloso, Fei Fang

In this survey, we propose a novel taxonomy for organizing the XRL literature that prioritizes the RL setting.

Decision Making reinforcement-learning +4

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

no code implementations 17 Feb 2022 Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

With this in mind, we hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which we permitted any solution to promote the participation of newcomers.

The MineRL BASALT Competition on Learning from Human Feedback

no code implementations 5 Jul 2021 Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.

Imitation Learning Minecraft

Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods

no code implementations 25 Feb 2021 Nicholay Topin, Stephanie Milani, Fei Fang, Manuela Veloso

Because of this decision tree equivalence, any function approximator can be used during training, including a neural network, while yielding a decision tree policy for the base MDP.

reinforcement-learning Reinforcement Learning +1

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors

no code implementations 26 Jan 2021 William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita, Nicholay Topin, Avinash Ummadisingu, Oriol Vinyals

Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development.

Decision Making Deep Reinforcement Learning +5

Guaranteeing Reproducibility in Deep Learning Competitions

no code implementations 12 May 2020 Brandon Houghton, Stephanie Milani, Nicholay Topin, William Guss, Katja Hofmann, Diego Perez-Liebana, Manuela Veloso, Ruslan Salakhutdinov

To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm where competitors are evaluated directly on the performance of their learning procedures rather than pre-trained agents.

Deep Learning

Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning

no code implementations 10 Mar 2020 Stephanie Milani, Nicholay Topin, Brandon Houghton, William H. Guss, Sharada P. Mohanty, Keisuke Nakata, Oriol Vinyals, Noboru Sean Kuno

To facilitate research in the direction of sample efficient reinforcement learning, we held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019).

Deep Reinforcement Learning Imitation Learning +2

Planning with Abstract Learned Models While Learning Transferable Subtasks

no code implementations 16 Dec 2019 John Winder, Stephanie Milani, Matthew Landen, Erebus Oh, Shane Parr, Shawn Squire, Marie desJardins, Cynthia Matuszek

We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction.

Hierarchical Reinforcement Learning reinforcement-learning +2

The MineRL 2019 Competition on Sample Efficient Reinforcement Learning using Human Priors

1 code implementation 22 Apr 2019 William H. Guss, Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, Diego Perez Liebana, Ruslan Salakhutdinov, Nicholay Topin, Manuela Veloso, Phillip Wang

To that end, we introduce: (1) the Minecraft ObtainDiamond task, a sequential decision making environment requiring long-term planning, hierarchical control, and efficient exploration methods; and (2) the MineRL-v0 dataset, a large-scale collection of over 60 million state-action pairs of human demonstrations that can be resimulated into embodied trajectories with arbitrary modifications to game state and visuals.

Decision Making Deep Reinforcement Learning +5
