Search Results for author: Sandy H. Huang

Found 8 papers, 2 papers with code

Coherent Soft Imitation Learning

1 code implementation NeurIPS 2023 Joe Watson, Sandy H. Huang, Nicolas Heess

Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward.

Imitation Learning reinforcement-learning

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning

no code implementations15 Jun 2021 Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, Andras Gyorgy, Csaba Szepesvari, Raia Hadsell, Nicolas Heess, Martin Riedmiller

Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) algorithms can, in one way or another, be understood as introducing additional objectives or constraints in the policy optimization step.

Offline RL reinforcement-learning +1

Nonverbal Robot Feedback for Human Teachers

no code implementations6 Nov 2019 Sandy H. Huang, Isabella Huang, Ravi Pandya, Anca D. Dragan

Robots can learn preferences from human demonstrations, but their success depends on how informative these demonstrations are.

Human-AI Learning Performance in Multi-Armed Bandits

no code implementations21 Dec 2018 Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan

People frequently face challenging decision-making problems in which outcomes are uncertain or unknown.

Decision Making Multi-Armed Bandits

Establishing Appropriate Trust via Critical States

no code implementations18 Oct 2018 Sandy H. Huang, Kush Bhatia, Pieter Abbeel, Anca D. Dragan

In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts.

Robotics

Enabling Robots to Communicate their Objectives

no code implementations11 Feb 2017 Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan

We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what it will do in novel situations.

Autonomous Driving

Cannot find the paper you are looking for? You can Submit a new open access paper.