Search Results for author: Sandy H. Huang

Found 8 papers, 2 papers with code

Coherent Soft Imitation Learning

1 code implementation • NeurIPS 2023 • Joe Watson, Sandy H. Huang, Nicolas Heess

Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward.

Imitation Learning, reinforcement-learning

Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning

no code implementations • 26 Apr 2023 • Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess

We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments.

reinforcement-learning

On Multi-objective Policy Optimization as a Tool for Reinforcement Learning: Case Studies in Offline RL and Finetuning

no code implementations • 15 Jun 2021 • Abbas Abdolmaleki, Sandy H. Huang, Giulia Vezzani, Bobak Shahriari, Jost Tobias Springenberg, Shruti Mishra, Dhruva TB, Arunkumar Byravan, Konstantinos Bousmalis, Andras Gyorgy, Csaba Szepesvari, Raia Hadsell, Nicolas Heess, Martin Riedmiller

Many advances that have improved the robustness and efficiency of deep reinforcement learning (RL) algorithms can, in one way or another, be understood as introducing additional objectives or constraints in the policy optimization step.

Offline RL, reinforcement-learning

A Distributional View on Multi-Objective Policy Optimization

1 code implementation • 15 May 2020 • Abbas Abdolmaleki, Sandy H. Huang, Leonard Hasenclever, Michael Neunert, H. Francis Song, Martina Zambelli, Murilo F. Martins, Nicolas Heess, Raia Hadsell, Martin Riedmiller

Many real-world problems require trading off multiple competing objectives.

Multi-Objective Reinforcement Learning

Nonverbal Robot Feedback for Human Teachers

no code implementations • 6 Nov 2019 • Sandy H. Huang, Isabella Huang, Ravi Pandya, Anca D. Dragan

Robots can learn preferences from human demonstrations, but their success depends on how informative these demonstrations are.

Human-AI Learning Performance in Multi-Armed Bandits

no code implementations • 21 Dec 2018 • Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan

People frequently face challenging decision-making problems in which outcomes are uncertain or unknown.

Decision Making, Multi-Armed Bandits

Establishing Appropriate Trust via Critical States

no code implementations • 18 Oct 2018 • Sandy H. Huang, Kush Bhatia, Pieter Abbeel, Anca D. Dragan

In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts.

Robotics

Enabling Robots to Communicate their Objectives

no code implementations • 11 Feb 2017 • Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan

We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what it will do in novel situations.

Autonomous Driving
