Search Results for author: Aurick Zhou

Found 12 papers, 5 papers with code

Bayesian Adaptation for Covariate Shift

no code implementations NeurIPS 2021 Aurick Zhou, Sergey Levine

When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates. While improving the robustness of neural networks is one promising approach to mitigate this issue, an appealing alternative to robustifying networks against all possible test-time shifts is to instead directly adapt them to unlabeled inputs from the particular distribution shift we encounter at test time. However, this poses a challenging question: in the standard Bayesian model for supervised learning, unlabeled inputs are conditionally independent of model parameters when the labels are unobserved, so what can unlabeled data tell us about the model parameters at test time?

Domain Adaptation Image Classification

Training on Test Data with Bayesian Adaptation for Covariate Shift

no code implementations 27 Sep 2021 Aurick Zhou, Sergey Levine

When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.

Domain Adaptation Image Classification

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

no code implementations 15 Jul 2021 Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance towards positive outcomes.

Meta-Learning reinforcement-learning

Reinforcement Learning with Bayesian Classifiers: Efficient Skill Learning from Outcome Examples

no code implementations 1 Jan 2021 Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine

In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.

reinforcement-learning

Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation

no code implementations 5 Nov 2020 Aurick Zhou, Sergey Levine

In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks.

Bayesian Inference

Amortized Conditional Normalized Maximum Likelihood

no code implementations 28 Sep 2020 Aurick Zhou, Sergey Levine

In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks.

Bayesian Inference

Conservative Q-Learning for Offline Reinforcement Learning

10 code implementations NeurIPS 2020 Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine

We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.

Continuous Control DQN Replay Dataset

Learning to Walk via Deep Reinforcement Learning

no code implementations 26 Dec 2018 Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine

In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies.

Legged Robots reinforcement-learning

Composable Deep Reinforcement Learning for Robotic Manipulation

1 code implementation 19 Mar 2018 Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.

Q-Learning reinforcement-learning
