no code implementations • ICLR 2021 • Brandon Cui, Yin-Lam Chow, Mohammad Ghavamzadeh
We first formulate an LCE model to learn representations that are suitable to be used by a policy iteration style algorithm in the latent space.
no code implementations • NeurIPS 2020 • Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed, Craig Boutilier
A latent bandit problem is one in which the learning agent knows the arm reward distributions conditioned on an unknown discrete latent state.
no code implementations • 15 Jun 2020 • Joey Hong, Branislav Kveton, Manzil Zaheer, Yin-Lam Chow, Amr Ahmed
This approach is practical and analyzable, and we provide guarantees on both the quality of off-policy optimization and the regret during online deployment.
no code implementations • 9 Jun 2020 • Yin-Lam Chow, Brandon Cui, MoonKyung Ryu, Mohammad Ghavamzadeh
Model-based reinforcement learning (RL) algorithms allow us to combine model-generated data with data collected from interaction with the real system, alleviating the data-efficiency problem in RL.
1 code implementation • ICML 2020 • Rui Shu, Tung Nguyen, Yin-Lam Chow, Tuan Pham, Khoat Than, Mohammad Ghavamzadeh, Stefano Ermon, Hung H. Bui
High-dimensional observations and unknown dynamics are major challenges when applying optimal control to many real-world decision making tasks.
no code implementations • 8 Feb 2020 • Sungryull Sohn, Yin-Lam Chow, Jayden Ooi, Ofir Nachum, Honglak Lee, Ed Chi, Craig Boutilier
In batch reinforcement learning (RL), one often constrains a learned policy to be close to the behavior (data-generating) policy, e.g., by constraining the learned action distribution to differ from the behavior policy by some maximum degree that is the same at each state.
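The state-independent closeness constraint described above can be sketched in a minimal way: mix the learned action distribution with the behavior policy until their KL divergence falls within a fixed budget. This is an illustrative sketch only; the function names (`project_toward_behavior`) and the mixing scheme are hypothetical and not taken from the paper.

```python
import numpy as np

def kl(p, q):
    # KL divergence between two discrete action distributions.
    return float(np.sum(p * np.log(p / q)))

def project_toward_behavior(pi_learned, pi_behavior, eps):
    # Hypothetical helper: shrink the learned action distribution toward
    # the behavior policy until KL(mixed || behavior) <= eps, where eps
    # is the same budget at every state (the state-independent case the
    # abstract refers to).
    for alpha in np.linspace(1.0, 0.0, 101):
        mixed = alpha * pi_learned + (1.0 - alpha) * pi_behavior
        if kl(mixed, pi_behavior) <= eps:
            return mixed
    return pi_behavior
```

With a large budget the learned policy passes through unchanged; with a zero budget the projection collapses onto the behavior policy.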
no code implementations • 4 Dec 2019 • Ofir Nachum, Bo Dai, Ilya Kostrikov, Yin-Lam Chow, Lihong Li, Dale Schuurmans
In many real-world applications of reinforcement learning (RL), interactions with the environment are limited due to cost or feasibility.
no code implementations • ICLR 2020 • Moonkyung Ryu, Yin-Lam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier
Value-based reinforcement learning (RL) methods like Q-learning have shown success in a variety of domains.
1 code implementation • ICLR 2020 • Nir Levine, Yin-Lam Chow, Rui Shu, Ang Li, Mohammad Ghavamzadeh, Hung Bui
A promising approach is to embed the high-dimensional observations into a lower-dimensional latent representation space, estimate the latent dynamics model, then utilize this model for control in the latent space.
2 code implementations • NeurIPS 2019 • Ofir Nachum, Yin-Lam Chow, Bo Dai, Lihong Li
In contrast to previous approaches, our algorithm is agnostic to knowledge of the behavior policy (or policies) used to generate the dataset.
1 code implementation • 28 Jan 2019 • Yin-Lam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
We formulate these problems as constrained Markov decision processes (CMDPs) and present safe policy optimization algorithms that are based on a Lyapunov approach to solve them.
no code implementations • NeurIPS 2018 • Bo Liu, Tengyang Xie, Yangyang Xu, Mohammad Ghavamzadeh, Yin-Lam Chow, Daoming Lyu, Daesub Yoon
Risk management in dynamic decision problems is a primary concern in many fields, including financial investment, autonomous driving, and healthcare.
no code implementations • 13 Aug 2018 • Jonathan Lacotte, Mohammad Ghavamzadeh, Yin-Lam Chow, Marco Pavone
We then derive two different versions of our RS-GAIL optimization problem that aim at matching the risk profiles of the agent and the expert w.r.t.
1 code implementation • NeurIPS 2018 • Yin-Lam Chow, Ofir Nachum, Edgar Duenez-Guzman, Mohammad Ghavamzadeh
In many real-world reinforcement learning (RL) problems, besides optimizing the main objective function, an agent must concurrently avoid violating a number of constraints.
no code implementations • ICML 2018 • Mehrdad Farajtabar, Yin-Lam Chow, Mohammad Ghavamzadeh
In particular, we focus on the doubly robust (DR) estimators that consist of an importance sampling (IS) component and a performance model, and utilize the low (or zero) bias of IS and low variance of the model at the same time.
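As a concrete illustration of the estimator family named above, here is a minimal one-step (bandit-style) doubly robust estimate; the paper's setting is sequential and more involved, so treat this as a sketch of the general IS-plus-model idea, with all names my own.

```python
import numpy as np

def doubly_robust_value(rewards, behavior_probs, target_probs, q_hat, v_hat):
    # One-step doubly robust estimate:
    #   V_DR = mean( v_hat + w * (r - q_hat) ),  w = pi_target / pi_behavior
    # The importance-sampling term corrects the model's bias, while the
    # model terms (q_hat, v_hat) damp the variance of pure IS: the
    # estimate is unbiased if either component is correct.
    w = target_probs / behavior_probs
    return float(np.mean(v_hat + w * (rewards - q_hat)))
```

When the model predicts the observed rewards exactly, the correction term vanishes and the estimate reduces to the model's value prediction.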
no code implementations • ICML 2018 • Ofir Nachum, Yin-Lam Chow, Mohammad Ghavamzadeh
In this paper, we follow the work of Nachum et al. (2017) in the soft ERL setting, and propose a class of novel path consistency learning (PCL) algorithms, called sparse PCL, for the sparse ERL problem that can work with both on-policy and off-policy data.
no code implementations • ICLR 2018 • Aviv Tamar, Khashayar Rohanimanesh, Yin-Lam Chow, Chris Vigorito, Ben Goodrich, Michael Kahane, Derik Pridmore
In this paper we present an LfD approach for learning multiple modes of behavior from visual data.
no code implementations • NeurIPS 2016 • Marek Petrik, Yin-Lam Chow, Mohammad Ghavamzadeh
We show that our formulation is NP-hard and propose an approximate algorithm.
no code implementations • 5 Dec 2015 • Yin-Lam Chow, Mohammad Ghavamzadeh, Lucas Janson, Marco Pavone
In many sequential decision-making problems, one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences.
no code implementations • 29 Sep 2015 • Yin-Lam Chow, Jia Yuan Yu, Marco Pavone
We consider one-way vehicle sharing systems where customers can rent a car at one station and drop it off at another.
no code implementations • NeurIPS 2015 • Yin-Lam Chow, Aviv Tamar, Shie Mannor, Marco Pavone
Our first contribution is to show that a CVaR objective, besides capturing risk sensitivity, has an alternative interpretation as expected cost under worst-case modeling errors, for a given error budget.
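The CVaR objective referred to above has a simple empirical form: the expected cost over the worst alpha-fraction of outcomes. A minimal sketch (the function name and the use of the empirical quantile are my own; the paper works with the full MDP formulation):

```python
import numpy as np

def cvar(costs, alpha):
    # Empirical CVaR_alpha: average cost over the worst alpha-fraction
    # of outcomes, i.e., costs at or above the (1 - alpha)-quantile.
    costs = np.sort(np.asarray(costs, dtype=float))
    var = np.quantile(costs, 1.0 - alpha)   # Value-at-Risk threshold
    tail = costs[costs >= var]
    return float(tail.mean())
```

By construction CVaR is never smaller than the plain expectation, which is what makes it a risk-sensitive (tail-aware) criterion.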
no code implementations • NeurIPS 2015 • Aviv Tamar, Yin-Lam Chow, Mohammad Ghavamzadeh, Shie Mannor
For static risk measures, our approach is in the spirit of policy gradient algorithms and combines a standard sampling approach with convex programming.
no code implementations • 12 Feb 2015 • Jiyan Yang, Yin-Lam Chow, Christopher Ré, Michael W. Mahoney
We aim to bridge the gap between these two methods in solving constrained overdetermined linear regression problems, e.g., $\ell_2$ and $\ell_1$ regression.
no code implementations • NeurIPS 2014 • Yin-Lam Chow, Mohammad Ghavamzadeh
In many sequential decision-making problems, we may want to manage risk by minimizing some measure of variability in costs in addition to minimizing a standard criterion.
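One common instance of "a standard criterion plus a measure of variability" is the mean-variance objective J = E[C] + lam * Var[C]. A minimal sketch, with variance as an assumed (illustrative) choice of variability measure:

```python
import numpy as np

def mean_variance_objective(costs, lam):
    # Risk-sensitive criterion: expected cost plus a penalty on its
    # variability, J = E[C] + lam * Var[C]. Larger lam trades expected
    # performance for more predictable (lower-variance) outcomes.
    costs = np.asarray(costs, dtype=float)
    return float(costs.mean() + lam * costs.var())
```

Setting lam = 0 recovers the standard risk-neutral expected-cost criterion.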