Search Results for author: Yinlam Chow

Found 15 papers, 2 papers with code

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

no code implementations25 Feb 2024 Anthony Liang, Guy Tennenholtz, Chih-Wei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier

We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates.

Continuous Control, Meta Reinforcement Learning
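
The abstract above describes environments whose latent state evolves at varying rates within an episode. As a rough, hypothetical illustration of that setting (a toy rollout, not code from the paper), the snippet below occasionally resamples a hidden latent variable that shifts the reward function, so an agent that cannot track the latent dynamics earns less reward.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(episode_len=50, switch_prob=0.1):
    """Toy episode in which a hidden latent variable is resampled at random
    'session' boundaries, so it evolves at a varying rate within the episode."""
    latent = rng.normal()                      # hidden context the agent never observes
    total_reward = 0.0
    for _ in range(episode_len):
        if rng.random() < switch_prob:         # latent state occasionally changes
            latent = rng.normal()
        action = rng.normal()                  # stand-in for a meta-RL policy's action
        total_reward += -(action - latent) ** 2   # reward depends on the hidden latent
    return total_reward

print(f"toy episode return: {rollout():.2f}")
```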

Preference Elicitation with Soft Attributes in Interactive Recommendation

no code implementations22 Oct 2023 Erdem Biyik, Fan Yao, Yinlam Chow, Alex Haig, Chih-Wei Hsu, Mohammad Ghavamzadeh, Craig Boutilier

Leveraging concept activation vectors for soft attribute semantics, we develop novel preference elicitation methods that can accommodate soft attributes and bring together both item and attribute-based preference elicitation.

Attribute, Recommendation Systems
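
Concept activation vectors are commonly obtained by fitting a linear separator in embedding space between examples that do and do not exhibit an attribute, and using the separator's normal as the attribute direction. The sketch below only illustrates that general idea with made-up embeddings and a hypothetical soft-attribute label; it is not the authors' elicitation method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Made-up item embeddings and binary labels for a hypothetical soft
# attribute (e.g. items users have described as "cozy").
item_embeddings = rng.normal(size=(200, 16))
has_attribute = (item_embeddings[:, 0] + 0.1 * rng.normal(size=200)) > 0

# A concept activation vector is often taken to be the weight vector of a
# linear classifier separating positive from negative examples in embedding space.
clf = LogisticRegression().fit(item_embeddings, has_attribute)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])

# Items can then be scored along the attribute direction, e.g. to answer
# "show me something a bit more <attribute>" during elicitation.
attribute_scores = item_embeddings @ cav
print("items ranked most attribute-like:", np.argsort(-attribute_scores)[:5])
```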

Factual and Personalized Recommendations using Language Models and Reinforcement Learning

no code implementations9 Oct 2023 Jihwan Jeong, Yinlam Chow, Guy Tennenholtz, Chih-Wei Hsu, Azamat Tulepbergenov, Mohammad Ghavamzadeh, Craig Boutilier

Recommender systems (RSs) play a central role in connecting users to content, products, and services, matching candidate items to users based on their preferences.

Language Modelling, Recommendation Systems, +1

Demystifying Embedding Spaces using Large Language Models

no code implementations6 Oct 2023 Guy Tennenholtz, Yinlam Chow, Chih-Wei Hsu, Jihwan Jeong, Lior Shani, Azamat Tulepbergenov, Deepak Ramachandran, Martin Mladenov, Craig Boutilier

Embeddings have become a pivotal means to represent complex, multi-faceted information about entities, concepts, and relationships in a condensed and useful format.

Dimensionality Reduction, Recommendation Systems

Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning

no code implementations25 Jul 2022 Deborah Cohen, MoonKyung Ryu, Yinlam Chow, Orgad Keller, Ido Greenberg, Avinatan Hassidim, Michael Fink, Yossi Matias, Idan Szpektor, Craig Boutilier, Gal Elidan

Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge.

Natural Language Understanding, reinforcement-learning, +1

A Mixture-of-Expert Approach to RL-based Dialogue Management

no code implementations31 May 2022 Yinlam Chow, Aza Tulepbergenov, Ofir Nachum, MoonKyung Ryu, Mohammad Ghavamzadeh, Craig Boutilier

Despite recent advancements in language models (LMs), their application to dialogue management (DM) problems and ability to carry on rich conversations remain a challenge.

Attribute, Dialogue Management, +3

Efficient Risk-Averse Reinforcement Learning

2 code implementations10 May 2022 Ido Greenberg, Yinlam Chow, Mohammad Ghavamzadeh, Shie Mannor

In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.

Autonomous Driving, reinforcement-learning, +1
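
Conditional value-at-risk (CVaR), the mean of the worst alpha-fraction of returns, is one risk measure commonly optimized in this setting. The snippet below estimates CVaR from synthetic sampled returns; the closing comment only gestures at a generic CVaR policy gradient, not the specific algorithm proposed in the paper.

```python
import numpy as np

def empirical_cvar(returns, alpha=0.1):
    """Mean of the worst alpha-fraction of sampled returns (CVaR_alpha)."""
    sorted_returns = np.sort(np.asarray(returns))
    k = max(1, int(np.ceil(alpha * len(sorted_returns))))
    return sorted_returns[:k].mean()

rng = np.random.default_rng(0)
sampled_returns = rng.normal(loc=1.0, scale=2.0, size=1000)   # synthetic episode returns

print(f"mean return = {sampled_returns.mean():.3f}")
print(f"CVaR_0.1    = {empirical_cvar(sampled_returns, alpha=0.1):.3f}")

# A generic CVaR policy gradient would, for example, keep only trajectories
# whose return falls below the alpha-quantile and average their gradients,
# thereby optimizing the tail of the return distribution rather than its mean.
```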

SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition

no code implementations10 Feb 2022 Dylan Slack, Yinlam Chow, Bo Dai, Nevan Wichers

However, we find that these techniques are not well suited to safe policy learning because they ignore negative experiences (e.g., unsafe or unsuccessful ones) and focus only on positive experiences, which harms their ability to generalize safely to new tasks.

reinforcement-learning, Reinforcement Learning (RL), +2

Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors

2 code implementations6 Feb 2022 Christina Göpfert, Alex Haig, Yinlam Chow, Chih-Wei Hsu, Ivan Vendrov, Tyler Lu, Deepak Ramachandran, Hubert Pham, Mohammad Ghavamzadeh, Craig Boutilier

Interactive recommender systems have emerged as a promising paradigm to overcome the limitations of the primitive user feedback used by traditional recommender systems (e.g., clicks, item consumption, ratings).

Recommendation Systems

SAFER: Data-Efficient and Safe Reinforcement Learning Through Skill Acquisition

no code implementations29 Sep 2021 Dylan Z Slack, Yinlam Chow, Bo Dai, Nevan Wichers

Though many reinforcement learning (RL) problems involve learning policies in settings where safety constraints are difficult to specify and rewards are sparse, current methods struggle to rapidly and safely acquire successful policies.

reinforcement-learning, Reinforcement Learning (RL), +2

Non-Stationary Latent Bandits

no code implementations1 Dec 2020 Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Mohammad Ghavamzadeh, Craig Boutilier

The key idea is to frame this problem as a latent bandit, where the prototypical models of user behavior are learned offline and the latent state of the user is inferred online from its interactions with the models.

Recommendation Systems, Thompson Sampling
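
Read literally, the setup is: a small set of prototypical reward models is learned offline, and online the learner maintains a posterior over which model governs the current user, acting via Thompson sampling. The sketch below implements that loop for a stationary latent bandit with Gaussian rewards and made-up model parameters; it omits the non-stationary machinery that is the paper's actual contribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n_arms, n_latent, horizon, noise = 5, 3, 500, 0.5

# Prototypical models of user behavior, assumed to be learned offline:
# mean reward of each arm under each latent user state (made-up here).
model_means = rng.normal(size=(n_latent, n_arms))

true_state = rng.integers(n_latent)      # hidden latent state of this user
log_post = np.zeros(n_latent)            # log-posterior over latent states

for _ in range(horizon):
    # Thompson sampling: draw a latent state from the posterior, then act
    # greedily under the corresponding prototypical model.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    sampled_state = rng.choice(n_latent, p=post)
    arm = int(np.argmax(model_means[sampled_state]))

    reward = model_means[true_state, arm] + noise * rng.normal()

    # Bayesian update: Gaussian likelihood of the observed reward under
    # each candidate latent state's prototypical model.
    log_post += -0.5 * ((reward - model_means[:, arm]) / noise) ** 2

post = np.exp(log_post - log_post.max())
post /= post.sum()
print("posterior over latent states:", np.round(post, 3), "| true state:", true_state)
```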

CoinDICE: Off-Policy Confidence Interval Estimation

no code implementations NeurIPS 2020 Bo Dai, Ofir Nachum, Yinlam Chow, Lihong Li, Csaba Szepesvári, Dale Schuurmans

We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning, where the goal is to estimate a confidence interval on a target policy's value, given only access to a static experience dataset collected by unknown behavior policies.

Off-policy evaluation, valid
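
CoinDICE derives its confidence intervals from a DICE-style optimization; as a much simpler point of reference (and explicitly not the CoinDICE estimator), the sketch below forms a per-episode importance-sampling estimate of a target policy's value from logged data and a bootstrap confidence interval around it. Unlike the behavior-agnostic setting studied in the paper, this baseline assumes the logging policy's probabilities are known, and all policies, rewards, and horizons here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_episodes, horizon, n_actions = 500, 10, 3

behavior_probs = np.full(n_actions, 1.0 / n_actions)   # uniform logging policy
target_probs = np.array([0.6, 0.3, 0.1])               # hypothetical target policy

# Logged dataset: per-episode importance weights and returns.
weights, returns = [], []
for _ in range(n_episodes):
    w, g = 1.0, 0.0
    for _ in range(horizon):
        a = rng.integers(n_actions)                     # behavior policy acts uniformly
        w *= target_probs[a] / behavior_probs[a]
        g += float(a == 0) + 0.1 * rng.normal()         # made-up reward signal
    weights.append(w)
    returns.append(g)
weights, returns = np.array(weights), np.array(returns)

# Per-episode importance-sampling estimate and a bootstrap 95% interval.
estimates = weights * returns
boot_means = [rng.choice(estimates, size=n_episodes).mean() for _ in range(2000)]
low, high = np.percentile(boot_means, [2.5, 97.5])
print(f"IS estimate = {estimates.mean():.3f}, bootstrap 95% CI = [{low:.3f}, {high:.3f}]")
```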

Safe Policy Learning for Continuous Control

no code implementations25 Sep 2019 Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Duenez-Guzman, Mohammad Ghavamzadeh

We study continuous action reinforcement learning problems in which it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that keep the agent in desirable situations, both during training and at convergence.

Continuous Control

Lyapunov-based Safe Policy Optimization

no code implementations27 Sep 2018 Yinlam Chow, Ofir Nachum, Mohammad Ghavamzadeh, Edgar Guzman-Duenez

In many reinforcement learning applications, it is crucial that the agent interacts with the environment only through safe policies, i.e., policies that do not take the agent to certain undesirable situations.
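
Both this abstract and the Safe Policy Learning entry above frame safety as keeping the agent out of undesirable situations throughout training. As a deliberately crude stand-in (not the Lyapunov-based construction these papers develop), the snippet below gates each proposed action with a hypothetical constraint-cost estimate and falls back to a known safe action when the estimate exceeds a budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost_critic(state, action):
    """Hypothetical estimate of the expected constraint cost of (state, action)."""
    return float(abs(state + action))

def safe_step(state, proposed_action, safe_action=0.0, budget=1.0):
    """Execute the proposed action only if its estimated constraint cost stays
    within the budget; otherwise fall back to a known safe baseline action."""
    if cost_critic(state, proposed_action) <= budget:
        return proposed_action
    return safe_action

state = rng.normal()
proposed = rng.normal(scale=2.0)        # exploratory action from the learner
executed = safe_step(state, proposed)
print(f"state={state:.2f}  proposed={proposed:.2f}  executed={executed:.2f}")
```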
