Search Results for author: Timothy A. Mann

Found 11 papers, 1 paper with code

Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems

no code implementations • 24 Jul 2018 • Timothy A. Mann, Sven Gowal, András György, Ray Jiang, Huiyi Hu, Balaji Lakshminarayanan, Prav Srinivasan

Predicting delayed outcomes is an important problem in recommender systems (e.g., if customers will finish reading an ebook).

Recommendation Systems

Soft-Robust Actor-Critic Policy-Gradient

no code implementations • 11 Mar 2018 • Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

It learns an optimal policy with respect to a distribution over an uncertainty set and stays robust to model uncertainty but avoids the conservativeness of robust strategies.

Reinforcement Learning (RL)

Beyond Greedy Ranking: Slate Optimization via List-CVAE

1 code implementation • ICLR 2019 • Ray Jiang, Sven Gowal, Timothy A. Mann, Danilo J. Rezende

The conventional solution to the recommendation problem greedily ranks individual document candidates by prediction scores.

Learning Robust Options

no code implementations • 9 Feb 2018 • Daniel J. Mankowitz, Timothy A. Mann, Pierre-Luc Bacon, Doina Precup, Shie Mannor

We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty.

Adaptive Lambda Least-Squares Temporal Difference Learning

no code implementations • 30 Dec 2016 • Timothy A. Mann, Hugo Penedones, Shie Mannor, Todd Hester

Temporal Difference learning or TD($\lambda$) is a fundamental algorithm in the field of reinforcement learning.

Adaptive Skills Adaptive Partitions (ASAP)

no code implementations • NeurIPS 2016 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

We introduce the Adaptive Skills, Adaptive Partitions (ASAP) framework that (1) learns skills (i.e., temporally extended actions or options) as well as (2) where to apply them.

Adaptive Skills, Adaptive Partitions (ASAP)

no code implementations • 10 Feb 2016 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

We introduce the Adaptive Skills, Adaptive Partitions (ASAP) framework that (1) learns skills (i.e., temporally extended actions or options) as well as (2) where to apply them.

Iterative Hierarchical Optimization for Misspecified Problems (IHOMP)

no code implementations • 10 Feb 2016 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

For complex, high-dimensional Markov Decision Processes (MDPs), it may be necessary to represent the policy with function approximation.

Bootstrapping Skills

no code implementations • 11 Jun 2015 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor

The monolithic approach to policy representation in Markov Decision Processes (MDPs) looks for a single policy that can be represented as a function from states to actions.

Reinforcement Learning (RL)

Actively Learning to Attract Followers on Twitter

no code implementations • 16 Apr 2015 • Nir Levine, Timothy A. Mann, Shie Mannor

Twitter, a popular social network, presents great opportunities for on-line machine learning research.

BIG-bench Machine Learning

"How hard is my MDP?" The distribution-norm to the rescue

no code implementations • NeurIPS 2014 • Odalric-Ambrym Maillard, Timothy A. Mann, Shie Mannor

In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel $p$.

Reinforcement Learning (RL)
