no code implementations • 24 Jul 2018 • Timothy A. Mann, Sven Gowal, András György, Ray Jiang, Huiyi Hu, Balaji Lakshminarayanan, Prav Srinivasan
Predicting delayed outcomes is an important problem in recommender systems (e.g., whether customers will finish reading an ebook).
no code implementations • 11 Mar 2018 • Esther Derman, Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor
It learns an optimal policy with respect to a distribution over an uncertainty set, staying robust to model uncertainty while avoiding the conservativeness of worst-case robust strategies.
1 code implementation • ICLR 2019 • Ray Jiang, Sven Gowal, Timothy A. Mann, Danilo J. Rezende
The conventional solution to the recommendation problem greedily ranks individual document candidates by prediction scores.
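As an illustrative sketch (not the paper's proposed method), the conventional greedy baseline described above simply sorts candidate documents by their individually predicted scores and keeps the top k; the function and variable names below are hypothetical:

```python
def greedy_rank(candidates, scores, k):
    """Return the top-k candidates ordered by descending predicted score."""
    order = sorted(range(len(candidates)),
                   key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order[:k]]

# Toy usage: four documents with model scores, keep the best two.
top2 = greedy_rank(["a", "b", "c", "d"], [0.2, 0.9, 0.5, 0.7], k=2)
# -> ["b", "d"]
```

The paper's point is that this per-document greedy ranking ignores interactions within the ranked slate, which is what motivates moving beyond it.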
no code implementations • 9 Feb 2018 • Daniel J. Mankowitz, Timothy A. Mann, Pierre-Luc Bacon, Doina Precup, Shie Mannor
We present a Robust Options Policy Iteration (ROPI) algorithm with convergence guarantees, which learns options that are robust to model uncertainty.
no code implementations • 30 Dec 2016 • Timothy A. Mann, Hugo Penedones, Shie Mannor, Todd Hester
Temporal Difference learning or TD($\lambda$) is a fundamental algorithm in the field of reinforcement learning.
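For context, a minimal tabular TD($\lambda$) sketch with accumulating eligibility traces, run on a toy two-step chain (the episode format and all names here are illustrative assumptions, not code from the paper):

```python
def td_lambda_episode(V, trajectory, alpha=0.1, gamma=1.0, lam=0.9):
    """Update value table V in place from one episode.

    trajectory is a list of (state, reward, next_state) tuples,
    with next_state = None at the terminal transition.
    """
    e = [0.0] * len(V)  # eligibility traces
    for s, r, s_next in trajectory:
        v_next = V[s_next] if s_next is not None else 0.0
        delta = r + gamma * v_next - V[s]  # TD error
        e[s] += 1.0  # accumulating trace
        for i in range(len(V)):
            V[i] += alpha * delta * e[i]
            e[i] *= gamma * lam  # decay all traces
    return V

# Toy 3-state chain 0 -> 1 -> terminal, reward 1 on the final step.
V = [0.0, 0.0, 0.0]
td_lambda_episode(V, [(0, 0.0, 1), (1, 1.0, None)])
# -> V = [0.09, 0.1, 0.0]
```

The trace decay `gamma * lam` is what lets the final reward propagate back to earlier states within a single episode.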
no code implementations • NeurIPS 2016 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor
We introduce the Adaptive Skills, Adaptive Partitions (ASAP) framework that (1) learns skills (i.e., temporally extended actions or options) as well as (2) where to apply them.
no code implementations • 10 Feb 2016 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor
For complex, high-dimensional Markov Decision Processes (MDPs), it may be necessary to represent the policy with function approximation.
no code implementations • 11 Jun 2015 • Daniel J. Mankowitz, Timothy A. Mann, Shie Mannor
The monolithic approach to policy representation in Markov Decision Processes (MDPs) looks for a single policy that can be represented as a function from states to actions.
no code implementations • 16 Apr 2015 • Nir Levine, Timothy A. Mann, Shie Mannor
Twitter, a popular social network, presents great opportunities for on-line machine learning research.
no code implementations • NeurIPS 2014 • Odalric-Ambrym Maillard, Timothy A. Mann, Shie Mannor
In Reinforcement Learning (RL), state-of-the-art algorithms require a large number of samples per state-action pair to estimate the transition kernel $p$.
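To make the sample-cost concrete, estimating the transition kernel $p(s' \mid s, a)$ from data is typically done by counting observed transitions per state-action pair; this is a generic maximum-likelihood sketch with hypothetical names, not the algorithm from the paper:

```python
from collections import Counter, defaultdict

def estimate_kernel(samples):
    """ML estimate of p(s' | s, a) from (s, a, s') transition samples."""
    counts = defaultdict(Counter)
    for s, a, s_next in samples:
        counts[(s, a)][s_next] += 1
    return {sa: {s2: n / sum(c.values()) for s2, n in c.items()}
            for sa, c in counts.items()}

# Toy data: three samples of action "a" from state 0, one of action "b".
samples = [(0, "a", 1), (0, "a", 1), (0, "a", 2), (0, "b", 0)]
p = estimate_kernel(samples)
# p[(0, "a")] -> {1: 2/3, 2: 1/3}
```

Because every state-action pair needs enough samples for its own count-based estimate, the total sample requirement grows with the size of the state-action space, which is the cost the paper targets.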