Search Results for author: Philipp Moritz

Found 10 papers, 9 papers with code

Trust Region Policy Optimization

21 code implementations · 19 Feb 2015 · John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, Pieter Abbeel

We describe an iterative procedure for optimizing policies, with guaranteed monotonic improvement.

Atari Games, Policy Gradient Methods

High-Dimensional Continuous Control Using Generalized Advantage Estimation

17 code implementations · 8 Jun 2015 · John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.

Continuous Control, Policy Gradient Methods

A Linearly-Convergent Stochastic L-BFGS Algorithm

1 code implementation · 9 Aug 2015 · Philipp Moritz, Robert Nishihara, Michael I. Jordan

We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions.

SparkNet: Training Deep Networks in Spark

1 code implementation · 19 Nov 2015 · Philipp Moritz, Robert Nishihara, Ion Stoica, Michael I. Jordan

We introduce SparkNet, a framework for training deep networks in Spark.

Real-Time Machine Learning: The Missing Pieces

2 code implementations · 11 Mar 2017 · Robert Nishihara, Philipp Moritz, Stephanie Wang, Alexey Tumanov, William Paul, Johann Schleier-Smith, Richard Liaw, Mehrdad Niknami, Michael I. Jordan, Ion Stoica

Machine learning applications are increasingly deployed not only to serve predictions using static models, but also as tightly-integrated components of feedback loops involving dynamic, real-time decision making.

BIG-bench Machine Learning, Decision Making

Ray: A Distributed Framework for Emerging AI Applications

4 code implementations · 16 Dec 2017 · Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, Ion Stoica

To meet the performance requirements, Ray employs a distributed scheduler and a distributed and fault-tolerant store to manage the system's control state.

Reinforcement Learning (RL)

RLlib: Abstractions for Distributed Reinforcement Learning

3 code implementations · ICML 2018 · Eric Liang, Richard Liaw, Philipp Moritz, Robert Nishihara, Roy Fox, Ken Goldberg, Joseph E. Gonzalez, Michael I. Jordan, Ion Stoica

Reinforcement learning (RL) algorithms involve the deep nesting of highly irregular computation patterns, each of which typically exhibits opportunities for distributed computation.

Reinforcement Learning (RL)

Tune: A Research Platform for Distributed Model Selection and Training

4 code implementations · 13 Jul 2018 · Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, Ion Stoica

We show that this interface meets the requirements for a broad range of hyperparameter search algorithms, allows straightforward scaling of search to large clusters, and simplifies algorithm implementation.

Hyperparameter Optimization, Model Selection

Hoplite: Efficient and Fault-Tolerant Collective Communication for Task-Based Distributed Systems

1 code implementation · 13 Feb 2020 · Siyuan Zhuang, Zhuohan Li, Danyang Zhuo, Stephanie Wang, Eric Liang, Robert Nishihara, Philipp Moritz, Ion Stoica

Task-based distributed frameworks (e.g., Ray, Dask, Hydro) have become increasingly popular for distributed applications that contain asynchronous and dynamic workloads, including asynchronous gradient descent, reinforcement learning, and model serving.

Distributed Computing, Reinforcement Learning
