Search Results for author: Nan Ye

Found 10 papers, 4 papers with code

LiMIIRL: Lightweight Multiple-Intent Inverse Reinforcement Learning

no code implementations 3 Jun 2021 Aaron J. Snoswell, Surya P. N. Singh, Nan Ye

Multiple-Intent Inverse Reinforcement Learning (MI-IRL) seeks to find a reward function ensemble to rationalize demonstrations of different but unlabelled intents.

Revisiting Maximum Entropy Inverse Reinforcement Learning: New Perspectives and Algorithms

1 code implementation 1 Dec 2020 Aaron J. Snoswell, Surya P. N. Singh, Nan Ye

This improves the previous heuristic derivation of the MaxEnt IRL model (for stochastic MDPs), allows a unified view of MaxEnt IRL and Relative Entropy IRL, and leads to a model-free learning algorithm for the MaxEnt IRL model.

OpenAI Gym

Discriminative Particle Filter Reinforcement Learning for Complex Partial Observations

1 code implementation ICLR 2020 Xiao Ma, Peter Karkus, David Hsu, Wee Sun Lee, Nan Ye

The particle filter maintains a belief using a learned discriminative update, which is trained end-to-end for decision making.

Atari Games, Decision Making

Greedy Convex Ensemble

1 code implementation 9 Oct 2019 Tan Nguyen, Nan Ye, Peter L. Bartlett

Theoretically, we first consider whether we can use linear, instead of convex, combinations, and obtain generalization results similar to existing ones for learning from a convex hull.

Nesterov Acceleration of Alternating Least Squares for Canonical Tensor Decomposition: Momentum Step Size Selection and Restart Mechanisms

1 code implementation 13 Oct 2018 Drew Mitchell, Nan Ye, Hans De Sterck

While Nesterov acceleration turns gradient descent into an optimal first-order method for convex problems by adding a momentum term with a specific weight sequence, a direct application of this method and weight sequence to ALS results in erratic convergence behaviour.

Tensor Decomposition

Tensor Belief Propagation

no code implementations ICML 2017 Andrew Wrigley, Wee Sun Lee, Nan Ye

We propose a new approximate inference algorithm for graphical models, tensor belief propagation, based on approximating the messages passed in the junction tree algorithm.

DESPOT: Online POMDP Planning with Regularization

no code implementations NeurIPS 2013 Nan Ye, Adhiraj Somani, David Hsu, Wee Sun Lee

We show that the best policy obtained from a DESPOT is near-optimal, with a regret bound that depends on the representation size of the optimal policy.

Autonomous Driving

Conditional Random Fields with High-Order Features for Sequence Labeling

no code implementations NeurIPS 2009 Nan Ye, Wee S. Lee, Hai L. Chieu, Dan Wu

Dependencies among neighbouring labels in a sequence are an important source of information for sequence labeling problems.
