Conditions on Preference Relations that Guarantee the Existence of Optimal Policies

Jonathan Colaço Carr, Prakash Panangaden, Doina Precup

Current results guaranteeing the existence of optimal policies in Learning from Preferential Feedback (LfPF) problems assume that both the preferences and transition dynamics are determined by a Markov Decision Process.

A Kernel Perspective on Behavioural Metrics for Markov Decision Processes

Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

Behavioural metrics have been shown to be an effective mechanism for constructing representations in reinforcement learning.


Continuous MDP Homomorphisms and Homomorphic Policy Gradient

Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, Doina Precup

Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms.

Riemannian Diffusion Models

Chin-wei Huang, Milad Aghajohari, Avishek Joey Bose, Prakash Panangaden, Aaron Courville

In this work, we generalize continuous-time diffusion models to arbitrary Riemannian manifolds and derive a variational framework for likelihood estimation.

Extracting Weighted Automata for Approximate Minimization in Language Modelling

Clara Lacroce, Prakash Panangaden, Guillaume Rabusseau

The objective is to obtain a weighted finite automaton (WFA) that fits within a given size constraint and mimics the behaviour of the original black-box model, while minimizing some notion of distance between that model and the extracted WFA.

MICo: Improved representations via sampling-based state similarity for Markov decision processes

NeurIPS 2021. Pablo Samuel Castro, Tyler Kastner, Prakash Panangaden, Mark Rowland

We present a new behavioural distance over the state space of a Markov decision process, and demonstrate the use of this distance as an effective means of shaping the learnt representations of deep reinforcement learning agents.

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

We present a distributional approach to theoretical analyses of reinforcement learning algorithms for constant step-sizes.

Latent Variable Modelling with Hyperbolic Normalizing Flows

ICML 2020. Avishek Joey Bose, Ariella Smofsky, Renjie Liao, Prakash Panangaden, William L. Hamilton

One effective solution is the use of normalizing flows to construct flexible posterior distributions.

Basis refinement strategies for linear value function approximation in MDPs

Gheorghe Comanici, Doina Precup, Prakash Panangaden

We provide a theoretical framework for analyzing basis function construction for linear value function approximation in Markov Decision Processes (MDPs).

Proceedings of the 11th workshop on Quantum Physics and Logic

Bob Coecke, Ichiro Hasuo, Prakash Panangaden

The first QPL under the new name Quantum Physics and Logic was held in Reykjavik (2008), followed by Oxford (2009 and 2010), Nijmegen (2011), Brussels (2012) and Barcelona (2013).

