no code implementations • 17 Aug 2023 • Ivan Ovinnikov, Joachim M. Buhmann
Imitation learning methods infer a policy in a Markov decision process from a dataset of expert demonstrations by minimizing a divergence measure between the empirical state occupancy measures of the expert and the policy.
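The occupancy-matching idea behind this line of work can be sketched in a few lines. This is a minimal illustration, not the paper's method: it assumes finite, hashable states, estimates each occupancy measure by simple visit counting over trajectories, and uses total variation as a stand-in for whatever divergence a given algorithm minimizes.

```python
from collections import Counter

def occupancy_measure(trajectories):
    """Empirical state occupancy: fraction of visits to each state,
    pooled over all trajectories (undiscounted, for simplicity)."""
    counts = Counter(s for traj in trajectories for s in traj)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two occupancy measures,
    a simple stand-in for the divergence being minimized."""
    states = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in states)

# Toy trajectories over integer state labels (illustrative data only).
expert_trajs = [[0, 1, 2, 2], [0, 2, 2, 2]]
policy_trajs = [[0, 1, 1, 2], [0, 0, 1, 2]]
gap = total_variation(occupancy_measure(expert_trajs),
                      occupancy_measure(policy_trajs))
```

An imitation learner would update the policy to drive `gap` toward zero; practical methods do this with differentiable divergence estimates rather than explicit counts.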
no code implementations • 29 Sep 2021 • Ivan Ovinnikov, Eugene Bykovets, Joachim M. Buhmann
Inverse reinforcement learning methods aim to retrieve the reward function of a Markov decision process based on a dataset of expert demonstrations.
no code implementations • 18 Nov 2020 • Luis Haug, Ivan Ovinnikov, Eugene Bykovets
Given an optimality profile and a small amount of additional supervision, our algorithm fits a reward function, modeled as a neural network, by essentially minimizing the Wasserstein distance between the corresponding induced distribution and the optimality profile.
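The Wasserstein objective in this abstract can be illustrated with the simplest case. The sketch below is a hedged stand-in, not the paper's algorithm: it assumes the two distributions being compared are one-dimensional empirical samples of equal size, in which case the 1-Wasserstein distance reduces to the mean absolute difference of sorted samples.

```python
def wasserstein_1d(xs, ys):
    """W1 distance between two equal-size 1-D empirical distributions.
    In 1-D the optimal transport plan matches sorted samples, so W1 is
    the average absolute gap between order statistics."""
    assert len(xs) == len(ys), "this closed form assumes equal sample sizes"
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)
```

In the paper's setting, a loss of this kind would be minimized over the parameters of the neural-network reward, with the induced distribution and the optimality profile playing the roles of the two samples; general-purpose solvers (e.g. entropic-regularized OT) replace the 1-D closed form in higher dimensions.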
no code implementations • ICLR 2020 • Ivan Ovinnikov
This work presents a reformulation of the recently proposed Wasserstein autoencoder framework on a non-Euclidean manifold, the Poincaré ball model of the hyperbolic space.