Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.
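To illustrate the idea, the following is a minimal sketch of a single HMC transition, not any particular paper's implementation. It assumes an identity mass matrix, a leapfrog integrator, and a standard 2-D Gaussian target. The names `hmc_step`, `log_prob`, `grad_log_prob`, `step_size`, and `n_leapfrog` are hypothetical and chosen only for this example.

```python
import numpy as np

def hmc_step(x, log_prob, grad_log_prob, step_size, n_leapfrog, rng):
    """One HMC transition (assumed: identity mass matrix, leapfrog integrator)."""
    p0 = rng.standard_normal(x.shape)      # resample auxiliary momentum
    x_new, p = x.copy(), p0.copy()

    # Leapfrog: half momentum step, alternating full steps, final half momentum step.
    p = p + 0.5 * step_size * grad_log_prob(x_new)
    for i in range(n_leapfrog):
        x_new = x_new + step_size * p
        if i < n_leapfrog - 1:
            p = p + step_size * grad_log_prob(x_new)
    p = p + 0.5 * step_size * grad_log_prob(x_new)

    # Hamiltonian = potential energy (-log density) + kinetic energy.
    h_old = -log_prob(x) + 0.5 * (p0 @ p0)
    h_new = -log_prob(x_new) + 0.5 * (p @ p)

    # Metropolis-Hastings correction for the integrator's discretization error.
    if np.log(rng.uniform()) < h_old - h_new:
        return x_new, True
    return x, False

# Toy target (an assumption for this sketch): standard 2-D Gaussian.
log_prob = lambda x: -0.5 * (x @ x)
grad_log_prob = lambda x: -x

rng = np.random.default_rng(0)
x = np.zeros(2)
samples, n_accept = [], 0
for _ in range(2000):
    x, accepted = hmc_step(x, log_prob, grad_log_prob, 0.2, 10, rng)
    samples.append(x)
    n_accept += accepted
samples = np.array(samples)
accept_rate = n_accept / len(samples)
```

Each proposal here follows a simulated trajectory of length `step_size * n_leapfrog`, so it can land far from the current state while the energy error, and hence the rejection probability, stays small.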
We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation.
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
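As a rough illustration of online filtering over a latent task variable, the sketch below applies a recursive Bayes update to a small discrete set of candidate tasks. Everything in it is an assumption made for the example rather than the method described above: the latent task is static and discrete, each task is characterized by a known mean reward (`task_means`), and observations carry Gaussian noise (`noise_std`).

```python
import numpy as np

def filter_update(belief, observation, task_means, noise_std):
    """One recursive Bayes update of the belief over candidate tasks."""
    # Gaussian log-likelihood of the observation under each candidate task
    # (up to an additive constant shared by all tasks).
    log_lik = -0.5 * ((observation - task_means) / noise_std) ** 2
    log_post = np.log(belief) + log_lik
    log_post -= log_post.max()             # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Hypothetical setup: three candidate tasks, the third one is the true task.
rng = np.random.default_rng(0)
task_means = np.array([-1.0, 0.0, 1.0])
noise_std = 0.5
true_task = 2

belief = np.full(3, 1.0 / 3.0)             # uniform prior over tasks
for _ in range(20):                        # a small amount of experience
    obs = task_means[true_task] + noise_std * rng.standard_normal()
    belief = filter_update(belief, obs, task_means, noise_std)
```

The belief is updated after every observation, so the estimate of which task is active sharpens as experience accumulates.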
Text-based adventure games provide a platform on which to explore reinforcement learning in the context of a combinatorial action space, such as that of natural language.
We recast exploration as a problem of State Marginal Matching (SMM): we aim to learn a policy whose state marginal distribution matches a given target state distribution, and this target can incorporate prior knowledge about the task.
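One way to make "matches" concrete is to measure the mismatch between the two distributions with a KL divergence. The sketch below does this for a small discrete state space. The choice of KL, the discrete states, and the helper names `state_marginal` and `kl_divergence` are assumptions for illustration only.

```python
import numpy as np

def state_marginal(visited_states, n_states):
    """Empirical state marginal: fraction of visits to each discrete state."""
    counts = np.bincount(visited_states, minlength=n_states)
    return counts / counts.sum()

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0                           # terms with p = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical rollout: states visited by some policy in a 4-state environment.
visited = np.array([0, 0, 1, 2, 2, 2, 3, 3])
target = np.full(4, 0.25)                  # uniform target state distribution

rho = state_marginal(visited, 4)           # [0.25, 0.125, 0.375, 0.25]
mismatch = kl_divergence(rho, target)      # zero only if rho equals target
```

A non-uniform `target` is where prior knowledge about the task would enter, for example by placing more mass on states believed to be relevant.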