Search Results for author: Richard Lewis

Found 9 papers, 1 paper with code

Learning State Representations from Random Deep Action-conditional Predictions

no code implementations • 9 Feb 2021 • Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, Satinder Singh

In this work, we study auxiliary prediction tasks defined by temporal-difference networks (TD networks); these networks are a language for expressing a rich space of general value function (GVF) prediction targets that may be learned efficiently with TD.

Atari Games, Value prediction

Pairwise Weights for Temporal Credit Assignment

no code implementations • 9 Feb 2021 • Zeyu Zheng, Risto Vuorio, Richard Lewis, Satinder Singh

In this empirical paper, we explore heuristics based on more general pairwise weightings that are functions of the state in which the action was taken, the state at the time of the reward, as well as the time interval between the two.

How Should an Agent Practice?

no code implementations • 15 Dec 2019 • Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh

We present a method for learning intrinsic reward functions to drive the learning of an agent during periods of practice in which extrinsic task rewards are not available.

Discovery of Useful Questions as Auxiliary Tasks

no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

A Modeling Study of the Effects of Surprisal and Entropy in Perceptual Decision Making of an Adaptive Agent

no code implementations • WS 2019 • Pyeong Whan Cho, Richard Lewis

Temporal dynamics in the task environment were determined by a simple finite-state grammar, which was designed to create situations in which the surprisal and entropy reduction hypotheses predict different patterns.

Decision Making

In silico generation of novel, drug-like chemical matter using the LSTM neural network

no code implementations • 20 Dec 2017 • Peter Ertl, Richard Lewis, Eric Martin, Valery Polyakov

In this article we present a method to generate molecules using a long short-term memory (LSTM) neural network and provide an analysis of the results, including a virtual screening test.

Drug Discovery

Deep Learning for Reward Design to Improve Monte Carlo Tree Search in ATARI Games

no code implementations • 24 Apr 2016 • Xiaoxiao Guo, Satinder Singh, Richard Lewis, Honglak Lee

We present an adaptation of PGRD (policy-gradient for reward-design) for learning a reward-bonus function to improve UCT (an MCTS algorithm).

Atari Games, Decision Making

Action-Conditional Video Prediction using Deep Networks in Atari Games

1 code implementation • NeurIPS 2015 • Junhyuk Oh, Xiaoxiao Guo, Honglak Lee, Richard Lewis, Satinder Singh

Motivated by vision-based reinforcement learning (RL) problems, in particular Atari games from the recent benchmark Arcade Learning Environment (ALE), we consider spatio-temporal prediction problems where future (image-)frames are dependent on control variables or actions as well as previous frames.

Atari Games, Video Prediction
