Search Results for author: Vivek Veeriah

Found 15 papers, 2 papers with code

Differential Recurrent Neural Networks for Action Recognition

no code implementations • ICCV 2015 • Vivek Veeriah, Naifan Zhuang, Guo-Jun Qi

This change in information gain is quantified by Derivative of States (DoS), and thus the proposed LSTM model is termed as differential Recurrent Neural Network (dRNN).

Action Recognition Temporal Action Localization +2

Paper
Add Code

Face valuing: Training user interfaces with facial expressions and reinforcement learning

no code implementations • 9 Jun 2016 • Vivek Veeriah, Patrick M. Pilarski, Richard S. Sutton

The primary objective of the current work is to demonstrate that a learning agent can reduce the amount of explicit feedback required for adapting to the user's preferences pertaining to a task by learning to perceive a value of its behavior from the human user, particularly from the user's facial expressions---we call this face valuing.

BIG-bench Machine Learning reinforcement-learning +1

Paper
Add Code

Learning Representations by Stochastic Meta-Gradient Descent in Neural Networks

no code implementations • 9 Dec 2016 • Vivek Veeriah, Shangtong Zhang, Richard S. Sutton

In this paper, we introduce a new incremental learning algorithm called crossprop, which learns incoming weights of hidden units based on the meta-gradient descent approach, that was previously introduced by Sutton (1992) and Schraudolph (1999) for learning step-sizes.

Incremental Learning

Paper
Add Code

TIDBD: Adapting Temporal-difference Step-sizes Through Stochastic Meta-descent

no code implementations • 10 Apr 2018 • Alex Kearney, Vivek Veeriah, Jaden B. Travnik, Richard S. Sutton, Patrick M. Pilarski

In this paper, we introduce a method for adapting the step-sizes of temporal difference (TD) learning.

Representation Learning

Paper
Add Code

Many-Goals Reinforcement Learning

no code implementations • 22 Jun 2018 • Vivek Veeriah, Junhyuk Oh, Satinder Singh

Second, we explore whether many-goals updating can be used to pre-train a network to subsequently learn faster and better on a single main task of interest.

Q-Learning reinforcement-learning +1

Paper
Add Code

Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning

no code implementations • 8 Mar 2019 • Alex Kearney, Vivek Veeriah, Jaden Travnik, Patrick M. Pilarski, Richard S. Sutton

In this paper, we examine an instance of meta-learning in which feature relevance is learned by adapting step size parameters of stochastic gradient descent---building on a variety of prior work in stochastic approximation, machine learning, and artificial neural networks.

Meta-Learning Representation Learning

Paper
Add Code

Discovery of Useful Questions as Auxiliary Tasks

no code implementations • NeurIPS 2019 • Vivek Veeriah, Matteo Hessel, Zhongwen Xu, Richard Lewis, Janarthanan Rajendran, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Arguably, intelligent agents ought to be able to discover their own questions so that in learning answers for them they learn unanticipated useful knowledge and skills; this departs from the focus in much of machine learning on agents learning answers to externally defined questions.

Reinforcement Learning (RL)

Paper
Add Code

How Should an Agent Practice?

no code implementations • 15 Dec 2019 • Janarthanan Rajendran, Richard Lewis, Vivek Veeriah, Honglak Lee, Satinder Singh

We present a method for learning intrinsic reward functions to drive the learning of an agent during periods of practice in which extrinsic task rewards are not available.

Paper
Add Code

A Self-Tuning Actor-Critic Algorithm

no code implementations • NeurIPS 2020 • Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh

Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain.

Atari Games reinforcement-learning +1

Paper
Add Code

Learning Retrospective Knowledge with Reverse Reinforcement Learning

1 code implementation • NeurIPS 2020 • Shangtong Zhang, Vivek Veeriah, Shimon Whiteson

We present a Reverse Reinforcement Learning (Reverse RL) approach for representing retrospective knowledge.

Anomaly Detection reinforcement-learning +2

3,093

Paper
Code

Learning State Representations from Random Deep Action-conditional Predictions

1 code implementation • NeurIPS 2021 • Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, Satinder Singh

Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i. e., deep action-conditional predictions -- random both in what feature of observations they predict as well as in the sequence of actions the predictions are conditioned upon -- form good auxiliary tasks for reinforcement learning (RL) problems.

Atari Games Reinforcement Learning (RL) +2

Paper
Code

Discovery of Options via Meta-Learned Subgoals

no code implementations • NeurIPS 2021 • Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Temporal abstractions in the form of options have been shown to help reinforcement learning (RL) agents learn faster.

Reinforcement Learning (RL)

Paper
Add Code

GrASP: Gradient-Based Affordance Selection for Planning

no code implementations • 8 Feb 2022 • Vivek Veeriah, Zeyu Zheng, Richard Lewis, Satinder Singh

Our empirical work shows that it is feasible to learn to select both primitive-action and option affordances, and that simultaneously learning to select affordances and planning with a learned value-equivalent model can outperform model-free RL.

Reinforcement Learning (RL)

Paper
Add Code

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs

no code implementations • 2 Feb 2023 • Ted Moskovitz, Brendan O'Donoghue, Vivek Veeriah, Sebastian Flennerhag, Satinder Singh, Tom Zahavy

Such applications often require to put constraints on the agent's behavior.

Continuous Control reinforcement-learning +1

Paper
Add Code

Diversifying AI: Towards Creative Chess with AlphaZero

no code implementations • 17 Aug 2023 • Tom Zahavy, Vivek Veeriah, Shaobo Hou, Kevin Waugh, Matthew Lai, Edouard Leurent, Nenad Tomasev, Lisa Schut, Demis Hassabis, Satinder Singh

In particular, we investigate whether a team of diverse AI systems can outperform a single AI in challenging tasks by generating more ideas as a group and then selecting the best ones.

Decision Making Game of Chess

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.