Search Results for author: Ishan Durugkar

Found 10 papers, 3 papers with code

Wasserstein Distance Maximizing Intrinsic Control

no code implementations • 28 Oct 2021 • Ishan Durugkar, Steven Hansen, Stephen Spencer, Volodymyr Mnih

This paper deals with the problem of learning a skill-conditioned policy that acts meaningfully in the absence of a reward signal.

Adversarial Intrinsic Motivation for Reinforcement Learning

1 code implementation • NeurIPS 2021 • Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

In this paper, we investigate whether one such objective, the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution, can be utilized effectively for reinforcement learning (RL) tasks.

Multi-Goal Reinforcement Learning • Reinforcement Learning
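
For readers unfamiliar with the objective named in this abstract, the Wasserstein-1 distance between the policy's visitation distribution and the target distribution is usually handled through its Kantorovich-Rubinstein dual, a standard identity (the notation here is generic, not the paper's own):

```latex
% Kantorovich-Rubinstein dual of the Wasserstein-1 distance between the
% policy's state visitation distribution \rho_\pi and the target \rho_g;
% the supremum ranges over 1-Lipschitz functions f.
W_1(\rho_\pi, \rho_g)
  = \sup_{\lVert f \rVert_{L} \le 1}
    \; \mathbb{E}_{s \sim \rho_\pi}[f(s)] - \mathbb{E}_{s \sim \rho_g}[f(s)]
```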

Reducing Sampling Error in Batch Temporal Difference Learning

no code implementations • ICML 2020 • Brahma Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone

In this batch setting, we show that TD(0) may converge to an inaccurate value function because the update following an action is weighted according to the number of times that action occurred in the batch, rather than the true probability of the action under the given policy.
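
A toy sketch may make the effect concrete: with tabular TD(0) on a fixed batch, each update is implicitly weighted by the empirical frequency of its (state, action) pair, and reweighting by pi(a|s) / pi_hat(a|s) corrects toward the true policy probabilities. This is illustrative code with made-up names, not the authors' implementation:

```python
from collections import defaultdict

def batch_td0(batch, pi, gamma=0.99, alpha=0.1, sweeps=100, correct=False):
    """batch: list of (s, a, r, s_next); pi[s][a]: true policy probabilities."""
    # Empirical action frequencies pi_hat(a|s) observed in the batch.
    counts = defaultdict(lambda: defaultdict(int))
    for s, a, _, _ in batch:
        counts[s][a] += 1
    V = defaultdict(float)
    for _ in range(sweeps):
        for s, a, r, s_next in batch:
            pi_hat = counts[s][a] / sum(counts[s].values())
            # Without correction, frequent actions dominate the updates;
            # the weight below restores the true policy's weighting.
            w = pi[s][a] / pi_hat if correct else 1.0
            V[s] += alpha * w * (r + gamma * V[s_next] - V[s])
    return V
```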

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

no code implementations • NeurIPS 2020 • Siddharth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone

We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, particularly in the case where it is critical to reduce the amount of interaction with the target environment during learning.

Transfer Learning

HR-TD: A Regularized TD Method to Avoid Over-Generalization

no code implementations • ICLR 2019 • Ishan Durugkar, Bo Liu, Peter Stone

Temporal Difference learning with function approximation has been widely used recently and has led to several successful results.

Multi-Preference Actor Critic

no code implementations • 5 Apr 2019 • Ishan Durugkar, Matthew Hausknecht, Adith Swaminathan, Patrick MacAlpine

Policy gradient algorithms typically combine discounted future rewards with an estimated value function to compute the direction and magnitude of parameter updates.

Reinforcement Learning
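
The combination described in this abstract has a standard textbook form, with the learned value function serving as a baseline for the discounted return (generic actor-critic notation, not the paper's):

```latex
% Generic actor-critic gradient: the discounted return G_t is combined with
% a learned baseline V_\phi(s_t); the sign of (G_t - V_\phi) sets the update
% direction and its size sets the magnitude.
\nabla_\theta J(\theta)
  = \mathbb{E}\bigl[ \nabla_\theta \log \pi_\theta(a_t \mid s_t)
      \,(G_t - V_\phi(s_t)) \bigr],
  \qquad G_t = \sum_{k \ge 0} \gamma^{k} r_{t+k}
```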

TD Learning with Constrained Gradients

no code implementations • ICLR 2018 • Ishan Durugkar, Peter Stone

In this work we propose a constraint on the TD update that minimizes change to the target values.

Q-Learning
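
One way to read the constraint described in this abstract: project the usual TD(0) update direction so that, to first order, it does not move the bootstrap target. Below is a minimal sketch for a linear value function; it is our reading of the idea, not the authors' exact algorithm:

```python
import numpy as np

def constrained_td_step(w, phi_s, phi_s_next, r, gamma=0.99, alpha=0.1):
    """One projected TD(0) step for a linear value function V(s) = w @ phi(s)."""
    delta = r + gamma * w @ phi_s_next - w @ phi_s   # TD error
    g = delta * phi_s                                # usual semi-gradient direction
    h = gamma * phi_s_next                           # gradient of the target wrt w
    if h @ h > 0:
        # Remove the component of the update that would change the target value.
        g = g - (g @ h) / (h @ h) * h
    return w + alpha * g
```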

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

6 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KB), both automatically and manually constructed, are often incomplete: many valid facts can be inferred from the KB by synthesizing existing information.
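
As a concrete toy example of the fact synthesis this abstract mentions, composing two stored edges can answer a query that no single edge answers. The paper trains an RL agent to discover such paths; this sketch (made-up entities and relations) just follows a fixed two-hop pattern:

```python
# Toy KB stored as (entity, relation) -> entity edges.
kb = {
    ("Kafka", "born_in"): "Prague",
    ("Prague", "located_in"): "Czechia",
}

def two_hop(entity, rel1, rel2):
    """Follow rel1 then rel2 from entity, composing two existing facts."""
    mid = kb.get((entity, rel1))
    return kb.get((mid, rel2)) if mid else None

# Infer a fact not stored explicitly: Kafka -> born_in -> located_in -> Czechia
print(two_hop("Kafka", "born_in", "located_in"))  # Czechia
```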

Generative Multi-Adversarial Networks

1 code implementation • 5 Nov 2016 • Ishan Durugkar, Ian Gemp, Sridhar Mahadevan

Generative adversarial networks (GANs) are a framework for producing a generative model by way of a two-player minimax game.

Ranked #60 on Image Generation on CIFAR-10 (Inception score metric)

Image Generation
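
The two-player game this abstract refers to is the canonical GAN objective below (the original Goodfellow et al. formulation; the paper's extension replaces the single discriminator with an ensemble):

```latex
% Canonical two-player minimax objective (Goodfellow et al., 2014); the
% paper's extension trains the generator G against several discriminators
% rather than the single D below.
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}[\log D(x)]
  + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]
```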

Inverting Variational Autoencoders for Improved Generative Accuracy

no code implementations • 21 Aug 2016 • Ian Gemp, Ishan Durugkar, Mario Parente, M. Darby Dyar, Sridhar Mahadevan

Recent advances in semi-supervised learning with deep generative models have shown promise in generalizing from small labeled datasets ($\mathbf{x},\mathbf{y}$) to large unlabeled ones ($\mathbf{x}$).
