Search Results for author: Tanmay Gangwani

Found 11 papers, 7 papers with code

Multi-Objective Optimization via Wasserstein-Fisher-Rao Gradient Flow

no code implementations 22 Nov 2023 Yinuo Ren, Tesi Xiao, Tanmay Gangwani, Anshuka Rangi, Holakou Rahmanian, Lexing Ying, Subhajit Sanyal

Multi-objective optimization (MOO) aims to optimize multiple, possibly conflicting objectives with widespread applications.

Selective Uncertainty Propagation in Offline RL

no code implementations 1 Feb 2023 Sanath Kumar Krishnamurthy, Shrey Modi, Tanmay Gangwani, Sumeet Katariya, Branislav Kveton, Anshuka Rangi

We consider the finite-horizon offline reinforcement learning (RL) setting, and are motivated by the challenge of learning the policy at any step h in dynamic programming (DP) algorithms.

Offline RL reinforcement-learning +1

Imitation Learning from Observations under Transition Model Disparity

1 code implementation ICLR 2022 Tanmay Gangwani, Yuan Zhou, Jian Peng

In this work, we propose an algorithm that trains an intermediary policy in the learner environment and uses it as a surrogate expert for the learner.

Imitation Learning

Hindsight Foresight Relabeling for Meta-Reinforcement Learning

1 code implementation ICLR 2022 Michael Wan, Jian Peng, Tanmay Gangwani

Meta-reinforcement learning (meta-RL) algorithms allow for agents to learn new behaviors from small amounts of experience, mitigating the sample inefficiency problem in RL.

Meta Reinforcement Learning reinforcement-learning +1

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

1 code implementation 5 Nov 2020 Tanmay Gangwani, Jian Peng, Yuan Zhou

Quality-Diversity (QD) is a concept from Neuroevolution with some intriguing applications to Reinforcement Learning.

Off-policy evaluation

Learning Guidance Rewards with Trajectory-space Smoothing

2 code implementations NeurIPS 2020 Tanmay Gangwani, Yuan Zhou, Jian Peng

To make credit assignment easier, recent works have proposed algorithms to learn dense "guidance" rewards that could be used in place of the sparse or delayed environmental rewards.

Attribute Q-Learning +1

Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch

1 code implementation 12 Jun 2020 Michael Wan, Tanmay Gangwani, Jian Peng

In this paper, we propose a new framework for transfer learning where the teacher and the student can have arbitrarily different state- and action-spaces.

Decision Making Reinforcement Learning (RL) +1

State-only Imitation with Transition Dynamics Mismatch

1 code implementation ICLR 2020 Tanmay Gangwani, Jian Peng

Imitation Learning (IL) is a popular paradigm for training agents to achieve complicated goals by leveraging expert behavior, rather than dealing with the hardships of designing a correct reward function.

Imitation Learning OpenAI Gym

Learning Belief Representations for Imitation Learning in POMDPs

1 code implementation 22 Jun 2019 Tanmay Gangwani, Joel Lehman, Qiang Liu, Jian Peng

We consider the problem of imitation learning from expert demonstrations in partially observable Markov decision processes (POMDPs).

Continuous Control Imitation Learning +1

Learning Self-Imitating Diverse Policies

no code implementations ICLR 2019 Tanmay Gangwani, Qiang Liu, Jian Peng

Improving the efficiency of RL algorithms in real-world problems with sparse or episodic rewards is a pressing need.

Continuous Control Imitation Learning +2

Policy Optimization by Genetic Distillation

no code implementations ICLR 2018 Tanmay Gangwani, Jian Peng

GPO uses imitation learning for policy crossover in the state space and applies policy gradient methods for mutation.

Imitation Learning Policy Gradient Methods +2
