1 code implementation • 10 Feb 2024 • Rati Devidze, Parameswaran Kamalaruban, Adish Singla
Reward functions are central to specifying the task we want a reinforcement learning agent to perform.
1 code implementation • 21 Feb 2023 • Weichen Li, Rati Devidze, Sophie Fellenz
To deal with sparse extrinsic rewards from the environment, we combine our approach with a potential-based reward shaping technique that provides more informative (dense) reward signals to the RL agent.
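Potential-based reward shaping (Ng et al., 1999), the technique referenced here, adds a term F(s, s') = γ·Φ(s') − Φ(s) to the extrinsic reward; this densifies the learning signal without changing the optimal policy. The following is a minimal sketch of that idea, not the paper's actual implementation — the potential function and environment are illustrative assumptions.

```python
def shaped_reward(reward, potential, s, s_next, gamma=0.99, done=False):
    """Augment a sparse extrinsic reward with a potential-based shaping term.

    F(s, s') = gamma * potential(s') - potential(s) preserves the optimal
    policy while giving the agent denser intermediate feedback.
    """
    phi_next = 0.0 if done else potential(s_next)  # terminal potential is 0
    return reward + gamma * phi_next - potential(s)

# Illustrative potential: negative distance to a goal state on a 1-D chain.
goal = 10
potential = lambda s: -abs(goal - s)

# A step toward the goal earns a positive shaping bonus even though the
# extrinsic reward is zero (sparse); a step away is penalized symmetrically.
bonus = shaped_reward(0.0, potential, s=4, s_next=5, gamma=1.0)   # +1.0
penalty = shaped_reward(0.0, potential, s=5, s_next=4, gamma=1.0)  # -1.0
```

Because the shaping term is a telescoping difference of potentials, its total contribution along any trajectory depends only on the start and end states, which is what guarantees policy invariance.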
1 code implementation • NeurIPS 2021 • Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla
By explicable rewards, we seek to capture two properties: (a) informativeness, so that the rewards speed up the agent's convergence, and (b) sparseness, as a proxy for the ease of interpreting the rewards.
1 code implementation • NeurIPS 2021 • Gaurav Yengera, Rati Devidze, Parameswaran Kamalaruban, Adish Singla
In particular, we study how to design a personalized curriculum over demonstrations to speed up the learner's convergence.
no code implementations • 21 Nov 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We provide lower/upper bounds on the attack cost, and instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an online setting where the agent is learning a policy with poisoned feedback.
no code implementations • 23 Jun 2020 • Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
However, the applicability of potential-based reward shaping is limited in settings where (i) the state space is very large, and it is challenging to compute an appropriate potential function, (ii) the feedback signals are noisy, and even with shaped rewards the agent could be trapped in local optima, and (iii) changing the rewards alone is not sufficient, and effective shaping requires changing the dynamics.
1 code implementation • ICML 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
no code implementations • 21 Mar 2020 • Rati Devidze, Farnam Mansouri, Luis Haug, Yuxin Chen, Adish Singla
Machine teaching studies the interaction between a teacher and a student/learner in which the teacher selects training examples for the learner to learn a specific task.
no code implementations • NeurIPS 2019 • Sebastian Tschiatschek, Ahana Ghosh, Luis Haug, Rati Devidze, Adish Singla
We study two teaching approaches: learner-agnostic teaching, where the teacher provides demonstrations from an optimal policy ignoring the learner's preferences, and learner-aware teaching, where the teacher accounts for the learner's preferences.
no code implementations • 28 May 2019 • Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla
We study the problem of inverse reinforcement learning (IRL) with the added twist that the learner is assisted by a helpful teacher.
no code implementations • 23 Jan 2019 • Goran Radanovic, Rati Devidze, David C. Parkes, Adish Singla
We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting.