1 code implementation • 24 Feb 2024 • Aleksa Sukovic, Goran Radanovic
In this work, we propose the use of a debate-based reward model for reinforcement learning agents, where the outcome of a zero-sum debate game quantifies the justifiability of a decision in a particular state.
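A minimal sketch of the mechanism in Python. This is an illustration of the idea only, not the authors' implementation; `propose_argument` and `judge` are hypothetical callables standing in for the debater and judge models.

```python
def debate_reward(state, action, propose_argument, judge, rounds=3):
    """Zero-sum debate as a reward signal: two debaters argue for and against
    taking `action` in `state`; the judge's verdict on the transcript is the reward."""
    transcript = []
    for _ in range(rounds):
        transcript.append(propose_argument("pro", state, action, transcript))
        transcript.append(propose_argument("con", state, action, transcript))
    support = judge(state, action, transcript)  # assumed to return a value in [0, 1]
    return 2.0 * support - 1.0  # +1: decision fully justified, -1: not justifiable
```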
2 code implementations • 15 Feb 2024 • Ben Rank, Stelios Triantafyllou, Debmalya Mandal, Goran Radanovic
This makes MDRR particularly suitable for scenarios, common in practice, where the environment's response depends strongly on its previous dynamics.
1 code implementation • 17 Oct 2023 • Stelios Triantafyllou, Aleksa Sukovic, Debmalya Mandal, Goran Radanovic
These challenges are particularly prominent in the context of multi-agent sequential decision-making, where the causal effect of an agent's action on the outcome depends on how other agents respond to that action.
no code implementations • 19 Jul 2023 • Jiarui Gan, Annika Hennes, Rupak Majumdar, Debmalya Mandal, Goran Radanovic
We take a game-theoretic perspective -- whereby each time step is treated as an independent decision maker with its own (fixed) discount factor -- and we study the subgame perfect equilibrium (SPE) of the resulting game as well as the related algorithmic problems.
no code implementations • 6 Jun 2023 • Jiarui Gan, Rupak Majumdar, Debmalya Mandal, Goran Radanovic
Both players are far-sighted, aiming to maximize their total payoffs over the time horizon.
2 code implementations • 5 Jun 2023 • Mridul Mahajan, Georgios Tzannetos, Goran Radanovic, Adish Singla
We present an information-theoretic framework to learn fixed-dimensional embeddings for tasks in reinforcement learning.
1 code implementation • 27 Feb 2023 • Mohammad Mohammadi, Jonathan Nöther, Debmalya Mandal, Adish Singla, Goran Radanovic
In this paper, we study targeted poisoning attacks in a two-agent setting where an attacker implicitly poisons the effective environment of one of the agents by modifying the policy of its peer.
1 code implementation • 24 Feb 2023 • Stelios Triantafyllou, Goran Radanovic
Responsibility attribution is a key concept of accountable multi-agent decision making.
no code implementations • 7 Feb 2023 • Debmalya Mandal, Goran Radanovic, Jiarui Gan, Adish Singla, Rupak Majumdar
We show that minimizing regret with this new general discounting is equivalent to minimizing regret with uncertain episode lengths.
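The equivalence rests on a standard identity (our notation, not a quotation from the paper): for a random episode length $H$,

$$\mathbb{E}_{H}\!\left[\sum_{t=1}^{H} r_t\right] \;=\; \sum_{t=1}^{\infty} \Pr(H \ge t)\, r_t ,$$

so an uncertain horizon induces the general discount sequence $d_t = \Pr(H \ge t)$; the usual geometric discount $d_t = \gamma^{t-1}$ is the special case where $H$ is geometrically distributed with parameter $1-\gamma$.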
no code implementations • 30 Jun 2022 • Debmalya Mandal, Stelios Triantafyllou, Goran Radanovic
We introduce the framework of performative reinforcement learning where the policy chosen by the learner affects the underlying reward and transition dynamics of the environment.
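A minimal sketch of the repeated-retraining dynamic this framework analyzes, assuming hypothetical helpers `environment_response` (how rewards and transitions shift under a deployed policy) and `solve_mdp` (a planner for a fixed MDP); neither name comes from the paper.

```python
def repeated_retraining(policy, environment_response, solve_mdp, num_rounds=50):
    """Deploy a policy, let the environment's reward and transition dynamics
    react to it, then re-solve the induced MDP; repeat for a fixed number of rounds."""
    for _ in range(num_rounds):
        rewards, transitions = environment_response(policy)  # environment reacts to the deployment
        policy = solve_mdp(rewards, transitions)              # learner best-responds to the shifted MDP
    return policy  # a fixed point of this update would be a performatively stable policy
```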
no code implementations • 1 Apr 2022 • Stelios Triantafyllou, Adish Singla, Goran Radanovic
Responsibility attribution complements the causal analysis of an outcome and aims to identify the extent to which decision makers (agents) are responsible for it.
no code implementations • 6 Jan 2022 • Kiarash Banihashem, Adish Singla, Jiarui Gan, Goran Radanovic
This problem can be viewed as a dual to the problem of optimal reward poisoning attacks: instead of forcing an agent to adopt a specific policy, the reward designer incentivizes an agent to avoid taking actions that are inadmissible in certain states.
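In our own notation (a sketch of the problem shape, not the paper's exact program), the designer perturbs a default reward $\bar R$ as little as possible while making every inadmissible action strictly suboptimal by a margin $\epsilon$:

$$\min_{\widehat{R}} \;\big\|\widehat{R} - \bar R\big\| \quad \text{s.t.} \quad Q^{*}_{\widehat{R}}(s,a) \;\le\; V^{*}_{\widehat{R}}(s) - \epsilon \quad \text{for every inadmissible pair } (s,a).$$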
1 code implementation • NeurIPS 2021 • Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla
By explicability we seek to capture two properties: (a) informativeness, so that the rewards speed up the agent's convergence, and (b) sparseness, as a proxy for ease of interpretability of the rewards.
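One way to read this design problem, in our notation rather than the paper's exact criteria: maximize an informativeness score $\mathcal{I}(\widehat{R})$ while preserving the optimal behavior induced by the original reward $R$ and keeping the designed reward sparse,

$$\max_{\widehat{R}} \;\mathcal{I}(\widehat{R}) \quad \text{s.t.} \quad \Pi^{*}_{\widehat{R}} \subseteq \Pi^{*}_{R}, \qquad \big\|\widehat{R}\big\|_{0} \le B,$$

where $B$ is a sparsity budget; both $\mathcal{I}$ and $B$ are placeholders for the paper's concrete choices.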
no code implementations • NeurIPS 2021 • Stelios Triantafyllou, Adish Singla, Goran Radanovic
We formalize desirable properties of blame attribution in the setting of interest, and we analyze the relationship between these properties and the studied blame attribution methods.
no code implementations • 15 Jul 2021 • Adish Singla, Anna N. Rafferty, Goran Radanovic, Neil T. Heffernan
This survey article has grown out of the RL4ED workshop organized by the authors at the Educational Data Mining (EDM) 2021 conference.
no code implementations • 10 Feb 2021 • Kiarash Banihashem, Adish Singla, Goran Radanovic
As a threat model, we consider attacks that minimally alter rewards to make the attacker's target policy uniquely optimal under the poisoned rewards, with the optimality gap specified by an attack parameter.
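In our notation (a sketch consistent with the description, not the paper's exact formulation), the attacker solves the mirror image of the design problem sketched above: make the target policy $\pi^{\dagger}$ uniquely optimal with gap $\epsilon$ while changing the original reward $\bar R$ as little as possible,

$$\min_{\widehat{R}} \;\big\|\widehat{R} - \bar R\big\| \quad \text{s.t.} \quad Q^{*}_{\widehat{R}}\big(s, \pi^{\dagger}(s)\big) \;\ge\; Q^{*}_{\widehat{R}}(s,a) + \epsilon \quad \text{for all } s \text{ and all } a \ne \pi^{\dagger}(s).$$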
no code implementations • 21 Nov 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We provide lower/upper bounds on the attack cost, and instantiate our attacks in two settings: (i) an offline setting where the agent is doing planning in the poisoned environment, and (ii) an online setting where the agent is learning a policy with poisoned feedback.
1 code implementation • ICML 2020 • Amin Rakhsha, Goran Radanovic, Rati Devidze, Xiaojin Zhu, Adish Singla
We study a security threat to reinforcement learning where an attacker poisons the learning environment to force the agent into executing a target policy chosen by the attacker.
no code implementations • 23 Jan 2019 • Goran Radanovic, Rati Devidze, David C. Parkes, Adish Singla
We consider a two-agent MDP framework where agents repeatedly solve a task in a collaborative setting.
no code implementations • 8 Nov 2018 • Nripsuta Saxena, Karen Huang, Evan DeFilippis, Goran Radanovic, David Parkes, Yang Liu
What is the best way to define algorithmic fairness?
no code implementations • NeurIPS 2017 • Christos Dimitrakakis, David C. Parkes, Goran Radanovic, Paul Tylkin
We consider a two-player sequential game in which agents have the same reward function but may disagree on the transition probabilities of an underlying Markovian model of the world.
no code implementations • 6 Jul 2017 • Yang Liu, Goran Radanovic, Christos Dimitrakakis, Debmalya Mandal, David C. Parkes
In addition, we define the fairness regret, which corresponds to the degree to which an algorithm is not calibrated, where perfect calibration requires that the probability of selecting an arm is equal to the probability with which the arm has the best quality realization.
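In our notation (the paper's exact aggregation may differ): writing $\pi_t(i)$ for the probability that the algorithm selects arm $i$ at round $t$ and $p_t(i)$ for the probability that arm $i$ has the best quality realization at that round, perfect calibration requires

$$\pi_t(i) \;=\; p_t(i) \quad \text{for every arm } i \text{ and round } t,$$

and the fairness regret accumulates the deviation from this condition over rounds, e.g. $\sum_{t=1}^{T}\sum_{i}\big|\pi_t(i) - p_t(i)\big|$.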
no code implementations • 31 May 2017 • Christos Dimitrakakis, Yang Liu, David Parkes, Goran Radanovic
We consider the problem of how decision making can be fair when the underlying probabilistic model of the world is not known with certainty.