1 code implementation • 24 Feb 2024 • Aleksa Sukovic, Goran Radanovic
In this work, we propose the use of a debate-based reward model for reinforcement learning agents, where the outcome of a zero-sum debate game quantifies the justifiability of a decision in a particular state.
no code implementations • 17 Oct 2023 • Stelios Triantafyllou, Aleksa Sukovic, Debmalya Mandal, Goran Radanovic
These challenges are particularly prominent in the context of multi-agent sequential decision-making, where the causal effect of an agent's action on the outcome depends on how other agents respond to that action.