Search Results for author: Goran Radanović

Found 3 papers, 0 papers with code

Corruption-Robust Offline Two-Player Zero-Sum Markov Games

no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Adish Singla, Goran Radanović

We note that we are the first to provide such a characterization of the problem of learning approximate Nash Equilibrium policies in offline two-player zero-sum Markov games under data corruption.

Reward Model Learning vs. Direct Policy Optimization: A Comparative Analysis of Learning from Human Preferences

no code implementations • 4 Mar 2024 • Andi Nika, Debmalya Mandal, Parameswaran Kamalaruban, Georgios Tzannetos, Goran Radanović, Adish Singla

Moreover, we extend our analysis to the approximate optimization setting and derive exponentially decaying convergence rates for both RLHF and DPO.

Corruption Robust Offline Reinforcement Learning with Human Feedback

no code implementations • 9 Feb 2024 • Debmalya Mandal, Andi Nika, Parameswaran Kamalaruban, Adish Singla, Goran Radanović

We aim to design algorithms that identify a near-optimal policy from the corrupted data, with provable guarantees.

Adversarial Attack reinforcement-learning
