Search Results for author: Roberto-Rafael Maura-Rivero

Found 3 papers, 1 paper with code

Jackpot! Alignment as a Maximal Lottery

no code implementations • 31 Jan 2025 • Roberto-Rafael Maura-Rivero, Marc Lanctot, Francesco Visin, Kate Larson

Reinforcement Learning from Human Feedback (RLHF), the standard for aligning Large Language Models (LLMs) with human values, is known to fail to satisfy properties that are intuitively desirable, such as respecting the preferences of the majority (Ge et al., 2024).
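The title's "maximal lottery" refers to a probabilistic voting rule: a distribution over candidates that no single candidate is majority-preferred to in expectation. A minimal sketch, using an illustrative three-candidate Condorcet cycle (not data from the paper), checks this condition for the uniform lottery:

```python
# Sketch: a maximal lottery is a distribution p over candidates such that
# p^T M >= 0 componentwise, where M is the skew-symmetric pairwise margin
# matrix (M[i][j] = voters preferring i to j minus voters preferring j to i).
# Illustrative example: a 3-candidate cycle A > B, B > C, C > A, equal margins.

def lottery_margins(p, M):
    """Expected margin of the lottery p against each pure candidate."""
    n = len(M)
    return [sum(p[i] * M[i][j] for i in range(n)) for j in range(n)]

M = [
    [ 0,  1, -1],   # A beats B, loses to C
    [-1,  0,  1],   # B beats C, loses to A
    [ 1, -1,  0],   # C beats A, loses to B
]

p = [1 / 3, 1 / 3, 1 / 3]  # uniform lottery

margins = lottery_margins(p, M)
# No entry is negative, so no candidate beats the uniform lottery
# in expectation: it is a maximal lottery for this cycle.
```

In a Condorcet cycle no deterministic winner exists, which is why the rule returns a lottery; when a Condorcet winner does exist, the maximal lottery puts all its probability on that candidate.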

Utility-inspired Reward Transformations Improve Reinforcement Learning Training of Language Models

no code implementations • 8 Jan 2025 • Roberto-Rafael Maura-Rivero, Chirag Nagpal, Roma Patel, Francesco Visin

Current methods that train large language models (LLMs) with reinforcement learning feedback often resort to averaging the outputs of multiple reward functions during training.

Soft Condorcet Optimization for Ranking of General Agents

1 code implementation • 31 Oct 2024 • Marc Lanctot, Kate Larson, Michael Kaisers, Quentin Berthet, Ian Gemp, Manfred Diaz, Roberto-Rafael Maura-Rivero, Yoram Bachrach, Anna Koop, Doina Precup

This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to the criteria of Condorcet's original voting system.
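Under Condorcet's noise model, the maximum-likelihood ranking is the one that minimizes total pairwise disagreements with the votes (the Kemeny consensus, per Young's classical result). A minimal brute-force sketch with illustrative votes, not the paper's evaluation data or its optimization method:

```python
from itertools import permutations

# Sketch: find the ranking minimizing pairwise disagreements with the
# votes, i.e. the MLE under Condorcet's noise model (Kemeny consensus).
# Votes are illustrative; the paper optimizes a smoothed objective rather
# than brute-forcing permutations.

def disagreements(ranking, votes):
    """Count candidate pairs ordered differently by the ranking and a vote."""
    pos = {c: i for i, c in enumerate(ranking)}
    total = 0
    for vote in votes:
        vpos = {c: i for i, c in enumerate(vote)}
        cands = list(ranking)
        for i in range(len(cands)):
            for j in range(i + 1, len(cands)):
                a, b = cands[i], cands[j]
                if (pos[a] < pos[b]) != (vpos[a] < vpos[b]):
                    total += 1
    return total

votes = [("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C")]
best = min(permutations("ABC"), key=lambda r: disagreements(r, votes))
# Majorities: A > B (2-1), A > C (3-0), B > C (2-1), so best is ('A', 'B', 'C').
```

Brute force is exponential in the number of agents, which is one reason a soft, differentiable relaxation of this objective is attractive for ranking large agent populations.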
