Search Results for author: Rachel Freedman

Found 9 papers, 1 paper with code

Adapting a Kidney Exchange Algorithm to Align with Human Values

1 code implementation • 19 May 2020 • Rachel Freedman, Jana Schaich Borg, Walter Sinnott-Armstrong, John P. Dickerson, Vincent Conitzer

In kidney exchanges, a central market maker allocates living kidney donors to patients in need of an organ.
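
The matching step in a kidney exchange is essentially a weighted matching problem on a compatibility graph. As a rough illustration only (not the algorithm adapted in the paper), the Python sketch below greedily selects disjoint two-way swaps between incompatible patient-donor pairs, with a per-pair weight standing in for a value-aligned priority score; all names and numbers are hypothetical.

```python
# Minimal sketch (not the paper's algorithm): weighted pairwise kidney exchange.
# Each incompatible patient-donor pair may swap donors with another pair if the
# donors are cross-compatible. Pair weights stand in for learned moral priorities.
from itertools import combinations

def pairwise_exchange(pairs, compatible, weight):
    """Greedily select disjoint two-way swaps with the highest total weight.

    pairs      -- list of pair identifiers
    compatible -- compatible(donor_pair, patient_pair) -> bool
    weight     -- weight(pair) -> float, e.g. a value-aligned priority score
    """
    # Enumerate feasible 2-cycles: i's donor helps j's patient and vice versa.
    swaps = [
        (i, j) for i, j in combinations(pairs, 2)
        if compatible(i, j) and compatible(j, i)
    ]
    # Sort candidate swaps by the combined priority of the two patients served.
    swaps.sort(key=lambda s: weight(s[0]) + weight(s[1]), reverse=True)

    matched, result = set(), []
    for i, j in swaps:
        if i not in matched and j not in matched:
            result.append((i, j))
            matched.update((i, j))
    return result

# Toy example: three pairs, priorities from some elicited value profile.
if __name__ == "__main__":
    compat = {("A", "B"), ("B", "A"), ("B", "C")}
    prio = {"A": 0.9, "B": 0.5, "C": 0.7}
    print(pairwise_exchange(
        ["A", "B", "C"],
        compatible=lambda d, p: (d, p) in compat,
        weight=prio.get,
    ))  # -> [('A', 'B')]
```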

Aligning with Heterogeneous Preferences for Kidney Exchange

no code implementations • 16 Jun 2020 • Rachel Freedman

In this paper, we propose, implement and evaluate a methodology for prioritizing patients based on such heterogeneous moral preferences.

Decision Making
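
As a loose illustration of the kind of prioritization described above (the paper's actual methodology is not reproduced here), the sketch below averages hypothetical per-respondent attribute weights into one profile and ranks patients by the resulting linear score; the attribute names and values are invented for the example.

```python
# Rough sketch (my assumptions, not the paper's method): aggregate heterogeneous
# per-respondent attribute weights into a single patient priority score.
from statistics import mean

def aggregate_weights(respondent_weights):
    """Average each attribute's weight across respondents (one simple rule)."""
    attrs = respondent_weights[0].keys()
    return {a: mean(w[a] for w in respondent_weights) for a in attrs}

def priority(patient, weights):
    """Linear priority score: sum of weighted patient attributes."""
    return sum(weights[a] * patient[a] for a in weights)

if __name__ == "__main__":
    # Hypothetical elicited weights from three survey respondents.
    elicited = [
        {"age_factor": 0.7, "health_factor": 0.3},
        {"age_factor": 0.4, "health_factor": 0.6},
        {"age_factor": 0.5, "health_factor": 0.5},
    ]
    w = aggregate_weights(elicited)
    patients = {"p1": {"age_factor": 1.0, "health_factor": 0.2},
                "p2": {"age_factor": 0.3, "health_factor": 0.9}}
    ranked = sorted(patients, key=lambda p: priority(patients[p], w), reverse=True)
    print(w, ranked)
```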

Benefits of Assistance over Reward Learning

no code implementations • 1 Jan 2021 • Rohin Shah, Pedro Freire, Neel Alex, Rachel Freedman, Dmitrii Krasheninnikov, Lawrence Chan, Michael D Dennis, Pieter Abbeel, Anca Dragan, Stuart Russell

By merging reward learning and control, assistive agents can reason about the impact of control actions on reward learning, leading to several advantages over agents based on reward learning.
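
To see why reasoning about the effect of actions on future reward learning can help, here is a toy calculation under my own assumptions (not the paper's formal assistance-game model): an agent that asks the human before acting can outperform one that acts on its current reward estimate, even when asking has a cost.

```python
# A toy illustration (my assumptions, not the paper's model): an assistive agent
# that can either act immediately or ask the human first. Because it reasons
# about how asking will change its reward estimate, waiting can beat acting on
# the current point estimate.

# Two equally likely reward hypotheses over two actions.
hypotheses = {
    "human_wants_a": {"a": 1.0, "b": 0.0},
    "human_wants_b": {"a": 0.0, "b": 1.0},
}
prior = {"human_wants_a": 0.5, "human_wants_b": 0.5}
ASK_COST = 0.1   # hypothetical small cost of querying the human

def expected_reward(action):
    return sum(prior[h] * hypotheses[h][action] for h in prior)

# Option 1: act now on the prior -> expected reward 0.5 at best.
act_now = max(expected_reward(a) for a in ("a", "b"))

# Option 2: ask first. After a (noiseless, for simplicity) answer the agent
# knows the true reward and picks the optimal action, worth 1.0 minus the cost.
ask_then_act = sum(
    prior[h] * max(hypotheses[h].values()) for h in prior
) - ASK_COST

print(act_now, ask_then_act)   # 0.5 vs 0.9 -> asking is better
```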

Choice Set Misspecification in Reward Inference

no code implementations • 19 Jan 2021 • Rachel Freedman, Rohin Shah, Anca Dragan

A promising alternative to manually specifying reward functions is to enable robots to infer them from human feedback, like demonstrations or corrections.
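
One common way to formalize such inference is to model the human as approximately (Boltzmann-)rational over a choice set and update a belief over candidate reward functions. The sketch below uses that generic model, not the paper's exact formalism, and shows how assuming the wrong choice set shifts the posterior; the rewards and options are made up.

```python
# Illustrative sketch (my assumptions, not the paper's formalism): Bayesian
# reward inference from a human choice, where the human is modeled as
# Boltzmann-rational over a *choice set*. If the robot assumes the wrong
# choice set, the posterior over rewards shifts.
import math

def posterior(chosen, choice_set, candidate_rewards, beta=2.0):
    """P(reward | human chose `chosen` from `choice_set`), uniform prior."""
    post = {}
    for name, r in candidate_rewards.items():
        z = sum(math.exp(beta * r[c]) for c in choice_set)
        post[name] = math.exp(beta * r[chosen]) / z
    total = sum(post.values())
    return {k: v / total for k, v in post.items()}

if __name__ == "__main__":
    # Two hypotheses about what the human values, over three options.
    rewards = {"likes_a": {"a": 1.0, "b": 0.0, "c": 0.5},
               "likes_c": {"a": 0.2, "b": 0.0, "c": 1.0}}
    # Human actually chose "a" from {"a", "b", "c"}.
    print(posterior("a", {"a", "b", "c"}, rewards))   # true choice set
    print(posterior("a", {"a", "b"}, rewards))        # misspecified choice set
```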

The Expertise Problem: Learning from Specialized Feedback

no code implementations • 12 Nov 2022 • Oliver Daniels-Koch, Rachel Freedman

RLHF algorithms that learn from multiple teachers therefore face an expertise problem: the reliability of a given piece of feedback depends both on the teacher that it comes from and how specialized that teacher is on relevant components of the task.
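
A standard way to model teacher-dependent reliability, used here purely as an illustration rather than as the paper's model, is a Bradley-Terry preference likelihood with a per-teacher rationality coefficient, so the same comparison counts as stronger or weaker evidence depending on who provided it; the teacher names and coefficients below are hypothetical.

```python
# Sketch under my own assumptions (not the paper's model): a Bradley-Terry-style
# preference likelihood in which each teacher has its own rationality
# coefficient, so identical feedback carries different evidential weight
# depending on who gave it.
import math

def pref_likelihood(reward_a, reward_b, beta):
    """Probability that a teacher with rationality `beta` prefers A over B."""
    return 1.0 / (1.0 + math.exp(-beta * (reward_a - reward_b)))

def log_likelihood(labels, reward, teacher_beta):
    """Total log-likelihood of preference labels from several teachers.

    labels is a list of (teacher, option_a, option_b, a_preferred) tuples.
    """
    ll = 0.0
    for teacher, a, b, a_pref in labels:
        p = pref_likelihood(reward[a], reward[b], teacher_beta[teacher])
        ll += math.log(p if a_pref else 1.0 - p)
    return ll

if __name__ == "__main__":
    reward = {"x": 1.0, "y": 0.0}
    betas = {"expert": 5.0, "novice": 0.5}   # hypothetical reliabilities
    data = [("expert", "x", "y", True), ("novice", "x", "y", False)]
    print(log_likelihood(data, reward, betas))
```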

Active Reward Learning from Multiple Teachers

no code implementations • 2 Mar 2023 • Peter Barnett, Rachel Freedman, Justin Svegliato, Stuart Russell

Reward learning algorithms utilize human feedback to infer a reward function, which is then used to train an AI system.
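
A generic version of that pipeline, sketched below with invented data and not taken from this paper, fits a tabular reward to pairwise comparisons by gradient ascent on a Bradley-Terry log-likelihood and then acts greedily on the fitted reward.

```python
# Minimal illustration (a generic recipe, not this paper's algorithm): fit a
# tabular reward from pairwise human comparisons, then use it to pick an action.
import math

def fit_reward(options, comparisons, lr=0.5, steps=500, l2=0.01):
    """comparisons: list of (winner, loser) option pairs from human feedback."""
    r = {o: 0.0 for o in options}
    for _ in range(steps):
        grad = {o: -l2 * r[o] for o in options}   # small L2 keeps the MLE finite
        for winner, loser in comparisons:
            p_win = 1.0 / (1.0 + math.exp(-(r[winner] - r[loser])))
            grad[winner] += 1.0 - p_win   # d log p / d r_winner
            grad[loser] -= 1.0 - p_win    # d log p / d r_loser
        for o in options:
            r[o] += lr * grad[o]
    return r

if __name__ == "__main__":
    feedback = [("a", "b"), ("a", "c"), ("b", "c")]   # hypothetical labels
    reward = fit_reward(["a", "b", "c"], feedback)
    policy_choice = max(reward, key=reward.get)       # "train" = act greedily
    print(reward, policy_choice)
```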

Active teacher selection for reinforcement learning from human feedback

no code implementations • 23 Oct 2023 • Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell

The HUB framework and ATS algorithm demonstrate the importance of leveraging differences between teachers to learn accurate reward models, facilitating future research on active teacher selection for robust reward modeling.

Recommendation Systems • reinforcement-learning
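
The HUB/ATS formulation is a POMDP and is not reproduced here; as a stand-in, the sketch below scores teachers by the expected information gain of querying each one about a single comparison, given per-teacher noise levels, which captures the basic intuition that differences between teachers matter. All quantities are hypothetical.

```python
# Hedged sketch of the general idea (not the paper's HUB/ATS algorithm): choose
# which teacher to query next by expected information gain about the reward,
# given each teacher's rationality level.
import math

def entropy(p):
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def pref_prob(r, a, b, beta):
    return 1.0 / (1.0 + math.exp(-beta * (r[a] - r[b])))

def expected_info_gain(prior, hypotheses, a, b, beta):
    """Expected entropy reduction from asking one teacher to compare a vs b."""
    gain = 0.0
    for a_preferred in (True, False):
        marg, post = 0.0, {}
        for h, p_h in prior.items():
            p_ans = pref_prob(hypotheses[h], a, b, beta)
            p_ans = p_ans if a_preferred else 1.0 - p_ans
            post[h] = p_h * p_ans
            marg += p_h * p_ans
        post = {h: v / marg for h, v in post.items()}
        gain += marg * (entropy(prior) - entropy(post))
    return gain

if __name__ == "__main__":
    hyps = {"likes_x": {"x": 1.0, "y": 0.0}, "likes_y": {"x": 0.0, "y": 1.0}}
    prior = {"likes_x": 0.5, "likes_y": 0.5}
    teachers = {"expert": 5.0, "novice": 0.3}          # hypothetical betas
    best = max(teachers,
               key=lambda t: expected_info_gain(prior, hyps, "x", "y", teachers[t]))
    print(best)   # the expert is the more informative teacher to query
```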

Social Choice for AI Alignment: Dealing with Diverse Human Feedback

no code implementations • 16 Apr 2024 • Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mossé, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker

Foundation models such as GPT-4 are fine-tuned to avoid unsafe or otherwise problematic behavior, so that, for example, they refuse to comply with requests for help with committing crimes or with producing racist text.

Ethics
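
As a toy example of one social-choice aggregation rule (the paper surveys many such rules and does not prescribe this one), the sketch below applies a Borda count to hypothetical annotator rankings of candidate model responses.

```python
# Toy sketch of a single social-choice rule (Borda count) for aggregating
# diverse human feedback into one ranking; names and rankings are invented.
from collections import defaultdict

def borda(rankings):
    """Aggregate full rankings (best-first lists) into a single Borda order."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += n - 1 - position
    return sorted(scores, key=scores.get, reverse=True)

if __name__ == "__main__":
    # Hypothetical rankings of three model responses from three annotators.
    annotator_rankings = [
        ["refuse", "hedge", "comply"],
        ["hedge", "refuse", "comply"],
        ["refuse", "comply", "hedge"],
    ]
    print(borda(annotator_rankings))   # -> ['refuse', 'hedge', 'comply']
```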
