1 code implementation • NeurIPS 2023 • David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Thomas McGrath, Vladimir Mikulik
Additionally, the known structure of Tracr-compiled models can serve as ground-truth for evaluating interpretability methods.
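Tracr compiles human-readable RASP programs into concrete transformer weights. A minimal sketch of such a compilation, following the pattern of the open-source tracr library's published example (API details assumed from that release):

```python
from tracr.rasp import rasp
from tracr.compiler import compiling

# RASP program that computes the input length: attend everywhere,
# then count how many positions the selector attends to.
def make_length():
    all_true = rasp.Select(rasp.tokens, rasp.tokens, rasp.Comparison.TRUE)
    return rasp.SelectorWidth(all_true)

# Compile the program into concrete transformer weights.
model = compiling.compile_rasp_to_model(
    make_length(),
    vocab={1, 2, 3},
    max_seq_len=5,
    compiler_bos="BOS",
)

out = model.apply(["BOS", 1, 2, 3])
print(out.decoded)  # the computed length at each position
```

Because the weights are constructed rather than trained, every attention head and MLP has a known function, which is what makes the compiled model usable as interpretability ground truth.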
1 code implementation • ICLR 2021 • David Lindner, Rohin Shah, Pieter Abbeel, Anca Dragan
Since reward functions are hard to specify, recent work has focused on learning policies from human feedback.
1 code implementation • 30 Jun 2019 • Jason Mancuso, Tomasz Kisielewski, David Lindner, Alok Singh
We show that if the reward corruption in a CRMDP is sufficiently "spiky", the environment is solvable.
1 code implementation • 19 Oct 2023 • Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, David Lindner
We find that VLM-RMs are remarkably robust as long as the VLM is large enough.
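The recipe behind a VLM-RM is simple: embed the rendered observation and a natural-language goal description with a vision-language model such as CLIP, and use their similarity as the reward. A minimal sketch of that idea (the checkpoint and goal prompt below are illustrative, not the paper's exact setup):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

GOAL = "a humanoid robot kneeling"  # hypothetical task description

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

@torch.no_grad()
def vlm_reward(frame: Image.Image) -> float:
    """Reward = cosine similarity between the rendered observation
    and the text description of the goal state."""
    inputs = processor(text=[GOAL], images=frame,
                       return_tensors="pt", padding=True)
    img = model.get_image_features(pixel_values=inputs["pixel_values"])
    txt = model.get_text_features(input_ids=inputs["input_ids"],
                                  attention_mask=inputs["attention_mask"])
    img = img / img.norm(dim=-1, keepdim=True)
    txt = txt / txt.norm(dim=-1, keepdim=True)
    return float((img * txt).sum())

# usage: r = vlm_reward(Image.open("frame.png"))
```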
1 code implementation • NeurIPS 2021 • David Lindner, Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, Andreas Krause
For many reinforcement learning (RL) applications, specifying a reward is difficult.
1 code implementation • 18 Jul 2022 • David Lindner, Andreas Krause, Giorgia Ramponi
We propose a novel IRL algorithm, Active Exploration for Inverse Reinforcement Learning (AceIRL), which actively explores an unknown environment and the expert's policy to quickly learn the expert's reward function and identify a good policy.
1 code implementation • 24 Jan 2022 • Bhavya Sukhija, Matteo Turchetta, David Lindner, Andreas Krause, Sebastian Trimpe, Dominik Baumann
Learning optimal control policies directly on physical systems is challenging since even a single failure can lead to costly hardware damage.
1 code implementation • 10 Jun 2022 • David Lindner, Sebastian Tschiatschek, Katja Hofmann, Andreas Krause
We provide an instance-dependent lower bound for constrained linear best-arm identification and show that ACOL's sample complexity matches the lower bound in the worst case.
1 code implementation • 25 May 2023 • David Lindner, Xin Chen, Sebastian Tschiatschek, Katja Hofmann, Andreas Krause
We evaluate CoCoRL in gridworld environments and a driving simulation with multiple constraints.
no code implementations • 27 Mar 2019 • Johannes Beck, Roberta Huang, David Lindner, Tian Guo, Ce Zhang, Dirk Helbing, Nino Antulov-Fantulin
The ability to track and monitor relevant and important news in real time is of crucial interest across multiple industrial sectors.
no code implementations • 29 Jan 2021 • David Lindner, Kyle Matoba, Alexander Meulemans
Finally, we explore promising directions to overcome the unsolved challenges in preventing negative side effects with impact regularizers.
1 code implementation • 2 Jun 2021 • David Lindner, Hoda Heidari, Andreas Krause
To capture the long-term effects of ML-based allocation decisions, we study a setting in which the reward from each arm evolves every time the decision-maker pulls that arm.
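Concretely, this departs from the standard stochastic bandit in that pulling an arm changes its future rewards. A toy sketch of the setting (the drift dynamics below are illustrative, not the paper's model):

```python
import numpy as np

rng = np.random.default_rng(0)

class EvolvingArm:
    """Arm whose mean reward drifts each time it is pulled, modelling
    the long-term effect of allocation decisions on that arm."""
    def __init__(self, mean, drift):
        self.mean, self.drift = mean, drift

    def pull(self):
        reward = rng.normal(self.mean, 0.1)
        self.mean += self.drift(self.mean)  # reward evolves on every pull
        return reward

# Example: one arm improves with attention, one depletes.
arms = [EvolvingArm(0.5, lambda m: 0.05 * (1 - m)),  # growing
        EvolvingArm(0.9, lambda m: -0.05 * m)]       # rotting
for t in range(100):
    arms[t % 2].pull()  # naive round-robin; the paper studies better policies
```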
no code implementations • 27 Jun 2022 • David Lindner, Mennatallah El-Assady
Reinforcement learning (RL) commonly assumes access to well-specified reward functions, which many practical applications do not provide.
no code implementations • 3 Oct 2022 • Javier Rando, Daniel Paleka, David Lindner, Lennart Heim, Florian Tramèr
We then reverse-engineer the filter and find that while it aims to prevent sexual content, it ignores violence, gore, and other similarly disturbing content.
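Concretely, the mechanism the authors describe compares the generated image's CLIP embedding against a fixed set of pre-computed "unsafe concept" embeddings and blocks the output if any cosine similarity exceeds its threshold. A schematic sketch with placeholder embeddings and thresholds:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 768, 17  # embedding dim and number of unsafe concepts (placeholders)
concept_embs = rng.normal(size=(k, d))
concept_embs /= np.linalg.norm(concept_embs, axis=1, keepdims=True)
thresholds = np.full(k, 0.3)  # placeholder per-concept thresholds

def blocked(image_emb):
    """Return True if the image matches any unsafe concept too closely."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    sims = concept_embs @ image_emb  # cosine similarity to each concept
    return bool((sims > thresholds).any())
```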
no code implementations • 27 Jul 2023 • Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals.
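The core reward-learning step behind RLHF fits a reward model so that human-preferred behaviour scores higher, typically via a Bradley-Terry loss over preference pairs. A self-contained sketch (the network and feature shapes are illustrative):

```python
import torch
import torch.nn as nn

# Toy reward model over per-step features of a trajectory segment.
reward_model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

def preference_loss(feats_preferred, feats_rejected):
    """Bradley-Terry loss: the preferred segment's total predicted
    reward should exceed the rejected segment's."""
    r_p = reward_model(feats_preferred).sum(dim=1)
    r_r = reward_model(feats_rejected).sum(dim=1)
    return -torch.nn.functional.logsigmoid(r_p - r_r).mean()

# One gradient step on a random batch of (preferred, rejected) segments.
fp, fr = torch.randn(8, 20, 16), torch.randn(8, 20, 16)
loss = preference_loss(fp, fr)
opt.zero_grad(); loss.backward(); opt.step()
```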
no code implementations • 8 Aug 2023 • Yannick Metz, David Lindner, Raphaël Baur, Daniel Keim, Mennatallah El-Assady
To use reinforcement learning from human feedback (RLHF) in practical applications, it is crucial to learn reward models from diverse sources of human feedback and to account for the human factors involved in providing different types of feedback.
no code implementations • 20 Mar 2024 • Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah, Allan Dafoe, Toby Shevlane
To understand the risks posed by a new AI system, we must understand what it can and cannot do.