1 code implementation • 12 Jan 2023 • David Lindner, János Kramár, Matthew Rahtz, Thomas McGrath, Vladimir Mikulik
Interpretability research aims to build tools for understanding machine learning (ML) models.
no code implementations • 3 Oct 2022 • Javier Rando, Daniel Paleka, David Lindner, Lennart Heim, Florian Tramèr
We then reverse-engineer the filter and find that while it aims to prevent sexual content, it ignores violence, gore, and other similarly disturbing content.
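For context, the filter in question works roughly as follows (a minimal sketch of a cosine-similarity concept filter as described in the paper; the embeddings, dimensions, and thresholds below are random placeholders, not the real filter weights):

```python
import numpy as np

# Hypothetical sketch of the kind of filter the paper reverse-engineers:
# the image's embedding is compared against a fixed set of "unsafe concept"
# embeddings via cosine similarity, and the image is blocked if any
# similarity exceeds that concept's threshold.

def cosine_similarity(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def is_blocked(image_embedding, concept_embeddings, thresholds):
    sims = [cosine_similarity(image_embedding, c) for c in concept_embeddings]
    return any(s > t for s, t in zip(sims, thresholds))

rng = np.random.default_rng(0)
concepts = rng.normal(size=(17, 512))  # placeholder concept embeddings
thresholds = np.full(17, 0.3)          # placeholder per-concept thresholds
image = rng.normal(size=512)           # placeholder image embedding
print(is_blocked(image, concepts, thresholds))
```

Because such a filter only checks similarity to a fixed concept list, content outside that list (e.g. violence or gore) passes through untouched, which is the failure mode the paper documents.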
1 code implementation • 18 Jul 2022 • David Lindner, Andreas Krause, Giorgia Ramponi
We propose a novel IRL algorithm: Active Exploration for Inverse Reinforcement Learning (AceIRL), which actively explores an unknown environment and expert policy to quickly learn the expert's reward function and identify a good policy.
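A minimal sketch of the general active-exploration pattern (not AceIRL's actual acquisition rule; the Gaussian reward model and all constants are illustrative assumptions): maintain uncertainty over a reward estimate and direct queries toward the state where that uncertainty is largest.

```python
import numpy as np

# Toy active reward learning: a posterior over a per-state reward is
# refined by always querying the most uncertain state.
rng = np.random.default_rng(0)
n_states = 10
true_reward = rng.normal(size=n_states)  # unknown to the learner
mean = np.zeros(n_states)                # posterior mean per state
var = np.ones(n_states)                  # posterior variance per state
noise_var = 0.1

for _ in range(50):
    s = int(np.argmax(var))              # explore the most uncertain state
    obs = true_reward[s] + rng.normal(scale=noise_var ** 0.5)
    # Standard Gaussian update for a noisy observation of reward[s].
    k = var[s] / (var[s] + noise_var)
    mean[s] += k * (obs - mean[s])
    var[s] *= 1 - k

print("estimated best state:", int(np.argmax(mean)),
      "| true best state:", int(np.argmax(true_reward)))
```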
no code implementations • 27 Jun 2022 • David Lindner, Mennatallah El-Assady
Reinforcement learning (RL) commonly assumes access to well-specified reward functions, which many practical applications do not provide.
1 code implementation • 10 Jun 2022 • David Lindner, Sebastian Tschiatschek, Katja Hofmann, Andreas Krause
We provide an instance-dependent lower bound for constrained linear best-arm identification and show that ACOL's sample complexity matches the lower bound in the worst case.
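An illustrative toy version of the constrained best-arm problem (a generic elimination scheme, not ACOL itself; all constants and the confidence-radius formula are assumptions): each arm has an unknown reward and an unknown constraint value, and we seek the highest-reward arm whose constraint value stays below a threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
rewards = np.array([0.2, 0.8, 0.6, 0.9])   # true mean rewards (unknown)
costs = np.array([0.1, 0.7, 0.3, 0.9])     # true constraint values (unknown)
budget = 0.5                               # feasibility threshold

n = np.zeros(4)
r_hat = np.zeros(4)
c_hat = np.zeros(4)
active = np.ones(4, dtype=bool)

for t in range(1, 5001):
    arms = np.flatnonzero(active)
    a = int(arms[t % len(arms)])           # round-robin over active arms
    n[a] += 1
    r_hat[a] += (rewards[a] + rng.normal(scale=0.1) - r_hat[a]) / n[a]
    c_hat[a] += (costs[a] + rng.normal(scale=0.1) - c_hat[a]) / n[a]
    width = np.sqrt(2 * np.log(t + 1) / np.maximum(n, 1))  # confidence radius
    maybe_feasible = c_hat - width <= budget
    best_lcb = np.max(np.where(maybe_feasible & active, r_hat - width, -np.inf))
    # Drop arms that are confidently infeasible or confidently suboptimal.
    active &= maybe_feasible & (r_hat + width >= best_lcb)
    if active.sum() <= 1:
        break

print("remaining candidates:", np.flatnonzero(active))
```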
no code implementations • 24 Jan 2022 • Bhavya Sukhija, Matteo Turchetta, David Lindner, Andreas Krause, Sebastian Trimpe, Dominik Baumann
Learning optimal control policies directly on physical systems is challenging since even a single failure can lead to costly hardware damage.
1 code implementation • 2 Jun 2021 • David Lindner, Hoda Heidari, Andreas Krause
To capture the long-term effects of ML-based allocation decisions, we study a setting in which the reward from each arm evolves every time the decision-maker pulls that arm.
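A toy instance of this setting (the dynamics and constants are illustrative, not the paper's model): each arm's expected reward depends on how often it has been pulled, so myopic decisions have long-term consequences.

```python
import numpy as np

def reward(arm, pulls):
    # Arm 0 starts high but decays with use; arm 1 starts low but improves.
    if arm == 0:
        return 1.0 * 0.95 ** pulls
    return 0.5 + 0.05 * min(pulls, 20)

counts = [0, 0]
rng = np.random.default_rng(0)
total = 0.0
for t in range(200):
    means = [reward(a, counts[a]) for a in range(2)]
    a = int(np.argmax(means))  # greedy in the current expected rewards
    total += means[a] + rng.normal(scale=0.01)
    counts[a] += 1

print("pull counts:", counts, "| total reward: %.1f" % total)
```

Here a myopically greedy policy spends its first pulls exhausting the decaying arm before discovering that the improving arm is better in the long run, which is exactly the kind of long-term effect the paper's setting captures.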
1 code implementation • ICLR 2021 • David Lindner, Rohin Shah, Pieter Abbeel, Anca Dragan
Since reward functions are hard to specify, recent work has focused on learning policies from human feedback.
1 code implementation • NeurIPS 2021 • David Lindner, Matteo Turchetta, Sebastian Tschiatschek, Kamil Ciosek, Andreas Krause
For many reinforcement learning (RL) applications, specifying a reward is difficult.
no code implementations • 29 Jan 2021 • David Lindner, Kyle Matoba, Alexander Meulemans
Finally, we explore promising directions to overcome the unsolved challenges in preventing negative side effects with impact regularizers.
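A minimal sketch of the basic construction behind impact regularizers (the L1 deviation measure, the do-nothing baseline, and the coefficient lambda are illustrative choices, not the paper's formulation): the agent optimizes the task reward minus a penalty on how far its action moves the environment state from a baseline state.

```python
# Impact-regularized reward: task reward minus a penalty on deviation
# from a baseline state (e.g. the state reached by taking no action).
def regularized_reward(reward, state, baseline_state, lam=0.1):
    deviation = sum(abs(s - b) for s, b in zip(state, baseline_state))
    return reward - lam * deviation

# Example: an action earns reward 1.0 but perturbs one state feature.
print(regularized_reward(1.0, state=(3, 5), baseline_state=(3, 4)))  # 0.9
```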
1 code implementation • 30 Jun 2019 • Jason Mancuso, Tomasz Kisielewski, David Lindner, Alok Singh
We show that if the reward corruption in a CRMDP is sufficiently "spiky", the environment is solvable.
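To give intuition for the "spiky" condition (a hand-rolled illustration, not the paper's algorithm; the Lipschitz assumption and all constants are assumptions made here): if the true reward varies smoothly across states, a corrupted reward that jumps far above both of its neighbors betrays itself and can be flagged.

```python
import numpy as np

L = 1.0                           # assumed Lipschitz bound on the true reward
states = np.arange(10)
true_reward = 0.5 * states        # smooth true reward (0.5-Lipschitz)
observed = true_reward.copy()
observed[4] = 100.0               # a corruption "spike" at state 4

# Flag states whose observed reward differs from BOTH neighbors by more
# than the Lipschitz bound allows between adjacent states.
corrupt = [
    int(s) for s in states[1:-1]
    if abs(observed[s] - observed[s - 1]) > L
    and abs(observed[s] - observed[s + 1]) > L
]
print("flagged as corrupt:", corrupt)  # -> [4]
```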
no code implementations • 27 Mar 2019 • Johannes Beck, Roberta Huang, David Lindner, Tian Guo, Ce Zhang, Dirk Helbing, Nino Antulov-Fantulin
The ability to track and monitor relevant news in real time is of crucial interest to multiple industrial sectors.