no code implementations • 3 Feb 2024 • Florian E. Dorner, Moritz Hardt
We study how to best spend a budget of noisy labels to compare the accuracy of two binary classifiers.
no code implementations • 9 Nov 2023 • Florian E. Dorner, Tom Sühr, Samira Samadi, Augustin Kelava
With large language models (LLMs) appearing to behave increasingly human-like in text-based interactions, it has become popular to attempt to evaluate various properties of these models using tests originally designed for humans.
1 code implementation • 20 Dec 2022 • Florian E. Dorner, Momchil Peychev, Nikola Konstantinov, Naman Goel, Elliott Ash, Martin Vechev
While existing research has started to address this gap, current methods are based on hardcoded word replacements, resulting in specifications with limited expressivity or ones that fail to fully align with human intuition (e.g., in cases of asymmetric counterfactuals).
no code implementations • 10 Oct 2021 • Florian E. Dorner
The prospect of collusive agreements being stabilized via the use of pricing algorithms is widely discussed by antitrust experts and economists.
no code implementations • 9 Feb 2021 • Florian E. Dorner
Sampled environment transitions are a critical input to deep reinforcement learning (DRL) algorithms.