no code implementations • 1 Apr 2024 • Matthias Gerstgrasser, Rylan Schaeffer, Apratim Dey, Rafael Rafailov, Henry Sleight, John Hughes, Tomasz Korbak, Rajashree Agrawal, Dhruv Pai, Andrey Gromov, Daniel A. Roberts, Diyi Yang, David L. Donoho, Sanmi Koyejo
The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs?
no code implementations • 15 Nov 2023 • Omar Shaikh, Kristina Gligorić, Ashna Khetan, Matthias Gerstgrasser, Diyi Yang, Dan Jurafsky
To understand the roots of the identified grounding gap, we examine the role of instruction tuning and preference optimization, finding that training on contemporary preference data leads to a reduction in generated grounding acts.
1 code implementation • NeurIPS 2023 • Matthias Gerstgrasser, Tom Danino, Sarah Keren
We present a novel multi-agent RL approach, Selective Multi-Agent Prioritized Experience Relay, in which agents share with other agents a limited number of transitions they observe during training.
no code implementations • 19 Oct 2022 • Matthias Gerstgrasser, David C. Parkes
Stackelberg equilibria arise naturally in a range of popular learning problems, such as in security games or indirect mechanism design, and have received increasing attention in the reinforcement learning literature.
Multi-agent Reinforcement Learning, reinforcement-learning
no code implementations • 12 Nov 2021 • Sarah Keren, Matthias Gerstgrasser, Ofir Abu, Jeffrey Rosenschein
AI agents need to be robust to unexpected changes in their environment in order to safely operate in real-world scenarios.
Multi-agent Reinforcement Learning, Reinforcement Learning (RL)
no code implementations • ICLR 2022 • Matthias Gerstgrasser, Rakshit Trivedi, David C. Parkes
Human demonstrations of video game play can serve as vital surrogate representations of real-world behaviors, access to which would facilitate rapid progress in several complex learning settings (e.g., behavior classification, imitation learning, offline RL, etc.).
no code implementations • 2 Oct 2020 • Gianluca Brero, Alon Eden, Matthias Gerstgrasser, David C. Parkes, Duncan Rheingans-Yoo
We introduce the use of reinforcement learning for indirect mechanisms, working with the existing class of sequential price mechanisms, which generalizes both serial dictatorship and posted price mechanisms and essentially characterizes all strongly obviously strategyproof mechanisms.
no code implementations • 22 Nov 2017 • Wolfgang Fruehwirt, Matthias Gerstgrasser, Pengfei Zhang, Leonard Weydemann, Markus Waser, Reinhold Schmidt, Thomas Benke, Peter Dal-Bianco, Gerhard Ransmayr, Dieter Grossegger, Heinrich Garn, Gareth W. Peters, Stephen Roberts, Georg Dorffner
The diagnosis of Alzheimer's disease (AD) in routine clinical practice is most commonly based on subjective clinical interpretations.