no code implementations • 22 Jan 2024 • Alizée Pace, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn
The success of reinforcement learning from human feedback (RLHF) in language model alignment is strongly dependent on the quality of the underlying reward model.
no code implementations • 15 Nov 2023 • Rita Kuznetsova, Alizée Pace, Manuel Burger, Hugo Yèche, Gunnar Rätsch
Recent findings in deep learning for tabular data are now surpassing these classical methods by better handling the severe heterogeneity of data input features.
no code implementations • 1 Jun 2023 • Alizée Pace, Hugo Yèche, Bernhard Schölkopf, Gunnar Rätsch, Guy Tennenholtz
A prominent challenge of offline reinforcement learning (RL) is the issue of hidden confounding: unobserved variables may influence both the actions taken by the agent and the observed outcomes.
1 code implementation • 29 Aug 2022 • Hugo Yèche, Alizée Pace, Gunnar Rätsch, Rita Kuznetsova
Temporal label smoothing (TLS) reduces the number of missed events by up to a factor of two compared with previously used approaches to early event prediction.
Ranked #1 on Respiratory Failure on HiRID
no code implementations • ICLR 2022 • Alizée Pace, Alex J. Chan, Mihaela van der Schaar
Building models of human decision-making from observed behaviour is critical to better understand, diagnose and support real-world policies such as clinical care.