no code implementations • 16 Apr 2024 • Christian Tomani, Kamalika Chaudhuri, Ivan Evtimov, Daniel Cremers, Mark Ibrahim
A major barrier to the practical deployment of large language models (LLMs) is their lack of reliability.
no code implementations • 10 Oct 2023 • Christian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers
Maximum-a-posteriori (MAP) decoding is the most widely used decoding strategy for neural machine translation (NMT) models.
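For context, MAP decoding returns the hypothesis with the highest model probability and is in practice approximated by greedy or beam search. Below is a minimal greedy sketch; the `toy_model` and token ids are illustrative placeholders, not anything from the paper:

```python
import numpy as np

def greedy_map_decode(step_logits_fn, bos_id: int, eos_id: int, max_len: int = 20):
    """Approximate MAP decoding: pick the locally most probable token at
    every step (equivalent to beam search with beam size 1)."""
    tokens = [bos_id]
    for _ in range(max_len):
        logits = step_logits_fn(tokens)   # model's scores for the next token
        next_id = int(np.argmax(logits))  # greedy = highest-scoring token
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

# Toy stand-in for an NMT decoder: deterministic logits over a 5-token vocab.
def toy_model(prefix):
    rng = np.random.default_rng(len(prefix))  # depends only on prefix length
    return rng.normal(size=5)

print(greedy_map_decode(toy_model, bos_id=0, eos_id=4))
```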
1 code implementation • 10 Feb 2023 • Christian Tomani, Futa Waseda, Yuesong Shen, Daniel Cremers
While existing post-hoc calibration methods achieve impressive results on in-domain test datasets, they are limited by their inability to yield reliable uncertainty estimates in domain-shift and out-of-domain (OOD) scenarios.
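The canonical post-hoc method is temperature scaling: a single scalar T fitted on an in-domain validation set, which is one reason such methods need not transfer to shifted or OOD data. A minimal numpy sketch, with our own function names and a grid search in place of the usual optimizer:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def nll(logits, labels, T):
    """Negative log-likelihood of labels under temperature-scaled logits."""
    probs = softmax(logits / T)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def fit_temperature(val_logits, val_labels):
    """Fit the single scalar temperature on a validation set via grid search."""
    grid = np.linspace(0.5, 5.0, 200)
    return grid[np.argmin([nll(val_logits, val_labels, T) for T in grid])]

# Toy data: 100 samples, 10 classes, deliberately over-confident logits.
rng = np.random.default_rng(0)
logits = rng.normal(size=(100, 10)) * 3
labels = rng.integers(0, 10, size=100)
print(f"fitted temperature: {fit_temperature(logits, labels):.2f}")
```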
1 code implementation • 12 Oct 2022 • Hans Hao-Hsun Hsu, Yuesong Shen, Christian Tomani, Daniel Cremers
Furthermore, based on the insights from this study, we design a novel calibration method named Graph Attention Temperature Scaling (GATS), which is tailored for calibrating graph neural networks.
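As a rough illustration of the idea of attention-derived per-node temperatures (the parameterization, aggregation, and softplus mapping below are our assumptions, not the GATS architecture from the paper):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_temperatures(node_feats, adj, w_att, t0=1.0):
    """Hypothetical per-node temperatures: attend over each node's
    neighbors and map the attended feature to a positive temperature.
    Illustrative only; not the paper's GATS parameterization."""
    n = node_feats.shape[0]
    temps = np.empty(n)
    for i in range(n):
        nbrs = np.flatnonzero(adj[i])
        scores = node_feats[nbrs] @ w_att             # attention logits
        alpha = softmax(scores)                        # neighbor weights
        agg = alpha @ node_feats[nbrs]                 # attended neighborhood feature
        temps[i] = t0 + np.log1p(np.exp(agg.mean()))   # softplus keeps T positive
    return temps

# Toy graph: 4 nodes in a ring, 8-dim features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 8))
adj = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]])
print(attention_temperatures(feats, adj, rng.normal(size=8)))
```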
no code implementations • 30 May 2022 • Christian Tomani, Daniel Cremers
Regularization is key in deep learning, especially when training complex models on relatively small datasets.
1 code implementation • 24 Feb 2021 • Christian Tomani, Daniel Cremers, Florian Buettner
We address the problem of uncertainty calibration and introduce a novel calibration method, Parametrized Temperature Scaling (PTS).
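PTS replaces the single global temperature with a per-sample temperature predicted from the logits by a small neural network. A forward-pass sketch; the hidden size, the sorted-logit input, and the class name are illustrative assumptions, and the real network would be trained on a held-out validation set:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

class PTSSketch:
    """Per-sample temperature from a tiny MLP over the sorted logits.
    Forward pass only; weights here are random, not trained."""
    def __init__(self, num_classes: int, hidden: int = 16, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(scale=0.1, size=(num_classes, hidden))
        self.w2 = rng.normal(scale=0.1, size=hidden)

    def temperature(self, logits: np.ndarray) -> np.ndarray:
        x = np.sort(logits, axis=-1)[:, ::-1]        # sorted logits as input
        h = np.maximum(x @ self.w1, 0.0)             # ReLU hidden layer
        return np.log1p(np.exp(h @ self.w2)) + 1e-3  # softplus -> T > 0

    def calibrate(self, logits: np.ndarray) -> np.ndarray:
        T = self.temperature(logits)[:, None]        # one T per sample
        return softmax(logits / T)

rng = np.random.default_rng(1)
logits = rng.normal(size=(5, 10)) * 3
print(PTSSketch(num_classes=10).calibrate(logits).round(3))
```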
1 code implementation • CVPR 2021 • Christian Tomani, Sebastian Gruber, Muhammed Ebrar Erdem, Daniel Cremers, Florian Buettner
First, we show that existing post-hoc calibration methods yield highly over-confident predictions under domain shift.
1 code implementation • 20 Dec 2020 • Christian Tomani, Florian Buettner
It is crucial for predictive models to be uncertainty-aware and to yield well-calibrated (and thus trustworthy) predictions, both for in-domain samples and under domain shift.
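Here "well-calibrated" means predicted confidences match empirical accuracies, commonly quantified by the expected calibration error (ECE). A minimal sketch; the 10 equal-width bins are the usual convention, not a detail from the paper:

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins: int = 10) -> float:
    """ECE: bin predictions by confidence, then average the gap between
    confidence and accuracy per bin, weighted by bin size."""
    conf = probs.max(axis=1)                 # predicted confidence
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(conf[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap         # weight by fraction of samples in bin
    return ece

# Toy check: 200 samples, 5 classes.
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(5), size=200)
y = rng.integers(0, 5, size=200)
print(f"ECE = {expected_calibration_error(p, y):.3f}")
```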