no code implementations • 25 Jan 2024 • Stephen Casper, Carson Ezell, Charlotte Siegmann, Noam Kolt, Taylor Lynn Curtis, Benjamin Bucknall, Andreas Haupt, Kevin Wei, Jérémy Scheurer, Marius Hobbhahn, Lee Sharkey, Satyapriya Krishna, Marvin Von Hagen, Silas Alberti, Alan Chan, Qinyi Sun, Michael Gerovitch, David Bau, Max Tegmark, David Krueger, Dylan Hadfield-Menell
The effectiveness of an audit, however, depends on the degree of system access granted to auditors.
1 code implementation • 9 Nov 2023 • Jérémy Scheurer, Mikita Balesni, Marius Hobbhahn
We demonstrate a situation in which Large Language Models, trained to be helpful, harmless, and honest, can display misaligned behavior and strategically deceive their users about this behavior without being instructed to do so.
no code implementations • 26 Oct 2022 • Pablo Villalobos, Jaime Sevilla, Lennart Heim, Tamay Besiroglu, Marius Hobbhahn, Anson Ho
We analyze the growth of dataset sizes used in machine learning for natural language processing and computer vision, and extrapolate these using two methods; using the historical growth rate and estimating the compute-optimal dataset size for future predicted compute budgets.
no code implementations • 5 Jul 2022 • Pablo Villalobos, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Anson Ho, Marius Hobbhahn
From 1950 to 2018, model size in language models increased steadily by seven orders of magnitude.
1 code implementation • 11 Feb 2022 • Jaime Sevilla, Lennart Heim, Anson Ho, Tamay Besiroglu, Marius Hobbhahn, Pablo Villalobos
Since the advent of Deep Learning in the early 2010s, the scaling of training compute has accelerated, doubling approximately every 6 months.
1 code implementation • 7 May 2021 • Marius Hobbhahn, Philipp Hennig
The method can be thought of as a pre-processing step which can be implemented in <5 lines of code and runs in less than a second.
1 code implementation • 2 Mar 2020 • Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig
We reconsider old work (Laplace Bridge) to construct a Dirichlet approximation of this softmax output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions (the conjugate prior to the Categorical distribution) in the output space.