1 code implementation • 7 Nov 2024 • Bai Cong, Nico Daheim, Yuesong Shen, Daniel Cremers, Rio Yokota, Mohammad Emtiyaz Khan, Thomas Möllenhoff
We replace AdamW with the Improved Variational Online Newton (IVON) algorithm to fine-tune large language models.
1 code implementation • 27 Feb 2024 • Yuesong Shen, Nico Daheim, Bai Cong, Peter Nickl, Gian Maria Marconi, Clement Bazan, Rio Yokota, Iryna Gurevych, Daniel Cremers, Mohammad Emtiyaz Khan, Thomas Möllenhoff
We give extensive empirical evidence against the common belief that variational learning is ineffective for large neural networks.