no code implementations • 4 Feb 2025 • Mateusz Piotrowski, Paul M. Riechers, Daniel Filan, Adam S. Shai
What computational structures emerge in transformers trained on next-token prediction?
no code implementations • 24 May 2024 • Adam S. Shai, Sarah E. Marzen, Lucas Teixeira, Alexander Gietelink Oldenziel, Paul M. Riechers
What computational structure are we building into large language models when we train them on next-token prediction?