9 Nov 2023 • Florian E. Dorner, Tom Sühr, Samira Samadi, Augustin Kelava
With large language models (LLMs) appearing to behave increasingly human-like in text-based interactions, it has become popular to attempt to evaluate various properties of these models using tests originally designed for humans.