no code implementations • 21 Mar 2024 • Carina Kauf, Emmanuele Chersoni, Alessandro Lenci, Evelina Fedorenko, Anna A. Ivanova
Experiment 1 shows that, across model architectures and plausibility datasets, (i) log likelihood ($\textit{LL}$) scores are the most reliable indicator of sentence plausibility, with zero-shot prompting yielding inconsistent and typically poor results; (ii) $\textit{LL}$-based performance is still inferior to human performance; (iii) instruction-tuned models have worse $\textit{LL}$-based performance than base models.
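To make the idea of $\textit{LL}$-based plausibility scoring concrete, here is a minimal sketch of comparing the log likelihoods of two sentences that form a minimal pair. It rests on several assumptions not taken from the paper: a toy add-one-smoothed bigram model stands in for a large language model, and the corpus, the example sentences, and the function names `train_bigram` and `sentence_ll` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus; a real study would score sentences with an LLM instead.
CORPUS = [
    "the teacher bought the laptop",
    "the teacher graded the exam",
    "the student bought the book",
]

def train_bigram(corpus):
    """Collect the vocabulary, context counts, and bigram counts from a corpus."""
    vocab, contexts, bigrams = set(), Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        vocab.update(tokens)
        contexts.update(tokens[:-1])            # tokens that precede another token
        bigrams.update(zip(tokens, tokens[1:]))
    return vocab, contexts, bigrams

def sentence_ll(sentence, vocab, contexts, bigrams):
    """Sum of log P(token | previous token) with add-one smoothing."""
    tokens = ["<s>"] + sentence.lower().split()
    ll = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(vocab))
        ll += math.log(prob)
    return ll

model = train_bigram(CORPUS)
ll_plausible = sentence_ll("the teacher bought the laptop", *model)
ll_implausible = sentence_ll("the laptop bought the teacher", *model)

# The sentence with the higher log likelihood is treated as the more plausible one.
print(f"plausible:   {ll_plausible:.3f}")
print(f"implausible: {ll_implausible:.3f}")
print("model prefers plausible sentence:", ll_plausible > ll_implausible)
```

Scoring minimal pairs in this way requires only the model's token probabilities, whereas zero-shot prompting also depends on how the model interprets the prompt.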
no code implementations • 3 Dec 2023 • Anna A. Ivanova
In this paper, I describe methodological considerations for studies that aim to evaluate the cognitive capacities of large language models (LLMs) using language-based behavioral assessments.
no code implementations • 16 Jan 2023 • Kyle Mahowald, Anna A. Ivanova, Idan A. Blank, Nancy Kanwisher, Joshua B. Tenenbaum, Evelina Fedorenko
Large Language Models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split.
1 code implementation • 2 Dec 2022 • Carina Kauf, Anna A. Ivanova, Giulia Rambelli, Emmanuele Chersoni, Jingyuan Selena She, Zawad Chowdhury, Evelina Fedorenko, Alessandro Lenci
Overall, our results show that important aspects of event knowledge naturally emerge from distributional linguistic patterns, but also highlight a gap between representations of possible/impossible and likely/unlikely events.
no code implementations • 23 Aug 2022 • Anna A. Ivanova, Martin Schrimpf, Stefano Anzellotti, Noga Zaslavsky, Evelina Fedorenko, Leyla Isik
Moreover, we argue that, rather than categorically treating the mapping models as linear or nonlinear, we should aim to estimate the complexity of these models.
no code implementations • 16 Apr 2021 • Anna A. Ivanova, John Hewitt, Noga Zaslavsky
A major challenge in both neuroscience and machine learning is the development of useful tools for understanding complex information processing systems.