Improving fit to human reading times via temperature-scaled surprisal

15 Nov 2023 · Tong Liu, Iza Škrjanec, Vera Demberg ·

Past studies have provided broad support for that words with lower predictability (i.e., higher surprisal) require more time for comprehension by using large language models (LLMs) to simulate humans' cognitive load. In general, these studies have implicitly assumed that the probability scores from LLMs are accurate, ignoring the discrepancies between human cognition and LLMs from this standpoint. Inspired by the concept of probability calibration, we are the first work to focus on the probability distribution for human reading simulation. We propose to use temperature-scaled surprisal, a surprisal calculated by shaped probability, to be the predictor of human reading times. Our results across three corpora consistently revealed that such a surprisal can drastically improve the prediction of reading times. Setting the temperature to be approximately 2.5 across all models and datasets can yield up to an 89% of increase in delta log-likelihood in our setting. We also propose a calibration metric to quantify the possible human-likeness bias. Further analysis was done and provided insights into this phenomenon.

PDF Abstract

Code

Add Remove Mark official

No code implementations yet. Submit your code now

Tasks

Add Remove

Datasets

Natural Stories

Results from the Paper

Edit

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods

Add Remove

Focus

Edit Social Preview

Improving fit to human reading times via temperature-scaled surprisal

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove