Search Results for author: Byung-Doh Oh

Found 11 papers, 5 papers with code

Coreference-aware Surprisal Predicts Brain Response

no code implementations • Findings (EMNLP) 2021 • Evan Jaffe, Byung-Doh Oh, William Schuler

Recent evidence supports a role for coreference processing in guiding human expectations about upcoming words during reading, based on covariation between reading times and word surprisal estimated by a coreference-aware semantic processing model (Jaffe et al. 2020). The present study reproduces and elaborates on this finding by (1) enabling the parser to process subword information that might better approximate human morphological knowledge, and (2) extending evaluation of coreference effects from self-paced reading to human brain imaging data.

Paper
Add Code

Character-based PCFG Induction for Modeling the Syntactic Acquisition of Morphologically Rich Languages

no code implementations • Findings (EMNLP) 2021 • Lifeng Jin, Byung-Doh Oh, William Schuler

A subsequent evaluation on multilingual treebanks shows that the model with subword information achieves state-of-the-art results on many languages, further supporting a distributional model of syntactic acquisition.

Paper
Add Code

Team Ohio State at CMCL 2021 Shared Task: Fine-Tuned RoBERTa for Eye-Tracking Data Prediction

1 code implementation • NAACL (CMCL) 2021 • Byung-Doh Oh

This paper describes Team Ohio State’s approach to the CMCL 2021 Shared Task, the goal of which is to predict five eye-tracking features from naturalistic self-paced reading corpora.

Language Modelling Sentence

Paper
Code

Contributions of Propositional Content and Syntactic Category Information in Sentence Processing

no code implementations • NAACL (CMCL) 2021 • Byung-Doh Oh, William Schuler

Expectation-based theories of sentence processing posit that processing difficulty is determined by predictability in context.

Sentence

Paper
Add Code

Frequency Explains the Inverse Correlation of Large Language Models' Size, Training Data Amount, and Surprisal's Fit to Reading Times

1 code implementation • 3 Feb 2024 • Byung-Doh Oh, Shisen Yue, William Schuler

Additionally, training dynamics reveal that during later training steps, all model variants learn to predict rare words and that larger model variants do so more accurately, which explains the detrimental effect of both training data amount and model size on fit to reading times.

Language Modelling

Paper
Code

Token-wise Decomposition of Autoregressive Language Model Hidden States for Analyzing Model Predictions

1 code implementation • 17 May 2023 • Byung-Doh Oh, William Schuler

While there is much recent interest in studying why Transformer-based large language models make predictions the way they do, the complex computations performed within each layer have made their behavior somewhat opaque.

Language Modelling

Paper
Code

Transformer-Based Language Model Surprisal Predicts Human Reading Times Best with About Two Billion Training Tokens

no code implementations • 22 Apr 2023 • Byung-Doh Oh, William Schuler

Recent psycholinguistic studies have drawn conflicting conclusions about the relationship between the quality of a language model and the ability of its surprisal estimates to predict human reading times, which has been speculated to be due to the large gap in both the amount of training data and model capacity across studies.

Language Modelling

Paper
Add Code

Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?

no code implementations • 23 Dec 2022 • Byung-Doh Oh, William Schuler

This work presents a detailed linguistic analysis into why larger Transformer-based pre-trained language models with more parameters and lower perplexity nonetheless yield surprisal estimates that are less predictive of human reading times.

Paper
Add Code

Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal

1 code implementation • 21 Dec 2022 • Byung-Doh Oh, William Schuler

Transformer-based large language models are trained to make predictions about the next word by aggregating representations of previous tokens through their self-attention mechanism.

Informativeness Language Modelling +1

Paper
Code

Surprisal Estimators for Human Reading Times Need Character Models

1 code implementation • ACL 2021 • Byung-Doh Oh, Christian Clark, William Schuler

While the use of character models has been popular in NLP applications, it has not been explored much in the context of psycholinguistic modeling.

Sentence

Paper
Code

THOMAS: The Hegemonic OSU Morphological Analyzer using Seq2seq

no code implementations • WS 2019 • Byung-Doh Oh, Pranav Maneriker, Nanjiang Jiang

This paper describes the OSU submission to the SIGMORPHON 2019 shared task, Crosslinguality and Context in Morphology.

Decoder Morphological Analysis +3

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.