The relational processing limits of classic and contemporary neural network models of language processing
The ability of neural networks to capture relational knowledge is a matter of long-standing controversy. Recently, some researchers in the PDP side of the debate have argued that (1) classic PDP models can handle relational structure (Rogers & McClelland, 2008, 2014) and (2) the success of deep learning approaches to text processing suggests that structured representations are unnecessary to capture the gist of human language (Rabovsky et al., 2018). In the present study we tested the Story Gestalt model (St. John, 1992), a classic PDP model of text comprehension, and a Sequence-to-Sequence with Attention model (Bahdanau et al., 2015), a contemporary deep learning architecture for text processing. Both models were trained to answer questions about stories based on the thematic roles that several concepts played on the stories. In three critical test we varied the statistical structure of new stories while keeping their relational structure constant with respect to the training data. Each model was susceptible to each statistical structure manipulation to a different degree, with their performance failing below chance at least under one manipulation. We argue that the failures of both models are due to the fact that they cannotperform dynamic binding of independent roles and fillers. Ultimately, these results cast doubts onthe suitability of traditional neural networks models for explaining phenomena based on relational reasoning, including language processing.
PDF Abstract