no code implementations • 15 Feb 2023 • Kishore Surendra, Achim Schilling, Paul Stoewer, Andreas Maier, Patrick Krauss
Strikingly, we find that the internal representations of nine-word input sequences cluster according to the word class of the tenth word to be predicted as output, even though the neural network did not receive any explicit information about syntactic rules or word classes during training.