How Relevant Are Selectional Preferences for Transformer-based Language Models?

Selectional preference is defined as the tendency of a predicate to favor particular arguments within a certain linguistic context, and likewise, reject others that result in conflicting or implausible meanings. The stellar success of contextual word embedding models such as BERT in NLP tasks has led many to question whether these models have learned linguistic information, but up till now, most research has focused on syntactic information. We investigate whether Bert contains information on the selectional preferences of words, by examining the probability it assigns to the dependent word given the presence of a head word in a sentence. We are using word pairs of head-dependent words in five different syntactic relations from the SP-10K corpus of selectional preference (Zhang et al., 2019b), in sentences from the ukWaC corpus, and we are calculating the correlation of the plausibility score (from SP-10K) and the model probabilities. Our results show that overall, there is no strong positive or negative correlation in any syntactic relation, but we do find that certain head words have a strong correlation and that masking all words but the head word yields the most positive correlations in most scenarios {--}which indicates that the semantics of the predicate is indeed an integral and influential factor for the selection of the argument.

PDF Abstract


Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.