Word associations and the distance properties of context-aware word embeddings
What do people know when they know the meaning of words? Word associations have been widely used to tap into lexical representations and their structure, as a way of probing semantic knowledge in humans. We investigate whether current word embedding spaces (contextualized and uncontextualized) can be considered good models of human lexical knowledge by studying whether they have characteristics comparable to those of human association spaces. We study three properties: association rank, asymmetry of similarity, and the triangle inequality. We find that word embeddings are good models of some word association properties. They replicate human associations between words well, and, like humans, their context-aware variants show violations of the triangle inequality. While they do show asymmetry of similarities, their asymmetries do not map onto those of human association norms.