Tracking the progress of Language Models by extracting their underlying Knowledge Graphs
The state of the art, previously dominated by pre-trained word embeddings, is now being pushed forward by large pre-trained contextual representation models. This success has driven growing interest in understanding what these models encode in their internal representations. Despite this, probing their semantic abilities has proven elusive, often yielding unsuccessful, inconclusive, or contradictory results across different studies. In this work, we define a probing classifier that we use to extract the underlying knowledge graph of nine of the most influential models of recent years, spanning word embeddings, context encoders, and text generators. The probe is based on concept relatedness, grounded in WordNet. Our results show that this knowledge is present in all the models but contains several inaccuracies. Furthermore, we show that different pre-training strategies and architectures lead to different model biases. We conduct a systematic evaluation to identify the specific factors that make some concepts challenging for each family of models. We hope our insights will motivate the development of future models that capture concepts more precisely.
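The abstract describes a probing classifier over concept relatedness: given representations of two concepts, a lightweight classifier predicts whether they are related, with relatedness labels derived from WordNet. The sketch below is a minimal illustration of this general technique, not the authors' exact setup: synthetic vectors stand in for embeddings extracted from a pre-trained model, the labels stand in for WordNet-derived related/unrelated pairs, and the probe is a plain logistic regression trained with gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pairs(n, dim=16):
    """Synthetic concept-pair features (assumption: stand-ins for real
    embeddings and WordNet-derived labels). Related pairs point in
    similar directions; unrelated pairs are independent vectors."""
    a = rng.normal(size=(n, dim))
    related = a + 0.1 * rng.normal(size=(n, dim))  # near-duplicates of a
    unrelated = rng.normal(size=(n, dim))          # independent of a
    # Pair feature: elementwise product, which captures vector alignment.
    x = np.vstack([a * related, a * unrelated])
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return x, y

def train_probe(x, y, lr=0.1, steps=500):
    """Logistic-regression probe fit with batch gradient descent."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(related)
        g = p - y                                # gradient of log loss
        w -= lr * (x.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

x, y = make_pairs(200)
w, b = train_probe(x, y)
acc = float((((x @ w + b) > 0) == y).mean())
print(f"probe accuracy: {acc:.2f}")
```

In the paper's setting, the probe's predictions over all concept pairs induce a graph, which can then be compared against WordNet to localize the model's inaccuracies; here the training accuracy merely shows that a linear probe can recover a relatedness signal when one is linearly present in the representations.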