What is all this new MeSH about? Exploring the semantic provenance of new descriptors in the MeSH thesaurus

20 Jan 2021  ·  Anastasios Nentidis, Anastasia Krithara, Grigorios Tsoumakas, Georgios Paliouras ·

The Medical Subject Headings (MeSH) thesaurus is a controlled vocabulary widely used in biomedical knowledge systems, particularly for semantic indexing of scientific literature. As the MeSH hierarchy evolves through annual version updates, some new descriptors are introduced that were not previously available. This paper explores the conceptual provenance of these new descriptors. In particular, we investigate whether such new descriptors have been previously covered by older descriptors and what is their current relation to them. To this end, we propose a framework to categorize new descriptors based on their current relation to older descriptors. Based on the proposed classification scheme, we quantify, analyse and present the different types of new descriptors introduced in MeSH during the last fifteen years. The results show that only about 25% of new MeSH descriptors correspond to new emerging concepts, whereas the rest were previously covered by one or more existing descriptors, either implicitly or explicitly. Most of them were covered by a single existing descriptor and they usually end up as descendants of it in the current hierarchy, gradually leading towards a more fine-grained MeSH vocabulary. These insights about the dynamics of the thesaurus are useful for the retrospective study of scientific articles annotated with MeSH, but could also be used to inform the policy of updating the thesaurus in the future.

PDF Abstract


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here