Citations Beyond Self Citations: Identifying Authors, Affiliations, and Nationalities in Scientific Papers

WOSP 2020  ·  Yoshitomo Matsubara, Sameer Singh

The question of the utility of the blind peer-review system is fundamental to scientific research. Some studies investigate exactly how “blind” the papers are in the double-blind review system by manually or automatically identifying the true authors, mainly suggesting the number of self-citations in the submitted manuscripts as the primary signal for identity. However, related work on automated approaches is limited by the size of its datasets and its restricted experimental setups, and thus lacks practical insights into the blind review process. In this work, we train models that identify the authors, their affiliations, and their nationalities through real-world, large-scale experiments on the Microsoft Academic Graph, including the cold start scenario. Our models are accurate; we identify at least one of the authors, affiliations, and nationalities of held-out papers with 40.3%, 47.9%, and 86.0% accuracy respectively, from the top-10 guesses of our models. However, through insights from the model, we demonstrate that these entities are identifiable with a small number of guesses primarily by using a combination of self-citations, social citations, and common citations. Moreover, our further analysis of the results leads to interesting findings, such as that prominent affiliations are easily identifiable (e.g., 93.8% of test papers written by Microsoft are identified with top-10 guesses). The experimental results show, against conventional belief, that self-citations are no more informative than common citations, suggesting that removing self-citations is not sufficient for authors to maintain their anonymity.
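The top-10 figures above count a paper as identified when at least one of its true entities (author, affiliation, or nationality) appears among the model's ten highest-ranked guesses. A minimal sketch of that kind of metric follows; the function names and the toy data are hypothetical and are not taken from the paper's code.

```python
from typing import Sequence, Set


def hit_at_k(ranked_guesses: Sequence[str], true_entities: Set[str], k: int = 10) -> bool:
    """Return True if any of the top-k ranked guesses is a true entity."""
    return any(guess in true_entities for guess in ranked_guesses[:k])


def hits_at_k_accuracy(
    predictions: Sequence[Sequence[str]],
    gold: Sequence[Set[str]],
    k: int = 10,
) -> float:
    """Fraction of papers with at least one true entity in the top-k guesses."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not predictions:
        return 0.0
    hits = sum(hit_at_k(p, g, k) for p, g in zip(predictions, gold))
    return hits / len(predictions)


# Toy example (made-up identifiers): two papers, ranked guesses per paper.
toy_predictions = [["a", "b", "c"], ["x", "y", "z"]]
toy_gold = [{"c", "q"}, {"w"}]
toy_accuracy = hits_at_k_accuracy(toy_predictions, toy_gold, k=3)
```

In the toy data only the first paper has a true entity among its top three guesses, so one of the two papers counts as identified.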


