Modeling a century of citation distributions

Authors:

Abstract

The prevalence of uncited papers or of highly cited papers, with respect to the bulk of publications, provides important clues as to the dynamics of scientific research. Using 25 million papers and 600 million references from the Web of Science over the 1900–2006 period, this paper proposes a simple model based on a random selection process to explain the “uncitedness” phenomenon and its decline over the years. We show that the proportion of cited papers is a function of (1) the number of articles available (the competing papers), (2) the number of citing papers and (3) the number of references they contain. Using uncitedness as a departure point, we demonstrate the utility of the stretched-exponential function and a form of the Tsallis q-exponential function to fit complete citation distributions over the 20th century. As opposed to simple power-law fits, for instance, both these approaches are shown to be empirically well-grounded and robust enough to better understand citation dynamics at the aggregate level. On the basis of these models, we provide quantitative evidence and provisional explanations for an important shift in citation practices around 1960. We also propose a revision of the “citation classic” category as a set of articles which is clearly distinguishable from the rest of the field.
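The abstract names three drivers of the cited proportion and two families of fitting functions without giving formulas. The sketch below is an illustration only, not the authors' model. It assumes every citing paper picks its references uniformly at random, without repetition, from the pool of available papers, and it uses common textbook forms of the stretched exponential and the Tsallis q-exponential. All function and parameter names (`expected_cited_fraction`, `n_available`, `n_citing`, `refs_per_paper`, `x0`, `beta`, `q`, `lam`) are hypothetical.

```python
import math


def expected_cited_fraction(n_available, n_citing, refs_per_paper):
    """Expected share of papers cited at least once.

    Assumption (not from the paper): each of the n_citing papers draws
    refs_per_paper distinct references uniformly at random from the
    n_available competing papers.
    """
    if n_available <= 0 or not 0 <= refs_per_paper <= n_available:
        raise ValueError("need n_available > 0 and 0 <= refs_per_paper <= n_available")
    # Probability that one citing paper misses a given available paper.
    p_missed_once = 1.0 - refs_per_paper / n_available
    # Citing papers are treated as independent, so the misses multiply.
    return 1.0 - p_missed_once ** n_citing


def stretched_exponential(x, x0, beta):
    """Textbook stretched exponential exp(-(x / x0) ** beta), for x >= 0."""
    return math.exp(-((x / x0) ** beta))


def q_exponential(x, q, lam):
    """Textbook Tsallis q-exponential decay [1 + (q - 1) x / lam] ** (-1 / (q - 1)).

    Reduces to exp(-x / lam) as q approaches 1; for q > 1 the tail is a power law.
    The exact parameterization used in the paper is not given in the abstract.
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(-x / lam)
    base = 1.0 + (q - 1.0) * x / lam
    if base <= 0.0:
        # For q < 1 the function is cut off at zero beyond x = lam / (1 - q).
        return 0.0
    return base ** (-1.0 / (q - 1.0))
```

Under these assumptions, calling `expected_cited_fraction` with more citing papers or longer reference lists raises the cited share, while a larger pool of competing papers lowers it. That matches the direction of the three dependencies listed in the abstract.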

Keywords: Citations, Bibliometrics, Citation distributions, Uncitedness, History of science

Article history: Received 12 January 2009, Revised 27 March 2009, Accepted 31 March 2009, Available online 6 May 2009.

Official paper URL: https://doi.org/10.1016/j.joi.2009.03.010