The effect of data pre-processing on understanding the evolution of collaboration networks

Authors:

Highlights:

• Author names were disambiguated in three ways: algorithmically, by all initials of the given name, and by the first initial of the given name.

• Algorithmic disambiguation approximated the ground truth better than the initial-based methods.

• Initial-based methods distorted the size, degree, distance, and clustering of the coauthor network.

• The distortion of network properties by initial-based methods grew more severe over time.

• Initial-based methods produced degree distributions that seemingly follow a power law.
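
The two initial-based methods named in the highlights can be illustrated with a minimal sketch. It assumes that each method matches authors on surname plus either the first initial or all initials of the given name. The function names (`first_initial_key`, `all_initials_key`, `group_authors`) and the sample records are hypothetical and are not taken from the paper.

```python
def first_initial_key(surname, given_names):
    """Key built from the surname and only the first initial of the given name."""
    parts = given_names.split()
    initial = parts[0][0] if parts else ""
    return f"{surname}_{initial}".lower()


def all_initials_key(surname, given_names):
    """Key built from the surname and the initials of every given-name part."""
    initials = "".join(part[0] for part in given_names.split())
    return f"{surname}_{initials}".lower()


def group_authors(records, key_func):
    """Treat all name records that share a key as one author."""
    groups = {}
    for surname, given_names in records:
        groups.setdefault(key_func(surname, given_names), []).append(
            (surname, given_names)
        )
    return groups


# Hypothetical name records; the first and last are the same name string.
records = [
    ("Lee", "Jin Ho"),
    ("Lee", "John"),
    ("Lee", "Jae Won"),
    ("Lee", "Jin Ho"),
]

by_first = group_authors(records, first_initial_key)
by_all = group_authors(records, all_initials_key)
print(len(by_first), len(by_all))
```

In this toy input the first-initial key merges every record into a single author, while the all-initials key keeps three distinct authors. Merging of this kind is the mechanism by which initial-based methods can inflate node degree and shrink network size relative to the ground truth.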


Keywords: Collaboration network, Network evolution, Name ambiguity, Disambiguation

Review history: Received 21 September 2014, Revised 28 December 2014, Accepted 5 January 2015, Available online 17 January 2015.

Official page: https://doi.org/10.1016/j.joi.2015.01.002