Disambiguating authors in citations on the web and authorship correlations

Abstract

Members of the academic community increasingly turn to digital libraries to find the latest work of their peers. Given this role, digital libraries must collect citations consistently, accurately, and promptly. Yet they fail to compile citations correctly for many authors, for reasons that include the “name ambiguity problem.” This problem occurs when multiple authors share the same name, particularly when names are abbreviated to a first initial and a last name. This paper proposes a reliable and accurate pair-wise similarity approach that disambiguates names using supervised classification on Web correlations and authorship correlations. Web correlations rest on the assumption that citations listed together on publication lists on the Web should refer to the same author. Authorship correlations rest on the assumptions that citations with the same rare author name refer to the same author and that citations sharing authors' full names or e-mail addresses likely refer to the same author. Both types of correlation are measured with pair-wise similarity metrics. A binary classifier, applied as part of supervised classification, uses these metrics to label matching pairs of citations, and the labels are then used to group citations into clusters such that each cluster represents an individual author. Results show that our approach greatly improves on the name disambiguation accuracy and performance of other proposed approaches, especially for name clusters with a high degree of ambiguity.
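The pipeline described in the abstract (pair-wise similarity metrics, a binary match decision per pair, then grouping into per-author clusters) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the citation fields `pages`, `emails`, and `full_names`, the Jaccard overlap used as a Web-correlation metric, the hand-written threshold rule standing in for the trained binary classifier, and the use of transitive closure (union-find) to turn pair labels into clusters. The rare-name correlation is omitted for brevity.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(c1, c2):
    """Pair-wise similarity metrics for two citations (hypothetical fields)."""
    return {
        # Web correlation: overlap of the Web pages that list each citation.
        "web": jaccard(c1["pages"], c2["pages"]),
        # Authorship correlations: shared e-mail address or shared full name.
        "email": bool(c1["emails"] & c2["emails"]),
        "full_name": bool(c1["full_names"] & c2["full_names"]),
    }


def is_match(features, web_threshold=0.5):
    """Stand-in for the trained binary classifier (assumed rule, not the paper's)."""
    return (
        features["email"]
        or features["full_name"]
        or features["web"] >= web_threshold
    )


def cluster(citations):
    """Group citations by taking the transitive closure of matching pairs."""
    parent = list(range(len(citations)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(citations)), 2):
        if is_match(pair_features(citations[i], citations[j])):
            parent[find(i)] = find(j)

    groups = {}
    for i, citation in enumerate(citations):
        groups.setdefault(find(i), []).append(citation["id"])
    return sorted(groups.values())
```

In a supervised setting, `is_match` would be replaced by a classifier trained on labeled citation pairs, with the output of `pair_features` as its input vector. Transitive closure is the simplest way to turn pair labels into clusters, but a single false-positive pair merges two authors, so a stricter grouping rule may be preferable in practice.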

Keywords: Author name disambiguation, Citation analysis, Web correlation, Authorship correlation

Publication history: Available online 5 March 2012.

Official paper URL: https://doi.org/10.1016/j.eswa.2012.02.121