A sensitivity analysis of factors influential to the popularity of shared data in data repositories

Authors:

Highlights:

• This paper used a neural-network-based method to analyze the factors that influence dataset popularity in the UCI data repository.

• The sensitivity degree of each factor was used as a weight to re-rank the datasets and predict which ones become highly popular.

• We examined whether the relationship between factors and popularity differs by subject domain within the UCI repository.

• The GitHub data repository was used to evaluate the applicability of the proposed framework to other types of factor analysis.
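The idea behind the first two highlights, measuring how strongly a trained model's output reacts to each input factor and then using those sensitivities as weights to re-rank items, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the single sigmoid unit stands in for a trained neural network, the three features and their values are hypothetical, and the finite-difference sensitivity measure and the weighted-sum ranking score are generic choices, not necessarily the ones used in the paper.

```python
import math

# Hypothetical stand-in for a trained network: one sigmoid unit with fixed weights.
MODEL_WEIGHTS = [2.0, 0.5, -1.0]
MODEL_BIAS = 0.0

def predict(x):
    """Return a popularity score in (0, 1) for one feature vector."""
    z = sum(w * xi for w, xi in zip(MODEL_WEIGHTS, x)) + MODEL_BIAS
    return 1.0 / (1.0 + math.exp(-z))

def sensitivities(predict_fn, rows, eps=1e-4):
    """Mean absolute central-difference sensitivity of the output to each feature."""
    n_features = len(rows[0])
    totals = [0.0] * n_features
    for row in rows:
        for j in range(n_features):
            up = list(row)
            down = list(row)
            up[j] += eps
            down[j] -= eps
            totals[j] += abs(predict_fn(up) - predict_fn(down)) / (2 * eps)
    return [t / len(rows) for t in totals]

def rerank(rows, weights):
    """Return row indices ordered by sensitivity-weighted score, highest first."""
    scores = [sum(w * v for w, v in zip(weights, row)) for row in rows]
    return sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)

# Hypothetical, already-normalized factor values for three datasets.
ROWS = [
    [0.2, 0.9, 0.1],
    [0.8, 0.1, 0.5],
    [0.5, 0.5, 0.4],
]

if __name__ == "__main__":
    sens = sensitivities(predict, ROWS)
    print("sensitivity per factor:", sens)
    print("re-ranked dataset indices:", rerank(ROWS, sens))
```

Note that this sketch uses absolute sensitivities, so it captures how influential a factor is but not the direction of its effect; a signed variant would be needed to separate factors that raise popularity from those that lower it.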

Keywords: Data repository, Sensitivity analysis, Neural network, UCI repository, GitHub

Review history: Received 17 July 2020, Revised 31 January 2021, Accepted 3 February 2021, Available online 27 February 2021, Version of Record 27 February 2021.

Official paper URL: https://doi.org/10.1016/j.joi.2021.101142