Analyzing data quality issues in research information systems via data profiling

Authors:

Highlights:

• The paper presents data profiling methods for gaining an overview of data quality in the data sources before they are integrated into the research information system.

• With data profiling, institutions can evaluate their research information, report on its quality, and examine and correct data errors within their research information system.

• Data profiling methods can reduce project costs and the time institutions spend on tasks such as tracing unknown data holdings and identifying the causes of quality problems.

• Data profiling is considered an important component in improving data quality in research information systems.
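
As a rough illustration of what profiling a single source column can involve, the sketch below computes completeness, distinct-value counts, and value patterns. It is not the paper's method: the function names `value_pattern` and `profile_column`, the choice of metrics, and the sample publication years are all assumptions made for this example.

```python
from collections import Counter
import re


def value_pattern(value: str) -> str:
    """Map ASCII letters to 'A' and digits to '9' so format variants stand out."""
    pattern = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", pattern)


def profile_column(values):
    """Return simple profiling metrics for one column of string values.

    None and whitespace-only entries are counted as missing.
    """
    total = len(values)
    present = [v.strip() for v in values if v is not None and v.strip()]
    return {
        "total": total,
        "missing": total - len(present),
        "completeness": len(present) / total if total else 0.0,
        "distinct": len(set(present)),
        "patterns": Counter(value_pattern(v) for v in present),
    }


# Hypothetical sample: publication years as they might arrive from a source system.
years = ["2017", "2018", None, "", "18", "2018", "n.d."]
report = profile_column(years)

for key in ("total", "missing", "completeness", "distinct"):
    print(key, report[key])
for pattern, count in report["patterns"].most_common():
    print(pattern, count)
```

In this sample the pattern counts separate four-digit years from the two-digit and placeholder entries, which is the kind of overview that helps decide what to clean before loading the data into the research information system.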

Keywords: Current research information systems, CRIS, Research information systems, RIS, Research information, Data sources, Data quality, Extraction transformation load, ETL, Data analysis, Data profiling, Science system, Standardization

Article history: Received 30 January 2018, Accepted 24 February 2018, Available online 29 March 2018, Version of Record 29 March 2018.

Official paper URL: https://doi.org/10.1016/j.ijinfomgt.2018.02.007