Development and evaluation of a continuous-time Markov chain model for detecting and handling data currency declines
Authors:
Highlights:
• Development of a Continuous-Time Markov Chain model for data quality management
• Estimating data currency levels, and predicting future declines
• Recommending if and when to reacquire data, considering cost-benefit tradeoffs
• Evaluation in the context of insurance claim handling, using real-world data
Abstract
Data currency declines, caused by recorded data values becoming outdated, can damage the usability and accountability of data resources. Detecting and updating outdated values may improve data currency and reduce the associated damage, but such efforts may be costly and cannot always be justified. This study models currency decline scenarios using a continuous-time Markov chain stochastic process with a finite number of states, each reflecting a valid data value. The model considers state transition probabilities, transition time distributions, and the tradeoff between the damage associated with outdated data and the cost of reacquisition. The proposed formulation permits the currency level to be estimated without having to rely on a baseline for comparison, as well as the prediction of future currency declines, assessment of their accumulated damage, and optimization of the timing of cost-effective data auditing and reacquisition. The study introduces a comprehensive evaluation of the proposed model, using a large real-world dataset relating to the handling of insurance claims over multiple time periods. The evaluation results highlight the applicability of the model, and its potential contribution to proactive data quality management and cost-effective handling of currency declines.
Keywords: Data quality management, Data currency, Continuous-time Markov chain
Article history: Received 14 July 2016, Revised 9 September 2017, Accepted 20 September 2017, Available online 21 September 2017, Version of Record 22 October 2017.
Official URL: https://doi.org/10.1016/j.dss.2017.09.006
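
The tradeoff described in the abstract, between the damage caused by outdated values and the cost of reacquisition, can be illustrated with a deliberately simplified sketch. It is not the paper's formulation. It assumes a single recorded value whose holding time is exponential with a known exit rate, damage that accrues at a constant rate while the value is outdated, and a fixed cost per reacquisition. The function names, parameters, and numbers below are hypothetical and chosen only for illustration.

```python
import math


def currency(rate, t):
    """Probability that a value recorded at time 0 is still current at time t.

    Assumption: the holding time in the recorded state is exponential with
    the given exit rate, and the value never returns to the recorded state.
    """
    return math.exp(-rate * t)


def expected_damage(rate, damage_per_unit_time, horizon):
    """Expected damage accumulated over [0, horizon].

    Assumption: damage accrues at a constant rate while the value is
    outdated, so this integrates damage * (1 - currency(t)) from 0 to horizon.
    """
    return damage_per_unit_time * (
        horizon - (1.0 - math.exp(-rate * horizon)) / rate
    )


def cost_rate(rate, damage_per_unit_time, reacquisition_cost, interval):
    """Average cost per unit time when the value is reacquired every `interval`."""
    total = reacquisition_cost + expected_damage(rate, damage_per_unit_time, interval)
    return total / interval


def best_interval(rate, damage_per_unit_time, reacquisition_cost, candidates):
    """Pick the candidate interval with the lowest average cost per unit time."""
    return min(
        candidates,
        key=lambda interval: cost_rate(
            rate, damage_per_unit_time, reacquisition_cost, interval
        ),
    )


if __name__ == "__main__":
    # Hypothetical inputs: exit rate 0.1 per period, damage 10 per period
    # while outdated, reacquisition cost 5, intervals from 0.5 to 100 periods.
    grid = [0.5 * k for k in range(1, 201)]
    chosen = best_interval(0.1, 10.0, 5.0, grid)
    print("currency after 3 periods:", currency(0.1, 3.0))
    print("chosen reacquisition interval:", chosen)
```

Reacquiring very often makes the reacquisition cost dominate, while reacquiring rarely lets expected damage accumulate, so the average cost per unit time is lowest at an intermediate interval. A grid search is used here only for simplicity. The model in the paper has a finite number of states with their own transition probabilities and transition time distributions, which this single-rate sketch does not capture.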