Preventing human error: The impact of data entry methods on data accuracy and statistical results

Authors:

Highlights:

Abstract

Human data entry can result in errors that ruin statistical results and conclusions. A single data entry error can reduce a moderate correlation to zero and render a significant t-test non-significant. Therefore, researchers should design and use human-computer interactions that minimize data entry errors. In this study, 195 undergraduates were randomly assigned to one of three data entry methods: double entry, visual checking, or single entry. After training in their assigned method, participants entered 30 data sheets, each containing six types of data. Visual checking resulted in 2958% more errors than double entry, and was not significantly better than single entry. These data entry errors sometimes had severe effects on coefficient alphas, correlations, and t-tests. For example, 66% of the visual checking participants produced incorrect values for coefficient alpha, which was sometimes wrong by more than .40. Moreover, these data entry errors would be hard to detect: only 0.06% of the errors were blank or outside the allowable range for the variables. Thus, researchers cannot rely on histograms and frequency tables to detect data entry errors. Single entry and visual checking should be replaced with more effective data entry methods, such as double entry.
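To make the contrast between double entry and range checking concrete, the following is a minimal Python sketch, not the procedure or software used in the study. It assumes each data sheet has been typed in twice as a list of rows; the function names `find_mismatches` and `out_of_range`, the 1 to 5 rating scale, and the sample values are all hypothetical and chosen only for illustration.

```python
def find_mismatches(first_entry, second_entry):
    """Compare two independent entries of the same sheets, cell by cell.

    Returns a list of (row, column, first_value, second_value) tuples,
    one for every cell where the two entries disagree.
    """
    if len(first_entry) != len(second_entry):
        raise ValueError("both entries must contain the same number of rows")
    mismatches = []
    for row_idx, (row_a, row_b) in enumerate(zip(first_entry, second_entry)):
        if len(row_a) != len(row_b):
            raise ValueError(f"row {row_idx} has a different length in each entry")
        for col_idx, (a, b) in enumerate(zip(row_a, row_b)):
            if a != b:
                mismatches.append((row_idx, col_idx, a, b))
    return mismatches


def out_of_range(entry, low, high):
    """Flag cells that are blank (None) or outside the allowable range."""
    return [
        (row_idx, col_idx, value)
        for row_idx, row in enumerate(entry)
        for col_idx, value in enumerate(row)
        if value is None or not (low <= value <= high)
    ]


# Hypothetical ratings on an assumed 1-5 scale (illustrative values only).
first = [[1, 4, 3], [5, 2, 2], [3, 3, 4]]
second = [[1, 4, 3], [5, 5, 2], [3, 3, 4]]  # a 2 was mistyped as 5

mismatches = find_mismatches(first, second)
range_flags = out_of_range(second, 1, 5)

print(mismatches)   # cells where the two entries disagree
print(range_flags)  # cells a range check alone would catch
```

In this sketch the mistyped value is still a legal rating, so the range check reports nothing while the cell-by-cell comparison flags the discrepancy. A flagged cell would then be resolved by consulting the original paper sheet.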

Keywords: Data entry, Double entry, Visual checking, Outliers, Data cleaning

Publication history: Available online 4 May 2011.

Official paper URL: https://doi.org/10.1016/j.chb.2011.04.004