Improving data quality through effective use of data semantics

Authors:

Abstract

Data quality issues have taken on increasing importance in recent years. In our research, we have discovered that many “data quality” problems are actually “data misinterpretation” problems—that is, problems caused by heterogeneous data semantics. In this paper, we first identify semantic heterogeneities that, when not resolved, often cause data quality problems. We discuss the especially challenging problem of aggregational ontological heterogeneity, which concerns how complex entities and their relationships are aggregated. Then we illustrate how COntext INterchange (COIN) technology can be used to capture data semantics and reconcile semantic heterogeneities, thereby improving data quality.

Keywords: Data quality, Data semantics, Semantic heterogeneity, Ontology, Context

Review history: Received 5 October 2005, Accepted 5 October 2005, Available online 8 November 2005.

Official paper URL: https://doi.org/10.1016/j.datak.2005.10.001