Rule based joins in heterogeneous databases
Authors:
Abstract
This paper analyzes the problem of computing joins in heterogeneous databases. Rules are combined with a probabilistic framework to resolve the data heterogeneity problem. The Entity join operator is defined to identify and join records across databases. A certain amount of uncertainty is associated with this Entity join model because records may be matched incorrectly. While the rule-based approach captures the data semantics, the probabilistic framework models the uncertainty and provides a formal measure of the accuracy of the Entity join. Representing the values of mismatched attributes is difficult because the true value of an attribute cannot be identified from the conflicting values. Probabilistic partial values are used to represent these attribute values so that user preferences and the reliability of the data can be taken into account.
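To make the idea concrete, here is a minimal Python sketch of how a rule-based match probability and probabilistic partial values might fit together. It is not the paper's formulation: the `Rule` class, the noisy-OR combination of rule confidences, the reliability-weighted normalization, the threshold, and all sample data are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
PartialValue = Dict[str, float]  # candidate value -> probability


@dataclass
class Rule:
    """Hypothetical matching rule: a predicate plus an assumed confidence."""
    name: str
    predicate: Callable[[Record, Record], bool]
    confidence: float  # assumed probability that a pair satisfying the rule matches


def match_probability(r: Record, s: Record, rules: List[Rule]) -> float:
    """Combine the rules that fire with noisy-OR (an assumption of this sketch)."""
    p_no_match = 1.0
    for rule in rules:
        if rule.predicate(r, s):
            p_no_match *= 1.0 - rule.confidence
    return 1.0 - p_no_match


def partial_value(candidates: List[Tuple[str, float]]) -> PartialValue:
    """Turn (value, source reliability) pairs into a normalized distribution."""
    weights: Dict[str, float] = {}
    for value, reliability in candidates:
        weights[value] = weights.get(value, 0.0) + reliability
    total = sum(weights.values())
    return {value: weight / total for value, weight in weights.items()}


def entity_join(left: List[Record], right: List[Record], rules: List[Rule],
                left_rel: float, right_rel: float,
                threshold: float = 0.5) -> List[Tuple[Dict[str, PartialValue], float]]:
    """Pair records whose match probability reaches the threshold.

    Each output tuple carries its match probability, and every attribute is
    stored as a partial value so conflicting source values are both kept.
    """
    joined = []
    for r in left:
        for s in right:
            p = match_probability(r, s, rules)
            if p < threshold:
                continue
            merged: Dict[str, PartialValue] = {}
            for attr in sorted(set(r) | set(s)):
                candidates = []
                if attr in r:
                    candidates.append((r[attr], left_rel))
                if attr in s:
                    candidates.append((s[attr], right_rel))
                merged[attr] = partial_value(candidates)
            joined.append((merged, p))
    return joined


# Illustrative rules and data (all hypothetical).
RULES = [
    Rule("same_name", lambda r, s: r["name"] == s["name"], 0.6),
    Rule("same_phone", lambda r, s: r["phone"] == s["phone"], 0.5),
]
DB_A = [{"name": "Ann Lee", "phone": "555-0100", "city": "Boston"}]
DB_B = [
    {"name": "Ann Lee", "phone": "555-0100", "city": "Cambridge"},
    {"name": "Bob Ray", "phone": "555-0199", "city": "Austin"},
]

RESULT = entity_join(DB_A, DB_B, RULES, left_rel=0.9, right_rel=0.6)
for merged_record, probability in RESULT:
    print(round(probability, 3), merged_record["city"])
```

In this sketch the two "Ann Lee" records are joined because both rules fire, while the "Bob Ray" record is dropped because no rule fires. The conflicting `city` values are kept as a distribution weighted toward the source assumed to be more reliable, rather than being resolved to a single value.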
Keywords: Federated database, Database management system, Structural heterogeneity, Semantic heterogeneity, Rules, Approximation, Join, Tuple probability, Probabilistic partial value
Publication history: Available online 22 December 1999.
Official paper URL: https://doi.org/10.1016/0167-9236(93)E0048-I