Mining semantic association rules from RDF data

作者:

Highlights:

摘要

The Semantic Web opens up new opportunities for the data mining research. Semantic Web data is usually represented in the RDF triple format (subject, predicate, object). Large RDF-style Knowledge Bases contain hundreds of millions of RDF triples that represent knowledge in a machine-understandable format. Association rule mining is one of the most effective techniques for detecting frequent patterns. In the context of Semantic Web data mining, most existing methods rely on users intervention that is time-consuming and error-prone due to a large amount of data. Meanwhile, rule quality factors (e.g. support and confidence) usually consider knowledge at the instance-level. Namely, these factors disregard the knowledge embedded at the schema-level. In this paper, we demonstrate that ignoring knowledge encoded at the schema-level negatively impacts the interpretation of discovered rules. We introduce an approach called SWARM (Semantic Web Association Rule Mining) that automatically mines Semantic Association Rules from RDF data. The main achievement of SWARM is to reveal common behavioural patterns associated with knowledge at the instance-level and schema-level. We discuss how to utilize knowledge encoded at the schema-level to add more semantics to the rules. We compare the semantic of rules discovered by SWRAM with one of the latest approaches in this field to show the importance of considering schema-level knowledge. Initial experiments performed on RDF-style Knowledge Bases demonstrate the effectiveness of the proposed approach.

论文关键词:Semantic Web data,Association rule mining,Ontology,Knowledge discovery

论文评审过程:Received 14 July 2016, Revised 30 June 2017, Accepted 8 July 2017, Available online 10 July 2017, Version of Record 4 September 2017.

论文官网地址:https://doi.org/10.1016/j.knosys.2017.07.009