Interpretation of text patterns

Authors: Md Abul Bashar, Yuefeng Li

Abstract

Patterns are a fundamental means of analysing data in many text mining applications, and many efficient techniques have been developed to discover them. However, the excessive number of discovered patterns and their lack of grounded (e.g. a priori defined) semantics make it difficult for a user to interpret and explore them. Insight into the meanings of the patterns can help users explore them. To this end, this paper presents a model that automatically interprets patterns by achieving two goals: (1) providing the meanings of patterns in terms of ontology concepts and (2) providing a new method for generating and extracting features from an ontology to describe the relevant information more effectively. Using a domain ontology and a set of relevant statistics (e.g. term frequency in a document, inverse term frequency in a domain ontology, etc.), the proposed model gives insight into the hidden meanings of the patterns. The model is evaluated against several baseline models on three standard datasets. The results show that the proposed model performs significantly better than the baseline models.

Keywords: Text mining, Frequent pattern, Pattern interpretation, Conceptual annotation, Contextual weighting, Information filtering

Official paper URL: https://doi.org/10.1007/s10618-018-0556-z