XML document clustering: techniques and challenges

Authors: Elaheh Asghari, MohammadReza KeyvanPour

Abstract

The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage such semi-structured data. In recent years, because of the importance of managing these resources and extracting knowledge from them, many methods have been proposed to represent and cluster them in different ways. Various similarity measures have been extended, and in some contexts semantic issues have also been taken into account. In this paper, we review XML clustering methods, considering different representation models, such as tree-based and vector-based ones, together with the similarity measures they use. We also propose a taxonomy of these methods.

Keywords: XML document, Similarity measure, Clustering algorithm, Cluster quality, Semantic clustering

Official paper URL: https://doi.org/10.1007/s10462-012-9379-2