MAR: Maximum Attribute Relative of soft set for clustering attribute selection

Authors:

Highlights:

Abstract

Clustering, which groups a set of categorical data into homogeneous classes, is a fundamental operation in data mining. One technique for data clustering is to introduce a clustering attribute. A number of algorithms have been proposed to address the problem of clustering attribute selection. However, the performance of these algorithms is still an issue because of their high computational complexity. This paper proposes a new algorithm called Maximum Attribute Relative (MAR) for clustering attribute selection. It is based on soft set theory and introduces the concept of the attribute relative in information systems. In experiments on fourteen UCI datasets and a supplier dataset, the proposed algorithm achieved a lower computational time than three rough set-based algorithms, i.e. TR, MMR, and MDA, by up to 62%, 64%, and 40% respectively, and than a soft set-based algorithm, i.e. NSS, by up to 33%. Furthermore, MAR scales well: its execution time tends to increase linearly as the number of instances and the number of attributes increase.
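
As a rough illustration of the soft set view of an information system that the abstract refers to, the sketch below decomposes a small categorical table into one soft set per attribute, where each attribute value maps to the set of objects that carry it. This is only an assumed, minimal representation: the function name `to_multi_soft_set` and the toy data are hypothetical, and the sketch does not implement MAR itself, since the definition of the attribute relative is not given in this abstract.

```python
from collections import defaultdict


def to_multi_soft_set(rows, attributes):
    """Build one soft set per attribute.

    rows: dict mapping object id -> dict of attribute -> categorical value.
    Returns: dict mapping attribute -> {value: set of object ids with that value}.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    multi = {}
    for attr in attributes:
        soft = defaultdict(set)
        for obj_id, row in rows.items():
            # Each (attribute, value) pair acts as a parameter whose
            # image is the set of objects that have that value.
            soft[row[attr]].add(obj_id)
        multi[attr] = dict(soft)
    return multi


# Toy categorical table (made-up data, for illustration only).
rows = {
    1: {"color": "red", "size": "small"},
    2: {"color": "red", "size": "large"},
    3: {"color": "blue", "size": "small"},
}

multi = to_multi_soft_set(rows, ["color", "size"])
```

A clustering attribute selection method would then score each attribute from these value-to-object mappings and pick the best-scoring one; how MAR computes that score is left to the full paper.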

Keywords: Data mining, Soft set theory, Clustering attributes, Attribute relative, Complexity

Review history: Received 13 July 2012, Revised 15 May 2013, Accepted 15 May 2013, Available online 26 July 2013.

Paper URL: https://doi.org/10.1016/j.knosys.2013.05.009