Two-stage approach to feature set optimization for unsupervised dataset with heterogeneous attributes

Authors:

Highlights:

• A new unsupervised feature selection method is proposed for heterogeneous datasets.

• Relevant and non-redundant features are selected without prior data transformation.

• The proposed algorithm is suitable for high-dimensional data.

• The proposed algorithm scales to data of any size.

• A rigorous comparative study is carried out to demonstrate the efficacy of the proposed mechanism.

Keywords: Feature selection, Feature ranking, Normalized mutual information, Unsupervised learning, Hybrid feature set optimization

Review history: Received 23 September 2019, Revised 19 December 2020, Accepted 31 December 2020, Available online 7 January 2021, Version of Record 10 February 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.114563