MCFS: Min-cut-based feature-selection

Authors:

Highlights:

Abstract

This paper presents MCFS (Min-Cut-based Feature Selection), a feature-selection algorithm that represents the features of a dataset as a directed graph. Our main contribution is to show that a general graph-processing technique is useful for the feature-selection problem in classification datasets. The vertices of the graph are the features plus two special-purpose vertices: one denotes high correlation with the class feature of the dataset, and the other denotes low correlation with it. The edge weights are functions of the correlations among the features and between the features and the class. A classic max-flow min-cut algorithm is applied to this graph, and the cut it returns determines the selected features. We compared our proposal with well-known feature-selection techniques. In terms of the number of features selected, MCFS obtains results statistically similar to those of the other techniques, while significantly improving accuracy.
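The graph construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the edge functions, so the capacities below are assumptions chosen only for illustration: the arc from the "high correlation" vertex `s` to feature `i` carries |corr(i, class)|, the arc from feature `i` to the "low correlation" vertex `t` carries 1 - |corr(i, class)|, and arcs between features carry their absolute pairwise correlation. The function names `min_cut_source_side` and `select_features` are hypothetical, and the max-flow routine is a plain Edmonds-Karp implementation, not necessarily the one used in the paper.

```python
from collections import deque


def min_cut_source_side(cap, s, t, eps=1e-12):
    """Run Edmonds-Karp max flow on `cap` (dict of dicts of capacities) and
    return the vertices reachable from s in the final residual graph,
    i.e. the source side of a minimum s-t cut."""
    # Residual capacities, with a zero-capacity reverse arc for every arc.
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0.0)
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path left: the visited set is the source side.
            return set(parent)
        # Recover the path, then push the bottleneck capacity along it.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push


def select_features(class_corr, feat_corr):
    """Return indices of features on the source side of the minimum cut.
    The capacity choices are illustrative assumptions, not the paper's."""
    n = len(class_corr)
    cap = {"s": {}, "t": {}}
    for i in range(n):
        relevance = abs(class_corr[i])
        cap["s"][i] = relevance            # "high correlation" vertex -> feature
        cap[i] = {"t": 1.0 - relevance}    # feature -> "low correlation" vertex
    for i in range(n):
        for j in range(n):
            if i != j and feat_corr[i][j] != 0:
                cap[i][j] = abs(feat_corr[i][j])  # feature -> feature
    side = min_cut_source_side(cap, "s", "t")
    return sorted(v for v in side if v not in ("s", "t"))


# Toy correlations (made up for this example, not taken from the paper).
class_corr = [0.9, 0.8, 0.1]
feat_corr = [[1.0, 0.5, 0.0],
             [0.5, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
print(select_features(class_corr, feat_corr))  # prints [0, 1]
```

In this toy case the two features that correlate strongly with the class stay on the source side of the cut and are selected, while the weakly correlated feature falls on the sink side. With different capacity functions the cut, and hence the selection, would change.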

Keywords: Machine learning, Feature selection, Nearest neighbour, Correlations, Max-flow min-cut, Classification

Review history: Received 27 May 2019, Revised 29 January 2020, Accepted 1 February 2020, Available online 4 February 2020, Version of Record 4 April 2020.

Paper URL: https://doi.org/10.1016/j.knosys.2020.105604