A filter attribute selection method based on local reliable information

作者：Ricardo Martín, Ricardo Aler, Inés M. Galván

摘要

In this article, a filter feature weighting technique for attribute selection in classification problems is proposed (LIA). It has two main characteristics. First, unlike feature weighting methods, it is able to consider attribute interactions in the weighting process, rather than only evaluating single features. Attribute subsets are evaluated by projecting instances into a grid defined by attributes in the subset. Then, the joint relevance of the subset is computed by measuring the information present in the cells of the grid. The final weight for each attribute is computed by taking into account its performance in each of the grids it participates. Second, many real problems contain low signal-to-noise ratios, due to instance of high noise levels, class overlap, class imbalance, or small training samples. LIA computes reliable local information for each of the cells by estimating the number of target class instances not due to chance, given a confidence value. In order to study its properties, LIA has been evaluated with a collection of 18 real datasets and compared to two feature weighting methods (Chi-Squared and ReliefF) and a subset feature selection algorithm (CFS). Results show that the method is significantly better in many cases, and never significantly worse. LIA has also been tested with different grid dimensions (1, 2, and 3). The method works best when evaluating attribute subsets larger than 1, hence showing the usefulness of considering attribute interactions.

论文关键词：Feature selection, Filter methods, Feature weighting methods, Noisy domains

论文评审过程：

论文官网地址：https://doi.org/10.1007/s10489-017-0959-3