Individual attribute prior setting methods for naïve Bayesian classifiers

Authors:

Highlights:

Abstract

The generalized Dirichlet distribution has been shown to be a more appropriate prior for naïve Bayesian classifiers than the Dirichlet distribution, because it relaxes both the negative-correlation and the equal-confidence requirements of the latter. Previous research did not take the impact of individual attributes on classification accuracy into account, and therefore assumed that all attributes follow the same generalized Dirichlet prior. In this study, the selective naïve Bayes mechanism is employed to choose and rank attributes, and two methods are then proposed to search for the best prior of each individual attribute according to the attribute ranks. Experimental results on 18 data sets show that the best approach is to use selective naïve Bayes to filter and rank the attributes while all of them have Dirichlet priors with Laplace's estimate. Once the ranks of the chosen attributes are determined, individual setting is performed to search for the best noninformative generalized Dirichlet prior for each attribute. Selective naïve Bayes is also compared with two representative filters for feature selection, and the experimental results show that it performs best.
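The first stage described above, filtering and ranking attributes with selective naïve Bayes under Laplace's estimate, can be illustrated with a minimal Python sketch. It assumes the common greedy forward-selection formulation of selective naïve Bayes and leave-one-out accuracy as the selection criterion; the abstract specifies neither, so both are assumptions. All function names (`train_nb`, `predict`, `loo_accuracy`, `selective_nb`) are hypothetical, and the subsequent search for a generalized Dirichlet prior for each attribute is not shown.

```python
from collections import Counter, defaultdict
import math


def train_nb(rows, labels, attrs, alpha=1.0):
    """Count class and attribute-value frequencies for the given attributes.

    alpha=1.0 corresponds to Laplace's estimate (a symmetric Dirichlet prior).
    """
    classes = sorted(set(labels))
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attr, class) -> value -> count
    domains = defaultdict(set)           # attr -> values seen in training
    for row, y in zip(rows, labels):
        for a in attrs:
            value_counts[(a, y)][row[a]] += 1
            domains[a].add(row[a])
    return classes, class_counts, value_counts, domains, alpha


def predict(model, row, attrs):
    """Return the class with the highest smoothed log-posterior score."""
    classes, class_counts, value_counts, domains, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c in classes:
        score = math.log((class_counts[c] + alpha) / (n + alpha * len(classes)))
        for a in attrs:
            k = len(domains[a])
            count = value_counts[(a, c)][row[a]]
            score += math.log((count + alpha) / (class_counts[c] + alpha * k))
        if score > best_score:
            best, best_score = c, score
    return best


def loo_accuracy(rows, labels, attrs):
    """Leave-one-out accuracy (an assumed criterion, not from the paper)."""
    hits = 0
    for i in range(len(rows)):
        model = train_nb(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:], attrs)
        hits += int(predict(model, rows[i], attrs) == labels[i])
    return hits / len(rows)


def selective_nb(rows, labels, n_attrs):
    """Greedy forward selection; the order of selection serves as the rank."""
    selected, remaining = [], list(range(n_attrs))
    best_acc = loo_accuracy(rows, labels, selected)
    while remaining:
        scored = [(loo_accuracy(rows, labels, selected + [a]), a) for a in remaining]
        # Prefer higher accuracy; break ties by the lower attribute index.
        acc, a = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best_acc:  # stop when no attribute improves accuracy
            break
        selected.append(a)
        remaining.remove(a)
        best_acc = acc
    return selected, best_acc
```

Here `rows` is a list of tuples of categorical values and `labels` a list of class labels. Attributes that never improve accuracy are filtered out, and the returned list orders the kept attributes by the step at which they were added, which is one plausible reading of the ranking used before the per-attribute prior search.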

Keywords: Dirichlet distribution, Generalized Dirichlet distribution, Naïve Bayesian classifier, Prior distribution, Selective naïve Bayes

Review history: Received 2 September 2009, Revised 1 September 2010, Accepted 5 November 2010, Available online 11 November 2010.

Official paper URL: https://doi.org/10.1016/j.patcog.2010.11.002