Improving Tree augmented Naive Bayes for class probability estimation

Authors:

Highlights:

Abstract

Numerous algorithms have been proposed to improve Naive Bayes (NB) by weakening its conditional attribute independence assumption. Among them, Tree Augmented Naive Bayes (TAN) has demonstrated remarkable classification performance, measured by classification accuracy or error rate, while remaining efficient and simple. In many real-world applications, however, classification accuracy or error rate alone is not enough. For example, in direct marketing, we often need to deploy different promotion strategies to customers with different likelihoods (class probabilities) of buying some products. Accurate class probability estimation is therefore often required to make optimal decisions. In this paper, we investigate the class probability estimation performance of TAN in terms of conditional log likelihood (CLL) and present a new algorithm that improves it by averaging over the spanning TAN classifiers. We call our improved algorithm Averaged Tree Augmented Naive Bayes (ATAN). Experimental results on a large number of UCI datasets published on the main web site of the Weka platform show that ATAN significantly outperforms TAN and all the other algorithms compared in terms of CLL.
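
As a rough illustration of the evaluation measure named above, the following Python sketch computes CLL as the sum of the log probabilities that a classifier assigns to the true class of each instance, and shows one simple way to average the class probability estimates of several classifiers. The function names, the uniform (unweighted) average, and the `eps` clipping constant are assumptions made for this sketch; the abstract does not specify how ATAN combines its member classifiers, so this should not be read as the authors' implementation.

```python
import math

def conditional_log_likelihood(probs, labels, eps=1e-12):
    """Sum of log P(true class | instance) over all instances.

    probs  : list of class probability vectors, one per instance
    labels : list of true class indices, one per instance
    eps    : assumed clipping constant to avoid log(0)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total += math.log(max(p[y], eps))
    return total

def average_probabilities(member_probs):
    """Uniformly average the class probability vectors that several
    classifiers produce for one instance (an assumed combination rule)."""
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    return [
        sum(member[c] for member in member_probs) / n_members
        for c in range(n_classes)
    ]

if __name__ == "__main__":
    # Two hypothetical classifiers scoring the same two instances.
    clf_a = [[0.8, 0.2], [0.3, 0.7]]
    clf_b = [[0.6, 0.4], [0.1, 0.9]]
    labels = [0, 1]

    averaged = [average_probabilities([a, b]) for a, b in zip(clf_a, clf_b)]
    print("CLL of classifier A:", conditional_log_likelihood(clf_a, labels))
    print("CLL of the average: ", conditional_log_likelihood(averaged, labels))
```

A CLL closer to zero indicates better class probability estimates. Two classifiers can have identical accuracy and still differ in CLL, which is why the abstract treats accuracy alone as insufficient.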

Keywords: Naive Bayes, Tree Augmented Naive Bayes, Class probability estimation, Conditional log likelihood, Ensemble learning

Article history: Received 23 February 2011, Revised 16 August 2011, Accepted 19 August 2011, Available online 30 August 2011.

Official paper URL: https://doi.org/10.1016/j.knosys.2011.08.010