A combinatorial optimization approach for multi-label associative classification

Abstract

Mining associations between variables and multiple class labels (or outcomes) is common in applied domains such as medical diagnosis, text mining, e-commerce, and social behavior analysis. Most associative classification algorithms have been developed to discover association rules over binary variables for single-label classification problems; few methods address multi-label classification, in which each instance may carry several labels at once. In this study, we treat the multi-label classification problem as a multi-class classification problem and formulate it as a 0–1 integer optimization model. We then leverage combinatorial optimization and association rule techniques to solve this hard problem. More specifically, we propose a ranking metric for selecting and aggregating stronger rules to form an optimal multi-label classifier. Computational results on multiple real-world applications show that our algorithm identifies significant association rules between key variables and multiple labels, and in turn achieves classification performance competitive with state-of-the-art machine learning methods such as logistic regression, decision trees, and random forests. Moreover, we design a user-interface tool built on the developed algorithm and demonstrate it on a medical diagnosis problem: predicting multiple high-risk subgroups during emergency care in practice.
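
To make the idea of ranking and aggregating association rules for multi-label prediction concrete, the following is a minimal illustrative sketch. It is not the paper's method: the abstract does not specify the proposed ranking metric or the 0–1 integer model, so this sketch assumes the classic ordering by confidence, then support, then antecedent length, and a simple threshold-based aggregation. The toy records and the names `mine_rules`, `predict`, `min_support`, and `min_conf` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each record is (set of feature items, set of labels).
records = [
    ({"fever", "cough"}, {"flu"}),
    ({"fever", "cough"}, {"flu", "pneumonia"}),
    ({"fever"}, {"flu"}),
    ({"cough"}, {"pneumonia"}),
    ({"fever", "cough"}, {"flu", "pneumonia"}),
]


def mine_rules(records, max_len=2, min_support=0.2):
    """Enumerate rules `antecedent -> label` with their support and confidence."""
    n = len(records)
    items = sorted(set().union(*(x for x, _ in records)))
    labels = sorted(set().union(*(y for _, y in records)))
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            a = set(antecedent)
            # Label sets of all records whose items contain the antecedent.
            covered = [y for x, y in records if a <= x]
            if not covered:
                continue
            for label in labels:
                hits = sum(1 for y in covered if label in y)
                support = hits / n                 # P(antecedent and label)
                confidence = hits / len(covered)   # P(label | antecedent)
                if support >= min_support:
                    rules.append((antecedent, label, support, confidence))
    # Assumed ranking (not the paper's metric): higher confidence first,
    # then higher support, then shorter antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def predict(rules, x, min_conf=0.6):
    """Predict every label backed by a matching rule with confidence >= min_conf."""
    return {
        label
        for antecedent, label, _, conf in rules
        if set(antecedent) <= x and conf >= min_conf
    }


rules = mine_rules(records)
print(rules[0])                             # top-ranked rule
print(predict(rules, {"fever"}))            # labels predicted for one patient
print(predict(rules, {"fever", "cough"}))
```

In this sketch each label is decided independently by its strongest matching rules, which is what allows a single instance to receive several labels. An optimization-based approach such as the one described above would instead choose the rule subset jointly, for example with a binary selection variable per rule.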

Keywords: Associative classification, Combinatorial optimization, Interpretable data mining, Machine learning

Review history: Received 21 April 2021, Revised 2 October 2021, Accepted 25 December 2021, Available online 31 December 2021, Version of Record 18 January 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.108088