On the consistency of multi-label learning
Authors:
Abstract
Multi-label learning has attracted much attention during the past few years. Many multi-label approaches have been developed, mostly working with surrogate loss functions, because multi-label loss functions are usually difficult to optimize directly owing to their non-convexity and discontinuity. Although these approaches are empirically effective, little effort has been devoted to understanding their consistency, i.e., the convergence of the risk of the learned functions to the Bayes risk. In this paper, we present a theoretical analysis of this important issue. We first prove a necessary and sufficient condition for the consistency of multi-label learning based on surrogate loss functions. We then study the consistency of two well-known multi-label loss functions, namely ranking loss and Hamming loss. For ranking loss, our results disclose that, surprisingly, no convex surrogate loss is consistent; we therefore present the partial ranking loss, for which some surrogate losses are proven to be consistent. We also discuss the consistency of univariate surrogate losses. For Hamming loss, we show that two multi-label learning methods, one-vs-all and pairwise comparison, which can be regarded as direct extensions of multi-class learning, are inconsistent in general yet consistent under the dominating setting; similar results also hold for some recent multi-label approaches that are variations of one-vs-all. In addition, we discuss the consistency of approaches that address multi-label learning by decomposing it into a set of binary classification problems.
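For orientation, the losses and the notion of consistency named in the abstract admit standard definitions; the sketch below uses common notation from the multi-label literature and may differ from the paper's exact formulation and normalization. Let $q$ be the number of labels, $Y \subseteq \{1,\dots,q\}$ the set of relevant labels of an instance with complement $\bar{Y}$, $\hat{Y}$ a predicted label set, and $f = (f_1,\dots,f_q)$ a vector of real-valued label scores. Hamming loss is the fraction of labels predicted incorrectly, and ranking loss is the fraction of (relevant, irrelevant) label pairs that are mis-ordered by the scores (ties counted as half an error):

$$
\ell_{\mathrm{ham}}(\hat{Y}, Y) = \frac{1}{q}\,\bigl|\hat{Y} \,\triangle\, Y\bigr|,
\qquad
\ell_{\mathrm{rank}}(f, Y) = \frac{1}{|Y|\,|\bar{Y}|} \sum_{(i,j)\,\in\, Y \times \bar{Y}} \Bigl( \mathbf{1}\bigl[f_i < f_j\bigr] + \tfrac{1}{2}\,\mathbf{1}\bigl[f_i = f_j\bigr] \Bigr).
$$

Consistency of a surrogate $\psi$ with respect to a target loss $\ell$ is then the requirement that, for every sequence of learned functions $f_n$,

$$
R_{\psi}(f_n) \to R_{\psi}^{*} \;\Longrightarrow\; R_{\ell}(f_n) \to R_{\ell}^{*},
$$

where $R_{\ell}(f) = \mathbb{E}\bigl[\ell(f(X), Y)\bigr]$ is the risk of $f$ and $R^{*}$ denotes the corresponding Bayes (minimal) risk.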
Keywords: Machine learning, Multi-label learning, Surrogate loss, Bayes consistency, Ranking loss, Hamming loss
Article history: Received 9 July 2012, Revised 17 March 2013, Accepted 29 March 2013, Available online 8 April 2013.
DOI: https://doi.org/10.1016/j.artint.2013.03.001