Data-driven decomposition for multi-class classification
Authors:
Abstract
This paper presents a new study of a method for designing multi-class classifiers: Data-driven Error Correcting Output Coding (DECOC). DECOC is based on the principle of Error Correcting Output Coding (ECOC), which uses a code matrix to decompose a multi-class problem into multiple binary problems. The effectiveness of ECOC for multi-class classification hinges on the design of the code matrix. We propose to explore the distribution of the data classes and to optimize both the composition and the number of base learners in order to design an effective and compact code matrix. Two real-world applications are studied: (1) the holistic recognition (i.e., recognition without segmentation) of touching handwritten numeral pairs and (2) the classification of cancer tissue types based on microarray gene expression data. The results show that the proposed DECOC delivers accuracy competitive with other ECOC methods while using fewer base learners than the pairwise coupling (one-vs-one) decomposition scheme. With a rejection scheme defined by a simple robustness measure, reliabilities of around 98% are achieved in both applications.
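To make the code-matrix idea concrete, the following is a minimal Python sketch of the pairwise coupling (one-vs-one) baseline named in the abstract, not of DECOC itself. Each row of the matrix is a class codeword and each column is one binary problem. The function names, the Hamming-distance decoding, and the `min_gap` rejection rule are illustrative assumptions; the abstract does not specify the decoding rule or the robustness measure the authors use.

```python
from itertools import combinations


def one_vs_one_matrix(num_classes):
    """Build a ternary code matrix: rows are classes, columns are binary problems.

    +1 / -1 mark the two classes a base learner separates; 0 means the class
    is not used by that learner.
    """
    pairs = list(combinations(range(num_classes), 2))
    matrix = [[0] * len(pairs) for _ in range(num_classes)]
    for col, (pos, neg) in enumerate(pairs):
        matrix[pos][col] = 1
        matrix[neg][col] = -1
    return matrix


def hamming_distance(codeword, outputs):
    """Count disagreements, skipping columns where the codeword entry is 0."""
    return sum(1 for c, o in zip(codeword, outputs) if c != 0 and c != o)


def decode(matrix, outputs, min_gap=1):
    """Return (label or None, distances).

    Assumption: a prediction is rejected (None) when the distance gap between
    the best and second-best codeword is below min_gap. This is a stand-in
    for a robustness measure, not the one defined in the paper.
    """
    distances = [hamming_distance(row, outputs) for row in matrix]
    ranked = sorted(range(len(matrix)), key=lambda k: distances[k])
    gap = distances[ranked[1]] - distances[ranked[0]]
    label = ranked[0] if gap >= min_gap else None
    return label, distances


if __name__ == "__main__":
    matrix = one_vs_one_matrix(4)          # 4 classes -> 6 binary problems
    outputs = [1, -1, 1, -1, 1, 1]         # hypothetical base-learner outputs
    print(decode(matrix, outputs))
    print(decode(matrix, outputs, min_gap=2))
```

For K classes this baseline needs K(K-1)/2 columns, which is the base-learner count that a more compact code matrix aims to reduce. Raising `min_gap` rejects more ambiguous samples, trading coverage for reliability.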
Keywords: Multi-class classification, Error Correcting Output Coding (ECOC), Data-driven Error Correcting Output Coding (DECOC), Support vector machine, Handwritten numeral recognition, Gene expression classification
Review history: Received 6 April 2006, Revised 24 May 2007, Accepted 29 May 2007, Available online 19 June 2007.
Official URL: https://doi.org/10.1016/j.patcog.2007.05.020