Incorporating prior knowledge into learning by dividing training data

Authors: Baoliang Lu, Xiaolin Wang, Masao Utiyama

Abstract

In most large-scale real-world pattern classification problems, there is explicit information available besides the given training data, namely prior knowledge, according to which the training data are organized. In this paper, we propose a framework for incorporating this kind of prior knowledge into the training of a min-max modular (M3) classifier to improve learning performance. To evaluate the proposed method, we perform experiments on a large-scale Japanese patent classification problem and consider two kinds of prior knowledge contained in patent documents: the patent's publication date and the hierarchical structure of the patent classification system. In the experiments, the traditional support vector machine (SVM) and the M3-SVM without prior knowledge are adopted as baseline classifiers. Experimental results demonstrate that the proposed method is superior to the baseline classifiers in terms of both training cost and generalization accuracy. Moreover, the M3-SVM with prior knowledge is found to be much more robust than the traditional support vector machine to noisy, dated patent samples, which is crucial for incremental learning.
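
The sketch below illustrates the general idea of prior-knowledge-based task decomposition described in the abstract: the positive and negative training sets are partitioned according to a prior-knowledge attribute (here, the publication year, as one of the two kinds of prior knowledge considered), one base SVM is trained per pair of positive/negative subsets, and the module outputs are combined with the min-max rule. This is a minimal, illustrative sketch only; the function names (`partition_by_prior`, `train_m3_svm`, `predict_m3`), the use of scikit-learn's `LinearSVC` as the base module, and the toy data layout are assumptions, not the authors' original implementation.

```python
# Minimal sketch of prior-knowledge-based decomposition for an M3-SVM.
# Assumption: scikit-learn's LinearSVC stands in for the base SVM module,
# and the publication year is used as the prior-knowledge attribute.
import numpy as np
from collections import defaultdict
from sklearn.svm import LinearSVC


def partition_by_prior(attr_values):
    """Group sample indices by a prior-knowledge attribute (e.g., year)."""
    groups = defaultdict(list)
    for i, v in enumerate(attr_values):
        groups[v].append(i)
    return [np.array(idx) for idx in groups.values()]


def train_m3_svm(X, y, years):
    """Train one SVM module per (positive subset, negative subset) pair."""
    X_pos, X_neg = X[y == 1], X[y == 0]
    pos_parts = partition_by_prior(years[y == 1])
    neg_parts = partition_by_prior(years[y == 0])
    modules = {}
    for i, p in enumerate(pos_parts):
        for j, n in enumerate(neg_parts):
            X_ij = np.vstack([X_pos[p], X_neg[n]])
            y_ij = np.hstack([np.ones(len(p)), np.zeros(len(n))])
            modules[(i, j)] = LinearSVC().fit(X_ij, y_ij)
    return modules, len(pos_parts), len(neg_parts)


def predict_m3(modules, n_pos, n_neg, X_test):
    """Min-max combination: MIN over negative subsets, MAX over positive ones."""
    scores = np.array([[modules[(i, j)].decision_function(X_test)
                        for j in range(n_neg)]
                       for i in range(n_pos)])          # (n_pos, n_neg, n_samples)
    return (scores.min(axis=1).max(axis=0) > 0).astype(int)
```

Because each module sees only one positive and one negative subset, the subproblems are much smaller than the original task, which is where the reported savings in training cost come from; using the publication date (rather than a random split) to form the subsets is what injects the prior knowledge into the decomposition.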

Keywords: prior knowledge, patent classification, support vector machine, min-max modular network, task decomposition


Paper URL: https://doi.org/10.1007/s11704-009-0013-7