Text Classification from Labeled and Unlabeled Documents using EM

Authors: Kamal Nigam, Andrew Kachites McCallum, Sebastian Thrun, Tom Mitchell

Abstract

This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large quantities of unlabeled documents are readily available.

Keywords: text classification, Expectation-Maximization, integrating supervised and unsupervised learning, combining labeled and unlabeled data, Bayesian learning

Paper URL: https://doi.org/10.1023/A:1007692713085
