Efficient classification of multi-labeled text streams by clashing

Authors:

Highlights:

• An efficient method to classify large streams of documents is proposed.

• Text representation and multi-label classification are both performed online.

• The system guarantees bounded memory usage and constant processing time.

• The system approximates the TF-IDF representation online, without corpus-wise computations.

• In terms of accuracy, the method is better than or comparable to a periodically recomputed SVM.
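
The highlights above mention feature hashing and an online approximation of TF-IDF. The following is a minimal sketch of that general idea, not the authors' algorithm: the class name `HashedOnlineTfIdf`, the bucket count, the CRC32 hash, and the smoothed IDF formula are all assumptions chosen for illustration. Tokens are hashed into a fixed number of buckets, so memory stays bounded, and document frequencies are updated one document at a time, so no pass over the whole corpus is needed.

```python
import math
import zlib


class HashedOnlineTfIdf:
    """Illustrative sketch only (hypothetical, not the paper's method)."""

    def __init__(self, n_buckets=1024):
        self.n_buckets = n_buckets
        # Fixed-size table of document frequencies: memory does not grow
        # with the vocabulary or with the length of the stream.
        self.doc_freq = [0] * n_buckets
        self.n_docs = 0

    def _bucket(self, token):
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        return zlib.crc32(token.encode("utf-8")) % self.n_buckets

    def transform_update(self, tokens):
        """Update running statistics with one document and return its weights."""
        # Term frequencies over hashed buckets; colliding tokens share a count.
        tf = {}
        for tok in tokens:
            b = self._bucket(tok)
            tf[b] = tf.get(b, 0) + 1

        # Only the current document is needed to update the statistics.
        self.n_docs += 1
        for b in tf:
            self.doc_freq[b] += 1

        # Smoothed IDF computed from the counts seen so far (an assumption).
        return {
            b: count * (math.log((1 + self.n_docs) / (1 + self.doc_freq[b])) + 1.0)
            for b, count in tf.items()
        }


if __name__ == "__main__":
    vectorizer = HashedOnlineTfIdf(n_buckets=64)
    for doc in (["text", "stream", "text"], ["stream", "label"]):
        print(vectorizer.transform_update(doc))
```

In this sketch the cost of handling a document depends only on its own length and the table size is fixed up front; hash collisions are the price paid, since tokens that land in the same bucket are merged.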


Keywords: Text classification, Data streams, Multi-label classification, Feature hashing, Massive data mining

Review history: Available online 19 February 2014.

Official paper URL: https://doi.org/10.1016/j.eswa.2014.02.017