Text classification method based on self-training and LDA topic models
Authors:
Highlights:
• A novel text classification method for learning from a very small labeled set.
• The method uses a text representation based on the LDA topic model.
• Self-training is used to enlarge the labeled set with unlabeled instances.
• A model for setting the method's parameters for any document collection is proposed.
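The self-training idea in the highlights can be illustrated with a minimal sketch. It is not the paper's implementation: the topic vectors below are hand-written stand-ins for LDA document-topic proportions, the nearest-centroid classifier is a simple substitute for whatever classifier the authors use, and the names `self_train`, `per_round`, and `min_margin` are hypothetical. The loop repeatedly labels the unlabeled documents the current model is most confident about and adds them to the labeled set.

```python
from math import dist  # Euclidean distance, Python 3.8+

def centroids(labeled):
    """Mean topic vector per class."""
    sums, counts = {}, {}
    for vec, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(cents, vec):
    """Return (label, confidence); confidence is the distance margin
    between the nearest and second-nearest class centroid."""
    ranked = sorted((dist(vec, c), lab) for lab, c in cents.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin

def self_train(labeled, unlabeled, per_round=1, min_margin=0.0):
    """Grow the labeled set with the most confident pseudo-labels.
    per_round and min_margin are illustrative parameters (assumptions)."""
    labeled, pool = list(labeled), list(unlabeled)
    while pool:
        cents = centroids(labeled)
        scored = sorted(((predict(cents, v), v) for v in pool),
                        key=lambda t: t[0][1], reverse=True)
        accepted = [(v, lab) for (lab, margin), v in scored[:per_round]
                    if margin > min_margin]
        if not accepted:  # nothing confident enough: stop early
            break
        labeled.extend(accepted)
        for v, _ in accepted:
            pool.remove(v)
    return centroids(labeled), labeled

# Toy 2-topic proportions standing in for LDA output (made-up values).
seed = [([0.9, 0.1], "sports"), ([0.1, 0.9], "politics")]
pool = [[0.8, 0.2], [0.2, 0.8], [0.7, 0.3], [0.35, 0.65]]
model, final = self_train(seed, pool)
```

In a real pipeline the vectors would come from a fitted topic model, and the confidence threshold and number of instances added per round would be among the parameters that need tuning for each collection.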
Keywords: Classification, Topic modeling, LDA, Semi-supervised learning, Self-training
Review history: Received 26 August 2016, Revised 7 March 2017, Accepted 8 March 2017, Available online 8 March 2017, Version of Record 17 March 2017.
Paper URL: https://doi.org/10.1016/j.eswa.2017.03.020