Sentiment classification: The contribution of ensemble learning

Authors:

Highlights:

• Analyzing and predicting sentiment polarity plays an important role in evaluating user-generated content.

• Three popular ensemble methods for sentiment classification are assessed.

• Ten public sentiment analysis datasets are investigated to verify the results.

• Experimental results reveal that ensemble learning is a viable method for sentiment classification.

Abstract

With the rapid development of information technologies, user-generated content can be conveniently posted online. While individuals, businesses, and governments are interested in evaluating the sentiments behind this content, there is no consensus on which sentiment classification technologies are best. Recent studies suggest that ensemble learning methods may be applicable to sentiment classification. In this study, we conduct a comparative assessment of the performance of three popular ensemble methods (Bagging, Boosting, and Random Subspace) based on five base learners (Naive Bayes, Maximum Entropy, Decision Tree, K Nearest Neighbor, and Support Vector Machine) for sentiment classification. Moreover, ten public sentiment analysis datasets were investigated to verify the effectiveness of ensemble learning for sentiment analysis. Based on a total of 1200 comparative group experiments, empirical results reveal that ensemble methods substantially improve the performance of individual base learners for sentiment classification. Among the three ensemble methods, Random Subspace achieves the best comparative results, although it has seldom been discussed in the literature. These results illustrate that ensemble learning is a viable approach to sentiment classification.

Keywords: Sentiment classification, Ensemble learning, Bagging, Boosting, Random Subspace

Review history: Received 27 August 2012, Revised 1 August 2013, Accepted 5 August 2013, Available online 15 August 2013.

Paper URL: https://doi.org/10.1016/j.dss.2013.08.002