Tweet sentiment analysis with classifier ensembles

Authors:

Highlights:

• We show that classifier ensembles are promising for tweet sentiment analysis.

• We compare bag-of-words and feature hashing for the representation of tweets.

• Classifier ensembles obtained from bag-of-words and feature hashing are discussed.

Abstract

Twitter is a microblogging site on which users can post updates (tweets) to friends (followers). It has become an immense source of sentiment data. In this paper, we introduce an approach that automatically classifies the sentiment of tweets by using classifier ensembles and lexicons. Tweets are classified as either positive or negative with respect to a query term. This approach is useful for consumers, who can use sentiment analysis to search for products, for companies that aim to monitor public sentiment toward their brands, and for many other applications. Sentiment classification in microblogging services (e.g., Twitter) through classifier ensembles and lexicons has not been well explored in the literature. Our experiments on a variety of public tweet sentiment datasets show that classifier ensembles formed by Multinomial Naive Bayes, SVM, Random Forest, and Logistic Regression can improve classification accuracy.
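To make the two tweet representations and the idea of an ensemble concrete, here is a minimal sketch using only the Python standard library. It is not the authors' implementation. The abstract does not state how the base classifiers are combined, so the majority vote below is an assumption. The function names, the MD5-based hash, the bucket count, and the example votes are all hypothetical and chosen for illustration.

```python
import hashlib
from collections import Counter


def tokenize(tweet):
    """Lowercase the tweet and split on whitespace (a deliberately naive tokenizer)."""
    return tweet.lower().split()


def bag_of_words(tweet, vocabulary):
    """Count vector with one slot per word of an explicit, precomputed vocabulary."""
    counts = Counter(tokenize(tweet))
    return [counts[word] for word in vocabulary]


def hashed_features(tweet, n_buckets=8):
    """Count vector with a fixed number of slots; a hash maps each token to a slot.

    No vocabulary is stored, but different tokens may collide in the same slot.
    """
    vector = [0] * n_buckets
    for token in tokenize(tweet):
        digest = hashlib.md5(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % n_buckets] += 1
    return vector


def majority_vote(labels):
    """Return the label predicted by the largest number of base classifiers."""
    return Counter(labels).most_common(1)[0][0]


tweets = ["I love this phone", "I hate this phone"]
vocabulary = sorted({word for tweet in tweets for word in tokenize(tweet)})

bow_vector = bag_of_words(tweets[0], vocabulary)
hashed_vector = hashed_features(tweets[0], n_buckets=8)

# Hypothetical predictions from four base classifiers for one tweet
# (e.g. Multinomial Naive Bayes, SVM, Random Forest, Logistic Regression).
votes = ["positive", "negative", "positive", "positive"]
ensemble_label = majority_vote(votes)
```

The bag-of-words vector grows with the vocabulary, whereas the hashed vector always has `n_buckets` entries regardless of how many distinct words appear. That is the trade-off a comparison of the two representations would weigh.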

Keywords: Twitter, Sentiment analysis, Classifier ensembles

Article history: Received 2 January 2014, Revised 2 May 2014, Accepted 6 July 2014, Available online 24 July 2014.

Official paper URL: https://doi.org/10.1016/j.dss.2014.07.003