Sentiment polarity detection in Spanish reviews combining supervised and unsupervised approaches

Abstract

Sentiment polarity detection is one of the most popular tasks in Opinion Mining. Most published work follows one of two main approaches to the problem. On the one hand, supervised methods apply machine learning algorithms when training data exist. On the other hand, unsupervised methods based on semantic orientation are applied when linguistic resources are available. However, few studies combine the two approaches. In this paper we propose the use of meta-classifiers that combine supervised and unsupervised learning in order to develop a polarity classification system. We use a Spanish corpus of film reviews along with its parallel corpus translated into English. Firstly, we generate two individual models from these two corpora by applying machine learning algorithms. Secondly, we integrate SentiWordNet into the English corpus, generating a new unsupervised model. Finally, the three systems are combined using a meta-classifier that allows us to apply several combination algorithms, such as a voting system or stacking. The results outperform those obtained with the individual systems and show that this approach could be considered a good strategy for polarity classification when working with parallel corpora.
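
As an illustration of the combination step, the following is a minimal sketch of a majority-vote combiner over three base systems. It is not the authors' implementation: the `TOY_LEXICON` scores are invented stand-ins for SentiWordNet values, the two supervised predictions are hard-coded placeholders for the outputs of the Spanish and English models, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for SentiWordNet (positive minus
# negative score per word). The values are invented for illustration.
TOY_LEXICON = {"good": 0.6, "great": 0.8, "bad": -0.6, "boring": -0.5}


def lexicon_polarity(tokens, lexicon=TOY_LEXICON):
    """Unsupervised vote: sum the word scores of a review.

    Unknown words contribute 0. A total of exactly 0 is labelled 'pos',
    which is an arbitrary tie-breaking assumption.
    """
    score = sum(lexicon.get(token.lower(), 0.0) for token in tokens)
    return "pos" if score >= 0 else "neg"


def majority_vote(predictions):
    """Return the label predicted by the largest number of base systems."""
    return Counter(predictions).most_common(1)[0][0]


# Placeholder outputs of the two supervised models (assumed, not real results).
spanish_pred = "neg"
english_pred = "pos"

# Output of the lexicon-based system on the English translation of a review.
lexicon_pred = lexicon_polarity("a great film with a good cast".split())

# The voting meta-classifier combines the three base predictions.
combined = majority_vote([spanish_pred, english_pred, lexicon_pred])
print(combined)
```

With three voters and two labels a tie cannot occur, so the combiner always returns a definite label. Stacking differs in that a second-level learner is trained on the base systems' outputs rather than applying a fixed voting rule; that variant is not shown here.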

Keywords: Sentiment polarity detection, Multilingual opinion mining, Spanish review corpus, SentiWordNet, Metaclassifiers, Stacking algorithm, Voting system

Publication history: Available online 24 December 2012.

Paper URL: https://doi.org/10.1016/j.eswa.2012.12.084