Predicting consumer sentiments from online text

Authors:

Abstract

Sentiment analysis of unstructured text has attracted growing interest in recent years, driven by the sheer volume of online reviews and news corpora available in digital form. An accurate method for predicting sentiments could enable us, for instance, to extract opinions from the Internet and gauge online customers' preferences, which could prove valuable for economic or marketing research, for giving an enterprise a strategic advantage, or for detecting cyber risk and security threats. In this paper, we propose a heuristic search-enhanced Markov blanket model that captures the dependencies among words and provides a vocabulary adequate for extracting sentiments. Computational results on two collections of online movie reviews and three collections of online news show that our method identifies a parsimonious set of predictive features while yielding predictions of sentiment orientation that are comparable to or better than those of several state-of-the-art feature selection algorithms and sentiment prediction methods. Our results suggest that sentiments are captured by conditional dependencies among words as well as by keywords or high-frequency words.

Keywords: Sentiment analysis, Online reviews, Online news, Markov blanket, Heuristic search

Review history: Available online 19 August 2010.

Official paper URL: https://doi.org/10.1016/j.dss.2010.08.024