SACPC: A framework based on probabilistic linguistic terms for short text sentiment analysis

Authors:

Highlights:

Abstract

Short text sentiment analysis is challenging because short texts are limited in length and lack context. They are often ambiguous because of polysemy and the typos they contain. Polysemy, the coexistence of multiple meanings for a single word, appears in every language, and different uses of the same word can give it both positive and negative meanings. Previous studies often ignore this variability, which may cause analysis errors. To address this problem, we propose a novel text representation model for short text sentiment analysis, named Word2PLTS, which introduces probabilistic linguistic term sets (PLTSs) and the relevant theory. In this model, every word is represented as a PLTS that fully describes the possibilities for the sentiment polarity of the word. Then, by using support vector machines (SVM), we obtain a novel sentiment analysis and polarity classification framework named SACPC. The framework combines supervised and unsupervised learning. We compare SACPC with lexicon-based approaches and machine learning approaches on three benchmark datasets and achieve a noticeable improvement in performance. To further verify SACPC, we also compare it with state-of-the-art methods, and the results are favorable.
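
The idea of representing each word as a distribution over sentiment terms can be illustrated with a minimal sketch. This is not the authors' Word2PLTS implementation: the two-term set `TERMS`, the count-based probability estimate, the averaging step, and the function names `build_word_plts` and `text_features` are all assumptions made for illustration only. The abstract does not specify how the PLTSs are built or aggregated.

```python
from collections import Counter, defaultdict

# Hypothetical linguistic term set; the paper's actual term set may differ.
TERMS = ("negative", "positive")


def build_word_plts(labeled_texts):
    """Estimate, for each word, a probability for every sentiment term.

    Assumption: probabilities are relative frequencies of the labels of
    the texts in which the word occurs.
    """
    counts = defaultdict(Counter)
    for text, label in labeled_texts:
        for word in text.lower().split():
            counts[word][label] += 1
    plts = {}
    for word, label_counts in counts.items():
        total = sum(label_counts.values())
        plts[word] = {term: label_counts[term] / total for term in TERMS}
    return plts


def text_features(text, plts):
    """Average the word-level distributions into one vector per text.

    Unknown words are skipped; a text with no known words gets a
    uniform vector. Averaging is an illustrative choice only.
    """
    vectors = [plts[w] for w in text.lower().split() if w in plts]
    if not vectors:
        return [1.0 / len(TERMS)] * len(TERMS)
    return [sum(v[term] for v in vectors) / len(vectors) for term in TERMS]


# Toy corpus: "sick" is polysemous and appears with both polarities.
corpus = [
    ("great phone", "positive"),
    ("sick beat", "positive"),
    ("sick today", "negative"),
]
plts = build_word_plts(corpus)
features = text_features("great sick", plts)
```

In this toy setting the polysemous word "sick" keeps probability mass on both polarities instead of being forced into one. Vectors such as `features` could then be passed to a classifier such as an SVM, which is the role the abstract assigns to SVM in SACPC.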

Keywords: Semantic change, Sentiment analysis, Probabilistic linguistic terms, Polarity classification

Review history: Received 4 April 2019, Revised 28 October 2019, Accepted 23 January 2020, Available online 28 January 2020, Version of Record 18 May 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.105572