Disambiguating context-dependent polarity of words: An information retrieval approach

Authors:

Abstract

The paper introduces PolaritySim, a novel approach to disambiguating the context-dependent sentiment polarity of words. The task of resolving the polarity of a given word instance as positive or negative is addressed as an information retrieval problem. At the pre-processing stage, a vector of context features is built for each word w from all its occurrences in the positive polarity corpus (consumer reviews with high ratings), and another vector is built from its contexts in the negative polarity corpus (reviews with low ratings). Lexico-syntactic context features are automatically generated from dependency parse graphs of the sentences containing the word. These two vectors are treated as “documents”, one with positive and one with negative polarity. To resolve the contextual polarity of a specific instance of the word w in a given sentence, its context feature vector is built in the same way and is treated as the “query”. An information retrieval (IR) model is then applied to calculate the similarity of the “query” to each of the two “documents”, and the polarity of the best-matching “document” is attributed to the “query”. The method uses no prior polarity sentiment lexicons or purposefully annotated training datasets. The only external resource used is a readily available corpus of user-rated reviews. Evaluation on different domains shows that PolaritySim outperforms the state-of-the-art baselines, Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB) classifiers, on three out of four datasets. PolaritySim, SVM and MNB were also evaluated with an out-of-domain training corpus. The results indicate that PolaritySim is more effective and robust than SVM and MNB when used with an out-of-domain corpus. We conclude that an IR-based approach can be an effective and robust alternative to machine learning approaches for disambiguating word-level polarity using either within-domain or out-of-domain training corpora.
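
The query/document framing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it makes several simplifying assumptions: context features are plain neighbouring words within a fixed window (the paper derives lexico-syntactic features from dependency parses), similarity is cosine over raw counts (the abstract does not say which IR model is applied), and the function names and the tiny review corpora are hypothetical.

```python
import math
from collections import Counter


def context_features(tokens, target, window=2):
    """Count the words within `window` positions of each occurrence of `target`.

    Assumption: a bag-of-words window stands in for the dependency-based
    lexico-syntactic features used in the paper.
    """
    feats = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            feats.update(tokens[j] for j in range(lo, hi) if j != i)
    return feats


def build_document(corpus, target, window=2):
    """Aggregate the context features of `target` over a whole polarity corpus."""
    doc = Counter()
    for sentence in corpus:
        doc.update(context_features(sentence.lower().split(), target, window))
    return doc


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (an assumed IR model)."""
    dot = sum(count * b[feat] for feat, count in a.items() if feat in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve_polarity(sentence, target, pos_doc, neg_doc, window=2):
    """Treat the instance's context as a query and return the closer document's polarity."""
    query = context_features(sentence.lower().split(), target, window)
    pos_score, neg_score = cosine(query, pos_doc), cosine(query, neg_doc)
    if pos_score == neg_score:
        return "undecided"
    return "positive" if pos_score > neg_score else "negative"


# Hypothetical toy corpora standing in for high-rated and low-rated reviews.
positive_reviews = ["the battery has a long life", "long battery life is great"]
negative_reviews = ["a long wait for support", "the startup time is long and slow"]

pos_doc = build_document(positive_reviews, "long")
neg_doc = build_document(negative_reviews, "long")

print(resolve_polarity("this phone has long battery life", "long", pos_doc, neg_doc))
print(resolve_polarity("there was a long wait time", "long", pos_doc, neg_doc))
```

In this sketch the two "documents" are built once per word and reused for every query, which mirrors the pre-processing stage described above. Swapping `cosine` for a term-weighted ranking function would be the natural place to plug in a fuller IR model.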

Keywords: Sentiment analysis, Polarity disambiguation, Word polarity, Context-dependent polarity of words

Article history: Received 25 October 2016, Revised 16 March 2017, Accepted 27 March 2017, Available online 3 May 2017, Version of Record 3 May 2017.

Official paper URL: https://doi.org/10.1016/j.ipm.2017.03.007