A semantic similarity-based perspective of affect lexicons for sentiment analysis

Abstract

Lexical resources are widely used in Sentiment Analysis because they directly encode sentiment knowledge. Sentiment lexica are usually used for polarity estimation by matching the words in a text against their associated lexicon polarities. Nevertheless, such resources have limitations in vocabulary coverage and domain adaptation. In addition, many recent techniques exploit distributed semantics, normally through word embeddings. In this work, a semantic similarity metric is computed between the words of a text and the lexicon vocabulary. Based on this metric, the paper proposes a sentiment classification model that combines the semantic similarity measure with embedding representations. To assess the effectiveness of this model, we perform an extensive evaluation. Experiments show that the proposed method improves Sentiment Analysis performance over a strong baseline, and that this improvement is statistically significant. Finally, some characteristics of the proposed technique are studied, showing that the selection of lexicon words affects cross-dataset performance.
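
To make the idea concrete, the following is a minimal sketch of how similarity to lexicon words could be combined with an embedding representation. It is not the model from the paper: the abstract does not specify the similarity metric or how it is aggregated, so cosine similarity, max-pooling over text words, and mean-pooled embeddings are assumptions chosen for illustration. The toy embedding values, the two-word lexicon, and the function names (`similarity_features`, `document_features`) are all hypothetical.

```python
import numpy as np

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
EMBEDDINGS = {
    "good": np.array([0.9, 0.1, 0.0]),
    "great": np.array([0.8, 0.2, 0.1]),
    "bad": np.array([-0.9, 0.1, 0.0]),
    "awful": np.array([-0.8, 0.2, 0.1]),
    "movie": np.array([0.0, 0.9, 0.3]),
}

# Hypothetical selection of lexicon words; one feature per lexicon word.
LEXICON = ["good", "bad"]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def similarity_features(tokens, lexicon=LEXICON, embeddings=EMBEDDINGS):
    """For each lexicon word, keep the highest similarity to any text word.

    Max-pooling is an assumption made for this sketch.
    Out-of-vocabulary tokens are skipped.
    """
    known = [t for t in tokens if t in embeddings]
    feats = np.zeros(len(lexicon))
    for j, lex_word in enumerate(lexicon):
        sims = [cosine(embeddings[t], embeddings[lex_word]) for t in known]
        if sims:
            feats[j] = max(sims)
    return feats


def document_features(tokens, lexicon=LEXICON, embeddings=EMBEDDINGS):
    """Concatenate the mean word embedding with the similarity features."""
    known = [t for t in tokens if t in embeddings]
    dim = len(next(iter(embeddings.values())))
    if known:
        mean_vec = np.mean([embeddings[t] for t in known], axis=0)
    else:
        mean_vec = np.zeros(dim)
    return np.concatenate([mean_vec, similarity_features(tokens, lexicon, embeddings)])


features = document_features(["great", "movie"])
```

The resulting vector could be passed to any standard classifier. Because each similarity feature corresponds to one lexicon word, changing the selection of lexicon words changes the feature space, which is one way to read the abstract's remark about cross-dataset performance.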

Keywords: Sentiment analysis, Sentiment lexicon, Semantic similarity, Word embeddings

Review history: Received 11 July 2018, Revised 3 December 2018, Accepted 4 December 2018, Available online 8 December 2018, Version of Record 7 January 2019.

Paper URL: https://doi.org/10.1016/j.knosys.2018.12.005