Adapting sentiment lexicons to domain-specific social media texts

Authors:

Highlights:

• We propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification.

• The proposed method addresses challenges arising from both the content domain and the language domain.

• We evaluate our method using two large development corpora, with five existing sentiment lexicons serving as seeds and baselines.

• The evaluation results demonstrate the usefulness of our method.

Abstract

Social media has become the largest data source of public opinion. Applying sentiment analysis to social media texts has great potential but faces serious challenges because of domain heterogeneity. The sentiment orientation of words varies by content domain, and learning context-specific sentiment in social media domains remains a major challenge. The language domain poses another challenge, since the language used in social media today differs significantly from that used in traditional media. To address these challenges, we propose a method to adapt existing sentiment lexicons for domain-specific sentiment classification using an unannotated corpus and a dictionary. We evaluate our method using two large development corpora, one containing 743,069 tweets related to the stock market and the other containing one million tweets related to political topics, with five existing sentiment lexicons serving as seeds and baselines. The results demonstrate the usefulness of our method, showing significant improvement in sentiment classification performance.

Keywords: Sentiment analysis, Opinion mining, Sentiment lexicon, Lexicon expansion, Social media

Article history: Received 17 February 2016, Revised 7 September 2016, Accepted 1 November 2016, Available online 3 November 2016, Version of Record 24 January 2017.

Official paper URL: https://doi.org/10.1016/j.dss.2016.11.001