DWWP: Domain-specific new words detection and word propagation system for sentiment analysis in the tourism domain

Abstract

Online travel has grown dramatically in China over the past three years, producing a large volume of unstructured data, such as tourism reviews, from which useful knowledge is hard to extract. This paper presents DWWP, a system consisting of domain-specific new words detection (DW) and word propagation (WP). DW uses AMI (Assembled Mutual Information) to address the neglect of user-invented new words and converted sentiment words. Inspired by social networks, the new WP method incorporates manually calibrated sentiment scores together with semantic and statistical similarity information, which improves the quality of the sentiment lexicon compared with existing data-driven methods. Experimental results show that, in terms of accuracy, DWWP improves on graph propagation by seventeen percentage points on Dataset I and on label propagation by four percentage points on Dataset II.

Keywords: Online travel, DWWP, Chinese new words detection, Sentiment lexicon, Word propagation

Review history: Received 7 April 2017, Revised 23 January 2018, Accepted 3 February 2018, Available online 5 February 2018, Version of Record 28 February 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.02.004