Stock market sentiment lexicon acquisition using microblogging data and statistical measures
Authors:
Highlights:
• Proposal of an automatic procedure for the creation of stock market lexicons.
• The procedure uses diverse statistical measures on StockTwits labeled messages.
• The new lexicons obtain better investor sentiment indicators than general lexicons.
• The new Twitter sentiment indicators correlate with survey sentiment indicators.
Abstract
Lexicon acquisition is a key issue for sentiment analysis. This paper presents a novel and fast approach for creating stock market lexicons. The approach is based on statistical measures applied to a large set of labeled messages from StockTwits, a microblog specialized in the stock market. We compare three adaptations of statistical measures such as Pointwise Mutual Information (PMI), two new complementary statistics, and the use of sentiment scores for affirmative and negated contexts. Using StockTwits, we show that the new lexicons are competitive for measuring investor sentiment when compared with six popular lexicons. We also applied a lexicon to produce Twitter investor sentiment indicators and analyzed their correlation with survey sentiment indexes. The new microblogging indicators have a moderate correlation with the popular Investors Intelligence (II) and American Association of Individual Investors (AAII) indicators. Thus, the new microblogging approach can serve as an alternative to traditional survey indicators, with advantages such as cheaper creation and higher frequencies.
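To make the idea of a PMI-based lexicon more concrete, the following is a minimal sketch, not the paper's actual procedure. It assumes each message is already tokenized and labeled "bullish" or "bearish", scores a term as PMI(term, bullish) minus PMI(term, bearish), and treats the PMI of a term that never co-occurs with a label as 0 (a simplifying convention chosen here). The helper `mark_negation`, its `_NEG` suffix, and the small negator set are hypothetical illustrations of how affirmative and negated contexts could receive separate scores.

```python
import math
from collections import Counter

# Hypothetical negator list; the paper's actual negation handling is not given here.
NEGATORS = {"not", "no", "never"}


def mark_negation(tokens):
    """Append '_NEG' to every token that follows a negator in the message."""
    marked, negated = [], False
    for tok in tokens:
        if tok in NEGATORS:
            negated = True
            marked.append(tok)
        else:
            marked.append(tok + "_NEG" if negated else tok)
    return marked


def pmi_lexicon(messages, min_count=1):
    """Build {term: score} from (tokens, label) pairs, label in {'bullish', 'bearish'}.

    score = PMI(term, bullish) - PMI(term, bearish), counted per message.
    """
    term_label = Counter()   # messages containing the term, per label
    term_total = Counter()   # messages containing the term, any label
    label_total = Counter()  # messages per label
    n = 0
    for tokens, label in messages:
        n += 1
        label_total[label] += 1
        for term in set(mark_negation(tokens)):
            term_label[(term, label)] += 1
            term_total[term] += 1

    def pmi(term, label):
        joint = term_label[(term, label)]
        if joint == 0:
            return 0.0  # assumption: no co-occurrence contributes nothing
        return math.log2(joint * n / (term_total[term] * label_total[label]))

    return {
        term: pmi(term, "bullish") - pmi(term, "bearish")
        for term, count in term_total.items()
        if count >= min_count
    }


# Toy, made-up messages purely for illustration.
toy_messages = [
    (["buy", "long"], "bullish"),
    (["sell", "short"], "bearish"),
    (["buy", "calls"], "bullish"),
    (["not", "buy"], "bearish"),
]
lexicon = pmi_lexicon(toy_messages)
```

In this toy setting, `buy` receives a positive score while `buy_NEG` and `sell` receive negative ones, which is the kind of separation between affirmative and negated contexts the abstract alludes to. A message-level sentiment indicator could then be obtained by summing or averaging the scores of the terms in each message.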
Keywords: Sentiment analysis, Stock market, Microblogging data
Review history: Received 1 June 2015, Revised 1 February 2016, Accepted 27 February 2016, Available online 5 March 2016, Version of Record 15 April 2016.
Official paper URL: https://doi.org/10.1016/j.dss.2016.02.013