A context-aware embeddings supported method to extract a fuzzy sentiment polarity dictionary

Authors:

Highlights:

Abstract

The latest developments in cognitive technologies are helping us understand emotions and sentiments with unprecedented precision. Polarity detection is the key enabler of sentiment analysis and typically relies on experimental dictionaries in which terms are assigned polarity scores; these dictionaries lack contextual information and depend on human inputs and conventions. In this article, we present a novel approach to automatically extract a polarity dictionary from a particular domain, the stock market, without human intervention, while addressing the scaling and thresholding problem. Our approach tracks the price changes of particular stocks over time and uses them as a guiding polarity value. The magnitude of the price variation for a particular stock is then attributed to the financial news about that stock in the corresponding period of time, and this news forms our working corpus. From it, we derive the so-called binned corpus and apply the well-known TF–IDF information retrieval technique to compute the TF–IDF value for each term. These values are then disseminated within the neighbourhood of each term based on the embeddings-enabled cosine distance. After introducing the problem and providing the background information, we thoroughly describe our method and all the components required to implement the system. Last but not least, we assign the terms to fuzzy linguistic labels and provide a volatility metric indicating how reliable our scores are depending on the distribution of their occurrences in the corpus. To show how our approach works, we implement it for the Euro Stoxx 50 from January 2018 to March 2019 and discuss the results compared with typical approaches, pointing out potential improvements for further research work.
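The pipeline outlined in the abstract (binning news by price change, computing TF–IDF per bin, then spreading scores to embedding neighbours) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the bin edges, the `min_sim` cut-off, the use of cosine similarity as the neighbourhood weight, and the additive update rule are all hypothetical.

```python
import math
from collections import Counter


def bin_corpus(news, edges):
    """Group news tokens into bins by price change (bin edges are assumed)."""
    bins = [[] for _ in range(len(edges) + 1)]
    for tokens, change in news:
        # Count how many edges the change exceeds to find its bin index.
        idx = sum(change > e for e in edges)
        bins[idx].extend(tokens)
    return bins


def tf_idf(bins):
    """Return one {term: TF-IDF} dict per bin, treating each bin as a document."""
    n = len(bins)
    df = Counter(term for b in bins for term in set(b))
    weights = []
    for b in bins:
        counts = Counter(b)
        total = len(b) or 1  # avoid division by zero for empty bins
        weights.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def disseminate(scores, embeddings, min_sim=0.5):
    """Add each scored term's value to its neighbours, weighted by similarity.

    The additive rule and the min_sim threshold are illustrative assumptions.
    """
    out = dict(scores)
    for term, score in scores.items():
        if term not in embeddings:
            continue
        for other, vec in embeddings.items():
            if other == term:
                continue
            sim = cosine(embeddings[term], vec)
            if sim >= min_sim:
                out[other] = out.get(other, 0.0) + sim * score
    return out
```

As a usage example, calling `bin_corpus` with edges `[-1.0, 1.0]` yields three bins (falling, flat, rising prices); terms that occur in only one bin receive a higher TF–IDF weight there than terms shared across bins, and `disseminate` passes part of that weight to terms whose embeddings point in a similar direction.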

Keywords: Sentiment analysis, Polarity extraction, Word embeddings, Information retrieval, Contextual bias, Fuzzy polarity

Review history: Received 15 April 2019, Revised 11 November 2019, Accepted 12 November 2019, Available online 19 November 2019, Version of Record 7 February 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2019.105236