Developing insights from social media using semantic lexical chains to mine short text structures
Authors:
Highlights:
• We develop a technique for clustering short posts
• We test the technique on the Facebook data of a politician
• We perform further testing on a subset of the 20 Newsgroups (Usenet) dataset
Abstract
Social media is increasingly being used for communication by individuals and organizations. Social media stores vast amounts of publicly available data that provides a rich source of information and insights. Often, social media users can easily infer meaning from short text such as microblogs and Facebook posts because they understand the context and terminology used. Although automated data-mining can be effective for gaining insights from text data, a significant challenge is to accurately infer meaning from social media text derived from a single social media account. This is difficult because social media communication uses very short, or sparse, text, which yields a relatively small sample of usable words for analysis. Furthermore, interpreting the contextual meaning from a relatively small set of words is challenging. This research proposes a methodology for extracting semantic lexical chains from frequently occurring words in a single social media account and using these chains to mine short text structures to infer the overall themes of the user. The methodology is based on a proposed clustering algorithm and illustrated with examples from Facebook posts. The algorithm is tested and illustrated by comparing it to existing work and further applying it to a variety of news posts. This methodology could be useful for gaining decision-making insights from social media, or other online forms with short or sparse text.
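To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. It counts frequently occurring words in one account's posts, links related words into chains, and assigns a post to the chain it overlaps most. The `RELATED` table, the `STOPWORDS` set, the `min_count` threshold, and all function names are hypothetical stand-ins; a real system would likely draw word relations from a lexical resource such as WordNet and would need word sense disambiguation.

```python
import re
from collections import Counter

# Hypothetical relatedness pairs standing in for a real lexical resource.
# These pairs are illustrative assumptions, not data from the paper.
RELATED = {
    frozenset(("tax", "budget")),
    frozenset(("budget", "economy")),
    frozenset(("school", "teacher")),
}

# Small hand-picked stopword list (an assumption for this toy example).
STOPWORDS = {"the", "a", "an", "and", "to", "for", "our", "we", "will"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def frequent_words(posts, min_count=2):
    """Return words that occur at least min_count times across all posts."""
    counts = Counter(w for post in posts for w in tokenize(post))
    return [w for w, c in counts.most_common() if c >= min_count]


def build_chains(words):
    """Greedily group words into chains; a word joins (and merges) every
    existing chain that contains a word related to it."""
    chains = []
    for word in words:
        linked = [c for c in chains
                  if any(frozenset((word, m)) in RELATED for m in c)]
        merged = {word}.union(*linked)
        chains = [c for c in chains if c not in linked] + [merged]
    return sorted(sorted(c) for c in chains)


def assign_theme(post, chains):
    """Return the index of the chain sharing the most words with the post,
    or None when the post overlaps no chain."""
    tokens = set(tokenize(post))
    scores = [len(tokens & set(chain)) for chain in chains]
    best = max(range(len(chains)), key=scores.__getitem__, default=None)
    return best if best is not None and scores[best] > 0 else None


posts = [
    "We will cut tax and balance the budget",
    "The budget helps the economy",
    "Lower tax for a growing economy",
    "A new school needs a good teacher",
    "Our teacher and our school",
]
chains = build_chains(frequent_words(posts))
print(chains)  # [['budget', 'economy', 'tax'], ['school', 'teacher']]
print(assign_theme("Cut tax to help the economy", chains))  # 0
```

Restricting the chains to frequent words reflects the abstract's point that short posts yield few usable words: a single post rarely contains enough context on its own, so the themes are inferred from vocabulary that recurs across the whole account.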
Keywords: Social media, Semantic clustering, Lexical chain, Short text, Text mining, Word sense disambiguation
Review history: Received 19 February 2019, Revised 15 July 2019, Accepted 21 August 2019, Available online 26 August 2019, Version of Record 15 November 2019.
Paper URL: https://doi.org/10.1016/j.dss.2019.113142