Representing the Twittersphere: Archiving a representative sample of Twitter data under resource constraints
Authors:
Highlights:
• We propose a new method for creating a representative archive of Twitter data.
• Our sample shows high similarity to the full Twitter data in volumes and topics.
• This archiving method enables a wide range of post-hoc analyses.
• This method makes Twitter data accessible to researchers with a limited budget.
Keywords: Twitter, Social media, Sampling, Representativeness, Data collection
Abbreviations: API (application programming interface), LDA (latent Dirichlet allocation), LDP (Liberal Democratic Party), DP (Democratic Party)
Article history: Received 6 June 2018, Revised 25 January 2019, Accepted 25 January 2019, Available online 25 March 2019, Version of Record 25 March 2019.
Official paper URL: https://doi.org/10.1016/j.ijinfomgt.2019.01.019