An easy numeric data augmentation method for early-stage COVID-19 tweets exploration of participatory dynamics of public attention and news coverage

Authors:

Highlights:

• The proposed easy numeric data augmentation (ENDA) method for text classification outperforms An Easier Data Augmentation (AEDA).

• Dataset size and the number of augmentations greatly influence model performance with both ENDA and AEDA.

• Turning points around January 20 and February 23, with tweet peaks triggered by alarming news.

• A strong positive correlation between news coverage and personal narratives at the daily level.

• Limited government responses and missed windows for early warnings in early January and February.
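
For readers unfamiliar with the AEDA baseline named in the highlights, the following is a minimal sketch of punctuation-insertion augmentation in the spirit of AEDA, together with a per-example "augmentation number". It does not reproduce ENDA, whose details are not given in this listing. The function names `aeda_augment` and `augment_dataset`, the `ratio` and `n_aug` parameters, and the example sentence are illustrative assumptions, not taken from the paper.

```python
import random

# Punctuation marks commonly used for AEDA-style insertion (assumed set).
PUNCTUATION = [".", ";", "?", ":", "!", ","]


def aeda_augment(text, rng, ratio=1 / 3):
    """Return a copy of `text` with random punctuation tokens inserted.

    Hypothetical helper: inserts between 1 and ratio * (word count)
    punctuation marks at random gaps between words.
    """
    words = text.split()
    if not words:
        return text
    max_inserts = max(1, int(ratio * len(words)))
    n_inserts = rng.randint(1, max_inserts)
    for _ in range(n_inserts):
        position = rng.randint(0, len(words))  # any gap, including both ends
        words.insert(position, rng.choice(PUNCTUATION))
    return " ".join(words)


def augment_dataset(texts, labels, n_aug, seed=0):
    """Keep every original example and add `n_aug` augmented copies of each."""
    rng = random.Random(seed)
    out_texts, out_labels = [], []
    for text, label in zip(texts, labels):
        out_texts.append(text)
        out_labels.append(label)
        for _ in range(n_aug):
            out_texts.append(aeda_augment(text, rng))
            out_labels.append(label)  # augmentation preserves the label
    return out_texts, out_labels


if __name__ == "__main__":
    sample = "the virus spreads quickly in the city"
    print(aeda_augment(sample, random.Random(0)))
```

Here `n_aug` plays the role of the augmentation number: a dataset of N examples grows to N * (1 + n_aug), which is one way the interaction between dataset size and augmentation number could be explored.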

Keywords: Social media analysis, Data augmentation, COVID-19 outbreak, Public engagement, News coverage, Text classification

Article history: Received 21 May 2022, Revised 21 August 2022, Accepted 25 August 2022, Available online 29 August 2022, Version of Record 1 September 2022.

Official paper URL: https://doi.org/10.1016/j.ipm.2022.103073