An intelligent early warning system of analyzing Twitter data using machine learning on COVID-19 surveillance in the US

Authors:

Highlights:

Abstract

On 11 March 2020, the World Health Organization (WHO) declared the spread of coronavirus disease 2019 (COVID-19) a pandemic. Traditional infectious disease surveillance had failed to alert public health authorities in time to intervene, mitigate, and control COVID-19 before it became a pandemic. Rich data from social media, including Twitter, is considered a useful surveillance tool that can overcome the limitations of traditional public health surveillance systems. This paper proposes an intelligent COVID-19 early warning system that applies novel machine learning methods to Twitter data. We fine-tune BERT, a pre-trained natural language processing (NLP) model, to classify tweets. We then implement a COVID-19 forecasting model, a Twitter-based linear regression model, to detect early signs of a COVID-19 outbreak. Finally, we develop an expert system, an early warning web application, based on the proposed methods. The experimental results suggest that it is feasible to use Twitter data for COVID-19 surveillance and prediction in the US to support health departments' decision-making.
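
The idea of a Twitter-based linear regression forecast can be illustrated with a minimal sketch. It is not the paper's model: the abstract does not state the features, the lag, or the fitting procedure, so the single predictor (daily count of COVID-19-related tweets), the fixed lag, the function names `fit_lagged_regression` and `forecast`, and the toy numbers below are all assumptions made for illustration only.

```python
from statistics import mean


def fit_lagged_regression(tweets, cases, lag):
    """Fit cases[t] ~ intercept + slope * tweets[t - lag] by ordinary least squares.

    Hypothetical helper: the lag and the single-predictor form are assumptions,
    not details taken from the paper.
    """
    if len(tweets) != len(cases):
        raise ValueError("tweets and cases must have the same length")
    if lag < 0 or lag >= len(tweets):
        raise ValueError("lag must be between 0 and len(tweets) - 1")

    # Pair each tweet count with the case count observed `lag` days later.
    x = tweets[: len(tweets) - lag]
    y = cases[lag:]

    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("tweet counts have no variance; slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def forecast(intercept, slope, tweet_count):
    """Predict the case count `lag` days after observing `tweet_count` tweets."""
    return intercept + slope * tweet_count


# Synthetic toy series (not real data): cases follow tweets by two days.
tweets = [10, 12, 15, 20, 26, 30, 41, 50]
cases = [0, 0] + [3 * t + 5 for t in tweets[:-2]]

intercept, slope = fit_lagged_regression(tweets, cases, lag=2)
predicted = forecast(intercept, slope, tweets[-1])
```

In a setting like the one the abstract describes, the tweet counts would presumably come from tweets that the classifier has labeled as relevant, and the lag would be chosen by validation rather than fixed in advance.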

Keywords: COVID-19 surveillance, Early warning system, Text classification, BERT, Epidemic intelligence

Review history: Received 3 May 2021, Revised 14 September 2021, Accepted 10 March 2022, Available online 14 March 2022, Version of Record 23 March 2022.

Official paper URL: https://doi.org/10.1016/j.eswa.2022.116882