Topic2Labels: A framework to annotate and classify the social media data through LDA topics and deep learning models for crisis response

Authors:

Highlights:

• A novel framework for the annotation and classification of crisis text is proposed.

• It leverages Latent Dirichlet Allocation (LDA) for annotation of textual data.

• A new topic ranking algorithm is developed for dominant topic extraction.

• Deep learning classifiers are used with BERT embeddings for classification of text.

• The classification results are statistically significant compared to the baselines.


Keywords: Social media, Natural language processing, Neural network, Topic modeling, Annotation, Classification, Transformer, Crisis response

Review history: Received 19 August 2021, Revised 30 December 2021, Accepted 17 January 2022, Available online 9 February 2022, Version of Record 15 February 2022.

Official paper URL: https://doi.org/10.1016/j.eswa.2022.116562