Weakly labeled data augmentation for social media named entity recognition

Authors:

Highlights:

• We propose a method to improve NER performance in user-generated texts.

• No additional cost is required for annotating user-generated texts in our method.

• Our method improves the performance regardless of the embedding types.

• We discuss the effectiveness of transfer learning with the weakly labeled data.

• The final NER model achieves state-of-the-art performance on the WNUT-17 dataset.

Keywords: Named entity recognition, Social-media text mining, Weakly labeled data, Transfer learning

Review history: Received 19 April 2021, Revised 16 July 2022, Accepted 16 July 2022, Available online 20 July 2022, Version of Record 3 August 2022.

Official paper URL: https://doi.org/10.1016/j.eswa.2022.118217