Hate speech detection is not as easy as you may think: A closer look at model validation (extended version)

Authors:

Highlights:

• The state-of-the-art results are highly overestimated due to experimental issues.

• User distribution on datasets has an impact on the classification results.

• Better user-distributed datasets lead to better generalization.

• Improving the generalization of English models is a first step toward cross-lingual models.

Keywords: Hate speech classification, Experimental evaluation, Social media, Deep learning

Review history: Received 14 November 2019, Revised 13 May 2020, Accepted 26 June 2020, Available online 30 June 2020, Version of Record 24 December 2021.

Official paper URL: https://doi.org/10.1016/j.is.2020.101584