Misogyny Detection in Twitter: a Multilingual and Cross-Domain Study

Authors:

Highlights:

• We conduct a broad and in-depth study of online misogyny, a relevant and timely task given the growing number of hate speech and online harassment episodes on social media.

• An extensive review of the state of the art in misogyny detection is presented.

• A state-of-the-art model to detect misogyny in social media is developed and evaluated on three languages: English, Italian, and Spanish.

• We investigate the most predictive linguistic features for distinguishing misogynistic from non-misogynistic content.

• Relationships between misogyny and other abusive language phenomena are postulated, and empirically investigated with cross-dataset experiments.

• The feasibility of detecting misogyny in a multilingual environment is explored.

Keywords: Automatic misogyny identification, Abusive language online, Cross-domain classification, Cross-lingual classification, Social media

Review history: Received 7 February 2020, Revised 29 May 2020, Accepted 15 July 2020, Available online 16 September 2020, Version of Record 20 October 2020.

Paper URL: https://doi.org/10.1016/j.ipm.2020.102360