How to take advantage of behavioral features for the early detection of grooming in online conversations

Authors:

Highlights:

Abstract

Detecting grooming behavior in online conversations is a growing problem, given the large number of messaging platforms that children and young people now use. The main obstacle is the lack of tools focused on automatically preventing this risk. This paper proposes seven Behavioral Features (BFs) for early grooming detection, together with a detailed study of why these features contribute to early classification tasks. We also introduce the Behavioral Feature-Profile Specific Representation (BF-PSR) framework, an extension of the well-known Profile Specific Representation (PSR) framework designed to properly employ the proposed behavioral features. Experimental results show that our proposal outperforms all competing methods and achieves state-of-the-art performance in early grooming detection. Specifically, the new BF-PSR framework achieves a gain of more than 40% in effectiveness over five competitors when only 10% of the conversations’ content is available, a substantial advantage for the early detection of grooming, and it maintains a similar gain as more data arrives. To the best of our knowledge, this is the first work to employ behavioral features for the early detection of grooming. Furthermore, we have assembled two new datasets, PJZ and PJZC, to mitigate the lack of data in the grooming detection area. Both are publicly available for download to foster further research. Additional experiments show that the BF-PSR framework also outperforms all of the state-of-the-art methods on these new datasets.

Keywords: Early text classification, Classification with partial information, Behavioral features, Online grooming detection

Article history: Received 10 May 2021, Revised 20 November 2021, Accepted 18 December 2021, Available online 29 December 2021, Version of Record 22 January 2022.

Official paper URL: https://doi.org/10.1016/j.knosys.2021.108017