A novel automatic satire and irony detection using ensembled feature selection and data mining

Abstract

Detecting figurative language has always been difficult for human beings, and it is harder still to automate with text and data mining. Available computational approaches are also limited in capability and scope. We therefore propose an ensembled text feature selection method, followed by a new framework in the text and data mining paradigm, to automatically detect satire, sarcasm, and irony in news and customer reviews. The effectiveness of the proposed approach was demonstrated on three datasets: two satiric and one ironic. The proposed methodology performed well on one satiric dataset and yielded promising results on the remaining two. Moreover, we identified several interesting characteristics common to satire and irony, extracted from the three corpora: affective process (negative emotion), personal concern (leisure), biological process (body and sexual), perception (see), informal language (swear), social process (male), cognitive process (certain), and psycholinguistic features (concreteness and imageability). Of particular significance is the comparison of our approach with human annotators' evaluations, which served as a baseline in these tasks.

Keywords: Satiric news, Satire detection, Irony detection, Customer reviews, Ensembled feature subset selection, Sentiment analysis, LIWC, TAALES

Article history: Received 20 August 2016, Revised 5 November 2016, Accepted 19 December 2016, Available online 21 December 2016, Version of Record 15 February 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2016.12.018