Towards automatically filtering fake news in Portuguese

Authors:

Highlights:

• An unprecedented fake news collection in the Portuguese language is presented.

• Important open questions related to detecting fake news are raised and properly answered.

• A comprehensive performance evaluation of established classification methods and features is presented.

• Results with bag-of-words outperformed the results with the state-of-the-art Word2Vec and FastText techniques.

• The combination of linguistic-based features and bag-of-words-based features is recommended.
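The last highlight recommends combining linguistic-based features with bag-of-words features. A minimal sketch of what such a combination can look like is given below, using only the Python standard library. The three linguistic features (average token length, share of all-uppercase tokens, exclamation-mark count), the function names, and the two sample sentences are assumptions chosen for illustration; they are not the feature set, code, or data of the paper.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; \w is Unicode-aware in Python 3,
    # so accented Portuguese letters are kept inside tokens.
    return re.findall(r"\w+", text.lower())


def build_vocabulary(texts):
    # Map every distinct token in the collection to a column index.
    vocab = sorted({tok for t in texts for tok in tokenize(t)})
    return {tok: i for i, tok in enumerate(vocab)}


def bag_of_words(text, vocab):
    # Raw term counts over the fixed vocabulary.
    vec = [0.0] * len(vocab)
    for tok, n in Counter(tokenize(text)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec


def linguistic_features(text):
    # Illustrative features only (an assumption, not the paper's set).
    tokens = re.findall(r"\w+", text)
    n = max(len(tokens), 1)
    avg_len = sum(len(t) for t in tokens) / n
    upper_ratio = sum(t.isupper() for t in tokens) / n
    exclamations = float(text.count("!"))
    return [avg_len, upper_ratio, exclamations]


def combined_features(text, vocab):
    # Concatenate both feature groups into a single vector.
    return bag_of_words(text, vocab) + linguistic_features(text)


# Hypothetical sample sentences, not taken from the paper's corpus.
texts = [
    "URGENTE! Compartilhe antes que apaguem!",
    "O governo divulgou o relatório anual.",
]
vocab = build_vocabulary(texts)
vectors = [combined_features(t, vocab) for t in texts]
```

Each resulting vector holds the bag-of-words counts followed by the three linguistic features, so it can be passed to any classifier that accepts fixed-length numeric input.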

Keywords: Fake news, Text categorization, Natural language processing, Machine learning, Corpus construction

Review history: Received 22 April 2019, Revised 2 October 2019, Accepted 8 January 2020, Available online 14 January 2020, Version of Record 21 January 2020.

Official paper URL: https://doi.org/10.1016/j.eswa.2020.113199