Multilayer Convolutional Neural Network to Filter Low Quality Content from Quora

Author: Pradeep Kumar Roy

Abstract

Question answering (QA) websites now play a crucial role in meeting Internet users’ information needs. Quora is a growing QA platform where users get quick answers to their questions from their peers. Nonetheless, a significant number of questions remain unanswered for a long time. Questions that have gone unanswered for a long time, are opinion-based, require a debate to resolve, or have no valid answer fall into the Insincere question group. It is therefore important to weed out Insincere questions in order to maintain the integrity of the site. Quora has a huge number of such questions, which cannot be filtered manually. To overcome this problem, this paper proposes a multilayer convolutional neural network model that helps to minimize Insincere questions on the website. Two embeddings were created from the Quora dataset: (i) using Skip-gram, and (ii) using the Continuous Bag of Words model. The created embeddings and a pre-trained GloVe embedding vector were used for system development. The proposed model needs only the question text to predict whether a question is Insincere, and is hence free from manual feature engineering. The experimental results indicate that the proposed multilayer CNN model outperforms earlier works, achieving an F1-score of 0.98 in the best case.
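
The two embedding schemes named in the abstract differ in how training examples are formed from a question's text. The following is a minimal sketch of that difference only, not the paper's implementation: the function names `skipgram_pairs` and `cbow_pairs`, the whitespace tokenization, the sample question, and the window size are all assumptions made for illustration.

```python
from typing import List, Tuple


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (center, context) pairs: Skip-gram predicts each context word from the center word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context_words, center) pairs: CBOW predicts the center word from its surrounding words."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        if context:
            pairs.append((context, center))
    return pairs


# Hypothetical sample question; the window size of 1 is an assumption.
question = "why is the sky blue".split()
sg = skipgram_pairs(question, window=1)
cb = cbow_pairs(question, window=1)
```

In this sketch Skip-gram yields one training example per (center, neighbour) pair, while CBOW yields one example per position with all neighbours grouped together, which is why the two schemes can produce different vectors from the same corpus.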

Keywords: Quora, Deep learning, Question–answering, Convolutional neural network

Official paper URL: https://doi.org/10.1007/s11063-020-10284-x