Multi-label crowd consensus via joint matrix factorization

作者：Jinzheng Tu, Guoxian Yu, Carlotta Domeniconi, Jun Wang, Guoqiang Xiao, Maozu Guo

摘要

Crowdsourcing is a useful and economic approach to annotate data. Various computational solutions have been developed to pursue a consensus of high quality. However, available solutions mainly target single-label tasks, and they neglect correlations among labels. In this paper, we introduce a multi-label crowd consensus (MLCC) model based on a joint matrix factorization. Specifically, MLCC selectively and jointly factorizes the sample-label association matrices into products of individual and shared low-rank matrices. As such, it makes use of the robustness of low-rank matrix approximation to noisy annotations and diminishes the impact of unreliable annotators by assigning small weights to their annotation matrices. To obtain coherent low-rank matrices, MLCC additionally leverages the shared low-rank matrix to model correlations among labels, and the individual low-rank matrices to measure the similarity between annotators. MLCC then computes the low-rank matrices and weights via a unified objective function, and adopts an alternative optimization technique to iteratively optimize them. Finally, MLCC uses the optimized low-rank matrices and weights to compute the consensus labels. Our experimental results demonstrate that MLCC outperforms competitive methods in inferring consensus labels. Besides identifying spammers, MLCC achieves robustness against their incorrect annotations, by crediting them small, or zero, weights.

论文关键词：Crowdsourcing, Multi-label crowd consensus, Joint matrix factorization, Low-rank, Spammers

论文评审过程：

论文官网地址：https://doi.org/10.1007/s10115-019-01386-7