Label distribution for multimodal machine learning

作者:Yi Ren, Ning Xu, Miaogen Ling, Xin Geng

摘要

Multimodal machine learning (MML) aims to understand the world from multiple related modalities. It has attracted much attention as multimodal data has become increasingly available in real-world application. It is shown that MML can perform better than single-modal machine learning, since multi-modalities containing more information which could complement each other. However, it is a key challenge to fuse the multi-modalities in MML. Different from previous work, we further consider the side-information, which reflects the situation and influences the fusion of multi-modalities. We recover multimodal label distribution (MLD) by leveraging the side-information, representing the degree to which each modality contributes to describing the instance. Accordingly, a novel framework named multimodal label distribution learning (MLDL) is proposed to recover the MLD, and fuse the multimodalities with its guidance to learn an in-depth understanding of the jointly feature representation. Moreover, two versions of MLDL are proposed to deal with the sequential data. Experiments on multimodal sentiment analysis and disease prediction show that the proposed approaches perform favorably against state-of-the-art methods.

论文关键词:multimodal machine learning, label distribution learning, sentiment analysis, disease prediction

论文评审过程:

论文官网地址:https://doi.org/10.1007/s11704-021-0611-6