Semi-supervised cross-modal image generation with generative adversarial networks

Authors:

Highlights:

Abstract

Cross-modal image generation is an important aspect of multi-modal learning. Existing methods usually rely on semantic features to reduce the modality gap. Although these methods have achieved notable progress, they still have some limitations: (1) they usually use single-modality information to learn the semantic feature; (2) they require the training data to be paired. To overcome these problems, we propose a novel semi-supervised cross-modal image generation method, which consists of two semantic networks and one image generation network. Specifically, in the semantic networks, we use the image modality to assist the non-image modality in semantic feature learning through a deep mutual learning strategy. In the image generation network, we introduce an additional discriminator to reduce the image reconstruction loss. By leveraging large amounts of unpaired data, our method can be trained in a semi-supervised manner. Extensive experiments demonstrate the effectiveness of the proposed method.
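The abstract mentions a deep mutual learning strategy in which the image-modality semantic network assists the non-image-modality one. Below is a minimal PyTorch-style sketch of that general idea: two branch networks map modality-specific inputs to a shared semantic space and exchange softened predictions through symmetric KL-divergence terms. All module names, dimensions, and the specific loss weighting here are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticNet(nn.Module):
    """Hypothetical branch network mapping a modality-specific feature
    to a shared semantic space plus a class prediction."""
    def __init__(self, in_dim, sem_dim, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, sem_dim))
        self.classifier = nn.Linear(sem_dim, num_classes)

    def forward(self, x):
        sem = self.encoder(x)          # semantic feature
        logits = self.classifier(sem)  # prediction used for mutual learning
        return sem, logits

def mutual_learning_loss(logits_img, logits_txt, labels):
    """Supervised loss on each branch plus symmetric KL terms, so the
    image branch guides the non-image branch and vice versa
    (an assumed instantiation of deep mutual learning)."""
    ce_img = F.cross_entropy(logits_img, labels)
    ce_txt = F.cross_entropy(logits_txt, labels)
    log_p_img = F.log_softmax(logits_img, dim=1)
    log_p_txt = F.log_softmax(logits_txt, dim=1)
    # KL(p_img || p_txt): pushes the non-image branch toward the image branch.
    kl_to_txt = F.kl_div(log_p_txt, log_p_img.exp(), reduction='batchmean')
    # KL(p_txt || p_img): pushes the image branch toward the non-image branch.
    kl_to_img = F.kl_div(log_p_img, log_p_txt.exp(), reduction='batchmean')
    return ce_img + ce_txt + kl_to_txt + kl_to_img

# Example usage with dummy data (feature dimensions are assumptions).
img_net, txt_net = SemanticNet(2048, 128, 20), SemanticNet(300, 128, 20)
img_feat, txt_feat = torch.randn(8, 2048), torch.randn(8, 300)
labels = torch.randint(0, 20, (8,))
_, logits_img = img_net(img_feat)
_, logits_txt = txt_net(txt_feat)
loss = mutual_learning_loss(logits_img, logits_txt, labels)
loss.backward()
```

The paper's actual objective likely also includes the GAN and reconstruction losses of the image generation network; this sketch only illustrates the mutual-learning component between the two semantic branches.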

Keywords: Multi-modality, Semi-supervised learning, Semantic networks, Generative adversarial networks, Multi-label learning

Article history: Received 5 May 2019, Revised 8 September 2019, Accepted 15 October 2019, Available online 12 November 2019, Version of Record 26 November 2019.

DOI: https://doi.org/10.1016/j.patcog.2019.107085