Transformer models for enhancing AttnGAN based text to image generation
Authors:
Highlights:
• A new variant of AttnGAN model is proposed for TTI synthesis.
• The proposed AttnGANTRANS model uses a Transformer-based text encoder.
• Transformers such as BERT, GPT2, and XLNet are employed and analysed.
• Experiments validate that AttnGANTRANS outperforms state-of-the-art methods.
• Compared with AttnGAN, AttnGANTRANS achieves a 49.9% lower FID and a 27.23% higher Inception Score.
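The core idea in the highlights is to replace AttnGAN's original recurrent text encoder with a Transformer (BERT, GPT2, or XLNet), which produces per-word features for the generator's attention maps and a pooled sentence feature for conditioning. A minimal, self-contained sketch of that encoder role, using a single NumPy self-attention layer with random weights purely for illustration (all names and dimensions are assumptions, not the paper's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_encode(tokens):
    """Stand-in for a Transformer text encoder.

    tokens: (T, d) embedded caption tokens.
    Returns per-word features (T, d) and a pooled sentence feature (d,),
    the two outputs AttnGAN's generator consumes.
    """
    T, d = tokens.shape
    # Random projections stand in for learned query/key/value weights.
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d))           # (T, T) attention weights
    word_features = attn @ V                       # one contextual vector per word
    sentence_feature = word_features.mean(axis=0)  # pooled sentence vector
    return word_features, sentence_feature

# An 8-token caption embedded into 256 dimensions (illustrative sizes).
tokens = rng.standard_normal((8, 256))
words, sent = self_attention_encode(tokens)
print(words.shape, sent.shape)  # (8, 256) (256,)
```

In AttnGAN, the word features drive the word-level attention at each image-refinement stage, while the sentence feature conditions the initial low-resolution stage; swapping the encoder changes only how these two tensors are produced.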
Keywords: Generative Adversarial Networks (GANs), Natural Language Processing (NLP), Text-to-image synthesis, Transformers, Attention mechanism
Article history: Received 2 April 2021, Revised 22 July 2021, Accepted 13 August 2021, Available online 25 August 2021, Version of Record 7 September 2021.
DOI: https://doi.org/10.1016/j.imavis.2021.104284