Lightweight dynamic conditional GAN with pyramid attention for text-to-image synthesis
Authors:
Highlights:
• We propose a Conditional Manipulating Modular (CM-M) in the Conditional Manipulating Block (CM-B) to compensate semantic information.
• We develop a Pyramid Attention Refine Block (PAR-B) to capture multi-scale context.
• The perceptual loss L1 and the image-consistency loss L2 are used to optimize the generator, improving the sharpness and consistency of generated images.
Keywords: Text-to-image synthesis, Conditional generative adversarial network (CGAN), Network complexity, Disentanglement process, Entanglement process, Information compensation, Pyramid attentive fusion
Review history: Received 18 June 2019, Revised 19 March 2020, Accepted 15 April 2020, Available online 29 May 2020, Version of Record 1 November 2020.
Paper URL: https://doi.org/10.1016/j.patcog.2020.107384