Coupled-dynamic learning for vision and language: Exploring Interaction between different tasks

Authors:

Highlights:

• We propose a novel coupled dynamic framework that exploits complementary knowledge across tasks: the image captioning and image synthesis tasks are trained synchronously, which effectively reduces the distance between their task-dependent dynamics.

• To embed adverse information into each individual network, we construct a dual-loss architecture that connects the different tasks. In particular, we propose a novel message interaction unit that interactively aligns the task-dependent dynamics. To improve optimization, we decompose the objective function into three consecutive steps, which allows Adadelta gradient updates to be used in general backpropagation problems.

• We perform comprehensive evaluations on three image benchmarks. Our framework achieves performance competitive with state-of-the-art methods. Furthermore, we explore various alignment formulas and the generalization properties of the coupled dynamic interactive learning framework.
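
The highlights do not state the form of the dual loss or of the message interaction unit. As a rough illustration only, the sketch below assumes the alignment term is a mean squared distance between the hidden states of the two task networks, added to the two task losses with a hypothetical weight. The names `alignment_distance`, `coupled_loss`, and `weight` are placeholders, not the authors' notation.

```python
def alignment_distance(h_caption, h_synthesis):
    """Mean squared distance between two equal-length sequences of
    hidden-state vectors (an assumed stand-in for the paper's alignment)."""
    if len(h_caption) != len(h_synthesis):
        raise ValueError("both sequences must have the same number of steps")
    total, count = 0.0, 0
    for vec_c, vec_s in zip(h_caption, h_synthesis):
        for c, s in zip(vec_c, vec_s):
            total += (c - s) ** 2
            count += 1
    return total / count


def coupled_loss(caption_loss, synthesis_loss, h_caption, h_synthesis, weight=0.5):
    """Sum of both task losses plus a weighted alignment penalty.

    `weight` is a hypothetical hyperparameter; the source gives no value.
    """
    penalty = alignment_distance(h_caption, h_synthesis)
    return caption_loss + synthesis_loss + weight * penalty


# Identical dynamics contribute no penalty; diverging dynamics raise the loss.
aligned = coupled_loss(1.0, 2.0, [[0.0, 0.0]], [[0.0, 0.0]])
diverged = coupled_loss(1.0, 2.0, [[1.0, 1.0]], [[0.0, 0.0]])
```

Under these assumptions, minimizing the combined value pulls the two networks' hidden states toward each other while each network still optimizes its own task loss.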

Keywords: Image captioning, Image synthesis, Coupled dynamics

Review history: Received 12 February 2020, Revised 17 November 2020, Accepted 7 December 2020, Available online 19 January 2021, Version of Record 28 January 2021.

Paper URL: https://doi.org/10.1016/j.patcog.2021.107829