DeepCT: A novel deep complex-valued network with learnable transform for video saliency prediction

Authors:

Highlights:

• We propose a new structure of DeepCT for video saliency prediction, in which a complex-valued CNN and a Convolutional LSTM are integrated.

• We propose learning multi-scale spatio-temporal transforms, through the developed complex-valued transform and inverse complex-valued transform modules.

• We formulate the learnable transforms through a cycle consistency loss, such that the transform and inverse transform can be paired by minimizing reconstruction errors in both the pixel and transformed domains.

• We evaluate the saliency prediction accuracy of our method on 2 databases using 4 metrics, together with statistical analysis. The experimental results show that our method outperforms 13 other state-of-the-art methods.
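The cycle consistency idea above can be sketched with a toy example. In DeepCT the paired transform and inverse transform are complex-valued convolutional modules learned jointly; here, purely as an illustrative assumption, each is a simple real-valued linear map, and the loss sums reconstruction errors in both the pixel and transformed domains:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the paper's learnable modules: a linear
# "transform" T and its paired "inverse transform" T_inv. The real model
# uses complex-valued convolutions; these names/shapes are assumptions.
T = np.eye(8) + 0.1 * rng.standard_normal((8, 8))  # forward transform
T_inv = np.linalg.inv(T)                           # perfectly paired inverse

def cycle_consistency_loss(x, fwd, inv):
    """Sum of reconstruction errors in pixel and transformed domains."""
    # Pixel domain: transform then invert should recover x.
    pixel_err = np.mean((x - inv @ (fwd @ x)) ** 2)
    # Transformed domain: invert then transform should recover y = fwd @ x.
    y = fwd @ x
    transform_err = np.mean((y - fwd @ (inv @ y)) ** 2)
    return pixel_err + transform_err

x = rng.standard_normal((8, 4))          # a toy "image" batch
loss = cycle_consistency_loss(x, T, T_inv)
```

When the two modules are exact inverses the loss is (numerically) zero; during training, minimizing this loss drives the learned transform and inverse transform toward being such a pair.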


Keywords: Saliency prediction, Complex-valued network, Learnable transform, Convolutional LSTM

Article history: Received 4 June 2019, Revised 24 December 2019, Accepted 23 January 2020, Available online 4 February 2020, Version of Record 12 February 2020.

DOI: https://doi.org/10.1016/j.patcog.2020.107234