DeepCT: A novel deep complex-valued network with learnable transform for video saliency prediction
Authors:
Highlights:
• We propose a new structure, DeepCT, for video saliency prediction, which integrates a complex-valued CNN with a Convolutional LSTM.
• We propose learning multi-scale spatio-temporal transforms through the developed complex-valued transform and inverse complex-valued transform modules.
• We formulate the learnable transforms through a cycle consistency loss, such that the transform and inverse transform can be paired by minimizing reconstruction errors in both the pixel and transformed domains.
• We evaluate the saliency prediction accuracy of our method over 2 databases and 4 metrics, together with statistical analysis. The experimental results show that our method outperforms 13 other state-of-the-art methods.
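The cycle consistency loss in the third highlight can be illustrated with a minimal sketch: a transform and its inverse are paired by penalizing reconstruction error in both the pixel and transformed domains. This is only an illustration of the idea, not the paper's implementation; NumPy's FFT stands in for the learned complex-valued transform.

```python
import numpy as np

def cycle_consistency_loss(x, T, T_inv):
    """Penalize reconstruction error in both the pixel domain and the
    transformed domain, so that T and T_inv behave as a matched pair.
    (Sketch only; the paper's T and T_inv are learned network modules.)"""
    z = T(x)                                    # pixel -> transformed domain
    x_rec = T_inv(z)                            # reconstruction in pixel domain
    z_rec = T(x_rec)                            # re-transform the reconstruction
    pixel_loss = np.mean(np.abs(x - x_rec) ** 2)
    transform_loss = np.mean(np.abs(z - z_rec) ** 2)
    return pixel_loss + transform_loss

# Toy check: for an exactly invertible complex-valued transform (the DFT),
# the cycle loss is near zero.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
loss = cycle_consistency_loss(x, np.fft.fft, np.fft.ifft)
```

During training, a learned transform pair would start with a large cycle loss, and minimizing it drives the two modules toward being inverses of each other.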
Keywords: Saliency prediction, Complex-valued network, Learnable transform, Convolutional LSTM
Article history: Received 4 June 2019, Revised 24 December 2019, Accepted 23 January 2020, Available online 4 February 2020, Version of Record 12 February 2020.
DOI: https://doi.org/10.1016/j.patcog.2020.107234