Video saliency prediction using enhanced spatiotemporal alignment network

Authors:

Highlights:

• A novel MDAN module is designed to align features across frames with deformable convolution (Dconv). This is the first work to apply Dconv to video saliency prediction (VSP).

• A novel Bi-ConvLSTM is introduced to VSP, which makes full use of long-term temporal context in both the forward and backward temporal directions.

• Extensive evaluations on four VSP benchmarks demonstrate that the proposed method achieves competitive performance against state-of-the-art methods.

Keywords: Video saliency prediction, Feature alignment, Deformable convolution, Bidirectional ConvLSTM

Review history: Received 30 December 2019, Revised 18 April 2020, Accepted 24 August 2020, Available online 27 August 2020, Version of Record 1 September 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107615