DevsNet: Deep Video Saliency Network using Short-term and Long-term Cues

Authors:

Highlights:

• We propose a video saliency detection model that uses a new 3D-ConvNet to extract short-term spatiotemporal cues and a B-ConvLSTM to extract long-term ones. By combining the short-term and long-term spatiotemporal features, the model achieves promising performance in video saliency prediction.

• We design a new two-layer B-ConvLSTM structure for long-term spatiotemporal feature extraction in video saliency detection. The B-ConvLSTM extracts temporal information not only from the previous video frames but also from the subsequent frames, so the network takes both forward and backward temporal features into account.
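
The bidirectional idea in the second highlight can be illustrated with a toy sketch. This is not the paper's B-ConvLSTM: scalar per-frame features stand in for feature maps, and a leaky running average (the hypothetical `toy_step`) stands in for a ConvLSTM cell. All function names here are assumptions made for illustration. The point is only that each frame ends up with one feature computed from the past and one computed from the future.

```python
from typing import Callable, List, Sequence, Tuple

Step = Callable[[float, float], float]


def toy_step(state: float, x: float) -> float:
    # Hypothetical stand-in for a ConvLSTM cell: a leaky running average.
    return 0.5 * state + 0.5 * x


def run_recurrence(frames: Sequence[float], step: Step) -> List[float]:
    # Scan the frames in the given order, keeping the state after each frame.
    state = 0.0
    states = []
    for x in frames:
        state = step(state, x)
        states.append(state)
    return states


def bidirectional_features(
    frames: Sequence[float], step: Step = toy_step
) -> List[Tuple[float, float]]:
    # Forward pass: the state at frame t depends on frames 0..t.
    forward = run_recurrence(frames, step)
    # Backward pass: scan the reversed sequence, then restore frame order,
    # so the state at frame t depends on frames t..end.
    backward = run_recurrence(list(reversed(frames)), step)[::-1]
    # Pair the two directions per frame (a stand-in for feature fusion).
    return list(zip(forward, backward))


features = bidirectional_features([0.0, 0.0, 1.0])
```

In this example only the last frame carries a signal. The forward feature of the first frame is zero because nothing has been seen yet, while its backward feature is non-zero because the backward pass has already visited the later frame.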

Keywords: Video saliency detection, Spatiotemporal saliency, 3D convolution network (3D-ConvNet), Bidirectional convolutional long short-term memory network (B-ConvLSTM)

Article history: Received 7 May 2019, Revised 30 January 2020, Accepted 19 February 2020, Available online 21 February 2020, Version of Record 28 February 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107294