Exploring global diverse attention via pairwise temporal relation for video summarization

Authors:

Highlights:

• A global diverse attention mechanism is developed to model the long-range temporal dependencies of a video using pairwise relations between every two frames, regardless of the stride between them.

• SUM-GDA incurs low computational cost because it operates directly on the pairwise similarity matrix.

• SUM-GDA is evaluated in supervised, unsupervised, and semi-supervised scenarios, with both quantitative and qualitative empirical studies.

• The diversity of generated summaries and the influence of optical flow features are both investigated.
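
The first two highlights can be illustrated with a minimal sketch of attention computed over a pairwise frame-similarity matrix. This is not the authors' SUM-GDA implementation: the function names `pairwise_attention` and `diversity_scores`, the scaled dot-product similarity, and the diversity heuristic are all assumptions made for illustration, and the frame features are random placeholders rather than CNN outputs.

```python
import numpy as np


def pairwise_attention(features):
    """Row-normalised attention over all frame pairs (hypothetical helper)."""
    n, d = features.shape
    # Scaled dot-product similarity between every two frames: shape (n, n).
    # Every pair is compared directly, whatever the temporal distance.
    sim = features @ features.T / np.sqrt(d)
    # Numerically stable softmax over each row.
    sim = sim - sim.max(axis=1, keepdims=True)
    weights = np.exp(sim)
    return weights / weights.sum(axis=1, keepdims=True)


def diversity_scores(attn):
    """Assumed heuristic: frames receiving little attention score higher."""
    received = attn.mean(axis=0)  # average attention each frame receives
    scores = 1.0 - received
    return scores / scores.sum()  # normalise to a distribution over frames


# Placeholder features: 6 frames, 8 dimensions each (random, not real data).
rng = np.random.default_rng(0)
frames = rng.normal(size=(6, 8))
attn = pairwise_attention(frames)
scores = diversity_scores(attn)
```

In this sketch the only quadratic-size object is the n-by-n similarity matrix, and no recurrent pass over the frame sequence is required, which is one plausible reading of why working directly on the pairwise matrix keeps the cost low.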


Keywords: Global diverse attention, Pairwise temporal relation, Video summarization, Convolutional neural networks

Review history: Received 4 May 2020, Revised 4 July 2020, Accepted 22 September 2020, Available online 28 September 2020, Version of Record 28 September 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107677