Exploring global diverse attention via pairwise temporal relation for video summarization

Authors:

Highlights:

• A global diverse attention mechanism is developed to model long-range temporal dependencies in video by using the pairwise relation between every two frames, regardless of the stride between them.

• SUM-GDA incurs very little computational cost because it operates directly on the pairwise similarity matrix.

• SUM-GDA is evaluated in supervised, unsupervised, and semi-supervised scenarios, with both quantitative and qualitative empirical studies.

• The diversity of generated summaries and the influence of optical flow features are both investigated.
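
The pairwise-relation idea in the highlights can be illustrated with a minimal NumPy sketch. This is not the SUM-GDA implementation: the scaled dot-product similarity, the function names `pairwise_attention` and `diversity_scores`, and the scoring rule (a frame that receives little attention from the others is treated as more diverse) are all assumptions made for illustration only.

```python
import numpy as np


def pairwise_attention(features):
    """Return a (T x T) row-stochastic attention matrix over all frame pairs.

    Every frame is compared with every other frame, so the temporal
    distance (stride) between two frames plays no role.
    Assumption: scaled dot-product similarity followed by a row-wise softmax.
    """
    _, dim = features.shape
    scores = features @ features.T / np.sqrt(dim)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)


def diversity_scores(attention):
    """Hypothetical scoring rule, not taken from the paper.

    A frame that receives little attention on average is dissimilar to
    the rest of the video, so it gets a higher normalized score.
    """
    received = attention.mean(axis=0)  # average attention each frame receives
    dissimilarity = 1.0 - received
    return dissimilarity / dissimilarity.sum()


# Toy input: 6 frames with 8-dimensional features (random stand-ins for CNN features).
rng = np.random.default_rng(0)
frames = rng.normal(size=(6, 8))
attn = pairwise_attention(frames)
scores = diversity_scores(attn)
```

Everything here is computed from a single T x T matrix, which is the sense in which working directly on the pairwise similarity matrix keeps the cost low for moderate T.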

Keywords: Global diverse attention, Pairwise temporal relation, Video summarization, Convolutional neural networks

Review history: Received 4 May 2020, Revised 4 July 2020, Accepted 22 September 2020, Available online 28 September 2020, Version of Record 28 September 2020.

Paper URL: https://doi.org/10.1016/j.patcog.2020.107677