Exploring global diverse attention via pairwise temporal relation for video summarization
Authors:
Highlights:
• A global diverse attention mechanism is developed to model the long-range temporal dependency of a video using pairwise relations between every pair of frames, regardless of the stride between them.
• SUM-GDA incurs low computational cost because it operates directly on the pairwise similarity matrix.
• SUM-GDA is evaluated in supervised, unsupervised and semi-supervised scenarios, with both quantitative and qualitative empirical studies.
• The diversity of generated summaries and the influence of optical flow features are both investigated.
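
To make the idea of attention over an all-pairs similarity matrix concrete, here is a minimal sketch in plain Python. It is not the SUM-GDA formulation: the function name `pairwise_attention`, the scaled dot-product similarity, the row-wise softmax and the column-average frame weight are all assumptions chosen for illustration, and the diversity component named in the highlights is not modeled.

```python
import math


def pairwise_attention(frames):
    """Return (attention matrix, per-frame weights) for a list of frame features.

    frames: list of equal-length feature vectors (lists of floats), one per frame.
    Hypothetical helper for illustration only; not the paper's method.
    """
    n = len(frames)
    scale = math.sqrt(len(frames[0]))

    # Similarity between every pair of frames (i, j). The temporal distance
    # |i - j| plays no role, so distant frames are compared like adjacent ones.
    sim = [
        [sum(a * b for a, b in zip(fi, fj)) / scale for fj in frames]
        for fi in frames
    ]

    # Row-wise softmax turns each row of similarities into attention weights.
    attn = []
    for row in sim:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        attn.append([e / total for e in exps])

    # Assumed scoring rule: a frame's weight is the mean attention it receives.
    weights = [sum(attn[i][j] for i in range(n)) / n for j in range(n)]
    return attn, weights


# Toy input: frames 0 and 2 are identical although they are not adjacent.
toy_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
attn, weights = pairwise_attention(toy_frames)
```

In this toy input, frame 0 attends to frame 2 exactly as strongly as it attends to itself, which is the stride-independence property the first highlight describes. The matrix has one entry per frame pair, so memory and time in this sketch grow quadratically with the number of frames.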
Keywords: Global diverse attention, Pairwise temporal relation, Video summarization, Convolutional neural networks
Review history: Received 4 May 2020, Revised 4 July 2020, Accepted 22 September 2020, Available online 28 September 2020, Version of Record 28 September 2020.
Paper URL: https://doi.org/10.1016/j.patcog.2020.107677