Video scene detection using graph-based representations

Authors:

Highlights:

Abstract

One of the fundamental steps in organizing videos is to parse them into smaller descriptive parts. One way of realizing this step is to obtain shot or scene information. A video scene consists of one or more consecutive, semantically correlated shots that share the same content. Scenes differ from shots in how their boundaries are defined: scenes have semantic boundaries, whereas shots have physical boundaries. In this paper, we concentrate on developing a video scene detection method that is both fast and accurate. Our graph-partition-based video scene boundary detection approach uses multiple features extracted from the video and determines the scene boundaries through an unsupervised clustering procedure. For each shot-to-shot comparison feature, a one-dimensional signal is constructed from graph partitions obtained from the similarity matrix in a temporal interval. After each one-dimensional signal is filtered, unsupervised clustering is performed to find the video scene boundaries. We adopt two different graph-based approaches in a single framework in order to find video scene boundaries. The proposed method is evaluated and compared with a graph-based video scene detection method presented in the literature.
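The pipeline outlined in the abstract (similarity matrix, graph partition in a temporal window, one-dimensional signal, filtering, unsupervised clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the partition criterion, the filter, or the clustering algorithm, so the sketch assumes a normalized-cut score, a moving-average filter, and a two-cluster 1-D k-means, and it uses a single synthetic similarity matrix instead of multiple features. All function names and parameters (`ncut_signal`, `smooth`, `detect_boundaries`, `half_window`, `width`) are hypothetical.

```python
import numpy as np


def ncut_signal(sim, half_window=3):
    """Normalized-cut score of splitting a local window just before shot t (t = 1..n-1).

    Assumes a symmetric similarity matrix with strictly positive entries.
    """
    n = sim.shape[0]
    values = []
    for t in range(1, n):
        lo, hi = max(0, t - half_window), min(n, t + half_window)
        win = sim[lo:hi, lo:hi]          # similarities inside the temporal window
        k = t - lo                       # number of shots on the left side
        cut = win[:k, k:].sum()          # weight of edges crossing the split
        values.append(cut / win[:k].sum() + cut / win[k:].sum())
    return np.array(values)


def smooth(signal, width=3):
    """Moving-average filter with edge padding (stand-in for the paper's filter)."""
    padded = np.pad(signal, width // 2, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")


def detect_boundaries(sim, half_window=3, width=3, iters=20):
    """Return shot indices at which a new scene is assumed to start."""
    s = smooth(ncut_signal(sim, half_window), width)
    if np.isclose(s.min(), s.max()):
        return []                        # flat signal: no evidence of a boundary
    # Two-cluster 1-D k-means (assumed clustering step): low scores vs. high scores.
    centers = np.array([s.min(), s.max()])
    for _ in range(iters):
        low = np.abs(s - centers[0]) <= np.abs(s - centers[1])
        centers = np.array([s[low].mean(), s[~low].mean()])
    # Keep only strict local minima inside the low-score cluster.
    padded = np.pad(s, 1, constant_values=np.inf)
    is_min = (s < padded[:-2]) & (s < padded[2:])
    return [int(i) + 1 for i in np.flatnonzero(low & is_min)]


# Synthetic example: 15 shots forming three scenes of 5, 4, and 6 shots.
scene_of_shot = np.repeat([0, 1, 2], [5, 4, 6])
sim = np.where(scene_of_shot[:, None] == scene_of_shot[None, :], 0.9, 0.1)
np.fill_diagonal(sim, 1.0)
boundaries = detect_boundaries(sim)
```

On this synthetic matrix the sketch reports new scenes starting at shots 5 and 9, which matches how the data was built. With real features, one such signal would presumably be computed per feature before clustering, as the abstract describes.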

Keywords: Video scene detection, Graph-based representation, Graph partitioning, Dominant sets

Review history: Received 17 June 2010, Accepted 12 October 2010, Available online 20 October 2010.

Official paper URL: https://doi.org/10.1016/j.image.2010.10.001