Automated location matching in movies

Authors:

Abstract

We describe progress in matching shots that image the same 3D location in a film. The problem is hard because the camera viewpoint may change substantially between shots, with consequent changes in the imaged appearance of the scene due to foreshortening, scale changes, partial occlusion and lighting changes. We develop and compare two methods for this task. In the first method, we match key frames between shots using wide baseline matching techniques. The wide baseline method represents each frame by a set of viewpoint-covariant local features. The local spatial support of the features means that segmentation of the frame (e.g., into foreground/background) is not required, and partial occlusion is tolerated. Matching proceeds through a series of stages, starting with indexing based on a viewpoint-invariant description of the features, then employing semi-local constraints (such as spatial consistency) and finally global constraints (such as epipolar geometry). In the second method, the temporal continuity within a shot is used to compute invariant descriptors for tracked features, and these descriptors are the basic matching unit. The temporal information increases both the signal-to-noise ratio of the data and the stability of the computed features. We develop analogues of local spatial consistency, cross-correlation, and epipolar geometry for these tracks. Results of matching shots for a number of very different scene types are illustrated on two entire commercial films.
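
The first two stages of the staged matching described in the abstract (descriptor indexing, then a semi-local spatial consistency check) can be illustrated with a minimal sketch. This is not the authors' implementation: the mutual nearest-neighbour rule, the distance threshold `max_dist`, the neighbourhood size `k`, the vote threshold `min_support`, and both function names are assumptions chosen for illustration. The global stage (epipolar geometry) is not shown.

```python
from math import dist


def index_matches(desc_a, desc_b, max_dist=0.5):
    """Stage 1 (assumed rule): keep mutual nearest neighbours in
    descriptor space whose distance is at most max_dist."""
    def nearest(d, pool):
        # Index of the descriptor in pool closest to d.
        return min(range(len(pool)), key=lambda j: dist(d, pool[j]))

    matches = []
    for i, d in enumerate(desc_a):
        j = nearest(d, desc_b)
        if dist(d, desc_b[j]) <= max_dist and nearest(desc_b[j], desc_a) == i:
            matches.append((i, j))
    return matches


def spatially_consistent(matches, pos_a, pos_b, k=2, min_support=2):
    """Stage 2 (assumed rule): keep a match only if at least min_support
    of its k nearest matched neighbours in frame A are also among its
    k nearest matched neighbours in frame B."""
    kept = []
    for i, j in matches:
        others = [m for m in matches if m != (i, j)]
        # Neighbouring matches ranked by image distance in each frame.
        near_a = sorted(others, key=lambda m: dist(pos_a[i], pos_a[m[0]]))[:k]
        near_b = sorted(others, key=lambda m: dist(pos_b[j], pos_b[m[1]]))[:k]
        if len(set(near_a) & set(near_b)) >= min_support:
            kept.append((i, j))
    return kept
```

Here `desc_a` and `desc_b` are lists of descriptor vectors for two key frames, and `pos_a` and `pos_b` are the corresponding image positions. A match whose descriptors agree but whose neighbouring matches differ between the two frames is discarded in the second stage, which is the role the abstract assigns to semi-local constraints.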

Keywords:

Review history: Received 1 September 2002, Accepted 1 June 2003, Available online 22 October 2003.

Paper URL: https://doi.org/10.1016/j.cviu.2003.06.008