Understanding and localizing activities from correspondences of clustered trajectories

Authors:

Abstract

We present an approach to human activity recognition based on trajectory grouping. Our representation allows partial matching between videos, which yields a robust similarity measure. This is particularly useful in sports videos, where multiple entities are involved in each activity. Many existing works perform person detection and tracking, and often require camera calibration, in order to extract the motion and appearance of every player and object in the scene. In this work we overcome these limitations and propose an approach that exploits the spatio-temporal structure of a video by grouping local spatio-temporal features in an unsupervised manner. Our representation measures video similarity by establishing correspondences among arbitrary patterns. We show how our clusters can be used to generate frame-wise action proposals, and we exploit these proposals to further improve our representation for localization and recognition. We test our method on sport-specific and generic activity datasets, reporting results above the existing state of the art.
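The abstract does not spell out the similarity measure, so the following is only a minimal sketch of what partial matching between two videos could look like, not the authors' method. It assumes each video has already been reduced to a list of cluster descriptors (plain numeric vectors), and it swaps in a simple greedy one-to-one assignment on cosine similarity. The names `cosine`, `partial_match_similarity`, and the threshold `min_sim` are hypothetical and introduced here for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def partial_match_similarity(clusters_a, clusters_b, min_sim=0.5):
    """Greedy one-to-one matching between two sets of cluster descriptors.

    Assumption: this is an illustrative stand-in, not the measure used in the paper.
    Pairs whose similarity falls below min_sim stay unmatched, so only part of
    each video has to correspond to the other.
    """
    # Score every cross-video pair and visit the pairs from most to least similar.
    pairs = sorted(
        (
            (cosine(a, b), i, j)
            for i, a in enumerate(clusters_a)
            for j, b in enumerate(clusters_b)
        ),
        reverse=True,
    )
    used_a, used_b = set(), set()
    total = 0.0
    for sim, i, j in pairs:
        if sim < min_sim:
            break  # every remaining pair is weaker still
        if i in used_a or j in used_b:
            continue  # each cluster takes part in at most one correspondence
        used_a.add(i)
        used_b.add(j)
        total += sim
    # Normalise by the smaller set so unmatched extras in the larger video are not penalised.
    smaller = min(len(clusters_a), len(clusters_b))
    return total / smaller if smaller else 0.0


# Toy descriptors: video_b contains both patterns of video_a plus one unrelated cluster.
video_a = [[1.0, 0.0], [0.0, 1.0]]
video_b = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
score = partial_match_similarity(video_a, video_b)
```

In this toy case the extra cluster in `video_b` is left unmatched and does not lower the score, which is the behaviour one would want from a partial match when additional players or objects appear in only one of the two videos.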

Keywords:

Review history: Received 21 March 2016, Revised 15 November 2016, Accepted 29 November 2016, Available online 1 December 2016, Version of Record 7 June 2017.

Paper URL: https://doi.org/10.1016/j.cviu.2016.11.007