The spatial Laplacian and temporal energy pyramid representation for human action recognition using depth sequences

Abstract

Depth sequences are useful for action recognition because they are insensitive to illumination variation and provide geometric information. Many current action recognition methods are computationally expensive and require large-scale training data. Here we propose an effective method for human action recognition using depth sequences captured by depth cameras. A multi-resolution operation, the spatial Laplacian and temporal energy pyramid (SLTEP), decomposes the depth sequences into frequency bands at different spatial and temporal positions. A spatial aggregation and fusion scheme is then applied to cluster the low-level features and to concatenate two different feature types extracted from the low- and high-frequency levels, respectively. We evaluate our approach on five public benchmark datasets (MSRAction3D, MSRGesture3D, MSRActionPairs, MSRDailyActivity3D, and NTU RGB+D). The results demonstrate its advantages over existing methods and suggest that it is likely to be highly useful for online applications.
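The abstract does not specify the filters, the number of levels, or the definition of "energy" used in SLTEP, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes 2x2 box averaging for downsampling, nearest-neighbour upsampling, and the sum of squared differences between consecutive frames as motion energy. All function names are hypothetical.

```python
from typing import List

Frame = List[List[float]]  # one depth map, stored as rows of pixel values


def downsample(frame: Frame) -> Frame:
    """Halve the resolution by averaging non-overlapping 2x2 blocks (assumed filter)."""
    rows, cols = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * r][2 * c] + frame[2 * r][2 * c + 1]
             + frame[2 * r + 1][2 * c] + frame[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def upsample(frame: Frame) -> Frame:
    """Double the resolution by repeating each pixel (nearest neighbour)."""
    out: Frame = []
    for row in frame:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def laplacian_pyramid(frame: Frame, levels: int) -> List[Frame]:
    """Return `levels` band-pass layers followed by the low-pass residual.

    Frame height and width must be divisible by 2**levels.
    """
    pyramid: List[Frame] = []
    current = frame
    for _ in range(levels):
        smaller = downsample(current)
        coarse = upsample(smaller)
        # Band-pass layer: detail lost when moving to the coarser level.
        band = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(current, coarse)]
        pyramid.append(band)
        current = smaller
    pyramid.append(current)
    return pyramid


def temporal_energy_pyramid(frames: List[Frame], levels: int) -> List[List[float]]:
    """Motion energy per temporal segment; level k splits the sequence into 2**k segments."""
    # Energy of each transition: sum of squared differences between consecutive frames.
    diffs = [
        sum((a - b) ** 2 for ra, rb in zip(f1, f0) for a, b in zip(ra, rb))
        for f0, f1 in zip(frames, frames[1:])
    ]
    pyramid = []
    for k in range(levels):
        parts = 2 ** k
        size = len(diffs) / parts
        pyramid.append(
            [sum(diffs[int(i * size): int((i + 1) * size)]) for i in range(parts)]
        )
    return pyramid
```

In this sketch, the band-pass layers of `laplacian_pyramid` stand in for the high-frequency levels and the final residual for the low-frequency level. A descriptor could then be built by concatenating features from both, in the spirit of the fusion step described above.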

Keywords: Action recognition, Depth maps, Spatial Laplacian pyramid, Temporal energy pyramid, Feature fusion

Review history: Received 9 August 2016, Revised 20 January 2017, Accepted 23 January 2017, Available online 31 January 2017, Version of Record 27 February 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.01.035