Spatio-temporal stacking model for skeleton-based action recognition

Authors: Yufeng Zhong, Qiuyan Yan

Abstract

Owing to the prevalence of affordable depth sensors, skeleton-based action recognition has attracted much attention as a significant computer vision task. State-of-the-art recognition accuracy usually comes from complicated deep learning networks that need a large quantity of training data. By contrast, non-deep-learning methods are easy to train and understand, but they have limited expressive ability to extract the spatial and temporal features of skeleton data simultaneously. It is therefore challenging to use a shallow learning architecture to identify complex actions in skeleton data effectively. In this paper, we first combine Temporal Hierarchy Pyramid (THP) and Symmetric Positive Definite (SPD) features to capture the inter-frame temporal relationships and the intra-frame spatial relationships simultaneously. Then, to match the learning ability of a deep learning network on a non-linear system, we propose a novel stacking ensemble-based method to identify complex actions in skeleton data effectively. We carry out extensive verification of our method on widely used 3D action recognition datasets. The experimental results indicate that our method achieves state-of-the-art performance on all compared datasets.
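To make the feature idea concrete, the following is a minimal sketch of pairing a temporal pyramid with SPD descriptors. The abstract does not say how the pyramid is split or how the SPD matrices are built, so two assumptions are made here: each pyramid level halves the segments of the previous one, and each segment is described by a regularized covariance matrix of its joint coordinates. The function names (`temporal_pyramid`, `spd_descriptor`, `thp_spd_features`) and the parameters `levels` and `eps` are hypothetical. The stacking ensemble is not shown.

```python
import numpy as np


def temporal_pyramid(num_frames, levels):
    """Return (start, end) frame ranges; level l splits the sequence into 2**l parts.

    Assumption: a dyadic split, which the abstract does not confirm.
    """
    segments = []
    for level in range(levels):
        parts = 2 ** level
        bounds = np.linspace(0, num_frames, parts + 1).astype(int)
        segments.extend((int(s), int(e)) for s, e in zip(bounds[:-1], bounds[1:]))
    return segments


def spd_descriptor(frames, eps=1e-6):
    """Covariance of the per-frame coordinate vectors, shifted by eps * I.

    Assumption: covariance is used as the SPD feature. The eps term keeps the
    matrix strictly positive definite when a segment has fewer frames than
    coordinates and the raw covariance is rank deficient.
    """
    cov = np.cov(frames, rowvar=False)
    return cov + eps * np.eye(cov.shape[0])


def thp_spd_features(sequence, levels=3):
    """Map a (T, J, 3) skeleton sequence to one SPD matrix per pyramid segment."""
    num_frames = sequence.shape[0]
    flat = sequence.reshape(num_frames, -1)  # (T, 3J): one row per frame
    return [spd_descriptor(flat[s:e]) for s, e in temporal_pyramid(num_frames, levels)]
```

Under these assumptions, a sequence of 40 frames with 5 joints and `levels=3` yields 7 matrices of size 15 x 15: the covariance inside each segment summarizes how joints vary together (spatial structure), while the coarse-to-fine segments preserve the order in which those patterns occur (temporal structure).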

Keywords: Action recognition, Skeleton data, Stacking model, Ensemble learning

Paper URL: https://doi.org/10.1007/s10489-021-02994-z