High performance visual tracking with circular and structural operators

Authors:

Abstract

Visual tracking algorithms based on the structured output support vector machine (SOSVM) have demonstrated excellent performance. However, the sampling methods and optimization strategies of SOSVM increase the computational overhead, which hinders real-time application of these algorithms. Moreover, because they lack high-dimensional features and dense training samples, SOSVM-based algorithms are unstable in challenging scenarios such as occlusions and scale variations. Recently, algorithms based on discriminative correlation filters (DCF), especially those combining DCF with features from deep convolutional neural networks (CNN), have been successfully applied to visual tracking and attain surprisingly good performance on recent benchmarks. This success is mainly attributed to two aspects: the circular correlation properties of DCF and the powerful representation capabilities of CNN features. Nevertheless, compared with SOSVM, DCF-based algorithms are restricted to simple ridge regression, which has weaker discriminative ability. In this paper, a novel circular and structural operator tracker (CSOT) is proposed for high performance visual tracking. It not only possesses the powerful discriminative capability of SOSVM but also inherits the superior computational efficiency of DCF. Based on the proposed circular and structural operators, a set of primal confidence score maps is obtained by circularly correlating feature maps with their corresponding structural correlation filters. Furthermore, an implicit interpolation is applied to convert the multi-resolution feature maps to the continuous domain, so that all primal confidence score maps have the same spatial resolution. We then exploit an efficient ensemble post-processor based on relative entropy, which coalesces the primal confidence score maps into an optimal confidence score map for more accurate localization. The target is localized at the peak of the optimal confidence score map. In addition, we introduce a collaborative optimization strategy that updates the circular and structural operators by iteratively training the structural correlation filters, which significantly reduces computational complexity and improves robustness. Experimental results demonstrate that our approach achieves state-of-the-art performance, with mean AUC scores of 71.5% and 69.4% on the OTB2013 and OTB2015 benchmarks, respectively, and obtains the third-best expected average overlap (EAO) score of 29.8% on the VOT2017 benchmark.
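
For readers unfamiliar with the DCF baseline that the abstract contrasts with SOSVM, the following is a minimal sketch of a single-channel correlation filter trained by ridge regression in the Fourier domain, with the target localized at the peak of the response map. It is not the CSOT method: it uses no structural operators, CNN features, interpolation, or ensemble post-processing. The function names, the Gaussian label width `sigma`, and the regularization weight `lam` are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peaked at index (0, 0), wrapping circularly."""
    h, w = shape
    # Circular distance of every row/column index to index 0.
    ys = np.minimum(np.arange(h), h - np.arange(h))
    xs = np.minimum(np.arange(w), w - np.arange(w))
    d2 = ys[:, None] ** 2 + xs[None, :] ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))


def train_filter(x, y, lam=1e-2):
    """Closed-form ridge regression over all circular shifts of x.

    Assumed formulation (illustrative): H = Y * conj(X) / (|X|^2 + lam),
    evaluated element-wise in the Fourier domain.
    """
    X = np.fft.fft2(x)
    Y = np.fft.fft2(y)
    return Y * np.conj(X) / (X * np.conj(X) + lam)


def detect(H, z):
    """Apply the filter to patch z; return the peak location and the score map."""
    response = np.real(np.fft.ifft2(H * np.fft.fft2(z)))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak), response


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((32, 32))          # synthetic stand-in for a feature map
    H = train_filter(x, gaussian_label(x.shape))
    z = np.roll(x, shift=(5, -3), axis=(0, 1))  # simulate target motion as a circular shift
    peak, _ = detect(H, z)
    print(peak)
```

Because every circular shift of the training patch is handled implicitly by the FFT, training and detection cost only a few transforms. In this toy setting the peak of the response map should appear at the applied shift, taken modulo the patch size.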

Keywords: Visual tracking, Circular and structural operators, Ensemble post-processor, Collaborative optimization

Review history: Received 22 March 2018, Revised 6 August 2018, Accepted 8 August 2018, Available online 9 August 2018, Version of Record 31 October 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.08.008