Design of an Asynchronous Correlation Discriminant Single-Object Tracker Based on a Siamese Network
-
Abstract: Existing Siamese-network-based single-object tracking algorithms achieve high tracking accuracy, but these trackers cannot be updated online and rely heavily on the semantic information of the target during tracking. As a result, they fail when facing distractors with similar semantic information. To address this issue, this paper proposes an asynchronous correlation response calculation model, together with an efficient method for exploiting the target's semantic information across different frames. On this basis, a new discriminative Siamese-network-based tracker is proposed. To address the slow convergence of first-order optimization when updating the discriminant model, an approximate second-order optimization method is introduced to update the model online. To evaluate the effectiveness of the proposed method, comparison experiments against recent state-of-the-art trackers are conducted on GOT-10k, TC128, OTB, and VOT2018. The experimental results demonstrate that the proposed method significantly improves the performance of the baseline.
-
Key words:
- Siamese network
- semantic information
- asynchronous correlation
- discriminative
- online update
-
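The approximate second-order update mentioned in the abstract can be illustrated with a minimal sketch. The loss below (a ridge-regression discriminant filter) and all names are hypothetical stand-ins, not the paper's actual model; the point is that a Gauss-Newton direction, solved with a few conjugate-gradient iterations, replaces a plain gradient step and converges in far fewer model updates:

```python
import numpy as np

def conjugate_gradient(A_mv, b, n_iter=10):
    """Solve A x = b, with A given only as a matrix-vector product A_mv."""
    x = np.zeros_like(b)
    r = b - A_mv(x)          # initial residual
    p = r.copy()             # initial search direction
    rs = r @ r
    for _ in range(n_iter):
        Ap = A_mv(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < 1e-20:   # residual small enough, stop early
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def gauss_newton_update(w, X, y, lam=0.1, n_cg=10):
    """One approximate second-order step for the (stand-in) loss
    L(w) = ||X w - y||^2 + lam ||w||^2.
    The Gauss-Newton Hessian H = X^T X + lam I is applied matrix-free."""
    grad = X.T @ (X @ w - y) + lam * w
    H_mv = lambda v: X.T @ (X @ v) + lam * v
    step = conjugate_gradient(H_mv, grad, n_iter=n_cg)
    return w - step
```

For a quadratic loss like this one, a single Gauss-Newton step with enough CG iterations already lands on the minimizer, which is exactly the fast-convergence behavior a first-order update lacks.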
Fig. 6 Experimental results on six OTB50 sequences. Init Sampler denotes $k_0$, computed from the target in the first frame; Current Sampler denotes $k_t$, computed from the target in the current frame; Optim Sampler denotes $k_t = \dfrac{1}{m}\sum_{i=1}^{m}\Phi_i(k_t)$, obtained after optimizing the discriminant model on the current $k_t$.
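The averaging in the caption, $k_t = \frac{1}{m}\sum_{i=1}^{m}\Phi_i(k_t)$, can be sketched as follows; `phi` is a hypothetical stand-in for the per-iteration refinement operator $\Phi_i$:

```python
import numpy as np

def average_optimized_sampler(k_t, phi, m):
    """Apply m refinement operators Phi_i to the current sampler k_t
    and average the results: k_t <- (1/m) * sum_i Phi_i(k_t)."""
    return sum(phi(k_t, i) for i in range(1, m + 1)) / m
```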
Table 1 Ablation experiment of the proposed algorithm and the baseline

| Method | AO | $ {\rm SR}_{0.5} $ | $ {\rm SR}_{0.75} $ | FPS |
|---|---|---|---|---|
| baseline | 0.445 | 0.539 | 0.208 | 21.95 |
| baseline+AC | 0.445 | 0.539 | 0.211 | 20.03 |
| baseline+AC+S | 0.447 | 0.542 | 0.211 | 19.63 |
| baseline+AC+S+$ {\rm D}_{\rm KL} $ (m = 3) | 0.442 | 0.537 | 0.209 | 18.72 |
| baseline+AC+S+$ {\rm D}_{\rm KL} $ (m = 6) | 0.457 | 0.553 | 0.215 | 18.60 |
| baseline+AC+S+$ {\rm D}_{\rm KL} $ (m = 9) | 0.440 | 0.532 | 0.211 | 18.49 |
Table 2 Comparison of tracking accuracy on OTB2013 under the BC, DEF, FM, and IPR attributes (S: success, P: precision)

| Method | BC S | BC P | DEF S | DEF P | FM S | FM P | IPR S | IPR P |
|---|---|---|---|---|---|---|---|---|
| ECO-HC | 0.700 | 0.559 | 0.567 | 0.719 | 0.570 | 0.697 | 0.517 | 0.648 |
| ECO | 0.776 | 0.619 | 0.613 | 0.772 | 0.655 | 0.783 | 0.630 | 0.764 |
| ATOM | 0.733 | 0.598 | 0.623 | 0.771 | 0.595 | 0.709 | 0.579 | 0.714 |
| DIMP | 0.749 | 0.607 | 0.602 | 0.740 | 0.618 | 0.739 | 0.561 | 0.685 |
| MDNet | 0.777 | 0.621 | 0.620 | 0.780 | 0.652 | 0.796 | 0.658 | 0.822 |
| SiamFC | 0.605 | 0.494 | 0.487 | 0.608 | 0.509 | 0.618 | 0.483 | 0.583 |
| DaSiamRPN | 0.728 | 0.592 | 0.609 | 0.761 | 0.565 | 0.702 | 0.625 | 0.780 |
| SiamRPN (baseline) | 0.605 | 0.745 | 0.591 | 0.724 | 0.589 | 0.724 | 0.627 | 0.770 |
| baseline+AC | 0.605 | 0.745 | 0.591 | 0.724 | 0.589 | 0.724 | 0.627 | 0.770 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=3} $ | 0.599 | 0.741 | 0.603 | 0.749 | 0.645 | 0.797 | 0.651 | 0.808 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=6} $ | 0.592 | 0.733 | 0.597 | 0.742 | 0.636 | 0.787 | 0.650 | 0.807 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=9} $ | 0.598 | 0.736 | 0.586 | 0.725 | 0.587 | 0.723 | 0.654 | 0.809 |
Table 3 Comparison of tracking accuracy on OTB2013 under the IV, LR, MB, and OCC attributes (S: success, P: precision)

| Method | IV S | IV P | LR S | LR P | MB S | MB P | OCC S | OCC P |
|---|---|---|---|---|---|---|---|---|
| ECO-HC | 0.556 | 0.690 | 0.536 | 0.619 | 0.566 | 0.685 | 0.586 | 0.749 |
| ECO | 0.616 | 0.766 | 0.569 | 0.677 | 0.659 | 0.786 | 0.636 | 0.800 |
| ATOM | 0.604 | 0.749 | 0.554 | 0.654 | 0.529 | 0.665 | 0.617 | 0.762 |
| DIMP | 0.606 | 0.749 | 0.485 | 0.571 | 0.564 | 0.695 | 0.610 | 0.750 |
| MDNet | 0.619 | 0.780 | 0.644 | 0.804 | 0.662 | 0.813 | 0.623 | 0.777 |
| SiamFC | 0.479 | 0.593 | 0.499 | 0.600 | 0.485 | 0.617 | 0.512 | 0.635 |
| DaSiamRPN | 0.589 | 0.736 | 0.490 | 0.618 | 0.533 | 0.688 | 0.583 | 0.726 |
| SiamRPN (baseline) | 0.585 | 0.723 | 0.519 | 0.653 | 0.532 | 0.684 | 0.586 | 0.726 |
| baseline+AC | 0.585 | 0.723 | 0.519 | 0.653 | 0.532 | 0.684 | 0.586 | 0.726 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=3} $ | 0.600 | 0.749 | 0.554 | 0.697 | 0.610 | 0.785 | 0.593 | 0.740 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=6} $ | 0.592 | 0.741 | 0.546 | 0.688 | 0.596 | 0.770 | 0.586 | 0.732 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=9} $ | 0.581 | 0.724 | 0.549 | 0.689 | 0.533 | 0.687 | 0.576 | 0.716 |
Table 4 Comparison of tracking accuracy on OTB2013 under the OPR, OV, and SV attributes (S: success, P: precision)

| Method | OPR S | OPR P | OV S | OV P | SV S | SV P |
|---|---|---|---|---|---|---|
| ECO-HC | 0.563 | 0.718 | 0.549 | 0.763 | 0.587 | 0.740 |
| ECO | 0.628 | 0.787 | 0.733 | 0.827 | 0.651 | 0.793 |
| ATOM | 0.607 | 0.751 | 0.522 | 0.563 | 0.654 | 0.792 |
| DIMP | 0.596 | 0.737 | 0.549 | 0.593 | 0.636 | 0.767 |
| MDNet | 0.628 | 0.787 | 0.698 | 0.769 | 0.675 | 0.842 |
| SiamFC | 0.500 | 0.620 | 0.574 | 0.642 | 0.542 | 0.665 |
| DaSiamRPN | 0.599 | 0.750 | 0.570 | 0.633 | 0.587 | 0.740 |
| SiamRPN (baseline) | 0.598 | 0.736 | 0.658 | 0.725 | 0.608 | 0.751 |
| baseline+AC | 0.598 | 0.736 | 0.658 | 0.725 | 0.608 | 0.751 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=3} $ | 0.611 | 0.760 | 0.702 | 0.778 | 0.656 | 0.819 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=6} $ | 0.604 | 0.752 | 0.659 | 0.733 | 0.631 | 0.791 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=9} $ | 0.597 | 0.740 | 0.660 | 0.735 | 0.603 | 0.755 |
Table 5 Experimental results on VOT2018 (baseline, unsupervised, and realtime experiments)

| Method | baseline: A-R rank | baseline: Failures | baseline: EAO | baseline: FPS | unsupervised: AO | unsupervised: FPS | realtime: EAO |
|---|---|---|---|---|---|---|---|
| KCF | 0.4441 | 50.0994 | 0.1349 | 60.0053 | 0.2667 | 63.9847 | 0.1336 |
| SRDCF | 0.4801 | 64.1136 | 0.1189 | 2.4624 | 0.2465 | 2.7379 | 0.0583 |
| ECO | 0.4757 | 17.6628 | 0.2804 | 3.7056 | 0.402 | 4.5321 | 0.0775 |
| ATOM | 0.5853 | 12.3591 | 0.4011 | 5.2061 | 0 | NaN | 0 |
| SiamFC | 0.5002 | 34.0259 | 0.188 | 31.889 | 0.3445 | 35.2402 | 0.182 |
| DaSiamRPN | 0.5779 | 17.6608 | 0.3826 | 58.854 | 0.4722 | 64.4143 | 0.3826 |
| SiamRPN (baseline) | 0.5746 | 23.5694 | 0.2941 | 14.3760 | 0.4355 | 14.4187 | 0.0559 |
| baseline+AC | 0.5825 | 27.0794 | 0.2710 | 13.7907 | 0.4431 | 13.8772 | 0.0539 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=3} $ | 0.5789 | 14.8312 | 0.2865 | 13.6035 | 0.4537 | 13.4039 | 0.0536 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=6} $ | 0.5722 | 22.6765 | 0.2992 | 13.5359 | 0.4430 | 12.4383 | 0.0531 |
| baseline+AC+$ {\rm D}_{\rm KL}^{m=9} $ | 0.5699 | 22.9148 | 0.2927 | 13.5046 | 0.4539 | 12.1159 | 0.0519 |
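The $ {\rm D}_{\rm KL} $ variants in Tables 1-5 add a Kullback-Leibler divergence term; the paper's exact formulation is not detailed on this page, but the standard discrete form it builds on can be sketched as:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete
    distributions; eps guards against log(0) and division by zero."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()  # renormalize after smoothing
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))
```

By Gibbs' inequality the result is always non-negative and is zero only when the two distributions coincide, which is what makes it usable as a discrepancy penalty between response maps.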
