A Robust ROI Extraction Method for Biometrics using Adversarial Structure
Abstract: Region of interest (ROI) extraction is an initial and key preprocessing step in biometric systems: it reduces the computational cost of subsequent processing and enables more accurate feature extraction and recognition. This paper proposes a robust ROI extraction method for biometric images. The method builds on a semantic segmentation network and adds a global perceptual loss module that forms an adversarial structure with the segmentation model. Incorporated into the loss function of the learning model, this module supplies prior knowledge and supplements global visual pattern information. It thereby addresses the terminal convergence difficulty of semantic segmentation models and improves the robustness and generalization ability of ROI extraction. The effectiveness of the method is validated on traditional 2D fingerprint, face, 3D fingerprint, and fingerprint sweat pore datasets. The experimental results show that, compared with existing ROI extraction methods, the proposed method is more robust, generalizes better, and achieves the highest accuracy.
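The abstract states only that the global perceptual loss module is added to the loss function and forms an adversarial structure with the segmentation model; it does not give the exact formulation. The following is a minimal sketch under stated assumptions: a pixel-wise cross-entropy term is combined with a standard non-saturating adversarial term, `critic_score` stands for the output of a hypothetical global critic, and `weight` is an illustrative balancing factor. None of these names or values come from the paper, and the paper's actual loss may differ.

```python
import numpy as np

def pixel_cross_entropy(prob, label, eps=1e-7):
    """Mean binary cross-entropy between a predicted ROI probability map
    and a 0/1 ground-truth mask of the same shape."""
    prob = np.clip(prob, eps, 1.0 - eps)
    loss = -(label * np.log(prob) + (1 - label) * np.log(1 - prob))
    return float(np.mean(loss))

def segmentation_loss(prob, label, critic_score, weight=0.1):
    """Pixel-wise loss plus an adversarial (global) term.

    critic_score: assumed probability in (0, 1] that a hypothetical global
    critic assigns to the predicted mask looking like a real annotation.
    weight: assumed balancing factor, chosen here only for illustration.
    """
    adversarial = -np.log(np.clip(critic_score, 1e-7, 1.0))
    return pixel_cross_entropy(prob, label) + weight * float(adversarial)

# Toy 2x2 example: the same prediction is penalized more when the critic
# judges its global shape to be implausible.
label = np.array([[0, 1], [1, 1]])
prob = np.array([[0.1, 0.9], [0.8, 0.7]])
plausible = segmentation_loss(prob, label, critic_score=0.9)
implausible = segmentation_loss(prob, label, critic_score=0.2)
```

In this sketch the adversarial term depends on the whole predicted mask rather than on individual pixels, which is one way a loss can carry the global pattern information the abstract refers to.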
Key words:
- ROI Extraction
- Semantic Segmentation
- Adversarial Structure
- Biometrics
Fig. 1 Sample segmentation results[2] on the PASCAL VOC 2011 validation set. Columns (left to right): original image, ground-truth ROI, and segmentation produced by FCN[15]. The first row shows ROI extraction for a horse and the second row shows ROI extraction for an aircraft.
Fig. 2 Samples of 2D fingerprint images from different domains: (a) fingerprint images from the FVCs[21–23]; (b) the ROI labels of (a); (c) fingerprint impressions from NIST 29[24]; (d) the manually annotated ROI of (c).
Fig. 8 An example of a labeled X-Z cross-section image of a 3D fingerprint: (a) the longitudinal (X-Z) fingertip image marked with its biological structure; (b) the ROI label corresponding to this cross-section; (c) the physical structure of human skin[38].
Fig. 10 ROI extraction results for 2D fingerprints, faces, and 3D fingerprints at different numbers of training iterations. From left to right: segmentation results of the models at different iteration numbers. The upper row shows the results of the Baseline and the lower row shows the results of the proposed method.
Fig. 11 Results of face ROI extraction and 2D fingerprint ROI extraction. From left to right: the original image and the predictions of FCN, U-Net, PSPNet, the Baseline, and the proposed ROI extraction model with the global perceptual loss module. The first row shows face ROI extraction and the second row shows ROI extraction for traditional 2D fingerprints.
Fig. 12 ROI extraction results of the proposed method for 3D fingerprints: (a) original 3D fingerprint images acquired by an OCT device[41]; (b) the effective structure (ROI) extracted from (a) by the proposed method.
Table 1 Investigation of Global Perceptual Loss Module with Different Settings
Each cell reports Pixel Acc. (%) / Mean IoU.

| Strategy | Setting | 2D traditional fingerprint, Proposed | 2D traditional fingerprint, Baseline | Face, Proposed | Face, Baseline | 3D fingerprint, Proposed | 3D fingerprint, Baseline |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Loss function | IoU loss[2] | 92.07/0.8632 | 90.66/0.8380 | 92.05/0.8579 | 90.03/0.8254 | 96.97/0.8859 | 95.18/0.8640 |
| Loss function | Lovasz loss[25] | 92.48/0.8648 | 93.14/0.8822 | 97.21/0.9475 | 96.71/0.9388 | 95.74/0.8788 | 95.69/0.8767 |
| Loss function | L2 loss | 93.33/0.8613 | 89.33/0.8219 | 96.99/0.9434 | 96.90/0.9420 | 95.70/0.8850 | 94.14/0.8331 |
| Loss function | CrossEntropy loss (base) | 92.58/0.8606 | 82.71/0.7180 | 97.06/0.9429 | 96.77/0.9389 | 96.13/0.8975 | 95.43/0.8719 |
| Optimizer | AMSGrad[29] | 93.65/0.8863 | 92.39/0.8672 | 96.50/0.9353 | 96.17/0.9289 | 93.56/0.8230 | 90.45/0.7540 |
| Optimizer | Radam[30] | 92.72/0.8694 | 92.27/0.8665 | 96.72/0.9390 | 96.52/0.9350 | 95.77/0.8806 | 95.19/0.8676 |
| Optimizer | Adam (base)[27] | 92.58/0.8606 | 82.71/0.7180 | 97.06/0.9429 | 96.77/0.9389 | 96.13/0.8975 | 95.43/0.8719 |
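The tables report two metrics per cell, pixel accuracy (in percent) and mean IoU. As an illustration of how such values are commonly computed for a binary ROI mask, here is a minimal NumPy sketch. The function names are hypothetical, and whether the paper averages per image or over the whole dataset is not stated here, so this sketch simply evaluates a single prediction.

```python
import numpy as np

def pixel_accuracy(pred, label):
    """Percentage of pixels whose predicted class equals the ground truth."""
    return float(np.mean(pred == label)) * 100.0

def mean_iou(pred, label, num_classes=2):
    """Intersection-over-union averaged over the classes that occur
    in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        label_c = label == c
        union = np.logical_or(pred_c, label_c).sum()
        if union == 0:
            continue  # class absent from both masks; skip it
        ious.append(np.logical_and(pred_c, label_c).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy 2x4 masks: 0 = background, 1 = ROI.
label = np.array([[0, 0, 1, 1],
                  [0, 1, 1, 1]])
pred = np.array([[0, 1, 1, 1],
                 [0, 1, 1, 0]])
acc = pixel_accuracy(pred, label)   # 6 of 8 pixels agree
miou = mean_iou(pred, label)        # mean of 2/4 (background) and 4/6 (ROI)
```

On this toy example the two metrics disagree noticeably (75% accuracy against a mean IoU near 0.58), which illustrates why the tables list both.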
Table 2 ROI Extraction Results of 2D Fingerprints
Each cell reports Pixel Acc. (%) / Mean IoU.

| Method | FVCs vs. NIST29 | NIST29 vs. FVCs | Average |
| --- | --- | --- | --- |
| Mean and Variance based Method[13] | 76.71/0.6852 | 77.23/0.7551 | 76.97/0.7202 |
| Orientation based Method[12] | 75.37/0.7532 | 74.46/0.6213 | 74.92/0.6873 |
| Fourier based Method[13] | 65.45/0.6349 | 65.45/0.6349 | 65.45/0.6349 |
| PSPNet[17] | 87.74/0.8000 | 79.41/0.7209 | 83.58/0.7605 |
| FCN[15] | 87.20/0.7932 | 75.77/0.6736 | 81.49/0.7334 |
| U-Net[16] | 85.83/0.7839 | 76.46/0.7251 | 81.15/0.7545 |
| Baseline | 82.71/0.7180 | 73.12/0.7341 | 77.92/0.7261 |
| Baseline+Dense-CRF[48] | 90.33/0.7835 | 78.30/0.7347 | 84.32/0.7591 |
| Proposed method | 92.58/0.8606 | 80.29/0.7469 | 86.44/0.8038 |
| Proposed method+Dense-CRF | 94.67/0.8852 | 82.73/0.7852 | 88.70/0.8352 |
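The Average column of Table 2 appears to be the arithmetic mean of the two cross-dataset columns. The short check below, which copies a few rows from the table, illustrates this reading; the helper names and the rounding tolerances are assumptions made for this sketch.

```python
# Each entry: (FVCs vs. NIST29, NIST29 vs. FVCs, reported Average),
# where every value is a (Pixel Acc. %, Mean IoU) pair copied from Table 2.
rows = {
    "PSPNet": ((87.74, 0.8000), (79.41, 0.7209), (83.58, 0.7605)),
    "Baseline": ((82.71, 0.7180), (73.12, 0.7341), (77.92, 0.7261)),
    "Proposed": ((92.58, 0.8606), (80.29, 0.7469), (86.44, 0.8038)),
    "Proposed+Dense-CRF": ((94.67, 0.8852), (82.73, 0.7852), (88.70, 0.8352)),
}

def average(a, b):
    """Arithmetic mean of two (accuracy, IoU) pairs."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

def matches_reported(name, acc_tol=0.0051, iou_tol=0.000051):
    """True if the recomputed mean agrees with the reported Average
    up to rounding at the precision printed in the table."""
    first, second, reported = rows[name]
    acc, iou = average(first, second)
    return abs(acc - reported[0]) <= acc_tol and abs(iou - reported[1]) <= iou_tol

consistent = all(matches_reported(name) for name in rows)
```

For the proposed method, for example, (92.58 + 80.29) / 2 = 86.435, which rounds to the reported 86.44.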
Table 3 ROI Extraction Results of Face Images
Table 4 ROI Extraction Results of 3D Fingerprints
Table 5 Fingerprint Pore Extraction Results
| Method | $R_T$ (%) | $R_F$ (%) |
| --- | --- | --- |
| Gabor Filter[44] | 75.90(7.5) | 23.00(8.2) |
| Adapt. DoG[14] | 80.80(6.5) | 22.20(9.0) |
| DAPM[14] | 84.80(4.5) | 17.60(6.3) |
| Xu et al.[45] | 84.80(4.5) | 17.60(6.3) |
| Labati et al.[46] | 84.69(7.81) | 15.31(6.2) |
| DeepPore[47] | 93.09(4.63) | 8.64(4.15) |
| DeepPore$^*$ | 96.33(6.57) | 6.45(17.22) |
| Baseline | 97.48(9.63) | 7.57(5.85) |
| Proposed method | 98.30(9.2927) | 7.83(4.18) |
[1] Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixe, Daniel Cremers, and Luc Van Gool. One-shot video object segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[2] Md Atiqur Rahman and Yang Wang. Optimizing intersection-over-union in deep neural networks for image segmentation. In International Symposium on Visual Computing, pages 234–244. Springer, 2016.
[3] Yunchao Wei, Jiashi Feng, Xiaodan Liang, Ming-Ming Cheng, Yao Zhao, and Shuicheng Yan. Object region mining with adversarial erasing: A simple classification to semantic segmentation approach. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[4] Xiaowei Xu, Qing Lu, Lin Yang, Sharon Hu, Danny Chen, Yu Hu, and Yiyu Shi. Quantization of fully convolutional networks for accurate biomedical image segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[5] LIU Qing-Shan, LU Han-Qing, MA Song-De. A Survey: Subspace Analysis for Face Recognition. ACTA AUTOMATICA SINICA, 2003, 29(6): 900−911 (in Chinese)
[6] GAO Quan-Xue, PAN Quan, LIANG Yan, ZHANG Hong-Cai, CHENG Yong-Mei. Face Recognition Based on Expressive Features. ACTA AUTOMATICA SINICA, 2006, 32(3): 386−392 (in Chinese)
[7] WANG Sen, ZHANG Wei-Wei, WANG Yang-Sheng. New Features Extraction and Application in Fingerprint Segmentation. ACTA AUTOMATICA SINICA, 2003, 29(4): 622−627 (in Chinese)
[8] Changqian Yu, Jingbo Wang, Chao Peng, Changxin Gao, Gang Yu, and Nong Sang. Learning a discriminative feature network for semantic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[9] Chih-Yu Hsu, Chih-Hung Yang, and HuiChing Wang. Multi-threshold level set model for image segmentation. EURASIP Journal on Advances in Signal Processing, 2010, 2010(1): 950438 doi: 10.1155/2010/950438
[10] Sima Taheri, Sim Heng Ong, and VFH Chong. Level-set segmentation of brain tumors using a threshold-based speed function. Image and Vision Computing, 2010, 28(1): 26−37 doi: 10.1016/j.imavis.2009.04.005
[11] Anping Xu, Lijuan Wang, Sha Feng, and Yunxia Qu. Threshold-based level set method of image segmentation. In 2010 Third International Conference on Intelligent Networks and Intelligent Systems, pages 703–706. IEEE, 2010.
[12] Jianjiang Feng, Jie Zhou, and Anil K Jain. Orientation field estimation for latent fingerprint enhancement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 35(4): 925−940
[13] Davide Maltoni, Dario Maio, Anil K Jain, and Salil Prabhakar. Handbook of fingerprint recognition. Springer Science & Business Media, 2009.
[14] Qijun Zhao, David Zhang, Lei Zhang, and Nan Luo. Adaptive fingerprint pore modeling and extraction. Pattern Recognition, 2010, 43(8): 2833−2844 doi: 10.1016/j.patcog.2010.02.016
[15] Jonathan Long, Evan Shelhamer, and Trevor Darrell. Fully convolutional networks for semantic segmentation. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.
[16] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
[17] Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, and Jiaya Jia. Pyramid scene parsing network. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[18] Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), volume 2, pages 2169–2178. IEEE, 2006.
[19] Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam. Encoder-decoder with atrous separable convolution for semantic image segmentation. In The European Conference on Computer Vision (ECCV), September 2018.
[20] Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell. Understanding convolution for semantic segmentation. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1451–1460. IEEE, 2018.
[21] Dario Maio, Davide Maltoni, Raffaele Cappelli, James L. Wayman, and Anil K. Jain. Fvc2000: Fingerprint verification competition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(3): 402−412 doi: 10.1109/34.990140
[22] Dario Maio, Davide Maltoni, Raffaele Cappelli, James L Wayman, and Anil K Jain. Fvc2002: Second fingerprint verification competition. In Object recognition supported by user interaction for service robots, volume 3, pages 811–814. IEEE, 2002.
[23] Dario Maio, Davide Maltoni, Raffaele Cappelli, Jim L Wayman, and Anil K Jain. Fvc2004: Third fingerprint verification competition. In International Conference on Biometric Authentication, pages 1–7. Springer, 2004.
[24] Craig I Watson. NIST Special Database 29: Plain and Rolled Images from Paired Fingerprint Cards. US Department of Commerce, National Institute of Standards and Technology, 2001.
[25] Maxim Berman, Amal Rannen Triki, and Matthew B Blaschko. The lovász-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4413–4421, 2018.
[26] Tijmen Tieleman and Geoffrey Hinton. Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural networks for machine learning, 2012, 4(2): 26−31
[27] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[28] Ashia C Wilson, Rebecca Roelofs, Mitchell Stern, Nati Srebro, and Benjamin Recht. The marginal value of adaptive gradient methods in machine learning. In Advances in Neural Information Processing Systems, pages 4148–4158, 2017.
[29] Liangchen Luo, Yuanhao Xiong, Yan Liu, and Xu Sun. Adaptive gradient methods with dynamic bound of learning rate. arXiv preprint arXiv:1902.09843, 2019.
[30] Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Jiawei Han. On the variance of the adaptive learning rate and beyond. arXiv preprint arXiv:1908.03265, 2019.
[31] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
[32] Yaroslav Ganin and Victor Lempitsky. Unsupervised domain adaptation by backpropagation. arXiv preprint arXiv:1409.7495, 2014.
[33] Sinno Jialin Pan and Qiang Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 2009, 22(10): 1345−1359
[34] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.
[35] AIsegment.com. Matting human datasets. https://www.kaggle.com/laurentmih/aisegmentcom-matting-human-datasets/.
[36] Xiaoyong Shen, Xin Tao, Hongyun Gao, Chao Zhou, and Jiaya Jia. Deep automatic portrait matting. In European Conference on Computer Vision, pages 92–107. Springer, 2016.
[37] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein gan. arXiv preprint arXiv:1701.07875, 2017.
[38] Madhero88. Layers of the skin. https://en.wikipedia.org/wiki/File:Skin_layers.png.
[39] Andrew L Maas, Awni Y Hannun, and Andrew Y Ng. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30, page 3, 2013.
[40] Feng Liu, Linlin Shen, Haozhe Liu, Caixiong Shen, Guojie Liu, Yahui Liu, Wentian Zhang, and Yong Qi. A-benchmark-database-using-optical-coherence-tomography-for-fingerprints. https://github.com/CVSZU/A-Benchmark-Database-using-OpticalCoherence-Tomography-for-Fingerprints.
[41] Feng Liu, Caixiong Shen, Haozhe Liu, Guojie Liu, Yahui Liu, Zhenhua Guo, and Lei Wang. A flexible touch-based fingerprint acquisition device and a benchmark database using optical coherence tomography. IEEE Transactions on Instrumentation and Measurement, 2020.
[42] Feng Liu, Guojie Liu, and Xingzheng Wang. High-accurate and robust fingerprint antispoofing system using optical coherence tomography. Expert Systems with Applications, 2019, 130: 31−44 doi: 10.1016/j.eswa.2019.03.053
[43] Haozhe Liu, Wentian Zhang, Feng Liu, and Yong Qi. 3d fingerprint gender classification using deep learning. In Chinese Conference on Biometric Recognition, pages 37–45. Springer, 2019.
[44] Anil Jain, Yi Chen, and Meltem Demirkus. Pores and ridges: Fingerprint matching using level 3 features. In 18th International Conference on Pattern Recognition (ICPR'06), volume 4, pages 477–480. IEEE, 2006.
[45] Yuanrong Xu, Guangming Lu, Feng Liu, and Yanxia Li. Fingerprint pore extraction based on multi-scale morphology. In Chinese Conference on Biometric Recognition, pages 288–295. Springer, 2017.
[46] Ruggero Donida Labati, Angelo Genovese, Enrique Muñoz, Vincenzo Piuri, and Fabio Scotti. A novel pore extraction method for heterogeneous fingerprint images using convolutional neural networks. Pattern Recognition Letters, 2018, 113: 58−66 doi: 10.1016/j.patrec.2017.04.001
[47] Han-Ul Jang, Dongkyu Kim, Seung-Min Mun, Sunghee Choi, and Heung-Kyu Lee. Deeppore: fingerprint pore extraction using deep convolutional neural networks. IEEE Signal Processing Letters, 2017, 24(12): 1808−1812 doi: 10.1109/LSP.2017.2761454
[48] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv preprint arXiv:1412.7062, 2014.
[49] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834−848
