Topology-guided Adversarial Deep Mutual Learning for Knowledge Distillation

Lai Xuan, Qu Yan-Yun, Xie Yuan, Pei Yu-Long

Citation: Lai Xuan, Qu Yan-Yun, Xie Yuan, Pei Yu-Long. Topology-guided adversarial deep mutual learning for knowledge distillation. Acta Automatica Sinica, 2023, 49(1): 102−110. doi: 10.16383/j.aas.c200665


doi: 10.16383/j.aas.c200665
Funds: Supported by National Natural Science Foundation of China (61876161, 61772524, 61671397, U1065252, 61772440) and Shanghai Science and Technology Commission (21511100700)
More Information
    Author Bio:

    LAI Xuan  Master student at the School of Informatics, Xiamen University. His research interest covers computer vision and image processing. E-mail: laixuan@stu.xmu.edu.cn

    QU Yan-Yun  Professor at the School of Informatics, Xiamen University. Her research interest covers pattern recognition, computer vision and machine learning. Corresponding author of this paper. E-mail: yyqu@xmu.edu.cn

    XIE Yuan  Professor at the School of Computer Science and Technology, East China Normal University. His research interest covers pattern recognition, computer vision and machine learning. E-mail: yxie@cs.ecnu.edu.cn

    PEI Yu-Long  Master student at the School of Informatics, Xiamen University. His research interest covers computer vision and image processing. E-mail: 23020181154279@stu.xmu.edu.cn


  • Abstract: Existing mutual-learning-based knowledge distillation methods attend only to the distribution discrepancy between the teacher network and the student network without imposing further constraints, and they provide only outcome-oriented supervision while lacking process-oriented supervision. To address these shortcomings, this paper proposes a topology-guided adversarial deep mutual learning method for knowledge distillation (TADML). The method trains the teacher and student networks simultaneously so that they guide each other's learning; it exploits not only the discrepancy between the class distributions output by the two networks, but also a purpose-designed topological discrepancy measure over their intermediate features. Adversarial training is adopted during optimization to further improve the discriminability of both networks. Experimental results on the classification datasets CIFAR10, CIFAR100 and Tiny-ImageNet and on the person re-identification dataset Market1501 demonstrate the effectiveness of TADML, which achieves the best results among comparable model compression methods.
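
    As a rough illustration of how the loss terms described in the abstract (and listed later in Table 1) can fit together, the PyTorch-style sketch below combines a supervised cross-entropy term (LS), a Jensen-Shannon mutual-learning term (LJS), a topology-consistency term over intermediate features (LT), and an adversarial term (Ladv). It is a minimal sketch only: the function names, the pairwise-distance formulation of the topology measure, the smooth-L1 matching, and the weights lam_* are assumptions, not the authors' released implementation.

        import torch
        import torch.nn.functional as F

        def js_divergence(p_logits, q_logits):
            # Jensen-Shannon divergence between the two peers' class distributions (LJS)
            p = F.softmax(p_logits, dim=1)
            q = F.softmax(q_logits, dim=1)
            m = 0.5 * (p + q)
            return 0.5 * (F.kl_div(m.log(), p, reduction='batchmean')
                          + F.kl_div(m.log(), q, reduction='batchmean'))

        def topology_consistency_loss(feat_a, feat_b):
            # Match the pairwise-distance structure of the two intermediate
            # feature spaces (one plausible reading of the topology term LT)
            da = torch.cdist(feat_a.flatten(1), feat_a.flatten(1))
            db = torch.cdist(feat_b.flatten(1), feat_b.flatten(1))
            da = da / (da.mean() + 1e-8)  # scale normalization (assumption)
            db = db / (db.mean() + 1e-8)
            return F.smooth_l1_loss(da, db)

        def tadml_loss(logits1, logits2, feat1, feat2, labels, d_out1, d_out2,
                       lam_js=1.0, lam_topo=1.0, lam_adv=0.1):
            # Supervised cross-entropy for both peer networks (LS)
            l_s = F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels)
            l_js = js_divergence(logits1, logits2)
            l_t = topology_consistency_loss(feat1, feat2)
            # Adversarial term (Ladv, sketch only): each peer is trained to fool a
            # shared discriminator; d_out1 / d_out2 are the discriminator's logits
            # on each peer's outputs
            l_adv = (F.binary_cross_entropy_with_logits(d_out1, torch.ones_like(d_out1))
                     + F.binary_cross_entropy_with_logits(d_out2, torch.ones_like(d_out2)))
            return l_s + lam_js * l_js + lam_topo * l_t + lam_adv * l_adv
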
  • Fig. 1  The framework of the proposed method

    Fig. 2  The structure of the discriminator

    Table 1  Comparison of classification performance with different loss functions (%)

    Loss composition          CIFAR10    CIFAR100
    LS                        92.90      70.47
    LS + LJS                  93.18      71.70
    LS + LJS + Ladv           93.52      72.75
    LS + L1 + Ladv            93.04      71.97
    LS + L2 + Ladv            93.26      72.02
    LS + L1 + LJS + Ladv      92.87      71.63
    LS + L2 + LJS + Ladv      92.38      70.90
    LS + LJS + Ladv + LT      93.05      71.81

    Table 2  Comparison of classification performance with different discriminator structures (%)

    Structure                    CIFAR100
    256fc-256fc                  71.57
    500fc-500fc                  72.09
    100fc-100fc-100fc            72.33
    128fc-256fc-128fc            72.51
    64fc-128fc-256fc-128fc       72.28
    128fc-256fc-256fc-128fc      72.23
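
    As one concrete reading of the "fc" notation in Table 2, the sketch below builds the best-scoring 128fc-256fc-128fc configuration as a small fully connected discriminator. The 100-dimensional input (class logits on CIFAR100, matching the "FC" input of Table 3), the LeakyReLU activations, and the single output logit are illustrative assumptions rather than the paper's exact architecture.

        import torch
        import torch.nn as nn

        class MLPDiscriminator(nn.Module):
            # Fully connected discriminator, e.g. the '128fc-256fc-128fc' row of Table 2
            def __init__(self, in_dim=100, hidden=(128, 256, 128)):
                super().__init__()
                layers, prev = [], in_dim
                for h in hidden:
                    layers += [nn.Linear(prev, h), nn.LeakyReLU(0.2)]
                    prev = h
                layers.append(nn.Linear(prev, 1))  # one logit: which peer produced the input
                self.net = nn.Sequential(*layers)

            def forward(self, x):
                return self.net(x)

        # Example: score a batch of 100-dimensional class-logit vectors
        disc = MLPDiscriminator(in_dim=100)
        scores = disc(torch.randn(8, 100))  # shape (8, 1)
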

    Table 3  Comparison of classification performance with different discriminator inputs (%)

    Input constraint    CIFAR100
    Conv4               72.33
    FC                  72.51
    Conv4 + FC          72.07
    FC + DAE            71.97
    FC + Label          72.35
    FC + Avgfc          71.20

    Table 4  Comparison of classification performance with different sampling strategies (%)

    Network      Vanilla    Random    K = 2    K = 4    K = 8    K = 16    K = 32    K = 64
    ResNet32     71.14      72.12     31.07    60.69    72.43    72.84     72.50     71.99
    ResNet110    74.31      74.59     22.64    52.33    74.59    75.18     75.01     74.59

    Table 5  Comparison of classification performance with different network structures (%)

    Network 1    Network 2     Original network    DML[13]            ADML               TADML
                               Net 1     Net 2     Net 1     Net 2    Net 1     Net 2    Net 1     Net 2
    ResNet32     ResNet32      70.47     70.47     71.86     71.89    72.85     72.89    73.07     73.13
    ResNet32     ResNet110     70.47     73.12     71.62     74.08    72.66     74.18    73.14     74.86
    ResNet110    ResNet110     73.12     73.12     74.59     74.55    75.08     75.10    75.52     75.71
    WRN-10-4     WRN-10-4      72.65     72.65     73.06     73.01    73.77     73.75    73.97     74.08
    WRN-10-4     WRN-28-10     72.65     80.77     73.58     81.11    74.61     81.43    75.11     82.13

    Table 6  Comparison of person re-identification mAP with different network structures (%)

    Network 1      Network 2      Original network    DML[13]            ADML               TADML
                                  Net 1     Net 2     Net 1     Net 2    Net 1     Net 2    Net 1     Net 2
    InceptionV1    MobileNetV1    65.26     46.07     65.34     52.87    65.60     53.22    66.03     53.91
    MobileNetV1    MobileNetV1    46.07     46.07     52.95     51.26    53.42     53.27    53.84     53.65

    Table 7  Experimental results of the proposed algorithm and other compression algorithms

    Method                Params (MB)    CIFAR10 (%)    CIFAR100 (%)    Tiny-ImageNet (%)
    ResNet20              0.27           91.42          66.63           54.45
    ResNet164             2.6            93.43          72.24           61.55
    Yim et al.[10]        0.27           88.70          63.33           -
    SNN-MIMIC[22]         0.27           90.93          67.21           -
    KD[8]                 0.27           91.12          66.66           57.65
    FitNet[9]             0.27           91.41          64.96           55.59
    Quantization[20]      0.27           91.13          -               -
    Binary Connect[21]    15.20          91.73          -               -
    ANC[23]               0.27           91.92          67.55           58.17
    TSANC[24]             0.27           92.17          67.43           58.20
    KSANC[24]             0.27           92.68          68.58           59.77
    DML[13]               0.27           91.82          69.47           57.91
    ADML                  0.27           92.23          69.60           59.00
    TADML                 0.27           93.05          70.81           60.11
  • [1] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA: IEEE, 2016. 770−778
    [2] Zhang X Y, Zhou X Y, Lin M X, Sun J. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA: IEEE, 2018. 6848−6856
    [3] Guo Y W, Yao A B, Zhao H, Chen Y R. Network sketching: Exploiting binary structure in deep CNNs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA: IEEE, 2017. 4040−4048
    [4] Tai C, Xiao T, Wang X G, E W N. Convolutional neural networks with low-rank regularization. In: Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2016
    [5] Chen W, Wilson J T, Tyree S, Weinberger K Q, Chen Y X. Compressing neural networks with the hashing trick. In: Proceedings of the 32nd International Conference on Machine Learning, Lille, France: 2015. 37: 2285−2294
    [6] Denton E L, Zaremba W, Bruna J, LeCun Y, Fergus R. Exploiting linear structure within convolutional networks for efficient evaluation. In: Proceedings of the 27th Annual Conference on Neural Information Processing Systems, Montreal, Canada: 2014. 1269−1277
    [7] Li Z, Hoiem D. Learning without forgetting. In: Proceedings of the 14th European Conference on Computer Vision, Amsterdam, Netherlands: 2016. 614−629
    [8] Hinton G E, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint, 2015, arXiv: 1503.02531
    [9] Romero A, Ballas N, Kahou S E, Chassang A, Gatta C, Bengio Y. FitNets: Hints for thin deep nets. In: Proceedings of the 3rd International Conference on Learning Representations, San Diego, USA, 2015
    [10] Yim J, Joo D, Bae J H, Kim J. A gift from knowledge distillation: Fast optimization, network minimization and transfer learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA: IEEE, 2017. 7130−7138
    [11] Peng B Y, Jin X, Li D S, Zhou S F, Wu Y C, Liu J H, et al. Correlation congruence for knowledge distillation. In: Proceedings of the IEEE International Conference on Computer Vision, Seoul, South Korea: IEEE, 2019. 5006−5015
    [12] Park W, Kim D, Lu Y, Cho M. Relational knowledge distillation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, USA: IEEE, 2019. 3967−3976
    [13] Zhang Y, Xiang T, Hospedales T M, Lu H C. Deep mutual learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, USA: IEEE, 2018. 4320−4328
    [14] Batra T, Parikh D. Cooperative learning with visual attributes. arXiv preprint, 2017, arXiv: 1705.05512
    [15] Zhang H, Goodfellow I J, Metaxas D N, Odena A. Self-attention generative adversarial networks. In: Proceedings of the 36th International Conference on Machine Learning, Long Beach, USA: 2019. 7354−7363
    [16] Zagoruyko S, Komodakis N. Wide residual networks. In: Proceedings of the British Machine Vision Conference, York, UK: 2016. 1−12
    [17] Krizhevsky A, Hinton G. Learning multiple layers of features from tiny images. Technical Report, University of Toronto, 2009
    [18] Mirza M, Osindero S. Conditional generative adversarial nets. arXiv preprint, 2014, arXiv: 1411.1784
    [19] Shu C Y, Li P, Xie Y, Qu Y Y, Kong H. Knowledge squeezed adversarial network compression. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence, New York, USA: 2020. 11370−11377
    [20] Zhu C Z, Han S, Mao H Z, Dally W J. Trained ternary quantization. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France, 2017.
    [21] Courbariaux M, Bengio Y, David J P. BinaryConnect: Training deep neural networks with binary weights during propagations. In: Proceedings of the 28th Annual Conference on Neural Information Processing Systems, Montreal, Canada: 2015. 3123−3131
    [22] Ba J, Caruana R. Do deep nets really need to be deep? In: Proceedings of the 27th Annual Conference on Neural Information Processing Systems, Montreal, Canada: 2014. 2654−2662
    [23] Belagiannis V, Farshad A, Galasso F. Adversarial network compression. In: Proceedings of the European Conference on Computer Vision, Munich, Germany: 2018. 11132: 431−449
    [24] Xu Z, Hsu Y C, Huang J. Training student networks for acceleration with conditional adversarial networks. In: Proceedings of the British Machine Vision Conference, Newcastle, UK: 2018. 61
Publication History
  • Received Date:  2020-08-18
  • Accepted Date:  2020-12-23
  • Available Online:  2021-01-19
  • Issue Date:  2023-01-07
