SUN Liang, HAN Yu-Xuan, KANG Wen-Jing, GE Hong-Wei

SUN Liang, HAN Yu-Xuan, KANG Wen-Jing, GE Hong-Wei. Multi-view Learning and Reconstruction Algorithms via Generative Adversarial Networks. Acta Automatica Sinica, 2018, 44(5): 819-828. doi: 10.16383/j.aas.2018.c170496
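National Natural Science Foundation of China 61572104

National Natural Science Foundation of China 61103146

National Natural Science Foundation of China 61402076

Fundamental Research Funds for the Central Universities DUT17JC04

Project of the Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University 93K172017K03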


    SUN Liang  Lecturer at the College of Computer Science and Technology, Dalian University of Technology. In 2012 he received a Ph.D. degree in computer application technology from Jilin University and a Ph.D. degree in information science from Kochi University of Technology. His research interests cover machine learning, computational intelligence, and the theory and application of swarm-based intelligent computing. E-mail: liangsun@dlut.edu.cn

    HAN Yu-Xuan  Master student at the College of Computer Science and Technology, Dalian University of Technology. Her research interests cover computational intelligence and machine learning methods. E-mail: yuxuanhan@mail.dlut.edu.cn

    KANG Wen-Jing  Master student at the College of Computer Science and Technology, Dalian University of Technology. Her research interests cover computational intelligence and machine learning methods. E-mail: wjkang@mail.dlut.edu.cn

    GE Hong-Wei  Associate professor at the College of Computer Science and Technology, Dalian University of Technology. He received his Ph.D. degree in computer application technology from Jilin University in 2006. His research interests cover computational intelligence, machine learning, and system modeling and optimization. Corresponding author of this paper. E-mail: hwge@dlut.edu.cn

Multi-view Learning and Reconstruction Algorithms via Generative Adversarial Networks
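  • Abstract: An object usually needs to be described from several different views. In real applications, however, complex scenarios often make complete view data hard to obtain, so constructing the complete set of views of an object is an important problem. This paper proposes a multi-view learning and reconstruction algorithm based on generative adversarial networks (GAN) that takes a single known view and constructs the other views with a generative method. To build a representation shared by all views, a new representation learning algorithm is proposed that maps any view of the same instance to the same representative vector and ensures that this vector contains the information needed to reconstruct the instance. To construct multiple views of a given object, a GAN-based reconstruction algorithm is proposed that adds the representation information to the generative model, which ensures that the generated view data match the source view. The advantages of the proposed algorithm are that it avoids direct mapping between different views and addresses both the problem of incomplete views in the training data and the problem of correct correspondence between the constructed views and the known view. Simulation experiments on the handwritten digit dataset MNIST, the street view digit dataset SVHN, and the face dataset CelebA show that the proposed algorithm achieves good reconstruction performance.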
    1)  Responsible editorial board member for this paper: WANG Kun-Feng
  • Fig. 1  Multi-view representative vector mapping

    Fig. 2  Schematic diagram of mutual information among the original view data $x$, the representative vector $\pmb c$, and the reconstructed view data $\hat{x}$

    Fig. 3  Framework of the generative adversarial network based multi-view data generation

    Fig. 4  2D visualization of view 3 on MNIST after PCA

    Fig. 5  Reconstruction results that take view 2 as source data on MNIST

    Fig. 6  Reconstruction results that take view 3 as source data on MNIST

    Fig. 7  Reconstruction results that take view 2 as source data on SVHN

    Fig. 8  Reconstruction results that take view 3 as source data on SVHN

    Fig. 9  Reconstruction results that take view 2 and view 3 as source data respectively on CelebA

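    Table 1  Comparison results of SSIM and PSNR on MNIST

    | Algorithm                                | SSIM          | PSNR (dB)      |
    |------------------------------------------|---------------|----------------|
    | MVGAN (view 1 reconstructed from view 2) | 0.8520±0.0001 | 16.3135±0.0880 |
    | MVGAN (view 1 reconstructed from view 3) | 0.6474±0.0013 | 12.2109±0.1442 |
    | CGAN                                     | 0.7414±0.0001 | 12.0301±0.0512 |
    | CVAE                                     | 0.7912±0.0031 | 12.1184±0.0013 |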
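
The PSNR column can be read against the standard definition, PSNR = 10·log10(MAX² / MSE). The following is a minimal sketch of that formula, not the authors' evaluation code: the function name `psnr`, the flat-list image layout, and the default peak value of 255 are all assumptions made here for illustration.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-sized images.

    Assumption for this sketch: each image is a flat sequence of pixel
    intensities in the range [0, max_value].
    """
    if len(reference) == 0 or len(reference) != len(reconstructed):
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means a smaller pixel-wise error, so under this definition the view 2 row of the table corresponds to the lowest mean squared error among the four entries.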
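    Table 2  Comparison results of SSIM and PSNR on SVHN

    | Algorithm                                | SSIM          | PSNR (dB)      |
    |------------------------------------------|---------------|----------------|
    | MVGAN (view 1 reconstructed from view 2) | 0.4140±0.0022 | 18.7987±0.1475 |
    | MVGAN (view 1 reconstructed from view 3) | 0.1848±0.0020 | 15.8026±0.1306 |
    | CGAN                                     | 0.3357±0.0017 | 14.8910±0.0002 |
    | CVAE                                     | 0.3465±0.0028 | 15.0137±0.0071 |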
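
For the SSIM column, a simplified single-window version of the structural similarity index may help in reading the numbers. This is only a sketch under stated assumptions: standard SSIM is usually computed over local windows and then averaged, and this page does not say which variant or window the authors used. The name `global_ssim`, the flat-list layout, and the constants `0.01` and `0.03` (the commonly used defaults) are assumptions.

```python
def global_ssim(x, y, max_value=255.0):
    """Single-window SSIM computed over the whole image.

    Assumption for this sketch: x and y are flat sequences of pixel
    intensities in [0, max_value]; no sliding window is used.
    """
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("images must be non-empty and have the same size")
    # Stabilizing constants with the commonly used defaults K1=0.01, K2=0.03.
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

The value is 1 for identical images and falls as luminance, contrast, or structure diverge, which is why SSIM and PSNR can rank methods differently, as with the view 3 row above.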
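    Table 3  The chosen attributes for view 2 and view 3 on CelebA (10 dimensions)

    | Image | Bald | Bangs | Black hair | Eyeglasses | Male | Mouth slightly open | Narrow eyes | No beard | Pale skin | Wearing hat |
    |-------|------|-------|------------|------------|------|---------------------|-------------|----------|-----------|-------------|
    | a     | -1   | -1    | -1         | 1          | 1    | -1                  | -1          | -1       | -1        | -1          |
    | b     | -1   | -1    | 1          | -1         | 1    | -1                  | -1          | -1       | -1        | -1          |
    | c     | -1   | -1    | 1          | -1         | 1    | -1                  | -1          | 1        | -1        | -1          |
    | d     | -1   | -1    | 1          | -1         | 1    | 1                   | -1          | 1        | -1        | -1          |
    | e     | -1   | -1    | -1         | -1         | -1   | -1                  | -1          | 1        | -1        | -1          |
    | f     | -1   | -1    | -1         | -1         | -1   | -1                  | -1          | 1        | 1         | -1          |
    | g     | -1   | -1    | -1         | -1         | 1    | -1                  | -1          | 1        | -1        | -1          |
    | h     | -1   | -1    | -1         | -1         | -1   | 1                   | -1          | 1        | -1        | -1          |
    | i     | -1   | -1    | -1         | -1         | -1   | -1                  | -1          | 1        | -1        | -1          |
    | j     | -1   | -1    | -1         | -1         | 1    | 1                   | 1           | 1        | 1         | -1          |
    | k     | -1   | 1     | -1         | -1         | 1    | -1                  | -1          | 1        | -1        | -1          |
    | l     | -1   | -1    | -1         | -1         | -1   | 1                   | -1          | 1        | -1        | -1          |
    | m     | -1   | -1    | -1         | -1         | -1   | 1                   | -1          | 1        | -1        | -1          |
    | n     | -1   | -1    | -1         | -1         | 1    | 1                   | -1          | -1       | -1        | -1          |
    | o     | -1   | 1     | 1          | -1         | 1    | 1                   | -1          | 1        | -1        | -1          |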
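
Each row of the attribute table is a 10-dimensional vector of ±1 values. The short sketch below decodes such a vector into attribute names, assuming (as in the CelebA annotation convention) that 1 marks an attribute as present and -1 as absent. The list `ATTRIBUTES` holds English renderings of the ten column headers, and the helper `active_attributes` is a hypothetical name introduced here for illustration.

```python
# English renderings of the ten column headers, in table order.
ATTRIBUTES = [
    "Bald", "Bangs", "Black hair", "Eyeglasses", "Male",
    "Mouth slightly open", "Narrow eyes", "No beard", "Pale skin", "Wearing hat",
]


def active_attributes(row):
    """Return the names of attributes marked as present (value 1) in a row.

    Assumption: 1 means the attribute is present and -1 means it is absent.
    """
    if len(row) != len(ATTRIBUTES):
        raise ValueError("expected one value per attribute")
    return [name for name, value in zip(ATTRIBUTES, row) if value == 1]


# Row "a" of the table.
row_a = [-1, -1, -1, 1, 1, -1, -1, -1, -1, -1]
print(active_attributes(row_a))  # ['Eyeglasses', 'Male']
```

Read this way, rows e and i are identical and rows h, l and m are identical, so several distinct images share the same attribute vector.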
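    Table 4  Comparison results of SSIM and PSNR on CelebA

    | Algorithm                                | SSIM          | PSNR (dB)      |
    |------------------------------------------|---------------|----------------|
    | MVGAN (view 1 reconstructed from view 2) | 0.1143±0.0023 | 10.0574±0.0605 |
    | MVGAN (view 1 reconstructed from view 3) | 0.1132±0.0022 | 10.0342±0.0587 |
    | CGAN                                     | 0.0512±0.0036 | 9.5312±0.0012  |
    | CVAE                                     | 0.0716±0.0058 | 9.7881±0.0020  |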
