Zhan Zhao-Kang, Hu Xu-Guang, Zhao Hao-Ran, Zhang Si-Qi, Zhang Jun-Kai, Ma Da-Zhong

Zhan Zhao-Kang, Hu Xu-Guang, Zhao Hao-Ran, Zhang Si-Qi, Zhang Jun-Kai, Ma Da-Zhong. Study of missing value imputation in wind turbine data based on multivariate spatiotemporal integration network. Acta Automatica Sinica, 2024, 50(6): 1171-1184. doi: 10.16383/j.aas.c230534
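Funds: Supported by the National Natural Science Foundation of China (U22A20221, 62303103, 62073064), the Fundamental Research Funds for the Central Universities (N2304017, N2204007), and the Natural Science Foundation of Liaoning Province (2022-KF-11-02)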

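    ZHAN Zhao-Kang: Master student at the College of Information Science and Engineering, Northeastern University. Her research interests cover neural networks and data-driven data imputation. E-mail: 2200758@stu.neu.edu.cn

    HU Xu-Guang: Lecturer at the College of Information Science and Engineering, Northeastern University. His research interests cover intelligent modelling, integrated and efficient utilization, and optimal regulation of energy systems driven by hybrid data-model methods. Corresponding author of this paper. E-mail: huxuguang@mail.neu.edu.cn

    ZHAO Hao-Ran: Professor at the School of Electrical Engineering, Shandong University. His research interests cover new energy generation and grid connection, modeling and simulation of new power systems, and optimal operation and control of integrated energy systems. E-mail: hzhao@sdu.edu.cn

    ZHANG Si-Qi: Master student at the College of Information Science and Engineering, Northeastern University. Her main research interest is machine learning-based data prediction. E-mail: 2270967@stu.neu.edu.cn

    ZHANG Jun-Kai: Master student at the College of Information Science and Engineering, Northeastern University. His research interests cover data prediction and partition recovery of energy systems. E-mail: 2100687@stu.neu.edu.cn

    MA Da-Zhong: Professor at the College of Information Science and Engineering, Northeastern University. His research interests cover fault diagnosis, fault-tolerant control, energy management systems, and the optimization and control of distributed generation systems, microgrids, and the energy internet. E-mail: madazhong@ise.neu.edu.cn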

Study of Missing Value Imputation in Wind Turbine Data Based on Multivariate Spatiotemporal Integration Network


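  • Abstract: The integrity of wind farm data can be compromised by severe weather, loss of input signals, sensor faults, and other causes, and large-scale data loss poses a serious challenge to the operation and maintenance of wind turbine equipment. To address this missing-data problem, a multivariate spatiotemporal integration network (MSIN) is proposed. First, an MSIN structure with a missing-value locating and guiding mechanism is proposed; it reveals the latent information in the missing part of the data and ensures that the imputed data follow the true distribution. Second, a multi-view spatiotemporal convolution module is designed within the network to capture the local spatial and global temporal correlations among multiple variables of the same turbine and among the same variable across multiple turbines, which improves the fidelity of the imputed data. Next, a real-time self-updating mechanism is proposed that adjusts the network online as conditions in the wind farm change; this improves the generalization ability of the network and avoids the high time and space cost of retraining the model. Finally, the effectiveness and superiority of the proposed network are verified on real wind turbine data. The analysis shows that, compared with traditional imputation methods such as MissForest, the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) are reduced by at least 18.54%, 41.00%, and 3.15%, respectively.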
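The three error measures named in the abstract can be sketched as follows. This is a minimal illustration rather than the authors' evaluation code: it assumes that errors are computed only at the entries that were removed, that the mask uses 1 for observed and 0 for missing values, and that MAPE is expressed as a fraction. None of these conventions is stated in this excerpt, and the function names `imputation_errors` and `relative_reduction` are hypothetical.

```python
import numpy as np

def imputation_errors(x_true, x_imputed, observed_mask):
    """Return (MAE, MAPE, RMSE) over entries marked missing (mask == 0).

    Assumption: the ground truth at missing positions is known because the
    values were removed artificially, and none of them is zero.
    """
    x_true = np.asarray(x_true, dtype=float)
    x_imputed = np.asarray(x_imputed, dtype=float)
    missing = np.asarray(observed_mask) == 0
    err = x_imputed[missing] - x_true[missing]
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / x_true[missing]))  # fraction, not percent
    rmse = np.sqrt(np.mean(err ** 2))
    return mae, mape, rmse

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Toy data: two of the four entries are treated as missing.
x_true = np.array([[1.0, 2.0], [4.0, 5.0]])
x_imputed = np.array([[1.0, 2.5], [3.0, 5.0]])
mask = np.array([[1, 0], [0, 1]])
mae, mape, rmse = imputation_errors(x_true, x_imputed, mask)
print(mae, mape, rmse)
```

A statement such as "MAE is reduced by 18.54%" corresponds to `relative_reduction(baseline_mae, proposed_mae)` under this reading.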
  • Fig. 1  Schematic diagram of spatiotemporal correlation analysis of wind turbines

    Fig. 2  The architecture of MSIN

    Fig. 3  Multi-view spatiotemporal convolution module

    Fig. 4  Network training flowchart

    Fig. 5  Results of incomplete data imputation of the proposed method for different wind turbines with the same missing rate ((a) Sample 1; (b) Sample 2; (c) Sample 3; (d) Sample 4)

    Fig. 6  Incomplete data imputation results for the same wind turbine sample at different missing rates ((a) 0.1; (b) 0.2; (c) 0.3; (d) 0.4; (e) 0.5; (f) 0.6; (g) 0.7; (h) 0.8)

    Fig. 7  The average results of evaluation metrics for ablation experiments ((a) MAE; (b) MAPE; (c) RMSE)

    Fig. 8  CPU usage at runtime for the seven imputation methods

    Fig. 9  Comparative experimental results of different methods ((a) MAE; (b) MAPE; (c) RMSE)

    Table 1  Wind turbine variables

    10  ISU temperature    23  INU RMIO temperature
    Table 2  Evaluation results under different hint rates

    Hint rate  MAE     MAPE    RMSE
    0.10       0.1549  3.0010  0.2396
    0.20       0.1552  2.9599  0.2398
    0.30       0.1557  2.3107  0.2384
    0.40       0.1564  2.2437  0.2401
    0.50       0.1552  3.3131  0.2390
    0.60       0.1555  2.2019  0.2400
    0.70       0.1577  2.2831  0.2398
    0.80       0.1543  2.8454  0.2397
    0.90       0.1541  1.1783  0.2381
    0.95       0.1561  1.9770  0.2391
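This excerpt does not say how the hint rate enters MSIN. As an illustration only, the sketch below shows how a hint rate is commonly used in GAIN-style adversarial imputation, where a hint matrix reveals part of the missingness mask to the discriminator. Treating MSIN's hint rate this way is an assumption, and `make_hint` is a hypothetical helper.

```python
import numpy as np

def make_hint(observed_mask, hint_rate, rng):
    """Build a GAIN-style hint matrix (assumed mechanism, not from the source).

    Each entry of the mask (1 = observed, 0 = missing) is revealed with
    probability `hint_rate`; unrevealed entries are set to the
    uninformative value 0.5.
    """
    mask = np.asarray(observed_mask, dtype=float)
    reveal = rng.random(mask.shape) < hint_rate
    return np.where(reveal, mask, 0.5)

rng = np.random.default_rng(0)
mask = np.array([[1, 0, 1], [0, 1, 1]])
hint = make_hint(mask, hint_rate=0.9, rng=rng)
print(hint)
```

With a hint rate of 1.0 the hint equals the mask, and with 0.0 every entry is 0.5, so the rate controls how much of the missingness pattern is disclosed.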
    Table 3  Evaluation results under different $\alpha$

    $\alpha$  MAE     MAPE         RMSE
    0.0001    0.6231  27135.3668   0.4956
    0.0010    0.4983  128671.0614  0.6251
    0.0100    0.4963  42939.8706   0.6236
    0.1000    0.4967  167721.3201  0.6238
    1         0.3625  229.8665     0.4843
    10        0.1805  23.6173      0.2644
    100       0.1539  5.4836       0.2321
    1000      0.1518  5.7790       0.2488
    Table 4  Evaluation results under different $\beta$

    $\beta$  MAE     MAPE        RMSE
    0.0001   0.1532  1.2270      0.2320
    0.0010   0.1505  2.3903      0.2290
    0.0100   0.1507  2.3558      0.2274
    0.1000   0.1499  1.9291      0.2268
    1        0.1530  4.0830      0.2319
    10       0.1801  23.7244     0.2641
    100      0.3652  237.1457    0.4874
    1000     0.4970  35792.8434  0.6240
    Table 5  Evaluation results under different learning rates

    Learning rate  MAE     MAPE    RMSE
    0.0001         0.2121  1.7066  0.2941
    0.0010         0.1521  1.4009  0.2295
    0.0100         0.4272  4.2201  0.5652
    0.1000         0.4264  7.0552  0.5648
    1              0.4302  5.2400  0.5676
    10             0.4269  7.8907  0.5646
    100            0.4272  9.6068  0.5657
    1000           0.4298  6.7900  0.5674
    Table 6  Results of evaluation metrics for wind turbine data with different missing rates

    Table 7  Running time of the seven imputation methods for one iteration (s)

    Imputation method  Missing rate
                       0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8
    MSIN               4.3156   4.7167   4.9595   5.1400   5.1159   4.9905   5.1656   5.0997
    TimeGAN [28]       6.5895   6.6172   7.3519   8.8907   7.7120   8.4728   7.8757   8.3546
    M-RNN [29]         81.1218  70.8649  69.9753  67.5593  69.0319  68.2631  71.2586  68.9668
    MIRACLE [30]       0.2554   0.3761   0.3752   0.3925   0.3879   0.3692   0.3712   0.3941
    MICE [31]          2.5963   2.1705   2.1164   2.7042   2.2922   2.3221   2.6145   2.5653
    MissForest [32]    0.5963   0.5771   0.7897   0.7921   0.8396   0.9587   0.9132   0.8527
    LGDI [33]          15.6514  14.0879  15.8731  16.3439  14.9822  17.3042  15.9346  17.8468
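To compare the running times in Table 7 at a glance, the per-method mean over the eight missing rates can be computed as in the sketch below. The numbers are copied from the table. Averaging across missing rates and reporting each mean relative to MSIN are illustrative choices that do not come from the source.

```python
# Running times (s) from Table 7, one value per missing rate 0.1 ... 0.8.
run_time = {
    "MSIN":       [4.3156, 4.7167, 4.9595, 5.1400, 5.1159, 4.9905, 5.1656, 5.0997],
    "TimeGAN":    [6.5895, 6.6172, 7.3519, 8.8907, 7.7120, 8.4728, 7.8757, 8.3546],
    "M-RNN":      [81.1218, 70.8649, 69.9753, 67.5593, 69.0319, 68.2631, 71.2586, 68.9668],
    "MIRACLE":    [0.2554, 0.3761, 0.3752, 0.3925, 0.3879, 0.3692, 0.3712, 0.3941],
    "MICE":       [2.5963, 2.1705, 2.1164, 2.7042, 2.2922, 2.3221, 2.6145, 2.5653],
    "MissForest": [0.5963, 0.5771, 0.7897, 0.7921, 0.8396, 0.9587, 0.9132, 0.8527],
    "LGDI":       [15.6514, 14.0879, 15.8731, 16.3439, 14.9822, 17.3042, 15.9346, 17.8468],
}

# Mean time per method across all missing rates.
mean_time = {name: sum(t) / len(t) for name, t in run_time.items()}
fastest = min(mean_time, key=mean_time.get)
slowest = max(mean_time, key=mean_time.get)

# Each method's mean time as a multiple of the MSIN mean.
relative_to_msin = {name: t / mean_time["MSIN"] for name, t in mean_time.items()}

for name, t in sorted(mean_time.items(), key=lambda kv: kv[1]):
    print(f"{name:<11s} {t:8.4f} s  ({relative_to_msin[name]:.2f}x MSIN)")
```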
