
Online Active Learning Method for Imbalanced Data Stream

Li Yan-Hong, Ren Lin, Wang Su-Ge, Li De-Yu

Citation: Li Yan-Hong, Ren Lin, Wang Su-Ge, Li De-Yu. Online active learning method for imbalanced data stream. Acta Automatica Sinica, 2024, 50(7): 1−13 doi: 10.16383/j.aas.c211246
Funds: Supported by National Natural Science Foundation of China (62076158, 62072294, 41871286) and Shanxi Key Research and Development Program (201903D421041)
More Information
    Author Bio:

    LI Yan-Hong  Associate professor at the School of Computer and Information Technology, Shanxi University. Her research interests include data mining and machine learning. Corresponding author of this paper. E-mail: liyh@sxu.edu.cn

    REN Lin  Master's student at the School of Computer and Information Technology, Shanxi University. His research interests include data mining and machine learning. E-mail: renlinssdx@163.com

    WANG Su-Ge  Professor at the School of Computer and Information Technology, Shanxi University. Her research interests include natural language processing and machine learning. E-mail: wsg@sxu.edu.cn

    LI De-Yu  Professor at the School of Computer and Information Technology, Shanxi University. His research interests include data mining and artificial intelligence. E-mail: lidy@sxu.edu.cn

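  • Abstract: Data stream classification is an important research task in data stream mining. Its goal is to capture the changing class structure from massive data that arrive in real time and change continuously. At present, few frameworks can simultaneously handle the multi-class imbalance, concept drift, outliers, and high labeling cost that are common in data streams. To address this, an online active learning method for imbalanced data stream (OALM-IDS) is proposed. AdaBoost is an ensemble classification method that iteratively combines multiple weak classifiers into a strong classifier, and AdaBoost.M2 introduces the confidence of the weak classifiers; such methods are typically applied to static data. A training-sample importance measure based on the imbalance ratio and an adaptive forgetting factor is defined, which makes AdaBoost.M2 applicable to imbalanced data streams and improves the performance of ensemble classifiers on them. An adaptive adjustment method for the margin threshold matrix is proposed to optimize the label request strategy. The degree of concept drift is incorporated into model construction, and an adaptive forgetting factor based on a concept drift index is defined, which enables model reconstruction after a drift. Comparative experiments on six synthetic data streams and three real data streams show that the proposed method achieves better classification performance than five other learning methods for imbalanced data streams.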
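The abstract does not give the formulas for the sample importance measure or the forgetting factor. As a rough illustration of the general idea only, the sketch below keeps exponentially decayed class counts and weights a labelled sample by how rare its class currently is. The class name `DecayedClassStats`, the fixed `forgetting_factor`, and the majority-to-class count ratio used as the weight are all assumptions made for this example; they are not the definitions used in OALM-IDS, where the forgetting factor adapts to a concept drift index.

```python
class DecayedClassStats:
    """Exponentially decayed class counts for a labelled stream (illustrative)."""

    def __init__(self, forgetting_factor=0.99):
        if not 0.0 < forgetting_factor <= 1.0:
            raise ValueError("forgetting_factor must be in (0, 1]")
        self.forgetting_factor = forgetting_factor
        self.counts = {}

    def update(self, label):
        # Fade all existing counts so older samples matter less,
        # then credit the class of the newly labelled sample.
        for key in self.counts:
            self.counts[key] *= self.forgetting_factor
        self.counts[label] = self.counts.get(label, 0.0) + 1.0

    def importance(self, label):
        # Hypothetical weight: decayed majority count / decayed count of this class.
        count = self.counts.get(label, 0.0)
        if count == 0.0:
            return 1.0  # no evidence about this class yet; use a neutral weight
        return max(self.counts.values()) / count


stats = DecayedClassStats(forgetting_factor=1.0)  # 1.0 disables forgetting
for label in ["a"] * 9 + ["b"]:
    stats.update(label)
print(stats.importance("a"), stats.importance("b"))
```

In a boosting-style ensemble, a weight of this kind could scale how strongly each labelled sample updates the weak classifiers, so that minority-class samples are not drowned out. A smaller forgetting factor makes the class statistics react faster after a change in the class distribution.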
  • Fig. 1  Algorithm framework

    Fig. 2  ROC curves of the six algorithms

    Fig. 3  Precision curve on ${\rm{DS}}_{6}$

    Fig. 4  Precision curve on Kddcup99_10%

    Fig. 5  Precision curve on Shuttle

    Fig. 6  Results of the ablation experiment

    Table 1  Characteristics of the data streams

    | Data stream | Samples | Features | Classes | Class distribution | Drifts | Outliers |
    | --- | --- | --- | --- | --- | --- | --- |
    | ${\rm{DS}}_{1}$ | 200000 | 21 | 5 | (0.2, 0.2, 0.2, 0.2, 0.2) | 0 | 0 |
    | ${\rm{DS}}_{2}$ | 200000 | 21 | 5 | (0.2, 0.2, 0.2, 0.2, 0.2) | 3 | 10 |
    | ${\rm{DS}}_{3}$ | 200000 | 21 | 5 | (0.1, 0.3, 0.4, 0.2, 0.1) | 0 | 0 |
    | ${\rm{DS}}_{4}$ | 200000 | 21 | 5 | (0.1, 0.3, 0.4, 0.2, 0.1) | 3 | 10 |
    | ${\rm{DS}}_{5}$ | 200000 | 21 | 5 | (0.1, 0.3, 0.4, 0.2, 0.1), (0.4, 0.2, 0.1, 0.1, 0.2) | 0 | 0 |
    | ${\rm{DS}}_{6}$ | 200000 | 21 | 5 | (0.1, 0.3, 0.4, 0.2, 0.1), (0.4, 0.2, 0.1, 0.1, 0.2) | 3 | 10 |
    | Kddcup99_10% | 494000 | 42 | 23 | | | |
    | Statlog | 570000 | 10 | 7 | | | |
    | IoT | 663000 | 115 | 11 | | | |
    | HAR | 10299 | 561 | 6 | | | |

    Table 2  Precision of the six algorithms

    | Data stream | LB | BOLE | ${\rm{ARF}}_{RE}$ | OALE | CALMID | OALM-IDS |
    | --- | --- | --- | --- | --- | --- | --- |
    | ${\rm{DS}}_{1}$ | $94.56\pm0.12$ | $\boldsymbol{95.61\pm0.11}$ | $93.54\pm0.13$ | $89.78\pm0.21$ | $94.76\pm0.16$ | $95.48\pm0.15$ |
    | ${\rm{DS}}_{2}$ | $92.27\pm0.17$ | $92.44\pm0.14$ | $91.04\pm0.19$ | $88.31\pm0.23$ | $92.81\pm0.13$ | $\boldsymbol{93.94\pm0.12}$ |
    | ${\rm{DS}}_{3}$ | $88.39\pm0.22$ | $89.52\pm0.14$ | $90.95\pm0.13$ | $88.83\pm0.16$ | $92.57\pm0.13$ | $\boldsymbol{93.72\pm0.13}$ |
    | ${\rm{DS}}_{4}$ | $86.55\pm0.31$ | $88.68\pm0.26$ | $89.89\pm0.23$ | $86.29\pm0.29$ | $91.31\pm0.18$ | $\boldsymbol{92.18\pm0.21}$ |
    | ${\rm{DS}}_{5}$ | $85.64\pm0.29$ | $87.04\pm0.34$ | $89.61\pm0.51$ | $88.83\pm0.21$ | $91.13\pm0.21$ | $\boldsymbol{92.92\pm0.16}$ |
    | ${\rm{DS}}_{6}$ | $82.10\pm0.69$ | $83.15\pm0.73$ | $86.54\pm0.72$ | $83.42\pm0.55$ | $90.64\pm0.42$ | $\boldsymbol{92.41\pm0.21}$ |
    | Kddcup99_10% | $83.87\pm0.43$ | $81.09\pm0.56$ | $85.48\pm0.65$ | $81.01\pm0.36$ | $92.06\pm0.19$ | $\boldsymbol{92.07\pm0.18}$ |
    | Statlog | $64.55\pm0.31$ | $63.78\pm0.61$ | $79.97\pm0.39$ | $73.78\pm0.43$ | $85.40\pm0.34$ | $\boldsymbol{85.68\pm0.33}$ |
    | IoT | $64.03\pm0.48$ | $61.54\pm0.43$ | $66.66\pm0.53$ | $55.81\pm0.51$ | $70.85\pm0.54$ | $\boldsymbol{73.12\pm0.38}$ |
    | HAR | $61.63\pm0.53$ | $59.76\pm0.46$ | $63.22\pm0.49$ | $55.16\pm0.69$ | $68.64\pm0.71$ | $\boldsymbol{69.98\pm0.51}$ |

    Table 3  Recall of the six algorithms

    | Data stream | LB | BOLE | ${\rm{ARF}}_{RE}$ | OALE | CALMID | OALM-IDS |
    | --- | --- | --- | --- | --- | --- | --- |
    | ${\rm{DS}}_{1}$ | $95.37\pm0.18$ | $95.96\pm0.13$ | $93.39\pm0.11$ | $90.13\pm0.13$ | $95.91\pm0.11$ | $\boldsymbol{96.14\pm0.12}$ |
    | ${\rm{DS}}_{2}$ | $92.39\pm0.21$ | $92.28\pm0.35$ | $91.35\pm0.26$ | $89.45\pm0.18$ | $92.51\pm0.15$ | $\boldsymbol{94.08\pm0.14}$ |
    | ${\rm{DS}}_{3}$ | $87.55\pm0.19$ | $88.19\pm0.22$ | $86.14\pm0.21$ | $88.52\pm0.22$ | $90.55\pm0.13$ | $\boldsymbol{92.52\pm0.13}$ |
    | ${\rm{DS}}_{4}$ | $84.57\pm0.36$ | $86.73\pm0.29$ | $87.47\pm0.28$ | $83.05\pm0.31$ | $89.89\pm0.21$ | $\boldsymbol{92.44\pm0.18}$ |
    | ${\rm{DS}}_{5}$ | $84.14\pm0.43$ | $86.44\pm0.49$ | $87.26\pm0.69$ | $83.26\pm0.36$ | $90.25\pm0.18$ | $\boldsymbol{91.16\pm0.13}$ |
    | ${\rm{DS}}_{6}$ | $83.98\pm1.13$ | $81.87\pm0.91$ | $84.56\pm1.31$ | $78.87\pm0.69$ | $90.46\pm0.13$ | $\boldsymbol{90.71\pm0.21}$ |
    | Kddcup99_10% | $60.82\pm0.71$ | $62.75\pm0.64$ | $58.17\pm1.32$ | $58.44\pm1.63$ | $61.88\pm0.43$ | $\boldsymbol{63.71\pm0.37}$ |
    | Statlog | $61.39\pm0.91$ | $50.92\pm1.32$ | $54.36\pm1.11$ | $51.20\pm1.34$ | $59.52\pm0.63$ | $\boldsymbol{63.12\pm0.39}$ |
    | IoT | $40.73\pm2.14$ | $42.29\pm1.58$ | $39.35\pm1.89$ | $40.42\pm2.15$ | $48.04\pm1.04$ | $\boldsymbol{51.26\pm0.81}$ |
    | HAR | $61.64\pm1.18$ | $60.57\pm0.97$ | $57.91\pm1.43$ | $54.11\pm1.36$ | $65.53\pm0.76$ | $\boldsymbol{66.57\pm0.46}$ |

    Table 4  F1 value of the six algorithms

    | Data stream | LB | BOLE | ${\rm{ARF}}_{RE}$ | OALE | CALMID | OALM-IDS |
    | --- | --- | --- | --- | --- | --- | --- |
    | ${\rm{DS}}_{1}$ | $94.96\pm0.11$ | $\boldsymbol{95.80\pm0.10}$ | $93.42\pm0.13$ | $89.93\pm0.15$ | $95.33\pm0.11$ | $\boldsymbol{95.80\pm0.10}$ |
    | ${\rm{DS}}_{2}$ | $92.32\pm0.16$ | $92.34\pm0.13$ | $91.18\pm0.15$ | $88.85\pm0.21$ | $92.65\pm0.13$ | $\boldsymbol{94.01\pm0.12}$ |
    | ${\rm{DS}}_{3}$ | $87.91\pm0.20$ | $88.81\pm0.24$ | $88.11\pm0.36$ | $88.67\pm0.20$ | $91.50\pm0.16$ | $\boldsymbol{93.07\pm0.14}$ |
    | ${\rm{DS}}_{4}$ | $85.35\pm0.42$ | $87.38\pm0.36$ | $88.42\pm0.51$ | $84.50\pm0.33$ | $90.51\pm0.21$ | $\boldsymbol{92.29\pm0.20}$ |
    | ${\rm{DS}}_{5}$ | $84.85\pm0.41$ | $86.67\pm0.43$ | $88.30\pm0.46$ | $85.36\pm0.48$ | $90.62\pm0.21$ | $\boldsymbol{91.93\pm0.18}$ |
    | ${\rm{DS}}_{6}$ | $82.97\pm0.87$ | $82.43\pm0.71$ | $85.35\pm0.91$ | $80.59\pm0.63$ | $90.46\pm0.39$ | $\boldsymbol{91.53\pm0.31}$ |
    | Kddcup99_10% | $73.12\pm0.55$ | $72.47\pm0.63$ | $72.01\pm0.46$ | $72.81\pm0.51$ | $73.56\pm0.33$ | $\boldsymbol{74.65\pm0.20}$ |
    | Statlog | $66.18\pm0.83$ | $54.32\pm1.91$ | $63.85\pm1.03$ | $63.42\pm0.98$ | $74.42\pm0.36$ | $\boldsymbol{75.19\pm0.31}$ |
    | IoT | $47.01\pm1.24$ | $48.40\pm0.96$ | $47.34\pm1.89$ | $44.94\pm1.36$ | $54.26\pm0.65$ | $\boldsymbol{56.73\pm0.67}$ |
    | HAR | $59.93\pm0.91$ | $58.81\pm1.21$ | $58.52\pm0.79$ | $54.43\pm1.13$ | $65.43\pm0.63$ | $\boldsymbol{67.76\pm0.58}$ |

    Table 5  Kappa value of the six algorithms

    | Data stream | LB | BOLE | ${\rm{ARF}}_{RE}$ | OALE | CALMID | OALM-IDS |
    | --- | --- | --- | --- | --- | --- | --- |
    | ${\rm{DS}}_{1}$ | $90.17\pm0.12$ | $91.18\pm0.14$ | $90.59\pm0.16$ | $85.47\pm0.21$ | $90.48\pm0.19$ | $\boldsymbol{91.31\pm0.12}$ |
    | ${\rm{DS}}_{2}$ | $88.85\pm0.19$ | $88.14\pm0.23$ | $87.91\pm0.39$ | $83.18\pm0.56$ | $89.97\pm0.31$ | $\boldsymbol{90.66\pm0.23}$ |
    | ${\rm{DS}}_{3}$ | $85.25\pm0.22$ | $85.86\pm0.38$ | $86.68\pm0.29$ | $83.91\pm0.39$ | $88.91\pm0.26$ | $\boldsymbol{89.93\pm0.21}$ |
    | ${\rm{DS}}_{4}$ | $84.15\pm0.55$ | $86.04\pm0.63$ | $87.14\pm0.66$ | $83.42\pm0.71$ | $88.92\pm0.33$ | $\boldsymbol{89.33\pm0.36}$ |
    | ${\rm{DS}}_{5}$ | $83.85\pm0.77$ | $85.83\pm0.69$ | $86.45\pm0.81$ | $86.67\pm0.70$ | $88.57\pm0.31$ | $\boldsymbol{89.12\pm0.29}$ |
    | ${\rm{DS}}_{6}$ | $81.49\pm1.12$ | $82.98\pm1.69$ | $84.15\pm1.87$ | $79.92\pm1.48$ | $89.01\pm0.41$ | $\boldsymbol{89.73\pm0.28}$ |
    | Kddcup99_10% | $80.93\pm0.67$ | $75.62\pm1.13$ | $79.32\pm1.32$ | $78.31\pm0.91$ | $83.32\pm0.26$ | $\boldsymbol{85.83\pm0.18}$ |
    | Statlog | $58.71\pm1.42$ | $61.43\pm1.18$ | $73.72\pm0.93$ | $71.21\pm1.24$ | $79.39\pm0.46$ | $\boldsymbol{80.11\pm0.19}$ |
    | IoT | $67.53\pm1.54$ | $65.02\pm1.89$ | $68.99\pm2.14$ | $59.53\pm2.12$ | $71.65\pm0.71$ | $\boldsymbol{73.29\pm0.68}$ |
    | HAR | $60.49\pm1.12$ | $60.01\pm1.38$ | $61.86\pm1.13$ | $56.75\pm2.03$ | $68.52\pm0.76$ | $\boldsymbol{69.64\pm0.71}$ |
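Tables 2 to 5 report precision, recall, F1, and Kappa. For readers who want to reproduce such numbers, the sketch below computes these metrics from a multi-class confusion matrix using their standard definitions. The function name `classification_metrics` is hypothetical, and macro-averaging over classes is an assumption: this page does not state how the paper aggregates per-class precision, recall, and F1. Values are returned as fractions in [0, 1], whereas the tables appear to use percentages.

```python
def classification_metrics(confusion):
    """Macro precision/recall/F1 and Cohen's kappa.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]  # samples per true class
    col_sums = [sum(confusion[i][j] for i in range(k)) for j in range(k)]  # per predicted class

    precisions, recalls, f1s = [], [], []
    for c in range(k):
        tp = confusion[c][c]
        p = tp / col_sums[c] if col_sums[c] else 0.0
        r = tp / row_sums[c] if row_sums[c] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Cohen's kappa: agreement beyond what the class marginals would give by chance.
    observed = sum(confusion[c][c] for c in range(k)) / total
    expected = sum(row_sums[c] * col_sums[c] for c in range(k)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected) if expected != 1.0 else 0.0

    return {
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }


print(classification_metrics([[4, 1], [1, 4]]))
```

On imbalanced streams, macro-averaged recall and Kappa are usually more informative than overall accuracy, because a classifier that ignores minority classes can still score well on the majority class alone.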

    Table 6  Effect of parameter $\theta$ on OALM-IDS

    | Data stream | $\theta$ | $b$ | Precision | Recall | F1 | Kappa |
    | --- | --- | --- | --- | --- | --- | --- |
    | ${\rm{DS}}_{1}$ | 0.4 | 0.17143 | $94.21\pm0.16$ | $93.18\pm0.12$ | $94.13\pm0.11$ | $90.11\pm0.12$ |
    | ${\rm{DS}}_{1}$ | $\boldsymbol{0.5}$ | $\boldsymbol{0.18026}$ | $\boldsymbol{95.48\pm0.15}$ | $\boldsymbol{96.14\pm0.12}$ | $\boldsymbol{95.80\pm0.10}$ | $\boldsymbol{91.31\pm0.12}$ |
    | ${\rm{DS}}_{1}$ | 0.6 | 0.19782 | $95.03\pm0.15$ | $93.19\pm0.12$ | $95.16\pm0.10$ | $91.01\pm0.12$ |
    | ${\rm{DS}}_{2}$ | 0.4 | 0.17136 | $93.01\pm0.12$ | $92.81\pm0.16$ | $93.04\pm0.13$ | $89.09\pm0.26$ |
    | ${\rm{DS}}_{2}$ | $\boldsymbol{0.5}$ | $\boldsymbol{0.19178}$ | $\boldsymbol{93.94\pm0.12}$ | $\boldsymbol{94.08\pm0.14}$ | $\boldsymbol{94.01\pm0.12}$ | $\boldsymbol{90.66\pm0.23}$ |
    | ${\rm{DS}}_{2}$ | 0.6 | 0.20000 | $93.18\pm0.13$ | $93.16\pm0.14$ | $93.75\pm0.12$ | $90.07\pm0.23$ |
    | ${\rm{DS}}_{3}$ | 0.4 | 0.17821 | $93.24\pm0.13$ | $92.05\pm0.13$ | $92.54\pm0.16$ | $88.56\pm0.22$ |
    | ${\rm{DS}}_{3}$ | $\boldsymbol{0.5}$ | $\boldsymbol{0.19512}$ | $\boldsymbol{93.72\pm0.13}$ | $\boldsymbol{92.52\pm0.13}$ | $\boldsymbol{93.07\pm0.14}$ | $\boldsymbol{89.93\pm0.21}$ |
    | ${\rm{DS}}_{3}$ | 0.6 | 0.20000 | $93.43\pm0.13$ | $92.24\pm0.13$ | $92.10\pm0.14$ | $88.71\pm0.21$ |
    | ${\rm{DS}}_{4}$ | 0.4 | 0.18423 | $91.63\pm0.21$ | $91.34\pm0.18$ | $91.76\pm0.20$ | $88.54\pm0.38$ |
    | ${\rm{DS}}_{4}$ | $\boldsymbol{0.5}$ | $\boldsymbol{0.19877}$ | $\boldsymbol{92.18\pm0.21}$ | $\boldsymbol{92.44\pm0.18}$ | $\boldsymbol{92.29\pm0.20}$ | $\boldsymbol{89.33\pm0.36}$ |
    | ${\rm{DS}}_{4}$ | 0.6 | 0.20000 | $91.06\pm0.21$ | $91.56\pm0.19$ | $91.80\pm0.21$ | $88.63\pm0.36$ |
    | ${\rm{DS}}_{5}$ | 0.4 | 0.18002 | $92.01\pm0.16$ | $90.46\pm0.13$ | $90.76\pm0.18$ | $88.42\pm0.29$ |
    | ${\rm{DS}}_{5}$ | $\boldsymbol{0.5}$ | $\boldsymbol{0.19722}$ | $\boldsymbol{92.92\pm0.16}$ | $\boldsymbol{91.16\pm0.13}$ | $\boldsymbol{91.93\pm0.18}$ | $\boldsymbol{89.12\pm0.29}$ |
    | ${\rm{DS}}_{5}$ | 0.6 | 0.20000 | $92.50\pm0.16$ | $90.76\pm0.13$ | $91.21\pm0.19$ | $88.56\pm0.30$ |
    | ${\rm{DS}}_{6}$ | 0.4 | 0.18331 | $91.02\pm0.21$ | $89.03\pm0.22$ | $90.32\pm0.31$ | $88.12\pm0.28$ |
    | ${\rm{DS}}_{6}$ | $\boldsymbol{0.5}$ | $\boldsymbol{0.19923}$ | $\boldsymbol{92.41\pm0.21}$ | $\boldsymbol{90.71\pm0.21}$ | $\boldsymbol{91.53\pm0.31}$ | $\boldsymbol{89.73\pm0.28}$ |
    | ${\rm{DS}}_{6}$ | 0.6 | 0.20000 | $91.01\pm0.21$ | $89.92\pm0.22$ | $90.12\pm0.31$ | $89.13\pm0.28$ |
    | Kddcup99_10% | 0.4 | 0.18188 | $90.59\pm0.18$ | $63.51\pm0.37$ | $73.35\pm0.20$ | $83.14\pm0.18$ |
    | Kddcup99_10% | $\boldsymbol{0.5}$ | $\boldsymbol{0.19961}$ | $\boldsymbol{92.07\pm0.18}$ | $\boldsymbol{63.71\pm0.37}$ | $\boldsymbol{74.65\pm0.20}$ | $\boldsymbol{85.83\pm0.18}$ |
    | Kddcup99_10% | 0.6 | 0.20000 | $91.63\pm0.18$ | $63.63\pm0.37$ | $74.43\pm0.21$ | $85.61\pm0.18$ |
    | Statlog | 0.4 | 0.19022 | $84.75\pm0.33$ | $62.19\pm0.39$ | $74.85\pm0.31$ | $78.86\pm0.19$ |
    | Statlog | $\boldsymbol{0.5}$ | $\boldsymbol{0.19994}$ | $\boldsymbol{85.68\pm0.33}$ | $\boldsymbol{63.12\pm0.39}$ | $\boldsymbol{75.19\pm0.31}$ | $\boldsymbol{80.11\pm0.19}$ |
    | Statlog | 0.6 | 0.20000 | $85.66\pm0.33$ | $63.01\pm0.39$ | $75.19\pm0.31$ | $79.89\pm0.19$ |
    | IoT | 0.4 | 0.19113 | $71.21\pm0.38$ | $49.86\pm0.81$ | $51.21\pm0.67$ | $71.61\pm0.68$ |
    | IoT | $\boldsymbol{0.5}$ | $\boldsymbol{0.19684}$ | $\boldsymbol{73.12\pm0.38}$ | $\boldsymbol{51.26\pm0.81}$ | $\boldsymbol{56.73\pm0.67}$ | $\boldsymbol{73.29\pm0.68}$ |
    | IoT | 0.6 | 0.20000 | $72.11\pm0.39$ | $50.06\pm0.81$ | $54.33\pm0.67$ | $71.34\pm0.68$ |
    | HAR | 0.4 | 0.18634 | $66.54\pm0.52$ | $64.32\pm0.48$ | $65.05\pm0.59$ | $66.81\pm0.72$ |
    | HAR | $\boldsymbol{0.5}$ | $\boldsymbol{0.19547}$ | $\boldsymbol{69.98\pm0.51}$ | $\boldsymbol{66.57\pm0.46}$ | $\boldsymbol{67.76\pm0.58}$ | $\boldsymbol{69.64\pm0.71}$ |
    | HAR | 0.6 | 0.20000 | $64.32\pm0.52$ | $65.14\pm0.46$ | $66.11\pm0.58$ | $64.32\pm0.71$ |
Publication history
  • Received:  2021-12-29
  • Accepted:  2022-04-07
  • Published online:  2024-06-19
