摘要: 图像扩增是工业外观检测中常用的数据处理方法, 有助于提升检测模型泛化性, 避免过拟合. 根据扩增结果的不同来源, 将当前工业图像扩增方法分为基于传统变换和基于模型生成两类. 基于传统变换的扩增方法包括基于图像空间和特征空间两类; 根据模型输入条件信息的不同, 基于模型生成的方法分为无条件、低维条件和图像条件三类. 对相关方法的原理、应用效果、优缺点等进行分析, 重点介绍基于生成对抗网络、扩散模型等模型生成的扩增方法. 依据扩增结果的标注类型和方法的技术特点, 对三类基于模型生成方法的相关文献进行分类统计, 通过多维表格阐述各类方法的研究细节, 对其基础模型、评价指标、扩增性能等进行综合分析. 最后, 总结当前工业图像扩增领域存在的挑战, 并对未来发展方向进行展望.

Abstract: Image augmentation is a commonly used data processing method in industrial cosmetic inspection, which improves the generalization of detection models and prevents overfitting. Based on the different sources of augmentation results, current industrial image augmentation methods are categorized into traditional transformation-based and model generation-based. The former includes image space-based and feature space-based methods. The latter is classified into unconditional, low-dimensional conditional, and image conditional methods based on different input conditional information of models. The principles, application effects, advantages, and disadvantages of related methods are analyzed, focusing on model generation-based augmentation methods such as those based on generative adversarial networks and diffusion models. Furthermore, the relevant works on the three types of model generation-based methods are categorized according to the type of annotations for augmentation results and the technical characteristics of the methods. A multidimensional table is used to elaborate on the research details of various methods, followed by comprehensive analyses of their base models, evaluation metrics, and augmentation performance. Finally, the paper summarizes the current challenges in industrial image augmentation and provides an outlook on future development directions.
图 1 工业缺陷图像, 每幅图像的左下角是对应掩膜标注 ((a)木板的孔洞缺陷图像; (b)在木板上随机切出圆形区域; (c)手机中框的异色缺陷图像)
Fig. 1 Industrial defect images, the bottom left of each image is the corresponding mask annotations ((a) Hole defect image of wood; (b) Randomly cut out the circular area on the wood; (c) Heterochromatic defect image of phone band)
图 6 图像子区域替换结果. “输入”是来自MVTec的4幅真实输入图, “输出”是不同方法的扩增结果, 括号中的内容表示扩增结果的输入
Fig. 6 Results of image subregion replacement. “Inputs” are four real images from MVTec, “Outputs” are augmentation results from each method, the contents within the parentheses indicate the source images in the first row for generating the output images
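图6中各类子区域替换方法的共同核心是"裁剪–粘贴–按面积加权标签". 以CutMix[40]为例, 其核心逻辑可以用NumPy勾勒如下(仅为示意实现, 输入格式、辅助函数与参数均为本文假设, 非某一文献的官方代码):

```python
import numpy as np

def rand_bbox(h, w, lam):
    # 按混合系数 lam 采样裁剪框: 替换区域面积比例约为 1 - lam
    cut_rat = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(h * cut_rat), int(w * cut_rat)
    cy, cx = np.random.randint(h), np.random.randint(w)
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    return y1, y2, x1, x2

def cutmix(img_a, img_b, label_a, label_b, alpha=1.0):
    # img_*: (H, W, C) 图像数组; label_*: one-hot 标签向量
    lam = np.random.beta(alpha, alpha)
    h, w = img_a.shape[:2]
    y1, y2, x1, x2 = rand_bbox(h, w, lam)
    out = img_a.copy()
    out[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]            # 源图像区域粘贴到目标图像对应位置
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)    # 按实际替换面积修正标签权重
    label = lam_adj * label_a + (1.0 - lam_adj) * label_b
    return out, label
```

其余方法(如SaliencyMix、ResizeMix)主要改变裁剪框的选取方式, 整体流程与上述骨架一致.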
图 10 常用生成式模型架构. (a), (b), (c), (g)是无条件架构, (d)和(e)是基于低维条件的架构, (f), (h), (i), (j)是基于图像条件的架构. $p\left({\boldsymbol{z}}\right)$是低维分布, ${\boldsymbol{z}},\;{{\boldsymbol{z}}_s}$是采样的低维随机噪声, G是生成器, D是鉴别器, En是编码器, De是解码器. c是低维条件信息, x, y是来自不同域的真实图像, ${{\boldsymbol{x}}_f},\;{{\boldsymbol{y}}_f}$是对应域的合成样本, $\hat{{\boldsymbol{x}}},\;\hat{{\boldsymbol{y}}}$是生成器的重构输出. MLP表示多层感知机.
Fig. 10 Common architectures of generative models. (a), (b), (c), (g) are unconditional architectures, (d) and (e) are architectures based on low-dimensional conditions, (f), (h), (i) and (j) are image-conditional architectures. $p\left({\boldsymbol{z}}\right)$ indicates the low-dimensional distribution, ${\boldsymbol{z}},\;{{\boldsymbol{z}}_s}$ are low-dimensional sampled random noises, G is the generator, D is the discriminator, En is the encoder, and De is the decoder. c is the low-dimensional conditional information, x and y are the real images from different domains, ${{\boldsymbol{x}}_f},\;{{\boldsymbol{y}}_f}$ are the generated samples of the corresponding domains, $\hat{{\boldsymbol{x}}}$ and $\hat{{\boldsymbol{y}}}$ are the reconstructed outputs of the generator. MLP is multilayer perceptron
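图10(a)所示的无条件GAN架构中, 生成器G与鉴别器D通过标准的极小极大对抗目标联合训练[21]:

```latex
\min_G \max_D \; \mathbb{E}_{{\boldsymbol{x}} \sim p_{\rm data}}\left[\log D({\boldsymbol{x}})\right]
+ \mathbb{E}_{{\boldsymbol{z}} \sim p({\boldsymbol{z}})}\left[\log\left(1 - D(G({\boldsymbol{z}}))\right)\right]
```

图中其余架构(WGAN、条件GAN、图像转换网络等)均可视为对该目标或其输入条件的改造.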
图 12 基于无条件生成模型的扩增方法流程图 ((a) 图像级标注的无条件扩增方法, 包括对生成模型和训练目标的改进; (b) 基于图像处理获取非图像级标注的无条件扩增方法; (c) 基于改进架构获取非图像级标注的无条件扩增方法)
Fig. 12 Flowchart of the unconditional generative model-based augmentation methods ((a) Unconditional augmentation methods with image-level annotation, including improvement for the generation model and the training objective; (b) Unconditional augmentation method with non-image-level annotation based on image processing; (c) Unconditional augmentation method with non-image-level annotation based on improved architecture)
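图12所示的无条件扩增同样可以以扩散模型(DM)为基础模型, 其简化训练目标是让去噪网络 ${\boldsymbol{\epsilon}}_\theta$ 预测混入样本的噪声[22]:

```latex
\mathcal{L}_{\rm simple} = \mathbb{E}_{{\boldsymbol{x}}_0,\, {\boldsymbol{\epsilon}} \sim \mathcal{N}({\bf 0}, {\boldsymbol{I}}),\, t}
\left[ \left\| {\boldsymbol{\epsilon}} - {\boldsymbol{\epsilon}}_\theta\!\left(\sqrt{\bar{\alpha}_t}\,{\boldsymbol{x}}_0 + \sqrt{1 - \bar{\alpha}_t}\,{\boldsymbol{\epsilon}},\; t\right) \right\|^2 \right]
```

其中 $t$ 为随机采样的时间步, $\bar{\alpha}_t$ 为噪声调度系数的累积乘积.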
图 13 将生成缺陷块融合到正常背景上 ((a)生成缺陷块; (b)真实的正常背景; (c) CutPaste结果; (d)泊松融合结果, 红色框为生成缺陷对应的标注框)
Fig. 13 Blending defect patch into the normal background ((a) Generated defect patch; (b) Real normal background; (c) Results of CutPaste; (d) Poisson blending results, the red box is the bounding box of the generated defect sample)
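图13中"将生成缺陷块粘贴到正常背景并同步得到标注框"的流程可以示意如下. 这里以最简单的直接粘贴(对应CutPaste式融合)代替泊松融合, 补丁位置随机; 函数名与参数均为本文假设的示意写法:

```python
import numpy as np

def paste_defect(background, patch, rng=None):
    # background: (H, W, C) 正常样本; patch: (h, w, C) 生成的缺陷块
    # 返回: 融合图像与缺陷对应的标注框 (x1, y1, x2, y2)
    rng = rng or np.random.default_rng()
    H, W = background.shape[:2]
    h, w = patch.shape[:2]
    y = int(rng.integers(0, H - h + 1))
    x = int(rng.integers(0, W - w + 1))
    out = background.copy()
    out[y:y + h, x:x + w] = patch    # 直接粘贴; 为消除边缘断裂可在此替换为泊松融合
    bbox = (x, y, x + w, y + h)      # 粘贴位置即为检测任务的标注框
    return out, bbox
```

直接粘贴会留下图13(c)所示的明显边界, 而梯度域的泊松融合可得到图13(d)中更自然的过渡.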
图 15 基于图像条件生成模型的扩增流程图 ((a)基于图像条件的图像级标注扩增; (b) 预定义标注的非图像级扩增; (c) 后处理获取标注的非图像级扩增)
Fig. 15 Flowchart of image-conditional generative model-based augmentation ((a) Augmentation method with image-level annotations based on image condition; (b) Non-image-level augmentation based on predefined annotations; (c) Non-image-level augmentation based on post-processing to obtain annotations)
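图15中基于图像条件的转换模型(如CycleGAN)通常依赖循环一致性约束: 设 $G: X \to Y$ 与 $F: Y \to X$ 为两个方向的生成器, 则循环一致性损失为

```latex
\mathcal{L}_{\rm cyc}(G, F) = \mathbb{E}_{{\boldsymbol{x}}}\left[ \left\| F(G({\boldsymbol{x}})) - {\boldsymbol{x}} \right\|_1 \right]
+ \mathbb{E}_{{\boldsymbol{y}}}\left[ \left\| G(F({\boldsymbol{y}})) - {\boldsymbol{y}} \right\|_1 \right]
```

该约束保证在缺少成对训练集时, 正常图像与缺陷图像之间的转换不会丢失输入内容.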
表 1 常用的简单图像变换扩增方法
Table 1 Common simple image transformation augmentation methods
方法 | 原理
几何变换:
旋转 | 旋转一定角度
翻转 | 关于水平或竖直轴翻转
缩放 | 按比例放大或缩小
平移 | 沿水平、垂直方向移动
裁剪 | 裁剪出图像的子区域
非几何变换:
添加噪声 | 添加高斯、椒盐等类型噪声
核过滤 | 利用核进行卷积
颜色变换 | 在颜色空间调节颜色分量或颜色通道
亮度变换 | 将像素值映射到新的范围

表 2 图像擦除扩增方法
Table 2 Image erasing augmentation methods
表 3 图像子区域替换扩增方法
Table 3 Image subregion replacement augmentation methods
方法 | 原理
CutMix[40] | 裁剪源图像随机区域并将其粘贴到目标图像的对应位置
FMix[41] | 通过一系列计算得到二值掩膜图像, 基于此对两幅图像进行组合
SaliencyMix[42] | 根据显著性图剪裁源图像最具显著性区域, 并复制粘贴到其他图像中
RICAP[43] | 根据边界条件分别对四幅图像进行裁剪, 将裁剪下来的区域进行组合
KeepAugment[44] | 在扩增时保持显著性强的图像区域不变, 提高扩增结果的保真度
ResizeMix[45] | 将源图像缩小成小尺寸图像, 并将其粘贴到目标图像的随机位置
SnapMix[46] | 裁剪随机大小的源图像区域, 变换后粘贴到目标图像中
Copy-Paste[47] | 对两幅图像进行随机比例抖动和翻转, 然后将目标图像的实例子集粘贴到源图像
Cut-Thumbnail[48] | 将图像缩小为小尺寸图像, 并将其粘贴到原始图像或其他大尺寸图像中
Local Augment[49] | 将图像划分为图像块, 对每个图像块进行不同的扩增
ObjectAug[50] | 提取目标图像的前景进行增强, 并将其粘贴到修复后的源图像中
Self Augment[51] | 将图像中的随机区域复制到图像的另一个位置
CutPaste[52] | 切割图像块并在图像的随机位置粘贴
SalfMix[53] | 根据显著性图将图像中显著性最强的区域复制到显著性最弱的区域
YOCO[54] | 将图像划分为两块, 对两块图像分别进行扩增并重新拼接
RSMDA[55] | 对源图像进行随机切片处理, 并将切片粘贴到目标图像的相应位置

表 4 图像混合扩增方法
Table 4 Image mixing augmentation methods
方法 | 原理
Mixup[59] | 根据混合因子对两幅图像及其标签进行加权混合
SamplePairing[60] | 将两张图像对应位置像素取平均得到新的图像
MixMatch[61] | 对无标签图像进行K次扩增, 将扩增后的图像经过预测网络得到K个预测标签, 计算平均预测标签, 对平均标签锐化后作为图像伪标签, 将伪标签图像与增强后的有标签图像进行Mixup得到扩增图像
AugMix[62] | 对选取的扩增方法进行组合, 从中采样对图像进行k次扩增, 对k个增强图像进行加权得到混合增强图像, 将混合增强图像与原始图像加权得到最终的扩增结果
FixMatch[63] | 将经过随机翻转、平移等弱扩增的无标签图像输入模型获得标签预测, 当标签预测大于阈值时, 将预测值转化为one-hot伪标签. 然后对同一图像进行RandAugment、Cutout等强扩增并进行标签预测, 计算预测结果与伪标签的交叉熵损失, 使得模型对弱扩增版本图像的预测伪标签与强扩增图像预测结果匹配
ReMixMatch[64] | 利用分布对齐和增强锚定两种方法对MixMatch算法进行改进
Puzzle Mix[65] | 通过利用样本的显著性信息和局部统计信息来生成混合数据
StyleMix[66] | 分别处理不同图像的风格和内容特征, 并将处理后的风格和内容特征进行混合
StyleCutMix[66] | 结合StyleMix和CutMix
RandomMix[67] | 从候选的混合扩增方法中随机选择一种对随机配对的训练样本进行扩增

表 5 基于策略搜索的传统变换扩增方法
Table 5 Policy-searching-based traditional transformation augmentation methods
方法 | 原理
AutoAugment[72] | 基于强化学习算法, 在由增强方法及其应用概率和幅度组成的搜索空间中寻找最优策略
PBA[73] | 将策略搜索问题看作是超参数优化问题, 训练时用效果较好的模型参数替换效果差的模型参数, 然后打乱参数继续搜索更好的策略
Fast AutoAugment[74] | 贝叶斯优化器采样扩增策略, 密度匹配算法评估策略效果, 贝叶斯优化器根据评估结果进行新的策略搜索
RandAugment[75] | 引入两个极简的超参数, 等概率选取增强次数N、增强幅度M, 用简单的网格搜索寻找最优策略
Faster AutoAugment[76] | 使用梯度逼近使不可微数据增强操作变得可微, 引入对抗网络最小化扩增图像分布与原始图像分布的距离, 使得搜索过程端到端可微
MADAO[77] | 使用隐式梯度法和诺依曼级数逼近, 同时通过梯度下降同步优化图像分类模型和数据增强策略
Adversarial AutoAugment[78] | 引入GAN的对抗思想, 策略网络和分类网络分别作为生成器和判别器, 策略网络最大化分类损失, 分类网络最小化该损失
DADA[79] | 通过Gumbel-Softmax梯度估计器将不可微的数据增强参数松弛为可微的, 引入无偏梯度估计器RELAX以实现准确的梯度估计
Patch AutoAugment[80] | 将图像划分为图像块, 使用多智能体强化学习算法针对每个图像块和整幅图像的内容来学习扩增策略, 搜索到整幅图像的最优扩增
RangeAugment[81] | 引入一个辅助损失来了解增强操作的幅值范围, 通过控制给定模型和任务的输入与增强图像之间的相似性来有效地学习增强操作的幅度范围

表 6 工业图像量化评价指标
Table 6 Quantitative evaluation metrics for industrial images
评价指标 | 基本原理 | 优点 | 缺点
基于语义:
IS | 基于预训练网络的分类结果 | 同时评价质量和多样性 | 难以提取有代表性的工业图像特征
FID | 基于预训练网络提取的特征 | 有利于评价分布差异 | 数据量少时结果不准确
KID | 采用核函数改进的FID | 比FID更符合人类感知 | 需要大量计算资源
LPIPS | 计算多层特征的不相似性 | 符合人类感知 | 计算复杂度高
基于纹理:
MMD | 基于核函数计算分布差异 | 无需预训练神经网络 | 对核函数的选择敏感
SSIM | 基于亮度、对比度、结构衡量相似性 | 利于捕捉底层结构信息 | 容易受噪声干扰
SNR | 计算相对像素值保留程度 | 有效评价质量和失真程度 | 不考虑结构信息
PSNR | 以最大像素值代替平均像素值 | 更低的计算复杂度 | 无法衡量伪影等特定失真
SD | 基于梯度差异量化相似度 | 适合评价清晰度和细节 | 忽略全局差异
UQI | 基于像素均值和方差量化相似度 | 从像素角度评价全局质量 | 难以衡量模糊结果
VIF | 利用小波变换模拟人类感知 | 结果符合人类感知 | 适用范围较小

表 7 基于模型生成的工业扩增基础模型分类基准
Table 7 Taxonomy of basic generative models of generative model-based industrial augmentation methods
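表6中基于纹理的指标大多可以直接按定义实现. 以PSNR为例, 下面的NumPy示意实现假设输入为取值范围[0, 255]的图像数组(实现细节为本文假设):

```python
import numpy as np

def psnr(img_a, img_b, max_val=255.0):
    # PSNR = 10 * log10(MAX^2 / MSE), 其中 MSE 为像素均方误差
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")   # 两幅图像完全相同
    return 10.0 * np.log10(max_val ** 2 / mse)
```

与SSIM等结构性指标不同, PSNR只统计逐像素误差, 因此如表6所述无法衡量伪影等特定失真.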
表 8 基于模型生成的扩增方法特点
Table 8 Characteristic of generative model-based augmentation methods
子类型 | 优点 | 缺点 | 改进策略
无条件, 图像级标注:
直接应用 | 仅需修改架构细节, 无需新增模块, 应用简便 | 无法充分发挥模型潜力, 适用于自然图像生成的模型, 难以复用到工业场景 | 结合多个模型的优点对架构细节进行修改, 例如加入谱归一化[107]、梯度惩罚损失[105−106]等
改进架构 | 加入了新的模块, 结合不同模块的优点, 提高对少样本数据集的建模能力 | 过于复杂的模块会导致模型难以收敛 | 针对任务的特点设计新的模块, 避免复杂化, 保证生成模型的整体一致性
改进训练目标 | 采用新的正则化损失项并结合有特定功能的网络架构, 避免模式崩溃 | 过多的约束降低生成结果多样性, 影响训练速度 | 结合生成模型的原理和工业扩增需求, 简化损失约束
无条件, 非图像级标注:
图像处理 | 生成结果融合到背景来获取标注, 灵活性强 | 像素域的融合方法可能造成边缘断裂问题 | 基于梯度域、特征级、小波变换等方式探索一致性良好的融合方法
改进架构 | 通过模型给出完整缺陷样本和对应标注, 直接、高效 | 难以准确定位与正常背景差异较小的缺陷, 模型倾向于将非缺陷伪影标注为缺陷 | 基于正常与缺陷区域的多层级特征差异设计生成模型, 基于特征融合定位缺陷
低维条件:
条件引导 | 无需设计额外的分类模块来判断输入样本的条件类别 | 加入新的条件信息时需要训练整个生成模型 | 基于文本引导条件微调预训练生成模型, 设计多场景统一的生成架构
类别拟合 | 额外的分类模块判别输入的条件类别, 有利于在生成模型训练完成后适应新条件 | 需要设计额外的分类网络和分类基准, 模型难训练 | 综合评价指标和引导条件的特点设计分类模块和分类基准
图像条件, 图像级标注:
直接应用 | 直接利用已有的图像转换模型 | 缺少成对训练集, 过于依赖CycleGAN | 利用图像修复创建成对的训练集, 将循环一致性损失应用到其他架构中
改进模块 | 改进生成网络的部分或整体架构, 提升特定场景性能 | 难以适应复杂工业场景 | 结合文本、类别标签等低维引导条件设计通用的生成架构
改进训练目标 | 采用新的损失, 使得生成模型适应工业场景 | 少样本下损失难收敛 | 从工业图像的特点出发, 引入正则化项避免过拟合和模式崩溃
图像条件, 非图像级标注:
后处理 | 采用阈值分割、注意力图融合等后处理方法定位缺陷 | 仅适用于特定任务, 存在标注不准确问题 | 对比生成图像与输入图像条件之间的特征, 通过多层信息融合定位缺陷
预定义(人工) | 无需掩膜生成算法, 仅在人工绘制的掩膜引导下生成 | 人工绘制的掩膜数量和多样性有限 | 构建工业场景掩膜集, 基于庞大的掩膜库进行变换
预定义(自动) | 利用随机函数、生成网络快速获取大量掩膜 | 形状特异性较差, 难以确保位置准确性 | 细化掩膜图像类别, 以产品图像作为掩膜生成的条件, 实现可控的掩膜图像生成

表 9 基于无条件生成模型的扩增方法相关文献明细表
Table 9 Detailed literature table on augmentation methods based on unconditional generative models
文献 | 基础模型 | 应用场景 | 任务类型 | 标注类型 | 子类型 | 训练集数据量 | 合成数据数量 | RPI | QEI
[109] | WGAN | 光伏组件 | 分类 | IL | 直接 | 1800 | 2400 | 4.40% | MMD
[110] | WGAN-GP | 混凝土路面裂缝 | 检测, 分割 | IL | 直接 | 1000 | 1000 | 8.41% | SSIM, PSNR
[111] | WGAN | 碳纤维聚合物 | 分割 | IL | 直接 | — | — | 13.64% | SNR
[112] | DCGAN | 光伏组件 | 分类 | IL | 直接 | 300 | 2000 | 40% | MMD, FID
[113] | AE | WM-811K[114] | 分类 | IL | 直接 | — | 10500 | 8.25% | —
[115] | StyleGAN v2 | 混凝土下水道 | 分类 | IL | 直接 | 1200 | 1200 | 50.07% | FID, KID
[116] | DCGAN | 混凝土表面 | 分类 | IL | 直接 | 400 | 10000 | 20% | —
[117] | DCGAN | 生活垃圾焚烧图像 | 分类 | IL | 直接 | — | — | — | FID
[118] | DCGAN | 刮刀 | 分类 | IL | 直接 | 3518 | 少数类过采样 | 0.006% (to 99) | FID
[119] | StyleGAN v2 DiffAugment | 激光焊接 | 检测 | IL | 直接 | 503 | 503 | −0.57% | FID
[120] | DCGAN | 混凝土路面裂缝 | 分类 | IL | 直接 | 1268 | 4732 | 13.80% | —
[121] | DCGAN | SDNET2018[122] | 分类 | IL | 直接 | 200 | 2000 | 1% | —
[123] | DCGAN | 齿轮 | 分类 | IL | 直接 | — | — | 3.06% | IS
[124] | WGAN | 焊接 | 分类 | IL | 直接 | — | — | — | —
[125] | DCGAN+AE | WM-811K | 分类 | IL | 直接 | — | — | — | 提出PGI (Polymorphic generative index)
[126] | DCGAN | 聚合物复合材料 | 分类 | IL | 直接 | 59 | 40 | 20% | —
[127] | WGAN | 焊接 | 检测 | IL | 直接 | — | — | — | —
[128] | StyleGAN+WGAN-GP | 钢板 | 分类 | IL | 直接 | — | — | 22.90% | FID
[129] | DCGAN | 火电厂水冷壁 | 检测 | IL | 直接 | 300 | 40000 | 22.21% | FID
[130] | DCGAN | 碳纤维复合材料 | 分类 | IL | 架构 | 59 | 80 | 22.57% | SNR
[131] | DCGAN | DeepCrack[132] | 分割 | IL | 架构 | 300 | 300 | 2.86% | —
[133] | DCGAN | NEU[134] | 分类 | IL | 架构 | 1080 | 3240 | 4.1% (to 99) | —
[135] | StyleGAN v2 | 混凝土管道 | 分类 | IL | 架构 | 300 | 1000 | 1.60% | —
[136] | DCGAN | 轧钢 | 分类 | IL | 架构 | — | — | — | —
[137] | ProGAN | GC10-DET[138] | 检测 | IL | 架构 | — | 2000 | 75% | FID
[139] | StyleGAN | DeepCrack | 检测, 分割 | IL | 架构 | 5477 | 10000 | 7.16% | IS, FID
[140] | DCGAN | NEU, PCB | 分类 | IL | 训练目标 | 10 | 1500 | 1.70% | FID, MMD
[141] | multiGAN | CODEBRIM[142] | — | IL | 训练目标 | — | — | — | FID
[143] | StyleGAN v2+WGAN | 卫生陶瓷 | 分类 | IL | 训练目标 | 325 | 277 | 16.70% | FID
[144] | DCGAN | NEU | 分类 | IL | 训练目标 | 400 | 2600 | 12% | FID, IS
[145] | AE | 源: MixedWM38[146]; 目标: WM-811K | 分类 | IL | 训练目标 | 435 | 10003 | 75% | —
[147] | multiGAN | GC10-DET, NEU, 太阳能铝型材框架 | 检测 | IL | 训练目标 | 100 | 200 | —, 12.7%, 5.70% (to 99) | FID, SDS
[148] | WGAN | 三种高光谱图像 | 分类 | IL | 训练目标 | 21025, 207400, 111104 | 部分小样本类别数据扩增一倍 | 2.06%, 2.02%, 2.87% | —
[149] | multiGAN | 船舶涂层 | — | IL | 训练目标 | — | — | — | IS, FID
[150] | WGAN-GP | NEU, MTD[151] | 检测 | BB | 图像处理 | 250 | 1600 | 22.0%, 45.10% | mAP
[152] | DCGAN | 粒子板 | 检测 | BB | 图像处理 | 2096 | 1310 | — | —
[153] | StyleGAN v2 DiffAugment | NEU | 分割 | BM | 图像处理 | 163 | 100 | 2.90% | —
[154] | DM | MVTec | 分割 | BM | 图像处理 | — | — | — | —
[155] | AE | KSDD[156], NEU, CrackForest[157], 太阳能铝型材框架 | 分割 | BM | 改进架构 | 150/80 | 在线 | 9.2% | IOU
[158] | StyleGAN v2 | MVTec, 太阳能铝型材框架 | 分割, 分类 | BM | 改进架构 | 6 | 1000 | 18.60% | KID, LPIPS
[159] | DM | MVTec, VISION[160], Cotton[161] | 分割 | BM | 改进架构 | — | — | 4.10% | —

表 10 基于无条件生成模型的扩增方法基础模型使用次数
Table 10 The times of each basic model used in the unconditional generative model-based augmentation methods
模型 | DCGAN | WGAN | StyleGAN | AE | multiGAN | ProGAN | DM
次数 | 25 | 14 | 11 | 5 | 3 | 1 | 2

表 11 基于低维条件生成模型的扩增方法相关文献明细表
Table 11 Detailed literature table on augmentation methods based on generative models with low-dimensional conditions
文献 | 基础模型 | 应用场景 | 任务类型 | 标注类型 | 子类型 | 训练集数据量 | 合成数据数量 | RPI | QEI
[163] | CGAN | 珍珠 | 分类 | IL | 条件引导 | 4200 | 2800 | 26.71% | —
[164] | Condition VAE | 金属表面 | 分类 | IL | 条件引导 | 150 | 150 | 3.1% (to 99) | —
[165] | CGAN, WGAN-GP, ProGAN | 刀具 | 分类 | IL | 条件引导 | — | — | 1.13%, 7.04%, 4.69% | IS, FID
[166] | CGAN | 汽车点焊 | 分类 | IL | 条件引导 | 5142 | 9855 | 20% | FID
[167] | CGAN | 芯片 | 分类 | IL | 条件引导 | — | — | — | —
[168] | WGAN-GP | 遥感高光谱 | 分类 | IL | 条件引导 | — | — | — | —
[169] | CGAN | 混凝土桥柱 | 预测 | IL | 条件引导 | 110 | — | — | FID
[170] | CGAN | 织物 | 分割 | BB | 条件引导 | 1 | 5 | 微调 | —
[171] | CGAN | DeepPCB[172] | 检测 | BB | 条件引导 | — | — | 2.15% | mAP
[173] | CGAN+VAE | MVTec, 手机中框 | 分割 | BM | 条件引导 | 12 (裁剪为 700) | 2108 | 3.66% | —
[174] | DM | 燃气轮机、压缩机和燃烧室内窥镜图 | 分割, 检测 | BM | 条件引导 | 1170 | — | 14.6% | L2, FID, 互信息
[175] | ACGAN | Salinas, Indiana Pines, Kennedy Space Center | 分类 | IL | 类别拟合 | 53785, 5211, 10249 | 100 | 0.6987%, 0.4401%, 0.452% | —
[176] | ProGAN+ACGAN | 光伏组件 | 分类 | IL | 类别拟合 | 3744 | 1000 | 2% | —
[177] | ACGAN, DCGAN, InfoGAN[178] | NEU | 分类 | IL | 类别拟合 | 5400 | 3600 | 2.77%, 6.09%, 5.1% | —
[179] | ACGAN | 钢板 | 分类 | IL | 类别拟合 | 200 | 1000 | 23.40% | IS, FID
[180] | ACGAN | NEU | 分类 | IL | 类别拟合 | — | — | — | —

表 12 基于低维条件生成模型的扩增方法基础模型使用次数
Table 12 The times of each basic model used in the augmentation methods based on generative models with low-dimensional condition
模型 | CGAN | ACGAN | ProGAN | WGAN | AE | DM
次数 | 8 | 5 | 2 | 2 | 2 | 1

表 13 基于图像条件生成模型的扩增方法相关文献明细表
Table 13 Detailed literature table on augmentation methods based on image-conditional generative models
文献 | 基础模型 | 应用场景 | 任务类型 | 标注类型 | 子类型 | 训练集数据量 | 合成数据数量 | RPI | QEI
[183] | CycleGAN | 扫描电镜图 | 分类 | IL | 直接 | — | — | — | —
[184] | CycleGAN | KSDD, DAGM2007[185] | 分类 | IL | 直接 | 1 | 200 | — | FID
[186] | CycleGAN | 换向器 | 分类 | IL | 模块 | 250 | 700 | 57.47% | FID
[187] | multi-GAN | NEU | 分类 | IL | 模块 | 10 | 100 | 3.40% | SSIM
[188] | CycleGAN | 墙面裂缝 | 分类 | IL | 模块 | 11000 | 25230 | 1.80% | FID, KID
[189] | CGAN | 混凝土路面 | 分割 | IL | 模块 | 1960 | — | — | IS
[190] | CGAN | 钢板, 木材, 磁瓦 | 分类 | IL | 模块 | — | — | — | AUC
[191] | ACGAN+CycleGAN | CODEBRIM | 分类 | IL | 模块 | — | 50000 | 7% | FID
[192] | CycleGAN | KSDD, DAGM2007, 玻璃瓶 | 分类 | IL | 模块 | 50, 150, 21200 | | 1.9%, 2.62%, 3.18% | FID
[193] | multiGAN | 多普勒频谱 | 分类 | IL | 模块 | 20 | 200 | 5% | IS, ACC
[194] | CycleGAN | 环氧树脂滴液 | 分类 | IL | 训练目标 | 16 | 1400 | 160% | PSNR, UQI, VIF
[195] | Pix2pix | 线阵扫描相机获取紧固件数据集 | 分类 | IL | 训练目标 | 270 | — | 14% | IS, FID
[196] | ACGAN | MVTec, MTD | 分类 | IL | 训练目标 | — | — | — | FID
[197] | Pix2pix | 发光二极管芯片 | 检测 | IL | 训练目标 | 50 | 200 | 9.08% | FID
[198] | CycleGAN | 绝缘子 | 检测 | IL | 训练目标 | 1200 | — | 59.89% | —
[199] | CGAN | 输电线减震器 | — | BB | 后处理 | 2500 | — | — | IS, FID, PSNR, SSIM, SD
[200] | CycleGAN+WGAN | 石油管道 | 检测 | BB | 后处理 | 706 | 3200 | 89.60% | —
[201] | CycleGAN | 铁路缺陷 | 检测 | BB | 后处理 | 100 | 100 | — | —
[202] | CycleGAN | 木材, DAGM2007 | 分割 | BM | 后处理 | 100/500 | 50/100 | 79.2%, — | 人类评价标注的质量
[203] | CycleGAN | 太阳能电池板 | 分割 | BM | 后处理 | 50 | 200 | 79.45% | FID
[9] | Pix2pix | DAGM2007, NEU, MTD | 检测 | BM | 人工 | — | — | — | —
[204] | Pix2pix | 遥感道路图像 | 分割 | BM | 人工 | — | — | 1.40% | —
[205] | CycleGAN | 汽车零部件, MTD | 分割 | BM | 人工 | 1350 | 3600 | 4.31%, 28.6% | FID, LPIPS, 人工
[206] | Pix2pix | 固体废物 | 分割 | BM | 人工 | 1630 | 3386 | 3.31% | —
[207] | CycleGAN | 铁路缺陷 | 分割 | BM | 人工 | 500 | 1334 | 21.50% | FID, LPIPS
[208] | Pix2pix | 刀具 | 检测 | BM | 人工 | 100 | 400 | 9.87% | SSIM
[209] | Pix2pixHD, Pix2pix, CycleGAN, OASIS | 混凝土 | 分类, 分割, 检测 | BM | 人工 | 500 | 500 | 15.60% | FID, IS
[210] | DM | MVTec, BTAD[211], KSDD2[212] | 分割 | BM | 人工 | 10, 10, 246 | — | — | —
[213] | CGAN+Pix2pix+WGAN | OLED面板 | 分类 | BB | 自动 | 150 | 4000 | 15% | —
[214] | CycleGAN | TILDA[215] | 分割 | BM | 自动 | 200 | 200 | 80.50% | —
[216] | AE+Pix2pix | KSDD等 | 分割 | BM | 自动 | 150 | 300 | 11.85% | SSIM, PSNR, FID
[217] | Pix2pix+WGAN | 碳纤维 | 分割 | BM | 自动 | 300 | 2700 | 32.87% | —
[218] | DCGAN+Pix2pix | 太阳能铝型材 | 分割 | BM | 自动 | 7800 | 15600 | 1.25% | —
[219] | Pix2pix+VAE | 红外小目标 | 分割 | BM | 自动 | 160 | 262 | 4.20% | IoU
[220] | Pix2pix | MVTec, 手机中框 | 分割 | BM | 自动 | 8 (裁剪为 500)/— | 1000/— | 6.86%, — | FID, SSIM
[221] | DM | MVTec | 分割 | BM | 自动 | 1/3测试集 | 1000 | — | IS, IC-L

表 14 基于图像条件生成模型的扩增方法基础模型使用次数
Table 14 The times of each basic model used in the image-conditional generative model-based augmentation methods
模型 | CycleGAN | Pix2pix | CGAN | multi-GAN | AE | ACGAN | DCGAN | DM
次数 | 15 | 12 | 4 | 2 | 3 | 2 | 1 | 2

表 15 不同检测任务中各类模型使用次数
Table 15 The times of each model used in different inspection tasks
任务类型 | DCGAN | WGAN | AE | StyleGAN | CGAN | ACGAN | CycleGAN | Pix2pix | ProGAN | multi-GAN | DM
分割 | 3 | 2 | 3 | 2 | 3 | 0 | 6 | 7 | 0 | 1 | 5
目标检测 | 1 | 3 | 0 | 2 | 2 | 0 | 2 | 2 | 0 | 0 | 1
分类 | 12 | 8 | 4 | 4 | 7 | 7 | 7 | 2 | 2 | 3 | 1
总和 | 16 | 13 | 7 | 8 | 12 | 7 | 15 | 11 | 2 | 4 | 7

表 16 应用到不同检测任务中的基于模型生成的扩增方法数
Table 16 The number of model-based generation augmentation methods applied to different inspection tasks
应用任务 | 无条件 | 低维条件 | 图像条件 | 总和
分类 | 24 | 11 | 13 | 48
目标检测 | 10 | 1 | 6 | 17
分割 | 9 | 3 | 16 | 28

表 17 不同基于模型生成的扩增方法的评价指标使用次数
Table 17 The times of each evaluation index used in different model-based generation augmentation methods
方法类型 | FID | IS | LPIPS | SSIM | KID | MMD | PSNR | SNR | UQI | VIF | SD
无条件 | 14 | 7 | 1 | 1 | 2 | 3 | 1 | 2 | 0 | 0 | 0
低维条件 | 5 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
图像条件 | 14 | 5 | 2 | 5 | 1 | 0 | 3 | 0 | 1 | 1 | 1
总和 | 33 | 15 | 4 | 6 | 3 | 3 | 4 | 2 | 1 | 1 | 1

表 18 常用工业数据集
Table 18 Common industrial datasets
数据集名称 | 标注类型 | 场景 | 缺陷种类 | 特点 | 数量
MVTec[23] | BM | 15种综合 | 73 | 成像位置固定, 产品和缺陷类别丰富, 各类别缺陷数量较少 | 训练集: 3629 N; 测试集: 1258 D + 467 N
VISION[160] | BB, BM | 14种综合 | 44 | 多种真实工业场景下的高分辨率图像, 一些产品具有复杂的结构, 标注信息丰富 | 训练集: 1959 D; 验证集: 2514 D; 测试集: 5364 D
DeepCrack[132] | BM | 混凝土路面 | — | 缺陷内容简单, 多为黑色的条状裂缝 | 训练集: 300 D; 测试集: 237 D
WM-811K[114] | IL (部分) | 晶圆 | — | 结构固定, 缺陷内容表现为不均匀的色块 | 训练集: 17625 D + 36731 N; 测试集: 7894 D + 110701 N
DeepPCB[172] | BB | PCB | — | 组件成像为黑色, 背景为白色; 缺陷类别丰富, 内容表现为黑白色的突起或缺失 | 训练集: 1000 D + 1000 N; 测试集: 500 D + 500 N
DAGM2007[185] | BB | 10种合成纹理 | — | 细粒度、纹理复杂的灰度图像, 缺陷与背景差异度小, 每个缺陷图像上恰有一个缺陷 | 训练集: 1050 D + 7000 N; 测试集: 1050 D + 7000 N
NEU[134] | BB | 带钢 | 6 | 背景复杂的灰度图像, 部分类别缺陷与背景差异度小 | 1800 D
CODEBRIM[142] | BB | 混凝土 | 4 | 真实场景桥梁成像, 背景复杂 | 1052 D + 538 N
GC10-DET[138] | BB | 钢板 | 10 | 真实钢带灰度图 | 3570 D
SDNET2018[122] | IL | 混凝土 | 1 | 背景干净的灰度图像, 缺陷表现为黑色细条纹 | 8484 D + 47608 N
KolektorSDD[156] | BM | 电子换向器 | 1 | 背景纹理丰富的灰度图像, 缺陷多为横向灰黑色条状 | 52 D + 347 N
KolektorSDD2[212] | BM | 单一产品表面 | — | 形状、纹理丰富的细粒度颜色缺陷 | 训练集: 246 D + 2085 N; 测试集: 110 D + 894 N
CrackForest[157] | BM | 混凝土路面 | — | 真实场景道路成像, 图像间差异较大, 噪声较多 | 118 D + 37 N
MTD[151] | BM | 磁瓦 | 5 | 灰度图像, 缺陷表现为与背景深浅不一的灰度区域 | 392 D + 952 N
TILDA[215] | IL | 8种织物 | — | 纹理种类丰富, 没有像素级标注 | 2800 D + 400 N

表 19 生成图像量化评价结果对比. 每一行的最优和次优值分别用加粗和下划线表示. ↓ 表示值越低越好, ↑ 表示值越高越好
Table 19 Comparison of quantitative evaluation results for generated images. The optimal and suboptimal values for each row are shown in bold and underlined, respectively. ↓ indicates that lower values are better, and ↑ indicates that higher values are better
指标 | CycleGAN | StyleGAN v2 | DFMGAN | Defect-Gen | AE | VAE | AnomalyDiffusion
FID ↓ | 181.00 | 101.36 | 135.81 | 190.26 | 259.34 | 436.61 | 107.17
IS ↑ | 1.40 | 1.19 | 1.26 | 2.40 | 1.67 | 1.55 | 2.04
LPIPS (%) ↑ | 10.34 | 23.06 | 22.91 | 24.66 | 33.67 | 3.68 | 13.70
SSIM (%) ↑ | 16.38 | 11.12 | 6.89 | 5.92 | 3.71 | 3.94 | 7.72
PSNR ↑ | 7.03 | 9.82 | 6.72 | 6.79 | 7.72 | 5.48 | 7.05

表 20 真实数据集和扩增数据集的平均准确率, 加粗和下划线分别表示最优和次优结果 (%)
Table 20 Average accuracy of real and augmented datasets, bold and underlined line indicate the optimal and suboptimal results, respectively (%)
训练集 | 电缆 | 地毯 | 螺丝钉 | 木板
真实 | 64.97 | 98.35 | 86.70 | 98.68
CycleGAN | 86.62 | 97.80 | 76.15 | 97.37
StyleGAN v2 | 89.17 | 100.00 | 81.19 | 68.42
DFMGAN | 85.99 | 100.00 | 72.48 | 91.45
Defect-Gen | 83.44 | 99.45 | 79.36 | 100.00
AE | 78.34 | 98.90 | 88.99 | 100.00
VAE | 74.52 | 77.98 | 99.45 | 98.03
AnomalyDiffusion | 85.35 | 57.34 | 98.35 | 100.00

表 21 分割性能对比, 加粗和下划线分别表示最优和次优结果 (%)
Table 21 Comparison of segmentation performance, bold and underlined line indicate the optimal and suboptimal results, respectively (%)
训练集 | 指标 | 电缆 | 地毯 | 螺丝钉 | 木板
真实 | IoU | 55.01 | 64.79 | 26.08 | 68.28
真实 | F1 | 70.97 | 78.63 | 41.37 | 81.15
AnomalyDiffusion | IoU | 59.74 | 60.42 | 29.87 | 48.20
AnomalyDiffusion | F1 | 74.80 | 75.33 | 46.00 | 65.04
DFMGAN | IoU | 58.36 | 66.14 | 32.72 | 66.54
DFMGAN | F1 | 73.71 | 79.62 | 49.30 | 79.91
DefectGen | IoU | 60.51 | 67.41 | 42.99 | 69.33
DefectGen | F1 | 75.40 | 80.53 | 60.13 | 81.89

表 22 扩增样本训练检测网络时的应用方式
Table 22 Application modes of training inspection network with augmented samples
应用方式 | 流程 | 优点 | 缺点
直接应用 | 与真实训练集混合, 从零开始训练检测网络 | 流程简单 | 低质量合成数据可能引入噪声
预训练 | 扩增样本对检测模型进行预训练, 真实训练集用于微调 | 提供比随机初始化更优的参数 | 域差异较大时微调效果不明显
微调 | 真实训练集训练完成后, 采用扩增样本对网络进行微调 | 可以更有针对性地学习缺陷特征 | 模型丢失对真实数据集的特征提取能力
联合训练 | 联合训练生成模型和缺陷检测网络 | 检测模型直接学习难样本特征 | 训练时间成本高, 对模型的设计要求高
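表22中"直接应用"方式的数据组织可以示意如下: 将真实样本与扩增(合成)样本拼接并打乱后, 再从零训练检测网络. 该示意与具体检测网络无关, 函数名与数据布局均为本文假设:

```python
import numpy as np

def build_mixed_training_set(real_images, real_labels,
                             synth_images, synth_labels, rng=None):
    # 将真实样本与扩增样本拼接为混合训练集
    rng = rng or np.random.default_rng()
    images = np.concatenate([real_images, synth_images], axis=0)
    labels = np.concatenate([real_labels, synth_labels], axis=0)
    order = rng.permutation(len(images))   # 打乱索引, 避免真实/合成样本成块出现
    return images[order], labels[order]
```

如表22所述, 该方式流程最简单, 但低质量合成样本会作为噪声直接进入训练集, 实践中常需先用量化指标筛选合成样本.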
[1] Hao R Y, Lu B Y, Cheng Y, Li X, Huang B Q. A steel surface defect inspection approach towards smart industrial monitoring. Journal of Intelligent Manufacturing, 2021, 32(7): 1833−1843 doi: 10.1007/s10845-020-01670-2 [2] Diers J, Pigorsch C. A survey of methods for automated quality control based on images. International Journal of Computer Vision, 2023, 131(10): 2553−2581 doi: 10.1007/s11263-023-01822-w [3] Chen M Q, Yu L J, Zhi C, Sun R J, Zhu S W, Gao Z Y, et al. Improved faster R-CNN for fabric defect detection based on Gabor filter with genetic algorithm optimization. Computers in Industry, 2022, 134: Article No. 103551 doi: 10.1016/j.compind.2021.103551 [4] Lee D H, Kim E S, Choi S H, Bae Y M, Park J B, Oh Y C, et al. Development of taxonomy for classifying defect patterns on wafer bin map using Bin2Vec and clustering methods. Computers in Industry, 2023, 152: Article No. 104005 doi: 10.1016/j.compind.2023.104005 [5] Lv C K, Shen F, Zhang Z T, Xu D, He Y H. A novel pixel-wise defect inspection method based on stable background reconstruction. IEEE Transactions on Instrumentation and Measurement, 2021, 70: Article No. 5005213 [6] Lv C K, Zhang Z T, Shen F, Zhang F, Su H. A fast surface defect detection method based on background reconstruction. International Journal of Precision Engineering and Manufacturing, 2020, 21(3): 363−375 doi: 10.1007/s12541-019-00262-2 [7] Hebbache L, Amirkhani D, Allili M S, Hammouche N, Lapointe J F. Leveraging saliency in single-stage multi-label concrete defect detection using unmanned aerial vehicle imagery. Remote Sensing, 2023, 15(5): Article No. 1218 doi: 10.3390/rs15051218 [8] Szarski M, Chauhan S. An unsupervised defect detection model for a dry carbon fiber textile. Journal of Intelligent Manufacturing, 2022, 33(7): 2075−2092 doi: 10.1007/s10845-022-01964-7 [9] Liu T H, He Z S. TAS.2-Net: Triple-attention semantic segmentation network for small surface defect detection. 
IEEE Transactions on Instrumentation and Measurement, 2022, 71: Article No. 5004512 [10] Li B, Liu H T, Chen L Y, Lee Y J, Li C Y, Liu Z W. Benchmarking and analyzing generative data for visual recognition. arXiv preprint arXiv: 2307.13697, 2023. [11] Czimmermann T, Ciuti G, Milazzo M, Chiurazzi M, Roccella S, Oddo C M, et al. Visual-based defect detection and classification approaches for industrial applications——A survey. Sensors, 2020, 20(5): Article No. 1459 doi: 10.3390/s20051459 [12] Shorten C, Khoshgoftaar T M. A survey on image data augmentation for deep learning. Journal of Big Data, 2019, 6(1): Article No. 60 doi: 10.1186/s40537-019-0197-0 [13] Zhang L R, Zhang S H, Xie G Y, Liu J Q, Yan H, Wang J B, et al. What makes a good data augmentation for few-shot unsupervised image anomaly detection? In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Vancouver, Canada: IEEE, 2023. 4345−4354 [14] Nalepa J, Marcinkiewicz M, Kawulok M. Data augmentation for brain-tumor segmentation: A review. Frontiers in Computational Neuroscience, 2019, 13: Article No. 83 doi: 10.3389/fncom.2019.00083 [15] Bissoto A, Valle E, Avila S. GAN-based data augmentation and anonymization for skin-lesion analysis: A critical review. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Nashville, USA: IEEE, 2021. 1847−1856 [16] Chen Y Z, Yang X H, Wei Z H, Heidari A A, Zheng N G, Li Z C, et al. Generative adversarial networks in medical image augmentation: A review. Computers in Biology and Medicine, 2022, 144: Article No. 105382 doi: 10.1016/j.compbiomed.2022.105382 [17] Xu M L, Yoon S, Fuentes A, Park D S. A comprehensive survey of image augmentation techniques for deep learning. Pattern Recognition, 2023, 137: Article No. 109347 doi: 10.1016/j.patcog.2023.109347 [18] Zhong X P, Zhu J W, Liu W X, Hu C X, Deng Y L, Wu Z Z. An overview of image generation of industrial surface defects. 
Sensors, 2023, 23(19): Article No. 8160 doi: 10.3390/s23198160 [19] 汤健, 崔璨麟, 夏恒, 乔俊飞. 面向复杂工业过程的虚拟样本生成综述. 自动化学报, 2024, 50(4): 688−718Tang Jian, Cui Can-Lin, Xia Heng, Qiao Jun-Fei. A survey of virtual sample generation for complex industrial processes. Acta Automatica Sinica, 2024, 50(4): 688−718 [20] 汤健, 郭海涛, 夏恒, 王鼎, 乔俊飞. 面向工业过程的图像生成及其应用研究综述. 自动化学报, 2024, 50(2): 211−240Tang Jian, Guo Hai-Tao, Xia Heng, Wang Ding, Qiao Jun-Fei. Image generation and its application research for industrial process: A survey. Acta Automatica Sinica, 2024, 50(2): 211−240 [21] Goodfellow I J, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. In: Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press, 2014. 2672−2680 [22] Ho J, Jain A, Abbeel P. Denoising diffusion probabilistic models. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2020. Article No. 574 [23] Bergmann P, Fauser M, Sattlegger D, Steger C. MVTec AD——A comprehensive real-world dataset for unsupervised anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 9584−9592 [24] Feng Z Q, Guo L, Huang D R, Li R Z. Electrical insulator defects detection method based on YOLOv5. In: Proceedings of the IEEE 10th Data Driven Control and Learning Systems Conference (DDCLS). Suzhou, China: IEEE, 2021. 979−984 [25] Akram M W, Li G Q, Jin Y, Chen X, Zhu C G, Zhao X D, et al. CNN based automatic detection of photovoltaic cell defects in electroluminescence images. Energy, 2019, 189: Article No. 116319 doi: 10.1016/j.energy.2019.116319 [26] Sassi P, Tripicchio P, Avizzano C A. A smart monitoring system for automatic welding defect detection. 
IEEE Transactions on Industrial Electronics, 2019, 66(12): 9641−9650 doi: 10.1109/TIE.2019.2896165 [27] Saqlain M, Abbas Q, Lee J Y. A deep convolutional neural network for wafer defect identification on an imbalanced dataset in semiconductor manufacturing processes. IEEE Transactions on Semiconductor Manufacturing, 2020, 33(3): 436−444 doi: 10.1109/TSM.2020.2994357 [28] Zhong Z, Zheng L, Kang G L, Li S Z, Yang Y. Random erasing data augmentation. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 13001−13008 [29] DeVries T, Taylor G W. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv: 1708.04552, 2017. [30] Singh K K, Yu H, Sarmasi A, Pradeep G, Lee Y J. Hide-and-seek: A data augmentation technique for weakly-supervised localization and beyond. arXiv preprint arXiv: 1811.02545, 2018. [31] Chen P G, Liu S, Zhao H S, Jia J Y. GridMask data augmentation. arXiv preprint arXiv: 2001.04086, 2020. [32] Li P, Li X Y, Long X. FenceMask: A data augmentation approach for pre-extracted image features. arXiv preprint arXiv: 2006.07877, 2020. [33] Zheng X Q, Wang H C, Chen J, Kong Y G, Zheng S. A generic semi-supervised deep learning-based approach for automated surface inspection. IEEE Access, 2020, 8: 114088−114099 doi: 10.1109/ACCESS.2020.3003588 [34] Li X Y, Zheng Y, Chen B, Zheng E R. Dual attention-based industrial surface defect detection with consistency loss. Sensors, 2022, 22(14): Article No. 5141 doi: 10.3390/s22145141 [35] Wu Y P, Qin Y, Qian Y, Guo F. Automatic detection of arbitrarily oriented fastener defect in high-speed railway. Automation in Construction, 2021, 131: Article No. 103913 doi: 10.1016/j.autcon.2021.103913 [36] Wang S, Xia X J, Ye L Q, Yang B B. Automatic detection and classification of steel surface defect using deep convolutional neural networks. Metals, 2021, 11(3): Article No. 388 doi: 10.3390/met11030388 [37] Chiu M C, Chen T M. 
Applying data augmentation and mask R-CNN-based instance segmentation method for mixed-type wafer maps defect patterns classification. IEEE Transactions on Semiconductor Manufacturing, 2021, 34(4): 455−463 doi: 10.1109/TSM.2021.3118922 [38] Ding C B, Pang G S, Shen C H. Catching both gray and black swans: Open-set supervised anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 7378−7388 [39] Jamshidi M, El-Badry M, Nourian N. Improving concrete crack segmentation networks through CutMix data synthesis and temporal data fusion. Sensors, 2023, 23(1): Article No. 504 doi: 10.3390/s23010504 [40] Yun S, Han D, Chun S, Oh S J, Yoo Y, Choe J. CutMix: Regularization strategy to train strong classifiers with localizable features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea (South): IEEE, 2019. 6022−6031 [41] Harris E, Marcu A, Painter M, Niranjan M, Prügel-Bennett A, Hare J. FMix: Enhancing mixed sample data augmentation. arXiv preprint arXiv: 2002.12047, 2021.Harris E, Marcu A, Painter M, Niranjan M, Prügel-Bennett A, Hare J. FMix: Enhancing mixed sample data augmentation. arXiv preprint arXiv: 2002.12047, 2021. [42] Uddin A F M S, Monira M S, Shin W, Chung T, Bae S H. SaliencyMix: A saliency guided data augmentation strategy for better regularization. arXiv preprint arXiv: 2006.01791, 2021.Uddin A F M S, Monira M S, Shin W, Chung T, Bae S H. SaliencyMix: A saliency guided data augmentation strategy for better regularization. arXiv preprint arXiv: 2006.01791, 2021. [43] Takahashi R, Matsubara T, Uehara K. Data augmentation using random image cropping and patching for deep CNNs. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(9): 2917−2931 doi: 10.1109/TCSVT.2019.2935128 [44] Gong C Y, Wang D L, Li M, Chandra V, Liu Q. KeepAugment: A simple information-preserving data augmentation approach. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 1055−1064 [45] Qin J, Fang J M, Zhang Q, Liu W Y, Wang X G. ResizeMix: Mixing data with preserved object information and true labels. arXiv preprint arXiv: 2012.11101, 2020. [46] Huang S L, Wang X C, Tao D C. SnapMix: Semantically proportional mixing for augmenting fine-grained data. arXiv preprint arXiv: 2012.04846, 2020.Huang S L, Wang X C, Tao D C. SnapMix: Semantically proportional mixing for augmenting fine-grained data. arXiv preprint arXiv: 2012.04846, 2020. [47] Ghiasi G, Cui Y, Srinivas A, Qian R, Lin T Y, Cubuk E D, et al. Simple copy-paste is a strong data augmentation method for instance segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 2918−2928 [48] Xie T S, Cheng X, Wang X M, Liu M H, Deng J L, Zhou T, Liu M. Cut-thumbnail: A novel data augmentation for convolutional neural network. In: Proceedings of the 29th ACM International Conference on Multimedia. Virtual Event: Association for Computing Machinery, 2021. 1627−1635Xie T S, Cheng X, Wang X M, Liu M H, Deng J L, Zhou T, Liu M. Cut-thumbnail: A novel data augmentation for convolutional neural network. In: Proceedings of the 29th ACM International Conference on Multimedia. Virtual Event: Association for Computing Machinery, 2021. 1627−1635 [49] Kim Y, Uddin A F M S, Bae S H. Local augment: Utilizing local bias property of convolutional neural networks for data augmentation. IEEE Access, 2021, 9: 15191−15199 doi: 10.1109/ACCESS.2021.3050758 [50] Zhang J W, Zhang Y C, Xu X W. ObjectAug: Object-level data augmentation for semantic image segmentation. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN). Shenzhen, China: IEEE, 2021. 1−8 [51] Seo J W, Jung H G, Lee S W. Self-augmentation: Generalizing deep networks to unseen classes for few-shot learning. 
Neural Networks, 2021, 138: 140−149 doi: 10.1016/j.neunet.2021.02.007 [52] Li C L, Sohn K, Yoon J, Pfister T. CutPaste: Self-supervised learning for anomaly detection and localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 9659−9669 [53] Choi J, Lee C, Lee D, Jung H. SalfMix: A novel single image-based data augmentation technique using a saliency map. Sensors, 2021, 21(24): Article No. 8444 doi: 10.3390/s21248444 [54] Han J L, Fang P F, Li W H, Hong J, Armin M A, Reid I D, et al. You only cut once: Boosting data augmentation with a single cut. In: Proceedings of the 39th International Conference on Machine Learning. Baltimore, USA: PMLR, 2022. 8196−8212 [55] Kumar T, Mileo A, Brennan R, Bendechache M. RSMDA: Random slices mixing data augmentation. Applied Sciences, 2023, 13(3): Article No. 1711 doi: 10.3390/app13031711 [56] Bhattacharya A, Cloutier S G. End-to-end deep learning framework for printed circuit board manufacturing defect classification. Scientific Reports, 2022, 12(1): Article No. 12559 doi: 10.1038/s41598-022-16302-3 [57] Yang T C, Liu Y M, Huang Y P, Liu J B, Wang S C. Symmetry-driven unsupervised abnormal object detection for railway inspection. IEEE Transactions on Industrial Informatics, 2023, 19(12): 11487−11498 doi: 10.1109/TII.2023.3246995 [58] Rippel O, Zwinge C, Merhof D. Increasing the generalization of supervised fabric anomaly detection methods to unseen fabrics. Sensors, 2022, 22(13): Article No. 4750 doi: 10.3390/s22134750 [59] Zhang H Y, Cissé M, Dauphin Y N, Lopez-Paz D. Mixup: Beyond empirical risk minimization. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018. 1−13 [60] Hiroshi I. Data augmentation by pairing samples for images classification. arXiv preprint arXiv: 1801.02929, 2018. [61] David B, Carlini N, Goodfellow I, Oliver A, Papernot N, Colin C. 
MixMatch: A holistic approach to semi-supervised learning. arXiv preprint arXiv: 1905.02249, 2019. [62] Hendrycks D, Mu N, Cubuk E D, Zoph B, Gilmer J, Lakshminarayanan B. AugMix: A simple data processing method to improve robustness and uncertainty. In: Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR, 2020. 1−15 [63] Sohn K, Berthelot D, Li C L, Zhang Z Z, Carlini N, Cubuk E D, et al. FixMatch: Simplifying semi-supervised learning with consistency and confidence. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2020. Article No. 51 [64] Berthelot D, Carlini N, Cubuk E D, Kurakin A, Sohn K, Zhang H, et al. ReMixMatch: Semi-supervised learning with distribution alignment and augmentation anchoring. arXiv preprint arXiv: 1911.09785, 2019. [65] Kim J H, Choo W, Song H O. Puzzle mix: Exploiting saliency and local statistics for optimal mixup. arXiv preprint arXiv: 2009.06962, 2020. [66] Hong M, Choi J, Kim G. StyleMix: Separating content and style for enhanced data augmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 14857−14865 [67] Liu X L, Shen F R, Zhao J, Nie C H. RandoMix: A mixed sample data augmentation method with multiple mixed modes. Multimedia Tools and Applications, 2025, 84(8): 4343−4359 [68] Zhang Y, Liu X F, Guo J, Zhou P C. Surface defect detection of strip-steel based on an improved PP-YOLOE-m detection network. Electronics, 2022, 11(16): Article No.
2603 doi: 10.3390/electronics11162603 [69] Zeng N Y, Wu P S, Wang Z D, Li H, Liu W B, Liu X H. A small-sized object detection oriented multi-scale feature fusion approach with application to defect detection. IEEE Transactions on Instrumentation and Measurement, 2022, 71: Article No. 3507014 [70] Yao X C, Li R Q, Zhang J, Sun J, Zhang C Y. Explicit boundary guided semi-push-pull contrastive learning for supervised anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023. 24490−24499 [71] Gyimah N K, Gupta K D, Nabil M, Yan X Y, Girma A, Homaifar A, et al. A discriminative deeplab model (DDLM) for surface anomaly detection and localization. In: Proceedings of the 13th Annual Computing and Communication Workshop and Conference (CCWC). Las Vegas, USA: IEEE, 2023. 1137−1144 [72] Cubuk E D, Zoph B, Mané D, Vasudevan V, Le Q V. AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 113−123 [73] Ho D, Liang E, Stoica I, Abbeel P. Population based augmentation: Efficient learning of augmentation policy schedules. In: Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: PMLR, 2019. 2731−2741 [74] Lim S, Kim I, Kim T, Kim C, Kim S. Fast AutoAugment. arXiv preprint arXiv: 1905.00397, 2019. [75] Cubuk E D, Zoph B, Shlens J, Le Q V. RandAugment: Practical automated data augmentation with a reduced search space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Seattle, USA: IEEE, 2020. 3008−3017 [76] Hataya R, Zdenek J, Yoshizoe K, Nakayama H. Faster AutoAugment: Learning augmentation strategies using backpropagation. In: Proceedings of the 16th European Conference on Computer Vision.
Glasgow, UK: Springer, 2020. 1−16 [77] Hataya R, Zdenek J, Yoshizoe K, Nakayama H. Meta approach to data augmentation optimization. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2022. 3535−3544 [78] Zhang X Y, Wang Q, Zhang J, Zhong Z. Adversarial AutoAugment. In: Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: ICLR, 2020. 1−13 [79] Li Y G, Hu G S, Wang Y T, Hospedales T, Robertson N M, Yang Y X. Differentiable automatic data augmentation. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 580−595 [80] Lin S Q, Yu T, Feng R Y, Chen Z B. Patch AutoAugment. arXiv preprint arXiv: 2103.11099, 2021. [81] Mehta S, Naderiparizi S, Faghri F, Horton M, Chen L L, Farhadi A, et al. RangeAugment: Efficient online augmentation with range learning. arXiv preprint arXiv: 2212.10553, 2022. [82] Liu Z K, Zhou Y M, Xu Y S, Wang Z L. SimpleNet: A simple network for image anomaly detection and localization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Vancouver, Canada: IEEE, 2023. 20402−20411 [83] You Z Y, Cui L, Shen Y J, Yang K, Lu X, Zheng Y, et al. A unified model for multi-class anomaly detection. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2022. Article No. 330 [84] Zavrtanik V, Kristan M, Skočaj D. DSR: A dual subspace re-projection network for surface anomaly detection. In: Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer, 2022. 539−554 [85] Bond-Taylor S, Leach A, Long Y, Willcocks C G. Deep generative modelling: A comparative review of VAEs, GANs, normalizing flows, energy-based and autoregressive models.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(11): 7327−7347 doi: 10.1109/TPAMI.2021.3116668 [86] Makhzani A, Shlens J, Jaitly N, Goodfellow I, Frey B. Adversarial autoencoders. arXiv preprint arXiv: 1511.05644, 2015. [87] Kingma D P, Welling M. Auto-encoding variational Bayes. In: Proceedings of the 2nd International Conference on Learning Representations. Banff, Canada: ICLR, 2014. 1−13 [88] Karras T, Laine S, Aila T. A style-based generator architecture for generative adversarial networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 4396−4405 [89] Mirza M, Osindero S. Conditional generative adversarial nets. arXiv preprint arXiv: 1411.1784, 2014. [90] Odena A, Olah C, Shlens J. Conditional image synthesis with auxiliary classifier GANs. In: Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: JMLR.org, 2017. 2642−2651 [91] Isola P, Zhu J Y, Zhou T H, Efros A A. Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 5967−5976 [92] Zhu J Y, Park T, Isola P, Efros A A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 2242−2251 [93] Karras T, Aila T, Laine S, Lehtinen J. Progressive growing of GANs for improved quality, stability, and variation. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018. 1−26 [94] Rombach R, Blattmann A, Lorenz D, Esser P, Ommer B. High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022.
10674−10685 [95] Salimans T, Goodfellow I, Zaremba W, Cheung V, Radford A, Chen X. Improved techniques for training GANs. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc., 2016. 2234−2242 [96] Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 2818−2826 [97] Heusel M, Ramsauer H, Unterthiner T, Nessler B, Hochreiter S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 6629−6640 [98] Bińkowski M, Sutherland D J, Arbel M, Gretton A. Demystifying MMD GANs. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018. 1−5 [99] Zhang R, Isola P, Efros A A, Shechtman E, Wang O. The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 586−595 [100] Xu Q T, Huang G, Yuan Y, Guo C, Sun Y, Wu F, et al. An empirical study on evaluation metrics of generative adversarial networks. arXiv preprint arXiv: 1806.07755, 2018. [101] Mathieu M, Couprie C, LeCun Y. Deep multi-scale video prediction beyond mean square error. In: Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico: ICLR, 2016. 1−14 [102] Wang Z, Bovik A C. A universal image quality index. IEEE Signal Processing Letters, 2002, 9(3): 81−84 doi: 10.1109/97.995823 [103] Sheikh H R, Bovik A C. Image information and visual quality. IEEE Transactions on Image Processing, 2006, 15(2): 430−444 doi: 10.1109/TIP.2005.859378 [104] Radford A, Metz L, Chintala S.
Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico: ICLR, 2016. [105] Arjovsky M, Chintala S, Bottou L. Wasserstein GAN. arXiv preprint arXiv: 1701.07875, 2017. [106] Gulrajani I, Ahmed F, Arjovsky M, Dumoulin V, Courville A C. Improved training of Wasserstein GANs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 5769−5779 [107] Miyato T, Kataoka T, Koyama M, Yoshida Y. Spectral normalization for generative adversarial networks. In: Proceedings of the 6th International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018. 1−26 [108] Nguyen T D, Le T, Vu H, Phung D. Dual discriminator generative adversarial nets. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 2667−2677 [109] Chen L S, Yang Q, Yan W J. Generative adversarial network based data augmentation for PV module defect pattern analysis. In: Proceedings of the Chinese Control Conference (CCC). Guangzhou, China: IEEE, 2019. 8422−8427 [110] Zhong J T, Huyan J, Zhang W G, Cheng H L, Zhang J, Tong Z, et al. A deeper generative adversarial network for grooved cement concrete pavement crack detection. Engineering Applications of Artificial Intelligence, 2023, 119: Article No. 105808 doi: 10.1016/j.engappai.2022.105808 [111] Fang Q, Ibarra-Castanedo C, Yuxia D, Erazo-Aux J, Garrido I, Maldague X. Defect enhancement and image noise reduction analysis using partial least square-generative adversarial networks (PLS-GANs) in thermographic nondestructive evaluation. Journal of Nondestructive Evaluation, 2021, 40(4): Article No. 92 doi: 10.1007/s10921-021-00827-0 [112] Tang W Q, Yang Q, Xiong K X, Yan W J.
Deep learning based automatic defect identification of photovoltaic module using electroluminescence images. Solar Energy, 2020, 201: 453−460 doi: 10.1016/j.solener.2020.03.049 [113] Tsai T H, Lee Y C. A light-weight neural network for wafer map classification based on data augmentation. IEEE Transactions on Semiconductor Manufacturing, 2020, 33(4): 663−672 doi: 10.1109/TSM.2020.3013004 [114] Wu M J, Jang J S R, Chen J L. Wafer map failure pattern recognition and similarity ranking for large-scale data sets. IEEE Transactions on Semiconductor Manufacturing, 2015, 28(1): 1−12 doi: 10.1109/TSM.2014.2364237 [115] Situ Z X, Teng S, Liu H L, Luo J H, Zhou Q Q. Automated sewer defects detection using style-based generative adversarial networks and fine-tuned well-known CNN classifier. IEEE Access, 2021, 9: 59498−59507 doi: 10.1109/ACCESS.2021.3073915 [116] Shin H, Ahn Y, Tae S, Gil H, Song M, Lee S. Enhancement of multi-class structural defect recognition using generative adversarial network. Sustainability, 2021, 13(22): Article No. 12682 doi: 10.3390/su132212682 [117] Guo H T, Tang J, Zhang H, Wang D D. A method for generating images of abnormal combustion state in MSWI process based on DCGAN. In: Proceedings of the 3rd International Conference on Industrial Artificial Intelligence (IAI). Shenyang, China: IEEE, 2021. 1−6 [118] Rožanec J M, Zajec P, Theodoropoulos S, Koehorst E, Fortuna B, Mladenić D. Synthetic data augmentation using GAN for improved automated visual inspection. IFAC-PapersOnLine, 2023, 56(2): 11094−11099 doi: 10.1016/j.ifacol.2023.10.817 [119] Shirazi M, Schmitz M, Janssen S, Thies A, Safronov G, Rizk A, et al. Verifying the applicability of synthetic image generation for object detection in industrial quality inspection. In: Proceedings of the 20th IEEE International Conference on Machine Learning and Applications (ICMLA). Pasadena, USA: IEEE, 2021. 1365−1372 [120] Xu B Q, Liu C. 
Pavement crack detection algorithm based on generative adversarial network and convolutional neural network under small samples. Measurement, 2022, 196: Article No. 111219 doi: 10.1016/j.measurement.2022.111219 [121] Dunphy K, Fekri M N, Grolinger K, Sadhu A. Data augmentation for deep-learning-based multiclass structural damage detection using limited information. Sensors, 2022, 22(16): Article No. 6193 doi: 10.3390/s22166193 [122] Dorafshan S, Thomas R J, Maguire M. SDNET2018: An annotated image dataset for non-contact concrete crack detection using deep convolutional neural networks. Data in Brief, 2018, 21: 1664−1668 doi: 10.1016/j.dib.2018.11.015 [123] Gao H B, Zhang Y, Lv W K, Yin J W, Qasim T, Wang D Y. A deep convolutional generative adversarial networks-based method for defect detection in small sample industrial parts images. Applied Sciences, 2022, 12(13): Article No. 6569 doi: 10.3390/app12136569 [124] Le X Y, Mei J H, Zhang H D, Zhou B Y, Xi J T. A learning-based approach for surface defect detection using small image datasets. Neurocomputing, 2020, 408: 112−120 doi: 10.1016/j.neucom.2019.09.107 [125] Park S, You C. Deep convolutional generative adversarial networks-based data augmentation method for classifying class-imbalanced defect patterns in wafer bin map. Applied Sciences, 2023, 13(9): Article No. 5507 doi: 10.3390/app13095507 [126] Liu K X, Li Y J, Yang J G, Liu Y, Yao Y. Generative principal component thermography for enhanced defect detection and analysis. IEEE Transactions on Instrumentation and Measurement, 2020, 69(10): 8261−8269 [127] Zhang H D, Chen Z Z, Zhang C Q, Xi J T, Le X Y. Weld defect detection based on deep learning method. In: Proceedings of the 15th International Conference on Automation Science and Engineering (CASE). Vancouver, Canada: IEEE, 2019. 1574−1579 [128] Song S, Chang K, Yun K, Jun C, Baek J G. Defect synthesis using latent mapping adversarial network for automated visual inspection. 
Electronics, 2022, 11(17): Article No. 2763 doi: 10.3390/electronics11172763 [129] Geng Z Q, Shi C J, Han Y M. Intelligent small sample defect detection of water walls in power plants using novel deep learning integrating deep convolutional GAN. IEEE Transactions on Industrial Informatics, 2023, 19(6): 7489−7497 doi: 10.1109/TII.2022.3159817 [130] Liu K X, Ma Z Y, Liu Y, Yang J G, Yao Y. Enhanced defect detection in carbon fiber reinforced polymer composites via generative kernel principal component thermography. Polymers, 2021, 13(5): Article No. 825 doi: 10.3390/polym13050825 [131] Zhang T J, Wang D L, Mullins A, Lu Y. Integrated APC-GAN and attunet framework for automated pavement crack pixel-level segmentation: A new solution to small training datasets. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(4): 4474−4481 doi: 10.1109/TITS.2023.3236247 [132] Liu Y H, Yao J, Lu X H, Xie R P, Li L. DeepCrack: A deep hierarchical feature learning architecture for crack segmentation. Neurocomputing, 2019, 338: 139−153 doi: 10.1016/j.neucom.2019.01.036 [133] He Y, Song K C, Dong H W, Yan Y H. Semi-supervised defect classification of steel surface based on multi-training and generative adversarial network. Optics and Lasers in Engineering, 2019, 122: 294−302 doi: 10.1016/j.optlaseng.2019.06.020 [134] Song K C, Yan Y H. A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Applied Surface Science, 2013, 285: 858−864 doi: 10.1016/j.apsusc.2013.09.002 [135] Ma D, Liu J H, Fang H Y, Wang N N, Zhang C, Li Z N, et al. A multi-defect detection system for sewer pipelines based on StyleGAN-SDM and fusion CNN. Construction and Building Materials, 2021, 312: Article No. 125385 doi: 10.1016/j.conbuildmat.2021.125385 [136] He D, Xu K, Zhou P, Zhou D D. Surface defect classification of steels with a new semi-supervised learning method. 
Optics and Lasers in Engineering, 2019, 117: 40−48 doi: 10.1016/j.optlaseng.2019.01.011 [137] Zhang H B, Pan D, Liu J H, Jiang Z H. A novel MAS-GAN-based data synthesis method for object surface defect detection. Neurocomputing, 2022, 499: 106−114 doi: 10.1016/j.neucom.2022.05.021 [138] Lv X M, Duan F J, Jiang J J, Fu X, Gan L. Deep metallic surface defect detection: The new benchmark and detection network. Sensors, 2020, 20(6): Article No. 1562 doi: 10.3390/s20061562 [139] Cao X G, Wei H Y, Wang P, Zhang C Y, Huang S K, Li H. High quality coal foreign object image generation method based on StyleGAN-DSAD. Sensors, 2023, 23(1): Article No. 374 doi: 10.1109/JSEN.2022.3224441 [140] Du Z W, Gao L, Li X Y. A new contrastive GAN with data augmentation for surface defect recognition under limited data. IEEE Transactions on Instrumentation and Measurement, 2023, 72: Article No. 3502713 [141] Guo J Y, Wang C, Feng Y. Online adversarial knowledge distillation for image synthesis of bridge defect. In: Proceedings of the 5th International Conference on Computer Science and Application Engineering. Sanya, China: Association for Computing Machinery, 2021. Article No. 96 [142] Mundt M, Majumder S, Murali S, Panetsos P, Ramesh V. Meta-learning convolutional neural architectures for multi-target concrete defect classification with the concrete defect bridge image dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 11188−11197 [143] Ren X Y, Lin W Y, Yang X Q, Yu X H, Gao H J. Data augmentation in defect detection of sanitary ceramics in small and non-i.i.d datasets. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(11): 8669−8678 doi: 10.1109/TNNLS.2022.3152245 [144] Hu J H, Yan P, Su Y T, Wu D Y, Zhou H. A method for classification of surface defect on metal workpieces based on twin attention mechanism generative adversarial network. 
IEEE Sensors Journal, 2021, 21(12): 13430−13441 doi: 10.1109/JSEN.2021.3066603 [145] Li A S, Bertino E, Wu R T, Wu T Y. Building manufacturing deep learning models with minimal and imbalanced training data using domain adaptation and data augmentation. In: Proceedings of the 24th IEEE International Conference on Industrial Technology. Orlando, USA: IEEE, 2023. 1−8 [146] Junliangwangdhu. WaferMap Dataset: MixedWM38 [Online], available: https://github.com/Junliangwangdhu/WaferMap, April 17, 2025 [147] Liu B H, Zhang T R, Yu Y, Miao L G. A data generation method with dual discriminators and regularization for surface defect detection under limited data. Computers in Industry, 2023, 151: Article No. 103963 doi: 10.1016/j.compind.2023.103963 [148] Zhang M Y, Wang Z Y, Wang X Y, Gong M G, Wu Y, Li H. Features kept generative adversarial network data augmentation strategy for hyperspectral image classification. Pattern Recognition, 2023, 142: Article No. 109701 doi: 10.1016/j.patcog.2023.109701 [149] Bu H N, Hu C Z, Yuan X, Ji X Y, Lv H Y, Zhou H G. An image generation method of unbalanced ship coating defects based on IGASEN-EMWGAN. Coatings, 2023, 13(3): Article No. 620 doi: 10.3390/coatings13030620 [150] Jalayer M, Jalayer R, Kaboli A, Orsenigo C, Vercellis C. Automatic visual inspection of rare defects: A framework based on GP-WGAN and enhanced faster R-CNN. In: Proceedings of the IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology. Bandung, Indonesia: IEEE, 2021. 221−227 [151] Huang Y B, Qiu C Y, Guo Y, Wang X N, Yuan K. Surface defect saliency of magnetic tile. In: Proceedings of the 14th International Conference on Automation Science and Engineering (CASE). Munich, Germany: IEEE, 2018. 612−617 [152] Li B Z, Xu Z J, Bian E, Yu C, Gao F, Cao Y L. Particleboard surface defect inspection based on data augmentation and attention mechanisms. In: Proceedings of the 27th International Conference on Automation and Computing. 
Bristol, UK: IEEE, 2022. 1−6 [153] Saiz F A, Alfaro G, Barandiaran I, Graña M. Generative adversarial networks to improve the robustness of visual defect segmentation by semantic networks in manufacturing components. Applied Sciences, 2021, 11(14): Article No. 6368 doi: 10.3390/app11146368 [154] Zhang X M, Xu M, Zhou X Z. RealNet: A feature selection network with realistic synthetic anomaly for anomaly detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2024. 16699−16708 [155] Niu S L, Peng Y R, Li B, Wang X G. A transformed-feature-space data augmentation method for defect segmentation. Computers in Industry, 2023, 147: Article No. 103860 doi: 10.1016/j.compind.2023.103860 [156] Tabernik D, Šela S, Skvarč J, Skočaj D. Segmentation-based deep-learning approach for surface-defect detection. Journal of Intelligent Manufacturing, 2020, 31(3): 759−776 doi: 10.1007/s10845-019-01476-x [157] Shi Y, Cui L M, Qi Z Q, Meng F, Chen Z S. Automatic road crack detection using random structured forests. IEEE Transactions on Intelligent Transportation Systems, 2016, 17(12): 3434−3445 doi: 10.1109/TITS.2016.2552248 [158] Duan Y X, Hong Y, Niu L, Zhang L Q. Few-shot defect image generation via defect-aware feature manipulation. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence. Washington, USA: AAAI, 2023. 571−578 [159] Yang S, Chen Z F, Chen P G, Fang X, Liang Y X, Liu S, et al. Defect spectrum: A granular look of large-scale defect datasets with rich semantics. In: Proceedings of the 18th European Conference on Computer Vision. Milan, Italy: Springer, 2024. 187−203 [160] Bai H P, Mou S C, Likhomanenko T, Cinbis R G, Tuzel O, Huang P, et al. VISION datasets: A benchmark for vision-based industrial inspection. arXiv preprint arXiv: 2306.07890, 2023. [161] Cotton Incorporated. 
Standard fabric defect glossary [Online], available: https://www.cottoninc.com/quality-products/textile-resources/fabric-defect-glossary, April 17, 2025. [162] Karras T, Laine S, Aittala M, Hellsten J, Lehtinen J, Aila T. Analyzing and improving the image quality of StyleGAN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 8107−8116 [163] Xuan Q, Chen Z Z, Liu Y, Huang H M, Bao G J, Zhang D. Multiview generative adversarial network and its application in pearl classification. IEEE Transactions on Industrial Electronics, 2019, 66(10): 8244−8252 doi: 10.1109/TIE.2018.2885684 [164] Yun J P, Shin W C, Koo G, Kim M S, Lee C, Lee S J. Automated defect inspection system for metal surfaces based on deep learning and data augmentation. Journal of Manufacturing Systems, 2020, 55: 317−324 doi: 10.1016/j.jmsy.2020.03.009 [165] Molitor D A, Kubik C, Becker M, Hetfleisch R H, Lv F, Groche P. Towards high-performance deep learning models in tool wear classification with generative adversarial networks. Journal of Materials Processing Technology, 2022, 302: Article No. 117484 doi: 10.1016/j.jmatprotec.2021.117484 [166] Dai W, Li D Y, Tang D, Wang H M, Peng Y H. Deep learning approach for defective spot welds classification using small and class-imbalanced datasets. Neurocomputing, 2022, 477: 46−60 doi: 10.1016/j.neucom.2022.01.004 [167] Sha Y H, He Z Z, Gutierrez H, Du J W, Yang W W, Lu X N. Small sample classification based on data enhancement and its application in flip chip defection. Microelectronics Reliability, 2023, 141: Article No. 114887 doi: 10.1016/j.microrel.2022.114887 [168] Koumoutsou D, Siolas G, Charou E, Stamou G. Generative adversarial networks for data augmentation in hyperspectral image classification. Generative Adversarial Learning: Architectures and Applications. Cham: Springer, 2022. 115−144 [169] Wu T Y, Wu R T, Wang P H, Lin T K, Chang K C. 
Development of a high-fidelity failure prediction system for reinforced concrete bridge columns using generative adversarial networks. Engineering Structures, 2023, 286: Article No. 116130 doi: 10.1016/j.engstruct.2023.116130 [170] Liu J H, Wang C Y, Su H, Du B, Tao D C. Multistage GAN for fabric defect detection. IEEE Transactions on Image Processing, 2020, 29: 3388−3400 doi: 10.1109/TIP.2019.2959741 [171] Wang C L, Huang G H, Huang Z Y, He W M. Conditional TransGAN-based data augmentation for PCB electronic component inspection. Computational Intelligence and Neuroscience, 2023, 2023(1): Article No. 2024237 doi: 10.1155/2023/2024237 [172] Tang S L, He F, Huang X L, Yang J. Online PCB defect detector on a new PCB defect dataset. arXiv preprint arXiv: 1902.06197, 2019. [173] Wei J, Shen F, Lv C K, Zhang Z T, Zhang F, Yang H B. Diversified and multi-class controllable industrial defect synthesis for data augmentation and transfer. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Vancouver, Canada: IEEE, 2023. 4445−4453 [174] Valvano G, Agostino A, de Magistris G, Graziano A, Veneri G. Controllable image synthesis of industrial data using stable diffusion. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2024. 5342−5351 [175] Zhu L, Chen Y S, Ghamisi P, Benediktsson J A. Generative adversarial networks for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(9): 5046−5063 doi: 10.1109/TGRS.2018.2805286 [176] Luo Z, Cheng S Y, Zheng Q Y. GAN-based augmentation for improving CNN performance of classification of defective photovoltaic module cells in electroluminescence images. IOP Conference Series: Earth and Environmental Science, 2019, 354: Article No. 012106 [177] Jain S, Seth G, Paruthi A, Soni U, Kumar G. Synthetic data augmentation for surface defect detection and classification using deep learning. 
Journal of Intelligent Manufacturing, 2022, 33(4): 1007−1020 doi: 10.1007/s10845-020-01710-x [178] Chen X, Duan Y, Houthooft R, Schulman J, Sutskever I, Abbeel P. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc., 2016. 2180−2188 [179] Zhang Y L, Wang Y L, Jiang Z Q, Liao F G, Zheng L, Tan D Z, et al. Diversifying tire-defect image generation based on generative adversarial network. IEEE Transactions on Instrumentation and Measurement, 2022, 71: Article No. 5007312 [180] Hao L, Shen P, Pan Z W, Xu Y. Multi-level semantic information guided image generation for few-shot steel surface defect classification. Frontiers in Physics, 2023, 11: Article No. 1208781 doi: 10.3389/fphy.2023.1208781 [181] Dhariwal P, Nichol A. Diffusion models beat GANs on image synthesis. arXiv preprint arXiv: 2105.05233, 2021. [182] Ho J, Salimans T. Classifier-free diffusion guidance. arXiv preprint arXiv: 2207.12598, 2022. [183] Wang Z, Yu L J, Pu L L. Defect simulation in SEM images using generative adversarial networks. In: Proceedings of the Metrology, Inspection, and Process Control for Semiconductor Manufacturing XXXV. Virtual Event: SPIE, 2021. Article No. 116110P [184] Wen L, Wang Y, Li X Y. A new cycle-consistent adversarial networks with attention mechanism for surface defect classification with small samples. IEEE Transactions on Industrial Informatics, 2022, 18(12): 8988−8998 doi: 10.1109/TII.2022.3168432 [185] Wieler M, Hahn T, Hamprecht F A.
Weakly supervised learning for industrial optical inspection [Online], available: https://zenodo.org/records/8086136, April 17, 2025. [186] Niu S L, Li B, Wang X G, Lin H. Defect image sample generation with GAN for improving defect recognition. IEEE Transactions on Automation Science and Engineering, 2020, 17(3): 1611−1622 [187] Yi C C, Chen Q R, Xu B, Huang T. Steel strip defect sample generation method based on fusible feature GAN model under few samples. Sensors, 2023, 23(6): Article No. 3216 doi: 10.3390/s23063216 [188] Varghese S, Hoskere V. Unpaired image-to-image translation of structural damage. Advanced Engineering Informatics, 2023, 56: Article No. 101940 doi: 10.1016/j.aei.2023.101940 [189] Sekar A, Perumal V. Crack image synthesis and segmentation using paired image translation. In: Proceedings of the International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI). Chennai, India: IEEE, 2022. 1−7 [190] Lian J, Jia W K, Zareapoor M, Zheng Y J, Luo R, Jain D K, et al. Deep-learning-based small surface defect detection via an exaggerated local variation-based generative adversarial network. IEEE Transactions on Industrial Informatics, 2020, 16(2): 1343−1351 doi: 10.1109/TII.2019.2945403 [191] Zhang G J, Cui K W, Hung T Y, Lu S J. Defect-GAN: High-fidelity defect synthesis for automated defect inspection. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2021. 2523−2533 [192] Wang Y, Hu W T, Wen L, Gao L. A new foreground-perception cycle-consistent adversarial network for surface defect detection with limited high-noise samples. IEEE Transactions on Industrial Informatics, 2023, 19(12): 11742−11751 doi: 10.1109/TII.2023.3252410 [193] Yang Y, Zhang Y T, Lang Y, Li B C, Guo S S, Tan Q. GAN-based radar spectrogram augmentation via diversity injection strategy. IEEE Transactions on Instrumentation and Measurement, 2023, 72: Article No. 
2502512 [194] Alam L, Kehtarnavaz N. Generating defective epoxy drop images for die attachment in integrated circuit manufacturing via enhanced loss function CycleGAN. Sensors, 2023, 23(10): Article No. 4864 doi: 10.3390/s23104864 [195] Sampath V, Maurtua I, Aguilar Martín J J, Iriondo A, Lluvia I, Aizpurua G. Intraclass image augmentation for defect detection using generative adversarial neural networks. Sensors, 2023, 23(4): Article No. 1861 doi: 10.3390/s23041861 [196] Wang R Y, Hoppe S, Monari E, Huber M F. Defect transfer GAN: Diverse defect synthesis for data augmentation. In: Proceedings of the 33rd British Machine Vision Conference. London, UK: BMVC, 2022. Article No. 445 [197] Luo Yue-Tong, Duan Chang, Jiang Pei-Feng, Zhou Bo. An improved industrial defect data augmentation method based on Pix2pix. Computer Engineering & Science, 2022, 44(12): 2206−2212 doi: 10.3969/j.issn.1007-130X.2022.12.014 [198] Cui Ke-Bin, Pan Feng. A CycleGAN small sample library amplification method for faulty insulator detection. Computer Engineering & Science, 2022, 44(3): 509−515 doi: 10.3969/j.issn.1007-130X.2022.03.017 [199] Chen W X, Li Y N, Zhao Z G. Transmission line vibration damper detection using multi-granularity conditional generative adversarial nets based on UAV inspection images. Sensors, 2022, 22(5): Article No. 1886 doi: 10.3390/s22051886 [200] Chen K, Li H T, Li C S, Zhao X Y, Wu S J, Duan Y X, et al. An automatic defect detection system for petrochemical pipeline based on Cycle-GAN and YOLO v5. Sensors, 2022, 22(20): Article No. 7907 doi: 10.3390/s22207907 [201] Xia Y W, Han S W, Kwon H J. Image generation and recognition for railway surface defect detection. Sensors, 2023, 23(10): Article No.
4793 doi: 10.3390/s23104793 [202] Tsai D M, Fan S K S, Chou Y H. Auto-annotated deep segmentation for surface defect detection. IEEE Transactions on Instrumentation and Measurement, 2021, 70: Article No. 5011410 [203] Su B Y, Zhou Z, Chen H Y, Cao X C. SIGAN: A novel image generation method for solar cell defect segmentation and augmentation. arXiv preprint arXiv: 2104.04953, 2021. [204] Lv N, Ma H X, Chen C, Pei Q Q, Zhou Y, Xiao F L, et al. Remote sensing data augmentation through adversarial training. In: Proceedings of the IEEE International Geoscience and Remote Sensing Symposium. Waikoloa, USA: IEEE, 2020. 2511−2514 [205] Yang B Y, Liu Z Y, Duan G F, Tan J R. Mask2Defect: A prior knowledge-based data augmentation method for metal surface defect inspection. IEEE Transactions on Industrial Informatics, 2022, 18(10): 6743−6755 doi: 10.1109/TII.2021.3126098 [206] Xu X, Zhao B B, Tong X H, Xie H, Feng Y J, Wang C, et al. A data augmentation strategy combining a modified Pix2pix model and the copy-paste operator for solid waste detection with remote sensing images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2022, 15: 8484−8491 doi: 10.1109/JSTARS.2022.3209967 [207] Liu R K, Liu W M, Zheng Z X, Wang L, Mao L, Qiu Q S, et al. Anomaly-GAN: A data augmentation method for train surface anomaly detection. Expert Systems With Applications, 2023, 228: Article No. 120284 doi: 10.1016/j.eswa.2023.120284 [208] Zhao C Y, Xue W, Fu W P, Li Z Q, Fang X M. Defect sample image generation method based on GANs in diamond tool defect detection. IEEE Transactions on Instrumentation and Measurement, 2023, 72: Article No. 2519009 [209] Li S Y, Zhao X F. High-resolution concrete damage image synthesis using conditional generative adversarial network. Automation in Construction, 2023, 147: Article No. 104739 doi: 10.1016/j.autcon.2022.104739 [210] Li H X, Zhang Z X, Chen H, Wu L, Li B, Liu D Y, et al. 
A novel approach to industrial defect generation through blended latent diffusion model with online adaptation. arXiv preprint arXiv: 2402.19330, 2024. [211] Božič J, Tabernik D, Skočaj D. Mixed supervision for surface-defect detection: From weakly to fully supervised learning. Computers in Industry, 2021, 129: Article No. 103459 doi: 10.1016/j.compind.2021.103459 [212] Bergmann P, Fauser M, Sattlegger D, Steger C. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 4182−4191 [213] Jeon Y, Kim H, Lee H, Jo S, Kim J. GAN-based defect image generation for imbalanced defect classification of OLED panels. In: Proceedings of the 33rd Eurographics Symposium on Rendering. Prague, Czech Republic: EGSR, 2022. 145−150 [214] Kim M, Jo H, Ra M, Kim W Y. Weakly-supervised defect segmentation on periodic textures using CycleGAN. IEEE Access, 2020, 8: 176202−176216 doi: 10.1109/ACCESS.2020.3024554 [215] TILDA textile texture-database [Online], available: http://lmb.informatik.uni-freiburg.de/resources/datasets/tilda, July 23, 2020. [216] Niu S L, Li B, Wang X G, Peng Y R. Region- and strength-controllable GAN for defect generation and segmentation in industrial images. IEEE Transactions on Industrial Informatics, 2022, 18(7): 4531−4541 doi: 10.1109/TII.2021.3127188 [217] Mertes S, Margraf A, Geinitz S, André E. Alternative data augmentation for industrial monitoring using adversarial learning. arXiv preprint arXiv: 2205.04222, 2022.Mertes S, Margraf A, Geinitz S, André E. Alternative data augmentation for industrial monitoring using adversarial learning. arXiv preprint arXiv: 2205.04222, 2022. [218] Jin T, Ye X W, Li Z X. Establishment and evaluation of conditional GAN-based image dataset for semantic segmentation of structural cracks. Engineering Structures, 2023, 285: Article No. 
116058 doi: 10.1016/j.engstruct.2023.116058 [219] Kim J H, Hwang Y. GAN-based synthetic data augmentation for infrared small target detection. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: Article No. 5002512 [220] Wei J, Zhang Z T, Shen F, Lv C K. Mask-guided generation method for industrial defect images with non-uniform structures. Machines, 2022, 10(12): Article No. 1239 doi: 10.3390/machines10121239 [221] Hu T, Zhang J N, Yi R, Du Y Z, Chen X, Liu L, et al. AnomalyDiffusion: Few-shot anomaly image generation with diffusion model. In: Proceedings of the 38th AAAI Conference on Artificial Intelligence. Vancouver, Canada: AAAI, 2024. 8526−8534 [222] Huang G, Liu Z, van der Maaten L, Weinberger K. Densely connected convolutional networks [Online], available: https://github.com/liuzhuang13/DenseNet, September 11, 2021 [223] Park T, Liu M Y, Wang T C, Zhu J Y. Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 2332−2341 [224] Oksuz K, Cam B C, Kalkan S, Akbas E. Imbalance problems in object detection: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3388−3415 doi: 10.1109/TPAMI.2020.2981890 [225] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 770−778 [226] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: Proceedings of the 18th International Conference on Medical Image Computing and Computer-assisted Intervention. Munich, Germany: Springer, 2015. 234−241 [227] Ojala T, Pietikäinen M, Harwood D. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. 
In: Proceedings of the 12th International Conference on Pattern Recognition. Jerusalem, Israel: IEEE, 1994. 582−585 [228] He D C, Wang L. Texture unit, texture spectrum and texture analysis. In: Proceedings of the 12th Canadian Symposium on Remote Sensing Geoscience and Remote Sensing Symposium. Vancouver, Canada: IEEE, 1989. 2769−2772 [229] DeVries T, Taylor G W. Dataset augmentation in feature space. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017. 1−12 [230] Zhang M, Levine S, Finn C. MEMO: Test time robustness via adaptation and augmentation. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2022. Article No. 2799 [231] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 26th Neural Information Processing Systems. Lake Tahoe, USA: Curran Associates Inc., 2012. 1097−1105 [232] Kim I, Kim Y, Kim S. Learning loss for test-time augmentation. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2020. Article No. 350 [233] Wang G T, Li W Q, Aertsen M, Deprest J, Ourselin S, Vercauteren T. Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks. Neurocomputing, 2019, 338: 34−45 doi: 10.1016/j.neucom.2019.01.103 [234] Saharia C, Chan W, Saxena S, Li L, Whang J, Denton E, et al. Photorealistic text-to-image diffusion models with deep language understanding. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2022. Article No. 2643 [235] Ruiz N, Li Y Z, Jampani V, Pritch Y, Rubinstein M, Aberman K. Dreambooth: Fine tuning text-to-image diffusion models for subject-driven generation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023. 22500−22510 [236] Hu J, Huang Y W, Lu Y L, Xie G Y, Jiang G N, Zheng Y F, et al. AnomalyXFusion: Multi-modal anomaly synthesis with diffusion. arXiv preprint arXiv: 2404.19444, 2024. [237] Chen X, Luo X Z, Weng J, Luo W Q, Li H T, Tian Q. Multi-view gait image generation for cross-view gait recognition. IEEE Transactions on Image Processing, 2021, 30: 3041−3055 doi: 10.1109/TIP.2021.3055936 [238] Wang N Y, Zhang Y D, Li Z W, Fu Y W, Yu H, Liu W, et al. Pixel2Mesh: 3D mesh model generation via image guided deformation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3600−3613 doi: 10.1109/TPAMI.2020.2984232 [239] Voleti V, Yao C H, Boss M, Letts A, Pankratz D, Tochilkin D, et al. SV3D: Novel multi-view synthesis and 3D generation from a single image using latent video diffusion. In: Proceedings of the 18th European Conference on Computer Vision. Milan, Italy: Springer, 2024. 439−457 [240] Huang X, Liu M Y, Belongie S, Kautz J. Multimodal unsupervised image-to-image translation. In: Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer, 2018. 179−196 [241] Zhu J Y, Zhang R, Pathak D, Darrell T, Efros A A, Wang O, et al. Toward multimodal image-to-image translation. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 465−476 [242] Ye J R, Ni H M, Jin P, Huang S X, Xue Y. Synthetic augmentation with large-scale unconditional pre-training. In: Proceedings of the 26th International Conference on Medical Image Computing and Computer Assisted Intervention. Vancouver, Canada: Springer, 2023. 754−764 -