
Defect detection method for industrial product surfaces with similar features by combining frequency and ViT

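Objective Surface defect detection is an important step in ensuring the quality of industrial products. To address the high similarity between surface defects and the background, and the similarity among the features of different surface defects, a differential detection network, YOLO-Differ (you only look once-difference), is proposed. Method The network builds on YOLOv5 (you only look once version 5). It uses the discrete cosine transform and a self-attention mechanism to extract and enhance frequency features, and fuses these frequency features to increase the separation between defect and background features. Because the fusion can suffer from misalignment, an adaptive feature fusion module is designed to align and fuse the RGB features and the frequency features. In addition, a fine-grained classification branch is added after the detection module of the network, with a vision Transformer (ViT) serving as the corrective classifier in this branch. The branch focuses on extracting and recognizing small differences between defect features, to cope with the subtle differences among defect types. Result In experiments on three datasets against seven object detection models, YOLO-Differ achieved the best results throughout. Compared with the other models, mean average precision (mAP) improved by more than 3.6%, 2.4%, and 0.4%, respectively. Conclusion Compared with similar models, YOLO-Differ offers higher detection accuracy and stronger generality.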
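The frequency branch described above starts from a discrete cosine transform. The sketch below is only an illustration of that first step, not the authors' implementation: it assumes the full-range JPEG (BT.601) RGB-to-YCbCr conversion and non-overlapping 8x8 blocks, neither of which is stated in the abstract, and the function names `rgb_to_ycbcr`, `dct2`, and `blockwise_dct` are hypothetical. The self-attention-based frequency enhancement and the adaptive fusion module are not covered.

```python
import math


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: JPEG/BT.601 coefficients; the abstract does not say
    which YCbCr variant the paper uses.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def alpha(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out


def blockwise_dct(channel, block_size=8):
    """Apply dct2 to each non-overlapping block of a 2D channel.

    Assumption: height and width are multiples of block_size.
    Returns a dict keyed by (block_row, block_col).
    """
    h, w = len(channel), len(channel[0])
    coeffs = {}
    for i in range(0, h, block_size):
        for j in range(0, w, block_size):
            block = [row[j:j + block_size] for row in channel[i:i + block_size]]
            coeffs[(i // block_size, j // block_size)] = dct2(block)
    return coeffs
```

In practice a vectorized DCT from a numerical library would replace the quadruple loop; the explicit form is used here only to keep the transform visible.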
Defect detection method for industrial product surfaces with similar features by combining frequency and ViT
Objective In industrial production, the complex environment of manufacturing and production processes makes surface defects on products difficult to avoid. These defects not only destroy the integrity of the products but also affect their quality, posing potential threats to the health and safety of individuals. Defect detection on the surface of industrial products is therefore a part of production that cannot be ignored. In defect detection tasks, the targets must be accurately classified to determine whether they should be subjected to recycling treatment. At the same time, the detection results must be presented as bounding boxes to help enterprises analyze the causes of defects and improve the production process. The traditional approach to surface defect detection is manual inspection, which in practice has considerable limitations. In recent years, computer performance has improved by leaps and bounds, and traditional machine vision technology has been widely tested in various production fields. These methods rely on image processing and feature engineering, and in specific scenarios they can approach the level of manual inspection, allowing machines to replace some manual labor. Their shortcoming is the difficulty of extracting features from complex backgrounds, which often results in inaccurate detection; as a result, they can hardly be reused for inspecting other types of workpieces. Deep learning has played an increasingly important role in computer vision in recent years. Deep learning-based defect detection methods learn features from numerous defect samples and use them to classify and localize defects. With high detection accuracy and applicability, they have addressed the complexity and uncertainty of manual feature extraction in traditional image processing and achieved remarkable results in industrial product surface defect detection. However, given the complex background of some industrial product surfaces, the high similarity between some surface defects and the background, and the small differences between different defects, existing methods can hardly detect surface defects accurately. In this study, we propose a differential detection network (YOLO-Differ) based on YOLOv5.

Method First, for defects that resemble background features on the product surface, we draw on studies in biology and psychology showing that predators use perceptual filters bound to specific features to separate target animals from the background during predation. In other words, they capture camouflaged targets by using frequency-domain features. The frequency signal strength of the target is lower than that of the background, and this difference helps locate targets that are similar to the background. Therefore, a novel method is proposed for the first time to integrate frequency cues into the object detection network, addressing the inaccurate localization caused by defects that resemble the background and enhancing the distinguishability between defects and the background. Second, a fine-grained classification branch is added after the detection module of the network to address the small differences in features among defect types. The vision Transformer (ViT) classification network is used as the corrective classifier in this branch to extract subtle distinguishing features of defects. Specifically, it divides the defect image into N blocks small enough for its inherent attention mechanism to capture important regions in the image. At the same time, the Transformer models global relationships among the patches and assigns each patch an importance that reflects its influence on the classification result. This long-range relationship modeling and importance weighting enable it to locate subtle differences in features and focus on the important features of defects. Accordingly, YOLO-Differ is divided into five parts: RGB feature extraction, frequency feature extraction, feature fusion, detection head, and fine-grained classification. First, RGB feature extraction, which consists of the backbone network and neck, extracts the basic RGB feature information and fuses RGB features of different scales to obtain improved detection results. Next, RGB images are converted to the YCbCr color space, and the results are processed through discrete cosine transform (DCT) and frequency enhancement to obtain frequency features. The feature fusion module aligns and fuses the RGB features with the frequency features. The fused features are then fed into the detection head to obtain defect localization information and preliminary classification results. Finally, the defect images are cropped according to the location information and fed into the fine-grained classifier for secondary classification to obtain the final classification results.

Result In the experiments, YOLO-Differ was compared with seven object detection models on three datasets and consistently achieved the best results. Compared with the current state-of-the-art models, the mean average precision (mAP) improved by 3.6%, 2.4%, and 0.4% on the respective datasets.

Conclusion Compared with similar models, the YOLO-Differ model exhibits higher detection accuracy and stronger generality.
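The hand-off from the detection head to the fine-grained branch (crop each detected defect, then split the crop into patches for the ViT) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the paper's code: boxes are taken as `(x1, y1, x2, y2)` pixel coordinates with exclusive upper bounds, images are plain lists of rows, and the helper names `crop` and `split_into_patches` are hypothetical. Resizing the crop to a fixed input size, patch embedding, and the Transformer itself are omitted.

```python
def crop(image, box):
    """Crop a 2D image (list of rows) to box = (x1, y1, x2, y2).

    Assumption: x2 and y2 are exclusive; coordinates outside the image
    are clamped to its bounds.
    """
    h, w = len(image), len(image[0])
    x1, y1, x2, y2 = box
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(w, x2), min(h, y2)
    return [row[x1:x2] for row in image[y1:y2]]


def split_into_patches(image, patch):
    """Split an image into non-overlapping patch x patch tiles.

    Tiles are returned in row-major order, each flattened to a list.
    Rows and columns that do not fill a whole tile are dropped.
    """
    if not image or not image[0]:
        return []
    h, w = len(image), len(image[0])
    patches = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            tile = [image[i + di][j + dj]
                    for di in range(patch)
                    for dj in range(patch)]
            patches.append(tile)
    return patches
```

For example, cropping a detection and calling `split_into_patches(crop(image, box), 16)` yields the sequence of flattened tiles that a ViT-style classifier would embed and attend over.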

surface defect detection; similarity; frequency features; fine-grained classification; generality

Wang Suqin, Cheng Cheng, Shi Min, Zhu Dengming


School of Control and Computer Engineering, North China Electric Power University, Beijing 102206

Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190

Taicang Zhongke Institute of Information Technology (太仓中科信息技术研究院), Taicang 215400


National Key Research and Development Program of China (2020YFB1710400); National Natural Science Foundation of China (61972379)

2024

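Journal of Image and Graphics (中国图象图形学报)
Institute of Remote Sensing Applications, Chinese Academy of Sciences; China Society of Image and Graphics; Beijing Institute of Applied Physics and Computational Mathematics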


CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.111
ISSN:1006-8961
Year, volume (issue): 2024, 29(10)