Multi-modal hierarchical classification for power equipment defect detection
Objective Condition detection and fault maintenance of power equipment are basic prerequisites for the safe and normal operation of power systems. With the growing diversity and complexity of defects in substations, defect recognition is increasingly required to handle multi-label classification tasks over a large number of closely related defect labels. However, most existing approaches to power equipment defect detection handle multi-label defect detection poorly, because the defect category labels often differ in the granularity of their semantic concepts and are closely related to each other. As a result, existing methods have difficulty meeting the requirements of multi-label classification-based defect detection for power equipment. To address these problems, this paper proposes a multi-modal hierarchical classification method for power equipment defect detection that is suitable for complex power equipment environments.

Method We propose a multi-modal hierarchical classification method that fuses the feature information of defect images, hierarchical structure information, and the semantic information of category labels. First, defect images of power equipment are collected from multiple substations and preprocessed with manual annotation, data augmentation, and normalization to construct a power equipment defect image dataset with a hierarchical label structure. Then, a hierarchical classification model based on multi-modal feature fusion and hierarchical fine-tuning is proposed. It uses the ResNet50 network to extract features from images and a region proposal network to locate objects and predict foreground and background. The region of interest align (ROI Align) method is then used to compute the position coordinates as continuous values, which avoids the errors introduced by quantizing the coordinates generated by the region proposal network. Finally, the hierarchical structure of the power equipment to be detected is used to embed the parent category labels into the object feature representation of the current layer, so that defects are classified layer by layer. The final defect detection result is obtained at the last layer.

Result Comparative experiments are conducted on a real-world power equipment defect dataset and on the PASCAL VOC2012 benchmark dataset, against current multi-label classification-based power equipment defect detection methods and widely used object detection algorithms. Experimental results show that the proposed method achieves the best detection accuracy for most equipment defect categories, with a mean average precision of 86.4%. Compared with the second-best performing model, the accuracy improves by 5.1%, and the mean average precision on the benchmark dataset increases by 1.1% to 3%. The proposed method also runs in a relatively shorter time than the compared methods.

Conclusion Our method achieves higher detection accuracy than the compared methods while maintaining a lower computational cost. By fully utilizing the semantic relationships between equipment defect labels, the hierarchical classification model based on multi-modal feature fusion improves the accuracy of power equipment defect detection.
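
The layer-by-layer classification step described in the Method can be illustrated with a minimal sketch. It is not the authors' implementation. The label hierarchy, the feature and embedding sizes, the random (untrained) weights, and the use of simple concatenation to fuse the parent label embedding with the region feature are all assumptions made for illustration. In the proposed method the region feature would come from ResNet50 and ROI Align; here it is a hand-written vector.

```python
import math
import random

# Hypothetical two-level label hierarchy (illustrative names, not the paper's dataset).
HIERARCHY = {
    "root": ["insulator", "transformer"],
    "insulator": ["insulator_crack", "insulator_contamination"],
    "transformer": ["transformer_oil_leak", "transformer_rust"],
}

FEATURE_DIM = 4  # size of the pooled region feature (assumed)
EMBED_DIM = 3    # size of each label embedding (assumed)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(seed=0):
    """Create random label embeddings and per-label scoring weights (untrained)."""
    rng = random.Random(seed)
    labels = ["root"] + [c for children in HIERARCHY.values() for c in children]
    embed = {
        label: [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]
        for label in labels
    }
    weights = {
        label: [rng.uniform(-1.0, 1.0) for _ in range(FEATURE_DIM + EMBED_DIM)]
        for label in labels
        if label != "root"
    }
    return embed, weights


def classify(feature, embed, weights):
    """Walk the hierarchy top-down, fusing the parent embedding at each layer."""
    path = []
    parent = "root"
    while parent in HIERARCHY:
        # Assumed fusion: concatenate the region feature with the parent embedding.
        fused = list(feature) + embed[parent]
        children = HIERARCHY[parent]
        # Score only the children of the current parent label.
        scores = [
            sum(w * x for w, x in zip(weights[child], fused))
            for child in children
        ]
        probs = softmax(scores)
        best = max(range(len(children)), key=probs.__getitem__)
        path.append((children[best], probs[best]))
        parent = children[best]
    return path


if __name__ == "__main__":
    embed, weights = init_params(seed=0)
    region_feature = [0.2, -0.1, 0.5, 0.3]  # stand-in for an ROI Align feature
    for label, prob in classify(region_feature, embed, weights):
        print(f"{label}: {prob:.3f}")
```

Because each layer scores only the children of the label chosen at the layer above, the prediction at the last layer is always consistent with the hierarchy. With trained weights, the parent embedding would let the fine-grained classifier condition on the coarse category rather than on the image feature alone.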