Unsupervised Video Person Re-identification Based on Multiple Kernel Dilated Convolution
Person re-identification aims to identify specific individuals across surveillance cameras, overcoming challenges such as pose variations, occlusions, and background noise that often lead to insufficient feature extraction. This paper proposes a novel unsupervised video-based person re-identification method that utilizes multiple kernel dilated convolution to provide a more comprehensive and accurate representation of individual differences and features. We first employ a pre-trained ResNet50 as the encoder. To enhance its feature extraction capability, we introduce a multiple kernel dilated convolution module: enlarging the receptive field of the convolutional kernels allows the network to capture both local and global feature information more effectively, yielding a more comprehensive depiction of a person's appearance. A decoder then restores high-level semantic information to a more fundamental feature representation, strengthening the learned features and improving performance under complex imaging conditions. Finally, a multi-scale feature fusion module is applied to the decoder output to merge features from adjacent layers, reducing the semantic gaps between feature layers and generating more robust representations. Experiments on three mainstream datasets show that the proposed method achieves significant improvements in both accuracy and robustness.
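The core idea of the multiple kernel dilated convolution module, namely that inserting gaps into a kernel enlarges its receptive field without adding parameters, and that parallel branches with different dilation rates capture local and global context simultaneously, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the single-channel naive convolution, the function names, and the crop-and-sum fusion of the branches are all assumptions made for clarity.

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation):
    """Naive single-channel 2D dilated convolution with 'valid' padding.

    A dilation of d samples the input at every d-th position under the
    kernel, so a k x k kernel covers an effective ((k-1)*d + 1)-sized
    window: a larger receptive field with the same parameter count.
    """
    kh, kw = kernel.shape
    eh = (kh - 1) * dilation + 1  # effective (dilated) kernel height
    ew = (kw - 1) * dilation + 1  # effective (dilated) kernel width
    H, W = x.shape
    out = np.zeros((H - eh + 1, W - ew + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # strided slice picks the dilated sampling positions
            patch = x[i:i + eh:dilation, j:j + ew:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

def multi_kernel_dilated_block(x, kernels, dilations):
    """Run parallel dilated branches, crop to the smallest output map,
    and sum them (one simple fusion choice among several possible)."""
    outs = [dilated_conv2d(x, k, d) for k, d in zip(kernels, dilations)]
    h = min(o.shape[0] for o in outs)
    w = min(o.shape[1] for o in outs)
    return sum(o[:h, :w] for o in outs)
```

For example, on a 5x5 input a 3x3 kernel with dilation 1 sees a 3x3 window, while the same kernel with dilation 2 sees the whole 5x5 input, which is the receptive-field enlargement the abstract refers to. In a real network these branches would be learned `Conv2d` layers with padding chosen so the branch outputs align without cropping.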
person re-identification, multiple kernel dilated convolution, unsupervised learning, feature extraction, attention mechanism