Person Re-Identification Network with Fine-Grained Local Semantics and Attribute Learning
Person re-identification (Person Re-ID) aims to match the same pedestrian across multiple non-overlapping camera views. Because of its importance in practical applications such as video surveillance and content-based image retrieval, it has attracted extensive attention in recent years. The task nevertheless remains challenging: pedestrian pose, illumination, and background vary considerably across views, different pedestrians often wear similar clothing, and detection bounding boxes are frequently inaccurate. Personal belongings (e.g., backpacks and handbags) are usually ignored by semantic parsing models because they are not body parts, yet they provide crucial contextual cues for re-identification. Attribute descriptions, such as gender and the type and color of upper-body clothing, are likewise discriminative and can effectively improve Person Re-ID performance.

To address the problems that current semantic methods cannot effectively extract potential personal-belonging information and that clustering methods are too coarse to fully exploit the attribute information of local semantic features, this paper proposes a person re-identification algorithm based on fine-grained local semantics and attribute learning, which extracts personal-belonging regions and derives attribute descriptions from semantic regions. The proposed method consists of several key modules. First, the Fine-grained Local Semantics (FLS) module integrates the personal-belonging regions generated by feature clustering into the parsing results of an auxiliary semantic model, compensating for the belongings that many parsing models miss and yielding smoother, more complete semantic regions. Second, the Attribute Learning Module (ALM) uses the fused semantic regions as body-part labels, lets the network construct semantic feature mappings of these regions from the global features, and then predicts the associated attributes from the semantic features to capture detailed, contextually relevant information about the pedestrian. Third, since certain pedestrian attributes are strongly correlated (e.g., female and long hair), the Attribute Weighted Module (AWM) is constructed to raise the confidence scores of such attributes and improve attribute prediction accuracy. The model then combines the attribute predictions with the global pedestrian features to form a robust feature representation. In addition, high-confidence attributes are used to filter irrelevant pedestrian images out of the gallery before similarity ranking, which speeds up retrieval.

To evaluate the proposed model, experiments were conducted on two public datasets widely used for person re-identification, Market-1501 and DukeMTMC-reID, together with their attribute annotations. Compared with the baseline network, the proposed algorithm achieves mAP gains of 3.6% and 6.4%, as well as gains of 1.1% and 5.3%, on the two datasets respectively, indicating that the proposed model improves person re-identification performance. Visual analyses of attribute prediction results and similarity rankings were also performed to verify that the model accurately predicts pedestrian attributes and uses them to improve matching accuracy and efficiency.
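
The fusion idea behind the FLS module can be illustrated with a minimal sketch: body-part masks from an off-the-shelf parsing model are merged with regions obtained by clustering per-pixel appearance features, so that backpacks or handbags the parser leaves as background still receive their own semantic label. The tensor shapes, the k-means clustering choice, and the overlap thresholds below are illustrative assumptions, not the paper's exact implementation.

```python
# Sketch only: fuse parser output with cluster-derived "belonging" regions.
import numpy as np
from sklearn.cluster import KMeans


def fuse_semantic_regions(part_labels: np.ndarray,
                          feat_map: np.ndarray,
                          n_clusters: int = 4,
                          belonging_label: int = 99) -> np.ndarray:
    """part_labels: (H, W) int map from a parsing model, 0 = background.
    feat_map: (H, W, C) per-pixel appearance features.
    Returns a label map where pixels the parser called background, but whose
    appearance cluster overlaps the body, are marked as personal belongings."""
    h, w, c = feat_map.shape
    clusters = KMeans(n_clusters=n_clusters, n_init=10)\
        .fit_predict(feat_map.reshape(-1, c)).reshape(h, w)

    fused = part_labels.copy()
    body = part_labels > 0
    for k in range(n_clusters):
        mask = clusters == k
        overlap = (mask & body).sum() / max(mask.sum(), 1)
        # A cluster that partly touches the body but spills into "background"
        # is assumed (here) to correspond to a carried item.
        if 0.2 < overlap < 0.9:
            fused[mask & ~body] = belonging_label
    return fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    parts = rng.integers(0, 5, size=(64, 32))          # fake parsing output
    feats = rng.normal(size=(64, 32, 8)).astype(np.float32)
    print(np.unique(fuse_semantic_regions(parts, feats)))
```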
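The attribute-learning and attribute-weighting steps can be sketched in a similar hedged fashion: region-pooled descriptors are taken from the global feature map using the fused semantic masks, each region feeds a small attribute head, and the resulting scores are re-weighted by an attribute co-occurrence matrix so that correlated attributes (e.g., female and long hair) reinforce each other. The layer sizes, region and attribute counts, and the learnable correlation matrix are placeholders rather than the paper's configuration.

```python
# Sketch only: region-wise attribute prediction with correlation re-weighting.
import torch
import torch.nn as nn


class AttributeLearning(nn.Module):
    def __init__(self, feat_dim=2048, n_regions=6, n_attrs=12):
        super().__init__()
        self.attr_heads = nn.ModuleList(
            [nn.Linear(feat_dim, n_attrs) for _ in range(n_regions)])
        # AWM stand-in: a learnable attribute-attribute matrix, initialised
        # to identity, letting correlated attributes boost each other.
        self.corr = nn.Parameter(torch.eye(n_attrs))

    def forward(self, feat_map, region_masks):
        """feat_map: (B, C, H, W) global feature map.
        region_masks: (B, R, H, W) soft masks of the fused semantic regions."""
        masks = region_masks / (region_masks.sum(dim=(2, 3), keepdim=True) + 1e-6)
        # Masked average pooling -> one descriptor per semantic region.
        region_feats = torch.einsum("bchw,brhw->brc", feat_map, masks)
        logits = torch.stack(
            [head(region_feats[:, i]) for i, head in enumerate(self.attr_heads)],
            dim=1).mean(dim=1)                          # (B, n_attrs)
        # Re-weighting: propagate confidence between correlated attributes.
        weighted = torch.sigmoid(logits) @ self.corr
        return logits, weighted


if __name__ == "__main__":
    m = AttributeLearning()
    raw, weighted = m(torch.randn(2, 2048, 24, 8), torch.rand(2, 6, 24, 8))
    print(raw.shape, weighted.shape)
```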
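Finally, the retrieval-time filtering step described in the abstract can be illustrated as follows: query attributes predicted with high confidence are used to discard gallery images whose predicted attributes contradict them before the feature-distance ranking is computed. The confidence threshold and the cosine-distance choice are assumptions for illustration.

```python
# Sketch only: attribute-based gallery filtering before similarity ranking.
import numpy as np


def rank_with_attribute_filter(q_feat, q_attr, g_feats, g_attrs,
                               conf_thresh=0.9):
    """q_feat: (D,) query feature, q_attr: (A,) query attribute probabilities.
    g_feats: (N, D), g_attrs: (N, A) gallery features / attribute probabilities.
    Returns gallery indices sorted by cosine distance, restricted to images
    that agree with the query's high-confidence attributes."""
    confident = (q_attr > conf_thresh) | (q_attr < 1 - conf_thresh)
    q_bin, g_bin = q_attr > 0.5, g_attrs > 0.5
    # Keep a gallery image only if it matches every confident query attribute.
    keep = np.all(g_bin[:, confident] == q_bin[confident], axis=1)
    idx = np.nonzero(keep)[0]
    if idx.size == 0:                       # fall back to the full gallery
        idx = np.arange(len(g_feats))

    q = q_feat / (np.linalg.norm(q_feat) + 1e-12)
    g = g_feats[idx] / (np.linalg.norm(g_feats[idx], axis=1, keepdims=True) + 1e-12)
    dist = 1.0 - g @ q
    return idx[np.argsort(dist)]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    order = rank_with_attribute_filter(
        rng.normal(size=128), rng.random(12),
        rng.normal(size=(100, 128)), rng.random((100, 12)))
    print(order[:5])
```

Because the attribute check only involves boolean comparisons, it prunes the gallery much more cheaply than computing feature distances for every image, which is the speed-up the abstract refers to.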