Image Description Generation Model Based on RPR Transformer
Image description generation combines computer vision and natural language processing, aiming to provide accurate descriptions for images. The attention mechanism ignores the two-dimensional spatial characteristics of images. This article proposes a self-attention model (RPR Transformer) based on the relative position relationships between objects. The model first extracts object features through object detection and calculates the center position and area of each corresponding object; it then uses a relational feature extraction model to extract the correlation features between objects in the image; finally, it filters the fused features with gating units to remove interfering information. The experimental results indicate that the model is strongly robust.
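
The three steps named above (box geometry, pairwise relational features, gated filtering) can be illustrated with a minimal NumPy sketch. It is not the formulation used in this article: the function names, the choice of center offsets plus log area ratio as the relative-position encoding, and the sigmoid gate over concatenated features are all assumptions made for illustration.

```python
import numpy as np


def box_geometry(boxes):
    """Return centers (n, 2) and areas (n,) for boxes given as [x1, y1, x2, y2]."""
    boxes = np.asarray(boxes, dtype=float)
    centers = np.stack(
        [(boxes[:, 0] + boxes[:, 2]) / 2.0, (boxes[:, 1] + boxes[:, 3]) / 2.0],
        axis=1,
    )
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return centers, areas


def relative_position_features(boxes):
    """Hypothetical pairwise encoding of shape (n, n, 3).

    Entry [i, j] holds (dx, dy, log(area_j / area_i)), where (dx, dy) is the
    offset from the center of object i to the center of object j.
    Assumes every box has strictly positive area.
    """
    centers, areas = box_geometry(boxes)
    offsets = centers[None, :, :] - centers[:, None, :]
    log_area_ratio = np.log(areas[None, :] / areas[:, None])
    return np.concatenate([offsets, log_area_ratio[..., None]], axis=-1)


def gated_fusion(obj_feats, rel_feats, w_gate, b_gate):
    """Assumed gating unit: an element-wise sigmoid gate on the fused features.

    obj_feats: (n, d_obj), rel_feats: (n, d_rel),
    w_gate: (d_obj + d_rel, d_obj + d_rel), b_gate: (d_obj + d_rel,).
    """
    fused = np.concatenate([obj_feats, rel_feats], axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(fused @ w_gate + b_gate)))
    return gate * fused  # gate values near 0 suppress a feature dimension
```

In a trained model the gate weights would be learned, so that feature dimensions carrying interfering information are scaled toward zero; here they are plain arrays supplied by the caller.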