Image Description Generation Model Based on RPR-Transformer

Image description generation combines computer vision and natural language processing, aiming to provide accurate descriptions for images. The attention mechanism ignores the two-dimensional spatial characteristics of images. This article proposes a self-attention model based on the relative position relationships between objects (RPR-Transformer). Object detection is used to extract object features and to compute the center position and area of each detected object; a relation feature extraction model then extracts the relational features between objects in the image; finally, a gating unit filters the fused features to remove interfering information. The experimental results indicate that the model is strongly robust.
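The abstract names three steps (box centers and areas, relation features between objects, gated filtering of the fused features) without giving formulas. The NumPy sketch below is one hypothetical way these steps could fit together, not the authors' implementation: the function names, the distance-based attention bias, the normalisation by the square root of the box area, and the gate weight matrix `w_gate` are all assumptions introduced for illustration.

```python
import numpy as np

def box_centers_and_areas(boxes):
    """boxes: (N, 4) array of [x1, y1, x2, y2]; returns centers (N, 2) and areas (N,)."""
    boxes = np.asarray(boxes, dtype=float)
    cx = (boxes[:, 0] + boxes[:, 2]) / 2.0
    cy = (boxes[:, 1] + boxes[:, 3]) / 2.0
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return np.stack([cx, cy], axis=1), areas

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def relation_attention(features, boxes):
    """Self-attention over object features with a relative-position bias.

    Assumption (not from the paper): the bias is the negative center distance,
    scaled by the square root of the query box area, so nearby objects attend
    to each other more strongly. Boxes must have positive area.
    """
    centers, areas = box_centers_and_areas(boxes)
    d = features.shape[1]
    diff = centers[:, None, :] - centers[None, :, :]            # (N, N, 2)
    dist = np.linalg.norm(diff, axis=-1) / np.sqrt(areas)[:, None]
    scores = features @ features.T / np.sqrt(d) - dist           # (N, N)
    return softmax(scores, axis=-1) @ features                   # (N, d)

def gated_fusion(features, relation, w_gate):
    """Sigmoid gate over the concatenated features; w_gate has shape (2d, d)."""
    fused = np.concatenate([features, relation], axis=1)         # (N, 2d)
    gate = 1.0 / (1.0 + np.exp(-(fused @ w_gate)))               # (N, d), values in (0, 1)
    return gate * relation + (1.0 - gate) * features

# Toy usage with random stand-ins for detector features and gate weights.
rng = np.random.default_rng(0)
boxes = np.array([[0, 0, 2, 2], [1, 1, 5, 3], [4, 0, 6, 4]])
feats = rng.normal(size=(3, 8))
w_gate = rng.normal(size=(16, 8))
relation = relation_attention(feats, boxes)
out = gated_fusion(feats, relation, w_gate)
```

In a trained model the features would come from an object detector and the gate weights would be learned; here both are random placeholders, so only the shapes and the data flow are meaningful.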

Image Caption; Relation Feature Extraction; Attention Mechanism

Zhao Yun (赵芸)

Shanghai Baosight Software Co., Ltd., Shanghai 200000, China

2024

Changjiang Information & Communications (长江信息通信)
Hubei Communication Services Company (湖北通信服务公司)

Impact factor: 0.338
ISSN:2096-9759
Year, volume (issue): 2024, 37(12)