Computer Simulation (计算机仿真), 2024, Vol. 41, Issue 7: 216-221.

An Improved Algorithm for Image Description Generation Based on Convolutional Neural Networks

柯杰¹, 曾上游¹, 黄飞燕¹, 雷松橦¹

Author information

  • 1. Guangxi Normal University, Guilin, Guangxi 541004, China

Abstract

Image description is the task of generating a natural-language description of a given image. Most current work is built on the encoder-decoder structure, and an attention mechanism can be introduced to improve description accuracy. Building on the encoder-decoder architecture, the model used in this paper introduces a new, improved attention mechanism based on AoA (Attention on Attention). It makes the attention mechanism more lightweight while determining the relevance between the attention result and the query, which strengthens the correlation between the image and the generated words before the natural-language description is output. Comparative experiments were carried out on the public datasets MSCOCO and Flickr30k. Compared with the evaluation results of conventional attention models, the improved attention model speeds up convergence of the overall model, raises the relevant evaluation metrics, and enhances model performance on the image description task, showing a clear advantage.
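
As a rough illustration of the idea the abstract builds on, the sketch below shows the basic Attention on Attention step as it is commonly described: an attended vector is computed first, then an "information" vector and a sigmoid "gate" are both derived from the query concatenated with the attended vector, and the gate scales the information element-wise. This is only a minimal sketch under stated assumptions. It is not the improved, lightweight variant proposed in the paper, whose details the abstract does not give. All function names, dimensions, and the random weights are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) nested-list matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]


def aoa(query, keys, values, w_info, b_info, w_gate, b_gate):
    """Basic AoA step: gate the attended vector by its relevance to the query."""
    attended = attend(query, keys, values)
    joint = query + attended  # list concatenation: [query; attended]
    info = [x + b for x, b in zip(matvec(w_info, joint), b_info)]
    gate = [1.0 / (1.0 + math.exp(-(x + b)))
            for x, b in zip(matvec(w_gate, joint), b_gate)]
    # Element-wise product: the gate decides how much of each feature passes.
    return [g * i for g, i in zip(gate, info)], gate


def random_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d = 4            # hypothetical feature size
num_regions = 3  # hypothetical number of image regions
query = [rng.uniform(-1, 1) for _ in range(d)]  # stands in for a decoder state
regions = random_matrix(num_regions, d, rng)    # stands in for CNN region features
w_info, w_gate = random_matrix(d, 2 * d, rng), random_matrix(d, 2 * d, rng)
b_info, b_gate = [0.0] * d, [0.0] * d

context, gate = aoa(query, regions, regions, w_info, b_info, w_gate, b_gate)
print(len(context), all(0.0 < g < 1.0 for g in gate))
```

In this sketch the gate values lie strictly between 0 and 1, so features of the attended vector that are weakly related to the query are suppressed rather than passed on unchanged to the decoder.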

Key words

Image description / Natural language processing / Attention mechanism / Convolutional neural network / Long short-term memory network

Funding

National Natural Science Foundation of China (Grant No. 61976063)

Publication year

2024
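Computer Simulation (计算机仿真)
The 17th Research Institute of China Aerospace Science and Industry Corporation

CSTPCD
Impact factor: 0.518
ISSN: 1006-9348
Number of references: 2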