Research on an image captioning method with double information flow based on ECA-Net
To address the mismatch between generated descriptions and image content caused by insufficient visual information in image captioning, an image captioning method based on the efficient channel attention network (ECA-Net) is proposed. First, image segmentation features are used as an additional source of visual information, an iterative independent layer normalization (IILN) module fuses the segmentation features with the grid features, and the image features are extracted by a double information flow network. Second, an ECA-Net module is introduced into the encoder to facilitate learning of the correlations among image features through cross-channel interaction, so that the predictions focus more on the visual content. Finally, the decoder predicts the next phrase from the provided visual information and the partially generated caption, thus producing accurate captions. Experimental results on the MSCOCO dataset demonstrate that the proposed method strengthens the dependencies among the visual information of images and makes the generated captions more relevant and more accurate.
caption generation; channel attention; codec; double information flow
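
The cross-channel interaction that the abstract attributes to the ECA-Net module can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes the generic ECA formulation (global average pooling per channel, a shared 1D convolution across neighbouring channels, a sigmoid gate, then channel-wise rescaling) and is written in plain Python to stay dependency-free. The names `eca`, `eca_kernel_size`, `features`, and `weights` are hypothetical, the kernel weights would normally be learned, and the kernel-size heuristic with `gamma=2`, `b=1` follows the defaults described for ECA-Net in general rather than any setting reported here.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive odd kernel size from the channel count (generic ECA heuristic)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca(features, weights):
    """Apply channel attention to features[c][n] (C channels, N positions).

    weights is a 1D kernel of odd length k shared by all channels
    (an assumption of this sketch: fixed values instead of learned ones).
    Returns the rescaled features and the per-channel gates.
    """
    k = len(weights)
    pad = k // 2
    # Squeeze: global average pooling yields one descriptor per channel.
    pooled = [sum(ch) / len(ch) for ch in features]
    # Zero-pad so that every channel has k neighbours along the channel axis.
    padded = [0.0] * pad + pooled + [0.0] * pad
    # Cross-channel interaction: 1D convolution over channels, then sigmoid.
    gates = []
    for c in range(len(pooled)):
        z = sum(w * padded[c + j] for j, w in enumerate(weights))
        gates.append(1.0 / (1.0 + math.exp(-z)))
    # Excite: rescale each channel by its gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Toy input: 3 channels, 2 spatial positions each.
feats = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
scaled, gates = eca(feats, weights=[0.2, 0.5, 0.2])
```

Because each gate depends only on a channel and its `k - 1` nearest neighbours, the module adds just `k` parameters. This is the usual argument for ECA over fully connected channel attention.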