Bird Image Recognition Based on Multiscale Attention

Bird images from different subcategories look alike, while images of the same category show large intra-class variance caused by complex backgrounds and varying poses. To address this problem, a convolutional neural network model based on multiscale attention is proposed. Through an object module and a part module, neither of which introduces learnable parameters, the model gradually narrows its attention from the global image to the object image and then to part images, forming a three-branch network that accepts multiscale image inputs. In addition, a rank loss is introduced to reduce background interference. On the CUB-200-2011 and NABirds datasets, the model reaches recognition accuracies of 87.21% and 85.96%, respectively, improving on the baseline model and verifying the effectiveness of the approach.
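
The abstract does not spell out how the parameter-free object module or the rank loss is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the object region is taken as the bounding box of above-average cells in a channel-summed activation map, and that the rank loss is a pairwise hinge asking the finer-scale branch to score the true class higher than the coarser branch. The names `locate_object`, `rank_loss`, and the `margin` value are hypothetical.

```python
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # channels x height x width


def locate_object(features: FeatureMap) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the above-average activation region.

    Assumption: no learned weights are involved; channels are summed into one
    activation map, which is thresholded at its own mean value.
    """
    height, width = len(features[0]), len(features[0][0])
    # Sum over channels to get a single H x W activation map.
    activation = [
        [sum(channel[r][c] for channel in features) for c in range(width)]
        for r in range(height)
    ]
    mean = sum(map(sum, activation)) / (height * width)
    # Rows and columns that contain at least one above-mean cell.
    rows = [r for r in range(height) if any(v > mean for v in activation[r])]
    cols = [
        c for c in range(width)
        if any(activation[r][c] > mean for r in range(height))
    ]
    if not rows:  # flat map: fall back to the whole image
        return 0, 0, height - 1, width - 1
    return rows[0], cols[0], rows[-1], cols[-1]


def rank_loss(p_coarse: float, p_fine: float, margin: float = 0.05) -> float:
    """Pairwise hinge rank loss (assumed form).

    p_coarse and p_fine are the true-class probabilities from a coarser and a
    finer branch; the loss is zero once p_fine exceeds p_coarse by `margin`.
    """
    return max(0.0, p_coarse - p_fine + margin)
```

In this sketch the box returned by `locate_object` would be used to crop and upscale the input for the next branch. The box covers every above-mean cell rather than only the largest connected region, which is a simplification.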

bird image recognition; multiscale attention; rank loss; convolutional neural networks

Ruan Tao (阮涛), Hao Zhicheng (郝智程)

Institute of Applied Mathematics, Beijing Information Science and Technology University, Beijing 100010

2024

Computer & Digital Engineering
709th Research Institute, China Shipbuilding Industry Corporation

CSTPCD
Impact factor: 0.355
ISSN:1672-9722
Year, volume (issue): 2024, 52(10)