Natural scene text recognition based on character attention
In natural scene text recognition, a fixed-size convolution kernel is typically used to extract visual features, after which character classification is performed. This approach has weak global modeling ability and ignores the importance of text semantic modeling. Therefore, this paper proposes a natural scene text recognition method based on character attention. First, a multi-level efficient Swin Transformer network is constructed to extract features; unlike a convolutional network, this network allows the features of different windows to interact with each other. Second, a character attention module (CAM) is designed to make the network focus on the features of character regions, so as to extract more discriminative visual features. Then, a semantic reasoning module (SRM) is designed to model the text sequence according to the context information of characters; the semantic features it obtains are used to correct indistinguishable or blurred characters. Finally, the visual and semantic features are fused to obtain the character recognition results. Experimental results show that the proposed method reaches a recognition accuracy of 95.2% on the regular text dataset IC13 and 85.8% on the irregular curved text dataset CUTE. Ablation and comparative experiments demonstrate the feasibility of the proposed method.
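
The abstract does not state how the character attention module or the final fusion step is computed. As a purely illustrative sketch, the code below assumes scaled dot-product attention between one query vector per character position and the flattened backbone features, followed by a scalar sigmoid gate that mixes visual and semantic features. The function names (`character_attention`, `gated_fusion`), the gate form, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def character_attention(queries, features):
    """Assumed form of character attention (not the paper's definition).

    queries:  one vector per character position.
    features: flattened spatial feature vectors from the backbone.
    Returns one attended visual feature per character, plus the attention maps.
    """
    dim = len(features[0])
    glimpses, maps = [], []
    for q in queries:
        # Scaled dot-product scores between this character and every location.
        weights = softmax([dot(q, f) / math.sqrt(dim) for f in features])
        # Weighted sum of the spatial features gives the character's glimpse.
        glimpse = [sum(w * f[i] for w, f in zip(weights, features))
                   for i in range(dim)]
        glimpses.append(glimpse)
        maps.append(weights)
    return glimpses, maps


def gated_fusion(visual, semantic, gate_weights):
    """Assumed fusion: g = sigmoid(w . [v; s]); fused = g*v + (1-g)*s."""
    gate = 1.0 / (1.0 + math.exp(-dot(gate_weights, visual + semantic)))
    return [gate * v + (1.0 - gate) * s for v, s in zip(visual, semantic)]


# Toy input: three spatial locations, two character positions, feature size 2.
features = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
queries = [[2.0, 0.0], [0.0, 2.0]]
glimpses, maps = character_attention(queries, features)

# With all-zero gate weights the gate is 0.5, so fusion is a plain average.
fused = gated_fusion([1.0, 3.0], [3.0, 1.0], [0.0, 0.0, 0.0, 0.0])
```

In this toy setup each character's attention map peaks at the spatial location whose feature aligns with its query, which is the behavior the CAM is described as encouraging; in a trained model the queries and the gate weights would be learned rather than fixed by hand.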